API reference - Historical
Databento's historical data service can be accessed programmatically over its HTTP API. To make it easier to integrate the API, we also provide official client libraries that simplify the code you need to write.
Our HTTP API is designed as a collection of RPC-style methods, which can be called using URLs in the form https://hist.databento.com/v0/{method}.
Our client libraries wrap these HTTP RPC-style methods with more idiomatic interfaces in their respective languages.
You can use our API to stream or load data directly into your application. You can also use our API to make batch download requests, which instruct our service to prepare the data as flat files that can be downloaded from the Download center.
Overview
Our historical API has the following structure:
- Metadata provides information about the datasets themselves.
- Time series provides all types of time series data. This includes subsampled data (second, minute, hour, daily aggregates), trades, top-of-book, order book deltas, order book snapshots, summary statistics, static data and macro indicators. We also provide properties of products such as expirations, tick sizes and symbols as time series data.
- Symbology provides methods that help find and resolve symbols across different symbology systems.
- Batch provides a means of submitting and querying for details of batch download requests.
Authentication
Databento uses API keys to authenticate requests. You can view and manage your keys on the API keys page of your portal.
Each API key is a 32-character string starting with db-. By default, our library uses the DATABENTO_API_KEY environment variable as your API key. However, if you pass an API key to the Historical constructor through the key parameter, then that value will be used instead.
Related: Securing your API keys.
Schemas and conventions
A schema is a data record format represented as a collection of different data fields. Our datasets support multiple schemas, such as order book, trades, bar aggregates, and so on. You can get a dictionary describing the fields of each schema from our List of market data schemas.
You can get a list of all supported schemas for any given dataset using the Historical client's list_schemas method. The same information can also be found on the dataset details pages on the user portal.
The following table provides details about the data types and conventions used for various fields that you will commonly encounter in the data.
Name | Field | Description |
---|---|---|
Dataset | dataset | A unique string name assigned to each dataset by Databento. The full list of datasets can be found from the metadata. |
Publisher ID | publisher_id | A unique 16-bit unsigned integer assigned to each publisher by Databento. The full list of publisher IDs can be found from the metadata. |
Instrument ID | instrument_id | A unique 32-bit unsigned integer assigned to each instrument by the venue. Information about instrument IDs for any given dataset can be found in the symbology. |
Order ID | order_id | A unique 64-bit unsigned integer assigned to each order by the venue. |
Timestamp (event) | ts_event | The matching-engine-received timestamp expressed as the number of nanoseconds since the UNIX epoch. |
Timestamp (receive) | ts_recv | The capture-server-received timestamp expressed as the number of nanoseconds since the UNIX epoch. |
Timestamp delta (in) | ts_in_delta | The matching-engine-sending timestamp expressed as the number of nanoseconds before ts_recv. See timestamping guide. |
Timestamp out | ts_out | The Databento gateway-sending timestamp expressed as the number of nanoseconds since the UNIX epoch. See timestamping guide. |
Price | price | The price expressed as a signed integer where every 1 unit corresponds to 1e-9, i.e. 1/1,000,000,000 or 0.000000001. |
Book side | side | The side that initiates the event. Can be Ask for a sell order (or sell aggressor in a trade), Bid for a buy order (or buy aggressor in a trade), or None where no side is specified by the original source. |
Size | size | The order quantity. |
Flag | flag | A bit field indicating event end, message characteristics, and data quality. |
Action | action | The event type or order book operation. Can be Add, Cancel, Modify, Clear book, Trade, Fill, or None. |
Sequence number | sequence | The original message sequence number from the venue. |
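For example, under the price convention above, a raw price field can be converted to a floating-point value by scaling by 1e-9. A minimal sketch (the raw value is illustrative):
# Prices are signed integers with a fixed precision of 1e-9
raw_price = 4164000000000  # illustrative value from a trades record
price = raw_price * 1e-9
print(price)  # 4164.0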
Datasets
Databento provides time series datasets for a variety of markets, sourced from different publishers. Our available datasets can be browsed through the search feature on our site.
Each dataset is assigned a unique string identifier (dataset ID) in the form PUBLISHER.DATASET, such as GLBX.MDP3.
For publishers that are also markets, we use standard four-character ISO 10383 Market Identifier Codes (MICs). Otherwise, Databento arbitrarily assigns a four-character identifier to the publisher.
These dataset IDs are also found on the Data catalog and Download request features of the Databento user portal.
When a publisher provides multiple data products with different levels of granularity, Databento subscribes to the most-granular product. We then provide this dataset with alternate schemas to make it easy to work with the level of detail most appropriate for your application.
More information about different types of venues and publishers is available in our FAQs.
Symbology
Databento's historical API supports several ways to select an instrument in a dataset. An instrument is specified using a symbol and a symbology type, also referred to as an stype. The supported symbology types are:
- Raw symbology (raw_symbol): the original string symbols used by the publisher in the source data.
- Instrument ID symbology (instrument_id): the unique numeric ID assigned to each instrument by the publisher.
- Parent symbology (parent): groups instruments related to the market for the same underlying.
- Continuous contract symbology (continuous): proprietary symbology that specifies instruments based on certain systematic rules.
When requesting data from our timeseries.get_range or batch.submit_job endpoints, an input and output symbology type can be specified. By default, our client libraries will use raw symbology for the input type and instrument ID symbology for the output type. Not all symbology types are supported for every dataset.
The process of converting from one symbology type to another is called symbology resolution. This conversion can be done, at no cost, with the symbology.resolve endpoint.
For more about symbology at Databento, see our Standards and conventions.
Encodings
DBN
Databento Binary Encoding (DBN) is an extremely fast message encoding and highly-compressible storage format for normalized market data. It includes a self-describing metadata header and adopts a binary format with zero-copy serialization.
We recommend using our Python, C++, or Rust client libraries to read DBN files locally. A CLI tool is also available for converting DBN files to CSV or JSON.
CSV
Comma-separated values (CSV) is a simple text file format for tabular data. CSVs can be easily opened in Excel, loaded into pandas DataFrames, or parsed in C++.
Our CSVs have one header line, followed by one record per line. Lines use UNIX-style \n separators.
JSON
JavaScript Object Notation (JSON) is a flexible text file format with broad language support and wide adoption across web apps.
Our JSON files follow the JSON lines specification, where each line of the file is a JSON record. Lines use UNIX-style \n separators.
Compression
Databento provides options for compressing files from our API. Available compression formats depend on the encoding you select.
Zstd
The Zstd compression option uses the Zstandard format. This option is available for all encodings, and is recommended for faster transfer speeds and smaller files.
You can read Zstandard files in Python using the zstandard package.
Read more about working with Zstandard-compressed files.
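As an illustrative sketch (the file name is hypothetical), a Zstandard-compressed JSON lines file can be stream-decompressed and parsed one record per line:
import io
import json
import zstandard

# Stream-decompress a .json.zst file and parse each JSON lines record
with open("glbx-mdp3-20220606.trades.json.zst", "rb") as f:
    reader = zstandard.ZstdDecompressor().stream_reader(f)
    for line in io.TextIOWrapper(reader, encoding="utf-8"):
        record = json.loads(line)
        print(record)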
None
The None compression option disables compression entirely, resulting in significantly larger files. However, this can be useful for loading small CSV files directly into Excel.
Dates and times
Our Python client library has several functions with timestamp arguments. These arguments have type pandas.Timestamp | datetime.date | str | int and support a variety of formats.
We recommend using pandas.Timestamp, which fully supports timezones and nanosecond precision. If a datetime.date is used, the time is set to midnight UTC. If an int is provided, the value is interpreted as UNIX nanoseconds.
The client library also handles several string-based timestamp formats based on ISO 8601.
- yyyy-mm-dd, e.g. "2022-02-28" (midnight UTC)
- yyyy-mm-ddTHH:MM, e.g. "2022-02-28T23:50"
- yyyy-mm-ddTHH:MM:SS, e.g. "2022-02-28T23:50:59"
- yyyy-mm-ddTHH:MM:SS.NNNNNNNNN, e.g. "2022-02-28T23:50:59.123456789"
Timezone specification is also supported.
- yyyy-mm-ddTHH:MMZ
- yyyy-mm-ddTHH:MM±hh
- yyyy-mm-ddTHH:MM±hhmm
- yyyy-mm-ddTHH:MM±hh:mm
Bare dates
Some parameters require a bare date, without a time. These arguments have type datetime.date | str and must either be a datetime.date object, or a string in yyyy-mm-dd format, e.g. "2022-02-28".
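The following sketch shows equivalent ways to express the same instant under these conventions (the instant itself is illustrative):
import datetime
import pandas as pd

# All of the following represent 2022-02-28T00:00:00 UTC
as_string = "2022-02-28"                          # midnight UTC
as_date = datetime.date(2022, 2, 28)              # midnight UTC
as_timestamp = pd.Timestamp("2022-02-28", tz="UTC")
as_unix_ns = 1646006400000000000                  # UNIX nanoseconds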
Errors
Our historical API uses HTTP response codes to indicate the success or failure of an API request. The client library provides exceptions that wrap these response codes.
- 2xx indicates success.
- 4xx indicates an error on the client side, represented as a BentoClientError.
- 5xx indicates an error with Databento's servers, represented as a BentoServerError.
The full list of the response codes and associated causes is as follows:
Code | Message | Cause |
---|---|---|
200 | OK | Successful request. |
206 | Partial Content | Successful request, with partially resolved symbols. |
400 | Bad Request | Invalid request. Usually due to a missing, malformed or unsupported parameter. |
401 | Unauthorized | Invalid username or API key. |
402 | Payment Required | Issue with your account payment information. |
403 | Forbidden | The API key has insufficient permissions to perform the request. |
404 | Not Found | A resource is not found, or a requested symbol does not exist. |
409 | Conflict | A resource already exists. |
422 | Unprocessable Entity | The request is well formed, but we cannot or will not process the contained instructions. |
429 | Too Many Requests | API rate limit exceeded. |
500 | Internal Server Error | Unexpected condition encountered in our system. |
503 | Service Unavailable | Data gateway is offline or overloaded. |
504 | Gateway Timeout | Data gateway is available but other parts of our system are offline or overloaded. |
Rate limits
Our historical API allows each IP address up to:
- 100 concurrent connections.
- 100 time series requests per second.
- 100 symbology requests per second.
- 20 metadata requests per second.
- 20 batch list jobs requests per second.
- 20 batch submit job requests per minute.
When a request exceeds a rate limit, a BentoClientError exception is raised with a 429 error code.
Retry-After
The Retry-After response header indicates how long the user should wait before retrying.
If you find that your application has been rate-limited, you can retry after waiting for the time specified in the Retry-After header.
If you are using Python, you can use the time.sleep function to wait for the duration specified in the Retry-After header, e.g. time.sleep(int(response.headers.get("Retry-After", 1))).
This works with our current APIs and their rate limits. Future APIs may have different rate limits, and might require a different default time delay.
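As a minimal sketch (assuming the third-party requests library and a hypothetical URL), a rate-limited request can be retried after waiting for the suggested duration:
import time
import requests

def get_with_retry(url: str, max_retries: int = 5) -> requests.Response:
    # Retry a GET request while the server responds with 429
    for _ in range(max_retries):
        response = requests.get(url)
        if response.status_code != 429:
            return response
        # Wait for the duration suggested by the server, defaulting to 1 second
        time.sleep(int(response.headers.get("Retry-After", 1)))
    raise RuntimeError("Request was rate-limited after all retries")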
Size limits
There is no size limit for either stream or batch download requests. Batch download is more manageable for large datasets, so we recommend using batch download for requests over 5 GB.
You can also manage the size of your request by splitting it into multiple, smaller requests. The historical API allows you to make stream and batch download requests with time ranges specified up to nanosecond resolution.
You can also use the limit parameter in any request to limit the number of data records returned from the service.
Batch download supports different delivery methods, which can be specified using the delivery parameter.
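For instance, a large streaming request can be split into smaller daily requests; a minimal sketch (dataset, symbol, and dates are illustrative):
import pandas as pd
import databento as db

client = db.Historical()  # uses the DATABENTO_API_KEY environment variable
# Split one large request into one stream request per day
days = pd.date_range("2022-06-06", "2022-06-10", tz="UTC")
for day_start, day_end in zip(days[:-1], days[1:]):
    data = client.timeseries.get_range(
        dataset="GLBX.MDP3",
        symbols=["ESM2"],
        schema="trades",
        start=day_start,
        end=day_end,
    )
    data.to_file(f"glbx-mdp3-esm2-{day_start.date()}.trades.dbn.zst")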
Metered pricing
Databento only charges for the data that you use. You can find rates (per MB) for the various datasets and estimate pricing on our Data catalog. We meter the data by its uncompressed size in binary encoding.
When you stream data, you are billed incrementally for each outbound byte sent from our historical gateway. If your connection is interrupted while streaming and our historical gateway detects a connection timeout of over 5 seconds, it will immediately stop sending data and you will not be billed for the remainder of your request.
Duplicate streaming requests will incur repeated charges. If you intend to access the same data multiple times, we recommend using our batch download feature. When you make a batch download request, you are only billed once for the request and, subsequently, you can download the data from the Download center multiple times over 30 days for no additional charge.
You will only be billed for usage of time series data. Access to metadata, symbology, and account management is free. The Historical.metadata.get_cost method can be used to determine cost before you request any data.
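For example, a minimal cost check before requesting data (dataset and symbols are illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
# Determine the cost in US dollars before requesting any data
cost = client.metadata.get_cost(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
    end="2022-06-10",
)
print(cost)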
Related: Billing management.
Versioning
Our historical and live APIs and their client libraries adopt the MAJOR.MINOR.PATCH format for version numbers. These version numbers conform to semantic versioning. We are using major version 0 for initial development, where our API is not considered stable.
Once we release major version 1, our public API will be stable. This means that you will be able to upgrade minor or patch versions to pick up new functionality, without breaking your integration.
Starting with major versions after 1, we will provide support for previous versions for one year after the date of the subsequent major release.
For example, if version 2.0.0 is released on January 1, 2024, then all versions 1.x.y of the API and client libraries will be deprecated. However, they will remain supported until January 1, 2025.
We may introduce backwards-compatible changes between minor versions in the form of:
- New data encodings
- Additional fields to existing data schemas
- Additional batch download customizations
Our Release notes will contain information about both breaking and backwards-compatible changes in each release.
Our API and official client libraries are kept in sync with same-day releases for major versions. For instance, version 1.x.y of the C++ client library will provide the same functionality found in any 1.x.y version of the Python client.
Related: Release notes.
Historical
To access Databento's historical API, first create an instance of the Historical client. The entire API is exposed through instance methods of the client.
Note that the API key can be passed as a parameter, which is not recommended for production applications. Instead, you can leave out this parameter to pass your API key via the DATABENTO_API_KEY environment variable:
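import databento as db

# Key passed explicitly (not recommended for production applications)
client = db.Historical("$YOUR_API_KEY")

# Key read from the DATABENTO_API_KEY environment variable
client = db.Historical()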
Currently, only the BO1 gateway is supported for historical data.
Parameters
- key: the user API key for authentication. If None then the DATABENTO_API_KEY environment variable is used.
- gateway: the historical gateway to connect to. Currently, only BO1 is supported. If None then will connect to the default historical gateway.
Historical.metadata.list_publishers
List all publisher ID mappings.
Use this method to list the details of publishers, including their dataset and venue mappings.
Returns
list[dict[str, int | str]]
A list of publisher details objects.
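A minimal example call (producing output like the listing below):
import databento as db

client = db.Historical("$YOUR_API_KEY")
publishers = client.metadata.list_publishers()
print(publishers)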
[{'dataset': 'GLBX.MDP3',
'description': 'CME Globex MDP 3.0',
'publisher_id': 1,
'venue': 'GLBX'},
{'dataset': 'XNAS.ITCH',
'description': 'Nasdaq TotalView-ITCH',
'publisher_id': 2,
'venue': 'XNAS'},
{'dataset': 'XBOS.ITCH',
'description': 'Nasdaq BX TotalView-ITCH',
'publisher_id': 3,
'venue': 'XBOS'},
{'dataset': 'XPSX.ITCH',
'description': 'Nasdaq PSX TotalView-ITCH',
'publisher_id': 4,
'venue': 'XPSX'},
{'dataset': 'BATS.PITCH',
'description': 'Cboe BZX Depth',
'publisher_id': 5,
'venue': 'BATS'},
{'dataset': 'BATY.PITCH',
'description': 'Cboe BYX Depth',
'publisher_id': 6,
'venue': 'BATY'},
{'dataset': 'EDGA.PITCH',
'description': 'Cboe EDGA Depth',
'publisher_id': 7,
'venue': 'EDGA'},
{'dataset': 'EDGX.PITCH',
'description': 'Cboe EDGX Depth',
'publisher_id': 8,
'venue': 'EDGX'},
{'dataset': 'XNYS.PILLAR',
'description': 'NYSE Integrated',
'publisher_id': 9,
'venue': 'XNYS'},
{'dataset': 'XCIS.PILLAR',
'description': 'NYSE National Integrated',
'publisher_id': 10,
'venue': 'XCIS'},
{'dataset': 'XASE.PILLAR',
'description': 'NYSE American Integrated',
'publisher_id': 11,
'venue': 'XASE'},
{'dataset': 'XCHI.PILLAR',
'description': 'NYSE Texas Integrated',
'publisher_id': 12,
'venue': 'XCHI'},
{'dataset': 'XCIS.BBO',
'description': 'NYSE National BBO',
'publisher_id': 13,
'venue': 'XCIS'},
{'dataset': 'XCIS.TRADES',
'description': 'NYSE National Trades',
'publisher_id': 14,
'venue': 'XCIS'},
{'dataset': 'MEMX.MEMOIR',
'description': 'MEMX Memoir Depth',
'publisher_id': 15,
'venue': 'MEMX'},
{'dataset': 'EPRL.DOM',
'description': 'MIAX Pearl Depth',
'publisher_id': 16,
'venue': 'EPRL'},
{'dataset': 'XNAS.NLS',
'description': 'FINRA/Nasdaq TRF Carteret',
'publisher_id': 17,
'venue': 'FINN'},
{'dataset': 'XNAS.NLS',
'description': 'FINRA/Nasdaq TRF Chicago',
'publisher_id': 18,
'venue': 'FINC'},
{'dataset': 'XNYS.TRADES',
'description': 'FINRA/NYSE TRF',
'publisher_id': 19,
'venue': 'FINY'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - NYSE American Options',
'publisher_id': 20,
'venue': 'AMXO'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - BOX Options',
'publisher_id': 21,
'venue': 'XBOX'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Cboe Options',
'publisher_id': 22,
'venue': 'XCBO'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - MIAX Emerald',
'publisher_id': 23,
'venue': 'EMLD'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Cboe EDGX Options',
'publisher_id': 24,
'venue': 'EDGO'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Nasdaq GEMX',
'publisher_id': 25,
'venue': 'GMNI'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Nasdaq ISE',
'publisher_id': 26,
'venue': 'XISX'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Nasdaq MRX',
'publisher_id': 27,
'venue': 'MCRY'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - MIAX Options',
'publisher_id': 28,
'venue': 'XMIO'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - NYSE Arca Options',
'publisher_id': 29,
'venue': 'ARCO'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Options Price Reporting Authority',
'publisher_id': 30,
'venue': 'OPRA'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - MIAX Pearl',
'publisher_id': 31,
'venue': 'MPRL'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Nasdaq Options',
'publisher_id': 32,
'venue': 'XNDQ'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Nasdaq BX Options',
'publisher_id': 33,
'venue': 'XBXO'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Cboe C2 Options',
'publisher_id': 34,
'venue': 'C2OX'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Nasdaq PHLX',
'publisher_id': 35,
'venue': 'XPHL'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - Cboe BZX Options',
'publisher_id': 36,
'venue': 'BATO'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - MEMX Options',
'publisher_id': 37,
'venue': 'MXOP'},
{'dataset': 'IEXG.TOPS',
'description': 'IEX TOPS',
'publisher_id': 38,
'venue': 'IEXG'},
{'dataset': 'DBEQ.BASIC',
'description': 'DBEQ Basic - NYSE Texas',
'publisher_id': 39,
'venue': 'XCHI'},
{'dataset': 'DBEQ.BASIC',
'description': 'DBEQ Basic - NYSE National',
'publisher_id': 40,
'venue': 'XCIS'},
{'dataset': 'DBEQ.BASIC',
'description': 'DBEQ Basic - IEX',
'publisher_id': 41,
'venue': 'IEXG'},
{'dataset': 'DBEQ.BASIC',
'description': 'DBEQ Basic - MIAX Pearl',
'publisher_id': 42,
'venue': 'EPRL'},
{'dataset': 'ARCX.PILLAR',
'description': 'NYSE Arca Integrated',
'publisher_id': 43,
'venue': 'ARCX'},
{'dataset': 'XNYS.BBO',
'description': 'NYSE BBO',
'publisher_id': 44,
'venue': 'XNYS'},
{'dataset': 'XNYS.TRADES',
'description': 'NYSE Trades',
'publisher_id': 45,
'venue': 'XNYS'},
{'dataset': 'XNAS.QBBO',
'description': 'Nasdaq QBBO',
'publisher_id': 46,
'venue': 'XNAS'},
{'dataset': 'XNAS.NLS',
'description': 'Nasdaq Trades',
'publisher_id': 47,
'venue': 'XNAS'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - NYSE Texas',
'publisher_id': 48,
'venue': 'XCHI'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - NYSE National',
'publisher_id': 49,
'venue': 'XCIS'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - IEX',
'publisher_id': 50,
'venue': 'IEXG'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - MIAX Pearl',
'publisher_id': 51,
'venue': 'EPRL'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - Nasdaq',
'publisher_id': 52,
'venue': 'XNAS'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - NYSE',
'publisher_id': 53,
'venue': 'XNYS'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - FINRA/Nasdaq TRF Carteret',
'publisher_id': 54,
'venue': 'FINN'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - FINRA/NYSE TRF',
'publisher_id': 55,
'venue': 'FINY'},
{'dataset': 'EQUS.PLUS',
'description': 'Databento US Equities Plus - FINRA/Nasdaq TRF Chicago',
'publisher_id': 56,
'venue': 'FINC'},
{'dataset': 'IFEU.IMPACT',
'description': 'ICE Europe Commodities',
'publisher_id': 57,
'venue': 'IFEU'},
{'dataset': 'NDEX.IMPACT',
'description': 'ICE Endex',
'publisher_id': 58,
'venue': 'NDEX'},
{'dataset': 'DBEQ.BASIC',
'description': 'Databento US Equities Basic - Consolidated',
'publisher_id': 59,
'venue': 'DBEQ'},
{'dataset': 'EQUS.PLUS',
'description': 'EQUS Plus - Consolidated',
'publisher_id': 60,
'venue': 'EQUS'},
{'dataset': 'OPRA.PILLAR',
'description': 'OPRA - MIAX Sapphire',
'publisher_id': 61,
'venue': 'SPHR'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - NYSE Texas',
'publisher_id': 62,
'venue': 'XCHI'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - NYSE National',
'publisher_id': 63,
'venue': 'XCIS'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - IEX',
'publisher_id': 64,
'venue': 'IEXG'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - MIAX Pearl',
'publisher_id': 65,
'venue': 'EPRL'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Nasdaq',
'publisher_id': 66,
'venue': 'XNAS'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - NYSE',
'publisher_id': 67,
'venue': 'XNYS'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - FINRA/Nasdaq TRF Carteret',
'publisher_id': 68,
'venue': 'FINN'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - FINRA/NYSE TRF',
'publisher_id': 69,
'venue': 'FINY'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - FINRA/Nasdaq TRF Chicago',
'publisher_id': 70,
'venue': 'FINC'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Cboe BZX',
'publisher_id': 71,
'venue': 'BATS'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Cboe BYX',
'publisher_id': 72,
'venue': 'BATY'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Cboe EDGA',
'publisher_id': 73,
'venue': 'EDGA'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Cboe EDGX',
'publisher_id': 74,
'venue': 'EDGX'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Nasdaq BX',
'publisher_id': 75,
'venue': 'XBOS'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Nasdaq PSX',
'publisher_id': 76,
'venue': 'XPSX'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - MEMX',
'publisher_id': 77,
'venue': 'MEMX'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - NYSE American',
'publisher_id': 78,
'venue': 'XASE'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - NYSE Arca',
'publisher_id': 79,
'venue': 'ARCX'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Long-Term Stock Exchange',
'publisher_id': 80,
'venue': 'LTSE'},
{'dataset': 'XNAS.BASIC',
'description': 'Nasdaq Basic - Nasdaq',
'publisher_id': 81,
'venue': 'XNAS'},
{'dataset': 'XNAS.BASIC',
'description': 'Nasdaq Basic - FINRA/Nasdaq TRF Carteret',
'publisher_id': 82,
'venue': 'FINN'},
{'dataset': 'XNAS.BASIC',
'description': 'Nasdaq Basic - FINRA/Nasdaq TRF Chicago',
'publisher_id': 83,
'venue': 'FINC'},
{'dataset': 'IFEU.IMPACT',
'description': 'ICE Europe - Off-Market Trades',
'publisher_id': 84,
'venue': 'XOFF'},
{'dataset': 'NDEX.IMPACT',
'description': 'ICE Endex - Off-Market Trades',
'publisher_id': 85,
'venue': 'XOFF'},
{'dataset': 'XNAS.NLS',
'description': 'Nasdaq NLS - Nasdaq BX',
'publisher_id': 86,
'venue': 'XBOS'},
{'dataset': 'XNAS.NLS',
'description': 'Nasdaq NLS - Nasdaq PSX',
'publisher_id': 87,
'venue': 'XPSX'},
{'dataset': 'XNAS.BASIC',
'description': 'Nasdaq Basic - Nasdaq BX',
'publisher_id': 88,
'venue': 'XBOS'},
{'dataset': 'XNAS.BASIC',
'description': 'Nasdaq Basic - Nasdaq PSX',
'publisher_id': 89,
'venue': 'XPSX'},
{'dataset': 'EQUS.SUMMARY',
'description': 'Databento Equities Summary',
'publisher_id': 90,
'venue': 'EQUS'},
{'dataset': 'XCIS.TRADESBBO',
'description': 'NYSE National Trades and BBO',
'publisher_id': 91,
'venue': 'XCIS'},
{'dataset': 'XNYS.TRADESBBO',
'description': 'NYSE Trades and BBO',
'publisher_id': 92,
'venue': 'XNYS'},
{'dataset': 'XNAS.BASIC',
'description': 'Nasdaq Basic - Consolidated',
'publisher_id': 93,
'venue': 'EQUS'},
{'dataset': 'EQUS.ALL',
'description': 'Databento US Equities (All Feeds) - Consolidated',
'publisher_id': 94,
'venue': 'EQUS'},
{'dataset': 'EQUS.MINI',
'description': 'Databento US Equities Mini',
'publisher_id': 95,
'venue': 'EQUS'},
{'dataset': 'XNYS.TRADES',
'description': 'NYSE Trades - Consolidated',
'publisher_id': 96,
'venue': 'EQUS'},
{'dataset': 'IFUS.IMPACT',
'description': 'ICE Futures US',
'publisher_id': 97,
'venue': 'IFUS'},
{'dataset': 'IFUS.IMPACT',
'description': 'ICE Futures US - Off-Market Trades',
'publisher_id': 98,
'venue': 'XOFF'},
{'dataset': 'IFLL.IMPACT',
'description': 'ICE Europe Financials',
'publisher_id': 99,
'venue': 'IFLL'},
{'dataset': 'IFLL.IMPACT',
'description': 'ICE Europe Financials - Off-Market Trades',
'publisher_id': 100,
'venue': 'XOFF'},
{'dataset': 'XEUR.EOBI',
'description': 'Eurex EOBI',
'publisher_id': 101,
'venue': 'XEUR'},
{'dataset': 'XEEE.EOBI',
'description': 'European Energy Exchange EOBI',
'publisher_id': 102,
'venue': 'XEEE'},
{'dataset': 'XEUR.EOBI',
'description': 'Eurex EOBI - Off-Market Trades',
'publisher_id': 103,
'venue': 'XOFF'},
{'dataset': 'XEEE.EOBI',
'description': 'European Energy Exchange EOBI - Off-Market Trades',
'publisher_id': 104,
'venue': 'XOFF'}
]
Historical.metadata.list_datasets
List all valid dataset IDs on Databento.
Use this method to list the available dataset IDs (string identifiers), so you can use other methods which take the dataset parameter.
Parameters
- start_date: the start date (UTC) of the request range. If None then the first date available.
- end_date: the end date (UTC) of the request range. If None then the last date available.
Returns
list[str]
A list of available dataset IDs.
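A minimal example call:
import databento as db

client = db.Historical("$YOUR_API_KEY")
datasets = client.metadata.list_datasets()
print(datasets)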
Historical.metadata.list_schemas
List all available schemas for a dataset.
Parameters
- dataset: the dataset ID.
Returns
list[str]
A list of available data schemas.
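A minimal example call (the dataset is illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
schemas = client.metadata.list_schemas(dataset="GLBX.MDP3")
print(schemas)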
Historical.metadata.list_fields
List all fields for a particular schema and encoding.
Parameters
- schema: the data record schema.
- encoding: the data encoding.
Returns
list[dict[str, str]]
A list of field details objects.
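A minimal example call; the field listing below corresponds to the trades schema:
import databento as db

client = db.Historical("$YOUR_API_KEY")
fields = client.metadata.list_fields(schema="trades", encoding="dbn")
print(fields)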
[
{
"name": "length",
"type": "uint8_t"
},
{
"name": "rtype",
"type": "uint8_t"
},
{
"name": "publisher_id",
"type": "uint16_t"
},
{
"name": "instrument_id",
"type": "uint32_t"
},
{
"name": "ts_event",
"type": "uint64_t"
},
{
"name": "price",
"type": "int64_t"
},
{
"name": "size",
"type": "uint32_t"
},
{
"name": "action",
"type": "char"
},
{
"name": "side",
"type": "char"
},
{
"name": "flags",
"type": "uint8_t"
},
{
"name": "depth",
"type": "uint8_t"
},
{
"name": "ts_recv",
"type": "uint64_t"
},
{
"name": "ts_in_delta",
"type": "int32_t"
},
{
"name": "sequence",
"type": "uint32_t"
}
]
Historical.metadata.list_unit_prices
List unit prices for each feed mode and data schema in US dollars per gigabyte.
Parameters
- dataset: the dataset ID.
Returns
list[dict[str, Any]]
A list of maps of feed mode to schema to unit price.
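A minimal example call (the dataset is illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
unit_prices = client.metadata.list_unit_prices(dataset="GLBX.MDP3")
print(unit_prices)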
[
{
"mode": "historical",
"unit_prices": {
"mbp-1": 0.04,
"ohlcv-1s": 280.0,
"ohlcv-1m": 280.0,
"ohlcv-1h": 600.0,
"ohlcv-1d": 600.0,
"tbbo": 210.0,
"trades": 280.0,
"statistics": 11.0,
"definition": 5.0
}
},
{
"mode": "historical-streaming",
"unit_prices": {
"mbp-1": 0.04,
"ohlcv-1s": 280.0,
"ohlcv-1m": 280.0,
"ohlcv-1h": 600.0,
"ohlcv-1d": 600.0,
"tbbo": 210.0,
"trades": 280.0,
"statistics": 11.0,
"definition": 5.0
}
},
{
"mode": "live",
"unit_prices": {
"mbp-1": 0.05,
"ohlcv-1s": 336.0,
"ohlcv-1m": 336.0,
"ohlcv-1h": 720.0,
"ohlcv-1d": 720.0,
"tbbo": 252.0,
"trades": 336.0,
"statistics": 13.2,
"definition": 6.0
}
}
]
Historical.metadata.get_dataset_condition
Get the dataset condition from Databento.
Use this method to discover data availability and quality.
Parameters
- dataset: the dataset ID.
- start_date: the start date (UTC) of the request range. If None then the first date available.
- end_date: the end date (UTC) of the request range. If None then the last date available.
Returns
list[dict[str, str | None]]
A list of conditions per date.
The last_modified_date value will be None when condition is 'missing'.
Possible values for condition:
- available: the data is available with no known issues
- degraded: the data is available, but there may be missing data or other correctness issues
- pending: the data is not yet available, but may be available soon
- missing: the data is not available
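A minimal example call (dataset and dates are illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
conditions = client.metadata.get_dataset_condition(
    dataset="GLBX.MDP3",
    start_date="2022-06-06",
    end_date="2022-06-10",
)
print(conditions)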
[
{
"date": "2022-06-06",
"condition": "available",
"last_modified_date": "2024-05-18"
},
{
"date": "2022-06-07",
"condition": "available",
"last_modified_date": "2024-05-21"
},
{
"date": "2022-06-08",
"condition": "available",
"last_modified_date": "2024-05-21"
},
{
"date": "2022-06-09",
"condition": "available",
"last_modified_date": "2024-05-21"
},
{
"date": "2022-06-10",
"condition": "available",
"last_modified_date": "2024-05-22"
}
]
Historical.metadata.get_dataset_range
Get the available range for the dataset given the user's entitlements.
Use this method to discover data availability.
The start and end values in the response can be used with the timeseries.get_range and batch.submit_job endpoints.
This endpoint will return the start and end timestamps over the entire dataset, as well as the per-schema start and end timestamps under the schema key.
In some cases, a schema's availability is a subset of the entire dataset availability.
Parameters
- dataset: the dataset ID.
Returns
dict[str, str | dict[str, str]]
The available range for the dataset, given as start and end timestamps.
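A minimal example call (the dataset is illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
print(client.metadata.get_dataset_range(dataset="GLBX.MDP3"))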
{
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z",
"schema": {
"mbo": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"mbp-1": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"mbp-10": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"bbo-1s": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"bbo-1m": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"tbbo": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"trades": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"ohlcv-1s": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"ohlcv-1m": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"ohlcv-1h": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"ohlcv-1d": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"definition": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"statistics": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"status": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
},
"imbalance": {
"start":"2018-05-01T00:00:00.000000000Z",
"end":"2025-01-30T00:00:00.000000000Z"
}
}
}
Historical.metadata.get_record_count
Get the record count of the time series data query.
This method may not be accurate for time ranges that are not discrete multiples of 10 minutes, potentially over-reporting the number of records in such cases. The definition schema is only accurate for discrete multiples of 24 hours.
Parameters
- dataset: the dataset ID.
- start: the start of the request time range (inclusive). Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
- end: the end of the request time range (exclusive). Takes the same types as start. Defaults to the forward filled value of start based on the resolution provided.
- symbols: the instruments to request. If 'ALL_SYMBOLS' or None then will select all symbols.
- schema: the data record schema.
- stype_in: the input symbology type of symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
- limit: the maximum number of records to count. If None then no limit.
Returns
int
The record count.
Historical.metadata.get_record_count(
dataset: Dataset | str,
start: pd.Timestamp | datetime | date | str | int,
end: pd.Timestamp | datetime | date | str | int | None = None,
symbols: Iterable[str | int] | str | int | None = None,
schema: Schema | str = "trades",
stype_in: SType | str = "raw_symbol",
limit: int | None = None,
) -> int
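A minimal example call (dataset, symbols, and dates are illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
count = client.metadata.get_record_count(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
    end="2022-06-10",
)
print(count)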
Historical.metadata.get_billable_size
Get the billable uncompressed raw binary size for historical streaming or batched files.
This method may not be accurate for time ranges that are not discrete multiples of 10 minutes, potentially over-reporting the size in such cases. The definition schema is only accurate for discrete multiples of 24 hours.
Info: The amount billed will be based on the actual amount of bytes sent; see our pricing documentation for more details.
Parameters
- dataset: the dataset ID.
- start: the start of the request time range (inclusive). Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
- end: the end of the request time range (exclusive). Takes the same types as start. Defaults to the forward filled value of start based on the resolution provided.
- symbols: the instruments to request. If 'ALL_SYMBOLS' or None then will select all symbols.
- schema: the data record schema.
- stype_in: the input symbology type of symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
- limit: the maximum number of records to include. If None then no limit.
Returns
int
The size in number of bytes used for billing.
Historical.metadata.get_billable_size(
dataset: Dataset | str,
start: pd.Timestamp | datetime | date | str | int,
end: pd.Timestamp | datetime | date | str | int | None = None,
symbols: Iterable[str | int] | str | int | None = None,
schema: Schema | str = "trades",
stype_in: SType | str = "raw_symbol",
limit: int | None = None,
) -> int
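A minimal example call (dataset, symbols, and dates are illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
size = client.metadata.get_billable_size(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
    end="2022-06-10",
)
print(size)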
Historical.metadata.get_cost
Get the cost in US dollars for a historical streaming or batch download request. This cost respects any discounts provided by flat rate plans.
This method may not be accurate for time ranges that are not discrete multiples of 10 minutes, potentially over-reporting the cost in such cases. The definition schema is only accurate for discrete multiples of 24 hours.
Info: The amount billed will be based on the actual amount of bytes sent; see our pricing documentation for more details.
Parameters
- dataset: the dataset ID.
- start: the start of the request time range (inclusive). Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
- end: the end of the request time range (exclusive). Takes the same types as start. Defaults to the forward filled value of start based on the resolution provided.
- mode: the data feed mode (default "historical-streaming").
- symbols: the instruments to request. If 'ALL_SYMBOLS' or None then will select all symbols.
- schema: the data record schema.
- stype_in: the input symbology type of symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
- limit: the maximum number of records to include. If None then no limit.
Returns
float
The cost in US dollars.
Historical.metadata.get_cost(
dataset: Dataset | str,
start: pd.Timestamp | datetime | date | str | int,
end: pd.Timestamp | datetime | date | str | int | None = None,
mode: FeedMode | str = "historical-streaming",
symbols: Iterable[str | int] | str | int | None = None,
schema: Schema | str = "trades",
stype_in: SType | str = "raw_symbol",
limit: int | None = None,
) -> float
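A minimal example call showing the mode parameter (dataset, symbols, and dates are illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
cost = client.metadata.get_cost(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
    end="2022-06-10",
    mode="historical-streaming",
)
print(cost)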
Historical.timeseries.get_range
Makes a streaming request for time series data from Databento.
This is the primary method for getting historical market data, instrument definitions, and status data directly into your application.
This method only returns after all of the data has been downloaded, which can take a long time. For large requests, consider using batch.submit_job instead.
Parameters
- dataset: the dataset ID.
- start: the start of the request time range (inclusive), based on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
- end: the end of the request time range (exclusive), based on ts_recv if it exists in the schema, otherwise ts_event. Takes the same types as start and assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
- symbols: the instruments to request. If 'ALL_SYMBOLS' or None then will select all symbols.
- schema: the data record schema.
- stype_in: the input symbology type of symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
- stype_out: the output symbology type.
- limit: the maximum number of records to return. If None then no limit.
- path: an optional file path to also write the streamed data to.
Returns
A DBNStore object.
A full list of fields for each schema is available through Historical.metadata.list_fields.
Historical.timeseries.get_range(
dataset: Dataset | str,
start: pd.Timestamp | datetime | date | str | int,
end: pd.Timestamp | datetime | date | str | int | None = None,
symbols: Iterable[str | int] | str | int | None = None,
schema: Schema | str = "trades",
stype_in: SType | str = "raw_symbol",
stype_out: SType | str = "instrument_id",
limit: int | None = None,
path: PathLike[str] | str | None = None,
) -> DBNStore
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
dataset="GLBX.MDP3",
symbols=["ESM2"],
schema="trades",
start="2022-06-06T00:00:00",
end="2022-06-10T00:10:00",
limit=1,
)
df = data.to_df()
print(df.iloc[0].to_json(indent=4))
Historical.timeseries.get_range_async
Asynchronously request a historical time series data stream from Databento.
Primary method for getting historical intraday market data, daily data, instrument definitions and market status data directly into your application.
This method only returns after all of the data has been downloaded, which can take a long time. For large requests, consider using batch.submit_job instead.
Parameters
- dataset: the dataset ID.
- start: the start of the request time range (inclusive), based on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
- end: the end of the request time range (exclusive), based on ts_recv if it exists in the schema, otherwise ts_event. Takes the same types as start and assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
- symbols: the instruments to request. If 'ALL_SYMBOLS' or None then will select all symbols.
- schema: the data record schema.
- stype_in: the input symbology type of symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
- stype_out: the output symbology type.
- limit: the maximum number of records to return. If None then no limit.
- path: an optional file path to also write the streamed data to.
Returns
A DBNStore object.
A full list of fields for each schema is available through Historical.metadata.list_fields.
Historical.timeseries.get_range_async(
dataset: Dataset | str,
start: pd.Timestamp | datetime | date | str | int,
end: pd.Timestamp | datetime | date | str | int | None = None,
symbols: Iterable[str | int] | str | int | None = None,
schema: Schema | str = "trades",
stype_in: SType | str = "raw_symbol",
stype_out: SType | str = "instrument_id",
limit: int | None = None,
path: PathLike[str] | str | None = None,
) -> Awaitable[DBNStore]
import asyncio
import databento as db
client = db.Historical("$YOUR_API_KEY")
coro = client.timeseries.get_range_async(
dataset="GLBX.MDP3",
symbols=["ESM2"],
schema="trades",
start="2022-06-06T00:00:00",
end="2022-06-10T00:10:00",
limit=1,
)
data = asyncio.run(coro)
df = data.to_df()
print(df.iloc[0].to_json(indent=4))
Historical.symbology.resolve
Resolve a list of symbols from an input symbology type to an output symbology type.
Take, for example, a raw symbol to an instrument ID: ESM2 → 3403.
Parameters
- dataset: the dataset ID.
- symbols: the symbols to resolve. Use 'ALL_SYMBOLS' to request all symbols (not available for every dataset).
- stype_in: the input symbology type of symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
- stype_out: the output symbology type.
- start_date: the start date (UTC) of the request range (inclusive).
- end_date: the end date (UTC) of the request range (exclusive). Defaults to the forward filled value of start_date based on the resolution provided.
Returns
dict[str, Any]
The results for the symbology resolution.
See also: For more information on symbology resolution, visit our symbology documentation.
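A minimal example call matching the response below:
import databento as db

client = db.Historical("$YOUR_API_KEY")
result = client.symbology.resolve(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    stype_in="raw_symbol",
    stype_out="instrument_id",
    start_date="2022-06-01",
    end_date="2022-06-30",
)
print(result)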
{
"result": {
"ESM2": [
{
"d0": "2022-06-01",
"d1": "2022-06-26",
"s": "3403"
}
]
},
"symbols": [
"ESM2"
],
"stype_in": "raw_symbol",
"stype_out": "instrument_id",
"start_date": "2022-06-01",
"end_date": "2022-06-30",
"partial": [],
"not_found": [],
"message": "OK",
"status": 0
}
Batch downloads
Batch downloads allow you to download data files directly from within your portal. For more information, see Streaming vs. batch download.
Historical.batch.submit_job
Make a batch download job request.
Once a request is submitted, our system processes the request and prepares the batch files in the background. The status of your request and the files can be accessed from the Download center in your user portal.
This method takes longer than a streaming request, but is advantageous for larger requests as it supports delivery mechanisms that allow multiple accesses of the data without additional cost for each subsequent download after the first.
Related: batch.list_jobs.
Parameters
- dataset: the dataset ID.
- symbols: the instruments to request. If 'ALL_SYMBOLS' or None then will select all symbols.
- schema: the data record schema.
- start: the start of the request time range (inclusive), based on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
- end: the end of the request time range (exclusive), based on ts_recv if it exists in the schema, otherwise ts_event. Takes the same types as start. Defaults to the forward filled value of start based on the resolution provided.
- encoding: the data encoding (default "dbn").
- compression: the data compression format (default "zstd").
- pretty_px: if True, prices are formatted in a human-readable decimal form (default False).
- pretty_ts: if True, timestamps are formatted in a human-readable ISO 8601 form (default False).
- map_symbols: if True, a symbol field is appended to every record (default False).
- split_symbols: if True, the output is split by symbol. Cannot be used with 'ALL_SYMBOLS' or with limit.
- split_duration: the maximum duration of data in each file (default "day").
- split_size: the maximum size of each file. If None then no size-based splitting.
- delivery: the delivery mechanism (default "download").
- stype_in: the input symbology type of symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
- stype_out: the output symbology type.
- limit: the maximum number of records to include. If None then no limit. Cannot be used with split_symbols.
Returns
dict[str, Any]
The description of the submitted batch job.
Some fields (such as cost_usd, record_count, and the timestamps after ts_received) will be None until the job is done processing.
Historical.batch.submit_job(
dataset: Dataset | str,
symbols: Iterable[str | int] | str | int,
schema: Schema | str,
start: pd.Timestamp | datetime | date | str | int,
end: pd.Timestamp | datetime | date | str | int | None = None,
encoding: Encoding | str = "dbn",
compression: Compression | str = "zstd",
pretty_px: bool = False,
pretty_ts: bool = False,
map_symbols: bool = False,
split_symbols: bool = False,
split_duration: Duration | str = "day",
split_size: int | None = None,
delivery: Delivery | str = "download",
stype_in: SType | str = "raw_symbol",
stype_out: SType | str = "instrument_id",
limit: int | None = None,
) -> dict[str, Any]
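A minimal example call matching the response below (dataset, symbols, and dates are illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
job = client.batch.submit_job(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T12:00",
    end="2022-06-10",
)
print(job)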
{
"id": "GLBX-20221217-MN5S5S4WAS",
"user_id": "NBPDLF33",
"api_key": "prod-001",
"cost_usd": None,
"dataset": "GLBX.MDP3",
"symbols": "ESM2",
"stype_in": "raw_symbol",
"stype_out": "instrument_id",
"schema": "trades",
"start": "2022-06-06T12:00:00.000000000Z",
"end": "2022-06-10T00:00:00.000000000Z",
"limit": None,
"encoding": "dbn",
"compression": "zstd",
"pretty_px": False,
"pretty_ts": False,
"map_symbols": False,
"split_symbols": False,
"split_duration": "day",
"split_size": None,
"packaging": None,
"delivery": "download",
"record_count": None,
"billed_size": None,
"actual_size": None,
"package_size": None,
"state": "queued",
"ts_received": "2022-12-17T00:36:37.844913000Z",
"ts_queued": None,
"ts_process_start": None,
"ts_process_done": None,
"ts_expiration": None
}
Historical.batch.list_jobs
List batch job details for the user account.
The job details will be sorted in order of ts_received.
Related: Download center.
Parameters
Returns
list[dict[str, Any]]
A list of batch job details. See batch.submit_job for a detailed list of returned values.
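A minimal example call:
import databento as db

client = db.Historical("$YOUR_API_KEY")
jobs = client.batch.list_jobs()
print(jobs)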
[
{
"id": "GLBX-20221126-DBVXWPJJQN",
"user_id": "NBPDLF33",
"api_key": "prod-001",
"cost_usd": 23.6454,
"dataset": "GLBX.MDP3",
"symbols": "ZC.FUT,ES.FUT",
"stype_in": "parent",
"stype_out": "instrument_id",
"schema": "mbo",
"start": "2022-10-24T00:00:00.000000000Z",
"end": "2022-11-24T00:00:00.000000000Z",
"limit": None,
"encoding": "csv",
"compression": "zstd",
"pretty_px": False,
"pretty_ts": False,
"map_symbols": False,
"split_symbols": False,
"split_duration": "day",
"split_size": None,
"packaging": None,
"delivery": "download",
"record_count": 412160224,
"billed_size": 23080972544,
"actual_size": 8144595219,
"package_size": 8144628684,
"state": "done",
"ts_received": "2022-11-26T09:23:17.519708000Z",
"ts_queued": "2022-12-03T14:34:57.897790000Z",
"ts_process_start": "2022-12-03T14:35:00.495167000Z",
"ts_process_done": "2022-12-03T14:48:15.710116000Z",
"ts_expiration": "2023-01-02T14:48:15.710116000Z",
"progress": 100
},
...
Historical.batch.list_files
List files for a batch job.
Will include the manifest.json, the metadata.json, and batched data files.
Related: Download center.
Parameters
- job_id: the batch job identifier.
Returns
list[dict[str, Any]]
The file details for the batch job.
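A minimal example call (the job ID is illustrative, matching the response below):
import databento as db

client = db.Historical("$YOUR_API_KEY")
files = client.batch.list_files(job_id="GLBX-20230203-WF9WJYSCDU")
print(files)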
[
{
"filename": "metadata.json",
"size": 1102,
"hash": "sha256:0168d53e1705b69b1d6407f10bb3ab48aac492fa0f68f863cc9b092931cc67a7",
"urls": {
"https": "https://api.databento.com/v0/batch/download/46PCMCVF/GLBX-20230203-WF9WJYSCDU/metadata.json",
"ftp": "ftp://ftp.databento.com/46PCMCVF/GLBX-20230203-WF9WJYSCDU/metadata.json",
}
},
{
"filename": "glbx-mdp3-20220610.mbo.csv.zst",
"size": 21832,
"hash": "sha256:1218930af153b4953632216044ef87607afa467fc7ab7fbb1f031fceacf9d52a",
"urls": {
"https": "https://api.databento.com/v0/batch/download/46PCMCVF/GLBX-20230203-WF9WJYSCDU/glbx-mdp3-20220610.mbo.csv.zst",
"ftp": "ftp://ftp.databento.com/46PCMCVF/GLBX-20230203-WF9WJYSCDU/glbx-mdp3-20220610.mbo.csv.zst",
}
}
]
Historical.batch.download
Download a batch job or a specific file to {output_dir}/{job_id}/.
Will automatically generate any necessary directories if they do not already exist.
Related: Download center.
Parameters
- job_id: the batch job identifier.
- output_dir: the directory to download to. If None, defaults to the current working directory.
- filename_to_download: a specific file to download. If None then will download all files for the batch job.
- keep_zip: if True, and filename_to_download is None, all job files will be saved as a .zip archive in the output_dir.
Returns
list[Path]
A list of paths to the downloaded files.
import databento as db
client = db.Historical("$YOUR_API_KEY")
# Download all files for the batch job
client.batch.download(
job_id="GLBX-20220610-5DEFXVTMSM",
output_dir="my_data/",
)
# Alternatively, you can download a specific file
client.batch.download(
job_id="GLBX-20220610-5DEFXVTMSM",
output_dir="my_data/",
filename_to_download="metadata.json",
)
Historical.batch.download_async
Asynchronously download a batch job or a specific file to {output_dir}/{job_id}/.
Will automatically generate any necessary directories if they do not already exist.
Related: Download center.
Parameters
- job_id: the batch job identifier.
- output_dir: the directory to download to. If None, defaults to the current working directory.
- filename_to_download: a specific file to download. If None then will download all files for the batch job.
- keep_zip: if True, and filename_to_download is None, all job files will be saved as a .zip archive in the output_dir.
Returns
list[Path]
A list of paths to the downloaded files.
import asyncio
import databento as db
client = db.Historical("$YOUR_API_KEY")
# Download all files for the batch job
coro = client.batch.download_async(
job_id="GLBX-20220610-5DEFXVTMSM",
output_dir="my_data/",
)
asyncio.run(coro)
# Alternatively, you can download a specific file
coro = client.batch.download_async(
job_id="GLBX-20220610-5DEFXVTMSM",
output_dir="my_data/",
filename_to_download="metadata.json",
)
asyncio.run(coro)
DBNStore
The DBNStore object is an I/O helper class for working with DBN-encoded data. Typically, this object is created when performing historical requests. However, it can also be created directly from DBN data on disk or in memory using the provided factory methods:
Attributes
- schema: the data record schema of the DBNStore. If None, the DBNStore may contain multiple schemas.
- stype_in: the input symbology type. If None, the DBNStore may contain mixed STypes.
- end: the query end time. If None, the DBNStore data was created without a known end time.
DBNStore.from_bytes
Read data from a DBN byte stream.
Parameters
- data: the DBN bytes or byte stream to read from.
Returns
A DBNStore object.
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
dataset="GLBX.MDP3",
symbols=["ESM2"],
schema="trades",
start="2022-06-06",
)
# Save streamed data to .dbn.zst
path = "GLBX-ESM2-20220606.trades.dbn.zst"
data.to_file(path)
# Open saved data as a byte stream.
with open(path, "rb") as saved:
stored_data = db.DBNStore.from_bytes(saved)
# Convert to dataframe
df = stored_data.to_df()
print(df.head())
ts_event rtype publisher_id instrument_id action ... size flags ts_in_delta sequence symbol
ts_recv ...
2022-06-06 00:00:00.070314216+00:00 2022-06-06 00:00:00.070033767+00:00 0 1 3403 T ... 1 0 18681 157862 ESM2
2022-06-06 00:00:00.090544076+00:00 2022-06-06 00:00:00.089830441+00:00 0 1 3403 T ... 1 0 18604 157922 ESM2
2022-06-06 00:00:00.807324169+00:00 2022-06-06 00:00:00.807018955+00:00 0 1 3403 T ... 4 0 18396 158072 ESM2
2022-06-06 00:00:01.317722490+00:00 2022-06-06 00:00:01.317385867+00:00 0 1 3403 T ... 1 0 22043 158111 ESM2
2022-06-06 00:00:01.317736158+00:00 2022-06-06 00:00:01.317385867+00:00 0 1 3403 T ... 7 0 17280 158112 ESM2
[5 rows x 13 columns]
DBNStore.from_file
Read data from a DBN file.
See also: databento.read_dbn is an alias for DBNStore.from_file.
Parameters
- path: the path to the DBN file to read.
Returns
A DBNStore object.
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
dataset="GLBX.MDP3",
symbols=["ESM2"],
schema="trades",
start="2022-06-06",
)
# Save streamed data to .dbn.zst
path = "GLBX-ESM2-20220606.trades.dbn.zst"
data.to_file(path)
# Read saved .dbn.zst
stored_data = db.DBNStore.from_file(path)
# Convert to dataframe
df = stored_data.to_df()
print(df.head())
ts_event rtype publisher_id instrument_id action ... size flags ts_in_delta sequence symbol
ts_recv ...
2022-06-06 00:00:00.070314216+00:00 2022-06-06 00:00:00.070033767+00:00 0 1 3403 T ... 1 0 18681 157862 ESM2
2022-06-06 00:00:00.090544076+00:00 2022-06-06 00:00:00.089830441+00:00 0 1 3403 T ... 1 0 18604 157922 ESM2
2022-06-06 00:00:00.807324169+00:00 2022-06-06 00:00:00.807018955+00:00 0 1 3403 T ... 4 0 18396 158072 ESM2
2022-06-06 00:00:01.317722490+00:00 2022-06-06 00:00:01.317385867+00:00 0 1 3403 T ... 1 0 22043 158111 ESM2
2022-06-06 00:00:01.317736158+00:00 2022-06-06 00:00:01.317385867+00:00 0 1 3403 T ... 7 0 17280 158112 ESM2
[5 rows x 13 columns]
DBNStore.reader
Return an I/O reader for the data.
Returns
A raw IO stream for reading the DBNStore data.
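A minimal sketch (the file name is illustrative):
import databento as db

data = db.DBNStore.from_file("GLBX-ESM2-20220606.trades.dbn.zst")
# Read the first 16 bytes of the raw DBN stream
print(data.reader.read(16))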
DBNStore.replay
Replay data by passing records sequentially to the given callback.
Refer to the List of fields by schema article for documentation on the fields contained with each record type.
Parameters
- callback: the callable that will be passed each record.
Returns
None
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
dataset="GLBX.MDP3",
symbols=["ESM2"],
start="2022-06-06",
)
def print_large_trades(trade):
size = getattr(trade, "size", 0)
if size >= 200:
print(trade)
data.replay(print_large_trades)
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524078339857609 }, price: 4164.000000000, size: 291, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524078342408839, ts_in_delta: 20352, sequence: 3605032 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524133736900455 }, price: 4160.000000000, size: 216, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524133737794739, ts_in_delta: 28024, sequence: 3659203 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654538295588752739 }, price: 4140.000000000, size: 200, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654538295589900967, ts_in_delta: 21708, sequence: 10031624 }
DBNStore.request_full_definitions
Request for full instrument Definition(s) for all symbols based on the metadata properties. This is useful for retrieving the instrument definitions for saved DBN data.
A timeseries.get_range request is made to obtain the definitions data which will incur a cost.
Parameters
- client: the Historical client used to make the request (the request parameters are taken from the metadata of the DBNStore).
Returns
A DBNStore object.
A full list of fields for each schema is available through Historical.metadata.list_fields.
import databento as db
client = db.Historical(
key="$YOUR_API_KEY",
)
trades = client.timeseries.get_range(
dataset="GLBX.MDP3",
symbols=["ES.FUT"],
stype_in="parent",
schema="trades",
start="2022-06-06",
)
definitions = trades.request_full_definitions(client).to_df()
definitions = definitions.sort_values(["expiration", "symbol"]).set_index("expiration")
print(definitions[["symbol"]])
symbol
expiration
2022-06-17 13:30:00+00:00 ESM2
2022-06-17 13:30:00+00:00 ESM2-ESH3
2022-06-17 13:30:00+00:00 ESM2-ESM3
2022-06-17 13:30:00+00:00 ESM2-ESU2
2022-06-17 13:30:00+00:00 ESM2-ESZ2
2022-09-16 13:30:00+00:00 ESU2
2022-09-16 13:30:00+00:00 ESU2-ESH3
2022-09-16 13:30:00+00:00 ESU2-ESM3
2022-09-16 13:30:00+00:00 ESU2-ESU3
2022-09-16 13:30:00+00:00 ESU2-ESZ2
2022-12-16 14:30:00+00:00 ESZ2
2022-12-16 14:30:00+00:00 ESZ2-ESH3
2022-12-16 14:30:00+00:00 ESZ2-ESM3
2022-12-16 14:30:00+00:00 ESZ2-ESU3
2023-03-17 13:30:00+00:00 ESH3
2023-03-17 13:30:00+00:00 ESH3-ESM3
2023-03-17 13:30:00+00:00 ESH3-ESU3
2023-03-17 13:30:00+00:00 ESH3-ESZ3
2023-06-16 13:30:00+00:00 ESM3
2023-06-16 13:30:00+00:00 ESM3-ESU3
2023-06-16 13:30:00+00:00 ESM3-ESZ3
2023-09-15 13:30:00+00:00 ESU3
2023-09-15 13:30:00+00:00 ESU3-ESH4
2023-09-15 13:30:00+00:00 ESU3-ESZ3
2023-12-15 14:30:00+00:00 ESZ3
2023-12-15 14:30:00+00:00 ESZ3-ESH4
2024-03-15 13:30:00+00:00 ESH4
2024-03-15 13:30:00+00:00 ESH4-ESM4
2024-06-21 13:30:00+00:00 ESM4
2024-09-20 13:30:00+00:00 ESU4
2024-12-20 14:30:00+00:00 ESZ4
2025-12-19 14:30:00+00:00 ESZ5
2026-12-18 14:30:00+00:00 ESZ6
DBNStore.request_symbology
Request to resolve symbology mappings based on the metadata properties.
Parameters
- client: the Historical client used to make the symbology request.
Returns
dict[str, Any]
A result including a map of input symbol to output symbol across a date range.
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
dataset="GLBX.MDP3",
symbols=["ESM2"],
schema="trades",
start="2022-06-06",
)
# Save streamed data to .dbn.zst
data.to_file("GLBX-ESM2-20201229.trades.dbn.zst")
# Read saved .dbn.zst
stored_data = db.DBNStore.from_file("GLBX-ESM2-20201229.trades.dbn.zst")
# Request symbology from .dbn.zst metadata
symbology = stored_data.request_symbology(client=client)
print(symbology)
{
"result": {
"ESM2": [
{
"d0": "2022-06-06",
"d1": "2022-06-07",
"s": "3403"
}
]
},
"symbols": [
"ESM2"
],
"stype_in": "raw_symbol",
"stype_out": "instrument_id",
"start_date": "2022-06-06",
"end_date": "2022-06-07",
"partial": [],
"not_found": [],
"message": "OK",
"status": 0
}
DBNStore.to_csv
Write data to a file in CSV format.
Parameters
- path: the file path to write to.
- pretty_px: if True, prices are formatted in a human-readable decimal form.
- pretty_ts: if True, timestamps are converted to pandas.Timestamp (UTC).
- map_symbols: if True, a symbol column is appended to each record.
- schema: the data record schema to write. Only required for a DBNStore with mixed record types.
Returns
None
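A minimal sketch (file names are illustrative):
import databento as db

data = db.DBNStore.from_file("GLBX-ESM2-20220606.trades.dbn.zst")
# Write the records out in CSV format
data.to_csv("GLBX-ESM2-20220606.trades.csv")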
DBNStore.to_df
Converts data to a pandas DataFrame.
Info: The DataFrame index will be set to ts_recv if it exists in the schema, otherwise it will be set to ts_event.
See also: While not optimized for use with live data due to their column-oriented format, pandas DataFrames can still be used with live data by first streaming DBN data to a file, then converting to a DataFrame with DBNStore.from_file().to_df(). See this example for more information.
Parameters
- price_type: the price representation. If "fixed", prices will have a type of int in fixed decimal format, with each unit representing 1e-9 or 0.000000001. If "float", prices will have a type of float. If "decimal", prices will be instances of decimal.Decimal.
- pretty_ts: if True, timestamps are converted to pandas.Timestamp. The timezone can be specified using the tz parameter.
- map_symbols: if True, a symbol column is appended to each record.
- schema: the data record schema to convert. Only required for a DBNStore with mixed record types.
- tz: the timezone for timestamps. If pretty_ts is True, all timestamps will be converted to the specified timezone.
- count: the maximum number of records per DataFrame. If set, instead of a DataFrame a DataFrameIterator instance will be returned. When iterated, this object will yield a DataFrame with at most count elements until the entire contents of the DBNStore are exhausted.
Returns
A pandas DataFrame object.
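A minimal example producing a DataFrame like the one below (dataset, symbol, and dates are illustrative):
import databento as db

client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-03-06",
    limit=5,
)
df = data.to_df()
print(df)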
ts_event rtype publisher_id instrument_id action ... size flags ts_in_delta sequence symbol
ts_recv ...
2022-03-06 23:00:00.039463300+00:00 2022-03-06 23:00:00.036436177+00:00 0 1 3403 T ... 1 0 18828 5178 ESM2
2022-03-06 23:00:01.098111252+00:00 2022-03-06 23:00:01.097477845+00:00 0 1 3403 T ... 1 0 19122 6816 ESM2
2022-03-06 23:00:04.612334175+00:00 2022-03-06 23:00:04.611714663+00:00 0 1 3403 T ... 1 0 18687 10038 ESM2
2022-03-06 23:00:04.613776789+00:00 2022-03-06 23:00:04.613240435+00:00 0 1 3403 T ... 1 0 18452 10045 ESM2
2022-03-06 23:00:06.881864467+00:00 2022-03-06 23:00:06.880575603+00:00 0 1 3403 T ... 1 0 18478 11343 ESM2
[5 rows x 13 columns]
DBNStore.to_file
Write data to a DBN file.
Parameters
- path: the file path to write to.
- compression: the output compression. If None, uses the same compression as the underlying data.
Returns
A DBNStore object.
DBNStore.to_json
Write data to a file in JSON format.
Parameters
- path: the file path to write to.
- pretty_px: if True, prices are formatted in a human-readable decimal form.
- pretty_ts: if True, timestamps are converted to pandas.Timestamp (UTC).
- map_symbols: if True, a symbol key is appended to each record.
- schema: the data record schema to write. Only required for a DBNStore with mixed record types.
Returns
None
DBNStore.to_ndarray
Converts data to a numpy N-dimensional array. Each element will contain a Python representation of the binary fields as a Tuple.
Parameters
- schema: the data record schema to convert. Only required for a DBNStore with mixed record types.
- count: the maximum number of records per array. If set, instead of an np.ndarray a NDArrayIterator instance will be returned. When iterated, this object will yield a np.ndarray with at most count elements until the entire contents of the DBNStore are exhausted.
Returns
A numpy.ndarray object.
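A brief sketch of both forms, reusing the trades data from the earlier examples; the chunk size is arbitrary.
# Materialize all records as a single structured array
arr = data.to_ndarray()
print(arr[0])  # one record as a tuple of its binary fields
# Or consume the store in bounded chunks to limit memory usage
for chunk in data.to_ndarray(count=1000):
    print(len(chunk))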
DBNStore.to_parquet
Write data to a file in Apache Parquet format.
Parameters
price_type
The price type to use for price fields. If "fixed", prices will have a type of int in fixed decimal format; each unit representing 1e-9 or 0.000000001. If "float", prices will have a type of float.
pretty_ts
If True, timestamp fields are converted to pyarrow.TimestampType (UTC).
schema
The schema to use; only required when reading a DBNStore with mixed record types.
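Continuing the same sketch; the filename is illustrative.
# Write all records to an Apache Parquet file
data.to_parquet("GLBX-ESM2-20220606.trades.parquet")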
DBNStore.__iter__
Iterate over records using a for loop; records will be yielded one at a time. Iteration will stop when there are no more records in the DBNStore instance.
Refer to the List of fields by schema article for documentation on the fields contained within each record type.
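For example, iterating over the trades store loaded in the DBNStore.request_symbology example prints one record per line, producing output like the following.
for record in stored_data:
    print(record)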
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524078339857609 }, price: 4164.000000000, size: 291, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524078342408839, ts_in_delta: 20352, sequence: 3605032 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524133736900455 }, price: 4160.000000000, size: 216, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524133737794739, ts_in_delta: 28024, sequence: 3659203 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654538295588752739 }, price: 4140.000000000, size: 200, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654538295589900967, ts_in_delta: 21708, sequence: 10031624 }
DBNStore.insert_symbology_json
Insert JSON symbology data, which may be obtained from a symbology request or loaded from a file.
Parameters
json_data
The JSON symbology data to insert, as a dict, string, or file-like object.
clear_existing
If True, any existing symbology data in the DBNStore is cleared before inserting.
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
dataset="XNAS.ITCH",
symbols=["ALL_SYMBOLS"],
schema="trades",
start="2022-06-06",
end="2022-06-07",
)
# Request symbology for all symbols and then insert this data
symbology_json = data.request_symbology(client)
data.insert_symbology_json(symbology_json, clear_existing=True)
map_symbols_csv
Use a symbology.json file to map a symbols column onto an existing CSV file. The result is written to out_file.
Parameters
symbology_file
Path to the symbology.json file to use as a symbology source.
csv_file
Path to the CSV file to map; it must contain a ts_recv or ts_event column and an instrument_id column.
out_file
Path to write the mapped CSV file to; if not given, _mapped will be appended to the csv_file name.
Returns
Path to the written file.
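A sketch that produces output like the sample below, assuming daily OHLCV bars from XNAS.ITCH and a package-level db.map_symbols_csv helper; the filenames are illustrative, and the pretty_px, pretty_ts, and map_symbols flags are assumptions set to False to match the raw integer output shown.
import json
import databento as db

client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
    dataset="XNAS.ITCH",
    symbols=["AMZN", "NVDA", "TSLA", "NOC"],
    schema="ohlcv-1d",
    start="2023-08-21",
    end="2023-08-25",
)
# Write the bars without a symbol column, keeping raw integer fields
data.to_csv("xnas-ohlcv-1d.csv", pretty_px=False, pretty_ts=False, map_symbols=False)
# Save the symbology mappings to use as the symbology source file
with open("symbology.json", "w") as f:
    json.dump(data.request_symbology(client), f)
# Append a symbol column resolved per instrument_id and date
out_path = db.map_symbols_csv("symbology.json", "xnas-ohlcv-1d.csv")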
ts_event,rtype,publisher_id,instrument_id,open,high,low,close,volume,symbol
1692576000000000000,35,2,523,133550000000,135200000000,132710000000,134360000000,11015261,AMZN
1692576000000000000,35,2,7290,439090000000,472900000000,437260000000,470990000000,11098972,NVDA
1692576000000000000,35,2,10157,217000000000,233500000000,217000000000,233380000000,21336884,TSLA
1692576000000000000,35,2,7130,430170000000,434700000000,0,433000000000,68661,NOC
1692662400000000000,35,2,7132,431610000000,438480000000,0,437740000000,86950,NOC
1692662400000000000,35,2,10161,236710000000,241550000000,229560000000,232870000000,17069349,TSLA
1692662400000000000,35,2,523,135320000000,135900000000,133740000000,134200000000,8133698,AMZN
1692662400000000000,35,2,7293,475120000000,483440000000,453340000000,457910000000,11700447,NVDA
1692748800000000000,35,2,523,135120000000,137860000000,133220000000,137060000000,11430081,AMZN
1692748800000000000,35,2,7296,463190000000,518870000000,452080000000,502150000000,13361964,NVDA
1692748800000000000,35,2,7135,0,440000000000,0,434030000000,68411,NOC
1692748800000000000,35,2,10163,236590000000,243750000000,226500000000,240810000000,13759234,TSLA
1692835200000000000,35,2,7287,508190000000,512600000000,466720000000,468700000000,21611800,NVDA
1692835200000000000,35,2,10154,242000000000,244140000000,228180000000,229980000000,13724054,TSLA
1692835200000000000,35,2,7126,433650000000,437730000000,0,429880000000,54570,NOC
1692835200000000000,35,2,521,137250000000,137820000000,131410000000,131880000000,11990573,AMZN
map_symbols_json
Use a symbology.json file to insert a symbol key into the records of an existing JSON file. The result is written to out_file.
Parameters
symbology_file
Path to the symbology.json file to use as a symbology source.
json_file
Path to the JSON file to map.
out_file
Path to write the mapped JSON file to; if not given, _mapped will be appended to the json_file name.
Returns
Path to the written file.
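Continuing the map_symbols_csv sketch above for JSON output; db.map_symbols_json at package level and the map_symbols flag are again assumptions.
# Write the records without symbols, then map them in place
data.to_json("xnas-ohlcv-1d.json", map_symbols=False)
out_path = db.map_symbols_json("symbology.json", "xnas-ohlcv-1d.json")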
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":523},"open":"133550000000","high":"135200000000","low":"132710000000","close":"134360000000","volume":"11015261","symbol":"AMZN"}
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":7290},"open":"439090000000","high":"472900000000","low":"437260000000","close":"470990000000","volume":"11098972","symbol":"NVDA"}
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":10157},"open":"217000000000","high":"233500000000","low":"217000000000","close":"233380000000","volume":"21336884","symbol":"TSLA"}
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":7130},"open":"430170000000","high":"434700000000","low":"0","close":"433000000000","volume":"68661","symbol":"NOC"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":10161},"open":"236710000000","high":"241550000000","low":"229560000000","close":"232870000000","volume":"17069349","symbol":"TSLA"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":523},"open":"135320000000","high":"135900000000","low":"133740000000","close":"134200000000","volume":"8133698","symbol":"AMZN"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":7293},"open":"475120000000","high":"483440000000","low":"453340000000","close":"457910000000","volume":"11700447","symbol":"NVDA"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":7132},"open":"431610000000","high":"438480000000","low":"0","close":"437740000000","volume":"86950","symbol":"NOC"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":523},"open":"135120000000","high":"137860000000","low":"133220000000","close":"137060000000","volume":"11430081","symbol":"AMZN"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":7296},"open":"463190000000","high":"518870000000","low":"452080000000","close":"502150000000","volume":"13361964","symbol":"NVDA"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":7135},"open":"0","high":"440000000000","low":"0","close":"434030000000","volume":"68411","symbol":"NOC"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":10163},"open":"236590000000","high":"243750000000","low":"226500000000","close":"240810000000","volume":"13759234","symbol":"TSLA"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":7287},"open":"508190000000","high":"512600000000","low":"466720000000","close":"468700000000","volume":"21611800","symbol":"NVDA"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":10154},"open":"242000000000","high":"244140000000","low":"228180000000","close":"229980000000","volume":"13724054","symbol":"TSLA"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":7126},"open":"433650000000","high":"437730000000","low":"0","close":"429880000000","volume":"54570","symbol":"NOC"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":521},"open":"137250000000","high":"137820000000","low":"131410000000","close":"131880000000","volume":"11990573","symbol":"AMZN"}