
API reference - Historical

Databento's historical data service can be accessed programmatically over its HTTP API. To make it easier to integrate the API, we also provide official client libraries that simplify the code you need to write.

Our HTTP API is designed as a collection of RPC-style methods, which can be called using URLs in the form https://hist.databento.com/v0/METHOD_FAMILY.METHOD.

Our client libraries wrap these HTTP RPC-style methods with more idiomatic interfaces in their respective languages.

You can use our API to stream or load data directly into your application, or to make batch download requests, which instruct our service to prepare the data as flat files that can be downloaded from the Download center.

To get started with the Python client library, install it from PyPI:

$ pip install -U databento
Basics

Overview

Our historical API has the following structure:

  • Metadata provides information about the datasets themselves.
  • Time series provides all types of time series data. This includes subsampled data (second, minute, hour, daily aggregates), trades, top-of-book, order book deltas, order book snapshots, summary statistics, static data and macro indicators. We also provide properties of products such as expirations, tick sizes and symbols as time series data.
  • Symbology provides methods that help find and resolve symbols across different symbology systems.
  • Batch provides a means of submitting and querying for details of batch download requests.

Authentication

Databento uses API keys to authenticate requests. You can view and manage your keys on the API keys page of your portal.

Each API key is a 32-character string starting with db-. By default, our library uses the environment variable DATABENTO_API_KEY as your API key. However, if you pass an API key to the Historical constructor through the key parameter, then this value will be used instead.

Related: Securing your API keys.

Example usage
import databento as db

# Establish connection and authenticate
client = db.Historical("$YOUR_API_KEY")

# Authenticated request
print(client.metadata.list_datasets())

Schemas and conventions

A schema is a data record format represented as a collection of different data fields. Our datasets support multiple schemas, such as order book, trades, bar aggregates, and so on. You can get a dictionary describing the fields of each schema from our List of market data schemas.

You can get a list of all supported schemas for any given dataset using the Historical client's list_schemas method. The same information can also be found on the dataset details pages on the user portal.

The following table provides details about the data types and conventions used for various fields that you will commonly encounter in the data.

Name Field Description
Dataset dataset A unique string name assigned to each dataset by Databento. Full list of datasets can be found from the metadata.
Publisher ID publisher_id A unique 16-bit unsigned integer assigned to each publisher by Databento. Full list of publisher IDs can be found from the metadata.
Instrument ID instrument_id A unique 32-bit unsigned integer assigned to each instrument by the venue. Information about instrument IDs for any given dataset can be found in the symbology.
Order ID order_id A unique 64-bit unsigned integer assigned to each order by the venue.
Timestamp (event) ts_event The matching-engine-received timestamp expressed as the number of nanoseconds since the UNIX epoch.
Timestamp (receive) ts_recv The capture-server-received timestamp expressed as the number of nanoseconds since the UNIX epoch.
Timestamp delta (in) ts_in_delta The matching-engine-sending timestamp expressed as the number of nanoseconds before ts_recv. See timestamping guide.
Timestamp out ts_out The Databento gateway-sending timestamp expressed as the number of nanoseconds since the UNIX epoch. See timestamping guide.
Price price The price expressed as a signed integer where every 1 unit corresponds to 1e-9, i.e. 1/1,000,000,000 or 0.000000001.
Book side side The side that initiates the event. Can be Ask for a sell order (or sell aggressor in a trade), Bid for a buy order (or buy aggressor in a trade), or None where no side is specified by the original source.
Size size The order quantity.
Flag flags A bit field indicating event end, message characteristics, and data quality.
Action action The event type or order book operation. Can be Add, Cancel, Modify, Clear book, Trade, Fill, or None.
Sequence number sequence The original message sequence number from the venue.
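
The sketch below illustrates these conventions by converting a raw fixed-precision price and a nanosecond ts_event into more familiar units. The sample values are made up for illustration.

import pandas as pd

# Hypothetical raw values following the conventions above
raw_price = 4108_500_000_000    # signed integer where 1 unit = 1e-9
ts_event = 1654473600000000000  # nanoseconds since the UNIX epoch

price = raw_price * 1e-9  # 4108.5
event_time = pd.Timestamp(ts_event, unit="ns", tz="UTC")

print(price)       # 4108.5
print(event_time)  # 2022-06-06 00:00:00+00:00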

Datasets

Databento provides time series datasets for a variety of markets, sourced from different publishers. Our available datasets can be browsed through the search feature on our site.

Each dataset is assigned a unique string identifier (dataset ID) in the form PUBLISHER.DATASET, such as GLBX.MDP3. For publishers that are also markets, we use standard four-character ISO 10383 Market Identifier Codes (MIC). Otherwise, Databento arbitrarily assigns a four-character identifier for the publisher.

These dataset IDs are also found on the Data catalog and Download request features of the Databento user portal.

When a publisher provides multiple data products with different levels of granularity, Databento subscribes to the most-granular product. We then provide this dataset with alternate schemas to make it easy to work with the level of detail most appropriate for your application.

More information about different types of venues and publishers is available in our FAQs.

Symbology

Databento's historical API supports several ways to select an instrument in a dataset. An instrument is specified using a symbol and a symbology type, also referred to as an stype. The supported symbology types are:

  • Raw symbology (raw_symbol): the original string symbols used by the publisher in the source data.
  • Instrument ID symbology (instrument_id): unique numeric IDs assigned to each instrument by the publisher.
  • Parent symbology (parent): groups instruments related to the market for the same underlying.
  • Continuous contract symbology (continuous): proprietary symbology that specifies instruments based on certain systematic rules.

When requesting data from our timeseries.get_range or batch.submit_job endpoints, an input and output symbology type can be specified. By default, our client libraries will use raw symbology for the input type and instrument ID symbology for the output type. Not all symbology types are supported for every dataset.

The process of converting from one symbology type to another is called symbology resolution. This conversion can be done at no cost with the symbology.resolve endpoint.

For more about symbology at Databento, see our Standards and conventions.
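
As an illustration, the sketch below requests data using parent symbology; the symbol ES.FUT is an assumed example, and availability of each symbology type depends on the dataset.

import databento as db

client = db.Historical("$YOUR_API_KEY")

# Request all instruments grouped under the assumed parent symbol ES.FUT
data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ES.FUT"],
    stype_in="parent",
    schema="trades",
    start="2022-06-06",
    end="2022-06-07",
)
print(data.to_df().head())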

Encodings

DBN

Databento Binary Encoding (DBN) is an extremely fast message encoding and highly-compressible storage format for normalized market data. It includes a self-describing metadata header and adopts a binary format with zero-copy serialization.

We recommend using our Python, C++, or Rust client libraries to read DBN files locally. A CLI tool is also available for converting DBN files to CSV or JSON.
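
For instance, a DBN file saved to disk can be loaded back with the Python client library; the file name below is a placeholder.

import databento as db

# Load a previously saved, optionally Zstandard-compressed, DBN file
data = db.DBNStore.from_file("glbx-mdp3-trades.dbn.zst")

# Convert the decoded records to a pandas DataFrame
df = data.to_df()
print(df.head())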

CSV

Comma-separated values (CSV) is a simple text file format for tabular data. CSVs can be easily opened in Excel, loaded into pandas data frames, or parsed in C++.

Our CSVs have one header line, followed by one record per line. Lines use UNIX-style \n separators.

JSON

JavaScript Object Notation (JSON) is a flexible text file format with broad language support and wide adoption across web apps.

Our JSON files follow the JSON lines specification, where each line of the file is a JSON record. Lines use UNIX-style \n separators.
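
A minimal sketch for reading such a file in Python, assuming an uncompressed file with the placeholder name trades.json:

import json

records = []
with open("trades.json") as f:
    for line in f:  # one JSON record per line
        records.append(json.loads(line))

print(records[0])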

Compression

Databento provides options for compressing files from our API. Available compression formats depend on the encoding you select.

Zstd

The Zstd compression option uses the Zstandard format.

This option is available for all encodings, and is recommended for faster transfer speeds and smaller files.

You can read Zstandard files in Python using the zstandard package.

Read more about working with Zstandard-compressed files.
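
A minimal sketch for loading a Zstandard-compressed CSV into pandas with the zstandard package; the file name is a placeholder.

import pandas as pd
import zstandard

with open("trades.csv.zst", "rb") as f:
    reader = zstandard.ZstdDecompressor().stream_reader(f)
    df = pd.read_csv(reader)

print(df.head())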

None

The None compression option disables compression entirely, resulting in significantly larger files. However, this can be useful for loading small CSV files directly into Excel.

Dates and times

Our Python client library has several functions with timestamp arguments. These arguments will have type pandas.Timestamp | datetime.date | str | int and support a variety of formats.

It's recommended to use pandas.Timestamp, which fully supports timezones and nanosecond precision. If a datetime.date is used, the time is set to midnight UTC. If an int is provided, the value is interpreted as UNIX nanoseconds.

The client library also handles several string-based timestamp formats based on ISO 8601.

  • yyyy-mm-dd, e.g. "2022-02-28" (midnight UTC)
  • yyyy-mm-ddTHH:MM, e.g. "2022-02-28T23:50"
  • yyyy-mm-ddTHH:MM:SS, e.g. "2022-02-28T23:50:59"
  • yyyy-mm-ddTHH:MM:SS.NNNNNNNNN, e.g. "2022-02-28T23:50:59.123456789"

Timezone specification is also supported.

  • yyyy-mm-ddTHH:MMZ
  • yyyy-mm-ddTHH:MM±hh
  • yyyy-mm-ddTHH:MM±hhmm
  • yyyy-mm-ddTHH:MM±hh:mm
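
As a rough illustration, the following values all express the same instant and could be passed interchangeably as a timestamp argument:

import pandas as pd

# Equivalent ways to express 2022-02-28 23:50 UTC
as_timestamp = pd.Timestamp("2022-02-28T23:50", tz="UTC")
as_string = "2022-02-28T23:50Z"
as_nanoseconds = as_timestamp.value  # integer UNIX nanoseconds

print(as_timestamp, as_string, as_nanoseconds)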

Bare dates

Some parameters require a bare date, without a time. These arguments have type datetime.date | str and must either be a datetime.date object, or a string in yyyy-mm-dd format, e.g. "2022-02-28".

Errors

Our historical API uses HTTP response codes to indicate the success or failure of an API request. The client library provides exceptions that wrap these response codes.

  • 2xx indicates success.
  • 4xx indicates an error on the client side. Represented as a BentoClientError.
  • 5xx indicates an error with Databento's servers. Represented as a BentoServerError.

The full list of the response codes and associated causes is as follows:

Code Message Cause
200 OK Successful request.
206 Partial Content Successful request, with partially resolved symbols.
400 Bad Request Invalid request. Usually due to a missing, malformed or unsupported parameter.
401 Unauthorized Invalid username or API key.
402 Payment Required Issue with your account payment information.
403 Forbidden The API key has insufficient permissions to perform the request.
404 Not Found A resource is not found, or a requested symbol does not exist.
409 Conflict A resource already exists.
422 Unprocessable Entity The request is well formed, but we cannot or will not process the contained instructions.
429 Too Many Requests API rate limit exceeded.
500 Internal Server Error Unexpected condition encountered in our system.
503 Service Unavailable Data gateway is offline or overloaded.
504 Gateway Timeout Data gateway is available but other parts of our system are offline or overloaded.
API method
class databento.BentoError(Exception)
class databento.BentoHttpError(databento.BentoError)
class databento.BentoClientError(databento.BentoHttpError)
class databento.BentoServerError(databento.BentoHttpError)
Example usage
import databento as db

client = db.Historical("INVALID_API_KEY")

try:
    print(client.metadata.list_datasets())
except db.BentoClientError as e:
    print(e)
Example response
400 auth_invalid_username_in_basic_auth
Invalid username in Basic auth ('INVALID_API_KEY').
documentation: https://databento.com/docs

Rate limits

Our historical API applies rate limits to each IP address.

When a request exceeds a rate limit, a BentoClientError exception is raised with a 429 error code.

Retry-After

The Retry-After response header indicates how long the user should wait before retrying.

If you find that your application has been rate-limited, you can retry after waiting for the time specified in the Retry-After header.

If you are using Python, you can use the time.sleep function to wait for the time specified in the Retry-After header, e.g. time.sleep(int(response.headers.get("Retry-After", "1"))).

This code snippet works best for our current APIs with their rate limits. Future APIs may have different rate limits, and might require a different default time delay.
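
A minimal retry sketch against the raw HTTP API, assuming the requests package and using the API key as the Basic auth username; the endpoint shown is for illustration.

import time

import requests

API_KEY = "$YOUR_API_KEY"
URL = "https://hist.databento.com/v0/metadata.list_datasets"

while True:
    response = requests.get(URL, auth=(API_KEY, ""))
    if response.status_code != 429:
        break
    # Back off for the duration suggested by the server before retrying
    time.sleep(int(response.headers.get("Retry-After", "1")))

print(response.json())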

Size limits

There is no size limit for either stream or batch download requests. Batch download is more manageable for large datasets, so we recommend using batch download for requests over 5 GB.

You can also manage the size of your request by splitting it into multiple, smaller requests. The historical API allows you to make stream and batch download requests with time ranges specified up to nanosecond resolution. You can also use the limit parameter in any request to limit the number of data records returned from the service.

Batch download supports different delivery methods which can be specified using the delivery parameter.

Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

job = client.batch.submit_job(
    dataset="GLBX.MDP3",
    symbols="CLZ7",
    schema="trades",
    start="2022-06-06T00:00:00",
    end="2022-06-10T00:10:00",
    limit=10000,
)

Metered pricing

Databento only charges for the data that you use. You can find rates (per MB) for the various datasets and estimate pricing on our Data catalog. We meter the data by its uncompressed size in binary encoding.

When you stream the data, you are billed incrementally for each outbound byte of data sent from our historical gateway. If your connection is interrupted while streaming our data and our historical gateway detects a connection timeout of more than 5 seconds, it will immediately stop sending data and you will not be billed for the remainder of your request.

Duplicate streaming requests will incur repeated charges. If you intend to access the same data multiple times, we recommend using our batch download feature. When you make a batch download request, you are only billed once for the request and, subsequently, you can download the data from the Download center multiple times over 30 days for no additional charge.

You will only be billed for usage of time series data. Access to metadata, symbology, and account management is free. The Historical.metadata.get_cost method can be used to determine cost before you request any data.

Related: Billing management.
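
For example, you might check the cost of a query before streaming it; the budget threshold below is arbitrary.

import databento as db

client = db.Historical("$YOUR_API_KEY")

params = dict(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T00:00:00",
    end="2022-06-10T00:10:00",
)

# Metadata requests such as get_cost are free
cost = client.metadata.get_cost(**params)

if cost < 1.00:  # arbitrary budget in US dollars
    data = client.timeseries.get_range(**params)
    print(data.to_df().head())
else:
    print(f"Query would cost ${cost:.2f}; consider a batch download instead.")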

Versioning

Our historical and live APIs and their client libraries use the MAJOR.MINOR.PATCH format for version numbers. These version numbers conform to semantic versioning. We are currently using major version 0 for initial development, where our API is not considered stable.

Once we release major version 1, our public API will be stable. This means that you will be able to upgrade minor or patch versions to pick up new functionality, without breaking your integration.

Starting with major version 1, we will support each previous major version for one year after the date of the subsequent major release. For example, if version 2.0.0 is released on January 1, 2024, then all 1.x.y versions of the API and client libraries will be deprecated, but they will remain supported until January 1, 2025.

We may introduce backwards-compatible changes between minor versions.

Our Release notes will contain information about both breaking and backwards-compatible changes in each release.

Our API and official client libraries are kept in sync with same-day releases for major versions. For instance, version 1.x.y of the C++ client library provides the same functionality as version 1.x.y of the Python client library.

Related: Release notes.

Client

Historical

To access Databento's historical API, first create an instance of the Historical client. The entire API is exposed through instance methods of the client.

Note that the API key can be passed as a parameter, although this is not recommended for production applications. Instead, you can omit this parameter and pass your API key via the DATABENTO_API_KEY environment variable.

Currently, only BO1 is supported for historical data.

Parameters

key
optional | str
32-character API key. Found on your API keys page. If None then DATABENTO_API_KEY environment variable is used.
gateway
optional | HistoricalGateway or str
Site of historical gateway to connect to. Currently only BO1 is supported. If None then will connect to the default historical gateway.
API method
class Historical(
    key: str | None = None,
    gateway: HistoricalGateway | str = HistoricalGateway.BO1,
)
Example usage
import databento as db

# Pass as parameter
client = db.Historical("$YOUR_API_KEY")

# Or, pass as `DATABENTO_API_KEY` environment variable
client = db.Historical()

Metadata

Historical.metadata.list_publishers

List all publisher ID mappings.

Use this method to list the details of publishers, including their dataset and venue mappings.

Returns

list[dict[str, int | str]]

A list of publisher details objects.

publisher_id
int
The publisher ID assigned by Databento.
dataset
str
The dataset ID for the publisher.
venue
str
The venue for the publisher.
description
str
The publisher description.
API method
Historical.metadata.list_publishers() -> list[dict[str, int | str]]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

publishers = client.metadata.list_publishers()
print(publishers)
Example response
[{'dataset': 'GLBX.MDP3',
  'description': 'CME Globex MDP 3.0',
  'publisher_id': 1,
  'venue': 'GLBX'},
 {'dataset': 'XNAS.ITCH',
  'description': 'Nasdaq TotalView-ITCH',
  'publisher_id': 2,
  'venue': 'XNAS'},
 {'dataset': 'XBOS.ITCH',
  'description': 'Nasdaq BX TotalView-ITCH',
  'publisher_id': 3,
  'venue': 'XBOS'},
 {'dataset': 'XPSX.ITCH',
  'description': 'Nasdaq PSX TotalView-ITCH',
  'publisher_id': 4,
  'venue': 'XPSX'},
 {'dataset': 'BATS.PITCH',
  'description': 'Cboe BZX Depth',
  'publisher_id': 5,
  'venue': 'BATS'},
 {'dataset': 'BATY.PITCH',
  'description': 'Cboe BYX Depth',
  'publisher_id': 6,
  'venue': 'BATY'},
 {'dataset': 'EDGA.PITCH',
  'description': 'Cboe EDGA Depth',
  'publisher_id': 7,
  'venue': 'EDGA'},
 {'dataset': 'EDGX.PITCH',
  'description': 'Cboe EDGX Depth',
  'publisher_id': 8,
  'venue': 'EDGX'},
 {'dataset': 'XNYS.PILLAR',
  'description': 'NYSE Integrated',
  'publisher_id': 9,
  'venue': 'XNYS'},
 {'dataset': 'XCIS.PILLAR',
  'description': 'NYSE National Integrated',
  'publisher_id': 10,
  'venue': 'XCIS'},
 {'dataset': 'XASE.PILLAR',
  'description': 'NYSE American Integrated',
  'publisher_id': 11,
  'venue': 'XASE'},
 {'dataset': 'XCHI.PILLAR',
  'description': 'NYSE Texas Integrated',
  'publisher_id': 12,
  'venue': 'XCHI'},
 {'dataset': 'XCIS.BBO',
  'description': 'NYSE National BBO',
  'publisher_id': 13,
  'venue': 'XCIS'},
 {'dataset': 'XCIS.TRADES',
  'description': 'NYSE National Trades',
  'publisher_id': 14,
  'venue': 'XCIS'},
 {'dataset': 'MEMX.MEMOIR',
  'description': 'MEMX Memoir Depth',
  'publisher_id': 15,
  'venue': 'MEMX'},
 {'dataset': 'EPRL.DOM',
  'description': 'MIAX Pearl Depth',
  'publisher_id': 16,
  'venue': 'EPRL'},
 {'dataset': 'XNAS.NLS',
  'description': 'FINRA/Nasdaq TRF Carteret',
  'publisher_id': 17,
  'venue': 'FINN'},
 {'dataset': 'XNAS.NLS',
  'description': 'FINRA/Nasdaq TRF Chicago',
  'publisher_id': 18,
  'venue': 'FINC'},
 {'dataset': 'XNYS.TRADES',
  'description': 'FINRA/NYSE TRF',
  'publisher_id': 19,
  'venue': 'FINY'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - NYSE American Options',
  'publisher_id': 20,
  'venue': 'AMXO'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - BOX Options',
  'publisher_id': 21,
  'venue': 'XBOX'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Cboe Options',
  'publisher_id': 22,
  'venue': 'XCBO'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - MIAX Emerald',
  'publisher_id': 23,
  'venue': 'EMLD'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Cboe EDGX Options',
  'publisher_id': 24,
  'venue': 'EDGO'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Nasdaq GEMX',
  'publisher_id': 25,
  'venue': 'GMNI'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Nasdaq ISE',
  'publisher_id': 26,
  'venue': 'XISX'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Nasdaq MRX',
  'publisher_id': 27,
  'venue': 'MCRY'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - MIAX Options',
  'publisher_id': 28,
  'venue': 'XMIO'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - NYSE Arca Options',
  'publisher_id': 29,
  'venue': 'ARCO'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Options Price Reporting Authority',
  'publisher_id': 30,
  'venue': 'OPRA'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - MIAX Pearl',
  'publisher_id': 31,
  'venue': 'MPRL'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Nasdaq Options',
  'publisher_id': 32,
  'venue': 'XNDQ'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Nasdaq BX Options',
  'publisher_id': 33,
  'venue': 'XBXO'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Cboe C2 Options',
  'publisher_id': 34,
  'venue': 'C2OX'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Nasdaq PHLX',
  'publisher_id': 35,
  'venue': 'XPHL'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - Cboe BZX Options',
  'publisher_id': 36,
  'venue': 'BATO'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - MEMX Options',
  'publisher_id': 37,
  'venue': 'MXOP'},
 {'dataset': 'IEXG.TOPS',
  'description': 'IEX TOPS',
  'publisher_id': 38,
  'venue': 'IEXG'},
 {'dataset': 'DBEQ.BASIC',
  'description': 'DBEQ Basic - NYSE Texas',
  'publisher_id': 39,
  'venue': 'XCHI'},
 {'dataset': 'DBEQ.BASIC',
  'description': 'DBEQ Basic - NYSE National',
  'publisher_id': 40,
  'venue': 'XCIS'},
 {'dataset': 'DBEQ.BASIC',
  'description': 'DBEQ Basic - IEX',
  'publisher_id': 41,
  'venue': 'IEXG'},
 {'dataset': 'DBEQ.BASIC',
  'description': 'DBEQ Basic - MIAX Pearl',
  'publisher_id': 42,
  'venue': 'EPRL'},
 {'dataset': 'ARCX.PILLAR',
  'description': 'NYSE Arca Integrated',
  'publisher_id': 43,
  'venue': 'ARCX'},
 {'dataset': 'XNYS.BBO',
  'description': 'NYSE BBO',
  'publisher_id': 44,
  'venue': 'XNYS'},
 {'dataset': 'XNYS.TRADES',
  'description': 'NYSE Trades',
  'publisher_id': 45,
  'venue': 'XNYS'},
 {'dataset': 'XNAS.QBBO',
  'description': 'Nasdaq QBBO',
  'publisher_id': 46,
  'venue': 'XNAS'},
 {'dataset': 'XNAS.NLS',
  'description': 'Nasdaq Trades',
  'publisher_id': 47,
  'venue': 'XNAS'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - NYSE Texas',
  'publisher_id': 48,
  'venue': 'XCHI'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - NYSE National',
  'publisher_id': 49,
  'venue': 'XCIS'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - IEX',
  'publisher_id': 50,
  'venue': 'IEXG'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - MIAX Pearl',
  'publisher_id': 51,
  'venue': 'EPRL'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - Nasdaq',
  'publisher_id': 52,
  'venue': 'XNAS'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - NYSE',
  'publisher_id': 53,
  'venue': 'XNYS'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - FINRA/Nasdaq TRF Carteret',
  'publisher_id': 54,
  'venue': 'FINN'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - FINRA/NYSE TRF',
  'publisher_id': 55,
  'venue': 'FINY'},
 {'dataset': 'EQUS.PLUS',
  'description': 'Databento US Equities Plus - FINRA/Nasdaq TRF Chicago',
  'publisher_id': 56,
  'venue': 'FINC'},
 {'dataset': 'IFEU.IMPACT',
  'description': 'ICE Europe Commodities',
  'publisher_id': 57,
  'venue': 'IFEU'},
 {'dataset': 'NDEX.IMPACT',
  'description': 'ICE Endex',
  'publisher_id': 58,
  'venue': 'NDEX'},
 {'dataset': 'DBEQ.BASIC',
  'description': 'Databento US Equities Basic - Consolidated',
  'publisher_id': 59,
  'venue': 'DBEQ'},
 {'dataset': 'EQUS.PLUS',
  'description': 'EQUS Plus - Consolidated',
  'publisher_id': 60,
  'venue': 'EQUS'},
 {'dataset': 'OPRA.PILLAR',
  'description': 'OPRA - MIAX Sapphire',
  'publisher_id': 61,
  'venue': 'SPHR'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - NYSE Texas',
  'publisher_id': 62,
  'venue': 'XCHI'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - NYSE National',
  'publisher_id': 63,
  'venue': 'XCIS'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - IEX',
  'publisher_id': 64,
  'venue': 'IEXG'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - MIAX Pearl',
  'publisher_id': 65,
  'venue': 'EPRL'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Nasdaq',
  'publisher_id': 66,
  'venue': 'XNAS'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - NYSE',
  'publisher_id': 67,
  'venue': 'XNYS'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - FINRA/Nasdaq TRF Carteret',
  'publisher_id': 68,
  'venue': 'FINN'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - FINRA/NYSE TRF',
  'publisher_id': 69,
  'venue': 'FINY'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - FINRA/Nasdaq TRF Chicago',
  'publisher_id': 70,
  'venue': 'FINC'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Cboe BZX',
  'publisher_id': 71,
  'venue': 'BATS'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Cboe BYX',
  'publisher_id': 72,
  'venue': 'BATY'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Cboe EDGA',
  'publisher_id': 73,
  'venue': 'EDGA'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Cboe EDGX',
  'publisher_id': 74,
  'venue': 'EDGX'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Nasdaq BX',
  'publisher_id': 75,
  'venue': 'XBOS'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Nasdaq PSX',
  'publisher_id': 76,
  'venue': 'XPSX'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - MEMX',
  'publisher_id': 77,
  'venue': 'MEMX'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - NYSE American',
  'publisher_id': 78,
  'venue': 'XASE'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - NYSE Arca',
  'publisher_id': 79,
  'venue': 'ARCX'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Long-Term Stock Exchange',
  'publisher_id': 80,
  'venue': 'LTSE'},
 {'dataset': 'XNAS.BASIC',
  'description': 'Nasdaq Basic - Nasdaq',
  'publisher_id': 81,
  'venue': 'XNAS'},
 {'dataset': 'XNAS.BASIC',
  'description': 'Nasdaq Basic - FINRA/Nasdaq TRF Carteret',
  'publisher_id': 82,
  'venue': 'FINN'},
 {'dataset': 'XNAS.BASIC',
  'description': 'Nasdaq Basic - FINRA/Nasdaq TRF Chicago',
  'publisher_id': 83,
  'venue': 'FINC'},
 {'dataset': 'IFEU.IMPACT',
  'description': 'ICE Europe - Off-Market Trades',
  'publisher_id': 84,
  'venue': 'XOFF'},
 {'dataset': 'NDEX.IMPACT',
  'description': 'ICE Endex - Off-Market Trades',
  'publisher_id': 85,
  'venue': 'XOFF'},
 {'dataset': 'XNAS.NLS',
  'description': 'Nasdaq NLS - Nasdaq BX',
  'publisher_id': 86,
  'venue': 'XBOS'},
 {'dataset': 'XNAS.NLS',
  'description': 'Nasdaq NLS - Nasdaq PSX',
  'publisher_id': 87,
  'venue': 'XPSX'},
 {'dataset': 'XNAS.BASIC',
  'description': 'Nasdaq Basic - Nasdaq BX',
  'publisher_id': 88,
  'venue': 'XBOS'},
 {'dataset': 'XNAS.BASIC',
  'description': 'Nasdaq Basic - Nasdaq PSX',
  'publisher_id': 89,
  'venue': 'XPSX'},
 {'dataset': 'EQUS.SUMMARY',
  'description': 'Databento Equities Summary',
  'publisher_id': 90,
  'venue': 'EQUS'},
 {'dataset': 'XCIS.TRADESBBO',
  'description': 'NYSE National Trades and BBO',
  'publisher_id': 91,
  'venue': 'XCIS'},
 {'dataset': 'XNYS.TRADESBBO',
  'description': 'NYSE Trades and BBO',
  'publisher_id': 92,
  'venue': 'XNYS'},
 {'dataset': 'XNAS.BASIC',
  'description': 'Nasdaq Basic - Consolidated',
  'publisher_id': 93,
  'venue': 'EQUS'},
 {'dataset': 'EQUS.ALL',
  'description': 'Databento US Equities (All Feeds) - Consolidated',
  'publisher_id': 94,
  'venue': 'EQUS'},
 {'dataset': 'EQUS.MINI',
  'description': 'Databento US Equities Mini',
  'publisher_id': 95,
  'venue': 'EQUS'},
 {'dataset': 'XNYS.TRADES',
  'description': 'NYSE Trades - Consolidated',
  'publisher_id': 96,
  'venue': 'EQUS'},
 {'dataset': 'IFUS.IMPACT',
  'description': 'ICE Futures US',
  'publisher_id': 97,
  'venue': 'IFUS'},
 {'dataset': 'IFUS.IMPACT',
  'description': 'ICE Futures US - Off-Market Trades',
  'publisher_id': 98,
  'venue': 'XOFF'},
 {'dataset': 'IFLL.IMPACT',
  'description': 'ICE Europe Financials',
  'publisher_id': 99,
  'venue': 'IFLL'},
 {'dataset': 'IFLL.IMPACT',
  'description': 'ICE Europe Financials - Off-Market Trades',
  'publisher_id': 100,
  'venue': 'XOFF'},
 {'dataset': 'XEUR.EOBI',
  'description': 'Eurex EOBI',
  'publisher_id': 101,
  'venue': 'XEUR'},
 {'dataset': 'XEEE.EOBI',
  'description': 'European Energy Exchange EOBI',
  'publisher_id': 102,
  'venue': 'XEEE'},
 {'dataset': 'XEUR.EOBI',
  'description': 'Eurex EOBI - Off-Market Trades',
  'publisher_id': 103,
  'venue': 'XOFF'},
 {'dataset': 'XEEE.EOBI',
  'description': 'European Energy Exchange EOBI - Off-Market Trades',
  'publisher_id': 104,
  'venue': 'XOFF'}
 ]

Historical.metadata.list_datasets

List all valid dataset IDs on Databento.

Use this method to list the available dataset IDs (string identifiers), so you can use other methods which take the dataset parameter.

Parameters

start_date
optional | date or str
The inclusive UTC start date of the request range as a Python date or ISO 8601 date string. If None then first date available.
end_date
optional | date or str
The exclusive UTC end date of the request range as a Python date or ISO 8601 date string. If None then last date available.

Returns

list[str]

A list of available dataset IDs.

API method
Historical.metadata.list_datasets(
    start_date: date | str | None = None,
    end_date: date | str | None = None,
) -> list[str]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

datasets = client.metadata.list_datasets()
print(datasets)
Example response
[
    "ARCX.PILLAR",
    "DBEQ.BASIC",
    "EPRL.DOM",
    "EQUS.MINI",
    "EQUS.SUMMARY",
    "GLBX.MDP3",
    "IEXG.TOPS",
    "IFEU.IMPACT",
    "NDEX.IMPACT",
    "OPRA.PILLAR",
    "XASE.PILLAR",
    "XBOS.ITCH",
    "XCHI.PILLAR",
    "XCIS.TRADESBBO",
    "XNAS.BASIC",
    "XNAS.ITCH",
    "XNYS.PILLAR",
    "XPSX.ITCH"
]

Historical.metadata.list_schemas

List all available schemas for a dataset.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.

Returns

list[str]

A list of available data schemas.

API method
Historical.metadata.list_schemas(
    dataset: Dataset | str,
) -> list[str]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

schemas = client.metadata.list_schemas(dataset="GLBX.MDP3")
print(schemas)
Example response
[
    "mbo",
    "mbp-1",
    "mbp-10",
    "tbbo",
    "trades",
    "bbo-1s",
    "bbo-1m",
    "ohlcv-1s",
    "ohlcv-1m",
    "ohlcv-1h",
    "ohlcv-1d",
    "definition",
    "statistics",
    "status"
]

Historical.metadata.list_fields

List all fields for a particular schema and encoding.

Parameters

schema
required | Schema or str
The data record schema. Must be one of the values from list_schemas.
encoding
required | Encoding or str
The data encoding. Must be one of 'dbn', 'csv', or 'json'. 'dbn' is recommended.

Returns

list[dict[str, str]]

A list of field details objects.

name
str
The field name.
type
str
The field data type.
API method
Historical.metadata.list_fields(
    schema: Schema | str,
    encoding: Encoding | str,
) -> list[dict[str, str]]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

fields = client.metadata.list_fields(schema="trades", encoding="dbn")
print(fields)
Example response
[
    {
        "name": "length",
        "type": "uint8_t"
    },
    {
        "name": "rtype",
        "type": "uint8_t"
    },
    {
        "name": "publisher_id",
        "type": "uint16_t"
    },
    {
        "name": "instrument_id",
        "type": "uint32_t"
    },
    {
        "name": "ts_event",
        "type": "uint64_t"
    },
    {
        "name": "price",
        "type": "int64_t"
    },
    {
        "name": "size",
        "type": "uint32_t"
    },
    {
        "name": "action",
        "type": "char"
    },
    {
        "name": "side",
        "type": "char"
    },
    {
        "name": "flags",
        "type": "uint8_t"
    },
    {
        "name": "depth",
        "type": "uint8_t"
    },
    {
        "name": "ts_recv",
        "type": "uint64_t"
    },
    {
        "name": "ts_in_delta",
        "type": "int32_t"
    },
    {
        "name": "sequence",
        "type": "uint32_t"
    }
]

Historical.metadata.list_unit_prices

List unit prices for each feed mode and data schema in US dollars per gigabyte.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.

Returns

list[dict[str, Any]]

A list of maps of feed mode to schema to unit price.

mode
str
The feed mode. Will be one of "historical", "historical-streaming", or "live".
unit_prices
dict[str, float]
A mapping of schemas to unit prices in US dollars per gigabyte.
API method
Historical.metadata.list_unit_prices(
    dataset: Dataset | str,
) -> list[dict[str, Any]]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

unit_prices = client.metadata.list_unit_prices(dataset="OPRA.PILLAR")
print(unit_prices)
Example response
[
    {
        "mode": "historical",
        "unit_prices": {
            "mbp-1": 0.04,
            "ohlcv-1s": 280.0,
            "ohlcv-1m": 280.0,
            "ohlcv-1h": 600.0,
            "ohlcv-1d": 600.0,
            "tbbo": 210.0,
            "trades": 280.0,
            "statistics": 11.0,
            "definition": 5.0
        }
    },
    {
        "mode": "historical-streaming",
        "unit_prices": {
            "mbp-1": 0.04,
            "ohlcv-1s": 280.0,
            "ohlcv-1m": 280.0,
            "ohlcv-1h": 600.0,
            "ohlcv-1d": 600.0,
            "tbbo": 210.0,
            "trades": 280.0,
            "statistics": 11.0,
            "definition": 5.0
        }
    },
    {
        "mode": "live",
        "unit_prices": {
            "mbp-1": 0.05,
            "ohlcv-1s": 336.0,
            "ohlcv-1m": 336.0,
            "ohlcv-1h": 720.0,
            "ohlcv-1d": 720.0,
            "tbbo": 252.0,
            "trades": 336.0,
            "statistics": 13.2,
            "definition": 6.0
        }
    }
]

Historical.metadata.get_dataset_condition

Get the dataset condition from Databento.

Use this method to discover data availability and quality.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
start_date
optional | date or str
The inclusive UTC start date of the request range as a Python date or ISO 8601 date string. If None then first date available.
end_date
optional | date or str
The inclusive UTC end date of the request range as a Python date or ISO 8601 date string. If None then last date available.

Returns

list[dict[str, str | None]]

A list of conditions per date.

date
str
The day of the described data, as an ISO 8601 date string.
condition
str
The condition code describing the quality and availability of the data on the given day. Possible values are listed below.
last_modified_date
str or None
The date when any schema in the dataset on the given day was last generated or modified, as an ISO 8601 date string. Will be None when condition is 'missing'.

Possible values for condition:

  • available: the data is available with no known issues
  • degraded: the data is available, but there may be missing data or other correctness issues
  • pending: the data is not yet available, but may be available soon
  • missing: the data is not available
API method
Historical.metadata.get_dataset_condition(
    dataset: Dataset | str,
    start_date: date | str | None = None,
    end_date: date | str | None = None,
) -> list[dict[str, str | None]]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

conditions = client.metadata.get_dataset_condition(
    dataset="GLBX.MDP3",
    start_date="2022-06-06",
    end_date="2022-06-10",
)
print(conditions)
Example response
[
    {
        "date": "2022-06-06",
        "condition": "available",
        "last_modified_date": "2024-05-18"
    },
    {
        "date": "2022-06-07",
        "condition": "available",
        "last_modified_date": "2024-05-21"
    },
    {
        "date": "2022-06-08",
        "condition": "available",
        "last_modified_date": "2024-05-21"
    },
    {
        "date": "2022-06-09",
        "condition": "available",
        "last_modified_date": "2024-05-21"
    },
    {
        "date": "2022-06-10",
        "condition": "available",
        "last_modified_date": "2024-05-22"
    }
]

Historical.metadata.get_dataset_range

Get the available range for the dataset given the user's entitlements.

Use this method to discover data availability. The start and end values in the response can be used with the timeseries.get_range and batch.submit_job endpoints.

This endpoint will return the start and end timestamps over the entire dataset as well as the per-schema start and end timestamps under the schema key. In some cases, a schema's availability is a subset of the entire dataset availability.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.

Returns

dict[str, str | dict[str, str]]

The available range for the dataset.

start
str
The inclusive start of the available range as an ISO 8601 timestamp.
end
str
The exclusive end of the available range as an ISO 8601 timestamp.
schema
dict[str, str]
A mapping of schema names to per-schema start and end timestamps.
API method
Historical.metadata.get_dataset_range(
    dataset: Dataset | str,
) -> dict[str, str | dict[str, str]]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

available_range = client.metadata.get_dataset_range(dataset="XNAS.BASIC")

print(available_range)
Example response
{
    "start":"2018-05-01T00:00:00.000000000Z",
    "end":"2025-01-30T00:00:00.000000000Z",
    "schema": {
        "mbo": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "mbp-1": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "mbp-10": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "bbo-1s": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "bbo-1m": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "tbbo": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "trades": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "ohlcv-1s": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "ohlcv-1m": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "ohlcv-1h": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "ohlcv-1d": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "definition": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "statistics": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "status": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        },
        "imbalance": {
            "start":"2018-05-01T00:00:00.000000000Z",
            "end":"2025-01-30T00:00:00.000000000Z"
        }
    }
}

Historical.metadata.get_record_count

Get the record count of the time series data query.

This method may not be accurate for time ranges that are not discrete multiples of 10 minutes, potentially over-reporting the number of records in such cases. The definition schema is only accurate for discrete multiples of 24 hours.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
start
required | pd.Timestamp, datetime, date, str, or int
The inclusive start of the request range. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
end
optional | pd.Timestamp, datetime, date, str, or int
The exclusive end of the request range. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
symbols
optional | Iterable[str | int] or str or int
The product symbols to filter for. Takes up to 2,000 symbols per request. If 'ALL_SYMBOLS' or None then will select all symbols.
schema
optional | Schema or str, default 'trades'
The data record schema. Must be one of the values from list_schemas.
stype_in
optional | SType or str, default 'raw_symbol'
The symbology type of input symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
limit
optional | int
The maximum number of records to return. If None then no limit.

Returns

int

The record count.

API method
Historical.metadata.get_record_count(
    dataset: Dataset | str,
    start: pd.Timestamp | datetime | date | str | int,
    end: pd.Timestamp | datetime | date | str | int | None = None,
    symbols: Iterable[str | int] | str | int | None = None,
    schema: Schema | str = "trades",
    stype_in: SType | str = "raw_symbol",
    limit: int | None = None,
) -> int
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

count = client.metadata.get_record_count(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="mbo",
    start="2022-01-06",
)
print(count)
Example response
1329107

Historical.metadata.get_billable_size

Get the billable uncompressed raw binary size for historical streaming or batched files.

This method may not be accurate for time ranges that are not discrete multiples of 10 minutes, potentially over-reporting the size in such cases. The definition schema is only accurate for discrete multiples of 24 hours.

Info

The amount billed will be based on the actual number of bytes sent; see our pricing documentation for more details.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
start
required | pd.Timestamp, datetime, date, str, or int
The inclusive start of the request range. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
end
optional | pd.Timestamp, datetime, date, str, or int
The exclusive end of the request range. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
symbols
optional | Iterable[str | int] or str or int
The product symbols to filter for. Takes up to 2,000 symbols per request. If 'ALL_SYMBOLS' or None then will select all symbols.
schema
optional | Schema or str, default 'trades'
The data record schema. Must be one of the values from list_schemas.
stype_in
optional | SType or str, default 'raw_symbol'
The symbology type of input symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
limit
optional | int
The maximum number of records to return. If None then no limit.

Returns

int

The size in number of bytes used for billing.

API method
Historical.metadata.get_billable_size(
    dataset: Dataset | str,
    start: pd.Timestamp | datetime | date | str | int,
    end: pd.Timestamp | datetime | date | str | int | None = None,
    symbols: Iterable[str | int] | str | int | None = None,
    schema: Schema | str = "trades",
    stype_in: SType | str = "raw_symbol",
    limit: int | None = None,
) -> int
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

size = client.metadata.get_billable_size(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T00:00:00",
    end="2022-06-10T12:10:00",
)
print(size)
Example response
99219648

Historical.metadata.get_cost

Get the cost in US dollars for a historical streaming or batch download request. This cost respects any discounts provided by flat rate plans.

This method may not be accurate for time ranges that are not discrete multiples of 10 minutes, potentially over-reporting the cost in such cases. The definition schema is only accurate for discrete multiples of 24 hours.

Info

The amount billed will be based on the actual number of bytes sent; see our pricing documentation for more details.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
start
required | pd.Timestamp, datetime, date, str, or int
The inclusive start of the request range. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
end
optional | pd.Timestamp, datetime, date, str, or int
The exclusive end of the request range. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
mode
optional | FeedMode or str
The data feed mode of the request. Must be one of 'historical', 'historical-streaming', or 'live'.
symbols
optional | Iterable[str | int] or str or int
The product symbols to filter for. Takes up to 2,000 symbols per request. If 'ALL_SYMBOLS' or None then will select all symbols.
schema
optional | Schema or str, default 'trades'
The data record schema. Must be one of the values from list_schemas.
stype_in
optional | SType or str, default 'raw_symbol'
The symbology type of input symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
limit
optional | int
The maximum number of records to return. If None then no limit.

Returns

float

The cost in US dollars.

API method
Historical.metadata.get_cost(
    dataset: Dataset | str,
    start: pd.Timestamp | datetime | date | str | int,
    end: pd.Timestamp | datetime | date | str | int | None = None,
    mode: FeedMode | str = "historical-streaming",
    symbols: Iterable[str | int] | str | int | None = None,
    schema: Schema | str = "trades",
    stype_in: SType | str = "raw_symbol",
    limit: int | None = None,
) -> float
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

cost = client.metadata.get_cost(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T00:00:00",
    end="2022-06-10T12:10:00",
)
print(cost)
Example response
2.587353944778

Time series

Historical.timeseries.get_range

Makes a streaming request for time series data from Databento.

This is the primary method for getting historical market data, instrument definitions, and status data directly into your application.

This method only returns after all of the data has been downloaded, which can take a long time. For large requests, consider using batch.submit_job instead.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
start
required | pd.Timestamp, datetime, date, str, or int
The inclusive start of the request range. Filters on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
end
optional | pd.Timestamp, datetime, date, str, or int
The exclusive end of the request range. Filters on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
symbols
optional | Iterable[str | int] or str or int
The product symbols to filter for. Takes up to 2,000 symbols per request. If more than 1 symbol is specified, the data is merged and sorted by time. If 'ALL_SYMBOLS' or None then will select all symbols.
schema
optional | Schema or str, default 'trades'
The data record schema. Must be one of the values from list_schemas.
stype_in
optional | SType or str, default 'raw_symbol'
The symbology type of input symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
stype_out
optional | SType or str, default 'instrument_id'
The symbology type of output symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
limit
optional | int
The maximum number of records to return. If None then no limit.
path
optional | PathLike[str] or str
The file path to stream the data to. It is recommended to use the ".dbn.zst" suffix.

Returns

A DBNStore object.

A full list of fields for each schema is available through Historical.metadata.list_fields.

API method
Historical.timeseries.get_range(
    dataset: Dataset | str,
    start: pd.Timestamp | datetime | date | str | int,
    end: pd.Timestamp | datetime | date | str | int | None = None,
    symbols: Iterable[str | int] | str | int | None = None,
    schema: Schema | str = "trades",
    stype_in: SType | str = "raw_symbol",
    stype_out: SType | str = "instrument_id",
    limit: int | None = None,
    path: PathLike[str] | str | None = None,
) -> DBNStore
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T00:00:00",
    end="2022-06-10T00:10:00",
    limit=1,
)
df = data.to_df()
print(df.iloc[0].to_json(indent=4))
Example response
{
    "ts_event":1654473600070,
    "rtype":0,
    "publisher_id":1,
    "instrument_id":3403,
    "action":"T",
    "side":"A",
    "depth":0,
    "price":4108.5,
    "size":1,
    "flags":0,
    "ts_in_delta":18681,
    "sequence":157862,
    "symbol":"ESM2"
}
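
The path parameter can be used to persist the stream to disk as it downloads, and the saved file can be replayed later without another request; this sketch uses a placeholder file name.

import databento as db

client = db.Historical("$YOUR_API_KEY")

# Stream the data and write it to a local DBN file in one call
client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T00:00:00",
    end="2022-06-07T00:00:00",
    path="esm2-trades.dbn.zst",
)

# Reload the saved file later without incurring additional charges
data = db.DBNStore.from_file("esm2-trades.dbn.zst")
print(data.to_df().head())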

Historical.timeseries.get_range_async

Asynchronously request a historical time series data stream from Databento.

Primary method for getting historical intraday market data, daily data, instrument definitions and market status data directly into your application.

This method only returns after all of the data has been downloaded, which can take a long time. For large requests, consider using batch.submit_job instead.

Info

This method is a coroutine and must be used with an await expression.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
start
required | pd.Timestamp, datetime, date, str, or int
The inclusive start of the request range. Filters on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
end
optional | pd.Timestamp, datetime, date, str, or int
The exclusive end of the request range. Filters on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
symbols
optional | Iterable[str | int] or str or int
The product symbols to filter for. Takes up to 2,000 symbols per request. If more than 1 symbol is specified, the data is merged and sorted by time. If 'ALL_SYMBOLS' or None then will select all symbols.
schema
optional | Schema or str, default 'trades'
The data record schema. Must be one of the values from list_schemas.
stype_in
optional | SType or str, default 'raw_symbol'
The symbology type of input symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
stype_out
optional | SType or str, default 'instrument_id'
The symbology type of output symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
limit
optional | int
The maximum number of records to return. If None then no limit.
path
optional | PathLike[str] or str
The file path to stream the data to. It is recommended to use the ".dbn.zst" suffix.

Returns

A DBNStore object.

A full list of fields for each schema is available through Historical.metadata.list_fields.

API method
Historical.timeseries.get_range_async(
    dataset: Dataset | str,
    start: pd.Timestamp | datetime | date | str | int,
    end: pd.Timestamp | datetime | date | str | int | None = None,
    symbols: Iterable[str | int] | str | int | None = None,
    schema: Schema | str = "trades",
    stype_in: SType | str = "raw_symbol",
    stype_out: SType | str = "instrument_id",
    limit: int | None = None,
    path: PathLike[str] | str | None = None,
) -> Awaitable[DBNStore]
Example usage
import asyncio
import databento as db

client = db.Historical("$YOUR_API_KEY")

coro = client.timeseries.get_range_async(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T00:00:00",
    end="2022-06-10T00:10:00",
    limit=1,
)
data = asyncio.run(coro)

df = data.to_df()
print(df.iloc[0].to_json(indent=4))
Example response
{
    "ts_event":1654473600070,
    "rtype":0,
    "publisher_id":1,
    "instrument_id":3403,
    "action":"T",
    "side":"A",
    "depth":0,
    "price":4108.5,
    "size":1,
    "flags":0,
    "ts_in_delta":18681,
    "sequence":157862,
    "symbol":"ESM2"
}

Symbology

Historical.symbology.resolve

Resolve a list of symbols from an input symbology type to an output symbology type.

Take, for example, a raw symbol to an instrument ID: ESM2 → 3403.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
symbols
required | Iterable[str | int] or str or int
The symbols to resolve. Takes up to 2,000 symbols per request. Use 'ALL_SYMBOLS' to request all symbols (not available for every dataset).
stype_in
required | SType or str
The symbology type of input symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
stype_out
required | SType or str
The symbology type of output symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
start_date
required | date or str
The inclusive UTC start date of the request range as a Python date or ISO 8601 date string.
end_date
optional | date or str
The exclusive UTC end date of the request range as a Python date or ISO 8601 date string. Defaults to the forward filled value of start based on the resolution provided.

Returns

dict[str, Any]

The results for the symbology resolution.

See also

For more information on symbology resolution, visit our symbology documentation.

result
dict[str, list[dict[str, str]]]
The symbology mapping result. For each requested symbol, a list of symbology mappings is provided.
symbols
list[str]
The requested symbols.
stype_in
str
The requested input symbology type.
stype_out
str
The requested output symbology type.
start_date
str
The requested symbology start date as an ISO 8601 date string.
end_date
str
The requested symbology end date as an ISO 8601 date string.
partial
list[str]
The list of symbols, if any, that partially resolved inside the start date and end date interval.
not_found
list[str]
The list of symbols, if any, that failed to resolve inside the start date and end date interval.
message
str
A short message indicating the overall symbology result. Can be one of "OK", "Partially resolved", or "Not found".
status
int
A numerical status field indicating the overall symbology result. Can be one of: 0 (OK), 1 (Partially resolved), or 2 (Not found).
API method
Historical.symbology.resolve(
    dataset: Dataset | str,
    symbols: Iterable[str | int] | str | int,
    stype_in: SType | str,
    stype_out: SType | str,
    start_date: date | str,
    end_date: date | str | None = None,
) -> dict[str, Any]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

result = client.symbology.resolve(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    stype_in="raw_symbol",
    stype_out="instrument_id",
    start_date="2022-06-01",
    end_date="2022-06-30",
)
print(result)
Example response
{
    "result": {
        "ESM2": [
            {
                "d0": "2022-06-01",
                "d1": "2022-06-26",
                "s": "3403"
            }
        ]
    },
    "symbols": [
        "ESM2"
    ],
    "stype_in": "raw_symbol",
    "stype_out": "instrument_id",
    "start_date": "2022-06-01",
    "end_date": "2022-06-30",
    "partial": [],
    "not_found": [],
    "message": "OK",
    "status": 0
}
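
Since the response includes the partial, not_found, and status fields described above, a short sketch like the following can be used to validate a resolution before relying on its mappings:

# Continuing from the example above: report any symbols that did not
# fully resolve over the requested date range
if result["status"] != 0:
    print("Partial symbols:", result["partial"])
    print("Unresolved symbols:", result["not_found"])
else:
    for symbol, mappings in result["result"].items():
        for mapping in mappings:
            print(f"{symbol} -> {mapping['s']} ({mapping['d0']} to {mapping['d1']})")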

Batch downloads

Batch downloads allow you to download data files directly from within your portal. For more information, see Streaming vs. batch download.

Historical.batch.submit_job

Make a batch download job request.

Once a request is submitted, our system processes the request and prepares the batch files in the background. The status of your request and the files can be accessed from the Download center in your user portal.

This method takes longer than a streaming request, but it is advantageous for larger requests: its delivery mechanisms allow the data to be accessed multiple times, with no additional cost for each subsequent download after the first.

Related: batch.list_jobs.

Parameters

dataset
required | Dataset or str
The dataset code (string identifier). Must be one of the values from list_datasets.
symbols
required | Iterable[str | int] or str or int
The product symbols to filter for. Takes up to 2,000 symbols per request. If more than one symbol is specified, the data is merged and sorted by time. If 'ALL_SYMBOLS' or None, all symbols are selected.
schema
required | Schema or str
The data record schema. Must be one of the values from list_schemas.
start
required | pd.Timestamp, datetime, date, str, or int
The inclusive start of the request range. Filters on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.
end
optional | pd.Timestamp, datetime, date, str, or int
The exclusive end of the request range. Filters on ts_recv if it exists in the schema, otherwise ts_event. Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified. Defaults to the forward filled value of start based on the resolution provided.
encoding
optional | Encoding or str
The data encoding. Must be one of 'dbn', 'csv', 'json'. For fastest transfer speed, 'dbn' is recommended.
compression
optional | Compression or str
The data compression mode. Must be either 'zstd', 'none', or None. For fastest transfer speed, 'zstd' is recommended.
pretty_px
optional | bool, default False
If prices should be formatted to the correct scale (using the fixed-precision scalar 1e-9). Only applicable for 'csv' or 'json' encodings.
pretty_ts
optional | bool, default False
If timestamps should be formatted as ISO 8601 strings. Only applicable for 'csv' or 'json' encodings.
map_symbols
optional | bool, default False
If a symbol field should be included with each text-encoded record. Only applicable for 'csv' or 'json' encodings.
split_symbols
optional | bool, default False
If files should be split by raw symbol. Cannot be requested with 'ALL_SYMBOLS'. Cannot be used with limit.
split_duration
optional | Duration or str, default 'day'
The maximum time duration before batched data is split into multiple files. Must be one of 'day', 'week', 'month', or 'none'. A week starts on Sunday UTC.
split_size
optional | int
The maximum size (in bytes) of each batched data file before being split. Must be an integer between 1e9 and 10e9 inclusive (1GB - 10GB). Defaults to no split size.
delivery
optional | Delivery or str, default 'download'
The delivery mechanism for the batched data files once processed. Only 'download' is supported at this time.
stype_in
optional | SType or str, default 'raw_symbol'
The symbology type of input symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
stype_out
optional | SType or str, default 'instrument_id'
The symbology type of output symbols. Must be one of 'raw_symbol', 'instrument_id', 'parent', or 'continuous'.
limit
optional | int
The maximum number of records to return. If None then no limit. Cannot be used with split_symbols.

Returns

dict[str, Any]

The description of the submitted batch job.

id
str
The unique job ID for the request.
user_id
str
The user ID of the user who made the request.
api_key
str or None
The API key name for the request (if Basic Auth was used).
cost_usd
float or None
The cost of the job in US dollars (None until the job is done processing).
dataset
str
The dataset code (string identifier).
symbols
str
The list of symbols specified in the request.
stype_in
str
The symbology type of input symbols.
stype_out
str
The symbology type of output symbols.
schema
str
The data record schema.
start
str
The ISO 8601 timestamp start of request time range (inclusive).
end
str
The ISO 8601 timestamp end of request time range (exclusive).
limit
int or None
The maximum number of records to return.
encoding
str
The data encoding.
compression
str
The data compression mode.
pretty_px
bool
If prices are formatted to the correct scale (using the fixed-precision scalar 1e-9).
pretty_ts
bool
If timestamps are formatted as ISO 8601 strings.
map_symbols
bool
If a symbol field is included with each text-encoded record.
split_symbols
bool
If files are split by raw symbol.
split_duration
str
The maximum time interval for an individual file before splitting into multiple files.
split_size
int or None
The maximum size for an individual file before splitting into multiple files.
packaging
str or None
The packaging method of the batch data, one of 'none', 'zip', or 'tar'.
delivery
str
The delivery mechanism of the batch data. Only 'download' is supported at this time.
record_count
int or None
The number of data records (None until the job is processed).
billed_size
int or None
The size of the raw binary data used to process the batch job (used for billing purposes).
actual_size
int or None
The total size of the result of the batch job after splitting and compression.
package_size
int or None
The total size of the result of the batch job after any packaging (including metadata).
state
str
The current status of the batch job. One of 'received', 'queued', 'processing', 'done', or 'expired'.
ts_received
str
The ISO 8601 timestamp when Databento received the batch job.
ts_queued
str or None
The ISO 8601 timestamp when the batch job was queued.
ts_process_start
str or None
The ISO 8601 timestamp when the batch job began processing (if it's begun).
ts_process_done
str or None
The ISO 8601 timestamp when the batch job finished processing (if it's finished).
ts_expiration
str or None
The ISO 8601 timestamp when the batch job will expire from the Download center.
API method
Historical.batch.submit_job(
    dataset: Dataset | str,
    symbols: Iterable[str | int] | str | int,
    schema: Schema | str,
    start: pd.Timestamp | datetime | date | str | int,
    end: pd.Timestamp | datetime | date | str | int | None = None,
    encoding: Encoding | str = "dbn",
    compression: Compression | str = "zstd",
    pretty_px: bool = False,
    pretty_ts: bool = False,
    map_symbols: bool = False,
    split_symbols: bool = False,
    split_duration: Duration | str = "day",
    split_size: int | None = None,
    delivery: Delivery | str = "download",
    stype_in: SType | str = "raw_symbol",
    stype_out: SType | str = "instrument_id",
    limit: int | None = None,
) -> dict[str, Any]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

details = client.batch.submit_job(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    encoding="dbn",
    start="2022-06-06T12:00:00+00:00",
    end="2022-06-10T00:00:00+00:00",
)
print(details)
Example response
{
    "id": "GLBX-20221217-MN5S5S4WAS",
    "user_id": "NBPDLF33",
    "api_key": "prod-001",
    "cost_usd": None,
    "dataset": "GLBX.MDP3",
    "symbols": "ESM2",
    "stype_in": "raw_symbol",
    "stype_out": "instrument_id",
    "schema": "trades",
    "start": "2022-06-06T12:00:00.000000000Z",
    "end": "2022-06-10T00:00:00.000000000Z",
    "limit": None,
    "encoding": "dbn",
    "compression": "zstd",
    "pretty_px": False,
    "pretty_ts": False,
    "map_symbols": False,
    "split_symbols": False,
    "split_duration": "day",
    "split_size": None,
    "packaging": None,
    "delivery": "download",
    "record_count": None,
    "billed_size": None,
    "actual_size": None,
    "package_size": None,
    "state": "queued",
    "ts_received": "2022-12-17T00:36:37.844913000Z",
    "ts_queued": None,
    "ts_process_start": None,
    "ts_process_done": None,
    "ts_expiration": None
}

Historical.batch.list_jobs

List batch job details for the user account.

The job details will be sorted in order of ts_received.

Related: Download center.

Parameters

states
optional | Iterable[JobState | str] or JobState or str
The filter for job states, as a list of comma-separated values. Can include 'queued', 'processing', 'done', and 'expired'. Defaults to all except 'expired'.
since
optional | pd.Timestamp, datetime, date, str, or int
The filter for the submission timestamp (jobs submitted before this time are not included). Takes pd.Timestamp, Python datetime, Python date, ISO 8601 string, or UNIX timestamp in nanoseconds. Assumes UTC as timezone unless otherwise specified.

Returns

list[dict[str, Any]]

A list of batch job details. See batch.submit_job for a detailed list of returned values.

API method
Historical.batch.list_jobs(
    states: Iterable[JobState | str] | JobState | str | None = "queued,processing,done",
    since: pd.Timestamp | datetime | date | str | int | None = None,
) -> list[dict[str, Any]]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

jobs = client.batch.list_jobs(
    states=["queued", "processing", "done"],
    since="2022-06-01",
)
print(jobs)
Example response
[
    {
        "id": "GLBX-20221126-DBVXWPJJQN",
        "user_id": "NBPDLF33",
        "api_key": "prod-001",
        "cost_usd": 23.6454,
        "dataset": "GLBX.MDP3",
        "symbols": "ZC.FUT,ES.FUT",
        "stype_in": "parent",
        "stype_out": "instrument_id",
        "schema": "mbo",
        "start": "2022-10-24T00:00:00.000000000Z",
        "end": "2022-11-24T00:00:00.000000000Z",
        "limit": None,
        "encoding": "csv",
        "compression": "zstd",
        "pretty_px": False,
        "pretty_ts": False,
        "map_symbols": False,
        "split_symbols": False,
        "split_duration": "day",
        "split_size": None,
        "packaging": None,
        "delivery": "download",
        "record_count": 412160224,
        "billed_size": 23080972544,
        "actual_size": 8144595219,
        "package_size": 8144628684,
        "state": "done",
        "ts_received": "2022-11-26T09:23:17.519708000Z",
        "ts_queued": "2022-12-03T14:34:57.897790000Z",
        "ts_process_start": "2022-12-03T14:35:00.495167000Z",
        "ts_process_done": "2022-12-03T14:48:15.710116000Z",
        "ts_expiration": "2023-01-02T14:48:15.710116000Z",
        "progress": 100
    },
...

Historical.batch.list_files

List files for a batch job.

Will include manifest.json, metadata.json, and the batched data files.

Related: Download center.

Parameters

job_id
required | str
The batch job identifier.

Returns

list[dict[str, Any]]

The file details for the batch job.

filename
str
The file name.
size
int
The size of the file in bytes.
hash
str
The SHA256 hash of the file.
urls
dict
A map of download protocol to URL.
API method
Historical.batch.list_files(
    job_id: str,
) -> list[dict[str, Any]]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

files = client.batch.list_files(job_id="GLBX-20220610-5DEFXVTMSM")
print(files)
Example response
[
    {
        "filename": "metadata.json",
        "size": 1102,
        "hash": "sha256:0168d53e1705b69b1d6407f10bb3ab48aac492fa0f68f863cc9b092931cc67a7",
        "urls": {
            "https": "https://api.databento.com/v0/batch/download/46PCMCVF/GLBX-20230203-WF9WJYSCDU/metadata.json",
            "ftp": "ftp://ftp.databento.com/46PCMCVF/GLBX-20230203-WF9WJYSCDU/metadata.json",
        }
    },
    {
        "filename": "glbx-mdp3-20220610.mbo.csv.zst",
        "size": 21832,
        "hash": "sha256:1218930af153b4953632216044ef87607afa467fc7ab7fbb1f031fceacf9d52a",
        "urls": {
            "https": "https://api.databento.com/v0/batch/download/46PCMCVF/GLBX-20230203-WF9WJYSCDU/glbx-mdp3-20220610.mbo.csv.zst",
            "ftp": "ftp://ftp.databento.com/46PCMCVF/GLBX-20230203-WF9WJYSCDU/glbx-mdp3-20220610.mbo.csv.zst",
        }
    }
]
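
Each entry's hash field is prefixed with the digest algorithm, so a downloaded file can be checked against it with hashlib. This sketch assumes the file was already saved with batch.download; the local path is an assumption based on the download example below.

import hashlib

# Verify the first listed file against its reported SHA256 digest
expected = files[0]["hash"].split(":", 1)[1]
with open("my_data/GLBX-20220610-5DEFXVTMSM/metadata.json", "rb") as f:  # assumed local path
    actual = hashlib.sha256(f.read()).hexdigest()
print(actual == expected)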

Historical.batch.download

Download a batch job or a specific file to {output_dir}/{job_id}/.

Will automatically generate any necessary directories if they do not already exist.

Related: Download center.

Parameters

job_id
required | str
The batch job identifier.
output_dir
optional | PathLike[str] or str
The directory to download the file(s) to. If None, defaults to the current working directory.
filename_to_download
optional | str
The specific file to download. If None then will download all files for the batch job.
keep_zip
optional | bool, default False
If True, and filename_to_download is None, all job files will be saved as a .zip archive in the output_dir.

Returns

list[Path]

A list of paths to the downloaded files.

API method
Historical.batch.download(
    job_id: str,
    output_dir: PathLike[str] | str | None = None,
    filename_to_download: str | None = None,
    keep_zip: bool = False,
) -> list[Path]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

# Download all files for the batch job
client.batch.download(
    job_id="GLBX-20220610-5DEFXVTMSM",
    output_dir="my_data/",
)

# Alternatively, you can download a specific file
client.batch.download(
    job_id="GLBX-20220610-5DEFXVTMSM",
    output_dir="my_data/",
    filename_to_download="metadata.json",
)
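
Putting the batch methods together, a minimal end-to-end sketch might submit a job, poll list_jobs until its state is 'done', and then download the results. The polling interval and output directory below are arbitrary choices.

import time
import databento as db

client = db.Historical("$YOUR_API_KEY")

# Submit a batch job (see batch.submit_job above)
job = client.batch.submit_job(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06T12:00:00+00:00",
    end="2022-06-10T00:00:00+00:00",
)
job_id = job["id"]

# Poll until the job has finished processing
while True:
    jobs = {j["id"]: j for j in client.batch.list_jobs()}
    if jobs.get(job_id, {}).get("state") == "done":
        break
    time.sleep(60)  # arbitrary polling interval

# Download all files for the completed job
paths = client.batch.download(job_id=job_id, output_dir="my_data/")
print(paths)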

Historical.batch.download_async

Asynchronously download a batch job or a specific file to {output_dir}/{job_id}/.

Will automatically generate any necessary directories if they do not already exist.

Related: Download center.

Info

This method is a coroutine and must be used with an await expression.

Parameters

job_id
required | str
The batch job identifier.
output_dir
optional | PathLike[str] or str
The directory to download the file(s) to. If None, defaults to the current working directory.
filename_to_download
optional | str
The specific file to download. If None then will download all files for the batch job.
keep_zip
optional | bool, default False
If True, and filename_to_download is None, all job files will be saved as a .zip archive in the output_dir.

Returns

list[Path]

A list of paths to the downloaded files.

API method
Historical.batch.download_async(
    job_id: str,
    output_dir: PathLike[str] | str | None = None,
    filename_to_download: str | None = None,
    keep_zip: bool = False,
) -> Awaitable[list[Path]]
Example usage
import asyncio
import databento as db

client = db.Historical("$YOUR_API_KEY")

# Download all files for the batch job
coro = client.batch.download_async(
    job_id="GLBX-20220610-5DEFXVTMSM",
    output_dir="my_data/",
)
asyncio.run(coro)

# Alternatively, you can download a specific file
coro = client.batch.download_async(
    job_id="GLBX-20220610-5DEFXVTMSM",
    output_dir="my_data/",
    filename_to_download="metadata.json",
)
asyncio.run(coro)

Helpers

DBNStore

The DBNStore object is an I/O helper class for working with DBN-encoded data. Typically, this object is created when performing historical requests. However, it can also be created directly from DBN data on disk or in memory using the provided factory methods DBNStore.from_bytes and DBNStore.from_file.

Attributes

nbytes
int
The size of the data in bytes.
raw
bytes
The raw data from the I/O stream.
metadata
Metadata
The metadata header for the DBNStore.
dataset
str
The dataset ID.
schema
Schema or None
The data record schema. If None, the DBNStore may contain multiple schemas.
symbols
list[str]
The query symbols for the data.
stype_in
SType or None
The query input symbology type for the data. If None, the DBNStore may contain mixed STypes.
stype_out
SType
The query output symbology type for the data.
start
pd.Timestamp
The query start for the data as a pd.Timestamp.
end
pd.Timestamp or None
The query end for the data as a pd.Timestamp. If None, the DBNStore data was created without a known end time.
limit
int or None
The query limit for the data.
encoding
Encoding
The data encoding.
compression
Compression
The data compression format (if any).
mappings
dict[str, list[dict[str, Any]]]
The symbology mappings for the data.
symbology
dict[str, Any]
The symbology resolution information for the data.

DBNStore.from_bytes

Read data from a DBN byte stream.

Parameters

data
required | BytesIO or bytes or IO[bytes]
The bytes to read from.

Returns

A DBNStore object.

API method
DBNStore.from_bytes(
    data: BytesIO | bytes | IO[bytes],
) -> DBNStore
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
)

# Save streamed data to .dbn.zst
path = "GLBX-ESM2-20220606.trades.dbn.zst"
data.to_file(path)

# Open saved data as a byte stream.
with open(path, "rb") as saved:
    stored_data = db.DBNStore.from_bytes(saved)

# Convert to dataframe
df = stored_data.to_df()
print(df.head())
Example response
                                                               ts_event  rtype  publisher_id  instrument_id action  ... size  flags  ts_in_delta  sequence  symbol
ts_recv                                                                                                             ...
2022-06-06 00:00:00.070314216+00:00 2022-06-06 00:00:00.070033767+00:00      0             1           3403      T  ...    1      0        18681    157862    ESM2
2022-06-06 00:00:00.090544076+00:00 2022-06-06 00:00:00.089830441+00:00      0             1           3403      T  ...    1      0        18604    157922    ESM2
2022-06-06 00:00:00.807324169+00:00 2022-06-06 00:00:00.807018955+00:00      0             1           3403      T  ...    4      0        18396    158072    ESM2
2022-06-06 00:00:01.317722490+00:00 2022-06-06 00:00:01.317385867+00:00      0             1           3403      T  ...    1      0        22043    158111    ESM2
2022-06-06 00:00:01.317736158+00:00 2022-06-06 00:00:01.317385867+00:00      0             1           3403      T  ...    7      0        17280    158112    ESM2

[5 rows x 13 columns]

DBNStore.from_file

Read data from a DBN file.

See also

databento.read_dbn is an alias for DBNStore.from_file.

Parameters

path
required | PathLike[str] or str
The file path to read from.

Returns

A DBNStore object.

API method
DBNStore.from_file(
    path: PathLike[str] | str,
) -> DBNStore
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
)

# Save streamed data to .dbn.zst
path = "GLBX-ESM2-20220606.trades.dbn.zst"
data.to_file(path)

# Read saved .dbn.zst
stored_data = db.DBNStore.from_file(path)

# Convert to dataframe
df = stored_data.to_df()
print(df.head())
Example response
                                                               ts_event  rtype  publisher_id  instrument_id action  ... size  flags  ts_in_delta  sequence  symbol
ts_recv                                                                                                             ...
2022-06-06 00:00:00.070314216+00:00 2022-06-06 00:00:00.070033767+00:00      0             1           3403      T  ...    1      0        18681    157862    ESM2
2022-06-06 00:00:00.090544076+00:00 2022-06-06 00:00:00.089830441+00:00      0             1           3403      T  ...    1      0        18604    157922    ESM2
2022-06-06 00:00:00.807324169+00:00 2022-06-06 00:00:00.807018955+00:00      0             1           3403      T  ...    4      0        18396    158072    ESM2
2022-06-06 00:00:01.317722490+00:00 2022-06-06 00:00:01.317385867+00:00      0             1           3403      T  ...    1      0        22043    158111    ESM2
2022-06-06 00:00:01.317736158+00:00 2022-06-06 00:00:01.317385867+00:00      0             1           3403      T  ...    7      0        17280    158112    ESM2

[5 rows x 13 columns]

DBNStore.reader

Return an I/O reader for the data.

Returns

A raw IO stream for reading the DBNStore data.

API method
DBNStore.reader() -> IO[bytes]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

bento = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols="ZWZ3",
    start="2022-06-06",
)

# Create an IO stream reader
reader = bento.reader
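
Since the reader is a standard binary stream, it can be used anywhere an IO[bytes] is accepted. For example, continuing the sketch above, the stream can be passed back to DBNStore.from_bytes:

# Recreate a DBNStore from the raw stream (the reader yields the
# store's raw DBN-encoded bytes)
copy = db.DBNStore.from_bytes(reader)
print(copy.schema)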

DBNStore.replay

Replay data by passing records sequentially to the given callback.

Refer to the List of fields by schema article for documentation on the fields contained in each record type.

Parameters

callback
required | Callable
The callback function or method to be dispatched on every event.

Returns

None

API method
DBNStore.replay(
    callback: Callable[[DBNRecord], None],
) -> None
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    start="2022-06-06",
)

def print_large_trades(trade):
    size = getattr(trade, "size", 0)
    if size >= 200:
        print(trade)

data.replay(print_large_trades)
Example response
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524078339857609 }, price: 4164.000000000, size: 291, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524078342408839, ts_in_delta: 20352, sequence: 3605032 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524133736900455 }, price: 4160.000000000, size: 216, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524133737794739, ts_in_delta: 28024, sequence: 3659203 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654538295588752739 }, price: 4140.000000000, size: 200, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654538295589900967, ts_in_delta: 21708, sequence: 10031624 }

DBNStore.request_full_definitions

Request full instrument definitions for all symbols based on the metadata properties. This is useful for retrieving the instrument definitions for saved DBN data.

A timeseries.get_range request is made to obtain the definitions data, which will incur a cost.

Parameters

client
required | Historical
The historical client to use for the request (contains the API key).
path
optional | PathLike[str] or str
The path to stream the data to on disk (will then return a DBNStore).

Returns

A DBNStore object.

A full list of fields for each schema is available through Historical.metadata.list_fields.

API method
DBNStore.request_full_definitions(
    client: Historical,
    path: PathLike[str] | str | None = None,
) -> DBNStore
Example usage
import databento as db

client = db.Historical(
    key="$YOUR_API_KEY",
)

trades = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ES.FUT"],
    stype_in="parent",
    schema="trades",
    start="2022-06-06",
)

definitions = trades.request_full_definitions(client).to_df()
definitions = definitions.sort_values(["expiration", "symbol"]).set_index("expiration")

print(definitions[["symbol"]])
Example response
                              symbol
expiration
2022-06-17 13:30:00+00:00       ESM2
2022-06-17 13:30:00+00:00  ESM2-ESH3
2022-06-17 13:30:00+00:00  ESM2-ESM3
2022-06-17 13:30:00+00:00  ESM2-ESU2
2022-06-17 13:30:00+00:00  ESM2-ESZ2
2022-09-16 13:30:00+00:00       ESU2
2022-09-16 13:30:00+00:00  ESU2-ESH3
2022-09-16 13:30:00+00:00  ESU2-ESM3
2022-09-16 13:30:00+00:00  ESU2-ESU3
2022-09-16 13:30:00+00:00  ESU2-ESZ2
2022-12-16 14:30:00+00:00       ESZ2
2022-12-16 14:30:00+00:00  ESZ2-ESH3
2022-12-16 14:30:00+00:00  ESZ2-ESM3
2022-12-16 14:30:00+00:00  ESZ2-ESU3
2023-03-17 13:30:00+00:00       ESH3
2023-03-17 13:30:00+00:00  ESH3-ESM3
2023-03-17 13:30:00+00:00  ESH3-ESU3
2023-03-17 13:30:00+00:00  ESH3-ESZ3
2023-06-16 13:30:00+00:00       ESM3
2023-06-16 13:30:00+00:00  ESM3-ESU3
2023-06-16 13:30:00+00:00  ESM3-ESZ3
2023-09-15 13:30:00+00:00       ESU3
2023-09-15 13:30:00+00:00  ESU3-ESH4
2023-09-15 13:30:00+00:00  ESU3-ESZ3
2023-12-15 14:30:00+00:00       ESZ3
2023-12-15 14:30:00+00:00  ESZ3-ESH4
2024-03-15 13:30:00+00:00       ESH4
2024-03-15 13:30:00+00:00  ESH4-ESM4
2024-06-21 13:30:00+00:00       ESM4
2024-09-20 13:30:00+00:00       ESU4
2024-12-20 14:30:00+00:00       ESZ4
2025-12-19 14:30:00+00:00       ESZ5
2026-12-18 14:30:00+00:00       ESZ6

DBNStore.request_symbology

Request to resolve symbology mappings based on the metadata properties.

Parameters

client
required | Historical
The historical client to use for the request (contains the API key).

Returns

dict[str, Any]

A result including a map of input symbol to output symbol across a date range.

API method
DBNStore.request_symbology(
    client: Historical,
) -> dict[str, Any]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
)

# Save streamed data to .dbn.zst
data.to_file("GLBX-ESM2-20201229.trades.dbn.zst")

# Read saved .dbn.zst
stored_data = db.DBNStore.from_file("GLBX-ESM2-20201229.trades.dbn.zst")

# Request symbology from .dbn.zst metadata
symbology = stored_data.request_symbology(client=client)
print(symbology)
Example response
{
    "result": {
        "ESM2": [
            {
                "d0": "2022-06-06",
                "d1": "2022-06-07",
                "s": "3403"
            }
        ]
    },
    "symbols": [
        "ESM2"
    ],
    "stype_in": "raw_symbol",
    "stype_out": "instrument_id",
    "start_date": "2022-06-06",
    "end_date": "2022-06-07",
    "partial": [],
    "not_found": [],
    "message": "OK",
    "status": 0
}

DBNStore.to_csv

Write data to a file in CSV format.

Parameters

path
required | PathLike[str] or str
The file path to write to.
pretty_ts
optional | bool, default True
Whether timestamp columns are converted to tz-aware pandas.Timestamp (UTC).
pretty_px
optional | bool, default True
Whether price columns are correctly scaled as display prices.
map_symbols
optional | bool, default True
If symbology mappings from the metadata should be used to create a 'symbol' column, mapping the instrument ID to its raw symbol for every record.
compression
optional | Compression or str
The output compression for writing. Must be either 'zstd' or 'none'.
schema
optional | Schema or str
The data record schema for the output CSV. Must be one of the values from list_schemas. This is only required when reading a DBNStore with mixed record types.
mode
optional | str
The file write mode to use, either "x" or "w". Defaults to "w".

Returns

None

API method
DBNStore.to_csv(
    path: PathLike[str] | str,
    pretty_ts: bool = True,
    pretty_px: bool = True,
    map_symbols: bool = True,
    compression: Compression | str = Compression.NONE,
    schema: Schema | str | None = None,
    mode: Literal["w", "x"] = "w",
) -> None
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
)

# Save streamed data to .csv
data.to_csv("GLBX-ESM2-20220606-trades.csv")

DBNStore.to_df

Converts data to a pandas DataFrame.

Info

The DataFrame index will be set to ts_recv if it exists in the schema, otherwise it will be set to ts_event.

See also

While not optimized for use with live data due to their column-oriented format, pandas DataFrames can still be used with live data by first streaming DBN data to a file, then converting to a DataFrame with DBNStore.from_file().to_df(). See this example for more information.

Parameters

price_type
optional | PriceType or str, default "float"
The price type to use for price fields. If "fixed", prices will have a type of int in fixed decimal format; each unit representing 1e-9 or 0.000000001. If "float", prices will have a type of float. If "decimal", prices will be instances of decimal.Decimal.
pretty_ts
optional | bool, default True
Whether timestamp columns are converted to tz-aware pandas.Timestamp. The timezone can be specified using the tz parameter.
map_symbols
optional | bool, default True
If symbology mappings from the metadata should be used to create a 'symbol' column, mapping the instrument ID to its raw symbol for every record.
schema
optional | Schema or str
The data record schema for the output DataFrame. Must be one of the values from list_schemas. This is only required when reading a DBNStore with mixed record types.
tz
optional | datetime.tzinfo or str, default UTC
If pretty_ts is True, all timestamps will be converted to the specified timezone.
count
optional | int
If set, a DataFrameIterator instance is returned instead of a single DataFrame. When iterated, this object will yield a DataFrame with at most count elements until the entire contents of the DBNStore are exhausted.

Returns

A pandas DataFrame object.

API method
DBNStore.to_df(
    price_type: PriceType | str = "float",
    pretty_ts: bool = True,
    map_symbols: bool = True,
    schema: Schema | str | None = None,
    tz: datetime.tzinfo | str = zoneinfo.ZoneInfo("UTC"),
    count: int | None = None,
) -> pd.DataFrame | DataFrameIterator
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    schema="trades",
    symbols=["ESM2"],
    start="2022-03-06",
)

df = data.to_df()
print(df.head())
Example response
                                                               ts_event  rtype  publisher_id  instrument_id action  ... size  flags  ts_in_delta  sequence  symbol
ts_recv                                                                                                             ...
2022-03-06 23:00:00.039463300+00:00 2022-03-06 23:00:00.036436177+00:00      0             1           3403      T  ...    1      0        18828      5178    ESM2
2022-03-06 23:00:01.098111252+00:00 2022-03-06 23:00:01.097477845+00:00      0             1           3403      T  ...    1      0        19122      6816    ESM2
2022-03-06 23:00:04.612334175+00:00 2022-03-06 23:00:04.611714663+00:00      0             1           3403      T  ...    1      0        18687     10038    ESM2
2022-03-06 23:00:04.613776789+00:00 2022-03-06 23:00:04.613240435+00:00      0             1           3403      T  ...    1      0        18452     10045    ESM2
2022-03-06 23:00:06.881864467+00:00 2022-03-06 23:00:06.880575603+00:00      0             1           3403      T  ...    1      0        18478     11343    ESM2

[5 rows x 13 columns]
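
The price_type, tz, and count parameters described above can be combined as needed. A short sketch (the timezone choice is arbitrary):

# Decimal display prices with timestamps converted to a specific timezone
df_ny = data.to_df(price_type="decimal", tz="America/New_York")

# Iterate the store in chunks of at most 10,000 records instead of
# materializing one large DataFrame
for chunk in data.to_df(count=10_000):
    print(len(chunk))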

DBNStore.to_file

Write data to a DBN file.

Parameters

path
required | PathLike[str] or str
The file path to write to.
mode
optional | str
The file write mode to use, either "x" or "w". Defaults to "w".
compression
optional | Compression or str
The compression format to write. If None, uses the same compression as the underlying data.

Returns

None

API method
DBNStore.to_file(
    path: PathLike[str] | str,
    mode: Literal["w", "x"] = "w",
    compression: Compression | str | None = None,
) -> None
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
)

# Save streamed data to .dbn.zst
data.to_file("GLBX-ESM2-20220606.trades.dbn.zst")

DBNStore.to_json

Write data to a file in JSON format.

Parameters

path
required | PathLike[str] or str
The file path to write to.
pretty_ts
optional | bool, default True
Whether timestamp columns are converted to tz-aware pandas.Timestamp (UTC).
pretty_px
optional | bool, default True
Whether price columns are correctly scaled as display prices.
map_symbols
optional | bool, default True
If symbology mappings from the metadata should be used to create a 'symbol' column, mapping the instrument ID to its raw symbol for every record.
compression
optional | Compression or str
The output compression for writing. Must be either 'zstd' or 'none'.
schema
optional | Schema or str
The data record schema for the output JSON. Must be one of the values from list_schemas. This is only required when reading a DBNStore with mixed record types.
mode
optional | str
The file write mode to use, either "x" or "w". Defaults to "w".

Returns

None

API method
DBNStore.to_json(
    path: PathLike[str] | str,
    pretty_ts: bool = True,
    pretty_px: bool = True,
    map_symbols: bool = True,
    compression: Compression | str = Compression.NONE,
    schema: Schema | str | None = None,
    mode: Literal["w", "x"] = "w",
) -> None
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
)

# Save streamed data to .json
data.to_json("GLBX-ESM2-20220606-trades.json")

DBNStore.to_ndarray

Converts data to a numpy N-dimensional array. Each element will contain a Python representation of the binary fields as a Tuple.

Parameters

schema
optional | Schema or str
The data record schema for the output array. Must be one of the values from list_schemas. This is only required when reading a DBNStore with mixed record types.
count
optional | int
If set, an NDArrayIterator instance is returned instead of a single np.ndarray. When iterated, this object will yield a np.ndarray with at most count elements until the entire contents of the DBNStore are exhausted.

Returns

A numpy.ndarray object.

API method
DBNStore.to_ndarray(
    schema: Schema | str | None = None,
    count: int | None = None,
) -> np.ndarray | NDArrayIterator
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    start="2022-06-06",
)

array = data.to_ndarray()
print(array[0])
Example response
(12, 0, 1, 3403, 1654473600070033767, 4108500000000, 1, b'T', b'A', 0, 0, 1654473600070314216, 18681, 157862)
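
As with to_df, the count parameter described above returns an iterator that yields fixed-size chunks, which can help bound memory usage for large stores:

# Yield structured arrays of at most 10,000 records each
for chunk in data.to_ndarray(count=10_000):
    print(chunk.shape)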

DBNStore.to_parquet

Write data to a file in Apache Parquet format.

Parameters

path
required | PathLike[str] or str
The file path to write the data to.
price_type
optional | PriceType or str, default "float"
The price type to use for price fields. If "fixed", prices will have a type of int in fixed decimal format; each unit representing 1e-9 or 0.000000001. If "float", prices will have a type of float.
pretty_ts
optional | bool, default True
Whether timestamp columns are converted to tz-aware pyarrow.TimestampType (UTC).
map_symbols
optional | bool, default True
If symbology mappings from the metadata should be used to create a 'symbol' column, mapping the instrument ID to its raw symbol for every record.
schema
optional | Schema or str
The data record schema for the output parquet file. Must be one of the values from list_schemas. This is only required when reading a DBNStore with mixed record types.
mode
optional | str
The file write mode to use, either "x" or "w". Defaults to "w".
**kwargs
optional | Any
Keyword arguments to pass to pyarrow.parquet.ParquetWriter. These can be used to override the default behavior of the writer.
API method
DBNStore.to_parquet(
    path: PathLike[str] | str,
    price_type: PriceType | str = "float",
    pretty_ts: bool = True,
    map_symbols: bool = True,
    schema: Schema | str | None = None,
    mode: Literal["w", "x"] = "w",
    **kwargs: Any,
) -> None
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    schema="trades",
    start="2022-06-06",
)

# Save streamed data to .parquet
data.to_parquet("GLBX-ESM2-20220606-trades.parquet")

DBNStore.__iter__

Iterate over records one at a time using a for loop. Iteration will stop when there are no more records in the DBNStore instance.

Refer to the List of fields by schema article for documentation on the fields contained in each record type.

API method
DBNStore.__iter__() -> Iterator[DBNRecord]
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="GLBX.MDP3",
    symbols=["ESM2"],
    start="2022-06-06",
)

for trade in data:
    size = getattr(trade, "size", 0)
    if size >= 200:
        print(trade)
Example response
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524078339857609 }, price: 4164.000000000, size: 291, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524078342408839, ts_in_delta: 20352, sequence: 3605032 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654524133736900455 }, price: 4160.000000000, size: 216, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654524133737794739, ts_in_delta: 28024, sequence: 3659203 }
TradeMsg { hd: RecordHeader { length: 12, rtype: Mbp0, publisher_id: GlbxMdp3Glbx, instrument_id: 3403, ts_event: 1654538295588752739 }, price: 4140.000000000, size: 200, action: 'T', side: 'B', flags: 0, depth: 0, ts_recv: 1654538295589900967, ts_in_delta: 21708, sequence: 10031624 }

DBNStore.insert_symbology_json

Insert JSON symbology data, which may be obtained from a symbology request or loaded from a file.

Parameters

json_data
required | str or Mapping[str, Any] or TextIO
The JSON data to insert.
clear_existing
optional | bool
If existing symbology data should be cleared from the internal mappings.
API method
DBNStore.insert_symbology_json(
    json_data: str | Mapping[str, Any] | TextIO,
    clear_existing: bool = True,
) -> None
Example usage
import databento as db

client = db.Historical("$YOUR_API_KEY")

data = client.timeseries.get_range(
    dataset="XNAS.ITCH",
    symbols=["ALL_SYMBOLS"],
    schema="trades",
    start="2022-06-06",
    end="2022-06-07",
)

# Request symbology for all symbols and then insert this data
symbology_json = data.request_symbology(client)
data.insert_symbology_json(symbology_json, clear_existing=True)
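
Since json_data also accepts a text stream, symbology previously saved to disk can be inserted in the same way. The file name below is an assumption:

# Load symbology mappings from a local symbology.json file
with open("symbology.json") as f:
    data.insert_symbology_json(f, clear_existing=True)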

map_symbols_csv

Use a symbology.json file to map a symbols column onto an existing CSV file. The result is written to out_file.

Parameters

symbology_file
required | PathLike[str] or str
Path to a symbology.json file to use as a symbology source.
csv_file
required | PathLike[str] or str
Path to a CSV file containing CSV-encoded DBN data; it must contain a ts_recv or ts_event column as well as an instrument_id column.
out_file
optional | PathLike[str] or str
Path to a file to write results to. If unspecified, _mapped will be appended to the csv_file name.

Returns

Path to the written file.

API method
map_symbols_csv(
    symbology_file: PathLike[str] | str,
    csv_file: PathLike[str] | str,
    out_file: PathLike[str] | str | None = None,
) -> Path
Example usage
import databento as db

result = db.map_symbols_csv(
    "symbology.json",
    "xnas-itch-20230821-20230825.ohlcv-1d.csv",
)

print(result.read_text())
Example response
ts_event,rtype,publisher_id,instrument_id,open,high,low,close,volume,symbol
1692576000000000000,35,2,523,133550000000,135200000000,132710000000,134360000000,11015261,AMZN
1692576000000000000,35,2,7290,439090000000,472900000000,437260000000,470990000000,11098972,NVDA
1692576000000000000,35,2,10157,217000000000,233500000000,217000000000,233380000000,21336884,TSLA
1692576000000000000,35,2,7130,430170000000,434700000000,0,433000000000,68661,NOC
1692662400000000000,35,2,7132,431610000000,438480000000,0,437740000000,86950,NOC
1692662400000000000,35,2,10161,236710000000,241550000000,229560000000,232870000000,17069349,TSLA
1692662400000000000,35,2,523,135320000000,135900000000,133740000000,134200000000,8133698,AMZN
1692662400000000000,35,2,7293,475120000000,483440000000,453340000000,457910000000,11700447,NVDA
1692748800000000000,35,2,523,135120000000,137860000000,133220000000,137060000000,11430081,AMZN
1692748800000000000,35,2,7296,463190000000,518870000000,452080000000,502150000000,13361964,NVDA
1692748800000000000,35,2,7135,0,440000000000,0,434030000000,68411,NOC
1692748800000000000,35,2,10163,236590000000,243750000000,226500000000,240810000000,13759234,TSLA
1692835200000000000,35,2,7287,508190000000,512600000000,466720000000,468700000000,21611800,NVDA
1692835200000000000,35,2,10154,242000000000,244140000000,228180000000,229980000000,13724054,TSLA
1692835200000000000,35,2,7126,433650000000,437730000000,0,429880000000,54570,NOC
1692835200000000000,35,2,521,137250000000,137820000000,131410000000,131880000000,11990573,AMZN

map_symbols_json

Use a symbology.json file to insert a symbols key into records of an existing JSON file. The result is written to out_file.

Parameters

symbology_file
required | PathLike[str] or str
Path to a symbology.json file to use as a symbology source.
json_file
required | PathLike[str] or str
Path to a JSON file that contains encoded DBN data.
out_file
optional | PathLike[str] or str
Path to a file to write results to. If unspecified, _mapped will be appended to the json_file name.

Returns

Path to the written file.

API method
map_symbols_json(
    symbology_file: PathLike[str] | str,
    json_file: PathLike[str] | str,
    out_file: PathLike[str] | str | None = None,
) -> Path
Example usage
import databento as db

result = db.map_symbols_json(
    "symbology.json",
    "xnas-itch-20230821-20230825.ohlcv-1d.json",
)

print(result.read_text())
Example response
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":523},"open":"133550000000","high":"135200000000","low":"132710000000","close":"134360000000","volume":"11015261","symbol":"AMZN"}
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":7290},"open":"439090000000","high":"472900000000","low":"437260000000","close":"470990000000","volume":"11098972","symbol":"NVDA"}
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":10157},"open":"217000000000","high":"233500000000","low":"217000000000","close":"233380000000","volume":"21336884","symbol":"TSLA"}
{"hd":{"ts_event":"1692576000000000000","rtype":35,"publisher_id":2,"instrument_id":7130},"open":"430170000000","high":"434700000000","low":"0","close":"433000000000","volume":"68661","symbol":"NOC"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":10161},"open":"236710000000","high":"241550000000","low":"229560000000","close":"232870000000","volume":"17069349","symbol":"TSLA"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":523},"open":"135320000000","high":"135900000000","low":"133740000000","close":"134200000000","volume":"8133698","symbol":"AMZN"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":7293},"open":"475120000000","high":"483440000000","low":"453340000000","close":"457910000000","volume":"11700447","symbol":"NVDA"}
{"hd":{"ts_event":"1692662400000000000","rtype":35,"publisher_id":2,"instrument_id":7132},"open":"431610000000","high":"438480000000","low":"0","close":"437740000000","volume":"86950","symbol":"NOC"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":523},"open":"135120000000","high":"137860000000","low":"133220000000","close":"137060000000","volume":"11430081","symbol":"AMZN"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":7296},"open":"463190000000","high":"518870000000","low":"452080000000","close":"502150000000","volume":"13361964","symbol":"NVDA"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":7135},"open":"0","high":"440000000000","low":"0","close":"434030000000","volume":"68411","symbol":"NOC"}
{"hd":{"ts_event":"1692748800000000000","rtype":35,"publisher_id":2,"instrument_id":10163},"open":"236590000000","high":"243750000000","low":"226500000000","close":"240810000000","volume":"13759234","symbol":"TSLA"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":7287},"open":"508190000000","high":"512600000000","low":"466720000000","close":"468700000000","volume":"21611800","symbol":"NVDA"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":10154},"open":"242000000000","high":"244140000000","low":"228180000000","close":"229980000000","volume":"13724054","symbol":"TSLA"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":7126},"open":"433650000000","high":"437730000000","low":"0","close":"429880000000","volume":"54570","symbol":"NOC"}
{"hd":{"ts_event":"1692835200000000000","rtype":35,"publisher_id":2,"instrument_id":521},"open":"137250000000","high":"137820000000","low":"131410000000","close":"131880000000","volume":"11990573","symbol":"AMZN"}