Both methods are suitable for market replay. All of our historical data is sequenced in the order of arrival on our live service and includes receive timestamps. This enables you to replay events in the exact order in which they would have been received under live market conditions.
FAQs
Client libraries vs. APIs
There are multiple ways to access data programmatically through Databento. The easiest way is to use one of our official client libraries for the following languages:
These client libraries reduce the amount of code you need to write and adopt our recommended best practices.
If you are using a language that does not have an official client library, you can still access all of our historical features through our HTTP API and all of our live features through our Raw API, which is built on a regular TCP/IP socket.
If you are still unsure, here are some considerations that can help you decide the best choice:
- Historical vs. live. Our HTTP API only exposes historical data, whereas our Raw API only exposes live data. Due to licensing limitations, we treat any data from the last 24 hours as live data. Our client libraries provide both historical and live data by implementing a common layer over both our HTTP API and Raw API.
- Web browser support. Most browsers do not provide support for raw TCP sockets. If you are writing a browser application, you should use our HTTP API to ensure compatibility with your end users.
- Security. If your organization has security policies that make it inconvenient for you to install or update an external client library, we recommend implementing your own client using our APIs directly.
- Real-time performance. Our Raw API uses binary-encoded data messages and will have the lowest overhead for live data. The Python and C++ client libraries wrap the Raw API for live data, so they can't be more efficient than the API itself. See our latency page to compare their latencies in real conditions.
- Similarities to REST. Our HTTP API is a collection of RPC-style methods. Although it is not a REST API, you will find it familiar as it shares many of the standard practices adopted by HTTP-based REST APIs.
- Pricing. You pay the same price for the same data, regardless of the access method. We only charge you based on the size of the data in its original binary encoding. For example, the same data message will be larger in our CSV encoding than our binary encoding, but there are no extra charges for this overhead.
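The historical vs. live split described above can be sketched as a small routing helper. This is only an illustration: the function name `access_path` and the return labels are hypothetical, and the choice to treat data exactly 24 hours old as historical is an assumption, since the text does not say which side the boundary falls on.

```python
from datetime import datetime, timedelta, timezone

# From the text: data from the last 24 hours is treated as live data.
LIVE_WINDOW = timedelta(hours=24)


def access_path(event_time: datetime, now: datetime) -> str:
    """Return 'raw' if the data would be served as live, else 'http' (historical).

    Assumption: data that is exactly 24 hours old counts as historical.
    """
    if event_time > now:
        raise ValueError("event_time is in the future")
    return "raw" if now - event_time < LIVE_WINDOW else "http"


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
print(access_path(now - timedelta(hours=3), now))  # raw
print(access_path(now - timedelta(days=2), now))   # http
```

A client library hides this decision from you; the helper only matters if you implement your own client on top of the two APIs.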
You can toggle between the different client libraries and APIs anywhere in our documentation. By default, our Python client library is selected.
Streaming vs. batch download
We support both streaming and batch download as means of receiving historical data. Both methods yield exactly the same data. They differ only in the way that the data is received. The table below compares the two methods.
 | Streaming | Batch download |
---|---|---|
Usage | Load data directly in your application via our API or client libraries. | Download data files from the Download center. HTTP, rsync, and FTP are supported. |
Cost | You will be charged every time you stream the same data. | Download the same data multiple times over a 30-day period for no additional charge. |
Size | Recommended for data requests under 5 GB | Recommended for data requests over 5 GB |
Customization | Limited | Full set of advanced customizations available |
Wait time | None | Usually takes several minutes to prepare before the data is available in the Download center |
Request method | API only | Either manually using a Batch download request or programmatically via our API |
Here are some details to consider when deciding which method is best for you:
- Usage. With streaming, you can access data immediately when it is needed in your application. Streaming is suitable for smaller, on-demand workflows such as retrieving reference data for production trading, initializing a user's price chart for a given symbol on a display application, or exploring data in an interactive environment.
- Cost. Since streaming is designed for one-off tasks, you will be charged each time for duplicate stream requests. If you intend to access the same data repeatedly -- for instance in parallel simulations, multiple backtests or daily ETL pipelines -- you can do so more efficiently with a batch download of the data onto your system.
- Size. Batch download lets you fetch the same data multiple times without additional charge. Hence, it is more suitable for larger data requests where there is risk of disconnection while your data is being transmitted.
- Customization and wait time. Our streaming infrastructure is optimized for instant retrieval of small data requests in predetermined schemas. When you request a batch download, our system needs time to prepare the data files. This preparation time lets us service additional customizations that are not possible on our streaming infrastructure.
- Request method. A batch download request can be submitted using the Batch download on our portal, using our HTTP API, or using any of our official client libraries.
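As a rough illustration, the considerations above can be folded into a small decision helper. The function `suggest_method` and its parameters are hypothetical and not part of any Databento library; the 5 GB threshold comes from the comparison table, and the sketch assumes all repeat reads happen within the 30-day download window.

```python
def suggest_method(size_gb: float, expected_reads: int = 1,
                   needs_customization: bool = False) -> str:
    """Suggest 'streaming' or 'batch' for a historical data request."""
    if needs_customization:
        # Advanced customizations are only available for batch downloads.
        return "batch"
    if size_gb > 5:
        # The table recommends batch download for requests over 5 GB.
        return "batch"
    if expected_reads > 1:
        # Streaming is charged on every repeat; batch is charged once.
        return "batch"
    return "streaming"


print(suggest_method(0.2))                    # streaming
print(suggest_method(0.2, expected_reads=5))  # batch
print(suggest_method(12.0))                   # batch
```

The ordering of the checks is a design choice for readability; any one of the three conditions is enough to prefer batch download in this sketch.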
Usage-based pricing and credits
Usage-based pricing
Every dataset or data feed is priced in $/GB. Every month, you pay only for the data that you've used. There is no monthly subscription fee.
Historical and live data
Historical data is billed per byte consumed, and live data is billed per message. Databento meters data by its uncompressed size in binary encoding. For no additional charge, you can also request data in CSV and JSON encodings. We convert the data from binary, so you only pay for the data sized in the smaller encoding. For historical data, all delivery methods (like HTTP and FTPS) incur no additional charge — with the exception of physical delivery by hard disk drive.
Streaming and batch download
You can receive data through either streaming or batch download. Here's how billing works for each method:
- Streaming. You're billed for every outbound byte of data that's sent on our network. If the connection cuts out in the middle of a streaming request, you won't be charged for the remaining unsent data that you had requested.
- Batch download. You're billed once for the request. Then, you can download the data as many times as you want over 30 days — with no charge.
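The two billing rules can be compared with a short worked example. The price of $2/GB is a made-up figure, the helper names are hypothetical, and the sketch assumes decimal gigabytes (10^9 bytes), which the text does not specify.

```python
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes


def streaming_charge(bytes_sent_per_request: list[int], price_per_gb: float) -> float:
    """Each streaming request is billed for the bytes actually sent."""
    return sum(bytes_sent_per_request) / BYTES_PER_GB * price_per_gb


def batch_charge(request_bytes: int, price_per_gb: float) -> float:
    """A batch request is billed once; re-downloads within 30 days are free."""
    return request_bytes / BYTES_PER_GB * price_per_gb


price = 2.0                  # hypothetical $/GB
full = 4 * BYTES_PER_GB      # a 4 GB request, sized in binary encoding

# Three streaming attempts at the same request; the second one
# disconnected after 1 GB, so only that 1 GB is billed for it.
stream_cost = streaming_charge([full, 1 * BYTES_PER_GB, full], price)

# One batch request, downloaded any number of times within 30 days.
batch_cost = batch_charge(full, price)

print(stream_cost, batch_cost)  # 18.0 8.0
```

In this example the interrupted stream costs only what was sent, but the two complete streams are each billed in full, which is why repeated access favors batch download.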
Pricing quotes
You can find estimated quotes for datasets by searching for them on the Browse page of your Databento portal, and then clicking on the dataset detail page. For a definitive quote, select and then customize your data request on the Data request page.
Data credits
After April 30, 2023, both new and existing Databento users will receive $125 in free data credits for historical data. These credits are shared across your team and expire in 6 months. Each team is eligible for one set of credits.
- If you registered for Databento before April 30, 2023, your original credits will expire at 11:59 PM UTC on April 30, and you will receive $125 in new data credits beginning on May 1.
- If you registered for Databento after April 30, 2023, you will receive $125 in free data credits for historical data. You will only be billed after exceeding $125 in data requests.
Instruments and products
Our documentation and web platform use the terms product, instrument, and listing for different purposes. This article clarifies the differences between these terms.
Products
On Databento, a product (also called a parent product or parent in the context of futures or options) refers to any real or synthetic asset whose data is provided by a market. A product describes a group of instruments belonging to a given economic sector or market segment. Also, products are often fungible, meaning they may be traded on more than one venue. In the case of equities, this means that products and instruments are the same asset, in contrast to derivatives such as futures and options products.
The search on our home page looks up products across our dataset coverage. To find individual instruments, click on any of the products in the search results.
A futures product for a certain underlying has a range of expirations. To avoid ambiguity, we will refer to the collection of all expirations as a parent, and a specific expiration as an instrument. So, for example, ES.FUT is the parent of ESM0 and ESZ0.
Note: we classify exchange-traded spreads between futures outrights as part of the futures products.
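The parent and instrument relationship for futures can be illustrated with a small helper that maps an outright symbol to its parent. This is a sketch under assumptions: it presumes symbols of the form root, month code, then one or two year digits (as in ESM0), the function name `futures_parent` is hypothetical, and it does not handle spreads, even though spreads belong to the same product.

```python
import re
from collections import defaultdict

# Assumption: outright symbols look like <root><month code><year digits>, e.g. ESM0.
MONTH_CODES = "FGHJKMNQUVXZ"
_OUTRIGHT = re.compile(rf"^([A-Z0-9]+?)([{MONTH_CODES}])(\d{{1,2}})$")


def futures_parent(instrument: str) -> str:
    """Map an outright futures instrument (e.g. 'ESM0') to its parent ('ES.FUT')."""
    match = _OUTRIGHT.match(instrument)
    if match is None:
        raise ValueError(f"not a recognized outright futures symbol: {instrument!r}")
    return f"{match.group(1)}.FUT"


# Group a few instruments under their parents.
by_parent = defaultdict(list)
for symbol in ["ESM0", "ESZ0", "GEZ0"]:
    by_parent[futures_parent(symbol)].append(symbol)

print(dict(by_parent))  # {'ES.FUT': ['ESM0', 'ESZ0'], 'GE.FUT': ['GEZ0']}
```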
An options product for a certain underlying has a range of both expirations and strikes. To avoid ambiguity, we will refer to the collection of all expirations and strikes as a parent, and a specific expiration with a strike as an instrument. So, for example, MSFT.OPT is the parent of MSFT20210205C210.
Note: we classify option combinations as part of the options products.
Instruments
On Databento, an instrument (also called a child instrument or child) is a tradable asset, real or synthetic, on a specific market. Instruments define all attributes of what is traded, e.g. product complex, product group, expiration, and strike price.
Some publishers use the term security in lieu of what we consider an instrument. This becomes somewhat of a misnomer when applied to derivatives, which are not strictly securities. As such, we prefer to use the term instrument as it more broadly includes both securities and derivatives.
Moreover, some markets provide multiple datasets which provide different levels of visibility. Two different datasets from the same market may exhibit different data for the same instrument. For instance, FX ECNs will often provide a premium feed with full visibility of real-time trades across all liquidity pools on the ECN as well as an entry feed with only delayed trade prints for a subset of liquidity. Since we support multiple datasets from the same market, we also need to distinguish if you're requesting data for EUR/USD on the premium feed or the entry feed.
To resolve these identification issues, we use the term listing to refer to a tradeable entity in a specific dataset from a specific publisher. As such, we consider AAPL on NASDAQ TotalView-ITCH 5.0, AAPL on NYSE OpenBook Ultra, and AAPL on NYSE Trades as three distinct listings. The following table provides more such examples.
Listings
On Databento, a listing is an instrument specific to a certain venue, which is not transferable or representable across other venues (non-fungible). While this makes a listing synonymous with an instrument for equities, it's important to be mindful of the distinction if you intend to extend into other asset classes.
 | Equities | FX | Futures | Options |
---|---|---|---|---|
Product | AAPL, MSFT | EUR/USD | ES.FUT, GE.FUT | MSFT.OPT, ES.OPT, EW1.OPT |
Instrument | AAPL, MSFT | EUR/USD spot | ESM0, ESZ0 | MSFT20210205C210, ESZ0 C3620, ESZ0 P3620, EW1Z0 C3500 |
Listing | AAPL on Nasdaq, AAPL on NYSE Arca | EUR/USD on Cboe SEF | ESM0 on CME Globex | MSFT20210205C210 on BOX, ESZ0 C3620 on CME Globex |
Venues and publishers
You may have seen other data sources refer to all trading venues as an exchange. This naming convention is often harmless, especially if you are only interested in a small set of venues that actually implement a centralized exchange model.
However, as you expand to more asset classes or markets with Databento, it becomes increasingly likely that you will encounter different venue models such as ECNs and ATSes. Since terms like exchange, ATS, and ECN have specific regulatory meanings, we refrain from conflating exchange with other trading venues.
We use the terms "trading venue" and venue interchangeably.
We use the term publisher to refer to any data provider on Databento. A publisher may also be a venue.
Info: The vast majority of US equities data consumers receive data disseminated by the Securities Information Processors (SIPs), namely the Consolidated Tape Association (CTA) and the UTP Plan. The SIPs provide a consolidated view of all protected bid/ask quotes and trades from US equities trading venues. On Databento, we refer to the SIPs as publishers but not market operators.
Exchange
An exchange is a venue registered with the SEC under Section 6 of the Securities Exchange Act of 1934. The SEC lists all currently registered exchanges here.
OTC markets
OTC markets are dealer networks that trade securities that don't meet exchange listing requirements (such as "unlisted" securities) and instruments that don't trade on a formal exchange (such as bonds, ADRs, and derivatives). The Financial Industry Regulatory Authority (FINRA) regulates broker-dealers that operate in the over-the-counter (OTC) market.
ATS
An Alternative Trading System (ATS) is a non-exchange trading venue that matches buyers and sellers to counterparties for transactions. ATSes are usually regulated as broker-dealers instead of as exchanges.
ATSes meet the definition of exchange under federal securities law. They aren't required to register as a national securities exchange, as long as the ATS operates under the exemption provided under Exchange Act Rule 3a1-1(a). Since ATSes are non-exchanges, ATS transactions don't appear on national exchange order books. Some traders use this advantage to conceal trading from public view, and reduce the effect of large trades.
An ATS must be approved by the United States Securities and Exchange Commission (SEC). For a current list of SEC-recognized ATSes, see here. The equivalent term under European legislation is a multilateral trading facility (MTF). ATSes are inclusive of electronic communications networks (ECNs), cross networks, call networks, and dark pools.
ECN
ECNs are a type of alternative trading system (ATS) that trade listed stocks and other exchange-traded products outside of traditional stock exchanges. As ATSes, ECNs are required to register with the SEC as broker-dealers and to register as members of FINRA.
ECNs are computerized systems that internally and automatically match limit orders, charging a fee for each transaction. These venues attempt to eliminate the third party's role in executing orders entered by an exchange market maker or an over-the-counter market maker. They permit such orders to be executed against in whole or in part.
Dark pools
Dark pools are a type of ATS that operate within private groups. They are private exchanges or forums for securities trading. Though their legal validity is subject to local regulations, they are popular for participants who want to disassociate their identity from market activity, and for participants who want to save money on transaction fees.
Type | Overview | Examples |
---|---|---|
Exchange | Regulated by the SEC under Section 6 of the Securities Exchange Act of 1934 | Nasdaq, NYSE, Cboe, TYO |
OTC markets | Regulated by FINRA. Dealer networks that trade unlisted securities and instruments that don't trade on formal exchanges (such as bonds, ADRs, and derivatives). | Best Market (OTCQX), the Venture Market (OTCQB), and the Pink Open Market |
ATS | Regulated as broker-dealers. Exempt from SEC "exchange" regulation under Exchange Act Rule 3a1-1(a). | BrokerTec, Nasdaq Fixed Income |
ECN | A type of ATS. A computerized system that automatically matches limit orders. | Pepperstone, FXCM, IC Markets |
MTF | The European equivalent of an ATS. Regulated under MiFID II. | Chi-X Europe |
Dark pools | A type of ATS. Operated within private groups as exchanges or forums. Popular for participants who want to disassociate their identity from market activity or save money on transaction fees. | Robinhood, CrossFinder, Sigma X, Citi-Match, MS Pool, Instinet, Liquidnet, ITG Posit, GETCO, Knight |
MBP-1 vs. TBBO vs. BBO schemas
Info: This article also applies to the CMBP-1, TCBBO, and CBBO schemas.
The MBP-1, TBBO, and BBO schemas are very similar: they share the same fields and all cover top-of-book information, each in a different way.
Key differences: Event update space
The easiest way to think of the differences between the MBP-1, TBBO and BBO schemas is to consider their update space:
- MBP-1 is in book update space.
- BBO on trade (TBBO) is in trade space.
- BBO on interval (BBO) is in time space, subsampled at 1-second or 1-minute intervals.
Recommended use cases
Another way to understand the differences between the schemas is to think of their use cases.
- Every top of book update: Of the three schemas, MBP-1 is the only one that provides every top of book update. If you need the most granular of these three schemas, you should use MBP-1.
- Illiquid instruments: For illiquid instruments that trade infrequently, the BBO schema is generally more useful than TBBO. If an instrument doesn't trade for extended periods of time, the best bid and offer retrieved from the last TBBO message can be stale.
- Options: The BBO schema is most useful for option instruments, since they have a very low ratio of trades to book updates and will exhibit an extremely large volume of MBP-1 updates relative to BBO or TBBO updates. Options MBP-1 data can be costly or difficult to process while options TBBO data can be stale. A majority of our options users use BBO instead of MBP-1.
- Trading signals based on trades: Since most of the price discovery happens around the time of a trade, TBBO can be used to construct many useful signals without the bloat of MBP-1.
- Display apps: If you're building a display app that doesn't show every update and just needs to display a reasonably recent trade price, the BBO schema is more convenient than TBBO or MBP-1 because the BBO schema includes a forward-filled last sale price. In other words, the BBO schema is easier to manage because it's stateless with regards to the trade price. Moreover, your app usually has to handle less traffic when using the BBO schema.
Deriving TBBO and BBO from MBP-1
You can always use MBP-1 and derive the other two schemas.
- You can derive the TBBO schema by removing any event from the MBP-1 schema that does not have an action of T, for trades.
- You can derive the BBO schema by subsampling the MBP-1 schema at either a 1-second or 1-minute interval.
However, it is not possible to derive BBO from TBBO or vice versa.
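The two derivations above can be sketched in plain Python. The `Mbp1` record below is a simplified, hypothetical stand-in with illustrative field names, not the actual record layout. The subsampling keeps the last update seen in each interval, which is one reasonable reading of "subsampling"; it does not reproduce other properties of the BBO schema, such as the forward-filled last sale price.

```python
from dataclasses import dataclass

NANOS_PER_SECOND = 1_000_000_000


@dataclass(frozen=True)
class Mbp1:
    """Simplified, hypothetical top-of-book record."""
    ts_recv: int    # receive timestamp, nanoseconds since epoch
    action: str     # 'T' marks a trade; other values are book updates
    bid_px: float
    ask_px: float


def derive_tbbo(records: list[Mbp1]) -> list[Mbp1]:
    """Trade space: keep only the events whose action is 'T'."""
    return [r for r in records if r.action == "T"]


def derive_bbo(records: list[Mbp1], interval_ns: int = NANOS_PER_SECOND) -> list[Mbp1]:
    """Time space: keep the last record in each interval (input must be time-ordered)."""
    last_in_bucket: dict[int, Mbp1] = {}
    for r in records:
        last_in_bucket[r.ts_recv // interval_ns] = r  # later records overwrite earlier ones
    return [last_in_bucket[k] for k in sorted(last_in_bucket)]


records = [
    Mbp1(0, "A", 100.00, 100.50),
    Mbp1(400_000_000, "T", 100.00, 100.50),
    Mbp1(900_000_000, "C", 100.25, 100.50),
    Mbp1(1_200_000_000, "T", 100.25, 100.50),
    Mbp1(3_100_000_000, "A", 100.25, 100.75),
]

tbbo = derive_tbbo(records)   # the two trade events
bbo_1s = derive_bbo(records)  # one record each for seconds 0, 1, and 3
print(len(records), len(tbbo), len(bbo_1s))  # 5 3 is not printed; see counts below
```

With this sample input, `tbbo` holds 2 records and `bbo_1s` holds 3. Note that second 2 produces no record here because the sketch does not forward-fill intervals without updates. The sample also shows why neither derived schema can be rebuilt from the other: the trade-only view drops the update at 900 ms, while the interval view drops the trade at 400 ms.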
Visual differences: MBP-1 vs. TBBO
The following graph illustrates the difference between MBP-1 and TBBO. One can observe that the bid-ask spread can be updated in MBP-1 without a trade event. That is not the case for TBBO, where every data point has an associated trade.
Visual differences: MBP-1 vs. BBO-1s
The following graph illustrates the difference between MBP-1 and BBO-1s on an options contract. Observe that BBO-1s has far fewer BBO updates, making it easier to handle.