This article introduces additional concepts to get started with equities data on Databento. If you're new to Databento, see the Quickstart guide.
Equities: Introduction
Overview
In this example, we'll show how to:
- Find an equities dataset
- Fetch top-of-book quotes and last sale
- Stream live OHLCV data
- Explain special conventions for equities vs. other asset classes on Databento
Finding an equities dataset
To use equities data on Databento, first navigate to our equities catalog page.
Next, click on the dataset of your choice.
For this example, we'll use the Nasdaq TotalView-ITCH dataset, whose dataset name is XNAS.ITCH. You'll need to pass this in as the dataset parameter of any API or client method.
Getting MBP-1 data
The easiest way to fetch top-of-book quotes and last sale is by requesting the MBP-1 schema, which gives all trades and all book updates that affect the top-of-book; some data providers refer to this as "L1" data.
Info: Note that MBP-1 should not be confused with the TBBO schema. MBP-1 is a superset of the TBBO schema, which is essentially MBP-1 sampled in trade space. See our FAQ page for more information on the differences between these schemas.
import databento as db
import matplotlib.pyplot as plt

# Create a historical client
client = db.Historical("$YOUR_API_KEY")

dataset = "XNAS.ITCH"
symbol = "AAPL"
start = "2024-10-28T12:30:00"
end = "2024-10-28T12:45:00"

# Request MBP-1 records for the symbol over the time range
data = client.timeseries.get_range(
    dataset=dataset,
    schema="mbp-1",
    symbols=symbol,
    start=start,
    end=end,
)

# Convert to a DataFrame and give the top-of-book columns shorter names
df = data.to_df()
df = df.rename(columns={"bid_px_00": "bid", "ask_px_00": "ask"})

# Plot the bid and ask as step functions
ax = df[["bid", "ask"]].plot(drawstyle="steps-post")

# Overlay the trades (action "T") as markers
df[df["action"] == "T"]["price"].plot(
    ax=ax,
    style=".",
    markersize=10,
    label="trade",
    color="C2",
    zorder=2,
    ylabel="Price",
    xlabel="Time",
    title=symbol,
)

plt.legend()
plt.show()
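To make the relationship between MBP-1 and TBBO more concrete, here is a minimal sketch of "sampling in trade space" using plain Python. The records below are synthetic, and the simplified field names (bid, ask) and the helper sample_at_trades are assumptions made for illustration; they are not part of the Databento client, and the sketch does not reproduce the exact TBBO record layout.

```python
# Synthetic MBP-1-like records (illustrative values, not real market data).
# Each record carries the event's action plus the top-of-book prices.
records = [
    {"action": "A", "price": 230.10, "size": 100, "bid": 230.10, "ask": 230.12},
    {"action": "T", "price": 230.12, "size": 40, "bid": 230.10, "ask": 230.12},
    {"action": "C", "price": 230.12, "size": 60, "bid": 230.10, "ask": 230.13},
    {"action": "T", "price": 230.10, "size": 25, "bid": 230.10, "ask": 230.13},
]


def sample_at_trades(mbp1_records):
    """Keep only trade events (action "T"), dropping other book updates."""
    return [r for r in mbp1_records if r["action"] == "T"]


tbbo_like = sample_at_trades(records)
for r in tbbo_like:
    print(r["price"], r["size"], r["bid"], r["ask"])
```

With a DataFrame like the one above, the equivalent filter is the df[df["action"] == "T"] expression already used for plotting the trades.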
Streaming live OHLCV data
The Databento US Equities Mini (EQUS.MINI) dataset is our most cost-effective solution for real-time, top-of-book US equities data. It combines multiple trading venues into one composite dataset that provides accurate best bid and offer (BBO) quotes, liquidity, and trade volume data, all without exchange license fees.
In this example, we will use the OHLCV schema and print out records every second. A full list of available schemas can be found on the dataset documentation page.
Info: This example requires a live data subscription to Databento US Equities. For more information, please visit our equities catalog page.
import databento as db

# Enable some basic logging
db.enable_logging("INFO")

# Create a live client and connect
live_client = db.Live("$YOUR_API_KEY")

# Subscribe to the OHLCV-1s schema for AMZN
live_client.subscribe(
    dataset="EQUS.MINI",
    schema="ohlcv-1s",
    symbols="AMZN",
)

# We add a print callback to view each record
live_client.add_callback(print)

# Start the live client to begin data streaming
live_client.start()

# Run the stream for 15 seconds before closing
live_client.block_for_close(timeout=15)
Info: If you do not see any output, it could be because the markets are closed. See the start parameter in Live.subscribe for using intraday replay.
SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 853, ts_event: 1738287525862149256 }, stype_in: 255, stype_in_symbol: "AMZN", stype_out: 255, stype_out_symbol: "AMZN", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
OhlcvMsg { hd: RecordHeader { length: 14, rtype: Ohlcv1S, publisher_id: EqusMiniEqus, instrument_id: 853, ts_event: 1738269000000000000 }, open: 235.915000000, high: 235.915000000, low: 235.880000000, close: 235.880000000, volume: 413 }
OhlcvMsg { hd: RecordHeader { length: 14, rtype: Ohlcv1S, publisher_id: EqusMiniEqus, instrument_id: 853, ts_event: 1738269001000000000 }, open: 235.870000000, high: 235.875000000, low: 235.870000000, close: 235.875000000, volume: 1139 }
OhlcvMsg { hd: RecordHeader { length: 14, rtype: Ohlcv1S, publisher_id: EqusMiniEqus, instrument_id: 853, ts_event: 1738269004000000000 }, open: 235.870000000, high: 235.870000000, low: 235.865000000, close: 235.865000000, volume: 301 }
...
OhlcvMsg { hd: RecordHeader { length: 14, rtype: Ohlcv1S, publisher_id: EqusMiniEqus, instrument_id: 853, ts_event: 1738269039000000000 }, open: 235.850000000, high: 235.860000000, low: 235.850000000, close: 235.860000000, volume: 465 }
OhlcvMsg { hd: RecordHeader { length: 14, rtype: Ohlcv1S, publisher_id: EqusMiniEqus, instrument_id: 853, ts_event: 1738269042000000000 }, open: 235.820000000, high: 235.820000000, low: 235.820000000, close: 235.820000000, volume: 1 }
OhlcvMsg { hd: RecordHeader { length: 14, rtype: Ohlcv1S, publisher_id: EqusMiniEqus, instrument_id: 853, ts_event: 1738269043000000000 }, open: 235.815000000, high: 235.815000000, low: 235.800000000, close: 235.800000000, volume: 207 }
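The ts_event values in the output above are large integers that look like nanoseconds since the UNIX epoch, and the sample skips from ...001 to ...004, which suggests that seconds without trades produce no bar. The sketch below, in plain Python, converts such a timestamp to a UTC datetime and forward-fills the missing seconds with flat, zero-volume bars. The tuple layout and the helpers ts_to_utc and fill_gaps are hypothetical and chosen for illustration; they are not part of the Databento client.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000

# (ts_event, open, high, low, close, volume), copied from the first three
# OHLCV records in the sample output above.
bars = [
    (1738269000000000000, 235.915, 235.915, 235.880, 235.880, 413),
    (1738269001000000000, 235.870, 235.875, 235.870, 235.875, 1139),
    (1738269004000000000, 235.870, 235.870, 235.865, 235.865, 301),
]


def ts_to_utc(ts_event_ns):
    """Convert nanoseconds since the UNIX epoch to a UTC datetime (whole seconds)."""
    return datetime.fromtimestamp(ts_event_ns // NANOS_PER_SECOND, tz=timezone.utc)


def fill_gaps(ohlcv_bars):
    """Insert flat, zero-volume bars for seconds that have no record."""
    filled = []
    for bar in ohlcv_bars:
        if filled:
            prev_ts, prev_close = filled[-1][0], filled[-1][4]
            ts = prev_ts + NANOS_PER_SECOND
            while ts < bar[0]:
                # Carry the previous close forward as open/high/low/close.
                filled.append((ts, prev_close, prev_close, prev_close, prev_close, 0))
                ts += NANOS_PER_SECOND
        filled.append(bar)
    return filled


filled_bars = fill_gaps(bars)
for bar in filled_bars:
    print(ts_to_utc(bar[0]).isoformat(), bar[4], bar[5])
```

Whether to forward-fill at all is a design choice: filled bars make time-aligned analysis easier, but they are synthetic and should not be counted as traded volume.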
Special conventions for equities on Databento
- Odd lots. All order quantities are expressed in units of 1 share. Databento's equities data generally includes odd lots as we source our data directly from prop feeds. If you've switched to Databento from a data provider that only distributes CTA/UTP SIP data, you may have never encountered order quantities for less than 100 shares as the SIPs only provide round lots.
- Explicit trade aggressor side. Equities prop feeds usually provide the aggressor side of each trade explicitly. Hence, the side field in our data is exact; it is not inferred using a trade classification algorithm. If the aggressor side is not known, we mark it with a side of N. If you've switched to Databento from a data provider that only distributes CTA/UTP SIP data, you may not have dealt with explicit trade aggressor side before, but you should try to make use of this field, as it provides valuable information that the SIPs do not.
- Ephemeral and Databento-assigned permanent instrument IDs. Several equities venues do not have a system of numeric instrument IDs that correspond to each ticker symbol; these venues rely solely on the ticker symbols to identify each instrument. On certain venues like Nasdaq, the venue does assign instrument IDs to each instrument that can be used alongside the ticker symbols, but such venue-assigned instrument IDs are ephemeral and rotate daily. This is unlike venues like CME Globex, where the venue-assigned instrument IDs are typically permanent until the instrument expires. Where the venue assigns an instrument ID, we pass this directly through, so the instrument IDs of a venue like Nasdaq may be difficult to work with. Where the venue doesn't assign an instrument ID, we'll generate our own synthetic instrument ID for consistency across all venues.
- Atomic executions and passive-side fills. On most equities venues, the fill event and order deletion are published at the same time as a single, atomic execution event. In such cases, Databento represents each execution event with action F, implying the passive-side fill, and provides a synthetic trade event with action T to represent the aggressor side of the trade. Note that this is merely for convenience, so you can reuse the same book management logic for all delta one products like equities and futures. In such cases, the aggressor size is not explicit: we only know the sizes of the passive fills corresponding to the trade, so the actual aggressor size needs to be inferred.
- Trade conditions. We do not provide trade condition codes as those are not provided on the prop feeds.
- Instrument definitions. Equities instrument definitions tend to be much sparser than those of derivatives like options and futures, so you should expect to see many null values when you fetch instrument definitions.
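As a rough illustration of the odd-lot and aggressor-side conventions above, the following sketch summarizes a handful of synthetic trades in plain Python. It assumes a round lot of 100 shares and that a side of B marks a buy aggressor and A a sell aggressor; only the N value for an unknown side is stated in the text above, so treat the other two codes as assumptions and check them against the schema documentation. The helper summarize_trades is hypothetical.

```python
ROUND_LOT = 100  # assumption: a round lot is 100 shares

# Synthetic trades (illustrative values, not real market data).
trades = [
    {"side": "B", "size": 250},
    {"side": "A", "size": 40},
    {"side": "N", "size": 100},
    {"side": "B", "size": 7},
]


def summarize_trades(trade_records):
    """Return signed volume, odd-lot trade count, and volume with unknown side."""
    signed_volume = 0
    odd_lot_count = 0
    unknown_side_volume = 0
    for trade in trade_records:
        if trade["side"] == "B":  # assumed: buy aggressor
            signed_volume += trade["size"]
        elif trade["side"] == "A":  # assumed: sell aggressor
            signed_volume -= trade["size"]
        else:  # "N": aggressor side not known
            unknown_side_volume += trade["size"]
        if trade["size"] < ROUND_LOT:
            odd_lot_count += 1
    return signed_volume, odd_lot_count, unknown_side_volume


signed_volume, odd_lot_count, unknown_side_volume = summarize_trades(trades)
print(signed_volume, odd_lot_count, unknown_side_volume)
```

Because the side field is exact rather than inferred, a signed-volume measure like this does not need a trade classification step; trades marked N are kept separate instead of being guessed.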
Symbology
In US equities, the same stock may be traded on different venues, but have slightly different ticker symbols on each venue.
Examples include BRK.B and AAC+ in the Nasdaq convention vs. BRK B and AAC WS in the CMS convention.
In addition to the raw_symbol symbology, which uses the publisher's convention, we also offer the cms and nasdaq symbology types to allow explicitly using one convention.
For Databento consolidated equities datasets and Nasdaq TotalView-ITCH, raw_symbol and nasdaq symbology types are equivalent.
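To show how the two conventions differ in practice, here is a toy converter in plain Python that handles only the two patterns in the examples above: a dot before a share class, and a trailing + for warrants. The function nasdaq_to_cms is hypothetical and is not part of the Databento client; the real conventions cover many more suffixes, so for actual requests it is safer to rely on the cms and nasdaq symbology types.

```python
def nasdaq_to_cms(symbol):
    """Convert a Nasdaq-convention symbol to CMS convention (two patterns only)."""
    if symbol.endswith("+"):
        # Warrants: "AAC+" becomes "AAC WS"
        return symbol[:-1] + " WS"
    if "." in symbol:
        # Share classes: "BRK.B" becomes "BRK B"
        root, suffix = symbol.split(".", 1)
        return f"{root} {suffix}"
    # Plain symbols are the same in both conventions
    return symbol


for nasdaq_symbol in ["BRK.B", "AAC+", "AAPL"]:
    print(nasdaq_symbol, "->", nasdaq_to_cms(nasdaq_symbol))
```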
Numeric publisher IDs
Consolidated equities feeds include data from several publishers. The publisher_id field included with each record indicates the source venue of the record.
These IDs can be converted to publisher names through the Publisher enum.