This article introduces additional concepts to get started with equity options data on Databento. If you're new to Databento, see the Quickstart guide.
Equity options: Introduction
Overview
In this example, we'll introduce:
- The OPRA dataset
- Using instrument definitions to get symbols, strike prices, and expirations
- Options Clearing Corporation (OCC) symbology for raw symbols
- Using parent instrument symbology to fetch an option chain
- Streaming live options trades
We'll also highlight any special conventions of equity options datasets on Databento that differ from those of other asset classes.
Finding an equity options dataset
To use equity options data on Databento, first identify the dataset that you want from our data catalog and go to its detail page. Here, you can find its dataset ID at the top left of the page.
For this example, we'll use the OPRA dataset, whose dataset ID is OPRA.PILLAR. You'll need to pass this in as the dataset parameter of any API or client method.
OPRA
OPRA provides consolidated last sale, exchange BBO, and national BBO across all US equity options exchanges. This includes single name stock options (e.g. TSLA), options on ETFs (e.g. SPY, QQQ), index options (e.g. VIX), and some indices (e.g. SPIKE and VSPKE).
Info: The "Pillar" suffix in the OPRA.PILLAR dataset ID is a reference to the SIAC Pillar SIP, the infrastructure behind the binary OPRA feed since 2021.
Using instrument definitions to get symbols, strike prices, and expirations
A full list of instrument definitions can be fetched by using the definition schema and passing symbols="ALL_SYMBOLS" to the timeseries.get_range method.
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
    dataset="OPRA.PILLAR",
    schema="definition",
    symbols="ALL_SYMBOLS",
    start="2024-08-09",
    limit=10000,
)
df = data.to_df()
print(df[["raw_symbol", "strike_price", "expiration"]])
raw_symbol strike_price expiration
ts_recv
2024-08-09 10:30:00.446956866+00:00 BKNG 250117P01770000 1770.0 2025-01-17 00:00:00+00:00
2024-08-09 10:30:00.446956866+00:00 BKNG NaN NaT
2024-08-09 10:30:00.446972667+00:00 BLK 250117C00430000 430.0 2025-01-17 00:00:00+00:00
2024-08-09 10:30:00.446972667+00:00 BLK NaN NaT
2024-08-09 10:30:00.448575941+00:00 UNH 250117P00640000 640.0 2025-01-17 00:00:00+00:00
... ... ... ...
2024-08-09 10:30:00.586069426+00:00 SPY 240814P00561000 561.0 2024-08-14 00:00:00+00:00
2024-08-09 10:30:00.586076435+00:00 SMCI 250117C01080000 1080.0 2025-01-17 00:00:00+00:00
2024-08-09 10:30:00.586081478+00:00 SMCI 250117C01440000 1440.0 2025-01-17 00:00:00+00:00
2024-08-09 10:30:00.586087572+00:00 QQQ 250331C00415000 415.0 2025-03-31 00:00:00+00:00
2024-08-09 10:30:00.586107573+00:00 SHY 250117P00085000 85.0 2025-01-17 00:00:00+00:00
Note that we used a timeseries method here and the results are indexed in time. These are called point-in-time instrument definitions. Many data providers treat instrument listings and definitions as a static table, so this may initially feel unintuitive if you're new to this approach. However, for options, this is important because new instruments are added intraday as the underlying moves, and knowing the strike prices of these instruments before they were listed may introduce lookahead effects.
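To make the point-in-time idea concrete, here is a minimal sketch in plain Python that does not call the Databento client. It assumes you have already extracted (ts_recv, raw_symbol) pairs from the definition records; the helper name known_as_of is hypothetical, and the two sample rows are taken from the output above with timestamps truncated to microseconds.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def known_as_of(definitions, as_of):
    """Return raw symbols whose definition was received at or before `as_of`.

    `definitions` is an iterable of (ts_recv, raw_symbol) pairs.
    """
    return [symbol for ts_recv, symbol in definitions if ts_recv <= as_of]


# Two rows from the definition output above (timestamps truncated to microseconds)
definitions = [
    (datetime(2024, 8, 9, 10, 30, 0, 446956, tzinfo=UTC), "BKNG 250117P01770000"),
    (datetime(2024, 8, 9, 10, 30, 0, 586069, tzinfo=UTC), "SPY 240814P00561000"),
]

# Only the BKNG definition had been received by 10:30:00.5 UTC
as_of = datetime(2024, 8, 9, 10, 30, 0, 500000, tzinfo=UTC)
visible = known_as_of(definitions, as_of)
print(visible)
```

A backtest that only considers instruments returned by a filter like this at each simulated timestamp cannot act on strikes that had not yet been listed.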
OCC symbology
Databento's raw symbols for equity options are based on OCC symbology, also known as Options Symbology Initiative (OSI) symbology. The OPRA dataset specification provides an overview of this symbology format.
You can specify the symbology type to use for your input symbols with the stype_in parameter. To demonstrate this, let's fetch data for the raw symbol SPY 241115P00525000, the SPY 525.00 put expiring on 2024-11-15, with stype_in="raw_symbol".
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
    dataset="OPRA.PILLAR",
    schema="cmbp-1",
    stype_in="raw_symbol",
    symbols=["SPY 241115P00525000"],
    start="2024-08-09T09:30-04:00",
    end="2024-08-09T10:00-04:00",
)
df = data.to_df()
print(df[["symbol", "bid_px_00", "ask_px_00"]].head())
symbol bid_px_00 ask_px_00
ts_recv
2024-08-09 13:30:00.025137089+00:00 SPY 241115P00525000 16.33 16.53
2024-08-09 13:30:00.167110712+00:00 SPY 241115P00525000 16.33 16.52
2024-08-09 13:30:00.182316856+00:00 SPY 241115P00525000 16.34 16.51
2024-08-09 13:30:00.238455572+00:00 SPY 241115P00525000 16.34 16.50
2024-08-09 13:30:00.340439305+00:00 SPY 241115P00525000 16.34 16.51
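The raw symbol itself encodes the contract terms. The following is a minimal sketch of decoding an OSI-style symbol in plain Python, assuming the usual layout of a root followed by a 15-character tail: a YYMMDD expiration, C or P, and an 8-digit strike in thousandths of a dollar. The parse_osi helper and OsiSymbol type are hypothetical names for illustration and are not part of the Databento client.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


@dataclass(frozen=True)
class OsiSymbol:
    root: str
    expiration: date
    option_type: str  # "C" for call, "P" for put
    strike: Decimal


def parse_osi(raw_symbol: str) -> OsiSymbol:
    """Split an OSI-style symbol into root, expiration, type, and strike."""
    # The last 15 characters are YYMMDD + C/P + 8-digit strike;
    # whatever precedes them (minus padding) is the root.
    tail = raw_symbol[-15:]
    root = raw_symbol[:-15].strip()
    if len(tail) != 15 or not root:
        raise ValueError(f"not an OSI-style symbol: {raw_symbol!r}")
    option_type = tail[6]
    if option_type not in ("C", "P") or not tail[7:].isdigit():
        raise ValueError(f"not an OSI-style symbol: {raw_symbol!r}")
    expiration = datetime.strptime(tail[:6], "%y%m%d").date()
    strike = Decimal(tail[7:]) / 1000  # strike is stored in thousandths
    return OsiSymbol(root, expiration, option_type, strike)


contract = parse_osi("SPY 241115P00525000")
print(contract)
```

Taking the tail from the right keeps the sketch indifferent to how many padding spaces follow the root.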
Using parent symbology to fetch an option chain
Working with individual raw symbols can be tedious. Many users find it easier to specify the underlying's symbol and fetch all option symbols for that underlying in an option chain at once.
This can be done using our parent symbology by passing in stype_in="parent" to your request. For example, we can print all tick-by-tick trades of options on AAPL for a given time range as follows:
import databento as db
client = db.Historical("$YOUR_API_KEY")
data = client.timeseries.get_range(
    dataset="OPRA.PILLAR",
    schema="trades",
    stype_in="parent",
    symbols=["AAPL.OPT"],
    start="2024-08-09T09:30-04:00",
    end="2024-08-09T10:00-04:00",
)
df = data.to_df()
print(df[["symbol", "price"]])
symbol price
ts_recv
2024-08-09 13:30:00.008689345+00:00 AAPL 240809C00217500 0.06
2024-08-09 13:30:00.021880156+00:00 AAPL 240816C00230000 0.10
2024-08-09 13:30:00.032886565+00:00 AAPL 240920P00175000 0.77
2024-08-09 13:30:00.036991612+00:00 AAPL 240809P00202500 0.08
2024-08-09 13:30:00.109699333+00:00 AAPL 240816C00215000 2.23
... ... ...
2024-08-09 13:59:58.801315993+00:00 AAPL 240816C00207500 6.80
2024-08-09 13:59:58.962219427+00:00 AAPL 240809C00212500 1.24
2024-08-09 13:59:59.036621416+00:00 AAPL 240816C00240000 0.02
2024-08-09 13:59:59.124512360+00:00 AAPL 240809C00212500 1.24
2024-08-09 13:59:59.858997181+00:00 AAPL 240809C00220000 0.02
Streaming live equity options trades
For this example, we will stream the TCBBO schema for all AAPL options. This schema shows every trade event along with the NBBO immediately before the effect of each trade.
Over 99.9% of events on the OPRA feed are CMBP-1 updates. It's significantly easier to work with another schema like TCBBO or trades.
While the OPRA feed does not disseminate the aggressor side for a trade (side will always be N), you can use the NBBO before the trade executes to help determine whether the trade was initiated by the buyer or the seller.
Info: This example requires a live license to OPRA.PILLAR. Visit our live data portal to sign up.
import databento as db
# Enable some basic logging
db.enable_logging("INFO")
# Create a live client and connect
live_client = db.Live("$YOUR_API_KEY")
# Subscribe to the TCBBO schema for all active AAPL options
live_client.subscribe(
    dataset="OPRA.PILLAR",
    schema="tcbbo",
    symbols="AAPL.OPT",
    stype_in="parent",
)
# We add a print callback to view each record
live_client.add_callback(print)
# Start the live client to begin data streaming
live_client.start()
# Run the stream for 5 seconds before closing
live_client.block_for_close(timeout=5)
Info: If you do not see any output, it could be because the markets are closed. See the start parameter in Live.subscribe for using intraday replay.
SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 16786676, ts_event: 1749829110951439303 }, stype_in: 255, stype_in_symbol: "AAPL.OPT", stype_out: 255, stype_out_symbol: "AAPL 271217P00055000", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 16779807, ts_event: 1749829110951440155 }, stype_in: 255, stype_in_symbol: "AAPL.OPT", stype_out: 255, stype_out_symbol: "AAPL 271217C00180000", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 33558588, ts_event: 1749829110951441016 }, stype_in: 255, stype_in_symbol: "AAPL.OPT", stype_out: 255, stype_out_symbol: "AAPL 250718P00130000", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
...
SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 33561832, ts_event: 1749829110951886324 }, stype_in: 255, stype_in_symbol: "AAPL.OPT", stype_out: 255, stype_out_symbol: "AAPL 260320C00095000", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 16795032, ts_event: 1749829110951887176 }, stype_in: 255, stype_in_symbol: "AAPL.OPT", stype_out: 255, stype_out_symbol: "AAPL 250919C00140000", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 33556365, ts_event: 1749829110951888028 }, stype_in: 255, stype_in_symbol: "AAPL.OPT", stype_out: 255, stype_out_symbol: "AAPL 260320C00185000", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
Cmbp1Msg { hd: RecordHeader { length: 20, rtype: Tcbbo, publisher_id: OpraPillarMxop, instrument_id: 33562619, ts_event: 1749829111047576435 }, price: 1.820000000, size: 5, action: 'T', side: 'N', flags: LAST | TOB (194), ts_recv: 1749829111047778941, ts_in_delta: 0, levels: [ConsolidatedBidAskPair { bid_px: 1.800000000, ask_px: 1.820000000, bid_sz: 27, ask_sz: 5, bid_pb: OpraPillarEdgo, ask_pb: OpraPillarMxop }] }
Cmbp1Msg { hd: RecordHeader { length: 20, rtype: Tcbbo, publisher_id: OpraPillarArco, instrument_id: 16794853, ts_event: 1749829111177886279 }, price: 7.550000000, size: 2, action: 'T', side: 'N', flags: LAST | TOB (194), ts_recv: 1749829111178089292, ts_in_delta: 0, levels: [ConsolidatedBidAskPair { bid_px: 7.550000000, ask_px: 7.600000000, bid_sz: 2, ask_sz: 1985, bid_pb: OpraPillarArco, ask_pb: OpraPillarAmxo }] }
Cmbp1Msg { hd: RecordHeader { length: 20, rtype: Tcbbo, publisher_id: OpraPillarXbox, instrument_id: 16778057, ts_event: 1749829111671382685 }, price: 2.880000000, size: 1, action: 'T', side: 'N', flags: LAST | TOB (194), ts_recv: 1749829111671585410, ts_in_delta: 0, levels: [ConsolidatedBidAskPair { bid_px: 2.860000000, ask_px: 2.890000000, bid_sz: 145, ask_sz: 295, bid_pb: OpraPillarC2Ox, ask_pb: OpraPillarC2Ox }] }
...
Cmbp1Msg { hd: RecordHeader { length: 20, rtype: Tcbbo, publisher_id: OpraPillarXndq, instrument_id: 33567927, ts_event: 1749829114445867737 }, price: 1.790000000, size: 3, action: 'T', side: 'N', flags: LAST | TOB (194), ts_recv: 1749829114446071520, ts_in_delta: 0, levels: [ConsolidatedBidAskPair { bid_px: 1.790000000, ask_px: 1.800000000, bid_sz: 299, ask_sz: 1, bid_pb: OpraPillarXndq, ask_pb: OpraPillarXcbo }] }
Cmbp1Msg { hd: RecordHeader { length: 20, rtype: Tcbbo, publisher_id: OpraPillarMxop, instrument_id: 33567927, ts_event: 1749829114654669177 }, price: 1.790000000, size: 3, action: 'T', side: 'N', flags: LAST | TOB (194), ts_recv: 1749829114654871828, ts_in_delta: 0, levels: [ConsolidatedBidAskPair { bid_px: 1.790000000, ask_px: 1.800000000, bid_sz: 82, ask_sz: 62, bid_pb: OpraPillarMxop, ask_pb: OpraPillarXbxo }] }
Cmbp1Msg { hd: RecordHeader { length: 20, rtype: Tcbbo, publisher_id: OpraPillarGmni, instrument_id: 16797248, ts_event: 1749829115260432023 }, price: 0.080000000, size: 1, action: 'T', side: 'N', flags: LAST | TOB (194), ts_recv: 1749829115260635088, ts_in_delta: 0, levels: [ConsolidatedBidAskPair { bid_px: 0.070000000, ask_px: 0.080000000, bid_sz: 404, ask_sz: 320, bid_pb: OpraPillarEmld, ask_pb: OpraPillarBato }] }
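Since side is always N on these records, one way to estimate the aggressor is to compare the trade price with the pre-trade NBBO. The sketch below applies a simple quote-rule heuristic in plain Python; the function name, the "buy"/"sell"/"unknown" labels, and the midpoint tie-break are illustrative assumptions rather than anything defined by Databento or OPRA. The inputs are the price, bid_px, and ask_px values from the sample records above.

```python
import math


def classify_aggressor(price, bid_px, ask_px):
    """Guess the trade initiator from the NBBO in effect before the trade."""
    values = (price, bid_px, ask_px)
    # Missing quotes (None or NaN) give nothing to compare against
    if any(v is None or math.isnan(v) for v in values):
        return "unknown"
    # A locked or crossed market is ambiguous under this rule
    if bid_px >= ask_px:
        return "unknown"
    if price >= ask_px:
        return "buy"   # traded at or through the offer
    if price <= bid_px:
        return "sell"  # traded at or through the bid
    # Inside the spread: lean toward the nearer side of the midpoint
    mid = (bid_px + ask_px) / 2
    if price > mid:
        return "buy"
    if price < mid:
        return "sell"
    return "unknown"


# (price, bid_px, ask_px) from the first three TCBBO records above
samples = [(1.82, 1.80, 1.82), (7.55, 7.55, 7.60), (2.88, 2.86, 2.89)]
labels = [classify_aggressor(*s) for s in samples]
print(labels)
```

Trades inside the spread are the least reliable case, so you may prefer to treat them as unknown rather than use the midpoint tie-break.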
Special conventions for equity options on Databento
- Empty levels and null prices. Many option contracts are illiquid and may have an empty book, especially at the start of day. Empty levels will have null prices and zero depth. We represent null prices with the maximum value of the price's integer type, e.g. 2^63 - 1, as a sentinel value. Our client libraries will generally replace these sentinel values with a more idiomatic null representation. For example, if you pass price_type="float" (default) to the DBNStore.to_df method of the Python client, these sentinel values will be replaced with NaN in the resulting DataFrame.
- Bandwidth. Due to the significant size of OPRA data, your live feed may get backed up if your client is not reading fast enough or network bandwidth is limited. This is a result of TCP flow control. Hence, we recommend setting up dedicated connectivity if you intend to consume a large number of symbols or the entire dataset in real time.
- Intraday listings. Our instrument definitions are timestamped on both historical and live data. This point-in-time treatment is especially useful if you intend to subscribe to and trade new strikes as they get listed, to backtest such a strategy, and to avoid the lookahead effects of knowing the underlying's future price changes.
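If you work with raw integer prices rather than a DataFrame, the null-price convention can be handled along these lines. This is a minimal sketch in plain Python: the constant and function names are hypothetical, and the 1e-9 fixed-point scale is an assumption based on the nine decimal places shown in the record output above, so check it against the schema documentation before relying on it.

```python
import math

UNDEF_PRICE = 2**63 - 1      # sentinel for a null price (int64 max)
PRICE_SCALE = 1_000_000_000  # assumed: 1 unit = 1e-9


def fixed_to_float(raw_px: int) -> float:
    """Convert a fixed-point integer price to float, mapping the sentinel to NaN."""
    if raw_px == UNDEF_PRICE:
        return math.nan
    return raw_px / PRICE_SCALE


bid = fixed_to_float(1_800_000_000)  # a populated level
empty = fixed_to_float(UNDEF_PRICE)  # an empty level
print(bid, empty)
```

Checking for the sentinel before scaling matters, because dividing 2^63 - 1 by the scale would otherwise produce a very large but plausible-looking price.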
Numeric publisher IDs
The source of each trade and quote is identified by a numeric publisher ID.
See also: US equity options volume by venue, which shows how to identify the exchange from the publisher ID.