
Futures: Introduction

Info

This article introduces additional concepts to get started with futures data on Databento. If you're new to Databento, see the Quickstart guide.

Overview

In this example, we'll show how to:

  • Find a futures dataset
  • Find the 10 futures contracts with the highest volume
  • Use instrument definitions to get the tick size, expiration, and matching algorithm of an instrument
  • Stream live BBO data
  • Use parent symbology to fetch all contract expirations
  • Use continuous contract symbology to get the lead month contract

We'll also highlight any special conventions of futures datasets on Databento that differ from those of other asset classes.

Finding a futures dataset

To use futures data on Databento, first identify the dataset that you want from our data catalog and go to its detail page. Here, you can find its dataset ID at the top left of the page.

[Image: Dataset ID]

For this example, we'll use the CME Globex MDP 3.0 dataset, whose dataset ID is GLBX.MDP3. You'll need to pass this in as the dataset parameter of any API or client method.

Finding futures contracts with highest volume

A quick way to find the most actively traded futures contracts, across all expirations, is to fetch the daily volumes from the OHLCV-1d schema. (You can also get a similar result using the statistics schema, which provides the official daily settlement prices and trade volumes.)

import databento as db

# Create historical client
client = db.Historical("$YOUR_API_KEY")

def rank_by_volume(top=10):
    # Request OHLCV-1d data
    data = client.timeseries.get_range(
        dataset="GLBX.MDP3",
        symbols="ALL_SYMBOLS",
        schema="ohlcv-1d",
        start="2023-08-15"
    )

    # Convert to DataFrame and keep the `top` instruments by volume
    df = data.to_df()
    return df.sort_values(by="volume", ascending=False)["instrument_id"].to_list()[:top]

top_instruments = rank_by_volume()
print(top_instruments)

This returns the following list of numeric instrument IDs, from the instrument_id field.

[338574, 3445, 404144, 9235, 2922, 2130, 72156, 399495, 225833, 1562]
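
If a request spans more than one day, each instrument contributes one OHLCV-1d row per day, so sorting the rows directly can list the same instrument more than once. The following is a minimal sketch of summing volume per instrument before ranking. It works on plain (instrument_id, volume) pairs instead of API output, and the function name and sample volumes are hypothetical.

```python
from collections import Counter

def rank_by_total_volume(rows, top=10):
    """Sum volume per instrument ID, then return the `top` IDs by total volume."""
    totals = Counter()
    for instrument_id, volume in rows:
        totals[instrument_id] += volume
    return [instrument_id for instrument_id, _ in totals.most_common(top)]

# Hypothetical rows: (instrument_id, daily volume); the volumes are made up
rows = [(3445, 900), (338574, 1200), (3445, 800), (2922, 500), (338574, 300)]
print(rank_by_total_volume(rows, top=2))
```

With a DataFrame, the same idea would be a groupby on instrument_id followed by a sum of the volume column.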

Using instrument definitions to get tick size, expiration, and matching algorithm

In the case of CME Globex MDP 3.0, these instrument IDs are sourced from tag 48-SecurityID of the original Security Definition messages.

Instrument IDs are necessary for many order routing and post-trade scenarios with the exchange, but can be hard to work with, so we will print their raw symbols instead. Let's extract useful properties of these instruments, like their tick sizes, expiration dates, and matching algorithms.

def get_symbol_properties(instrument_id_list):
    # Request definition data
    data = client.timeseries.get_range(
        dataset="GLBX.MDP3",
        stype_in="instrument_id",
        symbols=instrument_id_list,
        schema="definition",
        start="2023-08-15",
    )

    # Convert to DataFrame
    df = data.to_df()
    return df[["instrument_id", "raw_symbol", "min_price_increment", "match_algorithm", "expiration"]]

print(get_symbol_properties(top_instruments))
                           instrument_id raw_symbol  min_price_increment match_algorithm                expiration
ts_recv
2023-08-15 00:00:00+00:00         225833       ZBU3             0.031250               F 2023-09-20 17:01:00+00:00
2023-08-15 00:00:00+00:00           1562      SR3Z3             0.005000               A 2024-03-19 21:00:00+00:00
2023-08-15 00:00:00+00:00           2922      MESU3             0.250000               F 2023-09-15 13:30:00+00:00
2023-08-15 00:00:00+00:00         399495       TNU3             0.015625               F 2023-09-20 17:01:00+00:00
2023-08-15 00:00:00+00:00         338574       ZNU3             0.015625               F 2023-09-20 17:01:00+00:00
2023-08-15 00:00:00+00:00           3445       ESU3             0.250000               F 2023-09-15 13:30:00+00:00
2023-08-15 00:00:00+00:00          72156       ZTU3             0.003906               K 2023-09-29 17:01:00+00:00
2023-08-15 00:00:00+00:00           9235      MNQU3             0.250000               F 2023-09-15 13:30:00+00:00
2023-08-15 00:00:00+00:00           2130       NQU3             0.250000               F 2023-09-15 13:30:00+00:00
2023-08-15 00:00:00+00:00         404144       ZFU3             0.007812               F 2023-09-29 17:01:00+00:00

Note that the match_algorithm values are the raw codes passed through from the exchange. The full list of supported instrument definition fields and details about the output can be found on the exchange's specifications page.
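
One common use of the tick size is snapping a computed price to a valid price level. The sketch below assumes prices are plain floats and takes the tick sizes from the min_price_increment column above. The helper names are hypothetical and not part of any Databento client.

```python
def round_to_tick(price, tick_size):
    """Round a price to the nearest multiple of tick_size."""
    return round(price / tick_size) * tick_size

def ticks_between(low, high, tick_size):
    """Return the number of whole ticks between two prices."""
    return round((high - low) / tick_size)

# Tick sizes taken from the min_price_increment column above
ES_TICK = 0.25      # ESU3
ZN_TICK = 0.015625  # ZNU3, 1/64 of a point

print(round_to_tick(4500.30, ES_TICK))
print(round_to_tick(110.02, ZN_TICK))
print(ticks_between(4500.00, 4501.00, ES_TICK))
```

These two tick sizes are powers of two, so the float arithmetic here is exact. For decimal tick sizes such as the 0.005 of SR3Z3, consider Python's decimal module to avoid small floating-point errors.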

Streaming live BBO data

While highly liquid futures contracts generally maintain narrow bid-ask spreads, certain market conditions can widen the spread. Stream our BBO-1s schema to monitor the current best bid and best offer, subsampled at 1-second intervals.

This example uses the BBO-1s schema for top-of-book information, but other similar schemas exist that may better suit your use case. Read more about these schemas on our MBP-1 vs. TBBO vs. BBO schemas page.

Info

This example requires a live license to GLBX.MDP3. Visit our live data portal to sign up.

import databento as db

# Enable basic logging
db.enable_logging("INFO")

# Create a live client
live_client = db.Live("$YOUR_API_KEY")

# Subscribe to the BBO-1s schema for the continuous NQ contract
live_client.subscribe(
    dataset="GLBX.MDP3",
    schema="bbo-1s",
    symbols="NQ.v.0",
    stype_in="continuous",
)

# Add a print callback
live_client.add_callback(print)

# Start the live client to begin streaming
live_client.start()

# Run the stream for 15 seconds before closing
live_client.block_for_close(timeout=15)
Info

If you do not see any output, the markets may be closed. See the start parameter of Live.subscribe to use intraday replay.

SymbolMappingMsg { hd: RecordHeader { length: 44, rtype: SymbolMapping, publisher_id: 0, instrument_id: 42288528, ts_event: 1738586561618888067 }, stype_in: 255, stype_in_symbol: "NQ.v.0", stype_out: 255, stype_out_symbol: "NQH5", start_ts: 18446744073709551615, end_ts: 18446744073709551615 }
BboMsg { hd: RecordHeader { length: 20, rtype: Bbo1S, publisher_id: GlbxMdp3Glbx, instrument_id: 42288528, ts_event: 1738586561944562749 }, price: 21181.750000000, size: 1, side: 'B', flags: LAST (130), ts_recv: 1738586562000000000, sequence: 20360097, levels: [BidAskPair { bid_px: 21180.250000000, ask_px: 21181.250000000, bid_sz: 1, ask_sz: 1, bid_ct: 1, ask_ct: 1 }] }
BboMsg { hd: RecordHeader { length: 20, rtype: Bbo1S, publisher_id: GlbxMdp3Glbx, instrument_id: 42288528, ts_event: 1738586562826454327 }, price: 21180.750000000, size: 1, side: 'A', flags: LAST (130), ts_recv: 1738586563000000000, sequence: 20360523, levels: [BidAskPair { bid_px: 21180.250000000, ask_px: 21181.250000000, bid_sz: 1, ask_sz: 1, bid_ct: 1, ask_ct: 1 }] }
BboMsg { hd: RecordHeader { length: 20, rtype: Bbo1S, publisher_id: GlbxMdp3Glbx, instrument_id: 42288528, ts_event: 1738586563257839295 }, price: 21182.500000000, size: 1, side: 'B', flags: LAST (130), ts_recv: 1738586564000000000, sequence: 20361382, levels: [BidAskPair { bid_px: 21182.250000000, ask_px: 21183.000000000, bid_sz: 3, ask_sz: 1, bid_ct: 1, ask_ct: 1 }] }
...
BboMsg { hd: RecordHeader { length: 20, rtype: Bbo1S, publisher_id: GlbxMdp3Glbx, instrument_id: 42288528, ts_event: 1738586573077758761 }, price: 21181.750000000, size: 2, side: 'A', flags: LAST (130), ts_recv: 1738586574000000000, sequence: 20367086, levels: [BidAskPair { bid_px: 21182.250000000, ask_px: 21183.000000000, bid_sz: 1, ask_sz: 1, bid_ct: 1, ask_ct: 1 }] }
BboMsg { hd: RecordHeader { length: 20, rtype: Bbo1S, publisher_id: GlbxMdp3Glbx, instrument_id: 42288528, ts_event: 1738586573077758761 }, price: 21181.750000000, size: 2, side: 'A', flags: LAST (130), ts_recv: 1738586575000000000, sequence: 20367373, levels: [BidAskPair { bid_px: 21182.750000000, ask_px: 21183.750000000, bid_sz: 1, ask_sz: 1, bid_ct: 1, ask_ct: 1 }] }
BboMsg { hd: RecordHeader { length: 20, rtype: Bbo1S, publisher_id: GlbxMdp3Glbx, instrument_id: 42288528, ts_event: 1738586575233261001 }, price: 21182.250000000, size: 1, side: 'A', flags: LAST (130), ts_recv: 1738586576000000000, sequence: 20367799, levels: [BidAskPair { bid_px: 21181.250000000, ask_px: 21182.250000000, bid_sz: 2, ask_sz: 1, bid_ct: 2, ask_ct: 1 }] }
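
To monitor the spread, you can reduce each record to a spread measured in ticks. The sketch below uses values copied from the first BboMsg in the sample output and assumes the bid and ask are already available as decimal floats. How you read them from a live record depends on the client library, so the helper names here are hypothetical. It also converts a nanosecond timestamp such as ts_recv to a UTC datetime.

```python
from datetime import datetime, timezone

NQ_TICK = 0.25  # min_price_increment shown for NQU3 in the definition output

def spread_in_ticks(bid_px, ask_px, tick_size=NQ_TICK):
    """Return the bid-ask spread in whole ticks (negative if the spread is inverted)."""
    return round((ask_px - bid_px) / tick_size)

def ns_to_utc(ts_ns):
    """Convert a nanosecond UNIX timestamp to a UTC datetime (microsecond precision)."""
    seconds, nanos = divmod(ts_ns, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=nanos // 1000)

# bid_px, ask_px, and ts_recv copied from the first BboMsg above
print(spread_in_ticks(21180.25, 21181.25))
print(ns_to_utc(1738586562000000000))
```

A negative result from spread_in_ticks indicates an inverted spread, which can occur during a trading halt.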

Parent symbology

On futures trading venues, it can be tedious to fetch every child instrument and contract expiration (like ESU3, ESZ3, ESU3-ESZ3) for a given parent instrument (like ES). You can use our parent symbology type to do this, by passing in stype_in="parent".

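def get_child_instruments(parents=None):
    # Avoid a mutable default argument; fall back to the example parents
    if parents is None:
        parents = ["ZB.FUT", "SR3.FUT"]

    # Request definition data for parent symbols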
    data = client.timeseries.get_range(
        dataset="GLBX.MDP3",
        symbols=parents,
        stype_in="parent",
        schema="definition",
        start="2023-08-15",
    )

    # Convert to DataFrame
    df = data.to_df()
    return df[["instrument_id", "raw_symbol"]]

print(get_child_instruments().head())
                           instrument_id       raw_symbol
ts_recv
2023-08-15 00:00:00+00:00          34810    SR3:AB 01Y M8
2023-08-15 00:00:00+00:00          45040      SR3M6-SR3Z7
2023-08-15 00:00:00+00:00          51186      SR3Z4-SR3M6
2023-08-15 00:00:00+00:00          22508  SR3:DF H7Z7U8M9
2023-08-15 00:00:00+00:00         350151  SR3:SB PK M4-M5

Alternatively, you can replicate this logic by requesting definition data for all symbols and filtering on the asset field. The asset field is populated with the root of the parent symbol.
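
As a rough sketch of that filtering step, the example below works on plain dictionaries instead of API output. The raw symbols are taken from the tables above, while the asset values and the function name are assumed for illustration.

```python
def filter_by_asset(definitions, roots):
    """Keep definition rows whose asset field matches one of the given roots."""
    wanted = set(roots)
    return [row for row in definitions if row["asset"] in wanted]

# Hypothetical definition rows; the asset values are assumed for illustration
definitions = [
    {"raw_symbol": "SR3M6-SR3Z7", "asset": "SR3"},
    {"raw_symbol": "ZBU3", "asset": "ZB"},
    {"raw_symbol": "ESU3", "asset": "ES"},
]

children = filter_by_asset(definitions, ["ZB", "SR3"])
print([row["raw_symbol"] for row in children])
```

With a DataFrame from to_df(), the equivalent filter would be a boolean mask such as df[df["asset"].isin(["ZB", "SR3"])].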

Databento's web portal only exposes parent products, and not child instruments. When you set up a batch download through the web portal, note that you're making a parent symbology request and you'll receive interleaved data from multiple instruments. If you want to set up a batch download of individual child instruments, you must use our API instead.

Continuous contract symbology

Likewise, it's tedious to track the lead month contract of a futures product over a long period of time, due to rollovers. You can use our continuous contract symbology type to get a single symbol that is pegged to the lead month contract, by passing in stype_in="continuous".

For example, let's plot the closing prices of ES.n.0 and ES.n.1, the two ES contracts ranked highest by open interest.

import databento as db
import matplotlib.pyplot as plt

client = db.Historical("$YOUR_API_KEY")

dataset = "GLBX.MDP3"
symbols = ["ES.n.0", "ES.n.1"]
start = "2024"

data = client.timeseries.get_range(
    dataset=dataset,
    schema="ohlcv-1d",
    stype_in="continuous",
    symbols=symbols,
    start=start,
)

df = data.to_df()
df.groupby("symbol")["close"].plot(
    xlabel="Date",
    ylabel="Price",
)

plt.legend()
plt.show()

[Figure: ES plot]

Tip

If you'd like to see the two calendar front month contracts instead, you may use ES.c.0 and ES.c.1. However, in most cases these will resolve to the same symbols as ES.n.0 and ES.n.1 respectively because open interest tends to decay with increasing time to expiration.

Many futures products reflect seasonality in commodities or term structure in fixed income, so the nearest calendar month may not be the lead month. In such cases, you need to specify how you want to resolve the lead month, also known as a roll rule. For example, you could use ES.v.0 to resolve the lead month by volume instead of open interest.
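
The continuous symbols used in this article follow a root, roll rule, and rank pattern, such as ES.n.1. The sketch below splits a symbol into those three parts. It covers only the three roll rules mentioned here (c, n, and v) and assumes the root itself contains no dots. The function name is hypothetical.

```python
# Roll rules mentioned in this article
ROLL_RULES = {"c": "calendar", "n": "open interest", "v": "volume"}

def parse_continuous(symbol):
    """Split a continuous symbol like 'ES.n.1' into (root, roll rule name, rank)."""
    parts = symbol.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected ROOT.RULE.RANK, got {symbol!r}")
    root, rule, rank = parts
    if rule not in ROLL_RULES:
        raise ValueError(f"unknown roll rule {rule!r} in {symbol!r}")
    if not rank.isdigit():
        raise ValueError(f"rank must be a non-negative integer in {symbol!r}")
    return root, ROLL_RULES[rule], int(rank)

print(parse_continuous("ES.n.1"))
print(parse_continuous("NQ.v.0"))
```

A check like this can catch a mistyped roll rule before the request is sent.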

Special conventions for futures on Databento

  • Weekly trading session. Unlike many venues, CME Globex has a weekly trading session. This affects how you process MBO data. We provide a synthetic snapshot of the book at the start of each UTC day to make it easier to start from any day of the week.
  • User-defined instruments and spreads. CME Globex has a large number of user-defined instruments. While many vendors do not expose these and their raw symbols may be foreign to a user who's seeing these for the first time, Databento includes all of them as many are highly liquid and active.
  • Asynchronous trade publication. CME Globex prints fills and order deletions associated with the fills asynchronously, with the fills published before the deletions. This is unlike most venues, which treat the individual fill and corresponding order deletion as a single atomic event. You may choose to preemptively update your book based on trades or fills, or wait until the corresponding deletes, which we represent with action C.
  • Implied book. CME Globex has implied matching. If a trade is partially filled by contra liquidity on the implied book, we show the full quantity of the trade but only the fill quantities on the direct book. A full implied trade will have trade side N.
  • Inverted spreads. CME Globex's matching engine has various price limits and circuit breakers. The spread may be inverted during a trading halt.
  • No rollover back-adjustments. Our continuous contract symbology is a notation that maps to an actual, tradable instrument on any given date. The continuous contract prices provided are the original, unadjusted prices. We don't create a synthetic time series by back-adjusting the prices to remove jumps during rollovers.
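
Since the continuous contract prices are unadjusted, you may want to back-adjust them yourself. The following is a minimal sketch of difference back-adjustment, not a Databento feature. It assumes you have aligned daily closes for the rank 0 and rank 1 symbols (such as ES.n.0 and ES.n.1), and that the contract that becomes rank 0 on a roll date was rank 1 on the previous day. The instrument IDs and prices are made up.

```python
def back_adjust(front_ids, front_closes, next_closes):
    """Shift prices before each roll by the gap between the two contracts.

    front_ids: instrument ID that the rank 0 symbol maps to on each day
    front_closes: close of the rank 0 symbol on each day
    next_closes: close of the rank 1 symbol on each day
    """
    adjusted = list(front_closes)
    for i in range(1, len(front_ids)):
        if front_ids[i] != front_ids[i - 1]:
            # Gap between the incoming and outgoing contract on the last pre-roll day
            gap = next_closes[i - 1] - front_closes[i - 1]
            for j in range(i):
                adjusted[j] += gap
    return adjusted

# Hypothetical data: the front contract rolls from ID 1 to ID 2 on the third day
front_ids = [1, 1, 2, 2]
front_closes = [100.0, 101.0, 106.0, 107.0]
next_closes = [104.0, 105.0, 110.0, 111.0]
print(back_adjust(front_ids, front_closes, next_closes))
```

The adjusted series removes the jump at the roll but no longer matches prices that actually traded, so keep the unadjusted prices for anything that depends on real price levels.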
See also