7 ways to use order book data (MBO) outside high-frequency trading

October 27, 2023

Order book data (MBO) is frequently used in high-frequency trading (HFT) and often raises questions about its relevance beyond these trading scenarios. This article explores various applications of MBO in different contexts outside of HFT environments.

Before we dive into the use cases, let's quickly define MBO as it can be referred to in many different ways.

Order book data, also referred to as MBO (market by order), provides the most granular view of the market: every individual order event, keyed by its order ID. From these events you can derive the real-time bid, ask, and quote size, and, where the feed also carries trades, the high, low, and last price of the day.
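To make the definition concrete, here is a minimal Python sketch of how per-order events can be folded into a book. The action codes ("A" add, "C" cancel, "M" modify), the side codes, and the integer tick prices are assumptions chosen for illustration, not the exact field values of any particular feed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Order:
    side: str   # "B" = bid, "A" = ask (assumed convention)
    price: int  # price in integer ticks to avoid float comparisons
    size: int


class OrderBook:
    """Toy book keyed by order ID; a sketch, not a production book builder."""

    def __init__(self):
        self.orders = {}  # order_id -> Order

    def apply(self, action, order_id, side=None, price=None, size=None):
        if action == "A":      # add a new resting order
            self.orders[order_id] = Order(side, price, size)
        elif action == "C":    # cancel: remove the order if we know it
            self.orders.pop(order_id, None)
        elif action == "M":    # modify price and size of an existing order
            order = self.orders[order_id]
            order.price, order.size = price, size
        else:
            raise ValueError(f"unknown action: {action!r}")

    def levels(self, side):
        """Aggregate resting orders into (price, total_size), best price first."""
        agg = defaultdict(int)
        for order in self.orders.values():
            if order.side == side:
                agg[order.price] += order.size
        return sorted(agg.items(), reverse=(side == "B"))

    def bbo(self):
        """Best bid and best ask as (price, size) tuples, or None if a side is empty."""
        bids, asks = self.levels("B"), self.levels("A")
        return (bids[0] if bids else None, asks[0] if asks else None)


book = OrderBook()
book.apply("A", 1, side="B", price=100, size=5)
book.apply("A", 2, side="A", price=101, size=4)
print(book.bbo())
```

Note that the price-level view and the top of book both fall out of the order-level state, while the reverse is not possible: you can't recover individual orders from aggregated levels.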

At Databento, we avoid using terms like Level 1 (L1), Level 2 (L2), or Level 3 (L3) to ensure clarity and consistency. These naming conventions and applications can vary widely across different vendors, often leading to confusion. For example, some vendors would call a feed that has full depth but doesn't have individual order information L3.

Instead, we adhere to naming conventions that accurately represent the schemas.

Now that we've summarized MBO data, let's dive into some non-HFT use cases.

MBO data allows you to properly sequence the backtest or simulation of any strategy or execution algorithm that uses passive orders, and to simulate fill dynamics more accurately.
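As a rough illustration of why order-level events matter for passive fills, the sketch below tracks the queue ahead of a simulated resting order. It assumes strict price-time (FIFO) priority at a single price level, and the function name and event tuples are hypothetical. Because each cancel carries an order ID, the simulation knows whether the cancelled order was ahead of ours or behind it, something a level-based feed can't tell you.

```python
def simulate_passive_fill(ahead, our_size, events):
    """Estimate how much of our resting order fills.

    ahead:    dict of order_id -> size for orders queued ahead of ours,
              in time priority (dicts preserve insertion order).
    our_size: size of our simulated passive order.
    events:   iterable of (action, order_id, size) where
              "C" cancels `order_id` and "T" is an aggressive trade of
              `size` against our price level (assumed event shapes).
    """
    queue = dict(ahead)  # copy so the caller's dict is not mutated
    filled = 0
    for action, order_id, size in events:
        if action == "C":
            # Only cancels of orders ahead of us improve our position;
            # unknown IDs (orders behind us) are ignored.
            queue.pop(order_id, None)
        elif action == "T":
            remaining = size
            # The trade consumes orders ahead of us first, oldest first.
            for oid in list(queue):
                if remaining == 0:
                    break
                take = min(queue[oid], remaining)
                queue[oid] -= take
                remaining -= take
                if queue[oid] == 0:
                    del queue[oid]
            # Whatever is left over reaches our order.
            filled += min(remaining, our_size - filled)
        if filled == our_size:
            break
    return filled


events = [("C", 12, 0), ("T", None, 6), ("C", 99, 0), ("T", None, 2)]
print(simulate_passive_fill({11: 5, 12: 3}, our_size=4, events=events))
```

Dropping the first cancel from `events` changes the result, which is exactly the kind of sensitivity a simulation based on aggregated levels has to guess at.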

Almost all order book feeds provide visibility of every level in the book, which is useful for estimating things like capacity, sweep cost, and slippage. This is meaningful if you trade large sizes or sparse instruments, regardless of the need for low latency.
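For example, sweep cost can be estimated by walking the visible depth on the side you would take. The following is a minimal sketch; the function name and the sample book are made up for illustration.

```python
def sweep_cost(levels, qty):
    """Walk the book to fill `qty`.

    levels: list of (price, size) on the side being taken, best price first.
    Returns (average_fill_price, filled_qty); the average is None if
    nothing could be filled.
    """
    remaining = qty
    notional = 0.0
    for price, size in levels:
        take = min(size, remaining)
        notional += take * price
        remaining -= take
        if remaining == 0:
            break
    filled = qty - remaining
    if filled == 0:
        return None, 0
    return notional / filled, filled


# Hypothetical ask side of a thin book.
asks = [(100.0, 200), (100.5, 300), (101.0, 500)]
avg_price, filled = sweep_cost(asks, 600)
slippage = avg_price - asks[0][0]  # cost per share versus the best ask
print(avg_price, filled, slippage)
```

If `filled` comes back smaller than the requested quantity, the visible book can't absorb the order, which is itself a useful capacity signal.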

Incremental deltas are a more efficient format than book-level snapshots because each message carries only what changed, which is better for performance optimization.

Some venues use an incremental order book feed as their main or only feed, so MBO data is required when working with one of these venues.

Because order book feeds are ubiquitous, it's easier to write software abstractions that purely consume order book data at any venue than to build a platform that has to handle a mix of price level feeds and MBO.

In the context of US equities, top-of-book data from the SIPs, which is typically associated with "TAQ" or "L1 data", leaves out a lot of liquidity information compared to the order-based prop feeds: odd lots, non-top levels at each venue, trade aggressor side, imbalance, etc.

Odd lots are of particular interest: they make up about half of US market liquidity and will continue to grow in share. Trade side inference with rule-based approaches isn't very accurate; e.g., the classical Lee-Ready algorithm is only 70-80% accurate.
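For reference, here is a simplified sketch of Lee-Ready style classification: the quote rule compares each trade to the prevailing midpoint, and the tick test breaks ties at the midpoint. It omits the quote-lag adjustment from the original paper, and the input layout (integer tick prices with the prevailing bid and ask attached to each trade) is an assumption for illustration.

```python
def lee_ready(trades):
    """Infer aggressor side for each trade.

    trades: list of (price, bid, ask) in time order, in integer ticks.
    Returns a list of "B" (buyer-initiated), "S" (seller-initiated),
    or None when neither rule can decide.
    """
    sides = []
    last_price = None  # previous trade price
    last_dir = None    # direction of the last non-zero price change
    for price, bid, ask in trades:
        # Update the tick direction; zero ticks keep the previous direction.
        if last_price is not None and price != last_price:
            last_dir = "B" if price > last_price else "S"
        mid = (bid + ask) / 2
        if price > mid:        # quote rule: above the midpoint is a buy
            side = "B"
        elif price < mid:      # below the midpoint is a sell
            side = "S"
        else:                  # at the midpoint, fall back to the tick test
            side = last_dir
        sides.append(side)
        last_price = price
    return sides


trades = [(1002, 1000, 1002), (1000, 1000, 1002), (1001, 1000, 1002)]
print(lee_ready(trades))
```

With an order-based feed that reports the aggressor side directly, this inference step and its error rate go away.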

There are a few widely known classes of features, e.g., book pressure, that can't be constructed perfectly from level-based data and that will give a non-trivial increase in your model R^2 or other fit measures. For certain mid-frequency strategies, utilizing these features is necessary to gain an edge over top competitors that are already using them.
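As one hypothetical illustration, the sketch below compares a plain size-imbalance version of book pressure, which level data can supply, with a variant that counts only orders that have rested for a minimum time. The second version needs per-order add timestamps, so it can't be built from aggregated levels. Both function names and the age filter are assumptions made for this example, not a feature definition from any particular model.

```python
def book_pressure(bid_size, ask_size):
    """Size imbalance in [-1, 1]; positive means more resting bid size."""
    total = bid_size + ask_size
    return 0.0 if total == 0 else (bid_size - ask_size) / total


def aged_book_pressure(bid_orders, ask_orders, now, min_age):
    """Imbalance using only orders that have rested for at least `min_age`.

    bid_orders / ask_orders: lists of (ts_added, size) for the orders at the
    best bid and best ask. Timestamps and `min_age` share the same unit.
    """
    bid = sum(size for ts, size in bid_orders if now - ts >= min_age)
    ask = sum(size for ts, size in ask_orders if now - ts >= min_age)
    return book_pressure(bid, ask)


# Hypothetical top of book: a large bid was added only 5 time units ago.
bid_orders = [(0, 100), (95, 400)]
ask_orders = [(10, 300)]
print(book_pressure(500, 300))
print(aged_book_pressure(bid_orders, ask_orders, now=100, min_age=10))
```

In this made-up example the two versions disagree in sign, because most of the displayed bid size comes from a very recent order.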

Getting started with MBO on Databento is simple. We recommend checking out our schemas doc, which describes what data is included in our MBO feed, and our fields doc, which is essentially a data dictionary that gives the type and description of each field in each schema. To start pulling data, you can view our docs examples on order actions and order tracking.