This example uses the instrument with raw symbol ESU3. You'll need to substitute this with your desired symbol.
A high-frequency liquidity-taking strategy
We'll introduce some terminology for this tutorial:
- A feature is any kind of basic independent variable that we think has some predictive value. This follows machine learning nomenclature; others may refer to this as a predictor or regressor in a statistical or econometric setting.
- A trading rule is a hardcoded trading decision. An example of a trading rule is "If there's only 1 order left at the best offer, lift the offer; if there is only 1 order left at the best bid, hit the bid". A trading rule may be a hardcoded trading decision taken when a feature value exceeds a certain threshold.
- A strategy that is based on trading rules is a rule-based strategy.
- A liquidity-taking strategy takes liquidity by crossing the spread with aggressive or marketable orders.
- A high-frequency strategy is characterized by a large number of trades.
Book skew and trading rule
The simplest type of book feature is the book skew, which is the imbalance between resting bid depth ($v_b$) and resting ask depth ($v_a$) at the top of the book. We could formulate this as a plain difference between $v_b$ and $v_a$, but depth can vary over orders of magnitude, so it's more convenient to take the difference of their logarithms instead.
$$\text{skew} = \log(v_b) - \log(v_a) = \log\frac{v_b}{v_a}$$
Notice that we picked this ordering simply because it's useful to formulate features such that positive values imply that we expect prices to go up; this makes it easier to debug your strategy. Intuitively, we expect higher bid depth to imply higher buy demand and hence higher prices.
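As a quick check of the sign convention, here is a minimal sketch of the skew calculation. It assumes a base-10 logarithm, which is what the implementation later in this tutorial uses; the function name `book_skew` and the error handling for an empty side of the book are illustrative choices, not part of the original code.

```python
import math


def book_skew(bid_size: int, ask_size: int) -> float:
    """Base-10 log ratio of resting bid depth to resting ask depth."""
    if bid_size <= 0 or ask_size <= 0:
        # The logarithm is undefined when one side of the book is empty.
        raise ValueError("bid_size and ask_size must be positive")
    return math.log10(bid_size) - math.log10(ask_size)


# Ten times more bid depth than ask depth: positive skew.
print(book_skew(100, 10))
# The mirrored book has the same magnitude with the opposite sign.
print(book_skew(10, 100))
# A balanced book has zero skew.
print(book_skew(25, 25))
```

With a base-10 logarithm, a skew of 1 means the bid depth is ten times the ask depth, so the feature reads directly in orders of magnitude.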
We can introduce a trading rule that buys when this feature exceeds some alpha threshold k, and sells when it goes below some threshold -k.
Minimum order quantity
While there's some practical advantage to trading larger clips for a liquidity-taking strategy, we'll start with a constant trade size equal to the minimum order quantity so that we minimize slippage and market impact considerations.
The minimum order quantity depends on the trading platform or market that you're on. For this toy strategy, we'll use E-mini S&P 500 (ES) futures as an example and trade in clips of 1 contract.
$$\text{skew} > k \rightarrow \text{Buy 1 lot}$$
$$\text{skew} < -k \rightarrow \text{Sell 1 lot}$$
Commissions
This strategy will be very sensitive to commissions because of the number of trades it makes, so we'll include commissions in the estimated PnL. These commissions can be found here.
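To get a feel for the scale of these costs, the following sketch works through the arithmetic using the per-side fee values from the configuration later in this tutorial. The tick size of 0.25 index points is an assumption about the ES contract specification, and the count of 1,000 sides is a hypothetical figure for illustration only.

```python
from decimal import Decimal

VENUE_FEES_PER_SIDE = Decimal("0.39")     # from this tutorial's configuration
CLEARING_FEES_PER_SIDE = Decimal("0.05")  # from this tutorial's configuration
POINT_VALUE = Decimal("50")               # dollars per index point
TICK_SIZE = Decimal("0.25")               # assumption: ES minimum tick, in index points

fees_per_side = VENUE_FEES_PER_SIDE + CLEARING_FEES_PER_SIDE
round_trip_fees = 2 * fees_per_side       # one buy plus one sell
tick_value = POINT_VALUE * TICK_SIZE      # dollar value of a one-tick move

# Price move, in index points and in ticks, needed to cover fees on a round trip
break_even_points = round_trip_fees / POINT_VALUE
break_even_ticks = round_trip_fees / tick_value

# Hypothetical activity level: total fees after 1,000 contract sides
fees_for_1000_sides = 1000 * fees_per_side

print(f"fees per round trip: ${round_trip_fees}")
print(f"break-even move: {break_even_points} points ({break_even_ticks} ticks)")
print(f"fees for 1,000 sides: ${fees_for_1000_sides}")
```

Fees per trade are small relative to one tick, but they accumulate linearly with trade count, which is why they matter for a strategy that trades this often.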
Position and risk limits
This strategy is convenient because you don't have to worry about complex order and position state management. You just let it build up whatever maximum position you want. Because we expect buys and sells to be symmetrically distributed in the long run, you will eventually get out of position. However, you might not have enough margin to build up arbitrarily large positions, so for proof of concept, we'll specify a maximum position of 10 contracts.
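$$\text{skew} > k \text{ and } \text{pos} < 10 \text{ lots} \rightarrow \text{Buy 1 lot}$$
$$\text{skew} < -k \text{ and } \text{pos} > -10 \text{ lots} \rightarrow \text{Sell 1 lot}$$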
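The position-limited rule can be sketched as a pure function that is easy to test without any market data. The name `trade_signal` and its return convention (+1 to buy, -1 to sell, 0 to do nothing) are illustrative; the default threshold of 1.7 and limit of 10 contracts match the parameters used in the implementation that follows.

```python
def trade_signal(skew: float, position: int, k: float = 1.7, position_max: int = 10) -> int:
    """Return +1 to buy 1 lot, -1 to sell 1 lot, or 0 to do nothing."""
    if skew > k and position < position_max:
        return 1
    if skew < -k and position > -position_max:
        return -1
    return 0


# Replay a short, made-up sequence of skew values and track the position.
position = 0
for skew in [2.0, 2.1, 0.3, -1.9, -2.5, 1.0]:
    position += trade_signal(skew, position)
print(position)
```

Note that the comparisons are strict, so a skew exactly equal to the threshold does not trade, and a position already at the limit ignores further signals in the same direction.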
Implementation
These are the parameters we have so far:
import json
import math
import sys
from dataclasses import dataclass, field
from decimal import Decimal

import databento as db
import pandas as pd

API_KEY = '$YOUR_API_KEY'


class Config:
    # Alpha threshold to buy/sell, k
    ALPHA_THRESHOLD: float = 1.7

    # Symbol
    DATASET = 'GLBX.MDP3'
    SYMBOL = 'ESU3'
    POINT_VALUE = 50  # $50 per index point

    # Fees
    VENUE_FEES_PER_SIDE: Decimal = Decimal('0.39')
    CLEARING_FEES_PER_SIDE: Decimal = Decimal('0.05')
    FEES_PER_SIDE: Decimal = VENUE_FEES_PER_SIDE + CLEARING_FEES_PER_SIDE

    # Position limit
    POSITION_MAX: int = 10
Since we're only simulating liquidity-taking at minimum size, we only need the top of the book, so our MBP-1 schema is sufficient. You can learn more about our MBP-1 schema here.
To keep this example simple, we'll assume zero round-trip latency for any orders placed. This enables us to implement a simple, online calculation of PnL for our real-time trading simulation.
@dataclass
class Strategy:
    # Dataset
    dataset: str
    # Instrument to trade
    symbol: str
    # Current position, in contract units
    position: int = 0
    # Number of long contract sides traded
    buy_qty: int = 0
    # Number of short contract sides traded
    sell_qty: int = 0
    # Total realized buy price
    real_total_buy_px: Decimal = Decimal(0)
    # Total realized sell price
    real_total_sell_px: Decimal = Decimal(0)
    # Total buy price to liquidate current position
    theo_total_buy_px: Decimal = Decimal(0)
    # Total sell price to liquidate current position
    theo_total_sell_px: Decimal = Decimal(0)
    # Total fees paid
    fees: Decimal = Decimal(0)
    # List to track results
    results: list = field(default_factory=list)

    def run(self) -> None:
        client = db.Live(key=API_KEY)
        client.subscribe(
            dataset=self.dataset,
            schema='mbp-1',
            stype_in='raw_symbol',
            symbols=[self.symbol],
            # start="2023-08-08T12:00",  # Burn-in start time
        )
        for record in client:
            if not isinstance(record, db.MBP1Msg):
                continue
            self.update(record)

    def update(self, record: db.MBP1Msg) -> None:
        ask_size = record.levels[0].ask_sz
        bid_size = record.levels[0].bid_sz

        # Skip one-sided books: the log of a zero size is undefined
        if ask_size == 0 or bid_size == 0:
            return

        # Prices are fixed-point integers in units of 1e-9
        ask_price = record.levels[0].ask_px / Decimal("1e9")
        bid_price = record.levels[0].bid_px / Decimal("1e9")

        skew = math.log10(bid_size) - math.log10(ask_size)

        # Buy/sell when the skew signal is large and the position limit allows it.
        # Fill prices are based on the BBO with assumed zero latency;
        # in practice, fills should be worse because of alpha decay.
        if skew > Config.ALPHA_THRESHOLD and self.position < Config.POSITION_MAX:
            self.position += 1
            self.buy_qty += 1
            self.real_total_buy_px += ask_price
            self.fees += Config.FEES_PER_SIDE
        elif skew < -Config.ALPHA_THRESHOLD and self.position > -Config.POSITION_MAX:
            self.position -= 1
            self.sell_qty += 1
            self.real_total_sell_px += bid_price
            self.fees += Config.FEES_PER_SIDE

        # Update the theoretical prices to liquidate the current position
        self.theo_total_buy_px = Decimal(0)
        self.theo_total_sell_px = Decimal(0)
        if self.position > 0:
            # A long position is liquidated by selling at the bid
            self.theo_total_sell_px = bid_price * abs(self.position)
        elif self.position < 0:
            # A short position is liquidated by buying at the ask
            self.theo_total_buy_px = ask_price * abs(self.position)

        # Compute PnL
        theo_pnl = (
            Config.POINT_VALUE
            * (
                self.real_total_sell_px
                + self.theo_total_sell_px
                - self.real_total_buy_px
                - self.theo_total_buy_px
            )
            - self.fees
        )

        result = {
            'ts_strategy': str(pd.Timestamp(record.ts_recv, tz='UTC')),
            'bid': f'{bid_price:0.2f}',
            'ask': f'{ask_price:0.2f}',
            'skew': f'{skew:0.3f}',
            'position': self.position,
            'trade_ct': self.buy_qty + self.sell_qty,
            'fees': f'{self.fees:0.2f}',
            'pnl': f'{theo_pnl:0.2f}',
        }
        print(json.dumps(result, indent=4))
        self.results.append(result)
if __name__ == "__main__":
    strategy = Strategy(dataset=Config.DATASET, symbol=Config.SYMBOL)
    while True:
        try:
            strategy.run()
        except KeyboardInterrupt:
            # Save the strategy log before exiting
            df = pd.DataFrame(strategy.results)
            df.to_csv('strategy_log.csv', index=False)
            sys.exit()
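The PnL accounting in the update step can be checked by hand without a live connection. The sketch below is a standalone restatement of the same idea under the same zero-latency assumption: realized fills plus the value of liquidating the open position at the current best bid or offer, minus fees on executed sides. The function `theo_pnl` and the prices used are hypothetical, and, like the listing above, it does not charge a fee for the hypothetical liquidating trade.

```python
from decimal import Decimal
from typing import Sequence

POINT_VALUE = Decimal("50")      # dollars per index point
FEES_PER_SIDE = Decimal("0.44")  # venue plus clearing fees from the configuration


def theo_pnl(
    buy_fills: Sequence[Decimal],
    sell_fills: Sequence[Decimal],
    bid: Decimal,
    ask: Decimal,
) -> Decimal:
    """Mark-to-liquidation PnL in dollars for 1-lot fills."""
    position = len(buy_fills) - len(sell_fills)
    points = sum(sell_fills, Decimal(0)) - sum(buy_fills, Decimal(0))
    if position > 0:
        points += bid * position     # a long position is sold at the bid
    elif position < 0:
        points -= ask * -position    # a short position is bought back at the ask
    fees = FEES_PER_SIDE * (len(buy_fills) + len(sell_fills))
    return POINT_VALUE * points - fees


# Buy 1 lot at the ask of 4500.25 while the bid is 4500.00.
at_entry = theo_pnl([Decimal("4500.25")], [], Decimal("4500.00"), Decimal("4500.25"))
# The book then moves up by two ticks: bid 4500.50, ask 4500.75.
after_move = theo_pnl([Decimal("4500.25")], [], Decimal("4500.50"), Decimal("4500.75"))
print(at_entry, after_move)
```

Immediately after entry the position shows a loss of one spread plus fees, because it is marked at the opposite side of the book; the price has to move by more than the spread before the trade shows a profit.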
Results
Further improvements
This implementation uses our simple synchronous client. For production applications, we recommend using our asynchronous client or callback model.
A problem encountered with the book skew is that extreme values may be influenced by spoofing. One possibility is to modify the trading rule and introduce an upper limit as follows:
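$$\text{skew} > k \text{ and } |\text{skew}| < L \text{ and } \text{pos} < 10 \text{ lots} \rightarrow \text{Buy 1 lot}$$
$$\text{skew} < -k \text{ and } |\text{skew}| < L \text{ and } \text{pos} > -10 \text{ lots} \rightarrow \text{Sell 1 lot}$$
Finally, recall that we assumed zero delay in order placement and fill. It's also important to incorporate a delay when extending this example.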
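Both extensions can be prototyped offline before touching the live strategy. The following is a minimal sketch under stated assumptions: the cap value of 3.0 for L, the 500-microsecond latency, the class `DelayedOrders`, and the quotes are all hypothetical, and a delayed order is simply filled at the best bid or offer of the first quote seen at or after its arrival time.

```python
from collections import deque
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Deque, List, Tuple


def capped_signal(skew: float, position: int, k: float = 1.7,
                  cap: float = 3.0, position_max: int = 10) -> int:
    """+1 to buy, -1 to sell, 0 otherwise; skews at or beyond the cap are ignored."""
    if abs(skew) >= cap:
        return 0
    if skew > k and position < position_max:
        return 1
    if skew < -k and position > -position_max:
        return -1
    return 0


@dataclass
class DelayedOrders:
    """Holds 1-lot marketable orders until a fixed latency has elapsed."""
    latency_ns: int
    # Queue of (arrival timestamp in ns, side), in submission order
    pending: Deque[Tuple[int, int]] = field(default_factory=deque)

    def submit(self, ts_ns: int, side: int) -> None:
        self.pending.append((ts_ns + self.latency_ns, side))

    def on_quote(self, ts_ns: int, bid: Decimal, ask: Decimal) -> List[Tuple[int, Decimal]]:
        """Fill every order that has arrived by ts_ns at the current BBO."""
        fills = []
        while self.pending and self.pending[0][0] <= ts_ns:
            _, side = self.pending.popleft()
            fills.append((side, ask if side > 0 else bid))
        return fills


orders = DelayedOrders(latency_ns=500_000)
if capped_signal(skew=2.0, position=0) == 1:
    orders.submit(ts_ns=0, side=1)
# A quote 100 microseconds later arrives before the order does: no fill yet.
early = orders.on_quote(100_000, Decimal("4500.00"), Decimal("4500.25"))
# By 600 microseconds the order has arrived and fills at the new, higher ask.
late = orders.on_quote(600_000, Decimal("4500.25"), Decimal("4500.50"))
print(early, late)
```

In this example the delayed buy fills one tick worse than the price that triggered it, which is the effect the zero-latency simulation hides. The sketch checks the position limit only at decision time, so a fuller version would also count orders still in flight.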