How to build a blazing-fast real-time stock screener with Python and Databento

This tutorial shows you how to build a real-time stock screener (or scanner) that continuously analyzes a real-time market data feed across all U.S. equities—over 9,000 NMS stocks and ETFs.
By the end of this tutorial, you’ll be able to output alerts timestamped to the nanosecond, like this:
[2025-04-23T04:00:01.688717194-04:00] NVDA moved by 4.40% (current: 98.1900, previous: 102.7100)
[2025-04-23T04:00:02.121450448-04:00] BABA moved by 10.34% (current: 131.2750, previous: 118.9700)
[2025-04-23T06:36:23.507827237-04:00] AMZN moved by 7.36% (current: 193.8850, previous: 180.6000)
You can also skip ahead and get the complete code on GitHub here.
A major challenge with implementing a stock scanner is that it needs to listen to all tickers on the market. Very few market data feeds can efficiently stream real-time prices for all ~9,000 U.S. equity tickers over the internet.
While many retail companies provide a stock screener as a commercial service, they're typically built for retail use cases. These off-the-shelf solutions tend to provide limited programmatic control over alert conditions, weak timestamping accuracy, and little support for pre-market activity before the 9:30 AM open.

Our stock screener should:
- Efficiently handle the entire U.S. equities universe (~9,000 symbols).
- Achieve median sub-millisecond feed delay to NY4/5 (WAN-shaped).
- Take less than 5 seconds to fetch historical data, subscribe, and begin scanning.
- Monitor all U.S. stocks and ETFs for price movements that exceed a configurable threshold.
- Compare current prices against the previous day’s closing prices.
- Print alerts when significant moves are detected.
First, we’ll create a class to manage all state for the stock scanner.
We'll start by defining a few constants:
- PX_SCALE helps scale prices to decimal units. By default, Databento uses a fixed-integer representation to avoid precision loss, and each unit represents 1 nanodollar (10^-9 of a dollar).
- PX_NULL represents null prices. This occurs when one side of the book is empty, which happens more often than you might expect during pre-market hours or when scanning illiquid stocks and ETFs across the entire U.S. equities universe.
- PCT_MOVE_THRESHOLD sets an arbitrary threshold for triggering alerts. In this example, we'll write a simple alert that fires whenever a stock moves more than 3% in either direction.
Next, we’ll initialize three dictionaries for symbol lookup and state tracking:
- self.symbol_directory maps each numeric instrument ID to its ticker symbol. Instrument IDs can change daily in raw exchange feeds like Nasdaq Basic.
- self.last_day_lookup stores the previous day’s closing price for each ticker.
- self.is_signal_lit tracks whether an alert has already fired for a given ticker, so we only print it once when the threshold is first exceeded.
from datetime import datetime, timedelta
from typing import Any, Dict, Optional

import databento as db
import pandas as pd
import pytz


class PriceMovementScanner:
    """Scanner for detecting large price movements in all US equities."""

    # Constants
    PX_SCALE: float = 1e-9
    PX_NULL: int = 2**63 - 1
    PCT_MOVE_THRESHOLD: float = 0.03
    # Default session date: the current date in US/Eastern, as YYYY-MM-DD
    TODAY: str = datetime.now(pytz.timezone("US/Eastern")).strftime("%Y-%m-%d")

    def __init__(self, pct_threshold: Optional[float] = None, today: str = TODAY) -> None:
        """Initialize scanner with configurable threshold and date."""
        self.pct_threshold = pct_threshold or self.PCT_MOVE_THRESHOLD
        self.today = today
        self.today_midnight_ns = int(pd.Timestamp(today).timestamp() * 1e9)
        # Maps numeric instrument ID -> ticker symbol
        self.symbol_directory: Dict[int, str] = {}
        # Maps ticker symbol -> previous day's closing price
        self.last_day_lookup: Dict[str, float] = self.get_last_day_lookup()
        # Maps ticker symbol -> whether its alert has already fired
        self.is_signal_lit: Dict[str, bool] = {symbol: False for symbol in self.last_day_lookup}
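To make the fixed-integer convention concrete, here is a minimal, standalone sketch of how a raw price converts to dollars and how the null sentinel is handled. The helper name `to_dollars` is hypothetical and not part of the Databento client; the two constants mirror the ones defined in the class.

```python
from typing import Optional

PX_SCALE = 1e-9      # one raw unit is 1e-9 dollars
PX_NULL = 2**63 - 1  # sentinel meaning "no price"


def to_dollars(raw_px: int) -> Optional[float]:
    """Convert a raw fixed-point price to dollars, or None for the null sentinel."""
    if raw_px == PX_NULL:
        return None
    return raw_px * PX_SCALE


print(to_dollars(102_710_000_000))  # a raw price worth about 102.71 dollars
print(to_dollars(PX_NULL))          # None: that side of the book is empty
```

Checking for the sentinel before scaling matters: scaling it blindly would yield a price of roughly 9.2 billion dollars and trigger a spurious alert.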
We’ll now add a method to populate self.last_day_lookup. This method makes a data request to Databento’s historical client with the following parameters:
- dataset="EQUS.SUMMARY" uses the Databento US Equities Summary dataset, which provides consolidated end-of-day prices (OHLCVs) across all NMS exchanges and ATSs. This feed is designed to maximize CTA/UTP SIP coverage and offers official settlement prices and volumes suited for daily EOD use.
- schema="ohlcv-1d" fetches daily OHLCV bar aggregates.
- symbols="ALL_SYMBOLS" retrieves all ticker symbols available in the dataset.
    def get_last_day_lookup(self) -> Dict[str, float]:
        """Get yesterday's closing prices for all symbols."""
        client = db.Historical()

        now = pd.Timestamp(self.today).date()
        # Note: this is the previous calendar day, so it assumes that day was
        # a trading session (it returns no bars after a weekend or holiday)
        yesterday = (pd.Timestamp(self.today) - timedelta(days=1)).date()

        # Get yesterday's closing prices
        data = client.timeseries.get_range(
            dataset="EQUS.SUMMARY",
            schema="ohlcv-1d",
            symbols="ALL_SYMBOLS",
            start=yesterday,
            end=now,
        )

        # Request symbology: This is required for ALL_SYMBOLS requests
        # which don't automatically map instrument ID to raw ticker symbol
        symbology_json = data.request_symbology(client)
        data.insert_symbology_json(symbology_json, clear_existing=True)

        df = data.to_df()

        # TODO: Adjust for overnight splits here, e.g., using Databento corporate actions API
        return dict(zip(df["symbol"], df["close"]))
As noted in the inline comment, a more complete implementation would also use corporate actions data to adjust closing prices for overnight stock splits. This method can be easily extended to integrate other major events like dividends, as well as updated fundamentals like market cap, EPS, and P/E ratio.
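As a rough illustration of the split adjustment, the sketch below divides each previous close by a split ratio before it is used as the reference price. The `split_ratios` mapping and the `adjust_for_splits` helper are hypothetical, and the symbols and numbers are made up; in practice the ratios would come from a corporate actions source.

```python
from typing import Dict


def adjust_for_splits(
    closes: Dict[str, float],
    split_ratios: Dict[str, float],
) -> Dict[str, float]:
    """Return previous closes adjusted for splits that take effect today.

    split_ratios maps symbol -> new shares per old share
    (10.0 for a 10-for-1 split, 0.1 for a 1-for-10 reverse split).
    Symbols without an entry are left unchanged.
    """
    return {
        symbol: close / split_ratios.get(symbol, 1.0)
        for symbol, close in closes.items()
    }


closes = {"AAA": 1200.0, "BBB": 50.0}  # made-up previous closes
split_ratios = {"AAA": 10.0}           # AAA splits 10-for-1 overnight
adjusted = adjust_for_splits(closes, split_ratios)
print(adjusted)  # AAA's reference price becomes 120.0; BBB is unchanged
```

Without this step, a 10-for-1 split would look like a 90% overnight drop and fire a false alert at the first quote of the day.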
You may also want to begin scanning only after the order book has stabilized following the pre-market. This can be done by using trade prices instead of the mid-price, requiring the spread to narrow below a certain number of basis points, or waiting for a fixed time.
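One of these stabilization checks, the spread filter, could look roughly like the sketch below. The helper names and the 50 basis point default are arbitrary assumptions for illustration, not values recommended by Databento; prices are raw fixed-point integers as in the rest of this tutorial.

```python
PX_NULL = 2**63 - 1  # sentinel meaning "no price"


def spread_bps(bid_px: int, ask_px: int) -> float:
    """Quoted spread in basis points of the mid-price, from raw fixed-point prices."""
    mid = (bid_px + ask_px) / 2
    return (ask_px - bid_px) / mid * 10_000


def is_book_stable(bid_px: int, ask_px: int, max_spread_bps: float = 50.0) -> bool:
    """True when both sides are present, not crossed, and the spread is tight enough."""
    if bid_px == PX_NULL or ask_px == PX_NULL:
        return False
    if bid_px <= 0 or ask_px < bid_px:
        return False
    return spread_bps(bid_px, ask_px) <= max_spread_bps


# Made-up quotes: a tight 98.18 x 98.20 market and a wide 90.00 x 110.00 market
print(is_book_stable(98_180_000_000, 98_200_000_000))   # True
print(is_book_stable(90_000_000_000, 110_000_000_000))  # False
```

A scanner could call `is_book_stable` before computing the mid-price and simply skip the update when it returns False.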
With the previous day's closing prices stored, we can begin monitoring real-time price movements using Databento’s live API.
First, we build the symbol directory using symbol mapping messages, which map numeric instrument IDs to ticker symbols. These are sent automatically when you subscribe to a real-time feed.
Then, for each MBP-1 message (top-of-book update), we compute the mid-price and compare it against the previous day's closing price.
    def scan(self, event: Any) -> None:
        """
        Scan for large price movements in market data events.
        """
        if isinstance(event, db.SymbolMappingMsg):
            self.symbol_directory[event.hd.instrument_id] = event.stype_out_symbol
            return

        if not isinstance(event, db.MBP1Msg):
            return

        # Skip if event is from replay before today using `.subscribe(..., start=...)` parameter
        # if event.hd.ts_event < self.today_midnight_ns:
        #     return

        # Skip instruments with no symbol mapping or no previous close (e.g., new listings)
        symbol = self.symbol_directory.get(event.hd.instrument_id)
        last = self.last_day_lookup.get(symbol)
        if not last:
            return

        bid = event.levels[0].bid_px
        ask = event.levels[0].ask_px

        if bid == self.PX_NULL or ask == self.PX_NULL:
            # Handle when one side of book is empty
            return

        # Mid-price in dollars and absolute move relative to the previous close
        mid = (bid + ask) * self.PX_SCALE * 0.5
        abs_r = abs(mid - last) / last

        if abs_r > self.pct_threshold and not self.is_signal_lit[symbol]:
            ts = pd.Timestamp(event.hd.ts_event, unit='ns').tz_localize('UTC').tz_convert('US/Eastern')
            print(
                f"[{ts.isoformat()}] {symbol} moved by {abs_r * 100:.2f}% "
                f"(current: {mid:.4f}, previous: {last:.4f})"
            )
            # Latch the signal so the alert prints only once per symbol
            self.is_signal_lit[symbol] = True
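To see the arithmetic in isolation, here is a small standalone sketch of the mid-price comparison and the one-shot latch, without any Databento objects. The `check_move` helper is hypothetical, and the bid and ask are made up so that their mid-price matches the 98.19 in the sample NVDA alert at the top of this tutorial.

```python
from typing import Dict, Optional

PX_SCALE = 1e-9
PCT_MOVE_THRESHOLD = 0.03


def check_move(
    symbol: str,
    bid_px: int,
    ask_px: int,
    prev_close: float,
    is_signal_lit: Dict[str, bool],
) -> Optional[str]:
    """Return an alert string the first time a symbol exceeds the threshold."""
    mid = (bid_px + ask_px) * PX_SCALE * 0.5
    abs_r = abs(mid - prev_close) / prev_close
    if abs_r > PCT_MOVE_THRESHOLD and not is_signal_lit.get(symbol, False):
        is_signal_lit[symbol] = True  # latch so the alert fires only once
        return (
            f"{symbol} moved by {abs_r * 100:.2f}% "
            f"(current: {mid:.4f}, previous: {prev_close:.4f})"
        )
    return None


lit: Dict[str, bool] = {}
# Made-up 98.18 x 98.20 quote against a previous close of 102.71
first = check_move("NVDA", 98_180_000_000, 98_200_000_000, 102.71, lit)
second = check_move("NVDA", 98_180_000_000, 98_200_000_000, 102.71, lit)
print(first)   # NVDA moved by 4.40% (current: 98.1900, previous: 102.7100)
print(second)  # None: the signal is already lit
```

The second call returns nothing even though the move still exceeds the threshold, which is what keeps a volatile ticker from flooding the output on every quote update.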
Now, we just need to run the code. We’ll subscribe to real-time, L1 (top-of-book) data using the following parameters:
- dataset="EQUS.MINI" uses the Databento Equities Mini dataset, a cost-effective consolidated feed for real-time U.S. equities. You can modify this code to scan all options or futures by changing the dataset ID.
- schema="mbp-1" streams MBP-1 data, which provides every order book event that updates the best bid and offer (BBO). This includes every trade and changes to book depth, alongside total size and order count at BBO.
- symbols="ALL_SYMBOLS" retrieves all ticker symbols available in the dataset. You can also specify a watchlist like ["AMZN", "AAPL", "GOOG", "MSFT", "NVDA"]. By default, Databento uses Nasdaq symbology for U.S. stocks.
- start=0 replays past events starting from midnight UTC, then cuts over to real-time data. This is useful for debugging your application outside of market hours.
def main() -> None:
    scanner = PriceMovementScanner()
    live = db.Live()

    live.subscribe(
        dataset="EQUS.MINI",
        schema="mbp-1",
        symbols="ALL_SYMBOLS",
        start=0,
    )

    live.add_callback(scanner.scan)
    live.start()
    live.block_for_close()


if __name__ == "__main__":
    main()
Running the script prints alerts like these:
[2025-04-24T04:00:00.007704938-04:00] TSLA moved by 5.43% (current: 245.4300, previous: 259.5100)
[2025-04-24T04:00:00.008339140-04:00] TQQQ moved by 11.78% (current: 46.0000, previous: 52.1400)
[2025-04-24T04:00:00.009881258-04:00] DBC moved by 5.84% (current: 22.5650, previous: 21.3200)
[2025-04-24T04:00:00.011338853-04:00] DBA moved by 4.81% (current: 25.9400, previous: 27.2500)
[2025-04-24T04:00:00.019717308-04:00] WEAT moved by 3.35% (current: 4.4650, previous: 4.6200)
[2025-04-24T04:00:00.048734686-04:00] RSST moved by 4.37% (current: 19.5950, previous: 20.4900)
[2025-04-24T04:00:00.052540506-04:00] BCI moved by 17.21% (current: 24.3800, previous: 20.8000)
[2025-04-24T04:00:00.111035469-04:00] SOYB moved by 4.26% (current: 22.8100, previous: 21.8777)
[2025-04-24T04:00:00.158550097-04:00] FTGC moved by 3.49% (current: 25.6250, previous: 24.7600)
[2025-04-24T04:00:00.158736420-04:00] CMDY moved by 3.03% (current: 51.7200, previous: 50.2000)
Our screener picked up the pre-market move on Tesla, Inc. (TSLA) just 7.7 milliseconds after 4:00 AM EDT—well before the first Bloomberg News article was published at 5:30 AM.
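When debugging with the start=0 replay, events from before the session you care about can arrive ahead of live data. The commented-out check in `scan` hints at filtering them; a minimal standard-library sketch of that cutoff follows. The helper names are hypothetical, and using midnight UTC of the session date as the cutoff is an assumption you may want to replace with, say, 4:00 AM Eastern.

```python
from datetime import datetime, timezone


def midnight_utc_ns(date_str: str) -> int:
    """Nanoseconds since the UNIX epoch at 00:00 UTC on the given YYYY-MM-DD date."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # Convert whole seconds to nanoseconds with integer math to avoid float rounding
    return int(dt.timestamp()) * 1_000_000_000


def is_stale(ts_event_ns: int, today: str) -> bool:
    """True if the event happened before midnight UTC of the session date."""
    return ts_event_ns < midnight_utc_ns(today)


cutoff = midnight_utc_ns("2025-04-24")
print(cutoff)
print(is_stale(cutoff - 1, "2025-04-24"))  # True: one nanosecond before the cutoff
print(is_stale(cutoff + 1, "2025-04-24"))  # False: after the cutoff
```

Inside the scanner, the equivalent would be an early return whenever the event timestamp is below the precomputed cutoff.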


You can get the full script for this example on GitHub here.
If you’re interested in scanning stocks using only historical data, check out our documentation for an example on finding the top pre-market movers.
You can also learn more about Databento’s U.S. equities coverage, which spans over 20,000 stocks and ETFs across all U.S. exchanges and ATSs. We’re an official distributor for all major proprietary feeds, including Nasdaq TotalView, NYSE Integrated, NYSE Arca Integrated, and more.