Programmatic batch downloads

Overview

In this example, we use the Historical client to submit a batch request and download the resulting files programmatically. We request five days of trade data, wait for the request to finish processing, then download the files locally and load them into a DBNStore object for analysis.

Example

import operator
import pathlib
import time
import databento as db

# First, create a historical client
client = db.Historical("$YOUR_API_KEY")

# Next, we will submit a batch job
new_job = client.batch.submit_job(
    dataset="GLBX.MDP3",
    start="2022-12-12T00:00:00",
    end="2022-12-17T00:00:00",
    symbols="OZN.OPT",
    schema="trades",
    split_duration="day",
    stype_in="parent",
)

# Retrieve the new job ID
new_job_id: str = new_job["id"]

# Now, we have to wait for our batch job to complete
while True:
    done_jobs = list(map(operator.itemgetter("id"), client.batch.list_jobs("done")))
    if new_job_id in done_jobs:
        break  # Our batch job is done, exit the loop
    time.sleep(1.0)

# Once complete, we will download the files
downloaded_files = client.batch.download(
    job_id=new_job_id,
    output_dir=pathlib.Path.cwd(),
)

# Finally, we can load the data into a DBNStore for analysis
for file in sorted(downloaded_files):
    if file.name.endswith(".dbn.zst"):
        data = db.DBNStore.from_file(file)

        # Convert the data to a pandas.DataFrame
        df = data.to_df()
        print(f"{file.name} contains {len(df):,d} records")
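The polling loop above waits indefinitely, checking once per second. For a long-running script it can be worth bounding the total wait and backing off between checks. A minimal sketch of a generic poll-with-timeout helper (`wait_until` is a hypothetical name, not part of the databento API):

```python
import time

def wait_until(predicate, timeout=600.0, interval=1.0, backoff=2.0, max_interval=30.0):
    """Poll predicate() until it returns True or the timeout elapses,
    with exponential backoff between checks capped at max_interval.
    Returns True if the predicate succeeded, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        # Never sleep past the deadline
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))
        interval = min(interval * backoff, max_interval)
    return False
```

The batch-job check above could then be expressed as `wait_until(lambda: new_job_id in done_job_ids())`, where `done_job_ids` is a small wrapper around the `client.batch.list_jobs("done")` call.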

Result

glbx-mdp3-20221212.trades.dbn.zst contains 1,662 records
glbx-mdp3-20221213.trades.dbn.zst contains 3,511 records
glbx-mdp3-20221214.trades.dbn.zst contains 2,544 records
glbx-mdp3-20221215.trades.dbn.zst contains 2,460 records
glbx-mdp3-20221216.trades.dbn.zst contains 2,191 records
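Since each daily file loads into its own DataFrame, the per-day frames can also be concatenated into a single frame covering the full date range. A minimal sketch using pandas.concat (the two small frames here are synthetic stand-ins for the data.to_df() output above):

```python
import pandas as pd

# Synthetic stand-ins for per-day DataFrames produced by data.to_df()
day1 = pd.DataFrame(
    {"price": [101.50, 101.75]},
    index=pd.to_datetime(["2022-12-12 09:30", "2022-12-12 09:31"]),
)
day2 = pd.DataFrame(
    {"price": [102.00]},
    index=pd.to_datetime(["2022-12-13 09:30"]),
)

# Concatenate the daily frames and keep the timestamp index sorted
combined = pd.concat([day1, day2]).sort_index()
print(f"combined contains {len(combined):,d} records")
```

In the example above, this would amount to appending each df to a list inside the loop and concatenating once after the loop finishes.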