Streaming vs. batch download for historical data: which is right for you?

May 11, 2023
When sourcing historical market data, you may encounter the option to stream or batch download (both of which are available via the Databento portal). Both methods provide the same data, but they differ in speed, cost, and flexibility, and those differences may matter for your use case. Keep reading to learn about each method and determine which is best for you.

Streamed data loads directly into your application, typically via an API or client library. Data is provided immediately, and customers are typically charged every time data is streamed. Customizations are limited and there may be size restrictions.

With batch downloads, customers can download files directly. Databento's download center supports transfer over HTTP, rsync, and FTP. Batch downloads are more customizable and can handle larger data requests, typically those over 5 GB. Downloads are not instantaneous, however, and it may take several minutes before the data is available to download.

Now that you know a bit about each method, here are some considerations to help you choose which is best for you and how to use each with Databento.

With streaming, you can access data immediately when it is needed in your application. Streaming is suitable for smaller, on-demand workflows such as retrieving reference data for production trading, initializing a user's price chart for a given symbol on a display application, or exploring data in an interactive environment.

Since streaming is designed for one-off tasks, you are charged for every stream request, including duplicate requests for data you have already streamed. If you intend to access the same data repeatedly (for instance in parallel simulations, multiple backtests, or daily ETL pipelines), you can do so more efficiently with a batch download of the data onto your system.
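To make the cost difference concrete, here is a minimal sketch of the arithmetic. The per-GB price, the data size, and the function names are hypothetical placeholders rather than Databento's actual pricing. The sketch also assumes, for simplicity, that both methods are billed at the same per-GB rate, so the only difference is how many times you pay.

```python
def streaming_cost(size_gb: float, price_per_gb: float, runs: int) -> float:
    """Cost when the same data is streamed once per run (billed every time)."""
    return size_gb * price_per_gb * runs


def batch_cost(size_gb: float, price_per_gb: float) -> float:
    """Cost when the data is downloaded once and reused from local storage."""
    return size_gb * price_per_gb


# Hypothetical figures: 2 GB of data, $1.50 per GB, 20 backtests.
size_gb, price_per_gb, runs = 2.0, 1.50, 20
print(streaming_cost(size_gb, price_per_gb, runs))  # 60.0
print(batch_cost(size_gb, price_per_gb))            # 3.0
```

With a single run the two costs are identical under these assumptions; the gap grows linearly with the number of times the same data is requested.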

Batch download lets you fetch the same data multiple times without additional charge. This also makes it more suitable for larger data requests, where there is a risk of disconnection while your data is being transmitted: an interrupted download can be retried at no extra cost.

Our streaming infrastructure is optimized for instant retrieval of small data requests in predetermined schemas. When you request a batch download, our system needs time to prepare the data files. This preparation time lets us service additional customizations that are not possible on our streaming infrastructure.
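The considerations above can be condensed into a rough rule of thumb. The sketch below is one possible way to encode it, not an official recommendation: the `Request` type and `choose_method` function are hypothetical, and the 5 GB figure is only the "typically over 5 GB" guideline mentioned earlier, not a hard limit.

```python
from dataclasses import dataclass

# Guideline from the article: batch suits requests "typically over 5 GB".
BATCH_SIZE_HINT_GB = 5.0


@dataclass
class Request:
    size_gb: float             # estimated size of the data requested
    times_needed: int          # how many times the same data will be read
    needs_customization: bool  # needs options beyond the predetermined schemas


def choose_method(req: Request) -> str:
    """Return "batch" or "streaming" using the trade-offs described above."""
    if req.needs_customization:
        return "batch"      # customizations are limited when streaming
    if req.size_gb > BATCH_SIZE_HINT_GB:
        return "batch"      # large transfers risk disconnection
    if req.times_needed > 1:
        return "batch"      # repeated streams are charged each time
    return "streaming"      # small, one-off, needed immediately


print(choose_method(Request(size_gb=0.1, times_needed=1, needs_customization=False)))
print(choose_method(Request(size_gb=12.0, times_needed=1, needs_customization=False)))
```

The first call prints `streaming` and the second prints `batch`. Real decisions may weigh other factors, such as whether you can tolerate waiting several minutes for files to be prepared.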

A batch download request can be submitted through the Batch download option on our portal, through our HTTP API, or with any of our official client libraries.
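As an illustration of the client-library route, the following sketch shows how a batch request might be submitted from Python. It assumes the official `databento` package and its `Historical` client with a `batch.submit_job` method that returns a job description containing an `"id"` field; check the client library reference for the exact parameters in your installed version. The helper names `build_batch_request` and `submit_batch_job`, and the dataset, symbol, and dates in the usage comment, are illustrative placeholders only.

```python
from typing import Any


def build_batch_request(dataset: str, symbols: list[str], schema: str,
                        start: str, end: str) -> dict[str, Any]:
    """Collect the parameters that describe one batch download request."""
    return {
        "dataset": dataset,
        "symbols": symbols,
        "schema": schema,
        "start": start,
        "end": end,
    }


def submit_batch_job(client: Any, request: dict[str, Any]) -> str:
    """Submit the request through the client and return the job ID.

    Assumes `client.batch.submit_job(**request)` returns a mapping
    with an "id" key, as the databento Historical client is expected to.
    """
    job = client.batch.submit_job(**request)
    return job["id"]


# Usage (assumes `pip install databento` and a valid API key):
#   import databento as db
#   client = db.Historical("YOUR_API_KEY")
#   request = build_batch_request(
#       "GLBX.MDP3", ["ESM3"], "trades", "2023-05-01", "2023-05-02"
#   )
#   job_id = submit_batch_job(client, request)
#   # After the job has finished preparing, the files can be fetched with
#   # client.batch.download(job_id=job_id, output_dir="data/")
```

Because files take time to prepare, a real workflow would wait for the job to complete before attempting the download step shown in the final comment.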

