@databento-bot
Last active August 18, 2025 01:11
Filter for a specific symbol in Databento CSV data: pandas vs. polars vs. bash
# Side-by-side comparison of pandas vs. polars syntax for reading CSV and filtering by symbol
import pandas as pd
import polars as pl
CSVFILE = "glbx-mdp3-20250716-20250815.ohlcv-1m.csv.zst"
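
# Using pandas
# Compression is inferred from the .zst extension (requires the `zstandard` package)
df = pd.read_csv(CSVFILE)
df_filtered = df[df["symbol"] == "NQU5"]
# index=False keeps the pandas row index out of the output file
df_filtered.to_csv("out_pandas.csv", index=False)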

# Using polars
# polars also reads the zstd-compressed file directly
df = pl.read_csv(CSVFILE)
df_filtered = df.filter(pl.col("symbol") == "NQU5")
df_filtered.write_csv("out_polars.csv")
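
Both snippets above load the whole file into memory before filtering. If the file is too large for that, one option is to let pandas read it in chunks and append the matching rows as it goes. This is a minimal sketch, not part of the original gist: the function name `filter_symbol_chunked` and the default `chunksize` are assumptions chosen for illustration.

```python
import pandas as pd


def filter_symbol_chunked(src, dst, symbol, chunksize=1_000_000):
    """Write rows of `src` whose `symbol` column equals `symbol` to `dst`.

    Hypothetical helper: reads `src` in chunks of `chunksize` rows so the
    whole file never has to fit in memory. Returns the number of rows written.
    """
    written = 0
    first = True
    for chunk in pd.read_csv(src, chunksize=chunksize):
        matched = chunk[chunk["symbol"] == symbol]
        # The first chunk creates the file and writes the header (even if it
        # has no matching rows); later chunks append rows without a header.
        matched.to_csv(dst, mode="w" if first else "a", header=first, index=False)
        first = False
        written += len(matched)
    return written
```

Called as `filter_symbol_chunked(CSVFILE, "out_pandas_chunked.csv", "NQU5")`, this should produce the same rows as the pandas snippet above, trading some speed for a bounded memory footprint.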
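#!/bin/bash
# On the header row (NR==1), find the index of the "symbol" column and print the header; afterwards print only rows where that column is NQU5
zstdcat glbx-mdp3-20250716-20250815.ohlcv-1m.csv.zst | awk -F, 'NR==1 {for (i=1; i<=NF; i++) if ($i=="symbol") col=i; print; next} $col=="NQU5"' > out_awk.csv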
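
For comparison, the same streaming logic as the awk one-liner can be sketched with only the Python standard library: read the header, locate the `symbol` column, then pass through matching rows one at a time. The function name `filter_rows_by_symbol` and the sample rows are made up for illustration and are not from the original gist. Unlike `awk -F,`, the `csv` module also handles quoted fields that contain commas.

```python
import csv
import io


def filter_rows_by_symbol(lines, symbol, out):
    """Copy the header and every row whose `symbol` field matches to `out`.

    `lines` is any iterable of text lines (an open file, sys.stdin, ...).
    Returns the number of data rows kept.
    """
    reader = csv.reader(lines)
    writer = csv.writer(out, lineterminator="\n")
    header = next(reader)
    col = header.index("symbol")  # raises ValueError if the column is absent
    writer.writerow(header)
    kept = 0
    for row in reader:
        if row[col] == symbol:
            writer.writerow(row)
            kept += 1
    return kept


# Tiny in-memory sample (made-up rows) standing in for the real file.
sample = io.StringIO(
    "ts_event,symbol,close\n1,NQU5,100.5\n2,ESU5,200.0\n3,NQU5,101.0\n"
)
result = io.StringIO()
kept = filter_rows_by_symbol(sample, "NQU5", result)
```

To run this against the compressed file, one approach is to keep `zstdcat` at the front of the pipeline and pass `sys.stdin` and `sys.stdout` to the function in place of the `StringIO` objects.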