Skip to content

Instantly share code, notes, and snippets.

@thanos
Last active March 15, 2026 16:04
Show Gist options
  • Select an option

  • Save thanos/82dc347a48b21d0ece9ae0aadcfed768 to your computer and use it in GitHub Desktop.

Select an option

Save thanos/82dc347a48b21d0ece9ae0aadcfed768 to your computer and use it in GitHub Desktop.

Frequency sketches (also called "Count-Min" or "Heavy Hitters" sketches) are used when you need to know how often a specific item has appeared without storing every single occurrence. They are the go-to for identifying "Top-K" items or finding popular hashtags/IP addresses in real-time.

Frequency Data Sketches (Point Queries & Top-K)

Algorithm Memory CPU / Op Accuracy Notes
Count-Min Sketch $O(\frac{1}{\epsilon} \log \frac{1}{\delta})$ Very Fast Probabilistic (Overestimates) The standard for frequency; easy to implement and merge.
Count-Sketch Higher than Count-Min Fast Unbiased (Lower Variance) Uses $\pm 1$ hashing; better error distribution but slightly more CPU.
Space-Saving $O(\frac{1}{\epsilon})$ Medium High for "Heavy Hitters" Deterministic; maintains a "Stream Summary" of top elements.
Misra-Gries $O(K)$ Fast High for Top-$K$ Classic algorithm; finds elements with frequency $> N/(K+1)$.
HeavyKeeper Low Fast Exceptional for Top-$K$ Uses "decay" strategy to evict small items; state-of-the-art for Zipfian data.
SketchLearn Variable Slow High (ML-based) Uses automated modeling to correct sketch bias; higher overhead.

Selection Guide

  • Use Count-Min Sketch for basic frequency estimation where overcounting is acceptable (e.g., rate limiting).
  • Use HeavyKeeper if your goal is strictly finding the Top-K most frequent items with the highest precision.
  • Use Misra-Gries if you need a simple, deterministic way to ensure you don't miss any items above a certain frequency threshold.

Would you like me to explain the "decay" mechanism in HeavyKeeper and why it’s better for skewed data?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment