@ties
Created February 9, 2026 19:57
Gemini Pro 3 and prefix data structures

Here is a combined overview of IPv4 and IPv6 lookup data structures, focusing on Fixed-Stride, Level-Compressed (LC), and Multi-Bit Tries.

This guide prioritizes L3 cache efficiency and Memory Mapping (mmap) capabilities in Rust.


The Fundamental Difference: Density vs. Sparsity

To choose the right structure, you must understand the discrepancy between IPv4 and IPv6:

  1. IPv4 (32-bit): The address space is small ($2^{32}$) and dense. You can afford "wasteful" pre-allocation (like a 16MB array) to gain raw speed.
  2. IPv6 (128-bit): The address space is massive ($2^{128}$) and sparse. You cannot pre-allocate large fixed strides without compression. You must use a structure that compresses empty branches (via Bitmaps or Level Compression) to fit in the CPU cache.

Strategy A: The "Speed King" (Fixed-Stride / DIR-24-8)

Best for: IPv4 (Production Routing)

This is the standard for high-performance software routers (DPDK, VPP). It minimizes memory accesses (usually 1 or 2) at the cost of memory usage.

  • Mechanism:
    • Table 1 (DIR-24): A flat array of $2^{24}$ (16 million) entries covering the first 24 bits of the IP.
    • Table 2 (DIR-8): Tiny secondary tables for the remaining 8 bits.
  • Cache: The main table (16-32MB) fits in L3 cache on modern server CPUs (which often have >40MB L3).
  • IPv6 Suitability: Impractical. The initial 24-bit stride itself is fine, but covering the remaining 104 bits of a 128-bit address would require millions of subsequent tables.
  • Rust Implementation:
    • dpdk-rs wrappers (binding to C) are common.
    • Manual Vec: In Rust, this is trivial to implement as a Vec<u16> (holding indices) for the 24-bit table.

Strategy B: The "Hardware Logic" (Tree-Bitmap)

Best for: IPv4 & IPv6 (Balanced)

This algorithm mimics hardware routers. It is a Fixed-Stride Trie but with Bitmap Compression. It is the most robust all-rounder for Rust implementations.

  • Mechanism:
    • It uses a fixed stride (e.g., 4 bits per node).
    • Instead of an array of 16 pointers (which bloats memory), it uses a 16-bit bitmap.
    • Children are stored contiguously in a single Vec.
  • L3 Cache: Excellent. Nodes are tiny (bitmap + index), so many fit in a cache line.
  • Mmap: High. Because it uses a single backing Vec (arena style), it can be serialized directly to disk and mapped back without pointer fixups.
  • Rust Recommendation:
    • Crate: ip_network_table
    • Details: Uses the Tree-Bitmap algorithm. It handles both IPv4 and IPv6 elegantly by changing the stride logic internally.
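The bitmap-plus-contiguous-children mechanism can be sketched in a few lines of Rust. This is a minimal illustration with made-up names, not the actual ip_network_table internals (real Tree-Bitmap nodes also carry a second bitmap for prefixes that terminate inside the node):

```rust
// Minimal sketch of a Tree-Bitmap style node with a 4-bit stride.
// Field names are illustrative, not taken from any specific crate.

struct TbNode {
    // Bit i set => a child exists for 4-bit value i.
    external: u16,
    // Prefixes ending *inside* this node (unused in this sketch).
    #[allow(dead_code)]
    internal: u16,
    // All children live contiguously in one shared Vec<TbNode> (arena),
    // starting at this index -- no per-child heap pointers.
    child_base: u32,
}

/// Returns the arena index of the child for 4-bit `nibble`, if present.
fn child_slot(node: &TbNode, nibble: u8) -> Option<usize> {
    let bit = 1u16 << nibble;
    if node.external & bit == 0 {
        return None;
    }
    // Children are packed densely: the offset is the number of existing
    // children with a smaller nibble (a popcount over the low bits).
    let offset = (node.external & (bit - 1)).count_ones() as usize;
    Some(node.child_base as usize + offset)
}

fn main() {
    // Children exist for nibbles 1, 4 and 9, stored at arena slots 100..103.
    let n = TbNode {
        external: (1 << 1) | (1 << 4) | (1 << 9),
        internal: 0,
        child_base: 100,
    };
    assert_eq!(child_slot(&n, 1), Some(100));
    assert_eq!(child_slot(&n, 9), Some(102));
    assert_eq!(child_slot(&n, 2), None);
    println!("tree-bitmap child indexing ok");
}
```

Because a node is just two small integers and an index, many nodes share a cache line, and the single backing Vec is what makes serialization straightforward.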

Strategy C: The "Modern LC-Trie" (Poptrie)

Best for: IPv6 (Extreme Read Speed)

Poptrie (Population Count Trie) is the evolution of the LC-Trie. It uses the CPU's popcnt instruction to compress variable strides into dense arrays.

  • Mechanism:
    • It analyzes the tree and creates nodes with large strides (e.g., 6 bits = 64 children).
    • It uses a 64-bit bitmap to mark which children exist.
    • The index is calculated: index = popcnt(bitmap & mask).
  • L3 Cache: Supreme. It achieves the highest density of any structure here. It fits massive routing tables (Full Internet BGP) into L2/L3 cache.
  • Mmap: Native. Poptrie is designed to be built into a contiguous array of u64s.
  • Rust Recommendation:
    • Crate: oxidecomputer/poptrie
    • Details: Developed for the Helios emulator. It is read-optimized and extremely compact.
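The index calculation above can be demonstrated directly with Rust's `count_ones`, which lowers to a single popcnt instruction on x86-64. Names here are illustrative, not the oxidecomputer/poptrie API:

```rust
// Sketch of Poptrie-style child indexing: 6-bit stride, 64-bit bitmap.
// Illustrative only; the real oxidecomputer/poptrie layout differs in detail.

/// Given a node's 64-bit occupancy bitmap and a 6-bit chunk of the key,
/// return the position of the child in the node's dense child array.
fn poptrie_index(bitmap: u64, chunk: u8) -> Option<usize> {
    let bit = 1u64 << chunk;
    if bitmap & bit == 0 {
        return None; // no child for this 6-bit value
    }
    // Rank query: count set bits strictly below `chunk`.
    // `count_ones` compiles to a single popcnt instruction.
    Some((bitmap & (bit - 1)).count_ones() as usize)
}

fn main() {
    // Only 3 of 64 possible children exist: chunks 5, 20 and 63.
    let bitmap = (1u64 << 5) | (1 << 20) | (1 << 63);
    assert_eq!(poptrie_index(bitmap, 5), Some(0));
    assert_eq!(poptrie_index(bitmap, 20), Some(1));
    assert_eq!(poptrie_index(bitmap, 63), Some(2));
    assert_eq!(poptrie_index(bitmap, 0), None);
    println!("poptrie indexing ok");
}
```

This is why a node with 3 children out of 64 costs a 64-bit bitmap plus 3 array slots, instead of 64 pointers.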

Combined Comparison Matrix

| Feature | DIR-24-8 | Tree-Bitmap | Poptrie / LC-Trie | Standard Radix |
| --- | --- | --- | --- | --- |
| Primary Use | IPv4 High-Speed | IPv4/IPv6 General | IPv6 High-Speed | Generic Maps |
| Lookup Speed | 🚀 Fastest (1-2 accesses) | ⚡ Fast (unrolled loops) | ⚡ Fast (popcnt math) | 🐒 Slower (pointer chasing) |
| Memory Footprint | 🐘 High (pre-allocated) | 📦 Compact | 🐜 Tiny | 🎈 Medium (pointers) |
| IPv6 Capable? | ❌ No | ✅ Yes | ✅ Yes | ✅ Yes |
| L3 Cache Fit | ⚠️ Borderline (IPv4 only) | ✅ Good | ✅ Excellent | ❌ Poor |
| Mmap Friendly? | ✅ Yes (flat arrays) | ✅ Yes (contiguous Vec) | ✅ Yes (bit arrays) | ❌ No (heap pointers) |

Top Rust Recommendations

1. The "Standard" Choice: ip_network_table

If you need a crate that "just works" for both IPv4 and IPv6, supports Longest Prefix Match (LPM), and is memory efficient.

2. The "Performance" Choice: poptrie

If you are building a read-heavy system (like a firewall or packet filter) where you build the table once and query it billions of times.

3. The "Zero-Copy / Static" Choice: fst

If your data is static (e.g., a GeoIP database) and you want to mmap it from disk with zero parsing overhead.

  • Algo: Finite State Transducer (effectively a highly compressed, shared-suffix trie).
  • Note: You map IP bytes to values. It is immutable.
  • Link: crates.io/crates/fst

4. The "Manual" Approach (for IPv4 DIR-24-8)

If you specifically need the raw speed of DIR-24-8 for IPv4, there isn't a dominant general-purpose "pure Rust" crate, because the logic is so simple that most engines implement it manually.

  • Implementation Sketch:
    // A simple DIR-24-8 structure.
    // An entry is either empty, a terminal next hop,
    // or an index into `small_tables`.
    enum Entry<T> {
        Empty,
        NextHop(T),
        SubTable(u32),
    }

    struct Ipv4Table<T> {
        // Covers the first 24 bits (2^24 entries).
        main_table: Vec<Entry<T>>,

        // Covers the last 8 bits.
        // Allocated only when a prefix longer than /24 exists.
        small_tables: Vec<[Entry<T>; 256]>,
    }
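A lookup over this layout might look like the following. This is a self-contained sketch (it repeats the struct, with an assumed Entry encoding, so it compiles on its own); production tables instead pack each entry into a single 2-4 byte word to halve the footprint:

```rust
// Self-contained DIR-24-8 sketch. The Entry enum is an assumption made
// for illustration; real tables pack next-hop vs. sub-table-index into
// one small integer word.

#[derive(Clone, Copy, PartialEq, Debug)]
enum Entry<T: Copy> {
    Empty,
    NextHop(T),
    SubTable(u32), // index into `small_tables`
}

struct Ipv4Table<T: Copy> {
    main_table: Vec<Entry<T>>,          // 2^24 entries, indexed by the top 24 bits
    small_tables: Vec<[Entry<T>; 256]>, // one 256-entry table per expanded /24
}

impl<T: Copy> Ipv4Table<T> {
    fn new() -> Self {
        Self {
            main_table: vec![Entry::Empty; 1 << 24],
            small_tables: Vec::new(),
        }
    }

    /// One memory access for routes of length <= 24, two otherwise.
    fn lookup(&self, ip: u32) -> Option<T> {
        match self.main_table[(ip >> 8) as usize] {
            Entry::Empty => None,
            Entry::NextHop(t) => Some(t),
            Entry::SubTable(i) => match self.small_tables[i as usize][(ip & 0xFF) as usize] {
                Entry::NextHop(t) => Some(t),
                _ => None,
            },
        }
    }
}

fn main() {
    let mut table = Ipv4Table::<u16>::new();
    // Install 10.0.0.0/24 -> next hop 7 by filling one main-table slot.
    table.main_table[10 << 16] = Entry::NextHop(7);
    assert_eq!(table.lookup(0x0A00_0001), Some(7)); // 10.0.0.1
    assert_eq!(table.lookup(0x0A00_0101), None);    // 10.0.1.1
    println!("dir-24-8 lookup ok");
}
```

Note that inserting a /16 route means writing 256 main-table slots; the write amplification is the price of the 1-2 access lookup.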

Since you are open to dynamic stride (and assuming you want high performance for IPv6 with cache efficiency), the landscape opens up significantly.

For IPv6, a "Dynamic Stride" or "Adaptive" trie is actually superior to a Fixed-Stride trie. IPv6 addresses are 128 bits long but usually extremely sparse. A fixed stride forces you to allocate memory for empty space, whereas a dynamic stride (like an LC-Trie or ART) collapses the empty bits, allowing the structure to fit into the L3 cache.

Here are the best Rust implementations for Dynamic/Adaptive Stride Tries suitable for IPv6, specifically looking at cache locality and memory layout.

1. Poptrie (The "Modern" LC-Trie)

Algorithm: Population Count Trie
Stride: Dynamic/Compressed (e.g., a 6-bit stride compressed into dense arrays).

This is widely considered the successor to the LC-Trie for routing. Instead of a standard node with 64 pointers (for a 6-bit stride), it uses a 64-bit bitmap and the CPU's popcnt instruction.

  • How it works: If a node has only 3 valid children out of 64 possibilities, the bitmap marks their positions. The "index" in the child array is calculated by counting the bits set to 1 before the target bit.
  • Cache & Mmap: It stores nodes in a contiguous array (perfect for mmap and L3 cache). It eliminates the "pointer chasing" overhead.
  • Rust Implementation: oxidecomputer/poptrie
    • Context: Used by Oxide Computer (Bryan Cantrill et al.) for their high-performance hypervisor/emulator networking. It is designed precisely for the use case you described.

2. Adaptive Radix Tree (ART)

Algorithm: ART (Adaptive Radix Tree)
Stride: Adaptive (nodes change size: Node4, Node16, Node48, Node256).

ART is the standard-bearer for modern in-memory databases (like HyPer). It treats the IPv6 address as a sequence of bytes (8-bit stride), but the node itself changes layout dynamically based on how full it is.

  • How it works:
    • If a node has 2 children, it uses a tiny "Node4" (4 keys + 4 pointers).
    • If it grows to 20 children, it hot-swaps to a "Node48".
  • IPv6 Fit: Excellent. IPv6 hierarchies are often deep but sparse. ART collapses the path significantly compared to a binary trie.
  • Cache: Very good. By keeping nodes small, it maximizes the amount of the tree that stays in the CPU cache.
  • Rust Implementation: blart
    • Why this one: There are several ART crates, but blart is currently the most actively maintained and feature-complete Rust implementation.
    • Alternative: art-tree is another solid option.
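The node-size adaptation can be sketched as an enum over layouts. This is heavily simplified (only Node4 and Node256, no path-compression header, no Node16/Node48) and does not reflect blart's actual internals:

```rust
// Simplified ART node shapes for an 8-bit stride. Real implementations
// (blart, art-tree) add path compression and the Node16/Node48 layouts;
// this sketch shows only the smallest and largest shapes.

enum ArtNode<V> {
    // Up to 4 children: a linear scan over a tiny key array, which fits
    // comfortably in one cache line.
    Node4 {
        keys: [u8; 4],
        children: [Option<Box<ArtNode<V>>>; 4],
        len: u8,
    },
    // Fully expanded: direct indexing by key byte, no scan at all.
    Node256 {
        children: Box<[Option<Box<ArtNode<V>>>; 256]>,
    },
    Leaf(V),
}

fn find_child<'a, V>(node: &'a ArtNode<V>, key_byte: u8) -> Option<&'a ArtNode<V>> {
    match node {
        ArtNode::Node4 { keys, children, len } => {
            // Small node: linear search beats anything fancier at this size.
            (0..*len as usize)
                .find(|&i| keys[i] == key_byte)
                .and_then(|i| children[i].as_deref())
        }
        ArtNode::Node256 { children } => children[key_byte as usize].as_deref(),
        ArtNode::Leaf(_) => None,
    }
}

fn main() {
    let node = ArtNode::Node4 {
        keys: [7, 0, 0, 0],
        children: [Some(Box::new(ArtNode::Leaf(42u32))), None, None, None],
        len: 1,
    };
    match find_child(&node, 7) {
        Some(ArtNode::Leaf(v)) => assert_eq!(*v, 42),
        _ => panic!("expected leaf"),
    }
    assert!(find_child(&node, 8).is_none());
    println!("art node4 lookup ok");
}
```

When a Node4 overflows, an insert routine would copy its entries into the next larger shape; the per-node Box pointers here are also why a plain ART is not mmap-friendly without an arena rewrite.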

3. Tree-Bitmap (Dynamic/Hybrid)

Algorithm: Tree-Bitmap (stride usually 4 or 8, but compressed)
Stride: Fixed lookup logic, dynamic storage.

I mentioned this in the previous answer, but it remains relevant here. It sits between a pure LC-Trie and a Fixed-Stride trie.

  • How it works: It grabs a chunk of bits (stride), but unlike a naive array, it uses a bitmap to indicate which next-hops exist. It packs the children into a single contiguous vector.
  • Mmap: Because it is usually implemented as a single struct wrapping Vec<Node>, it is arguably the easiest to serialize directly to disk and mmap back into memory.
  • Rust Implementation: ip_network_table

Summary Comparison for IPv6

| Algorithm | Implementation | Cache Strategy | Best For... |
| --- | --- | --- | --- |
| Poptrie | oxidecomputer/poptrie | Bit-compressed arrays. Extreme density. Fits in L3. | Raw read speed: build once, read 10 million times/sec. |
| ART | blart | Adaptive node sizes. Optimizes for density at every level. | General purpose: frequent route insert/delete while maintaining speed. |
| TreeBitmap | ip_network_table | Contiguous vectors. Hardware-like stride logic. | Simplicity & mmap: a standard routing table structure that is easily persisted. |

Technical Note on "Memory Mapping" in Rust

If your goal is to mmap a pre-built trie from disk so it loads instantly (zero-copy), you must avoid standard Rust pointer types (Box, Rc, Arc) because they point to heap addresses that won't be valid when re-loaded.

  1. For ip_network_table: You can use libraries like serde with bincode to serialize the internal vectors, but true "zero-copy" mmap requires a library like rkyv.
  2. For Zero-Copy Trie: If you need absolute raw performance and zero-copy loading, look at fst. It is a Finite State Transducer (effectively a compressed trie). It maps byte-keys (IPv6 bytes) to values. It is immutable and designed specifically to be memory-mapped. It is often smaller than the raw data itself, guaranteeing L3 cache fit.
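The pointer problem can be shown concretely: nodes that reference each other by index into one flat Vec are position-independent, so their raw bytes survive a save/reload cycle, which is exactly what an mmap-backed table needs. The sketch below simulates the disk round trip in memory with assumed types; a real implementation would also have to pin down the on-disk layout (e.g. #[repr(C)] or explicit field encoding, and a fixed endianness, as done here):

```rust
// Sketch: index-based (arena) nodes are position-independent, so their
// serialized bytes can be persisted and mapped back. Box/Rc/Arc pointers
// would encode heap addresses that mean nothing in the next process.

#[derive(Clone, Copy, Debug, PartialEq)]
struct Node {
    bitmap: u16,
    child_base: u32, // index into the same Vec<Node>, NOT a pointer
    value: u32,
}

// Explicit little-endian field encoding: 2 + 4 + 4 = 10 bytes per node,
// independent of struct padding and host byte order.
fn encode(nodes: &[Node]) -> Vec<u8> {
    let mut out = Vec::with_capacity(nodes.len() * 10);
    for n in nodes {
        out.extend_from_slice(&n.bitmap.to_le_bytes());
        out.extend_from_slice(&n.child_base.to_le_bytes());
        out.extend_from_slice(&n.value.to_le_bytes());
    }
    out
}

fn decode(bytes: &[u8]) -> Vec<Node> {
    bytes
        .chunks_exact(10)
        .map(|c| Node {
            bitmap: u16::from_le_bytes([c[0], c[1]]),
            child_base: u32::from_le_bytes([c[2], c[3], c[4], c[5]]),
            value: u32::from_le_bytes([c[6], c[7], c[8], c[9]]),
        })
        .collect()
}

fn main() {
    let arena = vec![
        Node { bitmap: 0b10, child_base: 1, value: 0 },
        Node { bitmap: 0, child_base: 0, value: 99 },
    ];

    // Simulate writing to disk and mmap-ing back in another process.
    let reloaded = decode(&encode(&arena));
    assert_eq!(reloaded, arena);

    // The u32 index is still valid after the round trip.
    let child = reloaded[reloaded[0].child_base as usize];
    assert_eq!(child.value, 99);
    println!("arena round-trip ok");
}
```

With true mmap you would skip the decode step entirely and reinterpret the mapped bytes, which is where #[repr(C)] layouts or a library like rkyv come in.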