
@cordt-sei
Created October 16, 2024 21:37

SeiDB

Technical Summary

Overview

SeiDB is a storage optimization solution aimed at mitigating the challenges of growing blockchain state. It replaces the traditional IAVL-based storage system used in Cosmos SDK-based blockchains with an enhanced storage model to tackle storage bloat, enhance performance, and ensure efficient state handling for both full nodes and archive nodes. Key components of SeiDB include VersionDB and MemIAVL, which provide significant improvements to the storage efficiency and performance of nodes across the network.

Key Components

1. IAVL Tree Optimization

  • The original IAVL (Immutable AVL) Tree served as the foundation for state storage. While IAVL balances fast insertions and lookups, it leads to write amplification and slow performance due to metadata overhead.
  • SeiDB replaces IAVL with more efficient alternatives, MemIAVL and VersionDB, to address excessive write amplification, long block rollback times, and slow snapshot restoration.

2. MemIAVL (Experimental)

  • MemIAVL is an in-memory optimized version of IAVL that provides a significant performance boost to nodes. It can be enabled by setting memiavl.enable to true in the app.toml configuration file. MemIAVL uses a standalone directory for storage (data/memiavl.db) and allows switching back to the default IAVL setup if needed.
  • MemIAVL only supports pruned nodes, with its default configuration equivalent to pruning=everything. If historical gRPC queries or archive Merkle proof generation is required, VersionDB must be enabled alongside MemIAVL.
  • MemIAVL has its own configuration settings, such as async-commit-buffer and snapshot-interval, which govern commit behavior, caching, and how often snapshots are taken. Notably, asynchronous commit drastically improves block sync speed.
  • Use Cases:
    • Semi-Archived Nodes: MemIAVL can replace pruned IAVL trees along with VersionDB to form a semi-archive setup.
    • State Sync Nodes: MemIAVL restores snapshots quickly, often faster than the chunks can be downloaded, so state sync speed depends mainly on the internet connection.
    • Snapshot Providers: MemIAVL is well-suited for providing state-sync snapshots, reducing the time for snapshot export from days to minutes.
  • To save disk space, MemIAVL can be run on a Linux filesystem such as Btrfs with zstd compression, which can reduce storage by up to 60% without visible performance regression.
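As a rough illustration of the settings named above, the following Python sketch renders a [memiavl] fragment for app.toml. The key names (enable, async-commit-buffer, snapshot-interval) come from the text; the function name render_memiavl_section and the default values are hypothetical placeholders, not recommended settings.

```python
def render_memiavl_section(enable: bool = True,
                           async_commit_buffer: int = 0,
                           snapshot_interval: int = 1000) -> str:
    """Return a [memiavl] fragment for app.toml as a string.

    Key names are taken from the summary above. The default values are
    illustrative placeholders only (assumption), not tuned settings.
    """
    lines = [
        "[memiavl]",
        f"enable = {str(enable).lower()}",  # TOML booleans are lowercase
        f"async-commit-buffer = {async_commit_buffer}",
        f"snapshot-interval = {snapshot_interval}",
    ]
    return "\n".join(lines) + "\n"


# Print the fragment so it can be reviewed before editing app.toml by hand.
print(render_memiavl_section())
```

Generating the fragment as text keeps the sketch free of third-party dependencies; the actual values should be taken from the node software's own documentation.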

3. VersionDB

  • VersionDB replaces the IAVL tree as the backing store to improve storage efficiency, especially for archive nodes. It provides a log-structured, version-based storage system, maintaining only the necessary historical versions to minimize redundant data.
  • VersionDB is enabled by adding it to the list of store streamers in the app.toml configuration (streamers = ["versiondb"]). Its data is stored in data/versiondb; this path cannot be customized.
  • VersionDB significantly reduces disk usage, with a reported reduction of 63% for archive nodes on the Cronos network. With VersionDB, the active state size is reduced by 60% and historical growth by around 90%.
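The rule that MemIAVL alone serves only pruned state, and that historical queries need VersionDB alongside it, can be sketched as a small configuration check. This is a minimal sketch over an already parsed app.toml (a plain dict); the function name storage_warnings is hypothetical, and placing streamers under a [store] table is an assumption, since the text only gives the key itself.

```python
def storage_warnings(config: dict, needs_history: bool) -> list[str]:
    """Return warnings for a parsed app.toml given the node's requirements.

    Assumptions: 'enable' lives under the [memiavl] table and 'streamers'
    under a [store] table; adjust the lookups if the layout differs.
    """
    memiavl_on = config.get("memiavl", {}).get("enable", False)
    streamers = config.get("store", {}).get("streamers", [])
    versiondb_on = "versiondb" in streamers

    warnings = []
    if needs_history and memiavl_on and not versiondb_on:
        # MemIAVL by itself behaves like pruning=everything.
        warnings.append(
            "MemIAVL keeps only pruned state; enable VersionDB "
            "for historical queries or archive proofs"
        )
    return warnings


pruned_only = {"memiavl": {"enable": True}}
semi_archive = {"memiavl": {"enable": True},
                "store": {"streamers": ["versiondb"]}}

print(storage_warnings(pruned_only, needs_history=True))
print(storage_warnings(semi_archive, needs_history=True))
```

On Python 3.11 or later, the standard-library tomllib.loads function can produce such a dict from the text of app.toml.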

Storage Reduction and Performance Gains

Storage Reduction

  • Original Setup (RocksDB on Cronos):

    • application.db: 1.6TB
    • Total Disk Usage: 2.3TB
  • VersionDB Setup (SeiDB):

    • application.db: 82GB
    • versiondb: 26GB
    • Other components (blockstore, state, etc.): Approx. 729GB
    • Total Disk Usage: 837GB

The VersionDB setup lowers total disk usage from 2.3TB to 837GB, a reduction of just under 64%.
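The arithmetic behind these figures can be checked with a short Python sketch. It assumes decimal units (1TB = 1000GB), which the source does not state; the helper name percent_reduction is introduced here for illustration.

```python
def percent_reduction(before_gb: float, after_gb: float) -> float:
    """Percentage by which after_gb is smaller than before_gb."""
    return 100 * (1 - after_gb / before_gb)


# Figures from the lists above, in GB (assumption: 1TB = 1000GB).
original_total_gb = 2300          # 2.3TB total, original setup
original_application_gb = 1600    # 1.6TB application.db, original setup

seidb_parts_gb = {
    "application.db": 82,
    "versiondb": 26,
    "other (blockstore, state, etc.)": 729,
}
seidb_total_gb = sum(seidb_parts_gb.values())

print(seidb_total_gb)  # 837
print(f"total: {percent_reduction(original_total_gb, seidb_total_gb):.1f}%")
print(f"application.db: "
      f"{percent_reduction(original_application_gb, seidb_parts_gb['application.db']):.1f}%")
```

The component sizes sum to the stated 837GB, and the total reduction works out to roughly 63.6%, in line with the 63% reported for Cronos archive nodes.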

Performance Improvements

  1. State and Block Synchronization
    • State Synchronization times are improved by 1200% over the previous Sei v1 model.
    • Block Synchronization time is cut by 50%, enabling nodes to catch up to the latest state faster.
  2. Block Commit Times
    • SeiDB achieves a 287x improvement in block commit times. The optimizations in MemIAVL and VersionDB allow for faster access to state data during block execution.
  3. Throughput Gains
    • Overall throughput doubles (a 2x increase in TPS, Transactions Per Second), due to faster state access and state commits.
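The figures above mix two ways of stating a gain: a cut in elapsed time ("cut by 50%") and a multiplier ("287x"). The sketch below converts between the two. Reading "287x improvement" as a speedup factor is an assumption, and both function names are hypothetical.

```python
def speedup_from_time_cut(fraction_cut: float) -> float:
    """Speedup factor when elapsed time is cut by fraction_cut (0 <= x < 1)."""
    if not 0 <= fraction_cut < 1:
        raise ValueError("fraction_cut must be in [0, 1)")
    return 1 / (1 - fraction_cut)


def time_cut_from_speedup(factor: float) -> float:
    """Fraction of elapsed time removed by a speedup of `factor`."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1 - 1 / factor


# Block sync time cut by 50% corresponds to a 2x speedup.
print(speedup_from_time_cut(0.5))  # 2.0

# A 287x speedup (assumed reading) removes about 99.65% of commit time.
print(f"{time_cut_from_speedup(287):.2%}")
```

Putting every figure on the same scale makes the three claims easier to compare with one another.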

Migration to SeiDB

  • To migrate to SeiDB, node operators export and restore the IAVL state, then enable VersionDB and MemIAVL in the configuration (app.toml).
  • After enabling MemIAVL, operators should tune settings such as async-commit-buffer and snapshot-keep-recent for the node's role, whether that is block catch-up or snapshot generation.

Technical Impact

  • TPS and Latency Improvements: Faster state commit and state sync directly lead to better transaction throughput and reduced latency.
  • Efficient Archive Node Management: Archive nodes using VersionDB now require significantly less storage while achieving similar performance to full nodes.
  • State Bloat Mitigation: The combination of MemIAVL and VersionDB significantly mitigates state bloat by optimizing storage of, and access to, both active and historical state data.
  • Simplified Migration: By making MemIAVL a drop-in replacement, switching between storage systems is simplified, with minimal disruption to ongoing operations.

Sources:

  1. Cronos Documentation on VersionDB
  2. Cronos Documentation on MemIAVL
  3. Cosmos ADR-065 Store v2 Architecture
  4. IAVL Export/Import Documentation
  5. Sei Blog - Sei DB: The Numbers