@protolambda
Created February 16, 2019 18:34

ETH 2.0 concept Go client

Initial design doc for a new Go client for ETH 2.0. By @protolambda.

Hopefully the ideas and problems outlined here help with designing the ETH 2.0 successor of Geth ("Firefly"? See the F.A.Q.).

This is a starting point for a working document for initial contributors; some of the goals may still change.

General Goals

  • Lightweight
    • No extras that make the core harder to get right (verification, performance)
    • Be cautious with off-the-shelf components; they may add bloat.
    • Design so that lightweight environments (a cheap VPS, a smartphone) can run it.
  • Performant:
    • Quick processing; if there is any bottleneck, it should be the network
    • Parallel processing where possible (goroutines)
    • Design towards good usage of Go channels
  • Encapsulation:
    • Easy to keep up with spec-changes
    • Easy to verify and understand a sub-section of the code
  • Clear interfaces:
    • A complete plugin system would be too much (initially at least)
    • Strong composition pattern: Good fit for Go.
  • Easy to test:
    • Encapsulation helps
    • Hook different components together through channels where feasible
  • Experiment-first:
    • Prioritize being correct over being complete
    • Scrutinize ETH 2.0 spec in new ways
  • Easy and open for contributions:
    • Encapsulated design -> verifiable changes
    • Support for experimentation
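
The goals around goroutines and channels can be sketched as a tiny two-stage pipeline. Everything here (the Block type, the validity check) is an illustrative assumption, not spec code:

```go
package main

import "fmt"

// Block is a hypothetical, minimal stand-in for a beacon block.
type Block struct {
	Slot uint64
}

// pipeline wires two stages together with channels: a feeder, and an
// ingest stage that drops blocks failing a basic-validity check.
func pipeline(blocks []Block) []Block {
	raw := make(chan Block)
	valid := make(chan Block)
	go func() { // ingest stage
		for b := range raw {
			if b.Slot > 0 { // placeholder basic-validity check
				valid <- b
			}
		}
		close(valid)
	}()
	go func() { // feeder
		for _, b := range blocks {
			raw <- b
		}
		close(raw)
	}()
	var out []Block
	for b := range valid {
		out = append(out, b)
	}
	return out
}

func main() {
	got := pipeline([]Block{{0}, {1}, {2}})
	fmt.Println(len(got), "blocks passed ingestion")
}
```

Each stage owns its input channel and closes its output, which keeps shutdown simple and makes stages individually testable by swapping channels.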

Known problems & considerations

  • "god-object": one state.
    • Goal: make changes:
      • easy
      • readable
      • verifiable
      • fast
      • cached
    • Problems:
      • Duplicate data when storing as a whole
      • Memory. But not all data has to be available at all times.
    • Optional: tracking a longer history of changes
  • Undocumented life-cycle expectations
    • Documented: "slot", "block", "epoch"
    • An attempt to document the full extent of processing (as far as I know):
      1. Block ingestion (either from local node or from network)
        1. Validate basic requirements (known parent, known eth1 ref, etc.)
      2. Block pre-processing
        1. Retrieve state of parent block
          • We want a handle/view to the state that:
            • can handle changes, without persisting immediately
            • reflects unchanged data
          • The state can be big, and there may be many being processed in parallel.
        2. Transition state to slot of ingest-block (or just before, easy to make off-by-1 errors with slots...)
          • Change retrieved data, don't persist to disk
      3. Block processing
        • Apply block changes to state
          • Encapsulate changes, don't make one ugly transition.
            • The spec generally encapsulates it in sections, but we can improve on that.
            • Split transition into multiple files: changing/contributing will be easier
          • Make changes apparent: ideally we know what data to serialize again, and which can be retrieved from a cache
          • Make validation checks in advance where affordable; prevent easy DoS with invalid data.
      4. Block post-processing
        1. Serialize state parts where it's necessary
        2. Hash what's necessary (tree-hash)
        3. Verify state root of block
        4. Store post-state
      5. Block storage
        1. Store block
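
A rough sketch of the lifecycle above, under heavy assumptions: State and Block are minimal stand-ins, and stateRoot fakes the SSZ tree-hash. The point is the shape: copy the parent state, advance slots, apply the block, verify the claimed root before persisting anything:

```go
package main

import (
	"errors"
	"fmt"
)

// State and Block are minimal stand-ins; real types come from the spec.
type State struct {
	Slot uint64
}

type Block struct {
	Slot      uint64
	StateRoot string
}

// stateRoot fakes SSZ tree-hashing of the state for this sketch.
func stateRoot(s *State) string {
	return fmt.Sprintf("root@%d", s.Slot)
}

// processBlock sketches the lifecycle: copy the parent state (nothing
// is persisted yet), advance per-slot, apply the block's changes, then
// verify the claimed post-state root before storing anything.
func processBlock(pre *State, b Block) (*State, error) {
	post := *pre // work on a copy, not the stored parent state
	for post.Slot < b.Slot {
		post.Slot++ // per-slot transition
	}
	// (block processing: apply operations to post here)
	if stateRoot(&post) != b.StateRoot {
		return nil, errors.New("state root mismatch")
	}
	return &post, nil // safe to persist as the block's post-state
}

func main() {
	post, err := processBlock(&State{Slot: 4}, Block{Slot: 6, StateRoot: "root@6"})
	fmt.Println(post.Slot, err)
}
```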
  • Attestations (on unfinalized blocks) processing:
    • Aggregation is unclear, but necessary:
      • Fork-choice based on summing individual attestations is super slow
      • Sharing attestations can be optimized with it
      • Verifying may be faster
    • Aggregate per-target, i.e. ideally we have one batch of attestations per attested block.
    • Need to keep track of attestations, we want to produce new slashings ourselves.
    • Due to large amounts of attestations, it may require:
      • Storing some idle data on disk
      • Splitting attestation data:
        • Verified stripped-down attestations in memory
        • Full attestation data on disk (also for restart, see below)
    • Persisting data for use after restart. This is an overlooked storage requirement.
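
The per-target batching idea could look like this minimal sketch; Attestation is a hypothetical stripped-down type, with validators listed individually instead of a real participation bitfield:

```go
package main

import "fmt"

// Attestation is a hypothetical stripped-down attestation: only the
// target root fork choice cares about, plus the attesting validator
// (a real version would carry a participation bitfield).
type Attestation struct {
	Target    string
	Validator int
}

// aggregate batches attestations per attested block, so fork choice can
// sum weight per target instead of per individual attestation.
func aggregate(atts []Attestation) map[string][]int {
	byTarget := make(map[string][]int)
	for _, a := range atts {
		byTarget[a.Target] = append(byTarget[a.Target], a.Validator)
	}
	return byTarget
}

func main() {
	agg := aggregate([]Attestation{{"A", 1}, {"B", 2}, {"A", 3}})
	fmt.Println(len(agg["A"]), len(agg["B"]))
}
```

Indexing by target also maps directly onto the storage idea further below: the batch key doubles as the storage key.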
  • Serialization of the "god-object": no serialization-cache by design.
    • Can we generalize the way a state section:
      • tracks its changes
      • maintains a serialization cache
      • serializes when necessary (soft, i.e. use the cache)
      • is loaded from its serialized version
      • fills its initial cache
      • maintains a hashing cache
      • hashes when necessary (soft, i.e. use the cache; tree-hash)
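
One possible shape for such a generalized state section: a Go interface plus a toy implementation with a serialization cache. All names are assumptions, and the hashing is a placeholder, not real SSZ:

```go
package main

import "fmt"

// StateSection is a hypothetical generalization: every state subsection
// tracks its changes and keeps serialization/hashing caches.
type StateSection interface {
	Dirty() bool            // changed since the caches were filled?
	Serialize() []byte      // soft: return the cache when clean
	HashTreeRoot() [32]byte // soft: real impl would cache tree-hash chunks
}

// validatorRegistry is a toy implementation with a serialization cache.
type validatorRegistry struct {
	balances []uint64
	dirty    bool
	serCache []byte
}

func (v *validatorRegistry) Dirty() bool { return v.dirty }

func (v *validatorRegistry) Serialize() []byte {
	if !v.dirty && v.serCache != nil {
		return v.serCache // cache hit, nothing changed
	}
	out := make([]byte, 0, 8*len(v.balances))
	for _, b := range v.balances {
		for i := 0; i < 8; i++ { // little-endian uint64
			out = append(out, byte(b>>(8*i)))
		}
	}
	v.serCache, v.dirty = out, false
	return out
}

func (v *validatorRegistry) HashTreeRoot() [32]byte {
	var root [32]byte // placeholder mixing, NOT a real SSZ tree-hash
	for i, b := range v.Serialize() {
		root[i%32] ^= b
	}
	return root
}

func main() {
	var sec StateSection = &validatorRegistry{balances: []uint64{32, 32}, dirty: true}
	fmt.Println(len(sec.Serialize()), sec.Dirty())
}
```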
  • Managing unfinalized blocks:
    • Fork-choice rule execution; the spec's approach is slow
      • Easy & quick access to necessary data for fork-choice:
        • block hashes
        • block parent-hashes
        • block slot
        • optional: block height (count since genesis)
      • Possibly like a small DAG, easy to implement fork-rule on top of
    • Quickly changing head, prevent big updates.
      • The points in time where we want to re-determine the head:
        • On ingesting a block, when already fully synced
        • When unprocessed weight of collected (and aggregated) attestations surpasses a threshold
        • After syncing
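
A minimal sketch of such a DAG node and a greedy head walk. This is a simplified stand-in for the real fork-choice rule: it compares direct child weights, where real LMD-GHOST compares accumulated subtree weights; all names are assumptions:

```go
package main

import "fmt"

// dagNode holds just the fields fork choice needs; names are assumptions.
type dagNode struct {
	hash   string
	parent *dagNode
	slot   uint64
	weight uint64 // aggregated attestation weight supporting this block
}

// head greedily walks from the justified node to the heaviest leaf.
// Simplification: it compares direct child weights, whereas real
// LMD-GHOST compares accumulated subtree weights.
func head(justified *dagNode, children map[string][]*dagNode) *dagNode {
	cur := justified
	for {
		kids := children[cur.hash]
		if len(kids) == 0 {
			return cur // reached a leaf: this is the head
		}
		best := kids[0]
		for _, k := range kids[1:] {
			if k.weight > best.weight {
				best = k
			}
		}
		cur = best
	}
}

func main() {
	root := &dagNode{hash: "root", slot: 0}
	a := &dagNode{hash: "a", parent: root, slot: 1, weight: 10}
	b := &dagNode{hash: "b", parent: root, slot: 1, weight: 20}
	fmt.Println(head(root, map[string][]*dagNode{"root": {a, b}}).hash)
}
```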
  • Provide access to events
    • Implement subscriptions with channels
      • Possibly reuse the go-ethereum events implementation, or another one?
    • Implement some of the getter/streamer RPC functionality on top of this.
    • Validator should be able to access this easily and quickly.
      • Choice: do we connect our validator node(s) via:
        1. RPC
        2. direct to events
        3. both?
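
A minimal channel-based subscription hub, as a sketch of how events could be exposed (go-ethereum's event package offers a production-grade alternative; nothing here is its API):

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a placeholder for anything worth broadcasting:
// new block, new head, finality update, and so on.
type Event struct {
	Kind string
}

// Feed is a minimal subscription hub; every subscriber gets its own
// buffered channel. This is a sketch, not go-ethereum's event.Feed.
type Feed struct {
	mu   sync.Mutex
	subs []chan Event
}

func (f *Feed) Subscribe() <-chan Event {
	f.mu.Lock()
	defer f.mu.Unlock()
	ch := make(chan Event, 16)
	f.subs = append(f.subs, ch)
	return ch
}

func (f *Feed) Send(e Event) {
	f.mu.Lock()
	defer f.mu.Unlock()
	for _, ch := range f.subs {
		ch <- e // buffered; a real feed must handle slow subscribers
	}
}

func main() {
	var feed Feed
	sub := feed.Subscribe()
	feed.Send(Event{Kind: "new_head"})
	fmt.Println((<-sub).Kind)
}
```

Both RPC streaming and a directly attached validator could then consume the same feed, which keeps the "RPC, direct, or both" choice open.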
  • Provide storage for state
    • Make pruning easy
    • Avoid duplication of data (e.g. storing the full validator list every slot, when it only changes every epoch; with upcoming changes maybe every so often, but likely not continuously)
    • Make lookups fast
    • Writing speed is not so important, the latest-states may be cached in memory
    • Possibly support some sort of queries
    • Possibly support fetching of ranges of data
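
A toy slot-keyed state store showing how the keying keeps pruning trivial; a real store would key by state root and store diffs, so this is only a shape sketch:

```go
package main

import "fmt"

// stateStore is a toy store keyed by slot, so that pruning everything
// before a finalized slot is a simple scan. A real store would key by
// state root and store diffs rather than full states.
type stateStore struct {
	data map[uint64][]byte // slot -> serialized state
}

func newStateStore() *stateStore {
	return &stateStore{data: make(map[uint64][]byte)}
}

func (s *stateStore) Put(slot uint64, state []byte) { s.data[slot] = state }

func (s *stateStore) Get(slot uint64) ([]byte, bool) {
	v, ok := s.data[slot]
	return v, ok
}

// Prune drops every state strictly before the finalized slot.
func (s *stateStore) Prune(finalized uint64) {
	for slot := range s.data {
		if slot < finalized {
			delete(s.data, slot)
		}
	}
}

func main() {
	st := newStateStore()
	st.Put(1, []byte("s1"))
	st.Put(2, []byte("s2"))
	st.Prune(2)
	_, ok := st.Get(1)
	fmt.Println(ok, len(st.data))
}
```

Range fetches and queries would sit naturally on top of an ordered key encoding (slot as big-endian prefix) in a real key-value backend.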
  • Provide storage for blocks
    • Block storage is mostly there to sync other non-light peers with.
    • Writing speed is more important (afaik)
    • Make pruning easy
    • Iteration of keys may be completely unnecessary: we have latest-blocks references in state now.
  • Provide storage for attestations
    • See attestation comments above.
    • Persisting for after restart
    • Used to create slashings, and restore after restart (or we handle that separately, e.g. persist DAG)
    • We could abstract aggregation per-target by indexing storage by target.
  • Integrate BLS
    • Many teams had problems integrating BLS, mostly due to reliance on cross-language or native-lib interactions.
    • Don't roll your own crypto; still, we have to find something that works well.
    • We could share effort with Prysmatic here. (Need to look at licensing here however, the wrapper has a different license than the external library underneath)

Possibilities for radical changes/innovations

  • Better Serialization patterns (optimize accesses and caching)
  • Better Attestation Aggregation
    • Implement batching well, useful for fast fork-choice
    • Implement fork-choice cache on top of batch: i.e. track changes in weights of batches (Already hacked together experimental version in LMD-GHOST simulation, here)
    • Cache can be partially processed: only if change in weights is big enough. Good for speed. Sort of supported by spec (FORK_CHOICE_BALANCE_INCREMENT).
    • Experiment with storage solutions
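
The threshold idea could be sketched as a small weight-delta cache. Names and the exact threshold semantics are assumptions, loosely inspired by FORK_CHOICE_BALANCE_INCREMENT:

```go
package main

import "fmt"

// weightCache accumulates attestation-weight deltas per block root and
// only signals a head re-evaluation once the unprocessed total passes
// a threshold, in the spirit of FORK_CHOICE_BALANCE_INCREMENT.
type weightCache struct {
	pending   map[string]uint64 // block root -> unapplied weight delta
	unapplied uint64
	threshold uint64
}

func newWeightCache(threshold uint64) *weightCache {
	return &weightCache{pending: make(map[string]uint64), threshold: threshold}
}

// Add records a delta and reports whether a fork-choice run is due.
func (w *weightCache) Add(root string, delta uint64) bool {
	w.pending[root] += delta
	w.unapplied += delta
	return w.unapplied >= w.threshold
}

// Flush returns and clears the accumulated deltas for processing.
func (w *weightCache) Flush() map[string]uint64 {
	out := w.pending
	w.pending = make(map[string]uint64)
	w.unapplied = 0
	return out
}

func main() {
	wc := newWeightCache(100)
	fmt.Println(wc.Add("a", 40)) // 40 < 100: no re-evaluation yet
	fmt.Println(wc.Add("b", 70)) // 110 >= 100: re-run fork choice
}
```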
  • Possibly implement state transitions with a decoration-pattern:
    • Like composition, but with a clear order: generic, extensible, and relatively easy to implement with Go interfaces. Seems like a good fit to avoid clumsy inheritance approaches that don't suit Go
    • Provides some good benefits: extensible, good encapsulation and clear processing order
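
A sketch of the decoration-style composition with Go interfaces; the steps themselves are placeholders, the point is the explicit, extensible ordering:

```go
package main

import "fmt"

// State and the steps below are placeholders; the point is the
// explicit, extensible ordering of transition stages.
type State struct {
	Slot uint64
	Log  []string
}

// Transition is one stage of the state transition. Stages wrap each
// other decorator-style, so the processing order stays explicit and a
// new stage is just another small type.
type Transition interface {
	Apply(s *State)
}

type slotStep struct{ next Transition }

func (t slotStep) Apply(s *State) {
	s.Slot++ // per-slot work goes here
	s.Log = append(s.Log, "slot")
	if t.next != nil {
		t.next.Apply(s)
	}
}

type epochStep struct{ next Transition }

func (t epochStep) Apply(s *State) {
	s.Log = append(s.Log, "epoch") // per-epoch work goes here
	if t.next != nil {
		t.next.Apply(s)
	}
}

func main() {
	pipeline := slotStep{next: epochStep{}}
	s := &State{}
	pipeline.Apply(s)
	fmt.Println(s.Slot, s.Log)
}
```

Spec changes then map to adding, removing, or reordering small step types rather than editing one monolithic transition function.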
  • Implement a DAG (a lot like a tree here, though)
    • Fork-choice from DAG (already implemented here)
      • much faster than retrieval of blocks (no unnecessary traversal or allocations)
      • more minimal: dag-nodes don't need complete block data
    • Easy slot-based pruning (could leave disconnected graph components, but eventually pruned)
    • Reasonable branch-based pruning, if even necessary
    • Quick to switch head, and justified block
    • Complete view of the available forks
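
Slot-based pruning over such a DAG is a single pass; the node layout here is an assumption:

```go
package main

import "fmt"

// node carries the minimum a DAG node needs here; names are assumptions.
type node struct {
	hash string
	slot uint64
}

// pruneBefore drops all nodes below the finalized slot. Nodes on
// abandoned forks may linger as disconnected components for a while,
// but their slots eventually fall below the cutoff too.
func pruneBefore(nodes map[string]*node, finalizedSlot uint64) {
	for h, n := range nodes {
		if n.slot < finalizedSlot {
			delete(nodes, h)
		}
	}
}

func main() {
	nodes := map[string]*node{
		"a": {"a", 1},
		"b": {"b", 5},
		"c": {"c", 9},
	}
	pruneBefore(nodes, 5)
	fmt.Println(len(nodes))
}
```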
  • Synergize DAG <-> storage.
    • If state storage really needs to be minimal, state can be split up:
      • state-subsection caches already provide:
        • structured information
        • storage key (hash of subsection)
      • traverse DAG from head to retrieve all necessary state.
        • mark DAG node if it contains a change in storage of a subsection
        • walk back from head, and load state sections for first change-mark
        • No duplication of any data, even with forks!
    • Maybe too ambitious, up to decide if it is worth it, given state size and processing bottleneck
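
The walk-back idea might be sketched like this: each DAG node marks which subsections its block changed, and loading a state collects the first (i.e. most recent) storage key per wanted subsection. All names here are hypothetical:

```go
package main

import "fmt"

// dagNode marks, per state subsection, whether its block changed that
// subsection, and under which storage key (hash of the subsection).
// All names here are hypothetical.
type dagNode struct {
	parent  *dagNode
	changed map[string]string // subsection name -> storage key
}

// loadState walks back from head; for each wanted subsection it takes
// the storage key from the first node marking a change to it. No
// subsection is ever stored twice, even across forks.
func loadState(head *dagNode, sections []string) map[string]string {
	keys := make(map[string]string, len(sections))
	for n := head; n != nil && len(keys) < len(sections); n = n.parent {
		for _, sec := range sections {
			if _, done := keys[sec]; done {
				continue // already found a more recent change
			}
			if k, ok := n.changed[sec]; ok {
				keys[sec] = k
			}
		}
	}
	return keys
}

func main() {
	genesis := &dagNode{changed: map[string]string{"validators": "v0", "balances": "b0"}}
	blockA := &dagNode{parent: genesis, changed: map[string]string{"balances": "b1"}}
	keys := loadState(blockA, []string{"validators", "balances"})
	fmt.Println(keys["validators"], keys["balances"])
}
```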

F.A.Q.

Why not Prysm (other Go client)?

  1. Licensing: commercial use is really not that bad.
  2. Arguably built too quickly for production: off-the-shelf components are preferred over working with and on the spec, which progresses the spec less.
  3. More people need to scrutinize the spec in different ways.
  4. More options available
  5. Experiment with big new ideas.

How do I contribute?

The initial phase of starting a new client implementation is messy, but everyone is welcome; please get in touch with the others on the go-ethereum Discord (firefly channel).

Who is working on this?

Honestly, no idea. But @karalabe (go-ethereum dev, Péter Szilágyi) started a repository, and I am looking for a more experimental approach, implemented in Go, than Prysm.

Why "firefly"?

Something with "light", as a reference to the beacon chain. Similar to the naming of other ETH 2.0 clients (e.g. Prysm, Lodestar, Lighthouse, Artemis).

Yes, we know that there is some hardware wallet with the same name. If you have a better name for this project, please let us know.
