This document gives a top-level view of the blsync PR to help the review/merge process. It describes the most important mechanisms and data structures and how they are used in this PR (and sometimes how they are going to be used later).
The current PR implements an MVP feature called blsync that can light sync the beacon chain from a beacon node that supports the light_client REST API namespace (Lodestar or Nimbus) and drive an EL node through the engine API. Its components are sometimes more general purpose, though, as they are also intended to be part of the new full-featured, PoS-capable Geth light mode. Note that a significant part of this PR (more specifically light.LightChain, merkle.MultiProof, sync.HeaderSync and sync.StateSync) is only used here in order to get the finalized block hash out of the beacon state. One option is to strip down the PR even further by removing this feature from the first version, as it may not be essential and takes a significant amount of code to implement. On the other hand, it is still a nice-to-have feature, and the exact same beacon state syncing mechanism is going to be an essential part of light servers, so this feature is also a good test that helps us move toward the final goal.
This package defines passive data structures that represent the actual state of the beacon chain light syncing.
- CheckpointData is a starting point for light syncing (it can be used to initialize a CommitteeChain). It can be obtained based on the beacon header's root hash, which is either hardcoded in the client or specified as a command line flag. It contains the sync committee of the given period and its beacon state merkle proof.
- CommitteeChain holds a validated series of SerializedCommittees and LightClientUpdates. It can validate SignedHeaders once it has been synced up to the required sync period. It is a key component of beacon light clients, but servers driven by full beacon nodes will also use it for storing and serving these structures. Though in the current PR a chain of sufficiently good LightClientUpdates is never updated, CommitteeChain is capable of replacing updates with better ones and even reorging if the better update proves a different next update (see the comment at ForwardUpdateSync). Hopefully this feature will never be needed in practice (at least on mainnet), but propagating the best update chain is still good practice, and being able to recover from a serious attack reduces the potential feasibility of such an attack (note that currently the whole light syncing relies on an honest majority assumption, so it is less safe than general consensus, though AFAIK there are serious ongoing efforts to make sync committee signature fraud slashable, at least for the finalized chain).
- LightChain is a beacon header chain with optionally associated partial beacon state proofs, which can be added separately after the header has been added. It keeps track of the canonical header chain, which can be set externally by SetHead. It also automatically keeps track of the section of the canonical chain where state proofs are also available.
- HeadValidator validates SignedHeaders with the current CommitteeChain and also implements a subscription mechanism that allows multiple subscriptions to new validated heads at different signer count levels. Note that currently we only have one subscription at the global signer threshold level (which is a command line parameter of blsync), but light servers will use separate subscriptions for propagating signed heads at signer count levels that are independent of the local threshold setting.
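The subscription mechanism of HeadValidator can be pictured with a minimal sketch like the one below. It is only an illustration under stated assumptions: the type and method names (`signedHead`, `headValidator`, `Subscribe`, `Add`) are hypothetical and not taken from the PR, signature validation against CommitteeChain is omitted entirely, and the threshold values are arbitrary.

```go
package main

import "fmt"

// signedHead is a hypothetical, simplified stand-in for a SignedHeader
// that has already been validated against the committee chain.
type signedHead struct {
	Slot        uint64
	SignerCount int
}

// subscription delivers a head only if its signer count reaches minSigners
// and it is newer than the last head delivered to this subscriber.
type subscription struct {
	minSigners int
	lastSlot   uint64
	callback   func(signedHead)
}

// headValidator is a toy model; the real validation step is left out.
type headValidator struct {
	subs []*subscription
}

// Subscribe registers a callback at the given signer count level.
func (v *headValidator) Subscribe(minSigners int, cb func(signedHead)) {
	v.subs = append(v.subs, &subscription{minSigners: minSigners, callback: cb})
}

// Add passes a head to every subscription whose threshold it meets.
// Because lastSlot starts at zero, a head at slot 0 is never delivered
// in this simplified model.
func (v *headValidator) Add(h signedHead) {
	for _, s := range v.subs {
		if h.SignerCount >= s.minSigners && h.Slot > s.lastSlot {
			s.lastSlot = h.Slot
			s.callback(h)
		}
	}
}

func main() {
	v := &headValidator{}
	// One subscriber at a strict (local) threshold, one at a lower level.
	v.Subscribe(342, func(h signedHead) { fmt.Println("strict subscriber got slot", h.Slot) })
	v.Subscribe(100, func(h signedHead) { fmt.Println("relaxed subscriber got slot", h.Slot) })
	v.Add(signedHead{Slot: 10, SignerCount: 200})
	v.Add(signedHead{Slot: 11, SignerCount: 400})
}
```

In this model each subscriber sees its own view of the latest head, which is the property light servers would need for propagating heads at levels independent of the local threshold.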
This package defines passive data structures used by the light syncing process.
- SyncCommittee is a set of 512 BLS keys randomly selected by consensus for every 8192 slot sync period. It is required for validating the BLS signature aggregates of SignedHeaders and LightClientUpdates.
- SerializedCommittee is a serialized version of SyncCommittee.
- LightClientUpdate proves the root hash of the sync committee of the next period based on a header signed by a sufficient majority of the sync committee of the given period, plus a beacon state merkle proof of the next_sync_committee state field. A light client update is better (has a higher update score) if the header signature aggregate has more participants. The best update is a finalized update that has a supermajority signed header referencing a former header from the same sync period as finalized.
- Forks is a list of known chain forks that can determine the SigningRoot of any header based on header hash and fork version at the given epoch.
- Header is a beacon header.
- SignedHeader is a header signed by a subset of the canonical SyncCommittee for the given period. Note that the structure does not reference the committee itself but the period is determined by the SignatureSlot field.
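The update score ordering and the slot-to-period mapping described above can be sketched as follows. This is one possible reading of the text, not the PR's actual code: `updateScore`, `betterThan` and `syncPeriod` are hypothetical names, the requirement that the finalized header belongs to the same sync period is not modeled, and any boundary handling of SignatureSlot is left out.

```go
package main

import "fmt"

const (
	slotsPerSyncPeriod = 8192 // length of a sync period in slots (from the text)
	syncCommitteeSize  = 512  // number of BLS keys in a sync committee (from the text)
	// Smallest signer count that is at least two thirds of the committee.
	supermajority = (syncCommitteeSize*2 + 2) / 3
)

// updateScore is a hypothetical, reduced score for a light client update.
type updateScore struct {
	SignerCount int
	Finalized   bool
}

// isBest reports whether the update is finalized and signed by a supermajority.
func (s updateScore) isBest() bool {
	return s.Finalized && s.SignerCount >= supermajority
}

// betterThan assumes that finalized supermajority updates beat all others
// and that ties within a class are decided by signer count.
func (s updateScore) betterThan(o updateScore) bool {
	if s.isBest() != o.isBest() {
		return s.isBest()
	}
	return s.SignerCount > o.SignerCount
}

// syncPeriod maps a slot to the index of the sync period containing it.
func syncPeriod(slot uint64) uint64 {
	return slot / slotsPerSyncPeriod
}

func main() {
	weak := updateScore{SignerCount: 500, Finalized: false}
	best := updateScore{SignerCount: 350, Finalized: true}
	fmt.Println(best.betterThan(weak), syncPeriod(8191), syncPeriod(8192))
}
```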
This package is a framework for network requests and syncing mechanisms. In the final light client implementation it will replace some parts of the les package (the request distributor/retriever).
- Scheduler is the main active component where sync modules and servers are registered. It implements a trigger mechanism that ensures that all sync modules get a chance to make network requests when necessary.
- Module is an interface for a syncing module. These modules are called whenever triggered by module or server events. They typically have direct references to passive data structures (and sometimes other modules). In each processing round they determine whether they can add new data to the structures or start new network requests whose results can be added if successful. When changes have been made that might make other additions or requests possible, they emit module trigger signals, triggering themselves and/or other modules for the next processing round. Their Process function always receives an Environment which allows starting network requests and makes the current validated head and prefetch head available.
- Server wraps the abstract RequestServer (which is currently implemented by SyncServer) and adds timing/triggering mechanisms for request timeout and delay. Delay is not used currently but will be used later by a greatly simplified version of the flow control. Whenever a server is found unavailable for requests, it guarantees to send a server event trigger signal when it becomes available again.
- Environment is always passed to Module.Process and allows making network requests to the current set of servers (or a subset which has been recently triggered). It also makes the actual validated and prefetch heads available. The validated head is determined by HeadValidator while the prefetch head comes from HeadTracker.
- HeadTracker subscribes to the latest and signed head event streams of registered servers. Based on the latest heads it determines the current prefetch head, which is the (possibly unvalidated) latest head advertised by the majority of servers. The signed head events are passed to HeadUpdater (which passes them on to HeadValidator when CommitteeChain can validate them).
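The trigger-driven processing rounds can be illustrated with a small sketch. It is heavily simplified and makes several assumptions: the `module` interface below has a parameterless `Process` that returns whether something changed (the real one receives an Environment and emits trigger signals), there are no servers or network requests, and `chainState`, `initModule` and `syncModule` are hypothetical.

```go
package main

import "fmt"

// module is a reduced, hypothetical version of the Module interface.
// Process returns true if it changed something that may enable more work.
type module interface {
	Process() bool
}

// scheduler runs all modules in rounds as long as any of them triggers.
type scheduler struct {
	modules []module
}

// run executes processing rounds until a round produces no trigger
// (or maxRounds is reached) and returns the number of rounds executed.
func (s *scheduler) run(maxRounds int) int {
	rounds := 0
	for triggered := true; triggered && rounds < maxRounds; rounds++ {
		triggered = false
		for _, m := range s.modules {
			if m.Process() {
				triggered = true
			}
		}
	}
	return rounds
}

// chainState is a passive structure shared by the modules.
type chainState struct {
	initialized bool
	synced      int
	target      int
}

// initModule initializes the chain once, like a checkpoint initializer would.
type initModule struct{ st *chainState }

func (m initModule) Process() bool {
	if m.st.initialized {
		return false
	}
	m.st.initialized = true
	return true
}

// syncModule adds one item per round once the chain is initialized.
type syncModule struct{ st *chainState }

func (m syncModule) Process() bool {
	if !m.st.initialized || m.st.synced >= m.st.target {
		return false
	}
	m.st.synced++
	return true
}

func main() {
	st := &chainState{target: 3}
	s := &scheduler{modules: []module{syncModule{st}, initModule{st}}}
	rounds := s.run(100)
	fmt.Println("rounds:", rounds, "synced:", st.synced)
}
```

The modules never call each other directly: the sync module simply finds the chain initialized in a later round, which mirrors how a trigger from one module gives the others a chance to make progress.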
This package contains sync modules (all of them implement request.Module) that are not only used in the current PR but will also be used by the full-featured light client and/or its server.
- CheckpointInit checks if the CommitteeChain is initialized. If not, it checks if the necessary CheckpointData is in the database. If not, it checks if it can start a request to retrieve it. Finally it initializes CommitteeChain and emits a module trigger that starts ForwardUpdateSync.
- ForwardUpdateSync checks if any of the servers, based on their advertised head slots, are supposed to have LightClientUpdates that could be appended to the current CommitteeChain, then requests and adds them if successful. Note that once serving this data is implemented, servers will also be able to advertise the update scores of their committee chain and there is going to be another sync method that compares the received scores to the local chain and fetches better updates if available.
- HeadUpdater does not start any requests but is still a sync module so that it can be triggered whenever CommitteeChain is improved. All it does is receive SignedHeaders from the individual servers and pass them to HeadValidator when the CommitteeChain is synced.
- HeaderSync tries to sync up the header chain of LightChain up to the current validated head (which is available through Environment). Once successful, it calls LightChain.SetHead. Once the head is synced, it can also reverse sync the canonical header chain up to an externally set "tail target" slot. Optionally it can also attempt to fetch the prefetch head, which is not made canonical yet but allows prefetching the state so that by the time the majority signature is available, all relevant data belonging to the head header is also available. Note that header prefetching is not used in the blsync setup because it prefetches entire beacon blocks and the header is derived from those. Note that the current version always fetches headers one by one based on parent root, while reverse syncing older headers could be parallelized by fetching by number and checking parent roots later. This will be added later as it is not essential for the blsync setup, which reverse syncs a few hundred slots at most.
- StateSync fetches partial beacon state proofs with the specified CompactProofFormat for all canonical headers of LightChain and also for the prefetch head.
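The one-by-one reverse syncing by parent root mentioned for HeaderSync can be sketched like this. The sketch rests on assumptions: `header`, `reverseSync` and the `fetch` callback are hypothetical, roots are plain strings instead of hashes, and the fetch is synchronous, whereas the real module issues asynchronous requests through the Environment.

```go
package main

import "fmt"

// header is a hypothetical, minimal beacon header: roots are plain strings
// here, whereas real headers are identified by their hash tree root.
type header struct {
	Slot       uint64
	ParentRoot string
}

// reverseSync walks backwards from headRoot, fetching one header at a time
// by parent root, until it reaches a header at or below tailTarget or the
// fetch function cannot provide the next header.
func reverseSync(headRoot string, tailTarget uint64, fetch func(root string) (header, bool)) []header {
	var chain []header
	root := headRoot
	for {
		h, ok := fetch(root)
		if !ok {
			break // header not available; a real module would retry later
		}
		chain = append(chain, h)
		if h.Slot <= tailTarget {
			break // tail target reached
		}
		root = h.ParentRoot
	}
	return chain
}

func main() {
	// A toy "server" that knows three headers, keyed by root.
	known := map[string]header{
		"c": {Slot: 3, ParentRoot: "b"},
		"b": {Slot: 2, ParentRoot: "a"},
		"a": {Slot: 1, ParentRoot: ""},
	}
	fetch := func(root string) (header, bool) {
		h, ok := known[root]
		return h, ok
	}
	for _, h := range reverseSync("c", 2, fetch) {
		fmt.Println("synced slot", h.Slot)
	}
}
```

Because each request depends on the parent root returned by the previous one, this loop is inherently sequential, which is why fetching by number and checking parent roots afterwards would allow parallel requests.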
This package implements request functions for the beacon node REST API. Note that in the final light client implementation execution layer requests will also be implemented here (at which point it might be moved under another package and will replace some parts of the current les package).
- BeaconLightApi implements REST API requests.
- SyncServer wraps BeaconLightApi and implements request.RequestServer. Note that it is going to have more functionality once the request delays are used.
This package implements merkle proof related tools. Note that these tools do not care about actual data structure definitions but rather about handling, merging and trimming arbitrarily shaped multiproofs. Also note that while they also serve a purpose here, they will make even more sense in the final light client where proof shapes are sometimes procedurally defined, for example when proving the execution block hash of a historical block through the current beacon state, the historical_roots tree, the old state_roots tree of the given period, and finally the old beacon header and its beacon state, or when servers are syncing up these historical structures from each other, retrieving range proofs for larger sections. On the other hand, they might currently be used unnecessarily in some cases, for example when hashing a Header. In these cases (when there is a fixed and known data structure) github.com/protolambda/zrnt can be used (I want to change this in the current PR).
- ProofFormat, ProofReader and ProofWriter are abstractions for arbitrarily shaped beacon state proofs.
- CompactProofFormat is a proof format descriptor defined here and is used both for requesting and storing multiproofs. It is very compact as it only requires two bits per tree node (1/128th the size of the actual nodes) and is very easy to process.
- MultiProof is a partial beacon state proof with a CompactProofFormat and the corresponding list of tree nodes (Values).
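To give an idea of how a compact shape descriptor and a list of values can be combined into a state root, here is a minimal sketch. The bit layout is an assumption made for illustration and is not necessarily the encoding used by CompactProofFormat: each internal node contributes two bits in depth-first order, one per child, where a set bit means that the child is itself an internal node and a cleared bit means that its value is taken from the value list. All names (`node`, `multiProof`, `proofReader`, `hashPair`) are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// node is a 32 byte merkle tree node.
type node [32]byte

// hashPair computes the parent of two sibling nodes.
func hashPair(left, right node) node {
	return sha256.Sum256(append(left[:], right[:]...))
}

// multiProof pairs a shape descriptor with the provided node values.
// format holds two bits per internal node in depth-first order:
// (left child is internal, right child is internal).
type multiProof struct {
	format []bool
	values []node
}

// proofReader walks the descriptor and consumes values in order.
// Malformed input makes it panic; a real implementation would return errors.
type proofReader struct {
	proof      multiProof
	fpos, vpos int
}

// internal recomputes the hash of the internal node at the current position.
func (r *proofReader) internal() node {
	leftInternal, rightInternal := r.proof.format[r.fpos], r.proof.format[r.fpos+1]
	r.fpos += 2
	left := r.child(leftInternal)
	right := r.child(rightInternal)
	return hashPair(left, right)
}

// child either descends into a subtree or reads the next provided value.
func (r *proofReader) child(isInternal bool) node {
	if isInternal {
		return r.internal()
	}
	v := r.proof.values[r.vpos]
	r.vpos++
	return v
}

// root recomputes the root hash implied by the proof.
func (p multiProof) root() node {
	r := &proofReader{proof: p}
	return r.internal()
}

func main() {
	a, b, c := node{1}, node{2}, node{3}
	// Shape: root(internal(a, b), c).
	p := multiProof{
		format: []bool{true, false, false, false},
		values: []node{a, b, c},
	}
	fmt.Printf("%x\n", p.root())
}
```

The descriptor for this three-value proof is four bits, while the values take 96 bytes, which illustrates why storing the shape alongside every proof is cheap.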
This package defines consensus constant parameters and beacon state field indices.
The main package of the blsync executable. The main function creates the chain structures, sets up the scheduler and the sync modules, registers a SyncServer for each beacon API URL specified in the command line and then starts the scheduler.
The two sync modules defined in this package (both implement request.Module) are only used by blsync.
- beaconBlockSync retrieves full beacon blocks for the current validated and prefetch head (typically only the prefetch head if it gets validated later). When successful, it also extracts the Header from the block and adds it to LightChain so that the state proofs can also be prefetched in the ideal case.
- engineApiUpdater does not make any requests to the REST API but it calls the engine API whenever a new execution block is retrieved and validated.