Probabilistic "light" verification of the beacon chain

Author: Zsolt Felföldi ([email protected])

Abstract

This document explores the possibility of a probabilistic syncing method on the beacon chain that requires significantly less network bandwidth than full verification while still providing a high level of safety based on the security assumption of Casper (that a 2/3 majority consensus of the active validator set is always honest). It also proposes a new representation of the active validator set in the crystallized state and an extra rule when changing the validator set in order to make the necessary verification steps possible and make the design "light client friendly".

Note: this proposal is based on the latest version of the specification at the time of writing:

https://github.com/ethereum/eth2.0-specs/blob/1296a4863e62a3cda10cc6b6a7aca35fdc792a55/specs/beacon-chain.md

Proposed validator set representation

Validators are stored in a Merkle tree with the ID (key hash) as their storage key. In each tree node there is a set of values stored together with each child hash. These values are aggregated for the entire subtree:

total_balance: total balance of active validators
total_attesting: total balance of validators attesting to the slot last_state_recalculation_slot - 1 which is the latest slot that can be justified at this point.
total_enter: cumulative balance of validators entering the validator set
total_exit: cumulative balance of validators leaving the validator set

Note: since the cumulative values can belong to already deleted keys it is possible that the values belonging to a subtree are larger than the sums of the values belonging to its children. The final data structure of an internal tree node would look like this:

hash of left child
total_balance, total_attesting, total_enter, total_exit values belonging to left child
hash of right child
total_balance, total_attesting, total_enter, total_exit values belonging to right child
total_enter, total_exit values belonging to deleted children

The total aggregated values belonging to the root of tree are not stored directly but can be calculated by adding the values in the root tree node.

Note 2: most of these values are updated when the validator set is changed while total_attesting is changed every time the crystallized state is recalculated. It is an alternative option to store these values in a separate tree structure. This structure can be recalculated in memory based on the attestations in the last cycle which could possibly allow a more resource-efficient implementation.

Subtree-aggregated values make it easy to do a weighted random selection based on these values. While doing such a lookup we also verify whether the subtree aggregated values actually match the sum of the aggregated values of the child nodes. Now we define a few basic operations that make use of them:

Prove the justified status of a block that is a parent of a state recalculation block

We need to find the next-to-next state recalculation slot where total_attesting values for this block are available. Here we perform the following checks:

check if 3 * validators_root.total_attesting >= 2 * validators_root.total_balance
repeat check_count times:
- do a weighted random selection based on total_attesting
- request proof of the attestation being actually included in the chain from the serving peer and verify it
  - Note: if the server can not serve the attestation then it is always the server's fault because it should have verified it before accepting the block that contains it in the total_attesting structure

Increasing check_count decreases the chance of accepting an invalid justification exponentially.

Verify the changes of the validator set over a potentially long period

We select two slots s1 and s2 and perform a check of the relative changes in the validator set and the corresponding balance and cumulative values.

repeat check_count times:
- do a weighted random selection based on total_balance in slot s1
- look for the same key in s2
- if it is missing then verify that total_exit has been increased
- always verify that no total_enter and total_exit values were decreased
repeat check_count times:
- do a weighted random selection based on total_balance in slot s2
- look for the same key in s1
- if it is missing then verify that total_enter has been increased
- always verify that no total_enter and total_exit values were decreased

Note: this process can only check that total_enter and total_exit are increased when necessary and never decreased but not that they are not increased unnecessarily. This will be performed by the following check which should be performed on single validator set changes.

Verify a single change of the validator set

We select two slots s1 and s2 where s2 is the first slot with the new validator set and s1 is the last slot that was already justified in s1.

perform a justified status check on s1
repeat check_count times:
- do a weighted random selection based on s2.total_enter - s1.total_enter using the validator structures of both slots
- check if the selected validator was in PENDING_ACTIVATION state in s1
- check if the selected validator was in ACTIVE state in s2
- check if s2.total_enter - s1.total_enter matches the balance of the validator
repeat check_count times:
- do a weighted random selection based on s2.total_exit - s1.total_exit using the validator structures of both slots
- check if the selected validator was in PENDING_EXIT state in s1
- check if the selected validator was in PENDING_WITHDRAW state in s2
- check if s2.total_exit - s1.total_exit matches the balance of the validator

Extra condition when changing the validator set

In addition to the requirement last_finalized_slot > crystallized_state.validator_set_change_slot we should also require a justified block since the PENDING status for changing the validator set has been entered:

last_justified_slot >= crystallized_state.validator_set_change_pending_slot where crystallized_state.validator_set_change_pending_slot is the last slot number where a validator entered PENDING_ACTIVATION or PENDING_EXIT state.

This ensures the possibility of "light" verification of the correctness of validator state changes.

Chain verification method

Chain verification should focus on the validator set changes. If the validator set remains unchanged for a period of any length then verifying the attestations at the end of the period gives us the same amount of security as verifying the entire period. The proposed method would look more or less like this (just a rough draft, not an exact description):

consider the chain as a series of sections divided by checkpoints; first checkpoints are the genesis (or starting checkpoint) and the announced head of the chain
at each checkpoint we obtain validators_root.total_enter + validators_root.total_exit and call it total_change
for a section with limiting checkpoints c1 and c2 calculate relative_change = (c2.total_change - c1.total_change) / MIN(c1.total_balance, c2.total_balance)
add sections to a heap based on relative_change
repeat while the heap is not empty:
- select the section with the highest relative_change and remove from heap
- perform "long" validator set change verification process
- if the section is short enough that only a single validator change can fit in it or if relative_change is smaller than relative_change_threshold:
  - perform single validator set change verification process on the last validator set change of the section (this also verifies the justified status before the change)
- otherwise:
  - randomly select and add a new checkpoint inside, obtain total_change
  - recalculate and add new sections to the heap
verify justified status at the end of the chain

zsfelfoldi/beacon_light_sync.md