@nazar-pc · Last active December 13, 2024
Nazar's endgame design
```yaml
---
title: Nazar's endgame
markmap:
  colorFreezeLevel: 2
  duration: 200
  maxWidth: 400
  spacingHorizontal: 200
  spacingVertical: 20
---
```

Raw contents could be inserted into https://markmap.js.org/repl to produce a mind map.

The meaning of "must" and "should" is defined in https://www.rfc-editor.org/rfc/rfc2119.

The goal: massive multiplayer apps that can't be built otherwise

These are literally impossible or extremely hard to build in an incentive-compatible and sustainable way without a truly Internet-scale blockchain. It is even more difficult to build applications that are censorship-resistant to the point of being unstoppable. The goal is to make such applications a reality and enable developers to build new kinds of apps that were not possible before due to limited infrastructure.

Social media

  • Think Twitter/X
  • Strong (ultimate?) censorship resistance
  • Posts never disappear without a trace; every change is persisted and timestamped in blockchain history
  • Paid community-forming subscriptions use on-chain tokens paid directly to creators, without intermediaries

Sustainable permanent virtual worlds

  • Countless NFTs for different in-world items
  • Countless shaders and textures persisted for as long as the world itself lives and beyond
  • Countless transactions for various actions/interactions

Global payment system

  • An actually global payment system, without borders and regulators
  • Strong (ultimate?) censorship resistance
  • Self-custody
  • Decoupled from manipulative central bank issuance
  • Affordable fees for everyday use
  • Billions of accounts for users and machines alike
  • Billions of transactions every day

Benefits of PoAS sharding over Ethereum rollups

  • Persistence
    • Rollup scaling can't fit the history of all rollups into the L1 chain and store it permanently
    • PoAS sharding can
  • Security
    • While the ultimate security of optimistic rollups is backed by Ethereum, transactions are not as secure as those on L1 until the challenge window elapses
    • PoAS sharding shares the same security as L1
  • Efficiency
    • While ZK rollups commit proofs of execution to Ethereum directly, the amount of computation wasted in the process is very high compared to re-execution with PoAS sharding, unless millions of users re-execute transactions
    • Latency in ZK rollups is significantly higher for the same reason
  • Decentralization
    • Due to the lack of shared security, rollups rely on their own validator sets, which offer a fraction of L1's immediate security at best and a centralized sequencer at worst
    • PoAS sharding has full decentralized L1 security
  • Uniformity
    • With PoAS sharding it is possible to use the same account (and its verification method in case of multisig, etc.) and contract address across all shards
    • With rollups every rollup has its own independent state with its own accounts and contracts
  • Scalability
    • With rollups there is a limit on how many rollups, and how frequently, L1 can support
    • PoAS sharding can in principle scale infinitely

Assumptions and expectations

  • Modern hardware is fast and getting faster every day; it is the software that is not using it efficiently
  • Consumer hardware (target audience for running both consensus nodes and end-user applications) is significantly less capable than enterprise server hardware of the same generation
  • The laws of physics allow us to increase storage, compute and bandwidth over time almost infinitely, but not information propagation latency (unless we find a way to transfer information faster than the speed of light at scale, which is unlikely any time soon)
  • If a certain increase in blockchain utilization results in only a logarithmic increase in system requirements, then inevitable incremental improvements in storage, compute and bandwidth will be sufficient to keep up with network growth without practical limits
  • The above essentially means that for a blockchain to be scalable, it needs to scale sub-linearly with utilization
    • If it can't, then it is not scalable
    • If it has a hard limit by design, then it is not scalable
    • If hardware requirements grow close to linearly with utilization, it is not scalable
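
A quick worked illustration of this criterion: if system requirements grow logarithmically with utilization, $R(u) = c \log_2 u$, then doubling utilization adds only a constant:

$$R(2u) = c \log_2(2u) = c \log_2 u + c = R(u) + c$$

A constant per doubling is absorbed by routine generational improvements in storage, compute and bandwidth, while linear growth $R(u) = cu$ would double the requirements with every doubling of utilization.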

Consensus

Must have

  • Permissionless participation
  • Unbounded # of farmers
  • Unbounded amount of space pledged (in practice it is sufficient to be able to represent all space in the world)
  • Safety only depends on farmers (the chain can't be reverted after a sufficient amount of time)
  • Modest hardware requirements (mid-range gaming computer)

Should have

  • Low confirmation latency
    • Transaction confirmation time is short and measured in seconds, within the constraints of bandwidth requirements
  • Liveness only depends on farmers (transactions submitted will end up being included on the blockchain in a reasonable amount of time)
  • Chain upgrades based on decision of consensus nodes (like Bitcoin) instead of on-chain governance (committee)
  • Based on post-quantum cryptography

Should not have

  • A hardcoded ratio between the cost of storage and the cost of compute

Must not have

  • A GPU requirement or any other exotic hardware, though such hardware can be used to improve efficiency

How (ideas)

  • Subspace consensus v2.3 is basically doing this
    • Minus votes

Sharding

Must have

  • Unbounded number of shards (in practice it might be fine to have some ridiculously high number that is convenient to use in code like $2^{32}$ or something)
  • More shards means more throughput (both bandwidth and compute)
  • Bounded compute on individual farmers as number of shards increases
  • The same code for shards and the beacon chain, with minor conditional logic where there are behavioral differences; all upgrades to the beacon chain implicitly apply to the shards (sketched below)
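
A minimal Rust sketch (names are hypothetical) of the "same code, minor conditional logic" idea: a single runtime parameterized by chain kind rather than separate codebases for the beacon chain and shards:

```rust
// Hypothetical chain-kind switch; not an actual Subspace type.
enum ChainKind {
    Beacon,
    Shard { id: u64 },
}

fn allows_user_transactions(kind: &ChainKind) -> bool {
    // One of the few behavioral differences: the beacon chain carries
    // no user transactions (see "Should not have" below).
    !matches!(kind, ChainKind::Beacon)
}
```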

Should have

  • Bounded amount of space used by node for any farmer
  • Independent archiving for each shard with an easy way to find which segments correspond to which shard for sync from genesis
    • Will cause too much complexity otherwise

Should not have

  • Significant deviation from Nakamoto consensus for safety
  • Transactions on the beacon chain
  • Non-smart contract transactions on shards
    • Claiming block production reward might be the only exception
  • Need to sync on-demand with specific timing assumptions
  • Decoupled execution
    • Will cause too much complexity otherwise
  • Ability to remove previously existing shards
    • Will cause too much complexity otherwise

Must not have

  • Confirmation latency increase or communication delays when more shards are added
  • On-chain registration before joining a particular shard
    • Effectively implies farmer self-assignment to shards

How (ideas)

  • Shards to be organized in hierarchical structure
    • Lower-level shards are handled by higher-level shards
    • Beacon chain only handles a limited set of highest-level shards
    • The number of shards a beacon chain/shard handles must allow at least one shard block per higher-level chain block; something like 128 or 256 shards per level seems reasonable and will achieve the desired number of shards with just a few levels, keeping cross-shard communication latency bounded (see the sketch after this list)
    • The shard limit per level can grow over time, this will simply require increasing bandwidth requirements
    • Farmer commits to the lowest-level shard they want to produce blocks on during plotting and follows all the higher-level shards all the way to and including the beacon chain
    • If any full node observes a safety violation, a "poisoned block" transaction is submitted to the beacon chain; it includes a proof that the block was included somewhere in the tree of shards leading to the beacon chain, plus the block itself, and resets the corresponding branch of shards to the previous state
      • All peers that submitted blocks ultimately pointing to the offending block are banned at the network level
  • When producing a beacon chain or shard block, the reward can either go directly to a numeric contract on a specified shard or to a special address represented by a public key on a specified shard, from which it can later be claimed by one of the contracts
  • Since there is no need to sync on demand (after solving a puzzle, for example), it is possible to discover shards using a random walk when bootstrap nodes are not available
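
A minimal sketch of the hierarchical addressing implied above, assuming 256 shards per level (one of the suggested values) and treating shard IDs as base-256 digit paths; all names are hypothetical. With 256 shards per level, three levels already give $256^3 = 2^{24} \approx 16.7$ million leaf shards, and four levels give $2^{32}$:

```rust
/// Assumed number of child shards per level (128 or 256 per the text above).
const CHILDREN_PER_LEVEL: u64 = 256;

/// Shards a farmer committed to a leaf shard must follow: the leaf
/// itself plus every ancestor up the hierarchy; the beacon chain sits
/// implicitly at the top.
fn shards_to_follow(leaf_shard: u64, levels: u32) -> Vec<u64> {
    let mut shard = leaf_shard;
    let mut path = Vec::with_capacity(levels as usize);
    for _ in 0..levels {
        path.push(shard);
        shard /= CHILDREN_PER_LEVEL;
    }
    path
}
```

For example, `shards_to_follow(leaf, 3)` returns the leaf shard, its parent and its grandparent; the farmer additionally follows the beacon chain itself.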

Execution environment

Must have

  • Homogeneous
    • All shards look exactly the same
  • Source code can be compiled from at least Rust and C
    • Tooling just for Rust is sufficient to start; support for C dependencies and pure C can be added later
  • Use the C ABI (see the sketch after this list)
  • RISC-V ISA
    • The best fit overall from a performance and complexity point of view
    • A real-world ISA implementable in hardware (it might be possible to create purpose-built hardware, use hardware-accelerated virtualization on RISC-V hardware, etc.)
  • High performance
    • Efficient use of compute hardware (can't be ZK VM by default)
    • Efficient use of storage hardware (avoid 32-byte database updates if at all possible)
  • Synchronous composability within a shard
  • Asynchronous composability between shards
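
A sketch of what these constraints could look like in practice; the exported symbol name and signature are assumptions for illustration, not a defined standard:

```rust
// Hypothetical contract entry point: a plain exported C-ABI function,
// buildable with stock `cargo build` for a RISC-V target.
#[no_mangle]
pub extern "C" fn entry(
    input_ptr: *const u8,
    input_len: usize,
    output_ptr: *mut u8,
    output_cap: usize,
) -> usize {
    // Echo the input back, truncated to the output buffer capacity;
    // a real contract would deserialize arguments and compute results.
    let len = input_len.min(output_cap);
    // SAFETY: the host is assumed to pass valid, non-overlapping buffers.
    unsafe {
        core::ptr::copy_nonoverlapping(input_ptr, output_ptr, len);
    }
    len
}
```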

Should have

  • Parallel execution
    • One contract call to use more than a single execution thread
    • Run multiple non-conflicting contract calls
  • Pipelined execution
    • Overlap different stages of different contract calls (for example execution of one contract call can happen in parallel with compilation/translation of another)
  • Ability to run unmodified contract under ZK VM
    • Might be worth the size/computation tradeoff in cases where multiple contract calls could be combined with a succinct validity proof rather than a large storage proof that requires re-execution
  • Instructions have relative costs on reference hardware, but absolute cost is defined by market mechanisms rather than protocol constants, so the base protocol does not have to fix the price of compute
  • Contracts should be mostly pure functions without side effects (see the sketch after this list)
    • All inputs are given as function arguments
      • No magic stateful "environment" accessible with special host functions
    • All outputs are done as return values/arguments
      • No direct storage writes or similar things
    • If a contract needs to modify the state of another contract, the method returns, the other contract gets its inputs and executes, and the return method is called on the original contract again; calls that do not modify the state of other contracts can be made directly
    • "Mostly" above refers to optional offloading of heavy compute to the host
  • Performance ceiling high enough to verify Subspace consensus in a contract (in full, including PoT)
    • Might be a bit of a stretch, but highly desirable
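
A minimal sketch of this "mostly pure functions" style; the types and names (`Balance`, `PendingCall`) are hypothetical, not a defined API:

```rust
pub struct Balance(pub u128);

/// A state change the host routes to another contract, returned as a
/// value instead of performed as a side effect.
pub struct PendingCall {
    /// Global contract number of the callee
    pub to_contract: u64,
    pub amount: u128,
}

/// A token transfer as a pure function: the caller's slot state comes
/// in as an argument, and the updated slot plus a pending cross-contract
/// call come out as return values. No ambient environment, no direct
/// storage writes.
pub fn transfer(
    sender_balance: Balance,
    to_contract: u64,
    amount: u128,
) -> Result<(Balance, PendingCall), &'static str> {
    let remaining = sender_balance
        .0
        .checked_sub(amount)
        .ok_or("insufficient balance")?;
    Ok((Balance(remaining), PendingCall { to_contract, amount }))
}
```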

Should not have

  • Fraud proofs
    • Will cause too much complexity otherwise
  • Mandatory offloading of compute-intensive operations (cryptographic signatures, etc.) to the host
    • Contracts should be priced and runnable even if the host doesn't support any offloading at all; offloading is an optional optimization
  • Exotic tooling
    • cargo build (maybe with a bunch of CLI options) without unusual dependencies on the host OS should be sufficient to build a deployable contract

Must not have

  • Exotic programming language/toolchain/environment (EVM/WASM/Solidity/Move/Cairo)
    • Custom VMs can be just contracts and still run fast enough

How (ideas)

  • Charging for multiple cores regardless of how many cores a contract actually uses, to incentivize parallel programming
  • Charge reads/writes in multiples of 512/4096-byte pages with increasing discounts for additional pages up to 16 kiB, the way physical disk reads and writes happen (16 kiB is the physical page size on modern SSDs); see the sketch after this list
  • Contracts do not work with databases; instead they work with memory in which they can store arbitrary contents
    • Should allow for a much simpler underlying database implementation, possibly a thin overlay on top of flat binary files
    • Hashes of all contracts can be stored in memory even for a fairly large number of contracts, with smaller trees (or other data structures) used internally within each contract, allowing quick and parallel state root computation on consensus nodes without touching permanent storage at all
  • Efficient database implementation
    • Possibly using NOMT
    • Ideally no such database on the farmer though, maybe only on "operators"
  • Stateless verification
    • This also implies ability to do parallel block verification
  • Separate state for each shard that can be aggregated together on the beacon chain
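
A hedged sketch of the page-based charging idea from the list above; the discount schedule is invented for illustration and is not protocol-defined:

```rust
const PAGE_SIZE: u64 = 4096;
/// 16 kiB (assumed physical SSD page size) divided into 4 kiB logical pages
const DISCOUNTED_PAGES: u64 = 4;

/// Charge I/O in whole 4096-byte pages, with each additional page within
/// the first 16 kiB discounted more than the previous one (hypothetical
/// schedule).
fn io_cost(bytes: u64, price_per_page: u64) -> u64 {
    let pages = bytes.div_ceil(PAGE_SIZE).max(1);
    (0..pages)
        .map(|page| {
            let step = page.min(DISCOUNTED_PAGES - 1);
            price_per_page - price_per_page * step / (2 * DISCOUNTED_PAGES)
        })
        .sum()
}
```

Under this schedule a 100-byte read is charged as one full page, while a 10 kiB read is charged as three pages with the second and third discounted.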

Contract (account) model

Must have

  • Global notion of a contract that is usable on any shard
  • Contract is represented by a number (unsigned 64-bit)
  • User account is a special case of a contract and is conceptually the same thing
  • Ability to receive farming rewards to a special address represented by a public key, from where the reward can be moved into a contract (see the sketch after this list)
  • Ability for contract to receive tokens on any shard
  • Ability for contract to send tokens from multiple shards
  • Ability to read state of any contract of any shard on any shard (with eventual consistency driven by confirmation latency)
  • Ability to interact with contracts on the same shard that results in shard-specific contract state updates
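
One possible (purely illustrative) shape of this address model, combining the numeric contract identity with the special public-key reward address:

```rust
/// Illustrative address model, not a defined standard.
enum Address {
    /// Globally unique contract number, usable on any shard
    Contract(u64),
    /// Special reward address represented by a farmer's public key;
    /// rewards land here and can later be moved into a contract
    RewardKey([u8; 32]),
}
```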

Should have

  • One or more aliases that resolve to a contract in user-friendly way
  • Ability to use (update) contract on multiple shards at the same time
  • Ability to update independent parts of the contract state in parallel/concurrently
  • Native tokens as contracts
    • Would be convenient as a unifying framework for the whole system, but will require some kind of access control over regions of contract's memory
  • An option to do anonymous/zk/homomorphic computation
    • Ideally including claiming consensus rewards

Should not have

Must not have

  • Contract state bloat becoming an issue over time for consensus nodes

How (ideas)

  • Each shard is given a pool of globally unique numbers to be used for contracts
  • A contract is registered on one of the shards that allocates a unique and globally identifiable number that can later be used on any shard
  • Contract state is composed of "pages" (see execution environment above)
  • Each page contains a set of slots belonging to different contracts that can modify them
    • A custom fungible token is stored in a dedicated slot modifiable by the corresponding contract
    • Native logic is using the same mechanism as regular user-deployable contracts:
      • Native token is stored in a slot and modifiable by native logic/contract
      • Account ownership (signature, multisig, etc.) or smart contract code is stored in a slot and modifiable by native logic/contract
    • All slots are added to an accumulator, resulting in the state root of the contract
  • State of any shard can be proven to exist to any other shard for reading purposes
  • Token balance and certain similar concepts have a commutative property, meaning balance increases can be aggregated from multiple concurrently running contract calls since they are guaranteed to succeed (while subtractions can underflow and can't be used this way without special handling); see the sketch after this list
    • Think atomic incrementing of a value
  • Each contract has distinct state on one or more shards with ability to move this state to a different shard as a whole if desired
  • Execution on a particular shard can only update contract state that corresponds to the same shard
  • Advanced accumulators that support generation of inclusion and non-inclusion proofs without access to the whole state could allow a contract owner to track changes across shards while only being a light client of the beacon chain
    • A big if; this one needs a lot more thought given to it
    • The fallback is to use a Sparse Merkle Tree, which will result in larger proofs but will satisfy most of the properties
  • Farmers do not store the state of the execution environment, they only do stateless verification and update the resulting state root
    • Farmers store the state roots of the contracts (but not the state itself) in an in-memory accumulator ordered by increasing unique contract number
    • A transaction must be submitted with the corresponding slots that will be read/modified and a proof that they belong to the contract's state root
    • If the number of contracts gets so large that farmers can't store all state roots in memory, state roots can be hashed in pairs and only the resulting hashes stored, cutting the memory requirement in half and increasing proofs by one hash
      • The observation here is that if we use a Merkle Tree, we will never actually remove entries from it; we'll have a continuously growing tree whose entries may be modified (including to a null value, as in a Sparse Merkle Tree)
      • It is possible for light clients to keep track of not just their own state root, but also the state root of a single predictable other contract, which seems like a reasonable tradeoff
  • "Operators" are not forced to store the whole state either; they are free to drop the state of any contract, and the owner would then have to either submit a transaction with a stateless proof attached to it or pay someone who did store their state to do the same
    • Helps to avoid guessing storage costs, reducing them to the compute cost of doing the store itself, and allows setting custom retention policies
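
A small sketch of the commutativity point above: increases always succeed, so they can be applied in any order and aggregated across concurrent executions, while decreases can underflow and need special handling. `AtomicU64` stands in for a contract balance slot:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Balance increases are guaranteed to succeed, so applying them in any
/// order yields the same final balance.
fn apply_increases(balance: &AtomicU64, increases: &[u64]) {
    for &amount in increases {
        balance.fetch_add(amount, Ordering::Relaxed);
    }
}

/// Decreases may underflow and therefore must be checked; they cannot be
/// blindly aggregated the same way.
fn try_decrease(balance: &AtomicU64, amount: u64) -> bool {
    balance
        .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |balance| {
            balance.checked_sub(amount)
        })
        .is_ok()
}
```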
@vedhavyas

Non-smart contract transactions on shards

So this means even the beacon chain will support contracts for the immediate higher-level shards? At the lower level, do we want to support EVM and other VMs whose consensus logic, like today's pallet-domains, exists in the contracts themselves?
Looks neat!
Do you envision using polka-vm contracts as a default?
Would there be a standard for such lower-level shards' contracts?

Farmer commits to the lowest-level shard they want to produce blocks on during plotting and follows all the higher-level shards all the way to and including the beacon chain

Not sure how the implementation here would look. How many low-level shards can a farmer follow? Wouldn't following and plotting at the higher levels be intensive if they decide to plot the lowest level of shards?

Independent archiving for each shard with an easy way to find which segments correspond to which shard for sync from genesis

If each shard is archived independently, will a farmer only archive the beacon chain with pointers to the state of the immediate higher-level shards through their contracts?
If so, then a shard's archived segments are not used by the farmers to find the PoAS solution?

Parallel execution

This reminds me of Solana's data model to support parallel execution. Since Substrate's data storage model does not support this out of the box, do you envision using something like the Solana VM even in the beacon chain, since it should also support parallel execution of the immediate higher-level shards' transactions?

"Operators" are not forced to store the whole state either,

This is the only mention of the "operators" keyword. Do operators exist for each shard up until the higher-level shards, or just at the lower level where we deploy custom VMs like EVM, etc.?

@nazar-pc (Author)

So this means even the beacon chain will support contracts for the immediate higher-level shards?

The beacon chain is not supposed to have contracts by itself, even though technically it will likely be capable of it due to the shared runtime.

At the lower level, do we want to support EVM and other VMs whose consensus logic, like today's pallet-domains, exists in the contracts themselves?

Yes, the idea is that any VM, including EVM, will be simply a contract.

Do you envision using polka-vm contracts as a default?
Would there be a standard for such lower-level shards' contracts?

PolkaVM is just a virtual machine and a likely candidate for what will run the execution environment, but the environment itself will likely be different from pallet-revive that Parity engineers are developing. The native environment will be one that satisfies the mentioned constraints, and there will be docs, standards, etc. for how to do this. If you want to run contracts written for pallet-revive, it'll likely require a nested VM to support that.

Not sure how the implementation here would look. How many low-level shards can a farmer follow? Wouldn't following and plotting at the higher levels be intensive if they decide to plot the lowest level of shards?

Technically, we'll just mix the shard number into the seed when deriving PoSpace tables for sector encoding.
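
Roughly like this (an illustration only; the hash function and exact inputs are assumptions, not actual Subspace code):

```rust
// Hypothetical seed derivation mixing the shard number into the
// per-sector seed used when deriving PoSpace tables; blake3 is assumed.
fn sector_seed(farm_seed: &[u8; 32], shard_id: u32, sector_index: u16) -> [u8; 32] {
    let mut hasher = blake3::Hasher::new();
    hasher.update(farm_seed);
    hasher.update(&shard_id.to_le_bytes());
    hasher.update(&sector_index.to_le_bytes());
    *hasher.finalize().as_bytes()
}
```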

Commitment could be per-sector or per-farm. A farmer can choose to commit to as few or as many shards as they want. There are a few ways to organize this; committing to the lowest level and following all the levels above (of which there will be a limited number) seems like the best strategy right now, but that could change. The number of lower-level shards committed to depends on how much compute and bandwidth the farmer has to follow those shards.

If each shard is archived independently, will a farmer only archive the beacon chain with pointers to the state of the immediate higher-level shards through their contracts?

That is the current idea, but the exact construction doesn't currently exist.

If so, then a shard's archived segments are not used by the farmers to find the PoAS solution?

All farmers will still plot all the pieces just like before. The mechanism for discovering that something was archived will, of course, change, but the principle is still the same.

This reminds me of Solana's data model to support parallel execution. Since Substrate's data storage model does not support this out of the box, do you envision using something like the Solana VM even in the beacon chain, since it should also support parallel execution of the immediate higher-level shards' transactions?

The beacon chain is expected to not have regular transactions and to be as lightweight as possible; ideally you'd be able to run a full beacon chain node in the browser, which would allow verifying everything that happens on all other shards.

Substrate's infrastructure will likely get in the way more than help, so I'm not sure it is a good idea to build this whole thing on top of Substrate. Much lower-level access to the hardware is needed to make it efficient and fast.

This is the only mention of the "operators" keyword. Do operators exist for each shard up until the higher-level shards, or just at the lower level where we deploy custom VMs like EVM, etc.?

It is also in quotes. These are not the same operators; what I meant here is some kind of provider that may store the state of user accounts and return it (for a fee, or maybe for free) when the user wants to make a transaction. Other than that, only farmers (consensus nodes) are part of the protocol at the fundamental level.
