@jwiegley
Created October 7, 2025 17:52

Product Requirements Document: EVM Test Suite Implementation

Document Version: 1.0
Last Updated: 2025-10-07
Status: Draft
Owner: Albert Groothedde
Due Date: December 8, 2025
Repository: kadena-io/kadena-evm-sandbox
Branch: ag/tests/20-evm


Executive Summary

This PRD outlines the comprehensive test suite implementation for Kadena Chainweb EVM, focusing on validating EVM compatibility, multi-chain functionality, and integration with the Ethereum test suite from github.com/ethereum/tests. The initiative aims to ensure robust, production-ready EVM execution on Kadena's multi-chain architecture (chains 20-24).


1. Project Overview

1.1 Background

Kadena Chainweb EVM is a multi-chain blockchain architecture where EVM execution runs on chains 20-24 alongside Pact smart contracts on chains 0-19. The key innovation is trustless cross-chain bridging using SPV (Simple Payment Verification) proofs, enabled by Chainweb's braided architecture.

1.2 Purpose

Implement a comprehensive test suite that:

  • Validates EVM compatibility with Ethereum standards
  • Tests multi-chain operations and cross-chain bridging
  • Ensures consensus and state management correctness
  • Provides regression testing and continuous integration support
  • Verifies Reth client integration with Chainweb consensus

1.3 Scope

In Scope:

  • Ethereum test suite integration (state, transaction, blockchain tests)
  • RPC API compatibility testing
  • Multi-chain operations and cross-chain bridging tests
  • Consensus and block production validation
  • Gas metering and transaction execution tests
  • Infrastructure and devnet testing
  • Performance and load testing

Out of Scope:

  • Pact smart contract testing (chains 0-19)
  • UI/Frontend testing
  • Security penetration testing (covered separately)
  • Mainnet deployment testing

2. Current State Analysis

2.1 Existing Test Infrastructure

Location: /tests directory

Current Test Coverage:

  1. E2E Tests (/tests/e2e/):

    • discontinued-node.test.ts - Node failure scenarios
    • run-multinode-with-miners.test.ts - Multi-node setup validation
    • check-container-logs.test.ts - Docker container health checks
    • fast-block-production.test.ts - Block timing validation
    • verify-miner-rewards.test.ts - Mining reward verification
  2. Integration Tests (/tests/src/):

    • hardhat.test.ts - Hardhat integration
    • pact.test.ts - Pact interaction tests
    • DX Tests (/tests/src/dx/):
      • token.test.ts - ERC-20 deployment (Hardhat tutorial)
      • zombie.test.ts - CryptoZombies contract deployment
      • tx.test.ts - Invalid transaction testing (reused nonce, low gas, invalid signature)
      • large.test.ts - Large contract deployment (>24KB)
  3. Solidity Tests (/solidity/test/):

    • SimpleToken.test.js - Unit tests for ERC-20 with cross-chain features
    • SimpleToken.integration.test.js - Cross-chain transfer integration tests

Test Infrastructure:

  • Framework: Bun (required, not Node.js)
  • Command: bun run test (from tests directory)
  • Timeout: 300,000ms (5 minutes) for integration tests
  • Docker: devnet/compose.py generates docker-compose.yaml configurations

2.2 Gaps Identified

  1. Ethereum Standard Tests: No integration with official ethereum/tests repository
  2. State Tests: Missing comprehensive state transition validation
  3. RPC Coverage: RPC endpoint testing is limited; an eth_getProof test exists, but broader endpoint coverage is needed
  4. Blockchain Tests: No validation against Ethereum blockchain test vectors
  5. VM Execution Tests: Missing low-level EVM opcode execution tests
  6. Consensus Edge Cases: Limited testing of consensus failure scenarios
  7. Performance Benchmarks: No systematic performance regression tests
  8. Cross-Chain Stress Tests: Limited multi-chain coordination under load

3. Goals and Objectives

3.1 Primary Goals

  1. Ethereum Compatibility: Achieve >95% compatibility with Ethereum test suite
  2. Multi-Chain Validation: Ensure cross-chain operations work reliably across chains 20-24
  3. Regression Prevention: Catch breaking changes before deployment
  4. Performance Baseline: Establish performance benchmarks for future optimization
  5. Developer Confidence: Provide comprehensive test coverage for contributors

3.2 Success Metrics

| Metric | Target | Measurement |
|---|---|---|
| Ethereum test suite pass rate | >95% | ethereum/tests execution results |
| RPC endpoint coverage | 100% of documented endpoints | RPC test suite results |
| Cross-chain test coverage | All bridging scenarios | Integration test results |
| CI/CD integration | <15 min test execution | GitHub Actions duration |
| Test documentation completeness | 100% of tests documented | Documentation review |

4. Test Categories

4.1 Ethereum Standard Tests (ethereum/tests)

Reference: https://github.com/ethereum/tests

Priority: HIGH
Estimated Effort: 5-8 weeks

4.1.1 State Tests

Validate state transitions for:

  • Account creation and destruction
  • Storage modifications (SSTORE, SLOAD)
  • Balance transfers
  • Contract deployments
  • Self-destruct operations
  • Code execution and gas consumption

Implementation Approach:

# Clone ethereum/tests repository
git clone https://github.com/ethereum/tests.git tests/ethereum-tests

# Test runner structure
tests/
  ethereum/
    state-tests.test.ts     # State transition tests
    vm-tests.test.ts        # VM execution tests
    blockchain-tests.test.ts # Blockchain validation tests
    runner/
      state-runner.ts       # State test executor
      fixtures.ts           # Test fixture loader

Test Format:

  • JSON test fixtures from ethereum/tests
  • Test runner parses JSON and executes against Chainweb EVM
  • Compare final state against expected results
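The parsing step can be sketched as follows. This is a minimal sketch, assuming the GeneralStateTests layout used by ethereum/tests: a top-level object keyed by test name, whose `post` map is keyed by fork name and whose entries carry `indexes` into the transaction's `data`, `gasLimit`, and `value` arrays. The type names and `expandFixture` are hypothetical and cover only the fields used here.

```typescript
// Hypothetical types; real fixtures carry more fields (env, pre, logs, ...).
interface PostEntry {
  hash: string; // expected state root after execution
  indexes: { data: number; gas: number; value: number };
}
interface StateFixture {
  transaction: { data: string[]; gasLimit: string[]; value: string[] };
  post: Record<string, PostEntry[]>;
}
interface StateCase {
  name: string;
  fork: string;
  data: string;
  gasLimit: string;
  value: string;
  expectedStateRoot: string;
}

// Expand one parsed fixture file into one case per (test, fork, post entry).
function expandFixture(file: Record<string, StateFixture>, forks: string[]): StateCase[] {
  const cases: StateCase[] = [];
  for (const name of Object.keys(file)) {
    const fixture = file[name];
    for (const fork of forks) {
      // Forks missing from the fixture simply contribute no cases.
      for (const entry of fixture.post[fork] ?? []) {
        cases.push({
          name,
          fork,
          data: fixture.transaction.data[entry.indexes.data],
          gasLimit: fixture.transaction.gasLimit[entry.indexes.gas],
          value: fixture.transaction.value[entry.indexes.value],
          expectedStateRoot: entry.hash,
        });
      }
    }
  }
  return cases;
}
```

Flattening up front lets each case become its own test, so a failure report names the exact fork and index combination rather than a whole file.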

4.1.2 Transaction Tests

Validate transaction processing:

  • Valid and invalid transaction formats
  • Signature verification (ECDSA secp256k1)
  • Nonce handling and replay protection
  • Gas price and gas limit validation
  • Transaction type support (Legacy, EIP-2930, EIP-1559)
  • Large transaction payloads

Test Cases:

describe('Transaction Tests', () => {
  it('should reject transaction with reused nonce', async () => {
    // Send transaction with nonce N
    // Attempt to send another transaction with nonce N
    // Expect rejection with appropriate error
  });

  it('should reject transaction with insufficient gas', async () => {
    // Submit transaction with gasLimit < intrinsic gas
    // Expect rejection
  });

  it('should handle EIP-1559 transaction format', async () => {
    // Submit EIP-1559 transaction with maxFeePerGas, maxPriorityFeePerGas
    // Verify execution and gas accounting
  });
});
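The "insufficient gas" case needs the intrinsic gas of a transaction as a threshold. A minimal sketch of that calculation follows, assuming Shanghai/Cancun-era Ethereum rules (EIP-2028 calldata pricing, EIP-2930 access lists, EIP-3860 initcode metering) and no later calldata floor. Which fork rules Chainweb EVM applies is an assumption to confirm; `intrinsicGas` is a hypothetical helper.

```typescript
// Gas constants from the Ethereum specification (EIP-2028, EIP-2930, EIP-3860).
const TX_BASE = 21000;
const TX_CREATE = 32000;
const ZERO_BYTE = 4;
const NONZERO_BYTE = 16;
const INITCODE_WORD = 2;
const ACCESS_LIST_ADDRESS = 2400;
const ACCESS_LIST_STORAGE_KEY = 1900;

interface AccessListItem {
  address: string;
  storageKeys: string[];
}

// Minimum gas a transaction must supply before any EVM code runs.
function intrinsicGas(
  dataHex: string,
  isCreate: boolean = false,
  accessList: AccessListItem[] = [],
): number {
  const hex = dataHex.startsWith('0x') ? dataHex.slice(2) : dataHex;
  if (hex.length % 2 !== 0) throw new Error('data must contain whole bytes');

  let gas = TX_BASE;
  for (let i = 0; i < hex.length; i += 2) {
    gas += hex.slice(i, i + 2) === '00' ? ZERO_BYTE : NONZERO_BYTE;
  }
  if (isCreate) {
    // Contract creation pays a flat fee plus a per-word charge on the initcode.
    gas += TX_CREATE + INITCODE_WORD * Math.ceil(hex.length / 2 / 32);
  }
  for (const item of accessList) {
    gas += ACCESS_LIST_ADDRESS + ACCESS_LIST_STORAGE_KEY * item.storageKeys.length;
  }
  return gas;
}
```

A rejection test could then submit a transaction with `gasLimit` set to `intrinsicGas(data) - 1` and expect the node to refuse it.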

4.1.3 VM Execution Tests

Low-level EVM opcode validation:

  • Arithmetic operations (ADD, SUB, MUL, DIV, MOD, EXP)
  • Comparison operations (LT, GT, EQ, ISZERO)
  • Bitwise operations (AND, OR, XOR, NOT, SHL, SHR, SAR)
  • Memory operations (MLOAD, MSTORE, MSTORE8)
  • Storage operations (SLOAD, SSTORE)
  • Control flow (JUMP, JUMPI, JUMPDEST, PC, STOP, RETURN, REVERT)
  • Stack operations (PUSH, POP, DUP, SWAP)
  • Environmental opcodes (ADDRESS, BALANCE, CALLER, CALLVALUE, etc.)
  • Call operations (CALL, STATICCALL, DELEGATECALL, CREATE, CREATE2)
  • Block information (BLOCKHASH, COINBASE, TIMESTAMP, NUMBER, DIFFICULTY, GASLIMIT)

Test Structure:

describe('VM Opcode Tests', () => {
  describe('Arithmetic Operations', () => {
    it('should execute ADD correctly', async () => {
      // Deploy contract with ADD opcode test
      // Execute and verify result
    });

    it('should handle overflow in ADD', async () => {
      // Test uint256 overflow behavior
    });
  });

  describe('Storage Operations', () => {
    it('should handle SSTORE/SLOAD correctly', async () => {
      // Test storage read/write
    });

    it('should handle storage gas costs (EIP-2200)', async () => {
      // Verify SSTORE gas cost calculations
    });
  });
});
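For the ADD cases, one option is to assemble a tiny program by hand and compare its output with a reference value. The sketch below is illustrative only: `addProgram` and `expectedAdd` are hypothetical helpers, and how the bytecode is executed (deployment, or a call with a state override) is left to the runner.

```typescript
const WORD = BigInt(1) << BigInt(256); // 2^256, the EVM word modulus

// Encode a value as a 32-byte big-endian hex string (no 0x prefix).
function word(v: bigint): string {
  if (v < BigInt(0) || v >= WORD) throw new Error('value out of uint256 range');
  return v.toString(16).padStart(64, '0');
}

// Bytecode: PUSH32 b, PUSH32 a, ADD, PUSH1 0, MSTORE, PUSH1 32, PUSH1 0, RETURN.
// It returns the 32-byte sum of a and b.
function addProgram(a: bigint, b: bigint): string {
  return (
    '0x' +
    '7f' + word(b) + // PUSH32 b
    '7f' + word(a) + // PUSH32 a
    '01' +           // ADD
    '6000' + '52' +  // PUSH1 0, MSTORE (store the sum at memory offset 0)
    '6020' + '6000' + 'f3' // PUSH1 32, PUSH1 0, RETURN (return 32 bytes from offset 0)
  );
}

// Reference result: EVM ADD wraps modulo 2^256 rather than trapping.
function expectedAdd(a: bigint, b: bigint): bigint {
  return (a + b) % WORD;
}
```

The overflow case then reduces to checking that the program for `2^256 - 1` and `1` returns zero, which is what `expectedAdd` yields.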

4.1.4 Blockchain Tests

Validate blockchain-level operations:

  • Block validation and acceptance
  • Uncle/ommer block handling
  • Difficulty calculation (if applicable)
  • Block rewards
  • Transaction ordering within blocks
  • Block size limits
  • Gas limit enforcement

4.2 RPC API Tests

Priority: HIGH
Estimated Effort: 3-4 weeks

4.2.1 Standard JSON-RPC Methods

Test all documented RPC endpoints:

Read Operations:

  • eth_chainId - Verify correct chain ID for each EVM chain (20-24)
  • eth_blockNumber - Current block height
  • eth_getBalance - Account balance queries
  • eth_getCode - Contract code retrieval
  • eth_getStorageAt - Storage slot queries
  • eth_call - Contract call simulation
  • eth_estimateGas - Gas estimation
  • eth_getBlockByNumber / eth_getBlockByHash - Block data retrieval
  • eth_getTransactionByHash - Transaction retrieval
  • eth_getTransactionReceipt - Receipt retrieval
  • eth_getLogs - Event log queries
  • eth_getProof - Account and storage proofs (SPV proof generation)

Write Operations:

  • eth_sendRawTransaction - Transaction submission
  • eth_sendTransaction - Transaction submission (if supported)

Network Information:

  • net_version - Network ID
  • net_listening - Node connectivity
  • net_peerCount - Peer count
  • web3_clientVersion - Client identification

Test Structure:

describe('RPC API Tests', () => {
  describe('eth_chainId', () => {
    it('should return correct chain ID for chain 20', async () => {
      const provider = getProvider(20);
      const chainId = await provider.send('eth_chainId', []);
      expect(chainId).toBe('0x711'); // 1789 (chainIdOffset) + 20 = 1809 = 0x711; confirm the offset against the devnet config
    });

    it('should return different chain IDs for different chains', async () => {
      const chain20 = await getProvider(20).send('eth_chainId', []);
      const chain21 = await getProvider(21).send('eth_chainId', []);
      expect(chain20).not.toBe(chain21);
    });
  });

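  describe('eth_getProof', () => {
    const provider = getProvider(20);

    it('should return valid account proof', async () => {
      // accountAddress and storageKeys come from the test setup;
      // blockNumber is a hex quantity or a tag such as 'latest'
      const proof = await provider.send('eth_getProof', [
        accountAddress,
        storageKeys,
        blockNumber
      ]);

      expect(proof).toHaveProperty('accountProof');
      expect(proof).toHaveProperty('storageProof');
      // Verify proof can be validated
    });

    it('should work for last 1024 blocks only', async () => {
      // Assumes the chain is already more than 1025 blocks long
      const currentBlock = await provider.getBlockNumber();
      // Block parameters must be hex-encoded quantities, not JS numbers
      const oldBlockNumber = '0x' + (currentBlock - 1025).toString(16);

      await expect(
        provider.send('eth_getProof', [accountAddress, [], oldBlockNumber])
      ).rejects.toThrow(/proof not available/i);
    });
  });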

  describe('eth_getLogs', () => {
    it('should filter logs by address', async () => { /* ... */ });
    it('should filter logs by topics', async () => { /* ... */ });
    it('should handle fromBlock and toBlock', async () => { /* ... */ });
    it('should respect block range limits', async () => { /* ... */ });
  });

  describe('eth_estimateGas', () => {
    it('should estimate gas for simple transfer', async () => { /* ... */ });
    it('should estimate gas for contract deployment', async () => { /* ... */ });
    it('should estimate gas for complex contract call', async () => { /* ... */ });
  });
});
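Several of these checks depend on hex quantity encoding, which the Ethereum JSON-RPC convention defines as a `0x` prefix with no leading zeros and `0x0` for zero. A small sketch of helpers a test suite might use follows; `toQuantity`, `fromQuantity`, and `getProofParams` are hypothetical names, not part of any library assumed by this document.

```typescript
// Encode a non-negative integer as a JSON-RPC quantity.
function toQuantity(n: number | bigint): string {
  const v = BigInt(n);
  if (v < BigInt(0)) throw new Error('quantities must be non-negative');
  return '0x' + v.toString(16);
}

// Decode a quantity, rejecting non-canonical forms such as "0x01" or "0x".
function fromQuantity(q: string): bigint {
  if (!/^0x(0|[1-9a-fA-F][0-9a-fA-F]*)$/.test(q)) {
    throw new Error(`not a canonical quantity: ${q}`);
  }
  return BigInt(q);
}

// Build eth_getProof parameters, encoding a numeric block height as a quantity.
function getProofParams(
  address: string,
  storageKeys: string[],
  block: number | 'latest',
): [string, string[], string] {
  return [address, storageKeys, block === 'latest' ? block : toQuantity(block)];
}
```

Asserting on decoded values (`fromQuantity(chainId) === 1809n`, for instance) keeps tests readable and avoids hand-converted hex literals.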

4.2.2 Error Handling and Edge Cases

  • Invalid parameters (wrong types, out of range)
  • Missing or non-existent resources (blocks, transactions, accounts)
  • Rate limiting behavior (if applicable)
  • Large result sets (pagination, limits)
  • Concurrent request handling
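Error-handling tests are easier to write against error codes than against message text. The sketch below classifies codes using the ranges reserved by the JSON-RPC 2.0 specification; which code the Chainweb EVM node returns for a given failure is not stated in this document and would need to be observed. The helper names are hypothetical.

```typescript
// Error codes defined by the JSON-RPC 2.0 specification.
const STANDARD_ERRORS: Record<number, string> = {
  [-32700]: 'Parse error',
  [-32600]: 'Invalid Request',
  [-32601]: 'Method not found',
  [-32602]: 'Invalid params',
  [-32603]: 'Internal error',
};

interface RpcErrorResponse {
  jsonrpc: '2.0';
  id: number | string | null;
  error: { code: number; message: string; data?: unknown };
}

// Type guard: true when a parsed response body carries a JSON-RPC error object.
function isRpcError(body: unknown): body is RpcErrorResponse {
  if (typeof body !== 'object' || body === null) return false;
  const err = (body as { error?: unknown }).error;
  return (
    typeof err === 'object' &&
    err !== null &&
    typeof (err as { code?: unknown }).code === 'number'
  );
}

// Map a code to its category; -32000..-32099 is reserved for server-defined errors.
function classifyRpcError(code: number): string {
  if (code in STANDARD_ERRORS) return STANDARD_ERRORS[code];
  if (code >= -32099 && code <= -32000) return 'Server error';
  return 'Application error';
}
```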

4.3 Cross-Chain Bridging Tests

Priority: CRITICAL
Estimated Effort: 4-6 weeks

4.3.1 SPV Proof Generation and Validation

Reference: docs/bridging-protocol.md, SimpleToken.sol:182-259

Test Scenarios:

  1. Valid Cross-Chain Transfer:

    describe('Cross-Chain Transfer', () => {
      it('should complete full cross-chain transfer flow', async () => {
        // Step 1: Initialize transfer on source chain (e.g., chain 20)
        const tx = await token20.transferCrossChain(receiver, amount, targetChain);
        await tx.wait();
    
        // Step 2: Get SPV proof from endpoint
        const proof = await getProof(tx.hash, sourceChain);
    
        // Step 3: Redeem on target chain (e.g., chain 21)
        const redeemTx = await token21.redeemCrossChain(proof);
        await redeemTx.wait();
    
        // Step 4: Verify balances
        expect(await token20.balanceOf(sender)).toBe(initialBalance - amount);
        expect(await token21.balanceOf(receiver)).toBe(amount);
      });
    });
  2. Proof Validation Tests:

    • Valid proof acceptance
    • Invalid proof rejection (tampered data)
    • Proof replay prevention (same proof used twice)
    • Expired proof handling (if applicable)
    • Wrong target chain proof submission
    • Wrong target contract proof submission
  3. Multi-Hop Transfers:

    it('should handle sequential cross-chain transfers', async () => {
      // Chain 20 -> Chain 21 -> Chain 22
      // Verify state consistency across all chains
    });
  4. Concurrent Cross-Chain Operations:

    it('should handle concurrent transfers between multiple chains', async () => {
      // Simultaneously initiate transfers:
      // Chain 20 -> Chain 21
      // Chain 21 -> Chain 22
      // Chain 22 -> Chain 20
      // Verify all complete successfully
    });
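The replay and wrong-target scenarios above can be checked against a simple reference model that records which transfers have been redeemed. The following is a hypothetical in-memory model for use as a test oracle; it does not reflect how SimpleToken.sol or the precompile track redemptions, and `transferId` stands in for whatever uniquely identifies a proof.

```typescript
type RedeemResult = 'ok' | 'replayed' | 'wrong-chain';

interface TransferProof {
  transferId: string; // hypothetical unique identifier for the source transfer
  targetChain: number;
}

// Reference model of one target chain's redemption rules.
class RedeemModel {
  private readonly chain: number;
  private readonly redeemed = new Set<string>();

  constructor(chain: number) {
    this.chain = chain;
  }

  redeem(proof: TransferProof): RedeemResult {
    // A proof addressed to another chain must never be accepted here.
    if (proof.targetChain !== this.chain) return 'wrong-chain';
    // Each transfer may be redeemed at most once.
    if (this.redeemed.has(proof.transferId)) return 'replayed';
    this.redeemed.add(proof.transferId);
    return 'ok';
  }
}
```

In a concurrent test, every on-chain redemption outcome can be compared with what the model returns for the same sequence of proofs.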

4.3.2 Precompile Tests

VALIDATE_PROOF_PRECOMPILE (0x48C3b4d2757447601776837B6a85F31EF88A87bf):

  • Verify SPV proof validation logic
  • Test with valid Chainweb proofs
  • Test with invalid/malformed proofs
  • Gas cost analysis

CHAIN_ID_PRECOMPILE (0x9b02c3e2dF42533e0FD166798B5A616f59DBd2cc):

  • Verify correct chain ID returned on each chain (20-24)
  • Integration with smart contracts

Test Structure:

// Test contract for precompile validation
contract PrecompileTest {
    address constant VALIDATE_PROOF = 0x48C3b4d2757447601776837B6a85F31EF88A87bf;
    address constant CHAIN_ID = 0x9b02c3e2dF42533e0FD166798B5A616f59DBd2cc;

    function testChainId() public view returns (uint256) {
        (bool success, bytes memory data) = CHAIN_ID.staticcall("");
        require(success, "ChainID precompile failed");
        return abi.decode(data, (uint256));
    }

    function testValidateProof(bytes memory proof) public returns (bool) {
        (bool success, bytes memory data) = VALIDATE_PROOF.call(proof);
        require(success, "ValidateProof precompile failed");
        return abi.decode(data, (bool));
    }
}
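An off-chain test that calls these precompiles through `eth_call` receives raw return data and has to decode it. The sketch below assumes, as the Solidity contract above does, that each precompile returns a single 32-byte ABI word; that encoding is an assumption to confirm against the precompile specification. `decodeWord` and `decodeBool` are hypothetical helpers.

```typescript
// Decode a single 32-byte ABI word as an unsigned integer.
function decodeWord(returnData: string): bigint {
  const hex = returnData.startsWith('0x') ? returnData.slice(2) : returnData;
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) throw new Error('expected exactly 32 bytes');
  return BigInt('0x' + hex);
}

// Decode a 32-byte ABI word as a bool, rejecting values other than 0 and 1.
function decodeBool(returnData: string): boolean {
  const v = decodeWord(returnData);
  if (v !== BigInt(0) && v !== BigInt(1)) throw new Error('not a canonical ABI bool');
  return v === BigInt(1);
}
```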

4.4 Consensus and Multi-Chain Tests

Priority: HIGH
Estimated Effort: 4-5 weeks

4.4.1 Block Production Tests

  • Block production timing (2 second default)
  • Block propagation across chains
  • Chain synchronization (cut-height advancement)
  • Fork handling and resolution
  • Block reorganization scenarios

Test Cases:

describe('Block Production', () => {
  it('should produce blocks at consistent intervals', async () => {
    const provider = getProvider(20);
    const block1 = await provider.getBlock('latest');
    await sleep(2000); // Wait for next block
    const block2 = await provider.getBlock('latest');

    expect(block2.number).toBe(block1.number + 1);
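    // toBeCloseTo takes a digit precision, not a tolerance, so compare the gap explicitly
    expect(Math.abs(block2.timestamp - block1.timestamp - 2)).toBeLessThanOrEqual(0.5);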
  });

  it('should advance cut-height across all chains', async () => {
    const initialCutHeight = await getCutHeight();
    await waitForBlocks(10);
    const finalCutHeight = await getCutHeight();

    expect(finalCutHeight).toBeGreaterThan(initialCutHeight);
  });

  it('should handle temporary network partition', async () => {
    // Simulate network split
    // Verify chains continue producing blocks
    // Reconnect and verify consensus convergence
  });
});
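A single sleep-and-compare check is sensitive to scheduling jitter, so interval assertions are usually steadier over a run of consecutive blocks. A minimal sketch of the statistics involved, assuming block timestamps in whole seconds collected in height order; `intervalStats` is a hypothetical helper.

```typescript
interface IntervalStats {
  count: number; // number of gaps between consecutive blocks
  mean: number;
  min: number;
  max: number;
}

// Summarize the gaps between consecutive block timestamps.
function intervalStats(timestamps: number[]): IntervalStats {
  if (timestamps.length < 2) throw new Error('need at least two timestamps');
  const gaps = timestamps.slice(1).map((t, i) => t - timestamps[i]);
  const sum = gaps.reduce((acc, g) => acc + g, 0);
  return {
    count: gaps.length,
    mean: sum / gaps.length,
    min: Math.min(...gaps),
    max: Math.max(...gaps),
  };
}
```

A test could then assert that `mean` stays near the 2 second target and that `max` stays under a chosen ceiling, instead of requiring every individual gap to be exact.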

4.4.2 Mining and Rewards

  • Mining client functionality
  • Block reward distribution
  • Transaction fee distribution
  • Mining difficulty (if applicable)

Existing Test: verify-miner-rewards.test.ts (expand coverage)

4.4.3 Multi-Node Coordination

  • Node discovery and peering
  • State synchronization between nodes
  • Transaction propagation
  • Block propagation
  • Byzantine fault tolerance scenarios

Existing Test: run-multinode-with-miners.test.ts (expand coverage)

4.5 Performance and Load Tests

Priority: MEDIUM
Estimated Effort: 3-4 weeks

4.5.1 Transaction Throughput

  • Maximum transactions per second (TPS) per chain
  • TPS across all EVM chains (20-24) combined
  • Transaction pool management under load
  • Memory pool eviction policies

Test Structure:

describe('Performance Tests', () => {
  it('should handle 100 concurrent transactions', async () => {
    const txPromises = Array(100).fill(null).map((_, i) =>
      sendTransaction({ nonce: i /* , ...remaining transaction fields */ })
    );

    const results = await Promise.allSettled(txPromises);
    const successful = results.filter(r => r.status === 'fulfilled');

    expect(successful.length).toBeGreaterThan(95); // >95% success rate
  });

  it('should measure TPS under sustained load', async () => {
    const duration = 60000; // 1 minute
    const startTime = Date.now();
    let txCount = 0;

    while (Date.now() - startTime < duration) {
      await sendTransaction({ nonce: txCount++ });
    }

    const tps = txCount / (duration / 1000);
    console.log(`Achieved TPS: ${tps}`);
    expect(tps).toBeGreaterThan(10); // Baseline expectation
  });
});
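Throughput alone hides tail latency, so load tests commonly report percentiles as well. The sketch below uses the nearest-rank percentile definition and computes TPS from measured elapsed time; both helpers are hypothetical, and the choice of nearest-rank over interpolation is an assumption.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Transactions per second over the measured elapsed time, not the planned duration.
function throughput(txCount: number, elapsedMs: number): number {
  if (elapsedMs <= 0) throw new Error('elapsed time must be positive');
  return txCount / (elapsedMs / 1000);
}
```

Using the measured elapsed time matters in the sustained-load loop above, because the final `await` can finish after the nominal duration has passed.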

4.5.2 Contract Execution Performance

  • Gas benchmarks for common operations
  • Large contract execution time
  • Recursive call depth limits
  • Memory-intensive operations

4.5.3 State Size and Scaling

  • State growth over time
  • State pruning behavior
  • Archive node vs. full node performance
  • Database performance under large state

4.6 Infrastructure and Devnet Tests

Priority: HIGH
Estimated Effort: 2-3 weeks

4.6.1 Docker Compose Tests

Reference: devnet/compose.py

Existing Coverage:

  • Docker compose generation
  • Node start/stop/restart
  • Container health checks

Additional Tests Needed:

  • Multi-configuration validation (frontend-dev, app-dev, minimal, production)
  • Exposed chains configuration (--exposed-chains parameter)
  • Resource limits enforcement
  • Volume persistence across restarts
  • Network isolation between projects

Test Structure:

describe('Docker Compose Generation', () => {
  it('should generate valid docker-compose.yaml for frontend-dev', async () => {
    await exec('python3 devnet/compose.py --project frontend-dev > /tmp/test-compose.yaml');
    const compose = await readYaml('/tmp/test-compose.yaml');

    expect(compose.services).toHaveProperty('bootnode-consensus');
    expect(compose.services).toHaveProperty('bootnode-chain-20');
    // Verify all required services present
  });

  it('should expose only specified chains', async () => {
    await exec('python3 devnet/compose.py --exposed-chains "20,22" > /tmp/test-compose.yaml');
    const compose = await readYaml('/tmp/test-compose.yaml');

    // Verify only chains 20 and 22 have port mappings
    expect(compose.services['bootnode-chain-20'].ports).toBeDefined();
    expect(compose.services['bootnode-chain-22'].ports).toBeDefined();
    expect(compose.services['bootnode-chain-21'].ports).toBeUndefined();
  });
});
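Rather than asserting on each service by hand, the exposed set can be derived from the parsed compose file and compared with the requested chains. This sketch assumes the `bootnode-chain-N` service naming used in the tests above and a `ports` list on exposed services; `exposedChains` is a hypothetical helper.

```typescript
interface ComposeService {
  ports?: string[];
}
interface ComposeFile {
  services: Record<string, ComposeService>;
}

// Return the chain numbers whose bootnode service publishes at least one port.
function exposedChains(compose: ComposeFile): number[] {
  const chains: number[] = [];
  for (const name of Object.keys(compose.services)) {
    const match = /^bootnode-chain-(\d+)$/.exec(name);
    const ports = compose.services[name].ports;
    if (match && ports && ports.length > 0) {
      chains.push(Number(match[1]));
    }
  }
  return chains.sort((a, b) => a - b);
}
```

With this helper, the `--exposed-chains "20,22"` test reduces to `expect(exposedChains(compose)).toEqual([20, 22])`.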

4.6.2 Node Lifecycle Tests

Existing Tests:

  • discontinued-node.test.ts - Node failure scenarios
  • fast-block-production.test.ts - Block timing

Additional Tests:

  • Graceful shutdown and startup
  • State recovery after crash
  • Log rotation and retention
  • Configuration hot-reload (if supported)

4.7 Security and Edge Case Tests

Priority: MEDIUM
Estimated Effort: 2-3 weeks

4.7.1 Transaction Edge Cases

  • Maximum transaction size
  • Minimum transaction size
  • Transaction with empty data
  • Transaction to zero address
  • Transaction with zero value
  • Transactions with excessive gas limit

4.7.2 Contract Edge Cases

Existing Test: large.test.ts - Contracts >24KB

Additional Cases:

  • Contract deployment to existing address
  • Contract self-destruct scenarios
  • Contracts with no code
  • Contracts with invalid bytecode
  • CREATE2 address collision attempts

4.7.3 Reentrancy and Attack Vectors

  • Reentrancy attack simulation
  • Front-running scenarios
  • Flash loan patterns (if applicable)
  • Gas griefing attacks

5. Implementation Plan

5.1 Phase 1: Ethereum Test Suite Integration (Weeks 1-4)

Deliverables:

  • Clone and integrate ethereum/tests repository
  • Implement test runner for state tests
  • Implement test runner for VM tests
  • Implement test runner for transaction tests
  • Implement test runner for blockchain tests
  • Document compatibility matrix (which tests pass/fail)

Tasks:

  1. Set up test infrastructure:

    cd tests
    mkdir -p ethereum/{state,vm,transaction,blockchain}
    git clone --depth 1 https://github.com/ethereum/tests.git ethereum-tests
  2. Create test runners:

    // tests/ethereum/runner/state-runner.ts
    export class StateTestRunner {
      async loadFixture(path: string): Promise<StateTest[]> { /* ... */ }
      async runTest(test: StateTest): Promise<TestResult> { /* ... */ }
      compareState(actual: State, expected: State): boolean { /* ... */ }
    }
  3. Implement test execution:

    // tests/ethereum/state-tests.test.ts
    import { StateTestRunner } from './runner/state-runner';
    import { glob } from 'glob';
    
    describe('Ethereum State Tests', () => {
      const runner = new StateTestRunner();
      const testFiles = glob.sync('ethereum-tests/GeneralStateTests/**/*.json');
    
      testFiles.forEach(file => {
        it(`should pass ${file}`, async () => {
          const tests = await runner.loadFixture(file);
          for (const test of tests) {
            const result = await runner.runTest(test);
            expect(result.passed).toBe(true);
          }
        });
      });
    });
  4. Track compatibility:

    // tests/ethereum/compatibility-matrix.json
    {
      "timestamp": "2025-10-07T10:00:00Z",
      "state_tests": {
        "total": 1000,
        "passed": 950,
        "failed": 50,
        "skipped": 0,
        "pass_rate": 95.0
      },
      "vm_tests": { /* ... */ },
      "failed_tests": [
        {
          "name": "TestName",
          "category": "state",
          "reason": "Gas calculation mismatch",
          "issue_link": "https://github.com/kadena-io/kadena-evm/issues/123"
        }
      ]
    }
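The per-category block of the matrix can be derived from raw outcomes. A minimal sketch follows, assuming `pass_rate` is computed over all tests including skipped ones and rounded to one decimal place; the document's sample has no skipped tests, so that choice is an assumption. `summarize` is a hypothetical helper.

```typescript
type Outcome = 'passed' | 'failed' | 'skipped';

interface CategorySummary {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  pass_rate: number; // percentage, one decimal place
}

// Aggregate individual test outcomes into one compatibility-matrix entry.
function summarize(outcomes: Outcome[]): CategorySummary {
  const count = (o: Outcome) => outcomes.filter((x) => x === o).length;
  const total = outcomes.length;
  const passed = count('passed');
  return {
    total,
    passed,
    failed: count('failed'),
    skipped: count('skipped'),
    // Integer arithmetic before the final division avoids rounding surprises.
    pass_rate: total === 0 ? 0 : Math.round((passed * 1000) / total) / 10,
  };
}
```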

5.2 Phase 2: RPC and Cross-Chain Tests (Weeks 5-8)

Deliverables:

  • Complete RPC endpoint test coverage
  • Cross-chain transfer test suite
  • Precompile validation tests
  • SPV proof generation/validation tests

Tasks:

  1. RPC test implementation:

    // tests/src/rpc/
    // ├── endpoints.test.ts      // All RPC endpoints
    // ├── errors.test.ts         // Error handling
    // ├── edge-cases.test.ts     // Edge cases
    // └── performance.test.ts    // Response time benchmarks
  2. Cross-chain test implementation:

    // tests/src/cross-chain/
    // ├── bridging.test.ts       // Full transfer flows
    // ├── proofs.test.ts         // SPV proof validation
    // ├── precompiles.test.ts    // Precompile testing
    // └── stress.test.ts         // Concurrent operations
  3. Expand existing integration tests:

    // Enhance solidity/test/SimpleToken.integration.test.js
    // Add more cross-chain scenarios:
    // - Multi-hop transfers
    // - Proof replay prevention
    // - Invalid proof handling
    // - Concurrent transfers

5.3 Phase 3: Consensus and Performance (Weeks 9-12)

Deliverables:

  • Block production validation suite
  • Multi-node coordination tests
  • Performance benchmarking framework
  • Load testing scenarios

Tasks:

  1. Consensus tests:

    // tests/e2e/consensus/
    // ├── block-production.test.ts
    // ├── fork-resolution.test.ts
    // ├── chain-sync.test.ts
    // └── cut-advancement.test.ts
  2. Performance framework:

    // tests/performance/
    // ├── benchmarks/
    // │   ├── tps.benchmark.ts
    // │   ├── gas.benchmark.ts
    // │   └── state-growth.benchmark.ts
    // ├── load-tests/
    // │   ├── sustained-load.test.ts
    // │   └── burst-load.test.ts
    // └── reports/
    //     └── benchmark-results.json
  3. Automated benchmarking:

    #!/bin/bash
    # scripts/run-benchmarks.sh
    bun run performance:tps > reports/tps-$(date +%Y%m%d).json
    bun run performance:gas > reports/gas-$(date +%Y%m%d).json
    # Generate comparison report
    node scripts/compare-benchmarks.js
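The comparison step could work along the following lines. This is a sketch only: the report shape (a flat map of metric name to number), the 10% default threshold, and the `findRegressions` name are assumptions, and the real comparison script may differ.

```typescript
interface Regression {
  metric: string;
  baseline: number;
  current: number;
  change: number; // relative change, e.g. -0.15 for a 15% drop
}

// Flag metrics that moved in the wrong direction by more than the threshold.
// higherIsBetter is true for TPS-style metrics and false for gas or latency.
function findRegressions(
  baseline: Record<string, number>,
  current: Record<string, number>,
  threshold: number = 0.1,
  higherIsBetter: boolean = true,
): Regression[] {
  const regressions: Regression[] = [];
  for (const metric of Object.keys(baseline)) {
    const before = baseline[metric];
    const after = current[metric];
    // Skip metrics that are missing or have no usable baseline.
    if (after === undefined || before === 0) continue;
    const change = (after - before) / before;
    const worse = higherIsBetter ? change < -threshold : change > threshold;
    if (worse) regressions.push({ metric, baseline: before, current: after, change });
  }
  return regressions;
}
```

A CI step can fail the build when the returned list is non-empty and print each entry so the offending metric is visible in the log.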

5.4 Phase 4: Infrastructure and Edge Cases (Weeks 13-16)

Deliverables:

  • Docker compose configuration tests
  • Node lifecycle tests
  • Security and edge case coverage
  • Documentation and CI/CD integration

Tasks:

  1. Infrastructure tests:

    // tests/infrastructure/
    // ├── compose-generation.test.ts
    // ├── node-lifecycle.test.ts
    // ├── configuration.test.ts
    // └── resources.test.ts
  2. Edge case coverage:

    // tests/edge-cases/
    // ├── transactions.test.ts
    // ├── contracts.test.ts
    // ├── state.test.ts
    // └── security.test.ts
  3. CI/CD integration:

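    # .github/workflows/evm-tests.yml
    name: EVM Test Suite
    
    on: [push, pull_request]
    
    # Each job runs on a fresh runner, so each needs its own checkout and Bun setup.
    # Every `run` step starts in the repository root, hence the explicit `cd tests`.
    jobs:
      ethereum-tests:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - uses: oven-sh/setup-bun@v1
          - run: cd tests && bun install
          - run: cd tests && bun run test:ethereum
    
      rpc-tests:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - uses: oven-sh/setup-bun@v1
          - run: cd tests && bun install
          - run: ./network devnet start
          - run: cd tests && bun run test:rpc
    
      cross-chain-tests:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - uses: oven-sh/setup-bun@v1
          - run: cd tests && bun install
          - run: ./network devnet start
          - run: cd tests && bun run test:cross-chain
    
      performance-tests:
        runs-on: ubuntu-latest
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        steps:
          - uses: actions/checkout@v3
          - uses: oven-sh/setup-bun@v1
          - run: cd tests && bun install
          - run: cd tests && bun run test:performance
          - run: node scripts/upload-benchmark-results.js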

5.5 Phase 5: Documentation and Refinement (Weeks 17-18)

Deliverables:

  • Comprehensive test documentation
  • Test troubleshooting guide
  • Contribution guidelines for tests
  • CI/CD dashboard and reporting

Tasks:

  1. Documentation:

    # tests/TESTING.md
    
    ## Running Tests
    
    ### Prerequisites
    - Bun installed
    - Docker running
    - 4+ CPU cores, 8+ GB RAM
    
    ### Test Categories
    
    #### Unit Tests
    ```bash
    bun run test:unit
    ```
    
    #### Integration Tests
    ```bash
    ./network devnet start
    bun run test:integration
    ```
    
    #### E2E Tests
    ```bash
    ./network devnet start
    bun run test:e2e
    ```
    
    #### Ethereum Tests
    ```bash
    bun run test:ethereum
    ```
    
    ## Troubleshooting
    
    Issue: Tests time out
    Solution: Increase timeout in package.json or check devnet status
    
    Issue: Docker containers fail to start
    Solution: Run `docker compose down --volumes` and restart
  2. Contribution guide:

    # tests/CONTRIBUTING.md
    
    ## Adding New Tests
    
    1. Choose appropriate category (unit/integration/e2e/ethereum)
    2. Create test file following naming convention: `feature.test.ts`
    3. Use describe/it blocks for organization
    4. Include comments explaining test purpose
    5. Add to relevant test suite in package.json
    
    ## Test Writing Guidelines
    
    - Test one thing per `it` block
    - Use descriptive test names
    - Clean up resources (close connections, reset state)
    - Use appropriate timeout for test type
    - Include both positive and negative cases

6. Verification Strategy

6.1 Test Quality Criteria

Each test must meet the following criteria:

  1. Clarity: Test name clearly describes what is being tested
  2. Independence: Test can run in isolation without dependencies on other tests
  3. Repeatability: Test produces consistent results across runs
  4. Speed: Test completes within reasonable time (unit: <1s, integration: <30s, e2e: <5min)
  5. Cleanup: Test properly cleans up resources (connections, files, state)

6.2 Code Review Checklist

  • Test file follows naming convention (*.test.ts)
  • Test includes describe/it structure
  • Test has meaningful assertions (not just checking for absence of errors)
  • Test handles both success and failure cases
  • Test includes comments explaining complex logic
  • Test uses appropriate timeout value
  • Test cleans up after itself
  • Test is added to appropriate test suite in package.json

6.3 Coverage Metrics

Minimum Coverage Targets:

| Component | Line Coverage | Branch Coverage |
|---|---|---|
| Smart Contracts | 90% | 85% |
| Test Runners | 80% | 75% |
| Integration Code | 70% | 65% |

Coverage Tools:

# Solidity coverage
cd solidity
npx hardhat coverage

# TypeScript coverage
cd tests
bun run test --coverage

6.4 Continuous Integration

CI Pipeline Stages:

  1. Fast Tests (on every PR):

    • Unit tests (<5 min)
    • Linting and formatting checks
  2. Integration Tests (on every PR):

    • RPC tests
    • Basic cross-chain tests
    • Infrastructure tests
  3. Full Test Suite (on merge to main):

    • All integration tests
    • E2E tests
    • Ethereum test suite
    • Performance benchmarks
  4. Nightly Tests:

    • Long-running stress tests
    • Comprehensive ethereum/tests execution
    • Performance regression analysis

CI Success Criteria:

  • All unit tests pass
  • >95% of integration tests pass
  • >90% of ethereum/tests pass
  • No performance regression >10% from baseline

7. Dependencies

7.1 Infrastructure Requirements

Hardware:

  • CI runners: 4+ CPU cores, 8 GB RAM per runner
  • Performance testing: 8+ CPU cores, 16 GB RAM
  • Storage: 100+ GB for test artifacts and logs

Software:

  • Docker 20.10+
  • Docker Compose 2.0+
  • Bun 1.0+
  • Node.js 22+ (for Hardhat)
  • Python 3.13+ with uv (for devnet compose generation)

7.2 External Dependencies

Repositories:

  • ethereum/tests - Official Ethereum test suite
  • Kadena Chainweb EVM node images
  • Reth client integration

Tools:

  • Hardhat (smart contract testing)
  • ethers.js / viem (Ethereum library)
  • @kadena/client (Pact interaction)
  • @kadena/hardhat-chainweb (multi-chain deployment)
  • @kadena/hardhat-kadena-create2 (deterministic deployment)

7.3 Team Dependencies

Required Expertise:

  • EVM internals knowledge (opcodes, state transitions)
  • Chainweb consensus understanding
  • Cross-chain bridging concepts
  • Test framework proficiency (Bun, Hardhat)
  • Docker and infrastructure

Collaboration Needs:

  • EVM core team: Clarification on implementation details
  • Consensus team: Multi-chain coordination edge cases
  • DevOps team: CI/CD pipeline setup
  • Documentation team: Test documentation review

8. Success Criteria

8.1 Functional Requirements

  • Ethereum Compatibility: >95% of applicable ethereum/tests pass
  • RPC Coverage: 100% of documented RPC endpoints tested
  • Cross-Chain: All bridging scenarios covered with tests
  • Consensus: Block production, fork handling, and chain sync validated
  • Performance: Baseline TPS and gas benchmarks established
  • Infrastructure: Docker compose configurations fully tested

8.2 Non-Functional Requirements

  • CI/CD Integration: Tests run automatically on every PR
  • Test Execution Time: Full test suite completes in <15 minutes on CI
  • Documentation: Every test category has documentation
  • Maintainability: Test code follows project conventions
  • Debuggability: Failed tests provide clear error messages
  • Monitoring: Test results tracked over time with dashboards

8.3 Acceptance Criteria

For Task Completion:

  1. All test categories implemented (Phases 1-4)
  2. Documentation complete (Phase 5)
  3. CI/CD pipeline operational
  4. >95% ethereum/tests passing (excluding known incompatibilities)
  5. All RPC endpoints validated
  6. Cross-chain bridging fully tested
  7. Performance baseline established
  8. Code review approved by 2+ team members
  9. All deliverables merged to main branch

For Production Readiness:

  1. All critical tests passing consistently
  2. No known test flakiness
  3. Performance benchmarks stable over 1 week
  4. Documentation reviewed and published
  5. Team trained on running and interpreting tests

9. References

9.1 External Resources

9.2 Internal Documentation

  • CLAUDE.md - Project overview and setup instructions
  • docs/bridging-protocol.md - Cross-chain bridging specification
  • tests/README.md - Test suite getting started guide
  • tests/src/dx/README.md - DX test suite documentation
  • solidity/README.md - Solidity project documentation

9.3 NPM Packages

9.4 Related Tasks (from Asana Screenshot)

  • Parent Task: Refactoring and Testing
  • Blocking Dependency: FS (File System) - dependency listed as blocking
  • Related Projects:
    • /devnet/compose.py - Docker compose generator
    • /tests folder - Test suite location
  • Repository Branch: ag/tests/20-evm
  • Assignee: Albert Groothedde
  • Due Date: December 8, 2025

10. Risk Assessment and Mitigation

10.1 Technical Risks

| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| Ethereum tests incompatible with Chainweb EVM | High | Medium | Document incompatibilities, focus on applicable tests |
| Performance tests flaky due to timing | Medium | High | Use statistical analysis, multiple runs, tolerance ranges |
| Cross-chain tests timeout frequently | Medium | Medium | Increase timeouts, optimize test flow, parallel execution |
| CI infrastructure insufficient | High | Low | Reserve dedicated CI runners, optimize test execution |
| Test maintenance burden too high | Medium | Medium | Automate test generation, clear documentation, tooling |
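
The mitigation for timing-sensitive performance tests (multiple runs, tolerance ranges) can be sketched as follows. This is an illustrative approach rather than the project's benchmark harness: the function names and the 10% default tolerance are assumptions. It compares the median of several runs against a stored baseline so that a single outlier run does not fail the check.

```typescript
// Median of a non-empty sample set; robust against a single outlier run.
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error('median requires at least one sample');
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Flags a regression only when the median TPS falls below the baseline
// by more than the given relative tolerance (assumed default: 10%).
function isTpsRegression(samples: number[], baselineTps: number, tolerance = 0.1): boolean {
  return median(samples) < baselineTps * (1 - tolerance);
}
```

For example, runs of `[100, 95, 400, 98, 97]` against a baseline of 100 have a median of 98 and are not flagged, while `[80, 82, 85]` has a median of 82 and is.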

10.2 Schedule Risks

| Risk | Impact | Mitigation |
|------|--------|------------|
| Ethereum test integration takes longer than estimated | High | Prioritize subset of tests, phase remaining |
| Resource availability (reviewers, infrastructure) | Medium | Early communication, backup reviewers |
| Scope creep (additional test requirements) | Medium | Strict phase boundaries, prioritization matrix |
| Dependencies on EVM core team for clarifications | Medium | Async communication, document assumptions |

10.3 Quality Risks

| Risk | Impact | Mitigation |
|------|--------|------------|
| Tests pass but don't catch real bugs | High | Code review, mutation testing, real bug validation |
| Spurious failures (flaky tests) | High | Retry logic, statistical significance, test isolation |
| Insufficient edge case coverage | Medium | Systematic edge case enumeration, fuzzing |
| Test code quality degradation | Medium | Code reviews, tests for test helpers, refactoring sprints |
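
One way to make edge case enumeration systematic is to list boundary values per input field and generate every combination. The sketch below is illustrative only: `enumerateCases` is a hypothetical helper, and the field names and boundary values are assumptions rather than a list taken from the test plan.

```typescript
type Axes = Record<string, readonly string[]>;

// Builds the Cartesian product of all axes, one record per combination.
function enumerateCases(axes: Axes): Array<Record<string, string>> {
  let cases: Array<Record<string, string>> = [{}];
  for (const [name, values] of Object.entries(axes)) {
    const next: Array<Record<string, string>> = [];
    for (const partial of cases) {
      for (const value of values) {
        next.push({ ...partial, [name]: value });
      }
    }
    cases = next;
  }
  return cases;
}

// Illustrative boundary values for a transaction (hex-encoded quantities).
const txEdgeCases = enumerateCases({
  value: ['0x0', '0x1', '0x' + 'f'.repeat(64)], // zero, one, max uint256
  gasLimit: ['0x5208', '0x0'],                  // 21000 (plain transfer), zero
  data: ['0x', '0x' + '00'.repeat(32)],         // empty calldata, 32 zero bytes
});
```

Each generated record can then drive one parameterized test, so adding a boundary value to an axis extends coverage without writing new test bodies.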

11. Future Enhancements

11.1 Beyond Initial Scope

  1. Fuzzing and Property-Based Testing:

    • Integrate fuzzing tools (Echidna, Foundry)
    • Generate random test cases for edge case discovery
  2. Formal Verification:

    • Formal verification of critical components
    • Model checking for consensus protocols
  3. Chaos Engineering:

    • Random network partitions
    • Random node failures
    • Byzantine behavior simulation
  4. Advanced Performance Testing:

    • Stress testing under extreme load
    • Long-running stability tests (days/weeks)
    • Resource leak detection
  5. Security Testing Automation:

    • Automated vulnerability scanning
    • Smart contract security analysis integration
    • Penetration testing scenarios

11.2 Tooling Improvements

  1. Test Dashboard:

    • Real-time test execution monitoring
    • Historical trends and regression detection
    • Test coverage visualization
  2. Test Generator:

    • Automated test generation from specifications
    • Contract-based test generation
    • API endpoint test generation from OpenAPI specs
  3. Debugging Aids:

    • Interactive test replay
    • State inspection tools
    • Transaction tracing integration

12. Appendix

12.1 Test File Organization

tests/
├── README.md                          # Getting started guide
├── TESTING.md                         # Comprehensive testing documentation
├── CONTRIBUTING.md                    # Test contribution guidelines
├── package.json                       # Test dependencies and scripts
├── tsconfig.json                      # TypeScript configuration
├── bun.lock                          # Bun lockfile
│
├── ethereum/                          # Ethereum standard tests
│   ├── state-tests.test.ts
│   ├── vm-tests.test.ts
│   ├── transaction-tests.test.ts
│   ├── blockchain-tests.test.ts
│   ├── runner/                       # Test execution framework
│   │   ├── state-runner.ts
│   │   ├── vm-runner.ts
│   │   ├── transaction-runner.ts
│   │   └── fixtures.ts
│   ├── ethereum-tests/               # Cloned from github.com/ethereum/tests
│   └── compatibility-matrix.json     # Test pass/fail tracking
│
├── src/                              # Integration tests
│   ├── rpc/                          # RPC endpoint tests
│   │   ├── endpoints.test.ts
│   │   ├── errors.test.ts
│   │   ├── edge-cases.test.ts
│   │   └── performance.test.ts
│   ├── cross-chain/                  # Cross-chain bridging tests
│   │   ├── bridging.test.ts
│   │   ├── proofs.test.ts
│   │   ├── precompiles.test.ts
│   │   └── stress.test.ts
│   ├── consensus/                    # Consensus tests
│   │   ├── block-production.test.ts
│   │   ├── fork-resolution.test.ts
│   │   └── chain-sync.test.ts
│   ├── dx/                           # Developer experience tests
│   │   ├── README.md
│   │   ├── token.test.ts
│   │   ├── zombie.test.ts
│   │   ├── tx.test.ts
│   │   └── large.test.ts
│   ├── hardhat.test.ts               # Hardhat integration
│   └── pact.test.ts                  # Pact interaction
│
├── e2e/                              # End-to-end tests
│   ├── run-multinode-with-miners.test.ts
│   ├── verify-miner-rewards.test.ts
│   ├── fast-block-production.test.ts
│   ├── discontinued-node.test.ts
│   ├── check-container-logs.test.ts
│   ├── consensus/
│   │   └── cut-advancement.test.ts
│   └── infrastructure/
│       ├── compose-generation.test.ts
│       ├── node-lifecycle.test.ts
│       └── configuration.test.ts
│
├── performance/                      # Performance tests
│   ├── benchmarks/
│   │   ├── tps.benchmark.ts
│   │   ├── gas.benchmark.ts
│   │   └── state-growth.benchmark.ts
│   ├── load-tests/
│   │   ├── sustained-load.test.ts
│   │   └── burst-load.test.ts
│   └── reports/                      # Generated benchmark results
│       └── .gitkeep
│
├── edge-cases/                       # Edge case tests
│   ├── transactions.test.ts
│   ├── contracts.test.ts
│   ├── state.test.ts
│   └── security.test.ts
│
├── utils/                            # Test utilities
│   ├── devnet.ts                     # Devnet interaction helpers
│   ├── chainweb.ts                   # Chainweb-specific utilities
│   ├── contracts.ts                  # Contract deployment helpers
│   └── assertions.ts                 # Custom assertions
│
├── fixtures/                         # Test fixtures
│   ├── contracts/                    # Sample contracts
│   ├── transactions/                 # Sample transactions
│   └── proofs/                       # Sample SPV proofs
│
└── scripts/                          # Test automation scripts
    ├── run-benchmarks.sh
    ├── compare-benchmarks.js
    ├── generate-report.js
    └── upload-results.js

12.2 Test Naming Conventions

File Naming:

  • Unit tests: component.test.ts
  • Integration tests: feature.test.ts
  • E2E tests: scenario.test.ts
  • Benchmarks: metric.benchmark.ts

Test Structure:

describe('ComponentName', () => {
  describe('methodName', () => {
    it('should do expected behavior when condition', async () => {
      // Arrange
      const input = setupInput();

      // Act
      const result = await methodUnderTest(input);

      // Assert
      expect(result).toBe(expected);
    });

    it('should throw error when invalid input', async () => {
      // Arrange
      const invalidInput = setupInvalidInput();

      // Act & Assert
      await expect(methodUnderTest(invalidInput))
        .rejects.toThrow(/expected error message/);
    });
  });
});

12.3 Common Test Patterns

Pattern 1: Multi-Chain Test Setup

import { runOverChains } from '@kadena/hardhat-chainweb';

describe('Multi-Chain Test', () => {
  it('should work across all EVM chains', async () => {
    await runOverChains(async (chainId) => {
      const provider = getProvider(chainId);
      // Test logic for each chain
    });
  });
});

Pattern 2: Cross-Chain Transfer

async function testCrossChainTransfer(
  sourceChain: number,
  targetChain: number
) {
  // Step 1: Initialize on source
  await switchChain(sourceChain);
  const initTx = await contract.transferCrossChain(/* ... */);
  await initTx.wait();

  // Step 2: Get proof
  const proof = await getProof(initTx.hash, sourceChain);

  // Step 3: Redeem on target
  await switchChain(targetChain);
  const redeemTx = await contract.redeemCrossChain(proof);
  await redeemTx.wait();

  // Step 4: Verify
  // ... assertions
}

Pattern 3: Devnet Health Check

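// Small delay helper used between polling attempts
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForDevnetReady() {
  const maxAttempts = 30;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      const blockNumber = await provider.getBlockNumber();
      if (blockNumber > 0) return;
    } catch (e) {
      // Node not reachable yet; fall through to the retry delay
    }
    // Wait before every retry, including when the node responds but is still at block 0
    await sleep(1000);
  }
  throw new Error('Devnet failed to start');
}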

12.4 Glossary

Terms:

  • SPV Proof: Simple Payment Verification proof, cryptographic proof of an event on a source chain
  • Cut Height: The sum of the block heights of all chains in a cut, where a cut is a consistent set of one block per chain in Chainweb
  • Precompile: Built-in smart contract with native implementation
  • Devnet: Local development network running in Docker
  • EVM Chains: Chains 20-24 running EVM payload provider
  • Pact Chains: Chains 0-19 running Pact smart contract execution
  • Chain ID Offset: Base value added to Chainweb chain ID to get EVM chain ID (sandbox: 1789, testnet: 5920)
  • BIP-44: Bitcoin Improvement Proposal 44, hierarchical deterministic wallet standard
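
The Chain ID Offset entry can be illustrated with a small mapping helper. This sketch reads the offset as the EVM chain ID of the first EVM chain (Chainweb chain 20), so that sandbox chain 20 maps to 1789. That reading, along with the function and constant names, is an assumption and should be checked against the network configuration.

```typescript
// From the glossary: EVM chains are Chainweb chains 20-24.
const FIRST_EVM_CHAIN = 20;
const LAST_EVM_CHAIN = 24;

// Offsets quoted in the glossary.
const CHAIN_ID_OFFSETS = { sandbox: 1789, testnet: 5920 } as const;
type Network = keyof typeof CHAIN_ID_OFFSETS;

// Assumption: the offset is the EVM chain ID of Chainweb chain 20,
// and later EVM chains count up from there.
function toEvmChainId(network: Network, chainwebChainId: number): number {
  if (
    !Number.isInteger(chainwebChainId) ||
    chainwebChainId < FIRST_EVM_CHAIN ||
    chainwebChainId > LAST_EVM_CHAIN
  ) {
    throw new RangeError(`Chainweb chain ${chainwebChainId} is not an EVM chain`);
  }
  return CHAIN_ID_OFFSETS[network] + (chainwebChainId - FIRST_EVM_CHAIN);
}
```

Keeping the mapping in one helper means tests refer to Chainweb chain numbers throughout and only convert at the point where an RPC client needs the EVM chain ID.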

Abbreviations:

  • TPS: Transactions Per Second
  • RPC: Remote Procedure Call
  • E2E: End-to-End
  • CI/CD: Continuous Integration / Continuous Deployment
  • PRD: Product Requirements Document

Document History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2025-10-07 | AI Assistant | Initial PRD creation based on Asana task analysis |

Next Steps:

  1. Review and approve PRD with stakeholders
  2. Create GitHub issues for each phase
  3. Assign team members to implementation tasks
  4. Set up project tracking in Asana/GitHub Projects
  5. Begin Phase 1 implementation