Product Requirements Document: EVM Test Suite Implementation

Document Version: 1.0
Last Updated: 2025-10-07
Status: Draft
Owner: Albert Groothedde
Due Date: December 8, 2025
Repository: kadena-io/kadena-evm-sandbox
Branch: ag/tests/20-evm

Executive Summary

This PRD outlines the comprehensive test suite implementation for Kadena Chainweb EVM, focusing on validating EVM compatibility, multi-chain functionality, and integration with the Ethereum test suite from github.com/ethereum/tests. The initiative aims to ensure robust, production-ready EVM execution on Kadena's multi-chain architecture (chains 20-24).

1. Project Overview

1.1 Background

Kadena Chainweb EVM is a multi-chain blockchain architecture where EVM execution runs on chains 20-24 alongside Pact smart contracts on chains 0-19. The key innovation is trustless cross-chain bridging using SPV (Simplified Payment Verification) proofs, enabled by Chainweb's braided architecture.

1.2 Purpose

Implement a comprehensive test suite that:

Validates EVM compatibility with Ethereum standards
Tests multi-chain operations and cross-chain bridging
Ensures consensus and state management correctness
Provides regression testing and continuous integration support
Verifies Reth client integration with Chainweb consensus

1.3 Scope

In Scope:

Ethereum test suite integration (state, transaction, blockchain tests)
RPC API compatibility testing
Multi-chain operations and cross-chain bridging tests
Consensus and block production validation
Gas metering and transaction execution tests
Infrastructure and devnet testing
Performance and load testing

Out of Scope:

Pact smart contract testing (chains 0-19)
UI/Frontend testing
Security penetration testing (covered separately)
Mainnet deployment testing

2. Current State Analysis

2.1 Existing Test Infrastructure

Location: /tests directory

Current Test Coverage:

E2E Tests (/tests/e2e/):
- discontinued-node.test.ts - Node failure scenarios
- run-multinode-with-miners.test.ts - Multi-node setup validation
- check-container-logs.test.ts - Docker container health checks
- fast-block-production.test.ts - Block timing validation
- verify-miner-rewards.test.ts - Mining reward verification
Integration Tests (/tests/src/):
- hardhat.test.ts - Hardhat integration
- pact.test.ts - Pact interaction tests
- DX Tests (/tests/src/dx/):
  - token.test.ts - ERC-20 deployment (Hardhat tutorial)
  - zombie.test.ts - CryptoZombies contract deployment
  - tx.test.ts - Invalid transaction testing (reused nonce, low gas, invalid signature)
  - large.test.ts - Large contract deployment (>24KB)
Solidity Tests (/solidity/test/):
- SimpleToken.test.js - Unit tests for ERC-20 with cross-chain features
- SimpleToken.integration.test.js - Cross-chain transfer integration tests

Test Infrastructure:

Framework: Bun (required, not Node.js)
Command: bun run test (from tests directory)
Timeout: 300,000ms (5 minutes) for integration tests
Docker: devnet/compose.py generates docker-compose.yaml configurations

2.2 Gaps Identified

Ethereum Standard Tests: No integration with official ethereum/tests repository
State Tests: Missing comprehensive state transition validation
RPC Coverage: Limited RPC endpoint testing (eth_getProof exists, needs more)
Blockchain Tests: No validation against Ethereum blockchain test vectors
VM Execution Tests: Missing low-level EVM opcode execution tests
Consensus Edge Cases: Limited testing of consensus failure scenarios
Performance Benchmarks: No systematic performance regression tests
Cross-Chain Stress Tests: Limited multi-chain coordination under load

3. Goals and Objectives

3.1 Primary Goals

Ethereum Compatibility: Achieve >95% compatibility with Ethereum test suite
Multi-Chain Validation: Ensure cross-chain operations work reliably across chains 20-24
Regression Prevention: Catch breaking changes before deployment
Performance Baseline: Establish performance benchmarks for future optimization
Developer Confidence: Provide comprehensive test coverage for contributors

3.2 Success Metrics

Metric	Target	Measurement
Ethereum test suite pass rate	>95%	ethereum/tests execution results
RPC endpoint coverage	100% of documented endpoints	RPC test suite results
Cross-chain test coverage	All bridging scenarios	Integration test results
CI/CD integration	<15 min test execution	GitHub Actions duration
Test documentation completeness	100% of tests documented	Documentation review

4. Test Categories

4.1 Ethereum Standard Tests (ethereum/tests)

Reference: https://github.com/ethereum/tests

Priority: HIGH
Estimated Effort: 5-8 weeks

4.1.1 State Tests

Validate state transitions for:

Account creation and destruction
Storage modifications (SSTORE, SLOAD)
Balance transfers
Contract deployments
Self-destruct operations
Code execution and gas consumption

Implementation Approach:

# Clone ethereum/tests repository
git clone https://github.com/ethereum/tests.git tests/ethereum-tests

# Test runner structure
tests/
  ethereum/
    state-tests.test.ts     # State transition tests
    vm-tests.test.ts        # VM execution tests
    blockchain-tests.test.ts # Blockchain validation tests
    runner/
      state-runner.ts       # State test executor
      fixtures.ts           # Test fixture loader

Test Format:

JSON test fixtures from ethereum/tests
Test runner parses JSON and executes against Chainweb EVM
Compare final state against expected results
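
The comparison step could look roughly like the sketch below. It is not the project's runner: the `AccountState` shape and the `diffState` name are assumptions made for illustration, and it presumes the fixture lists the expected accounts explicitly (as the `postState` section of blockchain fixtures does) rather than supplying only a state-root hash.

```typescript
// Hypothetical account shape; the field names are assumptions, not the project's types.
interface AccountState {
  balance: bigint;
  nonce: bigint;
  code: string; // hex string, "0x" when the account has no code
  storage: Record<string, bigint>; // storage slot (hex) -> value
}

// Address (lowercase hex) -> account.
type WorldState = Record<string, AccountState>;

// Lists every difference between `actual` and `expected` for the accounts named
// in `expected`. An empty result means the two states agree on those accounts.
function diffState(actual: WorldState, expected: WorldState): string[] {
  const mismatches: string[] = [];
  for (const address of Object.keys(expected)) {
    const want = expected[address];
    const got = actual[address];
    if (got === undefined) {
      mismatches.push(`${address}: account missing`);
      continue;
    }
    if (got.balance !== want.balance) {
      mismatches.push(`${address}: balance ${got.balance} != ${want.balance}`);
    }
    if (got.nonce !== want.nonce) {
      mismatches.push(`${address}: nonce ${got.nonce} != ${want.nonce}`);
    }
    if (got.code.toLowerCase() !== want.code.toLowerCase()) {
      mismatches.push(`${address}: code differs`);
    }
    // Unset slots read as zero in the EVM, so a missing key is treated as 0n.
    const slots = new Set(Object.keys(want.storage).concat(Object.keys(got.storage)));
    slots.forEach((slot) => {
      const gotValue = got.storage[slot] ?? 0n;
      const wantValue = want.storage[slot] ?? 0n;
      if (gotValue !== wantValue) {
        mismatches.push(`${address}: storage[${slot}] ${gotValue} != ${wantValue}`);
      }
    });
  }
  return mismatches;
}
```

Returning a list of mismatches rather than a boolean keeps failed fixtures debuggable, since the test can print exactly which account field diverged.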

4.1.2 Transaction Tests

Validate transaction processing:

Valid and invalid transaction formats
Signature verification (ECDSA secp256k1)
Nonce handling and replay protection
Gas price and gas limit validation
Transaction type support (Legacy, EIP-2930, EIP-1559)
Large transaction payloads

Test Cases:

describe('Transaction Tests', () => {
  it('should reject transaction with reused nonce', async () => {
    // Send transaction with nonce N
    // Attempt to send another transaction with nonce N
    // Expect rejection with appropriate error
  });

  it('should reject transaction with insufficient gas', async () => {
    // Submit transaction with gasLimit < intrinsic gas
    // Expect rejection
  });

  it('should handle EIP-1559 transaction format', async () => {
    // Submit EIP-1559 transaction with maxFeePerGas, maxPriorityFeePerGas
    // Verify execution and gas accounting
  });
});
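
The "gasLimit < intrinsic gas" case needs a reference value for intrinsic gas. A minimal sketch follows, assuming Shanghai-era rules (EIP-2028 calldata pricing, EIP-2930 access lists, EIP-3860 initcode metering). Which fork the Chainweb EVM targets is an assumption here, and later changes such as the EIP-7623 calldata floor are not modelled. The function and type names are illustrative.

```typescript
// Gas constants from the Ethereum specification (EIP-2028, EIP-2930, EIP-3860).
const TX_BASE_GAS = 21000;
const TX_CREATE_GAS = 32000;
const ZERO_BYTE_GAS = 4;
const NONZERO_BYTE_GAS = 16;
const ACCESS_LIST_ADDRESS_GAS = 2400;
const ACCESS_LIST_STORAGE_KEY_GAS = 1900;
const INITCODE_WORD_GAS = 2;

interface AccessListEntry {
  address: string;
  storageKeys: string[];
}

// Minimum gas a transaction must supply before any EVM code runs.
function intrinsicGas(
  data: Uint8Array,
  isCreate: boolean,
  accessList: AccessListEntry[] = []
): number {
  let gas = TX_BASE_GAS;
  // Calldata is priced per byte, with zero bytes cheaper than non-zero bytes.
  for (let i = 0; i < data.length; i++) {
    gas += data[i] === 0 ? ZERO_BYTE_GAS : NONZERO_BYTE_GAS;
  }
  if (isCreate) {
    // Contract creation pays a flat fee plus a per-word charge on the initcode.
    gas += TX_CREATE_GAS;
    gas += INITCODE_WORD_GAS * Math.ceil(data.length / 32);
  }
  for (let i = 0; i < accessList.length; i++) {
    gas += ACCESS_LIST_ADDRESS_GAS;
    gas += ACCESS_LIST_STORAGE_KEY_GAS * accessList[i].storageKeys.length;
  }
  return gas;
}
```

A rejection test can then submit a transaction with `gasLimit` set to `intrinsicGas(...) - 1` and expect the node to refuse it.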

4.1.3 VM Execution Tests

Low-level EVM opcode validation:

Arithmetic operations (ADD, SUB, MUL, DIV, MOD, EXP)
Comparison operations (LT, GT, EQ, ISZERO)
Bitwise operations (AND, OR, XOR, NOT, SHL, SHR, SAR)
Memory operations (MLOAD, MSTORE, MSTORE8)
Storage operations (SLOAD, SSTORE)
Control flow (JUMP, JUMPI, JUMPDEST, PC, STOP, RETURN, REVERT)
Stack operations (PUSH, POP, DUP, SWAP)
Environmental opcodes (ADDRESS, BALANCE, CALLER, CALLVALUE, etc.)
Call operations (CALL, STATICCALL, DELEGATECALL, CREATE, CREATE2)
Block information (BLOCKHASH, COINBASE, TIMESTAMP, NUMBER, DIFFICULTY, GASLIMIT)

Test Structure:

describe('VM Opcode Tests', () => {
  describe('Arithmetic Operations', () => {
    it('should execute ADD correctly', async () => {
      // Deploy contract with ADD opcode test
      // Execute and verify result
    });

    it('should handle overflow in ADD', async () => {
      // Test uint256 overflow behavior
    });
  });

  describe('Storage Operations', () => {
    it('should handle SSTORE/SLOAD correctly', async () => {
      // Test storage read/write
    });

    it('should handle storage gas costs (EIP-2200)', async () => {
      // Verify SSTORE gas cost calculations
    });
  });
});
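
Opcode tests need expected values to assert against. The sketch below models the wrapping 256-bit arithmetic that ADD, SUB, MUL and DIV perform, using BigInt. The helper names are illustrative and not part of any existing runner.

```typescript
// All EVM words are 256 bits wide; results wrap modulo 2^256.
const UINT256_MASK = (1n << 256n) - 1n;

function add256(a: bigint, b: bigint): bigint {
  return (a + b) & UINT256_MASK;
}

function sub256(a: bigint, b: bigint): bigint {
  // BigInt uses two's-complement semantics for &, so a negative difference wraps correctly.
  return (a - b) & UINT256_MASK;
}

function mul256(a: bigint, b: bigint): bigint {
  return (a * b) & UINT256_MASK;
}

function div256(a: bigint, b: bigint): bigint {
  // The EVM defines division by zero as 0 instead of raising an error.
  return b === 0n ? 0n : a / b;
}
```

For example, the overflow case above can assert that the contract returns `add256(UINT256_MASK, 1n)`, which is zero.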

4.1.4 Blockchain Tests

Validate blockchain-level operations:

Block validation and acceptance
Uncle/ommer block handling
Difficulty calculation (if applicable)
Block rewards
Transaction ordering within blocks
Block size limits
Gas limit enforcement

4.2 RPC API Tests

Priority: HIGH
Estimated Effort: 3-4 weeks

4.2.1 Standard JSON-RPC Methods

Test all documented RPC endpoints:

Read Operations:

eth_chainId - Verify correct chain ID for each EVM chain (20-24)
eth_blockNumber - Current block height
eth_getBalance - Account balance queries
eth_getCode - Contract code retrieval
eth_getStorageAt - Storage slot queries
eth_call - Contract call simulation
eth_estimateGas - Gas estimation
eth_getBlockByNumber / eth_getBlockByHash - Block data retrieval
eth_getTransactionByHash - Transaction retrieval
eth_getTransactionReceipt - Receipt retrieval
eth_getLogs - Event log queries
eth_getProof - Account and storage proofs (SPV proof generation)

Write Operations:

eth_sendRawTransaction - Transaction submission
eth_sendTransaction - Transaction submission (if supported)

Network Information:

net_version - Network ID
net_listening - Node connectivity
net_peerCount - Peer count
web3_clientVersion - Client identification

Test Structure:

describe('RPC API Tests', () => {
  describe('eth_chainId', () => {
    it('should return correct chain ID for chain 20', async () => {
      const provider = getProvider(20);
      const chainId = await provider.send('eth_chainId', []);
      expect(chainId).toBe('0x711'); // 1789 (chainIdOffset) + 20 = 1809 = 0x711; confirm against the devnet chain configuration
    });

    it('should return different chain IDs for different chains', async () => {
      const chain20 = await getProvider(20).send('eth_chainId', []);
      const chain21 = await getProvider(21).send('eth_chainId', []);
      expect(chain20).not.toBe(chain21);
    });
  });

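  describe('eth_getProof', () => {
    const provider = getProvider(20);

    it('should return valid account proof', async () => {
      // accountAddress, storageKeys and blockNumber come from the test setup;
      // blockNumber must be a hex quantity or a tag such as 'latest'
      const proof = await provider.send('eth_getProof', [
        accountAddress,
        storageKeys,
        blockNumber
      ]);

      expect(proof).toHaveProperty('accountProof');
      expect(proof).toHaveProperty('storageProof');
      // Verify proof can be validated
    });

    it('should work for last 1024 blocks only', async () => {
      const currentBlock = await provider.getBlockNumber();
      // Requires a chain that is already more than 1025 blocks long
      const oldBlockNumber = currentBlock - 1025;

      await expect(
        // JSON-RPC block parameters are hex quantities, not JS numbers
        provider.send('eth_getProof', [address, [], '0x' + oldBlockNumber.toString(16)])
      ).rejects.toThrow(/proof not available/i);
    });
  });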

  describe('eth_getLogs', () => {
    it('should filter logs by address', async () => { /* ... */ });
    it('should filter logs by topics', async () => { /* ... */ });
    it('should handle fromBlock and toBlock', async () => { /* ... */ });
    it('should respect block range limits', async () => { /* ... */ });
  });

  describe('eth_estimateGas', () => {
    it('should estimate gas for simple transfer', async () => { /* ... */ });
    it('should estimate gas for contract deployment', async () => { /* ... */ });
    it('should estimate gas for complex contract call', async () => { /* ... */ });
  });
});
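
Several of these RPC tests compare or build hex quantity strings (chain IDs, block numbers). A small sketch of the conversions is shown below. The helper names are illustrative, and treating a block exactly 1024 behind the head as still inside the proof window is an assumption, since the text above only says "last 1024 blocks".

```typescript
// Encodes a non-negative integer as a JSON-RPC quantity: "0x" prefix, no leading zeros.
function toQuantity(value: number | bigint): string {
  const n = BigInt(value);
  if (n < 0n) throw new RangeError("quantities must be non-negative");
  return "0x" + n.toString(16);
}

// Decodes a JSON-RPC quantity, rejecting malformed input such as "0x" or "0x01".
function fromQuantity(hex: string): bigint {
  if (!/^0x(0|[1-9a-f][0-9a-f]*)$/i.test(hex)) {
    throw new SyntaxError(`not a valid quantity: ${hex}`);
  }
  return BigInt(hex);
}

// True when `target` is expected to still have a proof available.
// Assumption: the window includes the block exactly `windowSize` behind the head.
function isWithinProofWindow(latest: number, target: number, windowSize = 1024): boolean {
  return target >= 0 && target <= latest && latest - target <= windowSize;
}
```

Comparing decoded values (`fromQuantity(chainId)`) instead of raw strings avoids false failures from letter case differences in the hex digits.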

4.2.2 Error Handling and Edge Cases

Invalid parameters (wrong types, out of range)
Missing or non-existent resources (blocks, transactions, accounts)
Rate limiting behavior (if applicable)
Large result sets (pagination, limits)
Concurrent request handling

4.3 Cross-Chain Bridging Tests

Priority: CRITICAL
Estimated Effort: 4-6 weeks

4.3.1 SPV Proof Generation and Validation

Reference: docs/bridging-protocol.md, SimpleToken.sol:182-259

Test Scenarios:

Valid Cross-Chain Transfer:

describe('Cross-Chain Transfer', () => {
  it('should complete full cross-chain transfer flow', async () => {
    // Step 1: Initialize transfer on source chain (e.g., chain 20)
    const tx = await token20.transferCrossChain(receiver, amount, targetChain);
    await tx.wait();

    // Step 2: Get SPV proof from endpoint
    const proof = await getProof(tx.hash, sourceChain);

    // Step 3: Redeem on target chain (e.g., chain 21)
    const redeemTx = await token21.redeemCrossChain(proof);
    await redeemTx.wait();

    // Step 4: Verify balances
    expect(await token20.balanceOf(sender)).toBe(initialBalance - amount);
    expect(await token21.balanceOf(receiver)).toBe(amount);
  });
});

Proof Validation Tests:
- Valid proof acceptance
- Invalid proof rejection (tampered data)
- Proof replay prevention (same proof used twice)
- Expired proof handling (if applicable)
- Wrong target chain proof submission
- Wrong target contract proof submission

Multi-Hop Transfers:

it('should handle sequential cross-chain transfers', async () => {
  // Chain 20 -> Chain 21 -> Chain 22
  // Verify state consistency across all chains
});

Concurrent Cross-Chain Operations:

it('should handle concurrent transfers between multiple chains', async () => {
  // Simultaneously initiate transfers:
  // Chain 20 -> Chain 21
  // Chain 21 -> Chain 22
  // Chain 22 -> Chain 20
  // Verify all complete successfully
});
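
For the multi-hop and concurrent scenarios, the expected end state can be derived from the list of transfers instead of being hand-computed. The sketch below assumes burn-on-source and mint-on-target semantics, so the total supply across chains is conserved. That assumption, and all names used, should be checked against SimpleToken.sol.

```typescript
interface Transfer {
  fromChain: number;
  toChain: number;
  amount: bigint;
}

// Net change in token supply per chain once every transfer has been redeemed.
function expectedSupplyDeltas(transfers: Transfer[]): Map<number, bigint> {
  const deltas = new Map<number, bigint>();
  for (let i = 0; i < transfers.length; i++) {
    const t = transfers[i];
    deltas.set(t.fromChain, (deltas.get(t.fromChain) ?? 0n) - t.amount);
    deltas.set(t.toChain, (deltas.get(t.toChain) ?? 0n) + t.amount);
  }
  return deltas;
}

// Sum of all per-chain deltas; conservation means this is always zero.
function totalDelta(deltas: Map<number, bigint>): bigint {
  let total = 0n;
  deltas.forEach((value) => {
    total += value;
  });
  return total;
}
```

A test can snapshot `totalSupply()` on each chain before and after the run and compare the differences with the map returned here.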

4.3.2 Precompile Tests

VALIDATE_PROOF_PRECOMPILE (0x48C3b4d2757447601776837B6a85F31EF88A87bf):

Verify SPV proof validation logic
Test with valid Chainweb proofs
Test with invalid/malformed proofs
Gas cost analysis

CHAIN_ID_PRECOMPILE (0x9b02c3e2dF42533e0FD166798B5A616f59DBd2cc):

Verify correct chain ID returned on each chain (20-24)
Integration with smart contracts

Test Structure:

// Test contract for precompile validation
contract PrecompileTest {
    address constant VALIDATE_PROOF = 0x48C3b4d2757447601776837B6a85F31EF88A87bf;
    address constant CHAIN_ID = 0x9b02c3e2dF42533e0FD166798B5A616f59DBd2cc;

    function testChainId() public view returns (uint256) {
        (bool success, bytes memory data) = CHAIN_ID.staticcall("");
        require(success, "ChainID precompile failed");
        return abi.decode(data, (uint256));
    }

    function testValidateProof(bytes memory proof) public returns (bool) {
        (bool success, bytes memory data) = VALIDATE_PROOF.call(proof);
        require(success, "ValidateProof precompile failed");
        return abi.decode(data, (bool));
    }
}

4.4 Consensus and Multi-Chain Tests

Priority: HIGH
Estimated Effort: 4-5 weeks

4.4.1 Block Production Tests

Block production timing (2 second default)
Block propagation across chains
Chain synchronization (cut-height advancement)
Fork handling and resolution
Block reorganization scenarios

Test Cases:

describe('Block Production', () => {
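  it('should produce blocks at consistent intervals', async () => {
    const provider = getProvider(20);
    const block1 = await provider.getBlock('latest');
    // Poll until the chain advances; a fixed 2 s sleep may see zero or two new blocks
    let block2 = block1;
    while (block2.number === block1.number) {
      await sleep(250);
      block2 = await provider.getBlock('latest');
    }

    const avgInterval =
      (block2.timestamp - block1.timestamp) / (block2.number - block1.number);
    // toBeCloseTo takes a digit count, not a tolerance, so check the range directly
    expect(Math.abs(avgInterval - 2)).toBeLessThanOrEqual(0.5);
  });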

  it('should advance cut-height across all chains', async () => {
    const initialCutHeight = await getCutHeight();
    await waitForBlocks(10);
    const finalCutHeight = await getCutHeight();

    expect(finalCutHeight).toBeGreaterThan(initialCutHeight);
  });

  it('should handle temporary network partition', async () => {
    // Simulate network split
    // Verify chains continue producing blocks
    // Reconnect and verify consensus convergence
  });
});
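
Because block timestamps have one-second resolution, a single interval is a noisy measurement. The sketch below averages over a run of consecutive block timestamps. The function names and the tolerance parameter are illustrative assumptions, not existing helpers.

```typescript
interface IntervalStats {
  count: number; // number of intervals measured
  mean: number; // average seconds between consecutive blocks
  min: number;
  max: number;
}

// Computes interval statistics from block timestamps given in ascending block order.
function blockIntervalStats(timestamps: number[]): IntervalStats {
  if (timestamps.length < 2) {
    throw new Error("need at least two block timestamps");
  }
  const intervals: number[] = [];
  for (let i = 1; i < timestamps.length; i++) {
    intervals.push(timestamps[i] - timestamps[i - 1]);
  }
  const sum = intervals.reduce((acc, v) => acc + v, 0);
  return {
    count: intervals.length,
    mean: sum / intervals.length,
    min: Math.min(...intervals),
    max: Math.max(...intervals),
  };
}

// True when the mean interval is within `toleranceSeconds` of the target block time.
function meanWithinTolerance(
  stats: IntervalStats,
  targetSeconds: number,
  toleranceSeconds: number
): boolean {
  return Math.abs(stats.mean - targetSeconds) <= toleranceSeconds;
}
```

Collecting the timestamps of, say, 20 consecutive blocks and asserting on the mean makes the timing test much less sensitive to a single slow block.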

4.4.2 Mining and Rewards

Mining client functionality
Block reward distribution
Transaction fee distribution
Mining difficulty (if applicable)

Existing Test: verify-miner-rewards.test.ts (expand coverage)

4.4.3 Multi-Node Coordination

Node discovery and peering
State synchronization between nodes
Transaction propagation
Block propagation
Byzantine fault tolerance scenarios

Existing Test: run-multinode-with-miners.test.ts (expand coverage)

4.5 Performance and Load Tests

Priority: MEDIUM
Estimated Effort: 3-4 weeks

4.5.1 Transaction Throughput

Maximum transactions per second (TPS) per chain
TPS across all EVM chains (20-24) combined
Transaction pool management under load
Memory pool eviction policies

Test Structure:

describe('Performance Tests', () => {
  it('should handle 100 concurrent transactions', async () => {
    const txPromises = Array(100).fill(null).map((_, i) =>
      sendTransaction({ nonce: i /* , ...remaining transaction fields */ })
    );

    const results = await Promise.allSettled(txPromises);
    const successful = results.filter(r => r.status === 'fulfilled');

    expect(successful.length).toBeGreaterThan(95); // >95% success rate
  });

  it('should measure TPS under sustained load', async () => {
    const duration = 60000; // 1 minute
    const startTime = Date.now();
    let txCount = 0;

    while (Date.now() - startTime < duration) {
      await sendTransaction({ nonce: txCount++ });
    }

    const tps = txCount / (duration / 1000);
    console.log(`Achieved TPS: ${tps}`);
    expect(tps).toBeGreaterThan(10); // Baseline expectation
  });
});
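
The two measurements above (success rate and TPS) can be factored into small pure helpers so that the arithmetic is unit-testable without a devnet. This is a sketch with illustrative names. It computes throughput from the measured elapsed time, not the nominal duration, because the last transaction usually finishes after the deadline.

```typescript
// Structural subset of the objects returned by Promise.allSettled.
interface Settled {
  status: "fulfilled" | "rejected";
}

interface LoadSummary {
  fulfilled: number;
  rejected: number;
  successRate: number; // fraction between 0 and 1
}

// Counts outcomes of a batch of settled promises.
function summarizeSettled(results: Settled[]): LoadSummary {
  const fulfilled = results.filter((r) => r.status === "fulfilled").length;
  const rejected = results.length - fulfilled;
  const successRate = results.length === 0 ? 0 : fulfilled / results.length;
  return { fulfilled, rejected, successRate };
}

// Transactions per second over the interval actually measured.
function transactionsPerSecond(txCount: number, startMs: number, endMs: number): number {
  const elapsedMs = endMs - startMs;
  if (elapsedMs <= 0) throw new RangeError("end time must be after start time");
  return txCount / (elapsedMs / 1000);
}
```

In a load test, `summarizeSettled(await Promise.allSettled(txPromises))` yields the success rate, and `transactionsPerSecond(txCount, startTime, Date.now())` yields the measured TPS.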

4.5.2 Contract Execution Performance

Gas benchmarks for common operations
Large contract execution time
Recursive call depth limits
Memory-intensive operations

4.5.3 State Size and Scaling

State growth over time
State pruning behavior
Archive node vs. full node performance
Database performance under large state

4.6 Infrastructure and Devnet Tests

Priority: HIGH
Estimated Effort: 2-3 weeks

4.6.1 Docker Compose Tests

Reference: devnet/compose.py

Existing Coverage:

Docker compose generation
Node start/stop/restart
Container health checks

Additional Tests Needed:

Multi-configuration validation (frontend-dev, app-dev, minimal, production)
Exposed chains configuration (--exposed-chains parameter)
Resource limits enforcement
Volume persistence across restarts
Network isolation between projects

Test Structure:

describe('Docker Compose Generation', () => {
  it('should generate valid docker-compose.yaml for frontend-dev', async () => {
    await exec('python3 devnet/compose.py --project frontend-dev > /tmp/test-compose.yaml');
    const compose = await readYaml('/tmp/test-compose.yaml');

    expect(compose.services).toHaveProperty('bootnode-consensus');
    expect(compose.services).toHaveProperty('bootnode-chain-20');
    // Verify all required services present
  });

  it('should expose only specified chains', async () => {
    await exec('python3 devnet/compose.py --exposed-chains "20,22" > /tmp/test-compose.yaml');
    const compose = await readYaml('/tmp/test-compose.yaml');

    // Verify only chains 20 and 22 have port mappings
    expect(compose.services['bootnode-chain-20'].ports).toBeDefined();
    expect(compose.services['bootnode-chain-22'].ports).toBeDefined();
    expect(compose.services['bootnode-chain-21'].ports).toBeUndefined();
  });
});
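
The per-service port checks above can be generalized to a helper that lists which chains a parsed compose file exposes. The sketch assumes services follow the `bootnode-chain-<N>` naming seen in the examples and that an exposed service has a non-empty `ports` list. Other node types and naming schemes are ignored.

```typescript
// Minimal subset of a parsed docker-compose file needed for this check.
interface ComposeService {
  ports?: string[];
}

interface ComposeFile {
  services: Record<string, ComposeService>;
}

// Returns the chain numbers whose bootnode service publishes at least one port.
function exposedChains(compose: ComposeFile): number[] {
  const chainService = /^bootnode-chain-(\d+)$/;
  const chains: number[] = [];
  for (const name of Object.keys(compose.services)) {
    const match = chainService.exec(name);
    const ports = compose.services[name].ports;
    if (match !== null && ports !== undefined && ports.length > 0) {
      chains.push(Number(match[1]));
    }
  }
  return chains.sort((a, b) => a - b);
}
```

With this helper, the `--exposed-chains "20,22"` test reduces to one assertion: `expect(exposedChains(compose)).toEqual([20, 22])`.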

4.6.2 Node Lifecycle Tests

Existing Tests:

discontinued-node.test.ts - Node failure scenarios
fast-block-production.test.ts - Block timing

Additional Tests:

Graceful shutdown and startup
State recovery after crash
Log rotation and retention
Configuration hot-reload (if supported)

4.7 Security and Edge Case Tests

Priority: MEDIUM
Estimated Effort: 2-3 weeks

4.7.1 Transaction Edge Cases

Maximum transaction size
Minimum transaction size
Transaction with empty data
Transaction to zero address
Transaction with zero value
Transactions with excessive gas limit

4.7.2 Contract Edge Cases

Existing Test: large.test.ts - Contracts >24KB

Additional Cases:

Contract deployment to existing address
Contract self-destruct scenarios
Contracts with no code
Contracts with invalid bytecode
CREATE2 address collision attempts

4.7.3 Reentrancy and Attack Vectors

Reentrancy attack simulation
Front-running scenarios
Flash loan patterns (if applicable)
Gas griefing attacks

5. Implementation Plan

5.1 Phase 1: Ethereum Test Suite Integration (Weeks 1-4)

Deliverables:

Clone and integrate ethereum/tests repository
Implement test runner for state tests
Implement test runner for VM tests
Implement test runner for transaction tests
Implement test runner for blockchain tests
Document compatibility matrix (which tests pass/fail)

Tasks:

Set up test infrastructure:

cd tests
mkdir -p ethereum/{state,vm,transaction,blockchain}
git clone --depth 1 https://github.com/ethereum/tests.git ethereum-tests

Create test runners:

// tests/ethereum/runner/state-runner.ts
export class StateTestRunner {
  async loadFixture(path: string): Promise<StateTest[]> { /* ... */ }
  async runTest(test: StateTest): Promise<TestResult> { /* ... */ }
  compareState(actual: State, expected: State): boolean { /* ... */ }
}

Implement test execution:

// tests/ethereum/state-tests.test.ts
import { StateTestRunner } from './runner/state-runner';
import { glob } from 'glob';

describe('Ethereum State Tests', () => {
  const runner = new StateTestRunner();
  const testFiles = glob.sync('ethereum-tests/GeneralStateTests/**/*.json');

  testFiles.forEach(file => {
    it(`should pass ${file}`, async () => {
      const tests = await runner.loadFixture(file);
      for (const test of tests) {
        const result = await runner.runTest(test);
        expect(result.passed).toBe(true);
      }
    });
  });
});
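
One detail `loadFixture` has to handle is that a single state-test fixture describes several concrete transactions: the `transaction` section holds arrays of `data`, `gasLimit` and `value`, and each `post` entry for a fork picks one combination through its `indexes`. The sketch below expands a fixture into concrete cases. The field names reflect the published GeneralStateTests JSON layout as understood here and should be verified against the actual fixtures. The `StateCase` type is hypothetical.

```typescript
interface PostEntry {
  hash: string; // expected state root
  logs: string; // expected hash of the emitted logs
  indexes: { data: number; gas: number; value: number };
  expectException?: string;
}

interface StateFixture {
  transaction: { data: string[]; gasLimit: string[]; value: string[] };
  post: Record<string, PostEntry[]>; // fork name -> expected results
}

// Hypothetical flattened test case, one per post entry.
interface StateCase {
  name: string;
  data: string;
  gasLimit: string;
  value: string;
  expectedStateRoot: string;
  expectedLogsHash: string;
  expectException?: string;
}

// Expands one fixture into the concrete cases defined for `fork`.
// Returns an empty list when the fixture has no expectations for that fork.
function expandFixture(name: string, fixture: StateFixture, fork: string): StateCase[] {
  const entries = fixture.post[fork] ?? [];
  return entries.map((entry, i) => ({
    name: `${name}[${fork}#${i}]`,
    data: fixture.transaction.data[entry.indexes.data],
    gasLimit: fixture.transaction.gasLimit[entry.indexes.gas],
    value: fixture.transaction.value[entry.indexes.value],
    expectedStateRoot: entry.hash,
    expectedLogsHash: entry.logs,
    expectException: entry.expectException,
  }));
}
```

Generating one `it` per expanded case, instead of one per file, also makes the pass/fail counts in the compatibility matrix more precise.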

Track compatibility:

// tests/ethereum/compatibility-matrix.json
{
  "timestamp": "2025-10-07T10:00:00Z",
  "state_tests": {
    "total": 1000,
    "passed": 950,
    "failed": 50,
    "skipped": 0,
    "pass_rate": 95.0
  },
  "vm_tests": { /* ... */ },
  "failed_tests": [
    {
      "name": "TestName",
      "category": "state",
      "reason": "Gas calculation mismatch",
      "issue_link": "https://github.com/kadena-io/kadena-evm/issues/123"
    }
  ]
}
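
The per-category block of the matrix can be generated from raw test outcomes. In the sketch below, the field names mirror the JSON above. Computing `pass_rate` over executed tests only (skipped tests excluded from the denominator) is an assumption, chosen to match the "excluding known incompatibilities" wording of the acceptance criteria.

```typescript
type Outcome = "passed" | "failed" | "skipped";

interface CategorySummary {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
  pass_rate: number; // percentage, rounded to one decimal place
}

// Aggregates individual outcomes into one category entry of the matrix.
function summarizeOutcomes(outcomes: Outcome[]): CategorySummary {
  const counts = { passed: 0, failed: 0, skipped: 0 };
  for (let i = 0; i < outcomes.length; i++) {
    counts[outcomes[i]] += 1;
  }
  // Assumption: skipped tests do not count against the pass rate.
  const executed = counts.passed + counts.failed;
  const passRate =
    executed === 0 ? 0 : Math.round((counts.passed / executed) * 1000) / 10;
  return {
    total: outcomes.length,
    passed: counts.passed,
    failed: counts.failed,
    skipped: counts.skipped,
    pass_rate: passRate,
  };
}
```

Writing `JSON.stringify({ state_tests: summarizeOutcomes(results) }, null, 2)` to the matrix file after each run keeps the tracked numbers in step with the latest execution.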

5.2 Phase 2: RPC and Cross-Chain Tests (Weeks 5-8)

Deliverables:

Complete RPC endpoint test coverage
Cross-chain transfer test suite
Precompile validation tests
SPV proof generation/validation tests

Tasks:

RPC test implementation:

// tests/src/rpc/
// ├── endpoints.test.ts      // All RPC endpoints
// ├── errors.test.ts         // Error handling
// ├── edge-cases.test.ts     // Edge cases
// └── performance.test.ts    // Response time benchmarks

Cross-chain test implementation:

// tests/src/cross-chain/
// ├── bridging.test.ts       // Full transfer flows
// ├── proofs.test.ts         // SPV proof validation
// ├── precompiles.test.ts    // Precompile testing
// └── stress.test.ts         // Concurrent operations

Expand existing integration tests:

// Enhance solidity/test/SimpleToken.integration.test.js
// Add more cross-chain scenarios:
// - Multi-hop transfers
// - Proof replay prevention
// - Invalid proof handling
// - Concurrent transfers

5.3 Phase 3: Consensus and Performance (Weeks 9-12)

Deliverables:

Block production validation suite
Multi-node coordination tests
Performance benchmarking framework
Load testing scenarios

Tasks:

Consensus tests:

// tests/e2e/consensus/
// ├── block-production.test.ts
// ├── fork-resolution.test.ts
// ├── chain-sync.test.ts
// └── cut-advancement.test.ts

Performance framework:

// tests/performance/
// ├── benchmarks/
// │   ├── tps.benchmark.ts
// │   ├── gas.benchmark.ts
// │   └── state-growth.benchmark.ts
// ├── load-tests/
// │   ├── sustained-load.test.ts
// │   └── burst-load.test.ts
// └── reports/
//     └── benchmark-results.json

Automated benchmarking:

#!/bin/bash
# scripts/run-benchmarks.sh
bun run performance:tps > reports/tps-$(date +%Y%m%d).json
bun run performance:gas > reports/gas-$(date +%Y%m%d).json
# Generate comparison report
node scripts/compare-benchmarks.js

5.4 Phase 4: Infrastructure and Edge Cases (Weeks 13-16)

Deliverables:

Docker compose configuration tests
Node lifecycle tests
Security and edge case coverage
Documentation and CI/CD integration

Tasks:

Infrastructure tests:

// tests/infrastructure/
// ├── compose-generation.test.ts
// ├── node-lifecycle.test.ts
// ├── configuration.test.ts
// └── resources.test.ts

Edge case coverage:

// tests/edge-cases/
// ├── transactions.test.ts
// ├── contracts.test.ts
// ├── state.test.ts
// └── security.test.ts

CI/CD integration:

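# .github/workflows/evm-tests.yml
name: EVM Test Suite

on: [push, pull_request]

jobs:
  ethereum-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: oven-sh/setup-bun@v1
      # Each run step starts in the repository root, so set the directory per step
      - run: bun install
        working-directory: tests
      - run: bun run test:ethereum
        working-directory: tests

  rpc-tests:
    runs-on: ubuntu-latest
    steps:
      # Every job runs on a fresh runner and needs its own checkout and setup
      - uses: actions/checkout@v3
      - uses: oven-sh/setup-bun@v1
      - run: bun install
        working-directory: tests
      - run: ./network devnet start
      - run: bun run test:rpc
        working-directory: tests

  cross-chain-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: oven-sh/setup-bun@v1
      - run: bun install
        working-directory: tests
      - run: ./network devnet start
      - run: bun run test:cross-chain
        working-directory: tests

  performance-tests:
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v3
      - uses: oven-sh/setup-bun@v1
      - run: bun install
        working-directory: tests
      - run: bun run test:performance
        working-directory: tests
      - run: node scripts/upload-benchmark-results.js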

5.5 Phase 5: Documentation and Refinement (Weeks 17-18)

Deliverables:

Comprehensive test documentation
Test troubleshooting guide
Contribution guidelines for tests
CI/CD dashboard and reporting

Tasks:

Documentation:

# tests/TESTING.md

## Running Tests

### Prerequisites
- Bun installed
- Docker running
- 4+ CPU cores, 8+ GB RAM

### Test Categories

#### Unit Tests
```bash
bun run test:unit
```

#### Integration Tests
```bash
./network devnet start
bun run test:integration
```

#### E2E Tests
```bash
./network devnet start
bun run test:e2e
```

#### Ethereum Tests
```bash
bun run test:ethereum
```

## Troubleshooting

Issue: Tests timeout
Solution: Increase timeout in package.json or check devnet status

Issue: Docker containers fail to start
Solution: Run docker compose down --volumes and restart

Contribution guide:

# tests/CONTRIBUTING.md

## Adding New Tests

1. Choose appropriate category (unit/integration/e2e/ethereum)
2. Create test file following naming convention: `feature.test.ts`
3. Use describe/it blocks for organization
4. Include comments explaining test purpose
5. Add to relevant test suite in package.json

## Test Writing Guidelines

- Test one thing per `it` block
- Use descriptive test names
- Clean up resources (close connections, reset state)
- Use appropriate timeout for test type
- Include both positive and negative cases

6. Verification Strategy

6.1 Test Quality Criteria

Each test must meet the following criteria:

Clarity: Test name clearly describes what is being tested
Independence: Test can run in isolation without dependencies on other tests
Repeatability: Test produces consistent results across runs
Speed: Test completes within reasonable time (unit: <1s, integration: <30s, e2e: <5min)
Cleanup: Test properly cleans up resources (connections, files, state)

6.2 Code Review Checklist

Test file follows naming convention (*.test.ts)
Test includes describe/it structure
Test has meaningful assertions (not just checking for absence of errors)
Test handles both success and failure cases
Test includes comments explaining complex logic
Test uses appropriate timeout value
Test cleans up after itself
Test is added to appropriate test suite in package.json

6.3 Coverage Metrics

Minimum Coverage Targets:

Component	Line Coverage	Branch Coverage
Smart Contracts	90%	85%
Test Runners	80%	75%
Integration Code	70%	65%

Coverage Tools:

# Solidity coverage
cd solidity
npx hardhat coverage

# TypeScript coverage
cd tests
bun run test --coverage

6.4 Continuous Integration

CI Pipeline Stages:

Fast Tests (on every PR):
- Unit tests (<5 min)
- Linting and formatting checks
Integration Tests (on every PR):
- RPC tests
- Basic cross-chain tests
- Infrastructure tests
Full Test Suite (on merge to main):
- All integration tests
- E2E tests
- Ethereum test suite
- Performance benchmarks
Nightly Tests:
- Long-running stress tests
- Comprehensive ethereum/tests execution
- Performance regression analysis

CI Success Criteria:

All unit tests pass
95% integration tests pass
90% ethereum/tests pass
No performance regression >10% from baseline
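
These criteria are mechanical enough to be enforced by a small gate script at the end of the pipeline. The sketch below encodes the four thresholds listed above. The input shape is hypothetical, and using throughput (TPS, higher is better) as the regression metric is an assumption.

```typescript
interface SuiteResult {
  passed: number;
  total: number;
}

// Hypothetical input shape; adapt to whatever the test reporters emit.
interface CiInputs {
  unit: SuiteResult;
  integration: SuiteResult;
  ethereum: SuiteResult;
  baselineTps: number;
  currentTps: number;
}

// Returns the list of violated criteria; an empty list means the gate passes.
function evaluateCiGate(inputs: CiInputs): string[] {
  const failures: string[] = [];
  const rate = (s: SuiteResult) => (s.total === 0 ? 0 : s.passed / s.total);

  if (inputs.unit.passed !== inputs.unit.total) {
    failures.push("unit tests: not all passing");
  }
  if (rate(inputs.integration) < 0.95) {
    failures.push("integration tests: pass rate below 95%");
  }
  if (rate(inputs.ethereum) < 0.9) {
    failures.push("ethereum/tests: pass rate below 90%");
  }
  // Regression is the relative drop in throughput against the stored baseline.
  const regression = (inputs.baselineTps - inputs.currentTps) / inputs.baselineTps;
  if (regression > 0.1) {
    failures.push("performance: regression above 10% of baseline");
  }
  return failures;
}
```

A CI step can print the returned messages and exit non-zero when the list is not empty.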

7. Dependencies

7.1 Infrastructure Requirements

Hardware:

CI runners: 4+ CPU cores, 8 GB RAM per runner
Performance testing: 8+ CPU cores, 16 GB RAM
Storage: 100+ GB for test artifacts and logs

Software:

Docker 20.10+
Docker Compose 2.0+
Bun 1.0+
Node.js 22+ (for Hardhat)
Python 3.13+ with uv (for devnet compose generation)

7.2 External Dependencies

Repositories:

ethereum/tests - Official Ethereum test suite
Kadena Chainweb EVM node images
Reth client integration

Tools:

Hardhat (smart contract testing)
ethers.js / viem (Ethereum library)
@kadena/client (Pact interaction)
@kadena/hardhat-chainweb (multi-chain deployment)
@kadena/hardhat-kadena-create2 (deterministic deployment)

7.3 Team Dependencies

Required Expertise:

EVM internals knowledge (opcodes, state transitions)
Chainweb consensus understanding
Cross-chain bridging concepts
Test framework proficiency (Bun, Hardhat)
Docker and infrastructure

Collaboration Needs:

EVM core team: Clarification on implementation details
Consensus team: Multi-chain coordination edge cases
DevOps team: CI/CD pipeline setup
Documentation team: Test documentation review

8. Success Criteria

8.1 Functional Requirements

Ethereum Compatibility: >95% of applicable ethereum/tests pass
RPC Coverage: 100% of documented RPC endpoints tested
Cross-Chain: All bridging scenarios covered with tests
Consensus: Block production, fork handling, and chain sync validated
Performance: Baseline TPS and gas benchmarks established
Infrastructure: Docker compose configurations fully tested

8.2 Non-Functional Requirements

CI/CD Integration: Tests run automatically on every PR
Test Execution Time: Full test suite completes in <15 minutes on CI
Documentation: Every test category has documentation
Maintainability: Test code follows project conventions
Debuggability: Failed tests provide clear error messages
Monitoring: Test results tracked over time with dashboards

8.3 Acceptance Criteria

For Task Completion:

All test categories implemented (Phases 1-4)
Documentation complete (Phase 5)
CI/CD pipeline operational
95% ethereum/tests passing (excluding known incompatibilities)
All RPC endpoints validated
Cross-chain bridging fully tested
Performance baseline established
Code review approved by 2+ team members
All deliverables merged to main branch

For Production Readiness:

All critical tests passing consistently
No known test flakiness
Performance benchmarks stable over 1 week
Documentation reviewed and published
Team trained on running and interpreting tests

9. References

9.1 External Resources

9.2 Internal Documentation

CLAUDE.md - Project overview and setup instructions
docs/bridging-protocol.md - Cross-chain bridging specification
tests/README.md - Test suite getting started guide
tests/src/dx/README.md - DX test suite documentation
solidity/README.md - Solidity project documentation

9.3 NPM Packages

9.4 Related Tasks (from Asana Screenshot)

Parent Task: Refactoring and Testing
Blocking Dependency: FS (File System) - dependency listed as blocking
Related Projects:
- /devnet/compose.py - Docker compose generator
- /tests folder - Test suite location
Repository Branch: ag/tests/20-evm
Assignee: Albert Groothedde
Due Date: December 8, 2025

10. Risk Assessment and Mitigation

10.1 Technical Risks

Risk	Impact	Probability	Mitigation
Ethereum tests incompatible with Chainweb EVM	High	Medium	Document incompatibilities, focus on applicable tests
Performance tests flaky due to timing	Medium	High	Use statistical analysis, multiple runs, tolerance ranges
Cross-chain tests timeout frequently	Medium	Medium	Increase timeouts, optimize test flow, parallel execution
CI infrastructure insufficient	High	Low	Reserve dedicated CI runners, optimize test execution
Test maintenance burden too high	Medium	Medium	Automate test generation, clear documentation, tooling

10.2 Schedule Risks

Risk	Impact	Mitigation
Ethereum test integration takes longer than estimated	High	Prioritize subset of tests, phase remaining
Resource availability (reviewers, infrastructure)	Medium	Early communication, backup reviewers
Scope creep (additional test requirements)	Medium	Strict phase boundaries, prioritization matrix
Dependencies on EVM core team for clarifications	Medium	Async communication, document assumptions

10.3 Quality Risks

Risk	Impact	Mitigation
Tests pass but don't catch real bugs	High	Code review, mutation testing, real bug validation
False positives (flaky tests)	High	Retry logic, statistical significance, test isolation
Insufficient edge case coverage	Medium	Systematic edge case enumeration, fuzzing
Test code quality degradation	Medium	Code reviews, test for tests, refactoring sprints

11. Future Enhancements

11.1 Beyond Initial Scope

Fuzzing and Property-Based Testing:
- Integrate fuzzing tools (Echidna, Foundry)
- Generate random test cases for edge case discovery
Formal Verification:
- Formal verification of critical components
- Model checking for consensus protocols
Chaos Engineering:
- Random network partitions
- Random node failures
- Byzantine behavior simulation
Advanced Performance Testing:
- Stress testing under extreme load
- Long-running stability tests (days/weeks)
- Resource leak detection
Security Testing Automation:
- Automated vulnerability scanning
- Smart contract security analysis integration
- Penetration testing scenarios
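
As a rough illustration of the property-based idea, the sketch below checks a round-trip property with randomly generated inputs. It is hand-rolled and does not use the Echidna or Foundry APIs. The function names are assumptions, and the property chosen (a JSON-RPC quantity survives encode then decode) is only an example. A seeded generator is used so that any failing case can be reproduced from the seed.

```typescript
// Hypothetical example: all names below are illustrative assumptions.

// Encode a non-negative integer as a JSON-RPC quantity:
// "0x" followed by compact hex, with zero written as "0x0".
function toQuantity(value: number): string {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError('quantity must be a non-negative safe integer');
  }
  return '0x' + value.toString(16);
}

// Decode a quantity, rejecting leading zeros and missing prefixes.
function fromQuantity(hex: string): number {
  if (!/^0x(0|[1-9a-f][0-9a-f]*)$/.test(hex)) {
    throw new Error('invalid quantity: ' + hex);
  }
  return parseInt(hex.slice(2), 16);
}

// mulberry32: a small seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Run the round-trip property on `runs` random 32-bit values.
// Returns the first counterexample found, or null if the property held.
function findCounterexample(runs: number, seed: number): number | null {
  const next = mulberry32(seed);
  for (let i = 0; i < runs; i++) {
    const value = Math.floor(next() * 4294967296);
    if (fromQuantity(toQuantity(value)) !== value) return value;
  }
  return null;
}
```

Returning the counterexample, not just a boolean, gives the failing input directly to whoever debugs the regression.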

11.2 Tooling Improvements

Test Dashboard:
- Real-time test execution monitoring
- Historical trends and regression detection
- Test coverage visualization
Test Generator:
- Automated test generation from specifications
- Contract-based test generation
- API endpoint test generation from OpenAPI specs
Debugging Aids:
- Interactive test replay
- State inspection tools
- Transaction tracing integration

12. Appendix

12.1 Test File Organization

tests/
├── README.md                          # Getting started guide
├── TESTING.md                         # Comprehensive testing documentation
├── CONTRIBUTING.md                    # Test contribution guidelines
├── package.json                       # Test dependencies and scripts
├── tsconfig.json                      # TypeScript configuration
├── bun.lock                          # Bun lockfile
│
├── ethereum/                          # Ethereum standard tests
│   ├── state-tests.test.ts
│   ├── vm-tests.test.ts
│   ├── transaction-tests.test.ts
│   ├── blockchain-tests.test.ts
│   ├── runner/                       # Test execution framework
│   │   ├── state-runner.ts
│   │   ├── vm-runner.ts
│   │   ├── transaction-runner.ts
│   │   └── fixtures.ts
│   ├── ethereum-tests/               # Cloned from github.com/ethereum/tests
│   └── compatibility-matrix.json     # Test pass/fail tracking
│
├── src/                              # Integration tests
│   ├── rpc/                          # RPC endpoint tests
│   │   ├── endpoints.test.ts
│   │   ├── errors.test.ts
│   │   ├── edge-cases.test.ts
│   │   └── performance.test.ts
│   ├── cross-chain/                  # Cross-chain bridging tests
│   │   ├── bridging.test.ts
│   │   ├── proofs.test.ts
│   │   ├── precompiles.test.ts
│   │   └── stress.test.ts
│   ├── consensus/                    # Consensus tests
│   │   ├── block-production.test.ts
│   │   ├── fork-resolution.test.ts
│   │   └── chain-sync.test.ts
│   ├── dx/                           # Developer experience tests
│   │   ├── README.md
│   │   ├── token.test.ts
│   │   ├── zombie.test.ts
│   │   ├── tx.test.ts
│   │   └── large.test.ts
│   ├── hardhat.test.ts               # Hardhat integration
│   └── pact.test.ts                  # Pact interaction
│
├── e2e/                              # End-to-end tests
│   ├── run-multinode-with-miners.test.ts
│   ├── verify-miner-rewards.test.ts
│   ├── fast-block-production.test.ts
│   ├── discontinued-node.test.ts
│   ├── check-container-logs.test.ts
│   ├── consensus/
│   │   └── cut-advancement.test.ts
│   └── infrastructure/
│       ├── compose-generation.test.ts
│       ├── node-lifecycle.test.ts
│       └── configuration.test.ts
│
├── performance/                      # Performance tests
│   ├── benchmarks/
│   │   ├── tps.benchmark.ts
│   │   ├── gas.benchmark.ts
│   │   └── state-growth.benchmark.ts
│   ├── load-tests/
│   │   ├── sustained-load.test.ts
│   │   └── burst-load.test.ts
│   └── reports/                      # Generated benchmark results
│       └── .gitkeep
│
├── edge-cases/                       # Edge case tests
│   ├── transactions.test.ts
│   ├── contracts.test.ts
│   ├── state.test.ts
│   └── security.test.ts
│
├── utils/                            # Test utilities
│   ├── devnet.ts                     # Devnet interaction helpers
│   ├── chainweb.ts                   # Chainweb-specific utilities
│   ├── contracts.ts                  # Contract deployment helpers
│   └── assertions.ts                 # Custom assertions
│
├── fixtures/                         # Test fixtures
│   ├── contracts/                    # Sample contracts
│   ├── transactions/                 # Sample transactions
│   └── proofs/                       # Sample SPV proofs
│
└── scripts/                          # Test automation scripts
    ├── run-benchmarks.sh
    ├── compare-benchmarks.js
    ├── generate-report.js
    └── upload-results.js
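
The layout above lists compatibility-matrix.json for pass/fail tracking but does not define its contents. One possible shape is sketched below. The schema (test name mapped to a status and an optional reason) and the `summarize` helper are assumptions for illustration, not a format defined by this document.

```typescript
// Hypothetical schema for compatibility-matrix.json: an assumption for illustration.
type TestStatus = 'pass' | 'fail' | 'skip';

interface MatrixEntry {
  status: TestStatus;
  // Why a test fails or is skipped, e.g. a documented incompatibility.
  reason?: string;
}

// Keys are test identifiers, e.g. a path inside the cloned ethereum-tests directory.
type CompatibilityMatrix = { [testName: string]: MatrixEntry };

// Count entries per status so CI can report an overall pass rate.
function summarize(matrix: CompatibilityMatrix): Record<TestStatus, number> {
  const counts: Record<TestStatus, number> = { pass: 0, fail: 0, skip: 0 };
  for (const name of Object.keys(matrix)) {
    counts[matrix[name].status] += 1;
  }
  return counts;
}
```

Keeping a `reason` next to each non-passing entry matches the mitigation in section 10.1 of documenting incompatibilities instead of dropping the tests.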

12.2 Test Naming Conventions

File Naming:

Unit tests: component.test.ts
Integration tests: feature.test.ts
E2E tests: scenario.test.ts
Benchmarks: metric.benchmark.ts

Test Structure:

describe('ComponentName', () => {
  describe('methodName', () => {
    it('should do expected behavior when condition', async () => {
      // Arrange
      const input = setupInput();

      // Act
      const result = await methodUnderTest(input);

      // Assert
      expect(result).toBe(expected);
    });

    it('should throw error when invalid input', async () => {
      // Arrange
      const invalidInput = setupInvalidInput();

      // Act & Assert
      await expect(methodUnderTest(invalidInput))
        .rejects.toThrow(/expected error message/);
    });
  });
});

12.3 Common Test Patterns

Pattern 1: Multi-Chain Test Setup

import { runOverChains } from '@kadena/hardhat-chainweb';

describe('Multi-Chain Test', () => {
  it('should work across all EVM chains', async () => {
    await runOverChains(async (chainId) => {
      const provider = getProvider(chainId);
      // Test logic for each chain
    });
  });
});

Pattern 2: Cross-Chain Transfer

async function testCrossChainTransfer(
  sourceChain: number,
  targetChain: number
) {
  // Step 1: Initialize on source
  await switchChain(sourceChain);
  const initTx = await contract.transferCrossChain(/* ... */);
  await initTx.wait();

  // Step 2: Get proof
  const proof = await getProof(initTx.hash, sourceChain);

  // Step 3: Redeem on target
  await switchChain(targetChain);
  const redeemTx = await contract.redeemCrossChain(proof);
  await redeemTx.wait();

  // Step 4: Verify
  // ... assertions
}

Pattern 3: Devnet Health Check

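// Resolve after `ms` milliseconds.
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// `provider` is any client exposing getBlockNumber(), such as an ethers provider.
async function waitForDevnetReady(
  provider: { getBlockNumber(): Promise<number> },
  maxAttempts = 30
) {
  for (let i = 0; i < maxAttempts; i++) {
    try {
      const blockNumber = await provider.getBlockNumber();
      if (blockNumber > 0) return;
    } catch (e) {
      // Node not reachable yet; fall through and retry.
    }
    // Wait between attempts whether the call failed or the chain is still at block 0.
    await sleep(1000);
  }
  throw new Error('Devnet failed to start');
}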

12.4 Glossary

Terms:

SPV Proof: Simple Payment Verification proof, cryptographic proof of an event on a source chain
Cut Height: The sum of the block heights of the blocks in a cut, where a cut contains one block from each chain in Chainweb
Precompile: Built-in smart contract with native implementation
Devnet: Local development network running in Docker
EVM Chains: Chains 20-24 running EVM payload provider
Pact Chains: Chains 0-19 running Pact smart contract execution
Chain ID Offset: Base value added to Chainweb chain ID to get EVM chain ID (sandbox: 1789, testnet: 5920)
BIP-44: Bitcoin Improvement Proposal 44, hierarchical deterministic wallet standard

Abbreviations:

TPS: Transactions Per Second
RPC: Remote Procedure Call
E2E: End-to-End
CI/CD: Continuous Integration / Continuous Deployment
PRD: Product Requirements Document

Document History

Version	Date	Author	Changes
1.0	2025-10-07	AI Assistant	Initial PRD creation based on Asana task analysis

Next Steps:

Review and approve PRD with stakeholders
Create GitHub issues for each phase
Assign team members to implementation tasks
Set up project tracking in Asana/GitHub Projects
Begin Phase 1 implementation
