User
Ok, it seems this dramatic change might not be worth the savings, but thanks a lot for the try. Let's go back to the initial code. After all your experiments with re-implementing, size comparisons and the like, do you have very simple/local/minimal-change suggestions for how we can save on BAL size here? These can be very local things: small optimizations, small data format adjustments, whatever you can imagine.
Cursor
Good question. Let me look at the actual old-format BAL structure and find concrete, local optimizations with real measured impact.
Now let me do a deep forensic breakdown of exactly what's eating space in the old RLP format:
Good data. Let me now run the most promising ones across all 10 blocks:
Now let me also check a few more micro-optimizations to be thorough:
Excellent data. Now I have a clear picture. Here's the analysis:
| Component | Raw payload (bytes) | RLP overhead (bytes) | Est. total (bytes) | % of BAL |
|---|---|---|---|---|
| Addresses (670 * 20B) | 13,400 | ~1,700 | ~15,100 | 23% |
| Storage change slots + values | 19,124 | ~6,500 | ~23,400 | 36% |
| Storage reads (slot keys) | 14,422 | ~1,200 | ~15,600 | 24% |
| Balance changes (741 entries) | 5,264 + 1,181 idx | ~2,200 | ~8,700 | 13% |
| Nonce changes (274 entries) | 365 + 421 idx | ~1,000 | ~1,800 | 3% |
| Access-only overhead (5 empty lists * 90) | 450 | - | 450 | 1% |
1. Final-only storage values: flatten [[idx, val], ...] to just val (~7% savings)
Currently each storage slot change stores:
[slot_key, [[blockAccessIdx_1, postValue_1], [blockAccessIdx_2, postValue_2], ...]]
70 out of 481 changed slots have multi-write entries (up to 12 writes per slot). Only the final value matters for state reconstruction. Flattened:
[slot_key, finalValue]
This eliminates the intermediate values and one level of RLP list nesting. That nesting overhead adds up across the 481 changed slots.
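A minimal sketch of the flattening step, assuming the changes are held in memory as plain tuples. The type aliases and the function name `flatten_storage_changes` are made up for illustration and are not the actual client types:

```python
from typing import List, Tuple

# Hypothetical in-memory shapes (assumptions, not the real BAL types):
#   old: (slot_key, [(block_access_index, post_value), ...])
#   new: (slot_key, final_value)
SlotWrites = Tuple[bytes, List[Tuple[int, bytes]]]
SlotFinal = Tuple[bytes, bytes]


def flatten_storage_changes(changes: List[SlotWrites]) -> List[SlotFinal]:
    """Keep only the value written at the highest block access index per slot."""
    flattened = []
    for slot_key, writes in changes:
        if not writes:
            raise ValueError("a changed slot must have at least one write")
        # Do not rely on input ordering; pick the latest write explicitly.
        _, final_value = max(writes, key=lambda write: write[0])
        flattened.append((slot_key, final_value))
    return flattened


example = [
    (b"\x01", [(3, b"\x0a"), (7, b"\x0b"), (12, b"\x0c")]),  # multi-write slot
    (b"\x02", [(5, b"\xff")]),  # single-write slot
]
print(flatten_storage_changes(example))
```

Picking the entry with the highest index explicitly, rather than taking the last list element, keeps the sketch correct even if the writes are not sorted.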
2. Final-only balance/nonce: drop per-tx BlockAccessIndex (~8% savings)
The killer data point: the coinbase has 273 balance entries (one per transaction). That's a single address contributing ~2.5 KiB of redundant intermediate balance values. 11 accounts in total have >1 balance entry. Currently:
balanceChanges: [[idx_1, postBalance_1], [idx_2, postBalance_2], ...]
Simplified to just the final value:
balance: [finalBalance] // optional list: present = changed, empty = unchanged
nonce: [finalNonce] // same
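As a rough sketch of that collapse, assuming each change list is a list of `(block_access_index, post_value)` pairs. The helper name `final_only` and the sample balance values are invented for illustration:

```python
from typing import List, Tuple


def final_only(changes: List[Tuple[int, int]]) -> List[int]:
    """Collapse [(block_access_index, post_value), ...] into an optional list.

    Returns [] when the field was not changed and [final_value] otherwise.
    """
    if not changes:
        return []
    # The entry with the highest block access index holds the post-block value.
    _, final_value = max(changes, key=lambda change: change[0])
    return [final_value]


# Coinbase-like case: one balance entry per transaction (values are made up).
coinbase_balance_changes = [(idx, 1_000 + 10 * idx) for idx in range(1, 274)]

print(len(coinbase_balance_changes), "entries ->", final_only(coinbase_balance_changes))
print("unchanged nonce ->", final_only([]))
```

The 273 per-tx entries collapse to a single-element list, and an untouched field stays an empty list, so "changed" versus "unchanged" remains distinguishable without a separate flag.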
3. Compact access-only addresses (~1% savings, but a nice cleanup)
90 addresses are access-only but currently encoded as [addr, [], [], [], [], []] (27 bytes each). Splitting into a separate flat address list saves 5 bytes per address (~450B total). Not huge, but it also cleans up the structure conceptually.
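To sanity-check the 27-byte figure, here is a small self-contained RLP encoder (a sketch covering only byte strings and lists, not a replacement for a real RLP library). The all-`0x11` address is a placeholder:

```python
from typing import List, Union

RLPItem = Union[bytes, List["RLPItem"]]


def _length_prefix(length: int, offset: int) -> bytes:
    """RLP length prefix: short form below 56 bytes, long form otherwise."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item: RLPItem) -> bytes:
    """Encode a byte string or a (nested) list of byte strings as RLP."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single low byte is its own encoding
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(child) for child in item)
    return _length_prefix(len(payload), 0xC0) + payload


address = bytes([0x11]) * 20  # placeholder 20-byte address
access_only_entry = rlp_encode([address, [], [], [], [], []])
empty_fields = 5 * len(rlp_encode([]))  # each empty list is the single byte 0xc0

print(len(access_only_entry))  # 27
print(empty_fields, 90 * empty_fields)  # 5 450
```

The five empty lists account for the 5 bytes per address quoted above. How much more a separate list saves depends on the exact replacement layout, which is left open here.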
Results across all 10 blocks (sizes in bytes):
| Block | Baseline | Opt 2 only (bal/nonce) | All final (storage + bal/nonce) | Savings (all final) |
|---|---|---|---|---|
| 24476000 | 65,205 | 59,814 (-8.3%) | 55,692 | -14.6% |
| 24476001 | 104,042 | 95,892 (-7.8%) | 88,036 | -15.4% |
| 24476002 | 60,081 | 54,683 (-9.0%) | 51,425 | -14.4% |
| 24476003 | 63,650 | 58,636 (-7.9%) | 53,068 | -16.6% |
| 24476004 | 86,089 | 80,154 (-6.9%) | 74,068 | -14.0% |
| 24476005 | 27,805 | 25,799 (-7.2%) | 23,537 | -15.4% |
| 24476006 | 173,672 | 166,019 (-4.4%) | 152,337 | -12.3% |
| 24476007 | 8,198 | 7,541 (-8.0%) | 7,365 | -10.2% |
| 24476008 | 110,080 | 98,952 (-10.1%) | 92,404 | -16.1% |
| 24476009 | 167,880 | 160,643 (-4.3%) | 144,470 | -13.9% |
Savings range from 10.2% to 16.6% per block, about 14% in aggregate over the 10 blocks, without changing the overall shape of the BAL: the only change is flattening the per-tx nesting to final values.
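The savings figures can be recomputed from the table with a few lines. The numbers are copied from the baseline and all-final columns; the variable names are arbitrary:

```python
# (baseline_bytes, all_final_bytes) per block, copied from the table above.
sizes = {
    24476000: (65_205, 55_692),
    24476001: (104_042, 88_036),
    24476002: (60_081, 51_425),
    24476003: (63_650, 53_068),
    24476004: (86_089, 74_068),
    24476005: (27_805, 23_537),
    24476006: (173_672, 152_337),
    24476007: (8_198, 7_365),
    24476008: (110_080, 92_404),
    24476009: (167_880, 144_470),
}

# Per-block savings in percent of the baseline size.
savings = {
    block: 100 * (baseline - final) / baseline
    for block, (baseline, final) in sizes.items()
}

# Aggregate savings weight each block by its size instead of averaging percentages.
total_baseline = sum(baseline for baseline, _ in sizes.values())
total_final = sum(final for _, final in sizes.values())
aggregate = 100 * (total_baseline - total_final) / total_baseline

for block, pct in savings.items():
    print(f"{block}: -{pct:.1f}%")
print(f"aggregate: -{aggregate:.1f}%")
```

The aggregate figure is weighted by block size, so the two large blocks (24476006 and 24476009), which save less in relative terms, pull it slightly below the typical per-block value.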
Ideas I checked that do not pay off:
- Merging storageReads into storageChanges: this actually increases size by ~1%, because storageReads are currently bare slot keys (no value), and merging them into the change structure adds value-field overhead (even if the value is empty).
- Indexed/deduplicated address encoding: Addresses only appear once per account anyway, so there's nothing to deduplicate in the old format.
- Dropping precompiles: ~7 addresses * 27B = ~189B. Noise level.
The trade-off: optimizations 1 and 2 give up per-tx change attribution (knowing which transaction caused each intermediate state change). The BlockAccessIndex and the per-tx storage values would be gone. Optimization 3 only changes how access-only addresses are laid out. You keep:
- Which addresses/slots were accessed (unchanged)
- The final post-block value for each changed field (sufficient for state reconstruction)
You lose: the ability to replay intermediate states from the BAL alone. If a consumer needs that, they'd need to re-execute transactions. For parallel execution hinting and state reconstruction (the two primary BAL use cases), the final values suffice.