@110CodingP
Forked from ValuedMammal/review-club-2038.md
Created February 3, 2026 16:06

Review club bdk #2038

Topic: Replace CanonicalIter with sans-IO CanonicalizationTask
Meeting: 2026-01-29 at 13:00 UTC on Discord
PR: bitcoindevkit/bdk#2038
Author: @oleonardolima

Commits

  • 997a460d - refactor(chain)!: replace CanonicalIter with sans-io CanonicalizationTask
  • cfbcdbbf - refactor(chain)!: migrate from CanonicalIter to CanonicalizationTask
  • 2cf6fd8b - refactor(chain)!: complete removal of canonical_iter module
  • c1717060 - refactor(core,chain)!: extract generic ChainQuery trait from CanonicalizationTask
  • 430df1c5 - refactor(chain)!: generalize ChainQuery trait with generic type
  • dfb14617 - refactor(chain): use single queue for anchored txs in canonicalization
  • af396fbe - refactor(chain): restructure canonicalization with staged processing

Context and Background

Right now, canonicalizing the transaction graph requires frequent queries to a chain oracle to discover which anchor blocks exist in the current active chain. This is necessary because, by design, the TxGraph structure allows stale or conflicting data, some of which only a chain oracle can resolve by acting as the source of truth for on-chain data. In the current implementation, the chain oracle is tightly coupled with the rest of the canonicalization logic (for example, see the definition of TxGraph::list_canonical_txs), which is less ergonomic and may hurt performance where network I/O is a bottleneck.

PR #2038 refactors the canonicalization system by separating business logic from I/O operations through a task-based approach, effectively inverting the dependency of TxGraph on the ChainOracle. The TxGraph now organizes the work to be done as a CanonicalizationTask, which the chain oracle executes by answering the task's queries and driving the process to completion, producing a complete view of canonical transactions. The goals of this work are (1) to achieve greater separation of concerns and (2) to enable batching of requests for on-chain data.
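
The task-based split described above can be sketched in a few lines. This is a minimal illustration of the sans-IO idea only, not the PR's code: `Task`, `Request`, `ChainSource`, `InMemoryChain`, and `run` are all hypothetical names, and a block is reduced to a `(height, label)` tuple. The task is a pure state machine that emits requests and consumes responses; only the chain source touches "I/O".

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a block identifier: (height, label).
type Block = (u32, &'static str);

/// A batch of candidate anchor blocks for one transaction (illustrative shape).
#[derive(Debug, Clone, PartialEq)]
struct Request {
    txid: u32,
    candidates: Vec<Block>,
}

/// Pure state machine: it never performs I/O itself.
struct Task {
    pending: VecDeque<Request>,
    canonical: Vec<u32>,
}

impl Task {
    fn new(requests: Vec<Request>) -> Self {
        Self { pending: requests.into(), canonical: Vec::new() }
    }

    /// The next question for the chain source, or `None` when the work is done.
    fn next_request(&self) -> Option<Request> {
        self.pending.front().cloned()
    }

    /// Feed back the answer to the request at the front of the queue.
    fn resolve(&mut self, confirmed_in: Option<Block>) {
        if let Some(req) = self.pending.pop_front() {
            if confirmed_in.is_some() {
                self.canonical.push(req.txid);
            }
        }
    }

    fn finish(self) -> Vec<u32> {
        self.canonical
    }
}

/// The I/O side: anything that can say which candidate is in the best chain.
trait ChainSource {
    fn first_in_best_chain(&self, candidates: &[Block]) -> Option<Block>;
}

/// Toy chain source backed by a set of best-chain blocks.
struct InMemoryChain(HashSet<Block>);

impl ChainSource for InMemoryChain {
    fn first_in_best_chain(&self, candidates: &[Block]) -> Option<Block> {
        candidates.iter().copied().find(|b| self.0.contains(b))
    }
}

/// Driver loop: the chain source executes the task to completion.
fn run(mut task: Task, chain: &impl ChainSource) -> Vec<u32> {
    while let Some(req) = task.next_request() {
        let answer = chain.first_in_best_chain(&req.candidates);
        task.resolve(answer);
    }
    task.finish()
}
```

Because `Task` only sees requests and responses, the same task could in principle be driven by a local chain, a remote client, or canned test data, which is the flexibility the PR description is aiming for.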

API changes

  • Removed CanonicalIter type and canonical_iter module
  • Removed TxGraph::canonical_iter and TxGraph::canonical_view methods
  • Changed visibility of CanonicalView::new to pub(crate)
  • Added TxGraph::canonicalization_task method
  • Added LocalChain::canonicalize method
  • Added CanonicalizationTask, ChainRequest, ChainResponse types
  • Added ChainQuery trait
  • Moved ObservedIn, CanonicalReason, CanonicalizationParams from canonical_iter to canonical_task module

This is a major breaking change affecting bdk_chain 0.24.0. Users will need to:

  1. Replace list_canonical_txs() and canonical_view() calls with the two-step task creation and execution pattern
  2. Update imports from canonical_iter to canonical_task module
  3. Adapt any code relying on CanonicalIter

References

  • Draft PR #2046 showing how different chain oracles can execute a CanonicalizationTask
  • PR #2029 that introduced the CanonicalView

Review Questions

Concepts and Approach

  1. How do you feel about the PR (ACK, NACK, etc)? What was your review process?
  2. What are the main benefits of separating I/O operations from business logic in the canonicalization process? How does this improve flexibility?
  3. The PR introduces a generic ChainQuery trait in bdk_core. What's the reasoning behind making this generic and reusable beyond canonicalization? What other blockchain query operations might benefit from this pattern?
  4. This is a major breaking change affecting the public API. Was there consideration for providing a compatibility layer or gradual migration path?
  5. How does batching anchor queries improve performance? Are there any scenarios where the new approach might perform worse than the old iterator-based method?
  6. The previous design, based on CanonicalIter, used an iterator-based approach. How does the new task-based approach differ?
  7. Can you think of any potential follow-up work that might complement or improve on the current state of the PR?

Implementation

Task Architecture

  1. Looking at the task implementation, how does the staged processing work? What is the purpose of the CanonicalStage enum?
  2. How does the new system track which transactions have been processed, and how does this compare to the old approach? (e.g. direct_anchors, unprocessed_transitively_anchored_txs)
  3. Why did the implementation move away from iterator-based processing of unprocessed_anchored_txs to a VecDeque? Are there advantages or disadvantages?
  4. How does the new system handle transitively-anchored transactions (ancestors of explicitly anchored transactions)? Is this handled differently than in the old system?
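
As a starting point for the questions above, here is a rough sketch of what "staged processing" over queues can look like. It is an assumption-laden toy: `SketchStage`, `StagedTask`, and the two queues are hypothetical and do not mirror the PR's `CanonicalStage` variants or fields. Each call to `step` either processes one item from the current stage's `VecDeque` or advances to the next stage when that queue is empty.

```rust
use std::collections::VecDeque;

/// Hypothetical stages; the variant names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchStage {
    Anchored,
    Unconfirmed,
    Finished,
}

struct StagedTask {
    stage: SketchStage,
    anchored: VecDeque<u32>,
    unconfirmed: VecDeque<u32>,
    /// Records which stage handled which (toy) txid, in processing order.
    processed: Vec<(SketchStage, u32)>,
}

impl StagedTask {
    fn new(anchored: Vec<u32>, unconfirmed: Vec<u32>) -> Self {
        Self {
            stage: SketchStage::Anchored,
            anchored: anchored.into(),
            unconfirmed: unconfirmed.into(),
            processed: Vec::new(),
        }
    }

    /// Process one item from the current stage's queue, or move to the next
    /// stage if the queue is empty. Returns `false` once finished.
    fn step(&mut self) -> bool {
        match self.stage {
            SketchStage::Anchored => match self.anchored.pop_front() {
                Some(txid) => self.processed.push((SketchStage::Anchored, txid)),
                None => self.stage = SketchStage::Unconfirmed,
            },
            SketchStage::Unconfirmed => match self.unconfirmed.pop_front() {
                Some(txid) => self.processed.push((SketchStage::Unconfirmed, txid)),
                None => self.stage = SketchStage::Finished,
            },
            SketchStage::Finished => return false,
        }
        true
    }
}
```

One design point this makes visible: a queue owned by the task can be popped across many `step` calls without holding a borrow on the graph, whereas a borrowed iterator ties the task's lifetime to its source.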

Request/Response Pattern

  1. In the context of the CanonicalizationTask what is the "request" made up of, and how is it created? What does the data in the "response" represent semantically?
  2. How does the implementation ensure that responses are correctly matched to their requests?
  3. How does error handling work in the new request/response pattern? What happens if a chain query fails partway through processing?

ChainOracle Integration

  1. The chain_tip parameter was moved from the canonicalize() method into the request structure. What's the reasoning behind this change?
  2. How should a chain oracle handle a single request containing multiple anchor blocks?
  3. This method is now generic over the ChainQuery trait. What are the benefits of this design choice?
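
To make the second question concrete, the sketch below shows one plausible way an oracle could answer a single request that carries several candidate anchor blocks together with the chain tip. Everything here is hypothetical (`BlockId` as a height plus a `u64` hash, `BatchRequest`, `ToyChain`, and the "first candidate in the best chain wins" rule); it is meant as a discussion aid, not as the behavior of `LocalChain::canonicalize`.

```rust
use std::collections::BTreeMap;

/// Hypothetical block identifier; the PR is generic over a block type `B`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockId {
    height: u32,
    hash: u64,
}

/// Illustrative request shape: the tip travels with the batch of candidates.
struct BatchRequest {
    chain_tip: BlockId,
    block_ids: Vec<BlockId>,
}

/// Toy oracle: best-chain hash for each height.
struct ToyChain {
    blocks: BTreeMap<u32, u64>,
}

impl ToyChain {
    /// `Some(bool)` when the oracle can decide, `None` when the tip is
    /// unknown to this oracle.
    fn is_in_chain(&self, block: BlockId, tip: BlockId) -> Option<bool> {
        if self.blocks.get(&tip.height) != Some(&tip.hash) {
            return None;
        }
        if block.height > tip.height {
            return Some(false);
        }
        Some(self.blocks.get(&block.height) == Some(&block.hash))
    }

    /// Answer a whole batch: the first candidate found in the best chain.
    fn answer(&self, req: &BatchRequest) -> Option<BlockId> {
        req.block_ids
            .iter()
            .copied()
            .find(|b| self.is_in_chain(*b, req.chain_tip) == Some(true))
    }
}
```

Carrying `chain_tip` inside the request means every answer in a batch is evaluated against the same tip, even if the oracle's own view advances between requests.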

Testing

  1. One stated benefit is the ability to test with mock responses. How would you write a test for the canonicalization logic without a real blockchain?
  2. Are there integration tests that verify the new system produces the same results as the old system for the same inputs?
  3. What is the potential impact on performance of the new system? How would you benchmark the performance compared to the old system?

Code Organization

  1. The PR moves types from canonical_iter to canonical_task module and completely removes the old module. What's the migration path for existing code using these types?
  2. The CanonicalView::new() constructor is no longer public (it is now pub(crate)). How do consumers create canonical views in the new system?

ChainQuery

  1. What is the ChainQuery trait used for? Who are the implementors of the trait?
  2. Why do ChainRequest<B> and ChainResponse<B> share the same type parameter? Can you think of realistic blockchain queries where the request and response would need different type parameters?
  3. If ChainQuery is a solution for a single use case (scanning the blockchain for requested block IDs), should it have a name that reflects that single purpose, e.g. BlockQuery?
  4. What is meant by "dependency inversion"? What is the dependency that's being inverted?
  5. What is meant by "architectural decoupling" and how is this achieved?