Review club `bdk` #2038

Topic: Replace CanonicalIter with sans-IO CanonicalizationTask
Meeting: 2026-01-29 at 13:00 UTC on Discord
PR: bitcoindevkit/bdk#2038
Author: @oleonardolima

Commits

997a460d - refactor(chain)!: replace CanonicalIter with sans-io CanonicalizationTask
cfbcdbbf - refactor(chain)!: migrate from CanonicalIter to CanonicalizationTask
2cf6fd8b - refactor(chain)!: complete removal of canonical_iter module
c1717060 - refactor(core,chain)!: extract generic ChainQuery trait from CanonicalizationTask
430df1c5 - refactor(chain)!: generalize ChainQuery trait with generic type
dfb14617 - refactor(chain): use single queue for anchored txs in canonicalization
af396fbe - refactor(chain): restructure canonicalization with staged processing

Context and Background

Currently, canonicalizing the transaction graph requires frequent queries to a chain oracle to determine which anchor blocks are in the active chain. This is necessary because, by design, TxGraph may hold stale or conflicting data, some of which only a chain oracle can resolve by acting as the source of truth for on-chain data. In the current implementation the chain oracle is tightly coupled to the rest of the canonicalization logic (see, for example, the definition of TxGraph::list_canonical_txs), which is less ergonomic and may hurt performance where network I/O is the bottleneck.

PR #2038 refactors the canonicalization system to separate business logic from I/O through a task-based approach, effectively inverting the dependency of TxGraph on the ChainOracle. The TxGraph now describes the work to be done as a CanonicalizationTask, which the chain oracle executes by answering queries and driving the process to completion, producing a complete view of canonical transactions. The goals of this work are (1) to achieve greater separation of concerns and (2) to enable batching of requests for on-chain data.
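
To make the task-based flow concrete, here is a minimal sans-IO sketch in plain Rust with no bdk dependency. Everything in it is hypothetical and chosen for illustration: the Query trait and its method names (next_query, resolve_query, finish), ToyTask, ToyChain and the (height, hash) BlockId tuple are not taken from the PR, and the real CanonicalizationTask and ChainQuery may look quite different. The point is only the shape of the control flow: the task owns the logic and emits batches of candidate anchors, while the oracle owns the data and answers them.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical stand-in for a block identifier: (height, hash).
type BlockId = (u32, &'static str);

/// Hypothetical sans-IO trait: the task asks questions, the caller answers them.
trait Query {
    type Request;
    type Response;
    type Output;
    /// Next batch of work, or `None` once nothing is left to ask.
    fn next_query(&mut self) -> Option<Self::Request>;
    /// Feed the answer to the most recent request back into the task.
    fn resolve_query(&mut self, response: Self::Response);
    /// Consume the task and produce the final result.
    fn finish(self) -> Self::Output;
}

/// Toy task: decide which transactions have an anchor in the best chain.
struct ToyTask {
    /// (txid, candidate anchor blocks) still waiting for a verdict.
    pending: VecDeque<(&'static str, Vec<BlockId>)>,
    /// The transaction whose anchors went out in the last request.
    in_flight: Option<&'static str>,
    canonical: BTreeSet<&'static str>,
}

impl Query for ToyTask {
    type Request = Vec<BlockId>;
    type Response = Option<BlockId>;
    type Output = BTreeSet<&'static str>;

    fn next_query(&mut self) -> Option<Vec<BlockId>> {
        let (txid, anchors) = self.pending.pop_front()?;
        self.in_flight = Some(txid);
        // All anchors of one transaction are sent as a single batch.
        Some(anchors)
    }

    fn resolve_query(&mut self, response: Option<BlockId>) {
        // The tx is canonical only if the oracle confirmed one of its anchors.
        if let (Some(txid), Some(_)) = (self.in_flight.take(), response) {
            self.canonical.insert(txid);
        }
    }

    fn finish(self) -> BTreeSet<&'static str> {
        self.canonical
    }
}

/// Toy in-memory oracle: the best chain as height -> hash.
struct ToyChain(BTreeMap<u32, &'static str>);

impl ToyChain {
    /// Drive any task whose requests are anchor batches to completion.
    fn run<T>(&self, mut task: T) -> T::Output
    where
        T: Query<Request = Vec<BlockId>, Response = Option<BlockId>>,
    {
        while let Some(anchors) = task.next_query() {
            // Answer with the first anchor that is actually in the best chain.
            let hit = anchors
                .into_iter()
                .find(|(height, hash)| self.0.get(height) == Some(hash));
            task.resolve_query(hit);
        }
        task.finish()
    }
}

/// Step 1: build the task. Step 2: let the oracle execute it.
fn demo() -> BTreeSet<&'static str> {
    let chain = ToyChain(BTreeMap::from([(100, "aa"), (101, "bb")]));
    let task = ToyTask {
        pending: VecDeque::from([
            ("tx1", vec![(100, "aa")]),
            ("tx2", vec![(101, "stale"), (101, "bb")]),
            ("tx3", vec![(102, "cc")]),
        ]),
        in_flight: None,
        canonical: BTreeSet::new(),
    };
    chain.run(task)
}
```

In this sketch the task never touches the chain data directly, so the same ToyTask could be driven by a local map, a mock, or a remote backend that resolves each batch in one round trip. Here demo() treats tx1 and tx2 as canonical and leaves out tx3, whose only anchor is not in the toy chain.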

API changes

Removed CanonicalIter type and canonical_iter module
Removed TxGraph::canonical_iter and TxGraph::canonical_view methods
Changed visibility of CanonicalView::new to pub(crate)
Added TxGraph::canonicalization_task method
Added LocalChain::canonicalize method
Added CanonicalizationTask, ChainRequest, ChainResponse types
Added ChainQuery trait
Moved ObservedIn, CanonicalReason, CanonicalizationParams from canonical_iter to canonical_task module

This is a major breaking change affecting bdk_chain 0.24.0. Users will need to:

Replace list_canonical_txs() and canonical_view() calls with the two-step task creation and execution pattern
Update imports from canonical_iter to canonical_task module
Adapt any code relying on CanonicalIter

References

Draft PR #2046 showing how different chain oracles can execute a CanonicalizationTask
PR #2029 that introduced the CanonicalView

Review Questions

Concepts and Approach

How do you feel about the PR (ACK, NACK, etc)? What was your review process?
What are the main benefits of separating I/O operations from business logic in the canonicalization process? How does this improve flexibility?
The PR introduces a generic ChainQuery trait in bdk_core. What's the reasoning behind making this generic and reusable beyond canonicalization? What other blockchain query operations might benefit from this pattern?
This is a major breaking change affecting the public API. Was there consideration for providing a compatibility layer or gradual migration path?
How does batching anchor queries improve performance? Are there any scenarios where the new approach might perform worse than the old iterator-based method?
The previous design was built around the iterator-based CanonicalIter. How does the new task-based approach differ?
Can you think of any potential follow-up work that might complement or improve on the current state of the PR?

Implementation

Task Architecture

Looking at the task implementation, how does the staged processing work? What is the purpose of the CanonicalStage enum?
How does the new system track which transactions have been processed, and how does this compare to the old approach? (e.g. direct_anchors, unprocessed_transitively_anchored_txs)
Why did the implementation move away from iterator-based processing of unprocessed_anchored_txs to VecDeque? Are there advantages or disadvantages?
How does the new system handle transitively-anchored transactions (ancestors of explicitly anchored transactions)? Is this handled differently than in the old system?

Request/Response Pattern

In the context of the CanonicalizationTask, what is the "request" made up of, and how is it created? What does the data in the "response" represent semantically?
How does the implementation ensure that responses are correctly matched to their requests?
How does error handling work in the new request/response pattern? What happens if a chain query fails partway through processing?

ChainOracle Integration

The chain_tip parameter was moved from the canonicalize() method into the request structure. What's the reasoning behind this change?
How should a chain oracle handle a single request containing multiple anchor blocks?
The canonicalize() method is now generic over the ChainQuery trait. What are the benefits of this design choice?

Testing

One stated benefit is the ability to test with mock responses. How would you write a test for the canonicalization logic without a real blockchain?
Are there integration tests that verify the new system produces the same results as the old system for the same inputs?
What is the potential impact on performance of the new system? How would you benchmark the performance compared to the old system?

Code Organization

The PR moves types from canonical_iter to canonical_task module and completely removes the old module. What's the migration path for existing code using these types?
CanonicalView::new() is now pub(crate), so it can no longer be called from outside the crate. How do consumers create canonical views in the new system?

ChainQuery

What is the ChainQuery trait used for? Who are the implementors of the trait?
Why do ChainRequest<B> and ChainResponse<B> share the same type parameter? Can you think of realistic blockchain queries where the request and response would need different type parameters?
If ChainQuery is a solution for a single use case (scanning the blockchain for requested block IDs), should it have a name that reflects that singular purpose, e.g. BlockQuery?
What is meant by "dependency inversion"? What is the dependency that's being inverted?
What is meant by "architectural decoupling" and how is this achieved?
