-
Notifications
You must be signed in to change notification settings - Fork 331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Better canonial history construction. #1659
Closed
evanlinjin
wants to merge
4
commits into
bitcoindevkit:master
from
evanlinjin:feature/unconfirmed-oracle
Closed
Draft: Better canonial history construction. #1659
evanlinjin
wants to merge
4
commits into
bitcoindevkit:master
from
evanlinjin:feature/unconfirmed-oracle
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
We rm `ConfirmationTime` because it is essentially the same thing as `ChainPosition<ConfirmationBlockTime>` without the block hash. We also impl `serde::Deserialize` and `serde::Serialize` for `ChainPosition`.
Previously, the field `TxGraph::anchors` existed because we assumed there was use in having a partially-chronological order of transactions there. Turns out it wasn't used at all. This commit removes anchors from the `txs` field and reworks `anchors` field to be a map of `txid -> set<anchors>`. This is a breaking change since the signature of `all_anchors()` is changed.
Previously, this was a `BTreeMap` which was less performant. Order by `OutPoint` is not important. Test `test_descendants_no_repeat` is fixed. Previously, a tx of the test spent a non-existing prevout, and `tx_a` was not part of the `TxGraph`.
8 tasks
notmandatory
added
module-blockchain
audit
Suggested as result of external code audit
api
A breaking API change
labels
Nov 15, 2024
Replaced by #1670. Although it would be nice to reintroduce this |
evanlinjin
added a commit
that referenced
this pull request
Dec 11, 2024
956d0a9 test(chain): Update test docs to stop referencing `get_chain_position` (志宇) 1508ae3 chore: Fix typos (志宇) af92199 refactor(wallet): Reuse chain position instead of obtaining new one (Jiri Jakes) caa0f13 docs(wallet): Explain `.take` usage (志宇) 1196405 refactor(chain): Reorganize `TxGraph::insert_anchor` logic for clarity (志宇) 4706315 chore(chain): Address `CanonicalIter` nitpicks (志宇) 68f7b77 test(chain): Add canonicalization test (志宇) da0c43e refactor(chain)!: Rename `LastSeenIn` to `ObservedIn` (志宇) d4102b4 perf(chain): add benchmarks for canonicalization logic (志宇) e34024c feat(chain): Derive `Clone` on `IndexedTxGraph` (志宇) e985445 docs: Add ADR for `O(n)` canonicalization algorithm (志宇) 4325e2c test(chain): Add transitive anchor tests (志宇) 8fbee12 feat(chain)!: rm `get_chain_position` and associated methods (志宇) 582d6b5 feat(chain)!: `O(n)` canonicalization algorithm (志宇) f6192a6 feat(chain)!: Add `run_until_finished` methods (志宇) 0aa39f9 feat(chain)!: `TxGraph` contain anchors in one field (志宇) Pull request description: Fixes #1665 Replaces #1659 ### Description Previously, getting the canonical history of transactions/UTXOs required calling `TxGraph::get_chain_position` on each transaction. This was highly inefficient and resulted in an `O(n^2)` algorithm. The situation is especially problematic when we have many unconfirmed conflicts. This PR introduces an `O(n)` algorithm to determine the canonical set of transactions in `TxGraph`. The algorithm's premise is as follows: 1. If transaction `A` is determined to be canonical, all of `A`'s ancestors must also be canonical. 2. If transaction `B` is determined to be NOT canonical, all of `B`'s descendants must also be NOT canonical. 3. If a transaction is anchored in the best chain, it is canonical. 4. If a transaction conflicts with a canonical transaction, it is NOT canonical. 5. A transaction with a higher last-seen has precedence. 6. Last-seen values are transitive. A transaction's collective last-seen value is the max of it's last-seen value and all of it's descendants. We maintain two mutually-exclusive `txid` sets: `canoncial` and `not_canonical`. Imagine a method `mark_canonical(A)` that is based on premise 1 and 2. This method will mark transaction `A` and all of it's ancestors as canonical. For each transaction that is marked canonical, we can iterate all of it's conflicts and mark those as `non_canonical`. If a transaction already exists in `canoncial` or `not_canonical`, we can break early, avoiding duplicate work. This algorithm iterates transactions in 3 runs. 1. Iterate over all transactions with anchors in descending anchor-height order. For any transaction that has an anchor pointing to the best chain, we call `mark_canonical` on it. We iterate in descending-height order to reduce the number of anchors we need to check against the `ChainOracle` (premise 1). The purpose of this run is to populate `non_canonical` with all transactions that directly conflict with anchored transactions and populate `canonical` with all anchored transactions and ancestors of anchors transactions (transitive anchors). 2. Iterate over all transactions with last-seen values, in descending last-seen order. We can call `mark_canonical` on all of these that do not already exist in `canonical` or `not_canonical`. 3. Iterate over remaining transactions that contains anchors (but not in the best chain) and have no last-seen value. We treat these transactions in the same way as we do in run 2. #### Benchmarks Thank you to @ValuedMammal for working on this. ```sh $ cargo bench -p bdk_chain --bench canonicalization ``` Benchmark results (this PR): ``` many_conflicting_unconfirmed::list_canonical_txs time: [709.46 us 710.36 us 711.35 us] many_conflicting_unconfirmed::filter_chain_txouts time: [712.59 us 713.23 us 713.90 us] many_conflicting_unconfirmed::filter_chain_unspents time: [709.95 us 711.16 us 712.45 us] many_chained_unconfirmed::list_canonical_txs time: [2.2604 ms 2.2641 ms 2.2680 ms] many_chained_unconfirmed::filter_chain_txouts time: [3.5763 ms 3.5869 ms 3.5979 ms] many_chained_unconfirmed::filter_chain_unspents time: [3.5540 ms 3.5596 ms 3.5652 ms] nested_conflicts_unconfirmed::list_canonical_txs time: [660.06 us 661.75 us 663.60 us] nested_conflicts_unconfirmed::filter_chain_txouts time: [650.15 us 651.36 us 652.71 us] nested_conflicts_unconfirmed::filter_chain_unspents time: [658.37 us 661.54 us 664.81 us] ``` Benchmark results (master): https://github.com/evanlinjin/bdk/tree/fix/1665-master-bench ``` many_conflicting_unconfirmed::list_canonical_txs time: [94.618 ms 94.966 ms 95.338 ms] many_conflicting_unconfirmed::filter_chain_txouts time: [159.31 ms 159.76 ms 160.22 ms] many_conflicting_unconfirmed::filter_chain_unspents time: [163.29 ms 163.61 ms 163.96 ms] # I gave up running the rest of the benchmarks since they were taking too long. ``` ### Notes to the reviewers * ***PLEASE MERGE #1733 BEFORE THIS PR!*** We had to change the signature of `ChainPosition` to account for transitive anchors and unconfirmed transactions with no `last-seen` value. * The canonicalization algorithm is contained in `/crates/chain/src/canonical_iter.rs`. * Since the algorithm requires traversing transactions ordered by anchor height, and then last-seen values, we introduce two index fields in `TxGraph`; `txs_by_anchor` and `txs_by_last_seen`. Methods `insert_anchor` and `insert_seen_at` are changed to populate these index fields. * An ADR is added: `docs/adr/0003_canonicalization_algorithm.md`. This is based on the work in #1592. ### Changelog notice * Added: Introduce an `O(n)` canonicalization algorithm. This logic is contained in `/crates/chain/src/canonical_iter.rs`. * Added: Indexing fields in `TxGraph`; `txs_by_anchor_height` and `txs_by_last_seen`. Pre-indexing allows us to construct the canonical history more efficiently. * Removed: `TxGraph` methods: `try_get_chain_position` and `get_chain_position`. This is superseded by the new canonicalization algorithm. ### Checklists #### All Submissions: * [x] I've signed all my commits * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md) * [x] I ran `cargo fmt` and `cargo clippy` before committing #### New Features: * [x] I've added tests for the new feature * [x] I've added docs for the new feature #### Bugfixes: * [x] This pull request breaks the existing API * [x] I've added tests to reproduce the issue which are now passing * [x] I'm linking the issue being fixed by this PR ACKs for top commit: ValuedMammal: ACK 956d0a9 nymius: ACK 956d0a9 oleonardolima: utACK 956d0a9 jirijakes: ACK 956d0a9 Tree-SHA512: 44963224abf1aefb3510c59d0eb27e3a572cd16f46106fd92e8da2e6e12f0671dcc1cd5ffdc4cc80683bc9e89fa990eba044d9c64d9ce02abc29a08f4859b69e
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
The way we are determining the canonical history of transactions/outputs in
bdk_chain
is inefficient and inflexible:TxGraph::try_get_chain_position
is costly when we have many conflicting unconfirmed txsO(n^2)
.TxGraph
may be replaced.Wallet::insert_tx
will be replaced by the wallet when creating a new tx. #1642Wallet
should always include alast_seen
. #1644To fix this, I have introduced the following in the PR:
UnconfirmedOracle
is used to determine the maybe canonical transaction between a set of conflicting unconfirmed transactions. It can returnNone
to signal that none of the presented transactions are canonical. This can be used to create different policies as to which transactions to include in the canonical history.CanonicalView
represents a canonical view of transactions. The algorithm to create it iterates from the "roots" of theTxGraph
(roots are either coinbase transactions, or outpoints which we have no residing transaction). Each set of conflicting spends only needs to be iterated over once.Notes to the reviewers
This is a proof of concept. Not sure which parts should be included in
v1.0
. Tests are passing (except of doc tests).Changelog notice
TODO
Checklists
All Submissions:
cargo fmt
andcargo clippy
before committingNew Features:
Bugfixes: