-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cancel back outgoing dust htlcs before commitment is confirmed. #9068
base: master
Are you sure you want to change the base?
Cancel back outgoing dust htlcs before commitment is confirmed. #9068
Conversation
Important Review skippedAuto reviews are limited to specific labels. Labels to auto review (1)
Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe changes enhance the handling of canceled Hash Time-Locked Contracts (HTLCs) within the contract arbitration process. New methods for inserting and fetching canceled HTLCs are added to the Changes
Assessment against linked issues
Possibly related PRs
Suggested labels
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
923c6b0
to
fa7a925
Compare
fa7a925
to
4f96e28
Compare
Hi reviewers I am not happy about the following in this PR maybe someone has a nice idea how to make it more clean: So right now when we locally force close the channel we would fail the dust 2 times, meaning that the second time will cause the log error saying the closeCircuit is already gone. This is currently needed because we need to cancel dust even if the force-close is not initiated by us or the force-close is initiated by us but not by LND, broadcasting the force-close via some other means. This would right now cause some annying log entry similar to: Example:
Probably we should make the extra work and remove the outgoing htlc from the commitSet as soon as we cancel the incoming back. Will investigate. |
4f96e28
to
9aca3ec
Compare
We could query the circuitMap and not attempt the cancelling of the incoming htlc but maybe just failing it and hitting the error is as efficient ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactors! My main comment is we could cancel the dust even earlier, once we've decided there are chain actions to be taken here,
lnd/contractcourt/channel_arbitrator.go
Lines 955 to 960 in edd9ade
if len(chainActions) == 0 && trigger == chainTrigger { | |
log.Debugf("ChannelArbitrator(%v): no actions for "+ | |
"chain trigger, terminating", c.cfg.ChanPoint) | |
return StateDefault, closeTx, nil | |
} |
The current design may end up calling the canceling logic twice, as indicated from this state transition diagram,
lnd/contractcourt/briefcase.go
Lines 139 to 183 in edd9ade
// StateDefault | |
// | | |
// |-> StateDefault: no actions and chain trigger | |
// | | |
// |-> StateBroadcastCommit: chain/user trigger | |
// | | | |
// | |-> StateCommitmentBroadcasted: chain/user trigger | |
// | | | | |
// | | |-> StateCommitmentBroadcasted: chain/user trigger | |
// | | | | |
// | | |-> StateContractClosed: local/remote/breach close trigger | |
// | | | | | |
// | | | |-> StateWaitingFullResolution: contract resolutions not empty | |
// | | | | | | |
// | | | | |-> StateWaitingFullResolution: contract resolutions not empty | |
// | | | | | | |
// | | | | |-> StateFullyResolved: contract resolutions empty | |
// | | | | | |
// | | | |-> StateFullyResolved: contract resolutions empty | |
// | | | | |
// | | |-> StateFullyResolved: coop/breach(legacy) close trigger | |
// | | | |
// | |-> StateContractClosed: local/remote/breach close trigger | |
// | | | | |
// | | |-> StateWaitingFullResolution: contract resolutions not empty | |
// | | | | | |
// | | | |-> StateWaitingFullResolution: contract resolutions not empty | |
// | | | | | |
// | | | |-> StateFullyResolved: contract resolutions empty | |
// | | | | |
// | | |-> StateFullyResolved: contract resolutions empty | |
// | | | |
// | |-> StateFullyResolved: coop/breach(legacy) close trigger | |
// | | |
// |-> StateContractClosed: local/remote/breach close trigger | |
// | | | |
// | |-> StateWaitingFullResolution: contract resolutions not empty | |
// | | | | |
// | | |-> StateWaitingFullResolution: contract resolutions not empty | |
// | | | | |
// | | |-> StateFullyResolved: contract resolutions empty | |
// | | | |
// | |-> StateFullyResolved: contract resolutions empty | |
// | | |
// |-> StateFullyResolved: coop/breach(legacy) close trigger. |
12fb4ee
to
67cd269
Compare
Updated the commit-structure and also the code design quite a bit. Needed to persist the canceled htlcs in the This PR is also created in a way so we can now add the canceling of non-dust htlcs as well in a followup PR. I think might still be missing some unit-tests, need to dig into all the tests in the contractcourt, but lets see whether this new approach is the way we wanna go before adding more tests. |
@coderabbitai review |
Actions performedReview triggered.
|
So currently the db recording is just used to avoid ERR loglines which might be misleading. But for the non-dust htlc case we definitely need it. So maybe we accept these log lines and push the db recording when we introduce the non-dust fail-back? |
ae6ff2d
to
b41b357
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the db recording but rather expect an error in the circuit mapper when a dust htlcs are resolved.
b41b357
to
3e9c95f
Compare
3e9c95f
to
cf231df
Compare
d74289e
to
3fb27ee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good - had a small suggestion on how we could avoid duplication in canceling the dust.
b78429e
to
0ed9f19
Compare
Refactor the part where we are failing back the incoming htlc when the channel of the corresponding outgoing htlc is force closed.
We distinguish between dangling and dust htlcs. This does not change any logic but only introduces new types to later act on them differently when we begin to fail dust htlcs earlier in a later commit.
We will now cancel dust htlcs on the local/remote commits after we decided to go onchain. This can be done because dust cannot be enforced onchain and therefore there is no way to also reveil the preimage onchain.
Now that we cancel dust htlcs prematurely even before the commitment tx is confirmed we don't consider dust htlcs when creating the cpfp transaction.
Now outgoing dust-htlcs are canceled back before the commitment is confirmed onchain.
0ed9f19
to
e25afc2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just have one final question
@@ -682,7 +682,7 @@ func (c *ChannelArbitrator) relaunchResolvers(commitSet *CommitSet, | |||
// chain actions may exclude some information, but we cannot recover it | |||
// for these older nodes at the moment. | |||
var confirmedHTLCs []channeldb.HTLC | |||
if commitSet != nil { | |||
if commitSet != nil && commitSet.ConfCommitKey != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice - curious tho, have you run into a panic before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No but I ran into the one in constructChainActions
which was triggered by TestChannelArbitratorPersistence
which writes an empty set to disc which will cause commitSet.ConfCommitKey
to be nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this one of those situations where we are using pointers to simulate optional behavior or is this a pointer because of memory profiling characteristics?
// no confCommitSet. However when the remote commitment confirms | ||
// without us ever broadcasting our local commitment we need to | ||
// make sure we cancel all outgoing dust HTLCs as well. | ||
if confCommitSet == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: will there be a case when we does local force close but this is not nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, this is only needed in case the remote confirms so we need to look up the remote HTLC set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for the review 🙏
// no confCommitSet. However when the remote commitment confirms | ||
// without us ever broadcasting our local commitment we need to | ||
// make sure we cancel all outgoing dust HTLCs as well. | ||
if confCommitSet == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, this is only needed in case the remote confirms so we need to look up the remote HTLC set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some style nits to address, waiting for keags review to push them
HtlcFailNowAction = 3 | ||
// HtlcFailDustAction indicates that we should fail an outgoing dust | ||
// HTLC immediately (even before the commitment transaction is | ||
// confirmed in case we (local) broadcast the commitment tx) by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: in case we broadcast the local commitment tx
return requests, nil | ||
} | ||
|
||
// resolveBreachedHTLCs resolves all HTLCs that are breached transaction was |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: when a breached transaction was detected.
} | ||
|
||
// cancelIncomingHTLCs cancels back the incoming HTLCs for the corresponding | ||
// outgoing HTLC. We use a set here to avoid sending duplicate failure messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style nit: HTLCs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think I understand the approach but my main first-round feedback is that we need to tighten up terminology. I wish this wasn't the case but it impedes my ability to understand the details of the rest of the PR and makes it pretty difficult for me to tell whether or not the code is doing the right thing.
Another thing to note is that I'm somewhat less experienced with the contractcourt package and so I'm less able to bring expertise into the review and give charitable interpretations to the things I encounter.
Overall I think the bias towards extracting out common functionality into helper functions is very good and I think I understand the high level approach here. Nothing stands out to me as obviously wrong but I can't comment very well on whether this change is complete with respect to doing all of the things it needs to to safely accomplish its goal.
Hopefully future review rounds will help both in tightening the implementation as well as me taking some time out of band to better familiarize myself with the overall structure.
@@ -682,7 +682,7 @@ func (c *ChannelArbitrator) relaunchResolvers(commitSet *CommitSet, | |||
// chain actions may exclude some information, but we cannot recover it | |||
// for these older nodes at the moment. | |||
var confirmedHTLCs []channeldb.HTLC | |||
if commitSet != nil { | |||
if commitSet != nil && commitSet.ConfCommitKey != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this one of those situations where we are using pointers to simulate optional behavior or is this a pointer because of memory profiling characteristics?
// Send the msg to the switch. | ||
if len(msgsToSend) == 0 { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment here is wrong. However, this seems unnecessary in the first place, right? Even if it has a zero length you can still send that to the DeliverResolutionMsg no?
// resolveBreachedHTLCs resolves all HTLCs that are breached transaction was | ||
// detected. Resovling here means that it fails back the corresponding incoming | ||
// HTLCs for a given outgoing HTLC on the current remote commitment set | ||
// (including the remote pending commitment set). | ||
func (c *ChannelArbitrator) resolveBreachedHTLCs(commitSet CommitSet) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this isn't doing resolution beyond the incoming cancellations.
// resolveHTLCsNow resolves all HTLCs which can be acted on immediately after | ||
// the commitment is confirmed. This includes HTLCs on the remote pending | ||
// commitment whose corresponding incoming HTLCs can now be failed back and | ||
// incoming dust HTLCs which can be marked as failed because they are not part | ||
// of the commitment transaction. | ||
func (c *ChannelArbitrator) resolveHTLCsNow(htlcActions ChainActionMap) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think we need to come up with better names for these functions. The main nudge I want here is, why do we have the word "Now" in here? Is there a corresponding "Later". When the function completes are all of these HTLCs fully resolved? It also seems like this only resolves two of seven (or six if we exclude "no action") actions in the map so it doesn't seem to operate over the entire map. I think It may be a good idea to break this function up on a per action basis and clearly describe what that resolution is.
@@ -1632,7 +1632,7 @@ out: | |||
defer s.wg.Done() | |||
|
|||
if err := s.FlushForwardingEvents(); err != nil { | |||
log.Errorf("unable to flush "+ | |||
log.Errorf("Unable to flush "+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't capitalize error messages. Idk why but this change is unnecessary.
dustHTLCSet := fn.SliceToMap(dustHTLCs, | ||
func(htlc channeldb.HTLC) uint64 { | ||
return htlc.HtlcIndex | ||
}, | ||
func(htlc channeldb.HTLC) struct{} { | ||
return struct{}{} | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't do this. Just do
getIdx := func(htlc channeldb.HTLC) uint64 {
return htlc.HtlcIndex
}
dustHTLCSet := fn.NewSet(fn.Map(getIdx, dustHTLCs)...)
// We expect an immediate resolution message for the outgoing dust htlc. | ||
// It is not resolvable on-chain and it can be canceled back even before | ||
// the commitment transaction confirmed. | ||
select { | ||
case msgs := <-chanArbCtx.resolutions: | ||
if len(msgs) != 1 { | ||
t.Fatalf("expected 1 message, instead got %v", | ||
len(msgs)) | ||
} | ||
|
||
if msgs[0].HtlcIndex != outgoingDustHtlc.HtlcIndex { | ||
t.Fatalf("wrong htlc index: expected %v, got %v", | ||
outgoingDustHtlc.HtlcIndex, msgs[0].HtlcIndex) | ||
} | ||
case <-time.After(defaultTimeout): | ||
t.Fatalf("resolution msgs not sent") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🫡
// createSweepRequest creates an anchor sweeping request for a particular | ||
// version (local/remote/remote dangling) of the commitment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another terminology nitpick. What is a remote dangling anchor? Are we referring to an anchor sweep that includes remote dangling HTLCs? Are we referring to an anchor sweep on the pending remote commitment? I think this PR could use a comprehensive review of the terminology used and make sure that we are very consistent with it since there are so many delicate cases that need to be handled correctly.
chanArbCtx, err := createTestChannelArbitrator(t, log) | ||
require.NoError(t, err, "unable to create ChannelArbitrator") | ||
|
||
// Attack a mock PreimageDB and Registry to channel arbitrator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attach?
Fixes #7969