bug(blockchain): make block finalization wait on sidecar availability #2118
Changes from all commits
0f6e1ba
00aebfa
2280da8
93fd044
Use the returned context from `errgroup.WithContext` for proper cancellation propagation
Currently, the context returned by `errgroup.WithContext(ctx)` is being ignored. To ensure that cancellation signals are properly propagated to the goroutines, you should use the returned context and pass it to the goroutines.
Apply this diff to fix the issue:
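As a rough illustration of the cancellation behaviour this comment relies on (not the PR's actual diff), the sketch below uses a small standard-library stand-in for `errgroup.Group` so that it runs without external modules. `miniGroup`, `withContext`, and `runTasks` are hypothetical names. With the real `golang.org/x/sync/errgroup` package the equivalent call is `g, gctx := errgroup.WithContext(ctx)`, and the goroutines should observe `gctx` rather than the parent `ctx`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// miniGroup is a stdlib-only stand-in for errgroup.Group (assumption: it
// mirrors only the behaviour relevant to this discussion).
type miniGroup struct {
	wg     sync.WaitGroup
	once   sync.Once
	err    error
	cancel context.CancelFunc
}

// withContext mirrors errgroup.WithContext: the returned context is
// cancelled as soon as any task returns a non-nil error.
func withContext(parent context.Context) (*miniGroup, context.Context) {
	ctx, cancel := context.WithCancel(parent)
	return &miniGroup{cancel: cancel}, ctx
}

// Go runs f in a new goroutine and records the first error.
func (g *miniGroup) Go(f func() error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		if err := f(); err != nil {
			g.once.Do(func() {
				g.err = err
				g.cancel()
			})
		}
	}()
}

// Wait blocks until all tasks finish and returns the first error.
func (g *miniGroup) Wait() error {
	g.wg.Wait()
	g.cancel()
	return g.err
}

var errBoom = errors.New("boom")

// runTasks reports whether the second task observed the cancellation
// triggered by the first task's failure.
func runTasks() (sawCancel bool, err error) {
	g, gctx := withContext(context.Background())
	g.Go(func() error { return errBoom })
	g.Go(func() error {
		// Waiting on the parent context here would block forever;
		// the derived context is what carries the cancellation.
		<-gctx.Done()
		sawCancel = true
		return nil
	})
	err = g.Wait()
	return sawCancel, err
}

func main() {
	saw, err := runTasks()
	fmt.Println(saw, err)
}
```

If the second task waited on the parent context instead of `gctx`, `Wait` would never return in this sketch, which is the failure mode the comment warns about.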
+1, should use the context from `errgroup` here.
@calbera, glad we're in agreement. Using the context from `errgroup` will ensure proper cancellation propagation.
Wait, why are we using errgroup here for one task? Usually errgroup helps with multiple concurrent tasks. Also, we just end up waiting on its result below anyway, so it seems we can just call `verifyFinalBlobAvailability` in the same thread?
@calbera, you're absolutely right. Since we're only running a single task and waiting for its result immediately, using `errgroup` is unnecessary here. We can simplify the code by calling `verifyFinalBlobAvailability` directly in the same goroutine.
Here's the updated code:
This removes the overhead of the `errgroup` and simplifies the code.
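A minimal sketch of the direct call discussed in this comment, under stated assumptions: the `service` type, the body of `verifyFinalBlobAvailability`, and the `finalizeBlock` and `demoFinalize` helpers are all hypothetical; only the function name and the `blobFinalized` channel come from the thread.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// service is a hypothetical stand-in for the PR's blockchain service.
type service struct {
	blobFinalized chan struct{}
}

// verifyFinalBlobAvailability blocks until a finalization signal arrives or
// ctx is done. The body is an assumption; only the name comes from the PR.
func (s *service) verifyFinalBlobAvailability(ctx context.Context) error {
	select {
	case <-s.blobFinalized:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// finalizeBlock calls the check inline: no errgroup and no extra goroutine,
// because the caller needs the result before it can continue anyway.
func (s *service) finalizeBlock(ctx context.Context) error {
	if err := s.verifyFinalBlobAvailability(ctx); err != nil {
		return fmt.Errorf("blob availability check failed: %w", err)
	}
	// ... continue with finalization here ...
	return nil
}

// demoFinalize runs finalizeBlock with or without a pending signal.
func demoFinalize(signalled bool) error {
	s := &service{blobFinalized: make(chan struct{}, 1)}
	if signalled {
		s.blobFinalized <- struct{}{}
	}
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return s.finalizeBlock(ctx)
}

func main() {
	fmt.Println(demoFinalize(true))
	fmt.Println(demoFinalize(false))
}
```

The context still bounds the wait, so dropping the group does not remove cancellation support.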
Check for potential data races in concurrent functions
The call to `s.sendPostBlockFCU(ctx, st, blk)` is executed as a new goroutine. Ensure that this function and any shared resources it accesses are thread-safe to prevent data races or synchronization issues.
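To make the concern concrete, here is a small sketch of shared state that is safe to touch from spawned goroutines. It does not reproduce `sendPostBlockFCU`; `headTracker` and `runConcurrentUpdates` are hypothetical names chosen for illustration, and the mutex is one common option among several.

```go
package main

import (
	"fmt"
	"sync"
)

// headTracker is a hypothetical piece of shared state that a goroutine
// spawned per block might update.
type headTracker struct {
	mu   sync.Mutex
	head uint64
}

// update records slot as the new head if it is higher than the current one.
func (h *headTracker) update(slot uint64) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if slot > h.head {
		h.head = slot
	}
}

// get returns the current head under the same lock.
func (h *headTracker) get() uint64 {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.head
}

// runConcurrentUpdates spawns one goroutine per slot and returns the head
// observed after all of them have finished.
func runConcurrentUpdates(n uint64) uint64 {
	h := &headTracker{}
	var wg sync.WaitGroup
	for slot := uint64(1); slot <= n; slot++ {
		wg.Add(1)
		go func(s uint64) {
			defer wg.Done()
			h.update(s)
		}(slot)
	}
	wg.Wait()
	return h.get()
}

func main() {
	fmt.Println(runConcurrentUpdates(100))
}
```

Running code like this under `go test -race` or `go run -race` is a cheap way to confirm that no unsynchronized access remains.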
@itsdevbear we may want to harden this and pass some kind of blob data to enhance verification. Not sure what. You got suggestions?
Avoid potential deadlock with unbuffered `blobFinalized` channel
In the `eventLoop` method at lines 214-215, sending to `s.blobFinalized` may block if there's no goroutine ready to receive from it. Since `blobFinalized` is an unbuffered channel, this could lead to a deadlock if not properly synchronized.
Consider one of the following solutions:
Solution 1: Use a buffered channel
Apply this diff to make `blobFinalized` a buffered channel:
Solution 2: Ensure proper synchronization
Ensure that `verifyFinalBlobAvailability` is called before the event is sent, guaranteeing that a goroutine is ready to receive from `blobFinalized`.
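The difference between the two channel kinds can be shown with a non-blocking send. This is only a sketch: `trySend` is a hypothetical helper, and a capacity of 1 is an assumption rather than a value taken from the PR.

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it succeeded.
// A false result means a plain `ch <- struct{}{}` would have blocked.
func trySend(ch chan struct{}) bool {
	select {
	case ch <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan struct{})
	buffered := make(chan struct{}, 1)

	// No receiver is waiting in either case.
	fmt.Println("unbuffered send succeeded:", trySend(unbuffered))
	fmt.Println("buffered send succeeded:", trySend(buffered))
}
```

A buffer only absorbs as many pending signals as its capacity, so it reduces the chance of the sender stalling but does not by itself tie a signal to a particular block.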
Ensure `blobFinalized` signal corresponds to the specific block being verified
Currently, `verifyFinalBlobAvailability` waits on `s.blobFinalized`, which signals when any blob sidecar is finalized. This approach may cause the function to proceed even if the blob sidecar for the block `blk` has not been finalized, potentially leading to incorrect verification. To prevent this, modify the synchronization mechanism to ensure that the blob sidecar for the specific block is finalized before proceeding.
would agree, might need to check that the blk.GetSlot() <= number of blobFinalized event
@calbera,
That's a valid point. Comparing `blk.GetSlot()` to the number of `blobFinalized` events could help, but it might not be entirely reliable if blobs are finalized out of order or if multiple blobs are finalized concurrently. To ensure that the correct blob sidecar corresponds to the specific block being verified, it might be better to track blob finalization per slot.
Consider modifying `blobFinalized` to be a map from slots to channels:
Then, in `verifyFinalBlobAvailability`, you can wait on the channel associated with the specific slot:
This way, the function only proceeds when the blob sidecar for the block `blk` has been finalized, ensuring accurate synchronization.
Let me know what you think about this approach.
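One way the per-slot idea could look, as a sketch under assumptions: `slotTracker`, `slotSignal`, `markFinalized`, `wait`, and `waitOutcome` are hypothetical names, slots are modelled as `uint64`, and entries are never pruned here, which a real implementation would need to handle.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// slotSignal pairs a channel with a flag so it is closed at most once.
type slotSignal struct {
	ch     chan struct{}
	closed bool
}

// slotTracker maps slots to finalization signals (hypothetical type).
type slotTracker struct {
	mu      sync.Mutex
	signals map[uint64]*slotSignal
}

func newSlotTracker() *slotTracker {
	return &slotTracker{signals: make(map[uint64]*slotSignal)}
}

// signalFor returns the signal for slot, creating it on first use.
// The caller must hold t.mu.
func (t *slotTracker) signalFor(slot uint64) *slotSignal {
	sig, ok := t.signals[slot]
	if !ok {
		sig = &slotSignal{ch: make(chan struct{})}
		t.signals[slot] = sig
	}
	return sig
}

// markFinalized wakes every waiter for slot; repeated calls are harmless.
func (t *slotTracker) markFinalized(slot uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	sig := t.signalFor(slot)
	if !sig.closed {
		close(sig.ch)
		sig.closed = true
	}
}

// wait blocks until slot is finalized or ctx is done.
func (t *slotTracker) wait(ctx context.Context, slot uint64) error {
	t.mu.Lock()
	ch := t.signalFor(slot).ch
	t.mu.Unlock()
	select {
	case <-ch:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// waitOutcome finalizes one slot and waits briefly for another.
func waitOutcome(finalized, awaited uint64) error {
	t := newSlotTracker()
	t.markFinalized(finalized)
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return t.wait(ctx, awaited)
}

func main() {
	fmt.Println(waitOutcome(7, 7)) // same slot: returns without waiting
	fmt.Println(waitOutcome(7, 8)) // different slot: times out
}
```

Closing the channel rather than sending on it means the order of `markFinalized` and `wait` does not matter, and any number of waiters for the same slot are released together.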
might be worth adding the blk slot num in the event data here
🧹 Nitpick (assertive)
Consider enhancing the finalization event payload for better observability.
The current implementation uses an empty struct as the event payload. Including metadata about the processed sidecars would aid in debugging and monitoring the finalization process.
Consider this enhancement:
Don't forget to add the time import:
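One possible shape for such a payload, sketched under assumptions rather than taken from the PR: the `BlobFinalizedEvent` type, its fields, and the helpers below are hypothetical. The slot field follows the suggestion earlier in the thread to include the block's slot number in the event data.

```go
package main

import (
	"fmt"
	"time"
)

// BlobFinalizedEvent is a hypothetical payload replacing the empty struct.
type BlobFinalizedEvent struct {
	Slot        uint64    // slot of the block whose sidecars were processed
	NumSidecars int       // number of sidecars processed for that slot
	FinalizedAt time.Time // when processing completed
}

// newBlobFinalizedEvent stamps the event with the current time.
func newBlobFinalizedEvent(slot uint64, numSidecars int) BlobFinalizedEvent {
	return BlobFinalizedEvent{
		Slot:        slot,
		NumSidecars: numSidecars,
		FinalizedAt: time.Now(),
	}
}

// matchesSlot lets a waiter ignore events that belong to other blocks.
func (e BlobFinalizedEvent) matchesSlot(slot uint64) bool {
	return e.Slot == slot
}

func main() {
	ev := newBlobFinalizedEvent(42, 3)
	fmt.Printf("slot=%d sidecars=%d matches42=%t\n",
		ev.Slot, ev.NumSidecars, ev.matchesSlot(42))
}
```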
🧹 Nitpick (assertive)
Consider implementing finalization timeout and retry mechanism.
While the current implementation ensures sidecar processing completion before finalization, it might benefit from additional robustness measures:
This would help handle edge cases in distributed scenarios where network delays or temporary failures might occur.
Would you like me to provide a detailed implementation suggestion for these enhancements?
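A timeout-and-retry wrapper of the kind this comment alludes to might look roughly like the following. It is a sketch only: `retryWithTimeout`, `demoRetry`, `errNotReady`, the attempt count, and the per-attempt timeout are all assumptions, and no backoff between attempts is included.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

var errNotReady = errors.New("sidecars not ready")

// retryWithTimeout calls fn up to attempts times (attempts must be >= 1),
// giving each call its own deadline derived from ctx. It stops early if the
// parent context is cancelled.
func retryWithTimeout(
	ctx context.Context,
	attempts int,
	perAttempt time.Duration,
	fn func(context.Context) error,
) error {
	var err error
	for i := 0; i < attempts; i++ {
		attemptCtx, cancel := context.WithTimeout(ctx, perAttempt)
		err = fn(attemptCtx)
		cancel()
		if err == nil {
			return nil
		}
		if ctx.Err() != nil {
			return ctx.Err()
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

// demoRetry simulates a check that fails the first `failures` times and
// reports how many calls were made.
func demoRetry(failures int) (calls int, err error) {
	fn := func(ctx context.Context) error {
		calls++
		if calls <= failures {
			return errNotReady
		}
		return nil
	}
	err = retryWithTimeout(context.Background(), 3, 100*time.Millisecond, fn)
	return calls, err
}

func main() {
	fmt.Println(demoRetry(2))
	fmt.Println(demoRetry(5))
}
```

Whether retrying is appropriate at all depends on what a failed availability check means for consensus safety, which the thread leaves open.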