-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug(blockchain): make block finalization wait on sidecar availability #2118
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe pull request introduces significant modifications to the blockchain processing logic, particularly in handling blob sidecars. The Changes
Possibly related PRs
Suggested reviewers
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2118 +/- ##
==========================================
- Coverage 23.48% 23.44% -0.05%
==========================================
Files 357 357
Lines 16064 16092 +28
Branches 12 12
==========================================
Hits 3772 3772
- Misses 12121 12149 +28
Partials 171 171
|
@@ -80,6 +80,12 @@ type Service[ | |||
// forceStartupSyncOnce is used to force a sync of the startup head. | |||
forceStartupSyncOnce *sync.Once | |||
|
|||
// blobFinalized is used to verify blob sidecar availability upon | |||
// block finalization | |||
blobFinalized chan struct{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@itsdevbear we may want to harden this and pass some kind of blob data to enhance verification. Not sure what. You got suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (5)
mod/beacon/blockchain/process.go
(2 hunks)mod/beacon/blockchain/service.go
(5 hunks)mod/da/pkg/da/service.go
(1 hunks)mod/node-core/pkg/components/dispatcher.go
(1 hunks)mod/primitives/pkg/async/id.go
(1 hunks)
🔇 Additional comments (3)
mod/primitives/pkg/async/id.go (1)
49-49
: LGTM! The new event constant is well-placed and follows conventions.
The addition of BlobSidecarsFinalized
event aligns with the PR's objective of synchronizing block and sidecar finalization. It maintains consistent naming and follows the existing event pattern.
Let's verify the event usage across the codebase:
✅ Verification successful
Event constant is correctly implemented and used across the codebase
The verification shows proper implementation of the BlobSidecarsFinalized
event:
- Published by DA service after blob finalization
- Subscribed to by the beacon blockchain service
- Properly registered in the dispatcher's event system
- Follows the event-driven architecture pattern consistently
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the consistent usage of the new BlobSidecarsFinalized event
# Expected: Should find event subscriptions and publications in service.go and da/service.go
# Test 1: Find all usages of the new event constant
echo "=== Finding all usages of BlobSidecarsFinalized ==="
rg -l "BlobSidecarsFinalized"
# Test 2: Check for event publication patterns
echo -e "\n=== Checking event publication patterns ==="
rg "Publish.*blob-sidecars-finalized"
# Test 3: Check for event subscription patterns
echo -e "\n=== Checking event subscription patterns ==="
rg "Subscribe.*blob-sidecars-finalized"
Length of output: 564
Script:
#!/bin/bash
# Let's examine the actual usage context of BlobSidecarsFinalized in the identified files
echo "=== Checking usage in dispatcher.go ==="
rg -A 5 -B 5 "BlobSidecarsFinalized" mod/node-core/pkg/components/dispatcher.go
echo -e "\n=== Checking usage in service.go (beacon) ==="
rg -A 5 -B 5 "BlobSidecarsFinalized" mod/beacon/blockchain/service.go
echo -e "\n=== Checking usage in da/service.go ==="
rg -A 5 -B 5 "BlobSidecarsFinalized" mod/da/pkg/da/service.go
# Let's also check for any event-related method calls with the string pattern
echo -e "\n=== Checking for event-related method calls ==="
rg -A 2 -B 2 "blob-sidecars-finalized" --type go
Length of output: 2496
mod/node-core/pkg/components/dispatcher.go (1)
60-60
: LGTM! The event registration is correctly placed to support block finalization dependencies.
The addition of BlobSidecarsFinalized
event between FinalSidecarsReceived
and BeaconBlockFinalized
aligns with the PR's objective of ensuring sidecar availability before block finalization.
Let's verify the event integration across components:
✅ Verification successful
Event integration verified and properly implemented across components
The event BlobSidecarsFinalized
is correctly integrated across the codebase:
- Defined as a constant in
mod/primitives/pkg/async/id.go
- Published by the DA service in
mod/da/pkg/da/service.go
- Subscribed to by the beacon blockchain service in
mod/beacon/blockchain/service.go
- Properly registered in the dispatcher chain in
mod/node-core/pkg/components/dispatcher.go
The event flow follows the expected pattern:
- DA service publishes the event when sidecars are finalized
- Beacon blockchain service subscribes to handle the finalization
- Dispatcher properly chains it in the event sequence
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the BlobSidecarsFinalized event usage across the codebase
# Expected: Find event publishing and subscription points to ensure proper integration
# Test 1: Find where the event is published
echo "=== Event Publishing Points ==="
rg -A 3 "BlobSidecarsFinalized.*Publish"
# Test 2: Find where the event is subscribed
echo "=== Event Subscription Points ==="
rg -A 3 "BlobSidecarsFinalized.*Subscribe"
# Test 3: Verify event constant definition
echo "=== Event Definition ==="
rg "BlobSidecarsFinalized.*=.*Event"
Length of output: 335
Script:
#!/bin/bash
# Let's try a different approach to find the event usage
# Search for any occurrence of BlobSidecarsFinalized
echo "=== BlobSidecarsFinalized Usage ==="
rg "BlobSidecarsFinalized"
# Search for the event definition in async package
echo "=== Event Definition in async package ==="
fd -e go -E 'vendor/*' --exec grep -l "BlobSidecarsFinalized" {}
# Look for WithEvent usage pattern
echo "=== WithEvent Usage Pattern ==="
rg -A 3 "WithEvent.*BlobSidecarsFinalized"
# Search for event type definition
echo "=== Event Type Definition ==="
ast-grep --pattern 'var BlobSidecarsFinalized = Event[struct{}]'
Length of output: 1482
mod/da/pkg/da/service.go (1)
139-146
: Verify the event handling in the broader system.
The addition of this new event requires verification of its integration with other components that depend on sidecar finalization.
Let's verify the event handling implementation:
✅ Verification successful
Let me try a different approach to verify the event handling implementation.
Let me search for more specific information about how this event is used in the identified files.
Event handling implementation is properly integrated
The verification shows that the BlobSidecarsFinalized
event is well-integrated into the system:
- The event is properly defined in
mod/primitives/pkg/async/id.go
as part of the finalization events. - The event is registered in the dispatcher system (
mod/node-core/pkg/components/dispatcher.go
) with the correct typeasync.Event[struct{}]
. - The blockchain service (
mod/beacon/blockchain/service.go
) properly subscribes to and handles this event in its event loop, using it as part of the blob availability verification process through theverifyFinalBlobAvailability
method.
The event is used in a coordinated way to ensure blob sidecar availability before block finalization, which is the intended behavior. No issues were found in the event handling implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the integration of BlobSidecarsFinalized event
# Test 1: Check for event subscribers
echo "Checking for event subscribers..."
rg -A 5 "BlobSidecarsFinalized.*Subscribe"
# Test 2: Check for potential race conditions in event handling
echo "Checking for concurrent event handling..."
ast-grep --pattern 'select {
case $_ := <-$ch:
$$$
case $_ := <-$ch2:
$$$
}'
Length of output: 305
Script:
#!/bin/bash
# Search for BlobSidecarsFinalized event usage across the codebase
echo "Searching for BlobSidecarsFinalized event usage..."
rg -l "BlobSidecarsFinalized"
# Search for event type definition
echo -e "\nSearching for event type definition..."
rg "type.*BlobSidecarsFinalized"
# Search for event handlers/subscribers
echo -e "\nSearching for potential event handlers..."
rg -A 5 "Subscribe\(.*BlobSidecarsFinalized"
# Search for any async event handling patterns
echo -e "\nSearching for async event handling patterns..."
rg -A 5 "async\.NewEvent.*BlobSidecarsFinalized"
Length of output: 1074
Script:
#!/bin/bash
# Check the content of files that use BlobSidecarsFinalized
echo "Checking mod/primitives/pkg/async/id.go..."
cat mod/primitives/pkg/async/id.go
echo -e "\nChecking mod/node-core/pkg/components/dispatcher.go..."
cat mod/node-core/pkg/components/dispatcher.go
echo -e "\nChecking mod/beacon/blockchain/service.go..."
cat mod/beacon/blockchain/service.go
# Look for any event handler registration or subscription patterns
echo -e "\nSearching for event handler registration patterns..."
rg -A 5 "RegisterEventHandler|AddEventHandler|Subscribe.*async\."
Length of output: 17215
mod/da/pkg/da/service.go
Outdated
|
||
event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{}) | ||
if err := s.dispatcher.Publish(event); err != nil { | ||
s.logger.Error( | ||
"Failed to publish blob finalized event", | ||
"error", err, | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider enhancing event payload for better observability.
The current implementation uses an empty struct as the event payload. Consider including metadata about the processed sidecars to aid in debugging and monitoring.
Example enhancement:
-event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{})
+event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct {
+ ProcessedAt time.Time
+ SidecarCount int
+}{
+ ProcessedAt: time.Now(),
+ SidecarCount: msg.Data().Len(),
+})
Committable suggestion skipped: line range outside the PR's diff.
Prevent publishing finalization event on processing failure.
The current implementation publishes the BlobSidecarsFinalized
event even when processSidecars
fails. This could lead to incorrect signaling that sidecars are finalized when they haven't been processed successfully.
Consider this implementation to ensure correct finalization signaling:
if err := s.processSidecars(msg.Context(), msg.Data()); err != nil {
s.logger.Error(
"Failed to process blob sidecars",
"error",
err,
)
+ return
}
event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{})
if err := s.dispatcher.Publish(event); err != nil {
s.logger.Error(
"Failed to publish blob finalized event",
"error", err,
)
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{}) | |
if err := s.dispatcher.Publish(event); err != nil { | |
s.logger.Error( | |
"Failed to publish blob finalized event", | |
"error", err, | |
) | |
} | |
if err := s.processSidecars(msg.Context(), msg.Data()); err != nil { | |
s.logger.Error( | |
"Failed to process blob sidecars", | |
"error", | |
err, | |
) | |
return | |
} | |
event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{}) | |
if err := s.dispatcher.Publish(event); err != nil { | |
s.logger.Error( | |
"Failed to publish blob finalized event", | |
"error", err, | |
) | |
} |
case event := <-s.subBlobFinalized: | ||
s.blobFinalized <- event.Data() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid potential deadlock with unbuffered 'blobFinalized' channel
In the eventLoop
method at lines 214-215, sending to s.blobFinalized
may block if there's no goroutine ready to receive from it. Since blobFinalized
is an unbuffered channel, this could lead to a deadlock if not properly synchronized.
Consider one of the following solutions:
Solution 1: Use a buffered channel
Apply this diff to make blobFinalized
a buffered channel:
- blobFinalized: make(chan struct{}),
+ blobFinalized: make(chan struct{}, 1),
Solution 2: Ensure proper synchronization
Ensure that verifyFinalBlobAvailability
is called before the event is sent, guaranteeing that a goroutine is ready to receive from blobFinalized
.
Committable suggestion skipped: line range outside the PR's diff.
g, _ := errgroup.WithContext(ctx) | ||
g.Go(func() error { | ||
return s.verifyFinalBlobAvailability(ctx, blk) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the returned context from errgroup.WithContext
for proper cancellation propagation
Currently, the context returned by errgroup.WithContext(ctx)
is being ignored. To ensure that cancellation signals are properly propagated to the goroutines, you should use the returned context and pass it to the goroutines.
Apply this diff to fix the issue:
- g, _ := errgroup.WithContext(ctx)
+ g, egCtx := errgroup.WithContext(ctx)
g.Go(func() error {
- return s.verifyFinalBlobAvailability(ctx, blk)
+ return s.verifyFinalBlobAvailability(egCtx, blk)
})
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
g, _ := errgroup.WithContext(ctx) | |
g.Go(func() error { | |
return s.verifyFinalBlobAvailability(ctx, blk) | |
}) | |
g, egCtx := errgroup.WithContext(ctx) | |
g.Go(func() error { | |
return s.verifyFinalBlobAvailability(egCtx, blk) | |
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 should use the context from errgroup
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@calbera, glad we're in agreement. Using the context from errgroup
will ensure proper cancellation propagation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait why are we using errgroup here for 1 task? Usually err group helps for multiple concurrent tasks. Also we just end up waiting on its result anyways below, so it seems we can just call verifyFinalBlobAvailability
in the same thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@calbera, you're absolutely right. Since we're only running a single task and waiting for its result immediately, using errgroup
is unnecessary here. We can simplify the code by calling verifyFinalBlobAvailability
directly in the same goroutine.
Here's the updated code:
if err = s.verifyFinalBlobAvailability(ctx, blk); err != nil {
return nil, err
}
This removes the overhead of the errgroup
and simplifies the code.
✏️ Learnings added
Learnt from: calbera
PR: berachain/beacon-kit#2118
File: mod/beacon/blockchain/process.go:70-73
Timestamp: 2024-11-04T20:24:07.176Z
Learning: When there's only one task and we're waiting for its result immediately, using `errgroup` may be unnecessary. Instead, call the function directly in the same goroutine to simplify the code.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (1)
mod/beacon/blockchain/process.go
(2 hunks)
🔇 Additional comments (1)
mod/beacon/blockchain/process.go (1)
85-86
: Ensure the event dispatcher handles BeaconBlockFinalized
correctly
Verify that the BeaconBlockFinalized
event is properly handled by the dispatcher and that all subscribers are updated accordingly. This is crucial to maintain the consistency of the event-driven architecture after introducing asynchronous blob verification.
event := async.NewEvent(ctx, async.BeaconBlockFinalized, blk) | ||
if err = s.dispatcher.Publish(event); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for potential data races in concurrent functions
The call to s.sendPostBlockFCU(ctx, st, blk)
is executed as a new goroutine. Ensure that this function and any shared resources it accesses are thread-safe to prevent data races or synchronization issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
📒 Files selected for processing (3)
mod/beacon/blockchain/process.go
(2 hunks)mod/beacon/blockchain/service.go
(5 hunks)mod/da/pkg/da/service.go
(1 hunks)
🔇 Additional comments (2)
mod/da/pkg/da/service.go (1)
135-138
: LGTM! Robust error handling implementation.
The early return on processing error prevents incorrect finalization signaling, which is crucial for maintaining data consistency.
mod/beacon/blockchain/service.go (1)
215-216
: Potential deadlock with unbuffered 'blobFinalized' channel remains unresolved
The previous review comment regarding the potential deadlock due to the unbuffered blobFinalized
channel is still valid. Sending to s.blobFinalized
may block if no goroutine is ready to receive from it. Please address this issue to prevent possible deadlocks.
event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{}) | ||
if err := s.dispatcher.Publish(event); err != nil { | ||
s.logger.Error( | ||
"Failed to publish blob finalized event", | ||
"error", err, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider enhancing the finalization event payload for better observability.
The current implementation uses an empty struct as the event payload. Including metadata about the processed sidecars would aid in debugging and monitoring the finalization process.
Consider this enhancement:
-event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{})
+event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct {
+ SidecarCount int
+ ProcessedAt time.Time
+}{
+ SidecarCount: msg.Data().Len(),
+ ProcessedAt: time.Now(),
+})
Don't forget to add the time import:
import "time"
"error", err, | ||
) | ||
return | ||
} | ||
|
||
event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{}) | ||
if err := s.dispatcher.Publish(event); err != nil { | ||
s.logger.Error( | ||
"Failed to publish blob finalized event", | ||
"error", err, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider implementing finalization timeout and retry mechanism.
While the current implementation ensures sidecar processing completion before finalization, it might benefit from additional robustness measures:
- Timeout mechanism to prevent indefinite waiting
- Retry logic for transient failures
- Metrics for monitoring processing duration and retry attempts
This would help handle edge cases in distributed scenarios where network delays or temporary failures might occur.
Would you like me to provide a detailed implementation suggestion for these enhancements?
// wait for blob sidecar to be finalized | ||
select { | ||
case <-s.blobFinalized: | ||
// Proceed with verification | ||
case <-ctx.Done(): | ||
return ctx.Err() | ||
} | ||
|
||
// If the blobs needed to process the block are not available, we | ||
// return an error. It is safe to use the slot off of the beacon block | ||
// since it has been verified as correct already. | ||
if !s.storageBackend.AvailabilityStore().IsDataAvailable( | ||
ctx, blk.GetSlot(), blk.GetBody(), | ||
) { | ||
return ErrDataNotAvailable | ||
} | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure 'blobFinalized' signal corresponds to the specific block being verified
Currently, verifyFinalBlobAvailability
waits on s.blobFinalized
, which signals when any blob sidecar is finalized. This approach may cause the function to proceed even if the blob sidecar for the block blk
has not been finalized, potentially leading to incorrect verification. To prevent this, modify the synchronization mechanism to ensure that the blob sidecar for the specific block is finalized before proceeding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would agree, might need to check that the blk.GetSlot() <= number of blobFinalized event
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a valid point. Comparing blk.GetSlot()
to the number of blobFinalized
events could help, but it might not be entirely reliable if blobs are finalized out of order or if multiple blobs are finalized concurrently. To ensure that the correct blob sidecar corresponds to the specific block being verified, it might be better to track blob finalization per slot.
Consider modifying blobFinalized
to be a map from slots to channels:
// Use a map to track blob finalization per slot
blobFinalized map[common.Slot]chan struct{}
Then, in verifyFinalBlobAvailability
, you can wait on the channel associated with the specific slot:
select {
case <-s.blobFinalized[blk.GetSlot()]:
// Proceed with verification
case <-ctx.Done():
return ctx.Err()
}
This way, the function only proceeds when the blob sidecar for the block blk
has been finalized, ensuring accurate synchronization.
Let me know what you think about this approach.
(_/)
( •_•)
🐰🎉
return | ||
} | ||
|
||
event := async.NewEvent(msg.Context(), async.BlobSidecarsFinalized, struct{}{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be worth adding the blk slot num in the event data here
When we finalize a block, we verify that its associated sidecar has been stored.
However block and sidecar finalization are handled concurrently, so we need wait for the blob to be persisted before checking.
This PR solves the issue
Summary by CodeRabbit
Release Notes
New Features
BlobSidecarsFinalized
.BlobSidecarsFinalized
to improve event identification.Bug Fixes
Documentation