Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log NewReportingPlugin errors for Commit and Exec plugins #1187

Merged
merged 1 commit into from
Aug 7, 2024

Conversation

rstout
Copy link
Contributor

@rstout rstout commented Jul 11, 2024

Also output metrics

Motivation

Solution

return func() (reportingPluginAndInfo, error) {
result, err := newReportingPluginFn()
if err != nil {
rf.config.lggr.Errorw("NewReportingPlugin failed", "err", err)
Copy link
Contributor

@winder winder Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider wrapping the errors returned by newReportingPluginFn with additional context similar to how the exec plugin's version of this function does it.

Also, should this change be made there as well?

Copy link
Contributor Author

@rstout rstout Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should this change be made there as well?

This change is in the exec plugin as well, or did you mean something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, missed that.

var defaultNewReportingPluginRetryConfig = ccipdata.RetryConfig{
InitialDelay: time.Second,
MaxDelay: 10 * time.Minute,
MaxRetries: 6 * 24, // Retry for approximately 24hrs (MaxDelay of 10m = 6 times per hour, times 24 hours)
Copy link
Collaborator

@connorwstein connorwstein Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rstout on second thought lets make this a bit shorter. I think 4 hours would be good - that should be enough time for a nop to notice that their RPC etc is broken and address. I think if another misconfig issue arose we'd want to react a bit faster that 24hrs

@cl-sonarqube-production
Copy link

@rstout rstout merged commit 5f1b30e into ccip-develop Aug 7, 2024
83 of 86 checks passed
@rstout rstout deleted the CCIP-2826 branch August 7, 2024 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants