Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for retry cancellation #2456

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kmcclellan
Copy link
Contributor

Pull Request

The issue or feature being addressed

[Bug]: Inconsistent behavior when canceling a retry

Details on the issue fix or feature implementation

Despite how the conversations around the previous attempt, only a small change is needed to make this work in a way that is coherent.

Note my choice to avoid throwing until after delay is calculated and the retry event is logged. I think this is most likely what a user will expect, since every failed attempt currently generates such a log, and the outcome was "handled" in the sense that it was prevented from surfacing.

This also takes care of @martintmk's concern about disposables, since DisposeHelper is also called before cancellation is acknowledged, just as though we were continuing with the retry. I've added an extra test case for this too.

Confirm the following

  • I started this PR by branching from the head of the default branch
  • I have targeted the PR to merge into the default branch
  • I have included unit tests for the issue/feature
  • I have successfully run a local build

Copy link

codecov bot commented Jan 18, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.39%. Comparing base (dc4010a) to head (b66ad70).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2456   +/-   ##
=======================================
  Coverage   85.39%   85.39%           
=======================================
  Files         312      312           
  Lines        7464     7465    +1     
  Branches     1121     1121           
=======================================
+ Hits         6374     6375    +1     
  Misses        905      905           
  Partials      185      185           
Flag Coverage Δ
linux 85.39% <100.00%> (+0.02%) ⬆️
macos 85.37% <100.00%> (+<0.01%) ⬆️
windows 85.36% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kmcclellan

This comment has been minimized.

{
try
context.CancellationToken.ThrowIfCancellationRequested();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling the ThrowIfCancellationRequested here might be too late. The OnRetry telemetry event is reported and the OnRetry callback is already executed. But there will be no new retry attempt if cancellation is requested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this. The event may be named "on retry," but the actual meaning is more like "on handled by the retry policy." Currently, any outcome that is not returned to the caller is considered "handled" and triggers this event. Users don't want to wait for the next attempt (which may or may not happen). We want to know the strategy was triggered since this tells us the callback completed and why the outcome was not returned.

Cancellation in .NET is cooperative. It's normal for code to finish doing certain important tasks until it comes to a better "stopping place" to acknowledge the cancellation. Logging/telemetry for what has just happened I think falls into this category. I would be open to skipping the "delay" calculation and logging a zero if cancellation is triggered, but I'm also not convinced this adds much value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now the strategy works like this on high level for a retry attempt (happy path):

  • user provided callback is executed
  • telemetry event is reported
  • delay is calculated
  • onretry is executed
  • delay is waited

The execution can be stopped due to the following circumstances: the outcome is not handled by the strategy, the attempts are exhausted. From one side treating the cancellation in a different way feels a bit odd. But I agree that if the user provided callback executed then the telemetry and OnRetry hook should be performed as well because they allow the consumers to get insights what happened.

The OnRetryArguments serves multiple purposes. It tells about the past (outcome, duration, etc.) but also shares some information about the future (delay). You can access the information whether the cancellation was requested via the context (context.CancellationToken.IsCancellationRequested) but since it is not a top-level field I have doubts that anyone has ever checked it. IMHO making this information as a top-level field would make the 0/-1 delay more meaningful by providing contextual information.

IMHO the best would be to have something like this:

flowchart TD
    Args[OnRetryAgruments]
    Current[CurrentAttempt]
    Next[NextAttempt]

    Args --> Current
    Args --> Next

    Current --> Duration
    Current --> Outcome
    Current --> A[etc.]

    Next --> IsCancelled
    Next --> Delay
    Next --> b[etc.]
Loading

Maybe in V9 😛

@@ -66,6 +66,51 @@ public async Task ExecuteAsync_CancellationRequestedAfterCallback_EnsureNotRetri
executed.Should().BeTrue();
}

[Fact]
public async Task ExecuteAsync_CancellationRequestedDuringCallback_EnsureNotRetried()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in my PR I think we should cover all cases whenever we are dealing with cancellation token:

  • Before the callback execution
  • During the callback execution
  • After the callback execution
    • During the OnRetry execution
    • During the sleep

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants