Fix for retry cancellation #2456

kmcclellan · 2025-01-18T02:54:53Z

Pull Request

The issue or feature being addressed

[Bug]: Inconsistent behavior when canceling a retry

Details on the issue fix or feature implementation

Despite how the conversation around the previous attempt went, only a small change is needed to make this work coherently.

Note my choice to avoid throwing until after delay is calculated and the retry event is logged. I think this is most likely what a user will expect, since every failed attempt currently generates such a log, and the outcome was "handled" in the sense that it was prevented from surfacing.

This also takes care of @martintmk's concern about disposables, since DisposeHelper is also called before cancellation is acknowledged, just as though we were continuing with the retry. I've added an extra test case for this too.

Confirm the following

I started this PR by branching from the head of the default branch
I have targeted the PR to merge into the default branch
I have included unit tests for the issue/feature
I have successfully run a local build

* Fixes App-vNext#2375

codecov · 2025-01-18T03:00:00Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.39%. Comparing base (dc4010a) to head (b66ad70).

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2456   +/-   ##
=======================================
  Coverage   85.39%   85.39%           
=======================================
  Files         312      312           
  Lines        7464     7465    +1     
  Branches     1121     1121           
=======================================
+ Hits         6374     6375    +1     
  Misses        905      905           
  Partials      185      185

| Flag | Coverage Δ |
|---|---|
| linux | `85.39% <100.00%> (+0.02%)` ⬆️ |
| macos | `85.37% <100.00%> (+<0.01%)` ⬆️ |
| windows | `85.36% <100.00%> (+<0.01%)` ⬆️ |


peter-csala · 2025-01-20T08:09:22Z

src/Polly.Core/Retry/RetryResilienceStrategy.cs

            {
-                try
+                context.CancellationToken.ThrowIfCancellationRequested();


Calling ThrowIfCancellationRequested here might be too late. By this point the OnRetry telemetry event has been reported and the OnRetry callback has already executed, yet no new retry attempt will follow if cancellation was requested.

kmcclellan · 2025-01-21

I considered this. The event may be named "on retry," but the actual meaning is more like "on handled by the retry policy." Currently, any outcome that is not returned to the caller is considered "handled" and triggers this event. Users don't want to wait for the next attempt (which may or may not happen). We want to know the strategy was triggered since this tells us the callback completed and why the outcome was not returned.

Cancellation in .NET is cooperative. It's normal for code to finish doing certain important tasks until it comes to a better "stopping place" to acknowledge the cancellation. Logging/telemetry for what has just happened I think falls into this category. I would be open to skipping the "delay" calculation and logging a zero if cancellation is triggered, but I'm also not convinced this adds much value.
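
To illustrate the "better stopping place" idea, here is a minimal sketch that uses only `System.Threading`. The `ProcessBatch` method and its log messages are hypothetical and are not part of Polly. It shows bookkeeping for a completed attempt running before the cancellation is acknowledged.

```csharp
using System;
using System.Threading;

// Hypothetical worker used only for illustration; not Polly code.
static class CooperativeCancellationSketch
{
    public static void Main()
    {
        using var cts = new CancellationTokenSource();
        try
        {
            ProcessBatch(cts, cts.Token);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("cancellation acknowledged");
        }
    }

    static void ProcessBatch(CancellationTokenSource cts, CancellationToken token)
    {
        for (var attempt = 0; attempt < 3; attempt++)
        {
            // Simulate cancellation arriving while the first attempt is in flight.
            if (attempt == 0)
            {
                cts.Cancel();
            }

            // Logging for what has already happened still runs...
            Console.WriteLine($"attempt {attempt} failed and was handled");

            // ...and only then is the cancellation acknowledged.
            token.ThrowIfCancellationRequested();
        }
    }
}
```

Only the first attempt is logged here; the loop never reaches a second iteration because the token is checked right after the log line.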

peter-csala · 2025-01-22

As of now, at a high level, the strategy handles a retry attempt like this (happy path):

user provided callback is executed

telemetry event is reported

delay is calculated

OnRetry is executed

delay is waited

Execution can stop for two reasons: the outcome is not handled by the strategy, or the attempts are exhausted. On the one hand, treating cancellation differently feels a bit odd. On the other hand, I agree that if the user-provided callback has executed, then the telemetry and the OnRetry hook should run as well, because they give consumers insight into what happened.
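
The steps above can be sketched as a simplified loop. This is not Polly's source: `RetryLoopSketch`, its delegate parameters, and the console "telemetry" line are hypothetical stand-ins. The placement of `ThrowIfCancellationRequested` between the OnRetry step and the wait is only my reading of this thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified model of the retry flow described above; not Polly's implementation.
static class RetryLoopSketch
{
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> callback,
        Func<Exception, bool> shouldHandle,
        Func<int, TimeSpan> delayGenerator,
        Action<int, Exception, TimeSpan> onRetry,
        int maxRetryAttempts,
        CancellationToken token)
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                // 1. The user-provided callback is executed.
                return await callback(token);
            }
            catch (Exception ex) when (attempt < maxRetryAttempts && shouldHandle(ex))
            {
                // 2. The telemetry event is reported.
                Console.WriteLine($"telemetry: attempt {attempt} handled");

                // 3. The delay is calculated.
                var delay = delayGenerator(attempt);

                // 4. The OnRetry hook is executed.
                onRetry(attempt, ex, delay);

                // Cancellation is acknowledged only after the bookkeeping above.
                token.ThrowIfCancellationRequested();

                // 5. The delay is waited.
                await Task.Delay(delay, token);
            }
        }
    }

    public static async Task Main()
    {
        using var cts = new CancellationTokenSource();
        var calls = 0;
        try
        {
            await ExecuteAsync<int>(
                _ => { calls++; throw new InvalidOperationException("boom"); },
                ex => ex is InvalidOperationException,
                _ => TimeSpan.FromMilliseconds(10),
                (attempt, ex, delay) => cts.Cancel(), // cancel from inside the hook
                3,
                cts.Token);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine($"cancelled after {calls} call(s)");
        }
    }
}
```

In this model the callback runs once, the handled outcome is still reported and passed to the hook, and no second attempt is started.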

OnRetryArguments serves multiple purposes. It describes the past (outcome, duration, etc.) but also shares some information about the future (delay). You can find out whether cancellation was requested via the context (context.CancellationToken.IsCancellationRequested), but since it is not a top-level field I doubt that anyone has ever checked it. IMHO making this information a top-level field would make a 0/-1 delay more meaningful by providing contextual information.
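
For reference, a consumer can already read the cancellation state inside the hook today. The sketch below assumes the Polly v8 NuGet package is referenced; the handled exception type, retry count, delay, and log format are arbitrary choices for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Polly;
using Polly.Retry;

// Assumes a reference to the Polly v8 package; option values are illustrative only.
var pipeline = new ResiliencePipelineBuilder()
    .AddRetry(new RetryStrategyOptions
    {
        ShouldHandle = new PredicateBuilder().Handle<InvalidOperationException>(),
        MaxRetryAttempts = 3,
        Delay = TimeSpan.FromSeconds(1),
        OnRetry = args =>
        {
            // Not a top-level field: the flag is reached through the context.
            var cancelled = args.Context.CancellationToken.IsCancellationRequested;

            Console.WriteLine(
                $"Attempt {args.AttemptNumber} handled after {args.Duration}; " +
                $"computed delay {args.RetryDelay}; cancellation requested: {cancelled}");

            return default;
        },
    })
    .Build();
```

With the flag available in the hook, a logged delay can be read alongside whether the next attempt is actually expected to happen.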

IMHO the best would be to have something like this:

flowchart TD
    Args[OnRetryArguments]
    Current[CurrentAttempt]
    Next[NextAttempt]
    Args --> Current
    Args --> Next
    Current --> Duration
    Current --> Outcome
    Current --> A[etc.]
    Next --> IsCancelled
    Next --> Delay
    Next --> b[etc.]

Maybe in V9 😛

peter-csala · 2025-01-20T08:13:00Z

test/Polly.Core.Tests/Retry/RetryResilienceStrategyTests.cs

@@ -66,6 +66,51 @@ public async Task ExecuteAsync_CancellationRequestedAfterCallback_EnsureNotRetri
        executed.Should().BeTrue();
    }

+    [Fact]
+    public async Task ExecuteAsync_CancellationRequestedDuringCallback_EnsureNotRetried()


As in my PR, I think we should cover every case whenever we are dealing with the cancellation token:

Before the callback execution

During the callback execution

After the callback execution

During the OnRetry execution

During the sleep

kmcclellan added 2 commits January 17, 2025 20:09

Tests for retry cancellation issue

5774842

Fix retry cancellation issue

b66ad70

* Fixes App-vNext#2375


martincostello added the bug fix label Jan 18, 2025

peter-csala reviewed Jan 20, 2025


peter-csala Jan 20, 2025

kmcclellan Jan 21, 2025

peter-csala Jan 22, 2025

peter-csala Jan 20, 2025
