Retries #49
Maybe you could mention in the PR's description which features mentioned in #48 are covered, and provide some high-level examples of how the API should be used? The tests provide this somewhat, but it would also be good to know some rationale behind the design :)
Great description - it can be copied almost 1:1 as docs to the readme - thanks :) I haven't read the code yet, but some initial questions:
Thanks, the idea was exactly to avoid doing the same work twice :)
By calling Internally, they are implemented as separate, private case classes marked with the
They are conceptually the same, Indeed
You're right, although it's the example I gave that's incorrect. The idea would be to use completely different policies based on the outcome, e.g.
Right, it seems it's achievable with nesting, assuming that a global counter is desired. We can consider whether nesting is user-friendly enough for such a use case (or, ideally, get some feedback from real users ;)), and whether the retry limit should be global (or maybe, with a better API, we could let the user choose a global vs. per-policy limit?)
I think you're right in that the current "retry policies" are actually "retry schedules", since they only cover the timing aspect of retries, while
Yes, why not. It seems that a |
So then we have e.g.
Ah yes. I guess we could have sth like a
I doubt a global counter is what we'd want, but I think that for many scenarios simple nesting would work (with separate counters). But yes, let's wait for user feedback.
Another point for further work :)
No, you call Delay.forever(100.millis)
Backoff.forever(100.millis, 5.minutes, Jitter.Full)
Ah ok, good then :)
@adamw a couple of updates after our discussion:
And, thanks to the simplified API, the |
Again, just read the docs - looks good :) 🚀 Implementation choices -> maybe ADR?
LGTM! Although the composability of RetryPolicy may be tricky to achieve in the future, I don't think we need to account for it now.
else right

val remainingAttempts = policy.schedule match
  case policy: Schedule.Finite => Some(policy.maxRetries)
nitpick: case schedule or case finiteSchedule.
Good catch, thanks, got lost in translation refactoring ;)
Awesome! Now only a release & blog :)
Retries
Inspired by https://github.com/softwaremill/retry
Rationale
The goal was to have a unified API (a single retry function) that would handle different ways of defining the operation (direct result, Try and Either), using various policies of delaying subsequent attempts (no delay, fixed delay, exponential backoff with an optional jitter), with a possibility to customize the definition of a successful result, and to fail fast on certain errors.
API
The basic syntax for retries is:
retry[T](operation)(policy)
Operation definition
The operation can be provided as one of:
- a direct result, i.e. f: => T
- Try[T], i.e. f: => Try[T]
- Either[E, T], i.e. f: => Either[E, T]
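One way to see how a single retry function can accept all three forms is to normalize each of them to an Either before retrying. The sketch below is only an illustration under that assumption: the helper names (fromDirect, fromTry, fromEither, retryImmediately) are hypothetical and are not taken from this PR, and the loop retries immediately without any schedule.

```scala
import scala.util.Try

// Hypothetical adapters (not the PR's code): each form of the operation
// is turned into an Either so that one loop can handle all of them.
def fromDirect[T](f: => T): Either[Throwable, T] = Try(f).toEither
def fromTry[T](f: => Try[T]): Either[Throwable, T] = f.toEither
def fromEither[E, T](f: => Either[E, T]): Either[E, T] = f

// Simplified loop: retries on Left, with no delay, at most maxRetries times.
@annotation.tailrec
def retryImmediately[E, T](maxRetries: Int)(attempt: () => Either[E, T]): Either[E, T] =
  attempt() match
    case Left(_) if maxRetries > 0 => retryImmediately(maxRetries - 1)(attempt)
    case result                    => result

// Usage: a direct operation that throws twice and then succeeds.
var calls = 0
val result = retryImmediately(3)(() =>
  fromDirect {
    calls += 1
    if calls < 3 then throw new RuntimeException("boom") else 42
  }
)
```

With the direct and Try forms the error type ends up as Throwable, while the Either form keeps whatever error type the operation declares.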
Policies
A retry policy consists of two parts:
- Schedule, which indicates how many times and with what delay we should retry the operation after an initial failure,
- ResultPolicy, which indicates whether:
  - a non-erroneous result of the operation should be considered a success (if not, the operation would be retried),
  - an erroneous result of the operation should be retried or fail fast.
The available schedules are defined in the Schedule object. Each schedule has a finite and an infinite variant.
Finite schedules
Finite schedules have a common maxRetries: Int parameter, which determines how many times the operation would be retried after an initial failure. This means that the operation could be executed at most maxRetries + 1 times.
Infinite schedules
Each finite schedule has an infinite variant, whose settings are similar to those of the respective finite schedule, but without the maxRetries setting. Using the infinite variant can lead to a possibly infinite number of retries (unless the operation starts to succeed again at some point). The infinite schedules are created by calling .forever on the companion object of the respective finite schedule (see examples below).
Schedule types
The supported schedules (specifically - their finite variants) are:
- Immediate(maxRetries: Int) - retries up to maxRetries times without any delay between subsequent attempts.
- Delay(maxRetries: Int, delay: FiniteDuration) - retries up to maxRetries times, sleeping for delay between subsequent attempts.
- Backoff(maxRetries: Int, initialDelay: FiniteDuration, maxDelay: FiniteDuration, jitter: Jitter) - retries up to maxRetries times, sleeping for initialDelay before the first retry, increasing the sleep between subsequent attempts exponentially (with base 2) up to an optional maxDelay (default: 1 minute).
Optionally, a random factor (jitter) can be used when calculating the delay before the next attempt. The purpose of jitter is to avoid clustering of subsequent retries, i.e. to reduce the number of clients calling a service exactly at the same time. See the AWS Architecture Blog article on backoff and jitter for a more in-depth explanation.
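To make the delay calculation concrete, here is a rough, self-contained sketch of a capped base-2 backoff combined with the four jitter strategies of the Jitter enum. It is an illustration, not the PR's implementation: the function names backoffDelay and jitteredDelay are hypothetical, attempts are assumed to be numbered from 1 (the first retry), and the decorrelated variant assumes the lower bound is initialDelay, following the AWS article.

```scala
import scala.concurrent.duration.*
import scala.util.Random

// Mirrors the strategy names from the description; the code itself is a sketch.
enum Jitter:
  case None, Full, Equal, Decorrelated

// Pure exponential backoff with base 2, capped at maxDelay.
// Assumption: attempt = 1 is the first retry, which sleeps for initialDelay.
def backoffDelay(attempt: Int, initialDelay: FiniteDuration, maxDelay: FiniteDuration): FiniteDuration =
  val factor = math.pow(2, attempt - 1)
  val millis = math.min(initialDelay.toMillis * factor, maxDelay.toMillis.toDouble)
  millis.toLong.millis

def jitteredDelay(
    jitter: Jitter,
    attempt: Int,
    initialDelay: FiniteDuration,
    maxDelay: FiniteDuration,
    lastDelay: Option[FiniteDuration],
    random: Random = Random
): FiniteDuration =
  val backoff = backoffDelay(attempt, initialDelay, maxDelay).toMillis
  val millis = jitter match
    case Jitter.None => backoff
    // anywhere between 0 and the full backoff (upper bound is exclusive, hence + 1)
    case Jitter.Full => random.between(0L, backoff + 1)
    // always at least half of the backoff, plus a random part of the other half
    case Jitter.Equal =>
      val half = backoff / 2
      half + random.between(0L, half + 1)
    // between initialDelay and 3 * lastDelay, still capped at maxDelay
    case Jitter.Decorrelated =>
      val last = lastDelay.getOrElse(initialDelay).toMillis
      val upper = math.max(initialDelay.toMillis, last * 3)
      math.min(random.between(initialDelay.toMillis, upper + 1), maxDelay.toMillis)
  millis.millis

// Usage: delays before retries 1 to 5 with initialDelay = 100 ms and maxDelay = 1 s.
val delays = (1 to 5).map(n => backoffDelay(n, 100.millis, 1.second))
```

In this sketch the fifth delay would be 1600 ms without the cap, so it is clamped to maxDelay.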
The following jitter strategies are available (defined in the Jitter enum):
- None - the default one, when no randomness is added, i.e. a pure exponential backoff is used,
- Full - picks a random value between 0 and the exponential backoff calculated for the current attempt,
- Equal - similar to Full, but prevents very short delays by always using a half of the original backoff and adding a random value between 0 and the other half,
- Decorrelated - uses the delay from the previous attempt (lastDelay) and picks a random value between initialDelay and 3 * lastDelay.
Result policies
A result policy allows customizing how the results of the operation are treated. It consists of two predicates:
- isSuccess: T => Boolean (default: true) - determines whether a non-erroneous result of the operation should be considered a success. When it evaluates to true, no further attempts are made; otherwise, we keep retrying. With finite schedules (i.e. those with maxRetries defined), if isSuccess keeps returning false when maxRetries are reached, the result is returned as-is, even though it's considered "unsuccessful",
- isWorthRetrying: E => Boolean (default: true) - determines whether another attempt would be made if the operation results in an error E. When it evaluates to true, we keep retrying; otherwise, we fail fast with the error.
The ResultPolicy[E, T] is generic both over the error (E) and result (T) type. Note, however, that for the direct and Try variants of the operation, the error type E is fixed to Throwable, while for the Either variant, E can be an arbitrary type.
API shorthands
When you don't need to customize the result policy (i.e. use the default one), you can use one of the following shorthands to define a retry policy with a given schedule (note that the parameters are the same as when manually creating the respective Schedule):
- RetryPolicy.immediate(maxRetries: Int),
- RetryPolicy.immediateForever,
- RetryPolicy.delay(maxRetries: Int, delay: FiniteDuration),
- RetryPolicy.delayForever(delay: FiniteDuration),
- RetryPolicy.backoff(maxRetries: Int, initialDelay: FiniteDuration, maxDelay: FiniteDuration, jitter: Jitter),
- RetryPolicy.backoffForever(initialDelay: FiniteDuration, maxDelay: FiniteDuration, jitter: Jitter).
If you want to customize a part of the result policy, you can use the following shorthands:
- ResultPolicy.default[E, T] - uses the default settings,
- ResultPolicy.successfulWhen[E, T](isSuccess: T => Boolean) - uses the default isWorthRetrying and the provided isSuccess,
- ResultPolicy.retryWhen[E, T](isWorthRetrying: E => Boolean) - uses the default isSuccess and the provided isWorthRetrying,
- ResultPolicy.neverRetry[E, T] - uses the default isSuccess and fails fast on any error.
Examples
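As a rough illustration of the result policy semantics, the following self-contained sketch models the two predicates and an immediate finite schedule. It does not call the library: the ResultPolicy case class is a simplified stand-in for the one described in this PR, and retrySketch is a hypothetical name.

```scala
// Simplified stand-in for the described result policy (both predicates default to true).
final case class ResultPolicy[E, T](
    isSuccess: T => Boolean = (_: T) => true,
    isWorthRetrying: E => Boolean = (_: E) => true
)

// Hypothetical loop with immediate retries and a finite limit.
@annotation.tailrec
def retrySketch[E, T](maxRetries: Int, policy: ResultPolicy[E, T])(operation: () => Either[E, T]): Either[E, T] =
  operation() match
    // an error is retried only if retries remain and the policy deems it worth retrying
    case Left(error) if maxRetries > 0 && policy.isWorthRetrying(error) =>
      retrySketch(maxRetries - 1, policy)(operation)
    // a non-erroneous but "unsuccessful" result is retried while retries remain
    case Right(value) if maxRetries > 0 && !policy.isSuccess(value) =>
      retrySketch(maxRetries - 1, policy)(operation)
    // otherwise the last outcome is returned as-is
    case result => result

// Fail fast: the error is not worth retrying, so the operation runs once.
var attempts = 0
val failFast = retrySketch(3, ResultPolicy[String, Int](isWorthRetrying = _ != "fatal")) { () =>
  attempts += 1
  Left("fatal")
}

// Never "successful": the operation runs maxRetries + 1 times and the last result is returned.
var polls = 0
val unsuccessful = retrySketch(2, ResultPolicy[String, Int](isSuccess = _ > 0)) { () =>
  polls += 1
  Right(0)
}
```

The second call shows the documented edge case: once maxRetries is reached, an "unsuccessful" result is still returned rather than turned into an error.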
See the tests in ox.retry.* for more.
Implementation choices
@tailrec vs while
To make the infinite policies stack-safe, the actual implementation of retry is tail-recursive. This resulted in some code duplication in the implementation, but it still seems more readable and nicer than the alternative variant with a plain old while loop with a couple of vars for state management.
Possible next steps
repeat(operation)(Schedule) plus a stop condition