Config to automatically Re-trigger failed periodics #358
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: jmguzik The full list of commands accepted by this bot can be found here.
8d1e0bd to de145fc
/lgtm
/cc
/hold Please provide a PR description of what the PR actually implements. The link to the issue is helpful but not sufficient, because the issue itself contains some discussion about options for a vague idea, not a clear implementation direction.
@petr-muller done
/hold cancel
I'll review tomorrow 🙏
One thing we need to be intentional about is the interaction between the next legit trigger and a retry when both could occur, and also whether it is acceptable for retries to "push back" interval triggers.
As done here, the next legit trigger takes precedence, basically interrupting the retry sequence. This may especially impact the `stop on success = false` case, where we expect to deterministically get the configured number of retries but won't, because the legit-triggered job interrupts the sequence.
That sounds acceptable to me, but it needs to be intentional, documented behavior.
@petr-muller thanks for the review. I agree with most of your comments, but I will resolve them once we reach a common understanding. As noted in the comments, every job marked with the new fields (in the configuration) will have an associated label. The first run will have the value 1. I wanted to keep other periodics unaffected, so I use this field to decide whether to process a periodic as normal or as repeatable.
I think that, the way it is implemented, the normal trigger (in this implementation, the first run with value 1) will always have precedence because as soon as
That's true
Makes sense.
New changes are detected. LGTM label has been removed.
Signed-off-by: Jakub Guzik <[email protected]>
This PR introduces a new configuration field (along with the functionality in the horologium component) for ProwJobs to support automatically re-triggering periodic jobs based on a specified policy. The new `retrigger-failed-run` field allows retrying a failed periodic job with customizable settings for the number of attempts and the interval between retries.

Example usage:

`until_success: true` stops the retries once a success state is reached. When set to `false`, the job is re-triggered until `attempts` is reached.

Solves #268