Add retry for testrunner validation #65
result = await docker.pull(test_runner)
if result:
    break
Maybe add a log print? I guess OpenTelemetry will not work here, but a plain print would at least show when retries are done.
Good point, I'll add that.
Fixed.
If it's not too hard we should add some data to an otel span here describing how many retries were done.
This is not required. We could add it to metrics later instead.
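One way the retry count could be captured is sketched below, under stated assumptions: `retry_with_count`, `make_flaky` and `FakeSpan` are hypothetical names and not part of this project, and the attribute name `validation.retries` is made up for illustration. The helper only relies on the span object having a `set_attribute(key, value)` method, which the OpenTelemetry Python `Span` provides.

```python
import asyncio


class FakeSpan:
    """Stand-in for a real span so the sketch runs without OpenTelemetry."""

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, key, value):
        self.attributes[key] = value


async def retry_with_count(check, attempts=3, delay=0.0, span=None):
    """Call the async `check` until it returns a truthy value.

    Returns (result, retries), where retries is the number of extra
    attempts made after the first one.
    """
    result = None
    retries = 0
    for attempt in range(attempts):
        retries = attempt
        result = await check()
        if result:
            break
        # Plain print, as suggested in the review, so retries are visible.
        print(f"Attempt {attempt + 1}/{attempts} failed")
        if attempt < attempts - 1:
            await asyncio.sleep(delay)
    if span is not None:
        # Hypothetical attribute name; record how many retries were needed.
        span.set_attribute("validation.retries", retries)
    return result, retries


def make_flaky(failures):
    """Return an async check that fails `failures` times, then succeeds."""
    calls = {"n": 0}

    async def check():
        calls["n"] += 1
        return calls["n"] > failures

    return check
```

With a real tracer, the `FakeSpan` instance would be swapped for the current span; the helper itself would not need to change.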
@@ -191,6 +193,13 @@ async def validate(self, test_suite_url):
        test_runners.add(constraint.value)
    docker = Docker()
    for test_runner in test_runners:
        for _ in range(3):
            result = await docker.pull(test_runner)
Is a complete pull necessary here? Can it be avoided?
IIRC there is no other way to verify that the image is available for download. @t-persson you're the original author, do you remember?
Wait a moment, looking at this more closely, that shouldn't be docker.pull() at all. The Docker() class doesn't even have a pull() method. This is most likely a Copilot hallucination that slipped past my weary eyes.
That also means that there is no pulling of the actual image.
Fixed.
Force-pushed from dc421c9 to 2bc0d0f.
Description of the Change
Add retries when validating test runners within test suite validation. This will increase resilience against random network outages and other intermittent, non-permanent network-related problems.
Alternate Designs
I could have added the retries in several places in the code, as there are chains of HEAD requests made throughout the validation code. However, since the validation is fairly quick, I opted for one retry to rule them all.
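For readers unfamiliar with checking an image via HEAD requests instead of pulling it, here is a rough sketch of the idea. It is not the project's validation code: `manifest_url` and `image_exists` are hypothetical helpers, the registry host in the usage note is a placeholder, and the token exchange that most registries require is left out. The URL layout follows the registry HTTP API (`/v2/<name>/manifests/<reference>`).

```python
import urllib.request


def manifest_url(registry, repository, reference):
    """Build the manifest URL used by the registry HTTP API (v2)."""
    return f"https://{registry}/v2/{repository}/manifests/{reference}"


def image_exists(registry, repository, reference, timeout=5.0):
    """Return True if a HEAD request for the manifest answers 200.

    Registries that require authentication answer 401 first; handling
    that token exchange is deliberately not part of this sketch.
    """
    request = urllib.request.Request(
        manifest_url(registry, repository, reference),
        method="HEAD",
        headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors, connection failures and timeouts alike.
        return False
```

A call such as `image_exists("registry.example.com", "team/test-runner", "1.0.0")` would then be the unit that a single outer retry wraps, rather than retrying each request in the chain separately.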
Possible Drawbacks
The backoff is currently very rudimentary and short-lived; we will not "survive" longer outages with the current code.
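If longer outages ever need to be survived, a capped exponential backoff is one common option. The sketch below is only an illustration of that technique, not what this PR implements; `backoff_delays` and all of its parameters and defaults are assumptions.

```python
import random


def backoff_delays(attempts, base=1.0, factor=2.0, cap=30.0, jitter=0.0):
    """Return the wait (in seconds) before each retry.

    The n-th retry waits base * factor**n, capped at `cap`. An optional
    random jitter of up to `jitter` seconds is added to spread out
    clients that fail at the same time.
    """
    delays = []
    for n in range(attempts - 1):  # no wait is needed after the last attempt
        delay = min(cap, base * factor ** n)
        if jitter:
            delay += random.uniform(0, jitter)
        delays.append(delay)
    return delays


print(backoff_delays(5))            # [1.0, 2.0, 4.0, 8.0]
print(backoff_delays(7, cap=10.0))  # [1.0, 2.0, 4.0, 8.0, 10.0, 10.0]
```

Each value would be passed to `asyncio.sleep()` between attempts; the cap keeps the total wait bounded while still covering outages longer than a few seconds.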
Sign-off
Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.
Signed-off-by: Fredrik Fristedt <[email protected]>