Multiple controllers competing #467
Comments
There are three resources, as you know, involved in the workflow for IAC: ImageRepository, ImageUpdateAutomation, and ImagePolicy. IAC does not "look back" or consider the value that it is overwriting, or whether that value is higher or lower than the "latest" image currently known to ImageRepository. This is a relevant issue with some discussion around a similar idea.

Top of mind, I would not recommend running Image Automation on more than one cluster targeting the same paths/repos, as the controllers will compete with each other, and there's currently no way to address that except to run it in only one place.

Here's how it can happen. An image gets published, and Cluster 1 reconciles ImageRepository. That updates ImagePolicy, which passes the news to ImageUpdateAutomation, which reconciles the git repo, finds a diff, then commits and pushes a change.

Then, on Cluster 2, ImageRepository and ImageUpdateAutomation are both waiting for their next reconcile interval. ImageUpdateAutomation reconciles before ImageRepository (by random chance, let's say), so ImageRepository is still behind. The IUA resource reconciles its git repository, finds a diff, and overwrites the version with the (stale) version that ImagePolicy still understands to be the latest, because ImageRepository has not reconciled yet.

Last, ImageRepository on Cluster 2 is reconciled and catches up, so the ImagePolicy on both clusters is now current. IUA then gets notified again, and the dance stops after a third commit, "finally completing" the change in Git.

There could be a flag in the future that would enable you to "prevent downgrades", but the fact that IUA does not care what version was used before, only what version ImagePolicy says is the latest, is most likely the root cause of the behavior you described seeing.
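The sequence described above can be reproduced with a small simulation. This is only an illustrative sketch, not Flux code: the `GitRepo` and `Cluster` classes, their method names, and the version tags `1.0.0` and `1.0.1` are all made up for the example. The only behavior taken from the explanation is that the automation compares Git with what its own cluster's ImagePolicy believes is latest, without checking whether the write is a downgrade.

```python
from dataclasses import dataclass, field


@dataclass
class GitRepo:
    """Hypothetical stand-in for the shared Git repository."""
    tag: str
    commits: list = field(default_factory=list)

    def commit(self, author: str, new_tag: str) -> None:
        # Record (who committed, old tag, new tag), then apply the change.
        self.commits.append((author, self.tag, new_tag))
        self.tag = new_tag


@dataclass
class Cluster:
    """Hypothetical model of one cluster's image automation objects."""
    name: str
    policy_latest: str  # what this cluster's ImagePolicy believes is latest

    def reconcile_image_repository(self, registry_latest: str) -> None:
        # An ImageRepository scan refreshes this cluster's ImagePolicy.
        self.policy_latest = registry_latest

    def reconcile_automation(self, repo: GitRepo) -> None:
        # The automation only asks "does Git differ from ImagePolicy?".
        # It never asks whether the write would be a downgrade.
        if repo.tag != self.policy_latest:
            repo.commit(self.name, self.policy_latest)


def simulate() -> list:
    repo = GitRepo(tag="1.0.0")
    c1 = Cluster("cluster-1", "1.0.0")
    c2 = Cluster("cluster-2", "1.0.0")
    registry_latest = "1.0.1"  # a new image has just been published

    c1.reconcile_image_repository(registry_latest)
    c1.reconcile_automation(repo)   # commit 1: upgrade
    c2.reconcile_automation(repo)   # commit 2: stale policy, rollback
    c2.reconcile_image_repository(registry_latest)
    c2.reconcile_automation(repo)   # commit 3: upgrade again
    return repo.commits


if __name__ == "__main__":
    for author, old, new in simulate():
        print(f"{author}: {old} -> {new}")
```

Running it yields three commits, with the middle one written by the second cluster while its policy is still stale.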
Thanks for answering! That is what I suspected had happened to us, and it was a concern I had initially. See discussion. I think the "prevent downgrades" flag sounds very interesting. I prefer to always try to roll forward instead of reverting changes, so that would work well in our setup. Are there any plans to implement this, or is it still on the drawing board?
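To make the "prevent downgrades" idea concrete, here is a rough sketch of the check such a flag might perform. No such flag exists in the controller as far as this thread establishes; the function names and the `prevent_downgrades` parameter are hypothetical, and the sketch assumes plain numeric `MAJOR.MINOR.PATCH` tags rather than full semver.

```python
def parse_version(tag: str) -> tuple:
    # Assumption: tags are dot-separated integers such as "1.10.0".
    return tuple(int(part) for part in tag.split("."))


def should_write(current_tag: str, policy_tag: str,
                 prevent_downgrades: bool = True) -> bool:
    """Decide whether to overwrite the tag in Git (hypothetical logic)."""
    if current_tag == policy_tag:
        return False  # nothing to change
    if prevent_downgrades and parse_version(policy_tag) < parse_version(current_tag):
        return False  # the policy looks stale; skip instead of rolling back
    return True
```

With a guard like this, the stale second cluster in the scenario above would skip its write instead of committing the older tag.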
I'd prefer a flag which rate limits updates to the Git repo. Rollback, for us, is perfectly possible (we run tests as part of the pipeline and if they fail, we withdraw the bad image) but having the automation only update n seconds after a previous update would be useful.
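A rate limit of this kind could look roughly like the following. Again, this is a hypothetical sketch and not an existing option: `may_update`, `min_interval_s`, and the idea of reading the timestamp of the last commit from Git are assumptions made for illustration.

```python
from typing import Optional


def may_update(last_commit_ts: Optional[float], now: float,
               min_interval_s: float) -> bool:
    """Allow an update only if min_interval_s has passed since the last one."""
    if last_commit_ts is None:
        return True  # no previous automated commit to wait for
    return now - last_commit_ts >= min_interval_s
```

Because the timestamp would come from the shared repository rather than from controller memory, every cluster would see the same value for the previous update.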
Hi,

I am running Flux with the image-automation-controller enabled in two clusters. They are both watching the same image tag pattern. I just realised that they have been affecting each other, meaning that one controller rolled back to a previous version of a container image that was pushed. E.g.:

Obviously, the end result is correct, but I am curious to hear whether the "rollback" done at 11:20:53 is due to the image-automation-controller instances competing with each other. Does the image-automation-controller support multiple instances watching the same image tags, or is it built to have only a single image-automation-controller?

This is the first time I have run into this problem after using Flux for over a year, so I suspect some sort of race condition between the two controllers has happened. Unfortunately, I cannot reproduce it, so I suspect this is an edge case.