Cleanup Part 1 - First refactors #990
This COULD be merged after PR 989 if you want, but does not need to be. However, it MUST be merged after PR 982, as it contains a few of its commits. (needs rebase)
cad319d to 41ef215
DCO issue on a commit, pushed it back.
41ef215 to d5d9374
I don't like the idea of creating so many small packages with exported variables. Yet, I find that this is the best way to achieve the goals of the cleanup. I intend to continue cleaning up with a part 2, doing larger changes, then internalizing many packages.
Without this, the interface and the code to reboot is a bit more complex than it should be. We do not need setters and getters, as we are just instantiating a single instance of a rebooter interface. We create it based on user input, then pass the object around. This should clean up the code. Signed-off-by: Jean-Philippe Evrard <[email protected]>
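To illustrate the pattern this commit message describes (build one rebooter from user input, then pass it around instead of going through setters and getters), here is a minimal sketch. All names in it (`Rebooter`, `CommandRebooter`, `SignalRebooter`, `NewRebooter`, `rebootAsRequired`) are hypothetical and chosen for illustration; they are not taken from the kured code, and the reboot actions are stubbed out with prints.

```go
package main

import (
	"errors"
	"fmt"
)

// Rebooter is a hypothetical minimal interface: anything that can trigger a reboot.
type Rebooter interface {
	Reboot() error
}

// CommandRebooter would run a reboot command (stubbed here with a print).
type CommandRebooter struct {
	Command []string
}

func (c CommandRebooter) Reboot() error {
	fmt.Printf("would run command: %v\n", c.Command)
	return nil
}

// SignalRebooter would send a signal to the init process (stubbed here with a print).
type SignalRebooter struct {
	Signal int
}

func (s SignalRebooter) Reboot() error {
	fmt.Printf("would send signal %d\n", s.Signal)
	return nil
}

// NewRebooter picks exactly one implementation from user input.
func NewRebooter(method string, command []string, signal int) (Rebooter, error) {
	switch method {
	case "command":
		return CommandRebooter{Command: command}, nil
	case "signal":
		return SignalRebooter{Signal: signal}, nil
	default:
		return nil, errors.New("unknown reboot method: " + method)
	}
}

// rebootAsRequired receives the ready-made object; there is no global state to get or set.
func rebootAsRequired(r Rebooter) error {
	return r.Reboot()
}

func main() {
	r, err := NewRebooter("command", []string{"/bin/systemctl", "reboot"}, 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if err := rebootAsRequired(r); err != nil {
		fmt.Println("reboot failed:", err)
	}
}
```

The caller only depends on the interface, so adding another reboot method in such a design would mean adding one type and one `case`, without touching the loop that uses it.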
Without this, it makes the code a bit harder to read. This fixes it by extracting the method. Signed-off-by: Jean-Philippe Evrard <[email protected]>
This will be useful to refactor the checkers loop. Signed-off-by: Jean-Philippe Evrard <[email protected]>
This will make it easier to manipulate main in the future. Signed-off-by: Jean-Philippe Evrard <[email protected]>
Without this, validations are all over the place. This moves some validations directly into the function, to make the code simpler to read. Signed-off-by: Jean-Philippe Evrard <[email protected]>
Without this, the variable name is hard to follow. This fixes it by cleaning up the var name. Signed-off-by: Jean-Philippe Evrard <[email protected]>
d5d9374 to 1e8b592
pkg/daemonsetlock/daemonsetlock.go
func (dsl *DaemonSetLock) GetDaemonSet(sleep, timeout time.Duration) (*v1.DaemonSet, error) {
	var ds *v1.DaemonSet
	var lastError error
	err := wait.PollImmediate(sleep, timeout, func() (bool, error) {
PollUntilContextTimeout might be cleaner here: https://pkg.go.dev/k8s.io/apimachinery/pkg/util/wait#PollUntilContextTimeout
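For readers unfamiliar with the suggestion, the sketch below shows the general shape of context-bound polling using only the standard library. It is not the apimachinery implementation: `pollUntilContextTimeout`, `getWithRetry`, and `runExample` are local, hypothetical helpers, and the parameter order (context, interval, timeout, immediate flag, context-aware condition) is only modelled on the linked function. The retry wrapper keeps the last error, similar in spirit to the `lastError` variable in the snippet under review.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// conditionFunc reports whether polling is done, or returns an error that aborts it.
type conditionFunc func(ctx context.Context) (bool, error)

// pollUntilContextTimeout is a local stand-in, NOT the apimachinery function.
// It calls cond every interval until cond is done, cond fails, or timeout expires.
func pollUntilContextTimeout(ctx context.Context, interval, timeout time.Duration, immediate bool, cond conditionFunc) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	if immediate {
		if done, err := cond(ctx); err != nil || done {
			return err
		}
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			if done, err := cond(ctx); err != nil || done {
				return err
			}
		}
	}
}

// getWithRetry retries fetch until it succeeds or the timeout expires,
// remembering the last fetch error so it can be reported on timeout.
func getWithRetry(fetch func() (string, error), sleep, timeout time.Duration) (string, error) {
	var result string
	var lastErr error
	err := pollUntilContextTimeout(context.Background(), sleep, timeout, true,
		func(ctx context.Context) (bool, error) {
			result, lastErr = fetch()
			return lastErr == nil, nil // a failed fetch means "not done yet", not "abort"
		})
	if err != nil {
		return "", fmt.Errorf("giving up: %w (last error: %v)", err, lastErr)
	}
	return result, nil
}

// runExample uses a fake fetch that fails twice before succeeding.
func runExample() (string, int, error) {
	attempts := 0
	fetch := func() (string, error) {
		attempts++
		if attempts < 3 {
			return "", errors.New("not ready")
		}
		return "daemonset", nil
	}
	v, err := getWithRetry(fetch, 10*time.Millisecond, time.Second)
	return v, attempts, err
}

func main() {
	v, attempts, err := runExample()
	fmt.Println(v, attempts, err)
}
```

Compared with a plain interval-and-timeout poll, the context-based shape lets a caller cancel the wait from outside, which is one reason the suggestion could be considered cleaner.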
Will do on follow up commit.
Done in a new commit, but not really confident here, as it has no test coverage. Can you double check on your side?
looks sane to me
@jackfrancis can you please not squash merge this time? If necessary, I will adapt the commits to make them look cleaner, but I really dislike losing history, especially when those commits are mostly atomic.
5e29f01 to ab24e06
I added a few extra commits, merely to manage the API surface and remove deps. I will stop here, it's large enough. Everything is repushed, ready for your review again. It answers all the questions asked.
Without this, the checkers are only shell calls: test -f sentinelFile, or sentinelCommand. This changes the behaviour of existing code to test file for sentinelFile checker, and to keep the sentinel command as a command. However, to avoid having validation in the root loop, it moves to use a constructor to cleanup the code. Signed-off-by: Jean-Philippe Evrard <[email protected]>
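As a rough illustration of the change described above (checking the sentinel file directly rather than shelling out to `test -f`, while keeping the sentinel command as a command, with validation moved into constructors), here is a small sketch. The names `Checker`, `FileChecker`, `CommandChecker` and their constructors are hypothetical, and treating a zero exit code as "reboot required" is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// Checker is a hypothetical interface answering "does this node need a reboot?".
type Checker interface {
	RebootRequired() bool
}

// FileChecker checks for a sentinel file without spawning a shell.
type FileChecker struct {
	path string
}

// NewFileChecker validates its input so the main loop does not have to.
func NewFileChecker(path string) (*FileChecker, error) {
	if path == "" {
		return nil, errors.New("sentinel file path must not be empty")
	}
	return &FileChecker{path: path}, nil
}

// RebootRequired mirrors `test -f`: the path must exist and be a regular file.
func (f *FileChecker) RebootRequired() bool {
	info, err := os.Stat(f.path)
	return err == nil && info.Mode().IsRegular()
}

// CommandChecker runs a sentinel command.
type CommandChecker struct {
	argv []string
}

// NewCommandChecker rejects an empty command at construction time.
func NewCommandChecker(argv []string) (*CommandChecker, error) {
	if len(argv) == 0 || argv[0] == "" {
		return nil, errors.New("sentinel command must not be empty")
	}
	return &CommandChecker{argv: argv}, nil
}

// RebootRequired assumes (for this sketch) that exit code 0 means a reboot is needed.
func (c *CommandChecker) RebootRequired() bool {
	return exec.Command(c.argv[0], c.argv[1:]...).Run() == nil
}

func main() {
	// Create a temporary sentinel file so the example is self-contained.
	f, err := os.CreateTemp("", "reboot-required")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.Remove(f.Name())
	f.Close()

	var c Checker
	c, err = NewFileChecker(f.Name())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("reboot required:", c.RebootRequired())
}
```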
Without this patch, the rebooter interface has data which is not related to the rebooter interface. This should get removed to make it easier to maintain. The loss comes from the logging, which mentioned the node. In order to not have a regression compared to [1], this ensures that at least the node to be rebooted appears in the main. [1]: kubereboot#134 Signed-off-by: Jean-Philippe Evrard <[email protected]>
Without this, we have no validation of the data in command/signal reboot. This was not a problem in the first refactor, as the constructor was a dummy one, without validation. However, as we refactored, we now have code in the root method that is validation for the reboot command. This can now be encompassed in the constructor. Signed-off-by: Jean-Philippe Evrard <[email protected]>
Without this, the main loop needs 3 functions simply to parse flags and env variables (excluding input validation). This is a bit more complex than it should be, especially since we only need to parse command line flags and env vars. This fixes it by simply using pflags (which we were already using) instead of pflags + viper + cobra (from which we get no benefit), and removing all the methods other than the mapping of env vars to cli flags. The main code is now far simpler: it handles the reading, parsing, and returning in case of error. As we do not bubble up errors from rebootasRequired yet, this is good enough at this moment. Signed-off-by: Jean-Philippe Evrard <[email protected]>
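The idea of mapping environment variables onto command line flags with a single flag library can be sketched as follows. To stay self-contained, this sketch uses the standard library `flag` package as a stand-in for pflags; the `KURED` prefix, the flag names and defaults, and the helpers `envName`, `applyEnv`, and `parse` are all assumptions made for illustration, not the code from this commit.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// envName maps a flag name such as "reboot-sentinel" to "KURED_REBOOT_SENTINEL".
func envName(prefix, flagName string) string {
	return prefix + "_" + strings.ToUpper(strings.ReplaceAll(flagName, "-", "_"))
}

// applyEnv sets every flag that was NOT given on the command line from its
// environment variable, if one is present. Explicit flags take precedence.
func applyEnv(fs *flag.FlagSet, prefix string, lookup func(string) (string, bool)) error {
	explicit := map[string]bool{}
	fs.Visit(func(f *flag.Flag) { explicit[f.Name] = true })

	var firstErr error
	fs.VisitAll(func(f *flag.Flag) {
		if explicit[f.Name] || firstErr != nil {
			return
		}
		name := envName(prefix, f.Name)
		if v, ok := lookup(name); ok {
			if err := fs.Set(f.Name, v); err != nil {
				firstErr = fmt.Errorf("invalid value for %s: %w", name, err)
			}
		}
	})
	return firstErr
}

// parse reads flags first, then fills the gaps from the environment.
func parse(args []string, lookup func(string) (string, bool)) (string, int, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	sentinel := fs.String("reboot-sentinel", "/var/run/reboot-required", "path to the sentinel file")
	concurrency := fs.Int("concurrency", 1, "number of nodes rebooting at once")

	if err := fs.Parse(args); err != nil {
		return "", 0, err
	}
	if err := applyEnv(fs, "KURED", lookup); err != nil {
		return "", 0, err
	}
	return *sentinel, *concurrency, nil
}

func main() {
	sentinel, concurrency, err := parse(os.Args[1:], os.LookupEnv)
	if err != nil {
		fmt.Println("error:", err)
		os.Exit(1)
	}
	fmt.Println("sentinel:", sentinel, "concurrency:", concurrency)
}
```

Passing the lookup function in (rather than calling `os.LookupEnv` directly) keeps the parsing testable without touching the real process environment.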
If the notification url configuration is known to be not working, this should be raised as an error, not a warning. Without this, it would be easy to miss a misconfiguration. Signed-off-by: Jean-Philippe Evrard <[email protected]>
Implementation details of the lock should not leak into the calling methods. Without this patch, calls are a bit more complex and error handling is harder to find. This is a problem for long term maintenance, as it is tougher to refactor the locks without impacting the main. Decoupling the two (main usage of the lock, and the locks themselves) will allow us to introduce other kinds of locks easily.

I solve this by inlining into the daemonsetlock package:
- including all the methods for managing locks from the main.go functions. Those were mostly doing error handling where code became no-op by introducing multiple daemonsetlock types
- adding the lock release delay as part of the lock info

I also did not like the pattern included in the Test method, which added a reference to nodeMeta: it was not very clear whether Test was storing the current metadata of the node, or was returning the current state. (Metadata here only means unschedulable.) The problem I saw was that the metadata was silently mutated from a lock Test method, which was not obvious at all. Instead, I chose to explicitly return the lock data.

I also made it explicit that the Acquire lock method is passing the node metadata as structured information, rather than an interface{}. This is a bit more fragile at runtime, but I prefer having very explicit errors if the locks are incorrect, rather than having to deal with unvalidated data.

The lock release delay was part of the rebootasrequired loop, where I believe it makes more sense to be part of the Release method itself, for readability. This hides the delay in an implementation detail, but it keeps the reboot as required goroutine more readable. Instead of passing the argument rebootDelay as a parameter of the rebootasrequired method, this refactor moves the creation of the lock object into the main loop, close to all the variables, and then passes the lock object to rebootasrequired.

This makes the call to rebootasrequired clearer, and the lock now encompasses everything needed to acquire, release, or get info about the lock. Signed-off-by: Jean-Philippe Evrard <[email protected]>
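The lock shape described in this commit message (typed node metadata passed to Acquire, Test returning the lock data explicitly instead of mutating a caller-supplied value, and the release delay living inside Release) could look roughly like the sketch below. The `Lock` interface, `NodeMeta`, and the in-memory `memoryLock` are hypothetical and only stand in for a real DaemonSet-annotation lock; the method signatures are assumptions, not the ones from the daemonsetlock package.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// NodeMeta is structured metadata stored with the lock (here: only unschedulable).
type NodeMeta struct {
	Unschedulable bool
}

// Lock is a hypothetical interface hiding how the lock is actually stored.
type Lock interface {
	// Acquire stores typed metadata; it reports success and the current holder.
	Acquire(meta NodeMeta) (acquired bool, holder string, err error)
	// Test returns the stored metadata explicitly instead of mutating an argument.
	Test() (held bool, meta NodeMeta, err error)
	// Release waits for the configured delay, then frees the lock.
	Release() error
}

// memoryLock is an in-process stand-in for a real distributed lock.
type memoryLock struct {
	nodeID       string
	releaseDelay time.Duration
	holder       string
	meta         NodeMeta
}

// NewMemoryLock builds the lock once, next to the rest of the configuration.
func NewMemoryLock(nodeID string, releaseDelay time.Duration) Lock {
	return &memoryLock{nodeID: nodeID, releaseDelay: releaseDelay}
}

func (l *memoryLock) Acquire(meta NodeMeta) (bool, string, error) {
	if l.holder != "" && l.holder != l.nodeID {
		return false, l.holder, nil // someone else holds it
	}
	l.holder, l.meta = l.nodeID, meta
	return true, l.nodeID, nil
}

func (l *memoryLock) Test() (bool, NodeMeta, error) {
	if l.holder != l.nodeID {
		return false, NodeMeta{}, nil
	}
	return true, l.meta, nil
}

func (l *memoryLock) Release() error {
	if l.holder != l.nodeID {
		return errors.New("lock is not held by " + l.nodeID)
	}
	time.Sleep(l.releaseDelay) // the delay is an implementation detail of the lock
	l.holder, l.meta = "", NodeMeta{}
	return nil
}

func main() {
	lock := NewMemoryLock("node-1", 50*time.Millisecond)
	if ok, holder, err := lock.Acquire(NodeMeta{Unschedulable: true}); err != nil || !ok {
		fmt.Println("could not acquire, holder:", holder, "err:", err)
		return
	}
	held, meta, _ := lock.Test()
	fmt.Println("held:", held, "unschedulable:", meta.Unschedulable)
	fmt.Println("release error:", lock.Release())
}
```

With this shape, the reboot loop only needs the `Lock` value; it does not need to know the delay or how the metadata is serialized.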
Without this, a bit of the validation is done in main, while the rest is done in each constructor. This fixes it by creating a new global constructor in checkers/reboot to solve all the cases and bubble up the errors. I preferred keeping the old constructors, and calling them; this way someone wanting to have a fork of the code could still directly create the right checker/rebooter, without the arbitrary decisions taken by the generic constructor. However, kured is not a library, and was never intended to be usable in forks, so we might want to reconsider in part 2 of the refactor. Signed-off-by: Jean-Philippe Evrard <[email protected]>
This will remove double pointers, and be explicit about the type we are using. Signed-off-by: Jean-Philippe Evrard <[email protected]>
Replaced with PollUntilContextTimeout. Signed-off-by: Jean-Philippe Evrard <[email protected]>
Without this, it is impossible to bubble up errors to main. Signed-off-by: Jean-Philippe Evrard <[email protected]>
The main is doing flag validation through pflags, then did further validation by involving the constructors. With the recent refactor on the commit "Refactor constructors" in this branch, we moved away from that pattern. However, it means we reintroduced a log dependency into our external API, and the external API now had extra validations regardless of the type. This is unnecessary, so I moved away from that pattern, and moved back all the validation into a central place, internal, which is only doing what kured would desire, without exposing it to users. The users could still theoretically use the proper constructors for each type, as they would validate just fine. The only thing they would lose is the kured internal decision of validation/precedence. Signed-off-by: Jean-Philippe Evrard <[email protected]>
At the same time, this removes the alert public package. Alert was only used inside the prometheus blocker, so this simplifies the code. Signed-off-by: Jean-Philippe Evrard <[email protected]>
ab24e06 to f559a95
@ckotzbauer I know you're quite busy nowadays... Could you tell us if you have time to review this/give your opinion? I am okay to merge it with only @jackfrancis approval... but in case you want to have a look, don't hesitate to tell.
@jackfrancis should we get the ball rolling?
lgtm
@evrardjp feel free to merge w/ your preferred commit preservation strategy
Thanks @jackfrancis for the reviews.
This contains a few commits that were historically in the MR 982 [1], to refactor kured for more maintainability.
This will be the first in a series of refactors.