Add broken status #357
We just landed a PR to make the error fatal - see #354 and related discussion #352. From what I can see in the zfs issue, it seems like the release/package was completely broken leading to the problem. Is that correct? Overall I like the idea, although am cautious for regressions in the XXXs users out there parsing Some requests:
Glancing through the codebase - Thanks |
Various reasons from what I understand. -: dkms status output changed at some point, which broke the package scriptlet. All lead to various problems, when installing, re-installing, upgrading or removing. Tomorrow I will be gone for 10 days. |
IIRC the Thanks for the work unwrapping this. Enjoy your time AFK o/ |
Fwiw at a later point, if you feel like adding some zfs tests into our CI that would be deeply appreciated, although is entirely optional. |
@AllKind did you have the time to get back to this? Introducing a "broken" status is reasonable, although we'd need to update the documentation and add some test(s) as mentioned above. Thanks |
@evelikov |
0c6663f to 935e7af
@evelikov
1: I reverted the requested commit. It makes sense to not interrupt at
2: The broken status is printed to stdout like all the other states, plus an extra line to stderr that describes the problem.
3: I took a look at the internal functions. I added a check for the 'broken' status in do_autoinstall(). I also modified is_module_added() to only report a module as added if both the source directory and the symlink pointing to it exist.
4: The man page was updated.
Regarding adding tests, I don't know what to do there. Some more details would be useful. |
Welcome back o/ Thinking about the tests, here are some rough pseudo-code ideas:
Copy/edit the existing autoinstall test to:
No ideas ... if you can come up with any, that'll be great although optional. I would *request that we elaborate/define in the manual page how a NOTE: |
Broken in this PR means, that either the module source (and the dkms.conf) is missing, or the symlink 'source' pointing to it. |
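To make that definition concrete, here is a minimal sketch of such a check. It is not the code from this PR: `module_state` is a hypothetical helper, and the layout `<tree>/<module>/<version>/source` pointing at a directory that holds `dkms.conf` is an assumption made for illustration.

```shell
# Hypothetical helper, not the function added by this PR.
# Assumed layout: <tree>/<module>/<version>/source is a symlink to the
# module source directory, which contains dkms.conf.
# Prints "broken" if the symlink or the dkms.conf behind it is missing,
# otherwise prints "ok".
module_state() {
    link="$1/$2/$3/source"

    # -L is true for a symlink even when its target is gone.
    if [ ! -L "$link" ]; then
        echo "broken"
        return 0
    fi

    # -e follows the symlink, so a dangling link fails this test.
    if [ ! -e "$link/dkms.conf" ]; then
        echo "broken"
        return 0
    fi

    echo "ok"
}
```

Checking the symlink with `-L` before following it with `-e` is what separates "link missing" from "link dangling"; both end up as broken here.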
One thing I wanted to point out is: |
That's my understanding as well.
Fully agreed, let's avoid silently fixing things. |
In case it wasn't obvious - it's fine to issue |
935e7af to e4af22f
Just pushed a new version with a lot of changes. The description is in the updated commit message. I'd just like to confirm that the changes go in the right direction, conceptually. Also I have two questions: 1 - About the exit states on die(): I could not find any documentation in the source. What are the conventions? 2 - As I've never seen this before.... Why is every |
Seems like existing tests caught some breakage already. Huzzah for tests catching issues. Cannot see any new tests, not sure if you meant to
Instead of looking at the fixes/code alone, can I suggest opting for another route - TDD:
As general rule - code should use
It's a bashism - see https://unix.stackexchange.com/questions/48106/ for more. In practical sense - there are only a handful of cases (say 5%?) where they're needed ... even though we use it ~40% of the time. Don't worry pick whichever you're happy with. |
e4af22f to 153457c
Edit: I meant the existing tests here on github. Yeah, saw that, quickly did a fix. Will think about it more tomorrow.
I did not add any tests yet. I wanted to first check in with you guys, if my changes go a way you are comfortable with. Also a case for .gitignore? |
Hard to reply here, since the very first part is missing - "define the expectations". The commit message explains what the code does, instead of why. As a whole it doesn't seem to be doing crazy things. Will open a PR in a second to update .gitignore. Thanks o/ |
I thought we talked about the "expectations"... You then asked me to expand that to the other actions / functionality. As said I wanted to check with you, if the way I do it, is ok for you. Cool? |
Grr - didn't see the updated manual page, sorry my bad. It covers things afaict. Left a few specific comments but overall the work is fine |
ce23f98 to 8565169
So... I added the tests for the 'broken' status to the best of my knowledge (nothing for the match action - no existing tests to copy from), but the code is the same as in autoinstall()... so... I'd say this PR is ready for merge. |
8565169 to e81f165
This reverts commit c0004f0. Signed-off-by: Mart Frauenlob <[email protected]>
e81f165 to 939cfec
Adjusted the tests to the new output... Now this is odd:
Same test passes in the Ubuntu VMs. I do the testing in a Virtualbox VM with Fedora 38. |
4b53a1d to 27fc015
If either a dkms module source, or the symbolic link pointing to it, is missing, the output of `dkms status` will be messed up. Add a new status called 'broken', which will inform the user about it in a nicely formatted way. is_module_added() was modified to not report a module as added unless both the source directory and the 'source' symlink exist. do_autoinstall() and run_match() were modified to handle a broken status. They skip that particular module/version combo and continue iterating. The new function module_is_broken_and_die() was introduced to die early on a broken module. Because everything has to be considered volatile in a broken state, we always die. User intervention is required to restore a healthy environment. The only exception: if only the symbolic link 'source' is missing, the action 'add' can be used to re-add the module. The man page was updated with the new 'broken' status. Tests were added to the test suite. Signed-off-by: Mart Frauenlob <[email protected]>
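The "skip that particular module/version combo and continue iterating" behaviour could look roughly like the sketch below. This is an illustration only, not the actual do_autoinstall() code: `is_broken` and `install_all` are hypothetical names, the tree layout is assumed, and the "would install" line stands in for the real work.

```shell
# Hypothetical stand-in for the PR's broken check.
# $1 = path to <tree>/<module>/<version> (assumed layout).
is_broken() {
    [ ! -L "$1/source" ] || [ ! -e "$1/source/dkms.conf" ]
}

# Walk every module/version directory below the tree in $1.
# Broken entries are reported on stderr and skipped; the loop keeps
# going instead of aborting.
install_all() {
    tree="$1"
    for dir in "$tree"/*/*/; do
        [ -d "$dir" ] || continue      # glob matched nothing
        dir="${dir%/}"
        if is_broken "$dir"; then
            echo "skipping broken module: ${dir#"$tree"/}" >&2
            continue
        fi
        echo "would install: ${dir#"$tree"/}"
    done
}
```

Actions that operate on a single module would instead stop at the first broken entry, which is the role the commit message gives to module_is_broken_and_die().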
27fc015 to 9d6c9ab
Ok, worked around that by not using the |
Silent around here lately... |
Any word on this? |
Ouch, sorry for letting this slip through the cracks. Looking through it, it looks solid - the tests are more extensive than I would have gone for. Thank you. The easily annoyed user in me would have preferred "dkms remove the/broken/module" (and by extension unbuild/uninstall) to work, although I'm not 100% sure if that's a good idea. In the worst case, we can worry about it if people complain. Thanks again for the work (and prodding me) 👍 |
Resolved the merge conflict, but could not push back to your branch. Did you have In either case, I've resolved the issues and pushed this PR via the CLI. Closing |
If either a dkms module source, or the symbolic link pointing to it, is missing, the output of `dkms status` will be messed up. Add a new status called 'broken', which will inform the user about it in a nicely formatted way.
A recently discovered error in the ZFS installation routine led to this issue: openzfs/zfs#15336
The dkms module sources were deleted, but the files in the dkms tree were still there.
Therefore the symbolic link 'source' was dangling.
This led to `dkms status` output like this:
With this patch, the output will be like this:
Things which are IMHO up for discussion (of course only in case the change gets accepted):
Whether the "Missing ..." message should be on the same line as "broken", or on a line of its own.
For the latter, possibly sent to `stderr`, which would have the advantage of not breaking existing scripts that parse the output of `dkms status`.
Just as an idea... In the future other things may be checked for a healthy state, e.g. the installed modules. If found broken, they would be reported as such with this new state.
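As a rough illustration of the stdout/stderr split discussed above: the status line goes to stdout like any other state, and the explanation goes to stderr, so a script that reads only stdout never sees the extra line. `report_broken` is a hypothetical name and the line formats are made up for this sketch; they are not the real `dkms status` output.

```shell
# Hypothetical reporter; the output format is illustrative only.
# $1 = module, $2 = version, $3 = description of what is missing.
report_broken() {
    echo "$1/$2: broken"                  # stdout: what parsers read
    echo "Missing $3 for $1/$2" >&2       # stderr: explanation for humans
}
```

A caller that pipes the output through another tool only receives the first line, while an interactive user still sees both.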
I didn't look into the codebase more deeply yet, to propose something more concrete. But maybe I will. If I come up with something reasonable, I'll put it up for discussion.
Have a nice day!