Failed service not triggering rollback #25

Open
slaecker opened this issue Nov 1, 2024 · 2 comments

slaecker commented Nov 1, 2024

On MicroOS, I recently observed that k3s fails on kernel 6.6.58-1-longterm but starts up successfully on kernel-default. I therefore wanted to add k3s to health-checker so that a kernel update that breaks k3s forces a rollback.

I added k3s.service to the After= setting of health-checker.service, but the health check always succeeded.

cat /etc/systemd/system/health-checker.service.d/override.conf
[Unit]
After=k3s.service

Because k3s.service stayed in the Activating state due to Restart=always, I added an override:

cat /etc/systemd/system/k3s.service.d/override.conf
[Service]
Restart=on-success

Now the service fails on reboot, but health-checker still does not perform a rollback; the check still succeeds.

Am I missing something?

systemctl status k3s.service 
× k3s.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s.service; enabled; preset: disabled)
    Drop-In: /etc/systemd/system/k3s.service.d
             └─override.conf
     Active: failed (Result: exit-code) since Fri 2024-11-01 12:14:25 CET; 12min ago
   Duration: 1.254s
 Invocation: 3435ccb41df54841b4e1482cc86e52d4
       Docs: https://k3s.io
   Main PID: 1163 (code=exited, status=2)
        CPU: 6.846s
systemctl status health-checker.service 
○ health-checker.service - MicroOS Health Checker
     Loaded: loaded (/usr/lib/systemd/system/health-checker.service; enabled; preset: enabled)
    Drop-In: /etc/systemd/system/health-checker.service.d
             └─override.conf
     Active: inactive (dead) since Fri 2024-11-01 12:14:25 CET; 13min ago
 Invocation: 30a45f6a1e1645f584fd606313f48934
   Main PID: 1326 (code=exited, status=0/SUCCESS)
        CPU: 620ms

Nov 01 12:14:24 vm-opensuse-microos-test systemd[1]: Starting MicroOS Health Checker...
Nov 01 12:14:24 vm-opensuse-microos-test health-checker[1326]: Clearing GRUB flag
Nov 01 12:14:24 vm-opensuse-microos-test health-checker[1327]: grub2-editenv: error: cannot open `/boot/grub2/grubenv': Read-only file system.
Nov 01 12:14:24 vm-opensuse-microos-test health-checker[1326]: Starting health check
Nov 01 12:14:24 vm-opensuse-microos-test health-checker[1373]: active
Nov 01 12:14:25 vm-opensuse-microos-test health-checker[1326]: Health check passed

thkukuk commented Nov 4, 2024

There is no plugin for k3s, so health-checker cannot verify the service and ignores it.
Somebody needs to write and provide a plugin first.
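
For illustration, a plugin of this kind might look roughly like the sketch below. This is not the code from the pull request mentioned later in the thread. It assumes that health-checker plugins are shell scripts called with "check" or "stop" as the first argument and that a non-zero exit status marks the check as failed. The SYSTEMCTL variable and the function names are hypothetical and exist only so the logic can be tried without touching real units.

```shell
#!/bin/bash
# Hypothetical k3s plugin sketch for health-checker (assumed interface:
# "check" runs the test, "stop" stops the service, non-zero exit = failure).

# Assumption: SYSTEMCTL is an illustrative override hook, not part of
# health-checker itself. It defaults to the real systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

run_checks() {
    # Skip the check when k3s is not enabled on this host.
    "$SYSTEMCTL" is-enabled -q k3s.service || return 0
    # "is-failed" returns 0 when the unit is in the failed state.
    if "$SYSTEMCTL" is-failed -q k3s.service; then
        return 1
    fi
    return 0
}

stop_services() {
    "$SYSTEMCTL" stop k3s.service
}

main() {
    case "$1" in
        check) run_checks ;;
        stop)  stop_services ;;
        *)     echo "ERROR: unknown command $1" >&2; return 1 ;;
    esac
}

# Dispatch only when an argument is given, so the functions can also be
# sourced and exercised in isolation.
if [ "$#" -gt 0 ]; then
    main "$@"
    exit $?
fi
```

Note that a check based on "is-failed" only triggers once the unit actually reaches the failed state. With Restart=always, as described above, k3s.service stays in the Activating state, so such a check would still pass unless the restart policy is changed or a different condition is tested.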

slaecker commented Nov 5, 2024

Thanks, I worked on this today and created pull request #27.
