failure_action shell not triggered on NFS mount failure #42

Open
jbtrystram opened this issue Oct 8, 2024 · 3 comments

@jbtrystram

With the following kdump config:

path /
nfs server.example.com:/export/kdump
core_collector makedumpfile -l --message-level 7 -d 31
failure_action shell

If the NFS destination fails to mount, kdump does not launch the shell but instead tries to write the dump to the rootfs.

More information:

[get_mntpoint_from_target](https://github.com/rhkdump/kdump-utils/blob/88525ebf5e43cc86aea66dc75ec83db58233883b/kdump-lib-initramfs.sh#L104) does not fail when given a remote address that is not mounted.
When NFS is not mounted:

[root@kdtest ~]$ source /lib/kdump/kdump-lib.sh
[root@kdtest ~]$ get_mntpoint_from_target "172.16.82.90:/home/export"
<No result>

With NFS mounted:

[root@kdtest ~]$ mount "<nfs-server-address>:/home/kdump" /tmp/nfsmnt
[root@kdtest ~]$ source /lib/kdump/kdump-lib.sh
[root@kdtest ~]$ get_mntpoint_from_target "<nfs-server-address>:/home/kdump"
/tmp/nfsmnt

Maybe [dump_fs](https://github.com/rhkdump/kdump-utils/blob/main/dracut/99kdumpbase/kdump.sh#L135) should fail when passed an empty value? Or get_mntpoint_from_target should fail when given a remote fs that is not mounted?

In any case, the failure shell is not started.
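
The second suggestion could look roughly like the guard below. This is only a sketch, not the kdump-utils code: `lookup_mntpoint` is a hypothetical stand-in for get_mntpoint_from_target that reads a table in /proc/mounts format, and `resolve_dump_mntpoint` is a made-up wrapper name. The point is that an empty result becomes a non-zero exit status, which the caller could use to trigger failure_action.

```shell
#!/bin/sh
# Hypothetical stand-in for get_mntpoint_from_target: print the mount point
# of a target listed in a /proc/mounts-style table, or nothing if absent.
lookup_mntpoint() {
    # $1: target (e.g. server:/export), $2: mount table file (optional)
    awk -v t="$1" '$1 == t { print $2; exit }' "${2:-/proc/mounts}"
}

# Hypothetical wrapper: fail loudly instead of returning an empty string.
resolve_dump_mntpoint() {
    _mnt=$(lookup_mntpoint "$1" "$2")
    if [ -z "$_mnt" ]; then
        echo "dump target $1 is not mounted" >&2
        return 1
    fi
    echo "$_mnt"
}
```

A caller would then write something like `mnt=$(resolve_dump_mntpoint "$target") || return 1` rather than passing a possibly empty string on to the dump step.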

@licliu
Collaborator

licliu commented Oct 12, 2024

Hi @jbtrystram, kdump should try to mount the NFS target while building the kdump image, and the build will fail if the target cannot be mounted. Did you run kdumpctl rebuild && kdumpctl reload before you triggered the panic?
It would be even better if you could provide the steps to reproduce it.

@licliu licliu self-assigned this Oct 14, 2024
@masaki-hatada

Hi @licliu,

Please let me comment instead.

> Did you run kdumpctl rebuild && kdumpctl reload before you triggered the panic?

Yes.

> It would be even better if you could provide the steps to reproduce it.

Make the NFS mount fail intentionally.
Adding "dracut_args --omit-drivers nfs" to your kdump.conf, as follows, is a good way to reproduce it.

path /
nfs server.example.com:/export/kdump
dracut_args --omit-drivers nfs
core_collector makedumpfile -l --message-level 7 -d 31
failure_action shell
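
As a rough sketch, the reproduction could be scripted as below. The function names are made up for illustration, and the script assumes the config lives at /etc/kdump.conf, that no other dracut_args line is already present, and that sysrq is enabled. `run_repro` is deliberately not called, because its last command panics the machine.

```shell
#!/bin/sh
# Append the dracut_args line to a kdump.conf-style file unless it is
# already there. The file path is passed in by the caller.
add_omit_nfs() {
    conf=$1
    line='dracut_args --omit-drivers nfs'
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}

# Rebuild and reload the kdump image, then crash the kernel.
# Assumes /etc/kdump.conf and an enabled sysrq; run as root, on a test box.
run_repro() {
    add_omit_nfs /etc/kdump.conf &&
    kdumpctl rebuild &&
    kdumpctl reload &&
    echo c > /proc/sysrq-trigger
}
```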

The following is the output when the NFS mount did not fail.

[   11.012759] systemd[1]: Mounted /kdumproot.
[   12.651204] systemd[1]: Mounted /kdumproot.
[   11.040729] systemd[1]: Reached target Remote File Systems.
[   11.066565] systemd[1]: Starting dracut pre-pivot and cleanup hook...
[   11.125571] rpc.idmapd[478]: exiting on signal 15
[   11.156972] systemd[1]: var-lib-nfs-rpc_pipefs.mount: Deactivated successfully.
[   11.196980] systemd[1]: Finished dracut pre-pivot and cleanup hook.
[   11.241854] systemd[1]: Starting Kdump Vmcore Save Service...
[   11.269973] systemd[1]: Workaround dracut FIPS unmounting /boot was skipped because of an unmet condition check (ConditionPathExists=/run/ostree-live).
[   11.315855] kdump[632]: Kdump is using the default log level(3).
[   12.386861] kdump[666]: saving to /kdumproot/192.168.242.53-2024-11-14-07:57:28/
[   12.417546] kdump[671]: saving vmcore-dmesg.txt to /kdumproot/192.168.242.53-2024-11-14-07:57:28/
[   12.477687] kdump[677]: saving vmcore-dmesg.txt complete
[   14.116132] kdump[677]: saving vmcore-dmesg.txt complete
[   12.498262] kdump[679]: saving vmcore
[   14.136707] kdump[679]: saving vmcore
Copying data                                      : [100.0 %] \ 
[   15.355144] kdump.sh[680]: The dumpfile is saved to /kdumproot/192.168.242.53-2024-11-14-07:57:28//vmcore-incomplete.
[   15.376718] kdump.sh[680]: makedumpfile Completed.
[   15.398426] kdump[684]: saving vmcore complete
[   17.036870] kdump[684]: saving vmcore complete
[   15.420911] kdump[686]: saving the /run/initramfs/kexec-dmesg.log to /kdumproot/192.168.242.53-2024-11-14-07:57:28//
[   15.483365] kdump[692]: Executing final action systemctl reboot -f
[   17.121806] kdump[692]: Executing final action systemctl reboot -f
[   15.531619] systemd[1]: Shutting down.
[   17.170062] systemd[1]: Shutting down.

The following is the output when the NFS mount failed.
The dump was written to the /<server name>-<date> directory instead of respecting failure_action shell. This should not happen.

[   10.477714] systemd[1]: Mounting /kdumproot...
[   10.489769] systemd[1]: Acquire Live PXE rootfs Image was skipped because of an unmet condition check (ConditionPathExists=/run/ostree-live).
[   10.511505] systemd[1]: Persist Osmet Files (PXE) was skipped because of an unmet condition check (ConditionPathExists=/run/ostree-live).
[   10.532955] systemd[1]: Starting dracut pre-mount hook...
[   10.545789] systemd[1]: kdumproot.mount: Mount process exited, code=exited, status=32/n/a
[   10.562180] systemd[1]: kdumproot.mount: Failed with result 'exit-code'.
[   10.576575] systemd[1]: Failed to mount /kdumproot.
[   10.588735] systemd[1]: Dependency failed for Remote File Systems.
[   10.602625] systemd[1]: remote-fs.target: Job remote-fs.target/start failed with result 'dependency'.
[   10.620258] systemd[1]: Finished dracut pre-mount hook.
[   10.632989] systemd[1]: Reached target Initrd Root File System.
[   10.646501] systemd[1]: CoreOS Propagate Multipath Configuration was skipped because of an unmet condition check (ConditionKernelCommandLine=rd.multipath=default).
[   10.670546] systemd[1]: Mountpoints Configured in the Real Root was skipped because of an unmet condition check (ConditionPathExists=!/proc/vmcore).
[   10.693261] systemd[1]: Reached target Initrd File Systems.
[   10.706500] systemd[1]: Reached target Initrd Default Target.
[   10.720346] systemd[1]: dracut mount hook was skipped because no trigger condition checks were met.
[   10.737791] systemd[1]: Starting dracut pre-pivot and cleanup hook...
[   10.752395] systemd[1]: var-lib-nfs-rpc_pipefs.mount: Deactivated successfully.
[   10.768179] systemd[1]: Finished dracut pre-pivot and cleanup hook.
[   10.782795] systemd[1]: Starting Kdump Vmcore Save Service...
[   10.796993] systemd[1]: Workaround dracut FIPS unmounting /boot was skipped because of an unmet condition check (ConditionPathExists=/run/ostree-live).
[   11.271400] kdump[662]: saving to /192.168.242.53-2024-11-14-08:02:14/
[   11.298231] kdump[667]: saving vmcore-dmesg.txt to /192.168.242.53-2024-11-14-08:02:14/
[   11.337092] kdump[673]: saving vmcore-dmesg.txt complete
[   11.358309] kdump[675]: saving vmcore
Copying data                                      : [100.0 %] 
[   12.956985] kdump.sh[676]: The dumpfile is saved to /192.168.242.53-2024-11-14-08:02:14//vmcore-incomplete.
[   12.997566] kdump.sh[676]: makedumpfile Completed.
[   13.013112] kdump[680]: saving vmcore complete
[   13.030758] systemd[1]: Shutting down.
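
To tell the two cases apart in a captured console log, a small helper along these lines may be handy. It is only a sketch: both function names are hypothetical, and the patterns are taken from the messages shown in the logs above, so they may not match other kdump or systemd versions.

```shell
#!/bin/sh
# Return 0 if the captured console log shows that /kdumproot failed to mount.
kdumproot_mount_failed() {
    grep -q 'Failed to mount /kdumproot' "$1"
}

# Print the directory from the first "kdump[PID]: saving to <dir>" message.
kdump_save_dir() {
    sed -n 's/.*kdump\[[0-9]*]: saving to \(.*\)$/\1/p' "$1" | head -n 1
}
```

If `kdumproot_mount_failed` succeeds and `kdump_save_dir` prints a path that does not start with /kdumproot, the dump went to the initramfs root rather than to the NFS target.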

@licliu
Collaborator

licliu commented Nov 14, 2024

@masaki-hatada Thanks for the explanation!
So the issue is that when the NFS mount fails in the second kernel, kdump still tries to save the vmcore instead of failing immediately.
It seems like a dependency issue; I'll take a look.
