False positive "seccomp filter pointer corruption" on Linux 6.11.0-1-default x86_64 Opensuse tumbleweed #354

Closed
Laitinlok opened this issue Sep 30, 2024 · 65 comments

Comments

@Laitinlok

Random crashes on boot, constant kernel panics at runtime, and the system cannot reboot properly when LKRG is loaded via systemd. The system shows an LKRG error during reboot when the module is loaded at runtime.

@solardiz
Contributor

Thank you for reporting this @Laitinlok and sorry LKRG isn't working well for you. Please provide more detail - what architecture, what distro, what kernel build (e.g. specific distro package or whether it's your own build), kernel config. Please try loading LKRG with kINT enforcement disabled, e.g., with insmod lkrg.ko kint_enforce=1, and show us what appears in dmesg.

If the problem somehow only shows up when you use the systemd service and enable the service to start at boot, then you can similarly debug this by adding options lkrg kint_enforce=1 to /etc/modprobe.d/lkrg.conf (create it).

Alternatively, you can try putting lkrg.kint_enforce = 1 in /etc/sysctl.d/01-lkrg.conf, which is likely to also do the trick, although it takes effect a tiny bit later (than the /etc/modprobe.d/lkrg.conf way).

@Laitinlok
Author

Thank you for the swift reply, I will try it and report back.

@Laitinlok
Author

Opensuse tumbleweed with kernel-default from zypper.

@solardiz
Contributor

solardiz commented Oct 2, 2024

> Opensuse tumbleweed with kernel-default from zypper.

We do test on OpenSUSE Tumbleweed here in GitHub Actions, and that test passes. But maybe there's something different in your setup, or maybe it takes longer for the issue to show up.

Are you still planning to provide the additional detail I asked for above? Thank you!

@solardiz
Contributor

solardiz commented Oct 2, 2024

> We do test on OpenSUSE Tumbleweed here in GitHub Actions, and that test passes.

Oh, I see the last time it ran (Sep 24) it used 6.10.11-1. Maybe they've updated to 6.11 since. We'll need to re-run the test.

@solardiz
Contributor

solardiz commented Oct 2, 2024

> > We do test on OpenSUSE Tumbleweed here in GitHub Actions, and that test passes.
>
> Oh, I see the last time it ran (Sep 24) it used 6.10.11-1. Maybe they've updated to 6.11 since. We'll need to re-run the test.

I'm sorry I totally forgot for a moment that it's a build-only test, so it's not supposed to detect this issue. (We do also test boot-up with some other distros.)

So still need more info on this one from you, @Laitinlok.

@Laitinlok
Author

Laitinlok commented Oct 5, 2024

Yes it can build properly on 6.10.11 with the release tarball, for 6.11 you need to use latest git commit. I have tried lkrg.kint_enforce=1, it does not help.

@solardiz
Contributor

solardiz commented Oct 5, 2024

> Yes it can build properly on 6.10.11 with the release tarball, for 6.11 you need to use latest git commit.

Yes, that's as expected.

> I have tried lkrg.kint_enforce=1, it does not help.

How exactly did you try it and how exactly does it not help? Does the kernel still panic? Are you able to capture the relevant kernel messages (as appear in dmesg output) and share them with us here, please? Thank you!

Also, please share the output of uname -mrs (which may tell us a bit more than mere 6.11 - also which arch and build).

@Laitinlok
Author

Laitinlok commented Oct 5, 2024

Through sysctl. I have also isolated the issue: it is related to lkrg.pint_enforce=2.

@Laitinlok
Author

Laitinlok commented Oct 5, 2024

> > Yes it can build properly on 6.10.11 with the release tarball, for 6.11 you need to use latest git commit.
>
> Yes, that's as expected.
>
> > I have tried lkrg.kint_enforce=1, it does not help.
>
> How exactly did you try it and how exactly does it not help? Does the kernel still panic? Are you able to capture the relevant kernel messages (as appear in dmesg output) and share them with us here, please? Thank you!
>
> Also, please share the output of uname -mrs (which may tell us a bit more than mere 6.11 - also which arch and build).

Linux 6.11.0-1-default x86_64

@solardiz
Contributor

solardiz commented Oct 5, 2024

> I have also isolated the issue: it is related to lkrg.pint_enforce=2.

Where does pint_enforce=2 come from on your system? Our default is pint_enforce=1.

Can you please run your system for a while with pint_enforce=1 and capture and send us relevant pieces from dmesg, where it presumably would detect a violation (just enforce it more mildly, so the system should stay up)?

@Laitinlok
Author

I have set it to 2 through sysctl

@Laitinlok
Author

Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 4778, name tracker-extract
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: BLOCK: Task: Killing pid 4778, name tracker-extract
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: BLOCK: Task: Killing pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: BLOCK: Task: Killing pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: BLOCK: Task: Killing pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: BLOCK: Task: Killing pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: BLOCK: Task: Killing pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 4259, name pipewire-pulse
Oct 06 04:31:20 localhost.localdomain kernel: LKRG: ALERT: BLOCK: Task: Killing pid 4259, name pipewire-pulse

This is the warning from LKRG when shutting down.

@solardiz changed the title from "Instability with the latest kernel 6.11" to 'False positive "seccomp filter pointer corruption" on Linux 6.11.0-1-default x86_64 Opensuse tumbleweed' on Oct 5, 2024
@solardiz
Contributor

solardiz commented Oct 5, 2024

Thank you, this helps.

Do I understand correctly that you were previously using "6.10.11 with the release tarball" and it didn't exhibit the issue?

@Laitinlok
Author

Laitinlok commented Oct 5, 2024

> Thank you, this helps.
>
> Do I understand correctly that you were previously using "6.10.11 with the release tarball" and it didn't exhibit the issue?

It started having issues in 6.10.7 I think.

@solardiz
Contributor

solardiz commented Oct 6, 2024

> It started having issues in 6.10.7 I think.

That's puzzling. When issues started, did you upgrade only the kernel or also LKRG? Were those the same issues (the seccomp filter pointer corruption message) or something else?

@solardiz
Contributor

solardiz commented Oct 6, 2024

@Adam-pi3 It sounds like your reasoning in #346 could have been flawed. As seen from code snippets in #338, what changed with 38b3b11 for 5.9+ is that previously we increased refcount for filter->users and filter->refs, and now we do only for filter->refs. Per your comments in #346, none of this should have been needed, and we only do it as defensive programming to reduce impact of a possible misunderstanding from a use-after-free to a resource leak. Yet the impact we see looks like a use-after-free by our own code, so maybe the filter->users increase was somehow required to keep the filter from disappearing/changing under us (if this is indeed a new problem with this change, which isn't entirely clear)?

Anyway, I am really tempted to do what I had suggested earlier - exclude seccomp checks on 5.9+. I think they're also incomplete anyway, checking only the first out of possible multiple filters. Is this OK with you? We haven't seen real-world exploits that would modify only seccomp and not anything else we track, have we? However, we have seen plenty of issues related to LKRG's seccomp tracking support on 5.9+, where we had to use risky hacks to get around Linux's symbol non-export. So I feel this feature has poor balance of benefit vs. risk as currently implemented, and we do not readily have an obviously better idea.

@Strykar

Strykar commented Oct 7, 2024

When I reboot, I see "kernel: LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid" messages for pipewire and mympd.
I am also seeing this on kernel Linux r912 6.11.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 04 Oct 2024 21:51:11 +0000 x86_64 GNU/Linux:

sudo dmesg | grep -i lkrg
[   11.437009] LKRG: ALIVE: Loading LKRG
[   11.598059] LKRG: ISSUE: [kretprobe] register_kretprobe() for <ovl_dentry_is_whiteout> failed! [err=-2]
[   11.598061] LKRG: ISSUE: Can't hook 'ovl_dentry_is_whiteout'. This is expected when OverlayFS is not used
[   11.723268] LKRG: ALIVE: LKRG initialized successfully
[  119.812548] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 5531, name gmain
[  119.812554] LKRG: ALERT: BLOCK: Task: Killing pid 5531, name gmain
[  119.813312] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 5527, name bwrap
[  119.813316] LKRG: ALERT: BLOCK: Task: Killing pid 5527, name bwrap

[   52.657022] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2047, name pool-spawner
[   52.657027] LKRG: ALERT: BLOCK: Task: Killing pid 2047, name pool-spawner
[   52.657031] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2047, name pool-spawner
[   52.657032] LKRG: ALERT: BLOCK: Task: Killing pid 2047, name pool-spawner
[   52.657034] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2047, name pool-spawner
[   52.657035] LKRG: ALERT: BLOCK: Task: Killing pid 2047, name pool-spawner
[   52.658779] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2025, name pipewire
[   52.658783] LKRG: ALERT: BLOCK: Task: Killing pid 2025, name pipewire
[   52.658788] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2025, name pipewire
[   52.658789] LKRG: ALERT: BLOCK: Task: Killing pid 2025, name pipewire
[   52.658791] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2025, name pipewire
[   52.658792] LKRG: ALERT: BLOCK: Task: Killing pid 2025, name pipewire
[   52.658794] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2025, name pipewire
[   52.658795] LKRG: ALERT: BLOCK: Task: Killing pid 2025, name pipewire
[   52.658796] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2025, name pipewire
[   52.658797] LKRG: ALERT: BLOCK: Task: Killing pid 2025, name pipewire
[   52.658799] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 2025, name pipewire
[   52.658800] LKRG: ALERT: BLOCK: Task: Killing pid 2025, name pipewire

[  255.571156] LKRG: ALERT: BLOCK: Task: Killing pid 19675, name [vkrt] Analysis
[  255.571158] LKRG: ALERT: BLOCK: Task: Killing pid 19652, name pool-org.gnome.
[  255.571158] LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 19653, name pool-spawner
[  255.571160] LKRG: ALERT: BLOCK: Task: Killing pid 19653, name pool-org.gnome.
[  255.571162] LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 19680, name nautilus
[  255.571163] LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 19654, name pool-org.gnome.
[  255.571164] LKRG: ALERT: BLOCK: Task: Killing pid 19680, name nautilus

I can't even find a binary named gmain:

plocate gmain
/usr/include/at-spi-2.0/atspi/atspi-gmain.h
/usr/include/glib-2.0/glib/gmain.h
/usr/include/glib-2.0/glib/deprecated/gmain.h
/usr/share/texmf-dist/fonts/source/public/elvish/tengmain.mf
/work/x86_64/airootfs/usr/include/glib-2.0/glib/gmain.h
/work/x86_64/airootfs/usr/include/glib-2.0/glib/deprecated/gmain.h

bwrap appears to be part of bubblewrap, which is required by multiple GNOME packages on Arch Linux:

$ pacwho /usr/bin/bwrap
/usr/bin/bwrap is owned by bubblewrap 0.10.0-1

This is the default config on Arch:

sudo sysctl -a | grep lkrg
lkrg.block_modules = 0
lkrg.heartbeat = 0
lkrg.hide = 0
lkrg.interval = 15
lkrg.kint_enforce = 2
lkrg.kint_validate = 3
lkrg.log_level = 3
lkrg.msr_validate = 0
lkrg.pcfi_enforce = 1
lkrg.pcfi_validate = 2
lkrg.pint_enforce = 1
lkrg.pint_validate = 1
lkrg.profile_enforce = 2
lkrg.profile_validate = 3
lkrg.smap_enforce = 2
lkrg.smap_validate = 1
lkrg.smep_enforce = 2
lkrg.smep_validate = 1
lkrg.trigger = 0
lkrg.umh_enforce = 1
lkrg.umh_validate = 1

Please let me know if I should open a separate issue instead.

@solardiz
Contributor

solardiz commented Oct 7, 2024

Thank you for reporting this @Strykar! Looks like the same issue to me, so let's keep the info in here.

@Adam-pi3 I think we need to look for possible seccomp-related changes between 6.10 and 6.11 to see if we possibly miss tracking some new legitimate seccomp filter pointer updates. This issue appears too frequently for it to be likely a race condition.

@solardiz
Contributor

solardiz commented Oct 7, 2024

> I think we need to look for possible seccomp-related changes between 6.10 and 6.11 to see if we possibly miss tracking some new legitimate seccomp filter pointer updates.

I searched commit messages for mentions of seccomp. Didn't find any new legitimate updates, but found this:

commit bfafe5efa9754ebc991750da0bcca2a6694f3ed3
Author: Andrei Vagin <[email protected]>
Date:   Fri Jun 28 02:10:12 2024 +0000

    seccomp: release task filters when the task exits
    
    Previously, seccomp filters were released in release_task(), which
    required the process to exit and its zombie to be collected. However,
    exited threads/processes can't trigger any seccomp events, making it
    more logical to release filters upon task exits.
    
    This adjustment simplifies scenarios where a parent is tracing its child
    process. The parent process can now handle all events from a seccomp
    listening descriptor and then call wait to collect a child zombie.
    
    seccomp_filter_release takes the siglock to avoid races with
    seccomp_sync_threads. There was an idea to bypass taking the lock by
    checking PF_EXITING, but it can be set without holding siglock if
    threads have SIGNAL_GROUP_EXIT. This means it can happen concurently
    with seccomp_filter_release.
    
    This change also fixes another minor problem. Suppose that a group
    leader installs the new filter without SECCOMP_FILTER_FLAG_TSYNC, exits,
    and becomes a zombie. Without this change, SECCOMP_FILTER_FLAG_TSYNC
    from any other thread can never succeed, seccomp_can_sync_threads() will
    check a zombie leader and is_ancestor() will fail.
    
    Reviewed-by: Oleg Nesterov <[email protected]>
    Signed-off-by: Andrei Vagin <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Tycho Andersen <[email protected]>
    Signed-off-by: Kees Cook <[email protected]>

Maybe this created or exposed (made more likely) a race condition?

@Laitinlok
Author

> Thank you for reporting this @Strykar! Looks like the same issue to me, so let's keep the info in here.
>
> @Adam-pi3 I think we need to look for possible seccomp-related changes between 6.10 and 6.11 to see if we possibly miss tracking some new legitimate seccomp filter pointer updates. This issue appears too frequently for it to be likely a race condition.

Yes I also experienced the same problem in the logs every time with different binaries, seems to be a false positive.

@Laitinlok
Author

Laitinlok commented Oct 10, 2024

Edit by @solardiz: dropped the over-quoting

> I searched commit messages for mentions of seccomp. Didn't find any new legitimate updates, but found this:

https://github.com/openSUSE/kernel/tree/v6.11.2
You might have more luck finding the commit in the distro tree.

@Kirkezz

Kirkezz commented Oct 11, 2024

I got alerts about "seccomp filter pointer corruption" recently too.
https://pastebin.com/qwfaU2MZ
6.10.12-hardened Arch Linux
lkrg 0.9.8-1

@solardiz
Contributor

Thank you @Kirkezz. The mainline commit I found above is also included in 6.10.10+, so your report does not exclude the potential that the issue is related to that commit.

https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.10.10

@Kirkezz

Kirkezz commented Oct 12, 2024

Yes, you are most likely right. I found earlier logs in journalctl with this problem, and the linux version in those logs is 6.10.10.

@Adam-pi3
Collaborator

Adam-pi3 commented Oct 12, 2024

I installed the openSUSE Tumbleweed desktop and server versions as VMware VMs, and neither of them has the issue you are describing:

localhost:~/lkrg # cat /etc/os-release 
NAME="openSUSE Tumbleweed"
# VERSION="20241011"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20241011"
PRETTY_NAME="openSUSE Tumbleweed"
ANSI_COLOR="0;32"
# CPE 2.3 format, boo#1217921
CPE_NAME="cpe:2.3:o:opensuse:tumbleweed:20241011:*:*:*:*:*:*:*"
#CPE 2.2 format
#CPE_NAME="cpe:/o:opensuse:tumbleweed:20241011"
BUG_REPORT_URL="https://bugzilla.opensuse.org"
SUPPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Tumbleweed"
LOGO="distributor-logo-Tumbleweed"
localhost:~/lkrg # uname -a
Linux localhost.localdomain 6.11.2-1-default #1 SMP PREEMPT_DYNAMIC Fri Oct  4 17:37:58 UTC 2024 (38c846e) x86_64 x86_64 x86_64 GNU/Linux
localhost:~/lkrg # 

I assume there is more to this problem than just the kernel version. Did you compile it yourself? Did you need to do anything specific to see the issue? I browsed the internet with Firefox on the desktop VM and didn't see any problem with LKRG:

localhost:~/lkrg # dmesg -T|tail -10
[Sat Oct 12 14:46:38 2024] [  T64502] Freezing user space processes completed (elapsed 0.005 seconds)
[Sat Oct 12 14:46:38 2024] [  T64502] OOM killer disabled.
[Sat Oct 12 14:46:38 2024] [  T64502] LKRG: ISSUE: [kretprobe] register_kretprobe() for <ovl_dentry_is_whiteout> failed! [err=-2]
[Sat Oct 12 14:46:38 2024] [  T64502] LKRG: ISSUE: Can't hook 'ovl_dentry_is_whiteout'. This is expected when OverlayFS is not used.
[Sat Oct 12 14:46:38 2024] [  T64502] LKRG: ALIVE: LKRG initialized successfully
[Sat Oct 12 14:46:38 2024] [  T64502] OOM killer enabled.
[Sat Oct 12 14:46:38 2024] [  T64502] Restarting tasks ... done.
[Sat Oct 12 14:51:13 2024] [  T65282] LKRG: STATE: Enabling 'heartbeat'
[Sat Oct 12 14:51:15 2024] [  T64590] LKRG: ALIVE: System is clean
[Sat Oct 12 14:51:30 2024] [  T64590] LKRG: ALIVE: System is clean
localhost:~/lkrg # 

@Laitinlok
Author

Laitinlok commented Oct 13, 2024

Edit by @solardiz: dropped the over-quoting

> I installed the openSUSE Tumbleweed desktop and server versions as VMware VMs, and neither of them has the issue you are describing

Are you using dkms?

@Adam-pi3
Collaborator

No, I didn't use dkms. I fetched the git repo, compiled it, and loaded LKRG after the system had booted.

@Adam-pi3
Collaborator

@Laitinlok Do you happen to know how I could repro the issue? Do you execute any specific action to cause the issue?

@Laitinlok
Author

> @Laitinlok Do you happen to know how I could repro the issue? Do you execute any specific action to cause the issue?

sudo systemctl enable --now lkrg, restart 2 times.

@Laitinlok
Author

> @Laitinlok Do you happen to know how I could repro the issue? Do you execute any specific action to cause the issue?

Do you have secure boot and trusted boot enabled?

@Adam-pi3
Collaborator

Certainly it doesn't repro on my side. @Laitinlok can you try LKRG under newest SUSE kernel 6.11.2-1-default and check if you see the same issue?

> Do you have secure boot and trusted boot enabled?

I do not (it's under VM emulating BIOS)

@Laitinlok
Author

> Certainly it doesn't repro on my side. @Laitinlok can you try LKRG under newest SUSE kernel 6.11.2-1-default and check if you see the same issue?
>
> > Do you have secure boot and trusted boot enabled?
>
> I do not (it's under VM emulating BIOS)

It has the same issues with the latest kernel.

@solardiz
Contributor

@Adam-pi3 What would your next steps be if you were able to reproduce the issue? Maybe we can jump to those right away.

@Adam-pi3
Collaborator

@Laitinlok can you change log_level to 4? (I would like to see the actual values of the pointers.) You can do it via the CLI:
sysctl lkrg.log_level=4
Additionally, can you also apply this small patch to LKRG?

diff --git a/src/modules/exploit_detection/p_exploit_detection.c b/src/modules/exploit_detection/p_exploit_detection.c
index 69db274..3fc8fa5 100644
--- a/src/modules/exploit_detection/p_exploit_detection.c
+++ b/src/modules/exploit_detection/p_exploit_detection.c
@@ -1245,6 +1245,7 @@ static int p_cmp_creds(struct p_cred *p_orig, const struct cred *p_current_cred,
 
 #define P_CMP_PTR(orig, curr, name) \
    if (orig != curr) { \
+      printk(KERN_CRIT "p_ret[%d] test_task_syscall_work=%d",p_ret,test_task_syscall_work(p_current, SECCOMP)); \
       if (p_opt) { \
          if (P_CTRL(p_log_level) >= P_LOG_WATCH) \
             p_print_log(P_LOG_ALERT, \

@Laitinlok
Author

Sure

@Kirkezz

Kirkezz commented Oct 17, 2024

@Adam-pi3

> compile it and load instead of the one which you have pre-installed from DKMS and check if you see the same problem?

I installed lkrg-dkms-git from AUR (replacing lkrg-dkms with it). The problem still persists in the logs (got one entry this boot: "BLOCK: Task: Killing pid 1424, name HTML5 Parser"). On the previous boot, an old problem that I had almost forgotten about also reappeared: occasionally, after booting, I get “Temporary failure in name resolution” and

dhcpcd[706]: no valid interfaces found
dhcpcd[706]: no interfaces have a carrier

I don't know whether this is related to LKRG or to my recent update of all packages on my system (done to avoid a partial upgrade).

@Laitinlok
Author

Laitinlok commented Oct 17, 2024

Edit by @solardiz: Added triple-backtick quoting.

[  119.987323] [   T5887] p_ret[0] test_task_syscall_work=1
[  119.987328] [   T5887] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 5887, name services.exe
[  119.987335] [   T5887] LKRG: ALERT: BLOCK: Task: Killing pid 5887, name services.exe
[  123.937601] [   T6134] p_ret[0] test_task_syscall_work=1
[  123.937607] [   T6134] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 6134, name services.exe
[  123.937614] [   T6134] LKRG: ALERT: BLOCK: Task: Killing pid 6134, name services.exe
[  143.476774] [   T6433] p_ret[0] test_task_syscall_work=1
[  143.476778] [   T6433] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 6433, name services.exe
[  143.476784] [   T6433] LKRG: ALERT: BLOCK: Task: Killing pid 6433, name services.exe
[  148.636004] [   T6683] p_ret[0] test_task_syscall_work=1
[  148.636009] [   T6683] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 6683, name services.exe
[  148.636015] [   T6683] LKRG: ALERT: BLOCK: Task: Killing pid 6683, name services.exe
[  344.579928] [   T5401] p_ret[0] test_task_syscall_work=1
[  344.579933] [   T5401] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 5401, name TaskCon~ller #6
[  344.579939] [   T5401] LKRG: ALERT: BLOCK: Task: Killing pid 5401, name TaskCon~ller #6
[  466.900735] [   T7484] p_ret[0] test_task_syscall_work=1
[  466.900741] [   T7484] LKRG: ALERT: DETECT: Task: seccomp filter pointer corruption for pid 7484, name services.exe
[  466.900747] [   T7484] LKRG: ALERT: BLOCK: Task: Killing pid 7484, name services.exe

@Adam-pi3
Collaborator

Thanks @Laitinlok, however it looks like log_level was not at least at the WATCH level (number 4). Can you repeat it with log_level=4?

@Kirkezz I have no idea how lkrg-dkms-git works. Can you please get LKRG directly from GitHub via:

git clone https://github.com/lkrg-org/lkrg.git

The problems which you see are not related.

@solardiz
Contributor

@Laitinlok @Kirkezz @Strykar Can you please try the below patch and let us know if it helps? -

+++ b/src/modules/exploit_detection/p_exploit_detection.c
@@ -1414,7 +1414,8 @@ static int p_cmp_tasks(struct p_ed_process *p_orig, struct task_struct *p_curren
          p_ret++;
       }
 
-      P_CMP_PTR(p_orig->p_ed_task.p_sec.sec.filter, p_current->seccomp.filter, "seccomp filter")
+      if (!(p_current->flags & PF_EXITING))
+         P_CMP_PTR(p_orig->p_ed_task.p_sec.sec.filter, p_current->seccomp.filter, "seccomp filter")
 
       p_lkrg_seccomp_filter_put(p_current);
    }

@solardiz
Contributor

@Adam-pi3 I overcame my laziness and looked at the kernel code in proper context. The issue may actually be quite simple. This commit I found moves the call to seccomp_filter_release to be made much earlier, in do_exit:

@@ -832,6 +831,8 @@ void __noreturn do_exit(long code)
        io_uring_files_cancel();
        exit_signals(tsk);  /* sets PF_EXITING */
 
+       seccomp_filter_release(tsk);

and here's what seccomp_filter_release does:

/**
 * seccomp_filter_release - Detach the task from its filter tree,
 *                          drop its reference count, and notify
 *                          about unused filters
 *
 * @tsk: task the filter should be released from.
 *
 * This function should only be called when the task is exiting as
 * it detaches it from its filter tree. PF_EXITING has to be set
 * for the task.
 */
void seccomp_filter_release(struct task_struct *tsk)
{
        struct seccomp_filter *orig;

        if (WARN_ON((tsk->flags & PF_EXITING) == 0))
                return;

        spin_lock_irq(&tsk->sighand->siglock);
        orig = tsk->seccomp.filter;
        /* Detach task from its filter tree. */
        tsk->seccomp.filter = NULL;
        spin_unlock_irq(&tsk->sighand->siglock);
        __seccomp_filter_release(orig);
}

Our incremented refcount probably prevents freeing of the filter in the trailing __seccomp_filter_release call, but it does not prevent tsk->seccomp.filter = NULL; - so the filter is still there but is detached. Yet our code expects the filter field to be still the same pointer we had recorded, not NULL.
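
To make the detach-versus-free distinction concrete, here is a minimal userspace sketch. It is a toy model, not kernel or LKRG code: `toy_filter`, `toy_task`, and all `toy_*` helpers are made-up names that only mimic the behavior described above (an extra reference keeps the object alive, yet the exit path still sets the task's pointer to NULL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct toy_filter {
    int refs;      /* reference count keeping the object alive */
    bool freed;    /* set once the last reference is dropped */
};

struct toy_task {
    bool exiting;                /* models the PF_EXITING flag */
    struct toy_filter *filter;   /* models tsk->seccomp.filter */
};

/* Models dropping one reference to the filter. */
static void toy_filter_put(struct toy_filter *f)
{
    if (f && --f->refs == 0)
        f->freed = true;
}

/* Models the exit-time release: detach the pointer, then drop a reference. */
static void toy_filter_release(struct toy_task *t)
{
    struct toy_filter *orig;

    if (!t->exiting)   /* mirrors the WARN_ON(...) early return */
        return;
    orig = t->filter;
    t->filter = NULL;
    toy_filter_put(orig);
}

/* Models a plain pointer comparison against the recorded value. */
static bool toy_pointer_mismatch(const struct toy_filter *recorded,
                                 const struct toy_task *t)
{
    return recorded != t->filter;
}

/*
 * Scenario: the checker holds an extra reference while the task exits.
 * Returns true if the comparison reports a mismatch although the filter
 * object is still alive.
 */
static bool toy_false_positive_scenario(void)
{
    struct toy_filter f = { .refs = 1, .freed = false };
    struct toy_task t = { .exiting = false, .filter = &f };
    const struct toy_filter *recorded = t.filter;
    bool mismatch, alive;

    f.refs++;                  /* the checker's "get" */
    assert(f.refs == 2);
    t.exiting = true;          /* the task starts exiting */
    toy_filter_release(&t);    /* the pointer is detached early */

    mismatch = toy_pointer_mismatch(recorded, &t);
    alive = !f.freed;

    toy_filter_put(&f);        /* the checker's "put" */
    return mismatch && alive;
}
```

In this model, `toy_false_positive_scenario()` returns true when called from a small test `main`: the recorded pointer no longer matches, even though the object it pointed to has not been freed.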

The trial patch I posted above should prevent the issue when we're validating the current task. This is the only case now possible with your added check of current == p_current up in the code on 5.9+ kernels. I think we may want to duplicate this check in the if I suggest to add, because I think we also had this problem on pre-5.9 kernels - it was just not exposed enough to trigger it often prior to 6.10.10 - and because the PF_EXITING check would be racy when validating a non-current task (in paranoid mode).

In other words, while I suggested a simpler patch for testing here, I actually propose its more elaborate revision (that would unfortunately make paranoid mode less effective on pre-5.9 kernels, albeit not to the extent we already accepted on 5.9+).

+++ b/src/modules/exploit_detection/p_exploit_detection.c
@@ -1414,7 +1414,8 @@ static int p_cmp_tasks(struct p_ed_process *p_orig, struct task_struct *p_curren
          p_ret++;
       }
 
-      P_CMP_PTR(p_orig->p_ed_task.p_sec.sec.filter, p_current->seccomp.filter, "seccomp filter")
+      if (current == p_current && !(p_current->flags & PF_EXITING))
+         P_CMP_PTR(p_orig->p_ed_task.p_sec.sec.filter, p_current->seccomp.filter, "seccomp filter")
 
       p_lkrg_seccomp_filter_put(p_current);
    }
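
The condition added by this patch can be sketched in isolation as a small predicate. This is only an illustrative userspace model under stated assumptions: `chk_task`, `CHK_PF_EXITING`, and the `chk_*` functions are hypothetical names, the `cur` parameter stands in for the kernel's `current`, and the alert is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CHK_PF_EXITING 0x4u   /* hypothetical stand-in for PF_EXITING */

struct chk_task {
    unsigned int flags;    /* models task flags */
    const void *filter;    /* models the seccomp filter pointer */
};

/* Compare only when validating the running task itself and it is not exiting. */
static bool chk_should_compare(const struct chk_task *cur,
                               const struct chk_task *target)
{
    return cur == target && !(target->flags & CHK_PF_EXITING);
}

/* Returns true when an alert would be raised for the target task. */
static bool chk_filter_alert(const struct chk_task *cur,
                             const struct chk_task *target,
                             const void *recorded)
{
    if (!chk_should_compare(cur, target))
        return false;
    return recorded != target->filter;
}

/* Small self-check walking through the cases discussed in the thread. */
static void chk_selfcheck(void)
{
    int filter_obj = 0;
    struct chk_task a = { .flags = 0, .filter = &filter_obj };
    struct chk_task b = { .flags = 0, .filter = NULL };
    const void *recorded = a.filter;

    assert(!chk_filter_alert(&a, &a, recorded));  /* unchanged pointer */
    a.filter = NULL;
    assert(chk_filter_alert(&a, &a, recorded));   /* changed while alive */
    a.flags |= CHK_PF_EXITING;
    assert(!chk_filter_alert(&a, &a, recorded));  /* detached during exit */
    assert(!chk_filter_alert(&b, &a, recorded));  /* non-current target skipped */
}
```

Calling `chk_selfcheck()` from a test `main` exercises the four cases. The last one shows the cost mentioned above: in this model, a task other than the running one is never compared.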

solardiz added a commit to solardiz/lkrg that referenced this issue Oct 19, 2024
The issue may also have been triggered on older kernels, but with
negligible probability.

Fixes lkrg-org#354
@solardiz
Contributor

-      P_CMP_PTR(p_orig->p_ed_task.p_sec.sec.filter, p_current->seccomp.filter, "seccomp filter")
+      if (current == p_current && !(p_current->flags & PF_EXITING))
+         P_CMP_PTR(p_orig->p_ed_task.p_sec.sec.filter, p_current->seccomp.filter, "seccomp filter")

I've just tested this for lack of regressions via our GitHub Actions in my fork of the repo; it passes, except for 3 unrelated test failures in our cross-builds (opened a new issue for those).

@solardiz
Contributor

@Adam-pi3 I looked at the code some more and thought of these problems some more. I don't get why we were doing the get/put filter thing at all. We are only validating the pointer, not filter content, right? Well, get/put does not protect the pointer anyway - it only ensures the actual filter content won't be gone, not that the filter wouldn't get detached from the task. We do validate a few other per-task seccomp things, but they are not part of the filter, right?

If the above is correct, then how about the below changes? -

  1. Drop the get/put stuff.
  2. Move (not just duplicate) your recently added current == p_current check from top level to be near and apply only to the filter pointer check. This will actually increase/restore amount of seccomp validation we perform in paranoid mode.
  3. Not sure about this one - instead of PF_EXITING as above, we could be checking for the pointer becoming NULL - Linux itself does, and we already do in our get/put wrappers (to be dropped). However, we could want to detect unexpected removal of the filter, which is why I preferred to check PF_EXITING. If the attacker could simply NULL the pointer, then why bother checking at all - perhaps a typical exploit not even trying to bypass LKRG specifically would do just that.
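
The trade-off in item 3 can be illustrated with a small userspace sketch that puts the two candidate rules side by side. All names here (`demo_task`, `DEMO_PF_EXITING`, `demo_alert_exiting_rule`, `demo_alert_null_rule`) are hypothetical and exist only for this comparison; neither function is LKRG's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { DEMO_PF_EXITING = 0x4 };   /* hypothetical stand-in for PF_EXITING */

struct demo_task {
    unsigned int flags;    /* models task flags */
    const void *filter;    /* models the seccomp filter pointer */
};

/* Rule A: skip the comparison while the task is exiting. */
static bool demo_alert_exiting_rule(const void *recorded,
                                    const struct demo_task *t)
{
    if (t->flags & DEMO_PF_EXITING)
        return false;
    return recorded != t->filter;
}

/* Rule B: tolerate a pointer that has become NULL. */
static bool demo_alert_null_rule(const void *recorded,
                                 const struct demo_task *t)
{
    if (t->filter == NULL)
        return false;
    return recorded != t->filter;
}

/* Self-check covering a legitimate exit and a simulated tampering case. */
static void demo_selfcheck(void)
{
    int filter_obj = 0;
    struct demo_task exiting = { .flags = DEMO_PF_EXITING, .filter = NULL };
    struct demo_task tampered = { .flags = 0, .filter = NULL };

    /* Legitimate detach on exit: neither rule raises an alert. */
    assert(!demo_alert_exiting_rule(&filter_obj, &exiting));
    assert(!demo_alert_null_rule(&filter_obj, &exiting));

    /* Pointer set to NULL on a live task: only rule A notices. */
    assert(demo_alert_exiting_rule(&filter_obj, &tampered));
    assert(!demo_alert_null_rule(&filter_obj, &tampered));
}
```

In this model both rules stay quiet for a task that is exiting, but only the flag-based rule still reports a filter pointer that was set to NULL on a live task, which is the reason given above for preferring the PF_EXITING check.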

Since we need to make a release soon, maybe let's use my proposed patch above for now, but try 1 and 2 in our development tree afterwards.

My guess is you were planning to add validation of the filter itself, which is why you added these get/put - years ago, but we never proceeded to add such validation. If so, we'd need to revisit/re-add these if and when we're ready to add filter validation. Perhaps along with also recognizing and validating potential multiple filters per task. For now, though, we have incomplete functionality that is better dropped.

@solardiz
Contributor

@Adam-pi3 Further, get_seccomp_filter doesn't return whether it succeeded at all. If the pointer was already NULL, it does nothing. It also does not (and cannot?) check whether the filter was being removed or even freed just as it runs, so it makes perfect sense there's no return value. I guess it was meant to be used only in contexts where the filter couldn't be concurrently freed anyway, to ensure the filter still can't be freed after such context is exited, and until a put.

Our usage looks different: we're doing the get/put for a moment either to dump or validate the filter (if we were to add its real validation, beyond the pointer). If the filter can't be concurrently freed at the time of get_seccomp_filter, we have no problem for our entire block anyway, even without a get/put. If the filter can be concurrently freed at the time of get_seccomp_filter, we have a problem despite our get/put.

So what we currently do looks like nonsense to me now, not only for the functionality that we have, but also for further extension.
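A toy user-space model of the point above, namely that holding a reference keeps the filter's content alive but does nothing for the pointer stored in the task. All names here (`toy_filter`, `toy_owner`, `toy_get`, `toy_put`, `toy_detach`) are hypothetical, the counter is not atomic, and nothing is concurrent; this is a sketch of the reasoning, not of the kernel's refcounting.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a seccomp filter. */
struct toy_filter {
    int refs;   /* plain counter; the kernel uses atomic refcounts */
    int prog;   /* stands in for the filter's content */
};

/* Hypothetical owner standing in for the task that points at the filter. */
struct toy_owner {
    struct toy_filter *filter;
};

/* Allocate a filter holding one reference (the owner's). */
struct toy_filter *toy_filter_new(int prog)
{
    struct toy_filter *f = malloc(sizeof(*f));
    if (f) {
        f->refs = 1;
        f->prog = prog;
    }
    return f;
}

/* Like the get described above: no return value, no-op on NULL. */
void toy_get(struct toy_owner *o)
{
    if (o->filter)
        o->filter->refs++;
}

/* Drop one reference; returns 1 if this call freed the object. */
int toy_put(struct toy_filter *f)
{
    if (f && --f->refs == 0) {
        free(f);
        return 1;
    }
    return 0;
}

/* Owner detaches its filter and drops its own reference. */
int toy_detach(struct toy_owner *o)
{
    struct toy_filter *f = o->filter;
    o->filter = NULL;
    return toy_put(f);
}
```

If a validator saves `o.filter`, calls `toy_get(&o)`, and the owner then calls `toy_detach(&o)`, the object survives (its `prog` is still readable) until the validator's `toy_put`, yet `o.filter` no longer equals the saved pointer. A pointer-only check is therefore unaffected by the extra reference.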

solardiz added a commit to solardiz/lkrg that referenced this issue Oct 19, 2024
@solardiz
Contributor

I've just implemented my proposed simplification in my fork of the repo.

@Laitinlok @Kirkezz @Strykar Can you please test https://github.com/solardiz/lkrg as of commit 3bdf5c8 and let us know how it works for you?

commit 3bdf5c84081f65a5b8dedb11d68ea19af88811b7 (HEAD -> main, origin/main, origin/HEAD)
Author: Solar Designer <[email protected]>
Date:   Sat Oct 19 20:07:53 2024 +0200

    Simplify seccomp validation
    
    See #354

commit 3e6abdd4662020f414e80e93c315cd2b6125dc9a
Author: Solar Designer <[email protected]>
Date:   Sat Oct 19 04:57:52 2024 +0200

    Fix false positive "seccomp filter pointer corruption" on 6.10.10+
    
    The issue may also have been triggered on older kernels, but with
    negligible probability.
    
    Fixes #354

@Adam-pi3
Collaborator

@solardiz I think it boils down to the discussion which we had here:
#346 (comment)

I still think we may want to keep references. And yes, we wanted to add validation of the filter itself.

Btw, let's wait for @Laitinlok and others to confirm whether your patches fix the issue.

@solardiz
Contributor

I think it boils down to the discussion which we had here: #346 (comment)

Yes, but in that discussion neither of us appeared to realize we're not actually accessing the filter.

I still think we may want to keep references.

Why would we? They're references on the filter, which we never access. They do not affect the pointer, which we do access. And we acquire them in a way that would either be unneeded/redundant or unreliable if/when we add filter validation. I'd say that was useless and misleading code that we had, which also caused us portability problems for no reason. Let's drop it for good.

Let's wait for @Laitinlok and others to confirm whether your patches fix the issue

Yes. I hope we'll hear from them soon.

@Kirkezz

Kirkezz commented Oct 20, 2024

@Laitinlok

Can you please test https://github.com/solardiz/lkrg as of commit 3bdf5c8 and let us know how it works for you?

Everything seems to be working fine. I am no longer getting any warnings in the logs. Installed lkrg from your commit and rebooted twice.

@Laitinlok
Author

Yes it is working fine with this commit.

@solardiz
Contributor

Thank you for testing the fix @Kirkezz and @Laitinlok!
@Strykar Would you also test the above fix, please?

@Strykar

Strykar commented Oct 21, 2024

Thank you for testing the fix @Kirkezz and @Laitinlok! @Strykar Would you also test the above fix, please?

I just built and loaded it, and opened a few programs, no issues so far.
dmesg looks good!

@Strykar

Strykar commented Oct 24, 2024

@solardiz Just FYI:
Calling Nautilus from Firefox closed the file browser window three times before it let me select a file. Unfortunately I could not grok the logs at the time to see if it was LKRG related.

Here is what LKRG looks like after running for a day on a daily driver desktop with this patch applied:

~  sudo journalctl -b -1 | grep LKRG
Oct 23 19:39:29 r912 kernel: LKRG: ALIVE: Loading LKRG
Oct 23 19:39:29 r912 kernel: LKRG: ISSUE: [kretprobe] register_kretprobe() for <ovl_dentry_is_whiteout> failed! [err=-2]
Oct 23 19:39:29 r912 kernel: LKRG: ISSUE: Can't hook 'ovl_dentry_is_whiteout'. This is expected when OverlayFS is not used.
Oct 23 19:39:29 r912 kernel: LKRG: ALIVE: LKRG initialized successfully
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 484646, name pool-spawner
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: BLOCK: Task: Killing pid 484646, name pool-org.gnome.
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 484645, name pool-spawner
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 484648, name pool-spawner
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: BLOCK: Task: Killing pid 484648, name pool-org.gnome.
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: BLOCK: Task: Killing pid 484645, name pool-org.gnome.
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 484646, name pool-spawner
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 484645, name pool-spawner
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: BLOCK: Task: Killing pid 484646, name pool-org.gnome.
Oct 24 09:38:51 r912 kernel: LKRG: ALERT: BLOCK: Task: Killing pid 484645, name pool-org.gnome.
Oct 24 09:38:53 r912 kernel: LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 484789, name pool-spawner
Oct 24 09:38:53 r912 kernel: LKRG: ALERT: BLOCK: Task: Killing pid 484789, name pool-org.gnome.
Oct 24 09:38:53 r912 kernel: LKRG: ALERT: DETECT: Task: 'off' flag corruption for pid 484789, name pool-spawner
Oct 24 09:38:53 r912 kernel: LKRG: ALERT: BLOCK: Task: Killing pid 484789, name pool-org.gnome.

@solardiz
Contributor

@Strykar That's a different issue now. Please add your comment to #329. Thank you!

@solardiz
Contributor

@Adam-pi3 We got the get/put_seccomp_filter added in fcf7209 (in 2019). That same commit also added a bunch of get/put_task_struct and get/put_cred. Those also make little sense to me now, for much the same reasons - at get time we rely on the struct still being around anyway, and then we proceed to use it just briefly without doing anything that would alter the initial assumption that the struct is still around. If the assumption was correct, get/put are unneeded. If it was not (struct already not around or could be gone from under us given those circumstances), we have a problem even with get/put (because the struct could as well disappear when we try to get, and we wouldn't know). We may want to clean them up as well.
