doesn't kill cgroup, unable to set xattr trusted.oomd_ooms=1 #122

Open
nartes opened this issue Mar 6, 2020 · 5 comments

@nartes

nartes commented Mar 6, 2020

Description: oomd picks a cgroup to kill, but then fails to kill it.

Package: https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=oomd

Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/util/Fs.cpp:576] Unable to set xattr trusted.oomd_ooms=1 on /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/gnome-launched-firefox-11870.scope. errno=30
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/plugins/BaseKillPlugin.cpp:96] Trying to kill /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/gnome-launched-firefox-11870.scope
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/plugins/KillMemoryGrowth-inl.h:168] Picked "user.slice/user-1000.slice/user@1000.service/gnome-launched-firefox-11870.scope" (2040MB) based on size > 10% of total 6989MB (size threshold overridden)
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/util/Fs.cpp:576] Unable to set xattr trusted.oomd_kill=0 on /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service. errno=30
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/plugins/BaseKillPlugin.cpp:141] Killed 0: 1377(ssh-agent)[E1] 1401(tmux: server)[E1] 1402(zsh)[E1] 1427(zsh)[E1] 1454(vim)[E1] 1455(zsh)[E1] 1485(htop)[E1] 1496(zsh)[E1] 1521(zsh)[E1] 46339(zsh)[E1] 4>
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/util/Fs.cpp:576] Unable to set xattr trusted.oomd_ooms=1 on /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service. errno=30
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/plugins/BaseKillPlugin.cpp:96] Trying to kill /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/plugins/KillMemoryGrowth-inl.h:168] Picked "user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service" (2370MB) based on size > 10% of total 6989MB (size threshold overridden)
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/OomdContext.cpp:163]   io_cost_cumulative=0 io_cost_rate=0
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/OomdContext.cpp:156]   mem=8MB mem_avg=7MB mem_low=0MB mem_min=0MB mem_prot=0MB anon=6MB swap_usage=0MB
Mar 06 17:53:21 MACHINE_NAME oomd[69346]: [../src/oomd/OomdContext.cpp:151]   pressure=0:0:0-0:0:0
@danobi
Contributor

danobi commented Mar 9, 2020

In

1377(ssh-agent)[E1] 1401(tmux: server)[E1] 1402(zsh)[E1]

E1 means kill(<pid>, SIGKILL) failed with errno 1 (EPERM). Is oomd running with the right permissions?
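As a rough aid for reading those entries, here is a minimal Python sketch that splits a "Killed" log line into PID, command name, and errno name. It assumes the `pid(command)[E<errno>]` shape seen in the quoted line; `parse_kill_failures` is a hypothetical helper, not part of oomd, and command names that contain `)` would not parse correctly.

```python
import errno
import re

# Matches entries such as "1401(tmux: server)[E1]": pid, command name, errno.
ENTRY = re.compile(r"(\d+)\(([^)]*)\)\[E(\d+)\]")


def parse_kill_failures(line):
    """Return (pid, command, errno_name) for every failed kill in a log line."""
    failures = []
    for pid, comm, code in ENTRY.findall(line):
        # errno.errorcode maps numbers to symbolic names, e.g. 1 -> "EPERM".
        name = errno.errorcode.get(int(code), "UNKNOWN")
        failures.append((int(pid), comm, name))
    return failures


sample = "1377(ssh-agent)[E1] 1401(tmux: server)[E1] 1402(zsh)[E1]"
for pid, comm, name in parse_kill_failures(sample):
    print(pid, comm, name)
```

Every entry in the sample decodes to EPERM, which matches the reading above.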

In

Unable to set xattr trusted.oomd_kill=0 on /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/gnome-terminal-server.service. errno=30

errno=30 means setxattr failed with EROFS (read-only filesystem).

Are you using a hybrid cgroup1 + cgroup2 setup? It may be unrelated, but it would be good to know.
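One way to answer both questions (hybrid or not, read-only or not) is to look at the cgroup entries in /proc/mounts. The sketch below is an illustration only: `cgroup_mounts` and `describe` are hypothetical helpers, and `SAMPLE` is made-up mount data, not output from the reporter's machine.

```python
def cgroup_mounts(mounts_text):
    """Return (mount_point, fs_type, read_only) for each cgroup mount."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # /proc/mounts columns: device, mount point, fs type, options, ...
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type in ("cgroup", "cgroup2"):
            found.append((mount_point, fs_type, "ro" in options.split(",")))
    return found


def describe(mounts):
    """Classify the layout from the set of cgroup filesystem types."""
    types = {fs_type for _, fs_type, _ in mounts}
    if types == {"cgroup2"}:
        return "unified (cgroup2 only)"
    if types == {"cgroup"}:
        return "legacy (cgroup1 only)"
    if types:
        return "hybrid (cgroup1 + cgroup2)"
    return "no cgroup mounts found"


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """\
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/memory cgroup ro,nosuid,nodev,noexec,relatime,memory 0 0
tmpfs /tmp tmpfs rw 0 0
"""

mounts = cgroup_mounts(SAMPLE)
print(describe(mounts))
for mount_point, fs_type, read_only in mounts:
    print(mount_point, fs_type, "ro" if read_only else "rw")
```

On a real system, pass `open("/proc/mounts").read()` instead of `SAMPLE`. An `ro` flag on the mount that contains the target cgroup would be consistent with the EROFS error.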

@nartes
Author

nartes commented Mar 9, 2020

@danobi perhaps it is an issue with the cgroup permissions setup.
Could you suggest some bash commands for debugging the kill procedure?
I haven't read the source yet, but I thought about just hacking it with system('kill -9 %d', process_pid) in Fs.cpp instead of the cryptic trusted.oomd_kill = 0 attributes.
What is this attribute? Is it related to a Facebook-contributed kernel module?
I didn't find any documentation on the kill procedure used by oomd.
It is puzzling me at the moment.

P.S.

  1. cgroups configuration on Arch Linux https://aur.archlinux.org/cgit/aur.git/commit/?h=oomd&id=3a6dcdb577bfa3c874894889315f0c940174bf73
  2. Some kernel parameters in the PKGBUILD https://aur.archlinux.org/cgit/aur.git/commit/?h=oomd&id=3a6dcdb577bfa3c874894889315f0c940174bf73

P.P.S.

systemctl status oomd
● oomd.service - userspace out-of-memory killer
     Loaded: loaded (/usr/lib/systemd/system/oomd.service; enabled; vendor preset: disabled)
     Active: active (running) since Sun 2020-03-08 22:35:50 +03; 24h ago
    Process: 584 ExecStartPre=/usr/bin/oomd --check-config ${OOMD_CONFIG} (code=exited, status=0/SUCCESS)
   Main PID: 594 (oomd)
      Tasks: 3 (limit: 9336)
     Memory: 2.6M (low: 64.0M)
        CPU: 9min 34.582s
     CGroup: /system.slice/oomd.service
             └─594 /usr/bin/oomd --config /etc/oomd.json --interval 5

Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/OomdContext.cpp:156]   mem=11MB mem_avg=11MB mem_low=0MB mem_min=0MB mem_prot=0MB anon=6MB swap_usage=0MB
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/OomdContext.cpp:163]   io_cost_cumulative=0 io_cost_rate=0
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/OomdContext.cpp:150] name=user.slice/user-1000.slice/user@1000.service/gsd-media-keys.service
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/OomdContext.cpp:151]   pressure=0:0:0-0:0:0
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/OomdContext.cpp:156]   mem=8MB mem_avg=8MB mem_low=0MB mem_min=0MB mem_prot=0MB anon=6MB swap_usage=0MB
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/OomdContext.cpp:163]   io_cost_cumulative=0 io_cost_rate=0
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/plugins/KillMemoryGrowth-inl.h:168] Picked "user.slice/user-1000.slice/user@1000.service/gnome-launched-firefox-29108.scope" (2519MB) based on size > 10% of tot>
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/plugins/BaseKillPlugin.cpp:92] OOMD: In dry-run mode; would have tried to kill /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/gnome-launched-firefo>
Mar 09 22:38:21 MACHINE_NAME oomd[594]: [../src/oomd/Log.cpp:114] 0.00 0.00 0.00 user.slice/user-1000.slice/user@1000.service/gnome-launched-firefox-29108.scope 2641997824 ruleset:[user session protection] detecto>
Mar 09 22:38:22 MACHINE_NAME oomd[594]: [../src/oomd/engine/Ruleset.cpp:134] Action=kill_by_memory_size_or_growth returned STOP. Terminating action chain.

P.P.P.S.
The /usr/bin/oomd process runs as the root user.

P.P.P.P.S.

yay -Qs cgroup
local/libcgroup 0.41-2
    Library that abstracts the control group file system in Linux

@danobi
Contributor

danobi commented Mar 9, 2020

This is the kill code: https://github.com/facebookincubator/oomd/blob/master/src/oomd/plugins/BaseKillPlugin.cpp#L138
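For readers who want the general idea without opening the C++, here is a rough Python sketch of "send SIGKILL to every PID in a cgroup". It is not a translation of the linked file: it assumes PIDs are read from the cgroup's `cgroup.procs` file, it does not descend into child cgroups, and `read_cgroup_pids` and `kill_cgroup` are hypothetical names. It defaults to a dry run so that nothing is killed by accident.

```python
import errno
import os
import signal


def read_cgroup_pids(cgroup_dir):
    """Read the PIDs listed in <cgroup_dir>/cgroup.procs (one per line)."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(line) for line in f if line.strip()]


def kill_cgroup(cgroup_dir, dry=True):
    """SIGKILL every PID in the cgroup.

    Returns {pid: result}, where result is "dry-run", None on success,
    or the errno name (for example "EPERM") when kill(2) failed.
    """
    results = {}
    for pid in read_cgroup_pids(cgroup_dir):
        if dry:
            results[pid] = "dry-run"
            continue
        try:
            os.kill(pid, signal.SIGKILL)
            results[pid] = None
        except OSError as e:
            results[pid] = errno.errorcode.get(e.errno, str(e.errno))
    return results
```

Calling `kill_cgroup(path)` on one of the cgroup directories from the log only lists what would be killed. With `dry=False`, a result of "EPERM" for every PID would correspond to the `[E1]` entries in the original report.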

cryptic trusted.oomd_kill = 0 attributes. What is this attribute, is it related to a facebook contributed kernel module?

It's an extended attribute; see man 7 xattr for more details. oomd sets it so that delegated cgroup subtrees can tell when a kill was performed.
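To check outside of oomd whether extended attributes can be set on a given path, a small probe like the following may help. It is a sketch under assumptions: it relies on Python's Linux-only `os.setxattr`, `try_set_xattr` is a hypothetical helper, and the attribute name `trusted.xattr_probe` in the usage note is invented so that the probe does not overwrite oomd's own attributes.

```python
import errno
import os


def try_set_xattr(path, name, value):
    """Try to set an extended attribute.

    Returns None on success, or the errno name (for example "EROFS",
    "EPERM", "ENOENT") when setxattr(2) fails.
    """
    try:
        os.setxattr(path, name, value)  # Linux-only wrapper around setxattr(2)
        return None
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
```

Run as root, something like `try_set_xattr("/sys/fs/cgroup/user.slice", "trusted.xattr_probe", b"1")` returning "EROFS" would reproduce the failure from the log without involving oomd. Note that a successful call leaves the probe attribute on the path.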

Your systemctl status oomd output shows dry-run mode enabled. With dry-run mode enabled for the plugin, the log messages in your first report could not have been printed. Are you sure both reports come from the same setup?

@danobi
Contributor

danobi commented Mar 9, 2020

Can you also share the oomd config you're using?

@nartes
Author

nartes commented Mar 10, 2020

It is in dry-run mode now, but the problem above was reported with dry-run mode off. I turned dry-run mode on to debug the kill selector.

//
// Basic configuration for a desktop linux machine
//

{
    "rulesets": [
        {
            "name": "user session protection",
            "detectors": [
                [
                    "user pressure above 60 for 30s",
                    {
                        "name": "memory_above",
                        "args": {
                            "cgroup": "user.slice",
                            "threshold": "80%",
                            "duration": "1"
                        }
                    }
                ]
            ],
            "actions": [
                {
                    "name": "kill_by_memory_size_or_growth",
                    "args": {
                        "cgroup": "user.slice/user-*.slice/user@*.service/*",
                        "size_threshold": 10,
                        "post_action_delay": 1,
                        "dry": true
                    }
                }
            ]
        }
    ]
}
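Since the dry-run question came up earlier in the thread, here is a small Python sketch that lists which actions in a config like the one above have "dry" set to true. It assumes comments appear only as whole lines starting with `//` and that the flag is a JSON boolean; `load_config` and `dry_actions` are hypothetical helpers, and `SAMPLE` is a trimmed copy of the config above.

```python
import json


def load_config(text):
    """Parse oomd-style JSON after dropping whole-line // comments."""
    kept = [ln for ln in text.splitlines() if not ln.lstrip().startswith("//")]
    return json.loads("\n".join(kept))


def dry_actions(config):
    """Return (ruleset name, action name) for every action with "dry": true."""
    hits = []
    for ruleset in config.get("rulesets", []):
        for action in ruleset.get("actions", []):
            if action.get("args", {}).get("dry") is True:
                hits.append((ruleset.get("name"), action.get("name")))
    return hits


# Trimmed copy of the config above (detectors omitted for brevity).
SAMPLE = """\
//
// Basic configuration for a desktop linux machine
//
{
    "rulesets": [
        {
            "name": "user session protection",
            "actions": [
                {
                    "name": "kill_by_memory_size_or_growth",
                    "args": {
                        "cgroup": "user.slice/user-*.slice/user@*.service/*",
                        "size_threshold": 10,
                        "post_action_delay": 1,
                        "dry": true
                    }
                }
            ]
        }
    ]
}
"""

for ruleset_name, action_name in dry_actions(load_config(SAMPLE)):
    print(f"dry-run: {ruleset_name} / {action_name}")
```

An empty result would mean no action is in dry-run mode, which is the state the original report should be reproduced in.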
