Large lingering mmaps that apparently could be paged out seem to instead cause large memory crashes #318
Comments
earlyoom did what it is designed to do. You can adjust its settings. |
I'd also recommend increasing swap space, maybe to 3-6 GB, and perhaps configuring swap on zram. |
Okay, so how should I configure it to avoid this? I can't make it trigger later, because then the device will lock up. My apologies if I'm missing something. From what I gathered from the previous conversations, this doesn't seem to be related to the chosen trigger point. It seems to be a problem with earlyoom 1. being triggered by tons of mapped file pages that aren't actively used (rather than by genuinely used pages of allocated memory), and 2. not trying to unmap file pages first before stopping a program (if that's even something earlyoom could do). So I'm not sure what I should change in my configuration to avoid running into this; it sounded to me like something that can't be handled in the config. Then again, I really wouldn't know, since I don't know that much about mapping files into memory. |
What happens when earlyoom is disabled? |
"aren't actually actively used" != free |
It seems the kernel does do that: swap free is 24.95%
The Nheko dev said the kernel is apparently meant to swap them out on its own, so earlyoom just shouldn't have triggered, if I understood correctly. I couldn't tell you who is right 😂 The problem seems to be triggered by LMDB <https://www.symas.com/lmdb> mapping a ton of things when writing all over the database and then not really unmapping them when they are no longer needed, and apparently just ignoring that would lead to correct behavior...? I don't understand it myself. What happens instead is that earlyoom triggers and things start going down. |
Can you run
pmap -x PID
on the PID of the huge chat process?
|
If earlyoom does not act, then in case of further leakage SwapFree and MemAvailable will drop close to 0, and the kernel OOM killer will be triggered. Try simply increasing the swap space and lowering the earlyoom thresholds (for example to 4-6%). |
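To make the threshold suggestion concrete, here is a rough Python sketch of the kind of check being discussed. It assumes the trigger condition is "MemAvailable percent and SwapFree percent are both at or below their thresholds", which is my reading of the earlyoom documentation and not a copy of its source. The function names and the sample numbers are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kiB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def below_thresholds(meminfo, mem_pct=10.0, swap_pct=10.0):
    """Return True if BOTH available memory and free swap are at or below
    the given percentages (assumed trigger rule, see note above)."""
    mem_avail_pct = 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]
    swap_total = meminfo.get("SwapTotal", 0)
    # Assumption: with no swap configured, free swap counts as 0 %.
    if swap_total:
        swap_free_pct = 100.0 * meminfo.get("SwapFree", 0) / swap_total
    else:
        swap_free_pct = 0.0
    return mem_avail_pct <= mem_pct and swap_free_pct <= swap_pct


# Invented sample: 7.5 % memory available, 5 % swap free.
SAMPLE = """MemTotal:        2000000 kB
MemAvailable:     150000 kB
SwapTotal:       1000000 kB
SwapFree:          50000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(below_thresholds(info, 10.0, 10.0))  # 10 % / 10 % thresholds
    print(below_thresholds(info, 5.0, 4.0))    # lowered thresholds
```

With the invented sample, the 10%/10% thresholds are already crossed while the lowered ones are not, which is the effect the suggestion above is after. On a real system you would feed it the contents of /proc/meminfo instead of SAMPLE.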
I've been trying to recreate the exact situation to use |
Looks like VmRSS is in fact very strange with mmap mappings present. Look at this: htop-dev/htop#924 . Bizarre! However, earlyoom does not use VmRSS for decisions (except if you use --sort-by-rss). It uses I wonder if the mmap strangeness also affects |
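For anyone who wants to look at this on their own machine: on reasonably recent Linux kernels, /proc/PID/status splits VmRSS into RssAnon, RssFile and RssShmem, and resident pages of file-backed mmaps are counted under RssFile. The Python sketch below just extracts those fields. The helper names and the sample values are hypothetical, and this is not how earlyoom itself picks a process.

```python
def rss_breakdown(status_text):
    """Extract VmRSS and its components (in kiB) from /proc/PID/status text."""
    wanted = ("VmRSS", "RssAnon", "RssFile", "RssShmem")
    out = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            out[key] = int(rest.split()[0])
    return out


def file_backed_fraction(breakdown):
    """Fraction of the resident set that is file-backed (0.0 if VmRSS is 0)."""
    total = breakdown.get("VmRSS", 0)
    return breakdown.get("RssFile", 0) / total if total else 0.0


def read_rss_breakdown(pid="self"):
    """Read the breakdown for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return rss_breakdown(f.read())


# Invented sample resembling a process dominated by file-backed mappings.
SAMPLE_STATUS = (
    "Name:\texample\n"
    "VmRSS:\t 1800000 kB\n"
    "RssAnon:\t  200000 kB\n"
    "RssFile:\t 1600000 kB\n"
    "RssShmem:\t       0 kB\n"
)

if __name__ == "__main__":
    b = rss_breakdown(SAMPLE_STATUS)
    print(b, round(file_backed_fraction(b), 2))
```

Calling `read_rss_breakdown(PID)` on the chat client while it is large would show whether most of its resident set is RssFile (mapped files) or RssAnon (ordinary allocations).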
So I wrote a toy program to find out how this works ( c759f1b ). Findings:
I agree. Looks like:
|
Reopening the ticket. It's now for the bug of "killing the wrong process, just because it had mmaps". |
I think |
I'm using a chat client called nheko, which uses a lot of memory during the initial login but usually less during normal use afterwards. Yet, many minutes after one of those logins, I got this out-of-memory "crash":
I asked the developer of Nheko what happened, and whether this is a Nheko memory leak, especially since after restarting it used less memory again: only around 200 MB, which is about 10% of what it used when it was killed.
The nheko developer suggested the issue was a combination of three things: 1. nheko memory-maps a large number of areas but mostly stops using them after the login phase, 2. those memory pages could then be paged out, but the kernel doesn't seem to do that eagerly unless memory is tighter than the earlyoom trigger point, 3. earlyoom, instead of doing something to get those pages paged out, seemingly needlessly terminates nheko.
(I hope I summed all of that up correctly, my apologies if not.)
Is this fixable in earlyoom? This seems like quite a fundamental issue that will cause unnecessary crashes for apps that use memory mapping extensively.
Affected earlyoom version: earlyoom v1.7, as packaged by postmarketOS 23.12