Out of memory after 3 months of use #32

Open
kit-ty-kate opened this issue Aug 16, 2024 · 5 comments

Comments

@kit-ty-kate

I deployed a unipi instance in early May. A few days ago I tried to access it, realised the service was down, and found this error message in the log:

[...]
2024-07-31T04:43:27-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:43:27-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:43:28-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:43:28-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:43:28-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:43:28-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:43:28-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
2024-07-31T04:44:03-00:00: [INFO] [tcp.segment] Max retransmits reached for connection - terminating
Fatal error: out of memory
Aborted
Solo5: solo5_abort() called
Solo5: Halted

Info:

@hannesm
Collaborator

hannesm commented Aug 16, 2024

Thanks for your report. You don't have any detailed GC statistics, or do you?

My suspicion is that it is mainly the (old) TCP stack that leaks memory in certain scenarios (with real-world Internet traffic). Unfortunately the new TCP stack still has some other issues and thus is not ready for prime time yet.

An interesting data point would be how often you updated the data in the unikernel (via the /hook url, doing a git pull on the data repository) -- if at all. (The reason behind that question is to exclude the git-client from the considerations of memory usage.)

What I can say: thanks for testing, sorry that it behaves for you in that way, and we're working hard to get towards a more performant and less leaky stack. :)

@kit-ty-kate
Author

> Thanks for your report. You don't have any detailed GC statistics, or do you?

Sadly not. Is there a way to get more information when this type of critical error happens? Something like a gc-verbose=true option, so that we have more information if it happens again.

> My suspicion is that it is mainly the (old) TCP stack that leaks memory in certain scenarios (with real-world Internet traffic). Unfortunately the new TCP stack still has some other issues and thus is not ready for prime time yet.

No worries. Whenever you have an alpha version that is ready to test, I'll be happy to give the new stack a try.

> An interesting data point would be how often you updated the data in the unikernel (via the /hook url, doing a git pull on the data repository) -- if at all. (The reason behind that question is to exclude the git-client from the considerations of memory usage.)

Unless some robot used /hook over and over, I've only used it once, on July 7. To rule this out, it might be useful to have a password field on the /hook page so that it isn't triggered by some random robot, whether by accident or malice.

> What I can say: thanks for testing, sorry that it behaves for you in that way, and we're working hard to get towards a more performant and less leaky stack. :)

No problem at all, thanks for creating unipi!

@kit-ty-kate
Author

> Unless some robot used /hook over and over, I've only used it once, on July 7. To rule this out, it might be useful to have a password field on the /hook page so that it isn't triggered by some random robot, whether by accident or malice.

Actually, replying to myself here: I can simply set --hook=<some random password>. I'll do that on reboot.

@reynir
Contributor

reynir commented Aug 19, 2024

> > Thanks for your report. You don't have any detailed GC statistics, or do you?
>
> Sadly not. Is there a way to get more information when this type of critical error happens? Something like a gc-verbose=true option, so that we have more information if it happens again.

There is the --enable-monitoring compile-time flag. With that you can send metrics, including GC metrics, to Influx. I realize this requires more setup on your part, which may not be desirable for you.
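
For a rough idea of the kind of GC counters involved, here is a minimal sketch that uses only the `Gc` module from the OCaml standard library. `gc_summary` is a hypothetical helper, not something unipi or the monitoring setup provides, and wiring it into the unikernel (for example, logging it periodically) is left out:

```ocaml
(* Hypothetical helper, not part of unipi: format a few counters
   returned by the standard library's Gc.quick_stat. *)
let gc_summary () =
  let s = Gc.quick_stat () in
  Printf.sprintf
    "heap_words=%d top_heap_words=%d major_collections=%d compactions=%d"
    s.Gc.heap_words s.Gc.top_heap_words s.Gc.major_collections
    s.Gc.compactions

(* Print the summary once; a long-running service would call
   gc_summary periodically instead. *)
let () = print_endline (gc_summary ())
```

`Gc.quick_stat` is cheap because it does not walk the heap; `Gc.stat` additionally reports live words, at the cost of a full heap traversal.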

@hannesm
Collaborator

hannesm commented Aug 19, 2024

As far as I know, the metrics won't suffice. Something like memtrace would be more useful -- but then we bite into the Cstruct.t apple, and memtrace isn't too useful for these bigarray allocations.
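
To illustrate why bigarray-backed buffers are awkward to account for, here is a small sketch using only the standard library. It assumes OCaml 4.07 or later (where `Bigarray` is part of the standard library), and `heap_bytes` and `measure` are hypothetical names made up for this example. The data of a bigarray is allocated outside the OCaml heap, so the heap size reported by `Gc` should grow by far less than the buffer size:

```ocaml
(* Size of the OCaml major heap in bytes, as reported by the Gc module. *)
let heap_bytes () =
  (Gc.quick_stat ()).Gc.heap_words * (Sys.word_size / 8)

(* Hypothetical helper: allocate a bigarray of [size] bytes and return
   its length together with the growth of the OCaml heap around it. *)
let measure size =
  let before = heap_bytes () in
  let buf = Bigarray.Array1.create Bigarray.char Bigarray.c_layout size in
  (* Touch the whole buffer so the memory is really in use. *)
  Bigarray.Array1.fill buf 'x';
  let after = heap_bytes () in
  (Bigarray.Array1.dim buf, after - before)

let () =
  let dim, growth = measure (16 * 1024 * 1024) in
  Printf.printf "bigarray bytes: %d, OCaml heap growth: %d bytes\n" dim growth
```

This only shows that heap-level counters do not reflect such buffers; it says nothing about where a leak in the TCP stack might actually be.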

Bottom line: I think it would be great to get the new TCP stack out of the door and spend more time on improving it.

3 participants