
Memory consumption with long-lived Procrastinate processes and adding max_tasks_per_child #1262

Open
paulzakin opened this issue Jan 2, 2025 · 3 comments

Comments

@paulzakin
Contributor

Hello @ewjoachim and @medihack!

Hope you guys had a good holiday season :)

I noticed something with Procrastinate and long-lived processes. We have three Procrastinate processes that have been alive for ~10 days (and working flawlessly) on our Ubuntu server. However, their memory consumption is increasing by ~20 MB per day (on a server with 16 GB of RAM). I used atop to confirm that it is the Procrastinate processes, and not something else on the server.

Assuming Procrastinate and Celery are similar(ish), I did some research on memory consumption / leaks in Celery. Celery has a setting called max_tasks_per_child, which recycles a worker process after it has run a given number of tasks. Would something like this make sense here? A quick illustration of the knob I mean is below.
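
For reference, this is roughly how Celery exposes it (a minimal sketch; the app name and broker URL are just placeholders, nothing Procrastinate-specific here):

```python
from celery import Celery

# Placeholder app name and broker URL.
app = Celery("myapp", broker="redis://localhost:6379/0")

# Recycle each worker process after it has executed 100 tasks,
# which contains slow memory leaks by starting a fresh process.
app.conf.worker_max_tasks_per_child = 100

# Equivalent on the command line:
#   celery -A myapp worker --max-tasks-per-child=100
```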

To be clear - I'm not sure whether it is the Procrastinate library itself that is causing the memory growth, or Procrastinate running one of our tasks. Open to suggestions if you would like me to debug that further.

But have either of you ever run into this problem, or have any ideas on the memory consumption?

@medihack
Member

medihack commented Jan 2, 2025

No, not really. I have been running Procrastinate for about 30 days without a restart, and I don't see an increase in memory consumption. However, I am already using the v3 branch. Hopefully, we will release a beta of it in the next 2 weeks (there is only one small issue left). The worker in v3 was fully rewritten and may solve this (and hopefully makes a workaround like max_tasks_per_child unnecessary).

@ewjoachim
Member

I think @paulzakin meant that the memory leak might also come from their own code, but while they're figuring out where exactly, it would make sense to have a single setting that ensures the worker stops after a set amount of work.

I think it would be nice if this could easily be implemented in your own code, but we may need some primitives for that.

The more I think about it, the more I'd be inclined to add a middleware feature: when launching the worker, you could add an additional argument middleware= and pass a callable that receives a task and awaits it. This way you could add your own code before and after tasks. This would let you count how many tasks were received. We could also expose a special exception StopWorker that, when raised, stops the worker gracefully. I believe this would provide easy primitives that might be a powerful way to achieve plenty of results.

Of course, the middleware part can already be achieved today but it's a bit tedious.
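
Roughly what I have in mind (purely illustrative, none of this exists yet - the middleware= argument, the callable signature, and StopWorker are all hypothetical):

```python
# Hypothetical sketch of the proposed middleware API.
# middleware=, the single-argument callable, and StopWorker are
# assumptions based on the idea above, not existing Procrastinate API.
import procrastinate

MAX_TASKS = 500
tasks_run = 0

async def counting_middleware(task):
    """Called by the worker for each job: run the task, count it,
    and ask the worker to stop gracefully after MAX_TASKS tasks."""
    global tasks_run
    result = await task()
    tasks_run += 1
    if tasks_run >= MAX_TASKS:
        raise procrastinate.StopWorker  # hypothetical graceful-stop exception
    return result

# Hypothetical usage when launching the worker:
# await app.run_worker_async(middleware=counting_middleware)
```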

@paulzakin
Contributor Author

Yup - I'm totally not sure whether it is my code that Procrastinate is running or Procrastinate itself; @ewjoachim nailed it. I think your approach makes sense, @medihack - and it is good to know that you have not seen any issue on the v3 branch. I'm content to wait a month (or two) for the release and then try that - it will be an excellent test of it, I think!

And @ewjoachim, I love the middleware idea - but I'm sensitive to your and @medihack's workflow - so up to you!
