Recommended way of running ocrmypdf with memory limits #1386
Replies: 1 comment
-
Memory limits do not ask a process to limit its memory usage; they ask the operating system to intervene when a process exceeds the limits you specify.

Limiting the number of jobs may help. On an N-core system, you usually get the best results by running sqrt(N) parallel ocrmypdf processes (or containers), each with `--jobs` set to roughly sqrt(N).

Another option is a retry mechanism. If a process is killed by the OOM killer, its return code will be 137. Retry with a lower `--jobs` setting (a sketch of such a wrapper follows below).

It's possible that the recently reported quadratic regression on large page count files (#1378) also has quadratic memory usage; I have not investigated that.

Without specifics (e.g., "this test file (attached) has n pages and is x MB, and uses y GB of RAM and z GB of temporary storage on my system, with configuration C"), no specific answers can be given. If the file contains pages with very large images, you may need to use some of ocrmypdf's tools for managing very large images.
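A minimal sketch of such a retry wrapper, assuming the real `ocrmypdf` CLI and its `--jobs` option; the function name, file paths, and job counts are illustrative placeholders:

```python
import subprocess

def ocr_with_retry(input_pdf, output_pdf, jobs=4, min_jobs=1):
    """Run ocrmypdf, halving --jobs and retrying whenever the run is OOM-killed."""
    while True:
        proc = subprocess.run(["ocrmypdf", "--jobs", str(jobs), input_pdf, output_pdf])
        # A shell or container reports an OOM kill as exit code 137 (128 + SIGKILL);
        # Python's subprocess reports a child killed by SIGKILL as -9.
        oom_killed = proc.returncode in (137, -9)
        if not oom_killed or jobs <= min_jobs:
            # Success, a non-OOM failure, or OOM even at the minimum job count:
            # report the outcome as-is instead of retrying further.
            return proc.returncode
        jobs = max(min_jobs, jobs // 2)  # OOM-killed: retry with fewer parallel workers
```

Each retry halves the parallelism, trading throughput for a better chance of staying under the memory limit.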
-
Hello!
I've run into some OOM scenarios and I'm wondering what the recommended way of running ocrmypdf is to avoid them. Currently I execute it via a subprocess, but from what I'm seeing, ocrmypdf may spin up its own subprocesses that disregard the configured memory limits?
Essentially I want to treat OOMs as a result on the same level as successful runs and save them to my database, but wrapping ocrmypdf in a subprocess does not seem to help. Any recommendations?
The following does not seem to limit the memory usage and eventually causes my container to OOM:
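For illustration, a wrapper of roughly this shape (a minimal sketch assuming Python's subprocess and resource modules; the paths and limit values are placeholders, not the actual snippet):

```python
import resource
import subprocess

MEM_LIMIT_BYTES = 2 * 1024**3  # placeholder: 2 GiB address-space limit

def limit_memory():
    # Applied in the child just before exec. The limit is per process and is
    # inherited by every process ocrmypdf spawns (tesseract, ghostscript, ...),
    # so each of them is capped individually, but their combined usage is not.
    resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT_BYTES, MEM_LIMIT_BYTES))

proc = subprocess.run(
    ["ocrmypdf", "input.pdf", "output.pdf"],  # placeholder paths
    preexec_fn=limit_memory,
)
print("exit code:", proc.returncode)  # a negative value (-9) means the child was killed by a signal
```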