How to run on a cluster without workload manager #80
-
Dear Francisco, I realize metaGEM was optimized to run on an HPC cluster system, but our lab's cluster (40 cores, 500 GB RAM) operates without a workload manager. As a user, you have to make sure that you don't use all 40 cores, that you don't submit more jobs than there are cores, and so on. I have 43 (soil) samples organized into sample-specific subdirectories within the dataset folder. If I run the assembly step and set the megahit cores to a fixed number, would that mean metaGEM assembles the samples only one at a time? What I want to avoid is that the metaGEM parser submits all 43 jobs at once, because on this cluster submitting a job means starting it. Best wishes,
Replies: 4 comments
-
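Hey Sam, Good question. If you look at the metaGEM.sh script, specifically line 358, you will see that only 1 job is submitted at a time when running metaGEM with the --local flag (metaGEM/metaGEM.sh, lines 338 to 364 in a125bb7). Let me know if you run into any issues with this. Best,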
-
Thank you for your answer. OK, so it would make sense to set the number of cores for all jobs to the maximum number of threads I am allowed to use at any time (e.g. 20)? Otherwise, with the default values, only 4 cores would be used while fastp is running, for example, and 16 would be left unused? Best,
-
Hi Sam, That is correct. When running … (lines 42 to 59 in d81186a). Hope that is clear, and let me know if you run into any issues with this! Best,
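For anyone following along, the core arithmetic from this exchange can be checked with a small sketch. The function names here are hypothetical and are not part of metaGEM. The only numbers used are the ones from the thread: 40 cores on the machine, a personal budget of 20 threads, 43 samples, and a fastp default of 4 cores.

```python
def idle_cores(budget, cores_per_job, concurrent_jobs=1):
    """Cores left unused from a personal budget while jobs are running."""
    used = cores_per_job * concurrent_jobs
    if used > budget:
        raise ValueError(f"{used} cores requested but the budget is {budget}")
    return budget - used


def max_concurrent_jobs(budget, cores_per_job):
    """How many jobs of a given size fit into the budget at once."""
    return budget // cores_per_job


def total_threads(n_jobs, cores_per_job):
    """Threads started if every job is launched at the same moment."""
    return n_jobs * cores_per_job


# One job at a time with the default 4 cores leaves 16 of 20 cores idle.
print(idle_cores(20, 4))
# Giving the single job the whole budget leaves nothing idle.
print(idle_cores(20, 20))
# Launching all 43 samples at once with 20 cores each would far exceed 40 cores.
print(total_threads(43, 20))
```

Since only one job runs at a time in this setup, setting each job's cores to the full budget is what keeps the allowed threads busy.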
-
Hi Francisco, Yes, very clear, thanks! When running fastp, however, it seems that 16 threads are used instead of 20: I get a warning that I specified 20 but only 16 will be used. This seems to be inherent to fastp. Best,
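The behaviour described above amounts to a per-tool ceiling on the requested thread count. A minimal sketch, assuming the cap of 16 reported by the warning in this thread (the constant and function names are hypothetical, not fastp's or metaGEM's code):

```python
# Cap taken from the warning reported in this thread, not from a config file.
FASTP_MAX_THREADS = 16


def effective_threads(requested, tool_cap=None):
    """Threads a tool will actually use, given an optional per-tool ceiling."""
    if requested < 1:
        raise ValueError("requested thread count must be at least 1")
    if tool_cap is None:
        return requested
    return min(requested, tool_cap)


# Asking for 20 threads with a ceiling of 16 yields 16.
print(effective_threads(20, FASTP_MAX_THREADS))
# A tool without a ceiling uses what it was given.
print(effective_threads(20))
```

So with a budget of 20, about 4 threads would stay idle during the fastp step regardless of the configured value.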
Hey Sam,
Good question. If you look at the metaGEM.sh script, specifically line 358, you will see that only 1 job is submitted at a time when running metaGEM with the --local flag:
metaGEM/metaGEM.sh, lines 338 to 364 in a125bb7
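To illustrate what "one job at a time" means on a machine without a workload manager, here is a rough sketch of sequential execution over sample subdirectories. This is not how metaGEM.sh implements it; the function names and the placeholder command are hypothetical and only show that a blocking call keeps a single job alive at any moment.

```python
import subprocess
import sys
from pathlib import Path


def list_samples(dataset_dir):
    """Return the names of the sample subdirectories in a stable, sorted order."""
    return sorted(p.name for p in Path(dataset_dir).iterdir() if p.is_dir())


def run_one_at_a_time(samples, build_command):
    """Run one command per sample, waiting for each to finish before the next."""
    finished = []
    for sample in samples:
        # subprocess.run blocks until the child process exits,
        # so at most one job is running at any time.
        subprocess.run(build_command(sample), check=True)
        finished.append(sample)
    return finished


def placeholder_command(sample):
    # Hypothetical stand-in for a real assembly command; it only prints a message.
    return [sys.executable, "-c", f"print('assembling {sample}')"]


if __name__ == "__main__":
    run_one_at_a_time(["sample1", "sample2"], placeholder_command)
```

With this pattern the peak thread usage equals the thread count given to a single job, which is why one job at a time is safe on a shared machine.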