I used `srun` to hop into a bash shell on a GPU machine. Then I wanted to use `srun` to launch 4 processes on this same machine, but it just hangs. It looks like the first `srun` reserves the whole node, so further calls to `srun` get stuck. So this is a use case for `salloc`, which is how I do things on Frontier/Perlmutter. However, on those systems `salloc` also drops you into a shell on the machine, which is convenient.
You're right that you'd need to salloc and then srun inside of that if you want to do multiple jobs inside of an allocation.
Is there a specific request you're making or improvement you suggest? srun is the shortest one-line command, so it's what I generally recommend, and doing multiple jobs is generally a special case.
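For reference, the `salloc`-then-`srun` pattern discussed above could look roughly like the sketch below. It is not taken from the guide: the partition name `gpu` and the executable `./my_app` are hypothetical placeholders, and the `run` helper and `DRY_RUN` switch exist only so the command can be inspected without a Slurm cluster. The idea is that `salloc` holds the allocation and the inner `srun` runs as a job step inside it, rather than queueing as a second job that waits on the first.

```shell
#!/usr/bin/env bash
# Sketch only: hold one allocation, then launch a 4-task job step inside it.
# DRY_RUN=1 (the default here) prints the command instead of running it.
DRY_RUN="${DRY_RUN:-1}"
PARTITION="${PARTITION:-gpu}"   # hypothetical partition name; site-specific
APP="${APP:-./my_app}"          # hypothetical executable

# Helper (illustrative): echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# salloc obtains the node and runs the given command inside the allocation;
# the inner srun becomes a job step of that allocation, not a new job.
run salloc --nodes=1 --ntasks=4 --partition="$PARTITION" \
    srun --ntasks=4 "$APP"
```

Running `salloc` without a trailing command instead starts a shell that holds the allocation, and several `srun` steps can then be launched from it one after another. Whether that shell lands on the compute node or stays on the login node depends on how the site configures Slurm.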
sapling-guide/README.md
Lines 63 to 78 in 4a2dc09