How do I submit a large number of very similar jobs?
A few tricks make submitting large numbers of similar jobs much easier. This FAQ outlines some of them and will grow as we find others that help.
Note: If you are trying to submit a large number of very small, very short jobs, please also read this page.
Avoiding multiple files, one per job
Users with hundreds of jobs that differ in only one detail often create a separate job script for each one and then call sbatch on every script. That is a lot of files, and when the changes are minimal it is unnecessary. Instead, you can write a single job script and capture the difference in a variable whose value you supply when you submit the job.
Let me give you an example. Let's say that you have a directory that contains all your job information. You have 700 different cases to submit, and they each have their own directory, named like this:
case001/ case002/ case003/ case004/ ... case100/ case101/ ... case695/ case696/ case697/ case698/ case699/ case700/
Inside each of these case directories is a script called ''runcase.sh'' that runs the individual case. The submission script for case 1 (call it ''submit_case001.sh'') therefore looks something like this:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=1G
#SBATCH --job-name=my_case001_job
# change into this case's directory and run it
cd $SLURM_SUBMIT_DIR/case001
./runcase.sh
You would then submit the case like this:
sbatch submit_case001.sh
What's wrong with that?
Using this model, you'd create 700 individual scripts, and the only thing that would change would be the case number. A much easier way would be to use a job array, which would only require the use of a single script.
Job arrays are collections of similar tasks, each executing the same script. So that tasks can do unique work, each one has an ID that is available to it through the environment variable SLURM_ARRAY_TASK_ID. These IDs are assigned when the array is submitted, and can be specified in a few different ways:
|Submission syntax||Resulting task IDs||Description|
|--array=1,2,3,5,8||1, 2, 3, 5, 8||Comma-separated list|
|--array=1-6||1, 2, 3, 4, 5, 6||Range of IDs|
|--array=0-20:4||0, 4, 8, 12, 16, 20||Range of IDs, with step size 4|
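If you want to preview which task IDs a specification would produce before submitting anything, the three forms in the table above can be expanded locally. The sketch below is only an illustration: `expand_array_spec` is a hypothetical helper, not part of Slurm, and it assumes the simple list, range, and range-with-step forms only.

```shell
#!/bin/bash
# expand_array_spec: hypothetical helper (not a Slurm command).
# Prints, one per line, the task IDs for a simple array specification
# such as 1,2,3,5,8 or 1-6 or 0-20:4.
expand_array_spec() {
    local spec=$1 part range step first last
    local -a parts
    # split the specification on commas
    IFS=',' read -r -a parts <<< "$spec"
    for part in "${parts[@]}"; do
        step=1
        range=$part
        # an optional ":N" suffix sets the step size
        if [[ $part == *:* ]]; then
            step=${part##*:}
            range=${part%%:*}
        fi
        if [[ $range == *-* ]]; then
            first=${range%%-*}
            last=${range##*-}
            seq "$first" "$step" "$last"
        else
            # a single ID
            echo "$range"
        fi
    done
}

expand_array_spec 1,2,3,5,8
expand_array_spec 1-6
expand_array_spec 0-20:4
```

This only mirrors the forms listed in the table; Slurm itself remains the authority on what an array specification means.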
For the example job mentioned, one could use an array with tasks 1-700 since we have 700 cases, named "case001"-"case700". The submission script might look something like this:
#!/bin/bash
# submit_array.sh
#SBATCH --ntasks=1
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=1G
#SBATCH --array=1-700
# pad the task ID with leading zeros (to get 001, 002, etc.)
CASE_NUM=$(printf %03d "$SLURM_ARRAY_TASK_ID")
cd "$SLURM_SUBMIT_DIR/case$CASE_NUM"
./runcase.sh
Running sbatch submit_array.sh then creates an array of 700 tasks, each running one of the cases described above.
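Before submitting all 700 tasks, it can be worth checking the ID-to-directory mapping on its own. The following is a minimal sketch, not part of the original script: `case_dir_for_task` is a hypothetical helper, and the fallback to task 1 when SLURM_ARRAY_TASK_ID is unset is an assumption made purely so the snippet can run outside Slurm.

```shell
#!/bin/bash
# case_dir_for_task: hypothetical helper that maps an array task ID
# to the name of its case directory (1 -> case001, 42 -> case042).
case_dir_for_task() {
    # %03d pads a plain decimal ID to three digits. Pass IDs without
    # leading zeros: bash's printf treats a leading 0 as octal.
    printf 'case%03d\n' "$1"
}

# Outside Slurm the variable is unset, so fall back to task 1
# (an assumption for local testing only).
task_id=${SLURM_ARRAY_TASK_ID:-1}
echo "task $task_id would run in $(case_dir_for_task "$task_id")"

# spot-check a few IDs across the range
for id in 1 42 700; do
    case_dir_for_task "$id"
done
```

The loop prints case001, case042, and case700, which matches the directory naming used in the example.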