slurm-auto-array
Slurm job arrays are useful when one wants to run one command many times, each time with a different set of parameters. Job arrays can be cumbersome to set up, though; for example, when a command needs to run tens of thousands of times, several units of work must be aggregated into each array task since there is a limit of 5,000 tasks per job array. slurm-auto-array seeks to remedy this by automatically aggregating work such that strain on the scheduler is reduced and throughput is maximized.
slurm-auto-array works much like GNU Parallel: it takes a newline-delimited list of arguments from stdin and submits a job array that runs a user-specified command on each argument. To use it, load the parallel and slurm-auto-array modules.
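For example, on a cluster that provides these as environment modules:
module load parallel slurm-auto-array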
For information on usage, see the man page (man slurm-auto-array), the help message (slurm-auto-array --help), and the worked example on GitHub.
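As a rough sketch of the workflow (the resource flags, the -- separator, and the data/*.csv and process.sh names below are assumptions for illustration; consult the man page and the GitHub example for the actual syntax), you might pipe a list of work units to slurm-auto-array and give it the command to run on each:
ls data/*.csv | slurm-auto-array --time 0:30:00 --mem 2G -- ./process.sh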
Alternatives
Job Arrays
If your tasks run longer than an hour and there are fewer than a few thousand of them, raw job arrays are a good choice, with less overhead than slurm-auto-array.
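For instance, a minimal raw job array script might look like the following, where the array size, resource requests, input_list.txt, and process_item.sh are placeholders; each task selects its unit of work from a list using $SLURM_ARRAY_TASK_ID:
#!/bin/bash
#SBATCH --array 0-999 # 1,000 tasks, one per unit of work
#SBATCH --ntasks 1 --mem 4G --time 2:00:00
# Each task processes the input file on the line matching its array index
infile="$(sed -n "$((SLURM_ARRAY_TASK_ID + 1))p" input_list.txt)"
./process_item.sh "$infile" > "$infile.out"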
GNU Parallel
If you have a huge number of very small tasks (e.g. 100,000 tasks that each run for about 5 minutes), parallel is a better choice than slurm-auto-array. Using parallel instead of slurm-auto-array is especially important if each unit of work uses one or more files, because moving around so many files in so short a time can bog down our storage systems, slowing your (and others') jobs dramatically. If you are working with a tarball where extracting a single file per task would be time-consuming, parallel is also a good choice.
As an example, say you have a tarball named mydir.tar.gz containing 200,000 files that you would like to process in parallel using a command of the form foo $filename > $filename.out. To do so, you could use a job script similar to:
#!/bin/bash
#SBATCH --nodes 1 # parallel is easier to use on a single node
#SBATCH --ntasks 16 --mem 32G --time 1-00:00:00
# Prep environment
module load parallel pigz
workdir="$(mktemp -d)"
outfiles="$(mktemp -d)"
# Unzip and process files
tar xf <(unpigz -c mydir.tar.gz) -C "$workdir" # -c streams the decompressed tar instead of unzipping in place
parallel --jobs "$SLURM_NTASKS" foo {} ">" "$outfiles"/{/}.out ::: "$workdir"/* # {} is the input path, {/} its basename
# Zip results to current directory and clean up
tar czf results.tar.gz -C "$outfiles" .
rm -r "$workdir" "$outfiles"
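Save the script (for example as process_tarball.sh; the name is arbitrary) and submit it like any other batch job:
sbatch process_tarball.sh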
If you have enough work that you need to split it across multiple nodes, you can still use parallel; to do so, add the following to your script to tell parallel which nodes to use:
# Build a GNU Parallel ssh login file with one "slots/hostname" entry per allocated node;
# the perl expression expands entries like "16(x3)" in SLURM_TASKS_PER_NODE into "16,16,16"
sshloginfile="$(mktemp)"
paste -d '/' <(perl -pe 's/(\d+)\(x(\d+)\)/substr("$1,"x$2,0,-1)/ge' <<<"$SLURM_TASKS_PER_NODE" | tr ',' '\n') \
      <(scontrol show hostnames) > "$sshloginfile"
trap 'rm "$sshloginfile"' EXIT
...and add these flags to the parallel invocation:
--ssh 'ssh -o ServerAliveInterval=300' --sshloginfile "$sshloginfile"
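Putting it together, the invocation from the single-node example above might become something like the following sketch (note that $workdir and $outfiles would then need to live on a filesystem visible to every node, not in node-local /tmp):
parallel --ssh 'ssh -o ServerAliveInterval=300' --sshloginfile "$sshloginfile" \
         foo {} ">" "$outfiles"/{/}.out ::: "$workdir"/* # job slots per node come from the n/hostname entries in the sshloginfile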
Since using parallel ties you to a single job, your work may not finish as quickly, because larger jobs tend to wait in the queue longer; however, if you are working with a tremendous number of files and/or very short tasks, it is likely to be faster than slurm-auto-array.