General PBS Information

PBS is no longer used at BYU Supercomputing.

This page remains for historical reasons. See the SLURM documentation instead.


Overview

This document describes the steps necessary to prepare and run batch jobs using the TORQUE/PBS resource manager with the Moab scheduler.

For information about how to query the resource manager and scheduler, or how to submit, delete, or otherwise manage jobs, see our scheduler commands page.

Script files

Like most batch systems, PBS requires its users to create a script to define what the job needs to do. This may be written in any interpreted language that uses a "#" as the comment character, including Perl, bash, tcsh, etc. This script defines the attributes of the job, including resource requirements, as well as specifying the tasks that the job needs to accomplish.

Once the script is written, it can be submitted to the scheduler with the qsub command, e.g. qsub myscript.sh
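When the job is accepted, qsub prints the job ID that the scheduler assigned. As a quick sketch (the server name after the dot is hypothetical and will differ on our systems):

$ qsub myscript.sh
243.mgmt1

The number before the dot is the job ID that appears in the output file names described later on this page.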

To help illustrate the principles at work here, consider the following example, and the accompanying explanation.

Example 1: simple serial job script


#!/bin/bash
#PBS -l nodes=1:ppn=1,walltime=24:00:00
#PBS -N test_program
#PBS -m abe
#PBS -M user@byu.edu
PROG=/fslhome/username/compute/test_program
PROGARGS=""
cd $PBS_O_WORKDIR
$PROG $PROGARGS
exit 0

For the above example, we will explain each line:

  • The first line, #!/bin/bash, defines the language and the interpreter used to interpret the job script.
  • The next four lines, which all start with #PBS, are treated by the interpreter (bash in this case) as comments. However, the batch scheduler interprets these special comments as a definition of the resources that the job requires.
  • The line that begins #PBS -l (lowercase "L") defines the resources that your job is requesting, as well as other job instructions.
  • The next line that begins #PBS -N defines the name applied to the job.
  • The next two lines, which begin with #PBS -m and #PBS -M define the notifications that the system will send to you about your job.
  • The remainder of the script lists the commands the job will actually execute when it launches. This syntax is mostly outside the scope of this document, but the examples on this page will most likely be sufficient. For further information, we recommend the Advanced Bash-Scripting Guide.

Requesting Resources 

In order for the system to schedule your job appropriately, you need to define the resources it will use (the #PBS -l line in your script). Currently, we schedule primarily based on the number of processors you need.

When using a cluster like marylou5, use the nodes=n:ppn=n syntax to specify the number of nodes and processors per node (ppn) needed.

Example 2: Requesting nodes and processors on marylou5


#PBS -l nodes=2:ppn=2,walltime=24:00:00

The system also needs to know the total amount of time you expect the job to run, known as walltime. The system will terminate your job if it exceeds the specified walltime, so don't set it too low. On the other hand, setting the walltime too high causes problems with the scheduler, and shorter jobs have a greater chance of starting sooner through a technique known as backfilling.

The syntax for specifying a walltime is shown in the example above on the first #PBS line, where walltime=24:00:00 means 24 hours, 0 minutes, and 0 seconds. Time is specified in the format hh:mm:ss, where hh is the number of hours, mm the minutes, and ss the seconds. If you don't specify a walltime, the system will assume one for you, currently set at 1 hour. The system will also adjust your job start priority based on your historical walltime accuracy.
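For illustration, here are a few resource request lines with different walltimes (the node and processor counts are arbitrary examples, and the maximum walltime actually allowed depends on the queue limits in effect):

#PBS -l nodes=1:ppn=1,walltime=00:30:00
#PBS -l nodes=2:ppn=4,walltime=12:00:00
#PBS -l nodes=4:ppn=8,walltime=72:00:00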

Scratch Space 

The Fulton Supercomputing Lab provides a unified scratch storage space. This area is optimized for large amounts of input and output, and is not backed up. We recommend that you copy your job's input data files to some location there, and make appropriate arrangements to put temporary processing files there as well, only copying the final results back to your home directory. One method of doing this is shown below in example 3.

Scratch space is located in /fslhome/username/compute, where username is your username.

Job Naming 

Each job can be assigned a name for it to use when generating output files. The scheduler also assigns it a job number, which is output when you submit the job. These names are assigned to the job using the #PBS -N line as shown in the example above. If you don't specify a name, the system will use the basename of your job script as the name.

Job Output 

The system will automatically create files for each of your jobs that contain the standard output and the standard error of the job. These files are automatically placed in the directory you were in when you submitted the job, and they are named using the name assigned to your job and the job ID number assigned by the system. For example, if your job was named test_job and was given the number 243, then the directory you submitted from would contain a file named test_job.o243 containing the job's standard output and one named test_job.e243 containing its standard error.

NOTE: Standard output and error files are spooled on the compute nodes, then copied back to your home directory when the job finishes. If you want these files to be kept in your home directory while the job runs, you can specify this in your job script with the -k option, e.g. #PBS -k oe
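Standard PBS also lets you choose the output file names or merge the two streams; a short sketch (the file names below are placeholders, and these directives are generic TORQUE/PBS options rather than anything specific to our systems):

#PBS -o my_job.out
#PBS -e my_job.err

or, to merge standard error into the standard output file:

#PBS -j oe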

Notification 

If you request it, your job can attempt to email you when it begins execution, when it ends or is terminated, and when it aborts. These notifications are configured using the #PBS -m and #PBS -M lines shown in the example above.

The line that begins #PBS -m specifies the list of events you wish to be notified about: a for when the job aborts, b for when it begins execution, and e for when it ends or is terminated. You may use any combination of these letters, in any order. If you don't want any notifications, simply use the line #PBS -m n, or leave the notification directives out entirely.

The line that begins #PBS -M defines the email address where you want to receive the notifications. If you request notifications, you must specify an address.
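For example, to receive mail only if the job aborts (the address is a placeholder):

#PBS -m a
#PBS -M user@byu.edu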

SMP vs. MPI 

The Supercomputing Lab has historically managed two types of systems: large single-node systems, and clusters. However, all the current systems are considered clusters, and future single-node systems will be integrated into the cluster system. Therefore, all jobs should use the cluster-style resource request syntax of nodes=n:ppn=m, and you should not use the old single-node style of ncpus=n when requesting processors.
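For example, an 8-processor single-node request should now be written with the cluster syntax:

#PBS -l nodes=1:ppn=8

rather than the old single-node style:

#PBS -l ncpus=8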

Using MPI 

In order to launch an MPI job, you must use an MPI launcher. Historically, we have preferred that people use mpiexec rather than mpirun, but with the recently updated operating system there is no longer any difference between these two tools. You may use whichever one you wish; in all the examples below, they are interchangeable.

Using mpiexec or mpirun

Example 3: 8 processor mpiexec job


#!/bin/bash
#PBS -l nodes=4:ppn=2,walltime=72:00:00
#PBS -N test_program
#PBS -m abe
#PBS -M user@byu.edu

PROG=/fslhome/username/compute/test_program
PROGARGS=""
SCRATCH_DIR=/fslhome/username/compute/$USER/$PBS_JOBID


#make sure the scratch directory is created
mkdir -p $SCRATCH_DIR

#copy datafiles from directory where I typed qsub, to scratch directory
cp -r $PBS_O_WORKDIR/* $SCRATCH_DIR/

#change to the scratch directory
cd $SCRATCH_DIR

# Execute the mpi job
mpiexec $PROG $PROGARGS

#copy data back from scratch directory to directory where I typed qsub
cp -r $SCRATCH_DIR/* $PBS_O_WORKDIR/

exit 0

Historically, you had to pass special parameters to the job launcher, or run a specific version of it, in order to use a special interconnect like InfiniBand. With the new system, this is no longer the case. By default, any job whose nodes all have InfiniBand will use InfiniBand; if any of the nodes do not, the job will simply use Gigabit Ethernet instead. It's all automatic now.