
Storage

Office of Research Computing storage offerings are intended for use with high-performance computing on Office of Research Computing compute resources. Data for other purposes should be stored elsewhere.

Storage Resources Summary

Name                 Path                            Default Quota  Backed Up  Purpose
Home Directory       /fslhome/username or ~/         100 GB         Yes        Critical data to be retained, source code, small-file processing
Scratch (Compute)    ~/compute                       15 TB          No         Temporary files, in-process data, large-file processing
Group                ~/fsl_groups/groupname          40 GB          Yes        Shared data sets and results, group small-file processing
Group Scratch        ~/fsl_groups/groupname/compute  12 TB          No         Temporary files, in-process data, large-file processing of group-shared data
tmp (Local Scratch)  /tmp                            N/A            No         Limited storage on compute nodes for in-process scratch data
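As a quick way to see how much space each area is using against its quota, a loop like the following can be sketched. The paths come from the table above; "groupname" is a placeholder for an actual group name, and areas that do not exist on a given account are simply skipped:

```shell
# report_usage: print disk usage for each storage area that exists.
# (Illustrative helper; "groupname" below is a placeholder.)
report_usage() {
  for area in "$@"; do
    if [ -d "$area" ]; then
      du -sh "$area"   # total size of this storage area
    fi
  done
}

report_usage ~/ ~/compute ~/fsl_groups/groupname ~/fsl_groups/groupname/compute
```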

Home Directories

  • Home directories should be used to store code and data deemed critical and worth backing up
  • Home directories are accessible from compute nodes and may be used to store small files that jobs read during processing, but should not hold temporary scratch data
  • The default quota is 100 GB; requests for more may be made if appropriate
  • Data in home directories is backed up daily

Scratch, or Compute, Storage

  • Scratch storage is expensive, high-performance storage intended for large data sets and results
  • Users should periodically clean up scratch storage to free space for others
  • There is a 15 TB quota on user scratch storage
  • Data in scratch is not backed up
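One way to find cleanup candidates is to list files that have not been accessed recently. The helper below is an illustrative sketch, not a site-provided tool; it assumes GNU find and sort, and ~/compute is the scratch path from the table above:

```shell
# stale_report: list files under a directory not accessed in N days,
# largest first -- likely candidates for cleanup.
# (Illustrative helper; assumes GNU find and sort.)
stale_report() {
  local dir="$1" days="${2:-30}"
  find "$dir" -type f -atime +"$days" -printf '%s\t%p\n' | sort -rn
}

# Example: review scratch files untouched for 30+ days before deleting them
# stale_report ~/compute 30
```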

Group Storage

  • Group Storage should be used like home directories, with the added benefit of being able to share files with other users
  • The default quota for a group is 40 GB
  • Data in group storage is backed up daily
  • See the Group File Sharing page for more information
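A file placed in group storage typically needs group-read permission before other members can open it. The sketch below is illustrative; the helper name and the group "mygroup" are hypothetical, and actual group paths live under ~/fsl_groups/groupname as shown in the table:

```shell
# share_with_group: copy a file into a group directory and make it
# group-readable. (Hypothetical helper, not a site-provided command.)
share_with_group() {
  local file="$1" groupdir="$2"
  mkdir -p "$groupdir"
  cp "$file" "$groupdir/" && chmod g+r "$groupdir/$(basename "$file")"
}

# Example (the group "mygroup" is a placeholder):
# share_with_group results.csv ~/fsl_groups/mygroup
```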

Group Scratch Storage

  • Group scratch storage should be used like scratch storage, with the added benefit of being able to share files with other users
  • There is a 12 TB quota on group scratch storage
  • Group scratch storage is not backed up

tmp, or Local Scratch Storage

  • tmp storage uses the local hard drive on the node the job is running on
  • Users may create temporary files and directories in /tmp for local scratch file needs
  • Users are responsible for cleaning up /tmp after every job and should automate this cleanup
  • Each compute node has limited local space, varying from about 30 GB to over 800 GB
  • We recommend using the path /tmp/$SLURM_JOB_ID to make sure that the directory is unique for each job.
  • An example script to manage the cleanup follows:
#!/bin/bash
#SBATCH --nodes=1 --ntasks=8
#SBATCH --mem=16G
#SBATCH --time=3:00:00

TMPDIR=/tmp/$SLURM_JOB_ID

# Define a function to clean up the directory if the job is killed, etc.
# NOTE: this does not run until the trapped signal is received.
cleanup_scratch() {
    echo "Cleaning up temporary directory inside signal handler, meaning the job either hit its walltime or was deliberately cancelled with scancel"
    rm -rfv "$TMPDIR"
    echo "Signal handler ended at:"
    date
    exit 1
}

# Associate this function with the signal sent when the job is cancelled or hits its walltime
trap 'cleanup_scratch' TERM

# Initial setup of the temporary scratch directory
echo "Creating scratch directory at $TMPDIR"
mkdir -pv "$TMPDIR"

#PUT CODE TO COPY YOUR DATA INTO $TMPDIR HERE IF NECESSARY
#DO WHATEVER YOU NEED TO DO TO GET YOUR SOFTWARE TO USE $TMPDIR. THIS WILL DEPEND ON THE SOFTWARE BEING USED
#PUT CODE TO RUN YOUR JOB HERE

echo "Cleaning up temporary directory at end of script, meaning that the job exited cleanly"
rm -rfv "$TMPDIR"
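The placeholder sections in the script above usually amount to staging input into $TMPDIR and copying results back to shared storage before the final cleanup. A minimal sketch, with helper names and paths that are purely illustrative:

```shell
# Illustrative stage-in/stage-out helpers for the placeholder sections above.
stage_in()  { mkdir -p "$2" && cp "$1" "$2/"; }   # copy input into local scratch
stage_out() { mkdir -p "$2" && cp "$1" "$2/"; }   # copy results back to shared storage

# Inside the job script (paths are examples only):
# stage_in  ~/compute/input.dat "$TMPDIR"
# ... run the software against the copy in $TMPDIR ...
# stage_out "$TMPDIR/out.dat"   ~/compute/results
```

Copying results out before the `rm -rfv` at the end of the script matters: anything still in /tmp when the job finishes is deleted.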