Storage

Office of Research Computing data storage offerings are intended to be used in conjunction with high-performance computing performed on Office of Research Computing compute resources. Data for other purposes should be stored elsewhere.

Rent or Purchase Additional Storage

Please see this document to learn more about rental and purchase options.

Storage Resources Summary

Name | Path | Default Quota | Backed Up | Purpose
Home Directory | /home/username or ~/ | 1 TB (1000 GB), 1 million files | Yes | Critical data to be retained, source code, small-file processing
Scratch (Compute) | ~/compute | 20 TB (20480 GB), 1 million files | No | Temporary files, data being processed, large-file processing
Group | ~/fsl_groups/groupname | 40 GB, 1 million files | Yes | Shared data sets and results, group small-file processing
Group Scratch | ~/fsl_groups/groupname/compute | 20 TB (20480 GB), 1 million files | No | Temporary files, data being processed, large-file processing of group-shared data
tmp (local scratch) | /tmp | N/A (varies by node) | No | Limited storage on compute nodes for in-process scratch data

Home Directories

  • Home directories should be used to store code and data deemed critical and worth backing up
  • Home directories are accessible from compute nodes and may be used to store small files that jobs read, but should not be used for temporary scratch data
  • The default quota is 1 TB and 1 million files; requests may be made for more if appropriate (see the usage-check example after this list)
  • Data in home directories is backed up daily
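
Because the quota counts both space and number of files, it can help to check both before requesting an increase. The following is a minimal sketch using standard Linux tools; the paths are simply the defaults from the table above:

# Report total space used in your home directory (may take a while on large trees)
du -sh ~/

# Count the files and directories under your home directory
find ~/ | wc -l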

Scratch, or Compute, Storage

  • Scratch storage is expensive, high-performance storage for large data sets and in-progress results
  • Users should periodically clean up scratch storage to free space for others
  • There is a 20 TB, 1 million file quota on user scratch storage
  • Data in scratch is not backed up unless you back it up manually (see the example after this list)
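
Because scratch is not backed up, anything worth keeping should be copied into a backed-up location such as home or group storage. The following is a minimal sketch using rsync and find; the results directory and backup destination are placeholders, not a required layout:

# Copy finished results out of scratch into backed-up home storage
rsync -av ~/compute/results/ ~/results_backup/

# List scratch files untouched for more than 30 days as candidates for cleanup
find ~/compute -type f -mtime +30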

Group Storage

  • Group storage should be used like home directories, with the added benefit that files can be shared with other users in the group
  • The default quota for a group is 40 GB and 1 million files
  • Data in group storage is backed up daily
  • See the Group File Sharing page for more information; a minimal sharing sketch follows this list
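
Exact group setup and permissions are covered on the Group File Sharing page. As a rough sketch only, with groupname standing in for your actual group name, sharing a data set might look like:

# Copy a data set into the shared group directory (groupname is a placeholder)
cp -r dataset/ ~/fsl_groups/groupname/

# Ensure group members can read the files and traverse the directories
chmod -R g+rX ~/fsl_groups/groupname/dataset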

Group Scratch Storage

  • Group scratch storage should be used like scratch storage, with the added benefit that files can be shared with other users in the group
  • There is a 20 TB, 1 million file quota on group scratch storage
  • Group scratch storage is not backed up, unless you back it up manually

tmp, or Local Scratch Storage

  • tmp storage uses the local disk of the node the job is running on
  • Users may create temporary files and directories in /tmp for local scratch needs
  • Users are responsible for cleaning up /tmp after every job and should automate that cleanup
  • Each compute node has limited local space, ranging from about 30 GB to over 800 GB
  • We recommend using the path `/tmp/$SLURM_JOB_ID` to make sure that the directory is unique for each job.
  • An example script to manage the cleanup follows:
#!/bin/bash
#SBATCH --nodes=1 --ntasks=8
#SBATCH --mem=16G
#SBATCH --time=3:00:00

TMPDIR="/tmp/$SLURM_JOB_ID"

# prepare function to clean up directory if job is killed, etc
# NOTE: THIS IS NOT EXECUTED UNTIL THE SIGNAL IS CALLED.
cleanup_scratch() {
    echo "Cleaning up temporary directory inside signal handler, meaning I either hit the walltime, or deliberately deleted this job using scancel"
    rm -rfv $TMPDIR
    echo "Signal handler ended at:"
    date
    exit 1
}

#Now, associate this function with the signal that is sent when the job is cancelled or hits its walltime
trap 'cleanup_scratch' TERM

#Now, initial setup of temporary scratch directory
echo "Creating scratch directory at $TMPDIR"
mkdir -pv "$TMPDIR" 2>&1

#PUT CODE TO COPY YOUR DATA INTO $TMPDIR HERE IF NECESSARY
#DO WHATEVER YOU NEED TO DO TO GET YOUR SOFTWARE TO USE $TMPDIR. THIS WILL DEPEND ON THE SOFTWARE BEING USED
#PUT CODE TO RUN YOUR JOB HERE

echo "Cleaning up temporary directory at end of script, meaning that the job exited cleanly"
rm -rfv "$TMPDIR"
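
The trap is registered before the temporary directory is created, so the cleanup function also runs if the job is cancelled or hits its walltime partway through; the final rm handles the normal-exit case. Assuming the script above is saved as tmp_cleanup_example.sh (the filename is only a placeholder), it is submitted like any other batch job:

sbatch tmp_cleanup_example.sh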