BYU

Office of Research Computing

Backing Up to Box With Restic

We now recommend kopia for backups; this page is unmaintained.

restic in combination with rclone makes a versatile cloud backup solution. Once you have rclone set up, restic backups are very simple. This tutorial demonstrates how to back up your personal and group compute directories to Box. Note that restic is versatile and can be used in other configurations as well--pulling data to your personal machine, for example, or pushing it to your department's servers.

Prerequisites

First you will have to set up rclone to work with Box.

You will need the rclone and restic modules loaded:

module load rclone restic

For security, choose a strong, unique password and enter it into a file, for example ~/.restic-password:

echo 'my 100% unique password' > ~/.restic-password
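Anyone who can read this file can decrypt your backups, so it is worth making sure the file is readable only by you. The following is a minimal sketch rather than a required step; write_password_file is a hypothetical helper name, not part of restic or rclone.

```shell
# Hypothetical helper: write a password to a file that only its owner can read.
# Usage: write_password_file FILE PASSWORD
write_password_file() {
    file=$1
    password=$2
    # The subshell keeps the umask change from leaking into the calling shell.
    (
        umask 077                          # files created here get mode 600
        printf '%s\n' "$password" > "$file"
    )
    chmod 600 "$file"                      # also tightens a file that already existed
}
```

It could be called as write_password_file ~/.restic-password 'my 100% unique password'. If the file already exists, running chmod 600 ~/.restic-password on its own has the same effect on the permissions.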

If you don't want the password stored, you can skip this step and omit --password-file ~/.restic-password when backing up. Automating your backups will be significantly harder in that case, though, since you will need to either enter the password manually each time restic runs or supply it with --password-command. Do not lose or forget your password--restic encrypts the repository, and the password is the only way to decrypt it.

You will need a directory (e.g. ORCBackup) in which to store your restic backups. If you are backing up to Box, use rclone to create it:

rclone mkdir box:ORCBackup

Initializing a repository

Before backing up, restic needs to initialize the directory where it will be storing data. When using an rclone back end, prefix the rclone back end and directory name with "rclone:":

restic --password-file ~/.restic-password \
       --repo rclone:box:ORCBackup \
       init

Once a repository is initialized, you can back up anything to it. Since restic deduplicates, it's advantageous to maintain only one repository, backing up all of your computers and projects to it.
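In a script, it can be handy to initialize the repository only if that has not been done yet. The sketch below assumes the RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE environment variables are set, so that --repo and --password-file can be omitted; init_repo_if_needed is a hypothetical name. It treats a failing "restic cat config" as "not initialized", which is an approximation: that command can also fail for other reasons, such as network trouble or a wrong password.

```shell
# Hypothetical helper: run 'restic init' only when the repository's config
# cannot be read. Assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set.
init_repo_if_needed() {
    if restic cat config >/dev/null 2>&1; then
        echo "repository already initialized"
    else
        restic init
    fi
}
```

If the repository does exist but could not be read for some other reason, restic init should refuse to overwrite it, so the worst case is expected to be an error message rather than data loss.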

Run a backup

After initialization, creating a backup is very simple. We recommend using tags to keep your backups sorted.

Suppose you want to back up your own compute directory as well as that of fslg_mygroup, distinguishing the backups with the tags my_compute and fslg_mygroup_compute respectively. You can do so with:

restic --password-file ~/.restic-password \
       --repo rclone:box:ORCBackup \
       backup ~/compute/ \
       --tag my_compute
sleep 30 # so Box actually deletes restic's lockfile
restic --password-file ~/.restic-password \
       --repo rclone:box:ORCBackup \
       backup /fslgroup/fslg_mygroup/compute/ \
       --tag fslg_mygroup_compute

The trailing slashes are important--omitting them will result in worthless backups of the symlinks themselves rather than of the directories they point to.
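To guard against forgetting the slash, a script could normalize paths before handing them to restic. This is only a sketch; backup_path is a hypothetical helper, not a restic feature.

```shell
# Hypothetical helper: print a path ending in a slash, and note on stderr
# when the path itself (without the slash) is a symlink.
backup_path() {
    path=${1%/}                 # drop one trailing slash, if present
    if [ -L "$path" ]; then
        echo "note: $path is a symlink; using $path/ to back up its target" >&2
    fi
    printf '%s/\n' "$path"
}
```

It could then be used as restic backup "$(backup_path ~/compute)" --tag my_compute.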

Automating backups

After this initial backup, you will probably want to schedule regular backups. The following script will run the backup and "prune" old backups (to keep space use from growing indefinitely); it keeps 2 weeks of daily, 3 months of weekly, 1 year of monthly, and 5 years of yearly backups for ~/compute, and 1 week of daily and 6 weeks of weekly backups for /fslgroup/fslg_mygroup/compute.

#!/bin/bash -l

# Load rclone and restic
module load rclone restic

# Set password and repo environment variables so we don't have to specify '--password-file' and '--repo'
export RESTIC_PASSWORD_FILE=~/.restic-password
export RESTIC_REPOSITORY=rclone:box:ORCBackup

# Back up
restic backup ~/compute/ --tag my_compute
sleep 30 # so Box actually deletes restic's lockfile
restic backup /fslgroup/fslg_mygroup/compute/ --tag fslg_mygroup_compute
sleep 30 # so Box actually deletes restic's lockfile

# Prune backups
restic forget --prune --tag my_compute --keep-daily 14 \
                                       --keep-weekly 12 \
                                       --keep-monthly 12 \
                                       --keep-yearly 5
sleep 30 # so Box actually deletes restic's lockfile
restic forget --prune --tag fslg_mygroup_compute --keep-daily 7 --keep-weekly 6

In order to run this every day, save the script somewhere (e.g. ~/restic-backup.sh), add execute permissions to it (chmod +x ~/restic-backup.sh), and add a corresponding cron entry by running crontab -e. You will probably want it to look something like this:

MAILTO=your_email@address.com
M H * * * ~/restic-backup.sh > ~/restic-backup.log

...where M should be replaced with a minute (0-59) and H should be replaced with an hour (0-23). This will run the backup script each day at a given time. Most users will only need to back up weekly or monthly:

MAILTO=your_email@address.com
# weekly (replace 'W' with a day of the week, 0-6, where 0 is Sunday)
M H * * W ~/restic-backup.sh > ~/restic-backup.log

# monthly (replace 'L' with a day of the month, 1-31, preferably 1-28)
M H L * * ~/restic-backup.sh > ~/restic-backup.log
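As a concrete illustration of the M and H placeholders, the sketch below prints a daily crontab line for a given minute and hour and rejects values outside the ranges cron accepts. daily_cron_entry is a hypothetical helper; the script and log paths are the ones used in the examples above.

```shell
# Hypothetical helper: print a daily crontab line for MINUTE (0-59) and HOUR (0-23).
daily_cron_entry() {
    minute=$1
    hour=$2
    # Reject empty or non-numeric values before comparing ranges.
    for value in "$minute" "$hour"; do
        case $value in
            ''|*[!0-9]*) echo "not a number: '$value'" >&2; return 1 ;;
        esac
    done
    if [ "$minute" -gt 59 ] || [ "$hour" -gt 23 ]; then
        echo "out of range: minute must be 0-59, hour 0-23" >&2
        return 1
    fi
    printf '%s %s * * * ~/restic-backup.sh > ~/restic-backup.log\n' "$minute" "$hour"
}
```

For example, daily_cron_entry 30 2 prints "30 2 * * * ~/restic-backup.sh > ~/restic-backup.log", which runs the script every day at 02:30.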

An added advantage of running under cron on our systems is that if the job fails (or prints anything to stderr), you will receive an email with the error message(s), as long as you have MAILTO set to the correct address. Make sure that you direct stdout to a file (~/restic-backup.log in these examples), or you will get an email every time the backup script runs, whether or not it was successful.
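The fixed "sleep 30" pauses in the backup script are a guess at how long Box needs to remove restic's lock file. One alternative is to retry a command a few times with a pause in between. The sketch below is a generic, hypothetical retry helper and not something restic provides. It is a blunt instrument: it retries on any non-zero exit status, not only on lock errors.

```shell
# Hypothetical helper: run a command, retrying after a pause if it fails.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    try=1
    while :; do
        status=0
        "$@" || status=$?                  # capture the exit status without aborting
        [ "$status" -eq 0 ] && return 0
        if [ "$try" -ge "$attempts" ]; then
            echo "retry: giving up after $try attempts: $*" >&2
            return "$status"
        fi
        echo "retry: attempt $try failed (exit $status); waiting ${delay}s" >&2
        sleep "$delay"
        try=$((try + 1))
    done
}
```

In the backup script, a line such as retry 5 30 restic backup ~/compute/ --tag my_compute could then stand in for a bare restic call followed by a sleep. Because the messages go to stderr, cron would still email you about failed attempts. Newer restic releases also offer a --retry-lock option, which may be the simpler choice if the restic module you load is recent enough to support it.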

Checking your backups

You should check your backups roughly monthly; taking a few hours a year to make sure that you would only lose a month of work at worst is well worth the effort.

Use restic mount to check your backups--just create a directory and mount the repository there:

mkdir /tmp/restic-mount
restic --password-file ~/.restic-password \
       --repo rclone:box:ORCBackup \
       mount /tmp/restic-mount
# Now press 'Ctrl-Z' to stop the process, then...
bg # resume mount process in background
restic_pid=$!
cd /tmp/restic-mount

/tmp/restic-mount will contain four directories: hosts, ids, snapshots, and tags. They all contain the same data, but are organized differently; generally, tags is the most useful. It will be organized something like this:

tags/
|-- fslg_mygroup_compute
|   |-- 2020-10-01T14:56:03-06:00
|   |-- 2020-10-02T13:45:42-06:00
|   '-- latest -> 2020-10-02T13:45:42-06:00
'-- my_compute
    |-- 2020-10-01T13:45:02-06:00
    |-- 2020-10-02T13:45:01-06:00
    '-- latest -> 2020-10-02T13:45:01-06:00

Each of the subdirectories under the tags will contain everything in the associated backup--for example, if you navigate to tags/my_compute/latest and run ls, you will see the exact same files and directories as exist in your ~/compute directory. This is very useful for restoring single files or small groups of files (e.g. on accidental deletion)--just mount the repository, find the file(s), and copy them back to their original location.
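The copy step can be sketched as a small function. It assumes the layout described above (tags/TAG/latest mirroring the backed-up directory); copy_from_mount is a hypothetical helper name. Copying into a scratch directory first, rather than straight over the original, leaves room to inspect the file before replacing anything.

```shell
# Hypothetical helper: copy one file or directory out of the latest snapshot
# for a tag in a mounted restic repository.
# Usage: copy_from_mount MOUNT_DIR TAG RELATIVE_PATH DEST_DIR
copy_from_mount() {
    mount_dir=$1    # e.g. /tmp/restic-mount
    tag=$2          # e.g. my_compute
    rel_path=$3     # path relative to the backed-up directory
    dest_dir=$4     # directory to copy into (created if missing)
    src="$mount_dir/tags/$tag/latest/$rel_path"
    if [ ! -e "$src" ]; then
        echo "not found in latest snapshot: $src" >&2
        return 1
    fi
    # -R copies directories recursively; -p preserves modes and timestamps.
    mkdir -p "$dest_dir" && cp -Rp "$src" "$dest_dir/"
}
```

For example, copy_from_mount /tmp/restic-mount my_compute somedir/someprogram.c ~/restored would place someprogram.c in ~/restored (a hypothetical destination).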

Once you are finished, navigate out of the mount, kill the restic mount process, and clean up the mount directory:

cd /tmp
kill $restic_pid
rmdir restic-mount

Restoring from backup

In case of a major data loss, restic mount isn't efficient--you'll want to use restic restore. To restore the most recent snapshot of /fslgroup/fslg_mygroup/compute/ to its original location, for example, you can use:

restic --password-file ~/.restic-password \
       --repo rclone:box:ORCBackup \
       restore latest \
       --tag fslg_mygroup_compute \
       --target /fslgroup/fslg_mygroup/compute/

Be very careful with restore--it will overwrite anything that exists in the target directory, so only use the original directory if all truly is lost or if you have moved everything that is salvageable elsewhere.
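A more cautious approach is to restore into an empty scratch directory and move files into place by hand afterwards. The sketch below assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set in the environment; restore_to_scratch is a hypothetical helper that refuses to run if the target already has content.

```shell
# Hypothetical helper: restore the latest snapshot for a tag into an empty
# directory. Assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set.
# Usage: restore_to_scratch TAG SCRATCH_DIR
restore_to_scratch() {
    tag=$1
    scratch=$2
    mkdir -p "$scratch" || return 1
    # Refuse to restore into a directory that already contains anything.
    if [ -n "$(ls -A "$scratch")" ]; then
        echo "refusing to restore: $scratch is not empty" >&2
        return 1
    fi
    restic restore latest --tag "$tag" --target "$scratch"
}
```

After something like restore_to_scratch fslg_mygroup_compute /tmp/restore-check (a hypothetical path), the restored tree can be compared against whatever survives in the original location with diff -rq before anything is overwritten.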

You can, of course, specify older snapshots--for example, you might want to use the snapshot of your compute directory from two days ago, since you just realized that a breaking change happened yesterday before the backup ran. To find out which snapshot corresponds to that backup, use restic snapshots:

restic --password-file ~/.restic-password \
       --repo rclone:box:ORCBackup \
       snapshots

The results will look something like this:

ID        Time                 Host        Tags                    Paths
------------------------------------------------------------------------
fa582527  2020-10-01 13:45:02  login02     my_compute              /
ca22a0dc  2020-10-01 14:56:03  login02     fslg_mygroup_compute    /
04df3762  2020-10-02 13:45:01  login02     my_compute              /
ad81d9a1  2020-10-02 13:45:42  login02     fslg_mygroup_compute    /
------------------------------------------------------------------------
4 snapshots

fa582527 is the snapshot corresponding to the backup of your ~/compute directory from two days ago. If you just want to restore one file, ~/compute/somedir/someprogram.c, you can specify it with --include (there is a corresponding --exclude flag as well):

restic --password-file ~/.restic-password \
       --repo rclone:box:ORCBackup \
       restore fa582527 \
       --target ~/compute/ \
       --include /somedir/someprogram.c
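The --include pattern has to match the path as it is stored in the snapshot, so it can help to list the snapshot's contents before restoring. The sketch below uses restic ls for that and assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set in the environment; find_in_snapshot is a hypothetical helper name.

```shell
# Hypothetical helper: list the paths in a snapshot that contain a given string.
# Assumes RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE are set.
# Usage: find_in_snapshot SNAPSHOT_ID NAME
find_in_snapshot() {
    snapshot=$1
    name=$2
    # grep -F matches NAME literally; '--' protects names that start with '-'.
    restic ls "$snapshot" | grep -F -- "$name"
}
```

For example, find_in_snapshot fa582527 someprogram.c prints each stored path containing "someprogram.c", and one of those paths can then be passed to --include. The function returns a non-zero status when nothing matches.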