BYU

Office of Research Computing

Backing Up to Box With Kopia

Kopia combined with rclone is our recommended cloud backup solution. Once rclone is set up, a kopia backup takes only a couple of commands. Kopia is versatile and works in other configurations as well: pulling data to your personal machine, for example, or pushing it to your department's servers. Keep in mind that home directories are already backed up, so manual backups are only necessary for compute and private storage.

Prerequisites

Load the rclone and kopia modules:

module load rclone kopia

Load kopia in every new session in which you want to use it; load rclone as well if your repository is on Box.

Set up rclone to work with Box; kopia will use it as a communication layer. You only need to do this once, though you may need to re-authenticate occasionally if you rarely use Box.

For security, choose a strong, unique password for your kopia repository and keep a copy somewhere other than the supercomputer. For automation, you'll also need to store it on the supercomputer itself.

You will need a directory (e.g. kopiabackup) in which to store your kopia backups. If you are backing up to Box, use rclone to create it:

rclone mkdir box:kopiabackup

Creating a Repository

Before your first backup, kopia needs to create the repository where it will store its data; it will prompt you to set the repository password. When using rclone, this looks something like:

kopia repository create rclone --remote-path box:kopiabackup

You'll also want to configure kopia, for example by setting retention policies and excluding directories that don't need to be backed up:

kopia policy set --global --keep-annual=3 --keep-monthly=6 --keep-weekly=4 --keep-daily=7 --add-ignore '**/.cache'

Once a repository is initialized, you can back up anything to it. Since kopia deduplicates, it's advantageous to maintain only one repository, backing up all of your computers and projects to it.

Backing Up

Once a repository is created, backing up is very simple. To back up /home/netid/compute/important:

kopia repository connect rclone --remote-path box:kopiabackup
kopia snapshot create /home/netid/compute/important

You only need to connect to the repository once per session.
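If you are unsure whether the current session is already connected, a small helper can check first. The following is a minimal sketch, not part of kopia itself: the function name ensure_connected is hypothetical, and it assumes that kopia repository status exits with a non-zero status when no repository is connected.

```shell
# Hypothetical helper: connect only when no repository is connected yet.
# Assumes `kopia repository status` fails when there is no connection.
ensure_connected() {
    remote_path="${1:-box:kopiabackup}"
    if kopia repository status >/dev/null 2>&1; then
        echo "already connected"
    else
        kopia repository connect rclone --remote-path "$remote_path"
    fi
}

# Usage (commented out so the block is safe to source):
#   module load rclone kopia
#   ensure_connected box:kopiabackup
#   kopia snapshot create /home/netid/compute/important
```

Running the helper twice in a row is harmless, which makes it convenient at the top of interactive sessions.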

Automating Backups

For automation, you'll need to store your password on the supercomputer, in a file that only you can read:

(umask 077; echo 'my 100% unique password' > ~/.kopia-password)
chmod 600 ~/.kopia-password
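Typing the password as part of a command can leave it in your shell history. One possible alternative is to read it from a prompt and write it to a file that is private from the moment it is created. This is a sketch only; the function name store_kopia_password is hypothetical, and the default path simply mirrors the one used on this page.

```shell
# Hypothetical helper: read the repository password from standard input
# and store it in a file readable only by its owner.
store_kopia_password() {
    pw_file="${1:-$HOME/.kopia-password}"
    # Hide typed characters only when input comes from a terminal.
    if [ -t 0 ]; then stty -echo; fi
    printf 'Repository password: ' >&2
    IFS= read -r pw
    if [ -t 0 ]; then stty echo; printf '\n' >&2; fi
    # Create the file with a restrictive umask, then enforce the mode
    # in case the file already existed with looser permissions.
    ( umask 077; printf '%s\n' "$pw" > "$pw_file" )
    chmod 600 "$pw_file"
    unset pw
}

# Usage (commented out because it waits for input):
#   store_kopia_password ~/.kopia-password
```

The stored file ends with a newline, which the command substitution used when connecting strips automatically.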

This done, you can create a script that will automatically connect, back up /home/netid/compute/important and /home/netid/compute/cat-pictures, and expire old snapshots:

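#!/bin/bash --login
# Load both modules: kopia reaches Box through rclone
module load rclone kopia
# Read the password from the file so that no prompt is needed
KOPIA_PASSWORD="$(cat ~/.kopia-password)" kopia repository connect rclone --remote-path box:kopiabackup
kopia snapshot create /home/netid/compute/important
kopia snapshot create /home/netid/compute/cat-pictures
# Apply the retention policy; without --delete, kopia only reports what it would remove
kopia snapshot expire --all --delete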

To run this script every day at 3:14 AM, you could use cron. The relevant entry might look something like:

14 3 * * * bash --login /home/netid/kopia-backup.sh
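Because cron runs the script unattended, it helps if a failed snapshot is reported rather than silently skipped. One way to do that, sketched below, is to loop over the paths and return a non-zero status if any of them fails. The function name backup_paths is hypothetical and is not part of kopia.

```shell
# Hypothetical helper: snapshot each path given as an argument and
# report every failure instead of stopping at the first one.
backup_paths() {
    status=0
    for path in "$@"; do
        if ! kopia snapshot create "$path"; then
            echo "backup failed: $path" >&2
            status=1
        fi
    done
    # Non-zero when at least one snapshot failed.
    return "$status"
}

# Usage inside the backup script (commented out so the block is safe to source):
#   backup_paths /home/netid/compute/important /home/netid/compute/cat-pictures || exit 1
```

Cron typically mails any output a job produces to the job's owner, so the messages written to standard error are a simple way to learn about a failed run.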

Restoring Backups

Check your backups regularly by restoring from them, so that you know you'll be able to restore after data loss. Store your password somewhere other than the supercomputer so that you can still restore if the data there is lost.
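A test restore could look roughly like the following sketch: list the snapshots for a path, restore one into a scratch directory, and compare the result with the live data. The function name restore_and_compare and the scratch directory are hypothetical, and the snapshot ID is a placeholder you would copy from the output of kopia snapshot list.

```shell
# Hypothetical helper: restore a snapshot into a scratch directory and
# compare it with the original directory.
restore_and_compare() {
    snapshot_id="$1"
    original_dir="$2"
    target_dir="$3"
    mkdir -p "$target_dir"
    kopia restore "$snapshot_id" "$target_dir" || return 1
    if diff -r "$original_dir" "$target_dir" >/dev/null; then
        echo "restore matches $original_dir"
    else
        echo "restore differs from $original_dir"
        return 2
    fi
}

# Usage (commented out so the block is safe to source):
#   kopia repository connect rclone --remote-path box:kopiabackup
#   kopia snapshot list /home/netid/compute/important
#   restore_and_compare <snapshot-id> /home/netid/compute/important /home/netid/compute/restore-test
```

A reported difference is not necessarily a problem: files changed since the snapshot was taken, or excluded by an ignore rule such as '**/.cache', will show up as differences.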