Backing Up to Box With Kopia
Kopia in combination with rclone is our recommended cloud backup solution. Once you have rclone set up, kopia backups are very simple. Note that kopia is versatile and can be used in other configurations as well: pulling data to your personal machine, for example, or pushing it to your department's servers. Keep in mind that home directories are backed up, so manual backups are only necessary for compute and private storage.
Prerequisites
Load the rclone and kopia modules:
module load rclone kopia
You'll need to load kopia each time you want to use it, and rclone if your repository is on Box.
Set up rclone to work with Box; kopia will use it as a communication layer. You only need to do this once, though you may need to re-authenticate occasionally if you rarely use Box.
Your kopia repository is protected by a password. For security, choose a strong, unique one and store a copy somewhere off the supercomputer. For automation, you'll also need to store it on the supercomputer itself.
You will need a directory (e.g. kopiabackup) in which to store your kopia backups. If you are backing up to Box, use rclone to create it:
rclone mkdir box:kopiabackup
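Before moving on, it can help to confirm that both tools are loaded and that Box still accepts your credentials. The function below is a minimal sketch, not part of kopia or rclone: the name check_backup_prereqs is hypothetical, and it assumes your rclone remote is called box, as in the command above.

```shell
# Sketch only: confirm both tools are loaded and the Box remote responds.
# Assumes the rclone remote is named "box", as in the commands above.
check_backup_prereqs() {
    local remote="${1:-box}"
    local tool
    for tool in rclone kopia; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool (try: module load rclone kopia)" >&2
            return 1
        fi
    done
    # Listing top-level directories fails if the remote is misconfigured
    # or its authentication has lapsed.
    if ! rclone lsd "${remote}:" >/dev/null 2>&1; then
        echo "cannot list ${remote}: (you may need to re-authenticate)" >&2
        return 2
    fi
    echo "ok"
}
```

Running check_backup_prereqs box prints ok when everything is in place; a return code of 1 means a module is not loaded, and 2 means the remote could not be listed.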
Creating a Repository
Before backing up, kopia needs to create the repository where it will store its data. When using rclone, this looks something like:
kopia repository create rclone --remote-path box:kopiabackup
You'll also want to configure kopia, for example by excluding directories that don't need to be backed up and by setting retention policies:
kopia policy set --global --keep-annual=3 --keep-monthly=6 --keep-weekly=4 --keep-daily=7 --add-ignore '**/.cache'
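Policies don't have to be global; kopia policy set also accepts a directory, so ignore rules can be scoped to a single backup source. The following is a sketch under assumptions: the helper name add_backup_ignores and the example patterns are hypothetical, and it expects that you are already connected to the repository.

```shell
# Sketch only: add ignore patterns to one directory's policy, then print
# the resulting policy so the change can be reviewed.
add_backup_ignores() {
    local dir="$1"
    shift
    local pattern
    for pattern in "$@"; do
        # --add-ignore is the same flag used in the global example above.
        kopia policy set "$dir" --add-ignore "$pattern" || return 1
    done
    kopia policy show "$dir"
}
```

For example, add_backup_ignores /home/netid/compute/important '*.tmp' 'build/' would exclude temporary files and a build directory from that source only, leaving the global policy untouched.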
Once a repository is initialized, you can back up anything to it. Since kopia deduplicates, it's advantageous to maintain only one repository, backing up all of your computers and projects to it.
Backing up
Once a repository is created, backing up is very simple. To back up /home/netid/compute/important:
kopia repository connect rclone --remote-path box:kopiabackup
kopia snapshot create /home/netid/compute/important
You only need to connect to the repository once per session.
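If you aren't sure whether you are still connected, you can ask kopia before reconnecting. This sketch assumes that kopia repository status exits with a non-zero code when no repository is connected; the function name ensure_connected is hypothetical.

```shell
# Sketch only: connect to the repository only when kopia reports that
# no repository is currently connected.
ensure_connected() {
    local remote_path="${1:-box:kopiabackup}"
    if kopia repository status >/dev/null 2>&1; then
        echo "already connected"
    else
        # Prompts for the repository password unless KOPIA_PASSWORD is set.
        kopia repository connect rclone --remote-path "$remote_path" \
            && echo "connected"
    fi
}
```

Calling ensure_connected box:kopiabackup before kopia snapshot create avoids a redundant connect and the password prompt that comes with it.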
Automating backups
For automation, you'll need to store your password on the supercomputer, in a file that only you can read:
echo 'my 100% unique password' > ~/.kopia-password
chmod 600 ~/.kopia-password
This done, you can create a script that will automatically connect, back up /home/netid/compute/important and /home/netid/compute/cat-pictures, and prune:
#!/bin/bash --login
module load rclone kopia
KOPIA_PASSWORD="$(cat ~/.kopia-password)" kopia repository connect rclone --remote-path box:kopiabackup
kopia snapshot create /home/netid/compute/important
kopia snapshot create /home/netid/compute/cat-pictures
kopia snapshot expire --delete /home/netid/compute/important /home/netid/compute/cat-pictures
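An unattended script is easier to trust if one failed snapshot doesn't go unnoticed or stop the remaining ones. The function below is a sketch of one way to do that, not a required part of the setup; backup_dirs is a hypothetical name, and it assumes the repository is already connected.

```shell
# Sketch only: snapshot each directory, keep going after a failure, and
# return non-zero if anything went wrong so cron can report it.
backup_dirs() {
    local failures=0
    local total=0
    local dir
    for dir in "$@"; do
        total=$((total + 1))
        if [ ! -d "$dir" ]; then
            echo "skipping missing directory: $dir" >&2
            failures=$((failures + 1))
            continue
        fi
        if ! kopia snapshot create "$dir"; then
            echo "snapshot failed: $dir" >&2
            failures=$((failures + 1))
        fi
    done
    echo "$failures of $total directories failed" >&2
    [ "$failures" -eq 0 ]
}
```

In the script above, the two kopia snapshot create lines could be replaced with a single call such as backup_dirs /home/netid/compute/important /home/netid/compute/cat-pictures || exit 1.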
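To run this script every day at 3:14 AM, you could use cron. When a script is passed to bash as an argument, its #! line is ignored, so request a login shell explicitly to make sure the module command is available. The relevant entry might look something like:
14 3 * * * bash --login /home/netid/kopia-backup.sh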
Restoring backups
Make sure to check your backups often to ensure that you'll be able to restore in case of data loss. Store your password off of the supercomputer so that you can still restore if data there is lost.
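One way to check a backup is to restore it somewhere harmless and look at the result. kopia snapshot list shows the available snapshots and their IDs, and kopia restore copies one of them into a directory. The wrapper below is only a sketch: restore_snapshot is a hypothetical name, and refusing to write into an existing path is a cautious choice of ours rather than kopia's own behavior.

```shell
# Sketch only: restore a snapshot into a new directory, never on top of
# existing files.
restore_snapshot() {
    local snapshot_id="${1:-}"
    local target="${2:-}"
    if [ -z "$snapshot_id" ] || [ -z "$target" ]; then
        echo "usage: restore_snapshot SNAPSHOT_ID TARGET_DIR" >&2
        return 64
    fi
    if [ -e "$target" ]; then
        echo "refusing to restore into existing path: $target" >&2
        return 1
    fi
    # The ID comes from the output of `kopia snapshot list`.
    kopia restore "$snapshot_id" "$target"
}
```

After connecting to the repository, you might run kopia snapshot list /home/netid/compute/important, pick an ID from the output, and call restore_snapshot with that ID and a scratch directory of your choosing, then compare a few files against the originals.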
Last changed on Wed Jan 24 13:56:23 2024