Restic
We now recommend kopia for backups; this page is unmaintained.
If you want a fast and reliable way to back up your data to Box but don't need to access the backed-up data directly on Box, we recommend using Restic (which can use Rclone as a back end). Restic has some advantages over using Rclone alone:
- Chunking: smaller files are aggregated, so backing them up to Box is much faster
- Deduplication: duplicate files or pieces of files are only saved once, so less data has to be transmitted
- Encryption: Restic encrypts everything it stores without extra steps on your part, and the repository remains encrypted even if you move it away from Box
A comprehensive example of using Restic to back up to Box can be found here.
Usage
Restic is installed on the supercomputer, and can be accessed with module load restic.
We maintain a patched version of Restic that is tuned for Box: it uses significantly larger packs, which are the data globs that Restic uses for storage. Since Box only allows 4 file transfer initializations per second per user, using these big pack sizes means more data can go to Box per second and speeds up transfers significantly. One downside is the potential for bigger diffs between backups, but this is outweighed by the much faster data transfer for most users. This patched version of Restic (restic/0.10-big-packs) is the default Restic module; if you want to load the unpatched version, use module load restic/0.10.
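The effect of pack size can be estimated with simple arithmetic: if at most 4 transfers can start per second and each transfer carries one pack, the upload rate is capped at roughly 4 times the pack size per second, whatever the network bandwidth. The sketch below computes that ceiling. The pack sizes it uses (4 and 64 MiB) are hypothetical values chosen for illustration, since this page does not state the sizes used by either module.

```shell
#!/bin/bash
# Rough ceiling on upload rate when at most 4 transfers can start per second.
# Argument: pack size in MiB (hypothetical; the real pack sizes are not stated on this page).
max_upload_rate_mib_per_s() {
    local pack_size_mib=$1
    local starts_per_second=4   # Box limit quoted above
    echo $(( pack_size_mib * starts_per_second ))
}

for size in 4 64; do
    echo "pack size ${size} MiB -> at most $(max_upload_rate_mib_per_s "$size") MiB/s"
done
```

This ignores bandwidth, latency, and per-file overhead, so real transfers will be slower than the ceiling.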
Initializing a Repository
Restic stores its data in repositories, which are simply directories containing the encrypted data. Creating one is simple: assuming that ~/.restic-password is a file containing the password you would like to use to encrypt your data and ~/restic-repo is the directory where you would like the backup stored, creating the repository would look like:
restic -p ~/.restic-password \
-r ~/restic-repo \
init
If you have the rclone module loaded and would like to create the repository on a remote (e.g. box) in a folder named backup/restic-repo, you could use:
restic -p ~/.restic-password \
-r rclone:box:backup/restic-repo \
init
You can omit -p ~/.restic-password, but will have to type in your password if you do so.
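If you do not have a password file yet, one way to create it is sketched below. The helper name make_password_file is hypothetical, and generating the password from /dev/urandom is just one reasonable choice; any strong password you can keep safe will do. The file is created readable only by you, and the helper refuses to overwrite an existing file, because a repository cannot be decrypted without its password.

```shell
#!/bin/bash
# Create a random password file readable only by its owner.
# make_password_file is a hypothetical helper, not part of restic.
make_password_file() {
    local file=$1
    if [ -e "$file" ]; then
        echo "refusing to overwrite $file" >&2
        return 1
    fi
    (
        umask 077                                   # new file gets mode 600
        head -c 32 /dev/urandom | base64 > "$file"  # 32 random bytes, base64-encoded
    )
}
```

For example, make_password_file ~/.restic-password would create the file used throughout this page. Keep a copy of the password somewhere other than the machine being backed up.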
The directory that you use for the Restic repository needs to exist before running restic init, so make sure to create it.
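Creating the directory and initializing the repository can be combined into one step that is safe to run repeatedly. The sketch below is one way to do it, under these assumptions: ensure_repo is a hypothetical helper name, restic (and rclone, for remotes) are already on your PATH, and a failing restic cat config is taken to mean that no repository exists at that location yet.

```shell
#!/bin/bash
# Create the repository location if necessary, then initialize it only once.
# ensure_repo is a hypothetical helper; it assumes restic and rclone are on PATH.
ensure_repo() {
    local repo=$1 password_file=$2
    case "$repo" in
        rclone:*) rclone mkdir "${repo#rclone:}" || return 1 ;;  # e.g. box:backup/restic-repo
        *)        mkdir -p "$repo" || return 1 ;;                # local directory
    esac
    # `restic cat config` succeeds only if a repository can be opened there
    if restic -p "$password_file" -r "$repo" cat config >/dev/null 2>&1; then
        echo "repository already initialized: $repo"
    else
        restic -p "$password_file" -r "$repo" init
    fi
}
```

For example, ensure_repo rclone:box:backup/restic-repo ~/.restic-password. Note that the check also fails if the password is wrong or the remote is unreachable, in which case the init attempt reports the underlying error.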
Creating a Backup
To create a backup of the directory ~/compute/important-stuff with the tag important, you can use:
restic -p ~/.restic-password \
-r rclone:box:backup/restic-repo \
backup ~/compute/important-stuff \
--tag important
Note that you can back up multiple directories to the same repository with no ill effects, and some advantages; since it helps with deduplication, we recommend using a single repository to back up everything.
If you want to exclude certain directories (e.g. caches or other quickly-changing but unimportant files and directories), you can use the --exclude=pattern flag; for example, to exclude directories and files named cache and not-super-important, you could use:
restic -p ~/.restic-password \
-r rclone:box:backup/restic-repo \
backup ~/compute/important-stuff \
--tag important \
--exclude=cache --exclude=not-super-important
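When the list of exclusions grows, repeating --exclude becomes unwieldy. Restic's backup command also accepts --exclude-file, which reads one pattern per line from a file. The following is a minimal sketch of that approach; the helper names and the extra *.tmp pattern are illustrative assumptions, not part of the setup described above.

```shell
#!/bin/bash
# Write one exclude pattern per line.
# The "*.tmp" pattern is an illustrative addition.
write_exclude_file() {
    local file=$1
    cat > "$file" <<'EOF'
cache
not-super-important
*.tmp
EOF
}

# Back up with patterns read from the file instead of repeated --exclude flags.
backup_with_excludes() {
    local exclude_file=$1
    restic -p ~/.restic-password \
        -r rclone:box:backup/restic-repo \
        backup ~/compute/important-stuff \
        --tag important \
        --exclude-file="$exclude_file"
}
```

For example, write_exclude_file ~/.restic-excludes followed by backup_with_excludes ~/.restic-excludes. Keeping the patterns in a file also means a scheduled backup script and a manual run use the same exclusions.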
Restoring from a Snapshot
Restoring from a snapshot is fairly simple; one just specifies a snapshot (or latest) and runs restic restore:
restic -p ~/.restic-password \
-r rclone:box:backup/restic-repo \
restore latest \
--target ~/compute/restored-files \
--tag important
This will restore the latest snapshot with the important tag to ~/compute/restored-files, from which you can copy whichever files you need back to their original location. You can restore older snapshots (which you can list with restic snapshots) by replacing latest with the snapshot ID (there is no need for --tag if an ID is specified):
restic -p ~/.restic-password \
-r rclone:box:backup/restic-repo \
restore 12345 \
--target ~/compute/restored-12345
As when backing up, you can filter which files are processed: restic restore accepts both the --include and --exclude flags. This is useful when you want to restore only a subset of the files from a given snapshot.
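A partial restore might look like the sketch below. The helper name restore_subset and the paths in the usage note are hypothetical; the pattern given to --include is matched against the paths as they are stored in the snapshot.

```shell
#!/bin/bash
# Restore only one path from a snapshot.
# restore_subset is a hypothetical helper. Arguments: snapshot ID (or "latest"),
# path or pattern inside the snapshot, and target directory.
restore_subset() {
    local snapshot=$1 include_path=$2 target=$3
    restic -p ~/.restic-password \
        -r rclone:box:backup/restic-repo \
        restore "$snapshot" \
        --target "$target" \
        --include "$include_path"
}
```

For example, restore_subset 12345 results ~/compute/restored-12345 would restore only the files and directories in snapshot 12345 that match results.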
Mounting a Backup
To recover just a few files from a backup rather than doing a full restore, it can be beneficial to mount a repository thus:
mkdir restic-mount # where the repo will be mounted
restic -p ~/.restic-password \
-r rclone:box:backup/restic-repo \
mount restic-mount
To background the mount process, press ctrl-z then run bg. This will allow you to browse restic-mount, which should have four directories: hosts, ids, snapshots, and tags. They all point to the same data, but with different organization. If you used the --tag flag when backing up, you can navigate to the tags directory and browse by tag; within each directory in tags will be directories named for the date on which the backup was created. To restore a couple of files, simply navigate to the correctly dated directory, find them, and copy them to the corresponding location outside of the mounted directory.
Once you are finished, navigate outside of restic-mount, run fg to foreground the mount process, and press ctrl-c to end it gracefully.
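While the repository is mounted, finding the most recent dated directory for a tag can be scripted. The sketch below assumes that the directory names begin with a date or timestamp that sorts chronologically as plain text; newest_snapshot_dir is a hypothetical helper name.

```shell
#!/bin/bash
# Print the most recent dated directory for a tag inside a mounted repository.
# Assumes directory names start with a digit and sort chronologically as text.
newest_snapshot_dir() {
    local mount_point=$1 tag=$2
    local newest
    newest=$(ls -1 "$mount_point/tags/$tag" 2>/dev/null | grep '^[0-9]' | sort | tail -n 1)
    [ -n "$newest" ] && echo "$mount_point/tags/$tag/$newest"
}
```

For example, ls "$(newest_snapshot_dir restic-mount important)" lists the contents of the newest backup tagged important, from which files can be copied with cp as described above.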
Regular Backups
If you want to run Restic regularly, you can use cron to automate your backups; create a backup script, e.g.:
#!/bin/bash -l
# using `-l` ensures that it will run as a login shell, which is needed for cron
module load rclone restic
restic -p ~/.restic-password \
-r rclone:box:backup/restic-repo \
backup ~/compute
...then run said script regularly by adding it to your crontab with crontab -e:
X Y * * Z /path/to/your/backup/script.sh
...making sure to replace X with the minute (0-59), Y with the hour (0-23), and Z with the day of the week (0-6, where 0 is Sunday) on which you want to run the job. A '*' makes the job run at every value of that field, so if you want the job to run every day, use '*' for Z.
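To make the field ranges concrete, the sketch below builds a crontab line from a minute, an hour, and a day of the week, rejecting values outside 0-59, 0-23, and 0-6 respectively. The helper names in_range and cron_line are hypothetical, and only single numbers or '*' are handled, not the lists and ranges that cron also accepts.

```shell
#!/bin/bash
# Build a crontab line from minute, hour and day of week, rejecting bad values.
# in_range and cron_line are hypothetical helpers; "*" is accepted for any field.
in_range() {
    local value=$1 min=$2 max=$3
    [ "$value" = "*" ] && return 0
    case "$value" in ''|*[!0-9]*) return 1 ;; esac   # must be a plain number
    [ "$value" -ge "$min" ] && [ "$value" -le "$max" ]
}

cron_line() {
    local minute=$1 hour=$2 weekday=$3 script=$4
    in_range "$minute" 0 59 || { echo "bad minute: $minute" >&2; return 1; }
    in_range "$hour" 0 23 || { echo "bad hour: $hour" >&2; return 1; }
    in_range "$weekday" 0 6 || { echo "bad day of week: $weekday" >&2; return 1; }
    echo "$minute $hour * * $weekday $script"
}

cron_line 30 2 '*' /path/to/your/backup/script.sh   # every day at 02:30
cron_line 0 23 0 /path/to/your/backup/script.sh     # Sundays at 23:00
```

The printed lines can be pasted into the editor opened by crontab -e.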
Last changed on Tue Jan 23 12:02:41 2024