Monday, November 20, 2006

Automatic Backups with rdiff-backup

I finally got around to finishing my backup scripts. My goal was to back up the machines on my internal network to an off-site location, through a secure tunnel established with OpenVPN. I elected to go with rdiff-backup, as it permits nifty features like point-in-time recovery (i.e. restore this file/directory/whatever as it was on a certain date at a certain time). I set up a machine in the same physical location as the servers I wanted to back up as a primary backup server (so as to permit speedy recovery without having to go through the tunnel), and then backed that up once a day to the remote off-site machine.

It turned out to be pretty easy.

The first step was to allow automatic backups without human intervention. The way that rdiff-backup works is actually pretty cool. You establish a connection to the remote server using some login facility such as telnet, rlogin, or ssh (I went with ssh for obvious reasons -- it's the most secure), and then execute the rdiff-backup program on the remote machine, telling it to send the files across the network to wherever you want them backed up. This means that rdiff-backup has to be installed on both the "server" and the "clients". Installation is a snap.

The next step is to create a "backupuser" account on all machines, and use SSH public-key authentication to permit secure unattended logins.

This is relatively simple. First, create the account on all machines (e.g. with the adduser command). Next, generate a public/private keypair for the account as follows:

trolius> ssh-keygen2
Generating 2048-bit dsa key pair
1 oOo.oOo.o
Key generated.
2048-bit dsa, user@Local, Wed Mar 22 2002 00:13:43 +0200
Passphrase :
Again :
Private key saved to /home/backupuser/.ssh/id_dsa_2048_a
Public key saved to /home/backupuser/.ssh/id_dsa_2048_a.pub


Note that you might get slightly different output depending on your SSH implementation and version. For the logins to work unattended, the key needs an empty passphrase (or an agent holding the unlocked key). Next, rename the generated private and public keys to whatever your OpenSSH requires them to be (hint: read /etc/ssh/sshd_config for a clue). Copy the public key to the remote machines, keeping the private key on the backup server, and log into each machine once so that you can say "yes" when prompted to accept its host key.
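
For what it's worth, here is a minimal sketch of the "copy the public key" step on a machine running OpenSSH. It assumes the server reads keys from ~/.ssh/authorized_keys (the usual default, but check AuthorizedKeysFile in sshd_config); the function name install_pubkey and the file paths in the usage comment are made up for illustration.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE HOME_DIR
# Hypothetical helper: appends a one-line public key to
# HOME_DIR/.ssh/authorized_keys, creating the directory if needed.
# Assumes the OpenSSH default location for authorized keys.
install_pubkey() {
    pubkey_file=$1
    ssh_dir="$2/.ssh"
    auth_file="$ssh_dir/authorized_keys"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"        # sshd usually refuses keys in lax directories
    touch "$auth_file"

    # Append only if the exact key line is not already present,
    # so running this twice does not duplicate the entry.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Example (run as backupuser on each machine to be backed up,
# after copying the .pub file over):
#   install_pubkey /tmp/backupuser.pub /home/backupuser
```

With OpenSSH, the keypair itself would typically come from something like ssh-keygen -t rsa -b 2048 -N "" rather than ssh-keygen2; the empty -N "" passphrase is what makes the login unattended.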

Finally, back everything up! These commands will do it for you:

/usr/local/bin/rdiff-backup \
backupuser@192.168.0.16::/home/httpd \
/backup/luther/httpd
Note that the backslashes (\) are there to keep the command from running off the edge of the text area on the blog; they just continue the command onto the next line, so you can use them or not, as you wish, when you type the actual commands. This command backs up /home/httpd on "luther" to /backup/luther/httpd on the machine which originated the command (i.e. the one that is receiving the backup).
rdiff-backup -r 3D /backup/luther/httpd/somedir/ \
/home/backupuser/tmp/
This will restore the entire "somedir" directory to the local directory /home/backupuser/tmp/ as it was three days ago. The "r" stands for "restore as of". You can use a variety of formats to specify the date, time, etc. Other acceptable time strings include 5m4s (5 minutes and 4 seconds ago) and 2002-03-05 (March 5th, 2002).
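
If you have more than a couple of directories to back up, the two commands above can be wrapped in small shell functions. This is only a sketch: the function names, the DRY_RUN switch, and the environment variable names are my own inventions, and the only rdiff-backup features it uses are the plain backup form and the -r option shown above.

```shell
#!/bin/sh
# Hypothetical wrapper around the backup and restore commands.
# RDIFF_BACKUP and BACKUP_USER can be overridden from the environment.
RDIFF_BACKUP=${RDIFF_BACKUP:-/usr/local/bin/rdiff-backup}
BACKUP_USER=${BACKUP_USER:-backupuser}

# Run the command, or just print it when DRY_RUN=1.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_dir HOST REMOTE_PATH LOCAL_DEST
# Pulls REMOTE_PATH from HOST into LOCAL_DEST on this machine.
backup_dir() {
    run_or_print "$RDIFF_BACKUP" "${BACKUP_USER}@${1}::${2}" "$3"
}

# restore_dir WHEN BACKUP_PATH TARGET
# Restores BACKUP_PATH as it was at WHEN (e.g. 3D) into TARGET.
restore_dir() {
    run_or_print "$RDIFF_BACKUP" -r "$1" "$2" "$3"
}

# Example:
#   DRY_RUN=1 backup_dir 192.168.0.16 /home/httpd /backup/luther/httpd
#   restore_dir 3D /backup/luther/httpd/somedir/ /home/backupuser/tmp/
```

Setting DRY_RUN=1 is a cheap way to check what would be executed before letting cron loose on it.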

I ran the backups once by hand, to ensure that everything was backed up, and then added each command as an entry in /etc/crontab to run hourly on the "primary" backup server. I then added similar entries to the crontab of the remote backup server, to run once a day at 4:00 AM.
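
To give an idea of what those entries look like, here is a small sketch that prints lines in the /etc/crontab format (five schedule fields, then the user to run as, then the command). The cron_entry function is hypothetical, and running the job as backupuser is an assumption on my part; adjust to whichever account owns the backup directory.

```shell
#!/bin/sh
# cron_entry SCHEDULE USER COMMAND...
# Hypothetical helper: prints one line in /etc/crontab format.
cron_entry() {
    schedule=$1
    user=$2
    shift 2
    printf '%s %s %s\n' "$schedule" "$user" "$*"
}

# Example, hourly on the primary backup server (minute 0 of every hour):
#   cron_entry '0 * * * *' backupuser \
#       /usr/local/bin/rdiff-backup \
#       backupuser@192.168.0.16::/home/httpd /backup/luther/httpd
#
# Example schedule for the remote server, once a day at 4:00 AM:
#   '0 4 * * *'
```

The printed line can be reviewed and then pasted into /etc/crontab; the source host and paths for the remote server's daily entry depend on your own setup, so I have left them out.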

Couldn't be simpler, and I can sleep better at night knowing my data is stored redundantly.