Automated backups in Linux
It’s backup week at the Draconis blog! We all know the importance of keeping our files backed up (and we at Draconis spend a lot of time planning for disasters), but making sure backups run properly can be tedious, and we often put off setting up a good system until it’s too late. This week, I’ll be posting a new installment each day, showing you how to implement your backup system using rsync, a free, cross-platform file transfer tool.
Numerous commercial and free solutions exist for backing up your hard drives – some are even cross-platform. Many are great for the job they do – especially when it comes to burning DVDs or CDs of your important files. But if you’re like me, and you have multiple computers you use on a regular basis (each of which has files you’d like to save), then an automated backup solution, sent to a single computer, may be in order. I happen to have four different platforms to support: Linux, Solaris, Windows, and a Mac laptop. I’ve set up a system where I can sync all my important files to my Linux computer (which has copious amounts of disk space) and burn CD/DVDs on a regular basis. This article will explain how I set it up.
Tools of the Trade
First, as I mentioned, we’ll be using rsync to do all the dirty work. If you’re not familiar with it, rsync is a fantastic command-line tool that lets you quickly copy huge amounts of data over a network by sending only the data that’s different (perfect for backups). For instance, consider a situation where you have 20 gigs of important files, and each day you want to make sure those 20 gigs are backed up. Rather than sending all 20 gigs to your backup computer every day, you could just send the files that changed. Or, better yet, just send the parts of those files that changed: if you added a paragraph to a Microsoft Word document one day, rsync sends roughly the blocks of the file that differ rather than the whole document.
Using rsync, we can run our backups more often (as there is much less transfer to do) without worrying too much about clogging up our network, slowing down our machines, or other common maladies.
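To see the “only send what changed” idea for yourself, here is a small sketch you can run on a single machine. The temp directory and file names are made up for the demonstration, and it assumes rsync is already installed. One caveat: when both sides are local paths, rsync copies each changed file whole; the block-by-block comparison described above applies when the copy goes over a network.

```shell
#!/bin/sh
# Demonstration: a second rsync run only transfers what changed.
# Everything lives in a throwaway temp directory invented for this example.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo "first draft"  > "$work/src/report.txt"
echo "holiday pics" > "$work/src/photos.txt"

if command -v rsync >/dev/null 2>&1; then
    # First run: everything is new, so both files are copied.
    rsync -a "$work/src/" "$work/dest/"

    # Change one file only.
    echo "a new paragraph" >> "$work/src/report.txt"

    # Second run: --itemize-changes lists each item rsync updates.
    changed=$(rsync -a --itemize-changes "$work/src/" "$work/dest/")
    echo "$changed"
else
    echo "rsync is not installed; skipping the demonstration"
fi
```

On the second run, report.txt should be the only file in the list, since photos.txt has not changed since the first copy.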
Getting rsync to work is very simple. It uses a client/server model, which designates one computer as the server, and all other computers connecting to it as clients. For my purposes, my Linux computer (with the copious disk space) was designated the rsync server, which would accept incoming rsync connections. As far as rsync is concerned, it doesn’t always matter which is which: a client can pull changed data from the server just as a client can push changed data to the server. But to keep things simple, let’s call the Linux computer the rsync server, and everything else the client.
Second, the major downside to rsync is that it lives pretty much in the command-line world (i.e. it doesn’t have its own pretty GUI for Windows users). So, we need to improvise. The best tool for that, at least on Windows, is Cygwin. Cygwin provides a POSIX compatibility layer that lets many common Linux/UNIX command-line programs run under Windows (it feels like a Linux environment, though it isn’t an emulator). We can easily install rsync using Cygwin and back up our Windows files with it.
Third, we’re going to use cron (for Linux/UNIX/MacOS) and Scheduled Tasks (for Windows) to run rsync on a schedule, so that all of our important files are backed up on a regular basis. I have several different directories to back up on each of my computers (on Windows, for instance, My Documents and my Application Settings folders are both important). Scheduling the jobs backs up our important files automatically every night, and lets us stagger the start times so the backups don’t all trigger at once.
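On the cron side, staggering just means giving each machine a different start time. Here is a rough sketch of the kind of crontab entries (edited with crontab -e) I have in mind. The script path /usr/local/bin/nightly-backup.sh is a placeholder for whatever command ends up running rsync on that machine, and the times are arbitrary.

```
# Fields: minute hour day-of-month month day-of-week command

# On the first machine: run the backup at 1:00 every night
0 1 * * * /usr/local/bin/nightly-backup.sh

# On the second machine: same job, half an hour later
30 1 * * * /usr/local/bin/nightly-backup.sh
```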
Installing & Configuring Rsync in Linux
Installing rsync in Linux is very easy; in many cases, it’s already there (try typing ‘rsync --version’ at the command line). If it isn’t, either download a package for your distribution of Linux (such as an RPM or PKG), or build rsync from source. Be sure to check out the rsync downloads page.
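If you’d rather script that check, something like the following should do. The helper name have_command is my own invention; it only wraps the shell’s standard command -v lookup.

```shell
#!/bin/sh
# Report whether a command is available on this machine.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command rsync; then
    # The first line of the output names the installed version.
    rsync --version | head -n 1
else
    echo "rsync not found: install a package or build it from source"
fi
```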
For this example, I’ll show how to build it from source. First, download the tarball (the latest version at the time of this writing was 2.6.8) and uncompress it:
tar zxf rsync-2.6.8.tar.gz; cd rsync-2.6.8
The next step is to configure and build it from source. As with most Linux source distributions, simply run:
./configure
make
If everything compiled properly, install rsync (as root) with:
make install
Now that you have rsync installed, the next step is to configure it. As I explained earlier, my Linux computer was designated the rsync server – my other computers would synchronize their data with the Linux box. So, let’s set up an rsync listening server.
The first step is to tell rsync what we’ll be receiving from clients. As clients connect, they specify a particular “profile” (rsync’s own documentation calls it a module) that they are using. For instance, your Windows computer might connect to your backup server and identify itself as “computer1”; the backup server will accept these files and place them on disk as specified by the “computer1” profile. To define these profiles, edit the file /etc/rsyncd.conf (or create it, if it doesn’t exist) in your favorite text editor.
Here’s an example of one of my backup profiles (for the computer I named sigma):
[sigma]
path = /backups/sigma
uid = ryan
gid = ryan
read only = false
auth users = ryan
secrets file = /etc/rsyncd.secrets
This profile is pretty self-explanatory, but let’s go through it anyway. The name in square brackets is the profile name that clients ask for when they connect. The “path” line is the target path (where the files will be stored), and should point to a directory. I have a separate hard drive (mounted at /backups) where I store all my files. You can make this anything you want, so long as it has read/write permissions for the user the files are written as (root or a different user, as we’ll discuss later). The “uid” and “gid” fields specify the user ID and the group ID, respectively, that the files should be stored as; they take effect when the rsync server itself runs as root. The “read only” flag controls whether clients may upload to this profile: it defaults to true, which makes every upload fail, so a profile that receives backups needs it set to false.
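Before any client connects, the target directory has to exist and be writable by the user named in the profile. A minimal sketch of that preparation, to be run as root, follows. The function name prepare_backup_dir and the 750 mode are my own choices, not something rsync requires.

```shell
#!/bin/sh
# Create a profile's target directory and hand it to the uid/gid
# named in the profile. Run as root.
#   $1 = directory, $2 = user, $3 = group
prepare_backup_dir() {
    mkdir -p "$1" &&
    chown "$2:$3" "$1" &&
    chmod 750 "$1"    # owner: full access, group: read-only, others: none
}

# For the "sigma" profile:
#   prepare_backup_dir /backups/sigma ryan ryan
```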
The last two lines of this profile determine the authentication the client must use to connect to this backup server. To authenticate users, rsync uses a file (called /etc/rsyncd.secrets in this case) that holds the user names and passwords clients can connect with. In the case above, I have a user named “ryan”, whose password is stored in /etc/rsyncd.secrets. The users rsync authenticates don’t have anything to do with users on the system: a user can exist to rsync but not necessarily to Linux.
To set up the rsyncd.secrets file, create a new text file in /etc called rsyncd.secrets (really, you can place it just about anywhere with just about any name you like, as long as the “secrets file” line points to it). Each line is a simple user name and password pair, and looks as follows:
[user name]:[password]
So, in the example above, I would have something like:
ryan:password
Each pair is stored, one per line, in this file and defines who is allowed to connect to the rsync server. As stated earlier, these users don’t necessarily have to exist as part of the Linux users system.
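Since this file is stored in plain text, it’s a good idea to keep it protected. Because I run the rsync listening server as root, I can set the permissions of this file to 600 (meaning root can read/write it, but no one else on the system can). rsync also insists on this by default: with its “strict modes” setting left on, it refuses to use a secrets file that other users can read.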
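Locking the file down takes one chmod. The sketch below wraps it in a small function (the name secure_secrets_file is mine) so you can see the resulting permissions straight away.

```shell
#!/bin/sh
# Create the secrets file if needed, then make it readable and writable
# by its owner only.
#   $1 = path to the secrets file
secure_secrets_file() {
    touch "$1" &&
    chmod 600 "$1" &&
    ls -l "$1"    # the listing should start with -rw-------
}

# As root, on the backup server:
#   secure_secrets_file /etc/rsyncd.secrets
```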
And that’s it! Be sure to add as many users as you need, and define as many profiles as you like (I usually do one profile per system, so my backup drive has a single directory for each computer I’m backing up, but you can really set this up any way you like).
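With the configuration in place, the remaining steps are to start the listening server and try a connection. Here is a rough sketch of how that might look. The function names and the host name “backupserver” are hypothetical; --daemon and --config are standard rsync options, and the double colon after the host name is how a client addresses a profile on a listening server.

```shell
#!/bin/sh
# Start the rsync listening server with the given config file
# (defaults to /etc/rsyncd.conf). Run as root to use the default port, 873.
start_backup_daemon() {
    rsync --daemon --config="${1:-/etc/rsyncd.conf}"
}

# Ask a server which profiles it offers; a bare "host::" lists them.
list_backup_profiles() {
    rsync "${1:-localhost}::"
}

# Push a directory into a profile on the backup server.
# Uploads only succeed if the profile has "read only" set to false.
#   $1 = local directory, $2 = rsync user, $3 = server host, $4 = profile
push_to_profile() {
    rsync -az "$1/" "$2@$3::$4/"
}

# Example session (the host name is made up):
#   start_backup_daemon
#   list_backup_profiles localhost
#   push_to_profile "$HOME/Documents" ryan backupserver sigma
```

When pushing, rsync prompts for the password stored for that user in the secrets file; for unattended runs, the client can read it from a file given with --password-file instead.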
Coming up tomorrow…
Tomorrow, I’ll take a look at how to set up rsync in Solaris; it’s surprisingly easy! Keep checking back throughout the week, as there are bound to be a few gems you’ll find interesting. Enjoy!














