Philippe Simard

Easy Automated Snapshot-Style Backups with Linux and Rsync
Written by Philippe Simard   
Tuesday, 30 September 2008 12:33

Taken from Mike Rubel's website: http://www.mikerubel.org/

 

Updates: As of rsync-2.5.6, the --link-dest option is now standard! That can be used instead of the separate cp -al and rsync stages, and it eliminates the ownerships/permissions bug. I now recommend using it. Also, I'm proud to report this article is mentioned in Linux Server Hacks, a new (and very good, in my opinion) O'Reilly book compiled by Rob Flickenger.
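
To make the --link-dest recommendation concrete, here is a minimal sketch run in a scratch directory. The directory names (snap.0, snap.1) and file names are hypothetical, chosen only for illustration; the sketch assumes rsync 2.5.6 or later and shows one rsync call producing a new snapshot whose unchanged files are hard links into the previous one.

```shell
#!/bin/sh
# Scratch demo of --link-dest. All names below are made up for illustration.
work=$(mktemp -d)
mkdir "$work/source" "$work/snapshots"
echo "v1" > "$work/source/file.txt"
echo "never changes" > "$work/source/static.txt"

# Older snapshot: a plain full copy.
rsync -a "$work/source/" "$work/snapshots/snap.1/"

# Modify one file (the size changes, so rsync notices the difference).
echo "version two" > "$work/source/file.txt"

# Newer snapshot: files identical to those in snap.1 become hard links to
# them instead of fresh copies; changed files are copied normally.
rsync -a --delete --link-dest="$work/snapshots/snap.1" \
    "$work/source/" "$work/snapshots/snap.0/"

# The two inode numbers printed here should be identical (one shared file).
ls -i "$work/snapshots/snap.0/static.txt" "$work/snapshots/snap.1/static.txt"
```

Each snapshot directory still looks like a full copy, but only the changed file takes up new space. Remove the scratch directory with rm -rf "$work" when you are done.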


Abstract

This document describes a method for generating automatic rotating "snapshot"-style backups on a Unix-based system, with specific examples drawn from the author's GNU/Linux experience. Snapshot backups are a feature of some high-end industrial file servers; they create the illusion of multiple, full backups per day without the space or processing overhead. All of the snapshots are read-only, and are accessible directly by users as special system directories. It is often possible to store several hours, days, and even weeks' worth of snapshots with slightly more than 2x storage. This method, while not as space-efficient as some of the proprietary technologies (which, using special copy-on-write filesystems, can operate on slightly more than 1x storage), makes use of only standard file utilities and the common rsync program, which is installed by default on most Linux distributions. Properly configured, the method can also protect against hard disk failure and root compromises, and can even back up a network of heterogeneous desktops automatically.

Motivation

Note: what follows is the original sgvlug DEVSIG announcement.

Ever accidentally delete or overwrite a file you were working on? Ever lose data due to hard-disk failure? Or maybe you export shares to your Windows-using friends--who proceed to get Outlook viruses that twiddle a digit or two in all of their .xls files. Wouldn't it be nice if there were a /snapshot directory that you could go back to, which had complete images of the file system at semi-hourly intervals all day, then daily snapshots back a few days, and maybe a weekly snapshot too? What if every user could just go into that magical directory and copy deleted or overwritten files back into "reality", from the snapshot of choice, without any help from you? And what if that /snapshot directory were read-only, like a CD-ROM, so that nothing could touch it (except maybe root, but even then not directly)?

Best of all, what if you could make all of that happen automatically, using only one extra, slightly larger hard disk? (Or one extra partition, which would protect against all of the above except disk failure.)

In my lab, we have a proprietary NetApp file server which provides that sort of functionality to the end-users. It provides a lot of other things too, but it cost as much as a luxury SUV. It's quite appropriate for our heavy-use research lab, but it would be overkill for a home or small-office environment. But that doesn't mean small-time users have to do without!

I'll show you how I configured automatic, rotating snapshots on my $80 used Linux desktop machine (which is also a file, web, and mail server) using only a couple of one-page scripts and a few standard Linux utilities that you probably already have.

I'll also propose a related strategy which employs one (or two, for the wisely paranoid) extra low-end machines for a complete, responsible, automated backup strategy that eliminates tapes and manual labor and makes restoring files as easy as "cp".

Using rsync to make a backup

The rsync utility is a very well-known piece of GPL'd software, written originally by Andrew Tridgell and Paul Mackerras. If you have a common Linux or UNIX variant, then you probably already have it installed; if not, you can download the source code from rsync.samba.org. Rsync's specialty is efficiently synchronizing file trees across a network, but it works fine on a single machine too.

Basics

Suppose you have a directory called source, and you want to back it up into the directory destination. To accomplish that, you'd use:

rsync -a source/ destination/

(Note: I usually also add the -v (verbose) flag so that rsync tells me what it's doing.) This command is equivalent to:

cp -a source/. destination/

except that it's much more efficient if there are only a few differences.
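
If you'd like to try that out safely, here is a minimal sketch that runs the basic command in a scratch directory. The file and directory names are made up for illustration and are not part of the original example.

```shell
#!/bin/sh
# Scratch demo of "rsync -a source/ destination/".
# The directory and file names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/source/docs"
echo "hello" > "$work/source/docs/note.txt"

# First run: destination/ does not exist yet, so everything is copied.
rsync -a "$work/source/" "$work/destination/"

# Change one file and run again; -v lists what rsync sends this time.
echo "hello again" > "$work/source/docs/note.txt"
rsync -av "$work/source/" "$work/destination/"

# diff -r prints nothing and exits 0 when the two trees are identical.
diff -r "$work/source" "$work/destination" && echo "trees match"
```

On the second run, the verbose listing should mention only the changed file rather than the whole tree. Remove the scratch directory with rm -rf "$work" afterwards.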

Just to whet your appetite, here's a way to do the same thing as in the example above, but with destination on a remote machine, over a secure shell:

rsync -a -e ssh source/ username@remotemachine.com:/path/to/destination/

Trailing Slashes Do Matter...Sometimes

This isn't really an article about rsync, but I would like to take a momentary detour to clarify one potentially confusing detail about its use. You may be accustomed to commands that don't care about trailing slashes. For example, if a and b are two directories, then cp -a a b is equivalent to cp -a a/ b/. However, rsync does care about the trailing slash, but only on the source argument. For example, let a and b be two directories, with the file foo initially inside directory a. Then this command:

rsync -a a b

produces b/a/foo, whereas this command:

rsync -a a/ b

produces b/foo. The presence or absence of a trailing slash on the destination argument (b, in this case) has no effect.
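
The difference is easy to verify for yourself. The following sketch repeats the two commands in a scratch directory; the destination names b1 and b2 are my own, used so both results can be seen side by side.

```shell
#!/bin/sh
# Scratch demo of rsync's trailing-slash rule on the source argument.
# b1 and b2 are hypothetical names standing in for "b" in the text.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir a b1 b2
touch a/foo

rsync -a a b1     # no trailing slash: the directory a itself is copied
rsync -a a/ b2    # trailing slash: only the contents of a are copied

# Prints:
#   b1/a/foo
#   b2/foo
find b1 b2 -type f | sort
```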

Using the --delete flag

If a file was originally in both source/ and destination/ (from an earlier rsync, for example), and you delete it from source/, you probably want it to be deleted from destination/ on the next rsync. However, the default behavior is to leave the copy at destination/ in place. Assuming you want rsync to delete any file from destination/ that is not in source/, you'll need to use the --delete flag:

rsync -a --delete source/ destination/
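
Since --delete removes files, it is worth watching it work on throwaway data first. This is a minimal sketch with made-up file names; the dry-run step (-n) is my addition, not part of the original recipe, and simply previews what would be removed.

```shell
#!/bin/sh
# Scratch demo of --delete. File names are hypothetical.
work=$(mktemp -d)
mkdir "$work/source"
touch "$work/source/keep.txt" "$work/source/old.txt"
rsync -a "$work/source/" "$work/destination/"

# Delete a file from the source side only.
rm "$work/source/old.txt"

# Default behavior: the stale copy in destination/ is left alone.
rsync -a "$work/source/" "$work/destination/"
ls "$work/destination"    # still lists old.txt

# Preview: -n (--dry-run) with -v reports the deletion without doing it.
rsync -a -v -n --delete "$work/source/" "$work/destination/"

# Real run: old.txt is now removed from destination/ as well.
rsync -a --delete "$work/source/" "$work/destination/"
ls "$work/destination"    # lists keep.txt only
```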

Be lazy: use cron

One of the toughest obstacles to a good backup strategy is human nature; if there's any work involved, there's a good chance backups won't happen. (Witness, for example, how rarely my roommate's home PC was backed up before I created this system). Fortunately, there's a way to harness human laziness: make cron do the work.

To run the rsync command with --delete from the previous section every morning at 4:20 AM, for example, edit the root cron table (as root):

crontab -e

Then add the following line:

20 4 * * * rsync -a --delete source/ destination/

Finally, save the file and exit. The backup will happen every morning at precisely 4:20 AM, and root will receive the output by email. Don't copy that example verbatim, though; you should use full path names (such as /usr/bin/rsync and /home/source/) to remove any ambiguity.
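
One way to follow that advice is to put the command in a small wrapper script and point cron at the script. The sketch below is only an illustration: the script location (/usr/local/bin/nightly-backup.sh) and the destination path (/backup/destination/) are hypothetical, and /usr/bin/rsync is an assumption about where rsync lives on your system (check with "command -v rsync").

```shell
#!/bin/sh
# Hypothetical wrapper, saved for example as /usr/local/bin/nightly-backup.sh.
# A matching root crontab entry might look like:
#   20 4 * * * /usr/local/bin/nightly-backup.sh /home/source/ /backup/destination/

# cron runs jobs with a minimal environment, so name rsync by its full path.
# /usr/bin/rsync is an assumption; override RSYNC if yours lives elsewhere.
RSYNC=${RSYNC:-/usr/bin/rsync}

backup() {
    # $1 = source directory, $2 = destination directory (use absolute paths).
    # Force exactly one trailing slash so the contents of $1 are mirrored.
    src=${1%/}/
    dest=${2%/}/
    "$RSYNC" -a --delete "$src" "$dest"
}

# Run only when both directories are given on the command line.
if [ "$#" -eq 2 ]; then
    backup "$1" "$2"
fi
```

Keeping the paths on the crontab line means the same script can serve several backup jobs; hard-coding them in the script would work just as well.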



 

