Courtesy of Do-It-Yourself Backup System Using Rsync.
Written by Kevin Korb as a presentation for GOLUG. Presented 2010-03-02.
When you run rsync, you tell it to back up the live file system into a new, empty directory and to look to the previous backup for files that have already been backed up. Whenever rsync finds a new file, it copies that file over. Whenever it finds a modified file, it copies over the differences, making a new file in the new backup directory but leaving the old version of the file as it was in the old backup directory. When rsync finds a file that has not changed since the last backup, it simply hard links it into the new backup directory, which requires almost no additional disk space.
There is a wide variety of options that can be used with rsync to tailor it to your specific needs, but here is what my system uses by default:
rsync --archive --one-file-system --hard-links \
--human-readable --inplace --numeric-ids --delete \
--delete-excluded --exclude-from=excludes.txt \
--link-dest=/backup/rsync/asylum/_home_asylum.2005-07-25.15-32-42 \
asylum:/home/asylum/ /backup/rsync/asylum/_home_asylum.incomplete/
I also add --verbose --progress --itemize-changes when I am watching the backup run instead of using it from a cron job.
Now I will explain the components of that rather long command...
rsync: Duh, the rsync command ;)
--archive: This causes rsync to back up (they call it "preserve") things like file permissions, ownerships, and timestamps.
--one-file-system: This causes rsync to NOT recurse into other file systems. If you use this like I do, then you must back up each file system (mount point) one at a time. The alternative is to simply back up / and exclude things you don't want to back up (like /proc, /sys, /tmp, and any network or removable media mounts).
--hard-links: This causes rsync to maintain hard links that are on the server being backed up. This has nothing to do with the hard links used during the rotation.
--human-readable: This tells rsync to output numbers of bytes with K, M, G, or T suffixes instead of just long strings of digits.
--inplace: This tells rsync to update files on the target at the block level instead of building a temporary replacement file. It is a significant performance improvement; however, it should not be used for anything other than backups, or if your version of rsync is old enough that --inplace is incompatible with --link-dest.
--numeric-ids: This tells rsync to not attempt to translate UIDs to user names or GIDs to group names (or back again). This is very important when doing backups and restores. If you are doing a restore from a live CD such as SystemRescueCD or Knoppix, your file ownerships will be completely screwed up if you leave this out.
--delete: This tells rsync to delete files from the backup that are no longer on the server. This is less important when using --link-dest, because you should be backing up to an empty directory, so there would be nothing to delete. I include it anyway because the *.incomplete directory I am backing up to may be left over from a previous failed run and may have things to delete.
--delete-excluded: This tells rsync that it can delete stuff from a previous backup that is now within the excluded list.
--exclude-from=excludes.txt: This is a plain text file with a list of paths that I do not want backed up. The format of the file is simply one path per line. I tend to add things that are always changing but unimportant, such as log and temp files. If you have a ~/.gvfs entry you should add it too, as it will otherwise cause a non-fatal error.
--link-dest=/backup/rsync/asylum/_home_asylum.2005-07-25.15-32-42: This is the most recent complete backup that was current when we started. We are telling rsync to link to this backup for any files that have not changed.
asylum:: This is the host name that rsync will ssh to.
/home/asylum/: This is the path on the server that is to be backed up. Note that the trailing slash IS significant.
/backup/rsync/asylum/_home_asylum.incomplete/: This is the empty directory we are going to back up to. It should be created with mkdir -p first. If the directory exists from a previous failed or aborted backup, it will simply be completed. This trailing slash is not significant, but I prefer to have it.
--verbose: This causes rsync to list each file that it touches.
--progress: This adds to the verbosity and tells rsync to print the percentage completed and the transfer speed while transferring each file.
--itemize-changes: This adds to the file list a string of characters that explains why rsync believes each file needs to be touched. See the man page for the explanation of the characters.
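Putting the pieces above together, one backup run might be sketched as the shell function below. This is only an illustration under stated assumptions, not the script my system actually uses: the function name run_backup, its three arguments, and the way it picks the newest finished backup (a lexical sort of the timestamped directory names) are all hypothetical. It creates the .incomplete directory, points --link-dest at the most recent complete backup if there is one, and gives the directory a timestamped name only after rsync succeeds.

```shell
#!/bin/sh
# Sketch of one backup run. run_backup and its arguments are
# illustrative names, not part of rsync or of the original system.

run_backup() {
    src=$1     # e.g. asylum:/home/asylum/  (the trailing slash matters)
    root=$2    # e.g. /backup/rsync/asylum  (use an absolute path)
    name=$3    # e.g. _home_asylum

    incomplete="$root/$name.incomplete"
    mkdir -p "$incomplete" || return 1

    # Newest finished backup, if any. Names of the form
    # NAME.YYYY-MM-DD.HH-MM-SS sort correctly as plain text.
    # Assumption: backup directory names contain no whitespace.
    prev=$(ls -d "$root/$name".[0-9]* 2>/dev/null | sort | tail -n 1)

    # Build the option list in "$@" so optional flags can be appended.
    set -- --archive --one-file-system --hard-links --human-readable \
           --inplace --numeric-ids --delete --delete-excluded

    # Only pass the exclude list if one exists in the current directory.
    [ -f excludes.txt ] && set -- "$@" --exclude-from=excludes.txt

    # The very first run has nothing to link against.
    [ -n "$prev" ] && set -- "$@" --link-dest="$prev"

    rsync "$@" "$src" "$incomplete/" || return 1

    # Only a run that finished cleanly gets a timestamped name; a failed
    # run leaves NAME.incomplete behind to be completed next time.
    mv "$incomplete" "$root/$name.$(date +%Y-%m-%d.%H-%M-%S)"
}
```

A call such as run_backup asylum:/home/asylum/ /backup/rsync/asylum _home_asylum would then correspond to the command shown earlier. Note that this sketch treats any non-zero rsync exit status as a failure, which is stricter than some setups want.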
Links