@rmi1974
Last active March 29, 2020 15:45
Do-It-Yourself Backup System Using Rsync #backup #rsync #commandlinefu

Do-It-Yourself Backup System Using Rsync

Courtesy of Do-It-Yourself Backup System Using Rsync.

Written by Kevin Korb as a presentation for GOLUG. Presented 2010-03-02.

When you run rsync, you tell it to back up the live file system into a new, empty directory and to look to the previous backup for files that have already been backed up. Whenever rsync finds a new file, it copies that file over. Whenever it finds a modified file, it copies over the differences, making a new file in the new backup directory but leaving the old version of the file as it was in the old backup directory. When rsync finds a file that has not changed since the last backup, it simply hard links it into the new backup directory, requiring almost no additional disk space. There is a wide variety of options that can be used with rsync to tailor it to your specific needs, but here is what my system uses by default:

rsync --archive --one-file-system --hard-links \
  --human-readable --inplace --numeric-ids --delete \
  --delete-excluded --exclude-from=excludes.txt \
  --link-dest=/backup/rsync/asylum/_home_asylum.2005-07-25.15-32-42 \
  asylum:/home/asylum/ /backup/rsync/asylum/_home_asylum.incomplete/  

I also add --verbose --progress --itemize-changes when I am watching the backup run instead of running it from a cron job. Now I will explain the components of that rather long command...

rsync: Duh, the rsync command ;)

    --archive: This causes rsync to back up (they call it "preserve") things like file permissions, ownerships, and timestamps.
    --one-file-system: This causes rsync to NOT recurse into other file systems. If you use this like I do, then you must back up each file system (mount point) one at a time. The alternative is to simply back up / and exclude things you don't want to back up (like /proc, /sys, /tmp, and any network or removable media mounts).
    --hard-links: This causes rsync to maintain hard links that are on the server being backed up. This has nothing to do with the hard links used during the rotation.
    --human-readable: This tells rsync to output numbers of bytes with K, M, G, or T suffixes instead of just long strings of digits.
    --inplace: This tells rsync to update files on the target at the block level instead of building a temporary replacement file. It is a significant performance improvement; however, it should not be used for anything other than backups, or if your version of rsync is old enough that --inplace is incompatible with --link-dest.
    --numeric-ids: This tells rsync not to attempt to translate UID <-> user name or GID <-> group name. This is very important when doing backups and restores. If you are doing a restore from a live CD such as SystemRescueCD or Knoppix, your file ownerships will be completely screwed up if you leave this out.
    --delete: This tells rsync to delete files from the backup that are no longer on the server. This is less important when using --link-dest, because you should be backing up to an empty directory, so there would be nothing to delete. However, I include it because the *.incomplete directory I am backing up to may actually be left over from a previous failed run and may have things to delete.
    --delete-excluded: This tells rsync that it can delete stuff from a previous backup that is now within the excluded list.
    --exclude-from=excludes.txt: This is a plain text file with a list of paths that I do not want backed up. The format of the file is simply one path per line. I tend to add things that are always changing but unimportant, such as log and temp files. If you have a ~/.gvfs entry, you should add it too, as it will otherwise cause a non-fatal error.
    --link-dest=/backup/rsync/asylum/_home_asylum.2005-07-25.15-32-42: This is the most recent complete backup that was current when we started. We are telling rsync to link to this backup for any files that have not changed.
    asylum:: This is the host name that rsync will ssh to.
    /home/asylum/: This is the path on the server that is to be backed up. Note that the trailing slash IS significant.
    /backup/rsync/asylum/_home_asylum.incomplete/: This is the empty directory we are going to back up to. It should be created with mkdir -p first. If the directory exists from a previous failed or aborted backup, that backup will simply be completed. This trailing slash is not significant, but I prefer to have it.
    --verbose: This causes rsync to list each file that it touches.
    --progress: This adds to the verbosity and tells rsync to print the completion percentage and transfer speed while transferring each file.
    --itemize-changes: This adds to the file list a string of characters that explains why rsync believes each file needs to be touched. See the man page for the explanation of the characters.
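
To see why an unchanged file costs almost no extra space, here is a minimal sketch of what a hard link is, using only ln and ls in a scratch directory. The directory names backup.1 and backup.2 are made up for the demo, and ln stands in for what rsync does by itself when --link-dest finds an unchanged file.

```shell
#!/bin/sh
# Hypothetical demo in a scratch directory; none of these paths come from the article.
demo=$(mktemp -d)
mkdir "$demo/backup.1" "$demo/backup.2"
echo "unchanged data" > "$demo/backup.1/file.txt"

# Give the same file a second name in the newer backup directory.
# No file data is copied; both names point at one inode.
ln "$demo/backup.1/file.txt" "$demo/backup.2/file.txt"

# ls -i prints the inode number first; the two numbers are identical.
inode1=$(ls -i "$demo/backup.1/file.txt" | awk '{print $1}')
inode2=$(ls -i "$demo/backup.2/file.txt" | awk '{print $1}')
echo "inode in backup.1: $inode1"
echo "inode in backup.2: $inode2"

# Deleting the older backup removes only its name for the file.
# The data stays reachable through the newer backup.
rm -r "$demo/backup.1"
cat "$demo/backup.2/file.txt"   # prints: unchanged data
```

This is also why an old backup directory can be deleted without damaging newer ones: the data is freed only when the last name pointing at it is removed.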

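Putting the pieces together, one backup cycle might look like the sketch below. It is not the author's script: the function names, the test for a terminal on stdout, the way the newest backup is found, and the final rename of the *.incomplete directory are all assumptions made for illustration. Only the rsync options themselves come from the command above.

```shell
#!/bin/sh
# Sketch of one backup cycle. Function names and the rename step are
# hypothetical; the rsync options are the ones explained in the list above.

# Print the newest completed backup under root $1 with name prefix $2,
# or print nothing if there is none. Names such as
# _home_asylum.2005-07-25.15-32-42 sort chronologically, so the last
# match of the glob is the most recent backup.
latest_backup() {
    latest=
    for dir in "$1/$2".[0-9]*; do
        if [ -d "$dir" ]; then
            latest=$dir
        fi
    done
    if [ -n "$latest" ]; then
        printf '%s\n' "$latest"
    fi
}

# run_backup SOURCE BACKUP_ROOT PREFIX
run_backup() {
    src=$1
    root=$2
    prefix=$3
    incomplete="$root/$prefix.incomplete"
    prev=$(latest_backup "$root" "$prefix")

    # Assumption: a terminal on stdout means someone is watching the run;
    # under cron stdout is not a terminal, so the extra flags are skipped.
    watch_opts=
    if [ -t 1 ]; then
        watch_opts="--verbose --progress --itemize-changes"
    fi

    mkdir -p "$incomplete" || return 1

    # $watch_opts is deliberately unquoted so it splits into separate flags.
    # --link-dest is passed only when a previous backup exists.
    rsync --archive --one-file-system --hard-links \
        --human-readable --inplace --numeric-ids --delete \
        --delete-excluded --exclude-from=excludes.txt \
        $watch_opts \
        ${prev:+"--link-dest=$prev"} \
        "$src" "$incomplete/" || return 1

    # Assumed final step: give the finished backup its timestamped name.
    mv "$incomplete" "$root/$prefix.$(date +%Y-%m-%d.%H-%M-%S)"
}
```

With these definitions loaded, the example from this article would be run as run_backup asylum:/home/asylum/ /backup/rsync/asylum _home_asylum. The backup root should be an absolute path, because rsync resolves a relative --link-dest against the destination directory. If rsync fails, the *.incomplete directory is left in place so the next run can complete it.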