@drkarl
Last active October 30, 2024 19:38
Ask HN: Best Linux server backup system?

Linux Backup Solutions

I've been looking for the best Linux backup system and reading lots of HN comments.

Instead of listing the pros and cons of every backup system, I'll just list deal-breakers that would disqualify them.

I'd also like you, the HN community, to add more deal-breakers for these or other backup systems if you know of any. At the same time, if you have data to disprove any of the deal-breakers listed here (benchmarks, or information about something that was true for older releases but has been fixed in newer ones), please share it so I can edit this list accordingly.

  • It has a lot of management overhead and that's a problem if you don't have time for a full time backup administrator.
  • It mainly consists of using tar for backups, which is pretty inflexible by modern standards.
  • The enterprise web interface is OK but it's had so many bugs it's not funny.
  • Backups are very slow.
  • Restores are slow and painful to manage.
  • I haven't found it to be great when trying to integrate with puppet / automation frameworks.
  • Too complex to configure
  • Stores its catalog separately from the backups, so the catalog itself needs to be backed up
  • Doesn't deduplicate
  • Relies on clock accuracy
  • Can't resume an interrupted backup
  • Retention policy
  • Doesn't do encryption
  • File level, not block level deduplication
  • Really slow for large backups (from a benchmark between obnam and attic)
  • To improve performance:
lru-size=1024
upload-queue-size=512

as per: http://listmaster.pepperfish.net/pipermail/obnam-support-obnam.org/2014-June/003086.html
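As a hedged illustration of where those two tuning options go: obnam reads an INI-style config file, typically `~/.obnam.conf`, with a `[config]` section. The repository path below is invented; the option names are the ones from the mailing-list post above.

```shell
# Sketch: write the obnam tuning options from the mailing-list post above
# into an INI-style config file. The [config] section name and option names
# are obnam's; the repository path here is illustrative only.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[config]
repository = /srv/backups/obnam-repo
lru-size = 1024
upload-queue-size = 512
EOF
cat "$conf"
```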

  • Client side encryption turns off delta differencing
  • Can't purge old backups
  • Doesn't encrypt backups (well, there is encbup)
  • Slow restore performance on large backups? (Sorry Colin aka cperciva)
  • This was a really strong candidate until I read some comments on HN about the slow performance to restore large backups.
  • If this has changed in a recent version or someone has benchmarks to prove or disprove it, it would be really valuable.
  • Slow restore performance on large backups?
  • This was also a really strong candidate until I read some comments on HN about the slow performance to restore large backups.
  • If this has changed in a recent version or someone has benchmarks to prove or disprove it, it would be really valuable.
  • It doesn't do encrypted backups
  • No support for encryption
  • Just included here because I knew someone would mention it in the comments. It's Mac OS X only. This list is for Linux server backup systems.

Other contenders (of which I don't have references or information):

Also, Tarsnap scores really high on encryption and deduplication, but it has three important cons:

  • Not having control of the server where your backups are stored
  • Bandwidth charges make your costs unpredictable
  • The so-called Colin-Percival-gets-hit-by-a-bus scenario

Attic has some really good comments on HN and good blog posts, and doesn't have any particular deal-breaker (for now; if you find one, please share it with us), so for now it is the most promising.

Roll your own

Some HN users have posted the simple scripts they use. The scripts usually build on some combination of rsync, hard-linked snapshot rotation, and cron.

mikhailian's script

FROM=/etc
TO=/var/backups
LINKTO=--link-dest=$TO/`/usr/bin/basename $FROM`.1
OPTS="-a --delete --delete-excluded"
NUMBER_OF_BACKUPS=8

find $TO -maxdepth 1 -type d -name "`basename $FROM`.[0-9]"| sort -rn| while read dir
do
        this=`expr match "$dir" '.*\([0-9]\)'`; 
        let next=($this+1)%$NUMBER_OF_BACKUPS;
        basedirname=${dir%.[0-9]}
        if [ $next -eq 0 ] ; then
                 rm -rf $dir
        else
                 mv $dir $basedirname.$next
        fi
done
rsync $OPTS $LINKTO $FROM/ $TO/`/usr/bin/basename $FROM.0`
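A rotation script like this is normally driven by cron. As a hedged illustration (the script path, schedule, and log file are all invented), a nightly crontab entry could look like:

```
# m  h  dom mon dow  command
30   2  *   *   *    /usr/local/sbin/rotate-backup.sh >> /var/log/rotate-backup.log 2>&1
```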

zx2c4's script

zx2c4@thinkpad ~ $ cat Projects/remote-backup.sh 
    #!/bin/sh
    
    cd "$(readlink -f "$(dirname "$0")")"
    
    if [ $UID -ne 0 ]; then
            echo "You must be root."
            exit 1
    fi
    
    umount() {
            if ! /bin/umount "$1"; then
                    sleep 5
                    if ! /bin/umount "$1"; then
                            sleep 10
                            /bin/umount "$1"
                    fi
            fi
    }
    
    unwind() {
            echo "[-] ERROR: unwinding and quitting."
            sleep 3
            trace sync
            trace umount /mnt/mybackupserver-backup
            trace cryptsetup luksClose mybackupserver-backup || { sleep 5; trace cryptsetup luksClose mybackupserver-backup; }
            trace iscsiadm -m node -U all
            trace kill %1
            exit 1
    }
    
    trace() {
            echo "[+] $@"
            "$@"
    }
    
    RSYNC_OPTS="-i -rlptgoXDHxv --delete-excluded --delete --progress $RSYNC_OPTS"
    
    trap unwind INT TERM
    trace modprobe libiscsi
    trace modprobe scsi_transport_iscsi
    trace modprobe iscsi_tcp
    iscsid -f &
    sleep 1
    trace iscsiadm -m discovery -t st -p mybackupserver.somehost.somewere -P 1 -l
    sleep 5
    trace cryptsetup --key-file /etc/dmcrypt/backup-mybackupserver-key luksOpen /dev/disk/by-uuid/10a126a2-c991-49fc-89bf-8d621a73dd36 mybackupserver-backup || unwind
    trace fsck -a /dev/mapper/mybackupserver-backup || unwind
    trace mount -v /dev/mapper/mybackupserver-backup /mnt/mybackupserver-backup || unwind
    trace rsync $RSYNC_OPTS --exclude=/usr/portage/distfiles --exclude=/home/zx2c4/.cache --exclude=/var/tmp / /mnt/mybackupserver-backup/root || unwind
    trace rsync $RSYNC_OPTS /mnt/storage/Archives/ /mnt/mybackupserver-backup/archives || unwind
    trace sync
    trace umount /mnt/mybackupserver-backup
    trace cryptsetup luksClose mybackupserver-backup
    trace iscsiadm -m node -U all
    trace kill %1

pwenzel suggests

  rm -rf backup.3
  mv backup.2 backup.3
  mv backup.1 backup.2
  cp -al backup.0 backup.1
  rsync -a --delete source_directory/  backup.0/

and https://gist.github.com/ei-grad/7610406
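The `cp -al` step in pwenzel's rotation is what makes it cheap: the new snapshot is a tree of hard links, so unchanged files share an inode with the previous snapshot and consume no extra file data. A minimal, self-contained demonstration (temporary paths, GNU coreutils `stat` assumed):

```shell
# Show that `cp -al` copies a directory tree as hard links: the "copied"
# file shares an inode with the original, so its link count rises to 2
# and no extra file data is stored.
tmp=$(mktemp -d)
mkdir "$tmp/backup.0"
echo "unchanged data" > "$tmp/backup.0/file.txt"

cp -al "$tmp/backup.0" "$tmp/backup.1"       # rotate: link, don't copy

links=$(stat -c '%h' "$tmp/backup.1/file.txt")
echo "link count: $links"                    # 2: both snapshots share the inode
```

Only files that rsync subsequently rewrites in `backup.0` get new inodes; everything else stays a shared hard link across all the numbered snapshots.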

Meta-backup solutions (which use several backup solutions)

@xenithorb
Copy link

Thanks @ThomasWaldmann, I made it to the end of the comments just to look for updates. As it's now 2017, a lot of the solutions mentioned here are very out of date and their lack of development is evident. (Almost half are no longer being maintained, it seems.) Some, like attic, haven't had any new commits in years, unfortunately.

Any more updates are appreciated. (Comments about stable software not needing commits and such notwithstanding, the focus is to trust in something that will continue to work)

@DJsupermix
Copy link

Well, it's all about the security issues we're now facing in the modern world. We tested Bacula's solutions, like those at https://www.baculasystems.com/enterprise-backup-solution-with-bacula-systems/easy-and-scalable-windows-backup, though we were neither satisfied nor dissatisfied, as there have luckily been no failures yet. I'd be interested in feedback from anyone who has used their products; I want to be sure everything will be OK when the day comes.

@markfox1
Copy link

Thanks for the census @drkarl. For our Windows workstations, we settled on Duplicati, which uses a block-based deduplication algorithm to allow incremental backups to local, remote, or cloud object storage indefinitely. So the first backup is the big one, but all backups are incremental from there. It is open source and runs on the major unices. We are experimenting with it under Linux, and it does feel a bit weird running a C# program under Linux, but until someone writes an open source program with similar abilities to hashbackup, it seems to be the only game in town.

@tomwaldnz
Copy link

tomwaldnz commented Jun 27, 2018

A problem that I've found with Attic / Borg Backup (Borg is a fork that's more actively maintained) is that when you run it, the old backup file is either renamed or deleted and a new backup file is created. This means that if you're using metered storage or bandwidth (e.g. Amazon S3) you'll get charged more: you effectively upload all your data every night. Someone else found the same thing, here, but they found this behavior changes for very large backups, though I don't know what the threshold is.
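For context, a typical Borg cycle looks roughly like the following. This is a hedged sketch, not a recommendation: the paths are temporary, the source directory is a toy so the block is runnable, the retention numbers are invented, and the guard makes it a no-op on machines without borg installed.

```shell
# Illustrative Borg create/prune cycle (all paths and retention numbers
# are examples). Guarded: does nothing if borg is not installed.
repo=$(mktemp -d)/repo
src=$(mktemp -d)
echo "example data" > "$src/file.txt"

if command -v borg >/dev/null 2>&1; then
    borg init --encryption=none "$repo"                    # one-time repo setup
    borg create --stats "$repo::{hostname}-{now}" "$src"   # nightly archive
    borg prune --keep-daily=7 --keep-weekly=4 "$repo"      # retention policy
fi
```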

Duplicati has a lot of potential. It's just come out of years of alpha testing, and is now a beta. I found an issue a year ago that prevented restores of the data when a non-standard block size was being used - which I did because I was backing up large video files. This is a logged bug, which hasn't been fixed yet. The author thinks it's something that's easy to work around, but any bug that prevents restores is a big red flag for me.

@rsyncnet
Copy link

rsyncnet commented Jul 2, 2019

I hope that it is interesting and valuable to point out that rsync.net supports, among other things:

  • rclone
  • restic
  • borg / attic

... which is to say that an rsync.net cloud storage account is stock/standard OpenSSH, with SFTP/SCP, so borg and restic "just work" using the SFTP transport. Further, we installed and maintain an rclone binary on our platform[1][2] which means we're not just an rclone SFTP target, but you can execute rclone and run it from an rsync.net account (to go fetch from gdrive or S3 or whatever).

And, of course, any other thing that runs over SSH/SFTP/SCP (from rsync to filezilla) will work.

The entire platform is ZFS so you have protection from corruption as well as optional snapshots.[3]

[1] rclone/rclone#3254

[2] https://twitter.com/njcw/status/1144534055817502721

[3] https://www.rsync.net/platform.html
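Concretely, pointing restic at an SFTP account like the one described above is just its sftp backend. This is a hedged sketch: the user, host, and repository path are placeholders, and the block is guarded twice (restic installed AND a real account exported in `RSYNC_NET_USER`), so the placeholder host is never actually contacted.

```shell
# Hedged sketch: u12345 and the host below are placeholder values for an
# rsync.net-style account; restic's sftp backend needs nothing server-side
# beyond stock OpenSSH. The guard keeps this a no-op unless restic is
# installed and you export a real account in RSYNC_NET_USER.
REPO="sftp:${RSYNC_NET_USER:-u12345}@${RSYNC_NET_HOST:-usw-s001.rsync.net}:restic-repo"

if command -v restic >/dev/null 2>&1 && [ -n "${RSYNC_NET_USER:-}" ]; then
    restic -r "$REPO" init         # one-time: create the repository
    restic -r "$REPO" backup /etc  # deduplicated, incremental snapshot
    restic -r "$REPO" snapshots    # list stored snapshots
fi
echo "$REPO"
```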

@per2jensen
Copy link

Just stumbled upon this thread.

I will second http://dar.linux.free.fr/ - it is mature, maintained, and works extremely well in many use cases.

/Per
