I recently set up a host running Ubuntu 14.04 LTS, ubuntu-zfs, and PostgreSQL 9.3. Using the ZFS snapshot feature I was able to make a snapshot of the file system holding the PostgreSQL database, write the snapshot to a compressed file, transfer it to another Ubuntu 14.04 LTS host running ubuntu-zfs, restore it, and launch PostgreSQL using that data directory with all the data (seemingly) intact.
The database is very low-traffic, so I do not know how effective this strategy would be for a busier database.
I take no responsibility if you attempt this setup and it causes data loss or other issues. Use it at your own risk; this document is meant to explain a potential setup and open discussion about limitations and potential improvements.
I started by creating a pool to hold the PostgreSQL DB, on a simple block file. It isn't recommended to use file stores, as ZFS is better on actual devices, but this is just a quick test.
$ fallocate -l 10G poolfile.zfs
$ sudo zpool create -f -O compression=on -o ashift=12 -o listsnapshots=on -o recordsize=8k -o primarycache=metadata pg-pool /home/vagrant/poolfile.zfs
This creates a pool called pg-pool
. We can see it using zpool
:
$ sudo zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
pg-pool 9.94G 820K 9.94G 0% 1.00x ONLINE -
Next, I installed PostgreSQL 9.3. If you are installing extension (e.g. PostGIS) then they can be added to this step.
$ sudo apt-get install postgresql-9.3 postgresql-contrib-9.3
Once installed, we will copy the default DB cluster to the ZFS pool and change the configuration.
$ sudo service postgresql stop
* Stopping PostgreSQL 9.3 database server
...done.
$ sudo cp -r /var/lib/postgresql /pg-pool/postgresql
Then edit the PostgreSQL configuration to point to the new data directory:
$ sudo vim /etc/postgresql/9.3/main/postgresql.conf
Goto about line 41, and change data_directory:
data_directory = '/pg-pool/postgresql/9.3/main'
Once done and saved, try starting PostgreSQL.
$ sudo service postgresql start
* Starting PostgreSQL 9.3 database server
...done.
Good, it is now running. Try creating some users and databases:
$ sudo -u postgres psql
psql (9.3.4)
Type "help" for help.
postgres=# CREATE ROLE alice;
CREATE ROLE
postgres=# CREATE ROLE bob;
CREATE ROLE
postgres=# CREATE DATABASE alice_db OWNER alice;
CREATE DATABASE
postgres=# CREATE DATABASE bob_db OWNER bob;
CREATE DATABASE
postgres=#\q
Next is creating a snapshot and saving it to a file. With ZFS, creating snapshots is instantaneous. For an example, I will use low Gzip compression on the file; you can use higher compression (-9
) to speed up transferring it to another host.
$ sudo zfs snapshot pg-pool@snapshot-1
$ sudo zfs send pg-pool@snapshot-1 | gzip -1 > zfs-backup.gz
This results in a Gzip file we can then transfer to our second host.
As this host is for testing, we will set up a simple ZFS pool based on a file store.
$ fallocate -l 10G poolfile.zfs
$ sudo zpool create -f -O compression=on -o ashift=12 -o listsnapshots=on -o recordsize=8k -o primarycache=metadata pg-pool /home/vagrant/poolfile.zfs
Next, copy the compressed snapshot file from the first host to the second. I am using vagrant, so I placed it in the same directory as my Vagrantfile and it is automatically shared at /vagrant
in my VM. We will restore it to a directory inside the Second Host's pool, rather than replacing the pool completely.
$ gunzip -c -d /vagrant/zfs-backup.gz | sudo zfs recv pg-pool/restore
$ ls /pg-pool/restore
postgresql
As we can see, the restore has succeeded in decompressing the snapshot and placing the data back onto a file system (even at a different path). Next, install PostgreSQL and change the data directory to the restored snapshot:
$ sudo apt-get install postgresql-9.3 postgresql-contrib-9.3
$ sudo service postgresql stop
* Stopping PostgreSQL 9.3 database server
...done.
$ sudo vim /etc/postgresql/9.3/main/postgresql.conf
Goto about line 41, and change data_directory:
data_directory = '/pg-pool/restore/postgresql/9.3/main'
This should tell PostgreSQL to simply "resume" the data directory as if it were its own, and continue from where the PostgreSQL on the First Host left off at the time of the snapshot.
$ sudo service postgresql start
* Starting PostgreSQL 9.3 database server
...done.
$ sudo -u postgres psql
psql (9.3.4)
Type "help" for help.
postgres=# \du
... role information ...
In the role information, it should list the roles we created on the first host.
postgres=# \l
... database information ...
The database information should also include the database from the first host.
And that is more or less my working setup. There is potential for improvements, such as incremental snapshots and automated snapshots to files.
Replication is not a replacement for backups, as previous versions of the database are not kept. Simply replicating ZFS file systems from one host to another host is also susceptible to the same limitations. Instead, versioned backups (preferably stored on multiple hosts) are much more effective in allowing you to restore a database to any previous version.
There is also a risk of an errant zpool destroy
command deleting all your snapshots in the ZFS filesystem. By keeping snapshots on multiple hosts or in compressed files, this risk can be mitigated.
You need to use
pg_backup_start/stop
andCHECKPOINT
or else the buffers are not correcly flushed.