WAL-E needs to be installed on all machines, masters and slaves.
Only one machine, the master, writes WAL segments via continuous archiving. The configuration in the master's postgresql.conf is:
archive_mode = on
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
archive_timeout = 60
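The commands in this guide read their credentials through envdir from /etc/wal-e.d/env, where each file becomes one environment variable. The sketch below shows one way such a directory could be populated. It assumes the variable names used by WAL-E's S3 backend (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, WALE_S3_PREFIX). The function name and all argument values are placeholders for illustration, not part of the original setup.

```shell
#!/bin/sh
# Sketch: populate an envdir directory for WAL-E.
# setup_wale_env is a hypothetical helper; all values passed in are placeholders.
setup_wale_env() {
    env_dir=$1      # e.g. /etc/wal-e.d/env
    access_key=$2   # AWS access key ID
    secret_key=$3   # AWS secret access key
    s3_prefix=$4    # e.g. s3://some-bucket/some/prefix (hypothetical)

    # Subshell so the restrictive umask does not leak into the caller.
    (
        umask 027   # new files are not readable by "other"
        mkdir -p "$env_dir" || exit 1
        # envdir exports one variable per file, named after the file.
        printf '%s\n' "$access_key" > "$env_dir/AWS_ACCESS_KEY_ID"
        printf '%s\n' "$secret_key" > "$env_dir/AWS_SECRET_ACCESS_KEY"
        printf '%s\n' "$s3_prefix"  > "$env_dir/WALE_S3_PREFIX"
    )
}

# Example (as root):
#   setup_wale_env /etc/wal-e.d/env "$KEY_ID" "$SECRET" s3://some-bucket/pg
#   chown -R root:postgres /etc/wal-e.d
```

The final chown in the example is there so the postgres user, which runs archive_command, can read the files while other users cannot.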
When a master is first set up, a base backup needs to be performed. As the postgres user:
envdir /etc/wal-e.d/env wal-e backup-push $PGDATA
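This will perform a base backup and push it to S3. From then on, the archive_command above ships each completed WAL segment to S3 as the Postgres master generates it. With archive_timeout = 60, a segment that is not yet full is also switched out and pushed after 60 seconds if there has been write activity.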
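To get a feel for what archive_timeout = 60 implies, the arithmetic can be sketched as below. It assumes the default 16 MB WAL segment size and a master that writes at least a little in every timeout window, so that every timeout forces a segment switch. A busier master fills segments faster than the timeout and will push more. The function names are made up for this example.

```shell
#!/bin/sh
# Sketch: segments pushed per day when every archive_timeout forces a switch.
# Assumes the default 16 MB segment size; sizes are before compression.
segments_per_day() {
    timeout_s=$1
    # archive_timeout = 0 disables forced switches.
    [ "$timeout_s" -gt 0 ] || { echo 0; return 0; }
    echo $(( 86400 / timeout_s ))
}

wal_mb_per_day() {
    timeout_s=$1
    [ "$timeout_s" -gt 0 ] || { echo 0; return 0; }
    echo $(( 86400 / timeout_s * 16 ))
}

# Example:
#   segments_per_day 60   # prints 1440
#   wal_mb_per_day 60     # prints 23040
```

A shorter timeout narrows the window of changes that exist only on the master, at the cost of more segment files in S3.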
The flow for setting up a new Postgres slave machine, which will be based on a base backup, is:
- Install Postgres, configure as a slave.
- Remove the $PGDATA dir, if it exists. It will be re-created by the base backup fetch in the next step.
- Pull a base backup using WAL-E:
envdir /etc/wal-e.d/env wal-e backup-fetch $PGDATA LATEST
- Create a $PGDATA/recovery.conf which is configured with WAL-E to replay remote segments:
standby_mode = 'on'
primary_conninfo = 'host=$HOST user=$USER password=$PASSWORD'
restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch "%f" "%p"'
trigger_file = '/data/postgresql/9.1/main/trigger'
- Optional: during recovery, WAL-E will by default attempt to pull and restore all available WAL segments. If you wish to restore to a specific point in time instead, specify the following in recovery.conf:
# restore to 4:38PM on 3/6/2012
recovery_target_time = '2012-03-06 16:38:00'
This can be helpful if something bad happened at 4:39PM and you want to restore to 4:38PM.
See Recovery Target Settings in the PostgreSQL documentation for more details.
- Ensure proper permissions:
chown -R postgres:postgres $PGDATA
- Start Postgres and tail its log. It should start up and fetch WAL segments from S3, which is performed by the restore_command in the recovery.conf above.
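The slave setup steps above could be collected into a small script along these lines. This is a sketch, not the setup actually used here: the function names, the run helper, and the DRY_RUN switch are invented for illustration, and installing Postgres and starting it are left out. With DRY_RUN=1 the commands are only printed, which is a cheap way to check them before touching a data directory.

```shell
#!/bin/sh
# Sketch of the slave setup steps. All function names are hypothetical.
WALE="envdir /etc/wal-e.d/env wal-e"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Write recovery.conf into directory $1, mirroring the settings shown above.
write_recovery_conf() {
    dir=$1; host=$2; user=$3; password=$4; trigger=$5
    cat > "$dir/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$host user=$user password=$password'
restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch "%f" "%p"'
trigger_file = '$trigger'
EOF
}

provision_slave() {
    pgdata=$1; host=$2; user=$3; password=$4; trigger=$5
    # Refuse to run "rm -rf" on an empty path.
    [ -n "$pgdata" ] || { echo "PGDATA path required" >&2; return 1; }

    run rm -rf "$pgdata"                                  # remove old data dir
    run $WALE backup-fetch "$pgdata" LATEST || return 1   # pull base backup
    run write_recovery_conf "$pgdata" "$host" "$user" "$password" "$trigger" || return 1
    run chown -R postgres:postgres "$pgdata"              # fix ownership
}

# Example:
#   DRY_RUN=1 provision_slave "$PGDATA" "$HOST" "$USER" "$PASSWORD" /data/postgresql/9.1/main/trigger
```

After a real run, start Postgres and watch its log as described in the last step.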
We have WAL-E configured to push a base backup nightly via /etc/cron.d/pg_base_backup:
0 8 * * * postgres envdir /etc/wal-e.d/env wal-e backup-push /data/postgresql/9.1/main/
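The cron entry above runs at 08:00 in the server's time zone and only ever adds backups. If old backups should also be pruned, the job could call a small wrapper like the sketch below instead. It assumes a WAL-E version that provides the delete subcommand with its retain form. The wrapper name, the WALE_CMD override, and the retention count are assumptions for illustration and are not part of the setup described here.

```shell
#!/bin/sh
# Sketch: nightly base backup followed by pruning of older backups.
# WALE_CMD can be overridden, e.g. WALE_CMD=echo to preview the commands.
WALE_CMD=${WALE_CMD:-"envdir /etc/wal-e.d/env wal-e"}

nightly_backup() {
    pgdata=$1   # data directory to back up
    keep=$2     # number of most recent base backups to retain

    # Only prune if the new backup was pushed successfully.
    $WALE_CMD backup-push "$pgdata" || return 1
    # Without --confirm, WAL-E only reports what it would delete.
    $WALE_CMD delete --confirm retain "$keep"
}

# Example:
#   nightly_backup /data/postgresql/9.1/main/ 7
```

Pruning only after a successful push avoids ending up with fewer usable backups than intended when a push fails.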