Welcome to Tutorial Tuesday 🎉
This is a quick intro to running a database dump made using pg_basebackup with Docker. It's intended audience is developers who are interested in exploring a database from a server, perhaps for development, debugging or forensics.
It covers:
- Running
pg_basebackup - RSyncing this to your local machine
- Configuring the backup to run with required files and setting a password
- How to connect from another program on your local machine
When you grab a pg_basebackup from a server you have all the databases and roles. You get the whole 'cluster' rather than just the data +schema of your database. This lets you play around, delete/restore, and to some extent gives you ability to restore to a point in time.
In this instance I'm fetching a database cluster from gatherer-demo.gatherdata.io. To do that we'll ssh into the box and run pg_basebackup.
This creates a new folder in /tmp.
~/gatherer_demo_basebackup
❯ ssh [email protected]
Welcome to Ubuntu 22.04 LTS (GNU/Linux 5.15.0-106-generic x86_64)
...
root@gatherer-demo ~ # cd /tmp
root@gatherer-demo /tmp # su postgres
postgres@gatherer-demo:/tmp$ pg_basebackup -D ./basebackupNext, on our local machine, we'll rsync this over. We're using flags -avz (archive, verbose, compressed). The z flag in particular is going to save us a lot of time.
❯ rsync -avz [email protected]:/tmp/basebackup .
That creates a local directory. It took about 2:40 on my machine to rsync, it's 11Gb.
The directory we have, is going to become the data directory for a postgres container.
Before we spin it up, we need to grab 2 files: pg_hba.conf and postgresql.conf
We can do that by starting a temporary container and copying the files over, then 'rm' the container Here the container happens to have and 'id' of 81b5....
❯ docker run -d -e POSTGRES_PASSWORD=post1234 postgres:14
81b5920e0d0158a2d4e8dd3480b2e827168b5bddf790e2724fbb8c8109722050
~/gatherer_demo_basebackup/basebackup
❯ docker cp 81b5920e0d01:/var/lib/postgresql/data/pg_hba.conf .
Successfully copied 6.66kB to /home/josh/gatherer_demo_basebackup/basebackup.
~/gatherer_demo_basebackup/basebackup
❯ docker cp 81b5920e0d01:/var/lib/postgresql/data/postgresql.conf .
Successfully copied 30.7kB to /home/josh/gatherer_demo_basebackup/basebackup.
~/gatherer_demo_basebackup/basebackup
❯ docker stop 81b5920e0d01 && docker rm 81b5920e0d01
81b5920e0d01
81b5920e0d01
Now we can start the container
docker run -e POSTGRES_PASSWORD=post1234 -P -v .:/var/lib/postgresql/data postgres:14
Your container will have a random name and port in my case it's mystifying_kapitsa on 32779
❯ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d27a3f41b17d postgres:14 "docker-entrypoint.s…" 26 minutes ago Up 26 minutes 0.0.0.0:32779->5432/tcp, :::32779->5432/tcp mystifying_kapitsa
Because this container came from a server where postgres connected over a socket, it has no password. We want to change that
docker exec -it --user postgres mystifying_kapitsa psql
I altered the password with this SQL:
psql (14.12 (Debian 14.12-1.pgdg120+1))
Type "help" for help.
postgres=# ALTER ROLE postgres PASSWORD 'post1234'
I can use the host localhost, user postgres, the password post1234 and the port 32779 now to connect from Metabase / Django / Dbeaver etc
❯ psql --host localhost --user postgres --port 32779 --db jf_db
Password for user postgres:
psql (14.8 (Ubuntu 14.8-0ubuntu0.22.10.1), server 14.12 (Debian 14.12-1.pgdg120+1))
Type "help" for help.
jf_db=# select * from auth_user
jf_db-# limit 5;
jf_db=#
One thing to note is that Docker changes ownership of folders when
bind mountsare used. That means that after running the container you may find that you are no longer the owner of the backup directory.No easy remedy for this as far as I know except to
sudo chown -Rthe dir again