Last active
February 22, 2023 23:32
-
-
Save oofnikj/4ee432033421f34b548ae2891067efcf to your computer and use it in GitHub Desktop.
PostgreSQL dump / restore to Google Cloud Storage
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/bin/bash | |
PGUSER=${PGUSER:-"postgres"} | |
CATALOG_BUCKET="normalized-catalog" | |
_get_dbs() { | |
psql -U $PGUSER -c "COPY ( | |
SELECT datname FROM pg_database | |
WHERE datname NOT IN ( | |
'postgres', | |
'template0', | |
'template1' | |
) | |
) TO STDOUT;" | |
} | |
_pg_dump() { | |
echo "dumping database $1..." | |
pg_dump -U $PGUSER -d $1 \ | |
--format=c \ | |
--compress=1 \ | |
| gsutil cp - gs://$CATALOG_BUCKET/$1.pgdump | |
} | |
_cat() { | |
if [[ $1 == "-" ]] ; then | |
echo "restoring from stdin..." >&2 | |
cat | |
else | |
echo "restoring database $1 from GCS..." >&2 | |
gsutil cat gs://$CATALOG_BUCKET/$1.pgdump | |
fi | |
} | |
_pg_restore() { | |
_cat $1 \ | |
| pg_restore -d postgres -U $PGUSER \ | |
--create \ | |
--no-tablespaces \ | |
--clean \ | |
--if-exists | |
} | |
_usage() { | |
cat <<EOF | |
Usage: | |
${BASH_SOURCE[0]} [ dump_all | dump | restore ] [database] | |
Commands 'dump' and 'restore' require user to provide the database name. | |
If database name is '-', database will be restored from standard input. | |
'dump_all' dumps all the databases in compressed pg_dump custom format. | |
EOF | |
exit 1 | |
} | |
case $1 in | |
dump_all) | |
for db in $(_get_dbs); do | |
_pg_dump $db | |
done | |
;; | |
dump) | |
( shift; _pg_dump $1 ) || ( echo 'must provide database name'; exit 1 ) | |
;; | |
restore) | |
( shift; _pg_restore $1 ) || ( echo 'must provide database name'; exit 1 ) | |
;; | |
*) | |
_usage | |
;; | |
esac |
@pnmartinez Did you solve this issue? Did split work?
@pnmartinez Did you solve this issue? Did split work?
Hi! Short answer: yes, it works. Long answer: no, it does not.
The script works, though I had serious problems with connection not stable enough. GCS has some parameters to allow for resuming interrupted uploads, but I guess piping 3 different commands makes it crash in some other stage. So works, but very bleak.
In the end the task I needed this for was cancelled, so I never looked for an alternative. I must say, however, that I did not find anything better on the interwebs, at the time.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi @oofnikj . Good gist, even to get only parts of it.
I have an odd case that you can probably solve.
I've hit a limit though: GCS hard limit of 5Tb for single objects. So now looking for a way to deal with larger-than-5Tb backups.
So: the backup must be
pg_dumped
, splited, and sent on a stream in the same operation.I would kindly ask you to review my approach below, basically piping Linux's
split
utility in the middle of thepg_dump | gsutil cp
pipe:I've posted on StackOverflow, in case you'd like the reputation.