Backup requirements

Different applications have different requirements when it comes to being backed up. Some applications work perfectly fine if a disk snapshot is taken, while others may need to enter some sort of maintenance mode first to ensure data consistency. These are my personal notes on what requirements different software stacks have. I make no promises that any of this is true. What I do promise you is that this is not an exhaustive list.

OpenSearch (Elasticsearch)

OpenSearch has snapshots. If a snapshot exists on disk before the data is backed up, that data is guaranteed to be consistent. For more details, see this post in the OpenSearch forum.

To reduce the gap between the latest snapshot and the backup, a pre-backup hook can call the snapshot API to take a fresh snapshot just before the backup runs.
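As a rough sketch, such a pre-backup hook could look something like the Python script below. The host, credentials and the repository name `backup-repo` are made-up placeholders, and the snapshot repository is assumed to be registered already:

```python
import time

import requests

# Placeholder host, repository and credentials: adjust for your cluster.
OPENSEARCH = "https://localhost:9200"
REPOSITORY = "backup-repo"  # assumed to be registered via the snapshot API already
SNAPSHOT = f"pre-backup-{int(time.time())}"  # snapshot names must be unique

# Trigger a snapshot and block until it completes, so the files on disk
# are in a consistent state before the external backup starts.
response = requests.put(
    f"{OPENSEARCH}/_snapshot/{REPOSITORY}/{SNAPSHOT}",
    params={"wait_for_completion": "true"},
    auth=("admin", "admin"),
    verify=False,  # demo setup with self-signed certs; verify properly in production
)
response.raise_for_status()
print(response.json())
```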

Another alternative is to enable remote-backed storage, but that probably has little additional value if one already takes snapshots and performs backups.

PostgreSQL

As mentioned in the documentation, it is a bad idea to try to back up PostgreSQL by copying its live data files. It can work under some circumstances, but tl;dr: use pg_dump/pg_dumpall instead!
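A minimal sketch of such a dump from Python, assuming pg_dump is on the PATH and that authentication is handled out of band (e.g. via ~/.pgpass or the PGPASSWORD environment variable); the host, user, database and paths are placeholders:

```python
import subprocess
from datetime import datetime, timezone

# Placeholder connection details and backup path.
timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
dump_file = f"/backups/mydb-{timestamp}.sql.gz"

# pg_dump gives a consistent logical snapshot of one database even while
# the server is running; pipe it through gzip to keep the backup small.
with open(dump_file, "wb") as out:
    dump = subprocess.Popen(
        ["pg_dump", "--host=localhost", "--username=postgres", "mydb"],
        stdout=subprocess.PIPE,
    )
    gzip = subprocess.Popen(["gzip"], stdin=dump.stdout, stdout=out)
    dump.stdout.close()  # let gzip see EOF when pg_dump exits
    gzip.communicate()
    if dump.wait() != 0 or gzip.returncode != 0:
        raise RuntimeError("backup failed")
```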

Both Bitnami and Kubegres implement support for running a cronjob that writes a database dump to a separate volume. I think it is not very important to use a separate volume if the main volume is backed up externally. The biggest reason for using a separate volume is probably to reduce the size of the backup (and possibly to protect the backup volume from accidental file operations).

It seems like StackGres has a very robust backup solution that works slightly differently. It relies on continuous archiving with VolumeSnapshots, archiving the WAL and some metadata to an object store (typically S3). As of writing this, they only support archiving to an object store, not to a PV in Kubernetes.

RabbitMQ

Notably, in my experience, RabbitMQ sometimes does not need to be backed up. Clearly this does not hold true for all applications, but do have a think about it. Most of the RabbitMQ configuration is stored as code in the Kubernetes manifests anyway, so if they get deployed using GitOps or whatever, that's your backup!

But for when a backup is needed, the docs for backing up RabbitMQ can be found here.

Regarding backing up definitions:

  • There are two types of data in the data directory: definitions (metadata, schema/topology) and messages.
  • Definitions can be exported and imported as JSON files (from any running node).
  • When a part of the definitions changes, the update is performed on all nodes in a single transaction.

Exporting the JSON file with definitions from a running node is probably the best option, since that has a very clear restore path. Backing up the data directory from disk also seems viable (though with a less clear restore path), and is probably the only option if the node is not running. Backing up a single node should be sufficient, in any case.
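As a sketch of the export path: the management plugin exposes the definitions as JSON over HTTP at /api/definitions, and the same JSON can be POSTed back to restore. The host and credentials below are placeholders:

```python
import json

import requests

# Placeholder host and credentials; the management plugin must be enabled.
MGMT = "http://localhost:15672"
AUTH = ("guest", "guest")

# GET /api/definitions returns the full schema/topology (users, vhosts,
# queues, exchanges, bindings, policies) as one JSON document.
response = requests.get(f"{MGMT}/api/definitions", auth=AUTH)
response.raise_for_status()

with open("definitions.json", "w") as f:
    json.dump(response.json(), f, indent=2)

# Restoring is the reverse: POST the same JSON back to /api/definitions.
```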

Regarding backing up messages:

  • Backing up the data directory on running nodes risks capturing inconsistent data, so backups should only be taken from stopped nodes (see the sketch below).
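A rough sketch of what that could look like, assuming rabbitmqctl is on the PATH. The data directory varies between installations (check RABBITMQ_MNESIA_BASE), so the paths below are placeholders:

```python
import subprocess
import tarfile

# Placeholder paths: check RABBITMQ_MNESIA_BASE for the real data directory.
DATA_DIR = "/var/lib/rabbitmq/mnesia"
ARCHIVE = "/backups/rabbitmq-data.tar.gz"

# stop_app stops the RabbitMQ application but keeps the Erlang VM running,
# so the message store on disk is no longer being written to.
subprocess.run(["rabbitmqctl", "stop_app"], check=True)
try:
    with tarfile.open(ARCHIVE, "w:gz") as tar:
        tar.add(DATA_DIR, arcname="mnesia")
finally:
    # Always bring the node back, even if archiving fails.
    subprocess.run(["rabbitmqctl", "start_app"], check=True)
```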