@volkovasystems
Created November 20, 2019 09:07
MongoDB Backup Protocol
1.) For each collection, get the list of all ObjectID references in a batch process.
2.) The list of ObjectID references will be the snapshot of the backup.
3.) Store each collection's list of ObjectID references in a JSON file (.json).
a.) One JSON file per collection, named "<collection-name>-<datetimestamp>.json"
Example: user-20191122103055.json
b.) Each JSON file contains an object whose "referenceList" property is an array of ObjectID reference strings.
c.) If the list has more than 5000 elements, split it across additional JSON files.
c.1.) Name each additional JSON file "<collection-name>-<datetimestamp>-<start-index>-<end-index>.json"
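
Step 3 could be sketched in Python roughly as follows. The protocol leaves several details open, so this sketch makes assumptions: the names datetimestamp and snapshot_files are hypothetical, the datetimestamp layout is inferred from the example file name, indexes are zero-based and inclusive, and once a list exceeds 5000 elements every chunk (including the first) gets an index-suffixed file.

```python
import json
from datetime import datetime

CHUNK_SIZE = 5000  # threshold from step 3.c


def datetimestamp(moment):
    """Format a datetime like the example 20191122103055 (assumed layout)."""
    return moment.strftime("%Y%m%d%H%M%S")


def snapshot_files(collection_name, references, stamp, chunk_size=CHUNK_SIZE):
    """Map each snapshot file name to the JSON text it should contain."""
    base = f"{collection_name}-{stamp}"
    if len(references) <= chunk_size:
        return {f"{base}.json": json.dumps({"referenceList": list(references)})}
    files = {}
    for start in range(0, len(references), chunk_size):
        chunk = list(references[start:start + chunk_size])
        end = start + len(chunk) - 1  # inclusive end index (assumption)
        files[f"{base}-{start}-{end}.json"] = json.dumps({"referenceList": chunk})
    return files
```

Returning the file contents instead of writing them keeps the naming rule separate from disk access, so the rule can be checked without a database.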
4.) The backup process works only from this snapshot list. Start another mongod server instance locally on any free port.
5.) Using the snapshot list, transfer all the documents to that second database. Its collection list must be the same as in the original database.
a.) For each document transferred, compute the md5 hashsum of the document and append it to its reference.
Example:
{
"referenceList":[
"57031eb7de2e87b605ff0aee", //ObjectID reference only.
"5ba4687f9384ccb81d7198f0-7627FBE99C0B387A38121FB94C24C030" //With md5 hashsum
]
}
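
The stamping in step 5.a might look like the sketch below. The protocol does not say which bytes of a document are hashed; this sketch assumes a canonical JSON serialization with sorted keys (hashing the raw BSON bytes would be another option) and uses uppercase hex to match the example above. The names document_md5 and stamp_reference are hypothetical.

```python
import hashlib
import json


def document_md5(document):
    """Uppercase md5 hex digest of a document's canonical JSON form (assumed encoding)."""
    canonical = json.dumps(document, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest().upper()


def stamp_reference(reference, document):
    """Append the document's md5 hashsum to its ObjectID reference string."""
    return f"{reference}-{document_md5(document)}"
```

Sorting the keys makes the hashsum independent of field order, so the same document yields the same stamp when it is rehashed after a restore.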
6.) We now have a static copy of the database that is not accessed or updated by anything else. Run mongodump against this database.
a.) The dump folder format would be "<database-name>-<datetimestamp>"
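
One way the dump command for step 6 could be assembled is sketched below. --port, --db and --out are standard mongodump options; the function names and the idea of passing the port of the local copy as a parameter are illustrative assumptions.

```python
import subprocess


def mongodump_command(database_name, stamp, port):
    """Build the mongodump argv, dumping into <database-name>-<datetimestamp>."""
    dump_folder = f"{database_name}-{stamp}"
    return ["mongodump", "--port", str(port), "--db", database_name, "--out", dump_folder]


def run_mongodump(database_name, stamp, port):
    """Run mongodump; raises CalledProcessError if it exits with an error."""
    subprocess.run(mongodump_command(database_name, stamp, port), check=True)
```

Using check=True turns a failed dump into an exception, which fits the rule that later steps run only when no errors were encountered.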
7.) If mongodump finishes without errors, wrap the dump folder into a compressed file.
a.) The compressed file format would be "<database-name>-<datetimestamp>.tar.gz"
8.) After compressing the dump folder, we will get the md5 hashsum of the compressed file.
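
Steps 7 and 8 can be sketched with the Python standard library. The function names are hypothetical, and the uppercase hex form is assumed only to match the hashsum shown in the step 5 example.

```python
import hashlib
import tarfile
from pathlib import Path


def compress_dump(dump_folder):
    """Wrap the dump folder into <dump-folder>.tar.gz beside it; return the archive path."""
    dump_folder = Path(dump_folder)
    archive = dump_folder.parent / (dump_folder.name + ".tar.gz")
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(dump_folder), arcname=dump_folder.name)
    return archive


def file_md5(path, block_size=1 << 20):
    """md5 hashsum of a file, read in blocks so large archives need little memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest().upper()
```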
9.) Create a backup-meta.json file. It contains an object with the following properties:
a.) "databaseHashstamp": the md5 hashsum of the compressed dump file.
b.) "integrityHashstamp": the md5 hashsum of the md5 hashsums of all the documents, sorted in ascending order, with databaseHashstamp as the last element.
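
A possible reading of step 9 is sketched below. The protocol does not say how the sorted hashsums are combined before hashing; this sketch assumes plain concatenation with no separator, and the function names are hypothetical.

```python
import hashlib
import json


def integrity_hashstamp(document_hashsums, database_hashstamp):
    """md5 over the sorted document hashsums with databaseHashstamp as the last element."""
    elements = sorted(document_hashsums) + [database_hashstamp]
    joined = "".join(elements)  # assumption: plain concatenation, no separator
    return hashlib.md5(joined.encode("ascii")).hexdigest().upper()


def backup_meta(document_hashsums, database_hashstamp):
    """JSON text for backup-meta.json."""
    meta = {
        "databaseHashstamp": database_hashstamp,
        "integrityHashstamp": integrity_hashstamp(document_hashsums, database_hashstamp),
    }
    return json.dumps(meta, indent=2)
```

Because the hashsums are sorted first, the result does not depend on the order in which documents were transferred. The same function can be rerun after a restore to compare against the stored value.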
10.) The hashstamps are used to verify the integrity of the backup.
11.) Integrity is always verified after the backup is restored.
12.) Delete the dump folder.
13.) Place all the JSON files and the compressed file in one folder named "backup-package-<database-name>-<datetimestamp>".
14.) Compress the backup package folder.
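
Steps 13 and 14 could be sketched as follows. The protocol does not name the compression format for the package; this sketch assumes .tar.gz, as used for the dump folder, and build_package is a hypothetical name.

```python
import shutil
from pathlib import Path


def build_package(database_name, stamp, files, workdir):
    """Move files into backup-package-<database-name>-<datetimestamp> and compress it."""
    workdir = Path(workdir)
    package = workdir / f"backup-package-{database_name}-{stamp}"
    package.mkdir()
    for path in files:
        shutil.move(str(path), str(package / Path(path).name))
    # Assumption: gzip-compressed tar, mirroring the dump archive format.
    archive = shutil.make_archive(
        str(package), "gztar", root_dir=str(workdir), base_dir=package.name
    )
    return Path(archive)
```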