@vhbui02
Created May 16, 2023 13:51
[MongoDB Storage Engines] #mongodb

Storage engine: the component of the database that is responsible for managing how data is stored.

REFRESH MEMORY: data is stored in-memory and on disk.

MongoDB supports multiple storage engines because each one performs better for specific workloads. Choosing the right one can SIGNIFICANTLY impact performance.

WiredTiger (default)

To check whether your MongoDB instance is using WiredTiger: db.serverStatus().storageEngine
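
A minimal sketch of that check as a small helper. The function name `usesWiredTiger` is made up for illustration; it only relies on the `name` field of the `storageEngine` document, and the mongosh call is guarded so the snippet also loads outside the shell.

```javascript
// Hypothetical helper: takes the document returned by db.serverStatus()
// and reports whether the instance runs on WiredTiger.
function usesWiredTiger(serverStatus) {
  const engine = serverStatus && serverStatus.storageEngine;
  return Boolean(engine) && engine.name === "wiredTiger";
}

// In mongosh, `db` is a global; outside the shell this branch is skipped.
if (typeof db !== "undefined") {
  print(usesWiredTiger(db.serverStatus()));
}
```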

Document-level concurrency control for write operations

Multiple clients can modify different documents of a collection at the same time.

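Intent shared (IS) lock: taken by read operations that do not change or update data, such as a find() query.

Intent exclusive (IX) lock: taken by write operations, such as insertOne(), updateOne(), updateMany()...

Intent locks are taken at the global, database, and collection levels, and they do NOT block each other:

IS does not block other IS or IX locks.

IX does not block other IX or IS locks.

Intent locks conflict only with the full locks: a shared (S) lock blocks IX, and an exclusive (X) lock blocks everything.

When the storage engine detects a conflict between two operations on the same document, one of them incurs a write conflict and MongoDB transparently (implicitly) retries that operation.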

Some special operations still require a global, database-level, or collection-level lock.
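
The standard compatibility rules between the lock modes (IS, IX, plus the full shared S and exclusive X locks that special operations take) can be written as a small lookup table. This is only an illustration of the rules; the names `COMPATIBLE` and `isCompatible` are made up here and are not a MongoDB API.

```javascript
// Lock modes: IS = intent shared, IX = intent exclusive,
// S = shared, X = exclusive.
// For each held mode, the set of modes that can be granted alongside it.
const COMPATIBLE = {
  IS: new Set(["IS", "IX", "S"]),
  IX: new Set(["IS", "IX"]),
  S: new Set(["IS", "S"]),
  X: new Set(),
};

// Returns true when a lock in mode `requested` can be granted while
// another operation already holds the same resource in mode `held`.
function isCompatible(held, requested) {
  return COMPATIBLE[held].has(requested);
}

console.log(isCompatible("IS", "IX")); // true: a read and a write on one collection
console.log(isCompatible("IX", "IX")); // true: two writers on one collection
console.log(isCompatible("X", "IS"));  // false: an exclusive lock blocks readers too
```

Two writers holding IX on the same collection can still collide on a single document; that case is handled by the write-conflict retry, not by the lock table.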

Compression

  • Block compression with the snappy compression library (the default with WiredTiger) for all collections
  • Prefix compression for all indexes

Memory use

WiredTiger internal cache: default = MAX(50% * (physical_ram - 1GB), 256MB)

E.g. with 4GB of RAM, the WiredTiger cache will use 0.5 * (4GB - 1GB) = 1.5GB.

A system with 1.25GB of RAM will allocate 256MB, since 0.5 * (1.25GB - 1GB) = 0.125GB = 128MB < 256MB.
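
The same arithmetic as a tiny function, to try other RAM sizes. The function name `defaultCacheSizeGB` is made up for illustration; it just encodes the rule above.

```javascript
// Default WiredTiger internal cache size in GB for a given amount of RAM,
// following the rule above: the larger of 50% of (RAM - 1 GB) and 256 MB.
function defaultCacheSizeGB(physicalRamGB) {
  const MIN_CACHE_GB = 0.25; // 256 MB
  return Math.max(0.5 * (physicalRamGB - 1), MIN_CACHE_GB);
}

console.log(defaultCacheSizeGB(4));    // 1.5
console.log(defaultCacheSizeGB(1.25)); // 0.25 (the 256 MB floor applies)
```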

WiredTiger cache vs on-disk format

Filesystem cache

  • Data in the filesystem cache is in the same format as on disk (including compression).
  • It reduces disk I/O, since the OS serves data directly from this cache.

Index

  • Indexes in the WiredTiger internal cache use a different representation from the on-disk format.
  • They still take advantage of index prefix compression, which reduces their size and therefore RAM usage. (Index prefix compression deduplicates common prefixes from indexed fields.)

Now that's interesting! If the indexed values do not differ much and share a prefix, they take up less space when compressed.
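
A toy illustration of the prefix idea: each sorted key is stored as the number of characters it shares with the previous key, plus the remaining suffix. This is NOT WiredTiger's actual format; the function names and the sample keys are made up for the example.

```javascript
// Length of the common prefix of two strings.
function sharedPrefixLength(a, b) {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

// Encode sorted keys as [sharedPrefixLength, suffix] pairs.
function prefixCompress(sortedKeys) {
  let previous = "";
  return sortedKeys.map((key) => {
    const shared = sharedPrefixLength(previous, key);
    previous = key;
    return [shared, key.slice(shared)];
  });
}

// Rebuild the original keys from the pairs.
function prefixDecompress(pairs) {
  let previous = "";
  return pairs.map(([shared, suffix]) => {
    previous = previous.slice(0, shared) + suffix;
    return previous;
  });
}

const sampleKeys = ["user_1001", "user_1002", "user_2000"];
const packed = prefixCompress(sampleKeys);
console.log(JSON.stringify(packed)); // [[0,"user_1001"],[8,"2"],[5,"2000"]]
```

The more the keys have in common, the shorter the stored suffixes become.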

Collection

  • Collection data in the WiredTiger internal cache uses a different representation from the on-disk format.
  • On-disk data is compressed with block compression, but data in the cache stays uncompressed so that the server can manipulate it.

Adjust the size of the WiredTiger internal cache

storage.wiredTiger.engineConfig.cacheSizeGB or --wiredTigerCacheSizeGB

Note: Avoid increasing the cache size above its default value

Snapshots and Checkpoints

When an operation starts, WiredTiger provides it with a point-in-time snapshot of the data, rather than telling it to go look up indexes or documents directly.

This snapshot presents a consistent view of the in-memory data.

Durable write operation: WiredTiger writes all the data in a snapshot to disk in a consistent way across all data files (that is, every data file reflects the same point in time). This data is now durable.

A write is durable only if:

  • for a standalone MongoDB instance, the write operation has been written to the server's journal file,
  • for a replica set, the write operation has been written to the journal files of a majority of voting nodes.

The durable data acts as a checkpoint in the data files. The checkpoint ensures that the data files are consistent up to and including the last checkpoint, so a checkpoint can act as a recovery point.

While a new checkpoint is being written, the previous checkpoint is still valid. If MongoDB fails or hits an error while writing the new checkpoint, after a restart it can recover from the previous checkpoint.

With WiredTiger, even without journaling, MongoDB can recover from the last checkpoint. But to recover changes made after the last checkpoint, journaling is required.

Note: Replica sets that use the WiredTiger storage engine ALWAYS use journaling.

Snapshot History Retention

minSnapshotHistoryWindowInSeconds controls how long WiredTiger keeps snapshot history. Old snapshots should not be kept for long: if the value is too high, history is retained for a long time and disk usage increases.

Set it as low as your point-in-time reads allow, to stay consistent without expensive storage usage.
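
A sketch of changing the window at runtime from mongosh with the `setParameter` admin command, assuming a MongoDB version that supports this parameter (5.0 or later). The value 600 is an arbitrary example, not a recommendation, and the call is guarded so the snippet also loads outside the shell.

```javascript
// Command document for the setParameter admin command.
// 600 seconds is an arbitrary example value, not a recommendation.
const setWindowCommand = {
  setParameter: 1,
  minSnapshotHistoryWindowInSeconds: 600,
};

// In mongosh, `db` is a global; outside the shell this branch is skipped.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(setWindowCommand));
}
```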

Journaling

The WiredTiger journal persists all data modifications between checkpoints. If MongoDB exits between checkpoints (internal error, power outage, ...), after restart it uses the journal to replay all data modified since the last checkpoint.

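But this adds one more write per operation: the change goes to the journal as well as to the data files. The cost stays manageable because journal records are buffered and flushed to disk in batches (every 100 ms by default, see storage.journal.commitIntervalMs) rather than on every single write.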

The journal is compressed with the snappy compression library (the same library used by block compression for collection data).
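
The recovery behaviour described above (restore the last checkpoint, then replay the journal) can be sketched with a toy in-memory model. This is only an illustration of the idea, not how WiredTiger is implemented; the `ToyStore` class and its methods are made up for the example.

```javascript
// Toy model, not WiredTiger's real implementation.
class ToyStore {
  constructor() {
    this.memory = new Map();     // in-memory view of the data
    this.checkpoint = new Map(); // last durable checkpoint ("data files")
    this.journal = [];           // writes made since the last checkpoint
  }

  write(key, value) {
    this.journal.push([key, value]); // log the change first
    this.memory.set(key, value);     // then apply it in memory
  }

  takeCheckpoint() {
    this.checkpoint = new Map(this.memory); // durable copy of the current state
    this.journal = [];                      // older journal entries are no longer needed
  }

  // Simulate a restart after a crash: the in-memory state is lost and
  // rebuilt from the checkpoint, optionally replaying the journal.
  recover({ useJournal }) {
    const recovered = new Map(this.checkpoint);
    if (useJournal) {
      for (const [key, value] of this.journal) recovered.set(key, value);
    }
    this.memory = recovered;
    return recovered;
  }
}

const store = new ToyStore();
store.write("a", 1);
store.takeCheckpoint();
store.write("b", 2); // made after the checkpoint

console.log(store.recover({ useJournal: false }).has("b")); // false
console.log(store.recover({ useJournal: true }).has("b"));  // true
```

Without the journal, only the data up to the last checkpoint survives; with it, the write made after the checkpoint is replayed.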

In-Memory Storage Engine

Only for MongoDB Enterprise
