In MongoDB, we have to base our modeling on our application-specific data access patterns. Finding out the questions that our users will have is paramount to designing our entities. In contrast to an RDBMS, data duplication and denormalization are used far more frequently, for good reason.
Supported Data Types
The sorting order of different BSON types, from highest to lowest, is as follows: MaxKey > Regular expression > Timestamp > Date > Boolean > ObjectId > Binary data > Array > String > Numbers > Null > MinKey
Schema Design
Index early and often
Eliminate unnecessary indexes
Use a compound index, rather than index intersection
Low selectivity indexes
Use regular expressions
Avoid negation in queries
Use partial indexes
Use document validation
Sharding
Think about query routing
Use tag-aware sharding
Let's try to get only the documents where balance is an integer and not a string:
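A minimal sketch of such a query using the $type operator (the accounts collection and balance field here are hypothetical):
db.accounts.find({ balance: { $type: "int" } })                // only 32-bit integer balances
db.accounts.find({ balance: { $type: "number" } })             // any numeric type: int, long, double, decimal
db.accounts.find({ balance: { $not: { $type: "string" } } })   // exclude balances stored as strings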
Non-existent fields get sorted as if they have null in the respective field. Comparing arrays is a bit more complex than fields. Ascending order of comparison (or <) will compare the smallest element of each array. Descending order of comparison (or >) will compare the largest element of each array.
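For instance, a small sketch of how this plays out when sorting on an array field (the sortdemo collection is hypothetical):
db.sortdemo.insertMany([{ a: [3, 10] }, { a: [5] }, { a: 4 }])
db.sortdemo.find().sort({ a: 1 })    // ascending compares the smallest element of each array: [3,10], 4, [5]
db.sortdemo.find().sort({ a: -1 })   // descending compares the largest element: [3,10], [5], 4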
Dates are stored as milliseconds elapsed since January 01, 1970 (the Unix epoch). They are 64-bit signed integers, allowing for a range of 135 million years before and after 1970. A negative date value denotes a date before January 01, 1970. The BSON specification refers to the date type as UTC DateTime.
var now = new Date()
db.learnmongo.insertOne({date: now, offset: now.getTimezoneOffset()})
{ acknowledged: true,
insertedId: ObjectId("63cb6b169ed70232acef3693") }
var rec = db.learnmongo.findOne({"offset": -330})
rec
{ _id: ObjectId("63cb6b169ed70232acef3693"),
date: 2023-01-21T04:32:13.623Z,
offset: -330 }
var localnow = new Date(rec.date.getTime() - (rec.offset * 60000))
localnow
2023-01-21T10:02:13.623Z
ObjectId
It has 12 bytes
It is ordered
Sorting by _id will sort by creation time for each document, down to one-second granularity
The creation time can be accessed by .getTimeStamp() in the shell
var id = rec._id
id
ObjectId("64630935bb8576d9b24a09fb")
id.getTimestamp()
2023-05-16T04:40:21.000Z
Some operations are atomic at the document operation level:
- update()
- findAndModify()
- remove()
Starting from version 4.4, we can set a global default read concern on the replica set and sharded cluster level. The implicit write concern is w:majority, which means that the write will be acknowledged after it’s been propagated to a majority of the nodes in the cluster.
If we use a read concern that is majority or higher, then we can make sure that the data we read is write-committed in the majority of nodes, making sure that we avoid the read uncommitted problem that we described at the beginning of this section.
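A hedged sketch of setting those cluster-wide defaults with the setDefaultRWConcern admin command (available from 4.4 onward):
db.adminCommand({
  setDefaultRWConcern: 1,
  defaultReadConcern: { level: "majority" },
  defaultWriteConcern: { w: "majority" }
})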
If the Many side is bounded, store the references to the Many documents in the One document:
Person: {"addresses": [ ObjectId('590a56743e37d79acac26a44'), ObjectId('590a56743e37d79acac26a46'), ObjectId('590a56743e37d79acac26a54') ]}
If the Many side is unbounded, store the reference to the One document in each Many document:
Address: {"person": ObjectId("590a530e3e37d79acac26a41")}
Search
{ name : "Macbook Pro late 2019 16in" ,
manufacturer : "Apple" ,
price: 2000 ,
keywords : [ "Macbook Pro late 2016 15in", "2000", "Apple", "macbook", "laptop", "computer" ]
}
db.products.createIndex( { keywords: 1 } )
The above index works, but it is not efficient: it relies on duplicating data into a keywords list just to filter. This is not an efficient or flexible approach, as we need to keep the keywords lists in sync, we can't use stemming, and we can't rank results (it's more like filtering than searching). The only advantage of this method is that it is slightly quicker to implement.
Only one text index per collection (except for Atlas Search SaaS) can be declared in one or multiple fields. The text index supports stemming, tokenization, exact phrase (“ “), negation (-), and weighting results.
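A hedged sketch of a text index on the same products collection, showing phrase, negation, and weighting (the weights chosen here are arbitrary):
db.products.createIndex({ name: "text", keywords: "text" }, { weights: { name: 10, keywords: 1 } })
db.products.find(
  { $text: { $search: "\"Macbook Pro\" laptop -refurbished" } },
  { score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } })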
metaField is a field that can store any kind of metadata that is useful for our querying, such as sensor-unique IDs.
The time series collection has an index on timeField, so we can query it really effectively within any time period that we need. For more complex queries, we can use the aggregation framework.
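A minimal sketch of creating and querying a time series collection (the sensor_readings collection and its fields are assumptions for illustration):
db.createCollection("sensor_readings", {
  timeseries: { timeField: "ts", metaField: "sensorId", granularity: "minutes" }
})
db.sensor_readings.insertOne({ ts: new Date(), sensorId: "sensor-42", temperature: 21.5 })
db.sensor_readings.find({ ts: { $gte: ISODate("2023-01-01"), $lt: ISODate("2023-02-01") } })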
We will be using Quarkus and MongoDB for the examples.
All the snippets are available at git@github.com:debu999/oreillymastermongodb.git
bulkbookinsert = function()
{
for(i=0;i<1000;i++)
{
db.books.insertOne({name: "Mongo DB Book Number" + i})
}
}
< [Function: bulkbookinsert]
fastBulkBookInsert = function()
{
var bulk = db.books.initializeUnorderedBulkOp();
for(i=0;i<1000;i++)
{
bulk.insert({name: "Mongo DB Book Number" + i})
}
bulk.execute();
}
< [Function: fastBulkBookInsert]
db.bookOrders.find()
{
_id: ObjectId("646ecc8611d51b196ad8c61e"),
name: 'Mastering Mongodb',
available: 99,
isbn: 101
}
# With the following series of operations in a single bulk operation, we are adding one book to the inventory and then ordering 100 books, for a final total of 0 copies available:
var bulkOp = db.bookOrders.initializeOrderedBulkOp();
bulkOp.find({isbn: 101}).updateOne({$inc: {available : 1}}); // increment by 1
bulkOp.find({isbn: 101}).updateOne({$inc: {available : -100}});
bulkOp.execute();
db.bookOrders.find()
{
_id: ObjectId("646ecc8611d51b196ad8c61e"),
name: 'Mastering Mongodb',
available: 0,
isbn: 101
}
When executing through an ordered list of operations, MongoDB will split the operations into batches of 1000 and group these by operation
bulkWrite arguments, as shown in the following code snippet, are the series of operations we want to execute; WriteConcern (the default is again 1), and if the series of write operations should get applied in the order that they appear in the array (they will be ordered by default):
The following operations are the same ones supported by bulk:
insertOne
updateOne
updateMany
deleteOne
deleteMany
replaceOne
updateOne, deleteOne, and replaceOne have matching filters; if they match more than one document, they will only operate on the first one. It’s important to design these queries so that they don’t match more than one document; otherwise, the behavior will be undefined.
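A hedged sketch of bulkWrite() in the shell, combining several of the operations listed above on the bookOrders collection (the inserted document is illustrative):
db.bookOrders.bulkWrite([
  { insertOne: { document: { name: "Mastering MongoDB 7.0", isbn: 102, available: 10 } } },
  { updateOne: { filter: { isbn: 101 }, update: { $inc: { available: 5 } } } },
  { deleteOne: { filter: { isbn: 999 } } }
], { ordered: true, writeConcern: { w: "majority" } })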
Administration is generally performed on three different levels, ranging from more generic to more specific: process, collection, and index.
At the process level, there is the shutDown command to shut down the MongoDB server.
At the database level, we have the following commands:
dropDatabase to drop the entire database
listCollections to retrieve the collection names in the current database
copyDB or clone to clone a remote database locally
repairDatabase for when our database is not in a consistent state due to an unclean shutdown.
In comparison, at the collection level, the following commands are used:
drop: To drop a collection
create: To create a collection
renameCollection: To rename a collection
cloneCollection: To clone a remote collection to our local database
cloneCollectionAsCapped: To clone a collection into a new capped collection
convertToCapped: To convert a collection to a capped one.
At the index level, we can use the following commands:
createIndexes: To create new indexes in the current collection
listIndexes: To list existing indexes in the current collection
dropIndexes: To drop all indexes from the current collection
reIndex: To drop and recreate an index in the current collection
compact / currentOp / killOp / collMod
db.runCommand({ compact: '<collection>', force: true }): defragment a collection on disk to reclaim space
db.currentOp(): list the currently running operations in MongoDB
db.runCommand({ "killOp": 1, "op": <operationId> }): kill an operation if needed
collMod helps with adding validation to collections. The following validator ensures that every insert includes both isbn and name, so db.bookOrders.insert({isbn: 102}) will fail with the message MongoBulkWriteError: Document failed validation.
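A hedged sketch of such a validator using $jsonSchema (validationLevel and validationAction are shown with their usual defaults):
db.runCommand({
  collMod: "bookOrders",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["isbn", "name"],
      properties: {
        isbn: { bsonType: ["int", "long", "double"] },
        name: { bsonType: "string" }
      }
    }
  },
  validationLevel: "strict",
  validationAction: "error"
})
db.bookOrders.insert({ isbn: 102 })   // fails: Document failed validation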
Default roles in MongoDB, ordered from most powerful to least powerful (root combines userAdminAnyDatabase, dbAdminAnyDatabase, and readWriteAnyDatabase, among others):
root
dbAdminAnyDatabase
userAdminAnyDatabase
readWriteAnyDatabase
readAnyDatabase
dbOwner
dbAdmin
userAdmin
readWrite
read
MongoDB Stable API
ApiVersion=1 supports any of the following commands:
abortTransaction: To terminate a multi-document (also known as distributed) transaction and roll back its results.
aggregate: (with limitations) To execute an aggregation pipeline.
authenticate: To authenticate a client using the x.509 authentication mechanism.
count: Added to the Stable API in MongoDB 6.0 (and backported to 5.0.9), this counts the number of documents in a collection or a view.
collMod: To modify view definitions or add options to a collection.
commitTransaction: To commit a multi-document transaction.
create: (with limitations) To create a collection or view.
createIndexes: (with limitations) To create one or more indexes in a collection.
delete: To remove one or more documents from a collection.
drop: To remove an entire collection.
dropDatabase: To remove an entire database.
dropIndexes: To remove one or more indexes from a collection.
endSessions: To expire specific sessions after waiting for the timeout period.
explain: (output may change in future versions) To get an execution query plan for MongoDB operations.
find: (with limitations) To execute a query against a collection.
findAndModify: To execute a query against a collection and modify one or more documents in the result set.
getMore: To fetch more documents in commands that use a cursor to return results in batches.
insert: To insert one or more documents in a collection.
hello: To return information about the MongoDB server. This may include primary/secondary replica set information as well as the authentication methods supported and other role-level information.
killCursors: To delete cursors that are returned from queries that return results in batches.
listCollections: To list collections in the database.
listDatabases: To list databases.
listIndexes: To list indexes in a collection.
ping: To ping a server, equivalent to the Internet Control Message Protocol (ICMP) echo request/reply.
refreshSessions: To update the last used time for specified sessions in order to extend their active state.
update: To update one or more documents in a collection.
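A hedged sketch of opting in to the Stable API when connecting with mongosh; --apiStrict makes the server reject commands that are not part of API Version 1:
mongosh "mongodb://localhost:27017" --apiVersion 1 --apiStrict --apiDeprecationErrors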
Auditing and logging are two concepts that are, sometimes, used interchangeably in some contexts. Logging refers to program-level events that happen throughout its execution. Usually, these events are atomic operations that happen internally and contain information that is useful for development, bug identification, and fixing. Logging includes information about what happened in the level of detail that is useful for the developers. Often, logs are deleted after a short- or developer-determined amount of time.
Auditing refers to business-level events. A business-level event refers to an action that is usually performed by a user and refers to a domain rather than an implementation-level library. Auditing answers the question of “Who did What and When?” in the system. We might also want to answer the question of “Why?” in the case of a root cause investigation.
The recorded JSON messages have the following syntax:
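As a rough sketch (field names per the MongoDB auditing documentation; treat the exact shape as an assumption), an audit message looks like this:
{
  atype: <string>,                        // action type, e.g. "authenticate" or "createCollection"
  ts: { $date: <timestamp> },
  local: { ip: <string>, port: <int> },
  remote: { ip: <string>, port: <int> },
  users: [ { user: <string>, db: <string> } ],
  roles: [ { role: <string>, db: <string> } ],
  param: <document>,                      // operation-specific details
  result: <int>                           // error code, 0 on success
}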
I will not document connections, the Active Record pattern, or multi-DB connectivity for MongoDB with Quarkus. We will focus more on sample problems and data query patterns.
Advanced Querying - Quarkus
Creating documents
// Entity Class
@EqualsAndHashCode(callSuper = false)
@Data
@AllArgsConstructor
@NoArgsConstructor
@MongoEntity(collection = "bookOrders")
public class BookOrdersEntity extends ReactivePanacheMongoEntityBase {
@Id
public ObjectId id;
public String name;
public int isbn;
public double price;
public static Uni<List<BookOrdersEntity>> getAll() {
return listAll();
}
}
// Model Class
@EqualsAndHashCode(callSuper = false)
@Data
@AllArgsConstructor
@NoArgsConstructor
public class BookOrdersModel {
public String id;
public String name;
public int isbn;
public double price;
}
// Resource Class
@ApplicationScoped
@Path("/bookOrders")
@GraphQLApi
public class BookResource {
@Inject
ObjectMapper mapper;
@Inject
BookOrdersMapper bookOrdersMapper;
@GET
@Path("/fetchAll")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public Uni<List<BookOrdersModel>> fetchBookOrdersApi() {
return BookOrdersEntity.getAll().log("ALL_BOOKS").map(b -> bookOrdersMapper.fromBookEntities(b));
}
@Query("BookOrders")
public Uni<List<BookOrdersModel>> fetchBookOrdersGraphQL() {
return this.fetchBookOrdersApi();
}
}
// Mappers needed for SmallRye GraphQL due to the JSON-B ObjectId issue: https://github.com/smallrye/smallrye-graphql/issues/1753
@Retention(RetentionPolicy.CLASS)
@Mapping(target = "id", expression = "java(org.doogle.mappers.utils.ObjectIdUtils.toObjectId(source.getId()))")
public @interface ToEntity {
}
@Retention(RetentionPolicy.CLASS)
@Mapping(target = "id", expression = "java(org.doogle.mappers.utils.ObjectIdUtils.fromObjectId(source.getId()))")
public @interface ToModel {
}
@Mapper(componentModel = "jakarta")
public abstract class BookOrdersMapper {
@ToModel
@Named("fromBookEntity")
public abstract BookOrdersModel fromBookEntity(BookOrdersEntity source);
@ToEntity
@Named("toBookEntity")
public abstract BookOrdersEntity toBookEntity(BookOrdersModel source);
@IterableMapping(qualifiedByName = "fromBookEntity")
public abstract List<BookOrdersModel> fromBookEntities(List<BookOrdersEntity> source);
@IterableMapping(qualifiedByName = "toBookEntity")
public abstract List<BookOrdersEntity> toBookEntities(List<BookOrdersModel> source);
}
public class ObjectIdUtils {
public static ObjectId toObjectId(String objectId) {
return StringUtils.isNotBlank(objectId) ? new ObjectId(objectId) : null;
}
public static String fromObjectId(ObjectId objectId) {
return ObjectUtils.isNotEmpty(objectId) ? objectId.toString() : null;
}
}
API or GraphQL queries
// Input
query bookOrders {
BookOrders {
id
isbn
name
price
}
}
// result
{
"data": {
"BookOrders": [
{
"id": "646ecc8611d51b196ad8c61e",
"isbn": 101,
"name": "Mastering Mongodb",
"price": 0
}
]
}
}
// API
curl -X 'GET' \
'http://localhost:8080/bookOrders/fetchAll' \
-H 'accept: application/json'
// Result
[
{
"id": "646ecc8611d51b196ad8c61e",
"name": "Mastering Mongodb",
"isbn": 101,
"price": 0
}
]
Create or Update Documents
// New change BookResource.java
@Mutation("createUpdateBookOrder")
public Uni<BookOrdersModel> createBookOrderGraphQL(BookOrdersModel bookOrder) {
return this.createBookOrderApi(bookOrder);
}
@POST
@Path("/createorupdate")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public Uni<BookOrdersModel> createBookOrderApi(@RequestBody BookOrdersModel bookOrder) {
Uni<BookOrdersModel> bookOrderModel = Uni.createFrom().item(bookOrder).map(b -> bookOrdersMapper.toBookEntity(b)).call(BookOrdersEntity::persistOrUpdateBookOrdersEntity).log().map(be -> bookOrdersMapper.fromBookEntity(be));
return bookOrderModel.log();
}
// New Changes BookEntity.java
public static Uni<BookOrdersEntity> persistOrUpdateBookOrdersEntity(BookOrdersEntity entity)
{
return persistOrUpdate(entity).replaceWith(entity).log("BOOK_ORDER_ENTITY_PERSISTED");
}
// GraphQL
mutation createUpdateBookOrders {
create: createUpdateBookOrder(
bookOrder: {isbn: 101, name: "Mastering MongoDB with GraphQL", price: 198})
{
id
isbn
name
price
}
update: createUpdateBookOrder(
bookOrder: {
id:"647825ed5ad4627619cf4f62",isbn: 101, name: "Mastering MongoDB with GraphQL Update", price: 199})
{
id
isbn
name
price
}
}
// Result
{
"data": {
"create": {
"id": "6478262c5ad4627619cf4f63",
"isbn": 101,
"name": "Mastering MongoDB with GraphQL",
"price": 198
},
"update": {
"id": "647825ed5ad4627619cf4f62",
"isbn": 101,
"name": "Mastering MongoDB with GraphQL Update",
"price": 199
}
}
}
// Rest API - Create no id passed
curl -X 'POST' 'http://localhost:8080/bookOrders/createorupdate' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{ "isbn": 104, "name": "Mastering MonogoDb via Rest API 104", "price": 150 }'
// Result
{
"id": "6478267f5ad4627619cf4f64",
"name": "Mastering MonogoDb via Rest API 104",
"isbn": 104,
"price": 150
}
// Update id passed
curl -X 'POST' 'http://localhost:8080/bookOrders/createorupdate' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{ "id": "6478267f5ad4627619cf4f64", "isbn": 105, "name": "Mastering MonogoDb via Rest API 105", "price": 149 }'
// result
{
"id": "6478267f5ad4627619cf4f64",
"name": "Mastering MonogoDb via Rest API 105",
"isbn": 105,
"price": 149
}
Panache Query Language can be used to write customized queries for find criteria.
With Quarkus, Panache entities give us methods such as persist, update, persistOrUpdate, delete, findById, findByIdOptional, find, findAll, list, listAll, stream, streamAll, count, deleteAll, deleteById, mongoCollection, and mongoDatabase.
Bulk Operations Reference
insertOne
updateOne
updateMany
replaceOne
deleteOne
deleteMany
The bulkWrite() method accepts the following parameters:
A List of objects that implement WriteModel: the classes that implement WriteModel correspond to the aforementioned write operations. For example, the InsertOneModel class wraps the insertOne write operation.
BulkWriteOptions: optional object that specifies settings such as whether to ensure your MongoDB instance orders your write operations.
Batches default to 1,000 operations, so use bulk operations wisely to improve performance.
@Mutation
@Blocking
public List<ResultMap> bulkInsertBlocking(List<BookOrdersModel> bookOrdersModels) throws ExecutionException, InterruptedException {
List<BookOrdersEntity> boe = bookOrdersMapper.toBookEntities(bookOrdersModels);
List<Map<String, ?>> boelist = mapper.convertValue(boe, new TypeReference<List<Map<String, ?>>>() {
});
boelist.forEach(m -> m.values().removeIf(Objects::isNull));
List<InsertOneModel<Document>> insertOneModels = boelist.stream().map(d -> new Document(d)).map(doc -> new InsertOneModel<>(doc)).toList();
List<InsertOneModel<Document>> insertOneModels2 = boelist.stream().map(d -> new Document(d)).map(doc -> new InsertOneModel<>(doc)).toList();
BulkWriteResult r = getCollection().bulkWrite(insertOneModels, new BulkWriteOptions().ordered(true)).subscribe().asCompletionStage().get();
Log.info(r);
List<ResultMap> res = List.of(new ResultMap("wasAcknowledged", String.valueOf(r.wasAcknowledged())),
new ResultMap("insertedCount", String.valueOf(r.getInsertedCount())),
new ResultMap("matchedCount", String.valueOf(r.getMatchedCount())),
new ResultMap("deletedCount", String.valueOf(r.getDeletedCount())),
new ResultMap("modifiedCount", String.valueOf(r.getModifiedCount())));
BulkWriteResult result1 = getCollection().bulkWrite(insertOneModels2, new BulkWriteOptions().ordered(false)).subscribe().asCompletionStage().get();
Log.info(result1);
List<ResultMap> res1 = List.of(new ResultMap("wasAcknowledged", String.valueOf(result1.wasAcknowledged())),
new ResultMap("insertedCount", String.valueOf(result1.getInsertedCount())),
new ResultMap("matchedCount", String.valueOf(result1.getMatchedCount())),
new ResultMap("deletedCount", String.valueOf(result1.getDeletedCount())),
new ResultMap("modifiedCount", String.valueOf(result1.getModifiedCount())));
return Stream.of(res1, res)
.flatMap(Collection::stream).collect(Collectors.toList());
}
// Non-Blocking
@Mutation
public Uni<List<ResultMap>> bulkInsert(List<BookOrdersModel> bookOrdersModels) {
List<BookOrdersEntity> boe = bookOrdersMapper.toBookEntities(bookOrdersModels);
List<Map<String, ?>> boelist = mapper.convertValue(boe, new TypeReference<List<Map<String, ?>>>() {
});
boelist.forEach(m -> m.values().removeIf(Objects::isNull));
List<InsertOneModel<Document>> insertOneModels = boelist.stream().map(d -> new Document(d)).map(doc -> new InsertOneModel<>(doc)).toList();
List<InsertOneModel<Document>> insertOneModels2 = boelist.stream().map(d -> new Document(d)).map(doc -> new InsertOneModel<>(doc)).toList();
Uni<BulkWriteResult> result = getCollection().bulkWrite(insertOneModels, new BulkWriteOptions().ordered(true));
Uni<List<ResultMap>> res = result.map(r -> List.of(
    new ResultMap("wasAcknowledged", String.valueOf(r.wasAcknowledged())),
    new ResultMap("insertedCount", String.valueOf(r.getInsertedCount())),
    new ResultMap("matchedCount", String.valueOf(r.getMatchedCount())),
    new ResultMap("deletedCount", String.valueOf(r.getDeletedCount())),
    new ResultMap("modifiedCount", String.valueOf(r.getModifiedCount()))));
result = getCollection().bulkWrite(insertOneModels2, new BulkWriteOptions().ordered(false));
Uni<List<ResultMap>> res1 = result.map(r -> List.of(
    new ResultMap("wasAcknowledged", String.valueOf(r.wasAcknowledged())),
    new ResultMap("insertedCount", String.valueOf(r.getInsertedCount())),
    new ResultMap("matchedCount", String.valueOf(r.getMatchedCount())),
    new ResultMap("deletedCount", String.valueOf(r.getDeletedCount())),
    new ResultMap("modifiedCount", String.valueOf(r.getModifiedCount()))));
return Uni.combine().all().unis(res, res1).asTuple().map(tup -> Stream.of(tup.getItem1(), tup.getItem2())
.flatMap(Collection::stream).collect(Collectors.toList()));
}
MongoDB uses Perl Compatible Regular Expression (PCRE) version 8.42 with UTF-8 support.
db.bookOrders.find({"name": /mongo/i})
Directly using regex
db.bookOrders.find({'name': { '$regex': /mongo/ } })
db.bookOrders.find({'name': { '$regex': /mongo/i } })
db.bulkCollection.find({comment: {$regex: /^doogle/m}}) # multi-line match
db.bulkCollection.find({comment: {$regex: /^doogle/}})
Indexes that use regular expressions can only be used if our regular expression does queries for the beginning of a string that is indexed; that is, regular expressions starting with ^ or \A. If we only want to query using a starts with regular expression, we should avoid writing lengthier regular expressions, even if they will match the same strings.
Two anchored regular expressions that match the same strings (for example, /^mongo/ versus a lengthier pattern with the same prefix) will both match name values starting with mongo (case-sensitive), but the shorter one will be faster, as it stops matching as soon as it hits the sixth character of every name value.
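A hedged sketch illustrating the prefix rule with an index on name:
db.bookOrders.createIndex({ name: 1 })
db.bookOrders.find({ name: /^Mastering/ })    // anchored, case-sensitive prefix: can use tight index bounds
db.bookOrders.find({ name: /Mastering/ })     // unanchored: has to scan all index keys
db.bookOrders.find({ name: /^Mastering/i })   // case-insensitive: cannot use tight index bounds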
GOTCHA
Thread A starts querying the collection and matches the A1 document.
Thread B updates the A1 document, increasing its size and forcing MongoDB to move it to a different physical location toward the end of the storage file.
Thread A is still querying the collection. It reaches the end of the collection and finds the A1 document again with its new value.
This is rare, but it can happen in production; if we can’t safeguard from such a case in the application layer, we can use hint( { $natural : 1 } ) to prevent it
hint( { $natural : 1 } ) is supported by official drivers and the shell by appending it into an operation that returns a cursor:
db.bookOrders.find().hint( { $natural : 1 } )
db.bookOrders.compact()
Storage compression uses less disk space at the expense of CPU usage, but this trade-off is mostly worth it.
The fundamental problem that change streams solve is the need for applications to react immediately to changes in the underlying data. Modern web applications need to be reactive to data changes and refresh the page view without reloading the entire page.
Change streams also have other advantages around security:
Users can only create change streams on collections, databases, or deployments that they have read access to.
Change streams are also idempotent by design. Even in the case that the application cannot fetch the absolute latest change stream event notification ID, it can resume applying from an earlier known one and it will eventually reach the same state.
Finally, change streams are resumable. Every change stream response document includes a resume token. If the application gets out of sync with the database, it can send the latest resume token back to the database and continue processing from there. This token needs to be persisted in the application, as the MongoDB driver won't keep state across application failures and restarts. It will only keep state and retry in the case of transient network failures and MongoDB replica set elections.
A change stream requires a WiredTiger storage engine and replica set protocol version 1 (pv1). pv1 is the only supported version starting from MongoDB 4.0. Change streams are compatible with deployments that use encryption-at-rest.
To enable change stream pre- and post-images, run the following commands:
db.runCommand({collMod: "bookOrders", changeStreamPreAndPostImages: {enabled: true}})
db.runCommand({collMod: "bulkCollection", changeStreamPreAndPostImages: {enabled: true}})
Debezium reads the changes and publishes them this way, so we can easily stream them as needed.
Pipeline: This is an optional parameter that we can use to define an aggregation pipeline to be executed on each document that matches watch().
Because the change stream itself uses the aggregation pipeline, we can attach events to it.
The aggregation pipeline events we can use are as follows:
> $match
> $project
> $addFields
> $replaceRoot
> $redact
> $replaceWith
> $set
> $unset
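A minimal sketch of opening a change stream with such a pipeline from the shell (option values are illustrative; fullDocumentBeforeChange relies on the pre/post-images setting shown earlier):
const cs = db.bookOrders.watch(
  [
    { $match: { operationType: { $in: ["insert", "update"] } } },
    { $project: { operationType: 1, documentKey: 1, fullDocument: 1 } }
  ],
  { fullDocument: "updateLookup", fullDocumentBeforeChange: "whenAvailable" }
)
while (cs.hasNext()) {
  printjson(cs.next())   // each event carries a resume token in its _id field
}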
Important notes
When using a sharded database, change streams need to be opened against a mongos server. When using replica sets, a change stream can be opened against any data-bearing instance. Each change stream will open a new connection, as of 4.0.2. If we want to have lots of change streams in parallel, we need to increase the connection pool (as per the SERVER-32946 MongoDB JIRA ticket) to avoid severe performance degradation.
Production recommendations
Let’s look at some of the best recommendations by MongoDB and expert architects at the time of writing.
Replica sets
Starting from MongoDB 4.2, a change stream can still be available even if the Read Concern of the majority is not satisfied. The way to enable this behavior is by setting { majority : false }.
Invalidating events, such as dropping or renaming a collection, will close the change stream. We cannot resume a change stream after an invalidating event closes it.
As the change stream relies on the oplog size, we need to make sure that the oplog size is large enough to hold events until they are processed by the application.
We can open a change stream operation against any data-bearing member in a replica set.
Sharded clusters
On top of the considerations for replica sets, there are a few more to keep in mind for sharded clusters. They are as follows:
The change stream is executed against every shard in a cluster and will be as fast as the slowest shard
To avoid creating change stream events for orphaned documents, we need to use the new feature of ACID-compliant transactions if we have multi-document updates under sharding
We can only open a change stream operation against the mongos member in a sharded cluster.
While sharding an unsharded collection (that is, migrating from a replica set to sharding), the documentKey property of the change stream notification document will include _id until the change stream catches up to the first chunk migration.
MongoDB drivers support the following methods with queryable encryption:
We will explore multi-document ACID transactions (also known as distributed transactions) in MongoDB.
MongoDB is a non-relational database and provides only a few guarantees around ACID. Data modeling in MongoDB does not focus on BCNF, 2NF, and 3NF normalization; instead, its focus is in the opposite direction.
Atomicity
Atomicity refers to the concept that a transaction needs to follow the binary success or fail principle. If a transaction succeeds, then its results are visible to every subsequent user. If a transaction fails, then every change is rolled back to the point it was right before it started. Either all actions in a transaction occur or none at all.
Consistency
Consistency refers to the database's state. Every database operation, regardless of whether it succeeds or fails, must leave the database in a state where its data is consistent. The database must always be in a consistent state.
MongoDB falls somewhere in between eventual and strict consistency by adopting a causal consistency model. With causal consistency, any transaction execution sequence is the same as if all causally related read/write operations were executed in an order that reflects their causality.
Isolation
Database isolation refers to the view that each transaction has of other transactions that run in parallel. Isolation protects us from transactions acting on the state of parallel running, incomplete transactions that may subsequently roll back. An example of why isolation levels are essential is described in the following scenario:
Transaction A updates user 1’s account balance from £50 to £100 but does not commit the transaction.
Transaction B reads user 1’s account balance as £100.
Transaction A is rolled back, reverting user 1’s account balance to £50.
Transaction B thinks that user 1 has £100, whereas they only have £50.
Transaction B updates user 2’s value by adding £100. User 2 receives £100 out of thin air from user 1, since user 1 only has £50 in their account. Our imaginary bank is in trouble.
Isolation typically has four levels, as follows, listed from the most to the least strict:
Serializable
Repeatable read
Read committed
Read uncommitted
The problems we can run into, from the least to the most serious, depend on the isolation level, as follows:
Phantom reads
Non-repeatable reads
Dirty reads
Lost updates
Read Concerns
"local": The query returns data from the instance with no guarantee that the data has been written to a majority of the replica set members (i.e. may be rolled back). Default for reads against the primary and secondaries. Availability: Read concern "local" is available for use with or without causally consistent sessions and transactions.
"available": The query returns data from the instance with no guarantee that the data has been written to a majority of the replica set members (i.e. may be rolled back). Availability: Read concern "available" is unavailable for use with causally consistent sessions and transactions.
"majority": The query returns the data that has been acknowledged by a majority of the replica set members. The documents returned by the read operation are durable, even in the event of failure. To fulfill read concern "majority", the replica set member returns data from its in-memory view of the data at the majority-commit point. As such, read concern "majority" is comparable in performance cost to other read concerns. Availability: Read concern "majority" is available for use with or without causally consistent sessions and transactions.
Note: For operations in multi-document transactions, read concern "majority" provides its guarantees only if the transaction commits with write concern "majority". Otherwise, the "majority" read concern provides no guarantees about the data read in transactions.
"linearizable": The query returns data that reflects all successful majority-acknowledged writes that completed prior to the start of the read operation. The query may wait for concurrently executing writes to propagate to a majority of replica set members before returning results. If a majority of your replica set members crash and restart after the read operation, documents returned by the read operation are durable if writeConcernMajorityJournalDefault is set to the default state of true. With writeConcernMajorityJournalDefault set to false, MongoDB does not wait for w: "majority" writes to be written to the on-disk journal before acknowledging the writes. As such, "majority" write operations could possibly roll back in the event of a transient loss (e.g. crash and restart) of a majority of nodes in a given replica set. Availability: Read concern "linearizable" is unavailable for use with causally consistent sessions and transactions. You can specify linearizable read concern for read operations on the primary only.
"snapshot": A query with read concern "snapshot" returns majority-committed data as it appears across shards from a specific single point in time in the recent past. Read concern "snapshot" provides its guarantees only if the transaction commits with write concern "majority". If a transaction is not part of a causally consistent session, upon transaction commit with write concern "majority", the transaction operations are guaranteed to have read from a snapshot of majority-committed data.
If a transaction is part of a causally consistent session, upon transaction commit with write concern "majority", the transaction operations are guaranteed to have read from a snapshot of majority-committed data that provides causal consistency with the operation immediately preceding the transaction start.
Availability: Read concern "snapshot" is available for All read operations inside multi-document transactions with the read concern set at the transaction level.
The following methods outside of multi-document transactions:
find
aggregate
distinct (on unsharded collections)
All other read operations prohibit "snapshot".
Transactions can also perform some data definition language (DDL) operations, starting from version 4.4. A transaction can create a collection or indexes in empty collections that have already been created in the same transaction.
A collection can be created explicitly using the create_collection() operator or implicitly by creating or upserting a document targeting a collection that does not already exist.
There are still some limitations in place; for example, we cannot write/update across different shards in the same transaction. As an example, let’s say our transaction is performing the following two operations:
Adding or updating a document in the accounts collection in shard A.
Implicitly or explicitly creating a collection in shard B.
Here, this transaction will abort.
Another limitation is that when we want to create a collection or index explicitly (that is, using the create collection and index methods), we need to set the transaction read concern to local.
Finally, we cannot list collections and indexes and we cannot use any other non-CRUD and non-informational operators. This would include methods such as count() and createUser().
NOTE
We can still use the count() method to enumerate the number of documents inside a transaction by wrapping the command inside an aggregation framework operation and using $count or $group combined with $sum. MongoDB drivers will usually provide a helper method, countDocuments(filter, options), that does exactly that.
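For example, a hedged sketch of the aggregation equivalent we could run inside a transaction instead of count():
db.bookOrders.aggregate([ { $match: { isbn: 101 } }, { $count: "total" } ])
// or, via the driver helper that wraps the same pipeline:
db.bookOrders.countDocuments({ isbn: 101 })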
MongoDB also allows us to customize read_concern and write_concern per transaction.
The available read_concern levels for multi-document ACID transactions are as follows:
majority: The majority of the servers in a replica set have acknowledged the data. For this to work as expected in transactions, they must also use write_concern set to majority.
local: Only the local server has acknowledged the data. This is the default read_concern level for transactions.
snapshot: If the transaction commits with majority set to write_concern, all the transaction operations will have read from a snapshot of the majority of the committed data; otherwise, no guarantee can be made.
A snapshot read concern is also available outside of multi-document ACID transactions for the find() and aggregate() methods. It is also available for the distinct() method if the collection is not sharded.
NOTE
A read concern for transactions is set at the transaction level or higher (session or client). Setting a read concern in individual operations is not supported and is generally discouraged.
The available write_concern levels for multi-document ACID transactions are as follows:
majority: The majority of the servers in a replica set have acknowledged the data. This is the default write concern as of MongoDB 5.0.
<w>: The <w> number of servers have to acknowledge the write before it’s considered successful. w==1 will write to primary, w==2 will write to primary and one data bearing node, and so on.
<custom_write_concern_name>: We can also tag our servers and cluster them under <custom_write_concern_name>. This way, we can wait for acknowledgment from our desired number of nodes and also specify exactly which servers we want to propagate our writes to. This is useful, for example, when we have disaster recovery scenarios where one of the servers is hosted on another data center and we need to make sure that the writes are always propagated there. The operation may be rolled back if our <custom_write_concern_name> set only ends up having one server and that server steps down from being a primary before the write is acknowledged.
The default write concern will be different if we have one or more arbiter nodes in our cluster. In this case, if the number of data-bearing voting nodes is less than or equal to the majority number of voting nodes, then the default write concern falls back to 1, instead of the default value of "majority."
The "majority number of voting nodes" is calculated as 1 + floor(<number of voting nodes> / 2); floor() rounds down to the nearest integer.
The number of voting nodes is the sum of the arbiter nodes plus the data-bearing voting nodes. For example, in a primary-secondary-arbiter (PSA) deployment there are 3 voting nodes, the majority number is 1 + floor(3/2) = 2, and there are only 2 data-bearing nodes, so the implicit default write concern falls back to w: 1.
NOTE
Transaction read and write concerns will fall back from the transaction, then to the session level, and finally to the MongoDB client-level defaults if unset.
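A minimal sketch of a multi-document transaction from the shell with the read and write concerns set at the transaction level (the test database name and the second write are assumptions for illustration):
const session = db.getMongo().startSession()
session.startTransaction({ readConcern: { level: "snapshot" }, writeConcern: { w: "majority" } })
try {
  const orders = session.getDatabase("test").bookOrders
  orders.updateOne({ isbn: 101 }, { $inc: { available: -1 } })
  orders.insertOne({ isbn: 101, type: "order", orderedAt: new Date() })
  session.commitTransaction()    // both writes become visible atomically
} catch (e) {
  session.abortTransaction()     // roll back every change made in the transaction
  throw e
} finally {
  session.endSession()
}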
Some other transaction limitations as of MongoDB 5.3 are as follows:
We can’t write to any system or capped collections.
We can’t read or write any collections in the admin, config, or local databases.
We can’t inspect the query planner using explain().
We can’t read from a capped collection using a snapshot read concern.
The getMore operation on cursors must be created and accessed either inside or outside the transaction. This means that if we create the cursor inside a transaction we can only use getMore() on the cursor inside it. The same goes for creating it outside the transaction.
We can’t start a transaction by invoking killCursors() as the first operation.
The best practices and limitations of multi-document ACID transactions
The transaction timeout is set to 60 seconds.
As a best practice, any transaction should not try to modify more than 1,000 documents. There is no limitation in reading documents during a transaction.
The oplog will record a single entry for a transaction, meaning that this is subject to the 16 MB document size limit. This is not such a big problem with transactions that update documents, as only the delta will be recorded in the oplog. It can, however, be an issue when transactions insert new documents, in which case the oplog will record the full contents of the new documents.
We should add application logic to cater to failing transactions. These could include using retry-able writes or executing some business logic-driven action when the error cannot be retried or we have exhausted our retries (usually, this means a custom 500 error).
DDL operations such as modifying indexes, collections, or databases will get queued up behind active transactions. Transactions trying to access the namespace while a DDL operation is still pending will immediately abort.
Transactions only work in replica sets. Starting from MongoDB 4.2, transactions will also be available for sharded clusters.
Use them sparingly; maybe the most important point to consider when developing using MongoDB transactions is that they are not meant as a replacement for good schema design. They should only be used when there is no other way to model our data without them.
The $out stage has to be the final stage in an aggregation pipeline, outputting data to an output collection that will be completely erased and overwritten if it already exists. The $out operator cannot output to a sharded collection.
Additionally, the $merge stage has to be the final stage in an aggregation pipeline, but it can insert, merge, replace, or keep existing documents. It can also process the documents using a custom update pipeline. It can replace all the documents in a collection but only if the aggregation results output matches all of the existing documents in the collection. The $merge operator can also output to a sharded collection.
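A hedged sketch of ending a pipeline with $merge, writing a per-ISBN summary into a hypothetical bookInventorySummary collection:
db.bookOrders.aggregate([
  { $group: { _id: "$isbn", totalAvailable: { $sum: "$available" } } },
  { $merge: { into: "bookInventorySummary", on: "_id", whenMatched: "replace", whenNotMatched: "insert" } }
])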
The following list describes the most important aggregation pipeline stages:
$addFields: This adds new fields to documents and outputs the same number of documents as input with the added fields.
$bucket: This splits the documents into buckets based on predefined selection criteria and bucket boundaries.
$bucketAuto: This splits documents into buckets based on predefined selection criteria and attempts to evenly distribute documents among the buckets.
$collStats: This returns statistics regarding the view or collection.
$count: This returns a count of the number of documents at this stage of the pipeline.
$densify: This will create new documents and fill in the gaps in a sequence of documents where the values in the specified field are missing.
$documents: This will return literal documents from our input values. It can be used for testing or to inject documents into the pipeline programmatically.
$facet: This combines multiple aggregation pipelines within a single stage.
$fill: This will populate null and missing field values within the specified documents. We can use the linear regression or Last Observation Carried Forward (locf) methods to generate the values.
$geoNear: This returns an ordered list of documents based on the proximity to a specified field. The output documents include a computed distance field.
$graphLookup: This recursively searches a collection and adds an array field with the results of the search in each output document.
$group: This is most commonly used to group by identifier expression and to apply the accumulator expression. It outputs one document per distinct group.
$indexStats: This returns statistics regarding the indexes of the collection.
$limit: This limits the number of documents passed on to the next aggregation phase based on predefined criteria.
$listSessions: This is only available as the first step in the pipeline. It will list active sessions by querying the system.sessions collection. All sessions are initiated in memory local to each node before MongoDB syncs them with the system.sessions collection. We can list in-memory sessions using the $listLocalSessions operation in the node.
$lookup: This performs a left outer join with another collection in the same database, or with the literal documents produced by the $documents operator, adding the matching documents as an array field in each output document.
$match: This is used for filtering documents from input based on criteria.
$merge
$out: This outputs the documents from this pipeline stage to an output collection by replacing or adding to the documents that already exist in the collection.
$project: This is used for document transformation, and outputs one document per input document.
$redact: As a combination of $project and $match, this will redact the selected fields from each document and pass them on to the next stage of the pipeline.
$replaceRoot: This replaces all existing fields of the input document (including the standard _id field) with the specified fields.
$replaceWith: This will replace a document with the specified embedded document. The operation replaces all of the existing fields in the input document, including the _id field. Specify a document embedded in the input document to promote the embedded document to the top level. $replaceWith is an alias for the $replaceRoot stage.
$sample: This randomly selects a specified number of documents from the input.
$search: This is used to perform a full-text search in the input-specified fields. It can return a snippet of the text including the search term. It can only be the first step in the pipeline; MongoDB Atlas only.
$searchMeta: MongoDB Atlas only. This is not available in self-hosted clusters.
$set: This will add new fields to the specified documents. $set will reformat each document passing through the stream and may add new fields to the output documents with both the existing fields and the new ones. It is an alias of the $addFields stage.
$setWindowFields: This will group input documents into partitions (windows) and apply our operators to all documents in each partition.
$skip: This skips a certain number of documents, preventing them from passing on to the next stage of the pipeline.
$sort: This sorts the documents based on criteria.
$sortByCount: This groups incoming documents based on the value of an expression and computes the count of documents in each bucket.
$unionWith: This will perform a union of two collections to merge the results from two collections into one result set. It is similar to SQL’s UNION ALL operator.
$unset: This will remove fields from the output documents. $unset is an alias for the $project stage that removes fields.
$unwind: This transforms an array of n elements into n documents, mapping each document to one element of the array. The documents are then passed on to the next stage of the pipeline.
Set Operators
$allElementsTrue: This is true if all of the elements in the set evaluate to true.
$anyElementTrue: This is true if any of the elements in the set evaluate to true.
$setDifference: This returns the documents that appear in the first input set but not the second.
$setEquals: This is true if the two sets have the same distinct elements.
$setIntersection: This returns the intersection of all input sets (that is, the documents that appear in all of the input sets).
$setIsSubset: This is true if all documents in the first set appear in the second one, even if the two sets are identical.
$setUnion: This returns the union of all input sets (that is, the documents that appear in at least one of all of the input sets).
Array Operators
$arrayElemAt: This is used to retrieve the element at the array index position, starting from zero.
$arrayToObject: This is used to transform an array into a document.
$concatArrays: This returns a concatenated array.
$filter: This returns a subset of the array based on specified criteria.
$first: This will return the first element of the array. Note that this is different from the $first accumulator operator.
$firstN: This will return the first N elements of the array. Note that this is different from the $first accumulator operator.
$in: This returns true if the specified value is in the array; otherwise, it returns false.
$indexOfArray: This returns the index of the array that fulfills the search criteria. If it does not exist, then it will return -1.
$isArray: This returns true if the input is an array; otherwise, it returns false.
$last: This returns the last element of the array. Note that this is different from the $last accumulator operator.
$lastN: This returns the last N elements of the array. Note that this is different from the $lastN accumulator operator.
$map: This is similar to JavaScript and the map() function of other languages. This operator will apply the expression to each element of the array and return an array of the resulting values in order. It accepts named parameters.
$maxN: This returns the largest N values of the array. Note that this is different from the $maxN accumulator operator.
$minN: This returns the smallest N values of the array. Note that this is different from the $minN accumulator operator.
$objectToArray: This operator will convert a document into an array of documents representing key-value pairs.
$range: This outputs an array containing a sequence of integers according to user-defined inputs.
$reduce: This reduces the elements of an array to a single value according to the specified input.
$reverseArray: This returns an array with the elements in reverse order.
$size: This returns the number of items in the array.
$slice: This returns a subset of the array.
$sortArray: This operator sorts the elements of the array. The array can contain simple values, where we can define 1 for ascending and –1 for descending order. Or it can contain documents, where we can define the field to sort by the order direction in the same way.
$zip: This returns a merged array.
Date Operators
$dateAdd: This is used to add a number of time units to a date object.
$dateDiff: This is used to get the delta difference between two dates in the defined unit (year, month, and second).
$dateFromParts: This constructs a date object from a set of fields. It can be either year/month-millisecond or an isoWeekDate format with year/week/dayOfWeek… /millisecond format.
$dateFromString: This converts a string into a date object according to the defined format. The default format is %Y-%m-%dT%H:%M:%S.%LZ.
$dateSubtract: This subtracts a number of time units from a date object. It returns a date.
$dateToParts: This returns a document with the year/month...milliseconds fields from a date object. It can also return an ISO week date format by setting ISO8601 to true.
$dateToString: This will return the string representation of a date.
$dateTrunc: This will truncate a date. We can use binsize and unit to truncate appropriately. For example, binsize=2 and unit=hours will truncate 11:30 a.m. to 10:00 a.m., truncating to the nearest multiple of 2-hour bins.
$dayOfMonth: This is used to return the day of the month within a range of 1 to 31.
$dayOfWeek: This is used to get the day of the week, starting from Sunday(1) to Saturday(7).
$dayOfYear: This will return the day of the year. It starts from 1, all the way to 366 for a leap year.
$isoDayOfWeek: This is used to get the day of the week in ISO 8601 format, starting from Monday(1) to Sunday(7).
$isoWeek: This is used to get the week number in the ISO 8601 date format. This would be an integer from 1 to 53 if the year has 53 weeks. The first week of the year is the week that contains the first Thursday of the year.
$isoWeekYear: This will return the year number in the ISO 8601 date format according to the date that the last week in the ISO 8601 date format ends with. A year starts on the Monday of the first week of the year and ends on the Sunday of the last week of the year, both inclusive.
For example, with an ISODate input of Sunday 1/1/2017, this operator will return 2016, as that date falls in the last ISO week of 2016, which ends on that Sunday.
$second: This will return 0 to 59 or 60 in the case that there is a leap second in the calculation.
$toDate: This will parse a value to a date object. It will return null on null or missing input. It’s a wrapper for { $convert: { input: <expression>, to: "date" } } expression.
$week: This will return 0 to 53 for the week number. 0 would be the first partial week of the year and 53 the last week of a year with a leap week.
year, month, hour, minute, and milliSecond: These will return the relevant portion of the date in zero-based numbering, except for $month, which returns a value ranging from 1(January) to 12(December).
NOTE
We can also use the $add and $subtract arithmetic operators with dates.
Literal
$literal: We use this operator to pass a value through the pipeline without parsing it. One example of usage would be a string such as $sort that we need to pass along without interpreting it as a field path.
Miscellaneous
$getField: This is useful when we have fields that start with $ or contain . in their name. It will return the value of the field.
$rand: This will return a random float between 0 and 1. It can contain up to 17 decimal digits and will truncate trailing zeros, so the actual length of the value will vary.
$sampleRate: Given a float between 0 and 1 inclusively, it will return a number of documents according to the rate. 0 for zero documents returned and 1 to return all documents. The process is non-deterministic and, as such, multiple runs will return a different number of documents. The larger the number of documents in the collection, the bigger chance that sampleRate will converge to the percentage of documents returned. This is a wrapper of the { $match: { $expr: { $lt: [ { $rand: {} }, <sampleRate> ] } } } expression.
Object Operators
$mergeObjects: This will combine multiple objects within a simple output document.
$objectToArray: This will convert an object into an array of documents with all key-value pairs.
$setField: This is useful when we have fields that start with $ or contain . in their name. It will create, update, or delete the specified field in the document.
String Operators
$concat: This is used to concatenate strings.
$dateFromString: This is used to convert a DateTime string into a date object.
$dateToString: This is used to parse a date into a string.
$indexOfBytes: This is used to return the byte index of the first occurrence of a substring in a string.
$ltrim: This is used to delete whitespace or the characters from the beginning on the left-hand side of the string.
$regexFind: This will find the first match of the regular expression in the string. It returns information about the match.
$regexFindAll: This will find all matches of the regular expression in the string. It returns information about all matches.
$regexMatch: This will return true or false if it can match the regular expression in the string.
$replaceOne: This will replace the first instance of a matched string in the input.
$replaceAll: This will replace all instances of the matched string in the input.
$rtrim: This will remove whitespace or the specified characters from the end on the right-hand side of a string.
$strLenBytes: This is the number of bytes in the input string.
$strcasecmp: This is used in case-insensitive string comparisons. It will return 0 if the strings are equal, 1 if the first string is greater; otherwise, it will return -1.
$substrBytes: This returns the specified bytes of the substring.
$split: This is used to split strings based on a delimiter. If the delimiter is not found, then the original string is returned.
$toString: This will convert a value into a string.
$trim: This will remove whitespace or the specified characters from both the beginning and the end of a string.
$toLower/$toUpper: These are used to convert a string into all lowercase or all uppercase characters, respectively.
The equivalent methods for code points (a value in Unicode, regardless of the underlying bytes in its representation) are listed as follows:
$indexOfCP
$strLenCP
$substrCP
Text Operators
$meta: This will return metadata from the aggregation operation for each document.
Timestamp Operators
$tsIncrement: This will return the incrementing ordinal from a timestamp as a Long value.
$tsSecond: This will return the second value from a timestamp as a Long value.
Trigonometry Operators
$sin: This will return the sine of a value in radians.
$cos: This will return the cosine of a value in radians.
$tan: This will return the tangent of a value in radians.
$asin: This will return the inverse sine, also known as arc sine, angle of a value in radians.
$acos: This will return the inverse cosine, also known as arc cosine, angle of a value in radians.
$atan: This will return the inverse tangent, also known as arc tangent, angle of a value in radians.
$atan2: With inputs x and y, in that order, it will return the inverse tangent, also known as arc tangent, angle of an x/y expression.
$asinh: This will return the inverse hyperbolic sine, also known as hyperbolic arc sine, of a value in radians.
$acosh: This will return the inverse hyperbolic cosine, also known as hyperbolic arc cosine, of a value in radians.
$atanh: This will return the inverse hyperbolic tangent, also known as hyperbolic arc tangent, of a value in radians.
$sinh: This will return the hyperbolic sine of a value in radians.
$cosh: This will return the hyperbolic cosine of a value in radians.
$tanh: This will return the hyperbolic tangent of a value in radians.
$degreesToRadians: This will convert a value into radians from degrees.
$radiansToDegrees: This will convert a value into degrees from radians.
Type Operators
$convert: This will convert a value into our target type. For example, { $convert: { input: 1337, to: "bool" } } will output true because every nonzero value evaluates to true. Likewise, { $convert: { input: false, to: "int" } } will output 0.
$isNumber: This will return true if the input expression evaluates to any of the following types: integer, decimal, double, or long. It will return false if the input expression is missing, evaluates to null, or evaluates to any BSON type other than the numeric ones mentioned earlier.
$toBool: This will convert a value into its Boolean representation or null if the input value is null.
It’s a wrapper around the { $convert: { input: <expression>, to: "bool" } } expression.
$toDate: This will convert a value into a date. Numerical values will be parsed as the number of milliseconds since Unix epoch time. Additionally, we can extract the timestamp value from an ObjectId object.
It’s a wrapper around the { $convert: { input: <expression>, to: "date" } } expression.
$toDecimal: This will convert a value into a Decimal128 value.
It’s a wrapper around the { $convert: { input: <expression>, to: "decimal" } } expression.
$toDouble: This will convert a value into a Double value.
It’s a wrapper around the { $convert: { input: <expression>, to: "double" } } expression.
$toInt: This will convert a value into an Integer value.
It’s a wrapper around the { $convert: { input: <expression>, to: "int" } } expression.
$toLong: This will convert a value into a Long value.
It’s a wrapper around the { $convert: { input: <expression>, to: "long" } } expression.
$toObjectId: This will convert a value into an ObjectId value.
$toString: This will convert a value into a string.
$type: This will return the type of the input or the “missing” string if the argument is a field that is missing from the input document.
NOTE
When converting to a numerical value using any of the relevant $to<> expressions mentioned earlier, a string input has to follow the base 10 notation; for example, “1337” is acceptable, whereas its hex representation, “0x539,” will result in an error.
The input string can’t be a float or decimal.
When converting to a numerical value using any of the relevant $to<> expressions mentioned earlier, a date input will output as a result the number of milliseconds since its epoch time. For example, this could be the Unix epoch time, at the beginning of January 1st, 1970, for the ISODate input.
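For example, a minimal sketch of using $convert in a pipeline; the accounts collection and balance field are assumed. It converts a string balance to an integer, falling back to null on a conversion error and 0 when the field is null or missing:
db.accounts.aggregate([
  { $project: {
      balanceInt: { $convert: { input: "$balance", to: "int", onError: null, onNull: 0 } }
  } }
])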
Arithmetic Operators
$abs: This is the absolute value.
$add: This can add numbers or a number to a date to get a new date.
$ceil/$floor: These are the ceiling and floor functions, respectively.
$divide: This divides the first input by the second.
$exp: This raises the natural number, e, to the specified exponential power.
$ln/$log/$log10: These are used to calculate the natural log, the log on a custom base, or a log base ten, respectively.
$mod: This is the modular value.
$multiply: This multiplies the inputs together.
$pow: This raises a number to the specified exponential power.
$round: This will round a number to an integer or to a specified number of decimal places. Rounding X.5 goes to the nearest even number, so 11.5 and 12.5 will both round to 12. We can also specify a negative precision, N, to round the Nth digit to the left-hand side of the decimal point. For example, 1234.56 with N=-2 will round to 1200, as 1234.56 is closer to 1200 than to 1300.
$sqrt: This is the square root of the input.
$subtract: This is the result of subtracting the second value from the first. If both arguments are dates, it returns the difference between them. If one argument is a date (this argument has to be the first argument) and the other is a number, it returns the resulting date.
$trunc: This is used to truncate the result.
Aggregation accumulators
$accumulator: This will evaluate a user-defined accumulator and return its result.
$addToSet: This will add an element (only if it does not exist) to an array, effectively treating it as a set. It is only available at the group stage.
$avg: This is the average of the numerical values. It ignores non-numerical values.
$bottom: This returns the bottom element within a group according to the specified sort order. It is available at the $group and $setWindowFields stages.
$bottomN: This will return the aggregated bottom N fields within the group as per the defined sort order. We can use it within the $group and $setWindowFields stages.
$count: This will return the count of documents in a group. Note that this is different from the $count pipeline stage. We can use it within the $group and $setWindowFields stages.
$first/$last : These are the first and last value that passes through the pipeline stage. They are only available at the group stage.
$firstN/$lastN : These will return an array of the first or last N elements within the group. Note that they are different from the $firstN and $lastN array operators.
$max/$min : These get the maximum and minimum values that pass through the pipeline stage, respectively.
$maxN: This will return an array of the N maximum-valued elements within the group. We can use it within the $group and $setWindowFields stages or as an expression. Note that this is different from the $maxN array operator.
$push: This will add a new element at the end of an input array. It is only available at the group stage.
$stdDevPop/$stdDevSamp : These are used to get the population/sample standard deviation in the $project or $match stages.
$sum: This is the sum of the numerical values. It ignores non-numerical values.
$top/$topN : These will return the top element or an array of the top N elements within the group, respectively, according to the defined sort order.
NOTE
Accumulators in other stages
Some operators are available in stages other than the $group stage, sometimes with different semantics. The following accumulators are stateless in other stages and can take one or multiple arguments as input.
These operators are $avg, $max, $min, $stdDevPop, $stdDevSamp, and $sum.
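For example, a minimal sketch of a $group stage using several accumulators; the transactions collection and field names are assumed, and the $count accumulator requires MongoDB 5.0 or later:
db.transactions.aggregate([
  { $group: {
      _id: "$from_address",                 // group transactions by sender address
      totalValue: { $sum: "$value" },       // total value sent per address
      avgValue: { $avg: "$value" },         // average value per transaction
      txCount: { $count: {} }               // number of transactions in the group
  } }
])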
Conditional Expressions
$cond
$ifNull
$switch
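For example, a sketch combining the three conditional operators; the books collection and the price/discount fields are assumptions:
db.books.aggregate([
  { $project: {
      name: 1,
      discount: { $ifNull: [ "$discount", 0 ] },                      // default missing discounts to 0
      onSale: { $cond: [ { $gt: [ "$discount", 0 ] }, true, false ] }, // flag discounted books
      priceCategory: {
        $switch: {
          branches: [
            { case: { $gte: [ "$price", 100 ] }, then: "premium" },
            { case: { $gte: [ "$price", 30 ] }, then: "standard" }
          ],
          default: "budget"
        }
      }
  } }
])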
Custom aggregation expression operators
$accumulator: This will define a custom accumulator operator. An accumulator maintains its state as documents go through the pipeline stages. For example, we could use an accumulator to calculate the sum, maximum, and minimum values.
An accumulator can be used in the following stages: $bucket, $bucketAuto, and $group.
Note: This cannot be used if server-side JavaScript is not available (disabled).
$function: This will define a generic custom JavaScript function.
NOTE
To use these operators, we need to enable server-side scripting. In general, enabling server-side scripting is not recommended by MongoDB due to performance and security concerns.
Implementing custom functions in JavaScript in this context comes with a set of risks around performance and should only be used as a last resort.
Data size operators
$binarySize: This will return the size of the input in bytes. The input can be a string or binary data. The binary size per character can vary; for example, English alphabet characters are encoded using 1 byte, Greek characters use 2 bytes per character, and ideograms may use 3 or more bytes.
$bsonSize: This will take a document’s representation as input, for example, an object, and return the size, in bytes, of its BSON-encoded representation.
A null input will return a null output, and any input other than an object will result in an error.
Variable expression operators
$let: We can use a variable expression operator to define a variable within a subexpression in the aggregation stage using the $let operator. The $let operator defines variables for use within the scope of a subexpression and returns the result of the subexpression. It accepts named parameters and any number of argument expressions.
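For example, a sketch that computes a discounted price once and reuses it; the price and discount fields are assumptions:
db.books.aggregate([
  { $project: {
      name: 1,
      finalPrice: {
        $let: {
          vars: { discounted: { $multiply: [ "$price", { $subtract: [ 1, "$discount" ] } ] } },
          in: { $round: [ "$$discounted", 2 ] }   // reuse the variable and round to 2 decimal places
        }
      }
  } }
])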
Window operators
We can define a span of documents from a collection by using an expression in the $partitionBy field during the $setWindowFields aggregation stage.
Partitioning our collection’s documents in spans will create a “window” into our data, resembling window functions in relational databases:
$addToSet: This will apply an input expression to each document of the partition and return an array of all of the unique values.
$avg: This will return the average of the input expression values, for each document of the partition.
$bottom: This will return the bottom element of the group according to the specified input order.
$bottomN: This will return an array of the bottom N elements within the group, according to the specified sort order.
$count: This will return the number of documents in the group or window.
$covariancePop: This will return the population covariance of two numerical expressions.
$covarianceSamp: This will return the sample covariance of two numerical expressions.
$denseRank: This will return the position of the document (rank) in the $setWindowFields partition. Tied documents will receive the same rank. The rank numbering is consecutive.
$derivative: This will return the first derivative, that is, the average rate of change within the input window.
$documentNumber: This will return the position of the document in the partition. In contrast with the $denseRank operator, ties will result in consecutive numbering.
$expMovingAvg: With a numerical expression as input, this operator will return the exponential moving average calculation.
$first: This will apply the input expression to the first document in the group or window and return the resulting value.
$integral: Similar to the mathematical integral, this operator will return the calculation of an area under a curve.
$last: This will apply the input expression to the last document in the group or window and return the resulting value.
$linearFill: This will use linear interpolation based on surrounding field values to fill null and missing fields in a window.
$locf: This will use the last non-null value for a field to set values for subsequent null and missing fields in a window.
$max: This will apply the input expression to each document and return the maximum value.
$min: This will apply the input expression to each document and return the minimum value.
$minN: This will return an array of the N minimum-valued elements in the group. This is different from the $minN array operator.
$push: This will apply the input expression to each document and return a result array of values.
$rank: This will return the document position, meaning the rank of this document relative to the rest of the documents in the $setWindowFields stage partition.
$shift: This will return the value from an expression applied to a document in a specified position relative to the current document in the $setWindowFields stage partition.
$stdDevPop: This will apply the input numerical expression to each document in the window and return the population’s standard deviation.
$stdDevSamp: This will apply the input numerical expression to each document in the window and return the sample standard deviation.
$sum: This will apply the input numerical expression to each document in the window and return the sum of their values.
$top: This will return the top element within the group, respecting the specified sorting order.
$topN: This will return an array of the top N elements within the group, respecting the specified sorting order.
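For example, a sketch of a $setWindowFields stage; the transactions collection and the from_address, timestamp, and value fields are assumptions:
db.transactions.aggregate([
  { $setWindowFields: {
      partitionBy: "$from_address",            // one window partition per sender address
      sortBy: { timestamp: 1 },
      output: {
        runningTotal: { $sum: "$value", window: { documents: [ "unbounded", "current" ] } },
        valueRank: { $rank: {} }               // rank within the partition by the sortBy order
      }
  } }
])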
Type conversion operators
Introduced in MongoDB 4.0, type conversion operators allow us to convert a value into a specified type. The generic syntax of the command is as follows:
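{
  $convert: {
    input: <expression>,
    to: <type expression>,
    onError: <expression>,   // optional: value to return if the conversion fails
    onNull: <expression>     // optional: value to return if the input is null or missing
  }
}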
NOTE: In the preceding syntax, input and to (the only mandatory arguments) can be any valid expression. For example, in its simplest form, we could have the following: $convert: { input: "true", to: "bool" }
NOTE: MongoDB also provides some helper functions for the most common $convert operations. These functions are listed as follows:
To create a time series collection, we need to specify the following fields (an example follows the list). In this context, a data point might refer to a sensor reading or the stock price at a specific point in time:
$timeField: This field is mandatory and is the field that stores the timestamp of the data point. It must be a Date() object.
$metaField: This field is optional and is used to store metadata for the data point. The metadata field can be an embedded document and should be used to add any data that is uniquely identifying our data point.
$granularity: This field is optional and is used to help MongoDB optimize the storage of our data points. It can be set to “seconds,” “minutes,” or “hours” and should be set to the nearest match between our data points’ consecutive readings.
$expireAfterSeconds: This field is optional. We can set it to the number of seconds that we would like MongoDB to automatically delete the data points after they are created.
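For example, a minimal sketch; the collection and field names are assumptions:
db.createCollection("sensor_readings", {
  timeseries: {
    timeField: "timestamp",   // mandatory: the Date field of each data point
    metaField: "sensor",      // optional: metadata that uniquely identifies the data point source
    granularity: "minutes"    // optional: "seconds", "minutes", or "hours"
  },
  expireAfterSeconds: 86400   // optional: delete data points one day after their timeField value
})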
Time series collections use the zstd compression algorithm instead of the snappy default used for general-purpose collections. This is configurable at collection creation time using the block_compressor=snappy|zlib|zstd parameter.
We can define the $storageEngine, $indexOptionDefaults, $collation, and $writeConcern parameters for a time series collection in the same way as a general-purpose collection.
NOTE
A time series’ maximum document size is limited to 4 MB instead of the global 16 MB.
Time series are optimized for write once, read many with sporadic updates/deletes, and this reflects the limitations in updates/deletes in favor of improved read/write performance.
MongoDB Views
We can create a MongoDB view using the shell or our programming language’s driver with the following parameters (an example follows the list):
$viewName: This field is mandatory. It refers to the name of the view.
$viewOn: This field is mandatory. It refers to the name of the collection that will be used as the data source for the view data.
$pipeline: This field is mandatory. The pipeline will execute every time we query the view. The pipeline cannot include the $out or $merge stages during any stage, including any nested pipelines.
$collation: This field is optional. We can define custom language-specific rules for string comparison, such as rules for uppercase/lowercase letters or accent marks.
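For example, a minimal sketch; the view name, source collection, pipeline, and collation values are assumptions:
db.createView(
  "expensive_books",                                  // view name
  "books",                                            // source collection
  [ { $match: { price: { $gte: 100 } } }, { $project: { name: 1, price: 1 } } ],  // pipeline
  { collation: { locale: "en" } }                     // optional collation
)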
NOTE
Querying a view will execute the aggregation pipeline every time. Its performance is limited by the performance of the aggregation pipeline stages and all pipeline limitations will be applied.
The aggregation pipeline can output results in the following three distinct ways:
Inline as a document containing the result set
In a collection
Returning a cursor to the result set
AGGREGATION OPTIMIZATION
The planner will use an index in a $match stage if $match is the first stage in the pipeline. The planner will use an index in the $sort stage if $sort is not preceded by a $group, $project, or $unwind stage.
The planner might use an index in the $group stage, but just to find the first document in each group if all the following conditions are satisfied:
$group follows a $sort stage, which sorts the grouped fields by key.
Previously, we added an index to the grouped field matching the sorting order.
$first is the only accumulator in $group.
The planner will use a geospatial index in the $geoNear stage if $geoNear is the first stage in the pipeline.
Exercises
Find the top 10 addresses that the transactions originate from.
Find the top 10 addresses that the transactions end in.
Find the average value per transaction, with statistics concerning the deviation.
Find the average fee required per transaction, with statistics concerning the deviation.
Find the time of day that the network is more active according to the number or value of transactions.
Find the day of the week in which the network is more active according to the number or value of transactions.
The average number of transactions per block, for both the total number of overall transactions and the total number of contracted internal transactions.
The average amount of gas used per block.
The average amount of gas used per transaction to a block. Is there a window of opportunity in which to submit my smart contract in a block?
The average level of difficulty per block and how it deviates.
The average number of transactions per block, both in total and also in contracted internal transactions.
Find spammers via a $lookup with the span_details collection.
Index internals
In most cases, indexes are variations of the B-tree data structure. Invented by Rudolf Bayer and Ed McCreight in 1971 while they were working at Boeing research labs, the B-tree data structure allows for searches, sequential access, inserts, and deletes to be performed in logarithmic time. The logarithmic time property stands for both the average case performance and the worst possible performance, and it is a great property when applications cannot tolerate unexpected variations in performance behavior.
Compound indexes
Compound indexes are a generalization of single-key indexes, allowing for multiple fields to be included in the same index. They are useful when we expect our queries to span multiple fields in our documents, and also for consolidating our indexes when we start to have too many of them in our collection.
NOTE
Compound indexes can have as many as 32 fields. They can only have up to one hashed index field.
A compound index is declared similarly to single indexes by defining the fields that we want to index and the order of indexing:
db.books.createIndex({"name": 1, "isbn": 1})
A compound index supports queries and sorts only when the key order (or its exact reverse) is respected. For the index above, name: 1, isbn: 1, name: -1, isbn: -1, name: 1, and name: -1 will all use the index, whereas name: 1, isbn: -1, name: -1, isbn: 1, isbn: 1, and isbn: -1 will not.
The underlying reason is that the values of our fields are stored in the index as secondary, tertiary, and so on; each one is embedded inside the previous ones, just like a matryoshka, the Russian nesting doll.
Indexing embedded documents
We can also index the embedded document as a whole, similar to indexing embedded fields:
db.books.createIndex( { "meta_data": 1 } )
With this index, only queries on the whole meta_data document can use the index; a filter on an embedded field such as meta_data.page_count will not use it.
NOTE
The db.books.find({"meta_data.average_customer_review": { $gte: 4.8}, "meta_data.page_count": { $gte: 200 } }) command will not use our meta_data index, whereas db.books.find({"meta_data": {"page_count":256, "average_customer_review":4.8}}) will use it.
If a field holds an array, we can still create an index on it (a multikey index). For example, with region: ["AP", "EU", "NA"], the query db.work.find({"region": "EU"}) will use the index. However, a single index cannot span two array fields:
db.work.createIndex({region: 1, state: 1})
// If documents also hold an array such as state: ["OD", "KA", "WB"], the index creation fails. If the index was created first and we then insert such a document, the insert fails with an error like the following:
"errmsg" : "cannot index parallel arrays [analytics_data] [tags]",
"code" : 171,
"codeName" : "CannotIndexParallelArrays"
NOTE: Hash Index cannot be multi key index.
Another limitation that we will likely run into when trying to fine-tune our database is that multikey indexes cannot cover a query completely. A compound index with multikey fields can be used by the query planner only on the non-multikey fields.
db.books.find({tags: [ "mongodb", "index", "cheatsheet", "new" ] })
// This will first search the multikey index on tags for entries that have the value "mongodb" and will then sequentially scan them to find the ones that also have the index, cheatsheet, and new tags.
This is a two-step process and not very efficient, so the data model should be planned with index usage in mind.
TODO: UNDERSTAND
A multikey index cannot be used as a shard key. However, if the shard key is a prefix index of a multikey index, it can be used.
Text indexes
Text indexes are special indexes on string value fields, which are used to support text searches.
A text index can be specified similarly to a regular index by replacing the index sort order (-1, 1) with text, as follows:
db.books.createIndex({"name": "text"})
NOTE
A collection can have at most one text index (MongoDB Atlas Search is the exception; see the following note). This text index can support multiple fields, whether text or not. It cannot support other special types, such as multikey or geospatial. Text indexes cannot be used for sorting results, even if they are only a part of a compound index.
MongoDB Atlas Search allows for multiple full-text search indexes. MongoDB Atlas Search is built on Apache Lucene and thus is not limited by MongoDB on-premise full-text search.
We can use this index to query on available, or on the combination of available and meta_data.page_count, or to sort on them if the sort order allows for traversing our index in any direction.
Text indexes will apply stemming (removing common suffixes, such as plural s/es for English language words) and remove stop words (a, an, the, and so on) from the index. Text indexes behave the same as partial indexes, not adding a document to the text index if it lacks a value for the field that we have indexed it on.
We can control the weights of different text-indexed fields, such as the following:
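// A sketch; the bookname and book_description field names are assumptions:
db.books.createIndex(
  { bookname: "text", book_description: "text" },
  { weights: { bookname: 2, book_description: 1 } }
)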
As we can see, a bookname match will have double the impact of a book_description match.
NOTE
Text indexing supports 15 languages as of MongoDB 5.3, including Spanish, Portuguese, and German. Text indexes require special configurations to correctly index in languages other than English.
Some interesting properties of text indexes are explained as follows:
Case-insensitivity and diacritic insensitivity: A text index is case-insensitive and diacritic-insensitive. Version 3 of the text index (the one that comes with version 3.4) supports common C, simple S, and the special T case folding, as described in the Unicode Character Database (UCD) 8.0 case folding. In addition to case insensitivity, version 3 of the text index supports diacritic insensitivity. This expands insensitivity to characters with accents in both small and capital letter forms. For example, e, è, é, ê, ë, and their capital letter counterparts, can all be equal when compared using a text index. In the previous versions of the text index, these were treated as different strings.
Tokenization delimiters: Version 3 of the text index supports tokenization delimiters, defined as Dash, Hyphen, Pattern_Syntax, Quotation_Mark, Terminal_Punctuation, and White_Space, as described in the UCD 8.0 case folding.
Hashed Index
A hashed index contains hashed values of the indexed field:
db.books.createIndex( { name: "hashed" } )
This will create a hashed index on the name of each book in our books collection. A hashed index is ideal for equality matches but it cannot work with range queries. If we want to perform a range of queries on fields, we can create a compound index with at most one hashed field, which will be used for equality matches.
The following mongo shell code will create a compound index on created_at_date and name, with the name field being the one and only hashed field that we can create in a compound index:
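db.books.createIndex( { created_at_date: -1, name: "hashed" } )  // the direction of created_at_date is an assumption
TTL Indexes
A TTL (time-to-live) index removes documents automatically after a specified number of seconds. A minimal sketch, assuming we want documents to expire one day after created_at_date:
db.books.createIndex( { created_at_date: 1 }, { expireAfterSeconds: 86400 } )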
The created_at_date field values have to be either a date or an array of dates (the earliest one will be used). In this example, the documents will be deleted one day (86400 seconds) after created_at_date.
Note: If the field does not exist or the value is not a date, the document will not expire. In other words, a TTL index silently fails and does not return any errors when it does.
Data gets removed by a background job, which runs every 60 seconds. As a result, there is no explicitly guaranteed accurate measure of how much longer documents will persist past their expiration dates.
NOTE
A TTL index is a regular single-field index. It can be used for queries like a regular index. A TTL index cannot be a compound index, operate on a capped or time series collection, or use the _id field. The _id field implicitly contains a timestamp of the created time for the document but is not a Date field. If we want each document to expire at a different, custom date point, we have to set {expireAfterSeconds: 0}, and set the TTL index Date field manually to the date on which we want the document to expire.
Partial indexes
A partial index on a collection is an index that only applies to the documents that satisfy the partialFilterExpression query.
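For example, a sketch of a partial index on name restricted to books priced above 30; the field names are assumptions:
db.books.createIndex(
  { name: 1 },
  { partialFilterExpression: { price: { $gt: 30 } } }
)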
Using this, we can have an index only for books that have a price greater than 30. The advantage of partial indexes is that they are more lightweight in creation and maintenance and use less storage than an index on every possible value.
The partialFilterExpression filter supports the following operators:
Equality expressions (that is, field: value, or using the $eq operator)
The $exists: true expression
The $gt, $gte, $lt, and $lte expressions
$type expressions
The $and operator, at the top level only.
Partial indexes will only be used if the query can be satisfied as a whole by the partial index.
If our query matches or is more restrictive than the partialFilterExpression filter, then the partial index will be used. If the results may not be contained in the partial index, then the index will be totally ignored.
#NirvanaMoment Just wow for Mongo Creators unlike groupby and select to be on same thing.
The fields in the partialFilterExpression filter do not need to be part of the index’s fields. The following is a valid partial index:
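db.books.createIndex(
  { name: 1 },
  { partialFilterExpression: { price: { $gte: 30 } } }  // price is not one of the indexed fields
)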
To use this partial index, however, we need to query for both name and price equal to or greater than 30.
The _id index and shard key indexes cannot be partial indexes.
We can use the same key pattern (for example, db.books.createIndex( {name: 1, price: 1} ) ) to create multiple partial indexes on the same collection but the partialFilterExpression filter has to be distinct between them.
A unique index can also be a partial index at the same time. Documents that do not satisfy the partialFilterExpression filter, that have null or the field does not exist at all, will not be subject to the unique constraint.
Unique Indexes
MongoDB creates a unique index on the _id field of every collection by default. To create our own unique index:
db.books.createIndex( { "name": 1 }, { unique: true } )
This will create a unique index on a book’s name. A unique index can also be a compound embedded field or an embedded document index.
In a compound index, the uniqueness is enforced across the combination of values in all of the fields of the index
Unique indexes do not work with hashed indexes
The only way that we can enforce a unique index across different shards is if the shard key is a prefix or the same as the unique compound index. Uniqueness is enforced in all of the fields of the unique compound index.
If a document is missing the indexed field, it will be inserted (the missing field is indexed as null). If a second document is also missing the indexed field, it will not be inserted, because it would duplicate that null value.
IMPORTANT: Indexes that are both unique and partial enforce uniqueness only on documents that match the partial filter. This means that there may be several documents with duplicate values if they are not part of the partial filtering.
To create a case-insensitive index, we use the collation parameter.
NOTE: The strength parameter is one of the collation parameters, the defining parameter for case-sensitivity comparisons. Strength levels follow the International Components for Unicode (ICU) comparison levels.
Creating the index with collation is not enough to get case-insensitive results; we need to specify the same collation in our query as well:
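// A sketch; the index fields and query values are assumptions:
db.books.createIndex( { name: 1 }, { collation: { locale: "en", strength: 1 } } )
db.books.find( { name: "mastering mongodb" } ).collation( { locale: "en", strength: 1 } )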
If we specify the same level of collation in our query as our index, then the index will be used. We could specify a different level of collation, as follows:
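db.books.find( { name: "mastering mongodb" } ).collation( { locale: "en", strength: 2 } )  // strength: 2 differs from the index's strength: 1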
Here, we cannot use the index, as our index has a level 1 collation parameter, and our query looks for a level 2 collation parameter.
NOTE
Diacritics, also known as diacritical marks, are one or more characters that have a mark near, above, or through them to indicate a different phonetic value than the unmarked equivalents. A common example is the French é, as in café.
If we don’t use any collation in our queries, we will get results defaulting to level 3, that is, case-sensitive.
Indexes in collections that were created using a different collation parameter from the default will automatically inherit this collation level.
Suppose that we create a collection with a level 1 collation parameter, as follows:
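db.createCollection("case_insensitive_books", { collation: { locale: "en", strength: 1 } })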
Default queries to this collection will be strength: 1 collation, case-insensitive, and ignoring diacritics. If we want to override this in our queries, we need to specify a different level of collation in our queries or ignore the strength part altogether. The following two queries will return case-sensitive, default collation level results in our case_insensitive_books collection:
db.case_insensitive_books.find( { name: "Mastering MongoDB" } ).collation( { locale: 'en', strength: 3 } ) // querying the colletion with the global default collation strength value, 3
db.case_insensitive_books.find( { name: "Mastering MongoDB" } ).collation( { locale: 'en' } ) // no value for collation, will reset to global default (3) instead of default for case_sensitive_books collection (1)
Collation is a very useful concept in MongoDB.
Geospatial indexes
2D geospatial indexes
2dsphere geospatial indexes
db.books.createIndex( { location: "2dsphere" } )
The location field needs to be a GeoJSON object, such as the following:
location : { type: "Point", coordinates: [ 51.5876, 0.1643 ] }
A 2dsphere index can also be a part of a compound index, in any position in the index, first or not:
db.books.createIndex( { name: 1, location : "2dsphere" })
Wildcard indexes
NOTE
The only case where we can mix the inclusion and exclusion of fields is the _id field. We can create a wildcard index with _id:0 or _id:1 but all the other fields in the wildcard declaration have to be either entirely included or entirely excluded.
To create a wildcard index in all attributes of a collection, we have to use the following command:
db.books.createIndex( { "$**": 1 } )
db.books.createIndex( { "attributes.$**": 1 } )
To include only the name and attributes.author fields (and all of their embedded fields), we can use a wildcardProjection:
db.books.createIndex( { "$**": 1 }, { "wildcardProjection": { "name": 1, "attributes.author": 1 } } )
Wildcard index generation creates single attribute indexes across all fields. This means that they are mostly useful in single-field queries and will use at most one field to optimize the query.
Wildcard indexes are essentially partial indexes and as such, they cannot support a query looking for attributes that do not exist in the documents.
Wildcard indexes will recursively index every attribute of the subdocuments and as such, cannot be used to query for exact subdocument or whole array equality ($eq) or inequality ($ne) queries.
Wildcard indexes cannot be used as keys for sharding purposes.
Hidden indexes
NOTE: A hidden index keeps all of its features (unique constraints, partial filters, and so on follow the same rules) and is still maintained on writes; the only difference is that the query planner will not use it while it is hidden.
Clustered Collections
We can define any value we want for the clustered index key, but it must be stored in the _id field of the clustered collection. The clustered index cannot be hidden.
A clustered collection needs to be created as such; we cannot convert an existing collection into a clustered one. A clustered collection can also be a TTL collection if we set the expireAfterSeconds parameter at creation time.
A clustered collection can be more space efficient than a non-clustered (default) collection but we need to explicitly skip creating a secondary index at creation time. The query optimizer will try to use the secondary index instead of the clustered index if we don’t provide a hint to skip it.
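A minimal sketch of creating a clustered collection (the collection and index names are assumptions; available from MongoDB 5.3):
db.createCollection("books_clustered", {
  clusteredIndex: { key: { _id: 1 }, unique: true, name: "books clustered index" }
})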
Managing Indexes
The fewer inserts and updates that occur during the build process, the faster the index creation will be. Few to no updates will result in the build process happening as fast as the former foreground build option.
Queries won’t see partial index results. Queries will start getting results from an index only after it is completely created.
An index filter is more powerful than a hint() parameter. MongoDB will ignore any hint() parameter if it can match the query shape to one or more index filters.
Index filters are only held in memory and will be erased on server restart.
We cannot use hint() when the query contains a $text query expression or while an index is hidden.
The MongoDB query planner will always choose the index filter and ignore the hint() parameter if an index filter already exists for our query.
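For example, a sketch of setting an index filter with the planCacheSetFilter command; the collection, query shape, and index are assumptions:
db.runCommand({
  planCacheSetFilter: "books",
  query: { name: "Mastering MongoDB" },   // the query shape the filter applies to
  indexes: [ { name: 1 } ]                // the only index the planner may consider for this shape
})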
Large Index Builds
Stop one secondary from the replica set
Restart it as a standalone server in a different port
Build the index from the shell as a standalone index
Restart the secondary in the replica set
Allow for the secondary to catch up with the primary
Ensure that the primary has a large enough oplog so the secondary can fully catch up once it rejoins the replica set. The same process applies to the primary: issue rs.stepDown() first, and then follow the steps above on the demoted server.
The rs.stepDown() command will not step down the server immediately but will instead wait for a default of 10 seconds (configurable using the secondaryCatchUpPeriodSecs parameter) for any eligible secondary server to catch up to its data.
For a sharded environment, we have to stop the sharding data balancer first, then identify the shard/replica set that holds the collection, and perform the steps above on it.
Naming Indexes
Index names are auto-assigned by MongoDB, based on the fields indexed and the direction of the index (1 or -1).
To assign our own name: db.books.createIndex( { name: 1 }, { name: "book_name_index" } )
A collection can have up to 64 indexes.
A compound index can have up to 32 fields.
Geospatial indexes cannot cover a query.
Multikey indexes cannot cover a query over fields of an array type.
Indexes have a unique constraint on fields.
We cannot create multiple indexes on the same fields, differing only in options. We can create multiple partial indexes with the same key pattern given that the filters in partialFilterExpression differ.
A few commands for seeing the query plans:
db.books.find().explain()
db.books.find().explain("allPlansExecution")
IXSCAN = index scan in the query explain plan.
The MongoDB SBE will calculate a unique queryHash for each query shape. A query shape is the combination of the fields that we are querying together with sorting, projection, and collation.
We want our indexes to do the following:
Fit in the RAM
Ensure selectivity (nReturned and keysExamined should be similar)
Be used to sort our query results
Be used in our most common and important queries
Index intersection will not work with sort(). We can’t use one index for querying and a different index for applying sort() to our results.
Index intersection appears in the explain output as an AND_SORTED (or AND_HASH) stage.
If the data is not in the memory, then it will fetch the data from the disk and copy it to the memory. This is a page fault event because the data in the memory is organized into pages.
As page faults happen, the memory gets filled up and eventually, some pages need to be cleared for more recent data to come into the memory. This is called a page eviction event.
Resident memory should be at most 80% of the available memory.
Virtual and mapped memory : The virtual memory refers to the size of all of the data requested by MongoDB, including the journaling.
What all of this means is that over time, our mapped memory will be roughly equal to our working set, and the virtual memory will be our mapped memory size plus the dataset size after the last checkpoint.
Working sets
Tracking free space: Keep monitoring the disk space usage, with proper alerts when it reaches 40%, 60%, or 80% of the disk space, especially for datasets that grow quickly.
Monitoring Replication - operations log (oplog) is a capped collection. Secondaries read this oplog asynchronously and apply the operations one by one.
Replication lag is counted as the time difference between the last operation applied on the primary and the last operation applied on the secondary, as stored in the oplog capped collection.
Oplog size
Network
Cursors and connections - check for optimal open connections
Document Metrics - the number of CRUD operations
Monitoring memory usage in WiredTiger
Using WiredTiger, we can define the internal cache memory usage on startup.
By default, the internal cache will be ( (total RAM size in GB) - 1 ) / 2, with a lower limit of 256 MB and an upper limit of 10 GB.
This means that in a system with 16 GB RAM, the internal cache size would be ( 16 - 1 ) / 2 = 7.5 GB.
db.serverStatus().wiredTiger.cache
NOTE: The generic recommendation is to leave the WiredTiger internal cache size at its default. If our data has a high compression ratio, it may be worth reducing the internal cache size by 10% to 20% to free up more memory for the filesystem cache.
We should aim to keep the I/O wait at less than 60% to 70% for a healthy operational cluster.
Read and write queues
Queues are the effect rather than the root cause, so by the time the queues start building up, we know we have a problem to solve.
Lock percentage
The document lock percentage is a key indicator of any locking issues in the database.
Working set calculations
Estimate the working set from the data accessed and the observed page faults, then add roughly 30% for indexes to arrive at the memory requirement. Fewer page faults mean better database performance.
We can use db.enableFreeMonitoring(), a free monitoring offering from MongoDB for on-premises hostings (except for the free tier).
NOTE: Hosted monitoring tools include MongoDB Cloud Manager and MongoDB Ops Manager.
Open Source: Nagios, Munin, and Cacti provide plugin support for MongoDB
mongotop and mongostat can be used for ad hoc monitoring.
BACKUP OPTIONS
Cloud-based solutions: Snapshot functionality from the underlying cloud provider (AWS, Microsoft Azure, or Google Cloud Platform) to provide both on-demand and Continuous Cloud Backups with a frequency and retention that is dependent on the MongoDB Atlas cloud level of service selected
Continuous Cloud Backups use the oplog to back up our data.
MongoDB Cloud Manager (external SaaS) or MongoDB Ops Manager can be used for backups on premises.
File System Snapshots: EBS on EC2, and Logical Volume Manager (LVM) on Linux, support point-in-time snapshots.
mongodump is the CLI tool for taking a dump of MongoDB data.
The sharded cluster balancer must be stopped before taking backups.
The major downside that the mongodump tool has is that in order to write data to the disk, it needs to bring data from the internal MongoDB storage to the memory first. This means that in the case of production clusters running under strain, mongodump will invalidate the data residing in the memory from the working set with the data that would not be residing in the memory under regular operations. This degrades the performance of our cluster.
Making backups using queuing
Enforce authentication: Always enable authentication in production environments.
Enable access control: First, create a system administrator, and then use that administrator to create more limited users. Give as few permissions as needed for each user role.
Define fine-grained roles in access control: Do not give more permissions than needed for each user.
Encrypt communication between clients and servers: Always use TLS/SSL for communication between clients and servers in production environments.
Always use TLS/SSL for communication between mongod and mongos or config servers as well.
Encrypt data at rest: The MongoDB Enterprise Advanced edition offers the functionality to encrypt data when stored, using WiredTiger encryption at rest.
WiredTiger uses Multi-Version Concurrency Control (MVCC). MVCC is based on the concept that the database keeps multiple versions of an object so that readers will be able to view consistent data that doesn’t change during a read.
MVCC is said to provide point-in-time consistent views. This is equivalent to a read-committed isolation level in traditional RDBMS systems.
Journaling: journaling is the cornerstone of WiredTiger crash recovery protection. WiredTiger compresses the journal using the snappy compression algorithm.
In-memory - Utilizing MongoDB’s in-memory storage is a risky task with high rewards. Keeping data in memory can be up to 100,000 times faster than durable storage on disk.
RocksDB and TokuMX alternatives
Locking in MongoDB
Global
Database
Collection
Document
Lock reporting
We can inspect the lock status using any of the following tools and commands:
db.serverStatus() through the locks document
db.currentOp() through the locks field
mongotop
mongostat
MongoDB Cloud Manager
MongoDB Ops Manager
Commands requiring a database lock
The following commands require a database lock. We should plan before issuing them in a production environment:
db.collection.createIndex() in the (default) foreground mode
reIndex
compact
db.repairDatabase()
db.createCollection() if creating a multiple GB capped collection
db.collection.validate()
db.copyDatabase(), which may lock more than one database
We also have some commands that lock the entire database for a short period of time.
Big data’s defining characteristics are, in general, these:
Volume
Variety
Velocity
Veracity
Variability
Replication
In logical replication, we have our primary server performing operations; the secondary server tails a queue of operations from the primary and applies the same operations in the same order. Using MongoDB as an example, the operations log (oplog) keeps track of operations that have happened on the primary server and applies them in the exact same order on the secondary server.
Different high availability types
Cold: A secondary cold server is a server that is there just in case the primary server goes offline, without any expectation of it holding the data and state that the primary server had.
Warm: A secondary warm server receives periodic updates of data from the primary server, but typically, it is not entirely up to date with the primary server. It can be used for some non-real-time analytics reporting to offload the main server, but typically, it will not be able to pick up the transactional load of the primary server if it goes down.
Hot: A secondary hot server always keeps an up-to-date copy of the data and state from the primary server. It usually waits in a hot standby state, ready to take over when the primary server goes down.
Production considerations
Deploy each mongod instance on a separate physical host. If you are using VMs, make sure that they map to different underlying physical hosts. Use the bind_ip option to make sure that your server maps to a specific network interface and port address.
Use firewalls to block access to any other port and/or only allow access between application servers and MongoDB servers. Even better, set up a VPN so that your servers communicate with each other in a secure, encrypted fashion.
We need to ensure that we have enabled authentication in our MongoDB cluster before binding to any IP address other than localhost.
Initial sync
Adding a new member to an existing replica set requires copying all data from an existing member to the new member. MongoDB uses initial sync to copy all data in a first pass and then replication using the oplog to continuously keep the members in sync.
The logical initial sync will not clone the local database. It uses the oplog to sync from the primary, which means that we need to make sure that we have enough disk space in the target system to temporarily store the oplog records while the sync is ongoing.
File copy-based initial sync, on the other hand, is only available in the MongoDB Enterprise edition and copies the underlying database files directly between the source and target systems. As such, file copy-based initial sync will overwrite the local database in the target system. File copy-based initial sync can be faster than logical-based initial sync with a few limitations, primarily that it does not work at all with encrypted storage.
Chained replication: a secondary replicates from another secondary rather than from the primary.
Streaming replication: the default since MongoDB 4.4, where the sync source continuously streams oplog entries to the secondary.
Replica set limitations
A replica set is great when we understand why we need it and what it cannot do. The different limitations for a replica set are as follows:
It will not scale horizontally; we need sharding for it.
We will introduce replication issues if our network is flaky.
We will make debugging issues more complex if we use secondaries for reads, and these have fallen behind our primary server.
Sharding
Sharding is the ability to horizontally scale out our database by partitioning our datasets across different servers, called shards.