CLUSTERED COLLECTIONS

A clustered collection is a Collection created with a clustered index.

Pros

1. Faster queries without a secondary index: range and equality comparisons can use the clustered index key directly.
1. Lower storage size, which also helps bulk inserts.
1. Can eliminate the need for a separate TTL index: a clustered index also acts as a TTL index when the Collection is created with the expireAfterSeconds option and the _id field holds a supported date type. Expiring short-lived documents this way improves delete performance and reduces storage size.
1. Better performance for inserts, updates, deletes and queries.
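
To make the TTL point above concrete, here is a minimal sketch of the options for a clustered Collection that also expires its documents. The helper `clusteredTtlOptions`, the Collection name `sessions`, the index name and the one-hour lifetime are all made up for illustration. The `db` calls only run inside mongosh.

```javascript
// Hypothetical helper: build createCollection options for a clustered
// Collection whose documents expire (name and lifetime are assumptions).
function clusteredTtlOptions(indexName, lifetimeSeconds) {
  return {
    clusteredIndex: { key: { _id: 1 }, unique: true, name: indexName },
    // expireAfterSeconds is a sibling of clusteredIndex, not nested inside it
    expireAfterSeconds: lifetimeSeconds
  };
}

const sessionOptions = clusteredTtlOptions("sessions clustered key", 3600);

// Only runs inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  db.createCollection("sessions", sessionOptions);
  // _id has to hold a date value for the expiry to apply.
  db.sessions.insertOne({ _id: new Date(), user: "demo" });
}
```

With these options no separate TTL index is created; the expiry is driven by the date stored in _id.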

Non-clustered Collection

_id index and the Document are stored separately.
A query requires 2 reads, and an insert, update or delete requires 2 writes (1 for the index and 1 for the document).

Clustered Collection

_id index and the Document are stored together.
A query requires only 1 read, and an insert, update or delete requires only 1 write.

Note: the collection size returned by the collStats command includes the clustered index size.

Behavior

A clustered Collection stores Documents pre-ordered by the clustered index key value. There can be only 1 clustered index, because Documents can be stored in only one order.

Only Collections with a clustered index store Documents in sorted order.

A clustered Collection can still have secondary indexes, so using a clustered index together with secondary indexes might be a good idea.

Some limitations

1. Converting a non-clustered Collection to a clustered one, or vice versa, is not supported. Instead, use an aggregation pipeline to read the Collection and write into another Collection of the desired type (e.g. with a $out or $merge stage).
1. If a usable secondary index coexists with the clustered index, the query planner picks the secondary index by default. You must use hint() to force the clustered index, which performs a bounded collection scan (not yet verified whether that beats the secondary index).
1. The clustered index key must be the _id field ({ _id: 1 }). By default its values are the auto-generated ObjectIds, but you can set your own _id values.
1. A clustered Collection cannot be a capped Collection.
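
The first two limitations can be sketched as follows. This is an illustration under assumptions, not a verified recipe: `stocks_clustered` is a hypothetical target name, the _id range is arbitrary, and it assumes that `hint({ _id: 1 })` is how the clustered index gets selected. The `db` calls only run inside mongosh.

```javascript
// Build a pipeline that copies every document into an existing Collection.
function copyIntoPipeline(targetName) {
  return [{ $match: {} }, { $merge: { into: targetName } }];
}

const pipeline = copyIntoPipeline("stocks_clustered"); // hypothetical target

// Only runs inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  // 1. Create the clustered target first (fails if it already exists).
  db.createCollection("stocks_clustered", {
    clusteredIndex: { key: { _id: 1 }, unique: true }
  });
  // 2. Copy the documents from the non-clustered source.
  db.stocks.aggregate(pipeline);
  // 3. Ask for the clustered index explicitly and inspect the chosen plan.
  printjson(
    db.stocks_clustered
      .find({ _id: { $gte: 100, $lt: 200 } })
      .hint({ _id: 1 }) // assumption: the hint is given as the key pattern
      .explain()
  );
}
```

Comparing the explain() output with and without the hint is one way to see whether the bounded scan actually beats the secondary index for a given query.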

Set custom clustered index key values

Some criteria for choosing the key values:

contain unique values
be immutable
contain sequentially increasing values (just like AUTO_INCREMENT; not mandatory, but it improves insert performance)
be as small as possible, since the key is stored together with the Document
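
A small sketch of key values that meet these criteria: unique, never changed after insert, increasing, and small. The helper `withSequentialIds`, the `orders` Collection and the starting value 1000 are hypothetical. The `db` call only runs inside mongosh.

```javascript
// Assign application-generated, sequentially increasing _id values.
function withSequentialIds(docs, startAt) {
  return docs.map((doc, i) => ({ _id: startAt + i, ...doc }));
}

const orders = withSequentialIds([{ item: "pen" }, { item: "ink" }], 1000);

// Only runs inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  // Assumes `orders` was already created as a clustered Collection;
  // otherwise this insert creates an ordinary non-clustered one.
  db.orders.insertMany(orders);
}
```

Small integers keep the key compact. Dates are another common choice when the Collection should also expire documents.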

Check if a Collection is Clustered

db.runCommand({ listCollections: 1 }) // check for options > clusteredIndex
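
The check above can be automated with a small filter over the command reply. In this sketch the helper `clusteredCollectionNames` and the sample reply are hand-written for illustration. It only looks at `cursor.firstBatch`, so a database with many Collections may need the remaining batches as well.

```javascript
// Return the names of the clustered Collections in a listCollections reply.
function clusteredCollectionNames(reply) {
  return reply.cursor.firstBatch
    .filter((info) => info.options && info.options.clusteredIndex)
    .map((info) => info.name);
}

// Hand-written sample in the shape of a listCollections reply (not real output).
const sampleReply = {
  cursor: {
    firstBatch: [
      {
        name: "stocks",
        type: "collection",
        options: { clusteredIndex: { key: { _id: 1 }, unique: true } }
      },
      { name: "users", type: "collection", options: {} }
    ]
  },
  ok: 1
};

const clusteredNames = clusteredCollectionNames(sampleReply);

// Only runs inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  printjson(clusteredCollectionNames(db.runCommand({ listCollections: 1 })));
}
```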

Create Clustered Collection example

// create
// one method
db.runCommand({
  create: "stocks",
  clusteredIndex: {
    // same as below
    // ...
  }
})

// preferred method
db.createCollection(
  "stocks", // Collection's name
  {
    clusteredIndex: {
      key: {
        _id: 1 // index on _id in ascending order; this does not set _id's value to 1
      },
      unique: true, // must be true for a clustered index
      name: "stocks clustered key" // clustered index's name
    }
  }
)
