@shoaibi
Last active October 14, 2015 14:41
MongoDb notes

MongoDb

  • Dynamic schema, not schema-free.
  • Database
    • Collection
      • Document
        • Field

BSON

  • Binary version of JSON
  • Adds types that JSON lacks, such as ObjectId, Date and binary data

Databases

  • show dbs
  • show collections
  • db
    • Pointer to current selected db.
  • use db_name_here
    • Creates a database automatically when first collection comes in.
    • Creates a collection automatically when first document comes in.

What is _id?

  • PK, Immutable, can be defined manually
    • Could use counters from here
  • Generated using:
    • 4 byte timestamp
    • 3 byte machine id
    • 2 byte process id
    • 3 byte counter, starting from a random value
  • Why not just an incrementing number? A uuid-like id can be generated on any node without coordination, which suits a distributed cluster better.
  • Can issue getTimestamp() to get creation date against _id
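
The layout above can be checked by hand, since the first 4 bytes of an ObjectId are seconds since the Unix epoch. What follows is a minimal sketch in plain JavaScript that needs no MongoDB server. `splitObjectId` is a hypothetical helper, not a shell function, and the field split assumes the 4/3/2/3 byte layout listed in these notes.

```javascript
// Hypothetical helper: splits a 24-character ObjectId hex string into the
// fields listed above (assumes the 4/3/2/3 byte layout from these notes).
function splitObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  return {
    // 4 bytes = 8 hex characters: seconds since the Unix epoch
    timestamp: new Date(parseInt(hex.slice(0, 8), 16) * 1000),
    machineId: hex.slice(8, 14), // 3 bytes, kept as hex
    processId: hex.slice(14, 18), // 2 bytes, kept as hex
    counter: parseInt(hex.slice(18, 24), 16), // 3 bytes
  };
}

const parts = splitObjectId("507f1f77bcf86cd799439011");
console.log(parts.timestamp.toISOString()); // 2012-10-17T21:13:27.000Z
```

In the shell, calling getTimestamp() on the same ObjectId should report the same instant.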

Finding Data

find*()

  • The first argument of the find* methods is the criteria; the second is the select clause (projection).
  • The following returns records whose first name is John, with only the last name and email fields (plus _id). We could use true instead of 1, but most documentation uses 1.
  find({ 'name.first' : 'John' }, { 'name.last': 1, 'email': 1 })
  • could also use an exclusion list for select clause by supplying 0
  • can't mix inclusion and exclusion. The only exception is _id exclusion while inclusion of rest.
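
The inclusion/exclusion rules above can be modelled in a few lines of plain JavaScript. This is only a sketch of the semantics for top-level fields, not how MongoDB implements projection; `project` is a hypothetical function and dotted paths such as 'name.last' are not handled.

```javascript
// Sketch of projection semantics for top-level fields only
// (hypothetical helper; not MongoDB's implementation).
function project(doc, spec) {
  const fields = Object.keys(spec).filter((k) => k !== "_id");
  const excluding = fields.some((k) => !spec[k]);
  // { _id: 1 } on its own also counts as an inclusion list
  const including =
    fields.some((k) => spec[k]) || (fields.length === 0 && Boolean(spec._id));
  if (including && excluding) {
    throw new Error("cannot mix inclusion and exclusion");
  }
  const out = {};
  for (const key of Object.keys(doc)) {
    if (key === "_id") {
      // _id comes back unless it is explicitly excluded
      if (spec._id === undefined || spec._id) out._id = doc._id;
    } else if (including ? spec[key] : !(key in spec)) {
      out[key] = doc[key];
    }
  }
  return out;
}

const user = { _id: 1, name: "John", email: "j@example.com", age: 30 };
console.log(project(user, { name: 1, email: 1 })); // _id, name, email
console.log(project(user, { _id: 0, name: 1 })); // name only
console.log(project(user, { age: 0 })); // everything except age
```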

find()

  • Operators that work with integers/strings also work with arrays e.g. find({ tags: 'node' }) would work just fine even if tags is an array.
  • lists all docs without a criteria
  • returns a cursor object so can't do direct property dereferencing on the return value
  • use forEach(printjson) for pretty printing objects
  • can use direct array dereferencing e.g. find()[0]

Regex

  • $regex
db.links.find({ title: { $regex: /title\+$/, $ne: "Exclude This" } });

Sort

db.links.find({}, { _id: 0, title: 1 }).sort({ title: 1, favorites: -1 });

Limit

db.links.find({}, { _id: 0, title: 1 }).sort({ title: 1, favorites: -1 }).limit(1);

Skip/Offset

db.links.find({}, { _id: 0, title: 1 }).sort({ title: 1, favorites: -1 }).skip(10).limit(1);
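
To see what a compound sort followed by skip and limit produces, here is a plain-JavaScript sketch over an in-memory array. The sample data is made up for illustration. MongoDB applies sort, then skip, then limit, whatever order the calls are chained in; the array version below has to spell that order out.

```javascript
// Made-up sample data standing in for the links collection.
const links = [
  { title: "b", favorites: 5 },
  { title: "a", favorites: 1 },
  { title: "a", favorites: 9 },
  { title: "c", favorites: 2 },
];

// sort({ title: 1, favorites: -1 }): title ascending, ties broken by
// favorites descending.
const byTitle = (x, y) => (x.title < y.title ? -1 : x.title > y.title ? 1 : 0);
const sorted = [...links].sort((x, y) => byTitle(x, y) || y.favorites - x.favorites);

// skip(1).limit(2): drop the first result, then keep the next two.
const page = sorted.slice(1, 1 + 2);
console.log(page); // [ { title: 'a', favorites: 1 }, { title: 'b', favorites: 5 } ]
```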

distinct()

group()

{
    key: { userId: true }, // field to group on; finalize() below reads it back as o.userId
    initial: { favCount: 0 }, // starting value of the accumulator, reset for every distinct key
    reduce: function (doc, o) { o.favCount += doc.favorites; },
    finalize: function (o) { o.name = db.users.findOne({ _id: o.userId }).name; }
};
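
How key, initial, reduce and finalize interact is easier to follow on an in-memory array. The sketch below is a plain-JavaScript model of that flow, not the server's implementation; `group` here is a hypothetical local function and the sample documents are made up.

```javascript
// Hypothetical in-memory model of group(): one accumulator per distinct key.
function group(docs, { key, initial, reduce, finalize }) {
  const results = new Map();
  for (const doc of docs) {
    const keyValues = {};
    for (const field of Object.keys(key)) keyValues[field] = doc[field];
    const id = JSON.stringify(keyValues);
    if (!results.has(id)) {
      // every distinct key starts from a fresh copy of `initial`
      results.set(id, { ...keyValues, ...initial });
    }
    reduce(doc, results.get(id));
  }
  const out = [...results.values()];
  if (finalize) out.forEach((o) => finalize(o));
  return out;
}

const favoriteTotals = group(
  [
    { userId: 1, favorites: 3 },
    { userId: 2, favorites: 10 },
    { userId: 1, favorites: 4 },
  ],
  {
    key: { userId: true },
    initial: { favCount: 0 },
    reduce: (doc, o) => { o.favCount += doc.favorites; },
    finalize: (o) => { o.label = "user " + o.userId; },
  }
);
console.log(favoriteTotals);
```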

findOne()

  • Returns a document
  • can do direct property dereferencing on the return value

count()

Complex Criteria

Integer comparisons

  • Find all objects with favorites greater than 10
   db.links.find({ favorites: { $gt: 10 }});
  • Also have $lt, $lte, $gte
  • Can mix operators:
  db.links.find({ favorites: { $gt: 10, $lt: 50 }});

Range

  • $in
    • takes an array
  • $nin
  • $all
    • Matched records must contain all of the listed terms, i.e. an AND condition between the terms
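
The difference between $in, $nin and $all is easiest to see on an array field. The helpers below are hypothetical plain-JavaScript stand-ins that mimic the matching rules for simple values; they are a sketch, not MongoDB code.

```javascript
// Treat a scalar field like a one-element array, as the operators do.
const asArray = (value) => (Array.isArray(value) ? value : [value]);

// $in: at least one listed candidate is present
const matchIn = (value, candidates) =>
  asArray(value).some((v) => candidates.includes(v));
// $nin: none of the listed candidates is present
const matchNin = (value, candidates) => !matchIn(value, candidates);
// $all: every listed term is present
const matchAll = (value, required) =>
  required.every((r) => asArray(value).includes(r));

const tags = ["node", "mongo"];
console.log(matchIn(tags, ["node", "ruby"])); // true
console.log(matchAll(tags, ["node", "ruby"])); // false
console.log(matchAll(tags, ["node", "mongo"])); // true
console.log(matchNin(tags, ["ruby"])); // true
```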

Logical

  • $or
    • takes an array with where clause objects
  • $nor
  • $and
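    • conditions on different fields joined with $and can be merged into one criteria object (implicit AND)
  db.users.find({ $and : [{ 'name.first' : 'John' }, { 'name.last' : 'Doe' }] });
   // is same as:
   db.users.find({ 'name.first' : 'John', 'name.last' : 'Doe' });
   // similar, but matches only when name is exactly { first, last } in this order with no other fields:
   db.users.find({ name : { 'first' : 'John', 'last' : 'Doe' } });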

Geospatial

  • $near
  • $within
    • $box

Misc

  • $ne
  • $mod
  db.links.find({ favorites: { $mod : [5, 0] }});
  • $exists
    • Returns records that have a specific field
  db.users.find({ email: { $exists : true }})
  • $not
  db.links.find({ favorites: { $not : { $mod : [5, 0] }} });
  • Avoid using $not if there is a negative operator or if query can be done some other way without compromising simplicity

  • An example of such a scenario would be doing $not on $gte, which can usually be written with $lt instead

  • $elemMatch

    • For nested lookups inside an array of objects.
    • Say we want to find users that have logins (which is an array of objects) older than 20 minutes.
  db.users.find( { logins : { $elemMatch : { minutes : { $gt : 20 } } } });
  • $where
    • Use raw JS to compile a custom complex query
  db.users.find( { $where: 'this.name.first === "John"', age: 30 });
  db.users.find( { $where: 'this.name.first === "John"' });
  // can be shortened by passing the expression string directly:
  db.users.find('this.name.first === "John"');
  • Do not use for simpler queries as it is slow.

  • $size

    // query users with membership array set to empty
    db.users.count({"membership" : { $size : 0}});
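
Returning to $elemMatch: it matters once there is more than one condition, because all of them must hold for the same array element, whereas dotted-path conditions may each be satisfied by a different element. The plain-JavaScript sketch below models both behaviours on made-up data; the `at` field is an assumption added for illustration.

```javascript
// Made-up users; `at` is a hypothetical extra field on each login.
const users = [
  { name: "ann", logins: [{ minutes: 5, at: "home" }, { minutes: 30, at: "work" }] },
  { name: "bob", logins: [{ minutes: 25, at: "home" }, { minutes: 3, at: "work" }] },
];

// Models { logins: { $elemMatch: { minutes: { $gt: 20 }, at: "work" } } }:
// one single element has to satisfy both conditions.
const sameElement = users.filter((u) =>
  u.logins.some((l) => l.minutes > 20 && l.at === "work")
);

// Models { "logins.minutes": { $gt: 20 }, "logins.at": "work" }:
// each condition may be met by a different element.
const anyElement = users.filter(
  (u) => u.logins.some((l) => l.minutes > 20) && u.logins.some((l) => l.at === "work")
);

console.log(sameElement.map((u) => u.name)); // [ 'ann' ]
console.log(anyElement.map((u) => u.name)); // [ 'ann', 'bob' ]
```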

Modifying Data

  • insert()
  • update()
    • First parameter is the where clause, second is the replacement object. This is update-by-replacement.
    • Setting third parameter to true makes it act like upsert.
    • By default only updates first document. Set 4th parameter to true to update all matching documents.
      • This only works with update operators such as $set, not with a replacement object.
  • save()
    • checks if document has _id, i.e. the PK; if so does update(), else insert()
  • findAndModify()
  {
    query: { name : "John" },
    update: { $set: { age: 30 }},
    sort: { title: 1 },
    new: false,
    fields: { title: 1 }
  };

This will return the object before update is made. Setting new to true would return the updated object.

(In|De)crementing

  db.links.update( { title: 'SomeTitle' }, { $inc: { favorites: -2 }});

(Un)Set a new value:

  db.links.update( { title: 'SomeTitle' }, { $set: { favorites: 10 }});
  • Also works for fields that do not exist yet.
  • $unset does the opposite: it removes the field.
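
The effect of $inc, $set and $unset on a single document can be modelled on a plain object. `applyUpdate` below is a hypothetical helper written for illustration, not a MongoDB function, and it only handles top-level fields.

```javascript
// Hypothetical helper mimicking $inc, $set and $unset on a plain object.
function applyUpdate(doc, update) {
  const out = { ...doc };
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    out[field] = (out[field] || 0) + amount; // a missing field starts from 0
  }
  for (const [field, value] of Object.entries(update.$set || {})) {
    out[field] = value; // creates the field if it does not exist yet
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete out[field]; // removes the field entirely
  }
  return out;
}

const link = { title: "SomeTitle", favorites: 5 };
console.log(applyUpdate(link, { $inc: { favorites: -2 } })); // favorites: 3
console.log(applyUpdate(link, { $set: { favorites: 10 } })); // favorites: 10
console.log(applyUpdate(link, { $unset: { favorites: "" } })); // no favorites field
```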

Adding values into an array without caring for duplicates

  • $push
  • $pushAll

Adding value into an array while not adding duplicates

  • $addToSet
  • Use nested $each to add multiple values

Removing values from an array

  • $pull
  • $pullAll

Remove an element from array from start or end

  • $pop
    • -1 removes from the start, 1 from the end.
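
The array operators from the last few sections can be compared side by side on a plain array. The functions below are hypothetical stand-ins that return a new array instead of modifying a stored document; they are a sketch of the behaviour for simple values. In MongoDB, $pop with -1 removes the first element and $pop with 1 removes the last.

```javascript
// Hypothetical stand-ins for the array update operators (simple values only).
const push = (arr, value) => [...arr, value]; // $push: always appends
const addToSet = (arr, value) => (arr.includes(value) ? arr : [...arr, value]); // $addToSet: skips duplicates
const pull = (arr, value) => arr.filter((x) => x !== value); // $pull: removes every match
const pop = (arr, direction) => (direction === -1 ? arr.slice(1) : arr.slice(0, -1)); // $pop

const letters = ["a", "b", "c"];
console.log(push(letters, "a")); // [ 'a', 'b', 'c', 'a' ]
console.log(addToSet(letters, "a")); // [ 'a', 'b', 'c' ]
console.log(pull(["a", "b", "a"], "a")); // [ 'b' ]
console.log(pop(letters, -1)); // [ 'b', 'c' ]
console.log(pop(letters, 1)); // [ 'a', 'b' ]
```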

Rename a key

  • $rename

Dropping/deleting data

  • remove()
  • findAndModify()
  {
    query: { name : "John" },
    remove: true
  };
  • drop()
    • drops collection. fails silently
  • dropDatabase()

Relations

  • No Joins or FKs
  • Denormalize data that is less likely to be modified
  • Linking by document id is the best alternative.
  • More Reads === More denormalization, even duplication is alright e.g. having a user object as well as having it embedded inside posts (say if owner doesn't change)

Indexing

  • explain()
    • Cursor type: BasicCursor is a dead giveaway for a missing index
      • With indexing it is usually BtreeCursor
    • nscanned* is another one to watch
    • n and nscanned* should be as close as possible
    • scanAndOrder set to true means that the index can't be used for sorting
  • ensureIndex()
   // 1 translates to an ascending index
  db.links.ensureIndex({ title: 1 }, { unique: true, dropDups: true, sparse: true });
  • dropIndex()

Backup and Restore

Binary

  • mongodump
  • mongorestore

Textual (JSON, CSV)

  • mongoexport
  • mongoimport

Monitoring

  • mongostat

Missing

  • Geospatial
  • Aggregation