Forked from jnewman12/
Created January 2, 2019 05:56

Data Modeling with MongoDB



  • Understand model relationships in MongoDB
  • Understand One-to-Many relationships
  • Understand Many-to-Many relationships
  • Reinforce the difference between embedding and referencing


  • Data in MongoDB has a flexible schema. Unlike SQL databases, where you must determine and declare a table’s schema before inserting data, MongoDB’s collections do not enforce document structure by default.
  • This flexibility facilitates the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity, even if the data has substantial variation.

Recap: References vs Embeds


  • References store the relationships between data by including links or references from one document to another. Applications can resolve these references to access the related data. Broadly, these are normalized data models.


Embedded Data

  • Embedded documents capture relationships between data by storing related data in a single document structure. MongoDB documents make it possible to embed document structures in a field or array within a document. These denormalized data models allow applications to retrieve and manipulate related data in a single database operation.



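var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var ObjectId = Schema.Types.ObjectId;

// Comments are embedded in blog posts as subdocuments
var Comments = new Schema({
    title     : String
  , body      : String
  , date      : Date
});

var BlogPost = new Schema({
    author    : ObjectId
  , title     : String
  , body      : String
  , date      : Date
  , comments  : [Comments]
  , meta      : {
        votes : Number
      , favs  : Number
    }
});

mongoose.model('BlogPost', BlogPost);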
  • Adding an embedded document to the array
// retrieve my model
var BlogPost = mongoose.model('BlogPost');

// create a blog post
var post = new BlogPost();

// create a comment
post.comments.push({ title: 'My comment' });

// save the post; the comment is stored inside it
post.save(function (err) {
  if (!err) console.log('Success!');
});
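  • Removing an embedded document
BlogPost.findById(myId, function (err, post) {
  if (!err) {
    // remove the first comment, then save the parent to persist the change
    post.comments[0].remove();
    post.save(function (err) {
      // do something
    });
  }
});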
  • Finding embedded documents by their id
  • DocumentArrays have a special method, id, that looks up an embedded document by its _id property (each embedded document gets one):
post.comments.id(my_id).remove();
post.save(function (err) {
  // embedded comment with id `my_id` removed!
});

When to embed? When to reference?

  • Both embedding and referencing have their strengths and weaknesses.
  • Unlike the strict structure of relational databases, data modeling in MongoDB is more art than science because of its flexibility.
  • Without going into too much detail, here is a quick recap of when to use referencing or embedding in your apps:
  1. Referencing
  • good when you need more flexibility
  • good when you have a many-to-many relationship
  2. Embedding
  • good when the sub-object always appears with its parent, like a comment on a post
  • good when you have a one-to-many relationship
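
The trade-off can be sketched with plain JavaScript objects standing in for MongoDB documents. This is only an illustration: the collections are in-memory arrays, and the names `embeddedPost`, `referencedPost`, `commentsCollection`, and `findCommentsFor` are hypothetical, not part of MongoDB or Mongoose.

```javascript
// Embedding: the comments live inside the post document,
// so one lookup returns the post together with its comments.
const embeddedPost = {
  _id: 'post1',
  title: 'Hello',
  comments: [
    { _id: 'c1', body: 'First!' },
    { _id: 'c2', body: 'Nice post' }
  ]
};

// Referencing: the post stores only comment ids; the comments
// live in their own collection (an in-memory array here).
const referencedPost = { _id: 'post1', title: 'Hello', comments: ['c1', 'c2'] };
const commentsCollection = [
  { _id: 'c1', body: 'First!' },
  { _id: 'c2', body: 'Nice post' },
  { _id: 'c3', body: 'Belongs to another post' }
];

// Hypothetical helper: a second lookup is needed to resolve the ids.
function findCommentsFor(post, collection) {
  return collection.filter(function (comment) {
    return post.comments.indexOf(comment._id) !== -1;
  });
}

// Embedded: the comments are already on the post.
console.log(embeddedPost.comments.length);
// Referenced: the comments are resolved in a separate step.
console.log(findCommentsFor(referencedPost, commentsCollection).length);
```

Both reads return the same two comments. The difference is that the referenced version needs an extra lookup, while the embedded version cannot hand out a comment without its post.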

A basic One-to-Many Example

  • Recap: In relational databases, a one-to-many relationship occurs when a parent record in one table can potentially reference several child records in another table. In a one-to-many relationship, the parent is not required to have child records; therefore, the one-to-many relationship allows zero child records, a single child record or multiple child records. The important thing is that the child cannot have more than one parent record.


Modeling One-to-Many Relationships with Embedded Documents

Consider the following example that maps the relationship between a user and multiple addresses. The example illustrates the advantage of embedding over referencing if you need to view many data entities in the context of another. In this one-to-many relationship between user and address data, the user has multiple address entities.

In a normalized data model, each address document would instead contain a reference to the user document.


var mongoose = require('mongoose');

// Each address is embedded in its user as a subdocument
var addressSchema = new mongoose.Schema({
    street: String,
    city: String,
    cc: String
});

var userSchema = new mongoose.Schema({
    name: String,
    ssn: String,
    addresses: [addressSchema]
});

module.exports = mongoose.model('User', userSchema);


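// retrieve the registered model
var User = mongoose.model('User');

var user = new User({
    name: 'Kate Monster',
    ssn: '123-456-7890',
    addresses : [
        { street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
        { street: '123 Avenue Q', city: 'New York', cc: 'USA' }
    ]
});

// or push onto the array; field names must match the schema (cc, not country)
user.addresses.push({ street: 'bancroft pkwy', city: 'wilmington', cc: 'USA' });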


{
  name: 'Kate Monster',
  ssn: '123-456-7890',
  addresses : [
     { street: '123 Sesame St', city: 'Anytown', cc: 'USA' },
     { street: '123 Avenue Q', city: 'New York', cc: 'USA' }
  ]
}

If your application frequently retrieves the address data together with the name information, a normalized model forces it to issue multiple queries to resolve the references. A better schema embeds the address data entities in the user data, as in the document above.

Modeling One-to-Many Relationships with Document References

Consider the following example that maps products and order relationships. The example illustrates the advantage of referencing over embedding to avoid repetition of the products information.

var mongoose = require('mongoose');

// An order stores only the ids of its products
var orderSchema = new mongoose.Schema({
    products: [{type: mongoose.Schema.ObjectId, ref: 'Product'}]
});

var Order = mongoose.model('Order', orderSchema);

var productSchema = new mongoose.Schema({
    name: String,
    price: Number
});

var Product = mongoose.model('Product', productSchema);


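var product = new Product({name: 'Wrench', price: 5});
product.save();
var order = new Order();
order.products.push(product._id); // store a reference, not the product itself
order.save();
order.products // ["57ec7d5cf292421828791b8b"] // just the objectId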


{
    _id: '57ec7d63f292421828791b8c',
    products: [ '57ec7d5cf292421828791b8b' ]
}

In order to obtain the referenced documents we need to call 'populate' on the query.

Order.findById(id).populate('products').exec(function(err, order){
    // order.products now holds the full product documents
});


{ _id: '57ec800a3130441eb4b52e39',
  __v: 0,
  products:
   [ { _id: '57ec800a3130441eb4b52e38',
       name: 'Wrench',
       price: 5,
       __v: 0 } ] }
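
Conceptually, populating a path swaps each stored id for the document it points to. The sketch below mimics that step with in-memory arrays. `populateProducts` and the sample data are hypothetical stand-ins, and the code says nothing about how Mongoose implements `populate` internally.

```javascript
const products = [
  { _id: 'p1', name: 'Wrench', price: 5 },
  { _id: 'p2', name: 'Hammer', price: 8 }
];
const order = { _id: 'o1', products: ['p1'] };

// Hypothetical helper: return a copy of the order with ids replaced by documents.
function populateProducts(order, productCollection) {
  // Index the collection by _id so each reference is a single lookup.
  const byId = new Map(productCollection.map(function (p) { return [p._id, p]; }));
  return {
    _id: order._id,
    products: order.products
      .map(function (id) { return byId.get(id); })
      // this sketch simply drops ids that match no product
      .filter(function (p) { return p !== undefined; })
  };
}

const populated = populateProducts(order, products);
console.log(populated.products[0].name);
```

The stored order is left untouched and still holds only ids; the populated copy exists just for the duration of the read.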

Question Checkpoint

  • what are mongo references?
  • what are mongo embeds?
  • when would you use a reference?
  • when would you use an embed?

Modeling Many to Many Relationships

  • Just like one-to-many relationships, many-to-many relationships are important for any app.
  • Normally we implement these relationships with MongoDB by linking documents via referencing.
  • We're going to discuss a few possible ways to model many-to-many relationships and show you how to pick one over another.

A Basic Many to Many Example

  • Recap: A many-to-many relationship refers to a relationship between tables in a database when a parent row in one table contains several child rows in the second table, and vice versa.


  • Another Example
    • product can be in many categories
    • category can have many products
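
The product and category example could be modeled with two-way referencing, where each side keeps an array of the other side's ids. A minimal sketch with plain objects follows. `link` is a hypothetical helper, and in a real app both documents would need to be saved after linking.

```javascript
const wrench = { _id: 'p1', name: 'Wrench', categories: [] };
const tools = { _id: 'c1', name: 'Tools', products: [] };
const sale = { _id: 'c2', name: 'On Sale', products: [] };

// Hypothetical helper: record the relationship on both sides, at most once.
function link(product, category) {
  if (product.categories.indexOf(category._id) === -1) {
    product.categories.push(category._id);
  }
  if (category.products.indexOf(product._id) === -1) {
    category.products.push(product._id);
  }
}

link(wrench, tools);
link(wrench, sale);
link(wrench, tools); // linking the same pair again does not duplicate ids

console.log(wrench.categories);
console.log(tools.products);
```

The cost of this design is that every change touches two documents, so the two arrays can drift apart if only one side is updated.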

Many to Many

var mongoose = require("mongoose"),
    Schema = mongoose.Schema,
    relationship = require("mongoose-relationship");

var ParentSchema = new Schema({
    children:[{ type:Schema.ObjectId, ref:"Child" }]
});
var Parent = mongoose.model("Parent", ParentSchema);

var OtherParentSchema = new Schema({
    children:[{ type:Schema.ObjectId, ref:"Child" }]
});
var OtherParent = mongoose.model("OtherParent", OtherParentSchema);

// childPath names the array on the parent that should receive the child's id
var ChildSchema = new Schema({
    parents: [{ type:Schema.ObjectId, ref:"Parent", childPath:"children" }],
    otherParents: [{ type:Schema.ObjectId, ref:"OtherParent", childPath:"children" }]
});
ChildSchema.plugin(relationship, { relationshipPathName:['parents', 'otherParents'] });
var Child = mongoose.model("Child", ChildSchema);

var parent = new Parent({});
parent.save();
var otherParent = new OtherParent({});
otherParent.save();

var child = new Child({});
child.parents.push(parent);
child.otherParents.push(otherParent);
child.save(); // both parent and otherParent children property will now contain the child's id

Embedding and Referencing Documents - TLDR

  • One: favor embedding unless there is a compelling reason not to

  • Two: needing to access an object on its own is a compelling reason not to embed it

  • Three: If there are more than a couple of hundred documents on the “many” side, don’t embed them; if there are more than a few thousand documents on the “many” side, don’t use an array of ObjectID references. High-cardinality arrays are a compelling reason not to embed.

  • Four: Don’t be afraid of application-level joins: if you index correctly and use the projection specifier, then application-level joins are barely more expensive than server-side joins in a relational database.

  • Five: Consider the write/read ratio when denormalizing. A field that will mostly be read and only seldom updated is a good candidate for denormalization: if you denormalize a field that is updated frequently then the extra work of finding and updating all the instances is likely to overwhelm the savings that you get from denormalizing.

  • Six: As always with MongoDB, how you model your data depends – entirely – on your particular application’s data access patterns. You want to structure your data to match the ways that your application queries and updates it.
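
Rule five can be made concrete with a small sketch. Here a product name is denormalized (copied) into every order item, so reads need no join, but a rename has to touch every copy. All of the data and the `renameProduct` helper are hypothetical.

```javascript
const product = { _id: 'p1', name: 'Wrench', price: 5 };
const orders = [
  { _id: 'o1', items: [{ productId: 'p1', productName: 'Wrench' }] },
  { _id: 'o2', items: [{ productId: 'p1', productName: 'Wrench' }] },
  { _id: 'o3', items: [{ productId: 'p2', productName: 'Hammer' }] }
];

// Hypothetical helper: rename the product and every denormalized copy.
// Returns how many order documents had to be rewritten.
function renameProduct(product, orders, newName) {
  product.name = newName;
  let touched = 0;
  orders.forEach(function (order) {
    let changed = false;
    order.items.forEach(function (item) {
      if (item.productId === product._id) {
        item.productName = newName;
        changed = true;
      }
    });
    if (changed) touched += 1;
  });
  return touched;
}

const rewritten = renameProduct(product, orders, 'Adjustable Wrench');
console.log(rewritten); // 2: one rename rewrote two orders
```

If product names rarely change, those extra writes are a small price for join-free reads. If they change often, storing only `productId` and resolving it at read time is likely the cheaper design.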


  • referencing lets you store ids that point to other documents, much like foreign keys in a SQL table
  • embedding lets you store a whole child object inside a parent object
  • one-to-many in mongo operates very similarly to how we handled it in rails
  • many-to-many in mongo operates differently, and there are a couple of ways to handle that association
