@jim-clark
Last active January 22, 2017 14:18

Intro to MongoDB

Learning Objectives
Explain What MongoDB Is
Save and Retrieve MongoDB Documents
Model Data using Embedding & Referencing

Roadmap

  • What is MongoDB
  • MongoDB vs. Relational SQL Databases
  • Installing and Starting MongoDB
  • Creating a Database and Inserting Documents
  • Data Modeling in MongoDB
  • Querying Data
  • Updating Data
  • Removing Data

What is MongoDB

Overview

Hu**mongo**us

MongoDB is one of a new breed of databases known as NoSQL databases. NoSQL databases are heavily used in realtime, big data and social media applications. However, they are not commonly used in applications that require a high level of transactional and consistency support, such as critical financial applications (e.g., banking).

MongoDB puts the "M" in the MEAN stack, a technology stack that emphasizes the use of JavaScript in both the front-end and back-end.

There are software "drivers" that allow MongoDB to be used with a multitude of programming languages and frameworks, including Ruby on Rails. When used with Rails, there is no need for database migrations! If this interests you, take a look at the mongoid ORM gem.

Data Format

  • A MongoDB database consists of collections, and each collection holds documents.
  • A document in MongoDB is composed of field and value pairs.

Let's take a look at what a MongoDB document may look like:

{
    _id: ObjectId("5099803df3f4948bd2f98391"),
    name: { first: "Alan", last: "Turing" },
    birth: new Date('Jun 23, 1912'),
    death: new Date('Jun 07, 1954'),
    contribs: [ "Turing machine", "Turing test", "Turingery" ],
    views: 1250000
}

What does this data structure remind you of?




A MongoDB document is very much like JSON, except it is stored in the database in a format known as BSON (think - Binary JSON).

BSON basically extends JSON with additional data types, such as the ObjectId and Date shown above.
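To see why the extra types matter, here is a small sketch in plain JavaScript (runnable in Node.js, no MongoDB required). It round-trips an object through JSON text and shows that the date does not come back as a Date. The variable names are illustrative only.

```javascript
// A plain JavaScript object holding a Date, similar to the document above.
const original = { name: "Alan", birth: new Date("1912-06-23T00:00:00Z") };

// JSON has no date type, so a round trip through JSON text
// turns the Date into an ordinary string.
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.birth instanceof Date); // prints true
console.log(typeof roundTripped.birth);      // prints string
```

A format with a real date type avoids this kind of loss, which is one motivation for BSON's additional types.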

The Document _id

The _id is a special field that represents the document's primary key and will always be listed as the first field. It must be unique within its collection.

We can explicitly set the _id like this:

{
	_id: 2,
	name: "Suzy"
}

or this...

{
	_id: "ABC",
	name: "Suzy"
}

However, it's more common to allow MongoDB to create it implicitly for us using its ObjectID data type.
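The uniqueness rule can be pictured with a small in-memory sketch in plain JavaScript. This is not the MongoDB API: the Map stands in for a collection, and insertDoc is a hypothetical helper written only for this illustration.

```javascript
// In-memory stand-in for a collection, keyed by _id (NOT the MongoDB API).
const people = new Map();

// Hypothetical helper: refuses a document whose _id is already taken.
function insertDoc(collection, doc) {
  if (collection.has(doc._id)) {
    throw new Error("duplicate _id: " + doc._id);
  }
  collection.set(doc._id, doc);
}

insertDoc(people, { _id: 2, name: "Suzy" });
insertDoc(people, { _id: "ABC", name: "Suzy" }); // fine: a different _id

let rejected = false;
try {
  insertDoc(people, { _id: 2, name: "Someone else" });
} catch (err) {
  rejected = true; // the second document with _id 2 is refused
}

console.log(people.size, rejected); // prints 2 true
```

Two documents may share every other field, as the two "Suzy" documents do, as long as their _id values differ.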

MongoDB vs. Relational SQL Databases

Terminology

Key Differences of MongoDB

Schema-less

The documents in a MongoDB collection can have completely different types and number of fields from each other.
How does this compare to a SQL database like PostgreSQL?

No Table Joins

In a SQL DB, we break up related data into separate tables.

In MongoDB, we often embed related data in a single document; you'll see an example of this later.

The supporters of MongoDB highlight the lack of table joins as a performance advantage since joins are expensive in terms of computer processing.

Installing and Starting MongoDB

Installation

You may already have MongoDB installed on your system. Let's check in terminal: $ mongod (note the lack of a "b" at the end).

If you receive an error, let's use Homebrew to install MongoDB:

  1. Update Homebrew's database (this might take a bit of time)
    $ brew update
  2. Then install MongoDB
    $ brew install mongodb

MongoDB by default will look for data in a folder named /data/db. We would have had to create this folder, but Homebrew did it for us.

Start Your Engine

mongod is the name of the actual database engine process. The installation of MongoDB does not set MongoDB to start automatically. A common source of errors when starting to work with MongoDB is forgetting to start the database engine.

To start the database engine, type mongod in terminal.

Press control-c to stop the engine.

Creating a Database and Inserting Documents

Mongo Client App

MongoDB installs with a client app, a JavaScript-based shell, that allows us to interact with MongoDB directly.

Start the app in terminal by typing mongo.

The app will load and the prompt will change to >.

List the shell's available commands: > help.

Show the list of databases: > show dbs.

Show the name of the currently active database: > db.

Switch to a different database: > use [name of database to switch to].

Let's switch to the local database: > use local.

Show the collections of the current database: > show collections.

Creating a new Database

To create a new database in the Mongo Shell, we simply have to use the database. Let's create a database named myDB:

> use myDB

Inserting Data into a Collection

This is how we can create and insert a document into a collection named people:

> db.people.insert({
... name: "Fred",	// Don't type the dots; they are from the
... age: 21			// shell, indicating multi-line mode
... })

Using a collection for the first time creates it!

YOU DO: Let's add another person to the people collection. But this time, add an additional field called birthDate and assign it a date value with something like this: birthDate: new Date('3/21/1981')

To list all documents in a collection, we use the find method on the collection without any arguments:

> db.people.find()

Again, unlike the rows in a relational database, our documents don't have to have the same fields!
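As a rough picture of what this means for application code, here is a plain JavaScript sketch that uses an array as a stand-in for the people collection (no MongoDB involved). The second person, "Wilma", is a made-up example, and real documents would also carry an _id field.

```javascript
// In-memory stand-in for the people collection (plain objects, no MongoDB).
const people = [
  { name: "Fred", age: 21 },
  { name: "Wilma", age: 34, birthDate: new Date("1981-03-21") }, // hypothetical person
];

// Collect the field names each document actually has.
const fieldsPerDoc = people.map((doc) => Object.keys(doc));
console.log(fieldsPerDoc);

// Code that reads an optional field has to allow for it being absent.
const withBirthDate = people.filter((doc) => doc.birthDate !== undefined);
console.log(withBirthDate.length); // prints 1
```

Because the database does not enforce a fixed set of fields, checks like the one in the filter become the application's responsibility.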

Plant the Seed and Watch your Data Grow

To practice querying our data, here are a few more documents to put in your people collection. We can simply provide this array to the insert method and it will create a document for each object in the array.

[
	{
		"name": "Emma",
		"age": 20
	},
	{
		"name": "Ray",
		"age": 45
	},
	{
		"name": "Celeste",
		"age": 33
	},
	{
		"name": "Stacy",
		"age": 53
	},
	{
		"name": "Katie",
		"age": 12
	},
	{
		"name": "Adrian",
		"age": 47
	}
]

Be sure to type the closing paren of the insert method.

Creating Data Programmatically (using JavaScript code)

To demonstrate linking data later, we're going to create another collection named bankAccounts. However, to demonstrate how we might create data in a JavaScript program, we will do it programmatically.

Enter the following JavaScript in the shell:

> var acctNums = [1234, 5678, 9876, 5432]
> acctNums.forEach(function(acct) {
... var bal = Math.floor(Math.random() * 2000)
... db.bankAccounts.insert({
... _id: acct,
... balance: bal
... })
... })

Note how we assigned our own _id field instead of using MongoDB's auto-generated ObjectID.
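If you want to inspect the documents before touching the database, the same logic can be sketched in plain JavaScript without the insert call. This version only builds the objects in memory; the accounts variable is an illustrative name, not something the shell creates.

```javascript
const acctNums = [1234, 5678, 9876, 5432];

// Build one document per account number, using the number as the _id.
const accounts = acctNums.map(function (acct) {
  return {
    _id: acct,
    // Math.random() is in [0, 1), so this is a whole number from 0 to 1999.
    balance: Math.floor(Math.random() * 2000),
  };
});

console.log(accounts);
```

The balances are random, so the output differs on every run.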

Take a look by listing the bankAccounts collection.

Obviously, we would prefer to be writing code like the JavaScript above in an actual program instead of the shell. In a future lesson next week, we will learn to use MongoDB in a Node.js application. Today, we are just focusing on the MongoDB fundamentals...

Questions?

Data Modeling in MongoDB

There are two ways to model related data in MongoDB:

  • via embedding
  • via referencing (linking)

Both approaches can be used simultaneously in the same document.

Embedded Documents

In MongoDB, by design, it is common to embed related data in the parent document.

Modeling data with the embedded approach is different from what we've seen in a relational DB, where we spread our data across multiple tables. However, this is the way MongoDB is designed to work, and it is a key reason MongoDB can often read and return large amounts of related data more quickly than a SQL DB that requires join operations.

To demonstrate embedding, we will add another person to our people collection, but this time we want to include contact info. A person may have several ways to contact them, so we will be modeling a typical one-to-many relationship.

Let's walk through this command by entering it together:

> db.people.insert({
... name: "Manny",
... age: 33,
... contacts: [
... {
... type: "email",
... contact: "[email protected]"
... },
... {
... type: "mobile",
... contact: "(555) 555-5555"
... }
... ]
... })

What would be a downside of embedding data?




If the embedded data's growth is unbounded, MongoDB's maximum document size of 16 megabytes could be exceeded.
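To get a feel for the scale involved, here is a back-of-the-envelope sketch in plain JavaScript for Node.js. It assumes the size of the JSON text is an acceptable rough proxy for the stored size; the true BSON size of a document will differ, so treat the result as an order-of-magnitude estimate only.

```javascript
// MongoDB's documented limit: 16 megabytes per document.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

const oneContact = { type: "mobile", contact: "(555) 555-5555" };

// Rough proxy only: the UTF-8 size of the JSON text, not the true BSON size.
const bytesPerContact = Buffer.byteLength(JSON.stringify(oneContact), "utf8");

// Very roughly how many such contacts fit before the limit is reached.
const roughCapacity = Math.floor(MAX_DOC_BYTES / bytesPerContact);

console.log(bytesPerContact, roughCapacity);
```

A handful of contacts per person is nowhere near the limit; the concern applies to embedded arrays that keep growing without bound, such as log entries or comments.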

The above approach of embedding contact documents provides a great deal of flexibility in what types and how many contacts a person may have. However, this flexibility slightly complicates querying.

What if our app only needed to work with a person's multiple emails and phoneNumbers?
Knowing this, pair up and discuss how you might alter the above structure.

Referencing Documents (linking)

We can model data relationships using a references approach where data is stored in separate documents. These documents, due to the fact that they hold different types of data, are likely to be stored in separate collections.

It may help to think of this approach as linking documents together by including a reference to the related document's _id field.

Earlier, we created a bankAccounts collection to demonstrate the references approach.

SCENARIO

A person has a bank account.

That bank account might be a joint account, owned by more than one person.

For the sake of data consistency, keeping the account data in its own document would be a better design decision. Put more plainly, it would not be a good idea to store a bank account's balance in more than one place.

In our app, we have decided that all bank accounts will be retrieved through a person. This decision allows us to include a reference on the person document only.

Implementing the above scenario is as simple as assigning a bankAccount document's _id to a new field in our person document:

> db.people.insert({
... name: "Miguel",
... age: 46,
... bankAccount: 5678
... })

Again, because there are no "joins" in MongoDB, retrieving a person's bank account information would require a separate query on the bankAccounts collection.
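The two-step lookup can be sketched in plain JavaScript with arrays standing in for the two collections. This is an in-memory illustration only, not MongoDB code, and the balances are made-up sample values.

```javascript
// In-memory stand-ins for the two collections (plain arrays, not MongoDB).
const people = [{ name: "Miguel", age: 46, bankAccount: 5678 }];
const bankAccounts = [
  { _id: 1234, balance: 150 }, // balances are made-up sample values
  { _id: 5678, balance: 900 },
];

// Step 1: find the person.
const person = people.find((p) => p.name === "Miguel");

// Step 2: use the stored reference to find the account in the other collection.
const account = bankAccounts.find((a) => a._id === person.bankAccount);

console.log(account.balance); // prints 900
```

In a real application each step would be its own query against the database, so a referenced design trades an extra round trip for having the balance stored in exactly one place.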

Data Modeling Best Practices

MongoDB was designed from the ground up with application development in mind. More specifically, what can and can't be done with regard to data is enforced in your application, not in the database itself (as it is in a SQL database).

Here are a few things to keep in mind:

  • For performance and simplicity reasons, lean toward embedding over referencing.
  • Prefer the reference approach when the amount of child data is unbounded.
  • Prefer the reference approach when multiple parent documents access the same child document and that child's document changes frequently.
  • Obtaining referenced documents requires multiple queries by your application.
  • In the references approach, depending upon your application's needs, you may choose to maintain links to the related document's _id in either document, or both.

For more details regarding data modeling in MongoDB, start with this section of MongoDB's documentation or this hour-long YouTube video.

Querying Data

We've seen how to retrieve all of the documents in a collection using the find() method.

We can also use the find() method to query the collection by passing in an argument containing our query criteria as a JS object:

> db.people.find( {name: "Miguel"} )

Here's how we can use MongoDB's $gt query operator to return all people documents with an age greater than 20:

> db.people.find( {age: { $gt: 20 } } )

MongoDB comes with a slew of built-in query operators we can use to write complex queries.
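To make the shape of a criteria object less mysterious, here is a toy matcher in plain JavaScript. It is a hypothetical helper written for this lesson, not how MongoDB evaluates queries, and it understands only plain equality plus the $gt and $lt operators.

```javascript
// Hypothetical helper: does doc satisfy a criteria object?
// Supports only plain equality plus the $gt and $lt operators.
function matches(doc, criteria) {
  return Object.keys(criteria).every(function (field) {
    const condition = criteria[field];
    const value = doc[field];
    if (condition !== null && typeof condition === "object") {
      // An object such as { $gt: 20 } holds one or more operators.
      return Object.keys(condition).every(function (op) {
        if (op === "$gt") return value > condition[op];
        if (op === "$lt") return value < condition[op];
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === condition; // plain equality, e.g. { name: "Miguel" }
  });
}

const people = [
  { name: "Emma", age: 20 },
  { name: "Ray", age: 45 },
  { name: "Katie", age: 12 },
];

const olderThan20 = people.filter((doc) => matches(doc, { age: { $gt: 20 } }));
console.log(olderThan20.map((doc) => doc.name)); // prints [ 'Ray' ]
```

Note that Emma, who is exactly 20, is not returned: $gt means strictly greater than.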

Can you write a query to retrieve people that are less than or equal to age 20?

In addition to selecting which data is returned, we can modify how that data is returned by limiting the number of documents returned, sorting the documents, and by projecting which fields are returned.

This runs our age query and sorts the results by name:

> db.people.find( {age: { $gt: 20 } } ).sort( {name: 1} )

The "1" indicates ascending order.
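The direction value can be illustrated with a small in-memory sort in plain JavaScript. The sortBy helper is hypothetical and handles a single field; it follows the same convention as the shell, where 1 means ascending and -1 means descending.

```javascript
// Hypothetical helper: returns a sorted copy, ordered by a single field.
// direction 1 means ascending, -1 means descending.
function sortBy(docs, field, direction) {
  return docs.slice().sort(function (a, b) {
    if (a[field] < b[field]) return -1 * direction;
    if (a[field] > b[field]) return 1 * direction;
    return 0;
  });
}

const people = [
  { name: "Ray", age: 45 },
  { name: "Celeste", age: 33 },
  { name: "Stacy", age: 53 },
];

const ascending = sortBy(people, "name", 1).map((doc) => doc.name);
const descending = sortBy(people, "name", -1).map((doc) => doc.name);

console.log(ascending);  // prints [ 'Celeste', 'Ray', 'Stacy' ]
console.log(descending); // prints [ 'Stacy', 'Ray', 'Celeste' ]
```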

This documentation provides more detail about reading data.

Updating Data

In MongoDB, we use the update() method of collections, specifying the update criteria (like we did with find()) and using the $set operator to set the new value.

> db.people.update( { name: "Miguel" }, { $set: { age: 99 } })

By default update() will only modify a single document. However, with the multi option, we can update all of the documents that match the query.

> db.people.update( { name: { $lt: "M" } }, { $inc: { age: 10 } }, { multi: true } )

We used the $inc update operator to increase the existing value.
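The behavior of $set, $inc, and the multi option can be mimicked with a toy function in plain JavaScript. This update helper is hypothetical and works on an in-memory array; to keep it short, it takes a predicate function in place of a criteria object.

```javascript
// Hypothetical helper modelled on the two commands above (NOT the real API).
// isMatch is a plain function here instead of a criteria object.
function update(docs, isMatch, changes, options) {
  const multi = Boolean(options && options.multi);
  let modified = 0;
  for (const doc of docs) {
    if (!isMatch(doc)) continue;
    // $set replaces the field's value.
    for (const field of Object.keys(changes.$set || {})) {
      doc[field] = changes.$set[field];
    }
    // $inc adds to the existing value.
    for (const field of Object.keys(changes.$inc || {})) {
      doc[field] += changes.$inc[field];
    }
    modified += 1;
    if (!multi) break; // default: stop after the first match
  }
  return modified;
}

const people = [
  { name: "Emma", age: 20 },
  { name: "Celeste", age: 33 },
  { name: "Miguel", age: 46 },
];

update(people, (doc) => doc.name === "Miguel", { $set: { age: 99 } });
const count = update(people, (doc) => doc.name < "M", { $inc: { age: 10 } }, { multi: true });

console.log(count, people.map((doc) => doc.age)); // prints 2 [ 30, 43, 99 ]
```

"Miguel" is not less than "M" in string comparison, so only Emma and Celeste receive the increment.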

Here is the list of Update Operators available.

Challenge: add another contact to our person document named "Manny". Hint: you will need to use the link above to discover which Array Update Operator to use for this.

Removing Data

We use the remove() method to remove data from collections.

If you want to completely remove a collection, including all of its indexes, use db.[name of the collection].drop().

Call remove({}) on the collection to remove all docs from a collection. Note: all documents will match the "empty" criteria.

Otherwise, specify criteria to remove all documents that match:

> db.people.remove( { age: { $lt: 16 } } )

Choose a person document by name and remove them from the collection.

Exercise

  • Analyze and determine how you would model the following:
    • Gamers play games
    • Gamers wish to log each time they play a particular game and save the score
    • Gamers like to become members of meetups that specialize in a particular game
    • The meetups have lots of information, like their schedule, speakers, membership, etc. (don't worry about creating all of these attributes).

References

mongoDB homepage

MongoLab

Robomongo - MongoDB Management Tool

Essential Questions

SQL Tables are represented in MongoDB with ______?
Collections

While in MongoDB's shell, what command would we enter to retrieve all of the documents from a collection named books?
db.books.find()
