Multikey indexing lets you index the values inside an array, so queries on array contents are fast. Example: colors: [ "red", "blue", "green" ]
On the way out (queries): Is red in the colors array? Is red not in the colors array?
Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte
Vector: $in, $nin, $all, $size
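A rough sketch of what the array questions above look like in practice. The shell queries in the comments are what you would run against a server; the runnable part is only a plain JavaScript stand-in over an in-memory array. The `shirts` collection and its documents are made up for illustration.

```javascript
// Hypothetical sample data (not from the talk).
const shirts = [
  { _id: 1, colors: ["red", "blue", "green"] },
  { _id: 2, colors: ["blue"] },
  { _id: 3, colors: ["red", "black"] },
];

// A multikey index would be created with: db.shirts.createIndex({ colors: 1 })

// Is red in the colors array?      db.shirts.find({ colors: "red" })
const hasRed = shirts.filter((s) => s.colors.includes("red"));

// Is red not in the colors array?  db.shirts.find({ colors: { $ne: "red" } })
const noRed = shirts.filter((s) => !s.colors.includes("red"));

// Any of these values?             db.shirts.find({ colors: { $in: ["black", "green"] } })
const anyOf = shirts.filter((s) => ["black", "green"].some((c) => s.colors.includes(c)));

// All of these values?             db.shirts.find({ colors: { $all: ["red", "blue"] } })
const redAndBlue = shirts.filter((s) => ["red", "blue"].every((c) => s.colors.includes(c)));

// Exactly one element?             db.shirts.find({ colors: { $size: 1 } })
const exactlyOne = shirts.filter((s) => s.colors.length === 1);

console.log(hasRed.map((s) => s._id)); // [ 1, 3 ]
console.log(noRed.map((s) => s._id));  // [ 2 ]
```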
On the way in (atomic updates):
Scalar: $inc, $set, $unset
Vector: $push, $pop, $pull, $pullAll, $addToSet
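To make the update operators concrete, here is a small sketch. `applyUpdate` is a hypothetical helper written for these notes, not MongoDB code: it mimics a subset of the operators on a plain object, top-level fields and primitive values only. On a real server the same update document would go to something like `db.shirts.updateOne({ _id: 1 }, { ... })` and be applied atomically to that one document.

```javascript
// Hypothetical helper: mimics a few update operators on a plain object.
// Handles top-level fields and primitive values only.
function applyUpdate(doc, update) {
  for (const [field, n] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + n;
  }
  for (const [field, v] of Object.entries(update.$set || {})) {
    doc[field] = v;
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete doc[field];
  }
  for (const [field, v] of Object.entries(update.$push || {})) {
    (doc[field] = doc[field] || []).push(v);
  }
  for (const [field, v] of Object.entries(update.$addToSet || {})) {
    const arr = (doc[field] = doc[field] || []);
    if (!arr.includes(v)) arr.push(v); // only add when missing
  }
  for (const [field, v] of Object.entries(update.$pull || {})) {
    if (Array.isArray(doc[field])) doc[field] = doc[field].filter((x) => x !== v);
  }
  return doc;
}

// Made-up document for illustration.
const shirt = { _id: 1, colors: ["red"], views: 0 };
applyUpdate(shirt, { $inc: { views: 1 }, $addToSet: { colors: "red" } }); // colors unchanged
applyUpdate(shirt, { $push: { colors: "blue" } });
applyUpdate(shirt, { $pull: { colors: "red" } });
console.log(shirt); // { _id: 1, colors: [ 'blue' ], views: 1 }
```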
//Case #1: As a librarian, when I swipe a patron's card, I want to verify their address
//• One-to-one relationships = "Belongs to". They are often embedded
patron = {
_id: "joe",
name: "joe trojan",
address: {
street: "123 face st. ",
city: "Los Angeles",
state: "MA",
Zip: 10000
}
}
//Case #2: As a librarian, I want to store multiple addresses so I have a better chance of hunting you down
patron = {
_id: "joe",
name: "joe trojan",
join_date: ISODate("2011-10-14"),
address: [
{ street: "123 face st. ", city: "Los Angeles", state: "MA", Zip: 10000 },
{ street: "456 face st. ", city: "Los Angeles", state: "MA", Zip: 10000 }
]
}
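A quick sketch of querying the embedded address array. In the shell, dot notation reaches into each element, so `db.patrons.find({ "address.city": "Los Angeles" })` matches a patron if any of their addresses is in that city. The runnable part is a plain JavaScript stand-in; the `patrons` collection name and the second patron are assumptions added for contrast.

```javascript
const patrons = [
  {
    _id: "joe",
    name: "joe trojan",
    address: [
      { street: "123 face st.", city: "Los Angeles", state: "MA", Zip: 10000 },
      { street: "456 face st.", city: "Los Angeles", state: "MA", Zip: 10000 },
    ],
  },
  // Hypothetical second patron, not in the notes.
  { _id: "ann", name: "ann example", address: [{ street: "1 main st.", city: "Boston", state: "MA", Zip: 20000 }] },
];

// Shell equivalent: db.patrons.find({ "address.city": "Los Angeles" })
const inLosAngeles = patrons.filter((p) => p.address.some((a) => a.city === "Los Angeles"));

// Verifying addresses after a card swipe is a single lookup by _id.
// Shell equivalent: db.patrons.findOne({ _id: "joe" })
const joe = patrons.find((p) => p._id === "joe");

console.log(inLosAngeles.map((p) => p._id)); // [ 'joe' ]
console.log(joe.address.length);             // 2
```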
- As a librarian, I want to see the publisher of a book
//• A publisher puts out a lot of books
book = {
_id: "123",
title: "Mongo DB Book",
authors: [],
published_date: ISODate("2010-09-24"),
pages: 215,
language: "English",
publisher: {
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
}
//This causes trouble because it'll be hard to find all the publishers in a query. Plus, this data is immutable (meaning that it will never change) but
- A librarian wants to query for all the publishers in the system
//If you don't care about history, you don't need to worry about changes to the publisher, but data is history, so you must respect it.
publisher = {
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
books = {
_id: "123",
publisher_id: "oreilly"
}
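With linking, seeing a book's publisher costs a second lookup, but listing all publishers becomes a plain query on one collection. A minimal sketch, assuming collections named `books` and `publishers`; the shell calls are in the comments and the runnable part is an in-memory stand-in.

```javascript
const publishers = [{ _id: "oreilly", name: "O'Reilly Media", founded: 1980, location: "CA" }];
const books = [{ _id: "123", title: "Mongo DB Book", publisher_id: "oreilly" }];

// Shell equivalent (two round trips):
//   const book = db.books.findOne({ _id: "123" })
//   const publisher = db.publishers.findOne({ _id: book.publisher_id })
function publisherOf(bookId) {
  const book = books.find((b) => b._id === bookId);
  if (!book) return null;
  return publishers.find((p) => p._id === book.publisher_id) || null;
}

// Shell equivalent: db.publishers.find()
const allPublisherNames = publishers.map((p) => p.name);

console.log(publisherOf("123").name); // O'Reilly Media
console.log(allPublisherNames);       // [ "O'Reilly Media" ]
```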
//Attempt #3: bad idea because the books array grows without bound
publisher = {
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
books: [ "123", ...]
}
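Instead of keeping a growing array on the publisher, the same list can be derived from the book side. A sketch under the assumption that each book carries a `publisher_id` as in the linked design; the extra books are hypothetical.

```javascript
const books = [
  { _id: "123", publisher_id: "oreilly" },
  { _id: "456", publisher_id: "oreilly" }, // hypothetical
  { _id: "789", publisher_id: "other" },   // hypothetical
];

// Shell equivalent, ideally backed by an index on publisher_id:
//   db.books.createIndex({ publisher_id: 1 })
//   db.books.find({ publisher_id: "oreilly" })
function bookIdsByPublisher(publisherId) {
  return books.filter((b) => b.publisher_id === publisherId).map((b) => b._id);
}

console.log(bookIdsByPublisher("oreilly")); // [ '123', '456' ]
```

The publisher document stays a fixed size no matter how many books it puts out.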
//Case #6: Find authors of book 'foo'
book = {
_id: "123",
title: "Mongo DB Book",
authors: [
{ id: "kchodoworow", name: "kristina cho" },
{ id: "mdirol", name: "Mike dirioli" }
],
published_date: ISODate("2010-09-24"),
pages: 215,
language: "English",
}
author = {
_id: "kchodoworow",
name: "kristina cho",
hometown: "New York"
}
author = {
_id: "mdirol",
name: "Mike dirioli",
hometown: "CA"
}
//Attempt #2: Let's put the book in the author
book = {
_id: "123",
title: "Mongo DB Book",
published_date: ISODate("2010-09-24"),
pages: 215,
authors: [
{ id: "kchodoworow", name: "kristina cho" },
{ id: "mdirol", name: "Mike dirioli" }
],
language: "English",
}
author = {
_id: "kchodoworow",
name: "kristina cho",
hometown: "New York",
books: [
{ id: "123", title: "Mongo DB Book" }
]
}
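Copying names and titles to both sides makes reads cheap and writes fan out. A sketch of that trade-off: the second author's `books` array is an assumption (the notes only show it for the first author), and the collection names in the shell comments are assumed too.

```javascript
const book = {
  _id: "123",
  title: "Mongo DB Book",
  authors: [
    { id: "kchodoworow", name: "kristina cho" },
    { id: "mdirol", name: "Mike dirioli" },
  ],
};
const authors = [
  { _id: "kchodoworow", name: "kristina cho", books: [{ id: "123", title: "Mongo DB Book" }] },
  { _id: "mdirol", name: "Mike dirioli", books: [{ id: "123", title: "Mongo DB Book" }] }, // assumed
];

// Read: author names are already inside the book, so one lookup is enough.
const authorNames = book.authors.map((a) => a.name);

// Write: renaming the book means touching every embedded copy.
// Shell equivalent:
//   db.books.updateOne({ _id: "123" }, { $set: { title: newTitle } })
//   db.authors.updateMany({ "books.id": "123" }, { $set: { "books.$.title": newTitle } })
function renameBook(newTitle) {
  book.title = newTitle;
  let copiesUpdated = 0;
  for (const author of authors) {
    for (const b of author.books) {
      if (b.id === book._id) {
        b.title = newTitle;
        copiesUpdated += 1;
      }
    }
  }
  return copiesUpdated;
}

console.log(authorNames); // [ 'kristina cho', 'Mike dirioli' ]
```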
Embedding
- Great for read performance
- One seek to load the entire object
- One round trip to the database
- Writes can be slow
- Maintaining data integrity takes extra work
- MongoDB caps a document at 16 MB. The Gutenberg Bible is 4 MB
Linking
- More flexibility
- Data integrity is maintained
- Work is done during reads
If a book has categories, then we can put books in a category hierarchy //Problem
book = {
category: "mongo db"
}
category = { _id: "mongo db", parent: "databases" }
category = { _id: "databases", parent: "programming" }
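With parent pointers alone, finding everything above a category means walking up one level at a time, and on a real server each step would be its own query. A minimal sketch; the root category with `parent: null` and the `categories` collection name are assumptions, not part of the notes.

```javascript
const categories = [
  { _id: "mongo db", parent: "databases" },
  { _id: "databases", parent: "programming" },
  { _id: "programming", parent: null }, // hypothetical root
];

// Each loop iteration stands in for one lookup:
//   db.categories.findOne({ _id: parentId })
function ancestorsOf(categoryId) {
  const chain = [];
  let current = categories.find((c) => c._id === categoryId);
  while (current && current.parent) {
    const parentId = current.parent;
    chain.push(parentId);
    current = categories.find((c) => c._id === parentId);
  }
  return chain;
}

console.log(ancestorsOf("mongo db")); // [ 'databases', 'programming' ]
```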
//Option #1: Store the category hierarchy in each book
book = {
//If you index the categories array, lookups by category are efficient
categories: [ "MongoDB", "Databases", "Programming" ]
}
book = {
categories: [ "MySQL", "Databases", "Programming" ]
}
book = {
categories: [ "MySQL", "Databases", "Programming" ]
}
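With the whole hierarchy stored on each book, "everything under Databases" becomes a single array-membership query. A sketch assuming a `books` collection; the `_id` values and the third book are made up.

```javascript
const books = [
  { _id: 1, categories: ["MongoDB", "Databases", "Programming"] },
  { _id: 2, categories: ["MySQL", "Databases", "Programming"] },
  { _id: 3, categories: ["Haskell", "Programming"] }, // hypothetical
];

// Shell equivalent:
//   db.books.createIndex({ categories: 1 })   // multikey index
//   db.books.find({ categories: "Databases" })
function bookIdsInCategory(name) {
  return books.filter((b) => b.categories.includes(name)).map((b) => b._id);
}

console.log(bookIdsInCategory("Databases"));   // [ 1, 2 ]
console.log(bookIdsInCategory("Programming")); // [ 1, 2, 3 ]
```

The cost is on the write side: moving a category in the tree means rewriting the array on every book beneath it.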
//Option #2: Putting things into a string
book = {
category: "Programming/Databases/MySQL"
}
book = {
category: "Programming/Databases/MongoDB"
}
//If you have a problem and solve it with regular expressions, you now have two problems
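A sketch of the path-string approach, assuming paths are stored root-first (for example "Programming/Databases/MySQL") so that a subtree is a common prefix. An anchored, case-sensitive prefix regex is the kind MongoDB can serve from an index on the field. The third book and the `_id` values are made up.

```javascript
const books = [
  { _id: 1, category: "Programming/Databases/MySQL" },
  { _id: 2, category: "Programming/Databases/MongoDB" },
  { _id: 3, category: "Programming/Languages/Haskell" }, // hypothetical
];

// Shell equivalent: db.books.find({ category: /^Programming\/Databases\// })
const underDatabases = books.filter((b) => /^Programming\/Databases\//.test(b.category));

console.log(underDatabases.map((b) => b._id)); // [ 1, 2 ]
```

The trailing slash in the pattern keeps a sibling such as "Programming/DatabasesOther" from matching by accident.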
//Option #3: Interval trees