Created
November 21, 2011 16:23
-
-
Save albertoperdomo/1383129 to your computer and use it in GitHub Desktop.
Examples with MongoDB
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is a case study of how we would model parts of the spotify app in MongoDB. | |
MODEL | |
===== | |
users | |
----- | |
username | |
name | |
plain_txt_password | |
plan | |
list_ids: [id, id, id] | |
friend_ids: [id, id, id] | |
tracks | |
------ | |
id | |
title | |
album_id | |
artist_ids [1,2,8] | |
albums | |
------ | |
id | |
title | |
artist_id | |
published_at | |
produced_by | |
track_count | |
awards:[{id: new ObjectId(), title: "best pitbull featuring version", year: 2010, awarded_by: "MTV"}] | |
artists | |
------- | |
id | |
name | |
bio | |
picture: [{path: ..., filename: ..., version: "thumb", ...}] | |
active: true | |
overdose: true | |
lists | |
----- | |
id | |
user_id | |
title | |
public: true/false | |
track_ids: [id, id, id, id] | |
subscriber_count | |
plays (this collection should be hosted on a different host & database, because it will have a lot of frequent high volume writes) | |
----- | |
user_id | |
track_id | |
played_at | |
duration | |
SAMPLE QUERIES | |
============== | |
Your n favorite songs (the ones with the most plays by you) | |
----------------------------------------------------------- | |
One possibility is to use group but it has some limitations. We cannot sort by number of plays or limit the number of items we want to retrieve. | |
Group query: | |
db.plays.group({ query: { user_id: xxx }, key: 'track_id', reduce: function(doc, prev) { prev.n++; if (doc.played_at > prev.last_played_at) { prev.last_played_at = doc.played_at } }, initial: { n: 0, last_played_at: 0 }) | |
result: | |
[ { _id: ..., value: { n: ..., last_played_at: ...} | |
map reduce query | |
---------------- | |
db.plays.mapReduce( | |
function () { emit( this.track_id, { n: 1, t: this.played_at }) }, | |
function (key, values) { | |
//// Prettier implementation | |
var n_res = 0; | |
var max_t = 0; | |
values.forEach(function(v){ | |
n_res+=v.n; | |
if(v.t > max_t) max_t = v.t; | |
}) | |
return {n: n_res, t: max_t}; | |
///// Implementation with better performance | |
var ac = values[0]; | |
for(var i = 1; i < values.length; i++) { | |
ac.n += values[i].n; | |
var v_t = values[i].t; | |
if(v_t > ac.t) ac.t = v_t; | |
} | |
return ac; | |
}, | |
out: "fooo" | |
query: { user_id: xxx }) | |
db.foo.find().sort(..) | |
How this map reduce works: | |
Sample data: | |
track_id played_at | |
-------------- | |
1 100 | |
2 101 | |
1 102 | |
2 100 | |
1 200 | |
emit(1, { n: 1, t: 100 }) | |
emit(2, { n: 1, t: 101 }) | |
emit(1, { n: 1, t: 102 }) | |
emit(2, { n: 1, t: 100 }) | |
emit(1, { n: 1, t: 200 }) | |
reduce(1, [ { n: 1, t: 100 }, { n: 1, t: 102 }, { n: 1, t: 200 } ]) | |
=> { n: 3, t: 200 } | |
reduce(2, [{ n: 1, t: 100 }, { n: 1, t: 101 }]) | |
=> { n: 2, t: 101 } | |
... | |
Ideally this query should run in background and the results (e.g. your 10 most played songs) stored in the user's document for fast retrieval |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Option A: comments are referencing a blogpost, like typically in ActiveRecord | |
============================================================================= | |
posts | |
----- | |
id | |
title | |
body | |
author_id -> user | |
comments | |
-------- | |
id | |
text | |
author_id -> user | |
post_id -> post | |
Option B: comments are embedded in the blogpost document | |
======================================================== | |
posts | |
----- | |
id | |
title | |
body | |
author_id -> user | |
comments: [{_id: ...., text: ..., author: ...., approved: true}, {...}] | |
An ID attribute is provided for the comment so that they can be handled invidually, as if they would be a separate entity, e.g. for approval process: Search blogpost by comment ID and set approved attribute. | |
posts.find({"comments._id": xxx}) | |
It's also possible to define in the query params which parts of the document we do not want to retrieve, so we reduce the overhead when we are only interested in certain parts, e.g. the comment itself. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment