@misostack
Last active July 19, 2022 05:04
100 days mongodb

Data Modeling Methodology

Overall

Input

  • Scenarios
  • Business Domain Experts
  • Production Logs & Stats
  • Data Modeling Experts

Workload

  • size data
  • quantify ops
  • qualify ops

Relationships

  • identify
  • quantify
  • embed or link

Apply Schema Design Patterns

  • recognize
  • apply

Schema

  • Queries
  • Indexes
  • Data sizing
  • Operations
  • Assumptions

image

Recap

image

Methodologies

image image image image

Performance criteria

  • Sharding: because of data size
  • Reads & writes
  • Tons of operations
  • Operations per second
  • Required latency
  • Pinning attributes queries

=> More collections

  1. Identify your workload
  2. Embed or link

Identify Workload

image image image

image

image image image

image image image image

Relationship

image image

Embed vs Link

image image

image image

One to Many

image

Embedded in the One Side

image

Embedded in the Many Side

image

One to Many, Reference in the One Side

image

One to Many, Reference in the Many Side

image

image

Many To Many Relationship

image

image

image

Many to Many, Embed in the Main Side

image

Many to Many, Reference in the Main Side

image

Many to Many, Reference in the Secondary Side

image

image

One to One

image

image image image image image image image image

misostack commented Jun 7, 2022

One to One Relationship

image

One to One Embed

image

One to One Reference

image

image

Sample

From
image

To

image

misostack commented Jun 7, 2022

One to Zillions Relationship

image

image

Solution

image
image
image

Patterns

Reusable solutions that capture accumulated knowledge of:

  • Data Modeling
  • Schema Design

Topics

  • Optimize large documents with the Subset Pattern
  • Use the Computed Pattern to avoid repetitive computations
  • Structure similar fields with the Attribute Pattern
  • Handle changes to your deployment without downtime with the Schema Versioning Pattern

misostack commented Jun 8, 2022

Datatypes

ObjectId

image

mongosh.exe
test> x = ObjectId()
ObjectId("62a025dca5ad6127dbacd972")
test> x.toString()
62a025dca5ad6127dbacd972
test> x.getTimestamp()
ISODate("2022-06-08T04:30:20.000Z")
test> x.toString().length;
24
test>
ObjectId("62a02802a5ad6127dbacd975")
test> ObjectId(Date.now()/1000)
ObjectId("62a0280ea5ad6127dbacd976")
test> ObjectId(Date.now()/1000)
ObjectId("62a02811a5ad6127dbacd977")
test> db.examples.insert({"_id":ObjectId(), "title":"A", "artist":"B", "room": "r", "spot":"s", "on_display":true, "in_house": true, "events":[{"k":"e1","v": new Date()}]});
DeprecationWarning: Collection.insert() is deprecated. Use insertOne, insertMany, or bulkWrite.
{
  acknowledged: true,
  insertedIds: { '0': ObjectId("62a02b56a5ad6127dbacd979") }
}
test> db.examples.find({})
[
  {
    _id: ObjectId("62a02b56a5ad6127dbacd979"),
    title: 'A',
    artist: 'B',
    room: 'r',
    spot: 's',
    on_display: true,
    in_house: true,
    events: [ { k: 'e1', v: ISODate("2022-06-08T04:53:42.776Z") } ]
  }
]
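
The session above shows that `getTimestamp()` recovers a creation time from the ObjectId itself: the first 4 bytes (8 hex characters) hold seconds since the Unix epoch. A minimal sketch of that decoding in plain JavaScript, with no driver involved; the helper name `objectIdTimestamp` is made up for illustration:

```javascript
// Hypothetical helper: decode the creation time embedded in an ObjectId hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Same ObjectId as in the mongosh session above.
const created = objectIdTimestamp("62a025dca5ad6127dbacd972");
console.log(created.toISOString()); // 2022-06-08T04:30:20.000Z
```

The printed value matches the `ISODate` that mongosh reported for the same id.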

misostack commented Jun 8, 2022

Handling Duplication, Staleness, and Referential Integrity

image

Good things come at a cost

Duplication

image

image

image

image

Staleness

image

image

Referential Integrity

image

image

image

misostack commented Jun 8, 2022

Attribute Pattern

image

image

Solution

image

image
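
As a rough illustration of the Attribute Pattern, the sketch below moves a set of similar fields into an array of `{ k, v }` pairs, the same shape as the `events` array in the ObjectId example. It uses plain JavaScript objects rather than a live collection, and the document, field names, and the `toAttributes` helper are all assumptions made for the example:

```javascript
// Hypothetical document with one field per release date.
const movie = {
  title: "A",
  release_USA: "2022-06-01",
  release_France: "2022-06-08",
  release_Italy: "2022-06-15",
};

// Move every field that starts with `prefix` into an array of { k, v } pairs.
function toAttributes(doc, prefix, arrayField) {
  const result = { [arrayField]: [] };
  for (const [key, value] of Object.entries(doc)) {
    if (key.startsWith(prefix)) {
      result[arrayField].push({ k: key.slice(prefix.length), v: value });
    } else {
      result[key] = value;
    }
  }
  return result;
}

const reshaped = toAttributes(movie, "release_", "releases");
console.log(JSON.stringify(reshaped, null, 2));
```

With this shape, one compound index on `releases.k` and `releases.v` could plausibly serve queries on any release date, instead of one index per original field.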

Another case

image

image

Recap

image

image

image

Extended Reference Pattern

Too Many Joins Nightmares

image

Solution

image

image

image
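
A minimal sketch of the Extended Reference idea, assuming an order page that only ever shows the customer's name and shipping address. The documents and the `buildOrder` helper are hypothetical, and plain objects stand in for collections:

```javascript
// Hypothetical customer document.
const customer = {
  _id: "c1",
  name: "Ada",
  email: "ada@example.com",
  address: { street: "1 Main St", city: "Hanoi", country: "VN" },
  loyaltyPoints: 1200,
};

// Keep the reference (_id) plus a copy of the few customer fields the order
// page reads, so rendering an order needs no join.
function buildOrder(orderId, customerDoc, items) {
  return {
    _id: orderId,
    items,
    customer: {
      _id: customerDoc._id,
      name: customerDoc.name,
      shippingAddress: { ...customerDoc.address },
    },
  };
}

const order = buildOrder("o1", customer, [{ sku: "book", qty: 1 }]);
console.log(order.customer);
```

The copied fields are duplicated data, so this works best for fields that rarely change, such as the address an order was shipped to.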

Recap

image

misostack commented Jun 8, 2022

Subset Pattern

MongoDB tries to optimize its use of RAM by loading the documents it needs from disk into memory. When no more memory is available, it evicts pages containing documents it no longer needs to make room for the documents it has to process at the moment.

image

Solution

image

image

Split Collections

image

image

image
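
The split can be sketched as follows, assuming a product whose page shows only the newest reviews. The frequently read document keeps a small `topReviews` subset, while every review also goes to a separate collection. All names here are illustrative and plain objects stand in for collections:

```javascript
// Hypothetical product with all of its reviews embedded.
const bigProduct = {
  _id: "p1",
  name: "Kettle",
  reviews: [
    { date: "2022-06-01", stars: 4 },
    { date: "2022-06-08", stars: 5 },
    { date: "2022-05-20", stars: 2 },
    { date: "2022-06-05", stars: 3 },
  ],
};

// Return a small product document plus the documents for a reviews collection.
function splitProduct(product, keep) {
  // Newest first, so the embedded subset is what the product page shows.
  const newestFirst = [...product.reviews].sort((a, b) => b.date.localeCompare(a.date));
  const { reviews, ...rest } = product;
  return {
    product: { ...rest, topReviews: newestFirst.slice(0, keep) },
    reviews: newestFirst.map((r) => ({ ...r, productId: product._id })),
  };
}

const split = splitProduct(bigProduct, 2);
console.log(split.product.topReviews.map((r) => r.date));
```

The smaller product documents take less room in memory, at the cost of duplicating the embedded reviews and keeping the subset up to date.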

misostack commented Jun 8, 2022

Computed Pattern

image

Mathematical Operations

image

image

Fan Out Operations

A fan-out operation performs many tasks to represent one logical task.

  1. Fan Out of Reads
  2. Fan Out of Writes

image

Shared photos

image

Roll Up Operations

image

Roll Up Operations: we merge data together. E.g., grouping categories under a parent category is a roll-up, as is grouping time-based data from small intervals into larger ones, which gives hourly, daily, monthly, and yearly summaries for reporting.
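
A small sketch of a time-based roll-up, assuming hourly page-view rows that are summarized per day. The data and the `rollUpDaily` helper are invented for the example; in practice the summary would be computed once and stored rather than recomputed on every read:

```javascript
// Hypothetical hourly measurements.
const hourly = [
  { ts: "2022-06-08T01:00:00Z", views: 10 },
  { ts: "2022-06-08T02:00:00Z", views: 5 },
  { ts: "2022-06-09T01:00:00Z", views: 7 },
];

// Merge hourly rows into one summary document per day.
function rollUpDaily(rows) {
  const byDay = new Map();
  for (const { ts, views } of rows) {
    const day = ts.slice(0, 10); // "YYYY-MM-DD" part of the ISO timestamp
    const summary = byDay.get(day) ?? { day, views: 0, samples: 0 };
    summary.views += views;
    summary.samples += 1;
    byDay.set(day, summary);
  }
  return [...byDay.values()];
}

const daily = rollUpDaily(hourly);
console.log(daily);
```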

image

image

Recap

image

image

image

misostack commented Jun 9, 2022

Bucket Pattern

image

Problems:

10M devices send data every minute. How do we store it?
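
One possible answer, sketched with plain objects: instead of one document per reading, group readings into one bucket document per device per hour and keep a running count and sum on the bucket. The one-hour bucket size and all field names are assumptions for the example:

```javascript
// Group per-minute readings into one bucket document per device per hour.
function bucketReadings(readings) {
  const buckets = new Map();
  for (const { deviceId, ts, value } of readings) {
    const hour = ts.slice(0, 13); // e.g. "2022-06-09T10"
    const key = `${deviceId}|${hour}`;
    if (!buckets.has(key)) {
      buckets.set(key, { deviceId, hour, count: 0, sum: 0, readings: [] });
    }
    const bucket = buckets.get(key);
    bucket.readings.push({ ts, value });
    bucket.count += 1; // precomputed so averages need no array scan
    bucket.sum += value;
  }
  return [...buckets.values()];
}

// Hypothetical readings from one device.
const sampleBuckets = bucketReadings([
  { deviceId: "d1", ts: "2022-06-09T10:00:00Z", value: 20 },
  { deviceId: "d1", ts: "2022-06-09T10:01:00Z", value: 22 },
  { deviceId: "d1", ts: "2022-06-09T11:00:00Z", value: 25 },
]);
console.log(sampleBuckets.map((b) => [b.hour, b.count, b.sum]));
```

At one reading per minute, an hourly bucket cuts the number of documents by a factor of 60.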

image

image

Maximum: 64KB/document

Example Collaboration Platform

image

Row Oriented vs Column Oriented

image

image

Recap

image

image

image

misostack commented Jun 10, 2022

Schema Versioning Pattern

Issues

image

  • Do you remember modifying the schema of your relational database?
  • Performing complex tasks under pressure to limit the amount of downtime for your users?

image
image

Solution

image

image

image

image
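
A minimal sketch of how an application might handle two document shapes at once, assuming each document carries a `schema_version` field and that documents without it count as version 1. The version 2 shape (`contacts` array) and the `upgradePerson` helper are hypothetical:

```javascript
// Version 1 stored a single phone string; version 2 (hypothetical) stores a
// contacts array. Documents without schema_version are treated as version 1.
function upgradePerson(doc) {
  const version = doc.schema_version ?? 1;
  if (version >= 2) return doc; // already in the current shape
  const { phone, ...rest } = doc;
  return {
    ...rest,
    schema_version: 2,
    contacts: phone ? [{ type: "phone", value: phone }] : [],
  };
}

const oldDoc = { _id: 1, name: "Ann", phone: "555-0100" };
console.log(upgradePerson(oldDoc));
```

Because old and new shapes can coexist, documents can be upgraded gradually, for instance whenever they are next read or written, rather than in one migration that requires downtime.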

Recap

image

image

image

misostack commented Jul 19, 2022

Tree Patterns

image

Questions

image

List of patterns

image

Parent References

Who are the ancestors of node X?

image

Who reports to Y?

image

Change all categories under N to be under P?

image

image
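
To make the trade-off of Parent References concrete, here is a sketch with an in-memory array standing in for a collection. Finding direct children is a single filter on `parent`, while finding ancestors takes one lookup per level. The category names and helper functions are made up for the example:

```javascript
// Hypothetical category tree; each node stores only its parent's _id.
const categories = [
  { _id: "Electronics", parent: null },
  { _id: "Computers", parent: "Electronics" },
  { _id: "Laptops", parent: "Computers" },
  { _id: "Phones", parent: "Electronics" },
];
const byId = new Map(categories.map((c) => [c._id, c]));

// Walk up the parent links; each step stands for one lookup.
function ancestorsOf(id) {
  const result = [];
  let current = byId.get(id);
  while (current && current.parent !== null) {
    result.push(current.parent);
    current = byId.get(current.parent);
  }
  return result;
}

// Direct children need only one query on the parent field.
function childrenOf(id) {
  return categories.filter((c) => c.parent === id).map((c) => c._id);
}

console.log(ancestorsOf("Laptops"));
console.log(childrenOf("Electronics"));
```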

Child References

image

image

Array of Ancestors

image

image

Materialized Paths

image

image
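
A sketch of Materialized Paths under one assumed convention: each node stores the dot-separated ids of its ancestors in a `path` string. Ancestors are read straight from the string, and descendants are the nodes whose path starts with the node's own full path. The delimiter and all names are illustrative choices:

```javascript
// Hypothetical tree; path lists the ancestors, root first.
const nodes = [
  { _id: "Electronics", path: "" },
  { _id: "Computers", path: ".Electronics" },
  { _id: "Laptops", path: ".Electronics.Computers" },
  { _id: "Phones", path: ".Electronics" },
];

// Ancestors come straight from the stored path, no extra lookups.
function ancestorsFromPath(node) {
  return node.path.split(".").filter(Boolean);
}

// Descendants are nodes whose path starts with this node's full path.
function descendantsOf(node) {
  const prefix = `${node.path}.${node._id}`;
  return nodes
    .filter((n) => n.path === prefix || n.path.startsWith(prefix + "."))
    .map((n) => n._id);
}

console.log(ancestorsFromPath(nodes[2]));
console.log(descendantsOf(nodes[0]));
```

Against a real collection the prefix match would typically be a regular expression anchored at the start of `path`, which is the kind of query an index on that field can support.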

Solution

Mixed Patterns

image

Summary

image

image

image

Assessment

image

Initial model

{
  "_id": "<objectId>",
  "name": "<string>",
  "role": "<string>",
  "department": {
    "name": "<string>",
    "id": "<objectId>"
  }
}

Issue

image

Polymorphic Pattern

Grouping Objects Together

image

Sample

image

image

image

Single View

image

image

image

image

image

misostack commented Jul 19, 2022

Summary

image

Approximation Pattern

image
image
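
One common way to apply the Approximation Pattern is to write a counter only on roughly 1 in N events and add N each time, trading exactness for fewer writes. A minimal sketch of that idea; the `makeApproxCounter` helper and its injectable `random` parameter are assumptions for the example, and the in-memory `state` stands in for a stored document:

```javascript
// Returns an object whose hit() records one page view. On roughly 1 call in
// `factor` it applies a write of +factor; otherwise it writes nothing.
function makeApproxCounter(factor, random = Math.random) {
  const state = { count: 0, writes: 0 };
  const hit = () => {
    if (random() < 1 / factor) {
      state.count += factor; // stands in for one increment sent to the database
      state.writes += 1;
    }
  };
  return { state, hit };
}

const counter = makeApproxCounter(10);
for (let i = 0; i < 1000; i++) counter.hit();
// count is close to 1000 on average, with roughly a tenth of the writes.
console.log(counter.state);
```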

Outlier Pattern

image

E.g., a popular singer may have more than 1 million followers in the app.
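
A sketch of how such an outlier might be handled: the embedded array is capped, extra entries go to overflow documents, and a flag on the main document tells the application to fetch them. The `has_extras` flag name, the limit, and the `addFollower` helper are assumptions for the example:

```javascript
// Cap the embedded array; extra entries go to overflow documents and the
// main document is flagged so the application knows to fetch them.
function addFollower(artist, overflow, followerId, limit) {
  if (artist.followers.length < limit) {
    artist.followers.push(followerId);
    return;
  }
  artist.has_extras = true;
  overflow.push({ artistId: artist._id, followerId });
}

// Hypothetical artist with a tiny limit to keep the example short.
const artist = { _id: "a1", followers: [] };
const overflowDocs = [];
for (const id of ["u1", "u2", "u3"]) addFollower(artist, overflowDocs, id, 2);
console.log(artist, overflowDocs);
```

Typical documents never get the flag, so the common case stays simple and only the few outliers pay for the extra lookup.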

image

image

More

image

image
