Created
April 25, 2011 19:56
Ezra NoSQL Talk
#Ruby Meetup Group
Ezra
Limitations of SQL
(horizontal databases) - not covered in MySQL
- Limitations
- don't scale past a single master
#lesssql
- hybrid systems (solution)
*find a small part of the solution that is not on the critical path (session_data, logs, etc)
'Redis' - alternative database
Tour of New Database Types:
Redis
- fast, in-memory key/value store
- alternate data types -> lists, sets (hash table)
- set intersections (commonalities across users, lookup)
- sessions, hit counters, log buffers (can use)
Pros
- all operations happen in memory
Cons
- data has to fit in memory
- data structure server
Re-Distribute-Your-Load
Efficient Data Handling (IO based)
Scales (single threaded) (Like memcache)
- allows you to spread out over server machines
Uses: as fast as you can get from a data store
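
The set-intersection note above can be illustrated with a minimal sketch. Plain Python sets stand in for Redis sets here, so no Redis server or client library is involved. The helper names mirror the Redis SADD and SINTER commands, but the key names and user names are made up for illustration.

```python
# Plain Python sets stand in for Redis sets; this is not a Redis client.
# Key names ("followers:alice") and members are hypothetical.
store = {}

def sadd(key, *members):
    """Add members to the set stored at key, creating the set if needed."""
    store.setdefault(key, set()).update(members)

def sinter(*keys):
    """Return the members common to every named set (missing keys are empty)."""
    sets = [store.get(k, set()) for k in keys]
    return set.intersection(*sets) if sets else set()

sadd("followers:alice", "bob", "carol", "dave")
sadd("followers:erin", "carol", "dave", "frank")

# Commonalities across two users: who follows both alice and erin?
common = sinter("followers:alice", "followers:erin")
print(sorted(common))
```

Because both sets live in memory, the intersection is a single in-process operation, which is the property the notes attribute to Redis.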
Tokyo Cabinet
- Large Data workhorse
- Fully synchronous, no chance of losing data
- Memory caching
More key/value type
- can have extensible code structures built into system
Pros:
- Tokyo Server (80 GB)
- Fixed Length Records
- Efficient, smallest on-disk footprint
Cons:
- above 70GB, gets funky
Replication
- master, master
- master, slave
Uses For: Fastest, store large amounts of data, tune RAM server usage
- gets embedded in process
MongoDB
- document database
(MySQL of key/value stores) - easiest step from MySQL databases
- tables are collections of documents
- rolling buffer
- great complex queries
- index on attributes
- (Not tied down to schema)
- Set collections as shardable (auto-rebalancing)
- JSON document database
Cons: no transactions
Pros: recovery tools
- advanced query system
- I/O open, write - grid file system
- scales horizontally
MongoDB - fast synchronous writes, good for web, logging, statistics
- can use hugely complex queries
- have flexibility in queries
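
A rough sketch of the "collections of documents" and "index on attributes" ideas above, using only Python lists and dicts. It does not use MongoDB or its driver. The `events` collection, its fields, and the `build_index` helper are all hypothetical.

```python
from collections import defaultdict

# A "collection" is modeled as a list of documents (dicts).
# Documents are not tied to a schema: each one may carry different fields.
events = [
    {"_id": 1, "type": "login", "user": "alice"},
    {"_id": 2, "type": "page_view", "user": "bob", "path": "/pricing"},
    {"_id": 3, "type": "login", "user": "bob", "ip": "10.0.0.7"},
]

def build_index(collection, attribute):
    """Map each value of `attribute` to the documents that have that value."""
    index = defaultdict(list)
    for doc in collection:
        if attribute in doc:  # documents without the attribute are skipped
            index[doc[attribute]].append(doc)
    return index

# Index on one attribute, then filter further on another.
by_user = build_index(events, "user")
bob_logins = [doc for doc in by_user["bob"] if doc["type"] == "login"]
print(bob_logins)
```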
Riak
- Document-oriented DB
- HTTP/JSON query interface
- Add and remove nodes
- Erlang map/reduce query interface
- Tunable knobs: "I want you to write to 3 servers", etc (rule sets)
http://riak.basho.com
Pros:
- schemaless
- wants to stay alive
Cons:
- interface via HTTP, JSON
- Ruby binding
Uses for:
- manage
- add nodes when you need them
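
The "write to 3 servers" knob above can be sketched as a write quorum: a write counts as successful only if at least `w` replicas acknowledge it. This is a toy in-process model, not Riak's API. The `Replica` class, the `quorum_put` helper, and the parameter name `w` are assumptions made for illustration.

```python
class Replica:
    """Toy in-memory replica; `up=False` simulates an unreachable node."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def put(self, key, value):
        """Store the value and acknowledge, or refuse if the node is down."""
        if not self.up:
            return False
        self.data[key] = value
        return True

def quorum_put(replicas, key, value, w):
    """Send the write to every replica; succeed if at least w acknowledge."""
    acks = sum(1 for replica in replicas if replica.put(key, value))
    return acks >= w

# Three replicas, one of them down.
replicas = [Replica("a"), Replica("b"), Replica("c", up=False)]

ok_w2 = quorum_put(replicas, "user:1", {"name": "Ezra"}, w=2)  # 2 acks are enough
ok_w3 = quorum_put(replicas, "user:1", {"name": "Ezra"}, w=3)  # needs all 3
print(ok_w2, ok_w3)
```

A lower `w` keeps writes succeeding while a node is down; a higher `w` trades that availability for stronger durability.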
Cassandra
- Eventually consistent node distribution
- column families, etc
- structured key/value store
- can easily get data back sorted
RULES: rack aware, data aware, location aware
- When you need to scale out huge amounts of data
- Writes will always succeed
Pros:
- Can add as many nodes as you need
- Twitter will jump on board
- Scale out over petabyte
Cons:
Dynomite
- cliffmoon/dynomite
- no high level types
- Based on Amazon's Dynamo Papers
- key/blob
Uses:
- Large amount of files (static) that you want to serve
Cons:
- bring new nodes into cluster (system can easily get overloaded)
- (re-balance data)
*in active development
Use when you want to scale easily
Use as image asset store
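
The rebalancing concern above can be made concrete with consistent hashing, the partitioning scheme described in Amazon's Dynamo paper. The sketch below is not Dynomite's code. The `Ring` class, the virtual-node count, the node names, and the file names are all assumptions chosen for illustration.

```python
import bisect
import hashlib

def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        """Return the owner of the first point at or after the key's position."""
        i = bisect.bisect_right(self._points, (_hash(key), ""))
        if i == len(self._points):  # wrap around the ring
            i = 0
        return self._points[i][1]

keys = [f"image-{i}.png" for i in range(1000)]
ring = Ring(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add("node-d")  # bring a new node into the cluster
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Only the keys that now belong to the new node change owner. That data still has to be copied when the node joins, which is where the overload risk noted above comes from.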
Redis, Tokyo, MongoDB (stable)
*being used in production
*Cassandra (look out for stable release)
- Chef Recipes on GitHub
Pitfalls of #lesssql
- no referential integrity
- not as much tooling
- almost non-existent disaster recovery tools
- not as much production, "used in anger" experience
*Customers care (save the data!)
Cloud-Computing
- horizontal cloud computing
- add more nodes when you need them (cloud data)
*Hypertable (offline, large batch processing)
- map reduce, offline cron based processing
*HyperCube
- object relational mapping
*remapping
- logic trees (easier to build out in new style dbs)
Moneta (GitHub)
*InfoBrightEngine (for MySQL)
------------------------------------------------------
joins can be done within the client
- scalable by taking data from multiple end-points
------------------------------------------------------
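
A minimal sketch of a join done within the client, assuming the two datasets were fetched from two separate end-points. The `users` and `orders` data and the `join_orders_with_users` helper are hypothetical.

```python
# Pretend these came from two different stores (for example a key/value
# store for users and a document store for orders). All data is made up.
users = {
    "u1": {"name": "alice"},
    "u2": {"name": "bob"},
}
orders = [
    {"id": 1, "user_id": "u1", "total": 30},
    {"id": 2, "user_id": "u2", "total": 12},
    {"id": 3, "user_id": "u9", "total": 99},  # no matching user
]

def join_orders_with_users(orders, users):
    """Inner join in application code: one dict lookup per order."""
    joined = []
    for order in orders:
        user = users.get(order["user_id"])
        if user is not None:  # drop orders with no matching user
            joined.append({**order, "user_name": user["name"]})
    return joined

rows = join_orders_with_users(orders, users)
for row in rows:
    print(row)
```

The join cost moves from the database to the client, and nothing enforces that every `user_id` has a matching user, so the client has to handle the missing case itself.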
Day to Day Issues
- what happens when you hit your limits
- memcache in front of MySQL, redistribute other data into multiple / single systems
- Solid State Drives (SSD - hotspots on SSD)
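
The "memcache in front of MySQL" item above is commonly implemented as a read-through (cache-aside) lookup. The sketch below uses a dict in place of memcache and a counter-instrumented class in place of MySQL. `SlowDatabase` and `CacheAside` are hypothetical names, and no real client library is used.

```python
class SlowDatabase:
    """Stand-in for MySQL that counts how many reads reach it."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def fetch(self, key):
        self.reads += 1
        return self.rows.get(key)

class CacheAside:
    """Check the cache first; fall back to the database on a miss."""

    def __init__(self, db):
        self.db = db
        self.cache = {}  # stand-in for memcache

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: the database is not touched
        value = self.db.fetch(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value

    def put(self, key, value):
        self.db.rows[key] = value
        self.cache.pop(key, None)  # invalidate so the next read is fresh

db = SlowDatabase({"user:1": "Ezra"})
store = CacheAside(db)

first = store.get("user:1")   # miss, goes to the database
second = store.get("user:1")  # hit, served from the cache
print(first, second, db.reads)
```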
------------------------------------------------------
Fusion-io (solid state)
- controllers getting smarter
Riak
- boot config, simple to configure
SlideShare - (post slides)
Google App Engine (Data Store) - always slow, but always same slow
Benchmarking: (?) - no huge studies
Cassandra - nodes talk to each other
- eventually consistent
Key/Value convergence on the move.
MongoDB
(+) Mongo team helps via IRC
(+) Feature Requests
(+) Good first step, document store
Redis -> Tokyo
(s1):(s2)
*breakdown of object model
- now multiple queries to save, build, etc (crash = dead state)
- save code as rows in db, utilize db to run and return code