MongoDB São Paulo - 13/07/2012

1. Welcome Keynote - Paul Pedersen (10gen)

4 levels of (write) consistency
1. connected to the server and received an ACK
2. connected and saved in memory
3. fsync, written to disk
4. fsync on replicas, written to disk on all replicas
Journaling

Table scans - slow - O(n)
Indexed: BTREE lookup - faster - O(log n)
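To make the O(n) vs O(log n) gap concrete, here is a minimal sketch that counts comparisons. It assumes a sorted array with binary search as a stand-in for the B-tree (a real B-tree has wide nodes, but the growth rate is the same); `tableScan` and `indexLookup` are hypothetical names.

```javascript
// Full scan: look at every document until the key matches.
function tableScan(docs, target) {
  let comparisons = 0;
  for (const doc of docs) {
    comparisons++;
    if (doc.key === target) return { found: true, comparisons };
  }
  return { found: false, comparisons };
}

// Binary search over sorted keys, standing in for a B-tree lookup.
function indexLookup(sortedKeys, target) {
  let lo = 0, hi = sortedKeys.length - 1, comparisons = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    comparisons++;
    if (sortedKeys[mid] === target) return { found: true, comparisons };
    if (sortedKeys[mid] < target) lo = mid + 1; else hi = mid - 1;
  }
  return { found: false, comparisons };
}

const n = 1024;
const docs = Array.from({ length: n }, (_, i) => ({ key: i }));
const sortedKeys = docs.map(d => d.key);

// Worst case for the scan: the last document.
const scan = tableScan(docs, n - 1);
const lookup = indexLookup(sortedKeys, n - 1);
console.log("scan comparisons:", scan.comparisons, "index comparisons:", lookup.comparisons);
```

The scan touches all n documents, while the lookup needs about log2(n) steps.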
Profiler = nice tool to analyze slow queries
query.explain(); --> look at "nscanned"
If more than one index could serve the query, mongod does a "specular search": it tries all of them, remembering the n/nscanned ratio => then uses the best one.
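A rough sketch of the selection rule as noted above: keep the candidate with the best n/nscanned ratio. This is a simplification of whatever mongod really does. The plan objects, index names, and `pickBestPlan` are hypothetical.

```javascript
// n = documents returned, nscanned = entries examined to find them.
// A ratio close to 1 means almost nothing was scanned in vain.
function pickBestPlan(candidates) {
  let best = null;
  for (const plan of candidates) {
    const ratio = plan.nscanned === 0 ? 0 : plan.n / plan.nscanned;
    if (best === null || ratio > best.ratio) best = { ...plan, ratio };
  }
  return best;
}

// Made-up numbers for three candidate indexes on the same query.
const candidates = [
  { index: "author_1", n: 20, nscanned: 5000 },
  { index: "author_1_date_-1", n: 20, nscanned: 20 },
  { index: "date_-1", n: 20, nscanned: 800 },
];

const best = pickBestPlan(candidates);
console.log("chosen index:", best.index, "ratio:", best.ratio);
```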
Covering Indexes
- Query answered from the index alone
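A small sketch of when a query can be answered from the index alone: every filtered and every returned field must be in the index, and `_id` must be excluded unless it is indexed too. `isCovered` is a hypothetical helper, not a MongoDB API.

```javascript
// indexKeys:   fields stored in the index, e.g. ["author", "date"]
// queryFields: fields used in the filter
// projection:  mongo-style projection, e.g. { author: 1, _id: 0 }
function isCovered(indexKeys, queryFields, projection) {
  const inIndex = f => indexKeys.includes(f);
  const projected = Object.keys(projection).filter(f => projection[f] === 1);
  // _id is returned by default, so it must be excluded or be part of the index.
  const idNeeded = projection._id !== 0;
  if (idNeeded && !inIndex("_id")) return false;
  // Without an inclusion projection the whole document is needed.
  if (projected.length === 0) return false;
  return queryFields.every(inIndex) && projected.every(inIndex);
}

const idx = ["author", "date"];
console.log(isCovered(idx, ["author"], { author: 1, date: 1, _id: 0 }));
console.log(isCovered(idx, ["author"], { author: 1 }));
console.log(isCovered(idx, ["author"], { title: 1, _id: 0 }));
```

Only the first call is covered: the second still needs `_id`, the third needs `title`, and neither is in the index.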
Sparse Indexes
- No index entry is created if the document lacks the indexed field. Otherwise (regular index), "null" is used for indexing.
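The difference can be sketched with a toy index builder. It assumes an index is just a list of (key, document id) entries and ignores ordering; `buildIndex` and the sample `posts` are hypothetical.

```javascript
// Build index entries for one field. With sparse: true, documents that
// lack the field get no entry; otherwise they are indexed under null.
function buildIndex(docs, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) continue;
    entries.push({ key: hasField ? doc[field] : null, id: doc._id });
  }
  return entries;
}

const posts = [
  { _id: 1, title: "a", tag: "db" },
  { _id: 2, title: "b" },              // no "tag" field
  { _id: 3, title: "c", tag: "nosql" },
];

const regularIdx = buildIndex(posts, "tag");
const sparseIdx = buildIndex(posts, "tag", { sparse: true });
console.log(regularIdx.length, sparseIdx.length);
```

A side effect worth noting: a query that uses the sparse index cannot see document 2 at all.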
Geospatial Indexes
Listing: db.posts.getIndexes();
Background building: db.posts.ensureIndex(..., {background: true});
Sorting query results without an index is limited to 32 MB => create an index on the sort fields
What if your data distribution changes? How is your (cached) Query Plan updated?
- After 100 writes, mongo will "unlearn" the plan
- Adding/removing indexes also discards the query plans
Query Plans are automatically stored by query pattern.
If the current nscanned is 10x more than the value stored with the QP, other plans are evaluated (bad-QP insurance / relearning)
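The caching rules above can be sketched as a small class. This is only a model of the behavior described in these notes (100 writes, 10x nscanned, index changes), not mongod's implementation. `PlanCache` and its method names are hypothetical.

```javascript
class PlanCache {
  constructor() {
    this.plans = new Map(); // query pattern -> { index, nscanned }
    this.writes = 0;
  }
  remember(pattern, index, nscanned) {
    this.plans.set(pattern, { index, nscanned });
  }
  // After 100 writes (figure taken from the notes) everything is unlearned.
  recordWrite() {
    if (++this.writes >= 100) this.clear();
  }
  // Also meant to be called when an index is added or removed.
  clear() {
    this.plans.clear();
    this.writes = 0;
  }
  // Returns the cached index name, or null when the plans must be re-evaluated.
  check(pattern, currentNscanned) {
    const plan = this.plans.get(pattern);
    if (!plan) return null;
    if (currentNscanned > 10 * plan.nscanned) {
      this.plans.delete(pattern); // the stored plan has gone bad
      return null;
    }
    return plan.index;
  }
}

const cache = new PlanCache();
cache.remember("{author: ?}", "author_1", 50);
console.log(cache.check("{author: ?}", 60));   // still within 10x of 50
console.log(cache.check("{author: ?}", 501));  // more than 10x => relearn
```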

Avoid single point of failure
Availability and durability
- Fire & forget (default)
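- getLastError()
  j: true = guarantee that the journal was written
  fsync: true = wait until the data files are flushed to disk
  w: n, "majority", tag = how many (or which) replica set members must acknowledge the write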
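A sketch of how these options combine into one getLastError command document. `getLastErrorCommand` is a hypothetical helper. `wtimeout` is not in the notes, but it is a getLastError option and is included because waiting on `w` without a timeout can block.

```javascript
// Build the command document; only the options that were asked for are set.
function getLastErrorCommand({ j, fsync, w, wtimeout } = {}) {
  const cmd = { getLastError: 1 };
  if (j) cmd.j = true;                                  // wait for the journal
  if (fsync) cmd.fsync = true;                          // wait for the disk flush
  if (w !== undefined) cmd.w = w;                       // number, "majority" or a tag
  if (wtimeout !== undefined) cmd.wtimeout = wtimeout;  // milliseconds
  return cmd;
}

// In the mongo shell this would be sent right after a write, e.g.:
//   db.runCommand(getLastErrorCommand({ w: "majority", j: true }))
console.log(JSON.stringify(getLastErrorCommand()));
console.log(JSON.stringify(getLastErrorCommand({ w: "majority", j: true, wtimeout: 5000 })));
```

Not calling getLastError at all corresponds to fire & forget.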
Hidden node
Priority 0 node
Priorities can be used to decide which node becomes primary
Arbiters (to break ties) => e.g. 2 data nodes + 1 arbiter, to get an odd number of voting nodes
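Why the odd number matters: a primary needs a strict majority of the voting members. A minimal sketch, with `canElectPrimary` as a hypothetical name:

```javascript
// A primary can only be elected while more than half of the voters are reachable.
function canElectPrimary(votingMembers, reachable) {
  return reachable > Math.floor(votingMembers / 2);
}

// 2 data nodes, one fails: 1 of 2 is not a majority.
console.log(canElectPrimary(2, 1));
// 2 data nodes + 1 arbiter, one data node fails: 2 of 3 is a majority.
console.log(canElectPrimary(3, 2));
// A 4th voter does not help: 4 voters tolerate one failure, same as 3.
console.log(canElectPrimary(4, 2));
```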

mongos can be deployed on the application server (and talk to your application via loopback interface, localhost)
mongos is like a router
You must have exactly 3 config servers (you can run with 1 for development)
The config servers should, obviously, be on different machines
The shard key is immutable in 2 ways:
- Cannot change the shard key for a sharded collection
- Cannot update the value of the key field
As you insert more data you get more chunks (splits)
Chunks stay within a shard
Shards are balanced so that each holds roughly the same number of chunks
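A toy model of splitting and balancing, to show the mechanics only. It assumes a chunk splits at its median key once it holds more than 4 keys (real splits are size-based) and that balancing moves whole chunks one at a time. All names are hypothetical.

```javascript
const MAX_KEYS_PER_CHUNK = 4; // made-up threshold for the sketch

// chunks: ordered array of { min, max, keys, shard }, each covering [min, max).
function insertKey(chunks, key) {
  const chunk = chunks.find(c => key >= c.min && key < c.max);
  chunk.keys.push(key);
  chunk.keys.sort((a, b) => a - b);
  if (chunk.keys.length <= MAX_KEYS_PER_CHUNK) return;
  const mid = chunk.keys[Math.floor(chunk.keys.length / 2)];
  // If the median equals the smallest key, this sketch gives up on splitting.
  if (mid === chunk.keys[0]) return;
  const right = {
    min: mid,
    max: chunk.max,
    keys: chunk.keys.filter(k => k >= mid),
    shard: chunk.shard, // a new chunk starts on the same shard
  };
  chunk.keys = chunk.keys.filter(k => k < mid);
  chunk.max = mid;
  chunks.splice(chunks.indexOf(chunk) + 1, 0, right);
}

// Move chunks from the fullest shard to the emptiest until counts differ by at most 1.
function balance(chunks, shards) {
  const count = s => chunks.filter(c => c.shard === s).length;
  for (;;) {
    const sorted = [...shards].sort((a, b) => count(a) - count(b));
    const least = sorted[0], most = sorted[sorted.length - 1];
    if (count(most) - count(least) <= 1) return;
    chunks.find(c => c.shard === most).shard = least;
  }
}

const shardNames = ["shard0", "shard1"];
const chunks = [{ min: -Infinity, max: Infinity, keys: [], shard: "shard0" }];
for (let k = 1; k <= 20; k++) insertKey(chunks, k);
balance(chunks, shardNames);
for (const s of shardNames) {
  console.log(s, chunks.filter(c => c.shard === s).length, "chunks");
}
```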
Distributed Merge Sort
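My reading of this point: each shard sorts its own results and the router merges the sorted streams. A minimal k-way merge sketch under that assumption; `mergeSorted` and the sample data are hypothetical.

```javascript
// Merge several already-sorted arrays (one per shard) into one sorted array.
function mergeSorted(shardResults, compare = (a, b) => a - b) {
  const positions = shardResults.map(() => 0); // read position in each shard's results
  const merged = [];
  for (;;) {
    let pick = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // this shard is exhausted
      if (pick === -1 ||
          compare(shardResults[i][positions[i]], shardResults[pick][positions[pick]]) < 0) {
        pick = i;
      }
    }
    if (pick === -1) return merged; // every shard is exhausted
    merged.push(shardResults[pick][positions[pick]++]);
  }
}

// Each inner array is one shard's locally sorted result.
const perShard = [[1, 4, 9], [2, 3, 10], [], [5]];
console.log(mergeSorted(perShard));
```

The router never re-sorts everything: it only compares the heads of the per-shard results.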
Chunks contain sequential data, but shards need not
A shard key doesn't have to be unique, but it should be granular. It should allow chunks to split fairly easily.
- Example: (country, userid)
- Activity Stream: shard by userid
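Granularity can be sketched by counting distinct shard key values: documents with the same key value cannot be separated by a split, so the distinct count caps the number of chunks. `maxChunks` and the sample data are hypothetical.

```javascript
// Upper bound on the number of chunks for a given shard key.
function maxChunks(docs, keyFields) {
  const distinct = new Set(docs.map(d => JSON.stringify(keyFields.map(f => d[f]))));
  return distinct.size;
}

// 200 made-up users spread over 2 countries.
const users = [];
for (const country of ["BR", "US"]) {
  for (let userid = 1; userid <= 100; userid++) users.push({ country, userid });
}

console.log("shard by country:", maxChunks(users, ["country"]));
console.log("shard by (country, userid):", maxChunks(users, ["country", "userid"]));
```

With country alone the data can never spread beyond 2 chunks. The compound key keeps splitting possible as users are added.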