MongoDB is a document database that supports range and field queries.
A single server can run either standalone or as part of a replica set. A "replica set" is a set of mongod instances with 1 primary. Primary: receives writes, services reads. Can step down and become a secondary. Secondary: replicates the primary's oplog. If the primary goes down, the secondaries hold an election. Arbiter: used to reach a majority vote when the set has an even number of members. Holds no data, doesn't need a dedicated node. Never becomes primary.
Replication is asynchronous. Failover: if the primary doesn't communicate with the other members for more than 10 s, the secondaries hold an election. Roles:
- Arbiter: only votes, holds no data. Don't deploy more than 1 per replica set.
- Priority 0: cannot trigger elections, cannot become primary. Can service reads and vote.
- Hidden: just like priority 0, but invisible to clients, so it cannot service their reads; it still votes. Maintains a copy of the primary's data.
- Delayed: just like hidden, but applies the primary's operations with a delay, to guard against e.g. human error.
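A minimal sketch of how these roles might look in a replica set config document, with a checker for the rules listed above. The set name and host names are hypothetical, and `check_roles` is an illustrative helper, not part of MongoDB. The field names (`priority`, `hidden`, `arbiterOnly`, `secondaryDelaySecs`) follow the replica set configuration document; older releases call the delay field `slaveDelay`.

```python
# Hypothetical config in the shape passed to rs.initiate(); hosts are made up.
config = {
    "_id": "rs0",
    "version": 1,
    "members": [
        {"_id": 0, "host": "db0.example.net:27017"},                   # normal member
        {"_id": 1, "host": "db1.example.net:27017"},                   # normal member
        {"_id": 2, "host": "db2.example.net:27017", "priority": 0},    # never primary
        {"_id": 3, "host": "db3.example.net:27017", "priority": 0,
         "hidden": True},                                              # invisible to clients
        {"_id": 4, "host": "db4.example.net:27017", "priority": 0,
         "hidden": True, "secondaryDelaySecs": 3600},                  # one hour behind
    ],
}


def check_roles(config):
    """Return a list of violations of the role rules in these notes."""
    problems = []
    members = config["members"]
    arbiters = [m for m in members if m.get("arbiterOnly")]
    if len(arbiters) > 1:
        problems.append("more than one arbiter")
    for m in members:
        delayed = m.get("secondaryDelaySecs", 0) > 0
        # Priority defaults to 1 when the field is absent.
        if m.get("hidden") and m.get("priority", 1) != 0:
            problems.append(f"{m['host']}: hidden member must have priority 0")
        if delayed and not m.get("hidden"):
            problems.append(f"{m['host']}: delayed member should be hidden")
    return problems


print(check_roles(config))
```

The checker only encodes what the list above says. A real deployment would rely on the server's own validation when the config is applied.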
Fault tolerance: the number of members that can become unavailable while the cluster can still elect a primary. 50 members, 7 voting members => 46 can go down (the 43 non-voting members plus only 3 of the voting members, because 4 of the 7 votes are needed for a majority). WAN deployment: 1 member per DC in 3 DCs can tolerate a single DC going down.
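The arithmetic above can be sketched as follows. The function names are illustrative, and the sketch assumes that at least one of the surviving voting members is eligible to become primary.

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1


def fault_tolerance(total_members: int, voting_members: int) -> int:
    """Members that can be lost while an election can still succeed."""
    non_voting = total_members - voting_members
    # Voting members that can be lost without dropping below a majority.
    voting_losses = voting_members - majority(voting_members)
    return non_voting + voting_losses


print(fault_tolerance(50, 7))  # 46: 43 non-voting + 3 voting
print(fault_tolerance(3, 3))   # 1: one DC out of three
```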
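Write concern: w: 1 requests acknowledgment only from the primary; override it per write operation to require acknowledgment from more members (e.g. w: 2 or w: "majority"). Read concern: local/majority. Local returns the queried member's most recent data, with no guarantee that it has been replicated to a majority; majority returns only data acknowledged by a majority of members. Which member services the read (primary or secondary) is set by read preference, not read concern. Oplog size: the default depends on the storage engine, 3 types: in-memory, WiredTiger, MMAPv1.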
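A rough sketch of what the `w` value in a write concern asks for. The helper names are hypothetical, and the majority rule is simplified: it counts voting members only and ignores the special handling of arbiters.

```python
# Write concern documents as they would be sent with a write.
primary_only = {"w": 1}
primary_plus_one = {"w": 2}
majority_with_timeout = {"w": "majority", "wtimeout": 5000}  # wtimeout is in milliseconds


def required_acks(write_concern, voting_members):
    """Acknowledgments needed, counting the primary itself."""
    w = write_concern["w"]
    if w == "majority":
        return voting_members // 2 + 1
    return int(w)


def is_acknowledged(write_concern, acks_received, voting_members):
    """True once enough members have confirmed the write."""
    return acks_received >= required_acks(write_concern, voting_members)


print(required_acks(majority_with_timeout, 3))  # 2
print(is_acknowledged(primary_plus_one, 1, 3))  # False: only the primary has the write
```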
New members, or secondaries that fall too far behind, must resync everything. Starting mongod with an empty datadir forces an initial sync. Starting it with a copy of a recent datadir from another member of the set hastens the sync.
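"Too far behind" can be read as: the member's last applied operation is older than the oldest entry still held in the sync source's oplog, so the entries it needs have already been overwritten. A minimal sketch of that check, using plain integers as stand-ins for oplog timestamps; the function name is hypothetical.

```python
def needs_full_resync(member_last_applied, oldest_entry_on_source):
    """Decide whether a member can catch up from the oplog or must resync.

    member_last_applied: timestamp of the last op the member applied,
        or None when it starts with an empty datadir.
    oldest_entry_on_source: timestamp of the oldest op still in the
        sync source's oplog.
    """
    if member_last_applied is None:
        return True  # empty datadir: nothing to resume from
    # If the member's position has been overwritten on the source,
    # it cannot find where to resume.
    return member_last_applied < oldest_entry_on_source


print(needs_full_resync(None, 1000))  # True
print(needs_full_resync(900, 1000))   # True: fell off the end of the oplog
print(needs_full_resync(1500, 1000))  # False: can catch up from the oplog
```

This is also why a recent datadir copy helps: the member starts at a position that is still inside the source's oplog.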
Changing hostnames, two options:
- Change the hostnames of all secondaries, wait until they catch up, ask the primary to step down, then bounce clients.
- Stop all members, restart each one offline using the same datadir but a different port (so clients can't connect), write the revised replica set config, then start the members under their new hostnames the normal way.
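Both options end with a revised replica set config. A small sketch of that rewrite step, working on a plain dict in the shape of the config document. The host names are hypothetical, and bumping `version` is an assumption about how the new config gets applied.

```python
import copy


def rename_hosts(config, new_hosts):
    """Return a copy of the config with hosts replaced per the mapping."""
    updated = copy.deepcopy(config)
    for member in updated["members"]:
        # Members not named in the mapping keep their current host.
        member["host"] = new_hosts.get(member["host"], member["host"])
    updated["version"] = config["version"] + 1
    return updated


old_config = {
    "_id": "rs0",
    "version": 3,
    "members": [
        {"_id": 0, "host": "old0.example.net:27017"},
        {"_id": 1, "host": "old1.example.net:27017"},
        {"_id": 2, "host": "old2.example.net:27017"},
    ],
}
mapping = {
    "old1.example.net:27017": "new1.example.net:27017",
    "old2.example.net:27017": "new2.example.net:27017",
}
new_config = rename_hosts(old_config, mapping)
print([m["host"] for m in new_config["members"]])
```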
Rollbacks: during a network partition a secondary can't keep up with the primary; the primary then goes down and the stale secondary becomes primary. When the old primary rejoins the set, it must roll back the writes it accepted that were never replicated. Such a rollback does not happen if the write propagates to a healthy, reachable secondary, because that secondary becomes the new primary. Rebooting 2 secondaries simultaneously in a 3-member replica set forces the primary to step down, because it can no longer see a majority. It closes all sockets (clients see "Connection reset by peer") until one of the secondaries becomes available. False elections.
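A toy model of the two behaviours above, with oplogs as plain Python lists and hypothetical function names. It shows the idea only (find the last common entry, discard what follows on the old primary) and is not how the server implements rollback.

```python
def split_at_common_point(old_primary_oplog, new_primary_oplog):
    """Return (entries kept, entries rolled back) for the rejoining old primary."""
    common = 0
    for mine, theirs in zip(old_primary_oplog, new_primary_oplog):
        if mine != theirs:
            break
        common += 1
    return old_primary_oplog[:common], old_primary_oplog[common:]


def can_stay_primary(reachable_voters, voting_members):
    """A primary must see a majority of voters, counting itself."""
    return reachable_voters >= voting_members // 2 + 1


# The old primary accepted w3 and w4, but they never reached the secondary
# that was later elected and went on to accept w5.
old_primary = ["w1", "w2", "w3", "w4"]
new_primary = ["w1", "w2", "w5"]
kept, rolled_back = split_at_common_point(old_primary, new_primary)
print(kept)         # ['w1', 'w2']
print(rolled_back)  # ['w3', 'w4']

# 3-member set with both secondaries rebooting: the primary sees only itself.
print(can_stay_primary(1, 3))  # False, so it steps down
```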