Last active
August 29, 2015 14:04
[MongoDb][MMS] Recommended Alerts in MMS
1. Host Recovery -> Triggered when any host is in the "Recovering" state.
2. Replication Lag
3. Connections:
- Normal state: Look at the graphs over time to determine the normal number of connections.
- Worrying: The number of connections has increased to 3x the normal number or more.
- Critical: There is an absolute limit on the number of connections a host can handle. We need an alert for this situation because each open connection consumes about 1 MB of RAM. Example: a critical alert should be triggered if there are 24k open connections on our Mongos host, since they would consume all 24 GB of the Mongos host's RAM. Below are the normal average numbers of open connections per sec for several types of hosts: Primary Node -> 10k connections per sec. Secondary Node -> 500 connections per sec. Mongos node -> 200 connections per sec.
4. Lock Percentage -> Global Lock %. Generally, we only need to watch Global Lock % contention on the Primary host. 60-70% is generally acceptable. Beyond that, say at 85%, performance on the Primary host degrades.
5. Replica -> Generally, the threshold is set to below 12 hours (43200000 milliseconds), on the assumption that the maintenance window runs from midnight to 8 AM.
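
The connection thresholds in item 3 can be sketched as a small classifier. This is only an illustration of the arithmetic, not MMS's API: the function names are hypothetical, the baseline has to come from your own graphs, and 1 GB is treated as 1000 MB to match the 24k connections / 24 GB example.

```python
# Hypothetical helpers for illustration; none of these names exist in MMS.
MB_PER_CONNECTION = 1  # rule of thumb from the notes: ~1 MB of RAM per open connection


def critical_connection_limit(host_ram_gb: float) -> int:
    """Number of connections that would consume all of the host's RAM.

    Assumes 1 GB = 1000 MB, matching the notes' 24k / 24 GB example.
    """
    return int(host_ram_gb * 1000 / MB_PER_CONNECTION)


def classify_connections(current: int, baseline: int, host_ram_gb: float) -> str:
    """Classify a connection count as 'normal', 'worrying', or 'critical'."""
    if current >= critical_connection_limit(host_ram_gb):
        return "critical"  # connections alone would exhaust RAM
    if current >= 3 * baseline:
        return "worrying"  # 3x or more above the observed normal
    return "normal"


if __name__ == "__main__":
    # Mongos host with 24 GB of RAM and a baseline of 200 connections.
    print(classify_connections(250, 200, 24))
    print(classify_connections(600, 200, 24))
    print(classify_connections(24000, 200, 24))
```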
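
The Global Lock % bands in item 4 could be expressed the same way. The notes only give "60-70% is acceptable" and "85% degrades", so the middle band below (labelled "elevated") is an assumption added for illustration, as is the function name.

```python
# Hypothetical helper; the 'elevated' band between 70% and 85% is an
# assumption, since the notes only name the acceptable and degraded levels.
ACCEPTABLE_MAX_PCT = 70.0
DEGRADED_MIN_PCT = 85.0


def classify_global_lock(lock_pct: float) -> str:
    """Classify the Primary host's Global Lock % reading."""
    if lock_pct >= DEGRADED_MIN_PCT:
        return "degraded"  # performance degradation expected on the Primary
    if lock_pct > ACCEPTABLE_MAX_PCT:
        return "elevated"  # above the acceptable range, not yet at 85%
    return "acceptable"


if __name__ == "__main__":
    for pct in (65.0, 75.0, 85.0):
        print(pct, classify_global_lock(pct))
```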
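
The 12-hour figure in item 5 converts to milliseconds as shown below. The sketch reads the item as "alert when the measured value drops below the threshold", which is an interpretation of the notes; the function names are hypothetical.

```python
from datetime import timedelta


def hours_to_ms(hours: float) -> int:
    """Convert a duration in hours to whole milliseconds."""
    return int(timedelta(hours=hours).total_seconds() * 1000)


# 12 hours, as recommended in the notes.
REPLICA_THRESHOLD_MS = hours_to_ms(12)


def replica_alert(value_ms: int, threshold_ms: int = REPLICA_THRESHOLD_MS) -> bool:
    """Return True when the measured value is below the threshold (assumed semantics)."""
    return value_ms < threshold_ms


if __name__ == "__main__":
    print(REPLICA_THRESHOLD_MS)
    print(replica_alert(hours_to_ms(8)))
```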