MongoDB Performance Troubleshooting Guide


Useful database commands

One-liners

Environment Settings

  • db.version() --> get database version
  • db.serverStatus() --> get overall server status
  • db.serverStatus().storageEngine --> get storage engine
  • db.adminCommand( { getCmdLineOpts: 1 } ) --> get the commandline options used at startup

Connections & Sessions

  • db.serverStatus().connections --> get the current, available, and total-created connection counts
  • db.serverStatus().logicalSessionRecordCache --> get logical session cache statistics; use this for troubleshooting error messages about logical sessions
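
The connections document is easier to read as a utilization figure. The following is a minimal sketch, assuming the `current`, `available`, and `totalCreated` fields reported by serverStatus; the helper name `connectionUtilization` and the 80% warning ratio are illustrative assumptions, not part of MongoDB.

```javascript
// Summarize the document returned by db.serverStatus().connections.
// `current` and `available` are serverStatus fields; the helper name and
// the default warning ratio (0.8) are assumptions made for this sketch.
function connectionUtilization(connections, warnRatio = 0.8) {
  // current + available approximates the configured connection limit
  const capacity = connections.current + connections.available;
  const ratio = capacity > 0 ? connections.current / capacity : 0;
  return {
    current: connections.current,
    capacity: capacity,
    percentUsed: Math.round(ratio * 1000) / 10, // one decimal place
    nearLimit: ratio >= warnRatio,
  };
}

// mongo shell usage:
//   printjson(connectionUtilization(db.serverStatus().connections))
```

A `nearLimit` value of true suggests checking for connection leaks or undersized pool settings before raising the server limit.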

Logging

  • db.adminCommand( { getLog: "*" } ) --> list the log names that getLog accepts (for example "global" and "startupWarnings")
  • db.adminCommand( { getLog : "global" } ) --> get recent global event logs
  • db.getLogComponents() --> get logging verbosity for various log components
  • db.setLogLevel( 4, "command" ) --> set logging verbosity ( -1 through 5 ) for a specific component (in this case, "command" entries)

Primary-/Secondary-related

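  • rs.stepDown() --> to be issued on the primary; the node gives up primary status and an election follows
  • db.runCommand( { isMaster: 1 } ) --> can be run on any node; reports that node's role and which member is currently the primary (newer MongoDB releases provide the equivalent hello command)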
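
The isMaster reply is fairly large, so a small reducer can help when checking roles across several nodes. This is a sketch under stated assumptions: the field names (`ismaster`, `secondary`, `arbiterOnly`, `primary`, `me`, `setName`) are taken from the isMaster output, while the helper name `describeReplicaRole` and the returned shape are hypothetical.

```javascript
// Reduce an isMaster reply to the fields most useful when checking roles.
// The helper name and return shape are assumptions made for this sketch.
function describeReplicaRole(reply) {
  let role = "OTHER"; // e.g. a member that is starting up or recovering
  if (!reply.setName) role = "STANDALONE"; // no replica set configured
  else if (reply.ismaster) role = "PRIMARY";
  else if (reply.secondary) role = "SECONDARY";
  else if (reply.arbiterOnly) role = "ARBITER";
  return {
    set: reply.setName || null,
    me: reply.me || null,
    role: role,
    primary: reply.primary || null, // null while no primary is elected
  };
}

// mongo shell usage:
//   printjson(describeReplicaRole(db.runCommand({ isMaster: 1 })))
```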

Convenience wrapper script for hiding connection info on the commandline

mongo_command (make it executable and put it on your PATH):

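#!/bin/sh
# Usage: mongo_command '<javascript to evaluate>'
# Keeps the connection details out of the commands typed at the prompt.

mongo_command="$1"

# --quiet suppresses the shell banner so the output can be piped to grep/xargs
mongo \
  '<server_address>:<port>/<database_to_connect_to>' \
  --quiet \
  --username '#########' \
  --password '#########' \
  --authenticationDatabase '########' \
  --authenticationMechanism SCRAM-SHA-1 \
  --eval "${mongo_command}"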

Watch scripts

View IP connection counts to the database

watch -n 1 "mongo_command 'db.currentOp(true);' | grep '\"client\"' | grep -oE \"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\" | sort | uniq -c | sort -rn"

output:

  47 13.1.2.3
  27 13.1.2.4
  26 51.1.1.3
  26 54.1.2.3
  26 18.1.1.3
  25 18.1.2.3
  ...

Alternative version, run inside the mongo shell

db.currentOp(true).inprog.reduce((accumulator, connection) => {
  const ipaddress = connection.client ? connection.client.split(":")[0] : "Internal";
  accumulator[ipaddress] = (accumulator[ipaddress] || 0) + 1;
  accumulator["TOTAL_CONNECTION_COUNT"]++;
  return accumulator;
}, { TOTAL_CONNECTION_COUNT: 0 })

Example output:

{
    "TOTAL_CONNECTION_COUNT" : 331,
    "192.168.1.3" : 8,
    "192.168.1.5" : 17,
    "127.0.0.1" : 3,
    "192.168.1.6" : 2,
    "11.123.12.1" : 2,
    "Internal" : 41,
    "11.123.12.2" : 86,
    ...
}

Send most recent global logs to a file [WIP]

mongo_command 'db.adminCommand( { getLog : "global" } )' > LOGFILE_NAME.log

Log-generating scripts

Troubleshoot activeSessionsCount (Mongo 3.6 error):

while :; do mongo_command 'db.serverStatus().logicalSessionRecordCache.activeSessionsCount' | xargs -I {} echo "[`date`]: activeSessionsCount: {}" | tee -a ~/tmp/YYYY_MM_DD_activeSessionsCount.log; sleep 1; done

Record LogicalSessionCache log entries

while :; do mongo_command 'db.adminCommand( { getLog : "global" } )' | grep -e 'LogicalSessionCache' | xargs -I {} echo "[`date`]: {}" | tee -a ~/tmp/YYYY_MM_DD_logical_session_cache_log_records.log; sleep 1; done
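
The filtering done with grep above can also happen inside the shell, which avoids matching on the surrounding JSON punctuation. A minimal sketch, assuming only that the getLog reply carries its lines in a `log` array of strings; the helper name `filterLogLines` is hypothetical.

```javascript
// Keep only the log lines that mention a given substring.
// getLog returns a document whose `log` field is an array of strings;
// the helper name is an assumption made for this sketch.
function filterLogLines(getLogReply, needle) {
  const lines = getLogReply.log || []; // tolerate a reply without log lines
  return lines.filter((line) => line.indexOf(needle) !== -1);
}

// mongo shell usage:
//   filterLogLines(db.adminCommand({ getLog: "global" }), "LogicalSessionCache")
//     .forEach((line) => print(line));
```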

The dbStats command

db.stats()                                        // bytes
db.runCommand({ dbStats: 1, scale: 1024 })        // kilobytes
db.runCommand({ dbStats: 1, scale: 1048576 })     // megabytes
db.runCommand({ dbStats: 1, scale: 1073741824 })  // gigabytes

Useful fields:

  • objects - number of objects (documents) across the entire database
  • dataSize - total size of uncompressed data held in the database; may be bigger than storageSize if compression is turned on
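
To make the dataSize versus storageSize comparison concrete, the two fields can be turned into a rough compression ratio. This is only a sketch: `dataSize`, `storageSize`, and `objects` are dbStats fields, the helper name `compressionSummary` is hypothetical, and the ratio is approximate because storageSize also counts free space inside the allocated files.

```javascript
// Compare logical data size with on-disk size using dbStats output.
// The helper name and the returned shape are assumptions for this sketch.
function compressionSummary(stats) {
  const ratio = stats.storageSize > 0 ? stats.dataSize / stats.storageSize : null;
  return {
    objects: stats.objects,
    compressionRatio: ratio, // > 1 means the data takes less space on disk
    savedUnits: stats.dataSize - stats.storageSize, // same unit as the scale used
  };
}

// mongo shell usage:
//   printjson(compressionSummary(db.runCommand({ dbStats: 1, scale: 1048576 })))
```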

Useful commandline utilities

mongostat

Provide a refresh interval (in seconds) immediately after the mongostat command.

In this case, mongostat will refresh every 10 seconds:

mongostat 10 \
  --host <server_dns_or_ip> \
  --port <port> \
  --username <username> \
  --authenticationDatabase <auth_database> \
  --authenticationMechanism SCRAM-SHA-1 \
  --password '########'

Sample output:

insert query update delete getmore command dirty  used flushes vsize   res qrw arw net_in net_out conn           set repl                time
    *0  1692   3152     *0     899   855|0  4.8% 80.2%       0 8.92G 3.56G 0|0 2|5  1.60m   8.81m  273 prod-cluster  PRI May 11 20:11:52.703
    *0  1528   2641     *0     817   828|0  5.3% 80.4%       0 8.92G 3.57G 0|0 2|1  1.51m   7.72m  273 prod-cluster  PRI May 11 20:12:02.770
    *0  1659   2625     *0     857   876|0  5.0% 80.1%       0 8.92G 3.61G 0|1 3|4  1.59m   8.63m  274 prod-cluster  PRI May 11 20:12:12.728
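
When reading this output, the qrw, arw, dirty, and used columns tend to matter most: qrw is the number of clients queued for read|write, arw is the number actively reading|writing, and dirty/used are percentages of the WiredTiger cache. The sketch below flags one row against some limits. The helper names and the default limits are assumptions chosen for illustration; tune them for your deployment.

```javascript
// Split a "reads|writes" cell such as "2|5" into numbers.
function parsePair(cell) {
  const [read, write] = cell.split("|").map(Number);
  return { read: read, write: write };
}

// Flag one mongostat row, given as an object of raw column strings.
// The default limits are illustrative assumptions, not MongoDB settings.
function flagMongostatRow(row, limits = { dirtyPct: 20, usedPct: 95, queued: 10 }) {
  const queued = parsePair(row.qrw);
  const active = parsePair(row.arw);
  const warnings = [];
  // parseFloat ignores the trailing "%" in values such as "4.8%"
  if (parseFloat(row.dirty) >= limits.dirtyPct) warnings.push("dirty cache high");
  if (parseFloat(row.used) >= limits.usedPct) warnings.push("cache nearly full");
  if (queued.read + queued.write >= limits.queued) warnings.push("operations queuing");
  return { queued: queued, active: active, warnings: warnings };
}

// Example with the first sample row above:
//   flagMongostatRow({ dirty: "4.8%", used: "80.2%", qrw: "0|0", arw: "2|5" })
```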

mongotop

Provide a refresh interval (in seconds) immediately after the mongotop command.

In this case, mongotop will refresh every 10 seconds:

mongotop 10 \
  --host <server_dns_or_ip> \
  --port <port> \
  --username <username> \
  --authenticationDatabase <auth_database> \
  --authenticationMechanism SCRAM-SHA-1 \
  --password '########'

Sample output:

                             ns     total      read    write
                 main_db.users    2544ms    2544ms      0ms
             main_db.locations     540ms     540ms      0ms
                main_db.routes     460ms       0ms    459ms
             main_db.schedules      75ms      74ms      0ms
             main_db.purchases      32ms      32ms      0ms
              main_db.settings      28ms      28ms      0ms
                local.oplog.rs      27ms      27ms      0ms
                  main_db.jobs      18ms      17ms      0ms

Optimization opportunities

Make use of database profiling to find slow queries

Turn on profiling for a while

Get profiling status:

db.getProfilingStatus()

  • "was" -> 0 means profiling is off, 1 means only operations slower than slowms are profiled, 2 means all operations are profiled
  • "slowms" -> the current slow operation threshold, in milliseconds

Turn on profiling and capture any operation that runs longer than 1000 ms:

db.setProfilingLevel(1, { slowms: 1000 })

Turn off profiling

db.setProfilingLevel(0)

Analysis

Isolate slow queries:

db.system.profile.find({ "op": "query"}).limit(100).sort( { ts : -1 } ).pretty()

Isolate slow commands:

db.system.profile.find({ "op": "command"}).limit(100).sort( { ts : -1 } ).pretty()

Slow collection scan (COLLSCAN) queries:

db.system.profile.find({"planSummary":{$eq:"COLLSCAN"},"op" : {$eq:"query"}}).sort({millis:-1})

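Slow index scan (IXSCAN) queries; planSummary also contains the index key pattern, so match on the prefix instead of the exact string:

db.system.profile.find({"planSummary":{$regex:"^IXSCAN"},"op" : {$eq:"query"}}).sort({millis:-1})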
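
Individual profile entries are easier to act on once grouped. The following sketch ranks namespace and plan combinations by total time spent. It assumes only the `ns`, `millis`, and `planSummary` fields of system.profile documents; the helper name `rankSlowOps` and the output shape are hypothetical.

```javascript
// Group system.profile documents by namespace and plan summary, then rank
// the groups by total time. The helper name and output shape are assumptions.
function rankSlowOps(profileDocs) {
  const groups = {};
  profileDocs.forEach((doc) => {
    const key = doc.ns + " | " + (doc.planSummary || "n/a");
    if (!groups[key]) {
      groups[key] = { key: key, count: 0, totalMillis: 0, maxMillis: 0 };
    }
    const group = groups[key];
    group.count += 1;
    group.totalMillis += doc.millis;
    group.maxMillis = Math.max(group.maxMillis, doc.millis);
  });
  // Largest total time first
  return Object.keys(groups)
    .map((key) => groups[key])
    .sort((a, b) => b.totalMillis - a.totalMillis);
}

// mongo shell usage:
//   rankSlowOps(db.system.profile.find().toArray()).forEach((g) => printjson(g));
```

Ranking by total time rather than by the single slowest entry surfaces queries that are only moderately slow but run very often.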

Turn off compression

See official WiredTiger storage engine documentation for more info.

Default settings:

  • collections: block compression using the snappy compression library
  • indexes: prefix compression

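To change the compression algorithm or turn off compression for collections, adjust the storage.wiredTiger.collectionConfig.blockCompressor setting. The new value applies only to collections created afterwards; existing collections keep the compressor they were created with.

To disable prefix compression for indexes, adjust the storage.wiredTiger.indexConfig.prefixCompression setting. As with collections, only indexes created after the change are affected.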
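
Compression can also be overridden for a single collection at creation time, which is a lower-risk way to try it than changing the server-wide setting. A minimal sketch, assuming the `storageEngine.wiredTiger.configString` option of createCollection with the `block_compressor` key; the helper name `compressionOptions` and the collection name in the usage comment are hypothetical.

```javascript
// Build createCollection options that override the block compressor for
// one collection. The helper name is an assumption made for this sketch.
function compressionOptions(compressor) {
  const allowed = ["none", "snappy", "zlib", "zstd"]; // zstd needs MongoDB 4.2+
  if (allowed.indexOf(compressor) === -1) {
    throw new Error("unsupported block compressor: " + compressor);
  }
  return {
    storageEngine: {
      wiredTiger: { configString: "block_compressor=" + compressor },
    },
  };
}

// mongo shell usage (the collection name is hypothetical):
//   db.createCollection("events_uncompressed", compressionOptions("none"))
```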

Count logical sessions per user (sessions cached in memory on the connected node):

db.aggregate( [  { $listLocalSessions: { allUsers: true } }, { $group: { _id: { user: '$user' }, count: { $sum: 1 } } } ] )

output:

{ "_id" : { "user" : "suser@main_db" }, "count" : 3 }
{ "_id" : { "user" : "admin@main_db" }, "count" : 2 }
{ "_id" : { "user" : "user1@main_db" }, "count" : 131 }
{ "_id" : { "user" : "agent@main_db" }, "count" : 53002 }
