Wednesday, November 25, 2015

Mongo Performance Monitoring

A basic database top for Mongo is
> & 'mongostat' /host:localhost /port:27014 /username:mongoMonitor /password:<> /authenticationDatabase:admin

insert query update delete getmore command % dirty % used flushes  vsize    res qr|qw ar|aw netIn netOut conn     time
    *0    *0     *0     *0       0     1|0     0.0   34.2       0 641.0M 562.0M   0|0   1|0   79b    15k    1 15:25:26
    *0    *0     *0     *0       0     1|0     0.0   34.2       0 641.0M 562.0M   0|0   1|0   79b    15k    1 15:25:27
    *0    *0     *0     *0       0     1|0     0.0   34.2       0 641.0M 562.0M   0|0   1|0   79b    15k    1 15:25:28
insert query update delete getmore command % dirty % used flushes  vsize    res qr|qw ar|aw netIn netOut conn     time
    *0    *0     *0     *0       0     1|0     0.0   34.2       0 641.0M 562.0M   0|0   1|0   79b    15k    1 15:25:36

This is very console driven, and doing any trend analysis requires capturing the console output (stdout) to a file.
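
For example (the log path here is only illustrative), the same command can tee each sample into a file for later charting:
> & 'mongostat' /host:localhost /port:27014 /username:mongoMonitor /password:<> /authenticationDatabase:admin | Tee-Object -FilePath 'C:\mongo-metrics\mongostat.log' -Append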

There are a number of paid solutions for monitoring Mongo. Most monitoring platforms have some way to hook into Mongo's built-in performance metrics. SolarWinds uses a PowerShell wrapper (on Windows; probably bash on Linux) in their Mongo template. This was interesting because it showed a clear pattern for building your own monitor if you do not have really nice monitoring tools like we do.

The path to writing your own monitor (sketched right after this list) is:
creating a connection
adding a polling interval
creating an object to load the JSON stats into
parsing the object into discrete KVPs
associating the measures with date and time
adding aggregates to the measures
storing the metric in a database or file
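
A minimal sketch of those steps in PowerShell, mirroring the SolarWinds wrapper approach: the host and port come from the mongostat example above, while the output path, the column names, and the auth-free mongo shell call are assumptions you would adjust.

    # Hypothetical polling monitor: shell out to the mongo client, parse
    # serverStatus as JSON, and append timestamped key/value pairs to a CSV.
    $pollSeconds = 60                                 # polling interval
    $outFile     = 'C:\mongo-metrics\globalLock.csv'  # assumed output path

    while ($true) {
        # Create a connection for each poll via the mongo shell; add
        # --username/--password/--authenticationDatabase if auth is enabled.
        $raw = & mongo --host localhost --port 27014 --quiet --eval 'JSON.stringify(db.runCommand({ serverStatus: 1 }))'

        # Load the JSON stats into an object
        $stats = $raw | ConvertFrom-Json

        # Parse the object into discrete KVPs and stamp them with date/time
        $row = [pscustomobject]@{
            Time          = Get-Date -Format 'yyyy-MM-dd HH:mm:ss'
            QueuedReaders = $stats.globalLock.currentQueue.readers
            QueuedWriters = $stats.globalLock.currentQueue.writers
            ActiveReaders = $stats.globalLock.activeClients.readers
            ActiveWriters = $stats.globalLock.activeClients.writers
        }

        # Store the metric in a file (a database table works just as well)
        $row | Export-Csv -Path $outFile -Append -NoTypeInformation

        Start-Sleep -Seconds $pollSeconds
    }

The aggregates step is left out of the sketch; rolling samples up after collection (as in the summary at the end of this post) tends to be simpler than computing them inline.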

The basic query you can run would be
    db.runCommand( { serverStatus: 1} )

A more discrete monitor call (for queue exhaustion in this case) would be
    db.runCommand( { serverStatus: 1, metrics: 0, locks: 0, globalLock: 1, asserts: 0, connections: 0, network: 0, cursors: 0, extra_info: 0, opcounters: 0, opcountersRepl: 0, storageEngine: 0, wiredTiger: 0 } )

Using discrete calls per metric group increases the number of connections, queues, and I/O. However, it allows you to poll individual metric groups at different intervals.
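
For example, two groups can be staggered off one loop; in this sketch the intervals, file paths, and the Get-ServerStatusSection helper are all assumptions layered on the discrete call above.

    # Hypothetical stagger: the cheap globalLock call runs every minute,
    # the heavier wiredTiger section only every fifth tick.
    function Get-ServerStatusSection ([string]$Sections) {
        # Reuses the discrete-call pattern above; auth flags omitted.
        & mongo --host localhost --port 27014 --quiet --eval "JSON.stringify(db.runCommand({ serverStatus: 1, $Sections }))"
    }

    $tick = 0
    while ($true) {
        Get-ServerStatusSection 'metrics: 0, locks: 0, globalLock: 1, wiredTiger: 0' |
            Out-File 'C:\mongo-metrics\globalLock.json' -Append

        if ($tick % 5 -eq 0) {
            Get-ServerStatusSection 'metrics: 0, locks: 0, globalLock: 0, wiredTiger: 1' |
                Out-File 'C:\mongo-metrics\wiredTiger.json' -Append
        }

        $tick++
        Start-Sleep -Seconds 60
    }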

The globalLock stats look like this.
 {"totalTime":1048003623000,"currentQueue":{"total":0,"readers":0,"writers":0},"activeClients":{"total":10,"readers":0,"writers":1}}

globalLock:
  totalTime: 1048003623000
  currentQueue:
    total: 0
    readers: 0
    writers: 0
  activeClients:
    total: 10
    readers: 0
    writers: 1

Polling this on a 1-minute interval can give you really detailed utilization patterns when you first implement applications with Mongo. However, over time you should be able to scale back to 5- or 15-minute intervals as your average utilization levels out.
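
As a quick check on whether things have leveled out, the collected samples can be summarized; the path and column names here come from the earlier sketch and are assumptions.

    # Summarize queueing over the collected window; non-zero maximums or a
    # rising average are the cue to keep the tighter polling interval.
    Import-Csv 'C:\mongo-metrics\globalLock.csv' |
        Measure-Object -Property QueuedReaders, QueuedWriters -Average -Maximum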