@abuxton
Created August 1, 2018 18:46
monitoring Puppet
### Monitoring and scaling
Puppet provides a module that exposes Puppet metrics through an easily consumable command-line client, regardless of the infrastructure topology: https://forge.puppet.com/puppetlabs/puppet_metrics_collector. A companion module provides visualisation of this data: https://forge.puppet.com/puppetlabs/puppet_metrics_dashboard
These modules expose the common metrics endpoints for Puppet: https://puppet.com/docs/pe/2018.1/available_graphite_metrics.html#reference-3813
They complement and extend the PuppetDB dashboard https://puppet.com/docs/puppetdb/5.1/maintain_and_tune.html#monitor-the-performance-dashboard and the experimental developer dashboard https://puppet.com/docs/puppetserver/5.1/puppet_server_metrics.html#accessing-the-developer-dashboard
In terms of system health, the warning signs are as follows:
* Run times spike by orders of magnitude.
* Requests for catalogs time out.

Monitor server metrics with `top`, `atop`, and friends:
* CPU utilization stays at 100% for a significant period. This indicates an overloaded master.
* Some cores are maxed out while others are unloaded. This indicates an insufficient number of JRuby instances.
* Count the number of compiles in the master logs. You should see almost exactly `number_of_nodes * checkins_per_hour` compiles per hour.
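The two CPU patterns in the list above can be told apart from per-core utilization samples. The following is a minimal sketch, not a Puppet tool: the function name, the 90% and 20% thresholds, and the input shape (one list of per-core percentages per sampling interval, collected by whatever tool you already use) are all assumptions chosen for illustration. Averaging over several samples stands in for "a significant period".

```python
from statistics import mean

# Hypothetical thresholds; tune them for your own hardware and workload.
BUSY_PERCENT = 90.0
IDLE_PERCENT = 20.0


def classify_cpu(samples):
    """Classify per-core CPU load averaged over several samples.

    `samples` is a list of sampling intervals, each a list holding one
    utilization percentage per core, e.g. [[100, 98], [99, 97]].
    """
    # Average each core's utilization across all sampling intervals.
    per_core = [mean(core) for core in zip(*samples)]
    if not per_core:
        raise ValueError("at least one sample with one core is required")

    busy = [p >= BUSY_PERCENT for p in per_core]
    if all(busy):
        # Every core is saturated: the master itself is overloaded.
        return "overloaded master"
    if any(busy) and any(p <= IDLE_PERCENT for p in per_core):
        # Some cores saturated while others sit idle: likely too few
        # JRuby instances to spread compilation across the cores.
        return "too few JRuby instances"
    return "ok"
```

A single sample can be misleading, so feed the function readings taken across several minutes before acting on the result.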
In other words, the total number of compilations per hour across all compile masters should almost exactly equal the number of nodes in your infrastructure multiplied by the expected number of runs per node per hour. If significantly fewer compiles occur, the master is overloaded and cannot keep up.
For example, if you have 1000 nodes and have left `runinterval` at its default of 30 minutes, each node checks in twice per hour, so you should see a total of 2000 catalogs compiled per hour across all of your compile masters.
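The arithmetic above can be sketched as a small check. This is an illustration only: the helper names, the 10% tolerance, and especially the `"Compiled catalog for"` log marker are assumptions, so confirm the exact message your master writes before relying on the count.

```python
def expected_compiles_per_hour(node_count, runinterval_minutes=30):
    """Nodes multiplied by check-ins per hour (60 / runinterval)."""
    return node_count * 60 / runinterval_minutes


def count_compiles(log_lines, marker="Compiled catalog for"):
    """Count log lines containing the (assumed) compile message."""
    return sum(1 for line in log_lines if marker in line)


def is_keeping_up(observed, expected, tolerance=0.1):
    """True when observed compiles are within `tolerance` of expected.

    The 10% default tolerance is a hypothetical value, not a Puppet
    recommendation.
    """
    return observed >= expected * (1 - tolerance)


# The example from the text: 1000 nodes on the default 30 minute interval.
expected = expected_compiles_per_hour(1000)  # 2000.0
print(is_keeping_up(observed=1950, expected=expected))  # True
print(is_keeping_up(observed=1200, expected=expected))  # False
```

Pass `count_compiles` one hour's worth of log lines gathered from all compile masters, then compare the result with the expected figure.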
Maintenance and tuning are discussed in the following additional resources:
* https://puppet.com/docs/puppetdb/5.1/maintain_and_tune.html
* https://puppet.com/docs/pe/2018.1/tuning_monolithic.html
* https://puppet.com/docs/pe/2018.1/maintenance.html
* https://puppet.com/docs/pe/2018.1/troubleshooting.html