Created
August 1, 2018 18:46
monitoring Puppet
### Monitoring and scaling
Puppet provides a module that exposes Puppet metrics through an easily consumable command line client, regardless of the infrastructure topology: https://forge.puppet.com/puppetlabs/puppet_metrics_collector
A companion module provides visualisation of this data: https://forge.puppet.com/puppetlabs/puppet_metrics_dashboard
Together they expose the common metrics endpoints for Puppet: https://puppet.com/docs/pe/2018.1/available_graphite_metrics.html#reference-3813
They complement and extend two existing dashboards:
* the PuppetDB performance dashboard: https://puppet.com/docs/puppetdb/5.1/maintain_and_tune.html#monitor-the-performance-dashboard
* the experimental developer dashboard: https://puppet.com/docs/puppetserver/5.1/puppet_server_metrics.html#accessing-the-developer-dashboard
In terms of system health, the warning signs look like this:
* Run times spike by orders of magnitude.
* Agents time out when retrieving catalogs.

Monitor server metrics with `top`, `atop`, and similar tools:
* CPU utilization stays at 100% for a significant period. This indicates an overloaded master.
* Some cores are maxed out while others are idle. This indicates an insufficient number of JRuby instances.
* Count the number of compiles in the master logs. You should see almost exactly `number_of_nodes * checkins_per_hour` compiles per hour.
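
Counting compiles in the master logs can be scripted. The following is a minimal sketch, not an official Puppet tool: it assumes each compile is logged on a line that starts with a `YYYY-MM-DD HH:MM:SS` timestamp and contains the text `Compiled catalog for <node>`. The sample lines and the `compiles_per_hour` helper are hypothetical; adjust the pattern to whatever your own log files actually contain.

```python
import re
from collections import Counter

# Assumed log line shape (an assumption, verify against your own logs):
#   2018-08-01 18:46:02,114 INFO ... Compiled catalog for web01 in environment ...
COMPILE_RE = re.compile(
    r"^(?P<hour>\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}.*Compiled catalog for (?P<node>\S+)"
)


def compiles_per_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of compiles logged."""
    counts = Counter()
    for line in lines:
        match = COMPILE_RE.search(line)
        if match:
            counts[match.group("hour")] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "2018-08-01 18:46:02,114 INFO  [puppetserver] Puppet Compiled catalog for web01.example.com in environment production in 3.21 seconds",
    "2018-08-01 18:47:10,502 INFO  [puppetserver] Puppet Compiled catalog for db01.example.com in environment production in 2.80 seconds",
    "2018-08-01 18:48:00,000 INFO  [puppetserver] Puppet Caching node for db01.example.com",
    "2018-08-01 19:01:45,009 INFO  [puppetserver] Puppet Compiled catalog for web01.example.com in environment production in 3.05 seconds",
]

for hour, count in sorted(compiles_per_hour(sample).items()):
    print(hour, count)
```

To run this against a real log, pass an open file object instead of `sample`, since the function only needs an iterable of lines.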
The total number of compilations per hour across all compile masters should be almost exactly equal to the number of nodes in your infrastructure multiplied by the expected number of runs per node per hour. If significantly fewer compiles occur, the master is overloaded and cannot keep up.
For example, if you have 1000 nodes and have left `runinterval` at its default of 30 minutes, each node checks in twice an hour, so you should see a total of 2000 catalogs compiled per hour across all of your compile masters.
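
The arithmetic above can be sketched as a small check. The function names and the 10% `tolerance` are assumptions made for illustration; the source only says "significantly fewer", so choose a threshold that suits your environment.

```python
def expected_compiles_per_hour(node_count, runinterval_minutes=30):
    """Nodes multiplied by the number of agent runs each node performs per hour."""
    runs_per_hour = 60 / runinterval_minutes
    return node_count * runs_per_hour


def looks_overloaded(observed, expected, tolerance=0.10):
    """True when observed compiles fall more than `tolerance` below expected.

    The 10% default is an arbitrary illustrative threshold, not a Puppet value.
    """
    return observed < expected * (1 - tolerance)


expected = expected_compiles_per_hour(1000)  # 1000 nodes, 30 minute runinterval
print(expected)                              # 2000.0
print(looks_overloaded(1500, expected))      # True: well below 2000
print(looks_overloaded(1950, expected))      # False: within tolerance
```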
Maintenance and tuning are discussed in the following additional resources:
* https://puppet.com/docs/puppetdb/5.1/maintain_and_tune.html
* https://puppet.com/docs/pe/2018.1/tuning_monolithic.html
* https://puppet.com/docs/pe/2018.1/maintenance.html
* https://puppet.com/docs/pe/2018.1/troubleshooting.html