Icinga Groups Configuration File

Dependency Servicegroups

Details:

These groups are used to set dependency conditions environment-wide. Details on specific dependencies can be found in dependencies.md.

Example:

Every service or group in nrpe_dependency has the current status of check_nrpe evaluated before its check is run or an alert is sent.

define servicegroup{
  servicegroup_name       nrpe_dependency
  alias                   NRPE Dependency
  servicegroup_members    u_load,u_processes,u_memory,machine_type,sys_uptime,disk_checks,service_checks
  }
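
To illustrate what "evaluated before the check is run or an alert is sent" looks like in object-configuration terms, here is a minimal sketch of a single service dependency. The host name web01 is hypothetical, and pairing Service - NRPE with Current Load is an assumption made for illustration only. The real definitions, including how the nrpe_dependency group is referenced, live in dependencies.md.

```
define servicedependency{
  host_name                       web01             ; hypothetical host
  service_description             Service - NRPE    ; the service being depended on
  dependent_host_name             web01
  dependent_service_description   Current Load      ; a member of u_load
  execution_failure_criteria      w,u,c             ; skip the dependent check while NRPE is WARNING, UNKNOWN, or CRITICAL
  notification_failure_criteria   w,u,c             ; suppress its notifications in the same states
  }
```

The two criteria lines correspond to the two halves of the sentence above: one governs whether the dependent check runs, the other whether it alerts.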

General Servicegroups

Details:

These are general groups that allow the goc or monitoring to pm/ack entire environments. They consist of either every service monitor or every disk monitor.

define servicegroup{
  servicegroup_name       service_checks
  alias                   All Service Checks
  servicegroup_members    cron,ssh,ntp,mysql_service,postgres_service,dns_service,monitoring_service
  }

define servicegroup{
  servicegroup_name       disk_checks
  alias                   All Disk checks
  servicegroup_members    backup_disk,var_export_disk,data_disk,var_spool_disk,monitoring_disk
  }

OS Servicegroups

Details:

These are OS-level groups. Each group contains a given check from every host that it runs on. They make it possible to ack/downtime a single check on every machine it runs on and, in some cases, help in building escalation and dependency groups.

define servicegroup{
  servicegroup_name       vol_check
  alias                   Vol RW Check
  members                 *,Vol RW Check
  }

define servicegroup{
  servicegroup_name       cron
  alias                   Unix Cron Service
  members                 *,Service - Cron
  }

define servicegroup{
  servicegroup_name       u_disk
  alias                   Unix Disk Space
  members                 *,Disk Space - /,*,Disk Space - /home,*,Disk Space - /tmp,*,Disk Space - /usr,*,Disk Space - /var,*,Disk Space - /var/log
  }

define servicegroup{
  servicegroup_name       dns_service
  alias                   DNS Service
  }

define servicegroup{
  servicegroup_name       ssh
  alias                   Unix SSH Service
  members                 *,Service - SSH
  }

define servicegroup{
  servicegroup_name       u_load
  alias                   Unix CPU Load
  members                 *,Current Load
  }

define servicegroup{
  servicegroup_name       u_processes
  alias                   Unix Processes
  members                 *,Total Processes,*,Zombie Processes
  }

define servicegroup{
  servicegroup_name       u_memory
  alias                   Memory Usage
  members                 *,Memory Usage,*,Swap Usage
  }

define servicegroup{
  servicegroup_name       ntp
  alias                   NTP Service
  members                 *,NTP Offset
  }

define servicegroup{
  servicegroup_name       sys_uptime
  alias                   System Uptime
  members                 *,System Uptime
  }
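
Some groups in this file, such as dns_service, carry no members line. One way such a group can be populated is from the service side, using the servicegroups directive in the service definition. The sketch below assumes a template named generic-service and a check command written as check_nrpe!check_dns. Neither appears in this file, and the service description is hypothetical as well.

```
define service{
  use                     generic-service          ; hypothetical template
  hostgroup_name          Unix
  service_description     Service - DNS            ; hypothetical description
  check_command           check_nrpe!check_dns     ; hypothetical command
  servicegroups           dns_service              ; joins the group defined above
  }
```

Declaring membership on the service keeps the group definition short, at the cost of having to search the service files to see what a group contains.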

Monitoring Servicegroups

Details:

These are groups of monitoring checks. They function in the same way as the OS-level groups, except that they are specific to monitoring.

Note:

If you downtime/disable the nrpe group, you will stop all remote checking. I can't think of a good reason to do this other than an NRPE update that has gone bad. If the network is having issues, I would recommend stopping Icinga entirely.

define servicegroup{
  servicegroup_name       nrpe
  alias                   NRPE Service
  members                 *,Service - NRPE
  }

define servicegroup{
  servicegroup_name       machine_type
  alias                   Machine Type Service
  members                 *,Machine Type
  }

define servicegroup{
  servicegroup_name       monitoring_disk
  alias                   Icinga Store Disks
  }

App Servicegroups

Details:

These function the same as the OS-level groups, except that they are specific to the app owner. Using these you can disable/pm an entire app. Do not modify the hardware groups: they are used in escalations and contain CPU, memory, and swap checks.
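
No app or hardware group definitions are shown in this section. As a sketch only, a hardware group covering CPU, memory, and swap could be assembled from the OS groups defined earlier. The name hw_example is hypothetical and is not one of the groups used in escalations.

```
define servicegroup{
  servicegroup_name       hw_example           ; hypothetical name
  alias                   Example Hardware
  servicegroup_members    u_load,u_memory      ; CPU load plus memory and swap usage
  }
```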

Transient Servicegroups

Details:

This is a container for servicegroups that are in current use or new ones that have yet to be deployed. Do not place anything you care about in this container, as I will delete cruft from here without asking.

General Hostgroups

Details:

These are used for various reasons, mostly to filter out apps for downtime and escalations.

define hostgroup{
  hostgroup_name      default_group
  alias               default_group
  }

define hostgroup{
  hostgroup_name      ESX
  alias               ESX
  }

define hostgroup{
  hostgroup_name      Monitoring
  alias               Monitoring
  }

define hostgroup{
  hostgroup_name      Unix
  alias               Unix
  }
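
None of these hostgroups list members, so hosts presumably join them from the host definition. A minimal sketch, assuming a template called generic-host and a hypothetical host web01 (192.0.2.10 is a reserved documentation address):

```
define host{
  use             generic-host          ; hypothetical template
  host_name       web01                 ; hypothetical host
  address         192.0.2.10            ; documentation-only address
  hostgroups      Unix,default_group    ; membership declared on the host
  }
```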

Transient Hostgroups

Details:

These are groups used for a specific purpose such as a release.

Note:

Do not count on these being here and DO NOT place anything here you care about, as they could be removed at any given time based upon monitoring's discretion.
