{
  "Description" : "An example template which launches and bootstraps a cluster of eight CC2 EC2 instances for high performance computational tasks using spot pricing. Includes StarCluster, Grid Engine and NFS.",
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Parameters" : {
    "AccountNumber" : {
      "Description" : "Twelve digit AWS account number.",
      "Type" : "String",
      "NoEcho" : "True"
##############################################################################
#
# This alarm monitors two filesystems and sends an alert when the space
# utilization on a given filesystem exceeds the given value.
#
# Initialize the variables root_util and var_util which will hold the
# current FS_SPACE_UTIL value. The first time they are accessed, they will
# be initialized to zero. Loop through the filesystems each interval and save
# the FS_SPACE_UTIL for each one. Send an alert if the space utilization
# exceeds the given threshold. A repeat alert will be sent every 30 minutes
# The Network bottleneck symptom default relies on general throughput
# metrics. Not all network interfaces report collision data. To be
# useful as a bottleneck indicator, the rate thresholds should be
# adjusted based on values seen in historical data for a particular
# system or network. For example, 100 Mbit networks cannot sustain
# packet rates as high as gigabit networks can without a bottleneck.
symptom Network_Bottleneck type=NETWORK
rule GBL_NFS_CALL_RATE > 500 prob 25
rule GBL_NET_COLLISION_PCT > 10 prob 10
rule GBL_NET_COLLISION_PCT > 25 prob 20
# The Memory bottleneck symptom default is triggered by a combination
# of several metrics. Excessive page outs can be an indicator of memory
# pressure when the memory utilization is high; however, memory-mapped
# file writes also generate page outs. Under heavy memory pressure, data
# will start to be swapped out.
symptom Memory_Bottleneck type=MEMORY
rule GBL_MEM_UTIL > 95 prob 30
rule GBL_MEM_UTIL > 98 prob 20
rule GBL_MEM_PAGEOUT_BYTE_RATE > 200 prob 20
rule GBL_MEM_SWAPOUT_BYTE_RATE > 0 prob 20
# The Disk bottleneck symptom default is influenced mostly by the busiest
# disk's utilization. The disk request queue is an indicator that processes
# may be waiting for disk resources.
symptom Disk_Bottleneck type=DISK
rule GBL_DISK_UTIL_PEAK > 50 prob GBL_DISK_UTIL_PEAK
rule GBL_DISK_REQUEST_QUEUE > 3 prob 25
alarm Disk_Bottleneck > 50 for 5 minutes
type = "Disk"
start
# The CPU bottleneck symptom default is influenced mostly by the overall
# CPU utilization. Note that CPU utilization may be high even though
# there is no bottleneck. The run queue is an indicator that processes
# are waiting for CPU resources, and that the CPU may be bottlenecked.
symptom CPU_Bottleneck type=CPU
rule GBL_CPU_TOTAL_UTIL > 75 prob 25
rule GBL_CPU_TOTAL_UTIL > 85 prob 25
rule GBL_CPU_TOTAL_UTIL > 90 prob 25
rule GBL_RUN_QUEUE > 2 prob 25
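
To make the stacked thresholds above easier to reason about, here is a minimal Python sketch of how such a symptom could be scored. It assumes the symptom value is the sum of the `prob` values of every rule whose condition is true, capped at 100; that reading of the rule semantics is an assumption and should be checked against the agent's documentation. The names `Rule`, `CPU_BOTTLENECK`, and `symptom_score` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Rule:
    metric: str       # metric name, e.g. "GBL_RUN_QUEUE"
    threshold: float  # the rule holds when the metric is strictly greater
    prob: float       # points contributed to the symptom when the rule holds


# Rules copied from the CPU_Bottleneck symptom above.
CPU_BOTTLENECK: Sequence[Rule] = (
    Rule("GBL_CPU_TOTAL_UTIL", 75, 25),
    Rule("GBL_CPU_TOTAL_UTIL", 85, 25),
    Rule("GBL_CPU_TOTAL_UTIL", 90, 25),
    Rule("GBL_RUN_QUEUE", 2, 25),
)


def symptom_score(rules: Sequence[Rule], metrics: Mapping[str, float]) -> float:
    """Add up prob for each rule that holds (assumed semantics), capped at 100."""
    total = sum(r.prob for r in rules if metrics[r.metric] > r.threshold)
    return min(total, 100)


if __name__ == "__main__":
    # Hypothetical sample: 88% CPU utilization with three processes queued.
    sample = {"GBL_CPU_TOTAL_UTIL": 88.0, "GBL_RUN_QUEUE": 3.0}
    print(symptom_score(CPU_BOTTLENECK, sample))
```

Under this reading, the three utilization rules act as steps: each higher threshold crossed adds another 25 points, and the run queue rule adds the final 25.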
# Split on runs of two or more spaces so values such as "2 hours ago" stay in one column.
docker ps -a --no-trunc | awk -F "  +" '{$1=$1}1' OFS="\t"
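
The command above appears intended to turn the space-padded columns of `docker ps` into tab-separated fields. The same idea can be sketched in Python, assuming columns are separated by two or more spaces and that single spaces belong to a value. The helper name `columns_to_tsv` and the sample rows are hypothetical.

```python
import re
from typing import Iterable, List

# Assumption: a gap of two or more spaces marks a column boundary.
COLUMN_GAP = re.compile(r" {2,}")


def columns_to_tsv(lines: Iterable[str]) -> List[str]:
    """Replace each column gap with a single tab, keeping single spaces intact."""
    return ["\t".join(COLUMN_GAP.split(line.rstrip())) for line in lines]


if __name__ == "__main__":
    # Hypothetical rows shaped like space-padded table output.
    sample = [
        "CONTAINER ID   IMAGE     CREATED         STATUS",
        "abc123         nginx     2 hours ago     Up 2 hours",
    ]
    for row in columns_to_tsv(sample):
        print(row)
```

One caveat applies to both this sketch and the awk version: a column that is empty in a given row is only padding, so it merges with the neighboring gaps and the fields after it shift left by one.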