https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_architecture.html
Namespace
- Isolate metrics
- No default, must specify
- Fewer than 256 characters; allowed characters: `0-9a-zA-Z . - _ / # :`
- AWS namespaces: `AWS/<service>`, e.g. `AWS/EC2`
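The namespace rules above can be checked locally before publishing. This is a minimal sketch, assuming only the length limit and character set listed in these notes; `is_valid_namespace` and `is_aws_namespace` are hypothetical helper names, not part of the AWS CLI.

```bash
# Return 0 if the namespace follows the rules in the notes above, 1 otherwise.
# Assumption: the allowed character set is exactly 0-9 a-z A-Z . - _ / # :
is_valid_namespace() {
  local ns="$1"
  # Must be non-empty and shorter than 256 characters.
  [ -n "$ns" ] && [ "${#ns}" -lt 256 ] || return 1
  # Reject the name if it contains any character outside the allowed set.
  case "$ns" in
    *[!0-9a-zA-Z._/#:-]*) return 1 ;;
  esac
  return 0
}

# Return 0 if the namespace uses the AWS/<service> form, e.g. AWS/EC2.
is_aws_namespace() {
  case "$1" in
    AWS/*) return 0 ;;
    *)     return 1 ;;
  esac
}
```

For example, `is_valid_namespace "MyNameSpace" && echo ok` prints `ok`, while a name containing a space or `!` is rejected.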
Metrics: fundamental concept
- Region specific
- A metric (variable) represents a time-ordered set of data points (values over time), e.g. EC2 CPU usage
- You can publish your own metrics, adding data points in ANY ORDER and at any rate you choose; you retrieve statistics about those data points as an ordered set of time-series data
- Can NOT be deleted; if no new data is published, a metric expires automatically after 15 months
- Data points older than 15 months expire on a rolling basis: new in, old out
- Time Stamps
  - Required for each data point; defaults to now
  - `now.minus(2Weeks) <= TS <= now.plus(2Hours)`
  - UTC is recommended, e.g. `2016-10-31T23:59:59Z`; non-UTC time stamps can cause alarms to show `Insufficient Data` or to fire late
- Metrics Retention
  - data points with a period of X are available for Y
    - < 1 min: 3 hours
    - 1 min: 15 days
    - 5 mins: 63 days
    - 1 hour: 455 days (15 months)
  - For example, if you collect data using a period of 1 minute, the data remains available for 15 days with 1-minute resolution. After 15 days this data is still available, but it is aggregated and retrievable only with a resolution of 5 minutes. After 63 days, the data is further aggregated and is available with a resolution of 1 hour.
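The time stamp window and the retention tiers above can be expressed as two small helpers. This is a sketch under stated assumptions: time stamps are passed as Unix epoch seconds, and a period that falls between two listed tiers is assigned to the tier just below it, which the notes do not confirm. Both function names are hypothetical.

```bash
TWO_WEEKS=$((14 * 24 * 3600))
TWO_HOURS=$((2 * 3600))

# Usage: timestamp_in_window <timestamp_epoch> [now_epoch]
# Return 0 if now - 2 weeks <= timestamp <= now + 2 hours.
timestamp_in_window() {
  local ts="$1" now="${2:-$(date +%s)}"
  [ "$ts" -ge $((now - TWO_WEEKS)) ] && [ "$ts" -le $((now + TWO_HOURS)) ]
}

# Usage: retention_for_period <period_seconds>
# Print how long data points of that period stay available.
# Assumption: periods between the listed tiers fall into the lower tier.
retention_for_period() {
  local period="$1"
  if   [ "$period" -lt 60 ];   then echo "3 hours"
  elif [ "$period" -lt 300 ];  then echo "15 days"
  elif [ "$period" -lt 3600 ]; then echo "63 days"
  else                              echo "455 days"
  fi
}
```

Passing `now` explicitly keeps the window check deterministic, which is convenient when validating a batch of historical data points.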
Dimensions
- A dimension is a name/value pair that is part of a metric's unique identity; you can assign up to 10 dimensions to a metric
- e.g. filter for a specific EC2 instance by specifying the `InstanceId` dimension
- For certain AWS services, e.g. EC2, CloudWatch can aggregate across dimensions
- For custom metrics, CloudWatch does not aggregate across dimensions; you must specify the dimensions when searching
- Dimension Combinations
  - CloudWatch treats each unique combination of dimensions as a separate metric, so a combination must uniquely identify what you want to retrieve
  - you can retrieve statistics only for dimension combinations that you published
// namespace: `DataCenterMetric`
// metric: `ServerStat`
Dimensions: Server=Prod, Domain=Frankfurt, Unit: Count, Timestamp: 2016-10-31T12:30:00Z, Value: 105
Dimensions: Server=Prod, Domain=Rio, Unit: Count, Timestamp: 2016-10-31T12:32:00Z, Value: 95
// GOOD: this combination was published
Server=Prod,Domain=Rio
// BAD: this combination was never published
Server=Prod
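The exact-match behaviour of dimension combinations can be simulated locally. This is a toy model, not how CloudWatch stores metrics: it assumes a metric is identified by its sorted set of `Name=Value` pairs, and `dimension_key`, `is_published`, and `PUBLISHED` are hypothetical names.

```bash
# Build a canonical key for a dimension set: sort the Name=Value pairs so
# that the order in which they are given does not matter.
dimension_key() {
  printf '%s\n' "$@" | sort | paste -sd, -
}

# The two combinations published in the ServerStat example above.
PUBLISHED="$(dimension_key Server=Prod Domain=Frankfurt)
$(dimension_key Server=Prod Domain=Rio)"

# Return 0 only if exactly this combination was published.
is_published() {
  local key
  key="$(dimension_key "$@")"
  printf '%s\n' "$PUBLISHED" | grep -Fxq -- "$key"
}
```

Here `is_published Domain=Rio Server=Prod` succeeds, while `is_published Server=Prod` fails because that combination on its own was never published.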
**Statistics**
* *statistics* are metric data aggregations over specified periods of time
* available statistics
    * `Minimum`, `Maximum`, `Average` (Sum/SampleCount)
    * `Sum`: all values added together
    * `SampleCount`: the number of data points
    * `pNN.NN`: a percentile, e.g. `p95.45`
* Unit
    * each statistic has a unit of measure; the default is `None`
    * data points are aggregated by unit, so two different units produce separate data streams
* Periods
    * A period is the length of time associated with a specific Amazon CloudWatch statistic. Each statistic represents an aggregation of the metrics data collected for a specified period of time
    * A period is given in seconds: 1, 5, 10, 30, or any multiple of 60 (e.g. 360 means 6 minutes); minimum 1, maximum 86400
    * When you retrieve statistics, specify a period, start time, and end time; the default period is 1 minute and the default span between start and end is 1 hour, so you get an aggregated set of statistics for each minute of the previous hour
* Aggregation
    * CloudWatch does **NOT** aggregate data across regions
    * You can publish as many data points as you want with the same or similar time stamps. CloudWatch aggregates them by period length
    * For large datasets, you can insert a pre-aggregated dataset called a statistic set. With **statistic sets**, you give CloudWatch the Min, Max, Sum, and SampleCount for a number of data points. This is commonly used when you need to collect data many times in a minute
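A statistic set can be computed from raw values before publishing. In this sketch, `summarize` and `publish_statistic_set` are hypothetical helper names; the output of `summarize` is laid out to match the shorthand accepted by the `--statistic-values` option of `aws cloudwatch put-metric-data`. The wrapper is only defined here, not run, and calling it needs configured AWS credentials.

```bash
# Usage: summarize <value>...
# Print SampleCount, Sum, Minimum and Maximum of the given values in the
# shorthand form used by --statistic-values.
summarize() {
  printf '%s\n' "$@" | awk '
    NR == 1 { min = $1; max = $1 }
    {
      if ($1 < min) min = $1
      if ($1 > max) max = $1
      sum += $1
    }
    END {
      printf "SampleCount=%d,Sum=%.15g,Minimum=%s,Maximum=%s\n", NR, sum, min, max
    }'
}

# Usage: publish_statistic_set <metric-name> <namespace> <value>...
# Hypothetical wrapper: sends one pre-aggregated statistic set instead of
# one put-metric-data call per value.
publish_statistic_set() {
  local metric="$1" namespace="$2"
  shift 2
  aws cloudwatch put-metric-data --metric-name "$metric" --namespace "$namespace" \
    --statistic-values "$(summarize "$@")"
}
```

For the values 2, 4 and 5, `summarize` prints `SampleCount=3,Sum=11,Minimum=2,Maximum=5`; `Average` is then Sum/SampleCount, roughly 3.67.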
**Percentiles**
* e.g. the 95th percentile means 95% of the data is lower than that value and 5% is higher
* Supported by `EC2 RDS Kinesis ALB ELB APIGateway`
* up to 2 decimal places, e.g. `p95.85`
* for **statistic sets**, percentiles are available only if one of the following holds:
    1. SampleCount is 1
    2. Min and Max are equal
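To make the percentile definition concrete, here is a sketch that uses the nearest-rank method. That method is an assumption chosen for illustration; these notes do not say how CloudWatch computes percentiles internally, and `percentile` is a hypothetical helper name.

```bash
# Usage: percentile <p> <value>...
# Print the p-th percentile (0 < p <= 100, decimals allowed, e.g. 95.45)
# of the values using the nearest-rank method: sort ascending, then take
# the value at position ceil(p / 100 * N).
percentile() {
  local p="$1"
  shift
  printf '%s\n' "$@" | sort -n | awk -v p="$p" '
    { v[NR] = $1 }
    END {
      rank = p * NR / 100
      idx = int(rank)
      if (idx < rank) idx++   # round up to the next whole rank
      if (idx < 1) idx = 1
      print v[idx]
    }'
}
```

With ten values 10, 20, ..., 100, `percentile 90` prints `90`: nine of the ten values are at or below it.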
**Alarms**
* An alarm watches a single metric over a specified time period, and performs one or more specified actions, based on the value of the metric relative to a threshold over time
* When creating an alarm, select a period that is greater than or equal to the frequency of the metric to be monitored
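The period rules and the alarm guideline above can be combined into a quick check before creating an alarm. This is a sketch: all three function names are hypothetical, and the metric, threshold (70), period (300), and evaluation count (2) in `create_cpu_alarm` are illustrative values, not recommendations. The wrapper is only defined here, not run, and calling it needs configured AWS credentials.

```bash
# Return 0 if <seconds> is an allowed period per the notes above:
# 1, 5, 10, 30, or a multiple of 60 up to 86400.
is_valid_period() {
  case "$1" in
    1|5|10|30) return 0 ;;
  esac
  [ "$1" -ge 60 ] && [ "$1" -le 86400 ] && [ $(($1 % 60)) -eq 0 ]
}

# Usage: alarm_period_ok <alarm_period_seconds> <metric_interval_seconds>
# Return 0 if the alarm period is valid and at least as long as the
# interval at which the metric is published.
alarm_period_ok() {
  local alarm_period="$1" metric_interval="$2"
  is_valid_period "$alarm_period" && [ "$alarm_period" -ge "$metric_interval" ]
}

# Usage: create_cpu_alarm <alarm-name> <instance-id> <action-arn>
# Hypothetical wrapper around put-metric-alarm: alarm when the average CPU
# of one instance stays above 70 for two consecutive 5-minute periods.
create_cpu_alarm() {
  aws cloudwatch put-metric-alarm --alarm-name "$1" \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value="$2" \
    --statistic Average --period 300 --evaluation-periods 2 \
    --threshold 70 --comparison-operator GreaterThanThreshold \
    --alarm-actions "$3"
}
```

For a metric published every 60 seconds, `alarm_period_ok 300 60` succeeds, while `alarm_period_ok 60 300` fails because the alarm would evaluate more often than data arrives.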
### Getting Started
#### Custom metrics
```bash
# Publish metric
aws cloudwatch put-metric-data --metric-name Buffers --namespace MyNameSpace \
    --unit Bytes --value 231434333 \
    --dimensions InstanceId=1-23456789,InstanceType=m1.small
# Get statistic
aws cloudwatch get-metric-statistics --metric-name Buffers --namespace MyNameSpace \
    --dimensions Name=InstanceId,Value=1-23456789 Name=InstanceType,Value=m1.small \
    --start-time 2016-10-15T04:00:00Z --end-time 2016-10-19T07:00:00Z \
    --statistics Average --period 60
```
#### Single Data Points
```bash
# Publish
aws cloudwatch put-metric-data --metric-name PageViewCount --namespace MyService --value 2 --timestamp 2016-10-20T12:00:00.000Z
aws cloudwatch put-metric-data --metric-name PageViewCount --namespace MyService --value 4 --timestamp 2016-10-20T12:00:01.000Z
aws cloudwatch put-metric-data --metric-name PageViewCount --namespace MyService --value 5 --timestamp 2016-10-20T12:00:02.000Z
# Get statistic
aws cloudwatch get-metric-statistics --namespace MyService --metric-name PageViewCount \
    --statistics "Sum" "Maximum" "Minimum" "Average" "SampleCount" \
    --start-time 2016-10-20T12:00:00.000Z --end-time 2016-10-20T12:05:00.000Z --period 60
```
```json
{
    "Datapoints": [
        {
            "SampleCount": 3.0,
            "Timestamp": "2016-10-20T12:00:00Z",
            "Average": 3.6666666666666665,
            "Maximum": 5.0,
            "Minimum": 2.0,
            "Sum": 11.0,
            "Unit": "None"
        }
    ],
    "Label": "PageViewCount"
}
```
Log Events: contains timestamp and raw message (must be UTF-8 encoded)
Log Streams
- a sequence of log events that share the same source.
Log Groups
Metric Filter
Retention Settings





