Metrics Filters Resolution - amazon-web-services

What resolution (basic monitoring with 5 min period, detailed with 1 min, or high-resolution with 1 sec) do Metric Filters use? And how can I change it or at least see it?

Metric filters publish data at 1-minute resolution only.
As the data ages, it is rolled up to 5-minute resolution (for data between 15 and 63 days old) and then to 1-hour resolution (for data older than 63 days, up to 15 months).
This follows the normal metric retention policy as described in the question "What is the retention period of all metrics?" in the CloudWatch FAQ.
AFAIK, sub-minute resolution is not currently supported for metric filters.
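As a rough illustration of the roll-up described above, here is a minimal sketch that maps the age of a datapoint to the finest period you could still query. The function name and the table layout are my own, and the 455-day figure is the FAQ's value for "15 months"; behaviour exactly at the tier boundaries is not modelled.

```python
from typing import Optional

# Retention tiers from the CloudWatch FAQ: (maximum age in days, finest period in seconds).
RETENTION_TIERS = [
    (15, 60),     # 1-minute datapoints are kept for 15 days
    (63, 300),    # 5-minute datapoints are kept for 63 days
    (455, 3600),  # 1-hour datapoints are kept for 455 days (about 15 months)
]


def finest_period_seconds(age_days: float) -> Optional[int]:
    """Return the finest period still retrievable for data of the given age (hypothetical helper)."""
    for max_age_days, period in RETENTION_TIERS:
        if age_days < max_age_days:
            return period
    return None  # older than the longest tier, so the data has expired


print(finest_period_seconds(3))    # 60
print(finest_period_seconds(30))   # 300
print(finest_period_seconds(200))  # 3600
```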

Related

Best practices to configure thresholds for alarms

I have been having some difficulty working out the ideal threshold for a few of our CloudWatch alarms. I am looking at metrics for error rate, fault rate and failure rate, and I am thinking of an evaluation period of around 15 minutes. My metrics are currently recorded at one-minute granularity. I have the following ideas:
To look at the avg of minute level data over a few days, and set it slightly higher than that.
To try different thresholds (t1,t2 ..) and for a given day, see how many times the datapoints are crossing it in 15 min bins.
Not sure if this is the right way of going about it, do share if there is a better way of going about the problem.
PS 1: I know that thresholds should be based on Service Level Agreements(SLA), but let's say we do not have an SLA yet.
PS 2: Also, can I export data from CloudWatch to Excel for easier manipulation? I am currently looking at running a few queries in CloudWatch Logs Insights to calculate error rates.
In your case, maybe you could also try Amazon CloudWatch Anomaly Detection instead of static thresholds:
You can create an alarm based on CloudWatch anomaly detection, which mines past metric data and creates a model of expected values. The expected values take into account the typical hourly, daily, and weekly patterns in the metric.
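For what it's worth, an anomaly detection alarm can be described with the same parameters as a regular metric alarm, plus an ANOMALY_DETECTION_BAND expression. Below is a minimal sketch that only builds the parameter dictionary; the namespace "MyApp", the metric name "ErrorRate", the alarm name and the band width of 2 are placeholders I made up, and the 15 evaluation periods simply mirror the 15-minute window from the question.

```python
def anomaly_alarm_params(namespace, metric_name, band_width=2,
                         period=60, evaluation_periods=15):
    """Build keyword arguments for an anomaly detection alarm (illustrative values only)."""
    return {
        "AlarmName": f"{metric_name}-anomaly",  # hypothetical naming scheme
        "ComparisonOperator": "GreaterThanUpperThreshold",
        "EvaluationPeriods": evaluation_periods,
        "ThresholdMetricId": "ad1",  # the band below acts as the threshold
        "Metrics": [
            {
                "Id": "m1",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {"Namespace": namespace, "MetricName": metric_name},
                    "Period": period,
                    "Stat": "Average",
                },
            },
            {
                "Id": "ad1",
                "Expression": f"ANOMALY_DETECTION_BAND(m1, {band_width})",
                "Label": f"{metric_name} (expected range)",
                "ReturnData": True,
            },
        ],
    }


params = anomaly_alarm_params("MyApp", "ErrorRate")
print(params["Metrics"][1]["Expression"])  # ANOMALY_DETECTION_BAND(m1, 2)
```

With boto3 installed and credentials configured, the dictionary could be passed on as boto3.client("cloudwatch").put_metric_alarm(**params); that call is not made here.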

Setting a CloudWatch alarm to check for DynamoDB items

I want to set up an alarm that raises an alert if there are any items in the DynamoDB table. The alarm has been set up in the following manner -
My understanding is that -
Period defines how regularly a data point is recorded; in this case, one datapoint per minute.
Maximum statistic means that we select the maximum value of ReturnedItemCount in a minute.
2 out of 5 would mean that if 2 of the last 5 datapoints (5 minutes) breach the threshold, the state of the alarm would change.
However I am not seeing the intended results. I can only see a single datapoint (instead of a datapoint every minute?) in the chart and the state is OK even when the datapoint is above the threshold?
Could someone help out with this?
I figured out that this was the intended behaviour. As per the documentation for ReturnedItemCount -
This metric would only record a datapoint if the Query/Scan operation was performed on the table during a given time interval. It isn't a periodic check unlike some of the other metrics available on CloudWatch.
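To make the effect of a sparse metric on a "2 out of 5" alarm concrete, here is a deliberately simplified model. It is not CloudWatch's actual evaluation logic (which also depends on how the alarm is configured to treat missing data); the function names are mine, and None stands for a minute in which no Query/Scan ran and therefore no datapoint exists.

```python
from typing import List, Optional


def breaching_count(datapoints: List[Optional[float]], threshold: float) -> int:
    """Count datapoints strictly above the threshold; None marks a period with no data."""
    return sum(1 for value in datapoints if value is not None and value > threshold)


def would_alarm(datapoints: List[Optional[float]], threshold: float,
                datapoints_to_alarm: int) -> bool:
    """Simplified M-out-of-N check over one evaluation window."""
    return breaching_count(datapoints, threshold) >= datapoints_to_alarm


# One Scan in five minutes: a single breaching datapoint is not enough for 2 out of 5.
print(would_alarm([None, None, 3.0, None, None], threshold=0, datapoints_to_alarm=2))  # False
# Two operations in the window: two breaching datapoints satisfy 2 out of 5.
print(would_alarm([1.0, None, 3.0, None, None], threshold=0, datapoints_to_alarm=2))   # True
```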

What is the SampleCount per period for basic monitoring?

I need help in making sense of how many data points (SampleCount) I get in 5-minute intervals in basic monitoring.
I have basic monitoring for an EC2 instance, which means a new data point is gathered every 5 minutes.
With the GetMetricData API (MetricDataQueries), I can get data points for the metric.
I have queried to get SampleCount of data points every 5 minutes in a 10-minute period.
Data shows (10-min period):
0 min - 5 sample count
5 min - 5 sample count
I am confused now as to what this actually means. Since basic monitoring gathers data every 5 minutes, I would've expected to have 1 data point per 5-minute interval. So my expectation:
0 min - 1 sample count
5 min - 1 sample count
Thank you for your help!
Default monitoring metrics are collected every 5 minutes but the custom monitoring metrics are collected every minute. See FAQ.
A custom metric can be one of the following:
Standard resolution, with data having one-minute granularity
High resolution, with data at a granularity of one second
By default, metrics are stored at 1-minute resolution in CloudWatch. You can define a metric as high-resolution by setting the StorageResolution parameter to 1 in the PutMetricData API request. If you do not set the optional StorageResolution parameter, then CloudWatch will default to storing the metrics at 1-minute resolution.
When you publish a high-resolution metric, CloudWatch stores it with a resolution of 1 second, and you can read and retrieve it with a period of 1 second, 5 seconds, 10 seconds, 30 seconds, or any multiple of 60 seconds.
Custom metrics follow the same retention schedule listed above.
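As a small sketch of the StorageResolution parameter mentioned in the FAQ text, the helper below builds one PutMetricData datum, and a second helper checks a read period against the values listed above for high-resolution metrics. Both function names, the metric name "QueueDepth" and the namespace in the usage note are hypothetical.

```python
def metric_datum(name, value, unit="Count", high_resolution=False):
    """Build one MetricData entry; StorageResolution is 1 (high) or 60 (standard)."""
    return {
        "MetricName": name,
        "Value": value,
        "Unit": unit,
        "StorageResolution": 1 if high_resolution else 60,
    }


def is_valid_high_resolution_period(period_seconds: int) -> bool:
    """Periods listed for reading high-resolution metrics: 1, 5, 10, 30 or a multiple of 60."""
    return period_seconds in (1, 5, 10, 30) or (
        period_seconds > 0 and period_seconds % 60 == 0
    )


datum = metric_datum("QueueDepth", 5, high_resolution=True)
print(datum["StorageResolution"])           # 1
print(is_valid_high_resolution_period(15))  # False
```

With boto3 and credentials available, the datum could be sent with boto3.client("cloudwatch").put_metric_data(Namespace="MyApp", MetricData=[datum]); that call is left out so the sketch runs offline.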
One of the differences between detailed monitoring and basic monitoring is the frequency at which the data is published to CloudWatch: every 5 minutes for basic monitoring and every 1 minute for detailed monitoring.
The data collected is the same. The 5-minute datapoint published under basic monitoring is an aggregation of five 1-minute samples, which is why the sample count is 5.
Below is an example of a metric before and after the detailed monitoring was enabled.
Before enabling - no difference between graphing the metric at 1 min or 5 min resolution.
After enabling - graphing at 1 min resolution gives you more detail.
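Here is a minimal sketch of that aggregation, assuming five made-up 1-minute CPU samples. The function is only an illustration of how one 5-minute datapoint can carry a SampleCount of 5; it is not how CloudWatch is implemented internally.

```python
def aggregate(samples):
    """Collapse several samples into one datapoint with the usual statistics."""
    return {
        "SampleCount": len(samples),
        "Sum": sum(samples),
        "Minimum": min(samples),
        "Maximum": max(samples),
        "Average": sum(samples) / len(samples),
    }


# Five hypothetical 1-minute CPU readings inside one 5-minute period.
one_minute_cpu = [10.0, 12.0, 11.0, 30.0, 12.0]
datapoint = aggregate(one_minute_cpu)
print(datapoint["SampleCount"])  # 5
print(datapoint["Average"])      # 15.0
print(datapoint["Maximum"])      # 30.0
```

The spike to 30 is still visible in Maximum at 5-minute resolution, but only the 1-minute view shows in which minute it happened.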

CloudWatch Custom Metrics units for "minutes"

I've been scouring different sources (Boto3 docs, AWS docs among others) and most only list a limited number of units as far as time goes. Seconds, Milliseconds, and Microseconds. Say I want to measure a metric in Minutes. How would I go about publishing a custom metric that does this?
Seconds, Microseconds and Milliseconds are the only supported time units: https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_MetricDatum.html
If you want to graph your data using CloudWatch Dashboards, in Minutes, you could publish the data in Seconds and then use metric math to get the data in Minutes: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math.html
You give the metric id m1 and then your expression would be m1/60.
You can also use metric math with GetMetricData API, in case you need raw values instead of a graph: https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-data.html
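A minimal sketch of what the MetricDataQueries for that m1/60 expression might look like. The namespace "MyApp", the metric name "JobDuration" and the id "e1" are placeholders of mine; only the m1 id and the m1/60 expression come from the answer above.

```python
def minutes_queries(namespace, metric_name, period=300, stat="Average"):
    """Build MetricDataQueries that return a Seconds metric converted to minutes."""
    return [
        {
            "Id": "m1",
            "MetricStat": {
                "Metric": {"Namespace": namespace, "MetricName": metric_name},
                "Period": period,
                "Stat": stat,
            },
            "ReturnData": False,  # only the converted series is returned
        },
        {
            "Id": "e1",
            "Expression": "m1/60",  # seconds to minutes
            "Label": f"{metric_name} (minutes)",
            "ReturnData": True,
        },
    ]


queries = minutes_queries("MyApp", "JobDuration")
print(queries[1]["Expression"])  # m1/60
```

With boto3 and credentials in place, these could be passed as boto3.client("cloudwatch").get_metric_data(MetricDataQueries=queries, StartTime=start, EndTime=end), where start and end are datetime objects you choose.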

CloudWatch Custom Dashboard - Period Setting

I'm trying to set up a custom dashboard in CloudWatch, and don't understand how "Period" is affecting how my data is being displayed.
Couldn't find any coverage of this variable in the AWS documentation, so any guidance would be appreciated!
Period is the width of the time range of each datapoint on a graph and it's used to define the granularity at which you want to view your data.
For example, if you're graphing the total number of visits to your site during a day, you could set the period to 1h, which would plot 24 datapoints, and you would see how many visitors you had in each hour of that day. If you set the period to 1min, the graph will display 1440 datapoints and you will see how many visitors you had in each minute of that day.
See the CloudWatch docs for more details:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#CloudWatchPeriods
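The arithmetic in the example above can be checked with a tiny sketch; the function name is my own.

```python
import math


def datapoints_in_range(range_seconds: int, period_seconds: int) -> int:
    """Number of datapoints a graph shows for a time range at a given period."""
    return math.ceil(range_seconds / period_seconds)


ONE_DAY = 24 * 60 * 60
print(datapoints_in_range(ONE_DAY, 3600))  # 24
print(datapoints_in_range(ONE_DAY, 60))    # 1440
```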
Here is a similar question that might be useful:
API Gateway Cloudwatch advanced logging