I need help in making sense of how many data points (SampleCount) I get in 5-minute intervals in basic monitoring.
I have basic monitoring enabled for an EC2 instance, which means a new data point is gathered every 5 minutes.
With the GetMetricData API (MetricDataQueries), I can retrieve data points for the metric.
I queried the SampleCount statistic with a 5-minute period over a 10-minute window.
Data shows (10-min period):
0 min - 5 sample count
5 min - 5 sample count
I am confused as to what this actually means. Since basic monitoring gathers data every 5 minutes, I would have expected 1 data point per 5-minute interval. So my expectation:
0 min - 1 sample count
5 min - 1 sample count
Thank you for your help!
Default monitoring metrics are collected every 5 minutes, while custom monitoring metrics are collected every minute. See the CloudWatch FAQ.
A custom metric can be one of the following:
Standard resolution, with data having one-minute granularity
High resolution, with data at a granularity of one second
By default, metrics are stored at 1-minute resolution in CloudWatch. You can define a metric as high-resolution by setting the StorageResolution parameter to 1 in the PutMetricData API request. If you do not set the optional StorageResolution parameter, then CloudWatch will default to storing the metrics at 1-minute resolution.
When you publish a high-resolution metric, CloudWatch stores it with a resolution of 1 second, and you can read and retrieve it with a period of 1 second, 5 seconds, 10 seconds, 30 seconds, or any multiple of 60 seconds.
Custom metrics follow the same retention schedule as standard metrics.
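For illustration, here is a minimal boto3 sketch of publishing a high-resolution custom metric; the namespace and metric name are made up for the example:

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    # Publish a hypothetical custom metric. StorageResolution=1 stores it at
    # 1-second resolution; omit the parameter (or pass 60) for the default
    # 1-minute resolution.
    cloudwatch.put_metric_data(
        Namespace="MyApp",                       # made-up namespace
        MetricData=[
            {
                "MetricName": "RequestLatency",  # made-up metric name
                "Value": 0.42,
                "Unit": "Seconds",
                "StorageResolution": 1,
            }
        ],
    )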
One difference between detailed monitoring and basic monitoring is the frequency at which the data is published to CloudWatch: every 5 minutes for basic monitoring and every 1 minute for detailed monitoring.
The data collected is the same; the 5-minute datapoint published with basic monitoring is an aggregation of the 1-minute datapoints. That's why the sample count is 5: it is an aggregation of five 1-minute samples.
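To see this yourself, here is a rough boto3 sketch of the query the question describes (the instance ID is a placeholder); with basic monitoring, each 5-minute datapoint should report a SampleCount of 5:

    from datetime import datetime, timedelta, timezone

    import boto3

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)

    response = cloudwatch.get_metric_data(
        MetricDataQueries=[
            {
                "Id": "samples",
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/EC2",
                        "MetricName": "CPUUtilization",
                        "Dimensions": [
                            # placeholder instance ID
                            {"Name": "InstanceId", "Value": "i-0123456789abcdef0"}
                        ],
                    },
                    "Period": 300,          # 5-minute datapoints
                    "Stat": "SampleCount",  # how many samples were aggregated
                },
            }
        ],
        StartTime=end - timedelta(minutes=10),
        EndTime=end,
    )

    # With basic monitoring, each 5-minute datapoint aggregates five
    # 1-minute samples, so Values should come back as [5.0, 5.0].
    print(response["MetricDataResults"][0]["Values"])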
Below is an example of a metric before and after detailed monitoring was enabled.
Before enabling - no difference between graphing the metric at 1 min or 5 min resolution.
After enabling - graphing at 1 min resolution gives you more detail.
Related
I want to set up an alarm which would raise an alert in case there are any items in the DynamoDB table. The alarm has been set up in the following manner -
My understanding is that -
Period defines how regularly a data point is recorded; in this case, one datapoint per minute.
The Maximum statistic means that we select the maximum value of ReturnedItemCount in a minute.
2 out of 5 would mean that if 2 of the last 5 datapoints (5 minutes) breach the threshold, the alarm state would change to ALARM.
However, I am not seeing the intended results. I can only see a single datapoint (instead of a datapoint every minute) in the chart, and the state stays OK even when that datapoint is above the threshold.
Could someone help out with this?
I figured out that this is the intended behaviour. As per the documentation for ReturnedItemCount -
This metric only records a datapoint if a Query/Scan operation was performed on the table during the given time interval. Unlike some of the other metrics available in CloudWatch, it isn't a periodic check.
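For reference, a boto3 sketch of the alarm configuration described above (the alarm and table names are placeholders). Because ReturnedItemCount only produces a datapoint when a Query/Scan actually runs, the TreatMissingData setting determines how the empty minutes are evaluated:

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    cloudwatch.put_metric_alarm(
        AlarmName="returned-item-count-alarm",           # made-up alarm name
        Namespace="AWS/DynamoDB",
        MetricName="ReturnedItemCount",
        Dimensions=[
            {"Name": "TableName", "Value": "my-table"},  # placeholder table
            {"Name": "Operation", "Value": "Scan"},
        ],
        Statistic="Maximum",        # max value of the metric within each period
        Period=60,                  # 1-minute datapoints
        EvaluationPeriods=5,        # look at the last 5 datapoints...
        DatapointsToAlarm=2,        # ...and alarm if 2 of them breach
        Threshold=0,
        ComparisonOperator="GreaterThanThreshold",
        # The metric has no datapoint in minutes without a Query/Scan, so
        # decide explicitly how missing data should be treated.
        TreatMissingData="notBreaching",
    )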
Input data
DynamoDB free tier provides:
25 GB of Storage
25 Units of Read Capacity
25 Units of Write Capacity
Capacity units (SC/EC - strongly/eventually consistent):
1 RCU = 1 SC read of a 4 KB item/second
1 RCU = 2 EC reads of a 4 KB item/second
1 WCU = 1 write of a 1 KB item/second
My application:
one DynamoDB table 5 RCU, 5 WCU
one lambda
runs every 1 minute
writes 3 items (~8 KB each) to DynamoDB
lambda execution takes <1 second
The application works ok, no throttling so far.
CloudWatch
In my CloudWatch there are some charts (ignore the part after 7:00):
the value on this chart is 22 WCU
on this chart it is 110 WCU - actually figured it out: this chart's resolution is 5 min, and 5 * 22 = 110 (leaving it here in case my future self gets confused)
Questions
We have 3 writes of ~8 KB items/second - that's ~24 WCU, which is consistent with what we see in CloudWatch (22 WCU). But the table is configured to have only 5 WCU. I've read some other questions, and as far as I understand, I'm safe from paying extra if the sum of WCUs in my tables' configurations is below 25.
Am I overusing the write capacity for my table?
Should I expect throttling or extra charges?
As far as I can tell my usage is still within the free tier limits, but it is close (22 of 25). Am I to be charged extra if my usage gets over 25 on those charts?
The configured provisioned capacity is per second, while the data you see in CloudWatch is per minute. So your configured 5 WCU per second translate to 300 WCU per minute (5 WCU * 60 seconds), which is well above the consumed 22 WCU per minute.
That should already answer your question, but to elaborate a bit on some details:
A single write of 7 KB with a configured amount of 5 WCU would in theory never succeed and would cause throttling, as 7 KB requires 7 WCU to write while you only have 5 WCU configured (and we can safely assume that the write occurs within one second). Fortunately, the DynamoDB engineers thought about that and implemented burst capacity: while you're not using your provisioned capacity, DynamoDB saves it up (for up to 5 minutes) so you can consume it when you need more than the provisioned capacity. That's something to keep in mind when increasing the utilization of your capacity.
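A quick back-of-the-envelope check of those numbers, with the item size and write counts taken from the question:

    import math

    provisioned_wcu = 5        # per second, as configured on the table
    item_size_kb = 8           # ~8 KB items from the question
    items_per_minute = 3       # the Lambda writes 3 items each minute

    # 1 WCU covers a 1 KB write, so an ~8 KB item costs 8 WCU.
    wcu_per_item = math.ceil(item_size_kb / 1)

    consumed_per_minute = wcu_per_item * items_per_minute   # 24 WCU per minute
    provisioned_per_minute = provisioned_wcu * 60            # 300 WCU per minute

    print(consumed_per_minute, "WCU consumed vs", provisioned_per_minute, "WCU provisioned per minute")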
What resolution (basic monitoring with 5 min period, detailed with 1 min, or high-resolution with 1 sec) do Metric Filters use? And how can I change it or at least see it?
Metric filters only publish data at 1-minute resolution.
As the data ages out, it is rolled up into 5-minute datapoints (for data between 15 days and 63 days old) and then into 1-hour datapoints (for the remaining 15 months).
This follows the normal metric retention policy as described in the question "What is the retention period of all metrics?" in the CloudWatch FAQ.
AFAIK subminute resolution is not supported at the moment for metric filters.
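So there is nothing to change on the filter itself; you choose the resolution when reading the metric. A rough boto3 sketch, with a placeholder namespace and metric name standing in for whatever your metric filter publishes:

    from datetime import datetime, timedelta, timezone

    import boto3

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)

    # Recent data (< 15 days old) can be read at the filter's native 1-minute
    # resolution; older data is only available at 5-minute or 1-hour
    # resolution, so use a larger Period for those ranges.
    response = cloudwatch.get_metric_statistics(
        Namespace="MyLogMetrics",        # placeholder namespace
        MetricName="ErrorCount",         # placeholder metric-filter metric
        StartTime=end - timedelta(hours=1),
        EndTime=end,
        Period=60,
        Statistics=["Sum"],
    )
    print(len(response["Datapoints"]), "datapoints at 1-minute resolution")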
I've been scouring different sources (Boto3 docs, AWS docs, among others) and most only list a limited number of units as far as time goes: Seconds, Milliseconds, and Microseconds. Say I want to measure a metric in Minutes. How would I go about publishing a custom metric that does this?
Seconds, Microseconds and Milliseconds are the only supported time units: https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_MetricDatum.html
If you want to graph your data using CloudWatch Dashboards, in Minutes, you could publish the data in Seconds and then use metric math to get the data in Minutes: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math.html
You give the metric id m1 and then your expression would be m1/60.
You can also use metric math with GetMetricData API, in case you need raw values instead of a graph: https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-data.html
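A boto3 sketch of that approach (the namespace and metric name are placeholders): publish the value in Seconds, then convert to minutes with metric math when reading it back:

    from datetime import datetime, timedelta, timezone

    import boto3

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)

    response = cloudwatch.get_metric_data(
        MetricDataQueries=[
            {
                "Id": "m1",
                "MetricStat": {
                    "Metric": {
                        "Namespace": "MyApp",          # placeholder namespace
                        "MetricName": "JobDuration",   # published in Seconds
                    },
                    "Period": 300,
                    "Stat": "Average",
                },
                "ReturnData": False,       # only return the converted series
            },
            {
                "Id": "minutes",
                "Expression": "m1 / 60",   # metric math: seconds -> minutes
            },
        ],
        StartTime=end - timedelta(hours=1),
        EndTime=end,
    )
    print(response["MetricDataResults"][0]["Values"])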
I'm trying to set up a custom dashboard in CloudWatch, and don't understand how "Period" is affecting how my data is being displayed.
Couldn't find any coverage of this variable in the AWS documentation, so any guidance would be appreciated!
Period is the width of the time range of each datapoint on a graph and it's used to define the granularity at which you want to view your data.
For example, if you're graphing the total number of visits to your site during a day, you could set the period to 1 hour, which would plot 24 datapoints and show how many visitors you had in each hour of that day. If you set the period to 1 minute, the graph will display 1440 datapoints and show how many visitors you had in each minute of that day.
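To make that concrete, a small boto3 sketch (with a placeholder namespace and metric name) that reads the same metric at two different periods; only the granularity of the datapoints changes, not the underlying data:

    from datetime import datetime, timedelta, timezone

    import boto3

    cloudwatch = boto3.client("cloudwatch")
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=1)

    for period in (3600, 60):  # 1-hour vs 1-minute datapoints
        response = cloudwatch.get_metric_statistics(
            Namespace="MySite",      # placeholder namespace
            MetricName="Visits",     # placeholder metric
            StartTime=start,
            EndTime=end,
            Period=period,
            Statistics=["Sum"],
        )
        # Period=3600 yields up to 24 datapoints for the day; Period=60 up to 1440.
        print(period, "->", len(response["Datapoints"]), "datapoints")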
See the CloudWatch docs for more details:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#CloudWatchPeriods
Here is a similar question that might be useful:
API Gateway Cloudwatch advanced logging