I recently configured a CloudWatch alarm to track a VPN tunnel connection. It is well known that 0 indicates the tunnel is DOWN and 1 indicates the tunnel is UP. When the connection was down, I saw some data points on the graph with values such as 0.66 and 0.75.
So what does that mean: is the connection DOWN or UP?
The correct statistic for each metric depends on your use case and the underlying metric.
From CloudWatch Concepts - Statistics
Statistics are metric data aggregations over specified periods of
time. CloudWatch provides statistics based on the metric data points
provided by your custom data or provided by other AWS services to
CloudWatch. Aggregations are made using the namespace, metric name,
dimensions, and the data point unit of measure, within the time period
you specify. The following table describes the available statistics.
Given the VPN metric above, try using the Maximum or Minimum statistics for the alarm. You are using the Average statistic, which, as you noted, will not produce meaningful data for your use case.
Minimum
The lowest value observed during the specified period. You can use this value to determine low volumes of activity for your application.
Maximum
The highest value observed during the specified period. You can use this value to determine high volumes of activity for your application.
That happens when your graph shows averages, which is why both of your values fall between 0 and 1. In the CloudWatch console, select your metric and then click the Graphed metrics tab. There you will see a Statistic column, which is most likely set to Average right now.
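To make the fractional values concrete, here is a minimal sketch in plain Python (no AWS calls) of how a period containing both UP (1) and DOWN (0) samples aggregates under each statistic. The sample lists and the `summarize` helper are made up for illustration.

```python
def summarize(samples):
    """Aggregate raw 0/1 tunnel-state samples over one period."""
    return {
        "Average": sum(samples) / len(samples),
        "Minimum": min(samples),
        "Maximum": max(samples),
    }

# Hypothetical period: the tunnel was up for three samples and down for one.
period = [1, 1, 1, 0]
stats = summarize(period)
print(stats)
```

With Average, the period above reports 0.75, which is neither UP nor DOWN. Minimum reports 0 if the tunnel was down at any point in the period, which is usually what a "tunnel down" alarm wants.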
Related
I have a requirement to send an email notification whenever no data is being inserted into my BigQuery table. For this, I am using the Logging and Alerting mechanism, but I am still not receiving any email. Here are the steps I followed:
I wrote a query in Logs Explorer as below:
I then created a metric for those logs with metric type COUNTER, using the above query as the filter.
Next I created a policy under ALERTING in the MONITORING section (see the attached screenshot). The alerting policy targets the log-based metric I created earlier.
And then a trigger as below:
And in the Notification channel, added my Email ID.
Can someone please tell me what I am missing? My requirement is to receive an alert when no data has been inserted into a BigQuery table for more than a day.
Also, I can see in Metrics Explorer that the metric I created is not ACTIVE. Why is that?
As mentioned in GCP docs:
Metric absence conditions require at least one successful measurement — one that retrieves data — within the maximum duration window after the policy was installed or modified.
For example, suppose you set the duration window in a metric-absence policy to 30 minutes. The condition isn't met if the subsystem that writes metric data has never written a data point. The subsystem needs to output at least one data point and then fail to output additional data points for 30 minutes.
This means you need at least one data point (an insert job) to be written before an incident can be created for the metric going missing.
There are two options:
Create an artificial log entry to get the metric started and have at least one time series and data point.
Run an insert job that would match the log-based metric that was created to get the metric started.
Regarding your last question, the metric you created is not active because no data points have been written to it within the previous 24 hours. As mentioned above, the metric must have at least one data point written to it.
Refer to custom metrics quota for more info.
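The rule quoted from the docs can be sketched as a small predicate. This is a simplified model in plain Python, not the Cloud Monitoring implementation. The function name and the sample timestamps are hypothetical, and the model ignores the detail that the first measurement must fall within the maximum duration window after the policy was installed.

```python
from datetime import datetime, timedelta

def absence_condition_met(point_times, now, duration):
    """Return True only if data was seen at least once and then stopped.

    point_times: timestamps of data points written to the metric.
    duration: how long the metric must stay silent before the condition is met.
    """
    if not point_times:
        # A metric that never received data cannot be "absent".
        return False
    return now - max(point_times) >= duration

now = datetime(2024, 1, 3, 12, 0)
one_day = timedelta(days=1)

never_written = absence_condition_met([], now, one_day)
went_silent = absence_condition_met([datetime(2024, 1, 1, 9, 0)], now, one_day)
still_active = absence_condition_met([datetime(2024, 1, 3, 8, 0)], now, one_day)
print(never_written, went_silent, still_active)
```

The first case mirrors the situation in the question: with no insert job ever matched, the condition stays unmet no matter how long you wait.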
I'm trying to identify the initial creation date of a metric on CloudWatch using the AWS CLI but don't see any way of doing so in the documentation. I can kind of identify the start date if there is a large block of missing data but that doesn't work for metrics that have large gaps in data.
CloudWatch metrics are "created" with the first PutMetricData call that includes the metric. I use quotes around created, because the metric doesn't have an independent existence, it's simply an entry in the time-series database. If there's a gap in time with no entries, the metric effectively does not exist for that gap.
Another caveat to CloudWatch metrics is that they only have a lifetime of 455 days, and individual metric values are aggregated as they age (see specifics at link).
All of which begs the question: what's the real problem that you're trying to solve?
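If an approximation is good enough, one option is to query the widest window you can and take the oldest data point that is still retained. Below is a minimal sketch that assumes you already have a response shaped like the one boto3's `get_metric_statistics` returns (a `Datapoints` list whose entries carry a `Timestamp`). The `earliest_datapoint` helper and the sample response are made up for illustration, and the result is only the oldest retained point, not a true creation date.

```python
from datetime import datetime, timezone

def earliest_datapoint(datapoints):
    """Return the oldest Timestamp in a list of data point dicts, or None."""
    timestamps = [dp["Timestamp"] for dp in datapoints]
    return min(timestamps) if timestamps else None

# Hypothetical response fragment; the API does not guarantee any ordering.
sample_response = {
    "Datapoints": [
        {"Timestamp": datetime(2023, 6, 10, tzinfo=timezone.utc), "Sum": 4.0},
        {"Timestamp": datetime(2023, 3, 1, tzinfo=timezone.utc), "Sum": 1.0},
        {"Timestamp": datetime(2023, 9, 22, tzinfo=timezone.utc), "Sum": 7.0},
    ]
}

first_seen = earliest_datapoint(sample_response["Datapoints"])
print(first_seen)
```

Because gaps are allowed, this works for sparse metrics too, as long as the query window reaches back far enough.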
I have one AWS load balancer going to one EC2 instance. According to the AWS documentation, and what I would expect it to mean, the CloudWatch metric for RequestCount on the ELB should show total number of requests. However, I get a graph mapped to a scale of 0-1, with 1 being the peak.
Is this correct? This is not useful for me. Is there a way to see the actual number of requests?
Okay, answering my own question for future searchers:
You need to go to the Graphed metrics tab and change the Statistic option to Sum (thanks @Dejan Peretin). I previously had it set to Average.
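A rough illustration of why Average tops out at 1 here, assuming each request is recorded as a separate sample with value 1 (my understanding of how the load balancer reports RequestCount, so treat it as an assumption). The `aggregate` helper and the figure of 250 requests are made up.

```python
def aggregate(samples):
    """Compute the statistics relevant to a per-request counter."""
    return {
        "Sum": sum(samples),
        "Average": sum(samples) / len(samples),
        "SampleCount": len(samples),
    }

# Hypothetical period in which 250 requests were each recorded as a 1.
requests_in_period = [1] * 250
result = aggregate(requests_in_period)
print(result)
```

Under that assumption Average is always 1 regardless of traffic, while Sum gives the actual number of requests in the period.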
Looking to list all EC2 servers/instances which have crossed a certain threshold using AWS CloudWatch
I want to view all my EC2 instances or servers that have reached or crossed some threshold, i.e. triggered some alarm, at any time in the last month. I have been looking for a solution for the past two days but to no avail. I would really appreciate any help regarding the matter.
You can create an alarm in CloudWatch for an EC2 metric (e.g., CPUUtilization) across all instances by selecting the metric like this: EC2 -> Across All Instances -> CPUUtilization. Then select a value for Statistic (e.g., Maximum) and specify the period over which the alarm should be evaluated (e.g., 1 Minute). Under Conditions, select the Threshold type (e.g., Static), choose the condition operator (e.g., Greater > threshold), and define the threshold value (e.g., 75.0). Under Notification, select the value for Whenever this alarm state is (e.g., In alarm) and select an SNS topic. Finally, specify the remaining values such as name and description, and create the alarm.
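The same console steps can be expressed as parameters for boto3's `put_metric_alarm` call. The sketch below only builds the parameter dict so it runs without AWS credentials. The alarm name, the SNS topic ARN, and the `build_cpu_alarm_params` helper are placeholders, and omitting `Dimensions` to get the across-all-instances aggregate is my assumption about how the console option maps to the API.

```python
def build_cpu_alarm_params(sns_topic_arn, threshold=75.0):
    """Build keyword arguments for CloudWatch's put_metric_alarm."""
    return {
        "AlarmName": "ec2-cpu-above-threshold",  # hypothetical name
        "AlarmDescription": "Max CPUUtilization above threshold",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        # No "Dimensions" key: assumed to select the cross-instance aggregate.
        "Statistic": "Maximum",
        "Period": 60,  # seconds, i.e. the 1 Minute period from the console
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }

# Placeholder ARN; substitute a real SNS topic before use.
params = build_cpu_alarm_params("arn:aws:sns:us-east-1:123456789012:example-topic")
print(params)

# To actually create the alarm (requires boto3 and AWS credentials):
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**params)
```

Note that an aggregate alarm like this tells you that some instance crossed the threshold, not which one. Identifying individual instances would need per-instance alarms with an `InstanceId` dimension.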
I'm trying to push data into a custom metric on AWS CloudWatch but wanted to find out more about the Dimensions and how these are used? I've already read the AWS documentation but it doesn't really explain what they are used for and how it affects the graphing UI in the AWS Management Console.
Are Dimensions a way to breakdown the Metric Value further?
To give a fictitious example, say I have a metric which counts the number of people in a room. The metric is called "Population". I report the count once a minute. The metric value is set to the number of people. The Dimensions field is just a list of name and value pairs. Assuming I report a data point with a value of 90, can I add two dimensions as follows:
1. Name: Male, Count: 50
2. Name: Female, Count: 40
Any help will be greatly appreciated.
Yes, you can add dimensions such as you described to your custom metrics.
However, CloudWatch is NOT able to aggregate across these dimensions, as it doesn't know the groups of these dimensions. Basically:
Amazon CloudWatch treats each unique combination of dimensions as a
separate metric. For example, each call to mon-put-data in the
following figure creates a separate metric because each call uses a
different set of dimensions. This is true even though all four calls
use the same metric name (ServerStats).
See more information about dimensions in CloudWatch here
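A toy model of the quoted behaviour in plain Python: the metric name plus its exact set of dimensions acts as the key, so each combination accumulates its own series. The `Gender` dimension name and the `put` and `metric_key` helpers are made up for illustration. Here the per-group counts (50 and 40) are sent as metric values, with the dimension carrying only the label, which is one way to adapt the Population example.

```python
from collections import defaultdict

def metric_key(name, dimensions):
    """Identify a metric by its name and its full, order-independent dimension set."""
    return (name, tuple(sorted(dimensions.items())))

store = defaultdict(list)

def put(name, dimensions, value):
    """Record one data point under the metric identified by name + dimensions."""
    store[metric_key(name, dimensions)].append(value)

put("Population", {"Gender": "Male"}, 50)
put("Population", {"Gender": "Female"}, 40)

# Two separate series exist; nothing combines them automatically.
series_count = len(store)

# Any total has to be computed explicitly across the separate series.
total = sum(v for (name, _), vals in store.items() if name == "Population" for v in vals)
print(series_count, total)
```

If you also want a ready-made total series, a common approach is to publish the same value a second time without the dimension.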
Do note that you can retrieve an aggregated value from the API, as well as plot a graph in CloudWatch using a math expression. See Using metric math.
I should probably also add that you can NOT use metric math in alarms.
Update: as @Brooks pointed out, see "Amazon CloudWatch Launches Ability to Add Alarms on Metric Math Expressions".
All in all pretty restricted and user-unfriendly compared e.g. to DataDog.