CloudWatch Custom Dashboard - Period Setting - amazon-web-services

I'm trying to set up a custom dashboard in CloudWatch, and don't understand how "Period" is affecting how my data is being displayed.
Couldn't find any coverage of this variable in the AWS documentation, so any guidance would be appreciated!

Period is the width of the time range covered by each datapoint on a graph; it defines the granularity at which you want to view your data.
For example, if you're graphing the total number of visits to your site during a day, you could set the period to 1 hour, which would plot 24 datapoints and show how many visitors you had in each hour of that day. If you set the period to 1 minute, the graph will display 1440 datapoints and show how many visitors you had in each minute of that day.
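To make that concrete, here is a rough sketch of what those two settings look like when fetching the same metric programmatically (assuming boto3; the namespace and metric name are hypothetical placeholders):

```python
import datetime

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

end = datetime.datetime.utcnow()
start = end - datetime.timedelta(days=1)

for period in (3600, 60):  # 1 hour vs. 1 minute
    resp = cloudwatch.get_metric_statistics(
        Namespace="MySite",        # hypothetical custom namespace
        MetricName="Visits",       # hypothetical metric
        StartTime=start,
        EndTime=end,
        Period=period,             # width of each datapoint, in seconds
        Statistics=["Sum"],
    )
    # Period=3600 -> up to 24 datapoints; Period=60 -> up to 1440 datapoints
    print(period, len(resp["Datapoints"]))
```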
See the CloudWatch docs for more details:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#CloudWatchPeriods
Here is a similar question that might be useful:
API Gateway Cloudwatch advanced logging

Related

AWS CloudWatch sql query not showing results

I'm using CloudWatch to monitor the cpu_usage_system metric from CWAgent.
I'm plotting data that is more than 24h old.
When I use the regular CloudWatch browsing tab to view the data I see datapoints, but when I do the same with CloudWatch SQL I do not.
Answer from AWS support:
CloudWatch Metrics Insights currently supports the latest three hours of data only. When you graph with a period larger than one minute, for example five minutes or one hour, there could be cases where the oldest data point differs from the expected value. This is because the Metrics Insights queries return only the most recent 3 hours of data, so the oldest datapoint, being older than 3 hours, accounts only for observations that have been measured within the last three hours boundary.
In simple words: currently we can query only the most recent 3 hours of data, nothing beyond that. The documentation link with more information on this is below.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-metrics-insights-limits.html
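To see the limitation in practice, here is a minimal sketch (assuming boto3 and the CWAgent namespace) of running a Metrics Insights query through GetMetricData; even with a 24-hour StartTime, only the latest 3 hours of datapoints come back:

```python
import datetime

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
end = datetime.datetime.utcnow()

resp = cloudwatch.get_metric_data(
    MetricDataQueries=[
        {
            "Id": "q1",
            # Metrics Insights (SQL) query; namespace/metric assumed
            "Expression": 'SELECT AVG(cpu_usage_system) FROM "CWAgent"',
            "Period": 300,
        }
    ],
    StartTime=end - datetime.timedelta(hours=24),  # older data is not returned
    EndTime=end,
)
# Timestamps will span at most the most recent 3 hours
print(resp["MetricDataResults"][0]["Timestamps"])
```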

Best practices to configure thresholds for alarms

I have been having some difficulty understanding how to choose the ideal threshold for a few of our CloudWatch alarms. I am looking at metrics for error rate, fault rate and failure rate. I am roughly planning on an evaluation period of around 15 minutes. My metrics are currently recorded at a one-minute level. I have the following ideas:
To look at the average of the minute-level data over a few days, and set the threshold slightly higher than that.
To try different thresholds (t1, t2, ...) and, for a given day, see how many times the datapoints cross each one in 15-minute bins.
I am not sure if this is the right way of going about it; do share if there is a better way to approach the problem.
PS 1: I know that thresholds should be based on Service Level Agreements (SLAs), but let's say we do not have an SLA yet.
PS 2: Also, can I import data from CloudWatch into Excel for some easier manipulation (see the sketch below)? Currently I'm looking at running a few queries in Logs Insights to calculate error rates.
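For PS 2, a minimal sketch (assuming boto3; the namespace and metric name are placeholders) that pulls minute-level datapoints and writes them to a CSV file Excel can open:

```python
import csv
import datetime

import boto3

cloudwatch = boto3.client("cloudwatch")
end = datetime.datetime.utcnow()

resp = cloudwatch.get_metric_statistics(
    Namespace="MyService",        # hypothetical namespace
    MetricName="ErrorRate",       # hypothetical metric
    StartTime=end - datetime.timedelta(days=1),  # 1440 points is the per-call limit
    EndTime=end,
    Period=60,                    # minute-level granularity
    Statistics=["Average"],
)

points = sorted(resp["Datapoints"], key=lambda p: p["Timestamp"])
with open("error_rate.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "average"])
    for p in points:
        writer.writerow([p["Timestamp"].isoformat(), p["Average"]])
```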
In your case, maybe you could also try Amazon CloudWatch Anomaly Detection instead of static thresholds:
You can create an alarm based on CloudWatch anomaly detection, which mines past metric data and creates a model of expected values. The expected values take into account the typical hourly, daily, and weekly patterns in the metric.
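A minimal sketch of such an alarm (assuming boto3; the namespace, metric name and band width are placeholders, not a recommendation):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="error-rate-anomaly",            # hypothetical name
    ComparisonOperator="GreaterThanUpperThreshold",
    EvaluationPeriods=15,                      # fifteen 1-minute datapoints = 15 min window
    DatapointsToAlarm=15,
    ThresholdMetricId="band",
    TreatMissingData="notBreaching",
    Metrics=[
        {
            "Id": "m1",
            "MetricStat": {
                "Metric": {
                    "Namespace": "MyService",   # hypothetical namespace
                    "MetricName": "ErrorRate",  # hypothetical metric
                },
                "Period": 60,
                "Stat": "Average",
            },
        },
        {
            "Id": "band",
            # 2 = width of the expected-value band, in standard deviations
            "Expression": "ANOMALY_DETECTION_BAND(m1, 2)",
        },
    ],
)
```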

How can I see request count (not rate) for a Google Cloud Run application?

I deployed a Google Cloud Run service running in a docker container. Out of the box, it looks like I get insight into some metrics on the Metrics tab of the service page such as Request count, Request latencies and more. Although it sounds like request count would answer my question, what I am really looking for is insight into adoption so that I can answer "How many visits to my application were there in the past week" or something like that. Is there a way to get insight like that out of the box?
Currently, the Request count metric reports responses/second, so I can see blips that look like "0.05/s", which can give me some insight but it's hard to aggregate.
I've tried using the Monitoring > Metrics explorer as well, but I'm not seeing any data for the metrics I select. I'm considering hooking into Google Analytics from within my application if that seems like the suggested solution. Thank you!
I've realized it's quite difficult to have Metrics Explorer give you a straight answer on "how many requests I received this month". However, it's possible:
Go to Metrics Explorer as you said, choose resource type "Cloud Run Revision" (cloud_run_revision) and you'll see "Request Count" (run.googleapis.com/request_count) metric:
Description: Number of requests reaching the revision. Excludes requests that are not reaching your container instances (e.g. unauthorized requests or when maximum number of instances is reached).
Resource type: cloud_run_revision
Unit: number Kind: Delta Value type: Int64
Then, choose Aggregator: None, and click Show Advanced Options. In the form, choose Aligner: sum (instead of the default "Rate"). You should now be able to see the total request count per minute:
Now if you change "Alignment Period" to "10 minutes", you'll see one data point for every 10 minutes (sadly, there seems to be a bug where the label still says X req/s, but it's really X requests per 10 minutes in this case):
If you collect enough data, you can change "Alignment Period" to "Custom" and set 30 days, then update your timeframe on the top to 1 year and see monthly request count.
This does not show the sum across all alignment periods (I think that part is up to you to do manually, maybe possible via the API), but it lets you see requests per month. For example, here's a service I've been running for some months; I set the alignment period to 7 days and the timeframe to the past 6 weeks, so I get 6 data points of weekly request count. Hope this helps.
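For completeness, the same aggregation can be reproduced with the Cloud Monitoring API; a rough sketch (assuming the google-cloud-monitoring Python client, with PROJECT_ID as a placeholder):

```python
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project = "projects/PROJECT_ID"  # placeholder

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {
        "start_time": {"seconds": now - 42 * 24 * 3600},  # last 6 weeks
        "end_time": {"seconds": now},
    }
)
aggregation = monitoring_v3.Aggregation(
    {
        "alignment_period": {"seconds": 7 * 24 * 3600},   # one datapoint per week
        "per_series_aligner": monitoring_v3.Aggregation.Aligner.ALIGN_SUM,
        "cross_series_reducer": monitoring_v3.Aggregation.Reducer.REDUCE_SUM,
    }
)

results = client.list_time_series(
    request={
        "name": project,
        "filter": 'metric.type = "run.googleapis.com/request_count"',
        "interval": interval,
        "aggregation": aggregation,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for series in results:
    for point in series.points:
        print(point.interval.end_time, point.value.int64_value)
```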

One or more points were written more frequently than the maximum sampling period configured for the metric

Background
I have a website deployed in multiple machines. I want to create a Google Custom Metric that specifies the throughput of it - how many calls were served.
The idea was to create a custom metric that collects information about served requests and, once per minute, writes that information to the custom metric. So, on each machine, this code runs at most once per minute, but the process happens on every machine in my cluster.
Running the code locally is working perfectly.
The problem
I'm getting this error: Grpc.Core.RpcException:
Status(StatusCode=InvalidArgument, Detail="One or more TimeSeries
could not be written: One or more points were written more frequently
than the maximum sampling period configured for the metric. {Metric:
custom.googleapis.com/web/2xx, Timestamps: {Youngest Existing:
'2019/09/28-23:58:59.000', New: '2019/09/28-23:59:02.000'}}:
timeSeries[0]; One or more points were written more frequently than
the maximum sampling period configured for the metric. {Metric:
custom.googleapis.com/web/4xx, Timestamps: {Youngest Existing:
'2019/09/28-23:58:59.000', New: '2019/09/28-23:59:02.000'}}:
timeSeries1")
Then, I was reading in the custom metric limitations that:
Rate at which data can be written to a single time series = one point per minute
I was thinking that Google Cloud custom metrics would handle the concurrency issues for me.
According to their limitations, the only option for me to implement realtime monitoring is to add another application that collects the information from all machines and writes it to the custom metric. That sounds like too much work for a real use case.
What am I missing?
Now that you add the machine name as a label on the metric, you get per-machine metrics.
To SUM these metrics, go to Stackdriver > Metrics Explorer, group your metrics by project-id or by a label, for example, and then SUM the metrics.
https://cloud.google.com/monitoring/charts/metrics-selector#alignment
You can save the chart in a custom dashboard.
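In code, "adding the machine name on the metric" is just a label on the time series, so each machine writes its own series and the one-point-per-minute limit applies per machine. A rough sketch (assuming the google-cloud-monitoring Python client; the project and value are placeholders):

```python
import socket
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project = "projects/PROJECT_ID"  # placeholder

series = monitoring_v3.TimeSeries()
series.metric.type = "custom.googleapis.com/web/2xx"
series.metric.labels["machine"] = socket.gethostname()  # per-machine label
series.resource.type = "global"

point = monitoring_v3.Point(
    {
        "interval": {"end_time": {"seconds": int(time.time())}},
        "value": {"int64_value": 42},  # requests served in the last minute (placeholder)
    }
)
series.points = [point]

# Each machine calls this at most once per minute for its own series.
client.create_time_series(name=project, time_series=[series])
```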

How to see # of requests per hour on Google Cloud Platform APIs reports

Google API metrics only show data from 1 hour up to 30 days back from now. The page shows the total, but when you narrow the graph it won't update the total for that window of time.
How do I see the total number of requests for a specific period of time?
Besides, it only provides requests per second, which varies constantly for my app.
I have tried using the "Traffic by API" graph on Google Cloud Platform and narrowing it to a shorter time range.
I expected the totals at the bottom to update with a count of requests for the shorter period of time.
Screen cap of one day of metrics adding up to 34,238 requests
Screen cap of the graph narrowed from 18:00 to 21:00, but the count still shows 34,238
Looking into it, it seems you can go into the desired API and then into Metrics, where you will find a drop-down.
In that drop-down you can choose "Traffic by API" and then filter the graph for the period you are interested in. Then download/export that graph and process it in Excel.
In the output file you will find requests per second, which you can multiply by 60000 and then create a pivot table to add everything up.
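A small sketch of that spreadsheet step outside Excel (assuming pandas; the column names and the sampling interval of the export are assumptions, so adjust the rate-to-count conversion factor to whatever the export actually uses):

```python
import pandas as pd

SAMPLE_INTERVAL_SECONDS = 60  # assumed interval between exported samples

# hypothetical export with columns: timestamp, requests_per_second
df = pd.read_csv("traffic_by_api.csv", parse_dates=["timestamp"])
df["requests"] = df["requests_per_second"] * SAMPLE_INTERVAL_SECONDS

# requests per hour, plus the overall total for the exported window
hourly = df.set_index("timestamp")["requests"].resample("1h").sum()
print(hourly)
print("total:", int(hourly.sum()))
```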