AWS CloudWatch metrics - amazon-web-services

I would like to know more details about AWS CloudWatch metrics and their impact - the AWS docs don't have much detail on these metrics.
What is the difference between the metrics below?
What is the impact on the application or the AWS instance if the below alerts trigger?
Http Server Errors GreaterThan 0 (Count) in the last 5 minutes was activated
Requests GreaterThan 100 (Count) in the last 5 minutes
Http 404 GreaterThan 0 (Count) in the last 5 minutes was activated
Requests GreaterThan 500 (Count) in the last 5 minutes was activated
Is CloudWatch checking for these errors in the logs?

These metrics are related to your load balancer. Here is my explanation:
1. The web server behind the load balancer throws an HTTP error with code 5XX, indicating that your server cannot fulfil the request. This can happen for several reasons, such as Internal Server Error, Not Implemented (e.g., the server expects POST but the client sends GET), or Gateway Timeout (e.g., the server executes a slow DB query and the result does not come back in time).
2. The number of requests completed or connections made is more than 100 - indicating exactly what it says.
3. The number of "Not Found" responses received by clients - indicating that a client is requesting a page which does not exist in your application (for instance, https://stackoverflow.com/test).
4. The number of requests completed or connections made is more than 500 - the same as number 2, but indicating even more requests.
If all these alarms trigger at once, there is probably high load on your server and it is not functioning optimally. Beyond that, though, it is hard to say more; you need to check which alarm reports the most errors. The most important one is the 5XX alarm (number 1).
The load balancer publishes these metrics to CloudWatch; they don't have anything to do with your application log (if I understood the question correctly).
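For reference, here is a minimal sketch of how the first alarm could be defined with boto3, which makes it clear that the metric comes from the ALB namespace rather than from your application logs (the alarm name and the load balancer dimension value are placeholders):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical alarm mirroring "Http Server Errors GreaterThan 0 (Count) in the last 5 minutes".
cloudwatch.put_metric_alarm(
    AlarmName="http-5xx-errors",
    Namespace="AWS/ApplicationELB",  # published by the load balancer, not your app
    MetricName="HTTPCode_ELB_5XX_Count",
    Dimensions=[{"Name": "LoadBalancer", "Value": "app/my-alb/1234567890abcdef"}],
    Statistic="Sum",
    Period=300,  # the "last 5 minutes" window
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",  # no requests means no errors
)
```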

Related

AWS Cloudfront returns 403 when PUT request body is greater than 8kb

I have an API behind AWS Cloudfront which has functioned fine with our front end application for years. Recently, after a feature release, I've noticed some users are reporting data not being saved and the app just hanging. After a lengthy investigation, I've discovered that our Cloudfront distribution will return a 403 Forbidden error when a PUT request's JSON body is greater than 8kb. Anything less works fine, anything more returns 403. I verified this by sending PUT requests with a decreasing body data size until I got the expected 201 Created response, and just checked the size of the body sent. The JSON is properly formatted.
The error returns in about 170 ms and contains the CloudFront header X-Cache: Error from cloudfront.
I have looked for settings on size limits, I've tried disabling the WAF rules, and I've tried toggling "Compress objects" both on and off. Would having real-time logs enabled somehow impact the maximum data size accepted? Seems crazy, but I'm kind of bewildered by this issue.
I would love to show you all some kind of log from CloudWatch to help, but requests that don't make it past CloudFront are not logged in CloudWatch, and I have not been able to set up any kind of logging to get better insight into why it's barfing.
As was the initial hunch, this turned out to be a WAF ACL rule issue.
The blocking ACL was applied to the application load balancer, so finding it in the Web ACL list requires either inspecting the region where your load balancer is (e.g. us-west-2), or inspecting the load balancer's Integrated services, where you can see any AWS WAF rules:
AWS > EC2 > Load Balancers > {instance} > Integrated services (tab) > AWS WAF
The specific rule was in an AWS managed rule set called AWS-AWSManagedRulesCommonRuleSet. Just set the SizeRestrictions_BODY rule's action to Count instead of "Use action defined in the rule".
This obviously has impacts on what requests get through to your application, so do with that what you will.
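The same change can also be scripted. Here is a hedged sketch using boto3's wafv2 client; the ACL name/ID, the region, and the use of RuleActionOverrides are assumptions you would adapt to your own setup:

```python
import boto3

# Web ACLs attached to an ALB are REGIONAL and live in the ALB's region (e.g. us-west-2).
wafv2 = boto3.client("wafv2", region_name="us-west-2")

# Name and Id are placeholders - look yours up with wafv2.list_web_acls(Scope="REGIONAL").
acl = wafv2.get_web_acl(Name="my-web-acl", Scope="REGIONAL", Id="my-acl-id")

rules = acl["WebACL"]["Rules"]
for rule in rules:
    stmt = rule.get("Statement", {}).get("ManagedRuleGroupStatement")
    if stmt and stmt.get("Name") == "AWSManagedRulesCommonRuleSet":
        # Override only SizeRestrictions_BODY to Count, leaving the rest of the rule set intact.
        stmt["RuleActionOverrides"] = [
            {"Name": "SizeRestrictions_BODY", "ActionToUse": {"Count": {}}}
        ]

wafv2.update_web_acl(
    Name="my-web-acl",
    Scope="REGIONAL",
    Id="my-acl-id",
    DefaultAction=acl["WebACL"]["DefaultAction"],
    Rules=rules,
    VisibilityConfig=acl["WebACL"]["VisibilityConfig"],
    LockToken=acl["LockToken"],  # required optimistic-locking token from get_web_acl
)
```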

Google Cloud Run concurrency limits + autoscaling clarifications

Google Cloud Run allows a specified request concurrency limit per container. The subtext of the input field states "When this concurrency number is reached, a new container instance is started" Two clarification questions:
Is there any way to set Cloud Run to anticipate the concurrency limit being reached, and spawn a new container a little before that happens to ensure that requests over the concurrency limit of Container 1 are seamlessly handled by Container 2 without the cold start time affecting the requests?
Imagine we have Maximum Instances set to 10, Concurrency set to 10 and there are currently 100 requests being processed (i.e. we've maxed out our capacity and cannot autoscale any more). What happens to the 101st request? Will it be queued up for some period of time, or will a 5XX be returned immediately?
Is there any way to set Cloud Run to anticipate the concurrency limit being reached, and spawn a new container a little before that happens to ensure that requests over the concurrency limit of Container 1 are seamlessly handled by Container 2 without the cold start time affecting the requests?
No. Cloud Run does not try to predict future traffic patterns.
Imagine we have Maximum Instances set to 10, Concurrency set to 10 and there are currently 100 requests being processed (i.e. we've maxed out our capacity and cannot autoscale any more). What happens to the 101st request? Will it be queued up for some period of time, or will a 5XX be returned immediately?
HTTP Error 429 Too Many Requests will be returned.
[EDIT - Google Cloud documentation on request queuing]
Under normal circumstances, your revision scales out by creating new instances to handle incoming traffic load. But when you set a maximum instances limit, in some scenarios there will be insufficient instances to meet that traffic load. In that case, incoming requests queue for up to 60 seconds. During this 60 second window, if an instance finishes processing requests, it becomes available to process queued requests. If no instances become available during the 60 second window, the request fails with a 429 error code on Cloud Run (fully managed).
About maximum container instances
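Since a request that waits out the 60 second queue fails with a 429, a well-behaved client should back off and retry. A minimal sketch, assuming a plain HTTP client and a hypothetical service URL:

```python
import random
import time
import urllib.error
import urllib.request

def call_with_backoff(url: str, attempts: int = 5) -> bytes:
    """Retry on HTTP 429 with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise  # only retry the throttled case
            time.sleep(2 ** attempt + random.random())  # 1s, 2s, 4s, ... plus noise
    raise RuntimeError(f"still throttled after {attempts} attempts")

# Hypothetical Cloud Run service URL.
body = call_with_backoff("https://my-service-abc123-uc.a.run.app/")
```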

High latency from Google CDN - how to troubleshoot it?

We are trying to find out why there is a high latency from Google CDN.
Our site is behind Google's http_load_balancer with CDN turned on.
For example, by inspecting a sample GET request for a jpg file (43 KB), we can see from the http_load_balancer logs that around 30% of such requests have httpRequest.latency > 1 second, and many take much longer - several seconds, or even hundreds of seconds.
This is just from looking at a 24h log sample (around 6K of the same request).
The httpRequest.cacheLookup and httpRequest.cacheHit for all of those requests are true.
Also, jsonpayload_type_loadbalancerlogentry.statusdetails is response_from_cache and the jsonpayload_type_loadbalancerlogentry.cacheid values show the correct region.
When doing the same GET request manually in the browser, we get the expected results, with a TTFB of around 15-20 ms.
Any idea where to look for a clue?
The httpRequest.latency field measures the entire download duration, and is directly impacted by slow clients - e.g. a mobile device on a congested network or throttled data plan.
You can check this by looking at the frontend_tcp_rtt metric (which is the RTT between the client and Cloud CDN) in Cloud Monitoring, as well as the average, median and 90th percentile total_latencies, where the slow clients will show up as outliers: https://cloud.google.com/load-balancing/docs/https/https-logging-monitoring#monitoring_metrics_fors
You may find that slow clients are from a specific group of client_country values.
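If you want to pull that metric out of Cloud Monitoring programmatically rather than through the console, something like this sketch should work (the project ID is a placeholder, and I am assuming client_country is exposed as a label on this metric, as referenced above):

```python
import time
from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project = "projects/your-project-id"  # placeholder

now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 24 * 3600}}
)

# Client-to-CDN round-trip time; slow clients show up as outliers here.
series = client.list_time_series(
    request={
        "name": project,
        "filter": 'metric.type = "loadbalancing.googleapis.com/https/frontend_tcp_rtt"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for ts in series:
    country = ts.metric.labels.get("client_country", "unknown")  # assumed label
    for point in ts.points:
        print(country, point.value.distribution_value.mean)  # RTT is a distribution
```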
Latency can be introduced:
- Between the original client and the load balancer. You can see the latency of that segment with the metric https/frontend_tcp_rtt.
- Between the load balancer and the backend instance. This can be reviewed with the metric https/backend_latencies (which also includes the app processing time on your backend).
- By the software running on the instance itself. To investigate this, I would check the access/error logs of the software on the backend instance and the resource utilization of the VM instance.
Further information is in the metric descriptions of the GCP load balancer metrics doc.
httpRequest.latency log field description:
"The request processing latency on the server, from the time the request was received until the response was sent."

Why are all requests to ECS container going to only 1 (of 2) EC2 instances in AWS?

In AWS, I have an ECS cluster that contains a service that has 2 EC2 instances. I sent 3 separate API requests to this service, each of which should take about an hour to run at 100% capacity. I sent the requests a couple of minutes apart. They all went to the same instance and left the other idle. The graph of my service's CPU utilization shows it is not using all of its capacity. What am I missing? Why won't requests go to the second EC2 instance?
An ALB will not perfectly round-robin between two instances. If you sent 100 requests 100 times over, then on average each instance would receive 50 requests, but on most runs it won't be exactly 50 per backend.
For a long-running task like this, it is preferable to use something else such as SQS, whereby each container will only process x messages at a time (most of the time you'd want x=1). Each instance can then poll SQS for work, and won't take on more work whilst it is busy.
You get other benefits too, such as being able to see how long a message is taking to finish, and error-handling capabilities to account for timeouts or a server dying whilst it is doing work.
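To make that concrete, here is a minimal sketch of such a worker (the queue URL and do_work function are placeholders, assuming boto3):

```python
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"  # placeholder

def do_work(body: str) -> None:
    ...  # the hour-long task goes here

while True:
    # Long-poll for a single message; while this worker holds it, the message
    # is invisible to other workers, so a busy instance never takes extra work.
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20,
        VisibilityTimeout=4 * 3600,  # longer than the task's worst case
    )
    for msg in resp.get("Messages", []):
        do_work(msg["Body"])
        # Delete only on success; on failure the message becomes visible again for retry.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```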

AWS throttling for CodeCommit

I am getting the below error when doing git operations on a CodeCommit repository. The number of operations is in the range of tens over a few minutes - adding/removing/pulling files.
Is this because of AWS throttling or something else?
If so, what's the limit and how do I increase it in AWS?
"interim_desc": "RequestId: 12e27770db854bf0a6034cd6f851717d. 'git fetch origin --depth 20' returned with exit code 128.
error: RPC failed; HTTP 429 curl 22 The requested URL returned error: 429 Too Many Requests: The remote end hung up unexpectedly'"
Here is the manual on how to handle the 429 error when accessing CodeCommit:
Access error: “Rate Exceeded” or “429” message when connecting to a CodeCommit repository
https://docs.aws.amazon.com/codecommit/latest/userguide/troubleshooting-ae.html#troubleshooting-ae3
I will copy the most notable part here:
Implement jitter in requests, particularly in periodic polling requests.
If you have an application that is polling CodeCommit periodically and this application is running on multiple Amazon EC2 instances, introduce jitter (a random amount of delay) so that different Amazon EC2 instances do not poll at the same second. We recommend a random number from 0 to 59 seconds to evenly distribute polling mechanisms across a one-minute timeframe.
[...]
Request a CodeCommit service quota increase in the AWS Support Center.
To receive a service limit increase, you must confirm that you have already followed the suggestions offered here, including implementation of error retries or exponential backoff methods. In your request, you must also provide the AWS Region, AWS account, and timeframe affected by the throttling issues.
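To make the jitter suggestion concrete, here is a small sketch (poll_codecommit is a stand-in for whatever periodic fetch your instances run):

```python
import random
import time

POLL_INTERVAL = 60  # base polling period in seconds

def poll_codecommit() -> None:
    ...  # e.g. run "git fetch origin --depth 20" here

# Each instance picks its own 0-59 second offset once at startup, so a fleet
# of pollers spreads evenly across the minute instead of all hitting
# CodeCommit at the same second.
time.sleep(random.randint(0, 59))
while True:
    poll_codecommit()
    time.sleep(POLL_INTERVAL)
```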