Monitoring HTTP 5xx errors in AWS load balancer - amazon-web-services

I need to monitor HTTP 5xx errors on an AWS ALB using CloudWatch. I need to know what the errors are and to set up alerts with CloudWatch.
I searched for a lot of methods but couldn't find one that works.
Can someone help me set up monitoring and alerting for 5xx errors on an ALB using CloudWatch/Lambda/Slack?

Related

Monitoring EKS Kubernetes LoadBalancer service Type

I have created a few services in Kubernetes with type: LoadBalancer.
Platform: EKS
Is there a way to get the number of 4xx or 5xx errors from this load balancer? I have tried the following:
Prometheus - There does not seem to be any metric collected for services with HTTP response codes.
AWS CloudWatch - Does not show data points for 2xx, 3xx, 4xx or 5xx responses. It shows other metrics like latency, request counts, etc.
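This should be solvable using CloudWatch metrics:
https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-cloudwatch-metrics.html
In particular, look at HTTPCode_ELB_5XX (5xx responses generated by the load balancer itself) and HTTPCode_Backend_5XX (5xx responses returned by the registered instances).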
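If it helps, here is a rough sketch of how an alarm on one of those two metrics could be described. It only builds the parameter dictionary for CloudWatch's PutMetricAlarm call. The load balancer name, SNS topic ARN, threshold and period are made-up placeholders, and the helper name build_5xx_alarm is my own, not something from AWS.

```python
def build_5xx_alarm(load_balancer_name, sns_topic_arn,
                    metric_name="HTTPCode_Backend_5XX", threshold=10):
    """Return keyword arguments for CloudWatch's PutMetricAlarm call.

    Targets a Classic Load Balancer (namespace AWS/ELB). The threshold and
    period are illustrative assumptions; tune them for your traffic.
    """
    return {
        "AlarmName": f"{load_balancer_name}-{metric_name}",
        "Namespace": "AWS/ELB",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "LoadBalancerName", "Value": load_balancer_name},
        ],
        "Statistic": "Sum",            # count of 5xx responses per period
        "Period": 300,                 # seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # These metrics are only reported when nonzero, so a missing data
        # point is treated here as "no errors".
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# With boto3 installed and credentials configured, the dictionary could be
# passed on like this (not executed here):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**build_5xx_alarm(...))
```

The SNS topic in AlarmActions is where email, Lambda or a Slack webhook forwarder would subscribe, assuming you route alerts that way.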

AWS Application ELB swallowing the 5XXs errors?

For the past week we have been using an "application" ELB for our applications. In the ELB monitoring we couldn't see any 5XX responses, even though there were many 5XXs in our application access logs.
Could it be a configuration error?
You are seeing 5xx responses in the application logs but not in the ELB metrics. A 5xx in the application logs was generated by the application behind the load balancer, not by the load balancer itself.
So it does not count as an error generated by the ELB (such as a 504).

AWS ELB Logs not showing 5xx errors in

I am seeing a strange issue with AWS ELB: I am getting High-Sum-HTTP-5XX from the ELB, but when I go to the logs I do not see any requests in the access log with 5XX errors.
Does the ELB access log not report 5XX errors? Where can I see which requests had 5XX errors? It would help me find the root cause. I do not see anything in my server logs either.
I'm speculating that you are running a CLB (Classic Load Balancer). Access log entries with HTTP 5xx errors should be analyzed using the elb_status_code and backend_status_code fields.
This could be off topic, but from AWS's documentation it looks like some of these HTTP messages cannot be parsed by the Classic Load Balancer and hence are not recorded in the access logs. This could happen if there is a reverse proxy in place on the instance that is sending an error the ELB doesn't understand. (I could see the 404 errors in the access logs.)
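To make the elb_status_code / backend_status_code check concrete, here is a minimal sketch that splits a Classic Load Balancer access log line into its documented fields and flags 5xx entries. The sample lines are invented for illustration, and parse_line and is_5xx are helper names I made up. It assumes well-formed lines; a user agent containing stray quotes could confuse shlex.

```python
import shlex

# Field order of a Classic Load Balancer access log entry.
FIELDS = [
    "timestamp", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time",
    "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes",
    "request", "user_agent", "ssl_cipher", "ssl_protocol",
]


def parse_line(line):
    """Split one access log line into a dict keyed by field name."""
    # shlex keeps the quoted "request" and "user_agent" fields together.
    return dict(zip(FIELDS, shlex.split(line)))


def is_5xx(entry):
    """True if either the ELB or the backend reported a 5xx status."""
    return any(entry.get(key, "").startswith("5")
               for key in ("elb_status_code", "backend_status_code"))


# Invented sample lines: one success, one 504 produced by the ELB itself.
ok_line = ('2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 '
           '10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 '
           '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -')
bad_line = ('2015-05-13T23:39:44.945958Z my-loadbalancer 192.168.131.39:2817 '
            '- -1 -1 -1 504 0 0 0 '
            '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -')

failures = [e for e in map(parse_line, (ok_line, bad_line)) if is_5xx(e)]
```

Comparing the two status fields on each failing entry should show whether the error came from the load balancer or from the instance behind it.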

Diagnosing occasional HTTP 5xx errors in Elastic Beanstalk and Elastic Load Balancer

My monitoring tab in Elastic Beanstalk is showing occasional HTTP 5xx errors, both from the EB instance and the ELB that performs its load balancing.
The trouble is that I generally only see these a few hours after they occur, and by the time I log into the EB instance the logs have rotated and I see no trace of the error.
What's the best way to record the request and response associated with these errors for later viewing?
The best and cheapest option is to set up a cron job on the EC2 instance that moves the logs to an AWS S3 bucket every 15 minutes or so. In other words, store the logs in S3 so you can analyze them whenever you want.
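A rough sketch of what such a cron-driven script might look like, under some assumptions: the bucket name, instance ID, key layout and script path are placeholders I invented, and the actual upload is injected as a callable so the logic can be tried without AWS access.

```python
import os
from datetime import datetime, timezone


def s3_key_for(path, instance_id, now, prefix="logs"):
    """Build a timestamped S3 key so later uploads never overwrite earlier ones."""
    return f"{prefix}/{instance_id}/{now:%Y/%m/%d/%H%M}/{os.path.basename(path)}"


def ship_logs(paths, bucket, instance_id, upload, now=None):
    """Upload each log file via upload(filename, bucket, key); return the keys used."""
    now = now or datetime.now(timezone.utc)
    keys = []
    for path in paths:
        key = s3_key_for(path, instance_id, now)
        upload(path, bucket, key)
        keys.append(key)
    return keys


# With boto3 installed, the S3 client's upload_file(Filename, Bucket, Key)
# could be passed as the upload callable (not executed here):
#   import boto3
#   ship_logs(["/var/log/httpd/error_log"], "my-log-bucket", "i-0123456789abcdef0",
#             boto3.client("s3").upload_file)
#
# Hypothetical crontab entry to run it every 15 minutes:
#   */15 * * * * /usr/bin/python3 /opt/ship_logs.py
```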
Here are some things I've found out in the past few weeks (I'll maybe edit into a more coherent answer later):
Consider the layering here: we've got ELB -> httpd -> Tomcat (in my example). I'd forgotten about httpd (Apache 2.2 at the moment).
You can enable ELB logging into an S3 bucket of your choice. This allows you to see the results returned to the client.
From there, trace through to httpd to see if there are any errors in /var/log/httpd
And then from there, trace through to the Tomcat logs to see if the same errors pop up there
I was seeing errors in ELB and httpd that weren't showing in Tomcat
I was also seeing a number of error messages similar to:
"proxy: error reading status line from remote server"
"(103)Software caused connection abort: proxy: pass request body failed"
Reading around, these may be caused by bugs in mod_proxy.

AWS CloudWatch Web Server Metrics

I have a few EC2 instances with NGINX installed using both ports 80 and 443. The instances are serving different applications so I'm not using an ELB.
I would like to create a CloudWatch alarm to make sure port 80 is always returning 200 HTTP status code. I realize there are several commercial solutions for this such as New Relic, etc, but this is the task I have at hand at the moment.
None of the EC2 metrics look to be able to accomplish this, and I cannot use any ELB metrics since I have no ELB.
What's the best way to resolve this?
You can definitely do this manually: send a request yourself and publish the result to CloudWatch as a custom metric, then set an alarm on that metric.
Or you could look into Route53 health checks. You might get away with just configuring a health check there if you are already using Route53:
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover.html
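For the manual approach, a minimal sketch could look like the following. It probes a URL with the standard library and builds the payload for CloudWatch's PutMetricData call. The namespace Custom/WebServer, the metric name Http200 and the Host dimension are names I made up; pick whatever fits your setup.

```python
import urllib.error
import urllib.request


def probe_status(url, timeout=5.0):
    """Return the HTTP status code for url, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError but still carry a status.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, etc.
        return 0


def build_metric_data(host, status):
    """Return keyword arguments for CloudWatch's PutMetricData call."""
    return {
        "Namespace": "Custom/WebServer",          # assumed custom namespace
        "MetricData": [{
            "MetricName": "Http200",              # 1 when healthy, 0 otherwise
            "Dimensions": [{"Name": "Host", "Value": host}],
            "Value": 1.0 if status == 200 else 0.0,
            "Unit": "Count",
        }],
    }


# With boto3 installed, a cron job could publish the result (not executed here):
#   import boto3
#   payload = build_metric_data("web-1", probe_status("http://localhost:80/"))
#   boto3.client("cloudwatch").put_metric_data(**payload)
```

An alarm on this metric would probably want to fire when the value drops below 1 and to treat missing data as breaching, so that a dead probe script is also noticed.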
Create a Route53 Health Check. Supported protocols are TCP, HTTP, and HTTPS.
The HTTP/S protocol supports matching the response payload against a user-defined string so you can not only react to connectivity problems but also to unexpected content being returned to users.
For more advanced monitoring, enable latency metrics, which collect TTFB (time to first byte) and SSL handshake times.
You can then create alarms to get alerts when one of your apps becomes inaccessible.
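As an illustration of the options mentioned above, here is a sketch of the request body for Route 53's CreateHealthCheck call with string matching and latency measurement turned on. The IP address, path and search string are placeholders, and build_health_check is a helper name of my own.

```python
import uuid


def build_health_check(ip_address, search_string, port=80, path="/"):
    """Return keyword arguments for Route 53's CreateHealthCheck call."""
    return {
        # Must be unique per request; a random UUID is one simple choice.
        "CallerReference": str(uuid.uuid4()),
        "HealthCheckConfig": {
            "IPAddress": ip_address,
            "Port": port,
            "Type": "HTTP_STR_MATCH",      # HTTP check plus body string match
            "ResourcePath": path,
            "SearchString": search_string,
            "RequestInterval": 30,         # seconds between checks
            "FailureThreshold": 3,         # consecutive failures before unhealthy
            "MeasureLatency": True,        # collect TTFB and handshake timings
        },
    }


# With boto3 installed (not executed here):
#   import boto3
#   boto3.client("route53").create_health_check(
#       **build_health_check("203.0.113.10", "Welcome"))
```

For port 443 the Type would presumably be HTTPS_STR_MATCH instead.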