GCP load balancer health check not working properly - google-cloud-platform

I have a load balancer set up that routes traffic to multiple Cloud Storage buckets and, as a backend, an instance group. The buckets seem to work fine, but I just can't get the instance group to work. The instance works fine when I use its public IP, but it won't work through the load balancer.
This is my second time setting up the exact same deployment, so I'm not entirely sure where I went wrong. I looked at the troubleshooting documentation, and it looks like something is wrong with my health check.
I have configured a health check which should work, though: HTTP on port 80 with path /. My server returns a 200 response code for that, but for some reason the load balancer page shows the "healthy" column as 0/0, and the backend service page shows 0 of 1 instances healthy.
I even tried adding firewall rules for the health check, but still no luck.
Then I tried to get the health status using Cloud Shell, and I get an empty status: not even failed, just empty. Below is the result I got from Cloud Shell:
backend: https://www.googleapis.com/compute/v1/projects/project-name/zones/asia-southeast1-b/instanceGroups/prod-instance-group
status:
  kind: compute#backendServiceGroupHealth
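The empty status above can also be detected programmatically. This is only a minimal sketch, assuming the get-health result has been parsed into a dict of the same shape (for example from JSON output) and that per-instance results, when present, appear under `healthStatus` with a `healthState` field as in the Compute Engine API. `summarize_group_health` is a hypothetical helper name, not part of any Google library.

```python
def summarize_group_health(group_health):
    """Summarize one backend entry from a get-health result parsed to a dict."""
    status = group_health.get("status") or {}
    # When no per-instance results are reported, "healthStatus" is absent.
    entries = status.get("healthStatus") or []
    if not entries:
        return "no health data reported"
    healthy = sum(1 for e in entries if e.get("healthState") == "HEALTHY")
    return f"{healthy} of {len(entries)} healthy"


# The result from the question: a status holding only "kind", with no entries.
empty = {
    "backend": "https://www.googleapis.com/compute/v1/projects/project-name/"
               "zones/asia-southeast1-b/instanceGroups/prod-instance-group",
    "status": {"kind": "compute#backendServiceGroupHealth"},
}
print(summarize_group_health(empty))  # no health data reported
```

An empty result like this is different from an UNHEALTHY one: no probe result is being reported for the instance at all.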

This got resolved overnight. I guess there was some issue with GCP itself, I didn't change a thing and now it shows Healthy 1/1 and works perfectly.
I guess GCP load balancer is just unstable.
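Before blaming the platform, it may be worth ruling out the firewall rules mentioned in the question. Google documents 130.211.0.0/22 and 35.191.0.0/16 as the source ranges for load balancer health check probes, so a rough sketch like the one below (the helper name `is_health_check_probe` is made up for illustration) can tell you whether the client IPs in your web server's access log include any probes at all.

```python
import ipaddress

# Source ranges Google documents for load balancer health check probes.
HEALTH_CHECK_RANGES = [
    ipaddress.ip_network("130.211.0.0/22"),
    ipaddress.ip_network("35.191.0.0/16"),
]


def is_health_check_probe(ip):
    """Return True if the address falls inside a health check source range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in HEALTH_CHECK_RANGES)


print(is_health_check_probe("35.191.1.2"))  # True
print(is_health_check_probe("8.8.8.8"))     # False
```

If no requests from these ranges show up in the log, the probes are probably being blocked before they reach the server.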

Related

I get 404 error when doing a synchronous request to AWS Gateway with Load Balancer

I created a REST API in AWS API Gateway and it worked perfectly when it was targeting a single EC2 instance. I then set it up with an EC2 load balancer in front of a target group with 2 EC2 instances. Now when I make a request and then synchronously ask for its status, I get a 404 error. My guess is that the initial job was posted on one machine and I then try to access it on the other machine, which yields the 404. I tried to enable stickiness on the target group, but that did nothing. Any suggestions?
Stickiness config
I would suggest checking the logs on your EC2 instances to see exactly which request is routed from the LB to each EC2 machine. In my experience the LB calls the EC2 instances using their internal IP addresses, and the URL might be modified depending on configuration.
Checking the logs will help you debug this error. Enabling stickiness was the right move.
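The asker's guess (job created on one machine, status requested from the other) can be illustrated with a toy model. This is plain Python, not AWS code; the `Instance` class, the `run` helper and the job id are all made up, and the "load balancer" is just strict alternation between two targets.

```python
from itertools import cycle


class Instance:
    """Toy backend that keeps job state in a dict (per-instance by default)."""

    def __init__(self, name, store=None):
        self.name = name
        self.jobs = {} if store is None else store

    def post_job(self, job_id):
        self.jobs[job_id] = "running"
        return 202

    def get_status(self, job_id):
        return 200 if job_id in self.jobs else 404


def run(instances):
    """Send the POST to one target and the follow-up GET to the next one."""
    targets = cycle(instances)
    post_code = next(targets).post_job("job-1")
    get_code = next(targets).get_status("job-1")
    return post_code, get_code


# Each instance has its own in-memory job table.
local = [Instance("a"), Instance("b")]
# Both instances read and write one shared job table.
shared_store = {}
shared = [Instance("a", shared_store), Instance("b", shared_store)]

print(run(local))   # (202, 404)
print(run(shared))  # (202, 200)
```

If the instance logs show the POST and the GET landing on different machines, either stickiness is not taking effect for those requests or the job state would need to live somewhere both machines can read.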

How to change AWS ELB status to InService?

A WordPress application is deployed in AWS Elastic Beanstalk behind a load balancer. I sometimes see ELB 5XX errors. So that an instance is marked OutOfService only after a higher number of failed checks, I set the Unhealthy Threshold to 10. But sometimes the health check still fails and health is Severe. I sometimes get the error "% of the requests to the ELB are failing with HTTP 5xx". I checked the ELB access logs: sometimes requests get a timeout (504) error, and after a number of consecutive 504s the ELB marks the instance OutOfService. I am trying to find out which request is failing.
What I don't know is whether it is possible to bring the instance back to "InService" as quickly as possible, because sometimes the instance is OutOfService for 2-3 hours, which is really bad. Is there a good way to handle this situation? It looks like once the instance is out of service, there is nothing I can do. I am relatively new to AWS. Please help.
To solve this issue:
1) HTTP 504 means timeout. The resource that the load balancer is accessing on your backend is failing to respond. Determine the path used for the healthcheck from the AWS console.
2) In your browser verify that you can access the healthcheck path going around the load balancer. This may mean temporarily assigning an EIP to the EC2 instance. If the load balancer healthcheck is "/test/myhealthpage.php" then use "http://REPLACE_WITH_EIP/test/myhealthpage.php". For HTTPS listeners use https in your path.
3) Debug why the path that you specified is timing out and fix it.
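Steps 2 and 3 can also be scripted instead of done in a browser. The sketch below is one possible way to do it using only the standard library. `probe` is a hypothetical helper, the 5 second timeout is an assumption you should match to your own health check settings, and the URL in the usage note is the placeholder from step 2. It does not follow redirects, so it reports the first status code the server sends.

```python
import http.client
import time
from urllib.parse import urlsplit


def probe(url, timeout=5.0):
    """Request url once; return (status or None on failure, seconds elapsed)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    start = time.monotonic()
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Timeouts and refused connections are OSError subclasses.
        status = None
    finally:
        conn.close()
    return status, time.monotonic() - start
```

Calling `probe("http://REPLACE_WITH_EIP/test/myhealthpage.php")` a few times in a row should show whether the page answers slowly or not at all; a status of None with an elapsed time near the timeout points to the same kind of hang the load balancer reports as 504.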
Note: Healthcheck paths should not be to pages that do complicated tests or operations. A healthcheck should be a quick and simple GO / NO GO type of page.
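To illustrate the note, here is a rough sketch of a GO / NO GO endpoint built on Python's standard library. Your WordPress site would use a small PHP page instead; this only shows the idea. The `/healthz` path, the `make_server` helper and the default port are assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATH = "/healthz"  # hypothetical path; use whatever your health check is set to


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == HEALTH_PATH:
            # No database queries or other slow work: answer immediately.
            body = b"OK"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def make_server(port=8080, host="0.0.0.0"):
    """Build (but do not start) a server exposing only the health endpoint."""
    return HTTPServer((host, port), HealthHandler)
```

Calling `make_server().serve_forever()` would start it. The point is that the handler does nothing that could time out, so a failing check means the server itself is in trouble.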

http 502 errors when new instance is being created in a group

We are using cross region load balancing. When we get heavy traffic all at once, within 1 region, it begins to spin up new instances. While it is starting new instances, we get random HTTP 502 errors. Screenshots of configurations below. Is there any way to avoid the 502 errors while it is scaling up?
Image links of configuration below.
Instance Group Configuration (same setting on all regions)
Load Balancer
Thanks in advance for the help!
The HTTP load balancer and the instances will have different external IPs.
1) Try accessing one instance through its external IP first to make sure the backend works. If it doesn't work, it is usually a firewall settings problem.
2) An HTTP 502 from the load balancer usually indicates that the load balancer's health check considers the backend unhealthy, so check your health check config next.
See another similar question: Google Load-balancer randomly failing requests to backend

Instance status is OutOfService in Load balancer

I have created a load balancer in Amazon AWS. I created the load balancer in order to set up SSL on a server which already had another domain with SSL. The load balancer was working fine until today, but a short while ago I noticed that the status of the instance has changed to OutOfService.
I'm new to AWS and couldn't find what is going wrong.
My health check is set as
Please help out.
Here is my checklist to troubleshoot this type of issue:
Is the Security Group of your instance OK? The ELB needs to have access to your instance for the health check.
Is your web / app server running correctly on the instance? Does it accept connection requests?
Is the HTTP return code of your health check URL 200? If your healthcheck URL returns anything else (a 30x redirect for example), the ELB will consider your instance unhealthy. You can check this with curl -I on Linux instances.
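The second item on the checklist (does the server accept connection requests?) can be tested without involving HTTP at all. This is only a sketch; `port_accepts_connections` is a made-up helper and the 3 second timeout is an arbitrary choice.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Run from a machine that is subject to the same security group rules as the ELB, something like `port_accepts_connections("10.0.0.5", 80)` (the address is a placeholder) helps separate "server not listening" or "traffic blocked" from "server answers with the wrong status code".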
HTH
--Seb

Why does Elastic Load Balancing report 'Out of Service'?

I am trying to set up Elastic Load Balancing (ELB) in AWS to split the requests between multiple instances. I have created several images of my webserver based on the same AMI, and I am able to ssh into each individually and access the site via each distinct public DNS.
I have added each of my instances to the load balancer, but they all come back with the Status: Out of Service because they failed the health check. I'm mostly confused because I can access each instance from its public DNS, but I get a timeout whenever I visit the load balancer DNS name.
I've been trying to read through all the docs and googling it, but I'm stuck. Any pointers or links in the right direction would be greatly appreciated.
I contacted AWS support about this same issue. Apparently their system doesn't know how to handle cases where all of the instances behind the ELB are stopped for an extended amount of time. AWS support can manually refresh the statuses if you need them up immediately.
The suggested fix is to de-register the EC2 instances from the ELB instead of just stopping them, and to re-register them when you start them again.
By default, the health check is made by requesting index.html on each instance registered with the load balancer. If you don't have an index.html in the document root of the instance, the default health check will fail. You can set a custom protocol, port and path for the health check when creating the elastic load balancer.
Finally I got this working. The issue was with the Amazon Security Groups: I had restricted access to port 80 to a few machines in my development area, so the load balancer could not access the Apache server on the instance. Once the load balancer gained access to my instance, it became In Service.
I checked it with tail -f /var/log/apache2/access.log in my instance, to verify if the load balancer was trying to access my server, and to see the answer the server is giving to the load balancer.
Hope this helps.
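If watching the log with tail gets tedious, the same check can be roughly automated. The sketch below assumes Apache's "combined" log format and assumes the load balancer identifies itself with a User-Agent starting with "ELB-HealthChecker" (check your own log to confirm both). The sample lines are invented for illustration and `tally_health_checks` is a hypothetical helper.

```python
import re
from collections import Counter

# Request line, status, size, referer and user agent of a "combined" log entry.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def tally_health_checks(lines, agent_prefix="ELB-HealthChecker"):
    """Count (path, status) pairs for requests sent by the health checker."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if m and m.group("agent").startswith(agent_prefix):
            counts[(m.group("path"), int(m.group("status")))] += 1
    return counts


# Invented sample lines, not taken from a real server.
sample = [
    '10.0.0.12 - - [01/Jan/2020:00:00:00 +0000] "GET /index.html HTTP/1.1" 401 195 "-" "ELB-HealthChecker/1.0"',
    '10.0.0.12 - - [01/Jan/2020:00:00:30 +0000] "GET /index.html HTTP/1.1" 401 195 "-" "ELB-HealthChecker/1.0"',
    '203.0.113.9 - - [01/Jan/2020:00:00:31 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
print(dict(tally_health_checks(sample)))  # {('/index.html', 401): 2}
```

An empty tally suggests the load balancer is not reaching the server at all (security groups), while a tally full of non-200 codes points at the health check path or at authentication.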
If your web server is running fine, then the health check is hitting a URL that doesn't return 200.
A trick that works for me: go on the instance and type curl localhost:80/pathofyourhealthcheckurl
After that, you can adapt your health check URL so that it always gets a 200 response.
In my case, the rules on security groups assigned to the instance and the load balancer were not allowing traffic to pass between the two. This caused the health check to fail.
I too faced the same issue. I changed the Ping Protocol from HTTPS to SSL and it worked!
Go to Health Check -> click on Edit Health Check -> change Ping Protocol from HTTPS to SSL
Ping Target SSL:443
Timeout 5 seconds
Interval 30 seconds
Unhealthy Threshold 5
Healthy Threshold 10
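As a back-of-the-envelope check on what these settings imply: the thresholds count consecutive checks and checks run once per interval, so the time to change state is roughly threshold times interval. This is only an approximation that ignores the timeout and when the first check happens to land, and `seconds_to_transition` is a made-up helper.

```python
def seconds_to_transition(interval_s, threshold):
    """Rough time for `threshold` consecutive checks spaced `interval_s` apart."""
    return interval_s * threshold


INTERVAL_S = 30
print(seconds_to_transition(INTERVAL_S, 5))   # 150  (about 2.5 min to go OutOfService)
print(seconds_to_transition(INTERVAL_S, 10))  # 300  (about 5 min to return InService)
```

So with the values above, an instance that recovers still needs about five minutes of passing checks before it is back in service; a lower Healthy Threshold or a shorter Interval shortens that.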
For anyone else that sees this thread as this isn't listed:
Check that the health check is checking the port that the responding server is listening on.
E.g. node.js running on port 3000 -> Point healthcheck to port 3000;
Not port 80 or 443. Those are what your ALB will be using.
I spent a morning on this. Yes.
I would like to offer a general way to solve this problem. Once you have set up your web server (Apache or nginx, for example), read the access log file to see what happened. In my case it reported a 401 error because I had added basic auth in nginx. Of course, just as #ivankoni pointed out, it may also be because the document you are checking does not exist.
I was working on the AWS Tutorial on hosting a web app and ran into this problem. Step 7b states the following:
"Set Ping Path to /. This sends queries to your default page, whether
it is named index.html or something else."
They could have put the forward slash in quotation marks, like this: "/". Make sure you have "/" in your health check and not "/." (with the trailing period).
Adding this because I've spent hours trying to figure it out...
If you configured your health check endpoint but it still says Out of Service, it might be because your server is redirecting the request (i.e. returning a 301 or 302 response).
For example, if your endpoint is supposed to be /app/health/ but you only enter /app/health (no trailing slash) into the health check endpoint field on your ELB, you will not get a 200 response, so the health check will fail.
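A quick way to spot this is to request the path without following redirects and look at the status and the Location header. Here is a minimal sketch with the standard library. `first_response` is a hypothetical helper, and it assumes plain HTTP on the host and port you pass in.

```python
import http.client


def first_response(host, port, path, timeout=5.0):
    """Return (status, Location header or None) without following redirects."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

If `first_response("localhost", 80, "/app/health")` comes back with a 301 or 302 and a Location ending in `/app/health/`, then the value in the Location header is what belongs in the health check field.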
I had a similar issue. The problem appears to have been caused by my using an HTTP health check while also using .htaccess to password-protect the site.
I got the same error. In my case I had to copy the particular HTML file from the S3 bucket to the "/var/www/html" location. This is the same HTML file referenced in the load balancer's health check path.
The issue was resolved after copying the HTML file.
I had this issue too, and it was due to both the inbound and outbound rules of the Load Balancer's Security Group only allowing HTTP traffic on port 80. I needed to add another rule for HTTPS traffic on port 443.
I was also facing the same issue,
where the ELB (Classic Load Balancer) tries to request /index.html rather than / (root) for the health check.
If it is unable to find the /index.html resource, it reports 'OutOfService'. Make sure index.html is available.