How to prevent the Google Cloud Load Balancer from forwarding traffic to a newly created autoscaled instance before it is ready? - google-cloud-platform

I need to host a PHP Laravel application on Google Cloud Compute Engine with autoscaling and load balancing. I set up and configured the following:
Created an instance template with a startup script that installs Apache2 and PHP, clones my project's Git repository, configures the Cloud SQL Proxy, and applies all the other settings required to run the Laravel project.
Created an instance group with an autoscaling rule that creates additional instances when CPU usage reaches a certain percentage.
Created a Cloud SQL instance.
Created a Storage bucket; all public content in my application, such as images, is uploaded to the bucket and served from there.
Created a load balancer, assigned the public IP to it, and configured its frontend and backend correctly.
With the above configuration everything works fine: when an instance reaches the defined CPU percentage, autoscaling starts creating additional instances and the load balancer starts routing traffic to them.
The issue I'm having is that the startup script in my instance template takes about 20-30 minutes to configure the environment before a newly created instance is ready to serve content. But as soon as the load balancer detects that the new machine is up and running, it starts routing traffic to the new VM instance, even though it is not yet ready to serve content.
As a result, when the load balancer routes traffic to a machine that is not ready, I obviously get 404 and other errors.
How can I prevent this from happening? Is there a way for an instance created by autoscaling to signal the load balancer once it is ready to serve content, so that the load balancer only routes traffic to the newly created instance after that point?

How to prevent the Google Cloud Load Balancer from forwarding traffic to
a newly created autoscaled instance before it is ready?
The managed instance group's autoscaler uses the Cool Down period parameter to determine how long to wait for a new instance to come online and be 100% available. However, if your instance is not available within that time, errors will still be returned.
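For example, the cool down period is set on the managed instance group's autoscaling configuration (a sketch; the group name and values below are placeholders):

# Give new instances 300 seconds to initialize before they count
# toward autoscaling decisions.
gcloud compute instance-groups managed set-autoscaling laravel-group \
    --zone=us-central1-a \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.6 \
    --cool-down-period=300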
The above answers your question. However, taking 20 or 30 minutes for a new instance to come online defeats a lot of the benefits of autoscaling. You want instances to come online immediately.
Best practice is to create an instance, configure it with all the required software, applications, etc., and then create an image of that instance. In your template, specify this image as your baseline image. Your instances will then not have to wait for software downloads, installs, configuration, etc. All you need is a script that performs the final configuration, if any, to bring an instance online. Your goal should be 30-180 seconds from launch to being online and running for a new instance. Rethink / redesign anything that takes longer than 180 seconds. This will also save you money.
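As a sketch of that approach (the template, image, and script names below are placeholders, not from the question), the instance template references the prebaked image and keeps only a short final-configuration step:

# Template boots from a custom image with Apache2/PHP/Laravel already installed.
gcloud compute instance-templates create laravel-template \
    --image=laravel-baked-image \
    --image-project=my-project \
    --metadata=startup-script='#!/bin/bash
# Only light, instance-specific configuration here; all software is prebaked.
/opt/laravel/finalize-config.sh'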

John Hanley's answer is pretty good; I'm just completing it a bit.
You should take a look at Packer to create your preconfigured Google images; this will help when you need to add new configuration or apply updates.
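A minimal Packer template using the googlecompute builder might look like this (a sketch; the project ID, zone, and package list are placeholders):

{
  "builders": [{
    "type": "googlecompute",
    "project_id": "my-project",
    "source_image_family": "debian-9",
    "zone": "us-central1-a",
    "ssh_username": "packer",
    "image_name": "laravel-base-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "# Prebake the full stack here so instance startup stays fast",
      "sudo apt-get update",
      "sudo apt-get install -y apache2 php"
    ]
  }]
}

Running packer build against this produces a fresh image you can point your instance template at whenever the configuration changes.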
The cooldown period is a great tool, but in your case you can't really be sure the installation won't sometimes take longer, for example because of updates: since you should run apt-get update && apt-get upgrade at instance startup to stay up to date, startup will only take more and more time...
Load balancers should normally have a health check configured and should not route traffic unless the instance is detected as healthy. In your case, since you have Apache2 installed, I suppose you have a health check on port 80 or 443, depending on your configuration, on a /healthz path.
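For reference, such a health check could be created and attached like this (a sketch with placeholder names, assuming an HTTP(S) load balancer with a global backend service):

# Create an HTTP health check that probes /healthz on port 80.
gcloud compute health-checks create http laravel-hc \
    --port=80 \
    --request-path=/healthz

# Attach it to the backend service so the load balancer only routes
# traffic to instances that answer the probe with a 200.
gcloud compute backend-services update laravel-backend \
    --global \
    --health-checks=laravel-hc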
A way to use the health check correctly would be to create a specific vhost for the health check: you add a fake domain in the health check, say health.test, and create a vhost listening for health.test that returns a 200 response on the /healthz path.
This way you don't have to change your existing configuration; just enable the health vhost as the last step of the startup script, so the load balancer doesn't start routing traffic before the server is really up...
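A minimal sketch of such a vhost (health.test and the paths are placeholders; the fake domain would be set as the Host header on the health check, e.g. with gcloud's --host flag):

<VirtualHost *:80>
    ServerName health.test
    DocumentRoot /var/www/health

    # /var/www/health/healthz is a plain file containing "ok", so
    # GET /healthz returns 200 once this vhost is enabled.
    <Directory /var/www/health>
        Require all granted
    </Directory>
</VirtualHost>

Enabling it as the very last step of the startup script (a2ensite health && systemctl reload apache2) is what flips the instance to healthy.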

Related

Google Cloud Compute load balancing and auto scaling info NOT written for sysadmin type

I asked this on Server Fault but it was evidently too basic for them.
I have read through a ton of documents on Google Cloud Platform, but most of it is over my head; I am a developer, not a network person. I think what I am trying to do is pretty basic, but I can't find step-by-step instructions anywhere on how to accomplish the process. Google's documentation seems to assume a good deal of networking knowledge.
I have:
created a "managed instance group" with autoscaling turned on.
RDP'd into the server and installed the required software
uploaded all the code to run a site
set up DNS to point to that site
tested, and everything seems to work just as I would expect.
I need to set up a load balancer and change the DNS to point to that instead of the server.
My web app doesn't have a back end per se, as it is entirely API-driven, so I'm not sure what to do with the "backend configuration" part of setting up the load balancing service.
I have an SSL certificate on the server but don't know how to move it to the load balancer.
When the autoscaling kicks in, will all the software and code from the current server be used, or is there another step I need to take to make this happen? If I update code on the server via RDP, will the newly autoscale-created instances be aware of it?
Can anyone explain these steps or point me to a place NOT written for a sysadmin so I can try to understand them myself?
Here I am sharing a short YouTube video (under 5 minutes) with step-by-step instructions on how to quickly configure a load balancer in Google Cloud Platform with backend services.
I would also like to mention that SSL terminates at the load balancer. Here is the public documentation on Creating and Using SSL Certificates in load balancing.
Finally, you want to make sure that all the software and configuration you want on each instance is in place before you create the managed instance group; otherwise, changes you make on one server will not be reflected in the others.
To do this, configure your server with all the necessary software and settings. Once the server is in the correct state, create an image of it. You can then use this image to create an instance template, which you will use for the managed instance group.
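As a sketch of that flow (every name below is a placeholder):

# 1. Capture an image from the configured server's boot disk
#    (stop the instance first, or pass --force to image a running one).
gcloud compute images create webapp-image \
    --source-disk=webapp-server --source-disk-zone=us-central1-a

# 2. Create an instance template that boots from the image.
gcloud compute instance-templates create webapp-template --image=webapp-image

# 3. Point the managed instance group at the template.
gcloud compute instance-groups managed create webapp-group \
    --template=webapp-template --size=2 --zone=us-central1-a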

Nginx caches upstream node resolved by an elastic load balancer

I have an nginx AWS EC2 instance that forwards requests to an Elastic Load Balancer (ELB) backed by an Auto Scaling group (ASG). The instances behind the load balancer come in and out of existence depending on the load threshold, which is set to 80% in the Auto Scaling group properties. The problem is that this property cannot be changed, and the new instances spin up with new IPs, but nginx has cached the old IP it resolved earlier for the previous instance. We have to manually restart nginx every time this happens, and debugging takes time when I'm not around as well. We cannot use crontabs, as we need a robust, root-cause fix. Suggestions, anyone? Thanks in advance :)
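A commonly used workaround (a sketch under the assumption that nginx proxies to the ELB's DNS name; the resolver IP and hostname below are placeholders) is to force nginx to re-resolve the name instead of caching the IPs it looked up at startup:

server {
    listen 80;

    # Placeholder resolver; inside a VPC the Amazon DNS server sits at the
    # VPC network range base plus two (e.g. 10.0.0.2).
    resolver 10.0.0.2 valid=30s;

    location / {
        # proxy_pass with a variable makes nginx resolve the name at request
        # time, honoring the resolver's TTL, so new ELB IPs are picked up
        # without a restart.
        set $backend "my-elb-1234567890.us-east-1.elb.amazonaws.com";
        proxy_pass http://$backend;
    }
}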

How can I get useful load testing data for my AWS server?

I have a system set up on AWS where I have a set of EC2 instances (as an application server from an Elastic Beanstalk) running in an auto-scaling, load-balanced environment. All this works fine.
I would like to load test this instance to obtain results that help me figure out what more needs to be done for the system to handle, potentially, millions of users. So far I have used a tool called Locust (http://locust.io) to do this. It allows me to send requests to my instance(s?) through a proxy as desired. However, I cannot tell whether the requests are being routed to multiple instances or constantly to the same one; and if they are being load balanced appropriately, I can't see how many requests each of the EC2 instances receives or their health under load. (I have a feeling the requests are not being properly load balanced, as the failure rate always seems to increase drastically at a similar point in every test run.)
Is there a way to get this information from the AWS EC2 or Elastic Beanstalk consoles, or is there a better distributed web-based load testing tool that can provide the data I need?
There are two ways to get this information:
1) Create an S3 bucket and save the ELB access logs to it. You can filter these logs to check which instance served each request.
2) Retrieve application-level logs: if Apache/nginx is installed on your EC2 instances to serve the requests, filter the Apache/nginx logs on every machine.
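For point 1, enabling access logs on a Classic ELB can be done from the CLI (a sketch; the ELB and bucket names are placeholders, and the bucket needs a policy that lets ELB write to it):

# Publish access logs to S3 every 60 minutes.
aws elb modify-load-balancer-attributes \
    --load-balancer-name my-elb \
    --load-balancer-attributes "{\"AccessLog\":{\"Enabled\":true,\"S3BucketName\":\"my-elb-logs\",\"EmitInterval\":60}}"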
Hope it helps!
There is a way to get this data from the AWS console.
Inside the Elastic Beanstalk console there is a tab titled Health. This tab (in the enhanced health overview) shows the number of requests per second, the responses to those requests, the latency, the load average, and the CPU utilisation for each EC2 instance run by Elastic Beanstalk.
This data allows the system manager to see which back-end instances are receiving requests, and how many each is being sent through the load balancer and proxy.
This data can also be obtained from the Elastic Beanstalk CLI (EB CLI) using:
eb health environment_name

When AWS Elastic Beanstalk scales to another server, it seems to make it available before it is ready?

When my Java application is deployed to Tomcat on Elastic Beanstalk, it takes a while (11 minutes) because it has to copy large data files from S3 and unzip them. That is okay, because this is all done in .ebextensions and the instance doesn't report itself ready until it is completed.
However, I have it configured for autoscaling, and it seems that when it decides it needs to start a new instance, there is a period before the new instance has fully deployed during which Elastic Beanstalk directs some application requests to it; of course, because it is not ready, it returns a 503 error.
Surely all calls should only go to the original instance until the second one is ready? Has anyone else noticed this?
Whether requests are directed to the new instance or not is decided by the Elastic Load Balancer (ELB). Your autoscaled instances are behind the ELB, and the ELB performs periodic health checks on your EC2 instances to decide whether to route traffic to them. By default the health check is a TCP connect on port 80, so if the ELB can establish a connection to port 80 on the Tomcat server, it will start sending traffic to the instance even before it is actually "ready".
The solution is to use a custom HTTP health check instead of the default TCP check. Set up your web app to return a 200 OK on a special path, say '/health_ping', then configure the "Application Healthcheck URL" option to "/health_ping". You can do this using the following ebextension.
Create a file called .ebextensions/01-health-check.config in your app source with the following contents. Then deploy it to your environment.
option_settings:
  - namespace: aws:elasticbeanstalk:application
    option_name: Application Healthcheck URL
    value: /health_ping
Read more about this option setting in the Elastic Beanstalk documentation.
You can also configure this in the web console or using the AWS CLI.
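For the CLI route, something like this should work (a sketch; the environment name is a placeholder):

# Set the health check path on an existing environment.
aws elasticbeanstalk update-environment \
    --environment-name my-env \
    --option-settings Namespace=aws:elasticbeanstalk:application,OptionName="Application Healthcheck URL",Value=/health_ping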

Load balancer setup on Amazon Web Services

I have an application on a Windows Server EC2 instance with SQL Server for our database.
What I would like to do is add a load balancer so the application won't fail due to overload.
I have a couple of questions that I'm not certain about.
I believe that I need to create an image of my current instance and duplicate it. My problem is that my database lives on my current instance, so it would be duplicated as well.
Do I need another instance just for my database?
If yes, then it means I need a total of 3 instances: 2 for the application and 1 for the database.
In that case I need to change my application to connect to the new database instance instead of the current one.
After all that is done, I need to add a load balancer.
I hope I made myself clear.
I would recommend using RDS (http://aws.amazon.com/rds/) for this. That way you don't have to worry about the database server and can just host your application server on EC2 instances. Your AMI will then contain just the application server, so when you scale up you will be launching additional app servers only, not database servers.
Since you are deploying a .NET application, I would also recommend taking a look at Elastic Beanstalk (http://aws.amazon.com/elasticbeanstalk/), since it makes auto scaling much easier and your solution will scale up/down as well as self-heal.
As far as the load balancer is concerned, you can either manually update it with the new instances of your application server or let your auto scaling script do it for you. If you go with Elastic Beanstalk, it will take care of adding/removing instances to/from your Elastic Load Balancer on its own.
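For the manual route, registering a new instance with a Classic ELB looks like this (a sketch; the load balancer name and instance ID are placeholders):

# Add the newly launched app server to the load balancer's pool.
aws elb register-instances-with-load-balancer \
    --load-balancer-name my-elb \
    --instances i-0123456789abcdef0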