AWS Auto Scaling brings instances into service before they are ready because of a user data script

I have an Auto Scaling group that works great, with a launch configuration where I defined a user data script that is executed when a new instance launches.
The user data script updates the codebase and generates caches, which takes some seconds. But as soon as the instance is "created" (and not yet "ready"), Auto Scaling adds it to the load balancer.
That's a problem because while the user data script is executing, the instance does not answer with a good response (basically, 500 errors are thrown).
I would like to avoid that. Of course, I saw this documentation: http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/InstallingAdd
As with a standalone EC2 instance, you have the option of configuring instances launched into an Auto Scaling group using user data. For example, you can specify a configuration script using the User data field in the AWS Management Console, or the --userdata parameter in the AWS CLI.
If you have software that can't be installed using a configuration script, or if you need to modify software manually before Auto Scaling adds the instance to the group, add a lifecycle hook to your Auto Scaling group that notifies you when the Auto Scaling group launches an instance. This hook keeps the instance in the Pending:Wait state while you install and configure the additional software.
Looks like I'm not in this case. Also, managing the Pending:Wait hook from the user data script seems complicated. There must be a simple solution to fix my problem.
Thank you for your help!

EC2 instance user data does not use a lifecycle hook on its own, so nothing stops a newly launched instance from being brought into service before the script has finished executing.
Stopping your web server at the start of your user data script sounds a little unreliable to me, and therefore I would urge you to utilize the features AutoScaling provides that were designed to solve this very problem.
I have two suggestions:
Option 1:
Using lifecycle hooks isn't at all complicated once you read through the docs. In your user data, you can easily use the CLI to control the hook; check this out. In fact, a hook can be controlled from any supported language or scripting language (see the boto3 sketch below).
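For illustration, here is a minimal sketch in Python with boto3 of completing the hook from the instance itself at the end of provisioning. The group name and hook name are placeholders, and it assumes the metadata service is reachable without a token (IMDSv1); adapt to your setup.

    import urllib.request
    import boto3

    ASG_NAME = "my-asg"             # hypothetical Auto Scaling group name
    HOOK_NAME = "launch-wait-hook"  # hypothetical lifecycle hook name

    # The instance can discover its own ID from the instance metadata service.
    instance_id = urllib.request.urlopen(
        "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
    ).read().decode()

    # Tell Auto Scaling the instance is ready; it then leaves Pending:Wait
    # and moves to InService, at which point the load balancer gets it.
    boto3.client("autoscaling").complete_lifecycle_action(
        LifecycleHookName=HOOK_NAME,
        AutoScalingGroupName=ASG_NAME,
        LifecycleActionResult="CONTINUE",
        InstanceId=instance_id,
    )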
Option 2:
If manually taking care of lifecycle hooks doesn't appeal to you, then I would recommend scrapping your user data script and working around it with AWS CodeDeploy. You could have CodeDeploy deploy nothing (e.g. an empty S3 folder) and use the deployment hook scripts to replace your user data script. CodeDeploy integrates with Auto Scaling seamlessly and handles lifecycle hooks automatically: a newly launched instance won't be brought into service until a deployment has succeeded. Read the docs here and here for more info.
However, I would urge you to go with option 1. Lifecycle hooks were designed to solve the very problem you have. They're powerful, robust, awesome and free. Use them.

@Brooks said the easiest way to "wait" before the ELB serves the instance is to work with the ELB health status.
I solved my problem by shutting down the HTTP server at the start of the user data script, so the ELB can't get a green health status and does not send clients to the instance. I restart the HTTP server at the end of the script; the health check passes again, so the ELB serves it.
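As a rough sketch of that approach (cloud-init runs user data via its shebang, so Python works here too), assuming a systemd-managed nginx and a hypothetical build script:

    #!/usr/bin/env python3
    import subprocess

    # Stop the web server first so the ELB health check fails and no traffic arrives.
    subprocess.run(["systemctl", "stop", "nginx"], check=True)

    # Update the codebase and warm the cache (placeholder for the real work).
    subprocess.run(["/opt/app/update_code_and_cache.sh"], check=True)

    # Start the web server again; the health check goes green and the ELB serves us.
    subprocess.run(["systemctl", "start", "nginx"], check=True)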


Configure cloud-based vscode ide on AWS

CONTEXT:
We have a platform where users can create their own projects - multiple projects per user. We need to provide them with a browser-based IDE to edit those projects.
We decided to go with code-server. For this we need to configure an auto-scalable cluster on AWS. When the user clicks "Edit Project" we will bring up a new container each time.
https://hub.docker.com/r/codercom/code-server
QUESTION:
How to pass parameters from the url query (my-site.com/edit?project=1234) into a startup script to pre-configure the workspace in a docker container when it starts?
Let's say the stack is AWS + ECS + Fargate. We could use kubernetes instead of ECS if it helps.
I don't have any experience in cluster configuration. Will appreciate any help or at least a direction where to dig further.
The above can be achieved in multiple ways with AWS ECS. The basic requirement for such a system is to launch and terminate containers on the fly while persisting file changes. (I will focus on launching the containers.)
Using AWS SDKs:
The task can be easily achieved using the AWS SDKs with a base task definition, since the SDKs allow starting tasks with overrides on the base task definition.
E.g., if the task definition has a memory limit of 2 GB, the SDK can override the memory to a parameterised value while launching a task from the task definition.
Refer to the boto3 (AWS SDK for Python) docs.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ecs.html#ECS.Client.run_task
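A minimal sketch of that call, assuming a Fargate cluster, a base task definition, and a container named code-server that reads the project ID from an environment variable (all names here are placeholders):

    import boto3

    ecs = boto3.client("ecs")

    # Launch a one-off task from the base task definition, overriding an
    # environment variable with the project ID from the URL query (?project=1234).
    response = ecs.run_task(
        cluster="code-server-cluster",        # hypothetical cluster name
        taskDefinition="code-server-base:1",  # hypothetical base task definition
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],  # placeholder subnet
                "assignPublicIp": "ENABLED",
            }
        },
        overrides={
            "containerOverrides": [
                {
                    "name": "code-server",
                    "environment": [{"name": "PROJECT_ID", "value": "1234"}],
                }
            ]
        },
    )
    task_arn = response["tasks"][0]["taskArn"]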
Overall Solution
Now that we know how to run custom tasks with the Python SDK on demand, the overall flow for your application is: your API calls an AWS Lambda function with parameters, the function spins up a task, keeps checking its status, and routes traffic to it once the status is healthy.
1. The API calls an AWS Lambda function with parameters.
2. The Lambda function uses the AWS SDK to create a new task with overrides from the base task definition (assuming the base task definition already exists).
3. Keep checking the status of the new task in the same function call and set a flag in your database so your front end can react to it (see the polling sketch below).
4. Once the status is healthy, you can add a rule to the application load balancer using the AWS SDK to route traffic to the IP without exposing the IP address to the end client. (An AWS application load balancer can get expensive; I'd advise using Nginx or HAProxy on EC2 to manage dynamic routing.)
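A minimal sketch of step 3, reusing the cluster name and task ARN from the run_task sketch above; boto3 ships a tasks_running waiter that polls DescribeTasks for you:

    import boto3

    ecs = boto3.client("ecs")
    task_arn = "arn:aws:ecs:us-east-1:123456789012:task/example"  # from run_task

    # Poll DescribeTasks until the task reaches RUNNING (or the waiter times out).
    ecs.get_waiter("tasks_running").wait(
        cluster="code-server-cluster",
        tasks=[task_arn],
        WaiterConfig={"Delay": 6, "MaxAttempts": 100},
    )

    # For an awsvpc/Fargate task, the private IP is in the ENI attachment details;
    # that is the address you would register with your load balancer or proxy.
    desc = ecs.describe_tasks(cluster="code-server-cluster", tasks=[task_arn])
    details = desc["tasks"][0]["attachments"][0]["details"]
    private_ip = next(d["value"] for d in details if d["name"] == "privateIPv4Address")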
Note:
Ensure your image is lightweight and that startup takes less than 15 minutes, as a Lambda function cannot execute beyond that. If it does, create a microservice for launching ad-hoc containers and host it on EC2 instead.
Using Terraform:
If you're looking for infrastructure provisioning, Terraform is the way to go. It has a learning curve, so I recommend it as a secondary option.
Terraform is popular for parameterising with variables, and it can easily be plugged in as a backend for an API. The flow of your application remains the same from step 1, but instead of AWS Lambda, the API calls your ad-hoc container microservice, which in turn runs the Terraform script, passing variables to it.
Refer to the Terraform docs for AWS:
https://registry.terraform.io/providers/hashicorp/aws/latest

How does the AWS EC2 Auto Scaling synchronisation work automatically?

We started our WordPress blog some time ago with only a single EC2 instance and a Multi-AZ RDS database.
The traffic increased with heavy ups and downs (up to 1,500 users per minute), so we decided to use EC2 Auto Scaling. Here is our problem: every time we change some code, we have to create a new AMI for the Auto Scaling group and terminate all instances so that new instances start with the new AMI data.
Is there an easy way to synchronize all instances automatically when changing some code on one of them? Perhaps OpsWorks could do that, but I have no experience with it. I have already searched for a couple of days for a tutorial, but could not find anything helpful.
You could configure your AMI to download the latest code on startup, so that you don't have to constantly update the AMI; a rough sketch follows.
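For example, a boot script baked into the AMI might pull the latest code before the web server starts serving; the repository directory and the Apache service name here are placeholders:

    #!/usr/bin/env python3
    import subprocess

    REPO_DIR = "/var/www/html"  # hypothetical document root with a git checkout

    # Fast-forward the checkout to the latest code, then reload the web server.
    subprocess.run(["git", "-C", REPO_DIR, "pull", "--ff-only"], check=True)
    subprocess.run(["systemctl", "reload", "httpd"], check=True)  # Apache on Amazon Linux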
Or you could just use Elastic Beanstalk and let it manage all this stuff for you.
If you want an easy way to deploy changes to instances in your Auto Scaling group, I would recommend using CodeDeploy.
CodeDeploy integrates nicely with Auto Scaling. If a scale-up event occurs, it will start a deployment to the newly launched instance and won't bring that instance into service in the Auto Scaling group until the deployment has finished.
The deployments can be as simple as changing the code, or they can involve more, thanks to CodeDeploy's deployment hooks.
Also, you can have CodeDeploy grab your code from S3, GitHub or CodeCommit.
CodeDeploy is pretty easy to set up and the documentation is great:
Docs: Auto Scaling integration

How to deploy to autoscaling group with only one active node without downtime

There are two questions about AWS autoscaling + deployment which I cannot clearly answer:
I'm currently trying to figure out what the best strategy is to deploy, without downtime, to an EC2 instance behind an ELB that is the only member of an Auto Scaling group.
For now, the EC2 setup is done with Puppet, including the deployment of the application, triggered after a successful build by Jenkins.
The best solution I have found is to check per script how many instances are registered at the ELB. If a single one is registered, spawn a new one that runs Puppet on startup (so the new node is up to date), and kill the old node.
How to deploy (autoscaling EC2 behind an ELB) without delivering two different versions of the application?
Possible solution: check per script how many EC2 instances are registered to the ELB, spawn the same number of instances, register all the new ones and unregister all the old ones. A boto3 sketch of this idea follows.
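Since the instances live in an Auto Scaling group, one hedged way to realise this is to let the group itself do the ELB registration: double the desired capacity, then retire the old instances. The group name is a placeholder, and this assumes MaxSize allows the temporary doubling:

    import boto3

    autoscaling = boto3.client("autoscaling")

    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=["my-asg"]  # hypothetical group name
    )["AutoScalingGroups"][0]
    old_ids = [i["InstanceId"] for i in group["Instances"]]

    # Double the capacity so fresh instances (provisioned by Puppet on startup)
    # come up alongside the old ones and register with the ELB.
    autoscaling.set_desired_capacity(
        AutoScalingGroupName="my-asg",
        DesiredCapacity=len(old_ids) * 2,
        HonorCooldown=False,
    )

    # A real script would wait here until the new instances are InService.
    # Then retire each old instance and let the group shrink back to size.
    for instance_id in old_ids:
        autoscaling.terminate_instance_in_auto_scaling_group(
            InstanceId=instance_id, ShouldDecrementDesiredCapacity=True
        )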
My experience with AWS teaches me that AWS has a service for everything. So are there any services out there that accomplish my requirements, making my solutions unnecessary?
You can create an entirely new environment with its own ELB and when it's ready and checked, you switch the DNS record to the new ELB.
Anyway for a brief time (60 seconds or so, depending on the TTL of your DNS record) some users will see your old version while some others will see the new version.
In the end there were two possible solutions. Both of them would temporarily deliver two versions of the app.
Use AWS CodeDeploy to perform a sequential deployment (one instance after another). This solution offers the possibility to roll back to a previous state, and it visually shows the state and results of the deployment.
Create a Python script to get the registered nodes (using boto) and run the appropriate Puppet script on them (using Fabric); a sketch follows. This solution offers more control over the deployment, but requires some time to build these scripts. Also, there can be bugs.
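A minimal sketch of the first half of that script, written with boto3 (the successor to the boto library mentioned here); the ELB name is a placeholder, and the resulting host list would then be fed to Fabric:

    import boto3

    elb = boto3.client("elb")  # classic ELB API
    ec2 = boto3.client("ec2")

    # Find the instances currently registered and healthy behind the ELB.
    health = elb.describe_instance_health(LoadBalancerName="my-elb")
    instance_ids = [s["InstanceId"] for s in health["InstanceStates"]
                    if s["State"] == "InService"]

    # Resolve their addresses so Fabric can SSH in and run the Puppet script.
    reservations = ec2.describe_instances(InstanceIds=instance_ids)["Reservations"]
    hosts = [i["PublicDnsName"] for r in reservations for i in r["Instances"]]
    print(hosts)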
For now I chose AWS CodeDeploy because it's already available and, hopefully, well tested.

Bootstrapping AWS auto scale instances

We are discussing at a client how to bootstrap auto-scaled AWS instances. Essentially, an instance comes up with hardly anything on it. It has a generic startup script that asks somewhere "what am I supposed to do next?"
I'm thinking we can use Amazon tags and have the instance itself ask AWS, using the awscli tool set, to find out its role. This could give Puppet info, environment info (dev/stage/prod, for example) and so on. This should be doable with just the DescribeTags privilege (a sketch of the idea follows). I'm facing resistance, however.
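For what it's worth, a minimal sketch of that lookup with boto3; the tag keys are hypothetical, and it assumes the metadata service is reachable without a token (IMDSv1):

    import urllib.request
    import boto3

    # The instance learns its own ID from the instance metadata service...
    instance_id = urllib.request.urlopen(
        "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
    ).read().decode()

    # ...then reads its tags, needing only the ec2:DescribeTags permission.
    resp = boto3.client("ec2").describe_tags(
        Filters=[{"Name": "resource-id", "Values": [instance_id]}]
    )
    tags = {t["Key"]: t["Value"] for t in resp["Tags"]}

    role = tags.get("role")                # hypothetical tag, e.g. "webserver"
    environment = tags.get("environment")  # hypothetical tag, e.g. "dev"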
I am looking for suggestions on how a fresh AWS instance can find out about its own purpose, whether from AWS or perhaps from a service broker of some sort.
EC2 instances offer a feature called User Data meant to solve this problem. User Data executes a shell script to perform provisioning functions on new instances. A typical pattern is to use the User Data to download or clone a configuration management source repository, such as Chef, Puppet, or Ansible, and run it locally on the box to perform more complete provisioning.
As @e-j-brennan states, it's also common to prebundle an AMI that has already been provisioned. This approach is faster, since no provisioning needs to happen at boot time, but is perhaps less flexible, since the instance isn't customized.
You may also be interested in instance metadata, which exposes some data such as network details and tags via a URL path accessible only to the instance itself.
An instance doesn't have to come up with 'hardly anything on it' though. You can/should build your own custom AMI (Amazon machine image), with any and all software you need to have running on it, and when you need to auto-scale an instance, you boot it from the AMI you previously created and saved.
http://docs.aws.amazon.com/gettingstarted/latest/wah-linux/getting-started-create-custom-ami.html
I would recommend using AWS Elastic Beanstalk for creating specific instances. This makes it easier, since Beanstalk creates the Auto Scaling groups and launch configurations (boot-up code) for you, which you can edit later. Also, you only pay for the EC2 instances, and you can manage most things from the Beanstalk console.

Best practice for reconfiguring and redeploying on an AWS auto scaling group

I am new to AWS (Amazon Web Services), as well as to our own custom boto-based Python deployment scripts, but wanted to ask for advice or best practices for a simple configuration management task. We have a simple web application with configuration data for several different backend environments, controlled by a Java system property set with a -D flag on the command line. Sometimes the requirement comes up that we need to switch from one backend environment to another due to maintenance or deployment schedules of our backend services.
The current procedure requires python scripts to completely destroy and rebuild all the virtual infrastructure (load balancers, auto scale groups, etc.) to redeploy the application with a change to the command line parameter. On a traditional server infrastructure, we would log in to the management console of the container, change the variable, bounce the container, and we're done.
Is there a best practice for this operation on AWS environments, or is the complete destruction and rebuilding of all the pieces the only way to accomplish this task in an AWS environment?
It depends on what resources you have to change. AWS is evolving every day at a fast pace. I would suggest you take a look at the AWS API for the resources you need to deal with and check whether you can change a resource without destroying it.
E.g., today you cannot change a launch configuration once it is created; you must delete it and create it again with the new settings. And if an Auto Scaling group is attached to that launch configuration, you will have to recreate the Auto Scaling group too, and so on.
IMHO I see no problems with your approach, but as I believe there is always room for improvement, I think you can refactor it with the help of the AWS API documentation.
HTH
I think I found the answer to my own question. I know the interface to AWS is constantly changing, and I don't think this functionality is available yet in the Python boto library, but the ability I was looking for is best described as "Modifying Attributes of a Stopped Instance", with --user-data being the attribute in question. Documentation for performing this action using HTTP requests and the AWS command line interface can be found here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_ChangingAttributesWhileInstanceStopped.html
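For later readers: boto3, boto's successor, does expose this. A minimal sketch, with a placeholder instance ID and a hypothetical -D property value (note the instance must be stopped first):

    import boto3

    ec2 = boto3.client("ec2")
    INSTANCE_ID = "i-0123456789abcdef0"  # placeholder

    # User data can only be modified while the instance is stopped.
    ec2.stop_instances(InstanceIds=[INSTANCE_ID])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[INSTANCE_ID])

    # Swap in a user data script that sets the new -D system property
    # (boto3 base64-encodes the blob for you).
    new_user_data = "#!/bin/bash\nexport JAVA_OPTS='-Dbackend.env=staging'\n"
    ec2.modify_instance_attribute(
        InstanceId=INSTANCE_ID,
        UserData={"Value": new_user_data.encode()},
    )

    ec2.start_instances(InstanceIds=[INSTANCE_ID])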