Is there a way to STOP not TERMINATE instances using auto-scaling in AWS? - amazon-web-services

I am looking at using AWS auto-scaling to scale my infrastructure up and down based on various performance metrics (CPU, etc.). I understand how to set this up; however, I don't like that instances are terminated rather than stopped when it is scaled down. This means that when I scale back up, I have to start from scratch with a new instance and re-install my software, etc. I'd rather just start/stop my instances as needed rather than create/terminate. Is there a way to do this?

No, it is not possible to Stop an instance under Auto Scaling. When a Scaling Policy triggers the removal of an instance, Auto Scaling will always Terminate the instance.
However, here are some ideas for coping with Termination:
Option 1: Use pre-configured AMIs
You can configure an Amazon EC2 instance with your desired software, data and settings. Then, select the EC2 instance in the Management Console and choose the Create Image action. This will create a new Amazon Machine Image (AMI). You can then configure Auto Scaling to use this AMI when launching a new instance. Each new instance will contain exactly the same disk contents.
It's worth mentioning that an EBS volume created from an AMI is available very quickly. Instead of copying the whole AMI to the boot disk up front, blocks are copied across on "first access". This means the new instance can start up immediately rather than waiting for the whole disk to be copied.
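The Create Image step in Option 1 can also be scripted. Below is a minimal sketch, assuming boto3; the function name create_ami is hypothetical, and the ec2 argument is expected to be a boto3.client("ec2") object created by the caller.

```python
def create_ami(ec2, instance_id, name):
    """Create an AMI from a configured instance and wait until it is usable.

    `ec2` is assumed to be a boto3 EC2 client supplied by the caller.
    """
    # By default create_image reboots the instance to get a consistent snapshot.
    image_id = ec2.create_image(InstanceId=instance_id, Name=name)["ImageId"]
    # Block until the image has left the "pending" state.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])
    return image_id


# Usage sketch (needs boto3 and AWS credentials; the IDs are placeholders):
#   import boto3
#   ami = create_ami(boto3.client("ec2"), "i-0123456789abcdef0", "web-server-v1")
```

The returned image ID is what you would then reference from the launch template or launch configuration used by the Auto Scaling group.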
Option 2: Use a startup (User Data) script
Each Amazon EC2 instance has a User Data field, which is accessible from the instance. A script can be passed through the User Data field, which is then executed when the instance starts. The script could be used to install software, download data and configure the instance.
The script could do something very simple, like download a configuration script from a source code repository, then execute the script. This means that machine configuration can be centrally managed and version-controlled. Want to update your app? Just launch a new instance with the updated script and throw away the old instance (which is much easier than "updating" an app).
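As a rough illustration of Option 2, here is a sketch that builds such a User Data script and puts it into a launch template via boto3. The bootstrap URL, the function names and the template parameters are all assumptions for the example. Launch templates expect User Data to be base64-encoded, which is why the helper encodes it.

```python
import base64


def build_user_data(bootstrap_url):
    """Return a base64-encoded shell script that fetches and runs a bootstrap script."""
    script = f"""#!/bin/bash
set -euo pipefail
# Download the version-controlled configuration script, then execute it.
curl -fsSL "{bootstrap_url}" -o /tmp/bootstrap.sh
bash /tmp/bootstrap.sh
"""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def create_template(ec2, name, image_id, instance_type, bootstrap_url):
    """Create a launch template whose instances run the bootstrap script at first boot.

    `ec2` is assumed to be a boto3 EC2 client supplied by the caller.
    """
    return ec2.create_launch_template(
        LaunchTemplateName=name,
        LaunchTemplateData={
            "ImageId": image_id,
            "InstanceType": instance_type,
            "UserData": build_user_data(bootstrap_url),
        },
    )
```

Keeping the User Data tiny and pulling the real logic from a repository means you can change the configuration without touching the template.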
Option 3: Add/Remove instances to an Auto Scaling group
Rather than using Scaling Policies to Launch/Terminate instances for an Auto Scaling group, it is possible to attach/detach specific instances. Thus, you could 'simulate' auto scaling:
When you want to scale-down, detach an instance from the Auto Scaling group, then stop it.
When you want to add an instance, start the instance then attach it to the Auto Scaling group.
This would require your own code, but it is very simple (basically two API calls). You would be responsible for keeping track of which instance to attach/detach.
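The "two API calls" in Option 3 could look roughly like the sketch below, assuming boto3. The function names are hypothetical, and both clients (boto3.client("autoscaling") and boto3.client("ec2")) are passed in by the caller. Tracking which instance to attach or detach is left out.

```python
def scale_down(autoscaling, ec2, group_name, instance_id):
    """Detach one instance from the group, then stop (not terminate) it."""
    autoscaling.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        # Lower the desired capacity so the group does not launch a replacement.
        ShouldDecrementDesiredCapacity=True,
    )
    ec2.stop_instances(InstanceIds=[instance_id])


def scale_up(autoscaling, ec2, group_name, instance_id):
    """Start a previously stopped instance, then attach it to the group."""
    ec2.start_instances(InstanceIds=[instance_id])
    # An instance has to be running before it can be attached.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    # Attaching increases the group's desired capacity by one.
    autoscaling.attach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
    )
```

Note that detaching with a decrement fails if it would take the group below its minimum size, and attaching fails if it would exceed the maximum size, so the group limits need to leave room for this.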

You can suspend scaling processes, see documentation here:
https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-suspend-resume-processes.html#as-suspend-resume
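A minimal sketch of suspending and resuming processes around some maintenance work, assuming boto3. The helper name suspended and the default choice of processes (HealthCheck and ReplaceUnhealthy, so that a stopped instance is not treated as unhealthy and replaced) are my assumptions, not something the linked page prescribes.

```python
from contextlib import contextmanager


@contextmanager
def suspended(autoscaling, group_name, processes=("HealthCheck", "ReplaceUnhealthy")):
    """Suspend the given scaling processes for the duration of a with-block.

    `autoscaling` is assumed to be a boto3 Auto Scaling client.
    """
    autoscaling.suspend_processes(
        AutoScalingGroupName=group_name,
        ScalingProcesses=list(processes),
    )
    try:
        yield
    finally:
        # Always resume, even if the work inside the block raised an error.
        autoscaling.resume_processes(
            AutoScalingGroupName=group_name,
            ScalingProcesses=list(processes),
        )


# Usage sketch (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   asg, ec2 = boto3.client("autoscaling"), boto3.client("ec2")
#   with suspended(asg, "my-asg"):
#       ec2.stop_instances(InstanceIds=["i-0123456789abcdef0"])
#       ...  # maintenance
#       ec2.start_instances(InstanceIds=["i-0123456789abcdef0"])
#       ec2.get_waiter("instance_running").wait(InstanceIds=["i-0123456789abcdef0"])
```

The instance should be running again before the processes resume; otherwise the group is likely to see it as unhealthy and replace it.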

Enable scale-in protection on the instance and then stop it. Because the instance has scale-in protection, Auto Scaling will not delete it when the group scales in.

Actually you have three official AWS options to reboot or even stop an instance which belongs to an Auto Scaling Group:
Put the instance into the Standby state
Detach the instance from the group
Suspend the health check process
Ref.: https://aws.amazon.com/premiumsupport/knowledge-center/reboot-autoscaling-group-instance/
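For the first of these options (Standby), a rough boto3 sketch could look like this. The function names are hypothetical and the clients are supplied by the caller. I have not verified how quickly the instance reaches the Standby state, so in practice you may want to check the lifecycle state before stopping.

```python
def stop_via_standby(autoscaling, ec2, group_name, instance_id):
    """Move an instance to Standby, then stop it without it being replaced."""
    autoscaling.enter_standby(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        # Decrement so the group does not launch a substitute instance.
        ShouldDecrementDesiredCapacity=True,
    )
    ec2.stop_instances(InstanceIds=[instance_id])


def resume_from_standby(autoscaling, ec2, group_name, instance_id):
    """Start the instance again and put it back in service."""
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    autoscaling.exit_standby(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
    )
```

Unlike detaching, the instance stays a member of the group while in Standby; it just stops counting as in-service capacity.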

As of April 2021:
Option 4: Use Warm Pools and an Instance Reuse Policy
By default, Amazon EC2 Auto Scaling terminates your instances when your Auto Scaling group scales in. Then, it launches new instances into the warm pool to replace the instances that were terminated.
If you want to return instances to the warm pool instead, you can specify an instance reuse policy. This lets you reuse instances that are already configured to serve application traffic.
This mostly automates option 3 from John's answer.
Release announcement: https://aws.amazon.com/blogs/compute/scaling-your-applications-faster-with-ec2-auto-scaling-warm-pools/
Documentation: https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-warm-pools.html

This is to expand a little on @mwalling's answer, because that is the right direction, but needs a little extra work to prevent instance termination.
There is now a way to stop or hibernate scaled in instances!
By default, the AWS Auto Scaling scale-in behavior is to terminate an instance, even if you have a warm pool configured. Auto Scaling will then create a fresh instance to put into the warm pool, presumably to make sure you start with a fresh machine every time. However, with an instance reuse policy you can make AWS Auto Scaling either stop or hibernate a running instance and store that instance in the warm pool.
Advantages include:
Local caches stay populated (use hibernate for in memory cache).
Burstable EC2 instances (the T* types) keep their accumulated burst credits, whereas a newly created instance has limited or no credits.
Practical example:
We use a burstable EC2 instance for CI/CD work that we scale to 0 instances outside working hours. With a reuse policy our local image repository stays populated with the most important Docker images. Also we keep the built up credit from the previous day and that speeds up automatic jobs we run first thing every morning.
How to implement:
There's currently no way of doing this completely via the management console. So you will need to use AWS CLI or SDK.
First create a warm pool as described in the AWS Documentation
Then execute this command to add a reuse policy:
aws autoscaling put-warm-pool --auto-scaling-group-name <Name-of-autoscaling-group> --instance-reuse-policy ReuseOnScaleIn=true
Reference docs for the command: AWS CLI Autoscaling put-warm-pool documentation
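If you prefer the SDK over the CLI, the same call might be sketched with boto3 as below. The function name is hypothetical, and the "Stopped" default for the pool state is my assumption; pick whichever state your warm pool should use.

```python
def enable_instance_reuse(autoscaling, group_name, pool_state="Stopped"):
    """Configure the group's warm pool so scaled-in instances are returned to it.

    `autoscaling` is assumed to be a boto3 Auto Scaling client.
    `pool_state` may be "Stopped", "Running" or "Hibernated".
    """
    return autoscaling.put_warm_pool(
        AutoScalingGroupName=group_name,
        PoolState=pool_state,
        # Same setting as ReuseOnScaleIn=true in the CLI command.
        InstanceReusePolicy={"ReuseOnScaleIn": True},
    )


# Usage sketch (needs boto3 and AWS credentials; the group name is a placeholder):
#   import boto3
#   enable_instance_reuse(boto3.client("autoscaling"), "my-asg", "Hibernated")
```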
[Figure: flow diagram of the possible life cycles of EC2 instances. Image from AWS Documentation: Lifecycle state transitions for instances in a warm pool]

Related

How does AWS autoscaling groups recognize that EC2 is idle and it should be terminated?

I am running a Flask Python program on EC2 behind a Load Balancer with auto scaling. When load increases on one EC2 instance, another one is created, and if the newly scaled instance is idle or not utilized, the group scales in and terminates it. The problem is that if a single user is accessing the newly scaled instance, which takes hardly any CPU, how will the Auto Scaling group know whether the instance is idle? If it cannot tell, it will terminate the instance and cause downtime for that user.
I have two scenarios in mind, such as checking whether a particular program has been running on the instance for a certain amount of time: if it is running, do not terminate the instance; otherwise, terminate it.
I see the Step scaling policy, but the only option there is CPU utilization, which is hardly consumed when there is a single user (not even 0.1%).
Can someone please tell me what the best option is for me and, if these two options are possible, how to implement them? I have been asking developers for many days but could not get reliable answers for my case.
Amazon EC2 Auto-scaling does not know which of your instances are 'in use'.
Also, the decision to terminate an instance is typically made on a metric across all instances (eg CPU Utilization), rather than a metric on a specific instance.
When Auto Scaling decides to remove an instance from the Auto Scaling group, it picks an instance as follows:
It picks an Availability Zone with the most instances (to keep them balanced)
It then selects an instance based on the Termination Policy
See also: Control which Auto Scaling instances terminate during scale in - Amazon EC2 Auto Scaling
When using a Load Balancer with Auto Scaling, traffic going to the instance that will be terminated is 'drained', allowing a chance for the instance to complete existing requests.
You can further prevent an instance from terminating while it is still "in use" by implementing Amazon EC2 Auto Scaling lifecycle hooks that allow your own code to delay the Termination.
Or, if all of this is unsatisfactory, you can disable the automatic selection of an instance to terminate and instead have your own code call TerminateInstanceInAutoScalingGroup - Amazon EC2 Auto Scaling to terminate a specific instance of your choosing.
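To illustrate that last approach, here is a sketch, assuming boto3, that picks an instance your own code considers idle and asks Auto Scaling to terminate exactly that one. The is_idle callback is hypothetical: it stands in for whatever check you use to decide that no user is on the instance.

```python
def terminate_idle_instance(autoscaling, group_name, is_idle):
    """Terminate the first in-service instance for which is_idle(instance_id) is true.

    `autoscaling` is assumed to be a boto3 Auto Scaling client.
    Returns the terminated instance ID, or None if nothing was idle.
    """
    response = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    for instance in response["AutoScalingGroups"][0]["Instances"]:
        instance_id = instance["InstanceId"]
        if instance["LifecycleState"] == "InService" and is_idle(instance_id):
            autoscaling.terminate_instance_in_auto_scaling_group(
                InstanceId=instance_id,
                # Shrink the group too, so no replacement is launched.
                ShouldDecrementDesiredCapacity=True,
            )
            return instance_id
    return None
```

You would run something like this on a schedule or from an alarm, in place of a scale-in policy.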
For an overview of Auto Scaling, I recommend this video from the AWS Reinvent conference: AWS re:Invent 2019: Capacity management made easy with Amazon EC2 Auto Scaling (CMP326-R1) - YouTube

ECS stop instance

I have an ECS cluster running one task for my backend instance. I would like to be able to stop/start the EC2 instance whenever I want. Is it possible? I tried stopping the instance directly, but it terminates a few seconds after being stopped, and then a new instance is created automatically. I tried changing the Auto Scaling group to desired=min=0 capacity, but when I do that the instance gets terminated automatically. I just want to turn off the EC2 instance when it is not needed, but at the same time I want data to persist between turning it on and off. I have been fighting with this for a few days now and haven't been able to achieve my goal.
Also, how do I link an EBS volume with VOLUME /root/.local/share/XYZ from the Dockerfile image to persist the data from the XYZ folder?
I would suggest modifying the Auto Scaling group: when you want to turn the instance off, set the desired capacity to 0, and when you want to turn it on, change the value back.
You can do that with the AWS CLI, and you can also schedule it by putting the AWS CLI command in a cron job.
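The same idea in SDK form, as a minimal sketch assuming boto3 (the function name is hypothetical). Keep in mind that setting the capacity to 0 terminates the instance rather than stopping it, so anything on the instance's own disk is lost unless it is stored elsewhere.

```python
def set_group_size(autoscaling, group_name, count):
    """Set the group's desired capacity, allowing it to go down to zero.

    `autoscaling` is assumed to be a boto3 Auto Scaling client.
    """
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        MinSize=0,  # permit scaling all the way down
        DesiredCapacity=count,
    )


# Usage sketch (needs boto3 and AWS credentials; the group name is a placeholder):
#   import boto3
#   asg = boto3.client("autoscaling")
#   set_group_size(asg, "my-ecs-asg", 0)   # evening: turn off
#   set_group_size(asg, "my-ecs-asg", 1)   # morning: turn on
```

The call fails if count is larger than the group's maximum size.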
I would suggest using EFS. Here is an article from AWS on how to persist data from ECS containers using EFS.
Using Amazon EFS to Persist Data from Amazon ECS Containers
Start/Stop instances and auto-scale don't really fit together.
Auto-scale is specifically designed to solve scalein/scaleout.
One way to address this could be to use a customized termination policy (but I have never tried this in an ECS setup).
One note, though: if your customized termination policy never terminates instances and you keep adding instances, you might end up with a sizeable EC2 bill.

How can I control which EC2 instances get removed by an AutoScalingGroup using Amazon Web Services?

I have foreseen a problem that could happen with my application but I am unsure if it is possible to solve, and perhaps the architecture needs to be redesigned.
I am using an AutoScalingGroup (ASG) on AWS to create EC2 instances that host game servers that players can join. At the moment, the ASG is scaled manually via a matchmaking API which changes the desired capacity based on its needs. The problem occurs when a game server is finished.
When a game finishes, it signals to the matchmaker that it is finished and needs terminating, and the matchmaker will then scale down the ASG accordingly, however, it doesn't seem to know exactly which instance to remove, and it won't necessarily be the one that needs terminating.
I can terminate the instance, but then as the ASG desired capacity is never changed when the instance is terminated, another server is created.
Is there a way I can scale down the ASG, as well as specifying which servers to remove from the group?
In a nutshell, the default termination policy during scale in is designed to remove instances that use the oldest launch configuration.
Currently, Amazon EC2 Auto Scaling supports the following termination policies:
OldestInstance Terminate the oldest instance in the group. This option is useful when you're upgrading the instances in the Auto Scaling group to a new EC2 instance type. You can gradually replace instances of the old type with instances of the new type.
NewestInstance Terminate the newest instance in the group. This policy is useful when you're testing a new launch configuration but don't want to keep it in production.
OldestLaunchConfiguration Terminate instances that have the oldest launch configuration. This policy is useful when you're updating a group and phasing out the instances from a previous configuration.
ClosestToNextInstanceHour Terminate instances that are closest to the next billing hour. This policy helps you maximize the use of your instances and manage your Amazon EC2 usage costs.
Default Terminate instances according to the default termination policy. This policy is useful when you have more than one scaling policy for the group.
Instance protection
One possible solution is to use instance protection. Auto Scaling provides instance protection to control whether an instance can be terminated when scaling in.
Therefore, enable instance protection on the ASG so that instances are protected from scale-in by default. Once you are done with your server, decrease the desired number of instances and remove instance protection from that particular instance (using either the CLI or an SDK; note that protection remains enabled for the rest of the instances), and Auto Scaling will terminate that exact instance.
For more information about instance protection, see Instance Protection
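A sketch of that flow with boto3, under the assumption that every instance in the group is protected from scale-in except the one you unprotect here. The function name is hypothetical. It removes protection first and then lowers the desired capacity by one.

```python
def remove_specific_instance(autoscaling, group_name, instance_id):
    """Unprotect one instance, then shrink the group so that instance is removed.

    `autoscaling` is assumed to be a boto3 Auto Scaling client.
    Returns the new desired capacity.
    """
    autoscaling.set_instance_protection(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        ProtectedFromScaleIn=False,
    )
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )["AutoScalingGroups"][0]
    new_capacity = group["DesiredCapacity"] - 1
    # With all other instances protected, only the unprotected one can be chosen.
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=new_capacity,
        HonorCooldown=False,
    )
    return new_capacity
```

The capacity change is rejected if it would go below the group's minimum size, so the minimum needs to allow it.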
The oldest server is removed. If you want to scale down a specific server, you will have to kill that server before changing desired capacity.

How can I create and deploy applications to an EC2 instance via the AWS API?

I'm looking to see if I can create an instance and deploy applications to this instance dynamically via the API. I only want these instances to be created when my application needs them, or when I request them.
I have two applications that I need to deploy to each created instance which require some set up and installation of dependencies prior to their launch. When I am finished with this application, I want to terminate the instance.
Am I able to do this? If so, could someone please point me to the right section of the documentation. I have searched on the documentation and found some information about creating images but I am unsure as to what exactly I will need to achieve this task.
Yes. Using an Auto Scaling group, you can create a launch configuration that will launch your instances. Using CodeDeploy, you would link your deployment group to the Auto Scaling group.
See Integrating AWS CodeDeploy with Auto Scaling
AWS CodeDeploy supports Auto Scaling, an AWS service that can launch
Amazon EC2 instances automatically according to conditions you define.
These conditions can include limits exceeded in a specified time
interval for CPU utilization, disk reads or writes, or inbound or
outbound network traffic. Auto Scaling terminates the instances when
they are no longer needed. For more information, see What Is Auto
Scaling?.
Assuming you set your desired/minimum instances to 0, then the default state of the ASG will be to have no instances.
When your application needs an instance spun up, it would simply change the desired instance value to 1. When your application is finished with the instance, it would set the desired count to 0 again, thereby terminating that instance.
To develop this setup, start by running your instance normally (manually) and get the application deployment working. When that works, then create your auto scaling group. Finally, update your deployment group to refer to the ASG so that your code is deployed when you have scaling events.

Duplicate and destroy an EC2 instance in AWS via .NET code

We have an application with long running processes which prevents us from being able to use Elastic Beanstalk to properly scale the environment. In fact no metric scaling would be useful for us and what we really need to be able to do is the following....
On demand, programmatically, create a new EC2 instance which is a duplicate of a specific EC2 "template" instance (that template instance would be an EC2 instance running IIS with specific code deployed to it, probably via Beanstalk).
On demand, programmatically, destroy a specific instance
Based on specific events we would need to perform the above actions via our .NET code base.
I get the feeling that we should be able to do this with CloudFormation templates, but I don't see any clear documentation on how to handle this.
Any advice or direction would be greatly appreciated.
Not sure about doing this through .Net code, but you can create an auto scaling group in AWS console and that will take care of the scaling requirements in a maintainable way.
Log on to AWS and navigate to Management Console
First create an EC2 Instance with proper instance type (say, t2.large) or whatever your sizing is.
Then have IIS installed and get your app running on this instance.
Now create a new Image from the above running EC2 Instance
Create a new Auto Scaling Group and add the above new Image
Create a Launch Configuration for the Auto Scaling Group (AWS Console will redirect you to do this step when you try to setup Auto Scaling Group).
Once the Auto Scaling Group is setup, navigate to Auto Scaling Policy and add a policy there. For example, you can create a policy to 'add 1 instance on CPU utilization of 80% or more for 2 consecutive times'.
Also make sure Min is 0 and Max is, say, 2. This is up to you to decide based on your scaling requirements. If you have Min as 1, it automatically creates the first instance. If Min is 0, no new instances will be created until the threshold in the Auto Scale Policy is met.
Also create a Scale Down Policy to remove instances when there is a low CPU utilization.
Note: I took CPU utilization as an example to describe the scenario. You can have any metrics as per your choice and architectural needs.
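If you would rather script the scale-out policy from the steps above than click through the console, a rough boto3 sketch follows. The function name, policy name, alarm name and the 300-second period are assumptions for illustration; the 80% threshold and the two consecutive evaluation periods follow the example policy described above.

```python
def add_cpu_scale_out(autoscaling, cloudwatch, group_name, threshold=80.0):
    """Add one instance when average CPU is at or above the threshold twice in a row.

    `autoscaling` and `cloudwatch` are assumed to be boto3 clients
    supplied by the caller. Returns the ARN of the scaling policy.
    """
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName=group_name,
        PolicyName=f"{group_name}-scale-out",  # hypothetical naming scheme
        PolicyType="SimpleScaling",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=1,
    )
    cloudwatch.put_metric_alarm(
        AlarmName=f"{group_name}-cpu-high",  # hypothetical naming scheme
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": group_name}],
        Statistic="Average",
        Period=300,  # assumed 5-minute evaluation window
        EvaluationPeriods=2,
        Threshold=threshold,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        # The alarm triggers the scaling policy when it fires.
        AlarmActions=[policy["PolicyARN"]],
    )
    return policy["PolicyARN"]
```

A matching scale-down policy would use a negative ScalingAdjustment and a low-CPU alarm in the same way.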