I'm running custom transactional tasks on my EC2 instances. The decision whether to shut down an instance is made, under many conditions, by a special process running on the instance itself. The termination should be done by the instance itself, because the Auto Scaling group does not know when data processing is finished. Are the following steps consistent with the philosophy of AWS?
1. Create an AMI with the option "Shutdown behaviour: Terminate".
2. The Auto Scaling group creates a new instance with the option "Protect From Scale In".
3. The custom process on the EC2 instance calls the command
$ sudo shutdown -P now
to terminate the instance at the proper time.
Is that correct? Or does AWS have some tool to do that, e.g. emit a special signal to terminate an instance?
Thank you
That process has one issue, I believe:
In step 1, the "Shutdown behaviour: Terminate" option is not an AMI-level setting. It is a launch-time setting, for instances launched outside of an Auto Scaling group.
Within an Auto Scaling group, there is no option to configure a Launch Configuration with the equivalent of "Shutdown behaviour: Terminate". Presumably, ASG instances must be terminated during scale-in events.
The simple approach would be to have the instance call the AWS CLI terminate-instances command:
aws ec2 terminate-instances --instance-ids i-xxxxxxxx
You would need to acquire the instance ID from the instance metadata in order to run the terminate-instances command.
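For example, a minimal sketch, assuming the instance's IAM role allows ec2:TerminateInstances and the instance metadata service is reachable:

#!/bin/bash
# Look up this instance's ID and availability zone from the instance
# metadata service, then ask EC2 to terminate this instance.
# ${AZ%?} strips the trailing zone letter to get the region.
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
aws ec2 terminate-instances --region "${AZ%?}" --instance-ids "$INSTANCE_ID"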
We have a similar pattern, but getting a working solution seems kludgy. We do it a little differently:
1. Start with "Protect From Scale In" enabled.
2. When processing is complete, have the instance turn off its "Protect From Scale In" flag.
3. Have the instance trigger the scale-in policy, reducing the count by 1 (see the sketch after this list).
4. The ASG then terminates this instance and doesn't start a new one, because scale-in was called.
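A minimal sketch of steps 2 and 3, assuming the group is named my-asg and has a simple scaling policy scale-in-by-one with an adjustment of -1 (both names are placeholders; the instance role needs autoscaling:SetInstanceProtection and autoscaling:ExecutePolicy):

# Drop this instance's scale-in protection, then fire the scale-in
# policy; the ASG removes the one unprotected instance: this one.
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
aws autoscaling set-instance-protection --auto-scaling-group-name my-asg --instance-ids "$INSTANCE_ID" --no-protected-from-scale-in
aws autoscaling execute-policy --auto-scaling-group-name my-asg --policy-name scale-in-by-one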
One of the things I don't like about this is that you end up creating the IAM role for the EC2 instances, then creating the ASG, then going back and updating the IAM role to give it permissions to SetInstanceProtection and ExecutePolicy for its own group. You need to do this because we couldn't figure out how to create a policy that references the Auto Scaling group the caller is in.
Did you ever resolve this with a different solution?
I have an Auto Scaling group of 2-5 instances to handle web traffic. I'm using the rpush gem for push notifications, which requires a single daemon running to execute all the awaiting jobs. I'm already paying for the 2-5 instances, which have sufficient extra computing power to handle running the daemon, and I'd like to run the daemon on one of these instances.
The problem is, I can only use one AMI per Auto Scaling group, so I'm having trouble finding a way to run the daemon on only one of the instances in the group.
Is there a way to do this?
You could start your daemon manually on one of the instances and mark that instance as protected from scale-in. This way it won't be terminated during scale-in. When scaling out, new instances will be created without the daemon by default.
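For example, a sketch of setting that flag with the AWS CLI (group name and instance ID are placeholders):

# Protect the one instance that runs the daemon from scale-in.
aws autoscaling set-instance-protection --auto-scaling-group-name my-asg --instance-ids i-0123456789abcdef0 --protected-from-scale-in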
Keep in mind that while protected from termination in the auto-scaling group, it may still be terminated by:
Manual termination through the Amazon EC2 console, the terminate-instances command, or the TerminateInstances action. To protect Auto Scaling instances from manual termination, enable termination protection. For more information, see Enabling Termination Protection in the Amazon EC2 User Guide for Linux Instances.
Health check replacement if the instance fails health checks.
Spot Instance interruption.
(source: AWS docs)
I have a stateful cluster deployed on AWS in which instances attach to an already existing EBS volume on startup; this volume is later mounted into the Docker container running on the instance. If I forcefully detach this volume, the instance as well as the Docker container continue to be functional. To re-attach the same volume, the instance has to be terminated, and the new instance launched by the autoscaling group then attaches to the detached volume through the userdata script.
Is there a way to automatically detect volume detachments and trigger an attachment? Or is it possible to automatically kill the instance if its EBS volume is forcefully detached?
I don't know of any automatic way to achieve this out of the box. The best I can offer are a few ideas to investigate.
Run a cron script on your Docker hosts that checks whether the path is still accessible every X minutes. If the path is not accessible, and if the instance is set to terminate on shutdown, just call shutdown -h to kill it. Alternatively, use the AWS CLI from your Docker hosts to request that the current instance be terminated. A script can get the current instance's InstanceId at runtime from the instance metadata via curl, and you will need an IAM policy, assigned to an IAM role for the instance, granting permission to terminate an instance.
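A minimal sketch of such a cron script, assuming the volume is mounted at /data (a placeholder) and the instance role allows ec2:TerminateInstances:

#!/bin/bash
# If the mount point stops responding, terminate this instance so the
# ASG replaces it and the new instance re-attaches the volume.
MOUNT_PATH=/data
if ! timeout 5 ls "$MOUNT_PATH" > /dev/null 2>&1; then
  INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
  aws ec2 terminate-instances --instance-ids "$INSTANCE_ID"
fi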
Basically the same thing, but do it from another server, or from a Lambda function on a schedule that queries the API to get a list of instances/volumes (based on tags etc.), then checks the attachment status and terminates an instance if necessary.
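A sketch of that attachment check with the AWS CLI (volume and instance IDs are placeholders; describe-volumes returns None for the state when the volume has no attachments, which this check also treats as detached):

# Terminate the instance if its volume is no longer attached.
STATE=$(aws ec2 describe-volumes --volume-ids vol-0123456789abcdef0 --query 'Volumes[0].Attachments[0].State' --output text)
if [ "$STATE" != "attached" ]; then
  aws ec2 terminate-instances --instance-ids i-0123456789abcdef0
fi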
Depending on your use case, you could maybe use CloudWatch to monitor the EBS metrics for the volume. Could you detect a failure based on these metrics for your use case, then execute a Lambda to actually inspect the instance and terminate it?
I am using AWS ECS in combination with EC2 instances.
Right now I am setting up Auto Scaling. How can I make sure that, when an EC2 instance gets terminated, all ECS tasks are migrated before the machine goes down?
Right now it is not automatically possible to achieve this. The best approach would be to have at least 2 tasks of each service running, spread across different instances via a placement constraint.
Manually (or scripted) it is possible:
If you want to replace an instance attached to an ECS cluster, you can simply drain the instance. This will do the following:
Start a new task of each running service on another instance in the cluster.
Wait until the recently started task is 'steady'.
Shut down the old task.
To drain an instance using the AWS Management Console, do the following:
Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.
In the navigation pane, choose Clusters and select the cluster.
Choose ECS Instances and select the check box for the container instances.
Choose Actions, Drain instances.
After the instances are processed, choose Done.
This can also be done via the command line.
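For example (cluster name and container instance ID are placeholders; the caller needs ecs:UpdateContainerInstancesState):

# Put the container instance into DRAINING; ECS then reschedules its
# tasks onto the other instances in the cluster.
aws ecs update-container-instances-state --cluster my-cluster --container-instances 1c3be8ed-df30-47b4-8f1e-6e68ebd01f34 --status DRAINING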
To do it automatically, you will need to add a lifecycle hook on termination.
Call the AWS CLI from the termination lifecycle hook to drain the instance, wait a fixed amount of time, and then continue terminating the instance.
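A rough sketch of the hook setup and the completion call (hook, group, and instance names are placeholders):

# Create a termination hook: instances wait in Terminating:Wait for up
# to 15 minutes before the termination actually proceeds.
aws autoscaling put-lifecycle-hook --lifecycle-hook-name drain-ecs --auto-scaling-group-name my-asg --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING --heartbeat-timeout 900 --default-result CONTINUE

# Once draining is finished, let the termination continue.
aws autoscaling complete-lifecycle-action --lifecycle-hook-name drain-ecs --auto-scaling-group-name my-asg --lifecycle-action-result CONTINUE --instance-id i-0123456789abcdef0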
I have an ECS cluster running one task for my backend. I would like to be able to stop/start the EC2 instance whenever I want. Is that possible? I was trying to stop the instance directly, but it terminates a few seconds after being stopped, and after that a new instance is created automatically. I tried to change the Auto Scaling group to desired=min=0 capacity, but when I do that the instance gets auto-terminated. I just want to turn off the EC2 instance when it's not needed, but at the same time I want the data to persist between turning it off and on. I have been fighting with this for a few days now and haven't been able to achieve my goal.
Also, how can I link an EBS volume with VOLUME /root/.local/share/XYZ from the Dockerfile image to persist the data from the XYZ folder?
I would suggest modifying the Auto Scaling group: when you want to turn off the instance, set the capacity to 0 in the Auto Scaling group, and when you want to turn it on, change the value back. You can do that with the AWS CLI, and you can also schedule the period by putting the AWS CLI command in a cron job.
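A sketch of what those cron entries might look like (group name and times are placeholders; the caller needs autoscaling:UpdateAutoScalingGroup):

# Scale the group to zero at 20:00 and back to one at 06:00.
0 20 * * * aws autoscaling update-auto-scaling-group --auto-scaling-group-name my-asg --min-size 0 --desired-capacity 0
0 6 * * * aws autoscaling update-auto-scaling-group --auto-scaling-group-name my-asg --min-size 1 --desired-capacity 1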
I would suggest using EFS. Here is an article from AWS on how to persist data from ECS containers using EFS.
Using Amazon EFS to Persist Data from Amazon ECS Containers
Start/Stop instances and auto-scale don't really fit together.
Auto-scale is specifically designed to solve scale-in/scale-out.
One way to address this could be using a customized termination policy (but I have never tried this in an ECS setup).
One note, though: if your customized termination policy never terminates the instances and you continue adding instances to keep them always running, you might get a good-sized EC2 bill.
I configured the new AWS Instance Scheduler for my AWS account: https://aws.amazon.com/answers/infrastructure-management/instance-scheduler/
The problem seems to be that, when tagging EC2 instances that belong to a scaling group, the instances are correctly stopped, but since my scaling group has its Min number set to 2, the Auto Scaling group restarts them anyway.
I would prefer not to set the Min number to 0, because having it at 2 is useful during application redeploys.
How can I make the two services work together?
When you stop EC2 instances that are controlled by Auto Scaling, Auto Scaling will see them as "unhealthy" and will proceed to terminate and replace them.
You have 2 options.
Option 1: Pause Auto Scaling processing while your EC2 instances are stopped. By doing this, Auto Scaling won't care that your EC2 instances are stopped and won't terminate them. Just remember to resume processing after you restart your EC2 instances.
However, AWS Instance Scheduler will not manage this for you, so you'll need to find another way to schedule your EC2 instances to stop & restart.
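A sketch of the pause and resume with the AWS CLI (group name is a placeholder):

# Suspend health checking and replacement so the stopped instances are
# left alone; resume once the instances have been started again.
aws autoscaling suspend-processes --auto-scaling-group-name my-asg --scaling-processes HealthCheck ReplaceUnhealthy Terminate
aws autoscaling resume-processes --auto-scaling-group-name my-asg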
Option 2: Scale your Auto Scaling group to 0 and back to 2. This will result in terminating your EC2 instances (when you don't need them) and re-creating them (when you want them). This will only work if your EC2 instances are ephemeral.
Again, AWS Instance Scheduler will not manage this for you. Auto Scaling scheduled actions may be able to help you with this.
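For example, a pair of scheduled actions along these lines (names and times are placeholders):

# Scale to zero every evening and back to two every morning.
aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name my-asg --scheduled-action-name scale-down-nightly --recurrence "0 20 * * *" --min-size 0 --max-size 2 --desired-capacity 0
aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name my-asg --scheduled-action-name scale-up-morning --recurrence "0 6 * * *" --min-size 2 --max-size 2 --desired-capacity 2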
Another option is to use the ASG standby feature before and after the AWS Instance Scheduler run. This will also let you work on the same AMI before the shutdown.
So the high-level solution is below (the standby calls themselves are sketched after this list):
Define the EC2 instance schedule using AWS Instance Scheduler.
Define a Lambda that fetches the shutdown schedule and puts the EC2 instance into standby mode before the planned shutdown.
Define a Lambda that fetches the startup schedule and takes the EC2 instance out of standby after the planned restart.
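A minimal sketch of the standby transitions those Lambdas would perform, shown here as the equivalent CLI calls (instance ID and group name are placeholders):

# Before the scheduled stop: move the instance into Standby so the ASG
# stops managing it (desired capacity is decremented alongside).
aws autoscaling enter-standby --instance-ids i-0123456789abcdef0 --auto-scaling-group-name my-asg --should-decrement-desired-capacity

# After the scheduled start: return the instance to service.
aws autoscaling exit-standby --instance-ids i-0123456789abcdef0 --auto-scaling-group-name my-asg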