Group policy allowing single user control of an EC2 instance? - amazon-web-services

How can I create a policy that will allow a single user in a group to create an ec2 instance in a particular VPC and AZ?
I need it to also be destroyable by the same user or destroyed when idle for more than 24 hours.

You can't. What you can do, however, is script the logic yourself to provide equivalent functionality. A good starting point would be to google AWS SDK.

This is not possible via IAM policies nor other configurations.
You would have to write your own application with this logic, which would launch the instance on the user's behalf and then terminate it later (although that depends on your definition of 'when idle'). The user would interact with this application when they want to launch the instance, rather than interacting directly with AWS.
To be more specific:
It is possible to limit the number of EC2 instances that an AWS Account can have running simultaneously (the default is 20)
It is not possible to limit the number of instances a single user may launch. Either they have permissions to launch an instance with certain attributes, or they don't, but this cannot be limited by whether/what instances are already running.
Some ways you could create a self-terminating instance:
A CloudWatch Alarm could terminate an instance when CPU or Network drops below a certain level for a period of time (e.g. when Network Out is below 10 KB over a half-hour period).
A script on the instance itself could decide when to Shutdown the instance. If the instance is launched with Shutdown Behaviour set to Terminate, then shutting down the instance from within the instance will cause it to terminate.
You could create an application that regularly looks at running instances and terminates them 24 hours after being launched. This could be triggered by a Scheduled Task (Windows) or cron job (Linux).
A variation on the previous option is to add a Tag to the instance to indicate when to terminate it. An application could regularly check the tag to determine which instances to terminate.
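The "terminate 24 hours after launch" option above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, `ec2` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`), and pagination of `describe_instances` is ignored for brevity. It measures age since launch, which is not the same thing as idleness.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=24)  # lifetime limit taken from the question


def expired_instance_ids(reservations, now, max_age=MAX_AGE):
    """Return IDs of instances launched more than max_age before `now`.

    `reservations` is the "Reservations" list of an EC2
    describe_instances response (LaunchTime is a timezone-aware datetime).
    """
    expired = []
    for reservation in reservations:
        for instance in reservation["Instances"]:
            if now - instance["LaunchTime"] > max_age:
                expired.append(instance["InstanceId"])
    return expired


def reap(ec2, now=None):
    """Terminate running instances older than MAX_AGE and return their IDs.

    `ec2` is assumed to be a boto3 EC2 client.
    """
    now = now or datetime.now(timezone.utc)
    response = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    ids = expired_instance_ids(response["Reservations"], now)
    if ids:
        ec2.terminate_instances(InstanceIds=ids)
    return ids
```

Run `reap` from a cron job or Scheduled Task. For the tag-based variation, the same loop could compare `now` against a timestamp stored in a tag instead of `LaunchTime`.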

Related

AWS EC2 Lifecycle Hook vs User Data in Launch configuration

In AWS EC2, both lifecycle hooks and the user data of a launch configuration let you run customised actions while launching instances.
Which runs first: the actions tied to a lifecycle hook, or the User Data defined in the launch configuration?
When would you choose one over the other? What are the differences?
User Data and Cloud-Init
When launching an Amazon EC2 instance, you can provide a User Data field. The information entered in this field is available to the instance via http://169.254.169.254/latest/user-data/.
This is an excellent way to pass information to the instance that is accessible to software running on the instance.
Then Canonical, maker of Ubuntu, conceived Cloud-Init as a way of running scripts during the startup of virtual machines. Cloud-Init takes a script passed via EC2 User Data and runs it as root during the first boot of an instance. It is a great way to install software and configure the machine when it is first used.
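As a rough sketch of passing a Cloud-Init script through User Data, assuming `ec2` is a boto3 EC2 client: the script contents, the helper name and the default instance type are illustrative assumptions, and the AMI ID must be supplied by the caller. boto3 base64-encodes the `UserData` string for `run_instances`, so plain text is passed here.

```python
# Hypothetical first-boot script; Cloud-Init runs it as root on first boot.
USER_DATA = """#!/bin/bash
echo "configured at $(date)" > /var/tmp/first-boot.txt
"""


def launch_with_user_data(ec2, image_id, instance_type="t3.micro",
                          script=USER_DATA):
    """Launch one instance that runs `script` on first boot.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the new instance ID.
    """
    response = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=script,  # boto3 handles the base64 encoding
    )
    return response["Instances"][0]["InstanceId"]
```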
Amazon EC2 Auto-Scaling Lifecycle Hooks
Amazon EC2 Auto-Scaling is a method of automatically scaling a fleet of EC2 instances based on workload. Instances are launched or terminated based on a target capacity metric. Instances launched through Auto-Scaling are normal EC2 instances, so User Data can be used to configure these instances.
Sometimes, however, a more complex operation is required when launching/terminating instances. For example, when launching instances it might be necessary to contact an external configuration service, and when terminating instances it might be necessary to copy data off an instance. These tasks can be accomplished via Lifecycle Hooks.
From Amazon EC2 Auto Scaling Lifecycle Hooks - Amazon EC2 Auto Scaling:
Lifecycle hooks enable you to perform custom actions by pausing instances as an Auto Scaling group launches or terminates them. When an instance is paused, it remains in a wait state until either you complete the lifecycle action using the complete-lifecycle-action CLI command or CompleteLifecycleAction API action, or the timeout period ends (one hour by default).
Compared to User Data, Lifecycle Hooks are rarely used. They are typically required when a longer-running or external process is required before instances are ready to process requests. For example, there might be a long startup process required for new instances that exceeds the time normally allowed for health checks. Or, an external process (outside the instance) might need to be triggered before the instance can start processing traffic.
Lifecycle Hooks are more complicated because they involve a signalling mechanism. When an Auto Scaling instance is launched or terminated, Auto Scaling will send a message via Amazon SQS or Amazon SNS. You are then responsible for running a process that responds to this signal. When the process is complete, it must send a signal back to Auto Scaling so that the instance can be fully added to, or removed from, the Auto Scaling group. This will typically require something running external to the EC2 instance to process the Lifecycle Hook.
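The "signal back" step might look like the sketch below. It assumes the hook delivers its notification directly to an SQS queue, so the message body is the JSON document containing `LifecycleHookName`, `AutoScalingGroupName` and `LifecycleActionToken`. The helper name is hypothetical and `autoscaling` is assumed to be a boto3 Auto Scaling client.

```python
import json


def complete_hook(autoscaling, message_body, result="CONTINUE"):
    """Tell Auto Scaling that the custom action for one instance is done.

    `message_body` is the raw JSON text of a lifecycle hook notification.
    `result` is "CONTINUE" or "ABANDON".
    Returns the parameters sent, or None if the message is not a
    lifecycle action (for example the test notification sent when a
    hook is first set up).
    """
    message = json.loads(message_body)
    if "LifecycleActionToken" not in message:
        return None
    params = {
        "LifecycleHookName": message["LifecycleHookName"],
        "AutoScalingGroupName": message["AutoScalingGroupName"],
        "LifecycleActionToken": message["LifecycleActionToken"],
        "LifecycleActionResult": result,
    }
    autoscaling.complete_lifecycle_action(**params)
    return params
```

The long-running work (contacting a configuration service, copying data off the instance) would happen before `complete_hook` is called.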
Bottom line: You want to use User Data. It is rare that you would use a Lifecycle Hook.
Thanks for your answers.
Actually, the Lifecycle Hook is useful in some cases. For example, in my situation, I have to deactivate Tableau Server licenses before the instance is terminated. This is a good situation to use it.
During startup, on the other hand, user data is executed, but the instance is put into the "InService" state as soon as the OS itself is up, even if the user data script has not finished executing.
So the choice between the two approaches depends on what you want the scripts to do.

How can I control which EC2 instances get removed by an AutoScalingGroup using Amazon Web Services?

I have foreseen a problem that could happen with my application but I am unsure if it is possible to solve, and perhaps the architecture needs to be redesigned.
I am using an AutoScalingGroup (ASG) on AWS to create EC2 instances that host game servers that players can join. At the moment, the ASG is scaled manually via a matchmaking API which changes the desired capacity based on its needs. The problem occurs when a game server is finished.
When a game finishes, it signals to the matchmaker that it is finished and needs terminating, and the matchmaker then scales down the ASG accordingly. However, the ASG doesn't seem to know exactly which instance to remove, and it won't necessarily be the one that needs terminating.
I can terminate the instance, but then as the ASG desired capacity is never changed when the instance is terminated, another server is created.
Is there a way I can scale down the ASG, as well as specifying which servers to remove from the group?
In a nutshell, the default termination policy during scale in is designed to remove instances that use the oldest launch configuration.
Currently, Amazon EC2 Auto Scaling supports the following termination policies:
OldestInstance Terminate the oldest instance in the group. This option is useful when you're upgrading the instances in the Auto Scaling group to a new EC2 instance type. You can gradually replace instances of the old type with instances of the new type.
NewestInstance Terminate the newest instance in the group. This policy is useful when you're testing a new launch configuration but don't want to keep it in production.
OldestLaunchConfiguration Terminate instances that have the oldest launch configuration. This policy is useful when you're updating a group and phasing out the instances from a previous configuration.
ClosestToNextInstanceHour Terminate instances that are closest to the next billing hour. This policy helps you maximize the use of your instances and manage your Amazon EC2 usage costs.
Default Terminate instances according to the default termination policy. This policy is useful when you have more than one scaling policy for the group.
Instance protection
One possible solution is to use instance protection. Auto Scaling provides instance protection to control whether an instance can be terminated when scaling in.
Therefore, enable instance protection on the ASG so that instances are protected from scale-in by default. Once you are done with your server, decrease the desired number of instances and remove instance protection from that particular instance (using either the CLI or an SDK; note that protection remains enabled for the rest of the instances). Auto Scaling will then terminate that exact instance.
For more information about instance protection, see Instance Protection
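A minimal sketch of the instance protection approach, assuming the group was created with scale-in protection enabled for new instances and that `autoscaling` is a boto3 Auto Scaling client. The function name is hypothetical. Protection is removed before the capacity is lowered, so the finished server is the only unprotected candidate when the scale-in happens.

```python
def scale_in_specific_instance(autoscaling, group_name, instance_id):
    """Unprotect one finished instance, then lower desired capacity by one.

    Every other instance keeps its scale-in protection, so Auto Scaling
    can only pick `instance_id` for termination.
    Returns the new desired capacity.
    """
    autoscaling.set_instance_protection(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        ProtectedFromScaleIn=False,
    )
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )["AutoScalingGroups"][0]
    # Never ask for less than the group's configured minimum.
    new_capacity = max(group["DesiredCapacity"] - 1, group["MinSize"])
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=new_capacity,
        HonorCooldown=False,
    )
    return new_capacity
```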
The oldest server is removed. If you want to scale down a specific server, you will have to kill that server before changing desired capacity.
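Killing a specific server and lowering the desired capacity can also be done in a single call. The answer above does not mention it, but the Auto Scaling API offers `TerminateInstanceInAutoScalingGroup` with a `ShouldDecrementDesiredCapacity` flag, which avoids the replacement server described in the question. A hedged sketch, with a hypothetical helper name and `autoscaling` assumed to be a boto3 Auto Scaling client:

```python
def terminate_and_shrink(autoscaling, instance_id):
    """Terminate one instance and reduce desired capacity by one.

    Because the desired capacity is decremented in the same request,
    Auto Scaling does not launch a replacement.
    Returns the status code of the resulting scaling activity.
    """
    response = autoscaling.terminate_instance_in_auto_scaling_group(
        InstanceId=instance_id,
        ShouldDecrementDesiredCapacity=True,
    )
    return response["Activity"]["StatusCode"]
```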

AWS Scaling In Termination Protection for EC2 Container Service

I am unable to figure out how to protect my ECS task instances when using Auto Scaling in Amazon AWS. I have a long running task that can scale out as required but I want to mark task instances that are running as "not destroyable". I have found several resources that talk about instance protection such as:
https://aws.amazon.com/blogs/aws/new-instance-protection-for-auto-scaling/
and (because I am using python) the API documentation is here:
http://boto3.readthedocs.io/en/latest/reference/services/autoscaling.html#AutoScaling.Client.set_instance_protection
This method requires an InstanceId and so I attempt to get the instance id of the current container by using a command like:
curl http://169.254.169.254/latest/meta-data/instance-id
However, this method just returns the instance id of the EC2 machine the task is running on. So my question is: Is there a way to get the instance id of a docker task instance (if that even exists)? If not is there another way I can prevent Auto Scaling from terminating a task that is still running? Do I have to write my own task manager that manages Scaling In?
To solve the same issue, we developed a simple application which is started when a job is done on one of the Auto Scaling Group (ASG) instances. This application checks the queue and if there is no job in the queue (for let's say 10 minutes or 10 times) it terminates its instance and decrements the Desired value of the ASG. This provides us with a reliable mechanism for scaling in. Scaling out on the other hand, is done by the ASG itself based on the number of jobs in the queue.
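The idle check described in this answer could be sketched as below. The answer does not show its code, so everything here is an assumption: `queue_depth` and `shut_down` are hypothetical callables supplied by the caller (for instance, `shut_down` might terminate the current instance and decrement the ASG's desired capacity).

```python
import time


def shut_down_if_idle(queue_depth, shut_down, checks=10,
                      interval_seconds=60, sleep=time.sleep):
    """Call shut_down() if the queue stays empty for `checks` polls.

    queue_depth: callable returning the number of waiting jobs.
    shut_down:   callable that removes this instance from service.
    Returns True if shut_down() was called, or False as soon as work
    shows up (the caller should then go back to processing jobs).
    """
    for attempt in range(checks):
        if queue_depth() > 0:
            return False
        if attempt < checks - 1:
            sleep(interval_seconds)  # wait before polling again
    shut_down()
    return True
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays.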

Is there a way to STOP not TERMINATE instances using auto-scaling in AWS?

I am looking at using AWS auto-scaling to scale my infrastructure up and down based on various performance metrics (CPU, etc.). I understand how to set this up; however, I don't like that instances are terminated rather than stopped when it is scaled down. This means that when I scale back up, I have to start from scratch with a new instance and re-install my software, etc. I'd rather just start/stop my instances as needed rather than create/terminate. Is there a way to do this?
No, it is not possible to Stop an instance under Auto Scaling. When a Scaling Policy triggers the removal of an instance, Auto Scaling will always Terminate the instance.
However, here are some ideas to cope with the concept of Termination...
Option 1: Use pre-configured AMIs
You can configure an Amazon EC2 instance with your desired software, data and settings. Then, select the EC2 instance in the Management Console and choose the Create Image action. This will create a new Amazon Machine Image (AMI). You can then configure Auto Scaling to use this AMI when launching a new instance. Each new instance will contain exactly the same disk contents.
It's worth mentioning that EBS starts up very quickly from an AMI. Instead of copying the whole AMI to the boot disk, it copies it across on "first access". This means the new instance can start-up immediately rather than waiting for the whole disk to be copied.
Option 2: Use a startup (User Data) script
Each Amazon EC2 instance has a User Data field, which is accessible from the instance. A script can be passed through the User Data field, which is then executed when the instance starts. The script could be used to install software, download data and configure the instance.
The script could do something very simple, like download a configuration script from a source code repository, then execute the script. This means that machine configuration can be centrally managed and version-controlled. Want to update your app? Just launch a new instance with the updated script and throw away the old instance (which is much easier than "updating" an app).
Option 3: Add/Remove instances to an Auto Scaling group
Rather than using Scaling Policies to Launch/Terminate instances for an Auto Scaling group, it is possible to attach/detach specific instances. Thus, you could 'simulate' auto scaling:
When you want to scale-down, detach an instance from the Auto Scaling group, then stop it.
When you want to add an instance, start the instance then attach it to the Auto Scaling group.
This would require your own code, but it is very simple (basically two API calls). You would be responsible for keeping track of which instance to attach/detach.
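A rough sketch of Option 3, assuming `autoscaling` and `ec2` are boto3 Auto Scaling and EC2 clients. The function names are hypothetical, and tracking which instances are parked is left to the caller, as the answer says.

```python
def park_instance(autoscaling, ec2, group_name, instance_id):
    """Scale down by detaching an instance from the group and stopping it."""
    autoscaling.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        # Lower desired capacity so no replacement is launched.
        ShouldDecrementDesiredCapacity=True,
    )
    ec2.stop_instances(InstanceIds=[instance_id])


def unpark_instance(autoscaling, ec2, group_name, instance_id):
    """Scale up by starting a parked instance and attaching it again."""
    ec2.start_instances(InstanceIds=[instance_id])
    # Only running instances can be attached, so wait for that state.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    autoscaling.attach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
    )
```

Attaching an instance raises the group's desired capacity by one, so the attach must not push the group past its maximum size.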
You can suspend scaling processes, see documentation here:
https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-suspend-resume-processes.html#as-suspend-resume
Enable scale-in protection on the instance and then stop it. Because the instance has scale-in protection, Auto Scaling will not delete it.
Actually you have three official AWS options to reboot or even stop an instance which belongs to an Auto Scaling Group:
Put the instance into the Standby state
Detach the instance from the group
Suspend the health check process
Ref.: https://aws.amazon.com/premiumsupport/knowledge-center/reboot-autoscaling-group-instance/
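The Standby option from this list might be wrapped as in the sketch below, assuming `autoscaling` is a boto3 Auto Scaling client. The context manager name is hypothetical. The instance should be running again before the block exits so it can return to service.

```python
from contextlib import contextmanager


@contextmanager
def standby(autoscaling, group_name, instance_id):
    """Keep an instance in Standby while the body of the `with` runs.

    Desired capacity is decremented on entry so the group does not
    launch a replacement; leaving Standby puts the instance back in
    service.
    """
    autoscaling.enter_standby(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        ShouldDecrementDesiredCapacity=True,
    )
    try:
        yield instance_id
    finally:
        autoscaling.exit_standby(
            InstanceIds=[instance_id],
            AutoScalingGroupName=group_name,
        )
```

Typical use would be `with standby(client, "my-group", "i-0abc"):` followed by the reboot or stop/start work, where the group name and instance ID are placeholders.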
As of April 2021:
Option 4: Use Warm Pools and an Instance Reuse Policy
By default, Amazon EC2 Auto Scaling terminates your instances when your Auto Scaling group scales in. Then, it launches new instances into the warm pool to replace the instances that were terminated.
If you want to return instances to the warm pool instead, you can specify an instance reuse policy. This lets you reuse instances that are already configured to serve application traffic.
This mostly automates option 3 from John's answer.
Release announcement: https://aws.amazon.com/blogs/compute/scaling-your-applications-faster-with-ec2-auto-scaling-warm-pools/
Documentation: https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-warm-pools.html
This is to expand a little on @mwalling's answer, because that is the right direction, but needs a little extra work to prevent instance termination.
There is now a way to stop or hibernate scaled in instances!
By default, AWS Auto Scaling terminates an instance on scale-in, even if you have a warm pool configured. It then creates a fresh instance to put into the warm pool, presumably to make sure you start with a fresh machine every time. However, with an instance reuse policy you can make Auto Scaling either stop or hibernate a running instance and store that instance in the warm pool.
Advantages include:
Local caches stay populated (use hibernate for in memory cache).
Burstable EC2 instances (the T* types) keep their accumulated burst credits, whereas a newly created instance starts with limited or no credits.
Practical example:
We use a burstable EC2 instance for CI/CD work that we scale to 0 instances outside working hours. With a reuse policy our local image repository stays populated with the most important Docker images. Also we keep the built up credit from the previous day and that speeds up automatic jobs we run first thing every morning.
How to implement:
There's currently no way of doing this completely via the management console. So you will need to use AWS CLI or SDK.
First create a warm pool as described in the AWS Documentation
Then execute this command to add a reuse policy:
aws autoscaling put-warm-pool --auto-scaling-group-name <Name-of-autoscaling-group> --instance-reuse-policy ReuseOnScaleIn=true
Reference docs for the command: AWS CLI Autoscaling put-warm-pool documentation
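Since the thread also uses boto3, the same setting can be sketched from Python. This assumes `autoscaling` is a boto3 Auto Scaling client. The helper name is hypothetical, and the `PoolState` default of "Stopped" is an assumption ("Hibernated" and "Running" are the other accepted values).

```python
def enable_instance_reuse(autoscaling, group_name, pool_state="Stopped"):
    """Create or update the warm pool so that scaled-in instances are
    returned to it instead of being terminated.

    pool_state: "Stopped", "Hibernated" or "Running".
    """
    autoscaling.put_warm_pool(
        AutoScalingGroupName=group_name,
        PoolState=pool_state,
        InstanceReusePolicy={"ReuseOnScaleIn": True},
    )
```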
[Figure: flow diagram of the possible lifecycles of EC2 instances. Image from AWS Documentation: Lifecycle state transitions for instances in a warm pool]

How to determine the state of an AWS instance

I'm trying to determine how to remove an instance from several applications (freeIPA, Chef, service discovery) from within an AWS autoscaling group but I'm finding that there's no reliable way to determine if an instance is simply stopping (sometimes our admins will take an instance out of the ASG for analysis) vs terminating. If the instance is stopped then I would like to retain the ability to have it stay connected to our LDAP and other systems. Anyone know a good way to do this?
Is the instance EBS backed or using instance store?
If it uses instance store, you cannot really stop it (only terminate it).
Have you looked at the EC2 API (via aws-sdk maybe)? (looks like describe-instances and looking at reservations should do the trick here)
http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstances.html
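A minimal sketch of the DescribeInstances suggestion, assuming `ec2` is a boto3 EC2 client. The helper names are hypothetical. The state names used are the EC2 instance states ("stopping"/"stopped" versus "shutting-down"/"terminated").

```python
def instance_state(ec2, instance_id):
    """Return the current state name of one instance, e.g. "stopping"."""
    response = ec2.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]
    return instance["State"]["Name"]


def is_terminating(state):
    """True if the instance is going away for good rather than stopping."""
    return state in {"shutting-down", "terminated"}
```

A cleanup script could deregister the host from freeIPA, Chef and service discovery only when `is_terminating(instance_state(ec2, instance_id))` is true.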
I determined that the best way for me to do this was to use the ASG alarms (specifically the EC2_TERMINATE alarm). This effectively allows me to take no action if an instance is stopping and fire off a script if it is determined that the instance is terminating.