How to block until EC2 status check is passed using Python Boto3? - amazon-web-services

I have the following Python code to detect whether an EC2 instance has really started, but it returns as soon as the "instance state" shows running.
Which API function should I use to block until the EC2 "status check" shows "2/2 checks passed"?
import boto3

ec2 = boto3.resource('ec2')
instance = ec2.Instance(instanceid)
# Returns once the instance state is "running"; it does not wait for status checks.
instance.wait_until_running()

It is rare that you would need to wait for the status check to pass.
When an instance enters the running state, the machine boots, loads the operating system and generally "runs".
The EC2 Status Checks are an independent process that checks attributes of the virtual machine. However, your machine is normally running, and you can log in to it, well before the status checks show a positive response.
If you do wish to wait for the Status Check, there are two waiters that might do this, but the documentation is unclear:
InstanceStatusOk
SystemStatusOk
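A minimal sketch of how those two waiters could be used from Python. In boto3 they are obtained from the EC2 client with `get_waiter("system_status_ok")` and `get_waiter("instance_status_ok")`. The function name and the `delay` and `max_attempts` defaults are illustrative choices, not something stated in the question.

```python
def wait_for_status_checks(ec2_client, instance_id, delay=15, max_attempts=40):
    """Block until both EC2 status checks report "ok" for one instance.

    ec2_client is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # Each waiter polls DescribeInstanceStatus until its check passes and
    # raises botocore.exceptions.WaiterError once max_attempts is used up.
    for waiter_name in ("system_status_ok", "instance_status_ok"):
        waiter = ec2_client.get_waiter(waiter_name)
        waiter.wait(
            InstanceIds=[instance_id],
            WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
        )
```

Typical usage would be to call `instance.wait_until_running()` first and then `wait_for_status_checks(boto3.client("ec2"), instanceid)`.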

Related

AWS EC2 Instance starting time

Sometimes the instance takes more than 5 minutes to start. In this case, the Status Checks take more than 4 minutes.
How can I get the instance running in less than a minute, including the status checks?
You do not need to wait for the Instance Status Check to complete before using an Amazon EC2 instance.
Linux instances are frequently ready 60-90 seconds after launch. Windows instances take considerably longer because the AMI has been configured for sysprep, which involves a reboot.
New instances take longer to be ready than existing instances because they typically run code on first startup. So, if you Stop an instance and later Start it again, the instance will be available quite quickly (especially Linux instances).
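If the goal is simply to use the instance as soon as it is reachable, one option is to probe the service port directly instead of waiting for the status checks. The sketch below uses only the Python standard library; the function name, the default port 22 (SSH), and the timeout values are assumptions for illustration.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300.0, interval=5.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: the instance is not ready yet, so retry.
            time.sleep(interval)
    return False
```

A successful connection only shows that the port is open. It says nothing about the health of the underlying host, which is what the system status check covers.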
I'm not sure that "You do not need to wait for the Instance Status Check to complete" is correct. If the status check fails for any reason, you (obviously) have a problem and should investigate before using the instance.
As a quick check, I used an AWS JDK script to create a "Nano" instance from a Linux image loaded with Ubuntu, Apache, Tomcat, Java, MySQL, etc. It took 45 secs to reach "running" and 2 mins 15 secs to finish the Status Checks.
Starting an existing "stopped" instance ("Nano") took 18 secs to reach "running" and 2 mins 15 secs to finish the Status Checks.
You can't change the instance status checks; they are managed by AWS. When a system status check fails, you can either wait for AWS to fix the issue or resolve it yourself by stopping and starting the instance, which in most cases migrates it to a new host computer.
The following are examples of problems that can cause system status checks to fail:
Loss of network connectivity
Loss of system power
Software issues on the physical host
Hardware issues on the physical host that impact network reachability.
The instance is accessible once it has booted, which should not take 5 minutes. You can check the instance boot log or screen from
EC2 --> Actions --> Instance settings: `Get system log` and `Get instance screenshot`, and use them to optimize the instance's startup time.
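The system log shown in the console can also be fetched programmatically. This is a minimal sketch using the boto3 EC2 client's `get_console_output` call. The helper name and the `lines` parameter are hypothetical, and the log may be empty shortly after launch.

```python
def tail_boot_log(ec2_client, instance_id, lines=20):
    """Return the last `lines` lines of the instance's console (boot) log."""
    response = ec2_client.get_console_output(InstanceId=instance_id)
    # "Output" can be missing when no console output is available yet.
    output = response.get("Output", "")
    return output.splitlines()[-lines:]
```

Comparing timestamps in the tail of the log against the launch time gives a rough idea of where the boot time is going.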

Node AWS SDK start an instance

Using the Node aws-sdk, I check the status of an instance and want to let a user start it if it is dead, or kill it if it is alive.
I found a method called runInstances, but it seems to create new instances, and I want to revive an existing one.
Is there a way to start/kill an instance using the Node SDK?
You can write the logic to check the EC2 instance status using the describeInstanceStatus method, which returns the instance state (InstanceState.Name) as one of:
pending
running
shutting-down
terminated
stopping
stopped
Based on the current state (running or stopped), you can toggle the instance state using one of the following methods:
startInstances
stopInstances
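The toggle logic can be sketched as follows. This is written in Python with boto3 rather than the Node SDK, using `describe_instance_status`, `start_instances`, and `stop_instances`, the boto3 counterparts of the methods named above. The function name and return values are illustrative. Note that DescribeInstanceStatus only returns running instances unless `IncludeAllInstances` is set, so a stopped instance would otherwise not appear in the response.

```python
def toggle_instance(ec2_client, instance_id):
    """Start a stopped instance or stop a running one; return the action taken."""
    response = ec2_client.describe_instance_status(
        InstanceIds=[instance_id],
        IncludeAllInstances=True,  # without this, only running instances are listed
    )
    statuses = response["InstanceStatuses"]
    if not statuses:
        raise LookupError(f"no status returned for {instance_id}")

    state = statuses[0]["InstanceState"]["Name"]
    if state == "stopped":
        ec2_client.start_instances(InstanceIds=[instance_id])
        return "started"
    if state == "running":
        ec2_client.stop_instances(InstanceIds=[instance_id])
        return "stopped"
    # pending, stopping, shutting-down, terminated: nothing sensible to toggle.
    return "unchanged"
```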

several minute delay after ssh connects before AWS Status Checks report success

We're trying to reduce our AWS instance startup time. We're able to ssh to an instance about 90 seconds after it starts, but the Status Checks return "initializing" until the instance has been running for over four minutes. During that time I don't see the instance doing anything (top, vmstat, uptime all show the system basically idle).
Can anyone tell me how often the Status Check is run during instance start and what specifically it's testing? Thanks.
Status Checks are performed every 60 seconds. They run various network tests and also verify that the instance has started and is configured properly.
You can read more about that here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html
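To see how long the checks stay in "initializing", one option is to poll DescribeInstanceStatus and record each transition. This is a rough sketch using boto3's `describe_instance_status`. The function name, polling interval, and the injectable `sleep` and `clock` parameters are assumptions made for illustration and testability.

```python
import time


def time_status_checks(ec2_client, instance_id, interval=15.0, max_polls=40,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until both checks are "ok"; return (elapsed_seconds, history)."""
    start = clock()
    history = []  # one (system_status, instance_status) pair per poll
    for _ in range(max_polls):
        response = ec2_client.describe_instance_status(
            InstanceIds=[instance_id], IncludeAllInstances=True
        )
        statuses = response["InstanceStatuses"]
        if statuses:
            system = statuses[0]["SystemStatus"]["Status"]
            instance = statuses[0]["InstanceStatus"]["Status"]
            history.append((system, instance))
            if system == "ok" and instance == "ok":
                return clock() - start, history
        sleep(interval)
    raise TimeoutError(f"status checks for {instance_id} did not pass")
```

Since the checks themselves run about once a minute, polling much more often than every few seconds will not make the result more precise.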

Hello World Pipeline with ShellCommandActivity

I'm trying to create a simple dataFlow pipeline with a single Activity of ShellCommandActivity type. I've attached the configuration of the activity and EC2 resource.
When I execute this, the Ec2Resource sits in the WAITING_ON_DEPENDENCIES state and after some time changes to TIMEDOUT. The ShellCommandActivity is always in the CANCELED state. I see the instance launch and very quickly change to the terminated state.
I've specified a s3 log file url, but that never gets updated.
Can anyone give me any pointers? Also is there any guidance out there on debugging this?
Thanks!!
You are currently forcing your instance to shut down after 1 minute, which produces the TIMEDOUT status if the activity can't execute in that time. Try increasing it to 50 minutes.
Also make sure that you are using an AMI that runs Amazon Linux and that your scripts use full absolute paths.
S3 log files are written as:
s3://bucket/folder/

AWS AutoScaling, downscale - wait for processes termination

I want to use AWS AutoScaling to scale down a group of instances when the SQS queue is short.
These instances do some heavy work that sometimes requires 5-10 minutes to complete, and I want this work to be completed before the instance is terminated.
I know a lot of people must have faced the same problem. Is it possible on EC2 to handle the AWS termination request and complete all my running processes before the instance is actually terminated? What is the best approach to this?
You could also use Lifecycle hooks. You would need a way to control a specific worker remotely, because AWS will select a particular instance to put in the Terminating:Wait state and you need to manage that instance. You would want to take the following actions:
instruct the worker process running on the instance not to accept any more work
wait for the worker to finish the work it is already handling
call complete-lifecycle-action
AWS will take care of the rest for you.
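The three steps above can be sketched in Python. `complete_lifecycle_action` is the boto3 Auto Scaling client call. The `worker` object and its `stop_accepting_work()` and `wait_until_idle()` methods are hypothetical stand-ins for whatever remote control your workers expose, and the group and hook names are supplied by the caller.

```python
def drain_and_complete(autoscaling_client, worker, group_name, hook_name, instance_id):
    """Drain one worker, then let Auto Scaling proceed with termination."""
    # Hypothetical worker interface: stop pulling new jobs from the queue.
    worker.stop_accepting_work()
    # Hypothetical worker interface: block until in-flight jobs have finished.
    worker.wait_until_idle()
    # Tell Auto Scaling the instance may leave the Terminating:Wait state.
    autoscaling_client.complete_lifecycle_action(
        AutoScalingGroupName=group_name,
        LifecycleHookName=hook_name,
        InstanceId=instance_id,
        LifecycleActionResult="CONTINUE",
    )
```

The hook's heartbeat timeout needs to be longer than the longest job (5-10 minutes in the question), otherwise Auto Scaling proceeds before the worker has drained.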
P.S. If you are using Celery to power your workers, you can remotely ask a worker to shut down gracefully. It won't shut down until it finishes the tasks it had started executing.
Assuming you are using Linux, you can create a pre-baked AMI that you use in the Launch Config attached to your Auto Scaling Group.
In the AMI you can put a script under /etc/init.d, say /etc/init.d/servicesdown. This script would execute anything that you need to shut down, for example scripts under /usr/share/services.
Here's kind of the gist:
servicesdown
It would always get executed when doing a graceful shutdown.
Then say on Ubuntu/Debian you would do something like this to add it to your shutdown sequence:
/usr/sbin/update-rc.d servicesdown stop 25 0 1 6 .
On CentOS/RedHat you can use the chkconfig command to add it to the right shutdown runlevel.
I stumbled onto this problem because I didn't want to terminate an instance that was doing work. Thought I'd share my findings here. There are two ways to look at this, though:
I need to terminate a worker, but I only want to terminate one that's not working.
I need to terminate a SPECIFIC worker and I want that specific worker to wait until it's done with the work.
If your goal is #1, Amazon's new "Instance Protection" looks like it was designed to resolve this.
See the link below, which gives this code snippet as an example:
https://aws.amazon.com/blogs/aws/new-instance-protection-for-auto-scaling/
while (true)
{
    SetInstanceProtection(False);
    Work = GetNextWorkUnit();
    SetInstanceProtection(True);
    ProcessWorkUnit(Work);
    SetInstanceProtection(False);
}
I haven't tested this myself, but I see API calls related to setting the protection, so it appears that this could be integrated into the EC2 Worker App code-base and then when Scaling In, instances shouldn't be terminated if they are protected (currently working).
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/autoscaling/AmazonAutoScaling.html
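An untested sketch of how that loop might look in Python, using the boto3 Auto Scaling client's `set_instance_protection` call. The function name and the `get_next_work_unit` and `process_work_unit` callables are hypothetical. Unlike the blog's pseudocode, this version stops when `get_next_work_unit` returns `None` and uses `try`/`finally` so protection is released even if a work unit fails.

```python
def run_worker(autoscaling_client, group_name, instance_id,
               get_next_work_unit, process_work_unit):
    """Process work units, protecting the instance from scale-in while busy."""

    def set_protection(protected):
        autoscaling_client.set_instance_protection(
            InstanceIds=[instance_id],
            AutoScalingGroupName=group_name,
            ProtectedFromScaleIn=protected,
        )

    processed = 0
    set_protection(False)  # idle: the instance may be chosen for scale-in
    while True:
        work = get_next_work_unit()
        if work is None:  # assumption: None signals that the worker should stop
            return processed
        set_protection(True)  # busy: ask Auto Scaling not to terminate us
        try:
            process_work_unit(work)
            processed += 1
        finally:
            set_protection(False)
```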
As far as I know, there is currently no option to terminate an instance while shutting it down gracefully and letting its processes complete their work.
I suggest you look at http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/as-configure-healthcheck.html.
We implemented it for Resque workers: we move the instance to the unhealthy state and then downsize the Auto Scaling group. A script constantly checks the health state on each instance. Once the instance moves to the unhealthy state, the script stops all services gracefully and sends a terminate signal to EC2.
Hope it helps.