aws instance recovery vs reboot - amazon-web-services

The CloudWatch docs show it's possible, based on the outcome of an alarm, to (among other things) either reboot or recover an instance. What is the difference between a reboot and a recovery?
The docs say:
A recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html
But from the tests I have done, that is the case for rebooted instances too.

Just found what I needed, but I'll leave this here for others searching for the same thing.
Recovery moves the instance to a new hypervisor (host); rebooting keeps the instance on the same hypervisor.
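To make the distinction concrete, here is a minimal sketch of how the two alarm actions might be set up. It only builds the arguments one could pass to boto3's `put_metric_alarm`; no AWS call is made. The helper name `build_alarm_kwargs`, the alarm naming scheme, and the period/threshold values are assumptions for illustration. The `arn:aws:automate:<region>:ec2:recover` and `...:ec2:reboot` action ARNs are the ones CloudWatch uses for these actions, and pairing recover with the system check and reboot with the instance check is the usual setup.

```python
def build_alarm_kwargs(instance_id: str, region: str, action: str) -> dict:
    """Build keyword arguments for a CloudWatch alarm that recovers or reboots an instance."""
    # Recover reacts to host-level (system) failures; reboot to instance-level failures.
    metric_for_action = {
        "recover": "StatusCheckFailed_System",
        "reboot": "StatusCheckFailed_Instance",
    }
    if action not in metric_for_action:
        raise ValueError(f"unsupported action: {action!r}")

    return {
        "AlarmName": f"{action}-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": metric_for_action[action],
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,              # assumed: evaluate once a minute
        "EvaluationPeriods": 2,    # assumed: two failing periods in a row
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:{action}"],
    }
```

With boto3 installed and credentials configured, the result could be passed along as `boto3.client("cloudwatch").put_metric_alarm(**build_alarm_kwargs(...))`.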

Related

Amazon EC2 instance passed 1/2 checks

Newbie to Amazon Web Services here. I launched an instance from a Public AMI and found that I could not ssh into the instance - I received the error "Connection timed out." I checked the security groups to verify that Port 22 was associated with 0.0.0.0/0. Additionally, I checked the route tables to verify that 0.0.0.0/0 is associated with target gateway attached to the VPC.
I find that only 1/2 status checks have passed - the instance status check failed. I have tried stopping and starting the instance as well as terminating it and launching a new instance, both to no avail. The error that I see in the system log is:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1).
From this previous question (Ec2 1/2 checks passed), it appears that this could be a virtualization issue, but I'm not sure whether that was due to something I did on my end when launching the instance or something done by the creators of the AMI.
Any help would be appreciated!
Can you share any more details about how you deployed the instance? Did you use the AWS Management Console, or one of the command line tools or SDKs to deploy it? Which public AMI did you use? Was it one of the ones provided by Amazon?
Depending on your needs, I would make sure that you use one of the AMIs provided by Amazon, such as Ubuntu, Amazon Linux, CentOS, etc. Here are the links to the docs on AMIs, but you can learn quite a bit by just searching for images. Since you mentioned virtualization types though, I'd suggest reading up briefly on the HVM vs. Paravirtual virtualization types on AWS. Each of the instance types / families uses a certain virtualization type, which is indicated in the chart on this page.
Instance Status Checks
This documentation page covers the instance status checks, which you'll probably want to familiarize yourself with. It's entirely possible that stopping the instance (a full shutdown, not a restart) and then starting it back up might clear the failed instance status check.
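As a small illustration of what "1/2 checks passed" means, the sketch below summarizes one entry of the `InstanceStatuses` list that EC2's `describe_instance_status` call returns. The function name `summarize_status_checks` and the output wording are made up for this example; the `SystemStatus` / `InstanceStatus` keys and the `ok` / `impaired` values follow the API response shape.

```python
def summarize_status_checks(status: dict) -> str:
    """Summarize the two EC2 status checks for one instance."""
    checks = {
        "system": status["SystemStatus"]["Status"],      # host / hardware side
        "instance": status["InstanceStatus"]["Status"],  # OS / guest side
    }
    passed = sum(1 for state in checks.values() if state == "ok")
    failing = [
        f"{name} status check: {state}"
        for name, state in checks.items()
        if state != "ok"
    ]
    summary = f"{passed}/{len(checks)} checks passed"
    if failing:
        summary += " (" + ", ".join(failing) + ")"
    return summary
```

A kernel panic like the one in the question would typically show up as a passing system check and an impaired instance check, since the host is healthy but the guest OS never finishes booting.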
Spot Instances - cost savings!
By the way, I'll just mention this since you indicated that you're new to AWS ... if you're just playing around right now, you can save a ton of cost by deploying EC2 Spot Instances, instead of paying the normal, on-demand rates. Depending on current rates, you can save more than 50%, and per-second billing still applies. Although there's the possibility that your EC2 instance could get "interrupted" based on market demand, you can configure your Spot Instance to just "Hibernate" or "Stop" instead of terminating and relaunching. That way, your instance state is saved for when it relaunches.
Hope this helps!
1) Use well-known images or contact the image developer. Perhaps the image requires more than one drive or tricky partitioning.
2) Make sure you selected the proper HVM/PV image for the instance type.
3) (After the checks have passed) make sure the instance has a public IP.

AWS EC2 instance becomes inaccessible via SSH after

Issue: After a given period of time (usually the time it takes for the initial status checks to complete) I can no longer access my EC2 instance via SSH. More specifically, during the initial period I have normal access to my instance via SSH, then the connection drops and the machine becomes completely unreachable, even when trying to ping it.
I have double checked Security Group, VPC settings etc. but don't think that can be the issue as at one point in time I can access the machine.
The issue occurs on "vanilla" instances from very basic standard AMIs as well as with AMIs I run on other AWS accounts. I have tried various instance types / sizes, but the issue occurs again and again.
Any ideas welcome! Thanks in advance
Dan
There are many things that can cause this. Specifically, if you're supplying user data during startup it could encounter an issue and eliminate your ability to SSH if it is modifying the file system and mounts or changing permissions.
If you can save the underlying system volume, you can mount it on another instance and check /var/log/boot.log and /var/log/cloud-init-output.log.
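Once the volume is mounted somewhere readable, a quick scan of those logs could look like the sketch below. The function name `find_suspicious_lines` and the keyword list are assumptions for illustration; the right keywords depend on what the user data script actually does.

```python
def find_suspicious_lines(
    log_text: str,
    keywords: tuple = ("error", "failed", "traceback", "permission denied"),
) -> list:
    """Return (line_number, line) pairs whose text contains any keyword, case-insensitively."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.strip()))
    return hits
```

For example, the contents of the mounted volume's `var/log/cloud-init-output.log` could be read with `open(...).read()` and passed in as `log_text`.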

Do EC2 instances randomly start/stop?

I am trying to wrap my head around EC2 instances, and I am having a bit of an issue. I heard from a friend of mine that Amazon will kill EC2 instances, and then they restart the image (thus losing all state). Unless it uses EBS as a backing store, you get no persistence.
But I have been looking into Xen and it seems like instances should easily migrate instead of being killed/restarted.
So, do Amazon EC2 instances randomly stop/start an image with all state being managed by something external like EBS?
Amazon EC2 instances will not be stopped/started/restarted unless you issue a command to do so.
In some situations (e.g. hardware maintenance), you might receive a request from Amazon asking you to stop and start your instance (which moves it to a different host). Such requests are typically issued with two weeks' notice.
One AWS customer told me that their instance had been running continuously for over three years.
Yes, it is quite possible that an EC2 instance dies and is replaced. Depending on your data, you may need to use EBS, EFS or S3 to prevent data loss in such cases.

Amazon Elastic Beanstalk instance's non-persistent storage data recovery

Last night there was an error on my EB instance. The instance was removed and a new one was added. Because of that I lost data from the non-persistent instance storage. I don't have a backup / snapshot. A big beginner's mistake.
My question: Is there any chance to recover the instance's data from 12 hours ago? Maybe with the help from the AWS staff?
When an instance is stopped or terminated, the ephemeral volumes are gone. Terminating an instance releases the hardware for use by another customer. Stopping an instance does the same thing -- that's part of why you don't pay for stopped instances. The same instance will actually come up on physically different hardware if stopped and started.
Aside from the documentation...
The data in an instance store persists only during the lifetime of its associated instance. If an instance reboots (intentionally or unintentionally), data in the instance store persists. However, data in the instance store is lost under the following circumstances:
The underlying disk drive fails
The instance stops
The instance terminates
Therefore, do not rely on instance store for valuable, long-term data.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
...there are also numerous support forum posts on this topic, with responses by AWS personnel, indicating that the answer is "no."
Here's one example:
Once an instance that has Ephemeral (or Instance Store) volumes has been Stopped or Terminated, we are unable to recover the data that was on that volume. When you Stop or Terminate such an instance, those volumes are securely wiped and this is to ensure the security and confidentiality of your data that was on that volume.
This is as per: http://aws.amazon.com/instance-help/
And: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
https://forums.aws.amazon.com/thread.jspa?messageID=501815&#501815
And another:
Ephemeral (instance-store data) is the local host's hard drive and when an instance is migrated (moved) to new hardware (from a stop/start) the ephemeral data is scrubbed as part of the process as the instance will have new ephemeral storage as part of the new host.
All that said, there is not any way to get the data back from the ephemeral location.
https://forums.aws.amazon.com/thread.jspa?messageID=396680&#396680
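The rules quoted above can be condensed into a tiny lookup. This is only a restatement of the documentation excerpt, not an AWS API; the `Event` enum and the function name `instance_store_survives` are invented for this sketch.

```python
from enum import Enum


class Event(Enum):
    REBOOT = "reboot"
    STOP = "stop"
    TERMINATE = "terminate"
    DISK_FAILURE = "disk_failure"


def instance_store_survives(event: Event) -> bool:
    """Per the quoted docs, instance store data persists only across a reboot."""
    # Stop, terminate and an underlying disk failure all lose the data.
    return event is Event.REBOOT
```

In the Elastic Beanstalk case from the question, the instance was removed and replaced, which corresponds to `Event.TERMINATE`.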

AWS/EC2 - Initially working instances, become inaccessible, although still running.

Issue in a nutshell:
Simple, single practice EC2 instances are unexpectedly falling off the grid even though they are still running, and I have to keep recreating them. If I don't, accessing them via SSH or via the public DNS results in a "Timeout".
Little More Details Outside the Nutshell :)
I've followed the setting up a LAMP server instructions to the "T" and successfully have served up basic HTML pages.
Everything initially works fine:
I can ssh into the instance no problem
When accessing the public DNS online - the expected html pages render just fine.
Problem:
But then, quite randomly, I can no longer access the instance through SSH, and the public DNS is inaccessible online as well.
In both cases they just "Timeout"
Config:
Basic Free Tier
Amazon Linux AMI 2015.09.1 (HVM), SSD Volume Type
t2.micro
Number of Instances - 1
Auto-assign Public IP(Enabled)
Ports - 22(My IP),80(0.0.0.0),443(0.0.0.0)
Using a key pair
Question:
What typically causes instances to freeze up like this?
LAMP stacks on EC2 are extremely common, and the guide you're following is extremely popular and has been used for years, so it's likely you've gone wrong somewhere or the problem is something more sinister.
If you can't access the instance by any means, it sounds like it has become overloaded, unless you've accidentally changed a firewall rule on the AWS side (e.g. Security Groups, NACLs) or something at the instance level (e.g. iptables).
Open up ICMP on your security group and try pinging the instance and see if you get a response.
After you've verified all your firewalls and you've tried to connect to it through every means, check out the logs, they're your friend.
To check the logs, start at the AWS level. CloudWatch records lots of data about your instance - CPU Utilization, Network In & Out and more. Check all of these through the AWS Console ensuring you select the "Maximum" statistic and not "Average". Also, take a look at the "StatusCheckFailed_System" (Hardware problem) and "StatusCheckFailed_Instance" (Instance not responding to health check probes) metrics to see if they have any story to tell. See the docs here and here for more info.
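To see why "Maximum" matters more than "Average" here, consider the sketch below. The sample values are invented purely for illustration (they are not real CloudWatch output), and the function name `summarize_datapoints` is hypothetical.

```python
def summarize_datapoints(values: list) -> dict:
    """Compute the Average and Maximum statistics for one period of samples."""
    if not values:
        raise ValueError("need at least one datapoint")
    return {
        "Average": sum(values) / len(values),
        "Maximum": max(values),
    }


# Invented per-minute CPU utilization (%) within one 5-minute period.
cpu_samples = [5, 7, 100, 6, 8]
stats = summarize_datapoints(cpu_samples)
print(stats)
```

The single minute at 100% is obvious in the maximum but is diluted to roughly a quarter of that in the average, which is how a short overload can hide on the default graph.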
Next, reboot the instance (or try stopping and starting it) and reconnect via SSH. Check your application logs (if any), your Apache logs and your Linux logs to see what happened.
But to answer your question, here is what typically causes an instance to freeze up like this:
Bad Application code that sucks up all the CPU overloading the instance
Too much traffic overloading the instance
Running too many services on the instance that it's unable to handle
AWS Hardware problem - Uncommon