Do I need to backup my EC2 instance? - amazon-web-services

I’m using the g2.2xlarge instance type. This pricing table shows that the instance has “60 SSD Instance Storage.” The Best Practices for Amazon EC2 tells us that “the data stored in instance store is deleted when you stop or terminate your instance.” However, I have stopped the instance and the data on it remained. So does that mean that the data is on EBS or… I'm relatively new to EC2 and I just want to know whether I need to back up my data.

Yes, you should.
Here's why:
The data in an instance store (i.e. your 60 SSDs) are guaranteed to persist only during the associated instance's lifetime. This means that data is guaranteed to persist over reboots, but not if you were to STOP or TERMINATE the instance. In this scenario the underlining hardware may be replaced and you might lose everything. You are also subject to disk drive fails that can corrupt your data.
You said you did not lose anything when you stopped the instance, but you could have.
Therefore, you should use EBS or S3 or something else to backup your data.
Bonus points: you cannot detach your instance storage, what can be a problem if you ever need to change your instance - which you are very likely to do at one point.
Source: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
Cheers!

Backing-up is ALWAYS a good policy.
In the case of AWS EC2, however, you have the option to creating a snapshot. Follow This link:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html

Do you need to backup your instance?
In a purist sense - No, always assume your EC2 could fail at any time, so don't put anything on there you don't have elsewhere (ie. source code, data, etc)
If you stopped your instance and the data is still there, it's likely it's mounted on an EBS, not ephemeral storage.
Do you need to backup your EBS?
Depends on your requirement. EBS is distributed over an AZ which gives you pretty good durability, so just think about what's on there and how bad it would be if you lost it

Related

What does EC2 store and why does it even need a storage solution like EBS or Instance Store?

If you use EC2 and launch instances, you can add EBS volumes. So a storage option. However, what I still don't understand exactly is why. Why is there or does EC2 even need a storage option like EBS or Instance Store? What does EC2 store anyway? And why it makes sense that there is EBS?
I know that EBS volume is persistent block storage and data is not lost after exit, unlike instance store. I just don't really understand what EBS is useful for. For which cases and applications is EBS used? Or does using EBS have more to do with creating snapshots that you can create to cache data and then save it to S3?
I've already read a lot and tried to make it understandable somehow, but somehow I can't get any further here. I would be really happy if someone could shed some light on this for me.
Thank you already!
Think of an Amazon EC2 instance as a normal computer. Inside, there is CPU, RAM and (perhaps) a hard disk.
When an EC2 instance has a hard disk, it is called Instance Storage and it behaves just like a normal hard disk in a computer. However, when you turn off the instance and stop paying for it, the EC2 instance can give that computer to somebody else. Rather than giving your data to somebody else, the disk is erased. So, anything you stored on Instance Store is gone! (In truth, instance store is also a virtualised disk, but this is close enough.)
In fact, in the early days of EC2, this was the only storage available. If you wanted to keep data after the instance was turned off, you first had to copy it to Amazon S3. People didn't like this, so they invented Amazon EBS.
If you want to keep your data so that it is still there when you turn on the instance in future, it needs to be stored on a network disk and that is what Amazon EBS provides. Think of it a bit like a USB drive that you can plug into one computer, then disconnect it and plug it into another computer. However, rather than being a physical device, it uses a storage service that keeps multiple copies of the data (in case a disk fails) and lets you modify the size of the disk. You are charged based on the amount of storage space assigned and how long the data is kept ("GB-Month").
Amazon EBS Snapshots are simply a backup of the disk. A snapshot contains all the data currently on the disk, allowing you to create a new disk anytime that will contain an exact copy of the disk as it was when the snapshot was created. This is great for backups, but is also very useful for creating multiple EC2 instances with the same disk content. An Amazon Machine Image (AMI) is actually just an Amazon EBS Snapshot plus a bit of metadata. When a new EC2 instance is launched, it uses an AMI to populate the boot disk rather than loading the operating system from scratch every time.
It is possible to create an AMI that populates an Instance Store disk. This way, you don't actually need to use an Amazon EBS volume. This is good for instances that don't need to permanently keep any data -- they could simply store information in a database or Amazon S3 instead of saving it on disk. Instance Store disks can be very fast since they don't send data across the network, so this is very useful in some situations.
In summary:
Instance Store is a normal disk in a computer (but it gets erased when the instance turns off so nobody else sees your data)
Amazon EBS volumes are network-attached storage that stays around until you delete it

Amazon Elastic Beanstalk instance's non-persistent storage data recovery

Last night there was an error on my EB instance. The instance was removed and a new one was added. Because of that I lost data from the non-persistent instance storage. I don't have a backup / snapshot. A big beginner's mistake.
My question: Is there any chance to recover the instance's data from 12 hours ago? Maybe with the help from the AWS staff?
When an instance is stopped or terminated, the ephemeral volumes are gone. Terminating an instance releases the hardware for use by another customer. Stopping an instance does the same thing -- that's part of why you don't pay for stopped instances. The same instance will actually come up on physically different hardware if stopped and started.
Aside from the documentation...
The data in an instance store persists only during the lifetime of its associated instance. If an instance reboots (intentionally or unintentionally), data in the instance store persists. However, data in the instance store is lost under the following circumstances:
The underlying disk drive fails
The instance stops
The instance terminates
Therefore, do not rely on instance store for valuable, long-term data.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
...there are also numerous support forum posts on this topic, with responses by AWS personnel, indicating that the answer is "no."
Here's one example:
Once an instance that has Ephemeral (or Instance Store) volumes has been Stopped or Terminated, we are unable to recover the data that was on that volume. When you Stop or Terminate such an instance, those volumes are securely wiped and this is to ensure the security and confidentiality of your data that was on that volume.
This is as per: http://aws.amazon.com/instance-help/
And: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
https://forums.aws.amazon.com/thread.jspa?messageID=501815&#501815
And another:
Ephemeral (instance-store data) is the local host's hard drive and when an instance is migrated (moved) to new hardware (from a stop/start) the ephemeral data is scrubbed as part of the process as the instance will have new ephemeral storage as part of the new host.
All that said, there is not any way to get the data back from the ephemeral location.
https://forums.aws.amazon.com/thread.jspa?messageID=396680&#396680

Doubts about recovering a .pem of an EC2 in AWS

We are working with an EC2 instance in Amazon Web Services but we have lost our .pem.
In order to create a new one, we are following this guidance:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#replacing-lost-key-pair
However, we are a bit worried because of this warning:
When you stop an instance, the data on any instance store volumes is
erased. Therefore, if you have any data on instance store volumes that
you want to keep, be sure to back it up to persistent storage.
We cannot access the instance, therefore we cannot really make a proper backup. Instead, we have make a snapshot of the volumes in Elastic Block Store.
We are wondering if this is enough and we can indeed stop the instance to proceed to the pair key recover or we need to do something else in order not to lose any data.
It depends on the type of instance.
If it's EBS backed you are probably safe to proceed as the volume will be reattached.
If it's instance store backed and you lost access to it you basically have lost what's on that machine.
By the sounds of it it's EBS backed. If it's instance store backed and you later created and attached an EBS volume and used that, you're going to be able to restore/reattach that volume just fine - but it's going to be to another machine.
Depending on how many instances we are talking about you should also be able to take an AMI Image of the running instance which will take snapshot of the EBS but also the exact state of the instance.
However if the instance's root device type is using a EBS backed store all the data should be safe so saving a snapshot and relaunching a new instance with the snapshot should have the data.
Good luck.

Amazon instance store

As far as I understand for new created amazon instance ephermeral data store is used by default, unless EBS store is configured.
After stop of the instance, which uses ephermeral data store, I will loose all data. Is it correct ?
I noticed that EBS store has been created automatically for my instance. I have created few files in home directory, but this files were not deleted after reboot. So where is ephermeral data is stored ?
I want to install database to Amazon host. Should I worry about data loose with default setup and what is the common configuration, for example
Create instance
Install and configure database on ephermeral data store
Make AMI
Create EBS store and configure database to use it as storages
After stop of the instance, which uses ephermeral data store, I will loose all data. Is it correct ?
To be specific, after you terminate or stop a node, any data on instance-specific storage will be lost. A reboot is different, and your data is intact in those cases. I am using these terms to match the terms in the AWS console.
To confuse matters slightly, some EBS-backed nodes also have some instance-specific storage. All instance-storage nodes are 100% instance-backed, though. So you really need to understand whether your data is hitting an EBS disk or instance-local storage.
I noticed that EBS store has been created automatically for my instance. I have created few files in home directory, but this files were not deleted after reboot. So where is ephermeral data is stored ?
Several points here:
For an EBS-backed instance, your /home partition is on the EBS root device, and hence data will persist provided the volume exists.
Again a reboot wouldn't delete your data even if you had an instance-storage node, but it sounds like you chose an EBS-backed node.
If you had instead created these files in /mnt, then stopped your instance and later started it again, you might have lost them. Again it depends exactly which ec2 node type you're running.
Regarding your last point - I would recommend that you just make sure your data is being stored on some EBS backed disk. Whether that is your root device or a separate EBS volume is up to you and depends on your specific needs.
I want to install database to Amazon host.
You should give some thought to not installing and maintaining your own database. Doing so is complex, error prone, and can be quite time consuming. I
A better option for most folks is a turnkey database solution like RDS. This is a performant database that you don't have to really think about - it'll just work. RDS isn't for everyone, as there are some restrictive permission issues, but generally speaking it's great. I use it every day.
You can run databases on top of EBS and it'll work just fine. But you are biting off being a database admin at that point, and need to worry about all the complexity that comes with it. In my opinion, better to focus your time & energy on things like database schema, queries, and other aspects of your business.

AWS can an EBS-backed instance also access "instance store?

I thought I clearly understood the difference between instance-store and EBS backed AMIs.
But http://aws.amazon.com/maintenance-help/ says "if you are running an EBS-backed AMI, you can stop and then restart your instance in order to easily re-launch it. This will cause the loss of any data you have saved on the local instance store of the instance,"
Stop/start does NOT lose the sysvol data, so this confuses me.
I'm assuming that here, by "local instance store", they mean the backing EBS volume (the sysvol), and I'm thinking that they meant to say "terminate" instead of stop. Am I correct?
Terminating an EBS-backed instance will not cause your data to be deleted. You can still access the EBS volume until you delete it (unless you set it to delete when your instance is terminated).
Local instance store refers to hard drive space on the actual physical server that is running your instance. You can see the available instance store by doing sudo fdisk -l. Some images come with some instance store volumes already mounted (see df -h). Otherwise you'll have to mount and format the instance store volumes before you can use them.
Data on an instance store volume is lost when you stop (not terminate) your instance because it is local to a physical server, and your instance might start up on a new server.
Quite simply, EC2 is running your virtual server on some physical server. The root filesystem can either be on a local disk (ephemeral storage) or on network attached storage (EBS). With EBS, they can snapshot it for backups or to make a copy, so EBS is far more flexible, although not as fast as a local disk in the server where your instance is running.
In order to make this all work, when you shutdown an ephemeral server, amazon wipes the disk in order to reallocate it to the next customer. There is no need or reason for them to do that with EBS, since it was not physically attached to that server in the first place.
You might note, that even EBS backed instances (depending on size) come with an allocation of ephemeral storage (2-500gig+) which can be used for swap, logs, or whatever else you want to do with them. The only issue of course is that should the server be shutdown, or should there be a catastrophic disk or hardware error, you'll lose that data. You can still manually back it up, in the same way people have backed up traditional servers over the years.
Making your own AMI from an EBS backed server is trivial now, and can be done easily through the AWS web interface. Making a non-EBS backed AMI is a very complicated task the last time I tried to do it. With that said, there are certain use cases where it makes a lot of sense to consider using purely ephemeral storage. Computation or memory/cache nodes that have no need to persist data will be faster and cost less.