Make a second, independent copy of an EBS volume's data - amazon-web-services

I have an EBS volume with a number of snapshots. I would like a second, distinct copy of the EBS volume so I can:
restore a snapshot on the duplicate volume
continue using the original data on the original volume
Note this is distinct from similar questions, e.g. "In Amazon EC2, how do I copy a EBS volume to another user?", which are more about changing permissions on volumes so that others can access them.
How can I have continual access to two, divergent copies of the data?
Thanks!

Looks like Copying an Amazon EBS Snapshot from the official docs will do it.

I've read your question quite a few times and I'm not sure what you mean.
Once you've created a snapshot, that snapshot is stored in S3 and is durable. Regardless of whether you overwrite the original volume or continue using it, the snapshot you made is good.
Any later snapshots you make are also durable.
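To get the second, divergent copy the question asks for, one option is to create a new volume from one of the existing snapshots and attach it next to the original. The following is only a minimal sketch: it assumes `ec2` is a boto3 EC2 client (`boto3.client("ec2")`), and the function names, IDs and device name are hypothetical placeholders.

```python
def create_divergent_copy(ec2, snapshot_id, availability_zone):
    """Create a new, independent EBS volume from an existing snapshot.

    `ec2` is assumed to be a boto3 EC2 client. The original volume is not
    touched, so the two volumes can diverge from this point on.
    """
    response = ec2.create_volume(
        SnapshotId=snapshot_id,
        AvailabilityZone=availability_zone,  # must match the instance's AZ
    )
    volume_id = response["VolumeId"]
    # Block until the new volume is ready to be attached.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    return volume_id


def attach_copy(ec2, volume_id, instance_id, device="/dev/sdf"):
    """Attach the new volume to an instance (device name is an assumption)."""
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return device


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   copy_id = create_divergent_copy(ec2, "snap-0123456789abcdef0", "us-east-1a")
#   attach_copy(ec2, copy_id, "i-0123456789abcdef0")
```

The original volume stays attached and in use, while the copy starts from the snapshot's state.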

Related

AWS EBS snapshots questions

I'm writing because I'm very confused about the mechanism that is responsible for taking EBS snapshots.
First of all, as far as I understand the difference between "backup" and "snapshot": a backup is a full one-to-one copy of the volume's blocks, whereas a snapshot is a "delta" approach where only changed blocks are copied, right?
If that definition is right, then I would assume that an EBS snapshot should really be called a backup, since what we typically get is a full copy of all the blocks the EBS volume is built on.
In almost every piece of documentation on the AWS website, I read that EBS snapshots are taken incrementally (the first one is full, then only the differences from the previous state are stored). But in my small experiment in the AWS console I was not able to see that in action.
I took a snapshot of my EBS volume (50GB) and the snapshot had a size of exactly 50GB. Then I took another snapshot, and again the size was 50GB. That left me incredibly confused :///
All my tests were made using only the root volume (the first one attached to the EC2 instance). Now I am wondering: if I have a DB installed (PostgreSQL) on an EC2 instance that has only the root volume attached, is it safe to take a snapshot of the EBS volume (as a backup for my DB) while the machine is running? Or do I unfortunately have to take the whole instance offline periodically and only then back up my DB volume?
EBS Snapshots work like this:
On your initial snapshot, it will create a block-level copy of your volume on S3 in the background. On subsequent snapshots it only saves the blocks that have changed since the last snapshot to S3, and for the rest it keeps a pointer to the original blocks. The third snapshot works similarly to the second: it again stores the blocks that have changed since the second snapshot and adds pointers to the other blocks.
If you restore the second snapshot, it will create a new volume, look up in its metadata store which pointers belong to that snapshot, and then retrieve the blocks they point to from S3.
If you delete Snapshot two, it will remove the pointers to the blocks that belong to snapshot two. If any of the blocks on S3 has no pointer left, i.e. doesn't belong to a snapshot anymore, it will be deleted.
To you as the client this whole process is transparent - you can delete or restore any snapshot you like and EBS will take care of the specifics in the background.
Should you be more interested in the details under the hood, I can recommend this article: The Jellyfish-inspired database under AWS Block Storage
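The pointer bookkeeping described in this answer can be illustrated with a toy model. This is not how EBS is implemented internally; it is just a sketch with a hypothetical `SnapshotStore` class, where a dictionary stands in for S3 and each snapshot is an index of pointers to stored blocks.

```python
class SnapshotStore:
    """Toy model: blocks are stored once, snapshots only hold pointers."""

    def __init__(self):
        self.blocks = {}      # block_id -> data (stands in for S3)
        self.refcount = {}    # block_id -> number of snapshots pointing at it
        self.snapshots = {}   # snapshot name -> {volume offset: block_id}
        self._next_id = 0

    def snapshot(self, name, volume, parent=None):
        """Store only blocks that differ from `parent`; point at the rest."""
        parent_index = self.snapshots.get(parent, {})
        index = {}
        for offset, data in enumerate(volume):
            old_id = parent_index.get(offset)
            if old_id is not None and self.blocks[old_id] == data:
                block_id = old_id             # unchanged: reuse the pointer
            else:
                block_id = self._next_id      # changed or new: store a copy
                self._next_id += 1
                self.blocks[block_id] = data
                self.refcount[block_id] = 0
            self.refcount[block_id] += 1
            index[offset] = block_id
        self.snapshots[name] = index

    def restore(self, name):
        """Rebuild the full volume contents by following the pointers."""
        index = self.snapshots[name]
        return [self.blocks[index[offset]] for offset in sorted(index)]

    def delete(self, name):
        """Drop a snapshot; free blocks that no snapshot points to anymore."""
        for block_id in self.snapshots.pop(name).values():
            self.refcount[block_id] -= 1
            if self.refcount[block_id] == 0:
                del self.blocks[block_id]
                del self.refcount[block_id]


store = SnapshotStore()
volume = ["a", "b", "c"]
store.snapshot("snap1", volume)                   # full copy: 3 blocks stored
volume[1] = "B"
store.snapshot("snap2", volume, parent="snap1")   # only 1 new block stored
store.delete("snap1")                             # frees only the old "b" block
restored = store.restore("snap2")                 # still a complete volume
```

Even after the first snapshot is deleted, the second one restores completely, because the unchanged blocks it points to are kept as long as any snapshot references them.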

EBS Snapshots versioning

Are EBS snapshots versioned?
If yes, where can I find the version information?
I tried to check the official Amazon docs, but couldn't get a clear answer to this.
Yes and No.
Each snapshot is, in a way, a 'version'.
The reason for this is that, when a Snapshot is created, any block that has been added or modified since the previous snapshot is copied to Amazon S3 (in a place you can't directly access) and the Snapshot becomes the 'index' to those blocks.
Scenario:
Create Snapshot1
Modify one block
Create Snapshot2
When Snapshot2 was created, one block was copied to S3. Snapshot2 still points to all the blocks used in the volume, but they were already in S3 and didn't need to be re-copied. So, you can think of Snapshot1 and Snapshot2 as being different 'versions' of the disk.
If Snapshot1 is deleted, the underlying data is kept in S3 because it is used by Snapshot2. If Snapshot2 is then deleted, all of the snapshot data in S3 will be deleted. (Unless the original volume was based on an AMI, which is a snapshot itself! In that case, only the changes made since the AMI was instantiated are deleted. Neat and confusing, eh!)
AWS EBS snapshots do not expose a version. They are identified by Snapshot ID, Date (Started) and Volume ID.
Here is an AWS article on snapshots:
EBS Snapshots
Here is a third-party article on snapshots:
AWS EBS Snapshot Explained
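Since snapshots carry no version number and are identified by Snapshot ID, start date and Volume ID, a "version history" for a volume can be approximated by listing its snapshots in chronological order. A minimal sketch, assuming `ec2` is a boto3 EC2 client; the function name is hypothetical, and pagination of large result sets is left out for brevity.

```python
def snapshot_history(ec2, volume_id):
    """Return (SnapshotId, StartTime, VolumeId) tuples, oldest first.

    `ec2` is assumed to be a boto3 EC2 client. Only snapshots owned by the
    calling account are listed.
    """
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )
    ordered = sorted(response["Snapshots"], key=lambda s: s["StartTime"])
    return [(s["SnapshotId"], s["StartTime"], s["VolumeId"]) for s in ordered]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   for snap_id, started, vol in snapshot_history(boto3.client("ec2"), "vol-0abc"):
#       print(snap_id, started, vol)
```

The position of a snapshot in this list plays the role of a version number, although AWS itself does not assign one.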

Cost-effectively store volumes that I won't need for a few months?

I have two EC2 instances I created this summer for personal use while learning basic ML concepts and doing Kaggle competitions. I'd like to save the work on them and eventually be able to use them again, if I'm interested in competing in a Kaggle competition again, without having to set up a new instance. I probably won't need them for a few months (and when I do need them, it won't be at a moment's notice).
Each instance has a 128GB EBS gp2 volume that's costing me ~$13/month. I was wondering if there's a way that I could pull these off AWS so that I'm not still paying for them when I don't need them. Is there a feature where I can store a snapshot outside of AWS and eventually upload it to AWS and restore the volumes if I need them?
Or is there a much cheaper (slower) storage method for keeping them on AWS? (sc1 volumes are $0.025/GB-month, but is there something even cheaper?)
Edit: Clarified volume type ($0.10/GB-month gp2)
Edit2: I think my best bet for now is to snapshot them since each only has ~30GB of used space (60GB*$0.05 = $3/month) and delete the original volumes.
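The numbers in the question can be checked with a few lines of arithmetic. The per-GB prices below are the ones quoted in this thread ($0.10 for gp2, $0.025 for sc1, $0.05 for snapshots) and may not match current AWS pricing; the helper name is hypothetical.

```python
# Prices per GB-month as quoted in the question (assumed, may be outdated).
GP2_PER_GB_MONTH = 0.10
SC1_PER_GB_MONTH = 0.025
SNAPSHOT_PER_GB_MONTH = 0.05


def monthly_cost(size_gb, price_per_gb_month):
    """Monthly storage cost in dollars, rounded to cents."""
    return round(size_gb * price_per_gb_month, 2)


gp2_per_volume = monthly_cost(128, GP2_PER_GB_MONTH)            # one 128GB gp2 volume
sc1_per_volume = monthly_cost(128, SC1_PER_GB_MONTH)            # same volume on sc1
snapshots_total = monthly_cost(2 * 30, SNAPSHOT_PER_GB_MONTH)   # 2 x ~30GB used space

print(f"gp2, per volume:       ${gp2_per_volume:.2f}/month")
print(f"sc1, per volume:       ${sc1_per_volume:.2f}/month")
print(f"snapshots, both disks: ${snapshots_total:.2f}/month")
```

With these prices, keeping both disks as snapshots costs less per month than keeping even a single volume on sc1, because snapshots are billed for used space rather than provisioned size.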
If you wish to retain the exact contents of the disk volumes, the choice really comes down to:
Amazon EBS volume snapshots
ISO images
Amazon EBS volume snapshots are only charged for blocks that are used. They are the easiest to create and restore. It is not possible to export an Amazon EBS snapshot.
If you wish to move a disk image out of Amazon EC2 (eg to download, or to store in Amazon S3), use a standard disk utility to create a .iso image of the disk. This can later be restored to a new disk volume, and can even be directly mounted in read-only mode using disk utilities.
You can put all this data into Amazon Glacier, which is far cheaper (around 10% of the cost).

Best option to take complete Backup of EC2 instance?

Currently I am taking manual backups of our EC2 instance by zipping the data and downloading it locally as well as to Dropbox.
But I am wondering: is there an option to automatically take a complete copy of the whole system daily, so that if something goes wrong or crashes I can replace it with the previous copy immediately rather than spending hours installing and configuring things?
I can see there is an option to take an "Image", but can I automate this so that I keep just the one latest image and can replace the system with a single click?
You can create a single image of your instance as a backup of your instance configuration.
And
To keep a backup of your data, you can use snapshots of your volumes.
Snapshots store data incrementally: each new snapshot saves only the blocks that have changed since the previous one.
Whenever needed, you can create a volume from the snapshot and attach it to your instance.
It is not a good idea to do an "external backup" of EC2 instance snapshots before you have read the AWS pricing details.
First, AWS charges for every GB of data you transfer out of the AWS cloud. Check out this pricing. Generally speaking, after the first GB, the rest will be charged at $0.09/GB or more, against S3 Standard pricing of ~$0.023/GB.
Second, the snapshots you create are actually charged at S3 pricing (check: Copying an Amazon EBS Snapshot), not EBS pricing. Once you account for the transfer cost, you should perhaps consider creating multiple snapshots rather than repeatedly transferring backup data out.
HOWEVER, if you happen to use an instance with ephemeral storage, snapshots will not help. You need to copy the data out of ephemeral storage yourself. It is then your choice whether to store it in S3 or somewhere else.
Third, if you worry about an AWS region going down, check the multiple-AZ option, or check out the option of using an alternate AWS region.
Fourth, when storing backup data in S3, you can always store it under Infrequent Access, which saves you some bucks, and you won't have to face an insane Glacier bill during an emergency restore (avoid Glacier unless you are pretty sure about your own requirements).
Fifth, once you have planned to do everything inside AWS, you can write a bash script (AWS CLI) or use boto3 or a similar API to automate the backup.
Lastly, here is how AWS creates and maintains snapshots. Though each snapshot is deemed "incremental", when you delete an old snapshot:
the snapshot deletion process is designed so that you need to retain
only the most recent snapshot in order to restore the volume.
You can always "test" a restore by creating another EC2 instance that loads the backup snapshot. Or you can mount a volume created from the snapshot on another EC2 instance to check the contents.
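The automation suggested above could look roughly like the following with boto3. This is a sketch under assumptions, not a complete backup tool: `ec2` is assumed to be a boto3 EC2 client, the function name and the `keep` parameter are hypothetical, and scheduling (for example a daily cron job) is left to the reader.

```python
def backup_volume(ec2, volume_id, keep=1):
    """Take a new snapshot of `volume_id` and prune older completed ones.

    Keeps the `keep` most recent completed snapshots in addition to the one
    just started. Returns (new_snapshot_id, list_of_deleted_snapshot_ids).
    """
    new = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"automated backup of {volume_id}",
    )
    new_id = new["SnapshotId"]

    existing = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )["Snapshots"]

    # Only consider finished snapshots, and never the one we just started.
    completed = [
        s for s in existing
        if s["State"] == "completed" and s["SnapshotId"] != new_id
    ]
    completed.sort(key=lambda s: s["StartTime"], reverse=True)  # newest first

    deleted = []
    for old in completed[keep:]:
        ec2.delete_snapshot(SnapshotId=old["SnapshotId"])
        deleted.append(old["SnapshotId"])
    return new_id, deleted


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   backup_volume(boto3.client("ec2"), "vol-0123456789abcdef0", keep=1)
```

Because of the deletion behaviour quoted above, pruning old snapshots this way does not break the ability to restore from the ones that remain.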

How stable is EBS?

I'm thinking about saving data from EC2 instances to EBS and later saving the result to S3. I don't have a lot of experience working with EBS, so my questions are:
How stable are they? I mean, how often (if ever) have you had problems with EBS? Do they crash if overloaded or something like that?
What are the chances of losing data from EBS?
Is it possible to mount one EBS volume on multiple instances? (Let's say two EC2 instances share the same EBS volume.)
I assume you've read AWS's take on EBS
Pretty stable. Last year, 10% of EBS volumes failed in 2-3 data centers in us-east for a couple hours. This is the only issue I've ever had with them.
I've never lost data from EBS. Even if I had, I take hourly snapshots (stored in s3), so I would have been just fine.
Not at the same time. To attach it to another instance, you must detach from the currently attached one.
Perhaps what you're looking for is s3fs, a way to mount S3 as a filesystem.
EBS is quite stable, and all data you write is redundantly copied to 3 disks inside an AZ. If you take regular snapshots of your EBS volumes you can protect your data further. Since EBS operates at AZ scope, it is recommended to move assets like user documents, images and videos to Amazon S3. S3 offers more redundancy and availability than an EBS volume.
You cannot mount a single EBS volume on multiple EC2 instances. You will have to use a solution like GlusterFS on AWS so that multiple EC2 instances can talk to a common storage pool.