Does taking a snapshot of an EBS volume increase reliability? - amazon-web-services

The EBS documentation states:
As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% – 0.5%, where failure refers to a complete loss of the volume.
..but this doesn't give any indication of the AFR for a volume with, for example:
No snapshot at all; or
A fresh snapshot with no modified data.
I've seen it suggested that missing or damaged blocks can be automatically/silently recovered from snapshots but I can't see any reference to this in the documentation. Is this true?
Can I assume that if I have a volume with no changed data and a fresh snapshot, my AFR for the volume matches S3's reliability?

I took a three day class from AWS last year, and they told us unequivocally that taking snapshots greatly increases the reliability of an EBS volume. They did not explain why that was so, but hinted that EBS volumes store changes from the latest snapshot and that the snapshot itself is very stable (stored in S3). Successive snapshots apparently use little storage, as AWS is smart enough to store diffs.
They did not give any hard numbers on failure rates, though. They suggested configuring multiple EBS volumes using RAID if reliability of the volume is essential. However, they also recommended architecting your application so that it can tolerate failure of any instance, making it less important for each EBS volume to be durable.

Snapshots taken of EBS Volumes are stored in S3. These snapshots get all the durability and availability benefits of S3. You can also copy snapshots to other regions, which is a nice insurance policy against a regional level outage.
If your EBS volume fails, you can then recover from your last snapshot. The more recent your snapshot, the more up-to-date your recovery story is. With the incremental nature of EBS snapshots performing them on a frequent basis is very practical.
EBS also provides "recovery volumes", which you can see from this AWS forum thread.
To my knowledge, the act of taking a snapshot doesn't directly impact the AFR of an active, running EBS volume. Rather, it just makes it easier for you to recover in the event of a failure.

Related

AMI EC2 EBS Backup- cost forecasting

Actually I have to take forecast of costing for one my instance, which is having a number of volumes attached... These volumes are different in size and types.
Let's suppose I took the AMI backup and terminated the server.
Now my confusion is how would I calculate the cost. The cost will be calculated based on pricing of Amazon EBS Volumes or Amazon EBS Snapshot. Because the cost difference is just double.
Let me know if you can help me understanding.
pricing of Amazon EBS Volumes or Amazon EBS Snapshot Which I took from AWS Pricing :
https://aws.amazon.com/ebs/pricing/
Amazon EBS snapshots are a complex subject due to the way they work.
There is a detailed explanation in: Amazon EBS snapshots - Amazon Elastic Compute Cloud
A quick summary is:
Snapshots contain only the data that is different to previous snapshots (they are incremental)
An AMI is actually a snapshot. So, if you booted a new Amazon EC2 instance from an AMI and then created a snapshot, the snapshot would contain very little since most of the volume was already contained in the previous snapshot (that was part of the AMI). Confused yet?
Any snapshot can be deleted and information will still be retained to allow any other snapshot to be restored. So, the snapshot is actually an 'index' to the snapshot data, and the snapshot data is stored separately to the snapshot itself. You should be questioning your sanity at this point!
So, the cost of Amazon EBS snapshots is mostly based on how much the contents of the volume changes, and how many snapshots (effectively, points-in-time) you wish to keep. If you only keep the most recent snapshot, then all data will be available, but the cost will be minimised because it won't keep any data that has been deleted from the volume.
Bottom line: Snapshots take less space than the data on a volume due to the incremental natures. The more snapshots ("points-in-time"), the more data will be kept and hence the more cost.

Can AWS not compress snapshots (for EBS containing binary data)?

I'm creating periodic snapshots of my EBS volume using a Scheduled Cron expression rule (thanks, John C).
My data is all binary, and I suspect that the automatic compression AWS performs on my data - will actually enlarge the resulting snapshots.
Is there a way to instruct AWS to not employ compression when creating snapshots (so I could compare the snapshot's size with/without compression)?
Note:
Creating an Amazon EBS Snapshot seems to indicate that using compression is mandatory.
You have no control over the compression used for EBS snapshots.
EBS snapshots are incremental (except for the first snapshot). That data is compressed based on AWS's own heuristics. You have no visibility into the actual compressed data's size.
When you're looking at an EBS snapshot, the snapshot's "size" will always be reported as the originating EBS volume's size, regardless of the actual size of the snapshot.
I don't think EBS snapshots are now compressed (I am not sure if they were earlier) and I could not find any reference to compression in AWS documentation as well. That is why the size of initial snapshot is same as the size of the volume. And after first snapshot, other snapshots are incremental so only the blocks on the device that have changed or added after last snapshot are saved in the new snapshot.
You can refer the blog on how the ebs snapshots backup & restore work.

EBS Snapshots, who manages backups?

I'm starting with AWS and I've been taking my first steps on EC2 and EBS. I've learnt about EBS snapshots but I still don't understand if the backups, once you've created a snapshot, are managed automatically by AWS or I need to do them on my own.
AWS just introduced a new feature called Lifecycle Manager (in the EC2 Dashboard, at the bottom left) that allows you to create automated backups for your volumes. Once you configure a policy, AWS will handle the backup process for your volumes.
This is only a couple of weeks old so just wanted to mention here.
Snapshots are managed by AWS
snapshot of an EBS volume, can be used as a baseline for new volumes or for data backup. If you make periodic snapshots of a volume, the snapshots are incremental—only the blocks on the device that have changed after your last snapshot are saved in the new snapshot. Even though snapshots are saved incrementally,
the built in durability of EBS is comparable to a RAID in the physical sense. The data itself is mirrored (think more like a RAID stripe though) in the availability zone where the volume exists. Amazon states that the failure rate is somewhere around 0.1-0.5% annually. This is more reliable than most physical RAID setups

Automating EBS snapshot from it's instance itself [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 8 years ago.
Improve this question
Is it good idea to create periodic snapshot of the EBS volume from same instance it is attached to? Is there any downtime during snapshot process? I basically wanted to keep a regular may be daily or weekly snapshot of the ec2 instance so that If there is any virus or hacking or security issue I could spin another instance from the snapshots.
Absolutely yes. It's a good practice (personally, I consider it a must) to create point-in-time snapshots and to use them to create new volumes or restore old volumes. There is no downtime during the snapshot process. For a more detailed explanation you may take a look here, with particular emphasis on this part:
You can take a snapshot of an attached volume that is in use. However,
snapshots only capture data that has been written to your Amazon EBS
volume at the time the snapshot command is issued. This may exclude
any data that has been cached by any applications or the operating
system. If you can pause any file writes to the volume long enough to
take a snapshot, your snapshot should be complete. However, if you
can't pause all file writes to the volume, you should unmount the
volume from within the instance, issue the snapshot command, and then
remount the volume to ensure a consistent and complete snapshot. You
may remount and use your volume while the snapshot status is pending.
Before doing operation involving data I think it's very important to know everything about a technology that you are going to use. So, I would like take this opportunity to put the focus on some points, taken from the official AWS EBS documentation, that are very important:
Amazon EBS volumes are designed to be highly available and reliable.
At no additional charge to you, Amazon EBS volume data is replicated
across multiple servers in an Availability Zone to prevent the loss of
data from the failure of any single component.
If you wish to achieve greater durability, you can use the Amazon EBS
Snapshot capability. Snapshots are stored in Amazon S3 and are also
replicated automatically among multiple Availability Zones. You can
take frequent snapshots of your volume for a convenient and
cost-effective way to increase the long-term durability of your data.
In the unlikely event that your Amazon EBS volume does fail, all
snapshots of that volume remain intact and you can re-create your
volume from the last snapshot.
Here, some notes about the durability of EBS volumes:
The durability of your volume depends both on the size of your volume
and the percentage of the data that has changed since your last
snapshot. As an example, volumes that operate with 20 GB or less of
modified data since their most recent Amazon EBS Snapshot can expect
an annual failure rate (AFR) of between 0.1% – 0.5%, where failure
refers to a complete loss of the volume. This compares with commodity
hard disks that typically fail with an AFR of around 4%, making EBS
volumes 10 times more reliable than typical commodity disk drives.
Important details about the price:
Amazon EBS Snapshots are stored incrementally: only the blocks that
have changed after your last snapshot are saved, and you are billed
only for the changed blocks. If you have a device with 100 GB of data
but only 5 GB has changed after your last snapshot, a subsequent
snapshot consumes only 5 additional GB and you are billed only for the
additional 5 GB of snapshot storage, even though both the earlier and
later snapshots appear complete.
Here is why you may stay secure when you delete one of your snapshots:
When you delete a snapshot, you remove only the data not needed by any
other snapshot. All active snapshots contain all the information
needed to restore the volume to the instant at which that snapshot was
taken. The time to restore changed data to the working volume is the
same for all snapshots.
Another important advantage of snapshots:
Snapshots can be used to instantiate multiple new volumes, expand the
size of a volume, or move volumes across Availability Zones. When a
new volume is created, you may choose to create it based on an
existing Amazon EBS snapshot. In that scenario, the new volume begins
as an exact replica of the snapshot.
Ok, I think that these are some of the most important things to know when using amazon EBS. For further details take a look here. Pay particular attention on the "Amazon EBS Snapshots" section.

Cost of storing AMI

I understand Amazon will charge per GB provisioned EBS storage. If I create AMI of my instance, does this mean my EBS volume will be duplicated, and hence incur additional cost?
Is there other cost charge in creating and storing an AMI (Amazon Machine Image)?
You are only charged for the storage of the bits that make up your AMI, there are no charges for creating an AMI.
EBS-backed AMIs are made up of snapshots of the EBS volumes that form the AMI. You will pay storage fees for those snapshots according to the rates listed here. Your EBS volumes are not "duplicated" until the instance is launched, at which point a volume is created from the stored snapshots and you'll pay regular EBS volume fees and EBS snapshot billing.
S3-backed AMIs have their information stored in S3 and you will pay storage fees for the data being stored in S3 according to the S3 pricing, whether the instance is running or not.
In this case, you will pay for the size of the storage used, instead of the storage provisioned. Snapshots will not store any empty blocks.
In short, yes, you will incur additional charges, but at a less rate, namely, EBS snapshot storage rate. Provisioned EBS is the 'live' HD that will be charged at $0.10 per GB per month if using standard SSD (gp2, USA east pricing for 2022 used throughout). And if you provisioned 50 GB, you will be fully charged for that 50 GB, even if you are only using 5% of it. The charges will incur even if you forget to attach to an EC2 instance. $5 per month in this case.
When you create an AMI, AWS will create a snapshot in the background. This snapshot is viewable under EBS Snapshots and will not be deletable as long as that AMI is in existence. You will get an error if you try to delete this snapshot. Snapshots cost less than 'provisioned' EBS at $0.05 per GB per month, and since snapshots ignore empty blocks, it will be shrunk to used size, so if you are only using 5% of 50GB, the snapshot should only be around 2.5 GB. $0.13 per month in this case. No other charges.
If you are creating a lot of these, it can get expensive very quickly, so some people save these AMIs into S3, which is cheaper than EBS snapshots. This is somewhat advanced and as far as I know, it can only be done via AWS CLI, and not in the console. You use a command called aws ec2 create-store-image-task and you have to specify the destination bucket name, and make sure permissions for S3, EBS and EC2 will all allow it. More detail at the official AWS documentation. This would reduce the cost to about $0.023 per GB per month. There are other changes relating to this method, i.e. EBS Direct API, but it is not much and you can look it up in the documentations.
Recently in November 2021, AWS released archived function for EBS snapshots, which allows you to archive your snapshots for a minimum of 90 days for $0.0125. You do have to pay $0.03 per GB for restoring the data. However, this is designed for EBS backups (e.g. daily backups using snapshots) and you cannot archive an EBS snapshot that is associated with an AMI. You will get an error: Failed to archive snapshot... snap-xyz is in use by ami-123.
Below is an excerpt of an actual AWS bill that will explain it in a visual sense.