AWS snapshot frequency and its effect on cost - amazon-web-services

Does the frequency of AWS snapshots have any effect on price, for example because of network consumption or some other parameter? Say, a snapshot every 30 minutes versus a single snapshot at the end of the day.

There isn't any cost associated with the creation of a snapshot, such as for network bandwidth.
The cost is in storing the snapshots, so the cost is related to how many you keep, not how many you make... as well as how different they all are from each other (and, of course, volume size, to some extent). If you were to snapshot a volume every few minutes and nothing on that volume were changing, then the incremental cost for each additional snapshot being stored would approach $0, because EBS snapshots are automatically deduplicated.
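To make the deduplication point concrete, here is a toy sketch. It is not how EBS is implemented internally; the snapshot representation (a dict of block number to content) and the function name `stored_blocks` are assumptions made only for illustration.

```python
def stored_blocks(snapshots):
    """Return the number of distinct blocks kept across all snapshots.

    Each snapshot is modeled as a dict mapping block number -> content.
    An identical (block number, content) pair is stored only once.
    """
    unique = set()
    for snap in snapshots:
        unique.update(snap.items())
    return len(unique)


# A volume with 4 blocks that never changes, snapshotted 10 times.
unchanged = [{0: "a", 1: "b", 2: "c", 3: "d"} for _ in range(10)]

# The same volume where one block changes before the last snapshot.
changed = unchanged[:9] + [{0: "a", 1: "b", 2: "c", 3: "z"}]

print(stored_blocks(unchanged))  # 4
print(stored_blocks(changed))    # 5
```

Ten snapshots of an unchanged volume store no more blocks than one snapshot does, and a single changed block adds a single stored block.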

For snapshots, pricing is calculated based on the total size of your initial snapshot plus the size of each incremental change.
For example, if you have a 100 GB volume, the initial price applies to a 100 GB snapshot. If the 2nd snapshot is incremental and the data has grown to 101 GB (only 1 GB added), you will be charged for 100 + 1 GB. Likewise, you will be charged for the cumulative size.
However, if you need your snapshots cross-region, there will be data transfer charges as well.
More Info: https://aws.amazon.com/ebs/pricing/
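The arithmetic in the example above can be sketched as follows. The rate of $0.05 per GB-month is an assumption for illustration (check the pricing page for your region), and `monthly_snapshot_cost` is a hypothetical helper name.

```python
PRICE_PER_GB_MONTH = 0.05  # assumed rate; check the pricing page for your region


def monthly_snapshot_cost(initial_gb, incremental_gbs, price=PRICE_PER_GB_MONTH):
    """Cost of keeping an initial snapshot plus later incremental changes for one month."""
    total_gb = initial_gb + sum(incremental_gbs)
    return total_gb * price


cost = monthly_snapshot_cost(100, [1])
print(f"{cost:.2f}")  # 5.05
```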

I am adding this answer in case it helps someone: neither the frequency nor keeping or deleting snapshots will necessarily affect the cost. To support this, I am quoting these lines from the AWS user guide:
Deleting a snapshot might not reduce your organization's data storage
costs. Other snapshots might reference that snapshot's data, and
referenced data is always preserved. If you delete a snapshot
containing data being used by a later snapshot, costs associated with
the referenced data are allocated to the later snapshot.
Reference: Deleting an Amazon EBS Snapshot

Yes, you're paying for the snapshot storage. Per EBS Pricing:
$0.05 per GB-month of data stored
However:
Snapshot storage is based on the amount of space your data consumes in Amazon S3. Because Amazon EBS does not save empty blocks, it is likely that the snapshot size will be considerably less than your volume size. For the first snapshot of a volume, Amazon EBS saves a full copy of your data to Amazon S3. For each incremental snapshot, only the changed part of your Amazon EBS volume is saved.
So while you will pay more if you take snapshots frequently, it's hard to determine how much more. You may want to consider a different backup solution, as EBS is not the best one.

Related

Merge AWS EBS snapshots

I am exploring an AWS EBS snapshot policy to minimize data loss if the server fails. I am thinking of an hourly snapshot policy with 7 days of retention. It would serve the purpose of minimizing data loss, but it would flood the AWS snapshot console, which may lead to mistakes in the future. To prevent this, I am exploring a way to merge the hourly backups together daily.
Scenario
Hourly snapshot policy with 7 days retention means 24 snapshots daily till the end of the week = 168 snapshots for a server and 1 merged snapshot will be created at the end of the week.
What I am exploring
Hourly snapshot policy with 7 days retention and 1-day merging means it will create the snapshots hourly till the end of the day and then merge them to 1 single snapshot so I will have one snapshot for the day rather than 24.
I explored the AWS documentation, but it didn't help. Any help would be really appreciated.
If you delete any of the snapshots in between you will find that AWS will automatically perform this merge functionality to ensure there is no missing data in between snapshots.
Deleting a snapshot might not reduce your organization's data storage costs. Other snapshots might reference that snapshot's data, and referenced data is always preserved. If you delete a snapshot containing data being used by a later snapshot, costs associated with the referenced data are allocated to the later snapshot.
If you delete any snapshots (including the first) the data will be merged with the next snapshot that was taken.
Therefore you can relax and adjust the policies as required, without the risk of data loss.
More details are available in the how incremental snapshots work documentation.
I like to think of an Amazon EBS Snapshot as consisting of two items:
Individual backups of each 'block' on the disk
An 'index' of all the blocks on the disk and where their backup is stored
When an EBS Snapshot is created, a back-up is made of any blocks that are not already backed-up. An index is also made that lists all the blocks in that "backup".
For example, let's say that an EBS Volume has Snapshot #1 and then one block is modified on the disk. If another Snapshot (#2) is created, only one block will be backed-up, but the Snapshot index will point to all the blocks in the backup.
If the Snapshot #1 is then deleted, all the blocks will be retained for Snapshot #2 automatically. Thus, there is no need to "merge" snapshots -- this is all done automatically.
Bottom line: You can delete any snapshots you want. The blocks required to restore all remaining Snapshots will be retained.
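The "blocks plus index" mental model above can be sketched as a small toy class. This is only an illustration of that model, not the actual EBS implementation; the class `SnapshotStore` and its methods are hypothetical names.

```python
class SnapshotStore:
    """Toy model: a shared block store plus one index per snapshot."""

    def __init__(self):
        self.blocks = {}   # block key -> data (each block stored once)
        self.indexes = {}  # snapshot name -> {block number: block key}

    def create(self, name, volume):
        index = {}
        for number, data in volume.items():
            key = (number, data)
            self.blocks.setdefault(key, data)  # back up only if not already stored
            index[number] = key
        self.indexes[name] = index

    def delete(self, name):
        del self.indexes[name]
        # Keep only blocks that some remaining snapshot still references.
        in_use = {key for index in self.indexes.values() for key in index.values()}
        self.blocks = {k: v for k, v in self.blocks.items() if k in in_use}

    def restore(self, name):
        return {number: self.blocks[key] for number, key in self.indexes[name].items()}


store = SnapshotStore()
volume = {0: "aaa", 1: "bbb", 2: "ccc"}
store.create("snap1", volume)

volume[1] = "BBB"               # one block is modified
store.create("snap2", volume)   # only that one block is newly stored
print(len(store.blocks))        # 4

store.delete("snap1")           # the old copy of block 1 is no longer needed
print(len(store.blocks))        # 3
print(store.restore("snap2") == volume)  # True
```

Deleting the first snapshot releases only the block that no remaining snapshot points to, and the second snapshot still restores the full volume.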

Can I have less than 2.5TB of disk for a BigTable node?

In the GCP user interface I can estimate the pricing for whatever disk size I wish to use, but when I want to create my BigTable instance I can only choose the number of nodes and each node comes with 2.5TB of SSD or HDD disk.
Is there a way to, for example, set up a BigTable cluster with 1 node and 1TB of SSD instead of the default 2.5TB?
Even in the GCP pricing calculator I can change the disk size, but I can't find where to configure it when creating the cluster (https://cloud.google.com/products/calculator#id=2acfedfc-4f5a-4a9a-a5d7-0470d7fa3973)
Thanks
If you only want a 1TB database, then only write 1TB and you'll be charged accordingly.
From the Bigtable pricing documentation:
Cloud Bigtable frequently measures the average amount of data in your
Cloud Bigtable tables during a short time interval. For billing
purposes, these measurements are combined into an average over a
one-month period, and this average is multiplied by the monthly rate.
You are billed only for the storage you use, including overhead for
indexing and Cloud Bigtable's internal representation on disk. For
instances that contain multiple clusters, Cloud Bigtable keeps a
separate copy of your data with every cluster, and you are charged for
every copy of your data.
When you delete data from Cloud Bigtable, the data becomes
inaccessible immediately; however, you are charged for storage of the
data until Cloud Bigtable compacts the table. This process typically
takes up to a week.
In addition, if you store multiple versions of a value in a table
cell, or if you have set an expiration time for one of your table's
column families, you can read the obsolete and expired values until
Cloud Bigtable completes garbage collection for the table. You are
also charged for the obsolete and expired values prior to garbage
collection. This process typically takes up to a week.
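As a rough illustration of the billing rule quoted above (average the measurements over the month, then multiply by the monthly rate), here is a minimal sketch. The function name, the sample values, and the rate are all hypothetical; take the real rate from the Bigtable pricing page.

```python
def bigtable_storage_charge(samples_gb, rate_per_gb_month):
    """Average the sampled storage sizes over the month and apply the monthly rate."""
    average_gb = sum(samples_gb) / len(samples_gb)
    return average_gb * rate_per_gb_month


# Hypothetical samples: 1 TB (1024 GB) stored for the whole month.
samples = [1024] * 30
HYPOTHETICAL_SSD_RATE = 0.17  # assumed USD per GB-month, for illustration only

print(round(bigtable_storage_charge(samples, HYPOTHETICAL_SSD_RATE), 2))  # 174.08
```

Under this model the charge follows the data actually stored, not the storage capacity that comes with each node.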

How to back up using snapshots

I chose snapshots as a way to back up the VM (Google Compute Engine).
I know that snapshots are incremental and automatically compressed.
So I will take a snapshot every day at the appointed time.
And I want to delete the snapshots that are older than 60 days.
Question
Will 60-day snapshots (full snapshots with all data) be combined with 59-day snapshots (incremental snapshots)?
Yes. The consistency of all snapshots will be maintained when you delete any snapshot including the oldest one.
Technically, nothing is combined; each snapshot is just a list of pointers to stored data blocks. When you delete the oldest snapshot, any data in that snapshot that has been overwritten in the next newer snapshot will be released (deleted). The list of blocks in the 60th snapshot will be merged into the 59th snapshot. The 59th snapshot now represents the entire disk volume.
Each snapshot will be incremental. You can have a better understanding of the procedure if you check the documentation.
Basically this is how it works.
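A toy sketch of the deletion behaviour described in this answer, assuming each incremental snapshot is modeled as a dict holding only the blocks that changed since the previous one. The function names `delete_oldest` and `restore` are hypothetical, and this is not the actual Compute Engine implementation.

```python
def delete_oldest(chain):
    """Delete the oldest snapshot in a chain of incremental snapshots.

    `chain` is a list of dicts ordered oldest to newest. Each dict maps a
    block number to the data that snapshot stores for it. Blocks of the
    deleted snapshot that the next snapshot did not overwrite move into
    that next snapshot; overwritten blocks are released.
    """
    oldest, nxt, *rest = chain
    merged = {**oldest, **nxt}  # the newer snapshot's blocks win
    return [merged, *rest]


def restore(chain):
    """Rebuild the newest disk state by applying snapshots oldest to newest."""
    disk = {}
    for snap in chain:
        disk.update(snap)
    return disk


full = {0: "a", 1: "b", 2: "c"}   # first (full) snapshot
inc1 = {1: "B"}                   # next day: block 1 changed
inc2 = {2: "C"}                   # day after: block 2 changed
chain = [full, inc1, inc2]

before = restore(chain)
chain = delete_oldest(chain)
print(chain[0])                   # {0: 'a', 1: 'B', 2: 'c'}
print(restore(chain) == before)   # True
```

After the oldest snapshot is deleted, the next one holds every block needed to represent the whole disk, and the newest state restores exactly as before.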

Calculating AWS snapshot usage cost programmatically

I am planning to calculate snapshot usage cost using a script.
As per the documentation if we have GB-month value we can calculate the cost based on this. Is there any way to calculate snapshot size and its age? I could not find any method to fetch the snapshot size. When I describe a snapshot I do get volume-size in snapshotInfo but I don't think that's the snapshot size. Also the age of a snapshot is not defined in the description. Only the timestamp when the snapshot was initiated is in the output.
I don't want the cost for all the snapshots. I will be filtering snapshots based on a custom tag. I saw https://aws.amazon.com/blogs/aws/new-cost-allocation-for-ebs-snapshots/ but this is via the UI and needs special permissions.
The cost and usage report is the only way to capture this information. It is not accessible through the service API.
EBS snapshots are -- logically -- the same size as the source volume, because every EBS snapshot contains a reference to a stored representation of every single block on the volume.
But it's only a reference -- a pointer -- because EBS doesn't store the actual data blocks inside the snapshot itself. It maintains a mapping and has the ability to determine which blocks are unchanged from snapshot to snapshot, so that it doesn't redundantly store them.
The price you pay for a given snapshot is directly determined by how many blocks in that snapshot are different from those in the most recent, prior snapshot of the same volume that still exists. Deleting older snapshots preserves any blocks that are still needed for restoring newer snapshots, and thus rolls the cost of those blocks forward into snapshots that still exist, with the cost shifting into the oldest snapshot that still needs the blocks after any older ones are deleted.
So the cost of a given snapshot changes as previous snapshots of the same volume are deleted.
Also:
Only the timestamp when the snapshot was initiated is in the output.
That's the age. Snapshots are snapshots -- an image of the disk at the moment in time the snapshot was initiated. Regardless of how long the snapshot takes to run, the data it captures is the data as it existed on the volume when the snapshot was initiated.
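A minimal sketch of turning the start timestamp into an age and a rough GB-month figure. The billed size is not available from the service API (as noted above), so `billed_gb` is a hypothetical input that would have to come from the cost and usage report; the rate of 0.05 per GB-month and the 30-day month are also assumptions.

```python
from datetime import datetime, timezone


def snapshot_age_days(start_time_iso, now=None):
    """Age in days, measured from the time the snapshot was initiated."""
    started = datetime.fromisoformat(start_time_iso)
    now = now or datetime.now(timezone.utc)
    return (now - started).total_seconds() / 86400


def gb_month_cost(billed_gb, age_days, price_per_gb_month=0.05, days_in_month=30):
    """Approximate storage cost so far for `billed_gb` kept for `age_days`."""
    return billed_gb * (age_days / days_in_month) * price_per_gb_month


now = datetime(2024, 3, 31, tzinfo=timezone.utc)
age = snapshot_age_days("2024-03-01T00:00:00+00:00", now=now)
print(age)                                 # 30.0
print(round(gb_month_cost(20, age), 2))    # 1.0
```

Because the cost of a snapshot shifts as older snapshots are deleted, a figure computed this way is only valid for the billed size at the moment it was read.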

AWS - How is EBS charged?

How is EBS charged? Is it per stored data and bandwidth, per allocated storage and bandwidth, per stored data and IO operations, or per allocated storage and IO operations?
It's stated pretty clearly in the documentation:
https://aws.amazon.com/ebs/pricing/
Generally it's based on the disk type, provisioned size and provisioned IOPS (where applicable), multiplied by the amount of time until you delete the resource.
The price also varies a bit from region to region. Whether or not you use the provisioned resource has no effect on your billed amount.
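The rule described above (provisioned size and provisioned IOPS, multiplied by time) can be sketched roughly as follows. The function name, the 720-hour month, and all rates are hypothetical values for illustration; real rates depend on volume type and region.

```python
def ebs_volume_cost(size_gb, hours, gb_month_rate, iops=0, iops_month_rate=0.0,
                    hours_in_month=720):
    """Charge for provisioned size (and provisioned IOPS, if any) over `hours`."""
    fraction_of_month = hours / hours_in_month
    storage = size_gb * gb_month_rate * fraction_of_month
    provisioned_iops = iops * iops_month_rate * fraction_of_month
    return storage + provisioned_iops


# Hypothetical rates, for illustration only; real rates vary by type and region.
print(round(ebs_volume_cost(100, 360, gb_month_rate=0.10), 2))  # 5.0
print(round(ebs_volume_cost(100, 720, gb_month_rate=0.125,
                            iops=1000, iops_month_rate=0.065), 2))  # 77.5
```

Note that nothing in the calculation depends on how much data is actually written to the volume, only on what was provisioned and for how long.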
Also
If you need help setting up AWS resources, ask those questions on Server Fault.