Amazon Web Services Geo Redundancy

I read that Azure has geo-redundant storage, where data has three copies created synchronously in the region and three copies created asynchronously in another geographic region for disaster recovery. I searched the web for AWS EBS but could not find any information about async geo-redundancy for EBS. Do they use another term for it, or does AWS simply not have geo-redundant block storage?

No public cloud provider that I'm aware of has geo-redundant block storage. (Google Cloud has zone-redundant persistent disks though.) You probably saw geo-redundant blob/object storage.
AWS has S3 cross-region replication, but not a geo-redundant S3 storage class.
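For reference, a minimal sketch of the shape of an S3 Cross-Region Replication configuration, as you would pass it to boto3's `put_bucket_replication`. The bucket names, IAM role ARN, and storage class below are placeholders, and both buckets must have versioning enabled before replication can be configured:

```python
# Sketch of the ReplicationConfiguration payload accepted by boto3's
# s3.put_bucket_replication(). All ARNs and bucket names are placeholders.
replication_config = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",  # placeholder
    "Rules": [
        {
            "ID": "replicate-everything",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},  # empty filter = replicate all objects
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                # Destination bucket in another region, e.g. us-west-2
                "Bucket": "arn:aws:s3:::my-dr-bucket",
                "StorageClass": "STANDARD_IA",  # optional: cheaper class for replicas
            },
        }
    ],
}

# With real credentials you would apply it like:
# boto3.client("s3").put_bucket_replication(
#     Bucket="my-source-bucket",
#     ReplicationConfiguration=replication_config)
```

Note that replication is asynchronous: objects typically replicate within minutes, which is why it isn't the same thing as a synchronously geo-redundant storage class.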

Related

Is there a cost for GCP Cloud Functions to access GCP storage buckets?

I'm trying to figure out some GCP storage costs.
I have several applications running in GCP Cloud Functions, some of which store data in, while others pull data from, Cloud Storage buckets in the same region.
I want to know if there are any costs associated with Cloud Functions pulling or accessing data stored in Cloud Storage buckets. Would that classify as egress and be charged at $0.08/GB?
Additionally, if I end up configuring Private Google Access for Cloud Storage, will there again be any cost associated with my Cloud Functions pulling data from Cloud Storage if I'm traversing Google's own backbone infrastructure?
As per FerreginaPelona's comment, the pricing page clarifies the difference between retrieval and operation fees versus network egress. If data is accessed in the same region, there are no network egress costs.

Google Cloud disk snapshots default region?

In which regions are Google Cloud disk snapshots located by default? Is their region the same as the disk's region?
Whenever you take a snapshot in Google Cloud, the snapshot data is incremental and is kept in Google Cloud object storage; the details of which object storage, and where and how it is stored, are not revealed to users.
Because Google needs to meet SLAs for those snapshots, they would not want to disclose those details: to meet the SLAs they may move and manage snapshots internally in ways the user is oblivious to.
And that's true of other clouds too!
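The incremental behaviour mentioned above can be illustrated with a toy model (all names here are hypothetical, not a GCP API): after the first full snapshot, only blocks that changed since the previous snapshot need to be stored.

```python
def incremental_snapshot(disk_blocks, previous_blocks):
    """Toy model of incremental snapshots: only blocks that changed
    since the previous snapshot are written to object storage."""
    if previous_blocks is None:          # first snapshot: full copy
        return dict(disk_blocks)
    return {idx: data for idx, data in disk_blocks.items()
            if previous_blocks.get(idx) != data}

disk_v1 = {0: "aaa", 1: "bbb", 2: "ccc"}
snap1 = incremental_snapshot(disk_v1, None)     # full: stores 3 blocks

disk_v2 = {0: "aaa", 1: "BBB", 2: "ccc"}        # one block changed
snap2 = incremental_snapshot(disk_v2, disk_v1)  # incremental: stores 1 block
print(len(snap1), len(snap2))  # 3 1
```

For what it is worth, `gcloud compute disks snapshot` does accept a `--storage-location` flag if you want to pin the snapshot's storage to a particular region or multi-region rather than the default.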

AWS Disaster recovery together with backup and storage

I have a hybrid AWS setup: an on-prem Hadoop cluster with replication enabled towards an AWS environment running a similar Hadoop cluster at low capacity for disaster recovery. This is an active-active disaster recovery setup in AWS. Is it still recommended to take backups of the data that is stored on AWS?
Is it still recommended to take backups for data that is stored on AWS?
It's not clear which AWS services you're referring to.
Well, let's say you have an S3 bucket bound only to us-east-1 and that region becomes unavailable: you can't access your data. Therefore, it's encouraged to replicate to another region. However, S3 advertises several nines of availability, and if an AWS service is down in a major region, a good portion of the internet is probably inaccessible, not only your data.
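The replicate-to-another-region advice boils down to a read-with-failover pattern. A minimal sketch, where the per-region fetchers are stubs standing in for real per-region S3 clients:

```python
def read_with_failover(key, fetchers):
    """Try each region's fetcher in order, falling back on failure.
    `fetchers` is an ordered list of callables, primary region first."""
    last_err = None
    for fetch in fetchers:
        try:
            return fetch(key)
        except ConnectionError as err:
            last_err = err
    raise last_err

def primary(key):    # stand-in for a us-east-1 client: region is down
    raise ConnectionError("us-east-1 unavailable")

def replica(key):    # stand-in for a us-west-2 client holding the replica
    return f"data-for-{key}"

print(read_with_failover("report.csv", [primary, replica]))
# data-for-report.csv
```

Remember that cross-region replication protects you against region unavailability, but not against accidental deletion or corruption that gets replicated too; that's what versioning and backups are for.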

AWS Storage Volume Gateway - Cached volumes

The AWS documentation clearly says of Gateway stored volumes: "This data is asynchronously backed up to S3 in the form of Amazon EBS snapshots."
But there is no mention of how Storage Gateway cached volumes data is replicated: sync or async snapshots?
The documentation reads
"Cached volumes let you use Amazon Simple Storage Service (Amazon S3) as your primary data storage while retaining frequently accessed data locally in your storage gateway."
"In the cached volumes solution, AWS Storage Gateway stores all your on-premises application data in a storage volume in Amazon S3. "
Can someone explain? Thanks.
In Storage Volume Gateway Cached mode, data is written to S3 and cached locally for frequently accessed files.
Cached volumes let you use Amazon Simple Storage Service (Amazon S3) as your primary data storage while retaining frequently accessed data locally in your storage gateway. Cached volumes minimize the need to scale your on-premises storage infrastructure, while still providing your applications with low-latency access to their frequently accessed data. You can create storage volumes up to 32 TiB in size and attach to them as iSCSI devices from your on-premises application servers. Your gateway stores data that you write to these volumes in Amazon S3 and retains recently read data in your on-premises storage gateway's cache and upload buffer storage.
Cached volumes can range from 1 GiB to 32 TiB in size and must be rounded to the nearest GiB. Each gateway configured for cached volumes can support up to 32 volumes for a total maximum storage volume of 1,024 TiB (1 PiB).
In the cached volumes solution, AWS Storage Gateway stores all your on-premises application data in a storage volume in Amazon S3.
Cached Volume Architecture
So, based on the documentation, it's asynchronous by nature:
As your applications write data to the storage volumes in AWS, the gateway initially stores the data on the on-premises disks referred to as cache storage before uploading the data to Amazon S3. The cache storage acts as the on-premises durable store for data that is waiting to upload to Amazon S3 from the upload buffer.
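The write path the quoted documentation describes can be modelled in a few lines. This toy class (all names are hypothetical, not a Storage Gateway API) just illustrates that writes are acknowledged from the local cache before the upload to S3 happens, i.e. the replication is asynchronous:

```python
class CachedVolumeGateway:
    """Toy model of a Storage Gateway cached volume: writes land in the
    local cache and upload buffer first, and reach S3 asynchronously
    (here, only when flush() is called)."""

    def __init__(self):
        self.cache = {}          # on-premises cache storage
        self.upload_buffer = []  # keys awaiting upload to S3
        self.s3 = {}             # stand-in for the volume stored in S3

    def write(self, key, data):
        self.cache[key] = data           # acknowledged to the app immediately
        self.upload_buffer.append(key)   # queued, not yet in S3

    def flush(self):                     # the asynchronous upload step
        for key in self.upload_buffer:
            self.s3[key] = self.cache[key]
        self.upload_buffer.clear()

gw = CachedVolumeGateway()
gw.write("block-7", b"payload")
print("block-7" in gw.s3)  # False: write acknowledged before upload
gw.flush()
print("block-7" in gw.s3)  # True: data reached S3 asynchronously
```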

Amazon S3 vs EC2 for Storing Files

Which is better for storing pictures and videos uploaded by users: Amazon S3, or the filesystem of an EC2 instance?
While opinion-based questions are discouraged on Stack Overflow, and answers always depend upon the particular situation, it is highly likely that Amazon S3 is your better choice.
You didn't say whether you only wish to store the data, or whether you also wish to serve the data out to users. I'll assume both.
Benefits of using Amazon S3 to store static assets such as pictures and videos:
S3 is pay-as-you-go (only pay for the storage consumed, with different options depending upon how often/fast you wish to retrieve the objects)
S3 is highly available: You don't need to run any servers
S3 is highly durable: Your data is duplicated across three data centres, so it is more resilient to failure
S3 is highly scalable: It can handle massive volumes of requests. If you served content from Amazon EC2, you'd have to scale-out to meet requests
S3 has in-built security at the object, bucket and user level.
Basically, Amazon S3 is a fully-managed storage service that can serve static assets out to the Internet.
If you were to store data on an Amazon EC2 instance, and serve the content from the EC2 instance:
You would need to pre-provision storage using Amazon EBS volumes (and you pay for the entire volume even if it isn't all used)
You would need to Snapshot the EBS volumes to improve durability (EBS Snapshots are stored in Amazon S3, replicated between data centres)
You would need to scale your EC2 instances (make them bigger, or add more) to handle the workload
You would need to replicate data between instances if you are running multiple EC2 instances to meet request volumes
You would need to install and configure the software on the EC2 instance(s) to manage security, content serving, monitoring, etc.
The only benefit of storing this static data directly on an Amazon EC2 instance rather than Amazon S3 is that it is immediately accessible to software running on the instance. This makes the code simpler and access faster.
There is also the option of using Amazon Elastic File System (EFS), which is NAS-like storage. You can mount an EFS volume simultaneously on multiple EC2 instances. Data is replicated between multiple Availability Zones. It is charged on a pay-as-you-go basis. However, it is only the storage layer - you'd still need to use Amazon EC2 instance(s) to serve the content to the Internet.
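To illustrate the pay-for-use vs pay-for-provisioned difference described above, here is a toy comparison under assumed list prices (roughly $0.023/GB-month for S3 Standard and $0.10/GB-month for EBS gp2; check current AWS pricing, which varies by region and volume type):

```python
def monthly_storage_cost(used_gb, provisioned_gb,
                         s3_rate=0.023, ebs_rate=0.10):
    """Illustrative monthly cost comparison (assumed rates, not live
    pricing). S3 charges for the bytes actually stored; EBS charges
    for the whole provisioned volume whether it is used or not."""
    return {"s3": round(used_gb * s3_rate, 2),
            "ebs": round(provisioned_gb * ebs_rate, 2)}

# 120 GB of uploads, versus a 500 GB EBS volume provisioned up front:
costs = monthly_storage_cost(used_gb=120, provisioned_gb=500)
print(costs)  # {'s3': 2.76, 'ebs': 50.0}
```

This ignores request and data-transfer charges on both sides, but it captures why pay-as-you-go object storage usually wins for user-uploaded static assets.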