Amazon S3 Bucket replica location - amazon-web-services

I am confused about the Amazon S3 replica mechanism. In my understanding, by default, Amazon S3 applies a 3-replica mechanism, in which there are 3 replicas of each object created in my S3 bucket. All the replicas are stored in multiple availability zones within only ONE region, which I specified when creating the S3 bucket.
Is my understanding correct? If it's correct, is it possible to see where the replicas of an object are stored?
Thanks

You are pretty much correct. S3 replication works by replicating across at least 3 data centers, over at least two AZs within a single region (each availability zone can have multiple data centers).
The replication is part of S3, which is a managed service, meaning you just have to accept what they're telling you. Telling you where the replicas are wouldn't really serve any purpose, and AWS never really discloses the details of its infrastructure to anyone who doesn't need to know. Even if they told you the data was stored in Availability Zones 1 and 2, this would be effectively meaningless information, as zones are aliases, i.e. your Zone 1 probably isn't the same as my Zone 1.
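To illustrate the alias point, here is a minimal sketch. The two account mappings are invented for illustration and are not real account data; the only idea taken from AWS is that each account maps zone names such as us-east-1a onto underlying zone IDs such as use1-az1 in its own way. In a real account, the EC2 DescribeAvailabilityZones call reports both a ZoneName and a ZoneId for each zone.

```python
# Hypothetical zone name -> zone ID mappings for two different AWS accounts.
# The values are invented for illustration only.
ACCOUNT_A_ZONES = {"us-east-1a": "use1-az1", "us-east-1b": "use1-az2"}
ACCOUNT_B_ZONES = {"us-east-1a": "use1-az2", "us-east-1b": "use1-az1"}


def same_physical_zone(zones_x, name_x, zones_y, name_y):
    """Return True if two zone names resolve to the same underlying zone ID."""
    return zones_x[name_x] == zones_y[name_y]


# Same zone name in both accounts, but a different underlying zone.
print(same_physical_zone(ACCOUNT_A_ZONES, "us-east-1a", ACCOUNT_B_ZONES, "us-east-1a"))
# Different zone names, but the same underlying zone.
print(same_physical_zone(ACCOUNT_A_ZONES, "us-east-1a", ACCOUNT_B_ZONES, "us-east-1b"))
```

So a zone name alone says little about where data physically lives; only the zone ID is comparable across accounts.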

Related

Amazon S3 redundancy over Availability Zones vs. over Regions

This https://aws.amazon.com/blogs/storage/architecting-for-high-availability-on-amazon-s3/#:~:text=Amazon%20S3%20maintains%20redundancy%20even%20within%20one%20of,can%20still%20access%20their%20data%20with%20no%20downtime states the following:
Amazon S3 storage classes replicate their data on more than three
Availability Zone (except for S3 One Zone-Infrequent Access).
What's the point of this article https://aws.amazon.com/blogs/startups/large-scale-disaster-recovery-using-aws-regions/ stating:
S3 snapshots: We rely on the cross s3 sync and this works like a
charm. We are able to copy the data from our primary to the DR region
within a matter of few minutes.
The latter seems superfluous now and is from 2017, so maybe it is outdated? Or is the thrust that we should also be placing Amazon S3 copies across Regions? I see no such need, as the AZs within a Region are physically separated from each other. What am I missing?
S3 buckets are region specific. When you create a new bucket you need to select the target region for that bucket.
For DR reasons, you can keep backups in another region. Should the primary region fail in a way that the entire region is affected, then you could restore in the backup region.
Your DR strategy will depend on your use case and on how quickly you need to return services to normal after a region-wide failure.
For example, let's say you rely on EC2/EBS to operate your service and those services suffer a region-wide outage for 5 hours. To recover your service, you would need to move to a region where the resources are available. Assuming you need S3 data for operational processing, you would want to have that data ready in the target recovery region.
Storing data in multiple AZs within a region does not guarantee safety if the entire region fails. This applies to all regional services. The article you shared does mention this, so it is not irrelevant:
The service that runs in HA is handled by hosts running in different
availability zones but in the same geographical region. This approach,
however, does not guarantee that our business will be up and running
in case the entire region goes down

How are >=3 AZs guaranteed in AWS S3?

Amazon S3 guarantees that data uploaded to a bucket will be spread across >= 3 AZs (scroll down for chart). When we create a bucket, we choose a region. How does Amazon manage this number of AZs when we create a bucket in a region that has only two AZs?
Here's the answer from the AWS S3 FAQ. Apparently, in those cases, more AZs exist, but they are not publicly available:
Q: What is an AWS Availability Zone (AZ)?
An AWS Availability Zone is an isolated location within an AWS Region. Within each AWS Region, S3 operates in a minimum of three AZs, each separated by miles to protect against local events like fires, floods, etc.
Amazon S3 Standard, S3 Standard-Infrequent Access, and S3 Glacier storage classes replicate data across a minimum of three AZs to protect against the loss of one entire AZ. This remains true in Regions where fewer than three AZs are publicly available. Objects stored in these storage classes are available for access from all of the AZs in an AWS Region.
The Amazon S3 One Zone-IA storage class replicates data within a single AZ. Data stored in this storage class is susceptible to loss in an AZ destruction event.

Amazon DynamoDB - geographically distributed?

I am new to AWS. Sorry if my question is basic; I got stuck on this term.
AWS Global Infrastructure says "18 geographic Regions". The term "geographic" is used along with Regions there, which makes sense.
The third question in the DynamoDB FAQs says, "Amazon DynamoDB stores three geographically distributed replicas of each table to enable high availability and data durability."
Does "three geographically distributed" here refer to Regions or to Availability Zones? I am a bit confused. If it means Regions, does that mean my data is leaving my country (if my country has only 1 Region)?
Please suggest.
"Geographically distributed" in this documentation refers to Availability Zones, not Regions. Per the AWS documentation, when you create a table in a region, it is replicated to other zones in that region to ensure high availability. Any change you make to the table is propagated to the replicas. The AZs are interconnected with low-latency networks.
The data is stored on SSD disks and automatically replicated across
multiple Availability Zones in an AWS region, which brings the high
availability and your data is durable.
Table names are scoped to a region, so a table with the same name can also be created in other regions; it is a separate table.
If you want your table to be replicated to other regions, you must enable cross-region replication. For more details, refer to:
DynamoDB
All Things about DynamoDB
Almost every AWS service revolves around two things for availability: Multi-AZ (multiple data centers in a single region) and Cross-Region (different geographic locations across the globe), and DynamoDB is no exception. By default, DynamoDB is a Multi-AZ service, which means that your data is replicated across 3 data centers (minimum of 2 AZs); for cross-region replication, you need to enable DynamoDB global tables (which build on DynamoDB Streams).
Multi-Region Replication with DynamoDB
DynamoDB global tables are geographically distributed. They provide a fully managed solution for deploying a multi-region, multi-active database. Like every other geographically distributed database, global tables come with replication latency.
An important thing to note here is that DynamoDB does not offer cross-region strong consistency (in contrast with CosmosDB, a similar offering from Azure).
From AWS documentation:
An application can read and write data to any replica table. If your
application only uses eventually consistent reads and only issues
reads against one AWS Region, it will work without any modification.
However, if your application requires strongly consistent reads, it
must perform all of its strongly consistent reads and writes in the
same Region. DynamoDB does not support strongly consistent reads
across Regions. Therefore, if you write to one Region and read from
another Region, the read response might include stale data that
doesn't reflect the results of recently completed writes in the other
Region.
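As a rough sketch of what the quoted passage implies for application code, the helper below only asks for a strongly consistent read when the read goes to the Region that receives the writes. The function name, table name, key schema, and Regions are hypothetical; the TableName, Key, and ConsistentRead fields are the standard parameters of the DynamoDB GetItem call.

```python
def build_get_item_request(table_name, key, read_region, write_region):
    """Build keyword arguments for a DynamoDB GetItem call.

    A strongly consistent read is only requested when the read goes to the
    same Region that receives the writes. Otherwise the request falls back
    to an eventually consistent read, which may return stale data.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": read_region == write_region,
    }


request = build_get_item_request(
    table_name="orders",              # hypothetical table name
    key={"order_id": {"S": "1234"}},  # hypothetical key schema
    read_region="eu-west-1",
    write_region="eu-west-1",
)
print(request["ConsistentRead"])
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb", region_name="eu-west-1").get_item(**request)
```

Keeping the decision in one place makes it harder to accidentally request a strongly consistent read from a replica Region and assume it reflects writes made elsewhere.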
Also, global tables are not to be confused with global indexes. Global indexes get their name because they are used in fetching data across multiple DynamoDB partitions.
"Amazon DynamoDB stores three geographically distributed replicas of each table to enable high availability and data durability."
This refers specifically to DynamoDB's multi-AZ structure, which helps achieve high availability for your table. E.g., if one Availability Zone is down, you will still be able to access your table.
To answer "my data is going out of my country(if my country has only
1 Region)."
Multi-region replication is not on by default. You need to use global tables and specify the regions to replicate to, which means your data/table won't go to any other region unless you explicitly ask for it.
For more on global tables, refer to:
https://aws.amazon.com/dynamodb/global-tables/

Is it possible to have a redundant S3 bucket?

We use S3 for static file hosting. Is it possible to set it up redundantly? I don't want to rely on only one zone in case anything breaks.
Thanks.
Amazon S3 buckets are regional-level services. Data is replicated automatically across multiple Availability Zones.
So, if you wish to have redundancy across Availability Zones, it is done for you automatically.
If you wish to have redundancy across regions, you might be able to use Amazon CloudFront and/or Amazon Route 53.
It's not possible to have redundant (duplicate) bucket names in AWS; all bucket names are unique.

What are possible ways to access Amazon S3 data if an S3 outage happens?

Can someone help me understand the S3 outage use case here?
The probability of an S3 outage is very low, but if it does happen, how can we access the data that sits in S3?
I know of one possibility, cross-region replication, which works for new files that I put in my S3 bucket after I enable it. What happens to old files? I know that if I also upload all those historical files to the other region, it works.
Then the same question again: if both regions go down, then what?
I am sure others have thought about this. Any input is welcome.
From Protecting Data in Amazon S3:
Objects are redundantly stored on multiple devices across multiple facilities in an Amazon S3 region. To help better ensure data durability, Amazon S3 PUT and PUT Object copy operations synchronously store your data across multiple facilities before returning SUCCESS. Once the objects are stored, Amazon S3 maintains their durability by quickly detecting and repairing any lost redundancy.
...
Backed with the Amazon S3 Service Level Agreement
Designed to provide 99.999999999% durability and 99.99% availability of objects over a given year
Designed to sustain the concurrent loss of data in two facilities
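To put the two percentages above into everyday terms, here is a small worked calculation. It assumes a 365-day year and treats the figures as simple annual fractions; the function names and the 10,000,000-object example are illustrative choices, not part of the quoted documentation.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def expected_downtime_minutes(availability_percent):
    """Minutes per year during which objects may be unavailable."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * MINUTES_PER_YEAR


def expected_objects_lost_per_year(durability_percent, object_count):
    """Average number of objects lost per year at the given durability."""
    loss = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss * object_count


# Percentages are passed as strings so Decimal keeps them exact.
print(expected_downtime_minutes("99.99"))
print(expected_objects_lost_per_year("99.999999999", 10_000_000))
```

Under these assumptions, 99.99% availability allows about 52.56 minutes of unavailability per year, and 99.999999999% durability works out to an average of 0.0001 objects lost per year for 10,000,000 stored objects, i.e. roughly one object every 10,000 years.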
So, if you're still not happy with all those statements, how can you access your data in an outage?
If your data is in only one region, and the region is not accessible, then your data is not accessible. Note, however, that an external network connectivity problem could prevent access to Amazon S3, yet Amazon S3 might still be accessible from Amazon EC2 instances in the same region.
Cross-region replication will copy your data to another Amazon S3 region. It requires versioning to be activated. To copy any files that existed before cross-region replication was activated, use the sync command in the AWS Command Line Interface (CLI), e.g.:
aws s3 sync s3://bucket1/folder s3://bucket2/folder
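For the replication setup itself (as opposed to the one-off sync), the sketch below builds the request bodies that would be passed to boto3's put_bucket_versioning and put_bucket_replication calls. It only constructs and prints the dictionaries, so it runs without AWS access. The role ARN, account ID, and rule ID are hypothetical placeholders, and the bucket names simply reuse bucket1 and bucket2 from the sync command.

```python
import json


def versioning_config():
    """Versioning must be enabled on both buckets before replication."""
    return {"Status": "Enabled"}


def replication_config(role_arn, destination_bucket):
    """Build a single-rule configuration that replicates every new object."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-everything",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches all keys
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


config = replication_config(
    # Hypothetical IAM role that S3 would assume to replicate objects.
    role_arn="arn:aws:iam::123456789012:role/example-replication-role",
    destination_bucket="bucket2",
)
print(json.dumps(config, indent=2))
# With boto3 installed and credentials configured, these could be applied as:
#   s3 = boto3.client("s3")
#   s3.put_bucket_versioning(Bucket="bucket1", VersioningConfiguration=versioning_config())
#   s3.put_bucket_versioning(Bucket="bucket2", VersioningConfiguration=versioning_config())
#   s3.put_bucket_replication(Bucket="bucket1", ReplicationConfiguration=config)
```

The rule only affects objects written after it is in place, which is why the sync command is still needed for files that already exist.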
Each AWS region operates independently, so the possibility of multiple regions suffering outages would presumably be even less likely.
If you are feeling particularly paranoid, you could copy your data to another cloud provider (Azure, Google, Rackspace, etc). There are tools that can assist:
CloudBerry Cloud Migrator
AzureCopy
...and no doubt many more!