AWS S3 vs AWS Global Infrastructure: Availability Zones mismatch


There's a statement in AWS S3 documentation that objects in S3 are replicated and stored across at least three geographically-dispersed Availability Zones. However, on the Global Infrastructure page there are a few regions (Canada Central and Beijing) with only 2 Availability Zones available.
If I understand it right, the replication settings are region-specific and all objects will be replicated only across 2 Availability Zones. Does anybody have any insights on that?

Some regions have fewer than three availability zones accessible to customers, but none -- apparently -- have fewer than three where S3 is deployed.
Amazon S3 Standard, S3 Standard-Infrequent Access, and S3 Glacier storage classes replicate data across a minimum of three AZs to protect against the loss of one entire AZ. This remains true in Regions where fewer than three AZs are publicly available.
https://aws.amazon.com/s3/faqs/

Related

What exactly is an AWS Region, and how do I choose the right Region for my business?

What exactly is a Region in the AWS world?
Which Region is the right one for my business, and which factors are important when selecting a Region in AWS?

An AWS Region is a physical cluster of data centers located in a specific geographic location.
So, the Sydney Region data centers are all located in Sydney and the Oregon Region has data centers all located in Oregon.
A Region consists of multiple Availability Zones. An Availability Zone is one or more data centers that contain the physical infrastructure that provides AWS services (e.g. data, storage, networking). There are very high-speed connections between Availability Zones within a Region.
So, which Region to choose? It should typically be the one closest to your customers (to provide faster response) or perhaps closest to your existing data center if you are connecting it to AWS.
You might want to use multiple Regions so that you have services close to customers spread around the world, rather than having them all connect back to one location. Or, you might want to use multiple Regions for redundancy in case of failure. (Project Nimble: Region Evacuation Reimagined – Netflix TechBlog)
There might also be legal requirements about which Region to use (based on data governance, privacy laws, etc). You might even choose a Region based on a lower price (USA Regions are generally lower cost than others, especially for Internet data transfer costs).
You might also choose a region based upon which services are available: Region Table
See also: Global Cloud Infrastructure | Regions & Availability Zones | AWS

The definition and documentation of an AWS Region are covered in the answers above. In summary, an AWS Region is a separate geographic area. Each Region contains Availability Zones, which are isolated data centers and are used for high availability. There are two or more Availability Zones in each Region.
Which factors are important before selecting region in AWS?
There are several factors to consider.
Latency - The closer a Region is to your users, the lower the latency and the better the performance. This link displays latency measurements between Regions: https://www.cloudping.co/
Cost - Prices differ between Regions. So far, Northern Virginia has been the cheapest.
AWS Services to use - Not all AWS Services are available in all regions. This link can display the supported services per region. https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/
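To make the factors above concrete, here is a minimal sketch of how one might shortlist a Region: filter out Regions that lack a required service or fall outside the allowed jurisdiction, then prefer the lowest latency and break ties on cost. Everything in it is an assumption for illustration: the Region names, latency figures, cost ratios, and service sets are made up, and `pick_region` is a hypothetical helper, not an AWS API.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class RegionInfo:
    name: str              # hypothetical Region name, not a real AWS Region
    latency_ms: float      # latency measured from your users (made-up numbers)
    relative_cost: float   # 1.0 = baseline price level (made-up numbers)
    services: Set[str]     # services offered in the Region (made-up sets)
    jurisdiction: str      # where the data legally resides


def pick_region(
    candidates: List[RegionInfo],
    required_services: Set[str],
    allowed_jurisdictions: Optional[Set[str]] = None,
) -> Optional[RegionInfo]:
    """Return the eligible Region with the lowest latency, then lowest cost."""
    eligible = [
        r for r in candidates
        # Hard constraint: every required service must be available.
        if required_services <= r.services
        # Hard constraint: legal/data-governance requirements, if any.
        and (allowed_jurisdictions is None or r.jurisdiction in allowed_jurisdictions)
    ]
    if not eligible:
        return None
    # Soft preferences: latency first, cost as the tie-breaker.
    return min(eligible, key=lambda r: (r.latency_ms, r.relative_cost))


# Illustrative data only.
CANDIDATES = [
    RegionInfo("region-a", 95.0, 1.00, {"s3", "aurora", "dynamodb"}, "US"),
    RegionInfo("region-b", 20.0, 1.08, {"s3", "aurora", "dynamodb"}, "CA"),
    RegionInfo("region-c", 15.0, 1.05, {"s3", "dynamodb"}, "EU"),
]

if __name__ == "__main__":
    choice = pick_region(CANDIDATES, {"s3", "aurora"})
    print(choice.name if choice else "no Region satisfies the constraints")
```

Treating service availability and legal requirements as hard filters, and latency and cost as preferences, is just one reasonable ordering; a cost-sensitive workload might swap the sort key.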

There are a number of resources that can help you understand AWS regions, availability zones, and how to architect using them, including:
AWS: Regions and Availability Zones
AWS: Architecting for the Cloud: Best Practices
CloudAcademy: How to Pick the Best AWS Region for Your Workload

How does Aurora work in regions with only two zones?

The Aurora documentation states that data is replicated six ways, across three availability zones. The Canada region (ca-central-1) offers Aurora as an option, yet only has two availability zones. How is data replication handled in regions with only two availability zones?

I was trying to answer a similar question in a training I'm giving.
The region in question no longer has only 2 AZs, but 3 official ones, which are called:
ca-central-1a
ca-central-1b
ca-central-1d
Curiously, ca-central-1c is left out, which leads me to believe there was previously some sort of unofficial or stripped-down AZ in place.
The official launch news for the third AZ explicitly states that S3, for example, previously replicated its data across three AZs, even when only two were publicly available:
It’s important to notice that Amazon S3 storage classes that replicate data across a minimum of three AZs, are always doing so, even when fewer than three AZs are publicly available.
So there was probably an unofficial/non-public third AZ present previously.

Initially, AWS had only two AZs in the Canada region. In June 2020, they added one more AZ.

AZ mapping per user

I read "Each AWS account has independently mapped AZs, which can vary between different accounts" in a book.
As far as I know, each region has its own well-defined AZs.
So how is it that the AZs vary from one user to another?

Availability Zone A for my AWS account may not be the same as Availability Zone A for your account. The AZ mappings are created when your account is generated.
The idea is to distribute customer workloads within a region. Most customers pick AZs A, B, or C when launching instances (human nature). If everyone had the same AZ A, workloads would be hugely unbalanced in a region, and that AZ would be a failure point for a lot of customers.
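As a rough illustration of the idea above, the sketch below gives each account its own stable mapping from AZ letter names to physical zones. This is not how AWS actually assigns the mapping (that mechanism is not described here); the hash-seeded shuffle, the zone labels, the Region name, and the `az_mapping` helper are all assumptions made for the example.

```python
import hashlib
import random
from typing import Dict

# Hypothetical physical zones in one Region (illustrative labels only).
PHYSICAL_ZONES = ["zone-1", "zone-2", "zone-3"]


def az_mapping(account_id: str, region: str = "example-region-1") -> Dict[str, str]:
    """Return a stable, account-specific mapping of AZ names to physical zones.

    Assumption: the mapping is derived from a hash of the account ID, so it is
    fixed once for the account but differs between accounts.
    """
    digest = hashlib.sha256(f"{account_id}:{region}".encode()).digest()
    seed = int.from_bytes(digest[:8], "big")
    zones = PHYSICAL_ZONES.copy()
    random.Random(seed).shuffle(zones)  # deterministic for a given account
    letters = "abcdefgh"
    return {f"{region}{letters[i]}": zone for i, zone in enumerate(zones)}


if __name__ == "__main__":
    # Two made-up account IDs; their "a" zones may or may not coincide.
    for account in ("111111111111", "222222222222"):
        print(account, az_mapping(account))
```

With a mapping like this, customers who all pick the "a" zone out of habit still end up spread across different physical zones.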

Amazon Web Services - Durability

Can you let me know whether the AWS services below keep data in
multiple facilities? How many? In different Availability Zones?
S3, EBS, DynamoDB
I would also like to know, in general, the distance between two AZs. I want to find out whether a single catastrophe could destroy a complete region.

To start, note that all of the questions above are answered in the AWS documentation.
What are Regions and Availability Zones?
Refer to this documentation:
Each region is a separate geographic area. Each region has multiple,
isolated locations known as Availability Zones.
What is the distance between two AZs?
I don't think anyone would know the answer to that. Amazon does not publish that kind of information about its data centers; they are secretive about it.
Starting with S3, as per the AWS documentation:
Although, by default, Amazon S3 stores your data across multiple
geographically distant Availability Zones.
You can also enable cross-region replication, as per the AWS documentation, but that will incur extra cost:
Cross-region replication is a bucket-level configuration that enables
automatic, asynchronous copying of objects across buckets in different
AWS Regions.
For EBS, as per the AWS documentation:
Each Amazon EBS volume is automatically replicated within its
Availability Zone to protect you from component failure, offering high
availability and durability
Also, as per the documentation, you can create a point-in-time snapshot and make it available in another AWS Region. All snapshots are stored in Amazon S3.
For DynamoDB, as per the AWS documentation:
DynamoDB stores data in partitions. A partition is an allocation of
storage for a table, backed by solid-state drives (SSDs) and
automatically replicated across multiple Availability Zones within an
AWS Region.
You can also make it available across regions; for more details, please refer to this AWS documentation.
I hope this clears up your doubts!

By default, all these services replicate the data in different AZs (Availability Zones) within the same AWS region.
AWS also provides mechanisms to replicate the data across different regions (which you can choose), so that you get better fault tolerance and lower latency for your users (you can serve users from servers in their own region).
However, keep in mind that replicating data across multiple zones involves more cost.
You can read the AWS doc http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html to see which AWS regions and AZs exist and where they are located.
The whole idea of having different AZs and regions is to provide high availability, so you shouldn't need to worry about distance and availability if you replicate across multiple AZs or regions.
Edit: Thanks to Michael for pointing out that EBS volumes are replicated (mirrored) only within the AZ where the volume is created.
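The default replication scopes discussed in these answers can be summarized in a small lookup table. This is only a sketch of what the quoted documentation says (S3 and DynamoDB across multiple AZs in a Region, EBS within a single AZ); the `Scope` enum and `survives_az_loss` helper are hypothetical names, and optional features such as cross-region replication or snapshots are left out.

```python
from enum import Enum


class Scope(Enum):
    SINGLE_AZ = "replicated within one Availability Zone"
    MULTI_AZ = "replicated across multiple Availability Zones in one Region"


# Default behavior as described in the quoted documentation above.
DEFAULT_REPLICATION = {
    "S3": Scope.MULTI_AZ,
    "EBS": Scope.SINGLE_AZ,       # mirrored only inside the volume's own AZ
    "DynamoDB": Scope.MULTI_AZ,
}


def survives_az_loss(service: str) -> bool:
    """True if the service's default replication spans more than one AZ."""
    return DEFAULT_REPLICATION[service] is Scope.MULTI_AZ


if __name__ == "__main__":
    for name, scope in DEFAULT_REPLICATION.items():
        print(f"{name}: {scope.value}")
```

None of these defaults protects against the loss of an entire Region; that requires the opt-in cross-region mechanisms mentioned above.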

AWS VPC vs Subnet for Application Wrapping

I'm trying to get a better understanding of AWS organization patterns.
Suppose I define the term "application stack" as a set of interconnected AWS resources (e.g. a Java microservice behind ELB + DynamoDB for persistence), then I need some way of isolating independent stacks. Each application would get a separate DynamoDB table or Kinesis stream, so there is no need for cross-stack resource sharing. But the microservices do need to communicate with each other.
A priori, I could see either of the following two organizational methods being used:
Create a VPC for each independent stack (1 VPC per 1 Application)
Create a single "production" VPC and each stack resides within a separate private subnet.
There could be hundreds of these independent "stacks" within the organization, so there's the potential for resource exhaustion if there is a hard limit on VPC count. But other than resource scarcity, what are the decision criteria around creating a new VPC or using a pre-existing VPC for each stack? Are there strong positive or negative consequences to either approach?
Thank you in advance for consideration and response.

Subnets and IP addresses are a limited commodity within your VPC. A VPC's primary CIDR block cannot be resized after creation, so the address space is effectively fixed unless you associate additional CIDR blocks with the VPC. Also, by default, all subnets can talk to other subnets, so there may be security concerns. Any limits on the number of VPCs are soft limits and can be increased by AWS support.
For these reasons, separate distinct projects at the VPC level. Never mix projects within a VPC. That's just asking for trouble.
Also, if your production projects are going to include non-VPC-applicable resources, such as IAM users, DynamoDB tables, SQS queues, etc., then I also recommend isolating those projects within their own AWS account (at the production level).
This way, you're not looking at a list of DynamoDB tables that includes tables from different projects.
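The point about subnets and IP addresses being a limited commodity can be checked with some quick arithmetic. The sketch below uses Python's standard `ipaddress` module to count how many equally sized subnets fit in a VPC CIDR block and how many addresses each leaves for workloads. The CIDR ranges are example values, `subnet_capacity` is a hypothetical helper, and the figure of five reserved addresses per subnet is the number AWS documents for VPC subnets.

```python
import ipaddress
from typing import Tuple

# AWS reserves five addresses in every VPC subnet (network address,
# router, DNS, future use, and broadcast).
AWS_RESERVED_PER_SUBNET = 5


def subnet_capacity(vpc_cidr: str, subnet_prefix: int) -> Tuple[int, int]:
    """Return (number of subnets, usable addresses per subnet)."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = list(vpc.subnets(new_prefix=subnet_prefix))
    usable = subnets[0].num_addresses - AWS_RESERVED_PER_SUBNET
    return len(subnets), usable


if __name__ == "__main__":
    # Example: one shared "production" VPC carved into /24 subnets.
    count, usable = subnet_capacity("10.0.0.0/16", 24)
    print(f"{count} subnets, {usable} usable addresses each")
```

If each stack needs a subnet in more than one Availability Zone, the number of stacks that fit in a single VPC shrinks accordingly, which is one more argument for separating stacks at the VPC level.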
