What are the best practices for reducing write latency across regions with RDS?
Use case: We have users in both Europe and Cambridge with equal read and write access. The RDS instance is hosted in US-East, so writes from the EU to the US would be slow.
Would it be best to have two separate databases, one in the EU and one in the US, or is there a way to use a higher-tier instance in the US to improve write performance?
RDS doesn't support Multi-Region multi-master setups, i.e. you can't have multiple writer-regions.
You could have a primary region for read/write-traffic and then a cross-region read-replica in the other region.
Your app would have to send all write-activity to the primary region and could read from the local region (with a delay).
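A minimal sketch of that read/write split, assuming the application can tell whether a statement is a write. The hostnames, the `EndpointRouter` class and the region names are hypothetical placeholders, not real RDS endpoints or part of any AWS SDK.

```python
from dataclasses import dataclass

# Hypothetical endpoints; replace with your own instance hostnames.
PRIMARY_WRITER = "mydb.primary.us-east-1.example.internal"
READ_REPLICAS = {
    "us-east-1": "mydb.primary.us-east-1.example.internal",
    "eu-west-1": "mydb.replica.eu-west-1.example.internal",
}


@dataclass(frozen=True)
class EndpointRouter:
    """Picks a database host for one statement, based on the caller's region."""

    local_region: str

    def endpoint_for(self, is_write: bool) -> str:
        # All writes go to the single primary region.
        if is_write:
            return PRIMARY_WRITER
        # Reads use the local replica if one exists, otherwise the primary.
        return READ_REPLICAS.get(self.local_region, PRIMARY_WRITER)


router = EndpointRouter(local_region="eu-west-1")
print(router.endpoint_for(is_write=False))
print(router.endpoint_for(is_write=True))
```

Because the replica lags behind the primary, a read issued right after a write may not see that write; code that needs read-your-own-writes behaviour could route those reads to the primary as well.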
If you want to optimize for cross-region latency, you might want to look at Aurora Global Database. It uses dedicated replication infrastructure with typically sub-second latency.
As far as I know the only service that allows multi-region multi-master setups would be DynamoDB Global Tables, but that's a whole different database paradigm.
Related
I am in the process of designing a multi-region, high-availability (active-active) architecture for my product. A simplified version of our stack: we use Lambda to implement our microservices, which are exposed as APIs through API Gateway. These microservices integrate with downstream services or databases such as DynamoDB and Aurora (RDS). So the request path is:
Route 53 >> API Gateway >> Lambda >> Downstream service/Database
I am trying to figure out the best way to configure Route 53 so that it detects when any service in the stack fails and routes incoming requests to another region. E.g. if the Lambda service in region-1 fails, it is easy: I would create health checks pointing to these Lambdas, and once they are not reachable Route 53 will route subsequent requests to region-2.
However, if a downstream resource that Lambda depends on (e.g. RDS) fails, how will Route 53 know this so that it can route subsequent requests to region-2?
Appreciate any pointers on this.
It depends a bit on your envisioned failover setup.
Let us assume you have two regions: region1 and region2.
Now you could have two failure scenarios:
1. Lambda fails in region1 => you fail over to Lambda in region2
2. RDS fails in region1 => you fail over to RDS in region2
In both cases you need to ask yourself what you want to do. If, for example, in case 1 you connect from Lambda in region2 to RDS in region1, then high cross-region transfer costs may occur, so you may want to trigger a failover of RDS to region2 in any case.
Note: It is generally advisable not to connect from Lambda directly to RDS, but to use RDS Proxy instead (to avoid hammering the database with connections, slowing it down, etc.): https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-proxy.html
Generally, region failovers with RDS are much more complicated (I can elaborate on that if needed). It is also not simply a matter of pointing to an IP in another region, because you usually need to promote the database (cluster) in the other region to primary before it accepts write operations.
For the databases you mentioned (DynamoDB, Aurora) there is a solution, though: use DynamoDB Global Tables or Aurora Global Database.
A simpler solution could be, depending on your application, to use DynamoDB Global Tables (see https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GlobalTables.html). However, DynamoDB is clearly not a relational database, so it may not fit all cases. Nevertheless, DynamoDB generally works very well with Lambda and is also easier to replicate across regions. Note: if you encrypt your data using an AWS KMS CMK (recommended), then you need to have this key available in all regions where you plan to use Global Tables (see https://docs.aws.amazon.com/kms/latest/developerguide/multi-region-keys-overview.html).
Another solution could be Aurora Global Database (https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-global-database.html) - it is available in multiple regions and failover is thus easier (cf. https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-global-database-disaster-recovery.html).
In the Aurora case you have to detect a region failure yourself (e.g. you could have a lambda in both regions that regularly tries to connect to the current active cluster for writing) and automatically promote the cluster in the new region as primary if it is not available in the original region.
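The detection logic described above could be sketched roughly as follows. This is only an illustration under assumptions: `FailoverMonitor`, `can_reach` and the failure threshold are hypothetical names and values, and the `promote` callable stands in for whatever promotion step you take from the disaster recovery guide linked above. In a real Lambda the failure counter would have to live in external storage, since state is not reliably kept between invocations.

```python
import socket
from typing import Callable


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Shallow TCP probe: True if a connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverMonitor:
    """Promotes the secondary after `threshold` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool],
                 promote: Callable[[], None], threshold: int = 3):
        self.probe = probe          # returns True while the writer is healthy
        self.promote = promote      # hypothetical hook that performs promotion
        self.threshold = threshold
        self.failures = 0
        self.promoted = False

    def tick(self) -> bool:
        """Run one probe; return True only if this call triggered promotion."""
        if self.promoted:
            return False            # promote at most once
        if self.probe():
            self.failures = 0       # a success resets the failure streak
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.promote()
            self.promoted = True
            return True
        return False
```

Requiring several consecutive failures before promoting is a design choice to avoid failing over on a single network blip; a stricter probe would also attempt an actual write rather than just opening a TCP connection.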
Do not forget: you need to test the failover regularly, otherwise it is almost guaranteed not to work when you need it.
Generally, running databases across regions implies transfer costs and additional resource costs compared to a single region - not only during failover, but whenever data is written.
With this configuration, I recommend failing over the entire stack (to another Region), rather than failing over individual tiers (components) of the architecture. (This is what you seem to be saying in your question, but just making sure we are on the same page).
Your question comes down to how to configure the health check, and specifically how to implement shallow versus deep (checking dependencies like RDS) health checks.
There is an AWS Well-Architected lab that covers these concepts: "Implementing Health Checks and Managing Dependencies to improve Reliability".
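As a rough illustration of a deep health check, the sketch below shows a Lambda-style handler behind API Gateway that reports unhealthy whenever a dependency check fails, so that a Route 53 health check pointed at it would see a non-2xx response. The `check_database` function and the `DEPENDENCY_CHECKS` registry are hypothetical placeholders; a real check would run a cheap query against the database.

```python
import json
from typing import Callable, Dict, Optional


def check_database() -> bool:
    """Placeholder dependency check; replace with a cheap query such as SELECT 1."""
    return True


# Hypothetical registry of dependencies this service needs in order to work.
DEPENDENCY_CHECKS: Dict[str, Callable[[], bool]] = {"database": check_database}


def run_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run every check; an exception counts as a failed dependency."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def handler(event, context,
            checks: Optional[Dict[str, Callable[[], bool]]] = None):
    """Deep health check: 200 only if all dependencies respond, else 503."""
    results = run_checks(DEPENDENCY_CHECKS if checks is None else checks)
    healthy = all(results.values())
    return {
        "statusCode": 200 if healthy else 503,
        "body": json.dumps({"healthy": healthy, "checks": results}),
    }
```

A shallow check would skip `run_checks` and return 200 as long as the function itself runs. The deep variant is what lets a database outage in one region take that region out of rotation.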
I would like to know what is recommended when one DB instance should be shared across different AWS regions. Is it better to use cross-Region read replicas, or a read replica in the region of origin plus AWS Global Accelerator?
Is there a "best practice" solution for global applications?
I am not experienced with AWS and most of these things are pretty new to me, so I know that my question may look amateurish.
From what I have read, I think that one centralized read replica is the better solution because of the latency between regions. But if that were the case, why would anyone use cross-region replicas at all?
If your application is hosted in a region e.g. eu-west-1 the best read performance will always come when it is reading data from eu-west-1.
If you happen to have customers in us-east-1 you have to choose between one of 3 options:
Edge Location
You reduce the latency using edge locations, i.e. CloudFront or Global Accelerator. This improves latency by using the AWS backbone to route to your origins. It is faster than before, but the application remains in the original region (in this case eu-west-1). You also maintain only one copy of the application.
Latency based routing
This option brings the application closer to the user. By using either Route 53 with latency based records or Global Accelerator, you can have your domains resolve to the location that has the lowest latency for each user. You would have your central region (where the read/write instance lives) and then create cross-region replicas. This provides the best read performance, as reads are done locally rather than across regions.
In the example, eu-west-1 is the primary region with cross-region replicas in us-east-1. Cross-region latency is only incurred when writing to the read/write instance in the original region (with Aurora write forwarding the application can send writes to its local replica, but they are still forwarded to the primary region). This is by far the most complex and costly option, but it provides the best performance overall.
Do nothing
If you do nothing, requests are routed to your host over the public internet. Users who are further away from your application will see higher latency, but this is the cheapest option.
Summary
You essentially need to decide how important cross-region support is. If the reason is simply that your user base is in a distant region, then getting as close to them as possible is key. You do not need to think about cross-region replicas if your users are all in one geographical region.
Remember you can always enhance your infrastructure when demand increases from other geographical regions.
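For the latency based routing option, the Route 53 side could look roughly like the sketch below, which only builds the change batch for one latency record per region. The domain name and target hostnames are hypothetical, and the helper names are made up for this example; the resulting dictionary is in the shape expected by the `ChangeBatch` argument of boto3's `change_resource_record_sets`, which is not called here.

```python
from typing import Dict


def latency_record(name: str, region: str, target: str, ttl: int = 60) -> dict:
    """One latency-based CNAME record for a single region."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "CNAME",
            # SetIdentifier must be unique among records sharing name and type.
            "SetIdentifier": f"{name}-{region}",
            "Region": region,  # Route 53 answers with the lowest-latency region
            "TTL": ttl,
            "ResourceRecords": [{"Value": target}],
        },
    }


def build_change_batch(name: str, targets: Dict[str, str]) -> dict:
    """Build a change batch with one latency record per region."""
    changes = [latency_record(name, region, target)
               for region, target in sorted(targets.items())]
    return {"Comment": "latency based routing sketch", "Changes": changes}


# Hypothetical regional application endpoints.
batch = build_change_batch("app.example.com", {
    "eu-west-1": "app-eu.example.com",
    "us-east-1": "app-us.example.com",
})
print(len(batch["Changes"]))
```

Each regional endpoint would still read from its local replica and write to the primary region, as described above.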
I am relatively experienced with many AWS services - but I do have a large gap around Aurora/RDS
I'm trying to create a multi-region multi-master (write replicas) setup
The purpose is to give low latency to users (if each read and write replica is in the user's region) and to give resilience (if there is a region outage, the users can have their requests routed to another region (the latency will be higher, but reduced service is better than no service))
I'm trying to learn about AWS Aurora and I've created a toy cluster to learn. It seems I can create a cluster that is served out of multiple regions (and Aurora replicates data between regions automatically). I've also read that it is possible to have a multi-master setup (in my toy cluster, it only had one write partition, I couldn't work out how to create another write partition in another region, which made me question if it's possible?)
Here is a diagram of what I'm thinking:
https://imgur.com/DzoSpHL
Thank you in advance!
The purpose is to give low latency to users (if each read and write replica is in the user's region)
I couldn't work out how to create another write partition in another region, which made me question if it's possible?
That is not possible (at least not currently) because of multi-master Aurora limitations.
all DB instances in a multi-master cluster must be in the same AWS Region.
and others such as
you can have a maximum of two DB instances in a multi-master cluster
You can't enable cross-Region replicas from multi-master clusters.
You can read more here
The best thing you can do in your scenario is to create a single master and place read replicas in those additional regions (possibly with some caching if necessary).
As mentioned earlier it is not possible with Aurora.
However DynamoDB supports multi-active multi-region:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GlobalTables.html
As others have said, with Amazon Aurora, you cannot deploy multi-Region and multi-master. However you can deploy multi-Region using Aurora Global Database. Then one writer endpoint would be in one Region, while reader endpoints would be available in all the other Regions. Then you can also use write forwarding (assuming you are using the MySQL flavor of Aurora) in the read-only Regions. I know latency is a concern for you, so note the write actually goes back to the primary Region, so writes will incur that extra latency.
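A small sketch of what using write forwarding from a secondary Region might look like, assuming Aurora MySQL with write forwarding already enabled on the secondary cluster and any DB-API style cursor connected to it. The `forwarded_write` helper is a hypothetical name; `aurora_replica_read_consistency` is the session variable that has to be set before forwarded writes are accepted.

```python
from typing import Sequence

# Read consistency levels accepted by aurora_replica_read_consistency.
VALID_CONSISTENCY = {"EVENTUAL", "SESSION", "GLOBAL"}


def forwarded_write(cursor, sql: str, params: Sequence = (),
                    consistency: str = "SESSION") -> None:
    """Run a write on a secondary-Region connection using write forwarding.

    `cursor` is any DB-API cursor connected to the secondary cluster; the
    statement itself is still executed in the primary Region.
    """
    if consistency not in VALID_CONSISTENCY:
        raise ValueError(f"unknown consistency level: {consistency}")
    # The value is validated above, so it is safe to interpolate here.
    cursor.execute(f"SET aurora_replica_read_consistency = '{consistency}'")
    cursor.execute(sql, params)
```

This only simplifies the application (no separate writer connection to manage); every forwarded write still pays the round trip to the primary Region.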
In case of a disaster, when an entire AWS region fails and all its customers want to move their workloads to the next closest region in a disaster recovery scenario, is AWS ready for this?
I imagine millions of servers running in each region. Is AWS ready to provision them in another region the next day? Do they have that capacity at the ready?
AWS global infrastructure uses the concept of Availability Zones inside each region to partition resources, isolate risks and ultimately reduce the blast radius of a failure. AZs are groups of data centers within a region that are designed to be independent of each other in terms of risk (i.e. different connections to the power grid, redundant and isolated network infrastructure, isolation from geographical risks such as earthquakes, flooding, etc.).
Some services are designed to take advantage of this redundant infrastructure automatically (Amazon S3, Amazon DynamoDB, ELB, etc.). Customers do not need to configure anything; redundancy and failover at the regional level are handled by the service. Other services operate at the AZ level (Amazon EC2, EBS, RDS, etc.). For these services, the best practice is to design for a multi-AZ architecture and data replication.
In the very unlikely case that a service is not available in an AZ, a well-architected system will transparently fail over to another AZ without any noticeable customer impact.
Back to your question: the architecture is designed to avoid a region-wide failure of all services. This has never happened since we launched AWS in 2006. And yes, we have a lot of capacity. I suggest you watch this keynote from James Hamilton to learn more about it: https://www.youtube.com/watch?v=AyOAjFNPAbA
Can you let me know whether the AWS technologies below keep data in
multiple facilities? How many? In different Availability Zones?
S3, EBS, DynamoDB
I would also like to know, in general, what the distance between two AZs is. I want to make sure that no single catastrophe can destroy a complete region.
To start, let me point out that all of the questions above are easily answered in the AWS documentation.
What are Regions and Availability Zones?
Refer to this documentation:
Each region is a separate geographic area. Each region has multiple,
isolated locations known as Availability Zones.
Also want to know in general what is the distance between two AZ?
I don't think anyone would know the answer to that. Amazon does not publish that kind of information about their data centers; they are secretive about it.
Now, to start with S3, as per the AWS documentation:
Although, by default, Amazon S3 stores your data across multiple
geographically distant Availability Zones.
You can also enable Cross-Region Replication, as per the AWS documentation, but that will incur extra cost:
Cross-region replication is a bucket-level configuration that enables
automatic, asynchronous copying of objects across buckets in different
AWS Regions.
Now for EBS, as per the AWS documentation:
Each Amazon EBS volume is automatically replicated within its
Availability Zone to protect you from component failure, offering high
availability and durability
Also, as per the documentation, you can create a point-in-time snapshot and make it available in another AWS Region. All snapshots are backed up on Amazon S3.
Now for DynamoDB, as per the AWS documentation:
DynamoDB stores data in partitions. A partition is an allocation of
storage for a table, backed by solid-state drives (SSDs) and
automatically replicated across multiple Availability Zones within an
AWS Region.
You can also make it available across regions; for more details, please refer to the AWS documentation.
Hope this clears up your doubts!
By default, these services replicate data across different AZs (Availability Zones) within the same AWS region (see the edit below regarding EBS).
AWS also provides mechanisms to replicate data across different regions (which you can choose), so that you get more fault tolerance and lower latency for your users (you can serve users from servers in their own region).
However, keep in mind that replicating data across multiple regions incurs additional cost.
You can read the AWS doc http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html to see which AWS regions and AZs exist and where they are located.
The whole idea of having different AZs and regions is to provide high availability, so you shouldn't need to worry about distance and availability if you replicate across multiple AZs or regions.
Edit: Thanks to Michael for pointing out that EBS volumes are only replicated (mirrored) within the AZ where the volume is created.