Generally, when using AWS RDS, the recommended practice for high availability is to deploy a hot standby in a different AZ (a Multi-AZ deployment). Read replicas can also be added to improve read performance.
I've read the AWS Aurora documentation: it uses a shared virtual storage layer that is replicated across 3 AZs, with two copies in each AZ.
My question is this: is there any need for a Multi-AZ deployment of an Aurora DB cluster if Aurora can heal itself and already has its storage distributed across multiple AZs? If it keeps 2 storage copies in each of 3 AZs, then it's as reliable as using the Multi-AZ replica setup for failover. Also, during failover, it automatically creates another instance (if no read replica exists) or switches the primary. I really do not understand the need for a Multi-AZ Aurora cluster to 'improve' availability.
Is there some scenario where availability would suffer under the default Aurora deployment? What happens if an entire AZ is lost and it contains the primary Aurora DB node?
If you are only interested in your data not being lost, then a non-Multi-AZ deployment would probably work fine because, as you said, the data is replicated for you.
But the running instance of Aurora still lives on a physical machine, and that physical machine lives in a single AZ, so if that AZ goes down, while you may not lose any data you won't necessarily have access to it.
A multi-AZ deployment has a physical machine running in more than one AZ, so if one AZ goes down, the database server in the other AZ can still serve your requests.
The RDS Multi-AZ feature is much simpler for Aurora deployments than it is for non-Aurora deployments: An Aurora Replica is a Multi-AZ failover target in addition to a read-scaling endpoint, so creating a Multi-AZ Aurora deployment is as simple as deploying an Aurora Replica in a different Availability Zone from the primary instance.
This behavior is different from standard non-Aurora Multi-AZ deployments, which maintain a separate synchronously-replicated 'standby instance' which cannot be used as a read-scaling endpoint, and vice versa (standard RDS Read Replicas cannot be used as Multi-AZ failover targets).
Even though Aurora data is replicated across AZs, having a replica instance already running can still significantly reduce the time it takes to recover from a failure of the primary instance. With an Aurora Replica available, failover typically completes in under 2 minutes, compared to up to about 10 minutes without one, as described in Fault Tolerance for an Aurora DB Cluster:
If the primary instance in a DB cluster fails, Aurora automatically fails over to a new primary instance in one of two ways:
By promoting an existing Aurora Replica to the new primary instance
By creating a new primary instance
If the DB cluster has one or more Aurora Replicas, then an Aurora Replica is promoted to the primary instance during a failure event. [...] However, service is typically restored in less than 120 seconds, and often less than 60 seconds. [...]
If the DB cluster doesn't contain any Aurora Replicas, then the primary instance is recreated during a failure event. [...] Service is restored when the new primary instance is created, which typically takes less than 10 minutes.
Promoting an Aurora Replica to the primary instance is much faster than creating a new primary instance.
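To make the two failover paths concrete, here is a minimal Python sketch of the decision described in the quoted documentation. It is only a model, not how Aurora is implemented: the `Instance` type, the `plan_failover` function, and the instance names are hypothetical, the time constants are just the "typically less than" figures quoted above, and picking the first surviving replica is a simplification of however Aurora actually chooses one.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Rough "typically less than" figures taken from the quoted documentation.
# Real recovery times vary; these are not guarantees.
PROMOTE_REPLICA_TYPICAL_MAX_S = 120
RECREATE_PRIMARY_TYPICAL_MAX_S = 600


@dataclass
class Instance:
    name: str         # hypothetical instance identifier
    az: str           # Availability Zone the instance runs in
    is_primary: bool  # True for the writer, False for an Aurora Replica


def plan_failover(instances: List[Instance], failed_az: str) -> Tuple[str, int]:
    """Return (action, typical upper bound in seconds) if `failed_az` is lost."""
    primary = next(i for i in instances if i.is_primary)
    if primary.az != failed_az:
        # The writer is unaffected, so no failover is required.
        return ("no_failover_needed", 0)

    # Replicas that live outside the failed AZ can be promoted.
    survivors = [i for i in instances if not i.is_primary and i.az != failed_az]
    if survivors:
        # Simplification: take the first survivor.
        return ("promote:" + survivors[0].name, PROMOTE_REPLICA_TYPICAL_MAX_S)

    # No replica available: a new primary has to be created from scratch.
    return ("recreate_primary", RECREATE_PRIMARY_TYPICAL_MAX_S)
```

With a writer in one AZ and a replica in another, losing the writer's AZ yields the fast "promote" path; with a single instance, the same event falls through to the slow "recreate" path.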
I have an AWS RDS instance deployed with Multi-AZ set as true.
As a disaster-management strategy in case the DB fails, is creating a read-replica in another AZ redundant?
If I create the read replica in another region (Outside the VPC), would that be redundant too?
As a disaster-management strategy in case the DB fails, is creating a read-replica in another AZ redundant?
Yes. RDS read-replicas are only for scaling read queries and they do not offer automatic failover.
Currently, our application runs in a single AWS Region/data center. Are there any strategies or principles we can follow to extend our application to another data center?
How can we quickly bring up the same, or a minimal, set of services (AWS stack) in another region?
Do any trade-offs need to be considered?
Current AWS resources in the existing DC: EC2, S3, DynamoDB, RDS, VPC, security groups, ELB, Lambda, API Gateway
Most of the services you have listed automatically run across multiple Availability Zones within the Region, so if an AZ fails, those services are not impacted. The exceptions are Amazon EC2 and Amazon RDS, which are covered below.
An Availability Zone consists of one or more physically separate data centers within the Region. It is sufficiently distant from the other AZs, and has separate networking, such that a failure in one AZ should not impact another AZ. Running your services across multiple AZs should be sufficient for high availability, without running across multiple Regions.
Each Amazon EC2 instance, however, resides in only one AZ since it is a virtual machine running on a single host computer. To make your application highly-available, you should:
Run EC2 across at least two AZs
Configure the Elastic Load Balancer to distribute traffic across all of those instances
This way, if an AZ fails and the EC2 instances in that AZ are not available, the app will continue to run in the other AZs.
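As a rough illustration of the "at least two AZs" rule, the following Python sketch checks whether a fleet would still have running instances after losing any single AZ. The function names, the `min_healthy` parameter, and the instance IDs in the usage note are hypothetical, and the instance-to-AZ mapping is assumed to have been collected elsewhere (for example from the EC2 console or API).

```python
from collections import Counter
from typing import Dict


def az_spread(instance_azs: Dict[str, str]) -> Counter:
    """Count instances per Availability Zone (instance id -> AZ name)."""
    return Counter(instance_azs.values())


def survives_single_az_loss(instance_azs: Dict[str, str], min_healthy: int = 1) -> bool:
    """True if at least `min_healthy` instances remain after losing any one AZ."""
    counts = az_spread(instance_azs)
    total = sum(counts.values())
    # Need at least two AZs, and enough instances left over whichever AZ fails.
    return len(counts) >= 2 and all(total - n >= min_healthy for n in counts.values())
```

For example, `survives_single_az_loss({"i-aaa": "us-east-1a", "i-bbb": "us-east-1b"})` is true, while the same two instances placed in one AZ fail the check.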
Amazon RDS offers multi-AZ capabilities if you choose 'multi-AZ' when launching the database. This will run a primary database in one AZ and a secondary database in another AZ. If the primary AZ fails, the secondary database will take over. The data is automatically replicated to the secondary database, so no data will be lost. (Extra charges apply for this feature.)
There is lots of information available online on this topic. Just search for "aws multiple AZs".
I guess the title is pretty objective, but just to clarify:
When you create an Aurora Database Instance, you are asked to name a Database Instance, a Database Cluster, and a Database (the Database name is optional, and no database is created if it is not specified). When you create another instance, you have to name both the instance and the cluster again, and neither name can match the first ones.
So, what's the difference between an Aurora Database Instance and an Aurora Database Cluster?
Also, can (and when do) you connect to each one of them?
Thanks!!
An Aurora cluster is simply a group of instances. By default, Aurora will create two instances in a cluster - one for reads and the other for writes. But you can change that configuration to be whatever you need.
For the names:
Database Cluster is the name of the cluster that holds the instances
Database Instances are the names of each instance in the cluster. By default, if you named the instances "mydb", AWS will append the AZ to the name. So it would become "mydb-us-east-1c" for example.
Database Name is the name of the initial database that will be created within Aurora. Think of the database as the place where you will add tables and data. If you do not specify a Database Name, you will just need to create your own, which is likely what you want to do anyway.
To connect, just point your application at the cluster endpoint. RDS will route traffic and handle failovers for you.
I will try to explain the setup in a simpler way. Hope this will answer all the questions in the end.
An Amazon Aurora DB cluster consists of one or more "DB instances" and a "cluster volume" that manages the data for those DB instances. Each Aurora DB cluster always has exactly one primary DB instance.
Unlike standard RDS instances, Aurora instances don't contain any data. They simply facilitate reading from and writing to the Aurora cluster. It's the Aurora cluster that contains the data. That is why Aurora snapshots are not considered "DB snapshots". Instead, they are considered "cluster snapshots".
There are two endpoints associated with any Aurora cluster:
Cluster endpoint (or writer endpoint)
Reader endpoint
A cluster endpoint (or writer endpoint) for an Aurora DB cluster connects to the current primary DB instance for that DB cluster. A reader endpoint for an Aurora DB cluster provides load-balancing support for read-only connections to the DB cluster.
If the cluster contains only a primary instance and no Aurora Replicas (which is a valid configuration), the reader endpoint connects to the primary instance. In that case, you can perform write operations through the reader endpoint.
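One way an application might use the two endpoints is to send read-only statements to the reader endpoint and everything else to the cluster (writer) endpoint. The Python sketch below shows that idea only; the `AuroraEndpoints` type, the `pick_endpoint` function, and the hostnames are hypothetical placeholders, and classifying a statement by its first keyword is deliberately naive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuroraEndpoints:
    writer: str  # cluster endpoint: always the current primary instance
    reader: str  # reader endpoint: load-balanced across Aurora Replicas


def pick_endpoint(endpoints: AuroraEndpoints, sql: str) -> str:
    """Route plain SELECTs to the reader endpoint, everything else to the writer.

    Naive on purpose: it only looks at the first keyword, so statements such
    as SELECT ... FOR UPDATE would need extra handling in real code.
    """
    words = sql.split(None, 1)
    first_word = words[0].upper() if words else ""
    return endpoints.reader if first_word == "SELECT" else endpoints.writer


# Hypothetical hostnames, for illustration only.
EXAMPLE = AuroraEndpoints(
    writer="mycluster.cluster-example.us-east-1.rds.amazonaws.com",
    reader="mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com",
)
```

Because the cluster endpoint follows the primary across failovers, the application does not need to know which instance is currently the writer.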
Links:
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.Endpoints.html
https://blog.skeddly.com/2018/01/rds-db-and-cluster-snapshots.html
I'm planning to run MySQL RDS.
My question is: is it possible to run MySQL in 3 availability zones, or is it limited to 2 AZs? If it's running in 3 AZs, does that mean I get better redundancy compared with running in two AZs?
Using the RDS Multi-AZ High Availability feature you can only have one standby replica:
In a Multi-AZ deployment, Amazon RDS automatically provisions and maintains a synchronous standby replica in a different Availability Zone. The primary DB instance is synchronously replicated across Availability Zones to a standby replica to provide data redundancy, eliminate I/O freezes, and minimize latency spikes during system backups. Running a DB instance with high availability can enhance availability during planned system maintenance, and help protect your databases against DB instance failure and Availability Zone disruption.
This is only a failover solution -- you can't use the standby for load balancing.
You can create additional Read Replicas that cover other availability zones and can be used to horizontally scale read traffic. But there are two caveats:
Unlike the standby, RDS cannot automatically fail over to a read replica when the primary DB goes down. You would need to implement this yourself using other tools like Route 53.
Read replicas use asynchronous replication, so they may lag behind the master. You need to determine if this is acceptable in your failover scenario.
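If you do build your own failover onto a read replica, the lag caveat can be turned into an explicit check before promotion. The Python sketch below assumes the per-replica lag in seconds has already been gathered (for example from the `ReplicaLag` CloudWatch metric); the function name, the replica names, and the threshold are hypothetical.

```python
from typing import Mapping, Optional


def pick_failover_replica(
    replica_lag_seconds: Mapping[str, float],
    max_lag_seconds: float,
) -> Optional[str]:
    """Return the least-lagged replica within the tolerated lag, or None.

    `max_lag_seconds` expresses how much recent data you are willing to lose,
    since anything not yet replicated is gone once the replica is promoted.
    """
    eligible = {
        name: lag
        for name, lag in replica_lag_seconds.items()
        if lag <= max_lag_seconds
    }
    if not eligible:
        return None
    return min(eligible, key=eligible.get)
```

Returning `None` when every replica is too far behind forces the caller to decide between waiting and accepting the data loss, rather than promoting blindly.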
For version cut-over, I am using CloudFormation to spin up a new infrastructure with a new VPC, Subnets, and Security Groups. I want a copy of my production database in the new VPC (same region). I do not want to use a Snapshot, because that would require me to take the app down for a while (after the snapshot is taken, any new data will be lost, so I have to shut down the app).
I want to create a read-replica into the new VPC/SecurityGroup/Subnets, and then when I am ready for cut-over I will promote that read replica. Is this possible?
The AWS documentation hints that creating a read replica across VPCs in the same region is not supported, but does not explicitly say so. Alternatively, I am open to moving the database after promotion.
Thanks
P.S. example of what I mean by "clue":
"Within a region, all cross-region replicas created from the same source DB instance must either be in the same Amazon VPC or be outside of a VPC."
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html
This almost answers my question... But what about replicas created in the same region as the source DB?
It is not supported.
You can create a read replica within the same VPC, then take a snapshot of the replica and restore it in the other VPC.
You can actually do this using external MySQL replication. You will need appropriate routing and security groups between your VPCs. As long as your VPC subnets can communicate with each other, create a replica, stop replication on it, and record the binlog position at which it stopped. Take a snapshot of the replica and use that to spin up a new RDS instance in the new VPC. Then configure the new instance to replicate from your old RDS instance as an external source, starting at the recorded binlog position. You've now got a master RDS instance in your new VPC, replicating from the old VPC.
This article covers it in easy to follow steps:
http://quiddle.net/post/78453641455/migrating-rds-from-ec2-to-vpc