Read replicas in RDS AWS - amazon-web-services

I am new to Amazon RDS. I have set up a DB instance in RDS and want to try the read replicas feature.
I have a few questions:
For what kind of applications are read replicas suitable?
Is data replicated to read replicas synchronously or asynchronously?
Are they a substitute for Multi-AZ deployments?
How are they better than master-slave or master-master replication in MySQL?
If we run replicas on EC2, will they work the same way as RDS read replicas?
Thanks in advance.

For what kind of applications are read replicas suitable?
They are best suited if your application:
Is read-intensive and is used by several read clients
Can tolerate a minor lag between data being written to the database and being replicated to the read replicas
Is data replicated to read replicas synchronously or asynchronously?
Replication is asynchronous, so expect a small replication lag.
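Because replication is asynchronous, an application may want to skip a replica that has fallen too far behind. The following is a minimal sketch of that idea, not an AWS API: the `Replica` class, the endpoint names, and the `max_lag` threshold are all hypothetical, and the lag values are assumed to come from your own monitoring (for example, the ReplicaLag metric that RDS publishes).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    endpoint: str        # hypothetical replica hostname
    lag_seconds: float   # last replication lag observed by your monitoring


def choose_read_endpoint(primary: str, replicas: List[Replica], max_lag: float) -> str:
    """Return the least-lagged replica within max_lag, else fall back to the primary."""
    fresh = [r for r in replicas if r.lag_seconds <= max_lag]
    if not fresh:
        # No replica is fresh enough, so read from the primary instead.
        return primary
    return min(fresh, key=lambda r: r.lag_seconds).endpoint


replicas = [
    Replica("replica-1.example.internal", 0.4),
    Replica("replica-2.example.internal", 12.0),
]
print(choose_read_endpoint("primary.example.internal", replicas, max_lag=5.0))
```

Falling back to the primary keeps reads correct at the cost of extra load on it, so the threshold is a trade-off you would tune for your application.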
Are they a substitute for Multi-AZ deployments?
Multi-AZ and read replicas complement each other; neither replaces the other. Multi-AZ is for high availability (set up out of the box by AWS), whereas a read replica exists purely to distribute load across database instances, improving read performance and keeping reads from becoming a bottleneck for writes. To make the best use of the setup, your application logic must send reads to the read replica and writes to the main instance.
People generally mix and match Multi-AZ and read replicas depending on the application and its load.
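The application-side routing described above could look roughly like this. It is only a sketch under stated assumptions: `EndpointRouter` and the endpoint names are hypothetical, and deciding by the first SQL keyword is a deliberate simplification (a real application would more likely route per use case or per transaction).

```python
READ_PREFIXES = ("select", "show", "describe", "explain")


class EndpointRouter:
    """Send read-only statements to replicas (round-robin) and everything else to the writer."""

    def __init__(self, writer, readers):
        self.writer = writer
        self.readers = list(readers)
        self._next = 0  # index of the next replica to use

    def endpoint_for(self, sql, in_transaction=False):
        statement = sql.lstrip().lower()
        # SELECT ... FOR UPDATE takes locks, so it is treated as a write.
        is_read = statement.startswith(READ_PREFIXES) and "for update" not in statement
        if in_transaction or not is_read or not self.readers:
            return self.writer
        endpoint = self.readers[self._next % len(self.readers)]
        self._next += 1
        return endpoint


router = EndpointRouter(
    writer="mydb.example.internal",  # hypothetical main instance endpoint
    readers=["mydb-replica-1.example.internal", "mydb-replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("UPDATE orders SET status = 'paid' WHERE id = 1"))
```

Statements inside a transaction stay on the writer so that a transaction never reads stale data from a lagging replica.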
How are they better than master-slave or master-master replication in MySQL?
Whether master-master or master-slave works better depends on several factors, such as the data, its volume, the mix of reads and writes, and the load. You need to test to see exactly how the system performs with each setup.
The main advantage of Multi-AZ and read replicas is that you offload database management, including supervising the replica setup and its health, to AWS instead of handling it yourself.
If we run replicas on EC2, will they work the same way as RDS read replicas?
This is essentially a corollary of the previous question. When you install a database on your own EC2 instance, you have to monitor and manage everything yourself: EC2 instance patches, database patches, replication setup, replication lag, and availability.
When you use RDS read replicas, AWS manages all of the above for you. Which option is best is your call, depending on your application's requirements for cost, availability, compliance, and so on.

Related

AWS Aurora - load balance read operation between read replica and writer instance

An Aurora cluster has two endpoints: one for writes and one for reads (the endpoint whose name contains -ro). Going through the documentation on connection management, I learned that the read-only endpoint balances connections between Aurora replicas. But it looks like it doesn't include the writer instance in this load balancing.
Is there a way to include the writer instance in the read-only (-ro) endpoint? In applications where 99% of the traffic is reads, having a cluster with a writer and one reader (for better availability in case the writer goes down) sounds like a waste of resources. In such a case the writer will be idle 99% of the time.
The same documentation mentions the possibility of creating a custom endpoint, but I'm wondering whether that's the only way to solve this problem.
Read-only instances in Aurora are mainly used as failover targets in Multi-AZ environments. So, if you choose this deployment, read-only instances are not a waste of resources.
However, you can connect only to your writer endpoint and direct both writes and reads to it from your application. The risk is that reads overwhelm the writer instance and degrade the performance of the 1% of traffic that is writes.
The main point is that if you have both reader and writer instances, you should make use of both, since you keep them anyway for high availability.
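If you do decide to send part of the read traffic to the writer, one client-side option is a weighted schedule over the two endpoints. This is only an illustration: the function name, endpoint names, and weights are hypothetical, and this is done in the application, not by Aurora.

```python
import itertools


def build_read_schedule(weights):
    """Cycle through endpoints in proportion to their positive integer weights.

    `weights` maps endpoint name -> weight, e.g. 3 reads on the reader for
    every 1 read on the writer.
    """
    slots = [endpoint for endpoint, weight in weights.items() for _ in range(weight)]
    if not slots:
        raise ValueError("at least one endpoint with a positive weight is required")
    return itertools.cycle(slots)


# Hypothetical endpoint names; the reader gets three reads for each one on the writer.
schedule = build_read_schedule({"reader-endpoint": 3, "writer-endpoint": 1})
for _ in range(8):
    print(next(schedule))
```

Keeping the writer's weight low limits the risk, mentioned above, of reads degrading write performance.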

AWS RDS Read Replica act as Failover Standby

I am currently assessing whether to use RDS MySQL Multi-AZ or Single-AZ with a read replica.
The considerations are budget and performance. Since Multi-AZ costs twice as much as Single-AZ and cannot offload read operations, Single-AZ with a read replica seems to be the logical choice.
However, I saw that a read replica can be manually 'promoted' to master if the master fails. Is there a way to automate this?
Note: There was a similar question but it did not address my question:
Read replicas in RDS AWS
I think you are mixing up these features a bit. Let me help. You can launch AWS RDS in Multi-AZ deployment mode. In this case, AWS will do the following:
It will allocate a DNS record for you. This DNS record is the single entry point to your master database, which, let's assume, is currently active and able to serve connections.
If the master fails for any reason, AWS simply repoints the address behind the DNS record (quite fast, within 1-2 minutes) to your standby, which is located in another AZ.
When the master becomes available again, the standby, which has been serving writes in the meantime, needs to synchronize everything with the master. You do not need to take care of this; AWS manages it for you.
In the case of a read replica:
AWS will allocate two different DNS records: one for the master and another for the read replica. The read replica can be in the same AZ as the master, or even in another Region.
You can, and must, choose in your application which DNS name to use in each scenario. Most probably you will have two different connection pools: one for the master and another for the read replica. Replication itself is asynchronous.
With a read replica, AWS handles replication from the master on its own, so you do not need to worry about it. But because the replica is meant to be read-only and should not accept any write traffic, AWS does not solve synchronization from the read replica back to the master.
Addressing your question directly:
Technically, you can try to make your read replica serve as a failover target, but then you will have to implement a custom solution for synchronizing with the master, because your read replica will have received some number of writes while the master was down. AWS does not solve this synchronization problem for you.
Regarding Multi-AZ: you cannot use your Multi-AZ standby as a read replica, since AWS does not support that. I highly recommend checking out this documentation. I think it will help you sort things out. Have a nice day!
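For the automation part of the question, a custom solution would need at least a watchdog that decides when to promote. The sketch below shows only that decision logic; `watch_and_promote` and its parameters are hypothetical. The health results are assumed to come from your own probe of the master, and `promote` is assumed to wrap the actual promotion call (for example, boto3's `promote_read_replica`), which is not shown here.

```python
from typing import Callable, Iterable


def watch_and_promote(
    health_checks: Iterable[bool],
    promote: Callable[[], None],
    failures_needed: int = 3,
) -> bool:
    """Call promote() once after `failures_needed` consecutive failed checks.

    Returns True if promotion was triggered, False if the checks ran out first.
    """
    consecutive = 0
    for healthy in health_checks:
        # A single healthy check resets the counter, so brief blips are ignored.
        consecutive = 0 if healthy else consecutive + 1
        if consecutive >= failures_needed:
            promote()
            return True
    return False


# Simulated probe results: the master recovers once, then stays down.
results = [True, False, False, True, False, False, False]
promoted = watch_and_promote(results, lambda: print("promoting read replica"))
print(promoted)
```

Promotion turns the replica into a standalone instance with its own endpoint, so the application would also have to switch its write connection to that endpoint afterwards.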

Difference between "Multi-AZ Deployment" and "Read Replica Version Multi-AZ Deployment"

Summary
Amazon RDS has two main types of replicas, the Multi-AZ replica and the read replica, and it is easy to find how they differ.
However, read replicas have also supported Multi-AZ deployment since January 2018.
What is the main difference between "Multi-AZ Deployment" and "Read Replica Version Multi-AZ Deployment"?
The two ways to add a Multi-AZ deployment to an existing database are as follows:
Situation 1: (Original, Multi-AZ Deployment)
Instance Action
→ Modify
→ specify the "Multi-AZ deployment" option
Situation 2: (Read Replica Version Multi-AZ Deployment)
Instance Action
→ Create read replica
→ specify the "Multi-AZ deployment" option
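For reference, the two console paths above roughly correspond to two different RDS API calls. The sketch below only builds the request parameters so that it runs without AWS access; the helper function names and instance identifiers are hypothetical, and the commented boto3 calls at the end assume boto3 is installed and credentials are configured.

```python
def multi_az_on_existing_instance(instance_id):
    """Situation 1: parameters for modifying an existing instance to be Multi-AZ."""
    return {
        "DBInstanceIdentifier": instance_id,
        "MultiAZ": True,
        # Apply now rather than waiting for the next maintenance window.
        "ApplyImmediately": True,
    }


def multi_az_read_replica(source_id, replica_id):
    """Situation 2: parameters for creating a read replica that is itself Multi-AZ."""
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_id,
        "MultiAZ": True,
    }


print(multi_az_on_existing_instance("mydb"))
print(multi_az_read_replica("mydb", "mydb-replica"))

# With boto3 these dictionaries would be passed as keyword arguments:
#   rds = boto3.client("rds")
#   rds.modify_db_instance(**multi_az_on_existing_instance("mydb"))
#   rds.create_db_instance_read_replica(**multi_az_read_replica("mydb", "mydb-replica"))
```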
An RDS read replica instance is an asynchronous read-only replica of an upstream primary ("master") database instance. It can be used by your application for any query that does not require changing data, thus relieving load from the master. If the replica crashes or fails, it has no impact on the master but the replica itself can no longer handle any traffic.
Multi-AZ means the database instance has a standby spare server machine and spare hard drive in a different availability zone of the same region. This is a synchronous replica, but cannot be accessed by you. If the active server fails, the spare server takes over and starts handling traffic more quickly than would be possible without the spare.
Multi-AZ is a deployment strategy for higher reliability.
It reduces the downtime required for version upgrades, and reduces the impact of backup snapshots and creation of replicas, since snapshots can be done from the spare (by the service). It doubles the cost of the instance because of the hot standby capacity it provides.
Multi-AZ is typically used only on the master instance, for fast recovery.
Historically, this was the only variant of Multi-AZ, but a Multi-AZ read replica is now possible, and is what it sounds like: a replica with Multi-AZ. It will recover more quickly from faults and failures because it has spare hardware. The active and spare are synchronous replicas of each other but are still asynchronous replicas of the master, as all non-Aurora replicas are in RDS/MySQL.
Combining Read Replicas with Multi-AZ enables you to build a resilient disaster recovery strategy and simplify your database engine upgrade process.
Amazon RDS Read Replicas enable you to create one or more read-only copies of your database instance within the same AWS Region or in a different AWS Region. Updates made to the source database are then asynchronously copied to your Read Replicas. In addition to providing scalability for read-heavy workloads, Read Replicas can be promoted to become a standalone database instance when needed.
https://aws.amazon.com/about-aws/whats-new/2018/01/amazon-rds-read-replicas-now-support-multi-az-deployments/
In summary, Multi-AZ on the master gets you one server with an invisible hot spare that is used for failure recovery but is not a usable database replica. It is a good strategy for resiliency.
Multi-AZ on a replica is an expensive way of speeding recovery time on a crashed instance. It is a separate server, so can be accessed by you, but so can a non-Multi-AZ read replica.
A multi-AZ deployment has a Master database in one AZ and a Standby (or Secondary) database in another AZ. Only the Master database serves traffic. If the Master fails, then the Secondary takes over.
A Read Replica is a read-only copy of the database. It is actively running and apps can use it for read-only queries. A Read Replica can be in a different AZ or even in a different region.
In terms of high availability, Multi-AZ offers more than a read replica. Because Multi-AZ provides a backup writer in another AZ, neither reads nor writes are affected when a single AZ fails.

Does AWS take down individual Availability Zones (AZs) or whole regions for maintenance?

AWS has a maintenance window for each region:
https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/maintenance-window.html
But I could not find any documentation about how it works with multiple AZs in the same region.
I have a Redis cache configured, with a replica in a different AZ in the same region. The whole purpose of placing the replica in a different AZ is that if one AZ is unavailable, requests are served from the other.
When AWS does maintenance, does it take down the whole region or an individual Availability Zone?
You should read the FAQ on ElastiCache maintenance: https://aws.amazon.com/elasticache/elasticache-maintenance/
It says that if you have a Multi-AZ deployment, AWS takes down the instances one at a time, triggering a failover to the read replica, and then creates new instances before taking down the rest, so you should not experience any interruption in service.
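Even with a failover like this, a client can see a few failed calls while connections move to the new primary, so it helps to retry with backoff. The helper below is a generic sketch, not part of any Redis client library: `call_with_retry` is a hypothetical name, and `operation` stands for whatever cache call your client makes, assumed here to raise `ConnectionError` while the node is unreachable.

```python
import time


def call_with_retry(operation, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries, let the caller handle it
            # Wait 0.1s, 0.2s, 0.4s, ... between attempts (with the default base_delay).
            sleep(base_delay * (2 ** attempt))


# Simulated cache call that fails twice, as it might during a failover.
state = {"calls": 0}


def flaky_get():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("node is being replaced")
    return "cached-value"


print(call_with_retry(flaky_get, sleep=lambda seconds: None))
```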
Thanks #morras for the link above, which explains how ElastiCache behaves during the maintenance window. Below are three questions I have taken from that page, along with their answers.
1. How long does a node replacement take?
A replacement typically completes within a few minutes. The replacement may take longer in certain instance configurations and traffic patterns. For example, Redis primary nodes may not have enough free memory, and may be experiencing high write traffic. When an empty replica syncs from this primary, the primary node may run out of memory trying to address the incoming writes as well as sync the replica. In that case, the master disconnects the replica and restarts the sync process. It may take multiple attempts for the replica to sync successfully. It is also possible that the replica may never sync if the incoming write traffic continues to remain high.
Memcached nodes do not need to sync during replacement and are always replaced fast irrespective of node sizes.
2. How does a node replacement impact my application?
For Redis nodes, the replacement process is designed to make a best effort to retain your existing data and requires successful Redis replication. For single node Redis clusters, ElastiCache dynamically spins up a replica, replicates the data, and then fails over to it. For replication groups consisting of multiple nodes, ElastiCache replaces the existing replicas and syncs data from the primary to the new replicas. If Multi-AZ or Cluster Mode is enabled, replacing the primary triggers a failover to a read replica. If Multi-AZ is disabled, ElastiCache replaces the primary and then syncs the data from a read replica. The primary will be unavailable during this time.
For Memcached nodes, the replacement process brings up an empty new node and terminates the current node. The new node will be unavailable for a short period during the switch. Once switched, your application may see performance degradation while the empty new node is populated with cache data.
3. What best practices should I follow for a smooth replacement experience and minimize data loss?
For Redis nodes, the replacement process is designed to make a best effort to retain your existing data and requires successful Redis replication. We try to replace just enough nodes from the same cluster at a time to keep the cluster stable. You can provision primary and read replicas in different availability zones. In this case, when a node is replaced, the data will be synced from a peer node in a different availability zone. For single node Redis clusters, we recommend that sufficient memory is available to Redis, as described here. For Redis replication groups with multiple nodes, we also recommend scheduling the replacement during a period with low incoming write traffic.
For Memcached nodes, schedule your maintenance window during a period with low incoming write traffic, test your application for failover and use the ElastiCache provided "smarter" client. You cannot avoid data loss as Memcached has data purely in memory.

What is the best way to automatically scale AWS RDS?

I want to scale AWS RDS automatically with scripts, based on metric monitoring.
RDS doesn't really do this for read-write instances.
Multi-AZ read-write database copies are intended for failover from primary to secondary if there is an availability problem. They don't address performance.
Read replicas can be used to increase performance, but they are read-only.
It might be possible to look at a load metric and use a CloudWatch alarm to start an extra read replica. Read replicas can be used via an ELB or NLB.
But this probably isn't a good idea. While an existing RDS instance is creating a read replica, its performance is degraded. RDS read replicas are also quite slow to come up and become available, so this is unlikely to respond well to transient demand.
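If you still want to experiment with a metric-driven script, it should at least ignore short spikes, since a new replica will not be ready in time to help with them. A minimal sketch of such a check follows; the function name, threshold, and period count are arbitrary assumptions, and the CPU samples are assumed to be fetched from your monitoring elsewhere.

```python
def should_add_replica(cpu_samples, threshold=80.0, sustained_periods=5):
    """Return True only if the most recent `sustained_periods` samples are all at or above threshold."""
    if len(cpu_samples) < sustained_periods:
        return False  # not enough history to call the load sustained
    recent = cpu_samples[-sustained_periods:]
    return all(sample >= threshold for sample in recent)


print(should_add_replica([35, 40, 95, 38, 36, 41]))   # a single spike
print(should_add_replica([60, 85, 88, 91, 90, 87]))   # sustained load
```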
You can make an API call to Modify an RDS Instance, including changing the instance class.
Amazon RDS will provision a new instance of the desired class and will then re-point the Endpoint to the new instance. Existing connections will be terminated, but applications can reconnect and all the data will be there.
Rather than scaling the RDS instance, you could always consider a caching layer, such as Amazon ElastiCache that supports Redis and Memcached. Most applications are read-heavy, which is ideal for using a cache. This can significantly improve application performance without having to scale the database.
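As a rough illustration of the caching idea, here is a small read-through cache with a time-to-live. It is a sketch only: an in-process dictionary stands in for Redis or Memcached, and `ReadThroughCache`, `load_from_db`, and the TTL value are hypothetical names and settings.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory and hit the database only on a miss or expiry."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit, no database query
        value = self._load(key)  # cache miss or expired entry
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key after writing to the database so the next read is fresh."""
        self._entries.pop(key, None)


def load_user(user_id):
    # Stand-in for a real database query.
    print("querying database for", user_id)
    return {"id": user_id}


cache = ReadThroughCache(load_user, ttl_seconds=30.0)
cache.get(42)  # queries the database
cache.get(42)  # served from the cache
```

The TTL bounds how stale a cached value can get, which is the same kind of trade-off as tolerating replication lag on a read replica.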
Put simply, this is possible only with Aurora (MySQL 5.7-compatible) instances, which provide an option to auto-scale based on CloudWatch metric conditions such as CPU utilization.