What is the best way to auto scale AWS RDS? - amazon-web-services

I want to autoscale AWS RDS with scripts, based on metric monitoring.

RDS doesn't really do this for read-write workloads.
Multi-AZ read-write database copies are intended for failover from primary to secondary if there is an availability problem. They don't address performance.
Read replicas can be used to increase performance, but they are read-only.
It might be possible to watch a load metric and use a CloudWatch alarm to start an extra read replica. Read replicas can be used via an ELB or NLB.
But this probably isn't a good idea. While an existing RDS instance is creating a read replica, its performance is degraded. RDS read replicas are also quite slow to come up and become available, so this approach is unlikely to respond well to transient demand.
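For reference, the alarm-driven idea above could be sketched roughly as follows. The helpers only build keyword arguments for the boto3 calls `put_metric_alarm` (CloudWatch) and `create_db_instance_read_replica` (RDS); the alarm name, the SNS topic, and the thresholds are assumptions for illustration, and a script subscribed to the topic would still have to issue the replica call itself.

```python
def build_cpu_alarm(db_instance_id, sns_topic_arn, threshold=80.0, periods=3):
    """Keyword arguments for CloudWatch put_metric_alarm on RDS CPU usage."""
    return {
        "AlarmName": f"{db_instance_id}-high-cpu",  # hypothetical naming scheme
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,                  # seconds per evaluation period
        "EvaluationPeriods": periods,   # require sustained load, not a spike
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def build_replica_request(source_instance_id, replica_id):
    """Keyword arguments for RDS create_db_instance_read_replica."""
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_instance_id,
    }


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**build_cpu_alarm("prod-db", topic_arn))
#   boto3.client("rds").create_db_instance_read_replica(
#       **build_replica_request("prod-db", "prod-db-replica-2"))
```

Requiring several evaluation periods only filters out short spikes; it does not remove the slow replica start-up described above.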

You can make an API call to Modify an RDS Instance, including changing the instance class.
Amazon RDS will provision a new instance of the desired class and will then re-point the Endpoint to the new instance. Existing connections will be terminated, but applications can reconnect and all the data will be there.
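A minimal sketch of that API call, assuming boto3 as the SDK. `modify_db_instance` is the RDS client operation; the instance identifier and target class in the usage comment are placeholders.

```python
def build_resize_request(db_instance_id, new_instance_class, apply_immediately=True):
    """Keyword arguments for RDS modify_db_instance to change the instance class."""
    return {
        "DBInstanceIdentifier": db_instance_id,
        "DBInstanceClass": new_instance_class,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


def resize_instance(rds_client, db_instance_id, new_instance_class):
    """Ask RDS to move the instance to a different class."""
    return rds_client.modify_db_instance(
        **build_resize_request(db_instance_id, new_instance_class)
    )


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   resize_instance(boto3.client("rds"), "prod-db", "db.m5.xlarge")
```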
Rather than scaling the RDS instance, you could always consider a caching layer, such as Amazon ElastiCache, which supports Redis and Memcached. Most applications are read-heavy, which is ideal for using a cache. This can significantly improve application performance without having to scale the database.
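To illustrate how a cache takes reads off the database, here is a small cache-aside sketch. It uses an in-process dictionary as a stand-in for Redis or Memcached, and `load_from_db` is a hypothetical function that runs the real query; with ElastiCache the dictionary would be replaced by calls to a cache client.

```python
import time


class CacheAside:
    """In-process stand-in for a cache server, using the cache-aside pattern."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db   # hypothetical function that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}          # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # cache hit: no database round trip
        value = self._load(key)     # miss or expired entry: read from the database
        self._entries[key] = (now + self._ttl, value)
        return value
```

The TTL bounds how stale a cached value can get; writes that must be visible immediately would also need to invalidate the affected keys.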

Put simply, this is possible only with Aurora (MySQL 5.7-compatible) RDS instances, which provide an option to auto-scale based on CloudWatch metric conditions, e.g. CPU utilization.
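As a rough sketch of what that option looks like through the API: Aurora replica auto scaling is configured through the Application Auto Scaling service, using `register_scalable_target` and `put_scaling_policy`. The cluster name, policy name, capacity limits, and CPU target below are assumptions for illustration.

```python
def build_scalable_target(cluster_id, min_replicas=1, max_replicas=4):
    """Keyword arguments for Application Auto Scaling register_scalable_target."""
    return {
        "ServiceNamespace": "rds",
        "ResourceId": f"cluster:{cluster_id}",
        "ScalableDimension": "rds:cluster:ReadReplicaCount",
        "MinCapacity": min_replicas,
        "MaxCapacity": max_replicas,
    }


def build_cpu_tracking_policy(cluster_id, target_cpu_percent=60.0):
    """Keyword arguments for Application Auto Scaling put_scaling_policy."""
    return {
        "PolicyName": f"{cluster_id}-reader-cpu",  # hypothetical policy name
        "ServiceNamespace": "rds",
        "ResourceId": f"cluster:{cluster_id}",
        "ScalableDimension": "rds:cluster:ReadReplicaCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
            },
        },
    }


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   aas = boto3.client("application-autoscaling")
#   aas.register_scalable_target(**build_scalable_target("my-aurora-cluster"))
#   aas.put_scaling_policy(**build_cpu_tracking_policy("my-aurora-cluster"))
```

Note that this adds and removes Aurora Replicas, so it scales reads only.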

Related

How to set up an AWS RDS standalone instance without traffic from the actual RDS cluster

We need to know the best options for setting up an AWS RDS instance (Aurora MySQL) that is standalone and does not get traffic from the actual RDS cluster.
The requirement is for our data team to write analytical queries without impacting the actual application and DB performance. Hence we need a DB that always has near-live data, but that live traffic and the application do not connect to.
We need to know which fits better: a DB clone, AWS pilot light, AWS warm standby, AWS hot standby, or a Multi-AZ configuration.
Kindly let us know which one would fit our requirement better.
So far we have read about the 3 options below:
1. Amazon Aurora DB clone: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Managing.Clone.html
2. AWS pilot light, AWS warm standby, or AWS hot standby: https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-iii-pilot-light-and-warm-standby/
3. Multi-AZ configuration: we can create a new instance in a new AZ, so that this instance will have a different host (kind of a failover strategy). Traffic to this instance would come from our queries and not from the live prod application, unless there is a failover.
Option 1: the Aurora cloning documentation lists this use case:
"Run workload-intensive operations, such as exporting data or running analytical queries on the clone."
...which seems to be your use case here.
Just be aware that the clone will not see any changes made to the original data after the clone is created, so you will need to periodically delete and re-clone to get the updated data.
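The periodic re-clone could be scripted roughly as below. In the API, an Aurora clone is created with `restore_db_cluster_to_point_in_time` using `RestoreType="copy-on-write"`; the cluster identifiers are placeholders. The sketch covers only the clone request: deleting the previous clone and adding a DB instance to the new clone cluster are left out.

```python
def build_clone_request(source_cluster_id, clone_cluster_id):
    """Keyword arguments for RDS restore_db_cluster_to_point_in_time (clone)."""
    return {
        "SourceDBClusterIdentifier": source_cluster_id,
        "DBClusterIdentifier": clone_cluster_id,
        "RestoreType": "copy-on-write",   # clone instead of a full copy
        "UseLatestRestorableTime": True,  # clone the most recent state
    }


def create_clone(rds_client, source_cluster_id, clone_cluster_id):
    """Create a fresh clone cluster; it has no DB instances until one is added."""
    return rds_client.restore_db_cluster_to_point_in_time(
        **build_clone_request(source_cluster_id, clone_cluster_id)
    )


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   create_clone(boto3.client("rds"), "prod-cluster", "analytics-clone")
```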
Regarding option 2: I wrote those blog posts, and I do not think that approach suits your use case. That approach is for disaster recovery.
Option 3 may work, with a modification: the concept is to create an Aurora Replica, which, as you say, is a separate instance. The problem is that the reader endpoint used by your production workload may also route connections to that instance, which is not what you want.
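One possible way around the reader-endpoint problem, not mentioned in the answer itself, is an Aurora custom endpoint that excludes the analytics replica, with production reads pointed at that custom endpoint instead of the default reader endpoint. The sketch below builds the arguments for the RDS `create_db_cluster_endpoint` call; all identifiers are placeholders.

```python
def build_reader_endpoint_request(cluster_id, endpoint_id, excluded_instance_ids):
    """Keyword arguments for RDS create_db_cluster_endpoint (custom reader endpoint)."""
    return {
        "DBClusterIdentifier": cluster_id,
        "DBClusterEndpointIdentifier": endpoint_id,
        "EndpointType": "READER",
        # Instances listed here never receive connections from this endpoint.
        "ExcludedMembers": list(excluded_instance_ids),
    }


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   boto3.client("rds").create_db_cluster_endpoint(
#       **build_reader_endpoint_request("prod-cluster", "prod-readers", ["analytics-replica"]))
```

The data team would then connect to the analytics replica through its instance endpoint.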
EDIT: Adding new option 4
Option 4. Check out Amazon Aurora zero-ETL integration with Amazon Redshift. This zero-ETL integration also enables you to analyze data from multiple Aurora database clusters in an Amazon Redshift cluster.

Multi region EC2 & RDS replication from Region A to various other regions

Our current setup, consisting of 2x EC2 instances and an RDS (read/write) database, is in the Mumbai region. However, I would like to copy everything (2x EC2 and RDS (R/W)) across to Sydney and to other regions.
Ideally I would like to replicate the contents in those instances as well.
Does anyone know a quick and easy way of doing this?
Edit 25/01/2019:
However, I would like to copy everything, including whatever is inside the instances (2x EC2s and the RDSs).
Edit 29/01/2019:
The purpose is to "scale/expand out". I want to have the same infrastructure replicated 1-to-1 (exactly/identically) across various regions.
It is simple!
- For EC2: create an AMI of those instances, then right-click the AMI you've just created and choose "Copy AMI" to copy it to the designated region.
- For RDS: if you just want to copy the data to another region, take a snapshot and copy that snapshot to the destination region.
- For RDS: if you want the RDS instance to replicate to another region continuously, create a read replica from your RDS instance.
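The same three steps can also be scripted. The sketch below builds arguments for the boto3 calls `copy_image` (EC2), `copy_db_snapshot` (RDS), and `create_db_instance_read_replica` (RDS); each call is made with a client created in the destination region. The region codes match the question (Mumbai is `ap-south-1`, Sydney is `ap-southeast-2`), while the names, image ID, and ARNs are placeholders.

```python
SOURCE_REGION = "ap-south-1"     # Mumbai
TARGET_REGION = "ap-southeast-2" # Sydney


def build_ami_copy(source_image_id, name):
    """Keyword arguments for EC2 copy_image, called in the target region."""
    return {"Name": name, "SourceImageId": source_image_id, "SourceRegion": SOURCE_REGION}


def build_snapshot_copy(source_snapshot_arn, target_snapshot_id):
    """Keyword arguments for RDS copy_db_snapshot, called in the target region."""
    return {
        "SourceDBSnapshotIdentifier": source_snapshot_arn,  # cross-region copies need the ARN
        "TargetDBSnapshotIdentifier": target_snapshot_id,
        "SourceRegion": SOURCE_REGION,
    }


def build_cross_region_replica(source_instance_arn, replica_id):
    """Keyword arguments for RDS create_db_instance_read_replica in the target region."""
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_instance_arn,
        "SourceRegion": SOURCE_REGION,
    }


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   ec2 = boto3.client("ec2", region_name=TARGET_REGION)
#   rds = boto3.client("rds", region_name=TARGET_REGION)
#   ec2.copy_image(**build_ami_copy("ami-...", "web-copy"))
#   rds.copy_db_snapshot(**build_snapshot_copy(snapshot_arn, "prod-db-copy"))
#   rds.create_db_instance_read_replica(**build_cross_region_replica(db_arn, "prod-db-syd"))
```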
The option for replicating the environment depends on how much downtime you can tolerate.
If you are okay with downtime:
1. Copy the AMI of the EC2 instance and the snapshot of the RDS instance to the other regions.
2. Bring up your new environment.
This is perfect for a non-critical workload.
If this is a critical application:
1. Copy the AMI of the EC2 instance (I am assuming these are your web/app instances). For real-time replication, use rsync or robocopy, or a solution like CloudEndure.
2. Create a new RDS instance in Sydney.
3. Use the DMS migration tool to create a source and target relationship.
4. Once the two are in sync, cut off the relationship and bring up the new environment in Sydney.
As suggested by previous answers, for EC2 you can create AMIs and then copy them to a different region.
For RDS, you can create read replicas (and read replicas of read replicas, but beware of latency). Read replicas are mainly used to improve the read performance of your app.
You can also create a Multi-AZ standby, which will act as a disaster recovery site. However, note that the Multi-AZ standby is only used in case of a failover. Moreover, Multi-AZ involves synchronous data copy whereas read replicas are asynchronous, so read replicas can exhibit eventually consistent behavior.
But the real question here is - What are you trying to achieve?
Are you trying to "scale out" your infrastructure to support huge traffic to your application? Or are you simply trying to setup disaster recovery (DR)?
If your answer is DR, then the approach is pretty straightforward with Multi-AZ and EC2 instance snapshots. But if the answer is scaling out and performance, you really need to be thinking of better strategies, such as using CloudFront (CDN) if it is a web app, using an ElastiCache in-memory cache for frequently read data, using RDS read replicas, or using Elastic Load Balancers with dynamic/step scale-out and scale-in. Other methods would be to evaluate the type of RDS storage subsystem used (i.e. Provisioned IOPS vs. General Purpose SSD), to check whether there are any NAT "instance" bottlenecks in your VPC, and so on.
It may be tempting to spin up all these redundant copies of EC2 AMIs or RDS read replicas with the click of a button, but you really need to think about the cost you are going to incur on a monthly basis for completely unused resources.

How much outage time is involved in resizing an AWS RDS Multi-AZ instance?

We're implementing new SQL Server databases in AWS. Our cloud engineer recommended RDS, despite the known downsides (inability to restore a single database or copy out a single backup, inability to resize the instance or reconfigure storage without downtime). Meanwhile, if we implement on EC2 we could get the benefit of zero-downtime upgrades.
In further reading, it seems like Multi-AZ may avoid downtime when resizing (see samples below) but the documentation is vague.
"Running a DB instance with high availability can enhance availability during planned system maintenance"
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.MultiAZ.html
"There is minimal downtime when you are scaling up on a Multi-AZ environment because the standby database gets upgraded first, then a failover will occur to the newly sized database."
https://aws.amazon.com/blogs/database/scaling-your-amazon-rds-instance-vertically-and-horizontally/
My question: does using Multi-AZ in RDS allow zero downtime when adding storage? If not, how much outage time would we experience when reconfiguring a Multi-AZ instance?
Multi-AZ doesn't do zero downtime, but we generally see less than a minute (with MySQL).
I would just create a new multi-AZ db from a snapshot, and test it to see. It shouldn't cost more than a buck or two to find out.
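If you run that test, a small probe can put a number on the outage. The sketch below tries a TCP connection to the database endpoint at intervals and then reports the longest stretch of failed checks. The host, port, and sampling interval are up to you; a TCP connect only shows that the port is reachable, so a real test might run a trivial query instead.

```python
import socket
import time


def probe(host, port, timeout_seconds=1.0):
    """Return True if a TCP connection to the endpoint succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """samples: list of (timestamp_seconds, ok). Returns the longest
    span from a first failed check to the next successful one."""
    longest = 0.0
    down_since = None
    for timestamp, ok in samples:
        if not ok and down_since is None:
            down_since = timestamp
        elif ok and down_since is not None:
            longest = max(longest, timestamp - down_since)
            down_since = None
    return longest


def sample_endpoint(host, port, duration_seconds, interval_seconds=1.0):
    """Collect probe results while the resize is in progress."""
    samples = []
    end = time.monotonic() + duration_seconds
    while time.monotonic() < end:
        samples.append((time.monotonic(), probe(host, port)))
        time.sleep(interval_seconds)
    return samples
```

The measured outage is only as precise as the sampling interval, and an outage still in progress when sampling stops is not counted.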

Why does AWS RDS Aurora have the option of "Multi-AZ Deployment" when it does replication across different zones already by default?

When launching an Aurora instance I have the option of "Multi-AZ Deployment", which it describes as "Specifies if the DB Instance should have a standby deployed in another Availability Zone."
However the Aurora documentation states that Aurora already automatically spreads the database across different availability zones?
Additionally, what is the difference between an Aurora Multi-AZ standby and an ordinary Aurora replica? Is it that an ordinary replica can be read from, increasing performance, whereas a standby cannot be read from?
Aurora replicates your data across three availability zones, at the storage layer... but the database server instance, itself, is still a virtual machine running on a single physical machine that is located in a single availability zone.
The Aurora storage layer is outside that instance, and is able to let access continue uninterrupted without data loss, even in the event of the loss of up to two AZs, but the loss of the zone containing the db instance will still cause an outage for you, if you only have a single Aurora instance in your cluster (1 master, 0 replicas). Loss of an entire availability zone is one of those things that is highly improbable but not impossible. Your db instance is still a single point of failure when you only have one.
Multi-AZ makes allowance for a complete redundant database instance, in a different AZ, which will automatically take over for the primary within one minute, if it works as designed, in case of the loss of the AZ hosting the primary instance or a catastrophic failure of the primary instance. It's a second virtual machine, on a second physical machine, in a second availability zone. It's always running, but you can't access it. It's in the background, managed and monitored by the RDS infrastructure, but it is only accessible to you in the case of primary instance failure. The secondary machine can also be used to reduce downtime in the event of a software upgrade or maintenance event on the primary. When failover occurs, if you are using DNS to connect to your database (as you should), you'll find that the DNS entry is automatically pointed to the secondary.
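Because failover repoints the DNS entry, the application side mostly needs to reconnect using the endpoint hostname instead of a cached IP address. A minimal retry sketch follows; `connect` is a hypothetical zero-argument callable that opens a new connection through the endpoint hostname, and the attempt count and delay are arbitrary.

```python
import time


def connect_with_retry(connect, attempts=10, delay_seconds=2.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    Catches OSError for illustration; real database drivers raise their
    own exception types, which should be caught instead.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            # Resolving the hostname on every attempt picks up the new DNS target.
            return connect()
        except OSError as error:
            last_error = error
            if attempt < attempts - 1:
                sleep(delay_seconds)
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```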
Contrast this to a read replica, which is accessible all the time and can thus provide a significant performance benefit, by allowing the offloading of reads. Failing over to a replica involves promoting it to become a standalone master (which permanently detaches it from its own former master) and reconfiguring your application to use the alternate endpoint. This, of course, is still faster than recovering from a failure in the master by using a point-in-time snapshot to create a replacement master instance.
https://aws.amazon.com/rds/details/multi-az/
Storage in Aurora is replicated across three availability zones. The database head node is a single instance. So, while your data is spread across multiple targets, the head node is not.
When you enable a multi-AZ deployment, we create an Aurora read replica that is available as a failover target. Any Aurora read replicas you create (up to a max of 15 at this time) are also available as failover targets.
There isn't any meaningful difference between Multi-AZ and other Aurora replicas. This is primarily a simplification in the user interface for customers accustomed to using Multi-AZ for other RDS engines.
AWS Management console.
The answer to this is straightforward.
You can enable Multi-AZ in the management console or ignore it. Either way, the shared storage for Amazon Aurora spans three AZs, as that is a feature of Amazon Aurora. However, if you choose the Multi-AZ option, you will also have your Amazon Aurora instances in multiple AZs.
Thus you should choose the Amazon console image option

How to scale horizontally Amazon RDS instance?

How do I scale an Amazon RDS instance horizontally? EC2 with a load balancer plus autoscaling is extremely easy to implement, but what if I want to scale Amazon RDS?
I can upgrade my RDS instance to a more powerful instance class, or I can create a read replica and direct SELECT queries to it. But in this mode I don't scale anything if I have a read-oriented web application. So, can I create RDS read replicas with autoscaling and balance them with a load balancer?
You can use HAProxy to load-balance Amazon RDS read replicas. Check this: http://harish11g.blogspot.ro/2013/08/Load-balancing-Amazon-RDS-MySQL-read-replica-slaves-using-HAProxy.html
Hope this helps.
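To show what balancing reads across replicas amounts to, here is a client-side round-robin sketch. It is not HAProxy and not taken from the linked post; the endpoint names are placeholders, and health checking is reduced to manual `mark_down` and `mark_up` calls.

```python
import itertools


class ReplicaPool:
    """Round-robin selection over read replica endpoints, skipping ones marked down."""

    def __init__(self, endpoints):
        self._endpoints = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one replica endpoint is required")
        self._healthy = set(self._endpoints)
        self._cycle = itertools.cycle(self._endpoints)

    def mark_down(self, endpoint):
        self._healthy.discard(endpoint)

    def mark_up(self, endpoint):
        if endpoint in self._endpoints:
            self._healthy.add(endpoint)

    def next_endpoint(self):
        # Look at each endpoint at most once per call.
        for _ in range(len(self._endpoints)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy read replicas")
```

Writes still have to go to the primary instance; only SELECT traffic would be routed through a pool like this.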
Note that RDS covers several database engines: MySQL, PostgreSQL, Oracle, MSSQL.
Generally speaking, you can scale up (larger instance), use read-only databases, or shard. If you are using MySQL, look at AWS Aurora. Think about using the database optimally, perhaps combining it with Memcached or Redis (both available under AWS ElastiCache). Think about using a search engine (Lucene, Elasticsearch, CloudSearch).
Some general resources:
http://highscalability.com/
https://gigaom.com/2011/12/06/facebook-shares-some-secrets-on-making-mysql-scale/
If you are using PostgreSQL and have a workload that can be partitioned by a certain key and doesn't require complex transactions, then you could take a look at the pg_shard extension. pg_shard lets you create distributed tables that are sharded across multiple servers. Queries on the distributed table will be transparently routed to the right shard.
Even though RDS doesn't have the pg_shard extension installed, you can set up one or more PostgreSQL servers on EC2 with the pg_shard extension and use RDS nodes as worker nodes. The pg_shard node only needs to store a tiny bit of metadata, which can be backed up in one of the worker nodes, so these nodes are relatively low maintenance and can be scaled out to accommodate higher query rates.
A guide with a link to a CloudFormation template to set everything up automatically is available at: https://www.citusdata.com/blog/14-marco/178-scaling-out-postgresql-on-amazon-rds-using-masterless-pg-shard
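The routing idea behind sharding by key can be illustrated with a few lines of hash-based routing. This is a generic sketch, not pg_shard's actual implementation or metadata format; the worker endpoint list and the choice of SHA-256 are assumptions.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a partition key to a shard index in a stable, repeatable way."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(key, worker_endpoints):
    """Pick the worker node that stores the rows for this key."""
    return worker_endpoints[shard_for(key, len(worker_endpoints))]
```

With plain modulo routing like this, changing the number of workers remaps most keys.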