How are Amazon RDS Database Instances Provisioned? - amazon-web-services

I've been considering moving some databases from self-hosted database instances (e.g. MySQL or PostgreSQL on Linux, either bare-metal or within AWS itself) into Amazon RDS, but it's unclear to me how everything will behave once I've created the database and it's time for maintenance to begin.
For example, I have to choose the type of instance(s) that will be used for the database, which I guess means how responsive everything will be, and there is an option for multi-AZ deployments, but it's not clear how many of those types of instances I'm actually configuring. (Presumably, multi-AZ deployment requires at least two instances).
There are options for Failover, which leads me to believe that I can rely on the service to stay up if there are problems with an instance, but then there is also a section for selecting maintenance windows for automated upgrades, which I find confusing. If I were administering e.g. a two-instance MySQL setup, I'd upgrade one instance and then the other to avoid any downtime. Is that not how RDS behaves?
RDS advertises support for automatic "minor version upgrades" (yes, please), but doesn't say anything about OS upgrades. Presumably, the db engine will be running on Amazon Linux or something similar, and will periodically require updates to those packages. Does that all happen automatically, or do I need to manually perform those upgrades, etc.?
The whole point of using something like RDS is that the service should become something I no longer have to worry about: I don't have to deal with package maintenance, upgrades, failover, or unexpected downtime (as long as I pay enough, of course). But all of the options for the RDS instance are making me skeptical of the advantages provided by RDS over just running everything myself.
Can anyone with experience with AWS RDS comment on their experiences with maintenance, upgrades, and failover?

These were the same concerns which we had when we were planning to use RDS. Now that we are effectively using AWS RDS for multiple production workloads, let me try to clarify your queries. Hope this helps.
Your Question 1 : I have to choose the type of instance(s) that will be used for the database, which I guess means how responsive everything will be
Answer : Yes. This is to define what capacity (CPU,RAM etc) you will need for your database workload
Your Question 2 : There is an option for multi-AZ deployments, but it's not clear how many of those types of instances I'm actually configuring.
Answer : Multi-AZ deployments are to ensure high availability. AZ (Availability Zones) are isolated locations within an AWS Region to provide better protection against disaster scenarios. So when we choose a Multi AZ deployment, RDS will place 2 instances of your database server in 2 Availability Zones in the region where you are provisioning.
This is done automatically by RDS and we dont have to setup/maintain 2 servers separately/manually. ( Note : Your VPC should have atleast 1 subnet in each of the 2 different AZ to provision Multi AZ Setup)
Your Question 3: If I were administering e.g. a two-instance MySQL setup, I'd upgrade one instance and then the other to avoid any downtime. Is that not how RDS behaves?
Yes. RDS does it by itself without manual intervention if you enable Automatic Upgrades while setting up RDS (Only if you choose to have Multi AZ option)
Your Question 4 : RDS advertises support for automatic "minor version upgrades" (yes, please), but doesn't say anything about OS upgrades.
Answer : RDS dont expose/provide any OS access to us. The underlying OS and its upgrades/other activities are all done without affecting the RDS services hosted on top of it. We dont have to do anything about the OS of RDS. So we can forget about that part.
Your question 5 : Regarding Failover of AWS RDS Multi AZ database
I would classify into 2 cases.
Case 1 : Fail-overs required during maintenance/other automatic activities done by Multi AZ RDS instance.
Here, RDS will automatically do the failover one instance at a time. It will first move all the ongoing traffic to second instance and then upgrade/reboot the first instance and then do the same with second instance.
Case 2 : Fail-overs required during manual reboot/manually triggered actions done on Multi AZ RDS instance.
In this case, during the reboot, AWS RDS provides an option for you to select whether the reboot should be with failover or without one.

Related

How to setup AWS RDS standalone instance without traffic from actual RDS cluster

We need to know what are the best options to set AWS RDS instance (Aurora mysql) that is standalone and does not get traffic from actual RDS cluster.
Requirement is for our data team to write analytical queries but we do not want it to impact actual application and DB performance. Hence we need a DB which always has near to live data but live traffic or application does not connect to this instance.
Need to know which fits better, DL clone OR AWS Pilot light OR AWS Warn standby OR AWS hot standby OR
multi-AZ configuration.
Kindly let us know which one would fit our requirement better.
We have so far read about below 3 options,
AWS Amazon Aurora DB clone, https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Managing.Clone.html
AWS Pilot light or AWS Warn standby or AWS hot standby
. https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-iii-pilot- light-and-warm-standby/
With multi-AZ configuration, we can create a new instance in new AZ, so that his instance will have a different host (kind off, a fail over strategy), where traffic to his instance will be from our queries and not from live prod application, unless there is some fail over issue.
Option 1, Aurora cloning says
Run workload-intensive operations, such as exporting data or running analytical queries on the clone.
...which seems to be your use case here.
Just be aware that the clone will not see any changes to the original data after it is made. So you will need to periodically delete and re-clone to get the updated data
Regarding option 2, I wrote those blog posts, and I do not think that approach suits your use case. That approach is for disaster recovery
Option 3 may work. To modify it a bit, the concept here is to create an Aurora Replica, which as you say is a separate instance. The problem here is the reader endpoint for your production workload, it may hit that instance (which is not what you want)
EDIT: Adding new option 4
Option 4. Check out Amazon Aurora zero-ETL integration with Amazon Redshift. This zero-ETL integration also enables you to analyze data from multiple Aurora database clusters in an Amazon Redshift cluster.

Amazon RDS instances and the new Compute Savings Plans

I have a small single-instance deployment running on an EC2 instance which hosts both a web application and its database (MySQL). I've been looking to separate the deployment out into an EC2 instace for the web app and an RDS cluster for the database, and wanted to take advantage of the new AWS Savings Plans for both if possible.
My questions the are:
AWS Savings Plans seem to only apply to 'pure' compute EC2 instances, not to RDS instances as well. Can someone confirm or disprove this?
If Savings Plans did apply to RDS instances, is there a reason to not use them, and instead just use an Instance Reservation?
Since August 2020, AWS Savings Plans includes:
Amazon EC2
AWS Lambda
AWS Fargate
They do not apply to Amazon RDS db instances. For those, you can continue to use Amazon RDS Reserved Instances.
I want to clarify that even though Savings Plans do not cover RDS instances, they do cover EC2 instances that are part of EMR, ECS and EKS Clusters. Based on this link:
"Both plan types apply to EC2 instances that are a part of Amazon EMR, Amazon EKS, and Amazon ECS clusters. Amazon EKS charges will not be covered by Savings Plans, but the underlying EC2 instances will be. "
Also, Compute Savings Plans also apply to your Fargate and Lambda usage.
We moved to RDS from EC2 instances running self installed MySQL years ago. For me, at has been great. All of the RDS features work flawlessly, point and click, without the mundane work of spinning up, replicating, backing up, and failing over databases. It simply works great. Use reserved instances if you plan on keeping for at least a year. At 30% savings the cost is awash even if you bail on the server after about 9 months and don't use the entire year. Plus you can sell the unused remaining on the marketplace.
Downsides?
You do NOT get command line OS access to the MySQL server. You get an admin login to mySQL. The only way to manage it is through the AWS UI and the mysql client command line or managing client (like MySQL Workbench or Heidi).
You may want to run a mysqldump script on a separate EC2 to dump databases separately/additionally. AWS does SNAPSHOTS which require an entire restore of a sandbox server just to get a single table someone botched up, for example. I go to the MySQLdump files all the time. Never have needed the SNAPSHOT unless I am spinning up a sandbox copy of the entire instance for some reason.
In a nutshell, mySQL on RDS is great.
One other side note. We migrated an app using MySQL5.7 to Aurora MySQL with absolutely zero issues. Complete drop-in replacement (in our case).

Multi region EC2 & RDS replication from Region A to various other regions

Our current server consisting of an 2x EC2 instances and RDS (Read/Write) database is in Mumbai Region. However I would like to copy everything (2x EC2 & RDS (R/W)) across to Sydney, and other to other regions.
Ideally I would like to replicate the contents in those instances as well.
Does anyone know a quick and easy way of doing this?
Edit 25/01/2019:
However I would like to copy everything including what ever is inside the instances (2x EC2s and the RDSs)
Edit 29/01/2019:
The purpose is to "scale/expand out". I want to have the same infrastructure replicated 1-to-1 (exactly/identically) across various regions.
It is simple!
- For EC2 - you need to create an AMI of those instances then right click on the AMI you've just created and choose "copy AMI" to the designated region.
For RDS
If you just wanna copy data to another region then take a snapshot then copy that snapshot to destination region
If you want to make the RDS replicate to another region continuously then you need to create a read-replica from your RDS instance.
Option for replicating environment depends on how much downtime can you tolerate.
If you are okay with downtime
1. Copy the AMI of EC2 instance and snapshot of RDS to another regions
2. Bring up your new environment.
This is perfect for non critial workload
If this is critical application
1. Copy the AMI of ec2 instance ( I am assuming this would be your web/app instnaces) For real time replication use rsync or robocopy .. or solution like cloudendure .
2. Create a new RDS instance in sydney
3. USE DMS migration tool .. create source and target relationship
4. once insync cut off the relation bring new environment in sydney
As suggested by previous answers for EC2 you can create AMIs and then move the AMI to a different region.
For RDS, you can either create read replicas (and read replicas of read replicas, but beware of latency), read replicas are used to mainly improve read performance of your app.
You can also create a Multi AZ backup which will act as a disaster recovery site. However, note that Multi-AZ is only used in case of a failover. Moreover, Multi-AZ involves Synchronous data copy and read replicas are asynchronous, so read replicas can demonstrate eventual consistency behavior.
But the real question here is - What are you trying to achieve?
Are you trying to "scale out" your infrastructure to support huge traffic to your application? Or are you simply trying to setup disaster recovery (DR)?
If your answer is DR, then the approach is pretty straight forward with Multi AZ and EC2 instance snapshots. But if the answer is scaling out and performance, you really need to be thinking of better strategies such as using Cloudfront (CDN) if it is a web app, using Elasticache in-memory cache for frequently read data, or RDS read replicas, using Elastic Load Balancers with Dynamic/Step scale-out/scale-in. Other, methods would be to evaluate the type of RDS storage subsystem used i.e. using Provisional IOPs vs. Using General Purpose SSD, checking if there are any NAT “instance” bottlenecks in your VPC and so on.
It may be tempting to spin up all these redundant copies of EC2 AMIs or RDS read replicas with a click of a button, but you really need to be thinking about the cost you are going to incur on a monthly basis for completely un-used resources.

EC2, Webserver, and MySQL

I am about to launch an iOS app that will be communicating with my custom REST API. Right now I am running a single EC2 t2.micro instance running an Apache web server with MySQLi. Before I go ahead and launch it for the public, I want to hear what proper steps should be taken regarding the following.
Should I run two separate EC2 instances? One only for the web server and the other to handle only the database?
How should I approach setting up the database? Should I still use MySQLi or should I start using Amazon's RDS?
In relationship to number two, when the database and/or web server runs out of space, how is this issue handled so that it seamlessly adds space to allow the database/web server to continue growth? I also read something regarding auto-scale.
I will be expecting many requests per minute to my web server and want to take precaution.
The answer to these questions largely depends on the requirements of your application, your budget, and on what you decide to manage vs. what you'd prefer to allow AWS to manage. However, I'll answer these as best I can.
1) Yes. Separating the database from the web server (that is, 2 different EC2 instances) makes sense for a lot of reasons. This will allow you to tailor resources like memory, CPU, etc. to each layer of your application separately. You do not want your web and database competing for the same resources. Additionally, an issue that forces you to take down one (web or database) will not force you to also take down the other. If your database lives on one of the web servers and you need to perform maintenance, your app will effectively become offline, since down goes your database as you perform updates. Also, ideally you would protect your database server within a private subnet in your VPC. If you have the web and database on the same server, they will both be in a public subnet, since you're web will require access to an internet gateway.
2) Depends. If you want to maintain total control of the database server, than use an EC2 instance where you retain operating system control. If you want to take advantage of features like Multi-AZ for high availability or allowing AWS to manage things like updates for you, RDS can be a great option. Cost also plays a role. For things like read-replicas and Multi-AZ, you will pay more, but you are purchasing performance and high availability. Thus, depends on your requirements. You can find the features of RDS here: RDS Product Details
3) For anything running on an EC2 instance (database or web) or if you decide to use RDS, you may provision and attach additional storage volumes as necessary. The type of storage you select will depend on the performance requirements, your budget, and the kind of workload you expect your database to face. Amazon provides the storage options available to you as well as a section for adding more storage here: RDS Storage Options
If you are worried about too many requests overwhelming your EC2 t2.micro instance, consider creating an ELB load balancer and setting up an auto-scaling group which will allow you to expand your capacity as necessary while distributing traffic such that no one server gets overwhelmed.

AWS - HA NFS - Best practices

Anyone have a sound strategy for implementing NFS on AWS in such a way that it's not a SPoF (single point of failure), or at the very least, be able to recover quickly if an instance crashes?
I've read this SO post, relating to the ability to share files with multiple EC2 instances, but it doesn't answer the question of how to ensure HA with NFS on AWS, just that NFS can be used.
A lot of online assets are saying that AWS EFS is available, but it is still in preview mode and only available in the Oregon region, our primary VPC is located in N. Cali., so can't use this option.
Other online assets are saying that GlusterFS is a way to go, but after some research I just don't feel comfortable implementing this solution due to race conditions and performance concerns.
Another options is SoftNAS but I want to avoid bringing in an unknown AMI into a tightly controlled, homogeneous environment.
Which leaves NFS. NFS is what we use in our dev environment and works fine, but it's dev, so if it crashes we go get a couple beers while systems fixes the problem, but on production, this is obviously a no go.
The best solution I can come up with at this point is to create an EBS and two EC2 instances. Both instances will be updated as normal (via puppet) to maintain stack alignment (kernel, nfs libs etc), but only one instance will mount the EBS. We set up a monitor on the active NFS instance, and if it goes down, we are notified and we manually detach and attach to the backup EC2 instance. I'm thinking we also create a network interface that can also be de/re-attached so we only need to maintain a single IP in DNS.
Although I suppose we could do this automatically with keepalived, and a IAM policy that will allow the automatic detachment/re-attachment.
--UPDATE--
It looks like EBS volumes are tied to specific availability zones, so re-attaching to an instance in another AZ is impossible. The only other option I can think of is:
Create EC2 in each AZ, in public subnet (each have EIP)
Create route 53 healthcheck for TCP:2049
Create route 53 failover policies for nfs-1 (AZ1) and nfs-2 (AZ2)
The only question here is, what's the best way to keep the two NFS servers in-sync? Just cron an rsync script between them?
Or is there a best practice that I am completely missing?
There are a few options to build a highly available NFS server. Though I prefer using EFS or GlusterFS because all these solutions have their downsides.
a) DRBD
It is possible to synchronize volumes with the help of DRBD. This allows you to mirror your data. Use two EC2 instances in different availability zones for high availability. Downside: configuration and operation is complex.
b) EBS Snapshots
If a RPO of more than 30 minutes is reasonable you can use periodic EBS snapshots to be able to recover from an outage in another availability zone. This can be achieved with an Auto Scaling Group running a single EC2 instance, a user-data script and a cronjob for periodic EBS snapshots. Downside: RPO > 30 min.
c) S3 Synchronisation
It is possible to synchronize the state of an EC2 instance acting as NFS server to S3. The standby server uses S3 to stay up to date. Downside: S3 sync of lots of small files will take too long.
I recommend watching this talk from AWS re:Invent: https://youtu.be/xbuiIwEOCAs
AWS has reviewed and approved a number of SoftNAS AMIs, which are available on AWS Marketplace. The jointly published SoftNAS Architecture on AWS White Paper provides more details:
Security (pages 4-11)
HA across AZs (pages 13-14)
You can also try a 30 day free trial to see if it meets your needs.
http://softnas.com/tryaws
Full disclosure: I work for SoftNAS.