I have an AWS RDS Aurora PostgreSQL cluster with four instances in a Multi-AZ deployment serving production. Encryption at rest hasn't been enabled on this cluster, and now I have to enable it. The AWS docs suggest creating a snapshot of the cluster and then restoring it with encryption enabled. Ref: Here
Since my cluster is serving production, no downtime or I/O suspension is acceptable to me. Here are some questions I would like answered before I plan to encrypt the existing cluster:
Is there any downtime during the creation of the snapshot, assuming there is a lot of data and the snapshot will take time?
What about new data being written to the database during snapshot creation? Is the snapshot creation real-time, or will I lose the data written while the snapshot is being taken?
Is this the only way to enable encryption on the production cluster, knowing that it will result in some database outage?
There is a way to encrypt your Amazon Aurora PostgreSQL-Compatible cluster with little or no downtime, but it takes a bit of effort.
You need to take the following steps:
Take a snapshot of the source DB cluster.
Copy that snapshot, checking Enable Encryption and selecting either the default encryption key or your own AWS KMS key. You now have an encrypted copy of your DB snapshot.
Restore this encrypted snapshot to a new DB cluster. You can enable Multi-AZ and add read replicas now, or modify them after migration.
Now you have two DB clusters, one encrypted and one unencrypted, but their data has diverged, since the production database keeps receiving writes.
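If you prefer to script these three steps, here is a minimal boto3 sketch; the cluster, snapshot, and instance identifiers and the KMS key are placeholders you would replace with your own.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# 1. Snapshot the unencrypted source cluster (all identifiers are placeholders).
rds.create_db_cluster_snapshot(
    DBClusterIdentifier="prod-aurora-cluster",
    DBClusterSnapshotIdentifier="prod-aurora-unencrypted-snap",
)
rds.get_waiter("db_cluster_snapshot_available").wait(
    DBClusterSnapshotIdentifier="prod-aurora-unencrypted-snap"
)

# 2. Copy the snapshot; passing KmsKeyId is what enables encryption on the copy.
rds.copy_db_cluster_snapshot(
    SourceDBClusterSnapshotIdentifier="prod-aurora-unencrypted-snap",
    TargetDBClusterSnapshotIdentifier="prod-aurora-encrypted-snap",
    KmsKeyId="alias/aws/rds",  # or the ARN of your own KMS key
)
rds.get_waiter("db_cluster_snapshot_available").wait(
    DBClusterSnapshotIdentifier="prod-aurora-encrypted-snap"
)

# 3. Restore the encrypted snapshot into a new cluster, then give it an instance.
rds.restore_db_cluster_from_snapshot(
    DBClusterIdentifier="prod-aurora-encrypted",
    SnapshotIdentifier="prod-aurora-encrypted-snap",
    Engine="aurora-postgresql",
)
rds.create_db_instance(
    DBInstanceIdentifier="prod-aurora-encrypted-1",
    DBClusterIdentifier="prod-aurora-encrypted",
    DBInstanceClass="db.r6g.large",
    Engine="aurora-postgresql",
)
```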
We will use AWS DMS to replicate the ongoing changes. Alternatively, you can use PostgreSQL logical replication with Aurora instead of AWS DMS, which may even be better; both work.
Go to the AWS DMS console and create an AWS DMS task.
For migration type, choose Migrate existing data and replicate ongoing changes.
For target table preparation mode, choose Truncate.
Under Advanced Task Settings, enable the awsdms_status table if you want to verify replication status.
Run the migration task. AWS DMS first determines the size of the data to migrate, loads the existing data, and then replicates ongoing changes; wait until all records are in sync.
Then verify that the data in the encrypted DB cluster matches the unencrypted DB cluster after migration.
Check the replication status in AWS DMS by inspecting the migration task and the awsdms_status table.
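For reference, the same task can be created programmatically. This is a rough boto3 sketch, with placeholder endpoint and replication instance ARNs; full-load-and-cdc corresponds to "Migrate existing data and replicate ongoing changes", and the task settings mirror the Truncate and awsdms_status choices above.

```python
import json

import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Replicate every table in the public schema; adjust to your own schemas.
table_mappings = {
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "all-public-tables",
        "object-locator": {"schema-name": "public", "table-name": "%"},
        "rule-action": "include",
    }]
}

# TRUNCATE_BEFORE_LOAD matches the "Truncate" console option, and
# StatusTableEnabled creates the awsdms_status control table on the target.
task_settings = {
    "FullLoadSettings": {"TargetTablePrepMode": "TRUNCATE_BEFORE_LOAD"},
    "ControlTablesSettings": {"StatusTableEnabled": True},
}

# "full-load-and-cdc" = "Migrate existing data and replicate ongoing changes".
dms.create_replication_task(
    ReplicationTaskIdentifier="encrypt-cutover-task",
    SourceEndpointArn="arn:aws:dms:...:endpoint:SOURCE",    # placeholder
    TargetEndpointArn="arn:aws:dms:...:endpoint:TARGET",    # placeholder
    ReplicationInstanceArn="arn:aws:dms:...:rep:INSTANCE",  # placeholder
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps(table_mappings),
    ReplicationTaskSettings=json.dumps(task_settings),
)
```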
You can now route traffic to the new endpoint.
For a smooth cutover, use Amazon Route 53: lower the DNS TTL to a short value, then switch the endpoint names in Route 53.
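A sketch of that cutover with boto3, assuming your applications connect through a CNAME you control (the record name, hosted zone ID, and endpoint below are placeholders):

```python
import boto3

route53 = boto3.client("route53")

# Lower the TTL well before cutover, then repoint the record at the
# encrypted cluster's endpoint when DMS is in sync.
route53.change_resource_record_sets(
    HostedZoneId="Z0000000000000",  # placeholder hosted zone ID
    ChangeBatch={
        "Comment": "Cut over to the encrypted Aurora cluster",
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "db.example.com",
                "Type": "CNAME",
                "TTL": 30,  # short TTL so clients pick up the change quickly
                "ResourceRecords": [{
                    "Value": "prod-aurora-encrypted.cluster-xyz.us-east-1.rds.amazonaws.com"
                }],
            },
        }],
    },
)
```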
Now, to answer your questions:
Is there any downtime during the creation of the snapshot, assuming there is a lot of data and the snapshot will take time?
Given your cluster setup: since you are running a Multi-AZ deployment, automated backups and DB snapshots are taken from the standby to avoid I/O suspension on the primary. Note that you may still experience increased I/O latency (typically lasting a few minutes) during backups for both Single-AZ and Multi-AZ deployments.
What about new data being written to the database during snapshot creation? Is the snapshot creation real-time, or will I lose my new data during the time the snapshot is being taken?
You will lose the data written after the snapshot is taken, which is why you use AWS DMS to replicate the ongoing changes to your encrypted DB cluster.
Is this the only way for me to enable encryption on the production cluster knowing that it will result in some database outage?
Yes, this is the only way, but done as above it results in little or no downtime.
We need to know the best options for setting up an AWS RDS (Aurora MySQL) instance that is standalone and does not receive traffic from the actual RDS cluster.
The requirement is for our data team to run analytical queries without impacting the actual application and DB performance. Hence we need a DB that always has near-live data, but that live traffic and the application never connect to.
We need to know which fits better: DB clone, AWS pilot light, AWS warm standby, AWS hot standby, or a multi-AZ configuration.
Kindly let us know which one would fit our requirement better.
We have so far read about the three options below:
AWS Amazon Aurora DB clone, https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Managing.Clone.html
AWS pilot light, AWS warm standby, or AWS hot standby: https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-iii-pilot-light-and-warm-standby/
With a multi-AZ configuration, we can create a new instance in a new AZ, so that this instance will have a different host (kind of a failover strategy), where traffic to this instance will come from our queries and not from the live prod application, unless there is some failover issue.
Option 1, Aurora cloning, says:
Run workload-intensive operations, such as exporting data or running analytical queries on the clone.
...which seems to be your use case here.
Just be aware that the clone will not see any changes to the original data after it is made, so you will need to periodically delete and re-clone to get the updated data.
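If it helps, that re-cloning can be scripted. Aurora clones are created through the point-in-time-restore API with a copy-on-write restore type; this sketch uses placeholder identifiers and omits first-run error handling.

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Remove the previous clone (instance first, then the cluster). Handling a
# missing clone and waiting for the old cluster to finish deleting before
# re-creating under the same name are omitted for brevity.
rds.delete_db_instance(DBInstanceIdentifier="analytics-clone-1", SkipFinalSnapshot=True)
rds.get_waiter("db_instance_deleted").wait(DBInstanceIdentifier="analytics-clone-1")
rds.delete_db_cluster(DBClusterIdentifier="analytics-clone", SkipFinalSnapshot=True)

# "copy-on-write" is what makes this a fast, storage-sharing Aurora clone
# rather than a full restore from a backup.
rds.restore_db_cluster_to_point_in_time(
    SourceDBClusterIdentifier="prod-aurora-cluster",
    DBClusterIdentifier="analytics-clone",
    RestoreType="copy-on-write",
    UseLatestRestorableTime=True,
)

# The clone needs at least one DB instance before anyone can connect to it.
rds.create_db_instance(
    DBInstanceIdentifier="analytics-clone-1",
    DBClusterIdentifier="analytics-clone",
    DBInstanceClass="db.r6g.large",
    Engine="aurora-mysql",
)
```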
Regarding option 2: I wrote those blog posts, and I do not think that approach suits your use case. That approach is for disaster recovery.
Option 3 may work, with a modification: the concept is to create an Aurora Replica, which, as you say, is a separate instance. The problem is that the reader endpoint for your production workload may hit that instance (which is not what you want).
EDIT: Adding new option 4
Option 4. Check out Amazon Aurora zero-ETL integration with Amazon Redshift. This zero-ETL integration also enables you to analyze data from multiple Aurora database clusters in an Amazon Redshift cluster.
I noticed that when a cross-Region read replica was created for my AWS Aurora cluster, it has both a writer and a reader instance (similar to my primary cluster, which naturally has a writer in addition to a reader instance). In the cluster configuration for the cross-Region replica cluster, I can see this "replica" cluster indeed has the replication source tag and is correctly receiving all data asynchronously.
A couple of questions I need help understanding:
Should a cross-Region replica have a writer?
Should I write to it in case of a disaster in the source region?
I had to explicitly make the read-replica instance (in the replica Region) read_only; otherwise, the read replica accepts writes.
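For illustration, this is roughly how you would apply that parameter with boto3, assuming (as in Aurora MySQL) that read_only is an instance-level parameter and that the replica's instances use a custom, modifiable parameter group; the group name is a placeholder.

```python
import boto3

rds = boto3.client("rds", region_name="us-west-2")  # the replica's Region

# The default parameter group cannot be modified, so this assumes a custom
# DB parameter group is attached to the replica's instances.
rds.modify_db_parameter_group(
    DBParameterGroupName="replica-instance-params",
    Parameters=[{
        "ParameterName": "read_only",
        "ParameterValue": "1",
        "ApplyMethod": "immediate",
    }],
)
```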
Found relevant answer in AWS re:Post:
"You're replicating to a new cluster, and each cluster needs a writer
instance. It is recommended that customers apply the read_only
parameter to the replica, but by default they are able to write to
this instance. This allows for architectures where the replica is read
& writable.
https://repost.aws/questions/QUrCbnj0u4TWaz-A1uR-QDPQ/aurora-create-cross-region-read-replica-vs-add-region
I think the name "writer" is a bit misleading.
In the doc for Aurora endpoints:
A cluster endpoint (or writer endpoint) for an Aurora DB cluster connects to the current primary DB instance for that DB cluster. This endpoint is the only one that can perform write operations such as DDL statements. Because of this, the cluster endpoint is the one that you connect to when you first set up a cluster or when your cluster only contains a single DB instance. Each Aurora DB cluster has one cluster endpoint and one primary DB instance.
So the writer instance is the same entity as the primary instance of the cluster.
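You can see both endpoints on the cluster, for example with boto3 (the cluster identifier is a placeholder):

```python
import boto3

rds = boto3.client("rds")

# "my-cluster" is a placeholder cluster identifier.
cluster = rds.describe_db_clusters(DBClusterIdentifier="my-cluster")["DBClusters"][0]

print(cluster["Endpoint"])        # cluster (writer) endpoint -> the primary instance
print(cluster["ReaderEndpoint"])  # reader endpoint -> load-balances across replicas
```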
In the doc for Cross-Region Replication:
When you create a cross-Region read replica for Aurora MySQL by using the AWS Management Console, Amazon RDS creates a DB cluster in the target AWS Region, and then automatically creates a DB instance that is the primary instance for that DB cluster.
For cross-Region replication, a new cluster (with its own writer instance) is created.
You don't need to take care of the writer instance. For disaster recovery, promote the read replica.
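Promotion is a single API call; a sketch with boto3, where the Region and identifier are placeholders:

```python
import boto3

# Promotion happens in the replica's Region.
rds = boto3.client("rds", region_name="eu-west-1")

# Detaches the secondary cluster from its replication source, making it a
# standalone, writable Aurora cluster.
rds.promote_read_replica_db_cluster(DBClusterIdentifier="replica-cluster")
```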
Apologies if this isn't the right place to ask this question.
I'm looking to create an automated copy (backup) of my AWS RDS (MySQL) database daily, and to have this backup restored daily to another RDS instance that is made available to another set of applications.
I already have daily backups running, and I can create a new RDS instance from a backup, but I want this to happen automatically within AWS.
Looking through the AWS documentation, I can't find anything that fits this purpose, but maybe there's a service I'm not aware of.
AWS Aurora for MySQL and PostgreSQL supports auto scaling.
Auto scaling dynamically adjusts the number of Aurora Replicas available for a cluster based on metrics and policies. When the workload suddenly increases, it adds more read replicas; when the workload decreases, it removes them, so you don't have to pay for unused capacity.
Aurora AutoScaling
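Aurora replica auto scaling is configured through the Application Auto Scaling service; a rough boto3 sketch, where the cluster identifier, capacity bounds, and the 60% CPU target are placeholder choices:

```python
import boto3

aas = boto3.client("application-autoscaling", region_name="us-east-1")

# Make the cluster's replica count scalable; "my-cluster" is a placeholder.
aas.register_scalable_target(
    ServiceNamespace="rds",
    ResourceId="cluster:my-cluster",
    ScalableDimension="rds:cluster:ReadReplicaCount",
    MinCapacity=1,
    MaxCapacity=4,
)

# Add or remove Aurora Replicas to keep average reader CPU near 60%.
aas.put_scaling_policy(
    PolicyName="keep-reader-cpu-near-60",
    ServiceNamespace="rds",
    ResourceId="cluster:my-cluster",
    ScalableDimension="rds:cluster:ReadReplicaCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
        },
    },
)
```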
AWS RDS doesn't support auto scaling, but you can always scale horizontally and vertically manually.
Scaling Your Amazon RDS Instance Vertically and Horizontally
I have a doubt: if in AWS all server-side work is done by the cloud provider, then why do we store backups for the database?
I have read in the documentation that the cloud service provider manages everything related to the database. Then what is the need to store backups if the service provider does everything for me?
You maintain your own backups of RDS instances for the same reason that you maintain offsite backups of on-premise databases: disaster recovery. In your own data center, a fire or terrorism or natural disaster could destroy both your database and your local backups. In the cloud, these disasters tend to take on a different form.
If all of your data is in any one place, then you are vulnerable to data loss in a catastrophic event, which could take a number of forms: a serious defect in the cloud provider's infrastructure (unlikely with AWS, but nothing is impossible), human error, malicious employees, a compromise of your credentials, or any other of a number of statistically-unlikely events -- the low probability of which becomes irrelevant when it occurs.
If you value your data, you back it up independently and outside of its native environment.
Amazon RDS runs a database of your choice: MySQL, PostgreSQL, Oracle, SQL Server. These are normal databases and operate in the same way as a database you would run yourself.
You are correct that a managed solution takes care of installation, maintenance and hardware issues. Also, you can configure the system to automatically take backups of the data.
From Working With Backups - Amazon Relational Database Service:
Amazon RDS creates and saves automated backups of your DB instance. Amazon RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases.
Amazon RDS creates automated backups of your DB instance during the backup window of your DB instance. Amazon RDS saves the automated backups of your DB instance according to the backup retention period that you specify. If necessary, you can recover your database to any point in time during the backup retention period.
You also have the ability to trigger a manual backup. This is advisable, for example, before you do major work on the database, such as modifying schemas when upgrading an application that uses the database.
Bottom line: Amazon RDS can manage the backups for you. You do not need to manage the backup process yourself, but you can trigger RDS backups manually when needed.
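Triggering a manual backup is one API call; a small boto3 sketch with placeholder identifiers:

```python
import boto3

rds = boto3.client("rds")

# Manual snapshot before risky schema work; identifiers are placeholders.
# Unlike automated backups, manual snapshots are kept until you delete them.
rds.create_db_snapshot(
    DBInstanceIdentifier="my-database",
    DBSnapshotIdentifier="my-database-pre-upgrade",
)
rds.get_waiter("db_snapshot_available").wait(
    DBSnapshotIdentifier="my-database-pre-upgrade"
)
```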
I modified my RDS instance to "Multi-AZ: Yes". My primary RDS instance is in us-west-1a, and for Multi-AZ the secondary zone is shown as us-west-1c. I wanted to verify whether the changes I make on my primary database are copied to the Multi-AZ standby database quickly.
But I am not able to work out what endpoint URL I should use to log in to the Multi-AZ standby. I am thinking the endpoint URL would be different from the primary's. Could you please help me with this?
You do not have access to the secondary RDS instance in a Multi-AZ configuration; you just need to trust that AWS is replicating data correctly. In a Multi-AZ configuration, RDS writes to both replicas synchronously: it will not return from the write request until both replicas have written correctly.
To access a Multi-AZ instance, you issue your reads and writes to the single RDS endpoint. In case of an issue, AWS will modify the DNS entry for that endpoint to point to the secondary replica. So as long as you are using the endpoint DNS record, and not caching the IP address, the failover process should be transparent to you, with only a minute or so of "downtime".
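One crude way to watch that DNS flip during a failover (the endpoint below is a placeholder):

```python
import socket
import time

# Placeholder RDS endpoint; during a Multi-AZ failover the IP this name
# resolves to changes, so printing it in a loop makes the DNS flip visible.
ENDPOINT = "my-database.xxxxxxxx.us-west-1.rds.amazonaws.com"

while True:
    print(time.strftime("%H:%M:%S"), socket.gethostbyname(ENDPOINT))
    time.sleep(5)
```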
Take a look at https://aws.amazon.com/rds/details/multi-az/. You don't typically interact with the replica(s) of RDS resources directly; AFAIK (I'm not an RDS expert) you can't do what you're describing. The idea is that RDS does that for you, automatically keeping a consistent replica in a different AZ and providing a consistent DNS endpoint.
Although the OP asks how to "verify data is copied quickly", Google pointed me here for "verify a multi-AZ RDS deploy". I'll share what I found in hopes that it's halfway helpful.
In the RDS console, a Multi-AZ deployment shows a Reboot with failover option when rebooting; standard deployments do not have this option, which was a small but satisfying indication that Multi-AZ was acting as expected.
Source (and generally a pretty good read)
Q: Can I initiate a “forced failover” for my Multi-AZ DB instance deployment?
Amazon RDS will automatically fail over without user intervention under a variety of failure conditions. In addition, Amazon RDS provides an option to initiate a failover when rebooting your instance. You can access this feature via the AWS Management Console or when using the RebootDBInstance API call.
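The same forced failover can be triggered programmatically; a sketch with boto3, where the instance identifier is a placeholder:

```python
import boto3

rds = boto3.client("rds", region_name="us-west-1")

# ForceFailover=True corresponds to "Reboot with failover" in the console and
# is only valid for Multi-AZ instances.
rds.reboot_db_instance(
    DBInstanceIdentifier="my-database",
    ForceFailover=True,
)
```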