SQL Backups to S3 - amazon-web-services

I have the following Amazon EC2 configuration:
Prod Web & DB server (Virginia)
Web & DB server (Oregon)
I would like to store my SQL backups in S3 so that they are available to be restored to my standby server in case the Virginia region goes down for any period of time (which has been known to happen :)
These are the two regions I am considering for my S3 bucket:
US Standard
Oregon
I first tried Oregon, but when I do that I am unable (for some reason) to upload to that bucket from my Virginia instance. On the other hand, I am worried that if I choose US Standard, my S3 bucket will not be available in the event Virginia becomes unavailable.
Does anyone have any recommendations for overcoming the issues with either of these scenarios?
Thanks!

My recommendation is to use RDS (Relational Database Service), which is basically a managed RDBMS service for MySQL (or MS-SQL or Oracle). It takes care of backup and restore for the DB.
With MySQL it has the option of an automatic standby in a different Availability Zone within the region. When you enable "Multi-AZ", it creates the standby and keeps it in sync with the primary, so your failover will be very close to real time.
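If you already have an RDS instance, Multi-AZ can be turned on with a single modification call. Below is a minimal sketch using boto3; the instance identifier and region are placeholders, and ApplyImmediately pushes the change outside the maintenance window.

    import boto3

    # Sketch: enable Multi-AZ on an existing RDS instance (identifier is hypothetical).
    rds = boto3.client("rds", region_name="us-east-1")

    rds.modify_db_instance(
        DBInstanceIdentifier="prod-db",   # placeholder instance name
        MultiAZ=True,                     # provision a synchronous standby in another AZ
        ApplyImmediately=True,            # don't wait for the next maintenance window
    )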

Related

AWS RDS disaster recovery using cross-account

We are running AWS RDS PostgreSQL with daily automatic snapshots, encrypted with an AWS-managed KMS key. My objective is to minimize risk and data loss in case the main AWS account (running RDS) is compromised or the RDS instance is deleted or damaged in some way.
What we've implemented so far: RDS snapshots are shared with a different (backup) account, periodically copied to the backup account, and re-encrypted with a KMS key from the backup account, to make the copies local and independent of the main AWS account.
I'm wondering if there are better ways to minimize the recovery time objective (RTO) and recovery point objective (RPO) in case of a disaster event?
This AWS blog post seems to weigh the options well.
Automated backups are limited to a single AWS Region while manual snapshots and Read Replicas are supported across multiple Regions.
Having a cross-region read replica would give you the best RPO and RTO, as you can promote the replica to be an independent instance.
Alternatively, Amazon Aurora Backtrack seems to offer a similar option to having a read replica, but I have no personal experience with this feature, so I can't say how effective it is in improving RTO and RPO.
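For reference, promoting a cross-region read replica during a disaster is a single API call. A minimal sketch with boto3, assuming a hypothetical replica named dr-replica in the backup region:

    import boto3

    # Sketch: promote an existing cross-region read replica to a standalone instance.
    rds = boto3.client("rds", region_name="us-west-2")  # the DR region

    rds.promote_read_replica(
        DBInstanceIdentifier="dr-replica",   # hypothetical replica identifier
        BackupRetentionPeriod=7,             # enable automated backups on the promoted instance
    )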
I wrote two scripts implementing this flow (the idea is to run them daily):
src_acc_take_share_rds_snapshot.py in the src account:
list available RDS snapshots matching the provided regexp
re-encrypt them with the KMS key shared from the dst account
share the re-encrypted RDS snapshots with the dst account
remove old re-encrypted snapshots
dst_acc_copy_shared_rds_snapshot_to_local.py in the dst account:
list RDS snapshots shared by the src account with the dst account
copy the shared RDS snapshots from the src account to the dst account
remove old snapshot copies
fire an SNS message if the desired snapshot count != the actual count
and published them on GitHub: https://github.com/mvasilenko/dr-rds-share-snapshot
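The core of both scripts boils down to two RDS calls: copy_db_snapshot to re-encrypt with a shareable customer-managed key, and modify_db_snapshot_attribute to grant the backup account restore access. A condensed sketch of those steps (account IDs, ARNs, and names are placeholders; error handling, cleanup, and the SNS notification are omitted):

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # 1) Re-encrypt the automated snapshot with a customer-managed KMS key shared from the dst account.
    #    (Snapshots encrypted with the AWS-managed key cannot be shared directly.)
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier="rds:prod-db-2024-01-01-00-00",  # placeholder
        TargetDBSnapshotIdentifier="prod-db-2024-01-01-shareable",
        KmsKeyId="arn:aws:kms:us-east-1:111111111111:key/placeholder-key-id",
    )

    # 2) Share the re-encrypted snapshot with the backup (dst) account.
    rds.modify_db_snapshot_attribute(
        DBSnapshotIdentifier="prod-db-2024-01-01-shareable",
        AttributeName="restore",
        ValuesToAdd=["222222222222"],  # placeholder dst account ID
    )

    # In the dst account, copy_db_snapshot is then called on the shared snapshot ARN
    # (with a local KMS key) so the copy is fully owned by the backup account.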

Shouldn't I use Direct Connect to deliver the solution of collecting info from multi regions in AWS?

I came across the following question during my AWS practice. I have a different opinion and want to post it here for more discussion, as it addresses a very common need. Thanks.
http://jayendrapatil.com/aws-rds-replication-multi-az-read-replica/?unapproved=227863&moderation-hash=c9a071a3758c183b1cf03e51c44d2373#comment-227863
Your company has HQ in Tokyo and branch offices all over the world, and is using logistics software with a multi-regional deployment on AWS in Japan, Europe and the US. The logistics software has a 3-tier architecture and currently uses MySQL 5.6 for data persistence. Each region has deployed its own database. In the HQ region you run an hourly batch process reading data from every region to compute cross-regional reports that are sent by email to all offices. This batch process must be completed as fast as possible to quickly optimize logistics. How do you build the database architecture in order to meet the requirements?
A. For each regional deployment, use RDS MySQL with a master in the region and a read replica in the HQ region
B. For each regional deployment, use MySQL on EC2 with a master in the region and send hourly EBS snapshots to the HQ region
C. For each regional deployment, use RDS MySQL with a master in the region and send hourly RDS snapshots to the HQ region
D. For each regional deployment, use MySQL on EC2 with a master in the region and use S3 to copy data files hourly to the HQ region
E. Use Direct Connect to connect all regional MySQL deployments to the HQ region and reduce network latency for the batch process
I lean toward E, for these reasons:
Direct Connect provides dedicated bandwidth that bypasses the ISP, making the connection more private and, if needed, faster.
The question doesn't factor in cost here.
The initial setup time could be longer compared to the other options, but initial setup time should not be the point here. What is being asked is that "this batch process must be completed as fast as possible to quickly optimize logistics", so it is not about the initial setup; it is about implementing the right solution to deliver the "as fast as possible" service after the setup.
Hence I believe E is the best option for this need.
I am open to discussion if my understanding is wrong. Thank you.
E is not applicable. You cannot use Direct Connect to connect two VPCs; Direct Connect is used to connect a VPC to your on-premises network. The question asks about multi-regional AWS infrastructure without mentioning anything about the HQ not being hosted on AWS.
The easiest solution is A in my opinion.
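To illustrate option A, a cross-region read replica is created from the HQ region by pointing at the source instance's ARN; the hourly batch then reads each region's replica locally in Tokyo. A minimal boto3 sketch (ARN, identifiers, and instance class are placeholders; an encrypted source would additionally need a destination-region KmsKeyId):

    import boto3

    # Sketch: create a read replica in the HQ (Tokyo) region from a regional master in Europe.
    rds_hq = boto3.client("rds", region_name="ap-northeast-1")  # HQ region

    rds_hq.create_db_instance_read_replica(
        DBInstanceIdentifier="eu-master-hq-replica",  # placeholder replica name
        SourceDBInstanceIdentifier=(
            "arn:aws:rds:eu-west-1:111111111111:db:eu-master"  # placeholder source ARN
        ),
        DBInstanceClass="db.r5.large",
    )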

Distributed Database Access with AWS CloudFront

I have a MySQL database in AWS RDS and a web application in the Mumbai region. I want to access the web application from the USA with good latency/speed. I have used AWS CloudFront, but the application is still very slow.
Any suggestions?
Best,
Syed
How about a cross-region read replica of your MySQL database in the USA? If the majority of your database operations are read rather than write, this will give you a significant improvement in response time.
As a standard practice, it is recommended to keep databases and applications in the same region (and, where possible, the same Availability Zone) as the majority of your end users.
For now, you can create a cross-region replica, but you need to be ready for replica lag and data transfer charges. In the long term, plan to move your setup to N. Virginia or another US region.
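Replica lag on a cross-region read replica can be monitored through the ReplicaLag CloudWatch metric (seconds behind the source). A small sketch, assuming boto3 and a hypothetical replica identifier usa-replica:

    from datetime import datetime, timedelta, timezone
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # the replica's region

    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="ReplicaLag",   # seconds behind the source instance
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "usa-replica"}],
        StartTime=now - timedelta(hours=1),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )

    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Average"])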

What is the need of backup in Amazon RDS?

I have a doubt: if in AWS all server-side work is done by the cloud provider, then why do we need to store backups of the database?
I have read in the documentation that everything database-related is managed by the cloud service provider. So what is the need to store backups if the service provider does everything for me?
You maintain your own backups of RDS instances for the same reason that you maintain offsite backups of on-premise databases: disaster recovery. In your own data center, a fire or terrorism or natural disaster could destroy both your database and your local backups. In the cloud, these disasters tend to take on a different form.
If all of your data is in any one place, then you are vulnerable to data loss in a catastrophic event, which could take a number of forms: a serious defect in the cloud provider's infrastructure (unlikely with AWS, but nothing is impossible), human error, malicious employees, a compromise of your credentials, or any other of a number of statistically-unlikely events -- the low probability of which becomes irrelevant when it occurs.
If you value your data, you back it up independently and outside of its native environment.
Amazon RDS runs a database of your choice: MySQL, PostgreSQL, Oracle, SQL Server. These are normal databases and operate in the same way as a database you would run yourself.
You are correct that a managed solution takes care of installation, maintenance and hardware issues. Also, you can configure the system to automatically take backups of the data.
From Working With Backups - Amazon Relational Database Service:
Amazon RDS creates and saves automated backups of your DB instance. Amazon RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases.
Amazon RDS creates automated backups of your DB instance during the backup window of your DB instance. Amazon RDS saves the automated backups of your DB instance according to the backup retention period that you specify. If necessary, you can recover your database to any point in time during the backup retention period.
You also have the ability to trigger a manual backup. This is advisable, for example, before you do major work on the database, such as modifying schemas when upgrading an application that uses the database.
Bottom line: Amazon RDS can manage the backups for you. You do not need to manage the backup process yourself, but you can also trigger RDS backups manually.
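For completeness, the two knobs mentioned above look roughly like this in boto3: setting the automated backup retention period, and taking a manual snapshot before risky work (identifiers are placeholders):

    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    # Keep automated backups for 14 days (0 would disable them).
    rds.modify_db_instance(
        DBInstanceIdentifier="mydb",   # placeholder
        BackupRetentionPeriod=14,
        ApplyImmediately=True,
    )

    # Take a manual snapshot before a schema migration; it is kept until you delete it.
    rds.create_db_snapshot(
        DBInstanceIdentifier="mydb",
        DBSnapshotIdentifier="mydb-pre-upgrade",
    )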

AWS Multi-AZ verification

I modified my RDS instance to "Multi-AZ: Yes". My primary RDS instance is in us-west-1a, and for Multi-AZ the secondary zone is shown as us-west-1c. I want to verify that whatever changes I make on my primary database are copied to the Multi-AZ standby database quickly.
But I am not able to work out what endpoint URL I should use to log in to the Multi-AZ standby database. I am thinking the endpoint URL would be different from the primary's. Could you please help me with this?
You do not have access to the secondary RDS instance in a Multi-AZ configuration; you just have to trust that AWS is replicating data correctly. In a Multi-AZ configuration, RDS writes to both replicas synchronously: it does not return the write request until both replicas have written correctly.
To access a Multi-AZ instance, you issue your reads and writes to the single RDS endpoint. In case of an issue, AWS modifies the DNS entry for that endpoint to point to the secondary replica. So as long as you use the endpoint DNS record, and do not cache the IP address when accessing the RDS instance, the failover process should be transparent to you, with only a minute or so of "downtime".
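One way to see this behaviour is to resolve the endpoint DNS name before and after a forced failover: the name stays the same, but the address behind it changes. A tiny sketch with a placeholder endpoint:

    import socket

    # Placeholder endpoint; use the one shown on your instance's Connectivity tab.
    endpoint = "prod-db.abcdefghijkl.us-west-1.rds.amazonaws.com"

    # Resolve the endpoint now, and again after a failover, to see the target change.
    print(socket.gethostbyname(endpoint))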
Take a look at https://aws.amazon.com/rds/details/multi-az/. You don't typically interact with the replica(s) of RDS resources directly; AFAIK (I'm not an RDS expert) you can't do what you're describing. The idea is that RDS does that for you, automatically keeping a consistent replica in a different AZ and providing you with a consistent DNS endpoint.
Although the OP asks how to "verify data is copied quickly", Google pointed me here for "verify a multi-AZ RDS deploy", so I'll share what I found in the hope that it's halfway helpful.
In the RDS console, there is an option on reboot to Reboot from failover, which doesn't appear on a standard deploy.
Standard deploys not having this option was a small but satisfying indication that Multi-AZ was acting as expected.
Source (and generally a pretty good read)
Q: Can I initiate a "forced failover" for my Multi-AZ DB instance deployment?
Amazon RDS will automatically fail over without user intervention under a variety of failure conditions. In addition, Amazon RDS provides an option to initiate a failover when rebooting your instance. You can access this feature via the AWS Management Console or when using the RebootDBInstance API call.
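The forced failover mentioned in the FAQ maps to the ForceFailover flag on RebootDBInstance. A minimal boto3 sketch with a placeholder instance identifier:

    import boto3

    rds = boto3.client("rds", region_name="us-west-1")

    # Reboot the Multi-AZ instance and fail over to the standby in the other AZ.
    rds.reboot_db_instance(
        DBInstanceIdentifier="prod-db",  # placeholder
        ForceFailover=True,              # only valid for Multi-AZ deployments
    )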