Distributed Database Access in AWS CloudFront

I have a MySQL database in AWS RDS and a web application in the Mumbai region. I want to access the web application from the USA with good latency/speed. I have used AWS CloudFront, but the application is still very slow.
Any suggestions?
Best,
Syed

How about a cross-region read replica of your MySQL database in the USA? If the majority of your database operations are reads rather than writes, this will give you a significant improvement in response time.
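If you'd rather script this than click through the console, a cross-region read replica can be created with boto3 along these lines (a minimal sketch; the regions, account number, and instance identifiers below are made-up placeholders):

```python
import boto3

# Hypothetical identifiers; replace with your own.
SOURCE_REGION = "ap-south-1"   # Mumbai, where the master lives
REPLICA_REGION = "us-east-1"   # N. Virginia, close to US users
SOURCE_ARN = "arn:aws:rds:ap-south-1:123456789012:db:my-mumbai-db"

# The replica is created by calling RDS in the *destination* region,
# pointing at the source instance's ARN.
rds = boto3.client("rds", region_name=REPLICA_REGION)
response = rds.create_db_instance_read_replica(
    DBInstanceIdentifier="my-usa-read-replica",
    SourceDBInstanceIdentifier=SOURCE_ARN,
    SourceRegion=SOURCE_REGION,  # lets boto3 presign the cross-region call
)
print(response["DBInstance"]["DBInstanceStatus"])
```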

As a standard practice, it is recommended to keep the database and the application in the same region (and, where possible, the same Availability Zone) as the majority of your end users.
For now, you can create a cross-region replica, but be prepared for replica lag and data transfer charges. In the long term, plan to move your setup to N. Virginia or another US region.


What is the cheapest way to allow others to download a dataset I have?

I have some datasets (possibly up to 10 GB zipped altogether) for my machine learning applications.
In order to expose these datasets to others, I believe I have to host a server and let others download them over the network.
What is the cheapest server I can use for this? (I checked the AWS free tier; can it be used?)
Do I need to write a web server, or is there a premade tool I can use for my use case?
You haven't indicated how much data will be downloaded (GB/month), and that's important because you pay for data transfer out to the internet (about $0.09 per GB) beyond an initial free amount (1 GB/month, I believe, but check whether the free tier offers more). That applies to both S3 and EC2.
That said, I'd consider a few options.
1. Store the files in S3 and serve them from S3 via CloudFront. This may be cheaper than running a server 24x7 to host and serve the files (see the presigned-URL sketch after this list).
2. Run a small EC2 server that fits into the free tier usage plan, running a web or FTP server, serving up your files.
3. Similar to #1, but you can also configure requester pays for S3 downloads. This option requires your downloaders to have AWS credentials and for you to manage their access, so it may not be feasible in your case.
4. Create an EBS volume containing your data, take a snapshot of that volume, and share the snapshot with other AWS accounts, then shut down your EC2 instance. This option requires your users to be AWS account holders and to share their AWS account numbers with you, so it may not be feasible in your case.
5. Use AWS SFTP to serve up data stored in S3.
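For option 1, note that the bucket doesn't have to be public: you can hand out time-limited presigned URLs so people can download without AWS credentials. A minimal boto3 sketch (the bucket and key names are hypothetical):

```python
import boto3

# Hypothetical bucket/key names, used for illustration only.
BUCKET = "my-ml-datasets"
KEY = "datasets/images-v1.zip"

s3 = boto3.client("s3")

# Upload the zipped dataset once...
s3.upload_file("images-v1.zip", BUCKET, KEY)

# ...then hand out time-limited download links instead of running a server.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": BUCKET, "Key": KEY},
    ExpiresIn=7 * 24 * 3600,  # link valid for one week (the SigV4 maximum)
)
print(url)
```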

Shouldn't I use Direct Connect for a solution that collects info from multiple regions in AWS?

I came across the following question during my AWS practice. I have a different opinion and want to post it here for more discussion, as it addresses a very common need. Thanks.
http://jayendrapatil.com/aws-rds-replication-multi-az-read-replica/?unapproved=227863&moderation-hash=c9a071a3758c183b1cf03e51c44d2373#comment-227863
Your company has HQ in Tokyo and branch offices all over the world, and is using logistics software with a multi-regional deployment on AWS in Japan, Europe and the US. The logistics software has a 3-tier architecture and currently uses MySQL 5.6 for data persistence. Each region has deployed its own database. In the HQ region you run an hourly batch process that reads data from every region to compute cross-regional reports, which are sent by email to all offices. This batch process must be completed as fast as possible to quickly optimize logistics. How do you build the database architecture in order to meet the requirements?
A. For each regional deployment, use RDS MySQL with a master in the region and a read replica in the HQ region
B. For each regional deployment, use MySQL on EC2 with a master in the region and send hourly EBS snapshots to the HQ region
C. For each regional deployment, use RDS MySQL with a master in the region and send hourly RDS snapshots to the HQ region
D. For each regional deployment, use MySQL on EC2 with a master in the region and use S3 to copy data files hourly to the HQ region
E. Use Direct Connect to connect all regional MySQL deployments to the HQ region and reduce network latency for the batch process
I lean toward E, for these reasons:
Direct Connect provides dedicated bandwidth that bypasses the public internet, so it is more private and, if needed, faster.
The question doesn't factor in cost here.
The initial setup time could be longer compared to the other options, but setup time should not be the point here. What is being asked is that "this batch process must be completed as fast as possible to quickly optimize logistics", so it is not about the initial setup; it is about implementing the right solution to deliver the "as fast as possible" service after the setup.
Hence I believe E is the best option for the need.
I am open to discussion if my understanding is wrong. Thank you.
E is not applicable. You cannot use Direct Connect to connect two VPCs; Direct Connect is used to connect a VPC to your own premises. The question asks about a multi-regional AWS infrastructure and says nothing about the HQ not being hosted on AWS.
The easiest solution is A, in my opinion.
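To illustrate why A satisfies the "as fast as possible" requirement: with read replicas in the HQ region, the hourly batch only ever talks to Tokyo-local endpoints. A rough sketch using the pymysql driver (the endpoints, credentials, table, and query are all made-up placeholders):

```python
import pymysql

# Hypothetical endpoints of the read replicas created in the HQ (Tokyo)
# region, one per regional master. All reads stay inside the HQ region.
REPLICA_ENDPOINTS = [
    "replica-japan.xxxxxxxx.ap-northeast-1.rds.amazonaws.com",
    "replica-europe.xxxxxxxx.ap-northeast-1.rds.amazonaws.com",
    "replica-us.xxxxxxxx.ap-northeast-1.rds.amazonaws.com",
]

totals = {}
for host in REPLICA_ENDPOINTS:
    conn = pymysql.connect(host=host, user="report",
                           password="...", database="logistics")
    try:
        with conn.cursor() as cur:
            # Illustrative aggregate query for the cross-regional report.
            cur.execute("SELECT region, SUM(shipped) FROM shipments GROUP BY region")
            for region, shipped in cur.fetchall():
                totals[region] = totals.get(region, 0) + shipped
    finally:
        conn.close()

print(totals)
```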

What is the need for backups in Amazon RDS?

I have a doubt: if in AWS all server-side work is done by the cloud provider, then why do we store backups of the database?
I have read in the documentation that everything related to the database is managed by the cloud service provider. So what is the need to store backups if the service provider does everything for me?
You maintain your own backups of RDS instances for the same reason that you maintain offsite backups of on-premises databases: disaster recovery. In your own data center, a fire, terrorism, or a natural disaster could destroy both your database and your local backups. In the cloud, these disasters tend to take on a different form.
If all of your data is in any one place, then you are vulnerable to data loss in a catastrophic event, which could take a number of forms: a serious defect in the cloud provider's infrastructure (unlikely with AWS, but nothing is impossible), human error, malicious employees, a compromise of your credentials, or any other of a number of statistically-unlikely events -- the low probability of which becomes irrelevant when it occurs.
If you value your data, you back it up independently and outside of its native environment.
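As a concrete illustration (one approach among several), an independent backup could be as simple as a scheduled job that dumps the database and copies the dump to an S3 bucket in a different region or account. All hostnames and bucket names below are made-up placeholders:

```python
import subprocess
import boto3

# Placeholder values, for illustration only.
DB_HOST = "mydb.xxxxxxxx.us-east-1.rds.amazonaws.com"
DUMP_FILE = "/tmp/mydb.sql"
BACKUP_BUCKET = "my-offsite-db-backups"  # bucket in another region/account

# Dump the database with the standard mysqldump client
# (credentials assumed to come from ~/.my.cnf).
with open(DUMP_FILE, "w") as f:
    subprocess.run(
        ["mysqldump", "-h", DB_HOST, "-u", "backup_user", "mydb"],
        stdout=f,
        check=True,
    )

# Copy the dump somewhere independent of the RDS instance itself.
s3 = boto3.client("s3", region_name="eu-west-1")
s3.upload_file(DUMP_FILE, BACKUP_BUCKET, "daily/mydb.sql")
```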
Amazon RDS runs a database of your choice: MySQL, PostgreSQL, Oracle, SQL Server. These are normal databases and operate in the same way as a database you would run yourself.
You are correct that a managed solution takes care of installation, maintenance and hardware issues. Also, you can configure the system to automatically take backups of the data.
From Working With Backups - Amazon Relational Database Service:
Amazon RDS creates and saves automated backups of your DB instance. Amazon RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not just individual databases.
Amazon RDS creates automated backups of your DB instance during the backup window of your DB instance. Amazon RDS saves the automated backups of your DB instance according to the backup retention period that you specify. If necessary, you can recover your database to any point in time during the backup retention period.
You also have the ability to trigger a manual backup. This is advisable, for example, before you do major work on the database, such as modifying schemas when upgrading an application that uses the database.
Bottom line: Amazon RDS can manage the backups for you. You do not need to manage the backup process yourself, but you can also trigger RDS backups manually.
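For instance, a manual snapshot before a risky schema migration can be triggered from the console, the CLI, or code. A minimal boto3 sketch (the identifiers are placeholders):

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Take a manual snapshot before major work (e.g. a schema migration).
# Unlike automated backups, manual snapshots are kept until you delete them.
rds.create_db_snapshot(
    DBSnapshotIdentifier="mydb-before-v2-migration",  # placeholder name
    DBInstanceIdentifier="mydb",                      # placeholder instance
)

# Optionally wait until the snapshot is available before proceeding.
waiter = rds.get_waiter("db_snapshot_available")
waiter.wait(DBSnapshotIdentifier="mydb-before-v2-migration")
```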

AWS database operations and hourly pricing

I need to host an accounting desktop application on a Windows server. The SQL database of this application will be used as a source for an ecommerce website, so there will be fairly frequent read/write operations on this database (from a different Linux server). Is using AWS a good idea here? Do read/write database operations count as usage? Meaning, if I have a cron job that reads the DB every 5 minutes, does that mean I will be billed for 24/7 usage?
Thanks.
With a database PaaS, i.e. RDS (as with EC2, so VM-based), you pay per hour that the instance is available, whether you use it or not.
Answering your question: it doesn't matter whether you query the DB every 5 minutes, every second, or every hour. You will pay the same amount for the database (transfer costs are in most cases negligible compared to EC2/RDS costs), i.e. for the availability you need. If you need it available 24x7, you will pay for 24x7. If you only need the database during specific hours of the day (or only Mon-Fri), you can automate starting and stopping it (e.g. with CloudWatch Events + AWS Lambda, as sketched below) to lower your cost.
But then I guess if it's ecommerce, you'll need the database available 24x7 anyway :)
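The start/stop automation mentioned above can be a small Lambda function invoked by two scheduled CloudWatch Events rules, one passing {"action": "start"} in the morning and one passing {"action": "stop"} in the evening. A rough sketch (the instance identifier is a placeholder):

```python
import boto3

rds = boto3.client("rds")
DB_INSTANCE = "accounting-db"  # placeholder identifier

def lambda_handler(event, context):
    # Two scheduled rules invoke this function with different payloads.
    if event.get("action") == "start":
        rds.start_db_instance(DBInstanceIdentifier=DB_INSTANCE)
    elif event.get("action") == "stop":
        rds.stop_db_instance(DBInstanceIdentifier=DB_INSTANCE)
```

Note that RDS automatically restarts an instance that has been stopped for seven days, so this pattern suits daily or weekday off-hours windows rather than long-term shutdown.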
It depends on whether you want to set up your own SQL Server on an EC2 instance or use AWS RDS.
In the case of the former, your SQL Server is like any other application running on an EC2 instance, and the cost is simply a factor of EC2 pricing.
In the case of the latter, refer to the AWS RDS pricing for SQL Server.

SQL Backups to S3

I have the following Amazon EC2 configuration:
Prod Web & DB server (Virginia)
Web & DB server (Oregon)
I would like to store my SQL backups in S3 so that they are available to be restored to my standby server in case the Virginia region goes down for any period of time (which has been known to happen :).
Here are the 2 regions I am considering for my S3 bucket:
US Standard
Oregon
I first attempted to specify Oregon. However, when I do that, I am unable (for some reason) to upload to that bucket from my Virginia instance. On the other hand, I am worried that if I specify US Standard, my S3 bucket will not be available in the event Virginia becomes unavailable.
Does anyone have any recommendations for overcoming the issues with either of these scenarios?
Thanks!
My recommendation is to use RDS (Relational Database Service), which is basically a managed RDBMS service for MySQL (or MS-SQL or Oracle). It takes care of backup and restore for the DB.
With MySQL it has the option of an automatic standby in a different Availability Zone within the region. When you use the "Multi-AZ" option, it creates the standby and replicates to it synchronously. This way your failover will be very close to real time.
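If you go the RDS route, Multi-AZ is just a flag at instance creation time. A minimal boto3 sketch (all identifiers, sizes, and credentials are illustrative placeholders):

```python
import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Create a MySQL instance with a synchronous standby in another AZ.
rds.create_db_instance(
    DBInstanceIdentifier="prod-db",      # placeholder name
    Engine="mysql",
    DBInstanceClass="db.m5.large",       # illustrative size
    AllocatedStorage=100,                # GiB
    MasterUsername="admin",
    MasterUserPassword="CHANGE_ME",      # placeholder; use a secrets store in practice
    MultiAZ=True,                        # provisions the standby and handles failover
)
```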