Optimizing latency between application server (EC2) and RDS

Optimizing latency between application server (EC2) and RDS - amazon-web-services

here's how the story goes.
We started transforming a monolith, single-machine, e-commerce application (Apache/PHP) to cloud infrastructure. Obviously, the application and the database (MySQL) were on the same machine.
We decided to move to AWS. And as the first step of transformation, we decided to split the database and application. Hosting application on a c4.xlarge machine. And hosting database to RDS Aurora MySQL on a db.r5.large machine, with default options.
This setup performed well. Especially the database performance went up high.
Unfortunately, when the traffic spiked up, we started experiencing long response times. Looked like RDS, although being really fast for executing queries, wasn't returning results fast enough over the network to the EC2 machine.
So that was our conclusion after an in-depth analysis of the setup including Apache/MySQL/PHP tuning parameters. The delayed response time was definitely due to the network latency between EC2 and RDS/Aurora machine, both machines being in the same region.
Before adding additional resources (ex: ElastiCache etc) we'd first like to look into any default configuration we can play around with to solve this problem.
What do you think we missed there?

One of the bigest strength with the cloud is the scalability and you should always design your application to utilise it and it sounds like your RDS instance is getting chocked due to nr of request more than the process time for the queries. So rather go more small instances with load balancing than one big doing all the job. And with Load Balancers you will get away from a singel point of failure due to you can have replicas of your database and they can even be placed in different AZ.
Here is a blogpost you can read on the topic:
https://aws.amazon.com/blogs/database/scaling-your-amazon-rds-instance-vertically-and-horizontally/
Good luck in your aws journey.

The Best answer to your question is using read replicas, but remember only your read requests could be sent to your read replicas so you would need to design your application that way
Also for some cost savings, you should try aurora serverless
One more option is passing traffic between ec2 and rds through a private network rather than using the public internet to connect your ec2 to rds that can be one of the mistakes that might be happening

Related

EC2, Webserver, and MySQL

I am about to launch an iOS app that will be communicating with my custom REST API. Right now I am running a single EC2 t2.micro instance running an Apache web server with MySQLi. Before I go ahead and launch it for the public, I want to hear what proper steps should be taken regarding the following.
Should I run two separate EC2 instances? One only for the web server and the other to handle only the database?
How should I approach setting up the database? Should I still use MySQLi or should I start using Amazon's RDS?
In relationship to number two, when the database and/or web server runs out of space, how is this issue handled so that it seamlessly adds space to allow the database/web server to continue growth? I also read something regarding auto-scale.
I will be expecting many requests per minute to my web server and want to take precaution.

The answer to these questions largely depends on the requirements of your application, your budget, and on what you decide to manage vs. what you'd prefer to allow AWS to manage. However, I'll answer these as best I can.
1) Yes. Separating the database from the web server (that is, 2 different EC2 instances) makes sense for a lot of reasons. This will allow you to tailor resources like memory, CPU, etc. to each layer of your application separately. You do not want your web and database competing for the same resources. Additionally, an issue that forces you to take down one (web or database) will not force you to also take down the other. If your database lives on one of the web servers and you need to perform maintenance, your app will effectively become offline, since down goes your database as you perform updates. Also, ideally you would protect your database server within a private subnet in your VPC. If you have the web and database on the same server, they will both be in a public subnet, since you're web will require access to an internet gateway.
2) Depends. If you want to maintain total control of the database server, than use an EC2 instance where you retain operating system control. If you want to take advantage of features like Multi-AZ for high availability or allowing AWS to manage things like updates for you, RDS can be a great option. Cost also plays a role. For things like read-replicas and Multi-AZ, you will pay more, but you are purchasing performance and high availability. Thus, depends on your requirements. You can find the features of RDS here: RDS Product Details
3) For anything running on an EC2 instance (database or web) or if you decide to use RDS, you may provision and attach additional storage volumes as necessary. The type of storage you select will depend on the performance requirements, your budget, and the kind of workload you expect your database to face. Amazon provides the storage options available to you as well as a section for adding more storage here: RDS Storage Options
If you are worried about too many requests overwhelming your EC2 t2.micro instance, consider creating an ELB load balancer and setting up an auto-scaling group which will allow you to expand your capacity as necessary while distributing traffic such that no one server gets overwhelmed.

Migrating from AWS RDS to an AWS EC2 running MySQL

Yes, fellow SOrs, I'm doing it backwards. I tried an AWS RDS but the CPU seems to be spiking so often that I need the flexibility of an EC2 to run some fine tuning. I'm not a MySQL expert, so I'm asking:
How can I create a setup on the EC2 so that it reads and replicates my RDS?
Ideally I'd do the switch in real time via DNS but first I need the EC2 to act like a clone of the RDS updating with any new data happening between now and the actual migration period.
Any pointers are much appreciated. Thanks!

Why can't you use mysql-tuner with RDS?
You shouldn't need to run sysbench, since Amazon handles OS level tuning for you on RDS
Aurora is a drop-in replacement for MySQL and will scale better than any MySQL cluster you could setup on EC2
You should be addressing why your Wordpress instance is hammering the database so much instead of trying to optimize the database.
You should put a CDN in front of your Wordpress site and cache as much as you can to reduce the load on both your web server and database server. It looks like there are also solutions out there for using Redis to cache data so that Wordpress doesn't have to constantly go back to MySQL for data.
Amazon provides the CloudFront CDN, but I would also recommend looking into CloudFlare.
Honestly, given your number of concurrent users, unless you have tons of dynamic constantly changing content, you should be able to run your entire site on a t2.micro with CloudFlare in front of it with cache everything enabled.

I'd like to offer an update:
Mark B's input has been extremely valuable as I have discovered that I can run mysql tuner remotely and touch the RDS. Therefore there was no need to migrate after all.
The RDS CPU spikes were due to a large amount of non-INDEX JOINs.
I have added indexes and the results are fantastic:

Is Amazon EC Redis an effective caching solution or not?

As you may have noticed Amazon has announced a new feature for its own ElasticCache product, which is supporting Redis.
We are currently using one EC2 instance for our Redis (just queuing for now) and we've decided to use Redis for other upcoming features such as commenting system, discussion, real-time messaging, real-time user tracking and analytics, etc.
We don't mind to run more and bigger EC2 instances, but should we invest in ElasticCache (Redis) and move into it from the beginning now that we haven't started yet or it's too soon to see the results, benchmarks, and downside? Or it's even limited in some prospectives compare to having your own Redis on your own instances?
Update 1:
Let me to be detailed of what we are going to do with Redis. Probably using queuing as we have been doing it by Resque. Not sure if ElasticCache let us do any Pub/Sub but if it does we would like to do that as well. And of course atomic and high-level operations.
Update2:
There is a new video by Senior Product Manger of Amazon Elastic Cache posted a week ago that happened during AWS reInvent Conference. Because it is new he talks about Redis too!
http://www.youtube.com/watch?v=odMmdPBV8hM

I would say that if Redis is an effective caching solution for you, then ElasticCache will work for you - you're simply paying AWS to manage the back end and plumbing for you. Performance may be marginally slower - you have to have a DNS lookup for requests, vs having redis running in a VPC where you can access a private IP address directly - but even accessing it from an EC2 instance should resolve the public DNS name to the internal private IP. And of course you can launch your EC node in your VPC.
There are some complications when running a memcached cluster - you will need to use the amazon client to make sure your code connects to the correct node - but I do not believe as of Dec 2013 that this is needed for redis.
If you're implementing a queue on top of redis, have you looked at SQS to see if it will work for you?

Understanding Amazon offerings

I am working on a project and am at a point where the POC is done and now want to move towards a real product. I am trying to understand the Amazon cloud offerings just to see if I need to be aware of them at development time. I have a bunch of questions that I cannot get answered from the Amazon site. Its probably because I am new to the whole web services thing and have never hosted a site before. I am hoping someone out here will explain this to me like I am a C programmer :)
I see amazon has a bunch of offerings -
EC2
Elastic Block Store
Simple DB
AuotScaling
Elastic Load Balancing
I understand EC2 is virtual server instances that I can use and these could come pre-loaded with what I want (say Apache + python). I have the following questions -
If I want a custom instance of something (like say a custom apache module I wrote for my project). Can I create a server instance using the exact modules and make it the default the next time I create a new instance or in Autoscaling?
Do I get an IP Address to access this? Can I set my own hostname to it? I mean do I get a DNS record? Or is it what Elastic IP is?
How do I access it from the outside? SSH? Remote Desktop? Or is it entirely up to how I configure the instance?
What do they mean by Inter-Region or Intra-Region data transfer? What is data transfer to begin with? Is it just people using my instance? So if I go live with it that will be the cost I have to pay for people using it?
What is the difference between AutoScaling and Elastic Load Balancing?
What is Elastic Block Store? Is it storage? If so do I have to worry about backups or do they take care of it?
About the Simple DB -
It looks like the interface to use this is different to my regular SQL calls. Am I correct?
If so the whole development needs to be tailored specifically for Amazon. Which kind of sucks. Is there a better alternative?
Do I get data backups or do I have to worry about it myself?
Will I be able to connect to the DB using regular tools to inspect the DB (during or afte development). Or do I get other tools made by Amazon for it?
What about security? The DB is obviously somewhere in the cloud farm away from the EC2 instance. My DB password is going over the wire and so is all my data totally unencrypted. Don't I have to worry about that? The question comes up only because I don't own any of the hardware.
I really hope some one points me in the right direction here.
Thanks for taking the time to read.
P

I just went through the question and here I tried to answer few of them,
1) AWS EC2 instances doesnt publish pre-configured instances, in fact its configured by the developers and made it publicly available to the users so that they can use it. One can any one of those instances or you can just opt for what ever OS you want which is raw and provision it accordingly and create a snap shot of it so that you can use it for autos caling.The snap shot becomes the base AMI in your case.
2) Every instance you boot will have a public DNS attach to it, you can use the public DNS to connect to that instance using ssh if your are a linux user or using putty if you are a windows users. Apart from that, you can also attach a elastic IP which comes with a cost will is like peanuts and attach it to the instance and access your instance through the elastic IP and you can either map the public DNS or elastic ip to map to a website by adding a A record or Cname respectively.
3)AWS owns databases in the different parts of the world. For example you deploy your application depending upon your customer base, if you target customers are based out of India, the nearest region available is Singapore which is called as ap-southeast-1 by AWS. Each region will have multiple availability zones, example ap-southeast-1a and ap-southeast-1b, which are two different databases and geographically part. Intre region means from ap-southeast-1a to ap-southeast-1b. Inter Region means, from ap-southeast-1 to us-east-1 which is Northern Virginia Data centre. AWS charges from in coming and out going bandwidth, trust me its nothing.
They chargge 1/8th of a cent per GB. Its a thing to even think about it.
4)Elastic Load balancer is cluster which divides the load equally to all your regions across availability zones (if you are running in multi AZ) ELB sits on top the AWS EC2 instances and monitors the instance health periodically and enables auto scaling
5) To help you understand what is autoscaling please go through this document http://aws.amazon.com/autoscaling/
6)Elastic Block store or EBS are like hard disk which is a persistent data storage which can be attached to your instance.Regarding back up yes dependents upon your use case. I do backups of EBS periodically.
7)Simple Db now renamed as dynamo DB is nosql DB, I hope you understand what is nosql db, its a non RDMS db systems. Please read some documentation to understand what is nosql db is.
8)If you have mysql or oracle db you can opt for RDS, please read the documents.
9)I personally feel you are newbie to the entire cloud eco system, you need to understand what exactly cloud does first.
10)You dont have to make large number of changes to development as such, just make sure it works fine in your local box, it can be deployed to cloud with out much ado.
11) You dont have to use any extra tool for that, change the database end point to RDS(if your use it) or else install mysql in your ec2 instance and connect to the local db which resides in the ec2 instance and connect to it,which is as simple as your development mode.
12)You dont have to worry about any security issues aws, it is secured. Dont follow the myths, I am have been using aws since 3 years running I dont even know remember how many applications, like(e-commerce,m-commerce,social media apps) I never faced any kind of security issues and also aws allows to set your security how ever you want.
Go ahead, happy coding. Contact me if you have any problem.

The answer above is a good summary on AWS. Just wanted to add
AWS offers full data center, so it depends what you are trying to achieve. For starters you will need,
EC2 - This is your server, it comes with instance storage, which will be lost on restart
EBS - Your mounted storage, the data is persisted across reboots
S3 - Provides storage (RESTful API's on top, the cost is usage based rather than "provisioned" as in EBS)
Databases - can start with Amazon RDS, which provides managed database services, you can chose between various available databases. You can also install your own database using EC2 + EBS, you will have to take care of managing the database yourself.
Elastic IP: Public facing IP address, you can point your DNS server to this.
One great tool to calculate the pricing,
http://calculator.s3.amazonaws.com/calc5.html

Some other services to take in account are:
VPC (Virtual Private Cloud). This is your own private network. You can define subnets, route tables and internet gateways there. I would strongly recommend to use VPC for any serious deployment of more than one instance.
Glacier - this will replace your tape library to storing backups.
Cloud Formation - great tool for deployment and automation of instances.

Creating External Monitoring for a web app

The company I work for built and hosts a web app used by our customers and I am interested in creating some kind of external monitoring page (similar to trust.salesforce.com) that users can go to to see the current state of our servers/app. I know there are tons of different 'monitoring' services out there but I want to create the service myself, to have complete control and customization. Obviously, the service would have to be hosted in a different location and data center than the app itself. One thing I am concerned about is that if I just choose a different host in a different location, if that host goes down for any reason (power failure, server failure, or even ISP failure) the monitoring software is down. For this reason, I am thinking of hosting the monitoring app on an amazon EC2 instance. With their elastic IP feature, if for some reason the data center or point where the instance is running fails, I can just create a duplicate instance with the same data (but in a different location) and everything would work fine still.
Does this sound like a feasible plan? For even more security, I was thinking of creating 2 instances in different locations and monitoring from both of them. If one instance fails, the other would still be up. Obviously, one instance has to act as the actual web host for the monitoring page. Is it possible programatically for one instance to switch the elastic IP over to itself if it detects the other instance has failed for any reason?
I know there's a lot of different things involved in this question, I'm just looking for feedback regarding ANY of it...
If you've made it this far, thanks for taking the time to read this!

What you are talking about is a complicated solution for a complicated issue. I think you are on the right track with using something like Amazon's EC2 to reduce the chance of your monitoring app of going down. Also, you could develop it yourself but there are a great deal of free monitoring solutions out there like Nagios that will do everything you are asking for and is highly extensible so you can spend your time making it look and feel like you want while leaving the more complicated portions under the hood to software that is tried and tested. The worst thing would be for you to have a bug in your software that shows something as up when it is actually down. Based off of what you are talking about doing, I would assume that would be a huge issue.

Instead of using an elastic ip - which is only assigned to one instance, consider using the Elastic Load Balancer http://aws.amazon.com/elasticloadbalancing/ which then can route over instances in any of the availability zones. This way AWS manages taking instances in/out of the pool if they become unavailable for some reason and you do not have to spend time 'moving' the Elastic IP around. It is then easy to assign your monitoring cname to the ELB hostname.
I think RandomBen's idea of using Nagios on your instances is a good one because then you do not have to recreate all the functionality in Nagios. You then spend development time setting up the system and customizing the look and feel to your needs.
Also, if you can use MySQL, you should consider using RDS http://aws.amazon.com/rds/ although you will need to pay transfer fees if you have servers outside of a region accessing the RDS in another region.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js