Is Amazon EC Redis an effective caching solution or not?


As you may have noticed, Amazon has announced a new feature for its ElastiCache product: support for Redis.
We are currently using one EC2 instance for our Redis (just queuing for now), and we've decided to use Redis for other upcoming features such as a commenting system, discussions, real-time messaging, real-time user tracking and analytics, etc.
We don't mind running more and bigger EC2 instances, but should we invest in ElastiCache (Redis) and move to it from the beginning, given that we haven't started yet? Or is it too soon to see results, benchmarks, and downsides? Or is it limited in some respects compared to running your own Redis on your own instances?
Update 1:
Let me give more detail on what we are going to do with Redis. Probably queuing, as we have been doing with Resque. I'm not sure whether ElastiCache lets us do Pub/Sub, but if it does we would like to do that as well. And of course atomic and high-level operations.
Update 2:
There is a new video by the Senior Product Manager of Amazon ElastiCache, posted a week ago and recorded at the AWS re:Invent conference. Because it is new, he talks about Redis too!
http://www.youtube.com/watch?v=odMmdPBV8hM

I would say that if Redis is an effective caching solution for you, then ElastiCache will work for you: you're simply paying AWS to manage the back end and plumbing. Performance may be marginally slower, because requests need a DNS lookup, whereas with Redis running in your own VPC you can connect to a private IP address directly. Even so, when you access it from an EC2 instance, the public DNS name should resolve to the internal private IP. And of course you can launch your ElastiCache node in your VPC.
There are some complications when running a memcached cluster (you need to use the Amazon client to make sure your code connects to the correct node), but I do not believe, as of Dec 2013, that this is needed for Redis.
If you're implementing a queue on top of Redis, have you looked at SQS to see if it will work for you?
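If you want to check whether the DNS lookup mentioned above matters for your workload, you can time name resolution on its own. This is only a rough sketch using the Python standard library; the function name is made up for illustration, and 6379 is simply the default Redis port.

```python
import socket
import time


def time_dns_lookup(host, port=6379):
    """Resolve host once and return (seconds_taken, sorted list of IP addresses)."""
    start = time.perf_counter()
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    elapsed = time.perf_counter() - start
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return elapsed, addresses
```

Calling `time_dns_lookup("<your cache endpoint>")` from an EC2 instance also shows whether the name resolves to a private address. Clients that keep connections open only pay this cost when they connect, not on every command.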

Related

Optimizing latency between application server (EC2) and RDS

Here's how the story goes.
We started transforming a monolithic, single-machine e-commerce application (Apache/PHP) to cloud infrastructure. Obviously, the application and the database (MySQL) were on the same machine.
We decided to move to AWS and, as the first step of the transformation, to split the database from the application: the application is hosted on a c4.xlarge machine, and the database on RDS Aurora MySQL on a db.r5.large machine with default options.
This setup performed well. Database performance in particular improved a lot.
Unfortunately, when traffic spiked, we started experiencing long response times. It looked like RDS, although really fast at executing queries, wasn't returning results fast enough over the network to the EC2 machine.
That was our conclusion after an in-depth analysis of the setup, including Apache/MySQL/PHP tuning parameters. The delayed response time was definitely due to network latency between the EC2 and RDS/Aurora machines, both being in the same region.
Before adding additional resources (e.g. ElastiCache), we'd first like to look into any default configuration we can adjust to solve this problem.
What do you think we missed there?

One of the biggest strengths of the cloud is scalability, and you should always design your application to use it. It sounds like your RDS instance is getting choked by the number of requests rather than by the processing time of the queries. So go for several small instances with load balancing rather than one big instance doing all the work. With load balancers you also get away from a single point of failure, because you can have replicas of your database, and they can even be placed in different AZs.
Here is a blog post you can read on the topic:
https://aws.amazon.com/blogs/database/scaling-your-amazon-rds-instance-vertically-and-horizontally/
Good luck in your AWS journey.

The best answer to your question is to use read replicas, but remember that only read requests can be sent to read replicas, so you would need to design your application that way.
Also, for some cost savings, you should try Aurora Serverless.
One more option is to pass traffic between EC2 and RDS through a private network rather than over the public internet; connecting to RDS over the public internet could be one of the mistakes here.
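To illustrate what "design your application that way" can mean, here is a minimal sketch of read/write splitting in Python. The class name and the endpoint names are hypothetical, and the prefix check is a deliberate simplification: a real application would also keep reads inside a transaction on the primary and allow for replica lag.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary endpoint and spread reads over replicas."""

    READ_PREFIXES = ("select", "show", "explain")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        """Pick the endpoint a statement should be sent to."""
        statement = sql.lstrip().lower()
        if statement.startswith(self.READ_PREFIXES):
            return next(self._replicas)
        return self.primary
```

For example, `ReadWriteRouter("primary.example", ["replica-1.example"]).endpoint_for("SELECT 1")` returns the replica, while an `UPDATE` goes to the primary. Aurora also offers a reader endpoint that balances connections across replicas, which can replace the round-robin part.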

EC2, Webserver, and MySQL

I am about to launch an iOS app that will be communicating with my custom REST API. Right now I am running a single EC2 t2.micro instance running an Apache web server with MySQLi. Before I go ahead and launch it for the public, I want to hear what proper steps should be taken regarding the following.
1) Should I run two separate EC2 instances? One only for the web server and the other to handle only the database?
2) How should I approach setting up the database? Should I still use MySQLi or should I start using Amazon's RDS?
3) In relation to number two, when the database and/or web server runs out of space, how is this handled so that space is added seamlessly and the database/web server can keep growing? I also read something regarding auto-scale.
I am expecting many requests per minute to my web server and want to take precautions.

The answer to these questions largely depends on the requirements of your application, your budget, and on what you decide to manage vs. what you'd prefer to allow AWS to manage. However, I'll answer these as best I can.
1) Yes. Separating the database from the web server (that is, 2 different EC2 instances) makes sense for a lot of reasons. This will allow you to tailor resources like memory, CPU, etc. to each layer of your application separately. You do not want your web server and database competing for the same resources. Additionally, an issue that forces you to take down one (web or database) will not force you to also take down the other. If your database lives on one of the web servers and you need to perform maintenance, your app will effectively go offline, since your database goes down while you perform updates. Also, ideally you would protect your database server within a private subnet in your VPC. If you have the web server and database on the same machine, they will both be in a public subnet, since your web server will require access to an internet gateway.
2) Depends. If you want to maintain total control of the database server, then use an EC2 instance where you retain operating system control. If you want to take advantage of features like Multi-AZ for high availability, or let AWS manage things like updates for you, RDS can be a great option. Cost also plays a role. For things like read replicas and Multi-AZ you will pay more, but you are purchasing performance and high availability. So it depends on your requirements. You can find the features of RDS here: RDS Product Details
3) For anything running on an EC2 instance (database or web) or if you decide to use RDS, you may provision and attach additional storage volumes as necessary. The type of storage you select will depend on the performance requirements, your budget, and the kind of workload you expect your database to face. Amazon provides the storage options available to you as well as a section for adding more storage here: RDS Storage Options
If you are worried about too many requests overwhelming your EC2 t2.micro instance, consider creating an ELB load balancer and setting up an auto-scaling group which will allow you to expand your capacity as necessary while distributing traffic such that no one server gets overwhelmed.
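As a toy model of the two pieces mentioned above (this is not how ELB or Auto Scaling are implemented, and every number and name here is hypothetical): the load balancer decides which server gets the next request, while auto scaling decides how many servers there should be.

```python
import itertools
import math


def round_robin(instances):
    """Yield instances in turn, the way a simple load balancer spreads requests."""
    return itertools.cycle(instances)


def desired_instances(requests_per_minute, capacity_per_instance, minimum=1, maximum=10):
    """Return how many instances the load needs, clamped to [minimum, maximum]."""
    needed = math.ceil(requests_per_minute / capacity_per_instance)
    return max(minimum, min(maximum, needed))
```

With an assumed capacity of 1000 requests per minute per instance, `desired_instances(2500, 1000)` asks for 3 instances, and `round_robin(["web-1", "web-2", "web-3"])` hands them out in order.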

Migrating Redis to AWS Elasticache with minimal downtime

Let's start by listing some facts:
ElastiCache can't be a slave of my existing Redis setup. Real shame, that would be so much more efficient.
I have only one Redis server to migrate, with roughly 3 GB of data.
Downtime must be less than 10 minutes. I assume the usual "stop the site, stop Redis, provision cluster with snapshot" will take longer than this.
Similar to this question: How do I set an elasticache redis cluster as a slave?
One idea on how this might work:
Set Redis to use an AOF and trigger BGSAVE at the same time.
When BGSAVE finishes, provision the ElastiCache cluster with the RDB file as seed.
Stop the site and shut down my local Redis instance.
Use an AOF-replay tool to replay the AOF into ElastiCache.
Start the site again, pointed at the ElastiCache cluster.
My questions:
How can I guarantee that my AOF file begins at exactly the point the RDB file ends, and that no data will be written in between?
Is there an AOF tool supported by the maintainers of Redis, or are they all third-party solutions, and therefore (potentially) of questionable reliability?*
* No offence intended to any authors of such tools, I'm sure they're great, I just feel much more confident using a tool written by the same team as the product to avoid potential compatibility bugs.

I have only one Redis server to migrate, with roughly 3gb of data
I would halt the site, save the Redis dump to S3, and then load it into a new cluster.
I'm guessing 10 minutes to save the file and get it into S3.
10 minutes to just launch an ElastiCache cluster from that data.
Leaves you ten extra minutes to configure and test.
But there is a simple way of knowing EXACTLY how long.
Do a test migration of it.
DON'T stop your live system.
Run BGSAVE and get a dump of your Redis (leave everything running as normal).
Move the dump to S3.
Launch an ElastiCache cluster from it.
Take DETAILED notes, TIME each step, and copy the commands to a notepad window.
Put it all in a Word/Excel document so you have a migration document. That way you know how long it takes and there are no surprises. Let us know how it goes.
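If you would rather script the rehearsal than time it by hand, a small helper along these lines can record how long each step takes. It is only a sketch; the class and step names are made up, and the actual dump, upload, and launch commands are left to you.

```python
import time
from contextlib import contextmanager


class MigrationLog:
    """Collect (step name, seconds) pairs for a rehearsal run."""

    def __init__(self):
        self.steps = []

    @contextmanager
    def step(self, name):
        """Time the body of a with-block and record it under name."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.steps.append((name, time.perf_counter() - start))

    def total(self):
        """Return the sum of all recorded step durations in seconds."""
        return sum(seconds for _, seconds in self.steps)

    def report(self):
        """Return one line per step plus a total, ready to paste into notes."""
        lines = [f"{name}: {seconds:.1f}s" for name, seconds in self.steps]
        lines.append(f"total: {self.total():.1f}s")
        return "\n".join(lines)
```

Wrap each step as `with log.step("BGSAVE"): ...`, then paste `log.report()` into the migration document. If the total comes in under the 10 minute budget with room to spare, the simple stop-and-restore approach may be good enough.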

ElastiCache has online migration support. You can use the start-migration API to start a migration from a self-managed cluster to an ElastiCache cluster.
aws elasticache start-migration --replication-group-id <ElastiCache Replication Group Id> --customer-node-endpoint-list "Address='<IP Address>',Port=<Port>"
The input to the API is your ElastiCache replication group ID and the IP address and port of the master of your self-managed cluster. You need to ensure that the IP address is reachable from the ElastiCache node. (An example IP address would be the private IP address of the master of your self-managed cluster.) This API makes the master node of the ElastiCache cluster call 'SLAVEOF' on the master of your self-managed cluster. This establishes a replication stream and starts migrating data from the self-managed cluster to the ElastiCache cluster. During migration, the master of the ElastiCache cluster stops accepting writes sent to it directly. You can start using the ElastiCache cluster from your application for reads.
Once you have all your data in the ElastiCache cluster, you can use the complete-migration API to stop the migration. This API stops the replication from the self-managed cluster to the ElastiCache cluster.
aws elasticache complete-migration --replication-group-id <ElastiCache Replication Group Id>
After this, the master of the ElastiCache cluster will start accepting writes. You can start using ElastiCache cluster from your application for both read and write.
Be aware of the following limitations of this migration method:
An existing or newly created ElastiCache deployment should meet the following requirements for migration:
It's cluster-mode disabled using Redis engine version 5.0.5 or higher.
It doesn't have either encryption in-transit or encryption at-rest enabled.
It has Multi-AZ with Auto-Failover enabled.
It has sufficient memory available to fit the data from your Redis on EC2 instance. To configure the right reserved memory settings, see Managing Reserved Memory.
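The requirements above can be turned into a quick pre-flight check before you call start-migration. The sketch below is an assumption-heavy illustration: the dictionary keys are invented for this example and are not field names from the AWS API, so you would fill them in from whatever describes your replication group. The second function only assembles the same command line shown earlier.

```python
def parse_version(text):
    """Turn '5.0.6' into (5, 0, 6); non-numeric parts such as 'x' are skipped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def migration_blockers(group):
    """Return the reasons a replication group fails the requirements listed above."""
    problems = []
    if group.get("cluster_mode_enabled"):
        problems.append("cluster mode must be disabled")
    if parse_version(group.get("engine_version", "0")) < (5, 0, 5):
        problems.append("Redis engine version must be 5.0.5 or higher")
    if group.get("transit_encryption") or group.get("at_rest_encryption"):
        problems.append("encryption in transit and at rest must be off")
    if not group.get("multi_az_auto_failover"):
        problems.append("Multi-AZ with Auto-Failover must be enabled")
    if group.get("memory_available_gb", 0) < group.get("source_data_gb", 0):
        problems.append("not enough memory for the source data")
    return problems


def start_migration_command(replication_group_id, address, port=6379):
    """Build the argument list for the start-migration call shown above."""
    return [
        "aws", "elasticache", "start-migration",
        "--replication-group-id", replication_group_id,
        "--customer-node-endpoint-list", f"Address='{address}',Port={port}",
    ]
```

An empty list from `migration_blockers(...)` means none of the listed limitations apply; the argument list can then be passed to `subprocess.run` if you want to script the call.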

There are a few ways to migrate the data without downtime. They are harder to achieve, though.
You could have your app write to two Redis instances simultaneously, one of which would be on ElastiCache. Once both caches are 'warm', you could just restart your app and read from the ElastiCache one.
You could initially migrate to EC2 instead of ElastiCache. Not really what you were hoping to hear, I imagine. This is easy to do because you can set the EC2 instance as a slave of your Redis instance. Also, migrating from EC2 to ElastiCache is somewhat easier (the data is already on AWS), so there's a benefit for users with huge data sets.
You could, in theory, intercept the commands from the client and send them to ElastiCache, thus effectively "replicating". But this requires some programming (I don't believe a tool like this exists at the moment) and would be hard with multiple, ephemeral clients.
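The first option (writing to two instances) could look roughly like this. Plain dictionaries stand in for the two Redis connections so the sketch runs on its own; the class and method names are invented for illustration, and a real version would wrap your Redis client instead.

```python
class DualWriteCache:
    """Write to both stores; read from the old one until cut-over."""

    def __init__(self, old, new):
        self.old = old
        self.new = new
        self.read_from_new = False

    def set(self, key, value):
        # Every write goes to both stores so the new one warms up over time.
        self.old[key] = value
        self.new[key] = value

    def get(self, key):
        store = self.new if self.read_from_new else self.old
        return store.get(key)

    def cut_over(self):
        """Switch reads to the new store once it is considered warm."""
        self.read_from_new = True
```

Note that keys written before dual writing started exist only in the old store, which is why this suits cache-like data that is rewritten or expires regularly better than data that must never be lost.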

Migrating from AWS RDS to an AWS EC2 running MySQL

Yes, fellow SOrs, I'm doing it backwards. I tried an AWS RDS but the CPU seems to be spiking so often that I need the flexibility of an EC2 to run some fine tuning. I'm not a MySQL expert, so I'm asking:
How can I create a setup on the EC2 so that it reads and replicates my RDS?
Ideally I'd do the switch in real time via DNS but first I need the EC2 to act like a clone of the RDS updating with any new data happening between now and the actual migration period.
Any pointers are much appreciated. Thanks!

Why can't you use MySQLTuner with RDS?
You shouldn't need to run sysbench, since Amazon handles OS-level tuning for you on RDS.
Aurora is a drop-in replacement for MySQL and will scale better than any MySQL cluster you could set up on EC2.
You should be addressing why your WordPress instance is hammering the database so much instead of trying to optimize the database.
You should put a CDN in front of your WordPress site and cache as much as you can to reduce the load on both your web server and database server. It looks like there are also solutions out there for using Redis to cache data so that WordPress doesn't have to constantly go back to MySQL for data.
Amazon provides the CloudFront CDN, but I would also recommend looking into CloudFlare.
Honestly, given your number of concurrent users, unless you have tons of dynamic, constantly changing content, you should be able to run your entire site on a t2.micro with CloudFlare in front of it and "cache everything" enabled.

I'd like to offer an update:
Mark B's input has been extremely valuable, as I have discovered that I can run MySQLTuner remotely against the RDS instance. Therefore there was no need to migrate after all.
The RDS CPU spikes were due to a large number of JOINs on columns without an index.
I have added indexes and the results are fantastic.
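For anyone wondering why a missing index on a JOIN column costs so much CPU, here is a small self-contained illustration. It uses SQLite from the Python standard library as a stand-in for MySQL, and the table and index names are hypothetical, so treat it as a sketch of the idea rather than the actual schema: without the index the database has to scan the joined table, and with it the planner can look rows up directly.

```python
import sqlite3


def join_plan(conn):
    """Return SQLite's query plan for a two-table join as one string."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT p.title, m.meta_value FROM posts AS p "
        "JOIN postmeta AS m ON m.post_id = p.id WHERE p.id = 42"
    ).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE postmeta (post_id INTEGER, meta_value TEXT)")

plan_before = join_plan(conn)  # postmeta has no index on post_id yet
conn.execute("CREATE INDEX idx_postmeta_post_id ON postmeta (post_id)")
plan_after = join_plan(conn)  # the planner can now use the new index

print(plan_before)
print(plan_after)
```

On MySQL the equivalent check is to run `EXPLAIN` on the slow query before and after adding the index and compare which key, if any, is used for the joined table.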

Understanding Amazon offerings

I am working on a project and am at a point where the POC is done and I now want to move towards a real product. I am trying to understand the Amazon cloud offerings just to see if I need to be aware of them at development time. I have a bunch of questions that I cannot get answered from the Amazon site. It's probably because I am new to the whole web services thing and have never hosted a site before. I am hoping someone out here will explain this to me like I am a C programmer :)
I see Amazon has a bunch of offerings -
EC2
Elastic Block Store
SimpleDB
AutoScaling
Elastic Load Balancing
I understand EC2 is virtual server instances that I can use, and these could come pre-loaded with what I want (say Apache + Python). I have the following questions -
Say I want a custom instance of something (like a custom Apache module I wrote for my project). Can I create a server instance using the exact modules and make it the default the next time I create a new instance, or in AutoScaling?
Do I get an IP Address to access this? Can I set my own hostname to it? I mean do I get a DNS record? Or is it what Elastic IP is?
How do I access it from the outside? SSH? Remote Desktop? Or is it entirely up to how I configure the instance?
What do they mean by Inter-Region or Intra-Region data transfer? What is data transfer to begin with? Is it just people using my instance? So if I go live with it that will be the cost I have to pay for people using it?
What is the difference between AutoScaling and Elastic Load Balancing?
What is Elastic Block Store? Is it storage? If so do I have to worry about backups or do they take care of it?
About the Simple DB -
It looks like the interface to use this is different to my regular SQL calls. Am I correct?
If so the whole development needs to be tailored specifically for Amazon. Which kind of sucks. Is there a better alternative?
Do I get data backups or do I have to worry about it myself?
Will I be able to connect to the DB using regular tools to inspect the DB (during or after development)? Or do I get other tools made by Amazon for it?
What about security? The DB is obviously somewhere in the cloud farm away from the EC2 instance. My DB password is going over the wire, and so is all my data, totally unencrypted. Don't I have to worry about that? The question comes up only because I don't own any of the hardware.
I really hope someone points me in the right direction here.
Thanks for taking the time to read.
P

I just went through the question, and here I have tried to answer a few of the points.
1) AWS itself doesn't publish most of the pre-configured images; they are configured by developers and made publicly available so that others can use them. You can use any one of those, or you can just opt for a raw image of whatever OS you want, provision it accordingly, and create a snapshot of it so that you can use it for auto scaling. The snapshot becomes the base AMI in your case.
2) Every instance you boot will have a public DNS name attached to it. You can use the public DNS name to connect to that instance using ssh if you are a Linux user, or using PuTTY if you are a Windows user. Apart from that, you can also attach an Elastic IP to the instance, which comes at a cost that is like peanuts, and access your instance through the Elastic IP. You can map either the public DNS name or the Elastic IP to a website by adding a CNAME or an A record, respectively.
3) AWS owns data centers in different parts of the world. You deploy your application depending on your customer base: for example, if your target customers are based in India, the nearest region available is Singapore, which AWS calls ap-southeast-1. Each region has multiple availability zones, for example ap-southeast-1a and ap-southeast-1b, which are separate data centers that are geographically apart. Intra-region means from ap-southeast-1a to ap-southeast-1b. Inter-region means from ap-southeast-1 to us-east-1, which is the Northern Virginia data center. AWS charges for incoming and outgoing bandwidth; trust me, it's next to nothing.
They charge 1/8th of a cent per GB. It's not even worth thinking about.
4) Elastic Load Balancer is a cluster that divides the load evenly across your instances in all your availability zones (if you are running multi-AZ). ELB sits in front of the EC2 instances, monitors instance health periodically, and works together with auto scaling.
5) To help you understand what autoscaling is, please go through this document: http://aws.amazon.com/autoscaling/
6) Elastic Block Store, or EBS, is like a hard disk: persistent data storage that can be attached to your instance. Regarding backups: yes, you need to take care of them, depending on your use case. I do backups of EBS periodically.
7) SimpleDB is a NoSQL DB (DynamoDB is Amazon's newer, separate NoSQL service and has largely taken its place). I hope you understand what a NoSQL DB is: it's a non-RDBMS database system. Please read some documentation to understand what a NoSQL DB is.
8) If you have a MySQL or Oracle DB you can opt for RDS; please read the documents.
9) I personally feel you are a newbie to the entire cloud ecosystem; you need to understand what exactly the cloud does first.
10) You don't have to make a large number of changes to development as such. Just make sure it works fine on your local box, and it can be deployed to the cloud without much ado.
11) You don't have to use any extra tool for that. Change the database endpoint to RDS (if you use it), or else install MySQL on your EC2 instance and connect to the local DB that resides on the instance, which is as simple as your development mode.
12) You don't have to worry about security issues with AWS; it is secure. Don't follow the myths. I have been using AWS for 3 years, running I don't even remember how many applications (e-commerce, m-commerce, social media apps), and I never faced any kind of security issue. AWS also allows you to set up your security however you want.
Go ahead, happy coding. Contact me if you have any problems.

The answer above is a good summary of AWS. Just wanted to add:
AWS offers a full data center, so it depends on what you are trying to achieve. For starters you will need:
EC2 - This is your server. It comes with instance storage, which is lost when the instance is stopped or terminated.
EBS - Your mounted storage; the data is persisted across reboots.
S3 - Provides storage (RESTful APIs on top; the cost is usage-based rather than "provisioned" as in EBS).
Databases - You can start with Amazon RDS, which provides managed database services; you can choose between various available databases. You can also install your own database using EC2 + EBS, in which case you will have to take care of managing the database yourself.
Elastic IP - Public-facing IP address; you can point your DNS record at this.
One great tool to calculate the pricing:
http://calculator.s3.amazonaws.com/calc5.html

Some other services to take into account are:
VPC (Virtual Private Cloud) - This is your own private network. You can define subnets, route tables and internet gateways there. I would strongly recommend using a VPC for any serious deployment of more than one instance.
Glacier - This will replace your tape library for storing backups.
CloudFormation - Great tool for deployment and automation of instances.
