Is it possible to divide a redis cluster on ElastiCache? - amazon-web-services

I currently have various clients and various Redis clusters on AWS, one for each client. But I would like to know if it's possible to divide the Redis cluster so I can use, let's say, one Redis cluster for many clients. Each client has its own server that uses the cache independently, and if they were in the same cluster, they would probably have keys with the same name and might conflict. So in this scenario my questions are:
Is it possible to divide the Redis cluster so that different parts of it can be used in isolation from the others? I think this is different from shards.
How would I avoid the probable conflict described above?

There are multiple approaches you can use to protect against key name collisions.
If you are using non-cluster mode Redis, then you can configure each client to use a different database, as described here. Please note that multiple databases are not supported by Redis Cluster, so if you suspect you might need to scale out in the future, that would be a reason to start with the other option.
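A minimal sketch of the database-per-client idea with redis-py (the endpoint, client names and key are placeholders, not from the question). Remember this only works with cluster mode disabled:

    import redis

    # Each client gets its own logical database on the same instance.
    # Only valid on non-cluster (standalone) Redis: Redis Cluster has db 0 only.
    CLIENT_DBS = {"client1": 0, "client2": 1, "client3": 2}

    def connection_for(client_name: str) -> redis.Redis:
        """Return a connection bound to this client's own database."""
        return redis.Redis(
            host="my-cache.example.cache.amazonaws.com",  # placeholder endpoint
            port=6379,
            db=CLIENT_DBS[client_name],
        )

    r1 = connection_for("client1")
    r2 = connection_for("client2")
    r1.set("session:42", "a")  # stored in db 0
    r2.set("session:42", "b")  # same key name, isolated in db 1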
You can use {hash tags} as described here. The tag could be the name of your client, or anything else you want to use to group all keys on the same slot for multi-key operations. This also guarantees that key {client1}key1 will not collide with key {client2}key1. This approach works for both cluster mode enabled and cluster mode disabled clusters, so it is future proof if you need to scale out.
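And a sketch of the hash-tag approach, again with redis-py and hypothetical client names; the braces in the key are what Redis Cluster hashes on:

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def client_key(client: str, key: str) -> str:
        """Prefix a key with a {hash tag}: all of one client's keys map to
        the same slot in cluster mode and cannot collide across clients."""
        return f"{{{client}}}{key}"

    r.set(client_key("client1", "key1"), "value for client1")
    r.set(client_key("client2", "key1"), "value for client2")  # no collision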

Related

Use of redis cluster vs standalone redis

I have a question about when it makes sense to use a Redis cluster versus standalone Redis.
Suppose one has a real-time gaming application that allows multiple instances of the game, and we wish to implement a real-time leaderboard for each instance. (Games are created by communities of users.)
Suppose at any time we have, say, 100 simultaneous matches running.
Based on the use cases outlined here:
https://d0.awsstatic.com/whitepapers/performance-at-scale-with-amazon-elasticache.pdf
https://redislabs.com/solutions/use-cases/leaderboards/
https://aws.amazon.com/blogs/database/building-a-real-time-gaming-leaderboard-with-amazon-elasticache-for-redis/
We can implement each leaderboard using a Sorted Set dataset in memory.
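(For reference, a per-match leaderboard as a Sorted Set would look roughly like this; the key and member names are made up.)

    import redis

    r = redis.Redis()

    # One sorted set per match; the score is the player's points.
    r.zadd("leaderboard:match42", {"alice": 1500, "bob": 1320, "carol": 1680})

    # Top 10 players, highest score first.
    print(r.zrevrange("leaderboard:match42", 0, 9, withscores=True))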
Now I would like to implement some sort of persistence where leaderboard state is saved at the end of each game as a snapshot, so that each of these independent Sorted Sets is saved as a snapshot file.
I have a question about design choices:
Would a Redis cluster make sense for this scenario? Or would it make more sense to have standalone Redis instances and create a new database for each game?
As far as I know, there is only a single database, database 0, in a Redis cluster (https://redis.io/topics/cluster-spec).
In that case, how would snapshotting the datasets for each leaderboard at different times work?
From what I can see, using a Redis cluster only makes sense for large-scale monolithic applications, and it may not be the best approach for the scenario described above. Is that the case?
Or, if one goes with AWS ElastiCache for Redis in cluster mode, can one configure snapshotting for individual datasets?
You are correct, clustering is a way of scaling out to handle really high request loads and store tons of data.
It really doesn't sound like you need to bother with a cluster.
I'd be very surprised if a standalone Redis setup were your bottleneck before you have several tens of thousands of simultaneous players.
If you are unsure, you can probably mock up some simulated load and see what it can handle. My guess is that you are better off focusing on other complexities of your game until you start reaching quite serious usage. Which is a good problem to have. :)
You might however want to consider having one or two replica instances, which is a different thing.
Secondly, cluster or not, why do you want to use snapshots (SAVE or BGSAVE) to persist your scoreboard?
If you want individual snapshots per game, and it's only a few keys per game, why don't you just have your application read those keys and persist them to a traditional DB when needed? You can, for example, use MULTI, DUMP and RESTORE to achieve something very similar to snapshotting, but on only the specific keys you want.
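A rough sketch of that MULTI/DUMP/RESTORE idea with redis-py; the key layout is hypothetical:

    import redis

    r = redis.Redis()

    def snapshot_game(game_id: str) -> dict:
        """DUMP only the keys belonging to one game. DUMP returns a binary,
        version-specific serialization of each value."""
        keys = [f"leaderboard:{game_id}", f"meta:{game_id}"]
        pipe = r.pipeline(transaction=True)  # wraps the reads in MULTI/EXEC
        for key in keys:
            pipe.dump(key)
        blobs = pipe.execute()
        return {k: b for k, b in zip(keys, blobs) if b is not None}

    def restore_game(snapshot: dict) -> None:
        """Recreate the dumped keys, e.g. after loading the snapshot back
        from your traditional database or from S3."""
        for key, blob in snapshot.items():
            r.restore(key, 0, blob, replace=True)  # ttl=0 means no expiry

    snap = snapshot_game("match42")  # persist `snap` wherever you like
    restore_game(snap)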
It doesn't sound like multiple databases are warranted for this.
Multiple databases on clustered Redis are only supported in the Enterprise version, so not on ElastiCache. But the approach mentioned above should work just fine.

Single Redis Instance with multiple simultaneous uses

So I know that Redis is an in-memory data store, but I don't really understand much about what goes on under the hood.
My question is: if I have three separate uses for it, such as python-socketio to enable multiple instances of the socket server, Celery to send tasks to another microservice (which would also use the same Redis instance), and a standard subscriber to listen for notifications to emit, can I use the same Redis instance for all three tasks, or will I run into collisions between the different data (i.e., Celery misinterpreting a call to python-socketio as a task)?
It depends on how your data flows; the question is not clear about how the data flows between the components and how they relate.
If there's no relation or dependency between these messages, you can avoid collisions by storing your messages in different DBs in the same Redis instance.
Or, in case you need to use the same DB for all of them, you can use namespaces, a.k.a. prefixes, for your Redis keys to make sure there will be no key collisions. There's more information about how to name keys in the 'Redis keys' section of the documentation. Both options are sketched below.
However, having one instance handle all of this does not scale well; still, it depends on how much traffic you have and what exactly you are trying to achieve.
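To make both options concrete, here is a rough sketch; the URL, database numbers, channel name and prefix are illustrative, not prescriptive:

    import redis
    import socketio
    from celery import Celery

    REDIS = "redis://localhost:6379"  # one shared instance

    # Option 1: a separate logical database per concern (standalone Redis
    # only). Caveat: pub/sub channels are global, not scoped to a database.
    sio = socketio.Server(
        client_manager=socketio.RedisManager(f"{REDIS}/0")  # socket.io fan-out
    )
    celery_app = Celery("tasks", broker=f"{REDIS}/1")  # task queue

    # Option 2: share one database but namespace your own keys and channels.
    r = redis.Redis.from_url(f"{REDIS}/2")
    pubsub = r.pubsub()
    pubsub.subscribe("myapp:notifications")  # prefixed channel name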
Please leave comments in case anything is unclear or I misunderstood your question.

AWS ElastiCache - Redis vs. Memcached

I am reading in AWS console about Redis and MemcacheD:
Redis
In-memory data structure store used as database, cache and message broker. ElastiCache for Redis offers Multi-AZ with Auto-Failover and enhanced robustness.
Memcached
High-performance, distributed memory object caching system, intended for use in speeding up dynamic web applications.
Has anyone used or compared both? What are the main differences and use cases for the two?
Thanks.
Pasting my answer from another Stack Overflow question:
Select Memcached if you have these requirements:
You want the simplest model possible.
You need to run large nodes with multiple cores or threads.
You need the ability to scale out/in, adding and removing nodes as demand on your system increases and decreases.
You want to partition your data across multiple shards.
You need to cache objects, such as a database.
Select Redis if you have these requirements:
You need complex data types, such as strings, hashes, lists, and sets.
You need to sort or rank in-memory data-sets.
You want persistence of your key store.
You want to replicate your data from the primary to one or more read replicas for read intensive applications.
You need automatic failover if your primary node fails.
You want publish and subscribe (pub/sub) capabilities—to inform clients about events on the server.
You want backup and restore capabilities.
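To make the "complex data types" and sorting points concrete, a small redis-py sketch with illustrative key names:

    import redis

    r = redis.Redis()

    # Structures Memcached does not offer: hashes, lists, sets, sorted sets.
    r.hset("user:1", mapping={"name": "alice", "plan": "pro"})  # hash
    r.lpush("queue:jobs", "job-1", "job-2")                     # list
    r.sadd("online", "alice", "bob")                            # set
    r.zadd("scores", {"alice": 42, "bob": 17})                  # sorted set

    print(r.zrevrange("scores", 0, -1, withscores=True))  # ranked in memory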
Here is an interesting whitepaper by AWS: https://d0.awsstatic.com/whitepapers/performance-at-scale-with-amazon-elasticache.pdf

Using memcached or Redis on aws-elasticache

I am working on an application on AWS and I am using AWS elasticache for caching.
I am confused about whether to use memcached or Redis.
I read about the Redis 3.0.2 update and how it is equivalent to memcached now.
https://groups.google.com/forum/#!msg/redis-db/dO0bFyD_THQ/Uoo2GjIx6qgJ
But I read on the Amazon AWS FAQ page that Amazon ElastiCache does not support 3.0.2. They currently support Redis 2.6.13, 2.8.6 and 2.8.19.
http://aws.amazon.com/elasticache/faqs/ (as of June 10, 2015)
I have read the AWS whitepapers on ElastiCache, but they do not specify which version of Redis their suggestions apply to.
How should I decide between memcached and Redis for any application I may create? What points does one need to keep in mind before using Redis or memcached? Should I assume that Amazon will update the Redis version soon and go with Redis?
p.s. I am a novice developer.
It actually depends upon the use case.
Select Memcached if you have these requirements:
You want the simplest model possible.
You need to run large nodes with multiple cores or threads.
You need the ability to scale out/in, adding and removing nodes as demand on your system increases and decreases.
You want to partition your data across multiple shards.
You need to cache objects, such as a database.
Select Redis if you have these requirements:
You need complex data types, such as strings, hashes, lists, and sets.
You need to sort or rank in-memory data-sets.
You want persistence of your key store.
You want to replicate your data from the primary to one or more read replicas for read intensive applications.
You need automatic failover if your primary node fails.
You want publish and subscribe (pub/sub) capabilities—to inform clients about events on the server.
You want backup and restore capabilities.
Here is an interesting whitepaper by AWS: https://d0.awsstatic.com/whitepapers/performance-at-scale-with-amazon-elasticache.pdf
The main discussion comparing the two is this Stack Overflow question: Memcached vs. Redis?
Both AWS and Azure will surely upgrade to newer versions of Redis in the future, but when and how they roll them out depends only on them. Meanwhile, you could install Redis 3.0.2 yourself, but consider whether you really need Redis 3, which mainly adds cluster support. If you don't need the cluster, then you can go with 2.8 on ElastiCache.

Can you get a cluster of Google Compute Engine instances that are *physically* local?

Google Compute Engine lets you get a group of instances that are semantically local, in the sense that only they can talk to each other and all external access has to go through a firewall, etc. If I want to run MapReduce or other kinds of cluster jobs that will induce high network traffic, then I also want machines that are physically local (say, on the same rack). Looking at the APIs and initial documentation, I don't see any way to request that; does anyone know otherwise?
There is no support in GCE right now for specifying rack locality. However, we built the system to work well in the face of large numbers of instances talking to each other in a fully connected way, as long as they are in the same zone.
This is one of the things that allowed MapR to approach the record for a Hadoop terasort. You can see it in action in the video of Craig McLuckie's talk from I/O:
https://developers.google.com/events/io/sessions/gooio2012/302/
The best way to find out is to test your application and see how it performs.