How to use AWS memcached with multiple nodes - amazon-web-services

I am new to AWS, and I want to store some temporary data in Memcached. My Memcached cluster has two nodes, one in us-east-1a and one in us-east-1b. It stores data across the two nodes but does not sync them. Is there any way I can get all the data from both nodes instead of connecting to them one by one?
Edit: I use telnet to connect to the Memcached endpoint.

Memcached clusters do not replicate data between nodes; only Redis does that, when set up with replication. That is why you need to use the configuration endpoint for Memcached: it knows about all the nodes in your cluster.
The Memcached engine supports partitioning your data across multiple nodes
A Memcached cluster is a logical grouping of one or more ElastiCache Nodes. Data is partitioned across the nodes in a Memcached cluster.
If you use Automatic Discovery, you can use the cluster's configuration endpoint to configure your Memcached client. This means you must use a client that supports Automatic Discovery.
If you don't use Automatic Discovery, you must configure your client to use the individual node endpoints for reads and writes. You must also keep track of them as you add and remove nodes.
Finding the Endpoints for a Memcached Cluster (Console)
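Since you are already using telnet, you can list every node behind the configuration endpoint by connecting to it and running config get cluster (supported on ElastiCache Memcached 1.4.14 and later). In application code, a client that shards across the node endpoints lets you reach the whole key space from one place. A minimal sketch using pymemcache's HashClient, with placeholder node endpoints:

# Client-side sharding across both ElastiCache nodes; the hostnames
# below are placeholders for your actual node endpoints.
from pymemcache.client.hash import HashClient

client = HashClient([
    ('mycache.xxxxxx.0001.use1.cache.amazonaws.com', 11211),  # us-east-1a node
    ('mycache.xxxxxx.0002.use1.cache.amazonaws.com', 11211),  # us-east-1b node
])

client.set('session:42', 'some-value')   # hashed to exactly one node
print(client.get('session:42'))          # returns bytes, routed to the same node

Note that HashClient does not do Automatic Discovery itself; if you add or remove nodes you must update this list, which is exactly what the configuration endpoint plus an Auto Discovery-capable client saves you from.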


What is the difference between nodes, clusters, and databases in Redshift

Reading through the AWS documentation, I'm quite confused by these three concepts:
Cluster: composed of one or more compute nodes; contains one or more databases
Compute node: run the query execution plans and transmit data among themselves to serve these queries
Database: User data is stored on the compute nodes
With this it is easy to assume that a compute node and a database are the same, isn't it? But when creating a Redshift cluster, a portion of the form is labelled "database configuration" while seemingly referring to the cluster. Below is an image of it; if my understanding of the documentation is correct, the database configuration should refer to the compute nodes and not the cluster.
With these, what exactly are a cluster, a database, and a compute node?
With this it is easy to assume that a compute node and a database are the same, isn't it?
No, that's not the case. You can have a single-node Redshift cluster with multiple databases, or a single (large) database hosted on multiple compute nodes.
Basically, node refers to the hardware layer of Redshift, while database refers to the software layer only.
Your screenshot shows only a default database called dev. You can create many more if you want. All hosted on the same cluster.
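To make that hardware/software split concrete, here is a minimal sketch (placeholder endpoint, user, and password) of connecting to two different databases hosted on the same cluster with psycopg2:

import psycopg2

# Both connections target the SAME cluster endpoint (the hardware),
# but different databases (the software) hosted on it.
cluster_endpoint = 'my-cluster.xxxxxx.us-east-1.redshift.amazonaws.com'  # placeholder

dev = psycopg2.connect(host=cluster_endpoint, port=5439,
                       dbname='dev', user='awsuser', password='...')
analytics = psycopg2.connect(host=cluster_endpoint, port=5439,
                             dbname='analytics', user='awsuser', password='...')

The second connection assumes you have already run CREATE DATABASE analytics; on the cluster.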

How to query for the NodeIds in an AWS Elasticsearch cluster?

I'm trying to use Boto3 to return data about the nodes in an AWS Elasticsearch cluster, like free storage space, CPU usage, etc. I know that the IDs of the nodes in the cluster can change when the cluster is restarted, so I don't want to hardcode them. Is there a way to return a list of the node IDs present in the cluster so I don't have to hardcode them?
Basically, node information can be extracted from the ES cluster via API calls only.
So to extract those values, use the https://elasticsearch-py.readthedocs.io/en/master/index.html library to connect to your ES domain, then execute the nodes info API call (GET /_nodes): https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-info.html
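A minimal sketch of that approach, assuming a placeholder domain endpoint your credentials can reach (an IAM-protected domain would additionally need signed requests, e.g. via the requests-aws4auth package):

from elasticsearch import Elasticsearch

# Placeholder endpoint for the AWS Elasticsearch domain.
es = Elasticsearch(['https://my-domain.us-east-1.es.amazonaws.com:443'])

# GET /_nodes — the node IDs are the keys of the 'nodes' dict.
node_ids = list(es.nodes.info()['nodes'].keys())

# GET /_nodes/stats — per-node CPU and free storage.
stats = es.nodes.stats()
for node_id in node_ids:
    node = stats['nodes'][node_id]
    print(node_id,
          node['os']['cpu']['percent'],                # CPU usage %
          node['fs']['total']['available_in_bytes'])   # free storage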

Does AWS take down each Availability Zone (AZ) or whole regions for maintenance?

AWS has a maintenance window for each region: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/maintenance-window.html, but I could not find any documentation about how it works with multiple AZs in the same region.
I have a Redis cache configured with a replica in a different AZ in the same region. The whole purpose of configuring the replica in a different AZ is that if one AZ is not available, traffic is served from the other AZ.
When they do maintenance, do they take down the whole region or individual Availability Zones?
You should read the FAQ on ElastiCache maintenance: https://aws.amazon.com/elasticache/elasticache-maintenance/
It says that if you have a Multi-AZ deployment, ElastiCache will take down the instances one at a time, triggering a failover to the read replica, and then create new instances before taking down the rest, so you should not experience any interruption in your service.
Thanks @morras for the above link, which explains how ElastiCache handles the maintenance window. Below are three questions I have taken from that link, with explanations.
1. How long does a node replacement take?
A replacement typically completes within a few minutes. The replacement may take longer in certain instance configurations and traffic patterns. For example, Redis primary nodes may not have enough free memory and may be experiencing high write traffic. When an empty replica syncs from this primary, the primary node may run out of memory trying to serve the incoming writes as well as sync the replica. In that case, the master disconnects the replica and restarts the sync process. It may take multiple attempts for the replica to sync successfully. It is also possible that the replica may never sync if the incoming write traffic remains high.
Memcached nodes do not need to sync during replacement and are always replaced fast irrespective of node sizes.
2. How does a node replacement impact my application?
For Redis nodes, the replacement process is designed to make a best effort to retain your existing data and requires successful Redis replication. For single node Redis clusters, ElastiCache dynamically spins up a replica, replicates the data, and then fails over to it. For replication groups consisting of multiple nodes, ElastiCache replaces the existing replicas and syncs data from the primary to the new replicas. If Multi-AZ or Cluster Mode is enabled, replacing the primary triggers a failover to a read replica. If Multi-AZ is disabled, ElastiCache replaces the primary and then syncs the data from a read replica. The primary will be unavailable during this time.
For Memcached nodes, the replacement process brings up an empty new node and terminates the current node. The new node will be unavailable for a short period during the switch. Once switched, your application may see performance degradation while the empty new node is populated with cache data.
3. What best practices should I follow for a smooth replacement experience and minimize data loss?
For Redis nodes, the replacement process is designed to make a best effort to retain your existing data and requires successful Redis replication. We try to replace just enough nodes from the same cluster at a time to keep the cluster stable. You can provision primary and read replicas in different availability zones. In this case, when a node is replaced, the data will be synced from a peer node in a different availability zone. For single node Redis clusters, we recommend that sufficient memory is available to Redis, as described here. For Redis replication groups with multiple nodes, we also recommend scheduling the replacement during a period with low incoming write traffic.
For Memcached nodes, schedule your maintenance window during a period with low incoming write traffic, test your application for failover, and use the ElastiCache-provided "smarter" client. You cannot avoid data loss, as Memcached holds data purely in memory.
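If you want that low-traffic window set from code rather than the console, boto3 can do it; a minimal sketch with placeholder identifiers (the window format is ddd:hh24:mi-ddd:hh24:mi, in UTC):

import boto3

elasticache = boto3.client('elasticache')

# Redis replication group: move maintenance to early Sunday morning UTC.
elasticache.modify_replication_group(
    ReplicationGroupId='my-redis-group',           # placeholder
    PreferredMaintenanceWindow='sun:05:00-sun:07:00',
)

# Memcached cluster: same idea via the cache-cluster API.
elasticache.modify_cache_cluster(
    CacheClusterId='my-memcached-cluster',         # placeholder
    PreferredMaintenanceWindow='sun:05:00-sun:07:00',
)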

Migrating Redis to AWS Elasticache with minimal downtime

Let's start by listing some facts:
ElastiCache can't be a slave of my existing Redis setup. Real shame, that would be so much more efficient.
I have only one Redis server to migrate, with roughly 3gb of data.
Downtime must be less than 10 mins. I assume the usual "stop the site, stop redis, provision cluster with snapshot" will take longer than this.
Similar to this question: How do I set an elasticache redis cluster as a slave?
One idea on how this might work:
Set Redis to use an AOF and trigger BGSAVE at the same time.
When BGSAVE finishes, provision the Elasticache cluster with RDB seed.
Stop the site and shut down my local Redis instance.
Use an aof-replay tool to replay the AOF into Elasticache.
Start the site again, pointed at the Elasticache cluster.
My questions:
How can I guarantee that my AOF file begins at exactly the point the RDB file ends, and that no data will be written in between?
Is there an AOF tool supported by the maintainers of Redis, or are they all third-party solutions, and therefore (potentially) of questionable reliability?*
* No offence intended to any authors of such tools, I'm sure they're great, I just feel much more confident using a tool written by the same team as the product to avoid potential compatibility bugs.
I have only one Redis server to migrate, with roughly 3gb of data
I would halt, save the Redis dump to S3, and then load it into a new cluster.
I'm guessing 10 minutes to save the file and get it into S3.
10 minutes to launch an ElastiCache cluster from that data.
That leaves you ten extra minutes to configure and test.
But there is a simple way of knowing EXACTLY how long it will take:
Do a test migration.
DON'T stop your live system.
Run BGSAVE and get a dump of your Redis (leave everything running as normal).
Move the dump to S3.
Launch an ElastiCache cluster from it.
Take DETAILED notes, TIME each step, copy the commands to a notepad window.
Put it all in a Word/Excel document so you have a migration runbook. That way you know how long it takes and there are no surprises. Let us know how it goes.
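A minimal sketch of the dump-and-upload steps, assuming redis-py and boto3, the common default dump location on Linux (your redis.conf 'dir' and 'dbfilename' may differ), and a placeholder bucket:

import time
import boto3
import redis

r = redis.Redis(host='localhost', port=6379)

# Trigger a background save and wait until LASTSAVE advances,
# which signals that the new dump.rdb is on disk.
before = r.lastsave()
r.bgsave()
while r.lastsave() == before:
    time.sleep(1)

# Upload the dump so ElastiCache can seed a new cluster from it.
boto3.client('s3').upload_file('/var/lib/redis/dump.rdb',
                               'my-migration-bucket', 'dump.rdb')

Time each of these steps (even just with a wall clock) so the runbook records how long your 3 GB actually takes.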
ElastiCache has online migration support. You can use the start-migration API to migrate from a self-managed cluster to an ElastiCache cluster.
aws elasticache start-migration --replication-group-id <ElastiCache Replication Group Id> --customer-node-endpoint-list "Address='<IP Address>',Port=<Port>"
The input to the API is your ElastiCache replication group ID and the IP and port of the master of your self-managed cluster. You need to ensure that the IP address is accessible from the ElastiCache node (an example would be the private IP address of the master of your self-managed cluster). This API makes the master node of the ElastiCache cluster call SLAVEOF against the master of your self-managed cluster, establishing a replication stream and starting the migration of data from the self-managed cluster to the ElastiCache cluster. During migration, the master of the ElastiCache cluster stops accepting writes sent directly to it. You can start using the ElastiCache cluster from your application for reads.
Once you have all your data in the ElastiCache cluster, you can use the complete-migration API to stop the migration. This API stops the replication from the self-managed cluster to the ElastiCache cluster.
aws elasticache complete-migration --replication-group-id <ElastiCache Replication Group Id>
After this, the master of the ElastiCache cluster starts accepting writes, and you can use the ElastiCache cluster from your application for both reads and writes.
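The same two calls are available through boto3 if you prefer scripting the cutover; a minimal sketch with a placeholder replication group ID and master IP:

import boto3

elasticache = boto3.client('elasticache')

# Start replicating from the self-managed master into ElastiCache.
elasticache.start_migration(
    ReplicationGroupId='my-replication-group',            # placeholder
    CustomerNodeEndpointList=[{'Address': '10.0.0.5',     # placeholder private IP
                               'Port': 6379}],
)

# ...later, once replication has caught up and the data is verified:
elasticache.complete_migration(ReplicationGroupId='my-replication-group')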
Be aware of the following limitations of this migration method:
An existing or newly created ElastiCache deployment must meet the following requirements for migration:
It's cluster-mode disabled using Redis engine version 5.0.5 or higher.
It doesn't have either encryption in-transit or encryption at-rest enabled.
It has Multi-AZ with Auto-Failover enabled.
It has sufficient memory available to fit the data from your Redis on EC2 instance. To configure the right reserved memory settings, see Managing Reserved Memory.
There are a few ways to migrate the data without downtime, though they are harder to achieve.
You could have your app write to two Redis instances simultaneously, one of which would be on ElastiCache (EC). Once both caches are 'warm', you could just restart your app and read from the EC cache (a minimal sketch of this pattern follows this list).
You could initially migrate to EC2 instead of EC. Not really what you were hoping to hear, I imagine, but this is easy to do because you can set up EC2 as a slave of your Redis instance. Also, migrating from EC2 to EC is somewhat easier (the data is already on AWS), so there's a benefit for users with huge data sets.
You could, in theory, intercept the commands from the client and send them to EC, thus effectively "replicating". But this requires some programming (I don't believe such a tool exists at the moment) and would be hard with multiple, ephemeral clients.
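The dual-write idea from the first option can be as small as a thin wrapper; a minimal sketch assuming redis-py and placeholder endpoints, where reads stay on the old instance until the ElastiCache one is warm:

import redis

class DualWriteCache:
    """Writes go to both caches; reads come from the current primary."""

    def __init__(self, primary, secondary):
        self.primary = primary        # existing Redis (source of truth)
        self.secondary = secondary    # warming ElastiCache instance

    def set(self, key, value):
        self.primary.set(key, value)
        self.secondary.set(key, value)

    def get(self, key):
        return self.primary.get(key)

cache = DualWriteCache(
    redis.Redis(host='old-redis.internal', port=6379),                        # placeholder
    redis.Redis(host='my-group.xxxxxx.use1.cache.amazonaws.com', port=6379),  # placeholder
)

Once the EC cache is warm, redeploy with the ElastiCache client as the only cache.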

How to scale an Amazon RDS instance horizontally?

How do I scale an Amazon RDS instance horizontally? EC2 with a load balancer and autoscaling is extremely easy to implement, but what if I want to scale Amazon RDS?
I can upgrade my RDS instance to a more powerful instance, or I can create a read replica and direct SELECT queries to it. But in this mode I don't scale anything automatically, even though I have a read-oriented web application. So, can I create RDS read replicas with autoscaling and balance them with a load balancer?
You can use HAProxy to load-balance Amazon RDS read replicas. Check this: http://harish11g.blogspot.ro/2013/08/Load-balancing-Amazon-RDS-MySQL-read-replica-slaves-using-HAProxy.html
Hope this helps.
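If you would rather not run HAProxy, the same read/write split can live in application code; a minimal sketch with psycopg2 and placeholder endpoints and credentials, sending SELECTs to a random replica and everything else to the master:

import random
import psycopg2

writer = psycopg2.connect(host='mydb.xxxxxx.us-east-1.rds.amazonaws.com',
                          dbname='app', user='app', password='...')       # placeholder
replicas = [
    psycopg2.connect(host='mydb-ro-1.xxxxxx.us-east-1.rds.amazonaws.com',
                     dbname='app', user='app', password='...'),           # placeholder
    psycopg2.connect(host='mydb-ro-2.xxxxxx.us-east-1.rds.amazonaws.com',
                     dbname='app', user='app', password='...'),           # placeholder
]

def run_query(sql, params=()):
    # Crude routing: reads go to a random replica, writes to the master.
    conn = random.choice(replicas) if sql.lstrip().upper().startswith('SELECT') else writer
    with conn.cursor() as cur:
        cur.execute(sql, params)
        return cur.fetchall() if cur.description else None

Remember that replicas lag the master slightly, so read-your-own-writes paths should still hit the master.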
Note that RDS covers several database engines: MySQL, PostgreSQL, Oracle, and MSSQL.
Generally speaking, you can scale up (a larger instance), use read-only replicas, or shard. If you are using MySQL, look at AWS Aurora. Think about using the database optimally, perhaps combining it with Memcached or Redis (both available under AWS ElastiCache). Think about using a search engine (Lucene, Elasticsearch, CloudSearch).
Some general resources:
http://highscalability.com/
https://gigaom.com/2011/12/06/facebook-shares-some-secrets-on-making-mysql-scale/
If you are using PostgreSQL and have a workload that can be partitioned by a certain key and doesn't require complex transactions, then you could take a look at the pg_shard extension. pg_shard lets you create distributed tables that are sharded across multiple servers. Queries on the distributed table will be transparently routed to the right shard.
Even though RDS doesn't have the pg_shard extension installed, you can set up one or more PostgreSQL servers on EC2 with the pg_shard extension and use RDS nodes as worker nodes. The pg_shard node only needs to store a tiny bit of metadata, which can be backed up in one of the worker nodes, so it is relatively low-maintenance and can be scaled out to accommodate higher query rates.
A guide with a link to a CloudFormation template to set everything up automatically is available at: https://www.citusdata.com/blog/14-marco/178-scaling-out-postgresql-on-amazon-rds-using-masterless-pg-shard
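A minimal sketch of creating a distributed table on the pg_shard head node, assuming psycopg2 and placeholder connection details; the master_create_* function names follow the pg_shard documentation of that era, so verify them against the version you install:

import psycopg2

# Connect to the pg_shard head node running on EC2 (placeholder host).
conn = psycopg2.connect(host='pg-shard-head.internal', dbname='app',
                        user='postgres', password='...')
conn.autocommit = True
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS pg_shard")
cur.execute("""CREATE TABLE events (
                   tenant_id text NOT NULL,
                   payload   jsonb)""")
# Shard by tenant_id, then create 16 shards with 2 replicas each
# across the configured worker nodes (the RDS instances).
cur.execute("SELECT master_create_distributed_table('events', 'tenant_id')")
cur.execute("SELECT master_create_worker_shards('events', 16, 2)")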