How do I decide which instance type to use for data nodes and master nodes? Are there any guidelines for that?
Is it necessary to use master nodes in Elasticsearch? How do they help? Can we have a cluster without any master nodes?
Does the Elasticsearch Service cluster cost also include the cost of the master node instances?
Can we change the instance types and increase the number of data nodes later without any downtime? E.g. if we find we need more memory, or the current instance type is not a good fit?
It totally depends on your use case: traffic, types of search queries (real-time or back-office), whether the workload is read-heavy (website search) or write-heavy (log analysis), etc. It's a very open-ended question, and you would approach it the same way you plan capacity for your other systems. But since master nodes are only used for lightweight cluster-wide actions, they can be much smaller than the data and coordinating nodes, where the actual heavy lifting of search, aggregation, indexing, etc. happens.
Yes, a master node is required; the tasks it performs are listed below. You can mark any node in an Elasticsearch cluster as master-eligible, and it's not mandatory to have a dedicated master node, but it's good practice to have one for a healthy cluster state.
The master node is responsible for lightweight cluster-wide actions
such as creating or deleting an index, tracking which nodes are part
of the cluster, and deciding which shards to allocate to which nodes.
It is important for cluster health to have a stable master node.
It's clearly stated on their pricing page: you are charged for Amazon Elasticsearch Service instance hours, Amazon EBS storage (if you choose that option), and data transfer. Hence, if you create dedicated master nodes, you will have to pay for those instances as well.
You can do both. Adding more data nodes to an existing ES cluster doesn't require any downtime. Changing the instance type does require downtime on the node being replaced, but with a rolling upgrade you can avoid overall downtime for your ES cluster.
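For the managed Amazon Elasticsearch Service specifically, such a resize is just a configuration update on the domain, and the service handles the node replacement behind the scenes. A minimal sketch with boto3, where the domain name, instance types, and counts are placeholders for your own values:

import boto3

es = boto3.client("es")

# Hypothetical resize: change the data node type, add data nodes,
# and enable dedicated (separately billed) master nodes.
es.update_elasticsearch_domain_config(
    DomainName="my-domain",                          # placeholder domain name
    ElasticsearchClusterConfig={
        "InstanceType": "r5.large.elasticsearch",    # new data node type
        "InstanceCount": 4,                          # scale out data nodes
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": "c5.large.elasticsearch",
        "DedicatedMasterCount": 3,
    },
)

The service applies this as a blue/green style change, so the cluster keeps serving traffic while the new configuration is rolled in.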
Hope this answers your questions; let me know if you need more details.
Related
I can't find information on how to change my AWS Redis cluster instance type without losing the data in it.
Short Answer - No, you won't lose your data. Just don't reboot your master nodes.
Long Answer -
ElastiCache implements a node-replacement workflow whenever you modify your clusters.
New nodes are launched in the backend with the requested modification (in your case, a different node type) and then sync the data from your current nodes.
Once the nodes are in sync, the DNS records are switched over and the modification completes.
https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheNodes.NodeReplacement.html
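For reference, a minimal sketch of kicking off that modification with boto3; the replication group ID and target node type below are placeholders, not your actual values:

import boto3

elasticache = boto3.client("elasticache")

# Request a larger node type for an existing Redis replication group.
# ElastiCache launches replacement nodes, syncs the data, then flips DNS.
elasticache.modify_replication_group(
    ReplicationGroupId="my-redis-group",   # placeholder ID
    CacheNodeType="cache.r5.large",        # placeholder target node type
    ApplyImmediately=True,                 # or wait for the maintenance window
)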
PS - it's always a good idea to take a backup if the cluster has critical data.
Cheers.
So I am building an Akka cluster with 2.6.6. I am setting up a master node, which will be the seed node, and worker nodes that can dynamically leave or enter the cluster. I also have "client" nodes that should talk to the master node (possibly through a router), but not to the workers directly.
The problem right now is that sometimes, if too many workers shut down without properly leaving the cluster, the split-brain downing provider un-elects the master node as leader and hence shuts it down as well. On top of that, the "client" nodes are currently also part of the cluster and will be shut down in that case too, which should not happen.
Is there a way to pin leadership to the master node while still auto-downing the workers, but without downing the client nodes as well?
EDIT:
Maybe a bit more structured, this is what I'd like to accomplish:
The master node never shuts down automatically; if it crashes, it will be restarted manually
Worker nodes shut down if the master node is not available
Non-worker client nodes never shut down, but try to reconnect to the master node indefinitely if it is not available
Assuming you're using the now-open-sourced (formerly Lightbend commercial) split-brain resolver, the static-quorum strategy seems like a good fit.
The decision can be based on nodes with a configured role instead of all nodes in the cluster. This can be useful when some types of nodes are more valuable than others. You might, for example, have some nodes responsible for persistent data and some nodes with stateless worker services. Then it is probably more important to keep as many persistent data nodes as possible even though it means shutting down more worker nodes.
There is another use of the role as well. By defining a role for a few (e.g. 7) stable nodes in the cluster and using that in the configuration of static-quorum you will be able to dynamically add and remove other nodes without this role and still have good decisions of what nodes to keep running and what nodes to shut down in the case of network partitions. The advantage of this approach compared to keep-majority (described below) is that you do not risk splitting the cluster into two separate clusters, i.e. a split brain. You must still obey the rule of not starting too many nodes with this role as described above. It also suffers the risk of shutting down all nodes if there is a failure when there are not enough nodes with this role remaining in the cluster, as described above.
This could be accomplished with the following in application.conf:
akka.cluster.split-brain-resolver.active-strategy=static-quorum
akka.cluster.split-brain-resolver.static-quorum {
  # one leader node at a time
  quorum-size = 1
  role = "leader"
}
akka.cluster.roles = [ ${AKKA_CLUSTER_ROLE} ]
You would then specify the cluster role for each instance via the environment variable AKKA_CLUSTER_ROLE (setting it to leader on your leader node and worker or client as appropriate).
Since nodes are required to agree on the SBR strategy, the best you can do is have the client nodes die if the leader goes away.
I'll take this opportunity at the end to point out that having client nodes join an Akka cluster is perhaps a design decision worth revisiting: it strikes me as being well on the way to a distributed monolith. I'd hope that having clients interact with the cluster via HTTP or a message queue was seriously considered.
I have a basic cluster, which has a master and 2 nodes. The 2 nodes are part of an aws autoscaling group - asg1. These 2 nodes are running application1.
I need to be able to have further nodes, that are running application2 be added to the cluster.
Ideally, I'm looking to maybe have a multi-region setup, whereby application2 can be run in multiple regions but be part of the same cluster (not sure if that is possible).
So my question is, how do I add nodes to a cluster, more specifically in AWS?
I've seen a couple of articles whereby people have spun up the instances and then manually logged in to install the kubelet and various other things, but I was wondering if it could be done in a more automatic way?
Thanks
If you followed these instructions, you should have an autoscaling group for your minions.
Go to the AWS console and scale up that autoscaling group. That should do it.
If you did it somehow manually, you can clone a machine by selecting an existing minion/slave and choosing "Launch more like this".
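If you'd rather script that than click through the console, a rough sketch with boto3 (the autoscaling group name below is a placeholder for yours) could look like this:

import boto3

autoscaling = boto3.client("autoscaling")

# Bump the desired capacity of the minion autoscaling group by one node.
group_name = "kubernetes-minion-group"   # placeholder -- use your ASG's name
group = autoscaling.describe_auto_scaling_groups(
    AutoScalingGroupNames=[group_name]
)["AutoScalingGroups"][0]

# Note: this fails if the new capacity would exceed the group's MaxSize.
autoscaling.set_desired_capacity(
    AutoScalingGroupName=group_name,
    DesiredCapacity=group["DesiredCapacity"] + 1,
    HonorCooldown=False,
)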
As Pablo said, you should be able to add new nodes (in the same availability zone) by scaling up your existing ASG. This will provision new nodes that will be available for you to run application2. Unless your applications can't share the same nodes, you may also be able to run application2 on your existing nodes without provisioning new nodes if your nodes are big enough. In some cases this can be more cost effective than adding additional small nodes to your cluster.
To your other question, Kubernetes isn't designed to be run across regions. You can run a multi-zone configuration (in the same region) for higher availability applications (which is called Ubernetes Lite). Support for cross-region application deployments (Ubernetes) is currently being designed.
Can I run an Aerospike cluster under AWS Auto Scaling? For example, my initial autoscaling group size will be 3; if more traffic comes in and CPU utilization goes above 80%, it will add another instance to the cluster. Do you think this is possible? Does it have any disadvantages, or will it create any problems in the cluster?
There's an Amazon CloudFormation script at aerospike/aws-cloudformation that gives an example of how to launch such a cluster.
However, the point of autoscale is to grow shared-nothing worker nodes, such as webapps. These nodes typically don't have any shared data on them, you simply launch a new one and it's ready to work.
The point of adding a node to a distributed database like Aerospike is to have more data capacity and to even out the data across more nodes, which gives you an increased ability to handle operations (reads, writes, etc). Autoscaling Aerospike would probably not work as you expect. When a node is added to the cluster, a new (larger) cluster is formed and the data is automatically rebalanced. Part of balancing is migrating partitions of data between nodes, and it ends when the number of partitions across each node is even once again (and therefore the data is evenly spread across all the nodes of the cluster). Migrations are heavy, taking up network bandwidth.
This would work if you could time it to happen ahead of the traffic peaking, because then migrations could be completed ahead of time and your cluster would be ready for the next peak. You would not want to do this as peak traffic is occurring, because it would only make things worse. You also want to make sure that when the cluster contracts there is enough room for the data and enough DRAM for the primary index, as the per-node usage of both will grow.
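If you do want to grow the group ahead of a known peak rather than react to CPU, a scheduled scaling action is one way to do it. A hedged sketch with boto3, where the group name, lead time, and target size are placeholders:

import boto3
from datetime import datetime, timedelta

autoscaling = boto3.client("autoscaling")

# Grow the Aerospike group well before the expected peak so that
# partition migrations have time to finish under normal load.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="aerospike-cluster",               # placeholder group name
    ScheduledActionName="pre-peak-scale-out",
    StartTime=datetime.utcnow() + timedelta(hours=6),       # well before the peak
    DesiredCapacity=4,                                      # placeholder target size
)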
One more point of having extra capacity in Aerospike is to allow for rolling upgrades, where one node goes through upgrade at a time without needing to take down the entire cluster. Aerospike is typically used for realtime applications that require no downtime. At a minimum your cluster needs to be able to handle a node going down and have enough capacity to pick up the slack.
Just as a note, you have fine-grained configuration control over the rate at which migrations happen, but they run longer if you make the process less aggressive.
I would like to know if I can use Auto Scaling to automatically scale Amazon EC2 capacity up or down according to CPU utilization with Elastic MapReduce.
For example, I start a MapReduce job with only 1 instance, but if this instance reaches, say, 50% utilization, I want the Auto Scaling group I created to start a new instance. Is this possible?
Do you know if it is possible? Or, because Elastic MapReduce is "elastic", does it automatically start more instances when it needs them, without any configuration?
You need Qubole: http://www.qubole.com/blog/product/industrys-first-auto-scaling-hadoop-clusters/
We have never seen any of our users/customers use vanilla auto-scaling successfully with Hadoop. Hadoop is stateful. Nodes hold HDFS data and intermediate outputs. Deleting nodes based on cpu/memory just doesn't work. Adding nodes needs sophistication - this isn't a web site. One needs to look at the sizes of jobs submitted and the speed at which they are completing.
We run the largest Hadoop clusters, easily, on AWS (for our customers). And they auto-scale all the time. And they use spot instances. And it costs the same as EMR.
No, Auto Scaling cannot be used with Amazon Elastic MapReduce (EMR).
It is possible to scale EMR via API or Command-Line calls, adding and removing Task Nodes (which do not host HDFS storage). Note that it is not possible to remove Core Nodes (because they host HDFS storage, and removing nodes could lead to lost data). In fact, this is the only difference between Core and Task nodes.
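As a concrete sketch, resizing a Task instance group with boto3 might look like the following; the cluster ID is a placeholder, and the increment is arbitrary:

import boto3

emr = boto3.client("emr")

# Find the Task instance group of the cluster and grow it by two nodes.
cluster_id = "j-XXXXXXXXXXXXX"   # placeholder cluster ID
groups = emr.list_instance_groups(ClusterId=cluster_id)["InstanceGroups"]
task_group = next(g for g in groups if g["InstanceGroupType"] == "TASK")

emr.modify_instance_groups(
    ClusterId=cluster_id,
    InstanceGroups=[{
        "InstanceGroupId": task_group["Id"],
        "InstanceCount": task_group["RequestedInstanceCount"] + 2,
    }],
)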
It is also possible to change the number of nodes from within an EMR "Step". Steps are executed sequentially, so the cluster could be made larger prior to a step requiring heavy processing, and could be reduced in size in a subsequent step.
From the EMR Developer Guide:
You can have a different number of slave nodes for each cluster step. You can also add a step to a running cluster to modify the number of slave nodes. Because all steps are guaranteed to run sequentially by default, you can specify the number of running slave nodes for any step.
CPU would not be a good metric on which to base scaling of an EMR cluster, since Hadoop will keep all nodes as busy as possible when a job is running. A better metric would be the number of jobs waiting, so that they can finish quicker.
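As a hedged example of watching that signal, EMR publishes YARN metrics to CloudWatch that you could poll and feed into your own resize decision. The metric name below (AppsPending) and the cluster ID are assumptions; metric names vary by EMR/Hadoop release:

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")

# Average number of pending YARN applications over the last 15 minutes.
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/ElasticMapReduce",
    MetricName="AppsPending",                                        # assumed metric name
    Dimensions=[{"Name": "JobFlowId", "Value": "j-XXXXXXXXXXXXX"}],  # placeholder cluster ID
    StartTime=datetime.utcnow() - timedelta(minutes=15),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=["Average"],
)
pending = max((p["Average"] for p in stats["Datapoints"]), default=0)
if pending > 0:
    print("Jobs are queuing -- consider adding Task nodes")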
See also:
Stackoverflow: Can we add more Amazon Elastic Mapreduce instances into an existing Amazon Elastic Mapreduce instances?
Stackoverflow: Can Amazon Auto Scaling Service work with Elastic Map Reduce Service?