I have configured HA HDFS.
I am facing an issue with the ClusterID: the ClusterIDs of my namenodes and datanodes don't match, and therefore my datanodes are not starting.
How do I provide the same ClusterID for both namenodes?
Thanks.
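A common resolution sketch, assuming a standard HA setup: the ClusterID lives in the VERSION file under each node's storage directory, you format only one namenode, bootstrap the second from it, and then bring the datanodes in line. The paths below are examples, and formatting or wiping a directory destroys the data in it:

# On the first namenode only (optionally pass an explicit ID with -clusterid):
hdfs namenode -format
# On the second namenode, pull the metadata (including the ClusterID) from the first:
hdfs namenode -bootstrapStandby
# On each datanode, check the stored ClusterID (the path is whatever
# dfs.datanode.data.dir points to) and edit it to match the namenodes',
# or wipe the directory if the data is disposable:
cat /hadoop/hdfs/data/current/VERSION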
I'm facing this error in Kubernetes (0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/unreachable: }, that the pod didn't tolerate). My application server is down.
First, I added one file in a DaemonSet; due to memory allocation (we have only one node), all pods failed to schedule and stayed in the Pending state. If I delete all deployments and run any new deployment, it also shows Pending. Please help me sort out this issue. I also tried the taint commands, but they didn't work either.
Can I create a node in the existing cluster, or should I revoke the instance? Thanks in advance.
You need to configure autoscaling for the cluster (it is not enabled by default):
https://docs.aws.amazon.com/eks/latest/userguide/create-managed-node-group.html
Or, you can manually change the desired size of the node group, as sketched below.
Also, make sure that your deployment has resource requests that actually fit on your nodes.
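For a managed node group (per the EKS doc above), the manual resize can be done with the AWS CLI; the cluster and node group names below are placeholders:

aws eks update-nodegroup-config \
  --cluster-name my-cluster \
  --nodegroup-name my-nodegroup \
  --scaling-config minSize=1,maxSize=4,desiredSize=2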
We have configured a Kubernetes cluster on EC2 machines in our AWS account using the kops tool (https://github.com/kubernetes/kops), based on AWS posts (https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-kops/) as well as other resources.
We want to set up a K8s cluster of masters and slaves such that:
It will automatically resize (both masters and nodes/slaves) based on system load.
It runs in multi-AZ mode, i.e. at least one master and one slave in every availability zone (AZ) of the same region, e.g. us-east-1a, us-east-1b, us-east-1c, and so on.
We tried to configure the cluster in the following ways to achieve the above.
We created a K8s cluster on AWS EC2 machines using kops with the configuration below: node count=3, master count=3, zones=us-east-1c, us-east-1b, us-east-1a. We observed that a K8s cluster was created with 3 master and 3 slave nodes, with one master and one slave in each of the 3 AZs.
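For reference, that setup corresponds roughly to a kops invocation like the following; the cluster name and state store are placeholders:

kops create cluster \
  --cloud aws \
  --node-count 3 \
  --master-count 3 \
  --zones us-east-1a,us-east-1b,us-east-1c \
  --master-zones us-east-1a,us-east-1b,us-east-1c \
  --name my-cluster.example.com \
  --state s3://my-kops-state-store \
  --yes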
Then we tried to resize the nodes/slaves in the cluster using the cluster autoscaler (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-run-on-master.yaml). We set node_asg_min to 3 and node_asg_max to 5. When we increased the workload on the slaves enough to trigger the auto-scaling policy, we saw that additional slave nodes (beyond the default 3 created during setup) were spawned and joined the cluster in various AZs. This worked as expected; there is no question here.
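Those min/max values end up in the autoscaler's --nodes flag, which has the form --nodes=<min>:<max>:<ASG name>; for example, with the ASG name being a placeholder:

--nodes=3:5:nodes.my-cluster.example.com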
We also wanted to set up the cluster such that the number of masters increases based on system load. Is there some way to achieve this? We tried a couple of approaches and the results are shared below:
A) We were not sure whether the cluster autoscaler helps here, but we nevertheless tried to resize the masters in the cluster using (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-run-on-master.yaml). This is useful while creating a new cluster, but it was not useful for resizing the number of masters in an existing cluster. We did not find a parameter to specify node_asg_min and node_asg_max for masters the way it exists for slave nodes. Is there some way to achieve this?
B) We increased the MIN count from 1 to 3 in the ASG (auto-scaling group) associated with one of the three IGs (instance groups), one per master. We found that new instances were created. However, they did not join the master cluster. Is there some way to achieve this?
Could you please point us to steps or resources on how to do this correctly, so that we can configure the number of masters to resize automatically based on system load while remaining in multi-AZ mode?
Kind regards,
Shashi
There is no need to scale Master nodes.
Master components provide the cluster's control plane. They make global decisions about the cluster (for example, scheduling) and detect and respond to cluster events (for example, starting up a new pod when a replication controller's 'replicas' field is unsatisfied).
Master components can be run on any machine in the cluster. However, for simplicity, setup scripts typically start all master components on the same machine and do not run user containers on this machine. See Building High-Availability Clusters for an example multi-master-VM setup.
The master node consists of the following components:
kube-apiserver
Component on the master that exposes the Kubernetes API. It is the front-end for the Kubernetes control plane.
etcd
Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data.
kube-scheduler
Component on the master that watches newly created pods that have no node assigned, and selects a node for them to run on.
kube-controller-manager
Component on the master that runs controllers.
cloud-controller-manager
Component on the master that runs controllers interacting with the underlying cloud providers. The cloud-controller-manager binary is an alpha feature introduced in Kubernetes release 1.6.
For a more detailed explanation, please read the Kubernetes Components docs.
Also, if you are thinking about HA, you can read about Creating Highly Available Clusters with kubeadm.
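For a rough idea of what the kubeadm HA flow looks like (the load-balancer endpoint, token, hash, and certificate key below are all placeholders):

# On the first control-plane node:
kubeadm init --control-plane-endpoint "lb.example.com:6443" --upload-certs
# On each additional control-plane node:
kubeadm join lb.example.com:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <key>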
I think your assumption is that, similar to Kubernetes nodes, masters divide the work between each other. That is not the case, because the main task of the masters is to maintain consensus with each other. This is done with etcd, which is a distributed key-value store. Maintaining such a store is easy on 1 machine but gets harder the more machines you add.
The advantage of adding masters is being able to survive more master failures, at the cost of having to make all masters fatter (more CPU/RAM, etc.) so that they still perform well enough.
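The usual etcd quorum arithmetic (quorum = floor(n/2) + 1) illustrates this: extra members buy failure tolerance, not throughput, and an even-sized cluster tolerates no more failures than the odd size below it.

members   quorum   failures tolerated
3         2        1
4         3        1
5         3        2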
I have a Kubernetes cluster on AWS deployed via kops, consisting of 3 master nodes, each in a different AZ. As is well known, kops deploys etcd on each master node through two pods, each of which mounts an EBS volume for saving its state. If you lose the volumes of 2 of the 3 masters, you automatically lose consensus among the masters.
Is there a way to use the information from the only master that still has the state of the cluster, and re-establish quorum among the three masters on that state? I recreated this scenario, but the cluster becomes unavailable, and I can no longer access the etcd pods of any of the 3 masters because those pods fail with an error. Moreover, etcd itself becomes read-only, and it is impossible to add or remove members of the cluster in order to attempt manual intervention.
Tips? Thanks to all of you.
This is documented here. There's also another guide here.
You basically have to back up your cluster and create a brand new one.
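A rough sketch of that cycle with etcdctl (v3 API); the endpoints, member name, and file paths are placeholders, and TLS cert flags are omitted for brevity:

# Take a snapshot from the surviving member:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 snapshot save /tmp/etcd-backup.db
# Restore it as a fresh single-member cluster, then grow the cluster back to 3:
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
  --name master-0 \
  --initial-cluster master-0=https://10.0.0.1:2380 \
  --initial-advertise-peer-urls https://10.0.0.1:2380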
I have the Redisson cluster configuration below in a YAML file:
subscriptionConnectionMinimumIdleSize: 1
subscriptionConnectionPoolSize: 50
slaveConnectionMinimumIdleSize: 32
slaveConnectionPoolSize: 64
masterConnectionMinimumIdleSize: 32
masterConnectionPoolSize: 64
readMode: "SLAVE"
subscriptionMode: "SLAVE"
nodeAddresses:
- "redis://X.X.X.X:6379"
- "redis://Y.Y.Y.Y:6379"
- "redis://Z.Z.Z.Z:6379"
I understand it is enough to give the IP address of one master node in the configuration and Redisson automatically identifies all the nodes in the cluster, but my questions are below:
1. Are all nodes identified at the boot of the application and used for future connections?
2. What if one of the master nodes goes down while the application is running? Will requests to that master fail and the Redisson API automatically try contacting the other master nodes, or does it try to connect to the same master node repeatedly and fail?
3. Is it a best practice to give DNS names instead of server IPs?
Answering your questions:
That's correct, all nodes are identified during the boot process. If you use Config.readMode = MASTER_SLAVE or SLAVE (which is the default), then all nodes will be used. If you use Config.readMode = MASTER, then only master nodes are used.
Redisson keeps trying to reach the master node until the moment of a Redis topology update. Until that moment, it doesn't have information about the newly elected master node.
Cloud services like AWS ElastiCache and Azure Cache provide a single hostname bound to multiple IP addresses.
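So with ElastiCache, for example, nodeAddresses can point at the cluster's configuration endpoint rather than at individual IPs; the hostname below is a made-up example:

nodeAddresses:
- "redis://my-cluster.abc123.clustercfg.use1.cache.amazonaws.com:6379"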
I created a new cluster in Amazon AWS with 3 instances, following the instructions on the DataStax installation page for Cassandra.
I also set up everything in the cassandra.yaml file, as provided by the DataStax instructions for adding a new node. They all match.
Now, when I run nodetool status, I get 3 nodes in my cluster, which is great! However, 2 of the nodes are reported as down: always the two nodes other than the one I am running nodetool status from.
For example: if I run nodetool status from node1, it shows node0 and node2 down. If I run it from node0, it shows node1 and node2 down.
Anyone know why this is happening? I set up the security groups as the documentation said.
Thanks
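A possible diagnostic sketch, assuming the usual culprit on EC2: inter-node connectivity rather than cassandra.yaml. Nodes gossip on storage_port 7000 by default, so that port must be open between the instances in the security group, and listen_address/broadcast_address must be reachable by the peers. The IPs and config path below are placeholders:

# From node1, check that the gossip port on the other nodes is reachable:
nc -zv 172.31.0.10 7000
nc -zv 172.31.0.12 7000
# Confirm the addresses Cassandra advertises:
grep -E 'listen_address|broadcast_address|seeds' /etc/cassandra/cassandra.yaml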