Tracking Master node failure in Multi-master AWS cluster - amazon-web-services

I am using EMR cluster version 5.26 in AWS, which supports having multiple master nodes (3 master nodes). This removes the single point of failure of the cluster. When a master node gets terminated, another node takes its place as master node and keeps the EMR cluster and its steps running.
The issue is that I am trying to track the exact time when a master node runs into a problem (termination), as well as the time taken by another node to take its place and become the new master node.
I couldn't find any detailed documentation on tracking the failure of a master node in an AWS multi-master cluster, hence posting it here.
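A possible starting point is the instance timeline of the MASTER instance group, e.g. via the AWS CLI (the cluster ID below is a placeholder); the gap between the old master's EndDateTime and the replacement's ReadyDateTime gives at least a rough failover window, though something more precise would be better:

# list all master instances, including terminated ones, with their timeline timestamps
aws emr list-instances \
    --cluster-id j-XXXXXXXXXXXXX \
    --instance-group-types MASTER \
    --query 'Instances[].{Id:Ec2InstanceId,State:Status.State,Created:Status.Timeline.CreationDateTime,Ready:Status.Timeline.ReadyDateTime,Ended:Status.Timeline.EndDateTime}' \
    --output table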

Related

Google cloud kubernetes switching off a node

I'm currently testing out Google Cloud for a home project. I only require the node to run during a certain time slot. When I switch the node off, it automatically switches itself on again. I'm not sure if I am missing something, as I did not enable autoscaling and it's also a General Purpose e2-small instance.
Kubernetes nodes are managed by a node pool, which you probably created during cluster creation if you are using GKE.
The node pool manages the number of available nodes, so there is a good chance that new nodes are being created again or that the existing node is being started back up.
If you are on GKE and want to scale down to zero, you can reduce the node count of the node pool from the GKE console.
Check your node pool: https://cloud.google.com/kubernetes-engine/docs/how-to/node-pools#console_1
Resize your node pool from here: https://cloud.google.com/kubernetes-engine/docs/how-to/node-pools#resizing_a_node_pool
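As a command-line sketch of the same resize (cluster, pool, and zone names are placeholders):

gcloud container clusters resize CLUSTER_NAME \
    --node-pool POOL_NAME \
    --num-nodes 0 \
    --zone COMPUTE_ZONE

Setting the count back to the desired number later brings the node (and your workloads) back up.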

AWS Redshift Node Failure - is the entire cluster unavailable despite having multiple nodes?

I was looking at the official FAQ for Redshift. They indicate that if a node goes down, "the cluster remains unavailable while Redshift replaces the node." I was wondering if this applies to Redshift clusters having multiple nodes? Redshift supports up to 120 nodes - so if 1 node goes down, is the entire Redshift cluster still unavailable?
(My confusion is that I wasn't able to properly discern whether the FAQ was talking about a cluster with 1 node or multiple.)
For a single-node cluster, the single compute node is also the leader node. Node data is not replicated to other nodes, since there are no other nodes to copy it to. A disk failure or a node failure will cause the database to crash completely, and it will need to be restarted and restored from an S3 snapshot. Because of this, single-node Redshift "clusters" are not recommended for production workloads; they are meant for trying out Redshift and for dev work.
In a multi-node Redshift cluster, the data from every compute node is replicated to some set of the other nodes. If a disk fails on a node, the data can be fetched from these other nodes/disks until the disk is replaced and the data is brought back onto the new disk. This is all seamless, and unless you look at the logs you likely won't notice.
A compute node failure will cause a short pause in the operation of your cluster while a replacement node is provisioned. Once the replacement is up, the cluster will start executing queries using the backup node data from the other nodes. In a short while the node will be "refilled" with all of its data. Node failures are much rarer than disk failures.
If the leader node fails, on any size cluster, the database will crash and will need to be restored from an S3 snapshot. Leader node failures are very rare, but this is why Redshift is not a fully HA database. Sometimes people will run two Redshift clusters, set up as a main and a backup cluster, for this reason.
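If you want to watch this happen, a rough sketch with the AWS CLI (the cluster identifier is a placeholder) is to poll the cluster status and node roles while a node is being replaced:

aws redshift describe-clusters \
    --cluster-identifier my-cluster \
    --query 'Clusters[0].{Status:ClusterStatus,Nodes:ClusterNodes[].NodeRole}'

The LEADER entry in the node list is the leader node; the COMPUTE entries are the compute nodes described above.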

How to disable node auto-repair

How do I disable GKE cluster node maintenance auto-repair using Terraform? It seems I can't stop the nodes or change the settings of the GKE nodes from the GCP console, so I guess I'll have to try it using Terraform even if it recreates the cluster.
How does the maintenance happen? I think it migrates all the pods to the secondary node and then restarts the first node, correct? But what if there aren't enough resources available for the secondary node to handle all the pods from the primary node? Will GCP create a new node? For example: the primary node has around 110 pods and the secondary node has 110 pods. How does the maintenance happen if the nodes need to be restarted?
You can disable node auto-repair by running the following command in the GCP shell:
gcloud container node-pools update <pool-name> \
    --cluster <cluster-name> \
    --zone <compute-zone> \
    --no-enable-autorepair
You will find how to do it using the GCP console in this link as well.
If you are still facing issues and want to disable node auto-repair using Terraform, you have to specify in the management argument whether you want to enable auto-repair, as sketched below. You can find further details in Terraform's documentation.
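A minimal sketch of that management block (the resource names and node count are just examples):

resource "google_container_node_pool" "primary_nodes" {
  name       = "my-node-pool"
  cluster    = google_container_cluster.primary.name
  location   = google_container_cluster.primary.location
  node_count = 2

  # disable both auto-repair and auto-upgrade for this pool
  management {
    auto_repair  = false
    auto_upgrade = false
  }
}

Changing only the management settings of an existing node pool should be an in-place update rather than a recreation of the cluster.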
Here you can also find how the node repair process works:
If GKE detects that a node requires repair, the node is drained and re-created. GKE waits one hour for the drain to complete. If the drain doesn't complete, the node is shut down and a new node is created.
If multiple nodes require repair, GKE might repair nodes in parallel. GKE balances the number of repairs depending on the size of the cluster and the number of broken nodes. GKE will repair more nodes in parallel on a larger cluster, but fewer nodes as the number of unhealthy nodes grows.
If you disable node auto-repair at any time during the repair process, in-progress repairs are not cancelled and continue for any node currently under repair.

Questions on AWS Elasticsearch Service Cluster Setup

How to decide the instance type to use for data nodes and master nodes? Any guidelines for that?
Is it necessary to use master nodes in Elasticsearch? How do they help? Can we have a cluster without any master nodes?
Does the Elasticsearch Service cluster cost also include the cost of the master node instances?
Can we change the instance types and increase the number of data nodes later without any downtime? E.g., if we find we need more memory or the current instance type is not that useful?
It totally depends on your use case, traffic, types of search queries (real-time or back-office), whether the workload is read-heavy (website search) or write-heavy (log analysis), etc. It's a very open-ended question, and you would approach it the same way you plan capacity for your other systems. But as master nodes are only used for lightweight cluster-wide actions, they can be much smaller than the data and coordinating nodes, where the actual heavy lifting of search, aggregation, indexing, etc. happens.
Yes, it's required to have a master node, and below are the tasks performed by master nodes. Although you can mark any node in an Elasticsearch cluster as master-eligible and it's not mandatory to have a dedicated master node, it's good practice to have dedicated master nodes for a healthy cluster state.
The master node is responsible for lightweight cluster-wide actions
such as creating or deleting an index, tracking which nodes are part
of the cluster, and deciding which shards to allocate to which nodes.
It is important for cluster health to have a stable master node.
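On Amazon Elasticsearch Service, dedicated masters are part of the domain configuration rather than something you set in elasticsearch.yml yourself. As a sketch with the AWS CLI (the domain name, instance type, and count are placeholders):

aws es update-elasticsearch-domain-config \
    --domain-name my-domain \
    --elasticsearch-cluster-config DedicatedMasterEnabled=true,DedicatedMasterType=c5.large.elasticsearch,DedicatedMasterCount=3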
It's clearly mentioned on their site: you are charged only for Amazon Elasticsearch Service instance hours, Amazon EBS storage (if you choose this option), and data transfer. Hence, if you create dedicated master nodes, you will have to pay for those instance hours as well.
You can do both. Adding more data nodes to an existing ES cluster doesn't require any downtime, and while changing the instance type requires downtime on that node, with a rolling upgrade you can avoid overall downtime for your ES cluster.
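The same update-elasticsearch-domain-config call shown above can be used to add data nodes or switch their instance type (the values below are placeholders); the service applies the change without taking the whole domain offline:

aws es update-elasticsearch-domain-config \
    --domain-name my-domain \
    --elasticsearch-cluster-config InstanceType=r5.large.elasticsearch,InstanceCount=6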
I hope this is a satisfactory answer; let me know if you need more details.

Disaster Recovery Kops Kubernetes Master Node on AWS

I currently have an HA cluster (with three masters, one for each AZ) deployed on AWS through kops. Kops deploys a K8s cluster with a pod for etcd-events and a pod for etcd-server on every master node. Each of these pods uses a mounted volume.
Everything works well; for example, when a master dies, the autoscaling group creates another master node in the same AZ, which recovers its volume and joins the cluster. The problem I have is with respect to a disaster, the failure of an entire AZ.
What happens if an AZ has problems? I periodically take EBS volume snapshots, and if I create a new volume from a snapshot (with the right tags to be discovered and attached to the new instance), the new instance mounts the new volume, but after that it isn't able to join the old cluster. My plan was to create a Lambda function, triggered by a CloudWatch event, that creates a new master instance in one of the two healthy AZs with a volume restored from a snapshot of the old EBS volume. But this plan has flaws, because it seems I am missing something about Raft, etcd, and their behavior. (I say that because I get errors from the other master nodes, and the new node isn't able to join the cluster.)
Suggestions?
Theoretically, how do you recover from a single-AZ disaster, and from the situation where all the masters have died? I have the EBS snapshots. Is it sufficient to use them?
I'm not sure exactly how you are restoring the failed node, but technically the first thing you want to recover is your etcd node, because that's where all the Kubernetes state is stored.
Since your cluster is up and running, you don't need to restore from scratch; you just need to remove the old member and add the new node to etcd. You can find out more on how to do it here. You don't really need to restore any old volume to this node, since it will sync up with the other existing members.
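A sketch of that member swap with etcdctl, run from a healthy master (the client endpoint, certificate paths, peer URL, and member name are placeholders that depend on your kops/etcd layout):

export ETCDCTL_API=3
FLAGS="--endpoints=https://127.0.0.1:4001 --cacert=/path/to/etcd-ca.crt --cert=/path/to/etcd-client.crt --key=/path/to/etcd-client.key"

# find the ID of the member that lived in the failed AZ
etcdctl $FLAGS member list
# remove it, then register the replacement master before it starts etcd
etcdctl $FLAGS member remove OLD_MEMBER_ID
etcdctl $FLAGS member add etcd-new --peer-urls=https://NEW_MASTER_IP:2380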
Then, after this, you can start the other services such as kube-apiserver, kube-controller-manager, etc.
Having said that, if you keep the same IP address and the exact same physical configuration, you should be able to recover without removing the etcd member and adding a new one.