I have the Redisson cluster configuration below in a YAML file:
subscriptionConnectionMinimumIdleSize: 1
subscriptionConnectionPoolSize: 50
slaveConnectionMinimumIdleSize: 32
slaveConnectionPoolSize: 64
masterConnectionMinimumIdleSize: 32
masterConnectionPoolSize: 64
readMode: "SLAVE"
subscriptionMode: "SLAVE"
nodeAddresses:
- "redis://X.X.X.X:6379"
- "redis://Y.Y.Y.Y:6379"
- "redis://Z.Z.Z.Z:6379"
I understand it is enough to give the IP address of one master node in the configuration and Redisson automatically identifies all the nodes in the cluster, but my questions are below:
1. Are all nodes identified when the application boots and used for future connections?
2. What if one of the master nodes goes down while the application is running? The request to that particular master will fail, but does the Redisson API automatically try contacting the other master nodes, or does it try to connect to the same master node repeatedly and fail?
3. Is it a best practice to give DNS names instead of server IPs?
Answering your questions:
That's correct: all nodes are identified during the boot process. If you use Config.readMode = MASTER_SLAVE or SLAVE (which is the default), then all nodes will be used. If you use Config.readMode = MASTER, then only the master nodes are used.
Redisson keeps trying to reach the master node until the Redis topology is updated; until that moment it has no information about the newly elected master.
Cloud services like AWS ElastiCache and Azure Cache provide a single hostname bound to multiple IP addresses.
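For example, with such an endpoint (e.g. the ElastiCache configuration endpoint) the same configuration can point at a single DNS name instead of listing IPs; this is only a sketch and the hostname below is a placeholder for whatever endpoint your provider gives you:
readMode: "MASTER_SLAVE"        # reads go to slaves and fall back to the master; SLAVE is the default
subscriptionMode: "SLAVE"
nodeAddresses:
- "redis://my-redis-cluster.example.com:6379"   # placeholder DNS name that resolves to the cluster nodes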
I have an existing cluster running k8s version 1.12.8 on AWS EC2. The cluster contains several pods - some serving web traffic and others configured as scheduled CronJobs. The cluster has been running fine in its current configuration for at least 6 months, with CronJobs running every 5 minutes.
Recently, the CronJobs simply stopped. Viewing the pods via kubectl shows that all the scheduled CronJobs last ran at roughly the same time. Logs sent to AWS CloudWatch show no error output, and stop at the same time kubectl shows for the last run.
In trying to diagnose this issue I have found a broader pattern of the cluster being unresponsive to changes, eg: I cannot retrieve logs or nodes via kubectl.
I have deleted Pods in ReplicaSets and they never return. I have set autoscale values on ReplicaSets and nothing happens.
Investigation of the kubelet logs on the master instance revealed repeating errors, coinciding with the time the failure was first noticed:
I0805 03:17:54.597295 2730 kubelet.go:1928] SyncLoop (PLEG): "kube-scheduler-ip-x-x-x-x.z-west-y.compute.internal_kube-system(181xxyyzz)", event: &pleg.PodLifecycleEvent{ID:"181xxyyzz", Type:"ContainerDied", Data:"405ayyzzz"}
...
E0805 03:18:10.867737 2730 kubelet_node_status.go:378] Error updating node status, will retry: failed to patch status "{\"status\":{\"$setElementOrder/conditions\":[{\"type\":\"NetworkUnavailable\"},{\"type\":\"OutOfDisk\"},{\"type\":\"MemoryPressure\"},{\"type\":\"DiskPressure\"},{\"type\":\"PIDPressure\"},{\"type\":\"Ready\"}],\"conditions\":[{\"lastHeartbeatTime\":\"2020-08-05T03:18:00Z\",\"type\":\"OutOfDisk\"},{\"lastHeartbeatTime\":\"2020-08-05T03:18:00Z\",\"type\":\"MemoryPressure\"},{\"lastHeartbeatTime\":\"2020-08-05T03:18:00Z\",\"type\":\"DiskPressure\"},{\"lastHeartbeatTime\":\"2020-08-05T03:18:00Z\",\"type\":\"PIDPressure\"},{\"lastHeartbeatTime\":\"2020-08-05T03:18:00Z\",\"type\":\"Ready\"}]}}" for node "ip-172-20-60-88.eu-west-2.compute.internal": Patch https://127.0.0.1/api/v1/nodes/ip-x-x-x-x.z-west-y.compute.internal/status?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
...
E0805 03:18:20.869436 2730 kubelet_node_status.go:378] Error updating node status, will retry: error getting node "ip-172-20-60-88.eu-west-2.compute.internal": Get https://127.0.0.1/api/v1/nodes/ip-172-20-60-88.eu-west-2.compute.internal?timeout=10s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Running docker ps on the master node shows that both the k8s_kube-controller-manager_kube-controller-manager and k8s_kube-scheduler_kube-scheduler containers were started 6 days ago, whereas the other k8s containers are at 8+ months.
tl;dr
A container on my master node (likely kube-scheduler, kube-controller-manager or both) died. The containers have come back up but are unable to communicate with the existing nodes - this is preventing any scheduled CronJobs or new deployments from being satisfied.
How can I re-configure the kubelet and associated services on the master node to communicate with the worker nodes again?
From the docs on Troubleshoot Clusters
Digging deeper into the cluster requires logging into the relevant machines. Here are the locations of the relevant log files. (note that on systemd-based systems, you may need to use journalctl instead)
Master Nodes
/var/log/kube-apiserver.log - API Server, responsible for serving the API
/var/log/kube-scheduler.log - Scheduler, responsible for making scheduling decisions
/var/log/kube-controller-manager.log - Controller that manages replication controllers
Worker Nodes
/var/log/kubelet.log - Kubelet, responsible for running containers on the node
/var/log/kube-proxy.log - Kube Proxy, responsible for service load balancing
Another way to get logs is to use docker ps to get the container ID and then use docker logs containerid.
If you have a monitoring system set up using Prometheus and Grafana (which you should), you can check metrics such as high CPU load on the API server pods.
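As a rough illustration of that idea (the metric name and job label depend entirely on how your Prometheus scrapes the API server, so treat this purely as a sketch), an alerting rule for sustained high CPU on the API server could look like:
groups:
- name: control-plane
  rules:
  - alert: ApiServerHighCpu                                              # hypothetical alert name
    expr: rate(process_cpu_seconds_total{job="apiserver"}[5m]) > 0.9     # more than ~90% of one core over 5 minutes
    for: 10m
    labels:
      severity: warning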
AWS has a maintenance window for each region (https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/maintenance-window.html), but I could not find any documentation about how it works with multiple AZs in the same region.
I have a Redis cache configured with a replica in a different AZ in the same region. The whole purpose of configuring the replica in a different AZ is that if one AZ is not available, requests are served from the next AZ.
When they do maintenance, do they take down the whole region or individual availability zones?
You should read the FAQ on ElastiCache maintenance https://aws.amazon.com/elasticache/elasticache-maintenance/
This says that if you have a multi-AZ deployment, it will take down the instances one at a time, triggering a failover to the read replica, and then create new instances before taking down the rest, so you should not experience any interruptions in your service.
Thanks @morras for the above link, which explains how ElastiCache handles the maintenance window. Below are three questions I have taken from that link, with their explanations.
1. How long does a node replacement take?
A replacement typically completes within a few minutes. The replacement may take longer in certain instance configurations and traffic patterns. For example, Redis primary nodes may not have enough free memory, and may be experiencing high write traffic. When an empty replica syncs from this primary, the primary node may run out of memory trying to address the incoming writes as well as sync the replica. In that case, the master disconnects the replica and restarts the sync process. It may take multiple attempts for the replica to sync successfully. It is also possible that the replica may never sync if the incoming write traffic continues to remain high.
Memcached nodes do not need to sync during replacement and are always replaced fast irrespective of node sizes.
2. How does a node replacement impact my application?
For Redis nodes, the replacement process is designed to make a best effort to retain your existing data and requires successful Redis replication. For single node Redis clusters, ElastiCache dynamically spins up a replica, replicates the data, and then fails over to it. For replication groups consisting of multiple nodes, ElastiCache replaces the existing replicas and syncs data from the primary to the new replicas. If Multi-AZ or Cluster Mode is enabled, replacing the primary triggers a failover to a read replica. If Multi-AZ is disabled, ElastiCache replaces the primary and then syncs the data from a read replica. The primary will be unavailable during this time.
For Memcached nodes, the replacement process brings up an empty new node and terminates the current node. The new node will be unavailable for a short period during the switch. Once switched, your application may see performance degradation while the empty new node is populated with cache data.
3. What best practices should I follow for a smooth replacement experience and minimize data loss?
For Redis nodes, the replacement process is designed to make a best effort to retain your existing data and requires successful Redis replication. We try to replace just enough nodes from the same cluster at a time to keep the cluster stable. You can provision primary and read replicas in different availability zones. In this case, when a node is replaced, the data will be synced from a peer node in a different availability zone. For single node Redis clusters, we recommend that sufficient memory is available to Redis, as described here. For Redis replication groups with multiple nodes, we also recommend scheduling the replacement during a period with low incoming write traffic.
For Memcached nodes, schedule your maintenance window during a period with low incoming write traffic, test your application for failover and use the ElastiCache provided "smarter" client. You cannot avoid data loss as Memcached has data purely in memory.
I have access to a kops-built Kubernetes cluster on AWS EC2 instances. I would like to make sure that all available security patches from the corresponding package manager are applied. Unfortunately, after searching the whole internet for hours I am unable to find any clue on how this should be done. Taking a look into the user data of the launch configurations, I did not find a line for the package manager - therefore I am not sure if a simple node restart will do the trick, and I also want to make sure that new nodes come up with current packages.
How do I apply security patches on new nodes of a Kubernetes cluster, and how do I make sure that all nodes are and stay up to date?
You might want to explore https://github.com/weaveworks/kured
Kured (KUbernetes REboot Daemon) is a Kubernetes daemonset that performs safe automatic node reboots when the need to do so is indicated by the package management system of the underlying OS.
Watches for the presence of a reboot sentinel e.g. /var/run/reboot-required
Utilises a lock in the API server to ensure only one node reboots at a time
Optionally defers reboots in the presence of active Prometheus alerts or selected pods
Cordons & drains worker nodes before reboot, uncordoning them after
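For orientation, the DaemonSet that kured ships looks roughly like the sketch below. It is heavily trimmed (the upstream manifest also creates the service account and RBAC objects), and the image tag and flag values here are illustrative only, so use the manifest from the kured repository as the source of truth:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kured
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kured
  template:
    metadata:
      labels:
        name: kured
    spec:
      serviceAccountName: kured
      hostPID: true                                   # kured reboots the host, so it needs the host PID namespace
      containers:
      - name: kured
        image: docker.io/weaveworks/kured:1.2.0       # illustrative tag
        securityContext:
          privileged: true                            # required to trigger the reboot on the node
        env:
        - name: KURED_NODE_ID                         # tells kured which node it is running on
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        command:
        - /usr/bin/kured
        - --reboot-sentinel=/var/run/reboot-required  # file whose presence triggers a reboot
        - --period=1h                                 # how often to check the sentinel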
We have configured Kubernetes cluster on EC2 machines in our AWS account using kops tool (https://github.com/kubernetes/kops) and based on AWS posts (https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-kops/) as well as other resources.
We want to set up a K8s cluster of masters and slaves such that:
It will automatically resize (both masters as well as nodes/slaves) based on system load.
Runs in multi-AZ mode, i.e. at least one master and one slave in every AZ (availability zone) in the same region, e.g. us-east-1a, us-east-1b, us-east-1c and so on.
We tried to configure the cluster in the following ways to achieve the above.
We created a K8s cluster on AWS EC2 machines using kops with the below configuration: node count=3, master count=3, zones=us-east-1c, us-east-1b, us-east-1a. We observed that a K8s cluster was created with 3 master and 3 slave nodes, with one master and one slave in each of the 3 AZs.
Then we tried to resize the nodes/slaves in the cluster using (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-run-on-master.yaml). We set node_asg_min to 3 and node_asg_max to 5. When we increased the workload on the slaves such that the autoscale policy was triggered, we saw that additional slave nodes (beyond the default 3 created during setup) were spawned, and they did join the cluster in various AZs. This worked as expected. There is no question here.
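For reference, the part of that cluster-autoscaler manifest that encodes the min/max is the --nodes flag on the container command; the values below mirror the 3:5 range above and the instance-group/ASG name is a placeholder for your worker group:
      command:
      - ./cluster-autoscaler
      - --v=4
      - --stderrthreshold=info
      - --cloud-provider=aws
      - --skip-nodes-with-local-storage=false
      - --nodes=3:5:nodes.k8s.example.com      # min:max:name of the worker ASG/instance group (placeholder name)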
We also wanted to set up the cluster such that the number of masters increases based on system load. Is there some way to achieve this? We tried a couple of approaches and results are shared below:
A) We were not sure if the cluster-autoscaler helps here, but nevertheless tried to resize the masters in the cluster using (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-run-on-master.yaml). This is useful while creating a new cluster but was not useful for resizing the number of masters in an existing cluster. We did not find a parameter to specify node_asg_min and node_asg_max for masters the way they are present for slave nodes. Is there some way to achieve this?
B) We increased the MIN count from 1 to 3 in the ASG (auto-scaling group) associated with one of the three IGs (instance groups), one for each master. We found that new instances were created. However, they did not join the master cluster. Is there some way to achieve this?
Could you please point us to steps, resources on how to do this correctly so that we could configure the number of masters to automatically resize based on system load and is in Multi-AZ mode?
Kind regards,
Shashi
There is no need to scale Master nodes.
Master components provide the cluster’s control plane. Master components make global decisions about the cluster (for example, scheduling), and detecting and responding to cluster events (starting up a new pod when a replication controller’s ‘replicas’ field is unsatisfied).
Master components can be run on any machine in the cluster. However, for simplicity, set up scripts typically start all master components on the same machine, and do not run user containers on this machine. See Building High-Availability Clusters for an example multi-master-VM setup.
Master node consists of the following components:
kube-apiserver
Component on the master that exposes the Kubernetes API. It is the front-end for the Kubernetes control plane.
etcd
Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data.
kube-scheduler
Component on the master that watches newly created pods that have no node assigned, and selects a node for them to run on.
kube-controller-manager
Component on the master that runs controllers.
cloud-controller-manager
runs controllers that interact with the underlying cloud providers. The cloud-controller-manager binary is an alpha feature introduced in Kubernetes release 1.6.
For more detailed explanation please read the Kubernetes Components docs.
Also if You are thinking about HA, you can read about Creating Highly Available Clusters with kubeadm
I think your assumption is that, similar to Kubernetes worker nodes, masters divide the work between each other. That is not the case, because the main task of the masters is to maintain consensus with each other. This is done with etcd, which is a distributed key-value store. Maintaining such a store is easy for 1 machine but gets harder the more machines you add.
The advantage of adding masters is being able to survive more master failures, at the cost of having to make all masters fatter (more CPU/RAM, ...) so that they perform well enough.
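For context, kops pins each master to its own fixed-size instance group, roughly like the sketch below (one per AZ; the cluster name and machine type are placeholders and field names can differ slightly between kops versions). This is consistent with what you saw in (B): instances added by editing the ASG directly never become usable control-plane members.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.example.com   # placeholder cluster name
  name: master-us-east-1a
spec:
  role: Master
  machineType: m5.large                    # illustrative instance type
  minSize: 1                               # masters stay a fixed, odd-sized set: one per AZ here
  maxSize: 1
  subnets:
  - us-east-1a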
I am trying to make AWS ElastiCache Redis 3.2 the master and the Redis instances on my EC2 machines slaves of this ElastiCache. I get this error:
Connecting to MASTER masterredis.XXXXXXXXXXXXXXXXXXX.amazonaws.com:6379
MASTER <-> SLAVE sync started
Non blocking connect for SYNC fired the event.
Master replied to PING, replication can continue...
Partial resynchronization not possible (no cached master)
Master does not support PSYNC or is in error state (reply: -ERR unknown command 'PSYNC')
Retrying with SYNC...
MASTER aborted replication with an error: ERR unknown command 'SYNC'
....
ElastiCache is a Redis-as-a-Service from AWS. As such, its operator has the liberty to disable certain commands/features - the ability to replicate to an external instance is one of these disabled features and that's the reason for the PSYNC/SYNC errors that you're getting.
Amazon ElastiCache, as noted by Itamar, is a managed service. When using the ElastiCache Redis engine (it also has a Memcached option), the interface is 100% open-source Redis, but Amazon has made some changes to the underlying code to optimize for the cloud.
Replication in ElastiCache uses primary nodes and replica nodes. These are similar, but not identical, to the masters and slaves used by Redis Sentinel. Since ElastiCache is always running on AWS EC2, it can use direct memory transfer to make replication faster and failover smoother than the OSS distribution that needs to support a huge number of possible infrastructure stacks. But you cannot mix Sentinel nodes with ElastiCache nodes in the same cluster.
(BTW, I am part of the Managed Databases team at AWS, so feel free to reach out if you'd like more detail, either here on Stack Overflow or by email: briskman at amazon dot com.)
For documentation, see http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/Replication.html
We don't explicitly state that SYNC/PSYNC are not supported ... but you can find the ElastiCache API reference online at http://docs.aws.amazon.com/AmazonElastiCache/latest/APIReference/Welcome.html . It has details on API calls such as CreateReplicationGroup.