I have both Node Auto Provisioning and autoscaling enabled on a GKE cluster. For autoscaling, the minimum number of nodes is 1 and the maximum is 2. A few queries based on this setup.
I set the number of nodes to 0 using the gcloud command:
gcloud container clusters resize cluster-gke-test --node-pool pool-1 --num-nodes 0 --zone us-central1-c
Now I can see the message that Pods are unschedulable:
Your cluster has one or more unschedulable pods.
The following are my queries:
Since autoscaling is enabled, nodes should have been automatically spawned to run these Pods, but I don't see this happening. Is this not the expected behavior?
Does autoscaling not work when we reduce the number of nodes manually?
Does autoscaling work based on load only, i.e., does it launch new nodes only when the existing nodes cannot handle the requests? And should the minimum number of nodes always be greater than zero for node autoscaling to work?
It's a documented limitation. If your node pool is set to 0 nodes, there is no autoscaling from 0.
Yes, it works as long as you don't manually scale to 0.
That's also documented. The node pool scales according to the requests: if a Pod is unschedulable because of a lack of resources, and max-nodes hasn't been reached, a new node is provisioned and the Pod is scheduled.
You can set min-nodes to 0, but you must have at least 1 node active in the cluster, on another node pool:
If you specify a minimum of zero nodes, an idle node pool can scale down completely. However, at least one node must always be available in the cluster to run system Pods.
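If you ended up here after a manual resize, a minimal sketch of the workaround is simply to resize the pool back to at least one node so the autoscaler has something to work with (cluster, pool, and zone names are taken from the question above):

gcloud container clusters resize cluster-gke-test \
  --node-pool pool-1 --num-nodes 1 \
  --zone us-central1-c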
I am trying to launch a Kubernetes cluster on EKS which would have multiple pods in it. Once a worker node is running its maximum number of pods, a new node launches and the extra pod is scheduled on the new node. Launching a new node takes time and creates downtime, which I want to reduce. A Pod disruption budget is one option, but I am not sure how to use it with scaling up of nodes.
A simpler way to approach this would be to have your scaling policies pre-defined to scale up at reasonably lower limits. This way, if your server reaches, say, 60% of capacity and triggers a scale-up, you would have enough grace time to avoid downtime (since the first node can handle requests while the second one bootstraps) and allow the new server to come up.
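As a rough sketch with the AWS CLI, assuming the worker nodes live in an Auto Scaling group (the group name eks-workers and the 60% target are illustrative), a target-tracking policy on average CPU would add instances before the existing ones are saturated:

# hypothetical ASG name; scale out when average CPU exceeds 60%
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name eks-workers \
  --policy-name scale-up-at-60 \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{"PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"}, "TargetValue": 60.0}'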
I am testing the Google Kubernetes autoscaling.
I have created a cluster with 1 master node.
Then I used
gcloud container node-pools create node-pool-test \
--machine-type g1-small --cluster test-master \
--num-nodes 1 --min-nodes 0 --max-nodes 3 \
--enable-autoscaling --zone us-central1-a
to create a node pool with autoscaling enabled and the minimum number of nodes set to 0.
Now, the problem is that it has been 30 minutes since the node pool was created (and I haven't run any pods), but the node pool is not scaling down to 0. It was supposed to scale down within about 10 minutes.
Some system pods are running on this node pool, but the master node is also running them.
What am I missing?
Have a look at the documentation:
If you specify a minimum of zero nodes, an idle node pool can scale down completely. However, at least one node must always be available in the cluster to run system Pods.
and also check the limitations here and here:
Occasionally, cluster autoscaler cannot scale down completely and an extra node exists after scaling down. This can occur when required system Pods are scheduled onto different nodes, because there is no trigger for any of those Pods to be moved to a different node
and a possible workaround.
More information can be found in the Autoscaler FAQ.
Also, as a solution, you could create one node pool with a small machine type for the system pods, and an additional node pool with a bigger machine type where you run your workload. This way the second node pool can scale down to 0 while you still have room to run the system pods. Here you can find an example.
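A rough sketch of that layout with gcloud, reusing the test-master cluster from the question (the pool names and the larger machine type are placeholders):

# small pool that keeps one node around for system pods
gcloud container node-pools create system-pool \
  --cluster test-master --machine-type g1-small \
  --num-nodes 1 --zone us-central1-a

# workload pool that may scale all the way down to zero when idle
gcloud container node-pools create workload-pool \
  --cluster test-master --machine-type n1-standard-4 \
  --enable-autoscaling --num-nodes 1 --min-nodes 0 --max-nodes 3 \
  --zone us-central1-a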
I have a node pool with a minimum pool size of 0 and a maximum pool size of 3. Normally, nothing is happening on this node pool, so GKE correctly scales it down to zero. However, if I try to submit a job to this pool via kubectl, the pod fails as Unschedulable.
I can run gcloud with --enable-autoscaling --min-nodes 1 --max-nodes 3, wait 10 seconds, then deploy, wait for completion, and then change min-nodes back to 0, but this doesn't seem ideal.
Is there a better way to get the pool to start a node when a pod is pending?
Even with something like taints or nodeAffinity, I don't think you can tell Kubernetes to spin up nodes in order to schedule workloads. The scheduler requires a node to be available already.
(Out of curiosity, how were you scheduling jobs to a specific nodepool via kubectl?)
As the autoscaler scales up and down based on Pod resource requests, you need at least one resource request value that the autoscaler can use as the basis for deciding whether the pool needs an additional node or not.
Here is more information about how the cluster autoscaler works [1].
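As a sketch of that point about resource requests, a Job could set explicit requests and target the autoscaled pool via GKE's built-in cloud.google.com/gke-nodepool label; the Job name, image, and request sizes below are placeholders:

kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: burst-job                                      # placeholder name
spec:
  template:
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: node-pool-test  # target the autoscaled pool
      containers:
      - name: worker
        image: busybox                                 # placeholder image
        command: ["sh", "-c", "echo done"]
        resources:
          requests:                                    # what the autoscaler keys on
            cpu: 500m
            memory: 256Mi
      restartPolicy: Never
EOF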
I am using Kubernetes v1.11.1, configured with kubeadm, consisting of five nodes, and hundreds of pods are running. How can I enable or configure cluster autoscaling based on the total memory utilization of the cluster?
A K8s cluster can be scaled with the help of the Cluster Autoscaler (CA); see the cluster autoscaler GitHub page, where you can find info on the AWS CA.
It does not scale the cluster based on “total memory utilization” but based on “pending pods” in the cluster, i.e., pods that cannot be scheduled because there are not enough available cluster resources to meet their CPU and memory requests.
Basically, the Cluster Autoscaler (CA) checks for pending (unschedulable) pods every 10 seconds, and if it finds any, it requests the AWS Auto Scaling group (ASG) API to increase the number of instances in the ASG. When a node is added to the ASG, it joins the cluster and becomes ready to serve pods. After that, the K8s scheduler allocates the pending pods to the new node.
Scale-down works by the CA checking every 10 seconds which nodes are unneeded; a node is considered for removal if the sum of CPU and memory requests of all its pods is smaller than 50% of the node's capacity, its pods can be moved to other nodes, and it has no scale-down-disabled annotation.
If a K8s cluster on AWS is administered with kubeadm, all of the above holds true. So, in a nutshell (intricate details omitted; refer to the CA docs):
Create an Auto Scaling group (ASG); see the AWS ASG doc.
Add tags to the ASG, such as k8s.io/cluster-autoscaler/enabled (mandatory) and k8s.io/cluster-autoscaler/<cluster-name> (optional).
Launch the CA in the cluster following the official doc.
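As an illustration only (these are just the relevant flags; they go into the CA Deployment manifest from the official doc), the CA can be pointed at the ASG either explicitly or via the tags above. The group name my-k8s-workers and cluster name my-cluster are placeholders:

# explicit node group: between 1 and 10 nodes in the named ASG
cluster-autoscaler --cloud-provider=aws --nodes=1:10:my-k8s-workers

# or auto-discovery based on the ASG tags added above
cluster-autoscaler --cloud-provider=aws \
  --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster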
I have been trying to auto-scale a 3-node Cassandra cluster with Replication Factor 3 and Consistency Level 1 on Amazon EC2 instances. Despite the load balancer, one of the autoscaled nodes has zero CPU utilization while the other autoscaled node has considerable traffic on it.
I have experimented more than 4 times with auto-scaling a 3-node cluster with RF 3 and CL 1, and the CPU utilization on one of the autoscaled nodes is still zero. Overall CPU utilization drops, but one of the autoscaled nodes stays consistently idle from the point of auto-scaling.
Note that the two nodes launched at the point of autoscaling are started by the same launch configuration, and they have the same configuration in every respect. There is an alarm that triggers the scaling, and the scaling policy is set according to that alarm.
Can there be a bash script that can be run in the user data?
For example, altering the keyspaces?
Can someone let me know what could be the reason behind this behavior?
AWS auto scaling and load balancing are not a good fit for Cassandra. Cassandra has its own built-in clustering, with seed nodes to discover the other members of the cluster, so there is no need for an ELB. And auto scaling can screw you up because the data has to be rebalanced between the nodes.
https://d0.awsstatic.com/whitepapers/Cassandra_on_AWS.pdf
Yes, you don't need an ELB for Cassandra.
So you created a single-node Cassandra cluster and created some keyspace. Then you scaled Cassandra to three nodes. You found one new node was idle when accessing the existing keyspace. Is this understanding correct? Did you alter the existing keyspace's replication factor to 3? If not, the existing keyspace's data will still have only 1 replica.
When adding the new nodes, Cassandra will automatically balance some token ranges onto the new nodes. This is probably why you are seeing load on one of the new nodes: it happens to get some token ranges that hold keyspace data.
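If the existing keyspace does need its replication factor raised, a minimal sketch would look like the following (the keyspace name my_keyspace is a placeholder, SimpleStrategy is used here for brevity where a production EC2 deployment would more likely use NetworkTopologyStrategy, and a repair is run afterwards so the new replicas actually receive the existing data):

# raise the replication factor of the hypothetical keyspace to 3
cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};"

# stream the existing data to the new replicas
nodetool repair my_keyspace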