Testing if Cluster Autoscaler and overprovisioner work as expected (k8s on AWS)

This must sound like a real noob question. I have a cluster-autoscaler and cluster overprovisioner set up in my k8s cluster (via Helm), and I want to see them actually kick in. I haven't been able to find any leads on how to accomplish this.
Does anyone have any ideas?

You can create a Deployment that runs a container with a CPU-intensive task. Set it initially to a small number of replicas (perhaps < 10) and then increase the replica count with:
kubectl scale deployment your-deployment --replicas=11
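If you want something concrete to apply, a minimal sketch of such a Deployment might look like this (the name cpu-burner and the request sizes are placeholders; note it is the CPU requests, not the actual load, that make extra replicas unschedulable and so trigger the autoscaler):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-burner                 # placeholder name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: cpu-burner
  template:
    metadata:
      labels:
        app: cpu-burner
    spec:
      containers:
      - name: burner
        image: busybox
        # spin forever so each replica also produces real CPU load
        command: ["sh", "-c", "while true; do :; done"]
        resources:
          requests:
            cpu: "500m"            # sized so a handful of replicas fill a node
            memory: 64Mi

kubectl scale deployment cpu-burner --replicas=30

Once the pending replicas exceed the free capacity of the existing nodes, the Cluster Autoscaler should add nodes, and the overprovisioner's placeholder pods (which typically run at negative priority) should be preempted first.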
Edit:
How can you tell the Cluster Autoscaler has kicked in?
There are three ways to determine what the CA is doing: watching the CA pod's logs, checking the content of the kube-system/cluster-autoscaler-status ConfigMap, or looking at the Events it emits.
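For example (the log label is an assumption, as it depends on how your Helm chart names things):

# 1. follow the CA pod's logs
kubectl -n kube-system logs -f -l app.kubernetes.io/name=cluster-autoscaler
# 2. inspect the status ConfigMap mentioned above
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml
# 3. look for scale-up/scale-down events emitted by the CA
kubectl get events --all-namespaces | grep -i autoscaler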

Related

ECS/EC2 ASG not scaling out for new service tasks

I'm relatively new to ECS; for the most part it's been fine, but lately I'm facing an issue that I can't seem to find an intuitive solution for.
I'm running an ECS cluster with an EC2 capacity provider. The EC2 Auto Scaling group has min_capacity: 1 & max_capacity: 5.
Each ECS service task has auto scaling enabled based upon CPU/memory utilisation.
The issue I'm seeing is that when new tasks are being deployed as part of our CI/CD, ECS returns "unable to place a task because no container instance met all of its requirements".
I'm wondering how I get ECS to trigger a scale-out event for the ASG when this happens? Do I need a particular scaling policy for the ASG? (I feel it's related to this.)
I attempted to set up an EventBridge/CloudWatch alarm to trigger a scale-out event whenever this happens, but that seems hacky. It worked, not ideally, but it worked. Surely there is a nicer/simpler way of doing this?
Any advice or points from experience would be greatly appreciated!
(PS - let me know if you need any more information/screenshots/code examples etc.)
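For what it's worth, the usual non-hacky direction for this is a capacity provider with managed scaling enabled: ECS then maintains a target-tracking policy on the ASG, driven by the CapacityProviderReservation metric, and scales it out whenever tasks cannot be placed. A rough AWS CLI sketch, where every name and the ASG ARN are placeholders:

# create a capacity provider that manages the ASG's desired capacity
aws ecs create-capacity-provider \
  --name my-capacity-provider \
  --auto-scaling-group-provider \
    "autoScalingGroupArn=arn:aws:autoscaling:eu-west-1:123456789012:autoScalingGroup:uuid:autoScalingGroupName/my-asg,managedScaling={status=ENABLED,targetCapacity=100},managedTerminationProtection=DISABLED"

# make it the cluster's default strategy so new tasks drive the scaling
aws ecs put-cluster-capacity-providers \
  --cluster my-cluster \
  --capacity-providers my-capacity-provider \
  --default-capacity-provider-strategy capacityProvider=my-capacity-provider,weight=1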

HPA on EKS-Fargate

This is not a question about how to implement HPA on an EKS cluster running Fargate pods... It's about whether it is necessary to implement HPA along with Fargate, because as far as I know, Fargate is a "serverless" solution from AWS: "Fargate allocates the right amount of compute, eliminating the need to choose instances and scale cluster capacity. You only pay for the resources required to run your containers, so there is no over-provisioning and paying for additional servers."
So I'm not sure in which cases I would want to implement HPA on an EKS cluster running Fargate, but the option is there. I would like to know if someone could give more information.
Thank you in advance
EKS/Fargate allows you to NOT run the Cluster Autoscaler (CA) because there are no nodes you need to manage to run your pods. This is what is meant by "no over-provisioning and paying for additional servers."
HOWEVER, you could/would still use HPA, because Fargate does not provide a resource scaling mechanism for your pods. You can configure the size of your Fargate pods via K8s requests, but at that point each is a regular pod with finite resources. You can use HPA to determine the number of pods (on Fargate) you need to run at any point in time for your deployment.
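As an illustration, HPA on Fargate looks the same as anywhere else; a minimal sketch (my-app is a placeholder, and metrics-server must be installed for the CPU metric to exist):

apiVersion: autoscaling/v2         # use autoscaling/v2beta2 on older clusters
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # placeholder; each replica becomes its own Fargate pod
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70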

Scale down Kubernetes pods to 0 replicas until there is traffic on the site

I have set up a standard Kubernetes cluster which includes ReplicaSets, Deployments, Pods, etc. I am looking to save costs around this, as it is a pre-production environment that is constantly running.
I was wondering if there is a feature in Kubernetes to say that if a Pod has not been used in the last 60 minutes, it shuts down. If someone then requests to use that Pod, it spins back up. I understand the request might take longer as the Pod will need to spin up, but the cost saved for the pre-production environment would be huge.
I have been trying to look around, but the only resource I could find for this was https://codeberg.org/hjacobs/kube-downscaler. Looking at it, it only allows you to specify times for shutdown, not traffic.
If someone could point me in the right direction that would be great.
Since Kubernetes 1.16 there is a feature gate called HPAScaleToZero which enables setting minReplicas to 0 for HorizontalPodAutoscaler resources when using custom or external metrics. It has to be explicitly enabled, as it's disabled by default.
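A minimal sketch of how that combines, assuming you control the API server flags (which rules out most managed control planes) and have an external metrics adapter exposing a traffic metric:

# requires --feature-gates=HPAScaleToZero=true on the kube-apiserver
apiVersion: autoscaling/v2         # autoscaling/v2beta2 on 1.16-era clusters
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                    # placeholder
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                      # placeholder
  minReplicas: 0                   # only accepted with the gate enabled
  maxReplicas: 5
  metrics:
  - type: External
    external:
      metric:
        name: requests_per_second  # assumed metric name from your adapter
      target:
        type: AverageValue
        averageValue: "10"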
Additionally, you can use KEDA for event-driven autoscaling. It enables you to scale a Deployment down to 0. It uses the ScaledObject custom resource definition, which defines how KEDA should scale your application and what the triggers are.
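A minimal ScaledObject sketch, assuming a Prometheus instance that already scrapes your ingress traffic (the deployment name, server address, and query are placeholders):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-scaler                 # placeholder
spec:
  scaleTargetRef:
    name: web                      # placeholder Deployment
  minReplicaCount: 0               # KEDA itself handles the 0 <-> 1 transition
  maxReplicaCount: 5
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090
      query: sum(rate(nginx_ingress_controller_requests[2m]))
      threshold: "1"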
Another approach might be to use a custom solution called Zero Pod Autoscaler, which "can scale the Deployment all the way down to zero replicas when it is not in use. It can work alongside an HPA: when scaled to zero, the HPA ignores the Deployment; once scaled back to one, the HPA may scale up further."

Is it possible to use Kubernetes Cluster Autoscaler to scale nodes if the number of pods hits a threshold?

I created an EKS cluster, but while deploying pods I found out that the native AWS CNI only supports a set number of pods because of the IP restrictions on its instances. I don't want to use any third-party plugins because AWS doesn't support them and we wouldn't be able to get tech support. What happens right now is that as soon as the IP limit is hit for an instance, the scheduler is not able to schedule the pods and they go into the Pending state.
I see there is a cluster autoscaler which can do horizontal scaling.
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
Using a larger instance type with more available IPs is an option but that is not scalable since we will run out of IPs eventually.
Is it possible to set a pod limit for each node in cluster-autoscaler so that when that limit is reached, a new instance is spawned? Since each pod uses one secondary IP of the node, that would solve our issue and we would not have to worry about scaling. Is this a viable option? Also, if anybody has faced this, I would like to hear how you overcame the limitation.
EKS node groups use an Auto Scaling group for node scaling.
You can follow this workshop as a dedicated example.
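For reference, the per-node pod cap that produces the Pending pods in the first place comes straight from the instance's ENI limits, and those unschedulable pods are exactly what makes the Cluster Autoscaler add a node:

# EKS derives each node's pod cap from its ENI limits:
#   maxPods = ENIs x (IPv4 addresses per ENI - 1) + 2
# e.g. m5.large: 3 x (10 - 1) + 2 = 29
# check what a node actually advertises (node name is a placeholder):
kubectl get node my-node -o jsonpath='{.status.allocatable.pods}'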

Volumes between deployment replicas

I have this issue: two or more nodes in the cluster and 5 deployment replicas, and I have to use one volume for all of them. For example, I add a file to the first pod and can read it from another, and if my first pod is deleted, I can still read the data from the second pod.
I tried Kubernetes volume types like hostPath, but they didn't work.
I tried NFS, but it didn't work either. There are many instructions out there, but each of them is incomplete or incorrect! Can you please write a full set of instructions, like for a junior, OK, like for idiots? I have never used NFS or Gluster, and the Kubernetes docs say too little about how to install them and connect them to Kubernetes.
Now I am trying AWS EFS with Kubernetes, and it's the same story: a lot of general information and scattered instructions, but nothing consistent. Why is it so hard to explain how it works? I am on fire now. The Kubernetes documentation on basic elements like Deployments and Services is OK, but on integrations and non-basic volumes it is awful!
Maybe someone can help me with it?
AWS part: https://aws.amazon.com/getting-started/tutorials/create-network-file-system/
KUBERNETES part: https://github.com/kubernetes-incubator/external-storage/blob/master/aws/efs/deploy/manifest.yaml
Thanks for the help.
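If you end up on the newer AWS EFS CSI driver rather than the external-storage provisioner linked above, a minimal sketch of a shared ReadWriteMany volume looks roughly like this (the filesystem ID fs-12345678 is a placeholder, and the driver must already be installed in the cluster):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi                   # required by the API, ignored by EFS
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany                # every replica on every node can mount it
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678      # placeholder: your EFS filesystem ID
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi

Mount efs-claim as a volume in the Deployment template and all 5 replicas, on any node, will see the same files.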