HPA on EKS-Fargate - amazon-web-services

this is not a question about how to implement HPA on a EKS cluster running Fargate pods... It´s about if it is necessary to implement HPA along with Fargate, because as far as I know, Fargate is a "serverless" solution from AWS: "Fargate allocates the right amount of compute, eliminating the need to choose instances and scale cluster capacity. You only pay for the resources required to run your containers, so there is no over-provisioning and paying for additional servers."
So I´m not sure in which cases I would like to implement HPA on an EKS cluster running Fargate but the option is there. So I would like to know if someone could give more information.
Thank you in advance

EKS/Fargate allows you to NOT run "Cluster Autoscaler" (CA) because there are not nodes you need to run your pods. This is what it is referred to with "no over-provisioning and paying for additional servers."
HOWEVER, you could/would use HPA because Fargate does not provide a resource scaling mechanism for your pods. You can configure the size of your Faragte pods via K8s requests but at that point that is a regular pod with finite resources. You can use HPA to determine the number of pods (on Fargate) you need to run at any point in time for your deployment.

Related

AWS fargate cluster - limitations and best practices

I have a fargate cluster in dev environment which contains an ecs service supporting a single client.
We need to on-board 50 more clients. So wanted to know what are some best practices around fargate clusters. I looked around and did not find any suitable content(including aws fargate FAQ). Can anyone help me with the below:
Should I create one fargate cluster per client or within same fargate cluster create one ecs service per client ? Which one is better and why ?
Is there any limitation on how many fargate clusters can be created in aws ?
Let's say it depends but none of the options you can pick will result in you doing anything wrong. A cluster in Fargate doesn't have a very specific meaning because there are no container instances you would provision and attach to said cluster(s) to provide capacity. In the context of Fargate a cluster really just become some sort of "folder" or namespace. The only real advantage of having multiple clusters is because you can scope your users at the cluster level and delegate the ability to deploy in said clusters. If you don't have a specific need like that, for simplicity you are probably good with just one cluster and 50 separate ECS services in it.

How can I understand `Nodes` in EKS Fargate?

I deployed a EKS cluster and a fargate profile. Then I deployed a few application to this cluster. I can see these fargate instances are launched.
when I click each of this instance, it shows me some information like os, image etc. But it doesn't tell me the CPU and memory. When I look at fargate pricing: https://aws.amazon.com/fargate/pricing/. It is calculated based on CPU and Memory.
I have used ECS and it is very clear that I need to provision CPU/Memory in service/task level. But I can't find anything in EKS.
How do I know how much resources they are consuming?
With Fargate you don`t have provision, configure or scale virtual machines to run your containers so that they become fundamental compute primitive.
This solution model is called serverless where you are being charged for only the compute resources and storage that are need to execute some piece of your code. It does not mean that there are not server involved in this, it just you don`t need to care about those.
To monitor there those you can use CloudWatch. Below documents describe how this can be achieved:
How do I troubleshoot high CPU utilization on an Amazon ECS task on
Fargate?
How can I monitor high memory utilization for Amazon ECS tasks on
Fargate?
It is worth to mention that Fargate is just a launch type for ECS (Another one is EC2). Please have a look at the diagram in this document for clear image of how those are connected. The CloudWatch metrics are collected automatically for Fargate. If you are using the AKS with Fargate you can monitor them with usage of metrics-addon or prometheus inside your kubernetes cluster.
Here's an example of monitoring Fargate with Prometheus. Notice that it scrapes the metrics from CloudWatch.

Is AWS Fargate better than Amazon EKS managed node groups?

Amazon EKS managed node groups automate the provisioning and lifecycle management of nodes (Amazon EC2 instances) for Amazon EKS Kubernetes clusters.
AWS Fargate is a technology that provides on-demand, right-sized compute capacity for containers. With AWS Fargate, you no longer have to provision, configure, or scale groups of virtual machines to run containers. This removes the need to choose server types, decide when to scale your node groups, or optimize cluster packing.
So, Is AWS Fargate better than Amazon EKS managed node groups? When should I choose managed node groups?
We chose to go with AWS Managed Node groups for the following reasons:
Daemonsets are not supported in EKS Fargate, so observability tools like Splunk and Datadog have to run in sidecar containers in each pod instead of a daemonset per node
In EKS Fargate each pod is run in its own VM and container images are not cached on nodes, making the startup times for pods 1-2 minutes long
All replies are on point. There isn't really "better" here. It's a trade-off. I am part of the container team at AWS and we recently wrote about the potential advantages of using Fargate over EC2. Faster pod start time, images caching, large pods configurations, special hw requirements e.g. GPUs) are all good reasons for needing to use EC2. We are working hard to make Fargate a better place to be though by filling some of the gaps so you could appreciate only the advantages.
There is no better than other. Your requirements (and skills) makes a product better than another!
The real difference in Fargate is that it's serverless, so you don't need for example to care about the EC2 instances right-sizing, you won't pay the idle time.
To go straight to the point: unless you are a K8S expert I would suggest Fargate.

Is it possible to use Kubernetes Cluster Autoscaler to scale nodes if number of nodes hit a threshold?

I created an EKS cluster but while deploying pods, I found out that the native AWS CNI only supports a set number of pods because of the IP restrictions on its instances. I don't want to use any third-party plugins because AWS doesn't support them and we won't be able to get their tech support. What happens right now is that as soon as the IP limit is hit for that instance, the scheduler is not able to schedule the pods and the pods go into pending state.
I see there is a cluster autoscaler which can do horizontal scaling.
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
Using a larger instance type with more available IPs is an option but that is not scalable since we will run out of IPs eventually.
Is it possible to set a pod limit for each node in cluster-autoscaler and if that limit is reached, a new instance is spawned. Since each pod uses one secondary IP of the node so that would solve our issue of not having to worry about scaling. Is this a viable option? and also if anybody has faced this and would like to share how they overcame this limitation.
EKS's node group is using auto scaling group for nodes scaling.
You can follow this workshop as a dedicated example.

If you are running your applications on DC/OS in AWS, is creating an AutoScaling group redundant?

Since you can enable autoscaling of containers through DC/OS, when running this on an EC2 cluster, is it still necessary to, or redundant to run your cluster in an AutoScaling cluster?
There are two (orthogonal) concepts here at play and unfortunately the term 'auto-scale' is ambiguous here:
Certain IaaS platforms (incl. AWS) support dynamically adding VMs to a cluster.
The other is the capability of a container orchestrator to scale the number of copies of a service—in case of Marathon this is called instances or replicas in the context of Kubernetes—as long as there are sufficient resources (CPU, RAM, etc.) available in the cluster,
In the simplest case you'd auto-scale the services up to the point where the overall cluster utilization is high (>60%? >70%? >80%?) and the use the IaaS-level auto-scaling functionality to add further nodes. Turns out scaling back is the trickier thing.
So, complementary rather than redundant.