Good morning. I am doing some tests with the new Google Kubernetes Engine Autopilot mode. I know that it automates a lot of the management of machine resources, but I am not sure exactly what it automates. Does it only take care of provisioning the hardware resources that I set inside my PodSpec? Or does it also take care of scaling the number of containers up and down based on traffic intensity?
I am coming from Cloud Run, so my main question is: with GKE Autopilot, do I need to do something for it to create new container instances when traffic increases, or is it all managed automatically? Do I need to set up HPA, VPA and other autoscaler technologies when using Autopilot?
For GKE Autopilot you still need to create the HPA and VPA configuration yourself; Autopilot handles node scaling by default.
You can read more at: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#comparison
Scaling
Pre-configured: Autopilot handles all the scaling and configuring of your nodes.
Default: You configure Horizontal Pod autoscaling (HPA). You configure Vertical Pod autoscaling (VPA).
"Do I need to set HPA, VPA and other autoscaler technologies when using autopilot?"
A node autoscaler is not required: node scaling is managed by GKE by default, and nodes are scaled as your workload requires. Pod-level autoscaling (HPA/VPA) is still something you configure yourself.
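For example, Pod-level scaling on Autopilot still uses a standard HorizontalPodAutoscaler object. A minimal sketch, assuming a Deployment you want to scale (the name "my-app" and the 70% CPU target are placeholder values, not anything from the original question):

```yaml
# Sketch of an HPA for an Autopilot cluster; "my-app" and the
# 70% CPU target are illustrative placeholders.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

With this in place, Autopilot provisions the nodes needed to fit the replicas the HPA asks for; you only declare the Pod-level policy.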
The Cloud Run on GKE documentation says:
Note that although these instructions don't enable cluster autoscaling to resize clusters for demand, Cloud Run for Anthos on Google Cloud automatically scales instances within the cluster.
Does that mean that if I create a Cloud Run cluster using the default configuration, my service will never scale past the capacity of the three nodes of the cluster?
Is it possible to enable Kubernetes autoscaling for Cloud Run clusters, or will that conflict with the internal Cloud Run autoscaler? I'd like to be able to scale up my Cloud Run cluster to many nodes, but take advantage of the autoscaler to avoid wasting resources.
You can define an autoscaling NodePool.
The note just means that the Cloud Run (or Knative) autoscaler manages only Pod autoscaling and doesn't manage node autoscaling.
Node autoscaling is managed by Kubernetes and is based on CPU usage.
Remember, you can't scale to 0 nodes, but you can scale to 0 Pods. In addition, node scaling is very slow compared to Pod scaling.
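Since Cloud Run for Anthos is built on Knative, the Pod-level bounds can be set with Knative's autoscaling annotations. A sketch, assuming a Knative Service (the service name, image, and scale bounds are illustrative):

```yaml
# Knative Service sketch; name, image, and bounds are placeholders.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    metadata:
      annotations:
        # minScale "0" allows scale-to-zero for Pods (not nodes)
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "20"
    spec:
      containers:
      - image: gcr.io/my-project/my-image
```

The Knative autoscaler keeps the Pod count within these bounds, while the cluster autoscaler on the NodePool adds or removes nodes to fit them.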
I created an EKS cluster, but while deploying pods I found out that the native AWS CNI only supports a set number of pods because of the IP restrictions on its instances. I don't want to use any third-party plugins because AWS doesn't support them and we wouldn't be able to get their tech support. What happens right now is that as soon as the IP limit is hit for an instance, the scheduler is not able to schedule the pods and they go into the Pending state.
I see there is a cluster autoscaler which can do horizontal scaling.
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
Using a larger instance type with more available IPs is an option, but that is not scalable since we will run out of IPs eventually.
Is it possible to set a per-node pod limit in the cluster autoscaler so that a new instance is spawned when that limit is reached? Since each pod uses one secondary IP of the node, that would solve our issue of having to worry about scaling. Is this a viable option? Also, if anybody has faced this, I would like to hear how they overcame this limitation.
EKS node groups use an Auto Scaling group for node scaling.
You can follow this workshop as a dedicated example.
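As a sketch, the cluster autoscaler is typically deployed in the cluster and discovers the node group's Auto Scaling group via tags. A fragment of such a Deployment spec (the image tag, the `<cluster-name>` placeholder, and the tag keys are illustrative; check the autoscaler's AWS documentation for the exact flags for your version):

```yaml
# Fragment of a cluster-autoscaler Deployment for EKS (illustrative values).
containers:
- name: cluster-autoscaler
  image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.2
  command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<cluster-name>
```

Note that EKS already derives each node's max-pods limit from the instance type's ENI/IP limit, so once that limit is hit, new pods become unschedulable and the autoscaler reacts by adding a node.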
I'm planning on shifting from EC2 to Fargate because it said it automatically "removes the need to choose server types, decide when to scale your clusters, or optimize cluster packing". I think I understand how a cluster scales through auto-scaling rules, but that isn't exactly automatic. So am I missing something regarding how scaling in AWS Fargate works?
As far as I understand so far, I make a basic task. I assign the task some memory and CPU, and the only way it scales is through auto scaling which will basically recreate these tasks when the need arises (either through alarms or specific rules). TIA!
Fargate abstracts away the underlying cluster nodes. In ECS on EC2 instances you have to manage auto scaling for both services and the cluster.
In Fargate, however, you scale services only and don't have to worry about the underlying cluster. It is very much like the service auto scaling you have been doing in ECS.
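As a sketch, service-level scaling for a Fargate service is configured through Application Auto Scaling. A CloudFormation fragment, assuming a target-tracking policy on CPU (the cluster/service names, capacity bounds, role, and the 75% target are all illustrative placeholders):

```yaml
# Application Auto Scaling for an ECS/Fargate service (illustrative values).
ScalableTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    ServiceNamespace: ecs
    ScalableDimension: ecs:service:DesiredCount
    ResourceId: service/my-cluster/my-service  # placeholder cluster/service
    MinCapacity: 1
    MaxCapacity: 10
    RoleARN: !GetAtt AutoScalingRole.Arn       # placeholder IAM role
ScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: cpu-target-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ScalableTarget
    TargetTrackingScalingPolicyConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageCPUUtilization
      TargetValue: 75.0
```

You still declare the policy, but there is no cluster-level counterpart to configure: Fargate places the tasks for you.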
Since you can enable autoscaling of containers through DC/OS, when running this on an EC2 cluster, is it still necessary, or redundant, to run your cluster in an auto-scaling group?
There are two (orthogonal) concepts at play here, and unfortunately the term 'auto-scale' is ambiguous:
One is the ability of certain IaaS platforms (incl. AWS) to dynamically add VMs to a cluster.
The other is the capability of a container orchestrator to scale the number of copies of a service (in the case of Marathon this is called instances; in the context of Kubernetes, replicas) as long as there are sufficient resources (CPU, RAM, etc.) available in the cluster.
In the simplest case you'd auto-scale the services up to the point where overall cluster utilization is high (>60%? >70%? >80%?) and then use the IaaS-level auto-scaling functionality to add further nodes. It turns out scaling back down is the trickier part.
So, complementary rather than redundant.
Is it possible in Mesos to have dynamic cluster size - with total cluster CPU and RAM quotas set?
Ideally, Mesos would know my AWS credentials and spawn new EC2 instances only when there is a new job that cannot fit into the existing resources (on AWS or another cloud provider). Similarly, when the job is finished it could kill the EC2 instance.
It can be a Mesos plugin/framework or some external tool; any help appreciated.
Thanks
What we do is use the Mesos monitoring tools and HTTP endpoints (http://mesos.apache.org/documentation/latest/endpoints/) to monitor the cluster.
We have our own framework that gets all the relevant information from the master and agent nodes, and our algorithm uses that information to scale the cluster.
For example, if the cluster CPU utilization is > 0.90 we bring up a new instance and register that agent with the master.
If I understand you correctly, you are looking for a solution to autoscale your Mesos cluster?
What some people do on AWS, for example, is create an auto scaling group that allows them to scale the number of agent/slave nodes up and down depending on their needs.
Note that the triggers for when to scale up/down are usually application-dependent (e.g., it could be OK for one app to be at 100% utilization, while for others 80% should already trigger a scale-up action).
For an example of using AWS auto scaling groups you could have a look at the Mesosphere DCOS Community edition (note, as mentioned above, that you will still have to write the trigger code for scaling your scaling group).
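A minimal CloudFormation sketch of such a group for agent nodes (the resource names, sizes, and subnet are illustrative placeholders, and the scaling trigger code is still up to you):

```yaml
# Auto Scaling group for Mesos agent nodes (illustrative values).
MesosAgentGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    MinSize: "1"
    MaxSize: "10"
    LaunchConfigurationName: !Ref MesosAgentLaunchConfig  # placeholder
    VPCZoneIdentifier:
      - subnet-0123abcd  # placeholder subnet
```

Your trigger code then only has to adjust the group's desired capacity; the new instances bootstrap as Mesos agents and register with the master.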
AFAIK, Mesos cannot autoscale itself; something needs to start Mesos agents for the cluster. One option is to build a script, managed by Marathon, that starts/stops agents after comparing the pending tasks in your framework with the resources available in the Mesos cluster.