AWS EKS - NodeGroup Update Instance Types - amazon-web-services

I'm currently running a Kubernetes application on AWS EKS. I also created a node group; initially I provisioned a low-capacity instance type, so the node group could only handle 4 pods. When I tried to roll out an update to my deployments, I got an "insufficient pods" error, mainly due to the under-provisioned instance type. My question: is it possible to update the instance type of a live node group?
I worked around the problem by creating an additional node group with a scaled-up instance type. I'm just wondering if it's possible to edit a live node group's instance type to scale up.

EKS node group instance types cannot be changed after creation. You'll have to create a new node group every time you'd like a new instance type.

The instance type can be changed by applying a new launch template version.
However, since node-related changes are immutable in nature, be aware that this will in practice create new EC2 instances and get rid of the old ones (depending on the use case); it won't change the instance type of existing nodes.
EKS node groups are in essence EC2 Auto Scaling groups, which use launch templates to scale nodes up and down, and the launch template defines the instance type. Hence, by defining a new launch template version, any new nodes that are spun up will use the new instance type (and if the number of nodes doesn't change, the change can be executed via a rolling update to minimize the impact on the cluster).
Steps to update in the AWS console:
Navigate to Auto Scaling groups under the EC2 service
Find the launch template corresponding to the Auto Scaling group for the node group
Create a new version by selecting Actions -> Modify template (Create new version)
This pre-fills the existing template, so only the instance type needs to be modified.
Set the default version for the launch template via Actions -> Set default version
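The same launch template change can also be scripted; a sketch with the AWS CLI, assuming a launch template named eks-nodegroup-template whose current default is version 1 (both placeholders, substitute your own):

```shell
# Create a new version that copies everything from version 1
# and overrides only the instance type
aws ec2 create-launch-template-version \
  --launch-template-name eks-nodegroup-template \
  --source-version 1 \
  --launch-template-data '{"InstanceType":"m5.large"}'

# Make the new version the default so the ASG uses it for new nodes
aws ec2 modify-launch-template \
  --launch-template-name eks-nodegroup-template \
  --default-version 2
```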
Applying the change
Number of nodes remains the same:
Open the auto scaling group
Click on Start instance refresh
Set appropriate minimum healthy percentage and instance warmup
An instance refresh replaces instances. Each instance is terminated first and then replaced, which temporarily reduces the capacity available within your Auto Scaling group.
In case there is only a single node, then it could make sense to temporarily scale up to 2 nodes for the refresh process to be able to reschedule the workload evicted from the node being refreshed.
Number of nodes reduces:
The node group can be scaled down via eksctl scale nodegroup. But bear in mind that this will terminate all instances in the node group and create the new instances based on the updated launch template.
Number of nodes increases:
The node group can be scaled up via eksctl scale nodegroup. The new instances will be created based on the updated launch template.
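A sketch of the eksctl command referenced above (cluster name, node group name and node counts are placeholders):

```shell
# Scale the node group; any newly created nodes are launched
# from the default version of the updated launch template
eksctl scale nodegroup \
  --cluster my-cluster \
  --name my-nodegroup \
  --nodes 4 --nodes-min 2 --nodes-max 6
```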
Reference with screenshots

You cannot update the instance type; use autoscaling, or create a new node group and schedule the pods over there.

Surely we can update the node type. This is possible only when you created the node group via a launch template with an EKS-optimized AMI. When you create a new template version with a new instance type, you can update the node group's instance type without deleting the node group.

Related

Proper way to update AWS ASG with new launch template version

I'm updating our production ASG at night to use a smaller instance type.
For example, using m5 instances during business hours and t3 instances at night.
For this, I update the launch template version and desired capacity of the ASG from a Lambda triggered by CloudWatch.
When it updates the launch template version and desired capacity, a new instance starts from the new template version just fine. But the problem is, sometimes the ASG stops the new instance instead of the old one (the old instance type).
So I'm planning to also update the minSize of the ASG and change it back after a while, to give the new-version instance time to start properly.
For example, update the minSize and desired capacity to 2 and wait for the new instance type to start from the updated launch template version. Then, after some time, update the minSize and desired capacity back to 1 to stop the old-type instance.
Is this the right way, or could you advise a better one?
Thanks.
The solution is to set termination policy setting in the autoscaling group to OldestInstance.
This way, ASG will first terminate the oldest instances, which are the instances that you want to get rid of.
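A sketch of setting that termination policy with the AWS CLI (the ASG name is a placeholder):

```shell
# Terminate the oldest instances first on scale-in, so the
# instances from the old launch template version are removed
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --termination-policies OldestInstance
```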

EC2 Auto Scaling Group's Instance refresh goes below Healthy threshold

I have an ASG with desired/min/max of 1/1/5 instances (I want the ASG just for rolling deploys and zone failover). When I start an instance refresh with MinHealthyPercentage=100,InstanceWarmup=180, the process starts with deregistration (the instance goes into draining mode on my ALB almost immediately, instead of waiting the 180 warmup seconds until the new instance is healthy) and the application becomes unavailable for a while.
Note that this is not specific just to my case with one instance. If I had two instances, the process also starts by deregistering one of the instances and that does not fulfill the 100% MinHealthy constraint either (the app will stay available, though)!
Is there any other configuration option I should tune to get the rolling update create and warm up the new instance first?
Currently instance refresh always terminates before launching, and it uses the minHealthyPercent to determine batch size and when it can move on to the next batch.
It takes a set of instances out of service, terminates them, and launches a set of instances with the new desired configuration. Then, it waits until the instances pass your health checks and complete warmup before it moves on to replacing other instances.
...
Setting the minimum healthy percentage to 100 percent limits the rate of replacement to one instance at a time. In contrast, setting it to 0 percent causes all instances to be replaced at the same time.
https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-instance-refresh.html
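For reference, the refresh with those preferences can also be started from the AWS CLI; a sketch (the ASG name is a placeholder):

```shell
# Replace instances one at a time, with a 180-second warmup per batch
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name my-asg \
  --preferences '{"MinHealthyPercentage": 100, "InstanceWarmup": 180}'
```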
If you are running a single instance with a launch template and Auto Scaling, it is hard to do a rolling update of the EC2 instance.
I came from exactly that scenario and hit this immature AWS feature.
It's mentioned in the limitations of instance refresh: it will terminate the instance and then recreate a new one, instead of creating the new instance first.
Instances terminated before launch: When there is only one instance in
the Auto Scaling group, starting an instance refresh can result in an
outage. This is because Amazon EC2 Auto Scaling terminates an instance
and then launches a new instance.
Ref : https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-instance-refresh.html
I tried the workaround of scaling the Auto Scaling group's desired size up to 2; it creates a new instance with the latest AMI from the launch template.
Now you have two instances running, the old version and the latest version, and you can set the desired capacity in the Auto Scaling group back to 1.
Setting the desired capacity to 1 will delete the older instance and keep the latest instance with the latest AMI.
Command to update desired capacity to 2
- aws autoscaling update-auto-scaling-group --auto-scaling-group-name $ASG_GROUP --desired-capacity 2
Command to update desired capacity to 1
- aws autoscaling update-auto-scaling-group --auto-scaling-group-name $ASG_GROUP --desired-capacity 1
Instead of using the instance-refresh this worked well for me.
This does not seem to be the case anymore. An instance refresh creates now a fresh instance and terminates the old one after health checks are successful. AWS Support mentioned this behavior was not changed since 2020.

Adjusting AWS ECS spot request

I have an AWS ECS cluster, but the spot instance type I selected is too small.
I can't find a way to adjust the Spot Fleet request or change the instance type(s) for the Spot Fleet request the cluster is using.
Do I have to create a new cluster with a new Spot Fleet request?
Is there any CLI option to adjust the cluster?
Do I have to manually order an EC2 instance with an ECS-optimized AMI?
UPDATE: The similar-sounding question How to change instance type in AWS ECS cluster? advises copying the Launch Configuration. But I have no Launch Configuration.
There is no way of changing the instance types requested by a Spot Fleet after it's been created.
If you want to run you ECS workload on another instance type, create a new spot fleet (with instances which are aware of your ECS cluster).
When the spot instances spin up, they will register with your ECS Cluster.
Once they are registered, you can find the old instances (in the ECS Instances tab of the cluster view) and click the checkbox next to them.
Then, go to Actions -> Drain instances
This tells ECS that you no longer wish to use these instances. New tasks will now be scheduled on the new instances.
Once all the tasks are running on the new instances, you can delete the old spot fleet.
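Draining can also be done with the AWS CLI instead of the console; a sketch (cluster name and container instance ARN are placeholders):

```shell
# Set the old container instance to DRAINING so ECS reschedules
# its tasks onto the newly registered spot instances
aws ecs update-container-instances-state \
  --cluster my-cluster \
  --container-instances <old-container-instance-arn> \
  --status DRAINING
```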
On the subject of launch configurations. There are two ways of creating collections of spot instances.
Through a Spot Fleet (which is what you're doing)
Through an Auto Scaling Group (ASG)
ASGs allow you to supply a launch configuration (basically a set of instructions to set up an EC2 instance).
Spot Fleets only allow you to customise the instance on creation via User Data.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
Because you're using Spot Fleets, Launch Configurations aren't really a consideration for you.

ECS is there a way to avoid downtime when I change instance type on Cloudformation?

I have created a cluster to run our test environment on AWS ECS, and everything seems to work fine, including zero-downtime deploys. But I realised that when I change instance types in CloudFormation for this cluster, it brings all the instances down, and my ELB starts to fail because there are no instances running to serve the requests.
The cluster runs on spot instances, so my question is: is there by any chance a way to update instance types for spot instances without bringing the whole cluster down?
Do you have an Auto Scaling group? That would allow you to change the launch template or config to use the new instance type. Then you would set the ASG desired and minimum counts to a higher number, let the new instance type spin up and go into service in the target group, then just delete the old instance and set your Auto Scaling limits back to normal.
Without an ASG, you could launch a new instance manually, place that instance in the ECS target group. Confirm that it joins the cluster and is running your service and task. Then delete the old instance.
You might want to break this activity into smaller chunks and do it one by one. You can also write a small CloudFormation template, because by default, updating the instance type restarts your instances; to keep zero downtime you may have to do it one at a time.
However, there are two other ways that I can think of here but both will cost you money.
ASG: Create a new autoscaling group or use the existing one and change the launch configuration.
Blue/Green Deployment: Create the exact set of resources but this time with updated instance type and use Route53's weighted routing policy to control the traffic.
It solely depends on the requirement: if you can spend the money, go with the above two approaches; otherwise stick with the small deployments.

How to change instance type in AWS ECS cluster?

I have a cluster in AWS EC2 Container Service. When I've set it up, I used t2.micro instances because those were sufficient for development. Now I'd like to use more powerful instances, like m4.large.
I would like to know whether it is possible to change the instance types only, so I don't need to recreate the whole cluster. I could not find how to do this.
Yes, you can achieve this in CloudFormation.
Click on the Stack corresponding to your ECS-Cluster.
Click Update Stack
Select the Use current template radio button, Next
change EcsInstanceType
Next, Next, Update
Upscale your cluster to 2*n instances
Wait for the n new instances of the new type being created
Downscale your cluster to n
Or you could just drain and terminate the instances 1 by 1
Yes, this is possible.
The instance types in your cluster are determined by the 'Instance Type' setting within your Launch Configuration. To update the instance type without having to recreate the cluster:
Make a copy of the cluster Launch Configuration and update the 'Instance Type'.
Adjust the cluster Auto Scaling Group to point to your new Launch Configuration.
Wait for your new instances to register in your cluster and your services to start.
You can also add multiple instances types to a single cluster by creating multiple Auto Scaling Groups linked to different Launch Configurations. Note however that you can't copy Auto Scaling Groups easily within the console.
To do it without any downtime:
Create a copy of the Launch Configuration used by your Auto Scaling Group, including any changes you want to make.
Edit the Auto Scaling Group to:
Use the new Launch Configuration
Desired Capacity = Desired Capacity * 2
Min = Desired Capacity
Wait for all new instances to become 'ACTIVE' in the ECS Instances tab of the ECS Cluster
Select the old instances and click Actions -> Drain Instances
Wait until all the old instances are running 0 tasks
Edit the Auto Scaling Group and change Min and Desired back to their original values
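The ASG switch-over described above can be sketched with the AWS CLI (group name, launch configuration name and counts are placeholders):

```shell
# Point the ASG at the new launch configuration and double capacity
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-ecs-asg \
  --launch-configuration-name my-new-launch-config \
  --min-size 4 --desired-capacity 4

# After the old instances are drained, shrink back to the original size
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-ecs-asg \
  --min-size 2 --desired-capacity 2
```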
Here are the exact steps I took to update the instance type on my cluster:
Go to the cluster service, update Number of tasks to 0
Go to EC2 -> Launch Configurations -> Actions dropdown -> Copy launch configuration and set the new instance type
Go to EC2 -> Auto Scaling Groups -> Edit -> set Launch Configuration to newly created launch configuration
Go to EC2 -> Auto Scaling Groups -> Instances -> Detach instance
Go to EC2 -> Launch Configurations -> Delete old launch configuration
Go to the cluster service, update Number of tasks to your desired count.
Now when tasks start, it'll be running on the updated EC2 instance type.
This can be achieved by modifying EcsInstanceType in the CloudFormation stack for the ECS instance. Any change to the autoscaling group by hand will be overwritten by the next "Scale ECS Instances" operation.
Yes, you can change the instance type in an ECS cluster. I believe you created the ECS cluster manually from the AWS GUI. Behind the scenes, it creates an AWS CloudFormation template from your inputs in the AWS console (ECS), like VPC, instance type and size, etc. Please follow the steps below.
Find the CloudFormation template with the name "EC2ContainerService-{your-ecs-cluster-name}".
Check the existing settings in the Parameters tab (you can check your instance type here).
Now you need to update the CloudFormation stack: click Update -> Use current template -> Next -> update the EcsInstanceType parameter -> Next -> Next -> Update stack.
Once the CloudFormation update finishes, you can check in the EC2 console that there is a new spot fleet with the new instance type.
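The same stack update can be done from the CLI; a sketch (the stack name and instance type are placeholders, and note that every other stack parameter must also be passed with UsePreviousValue=true or the update will fail):

```shell
# Update only the EcsInstanceType parameter, reusing the stored template
aws cloudformation update-stack \
  --stack-name EC2ContainerService-my-cluster \
  --use-previous-template \
  --parameters ParameterKey=EcsInstanceType,ParameterValue=m4.large \
  --capabilities CAPABILITY_IAM
# List any remaining parameters with, for example (name illustrative):
#   ParameterKey=VpcId,UsePreviousValue=true
```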
Definitely, there are multiple ways to change the instance type, as suggested above using launch configurations.
But beware that it is a challenge to attach multiple launch configurations to an ECS cluster that has container instance scaling policies.
For example, if you run a cluster of t2.medium instances from one launch configuration and attach an Auto Scaling policy to the ECS cluster, that policy can signal only one Auto Scaling group, not more.
The AWS docs has a complete step by step guide covering CloudFormationStack and ECS cluster launched manually.
How do I change my container instance type in Amazon ECS?
From the guide:
To change your container instance type, complete the steps in one of
the following sections:
Update container instances launched in an ECS cluster through the AWS CloudFormation stack
Update container instances launched manually in an ECS cluster