AWS ECS Capacity providers - amazon-web-services

I created a capacity provider for an ECS cluster (t2.small instances) and attached it to an Auto Scaling group.
I am running 2 tasks with similar resource requirements, which together use the t2.small completely. This is a batch job that runs for at most 5 seconds. When I set the desired count of tasks in the service to 4 or 6, my cluster never scales out. According to https://aws.amazon.com/blogs/containers/deep-dive-on-amazon-ecs-cluster-auto-scaling/ , my CapacityProviderReservation should go to 200% when the desired count increases to 4. But the average and maximum CapacityProviderReservation never go beyond 100%.
I have already set an autoscaling policy for the service, but I am still not able to scale instances using capacity providers. Can anyone explain how to implement this?

In that deep dive article, it explains:
CapacityProviderReservation = M / N * 100
Where:
M = the number of instances the cluster should have
N = the number of instances currently in the cluster
One of the factors used to calculate "M" is the current number of tasks plus the tasks in the "PROVISIONING" state.
For a service's tasks to go into the "PROVISIONING" state, the service also has to be configured to use a capacity provider.
Once I converted a bunch of services over and scaled them up, there was no capacity for them and instead of failing instantly, they went into the "PROVISIONING" state.
At this point, CapacityProviderReservation went above 100% and the cluster scaled to meet the needs.
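As a rough illustration of the formula above, here is a minimal Python sketch. The function name and the example numbers are assumptions made for illustration (this is not an AWS API), and the empty-cluster case (N = 0) is deliberately left out.

```python
def capacity_provider_reservation(m: int, n: int) -> float:
    """Return M / N * 100, following the formula quoted above.

    m: number of instances the cluster should have (M)
    n: number of instances currently in the cluster (N)
    """
    if n <= 0:
        # Assumption: an empty cluster (N = 0) is out of scope for this sketch.
        raise ValueError("n must be positive in this sketch")
    return m / n * 100


# Two instances running, but four would be needed once the extra
# tasks are counted toward M: the metric doubles.
print(capacity_provider_reservation(4, 2))  # 200.0
# If the extra tasks are never counted toward M, the metric stays flat.
print(capacity_provider_reservation(2, 2))  # 100.0
```

The second call matches what the question describes: tasks that never enter "PROVISIONING" do not raise M, so the metric stays at 100%.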

Related

AWS ASG target tracking an ECS took 15 minutes to scale-in after the desired tasks of ECS is 0

I have an ECS cluster on AWS that uses a capacity provider. The ASG associated with the capacity provider is responsible for scaling EC2 instances out and in based on the ECS desired task count. It is worth mentioning that the desired task count is managed by a Lambda function and updated based on some metrics (it calculates the depth of an SQS queue and changes the ECS desired task count accordingly).
Scaling out happens almost immediately (not counting the provisioning and pending time), but when the desired task count is set to zero in ECS (by the Lambda function), it takes at least 15 minutes for the ASG to terminate the instances. Since we are using a large number of high-performance EC2 instance types, this scale-in delay costs us a lot of money. Is there any way to reduce this cooldown time to a few minutes?
P.S.: I have set the default cooldown to 120, but it didn't change anything.

AWS ECS Scaling based on memoryreservation

I've been given an AWS environment to look after. It runs ECS on EC2 instances and has scaling configured using ECS Memory Reservation. The system was originally running before Cluster Autoscaling was made generally available, so it just uses a CloudWatch metric to scale out and scale in. As far as I can work out, it follows a basic AWS design.
The EC2 has an autoscaling group and allows scale from 1 to 5 instances with 1 being the desired state.
There is 1 cluster service running with 6 tasks configured.
5 of those tasks are configured with a maximum of 2 copies and a desired count of 1; the other is set to a maximum of 1.
The tasks have MemoryReservation (soft limit) figures configured but not Memory (hard limit).
The tasks are primarily running Java.
The highest memory reservation is set at about 200MB and most are around this figure.
The scale out rule is based on MemoryReservation at 85%.
Docker stats shows most of the tasks are running about 300MB and some exceed 600MB.
The instance size has 4GB of RAM.
If the maximum reservation is 2GB, even if the tasks are consuming more like 3GB in reality, am I right in believing that the scale out rule will NEVER be invoked because 2GB is 50% of available RAM? Do I need to increase the memory reservations to something more realistic?
Also if it is only running a single EC2 instance am I right in thinking even if I increased the MemoryReservation figures to something more realistic, just because there's no theoretical room to start another task it won't spin up a second EC2 instance automatically? Just picked this up from different articles I've been reading when searching.
Thanks
Even after the Capacity Providers update in May 2022, Capacity Providers still have a gap to fill in memory scaling.
Regarding the OP's setup, "ECS Memory Reservation" no longer seems to be an option (at least in the web console).
When creating the Capacity Provider, only the target value is configurable.
This blog has more details on how this capacity is calculated, and it mentions:
This calculation accounts for vCPU, memory, ENI, ports, and GPUs of the tasks and the instances
However, consider tasks whose memory consumption does not necessarily grow, in a service with scheduled actions configured to scale tasks (eg: a different minimum number of tasks at different times of day).
This case will not trigger a scale out: if the tasks simply do not fit because of their memory configuration, the memory on the instances never gets used, and you will see errors (in the service events) like:
service myservice was unable to place a task because no container
instance met all of its requirements. The closest matching
container-instance abc123xxxx has insufficient memory available.
This basically means a scheduled task scaling change may not happen if the task memory setting is just big enough that the task doesn't fit in the running instances. The CapacityProviderReservation does not change because the calculation only takes effect when tasks are in the PROVISIONING state, which does not happen in this case.
Possible workarounds
Decrease the Capacity Provider's target value. This basically means "keep spare capacity": by default the target is 100 (%), so it tries to use the ASG cluster resources as fully as possible. With a number below 100, the cluster scales out once it is used at that level, so there is always a margin of spare resources and new scheduled tasks will fit, as long as the spare is large enough (eg: calculate the per-task memory reservation and the cluster memory reservation of all expected running tasks).
Set up ASG scaling rules that match the service scaling rules.
While possible, this is prone to problems with timing and with auto scaling caused by other triggers.
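To make the first workaround concrete, here is a hedged Python sketch of how a lower target could translate into spare instances. It assumes the target-tracking policy settles where M / N * 100 is at or just below the target value, with M and N as in the deep dive formula; the function name and the numbers are illustrative assumptions, not AWS behavior guarantees.

```python
def instances_for_target(m: int, target_percent: int) -> int:
    """Smallest N such that M / N * 100 <= target_percent.

    m: number of instances the current tasks actually need (M)
    target_percent: the capacity provider target value (1 to 100)
    """
    if not 1 <= target_percent <= 100:
        raise ValueError("target_percent must be between 1 and 100")
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-m * 100 // target_percent)


# With the default target of 100 there is no headroom.
print(instances_for_target(4, 100))  # 4
# With a target of 80, one extra instance is kept as spare capacity.
print(instances_for_target(4, 80))   # 5
```

Whether one spare instance is enough depends on the memory reservation of the tasks that the scheduled action adds.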
A few things:
1) Cluster AutoScaling usually is just the term ECS uses for "An AutoScaling Group that launches instances into the cluster", and it sounds like that's what you are currently using. Capacity Providers are a newer feature where ECS more directly manages the ASG, which might be the newer feature you're thinking of.
2) 'Desired Capacity' isn't a state that you set for where you want the group to be; it's the current amount of capacity that AutoScaling wants there to be in the group. So if a scaling policy goes off and says +1, the desired will change to 2, and then AutoScaling will try to launch an instance, since you presumably only had 1 before (since the desired was 1 before).
3) Memory reservation is based on that 2GB reserved, so for scaling purposes it doesn't matter how much is in use. This is important because even if you had 6/8GB reserved (from 3 2GB tasks) but 7.5GB in use, ECS would still allow another task to be launched, since there are still 2 reservable GB.
4) Because of 3) you should probably increase the reservation value; you wouldn't want an instance to get overloaded. Java can be nasty about RAM issues. This would also help with your scale out threshold issue.
5) For your second question, scaling will only happen after the CloudWatch alarm is triggered. So if the metric never goes above that threshold, the alarm can't trigger the scaling policy. There are a whole host of cases where scaling won't happen even though the alarm triggers (more of them for scaling in than scaling out, but it can still happen on scale out too); but the alarm going into the Alarm state is definitely a required step.
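The point that the metric follows reservations rather than actual usage can be sketched in Python. This is a minimal illustration that assumes the cluster MemoryReservation metric is the memory reserved by running tasks divided by the memory registered by the container instances; the function name and the figures (2 GB reserved, one instance with 4 GB registered) are assumptions taken from the question's example.

```python
from typing import Iterable


def memory_reservation_percent(reserved_mb: Iterable[int],
                               registered_mb: Iterable[int]) -> float:
    """Reserved task memory as a percentage of registered instance memory."""
    total_registered = sum(registered_mb)
    if total_registered <= 0:
        raise ValueError("the cluster has no registered memory")
    return sum(reserved_mb) / total_registered * 100


SCALE_OUT_THRESHOLD = 85  # percent, as in the question

# 2 GB reserved in total on a single 4 GB instance. What the
# containers really consume (say 3 GB) never enters the calculation.
percent = memory_reservation_percent([2048], [4096])
print(percent)                         # 50.0
print(percent >= SCALE_OUT_THRESHOLD)  # False
```

In practice an instance registers slightly less memory than its nominal size, so real values will differ a little from this sketch.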

Running single task in ECS without Blue/Green Deployment

Is it possible to have exactly one task running in AWS ECS at all time? I don't want to have Blue/Green kind of deployment.
My Requirement:
Min/Desired/Max task = 1;
When I redeploy the ECS service, it should first stop the old task and then start the new task. Currently it does the opposite.
Any reference would be helpful.
Yes it is possible.
You can create an ECS Service with Number of Tasks as 1 that will set the desired count to 1.
Since you want only 1 task and that should stop and then a new one should come, you can modify the Deployment Configuration with below values:
Minimum Healthy Percent - 0
Maximum Percent - 100
With Desired Count as 1, Minimum Healthy Percent as 0 and Maximum Percent as 100, ECS Service will kill the already running task and then create a new task.
Note: The service will be down during this time.
To explain the behavior you noticed, the default values are:
Minimum Healthy Percent - 100
Maximum Percent - 200
and with Desired Count as 1, in this case ECS Service will maintain one running task at all times since Minimum Healthy Percent is 100 i.e. 100% of 1 is 1. However, Maximum Percent as 200 allows ECS Service to create another task as 200% of 1 is 2. So a new task is started first and once this task is stable the old task is stopped.
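For reference, the stop-first settings described above could be expressed as parameters for the ECS UpdateService call. The Python sketch below only builds the parameter dictionary; the helper name and the cluster and service names are hypothetical, and the commented-out boto3 call assumes boto3 is installed and credentials are configured.

```python
def stop_before_start_update(cluster: str, service: str) -> dict:
    """Build UpdateService parameters for a single-task, stop-first deployment."""
    return {
        "cluster": cluster,
        "service": service,
        "desiredCount": 1,
        "deploymentConfiguration": {
            # 0% of 1 task may stay running, so the old task can be stopped first.
            "minimumHealthyPercent": 0,
            # 100% of 1 task is the ceiling, so a second task is never started.
            "maximumPercent": 100,
        },
    }


# "my-cluster" and "my-service" are placeholder names.
kwargs = stop_before_start_update("my-cluster", "my-service")
print(kwargs["deploymentConfiguration"])

# With boto3 available, the same parameters could be passed as:
#   import boto3
#   boto3.client("ecs").update_service(**kwargs)
```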

Updating an AWS ECS Service

I have a service running on AWS EC2 Container Service (ECS). My setup is a relatively simple one. It operates with a single task definition and the following details:
Desired capacity set at 2
Minimum healthy set at 50%
Maximum available set at 200%
Tasks run with 80% CPU and memory reservations
Initially, I am able to get the necessary EC2 instances registered to the cluster that holds the service without a problem. The associated task then starts running on the two instances. As expected – given the CPU and memory reservations – the tasks take up almost the entirety of the EC2 instances' resources.
Sometimes, I want the task to use a new version of the application it is running. In order to make this happen, I create a revision of the task, de-register the previous revision, and then update the service. Note that I have set the minimum healthy percentage to require 2 * 0.50 = 1 instance running at all times and the maximum healthy percentage to permit up to 2 * 2.00 = 4 instances running.
Accordingly, I expected 1 of the de-registered task instances to be drained and taken offline so that 1 instance of the new revision of the task could be brought online. Then the process would repeat itself, bringing the deployment to a successful state.
Unfortunately, the cluster does nothing. In the events log, it tells me that it cannot place the new tasks, even though the process I have described above would permit it to do so.
How can I get the cluster to perform the behavior that I am expecting? I have only been able to get it to do so when I manually register another EC2 instance to the cluster and then tear it down after the update is complete (which is not desirable).
I have faced the same issue, where the tasks got stuck because there was no space to place them. The snippet below, from the AWS docs on updating a service, helped me make the decision that follows.
If your service has a desired number of four tasks and a maximum
percent value of 200%, the scheduler may start four new tasks before
stopping the four older tasks (provided that the cluster resources
required to do this are available). The default value for maximum
percent is 200%.
The cluster needs available resources (container instances) so that the new tasks can start and the older ones can drain.
These are the things I do:
Before doing a service update, add about 20% capacity to your cluster. You can use the ASG (Auto Scaling group) command line and add 20% to the desired capacity. This way you will have some additional instances during deployment.
Once you have the instances, the new tasks will start spinning up quickly and the older ones will start draining.
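The "add about 20%" step is simple arithmetic, sketched here in Python. The function name is made up for illustration, and rounding the extra capacity up to a whole instance is an assumption, chosen so that a small cluster still gets at least one additional instance.

```python
def capacity_with_headroom(desired: int, headroom_percent: int = 20) -> int:
    """Desired capacity plus a percentage of headroom, rounded up to whole instances."""
    if desired < 0 or headroom_percent < 0:
        raise ValueError("desired and headroom_percent must be non-negative")
    # Integer ceiling division: any fraction of an instance becomes a full one.
    extra = -(-desired * headroom_percent // 100)
    return desired + extra


# A 2-instance cluster gets one extra instance for the deployment.
print(capacity_with_headroom(2))   # 3
# A 10-instance cluster gets two.
print(capacity_with_headroom(10))  # 12
```

The result is the value you would set as the ASG desired capacity before starting the service update.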
But does this mean I will have extra container instances?
Yes, during the deployment you will add some instances, and after the older tasks drain they will hang around. The way to remove them is:
Create a MemoryReservationLow alarm (~70% threshold in your case) over a period of about 25 minutes (a longer duration, to be sure that we have over-provisioned). Since the reservation drops once those extra servers are no longer being used, they can then be removed.
I have seen this before. If your port mapping is attempting to map a static host port to the container within the task, you need more cluster instances.
Also this could be because there is not enough available memory to meet the memory (soft or hard) limit requested by the container within the task.

What is the minimum healthy percent and maximum percent in Amazon ECS

I already have experience with Docker and EC2, but I'm new to ECS. Can someone help me understand what these two parameters actually do, how they differ, and how they are used?
The official docs say:
The minimum healthy percent represents a lower limit on the number of your service's tasks that must remain in the RUNNING state during a deployment, as a percentage of the desired number of tasks (rounded up to the nearest integer). This parameter enables you to deploy without using additional cluster capacity. For example, if your service has a desired number of four tasks and a minimum healthy percent of 50%, the scheduler may stop two existing tasks to free up cluster capacity before starting two new tasks. Tasks for services that do not use a load balancer are considered healthy if they are in the RUNNING state; tasks for services that do use a load balancer are considered healthy if they are in the RUNNING state and the container instance it is hosted on is reported as healthy by the load balancer. The default value for minimum healthy percent is 50% in the console and 100% for the AWS CLI, the AWS SDKs, and the APIs.
The maximum percent parameter represents an upper limit on the number of your service's tasks that are allowed in the RUNNING or PENDING state during a deployment, as a percentage of the desired number of tasks (rounded down to the nearest integer). This parameter enables you to define the deployment batch size. For example, if your service has a desired number of four tasks and a maximum percent value of 200%, the scheduler may start four new tasks before stopping the four older tasks (provided that the cluster resources required to do this are available). The default value for maximum percent is 200%.
I still didn't get a clear idea about these two parameters.
With minimum healthy percent, if my service has a desired number of 4 tasks and a minimum healthy percent of 25%, how many existing/new tasks will the scheduler start/stop?
With maximum percent, if my service has a desired number of 4 tasks and a maximum percent value of 50%, how many existing/new tasks will the scheduler start/stop?
If my service has only one task running, how should I set these parameters so that the existing task is stopped and a new task is started?
With desired tasks at 4 and minimum healthy percent at 25%, ECS is allowed to stop all tasks except 1 during deployment before it starts new tasks.
Maximum percent should be 100 or more, so that configuration does not make sense. But if you have 4 desired tasks and maximum at 150%, ECS is allowed to start 2 new tasks before it stops other tasks.
If you want to make sure there is never more than 1 task running during redeployment then you need to set desired to 1, minimum to 0% and maximum to 100%.
If you want to make sure there is always at least 1 task running you need to set desired to 1, minimum to 100% and maximum to 200%.
Another example: if you have desired at 4, minimum at 50% and maximum at 150%, ECS can decide what it will do during deployment.
If the cluster does not have more resources, it can decide to stop 2 tasks, start 2 new tasks in the new version, and wait for the new tasks to be healthy. Then it stops the two remaining tasks and starts two new ones.
If the cluster has more resources, ECS can decide to start two new tasks before stopping existing tasks.
Or you could look at it this way. During redeployment ECS needs to run between MinimumPercent/100*desired and MaximumPercent/100*desired tasks. In this case, between 2 and 6 tasks.
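That range can be computed with a short Python sketch. It follows the rounding described in the quoted docs (the minimum is rounded up, the maximum is rounded down); the function name is an assumption made for illustration.

```python
from typing import Tuple


def deployment_bounds(desired: int, min_healthy_percent: int,
                      max_percent: int) -> Tuple[int, int]:
    """Return (fewest, most) tasks ECS may run during a deployment."""
    # The lower limit is rounded up to the nearest integer.
    lower = -(-desired * min_healthy_percent // 100)
    # The upper limit is rounded down to the nearest integer.
    upper = desired * max_percent // 100
    return lower, upper


print(deployment_bounds(4, 50, 150))   # (2, 6)
print(deployment_bounds(4, 25, 200))   # (1, 8)
print(deployment_bounds(1, 0, 100))    # (0, 1)
print(deployment_bounds(1, 100, 200))  # (1, 2)
```

The last two calls correspond to the single-task cases above: stop the old task first, or keep one task running at all times.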