This is my first time using the Amazon ECS service.
I have searched online for a while to understand auto scaling with ECS services.
I found there are two options to auto-scale my application, but there are some things I don't understand.
The first is service auto scaling, which tracks CPU/memory metrics from CloudWatch and increases the task count accordingly.
The second is cluster auto scaling, which requires creating an Auto Scaling group, a capacity provider, and so on. But in "Tutorial: Using cluster auto scaling", the task definition is run without a service, and yet it also seems to increase the task count in the end.
So what are the differences, and the pros and cons, between them?
I will try to explain briefly.
A task is a container running our code (from a Docker image).
A service makes sure that the given desired number of tasks is maintained.
We run these services in ECS backed by EC2 or Fargate. EC2 instances are machines managed by us; Fargate machines are managed by AWS.
Scaling:
Ultimately, we scale the tasks by setting a desired number of tasks between a minimum and maximum, based on CPU or any other metric of the individual tasks. This is called service auto scaling.
Fargate: Since AWS manages the necessary VMs behind the scenes, we can set any number of desired tasks we want and scale seamlessly without worrying about any infrastructure.
EC2: We can't seamlessly scale services, because we also need to add/remove EC2 instances behind the scenes. We need to auto scale these instances as well, based on CPU or any other metric of the EC2 machines. This is called cluster scaling.
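To make the "desired count between min and max" idea concrete, here is a minimal sketch of the arithmetic behind a target tracking policy. This is an illustration, not the actual Application Auto Scaling implementation (which also handles cooldowns, scale-in protection, and multiple alarms); the function name and parameters are made up for this example.

```python
import math

def target_tracking_desired_count(current_tasks, actual_metric, target_metric,
                                  min_tasks, max_tasks):
    """Approximate core of target tracking: scale the task count in
    proportion to actual/target, clamped to the [min, max] range."""
    if current_tasks == 0:
        return min_tasks
    desired = math.ceil(current_tasks * actual_metric / target_metric)
    return max(min_tasks, min(max_tasks, desired))

# A service running 4 tasks at 90% average CPU with a 60% target
# wants roughly 4 * 90 / 60 = 6 tasks.
print(target_tracking_desired_count(4, 90, 60, min_tasks=1, max_tasks=10))  # 6
```

Note how the max cap still bounds the result: even at extreme load, the desired count never exceeds the configured maximum.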
Motivation
I currently have a number of services deployed to ECS with EC2 Launch Type, but I can change that if needed.
I would like to use the EC2 Predictive Scaling feature, since traffic is very periodic (peaks during the day, slack at night).
ECS Service Auto-Scaling uses Application Auto-Scaling and only supports these policies:
Target Tracking Scaling
Scheduled Scaling
Step Scaling
Questions
Is it possible to use EC2 Predictive Auto Scaling while still deploying to ECS? If so, what is the simplest approach?
Is there a reason AWS hasn't included Predictive Auto Scaling in ECS Service Auto-Scaling?
There isn't a particular reason why predictive auto scaling is limited to EC2 instances other than customer interest/demand. I'd suggest you open an issue on the public roadmap for the AWS container services so that the team can track the request and get more insight into the use case.
I'm not seeing the point in using a capacity provider to scale the ECS cluster if I have automatic scaling at the ECS service level:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html
Am I missing something? Why would I use a capacity provider to scale the auto scaling group if I already can scale it at the service level?
Service auto scaling works at the service level only. An ECS cluster can have many services running, so a capacity provider operates at the cluster level and can scale your container instances based on all the services in the cluster, not only one service.
Capacity providers are the compute interface that links your Amazon ECS cluster with your ASG. With capacity providers, you can define flexible rules for how containerized workloads run on different types of compute capacity, and manage the scaling of that capacity. Capacity providers improve the availability, scalability, and cost of running tasks and services on ECS.
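Roughly speaking, managed scaling works by publishing a CapacityProviderReservation metric, which is the ratio of instances needed (for running plus provisioning tasks) to instances currently running, as a percentage; the ASG is then steered toward a target value such as 100. Here is a toy sketch of that metric (the real service has special cases, e.g. an empty ASG, that this ignores):

```python
def capacity_provider_reservation(instances_needed, instances_running):
    """Illustrative CapacityProviderReservation = M / N * 100, where M is
    the number of instances needed for all tasks and N is the number
    currently running. Assumes at least one instance is running."""
    return 100.0 * instances_needed / instances_running

# 3 instances needed but only 2 running -> 150%, above the 100 target,
# so managed scaling adds capacity. 1 needed of 2 -> 50%, so it scales in.
print(capacity_provider_reservation(3, 2))  # 150.0
print(capacity_provider_reservation(1, 2))  # 50.0
```

Because the metric is computed from all tasks in the cluster, scaling decisions account for every service at once, which is exactly what service-level scaling cannot do.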
In other words, with a capacity provider you can apply flexible rules to scale your app and save cost.
Example: by simply setting the base and weights in a capacity provider strategy, you can make the default strategy a mix of Fargate and Fargate Spot, ensuring that if Spot tasks are terminated, you still have your minimum desired number of tasks running on Fargate. This way you can take advantage of the cost savings of Fargate Spot in your everyday workloads.
(for more information, check out the official AWS documentation.)
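As a rough sketch of how base and weight interact in such a strategy: the base is satisfied first on its provider, and the remaining tasks are split according to the weights. The helper below is hypothetical, for illustration only; actual ECS placement may round the split differently.

```python
def split_tasks(total_tasks, base_fargate, weight_fargate, weight_spot):
    """Hypothetical illustration of a capacity provider strategy:
    the first `base_fargate` tasks go to FARGATE, the rest are split
    between FARGATE and FARGATE_SPOT in proportion to the weights."""
    base = min(base_fargate, total_tasks)
    remaining = total_tasks - base
    total_weight = weight_fargate + weight_spot
    fargate_extra = round(remaining * weight_fargate / total_weight)
    return {"FARGATE": base + fargate_extra,
            "FARGATE_SPOT": remaining - fargate_extra}

# base=2 on Fargate, weights 1:3, 10 tasks total:
# 2 base + 8 split 2:6 -> 4 on Fargate, 6 on Spot.
print(split_tasks(10, base_fargate=2, weight_fargate=1, weight_spot=3))
```

Even if every Spot task were reclaimed here, the 2 base tasks (plus the weighted Fargate share) keep the service alive on on-demand capacity.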
I'm planning on shifting from EC2 to Fargate because it is said to automatically "remove the need to choose server types, decide when to scale your clusters, or optimize cluster packing". I think I understand how a cluster scales through auto scaling rules, but that isn't exactly automatic. So am I missing something about how scaling in AWS Fargate works?
As far as I understand so far, I create a basic task, assign it some memory and CPU, and the only way it scales is through auto scaling, which basically recreates these tasks when the need arises (either through alarms or specific rules). TIA!
Fargate abstracts away the underlying cluster nodes. With ECS on EC2 instances, you have to manage auto scaling for both services and the cluster.
With Fargate, however, you scale services only and don't have to worry about the underlying cluster. It is very much like the service auto scaling you have been doing in ECS.
I have multiple services running in my ECS cluster. Each service contains one or more tasks, scaled on CPU utilization or the number of users.
I have deployed these containers with EC2 launch type.
Now, I want to increase/decrease the number of EC2 instances based on available resources in the cluster.
Let's say there are four ECS tasks running in two m5.large instances.
Now, if an ECS service increases the number of tasks and there aren't enough resources available in the cluster, how can I spin up an instance and add to the cluster?
And vice versa: if there is an instance running with no ECS task on it, how can I destroy it automatically?
PS: I was using Fargate, but since its cost is very high, I moved to EC2 instances.
You need to set up your ECS cluster instances in an ASG, as #Nitesh says. Second, you need to set up a CloudWatch alarm based on a key metric. With ECS this is complex because you need two auto scaling policies: one for the service and another to scale your instances. For EC2, the metrics you could use are cluster CPU reservation and/or cluster memory reservation.
The scheme works like this: your service increases the desired container count through an auto scaling rule based on a key metric for your service, such as CPU usage or the number of requests on a load balancer. As a consequence, the cluster CPU reservation increases, which triggers the CloudWatch alarm, and your ASG increases the number of instances.
A tip: scale up fast and scale down slow; this can be handled by tuning the alarm periods.
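For intuition on the cluster-level metric mentioned above: CPUReservation is the CPU units reserved by all running tasks divided by the CPU units registered by all instances, as a percentage. A small sketch (the instance sizes and task reservations are just example numbers):

```python
def cluster_cpu_reservation(task_cpu_units, instance_cpu_units):
    """Cluster CPUReservation = reserved CPU units / registered CPU units * 100."""
    return 100.0 * sum(task_cpu_units) / sum(instance_cpu_units)

# Four tasks reserving 512 units each, on two m5.large instances
# (2 vCPU = 2048 units each): 2048 / 4096 = 50% reserved.
print(cluster_cpu_reservation([512] * 4, [2048, 2048]))  # 50.0
```

An alarm on this metric (say, above 75%) is what would drive the ASG scale-out in the scheme described above.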
For the containers, use service auto scaling with target tracking policies. For more info, see:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch-metrics.html#cluster_reservation
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-auto-scaling.html
https://aws.amazon.com/blogs/compute/automatic-scaling-with-amazon-ecs/
I hope this helps.
Regards
I am using an ECS cluster for Jenkins agents/slaves with the Jenkins ECS plugin.
The plugin places an ECS task when a job requests a build node. Now I want to scale the EC2 instances in an Auto Scaling group associated with the ECS cluster according to demand.
Jenkins is often idle. In this case, I do not want any instances in the Auto Scaling group.
If a node (and therefore an ECS task) is requested and cannot be placed, I want to add an EC2 instance to the autoscaling group.
If an instance is idle shortly before a billing hour boundary, I want that instance to be removed.
The third point can be accomplished with a cron job on the EC2 instances that regularly checks whether the conditions are met and removes the EC2 instance.
But how can I accomplish the second point? I am unable to create a CloudWatch alarm that triggers if a task cannot be placed.
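The cron-job check for the third point boils down to "idle, and near the end of the current billing hour". A minimal sketch of that condition (assuming hourly billing, as the question does; note that EC2 Linux instances now bill per second, which makes this less important):

```python
from datetime import datetime, timezone

def minutes_into_billing_hour(launch_time, now):
    """Minutes elapsed within the instance's current hourly billing period."""
    elapsed_minutes = (now - launch_time).total_seconds() / 60
    return elapsed_minutes % 60

def should_terminate(idle, launch_time, now, window_minutes=5):
    """Terminate an idle instance only in the last few minutes of its billing hour."""
    return idle and minutes_into_billing_hour(launch_time, now) >= 60 - window_minutes

launch = datetime(2020, 1, 1, 10, 0, tzinfo=timezone.utc)
now = datetime(2020, 1, 1, 12, 57, tzinfo=timezone.utc)
# 177 minutes since launch -> 57 minutes into the third billing hour.
print(should_terminate(True, launch, now))  # True
```

The launch time here would come from the instance's own metadata in a real cron job.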
How can I accomplish this?
A rather hacky way to achieve this: you could use a Lambda function to detect when a service has runningCount + pendingCount < desiredCount for more than X seconds. (I have not tested this yet.)
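The core check such a Lambda would run is straightforward; the "for more than X seconds" debounce would need state (or an alarm period) on top. A sketch, with the service dict shaped like one entry from an ecs:DescribeServices response:

```python
def needs_capacity(service):
    """True if the service cannot currently place all of its desired tasks.
    `service` mimics one entry of an ecs:DescribeServices response."""
    return service["runningCount"] + service["pendingCount"] < service["desiredCount"]

# A service stuck at 2 of 3 desired tasks signals missing cluster capacity.
stuck = {"serviceName": "jenkins-agents", "runningCount": 2,
         "pendingCount": 0, "desiredCount": 3}
print(needs_capacity(stuck))  # True
```

When this returns True persistently, the Lambda could bump the ASG's desired capacity by one.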
Similar solutions are proposed here.
There does not seem to be a proper solution to scale only when tasks cannot be placed. Maybe AWS wants us to over-provision our clusters, which might be good practice for high availability, but is not always the best or cheapest solution.
When a task cannot be placed it means that placing that task in your ECS cluster would exceed either your MemoryReservation or CPUReservation. You could set up Cloudwatch alarms for one or both of these ECS metrics and an auto scaling policy that will add and remove EC2 instances in your ECS cluster.
This, in combination with an auto scaling policy that scales your ECS services on the ecs:service:DesiredCount dimension should be enough to get you adding the underlying EC2 instances your ECS cluster requires.
For example, your scaling policy for an ECS service might be "when we're using 70% of our allotted memory for this service, add 2 to the DesiredCount". After adding one service task, your ECS cluster MemoryReservation metric might bump past an 80% threshold, at which point a CloudWatch alarm triggers and an auto scaling policy adds another EC2 node, on which the second task can now be placed.
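The two alarms in that example operate at different levels but chain together. A toy evaluation of both conditions (the thresholds are the ones from the example, everything else is illustrative):

```python
def scaling_actions(service_mem_util, cluster_mem_reservation,
                    service_target=70, cluster_threshold=80):
    """Evaluate the service-level and cluster-level alarm conditions
    from the example: each breach produces its own scaling action."""
    actions = []
    if service_mem_util > service_target:
        actions.append("add tasks to service")
    if cluster_mem_reservation > cluster_threshold:
        actions.append("add EC2 instance to cluster")
    return actions

# Service at 75% memory, cluster at 82% reserved: both policies fire.
print(scaling_actions(75, 82))
```

In practice the second condition usually trips only after the first has added tasks, which is exactly the knock-on effect the answer describes.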
For those arriving after January 2020, the way to handle it now is probably cluster auto scaling, as documented in "Amazon ECS cluster auto scaling", with more info in "Deep Dive on Amazon ECS Cluster Auto Scaling".
Essentially, ECS now handles most of the heavy lifting. Not all, or I wouldn't be here looking for an answer ;)
For point 2, one way to solve this would be to auto scale when there are not enough CPU units to place a new Jenkins agent.
You should use the CPU reservation metric on the cluster to scale.
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch-metrics.html#cluster_reservation