Fargate ThrottlingException: Rate exceeded

I am attempting to run 30 Fargate tasks at once and I am receiving "ThrottlingException: Rate exceeded".
In the ECS Service Limits, it mentions that the default limit for concurrent Fargate tasks is 50.
Am I being throttled for something other than the number of concurrent Fargate tasks? For example, is Fargate registering a container instance for each task, so that I'm exceeding the container instance registration rate?

I reached out to AWS support and got the following answer:
ECS' run-task API, when launching a Fargate task, is throttled at 1 TPS
by default with a burst rate of 10. This means that you can--at
most--launch 10 tasks every 10 seconds. As such, we recommend that
[you] use some backoff strategy on [your] end when launching tasks.
Alternatively, [you] can use ECS create-service, in which case ECS will
ensure that all tasks are run in time while honoring the throttle
rate.
Essentially, although I could run 30 tasks concurrently, I couldn't start all 30 tasks at the same time due to the throttling of the run-task API for Fargate tasks.
As of November 7th, 2018, this limit is not mentioned in the AWS documentation: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service_limits.html
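The backoff strategy that support recommends could look roughly like the sketch below. It is a minimal illustration, not AWS's implementation: `launch_one` is a hypothetical callable that starts a single task (with boto3 it would presumably wrap the ECS `run_task` call and translate the throttling error), and `ThrottledError` is a stand-in exception defined here only for the example. The delay values are assumptions.

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for the "Rate exceeded" error raised by the real API client."""


def launch_with_backoff(launch_one, task_count, base_delay=1.0, max_delay=30.0,
                        max_attempts=8, sleep=time.sleep):
    """Call launch_one() task_count times, backing off whenever it is throttled.

    launch_one is a hypothetical zero-argument callable that starts one task
    and raises ThrottledError when the API is throttling.
    Returns the list of values that launch_one returned.
    """
    results = []
    for _ in range(task_count):
        for attempt in range(max_attempts):
            try:
                results.append(launch_one())
                break
            except ThrottledError:
                # Exponential backoff with full jitter, capped at max_delay.
                ceiling = min(max_delay, base_delay * 2 ** attempt)
                sleep(random.uniform(0, ceiling))
        else:
            # The inner loop never hit "break": every attempt was throttled.
            raise RuntimeError("still throttled after %d attempts" % max_attempts)
    return results
```

The `sleep` parameter is injectable so that the retry logic can be exercised without actually waiting. Jitter is included so that several launchers hitting the same limit do not retry in lockstep.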

Related

AWS ASG target tracking for ECS takes 15 minutes to scale in after the ECS desired task count is set to 0

I have an ECS cluster on AWS that uses a capacity provider. The ASG associated with the capacity provider is responsible for scaling EC2 instances out and in based on the ECS desired task count. It is worth mentioning that the desired task count is managed by a Lambda function, which updates it based on the depth of an SQS queue.
Scale-out happens almost immediately (not counting provisioning and pending time), but when the Lambda function sets the ECS desired task count to zero, the ASG takes at least 15 minutes to terminate the instances. Since we run a large number of high-performance EC2 instances, this scale-in delay costs us a lot of money. Is there any way to reduce this cooldown to a few minutes?
P.S. I have set the default cooldown to 120, but it didn't change anything.

How to define rate limiting for an ECS Fargate service?

I am developing a service using ECS on Fargate. The service calls AWS Textract to process documents, and it listens to an SQS queue for processing requests.
According to the Textract documentation (https://docs.aws.amazon.com/general/latest/gr/textract.html), there is a per-account limit on the TPS at which the Textract APIs can be called.
Since there are "no visible instances" in Fargate, and everything from launching instances to autoscaling them is managed by Fargate, I am unsure how to define a rate limiter for the calls to the Textract APIs.
For instance, suppose 1000 request messages arrive in the SQS queue. The service might internally spawn multiple instances to process the requests, and the TPS at which Textract is called could easily exceed the limit.
Is there any way to define a rate limiter for this scenario, so that requests are blocked when the limit is breached?
Fargate doesn't handle autoscaling for you at all. Your description of how Fargate works sounds more like Lambda than Fargate. ECS handles the autoscaling, not Fargate. Fargate just runs the containers ECS tells it to run. In ECS you would have complete control over the autoscaling settings, such as the maximum number of Fargate tasks that you want to run in your ECS service.
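Once the maximum task count of the service is fixed, one simple option is to give each task an equal share of the account-level TPS limit and pace calls inside the task. The sketch below assumes exactly that static split. The names `per_task_interval` and `Throttle` are hypothetical, the limit value has to come from the Textract quota page, and nothing here is part of any AWS SDK.

```python
import time


def per_task_interval(account_tps, max_tasks):
    """Minimum seconds between calls in one task so that max_tasks tasks
    together stay at or below account_tps calls per second."""
    if account_tps <= 0 or max_tasks <= 0:
        raise ValueError("account_tps and max_tasks must be positive")
    return max_tasks / account_tps


class Throttle:
    """Spaces calls at least `interval` seconds apart within one process."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # no call has been made yet

    def wait(self):
        """Block until the next call is allowed, then reserve the slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

Each task would create one `Throttle(per_task_interval(account_tps, max_tasks))` and call `wait()` before every Textract request. The split is conservative: when fewer than `max_tasks` tasks are running, part of the account limit goes unused.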

How to detect ECS fargate task autoscaling events like LifeCycleHook

I have an ECS service running some tasks. The server running inside a task may take 1 to 10 minutes to complete one request.
I am using SQS for job queuing. When the number of queued jobs exceeds a certain threshold, the ECS tasks are scaled up, and when it drops below a certain threshold, they are scaled down.
However, as there is no LifeCycleHook feature for ECS tasks, tasks are shut down during scale-down while processing is still running. It is not possible to delay task termination because of the lack of a LifeCycleHook.
According to our specification, we can't use the timeout feature, as we don't know in advance how long a job will take to finish.
Please suggest how to solve this problem.
There is no general solution to this problem, especially if you don't want to use a timeout. In fact, there is a long-standing, still-open GitHub issue dedicated to this:
[ECS] [request]: Control which containers are terminated on scale in
You could get some control over this by running your services on EC2 (using EC2 scale-in protection) instead of Fargate. So you either have to re-architect your solution or scale your service out and in manually.

ECS starting tasks sequentially though resources are available

In our ECS cluster setup with an ASG capacity provider, we have 5 EC2 instances, and each instance can take around 20 tasks, so overall there are resources available to run 100 tasks. If we submit a service with 100 tasks, not all tasks are started in parallel even though there are enough resources. Tasks come up in batches of 20 with a gap of 10 seconds between batches; I observed this in the ECS service event logs. Is there any configuration we can tweak to achieve complete parallelism?
This behavior is due to an artificially controlled throughput (expressed in tasks per second, TPS) that the ECS service control plane imposes. There is a bursting concept involved, which is why you see a batch of tasks being launched and then a gap of some seconds. These limits exist to avoid being throttled in other parts of the service surface. They can be lifted if there is a strong need, but the engineering team will need to validate the use case and expectations (see the point about potentially hitting other limits). The best way to pursue this is to open a ticket with AWS Support and explore your alternatives based on your requirements.
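The combination of a steady rate and a burst allowance is commonly modeled as a token bucket. The sketch below is only that generic model, with a hypothetical function name and illustrative parameters. It is not the ECS scheduler's actual algorithm, and the real limits are not stated in this answer.

```python
def launch_times(task_count, rate_per_second, burst):
    """Return the earliest time (in seconds) at which each task can start
    under a token bucket that holds `burst` tokens and refills at
    `rate_per_second` tokens per second. The bucket starts full."""
    if rate_per_second <= 0 or burst < 1:
        raise ValueError("rate_per_second must be > 0 and burst >= 1")
    times = []
    tokens = float(burst)
    now = 0.0
    for _ in range(task_count):
        if tokens < 1.0:
            # Wait just long enough for the bucket to refill to one token.
            now += (1.0 - tokens) / rate_per_second
            tokens = 1.0
        tokens -= 1.0  # starting a task consumes one token
        times.append(now)
    return times
```

With illustrative values of 1 task per second and a burst of 10, `launch_times(30, 1, 10)` places the first 10 tasks at time 0 and the remaining 20 one second apart. A real scheduler may release tasks in batches rather than one at a time, which would match the batch pattern described in the question.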

Can an AWS Fargate Service have 0 tasks running?

I currently have a Fargate cluster that contains a service. This service always has 1 task running, which polls SQS. The service scales the number of tasks as the SQS queue grows or shrinks. However, the task has a lot of idle time when there are no messages in the queue. To save on costs, is it possible to make the service go down to 0 tasks?
I have been trying to do this and the service will always try to start at least 1 task.
If this is not possible, then would it be best practice for me to not use a service and have a CloudWatch alarm on SQS and just create a task directly in the cluster when the size is greater than 0, and then shut down the task when the SQS is back to 0? Essentially mimicking the functionality of a service.
Yes, you can. You can also use a Target Tracking Policy, which allows you to scale more efficiently than a Step Scaling Policy.
See https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-using-sqs-queue.html for more details (it's about EC2 but works for ECS as well).
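The linked guide scales on a backlog-per-instance metric, and the same arithmetic carries over to ECS tasks. The sketch below only illustrates that calculation. The function names and the numbers in the usage note are illustrative assumptions, and a real target tracking policy performs the scaling itself rather than calling code like this.

```python
import math


def target_backlog(acceptable_latency_s, avg_processing_s):
    """Messages one task may have queued while still meeting the latency goal."""
    return acceptable_latency_s / avg_processing_s


def desired_tasks(queue_depth, target, min_tasks=0, max_tasks=10):
    """Task count that brings the backlog per task down to `target`,
    clamped to the service's minimum and maximum capacity."""
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(queue_depth / target)
    return max(min_tasks, min(max_tasks, wanted))
```

For example, with a target of 100 messages per task, `desired_tasks(1500, 100, max_tasks=50)` asks for 15 tasks, and an empty queue with `min_tasks=0` asks for 0. How the metric should be computed while zero tasks are running is left open here, since dividing the queue depth by zero tasks is undefined.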