AWS Fargate and its memory management - amazon-web-services

From AWS documentation I can see that CPU and Memory properties are required in the AWS::ECS::TaskDefinition for Fargate but not in the ContainerDefinition within the resource if Fargate is used.
How exactly does this work? If I do not specify them in the ContainerDefinition, will the container use as many resources as the task has available? If there is only one container within the task, does it make any sense to define those values? If they were required, it would seem pretty redundant and verbose to me.

When you register a task definition, you can specify the total cpu and memory used for the task. This is separate from the cpu and memory values at the container definition level.
If using the Fargate launch type, these task-level fields are required, and only specific combinations of cpu and memory values are supported. This is a hard limit of CPU/memory presented to the task. For example, if your task is configured to use 1 vCPU and 2 GB of memory, the memory limit is 2 GB; if at any moment the task's memory utilization exceeds 2 GB, the task is terminated with an OutOfMemory error.
Task size: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#task_size
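To make this concrete, here is a sketch of a Fargate task definition sized at 1 vCPU and 2 GB, matching the example above; the family and image name are placeholders:
{
  "family": "example-fargate-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "example/app:latest",
      "essential": true
    }
  ]
}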
You can also specify cpu and memory resources at the container level. This is the amount of resources presented to the container (a task can have multiple containers). If your container attempts to exceed the resources specified here, the container is killed. These fields are optional for tasks using the Fargate launch type; the only requirement is that the total amount of cpu and memory reserved for all containers within a task does not exceed the task-level cpu and memory values, if they are specified.
On a container level, the Docker daemon reserves a minimum of 4 MiB of memory for a container, so you should not specify fewer than 4 MiB of memory for your containers.
Standard Container Definition Parameters: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html#standard_container_definition_params
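As an illustrative sketch (all names and numbers are placeholders), container-level values can be layered inside such a Fargate task definition as long as their totals stay within the task-level 1 vCPU / 2 GB:
{
  "family": "example-fargate-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "1024",
  "memory": "2048",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "example/web:latest",
      "cpu": 768,
      "memory": 1536,
      "essential": true
    },
    {
      "name": "sidecar",
      "image": "example/sidecar:latest",
      "cpu": 256,
      "memory": 512,
      "essential": false
    }
  ]
}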

When a container has no limits specified in a TaskDefinition, it can use all the resources available to the task, and the task-level values are mandatory for a Fargate task.
This means that there is no need to define them if there is only one container in the TaskDefinition. They can still be specified, though they are redundant (if equal to those of the task itself) or even harmful (if they are lower than the amount given to the task).
If more than one container belongs to the same task, Fargate will distribute the resources evenly among all the containers. This may or may not be the desired behaviour.
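As a sketch (family and image names are placeholders), a two-container Fargate task definition can therefore omit the container-level values entirely and rely on the task size alone:
{
  "family": "example-two-container-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "512",
  "memory": "1024",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "example/app:latest",
      "essential": true
    },
    {
      "name": "proxy",
      "image": "example/proxy:latest",
      "essential": true
    }
  ]
}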

Related

ECS clarify on resources

I'm having trouble understanding the config definitions of a task.
I want to understand the resources. There are a few options (if we talk only about memory):
memory
containerDefinitions.memory
containerDefinitions.memoryReservation
There are a few things I'm not sure about.
First of all, the docs say that when the hard limit is exceeded, the container will stop running. Isn't the goal of a container orchestration service to keep the service alive?
Root-level memory must be greater than the sum of all containers' memory. In theory I would imagine that once there aren't enough containers deployed, new containers are created from the image. I wouldn't like to use more resources than I need, but if I reserve the memory at the root level, first, I reserve much more than needed, and second, if my application receives a huge load, will the whole cluster shut down once the memory limit is exceeded, or what?
I want to implement a system that auto-scales, and I would imagine that this way I don't have to define resources allocated, it just uses the amount needed, and deploys/kills new containers if the load increases/decreases.
For me there is a lot of confusion around ECS and Fargate, how they work and how they scale, and the more I read about it, the more confusing it gets.
I would like to set the minimum amount of resources per container, at how much load to create a new container, and at how much load to kill one (because it's not needed anymore).
P.S. I'm not experienced in DevOps in general; I used Kubernetes at my company, and there are things I'm not clear about, just learning this ECS world.
First of all, the docs say that when the hard limit is exceeded, the container will stop running. Isn't the goal of a container orchestration service to keep the service alive?
I would say the goal of a container orchestration service is to deploy your containers, and restart them if they fail for some reason. A container orchestration service can't magically add RAM to a server as needed.
I want to implement a system that auto-scales, and I would imagine that this way I don't have to define resources allocated, it just uses the amount needed, and deploys/kills new containers if the load increases/decreases.
No, you always have to define the amount of RAM and CPU that you want to reserve for each of your Fargate tasks. Amazon charges you by the amount of RAM and CPU you reserve for your Fargate tasks, regardless of what your application actually uses, because Amazon has to allocate physical hardware resources to your ECS Fargate task to ensure that much RAM and CPU are always available to it.
Amazon can't add extra RAM or CPU to a running Fargate task just because it suddenly needs more. There will be other processes, of other AWS customers, running on the same physical server, and there is no guarantee that extra RAM or CPU are available on that server when you need it. That is why you have to allocate/reserve all the CPU and RAM resources your task will need at the time it is deployed.
You can configure autoscaling to trigger on the amount of RAM your tasks are using, to start more instances of your task, thus spreading the load across more tasks which should hopefully reduce the amount of RAM being used by each of your individual tasks. You have to realize each of those new Fargate task instances created by autoscaling are spinning up on different physical servers, and each one is reserving a specific amount of RAM on the server they are on.
I would like to set the minimum amount of resources per container, at how much load to create a new container, and at how much load to kill one (because it's not needed anymore).
You need to allocate the maximum amount of resources all the containers in your task will need, not the minimum, because more physical resources can't be allocated to a single task at run time.
You would configure autoscaling with a target value, for example 60% RAM usage, and it would automatically add more task instances if the average across the current instances exceeds 60%, and start removing instances if that average falls well below 60%.
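As a sketch of that setup, the JSON below is the kind of target-tracking configuration you could pass to aws application-autoscaling put-scaling-policy (with --policy-type TargetTrackingScaling) after registering the ECS service as a scalable target; the 60% target and the cooldown values are illustrative:
{
  "TargetValue": 60.0,
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "ECSServiceAverageMemoryUtilization"
  },
  "ScaleOutCooldown": 60,
  "ScaleInCooldown": 120
}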

ECS not respecting Task Placement Constraint

I have an ECS cluster in which some services have Task Placement constraints. However, some seem to work while others don't.
I want a specific service to only launch on ECS instances that have a specific attribute: In this case, task==relay and ecs.instance-type==t2.micro.
My task placement looks like this:
And my ECS registered instances look like this:
However, when I try to run two tasks of that service, one gets placed on its appropriate instance while the other one tries to be placed on an instance that doesn't satisfy either of those 2 constraints, giving the following error (the instance is in a warm pool and doesn't have the agent activated, and it's also not a t2.micro instance).
I want both tasks to run on the same t2.micro instance, which has 512 CPU units and 682 MB of memory available. The task size is 512 CPU units and 300 MB of memory, so two should fit on the same t2.micro, unless I'm missing something. Even if that were the case, it should tell me that the micro instance (which satisfies the constraints) doesn't have enough resources, not that it tried to run the task on a totally different instance altogether, correct?
Thank you
This is due to the task placement strategy of your service. The strategy and the constraint disagree with each other. The task placement strategy currently defined tells ECS to spread your tasks evenly across the available instance types, while the constraint says that the attribute task should equal relay and the instance type should be t2.micro. ECS places one task according to the constraint, then moves on to spread the next task with respect to instance type. Since the constraint restricts that, it's unable to place the second task for you.
The fix would be to use a binpack placement strategy on CPU, which leaves the least amount of unused CPU while also minimising the number of container instances in use. Refer to the task placement strategies documentation for more details.
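As a sketch, the service's placement configuration could pair the constraint described in the question with a binpack strategy on CPU; the attribute expression below assumes the custom attribute is literally named task with the value relay:
{
  "placementConstraints": [
    {
      "type": "memberOf",
      "expression": "attribute:task == relay and attribute:ecs.instance-type == t2.micro"
    }
  ],
  "placementStrategy": [
    {
      "type": "binpack",
      "field": "cpu"
    }
  ]
}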

Task definition CPU reservation on AWS ECS EC2

I am building my cluster on ECS using EC2 instances. I am curious about specifying CPU reservation in my task definitions. How does AWS manage my tasks inside the EC2 instances when I leave the CPU reservation empty or set it to 0?
I have read this article: https://aws.amazon.com/blogs/containers/how-amazon-ecs-manages-cpu-and-memory-resources/
And here it says:
when you don’t specify any CPU units for a container, ECS intrinsically enforces two Linux CPU shares for the cgroup (which is the minimum allowed).
I am not really sure what this means, and is it different for tasks, since this is specifically stated for containers?
Cgroups are a feature of the Linux kernel that control how resources are distributed among the hierarchy of processes running on your host.
This enables your containers to operate independently from each other (each has access to a portion of the available CPU), whilst also allowing higher-priority tasks to gain access to the CPU if they require it.
A CPU share defines how much of the overall CPU your container can access; as you add more containers, this becomes a ratio dividing the CPU between them. In your case each container will get 2 shares, so if there are 4 containers each one gets a ratio of 0.25 of the available CPU.
If you define limits at the task level, you can cap the maximum amount of the host's resources that can be used, and the CPU shares are then split as a ratio of that cap. However, this will affect scheduling of new containers (if there is not enough resource available for the task and auto scaling is not enabled, your chosen task cannot be scheduled).
There is some documentation on cgroups; it is technical, so if you have little experience with Linux it might be a little confusing.
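To tie this back to ECS, here is an illustrative sketch of an EC2 launch type task definition (family, image names, and values are placeholders): the cpu field on the first container becomes its CPU shares, while the second container omits cpu and so falls back to the 2-share minimum mentioned in the quoted blog post:
{
  "family": "example-ec2-task",
  "requiresCompatibilities": ["EC2"],
  "containerDefinitions": [
    {
      "name": "worker",
      "image": "example/worker:latest",
      "cpu": 256,
      "memoryReservation": 256,
      "essential": true
    },
    {
      "name": "logger",
      "image": "example/logger:latest",
      "memoryReservation": 128,
      "essential": false
    }
  ]
}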

AWS Fargate "memoryReservation" - why?

I have some trouble understanding memory management on AWS ECS Fargate.
We can set a memoryReservation for our containers, and a memory limit on our task.
Why should I define reserved memory on Fargate? I can understand it for the EC2 launch type (to select an instance with enough free memory), but on Fargate, shouldn't AWS put the task on an instance with enough free memory anyway?
Task memory is the total memory available for all containers (it is the hard upper bound).
memoryReservation is a soft lower bound, and the container can use more memory if required.
This is helpful when you have two or more containers in one task definition. To clarify this, we can look at an example.
In this example, we reserve 128 MB for WordPress and 128 MB for MySQL, which adds up to 256 MB, half of the task-level memory. We do not want a situation where a container halts because it is using its maximum memory, so we set the hard memory limit to 512 MB; if a container reaches that limit, the agent will kill it.
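A minimal sketch of the containerDefinitions described above might look like this (family, image tags, and the exact values are illustrative):
{
  "family": "example-wordpress-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "wordpress",
      "image": "wordpress:latest",
      "memoryReservation": 128,
      "essential": true
    },
    {
      "name": "mysql",
      "image": "mysql:latest",
      "memoryReservation": 128,
      "essential": true
    }
  ]
}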
deep-dive-into-aws-fargate

Amazon ECS Task Definition - CPU units & Memory - set container to use 100% of the EC2 available Resources

I'd like to have multiple different services running on an ECS cluster, with each service running on a single EC2 instance. The EC2 instance type for all services is the same, and I would like those services to use all of their hosting EC2 instance's available resources.
I have the assumption that if I use only the soft memory parameter (without using the hard one) in the task configuration, this will allow my container instance to use all the available memory on the EC2 instance hosting it and that I won't be limiting it. Is that correct?
As for the EC2 type (t2.micro [vCPU=1, Memory=1 GiB] for example), is it possible to simply put:
{
...
"memory": 1024,
"cpu": 1024,
...
}
Since the EC2 instance should already be set up with a bunch of container service requirements.
Is it correct that you're trying to have each ECS instance handle only a single task?
The short answer to your question is: no. Usually the amount of memory made available to your containers is a bit less than the amount of memory available on the machine itself, so that the operating system has enough memory to keep running. From my experience, a t2.small, which has 2048 MB of memory, will end up with 2004 MB available for containers.
When it comes to your task definition, there are two ways of specifying memory. The memory setting is a hard limit: if the container's memory usage hits this amount, the container will be terminated. If, on the other hand, you specify memoryReservation, that much memory will be reserved for the task, but it can use more, up to the total amount on the machine. Check out the Task Definition documentation for further details.
An important consideration here is that only one of memory and memoryReservation are required. If both are used, memoryReservation should be less than memory. If you are only going to specify one of these, I'd recommend memoryReservation, as it will allow your task to use up to the total memory on the machine. If both are used, the memoryReservation will be used in calculating the amount of memory consumed by a task.
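As a sketch of that recommendation (family and image name are placeholders), a container definition with only memoryReservation reserves 1 GB for placement purposes but can use more, up to what the instance has free; adding a memory value on top of it would turn that into a hard cap:
{
  "family": "example-service-task",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "example/app:latest",
      "memoryReservation": 1024,
      "essential": true
    }
  ]
}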
When placing tasks on an instance, ECS looks at the amount of available memory, that is, the amount of memory registered for the instance minus the memory already reserved by tasks placed on it. If this number is less than the amount of memory required for a task, the task will not be placed on that instance. If no instance has enough memory for the task, it will not be placed at all, and the error will be logged in the service's Events log.
So it's important to look at the amount of memory actually registered by your instance type, and then ensure your memory or memoryReservation values are lower than the amount registered by your instances. Otherwise, your tasks will never be placed.
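One way to check those registered and remaining values is aws ecs describe-container-instances; a heavily trimmed excerpt of its output for a single instance looks roughly like this (the numbers are made up for illustration):
{
  "containerInstances": [
    {
      "registeredResources": [
        { "name": "CPU", "type": "INTEGER", "integerValue": 1024 },
        { "name": "MEMORY", "type": "INTEGER", "integerValue": 2004 }
      ],
      "remainingResources": [
        { "name": "CPU", "type": "INTEGER", "integerValue": 512 },
        { "name": "MEMORY", "type": "INTEGER", "integerValue": 980 }
      ]
    }
  ]
}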
As for cpu, this value is not required, and if not specified, all tasks on an instance are allowed an equal portion of the CPU available on the system. If only one task is on the instance, it can use the entire CPU of the instance by default.