AWS ECS Fargate Memory usage

I have a question about ECS Fargate memory usage.
I created a task definition (0.5 vCPU, 1 GB RAM) with Fargate and launched it via a service.
I also created a CloudWatch dashboard for monitoring.
I noticed that when my task is not in use, it consumes 10-15% of its memory, while CPU is almost 0.
Can anyone explain this to me? Is there a Docker master process or some daemon taking memory from the task definition's allocation?

Your container is running a process that consumes memory to stay up. When it's not doing anything, it's pretty normal to see 0 CPU usage and "some" memory usage. If your laptop is doing NOTHING, its CPU usage is probably near 0, but memory will still be used in the hundreds of MB. Are you expecting to see 0 memory usage? (Because that's not how it works.)

Related

ECS: clarification on resources

I'm having trouble understanding the config definitions of a task.
I want to understand the resources. There are a few options (if we talk only about memory):
memory
containerDefinitions.memory
containerDefinitions.memoryReservation
There are a few things I'm not sure about.
First of all, the docs say that when the hard limit is exceeded, the container will stop running. Isn't the goal of a container orchestration service to keep the service alive?
Root-level memory must be greater than the sum of all the containers' memory. In theory, I would imagine that once there aren't enough containers deployed, new containers are created from the image. I don't want to use more resources than I need, but if I reserve memory at the root level, first, I reserve much more than needed, and second, if my application receives a huge load, does the whole cluster shut down once the memory limit is exceeded, or what?
I want to implement a system that auto-scales, and I would imagine that this way I don't have to define resources allocated, it just uses the amount needed, and deploys/kills new containers if the load increases/decreases.
For me there is a lot of confusion around ECS and Fargate: how it works, how it scales, and the more I read about it, the more confusing it gets.
I would like to set the minimum amount of resources per container, at how much load to create a new container, and at how much load to kill one (because it's not needed anymore).
P.S. I'm not experienced in DevOps in general. I used Kubernetes at my company, and there are things I'm not clear about; I'm just learning this ECS world.
First of all, the docs say that when the hard limit is exceeded, the container will stop running. Isn't the goal of a container orchestration service to keep the service alive?
I would say the goal of a container orchestration service is to deploy your containers, and restart them if they fail for some reason. A container orchestration service can't magically add RAM to a server as needed.
I want to implement a system that auto-scales, and I would imagine that this way I don't have to define resources allocated, it just uses the amount needed, and deploys/kills new containers if the load increases/decreases.
No, you always have to define the amount of RAM and CPU that you want to reserve for each of your Fargate tasks. Amazon charges you by the amount of RAM and CPU you reserve for your Fargate tasks, regardless of what your application actually uses, because Amazon has to allocate physical hardware resources to your ECS Fargate task to ensure that much RAM and CPU are always available to it.
Amazon can't add extra RAM or CPU to a running Fargate task just because it suddenly needs more. There will be other processes, of other AWS customers, running on the same physical server, and there is no guarantee that extra RAM or CPU are available on that server when you need it. That is why you have to allocate/reserve all the CPU and RAM resources your task will need at the time it is deployed.
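To make "you always have to define the amount of RAM and CPU" concrete, here is a minimal sketch (assuming Python with boto3; the family name, image, and sizes are placeholder values, not anything from the question) of registering a Fargate task definition with the task-level cpu and memory that get reserved and billed up front:

import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="my-app",                      # hypothetical task family
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",                 # required for Fargate
    cpu="512",                            # 0.5 vCPU, reserved and billed regardless of actual use
    memory="1024",                        # 1 GB, the hard upper bound for the whole task
    containerDefinitions=[
        {
            "name": "web",
            "image": "nginx:latest",
            "memoryReservation": 256,     # soft, per-container reservation in MiB
            "essential": True,
        }
    ],
)

Whatever the container actually uses at run time, the 0.5 vCPU and 1 GB above are what Fargate provisions and what the bill is based on.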
You can configure autoscaling to trigger on the amount of RAM your tasks are using, to start more instances of your task, thus spreading the load across more tasks, which should hopefully reduce the amount of RAM used by each individual task. You have to realize that each of those new Fargate task instances created by autoscaling spins up on a different physical server, and each one reserves a specific amount of RAM on the server it is on.
I would like to set the minimum amount of resources per container, at how much load to create a new container, and at how much load to kill one (because it's not needed anymore).
You need to allocate the maximum amount of resources all the containers in your task will need, not the minimum, because more physical resources can't be allocated to a single task at run time.
You would configure autoscaling with a target value, for example 60% RAM usage, and it would automatically add more task instances if the average across the current instances exceeds 60%, and automatically start removing instances if the average across the current instances is well below 60%.
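As a rough sketch of that target-tracking setup (again assuming Python with boto3; the cluster and service names, capacities, and cooldowns are made-up examples), you would register the service as a scalable target and attach a policy on the predefined average memory metric:

import boto3

autoscaling = boto3.client("application-autoscaling")

# Register the ECS service as a scalable target (min/max task counts are examples).
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",   # hypothetical cluster/service
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=1,
    MaxCapacity=10,
)

# Target-tracking on the predefined average memory utilization metric.
autoscaling.put_scaling_policy(
    PolicyName="memory-target-60",
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,                              # aim for ~60% average RAM usage
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageMemoryUtilization"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)

Each task added this way is still a fixed-size task; autoscaling changes how many of them run, not how big each one is.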

AWS ECS Task Out of Memory - Cloudwatch Alarm

I have an ECS service that uses multiple tasks to execute a daily job. The memory that each task uses varies depending on the data it processes. I have set 16 GB of RAM for all my tasks, but some tasks stopped with the error "OutOfMemory".
Unfortunately, I can't break down the data that each task processes, because it has to be processed all together in order to produce the insights I want.
I know how to set up alarms on RAM and CPU for ECS services, but RAM and CPU for the service refer to the average CPU and RAM across all tasks.
How can I set an alarm that triggers when a task runs out of memory?
Is there a suggested way to avoid the OutOfMemory error?
I believe you have to enable ECS CloudWatch Container Insights to get per-task and per-container memory usage. Once you do that you will begin to see metrics for task memory usage (among other things) in CloudWatch that you can create alarms for.
Note that there is an added cost involved with enabling Container Insights.
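As a sketch of what that looks like (assuming Python with boto3; the cluster name, service name, threshold, and the exact Container Insights dimensions are assumptions you would adapt), you would enable Container Insights on the cluster and then alarm on the memory metric it publishes:

import boto3

ecs = boto3.client("ecs")
cloudwatch = boto3.client("cloudwatch")

# 1. Turn on Container Insights for the cluster (this is the part with extra cost).
ecs.update_cluster_settings(
    cluster="my-cluster",                                  # hypothetical cluster name
    settings=[{"name": "containerInsights", "value": "enabled"}],
)

# 2. Alarm when memory used by the job's tasks approaches the 16 GB task size.
cloudwatch.put_metric_alarm(
    AlarmName="daily-job-memory-high",
    Namespace="ECS/ContainerInsights",
    MetricName="MemoryUtilized",                           # MiB used, published by Container Insights
    Dimensions=[
        {"Name": "ClusterName", "Value": "my-cluster"},
        {"Name": "ServiceName", "Value": "daily-job"},     # hypothetical service name
    ],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=3,
    Threshold=14000,                                       # ~14 GB of the 16 GB limit, in MiB
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="notBreaching",
)

Using the Maximum statistic rather than Average is what lets a single hungry task trip the alarm even when the service-wide average still looks fine.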
Is there a suggested way to avoid the OutOfMemory error?
From an infrastructure perspective, all you can do is start provisioning more RAM for your tasks.
From an application perspective you could analyze your application for memory leaks, and examine the data structures your application is creating in memory for possible opportunities, like reducing duplicated data in memory, or moving some of the data to disk, or to a distributed cache. This sort of memory optimization work is extremely application specific.

About Cloud Run charges by hardware resources: Am I billed for underused memory?

I have a question about Cloud Run: if I set up my service with 4 GB of RAM and 2 vCPUs, for example, instead of the default 256 MB and 1 vCPU, will I have to pay much more even if I never consume all the resources I have made available? For example, let's say I set --memory to 6 GB and no request ever consumes more than 2 GB: will I pay for 6 GB of RAM or 2 GB, considering that peak usage was 2 GB?
I am asking because I want to be sure that my application will never die from running out of memory, since I think the default 256 MB of Cloud Run isn't enough for me, but I also want to be sure of how Google charges and scales.
Here's a quote from the docs:
You are billed only for the CPU and memory allocated during billable time, rounded up to the nearest 100 milliseconds.
Meaning, if you allocated 4 GB of memory to your Cloud Run service, you're still billed for 4 GB whether it's underused or not.
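A back-of-the-envelope illustration of that (the per-GiB-second price below is a placeholder; check the current Cloud Run pricing page for real numbers):

PRICE_PER_GIB_SECOND = 0.0000025   # illustrative price in USD, not an official figure

def memory_cost(allocated_gib: float, billable_seconds: float) -> float:
    # Cloud Run bills the memory you allocate, not the peak you actually use.
    return allocated_gib * billable_seconds * PRICE_PER_GIB_SECOND

# One instance serving requests for an hour, 6 GiB allocated but only ~2 GiB ever used:
print(memory_cost(allocated_gib=6, billable_seconds=3600))  # you pay for all 6 GiB
print(memory_cost(allocated_gib=2, billable_seconds=3600))  # what a 2 GiB allocation would cost

So in your 6 GB vs 2 GB example, the bill follows the 6 GB you configured, not the 2 GB peak.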
In your case, since you want to make sure that requests don't run out of memory, you can dedicate an instance to each request: find the cheapest memory setting that can handle your requests and limit the concurrency setting to one.
Or, take advantage of concurrency (and allocate higher memory), because Cloud Run allows concurrent requests, so you have control over how many requests can share an instance's resources before a new instance is started. This can be a good thing, as it helps drive down costs and minimizes starting many instances (see cold starts). It can be the better option if you are confident that a certain number of requests can share an instance without running out of memory.
When more container instances are processing requests, more CPU and memory will be used, resulting in higher costs. When new container instances need to be started, requests might take more time to be processed, decreasing the performance of your service.
Note that each approach has different advantages and drawbacks. You should take note of each before making a decision. Experimenting with the costs using the GCP pricing calculator can help (the calculator includes the Free Tier in the computation).

AWS Fargate "memoryReservation" - why?

I have some trouble understanding memory management on AWS ECS Fargate.
We can set a memoryReservation for our containers and a "memory" limit on our task.
Why should I define reserved memory on Fargate? I can understand it for the EC2 launch type (to select an instance with enough free memory), but on Fargate, shouldn't AWS place the task on an instance with enough free memory anyway?
Task memory is the total memory available to all containers (it is the hard upper bound).
memoryReservation is a soft lower bound, and a container can use more memory if required.
This is helpful when you have two or more containers in one task definition. To clarify, we can look at this example:
In this example, we reserve 128 MB for WordPress and 128 MB for MySQL, which adds up to 256 MB, half of the task-level memory. We don't want a situation where a container halts just because it exceeds its reservation, so the hard limit is set at the task level to 512 MB; if a container reaches that limit, the agent will kill it.
deep-dive-into-aws-fargate
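Expressed as a task definition (a sketch assuming Python with boto3; the family name, images, and CPU value are placeholders, and only the 128 MB reservations and 512 MB task memory come from the example above), it would look roughly like this:

import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="wordpress-demo",              # hypothetical family name
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",                         # hard ceiling shared by both containers
    containerDefinitions=[
        {
            "name": "wordpress",
            "image": "wordpress:latest",
            "memoryReservation": 128,     # soft: can grow beyond this while the task has headroom
            "essential": True,
        },
        {
            "name": "mysql",
            "image": "mysql:8",
            "memoryReservation": 128,
            "essential": True,
        },
    ],
)

Either container can burst past its 128 MB reservation, but together they can never exceed the 512 MB task memory.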

High Memory Utilization in AWS EC2

I have used an AWS EC2 t2.large server for my web application.
I have set up a custom metric for MemoryUtilization. After setup, when I viewed the MemoryUtilization metric, it showed more than 85% almost all the time.
I also checked the CPU utilization for the same instance, and it is less than 10% most of the time.
I am wondering how memory utilization got so high. What are the possible options to reduce it? Is it due to AWS's virtualization system?
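For context, EC2 does not report memory to CloudWatch out of the box, so a custom metric like this is usually pushed from inside the instance (these days typically by the CloudWatch agent). A minimal sketch of the idea, assuming Python with boto3 and psutil and a made-up namespace:

import boto3
import psutil

cloudwatch = boto3.client("cloudwatch")

def publish_memory_utilization(instance_id: str) -> None:
    percent_used = psutil.virtual_memory().percent   # e.g. 85.3
    cloudwatch.put_metric_data(
        Namespace="Custom/System",                   # hypothetical namespace
        MetricData=[
            {
                "MetricName": "MemoryUtilization",
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Value": percent_used,
                "Unit": "Percent",
            }
        ],
    )

publish_memory_utilization("i-0123456789abcdef0")    # placeholder instance ID

A consistently high reading from a metric like this comes from what the OS itself reports, which points at the processes on the instance rather than the virtualization layer.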
Your application has memory leaks or unnecessary memory usage.
Try using a memory leak detection tool to find and fix the issue in the application.
If you don't want to fix your application, try changing the instance type to a memory-optimized instance type.