AWS ECS Container Agent only registers 1 vCPU but 2 are available - amazon-web-services

In the AWS console I set up an ECS cluster. I registered an EC2 container instance on an m3.medium, which has 2 vCPUs. In the ECS console it says only 1024 CPU units are available.
Is this expected behavior?
Should the m3.medium instance not make 2048 CPU units available for the cluster?
I have been searching the documentation. I find a lot of explanation of how tasks consume and reserve CPU, but nothing about how the container agent contributes to the available CPU.
ECS Screenshot

tldr
In the AWS console I set up an ECS cluster. I registered an EC2 container instance on an m3.medium, which has 2 vCPUs. In the ECS console it says only 1024 CPU units are available. Is this expected behavior?
Yes.
Should the m3.medium instance not make 2048 CPU units available for
the cluster?
No. m3.medium instances only have 1 vCPU (1024 CPU units).
I have been searching the documentation. I find a lot of explanation
of how tasks consume and reserve CPU, but nothing about how the
container agent contributes to the available CPU.
There probably isn't going to be much official documentation on the container agent's resource accounting specifically, but my best recommendation is to pay attention to the issues, releases, changelogs, etc. in the github.com/aws/amazon-ecs-agent and github.com/aws/amazon-ecs-init projects.
long version
The m3 instance types are essentially deprecated at this point (they're not even listed on the main instance details page), but you can see in the details table on the somewhat hard-to-find ec2/previous-generations page that the m3.medium has only 1 vCPU:
Instance Family | Instance Type | Processor Arch | vCPU | Memory (GiB) | Instance Storage (GB) | EBS-optimized Available | Network Performance
...
General purpose | m3.medium | 64-bit | 1 | 3.75 | 1 x 4 | - | Moderate
...
According to the knowledge-center/ecs-cpu-allocation article, 1 vCPU is equivalent to 1024 CPU units:
Amazon ECS uses a standard unit of measure for CPU resources called CPU units. 1024 CPU units is the equivalent of 1 vCPU.
For example, 2048 CPU units is equal to 2 vCPU.
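The conversion is just a factor of 1024, so here is a quick sketch (the vCPU counts below are taken from the instance tables mentioned above):

```python
# 1 vCPU == 1024 CPU units, per the ECS docs quoted above.
CPU_UNITS_PER_VCPU = 1024

# vCPU counts from the instance tables (m3.medium from the
# previous-generation page, m5.large from the current docs).
for instance_type, vcpus in {"m3.medium": 1, "m5.large": 2}.items():
    print(f"{instance_type}: {vcpus * CPU_UNITS_PER_VCPU} CPU units")
# m3.medium: 1024 CPU units
# m5.large: 2048 CPU units
```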
Capacity planning for an ECS cluster on EC2 can be... a journey... and will be highly dependent on your specific workload. You're unlikely to find a "one size fits all" documentation/how-to source, but I can recommend the following starting points:
The capacity section of the ECS best practices guide
The cluster capacity providers section of the ECS Developer Guide
Running on an m3.medium is probably a problem in and of itself, since the smallest instance types I've seen in the documentation are c5.large, r5.large and m5.large, which all have 2 vCPUs.

Related

AWS Fargate Prices Tasks

I have set up a Task Definition with a CPU maximum allocation of 1024 units and 2048 MiB of memory, with Fargate being the launch type. When I looked at the costs it was way more expensive than I thought ($1.00 per day or $0.06 per hour [us-east-1]). What I did was reduce it to 256 units, and I am waiting to see if the cost goes down. But how does the task maximum allocation work? Is the task definition maximum allocation responsible for Fargate provisioning a more powerful server with a higher cost even if I don't use 100%?
The apps in the containers, running 24/7, are a NestJS application + Apache (do not ask why) + Redis, and I can see that CPU usage is low, but the price is too high for me. Is Fargate the wrong choice for this? Should I go for EC2 instances with ECS?
When you run a task, Fargate provisions a container with the resources you have requested. It's not a question of "use up to this maximum CPU and memory," but rather "use this much CPU and memory." You'll pay for that much CPU and memory for as long as it runs, as per the AWS Fargate pricing. At current prices, the CPU and memory you listed (1024 CPU units, 2048 MiB) come to $0.04937/hour, or $1.18488/day, or about $35.55/month.
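To make that arithmetic explicit, here is a rough sketch using the us-east-1 on-demand Fargate rates at the time of writing (the per-vCPU and per-GB rates below should be verified against the current pricing page):

```python
# Rough on-demand Fargate cost estimate (assumed us-east-1 rates; verify
# against the current AWS Fargate pricing page before relying on them).
PER_VCPU_HOUR = 0.04048   # USD per vCPU per hour
PER_GB_HOUR = 0.004445    # USD per GB of memory per hour

vcpus = 1024 / 1024       # 1024 CPU units == 1 vCPU
memory_gb = 2048 / 1024   # 2048 MiB requested as 2 GB

hourly = vcpus * PER_VCPU_HOUR + memory_gb * PER_GB_HOUR
print(f"${hourly:.5f}/hour, ${hourly * 24:.5f}/day, ${hourly * 24 * 30:.2f}/month")
# $0.04937/hour, $1.18488/day, $35.55/month
```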
Whether Fargate is the right or wrong choice is subjective. It depends on what you're optimizing for. If you just want to hand off a container and allow AWS to manage everything about how it runs, it's hard to beat ECS Fargate. OTOH, if you are optimizing for lowest cost, on-demand Fargate is probably not the best choice. You could use Fargate Spot ($10.66/month) if you can tolerate the constraints of spot. Alternatively, you could use an EC2 instance (t3.small @ $14.98/month), but then you'll be responsible for managing everything.
You didn't mention how you're running Redis which will factor in here as well. If you're running Redis on Elasticache, you'll incur that cost as well, but you won't have to manage anything. If you end up using an EC2 instance, you could run Redis on the same instance, saving latency and expense, with the trade off that you'll have to install/operate Redis yourself.
Ultimately, you're making tradeoffs between time saved and money spent on managed services.

AWS BATCH - how to run more concurrent jobs

I have just started working with AWS BATCH for my deep learning workload. I have created a compute environment with the following config:
min vCPUs: 0
max vCPUs: 16
Instance type: g4dn family, g3s family, g3 family, p3 family
allocation strategy: BEST_FIT_PROGRESSIVE
The maximum vCPU limit for my account is 16, and each of my jobs requires 16 GB of memory. I observe that a maximum of 2 jobs can run concurrently at any point in time. I was using allocation strategy BEST_FIT before and changed it to BEST_FIT_PROGRESSIVE, but I still see that only 2 jobs can run concurrently. This limits the amount of experimentation I can do in a given time. What can I do to increase the number of jobs that can run concurrently?
I figured it out myself just now. I'm posting an answer here just in case anyone finds it helpful in the future. It turns out that the instances assigned to each of my jobs are g4dn.2xlarge. Each of these instances takes up 8 vCPUs, and as my vCPU limit is 16, only 2 jobs can run concurrently. One solution is to ask AWS to increase the vCPU limit by creating a support case. Another is to modify the compute environment to use GPU instances that consume 4 vCPUs (the lowest possible on AWS), in which case a maximum of 4 jobs can run concurrently.
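As a back-of-the-envelope check, the achievable concurrency is just the vCPU quota divided by the vCPU footprint of the instance type Batch picks for each job. A small sketch with the numbers from this question:

```python
# Max concurrent Batch jobs given a vCPU quota and the vCPUs consumed by the
# instance backing each job (numbers taken from the question above).
def max_concurrent_jobs(vcpu_limit: int, vcpus_per_job: int) -> int:
    return vcpu_limit // vcpus_per_job

print(max_concurrent_jobs(16, 8))  # g4dn.2xlarge instances -> 2 jobs
print(max_concurrent_jobs(16, 4))  # 4-vCPU GPU instances   -> 4 jobs
print(max_concurrent_jobs(64, 8))  # after a quota increase -> 8 jobs
```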
There are two kinds of solutions:
Configure your compute environment with EC2 instances whose vCPU count is a multiple of the vCPUs in your job definition. For example, a compute environment with an 8-vCPU instance type and a limit of 128 vCPUs, combined with a job definition that requests 8 vCPUs, will let you execute up to 16 concurrent jobs, because 16 concurrent jobs x 8 vCPUs = 128 vCPUs (also take into account the allocation strategy and the instance memory, which matters if your jobs consume a lot of memory too).
Multi-node parallel jobs. This is a very interesting solution because in this scenario the instance vCPU count does not need to be a multiple of the vCPUs in your job definition, and jobs can be spanned across multiple Amazon EC2 instances.

AWS ECS Deployment: insufficient memory

I have configured an AWS ECS cluster with 3 instances (m5.large), with one instance in each availability zone (A, B, and C). The Service is configured as follows:
Service type: REPLICA
Number of Tasks: 3
Minimum Healthy Percent: 30
Maximum Percent: 100
Placement Templates: AZ Balanced Spread
Service AutoScaling: No.
In the Task Definition, I have used the following:
Network Mode: awsvpc
Task Memory: --
Task CPU: --
At the container level, I have configured only Memory Soft Limit:
Soft Limit: 2048 MB
Hard Limit: --
I have used awslogs for logging. The above configuration works, and when I start the service, there is one container running on each of the instances. 'docker stats' on one of the instances shows the following:
MEM USAGE / LIMIT
230MiB / 7.501GiB
And the container instance (ECS Console) shows the following:
Resources Registered Available
CPU 2048 2048
Memory 7680 5632
Ports 5 ports
The above results are the same across all the 3 instances -- 2 GB of memory has been reserved (soft limit) and upper memory limit is instance memory of nearly 8 GB (no hard limit set). Everything works as expected so far.
But when I re-deploy the code (using force deploy) from Jenkins, I get the following error in the Jenkins Log:
"message": "(service App-V1-Service) was unable to place a task because no container instance met all of its requirements. The closest matching (container-instance 90d4ba21-4b19-4e31-c42d-d7223b34f17b) has insufficient memory available. For more information, see the Troubleshooting section of the Amazon ECS Developer Guide.
In Jenkins, the job shows up as 'Success', but it is the old version of the code that is running. There is sufficient memory available on all three instances. Also, I have changed the Minimum Healthy Percent to 30, hoping that ECS can stop the container and redeploy the new one. Any solution or pointers to debug this further will be of great help.
During deployment, the ECS scheduler allocates memory based on the soft limit for each container, which can be 2048 * 3 = 6144 MB. That exceeds the memory available on the instance: 5632 MB (available) < 6144 MB (required).
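A quick sketch of that comparison, using the numbers from the question (this mirrors the reservation math described above, not an official ECS algorithm):

```python
# Reservation math from the answer above, using the question's numbers.
registered_mb = 7680                  # memory registered by each m5.large instance
available_mb = registered_mb - 2048   # minus the running task's soft limit -> 5632
required_mb = 2048 * 3                # soft limit reserved per container during the deployment -> 6144

print(available_mb < required_mb)     # True -> "has insufficient memory available"
```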
If you are running replicas on the same ECS container instance, then I recommend keeping the soft limit small, less than or equal to 1 GB; this is also what ECS suggests.
With this configuration you will be able to run blue/green deployments as well. There is no harm in keeping the soft limit low, because the container can still use more memory when it's required, so setting a large soft limit does not improve performance.
I would not recommend lowering the Minimum Healthy Percent to 0, as decreasing the soft limit to 1 GB will resolve the issue.
Or, if you want to keep the same memory limit, decrease the Minimum Healthy Percent.

What vCPUs in Fargate really mean?

I was trying to get answers on my question here and here, but I understood that I need to know specifically Fargate implementation of vCPUs. So my question is:
If I allocate 4 vCPUs to my task, does that mean that my single-threaded app running in a container in this task will be able to fully use all these vCPUs, since they are essentially only a portion of the processor core's time that I can use?
Let's say I assigned 4 vCPUs to my task, but on a technical level I assigned 4 vCPUs to a physical core that can freely process one thread (or even more with hyperthreading). Is my logic correct for the Fargate case?
P.S. It's a Node.js app that runs sessions with multiple players interacting with each other, so I do want to give a single Node.js process maximum capacity.
Fargate uses ECS (Elastic Container Service) in the background to orchestrate Fargate containers. ECS in turn relies on the compute resources provided by EC2 to host containers. According to AWS Fargate FAQ's:
Amazon Elastic Container Service (ECS) is a highly scalable, high performance container management service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances
...
ECS uses containers provisioned by Fargate to automatically scale, load balance, and manage scheduling of your containers
This means that a vCPU is essentially the same as an EC2 instance vCPU. From the docs:
Amazon EC2 instances support Intel Hyper-Threading Technology, which enables multiple threads to run concurrently on a single Intel Xeon CPU core. Each vCPU is a hyperthread of an Intel Xeon CPU core, except for T2 instances.
So to answer your questions:
If you allocate 4 vCPUs to a single threaded application - it will only ever use one vCPU, since a vCPU is simply a hyperthread of a single core.
When you select 4 vCPUs you are essentially assigning 4 hyperthreads to a single physical core. So your single threaded application will still only use a single core.
If you want more fine grained control of CPU resources - such as allocating multiple cores (which can be used by a single threaded app) - you will probably have to use the EC2 Launch Type (and manage your own servers) rather than use Fargate.
Edit 2021: It has been pointed out in the comments that most EC2 instances in fact have 2 hyperthreads per CPU core. Some specialised instances such as the c6g and m6g have 1 thread per core, but the majority of EC2 instances have 2 threads/core. It is therefore likely that the instances used by ECS/Fargate also have 2 threads per core. For more details see doco
You can inspect which physical CPU your ECS task runs on by checking the model name field in /proc/cpuinfo. You can just cat this file in your ENTRYPOINT / CMD script or use ECS Exec to open a terminal session with your container.
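If you'd rather do it from code than from a shell, here is a minimal sketch that reads the model name from /proc/cpuinfo (Linux only; run it inside the container, e.g. via ECS Exec or your entrypoint):

```python
# Print the CPU model the container is running on (reads /proc/cpuinfo, Linux only).
def cpu_model(path: str = "/proc/cpuinfo") -> str:
    with open(path) as f:
        for line in f:
            if line.startswith("model name"):
                return line.split(":", 1)[1].strip()
    return "unknown"

if __name__ == "__main__":
    print(cpu_model())
    # e.g. Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
```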
I've actually done this recently, because we've been observing some weird performance drops on some of our ECS Services. Out of 84 ECS Tasks we ran, this was the distribution:
Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz (10 tasks)
Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz (22 tasks)
Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz (10 tasks)
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz (25 tasks)
Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz (17 tasks)
Interesting that it's 2022 and AWS is still running CPUs from 2016 (the E5-2686 v4). All these tasks are fully-paid On-Demand ECS Fargate. When running some tasks on Spot, I even got an E5-2666 v3, which is from 2015, I think.
While getting random CPUs assigned to our ECS tasks was somewhat expected, the differences between them are so significant that I observed one of my services report either 25% or 45% CPU utilization while idle, depending on which CPU it landed on in the "ECS instance type lottery".

One big EC2 or multiple small EC2 or one ECS - which is cost-effective?

We are running 6 Java Spring Boot based microservice projects on one AWS t2.large (2 CPU & 8 GB RAM) EC2 machine. When we look at the CPU and RAM utilization of this EC2 machine, the CPU is underutilized, hardly hitting 1%, but RAM usage is always above 85%.
Now we want to add 2 more microservices. We definitely can't deploy these 2 services on the EC2 instance we already have, as the RAM will be exhausted. So, we need to add one more EC2 machine or upgrade the t2.large to a t2.xlarge (4 CPU & 16 GB RAM) to deploy the 2 new services, but we also want to be cost-effective.
We have been studying some of the AWS articles, and they say that running ECS services will be cost-effective as the billing is based on how much CPU and RAM is being used (reference - ECS Fargate Pricing). As our CPU is always underutilized, we are thinking that the ECS Fargate pricing policy will reduce our cost.
So, 3 questions with respect to cost saving for the above case,
If we prefer ECS, will that be really cost-effective than adding one more EC2 instance as ECS pricing is based on how much resources are utilized?
If we need to deploy 8 projects in ECS, will that create 8 EC2 instances, or can we still use 1 or 2 EC2 machines to deploy all 8 projects?
Should we completely re-think about our deployment process in a different way?
If we prefer ECS, will that be really cost-effective than adding one more EC2 instance as ECS pricing is based on how much resources are utilized?
This is worded a little awkwardly, but I think what you're asking is whether Fargate will be more cost effective than standard EC2 instances. That, I think, is pretty difficult for us to determine for you, though you should be able to do some quick estimations based on your real-world usage and compare that with the cost of the smallest necessary instance type.
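As a rough illustration of such an estimate (all rates and per-service sizes below are assumptions for us-east-1 on-demand at the time of writing, not quotes; plug in your own measurements and the current pricing pages):

```python
# Very rough monthly cost comparison: 8 small services on Fargate vs. one
# larger EC2 instance. All numbers are illustrative assumptions; check the
# current AWS pricing pages before deciding.
HOURS_PER_MONTH = 730

# Assumed Fargate sizing per service: 0.25 vCPU + 2 GB memory.
fargate_per_service_hourly = 0.25 * 0.04048 + 2 * 0.004445
fargate_monthly = 8 * fargate_per_service_hourly * HOURS_PER_MONTH

# Assumed EC2 alternative: a single t2.xlarge (4 vCPU, 16 GB RAM).
ec2_monthly = 0.1856 * HOURS_PER_MONTH

print(f"Fargate (8 services): ~${fargate_monthly:.0f}/month")
print(f"t2.xlarge on-demand:  ~${ec2_monthly:.0f}/month")
```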
To throw another option in the mix, consider Reserved Instances, as they come at a pretty steep discount if you're willing to reserve them for some period of time. From the linked doc, you'd be looking at somewhere around a 40% discount (compared to on-demand) for a 1 year reservation, and 60% for a 3 year reservation.
Additionally, consider different instance types, specifically the memory-optimized r4 and x1 families. These let you purchase more memory and less CPU.
If we need to deploy 8 projects in ECS, will that create 8 EC2 instances, or can we still use 1 or 2 EC2 machines to deploy all 8 projects?
Assuming you're not referring to Fargate here, the answer is ... it depends; it depends on how you design your system. You can host as many containers on an instance as you want (assuming you have the resources) -- but whether or not that's what you want from a high-availability / disaster scenario is for you to decide.
Should we completely re-think about our deployment process in a different way?
This will be answered when you come to a conclusion on your first question.