We have an ECS cluster with 3 EC2 instances. In this cluster we have a bunch of services running, all separate apps with 1 task.
Frequently, when I try to run a new service, ECS tries to place the task on an EC2 instance that doesn't have enough memory/CPU, while a different instance with more than enough capacity is available. In fact, there are currently two instances running 5 tasks each, and one instance running only 1.
What could be the reason for this uneven distribution of tasks? I've tried every possible task placement strategy, but that doesn't seem to make a difference.
Most recent error message:
service [service name] was unable to place a task because no container instance met all of its requirements. The closest matching container-instance [instance-id] has insufficient memory available.
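For reference, this is the kind of placement strategy configuration I mean, sketched here in Terraform with placeholder names; a spread on instanceId is one of the strategies I tried, and I expected it to balance tasks across instances:

resource "aws_ecs_service" "app" {
  name            = "my-app"                         # placeholder name
  cluster         = aws_ecs_cluster.main.id          # placeholder reference
  task_definition = aws_ecs_task_definition.app.arn  # placeholder reference
  desired_count   = 1
  launch_type     = "EC2"

  # Spread tasks evenly across container instances.
  ordered_placement_strategy {
    type  = "spread"
    field = "instanceId"
  }
}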
Related
I have two clusters in my Amazon Elastic Container Service, one for production and one as a testing environment.
Each cluster has three different services with one task each. There should be 6 tasks running.
To update a task, I always push my new Docker image to the Elastic Container Registry and restart the service with the new image.
For about 2 weeks now I have only been able to start 2 tasks in total. It doesn't depend on the cluster; just 2 tasks overall.
The tasks that should start appear to be stuck in the "In Progress" rollout state.
Has anybody had a similar problem, or does anyone know how to fix this?
I wrote to AWS Support about this issue and received the following reply:
"After a review, I have noticed that the XXXXXXX region has not yet been activated. In order to activate the region you will have to launch an instance, I recommended a Free Tier EC2 instance.
After the EC2 instance has been launched you can terminate it thereafter.
"
I don't know why, but it's working now.
I have an ECS service which has a requirement that it should be running on exactly 2 container instances. How can this be achieved? I could not find any specific place in the container definition where I can set the number of ECS instances.
There are a few ways to achieve this. One is to deploy your ECS service on Fargate. When you do so and set your task count to, say, 2, ECS will deploy your 2 tasks onto 2 separate and dedicated operating systems/VMs managed by AWS. Two or more tasks can never be colocated on one of these VMs. It's always a 1 task : 1 VM relationship.
If you are using EC2 as your launch type and you want to make sure your service deploys exactly 1 task per instance, the easiest way is to configure your ECS service with the DAEMON scheduling strategy. In this case you don't need to (and in fact can't) configure the number of tasks in your service, because ECS will always deploy 1 task per EC2 instance that is part of the cluster.
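For illustration, a minimal sketch of such a service in Terraform might look like this (resource and name values are placeholders, not part of the original setup):

resource "aws_ecs_service" "per_instance" {
  name                = "my-daemon-service"             # placeholder name
  cluster             = aws_ecs_cluster.main.id         # placeholder reference
  task_definition     = aws_ecs_task_definition.app.arn # placeholder reference
  launch_type         = "EC2"
  scheduling_strategy = "DAEMON"  # one task per active container instance; no desired_count needed
}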
When creating the service you will find the field Number of tasks, which is exactly how many copies of the task (containers) you want running. If you enter 1 it will launch only 1, and if you enter 2 it will launch 2. I hope that helps.
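If you define the service as code rather than through the console, the same setting is the desired count; a rough Terraform sketch with placeholder names:

resource "aws_ecs_service" "app" {
  name            = "my-service"                      # placeholder name
  cluster         = aws_ecs_cluster.main.id           # placeholder reference
  task_definition = aws_ecs_task_definition.app.arn   # placeholder reference
  desired_count   = 2   # "Number of tasks": ECS launches and maintains exactly 2 copies of the task
}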
I have an ECS cluster which has two instances, as defined in an Auto Scaling group with a minimum capacity of 2.
I have defined the ECS service to run two containers per instance when it is created or updated, so it launches two containers per ECS instance in the cluster.
Now, when I stop/terminate an instance in that cluster, a new instance automatically comes up since the Auto Scaling group has a minimum capacity of two.
The problem is that when the new instance comes up in the Auto Scaling group, it does not run the two tasks that are defined to be in the service; instead, 4 tasks run on one ECS instance and the other, new ECS instance doesn't have any tasks running on it.
How can I make sure that whenever a new instance comes up in the Auto Scaling group, it also runs those two tasks?
If you want those two EC2 instances to be dedicated to those 4 tasks, you can adjust the memory limit in the task definition so that each task requires roughly half of one ECS instance's memory.
Say you have a t3.small: set the task's memory limit to a bit under half of what the instance registers with ECS, for example around 900 MiB (the instance registers slightly less than its full 2 GiB because some memory is reserved for the OS and ECS agent). That way a single t3.small can only fit 2 of these tasks; whenever you add another t3.small, it supplies the missing required memory and the other two tasks will run on that new instance.
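As a rough sketch (Terraform, with placeholder names and the assumed ~900 MiB figure), the hard limit would go on the task definition:

resource "aws_ecs_task_definition" "app" {
  family                   = "my-app"   # placeholder name
  requires_compatibilities = ["EC2"]
  memory                   = "900"      # hard task-level limit in MiB; two of these fit on one t3.small
  cpu                      = "512"      # placeholder
  container_definitions = jsonencode([
    {
      name      = "app"
      image     = "my-app:latest"       # placeholder image
      essential = true
    }
  ])
}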
You can also consider running 1 task per ECS instance. To do so, choose the Daemon service type when creating the service and give your task more memory in the task definition; that way every new EC2 instance will always run 1 task for this service.
When I created a service in Amazon EC2 Container Service, there were 2 options for the service type: REPLICA and DAEMON.
What is the exact difference between them?
Replica services place and maintain a desired number of tasks across your cluster. Daemon services place and maintain one copy of your task for each container instance.
Your ECS cluster most likely consists of multiple EC2 instances (= container instances).
According to the AWS documentation:
Replica: The replica scheduling strategy places and maintains the desired number of tasks across your cluster. By default, the service scheduler spreads tasks across Availability Zones. You can use task placement strategies and constraints to customize task placement decisions.
Daemon: The daemon scheduling strategy deploys exactly one task on each active container instance that meets all of the task placement constraints that you specify in your cluster. When using this strategy, there is no need to specify a desired number of tasks, a task placement strategy, or use Service Auto Scaling policies.
This means that, if you have an ECS cluster with three EC2 instances and you want to launch a new service with four tasks, the following will happen:
Replica: Your four tasks will start randomly distributed over your container instances. This can be all four on one instance or any other random distribution. This is the use case for normal microservices.
Daemon: For a daemon you do not specify how many tasks you want to run. A daemon service automatically scales depending on the amount of EC2 instances you have. In this case, three. A daemon task is a pattern used when building microservices where a task is deployed onto each instance in a cluster to provide common supporting functionality like logging, monitoring, or backups for the tasks running your application code.
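For illustration, a replica service that customizes placement as described above might be sketched in Terraform like this (names, counts, and references are placeholders):

resource "aws_ecs_service" "replica_example" {
  name                = "my-replica-service"            # placeholder name
  cluster             = aws_ecs_cluster.main.id         # placeholder reference
  task_definition     = aws_ecs_task_definition.app.arn # placeholder reference
  scheduling_strategy = "REPLICA"                       # the default
  desired_count       = 4
  launch_type         = "EC2"

  # Strategies are evaluated in order: spread across AZs first, then pack by memory.
  ordered_placement_strategy {
    type  = "spread"
    field = "attribute:ecs.availability-zone"
  }
  ordered_placement_strategy {
    type  = "binpack"
    field = "memory"
  }
}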
After pushing a new image of my container I run terraform apply to update the task definition. This seems to work fine, but in the ECS service's list of tasks I can see the task as inactive, and I have an event:
service blahblah was unable to place a task because no container instance met all of its requirements. The closest matching container-instance [guid here] is already using a port required by your task.
The thing is, the site is still active and working.
This is more of an ECS issue than a Terraform issue: Terraform is updating your task definition and updating the service to use the new task definition, but ECS is unable to schedule new tasks onto the container instances. That is because you are (presumably) defining a specific port that the container must run on and directly mapping it to the host, or using host networking instead of bridge (or the newer awsvpc network mode).
ECS has a couple of parameters to control the behaviour of an update to the service: minimum healthy percent and maximum healthy percent. By default these are set to 100% and 200% respectively meaning that ECS will attempt to deploy a new task matching the new task definition and wait for it to be considered healthy (such as passing ELB health checks) before terminating the old tasks.
In your case you have as many tasks as you have container instances in your cluster and so when it attempts to schedule a new task on to the cluster it is unable to place it because the port is already bound to by the old task. You could also find yourself in this position if you had placement constraints on your task/service.
Because the minimum healthy percent is set to 100% it is unable to schedule the removal of any of the old tasks that would then free up a placement option for a new task.
You could run more container instances in the cluster than you have instances of the task, which would allow ECS to deploy new tasks before removing old tasks from the other instances. Alternatively, you could change the minimum healthy percent (deployment_minimum_healthy_percent in Terraform's ECS service resource) to a number less than 100 so that deployments can proceed.
For example, if you normally deploy 3 instances of the task in the service then setting the minimum healthy percent to 50% would allow ECS to remove one task from the service before scheduling a new task matching the new task definition. It would then proceed with a rolling upgrade, making sure the new task is healthy before replacing the old task.
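In Terraform that would look roughly like this (service name, references, and counts are illustrative):

resource "aws_ecs_service" "web" {
  name                               = "my-web-service"   # placeholder name
  cluster                            = aws_ecs_cluster.main.id          # placeholder reference
  task_definition                    = aws_ecs_task_definition.web.arn  # placeholder reference
  desired_count                      = 3
  deployment_minimum_healthy_percent = 50    # may drop to 2 of 3 running tasks during a deployment
  deployment_maximum_percent         = 100   # never run more than 3 tasks at once
}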
Setting the minimum healthy percent to 0% would mean that ECS can stop all of the tasks running before starting new tasks but this would obviously lead to a potential (but not guaranteed) service interruption.
Alternatively, you could remove the port conflict altogether by switching away from host networking or static host port mappings, if that is viable for your service.
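If you do switch, one option (sketched below with placeholder values) is bridge networking with a dynamic host port, so Docker assigns each task an ephemeral host port and old and new tasks can coexist on the same instance; the load balancer target group then tracks the dynamically assigned ports as tasks register:

resource "aws_ecs_task_definition" "web" {
  family       = "my-web"     # placeholder name
  network_mode = "bridge"
  container_definitions = jsonencode([
    {
      name      = "web"
      image     = "my-web:latest"   # placeholder image
      memory    = 512               # container-level hard limit in MiB
      essential = true
      portMappings = [
        {
          containerPort = 8080      # the port the app listens on (placeholder)
          hostPort      = 0         # 0 = let Docker assign an ephemeral host port
        }
      ]
    }
  ])
}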