How to scale ECS sidecar containers independently

How can I create and independently scale sidecar containers in ECS Fargate using the AWS console?
The task-creation step allows adding multiple containers with different CPU and memory configurations, but no independent scaling option. The ECS service launch, on the other hand, only offers scaling at the task level. Also, ECS doesn't clearly document how a container can be designated as a sidecar.

You can't independently scale a sidecar in ECS; the unit of scaling is the task.
You specify the CPU and memory of the Fargate task (e.g. 512/1024); these are the resources assigned to that task and what you will pay for on your bill.
Within that task you can have 1..n containers. Each can have its own CPU and memory configuration, but these are allocated within the constraints of the task: the combined CPU/memory values of all containers cannot exceed those assigned to the task (e.g. you couldn't have a 512/1024 task and assign 2048 MB of memory to a container within it).
This effectively lets you weight the containers in your task, e.g. giving an nginx sidecar less weight than your main application.
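To make that concrete: when you do need to scale, you scale the whole task through the service's desired count, for example with Application Auto Scaling. Below is a minimal sketch of a target-tracking policy you could pass to aws application-autoscaling put-scaling-policy --cli-input-json; the cluster and service names (my-cluster, my-service) are placeholders, and the service must already be registered as a scalable target on the ecs:service:DesiredCount dimension. Every scale-out adds a whole task, i.e. another copy of the main app and its sidecar together.
{
  "PolicyName": "cpu-target-tracking",
  "ServiceNamespace": "ecs",
  "ResourceId": "service/my-cluster/my-service",
  "ScalableDimension": "ecs:service:DesiredCount",
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingScalingPolicyConfiguration": {
    "TargetValue": 60.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleInCooldown": 60,
    "ScaleOutCooldown": 60
  }
}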
As for "ECS doesn't clearly mention how a container can be specified as a sidecar":
A 'sidecar' is just a container that shares resources (network, disk, etc.) with another container. Creating a task definition with two containers gives you a sidecar.
Below is a sample task definition that has nginx fronting a Flask app. It contains:
A Flask app listening on port 5000.
An nginx container listening on port 80, with an environment variable upstream=http://localhost:5000. As both containers share the same network, they can communicate via localhost (so nginx can forward requests to the Flask app).
Both containers have access to a shared volume ("shared-volume").
{
  "containerDefinitions": [
    {
      "name": "main-app",
      "image": "ghcr.io/my-flask-app",
      "portMappings": [
        {
          "containerPort": 5000,
          "hostPort": 5000,
          "protocol": "tcp"
        }
      ],
      "mountPoints": [
        {
          "sourceVolume": "shared-volume",
          "containerPath": "/scratch"
        }
      ]
    },
    {
      "name": "sidecar",
      "image": "ghcr.io/my-nginx",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 80,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "upstream",
          "value": "http://localhost:5000"
        }
      ],
      "mountPoints": [
        {
          "sourceVolume": "shared-volume",
          "containerPath": "/scratch"
        }
      ]
    }
  ],
  "networkMode": "awsvpc",
  "revision": 1,
  "volumes": [
    {
      "name": "shared-volume",
      "host": {}
    }
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "512",
  "memory": "1024"
}
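If it helps to see where scaling actually happens: the task definition above would then be run as a service, and desiredCount on the service is the only scaling knob. A minimal, hypothetical create-service input (aws ecs create-service --cli-input-json), assuming the task definition was registered under a family named flask-with-nginx; the cluster name, subnet and security group IDs are placeholders:
{
  "cluster": "my-cluster",
  "serviceName": "flask-with-nginx",
  "taskDefinition": "flask-with-nginx:1",
  "desiredCount": 2,
  "launchType": "FARGATE",
  "networkConfiguration": {
    "awsvpcConfiguration": {
      "subnets": ["subnet-aaaa1111"],
      "securityGroups": ["sg-bbbb2222"],
      "assignPublicIp": "ENABLED"
    }
  }
}
Raising desiredCount to 3 launches a third copy of both containers; there is no way to ask for a third nginx only.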

Related

AWS ECS Tasks are killed by OOM killer without exceeding memory reserves

We have an Elixir application that is frequently being killed by Amazon ECS with "OutOfMemoryError: Container killed due to memory usage".
We're using Fargate for our deployments, with 8 GB RAM containers, and we have 4 instances running at a time. When the OutOfMemoryError happens, all containers are killed. We track memory usage with AWS and New Relic, and at no time does the memory come near the container's maximum allowable limit. Additionally, all containers are killed at the same time. Seemingly it's killing the containers at only 60% memory usage.
What we're trying to figure out is why this is happening and what we can do to prevent these unexpected memory-kill events. Even if one container were killed as it approached 60%, we'd be in a better place than having all containers killed at the same time.
This graph shows the memory utilization (on average) of the ECS application. You can see at the peak it uses about 5GB of memory (blue line), which is well below the reserved memory (orange line). The green line shows the containers (tasks) restarting.
Here's a snippet of our Task Definition (redacted where necessary):
{
  "taskDefinitionArn": "our-task",
  "containerDefinitions": [
    {
      "name": "our-application",
      "image": "image-on-registry",
      "cpu": 0,
      "portMappings": [
        {
          "containerPort": 8080,
          "hostPort": 8080,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "environment": []
    }
  ],
  "networkMode": "awsvpc",
  "status": "ACTIVE",
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "4096",
  "memory": "8192"
}
This issue has been baffling us for a while and causes downtime every day or every other day. Thanks for your help!
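For reference, the task definition above sets memory only at the task level ("memory": "8192"), so all containers in the task share that single limit. Container definitions can also carry a per-container memory (hard limit, in MiB) and memoryReservation (soft limit); a purely hypothetical variant of the container definition above, just to show where those fields live:
{
  "name": "our-application",
  "image": "image-on-registry",
  "portMappings": [
    {
      "containerPort": 8080,
      "hostPort": 8080,
      "protocol": "tcp"
    }
  ],
  "essential": true,
  "environment": [],
  "memory": 7168,
  "memoryReservation": 6144
}
This only illustrates per-container limits; it is not a diagnosis of the kills at 60% usage described above.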

Docker links with awsvpc network mode

I have a Java webapp deployed in ECS using the tomcat:8.5-jre8-alpine image. The network mode for this task is awsvpc; I have many of these tasks running across 3 EC2 instances fronted by an ALB.
This is working fine but now I want to add an nginx reverse-proxy in front of each tomcat container, similar to this example: https://github.com/awslabs/ecs-nginx-reverse-proxy/tree/master/reverse-proxy.
My abbreviated container definition file is:
{
  "containerDefinitions": [
    {
      "name": "nginx",
      "image": "<NGINX reverse proxy image URL>",
      "memory": 256,
      "cpu": 256,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 80,
          "protocol": "tcp"
        }
      ],
      "links": [
        "app"
      ]
    },
    {
      "name": "app",
      "image": "<app image URL>",
      "memory": 1024,
      "cpu": 1024,
      "essential": true
    }
  ],
  "volumes": [],
  "networkMode": "awsvpc",
  "placementConstraints": [],
  "family": "application-stack"
}
When I try to save a new task definition I receive the error: "links are not supported when the network type is awsvpc"
I am using the awsvpc network mode because it gives me granular control over the inbound traffic via a security group.
Is there any way to create a task definition with 2 linked containers when using awsvpc network mode?
You don't need the linking at all, because awsvpc lets you reference other containers simply by using
localhost:8080
(or whatever port your other container is mapped to)
in your nginx config file.
So remove links from your JSON and use localhost:{container port} in the nginx config. Simple as that.
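For illustration, a hypothetical reworking of the abbreviated definition above with the links removed; in awsvpc mode both containers share one network interface, so nginx can proxy to the app on localhost (the app's port 8080 below is an assumption):
{
  "containerDefinitions": [
    {
      "name": "nginx",
      "image": "<NGINX reverse proxy image URL>",
      "memory": 256,
      "cpu": 256,
      "essential": true,
      "portMappings": [
        { "containerPort": 80, "protocol": "tcp" }
      ]
    },
    {
      "name": "app",
      "image": "<app image URL>",
      "memory": 1024,
      "cpu": 1024,
      "essential": true,
      "portMappings": [
        { "containerPort": 8080, "protocol": "tcp" }
      ]
    }
  ],
  "volumes": [],
  "networkMode": "awsvpc",
  "placementConstraints": [],
  "family": "application-stack"
}
The nginx config would then point its proxy_pass (or equivalent upstream setting) at http://localhost:8080.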
Alternatively, if you want to use a reverse proxy you can stop using links altogether, because you can rely on service discovery or let the reverse proxy route to your dependency.
If you still want link-like behaviour instead of the reverse proxy, you can use Consul and Fabio; both services can be run as Docker containers.
With this approach there is no need for awsvpc, and you can use Consul for service discovery.
Hope it helps!

ECS task_definition environment variable needs IP address

So I have two container definitions for a service that I am trying to run on ECS. One of the services (Kafka) requires the IP address of the other service (Zookeeper). In the pure Docker world we can achieve this using the container name; however, in AWS the container name has a suffix appended by AWS to create a unique name, so how do we achieve the same behaviour?
Currently my Terraform task definitions look like:
[
  {
    "name": "${service_name}",
    "image": "zookeeper:latest",
    "cpu": 1024,
    "memory": 1024,
    "essential": true,
    "portMappings": [
      { "containerPort": ${container_port}, "protocol": "tcp" }
    ],
    "networkMode": "awsvpc"
  },
  {
    "name": "kafka",
    "image": "ches/kafka:latest",
    "environment": [
      { "name": "ZOOKEEPER_IP", "value": "${service_name}" }
    ],
    "cpu": 1024,
    "memory": 1024,
    "essential": true,
    "networkMode": "awsvpc"
  }
]
I don't know enough about the rest of the setup to give really concrete advice, but there are a few options:
Put both containers in the same task so they can talk to each other directly (via localhost in awsvpc mode, or links in bridge mode); see the sketch below.
Use Route 53 auto naming to get DNS names for each service task and specify those in the task definition environment; this is also described as ECS service discovery.
Put the service tasks behind a load balancer, possibly with host matching on the load balancer, and specify the Route 53 DNS names in the task definition environment.
Consider using some kind of service discovery / service mesh framework (Consul, for instance).
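For the first option, a rough sketch of both containers in one task definition (ports and names here are illustrative, 2181 being Zookeeper's default client port); with awsvpc the containers share a network namespace, so Kafka can reach Zookeeper on localhost instead of needing its IP:
[
  {
    "name": "zookeeper",
    "image": "zookeeper:latest",
    "cpu": 1024,
    "memory": 1024,
    "essential": true,
    "portMappings": [
      { "containerPort": 2181, "protocol": "tcp" }
    ]
  },
  {
    "name": "kafka",
    "image": "ches/kafka:latest",
    "cpu": 1024,
    "memory": 1024,
    "essential": true,
    "environment": [
      { "name": "ZOOKEEPER_IP", "value": "localhost" }
    ]
  }
]
Note that the network mode belongs on the task definition itself (network_mode = "awsvpc" on the aws_ecs_task_definition resource in Terraform), not inside the individual container definitions.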
There are posts describing some of the alternatives. Here's one:
How to setup service discovery in Amazon ECS

AWS ECS Service for Wordpress

I created a service for WordPress on AWS ECS with the following container definitions:
{
  "containerDefinitions": [
    {
      "name": "wordpress",
      "links": [
        "mysql"
      ],
      "image": "wordpress",
      "essential": true,
      "portMappings": [
        {
          "containerPort": 0,
          "hostPort": 80
        }
      ],
      "memory": 250,
      "cpu": 10
    },
    {
      "environment": [
        {
          "name": "MYSQL_ROOT_PASSWORD",
          "value": "password"
        }
      ],
      "name": "mysql",
      "image": "mysql",
      "cpu": 10,
      "memory": 250,
      "essential": true
    }
  ],
  "family": "wordpress"
}
Then I went over to the public IP and completed the WordPress installation. I also added a few posts.
But now, when I update the service to use an updated task definition (an updated mysql container image)
"image": "mysql:latest"
I lose all the created posts and data, and WordPress prompts me to install again.
What am I doing wrong?
I also tried to use host volumes, but to no avail - it creates a bind mount and a Docker-managed volume (I did a docker inspect on the container).
So, every time I update the task it resets WordPress.
If your container needs access to the original data each time it starts, you require a file system that your containers can connect to regardless of which instance they're running on. That's where EFS comes in.
EFS allows you to persist data onto a durable shared file system that all of the ECS container instances in the ECS cluster can use.
Step-by-step Instructions to Setup an AWS ECS Cluster
Using Data Volumes in Tasks
Using Amazon EFS to Persist Data from Amazon ECS Containers
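As a rough illustration of the data-volume approach from those links, assuming EFS (or any persistent storage) is mounted at /mnt/efs on every container instance (a hypothetical path), the mysql container can map its data directory to a host volume so the database survives task replacement; the wordpress container is omitted here for brevity:
{
  "family": "wordpress",
  "volumes": [
    {
      "name": "mysql-data",
      "host": { "sourcePath": "/mnt/efs/wordpress-mysql" }
    }
  ],
  "containerDefinitions": [
    {
      "name": "mysql",
      "image": "mysql",
      "cpu": 10,
      "memory": 250,
      "essential": true,
      "environment": [
        { "name": "MYSQL_ROOT_PASSWORD", "value": "password" }
      ],
      "mountPoints": [
        { "sourceVolume": "mysql-data", "containerPath": "/var/lib/mysql" }
      ]
    }
  ]
}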

Where are the volumes located when using ECS and Fargate?

I have the following setup (I've stripped out the non-important fields):
{
  "ECSTask": {
    "Type": "AWS::ECS::TaskDefinition",
    "Properties": {
      "ContainerDefinitions": [
        {
          "Name": "mysql",
          "Image": "mysql",
          "MountPoints": [{"SourceVolume": "mysql", "ContainerPath": "/var/lib/mysql"}]
        }
      ],
      "RequiresCompatibilities": ["FARGATE"],
      "Volumes": [{"Name": "mysql"}]
    }
  }
}
It seems to work (the container does start properly), but I'm not quite sure where exactly this volume is being saved. I assumed it would be an EBS volume, but I don't see it there. I guess it's internal to my task - but in that case, how do I access it? How can I control its limits (min/max size, etc.)? How can I create a backup for this volume?
Thanks.
Fargate does not support persistent volumes. Any volumes attached to Fargate tasks are ephemeral and cannot be initialized from an external source or backed up, sadly.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_data_volumes.html