I'm trying to create a ECS Cluster with a service from task definition but get this error:
Resource handler returned message: "Error occurred during operation 'ECS Deployment Circuit Breaker was triggered'." (RequestToken: 657c3321-f13a-4213-3a8f-267e8865a914, HandlerErrorCode: GeneralServiceException)
The task definition json is:
{
"taskDefinitionArn": "arn:aws:ecs:eu-west-1:317470260945:task-definition/task-alberto-django:1",
"containerDefinitions": [
{
"name": "alberto-django-ctnr",
"image": "317470260945.dkr.ecr.eu-west-1.amazonaws.com/asm_repository:repo_personal_django",
"cpu": 0,
"links": [],
"portMappings": [
{
"containerPort": 80,
"hostPort": 0,
"protocol": "tcp"
}
],
"essential": true,
"entryPoint": [],
"command": [],
"environment": [
...
],
"environmentFiles": [],
"mountPoints": [],
"volumesFrom": [],
"secrets": [],
"dnsServers": [],
"dnsSearchDomains": [],
"extraHosts": [],
"dockerSecurityOptions": [],
"dockerLabels": {},
"ulimits": [],
"systemControls": []
}
],
"family": "task-alberto-django",
"executionRoleArn": "arn:aws:iam::317470260945:role/ecsTaskExecutionRole",
"networkMode": "bridge",
"revision": 1,
"volumes": [],
"status": "ACTIVE",
"requiresAttributes": [
{
"name": "com.amazonaws.ecs.capability.ecr-auth"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
},
{
"name": "ecs.capability.execution-role-ecr-pull"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
}
],
"placementConstraints": [],
"compatibilities": [
"EXTERNAL",
"EC2"
],
"requiresCompatibilities": [
"EC2"
],
"cpu": "1024",
"memory": "1024",
"runtimePlatform": {
"cpuArchitecture": "X86_64",
"operatingSystemFamily": "LINUX"
},
"registeredAt": "2022-10-27T10:15:22.553Z",
"registeredBy": "arn:aws:iam::317470260945:root",
"tags": [
{
"key": "ecs:taskDefinition:createdFrom",
"value": "ecs-console-v2"
},
{
"key": "ecs:taskDefinition:stackId",
"value": "arn:aws:cloudformation:eu-west-1:317470260945:stack/ECS-Console-V2-TaskDefinition-bb050013-6338-4b6e-83a3-54edd1775f3e/41ce7c80-55e0-11ed-bbd1-0add076f7d23"
}
]
}
Anybody could help me please ?
Related
When deploying Task in ECS Cluster with a public repo Docker Hub, the task always Stopped with this error in the Task Container:
Stopped reason
Cannotpullcontainererror:
pull image manifest has been retried 5 time(s):
failed to resolve ref docker.io/username/repo:
failed to do request:
Head "https://registry-1.docker.io/v2/username/repo/manifests/latest":
dial tcp 44.205.64.79:443: i/o timeout
This is my Task Definition:
{
"taskDefinitionArn": "arn:aws:ecs:ap-southeast-1:...:task-definition/taskname_task:6",
"containerDefinitions": [
{
"name": "containername_container",
"image": "username/repo",
"cpu": 0,
"links": [],
"portMappings": [
{
"name": "containername_container-8888-tcp",
"containerPort": 8888,
"hostPort": 8888,
"protocol": "tcp",
"appProtocol": "http"
}
],
"essential": true,
"entryPoint": [],
"command": [],
"environment": [],
"environmentFiles": [],
"mountPoints": [],
"volumesFrom": [],
"secrets": [],
"dnsServers": [],
"dnsSearchDomains": [],
"extraHosts": [],
"dockerSecurityOptions": [],
"dockerLabels": {},
"ulimits": [],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-create-group": "true",
"awslogs-group": "/ecs/taskname_task",
"awslogs-region": "ap-southeast-1",
"awslogs-stream-prefix": "ecs"
},
"secretOptions": []
},
"systemControls": []
}
],
"family": "taskname_task",
"taskRoleArn": "arn:aws:iam::...:role/ecsTaskExecutionRole",
"executionRoleArn": "arn:aws:iam::...:role/ecsTaskExecutionRole",
"networkMode": "awsvpc",
"revision": 6,
"volumes": [],
"status": "ACTIVE",
"requiresAttributes": [
{
"name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
},
{
"name": "ecs.capability.execution-role-awslogs"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
},
{
"name": "com.amazonaws.ecs.capability.task-iam-role"
},
{
"name": "ecs.capability.extensible-ephemeral-storage"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
},
{
"name": "ecs.capability.task-eni"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
}
],
"placementConstraints": [],
"compatibilities": [
"EC2",
"FARGATE"
],
"requiresCompatibilities": [
"FARGATE"
],
"cpu": "1024",
"memory": "2048",
"ephemeralStorage": {
"sizeInGiB": 21
},
"runtimePlatform": {
"cpuArchitecture": "X86_64",
"operatingSystemFamily": "LINUX"
},
"registeredAt": "...",
"registeredBy": "arn:aws:iam::...:root",
"tags": [
{
"key": "ecs:taskDefinition:createdFrom",
"value": "ecs-console-v2"
},
{
"key": "ecs:taskDefinition:stackId",
"value": "arn:aws:cloudformation:ap-southeast-1:...:stack/ECS-Console-V2-TaskDefinition-.../..."
}
]
}
I'm new to ECS and AWS also. I have try the request https://registry-1.docker.io/v2/username/repo/manifests/latest in the error of Task Container above and received this:
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"repository","Class":"","Name":"username/repo","Action":"pull"}]}]}
Is it about the request docker.io configured? I have done a lot of research but not figure anything out.
You can use Dockerhub image from within Amazon ECS Tasks
The format of Dockerhub image would be [registry-url]/[namespace]/[image]:[tag], you do not need registry-url for Dockerhub as the docker client assumes Dockerhub if one is not specified
Alternatively Docker official images should be present on ECR public in addition to Dockerhub so you can reference the ECR public repositories instead from within the ECS Tasks
Now Fargate uses the awsvpc network mode so essentially there are two options to run the task in Fargate:
If the task is being run inside a public subnet, then Auto assign Public IP must be enabled and we need to ensure that public subnet route table has Internet Gateway for internet connectivity to be able to pull the container image from public docker repository
If the task is being run from a private subnet then Auto assign Public IP must be disabled and we need to ensure that private subnet route table has an associated NAT Gateway allowing the task inside private subnet to pull the container image from public docker repository
After lots of tries, I have solved the problem by changing App environment from FARGATE to EC2 and the Network mode from awsvpc to bridge. Although this is not what my beginning intention to use FARGATE but it's solved the problem as well.
And I still don't know what, why, and how the problem has been caused and solved. Help me know.
This is my Task Definition in EC2:
{
"taskDefinitionArn": "arn:aws:ecs:ap-southeast-1:...:task-definition/taskname_task:16",
"containerDefinitions": [
{
"name": "containername_container",
"image": "username/repo",
"cpu": 0,
"links": [
"aws-otel-collector"
],
"portMappings": [
{
"name": "containername_container-8888-tcp",
"containerPort": 8888,
"hostPort": 8888,
"protocol": "tcp",
"appProtocol": "http"
}
],
"essential": true,
"entryPoint": [],
"command": [],
"environment": [],
"environmentFiles": [],
"mountPoints": [],
"volumesFrom": [],
"secrets": [],
"dnsServers": [],
"dnsSearchDomains": [],
"extraHosts": [],
"dockerSecurityOptions": [],
"dockerLabels": {},
"ulimits": [],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-create-group": "true",
"awslogs-group": "/ecs/taskname_task",
"awslogs-region": "ap-southeast-1",
"awslogs-stream-prefix": "ecs"
},
"secretOptions": []
},
"systemControls": []
}
],
"family": "taskname_task",
"executionRoleArn": "arn:aws:iam::...:role/ecsTaskExecutionRole",
"networkMode": "bridge",
"revision": 16,
"volumes": [],
"status": "ACTIVE",
"requiresAttributes": [
{
"name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
},
{
"name": "ecs.capability.execution-role-awslogs"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
}
],
"placementConstraints": [],
"compatibilities": [
"EC2"
],
"requiresCompatibilities": [
"EC2"
],
"cpu": "1024",
"memory": "922",
"runtimePlatform": {
"cpuArchitecture": "X86_64",
"operatingSystemFamily": "LINUX"
},
"registeredAt": "...",
"registeredBy": "arn:aws:iam::...:root",
"tags": [
{
"key": "ecs:taskDefinition:createdFrom",
"value": "ecs-console-v2"
},
{
"key": "ecs:taskDefinition:stackId",
"value": "arn:aws:cloudformation:ap-southeast-1:...:stack/ECS-Console-V2-TaskDefinition-.../..."
}
]
}
I'm trying to create an ArangoDB cluster in ECS using the default arangodb/arangodb-starter container but when I start my ECS Task, I'm getting an error saying that /usr/sbin/arangod was not found.
I pulled the arangodb/arangodb-starter image locally using docker pull and then I tagged it according to the push commands from ECR, I pushed it to ECR and I created an ECS Task (Fargate) for it. I created a service in ECS to start that task and the container starts, but the ECS Service logs show this error:
|INFO| Starting arangodb version 0.15.5, build 7832707 component=arangodb
[ERROR| Cannot find arangod (expected at /usr/sbin/arangod). component=arangodb
How to solve this:
1 - Install ArangoDB locally or run the ArangoDB starter in docker. (see README for details).
I started the exact same container by tag locally and it works. Why doesn't it work in ECS?
edit The ECS Task definition is in the snippet below:
{
"taskDefinitionArn": "arn:aws:ecs:eu-west-1:123456789:task-definition/dev-arangodb-server:1",
"containerDefinitions": [
{
"name": "dev-arangodb-server",
"image": "123456789.dkr.ecr.eu-west-1.amazonaws.com/arangodb:latest",
"cpu": 0,
"links": [],
"portMappings": [
{
"containerPort": 8529,
"hostPort": 8529,
"protocol": "tcp"
}
],
"essential": true,
"entryPoint": [],
"command": [],
"environment": [
{
"name": "ARANGO_ROOT_PASSWORD",
"value": "password"
}
],
"environmentFiles": [],
"mountPoints": [
{
"sourceVolume": "storage",
"containerPath": "/mnt/storage",
"readOnly": false
}
],
"volumesFrom": [],
"secrets": [],
"dnsServers": [],
"dnsSearchDomains": [],
"extraHosts": [],
"dockerSecurityOptions": [],
"dockerLabels": {},
"ulimits": [],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-create-group": "true",
"awslogs-group": "/ecs/dev-arangodb-server",
"awslogs-region": "eu-west-1",
"awslogs-stream-prefix": "ecs"
},
"secretOptions": []
},
"systemControls": []
}
],
"family": "dev-arangodb-server",
"taskRoleArn": "arn:aws:iam::123456789:role/dev-aws-ecs-ecr-power-user",
"executionRoleArn": "arn:aws:iam::123456789:role/ecsTaskExecutionRole",
"networkMode": "awsvpc",
"revision": 1,
"volumes": [
{
"name": "storage",
"host": {}
}
],
"status": "ACTIVE",
"requiresAttributes": [
{
"name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
},
{
"name": "ecs.capability.execution-role-awslogs"
},
{
"name": "com.amazonaws.ecs.capability.ecr-auth"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
},
{
"name": "com.amazonaws.ecs.capability.task-iam-role"
},
{
"name": "ecs.capability.execution-role-ecr-pull"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
},
{
"name": "ecs.capability.task-eni"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.29"
}
],
"placementConstraints": [],
"compatibilities": [
"EC2",
"FARGATE"
],
"requiresCompatibilities": [
"FARGATE"
],
"cpu": "1024",
"memory": "3072",
"runtimePlatform": {
"cpuArchitecture": "X86_64",
"operatingSystemFamily": "LINUX"
},
"registeredAt": "2022-11-03T08:43:25.264Z",
"registeredBy": "arn:aws:iam::123456789:user/MY_USER",
"tags": [
{
"key": "ecs:taskDefinition:createdFrom",
"value": "ecs-console-v2"
},
{
"key": "ecs:taskDefinition:stackId",
"value": "arn:aws:cloudformation:eu-west-1:123456789:stack/ECS-Console-V2-TaskDefinition-e1519bf7-ff78-423a-951d-2bc8d79242ec/925d88d0-5b53-11ed-97a3-066ee48e3b9b"
}
]
}
I tested on my cluster and seems like that image is not running with default options like yours task definition. That image is not documented so we don't know how to start it correctly
Please try this official image and do the same process. Remember the environment, or you will face this issue.
error: database is uninitialized and password option is not specified
You need to specify one of ARANGO_ROOT_PASSWORD, ARANGO_ROOT_PASSWORD_FILE, ARANGO_NO_AUTH and ARANGO_RANDOM_ROOT_PASSWORD
I am trying to deploy an ECS service. How do I resolve the following error?
Stopped | CannotStartContainerError: ResourceInitializationError: unable to create new container: mount callback failed on /tmp/containerd-mount2307274170: open /tmp/containerd-mount2307274170/etc/passwd: no such file or directory
Here is the task definition: (removed sensitive info)
{
"taskDefinitionArn": "arn:aws:ecs:us-east-1:000000000000:task-definition/my-app:2",
"containerDefinitions": [
{
"name": "my-app",
"image": "000000000000.dkr.ecr.us-east-1.amazonaws.com/my-app",
"cpu": 0,
"links": [],
"portMappings": [
{
"containerPort": 80,
"hostPort": 80,
"protocol": "tcp"
},
{
"containerPort": 443,
"hostPort": 443,
"protocol": "tcp"
}
],
"essential": true,
"entryPoint": [],
"command": [],
"environment": [],
"environmentFiles": [],
"mountPoints": [],
"volumesFrom": [],
"secrets": [],
"dnsServers": [],
"dnsSearchDomains": [],
"extraHosts": [],
"dockerSecurityOptions": [],
"dockerLabels": {},
"ulimits": [],
"systemControls": []
}
],
"family": "my-app-task",
"executionRoleArn": "arn:aws:iam::000000000000:role/ecsTaskExecutionRole",
"networkMode": "awsvpc",
"revision": 2,
"volumes": [],
"status": "ACTIVE",
"requiresAttributes": [
{
"name": "com.amazonaws.ecs.capability.ecr-auth"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
},
{
"name": "ecs.capability.execution-role-ecr-pull"
},
{
"name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
},
{
"name": "ecs.capability.task-eni"
}
],
"placementConstraints": [],
"compatibilities": [
"EC2",
"FARGATE"
],
"requiresCompatibilities": [
"FARGATE"
],
"cpu": "1024",
"memory": "3072",
"runtimePlatform": {
"cpuArchitecture": "X86_64",
"operatingSystemFamily": "LINUX"
},
"registeredAt": "2022-10-26T22:07:52.854Z",
"registeredBy": "arn:aws:iam::000000000000:root",
"tags": [
{
"key": "ecs:taskDefinition:createdFrom",
"value": "ecs-console-v2"
},
{
"key": "ecs:taskDefinition:stackId",
"value": "arn:aws:cloudformation:us-east-1:000000000000:stack/ECS-Console-V2-TaskDefinition-00000000-0000-0000-0000-000000000000/00000000-0000-0000-0000-000000000000"
}
]
}
i am trying to override the CPU Units for a ECS Task in the RunTask method of the SDK.
Task Definition
{
"ipcMode": null,
"executionRoleArn": "arn:aws:iam::111459517389:role/ecsTaskExecutionRole",
"containerDefinitions": [
{
...,
"portMappings": [
{
"hostPort": 80,
"protocol": "tcp",
"containerPort": 80
},
...
],
"command": null,
"linuxParameters": null,
"cpu": 256, # CONTAINER CPU Units (default)
"environment": [
{
"name": "ECS_IMAGE_PULL_BEHAVIOR",
"value": "prefer-cached"
}
],
"ulimits": null,
...
"name": "some-job-container"
}
],
"placementConstraints": [],
"memory": "8192", # TASK SIZE
"taskRoleArn": "arn:aws:iam::111459517389:role/ecsTaskExecutionRole",
"compatibilities": [
"EC2",
"FARGATE"
],
"taskDefinitionArn": "arn:aws:ecs:eu-west-3:111459517389:task-definition/some-definition:7",
"family": "some-job-dev",
"requiresAttributes": [
{
"targetId": null,
"targetType": null,
"value": null,
"name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
},
...
],
"pidMode": null,
"requiresCompatibilities": [
"FARGATE"
],
"networkMode": "awsvpc",
"cpu": "4096", # TASK SIZE
"revision": 7,
"status": "ACTIVE",
"inferenceAccelerators": null,
"proxyConfiguration": null,
"volumes": []
}
And here's the RunTask parameters
{
"taskDefinition":"some-job-dev",
"cluster":"some-cluster",
"overrides":{
"containerOverrides":[
{
"name":"some-job-container",
"command":[
"kosmos",
"segmentation-queue"
],
"cpu":4092,
"memory":8192
}
]
},
"networkConfiguration":{
"awsvpcConfiguration":{
"assignPublicIp":"ENABLED",
"subnets":[
"subnet-789",
"subnet-456",
"subnet-123"
]
}
}
}
When i run a task with these parameters, the memory of the container gets correctly overridden, but not the CPU.
I am following the ECS Documentation and still it doesn't work, am i missing something here ?
Notes:
My task launch type is Fargate
I had a similar issue, and its intermittent. Were you able to solve it?
I see you are passing values as an int, for me specifying them as a string helped.
I'm using AWS-CLI to register an ECR task definition. My task definition is like follows:
{
"family": "",
"taskRoleArn": "",
"executionRoleArn": "",
"networkMode": "none",
"containerDefinitions": [
{
"name": "",
"image": "",
"cpu": 0,
"memory": 0,
"memoryReservation": 0,
"links": [
""
],
"portMappings": [
{
"hostPort": 80,
"protocol": "tcp",
"containerPort": 80
}
],
"essential": true,
"entryPoint": [
""
],
"command": [
""
],
"environment": [
{
"name": "",
"value": ""
}
],
"mountPoints": [
{
"sourceVolume": "",
"containerPath": "",
"readOnly": true
}
],
"volumesFrom": [
{
"sourceContainer": "",
"readOnly": true
}
],
"linuxParameters": {
"capabilities": {
"add": [
""
],
"drop": [
""
]
},
"devices": [
{
"hostPath": "",
"containerPath": "",
"permissions": [
"mknod"
]
}
],
"initProcessEnabled": true
},
"hostname": "",
"user": "",
"workingDirectory": "",
"disableNetworking": true,
"privileged": true,
"readonlyRootFilesystem": true,
"dnsServers": [
""
],
"dnsSearchDomains": [
""
],
"extraHosts": [
{
"hostname": "",
"ipAddress": ""
}
],
"dockerSecurityOptions": [
""
],
"dockerLabels": {
"KeyName": ""
},
"ulimits": [
{
"name": "fsize",
"softLimit": 0,
"hardLimit": 0
}
],
"logConfiguration": {
"logDriver": "syslog",
"options": {
"KeyName": ""
}
}
}
],
"volumes": [
{
"name": "",
"host": {
"sourcePath": ""
}
}
],
"placementConstraints": [
{
"type": "memberOf",
"expression": ""
}
],
"requiresCompatibilities": [
"EC2"
],
"cpu": "10",
"memory": "600"
}
, which is basically almost identical to the auto-generated skeleton:
aws ecs register-task-definition --generate-cli-skeleton
But it looks that when using the command
aws ecs register-task-definition --family taskDef --cli-input-json taskDef-v1.json --region us-east-2
I get this:
An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Container links should not have a cycle
What am I doing wrong?
That particular error is caused because you have empty links defined:
"links": [
""
]
The CLI skeleton is a template you need to edit - it has many empty values for fields that are not required. The minimum task definition template is something like:
{
"containerDefinitions": [
{
"name": "my-task-name",
"image": "my-registry/my-image",
"memoryReservation": 256,
"cpu": 256
}
]
}