I am using the AWS console to configure my services on ECS. Before the interface change, I could access a screen that showed why a task had failed (like in the example below); it was reachable from the ECS service events by clicking on the task ID. Does anyone know how to get the task stopped reason data with the new interface?
You can see essentially the same message if you do the following steps:
Select your service from your ECS cluster:
Go to the Configuration and tasks tab:
Scroll down and select a task. You would want to choose one that was stopped by the failing deployment:
You should see the Stopped reason message:
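If you'd rather not click through the console, the same information is available from the ECS DescribeTasks API. A minimal boto3 sketch, where the cluster name is a placeholder and the parsing helper is my own (not part of the AWS SDK):

```python
def extract_stopped_reasons(describe_tasks_response):
    """Pure helper: map each task ARN to its stoppedReason."""
    return {
        t["taskArn"]: t.get("stoppedReason", "")
        for t in describe_tasks_response.get("tasks", [])
    }

def stopped_task_reasons(cluster):
    """List recently stopped tasks in a cluster and return their stop reasons."""
    import boto3  # AWS SDK for Python; imported here to keep the sketch self-contained
    ecs = boto3.client("ecs")
    arns = ecs.list_tasks(cluster=cluster, desiredStatus="STOPPED")["taskArns"]
    if not arns:
        return {}
    return extract_stopped_reasons(ecs.describe_tasks(cluster=cluster, tasks=arns))
```

Note that stopped tasks are only returned for about an hour after they stop, so run this soon after the failure.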
I'm trying to run a new ECS task on a new cluster using Fargate as the launch type. I'm doing everything via the AWS console. However, when I go to run my task I simply get the error in a red banner
"Unable to run task",
nothing else. I've tried looking for the service events described here: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-event-messages.html#service-event-messages-1
but can't find them. Does anyone know how to debug this?
You can show the stopped tasks, where you can see the reason each container stopped. You can also find the logs in the details of the particular task.
I have a Fargate task that I want to run as a scheduled task every n minutes. I have a task definition that works perfectly as expected (with CloudWatch logs as expected and VPC connections working properly), that is, when I run it as a task or a service. However, when I try to run it as a scheduled task, it does not start. I checked the CloudWatch logs, but there are no log entries in the log group. If I look up the metrics page, I see a FailedInvocations entry under the metric name.
I understand that it is a bit tricky to schedule a task in Fargate, as we have to go to CloudWatch rules and update the scheduled task there in order to add subnets and define a security group, since this option is not available when creating the scheduled task through my ECS cluster page.
I have also studied the documentation page here, and checked this question. But I still cannot understand why it does not work. Thank you in advance.
This seems like an issue with the AWS web interface for scheduled tasks, as it doesn't let me set assignPublicIp to ENABLED.
Without this, the Fargate task cannot pull images from the ECR registry. However, when I started the task with boto3 from a Lambda function triggered by CloudWatch rules, it worked fine.
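For reference, the boto3 call that works from a Lambda function looks roughly like this. The cluster, task definition, subnet and security group names are placeholders, and the network-config helper is my own, not an SDK function:

```python
def fargate_network_config(subnets, security_groups):
    """awsvpcConfiguration with a public IP, so a Fargate task in a public
    subnet can reach ECR without a NAT gateway."""
    return {
        "awsvpcConfiguration": {
            "subnets": subnets,
            "securityGroups": security_groups,
            "assignPublicIp": "ENABLED",
        }
    }

def run_fargate_task(cluster, task_definition, subnets, security_groups):
    """Start a one-off Fargate task, e.g. from a Lambda triggered by a
    CloudWatch Events rule."""
    import boto3  # AWS SDK for Python
    ecs = boto3.client("ecs")
    return ecs.run_task(
        cluster=cluster,
        launchType="FARGATE",
        taskDefinition=task_definition,
        networkConfiguration=fargate_network_config(subnets, security_groups),
    )
```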
I am trying to use my Docker Hub image with AWS ECS. I have created a repository, a cluster, and a task; when running the task I get the error "Essential container in task exited" with exit code 1. While trying to get the exact error details, I found that some of my variables are shown as not configured.
Screenshots of the errors are attached: cluster details and error detail.
You should set up the "Log Configuration" by specifying a log configuration in your task definition. I would recommend the awslogs configuration type, as it lets you see the logs from your container right inside the console.
Once you do that, you will get a new tab on the task details screen called "Logs", and you can click it to see the output from your container as it was starting up. You will probably see some kind of error or crash, as the "Essential container exited" error means that the container was expected to stay up and running, but it exited.
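The logConfiguration block in the container definition would look roughly like this; the log group name and region are placeholders, and the helper is my own convention:

```python
def awslogs_configuration(log_group, region, stream_prefix="ecs"):
    """logConfiguration block for a container definition using the awslogs driver."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": log_group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }
```

This dict goes under each container's logConfiguration when registering the task definition. The log group must already exist, unless you also set the awslogs-create-group option.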
I had to expand the corresponding container details in the stopped task and check the "Details" --> "Status reason", which revealed the following issue:
OutOfMemoryError: Container killed due to memory usage
Exit Code 137
After increasing the available container memory, it worked fine.
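Exit code 137 follows the usual Unix convention of 128 + signal number, i.e. SIGKILL (9), which on ECS typically means the container hit its memory limit and was killed. A small helper illustrating the convention (my own, not an AWS API):

```python
import signal

def describe_exit_code(code):
    """Decode a container exit code; values above 128 mean 128 + signal number."""
    if code > 128:
        sig = code - 128
        try:
            name = signal.Signals(sig).name  # e.g. SIGKILL for 9
        except ValueError:
            name = "unknown signal"
        hint = ""
        if sig == 9:
            hint = "; often the OOM killer, try raising the container memory limit"
        return f"killed by signal {sig} ({name}){hint}"
    return f"application exited with status {code}"
```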
I had a similar issue. You can set up CloudWatch logs; there you can get the full error log, which will help you debug and fix the issue. Below is the relevant part taken from the official AWS documentation.
Using the auto-configuration feature to create a log group
When registering a task definition in the Amazon ECS console, you have the option to allow Amazon ECS to auto-configure your CloudWatch logs. This option creates a log group on your behalf using the task definition family name with ecs as the prefix.
To use the log group auto-configuration option in the Amazon ECS console
Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.
In the left navigation pane, choose Task Definitions, then Create new Task Definition; alternatively, you can create a revision of an existing task definition.
Select your compatibility option and choose Next Step.
Choose Add container.
In the Storage and Logging section, for Log configuration, choose Auto-configure CloudWatch Logs.
Enter your awslogs log driver options. For more information, see Specifying a log configuration in your task definition.
Complete the rest of the task definition wizard.
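Once the auto-created log group exists (as described above, it is named after the task definition family with an ecs prefix), you can also tail the container output with boto3 instead of the console. The family name is a placeholder and both helpers are my own:

```python
def ecs_log_group(family):
    """Log group name that the console's auto-configure option creates."""
    return f"/ecs/{family}"

def recent_log_messages(family, limit=50):
    """Fetch recent log lines from the newest stream in the task's log group."""
    import boto3  # AWS SDK for Python
    logs = boto3.client("logs")
    group = ecs_log_group(family)
    streams = logs.describe_log_streams(
        logGroupName=group, orderBy="LastEventTime", descending=True, limit=1
    )["logStreams"]
    if not streams:
        return []
    events = logs.get_log_events(
        logGroupName=group,
        logStreamName=streams[0]["logStreamName"],
        limit=limit,
    )["events"]
    return [e["message"] for e in events]
```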
I have been stuck on the same error.
The problem ended up being "Task tagging configuration" > "Enable ECS managed tags"; disabling it fixed the issue.
When this parameter is enabled, Amazon ECS automatically tags your tasks with two tags corresponding to the cluster and service names. These tags allow you to identify tasks easily in your AWS Cost and Usage Report.
Billing permissions are separate and are not assigned by default when you create a new ECS cluster and task definition with the default settings. This is why ECS was failing with "STOPPED: Essential container in task exited".
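If you start tasks programmatically, the same switch is exposed on the RunTask API as enableECSManagedTags, so you can rule this out from code as well. A sketch with placeholder names and a hypothetical helper:

```python
def run_task_kwargs(cluster, task_definition, managed_tags=False):
    """kwargs for ecs.run_task; managed tags off mirrors the console fix above."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "enableECSManagedTags": managed_tags,
    }

def run_task_without_managed_tags(cluster, task_definition):
    import boto3  # AWS SDK for Python
    ecs = boto3.client("ecs")
    return ecs.run_task(**run_task_kwargs(cluster, task_definition))
```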
I'm trying to deploy backend application to the AWS Fargate using cloudformation templates that I found. When I was using the docker image training/webapp I was able to successfully deploy it and access with the externalUrl from the networking stack for the app.
When I try to deploy our backend image I can see the stacks are deploying correctly, but when I go to the externalUrl I get 503 Service Temporarily Unavailable and I'm unable to see the app. Another thing I've noticed is that on Docker Hub the image is continuously being pulled the whole time the CloudFormation services are running.
The backend is some kind of Maven project, I don't know exactly what, but I know that it works locally. However, starting the container with this backend image takes about 8 minutes. I'm not sure if this affects Fargate. Any idea how to get it working?
It sounds like you need to find the actual error that you're experiencing, the 503 isn't enough information. Can you provide some other context?
I'm not familiar with fargate but have been using ecs quite a bit this year and I generally would find that by going to (on the dashboard) ecs -> cluster -> service -> events. The events tab gives more specific errors as to what is happening.
My ECS deployment problems generally fall into one of these categories:
1. The container is not exposing the same port as in the task definition; this could be the case if you're deploying from a stack written by someone else.
2. The task definition's memory/CPU restrictions don't grant enough space for the application, and it has trouble being placed (probably a problem with ECS more than Fargate, but you never know).
3. Your timeout in the task definition is not set to 8 minutes: see this question, it has a lot of this covered.
4. Your start command in the task definition does not work as expected with the container you're trying to deploy.
If it is pulling from Docker Hub continuously, my bet would be that it's 1, 3, or 4, and it's attempting to pull the image over and over again.
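The service events mentioned above can also be pulled with the SDK, which is handy when a crash-looping service keeps generating new ones. A boto3 sketch; cluster and service names are placeholders and the extractor helper is my own:

```python
def extract_event_messages(describe_services_response):
    """Pure helper: pull the event messages for the first described service."""
    services = describe_services_response.get("services", [])
    if not services:
        return []
    return [e["message"] for e in services[0].get("events", [])]

def service_events(cluster, service):
    """Return recent event messages for an ECS service (newest first)."""
    import boto3  # AWS SDK for Python
    ecs = boto3.client("ecs")
    resp = ecs.describe_services(cluster=cluster, services=[service])
    return extract_event_messages(resp)
```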
Try adding a Health check grace period of 60 seconds by going to ECS -> cluster -> service -> update, in the Network Access section.
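The same setting can be applied through the UpdateService API. A minimal boto3 sketch, with placeholder names and a hypothetical kwargs helper:

```python
def grace_period_kwargs(cluster, service, seconds=60):
    """kwargs for ecs.update_service setting the health check grace period."""
    return {
        "cluster": cluster,
        "service": service,
        "healthCheckGracePeriodSeconds": seconds,
    }

def set_grace_period(cluster, service, seconds=60):
    """Give slow-starting containers time before the load balancer health
    checks can mark them unhealthy and kill the task."""
    import boto3  # AWS SDK for Python
    ecs = boto3.client("ecs")
    return ecs.update_service(**grace_period_kwargs(cluster, service, seconds))
```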
The overall deployment failed because too many individual instances failed deployment, too few healthy instances are available for deployment, or some instances in your deployment group are experiencing problems. (Error code: HEALTH_CONSTRAINTS)
Note: 1. The AWS CodeDeploy agent is already installed.
2. The roles are created.
Please assist and let me know for any more information.
You can see more information about the error by accessing the deployment ID events:
Here you can check all the steps and their detailed status:
If there are any errors you can click on the Logs column.
There can be different reasons why your deployment failed. As instructed above, you can click on the deployment ID to go to the deployment details page and see which instances failed. You can then click "View events" for each instance to see why deployment to that instance failed. If you do not see any "View events" link, and no event details for that instance, it is likely that the agent is not running properly. Otherwise, you should be able to click "View events" to see which lifecycle event failed. You can also log in to the failed instance and view the host agent logs to get more information.
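The same drill-down can be scripted against the CodeDeploy API, which is useful when a deployment group has many instances. A boto3 sketch; the deployment ID is a placeholder and the event-extracting helper is my own:

```python
def failed_lifecycle_events(instance_summary):
    """Pure helper: (event name, diagnostics message) pairs for failed events."""
    failed = []
    for event in instance_summary.get("lifecycleEvents", []):
        if event.get("status") == "Failed":
            diag = event.get("diagnostics", {}).get("message", "")
            failed.append((event.get("lifecycleEventName", ""), diag))
    return failed

def deployment_failures(deployment_id):
    """Return {instance_id: [(event, message), ...]} for instances that failed."""
    import boto3  # AWS SDK for Python
    cd = boto3.client("codedeploy")
    instance_ids = cd.list_deployment_instances(deploymentId=deployment_id)["instancesList"]
    failures = {}
    for instance_id in instance_ids:
        summary = cd.get_deployment_instance(
            deploymentId=deployment_id, instanceId=instance_id
        )["instanceSummary"]
        events = failed_lifecycle_events(summary)
        if events:
            failures[instance_id] = events
    return failures
```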