The App Runner service is created successfully and works fine, but any attempt to change its configuration fails with an error. It looks as if the health check fails during the update, even though it passes at creation time and the service works fine afterwards.
[AppRunner] Service status is set to RUNNING.
[AppRunner] Service update failed. For details, see service logs.
[AppRunner] Performing health check on path '/healthz' and port '8080'.
[AppRunner] Provisioning instances and deploying image.
[AppRunner] Service status is set to OPERATION_IN_PROGRESS.
[AppRunner] Service update started.
[AppRunner] Service status is set to RUNNING.
[AppRunner] Service creation completed successfully.
[AppRunner] Successfully routed incoming traffic to application.
[AppRunner] Health check is successful. Routing traffic to application.
[AppRunner] Performing health check on path '/healthz' and port '8080'.
[AppRunner] Provisioning instances and deploying image.
[AppRunner] Successfully pulled image from ECR.
[AppRunner] Service status is set to OPERATION_IN_PROGRESS.
[AppRunner] Service creation started.
This happens with any change. For example, here I just changed the healthcheck interval from the default 10 seconds to the maximum 20.
At the same time, I cannot find any logs explaining what went wrong. CloudWatch just repeats the message "Service update failed. For details, see service logs."
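As far as I understand the App Runner documentation, each service writes to two CloudWatch Logs groups: a "service" group with the event lines shown above, and an "application" group with the container's own output. The sketch below derives both names from a service ARN so they can be looked up directly. The naming pattern is my reading of the docs and the ARN in the usage line is hypothetical, so verify both against your account.

```python
def apprunner_log_groups(service_arn: str) -> dict:
    """Derive the expected CloudWatch Logs group names for an App Runner service.

    Assumes the ARN shape arn:aws:apprunner:<region>:<account>:service/<name>/<id>
    and the /aws/apprunner/<name>/<id>/... naming pattern (an assumption to verify).
    """
    parts = service_arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "apprunner":
        raise ValueError(f"not an App Runner ARN: {service_arn!r}")
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "service":
        raise ValueError(f"unexpected resource part: {parts[5]!r}")
    _, name, service_id = resource
    prefix = f"/aws/apprunner/{name}/{service_id}"
    return {
        "service": f"{prefix}/service",          # deployment and health check events
        "application": f"{prefix}/application",  # stdout/stderr of the container
    }


# Hypothetical ARN, for illustration only.
groups = apprunner_log_groups(
    "arn:aws:apprunner:us-east-1:123456789012:service/my-app/0123456789abcdef"
)
print(groups["application"])
```

If the application group has no entries around the time of the failed update, that in itself suggests the new instances never got as far as serving the health check request.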
If it matters, I'm running the application inside a VPC, with NAT configured. There is also a private db instance with access only from the VPC. Healthcheck /healthz checks access to the internet and to the database, there are no problems with this.
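One thing that may be worth ruling out, since /healthz does real network calls: during an update the new instances are cold, and a slow database or internet check could exceed the probe timeout even though the same endpoint answers quickly once warm. The following is a minimal sketch, not the original application's code, of bounding all dependency checks by a single deadline. The function name `run_checks` and the check names are hypothetical.

```python
import concurrent.futures
import time


def run_checks(checks, timeout_s):
    """Run dependency checks in parallel under one shared deadline.

    `checks` maps a name to a zero-argument callable returning a truthy value
    on success. A check that raises or misses the deadline is reported False.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(1, len(checks)))
    futures = {name: pool.submit(fn) for name, fn in checks.items()}
    deadline = time.monotonic() + timeout_s
    results = {}
    for name, future in futures.items():
        remaining = max(0.0, deadline - time.monotonic())
        try:
            results[name] = bool(future.result(timeout=remaining))
        except Exception:
            # Covers both a timeout and an exception raised by the check itself.
            results[name] = False
    pool.shutdown(wait=False)  # do not block the response on slow checks
    return results


# Hypothetical checks standing in for the database and internet probes.
print(run_checks({"db": lambda: True, "internet": lambda: True}, timeout_s=1.0))
```

Logging the per-check result and elapsed time from inside the handler would also show up in the application log group, which helps when the platform only reports that the update failed.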
Any ideas what I'm doing wrong or where I can find useful logs would be helpful.
Related
I set up an AWS ECS cluster to run a Docker image of the Valhalla service almost as-is.
Issue: the target group does not seem able to check the cluster's health. It looks as if the health request reaches the cluster, but the container does not "forward" the request to Valhalla.
Description :
I created a repository on AWS ECR, and pushed a docker image of gisops/valhalla with only the valhalla.json file changed.
Here is the valhalla configuration I used.
Note that I changed the default listening port from 8002 to 80.
I created an ECS Fargate cluster, and a service that uses this task definition to launch a container that runs Valhalla.
The service receives traffic from an application load balancer via port 80.
The target group is checking /status path on port 80.
With everything set, the task is created, and the task logs show that Valhalla initializes and runs without problems.
However, the target group cannot verify the health status: the request seems to time out.
If the request were reaching Valhalla, the task logs would at least show it (Valhalla logs every incoming request by default), but they don't.
As a result, Fargate kills the task (Task failed ELB health checks in (target-group {my-target-group-uri})), which at least shows that health checks are being attempted against the service.
I don't think the issue is with the valhalla configuration, because I can run the same docker image locally, and it works perfectly, using :
docker run -dt -p 3000:80 -v /local/path/to/valhalla-files:/custom_files/ --name valhalla gisops/valhalla:latest
And then checking localhost:3000/status
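The local check above only exercises the host's published port. To separate a network-path problem from an application problem, it can help to run the same kind of probe from another machine inside the VPC against the task's private IP and container port. Below is a minimal sketch of such a probe. The function name `probe` is hypothetical, and it only approximates what a load balancer does (a plain GET with a timeout).

```python
import urllib.error
import urllib.request

# Bypass any proxy configured in the environment so the probe goes direct.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout_s=5.0):
    """Return the HTTP status code, or None if nothing answered in time."""
    try:
        with _opener.open(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return None  # refused, unreachable, DNS failure, or timeout


if __name__ == "__main__":
    # URL taken from the local test in the question.
    print(probe("http://localhost:3000/status"))
```

A result of `None` from inside the VPC points at security groups, the port mapping, or the address the server binds to, whereas a non-200 status code points at the path or the application itself.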
Does anyone have an idea what the issue could be?
I have already spent a lot of time on this and I'm out of ideas. Thanks for your help!
My AWS App Runner application is running normally:
12-19-2021 05:28:15 PM [AppRunner] Service status is set to RUNNING.
12-19-2021 05:28:15 PM [AppRunner] Service creation completed successfully.
12-19-2021 05:28:14 PM [AppRunner] Successfully routed incoming traffic to application.
12-19-2021 05:27:48 PM [AppRunner] Health check is successful. Routing traffic to application.
12-19-2021 05:26:39 PM [AppRunner] Performing health check on path '/ping' and port '8081'.
12-19-2021 05:26:29 PM [AppRunner] Provisioning instances and deploying image.
12-19-2021 05:26:18 PM [AppRunner] Successfully pulled image from ECR.
12-19-2021 05:24:17 PM [AppRunner] Service status is set to OPERATION_IN_PROGRESS.
12-19-2021 05:24:16 PM [AppRunner] Service creation started.
It's an express/ws application, and it works just fine in Docker locally. I am able to reach the myapplication.com/ WebSocket endpoint when it's running locally. However, I am unable to reach the WebSocket endpoint when the app is running on AWS App Runner. The application listens on port 8081 internally, but of course App Runner maps that to port 80/443 externally.
I can confirm that the application is running at least partially in App Runner, since I can reach the myapplication.com/ping endpoint.
I have tried manually with JavaScript in the console to connect to the WebSocket endpoint with every combination of ws://myapplication.com/, wss://myapplication.com/, wss://myapplication.com:8081/, ws://myapplication.com:8081/ and nothing has worked.
My question is - Does App Runner even support WebSockets? I read this on the documentation of App Runner:
Stateless apps – App Runner doesn't guarantee state persistence beyond the duration of processing a single incoming web request.
This, of course, means that having a long term WebSocket client running on AWS App Runner isn't a great idea, but does it also mean that WebSockets are impossible?
Unfortunately, AWS App Runner supports neither WebSockets nor sticky sessions.
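For context on why an ordinary HTTP endpoint such as /ping can work while the WebSocket does not: a WebSocket connection starts as an HTTP/1.1 request with `Upgrade: websocket`, and the server must answer with status 101 and a `Sec-WebSocket-Accept` header derived from the client's key. Any front end that does not pass the upgrade through will break the connection regardless of the URL scheme or port tried. The sketch below shows the accept-key computation from RFC 6455; the function name is hypothetical.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
_WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + _WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```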
I'm using an unmanaged external HTTP load balancer on GCP, with several Nginx servers running as its backend.
I wanted to check the logs for changes in the health status of the Nginx servers, but could not find a way to do so.
First, I enabled logging for the health check assigned to the backend service. Then, I stopped the service of one Nginx server to test it. Then I selected GCE Health Check from the Resource Type in the Logs Explorer, but only logs related to the creation, update, and deletion of the health check itself appeared.
Next, I enabled logging for the backend service and did the same experiment. However, similarly, only logs related to the creation, update, and deletion of the backend service itself appeared.
I have three questions:
Doesn't the logging of the health check log the changes in the health status of the monitored service?
Doesn't the logging of backend services log the health status changes of the instances belonging to the backend?
How can I check when each Nginx server became unhealthy and when it became healthy in the log?
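As I understand the Cloud Load Balancing documentation, health state transitions are written to a `healthchecks` log that is indexed by the instance group (or network endpoint group) rather than by the health check resource, which would explain why nothing appears under GCE Health Check. The sketch below builds a Logs Explorer filter on that assumption. The log name, resource type, and label name are my reading of the docs and should be verified; the project and group names are hypothetical.

```python
def health_check_log_filter(project_id, instance_group=None):
    """Build a Logs Explorer filter for health check state-change entries.

    Assumes entries live in the compute.googleapis.com/healthchecks log and
    are attached to the gce_instance_group resource type (verify in the docs).
    """
    lines = [
        f'logName="projects/{project_id}/logs/compute.googleapis.com%2Fhealthchecks"',
        'resource.type="gce_instance_group"',
    ]
    if instance_group:
        lines.append(f'resource.labels.instance_group_name="{instance_group}"')
    return "\n".join(lines)


# Hypothetical project and instance group names.
print(health_check_log_filter("my-project", "nginx-group"))
```

If entries do appear, the payload of each one should carry the previous and the new health state together with the probed IP address, which answers the third question.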
I have set up a Docker container running on port 5566 with a small Django application. The Docker image is uploaded into the ECR and later used by Fargate container(s).
I have set up an ECS cluster with a VPC.
After creating the Task Definition and Service, the Service starts up 2 tasks (as it is supposed to):
Here's the Service's Network Access (with health check grace period on 300s):
I also set up an Application Load Balancer (with DNS) with a target group for the service, but the health checks seem to be failing:
Here's the health check configuration:
Because the health checks are failing the tasks are terminated and new ones are started after ~every 5 minutes.
Here's the container's port mapping:
As one cannot access the Fargate container (via SSH for example) and the logs are empty, how should I troubleshoot the issue?
I have tried to follow every step in the Troubleshoot Your Application Load Balancer.
Feel free to ask for additional information.
Can you confirm that your application is actually listening on port 5566 inside the Docker container?
You can check the logs in CloudWatch. You'll find the link under cluster -> service -> tasks -> your task.
Can you post your ALB configuration, including the target group port?
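The first question in the comments can be answered with a plain TCP connect test, run against the container locally and, if possible, from another host in the VPC against the task's private IP. This is a minimal sketch; the function name is hypothetical and port 5566 comes from the question.

```python
import socket


def port_is_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False  # refused, unreachable, or timed out


if __name__ == "__main__":
    print(port_is_open("127.0.0.1", 5566))
```

If the app is started with Django's development server, note that `runserver` binds to 127.0.0.1 by default, so the port would be open inside the container but unreachable from the load balancer unless it is started on 0.0.0.0.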
I deployed a Docker image using AWS Fargate. When I created a service from the task definition, the logs show that Tomcat has no errors and the app is up and running, but new instances are constantly being spun up because the health check is failing.
Health Checks (On target group tied to the service)
Protocol: HTTP
Path: /Sampler/data/ping
Port: traffic/port
What is the right path for health check?
I tried giving servicename too, but it did not work
for example: /servicename/data/ping
Can you please suggest what I am missing?
I have deployed the same war file in local by running docker run -p 8080:8080 sampler:latest (same image pushed from local to ECR) and when I hit the URL http://localhost:8080/Sampler/data/ping, I get 200 status code
Dockerfile
FROM tomcat:9.0-jre8-alpine
COPY target/Sampler-*.war $CATALINA_HOME/webapps/Sampler.war
EXPOSE 80
The path for the health check depends on your application. Based on the information you have provided, I suspect the issue could be related to healthCheckGracePeriodSeconds:
healthCheckGracePeriodSeconds
The period of time, in seconds, that the Amazon ECS service scheduler ignores unhealthy Elastic Load Balancing target health checks after a task has first started.
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_Service.html
When ECS tasks take a long time to start, Elastic Load Balancing (ELB) health checks can mark the task as unhealthy, and the service scheduler then shuts the task down.
You can specify a health check grace period in the ECS service definition. This instructs the service scheduler to ignore ELB health checks for a predefined period after a task has been instantiated.
https://aws.amazon.com/about-aws/whats-new/2017/12/amazon-ecs-adds-elb-health-check-grace-period/
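For an existing service, the grace period can be changed through the ECS `UpdateService` API. The sketch below only builds and validates the arguments; the helper name, cluster name, and service name are hypothetical, and the actual boto3 call is left as a comment because it needs credentials.

```python
# Upper bound documented for healthCheckGracePeriodSeconds.
MAX_GRACE_SECONDS = 2_147_483_647


def grace_period_update(cluster, service, seconds):
    """Build keyword arguments for an ECS update_service call."""
    if not 0 <= seconds <= MAX_GRACE_SECONDS:
        raise ValueError(f"grace period out of range: {seconds}")
    return {
        "cluster": cluster,
        "service": service,
        "healthCheckGracePeriodSeconds": int(seconds),
    }


# Hypothetical names; give a slow-starting Tomcat two minutes before checks count.
kwargs = grace_period_update("my-cluster", "sampler", 120)
print(kwargs)
# With credentials configured, this would apply the change:
#   import boto3
#   boto3.client("ecs").update_service(**kwargs)
```

A grace period only hides failures that happen while the task is still starting. If the target stays unhealthy after it expires, the port and path in the target group are the next things to compare with what the container actually serves.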