ECS migration from AL1 to AL2 - ECS service not starting

I recently changed the AMI my ECS EC2 instances run on from Amazon Linux to Amazon Linux 2 (in both cases using the ECS-optimized images). I deploy the instances with CloudFormation, and it has been a real headache: the new instances sometimes come up successfully and sometimes don't (same stack, no updates, same code).
On the failed instances there is an issue with the ECS service itself. After running ecs-logs-collector.sh, the ecs log file contains the warning "The Amazon ECS Container Agent is not running". Also, the /var/log/ecs directory doesn't even exist!
I have the correct IAM role attached to the instance.
Also, as mentioned, it is the same code being run, yet roughly 75% of attempts fail with the ECS service issue. I have no more ideas where else to look for issues/logs/errors.
AMI: ami-0650e7d86452db33b (eu-central-1)

Solved. If anyone runs into this issue, adding the following to my user data helped:
# Copy the unit file so the packaged version is left untouched
cp /usr/lib/systemd/system/ecs.service /etc/systemd/system/ecs.service
# Drop the ordering dependency on cloud-final.service, which blocked the agent from starting
sed -i '/After=cloud-final.service/d' /etc/systemd/system/ecs.service
# Reload systemd so the edited unit takes effect
systemctl daemon-reload
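For context, a minimal sketch of how this might sit in a full CloudFormation UserData script; the cluster name my-cluster is a placeholder, and the final systemctl enable --now is an assumption about wanting the agent up immediately, not part of the original fix:
#!/bin/bash
# Placeholder cluster name; replace with your own
echo "ECS_CLUSTER=my-cluster" >> /etc/ecs/ecs.config
# Workaround from above: remove the cloud-final.service ordering dependency
cp /usr/lib/systemd/system/ecs.service /etc/systemd/system/ecs.service
sed -i '/After=cloud-final.service/d' /etc/systemd/system/ecs.service
systemctl daemon-reload
# Assumed step: start the agent right away instead of waiting on ordering
systemctl enable --now ecs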

Related

ECS Fargate forever with "rolloutState" in "IN_PROGRESS" with no stopped task on AWS Console

I'm using ECS with Fargate. I have a service running and it is working OK. But after I update the task definition (a new deploy), the console (ECS -> Clusters -> Tasks tab) shows that my current task is INACTIVE, which is normal, but it doesn't show any new task being created, nor any stopped task, even after an hour. It is as if ECS is not trying to run my new definition.
If I use the awscli to find information about my service:
aws ecs describe-services --cluster cluster-xxxxxxx --services service-svc-xxxxxxx --region us-east-1
It has two deployments. The first one is fine; it is the running deployment. The most recent one shows:
"desiredCount": 1,
"pendingCount": 0,
"runningCount": 0,
"failedTasks": 7,
...
"rolloutState": "IN_PROGRESS",
"rolloutStateReason": "ECS deployment ecs-svc/XXXXXXXXXXXXXXXXXX in progress."
Again, there is nothing on the ECS console that points to failed tasks. It is as if the task is failing at such an early stage that it isn't even logging anything.
I tried looking at CloudTrail events, but there is nothing there about failed tasks. In CloudWatch, the Container Insights logs (/aws/ecs/containerinsights/cluster-xxxxxxx/performance) also don't mention failing tasks.
How can I troubleshoot this situation?
It turns out that if the account/organization is new, AWS applies quotas that don't seem to be documented anywhere. ECS was not authorized to start more than two tasks.
There is a trick I found in this post: create an EC2 instance in the same region where you are trying to use ECS. Shortly after I created the EC2 instance, I received an email from AWS and everything behaved normally again. https://forums.aws.amazon.com/thread.jspa?threadID=340770
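More generally, the service event log often surfaces deployment failure reasons that never appear as stopped tasks in the console. A sketch of pulling the latest events, reusing the placeholder cluster and service names from above:
aws ecs describe-services --cluster cluster-xxxxxxx --services service-svc-xxxxxxx \
    --region us-east-1 \
    --query 'services[0].events[0:5].[createdAt,message]' --output table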

Running AWS ECS Task Attached (Not Detached)

Is there an easy way to run an ECS task attached, or to follow the logs only while the container is running (i.e., detach after displaying all of the associated logs)?
Using the AWS CLI (1.17.0) and ecs-cli (1.21.0), I have gotten decently close with the following two commands:
aws ecs run-task --cluster "mycluster" --task-definition testhelloworldjob --launch-type FARGATE --network-configuration etc.etc.etc.
ecs-cli logs --task-id {TASK_ID_HERE_FROM_OUTPUT_OF_PREVIOUS_COMMAND} --follow
I currently have two issues with the above approach:
There is a race condition: the logs are not yet available while the task is in a pre-RUNNING state. Instead of ecs-cli logs waiting for the logs to exist, an error is thrown immediately.
Even after waiting for the task to reach a running state and then issuing ecs-cli logs, the command refuses to detach even AFTER the task is finished and in a post-RUNNING status.
For the first issue I could poll until the task is past the ACTIVATING/PENDING status before calling logs. For the second issue I could write some kind of threaded call that polls and stops following the log once the container in question is no longer running... but there has to be an easier way?
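For what it's worth, a sketch of that polling approach using the task waiters built into the AWS CLI; the cluster and task-definition names reuse the placeholders above, and the network configuration stays elided as in the original:
TASK_ARN=$(aws ecs run-task --cluster "mycluster" --task-definition testhelloworldjob \
    --launch-type FARGATE --network-configuration etc.etc.etc. \
    --query 'tasks[0].taskArn' --output text)
# Block until the task reaches RUNNING so the log stream exists
aws ecs wait tasks-running --cluster "mycluster" --tasks "$TASK_ARN"
# Follow the logs in the background while the task runs
ecs-cli logs --task-id "${TASK_ARN##*/}" --follow &
LOGS_PID=$!
# Block until the task stops, then stop following
aws ecs wait tasks-stopped --cluster "mycluster" --tasks "$TASK_ARN"
kill "$LOGS_PID"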
To clarify, I am coming from numerous other container orchestration tools/technologies that support this seamlessly. Here are some examples of tools and the commands that would yield my intended result:
Docker CLI:
docker run hello-world
Docker-Compose Yaml:
docker-compose up
Kubernetes kubectl YAML:
kubectl apply -f ./hello-k8.yaml && kubectl logs --follow hello-world
I think ecs-cli is the best option available at the moment.
Apart from that, you can change the log driver of the ECS task to syslog and then watch the log file from the terminal after SSHing into the EC2 container instance on which it is running.
Another option is to SSH into the EC2 container instance where it was running, run the container of that ECS task yourself using docker run, and once the testing is done, stop and remove the container and start the task again via ECS.
Note: You can use AWS SSM Session Manager in order to avoid using EC2 key pair and adding an inbound rule for SSH.
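For example, assuming the instance runs the SSM agent with an instance profile granting the required SSM permissions (and the Session Manager plugin is installed locally), a session can be opened without SSH; the instance ID is a placeholder:
aws ssm start-session --target i-0123456789abcdef0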

Update AWS ECS service after ECR image being updated

I have an ECS service with an EC2 task running on an EC2 instance. The image in the task comes from an ECR URI, for example 688523422345.dkr.ecr.us-west-1.amazonaws.com/image. When I load the image into the task I reference the URI directly as 688523422345.dkr.ecr.us-west-1.amazonaws.com/image:latest, because the URI never changes and I just keep pushing images to it.
However, when the image is updated in ECR, the task and service running on the EC2 instance don't update. I wondered why and searched on Stack Overflow; people suggested using aws ecs update-service --cluster <cluster name> --service <service name> --force-new-deployment to force the service to redeploy. However, I just got an error about not enough memory left on the instance (it seems the redeployment creates a new task alongside the old one and takes more memory; not a good solution).
How can I solve this?
This could be because of your Deployment configuration and the parameters:
maximumPercent
minimumHealthyPercent
By default, minimumHealthyPercent is 100%, which means the replacement operation will first attempt to run new tasks before terminating old ones, potentially resulting in an out-of-memory error. You can set minimumHealthyPercent to 0 and maximumPercent to 100 to force termination of the existing tasks before the new ones are created.
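A sketch of applying that through the CLI, keeping the placeholder names from the question:
aws ecs update-service --cluster <cluster name> --service <service name> \
    --deployment-configuration maximumPercent=100,minimumHealthyPercent=0 \
    --force-new-deployment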
That didn't work for me, even after a lot of attempts. I found out it is because the EC2 instance had already stored all the information from the task (even with the task deleted, the instance kept running with the old image). The right way to fix it was to restart the instance.
I used aws-cli: aws ec2 reboot-instances --instance-ids <instance_id>
It worked!

No ECS agent docker container in ECS optimised instance

I launched an ECS-optimized instance in the ap-south-1 region of AWS from AMI ID ami-0a8bf4e187339e2c1, following https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html, but there is no ECS agent present. Even the /var/log/ecs directory is not present, so I cannot check logs. I have the correct cluster name configured in /etc/ecs/ecs.config
If you look at the instances in the EC2 console in AWS, can you see the AMI ID? Is it the AMI ID you expect?
Just to have a point of comparison, I just SSH'd to an ECS-optimized EC2 instance: I can see ecs-agent in a docker ps listing and I can see /var/log/ecs, so my first instinct is that this EC2 instance didn't end up using the AMI you expected it to.
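A couple of checks that might confirm this from the instance itself; the second pair assumes the agent is at least installed, and 51678 is the agent's standard introspection port:
# Confirm the AMI the instance actually booted from
curl -s http://169.254.169.254/latest/meta-data/ami-id
# Look for the agent container and query its introspection endpoint
docker ps --filter name=ecs-agent
curl -s http://localhost:51678/v1/metadata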
If you want to check container logs, go to Tasks, click on the task whose logs you want to see, and then click on Logs; you will see the logs of your container.

Kubernetes run on AWS

I've been struggling with configuring Kubernetes for many hours and I don't know how to move forward.
What I did :
I created a few services using Spring Cloud
I created docker images for each service
I pushed those images to docker hub
I launched the cluster on AWS by running
export KUBERNETES_PROVIDER=aws; wget -q -O - https://get.k8s.io | bash
Command kubectl cluster-info shows that it actually works.
I created Kubernetes pods for each service. The command kubectl get pods shows that all pods have status Running.
The problem is that when I log in to my AWS account I don't see any running instances, although I can see kubernetes-staging created in my S3 bucket.
My goal is to actually access my services, not just on localhost. How can I do it?
You should be able to see instances, of course. As @kichik mentioned, check whether your AWS console is using the same region as the deployment scripts.
To use your services/applications, the next step is to expose them to the public with Kubernetes Services, as described here and here.
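A minimal sketch of exposing one of the workloads through an AWS load balancer; the hello-world name and port 8080 are placeholders, not from the original setup:
# Create a Service backed by an AWS ELB
kubectl expose deployment hello-world --type=LoadBalancer --port=80 --target-port=8080
# The EXTERNAL-IP column shows the ELB hostname once it is provisioned
kubectl get service hello-world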