AWS ECS firstRun not showing EC2 instance - amazon-web-services

I went through the firstRun steps here (AWS login required).
I have uploaded a Docker image to ECR and the cluster launches successfully; all steps succeed (ECS status: 4 of 4 complete, EC2 instance status: 14 of 14 complete).
There is no instance attached to the cluster although it is running (see screenshots). What am I doing wrong?

The permissions were missing; in other words, EMR_EC2_DefaultRole did not have the AmazonEC2ContainerServiceforEC2Role policy attached.
It's explained here.
I would have expected the setup to fail if the role does not grant enough permissions for the EC2 instance to actually connect to the cluster.
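
For anyone hitting the same symptom, attaching that managed policy to the instance role can also be done from the CLI. A minimal sketch, assuming the container instances launch with a role named ecsInstanceRole (substitute whatever role your instances actually use):

# Role name is an assumption; use the role attached to your container instances.
aws iam attach-role-policy \
  --role-name ecsInstanceRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role

Once the policy is attached, the ECS agent should be able to register the instance with the cluster; it may take a restart of the agent or the instance before it retries.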

Related

Unable to deploy code on ec2 instance using codedeploy

I have a single EC2 instance running Ubuntu Server and I am trying to implement a CI/CD flow using CodeDeploy, with Bitbucket as the source. I have also installed the codedeploy-agent on the EC2 instance and it is running successfully, but whenever I deploy code to EC2 the deployment fails with the error shown below:
The overall deployment failed because too many individual instances failed deployment, too few
healthy instances are available for deployment, or some instances in your deployment group are
experiencing problems.
The CodeDeploy agent log file, which I am viewing with less /var/log/aws/codedeploy-agent/codedeploy-agent.log, shows the error below:
ERROR [codedeploy-agent(31598)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller:
Missing credentials - please check if this instance was started with an IAM instance profile
I am unable to understand how I can overcome this error; can someone let me know?
The CodeDeploy agent requires IAM permissions provided by the IAM role/instance profile of your instance. The exact permissions needed are given in the AWS docs:
Step 4: Create an IAM instance profile for your Amazon EC2 instances
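
As a rough sketch of what that step amounts to with the CLI (all names and the instance ID below are placeholders, and AmazonEC2RoleforAWSCodeDeploy stands in for whatever permissions policy the docs have you create):

# Create a role EC2 can assume, attach a policy granting the agent's S3 read access,
# wrap it in an instance profile, and associate it with the instance.
aws iam create-role --role-name CodeDeployEC2Role \
  --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name CodeDeployEC2Role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforAWSCodeDeploy
aws iam create-instance-profile --instance-profile-name CodeDeployEC2Profile
aws iam add-role-to-instance-profile --instance-profile-name CodeDeployEC2Profile --role-name CodeDeployEC2Role
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=CodeDeployEC2Profile

After the profile is attached, restart the agent on the instance (sudo service codedeploy-agent restart) so it picks up the new credentials; the "Missing credentials" error should then go away.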

Update AWS ECS service after ECR image being updated

I have an ECS service with an EC2 launch type task running on an EC2 instance. The image in the task comes from an ECR URI, for example 688523422345.dkr.ecr.us-west-1.amazonaws.com/image. I noticed that when I load this image into my task I reference the URI directly as 688523422345.dkr.ecr.us-west-1.amazonaws.com/image:latest, because the image URI never changes and I just keep pushing new images to it.
However, when the image is updated on ECR, the task and service running on the EC2 instance do not update. I wondered why and searched on Stack Overflow; people told me to use aws ecs update-service --cluster <cluster name> --service <service name> --force-new-deployment to force the service to redeploy. However, I just got an error about not enough memory left on the instance (it seems the redeployment creates a new task while the old one is still running, taking more memory, so it is not a good solution).
How can I solve this?
This could be because of your Deployment configuration and the parameters:
maximumPercent
minimumHealthyPercent
By default minimumHealthyPercent is 100%, which means the replacement operation will first attempt to run new tasks before terminating old ones, potentially resulting in an out-of-memory error. You can set minimumHealthyPercent to 0 and maximumPercent to 100 to force termination of existing tasks before new ones are created, as in the CLI sketch below.
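
A minimal sketch of that, reusing the placeholder cluster and service names from the question (the --deployment-configuration flag sets both percentages in the same call):

aws ecs update-service \
  --cluster <cluster name> \
  --service <service name> \
  --deployment-configuration maximumPercent=100,minimumHealthyPercent=0 \
  --force-new-deployment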
It's not working, even after trying a lot. I found out it is because the EC2 instance has already stored all the information from the task (even if the task is deleted, the instance keeps running with the old image). The right way to do it is to restart the instance.
I used aws-cli: aws ec2 reboot-instances --instance-ids <instance_id>
It worked!

No ECS agent docker container in ECS optimised instance

I launched an ECS-optimized instance in the ap-south-1 region of AWS from AMI ID ami-0a8bf4e187339e2c1, using the link https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html, but there is no ECS agent present. Even the /var/log/ecs directory is not present, so I cannot check the logs. I have the correct cluster name configured in /etc/ecs/ecs.config.
If you look at the instances in the EC2 console in AWS, can you see the AMI ID? Is it the AMI ID you expect?
Just to have a point of comparison, I just SSH'd to an ECS-optimized EC2 instance and I can see ecs-agent in a docker ps listing and I can see /var/log/ecs, so my first instinct is that this EC2 instance didn't end up using the AMI you expected it to.
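For what it's worth, a few quick checks on the instance itself can confirm which AMI it booted from and whether the agent is there at all (treat these as a sketch; the last one uses the agent's local introspection endpoint and only answers if the agent is running):

# AMI the instance actually booted from (instance metadata service)
curl -s http://169.254.169.254/latest/meta-data/ami-id
# Is the agent container present?
docker ps --filter name=ecs-agent
# If the agent is up, this reports the cluster it registered with
curl -s http://localhost:51678/v1/metadata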
If you want to check logs, go to Tasks, click on the task whose logs you want to see, and then click on Logs; you will see the logs of your container.

ECS migration from AL1 to AL2 - ECS service not starting

I have recently changed the AMI my ECS EC2 instances run on from Amazon Linux to Amazon Linux 2 (in both cases I am using ECS-optimized images). I am deploying my instances using CloudFormation and having a real headache, as those new instances sometimes come up successfully and sometimes not (same stack, no updates, same code).
On the failed instances I see that there is an issue with the ECS service itself: after executing ecs-logs-collector.sh, the ecs log file shows "warning: The Amazon ECS Container Agent is not running". Also, the directory /var/log/ecs doesn't even exist!
I have the correct IAM role attached to the instance.
Also, as mentioned, it is the same code being run, and on 75% of attempts it fails with the ECS service. I have no more ideas where else to look for issues/logs/errors.
AMI: ami-0650e7d86452db33b (eu-central-1)
Solved. In case someone else runs into this issue, adding this to my user data helped:
cp /usr/lib/systemd/system/ecs.service /etc/systemd/system/ecs.service
sed -i '/After=cloud-final.service/d' /etc/systemd/system/ecs.service
systemctl daemon-reload
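If it helps anyone else, a plausible reading of why this works: the packaged ecs.service unit on AL2 orders itself after cloud-final.service, so the agent can end up waiting on cloud-init while cloud-init is still running the user data; copying the unit into /etc/systemd/system and dropping that line removes the ordering. A rough sketch of a full user data script under that assumption (the cluster name is a placeholder):

#!/bin/bash
# Placeholder cluster name; point the agent at the right cluster.
echo "ECS_CLUSTER=my-cluster" >> /etc/ecs/ecs.config
# Drop the After=cloud-final.service ordering from a local copy of the unit
# so ecs.service no longer waits on cloud-init to finish.
cp /usr/lib/systemd/system/ecs.service /etc/systemd/system/ecs.service
sed -i '/After=cloud-final.service/d' /etc/systemd/system/ecs.service
systemctl daemon-reload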

Can't login to docker with aws

This is an extension of my last question, now that I've decided to deploy a Docker container onto a ton of EC2s. I've set up a repository and a user with full rights, and I added the correct keys to my aws cli configuration. When I try to run the docker login command that comes up after running the "aws ecr get-login" command, it gives me a "failed with status: 403 forbidden" error. I have absolutely no clue what's going on, and I've spent the past 2 days trying to fix this error... Any ideas?
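
For reference, the login sequence being described goes roughly like this; the region and account ID below are illustrative placeholders, not taken from this setup:

# AWS CLI v1: get-login prints a docker login command to run
aws ecr get-login --no-include-email --region us-west-1
# AWS CLI v2 equivalent: pipe the token straight into docker login
aws ecr get-login-password --region us-west-1 | \
  docker login --username AWS --password-stdin 688523422345.dkr.ecr.us-west-1.amazonaws.com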
I would suggest checking the security group of the EC2 instance.
To allow access via SSH, you have to apply the appropriate inbound rules in the Security Group of the EC2 instance; see the Security Groups documentation.