AWS CodeDeploy stuck in AllowTraffic step

I'm using AWS CodeDeploy to deploy my project (triggered by CodePipeline) to an autoscaling group (EC2 instances behind an ALB). This is my appSpec file:
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/html/test-deploy
    overwrite: true
permissions:
  - object: /var/www/html/test-deploy/codedeploy
    pattern: "*.sh"
    owner: root
    group: root
    mode: 755
    type:
      - file
hooks:
  BeforeInstall:
    - location: codedeploy/before_install.sh
      timeout: 180
  AfterInstall:
    - location: codedeploy/after_install.sh
      runas: centos
      timeout: 180
The files get deployed successfully to the EC2 instances, but for some reason nothing happens after "BeforeAllowTraffic": I waited 15 minutes and the next step was still "pending".
The two .sh files do nothing fancy (and CodeDeploy passed those hooks, so I don't think that's the problem).
Can anyone point me in the right direction? I don't get any error messages, so I don't even know how to start debugging this.
Thanks

I had the same issue. After investigating, I found that my target group was "unhealthy". I added the health check path/file (e.g. "/robots.txt"), rebooted the EC2 server, and that fixed the problem.
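If you want to confirm target health from the CLI rather than the console, something like this should work; the target group ARN here is a placeholder you would replace with your own:
# Hypothetical target group ARN -- substitute your own
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/0123456789abcdef \
  --query 'TargetHealthDescriptions[].{Id:Target.Id,State:TargetHealth.State,Reason:TargetHealth.Reason}'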

We also had an unhealthy target instance. The problem was hosting two applications on the same instance, where one (application A) was responsible for health checks and talking to the load balancer, and the other one (application B without any open network ports) was being deployed. One instance was always getting stuck in AllowTraffic during app B deployments. I found the root cause when I looked at the target group for app A and saw that same instance in the "unhealthy" status, so of course deploying app B wasn't going to fix that. After I re-deployed app A and restored the instance back to health, app B deployments were able to progress.

Check your logs on your target group instances. It may be caused by one of the following:
the application startup command did not finish successfully
the application is not running due to an error
your target group's health check is NOT configured with the endpoint you expect
your application is NOT responding at the endpoint you expect
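A quick way to rule out the last two points is to hit the health check path directly from the instance; the port and path below are placeholders, so use whatever your target group is actually configured with:
# Hypothetical port and path -- match your target group's health check settings
curl -v http://localhost:80/health
# Confirm something is actually listening on that port
sudo ss -tlnp | grep ':80'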

Related

CodeDeploy events not running

This is what my CodeDeploy status looks like:
This is the first time I'm trying to set this up. I created an EC2 instance and added the following policies to the attached IAM role:
and edited the trust relationship like this:
I also installed the CodeDeploy agent on the EC2 instance.
This is my appspec.yml:
version: 0.0
os: linux
files:
  - source: .
    destination: /home/ubuntu
hooks:
  ApplicationStop:
    - location: scripts/stop_server.sh
      timeout: 5
      runas: root
stop_server.sh is just an empty file.
Any ideas?
The most likely problem is that the agent either isn't installed or the instance doesn't have sufficient permissions. When no events are started on the instance for the deployment, it means CodeDeploy couldn't talk to the host for some reason.
Here are the steps I would take:
Confirm that you installed the CodeDeploy agent
Confirm that you've created the IAM service role
Confirm that you have the IAM Instance Profile and that it's associated with the instance
Check that you can reach the CodeDeploy commands endpoint in your region from the box, e.g. ping codedeploy.us-east-1.amazonaws.com. Otherwise, your networking setup might be too restrictive.
Look at the logs on the host to see what's going on
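For the first and last points, a couple of commands on the instance usually tell you whether the agent is installed and what it last tried to do (standard agent paths, assuming a default install):
sudo service codedeploy-agent status
# Agent log -- shows polling, credential, and permission errors
sudo tail -n 100 /var/log/aws/codedeploy-agent/codedeploy-agent.log
# Per-deployment lifecycle event output
sudo tail -n 100 /opt/codedeploy-agent/deployment-root/deployment-logs/codedeploy-agent-deployments.log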

Restrict Elastic Beanstalk from creating a security group and use provided one instead

When I create a Beanstalk environment using a saved configuration, it works fine but creates a new security group for no reason and attaches it to the instances. I already provide a security group to allow SSH access to the instances from VPC sources.
I followed this thread and tried to restrict this behaviour with the following config inside .ebextensions:
Resources:
  AWSEBSecurityGroup: { "CmpFn::Remove": {} }
  AWSEBAutoScalingLaunchConfiguration:
    Properties:
      SecurityGroups:
        - sg-07f419c62e8c4d4ab
Now the creation process gets stuck at:
Creating application version archive "app-210517_181530".
Uploading stage/app-210517_181530.zip to S3. This may take a while.
Upload Complete.
Environment details for: restrict-sg-poc
Application name: stage
Region: ap-south-1
Deployed Version: app-210517_181530
Environment ID: e-pcpmj9mdjb
Platform: arn:aws:elasticbeanstalk:ap-south-1::platform/Tomcat 8.5 with Corretto 11 running on 64bit Amazon Linux 2/4.1.8
Tier: WebServer-Standard-1.0
CNAME: UNKNOWN
Updated: 2021-05-17 12:45:35.701000+00:00
Printing Status:
2021-05-17 12:45:34 INFO createEnvironment is starting.
2021-05-17 12:45:35 INFO Using elasticbeanstalk-ap-south-1-############ as Amazon S3 storage bucket for environment data.
How can I do this properly so that my SG is added to the instances and no new SGs are created?
PS: I am using a shared ALB, so the SG created for load balancers is not a problem right now.

CodeDeploy with S3 always fails after 5 minutes

I've spent the better half of the day trying to set up CodeDeploy, CodePipeline, S3 and EC2.
CodePipeline will successfully:
Pick up detected changes in GitHub
Push the ZIP file up to S3
Trigger CodeDeploy to begin deployment
Also
EC2 has list and read access to S3
S3 allows all actions from EC2
I've mostly followed this outdated guide: https://cloudacademy.com/blog/how-to-deploy-application-code-from-s3-using-aws-codedeploy/
appspec.yml
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www
hooks:
  AfterInstall:
    - location: hooks/after-install.sh
      runas: root
I'm rather new to AWS and can't for the life of me find the logs that would tell me what's going on, nor do I get any error message that points me anywhere, so I've been shooting blind all day, double-checking everything and trying again, and this is taunting me now:
Any help, even if it's just pointing me towards where I can actually find the error message, would be tremendously appreciated. Thanks for your time.
This generally occurs for one of the following three reasons:
The CodeDeploy agent is not installed or not running on the target instance.
The instance has no access to the CodeDeploy and S3 services. Ensure you are either:
Running an instance in a public subnet with an internet gateway
Running an instance in a private subnet with a NAT gateway/NAT instance
The IAM permissions for the instance's IAM role are not sufficient; for sufficient permissions, attach the AWSCodeDeployRole policy.
As you have said your IAM role permissions are fine, you are left with one of the other two scenarios.
Once these are working, you can generally find the logs under /var/log/aws/codedeploy-agent.
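If it turns out to be the network access case, you can sanity-check reachability of both services from the instance itself; the region and bucket name below are placeholders:
# Hypothetical region and bucket -- replace with your own
curl -sI https://codedeploy.us-east-1.amazonaws.com | head -n 1
aws s3 ls s3://my-deployment-bucket --region us-east-1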

How can I stream a specific log file from multi-container Docker Elastic Beanstalk to CloudWatch?

I have a web service deployed to an Elastic Beanstalk environment running the Docker Multi-Container stack. I have enabled Log Streaming to CloudWatch on the environment, so about five different log groups show up in Cloudwatch, and so when I click "Request Logs" from Beanstalk it loads a webpage that shows me all the log files, one after another. I've noticed that there are some logs on this web page that do not show up as Log Groups in CloudWatch, and these are the logs I really care about. My question is how do I get them to show up as CloudWatch Log Groups?
In particular, the five Log Groups that Elastic Beanstalk automatically created for me are:
/aws/elasticbeanstalk/my-web-service/var/log/docker-events.log
/aws/elasticbeanstalk/my-web-service/var/log/eb-activity.log
/aws/elasticbeanstalk/my-web-service/var/log/eb-ecs-mgr.log
/aws/elasticbeanstalk/my-web-service/var/log/ecs/ecs-agent.log
/aws/elasticbeanstalk/my-web-service/var/log/ecs/ecs-init.log
And when I look in the file that gets generated when I "request logs," those five are indeed there. But these other log files are also represented:
/aws/elasticbeanstalk/my-web-service/var/log/awslogs.log
/aws/elasticbeanstalk/my-web-service/var/log/docker
/aws/elasticbeanstalk/my-web-service/var/log/docker-ps.log
/aws/elasticbeanstalk/my-web-service/var/log/eb-commandprocessor.log
/aws/elasticbeanstalk/my-web-service/var/log/containers/my-svc-8edcf9cec583-stdouterr.log
It's that last one that I'm really interested in, the one ending in stdouterr.log. That's where my containerized application writes all of its log messages to. What I would like to see is a Log Group in CloudWatch that corresponds to that stdouterr.log file. As far as I can tell, the 12-digit ID that's in the log file name is the ID of the docker image that gets installed on the host, and is subject to change every time you restart the server. So I'm guessing I'll likely need to mount a volume, or something like that, in the Dockerrun.aws.json configuration? And furthermore I would guess that I'd then need to manually add a Log Group to CloudWatch? How can I get this file to show up?
It looks like you currently only have the default logs being sent to CloudWatch Logs. You can add additional logs to the CloudWatch agent through your .ebextensions:
### BEGIN .ebextensions/logs.config
option_settings:
  - namespace: aws:elasticbeanstalk:cloudwatch:logs
    option_name: StreamLogs
    value: true
  - namespace: aws:elasticbeanstalk:cloudwatch:logs
    option_name: DeleteOnTerminate
    value: false
  - namespace: aws:elasticbeanstalk:cloudwatch:logs
    option_name: RetentionInDays
    value: 7
files:
  "/etc/awslogs/config/stdout.conf":
    mode: "000755"
    owner: root
    group: root
    content: |
      [docker-stdout]
      log_group_name=/aws/elasticbeanstalk/environment_name/docker-stdout
      log_stream_name={instance_id}
      file=/var/log/eb-docker/containers/eb-current-app/*-stdouterr.log
commands:
  "00_restart_awslogs":
    command: service awslogs restart
### END .ebextensions/logs.config
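After the environment update runs, the new group should appear in CloudWatch; a quick way to confirm from the CLI (substituting your actual environment name for environment_name) would be roughly:
aws logs describe-log-groups --log-group-name-prefix /aws/elasticbeanstalk/environment_name/docker-stdout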

Elastic Beanstalk multi-container Docker fails

I want to deploy a multi-container application in Elastic Beanstalk. I get the following error:
Error 1: The EC2 instances failed to communicate with AWS Elastic Beanstalk, either because of configuration problems with the VPC or a failed EC2 instance. Check your VPC configuration and try launching the environment again.
I have set up the VPC with just the public subnet and the security group that allows all traffic both inbound and outbound. I know this is not encouraged for production level deployment, but I have reduced the complexity to find the cause of the error.
So, the load balancer and the EC2 instance are inside the same public subnet that is attached with the internet gateway. They both share the same security group allowing all the traffic.
Before the above error, I also get another error stating
Error 2: No ecs task definition (or empty definition file) found in environment
That said, I have bundled my Dockerrun.aws.json file along with the .ebextensions folder inside the source bundle that Beanstalk uses for deployment.
After all these errors, it boils down to two questions:
Why does the "No ecs task definition" error appear when I have packaged my Dockerrun.aws.json file containing containerDefinitions?
Since there is no ECS task running, there is nothing running on the instance. Is this why Beanstalk and the ELB cannot communicate with the instance? (Assuming my public subnet and allow-all security group are not the problem.)
The problem was the VPC. Even though I had a simple VPC with just a public subnet, Beanstalk could not talk to the instance, and so it could not deploy the ECS task definition and Docker containers to the instance.
I created two subnets, public and private, and put a NAT instance in the public subnet to act as the router for the instances in the private subnet. This setup worked for me, and I could deploy the ECS task definition successfully to the EC2 instance in the private subnet.
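For anyone reproducing that layout from the CLI, a rough sketch follows; every ID here is a placeholder, and it uses a managed NAT gateway rather than a NAT instance:
# Allocate an Elastic IP and create a NAT gateway in the public subnet (placeholder IDs)
aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway --subnet-id subnet-0aaa1111 --allocation-id eipalloc-0bbb2222
# Send the private subnet's outbound traffic through the NAT gateway
aws ec2 create-route --route-table-id rtb-0ccc3333 --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0ddd4444
aws ec2 associate-route-table --route-table-id rtb-0ccc3333 --subnet-id subnet-0eee5555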
I found this question because I got the same error. Here are the steps that worked for me to actually deploy a multi-container app on Beanstalk:
To get past this particular error, I used the eb CLI tools. For some reason, using eb deploy instead of zipping and uploading myself fixed this. It didn't actually work, but it gave me a new error.
So, I changed my Dockerrun.aws.json, a file format that needs WAY more documentation, until I stopped getting errors about that.
Then, I got an even better error!
ERROR: [Instance: i-0*********0bb37cf] Command failed on instance.
Return code: 1 Output: (TRUNCATED)..._api_call
raise ClientError(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied
Failed to download authentication credentials [config file name] from [bucket name].
Hook /opt/elasticbeanstalk/hooks/appdeploy/enact/02update-credentials.sh failed.
For more detail, check /var/log/eb-activity.log using console or EB CLI.
Per this part of the docs, the way to solve this is to:
Open the Roles page in the IAM console.
Choose aws-elasticbeanstalk-ec2-role.
On the Permissions tab, under Managed Policies, choose Attach Policy.
Select the managed policy for the additional services that your application uses. For example, AmazonS3FullAccess or AmazonDynamoDBFullAccess. (For our problem, the S3 one)
Choose Attach Policies.
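If you prefer the CLI, the equivalent is roughly the following (assuming the default instance profile role name):
aws iam attach-role-policy \
  --role-name aws-elasticbeanstalk-ec2-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess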
This part got really exciting, because I got yet another error: Authentication credentials are not in JSON format as expected. Please generate the credentials using 'docker login'. (Keep in mind, I tried to follow the instructions on how to do this to the letter, but, oh well.) Turns out this one was on me: I had malformed JSON in my DockerHub auth file stored on S3. I renamed the file to dockercfg.json to get syntax checking, and it seems Beanstalk/ECS is okay with having .json as part of the name, because this time... there was a different error: CannotPullContainerError: Error: image [DockerHub organization]/[repo name]:latest not found. Hmm, maybe there was a typo? Let's check:
$ docker run -it [DockerHub organization]/[repo name]:latest
Unable to find image '[DockerHub organization]/[repo name]:latest' locally
latest: Pulling from [DockerHub organization]/[repo name]
Ok, the repo is there. So... my auth is bad? Yup, turns out I followed an example in the DockerHub auth docs of what you shouldn't do. Your dockercfg.json should look like:
{
  "https://index.docker.io/v1/": {
    "auth": "ZWpMQ=Vyd5zOmFsluMTkycN0ZGYmbn=WV2FtaGF2",
    "email": "your#email.com"
  }
}
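For what it's worth, the auth value in that legacy .dockercfg-style file is just base64 of username:password, so you can regenerate it by hand; the credentials below are obviously placeholders:
# Produce the base64 auth string for the dockercfg file (placeholder credentials)
echo -n 'my-dockerhub-user:my-dockerhub-password' | base64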
There were a few more errors (volume sourcePath has to be an absolute path! That's what the invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed message means), but it eventually deployed. Sorry for the novel; hoping it helps someone.