'No hosts succeeded' error on AWS CodeDeploy service - amazon-web-services

I am trying to set up AWS CodeDeploy for my PHP web app. I have created a CodeDeploy app and a deployment group on the AWS console. I have created the necessary revision bundle with the appspec yaml file. The revision bundle is stored on Amazon S3.
When I click 'Deploy this revision' button on the AWS console it gives me 'no hosts succeeded' error. I went through the Technical FAQ and could not find any answers. How can I counter this error?
UPDATE: I now understand that this error has something to do with Minimum Healthy Hosts count. But still I am not able to understand how does AWS calculate the healthiness of a host.

Basically what its saying is "The codedeploy service on your ec2 instance is not running"...

For why a deployment failed host health is fairly simple. A host is healthy if that host succeeded in deploying the last deployment to it. A host is unhealthy if it failed. A host is unknown if it was Skipped and had no previous deployment.
There are other aspects of host health that affect what order they are deployed to in the next deployment, but that's not going to affect you deployment failing with "No hosts succeeded".
A host can fail it's individual deployment if any of it's lifecycle events failed. A lifecycle event can fail due to service side timeout waiting for the agent to respond or because the host agent reports an error executing the command. You can check the host agent log for more details in exactly why the host agent reported a failure.
If you are hitting the server side timeouts, you should check that the host agent is running and is able to poll for commands correctly. You might have accidentally restricted access in your VPC configuration or didn't grant appropriate permissions to the instance to poll for commands in the instance profile.

This error message means you are not running CodeDeploy service at the EC2 instance targeted by your deployment group.
1) Download latest version of codedeploy from S3 (choose your region)
PS> Read-S3Object -BucketName aws-codedeploy-eu-west-1 -Key latest/codedeploy-agent.msi -File c:\temp\codedeploy-agent.msi
2) Install codedeploy
cmd> c:\temp\codedeploy-agent.msi /quiet /l c:\temp\host-agent-install-log.txt
3) Start codeploy
PS> Start-Service -Name codedeployagent
AWS CodeDeploy guide: http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-run-agent.html#how-to-run-agent-install-windows

I just ran into this issue myself. My solution was to run:
ntpdate-debian
If you are running centos it's something like
ntpdate pool.ntp.org
For me the time was off and was causing issues with the codedeploy agent.
Now, if this doesn't solve your problem. First make sure your problem is that your CodeDeploy agent is not registering. I have had this issue before and it was because one of my instances was in a failed state from a botched deployment so be sure to double check. (ELB status, tests, etc)
Then you should enable logging for your CodeDeploy agent by setting log_aws_wire and verbose to true in /etc/codedeploy-agent/conf/codedeployagent.yml and then restart the CodeDeploy. Tail the logs and you should see the reason for your problems.

Related

Getting an error when trying to register EC2 with ECS Cluster

Working on AWS and at a loss with this...
I am trying to register an EC2 instance to an ECS Cluster, the EC2 instance was launched as part of Codestar project.
Steps I have followed as per AWS documentation:
Go to ECS
Access Cluster
Click on Register External Instances
Click to next page
Copy the Curl command for Linux to register the EC2 to the Cluster.
When I input the Curl command into the Linux CLI it runs, however stalls on this line:
Trying to wait for ECS agent to start ...
Soon after I recieve an error that states:
Timed out waiting for ECS Agent to start.
Logs show:
===================================================
level=error time=2022-06-14T18:19:25Z msg="Unable to register as a container instance with ECS: InvalidParameterException: The identity document and identity document signature were not valid." module=client.go
level=error time=2022-06-14T18:19:25Z msg="Error registering container instance" error="InvalidParameterException: The identity document and identity document signature were not valid."
===================================================l
Can anyone help identify what the issue is?
TIA!
Thanks for your input - Mark B
I had a Eureka moment!
Well it dosent resolve the error I posted, but found another way around this.
Basically I do not need to use or register the EC2 instance created by codestar.
I was able to update the codepipeline 'Deploy' stage to deploy to ECS (instances) and with a few tweaks to the codestar IAM role it is working!
So can consider the matter closed.
Many thanks

Amazon Cloudwatch Agent won't start

After installing Cloudwatch Agent on Amazon Linux 2 EC2, I ran cloudwatch-agent-ctl status
This command shows the status as stopped
I tried running 'cloudwatch-agent-ctl status` and got the following message:
cwagent-otel-collector will not be started "as it has not been configured yet"
Am not sure if the above message is causing CWAgent to not start. Any pointers?
Any pointers on how to find why my CWAgent won't start?
Before you can start your CW agent, you must configure it. From docs:
Before running the CloudWatch agent on any servers, you must create a CloudWatch agent configuration file.
You can follow the docs how to setup the config files, before running the agent.

Failed to deploy application: During an aborted deployment, some instances may have deployed the new application version

I can't deploy a new version on Elastic Beanstalk.
Everything was working fine until I tried to deploy a new version where I have lots of issues (It is not the first time I deploy a new version on this environment, I already have deployed dozens). When I manage to fixe all of them I got those errors:
Failed to deploy application.
During an aborted deployment, some instances may have deployed the new application version. To ensure all instances are running the same version, re-deploy the appropriate application version.
Unsuccessful command execution on instance id(s) 'i-...'. Aborting the operation
I redeploy the version which does not work.
Here is the Elastic Beanstalk console:
Elastic Beanstalk console
Elastic Beanstalk events
The request logs button from Elastic Beanstalk return nothing.
The system log from EC2 instance shows the last working version logs.
I enable the CloudWatch logs from Configuration navigation pane. It added 4 files to CloudWatch logs:
/var/log/eb-activity.log -> empty so far
/var/log/httpd/access_log -> empty so far
/var/log/httpd/error_log -> empty so far
/environment-health.log -> Command is executing on all instances (56 minutes or more elapsed).", "Incorrect application version found on all instances. Expected version \"prod-v1.7.28-0\" (deployment 128).
It is an Amazon Linux, t2.medium instance with Apache as web server
What I already try:
Change the name of .zip each time to be different of other zip already deploy
Terminate the instance and the loadBalancer automatically create a new one
Reboot the instance
Rebuild Elastic Beanstalk environment
Deploy a simplest code
I tried to deploy just a zip with the code below but I got same errors.
<html>
<head>
<title>This is the title of the webpage!</title>
</head>
<body>
<p>This is an example paragraph. Anything in the <strong>body</strong> tag will appear on the page, just like this <strong>p</strong> tag and its contents.</p>
</body>
</html>
It always go back to last working version and when I tried to deploy the new version it does not work.
On some post I see some people telling it is maybe because the instance is too small but before it was working perfectly and the size does not change since then.
If you have some questions or ideas I will be very thankful.
Have a nice day !
Answer:
The issue was in the logs like you said. I had to ssh into my EC2 instance to reached them. The error was in the file cfn-init-cmd.log.
One of the command was waiting for an input so it timed out with no error message.
You should check the logs of the EBS for any hints as to what goes wrong with your deployment. The AWS console
can be helpful for that.
There are also the logs that can be acquired from EC2:
CloudWatch logs is another thing to check.
You should also check the autoscaling group, and see if there are any health checks there. What kind of checks are these? What's the grace period?
Here's a list of reasons that an EC2 health check could fail.
You could launch a better ec2 instance for troubleshooting.
Instance status checks.
The following are examples of problems that can cause instance status checks to fail:
Failed system status checks
Incorrect networking or startup configuration
Exhausted memory
Corrupted file system
Incompatible kernel
Also rebuilding is really a drastic step as it destroys and rebuilds all your resources. Your ELB DNS for example will be gone, any associated EIP will be released. These things can't be reclaimed.
I also faced same issue and deleted the wrong application versions. And increased the command timeout.
Default max deployment time -Command timeout- is 600 (10 minutes)
Go to Your Environment → Configuration → Deployment preferences → Command timeout
Increase the Deployment preferences higher like 1800 and then try to deploy the previous working application version. It will work.

Spinnaker clouddriver doesn't start

After Spinnaker deployment on EC2, clouddriver doesn't start. tried the same on local machine and the result is the same. Trying to run 1.6.1 on ubuntu 16.04.
I am using s3 as storage aws as cloudprovider.
After deployment spinnaker UI is accesable, when creating new application the windows hangs and error message appears in browser's console regarding localhost:8084/ credentials and 7002 port.
tried to send curl request to localhost:7002 from the server, but connection refused. 7002 port isn't being listened but all other services ports are. clouddriver start and then enters failed state (for about after 30 seconds).
For deployment I've followed this guide on official website.
Also I can't find logs of services in /var/log/spinnaker/any service/ path, there are logs only in /var/log/spinnaker/halyard/ path.
All policies/roles/users have been made in aws properly as described in official setup guide. double checked. Still facing issue.
Maybe I am missing anything?
Here is the error from browser console when trying to create new application
GET http://localhost:8084/credentials?expand=true 500 () angular.js:14525 Possibly unhandled rejection: {"data":{"error":"Internal Server Error","exception":"com.google.common.util.concurrent.UncheckedExecutionException","message":"retrofit.RetrofitError: Failed to connect to localhost/127.0.0.1:7002","status":500,"timestamp":1523484058259},"status":500,"config":{"method":"GET","transformRequest":[null],"transformResponse":[null],"jsonpCallbackParam":"callback","url":"http://localhost:8084/credentials","cache":true,"params":{"expand":true},"timeout":65000,"headers":{"X-RateLimit-App":"deck","Accept":"application/json, text/plain, */*"},"withCredentials":true},"statusText":""} undefined
Have done some tests later. Here are results.
1deployed spinnaker without s3 storage and any cloud provider - clouddriver works
2added s3 as persistent storage - clouddriver works again. Opened UI created dummy project and saw that files have been created in the s3 bucket under front50 folder. everything fine.
3added aws configurations - created user in aws, and ran this command with appropriate changes
hal config provider aws edit --access-key-id ${ACCESS_KEY_ID} \ --secret-access-key
and ran this command with appropriate changes
hal config provider aws account add $AWS_ACCOUNT_NAME \ --account-id ${ACCOUNT_ID} \ --assume-role role/spinnakerManaged
after checking aws configs with hal config provider aws the value of defaultAssumeRole=0
and after hal deploy apply again clouddriver doesn't start and I cannot create an application from UI. the window loads infinitely.
This is the option for dev spinnaker. localdebian type never worked for me as it contains extra dependencies.
Please use a Kubernetes cluster for the installation or user Minnaker for quick PoC of OSS. It runs on a K3S cluster.

Code Deploy fails without any error message

so I have been trying to setup code deploy for my application, but it keeps on failing. Initially, I didn't have an appspec.yml file in repository, so I got the error message that the appspec.yml file doesn't exist.
I have now included an appspec.yml file, but it still doesn't work and it doesn't give any error message. There are no events mentioned, like it used to before adding the appspec file.
I have less than a beginner's knowledge when it comes to creating a appspec.yml file, but I took hint from a youtube tutorial, and here is the file.
version: 0.0
os: linux
files:
- source: /
destination: /var/www/cms
If it helps, the ec2 instance is running an ubuntu server, /var/www/cms is that directory out of which the nginx is supposed to serve files.
The most likely problem you're facing is that the agent either isn't installed or the instance doesn't have sufficient permissions. When there are no events started on the instance for the deployment, it means that CodeDeploy couldn't talk to the host for some reasons.
Here's the steps I would take:
Confirm that you installed the CodeDeploy agent
Confirm that you've created the IAM service role
Confirm that you have the IAM Instance Profile and that it's associated with the instance
Check that you can reach the CodeDeploy commands endpoint in your region from the box. i.e. ping codedeploy.us-east-1.amazonaws.com Otherwise, your networking setup might be too restrictive.
Look at the logs on the host to see what's going on
I faced at sometime this thing and it was due to the following:
If we initially created and turned on the ec2 instance without setting the IAM service role, and after that we added the service role, it will not take effect until we restart the instance.
I had attached IAM role to EC2 instance but I did not restart my systemd service. And that was the cause of failure.
Also, without rebooting instance, you can just restart systemd service of codedeploy-agent.
In case it helps, I had the same problem and the reason was that codedeploy agent was not installed in the ec2 instance.
After installing it, everything worked like a charm.