Spinnaker clouddriver doesn't start - amazon-web-services

After deploying Spinnaker on EC2, clouddriver doesn't start. I tried the same on a local machine and the result is the same. I am trying to run 1.6.1 on Ubuntu 16.04.
I am using S3 as storage and AWS as the cloud provider.
After deployment the Spinnaker UI is accessible, but when creating a new application the window hangs and an error message appears in the browser's console regarding localhost:8084/credentials and port 7002.
I tried to send a curl request to localhost:7002 from the server, but the connection was refused. Nothing is listening on port 7002, although the ports of all other services are being listened on. clouddriver starts and then enters a failed state after about 30 seconds.
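A minimal sketch of the port check described above, for anyone who wants to repeat it without curl. The two ports (8084 for Gate, 7002 for clouddriver) come from the error in this question; the function names are made up for illustration, and other Spinnaker services would have to be added to the dictionary by hand.

```python
import socket

# Ports taken from the question: Gate answers on 8084, clouddriver on 7002.
SERVICE_PORTS = {"gate": 8084, "clouddriver": 7002}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


def check_services(host: str = "localhost", ports: dict = SERVICE_PORTS) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}
```

Calling check_services() on the Spinnaker host should report clouddriver as False while the service is in its failed state.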
For deployment I followed this guide on the official website.
Also, I can't find the services' logs under any /var/log/spinnaker/<service>/ path; there are logs only in /var/log/spinnaker/halyard/.
All policies, roles, and users have been created in AWS as described in the official setup guide. I double-checked, and I am still facing the issue.
Am I missing something?
Here is the error from the browser console when trying to create a new application:
GET http://localhost:8084/credentials?expand=true 500 () angular.js:14525 Possibly unhandled rejection: {"data":{"error":"Internal Server Error","exception":"com.google.common.util.concurrent.UncheckedExecutionException","message":"retrofit.RetrofitError: Failed to connect to localhost/127.0.0.1:7002","status":500,"timestamp":1523484058259},"status":500,"config":{"method":"GET","transformRequest":[null],"transformResponse":[null],"jsonpCallbackParam":"callback","url":"http://localhost:8084/credentials","cache":true,"params":{"expand":true},"timeout":65000,"headers":{"X-RateLimit-App":"deck","Accept":"application/json, text/plain, */*"},"withCredentials":true},"statusText":""} undefined
I have done some tests since. Here are the results.
1. Deployed Spinnaker without S3 storage and without any cloud provider: clouddriver works.
2. Added S3 as persistent storage: clouddriver works again. I opened the UI, created a dummy project, and saw that files had been created in the S3 bucket under the front50 folder. Everything was fine.
3. Added the AWS configuration: created a user in AWS and ran this command with appropriate changes
hal config provider aws edit --access-key-id ${ACCESS_KEY_ID} \
    --secret-access-key
and ran this command with appropriate changes
hal config provider aws account add $AWS_ACCOUNT_NAME \
    --account-id ${ACCOUNT_ID} \
    --assume-role role/spinnakerManaged
After checking the AWS config with hal config provider aws, the value shown is defaultAssumeRole=0.
After running hal deploy apply again, clouddriver doesn't start and I cannot create an application from the UI. The window loads indefinitely.
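One thing that may be worth double-checking at this step is the role ARN that the account ID and the assume-role value add up to. The sketch below assumes (this is my understanding, not something confirmed in this thread) that the two values are simply joined into a standard IAM ARN in the default "aws" partition. The function name is hypothetical.

```python
def assume_role_arn(account_id: str, assume_role: str, partition: str = "aws") -> str:
    """Build the IAM ARN that an account ID plus an assume-role value refers to."""
    # AWS account IDs are 12 digits; a typo here yields an ARN that cannot be assumed.
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError(f"not a 12-digit AWS account ID: {account_id!r}")
    # Assumption: the value passed to --assume-role is appended as-is,
    # e.g. "role/spinnakerManaged".
    return f"arn:{partition}:iam::{account_id}:{assume_role.lstrip('/')}"
```

The resulting ARN can then be tried by hand with aws sts assume-role --role-arn <arn> --role-session-name <any name>, using the same access key that was given to Halyard.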

This is the option for a dev Spinnaker. The localdebian type never worked for me, as it contains extra dependencies.
Please use a Kubernetes cluster for the installation, or use Minnaker for a quick PoC of OSS Spinnaker. It runs on a K3s cluster.

Related

Atlantis plan erroring with querying Cloud Storage failed message

I have a GCP VM to which a GCP Service Account has been attached.
This SA has the appropriate permissions to perform some terraform / terragrunt related actions, such as querying the backend configuration GCS bucket etc.
So, when I log in to the VM (to which I have already transferred my terraform configuration files), I can for example do
$ terragrunt plan
Initializing the backend...
Successfully configured the backend "gcs"! Terraform will automatically
use this backend unless the backend configuration changes.
Initializing provider plugins...
- terraform.io/builtin/terraform is built in to Terraform
- Finding hashicorp/random versions matching "3.1.0"...
- Finding hashicorp/template versions matching "2.2.0"...
- Finding hashicorp/local versions matching "2.1.0"...
.
.
.
(...and the plan goes on)
I have now set up Atlantis to run as a systemd service (under a user of the same name).
The problem is that when I create a PR, the plan (posted as a PR comment) fails as follows:
Initializing the backend...
Successfully configured the backend "gcs"! Terraform will automatically
use this backend unless the backend configuration changes.
Failed to get existing workspaces: querying Cloud Storage failed: storage: bucket doesn't exist
Does anyone know (or suspect) whether this problem may be related to the chance that the terraform service account is not, or cannot be, used by the systemd service running Atlantis? (The bucket is there, since I am able to plan manually.)
Update: I have validated that a systemd service does inherit the GCP SA by creating a systemd service that just runs this script:
#!/bin/bash
gcloud auth list
and this does output the SA of the VM.
So I changed my original question since this apparently is not the issue.
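For anyone repeating that check from a script, here is a rough sketch that picks the active account out of the text printed by gcloud auth list. It assumes the default table layout, in which the active account's row starts with an asterisk; the function name and the sample account in the comment are made up.

```python
from typing import Optional


def active_account(auth_list_output: str) -> Optional[str]:
    """Return the account marked active ("*") in `gcloud auth list` output."""
    for line in auth_list_output.splitlines():
        parts = line.split()
        # Assumed row shape: "*  my-sa@my-project.iam.gserviceaccount.com"
        if len(parts) == 2 and parts[0] == "*":
            return parts[1]
    return None
```

If the value returned inside the systemd unit matches the one returned in an interactive shell, the service account itself is probably not the cause.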
Posting my comment as an answer for visibility to other community members.
You may be getting this error because of an issue with the terraform configuration. To update it, please run the following command and see if it solves your issue.
terraform init -reconfigure

AWS: How do I continuously deploy a static website on AWS

I have a GitHub repo with static website contents (i.e. I am trying not to use EC2, but the AWS static website service). Now I want to deploy it automatically on AWS any time I change something and push it to the master branch of my GitHub repo.
Any experience or ideas for doing this?
I do this for many projects by using a Jenkins server - I happen to run it on another EC2 instance, but you could also run it on-premises if you prefer.
GitHub notifies the Jenkins server that a check-in has occurred, and a Jenkins job deploys all the files to the proper places and also notifies me by SMS (or email) that a deployment has occurred.
(Jenkins is not the only tool that can do this; there are others.)
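To make the "deploys all the files to the proper places" step a bit more concrete, here is a small sketch of what such a job might compute before uploading, assuming the target is an S3 bucket with static website hosting. It only builds the list of object keys and content types; the upload itself is left out, and the function name is hypothetical.

```python
import mimetypes
from pathlib import Path


def plan_uploads(site_root: str) -> list:
    """Return (object_key, content_type) pairs for every file under site_root."""
    root = Path(site_root)
    plan = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip directories and anything inside the repository's .git folder.
        if not path.is_file() or ".git" in relative.parts:
            continue
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        # Object keys use forward slashes regardless of the local OS.
        plan.append((relative.as_posix(), content_type))
    return plan
```

In practice a job could skip this and simply run aws s3 sync from the checked-out repository to the bucket; the sketch is only meant to show what ends up where.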

Code Deploy fails without any error message

I have been trying to set up CodeDeploy for my application, but it keeps failing. Initially, I didn't have an appspec.yml file in the repository, so I got the error message that the appspec.yml file doesn't exist.
I have now included an appspec.yml file, but it still doesn't work and it doesn't give any error message. There are no events listed, as there used to be before I added the appspec file.
I have less than a beginner's knowledge when it comes to creating an appspec.yml file, but I took a hint from a YouTube tutorial, and here is the file.
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/cms
If it helps, the EC2 instance is running Ubuntu Server, and /var/www/cms is the directory out of which nginx is supposed to serve files.
The most likely problem you're facing is that the agent either isn't installed or the instance doesn't have sufficient permissions. When no events are started on the instance for the deployment, it means that CodeDeploy couldn't talk to the host for some reason.
Here are the steps I would take:
Confirm that you installed the CodeDeploy agent
Confirm that you've created the IAM service role
Confirm that you have the IAM Instance Profile and that it's associated with the instance
Check that you can reach the CodeDeploy commands endpoint in your region from the box, e.g. ping codedeploy.us-east-1.amazonaws.com. If you can't, your networking setup might be too restrictive.
Look at the logs on the host to see what's going on
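The endpoint check in the list above can also be done over TCP, which helps when ICMP (ping) is blocked. This is only a sketch: it assumes the agent polls codedeploy-commands.<region>.amazonaws.com over HTTPS on port 443, and both function names are invented for the example.

```python
import socket


def commands_endpoint(region: str) -> str:
    """Hostname the agent is assumed to poll for commands in the given region."""
    return f"codedeploy-commands.{region}.amazonaws.com"


def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # DNS failures, refused connections and timeouts all end up here.
        return False
```

Running can_connect(commands_endpoint("us-east-1")) on the instance and getting False would point at security groups, NACLs, or a missing route out of the VPC.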
I faced this at some point, and it was due to the following:
If the EC2 instance was initially created and started without the IAM service role, and the role was added afterwards, it will not take effect until the instance is restarted.
I had attached an IAM role to the EC2 instance but did not restart my systemd service, and that was the cause of the failure.
Also, instead of rebooting the instance, you can just restart the codedeploy-agent systemd service.
In case it helps, I had the same problem, and the reason was that the CodeDeploy agent was not installed on the EC2 instance.
After installing it, everything worked like a charm.

'No hosts succeeded' error on AWS CodeDeploy service

I am trying to set up AWS CodeDeploy for my PHP web app. I have created a CodeDeploy app and a deployment group on the AWS console. I have created the necessary revision bundle with the appspec yaml file. The revision bundle is stored on Amazon S3.
When I click the 'Deploy this revision' button on the AWS console, it gives me a 'no hosts succeeded' error. I went through the Technical FAQ and could not find any answers. How can I counter this error?
UPDATE: I now understand that this error has something to do with the Minimum Healthy Hosts count. But I still don't understand how AWS calculates the health of a host.
Basically what it's saying is "The CodeDeploy service on your EC2 instance is not running"...
As for why a deployment failed, host health is fairly simple. A host is healthy if that host succeeded in deploying the last deployment to it. A host is unhealthy if it failed. A host is unknown if it was Skipped and had no previous deployment.
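The health rule described above could be sketched roughly like this. It is one reading of the rule, not CodeDeploy's actual implementation: the status strings are illustrative labels, and treating a skipped deployment as "look at the one before it" is my interpretation of the "Skipped and had no previous deployment" case.

```python
def host_health(history: list) -> str:
    """Classify a host from its deployment outcomes, listed oldest first."""
    for status in reversed(history):
        if status == "Succeeded":
            return "healthy"
        if status == "Failed":
            return "unhealthy"
        # "Skipped" says nothing about the host, so fall back to earlier deployments.
    return "unknown"
```

For example, host_health(["Succeeded", "Failed"]) gives "unhealthy", while a host that was only ever skipped comes out as "unknown".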
There are other aspects of host health that affect the order in which hosts are deployed to in the next deployment, but that's not going to cause your deployment to fail with "No hosts succeeded".
A host can fail its individual deployment if any of its lifecycle events failed. A lifecycle event can fail due to a service-side timeout waiting for the agent to respond, or because the host agent reports an error executing the command. You can check the host agent log for more details on exactly why the host agent reported a failure.
If you are hitting the server side timeouts, you should check that the host agent is running and is able to poll for commands correctly. You might have accidentally restricted access in your VPC configuration or didn't grant appropriate permissions to the instance to poll for commands in the instance profile.
This error message means you are not running CodeDeploy service at the EC2 instance targeted by your deployment group.
1) Download the latest version of the CodeDeploy agent from S3 (choose your region)
PS> Read-S3Object -BucketName aws-codedeploy-eu-west-1 -Key latest/codedeploy-agent.msi -File c:\temp\codedeploy-agent.msi
2) Install the CodeDeploy agent
cmd> c:\temp\codedeploy-agent.msi /quiet /l c:\temp\host-agent-install-log.txt
3) Start the CodeDeploy agent
PS> Start-Service -Name codedeployagent
AWS CodeDeploy guide: http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-run-agent.html#how-to-run-agent-install-windows
I just ran into this issue myself. My solution was to run:
ntpdate-debian
If you are running centos it's something like
ntpdate pool.ntp.org
For me the time was off and was causing issues with the codedeploy agent.
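Before resyncing, the clock offset can be estimated by comparing the local time with the Date header of any HTTP response. The sketch below is an illustration only: the function names are made up, the URL is left to the caller, and how much skew is tolerated is not something stated in this thread (a few minutes is commonly enough for signed AWS requests to be rejected).

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Seconds by which local_now is ahead of the time in an HTTP Date header."""
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


def measure_skew(url: str) -> float:
    """Fetch url and compare its Date header with the local UTC clock."""
    with urllib.request.urlopen(url, timeout=5) as response:
        date_header = response.headers["Date"]
    return clock_skew_seconds(date_header, datetime.now(timezone.utc))
```

A result far from zero suggests the host clock needs syncing before looking any further at the agent.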
If syncing the clock doesn't solve your problem, first make sure that the problem really is that your CodeDeploy agent is not registering. I have had this issue before, and it was because one of my instances was in a failed state from a botched deployment, so be sure to double check (ELB status, tests, etc.).
Then you should enable logging for your CodeDeploy agent by setting log_aws_wire and verbose to true in /etc/codedeploy-agent/conf/codedeployagent.yml, and then restart the CodeDeploy agent. Tail the logs and you should see the reason for your problems.
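The config change described above can be scripted. This is a sketch under the assumption that the file stores each setting on its own line as "key: value", possibly with a leading colon on the key (as in ":verbose: false"); the function name is hypothetical, and the script does not restart the agent.

```python
import re


def enable_debug_logging(config_text: str) -> str:
    """Return config_text with log_aws_wire and verbose set to true."""
    for key in ("log_aws_wire", "verbose"):
        # Match "key: value" or ":key: value" on a line of its own.
        pattern = re.compile(rf"^(:?{key}:)[ \t]*\S*[ \t]*$", re.MULTILINE)
        config_text, count = pattern.subn(r"\1 true", config_text)
        if count == 0:
            # Refuse to guess the key format when the setting is absent.
            raise ValueError(f"setting {key!r} not found in config")
    return config_text
```

The function would be applied to the contents of /etc/codedeploy-agent/conf/codedeployagent.yml, with the result written back as root before restarting the agent service.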

AWS Elastic Beanstalk - Deployment Quandary

I have an AWS Elastic Beanstalk application with an environment set up (Windows Server 2012, IIS 8, Load Balanced). When I first create the environment with a .NET application, everything appears to work just fine. However, when I redeploy the application - using the AWS tools for Visual Studio 2012 - the new version does not seem to be deployed. I see the new deployment bundle up in the proper S3 location, and the event viewer in the console indicates that everything is going fine:
Environment update is starting.
Deploying new version to instance(s).
Command execution completed successfully.
New application version was deployed to running EC2 instances.
Environment update completed successfully.
However, no new files appear on the server. Just for a check, I deleted all of the files in the c:\inetpub\wwwroot directory (the application deploys as the root app) and when the redeploy completes, I still do not see any files in this directory. I've tried to snapshot the logs, but there don't appear to be any (the list comes back empty). I've checked the deployments log files on the server itself (via RDP) and they are also empty. I've checked the server's event viewer as well - also void of any messages. It is almost as if the server is not actually running the deployment.
I am not sure what I could be doing wrong, but any guidance or suggestions are appreciated.
Have you looked under your 'Application Versions'?
It is possible that the bundle has been uploaded but is not running on the instances.
The problem was because I was using a custom AMI for the beanstalks. I found out that the AMI I was using was not beanstalk-friendly, even though I created it from a beanstalk EC2 instance that I had customized. There was something in the configuration that made the new machines not deploy properly. In any case, for now I decided that I should just update my deployment package to include the stuff I needed (e.g., C++ redistributable) as opposed to trying to customize the machine images (i.e., Command for Elastic Beanstalk configuration to install Visual C++ Redistributable).