AWS Elastic Beanstalk won't deploy my Rails app even once - amazon-web-services

I'm currently using the "Ruby 2.6 running on 64bit Amazon Linux 2/3.0.2" image, and by looking inside the EC2 instance at /var/logs/eb-engine.log ("eb logs" won't show me this file), I can see a recurring error:
[ERROR] failed to parse JSON file /opt/elasticbeanstalk/deployment/app_version_manifest.json with error:
json: cannot unmarshal string into Go struct field AppVersionManifest.Serial of type uint64
When I check that file, I do not know what is wrong with it, or what is preventing that file from being parsed, if that is actually the problem:
{ "RuntimeSources":{"my_api":{"my_api-source_alfa0.2":"s3url":""}}},"DeploymentId":9,"Serial":"23","VersionLabel":"my_api-source_alfa0.2"}
The serial "23" seems pretty parsable to me. Please help!

What causes this
I believe this is a bug.
In some cases, this can occur if you try to terminate or rebuild your Elastic Beanstalk environment and the operation fails to delete your AWSEBSecurityGroup.
There are reports (see comments) of other causes besides this.
How to fix it
The AWS document How do I terminate or rebuild my AWS Elastic Beanstalk environment when the AWSEBSecurityGroup fails to delete? describes how to resolve this, but I excerpted the main steps below, in case that link ever breaks:
Open the AWS CloudFormation console.
From the Stack Name column, choose the stack that failed to delete.
Note: The Status column of your stack shows DELETE_FAILED.
From the Actions menu, choose Delete Stack.
In the Delete Stack pop-up window, choose AWSEBSecurityGroup, and then choose Yes, Delete.
Terminate or rebuild the Elastic Beanstalk environment.
The linked docs have other steps if you prefer the CLI or have a more complex setup.
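If you go the CLI route, a rough equivalent of the console steps above is the following (a sketch only: the stack name is a placeholder, and --retain-resources only works once the stack is already in DELETE_FAILED):
aws cloudformation delete-stack --stack-name awseb-e-xxxxxxxxxx-stack --retain-resources AWSEBSecurityGroup
Retaining AWSEBSecurityGroup lets the stack deletion finish; you can then clean up the security group separately once nothing references it.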
Then what?
After you've deleted the group and rebuilt your environment, you won't get the app_version_manifest.json error any more. Deploy your app.
Once it's done, if you SSH in and run…
cat /opt/elasticbeanstalk/deployment/app_version_manifest.json
…you'll notice that Serial is now correctly represented as a JSON number rather than the quoted string that the Go parser was rejecting.
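If you want a quick way to confirm that without reading the whole file, a minimal check (assuming standard GNU grep on the instance) is:
grep -o '"Serial":[^,]*' /opt/elasticbeanstalk/deployment/app_version_manifest.json
A healthy manifest prints the value with no quotes around the number; the broken one above showed "Serial":"23".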

Related

Application information missing in Spinnaker after re-adding GKE accounts - using spinnaker-for-gcp

I am using a Spinnaker implementation set up on GCP using the spinnaker-for-gcp tools. My initial setup worked fine. However, we recently had to re-configure our GKE clusters (independently of Spinnaker), so I deleted and re-added our GKE accounts. After doing that, the Spinnaker UI still shows the existing GKE-based applications, but if I click on any of them there are no clusters or load balancers listed anymore! Here are the spinnaker-for-gcp commands that I executed:
$ hal config provider kubernetes account delete company-prod-acct
$ hal config provider kubernetes account delete company-dev-acct
$ ./add_gke_account.sh # for gke_company_us-central1_company-prod
$ ./add_gke_account.sh # for gke_company_us-west1-a_company-dev
$ ./push_and_apply.sh
When the above didn't work, I did an experiment where I deleted the two accounts and added an account with a different name (but the same GKE cluster) and ran push_and_apply. As before, the output messages seemed to indicate that everything worked, but the Spinnaker UI continued to show all the old account names, despite the fact that I had deleted them and added new ones (which did not show up). And, as before, no details could be seen for any of the applications. Also note that hal config provider kubernetes account list did show the new account name and did not show the old ones.
Any ideas for what I can do, other than complete recreating our Spinnaker installation? Is there anything in particular that I should look for in the Spinnaker logs in GCP to provide more information?
Thanks in advance.
-Mark
The problem turned out to be that the data that was in my .kube/config file in Cloud Shell was obsolete. Removing that file, recreating it (via the appropriate kubectl commands) and then running the commands mentioned in my original description fixed the problem.
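For anyone hitting the same thing, this is roughly what the rebuild looked like in Cloud Shell (a sketch only; the cluster, zone/region, and project names are inferred from the contexts above and will differ in your setup):
$ rm ~/.kube/config
$ gcloud container clusters get-credentials company-prod --region us-central1 --project company
$ gcloud container clusters get-credentials company-dev --zone us-west1-a --project company
# then re-run the hal account delete/add commands and ./push_and_apply.sh as listed above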
Note, though, that it took a lot of shell-script and GCP-log reading by our team to figure out the problem. Ultimately, it would have been nice if the add_gke_account.sh or push_and_apply.sh scripts could have detected the issue, presumably by verifying that the expected changes did, in fact, take effect in the running Spinnaker.

Error: Asset 'webhooks/ActionsOnGoogleFulfillment' cannot be deployed

I wanted to build a Google Assistant with custom actions using the Actions SDK. Since I am new to this, I followed the steps in the tutorial "Build Actions for Google Assistant using Actions SDK (Level 1)" exactly, in order to build a sample assistant. However, in step 5 (Implement fulfillment), when trying to test the fulfillment by running the command
gactions deploy preview
I get the following output in the terminal, ending with an error:
Sending configuration files...
Sending resources...
Waiting for server to respond. It could take up to 1 minute if your cloud function needs to be redeployed.
[ERROR] Server did not return HTTP 200.
{
  "error": {
    "code": 400,
    "message": "Asset 'webhooks/ActionsOnGoogleFulfillment' cannot be deployed. [An operation on function cf-_CcGD8lKs_F_LHmFYfJZsQ-name in region us-central1 in project <my-project-id> is already in progress. Please try again later.]"
  }
}
And when I check the Google Cloud Platform -> Cloud Functions console for this project, I see the following.
Image 1 (screenshot): Cloud Platform Cloud Functions console
There is a failed cloud function deployment marked with an exclamation mark. If I delete that function, a new function is immediately deployed automatically, but with a spinning wheel symbol (loading/still deploying) instead of the exclamation mark. I cannot delete that cloud function while it is still loading/deploying. After 10-15 minutes, the spinning symbol changes back to the exclamation symbol, and if I then delete it, another one automatically appears again. It goes on like this.
Image 2 (screenshot): Cloud Platform Cloud Functions console
This problem arises only when implementing a webhook/fulfillment (step 5). For static Action responses, it deploys successfully for testing when I run "gactions deploy preview" (steps 1 to 4 are successfully implemented).
I have followed the tutorial exactly, so the code and directory structure are the same as in the tutorial (only the project ID / Actions Console project name will be different).
Github Repository for Code
Since this is only for the tutorial, at present I am not using a billing account; instead I made the following change in package.json (changed the Node version from 10 to 8):
"engines": {
"node": "8"
},
Due to this continuous automatic failed deployment, whenever I try to explicitly deploy the project, the error mentioned above occurs:
"An operation on function cf-_CcGD8lKs_F_LHmFYfJZsQ-name in region us-central1 in project <my-project-id> is already in progress. Please try again later".
Can anyone please suggest how to stop this continuous automatic failed deployment of the cloud functions, so that the function I deploy will be successfully deployed? Would really appreciate your help.
(Note: This is the first time I have posted a question in stack overflow, so please let me know if there are any mistakes or stack overflow question conventions I might not have followed. I will improve it.)
Posting this as Community Wiki as it's based on the comments.
As clarified in the comments, the issue seems to be the billing account: the tutorial mentions that it's necessary to have one set up for the Cloud Functions to be deployed correctly. Beyond that, it's not possible to deploy Cloud Functions (webhooks) without a billing account at all, so even though you are not using Node.js 10, you will still need a billing account configured for your project.
To summarize, a billing account is needed to avoid deployment failures, even if you are not using Node.js 10, as explained in the tutorial you followed.
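If it helps, linking a billing account can also be done from the command line. A rough sketch (the project ID and billing account ID below are placeholders, and newer gcloud releases expose this without the beta prefix):
gcloud beta billing projects link my-project-id --billing-account 0X0X0X-0X0X0X-0X0X0X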

GCE Instance won't recognize metadata (startup script)

I am currently having a strange issue that I am hoping I can get some help with.
I am attempting to start GCE instances with a startup script stored in Google Cloud Storage. Regardless of whether I launch the instance from the command line or the web UI, and even though the config shows the appropriate metadata pair, the logs show "INFO No startup scripts found in metadata" and my startup script does not execute.
I can see in my instance details that the metadata for the script URL exists.
But when I look in the logs, I get the "No startup scripts found in metadata" message quoted above.
Anyone have any advice?
Apologies; I figured out the problem.
startup_script_url != startup-script-url
Use hyphens, not underscores.
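For example, when attaching the script from the command line (the instance and bucket names here are placeholders), the key must be spelled with hyphens:
gcloud compute instances add-metadata my-instance --metadata startup-script-url=gs://my-bucket/startup.sh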

Jenkins triggered code deploy is failing at ApplicationStop step even though same deployment group via code deploy directly is running successfully

When I trigger a deployment via Jenkins (CodeDeploy plugin), I get the following error:
No such file or directory - /opt/codedeploy-agent/deployment-root/edbe4bd2-3999-4820-b782-42d8aceb18e6/d-8C01LCBMG/deployment-archive/appspec.yml
However, if I trigger deployment into the same deployment group via code deploy directly, and specify the same zip in S3 (obtained via Jenkins trigger), this step passes.
What does this mean, and how do I find a workaround? I am currently working on integrating a few things, and so will need to deploy via CodeDeploy and via Jenkins simultaneously. I will run the CodeDeploy-triggered deployment when I need to ensure that the smaller unit is functioning well.
Update
Just mentioning another point, in case it applies. I was previously using a different CodeDeploy "application" and "deployment group" on the same EC2 instances, and deploying via Jenkins and via CodeDeploy directly as well. In order to fix some issue (allegedly, failed deployments leaving files behind that could not be overwritten), I had deleted everything inside the /opt/codedeploy-agent/deployment-root/<directory containing deployments> directory, trying to follow what was mentioned in this answer. Note, however, that I deleted only items inside that directory. Thereafter, I started getting this "appspec.yml not found in deployment archive" error. So I then created a new application and deployment group, and I have been working with those since.
So, another point to consider is whether I should do some further cleanup, in case the Jenkins-triggered deployment is somehow still affected by those deletions (even though it refers to the new application and deployment group).
As part of its process, CodeDeploy needs to reference previous deployments for Redeployments and Deployment Rollbacks operations. These references are maintained outside of the deployment archive folders. If you delete these archives manually as you indicate, then a CodeDeploy install can get fatally corrupted: the references left to previous deployments are no longer correct or consistent, and deploys will fail.
The best thing at this point is to remove the old installation completely, and re-install. This will allow the code deploy agent to work correctly again.
I have learned the hard way not to remove/modify any of the CodeDeploy install folders or files manually. Even if you change apps or deployment groups, CodeDeploy will figure it out itself, without the need for any manual cleanup.
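For reference, a clean reinstall on an Amazon Linux / RHEL instance looks roughly like this (a sketch only; the install bucket is region-specific, so swap us-east-1 for your region):
sudo service codedeploy-agent stop
sudo yum erase -y codedeploy-agent
sudo rm -rf /opt/codedeploy-agent
cd /tmp
wget https://aws-codedeploy-us-east-1.s3.us-east-1.amazonaws.com/latest/install
chmod +x ./install
sudo ./install auto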
In order to do a deployment, the bundle needs to contain an appspec.yml file, and the file needs to be at the top-level directory. The error message suggests that the host agent can't find the appspec.yml file.
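A quick way to check your bundle before uploading (the archive name below is just an example):
unzip -l my-revision.zip | grep appspec.yml
If the path printed is anything other than appspec.yml at the top level (e.g. my-app/appspec.yml), the agent won't find it.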

"\xCB" from ASCII-8BIT to UTF-8 in Amazon Code Deploy

I get this error after creating a new instance in an autoscale group and trying to deploy. I also get this error if I create a new instance (not part of autoscale group) and deploy to it.
If I log into this newly created instance, restart the CodeDeploy agent, and try deploying again, it succeeds. It will then succeed every time.
If I create an image of this instance at this point, and use this image as base for a new auto-scale group, the deploy fails again.
Since I cannot restart the agent during the auto-scale setup, auto-scaling always fails.
Does anybody have any ideas on how to fix this?
I use AWS CodePipeline to pull from GitHub. There are no UTF-8 issues in the repo. I confirmed line endings are correct too. I converted all non-UTF-8 text files to UTF-8 to prove this.
There is a recent fix for a Ruby encoding issue in the agent:
https://github.com/aws/aws-codedeploy-agent/commit/c2f6489a8429c5f09470fa8e354c5406ec4a4d6a
For this error to happen, it is likely that you have file names containing non-ASCII ("foreign") characters among the files that will be deployed onto the instances.
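If you want to check for that, something like this run from the repository root lists any tracked files whose names contain non-ASCII bytes (a rough sketch; it assumes GNU grep built with -P support):
git ls-files | grep -P '[^\x00-\x7F]'
Updating the agent to a build that includes the linked commit should also help.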
Check the format of the appspec.yml file. You very likely have an encoding or line-ending problem there.
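A quick way to inspect that on a Linux box:
file appspec.yml
file reports the detected encoding (e.g. ASCII or UTF-8) and flags CRLF line terminators if the file has Windows line endings.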