Terraform destroys and re-adds the same resources - google-cloud-platform

I'm facing a difficult issue to resolve.
I'm terraforming the deployment of multiple resources on GCP.
Those resources are all created through the Terraform GCP network module (https://github.com/terraform-google-modules/terraform-google-network).
I'm building 2 projects with a shared VPC and some subnetworks. Easy at first glance.
The first terraform init/plan & apply was OK; the tfstate file is on a GCS backend with versioning set to true.
Today, I launched a terraform plan on it to check that everything was OK before making some modifications.
The output of the plan is telling me that Terraform wants to destroy some resources ... and recreate (add) ... strictly the same resources ...
The code is in our Bitbucket repo, with no changes on it since the last apply, which was OK.
I tried to retrieve an old version of the tfstate files and to disable the GCS backend so I could debug and correct it locally, but I can't find a way to refresh the current state.
I tried these approaches:
terraform refresh
terraform import (40 resources by hand ... and even though the import commands work, the plan command still wants to destroy my existing resources and recreate strictly the same ones ...)
So I'm wondering if you have already encountered the same problem.
If so, how did you manage it?
I can share my source on demand.
Terraform v0.12.9
provider.google v2.19.0
provider.google-beta v3.3.0
provider.null v2.1.2
provider.random v2.2.1

OK, big rookie mistake; terraform providers saved my day. No version constraint was set on the module source ... I just defined one, re-ran the plan, and everything was fine again.
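For anyone hitting the same thing, the fix amounts to pinning the module version in the module block. A minimal sketch, assuming the registry source for the network module (the version number and arguments here are illustrative, not the actual code):

$ cat main.tf
module "vpc" {
  source  = "terraform-google-modules/network/google"
  version = "~> 1.5.0"    # hypothetical pin; use the release your state was built with

  # illustrative values only
  project_id   = "my-host-project"
  network_name = "shared-vpc"
  subnets      = []
}
$ terraform init
$ terraform plan    # with the version pinned, the plan should no longer propose destroy/recreate

If the module is referenced by its GitHub URL rather than the registry, the equivalent is pinning a ?ref=<tag> on the source.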

Related

How can I destroy a specific Terraform managed resource?

I have a DMS task that failed and isn't resuming or restarting. Unfortunately, according to AWS Support, the only recourse is to destroy and recreate it. I have a large infrastructure that takes several hours to destroy and recreate with Terraform. I'm running Terraform version 1.2.X with the AWS provider version 4.17.0.
I tried running terraform plan -destroy -target="<insert resource_type>.<insert resource_name>". I tried with and without quotes, double hyphens prior to the target option, module names, etc. Every time the result comes back with this error:
Either you have not created any objects yet or the existing objects were...
My hierarchy is this: Main module -> sub module -> resource. My spelling and punctuation are correct.
I've Googled it. I find only the HashiCorp documentation, which specifies the syntax but not the naming convention, and bug reports from years ago. How do I selectively destroy a resource?
It turns out I wasn't naming my resource correctly.
After some trial and error, I ran a destroy plan for my entire infrastructure (terraform plan <insert module runtime params> -destroy). Using the output from that, I found the name of the resource I wanted to destroy. The format was module.<submodule>.<resourcetype>.<resourcename>.
Once I acquired the resource name directly from Terraform, I first ran the terraform plan -destroy -target="module.<submodule>.<resourcetype>.<resourcename>" command to verify the outcome, then the terraform destroy -target="module.<submodule>.<resourcetype>.<resourcename>" command and it worked!
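Alternatively, terraform state list prints the same fully qualified addresses without generating a whole destroy plan. A short sketch with hypothetical names:

$ terraform state list | grep dms
module.database.aws_dms_replication_task.main      # hypothetical address
$ terraform plan -destroy -target="module.database.aws_dms_replication_task.main"
$ terraform destroy -target="module.database.aws_dms_replication_task.main"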

Application information missing in Spinnaker after re-adding GKE accounts - using spinnaker-for-gke

I am using a Spinnaker implementation set up on GCP using the spinnaker-for-gcp tools. My initial setup worked fine. However, we recently had to re-configure our GKE clusters (independently of Spinnaker). Consequently I deleted and re-added our gke-accounts. After doing that the Spinnaker UI appears to show the existing GKE-based applications but if I click on any of them there are no clusters or load balancers listed anymore! Here are the spinnaker-for-gcp commands that I executed:
$ hal config provider kubernetes account delete company-prod-acct
$ hal config provider kubernetes account delete company-dev-acct
$ ./add_gke_account.sh # for gke_company_us-central1_company-prod
$ ./add_gke_account.sh # for gke_company_us-west1-a_company-dev
$ ./push_and_apply.sh
When the above didn't work, I did an experiment where I deleted the two accounts and added an account with a different name (but the same GKE cluster) and ran push_and_apply. As before, the output messages seemed to indicate that everything worked, but the Spinnaker UI continued to show all the old account names, despite the fact that I had deleted them and added new ones (which did not show up). And, as before, no details could be seen for any of the applications. Also note that hal config provider kubernetes account list did show the new account name and did not show the old ones.
Any ideas for what I can do, other than complete recreating our Spinnaker installation? Is there anything in particular that I should look for in the Spinnaker logs in GCP to provide more information?
Thanks in advance.
-Mark
The problem turned out to be that the data in my .kube/config file in Cloud Shell was obsolete. Removing that file, recreating it (via the appropriate kubectl commands), and then running the commands mentioned in my original description fixed the problem.
Note, though, that it took a lot of shell-script and GCP-log reading by our team to figure out the problem. Ultimately, it would have been nice if the add_gke_account.sh or push_and_apply.sh scripts could have detected the issue, presumably by verifying that the expected changes did, in fact, occur correctly in the running Spinnaker.
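For reference, a sketch of the recovery steps, assuming the standard gcloud get-credentials workflow (project, cluster, and location names are inferred from the context names above and may not match exactly):

$ rm ~/.kube/config
$ gcloud container clusters get-credentials company-prod --region us-central1 --project company
$ gcloud container clusters get-credentials company-dev --zone us-west1-a --project company
$ ./add_gke_account.sh      # for gke_company_us-central1_company-prod
$ ./add_gke_account.sh      # for gke_company_us-west1-a_company-dev
$ ./push_and_apply.sh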

Cloud Function build error - failed to get OS from config file for image

I'm seeing this Cloud Build error when I try to deploy a Cloud Function:
"Step #2 - "analyzer": [31;1mERROR: [0mfailed to initialize cache: failed to create image cache: accessing cache image "us.gcr.io/MY_PROJECT/gcf/us-central1/SOME_KEY/cache:latest": failed to get OS from config file for image 'us.gcr.io/MY_PROJECT/gcf/us-central1/SOME_KEY/cache:latest'"
I'm able to build and emulate the cloud function locally, but I can't deploy it due to this error. I was able to deploy just fine until now. I've looked everywhere and I can't find any discussion about this. Anyone know what's going on here?
UPDATE: I deployed a new function 3 days ago and now I can't seem to deploy an update to it. I get the same error. I'm fairly sure this is happening due to the lifecycle rule I set up to ensure I don't keep storing images of functions: Firebase storage artifacts is huge and keeps increasing. This rule is important to keep around because I don't want to pay for unnecessary storage, but it seems like it might be the source of our problem here. Can someone from Google look into this?
I got the same error, even for code that deployed successfully before.
A workaround is to delete the Docker images for the failing Firebase functions inside Container Registry and re-deploying the functions. (The images will be re-created upon deploying.)
The error still occurs sporadically, so I suspect this may be a bug introduced in Firebase's deployment process. Thankfully for now, the workaround above resolves the issue every time the error comes up.
I also encountered the same problem, and solved it by deleting the images in the Container Registry of the Firebase project.
I made a script at that time and I'll put it here. The usage is as follows (a rough sketch of such a script appears after these steps); please use it if you like.
Install the Google Cloud SDK.
Download the Script
Edit CONTAINER_REGISTRY to your registry name. For example: CONTAINER_REGISTRY=asia.gcr.io/project-name/gcf/asia-northeast1
Grant execute permission. - $ chmod +x script.sh
Execute it. - $ sh script.sh
Deploy your functions.
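For reference, a sketch of what such a cleanup script might do (this is not the original script; the repository path is the example from step 3, and gcloud must already be authenticated against the project):

#!/bin/sh
# Deletes the cached Cloud Functions build images so they are rebuilt on the next deploy.
CONTAINER_REGISTRY=asia.gcr.io/project-name/gcf/asia-northeast1    # adjust region/project

for image in $(gcloud container images list --repository="$CONTAINER_REGISTRY" --format='value(name)'); do
  # remove every digest (and its tags) of this cached image
  gcloud container images list-tags "$image" --format='get(digest)' | while read -r digest; do
    gcloud container images delete --quiet --force-delete-tags "${image}@${digest}"
  done
done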
I've been having the same problem for the last few days and am in contact with support. I had the same log output, and in my case it wasn't connected to the artifacts, because the artifacts rebuild themselves automatically on deploy (read below about a subtle case related to the artifacts and how to fix it). Deleting the functions and redeploying solved it for me.
Artifacts auto cleanup
Note that if the artifacts bucket is empty, then the problem is somewhere else.
But if it's not empty, what you can do to resolve any possible problems related to the artifacts auto cleanup is to manually delete the whole "containers" folder in the artifacts bucket, which should solve it. Then just redeploy.
Make sure not to delete the artifacts bucket itself!
Doug from Firebase confirmed in the question you're referring to that removing the artifacts content is safe.
So, here is how to delete it:
Go to the Google Cloud console, select your project -> Storage -> Browser (https://console.cloud.google.com/storage/browser)
Select the "artifacts" bucket
Choose "containers" and delete it
If the problem was here, it should work fine after that.
This happens because the deletion rule you refer to in your question checks the "last updated" timestamp of each file, while on redeploy only some of the files are updated. So the next day the rule deletes some of the files while leaving the others, which leads to an inconsistent state of the bucket. In that case you just remove everything manually.
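Equivalently, assuming the default artifacts bucket naming (us.artifacts.<project-id>.appspot.com), the folder can be removed from the command line; a sketch with a placeholder project id:

$ gsutil ls gs://us.artifacts.my-project.appspot.com/            # confirm the containers/ folder is there
$ gsutil -m rm -r gs://us.artifacts.my-project.appspot.com/containers
$ firebase deploy --only functions                               # the cache is recreated on redeploy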

Importing existing resources with multiple accounts

We have four AWS accounts used to define different environments: dev, sqe, stg, prd. We're only now adopting CloudFormation, and I'd like to import an existing resource into a stack. As we roll this out, each environment will get the new stack, and I'm wondering if there's an easier way to import the resource in each environment than going through the console to import the resource while adding the stack (it would be nice if we could just deploy via our deployment system).
What I was hoping for was something I could specify in the stack definition itself (e.g., "here's a bucket that already exists, take ownership"), but I'm not finding anything. Currently it seems like the easiest route would be to create an empty stack in each environment which imports the resource and then just deploy as normal.
Also, what happens when/if an update fails and a stack gets stuck in ROLLBACK_COMPLETE? Do I have to go through this again after deleting the stack?
What you have described sounds exactly like you're after a Continuous Integration / Continuous Deployment (CI/CD) pipeline. Instead of trying to import existing resources into your accounts, you're better off designing the CloudFormation templates and then deploying them to each environment through CodePipeline. This will also provide a clean separation between the accounts, instead of importing stg resources into prd.
A fantastic example and quickstart is serverless-cicd-for-enterprise, which should serve as a good starting point for you.
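That said, if you do need to take ownership of an existing resource rather than recreate it, the import can also be driven from the CLI, so it could run from your deployment system instead of the console. A rough sketch with made-up stack, bucket, and file names (the template must already declare the resource with a DeletionPolicy):

$ cat > import.json <<'EOF'
[
  {
    "ResourceType": "AWS::S3::Bucket",
    "LogicalResourceId": "ExistingBucket",
    "ResourceIdentifier": { "BucketName": "my-existing-bucket" }
  }
]
EOF
$ aws cloudformation create-change-set \
    --stack-name my-stack \
    --change-set-name import-existing-bucket \
    --change-set-type IMPORT \
    --resources-to-import file://import.json \
    --template-body file://template.yaml
$ aws cloudformation execute-change-set \
    --stack-name my-stack \
    --change-set-name import-existing-bucket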
You can't get stuck in ROLLBACK_COMPLETE, as that is the last action a failed change set executes. It means the stack tried to update, couldn't, and has reverted to the last successful deployment. If this is the first deployment (there are no successful deployments), you will need to delete the stack and try again. However, if you have had a successful deployment, you can run a stack update.
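A quick sketch of that recovery path on a first, failed deployment (the stack name is made up):

$ aws cloudformation delete-stack --stack-name my-stack
$ aws cloudformation wait stack-delete-complete --stack-name my-stack
$ aws cloudformation deploy --stack-name my-stack --template-file template.yaml     # redeploy from scratch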

Limitations of Immutable Deployments on AWS/EB

I am trying to understand the disadvantages of immutable deployments on AWS/Elastic Beanstalk. The docs say this:
You can't perform an immutable update in concert with resource configuration changes. For example, you can't change settings that require instance replacement while also updating other settings, or perform an immutable deployment with configuration files that change configuration settings or additional resources in your source code. If you attempt to change resource settings (for example, load balancer settings) and concurrently perform an immutable update, Elastic Beanstalk returns an error.
(Source: https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/environmentmgmt-updates-immutable.html)
However, I am unable to come up with a practical scenario that would fail. I use CloudFormation templates for all of the configuration. Can the above be interpreted to mean that I cannot deploy CloudFormation changes and changes to the application (.jar) at the same time?
I would be very thankful for clarification.
Take this with a grain of salt because it's just a guess based on reading the docs. I think basic support is $40/month; it would be a good question to ask them to know for sure.
Can the above be interpreted that I cannot deploy CloudFormation changes as well as changes to the application (.jar) at the same time
I'm assuming you deploy your application .jar using a different process than your CloudFormation template. Meaning, when you deploy source code you don't use CloudFormation; you maybe use a CI/CD tool, e.g. Codeship. And when you make a change to your CloudFormation template, you log in to the AWS Console and update the template there (or use the AWS CLI tool).
Changing both at the same time would, I think, fall under what they're saying here. Don't do it, for obvious reasons; you wouldn't want CloudFormation trying to make changes to an EC2 instance at the same time that EB is shutting down that instance and starting a new one. But a more common example, I think, is if you happen to use .ebextensions for some configuration settings.
.ebextensions are a way to configure some things in EB that CloudFormation can't do, or can't do easily. They are config files that get deployed with your source code in a folder named /.ebextensions at the root of your project. An example is changing specific Linux settings: https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html
You wouldn't want to change your application code and an .ebextension at the same time. This is just my guess from reading the docs; you could test it out pretty easily.
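For illustration, a hypothetical .ebextensions file of that kind (the file name and settings are made up); shipping a change to a file like this together with application code changes in the same immutable deployment is the combination described above:

$ mkdir -p .ebextensions
$ cat > .ebextensions/00-os-tuning.config <<'EOF'
# Hypothetical example: an OS-level tweak plus an environment property,
# the kind of configuration EB applies on the instances themselves.
commands:
  01_raise_somaxconn:
    command: "sysctl -w net.core.somaxconn=1024"
option_settings:
  aws:elasticbeanstalk:application:environment:
    EXAMPLE_FLAG: "true"
EOF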