Google Cloud Build - Terraform Self-Destruction on Build Failure - google-cloud-platform

I'm currently facing an issue with my Google Cloud Build for CI/CD.
First, I build new docker images of multiple microservices and use Terraform to create the GCP infrastructure for the containers that they will also live in production.
Then I perform some Integration / System Tests and if everything is fine I push new versions of the microservice images to the container registry for later deployment.
My problem is, that the Terraformed infrastructure doesn't get destroyed if the cloud build fails.
Is there a way to always execute a cloud build step even if some previous steps have failed, here I would want to always execute "terraform destroy"?
Or specifically for Terraform, is there a way to define a self-destructive Terraform environment?
cloudbuild.yaml example with just one docker container
steps:
# build fresh ...
- id: build
name: 'gcr.io/cloud-builders/docker'
dir: '...'
args: ['build', '-t', 'gcr.io/$PROJECT_ID/staging/...:latest', '-t', 'gcr.io/$PROJECT_ID/staging/...:$BUILD_ID', '.', '--file', 'production.dockerfile']
# push
- id: push
name: 'gcr.io/cloud-builders/docker'
dir: '...'
args: ['push', 'gcr.io/$PROJECT_ID/staging/...']
waitFor: [build]
# setup terraform
- id: terraform-init
name: 'hashicorp/terraform:0.12.28'
dir: '...'
args: ['init']
waitFor: [push]
# deploy GCP resources
- id: terraform-apply
name: 'hashicorp/terraform:0.12.28'
dir: '...'
args: ['apply', '-auto-approve']
waitFor: [terraform-init]
# tests
- id: tests
name: 'python:3.7-slim'
dir: '...'
waitFor: [terraform-apply]
entrypoint: /bin/sh
args:
- -c
- 'pip install -r requirements.txt && pytest ... --tfstate terraform.tfstate'
# remove GCP resources
- id: terraform-destroy
name: 'hashicorp/terraform:0.12.28'
dir: '...'
args: ['destroy', '-auto-approve']
waitFor: [tests]

Google Cloud Build doesn't yet support allow_failure or some similar mechanism as mentioned in this unsolved but closed issue.
What you can do, and as mentioned in the linked issue, is to chain shell conditional operators.
If you want to run a command on failure then you can do something like this:
- id: tests
name: 'python:3.7-slim'
dir: '...'
waitFor: [terraform-apply]
entrypoint: /bin/sh
args:
- -c
- pip install -r requirements.txt && pytest ... --tfstate terraform.tfstate || echo "This failed!"
This would run your test as normal and then echo This failed! to the logs if the tests fail. If you want to run terraform destroy -auto-approve on the failure then you would need to replace the echo "This failed!" with terraform destroy -auto-approve. Of course you will also need the Terraform binaries in the Docker image you are using so will need to use a custom image that has both Python and Terraform in it for that to work.
- id: tests
name: 'example-customer-python-and-terraform-image:3.7-slim-0.12.28'
dir: '...'
waitFor: [terraform-apply]
entrypoint: /bin/sh
args:
- -c
- pip install -r requirements.txt && pytest ... --tfstate terraform.tfstate || terraform destroy -auto-approve ; false"
The above job also runs false at the end of the command so that it will return a non 0 exit code and mark the job as failed still instead of only failing if terraform destroy failed as well.
An alternative to this would be to use something like Test Kitchen which will automatically stand up infrastructure, run the necessary verifiers and then destroy it at the end all in a single kitchen test command.
It's probably also worth mentioning that your pipeline is entirely serial so you don't need to use waitFor. This is mentioned in the Google Cloud Build documentation:
A build step specifies an action that you want Cloud Build to perform.
For each build step, Cloud Build executes a docker container as an
instance of docker run. Build steps are analogous to commands in a
script and provide you with the flexibility of executing arbitrary
instructions in your build. If you can package a build tool into a
container, Cloud Build can execute it as part of your build. By
default, Cloud Build executes all steps of a build serially on the
same machine. If you have steps that can run concurrently, use the
waitFor option.
and
Use the waitFor field in a build step to specify which steps must run
before the build step is run. If no values are provided for waitFor,
the build step waits for all prior build steps in the build request to
complete successfully before running. For instructions on using
waitFor and id, see Configuring build step order.

Related

Filter build steps by branch google cloud plateform

I have the below steps
steps:
# This step show the version of Gradle
- id: Gradle Install
name: gradle:7.4.2-jdk17-alpine
entrypoint: gradle
args: ["--version"]
# This step build the gradle application
- id: Build
name: gradle:7.4.2-jdk17-alpine
entrypoint: gradle
args: ["build"]
# This step run test
- id: Publish
name: gradle:7.4.2-jdk17-alpine
entrypoint: gradle
args: ["publish"]
The last step I want to do only on MASTER branch
Found one link related to this https://github.com/GoogleCloudPlatform/cloud-builders/issues/138
Its using a bash command, how can I put the gradle command inside the bash.
Update
After the suggestion answer I have updated the steps as
- id: Publish
name: gradle:7.4.2-jdk17-alpine
entrypoint: "bash"
args:
- "-c"
- |
[[ "$BRANCH_NAME" == "develop" ]] && gradle publish
The build pipeline failed with below exception
Starting Step #2 - "Publish"
Step #2 - "Publish": Already have image: gradle:7.4.2-jdk17-alpine
Finished Step #2 - "Publish"
ERROR
ERROR: build step 2 "gradle:7.4.2-jdk17-alpine" failed: starting step container failed: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "bash": executable file not found in $PATH: unknown
The suggested solution didn't work for me, I have to write the below code
# This step run test
- id: Publish
name: gradle:7.4.2-jdk17-alpine
entrypoint: "sh"
args:
- -c
- |
if [ $BRANCH_NAME == 'master' ]
then
echo "Branch is = $BRANCH_NAME"
gradle publish
fi
Current workarounds are mentioned as following :
Using different cloudbuild.yaml files for each branch
Overriding entrypoint and injecting bash as mentioned in the link:
steps:
- name: 'gcr.io/cloud-builders/docker'
entrypoint: 'bash'
args:
- '-c'
- |
echo "Here's a convenient pattern to use for embedding shell scripts in cloudbuild.yaml."
echo "This step only pushes an image if this build was triggered by a push to master."
[[ "$BRANCH_NAME" == "master" ]] && docker push gcr.io/$PROJECT_ID/image
This tutorial outlines an alternative where you check in different cloudbuild.yaml files on different development branches.
You can try the following for Gradle command as mentioned by bhito:
- id: Publish
name: gradle:7.4.2-jdk17-alpine
entrypoint: sh
args: - c
- |
[[ "$BRANCH_NAME" == "master" ]] && gradle publish
Cloud build provides configuring triggers by branch, tag, and pr . This lets you define different build configs to use for different repo events, e.g. one for prs, another for deploying to prod, etc.you can refer to the documentation on how to create and manage triggers.
you can check this blog for more updates on additional features and can go through the release notes for more Cloud build updates.
To gain some more insights on Gradle, you can refer to the link

Google Cloud Build Trigger failing with "ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1"

I am trying to setup continuous deployment of my golang backend using the Google documentation, but when my trigger fires, it fails with the following error:
starting build "eba3ce39-caad-43f0-a255-0a3cacec4913"
FETCHSOURCE
Initialized empty Git repository in /workspace/.git/
From https://source.developers.google.com/p/my-porject/r/github_myusername_myproject.com
* branch 660796f575bae6860d6f96df60cfd631a730c3ae -> FETCH_HEAD
HEAD is now at 660796f cloudbuild.yaml
BUILD
Starting Step #0
Step #0: Already have image (with digest): gcr.io/cloud-builders/docker
Step #0: unable to prepare context: unable to evaluate symlinks in Dockerfile path: lstat /workspace/Dockerfile: no such file or directory
Finished Step #0
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1
My project file structure looks like:
project
frontend
backend
main.go
cloudbuild.yaml
Dockerfile
where my cloudbuild.yaml looks like:
steps:
# Build the container image
- name: "gcr.io/cloud-builders/docker"
args:
[
"build",
"-t",
"gcr.io/my-project/github.com/username/project.com:$COMMIT_SHA",
".",
]
# Push the image to Container Registry
- name: "gcr.io/cloud-builders/docker"
args:
[
"push",
"gcr.io/my-project/github.com/username/project.com:$COMMIT_SHA",
]
# Deploy image to Cloud Run
- name: "gcr.io/cloud-builders/gcloud"
args:
- "run"
- "deploy"
- "[SERVICE_NAME]"
- "--image"
- "gcr.io/my-project/github.com/username/project.com:$COMMIT_SHA"
- "--region"
- "us-central1"
- "--platform"
- "managed"
images:
- gcr.io/my-project/github.com/username/project.com
and my Dockerfile looks like
# Use the official Golang image to create a build artifact.
# This is based on Debian and sets the GOPATH to /go.
# https://hub.docker.com/_/golang
FROM golang:1.13 as builder
# Create and change to the app directory.
WORKDIR /app
# Retrieve application dependencies.
# This allows the container build to reuse cached dependencies.
COPY go.* ./
RUN go mod download
# Copy local code to the container image.
COPY . ./
# Build the binary.
RUN CGO_ENABLED=0 GOOS=linux go build -mod=readonly -v -o server
# Use the official Alpine image for a lean production container.
# https://hub.docker.com/_/alpine
# https://docs.docker.com/develop/develop-images/multistage-build/#use-multi-stage-builds
FROM alpine:3
RUN apk add --no-cache ca-certificates
# Copy the binary to the production image from the builder stage.
COPY --from=builder /app/server /server
# Run the web service on container startup.
CMD ["/server"]
I got the Dockerfile from Quickstart: Build and Deploy
.
When you execute a push command to your github repo, the Cloud Build will triggers and look for the cloudbuild.yaml file. You can specify the cloudbuild.yaml location when you create the build trigger by editing the Configuration section and Cloud Build configuration file (yaml or json) in which you can choose the cloudbuild.yaml location. in your case just make it backend/cloudbuild.yaml.
Now, that's not enough because when the build start, docker build command will initiate to build your image as per your first step. However, your build context for docker is . which should not be because all your repo was copied to GCP and the build context here is relational to the project and not where the cloud build is.
To solve this issue just change the build context of docker to ./backend. Your cloudbuild final version should be something like:
steps:
# Build the container image
- name: "gcr.io/cloud-builders/docker"
args:
[
"build",
"-t",
"gcr.io/my-project/github.com/username/project.com:$COMMIT_SHA",
"./backend",
]
#Rest of the steps ...
The Cloud Build trigger is currently pointing to /project/ while your directory structure is as follows:
project
frontend
backend
main.go
cloudbuild.yaml
Dockerfile
When you execute the trigger, the directory workspace is copied to /workspace/, thus it cannot find the Dockerfile therein.
You can move everything to the same working directory.
.
├── main.go
├── cloudbuild.yaml
├── Dockerfile
If you would like to keep your current directory structure,your Cloud Build trigger will need to point to /project/backend/, instead. Note that you can check your directory structure using the ls -la linux command.

How can I call gcloud commands from a shell script during a build step?

I have automatic builds set up in Google Cloud, so that each time I push to the master branch of my repository, a new image is built and pushed to Google Container Registry.
These images pile up quickly, and I don't need all the old ones. So I would like to add a build step that runs a bash script which calls gcloud container images list-tags, loops the results, and deletes the old ones with gcloud container images delete.
I have the script written and it works locally. I am having trouble figuring out how to run it as a step in Cloud Builder.
It seems there are 2 options:
- name: 'ubuntu'
args: ['bash', './container-registry-cleanup.sh']
In the above step in cloudbuild.yml I try to run the bash command in the ubuntu image. This doesn't work because the gcloud command does not exist in this image.
- name: 'gcr.io/cloud-builders/gcloud'
args: [what goes here???]
In the above step in cloudbuild.yml I try to use the gcloud image, but since "Arguments passed to this builder will be passed to gcloud directly", I don't know how to call my bash script here.
What can I do?
You can customize the entry point of your build step. If you need gcloud installed, use the gcloud cloud builder and do this
step:
- name: 'gcr.io/cloud-builders/gcloud'
entrypoint: "bash"
args:
- "-c"
- |
echo "enter 1 bash command per line"
ls -la
gcloud version
...
As per the official documentation Creating custom build steps indicates, you need a custom build step to execute a shell script from your source, the step's container image must contain a tool capable of running the script.
The below example, shows how to configure your args, for the execution to perform correctly.
steps:
- name: 'ubuntu'
args: ['bash', './myscript.bash']
- name: 'gcr.io/cloud-builders/docker'
args: ['build', '-t', 'gcr.io/$PROJECT_ID/custom-script-test', '.']
images: ['gcr.io/$PROJECT_ID/custom-script-test']
I would recommend you to take a look at the above documentation and the example as well, to test and confirm if it will help you achieve the execution of the script.
For your case, specifically, there is this other answer here, where is indicated that you will need to override the endpoint of the build to bash, so the script runs. It's indicated as follow:
- name: gcr.io/cloud-builders/gcloud
entrypoint: /bin/bash
args: ['-c', 'gcloud compute instances list > gce-list.txt']
Besides that, these two below articles, include more information and examples on how to configure customized scripts to run in your Cloud Build, that I would recommend you to take a look.
CI/CD: Google Cloud Build — Custom Scripts
Mastering Google Cloud Build Config Syntax
Let me know if the information helped you!

Run node.js database migrations on Google Cloud SQL during Google Cloud Build

I would like to run database migrations written in node.js during the Cloud Build process.
Currently, the database migration command is being executed but it seems that the Cloud Build process does not have access to connect to Cloud SQL via an IP address with username/password.
In the case with Cloud SQL and Node.js it would look something like this:
steps:
# Install Node.js dependencies
- id: yarn-install
name: gcr.io/cloud-builders/yarn
waitFor: ["-"]
# Install Cloud SQL proxy
- id: proxy-install
name: gcr.io/cloud-builders/yarn
entrypoint: sh
args:
- "-c"
- "wget https://storage.googleapis.com/cloudsql-proxy/v1.20.1/cloud_sql_proxy.linux.amd64 -O cloud_sql_proxy && chmod +x cloud_sql_proxy"
waitFor: ["-"]
# Migrate database schema to the latest version
# https://knexjs.org/#Migrations-CLI
- id: migrate
name: gcr.io/cloud-builders/yarn
entrypoint: sh
args:
- "-c"
- "(./cloud_sql_proxy -dir=/cloudsql -instances=<CLOUD_SQL_CONNECTION> & sleep 2) && yarn run knex migrate:latest"
timeout: "1200s"
waitFor: ["yarn-install", "proxy-install"]
timeout: "1200s"
You would launch yarn install and download Cloud SQL Proxy in parallel. Once these two steps are complete, you run launch the proxy, wait 2 seconds and finally run yarn run knex migrate:latest.
For this to work you would need Cloud SQL Admin API enabled in your GCP project.
Where <CLOUD_SQL_INSTANCE> is your Cloud SQL instance connection name that can be found here. The same name will be used in your SQL connection settings, e.g. host=/cloudsql/example:us-central1:pg13.
Also, make sure that the Cloud Build service account has "Cloud SQL Client" role in the GCP project, where the db instance is located.
As of tag 1.16 of gcr.io/cloudsql-docker/gce-proxy, the currently accepted answer no longer works. Here is a different approach that keeps the proxy in the same step as the commands that need it:
- id: cmd-with-proxy
name: [YOUR-CONTAINER-HERE]
timeout: 100s
entrypoint: sh
args:
- -c
- '(/workspace/cloud_sql_proxy -dir=/workspace -instances=[INSTANCE_CONNECTION_NAME] & sleep 2) && [YOUR-COMMAND-HERE]'
The proxy will automatically exit once the main process exits. Additionally, it'll mark the step as "ERROR" if either the proxy or the command given fails.
This does require the binary is in the /workspace volume, but this can be provided either manually or via a prereq step like this:
- id: proxy-install
name: alpine:3.10
entrypoint: sh
args:
- -c
- 'wget -O /workspace/cloud_sql_proxy https://storage.googleapis.com/cloudsql-proxy/v1.16/cloud_sql_proxy.linux.386 && chmod +x /workspace/cloud_sql_proxy'
Additionally, this should work with TCP since the proxy will be in the same container as the command.
Use google-appengine/exec-wrapper. It is an image to do exactly this. Usage (see README in link):
steps:
- name: "gcr.io/google-appengine/exec-wrapper"
args: ["-i", "gcr.io/my-project/appengine/some-long-name",
"-e", "ENV_VARIABLE_1=value1", "-e", "ENV_2=value2",
"-s", "my-project:us-central1:my_cloudsql_instance",
"--", "bundle", "exec", "rake", "db:migrate"]
The -s sets the proxy target.
Cloud Build runs using a service account and it looks like you need to grant access to Cloud SQL for this account.
You can find additional info about setting service account permissions here.
Here's how to combine Cloud Build + Cloud SQL Proxy + Docker.
If you're running your database migrations/operations within a Docker container in Cloud Build, it won't be able to directly access your proxy, because Docker containers are isolated from the host machine.
Here's what I managed to get up and running:
- id: build
# Build your application
waitFor: ['-']
- id: install-proxy
name: gcr.io/cloud-builders/wget
entrypoint: bash
args:
- -c
- wget -O /workspace/cloud_sql_proxy https://storage.googleapis.com/cloudsql-proxy/v1.15/cloud_sql_proxy.linux.386 && chmod +x /workspace/cloud_sql_proxy
waitFor: ['-']
- id: migrate
name: gcr.io/cloud-builders/docker
entrypoint: bash
args:
- -c
- |
/workspace/cloud_sql_proxy -dir=/workspace -instances=projectid:region:instanceid & sleep 2 && \
docker run -v /workspace:/root \
--env DATABASE_HOST=/root/projectid:region:instanceid \
# Pass other necessary env variables like db username/password, etc.
$_IMAGE_URL:$COMMIT_SHA
timeout: '1200s'
waitFor: [build, install-proxy]
Because our db operations are taking place within the Docker container, I found the best way to provide the access to Cloud SQL by specifying the Unix socket -dir/workspace instead of exposing a TCP port 5432.
Note: I recommend using the directory /workspace instead of /cloudsql for Cloud Build.
Then we mounted the /workspace directory to Docker container's /root directory, which is the default directory where your application code resides. When I tried to mount it to other than /root, nothing seemed to happen (perhaps a permission issue with no error output).
Also: I noticed the proxy version 1.15 works well. I had issues with newer versions. Your mileage may vary.

Container builder dockerfile with gcloud commands

I have a container builder step
steps:
- id: dockerbuild
name: gcr.io/cloud-builders/docker
entrypoint: 'bash'
args:
- -c
- |
docker build . -t test
images: ['gcr.io/project/test']
The Dockerfile used to create this test image has gsutil specific commands like
FROM gcr.io/cloud-builders/gcloud
RUN gsutil ls
When I submit a docker build to container builder service using
gcloud container builds submit --config cloudbuild.yml
I see the following error
You are attempting to perform an operation that requires a project id, with none configured. Please re-run gsutil config and make sure to follow the instructions for finding and entering your default project id.
The command '/bin/sh -c gsutil ls' returned a non-zero code: 1
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: exit status 1
My question is, how do we use gcloud/gsutil commands inside the DockerFile so that I can run inside a docker build step ?
To invoke "gcloud commands." using the tool builder, you need Container Builder service account, because it executes your builds on your behalf.
Here in this GitHub there is an example for cloud-builders using the gcloud command:
Note : you have to specify $PROJECT_ID it's mandatory for your builder to work.
To do this, your Dockerfile either needs to start from a base image that has the cloud SDK installed already (like FROM gcr.io/cloud-builders/gcloud) or you would need to install it. Here's a Dockerfile that installs it: https://github.com/GoogleCloudPlatform/cloud-builders/blob/master/gcloud/Dockerfile.slim