Google Cloud Build keeps on giving me 'Can't reach database server at' - database-migration

I have been at this for days now and it is driving me crazy. Based on other posts, I have set up the following cloudbuild.yaml:
steps:
- name: gcr.io/cloud-builders/docker
  args:
  - build
  - -t
  - gcr.io/${INSTANCE_NAME}
  - .
- name: gcr.io/cloud-builders/docker
  args:
  - push
  - gcr.io/${INSTANCE_NAME}
- name: 'gcr.io/${INSTANCE_NAME}'
  entrypoint: sh
  env:
  - DATABASE_URL=postgresql://USER:PASSWORD@localhost/DATABASE?host=/cloudsql/CONNECTION_NAME
  args:
  - -c
  - |
    wget https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 -O cloud_sql_proxy
    chmod +x cloud_sql_proxy
    ./cloud_sql_proxy -instances=CONNECTION_NAME=tcp:5432 & sleep 3
    npx prisma migrate deploy
- name: gcr.io/google.com/cloudsdktool/cloud-sdk
  entrypoint: gcloud
  args:
  - run
  - deploy
  - backend
  - --image
  - gcr.io/${INSTANCE_NAME}
  - --region
  - europe-west1
images:
- gcr.io/${INSTANCE_NAME}
When running this, I am greeted by:
Step #2: 2023/02/05 13:00:49 Listening on 127.0.0.1:5432 for CONNECTION_NAME
Step #2: 2023/02/05 13:00:49 Ready for new connections
Step #2: 2023/02/05 13:00:49 Generated RSA key in 118.117245ms
Step #2: npm WARN exec The following package was not found and will be installed: prisma@4.9.0
Step #2: Prisma schema loaded from prisma/schema.prisma
Step #2: Datasource "db": PostgreSQL database "develop", schema "public" at "localhost"
Step #2:
Step #2: Error: P1001: Can't reach database server at `/cloudsql/CONNECTION_NAME`:`5432`
Step #2:
Step #2: Please make sure your database server is running at `/cloudsql/CONNECTION_NAME`:`5432`.
So even with the database URL hardcoded and the Cloud SQL proxy running, I am STILL getting this error. What am I missing?

Check the container name in your .env file and change it to postgres, since it replaces the host name in the connection string, as discussed here.
Or try the following format if you don't want to hardcode the IP address:
DB_USER=dbuser
DB_PASS=dbpass
DB_HOST=localhost
DB_PORT=5432
CLOUD_SQL_CONNECTION_NAME=/cloudsql/gcp-project-id:europe-west3:db-instance-name
DATABASE_URL=postgres://${DB_USER}:${DB_PASS}@${DB_HOST}:${DB_PORT}/${DB_BASE}?host=${CLOUD_SQL_CONNECTION_NAME}
If the instance has a public IP, try connecting through the Unix socket.
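Note, too, that the build step in the question starts the proxy in TCP mode (-instances=CONNECTION_NAME=tcp:5432) while DATABASE_URL points Prisma at the /cloudsql/CONNECTION_NAME Unix socket, which a TCP-mode proxy never creates. A minimal sketch of the two consistent combinations, using the same placeholders as the question and the flags of the v1 cloud_sql_proxy binary that the step downloads:

# Option A: keep the TCP proxy and point the URL at 127.0.0.1
./cloud_sql_proxy -instances=CONNECTION_NAME=tcp:5432 & sleep 3
DATABASE_URL="postgresql://USER:PASSWORD@127.0.0.1:5432/DATABASE" npx prisma migrate deploy

# Option B: run the proxy in Unix-socket mode so the /cloudsql/... host in the URL exists
mkdir -p /cloudsql
./cloud_sql_proxy -dir=/cloudsql -instances=CONNECTION_NAME & sleep 3
DATABASE_URL="postgresql://USER:PASSWORD@localhost/DATABASE?host=/cloudsql/CONNECTION_NAME" npx prisma migrate deploy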

Related

GCP Helm Cloud Builder

Just curious: why isn't there an officially supported helm cloud builder? It seems like a very common requirement, yet I'm not seeing one in the list here:
https://github.com/GoogleCloudPlatform/cloud-builders
I was previously using alpine/helm in my cloudbuild.yaml for my helm deployment as follows:
steps:
# Build app image
- name: gcr.io/cloud-builders/docker
  args:
  - build
  - -t
  - $_IMAGE_REPO/$_CONTAINER_NAME:$COMMIT_SHA
  - ./cloudbuild/$_CONTAINER_NAME/
# Push my-app image to Google Cloud Registry
- name: gcr.io/cloud-builders/docker
  args:
  - push
  - $_IMAGE_REPO/$_CONTAINER_NAME:$COMMIT_SHA
# Configure a kubectl workspace for this project
- name: gcr.io/cloud-builders/kubectl
  args:
  - cluster-info
  env:
  - CLOUDSDK_COMPUTE_REGION=$_CUSTOM_REGION
  - CLOUDSDK_CONTAINER_CLUSTER=$_CUSTOM_CLUSTER
  - KUBECONFIG=/workspace/.kube/config
# Deploy with Helm
- name: alpine/helm
  args:
  - upgrade
  - -i
  - $_CONTAINER_NAME
  - ./cloudbuild/$_CONTAINER_NAME/k8s
  - --set
  - image.repository=$_IMAGE_REPO/$_CONTAINER_NAME,image.tag=$COMMIT_SHA
  - -f
  - ./cloudbuild/$_CONTAINER_NAME/k8s/values.yaml
  env:
  - KUBECONFIG=/workspace/.kube/config
  - TILLERLESS=false
  - TILLER_NAMESPACE=kube-system
  - USE_GKE_GCLOUD_AUTH_PLUGIN=True
timeout: 1200s
substitutions:
  # substitutionOption: ALLOW_LOOSE
  # dynamicSubstitutions: true
  _CUSTOM_REGION: us-east1
  _CUSTOM_CLUSTER: demo-gke
  _IMAGE_REPO: us-east1-docker.pkg.dev/fakeproject/my-docker-repo
  _CONTAINER_NAME: app2
options:
  logging: CLOUD_LOGGING_ONLY
  # In this option we are providing the worker pool name that we have created in the previous step
  workerPool: 'projects/fakeproject/locations/us-east1/workerPools/cloud-build-pool'
And this was working with no issues. Then it recently just started failing with the following error, so I'm guessing a change was made somewhere:
Error: Kubernetes cluster unreachable: Get "https://10.10.2.2/version": getting credentials: exec: executable gke-gcloud-auth-plugin not found"
I get this error regularly on VMs and can work around it by setting USE_GKE_GCLOUD_AUTH_PLUGIN=True, but that does not seem to fix the issue here if I add it to the env section. So I'm looking for recommendations on how to use helm with Cloud Build. alpine/helm was just something I randomly tried and it was working for me up until now, but there are probably better solutions out there.
Thanks!
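A sketch of one possible direction, assuming the failure is simply that the alpine/helm image contains neither gcloud nor the gke-gcloud-auth-plugin that newer GKE credential flows require (so USE_GKE_GCLOUD_AUTH_PLUGIN=True has nothing to invoke there): run the Helm deploy from the cloud-sdk image and install the plugin and Helm inside the step. The apt package name and the Helm install script URL are assumptions to verify, not something taken from this thread.

# Sketch: deploy with Helm from the cloud-sdk image (verify package/script names)
- name: gcr.io/google.com/cloudsdktool/cloud-sdk
  entrypoint: bash
  env:
  - USE_GKE_GCLOUD_AUTH_PLUGIN=True
  args:
  - -c
  - |
    # Assumption: the Debian-based cloud-sdk image can install the GKE auth plugin via apt
    apt-get update -qq && apt-get install -y -qq google-cloud-sdk-gke-gcloud-auth-plugin
    # Assumption: install Helm 3 with its official install script
    curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
    gcloud container clusters get-credentials $_CUSTOM_CLUSTER --region $_CUSTOM_REGION
    helm upgrade -i $_CONTAINER_NAME ./cloudbuild/$_CONTAINER_NAME/k8s \
      --set image.repository=$_IMAGE_REPO/$_CONTAINER_NAME,image.tag=$COMMIT_SHA \
      -f ./cloudbuild/$_CONTAINER_NAME/k8s/values.yaml

Alternatively, the GoogleCloudPlatform/cloud-builders-community repository contains a helm builder that can be built once into your own project and then used as the step image in place of alpine/helm.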

gcloud alpha run deploy --set-secrets flag does not work: "should be either `latest` or a positive integer"

I am failing when trying to inject my Secret Manager secret into my Cloud Run service as an environment variable. I followed the documentation at https://cloud.google.com/sdk/gcloud/reference/alpha/run/deploy#--set-secrets
Here is the relevant portion of the cloudbuild.yml file:
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: [ 'build', '-t', 'eu.gcr.io/$PROJECT_ID/backend:$BUILD_ID', '.' ]
- name: 'gcr.io/cloud-builders/docker'
  args: [ 'push', 'eu.gcr.io/$PROJECT_ID/backend:$BUILD_ID' ]
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  entrypoint: gcloud
  args:
  - 'alpha'
  - 'run'
  - 'deploy'
  - 'backend'
  - '--image=eu.gcr.io/$PROJECT_ID/backend:$BUILD_ID'
  - '--concurrency=10'
  - '--cpu=1'
  - '--memory=512Mi'
  - '--region=europe-west4'
  - '--min-instances=1'
  - '--max-instances=2'
  - '--platform=managed'
  - '--port=8080'
  - '--timeout=3000'
  - '--set-env-vars=SQL_CONNECTION=10.0.0.3, SQL_USER=root, SQL_PASSWORD=root, SQL_DATABASE=immobilien'
  - '--set-env-vars=^#^SPRING_PROFILES_ACTIVE=prod'
  - '--set-env-vars=MAIL_SMTP_HOST=smtp.foo.com'
  - '--set-env-vars=MAIL_SMTP_PORT=993'
  - '--set-env-vars=MAIL_SMTP_USER=root'
  - '--set-secrets=[MAIL_SMTP_PASSWORD=mail_smtp_password:1]'
  - '--ingress=internal'
  - '--vpc-connector=cloud-run'
  - '--vpc-egress=private-ranges-only'
  - '--set-cloudsql-instances=abc-binder-3423:europe-west4:data'
This is the error output:
Step #2: Status: Downloaded newer image for gcr.io/google.com/cloudsdktool/cloud-sdk:latest
Step #2: gcr.io/google.com/cloudsdktool/cloud-sdk:latest
Step #2: Skipped validating Cloud SQL API and Cloud SQL Admin API enablement due to an issue contacting the Service Usage API. Please ensure the Cloud SQL API and Cloud SQL Admin API are activated (see https://console.cloud.google.com/apis/dashboard).
Step #2: Deploying container to Cloud Run service [backend] in project [abc-binder-3423] region [europe-west4]
Step #2: Deploying...
Step #2: failed
Step #2: Deployment failed
Step #2: ERROR: (gcloud.alpha.run.deploy) should be either `latest` or a positive integer
Finished Step #2
ERROR
ERROR: build step 2 "gcr.io/google.com/cloudsdktool/cloud-sdk:latest" failed: step exited with non-zero status: 1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ERROR: (gcloud.builds.submit) build caf4fae8-daae-49d4-9349-6995b1f275e8 completed with status "FAILURE"
I do not understand what is meant by
should be either `latest` or a positive integer
Don't surround the value with brackets.
--set-secrets=MAIL_SMTP_PASSWORD=mail_smtp_password:1
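For context, the square brackets in the reference docs just mark optional syntax; passed literally, gcloud presumably ends up parsing `1]` as the secret version, which is neither `latest` nor a positive integer. The two valid forms would look like:

- '--set-secrets=MAIL_SMTP_PASSWORD=mail_smtp_password:1'      # pin to version 1 of the secret
- '--set-secrets=MAIL_SMTP_PASSWORD=mail_smtp_password:latest' # always resolve the newest version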

Django's ./manage.py always causes a skaffold rebuild: is there a way to prevent this?

I develop in a local k8s cluster with minikube and skaffold, using Django and DRF for the API.
I'm working on a number of models.py files, and one thing that is starting to get annoying is that any time I run a ./manage.py command (showmigrations, makemigrations, etc.) it triggers a skaffold rebuild of the API nodes. It takes less than 10 seconds, but it is getting annoying nonetheless.
What should I exclude/include specifically from my skaffold.yaml to prevent this?
apiVersion: skaffold/v2beta12
kind: Config
build:
  artifacts:
  - image: postgres
    context: postgres
    sync:
      manual:
      - src: "**/*.sql"
        dest: .
    docker:
      dockerfile: Dockerfile.dev
  - image: api
    context: api
    sync:
      manual:
      - src: "**/*.py"
        dest: .
    docker:
      dockerfile: Dockerfile.dev
  local:
    push: false
deploy:
  kubectl:
    manifests:
    - k8s/ingress/development.yaml
    - k8s/postgres/development.yaml
    - k8s/api/development.yaml
    defaultNamespace: development
It seems that ./manage.py must be recording some state locally, and thus triggering a rebuild. You need to add these state files to your .dockerignore.
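For a Django project the usual suspects are bytecode caches and a local SQLite file that manage.py commands touch; which files are actually changing in this repo should be confirmed from the -v info output described below. A sketch of a .dockerignore for the api/ context under that assumption:

# api/.dockerignore (sketch; confirm the real offenders with `skaffold dev -v info`)
__pycache__/
*.pyc
db.sqlite3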
Skaffold normally logs at a warning level, which suppresses details of what triggers sync or rebuilds. Run Skaffold with -v info and you'll see more detail:
$ skaffold dev -v info
...
[node] Example app listening on port 3000!
INFO[0336] files added: [backend/src/foo]
INFO[0336] Changed file src/foo does not match any sync pattern. Skipping sync
Generating tags...
- node-example -> node-example:v1.20.0-8-gc9335b0ad-dirty
INFO[0336] Tags generated in 80.293621ms
Checking cache...
- node-example: Not found. Building
INFO[0336] Cache check completed in 1.844615ms
Found [minikube] context, using local docker daemon.
Building [node-example]...

Run Elasticsearch on AWS EC2 with Docker

I'm trying to run Elasticsearch with Docker on an AWS EC2 instance, but it stops a few seconds after starting. Does anyone have experience with what the problem could be?
This is my Elasticsearch config in the docker-compose.yaml:
elasticsearch:
  build:
    context: ./elasticsearch
    args:
      - ELK_VERSION=${ELK_VERSION}
  volumes:
    - elasticsearch:/usr/share/elasticsearch/data
  environment:
    - cluster.name=laradock-cluster
    - node.name=laradock-node
    - bootstrap.memory_lock=true
    - discovery.type=single-node
    - "ES_JAVA_OPTS=-Xms7g -Xmx7g"
    - xpack.security.enabled=false
    - xpack.monitoring.enabled=false
    - xpack.watcher.enabled=false
    - cluster.initial_master_nodes=laradock-node
  ulimits:
    memlock:
      soft: -1
      hard: -1
    nofile:
      soft: 65536
      hard: 65536
  ports:
    - "${ELASTICSEARCH_HOST_HTTP_PORT}:9200"
    - "${ELASTICSEARCH_HOST_TRANSPORT_PORT}:9300"
  depends_on:
    - php-fpm
  networks:
    - frontend
    - backend
And this is my Dockerfile:
FROM docker.elastic.co/elasticsearch/elasticsearch:7.5.1
RUN /usr/share/elasticsearch/bin/elasticsearch-plugin install --batch discovery-ec2
EXPOSE 9200 9300
Also, I ran sysctl -w vm.max_map_count=655360 on my AWS EC2 instance.
Note: my AWS EC2 instance runs Ubuntu 18.04.
Thanks
I am not sure about your docker-compose.yaml, since your Dockerfile does not reference it, but I was able to reproduce the issue. I launched the same Ubuntu 18.04 in my AWS account and used your Dockerfile to launch an ES container with the commands below:
docker build --tag=elasticsearch-custom .
docker run -ti -v /usr/share/elasticsearch/data elasticsearch-custom
And my docker container was also stopping just after starting up as shown below:
ubuntu@ip-172-31-32-95:~$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
03cde4a19389 elasticsearch-custom "/usr/local/bin/dock…" 33 seconds ago Exited (78) 6 seconds ago mystifying_napier
When I checked the console logs while starting the container, I found the error below:
ERROR: [1] bootstrap checks failed [1]: the default discovery settings
are unsuitable for production use; at least one of
[discovery.seed_hosts, discovery.seed_providers,
cluster.initial_master_nodes] must be configured
This is a very well-known error and can be resolved simply by adding -e "discovery.type=single-node" to the docker run command. After adding this to the docker run command as below:
docker run -e "discovery.type=single-node" -ti -v /usr/share/elasticsearch/data elasticsearch-custom
it runs fine, as shown below:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
191fc3dceb5a elasticsearch-custom "/usr/local/bin/dock…" 8 minutes ago Up 8 minutes 9200/tcp, 9300/tcp recursing_elgamal
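Back to the docker-compose file in the question: it already sets discovery.type=single-node, but it also sets cluster.initial_master_nodes, and on Elasticsearch 7.x those two settings are mutually exclusive, so the node refuses to start when both are present (an observation worth verifying against the container logs, not something taken from the answer above). A single-node environment block would keep only the former:

environment:
  - cluster.name=laradock-cluster
  - node.name=laradock-node
  - bootstrap.memory_lock=true
  - discovery.type=single-node   # do not also set cluster.initial_master_nodes
  - "ES_JAVA_OPTS=-Xms7g -Xmx7g"
  - xpack.security.enabled=false
  - xpack.monitoring.enabled=false
  - xpack.watcher.enabled=false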

Ansible docker_container 'no Host in request URL', docker pull works correctly

I'm trying to provision my infrastructure on AWS using Ansible playbooks. I have the instance and am able to provision docker-engine, docker-py, etc., and, I swear, this worked correctly yesterday; I haven't changed the code since.
The relevant portion of my playbook is:
- name: Ensure AWS CLI is available
  pip:
    name: awscli
    state: present
  when: aws_deploy
- block:
  - name: Add .boto file with AWS credentials.
    copy:
      content: "{{ boto_file }}"
      dest: ~/.boto
    when: aws_deploy
  - name: Log in to docker registry.
    shell: "$(aws ecr get-login --region us-east-1)"
    when: aws_deploy
  - name: Remove .boto file with AWS credentials.
    file:
      path: ~/.boto
      state: absent
    when: aws_deploy
- name: Create docker network
  docker_network:
    name: my-net
- name: Start Container
  docker_container:
    name: example
    image: "{{ docker_registry }}/example"
    pull: true
    restart: true
    network_mode: host
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone
My {{ docker_registry }} is set to my-acct-id.dkr.ecr.us-east-1.amazonaws.com and the result I'm getting is:
"msg": "Error pulling my-acct-id.dkr.ecr.us-east-1.amazonaws.com/example - code: None message: Get http://: http: no Host in request URL"
However, as mentioned, this worked correctly last night. Since then I've made some VPC/subnet changes, but I'm able to ssh to the instance, and run docker pull my-acct-id.dkr.ecr.us-east-1.amazonaws.com/example with no issues.
Googling has not led me very far, as I can't seem to find other folks with the same error. I'm wondering what changed, and how I can fix it! Thanks!
EDIT: Versions:
ansible - 2.2.0.0
docker - 1.12.3 6b644ec
docker-py - 1.10.6
I had the same problem. Downgrading the docker-compose pip package on that host machine from 1.9.0 to 1.8.1 solved it.
- name: Install docker-compose
  pip: name=docker-compose version=1.8.1
Per this thread: https://github.com/ansible/ansible-modules-core/issues/5775, the real culprit is requests. This fixes it:
- name: fix requests
  pip: name=requests version=2.12.1 state=forcereinstall
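A sketch of where that pin could sit in the playbook from the question, ordered before the docker_network and docker_container tasks so the downgraded requests is already in place when the Docker modules run:

- name: fix requests
  pip:
    name: requests
    version: "2.12.1"
    state: forcereinstall
- name: Create docker network
  docker_network:
    name: my-net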