How to authenticate a gcloud service account from within a docker container

I’m trying to create a Docker container that will execute a BigQuery query. I started with the Google-provided image that already has gcloud, and I added my bash script that runs my query. I'm passing in my service account key as a file.
Dockerfile
FROM gcr.io/google.com/cloudsdktool/cloud-sdk:latest
COPY main.sh main.sh
main.sh
gcloud auth activate-service-account X@Y.iam.gserviceaccount.com --key-file=/etc/secrets/service_account_key.json
bq query --use_legacy_sql=false
The gcloud command authenticates successfully but can't save its credentials to /.config/gcloud, saying it is read-only. I've tried modifying that folder's permissions during the build but am struggling to get it right.
Is this the right approach, or is there a better way? If this is the right approach, how can I ensure gcloud can write to the necessary folder?

See the example at the bottom of the Usage section.
You ought to be able to combine this into a single docker run command:
KEY="service_account_key.json"
echo "
[auth]
credential_file_override = /certs/${KEY}
" > ${PWD}/config
docker run \
--detach \
--env=CLOUDSDK_CONFIG=/config \
--volume=${PWD}/config:/config \
--volume=/etc/secrets/${KEY}:/certs/${KEY} \
gcr.io/google.com/cloudsdktool/cloud-sdk:latest \
bq query \
--use_legacy_sql=false
Where:
--env sets the container's value for CLOUDSDK_CONFIG; this depends on the first --volume flag, which maps the host's config file that we created in ${PWD} to the container's /config.
The second --volume flag maps the host's /etc/secrets/${KEY} (per your question) to the container's /certs/${KEY}. Change as you wish.
Suitably configured (🤞), you can run bq.
I've not tried this, but it should work :-)
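Since the container runs with --detach, the query output won't show up in your terminal. Assuming it's the most recently started container on the host, one way to check the result is:
docker logs --follow $(docker ps --latest --quiet)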

Related

Terraform script to build and run Dataflow Flex template

I need to convert these two gcloud commands, which build and run Dataflow jobs, to Terraform.
gcloud dataflow flex-template build ${TEMPLATE_PATH} \
--image-gcr-path "${TARGET_GCR_IMAGE}" \
--sdk-language "JAVA" \
--flex-template-base-image ${BASE_CONTAINER_IMAGE} \
--metadata-file "/Users/b.j/g/codebase/g-dataflow/pubsub-lite/src/main/resources/g_pubsublite_to_gcs_metadata.json" \
--jar "/Users/b.j/g/codebase/g-dataflow/pubsub-lite/target/debian/pubsub-lite-0.0.1-SNAPSHOT-uber.jar" \
--env FLEX_TEMPLATE_JAVA_MAIN_CLASS="com.in.g.gr.dataflow.PubSubLiteToGCS"
gcloud dataflow flex-template run "pub-sub-lite-flex-`date +%Y%m%d-%H%M%S`" \
--template-file-gcs-location=$TEMPLATE_FILE_LOCATION \
--parameters=subscription=$SUBSCRIPTION,output=$OUTPUT_DIR,windowSize=$WINDOW_SIZE_IN_SECS,partitionLevel=$PARTITION_LEVEL,numOfShards=$NUM_SHARDS \
--region=$REGION \
--worker-region=$WORKER_REGION \
--staging-location=$STAGING_LOCATION \
--subnetwork=$SUBNETWORK \
--network=$NETWORK
I've tried using the google_dataflow_flex_template_job resource, with which I can run the Dataflow job from a stored flex template (the 2nd gcloud command). Now I need to create the template and Docker image as per my 1st gcloud command using Terraform.
Any inputs on this? And what's the best way to pass the jars used in the 1st gcloud command (placing them in a GCS bucket)?
What's the best way to pass the jars used in the 1st gcloud command (placing them in a GCS bucket)?
There is no need to manually store these jar files in GCS. The gcloud dataflow flex-template build command will build a docker container image including all the required jar files and upload the image to the container registry. This image (+ the metadata file) is the only thing needed to run the template.
Now I need to create the template and Docker image as per my 1st gcloud command using Terraform?
AFAIK there is no special terraform module to build a flex template. I'd try using the terraform-google-gcloud module, which can execute an arbitrary gcloud command, to run gcloud dataflow flex-template build.
If you build your project using Maven, another option is using jib-maven-plugin to build and upload the container image instead of using gcloud dataflow flex-template build. See these build instructions for an example. You'll still need to upload the JSON image spec ("Creating Image Spec" section in the instructions) somehow, e.g. using the gsutil command or maybe using Terraform's google_storage_bucket_object, so I think this approach is more complicated.
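As a rough sketch of that Jib-based alternative (the image name, spec file, and bucket below are placeholders, not taken from the question):
# Build and push the flex-template container image with Jib instead of gcloud
mvn compile jib:build -Dimage=gcr.io/my-project/my-flex-template
# Upload the JSON image spec to GCS
gsutil cp my-template-spec.json gs://my-bucket/templates/my-template-spec.json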

What does `gcloud compute instances create` do? - POST https://compute.googleapis.com…

Some things are very easy to do with the gcloud CLI, like:
$ export network='default' instance='example-instance' firewall='ssh-http-icmp-fw'
$ gcloud compute networks create "$network"
$ gcloud compute firewall-rules create "$firewall" --network "$network" \
--allow 'tcp:22,tcp:80,icmp'
$ gcloud compute instances create "$instance" --network "$network" \
--tags 'http-server' \
--metadata \
startup-script='#! /bin/bash
# Installs apache and a custom homepage
apt update
apt -y install apache2
cat <<EOF > /var/www/html/index.html
<html><body><h1>Hello World</h1>
<p>This page was created from a start up script.</p>
</body></html>
EOF'
$ # sleep 15s
$ curl $(gcloud compute instances list --filter='name=('"$instance"')' \
--format='value(EXTERNAL_IP)')
(to be exhaustive in commands, tear down with)
$ gcloud compute instances delete -q "$instance"
$ gcloud compute firewall-rules delete -q "$firewall"
$ gcloud compute networks delete -q "$network"
…but it's not clear what the equivalent commands are from the REST API side. Especially considering the HUGE number of options, e.g., at https://cloud.google.com/compute/docs/reference/rest/v1/instances/insert
So I was thinking to just steal whatever gcloud does internally when I write my custom REST API client for Google Cloud's Compute Engine.
Running rg, I found a bunch of lines like this one:
https://github.com/googleapis/google-auth-library-python/blob/b1a12d2/google/auth/transport/requests.py#L182
Specifically these 5 in lib/third_party:
google/auth/transport/{_aiohttp_requests.py,requests.py,_http_client.py,urllib3.py}
google_auth_httplib2/__init__.py
Below each of them I added _LOGGER.debug("With body: %s", body). But there seems to be some fancy batching going on because I almost never get that With body line 😞
Now messing with Wireshark to see what I can find, but I'm confident this is a bad rabbit hole to fall down. Ditto for https://console.cloud.google.com/home/activity.
How can I find out what body is being set by gcloud?
Add the command line option --log-http to see the REST API parameters.
There is no simple answer as the CLI changes over time. New features are added, removed, etc.
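For example, appending --log-http to the instance creation from the question prints the underlying request, including the URL, method, and JSON body:
$ gcloud compute instances create "$instance" --network "$network" \
--tags 'http-server' \
--log-http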

Invoke different entrypoints/modules when training with custom container

I've built a custom Docker container with my training application. The Dockerfile, at the moment, is something like
FROM python:slim
COPY ./src /pipelines/component/src
RUN pip3 install -U ...
...
ENTRYPOINT ["python3", "/pipelines/component/src/training.py"]
so when I run
gcloud ai-platform jobs submit training JOB_NAME \
--region=$REGION \
--master-image-uri=$IMAGE_URI
it goes as expected.
What I'd like to do is add another module, like /pipelines/component/src/tuning.py; remove the default ENTRYPOINT from the Dockerfile; and decide which module to call from the gcloud command. So I tried
gcloud ai-platform jobs submit training JOB_NAME \
--region=$REGION \
--master-image-uri=$IMAGE_URI \
--module-name=src.tuning \
--package-path=/pipelines/component/src
It returns Source directory [/pipelines/component] is not a valid directory. because it's searching for the package path on the local machine instead of inside the container. How can I solve this problem?
You can use the TrainingInput.ReplicaConfig.containerCommand field to override the Docker image's entrypoint. Here is a sample command:
gcloud ai-platform jobs submit training JOB_NAME \
--region=$REGION \
--master-image-uri=$IMAGE_URI \
--config=config.yaml
And the config.yaml content will be something like this:
trainingInput:
  scaleTier: BASIC
  masterConfig:
    containerCommand: ["python3", "/pipelines/component/src/tuning.py"]
This link has more context about the --config flag.
Similarly, you can override the Docker image's command with the containerArgs field.

How to set/get airflow variables which are in json format from command line

I can't edit the values of Airflow variables that are in JSON format through Cloud Shell.
I am using Cloud Shell to access my Airflow variable params (in JSON format), and it gives me the complete JSON when I use the following command:
gcloud composer environments run composer001 \
--location us-east1 variables \
--get params
However, I want to edit one of the values inside the JSON; how do I access that?
I referred to the documentation and various other links on Google, but I could only find how to set variables that are single values, not in JSON format.
The Cloud Composer and Airflow CLIs only operate on top-level variables, not on their JSON contents.
You can use the Airflow UI to edit your JSON variable, since the UI loads the whole variable and you can edit it in place. Or, if you need to update a specific value inside your JSON variable from the command line, you can first export your variables to a JSON file:
gcloud composer environments run \
[ENVIRONMENT] --location [LOCATION] \
variables -- --export /home/airflow/gcs/data/your-vars.json
gcloud composer environments storage data export \
--environment [ENVIRONMENT] --location [LOCATION] \
--source your-vars.json --destination .
edit the value inside the JSON using a tool like jq:
jq '.params.jsonkey = "newvalue"' your-vars.json > your-updated-vars.json
and import the updated file back to Cloud Composer:
gcloud composer environments storage data import \
--environment [ENVIRONMENT] --location [LOCATION] \
--source your-updated-vars.json
gcloud composer environments run \
[ENVIRONMENT] --location [LOCATION] \
variables -- --import /home/airflow/gcs/data/your-updated-vars.json
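To confirm the change took effect, you can read the variable back the same way as in the question:
gcloud composer environments run \
[ENVIRONMENT] --location [LOCATION] \
variables -- --get params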

Unable to Push to Google Container Registry (access denied)

When I tried to push a container image to the Container Registry, it gave me the following error:
denied: Token exchange failed for project 'my-proj-123'. Caller does not have permission 'storage.buckets.create'. To configure permissions, follow instructions at: https://cloud.google.com/container-registry/docs/access-control
I had to follow the Bucket Name Verification process to be able to create the artifacts.my-proj-123.appspot.com bucket. Now when I try to push the Docker image, it no longer complains about the storage.buckets.create permission but only gives:
denied: Access denied.
I don't know which user I need to give access to. I gave the Storage Admin role to the Compute Engine default service account, to no avail. How can I fix it?
I was able to push a Docker image to Container Registry from a Container-Optimized OS.
If you are having permission problems, I recommend giving the Compute Engine default service account at least project Editor permissions, just for testing purposes. Even if you only target Cloud Storage, other parts of the process may need more permissions. Once you finish testing, you can create a new service account with fewer permissions and fine-tune it for your needs.
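For example (for testing only; the service account's project number below is a placeholder), granting the Editor role to the Compute Engine default service account could look like this:
gcloud projects add-iam-policy-binding my-proj-123 \
--member='serviceAccount:123456789012-compute@developer.gserviceaccount.com' \
--role='roles/editor'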
Also, there is an alternative to gcloud for authentication. You can try the following:
First, download docker-credential-gcr:
VERSION=1.5.0
OS=linux # or "darwin" for OSX, "windows" for Windows.
ARCH=amd64 # or "386" for 32-bit OSs
curl -fsSL "https://github.com/GoogleCloudPlatform/docker-credential-gcr/releases/download/v${VERSION}/docker-credential-gcr_${OS}_${ARCH}-${VERSION}.tar.gz" \
| tar xz --to-stdout ./docker-credential-gcr \
> /usr/bin/docker-credential-gcr && chmod +x /usr/bin/docker-credential-gcr
After that, execute docker-credential-gcr configure-docker
Download the Compute Engine default service account JSON key.
Execute cat [your_service_account_credentials.json] | docker login -u _json_key --password-stdin https://[HOSTNAME]
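Once the login succeeds, a push should then work, e.g.:
docker push [HOSTNAME]/[PROJECT-ID]/[IMAGE]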
I hit a similar issue while I was trying to push a Docker image to GCR from Container-Optimized OS. I ran the following sequence of commands:
Created a service account and assigned Storage Admin privileges.
Downloaded the JSON key
Executed docker-credential-gcr configure-docker
Logged in with the docker command - docker login -u _json_key -p "$(cat ./mygcrserviceaccount.JSON)" https://gcr.io
Tried pushing the image to GCR - docker push gcr.io/project-id/imagename:tage01
It failed with the following error:
denied: Token exchange failed for project 'project-id'. Caller does not have permission 'storage.buckets.create'. To configure permissions, follow instructions at: https://cloud.google.com/container-registry/docs/access-control
I tried giving every possible permission to my service account through IAM roles, but it would fail with the same error.
After reading this issue, I made the following changes:
Removed the Docker config directory - rm -rf ~/.docker
Executed docker-credential-gcr configure-docker
Stored the path to the JSON key in a variable named GOOGLE_APPLICATION_CREDENTIALS:
GOOGLE_APPLICATION_CREDENTIALS=/path/to/mygcrserviceaccount.JSON
Logged in with the docker command - docker login -u _json_key -p "$(cat ${GOOGLE_APPLICATION_CREDENTIALS})" https://gcr.io
Executed the docker push command - docker push gcr.io/project-id/imagename:tage01
Voila, it worked like a charm!