I am using gcloud to run Prow (a continuous integration server). One of my jobs creates a virtual machine, performs some tests, and then deletes that instance. I use a service account to create the VM and run the tests.
set -o errexit
cleanup() {
    gcloud compute instances delete kyma-integration-test-${RANDOM_ID}
}
gcloud config set project ...
gcloud auth activate-service-account --key-file ...
gcloud compute instances create <vm_name> \
    --metadata enable-oslogin=TRUE \
    --image debian-9-stretch-v20181009 \
    --image-project debian-cloud --machine-type n1-standard-4 --boot-disk-size 20
trap cleanup exit
gcloud compute scp --strict-host-key-checking=no --quiet <script.sh> <vm_name>:~/<script.sh>
gcloud compute ssh --quiet <vm_name> -- ./<script.sh>
After some time, I got the following error:
ERROR: (gcloud.compute.scp) INVALID_ARGUMENT: Login profile size exceeds 32 KiB. Delete profile values to make additional space.
Indeed, for that service account, the describe command returns a lot of data, for example ~70 entries in the sshPublicKeys section.
Most of these public keys refer to VM instances that have already been removed. How can I clean up this list? Or is it possible not to store those public keys at all?
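To see how close a profile is to that limit before a job fails, one option is to measure the serialized profile. This is only a rough sketch: it assumes the byte count of gcloud compute os-login describe-profile --format=json approximates what the API counts against the 32 KiB limit, which may not be exact. The function names and the default threshold are made up for illustration.

```shell
# Rough size of the active account's OS Login profile, in bytes.
# Assumption: the JSON output approximates what counts toward the 32 KiB limit.
oslogin_profile_bytes() {
  gcloud compute os-login describe-profile --format=json | wc -c | tr -d '[:space:]'
}

# Succeeds when the profile is at or above a threshold (default 30000 bytes,
# an arbitrary safety margin below 32 KiB = 32768 bytes).
oslogin_profile_near_limit() {
  [ "$(oslogin_profile_bytes)" -ge "${1:-30000}" ]
}
```

A job could call oslogin_profile_near_limit before its first scp and trigger a key cleanup only when it returns success.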
The permanent solution is to use --ssh-key-expire-after 30s.
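In a script like the one in the question, both the scp and the ssh call can register a key, so the flag probably belongs on both. Here is a minimal sketch of a wrapper, assuming gcloud compute scp accepts --ssh-key-expire-after the same way gcloud compute ssh does; gcloud_expiring and KEY_TTL are hypothetical names, not part of gcloud.

```shell
# Lifetime of the temporary OS Login key; 30s is the value from the answer.
KEY_TTL="${KEY_TTL:-30s}"

# Run "gcloud compute ssh" or "gcloud compute scp" with an expiring key.
# Usage: gcloud_expiring ssh <vm_name> -- ./script.sh
gcloud_expiring() {
  subcommand=$1
  shift
  gcloud compute "$subcommand" --quiet --ssh-key-expire-after "$KEY_TTL" "$@"
}
```

The TTL only has to cover the time it takes to establish the connection, but a value that is too short may fail on slow key propagation, so treat 30s as a starting point.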
You still need to clean up the existing keys, either with the solutions above or with a little more command kung fu like this (without grep).
for i in $(gcloud compute os-login ssh-keys list --format="table[no-heading](value.fingerprint)"); do
    echo $i;
    gcloud compute os-login ssh-keys remove --key $i || true;
done
NOTE: you have to be using the offending account. Switch to it with gcloud config set account ACCOUNT and/or gcloud auth activate-service-account --key-file=FILE or gcloud auth login.
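The same cleanup can be wrapped in a function that reports how many keys it actually removed, which is handy in CI logs. This is a sketch, not a tested production script: purge_oslogin_keys is a hypothetical name, and it acts on whichever account is currently active in gcloud.

```shell
# Remove all OS Login SSH keys of the active gcloud account.
# Prints the number of keys removed; keys that fail to delete are skipped.
purge_oslogin_keys() (
  removed=0
  fingerprints=$(gcloud compute os-login ssh-keys list \
    --format="table[no-heading](value.fingerprint)")
  for fp in $fingerprints; do
    if gcloud compute os-login ssh-keys remove --key "$fp" >/dev/null 2>&1; then
      removed=$((removed + 1))
    fi
  done
  echo "$removed"
)
```

Calling echo "removed $(purge_oslogin_keys) keys" at the start of a job would keep the profile small even if some earlier run skipped the expiry flag.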
Need a new ssh key in a script:
# KEYNAME should be something like $HOME/.ssh/google_compute_engine
ssh-keygen -t rsa -N "" -f "${KEYNAME}" -C "${USERNAME}" || true
chmod 400 ${KEYNAME}*
cat > ssh-keys <<EOF
${USERNAME}:$(cat ${KEYNAME}.pub)
EOF
Testing this solution:
# USERNAME and KEYNAME must already be set (see above)
while :; do
    rm -f ~/.ssh/google_compute_engine*
    ssh-keygen -t rsa -N "" -f "${KEYNAME}" -C "${USERNAME}" || true
    chmod 400 ${KEYNAME}*
    cat > ssh-keys <<EOF
${USERNAME}:$(cat ${KEYNAME}.pub)
EOF
    gcloud --project=test-project compute ssh --ssh-key-expire-after 30s one-bastion-to-rule-them-all -- date
    gcloud --project=test-project compute os-login ssh-keys list --format="table[no-heading](value.fingerprint)" \
        | wc -l
done
A very crude way to do the above that worked for me was:
for i in $(gcloud compute os-login ssh-keys list); do echo $i; gcloud compute os-login ssh-keys remove --key $i; done
I stopped this (with Control-C) after deleting a few tens of keys and then it worked again.
Actually, in the project metadata in the GUI, I do not see a lot of keys. Only:
gke...cidr : network-name...
sshKeys : gke-e9...
SSH Keys => peter_v : ssh-rsa my public key
These keys are stored in your project metadata. You can remove them by deleting them through the Google Cloud Console UI.
Seeing as you were mentioning OS Login in your question: there is a way to delete specific SSH keys from a user's profile using this command. Alternatively, instead of performing SCP, I'd advise you, much like John Hanley has, to put the file you're copying into the instance in Storage and retrieve it via a startup script (you could also use a custom Compute image).
In my case, I was using another service account to run ssh, so basically I was using impersonation.
If you are using impersonation too, you need to delete the SSH key list of the service account you're impersonating.
for i in $(gcloud compute os-login ssh-keys list --impersonate-service-account="your_sc@serviceaccount.com" --format="table[no-heading](value.fingerprint)");
do echo $i;
gcloud compute os-login ssh-keys remove --key $i --impersonate-service-account="your_sc@serviceaccount.com" || true;
done
Then add --ssh-key-expire-after=7m to the ssh command; choose the duration according to your needs:
gcloud compute ssh ${MY_VM} --zone ${GKE_ZONE} --project ${PROJECT_ID} --tunnel-through-iap --ssh-key-expire-after=7m --impersonate-service-account="your_sc@serviceaccount.com"
I'm trying to run a Docker container on Compute Engine (not Cloud Run, as I need a long-running instance).
The container works fine on the local machine, it can access GCloud account resources.
I can see that it's running on a Container-Optimized OS on Compute
I tried running the following commands in order, but I get the same error: CONSUMER_INVALID. The IAM account has all the required permissions; I triple-checked this.
# Fix gcloud issue
alias gcloud='(docker images google/cloud-sdk || docker pull google/cloud-sdk) > /dev/null;docker run -t -i --net=host -v $HOME/.config:/.config -v /var/run/docker.sock:/var/run/docker.sock -v /usr/bin/docker:/usr/bin/docker google/cloud-sdk gcloud'
# Set up gcloud
# Enable Docker access through gcloud
gcloud auth configure-docker
Need to run
docker-credential-gcr configure-docker
export GOOGLE_CLOUD_PROJECT=project-352014
Not sure what to do now; it seems Compute Engine isn't communicating properly with the internal account resources?
I need to execute commands on my Compute Engine VM. We need an initial setup for the SQL and the plan is to use cloud build (will only be triggered once) for this; IAP is implemented and Firewall rule is already in place. (Allow TCP 22 from
This is my build step:
# Setup Cloud SQL
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  id: 'Setup Cloud SQL Tables'
  entrypoint: 'bash'
  args:
  - -c
  - |
    echo "Upload File to $_SQL_JUMP_BOX_NAME" &&
    gcloud compute scp --recurse cloud-sql/setup-sql.sh --tunnel-through-iap --zone $_ZONE "$_SQL_JUMP_BOX_NAME:~" &&
    echo "SSH to $_SQL_JUMP_BOX_NAME" &&
    gcloud compute ssh --tunnel-through-iap --zone $_ZONE "$_SQL_JUMP_BOX_NAME" --project "$_TARGET_PROJECT_ID" --command="chmod +x setup-sql.sh && ./setup-sql.sh"
I am receiving this error:
root@compute.3726515935009049919: Permission denied (publickey).
To increase the performance of the tunnel, consider installing NumPy. For instructions,
please see https://cloud.google.com/iap/docs/using-tcp-forwarding#increasing_the_tcp_upload_bandwidth
root@compute.3726515935009049919: Permission denied (publickey).
ERROR: (gcloud.compute.scp) Could not SSH into the instance. It is possible that your SSH key has not propagated to the instance yet. Try running this command again. If you still cannot connect, verify that the firewall and instance are set to accept ssh traffic.
This will also be triggered and executed in multiple environments, which is why we use Cloud Build for reusability.
Already working!
I stumbled upon this blog -- https://hodo.dev/posts/post-14-cloud-build-iap/
I changed my script: the user needs to be specified in the SCP/SSH commands.
Working Script/Step:
# Setup Cloud SQL
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  id: 'Setup Cloud SQL Tables'
  entrypoint: 'bash'
  args:
  - -c
  - |
    echo "Upload File to $_SQL_JUMP_BOX_NAME" &&
    gcloud compute scp --recurse cloud-sql/setup-sql.sh --tunnel-through-iap --zone $_ZONE cloudbuild@$_SQL_JUMP_BOX_NAME:~ &&
    echo "SSH to $_SQL_JUMP_BOX_NAME" &&
    gcloud compute ssh --tunnel-through-iap --zone $_ZONE cloudbuild@$_SQL_JUMP_BOX_NAME --project "$_TARGET_PROJECT_ID" --command="chmod +x setup-sql.sh && ./setup-sql.sh"
The change concerns how the destination VM is specified.
From:
gcloud compute ssh --tunnel-through-iap --zone $_ZONE "$_SQL_JUMP_BOX_NAME"
To:
gcloud compute ssh --tunnel-through-iap --zone $_ZONE cloudbuild@$_SQL_JUMP_BOX_NAME
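If the same upload-and-run pattern is needed for several jump boxes, the explicit user can be made a parameter. The following is a sketch under assumptions: run_script_via_iap is a hypothetical helper, it omits the --project flag used in the build step, and it assumes the given user is allowed to log in on the VM.

```shell
# Copy a script to a VM through IAP and run it there as an explicit user.
# Usage: run_script_via_iap <user> <instance> <zone> <local_script>
run_script_via_iap() (
  user=$1 instance=$2 zone=$3 script=$4
  remote_name=$(basename "$script")
  gcloud compute scp "$script" "$user@$instance:~" \
    --tunnel-through-iap --zone "$zone" &&
  gcloud compute ssh "$user@$instance" \
    --tunnel-through-iap --zone "$zone" \
    --command "chmod +x $remote_name && ./$remote_name"
)
```

In the build step above this would be called as run_script_via_iap cloudbuild "$_SQL_JUMP_BOX_NAME" "$_ZONE" cloud-sql/setup-sql.sh.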
Can I use Google Cloud's Identity-Aware Proxy (IAP) to connect to the gRPC endpoint on a TPU worker? By "TPU worker" I mean that I am creating a TPU with no associated compute instance (using gcloud compute tpus create) and I wish to connect to the gRPC endpoint found by running gcloud compute tpus describe my-tpu:
ipAddress: <XXX>
port: <YYY>
I can easily set up an SSH tunnel to connect to this endpoint from my local machine but I would like to use IAP to create that tunnel instead. I have tried the following:
gcloud compute start-iap-tunnel my-tpu 8470
but I get
- The resource 'projects/.../zones/.../instances/my-tpu' was not found
This makes sense because a TPU is not a compute instance, and the command gcloud compute start-iap-tunnel expects an instance name.
Is there any way to use IAP to tunnel to an arbitrary internal IP address? Or more generally, is there any other way that I can use IAP to create a tunnel to my TPU worker?
Yes, it can be done using the internal IP address of the TPU worker. Here is an example:
gcloud alpha compute start-iap-tunnel \
    <tpu_internal_ip> 8470 \
    --local-host-port="localhost:$LOCAL_PORT" \
    --region $REGION \
    --network $SUBNET \
    --project $PROJECT
Be aware that Private Google Access must be enabled in the TPU subnet, which can be done with the following command:
gcloud compute networks subnets update $SUBNET \
    --region=$REGION \
    --enable-private-ip-google-access
Just as a reference, here is an example of how to create a TPU worker with no external IP address:
gcloud alpha compute tpus tpu-vm create <tpu_name> \
    --project $PROJECT \
    --zone $ZONE \
    --internal-ips \
    --version tpu-vm-tf-2.6.0 \
    --accelerator-type v2-8 \
    --network $SUBNET
To successfully authenticate the endpoint source of the IAP tunnel, you need to add the SSH keys to the project's metadata following these steps:
Check if you already have SSH keys generated in your endpoint:
ls -1 ~/.ssh/*
/. . ./id_rsa
/. . ./id_rsa.pub
If you don't have any, you can generate them with the command: ssh-keygen -t rsa -f ~/.ssh/id_rsa -C id_rsa.
Add the SSH keys to your project's metadata:
gcloud compute project-info add-metadata \
    --metadata ssh-keys="$(gcloud compute project-info describe \
    --format="value(commonInstanceMetadata.items.filter(key:ssh-keys).firstof(value))")
$(whoami):$(cat ~/.ssh/id_rsa.pub)"
Updated [https://www.googleapis.com/compute/v1/projects/$GCP_PROJECT_NAME].
Assign the iap.tunnelResourceAccessor role to the user:
gcloud projects add-iam-policy-binding $GCP_PROJECT_NAME \
    --member=user:$USER_ID \
    --role=roles/iap.tunnelResourceAccessor
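The entry added to the ssh-keys metadata in the steps above has the form USERNAME:PUBLIC_KEY. Before pushing it to the project, it can help to build and sanity-check that line separately. This is a small sketch; ssh_key_entry is a hypothetical helper, and the check on the key prefix is a loose heuristic rather than a full validation.

```shell
# Print one "USERNAME:PUBLIC_KEY" entry in the format used by the ssh-keys
# project metadata value. Fails if the public key file is missing or does
# not look like an OpenSSH public key.
ssh_key_entry() (
  user=$1 pubkey_file=$2
  [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; exit 1; }
  key=$(cat "$pubkey_file")
  case "$key" in
    ssh-*|ecdsa-*) printf '%s:%s\n' "$user" "$key" ;;
    *) echo "$pubkey_file is not an OpenSSH public key" >&2; exit 1 ;;
  esac
)
```

For example, ssh_key_entry "$(whoami)" ~/.ssh/id_rsa.pub prints the line that the add-metadata step appends to the existing keys.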
I am currently running 29 instances across the available regions on GCP, and I need all of them to have a Python script file.
As I was getting tired of uploading it manually through the console 29 times, I was wondering if there's a way to upload the script to only one instance and copy it over to the 28 other instances with the gcloud compute scp command?
Currently, I was trying the following:
sudo gcloud compute scp --zone='asia-east1-b' /home/file.txt instance-asia-east1:/home/
The code above is trying to scp "file.txt" over to the instance-asia-east1.
I added sudo because I was running into permission issues. But after adding sudo, I get another error message:
root# Permission denied (publickey).
lost connection
ERROR: (gcloud.compute.scp) [/usr/bin/scp] exited with return code [1].
What can be the issue, and how can I resolve this?
You should avoid using sudo.
If you add --verbosity=debug to any gcloud command (here, gcloud compute ssh or gcloud compute scp), you'll see that gcloud invokes your host's ssh and scp binaries (probably in /usr/bin). It uses a private key that gcloud generated using your credentials (the account shown by gcloud config get account, or the active account in gcloud auth list).
gcloud compute scp \
    ${PWD}/${FILE} \
    ${INSTANCE}:~ \
    --project=${PROJECT} \
    --zone=${ZONE} \
    --verbosity=debug
DEBUG: Running [gcloud.compute.scp] with arguments: ...
DEBUG: Current SSH keys in project: ['...:ssh-rsa ... user@host']
DEBUG: Running command [/usr/bin/scp -i .../.ssh/google_compute_engine -o ...
INFO: Display format: "default"
DEBUG: SDK update checks are disabled.
NOTE /usr/bin/scp -i .../.ssh/google_compute_engine ...
When you run as sudo, even if you copy your credentialed user's google_compute_engine SSH keys (to e.g. /root/.ssh), the authenticated user won't match, unless you also duplicate the gcloud config...
I recommend you solve the permission issue that triggered your use of sudo.
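Once scp works without sudo, the copy to all instances can be done in a loop from the machine that has the file. Below is a sketch under a few assumptions: copy_to_instances is a hypothetical helper, the list file format is my own choice, and the file is copied into the remote user's home directory because /home/ itself is usually not writable by a regular user (which may be what caused the original permission problem).

```shell
# Copy one file to many instances as the current (non-root) user.
# The list file has one "instance zone" pair per line.
# Prints the number of failed copies.
copy_to_instances() (
  file=$1 list=$2 failed=0
  while read -r instance zone; do
    [ -n "$instance" ] || continue
    # </dev/null keeps scp/ssh from swallowing the rest of the list on stdin.
    gcloud compute scp "$file" "$instance:~/" --zone "$zone" </dev/null ||
      failed=$((failed + 1))
  done <"$list"
  echo "$failed"
)
```

With a file instances.txt containing lines such as "instance-asia-east1 asia-east1-b", running copy_to_instances /home/file.txt instances.txt copies the file everywhere and prints how many copies failed.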
From my laptop, I am able to execute most gcloud commands, for example creating a cluster and many other commands. I have the Project Owner role.
However, when I try to get credentials for a K8s cluster, I get a permission error, whereas in Cloud Shell the same command succeeds.
The logged-in account is the same in both.
% gcloud container clusters get-credentials my-first-cluster-1 --zone us-central1-c --project my-project
Fetching cluster endpoint and auth data.
ERROR: (gcloud.container.clusters.get-credentials) get-credentials requires edit permission on my-project
$ gcloud config list account --format "value(core.account)"
But in Cloud Shell, this succeeds!
$ gcloud container clusters get-credentials my-first-cluster-1 --zone us-central1-c --project my-project
Fetching cluster endpoint and auth data.
kubeconfig entry generated for my-first-cluster-1.
$ gcloud config list account --format "value(core.account)"
The error message is indeed incorrect and not very helpful in this case. This issue occurs when the gcloud config value container/use_client_certificate is set to True but no client certificate has been configured (note that client certificates are a legacy authentication method and are disabled by default for clusters created with GKE 1.12 and higher). Setting it to False with the following gcloud command solves the issue:
gcloud config set container/use_client_certificate False
This config value is set to False by default in Cloud Shell, which explains the different behavior you experienced.
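To check the setting on a laptop before changing anything, the current value can be read first. This is a minimal sketch; disable_client_certificate is a hypothetical helper, and it assumes the value is reported as True or true when enabled.

```shell
# Turn off the legacy client-certificate setting if it is enabled.
# Prints "changed" or "unchanged".
disable_client_certificate() {
  current=$(gcloud config get-value container/use_client_certificate 2>/dev/null)
  case "$current" in
    [Tt]rue)
      gcloud config set container/use_client_certificate False >/dev/null &&
        echo "changed"
      ;;
    *) echo "unchanged" ;;
  esac
}
```

After it prints "changed", rerunning gcloud container clusters get-credentials should behave the same way as in Cloud Shell.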