How to specify preemptible GPU Deep Learning Virtual Machine on GCP - google-cloud-platform

I can't figure out how to specify a preemptible GPU Deep Learning VM on GCP.
This is what I used:
export IMAGE_FAMILY="tf-latest-gpu"
export ZONE="europe-west4-a "
export INSTANCE_NAME="deeplearning"
gcloud compute instances create $INSTANCE_NAME \
--zone=$ZONE \
--image-family=$IMAGE_FAMILY \
--image-project=deeplearning-platform-release \
--maintenance-policy=TERMINATE \
--accelerator='type=nvidia-tesla-v100,count=2' \
--metadata='install-nvidia-driver=True'
Thank you!

You can create a preemptible Compute Engine instance with a GPU by adding the --preemptible option to the gcloud command. As per your example, that would be:
export IMAGE_FAMILY="tf-latest-gpu"
export ZONE="europe-west4-a "
export INSTANCE_NAME="deeplearning"
gcloud compute instances create $INSTANCE_NAME \
--zone=$ZONE \
--image-family=$IMAGE_FAMILY \
--image-project=deeplearning-platform-release \
--maintenance-policy=TERMINATE \
--accelerator='type=nvidia-tesla-v100,count=2' \
--metadata='install-nvidia-driver=True' \
--preemptible
See the gcloud compute instances create reference and the Compute Engine documentation on preemptible VM instances for more details on the available options.
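As a quick sanity check (a minimal sketch, assuming the same INSTANCE_NAME and ZONE variables as above), you can confirm that the instance really was created as preemptible:
# Prints the scheduling.preemptible field of the new instance; expected output: True
gcloud compute instances describe $INSTANCE_NAME \
--zone=$ZONE \
--format="value(scheduling.preemptible)"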

Related

GCP create instance-template for instances with public ip

I am trying to create an instance template where an instance created from this template automatically gets a public IPv4 address assigned.
Currently I am using something like the following gcloud command:
gcloud compute instance-templates create TEMPLATENAME \
--project=PROJECT \
--machine-type=e2-small \
--network-interface=network=default,network-tier=PREMIUM \
--maintenance-policy=MIGRATE --provisioning-model=STANDARD \
--service-account=SERVICE_ACCOUNT \
--scopes=https://www.googleapis.com/auth/cloud-platform \
--tags=http-server,https-server \
--create-disk=CREATE_DISK \
--no-shielded-secure-boot \
--shielded-vtpm \
--shielded-integrity-monitoring \
--reservation-affinity=any
This command is generated by the Google Cloud Console, but I have to use gcloud since I have to use an image family to create the disk (which, to my knowledge, is not supported in the GUI).
Running this command gives me a template whose instances get no external IP. The result I want is a template whose instances are assigned an ephemeral public IPv4 address.
What am I missing?
In order to get an ephemeral external IP, the address has to be set to an empty string in the network interface flag:
--network-interface=network=default,network-tier=PREMIUM,address=''
See https://cloud.google.com/sdk/gcloud/reference/compute/instance-templates/create?hl=de#--network-interface
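To confirm the template will hand out an ephemeral external IP (a small sketch, assuming the TEMPLATENAME from the question), you can inspect the access config it generates:
# An access config of type ONE_TO_ONE_NAT with no fixed natIP means an ephemeral external IP
gcloud compute instance-templates describe TEMPLATENAME \
--format="value(properties.networkInterfaces[0].accessConfigs[0].type)"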

How to include a list of IPs while creating an egress rule via gcloud?

I am trying to create an egress firewall rule that opens specific destination IPs. Here is what I do for a single destination range:
gcloud compute firewall-rules create my-egress \
--network ${NETWORK_NAME} \
--action allow \
--rules all \
--direction egress \
--destination-ranges 43.249.72.0/22 \
--priority 1000
My question is: how can I have a list of IP ranges instead of just one (for example, 23.235.32.0/20 and 43.249.72.0/22 instead of only 43.249.72.0/22)?
After some trial-and-error I found something useful here: https://cloud.google.com/sdk/gcloud/reference/compute/firewall-rules/create
It seems you need to put the ranges inside double quotes, separated by commas with no spaces:
gcloud compute firewall-rules create my-egress \
--network ${NETWORK_NAME} \
--action allow \
--rules all \
--direction egress \
--destination-ranges "43.249.72.0/22,23.235.32.0/20" \
--priority 1000
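To check that both ranges were applied (a quick sketch, reusing the rule name from above), you can read the rule back:
# Lists the destination ranges attached to the egress rule
gcloud compute firewall-rules describe my-egress \
--format="value(destinationRanges)"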

Unable to change memory allocation in Cloud Function

I have been trying to increase the memory allocation for some of my Cloud Functions for the past few hours. If I change it and deploy, the memory allocation stays at 512 MiB. It was working when I tried it a few days back.
This is what I am doing:
Click on Edit in the function
Change the allocated memory to 2 GiB and click Next & Deploy
The allocated memory remains 512 MiB after deploying
What am I doing wrong? Can someone help me out on this, please?
I'm unable to reproduce your experience using the Cloud Console and gcloud:
gcloud functions describe ${NAME} \
--region=${REGION} \
--project=${PROJECT} \
--format="value(availableMemoryMb)"
256
Then revise it in the Console to 512 MB and:
gcloud functions describe ${NAME} \
--region=${REGION} \
--project=${PROJECT} \
--format="value(availableMemoryMb)"
512
Then revise it to 1024MB using gcloud deploy:
gcloud functions deploy ${NAME} \
--trigger-http \
--entry-point=${FUNCTION} \
--region=${REGION} \
--project=${PROJECT} \
--memory=1024MB
gcloud functions describe ${NAME} \
--region=${REGION} \
--project=${PROJECT} \
--format="value(availableMemoryMb)"
1024
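If you want to spot-check the setting across all of your functions at once (a sketch; the field projection assumes the gen1 Cloud Functions resource shape), a listing works too:
# Shows each function's name and its currently allocated memory in MB
gcloud functions list \
--project=${PROJECT} \
--format="table(name,availableMemoryMb)"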

How can I add a startup script to an existing AI notebook instance on Google Cloud?

I know how to do it when I create an instance:
gcloud compute instances create ${INSTANCE_NAME} \
--machine-type=n1-standard-8 \
--scopes=https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/userinfo.email \
--min-cpu-platform="Intel Skylake" \
${IMAGE} \
--image-project=deeplearning-platform-release \
--boot-disk-size=100GB \
--boot-disk-type=pd-ssd \
--accelerator=type=nvidia-tesla-p100,count=1 \
--boot-disk-device-name=${INSTANCE_NAME} \
--maintenance-policy=TERMINATE --restart-on-failure \
--metadata="proxy-user-mail=${GCP_LOGIN_NAME},install-nvidia-driver=True,startup-script=${STARTUP_SCRIPT}"
But what if I already have an instance? How do I update or create the startup script?
To add or update the metadata, you can use the add-metadata command like this:
gcloud compute instances add-metadata ${INSTANCE_NAME} \
--metadata startup-script=${NEW_STARTUP_SCRIPT}
The other metadata entries are kept.
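If the script lives in a file, the --metadata-from-file variant avoids quoting problems (a sketch; startup.sh is a hypothetical local file name). Keep in mind that a startup script only runs at boot, so you may need to reset or restart the instance for it to take effect:
# Replace the startup-script metadata with the contents of a local file
gcloud compute instances add-metadata ${INSTANCE_NAME} \
--metadata-from-file startup-script=startup.sh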

Cannot create gcloud instance

Following https://course.fast.ai/start_gcp.html, this is the setup:
export IMAGE_FAMILY="pytorch-latest-gpu" # or "pytorch-latest-cpu" for non-GPU instances
export ZONE="us-west2-b" # budget: "us-west1-b"
export INSTANCE_NAME="my-fastai-instance"
export INSTANCE_TYPE="n1-highmem-8" # budget: "n1-highmem-4"
# budget: 'type=nvidia-tesla-k80,count=1'
gcloud compute instances create $INSTANCE_NAME \
--zone=$ZONE \
--image-family=$IMAGE_FAMILY \
--image-project=deeplearning-platform-release \
--maintenance-policy=TERMINATE \
--accelerator="type=nvidia-tesla-p100,count=1" \
--machine-type=$INSTANCE_TYPE \
--boot-disk-size=200GB \
--metadata="install-nvidia-driver=True" \
--preemptible
Got this error:
(gcloud.compute.instances.create) Could not fetch resource:
- The resource 'projects/xxxxxx/zones/us-west2-b/acceleratorTypes/nvidia-tesla-p100' was not found
Anyone?
I tried replicating the same steps you followed from the tutorial and got the same error.
According to Google's documentation, NVIDIA-TESLA-P100 is only available in these zones:
us-west1-a
us-west1-b
us-central1-c
us-central1-f
us-east1-b
us-east1-c
europe-west1-b
europe-west1-d
europe-west4-a
asia-east1-a
asia-east1-c
australia-southeast1-c
You appear to have selected us-west2-b, which is not in that list.
Therefore, I would just change your zone to one of the previously mentioned ones.
To get this list in a more programmatic way, using the Cloud SDK for example, you could issue:
gcloud compute accelerator-types list --filter "name=nvidia-tesla-p100" --format "table[box,title=Zones](zone:sort=1)" 2>/dev/null
The error you are reporting is caused by this GPU not being available in the zone "us-west2-b"; you can review which zones offer which GPUs in the official documentation.
In this case, the zones closest to the region you are using where this GPU is available are:
us-west1-a
us-west1-b
Regards.
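Conversely, if you want to stay in the us-west2 region, you can check which accelerator types it actually offers (a sketch; the regex filter on the zone name is just one way to match the region):
# List all accelerator types available in zones whose names start with us-west2
gcloud compute accelerator-types list \
--filter="zone ~ ^us-west2" \
--format="table(zone,name)"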