Syntax Error When Running `gcloud ml-engine` commands in Google Datalab

I am trying to deploy a model to Google ML Engine using DataLab. The code works in my live project datalab but gives a syntax error on my staging datalab. I thought this may be due to different versions of gcloud and so I ran the updates but I am still getting the same syntax error. How can I fix this?
Code:
MODEL_NAME="waittimes_model_03"
MODEL_VERSION="ml_on_gcp_waittimes_06"
gcloud ml-engine models create ${MODEL_NAME} --regions us-central1
gcloud ml-engine versions create ${MODEL_VERSION} --model ${MODEL_NAME} --origin waitestimates/export/exporter/1532010994 --staging-bucket ${BUCKET} --runtime-version 1.6
Error:
File "<ipython-input-4-104542ff058c>", line 8
gcloud ml-engine models create ${MODEL_NAME} --regions us-central1
^
SyntaxError: invalid syntax

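A notebook cell is executed as Python, so a bare gcloud line is a syntax error. Add the ! prefix to your command so that it is passed to the shell, e.g.,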
!gcloud ml-engine models create ${MODEL_NAME} --regions us-central1
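As an alternative to the ! prefix, the same command can be built and run from ordinary Python code in the cell. The following is only a minimal sketch and not part of the original answer: the helper name build_model_create_command is hypothetical, the flags simply mirror the ones in the question, and the line that would actually invoke gcloud is left commented out.

```python
import shlex
import subprocess


def build_model_create_command(model_name, region="us-central1"):
    """Return the argument list for `gcloud ml-engine models create`.

    Hypothetical helper for illustration; the flags mirror the question.
    """
    return ["gcloud", "ml-engine", "models", "create", model_name,
            "--regions", region]


MODEL_NAME = "waittimes_model_03"
command = build_model_create_command(MODEL_NAME)

# Show the shell command that would be run.
print(" ".join(shlex.quote(part) for part in command))

# Uncomment to run it from the notebook (requires the Cloud SDK):
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids any dependence on how the notebook expands ${...} before handing a line to the shell.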

Related

Creating a node-pool with --enable-autoscaling results in an invalid argument

I created a GKE cluster node pool using the following command.
gcloud container node-pools create autoscale-pool --cluster cluster-xxx --zone asia-northeast1-a --machine-type e2-highmem-2 --disk-size 30 --enable-autoscaling --scopes bigquery,storage-rw --num-nodes 1 --min-nodes 1 --max-nodes 5 --enable-autorepair --enable-autoupgrade --node-labels=node-label-ap=ap,node-label-memorysort=memorysort,node-label-batchjob=batchjob,node-label=auto
I then got the following error.
ERROR: (gcloud.container.node-pools.create) ResponseError: code=400, message=Request contains an invalid argument.
--enable-autoscaling seems to be an invalid argument.
I can activate "Enable auto scale" in the admin panel.
No errors occurred until April 1.
Is it no longer possible to run the command with the --enable-autoscaling parameter?
Creating a GKE cluster with Google Cloud SDK version 379.0.0 fails with the invalid argument error when the --enable-autoscaling flag is used on the gcloud command line. Google Kubernetes Engine has been affected by this issue since April 1, 2022. Mitigation work is still underway by the Google Cloud Engineering team.
EDIT
The issue has since been resolved. A new version of the gcloud SDK (380) has been released, and it does not have this problem.
So, upgrade your gcloud SDK to version 380 to get past this issue.
To check the current version of the gcloud SDK, run the command
gcloud version | grep 'SDK' # after upgrading, the output should be: Google Cloud SDK 380.0.0
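The same version check can be scripted. This is a minimal sketch under stated assumptions: the function names parse_sdk_version and needs_upgrade are hypothetical, the sample string stands in for real `gcloud version` output, and the threshold of 380 is taken from the answer above.

```python
import re


def parse_sdk_version(version_output):
    """Extract (major, minor, patch) from `gcloud version` output, or None."""
    match = re.search(r"Google Cloud SDK (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def needs_upgrade(version_output, fixed_major=380):
    """Return True if the SDK is older than the release containing the fix."""
    version = parse_sdk_version(version_output)
    if version is None:
        raise ValueError("could not find an SDK version in the output")
    return version[0] < fixed_major


# Sample line; in practice this would come from running `gcloud version`.
sample = "Google Cloud SDK 379.0.0\n"
print(parse_sdk_version(sample))
print(needs_upgrade(sample))
```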

Cloud Run throws error "ERROR: gcloud crashed (AttributeError): 'Namespace' object has no attribute 'use_http2'"

Deploying a TF serving container I get the following error:
ERROR: gcloud crashed (AttributeError): 'Namespace' object has no attribute 'use_http2'
Versions
gcloud version
Google Cloud SDK 277.0.0
alpha 2019.05.17
beta 2019.05.17
bq 2.0.52
core 2020.01.17
docker-credential-gcr
gsutil 4.47
Complete output
➜ cloud_run gcloud run deploy predict --image gcr.io/$PROJECT_ID/predict --port=8501 --memory=512 --platform managed --allow-unauthenticated --region=us-central1
ERROR: gcloud crashed (AttributeError): 'Namespace' object has no attribute 'use_http2'
If you would like to report this issue, please run the following command:
gcloud feedback
To check gcloud for common problems, please run the following command:
gcloud info --run-diagnostics
➜ cloud_run gcloud run deploy predict --image gcr.io/$PROJECT_ID/predict --port=8501 --memory=512 --platform managed --allow-unauthenticated
ERROR: gcloud crashed (AttributeError): 'Namespace' object has no attribute 'use_http2'
If you would like to report this issue, please run the following command:
gcloud feedback
To check gcloud for common problems, please run the following command:
gcloud info --run-diagnostics
➜ cloud_run gcloud run deploy predict --image gcr.io/$PROJECT_ID/predict --port=8501 --memory=512 --platform managed
ERROR: gcloud crashed (AttributeError): 'Namespace' object has no attribute 'use_http2'
If you would like to report this issue, please run the following command:
gcloud feedback
To check gcloud for common problems, please run the following command:
gcloud info --run-diagnostics
➜ cloud_run gcloud run deploy predict --image gcr.io/$PROJECT_ID/predict --port=8501 --memory=512
ERROR: gcloud crashed (AttributeError): 'Namespace' object has no attribute 'use_http2'
If you would like to report this issue, please run the following command:
gcloud feedback
To check gcloud for common problems, please run the following command:
gcloud info --run-diagnostics
➜ cloud_run gcloud run deploy predict --image gcr.io/$PROJECT_ID/predict
Deploying container to Cloud Run service [predict] in project [XXXXXXX] region [us-central1]
✓ Deploying... Done.
✓ Creating Revision...
✓ Routing traffic...
Done.
Service [predict] revision [predict-00005-lub] has been deployed and is serving 100 percent of traffic at https://predict-XXXXXX.a.run.app
Run diagnostics as indicated:
gcloud info --run-diagnostics
Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic passed (1/1 checks passed).
Property diagnostic detects issues that may be caused by properties.
Checking hidden properties...done.
Hidden Property Check passed.
Property diagnostic passed (1/1 checks passed).
All of the flags appear to be valid:
NAME
gcloud beta run deploy - deploy a container to Cloud Run
SYNOPSIS
gcloud beta run deploy [[SERVICE] --namespace=NAMESPACE] --image=IMAGE
[--args=[ARG,...]] [--async] [--command=[COMMAND,...]]
[--concurrency=CONCURRENCY] [--max-instances=MAX_INSTANCES]
[--memory=MEMORY] [--platform=PLATFORM] [--port=PORT]
[--timeout=TIMEOUT]
[--clear-env-vars | --set-env-vars=[KEY=VALUE,...]
| --remove-env-vars=[KEY,...] --update-env-vars=[KEY=VALUE,...]]
[--clear-labels | --remove-labels=[KEY,...] --labels=[KEY=VALUE,...]
| --update-labels=[KEY=VALUE,...]]
[--connectivity=CONNECTIVITY --cpu=CPU]
[--[no-]allow-unauthenticated --revision-suffix=REVISION_SUFFIX
--service-account=SERVICE_ACCOUNT
--add-cloudsql-instances=[CLOUDSQL-INSTANCES,...]
| --clear-cloudsql-instances
| --remove-cloudsql-instances=[CLOUDSQL-INSTANCES,...]
| --set-cloudsql-instances=[CLOUDSQL-INSTANCES,...]]
[--region=REGION
| --cluster=CLUSTER --cluster-location=CLUSTER_LOCATION
| --context=CONTEXT --kubeconfig=KUBECONFIG] [GCLOUD_WIDE_FLAG ...]
DESCRIPTION
(BETA) Deploys container images to Google Cloud Run.
Use the alpha version of the command in the SDK for the time being. A fix for the problem is being implemented; check here.
gcloud alpha run ....

Submit Presto job on dataproc

I am trying to submit a dataproc job on a cluster running Presto with the postgresql connector.
The cluster is initialized as follows:
gcloud beta dataproc clusters create ${CLUSTER_NAME} \
--project=${PROJECT} \
--region=${REGION} \
--zone=${ZONE} \
--bucket=${BUCKET_NAME} \
--num-workers=${WORKERS} \
--scopes=cloud-platform \
--initialization-actions=${INIT_ACTION}
${INIT_ACTION} points to a bash file with the initialization actions for starting a Presto cluster with PostgreSQL.
I do not use --optional-components=PRESTO since I need --initialization-actions to perform non-default operations, and using both --optional-components and --initialization-actions does not work.
When I try to run a simple job:
gcloud beta dataproc jobs submit presto \
--cluster ${CLUSTER_NAME} \
--region ${REGION} \
-e "SHOW TABLES"
I get the following error:
ERROR: (gcloud.beta.dataproc.jobs.submit.presto) FAILED_PRECONDITION: Cluster
'<cluster-name>' requires optional component PRESTO to run PRESTO jobs
Is there some other way to define the optional component on the cluster?
UPDATE:
Using both --optional-components and --initialization-actions, as:
gcloud beta dataproc clusters create ${CLUSTER_NAME} \
...
--scopes=cloud-platform \
--optional-components=PRESTO \
--image-version=1.3 \
--initialization-actions=${INIT_ACTION} \
--metadata ...
The ${INIT_ACTION} is copied from this repo, with a slight modification to the function configure_connectors to create a PostgreSQL connector.
Creating the cluster then fails with the following error:
ERROR: (gcloud.beta.dataproc.clusters.create) Operation [projects/...] failed: Initialization action failed. Failed action 'gs://.../presto_config.sh', see output in: gs://.../dataproc-initialization-script-0_output.
The error output is logged as:
+ presto '--execute=select * from system.runtime.nodes;'
Error running command: java.net.ConnectException: Failed to connect to localhost/0:0:0:0:0:0:0:1:8080
This leads me to believe I have to rewrite the initialization script.
It would be nice to know which initialization script is running when I specify --optional-components=PRESTO.
If all you want to do is set up the optional component to work with a Postgres endpoint, writing an init action to do it is pretty easy. You just have to add the catalog file and restart Presto.
https://gist.github.com/KoopaKing/8e653e0c8d095323904946045c5fa4c2
is an example init action. I have tested it successfully with the Presto optional component, but it is pretty simple. Feel free to fork the example and stage it in your GCS bucket.
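For reference, the catalog file mentioned above is a short properties file. The following is a minimal sketch of generating its contents, assuming the standard Presto PostgreSQL connector properties (connector.name, connection-url, connection-user, connection-password). The function name, host, database, and credentials are placeholders, and it is not taken from the linked gist; where the file is written on the cluster and how Presto is restarted are left to the init action.

```python
def render_postgresql_catalog(host, database, user, password, port=5432):
    """Return the text of a Presto catalog file for a PostgreSQL endpoint.

    Hypothetical helper; all argument values are supplied by the caller.
    """
    lines = [
        "connector.name=postgresql",
        "connection-url=jdbc:postgresql://{}:{}/{}".format(host, port, database),
        "connection-user={}".format(user),
        "connection-password={}".format(password),
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
catalog_text = render_postgresql_catalog(
    host="db.example.com", database="mydb", user="presto", password="secret")
print(catalog_text, end="")
```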

gcloud crashed (AttributeError): 'NoneType' object has no attribute 'revisionTemplate'

I'm working with Cloud Run, which still seems to be in beta, and I cannot redeploy a service, as shown below. It works if I delete the service from the GCP console and then deploy the same Docker image as a new service. I could not find a way to set revisionTemplate.
I run this command to deploy a Cloud Run service using gcloud.
gcloud beta run deploy v2-cms --image gcr.io/my-project/v2-cms --quiet
It then fails with the following output.
X Deploying...
. Creating Revision...
. Routing traffic...
Deployment failed
ERROR: gcloud crashed (AttributeError): 'NoneType' object has no attribute 'revisionTemplate'
If you would like to report this issue, please run the following command:
gcloud feedback
To check gcloud for common problems, please run the following command:
gcloud info --run-diagnostics
To fix this issue, please update gcloud to its latest version with gcloud components update
Make sure that your local TensorFlow version is still supported by gcloud: https://cloud.google.com/ai-platform/training/docs/runtime-version-list

Invalid FromUrl error when submitting job

I'm getting this error:
$ gcloud ml-engine jobs submit training testX \
--job-dir="gs://testxxx" \
--package-path=trainer \
--module-name=trainer.task \
--region us-central1
ERROR: (gcloud.ml-engine.jobs.submit.training)
argument --job-dir: invalid FromUrl value: 'gs://testxxx'
However, if I submit it with --staging-bucket instead:
$ gcloud ml-engine jobs submit training testX \
--staging-bucket="gs://testxxx" \
--package-path=trainer \
--module-name=trainer.task \
--region us-central1
It works just fine. Any idea why this error is showing?
Thanks!
M
Currently, gcloud expects --job-dir to be an object path (not a bucket). So try something like --job-dir="gs://testxxx/run1".
In the meantime, we will improve the error message; we will consider allowing buckets to be used as the actual job-dir as well.
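The distinction between a bucket and an object path can be checked locally before submitting. This is a rough sketch only: is_gcs_object_path is a hypothetical helper, and it illustrates the rule described above rather than reproducing gcloud's own validation.

```python
from urllib.parse import urlparse


def is_gcs_object_path(url):
    """Return True if url is gs://bucket/object, False for a bare bucket."""
    parsed = urlparse(url)
    if parsed.scheme != "gs" or not parsed.netloc:
        return False
    # Strip slashes so that "gs://bucket/" still counts as a bare bucket.
    return parsed.path.strip("/") != ""


for candidate in ("gs://testxxx", "gs://testxxx/run1"):
    print(candidate, is_gcs_object_path(candidate))
```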