Setting the "traffic-split" parameter for multiple models deployed on Vertex AI

I want to split traffic 50/50 between two models deployed on the same Vertex AI endpoint. I am using the following command, but it is not working:
gcloud ai endpoints update 2003578466645041111 --region=us-central1 --traffic-split=[DEPLOYED_MODEL_ID=4779212008081850368=50,DEPLOYED_MODEL_ID=6143661957686755328=50] --service-account= ********* --project= **********
In the --traffic-split parameter I am passing the IDs of the deployed models and the percentage of traffic I want to set for each. I tried other values instead of 50, such as 0.5 and .5, but none of them work. I also tried removing the word "DEPLOYED_MODEL_ID" and keeping just the model ID and value in the list:
gcloud ai endpoints update 2003578466645041111 --region=us-central1 --traffic-split=[4779212008081850368=50,6143661957686755328=50] --service-account= ********* --project= **********
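One thing worth checking, offered as an assumption rather than a confirmed fix: in the gcloud reference, the square brackets in --traffic-split=[DEPLOYED_MODEL_ID=VALUE,...] appear to be notation for a list, not literal characters, and the values are expected to be whole-number percentages that add up to 100. The small Python sketch below builds the flag value under those assumptions; format_traffic_split is a hypothetical helper, not part of any Google library.

```python
def format_traffic_split(split):
    """Build a --traffic-split value from {deployed_model_id: percent}.

    Assumes percentages are integers in 0..100 that sum to 100.
    """
    if not split:
        raise ValueError("traffic split must not be empty")
    for model_id, percent in split.items():
        is_int = isinstance(percent, int) and not isinstance(percent, bool)
        if not is_int or not 0 <= percent <= 100:
            raise ValueError(
                f"percentage for {model_id} must be an integer in 0..100, got {percent!r}"
            )
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"percentages must sum to 100, got {total}")
    # No surrounding brackets: just ID=PERCENT pairs joined by commas.
    return ",".join(f"{model_id}={percent}" for model_id, percent in split.items())


value = format_traffic_split(
    {"4779212008081850368": 50, "6143661957686755328": 50}
)
print(f"--traffic-split={value}")
# prints: --traffic-split=4779212008081850368=50,6143661957686755328=50
```

If the resulting flag is still rejected, the exact error text from gcloud would help narrow it down.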

How to list "labels" and "in use by" along with instances in a project?

I am currently using the following piece of code to get instance list from a project (which seems to work ok):
gcloud compute instances list \
--format="csv(name,description,machineType,status,zone)"
However, looking at the response body for instances.list, I found labels but couldn't find where the "In Use By" values are listed. I tried the following, but it didn't work:
gcloud compute instances list \
--format="csv(name,description,machineType,status,zone,items.labels.list())"
If it helps, I am looking for the values in red to be listed along with my instances.list output:
https://imgur.com/FFeDHoW
You can get these details with the commands below (see gcloud topic formats for the format syntax):
gcloud compute instances list --format='csv(name,description,machineType,status,zone,labels,inUseBy,instanceTemplate.list())'
or
gcloud compute instances list --format='table(name,description,machineType,status,zone,labels,inUseBy,instanceTemplate.list())'

Authenticate Custom Training Job in Vertex AI with Service Account

I am trying to run a Custom Training Job to deploy my model in Vertex AI directly from JupyterLab. This JupyterLab instance was created from a Vertex AI Managed Notebook where I already specified the service account.
My aim is to deploy the training script that I pass to CustomTrainingJob directly from the cells of my notebook. This would be equivalent to pushing an image that contains my script to Container Registry and deploying the training job manually from the Vertex AI UI (done that way, with the service account specified, I was able to deploy the training job correctly). However, I need everything to be executed from the same notebook.
In order to specify the credentials to the CustomTrainingJob of aiplatform, I execute the following cell, where all variables are correctly set:
import google.auth
from google.cloud import aiplatform
from google.auth import impersonated_credentials
source_credentials = google.auth.default()
target_credentials = impersonated_credentials.Credentials(
    source_credentials=source_credentials,
    target_principal='SERVICE_ACCOUNT.iam.gserviceaccount.com',
    target_scopes=['https://www.googleapis.com/auth/cloud-platform'])
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_NAME)
job = aiplatform.CustomTrainingJob(
    display_name=JOB_NAME,
    script_path=SCRIPT_PATH,
    container_uri=MODEL_TRAINING_IMAGE,
    credentials=target_credentials
)
After the job.run() command is executed, it seems that the credentials are not set correctly. In particular, the following error is returned:
/opt/conda/lib/python3.7/site-packages/google/auth/impersonated_credentials.py in _update_token(self, request)
254
255 # Refresh our source credentials if it is not valid.
--> 256 if not self._source_credentials.valid:
257 self._source_credentials.refresh(request)
258
AttributeError: 'tuple' object has no attribute 'valid'
I also tried other ways to configure the credentials of my service account, but none of them seem to work. In this case it looks like the tuple that contains the source credentials is missing the 'valid' attribute, even though google.auth.default() returns only two values.
To run the custom training job using a service account, you could try using the service_account argument for job.run(), instead of trying to set credentials. As long as the notebook executes as a user that has act-as permissions for the chosen service account, this should let you run the custom training job as that service account.
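For context on the traceback: google.auth.default() returns a (credentials, project_id) tuple, so passing its result directly as source_credentials hands the whole tuple to impersonated_credentials, which is consistent with the AttributeError above. A minimal sketch of the service_account approach follows. run_with_service_account is a hypothetical wrapper written only for illustration; it assumes the job object is an aiplatform.CustomTrainingJob created without a credentials= argument.

```python
def run_with_service_account(job, service_account, **run_kwargs):
    """Run a training job as the given service account.

    `job` is assumed to expose run(service_account=..., **kwargs), as
    aiplatform.CustomTrainingJob does. Any other keyword arguments are
    forwarded to run() unchanged.
    """
    return job.run(service_account=service_account, **run_kwargs)


# Intended use in the notebook (commented out because it needs a GCP project):
#
# from google.cloud import aiplatform
# aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_NAME)
# job = aiplatform.CustomTrainingJob(
#     display_name=JOB_NAME,
#     script_path=SCRIPT_PATH,
#     container_uri=MODEL_TRAINING_IMAGE,
# )
# run_with_service_account(job, "SERVICE_ACCOUNT.iam.gserviceaccount.com")
```

If impersonated credentials are still needed elsewhere, unpacking the tuple first (source_credentials, _ = google.auth.default()) should avoid this particular error.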

Cannot create GCP VM, public images are empty

When attempting to create a new GCP VM on a new account, the list of public images is empty. If I try to launch an image from the public marketplace, the boot device will not be attached.
What am I doing wrong here?
It turns out the problem was caused by the security group in my organization, which set a public image constraint and neglected to ensure my team was aware of it.
https://cloud.google.com/compute/docs/images/restricting-image-access
1. Open Cloud Shell as per my comment (the square icon with a ">" sign in the upper right corner).
2. List the available images:
gcloud compute images list
3. Try to create your VM with the desired image from Cloud Shell, for example:
gcloud compute instances create test --image-family ubuntu-1804-lts --image-project ubuntu-os-cloud
If the command succeeds, you will find your instance running under Compute Engine.
PS.
Don't forget to turn these VMs off when you are done.
If you want to select another image, make sure to use the image project and image family from the output of step 2.
Either way, this should at least give you some errors to resolve.

GCP Vertex AI Training Custom Job : User does not have bigquery.jobs.create permission

I'm struggling to execute a query with the BigQuery Python client from inside a Vertex AI custom training job on Google Cloud Platform.
I have built a Docker image that contains this Python code and pushed it to Container Registry (eu.gcr.io).
I am using this command to deploy:
gcloud beta ai custom-jobs create --region=europe-west1 --display-name="$job_name" \
--config=config_custom_container.yaml \
--worker-pool-spec=machine-type=n1-standard-4,replica-count=1,container-image-uri="$docker_img_path" \
--args="${model_type},${env},${now}"
I even tried using the --service-account option to specify a service account with the BigQuery Admin role, but it did not work.
According to this link
https://cloud.google.com/vertex-ai/docs/general/access-control?hl=th#granting_service_agents_access_to_other_resources
the Google-managed service account for the AI Platform Custom Code Service Agent (Vertex AI) already has access to BigQuery, so I do not understand why my job fails with this error:
google.api_core.exceptions.Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/*******/jobs?prettyPrint=false:
Access Denied: Project *******:
User does not have bigquery.jobs.create permission in project *******.
I have replaced the ID with *******.
Edit:
I have tried several configurations; my last config YAML file contains only this:
baseOutputDirectory:
  outputUriPrefix:
Using the serviceAccount field does not seem to change the actual configuration, unlike the --service-account option.
Edit 14-06-2021: Quick fix
As #Ricco.D said:
"Try explicitly defining the project_id in your BigQuery code if you have not done this yet."
bigquery.Client(project=[your-project])
This fixed my problem. I still do not know the cause.
To fix the issue, you need to explicitly specify the project ID in the BigQuery code.
Example:
bigquery.Client(project=[your-project], credentials=credentials)
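A minimal sketch of that fix, written so the project ID is never left for the client library to infer. The environment variable name BQ_PROJECT_ID and both helper functions are hypothetical, chosen only for illustration; the bigquery import is deferred so the project-resolution logic can be checked without the library installed.

```python
import os


def resolve_project_id(explicit=None, env=None, env_var="BQ_PROJECT_ID"):
    """Return the project ID to use, preferring an explicit value.

    Falls back to the (hypothetical) BQ_PROJECT_ID environment variable and
    fails loudly instead of letting the client guess a project.
    """
    env = os.environ if env is None else env
    project_id = explicit or env.get(env_var)
    if not project_id:
        raise ValueError(f"no project ID given and {env_var} is not set")
    return project_id


def make_bigquery_client(project_id):
    """Create a BigQuery client bound to an explicit project."""
    from google.cloud import bigquery  # imported lazily; needs google-cloud-bigquery

    return bigquery.Client(project=project_id)


# Inside the training container (commented out because it needs GCP access):
# client = make_bigquery_client(resolve_project_id())
# rows = client.query("SELECT 1").result()
```

Failing early when no project is configured makes a misrouted job easier to spot than a 403 from a project you did not expect.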

Invalid service name [GOOGLE_APPLICATION_CREDENTIALS=name] in gcp connect twilio messaging with dialogflow

I created an agent in Dialogflow and connected it to a GCP Cloud Function through a webhook. Now I want to integrate it with Twilio text messaging, so I am following the https://github.com/GoogleCloudPlatform/dialogflow-integrations/tree/master/twilio#readme tutorial, but when I run the command:
"gcloud beta run deploy --image gcr.io/test1/dialogflow-twilio--update-env-vars GOOGLE_APPLICATION_CREDENTIALS=test1.json --memory 1Gi"
it gives me this error:
(gcloud.beta.run.deploy) Invalid service name [GOOGLE_APPLICATION_CREDENTIALS=name].
Service name must use only lowercase alphanumeric characters
and dashes. Cannot begin or end with a dash, and cannot be longer than 63 characters...
My gcloud SDK version is 290.0.1. I have created a service account, granted it Dialogflow client access, and I am using that account's JSON key file. What am I missing?
You must be entering GOOGLE_APPLICATION_CREDENTIALS=name whenever the command prompts you to enter a service name. In this case you can simply hit enter and it will create a default service name for you.
From README.md:
When prompted for a service name hit enter to accept the default.
Edit:
Run your command like this (add a space between dialogflow-twilio and --update-env-vars):
gcloud beta run deploy --image gcr.io/test1/dialogflow-twilio --update-env-vars GOOGLE_APPLICATION_CREDENTIALS=test1.json --memory 1Gi
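To see why the missing space matters, the sketch below uses Python's shlex to split both command lines roughly the way a POSIX shell would. How gcloud then interprets the tokens is inferred from the error message, not from gcloud's source: without the space, --update-env-vars is fused into the image name, which leaves GOOGLE_APPLICATION_CREDENTIALS=test1.json as a bare positional argument where a service name would go.

```python
import shlex

broken = (
    "gcloud beta run deploy --image gcr.io/test1/dialogflow-twilio--update-env-vars "
    "GOOGLE_APPLICATION_CREDENTIALS=test1.json --memory 1Gi"
)
fixed = (
    "gcloud beta run deploy --image gcr.io/test1/dialogflow-twilio --update-env-vars "
    "GOOGLE_APPLICATION_CREDENTIALS=test1.json --memory 1Gi"
)

broken_args = shlex.split(broken)
fixed_args = shlex.split(fixed)

# In the broken command the flag name is swallowed by the --image value...
print(broken_args[5])  # gcr.io/test1/dialogflow-twilio--update-env-vars
# ...so the next token is a stray positional argument.
print(broken_args[6])  # GOOGLE_APPLICATION_CREDENTIALS=test1.json

# In the fixed command the flag and its value are separate tokens.
print(fixed_args[5:8])
# ['gcr.io/test1/dialogflow-twilio', '--update-env-vars', 'GOOGLE_APPLICATION_CREDENTIALS=test1.json']
```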
The current Google Cloud SDK version is 316. There is one release per week, so if yours is 290 you are 26 weeks behind, roughly 6 months.
Update your gcloud SDK; it should fix your issue (your version simply doesn't know the parameter you are using, so it takes the parameter's value as the name of the Cloud Run service).
Try gcloud components update.