I am trying to run a Custom Training Job to deploy my model in Vertex AI directly from JupyterLab. This JupyterLab instance is spawned from a Vertex AI Managed Notebook where I have already specified the service account.
My aim is to deploy the training script that I pass to the CustomTrainingJob method directly from the cells of my notebook. This would be equivalent to pushing an image that contains my script to Container Registry and deploying the training job manually from the Vertex AI UI (that way, by specifying the service account, I was able to correctly deploy the training job). However, I need everything to be executed from the same notebook.
In order to pass the credentials to the CustomTrainingJob of aiplatform, I execute the following cell, where all variables are correctly set:
import google.auth
from google.auth import impersonated_credentials
from google.cloud import aiplatform

source_credentials = google.auth.default()
target_credentials = impersonated_credentials.Credentials(
    source_credentials=source_credentials,
    target_principal='SERVICE_ACCOUNT.iam.gserviceaccount.com',
    target_scopes=['https://www.googleapis.com/auth/cloud-platform'])

aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_NAME)

job = aiplatform.CustomTrainingJob(
    display_name=JOB_NAME,
    script_path=SCRIPT_PATH,
    container_uri=MODEL_TRAINING_IMAGE,
    credentials=target_credentials,
)
When the job.run() command is executed, it seems that the credentials are not correctly set. In particular, the following error is returned:
/opt/conda/lib/python3.7/site-packages/google/auth/impersonated_credentials.py in _update_token(self, request)
254
255 # Refresh our source credentials if it is not valid.
--> 256 if not self._source_credentials.valid:
257 self._source_credentials.refresh(request)
258
AttributeError: 'tuple' object has no attribute 'valid'
I also tried different ways to configure the credentials of my service account, but none of them seems to work. In this case it looks like the tuple that contains the source credentials is missing the 'valid' attribute, which is puzzling given that google.auth.default() only returns two values.
To run the custom training job using a service account, you could try the service_account argument of job.run() instead of setting credentials. As long as the notebook executes as a user that has act-as permission on the chosen service account, this lets you run the custom training job as that service account.
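A minimal sketch of that approach, assuming the usual SDK setup (all UPPER_CASE values and the service-account name are placeholders, not names from the question):

```python
# Sketch: run the CustomTrainingJob as a service account via the
# `service_account` argument of job.run(), instead of passing a
# credentials object. All UPPER_CASE values are placeholders.
PROJECT_ID = "my-project"
REGION = "europe-west1"
BUCKET_NAME = "gs://my-staging-bucket"

def service_account_email(name: str, project_id: str) -> str:
    # Standard form of a user-managed service-account email.
    return f"{name}@{project_id}.iam.gserviceaccount.com"

def run_job_as_service_account(job_name, script_path, image_uri, sa_name):
    # Imported here so the sketch stays importable without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=PROJECT_ID, location=REGION,
                    staging_bucket=BUCKET_NAME)
    job = aiplatform.CustomTrainingJob(
        display_name=job_name,
        script_path=script_path,
        container_uri=image_uri,
    )
    # The notebook's identity only needs the "Service Account User"
    # (act-as) permission on this account.
    job.run(service_account=service_account_email(sa_name, PROJECT_ID))
```

Note that no impersonated credentials are built here at all; the job itself runs as the given service account.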
I have been banging my head around this for a while and Google Cloud does not have a lot of documentation about this issue. What I am trying to do is deploy a custom ML model on Google Cloud Vertex by:
Uploading the model onto the Model Registry in Vertex AI
Creating an endpoint
Deploying the uploaded model on the created endpoint.
Steps 1 and 2 are easy to implement, and I am not facing any issues. However, step 3 is always failing for some reason. Even the logs don't give me a lot of information.
For Step 1:
This is the Dockerfile I am using to create a custom image to serve my ML model:
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim
COPY requirements-base.txt requirements.txt
RUN pip3 install --no-cache-dir -r requirements.txt
COPY serve.py serve.py
COPY model.pkl model.pkl
And this is what my serve.py file looks like:
from fastapi import Request, FastAPI, Response
import json
import catboost
import pickle
import os

app = FastAPI(title="Sentiment Analysis")

AIP_HEALTH_ROUTE = os.environ.get('AIP_HEALTH_ROUTE', '/health')
AIP_PREDICT_ROUTE = os.environ.get('AIP_PREDICT_ROUTE', '/predict')

@app.get(AIP_HEALTH_ROUTE, status_code=200)
async def health():
    return {'health': 'ok'}

@app.post(AIP_PREDICT_ROUTE)
async def predict(request: Request):
    with open('model.pkl', 'rb') as file:
        model = pickle.load(file)
    data = request.get_json()
    input_data = data['input']
    predictions = model.predict(input_data)
    return json.dumps({'predictions': predictions.tolist()})

if __name__ == '__main__':
    app.run(debug=True, host="0.0.0.0", port=8080)
After building the image, I push it to artifact registry on Google Cloud.
Is there an issue with how I have written the serve.py file or Dockerfile?
Or is there an easier way to deploy custom ML models on Google Cloud for MLOps and prediction purposes?
Well, I tried a couple of manual approaches from the Google Cloud Vertex AI console and also using gcloud commands.
In the manual process, after importing the model with the custom image, I clicked on deploy to an endpoint. But this seems to always fail and takes forever.
Similarly, using gcloud, I first create an endpoint, then upload my model to the registry, and then deploy the model to the created endpoint. But this approach also fails.
At the end of the day I want my model to be successfully deployed on the endpoint and to give the right answers for predictions. Or, somehow, host my custom ML model on Google Cloud and make predictions with it in a reasonable and manageable way!
I'm struggling to execute a query with the BigQuery Python client from inside a custom training job in Vertex AI on Google Cloud Platform.
I have built a Docker image which contains this Python code, and I have pushed it to Container Registry (eu.gcr.io).
I am using this command to deploy it:
gcloud beta ai custom-jobs create --region=europe-west1 --display-name="$job_name" \
  --config=config_custom_container.yaml \
  --worker-pool-spec=machine-type=n1-standard-4,replica-count=1,container-image-uri="$docker_img_path" \
  --args="${model_type},${env},${now}"
I have even tried the --service-account option to specify a service account with the BigQuery Admin role, but it did not work.
According to this link
https://cloud.google.com/vertex-ai/docs/general/access-control?hl=th#granting_service_agents_access_to_other_resources
the Google-managed service account for the AI Platform Custom Code Service Agent (Vertex AI) already has the right to access BigQuery, so I do not understand why my job fails with this error:
google.api_core.exceptions.Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/*******/jobs?prettyPrint=false:
Access Denied: Project *******:
User does not have bigquery.jobs.create permission in project *******.
I have replaced the id with *******
Edit:
I have tried several configurations; my last config YAML file contains only this:
baseOutputDirectory:
outputUriPrefix:
Using the serviceAccount field in the YAML does not seem to change the actual configuration, unlike the --service-account option.
Edit 14-06-2021: Quick fix
As @Ricco.D said, explicitly defining the project_id in your BigQuery code, if you have not done this yet:
bigquery.Client(project=[your-project])
has fixed my problem. I still do not know the cause.
To fix the issue, you need to explicitly specify the project ID in the BigQuery code.
Example:
bigquery.Client(project=[your-project], credentials=credentials)
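A minimal sketch of that fix in context; the project/dataset/table names are placeholders and the query is only illustrative:

```python
# Sketch: pin the BigQuery client to an explicit project so that jobs
# are created there, instead of relying on whatever project the
# custom-job runtime resolves by default. All names are placeholders.

def fq_table(project_id: str, dataset: str, table: str) -> str:
    # Fully-qualified table id, "project.dataset.table".
    return f"{project_id}.{dataset}.{table}"

def query_rows(project_id: str, dataset: str, table: str, credentials=None):
    # Imported here so the sketch stays importable without the SDK installed.
    from google.cloud import bigquery

    # Explicit project: this is the quick fix from the edit above.
    client = bigquery.Client(project=project_id, credentials=credentials)
    query = f"SELECT * FROM `{fq_table(project_id, dataset, table)}` LIMIT 10"
    return list(client.query(query).result())
```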
My Beam Dataflow job succeeds locally (with DirectRunner) and fails on the cloud (with DataflowRunner).
The issue is localized in this code snippet:
class SomeDoFn(beam.DoFn):
    ...
    def process(self, gcs_blob_path):
        gcs_client = storage.Client()
        bucket = gcs_client.get_bucket(BUCKET_NAME)
        blob = Blob(gcs_blob_path, bucket)
        # NEXT LINE IS CAUSING ISSUES! (when run remotely)
        url = blob.generate_signed_url(datetime.timedelta(seconds=300), method='GET')
and Dataflow points to the error: "AttributeError: you need a private key to sign credentials. The credentials you are currently using just contains a token."
My Dataflow job uses a service account (the appropriate service_account_email is provided in the PipelineOptions), however I don't see how I could pass the .json credentials file of that service account to the Dataflow job. I suspect that my job runs successfully locally because I set the environment variable GOOGLE_APPLICATION_CREDENTIALS=<path to local file with service account credentials>, but how do I set it similarly for remote Dataflow workers? Or maybe there is another solution? Any help would be appreciated.
You can see an example here on how to add custom options to your Beam pipeline. With this we can create a --key_file argument that will point to the credentials stored in GCS:
parser.add_argument('--key_file',
                    dest='key_file',
                    required=True,
                    help='Path to service account credentials JSON.')
This will allow you to add the --key_file gs://PATH/TO/CREDENTIALS.json flag when running the job.
Then, you can read it from within the job and pass it as a side input to the DoFn that needs to sign the blob. Starting from the example here we create a credentials PCollection to hold the JSON file:
credentials = (p
               | 'Read Credentials from GCS' >> ReadFromText(known_args.key_file))
and we broadcast it to all workers processing the SignFileFn function:
(p
 | 'Read File from GCS' >> beam.Create([known_args.input])
 | 'Sign File' >> beam.ParDo(SignFileFn(), pvalue.AsList(credentials)))
Inside the ParDo, we build the JSON object to initialize the client (using the approach here) and sign the file:
class SignFileFn(beam.DoFn):
    """Signs GCS file with GCS-stored credentials."""
    def process(self, gcs_blob_path, creds):
        import datetime
        import json
        import logging

        from google.cloud import storage
        from google.oauth2 import service_account

        credentials_json = json.loads('\n'.join(creds))
        credentials = service_account.Credentials.from_service_account_info(credentials_json)
        gcs_client = storage.Client(credentials=credentials)

        bucket = gcs_client.get_bucket(gcs_blob_path.split('/')[2])
        blob = bucket.blob('/'.join(gcs_blob_path.split('/')[3:]))
        url = blob.generate_signed_url(datetime.timedelta(seconds=300), method='GET')
        logging.info(url)
        yield url
See full code here
You will need to provide the service account JSON key similarly to what you are doing locally using the env variable GOOGLE_APPLICATION_CREDENTIALS.
To do so you can follow a few of the approaches mentioned in the answers to this question, such as passing it using PipelineOptions.
However, keep in mind that the safest way is to store the JSON key in, say, a GCS bucket and get the file from there.
The easy but not safe workaround is to take the key, open it, and build a JSON object from it in your code to pass along later.
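A rough sketch of the bucket-stored-key approach, assuming the key's JSON content has already been fetched (the URI and helper names are illustrative, not from the original answer):

```python
# Sketch: build service-account credentials from a JSON key and use them
# to sign a blob URL. The private key inside the JSON file is exactly
# what the default token-only credentials on Dataflow workers lack.
import datetime
import json

def parse_gcs_uri(uri: str):
    # Split "gs://bucket/path/to/object" into (bucket, object_path).
    assert uri.startswith("gs://"), uri
    bucket, _, path = uri[len("gs://"):].partition("/")
    return bucket, path

def signed_url_with_key(key_json: str, blob_uri: str) -> str:
    # Imported here so the sketch stays importable without the SDKs installed.
    from google.cloud import storage
    from google.oauth2 import service_account

    creds = service_account.Credentials.from_service_account_info(
        json.loads(key_json))
    bucket_name, blob_path = parse_gcs_uri(blob_uri)
    client = storage.Client(credentials=creds, project=creds.project_id)
    blob = client.bucket(bucket_name).blob(blob_path)
    # Signing works because these credentials carry a private key.
    return blob.generate_signed_url(datetime.timedelta(seconds=300),
                                    method="GET")
```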
I have successfully scheduled my query in BigQuery, and the result is saved as a table in my dataset. I see a lot of information about scheduling data transfer in to BigQuery or Cloud Storage, but I haven't found anything regarding scheduling an export from a BigQuery table to Cloud Storage yet.
Is it possible to schedule an export of a BigQuery table to Cloud Storage so that I can further schedule having it SFTP-ed to me via Google BigQuery Data Transfer Services?
There isn't a managed service for scheduling BigQuery table exports, but one viable approach is to use Cloud Functions in conjunction with Cloud Scheduler.
The Cloud Function would contain the necessary code to export to Cloud Storage from the BigQuery table. There are multiple programming languages to choose from for that, such as Python, Node.JS, and Go.
Cloud Scheduler would send an HTTP call periodically in a cron format to the Cloud Function which would in turn, get triggered and run the export programmatically.
As an example and more specifically, you can follow these steps:
Create a Cloud Function using Python with an HTTP trigger. To interact with BigQuery from within the code you need to use the BigQuery client library. Import it with from google.cloud import bigquery. Then, you can use the following code in main.py to create an export job from BigQuery to Cloud Storage:
# Imports the BigQuery client library
from google.cloud import bigquery

def hello_world(request):
    # Replace these values according to your project
    project_name = "YOUR_PROJECT_ID"
    bucket_name = "YOUR_BUCKET"
    dataset_name = "YOUR_DATASET"
    table_name = "YOUR_TABLE"

    destination_uri = "gs://{}/{}".format(bucket_name, "bq_export.csv.gz")
    bq_client = bigquery.Client(project=project_name)

    dataset = bq_client.dataset(dataset_name, project=project_name)
    table_to_export = dataset.table(table_name)

    job_config = bigquery.job.ExtractJobConfig()
    job_config.compression = bigquery.Compression.GZIP

    extract_job = bq_client.extract_table(
        table_to_export,
        destination_uri,
        # Location must match that of the source table.
        location="US",
        job_config=job_config,
    )
    return "Job with ID {} started exporting data from {}.{} to {}".format(
        extract_job.job_id, dataset_name, table_name, destination_uri)
Specify the client library dependency in the requirements.txt file
by adding this line:
google-cloud-bigquery
Create a Cloud Scheduler job. Set the Frequency you wish for
the job to be executed with. For instance, setting it to 0 1 * * 0
would run the job once a week at 1 AM every Sunday morning. The
crontab tool is pretty useful when it comes to experimenting
with cron scheduling.
Choose HTTP as the Target, set the URL as the Cloud
Function's URL (it can be found by selecting the Cloud Function and
navigating to the Trigger tab), and as HTTP method choose GET.
Once created, and by pressing the RUN NOW button, you can test how the export
behaves. However, before doing so, make sure the default App Engine service account has at least the Cloud IAM roles/storage.objectCreator role, otherwise the operation might fail with a permission error. The default App Engine service account has the form YOUR_PROJECT_ID@appspot.gserviceaccount.com.
If you wish to execute exports on different tables,
datasets, and buckets for each execution, while essentially employing the same Cloud Function, you can use the HTTP POST method
instead and configure a Body containing said parameters as data, which
would be passed on to the Cloud Function; that would, however, imply
some small changes to its code.
Lastly, when the job is created, you can use the Cloud Function's returned job ID and the bq CLI to view the status of the export job with bq show -j <job_id>.
Not sure if this was in GA when this question was asked, but at least now there is an option to run an export to Cloud Storage via a regular SQL query. See the SQL tab in Exporting table data.
Example:
EXPORT DATA
OPTIONS (
uri = 'gs://bucket/folder/*.csv',
format = 'CSV',
overwrite = true,
header = true,
field_delimiter = ';')
AS (
SELECT field1, field2
FROM mydataset.table1
ORDER BY field1
);
This could as well be trivially set up via a Scheduled Query if you need a periodic export. And, of course, you need to make sure the user or service account running this has permissions to read the source datasets and tables and to write to the destination bucket.
Hopefully this is useful for other peeps visiting this question if not for OP :)
You have an alternative to the second part of Maxim's answer. The code for extracting the table and storing it in Cloud Storage should work.
But when you schedule a query, you can also define a Pub/Sub topic where the BigQuery scheduler will post a message when the job is over. That way, the scheduler setup described by Maxim is optional, and you can simply plug the function into the Pub/Sub notification.
Before performing the extraction, don't forget to check the error status of the Pub/Sub notification. You also get a lot of information about the scheduled query, which is useful if you want to perform more checks or generalize the function.
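A hedged sketch of such a Pub/Sub-triggered Cloud Function: the payload field names below follow the BigQuery TransferRun message and are an assumption to verify against your actual notifications; the table and bucket names are placeholders.

```python
# Sketch: the scheduled query posts a message to a Pub/Sub topic when a
# run finishes; this function starts the export only if the run
# succeeded. Field names (errorStatus/code) are assumptions based on the
# TransferRun message; table/bucket names are placeholders.
import base64
import json

def run_succeeded(payload: dict) -> bool:
    # A transfer-run notification carries an errorStatus; a missing or
    # zero error code means the run completed successfully.
    return not payload.get("errorStatus", {}).get("code")

def on_scheduled_query_done(event, context):
    # Pub/Sub delivers the message data base64-encoded.
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    if not run_succeeded(payload):
        print("Scheduled query failed, skipping export:",
              payload["errorStatus"])
        return

    # Imported here so the sketch stays importable without the SDK installed.
    from google.cloud import bigquery

    job_config = bigquery.job.ExtractJobConfig()
    job_config.compression = bigquery.Compression.GZIP
    bigquery.Client().extract_table(
        "YOUR_PROJECT.YOUR_DATASET.YOUR_TABLE",
        "gs://YOUR_BUCKET/bq_export.csv.gz",
        job_config=job_config,
    )
```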
One more point, about the SFTP transfer: I open-sourced a project for querying BigQuery, building a CSV file, and transferring this file to an FTP server (SFTP and FTPS aren't supported, because my previous company only used the FTP protocol!). If your file is smaller than 1.5 GB, I can update my project to add SFTP support if you want to use it. Let me know.
I need to set up very fine-grained access control for user accounts in GCP using a Python script.
I know that via the UI/gcloud util I can give it the role roles/bigquery.user, but it carries a lot of other permissions I don't want this service account to have.
How can I grant individual permissions via Python scripts?
Go to your BigQuery console, click the arrow at the right of a dataset and then click Share dataset.
Then add the e-mail of the user, choosing one of the 3 available roles: Viewer/Owner/Editor.
Do this in every dataset for every user.
Update to do it via Python script
You can do it with a Python script following this small tutorial.
The code will be something like:
from google.cloud import bigquery

client = bigquery.Client()
dataset = client.get_dataset(client.dataset('dataset1'))

entry = bigquery.AccessEntry(
    role='READER',
    entity_type='userByEmail',
    entity_id='user1@example.com')
assert entry not in dataset.access_entries

entries = list(dataset.access_entries)
entries.append(entry)
dataset.access_entries = entries

dataset = client.update_dataset(dataset, ['access_entries'])  # API request
assert entry in dataset.access_entries