How do I save the predictions obtained from the gcloud ml engine?

I'm using Windows PowerShell and gcloud commands to request predictions from a Python TensorFlow SavedModel deployed on Google Cloud ML Engine. The output of the model's prediction is actually an image, and I want to save that image. I couldn't find a way to do this via the gcloud commands and was wondering whether it is possible.
For the record, the prediction does work, because the predict command returns a large amount of data, presumably the output image.

The way to save the response is:
gcloud ml-engine predict --model model_name --json-instances ./data/data.json >> preds.json
Appending >> preds.json to the end of that command redirects the response into a JSON file (use a single > instead if you want to overwrite the file rather than append to it).
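From there you still have to turn the saved response into an image file. A minimal sketch, assuming the redirected response is valid JSON with a top-level predictions list and that the model returns the image as a base64-encoded string under a key named output_image (the key name and the encoding are assumptions about your SavedModel's output signature):
import base64
import json

# Load the JSON response that was redirected into preds.json.
with open('preds.json') as f:
    response = json.load(f)

# 'output_image' is a hypothetical key; match it to your model's outputs.
image_bytes = base64.b64decode(response['predictions'][0]['output_image'])
with open('prediction.png', 'wb') as out:
    out.write(image_bytes)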

Related

Error when invoking pre-trained model: NotFittedError("Vocabulary not fitted or provided")

I'm new to AWS SageMaker and I'm trying to deploy a simple pre-trained model to SageMaker to create an endpoint and then make predictions.
The model is a scikit-learn linear regression model; the input is a vectorized sparse matrix derived from a string of text (a customer's review), and the output is the star-rating value (1 to 5).
I trained the model locally and exported its artifact to a model.joblib file.
Then I wrote an inference.py file and zipped it together with the model.joblib file into a model.tar.gz archive, which I uploaded to S3 for model registration and endpoint creation.
However, when I invoke the endpoint on a sample text, the following error is returned in the CloudWatch log:
File "/miniconda3/lib/python3.8/site-packages/sklearn/feature_extraction/text.py", line 498, in _check_vocabulary
raise NotFittedError("Vocabulary not fitted or provided")
I understand this means SageMaker is complaining that the trained model artifact is not fitted, and that there is no problem with the other parts (such as the inference.py file). However, the pre-trained model was fitted before it was exported.
I'm not sure which part went wrong, so I haven't posted more code in order to avoid clutter.
Thank you.
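For reference, a common cause of this particular error is persisting only the fitted regressor while the text vectorizer is re-created (unfitted) inside inference.py, so the model.joblib artifact lacks the fitted vocabulary. Whether that applies here can't be confirmed from the code shown, but a minimal sketch of exporting the vectorizer and regressor together as one fitted Pipeline looks like this (the toy data is a placeholder):
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

# Toy data standing in for the real reviews and star ratings.
train_texts = ["great product", "terrible quality", "okay but slow"]
train_ratings = [5, 1, 3]

# Fitting vectorizer and regressor together persists both in a fitted state.
pipeline = Pipeline([
    ("vectorizer", TfidfVectorizer()),
    ("regressor", LinearRegression()),
])
pipeline.fit(train_texts, train_ratings)

joblib.dump(pipeline, "model.joblib")

# inference.py can then load and use the whole pipeline:
# model = joblib.load(os.path.join(model_dir, "model.joblib"))
# model.predict(["sample review text"])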

How to reduce the verbosity of sagemaker training jobs?

I am training a model with tqdm in the dataloader; the log prints every update on a separate line, which fills up the entire notebook. Is there a way to reduce the verbosity or remove the logs entirely from the Jupyter cell, similar to from IPython.display import clear_output; clear_output(wait=True)?
Usually the logging can be controlled through the API parameters, and SageMaker doesn't usually add much additional verbosity. If you are using a SageMaker notebook instance to run your training, you can use IPython's clear_output, as the notebook is based on JupyterLab.
You can set logs="None" inside the .fit() parameters.
Here's an example of how I set it:
job_name = "something"
model = sagemaker.estimator.Estimator(
    image_uri=container_image_uri,
    ...  # other estimator arguments
)
model.fit(
    inputs=train_input,
    job_name=job_name,
    wait=True,
    logs="None",  # suppress the training log stream in the notebook cell
)

Getting details of a BigQuery job using gcloud CLI on local machine

I am trying to process the billed bytes of each BigQuery job run by all users. I was able to find the details in the BigQuery UI under Project History. Running bq --location=europe-west3 show --job=true --format=prettyjson JOB_ID in Google Cloud Shell also gives exactly the information I want (the BQ SQL query, billed bytes, and run time for each BigQuery job).
As the next step, I want to access the JSON returned by the above command on my local machine. I have already configured the gcloud CLI properly, and I am able to list BigQuery jobs using gcloud alpha bq jobs list --show-all-users --limit=10.
When I pick a job ID and run gcloud alpha bq jobs describe JOB_ID --project=PROJECT_ID,
I get (gcloud.alpha.bq.jobs.describe) NOT_FOUND: Not found: Job PROJECT_ID:JOB_ID--toyFH. This is possibly because of the job's creation and end times,
as shown here.
What am I doing wrong? Is there another way to get the details of a BigQuery job using the gcloud CLI (or maybe a way to get billed bytes along with the query details using the Python SDK)?
You can get the job details with different APIs, or the way you are already doing it. But first: why are you using the alpha version of the bq commands?
To do it in Python, you can try something like this:
from google.cloud import bigquery

def get_job(client: bigquery.Client, job_id: str, location: str = "us") -> None:
    # Fetch the job and print some of its metadata.
    job = client.get_job(job_id, location=location)
    print(f"{job.location}:{job.job_id}")
    print(f"Type: {job.job_type}")
    print(f"State: {job.state}")
    print(f"Created: {job.created.isoformat()}")
There are more properties that you can read from the job object. Also check the status of the job in the console first, so you can compare the two.
You can find more details here: https://cloud.google.com/bigquery/docs/managing-jobs#python
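Since the original goal was the billed bytes and the query text of every user's jobs, note that query jobs expose those fields directly on the job object. A minimal sketch (PROJECT_ID is a placeholder):
from google.cloud import bigquery

client = bigquery.Client(project="PROJECT_ID")

# List recent jobs from all users and read the billing fields directly.
for job in client.list_jobs(all_users=True, max_results=10):
    if job.job_type == "query":
        print(job.query)                # the SQL text
        print(job.total_bytes_billed)   # billed bytes
        print(job.ended - job.created)  # run time (for completed jobs)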

Beginners guide to Sagemaker

I have followed an Amazon tutorial for using SageMaker and have used it to create the model in the tutorial (https://aws.amazon.com/getting-started/tutorials/build-train-deploy-machine-learning-model-sagemaker/).
This is my first time using SageMaker, so my question may be stupid.
How do you actually view the model that it has created? I want to be able to see a) the final formula created, with the parameters, etc., and b) graphs of plotted factors, etc., as if I were reviewing a GLM, for example.
Thanks in advance.
If you followed the SageMaker tutorial, you must have trained an XGBoost model. SageMaker places the model artifacts in a bucket that you own; check the output S3 location in the AWS SageMaker console.
For more information about XGBoost you can check the AWS SageMaker documentation https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#xgboost-sample-notebooks and the example notebooks, e.g. https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone.ipynb
To consume the XGBoost artifact generated by SageMaker, check out the official documentation, which contains the following code:
# SageMaker XGBoost uses the Python pickle module to serialize/deserialize
# the model, which can be used for saving/loading the model.
# To use a model trained with SageMaker XGBoost in open source XGBoost,
# use the following Python code:
import pickle as pkl

model = pkl.load(open(model_file_path, 'rb'))

# prediction with test data (dtest is an xgboost.DMatrix)
pred = model.predict(dtest)
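To get from the S3 output location to the model_file_path used above, you first have to download and extract the model.tar.gz artifact. A minimal sketch, where the bucket, key, and extracted file name are placeholders:
import tarfile

import boto3

# Placeholders: use the output S3 location shown in the SageMaker console.
s3 = boto3.client("s3")
s3.download_file("your-bucket", "path/to/output/model.tar.gz", "model.tar.gz")

# The archive typically contains a single model file, e.g. 'xgboost-model'.
with tarfile.open("model.tar.gz") as tar:
    tar.extractall()

model_file_path = "xgboost-model"  # assumption: name of the extracted file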

What is the best way to feed image data (tfrecords) from GCS to your model?

I set myself the goal of solving the MNIST Skin Cancer dataset using only Google Cloud, with GCS and Kubeflow on Google Kubernetes Engine.
I converted the data from JPEG to TFRecord with the following script:
https://github.com/tensorflow/tpu/blob/master/tools/datasets/jpeg_to_tf_record.py
I have seen a lot of examples of how to feed a CSV file to a model, but no examples with image data.
Would it be smart to copy all the TFRecords to Google Cloud Shell so I can feed the data to my model from there?
Or are there better methods available?
Thanks in advance.
If you are using Kubeflow, I would suggest using Kubeflow Pipelines.
For the preprocessing, you could use an image that is built on top of the standard pipeline Dataflow image gcr.io/ml-pipeline/ml-pipeline-dataflow-tft:latest, into which you simply copy your Dataflow code and run it:
FROM gcr.io/ml-pipeline/ml-pipeline-dataflow-tft:latest
RUN mkdir /{folder}
COPY run_dataflow_pipeline.py /{folder}
ENTRYPOINT ["python", "/{folder}/run_dataflow_pipeline.py"]
See this boilerplate for the Dataflow code that does exactly this. The idea is that you write the TFRecords to Google Cloud Storage (GCS).
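To wire such a container into an actual pipeline, you can use the Kubeflow Pipelines (kfp v1) SDK. A minimal sketch, where the pipeline name and image path are placeholders:
import kfp.dsl as dsl

@dsl.pipeline(name='preprocess', description='JPEG to TFRecord conversion')
def preprocess_pipeline():
    # Placeholder image: the Dataflow/TFT image built from the Dockerfile above.
    dsl.ContainerOp(
        name='tfrecord-conversion',
        image='gcr.io/{project}/dataflow-tft:latest',
    )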
Subsequently, you could use Google Cloud's ML Engine for the actual training. In this case you can also start from the image google/cloud-sdk:latest and basically copy over the required files, including a bash script that is run to execute the gcloud commands that start the training job.
FROM google/cloud-sdk:latest
# WORKDIR both creates the directory and switches into it, which is what
# the original mkdir/cd combination was trying to do.
WORKDIR /{src}
COPY train.sh ./
ENTRYPOINT ["bash", "./train.sh"]
An elegant way to pass the storage location of your TFRecords to your model is to use tf.data:
# Construct a TFRecordDataset from the TFRecord files on GCS.
# BUCKET_NAME is your bucket name, `bucket` is a google.cloud.storage bucket
# object, and `decode` is your TFRecord parsing function (see the sketch
# after this block).
train_records = [os.path.join(f'gs://{BUCKET_NAME}/', f.name)
                 for f in bucket.list_blobs(prefix='data/TFR/train')]
validation_records = [os.path.join(f'gs://{BUCKET_NAME}/', f.name)
                      for f in bucket.list_blobs(prefix='data/TFR/validation')]

ds_train = tf.data.TFRecordDataset(train_records, num_parallel_reads=4).map(decode)
ds_val = tf.data.TFRecordDataset(validation_records, num_parallel_reads=4).map(decode)

# Potential additional steps for performance:
# https://www.tensorflow.org/guide/performance/datasets

# Train the model
model.fit(ds_train,
          validation_data=ds_val,
          ...,  # other training arguments
          verbose=2)
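The decode function is not defined in the snippet above; a minimal sketch of what it could look like, assuming each TFRecord example stores a JPEG-encoded image and an integer label under the feature keys 'image' and 'label' (both key names are assumptions about how the records were written):
import tensorflow as tf

def decode(serialized_example):
    # Parse one serialized tf.Example; the feature keys are assumptions.
    features = tf.io.parse_single_example(
        serialized_example,
        features={
            'image': tf.io.FixedLenFeature([], tf.string),
            'label': tf.io.FixedLenFeature([], tf.int64),
        })
    image = tf.io.decode_jpeg(features['image'], channels=3)
    image = tf.cast(image, tf.float32) / 255.0  # scale pixels to [0, 1]
    return image, features['label']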
Check out this blog post for an actual implementation of a similar (more complex) Kubeflow pipeline.