Cannot create an Endpoint with Unified Cloud AI Platform custom containers - google-cloud-platform

Because of certain VPC restrictions, I am forced to use custom containers for predictions with a model trained in TensorFlow. Following the documented requirements, I created an HTTP server using TensorFlow Serving. The Dockerfile used to build the image is as follows:
FROM tensorflow/serving:2.4.1-gpu
# copy the model file
ENV MODEL_NAME=my_model
COPY my_model /models/my_model
Where my_model contains the saved_model inside a folder named 1/.
I then pushed this image to a Google Container Repository and created a Model using "Import an existing custom container", changing the port to 8501. However, when I try to deploy the model to an endpoint using a single compute node of type n1-standard-16 with one P100 GPU, the deployment fails with the following error:
Failed to create session: Internal: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
I cannot figure out why this is happening. I can run the same Docker image on my local machine and successfully get predictions by hitting the endpoint it creates: http://localhost:8501/v1/models/my_model:predict
Any help in this regard will be appreciated.
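For reference, the local check described above could look roughly like the sketch below. It assumes TensorFlow Serving's REST predict API with the row-oriented "instances" payload; the helper names and the example input shape are hypothetical and depend on the model's signature.

```python
import json
import urllib.request


def build_predict_request(instances, host="localhost", port=8501, model_name="my_model"):
    """Build (without sending) a POST request for the TensorFlow Serving predict route."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(instances, **kwargs):
    """Send the request and return the "predictions" field of the JSON response."""
    with urllib.request.urlopen(build_predict_request(instances, **kwargs)) as resp:
        return json.load(resp)["predictions"]
```

Calling predict([[1.0, 2.0]]) against the locally running container would exercise the same route that the endpoint is expected to serve on port 8501.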

The issue was solved by downgrading the TensorFlow Serving image to the 2.3.0-gpu version. According to the error message, the CUDA runtime in the custom image requires a newer CUDA driver than the one installed on the GCP AI Platform cluster.

Related

Model server container out of memory - Vertex AI

I am trying to deploy a simple support vector regression (SVR) model to Vertex AI using scikit-learn version 1.0, but I am encountering the following error:
Failed to create endpoint "adpool_eval" due to the error: model server container out of memory,
please use a larger machine type for model deployment:
https://cloud.google.com/vertex-ai/docs/predictions/configure-compute#machine-types.
The model is saved as "model.joblib" in a cloud storage bucket as required by Vertex AI.
The model is very small and simple so I can't understand how it is running out of memory. I tried machines with much more RAM and CPUs and the issue still persists. The cloud storage bucket, model registry, and endpoint are all in the same region (europe-west1).
The issue seems to be with saving the model using the joblib library. I used the pickle library instead and I managed to deploy the same model without any issues on the n1-standard machine.
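A minimal sketch of the pickle-based save, assuming the prebuilt scikit-learn container looks for an artifact named exactly model.pkl when pickle is used. The helper name is hypothetical, and the model argument stands in for the fitted SVR.

```python
import os
import pickle


def save_model_as_pickle(model, output_dir="."):
    """Serialize a fitted model to <output_dir>/model.pkl and return the path."""
    path = os.path.join(output_dir, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path
```

The resulting model.pkl would then be uploaded to the same Cloud Storage directory that previously held model.joblib.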

Calling SageMaker Notebook instance function by endpoint

I am a newbie in AWS. I have defined an image segmentation function in a SageMaker notebook instance that returns masks.
I didn't train my models there; I pip-installed the model packages and uploaded the pre-trained weights manually. The rest is very similar to working on a local machine: I import the package, load the weights, and define a function that takes an image as input and outputs masks.
My question is: is there a way to host my function so that I can call it through a URL endpoint with one image's info and get the masks back in the response?
Again, I am very new to AWS, and I am beginning to suspect that SageMaker is not designed for this job... I chose SageMaker because I need the computing capacity; I don't think I can do this job with pure Lambda.
SageMaker inference endpoints currently rely on an interface based on Docker images. At the base level, you can set up a Docker image that runs a web server and responds on the routes and port that AWS requires. This guide will show you how to do it: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html.
This is an annoying amount of work. If you're using a well-known framework they have a container library that contains some boilerplate code you might be able to reuse: https://github.com/aws/sagemaker-containers. You might have to do some customization there.
Or don't use SageMaker inference endpoints at all :) If your model can fit within the size / memory restrictions of AWS Lambda, that is an easier option!
Full disclaimer, I'm working on a platform that competes with SageMaker: Model Zoo
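To give an idea of what the Docker-based interface involves, here is a rough standard-library sketch of the web server such an image would run. It assumes the contract from the linked guide: GET /ping for health checks and POST /invocations for predictions, served on port 8080. The segment function is a hypothetical placeholder for the real segmentation code, and a production image would more likely use a proper WSGI server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def segment(image_bytes):
    """Hypothetical placeholder: replace with the real segmentation function."""
    return {"num_bytes": len(image_bytes), "masks": []}


class InferenceHandler(BaseHTTPRequestHandler):
    def _reply(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check route polled by SageMaker.
        if self.path == "/ping":
            self._reply(200, {"status": "ok"})
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        # Inference route: the request body carries the raw image bytes.
        if self.path == "/invocations":
            length = int(self.headers.get("Content-Length", 0))
            self._reply(200, segment(self.rfile.read(length)))
        else:
            self._reply(404, {"error": "not found"})

    def log_message(self, *args):
        pass  # keep the sketch quiet


def make_server(host="0.0.0.0", port=8080):
    """Create (but do not start) the HTTP server."""
    return HTTPServer((host, port), InferenceHandler)
```

The container entrypoint would call make_server().serve_forever(); the pre-trained weights would be loaded once at startup rather than per request.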

Custom code containers for google cloud-ml for inference

I am aware that it is possible to deploy custom containers for training jobs on Google Cloud, and I have been able to get one running using the following command:
gcloud ai-platform jobs submit training infer name --region some_region --master-image-uri=path/to/docker/image --config config.yaml
The training job completed successfully and produced the model. Now I want to use this model for inference, but part of my code has system-level dependencies, so I have to make some modifications to the architecture in order to keep it running all the time. This was the reason for using a custom container for the training job in the first place.
As far as I can tell, the documentation only covers the training part; inference with custom containers (if it is possible at all) does not seem to be documented.
The training part documentation is available on this link
My question is, is it possible to deploy custom containers for inference purposes on google cloud-ml?
This response refers to using Vertex AI Prediction, the newest platform for ML on GCP.
Suppose you wrote the model artifacts out to cloud storage from your training job.
The next step is to create the custom container and push to a registry, by following something like what is described here:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements
This section describes how you pass the model artifact directory to the custom container to be used for inference:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#artifacts
You will also need to create an endpoint in order to deploy the model:
https://cloud.google.com/vertex-ai/docs/predictions/deploy-model-api#aiplatform_deploy_model_custom_trained_model_sample-gcloud
Finally, you would use gcloud ai endpoints deploy-model ... to deploy the model to the endpoint:
https://cloud.google.com/sdk/gcloud/reference/ai/endpoints/deploy-model
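As a rough illustration of the artifact-passing step, the serving code inside the custom container might read its settings from the AIP_* environment variables described in the custom container requirements. This is only a sketch: the function name is hypothetical, and the fallback values are assumptions meant for running the container locally.

```python
import os


def read_vertex_settings(env=None):
    """Collect serving settings from the environment variables set by Vertex AI."""
    env = os.environ if env is None else env
    return {
        # Fallbacks below are assumptions for local testing only.
        "port": int(env.get("AIP_HTTP_PORT", "8080")),
        "health_route": env.get("AIP_HEALTH_ROUTE", "/health"),
        "predict_route": env.get("AIP_PREDICT_ROUTE", "/predict"),
        # Cloud Storage directory holding the model artifacts from training.
        "artifact_uri": env.get("AIP_STORAGE_URI", ""),
    }
```

The server would download the artifacts from artifact_uri at startup and then listen on the given port and routes.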

how to run a pre-trained model in AWS sagemaker?

I have a pre-trained model.pkl file and all the other files related to the ML model. I want to deploy it on AWS SageMaker.
But how do I deploy it to SageMaker without training? The fit() method in SageMaker runs the training command and pushes model.tar.gz to an S3 location, and the deploy method then uses that same S3 location to deploy the model. We don't create that location in S3 manually; it is created by SageMaker and named using a timestamp. How can we put our own model.tar.gz file in an S3 location and call the deploy() function using that location?
All you need is:
to have your model in an arbitrary S3 location in a model.tar.gz archive
to have an inference script in a SageMaker-compatible docker image that is able to read your model.pkl, serve it and handle inferences.
to create an endpoint associating your artifact to your inference code
When you ask for an endpoint deployment, SageMaker will take care of downloading your model.tar.gz and uncompressing to the appropriate location in the docker image of the server, which is /opt/ml/model
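Packaging the artifact could be sketched as follows, using only the standard library. The function name is hypothetical; the file is placed at the archive root so that it would unpack directly under /opt/ml/model.

```python
import os
import tarfile


def package_model(model_path, archive_path="model.tar.gz"):
    """Wrap a model file in a gzip-compressed tar archive and return the archive path."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname drops the local directory prefix so the file sits at the archive root.
        tar.add(model_path, arcname=os.path.basename(model_path))
    return archive_path
```

The archive can then be uploaded to any S3 location, for example with aws s3 cp.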
Depending on the framework you use, you may use either a pre-existing docker image (available for Scikit-learn, TensorFlow, PyTorch, MXNet) or you may need to create your own.
Regarding custom image creation, see here the specification and here two examples of custom containers for R and sklearn (the sklearn one is less relevant now that there is a pre-built docker image along with a sagemaker sklearn SDK)
Regarding leveraging existing containers for Sklearn, PyTorch, MXNet, TF, check this example: Random Forest in SageMaker Sklearn container. In this example, nothing prevents you from deploying a model that was trained elsewhere. Note, though, that with a train/deploy environment mismatch you may run into errors caused by software version differences.
Regarding your following experience:
when deploy method is used it uses the same s3 location to deploy the
model, we don't manual create the same location in s3 as it is created
by the aws model and name it given by using some timestamp
I agree that the demos that use the SageMaker Python SDK (one of the many available SDKs for SageMaker) can sometimes be misleading, in the sense that they often leverage the fact that an Estimator that has just been trained can be deployed (Estimator.deploy(..)) in the same session, without having to instantiate the intermediary model concept that maps inference code to model artifact. This design is presumably chosen for the sake of compact code, but in real life, training and deployment of a given model may well be done from different scripts running on different systems. It's perfectly possible to deploy a model without training it previously in the same session: you need to instantiate a sagemaker.model.Model object and then deploy it.
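A minimal sketch of that last step with the SageMaker Python SDK is shown below. The function name, the URI check, and the default instance type are illustrative assumptions; image_uri, model_data, and role must come from your own account.

```python
def deploy_pretrained(image_uri, model_data, role, instance_type="ml.m5.large"):
    """Deploy an already-trained artifact without running fit() first."""
    # Illustrative sanity check (not part of the SDK): expect an S3 archive.
    if not (model_data.startswith("s3://") and model_data.endswith(".tar.gz")):
        raise ValueError("model_data should be an s3:// URI pointing at a .tar.gz archive")

    # Imported lazily; requires `pip install sagemaker` and AWS credentials.
    from sagemaker.model import Model

    # Map the inference image to the model artifact, then create the endpoint.
    model = Model(image_uri=image_uri, model_data=model_data, role=role)
    model.deploy(initial_instance_count=1, instance_type=instance_type)
    return model
```

Framework-specific classes such as the SDK's scikit-learn model class follow the same pattern when a pre-built container is used.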

How to set diskSourceImage in google data flow pipeline

I've been trying to use custom made images to run my google data flow pipeline. Given the information from https://cloud.google.com/compute/docs/reference/latest/images I've tested the following code snippets:
DataflowPipelineOptions options = PipelineOptionsFactory.create().as(DataflowPipelineOptions.class);
...
options.setDiskSourceImage("ubuntu-1504-vivid-v20150911");
options.setDiskSourceImage("projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");
options.setDiskSourceImage("https://www.googleapis.com/compute/beta/projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");
All of the above attempts led to the following error in my pipeline:
(b9c7b66a676906f4): Unable to create VMs. Causes: (b9c7b66a67690aef): Error: Message: Invalid value for field 'resource.disks[0].initializeParams.sourceImage': '[edited]'. Must be the URL to a Compute resource of the correct type HTTP Code: 400
Using a custom disk image with Dataflow is not a viable option. The flag diskSourceImage is deprecated and will be removed in a future SDK release. It is no longer supported because the Dataflow service relies on versioned resources in the VM image, so Dataflow needs control of the VM image in order to upgrade it as necessary. If users supply their own custom images, we have no way of keeping them in sync with the requirements of the Dataflow service.
If your custom VM image is based off a Dataflow image then you would be able to execute jobs using that custom image until the next release of a Dataflow VM image. There is no reasonable way in which you would be able to keep your custom images in sync with Dataflow's VM images so that you would be able to keep this working.
If you would like to customize the VM image please let us know why (e.g. send us an email at dataflow-feedback@google.com) so we can either suggest an alternative solution or else consider supporting your use case in the future.
There's a subtle issue with your setDiskSourceImage call: the URL uses the 'beta' version of the Compute Engine API instead of the current 'v1'. If you try the following, it should work:
options.setDiskSourceImage("https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");