I have my trained AutoML models in my current GCP project, but I want to copy some of them into another project. Is that possible in GCP, or do I have to create a new AutoML model in my new project with the same dataset and train it again, so I get a copy of the one I have in the other project?
Unfortunately, it is not possible to transfer models between projects, as per the official documentation:
"Unless otherwise specified in applicable terms of service or documentation, custom models created in Cloud AutoML products cannot be exported"
You can review the documentation here.
I am able to train a model on Sagemaker and then deploy a model endpoint out of it.
Now, I want to retrain my model every week with the new data that is coming in. My question is: when I retrain the model, how do I update my existing endpoint to use the latest model? (I don't want to deploy a new endpoint.)
From some exploration, I think I can do it in 2 ways -
Near the end of the training job, I create a new EndpointConfig and later use UpdateEndpoint. The downside of this would be that I end up with a lot of unnecessary endpoint configurations in my AWS account. Or am I thinking about it wrongly?
Near the end of the training job, I deploy the trained model using .deploy() and set update_endpoint=True as illustrated in Sagemaker SDK Doc
I am not sure which is the better solution to accomplish this? Is there an even better way to do this?
If you are interested in doing this programmatically, use an AWS SDK (I will answer this assuming you are using Java).
Look at the AWS SDK for Java V2 Javadocs. You can use UpdateEndpoint to perform this use case. This method deploys the new EndpointConfig specified in the request, switches to using the newly created endpoint, and then deletes the resources provisioned for the endpoint using the previous EndpointConfig (there is no availability loss).
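If you prefer Python, the same calls exist in boto3. Here is a minimal sketch of option 1 from the question, with the previous EndpointConfig deleted after the switch so stale configs don't accumulate. The naming scheme, instance type, and variant name are hypothetical:

```python
import time


def new_config_name(endpoint_name, ts=None):
    """Deterministic, timestamped name for each rollout's EndpointConfig
    (a hypothetical naming scheme)."""
    return f"{endpoint_name}-cfg-{ts if ts is not None else int(time.time())}"


def roll_endpoint(endpoint_name, model_name, instance_type="ml.m5.large"):
    """Swap the endpoint onto a freshly created EndpointConfig, then delete
    the previous config so unused configurations don't pile up."""
    import boto3  # imported lazily so the helper above stays dependency-free
    sm = boto3.client("sagemaker")

    old_config = sm.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    new_config = new_config_name(endpoint_name)
    sm.create_endpoint_config(
        EndpointConfigName=new_config,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    )
    # UpdateEndpoint performs a blue/green switch with no availability loss.
    sm.update_endpoint(EndpointName=endpoint_name, EndpointConfigName=new_config)
    sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    sm.delete_endpoint_config(EndpointConfigName=old_config)
    return new_config
```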
I am looking for a solution that allows me to host my trained Sklearn model (that I am satisfied with) on SageMaker without having to retrain it before deploying to an endpoint.
On the one hand, I have seen specific examples for bringing your own scikit model that involve containerizing the trained model, but these guides go through the training step and don't specifically show how you can avoid retraining the model and just deploy. (https://github.com/awslabs/amazon-sagemaker-examples/blob/27d3aeb9166a4d4dbbb0721d381329e41d431078/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb)
On the other hand, there are guides that show you how to BYOM only for deploying, but these are specific to the MXNet and TensorFlow frameworks. I noticed that the way you export your model artifacts differs among frameworks. I need something specific to Sklearn, and a way to get to the point where I have model artifacts in the correct format SageMaker expects. (https://github.com/awslabs/amazon-sagemaker-examples/tree/27d3aeb9166a4d4dbbb0721d381329e41d431078/advanced_functionality/mxnet_mnist_byom)
The closest guide I have seen that might work is this one: https://aws.amazon.com/blogs/machine-learning/bring-your-own-pre-trained-mxnet-or-tensorflow-models-into-amazon-sagemaker/
However, I don't know what my sklearn "model artifacts" include. I think I need a clear understanding of what sklearn model artifacts look like and what they include.
Any help is appreciated. The goal is to avoid training in Sagemaker and only deploy my already trained scikit model to an endpoint.
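For scikit-learn, the "model artifacts" are simply your serialized estimator: what SageMaker expects is a model.tar.gz in S3 whose root contains the model file, plus an inference entry-point script that knows how to load it. A dependency-free sketch of the packaging step (pickle is used here instead of the more common joblib so the sketch runs without sklearn installed; the file name is a convention your inference script controls):

```python
import os
import pickle
import tarfile


def package_model(model, workdir):
    """Serialize a fitted estimator and pack it the way SageMaker expects:
    a model.tar.gz whose root contains the model file."""
    model_path = os.path.join(workdir, "model.pkl")
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    tar_path = os.path.join(workdir, "model.tar.gz")
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(model_path, arcname="model.pkl")  # must sit at the tar root
    return tar_path


# Hypothetical deployment step, assuming the SageMaker Python SDK:
# from sagemaker.sklearn.model import SKLearnModel
# SKLearnModel(model_data="s3://your-bucket/model.tar.gz",
#              role=role,
#              entry_point="inference.py",   # defines model_fn / predict_fn
#              framework_version="0.23-1",
#              ).deploy(initial_instance_count=1, instance_type="ml.m5.large")
```

Upload the resulting model.tar.gz to S3, and no training job in SageMaker is needed at all.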
I am aware that it is possible to deploy custom containers for training jobs on Google Cloud, and I have been able to get one running using the command:
gcloud ai-platform jobs submit training JOB_NAME --region REGION --master-image-uri=path/to/docker/image --config config.yaml
The training job completed successfully and the model was obtained. Now I want to use this model for inference, but the issue is that part of my code has system-level dependencies, so I have to make some modifications to the architecture to keep it running. This was the reason for using a custom container for the training job in the first place.
The documentation is only available for the training part; the inference part with custom containers has not been covered, to the best of my knowledge.
The training part documentation is available on this link
My question is, is it possible to deploy custom containers for inference purposes on google cloud-ml?
This response refers to using Vertex AI Prediction, the newest platform for ML on GCP.
Suppose you wrote the model artifacts out to cloud storage from your training job.
The next step is to create the custom container and push to a registry, by following something like what is described here:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements
This section describes how you pass the model artifact directory to the custom container to be used for inference:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#artifacts
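Inside the container, Vertex AI passes the artifact location and the required routes through environment variables (AIP_STORAGE_URI, AIP_HEALTH_ROUTE, AIP_PREDICT_ROUTE, AIP_HTTP_PORT). A minimal, stdlib-only serving sketch that honors them; the model-loading and scoring logic is a placeholder you would replace with your own:

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Vertex AI injects these into the container; the defaults are for local runs.
HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")
MODEL_DIR = os.environ.get("AIP_STORAGE_URI", "")  # artifact directory from training


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == HEALTH_ROUTE:
            self._reply(200, {"status": "ok"})
        else:
            self._reply(404, {})

    def do_POST(self):
        if self.path == PREDICT_ROUTE:
            body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
            # A real server would load the model from MODEL_DIR and score here.
            preds = [0.0 for _ in body.get("instances", [])]  # placeholder
            self._reply(200, {"predictions": preds})
        else:
            self._reply(404, {})

    def _reply(self, code, payload):
        data = json.dumps(payload).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve():
    """Listen on the port Vertex AI assigns (8080 when run locally)."""
    port = int(os.environ.get("AIP_HTTP_PORT", "8080"))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

The request and response shapes (`instances` in, `predictions` out) follow the Vertex AI prediction container contract.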
You will also need to create an endpoint in order to deploy the model:
https://cloud.google.com/vertex-ai/docs/predictions/deploy-model-api#aiplatform_deploy_model_custom_trained_model_sample-gcloud
Finally, you would use gcloud ai endpoints deploy-model ... to deploy the model to the endpoint:
https://cloud.google.com/sdk/gcloud/reference/ai/endpoints/deploy-model
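That last step can also be scripted. A sketch that assembles the gcloud invocation from Python; the endpoint ID, model ID, display name, and machine type are placeholders:

```python
def build_deploy_command(endpoint_id, model_id, region,
                         display_name="deployed-model",
                         machine_type="n1-standard-2"):
    """Assemble the argv for `gcloud ai endpoints deploy-model`.
    All IDs and names here are hypothetical placeholders."""
    return [
        "gcloud", "ai", "endpoints", "deploy-model", endpoint_id,
        f"--region={region}",
        f"--model={model_id}",
        f"--display-name={display_name}",
        f"--machine-type={machine_type}",
        "--traffic-split=0=100",  # send all traffic to the new deployment
    ]


# To actually run it:
# import subprocess
# subprocess.run(build_deploy_command("ENDPOINT_ID", "MODEL_ID", "us-central1"),
#                check=True)
```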
Say I have code on App Engine that reads Gmail attachments, parses them into Cloud Datastore, runs the data through Dataprep recipes and steps, stores it back into Datastore, and then has an ML Engine TensorFlow model predict on it.
Is this all achievable through Dataflow?
EDIT 1:
Is it possible to export the Data Prep steps and use them as preprocessing before an Ml Engine Tensorflow model?
The input for a Cloud ML Engine model can be defined however you see fit for your project. This means you can apply the preprocessing steps in whatever way suits you and then send your data to the TensorFlow model.
Be sure that the format you use in your Dataprep steps is supported by the TensorFlow model. Once you apply your Dataprep recipe with all the required steps, make sure that you output an appropriate format, such as CSV. It is recommended that you store your input in a Cloud Storage bucket for easier access.
I don't know how familiar you are with Cloud Dataprep, but you can try this to check how to handle all the steps that you want to include in your recipe.
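Once the Dataprep output is in CSV, you still need to reshape the rows into the JSON payload that ML Engine online prediction accepts ({"instances": [...]}). A small sketch; the column names are hypothetical, and you would cast values according to your model's input signature:

```python
import csv
import io


def csv_to_instances(csv_text, feature_columns):
    """Turn Dataprep-style CSV output into the {"instances": [...]} payload
    that ML Engine online prediction expects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each instance is the ordered list of feature values for one row.
    instances = [[float(row[c]) for c in feature_columns] for row in reader]
    return {"instances": instances}
```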
Can we train a model by just giving data and the related column names, without creating a trainer, in Google Cloud ML, either using the REST API or the command-line interface?
Yes. You can use Google Cloud Datalab, which comes with a structured data solution. It has an easier interface and takes care of the trainer. You can view the notebooks without setting up Datalab:
https://github.com/googledatalab/notebooks/tree/master/samples/ML%20Toolbox
Once you set up Datalab, you can run the notebook. To set up Datalab, check https://cloud.google.com/datalab/docs/quickstarts.
Instead of building a model and calling the Cloud ML service directly, you can try Datalab's ML Toolbox, which supports structured data and image classification. The ML Toolbox takes your data and automatically builds and trains a model; you just have to describe your data and what you want to do.
You can view the notebooks first without setting up datalab:
https://github.com/googledatalab/notebooks/tree/master/samples/ML%20Toolbox
To set up Datalab and actually run these notebooks, see https://cloud.google.com/datalab/docs/quickstarts.