I want to deploy a model on the new version of Google ML Engine.
Previously, with Google ML, I could export my trained model by creating a tf.train.Saver() and saving it with saver.save(session, output).
So far I've not been able to find out whether a model exported this way is still deployable on ml-engine, or whether I must follow the training procedure described here, create a new trainer package, and train my model with ml-engine.
Can I still use tf.train.Saver() to obtain the model I will deploy on ml-engine?
tf.train.Saver() only produces a checkpoint.
Cloud ML Engine uses a SavedModel, produced from these APIs: https://www.tensorflow.org/versions/master/api_docs/python/tf/saved_model
A SavedModel is a checkpoint, plus a serialized protobuf containing one or more graph definitions, plus a set of signatures declaring the inputs and outputs of the graph/model, plus additional asset files if applicable, so that all of these can be used at serving time.
I suggest looking at a couple of examples:
The census sample - https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/census/tensorflowcore/trainer/task.py#L334
And my own sample/library code - https://github.com/TensorLab/tensorfx/blob/master/src/training/_hooks.py#L208, which calls into https://github.com/TensorLab/tensorfx/blob/master/src/prediction/_model.py#L66 to demonstrate how to take a checkpoint, load it into a session, and then produce a SavedModel.
Hope these pointers help you adapt your existing code to produce a SavedModel.
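For reference, here's a minimal sketch (TF 1.x APIs) of that conversion: restore an existing tf.train.Saver() checkpoint and re-export it as a SavedModel. The tensor names 'input:0' and 'output:0', the checkpoint prefix, and the export path are placeholders for your graph's actual tensors and output location.

import tensorflow as tf

export_dir = 'gs://my-bucket/model'  # placeholder output location

with tf.Session(graph=tf.Graph()) as sess:
    # Rebuild the graph from the checkpoint's .meta file and restore weights.
    saver = tf.train.import_meta_graph('export.meta')
    saver.restore(sess, 'export')

    # Declare the serving signature: which tensors are inputs and outputs.
    inputs = {'x': tf.saved_model.utils.build_tensor_info(
        sess.graph.get_tensor_by_name('input:0'))}
    outputs = {'y': tf.saved_model.utils.build_tensor_info(
        sess.graph.get_tensor_by_name('output:0'))}
    signature = tf.saved_model.signature_def_utils.build_signature_def(
        inputs=inputs, outputs=outputs,
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

    # Write out the SavedModel that ML Engine can serve.
    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                signature})
    builder.save()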
I think you also asked a similar question about converting a previously exported model, and I'll link to it here for completeness for anyone else: Deploy retrained inception SavedModel to google cloud ml engine
I have trained my YOLOv5 model and have weights.pt; now I need to deploy it using SageMaker, and for that I need to create an endpoint.
I'm following this tutorial: https://sagemaker-examples.readthedocs.io/en/latest/frameworks/pytorch/get_started_mnist_deploy.html
Since I'm working with images, I'm trying to customise the input_fn and output_fn functions, but unfortunately I always get errors when I run inference. My question is: what logic should I follow in order to customise these functions?
Each of these functions has a different purpose. In input_fn you prepare the input for what your model is expecting. In model_fn you load your PyTorch model. In predict_fn you put your prediction/inference code. In output_fn you shape the output that your endpoint returns; a sketch of all four follows below. Check out this article for a more in-depth understanding of each of these handlers, with examples: https://aws.plainenglish.io/adding-custom-inference-scripts-to-amazon-sagemaker-2208c3332510
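For illustration, here's a minimal sketch of such an inference script for an image model; the file name weights.pt, the 640x640 input size, and the JSON output format are assumptions you'd adapt to your YOLOv5 artifact.

import io
import json

import torch
from PIL import Image
from torchvision import transforms

def model_fn(model_dir):
    # Load the serialized network; YOLOv5 checkpoints often store the
    # module under a 'model' key, so adapt this line to your artifact.
    model = torch.load(f'{model_dir}/weights.pt', map_location='cpu')
    model.eval()
    return model

def input_fn(request_body, request_content_type):
    # Decode the request payload into the tensor shape the model expects.
    if request_content_type != 'application/x-image':
        raise ValueError(f'Unsupported content type: {request_content_type}')
    image = Image.open(io.BytesIO(request_body)).convert('RGB')
    preprocess = transforms.Compose([
        transforms.Resize((640, 640)),  # assumed YOLOv5 input size
        transforms.ToTensor(),
    ])
    return preprocess(image).unsqueeze(0)  # add a batch dimension

def predict_fn(input_data, model):
    # Run inference without tracking gradients.
    with torch.no_grad():
        return model(input_data)

def output_fn(prediction, response_content_type):
    # Serialize the raw predictions as JSON; adapt to the boxes/scores
    # format your client expects.
    return json.dumps(prediction.detach().cpu().numpy().tolist())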
I trained a XGBoost model using AI Platform as here.
Now I have the choice in the Console to download the model, as follows (but not Deploy it, since "Only models trained with built-in algorithms can be deployed from this page"). So, I click to download.
However, in the bucket the only file I see is a tar, as follows.
That tar (directory tree follows) holds only some training code, and no model.bst, model.pkl, model.joblib, or other such model file.
Where do I find model.bst or the like, which I can deploy?
EDIT:
Following the answer below, we see that the "Download model" button is misleading, as it sends us to the job directory rather than the output directory (which is set arbitrarily in the code; the model is at census_data_20210527_215945/model.bst):
import datetime

from google.cloud import storage

# Upload the locally saved model file to a timestamped path in the bucket.
bucket = storage.Client().bucket(BUCKET_ID)
blob = bucket.blob('{}/{}'.format(
    datetime.datetime.now().strftime('census_%Y%m%d_%H%M%S'),
    model))
blob.upload_from_filename(model)
Only built-in algorithms automatically store the model in Google Cloud Storage.
In your case, you have a custom training application.
You have to take care of saving the model on your own.
Referring to your example, this is implemented as listed here.
The model is uploaded to Google Cloud Storage using the cloud storage client.
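Once you locate the file, a quick way to sanity-check it locally (a sketch; the path and features are placeholders):

import xgboost as xgb

# Load the model file downloaded from the output path and run a prediction.
bst = xgb.Booster()
bst.load_model('model.bst')
preds = bst.predict(xgb.DMatrix(features))  # features: your input matrix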
I have my own trained TF Object Detection model. When I try to deploy the same model in AWS SageMaker, it does not work.
I have tried TensorFlowModel() in SageMaker, but there is an argument called entry_point. How do I create that .py file for prediction?
entry_point is an argument that takes the file name of your inference script (e.g. inference.py). Once you create an endpoint and try to predict an image via the invoke-endpoint API, an instance of the type you specified is provisioned, and the inference.py script is executed to process the request.
Link: documentation for TensorFlow model deployment in Amazon SageMaker
The inference script must contain the methods input_handler and output_handler, or a single handler that covers both, for pre- and post-processing of your image.
Example for deploying the TensorFlow model
In the above link I have mentioned a Medium post that should be helpful; a sketch of such a script follows.
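To make that concrete, here's a minimal sketch of an inference.py for the SageMaker TensorFlow Serving container; the image content type and the pass-through post-processing are assumptions to adapt to your object-detection model.

import base64
import json

def input_handler(data, context):
    # Pre-processing: wrap the raw image bytes into the JSON body that
    # TensorFlow Serving's REST API expects.
    if context.request_content_type == 'application/x-image':
        encoded = base64.b64encode(data.read()).decode('utf-8')
        return json.dumps({'instances': [{'b64': encoded}]})
    raise ValueError('Unsupported content type: {}'.format(
        context.request_content_type))

def output_handler(data, context):
    # Post-processing: surface serving errors, otherwise return the
    # prediction JSON unchanged.
    if data.status_code != 200:
        raise ValueError(data.content.decode('utf-8'))
    return data.content, context.accept_header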
I have followed an Amazon tutorial for using SageMaker and have used it to create the model in the tutorial (https://aws.amazon.com/getting-started/tutorials/build-train-deploy-machine-learning-model-sagemaker/).
This is my first time using SageMaker, so my question may be stupid.
How do you actually view the model that it has created? I want to be able to see a) the final formula created with the parameters etc., and b) graphs of plotted factors, as if I were reviewing a GLM, for example.
Thanks in advance.
If you followed the SageMaker tutorial you must have trained an XGBoost model. SageMaker places the model artifacts in a bucket that you own; check the output S3 location in the AWS SageMaker console.
For more information about XGBoost you can check the AWS SageMaker documentation https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#xgboost-sample-notebooks and the example notebooks, e.g. https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/xgboost_abalone/xgboost_abalone.ipynb
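To actually inspect the artifact, you can pull it down from that output location and unpack it; a sketch, with the bucket name and key as placeholders:

import tarfile

import boto3

# Download the training artifact and unpack it to get the model file.
s3 = boto3.client('s3')
s3.download_file('my-bucket', 'xgboost-output/model.tar.gz', 'model.tar.gz')
with tarfile.open('model.tar.gz') as tar:
    tar.extractall('.')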
To consume the XGBoost artifact generated by SageMaker, check out the official documentation, which contains the following code:
# SageMaker XGBoost uses the Python pickle module to serialize/deserialize
# the model, which can be used for saving/loading the model.
# To use a model trained with SageMaker XGBoost in open source XGBoost,
# use the following Python code:
import pickle as pkl

# model_file_path points at the unpacked model file from model.tar.gz.
with open(model_file_path, 'rb') as f:
    model = pkl.load(f)

# prediction with test data (dtest is an xgboost.DMatrix of your test set)
pred = model.predict(dtest)
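For the plotting side of your question, XGBoost ships helpers (matplotlib required, and graphviz for trees) that visualize the loaded model; a sketch, assuming 'model' is the object unpickled above:

import matplotlib.pyplot as plt
import xgboost as xgb

xgb.plot_importance(model)           # bar chart of feature importances
xgb.plot_tree(model, num_trees=0)    # render the first boosted tree
plt.show()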
I've submitted a training job to the cloud using the RESTful API and see in the console logs that it completed successfully. In order to deploy the model and use it for predictions I have saved the final model using tf.train.Saver().save() (according to the how-to guide).
When running locally, I can find the graph files (export-* and export-*.meta) in the working directory. When running on the cloud however, I don't know where they end up. The API doesn't seem to have a parameter for specifying this, it's not in the bucket with the trainer app, and I can't find any temporary buckets on the cloud storage created by the job.
When you set up your Cloud ML environment you set up a bucket for this purpose. Have you looked in there?
https://cloud.google.com/ml/docs/how-tos/getting-set-up
Edit (for future record): As Robert mentioned in comments, you'll want to pass the output location to the job as an argument. A couple of things to be mindful of:
Use a unique output location per job, so one job doesn't clobber the outputs of another.
The recommendation is to specify the parent output path and use it to contain the exported model in a subpath called 'model', as well as to organize other outputs like checkpoints and summaries within that path. That makes it easier to manage all the outputs.
While not required, I'll also suggest staging the training code in a packages subpath of the output, which helps correlate the source with the outputs it produces.
Finally(!), keep in mind that when you use hyperparameter tuning, you'll need to append the trial id to the output path so outputs produced by individual runs don't collide (see the sketch below).
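As an illustration of that last point, here's a sketch of composing a per-run output path; on Cloud ML Engine the trial id for hyperparameter tuning runs arrives in the TF_CONFIG environment variable, and the output_path argument stands in for whatever flag your trainer actually accepts.

import json
import os

def get_output_dir(output_path):
    # Append the hyperparameter-tuning trial id (empty outside tuning)
    # so parallel trials don't clobber each other's outputs.
    tf_config = json.loads(os.environ.get('TF_CONFIG', '{}'))
    trial = tf_config.get('task', {}).get('trial', '')
    if trial:
        output_path = os.path.join(output_path, trial)
    return os.path.join(output_path, 'model')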