I have trained my YOLOv5 model and have the weights.pt file; now I need to deploy it with SageMaker, which means creating an endpoint.
I'm following this tutorial: https://sagemaker-examples.readthedocs.io/en/latest/frameworks/pytorch/get_started_mnist_deploy.html
Since I'm working with images, I'm trying to customise the input_fn and output_fn functions, but when I run inference I always get errors. My question is: what logic should I follow in order to customise these functions?
Each of these functions has a different purpose. In input_fn you prepare the incoming request into the format your model expects. In model_fn you load your PyTorch model. In predict_fn you run your prediction/inference code. In output_fn you shape the output that your endpoint returns. Check out this article for a more in-depth explanation of each of these handlers, along with examples: https://aws.plainenglish.io/adding-custom-inference-scripts-to-amazon-sagemaker-2208c3332510
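To make that logic concrete, here is a minimal, hedged sketch of an inference.py for an image model. It assumes a TorchScript export named weights.pt, a 640x640 input and JSON output; a real YOLOv5 deployment would also need its own post-processing (NMS, box scaling), which is omitted here.

import io
import json
import os

import torch
from PIL import Image
from torchvision import transforms


def model_fn(model_dir):
    # Load the serialized model shipped inside model.tar.gz (assumes a TorchScript export).
    model = torch.jit.load(os.path.join(model_dir, "weights.pt"), map_location="cpu")
    model.eval()
    return model


def input_fn(request_body, request_content_type):
    # Turn the raw request bytes into the tensor the model expects.
    if request_content_type in ("application/x-image", "image/jpeg", "image/png"):
        image = Image.open(io.BytesIO(request_body)).convert("RGB")
        preprocess = transforms.Compose([
            transforms.Resize((640, 640)),   # assumed input size
            transforms.ToTensor(),
        ])
        return preprocess(image).unsqueeze(0)  # add a batch dimension
    raise ValueError("Unsupported content type: {}".format(request_content_type))


def predict_fn(input_data, model):
    # Run inference without tracking gradients.
    with torch.no_grad():
        output = model(input_data)
    # Some exports return a tuple; keep only the detections tensor.
    return output[0] if isinstance(output, (list, tuple)) else output


def output_fn(prediction, response_content_type):
    # Serialize the raw detections as JSON; post-processing is left to the client here.
    if response_content_type == "application/json":
        return json.dumps(prediction.tolist())
    raise ValueError("Unsupported accept type: {}".format(response_content_type))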
I have created a custom container for prediction and successfully uploaded the model to Vertex AI. I was also able to deploy the model to an endpoint and successfully request predictions from the endpoint. Within the custom container code, I use the parameters field as described here, which I then supply later on when making an online prediction request.
My questions are regarding requesting batch predictions from a custom container for prediction.
I cannot find any documentation that describes what happens when I request a batch prediction. Say, for example, I use the my_model.batch_predict function from the Python SDK, set the instances_format to "csv" and provide the gcs_source. Now, I have set up my custom container to expect prediction requests at /predict as described in this documentation. Does Vertex AI make a POST request to this path, converting the csv data into the appropriate POST body?
How do I specify the parameters field for batch prediction as I did for online prediction?
Yes, Vertex AI makes a POST request to your custom container for batch prediction.
No, there is no way for batch prediction to pass the parameters field, since the service cannot tell which column is a "parameter"; everything is put into "instances".
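For illustration, a hedged sketch of kicking off such a batch prediction with the Vertex AI Python SDK might look like this; the project, model resource name, bucket paths and machine type are placeholders.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

model = aiplatform.Model("projects/my-project/locations/us-central1/models/1234567890")

batch_job = model.batch_predict(
    job_display_name="my-batch-job",
    gcs_source="gs://my-bucket/input/data.csv",        # each CSV row becomes one instance
    gcs_destination_prefix="gs://my-bucket/output/",
    instances_format="csv",
    predictions_format="jsonl",
    machine_type="n1-standard-4",
)
batch_job.wait()

Each request the service then sends to your container's /predict route carries a body of roughly the form {"instances": [[...row values...], ...]}, with no "parameters" key.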
I have an endpoint running a trained SageMaker model on AWS, which expects the data on a specific format.
Initially, the data was processed on the client side of the application, meaning the API Gateway (which receives the POST API calls on AWS) used to receive pre-processed data. Now that has changed: the API Gateway will receive raw data from the client, and pre-processing it before sending it to our SageMaker model is up to our workflow.
What is the best way to add a pre-processing step to this workflow without needing to re-train the model? My pre-processing is just a bunch of dataframe transformations, with no standardization or calculation against the training set required (it would not need to save any model file).
Thanks!
After some research, this is the solution I've followed:
First I created a SKLearn SageMaker model to do all the pre-processing (I built a custom Scikit-Learn class to handle all the pre-processing steps, following this AWS code).
Trained this pre-processing model on my training data. My model in particular didn't need to be trained (it has no standardization or anything else that would need to store parameters from the training data), but SageMaker requires the model to be trained.
Loaded the trained legacy model that we already had, using the SageMaker Model class.
Created a PipelineModel with the preprocessing model and legacy model in cascade:
pipeline_model = PipelineModel(name=model_name,
                               role=role,
                               models=[
                                   preprocess_model,
                                   trained_model
                               ])
Created a new endpoint from the PipelineModel and then changed the Lambda function to call this new endpoint. With this I could send the raw data directly to the same API Gateway, and it would call only one endpoint, without having to pay for two endpoints 24/7 to perform the entire process.
I've found this to be a good and "economical" way to perform the pre-processing outside the trained model, without having to run heavy processing jobs in a Lambda function.
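Putting those steps together, a hedged sketch of the whole setup might look like the following; the S3 paths, IAM role ARN, ECR image URI, entry-point filename and instance type are placeholders, not the actual values used above.

from sagemaker.model import Model
from sagemaker.pipeline import PipelineModel
from sagemaker.sklearn.model import SKLearnModel

role = "arn:aws:iam::111122223333:role/MySageMakerExecutionRole"   # placeholder execution role

# Pre-processing model: the entry point holds the custom Scikit-Learn class and handler functions.
preprocess_model = SKLearnModel(
    model_data="s3://my-bucket/preprocess/model.tar.gz",
    role=role,
    entry_point="preprocessing.py",
    framework_version="1.2-1",
)

# Legacy trained model, wrapped with the generic Model class.
trained_model = Model(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/legacy-model:latest",
    model_data="s3://my-bucket/legacy/model.tar.gz",
    role=role,
)

pipeline_model = PipelineModel(
    name="raw-data-pipeline-model",
    role=role,
    models=[preprocess_model, trained_model],
)

# One endpoint serves both containers in sequence: pre-processing first, then the legacy model.
predictor = pipeline_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="raw-data-endpoint",
)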
I would create a Lambda function that is invoked by the API Gateway, processes the data, and sends it to your SageMaker endpoint.
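For reference, here is a minimal sketch of such a Lambda handler; the endpoint name, field names and pre-processing logic are placeholders.

import json
import os

import boto3

runtime = boto3.client("sagemaker-runtime")
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "my-sagemaker-endpoint")   # placeholder


def lambda_handler(event, context):
    # Raw payload forwarded by API Gateway.
    raw = json.loads(event["body"])

    # Placeholder pre-processing: reorder/rename fields into the format the model expects.
    features = [raw["feature_a"], raw["feature_b"], raw["feature_c"]]

    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps({"instances": [features]}),
    )
    prediction = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": prediction}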
I have my own trained TF Object Detection model. When I try to deploy the same model in AWS SageMaker, it does not work.
I have tried TensorFlowModel() in SageMaker, but there is an argument called entry_point. How do I create that .py file for prediction?
entry_point is an argument that takes the file name of your inference script, e.g. inference.py. Once you create an endpoint and try to predict on an image using the invoke-endpoint API, an instance of the type you specified is spun up, and the inference.py script is executed to handle the request.
Link: Documentation for TensorFlow model deployment in Amazon SageMaker
The inference script must contain the methods input_handler and output_handler (or a single handler method that covers both) in the inference.py script; these are for pre- and post-processing of your image.
Example for deploying the TensorFlow model
In the above link, I have mentioned a Medium post that should be helpful for your doubts.
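As a hedged illustration of that handler interface for the TensorFlow Serving container, an inference.py could look roughly like this; the input size and JSON layout are assumptions and must be adapted to your object detection model's signature.

import io
import json

import numpy as np
from PIL import Image


def input_handler(data, context):
    # Pre-process the incoming request into the JSON body TensorFlow Serving expects.
    if context.request_content_type in ("application/x-image", "image/jpeg", "image/png"):
        image = Image.open(io.BytesIO(data.read())).convert("RGB")
        image = image.resize((300, 300))                         # assumed input size
        instances = np.expand_dims(np.asarray(image, dtype=np.uint8), axis=0).tolist()
        return json.dumps({"instances": instances})
    raise ValueError("Unsupported content type: {}".format(context.request_content_type))


def output_handler(response, context):
    # Post-process the TensorFlow Serving response before returning it to the client.
    if response.status_code != 200:
        raise ValueError(response.content.decode("utf-8"))
    return response.content, "application/json"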
I am trying to deploy an sklearn SVM model on AWS SageMaker, but while testing the model I am getting different outputs even though I am using the same hyperparameters, the same training data, and the same test data.
svm.SVC(kernel='rbf',gamma=1.0,C=10,probability=True)
I am expecting five classes in the output. The following is the output on the test data that I got when running locally:
In SageMaker, I am only getting four as output for all test data.
Have you tried setting the same random seed everywhere?
Try using np.random.seed(0) at the beginning of your code, before instantiating the SVM.
If that doesn't work, try adding a random state to your model:
svm.SVC(kernel='rbf',gamma=1.0,C=10,probability=True, random_state=0)
I want to deploy a model on the new version of Google ML Engine.
Previously, with Google ML, I could export my trained model by creating a tf.train.Saver() and saving the model with saver.save(session, output).
So far I've not been able to find out whether a model exported this way is still deployable on ml-engine, or whether I must follow the training procedure described here, create a new trainer package, and necessarily train my model with ml-engine.
Can I still use tf.train.Saver() to obtain the model I will deploy on ml-engine?
tf.train.Saver() only produces a checkpoint.
Cloud ML Engine uses a SavedModel, produced from these APIs: https://www.tensorflow.org/versions/master/api_docs/python/tf/saved_model?hl=bn
A saved model is a checkpoint + a serialized protobuf containing one or more graph definitions + a set of signatures declaring the inputs and outputs of the graph/model + additional asset files if applicable, so that all of these can be used at serving time.
I suggest looking at a couple of examples:
The census sample - https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/census/tensorflowcore/trainer/task.py#L334
And my own sample/library code - https://github.com/TensorLab/tensorfx/blob/master/src/training/_hooks.py#L208 that calls into https://github.com/TensorLab/tensorfx/blob/master/src/prediction/_model.py#L66 to demonstrate how to use a checkpoint, load it into a session and then produce a savedmodel.
Hope these pointers help you adapt your existing code to produce a SavedModel.
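As a rough sketch of that last step (TF 1.x APIs, with a placeholder checkpoint path and placeholder tensor names), restoring a checkpoint and re-exporting it as a SavedModel could look like this:

import tensorflow as tf

checkpoint_path = "output/model.ckpt"   # produced earlier by saver.save(session, output)
export_dir = "export/1"

with tf.Session(graph=tf.Graph()) as sess:
    # Rebuild the graph from the meta file and restore the weights from the checkpoint.
    saver = tf.train.import_meta_graph(checkpoint_path + ".meta")
    saver.restore(sess, checkpoint_path)

    # Placeholder tensor names: replace with your model's real input/output tensors.
    inputs = {"x": sess.graph.get_tensor_by_name("input:0")}
    outputs = {"scores": sess.graph.get_tensor_by_name("output:0")}

    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs=inputs, outputs=outputs)

    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(
        sess,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
        })
    builder.save()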
I think you also asked a similar question about converting a previously exported model; I'll link it here for completeness for anyone else: Deploy retrained inception SavedModel to google cloud ml engine