I have a use case for hosting multiple XGBoost models in one SageMaker endpoint. The models have slightly different feature sets and feature preprocessing.
The two options I am considering are:
1. Creating models with custom Docker images and hosting them in one endpoint using production variants. I would then invoke the endpoint with the variant name and the correct feature set.
2. Using the SageMaker Inference Toolkit (multi-model-server). In the handler script I am planning to preprocess the input differently based on the model name.
Are these the right approaches for the problem? Or is there a better approach for working with SageMaker and multiple XGBoost models with pre- and post-processing?
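To make the second option concrete, the per-model dispatch I have in mind looks roughly like the sketch below. It is only an illustration: the model names, feature names, and the rescaling step are hypothetical, and how the handler learns the model name is left out.

```python
from typing import Callable, Dict, List

# Hypothetical feature orders for two models; the names are illustrative only.
FEATURES_A = ["age", "income"]
FEATURES_B = ["age", "income", "tenure_months"]


def preprocess_a(record: dict) -> List[float]:
    """Select model A's features in a fixed order."""
    return [float(record[name]) for name in FEATURES_A]


def preprocess_b(record: dict) -> List[float]:
    """Select model B's features and rescale tenure from months to years."""
    row = [float(record[name]) for name in FEATURES_B]
    row[-1] = row[-1] / 12.0
    return row


# Registry mapping a model name to its preprocessing function.
PREPROCESSORS: Dict[str, Callable[[dict], List[float]]] = {
    "model-a": preprocess_a,
    "model-b": preprocess_b,
}


def preprocess_for(model_name: str, record: dict) -> List[float]:
    """Dispatch to the preprocessing registered for the given model name."""
    try:
        preprocessor = PREPROCESSORS[model_name]
    except KeyError:
        raise ValueError(f"No preprocessing registered for model {model_name!r}")
    return preprocessor(record)
```

Keeping the mapping in one registry means adding a model only requires one new function and one new dictionary entry.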
Related
I am using AWS SageMaker to deploy speech models that were trained outside of SageMaker. I was able to convert my model into a format SageMaker understands and have deployed it as an endpoint. The problem is that SageMaker directly loads the model and calls .predict to get the inference. I am unable to figure out where I can add my preprocessing functions in the deployed model. It has been suggested that I use AWS Lambda or another server for preprocessing. Is there any way I can incorporate complex preprocessing (which cannot be done with a simple framework such as scikit-learn or pandas) in SageMaker itself?
You will want to adjust the predictor.py file in the container that holds your speech models. Assuming you are using Bring Your Own Container to deploy these models on SageMaker, adjust the predictor code to include your preprocessing functionality. If the preprocessing needs extra dependencies, make sure to add them to your Dockerfile. Having the preprocessing inside the predictor file ensures your data is transformed as you want before the model is called and predictions are returned. This will add to the response time, however, so if you have heavy preprocessing workloads or ETL that needs to occur, you may want to look into a service such as AWS Glue (ETL) or Kinesis (real-time data streaming and data transformation). If you choose to use Lambda, keep in mind the 15-minute timeout limit.
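As a rough illustration of what "preprocessing inside the predictor" can look like, here is a minimal sketch. Everything in it is an assumption made for the example: DummyModel stands in for your real speech model, preprocess is a hypothetical peak normalisation, and the JSON request shape with a "samples" key is invented. The web framework wiring is left out.

```python
import json
from typing import List


class DummyModel:
    """Stand-in for the real speech model; replace with actual model loading."""

    def predict(self, features: List[float]) -> float:
        return sum(features) / len(features) if features else 0.0


def preprocess(raw_samples: List[float]) -> List[float]:
    """Hypothetical preprocessing: peak-normalise the samples to [-1, 1]."""
    peak = max((abs(s) for s in raw_samples), default=0.0)
    if peak == 0.0:
        return list(raw_samples)
    return [s / peak for s in raw_samples]


class ScoringService:
    """Loads the model once and applies preprocessing before each prediction."""

    _model = None

    @classmethod
    def get_model(cls):
        if cls._model is None:
            # In a real container the artifact would be read from /opt/ml/model.
            cls._model = DummyModel()
        return cls._model

    @classmethod
    def predict(cls, request_body: str) -> str:
        payload = json.loads(request_body)
        features = preprocess(payload["samples"])  # preprocessing happens here
        prediction = cls.get_model().predict(features)
        return json.dumps({"prediction": prediction})
```

The request handler in predictor.py would then call ScoringService.predict with the raw request body, so every request passes through preprocess before it reaches the model.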
I work for AWS & my opinions are my own
I am a newbie in AWS. Right now I have defined an image segmentation function in a SageMaker notebook instance, and it returns masks.
I didn't train my models there. What I did was pip install the model packages there and upload the pre-trained weights manually. The rest is very similar to working on a local machine: I imported the package, loaded the weights, and defined a function that takes an image as input and outputs masks.
My question is: is there a way to host my function so that I can call it through a URL endpoint with one image, and it returns the masks in the response?
Again, I am very new to AWS and I am beginning to suspect that SageMaker is not designed for this job... The reason I chose SageMaker is the need for computing capacity; I don't think I can do this job with pure Lambda.
SageMaker inference endpoints currently rely on an interface based on Docker images. At the base level, you can set up a Docker image that runs a web server and responds to the routes on the port that AWS requires. This guide will show you how to do it: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html.
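To give an idea of the shape of such a server, here is a minimal sketch using only the Python standard library. SageMaker sends health checks to GET /ping and inference requests to POST /invocations on port 8080. The predict function is a placeholder that just echoes its input; in your case it would run the segmentation function and return the masks. A production container would normally use a proper WSGI server instead of http.server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload: dict) -> dict:
    """Placeholder for the real inference function (e.g. image in, masks out)."""
    return {"echo": payload}


class InferenceHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        # Health check: SageMaker polls GET /ping and expects HTTP 200.
        if self.path == "/ping":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        # Inference requests arrive as POST /invocations.
        if self.path != "/invocations":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        self._send_json(200, predict(payload))

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(port: int = 8080) -> HTTPServer:
    """Build the server; SageMaker expects the container to listen on port 8080."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)

# In the container entrypoint you would run: make_server().serve_forever()
```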
This is an annoying amount of work. If you're using a well-known framework, AWS has a container library with some boilerplate code you might be able to reuse: https://github.com/aws/sagemaker-containers. You might have to do some customization there.
Or don't use SageMaker inference endpoints at all :) If your model can fit within the size / memory restrictions of AWS Lambda, that is an easier option!
Full disclaimer, I'm working on a platform that competes with SageMaker: Model Zoo
I have a custom machine learning predictive model. I also have a user-defined Estimator class that uses Optuna for hyperparameter tuning. I need to deploy this model to SageMaker so that I can invoke it from a Lambda function.
I'm having trouble creating a container for the model and the Estimator.
I am aware that SageMaker has a scikit-learn container which can be used for Optuna, but how would I leverage it to include the functions from my own Estimator class? Also, the model is one of the parameters passed to this Estimator class, so how do I define it as a separate training job in order to turn it into an endpoint?
This is how the Estimator class and the model are invoked:
sirf_estimator = Estimator(
    SIRF, ncov_df, population_dict[countryname],
    name=countryname, places=[(countryname, None)],
    start_date=critical_country_start
)
sirf_dict = sirf_estimator.run()
where:
Model Name : SIRF
Cleaned Dataset : ncov_df
Would be really helpful if anyone could look into this, thanks a ton!
The SageMaker inference endpoints currently rely on an interface based on Docker images. At the base level, you can set up a Docker image that runs a web server and responds to the routes on the port that AWS requires. This guide will show you how to do it: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html.
This is an annoying amount of work. If you're using a well-known framework, AWS has a container library with some boilerplate code you might be able to reuse, with some customization: https://github.com/aws/sagemaker-containers.
Or don't use SageMaker inference endpoints at all :) If your model can fit within the size / memory restrictions of AWS Lambda, that is an easier option!
I have a pre-trained model.pkl file and all the other files related to the ML model. I want to deploy it on AWS SageMaker.
But how do I deploy it to SageMaker without training? The fit() method in SageMaker runs the training job and pushes the model.tar.gz to an S3 location, and the deploy() method then uses that same S3 location to deploy the model. We don't create that S3 location manually; SageMaker creates it and names it using a timestamp. How can I put my own model.tar.gz file in an S3 location and call deploy() using that location?
All you need is:
to have your model in an arbitrary S3 location in a model.tar.gz archive
to have an inference script in a SageMaker-compatible docker image that is able to read your model.pkl, serve it and handle inferences.
to create an endpoint associating your artifact to your inference code
When you ask for an endpoint deployment, SageMaker takes care of downloading your model.tar.gz and decompressing it to the appropriate location in the server's Docker image, which is /opt/ml/model.
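For the first item in the list, the archive can be built with the standard library. This is a minimal sketch; the function name package_model is made up for the example, and it assumes your inference code expects model.pkl at the root of the extracted archive.

```python
import tarfile
from pathlib import Path


def package_model(model_path: str, archive_path: str = "model.tar.gz") -> str:
    """Write a gzip-compressed tar archive with the model file at its root."""
    model_file = Path(model_path)
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname drops the local directory, so the file lands directly
        # under /opt/ml/model once SageMaker extracts the archive.
        archive.add(str(model_file), arcname=model_file.name)
    return archive_path
```

After that, upload the archive to any S3 location you like, for example with boto3's upload_file, and keep the resulting S3 URI for the deployment step.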
Depending on the framework you use, you may use either a pre-existing docker image (available for Scikit-learn, TensorFlow, PyTorch, MXNet) or you may need to create your own.
Regarding custom image creation, see here for the specification and here for two examples of custom containers for R and sklearn (the sklearn one is less relevant now that there is a pre-built Docker image along with a SageMaker sklearn SDK).
Regarding leveraging existing containers for Sklearn, PyTorch, MXNet, and TF, check this example: Random Forest in SageMaker Sklearn container. In this example, nothing prevents you from deploying a model that was trained elsewhere. Note, though, that with a train/deploy environment mismatch you may run into errors caused by software version differences.
Regarding your following experience:
when deploy method is used it uses the same s3 location to deploy the
model, we don't manual create the same location in s3 as it is created
by the aws model and name it given by using some timestamp
I agree that the demos that use the SageMaker Python SDK (one of the many available SDKs for SageMaker) can sometimes be misleading, in the sense that they often leverage the fact that an Estimator that has just been trained can be deployed (Estimator.deploy(..)) in the same session, without having to instantiate the intermediary model concept that maps inference code to model artifact. This design is presumably chosen for the sake of compact code, but in real life, training and deployment of a given model may well be done from different scripts running in different systems. It's perfectly possible to deploy a model without training it previously in the same session: you need to instantiate a sagemaker.model.Model object and then deploy it.
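A minimal sketch of that last step, assuming version 2 of the SageMaker Python SDK (where the constructor argument is image_uri). The helper names model_kwargs and deploy_pretrained and the default instance type are illustrative choices, not part of the SDK; the image URI, S3 URI, and IAM role are values you supply.

```python
def model_kwargs(image_uri: str, model_data: str, role: str) -> dict:
    """Collect the three things a SageMaker model ties together."""
    return {
        "image_uri": image_uri,    # Docker image containing the inference code
        "model_data": model_data,  # S3 URI of your model.tar.gz
        "role": role,              # IAM role that SageMaker assumes
    }


def deploy_pretrained(image_uri: str, model_data: str, role: str,
                      instance_type: str = "ml.m5.large"):
    """Create a Model from an existing artifact and deploy it to an endpoint."""
    # Imported lazily so the sketch can be read and tested without the SDK.
    from sagemaker.model import Model

    model = Model(**model_kwargs(image_uri, model_data, role))
    model.deploy(initial_instance_count=1, instance_type=instance_type)
    return model
```

No fit() call is involved: the Model object only maps the inference image to the artifact already sitting in S3.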
I have already implemented a SageMaker pipeline model. In particular, for an end-to-end notebook that trains a model, builds a pipeline model and deploys it, I followed this sample notebook.
Now I would like to retrain and deploy the entire pipeline every day using Airflow, but here I have only seen how to retrain and deploy a single SageMaker model.
Is there a way to retrain and deploy the entire pipeline? Thanks
SageMaker provides two options for working with Airflow:
1. Use the APIs in the SageMaker Python SDK to generate the input for the SageMaker operators in Airflow. The blog you linked goes this way. For example, it uses the training_config API in the SageMaker Python SDK and the SageMakerTrainingOperator operator in Airflow.
2. Use the PythonOperator provided by Airflow and write Python code to do what you want.
For 1, SageMaker has only implemented APIs related to training, tuning, single-model deployment and transform. Since you are using a pipeline model, I don't think it has the API you want.
But for 2, if you can accomplish what you want in plain Python code with SageMaker, you should be able to adapt that code into Python callables and make them work with PythonOperator. Here's an example for training in this way provided by SageMaker:
https://sagemaker.readthedocs.io/en/stable/using_workflow.html#using-airflow-python-operator
I think you can do similar things to make Airflow work with your pipeline model.
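A minimal sketch of the PythonOperator route, assuming Airflow 2.4 or later. The three callables (train, build_pipeline_model, deploy) are placeholders for your own SageMaker code from the notebook, and the DAG id, task id, and start date are made up for the example.

```python
from datetime import datetime


def retrain_and_deploy_pipeline(train, build_pipeline_model, deploy):
    """Run the three stages in order; each argument is a user-supplied callable."""
    trained = train()
    pipeline_model = build_pipeline_model(trained)
    return deploy(pipeline_model)


def build_dag(train, build_pipeline_model, deploy):
    """Wrap the callable above in a daily Airflow DAG."""
    # Imported lazily so the callable above stays usable without Airflow.
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    dag = DAG(
        dag_id="retrain_and_deploy_pipeline_model",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    )
    PythonOperator(
        task_id="retrain_and_deploy",
        python_callable=lambda: retrain_and_deploy_pipeline(
            train, build_pipeline_model, deploy
        ),
        dag=dag,
    )
    return dag
```

Splitting the stages into separate PythonOperator tasks would also work, but then the trained model reference has to be passed between tasks, for example as an S3 URI.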