Amazon SageMaker multiple-models

I am interested in the Amazon SageMaker multi-model option for running several models on one endpoint. How does it look in practice? When I send requests to different models, can SageMaker handle them simultaneously?
Thank you.

You need to specify which model to invoke in each request, via the TargetModel parameter. The value is the name of the model artifact as stored under the endpoint's S3 prefix.
Invoke a Multi-Model Endpoint
import boto3

runtime_sm_client = boto3.client("sagemaker-runtime")

response = runtime_sm_client.invoke_endpoint(
    EndpointName="my-endpoint",
    ContentType="text/csv",
    TargetModel="new_york.tar.gz",  # artifact name relative to the endpoint's S3 prefix
    Body=body,
)
Save on inference costs by using Amazon SageMaker multi-model endpoints
There are multiple limitations. Currently, the SageMaker Multi Model Server (MMS) cannot use GPUs.
Host Multiple Models with Multi-Model Endpoints
Multi-model endpoints are not supported on GPU instance types.
The SageMaker Python SDK documentation is not clear about which framework models support Multi Model Server deployment, or how. For instance, with Use TensorFlow with the SageMaker Python SDK, the SageMaker endpoint Docker image is automatically picked by SageMaker from the images in Available Deep Learning Containers Images. However, it is not clear which framework images are MMS-ready.
[Deploy Multiple ML Models on a Single Endpoint Using Multi-model Endpoints on Amazon SageMaker] explains building the AWS XGBoost image with MMS. Hence, apparently, the Docker image needs to be built with MMS specified as the front end; if an image is not built in such a way, MMS may not be available.
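As a rough illustration, a container built that way typically installs multi-model-server together with the SageMaker Inference Toolkit and starts MMS from its entrypoint. A minimal sketch; the handler module name below is a placeholder, not a real AWS example:

# entrypoint.py: start MMS as the container front end using the
# SageMaker Inference Toolkit (pip install multi-model-server sagemaker-inference).
from sagemaker_inference import model_server

if __name__ == "__main__":
    # "my_handler_service" is a hypothetical module implementing the
    # toolkit's handler interface (model_fn/input_fn/predict_fn/output_fn).
    model_server.start_model_server(handler_service="my_handler_service")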
Such information is missing from the AWS documentation, so if an issue is encountered you may need AWS Support to identify the cause. Especially since the SageMaker team keeps changing the images, the MMS implementation, and so on, issues can be expected.
References
SageMaker Inference Toolkit
Multi Model Server
Deploy Multiple ML Models on a Single Endpoint Using Multi-model Endpoints on Amazon SageMaker

Related

Aws Model Quality Monitoring without Endpoints

Is there any way to do model monitoring in AWS without an endpoint? Kindly share any good notebooks on this if you know of any.
AWS does not give any clear example of batch model monitoring.
Amazon SageMaker Model Monitor monitors the quality of Amazon SageMaker machine learning models in production.
You can set up continuous monitoring with a real-time endpoint (or a batch transform job that runs regularly), or on-schedule monitoring for asynchronous batch transform jobs.
Here are some example notebooks:
(1) SageMaker Model Monitor with Batch Transform - Data Quality Monitoring On-Schedule (link)
(2) SageMaker Data Quality Model Monitor for Batch Transform with SageMaker Pipelines On-demand (link)
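For orientation, here is a rough sketch of what those notebooks set up with the SageMaker Python SDK. The role ARN, S3 paths, and names are placeholders, and the BatchTransformInput signature has varied across SDK versions:

from sagemaker.model_monitor import DefaultModelMonitor, BatchTransformInput
from sagemaker.model_monitor.dataset_format import DatasetFormat, MonitoringDatasetFormat

monitor = DefaultModelMonitor(
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Baseline statistics and constraints computed from the training data.
monitor.suggest_baseline(
    baseline_dataset="s3://my-bucket/train.csv",  # placeholder
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://my-bucket/baseline",      # placeholder
)

# Schedule monitoring against batch transform output instead of an endpoint.
monitor.create_monitoring_schedule(
    monitor_schedule_name="my-batch-data-quality-monitor",  # placeholder
    batch_transform_input=BatchTransformInput(
        data_captured_destination_s3_uri="s3://my-bucket/transform-capture",  # placeholder
        destination="/opt/ml/processing/input",
        dataset_format=MonitoringDatasetFormat.csv(header=True),
    ),
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression="cron(0 * ? * * *)",  # hourly
)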

AWS SageMaker - Upload our own docker image

I am new to AWS SageMaker and I am using this technology for building and training machine learning models. I have now developed a Docker image which contains our custom TensorFlow code. I would like to upload this custom Docker image to AWS SageMaker and make use of it.
I have searched various links but could not find proper information on how to upload a custom Docker image.
Can you please suggest recommended links on the process of uploading our own Docker image to AWS SageMaker?
In order to work with SageMaker, you have to push your container to ECR. The most important thing is that the container must be "adapted" to be compliant with what SageMaker requires, but everything is described here. In addition, if you want to take a look at an example, here is mine, where I use my container with the TF Object Detection API in AWS SageMaker.
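As a sketch of the ECR side, the repository can be created and a login token fetched with boto3 (the repository name is a placeholder; the actual tag and push still happen with the docker CLI):

import boto3

ecr = boto3.client("ecr")

# Create a repository for the custom image, ignoring the error if it
# already exists.
try:
    ecr.create_repository(repositoryName="my-sagemaker-container")  # placeholder
except ecr.exceptions.RepositoryAlreadyExistsException:
    pass

# Fetch a temporary docker login token; then run `docker tag` and
# `docker push` against the registry endpoint below.
auth = ecr.get_authorization_token()["authorizationData"][0]
print(auth["proxyEndpoint"])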

sagemaker - factorization machines - deserialize model

I estimated a factorization machines model in SageMaker and it saved a file model.tar.gz into an S3 folder.
Is there a way I can load this file in Python and access the parameters of the model, i.e. the factors, directly?
Thanks
As of April 2019: yes. An official AWS blog post was created to show how to open the SageMaker Factorization Machines artifact and extract its parameters: https://aws.amazon.com/blogs/machine-learning/extending-amazon-sagemaker-factorization-machines-algorithm-to-predict-top-x-recommendations/
That being said, be aware that Amazon SageMaker built-in algorithms are primarily built for deployment on AWS, and only SageMaker XGBoost and SageMaker BlazingText are designed to produce artifacts interoperable with their open-source equivalents.
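For reference, the gist of the blog post's approach: the artifact is an MXNet model, so after downloading and unpacking it you can load it with MXNet. A sketch; the parameter key names follow the blog post and may change between versions:

import os
import mxnet as mx

# Untar the artifact; "model_algo-1" inside is itself a zip archive.
os.system("tar xzvf model.tar.gz")
os.system("unzip -o model_algo-1")
os.system("mv symbol.json model-symbol.json")
os.system("mv params model-0000.params")

# Load as an MXNet module and pull out the learned parameters.
m = mx.module.Module.load("./model", 0, False, label_names=["out_label"])
V = m._arg_params["v"].asnumpy()          # factorization factors
w = m._arg_params["w1_weight"].asnumpy()  # linear terms
b = m._arg_params["w0_weight"].asnumpy()  # bias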

Hosting model on amazon aws

I have a pre-trained model in Keras. I want to host the model on Amazon AWS for real-time prediction. Can someone list the steps to do this? I am very new to this. How do I deploy my model for predictions?
You could package your own pre-trained algorithm by "containerizing" it via Docker and pushing the image to the Elastic Container Registry (ECR). This documentation page will guide you through how to package your algorithm into a Docker image: https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html
You may then directly deploy your packaged algorithm via SageMaker Hosting. This is a three-step process: CreateModel -> CreateEndpointConfig -> CreateEndpoint. Here's the documentation about how to host your packaged algorithm on SageMaker: https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html
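A minimal sketch of those three calls with boto3; every name, URI, and ARN below is a placeholder:

import boto3

sm = boto3.client("sagemaker")

sm.create_model(
    ModelName="keras-model",
    PrimaryContainer={
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo:latest",
        "ModelDataUrl": "s3://my-bucket/model/model.tar.gz",
    },
    ExecutionRoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",
)

sm.create_endpoint_config(
    EndpointConfigName="keras-endpoint-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "keras-model",
        "InitialInstanceCount": 1,
        "InstanceType": "ml.m5.large",
    }],
)

sm.create_endpoint(
    EndpointName="keras-endpoint",
    EndpointConfigName="keras-endpoint-config",
)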
Cheers,
Yuting

AWS SageMaker hosting multiple models on the same machine (ML compute instance)

I am able to host the models developed in SageMaker by using the deploy functionality. Currently, I see that the different models I have developed need to be deployed on different ML compute instances.
Is there a way to deploy all models on the same instance? Using separate instances seems to be a very expensive option. If it is possible to deploy multiple models on the same instance, will that create different endpoints for the models?
SageMaker is designed to solve deployment problems at scale, where you want thousands of model invocations per second. For such use cases, you want multiple tasks of the same model on each instance, and often multiple instances for the same model behind a load balancer and an auto scaling group, to scale up and down as needed.
If you don't need such scale, and even a single instance for a single model is not economic for the requests per second that you need to handle, you can take the models that were trained in SageMaker and host them yourself behind a serving framework such as MXNet Model Server (https://github.com/awslabs/mxnet-model-server ) or TensorFlow Serving (https://www.tensorflow.org/serving/ ).
Please also note that you have control over the instance type that you are using for hosting, and you can choose a smaller instance for smaller loads. Here is a list of the various instance types that you can choose from: https://aws.amazon.com/sagemaker/pricing/instance-types/
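For example, with the SageMaker Python SDK the instance type is just an argument to deploy(); a sketch, assuming model is an already-created sagemaker.model.Model:

# `model` is assumed to be an existing sagemaker.model.Model
# (e.g. the result of an Estimator's training job).
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.t2.medium",  # a small, cheaper hosting instance
)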
Yes, in AWS SageMaker you can now deploy multiple models on the same ML instance. I believe this is a new feature introduced in AWS SageMaker; please refer to the links below, which do exactly this.
In the link below you can find these examples:
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/
multi_model_bring_your_own
multi_model_sklearn_home_value
multi_model_xgboost_home_value
Another link which explains multi-model XGBoost in detail:
https://aws.amazon.com/blogs/machine-learning/save-on-inference-costs-by-using-amazon-sagemaker-multi-model-endpoints/
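For a concrete starting point, here is a minimal sketch using the SageMaker Python SDK's MultiDataModel class; the name, image URI, role, and S3 paths are placeholders:

from sagemaker.multidatamodel import MultiDataModel

mme = MultiDataModel(
    name="my-multi-model",                       # placeholder
    model_data_prefix="s3://my-bucket/models/",  # all model.tar.gz artifacts live here
    image_uri="<an MME-capable container URI>",  # placeholder
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder
)

predictor = mme.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# Adding a model is just an upload under the S3 prefix; each request then
# names the model it wants via target_model.
mme.add_model(model_data_source="local/path/model.tar.gz",
              model_data_path="new_york.tar.gz")
prediction = predictor.predict("0.5,1.2,3.4", target_model="new_york.tar.gz")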
Hope this helps anyone who is looking to solve this issue in the future.