How to make parameters available to SageMaker Tensorflow Endpoint - amazon-web-services

I'm looking to make some hyperparameters available to the serving endpoint in SageMaker. The training instance is given access to input parameters through hyperparameters in:
estimator = TensorFlow(entry_point='',
However, when the endpoint is deployed, there is no way to pass in parameters that are used to control the data processing in the input_fn(serialized_input, content_type) function.
What would be the best way to pass parameters to the serving instance? Is the source_dir parameter defined in the sagemaker.tensorflow.TensorFlow class copied to the serving instance? If so, I could use a config.yml or similar.

I had a similar problem: I needed to download something from S3 to use in the input_fn for inference. In my case it was a dictionary.
Three options:
1. Use your config.yml approach, and download and import the S3 file from within your entry point file before any function declarations. This makes it available to input_fn.
2. Keep using the hyperparameter approach: download and import the vectorizer in serving_input_fn and expose it through a global variable so that input_fn has access to it.
3. Download the file from S3 before training and include it in source_dir directly.
Option 3 only works if you don't need to change the vectorizer separately after the initial training.
Whatever you do, don't download the file inside input_fn. I made that mistake and the performance was terrible, because every invocation of the endpoint downloaded the S3 file again.
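One way to follow the advice above is to load the file once per process and cache it, so that input_fn never triggers a download. This is only a minimal sketch, not SageMaker API: load_settings, make_input_fn, the JSON file format, and the "scale" key are all hypothetical, and a plain local file read stands in for the S3 download.

```python
import json
from functools import lru_cache


@lru_cache(maxsize=None)
def load_settings(path):
    """Read the settings file once per process; repeat calls return the cached dict."""
    # Assumption: the file is already on local disk (for example shipped in
    # source_dir, or downloaded from S3 at import time) and contains JSON.
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def make_input_fn(settings_path):
    """Build an input_fn that reuses the cached settings on every request."""

    def input_fn(serialized_input, content_type):
        if content_type != "application/json":
            raise ValueError("Unsupported content type: " + content_type)
        settings = load_settings(settings_path)  # cache hit after the first call
        scale = settings.get("scale", 1.0)  # hypothetical preprocessing parameter
        return [value * scale for value in json.loads(serialized_input)]

    return input_fn
```

Calling the resulting input_fn repeatedly reads the file only on the first request; every later request is served from the in-process cache.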

The Hyper-parameters are used in the training phase to allow you to tune (Hyper-Parameters Optimization - HPO) your model. Once you have a trained model, these hyper-parameters are not needed for inference.
When you want to pass features to the serving instances, you usually do that in the body of each request to the invoke-endpoint API call, or in the call to the predict wrapper in the SageMaker Python SDK. You can see such examples in the sample notebooks.
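As an illustration of passing per-request settings in the body, the sketch below packs the features and a parameters object into one JSON payload and unpacks them on the serving side. The "instances" and "parameters" keys and both helper names are assumptions made for this example, not a format that SageMaker mandates; the string returned by build_request_body is what would be sent as the request body.

```python
import json


def build_request_body(features, parameters):
    """Client side: serialize features and per-request parameters together."""
    return json.dumps({"instances": features, "parameters": parameters})


def parse_request_body(serialized_input, content_type):
    """Serving side: split the body back into features and parameters."""
    if content_type != "application/json":
        raise ValueError("Unsupported content type: " + content_type)
    payload = json.loads(serialized_input)
    # Requests without a "parameters" key fall back to an empty dict.
    return payload["instances"], payload.get("parameters", {})
```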

Yes, one option is to add your configuration file to source_dir and load the file in the input_fn.
Another option is to use serving_input_fn(hyperparameters). That function transforms the TensorFlow model into a TensorFlow Serving model. For example:
def serving_input_fn(hyperparameters):
    # gets the input shape from the hyperparameters
    shape = hyperparameters.get('input_shape', [1, 7])
    tensor = tf.placeholder(tf.float32, shape=shape)
    # returns the ServingInputReceiver object.
    return build_raw_serving_input_receiver_fn({INPUT_TENSOR_NAME: tensor})()


Specify checkpoint path in custom docker image in SageMaker

I am training a model on SageMaker using a custom docker image.
I need to specify the local path (the one in the container) used to store checkpoints, so that SageMaker can copy its output to S3.
According to the documentation, I can do that when I initialize the Estimator:
# The local path where the model will save its checkpoints in the training container
estimator = Estimator(
image_uri="<ecr_path>/<algorithm-name>:<tag>" # Specify to use built-in algorithms
# Parameters required to enable checkpointing
I would prefer to specify checkpoint_local_path in the Docker build. Is there a way to do that when building the image, perhaps with an environment variable? This would also be more consistent with what AWS recommends: "We recommend specifying the local paths as '/opt/ml/checkpoints' to be consistent with the default SageMaker checkpoint settings."
Unless you dislike the /opt/ml/checkpoints name, you don't need to specify anything in your Docker image. Just write to /opt/ml/checkpoints (and read from it if you're doing transfer learning or want to pick up from previously saved checkpoints).
Anything you write to /opt/ml/checkpoints in your container will be saved in S3 at the location you specify in checkpoint_s3_uri='s3://...'
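The write-and-resume pattern described above might look roughly like this inside the training script. It is a sketch under assumptions: the CHECKPOINT_DIR environment variable is a hypothetical override that you could set with ENV in the Dockerfile (SageMaker itself does not read it), the helper names and the JSON checkpoint format are invented for illustration, and any non-default directory would still have to match the checkpoint_local_path given to the Estimator.

```python
import json
import os

# SageMaker's default local checkpoint path, as quoted above.
DEFAULT_CHECKPOINT_DIR = "/opt/ml/checkpoints"


def checkpoint_dir():
    """Return the checkpoint directory, honoring the hypothetical CHECKPOINT_DIR override."""
    return os.environ.get("CHECKPOINT_DIR", DEFAULT_CHECKPOINT_DIR)


def save_checkpoint(state, step, directory):
    """Write one checkpoint file; zero-padded names keep them sortable by step."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "step-{:08d}.json".format(step))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    return path


def load_latest_checkpoint(directory):
    """Return (step, state) for the newest checkpoint, or None if there is none."""
    if not os.path.isdir(directory):
        return None
    names = sorted(
        name for name in os.listdir(directory)
        if name.startswith("step-") and name.endswith(".json")
    )
    if not names:
        return None
    latest = names[-1]
    step = int(latest[len("step-"):-len(".json")])
    with open(os.path.join(directory, latest), "r", encoding="utf-8") as handle:
        return step, json.load(handle)
```

At the start of training, a call to load_latest_checkpoint(checkpoint_dir()) tells the script whether to resume or start from scratch.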

Vertex AI custom container batch prediction

I have created a custom container for prediction and successfully uploaded the model to Vertex AI. I was also able to deploy the model to an endpoint and successfully request predictions from the endpoint. Within the custom container code, I use the parameters field as described here, which I then supply later on when making an online prediction request.
My questions are regarding requesting batch predictions from a custom container for prediction.
I cannot find any documentation that describes what happens when I request a batch prediction. Say, for example, I use the my_model.batch_predict function from the Python SDK, set instances_format to "csv", and provide the gcs_source. I have set up my custom container to expect prediction requests at /predict, as described in the documentation. Does Vertex AI make a POST request to this path, converting the CSV data into the appropriate POST body?
How do I specify the parameters field for batch prediction as I did for online prediction?
Yes, Vertex AI makes a POST request to your custom container in batch prediction.
No, there is no way for batch prediction to pass a parameter since we don't know which column is "parameter". We put everything into "instances".
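Since batch requests only carry "instances", one workaround is to give the container its own default parameters and let online requests override them. The sketch below is an assumption-laden illustration, not Vertex AI API: handle_predict, predict_one, and the PREDICT_PARAMETERS environment variable are hypothetical names, and the function only handles the already-parsed JSON body of a POST to /predict.

```python
import json
import os


def default_parameters():
    """Read fallback parameters from the hypothetical PREDICT_PARAMETERS env var."""
    return json.loads(os.environ.get("PREDICT_PARAMETERS", "{}"))


def handle_predict(body, predict_one):
    """Handle the parsed JSON body of a prediction request.

    Online requests may include "parameters"; batch requests only include
    "instances", so the container-level defaults apply to them.
    """
    parameters = dict(default_parameters())
    parameters.update(body.get("parameters") or {})
    predictions = [predict_one(instance, parameters) for instance in body["instances"]]
    return {"predictions": predictions}
```

With this layout, the same container serves both request types, and the batch behavior is controlled by how the container is configured rather than by the request.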

How to deploy our own TensorFlow Object Detection Model in amazon Sagemaker?

I have my own trained TF Object Detection model. When I try to deploy the same model in AWS SageMaker, it does not work.
I have tried TensorFlowModel() in SageMaker, but there is an argument called entry_point. How do I create that .py file for prediction?
entry_point is an argument that holds the file name of your inference script. Once you create an endpoint and request a prediction for an image through the invoke-endpoint API, an instance is created according to your settings and runs that script.
Link: Documentation for TensorFlow model deployment in Amazon SageMaker
The inference script must contain either the two functions input_handler and output_handler, or a single handler function that covers both. They do the pre-processing and post-processing for your image.
Example for deploying the TensorFlow model
The link above points to a Medium post that should help with your questions.
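An inference script with the two handlers could be sketched as follows. The (data, context) signatures and the context attributes follow my understanding of the SageMaker TensorFlow Serving container's interface, so check them against the documentation for your container version. Sending the image as a base64 "b64" instance is an assumption about how the object detection model was exported.

```python
import base64
import json


def input_handler(data, context):
    """Pre-process the request before it is forwarded to TensorFlow Serving."""
    if context.request_content_type == "application/x-image":
        # Assumption: the exported model accepts a base64-encoded image string.
        encoded = base64.b64encode(data.read()).decode("utf-8")
        return json.dumps({"instances": [{"b64": encoded}]})
    if context.request_content_type == "application/json":
        # JSON requests are passed through unchanged.
        return data.read().decode("utf-8")
    raise ValueError(
        "Unsupported content type: {}".format(context.request_content_type)
    )


def output_handler(response, context):
    """Post-process the TensorFlow Serving response before returning it."""
    if response.status_code != 200:
        raise ValueError(response.content.decode("utf-8"))
    # Return the raw prediction bytes together with the requested content type.
    return response.content, context.accept_header
```

The file containing these functions is what the entry_point argument would point to.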

Where are models saved by default?

I've submitted a training job to the cloud using the RESTful API and see in the console logs that it completed successfully. In order to deploy the model and use it for predictions I have saved the final model using tf.train.Saver().save() (according to the how-to guide).
When running locally, I can find the graph files (export-* and export-*.meta) in the working directory. When running on the cloud however, I don't know where they end up. The API doesn't seem to have a parameter for specifying this, it's not in the bucket with the trainer app, and I can't find any temporary buckets on the cloud storage created by the job.
When you set up your Cloud ML environment you set up a bucket for this purpose. Have you looked in there?
Edit (for future record): As Robert mentioned in comments, you'll want to pass the output location to the job as an argument. Couple of things to be mindful of:
Use a unique output location per job, so one job doesn't clobber the outputs of another.
The recommendation is to specify the parent output path, and use it to contain the exported model in a subpath called 'model', as well as organizing other outputs like checkpoints and summaries within that path. That makes it easier to manage all the outputs.
While not required, I'll also suggest staging the training code in a packages subpath of the output, which helps correlate the source with the outputs it produces.
Finally, keep in mind that when you use hyperparameter tuning, you'll need to append the trial id to the output path for outputs produced by individual runs.
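The layout recommended above can be sketched as a small path builder. The function names and the exact subpath names other than 'model' and 'packages' are illustrative choices. Reading the trial id from the "task" section of the TF_CONFIG environment variable reflects my understanding of how the service exposes it, so treat that as an assumption to verify.

```python
import json
import posixpath


def trial_id_from_tf_config(tf_config_json):
    """Extract the hyperparameter tuning trial id, or '' when not tuning."""
    config = json.loads(tf_config_json or "{}")
    return str(config.get("task", {}).get("trial", ""))


def build_output_paths(output_root, job_name, tf_config_json=""):
    """Return per-job output locations under a unique parent path."""
    job_root = posixpath.join(output_root, job_name)  # unique per job
    run_root = job_root
    trial = trial_id_from_tf_config(tf_config_json)
    if trial:
        # Each tuning trial writes under its own subpath.
        run_root = posixpath.join(job_root, trial)
    return {
        "packages": posixpath.join(job_root, "packages"),
        "model": posixpath.join(run_root, "model"),
        "checkpoints": posixpath.join(run_root, "checkpoints"),
        "summaries": posixpath.join(run_root, "summaries"),
    }
```

In the trainer, the third argument would typically be os.environ.get("TF_CONFIG", ""), and the output root would arrive as a job argument.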

Saving a file in AWS filesystem

Hi, I am trying out OpenCV in AWS Lambda. I want to save an SVM model to a file so that I can load it again. Is it possible to save it in the tmp directory and load it from there whenever I need it, or will I have to use S3?
I am using python and trying to do something like this:
# saving the model"/tmp/svm.dat")
# Loading the model
svm ="/tmp/svm.dat")
You can't rely on that: the Lambda execution environment is distributed, so the same function might run on several different instances, and a file written to /tmp by one of them is not visible to the others.
The alternative is to save your svm.dat to S3 and then download it every time your Lambda function starts.
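A common shape for this is to download the file only when the current execution environment does not already have a copy. In the sketch below, ensure_local_model is a hypothetical helper and the download step is passed in as a callable, so nothing here depends on a particular S3 client; with boto3 that callable could wrap the S3 client's download_file(bucket, key, local_path), where the bucket and key are yours to choose.

```python
import os

# Path from the question; /tmp is local to each Lambda execution environment.
MODEL_PATH = "/tmp/svm.dat"


def ensure_local_model(local_path, download):
    """Fetch the model only if this execution environment lacks a copy.

    `download` is any callable that takes the destination path and writes
    the model file there (for example, a wrapper around an S3 download).
    """
    if not os.path.exists(local_path):
        download(local_path)
    return local_path
```

Calling ensure_local_model(MODEL_PATH, download) at the top of the handler keeps the code correct on a fresh instance while avoiding a repeat download when the file is already present.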