Shared input in Sagemaker inference pipeline models - amazon-web-services

I'm deploying a SageMaker inference pipeline composed of two PyTorch models (model_1 and model_2), and I'm wondering whether it's possible to pass the same input to both models in the pipeline.
What I have in mind would work more or less as follows:
Invoke the endpoint by sending a binary-encoded payload (namely payload_ser), for example:
client.invoke_endpoint(EndpointName=ENDPOINT,
                       ContentType='application/x-npy',
                       Body=payload_ser)
The first model parses the payload with its input_fn function, runs the predictor on it, and returns the predictor's output. As a simplified example:
import json

def input_fn(request_body, request_content_type):
    if request_content_type == "application/x-npy":
        input = some_function_to_parse_input(request_body)
        return input

def predict_fn(input_object, predictor):
    outputs = predictor(input_object)
    return outputs

def output_fn(predictions, response_content_type):
    return json.dumps(predictions)
The second model gets as payload both the original payload (payload_ser) and the output of the previous model (predictions). Possibly, the input_fn function would be used to parse the output of model_1 (as in the "standard case"), but I'd need some way to also make the original payload available to model_2. In this way, model_2 will use both the original payload and the output of model_1 to make the final prediction and return it to whoever invoked the endpoint.
Any idea if this is achievable?

Sounds like you need an inference DAG. Amazon SageMaker inference pipelines currently support only a chain of handlers, where the output of handler N is the input for handler N+1.
You could change model_1's predict_fn() to return both (input_object, outputs). output_fn() will then receive this pair as its predictions argument and is responsible for serializing both as JSON. model_2's input_fn() will then need to know how to parse this paired input.
Consider implementing this as a generic pipeline handling mechanism that adds the input to the model's output. This way you could reuse it for all models and pipelines.
You could allow the model to be deployed both as a standalone model and as part of a pipeline, and apply the relevant input/output handling behavior based on the presence of an environment variable (the Environment dict), which you can specify when creating the inference pipeline model.
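A minimal sketch of the first suggestion, reusing the handler names from the question (the JSON field names and the assumption that inputs/outputs behave like numpy arrays are illustrative, not the actual SageMaker contract):
import json

# --- model_1 handlers: pass the original input through with the predictions ---
def predict_fn(input_object, predictor):
    outputs = predictor(input_object)
    return input_object, outputs

def output_fn(prediction_pair, response_content_type):
    original_input, predictions = prediction_pair
    # Serialize both pieces; tolist() assumes numpy-like arrays/tensors.
    return json.dumps({
        "original_input": original_input.tolist(),
        "predictions": predictions.tolist(),
    })

# --- model_2 handlers (in the second model's inference script) ---
def input_fn(request_body, request_content_type):
    if request_content_type == "application/json":
        payload = json.loads(request_body)
        return payload["original_input"], payload["predictions"]

def predict_fn(input_pair, predictor):
    original_input, model_1_predictions = input_pair
    # model_2 can now use both the original payload and model_1's output.
    return predictor(original_input, model_1_predictions)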

Related

How does one automatically format a returned dynamodb item so that it can be put back into dynamodb as-is or slightly modified?

DynamoDB yields an item in this format:
cdb = boto3.client('dynamodb', region_name='us-west-2')
db = boto3.resource('dynamodb', region_name='us-west-2')
table = db.Table('my-table')
response = table.scan()
my_data = response['Items']
foo = my_data[0]
print(foo)
# {'theID': 'fffff8f-dfsadfasfds-fsdsaf', 'theNumber': Decimal('1')}
Now, when I treat this like a black-box unit, do nothing, and return it to the db via put_item, I get many errors indicating that none of the values in the dictionary are the expected type:
cdb.put_item(TableName='my-table', Item=foo, ReturnValues="ALL_OLD")
# many errors
I'd like to rely on boto3 to do everything and avoid manipulating the values if possible. Is there a utility available that will convert a response item into the format it needs to be to be placed back in the db?
You should use your table resource to write items because you also use it to read.
Something like this should do the trick:
table.put_item(Item=foo, ReturnValues="ALL_OLD")
You're reading the data using the higher-level resource API, which maps many native Python types to DynamoDB types, but trying to write using the lower-level client API, which doesn't do that.
The simple solution is to also use the resource API for writing, and it will perform the mappings for you.
(I'm inferring this based on your method signature for put_item, the question is not overly clear...)
Apparently there is also a serializer that can be used:
from boto3.dynamodb.types import TypeSerializer
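A minimal sketch of that route, in case you do want to keep writing through the low-level client (foo is the item read above):
from boto3.dynamodb.types import TypeSerializer

serializer = TypeSerializer()

# Convert each plain Python value into the low-level DynamoDB
# attribute-value format ({'S': ...}, {'N': ...}, ...) that the client expects.
low_level_item = {k: serializer.serialize(v) for k, v in foo.items()}

cdb.put_item(TableName='my-table', Item=low_level_item, ReturnValues="ALL_OLD")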

Python Google Prediction example

predict_custom_model_sample(
    "projects/794xxx496/locations/us-central1/xxxx/3452xxx524447744",
    { "instance_key_1": "value", ... },
    { "parameter_key_1": "value", ... }
)
Google gives this example, but I do not understand parameter_key_1 and instance_key_1. To my understanding, I need to send a JSON instance like:
{"instances": [ {"when": {"price": "1212"}}]}
How can I make it work with the predict_custom_model_sample?
I assume that you are trying this codelab.
Note that there seems to be a mismatch between the function name defined (predict_tabular_model) and the function name used (predict_custom_model_sample).
INSTANCES is an array of one or more JSON values of any type. Each value represents an instance that you are requesting a prediction for.
instance_key_1 is just the first key of the key/value pairs that go into an instance.
Similarly, parameter_key_1 is just the first key of the key/value pairs that go into the parameters JSON object.
If your model uses a custom container, your input must be formatted as JSON, and there is an additional parameters field that can be used for your container.
PARAMETERS is a JSON object containing any parameters that your container requires to help serve predictions on the instances. AI Platform considers the parameters field optional, so you can design your container to require it, only use it when provided, or ignore it.
Ref.: https://cloud.google.com/ai-platform-unified/docs/predictions/custom-container-requirements#request_requirements
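Putting the two together, the request body for a custom container could look something like this (the parameter name here is purely illustrative; use whatever your container actually expects):
request_body = {
    "instances": [
        {"when": {"price": "1212"}}
    ],
    # Optional: only include this if your container uses it.
    "parameters": {"confidence_threshold": 0.5}
}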
Here you have examples of inputs for online predictions from custom-trained models
For the codelab, I believe you can use the sample provided:
test_instance = {
    'Time': 80422,
    'Amount': 17.99,
    …
}
And then call for prediction (remember to check the function name in the notebook cell above):
predict_custom_model_sample(
    "your-endpoint-str",
    test_instance
)
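For reference, the helper in the codelab is presumably a thin wrapper around the Vertex AI PredictionServiceClient; a rough sketch of that pattern (the regional endpoint and defaults below are assumptions, not the codelab's exact code):
from google.cloud import aiplatform
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

def predict_custom_model_sample(endpoint: str, instance_dict: dict, parameters_dict: dict = None):
    client = aiplatform.gapic.PredictionServiceClient(
        client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
    )
    # Each instance (and the optional parameters object) is converted to a
    # protobuf Value before being sent in the predict request.
    instances = [json_format.ParseDict(instance_dict, Value())]
    parameters = json_format.ParseDict(parameters_dict or {}, Value())
    response = client.predict(endpoint=endpoint, instances=instances, parameters=parameters)
    print(response.predictions)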

How to create a beam template with current date as an input (updated daily) [Create from GET request]

I am trying to create a Dataflow job that runs daily with Cloud Scheduler. I need to get the data from an external API using GET requests, so I need the current date as an input. However, when I export the Dataflow job as a template for scheduling, the date input is fixed at template-creation time and is not updated daily. I have been looking around for a solution and came across ValueProvider, but my pipeline, which starts with apache_beam.transforms.Create, always returns the error 'RuntimeValueProvider(option: test, type: str, default_value: 'killme').get() not called from a runtime context' when the ValueProvider is not specified.
Is there any way I can overcome this? It seems like such a simple problem, yet I cannot make it work no matter what I try. I would appreciate any ideas!
You can use the ValueProvider interface to pass runtime parameters to your pipeline; to access it within a DoFn, you will need to pass it in as a parameter, similar to the following example from the Beam documentation:
https://beam.apache.org/documentation/patterns/pipeline-options/#retroactively-logging-runtime-parameters
import logging

import apache_beam as beam
from apache_beam.options.value_provider import RuntimeValueProvider

class LogValueProvidersFn(beam.DoFn):
    def __init__(self, string_vp):
        self.string_vp = string_vp

    # Define the DoFn that logs the ValueProvider value.
    # The DoFn is called when creating the pipeline branch.
    # This example logs the ValueProvider value, but
    # you could store it by pushing it to an external database.
    def process(self, an_int):
        logging.info('The string_value is %s' % self.string_vp.get())

        # Another option (where you don't need to pass the value at all) is:
        logging.info(
            'The string value is %s' %
            RuntimeValueProvider.get_value('string_value', str, ''))

# Create a branch of the pipeline p that logs the runtime value.
(p
 | beam.Create([None])
 | 'LogValueProvs' >> beam.ParDo(
     LogValueProvidersFn(my_options.string_value)))
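For completeness, my_options above would come from a custom PipelineOptions subclass that declares string_value as a runtime parameter; a minimal sketch of that declaration (the default and help text are placeholders):
from apache_beam.options.pipeline_options import PipelineOptions

class MyOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        # Declared as a ValueProvider argument so the value can be supplied
        # when the template is run, not when it is created.
        parser.add_value_provider_argument(
            '--string_value',
            type=str,
            default='',
            help='A string parameter resolved at runtime.')

pipeline_options = PipelineOptions()  # or PipelineOptions(argv) with your own args
my_options = pipeline_options.view_as(MyOptions)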
You may also want to have a look at Flex Templates:
https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates

DynamoDB streams - write data back into table

Consider the following architecture:
write -> DynamoDB table -> stream -> Lambda -> write metadata item to same table
It could be used for many, many awesome situations, e.g. table- and item-level aggregations. I've seen this architecture promoted in several tech talks by official AWS engineers.
But doesn't writing the metadata item add a new item to the stream and run the Lambda again?
How do you avoid an infinite loop? Is there a way to keep the metadata write from appearing in the stream?
Or is spending two stream and Lambda requests inevitable with this architecture (we're charged per request), i.e. exiting the Lambda function early if it's a metadata item?
As triggering an AWS Lambda function from a DynamoDB stream is a binary option (on/off), it's not possible to trigger the AWS Lambda function only for certain writes to the table. So your AWS Lambda function will be called again for the items it just wrote to the DynamoDB table. The important bit is to have logic in place in your AWS Lambda function to detect that it wrote that data and not write it again in that case. Otherwise you'd get the mentioned infinite loop, which would be a really unfortunate situation, especially if it went unnoticed.
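A minimal sketch of such a guard, assuming metadata items are marked with an attribute (the ItemType name and METADATA value are illustrative):
def lambda_handler(event, context):
    for record in event.get('Records', []):
        new_image = record.get('dynamodb', {}).get('NewImage', {})
        # Skip items this function wrote itself, identified by a marker attribute.
        if new_image.get('ItemType', {}).get('S') == 'METADATA':
            continue
        # ... process the genuine application write and put the metadata item ...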
Currently DynamoDB does not offer condition-based subscriptions to streams, so yes, DynamoDB will execute your Lambda function in an infinite loop. At the moment the only solution is to limit when your Lambda function does its work; you can use multiple Lambda functions, with one Lambda function there just to check whether a metadata item was written or not. I'm sharing a cloud architecture diagram of how you can achieve it.
A bit late but hopefully people looking for a more demonstrative answer will find this useful.
Suppose you want to process records where you add to an item up to a certain threshold; you could have an if condition that checks that and processes or skips the record, e.g.
This code assumes you have an attribute "Type" for each of your entities / object types - this was recommended to me by Rick Houlihan himself, but you could also check whether an attribute exists, i.e. "<your-attribute>" in record["dynamodb"]["NewImage"] - and that you are designing with PK and SK as generic primary and sort key names.
import os

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource('dynamodb').Table('<your-table>')
threshold = int(os.environ.get("THRESHOLD", 0))

def get_value(pk):
    # Query the table to read the current value of the attribute for this item.
    response = table.query(KeyConditionExpression=Key('PK').eq(pk))
    items = response.get('Items', [])
    return items[0]['<your-attribute>'] if items else 0

def your_aggregation_function(record):
    # Your aggregation logic here
    # Write back to the table with a put_item call once done
    pass

def lambda_handler(event, context):
    for record in event['Records']:
        if (record['eventName'] != "REMOVE"
                and record["dynamodb"]["NewImage"]["Type"]["S"] == "<your-entity-type>"):
            # Query the table to extract the attribute value
            attribute_value = get_value(record["dynamodb"]["Keys"]["PK"]["S"])
            if attribute_value < threshold:
                # Send to your aggregation function
                your_aggregation_function(record)
Having the conditions in place in the lambda handler (or you could change where to suit your needs) prevents the infinite loop mentioned.
You may want additional checks in the update expression to make sure two (or more) concurrent Lambdas are not writing the same object. I suggest you use a timestamp defined in the Lambda and add it to the SK, or, if you can't, have an "EventDate" attribute in your item so that you can add a ConditionExpression or an UpdateExpression with SET if_not_exists(#attribute, :date).
The above will guarantee that your lambda is idempotent.
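A minimal sketch of that idea, assuming the "EventDate" attribute suggested above (table, key, and attribute names are placeholders):
import datetime

import boto3

table = boto3.resource('dynamodb').Table('<your-table>')

# Timestamp computed once per invocation, as suggested above.
event_date = datetime.datetime.utcnow().isoformat()

table.update_item(
    Key={'PK': '<your-pk>', 'SK': '<your-sk>'},
    # if_not_exists keeps the first EventDate ever written, so a concurrent
    # or replayed invocation will not silently overwrite it.
    UpdateExpression='SET EventDate = if_not_exists(EventDate, :date)',
    ExpressionAttributeValues={':date': event_date},
)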

is there a way to just test the ItemProcessor in spring batch

All of our business logic is coded in the ItemProcessor. Rather than testing the entire Step that Reads/Processes/Writes records I would like to just invoke the Processor with a given Input record that I create for each test case.
I would control the input data that I feed to it by creating the records myself. Since I expect specific results, I would like to just assert the individual fields in the output record that is returned.
How could I invoke the Processor bean itself and get the result back?