AWS send image to Sagemaker from Lambda: how to set content handling? - amazon-web-services

This is similar to the question AWS Lambda send image file to Amazon Sagemaker.
I am trying to make the simple MNIST example work (the model was built by following the AWS tutorial).
I am using API Gateway (REST API with proxy integration) to POST image data to Lambda, and I would like to forward it to a SageMaker endpoint to run inference.
In the Lambda function, I wrote the following Python code:
import boto3

runtime = boto3.Session().client('sagemaker-runtime')
endpoint_name = 'tensorflow-training-YYYY-mm-dd-...'
res = runtime.invoke_endpoint(EndpointName=endpoint_name,
                              Body=Image,           # raw JPEG bytes taken from the incoming request
                              ContentType='image/jpeg',
                              Accept='image/jpeg')
However, when I send an image to Lambda via API Gateway, this error occurs:
[ERROR] ModelError: An error occurred (ModelError) when calling the
InvokeEndpoint operation: Received client error (415) from model with
message " {
"error": "Unsupported Media Type: image/jpeg" }
I think I need to do something along the lines of Working with binary media types for REST APIs.
But since I am very new, I have no idea what the appropriate step is, where to do it (maybe in the API Gateway console?), or how...
I need some clues to solve this problem. Thank you in advance.

Looking here you can see that only a few specific content types are supported by default, and images are not on that list. You have to either implement your own input_fn (a custom inference handler) or convert your data into one of the supported content types.
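If you go the second route and convert the data on the Lambda side, a rough sketch could look like the following. It assumes the MNIST tutorial model (28x28 grayscale input), that Pillow and numpy are available in the Lambda environment (for example via a layer), and that API Gateway has binary media types configured so the proxy integration delivers the image Base64-encoded in event['body']; the endpoint name is just a placeholder.

import base64
import io
import json

import boto3
import numpy as np
from PIL import Image as PILImage

runtime = boto3.client('sagemaker-runtime')
endpoint_name = 'tensorflow-training-YYYY-mm-dd-...'  # placeholder

def handler(event, context):
    # With binary media types enabled, the proxy integration hands the
    # raw image to the function Base64-encoded.
    img_bytes = base64.b64decode(event['body'])
    img = PILImage.open(io.BytesIO(img_bytes)).convert('L').resize((28, 28))
    pixels = (np.array(img) / 255.0).reshape(1, 28, 28, 1).tolist()

    # The default TensorFlow Serving handler accepts application/json
    # in the {"instances": [...]} format.
    res = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                  ContentType='application/json',
                                  Body=json.dumps({'instances': pixels}))
    return {'statusCode': 200,
            'body': res['Body'].read().decode('utf-8')}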

Related

LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413

I have a Node/Express + Serverless backend API which I deploy as a Lambda function.
When I call the API, the request goes through API Gateway to Lambda; Lambda connects to S3, reads a large .bin file, parses it, and generates a JSON object as output.
The response JSON object is around 8.55 MB (verified with Postman against the Node/Express code running locally). The size varies with the size of the .bin file.
When I make an API request, it fails with the following message in CloudWatch:
LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413
I can't/don't want to change this pipeline: HTTP API Gateway + Lambda + S3.
What should I do to resolve the issue?
AWS Lambda functions have hard limits on the sizes of the request and response payloads. These limits cannot be increased.
The limits are:
6 MB for synchronous invocations
256 KB for asynchronous invocations
You can find additional information in the official documentation here:
https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
You have a few options:
use EC2 or ECS/Fargate instead of Lambda
use the Lambda to parse and transform the .bin file into the desired JSON, save that JSON directly to an S3 bucket, and return the URL/URI/file name of the created JSON to the client in the Lambda response (see the sketch after this list)
For the last option, if you don't want to make the JSON file visible to the whole world, you could use AWS Amplify in your client and/or AWS Cognito so that only an authorised user can access the file they have just created.
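A minimal sketch of the S3 option, assuming a bucket name and a build_large_json helper of my own invention; it returns a presigned URL instead of a public object, which sidesteps the Amplify/Cognito setup for simple cases:

import json
import boto3

s3 = boto3.client('s3')
BUCKET = 'my-output-bucket'  # hypothetical bucket name

def handler(event, context):
    result = build_large_json(event)  # your existing parsing logic (hypothetical helper)
    key = 'results/{}.json'.format(context.aws_request_id)

    # Store the large payload in S3 instead of returning it directly,
    # so the 6 MB Lambda response limit no longer applies.
    s3.put_object(Bucket=BUCKET, Key=key,
                  Body=json.dumps(result).encode('utf-8'),
                  ContentType='application/json')

    # Hand the client a short-lived presigned URL to download the file.
    url = s3.generate_presigned_url('get_object',
                                    Params={'Bucket': BUCKET, 'Key': key},
                                    ExpiresIn=3600)
    return {'statusCode': 200, 'body': json.dumps({'resultUrl': url})}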
As noted in other answers, API Gateway/Lambda has limits on response sizes. From the discussion I read that latency is also a concern.
With these two requirements Lambda is mostly out of the question, as functions need some time to start up (which can be reduced with provisioned concurrency) and only have normal network connections (whereas EC2/EKS can use enhanced networking).
Given these requirements it would be better (from an AWS point of view) to move away from Lambda.
Looking further, we could also question the application itself:
Large JSON objects are generated on demand. Why can't they be pre-generated asynchronously and then downloaded from S3 directly? That would give you the best latency and speed, and it can be coupled with CloudFront.
Why does the JSON need to be so large? Large JSON documents also have to be parsed on the client side, which costs CPU. Maybe it can be split and/or compressed?
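On the compression point, a small sketch of what that could look like with a Lambda proxy integration, assuming the client accepts gzip and reusing the hypothetical build_large_json helper: the JSON is gzipped and returned Base64-encoded with a Content-Encoding header, which can bring a compressible 8-9 MB payload under the 6 MB response limit.

import base64
import gzip
import json

def handler(event, context):
    payload = json.dumps(build_large_json(event))  # hypothetical helper with your existing logic
    compressed = gzip.compress(payload.encode('utf-8'))
    return {'statusCode': 200,
            'isBase64Encoded': True,
            'headers': {'Content-Type': 'application/json',
                        'Content-Encoding': 'gzip'},
            'body': base64.b64encode(compressed).decode('ascii')}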

What is the difference between the Amazon S3 API calls GetObject and GetObjectRequest?

I am new to the Amazon S3 API and I am attempting to build a client using Go. I was confused about how to write a Get function to fetch an object from an S3 bucket. The documentation for the API calls is a little confusing to me: what is the difference between using the GetObject call and the GetObjectRequest call, and when is it appropriate to use one over the other?
Per the documentation:
Calling the request form of a service operation, which follows the naming pattern OperationNameRequest, provides a simple way to control when a request is built, signed, and sent. Calling the request form immediately returns a request object. The request object output is a struct pointer that is not valid until the request is sent and returned successfully.
So, use GetObject if you want to immediately send the request and wait for the response. Use GetObjectRequest if you prefer to construct the request but not send it till later.
For most scenarios, you'd probably just use GetObject.

Sagemaker Pytorch model - An error occurred (InternalFailure) when calling the InvokeEndpoint operation (reached max retries: 4):

I am facing an issue while invoking a PyTorch model endpoint. Please see the error below for details.
Error Message:
An error occurred (InternalFailure) when calling the InvokeEndpoint operation (reached max retries: 4): An exception occurred while sending request to model. Please contact customer support regarding request 9d4f143b-497f-47ce-9d45-88c697c4b0c4.
The endpoint restarted automatically after this error. There is no specific log in CloudWatch.
There may be a few possible causes here; let's explore them and how to resolve each.
Inference Code Error
These errors sometimes occur when the payload you are feeding your endpoint is not in the appropriate format. When invoking the endpoint, make sure your data is in the correct format and encoded properly. For this you can use the serializers SageMaker provides when creating the endpoint; the serializer takes care of the encoding for you and sends the data in the appropriate format. Look at the following code snippet.
from sagemaker.predictor import csv_serializer  # SageMaker SDK v1; in SDK v2 use: from sagemaker.serializers import CSVSerializer

# Deploying with a serializer means predict() encodes the payload as CSV for you.
rf_pred = rf.deploy(1, "ml.m4.xlarge", serializer=csv_serializer)
print(rf_pred.predict(payload).decode('utf-8'))
For more information about the different serializers, based on the type of data you are feeding in, check the following link:
https://sagemaker.readthedocs.io/en/stable/api/inference/serializers.html
Throttling Limits Reached
Sometimes the payload you are feeding in may be too large, or the API request rate for the endpoint may have been exceeded, so experiment with a more compute-heavy instance or increase the retries in your boto3 configuration. Here is a link explaining what retries are and how to configure them for your endpoint; a sketch follows the link.
https://aws.amazon.com/premiumsupport/knowledge-center/sagemaker-python-throttlingexception/
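For illustration, a minimal sketch of raising the retry count on the runtime client; the retry mode, attempt count and timeouts are illustrative rather than prescriptive, and the endpoint name is a placeholder.

import boto3
from botocore.config import Config

# Allow more retries (and a longer read timeout) before boto3 gives up.
config = Config(retries={'max_attempts': 10, 'mode': 'adaptive'},
                connect_timeout=10, read_timeout=70)
runtime = boto3.client('sagemaker-runtime', config=config)

response = runtime.invoke_endpoint(EndpointName='my-pytorch-endpoint',  # placeholder
                                   ContentType='text/csv',
                                   Body=payload)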
I work for AWS & my opinions are my own

Is it possible to access the original request body in a response body mapping template in AWS API Gateway?

Using API Gateway, I am trying to define a POST endpoint that accepts application/json and does the following:
Trigger a Lambda asynchronously
Respond with a JSON payload composed of elements from the request body
I have #1 working. I think it's by the book.
It's #2 I'm getting tripped up on. It looks like I don't have access to the request body in the context of the response mapping template. I have access to the original query params with $input.params but I cannot find any property that will give me the original request body, and I need it to get the data that I want to respond with. It's either that or I need to figure out how to get the asynchronous launch of a Lambda to somehow provide the original request body.
Does anyone know if this is possible?
My goal is to ensure that my API responds as fast as possible without incurring a cold start of a Lambda to respond AND simultaneously triggering an asynchronous workflow by starting a Lambda. I'd also be willing to integrate with SNS instead of Lambda directly and have Lambda subscribe to the topic but I don't know if that will get me access to the data I need in the response mapping template.
From https://stackoverflow.com/a/61482410/3221253:
Save the original request body in the integration mapping template:
#set($context.requestOverride.path.body = $input.body)
Retrieve it in the integration mapping response:
#set($body = $context.requestOverride.path.body)
{
    "statusCode": 200,
    "body": $body
}
You can also access specific attributes:
#set($object = $util.parseJson($body))
{
"id": "$object.id"
}
To access the original request directly, you should use a Proxy Integration for Lambda rather than mapping things via a normal integration. You'll be able to access the entire request context, such as headers, path params, etc.
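For what it's worth, a minimal sketch of what a proxy-integrated Lambda sees (handler name assumed): the whole request, including the original body, arrives in the event object, so parts of it can simply be echoed back in the response.

import json

def handler(event, context):
    # event carries the full request context: body, headers, path/query params, etc.
    original_body = json.loads(event.get('body') or '{}')
    return {'statusCode': 200,
            'headers': {'Content-Type': 'application/json'},
            'body': json.dumps({'id': original_body.get('id')})}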
I have determined that it is not possible to do what I want to do.

How to get Address of elasticache nodes from Java API?

I am making a describeCacheClusters request as follows and get a valid response but the getCacheClusters() method returns null even though that cluster has available nodes. Is there another request I should be using or a missing parameter?
DescribeCacheClustersResult result = awsClient
        .describeCacheClusters(new DescribeCacheClustersRequest()
                .withCacheClusterId(ELASTICACHE_CLUSTER_ID));
You are indeed missing a parameter, due to a somewhat confusing API design and documentation issue with Amazon ElastiCache:
You need to add setShowCacheNodeInfo() to your DescribeCacheClustersRequest and call getCacheNodes() on each CacheCluster retrieved via getCacheClusters() from the DescribeCacheClustersResult - see my answer to the semantic duplicate Finding AWS ElastiCache endpoints with Java for details and code samples.