AWS Lambda Payload Response - amazon-web-services

I'm working on an application whose back end is written in .NET Core Web API and whose front end is written in React. I made an endpoint that returns a JSON list of almost 83 MB. When I deploy my back end to AWS Lambda and call the endpoint, it gives me an error (Error converting the response object of type Amazon.Lambda.APIGatewayEvents.APIGatewayProxyResponse from the Lambda function to JSON: Unable to expand the length of this stream beyond its capacity.: JsonSerializerException). I have already checked that the Lambda response payload limit is 6 MB (storing the data in S3 from the Lambda endpoint and then fetching it from the front end will not work for me), so is there any way I can get that much data through Lambda?

Not with a single call, sorry. As described in this link, the payload limit for synchronous calls is, as you say, 6 MB. Asynchronous calls have even lower limits.
I'd suggest you modify your UI/API to narrow your results first, or trap this error and alert the user (or calling service) that the payload is too large (and hence should be aborted, narrowed, or split into multiple calls).
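As a rough illustration of splitting the result into multiple calls, the sketch below groups a list into pages whose JSON encoding stays under a byte budget. The 5 MB budget and the function name `paginate_by_size` are assumptions made for this example, not something from the question; the budget is simply chosen to leave headroom below the 6 MB limit.

```python
import json

# Assumed budget: stay below Lambda's 6 MB synchronous response limit,
# leaving headroom for the response envelope (status code, headers).
MAX_BODY_BYTES = 5 * 1024 * 1024


def paginate_by_size(items, max_bytes=MAX_BODY_BYTES):
    """Group items into pages whose JSON encoding stays within max_bytes."""
    pages, current, current_size = [], [], 2  # 2 bytes for "[" and "]"
    for item in items:
        item_size = len(json.dumps(item).encode("utf-8"))
        if item_size + 2 > max_bytes:
            raise ValueError("a single item exceeds the page budget")
        # json.dumps separates list items with ", " (2 bytes).
        extra = item_size + (2 if current else 0)
        if current and current_size + extra > max_bytes:
            pages.append(current)
            current, current_size = [], 2
            extra = item_size
        current.append(item)
        current_size += extra
    if current:
        pages.append(current)
    return pages
```

The endpoint could then return one page per request (for example, selected by a hypothetical `page` query parameter), and the front end would keep requesting pages until it has the whole list.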

In a single call we cannot retrieve more than about 6 MB of data from AWS Lambda or through the AWS API. I'm facing the same issue. Either we have to filter the data or make multiple calls.

Related

LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413

I have a Node/Express + Serverless backend API that I deploy to a Lambda function.
When I call the API, the request goes through API Gateway to Lambda; Lambda connects to S3, reads a large bin file, parses it, and generates a JSON object as output.
The response JSON object is around 8.55 MB (I verified this using Postman, running the Node/Express code locally). The size can vary with the size of the bin file.
When I make an API request, it fails with the following message in CloudWatch:
LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413
I can't/don't want to change this pipeline: HTTP API Gateway + Lambda + S3.
What should I do to resolve the issue?
AWS Lambda functions have hard limits on the sizes of the request and response payloads. These limits cannot be increased.
The limits are:
6 MB for synchronous requests
256 KB for asynchronous requests
You can find additional information in the official documentation here:
https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
There are a few possible solutions:
Use EC2 or ECS/Fargate instead of Lambda.
Use the Lambda to parse and transform the bin file into the desired JSON, then save this JSON directly to a public S3 bucket. In the Lambda response, return the public URL/URI/file name of the created JSON to the client.
For the last solution, if you don't want to make the JSON file visible to the whole world, you might consider using AWS Amplify in your client and/or AWS Cognito to give only an authorised user access to the file they have just created.
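A minimal sketch of the "save to S3 and return a link" idea follows. Note that it swaps in a different access-control technique from the one suggested above: instead of a public bucket or Amplify/Cognito, it returns a time-limited presigned URL. The function names, the `downloadUrl` field, and the 300-second expiry are assumptions for illustration; `s3_client` is expected to be a boto3 S3 client (e.g. `boto3.client("s3")`), passed in so the logic can be exercised without AWS.

```python
import json


def store_and_link(s3_client, bucket, key, payload, expires_in=300):
    """Upload payload as JSON to S3 and return a time-limited download URL.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
    )
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def build_response(url):
    """Small proxy-style response that points the client at the S3 object."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"downloadUrl": url}),
    }
```

The Lambda response then stays tiny regardless of how large the generated JSON is, and the client downloads the file from S3 directly.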
As noted in other questions, API Gateway and Lambda have limits on response sizes. From the discussion I gather that latency is an additional concern.
With these two requirements Lambda is mostly out of the question, as functions need some time to start up (which can be lowered with provisioned concurrency) and only have normal network connections (whereas EC2 and EKS can have enhanced networking).
With these requirements it would be better (from an AWS point of view) to move away from Lambda.
Looking further, we could also question the application itself:
Large JSON objects are being generated on demand. Why can't they be pre-generated asynchronously and then downloaded from S3 directly? That would give you the best latency and speed, and it can be coupled with CloudFront.
Why does the JSON need to be so large? Large JSON documents also have to be parsed on the client side, which requires more CPU. Maybe it can be split and/or compressed?
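To illustrate the compression suggestion, here is a hedged sketch that gzips the JSON body and returns it in the proxy response format with `isBase64Encoded` set. The function name `gzip_response` is made up for this example. Whether the result fits under the limit depends entirely on how compressible the data is, and base64 encoding adds roughly a third back on top of the compressed size.

```python
import base64
import gzip
import json


def gzip_response(payload):
    """Build a proxy-style response whose body is gzip-compressed JSON."""
    raw = json.dumps(payload).encode("utf-8")
    compressed = gzip.compress(raw)
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",
        },
        # Binary bodies are carried as base64 text in the proxy response.
        "body": base64.b64encode(compressed).decode("ascii"),
        "isBase64Encoded": True,
    }
```

It is worth measuring `len(response["body"])` against the limit for realistic inputs before relying on this, since the size "can vary as per bin file size".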

Lambda boto3 background functions with api

I'm trying to build a basic AWS Lambda API and function setup to do the following:
Part 1: The client calls a function through the API, which both starts a one-minute background function to process data and returns a quick message to the client in the browser.
Part 2: When the background function is complete, it returns a 302 redirect to the client with a generated link.
I'm stuck on Part 2. How can I get from the background function, through the API, back to the client?
I'm using Python and boto3 for my Lambda scripts.
This is AWS Lambda so your client doesn't have a persistent connection to the server-side code.
Here is an idea of one way to build this:
your client makes an API request that triggers a Lambda function
on invocation, your Lambda function generates a new, unique id (a UUID), writes that to DynamoDB so that this UUID can later be associated with the result of the background processing
the Lambda kicks off the background processing, passing the UUID to it
the Lambda returns the generated UUID to the client
the background processing happens asynchronously, ultimately writing any results to the DynamoDB item associated with the UUID that triggered it
the client polls another API periodically, say every 10s, sending in the UUID it was given
the polled Lambda takes the presented UUID, does a lookup in DynamoDB and returns a 302 redirect to a URL result, or an indication that the results aren't ready yet (e.g. HTTP 404)
some process that you create removes the item from DynamoDB later (or not)
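The flow above could be sketched roughly as follows. To keep the example runnable without AWS, a plain dictionary stands in for the DynamoDB table and the background processing is passed in as a callable; both are assumptions of this sketch, as are the function names and the 202 status code on the first call.

```python
import uuid

# In-memory stand-in for the DynamoDB table (assumption for this sketch).
JOBS = {}


def start_job(run_in_background):
    """First API: register a job, kick off the work, return the id."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "PENDING", "result_url": None}
    run_in_background(job_id)  # e.g. an asynchronous Lambda invocation
    return {"statusCode": 202, "body": job_id}


def complete_job(job_id, result_url):
    """Called by the background worker once processing has finished."""
    JOBS[job_id] = {"status": "DONE", "result_url": result_url}


def poll_job(job_id):
    """Second API: 302 to the result when ready, 404 otherwise."""
    job = JOBS.get(job_id)
    if job is None or job["status"] != "DONE":
        return {"statusCode": 404, "body": "not ready"}
    return {"statusCode": 302, "headers": {"Location": job["result_url"]}}
```

In a real deployment `start_job` and `poll_job` would be two separate Lambda handlers behind API Gateway, and the dictionary reads and writes would become DynamoDB item operations.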

Create API using Amazon lambda function and API Gateway

I am implementing an API using an AWS Lambda function and API Gateway. The Lambda function will, in turn, call a third-party API, transform the data into a specific format, and return it.
The third-party API that I use to fetch records has pagination and throttling enabled. However, I don't want to implement pagination in the API I am building with Lambda and API Gateway. Instead, I want this API to fetch all the pages one by one, transform them into the specific format, and return everything at once. The client of this API should not have to call it with different pagination parameters.
Given that a Lambda function has a maximum run time of 15 minutes and the third-party API has a maximum number of requests per minute, what is the best way to implement this?
This is how I am doing it right now: in my Lambda function I push a specific number of requests into promises. When the maximum number is reached, I stop pushing more, execute the pending promises, and set a timeout for one minute. Meanwhile the pending promises execute and I build the response from them, but I don't send it back yet because there are still requests to be made. When the timeout completes, I again push a specific number of requests into promises and repeat the process.
Once all the pages are completed, I return the data.
The problem is that this may take more than 15 minutes, in which case the Lambda function would be terminated.
Is there a better approach for this, even one that uses other Amazon services?
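For reference, the batch-and-wait loop described in the question can be sketched as below, together with a rough estimate of how long it must run. This is a simplified, sequential Python rendering of the promise-based approach, not the asker's code: `fetch_page` is a hypothetical callable returning the records of one page, and `per_minute` is the third-party rate limit.

```python
import time


def fetch_all_pages(fetch_page, total_pages, per_minute, sleep=time.sleep):
    """Fetch pages in batches of per_minute, pausing 60 s between batches.

    fetch_page(n) is a hypothetical callable returning the records of page n.
    """
    records = []
    for start in range(1, total_pages + 1, per_minute):
        batch = range(start, min(start + per_minute, total_pages + 1))
        for page in batch:
            records.extend(fetch_page(page))
        if batch[-1] < total_pages:
            sleep(60)  # wait for the third-party rate-limit window to reset
    return records


def estimated_minutes(total_pages, per_minute):
    """Lower bound on run time: one minute of waiting per extra batch."""
    batches = -(-total_pages // per_minute)  # ceiling division
    return max(batches - 1, 0)
```

Comparing `estimated_minutes(...)` against the 15-minute ceiling before starting makes the problem visible up front: once the page count divided by the rate limit approaches 15 batches, a single invocation cannot finish.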

Rate Exceeded on AWS Lambda Using API Gateway and serverless framework

When I try to invoke a method that has an HTTP event, it results in a 500 Internal Server Error.
The CloudWatch logs show: Recoverable error occurred (Rate Exceeded.)
When I invoke the function without Lambda, it executes and returns a response.
Here is my serverless config:
You have set your Lambda's reservedConcurrency to 0. This will prevent your Lambda from ever being invoked. Setting it to 0 is usually useful when your functions are getting invoked but you're not sure why and you want to stop it right away.
If you want to have it invoked, change reservedConcurrency to a positive integer (by default, it can be a positive integer <= 1000, but you can increase this limit by contacting AWS) or simply remove the reservedConcurrency attribute from your .yml file as it will use the default values.
Why would one ever use reservedConcurrency anyway? Well, let's say your Lambda functions are triggered by requests from API Gateway. Let's say you get 400 requests/second at peak hours and, upon every request, two other Lambda functions are triggered: one to generate a thumbnail for a given image and one to insert some metadata into DynamoDB. You'd have, in theory, 1200 Lambda executions running at the same time (assuming each execution takes about a second). This would lead to throttling, as the default concurrent execution limit for Lambda functions is 1000. But is the thumbnail generation as important as the requests coming from API Gateway? Very likely not, as it's naturally an eventually consistent task. So you could set reservedConcurrency on the thumbnail Lambda to only 200, which caps how much of your concurrency it can use and lets other functions spin up to do something more useful at a given point in time (in our example, receiving HTTP requests is more important than generating thumbnails). The remaining 800 concurrent executions could then be split between the function triggered from API Gateway and the one that inserts data into DynamoDB, thus preventing throttling for the important stuff and keeping the not-so-important stuff eventually consistent.
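The arithmetic in that example can be written out as a small calculation. The helper below is purely illustrative (its name and return keys are made up), and it assumes, as the answer does, that each execution lasts about one second, so requests per second translate directly into concurrent executions.

```python
ACCOUNT_LIMIT = 1000  # default concurrent executions cited above


def concurrency_plan(requests_per_second, functions_per_request, reserved):
    """Return peak demand, whether it throttles, and what remains unreserved."""
    demand = requests_per_second * functions_per_request
    return {
        "peak_demand": demand,
        "throttles_without_reservation": demand > ACCOUNT_LIMIT,
        "left_for_other_functions": ACCOUNT_LIMIT - sum(reserved.values()),
    }


# 400 requests/second, each triggering 3 functions in total
# (API handler, thumbnail generator, metadata writer).
plan = concurrency_plan(400, 3, {"thumbnail": 200})
```

Here `plan` reports a peak demand of 1200 against a limit of 1000, and 800 executions left for the API handler and the metadata writer once 200 are reserved for thumbnails.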

Can AWS API Gateway cache invalidate specific entries based on the response content?

I have used AWS API Gateway with the endpoint as a lambda function. I have enabled the cache functionality provided by API Gateway, to reduce both response time & the number of calls forwarded to my lambda function.
The lambda function queries another data store to return data. If data is not found, an asynchronous call is made to update the data store and "data not found" is returned to the caller. Now the API Gateway is even caching this result, which we do not want to happen. This results in the cache always returning "data not found" for its lifetime (1 hr TTL), even though data is updated asynchronously in the data store.
I'm aware of the request header (Cache-Control: max-age=0) which can invalidate cache and get response directly from Lambda, as mentioned in this documentation page.
But this will not be useful because the caller is unaware whether data is present in data store or not and hence cannot selectively send such request header.
So, my 2 questions are:
Does API Gateway cache other HTTP responses as well, such as 404 (apart from 200)?
Is it possible to tell the API Gateway not to cache specific responses?
As you observed, API Gateway caches the result of your endpoint regardless of the status code.
No, not at this time. I can bounce the idea to see if this is something we can support in the future.
Even if API Gateway conditionally did not cache the 404, the caller would need to call the endpoint again anyway, so why not return the result synchronously? This pattern is how most cached APIs that I'm aware of behave and would allow your API to work with what API Gateway offers today.