How to clear an AWS Lambda cache (or force a cold start)

The short version:
If I am caching values in my lambda container, how can I clear this cache? I guess I could redeploy the lambda, which will force all new requests to initiate a new cold start, but this doesn't seem like a nice solution.
The long version:
I am writing a custom authorizer for AWS API Gateway (in Python) that does two things:
It gets an api-key from an http header and looks it up in a dynamo table to verify it is valid (and get some attributes attached to it).
It verifies a JWT token (using some of the attributes from #1).
After following some code (this code), I learnt that I can cache values "globally" so they can be re-used across invocations of the lambda, great! But if I cache, say, the DynamoDB response when looking up the api key, what happens if I have to revoke / issue a new api key at some point?
I'd like to be able to ensure that my lambda cache gets wiped somehow.

Short answer: You can force a new container for each invoke by calling UpdateFunctionCode or UpdateFunctionConfiguration on the same function before the current execution exits. For example, keep toggling the function timeout before returning the response; the next invoke will then spin up a new execution environment (container/sandbox), with a cold start penalty.
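For illustration, a rough boto3 sketch of that trick (the function name is a placeholder, and the caller needs the lambda:GetFunctionConfiguration and lambda:UpdateFunctionConfiguration permissions):

import boto3

lambda_client = boto3.client("lambda")

def force_cold_start(function_name):
    # Read the current timeout and flip it between two values; any
    # configuration change makes Lambda create fresh execution
    # environments for subsequent invokes.
    current = lambda_client.get_function_configuration(FunctionName=function_name)
    new_timeout = 31 if current["Timeout"] == 30 else 30
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=new_timeout,
    )

force_cold_start("my-authorizer")  # placeholder function name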
The right approach: If you are caching values in function variables, you can clear them inside the handler and continue with the execution logic. This ensures you are not paying cold start penalties on subsequent invocations, and you stay in control of choosing the "right" values.
This is best explained with database clients. You can create the client outside the handler, but on every invoke verify that the client is still valid, and recreate it inside the handler if the original has become invalid. This will save you some processing time, as the CPU is throttled once the function hits the handler.
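As a sketch of that pattern applied to the asker's api-key lookup (the table name, key schema, and TTL are assumptions for illustration):

import time
import boto3

# Created once per container and re-used across invocations.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("api-keys")  # placeholder table name

_CACHE = {}            # api key -> {"item": ..., "fetched_at": ...}
_TTL_SECONDS = 300     # how long a cached lookup stays trustworthy

def lookup_api_key(api_key):
    entry = _CACHE.get(api_key)
    if entry and time.time() - entry["fetched_at"] < _TTL_SECONDS:
        return entry["item"]  # fresh enough, skip the DynamoDB round trip
    # Stale or missing: re-read from DynamoDB so revoked / reissued
    # keys are picked up within the TTL window.
    item = table.get_item(Key={"apiKey": api_key}).get("Item")
    _CACHE[api_key] = {"item": item, "fetched_at": time.time()}
    return item

A short TTL bounds how long a revoked key can keep working, without forcing any cold starts.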
Since you are working with API Gateway, cold start penalties count towards the API's integration timeout (a hard limit of 29 seconds for auth and backend combined), so I would avoid forcing cold starts as much as possible.

Related

How to reject the same POST request sent twice in a short gap of time

I am wondering if there is a standard way to reject requests with the same body sent within a few seconds at the API gateway itself.
For example: Reddit rejects my post if I try to submit the same content to a different group within a few seconds. Similarly, if I make the same credit card payment a second time, it is automatically rejected.
I am wondering if there is a way to have the same behavior in the AWS API gateway itself so that we are not handling it in lambda functions with dynamoDB and stuff.
Looking forward to efficient ways of doing it.
API Gateway currently doesn't offer a feature like that; you'd have to implement it yourself.
If I was to implement this, I'd probably use an in-memory cache like ElastiCache for Redis or Memcached as the storage backend for deduplications.
For each incoming request I'd determine what makes it unique and create a hash from that.
Then I'd check if that hash value is already in the cache. If it is, the request is a duplicate and I reject it. If it isn't, I'd add it with a time to live of n seconds (the time interval in which I wish to deduplicate).
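A minimal sketch of that idea with redis-py (the Redis endpoint and the 5-second window are placeholders):

import hashlib
import json
import redis

DEDUP_TTL_SECONDS = 5  # window in which identical bodies are rejected
r = redis.Redis(host="my-redis-endpoint", port=6379)  # placeholder host

def is_duplicate(request_body):
    # Hash a canonical form of the body so key order doesn't matter.
    digest = hashlib.sha256(
        json.dumps(request_body, sort_keys=True).encode()
    ).hexdigest()
    # SET with nx=True and ex=TTL is atomic: it only succeeds if the
    # key does not exist yet, and the entry expires on its own.
    first_seen = r.set("dedup:" + digest, 1, nx=True, ex=DEDUP_TTL_SECONDS)
    return not first_seen

The atomic SET avoids a check-then-set race when two identical requests arrive at the same moment.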

Rate Exceeded on AWS Lambda Using API Gateway and serverless framework

When I try to invoke a method that has an HTTP event, it results in a 500 Internal Server Error.
On CloudWatch logs it shows Recoverable error occurred (Rate Exceeded.)
When I invoke the function directly (not through the HTTP endpoint), it executes and returns a response.
Here is my serverless config:
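A minimal sketch of the relevant part, as the answer below suggests it must look (service and function names are placeholders):

functions:
  hello:
    handler: handler.hello
    reservedConcurrency: 0
    events:
      - http:
          path: hello
          method: get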
You have set your Lambda's reservedConcurrency to 0. This will prevent your Lambda from ever being invoked. Setting it to 0 is usually useful when your functions are getting invoked but you're not sure why and you want to stop it right away.
If you want to have it invoked, change reservedConcurrency to a positive integer (by default, it can be a positive integer <= 1000, but you can increase this limit by contacting AWS) or simply remove the reservedConcurrency attribute from your .yml file as it will use the default values.
Why would one ever use reservedConcurrency anyway? Well, let's say your Lambda functions are triggered by requests from API Gateway. Say you get 400 requests/second at peak hours and, upon every request, two other Lambda functions are triggered: one to generate a thumbnail for a given image and one to insert some metadata into DynamoDB. In theory you'd have 1200 Lambda functions running at the same time (given that all of them finish their execution in under a second). This would lead to throttling, as the default concurrent execution limit for Lambda functions is 1000.

But is the thumbnail generation as important as the requests coming from API Gateway? Very likely not, as it's naturally an eventually consistent task, so you could set reservedConcurrency on the thumbnail Lambda to only 200. That way you wouldn't use up your concurrency, and other functions would be able to spin up to do something more useful at a given point in time (in our example, receiving HTTP requests is more important than generating thumbnails). The remaining 800 of concurrency could then be split between the function triggered from API Gateway and the one that inserts data into DynamoDB, thus preventing throttling for the important stuff and keeping the not-so-important stuff eventually consistent.
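For illustration, the serverless.yml side of that example might look like this (function names and values are placeholders):

functions:
  thumbnailGenerator:
    handler: handler.generateThumbnail
    # Cap the less important, eventually consistent work so it cannot
    # starve the API-facing functions of concurrency.
    reservedConcurrency: 200
  apiHandler:
    handler: handler.api
    events:
      - http:
          path: images
          method: post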

How can I keep warm an AWS Lambda invoked from API Gateway with proxy integration

I have defined a lambda function that is invoked from API Gateway with proxy integration. Thus, I have defined an eager resource path for it:
And referenced my lambda function:
My lambda is able to process requests like GET /myresource and POST /myresource.
I have tried the strategy described on acloudguru to keep it warm: setting up a CloudWatch event rule that invokes the lambda every 5 minutes. Unfortunately, it isn't working.
This is the behaviour I have seen:
After some period, let's say 20 minutes, I call GET /myresource from API Gateway and it takes around 15 seconds. Subsequent requests last ~30ms. The CloudWatch event is making no difference...
Let's suppose another long period without calling the gateway. If I go to the Lambda console and invoke it directly (test button) it answers right away (less than 1ms) with a 404 (that's normal because my lambda expects GET /myresource or POST /myresource).
Immediately after this lambda console execution I call GET /myresource from API Gateway and it still takes ~20 seconds. That is to say, the function was still cold despite having been invoked from the Lambda console. This might explain why the CloudWatch event doesn't work, since it calls the lambda without setting the method/resource-url.
So, how can I keep this particular combination of API Gateway with proxy integration + Lambda warm, to prevent those slow first requests?
As of now (2019-02-27) [1], a periodic CloudWatch event rule does not deterministically solve the cold start issue, but it does reduce the probability of cold starts.
The reason is that it's up to the Lambda service to decide whether to use a new container or an existing one to process an incoming request. Some of the details regarding how Lambda containers are reused are explained in [2].
To reduce the cold start time (as opposed to the number of cold starts), you can try the following: 1. increase the memory allocated to the function, 2. reduce the deployment package size (e.g. remove unnecessary dependencies), and 3. use a language like Node.js or Python instead of Java or .NET.
[1] According to a re:Invent session (39:50 at https://www.youtube.com/watch?v=QdzV04T_kec), the Lambda team expects to improve the VPC cold start latency in Lambda.
[2] https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/
Denis is quite right about the non-deterministic Lambda behaviour regarding the number of containers hit by CloudWatch events. I'll follow his advice to improve the startup time.
On the other hand I have managed to make my CloudWatch events hit the lambda function properly, reducing (in many cases) the number of cold starts.
I just had to add an additional controller mapped to "/" with a hardcoded response:
#Controller("/")
class WarmUpController {
private val logger = LoggerFactory.getLogger(javaClass)
#Get
fun warmUp(): String {
logger.info("Warming up")
return """{"message" : "warming up"}"""
}
}
With this in place the default (/) invocation from CloudWatch does keep the container warm most of the time.

Api Gateway Api Key immediate use upon creation giving forbidden

The application creates an api key on a per-user basis, meaning the process is as follows:
Lambda function creates api key and adds to a usage plan
Api key value is returned from lambda function
Api key is then immediately used to call an Api Gateway end point
Forbidden message is returned
If I delay execution between api key creation and the http request to the api gateway end point (by around 5 seconds), then it works as intended, but less than that I get an error.
I suspect that the api key takes a few seconds to propagate to the endpoint but I can't find an AWS API method that correctly lets me know when it has done so. Has anyone come across this problem before and how did you solve it?
The best solution I have at the moment is to retry the api call on a sliding timeout until an unreasonable amount of time has passed.
"How long should I wait after applying an AWS IAM policy before it is valid?" is not the same question, but it seems likely to be similar in its underlying explanation -- it's not so much a case of the API key taking time to exist, but rather taking time to propagate and become visible at every possible place where it might need to exist before being valid for any subsequent request.
If those assumptions are correct, there is no mechanism for authoritatively determining whether the key is ready for use or not, because for some period of time after the key creation request succeeds, it's in a situation arguably reminiscent of Schrödinger's cat -- the key both exists and doesn't exist -- you don't know until you try it, and (unlike the cat) even a successful test does not necessarily prove that it is fully ready for use, because of the possibility (however unlikely) of a result such as fail fail fail fail pass fail pass pass pass. Such is the characteristic behavior of many large-scale, distributed systems.
From comments:
If an API call returns the api key value then I would expect it to be usable instantly, or at least for the call to return only when the key has been propagated fully to the endpoints.
That makes sense on the surface, but it becomes problematic in implementation. What if one of the endpoints is failed, offline for maintenance, or in the middle of recovering from an outage and lagging... what then? Fail the request? Delay the response waiting for something statistically unlikely to impact you?
The resource cost of observing replication tends to outweigh the benefits in many cases and can destabilize the control plane of a system if a replication issue causes a sufficient backlog, and is often not implemented except in cases where it has a high value, viz. the GetChange action in Route 53 which allows you to verify the propagation of a change through the system -- and note that even in this case, the change request itself succeeds without waiting -- if you need to verify the sync state, you have to ask separately.
A lot of AWS resources take time to create, and usually there is a way to detect whether the job has completed. In this case it looks like you simply get a forbidden response until the key is ready.
I think you will have to handle this in your client.
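A sliding-timeout retry along those lines might look like this (the endpoint URL and backoff values are placeholders; x-api-key is the header API Gateway checks for usage-plan keys):

import time
import requests

def call_with_retry(url, api_key, max_wait=30.0):
    delay = 0.5
    waited = 0.0
    while True:
        resp = requests.get(url, headers={"x-api-key": api_key})
        # A 403 right after key creation usually just means the key has
        # not propagated yet; back off and retry until max_wait is spent.
        if resp.status_code != 403 or waited >= max_wait:
            return resp
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 5.0)  # exponential backoff, capped

Even then, as noted above, a single success does not strictly guarantee the key is visible everywhere yet, so callers should tolerate the occasional stray 403.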

AWS API Gateway Cache - Multiple service hits with burst of calls

I am working on a mobile app that will broadcast a push message to hundreds of thousands of devices at a time. When each user opens their app from the push message, the app will hit our API for data. The API resource will be identical for each user of this push.
Now let's assume that all 500,000 users open their app at the same time. API Gateway will get 500,000 identical calls.
Because all 500,000 nearly concurrent requests are asking for the same data, I want to cache it. But keep in mind that it takes about 2 seconds to compute the requested value.
What I want to happen
I want API Gateway to see that the data is not in the cache, let the first call through to my backend service while the other requests are held in queue, populate the cache from the first call, and then respond to the other 499,999 requests using the cached data.
What is (seems to be) happening
API Gateway, seeing that there is no cached value, is sending every one of the 500,000 requests to the backend service! So I will be recomputing the value with some complex db query way more times than resources will allow. This happens because the last call comes into API Gateway before the first call has populated the cache.
Is there any way I can get this behavior?
I know that based on my example that perhaps I could prime the cache by invoking the API call myself just before broadcasting the bulk push job, but the actual use-case is slightly more complicated than my simplified example. But rest assured, solving this simplified use-case will solve what I am trying to do.
If you anticipate that kind of burst concurrency, priming the cache yourself is certainly the best option. Have you also considered adding throttling to the stage/method to protect your backend from a large surge in traffic? Clients could be instructed to retry on throttles and they would eventually get a response.
I'll bring your feedback and proposed solution to the team and put it on our backlog.
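For illustration, priming the cache just before the broadcast could be as simple as this sketch (the resource URL and push sender are placeholders):

import requests

def broadcast_with_primed_cache(resource_url, send_push):
    # One warm-up request computes the expensive value and populates
    # the API Gateway stage cache...
    warm = requests.get(resource_url)
    warm.raise_for_status()
    # ...so the burst of requests triggered by the push below can be
    # served from the cache instead of hitting the backend.
    send_push()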