I know one shouldn't rely on a Lambda being reused, and that's not my goal. I'm just trying to get an idea of how many invocations are handled by the same instance of a Lambda.
The graphs below show that, at one point in time, there were 5,093 invocations and 57 concurrent executions.
Question
Can I assume all of those invocations were handled by those concurrent executions? Thus, 5093 / 57 = ~89 requests handled by each lambda instance on average?
From the Lambda docs:
Invocations – The number of times your function code is executed, including successful executions and executions that result in a function error. Invocations aren't recorded if the invocation request is throttled or otherwise resulted in an invocation error. This equals the number of requests billed.
and
ConcurrentExecutions – The number of function instances that are processing events. If this number reaches your concurrent executions quota for the Region, or the reserved concurrency limit that you configured on the function, additional invocation requests are throttled.
Based on this, I think your interpretation is correct: you can divide Invocations by ConcurrentExecutions to get the average number of requests each execution context (instance) has handled. Note that there may be a lot of variance between instances, which you can't measure with the available metrics. You'd have to generate your own metrics for that.
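As a rough illustration of "generating your own metrics", here is a minimal sketch that counts invocations per instance using module-level state. It relies on the fact that module-level variables survive between invocations served by the same execution environment. The names `INSTANCE_ID`, `invocation_count` and `handler` are hypothetical, and printing a JSON line is just one possible way to emit the number.

```python
import json
import uuid

# Module-level state lives as long as the execution environment does,
# so it is shared by every invocation this instance handles.
INSTANCE_ID = str(uuid.uuid4())  # hypothetical identifier, one per instance
invocation_count = 0


def handler(event, context):
    """Count the invocations handled by this instance and log the running total."""
    global invocation_count
    invocation_count += 1
    record = {"instance_id": INSTANCE_ID, "invocation": invocation_count}
    # One structured log line per invocation; aggregate these afterwards
    # to see the per-instance distribution instead of just the average.
    print(json.dumps(record))
    return record
```

The highest `invocation` value logged for each `instance_id` is the number of requests that instance handled, which shows the variance that the 5093 / 57 average hides.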
Related
We have a lambda setup which will occasionally take ten minutes between initialization and invocation, leading to severe performance degradation of the apps that depend on it. The lambda request is handled by API gateway, sent to the Lambda Context for our handler, and then sent to the lambda function itself. At this point it looks like the lambda is initialized, but will take anywhere from 1 to 10 minutes to invoke for the slow performing requests.
Provisioned concurrency seems to address the cold start problem, but we can't find any indication that concurrency is the problem. Furthermore, we have no reason to believe this is a cold start problem, given that the request takes 10 minutes rather than 10 seconds. I have no idea where to start to address this problem. Can someone give me some tips?
I have a Lambda function that is being executed concurrently more than once, and this function needs to call an API that blocks frequent visits. Is there a way to avoid concurrent executions? Other methods of avoiding authentication failures would also help.
Just set the concurrency limit on the function. You can set the limit as low as 1 so that only one instance of the function is ever running at any given time.
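A minimal sketch of setting that limit from code, assuming boto3 and a Lambda client created with `boto3.client("lambda")`. The helper name `limit_to_one_instance` and the function name `"my-function"` are placeholders; `put_function_concurrency` is the boto3 call that sets reserved concurrency.

```python
def limit_to_one_instance(lambda_client, function_name):
    """Reserve a concurrency of 1 so at most one instance runs at a time."""
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=1,
    )


# Usage (requires AWS credentials; "my-function" is a placeholder name):
#   import boto3
#   limit_to_one_instance(boto3.client("lambda"), "my-function")
```

With the limit at 1, invocations that arrive while the single instance is busy are throttled, so the caller needs to retry them or tolerate the rejection.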
I am using the distributed scheduler Chronos (a distributed crontab) to hit a REST API a few minutes after a job is added (example: add a job at time T to run at T+5 minutes). This runs on a bigger infrastructure and takes care of fault tolerance and data loss; however, it has a significant cost, and I am looking for an alternative that meets the same requirement. Please help if it can be done using a Lambda function.
It's possible to invoke a Lambda function, block/wait for X seconds and then continue execution, but it's not recommended. You cannot wait for more than 300 seconds, though, as that's the maximum timeout Lambda allowed when this was written (it has since been raised to 900 seconds).
Moreover, you will hit AWS's concurrent execution limits and will need to keep asking AWS support to increase them.
Another approach to this problem could be to use an actor-based system such as Akka and create an actor for each job.
I'm building a server-less web-tracking system which serves its tracking pixel using AWS API Gateway, which calls a Lambda function whenever a tracking request arrives to write the tracking event into a Kinesis stream.
The Lambda function itself does not do anything fancy. It just takes the incoming event (its own argument) and writes it to the stream. Essentially, it's just:
import boto3

kinesis_client = boto3.client("kinesis")
kinesis_stream = "my_stream_name"

def return_tracking_pixel(event, context):
    ...
    new_record = ...(event)
    kinesis_client.put_record(
        StreamName=kinesis_stream,
        Data=new_record,
        PartitionKey=...
    )
    return ...
Sometimes I experience a weird spike in the Lambda execution duration that causes some of my Lambda function invocations to time-out and the tracking requests to be lost.
This is the graph of 1-minute invocation counts of the Lambda function in the affected time period:
Between 20:50 and 23:10 I suddenly see many invocation errors (1-minute error counts):
which are obviously caused by the Lambda execution time-out (maximum duration in 1-minute intervals):
There is nothing weird going on with either my Kinesis stream (data-in, number of put records, put_record success count etc. all look normal) or my API GW (the number of invocations corresponds to the number of API GW calls, well within the limits of the API GW).
What could be causing the sudden (and seemingly randomly occurring) spike in the Lambda function execution duration?
EDIT: the Lambda functions are not being throttled either, which was my first idea.
Just to add my 2 cents, because there's not much investigative work without extra logging or some X-Ray analysis.
AWS Lambda sometimes will force recycle containers which will feel like cold starts even though your function is being reasonably exercised and warmed up. This might bring all cold start related issues, like extra delays for ENIs if your Lambda has an attached VPC and so on... but even for a simple function like yours, 1 second timeout is sometimes too optimistic for a cold start.
I don't know of any documentation on those forced recycles, other than some people having evidence for it.
"We see a forced recycle about 7 times a day." source
"It also appears that even once warmed, high concurrency functions get recycled much faster than those with just a few in memory." source
I wonder how you could confirm this is the case. Perhaps you could check whether those errors appear in CloudWatch log streams that belong to containers which never appeared before.
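One way to run that check offline is sketched below. It assumes the log events have already been exported into simple records, and that each log stream corresponds to one container. `LogRecord`, its fields and `errors_from_new_streams` are hypothetical names for illustration, not part of any AWS API.

```python
from collections import namedtuple

# Hypothetical record shape: timestamp (any comparable value), log stream
# name, and whether the log event was an error such as a timeout.
LogRecord = namedtuple("LogRecord", ["timestamp", "stream", "is_error"])


def errors_from_new_streams(records, window_start):
    """Return error records whose log stream first appeared at or after window_start."""
    first_seen = {}
    for rec in records:
        # Track the earliest timestamp at which each stream shows up.
        if rec.stream not in first_seen or rec.timestamp < first_seen[rec.stream]:
            first_seen[rec.stream] = rec.timestamp
    return [
        rec for rec in records
        if rec.is_error and first_seen[rec.stream] >= window_start
    ]
```

If most of the timeouts in the affected period come from streams that first appear inside that period, that would be consistent with the forced-recycle explanation. Errors from long-lived streams would point elsewhere.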
According to the docs, "by default, AWS Lambda limits the total concurrent executions across all functions within a given region to 100."
Consider a simple mobile app using Lambda for back end processing. If I'm understanding the constraint correctly, not more than 100 concurrent executions can happen at one time meaning that if I have 100 users invoking lambda functions at the same time, there will be throttling constraints?
I understand I can call customer support and increase that limit but is this the correct interpretation of the constraint? How is this supposed to scale to 1000, 10,000 or 1,000,000 users?
update: Since this answer was written, the default limit for concurrent executions was increased by a factor of 10, from 100 to 1,000. The limit is per account, per region.
By default, AWS Lambda limits the total concurrent executions across all functions within a given region to 1000
http://docs.aws.amazon.com/lambda/latest/dg/concurrent-executions.html#concurrent-execution-safety-limit (link visited 2017-05-02)
However, as before, this is a protective control, and AWS support will increase the limit if you present them with your use case and it is approved. There isn't a charge for creating this type of request in the support center and there isn't a charge for raising your limits.
The Lambda platform may also allow excursions beyond your limit if it deems the action appropriate. The logic behind such an action isn't documented, but a reasonable assumption would be that it happens when the traffic appears to be driven by genuine demand/load, rather than being the result of a runaway loopback condition where Lambda functions invoke more Lambda functions, directly or indirectly.
A fun example of a runaway condition might be something like this: A bucket has a create object event that invokes a Lambda function, which creates 2 objects in the same bucket... which invokes the same Lambda function 2 times, creating 4 objects... invoking the Lambda function 4 times, creating 8 objects... invoking the Lambda function 8 times, creating 16 objects.
On about the 15th iteration, which would only require a matter of seconds, you theoretically would have 32,768 concurrent invocations trying to create 65,536 objects. Real world traffic ramps up much more slowly, in most cases.
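The doubling in this example can be checked with a few lines of arithmetic. This is only a sketch of the idealized scenario: every invocation creates 2 objects and every new object triggers exactly one invocation. The function name `runaway_growth` is made up for illustration.

```python
def runaway_growth(iterations, objects_per_invocation=2):
    """Return (invocations, objects_created) for iterations 0..iterations."""
    rows = []
    invocations = 1  # iteration 0: the first object invokes the function once
    for _ in range(iterations + 1):
        objects = invocations * objects_per_invocation
        rows.append((invocations, objects))
        invocations = objects  # each new object triggers one more invocation
    return rows


rows = runaway_growth(15)
print(rows[15])  # invocations and objects created at the 15th iteration
```

Under these assumptions, the 15th iteration has 2^15 = 32,768 invocations creating 2^16 = 65,536 objects, matching the figures above.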
if I have 100 users invoking lambda functions at the same time, there will be throttling constraints
Yes, that's the idea behind "concurrent."
How is this supposed to scale
Nobody said it would, with the limit in place.
This limit is a protective control, not a reflection of an actual limitation of the platform.
But also, how likely is it that your users are making concurrent requests to Lambda? Assuming your Lambda function runs for 100ms, you could handle something like 750 invocations per second within a limit of 100 concurrent invocations at a blocking probability of only 0.1%.
(That's an Erlang B calculation, which seems applicable here. With no random arrivals, of course, the "pure" capacity would be 100 × 10 = 1000 invocations/sec for a 100ms function).
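For anyone who wants to reproduce the figure, here is a small sketch of the Erlang B calculation using the standard recurrence. The bisection search for the largest offered load is my own addition, and the function names are illustrative. Offered load is measured in erlangs, i.e. arrival rate × average duration.

```python
def erlang_b(offered_load, servers):
    """Blocking probability for an offered load (in erlangs) on `servers` slots."""
    b = 1.0  # with zero servers every request is blocked
    for m in range(1, servers + 1):
        # Standard recurrence: B(E, m) = E*B(E, m-1) / (m + E*B(E, m-1))
        b = offered_load * b / (m + offered_load * b)
    return b


def max_offered_load(servers, target_blocking, tolerance=1e-6):
    """Largest offered load whose blocking stays at or below target_blocking."""
    low, high = 0.0, 2.0 * servers  # blocking rises monotonically with load
    while high - low > tolerance:
        mid = (low + high) / 2
        if erlang_b(mid, servers) > target_blocking:
            high = mid
        else:
            low = mid
    return low


duration_seconds = 0.1  # the 100 ms function from the example
load = max_offered_load(servers=100, target_blocking=0.001)
print(load / duration_seconds)  # sustainable invocations per second
```

Dividing the offered load by the 0.1 s duration gives the arrival rate, which comes out at roughly 750 invocations per second for 100 concurrent slots at 0.1% blocking.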