AWS Lambda temporal load balancing - concurrency

I have a AWS Lambda function that I invoke with every 1 minute with >1000 SNS events. This is a problem because my account concurrency is set at 3000, so if I start adding more jobs then eventually I'm going to have >3000 concurrent Lambda instances.
Each job takes around 2-5 seconds to complete which means that within each 1 minute window the concurrency limit will only be threatened within the first 5 seconds and I'll have 0 concurrency for the remaining 55 seconds.
If I set a concurrency limit (e.g. 1000) for the lambda will it handle the first 1000 SNS events and then automatically pick up the remainder once the concurrency frees up? And will I only be charged for the actual runtime rather than time spent waiting for concurrency to reduce?
Otherwise, is there a way that AWS will allow me to spread the load of jobs throughout the 1 minute window so that I can invoke the lambda every ~5 seconds with a subset of the total number of jobs?

If I set a concurrency limit (e.g. 1000) for the lambda will it handle the first 1000 SNS events and then automatically pick up the remainder once the concurrency frees up? And will I only be charged for the actual runtime rather than time spent waiting for concurrency to reduce?
Yes. Setting the concurrency limit definitely comes in handy on your use case and is the way to go. This is one of the reasons why concurrency limit actually exists :)
Unfortunately you can't take advantage of batching with SNS because it always sends one and only event. What you could do is to hook up a SQS queue with your SNS topic and have the Lambda function subscribe to the SQS queue instead, then you can take advantage of batching (max batch size is 10), greatly reducing the amount of concurrent Lambda executions, but still, you'd need to set a concurrency limit to make sure you don't use up all the available concurrency.
Otherwise, is there a way that AWS will allow me to spread the load of jobs throughout the 1 minute window so that I can invoke the lambda every ~5 seconds with a subset of the total number of jobs?
No, but this is unnecessary because of the above.

Related

AWS lambda does not reach maximum concurrency available

I have a lambda function that is triggered by an sqs queue . I set batch size to 1 because I want each message to map to one lambda instance to benefit from concurrency and finish processing faster.
However,after some trials with 1000 messages available in sqs queue max concurrent execution only reachs 50 although I reserve 1000 concurrency for my function.
Is there a reason behind this behavior
One reason could be that your functions finish quickly. Thus there is no reason to span 1000 concurrent functions. Lambda polls the sqs at fixed intervals, so you can just span 1000 concurrent invocations in an instance. Similarly, lambda does not scale to 1000 in an instant. Please read the following for more details:
Understanding how AWS Lambda scales with Amazon SQS standard queues

Why Lambda never reach or close to the reserved concurrency? Want SQS triggers more Lambda function to process message concurrently

I've set a Lambda trigger with a SQS Queue. Lambda's reserved concurrency is set to 1000. However, there are millions of messages waiting in the queue need to be processed and it only invokes around 50 Lambdas at the same time. Ideally, I want SQS to trigger 1000 (or close to 1000) Lambda functions concurrently. Do I miss any configuration in SQS or Lambda? Thank you for any suggestion.
As stated in AWS Lambda developer guide:
...Lambda increases the number of processes that are reading batches by up to 60 more instances per minute. The maximum number of batches that an event source mapping can process simultaneously is 1,000.
So the behavior that you encountered (only invokes around 50 Lambdas at the same time) is actually expected.
If you are not using already, I would suggest doing batch processing in your lambda (so you can process 10 messages per invocation). If that is still not enough, you can potentially create more queues and lambdas to divide your load (considering that order is not relevant in your case), or move away from it and start polling the queue directly with EC2/ECS (which can increase your costs considerably however).

aws worker queue with long idle and sudden burst

I'm trying to implement a worker queue to handle burst of messages. Every few days/weeks I get a burst of ~5,000 messages that need to be processed in a reasonable time (each message can be proceeded in about 1-2 min). Ideally if I could run 5,000 lambdas simultaneous the whole process should take 1-2 min but my account is limit to 1,000 concurrency lambda.
I was planning to create an SQS as event source to lambda but I can't see any Throttling mechanism? I can set reserved concurrency on the lambda but it will be deducted from my account limit and it sound "expensive" to keep those lambda idle for an event that can occur only one per few weeks.
Is there a way to throttling SQS? or maybe I need to choose other service to implement such queue?
In that case, the reserved concurrency is the way to limit number of your lambda functions that can run at the same time:
Reserved concurrency – Reserved concurrency creates a pool of requests that can only be used by its function, and also prevents its function from using unreserved concurrency.
It does not cost extra. I think you are confusing reserved concurrency with provisioned concurrency:
When enabled, Provisioned Concurrency keeps functions initialized and hyper-ready to respond in double-digit milliseconds.

Scaling AWS Lambda with SQS

I want to use SQS for calling Lambda.
An execution time of lambda function is 3 minutes.
I want to execute 1000 lambda functions at once, so I send 1000 messages to SQS queue
But according to an AWS documentation
Amazon Simple Queue Service supports an initial burst of 5 concurrent function invocations and increases concurrency by 60 concurrent invocations per minute.
https://docs.aws.amazon.com/en_us/lambda/latest/dg/scaling.html
So I should wait a few minutes until all messages will be processed. Is there any workaround to call 1000 concurrent lambda and avoid "cold start"?
UPD: I got answer from AWS support
You are correct that SQS will start at an initial burst of 5 and
increase by a concurrency of 60 per minute. Scaling rates can't be
increased
If you see the Automatic Scaling section of the documentation page that describes the autoscaling behaviour under sudden load. I don't think cold start would be a problem. The first batch of concurrent Lambdas executions would likely see the cold start and all the subsequent invocations would be fine.

AWS Lambda is seemingly not highly available when invoked from SNS

I am invoking a data processing lambda in bulk fashion by submitting ~5k sns requests in an asynchronous fashion. This causes all the requests to hit sns in a very short time. What I am noticing is that my lambda seems to have exactly 5k errors, and then seems to "wake up" and handle the load.
Am I doing something largely out of the ordinary use case here?
Is there any way to combat this?
I suspect it's a combination of concurrency, and the way lambda connects to SNS.
Lambda is only so good at automatically scaling up to deal with spikes in load.
Full details are here: (https://docs.aws.amazon.com/lambda/latest/dg/scaling.html), but the key points to note that
There's an account-wide concurrency limit, which you can ask to be
raised. By default it's much less than 5k, so that will limit how
concurrent your lambda could ever become.
There's a hard scaling limit (+1000 instances/minute), which means even if you've managed to convince AWS to let you have a concurrency limit of 30k, you'll have to be under sustained load for 30 minutes before you'll have that many lambdas going at once.
SNS is a non-stream-based asynchronous invocation (https://docs.aws.amazon.com/lambda/latest/dg/invoking-lambda-function.html#supported-event-source-sns) so what you see is a lot of errors as each SNS attempts to invoke 5k lambdas, but only the first X (say 1k) get through, but they keep retrying. The queue then clears concurrently at your initial burst (typically 1k, depending on your region), +1k a minute until your reach maximum capacity.
Note that SNS only retries three times at intervals (AWS is a bit sketchy about the intervals, but it is probably based on the retry: delay the service returns, so should be approximately intelligent); I suggest you setup a DLQ to make sure you're not dropping messages because the time for the queue to clear.
While your pattern is not a bad one, it seems like you're very exposed to the concurrency issues that surround lambda.
An alternative is to use a stream based event-source (like Kinesis), which processes in batches at a set concurrency (e.g. 500 records per lambda, concurrent by shard count, rather than 1:1 with SNS), and waits for each batch to finish before processing the next.