AWS Lambda does not reach maximum concurrency available - amazon-web-services

I have a Lambda function that is triggered by an SQS queue. I set the batch size to 1 because I want each message to map to one Lambda instance, to benefit from concurrency and finish processing faster.
However, after some trials with 1000 messages available in the SQS queue, the maximum concurrent executions only reach 50, although I reserved a concurrency of 1000 for my function.
Is there a reason behind this behavior?

One reason could be that your functions finish quickly, so there is no need to spawn 1000 concurrent executions. Also, Lambda polls the SQS queue and ramps up gradually, so it cannot spawn 1000 concurrent invocations in an instant. Please read the following for more details:
Understanding how AWS Lambda scales with Amazon SQS standard queues

Related

Why does Lambda never reach, or come close to, the reserved concurrency? I want SQS to trigger more Lambda functions to process messages concurrently

I've set up a Lambda trigger with an SQS queue. The Lambda's reserved concurrency is set to 1000. However, there are millions of messages waiting in the queue to be processed, and it only invokes around 50 Lambdas at the same time. Ideally, I want SQS to trigger 1000 (or close to 1000) Lambda functions concurrently. Am I missing any configuration in SQS or Lambda? Thank you for any suggestions.
As stated in AWS Lambda developer guide:
...Lambda increases the number of processes that are reading batches by up to 60 more instances per minute. The maximum number of batches that an event source mapping can process simultaneously is 1,000.
So the behavior that you encountered (only invokes around 50 Lambdas at the same time) is actually expected.
If you are not already doing so, I would suggest batch processing in your Lambda (so you can process 10 messages per invocation). If that is still not enough, you can create more queues and Lambdas to divide your load (assuming that order is not relevant in your case), or move away from the trigger and poll the queue directly from EC2/ECS (which can increase your costs considerably, however).
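The ramp-up quoted above can be turned into rough arithmetic. The following is a minimal sketch that treats "up to 60 more instances per minute" as exactly 60, which is an optimistic assumption; the function names are illustrative and not part of any AWS API.

```python
import math

INITIAL_BATCHES = 5      # batches read when messages first become available
RAMP_PER_MINUTE = 60     # polling processes added per minute (best case)
MAPPING_CAP = 1000       # maximum simultaneous batches per event source mapping


def max_concurrent_batches(minutes_elapsed: float,
                           reserved_concurrency: int = MAPPING_CAP) -> int:
    """Best-case number of batches in flight after a sustained backlog."""
    ramped = INITIAL_BATCHES + RAMP_PER_MINUTE * math.floor(minutes_elapsed)
    return min(ramped, MAPPING_CAP, reserved_concurrency)


def minutes_to_reach(target: int) -> int:
    """Whole minutes of sustained backlog before `target` batches can run at once."""
    if target <= INITIAL_BATCHES:
        return 0
    return math.ceil((target - INITIAL_BATCHES) / RAMP_PER_MINUTE)


if __name__ == "__main__":
    print(max_concurrent_batches(0))   # 5
    print(max_concurrent_batches(1))   # 65
    print(minutes_to_reach(1000))      # 17
```

Under these assumptions, a concurrency of about 50 is what you would see during the first minute, and reaching 1000 takes roughly 17 minutes of continuous backlog.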

Process elements from an SQS queue sequentially using AWS Lambda

I need to process the elements of an SQS Queue sequentially using an AWS Lambda. I need to process the elements sequentially as I don't want to impact the DB when processing multiple elements in parallel. Note that this process is not time-sensitive.
I have noticed that AWS Lambda reads up to 5 batches when messages are available, then invokes 5 Lambdas in parallel, which I want to avoid. Lambda may increase that number to up to 1000 batches. https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
Any help will be appreciated, thanks!
You just need to set the --batch-size parameter to 1.
Other things to look at to achieve the desired outcome:
Make sure you're using a FIFO SQS queue
Fetch 1 message at a time using batch-size
Configure a Dead Letter Queue (DLQ) for the SQS queue itself, so that messages that can't be processed after several attempts are sent to the DLQ
Check the documentation for details on each parameter supported by the event source mapping
https://docs.aws.amazon.com/cli/latest/reference/lambda/create-event-source-mapping.html
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-dead-letter-queues.html
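As a rough illustration of the batch-size setting, here is a sketch using the boto3 equivalent of the CLI command linked above. The client is passed in so the helper can be exercised without AWS credentials; the helper name, queue ARN and function name are placeholders, not values from the question. Note that batch size only controls how many messages each invocation receives.

```python
def create_single_message_mapping(lambda_client, queue_arn: str, function_name: str) -> dict:
    """Attach an SQS queue to a Lambda function, one message per invocation.

    `lambda_client` is expected to be a boto3 Lambda client,
    e.g. boto3.client("lambda").
    """
    return lambda_client.create_event_source_mapping(
        EventSourceArn=queue_arn,      # placeholder ARN supplied by the caller
        FunctionName=function_name,    # placeholder function name
        BatchSize=1,                   # equivalent of --batch-size 1
        Enabled=True,
    )
```

Typical usage would be `create_single_message_mapping(boto3.client("lambda"), queue_arn, "my-function")`, where both arguments are your own values.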
It appears that your requirement is to:
Process a single message at any time
No parallel processing
Order is not important
Time is not important
This can be achieved by setting a concurrency limit on the AWS Lambda function (for example, a limit of 1), which caps the number of parallel executions.
From Set Concurrency Limits on Individual AWS Lambda Functions:
This feature allows you to throttle a given function if it reaches a maximum number of concurrent executions allowed, which you can choose to set. This is useful when you want to limit traffic rates to downstream resources called by Lambda (e.g. databases).
Since order is not important, you can use a Standard Queue (instead of a FIFO queue).
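A minimal sketch of setting that limit with boto3 follows. It assumes a limit of 1 is what "no parallel processing" calls for; the helper name and the function name in the usage line are hypothetical.

```python
def cap_function_concurrency(lambda_client, function_name: str, limit: int = 1) -> dict:
    """Set reserved concurrency so at most `limit` executions run in parallel.

    `lambda_client` is expected to be a boto3 Lambda client,
    e.g. boto3.client("lambda").
    """
    if limit < 0:
        raise ValueError("limit must be zero or greater")
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )
```

For example, `cap_function_concurrency(boto3.client("lambda"), "db-writer")` would keep the hypothetical `db-writer` function to a single execution at a time.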

AWS SQS Lambda Processing n files at once

I have set up an SQS queue to which S3 paths are pushed whenever a file is uploaded.
I have also set up a Lambda with an SQS trigger and a batch size of 1.
In my scenario, I have to process n files at a time. Let's say n = 10.
Say, there are 100 messages in the queue. In my current implementation I'm doing the following steps:
Whenever there is a message in the input queue, Lambda will be triggered
First I check the number of concurrent executions currently active. If I am already running 10 executions, the code simply returns without doing anything. If there are fewer than 10, it reads one message from the queue and starts processing it.
Once the processing is done, the message will be manually deleted from the queue.
With the above approach, I'm able to process n files at a time. However, say 100 files land in S3 at the same time.
This leads to 100 Lambda calls. Because of the condition check in the Lambda, the first 10 messages go for processing and the remaining 90 messages become in-flight.
Now, when some of my processing is done (say 3 of 10 have finished), the main queue is still empty because the remaining messages are still in flight.
As I understand it, if processing a file takes x minutes, the visibility timeout of the messages in the queue should be less than x, so that the message becomes available in the queue again.
But this leads to another problem. Say the batch takes more time to complete: the message comes back to the queue, the Lambda is triggered, and the message goes in flight once again.
Is there any way I can control the number of triggers made to the Lambda? For example, only the first 10 messages should be processed while the remaining 90 messages stay visible in the queue. Or is there any other way I can make this design simpler?
I don't want to wait until there are 10 messages. Even if there are only 5 messages, it should process those files. And I don't want to call the Lambda on a schedule (e.g. every 5 minutes).
There is a setting in Lambda called Reserved Concurrency. I'm going to quote from the docs (emphasis mine):
Reserved concurrency – Reserved concurrency creates a pool of requests that can only be used by its function, and also prevents its function from using unreserved concurrency.
[...]
To ensure that a function can always reach a certain level of concurrency, configure the function with reserved concurrency. When a function has reserved concurrency, no other function can use that concurrency. Reserved concurrency also limits the maximum concurrency for the function, and applies to the function as a whole, including versions and aliases.
For a deeper dive, check out this article from the documentation.
You can use this to limit how many Lambdas can be triggered in parallel - if no Lambda execution contexts are available, SQS invocations will wait.
This is only necessary if you want to limit how many files can be processed in parallel. If there is no actual need to limit this, it won't cost you more to let Lambda scale out for you.
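With reserved concurrency doing the limiting, the function no longer needs the manual concurrency check or the manual delete described in the question: when an SQS-triggered invocation returns successfully, Lambda deletes the batch from the queue, and if it raises, the messages become visible again after the visibility timeout. A minimal handler sketch, assuming each message body is just the S3 path as plain text (the question does not state the exact format) and with `process_file` as a hypothetical stand-in for the real work:

```python
def process_file(s3_path: str) -> str:
    """Hypothetical placeholder for the real per-file processing."""
    return f"processed {s3_path}"


def handle_files(event, context):
    """Lambda entry point for an SQS trigger; one record per message."""
    results = []
    for record in event["Records"]:
        s3_path = record["body"]   # assumed: the body holds the S3 path as-is
        results.append(process_file(s3_path))
    # Returning normally signals success, so Lambda removes the messages.
    return results
```

Setting the function's reserved concurrency to 10 would then keep at most 10 of these handlers running at once.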
You don't have to limit your concurrent Lambda executions. AWS already handles that for you. Here is the list of maximum burst concurrency per region, from this document: https://docs.aws.amazon.com/lambda/latest/dg/invocation-scaling.html
Burst concurrency quotas
3000 – US West (Oregon), US East (N. Virginia), Europe (Ireland)
1000 – Asia Pacific (Tokyo), Europe (Frankfurt), US East (Ohio)
500 – Other Regions
In this document: https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
Scaling and processing
For standard queues, Lambda uses long polling to poll a queue until it
becomes active. When messages are available, Lambda reads up to 5
batches and sends them to your function. If messages are still
available, Lambda increases the number of processes that are reading
batches by up to 60 more instances per minute. The maximum number of
batches that can be processed simultaneously by an event source
mapping is 1000.
For FIFO queues, Lambda sends messages to your function in the order
that it receives them. When you send a message to a FIFO queue, you
specify a message group ID. Amazon SQS ensures that messages in the
same group are delivered to Lambda in order. Lambda sorts the messages
into groups and sends only one batch at a time for a group. If the
function returns an error, all retries are attempted on the affected
messages before Lambda receives additional messages from the same
group.
Your function can scale in concurrency to the number of active message
groups. For more information, see SQS FIFO as an event source on the
AWS Compute Blog.
You can see that Lambda handles the scaling up automatically. There is no need to artificially limit the number of running Lambdas to 10.
The idea of Lambda is that you run as many tasks in parallel as possible so that the work finishes in the shortest time.

AWS Lambda temporal load balancing

I have an AWS Lambda function that I invoke every minute with >1000 SNS events. This is a problem because my account concurrency is set at 3000, so if I start adding more jobs then eventually I'm going to have >3000 concurrent Lambda instances.
Each job takes around 2-5 seconds to complete which means that within each 1 minute window the concurrency limit will only be threatened within the first 5 seconds and I'll have 0 concurrency for the remaining 55 seconds.
If I set a concurrency limit (e.g. 1000) for the lambda will it handle the first 1000 SNS events and then automatically pick up the remainder once the concurrency frees up? And will I only be charged for the actual runtime rather than time spent waiting for concurrency to reduce?
Otherwise, is there a way that AWS will allow me to spread the load of jobs throughout the 1 minute window so that I can invoke the lambda every ~5 seconds with a subset of the total number of jobs?
If I set a concurrency limit (e.g. 1000) for the lambda will it handle the first 1000 SNS events and then automatically pick up the remainder once the concurrency frees up? And will I only be charged for the actual runtime rather than time spent waiting for concurrency to reduce?
Yes. Setting the concurrency limit definitely comes in handy in your use case and is the way to go. This is one of the reasons why the concurrency limit exists :)
Unfortunately you can't take advantage of batching with SNS, because it always sends exactly one event per invocation. What you could do is hook up an SQS queue to your SNS topic and have the Lambda function subscribe to the SQS queue instead. Then you can take advantage of batching (max batch size is 10), greatly reducing the number of concurrent Lambda executions. You would still need to set a concurrency limit to make sure you don't use up all the available concurrency.
Otherwise, is there a way that AWS will allow me to spread the load of jobs throughout the 1 minute window so that I can invoke the lambda every ~5 seconds with a subset of the total number of jobs?
No, but this is unnecessary because of the above.
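To make the batching suggestion concrete, here is a small sketch of a handler behind an SNS-to-SQS subscription. It assumes raw message delivery is left disabled, so each SQS body is the JSON envelope SNS produces and the original payload sits under the `Message` key; the function names are illustrative.

```python
import json
import math


def invocations_needed(message_count: int, batch_size: int = 10) -> int:
    """Minimum number of invocations if every batch is completely full."""
    return math.ceil(message_count / batch_size)


def handle_batch(event, context):
    """Unwrap the SNS envelope from each SQS record in the batch."""
    jobs = []
    for record in event["Records"]:
        envelope = json.loads(record["body"])   # SNS notification as JSON
        jobs.append(envelope["Message"])        # the originally published payload
    return jobs
```

With full batches of 10, the 1000 events per minute from the question would need about `invocations_needed(1000)`, i.e. 100 invocations, instead of 1000. Batches are not guaranteed to be full, so treat this as a lower bound.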

Scaling AWS Lambda with SQS

I want to use SQS to invoke Lambda.
The execution time of the Lambda function is 3 minutes.
I want to execute 1000 Lambda functions at once, so I send 1000 messages to the SQS queue.
But according to the AWS documentation:
Amazon Simple Queue Service supports an initial burst of 5 concurrent function invocations and increases concurrency by 60 concurrent invocations per minute.
https://docs.aws.amazon.com/en_us/lambda/latest/dg/scaling.html
So I would have to wait a few minutes until all messages are processed. Is there any workaround to invoke 1000 concurrent Lambdas and avoid a "cold start"?
UPD: I got an answer from AWS support:
You are correct that SQS will start at an initial burst of 5 and
increase by a concurrency of 60 per minute. Scaling rates can't be
increased
See the Automatic Scaling section of the documentation page, which describes the autoscaling behaviour under sudden load. I don't think cold starts would be a problem: the first batch of concurrent Lambda executions would likely see a cold start, and the subsequent invocations would be fine.