I have a Lambda function with 5 different SQS queues configured as event sources. If the batch size of the Lambda is set to 10, will the 10 records in the SQSEvent passed to the handler all come from the same queue, or can the 10 records in a batch come from any of the 5 queues?
The behavior is undocumented, but it's almost certainly the case that the Lambda service will batch events from one, and only one, queue in a single Lambda invocation.
That said, if it's critical to your application that you be able to distinguish one queue source from another, then either:
create one Lambda handler per queue (it could simply call your common handler function with an indicator of which queue was the source), or
check the value of the eventSourceARN in each record in the event
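The second option can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it assumes the standard SQS event shape (a "Records" list whose entries carry "eventSourceARN" and "body"), and the queue names in the usage note are made up.

```python
from collections import defaultdict


def handler(event, context):
    """Group the message bodies in an SQS event by source queue name."""
    by_queue = defaultdict(list)
    for record in event.get("Records", []):
        # eventSourceARN looks like arn:aws:sqs:<region>:<account>:<queue-name>,
        # so the queue name is the part after the last colon.
        queue_name = record["eventSourceARN"].rsplit(":", 1)[-1]
        by_queue[queue_name].append(record["body"])
    return dict(by_queue)
```

If a batch ever did mix queues, the returned dictionary would simply have more than one key, so the handler works either way.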
Related
In the AWS docs, it is written:
Lambda reads up to five batches and sends them to your function.
(https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#events-sqs-scaling)
I am a bit confused about that part
"reads up to five batches".
Does it mean:
5 SQS ReceiveMessage API calls are made in parallel at the same time ?
5 SQS ReceiveMessage API calls are made one by one (each one creating a new lambda environment)
Lambda polls 5 batches in parallel.
Lambda's poller receives messages in batches, much as you would in Python with the queue.receive_messages function, which can receive a batch of messages from an SQS queue in a single request.
The default batch size is 10 messages, and it can go up to 10,000 for standard queues. There is, however, a limit on simultaneous batches: 5 batches, sent to the same Lambda function.
If there are still messages in the queue, Lambda launches up to 60 more instances per minute to consume them.
Finally, the event source mapping (Lambda's link to the SQS queue) can handle up to 1000 batches of messages simultaneously.
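To make "5 batches in parallel" concrete, here is a rough Python sketch of what polling five batches at once could look like from the client side. It is only an illustration, not how the Lambda service is actually implemented. The functions receive_batch and poll_batches are hypothetical names, and sqs_client is assumed to be a boto3 SQS client (for example boto3.client("sqs")).

```python
from concurrent.futures import ThreadPoolExecutor


def receive_batch(sqs_client, queue_url, batch_size=10):
    """One ReceiveMessage call; a single call returns at most 10 messages."""
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=batch_size,
        WaitTimeSeconds=20,  # long polling
    )
    # "Messages" is absent from the response when the queue is empty.
    return response.get("Messages", [])


def poll_batches(sqs_client, queue_url, parallel_batches=5):
    """Issue several ReceiveMessage calls at the same time, one per thread."""
    with ThreadPoolExecutor(max_workers=parallel_batches) as pool:
        futures = [
            pool.submit(receive_batch, sqs_client, queue_url)
            for _ in range(parallel_batches)
        ]
        return [future.result() for future in futures]
```

Each returned batch would then be handed to a separate function invocation.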
I have a use case where I have an Amazon SQS FIFO queue with a Lambda function. I need to make sure that the FIFO queue triggers the Lambda only when the previous Lambda execution has completed (and that the events arrive in order). According to the AWS docs, FIFO supports exactly-once processing, but nowhere do they mention that it would not push more events to Lambda until the first message is completely processed.
I need to make sure that the next message is processed only when the previous message is completely processed by the lambda function.
Is there a way to ensure that message 2 is only processed by Lambda when message 1 has been completely processed by Lambda?
FIFO supports exactly-once processing, but nowhere do they mention
that it would not push more events to Lambda until the first message
is completely processed.
SQS never pushes anything anywhere. You have to poll SQS for messages. When you configure Lambda integration with SQS Lambda is actually running a process behind the scenes to poll SQS for you.
SQS FIFO queues allow you to force messages to be processed in order by specifying a Message Group ID. When you specify the same Message Group ID for multiple messages, the FIFO queue will only make one of those messages available at a time (in first-in-first-out order). Only after the first message is removed from the queue is the second message made available, and so on.
In addition to this, you should configure AWS Lambda SQS integration with a Batch Size of 1, so that it doesn't try to wait for multiple messages to be available before processing. And you could configure the Reserved Concurrency on the Lambda function to 1, as mentioned in the other answer, so that only one instance of the Lambda function can be running at a time.
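As a sketch of the Message Group ID part, assuming a boto3 SQS client is passed in as sqs_client: the helper names, the group ID "single-group" and the index-based deduplication ID are all hypothetical choices for illustration. MessageGroupId and MessageDeduplicationId are the real send_message parameters; the deduplication ID can be omitted if content-based deduplication is enabled on the queue.

```python
import json


def build_fifo_message(queue_url, payload, group_id, dedup_id):
    """Build the keyword arguments for one send_message call to a FIFO queue."""
    return {
        "QueueUrl": queue_url,
        "MessageBody": json.dumps(payload),
        # All messages sharing this ID are delivered strictly in order.
        "MessageGroupId": group_id,
        # Must be unique per message within the deduplication window.
        "MessageDeduplicationId": dedup_id,
    }


def send_in_order(sqs_client, queue_url, payloads, group_id="single-group"):
    """Send every payload with the same group ID so SQS serializes them."""
    for index, payload in enumerate(payloads):
        params = build_fifo_message(
            queue_url, payload, group_id, f"{group_id}-{index}"
        )
        sqs_client.send_message(**params)
```

Using one group ID for everything gives strict ordering at the cost of parallelism; several group IDs would allow ordered processing per group instead.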
It is actually pretty easy to do this. The docs don't spell it out because, by default, Lambda will simply use up the available account concurrency and handle as many messages in parallel as possible.
You can influence this by setting the reserved concurrency for the lambda function to 1. This will ensure no more than 1 lambda function will be executed at the same time.
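Setting the reserved concurrency can be done in the console, or programmatically along these lines. This is a minimal sketch: lambda_client is assumed to be a boto3 Lambda client (boto3.client("lambda")), and the wrapper function name is made up.

```python
def limit_to_one_concurrent_execution(lambda_client, function_name):
    """Reserve exactly one concurrent execution for the given function."""
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=1,
    )
```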
I am putting some entries in an SQS queue which is set as an event source for a Lambda, and this flow is working fine. As soon as an entry arrives in the SQS queue, the Lambda processes it. So far so good.
But I have a situation where I want to let the entries stay in SQS for 3-4 days and then let a Lambda process them.
So basically, if I see that I have 100 entries in my SQS queue and 4 days have passed, I want to let the Lambda drain them and run some logic. Is this possible? Kindly guide me.
I think disabling the Lambda is not the way to fulfil the requirement, as you would miss other messages too.
SQS is a messaging service, and when it is integrated with Lambda you can only configure retries and process the messages. How long a message stays in SQS is not under your control; Lambda handles that by design.
Lambda polls the queue and invokes your function synchronously with an
event that contains queue messages. Lambda reads messages in batches
and invokes your function once for each batch. When your function
successfully processes a batch, Lambda deletes its messages from the
queue.
One solution that can work to deal with your query
But I have a situation where I want to let the entries to stay in SQS
for 3-4 days and then let a lambda process them.
You also need to decide which SQS messages should not be processed yet, push those messages to DynamoDB, and then process them after 4 or 5 days based on the DynamoDB TTL that was set during insertion. You can follow the steps below:
Add a property to the SQS message, is_dynamodb, to identify messages that should not be processed at the moment
Push such messages to DynamoDB
Add a TTL during insertion
In the Lambda function, check that the event from the DynamoDB stream is a removal, not an insertion
Process the message if the event is a removal (REMOVE)
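The steps above might look roughly like this in Python. Treat it as a sketch under assumptions: the attribute names id, body and expires_at are hypothetical, expires_at must be the attribute configured as the table's TTL attribute, and the stream must be set up to include old images. The userIdentity check is how DynamoDB Streams marks deletions performed by the TTL process rather than by a user.

```python
SECONDS_PER_DAY = 86400


def build_delayed_item(message_id, body, now_epoch, delay_days=4):
    """Build a DynamoDB item (low-level format) that expires after delay_days."""
    return {
        "id": {"S": message_id},
        "body": {"S": body},
        # TTL attributes hold an epoch timestamp in seconds.
        "expires_at": {"N": str(now_epoch + delay_days * SECONDS_PER_DAY)},
    }


def is_ttl_removal(record):
    """True only for REMOVE events caused by the DynamoDB TTL process."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == "dynamodb.amazonaws.com"
    )


def process_message(old_image):
    """Hypothetical business logic: here it just returns the stored body."""
    return old_image["body"]["S"]


def handler(event, context):
    """Process only the items that TTL removed; skip inserts and manual deletes."""
    processed = []
    for record in event.get("Records", []):
        if not is_ttl_removal(record):
            continue
        old_image = record["dynamodb"].get("OldImage", {})
        processed.append(process_message(old_image))
    return processed
```

Keep in mind that DynamoDB deletes expired items some time after the TTL passes, not at the exact moment, so this approach suits delays measured in days but not precise scheduling.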
I have a Scheduled Lambda function (via CloudWatch event rule) which is triggered every minute.
This Lambda picks up a request from an SQS queue, processes the parameters and triggers an AWS Step Functions workflow.
Now, ONLY 1 Lambda function instance is running every minute. How can I trigger multiple (e.g. 10) concurrent Lambda functions like this?
One way I can think of is to create 10 Cloudwatch event rule which runs every 1 minute, but I am not sure if that is the right way of doing it. Also, if I use this way, 10 lambda would be called even if I don't have entries in my SQS queue.
You can use AWS Step Functions.
An event triggers the first function, which then calls multiple functions in parallel.
Some useful links:
https://www.youtube.com/watch?v=c797gM0f_Pc
https://medium.com/soluto-nashville/simplifying-workflows-with-aws-step-functions-57d5fad41e59
Since your Lambda function fetches data from SQS, you can create an event source mapping between Lambda and SQS. Whenever a message is published to SQS, your Lambda function will be invoked concurrently depending on the number of messages in the queue, so you do not need to invoke Lambda from a CloudWatch event.
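Creating such a mapping programmatically could look like the sketch below. It assumes lambda_client is a boto3 Lambda client; the wrapper name and the default batch size of 10 are illustrative choices, and the queue ARN and function name are whatever your setup uses.

```python
def connect_queue_to_function(lambda_client, queue_arn, function_name, batch_size=10):
    """Create an SQS event source mapping so Lambda polls the queue itself."""
    return lambda_client.create_event_source_mapping(
        EventSourceArn=queue_arn,
        FunctionName=function_name,
        BatchSize=batch_size,
        Enabled=True,
    )
```

With the mapping in place, the scheduled rule and the manual polling code inside the function are no longer needed; the function receives the messages in its event instead.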
I'm using an AWS Lambda function that is triggered from an SNS event trigger to consume from an SQS queue. When the Lambda function executes, it pulls 10 messages from the queue, processes them, pulls another 10, and so on and so forth - up to a certain time limit that's coded into the Lambda function (less than the max of 5 minutes, obviously).
It's my understanding that a Lambda function triggered by an SNS event is one-to-one, is that correct? In other words, one SNS event won't trigger multiple Lambda functions (up to the maximum concurrent execution limit). There's no scaling based on load.
Are there any other potential solutions, leveraging Lambda, that would let me consume from SQS as frequently/fast as possible? I had considered trying to auto-scale my Lambda functions by leveraging CloudWatch alarms (and SNS event triggers) based on SQS queue size, but it seems like those alarms can fire, at most, every 5 minutes. I've also considered developing a master Lambda function that can automatically execute (many) slave Lambdas based on querying the queue size.
I understand that the more optimal design may be to leverage Kinesis instead of SNS. I may consider incorporating Kinesis in the future, but let's just pretend that Kinesis is not an option at this time.
There is no best way to do this. One approach (which you've kind of already mentioned) is to use CloudWatch and schedule a Lambda function to run every minute (that's the minimum schedule time for Lambda). This Lambda function will then look for new SQS messages and invoke other Lambda functions to handle new message(s). Here is a very good article for that use case: https://cloudonaut.io/integrate-sqs-and-lambda-serverless-architecture-for-asynchronous-workloads/
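A rough sketch of that scheduled dispatcher in Python follows. It is an illustration under assumptions, not the linked article's code: sqs_client and lambda_client are assumed to be boto3 clients, the function names, the payload shape, and the limits of 10 messages per worker and 10 workers are all hypothetical.

```python
import json
import math


def workers_needed(visible_messages, messages_per_worker=10, max_workers=10):
    """How many worker invocations to start for the current queue depth."""
    return min(max_workers, math.ceil(visible_messages / messages_per_worker))


def dispatch(sqs_client, lambda_client, queue_url, worker_function):
    """Check the queue depth and fan out asynchronous worker invocations."""
    attributes = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    # Attribute values come back as strings.
    visible = int(attributes["Attributes"]["ApproximateNumberOfMessages"])
    count = workers_needed(visible)
    for _ in range(count):
        lambda_client.invoke(
            FunctionName=worker_function,
            InvocationType="Event",  # asynchronous, do not wait for the result
            Payload=json.dumps({"queue_url": queue_url}).encode("utf-8"),
        )
    return count
```

Because the queue depth is only approximate, each worker should still cope with finding the queue empty.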
Personally, I do not recommend triggering your Lambda by SNS for this use case, because SNS doesn't fully guarantee delivery; AWS recommends sending the SNS notifications to SQS, which does not solve your problem. From the FAQs:
[...] If it is critical that all published messages be successfully processed, developers should have notifications delivered to an SQS queue (in addition to notifications over other transports).
Source: https://aws.amazon.com/sns/faqs/
For this kind of processing, if you push messages to a Kinesis stream instead of SQS, you should be able to flexibly process the messages (in batches of the needed size).
Note: If you use SQS, then after triggering a Lambda function through SNS (or using a scheduled Lambda), it can invoke inner Lambda functions to check the queue, spawning multiple concurrent inner Lambdas. However, the problem is that it's not practical to process SQS items in batches.