Map Reduce AWS Lambda & SQS - amazon-web-services

I have a Lambda function which spawns up a number of worker Lambda functions and each worker function posts to an SQS queue if there is any error.
There is a UI which long-polls the SQS queue for any errors. My problem is that how do I know when the processing is completed?
Since the first Lambda function (which spawns the worker Lambda functions) runs asynchronously, that is, it splits the data across the worker Lambda functions and then returns/finishes. I need to have an ability to be able to figure out when the processing is completed.
The reason why I'm only posting the errors to the SQS queue and not success is because if I have 10,000 objects to process and there are 9,000 successes, I would have to do quite many ReceiveMessages (an SQS API call to retrieve items from the SQS queue) in the UI/client side (probably around 900 calls if I specify the maximum number of messages to receive as 10 per call. You cannot retrieve more than 10 messages from the queue per a call.)
How can I overcome this design issue?
I'm using API Gateway, AWS Lambda and Dynamo DB (feel free to suggest any other Amazon/AWS service that could make this easier to get the job done.)

Related

How does AWS Lambda determine if messages are still in SQS queue?

When using AWS Lambda with a SQS queue (as event source), it is written in the doc
If messages are still available, Lambda increases the number of
processes that are reading batches by up to 60 more instances per
minute.
https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
My question here is how does the Lambda service determine "If messages are still available" ?
Answering the "how" question in a slightly different way:
Behind the scenes, Lambda operates a "State Manager" control-plane service that discovers work from the queue. State Manager also manages scaling of the fleet of "Poller" workers that do the actual retrieving, batching, invoking, and deleting.
These implementation details are from the Event Source Mapping section of the re:Invent 2022 video A closer look at AWS Lambda (SVS404-R). Here is a screenshot:
One of the calls to the SQS API is to get queue attributes (Java API, others similar). This returns a response and one of the attributes of the response is "approximate number of messages". With this you or AWS can determine about how many messages are in the queue.
From this, AWS can determine if it's worth spinning up additional instances. You too can get this information from the queue.
I imagine it uses the ApproximateNumberOfMessagesVisible metric on the SQS queue to check how many messages are available, and uses that number, plus your batch size configuration, to determine how many more Lambda instances your function needs to be scaled out to.
I believe the documentation refers to Lambda polling the queue to know whether there are still messages. Read more about it here.
Lambda polls the queue and invokes your Lambda function synchronously
with an event that contains queue messages. Lambda reads messages in
batches and invokes your function once for each batch. When your
function successfully processes a batch, Lambda deletes its messages
from the queue.
Event Source Mapping:
Lambda only sees messages that are visible, via the visibility timeout setting on the SQS queue. This is to prevent other queue consumers processing the message. I believe as an event-source, Lambda receives messages from the SQS queue, via being mapped to it.
As per the documentation you shared,for standard queues, Long Polling is in effect. Long polling basically waits for a certain amount of time to verify if there is a message in the queue. refer to the following docs :
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-short-and-long-polling.html
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/confirm-queue-is-empty.html

AWS Lambda read from SQS without concurrency

My requirement is like this.
Read from a SQS every 2 hours, take all the messages available and then process it.
Processing includes creating a file with details from SQS messages and sending it to an sftp server.
I implemented a AWS Lambda to achieve point 1. I have a Lambda which has an sqs trigger. I have set batch size as 50 and then batch window as 2 hours. My assumption was that Lambda will get triggered every 2 hours and 50 messages will be delivered to the lambda function in one go and I will create a file for every 50 records.
But I observed that my lambda function is triggered with varied number of messages(sometimes 50 sometimes 20, sometimes 5 etc) even though I have configured batch size as 50.
After reading some documentation I got to know(I am not sure) that there are 5 long polling connections which lambda spawns to read from SQS and this is causing this behaviour of lambda function being triggered with varied number of messages.
My question is
Is my assumption on 5 parallel connections being established correct? If yes, is there a way I can control it? I want this to happen in a single thread / connection
If 1 is not possible, what other alternative do I have here. I do not want to have one file created for every few records. I want one file to be generated every two hours with all the messages in sqs.
A "SQS Trigger" for Lambda is implemented with the so-called Event Source Mapping integration, which polls, batches and deletes messages from the queue on your behalf. It's designed for continuous polling, although you can disable it. You can set a maximum batch size of up to 10,000 records a function receives (BatchSize) and a maximum of 300s long polling time (MaximumBatchingWindowInSeconds). That doesn't meet your once-every-two-hours requirement.
Two alternatives:
Remove the Event Source Mapping. Instead, trigger the Lambda every two hours on a schedule with an EventBridge rule. Your Lambda is responsible for the SQS ReceiveMessage and DeleteMessageBatch operations. This approach ensures your Lambda will be invoked only once per cron event.
Keep the Event Source Mapping. Process messages as they arrive, accumulating the partial results in S3. Once every two hours, run a second, EventBridge-triggered Lambda, which bundles the partial results from S3 and sends them to the SFTP server. You don't control the number of Lambda invocations.
Note on scaling:
<Edit (mid-Jan 2023): AWS Lambda now supports SQS Maximum Concurrency>
AWS Lambda now supports setting Maximum Concurrency to the Amazon SQS event source, a more direct and less fiddly way to control concurrency than with reserved concurrency. The Maximum Concurrency setting limits the number of concurrent instances of the function that an Amazon SQS event source can invoke. The valid range is 2-1000 concurrent instances.
The create and update Event Source Mapping APIs now have a ScalingConfig option for SQS:
aws lambda update-event-source-mapping \
--uuid "a1b2c3d4-5678-90ab-cdef-11111EXAMPLE" \
--scaling-config '{"MaximumConcurrency":2}' # valid range is 2-1000
</Edit>
With the SQS Event Source Mapping integration you can tweak the batch settings, but ultimately the Lambda service is in charge of Lambda scaling. As the AWS Blog Understanding how AWS Lambda scales with Amazon SQS standard queues says:
Lambda consumes messages in batches, starting at five concurrent batches with five functions at a time. If there are more messages in the queue, Lambda adds up to 60 functions per minute, up to 1,000 functions, to consume those messages.
You could theoretically restrict the number of concurrent Lambda executions with reserved concurrency, but you would risk dropped messages due to throttling errors.
You could try to set the ReservedConcurrency of the function to 1. That may help. See the docs for reference.
A simple solution would be to create a CloudWatch Event Trigger (similar to a Cronjob) that triggers your Lambda function every two hours. In the Lambda function, you call ReceiveMessage on the Queue until you get all messages, process them and afterward delete them from the Queue. The drawback is that there may be too many messages to process within 15 minutes so that's something you'd have to manage.

How does AWS Lambda internal pollers manage SQS API calls?

in the AWS doc, it is written
Lambda reads up to five batches and sends them to your function.
(https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#events-sqs-scaling)
I am a bit confused about that part
"reads up to five batches".
Does it mean:
5 SQS ReceiveMessage API calls are made in parallel at the same time ?
5 SQS ReceiveMessage API calls are made one by one (each one creating a new lambda environment)
Lambda polls 5 batches in parallel.
AWS Lambda, in python for example, uses the queue.receive_messages function, to receive messages. This function is able to receive a batch of messages in a single request from an SQS queue.
The default is 10 messages per batch as seen here and may range to 10000 for standard queues. But there is a limit for simultaneous batches and that's 5 batches, sent to the same lambda.
If there are still messages in the Queue, lambda launches up to 60 more lambdas per minute to consume them.
Finally, event source mapping (lambda's link to the SQS queue) can handle up to 1000 batches of messages simultaneously.

SQS invoke multiple lambda at same time

I am new in aws sqs as of now I understand sqs have a queue which is storing request messages (parameter) then our attached lambda will fetch numbers of messages based on the batch file which we set on lambda.
so if the sqs queue has 10000 messages and the lambda batch is set to 100 then in each pulling lambda which fetches 100 messages from the queue and executes all until all request are processed then again it will pull 100 messages and so on?
so as of now, I understand lambda will wait for the next pulling until the previous pulling process is finished.
hope I am correct if not please correct me.
now my requirement is lambda should not wait to finish the previous pulling instead it should pull the next 100 messages and execute parallelly for eg lambda should create a different instance(something like this) and each instance pulls 100 100 messages and execute parallelly.
In the situation you describe, the AWS Lambda service will automatically run multiple AWS Lambda functions based upon the concurrency settings of your function.
See: Lambda function scaling - AWS Lambda
The default is to permit up to 1000 concurrent executions of an AWS Lambda function.
Therefore, you do not need to change anything. It will automatically create multiple instances of the Lambda function in parallel and pass (up to) 100 messages to each execution.
For a really good series of articles to understand how AWS Lambda operates, see: Operating Lambda: Performance optimization – Part 1 | AWS Compute Blog

Read SQS queue from AWS Lambda

I have the following infrastructure:
I have an EC2 instance with a NodeJS+Express process listening on a port for messages (process 1). Every time the process receives a message it sends it to an SQS queue. Then I have another process in the same machine reading the queue using long polling (process 2). When it finds a message in the queue it inserts the data in a MariaDB database sitting on an RDS instance.
(Just to clarify, messages are generated by users, they send a chunk of data which can contain arbitrary information to the endpoint where the process 1 is listening)
Now I want to put the process that reads the SQS (process 2) in a Lambda function so that the process that writes to the queue and the one that reads from the queue are completely independent. The problem is that I don't know if this is possible.
I know that Lambda function are invoked in response to an event, and the events supported at the moment are S3, SNS, SES, DynamoDB, Kinesis, Cognito, CloudWatch and Cloudformation but NOT SQS.
I was thinking in using SNS notifications to invoke the Lambda function so that every time a message is pushed to the queue, an SNS notification is fired and invokes the Lambda function but after playing a bit with it I've realised that is not possible to create an SNS notification from SQS, it's only possible to write SNS notifications to the queue.
Right now I'm a bit stuck because I don't know how to continue. I have the feeling that is not possible to create this infrastructure due to the current limitations in the AWS services. Is there another way to do what I want or am I in a dead-end?
Just to extend my question with some research I've made, this github repo shows how to read an SQS queu from a Lambda function but the lambda function works only if is fired from the command line:
https://github.com/robinjmurphy/sqs-to-lambda
In the readme, the author mentions the following:
Update: Lambda now supports SNS notifications as an event source,
which makes this hack entirely unneccessary for SNS notifcations. You
might still find it useful if you like the idea of using a Lambda
function to process jobs on an SQS queue.
But I think this doesn't solve my problem, an SNS notification can invoke the Lambda function but I don't see how I can create a notification when a message is received in the SQS queue.
Thanks
There are couple of Strategies which can be used to connect the dots, (A)Synchronously or Run-Sleep-Run to keep the data process flow between SNS, SQS, Lambda.
Strategy 1 : Have a Lambda function listen to SNS and process it in real time [Please note that an SQS Queue can subscribe to an SNS Topic - which would may be helpful for logging / auditing / retry handling]
Strategy 2 : Given that you are getting data sourced to SQS Queue. You can try with 2 Lambda Functions [Feeder & Worker].
Feeder would be scheduled lambda function whose job is to take items
from SQS (if any) and push it as an SNS topic (and continue doing it forever)
Worker would be linked to listen the SNS topic which would do the actual data processing
We can now use SQS messages to trigger AWS Lambda Functions. Moreover, no longer required to run a message polling service or create an SQS to SNS mapping.
Further details:
https://aws.amazon.com/blogs/aws/aws-lambda-adds-amazon-simple-queue-service-to-supported-event-sources/
AWS SQS is one of the oldest products of Amazon, which only supported polling (long and short) up until June 2018. As mentioned in this answer, AWS SQS now supports the feature of triggering lambda functions on new message arrival in SQS. A complete tutorial for this is provided in this document.
I used to tackle this problem using different mechanisms, and given below are some approaches you can use.
You can develop a simple polling application in Lambda, and use AWS CloudWatch to invoke it every 5 mins or so. You can make this near real-time by using CloudWatch events to invoke lambda with short downtimes. Use this tutorial or this tutorial for this purpose. (This could cost more on Lambdas)
You can consider that SQS is redundant if you don't need to persist the messages nor guarantee the order of delivery. You can use AWS SNS (Simple Notification Service) to directly invoke a lambda function and do whatever the processing required. Use this tutorial for this purpose. This will happen in real-time. But the main drawback is the number of lambdas that can be initiated per region at a given time. Please read this and understand the limitation before following this approach. Nevertheless AWS SNS Guarantees the order of delivery. Also SNS can directly call an HTTP endpoint and store the message in your DB.
I had a similar situation (and now have a working solution deploed). I have addressed it in a following manner:
i.e. publishing events to SNS; which then get fanned-out to Lambda and SQS.
NOTE: This is not applicable to the events that have to be processed in a certain order.
That there are some gotchas (w/ possible solutions) such as:
racing condition: lambda might get invoked before messages is deposited into the queue
distributed nature of SQS queue may lead to returning no messages even though there is a message note1.
The solution to both cases would be to do long-polling of SQS queue; but this does make your lambda bill more expensive.
note1
Short poll is the default behavior where a weighted random set of machines is sampled on a ReceiveMessage call. This means only the messages on the sampled machines are returned. If the number of messages in the queue is small (less than 1000), it is likely you will get fewer messages than you requested per ReceiveMessage call. If the number of messages in the queue is extremely small, you might not receive any messages in a particular ReceiveMessage response; in which case you should repeat the request.
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ReceiveMessage.html
We had some similar requirements so we ended up building a library and open sourcing it to help with SQS to Lambda async. I'm not sure if this fills your particular set of requirements, but thought it might be worth a look: https://read.iopipe.com/sqs-lambda-teaming-up-92c4096be49c