Prevent AWS Lambda throttling from async invocation - amazon-web-services

I'm currently modeling a serverless application that extensively uses S3 events as a source for lambda functions.
I've configured the Dead-Letter Queue (DLQ) for lambda to provide some resilience, but I'm still wondering: is there any other way to reduce lambda concurrency and prevent the throttling issue?
Theoretically, AWS docs state that for the asynchronous invocation, Lambda uses the internal queue to process events and retries, which means that there should be no problem for Lambda to process this queue without throttling.
In a non-serverless app, I would organize it with a queue and a limited number of queue workers, however, this is not that trivial for AWS.
Are there any other ways except the DLQ, to handle or avoid the throttling issue?
Thanks!

In my experience, plugging a Lambda directly to S3 events gives many issues, including throttling and message loss.
How we (on production) usually do is to push all the S3 events to an SQS and have the Lambda function be triggered from that (not polling). Then you have more control over the parallel executions, retries, and retention.
You can use SNS instead of SQS, but I don't have any experience with that.

Related

How to manage burst of AWS Cloud Watch events triggering an AWS Lambda function

I have a service which generates a burst of Cloud Watch Events once every hour. These Cloud Watch events (which could be in thousands) each will trigger an AWS Lambda function and ultimately number of concurrent lambdas running can cross the maximum limit. How can I modify my system such that these cloud watch events will be handled gracefully by Lambda functions or if possible somehow I can distribute all these cloud watch events over the rest period of the first service.
P.S. I do not want to modify the limit on concurrent running lambdas.
have you thought about adding these events to SQS instead of consuming Lambda directly and then configure the SQS to call the Lambda function?
This is how you can trigger a Lambda function by SQS queue
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-configure-lambda-function-trigger.html
and you can define the delay in queue consumption by using this article
https://cloudaffaire.com/how-to-configure-delay-queue-in-sqs/
As prior stated you could use SQS, but I too would advise against this because you would still have the same concurrency issue (more here).
Depending on how quickly you require your processing, it may be a good idea to have lambda "poll" an SQS every couple of minutes, then send the batch of messages to a lambda to process via SNS (rather than have lambda trigger off of SQS PutMessage API Calls).

Why is AWS Lambda handler invoked directly by some AWS while Lambda needs to poll others?

Why are some Amazon Web Services configured such that they can make direct calls to Lambda handlers with appropriate permissions, while for others like SQS, lambda needs to poll repeatedly? Why can't we have a provision for invoking Lambda as soon as a message is added to an SQS, instead of polling repeatedly?
I think this is related to scaling.
From Understanding Scaling Behavior - AWS Lambda:
Poll-based event sources that are not stream-based: For Lambda functions that process Amazon SQS queues, AWS Lambda will automatically scale the polling on the queue until the maximum concurrency level is reached, where each message batch can be considered a single concurrent unit. AWS Lambda's automatic scaling behavior is designed to keep polling costs low when a queue is empty while simultaneously enabling you to achieve high throughput when the queue is being used heavily.
When an Amazon SQS event source mapping is initially enabled, Lambda begins long-polling the Amazon SQS queue. Long polling helps reduce the cost of polling Amazon Simple Queue Service by reducing the number of empty responses, while providing optimal processing latency when messages arrive.
When messages are available, Lambda initially launches up to 5 instances of your function, to handle 5 batches simultaneously. Then, Lambda launches up to 60 more instances per minute, up to 1000 total, as long as you have concurrency available at the account and function level.

How to pool AWS SQS with AWS Lambda

At the moment I'm are pooling AWS SQS from our back-end and doing business logic once payload is received.
I would like to move this to AWS Lambda and start automating business logic via SQS/SNS.
As I can not subscribe to AWS SQS events, what is the best practice in implementing SQS pooling with Lambda (node.js)?
SQS doesn't really work well with Lambda since you cannot automatically trigger Lambda functions from SQS queues messages.
I would rather remove the SQS/SNS logic and go for a DynamoDB Streams based solution that would cover the queueing, archiving & Lambda triggering tasks natively: your producer puts messages in a DynamoDB table while your Lambda is triggered for any new entry with Streams (it's an AWS native mechanism)
Of course a Kinesis based solution may be considered as well.
It is possible to even simplify the whole polling process by using the built-in SQS event source for lambda.
Lambda will automatically scale out horizontally consume the messages
in my queue. Lambda will try to consume the queue as quickly and
effeciently as possible by maximizing concurrency within the bounds of
each service. As the queue traffic fluctuates the Lambda service will
scale the polling operations up and down based on the number of
inflight messages.
see AWS Blog
As I can not subscribe to AWS SQS events,
Why?
Lambda can be triggered on SQS messages. Lambda internally handles the scaling, batching and retries for you. Check this AWS documentation on how to use Lambda with SQS.

Using AWS Lambda Functions to Consume AWS SQS Queues

I'm using an AWS Lambda function that is triggered from an SNS event trigger to consume from an SQS queue. When the Lambda function executes, it pulls 10 messages from the queue, processes them, pulls another 10, and so on and so forth - up to a certain time limit that's coded into the Lambda function (less than the max of 5 minutes, obviously).
It's my understanding that a Lambda function triggered by an SNS event is one-to-one, is that correct? In other words, one SNS event won't trigger multiple Lambda functions (up to the maximum concurrent execution limit). There's no scaling based on load.
Are there any other potential solutions, leveraging Lambda, that would let me consume from SQS as frequently/fast as possible? I had considered trying to auto-scale my Lambda functions by leveraging CloudWatch alarms (and SNS event triggers) based on SQS queue size, but it seems like those alarms can fire, at most, every 5 minutes. I've also considered developing a master Lambda function that can automatically execute (many) slave Lambdas based on querying the queue size.
I understand that the more optimal design may be to leverage Kinesis instead of SNS. I may consider incorporating Kinesis in the future, but let's just pretend that Kinesis is not an option at this time.
There is no best way to do this. One approach (which you've kind of already mentioned) is to use CloudWatch and schedule a Lambda function to run every minute (that's the minimum schedule time for Lambda). This Lambda function will then look for new SQS messages and invoke other Lambda functions to handle new message(s). Here is a very good article for that use case: https://cloudonaut.io/integrate-sqs-and-lambda-serverless-architecture-for-asynchronous-workloads/
Personally, I do not recommend triggering your Lambda by SNS for this use case, because SNS doesn't give a full guarantee for delivery and recommend sending the SNS notifications to SQS - which does not solve your problem. From the FAQ's:
[...] If it is critical that all published messages be successfully processed, developers should have notifications delivered to an SQS queue (in addition to notifications over other transports).
Source: https://aws.amazon.com/sns/faqs/
For this kind of processing, instead of SQS if you push messages to Kinesis Stream you should be able to flexibly process(In batches of needed size) the messages.
Note: If you use SQS, after triggering a Lambda function through SNS (or using a Scheduled Lambda), it can invoke inner Lambda functions to check the queue where multiple concurrent inner Lambdas are spawned. However the problem is that its not practical to process SQS items in batches.

Read SQS queue from AWS Lambda

I have the following infrastructure:
I have an EC2 instance with a NodeJS+Express process listening on a port for messages (process 1). Every time the process receives a message it sends it to an SQS queue. Then I have another process in the same machine reading the queue using long polling (process 2). When it finds a message in the queue it inserts the data in a MariaDB database sitting on an RDS instance.
(Just to clarify, messages are generated by users, they send a chunk of data which can contain arbitrary information to the endpoint where the process 1 is listening)
Now I want to put the process that reads the SQS (process 2) in a Lambda function so that the process that writes to the queue and the one that reads from the queue are completely independent. The problem is that I don't know if this is possible.
I know that Lambda function are invoked in response to an event, and the events supported at the moment are S3, SNS, SES, DynamoDB, Kinesis, Cognito, CloudWatch and Cloudformation but NOT SQS.
I was thinking in using SNS notifications to invoke the Lambda function so that every time a message is pushed to the queue, an SNS notification is fired and invokes the Lambda function but after playing a bit with it I've realised that is not possible to create an SNS notification from SQS, it's only possible to write SNS notifications to the queue.
Right now I'm a bit stuck because I don't know how to continue. I have the feeling that is not possible to create this infrastructure due to the current limitations in the AWS services. Is there another way to do what I want or am I in a dead-end?
Just to extend my question with some research I've made, this github repo shows how to read an SQS queu from a Lambda function but the lambda function works only if is fired from the command line:
https://github.com/robinjmurphy/sqs-to-lambda
In the readme, the author mentions the following:
Update: Lambda now supports SNS notifications as an event source,
which makes this hack entirely unneccessary for SNS notifcations. You
might still find it useful if you like the idea of using a Lambda
function to process jobs on an SQS queue.
But I think this doesn't solve my problem, an SNS notification can invoke the Lambda function but I don't see how I can create a notification when a message is received in the SQS queue.
Thanks
There are couple of Strategies which can be used to connect the dots, (A)Synchronously or Run-Sleep-Run to keep the data process flow between SNS, SQS, Lambda.
Strategy 1 : Have a Lambda function listen to SNS and process it in real time [Please note that an SQS Queue can subscribe to an SNS Topic - which would may be helpful for logging / auditing / retry handling]
Strategy 2 : Given that you are getting data sourced to SQS Queue. You can try with 2 Lambda Functions [Feeder & Worker].
Feeder would be scheduled lambda function whose job is to take items
from SQS (if any) and push it as an SNS topic (and continue doing it forever)
Worker would be linked to listen the SNS topic which would do the actual data processing
We can now use SQS messages to trigger AWS Lambda Functions. Moreover, no longer required to run a message polling service or create an SQS to SNS mapping.
Further details:
https://aws.amazon.com/blogs/aws/aws-lambda-adds-amazon-simple-queue-service-to-supported-event-sources/
AWS SQS is one of the oldest products of Amazon, which only supported polling (long and short) up until June 2018. As mentioned in this answer, AWS SQS now supports the feature of triggering lambda functions on new message arrival in SQS. A complete tutorial for this is provided in this document.
I used to tackle this problem using different mechanisms, and given below are some approaches you can use.
You can develop a simple polling application in Lambda, and use AWS CloudWatch to invoke it every 5 mins or so. You can make this near real-time by using CloudWatch events to invoke lambda with short downtimes. Use this tutorial or this tutorial for this purpose. (This could cost more on Lambdas)
You can consider that SQS is redundant if you don't need to persist the messages nor guarantee the order of delivery. You can use AWS SNS (Simple Notification Service) to directly invoke a lambda function and do whatever the processing required. Use this tutorial for this purpose. This will happen in real-time. But the main drawback is the number of lambdas that can be initiated per region at a given time. Please read this and understand the limitation before following this approach. Nevertheless AWS SNS Guarantees the order of delivery. Also SNS can directly call an HTTP endpoint and store the message in your DB.
I had a similar situation (and now have a working solution deploed). I have addressed it in a following manner:
i.e. publishing events to SNS; which then get fanned-out to Lambda and SQS.
NOTE: This is not applicable to the events that have to be processed in a certain order.
That there are some gotchas (w/ possible solutions) such as:
racing condition: lambda might get invoked before messages is deposited into the queue
distributed nature of SQS queue may lead to returning no messages even though there is a message note1.
The solution to both cases would be to do long-polling of SQS queue; but this does make your lambda bill more expensive.
note1
Short poll is the default behavior where a weighted random set of machines is sampled on a ReceiveMessage call. This means only the messages on the sampled machines are returned. If the number of messages in the queue is small (less than 1000), it is likely you will get fewer messages than you requested per ReceiveMessage call. If the number of messages in the queue is extremely small, you might not receive any messages in a particular ReceiveMessage response; in which case you should repeat the request.
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ReceiveMessage.html
We had some similar requirements so we ended up building a library and open sourcing it to help with SQS to Lambda async. I'm not sure if this fills your particular set of requirements, but thought it might be worth a look: https://read.iopipe.com/sqs-lambda-teaming-up-92c4096be49c