How to block lambda on event and timeout after expected time window - amazon-web-services

Is there a way for AWS Lambda to wait (block, rather than run or sleep) on an event for a certain amount of time (say 10 hours), so that if the event isn't received within that window, the Lambda times out and raises an error? If this isn't possible with Lambda, can it be achieved with other AWS services such as EventBridge or Step Functions?

Lambda's maximum run time is only 15 minutes. You can, however, set up other Lambdas that use the SDK to do what you want. With the SDK you can disable or redirect the endpoint that the Lambda you want to "sleep" sits behind, or you can disable its invocations. My suggestion would be to store a variable in SSM Parameter Store and update it so that the Lambda bypasses all of its own logic while the variable is set.
You can then use the SDK to set up a TTL or timed event that re-enables the Parameter Store variable after a given time (through another Lambda).

When the delay is less than 30 seconds or so, then you can just wait inside the lambda. For longer delays, you should send an SQS message with a delay, and configure your lambda to process it.
The maximum delay you can get from SQS is only 15 minutes, but you can just pick it up and resend with a new delay if you need longer. It'll be very inexpensive.
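The resend approach could look roughly like the sketch below. It assumes the message body carries a `due_at` ISO-8601 timestamp (a field name invented for this example) and that `sqs_client` is a boto3 SQS client or anything with a compatible `send_message` method.

```python
import json
from datetime import datetime, timezone

MAX_SQS_DELAY_SECONDS = 900  # SQS caps DelaySeconds at 15 minutes


def next_delay_seconds(due_at: datetime, now: datetime) -> int:
    """Delay for the next hop: capped at the SQS maximum, 0 once due."""
    remaining = int((due_at - now).total_seconds())
    return max(0, min(MAX_SQS_DELAY_SECONDS, remaining))


def handle_message(body: dict, sqs_client, queue_url: str, now: datetime) -> str:
    """Resend the message if it is not due yet, otherwise report it as due.

    body["due_at"] is a hypothetical field chosen for this sketch.
    """
    due_at = datetime.fromisoformat(body["due_at"])
    delay = next_delay_seconds(due_at, now)
    if delay > 0:
        # Not due yet: put the same payload back with a fresh delay.
        sqs_client.send_message(
            QueueUrl=queue_url,
            MessageBody=json.dumps(body),
            DelaySeconds=delay,
        )
        return "requeued"
    return "due"
```

In a Lambda handler you would call `handle_message` for each record with `now=datetime.now(timezone.utc)` and only run the real work when it returns `"due"`. A 10 hour wait works out to about 40 hops.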

Related

What's the difference between the SQS batch window duration and ReceiveMessage wait time seconds?

You can specify SQS as an event source for Lambda functions, with the option of defining a batch window duration.
You can also specify the WaitTimeSeconds for a ReceiveMessage call.
What are the key differences between these two settings?
What are the use cases?
They're fundamentally different.
The receive message wait time setting is a setting that determines if your application will be doing long or short polling. You should (almost) always opt for long polling as it helps reduce your costs; the how is in the documentation.
It can be set in 2 different ways:
on the queue level, by setting the ReceiveMessageWaitTimeSeconds attribute
on the request level, by setting the WaitTimeSeconds parameter on your ReceiveMessage calls
It determines how long your application will wait for a message to become available in the queue before returning an empty result.
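As a rough illustration of the two places this can be set, here is a sketch using boto3-style calls. The client is passed in, so any object with `receive_message` and `set_queue_attributes` methods works; the function names are made up for this example.

```python
def receive_with_long_polling(sqs_client, queue_url: str, wait_seconds: int = 20) -> list:
    """One long-poll call; returns [] if nothing arrived within wait_seconds."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("SQS accepts WaitTimeSeconds from 0 to 20")
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=wait_seconds,  # per-call setting
    )
    # An empty receive comes back without a "Messages" key.
    return response.get("Messages", [])


def enable_queue_level_long_polling(sqs_client, queue_url: str, wait_seconds: int = 20) -> None:
    """Set the queue-wide default used when a call does not pass WaitTimeSeconds."""
    sqs_client.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={"ReceiveMessageWaitTimeSeconds": str(wait_seconds)},
    )
```

With boto3 you would pass `boto3.client("sqs")` as `sqs_client`. The call returns as soon as a message is available, so the wait time is an upper bound rather than a fixed pause.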
On the other hand, you can configure an SQS queue as an event source for Lambda functions by adding it as a trigger.
When creating an SQS trigger, you have 2 optional fields:
batch size (the number of messages in each batch to send to the function)
batch window (the maximum amount of time to gather SQS messages before invoking the function, in seconds)
The batch window field sets the MaximumBatchingWindowInSeconds attribute of the SQS event source mapping.
It's the maximum amount of time, in seconds, that the Lambda poller waits to gather the messages from the queue before invoking the function. The batch window just ensures that more messages have accumulated in the SQS queue before the Lambda function is invoked. This increases the efficiency and reduces the frequency of Lambda invocations, helping you reduce costs.
It's important to note that it's defined as the maximum as it's not guaranteed.
As per the docs, your Lambda function may be invoked as soon as any of the below are true:
the batch size limit has been reached
the batching window has expired
the batch reaches the payload limit of 6 MB
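The three conditions above can be pictured with a toy model. This is only a sketch of the rule as described, not the Lambda poller's actual implementation; the `PendingBatch` type and the byte value used for 6 MB are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_PAYLOAD_BYTES = 6 * 1024 * 1024  # assumed byte value for the 6 MB limit


@dataclass
class PendingBatch:
    message_count: int     # messages gathered so far
    payload_bytes: int     # combined size of those messages
    seconds_waited: float  # time since the poller started gathering


def should_invoke(batch: PendingBatch, batch_size: int, batch_window_seconds: float) -> bool:
    """True as soon as any one of the three conditions is met."""
    return (
        batch.message_count >= batch_size
        or batch.seconds_waited >= batch_window_seconds
        or batch.payload_bytes >= MAX_PAYLOAD_BYTES
    )
```

For example, with a batch size of 250 and a window of 65 seconds, a batch of 250 small messages gathered after 2 seconds already qualifies, which is why the window is only a maximum.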
To conclude, both features are used to control how long something waits but the resulting behaviour differs.
In the first case, you're controlling how long the poller (your application) may wait for a message to arrive in your SQS queue before the call returns. You could set this value to 10 seconds, but if a message is detected on the queue after 5 seconds, the call will return immediately. You can change this value per ReceiveMessage call, or set a universal value at the queue level. You can take advantage of long (or short) polling with or without Lambda functions, as it's available via the AWS API, console, CLI and any official SDK.
In the second case, you're controlling how long the poller (the built-in Lambda poller) may wait before actually invoking your Lambda to process the messages. You could set this value to 10 seconds, and even if a message is detected on the queue after 5 seconds, it may still not invoke your Lambda. When your function is actually invoked depends on the batch size and payload limits. This value is naturally set at the Lambda level and not per message. This option is only available when using Lambda functions.
You can’t use both together as long/short polling is for a constantly running application or one-off calls. A Lambda function cannot poll SQS for more than 15 minutes and that is with a manual invocation.
For Lambda functions, you would use native SQS event sourcing and for any other service/application/use case, you would manually integrate SQS.
They're the same in the sense that both ultimately aim to help you reduce costs, but they're very different in terms of where you can use them.

Recursive AWS lambda with updating state

I have an AWS Lambda that polls an external server for new events every 6 hours. On every call, if there are any new events, it publishes the updated total number of events polled to an SNS topic. So I essentially need to call the Lambda at fixed intervals but also pass a counter state across calls.
I'm currently considering the following options:
Store the counter somewhere on EFS/S3, but it seems overkill for a simple number
EventBridge, which would be OK to schedule the execution, but doesn't store state across calls
A Step Function with a loop + wait on the Lambda would do it, but it doesn't seem to be the most efficient/cost-effective way to do it
Use SQS with a delay so that the Lambda essentially triggers itself, passing the updated state. Again, I don't think this is the most effective, and to actually get to the 6 hour delay I would have to implement some checks/delays within the Lambda, as the max delay for SQS is 15 minutes
What would be the best way to do it?
For scheduling Lambda at intervals, you can use CloudWatch Events. Scheduling Lambda using Serverless framework is a breeze. A cronjob type statement can schedule your lambda call. Here's a guide on scheduling: https://www.serverless.com/framework/docs/providers/aws/events/schedule
As for saving data, you can use AWS Systems Manager Parameter Store. It's a simple key-value store that suits such a small amount of data.
https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html
Or you can save it in DynamoDB. Since the data is small and the access frequency is low, you won't be charged much, and there's no hassle of reading files or parsing.
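A minimal sketch of the Parameter Store option, assuming the parameter already exists (for instance created once with the value "0") and that `ssm_client` is a boto3 SSM client or a compatible object. The function names are invented for this example.

```python
def read_counter(ssm_client, name: str) -> int:
    """Read the running total stored as a plain String parameter."""
    response = ssm_client.get_parameter(Name=name)
    return int(response["Parameter"]["Value"])


def add_to_counter(ssm_client, name: str, new_events: int) -> int:
    """Add this run's new events to the stored total and write it back."""
    total = read_counter(ssm_client, name) + new_events
    ssm_client.put_parameter(
        Name=name,
        Value=str(total),
        Type="String",
        Overwrite=True,  # required to update an existing parameter
    )
    return total
```

The scheduled Lambda would call `add_to_counter` only when the poll found new events, then publish the returned total to SNS. Since a single scheduled invocation runs every 6 hours, concurrent writers are not a concern in this setup.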

SQS batching for Lambda trigger doesn't work as expected

I have 2 Lambda functions and an SQS queue in between.
The first Lambda sends the messages to the queue.
The second Lambda has a trigger for this queue with a batch size of 250 and a batch window of 65 seconds.
I expect the second Lambda to be triggered in batches of 250 messages about every 65 seconds. In the second Lambda I'm calling a 3rd party API that is limited to 250 API calls per minute (I get 250 tokens per minute).
I tested this setup with 32,000 messages being added to the queue, and the second Lambda didn't pick up the messages in batches as expected. At first it got executed for 15k messages, and then there were not enough tokens, so it did not process the remaining messages.
The 3rd party API is based on a token bucket with a fill rate of 250 per minute and a maximum capacity of 15,000. It managed to process the first 15,000 messages due to the bucket capacity and then didn't have enough capacity to handle the rest.
I don't understand what went wrong.
The misunderstanding is probably related to how Lambda handles scaling.
Whenever there are more events than a single Lambda execution context/instance can handle, Lambda just creates more execution contexts/instances to process these events.
What probably happened is that Lambda saw there are a bunch of messages in the queue and it tries to work on these as fast as possible. It created a Lambda instance to handle the first event and then talked to SQS and asked for more work. When it got the next batch of messages, the first instance was still busy, so it scaled out and created a second one that worked on the second batch in parallel, etc. etc.
That's how you ended up going through your token budget in a few minutes.
You can limit how many functions Lambda is allowed to execute in parallel by using reserved concurrency - here are the docs for reference. If you set the reserved concurrency to 1, there will be no parallelization and only one Lambda is allowed to work on the messages.
This however opens you up to another issue. If that single Lambda takes less than 60 seconds to process the messages, Lambda will call it again with another batch ASAP and you might go over your budget again.
At this point a relatively simple approach would be to make sure that your lambda function always takes about 60 seconds by adding a sleep for the remaining time at the end.
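The padding idea might be sketched like this. `call_third_party_api` is a placeholder for the real work, and the function timeout is assumed to be comfortably above 60 seconds so that the sleep itself does not cause a timeout.

```python
import time


def pad_to_minimum(started_at: float, minimum_seconds: float = 60.0,
                   now=time.monotonic, sleep=time.sleep) -> float:
    """Sleep for whatever is left of minimum_seconds; return the time slept."""
    remaining = minimum_seconds - (now() - started_at)
    if remaining > 0:
        sleep(remaining)
        return remaining
    return 0.0


def call_third_party_api(record: dict) -> None:
    """Placeholder for the rate-limited API call (hypothetical)."""


def handler(event, context):
    started_at = time.monotonic()
    for record in event.get("Records", []):
        call_third_party_api(record)
    # Hold the single allowed instance until the minute is used up,
    # so the next batch of up to 250 messages cannot start early.
    pad_to_minimum(started_at)
```

This only throttles as intended when combined with a reserved concurrency of 1, as described above. The padded time is billed like any other execution time.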

Processing AWS SQS messages with separate Lambda at a time

Like the title suggests, I have a scenario that I would like to explore but do not know how to go about it.
I have a Lambda function, processCSVFile. I also have an SQS queue that gets populated at a set time every day with links to CSV files in S3, let's say about 2000 messages. Once the queue has the messages, I want to process 25 of them at a time.
The scenario I am looking for is to process 25 messages concurrently, I want the 25 messages to be processed by 25 lambda invocations separately. I thought I could use SendMessageBatch function in SQS but this only delivers messages to the queue, it does not seem to apply to my use case.
My question is, am I able to perform the action explained above and if it is possible, what documentation or use cases can explain what I am looking for.
Also, if this use case is impossible, what do you recommend as an alternative way to do the processing I want done concurrently.
To process 25 messages from Amazon SQS with 25 concurrent Lambda functions (1 message per running Lambda function), you would need:
A maximum concurrency of 25 configured for the Lambda function (otherwise it might go higher than this when more messages are available)
A batch size of 1 configured on the Lambda trigger so that SQS only passes it one message at a time
See:
AWS Lambda Function Scaling (Maximum concurrency)
Configuring a Queue as an Event Source (Batch size)
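The two settings above could be applied along these lines. This sketch reads "maximum concurrency" as reserved concurrency, which is an assumption on my part; `lambda_client` is expected to be a boto3 Lambda client or a compatible object, and the queue ARN is whatever your queue uses.

```python
def configure_one_message_per_invocation(lambda_client, function_name: str,
                                         queue_arn: str, concurrency: int = 25) -> None:
    """Attach the queue with a batch size of 1 and cap parallel executions."""
    # Each invocation receives exactly one SQS message.
    lambda_client.create_event_source_mapping(
        FunctionName=function_name,
        EventSourceArn=queue_arn,
        BatchSize=1,
    )
    # No more than `concurrency` copies of the function run at once.
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=concurrency,
    )
```

With 2000 messages in the queue, this would leave at most 25 invocations of processCSVFile in flight at any moment, each working on a single file link.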
I think that a combination of Lambda's event source mapping for SQS and setting reserved concurrency to 25 could be the way to go.
Lambda uses long polling to prepare message batches for concurrent processing. Thus each invocation of your function could get more than 1 message at a time.
I don't think there is a way to set the event source mapping to serve just one message per batch. If you absolutely must ensure that only one message is processed per invocation, then process one and disregard the others (put them back on the queue).
The reserved concurrency of 25 guarantees that you won't be running more than 25 functions in parallel. If you leave it at its default value, you can run up to whatever free concurrency you have in your account.
Edit:
@JohnRotenstein already confirmed that there is a way to set Lambda to pass one message at a time to your function.
Hope this helps.

SQS - Delivery Delay of 30 minutes

From the documentation of SQS, Max time delay we can configure for a message to hide from its consumers is 15 minutes - http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-delay-queues.html
Suppose I need to hide the messages for a day. What is the pattern?
For example, I want to mimic a daily cron for doing some action.
Thanks
The simplest way to do this is as follows:
SQS.push_to_queue({perform_message_at: "Thursday November 2022"}, delay: 15 mins)
Inside your worker:
message = SQS.poll_messages
if message.perform_message_at > Time.now
  # Not due yet: push it back with the maximum delay, keeping the target time
  SQS.push_to_queue({perform_message_at: message.perform_message_at}, delay: 15 mins)
else
  # Due (or overdue): do the actual work
  process_message(message)
end
Basically push the message back to the queue with the maximum delay and only process it when its processing time is less than the current time.
HTH.
The visibility timeout can go up to 12 hours. I think you can hack something together where you process a message but don't delete it, so that the next time it is processed, 12 hours have passed. So a queue with one message and a visibility timeout of 12 hours gets you a 12 hour cron.
CloudWatch Events (now Amazon EventBridge) is likely a better way to do it. You can create a scheduled rule through the API and have it trigger either a Lambda function or an API call to whatever comes next.
Another way to do it is to use the Wait state in an AWS Step Functions state machine.
https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-wait-state.html
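For illustration, a state machine definition with a Wait state could be built as follows. The state names and the Lambda ARN are placeholders for this sketch; the JSON follows the Amazon States Language fields for `Wait` and `Task` states.

```python
import json


def delayed_task_definition(wait_seconds: int, lambda_arn: str) -> str:
    """Return an Amazon States Language definition: wait, then run a Lambda."""
    definition = {
        "Comment": "Wait for a fixed delay, then run the action",
        "StartAt": "WaitForDelay",
        "States": {
            "WaitForDelay": {
                "Type": "Wait",
                "Seconds": wait_seconds,  # e.g. 86400 for one day
                "Next": "RunAction",
            },
            "RunAction": {
                "Type": "Task",
                "Resource": lambda_arn,  # placeholder ARN supplied by the caller
                "End": True,
            },
        },
    }
    return json.dumps(definition, indent=2)
```

The resulting string is what you would supply as the state machine definition when creating it. A Wait state can also take a `Timestamp` instead of `Seconds` if the action should run at an absolute time.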
In any case, unless you are extremely sure you will never need anything more than 15 minutes, the SQS backdoor to add the delay seems hacky.
You can do this by adding a DLQ to the first queue with maxReceiveCount set to 1.
Add a simple Lambda on the first queue and fail the message via the Lambda. The message will then be moved to the DLQ automatically, and you can consume it from the DLQ.
Both primary queue and DLQ can have max 15 min delay, so finally you get 30 min delay.
So your consumer app receives the message after 30 minutes, without adding any custom logic on it.
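The queue settings this answer describes might look like the sketch below. It only builds the attribute maps; whether the DLQ's own delay is actually applied to messages moved there by the redrive policy is the answer's claim and has not been verified here. The function name is invented for this example.

```python
import json


def delay_chain_attributes(dlq_arn: str) -> tuple:
    """Attribute maps for the primary queue and its DLQ, each with a 15 minute delay."""
    primary = {
        "DelaySeconds": "900",
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": "1",  # one failed receive sends the message to the DLQ
        }),
    }
    dead_letter = {"DelaySeconds": "900"}
    return primary, dead_letter
```

Each map can be passed as `Attributes` to the SQS `create_queue` or `set_queue_attributes` call for the corresponding queue.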
Two thoughts.
Untested. Perhaps publish to an SNS topic that has no SQS queues subscribed. When delivery needs to happen, subscribe the queue to the topic. (I've not done this, and I'm not sure it would work as expected.)
Push messages as files to a central store (like S3). Create a worker that looks at the time created timestamp and decides whether to publish them to a queue or not. If created >= 1d ago, publish.
This was a challenge for us as well and I never found a perfect solution so I ended up building a service to address it. Obviously self promotion here but the system allows you to work around the DelaySeconds limitation and set arbitrary date/times at scale.
https://anticipated.io
One of the challenges of working with Step Functions is the scale of registered state machines (if your system has that requirement). If you use EventBridge to fire them, you run out of allowable rules (the limit is 200 as of this posting). Example: if you need to set 150,000 arbitrary events a month, you run into limits quickly.