Dynamically schedule lambda - amazon-web-services

Say I have a messaging application. Every minute I want the application to send push notifications if any message is one minute old. Currently, I'm thinking of having a lambda function being called every minute, but since I'm probably not going to have a push notification to send every minute, I was wondering if it was possible to schedule a lambda function to run in the future whenever a message is added to the system.
Basically, it would work like this:
Message arrives to system.
Is a lambda function scheduled to notify of stale messages?
If no, schedule lambda function
If yes, do nothing
When lambda has executed, check if there are more messages in system that haven't gone stale yet.
If yes, schedule lambda function.
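The flow above can be sketched in a few lines. This is only an illustration of the decision logic, not a real scheduler: `NotifierState`, `on_message_arrived` and `on_scheduled_run` are hypothetical names, and the in-memory state stands in for wherever the pending messages and the schedule would actually be stored.

```python
from dataclasses import dataclass, field
from typing import List, Optional

STALE_AFTER_SECONDS = 60  # a message counts as stale once it is one minute old


@dataclass
class NotifierState:
    """In-memory stand-in for the message store and the schedule (hypothetical)."""
    pending_arrivals: List[float] = field(default_factory=list)  # arrival timestamps
    scheduled_for: Optional[float] = None  # when the next run is due, if any


def on_message_arrived(state: NotifierState, now: float) -> None:
    """Record the message and schedule a run only if none is scheduled yet."""
    state.pending_arrivals.append(now)
    if state.scheduled_for is None:
        state.scheduled_for = now + STALE_AFTER_SECONDS


def on_scheduled_run(state: NotifierState, now: float) -> List[float]:
    """Return the stale messages to notify about; reschedule if others remain."""
    stale = [t for t in state.pending_arrivals if now - t >= STALE_AFTER_SECONDS]
    state.pending_arrivals = [
        t for t in state.pending_arrivals if now - t < STALE_AFTER_SECONDS
    ]
    # The next run is due when the oldest remaining message turns stale.
    if state.pending_arrivals:
        state.scheduled_for = min(state.pending_arrivals) + STALE_AFTER_SECONDS
    else:
        state.scheduled_for = None
    return stale
```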

This seems like a good candidate for SNS (Simple Notification Service). Your application can push its notifications to SNS with a payload, and each message then triggers your lambda function (with the payload) as it comes in.
See: http://docs.aws.amazon.com/sns/latest/dg/sns-lambda.html

Related

AWS Lex fulfillment triggers Lambda function twice

I have a Lex bot whose fulfillment is set to my Lambda function (LF1), so every time the intent is fulfilled, LF1 is triggered and the slot parameters are sent to it. LF1 sends the data to SQS, where it is processed and sent as a text message via SNS. It works, but every time I finish a conversation with my bot, my phone receives two messages at the same time. After a careful look at CloudWatch, I found that LF1 was triggered twice every time the intent was fulfilled. The two invocations have the same message, but different request-ids and different message-ids. I really can't figure out what is going wrong. Please help!
[Screenshots: lambda function log; detail of the first trigger; detail of the second trigger]

Is it necessary for a Lambda to delete messages from an SQS queue after processing?

I'm looking at the AWS SQS documentation here: https://docs.aws.amazon.com/sdk-for-net/v3/developer-guide/ReceiveMessage.html#receive-sqs-message
My understanding is that we need to delete the message using AmazonSQSClient.DeleteMessage() once we're done processing it, but is this necessary when we're working with an SQS triggered Lambda?
I'm testing with a Lambda function that's triggered by an SQSEvent, and unless I'm mistaken, it appears that if the Lambda function runs to completion without throwing any errors, the message does NOT return to the SQS queue. If this is true, then I would rather avoid making that unnecessary call to AmazonSQSClient.DeleteMessage().
Here is a similar question from 2019 with the top answer saying that the SDK does not delete messages automatically and that they need to be explicitly deleted within the code. I'm wondering if anything has changed since then.
Thoughts?
The key here is that you are using the AWS Lambda integration with SQS. In that instance AWS Lambda handles retrieving the messages from the queue (making them available via the event object), and automatically deletes the message from the queue for you if the Lambda function returns a success status. It will not delete the message from the queue if the Lambda function throws an error.
When using AWS Lambda integration with SQS you should not be using the AWS SDK to interact with the SQS queue at all.
Update:
Lambda now supports partial batch failure for SQS whereby the Lambda function can return a list of failed messages and only those will become visible again.
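A minimal sketch of such a handler in Python, assuming the event source mapping has `ReportBatchItemFailures` enabled and that message bodies are JSON. The `process` function is a hypothetical placeholder for the real work.

```python
import json
from typing import Any, Dict, List


def process(body: Dict[str, Any]) -> None:
    """Hypothetical business logic; raises on a message it cannot handle."""
    if "text" not in body:
        raise ValueError("message has no text")


def handler(event: Dict[str, Any], context: Any) -> Dict[str, List[Dict[str, str]]]:
    """Process each SQS record and report only the failed ones back to Lambda."""
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only the messages listed here become visible on the queue again.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning an empty `batchItemFailures` list tells Lambda that the whole batch succeeded.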
Amazon SQS doesn't automatically delete a message after retrieving it for you, in case you don't successfully receive the message (for example, if the consumers fail or you lose connectivity). To delete a message, you must send a separate request which acknowledges that you've successfully received and processed the message.
This has not changed and likely won’t change in the future, as there is no way for SQS to definitively know in all cases whether messages have been processed successfully. If SQS started to “assume” what happens downstream, it would risk becoming unreliable in many scenarios.
Yes, otherwise the next time you ask for a set of messages, you will get the same messages back - maybe not on the next call, but eventually you will. You likely don't want to keep processing the same set of messages over and over.

How to trigger AWS Lambda just once on multiple S3 notifications

We are designing a pipeline. We get a number of raw files which come into S3 buckets and then we apply a schema and then save them as parquet.
As of now we are triggering a lambda function for each file written, but ideally we would like to start this process only after all the files are written. How can we trigger the lambda just once?
I encourage you to use an alternative that maintains the separation between the publisher (whoever is writing) and the subscriber (you). The publisher tells you when things are written; it's your responsibility to choose when to process those things. The neat pattern here would be for the publisher to write its files in batches and publish manifests for you to trigger on: i.e. a list which says "I just wrote all these things, you can find them in these places". Since you don't have that / can't change the publisher, I suggest the following:
Send the notifications from the publisher to an SQS queue.
Run your lambda on a schedule; how often is determined by how long you're willing to delay ingestion. If you want data to be delayed at most 5min between being published and getting ingested by your system, set your lambda to trigger every 4min. You can use CloudWatch scheduled events (now Amazon EventBridge rules) for this.
When your lambda runs, poll the queue. Keep going until you accumulate the maximum amount of notifications, X, you want to process in one go, or the queue is empty.
Process. If the queue wasn't empty when you stopped polling, immediately trigger another lambda execution.
Things to keep in mind on the above:
As written, it's not parallel, so if your rate of lambda execution is slower than the rate at which the queue fills up, you'll need to 1. run more frequently or 2. insert a load-balancing step: a lambda that is triggered on a schedule, polls the queue, and calls as many processing lambdas as necessary so that each one gets X notifications.
SNS in general and SQS non-FIFO queues specifically don't guarantee exactly-once delivery. They can send you duplicate notifications. Make sure you can handle duplicate processing cleanly.
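The polling step and the duplicate handling can be sketched together. This is an illustration under assumptions: `receive` is a hypothetical callable wrapping one SQS receive call and returning a list of notifications, each a dict with a `key` field naming the S3 object. Deleting messages after successful processing is left out.

```python
from typing import Callable, Dict, List, Tuple

MAX_BATCH = 100  # "X" in the steps above; the value is an arbitrary assumption


def drain(
    receive: Callable[[], List[Dict[str, str]]],
    max_batch: int = MAX_BATCH,
) -> Tuple[List[Dict[str, str]], bool]:
    """Poll until max_batch unique notifications are collected or the queue is empty.

    Returns (notifications, queue_was_empty). The batch can slightly exceed
    max_batch because every message of the last receive is kept.
    """
    seen = set()
    batch = []
    while len(batch) < max_batch:
        messages = receive()
        if not messages:
            return batch, True
        for message in messages:
            # Standard queues may deliver duplicates, so key on the S3 object.
            if message["key"] not in seen:
                seen.add(message["key"])
                batch.append(message)
    return batch, False
```

When the second return value is `False`, the queue was not drained and another execution should be triggered right away.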
Hook your Lambda up to a Webhook (API Gateway) and then just call it from your client app once your client app is done.
Solutions:
Zip all the files together and have the Lambda unzip the archive.
Write client code that sends the files one by one and triggers the Lambda once the last one is sent.
Have the Lambda check for the files: if it doesn't find all of them, it quits silently; if it finds all of them, it handles them all in one thread.

How can I know if SQS has a new message?

I am writing an application that uses a lambda function to send requests to a Spring Boot application, which in turn calls another service. I have to use SQS (it is a requirement), so SQS sits between the lambda and Spring. The question is: how does my Spring application know when there is a new message in SQS?
I heard about long polling, but I don't know if this is what I need.
Do I need to set up a loop that keeps long polling forever, or something like that?
Is it efficient? I mean, if there are 10 messages in SQS, will the connection be opened ten times?
I also found an answer that uses a while loop here: Check for an incoming message in aws sqs
Thanks
The answer you linked is accurate.
You must write a program that polls SQS for a message (or up to 10 messages). It is more efficient to use long polling because it requires fewer calls.
If you wish to know about a message very quickly, then you will need to poll continually. That is, as soon as a call comes back and says "nothing to receive", you should call it again. To reduce the frequency of these calls, you can set long polling, up to a maximum of 20 seconds. This means that, if there are no messages in the queue, the ReceiveMessage() call will take 20 seconds before it returns a response of "no messages". If, however, a message arrives in the meantime, it will respond immediately. The long polling option is specified when making the ReceiveMessage() request.
If you do not require instant notification, your application could call less often (e.g. every minute, or every few minutes). This would involve fewer calls to Amazon SQS.
When making the ReceiveMessage() call, your application can request up to 10 messages. This means that multiple messages might be returned.
Once your application has finished processing a message, it must call DeleteMessage() to have the message removed from the queue. Requiring an explicit delete is a failsafe: if there is a problem with the application and the message doesn't get processed correctly, it is never deleted and automatically reappears on the queue once its visibility timeout expires.
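A minimal sketch of one polling iteration, written in Python with a boto3-style client for brevity (the AWS SDK for Java used from Spring exposes the same request parameters). The `sqs` argument is assumed to be something like `boto3.client("sqs")`, and `handle` is a hypothetical callback that processes a single message.

```python
from typing import Any, Callable, Dict


def poll_once(sqs: Any, queue_url: str, handle: Callable[[Dict[str, Any]], None]) -> int:
    """Long-poll the queue once and delete each message that was handled successfully."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # upper limit per call
        WaitTimeSeconds=20,      # long polling: wait up to 20 seconds for a message
    )
    handled = 0
    # The "Messages" key is absent when nothing arrived within the wait time.
    for message in response.get("Messages", []):
        try:
            handle(message)
        except Exception:
            # Not deleted, so it reappears after the visibility timeout.
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        handled += 1
    return handled
```

Calling `poll_once` inside a `while True:` loop gives the continual polling described above; because each call blocks for up to 20 seconds, an empty queue costs about three requests per minute.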
This is a great video from the AWS re:Invent conference that explains Amazon SQS (and Amazon SNS) in detail: AWS re:Invent SVC 105: AWS Messaging

Make Lambda function execute now, and/or in an hour

I'm trying to implement an AWS Lambda function that should send an HTTP request. If that request fails (response is anything but status 200) I should wait another hour before retrying (longer than the Lambda stays hot). What's the best way to implement this?
What comes to mind is to persist my HTTP request in some way and being able to trigger the Lambda function again in a specified amount of time in case of a persisted HTTP request. But I'm not completely sure which AWS service that would provide that functionality for me. Is SQS an option that can help here?
Or, can I dynamically schedule Lambda execution for this? Note that the request to be retried should be identical to the first one.
Any other suggestions? What's the best practice for this?
(Lambda function is my option. No EC2 or such things are possible)
You can't directly trigger Lambda functions from SQS (at the time of writing, anyhow).
You could potentially handle the non-200 errors by writing the request data (with appropriate timestamp) to a DynamoDB table that's configured for TTL. You can use DynamoDB Streams to detect when DynamoDB deletes a record and that can trigger a Lambda function from the stream.
This is obviously a roundabout way to achieve what you want but it should be simple to test.
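A rough sketch of both halves, under assumptions: the attribute names `request_id`, `payload` and `expires_at` are hypothetical, `expires_at` is the attribute configured as the table's TTL (a number in epoch seconds), and the stream is configured to include the old image. DynamoDB does not promise to delete an item at the exact moment it expires, so the retry can fire noticeably later than an hour.

```python
import time
from typing import Any, Dict, List, Optional

RETRY_DELAY_SECONDS = 3600  # one hour, as in the question


def build_retry_item(request_id: str, payload: str, now: Optional[float] = None) -> Dict[str, Any]:
    """Item to write when the HTTP request fails; expires_at is the TTL attribute."""
    now = time.time() if now is None else now
    return {
        "request_id": request_id,
        "payload": payload,
        "expires_at": int(now) + RETRY_DELAY_SECONDS,
    }


def expired_payloads(event: Dict[str, Any]) -> List[str]:
    """Extract payloads from stream records for items that TTL removed."""
    payloads = []
    for record in event.get("Records", []):
        identity = record.get("userIdentity", {})
        # TTL deletions are REMOVE events performed by the DynamoDB service itself.
        is_ttl_delete = (
            record.get("eventName") == "REMOVE"
            and identity.get("principalId") == "dynamodb.amazonaws.com"
        )
        if is_ttl_delete:
            payloads.append(record["dynamodb"]["OldImage"]["payload"]["S"])
    return payloads
```

The stream-triggered Lambda would call `expired_payloads(event)` and retry each returned request; items deleted manually are ignored.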
As jarmod mentioned, you cannot trigger Lambda functions directly by SQS. But a workaround (one I've used personally) would be to do the following:
If the request fails, push an item to an SQS Delay Queue (docs)
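This SQS message will only become visible on the queue after a certain delay (you mentioned an hour; note that SQS caps the delay at 15 minutes, so a longer wait needs an extra check, such as a timestamp stored in the message).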
Then have a second scheduled lambda function which is triggered by a cron value of a smaller timeframe (I used a minute).
This second function would then scan the SQS queue and if an item is on the queue, call your first Lambda function (either by SNS or with the AWS SDK) to retry it.
PS: Note that you can put data in an SQS message. Since you mentioned that the retried request needs to be identical, you can store your first function's input there to be reused after an hour.
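A sketch of what the delayed message could look like, assuming a boto3-style `send_message` call. Because SQS limits `DelaySeconds` to 900, this version also stores a `retry_at` timestamp (a hypothetical field name) that the scheduled function checks before retrying; a message that is not yet due would simply be left on the queue.

```python
import json
from typing import Any, Dict

MAX_SQS_DELAY_SECONDS = 900  # SQS caps the per-message delay at 15 minutes
RETRY_DELAY_SECONDS = 3600   # one hour, as in the question


def build_retry_message(request: Dict[str, Any], now: float) -> Dict[str, Any]:
    """Keyword arguments for send_message, carrying the original request unchanged."""
    body = {"request": request, "retry_at": now + RETRY_DELAY_SECONDS}
    return {
        "MessageBody": json.dumps(body),
        "DelaySeconds": min(RETRY_DELAY_SECONDS, MAX_SQS_DELAY_SECONDS),
    }


def is_due(message_body: str, now: float) -> bool:
    """True once the stored retry time has passed."""
    return now >= json.loads(message_body)["retry_at"]
```

The first function would call something like `sqs.send_message(QueueUrl=queue_url, **build_retry_message(request, time.time()))`, and the scheduled function would retry only the messages for which `is_due` returns `True`.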
I suggest that you take a closer look at AWS Step Functions for this. Basically, Step Functions is a state machine that allows you to execute a Lambda function, i.e. a task, in each step.
More information can be found if you log in to your AWS Console and choose "Step Functions" from the "Services" menu. After you press the Get Started button, several example implementations of different Step Functions are presented. First, I would take a closer look at the "Choice state" example (to determine whether or not the HTTP request was successful). If it was not, then proceed with the "Wait state" example.
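A possible shape for such a state machine, built here as a Python dict that serializes to an Amazon States Language definition. The state names and the function ARN are hypothetical, and the sketch assumes the Lambda returns an object with a `statusCode` field. `ResultPath` keeps the original input alongside the result so the retried request sees the same fields as the first attempt.

```python
import json
from typing import Any, Dict


def build_definition(function_arn: str) -> Dict[str, Any]:
    """Task -> Choice -> Wait loop that retries hourly until the status is 200."""
    return {
        "StartAt": "SendRequest",
        "States": {
            "SendRequest": {
                "Type": "Task",
                "Resource": function_arn,
                # Store the Lambda result under $.result instead of replacing the input.
                "ResultPath": "$.result",
                "Next": "CheckStatus",
            },
            "CheckStatus": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.result.statusCode", "NumericEquals": 200, "Next": "Done"}
                ],
                "Default": "WaitOneHour",
            },
            "WaitOneHour": {"Type": "Wait", "Seconds": 3600, "Next": "SendRequest"},
            "Done": {"Type": "Succeed"},
        },
    }


# Hypothetical ARN, for illustration only.
definition_json = json.dumps(
    build_definition("arn:aws:lambda:us-east-1:123456789012:function:send-request"),
    indent=2,
)
```

As written the loop retries indefinitely; a counter in the state input plus a second Choice rule would be one way to bound it.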