I'm trying to implement an AWS Lambda function that should send an HTTP request. If that request fails (the response is anything but status 200) I should wait an hour before retrying (longer than the Lambda stays warm). What's the best way to implement this?
What comes to mind is persisting the HTTP request in some way and triggering the Lambda function again after a specified amount of time whenever there is a persisted request. But I'm not completely sure which AWS service would provide that functionality. Is SQS an option that can help here?
Or, can I dynamically schedule Lambda execution for this? Note that the request to be retried should be identical to the first one.
Any other suggestions? What's the best practice for this?
(A Lambda function is my only option. No EC2 or anything like that is possible.)
You can't directly trigger Lambda functions from SQS (at the time of writing, anyhow).
You could potentially handle the non-200 errors by writing the request data (with appropriate timestamp) to a DynamoDB table that's configured for TTL. You can use DynamoDB Streams to detect when DynamoDB deletes a record and that can trigger a Lambda function from the stream.
This is obviously a roundabout way to achieve what you want but it should be simple to test.
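A minimal sketch of that pattern, assuming a hypothetical retry-requests table with TTL enabled on an expires_at attribute and a stream configured to include old images; note that TTL deletions are best-effort, so the retry can lag somewhat behind the hour:

```python
import json
import time
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("retry-requests")  # hypothetical table, TTL enabled on "expires_at"

def store_failed_request(request_payload):
    """Called by the main function when the HTTP request returns non-200."""
    table.put_item(Item={
        "request_id": request_payload["id"],
        "payload": json.dumps(request_payload),
        "expires_at": int(time.time()) + 3600,  # TTL attribute, epoch seconds
    })

def stream_handler(event, context):
    """Triggered by the DynamoDB Stream; retries only on TTL deletions (REMOVE events)."""
    for record in event["Records"]:
        if record["eventName"] == "REMOVE":
            payload = json.loads(record["dynamodb"]["OldImage"]["payload"]["S"])
            retry_http_request(payload)  # placeholder for your original request logic
```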
As jarmod mentioned, you cannot trigger Lambda functions directly from SQS. But a workaround (one I've used personally) would be to do the following:
If the request fails, push an item to an SQS Delay Queue (docs)
This SQS message will only become visible on the queue after a certain delay (you mentioned an hour).
Then have a second, scheduled Lambda function that is triggered by a cron rule on a shorter interval (I used a minute).
This second function would then scan the SQS queue and if an item is on the queue, call your first Lambda function (either by SNS or with the AWS SDK) to retry it.
PS: Note that you can put data in an SQS message; since you mentioned that the retried request should be identical to the first, you can store your first function's input in it to be reused after an hour.
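A rough sketch of both pieces, assuming a hypothetical queue URL and function name; one caveat is that SQS caps the per-message delay at 15 minutes, so a full hour would need re-enqueueing or a retry-after timestamp carried in the message body:

```python
import json
import boto3

sqs = boto3.client("sqs")
lambda_client = boto3.client("lambda")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/retry-queue"  # hypothetical

def push_failed_request(original_event):
    """Called by the first function when the HTTP request fails."""
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps(original_event),
        DelaySeconds=900,  # SQS caps per-message delay at 15 minutes
    )

def scheduled_poller(event, context):
    """Runs every minute on a scheduled rule; re-invokes the first function."""
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
    for msg in resp.get("Messages", []):
        lambda_client.invoke(
            FunctionName="first-function",  # hypothetical name
            InvocationType="Event",         # asynchronous retry
            Payload=msg["Body"].encode(),
        )
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```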
I suggest that you take a closer look at AWS Step Functions for this. Basically, Step Functions is a state machine service that lets you execute a Lambda function, i.e. a task, in each step.
More information can be found if you log in to your AWS Console and choose "Step Functions" from the "Services" menu. Pressing the Get Started button presents several example implementations of different Step Functions. First, I would take a closer look at the "Choice state" example (to determine whether or not the HTTP request was successful). If it was not, proceed with the "Wait state" example.
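A minimal state machine sketch along those lines, created here with boto3; the function ARN, role ARN, and the assumption that the task Lambda returns a statusCode field are all hypothetical:

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# Hypothetical workflow: call the HTTP-request Lambda, check the status code,
# and on failure wait an hour before looping back to retry.
definition = {
    "StartAt": "SendRequest",
    "States": {
        "SendRequest": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:send-request",
            "Next": "CheckStatus",
        },
        "CheckStatus": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.statusCode", "NumericEquals": 200, "Next": "Done"}
            ],
            "Default": "WaitAnHour",
        },
        "WaitAnHour": {"Type": "Wait", "Seconds": 3600, "Next": "SendRequest"},
        "Done": {"Type": "Succeed"},
    },
}

sfn.create_state_machine(
    name="http-retry",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/step-functions-role",  # hypothetical role
)
```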
Related
I have an AWS Lambda function that gets details for multiple ids via a REST API. The problem is that the API only accepts one id per call. From my observation, a job can only handle around 30 ids, otherwise it doesn't finish or hits my 10-minute time limit. Currently my ids can go as high as 200 per job, so I'm thinking of a way to resolve this issue.
So far I'm thinking of using Step Functions so I can run the job asynchronously and just chunk my ids into multiple payloads, but I'm not sure how I can pass the ids/payloads from Lambda to Step Functions. Another solution I'm considering is invoking the same Lambda with chunked ids, but I'm afraid that would turn into runaway recursion.
Any other suggestions or AWS services I can use to fix this?
I would have a process that dumps all the IDs into an SQS queue. Then have a Lambda function that uses the SQS queue as an event source. Lambda will then automatically spin up multiple instances of your Lambda function, passing each one a batch of IDs to process.
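A minimal sketch of that setup, assuming a hypothetical queue and a placeholder fetch_details() call for the one-id-per-call API:

```python
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/id-queue"  # hypothetical

def enqueue_ids(ids):
    """Dump all IDs onto the queue, 10 per SendMessageBatch call (the API maximum)."""
    for i in range(0, len(ids), 10):
        chunk = ids[i:i + 10]
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[{"Id": str(n), "MessageBody": json.dumps({"id": item_id})}
                     for n, item_id in enumerate(chunk)],
        )

def handler(event, context):
    """Lambda with the SQS queue as event source; receives a batch of records."""
    for record in event["Records"]:
        item_id = json.loads(record["body"])["id"]
        fetch_details(item_id)  # placeholder for your existing one-id-per-call request
```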
I have an SQS queue which fills up with messages throughout the day, and I want to start processing all the messages at a specific time. The scenario would be:
Between 9AM and 5PM the queue would receive messages
At 6PM the messages should be processed by a lambda
I was thinking of:
Enabler: Lambda A, which will be executed by a CloudWatch / EventBridge rule at 6PM. This Lambda would create an SQS trigger for Lambda C
Disabler: Lambda B, which will be executed by a CloudWatch / EventBridge rule at 8PM. This Lambda would remove the SQS trigger of Lambda C
Executor: Lambda C, which processes the messages in the queue
Is this the best way to do this?
I would aim for the approach that requires the least complexity / smallest changes to your Lambda. You could use the AWS SDK to enable/disable your Lambda's subscription rather than actually deleting and recreating it. See this question on how to do so, and specifically the Enabled parameter of the updateEventSourceMapping() method in the docs it links to:
Enabled — (Boolean)
When true, the event source mapping is active. When false, Lambda
pauses polling and invocation.
Default: True
The advantage is that the only thing you're changing is the enabled flag - everything else (the SQS-Lambda subscription, if you will) is unchanged.
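In Python/boto3 the equivalent call is update_event_source_mapping; a minimal sketch of the enabler and disabler, assuming you already have the UUID of the existing SQS-to-Lambda-C mapping:

```python
import boto3

lambda_client = boto3.client("lambda")
MAPPING_UUID = "your-event-source-mapping-uuid"  # obtainable via list_event_source_mappings()

def enabler_handler(event, context):
    """Lambda A, run at 6PM: resume polling and invocation."""
    lambda_client.update_event_source_mapping(UUID=MAPPING_UUID, Enabled=True)

def disabler_handler(event, context):
    """Lambda B, run at 8PM: pause polling; the mapping itself is left in place."""
    lambda_client.update_event_source_mapping(UUID=MAPPING_UUID, Enabled=False)
```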
This approach still has the danger that if the enabler/disabler Lambdas fail, your processing will not occur during your target hours. In particular, I'm not personally confident in the success rate of AWS's self-mutating operations - this might just be my bias, but my experience leans toward "infra changes tend to fail more often than regular AWS logic".
It's worth considering whether you really need this implementation, or whether the time-based aggregation can be done on the results (e.g., let this Lambda processing run on events as they come in and write the output to some holding pen, then at Xpm when you trust all events have come in, move the results for today from the holding pen into your main output).
This approach may be safer, in that a failed trigger on the "moving" step could be easier / faster to recover from than a failed trigger on the above "process all my data now" step, and it would not depend on changing the Lambda definition.
I have 3 AWS Lambda functions taking the following form:
Lambda function 1: Reads from an SQS queue and puts a Message on an SQS queue (the incoming and outgoing message formats are different)
Lambda function 2: Reads the message from Lambda function 1, and puts a Message on an SQS queue (the incoming and outgoing message formats are different)
Lambda function 3: Reads the message from Lambda function 2, and updates storage.
There are 3 queues involved and the message format (structure) in each queue is different; however, the messages share one uniqueId, which can be used to relate them to each other. So my question is: is there any way, in SQS or in some other tool, to track the messages? What I'm specifically looking for is things like:
Time the message was entered into the queue
Time the message was taken by the Lambda function for processing
My problem is that the 3 Lambda functions individually complete within a couple of milliseconds, but the end-to-end execution time is far too long; I suspect that the messages are spending too long in transit.
I'm open to any other ideas on optimisation.
AWS Step Functions is specifically designed for passing information between Lambda functions and orchestrating the whole process.
However, you would need to change the way you have written your functions to take advantage of Step Functions.
Since your only real desire is to explore why it takes "way too long", then AWS X-Ray should be a good way to gather this information. It can track a single transaction end-to-end through each process. I think it's just a matter of including a library and activating X-Ray.
See: AWS Lambda and AWS X-Ray - AWS X-Ray
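If the functions are in Python, the instrumentation is roughly this small (a sketch; it assumes the aws-xray-sdk package is bundled with the function and Active tracing is enabled on each one):

```python
from aws_xray_sdk.core import patch_all

patch_all()  # instruments boto3 and other supported libraries, so the SQS send/receive
             # calls show up as subsegments of the end-to-end trace

def handler(event, context):
    # existing processing logic; the Lambda invocation itself is traced automatically
    ...
```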
Or, just start doing some manual investigation in the log files. The logs show how long each function takes to run, so you should be able to identify whether the time is spent within a particular Lambda function or between functions, waiting for them to trigger.
We are designing a pipeline. We get a number of raw files which come into S3 buckets; we then apply a schema and save them as Parquet.
As of now we are triggering a Lambda function for each file written, but ideally we would like to start this process only after all the files are written. How can we trigger the Lambda just once?
I encourage you to use an alternative that maintains the separation between the publisher (whoever is writing) and the subscriber (you). The publisher tells you when things are written; it's your responsibility to choose when to process those things. The neat pattern here would be for the publisher to write its files in batches and publish manifests for you to trigger on: i.e. a list which says "I just wrote all these things, you can find them in these places". Since you don't have that / can't change the publisher, I suggest the following:
Send the notifications from the publisher to an SQS queue.
Schedule your Lambda to run on a schedule; how often is determined by how long you're willing to delay ingestion. If you want data to be delayed at most 5 minutes between being published and being ingested by your system, set your Lambda to trigger every 4 minutes. You can use a CloudWatch Events / EventBridge scheduled rule for this.
When your Lambda runs, poll the queue. Keep going until you accumulate the maximum number of notifications, X, that you want to process in one go, or the queue is empty (see the sketch after this list).
Process them. If the queue wasn't empty when you stopped polling, immediately trigger another Lambda execution.
Things to keep in mind on the above:
As written, it's not parallel, so if your rate of lambda execution is slower than the rate at which the queue fills up, you'll need to 1. run more frequently or 2. insert a load-balancing step: a lambda that is triggered on a schedule, polls the queue, and calls as many processing lambdas as necessary so that each one gets X notifications.
SNS in general and SQS non-FIFO queues specifically don't guarantee exactly-once delivery. They can send you duplicate notifications. Make sure you can handle duplicate processing cleanly.
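A rough sketch of the polling step described above, with a hypothetical queue URL, a hypothetical X of 100, and a placeholder process() for your ingestion logic:

```python
import boto3

sqs = boto3.client("sqs")
lambda_client = boto3.client("lambda")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/manifest-queue"  # hypothetical
MAX_NOTIFICATIONS = 100  # the "X" above

def handler(event, context):
    """Scheduled poller: drain up to X notifications, process, then re-trigger if needed."""
    notifications = []
    while len(notifications) < MAX_NOTIFICATIONS:
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10,
                                   WaitTimeSeconds=1)
        messages = resp.get("Messages", [])
        if not messages:
            break  # queue is empty
        notifications.extend(messages)

    process(notifications)  # placeholder for your ingestion logic

    for msg in notifications:
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

    if len(notifications) >= MAX_NOTIFICATIONS:
        # the queue may still have items: immediately start another execution
        lambda_client.invoke(FunctionName=context.function_name,
                             InvocationType="Event", Payload=b"{}")
```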
Hook your Lambda up to a Webhook (API Gateway) and then just call it from your client app once your client app is done.
Solutions:
Zip all the files together and have the Lambda unzip them
Create a UI that sends the files one by one and triggers the Lambda once the last one is sent
Have the Lambda check for the files: if it doesn't find all of them, quit silently; if it finds all of them, handle them all in one run
My AWS Lambda function (in Python) is called when an object 123456 is created in S3's input_bucket; it does a transformation on the object and saves it in output_bucket.
I would like to notify my main application whether the request was successful or unsuccessful. For example, POST http://myapp.com/successful/123456 if the processing is successful and http://myapp.com/unsuccessful/123456 if it's not.
One solution I thought of is to create a second AWS Lambda function that is triggered by a put event in output_bucket and does the successful POST request. This only solves half of the problem, because I can't trigger the unsuccessful POST request.
Maybe AWS has a more elegant solution using a Lambda parameter, or a service that deals with these types of notifications. Any advice or pointer in the right direction will be greatly appreciated.
A few possible solutions that I see as elegant:
Using an SNS topic: From your transformation Lambda, publish to an SNS topic with a success/failure message; SNS will then call an HTTP/HTTPS endpoint with the message payload. The advantage here is that your transformation Lambda is loosely coupled with the endpoint trigger and connected only through messaging.
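A minimal sketch of that publish call, assuming a hypothetical topic ARN with an HTTPS subscription pointing at your application, and a placeholder transform_and_save() for the existing logic:

```python
import json
import boto3

sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:transform-status"  # hypothetical topic
# An HTTPS subscription on this topic points at your application's endpoint.

def handler(event, context):
    key = event["Records"][0]["s3"]["object"]["key"]  # e.g. "123456"
    try:
        transform_and_save(key)  # placeholder for the existing transformation logic
        status = "successful"
    except Exception:
        status = "unsuccessful"
    sns.publish(
        TopicArn=TOPIC_ARN,
        Message=json.dumps({"object": key, "status": status}),
    )
```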
Using Step Functions:
You could arrange to run a Lambda function every time a new object is uploaded to an S3 bucket. This function can then kick off a state machine execution by calling StartExecution. The advantage of using Step Functions is that you can coordinate the components of your application as a series of steps in a visual workflow.
I don't think there is a more elegant AWS solution unless you re-architect: something like having your Lambda send a message with a STATUS to SQS or some intermediary messaging service, and then the intermediary invoking a POST to your application.
If you still want to go with your approach, you might need to configure a dead-letter queue to handle error cases, as described here (note that the use cases described there are not comprehensive, so make sure it covers yours).