I have an AWS Lambda which is triggered by an S3 push event. The lambda will call an API which will trigger a long-running process. I recognize that I can configure S3 to invoke the lambda function asynchronously, and so S3 will not wait for a response, but I am interested to find out if I can configure lambda to call my API asynchronously as well. I don't want lambda waiting for several minutes while the process completes. Can anyone point me to some documentation which outlines this process? Thanks in advance.
I don't think Lambda can do this, nor would I recommend a workaround. There is an article on SenseDeep which talks about this and specifically asks: "So what happens if we simply do not call "await" and thus not wait on the response from our HTTP request?" Its answer is "Strange things happen", which is to say that starting an async call in a Lambda and then returning immediately has unpredictable results.
Why do you need the Lambda to return quickly? If there is a valid reason (for example, you want a push notification that something in S3 has changed right away), then I'd recommend a different pattern.
Updating S3 triggers a lambda
That lambda writes to an SNS topic and does whatever fast action you want
There is a subscriber to the SNS topic that does the long running action you want
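The three steps above could be sketched roughly as follows for the first Lambda. This is a minimal sketch, not a definitive implementation: `make_handler` and the message shape are made up for illustration, and the SNS client is passed in so the function can be exercised without AWS credentials. In a deployment you would pass `boto3.client("sns")` and your own topic ARN.

```python
import json


def make_handler(sns_client, topic_arn):
    """Build a Lambda handler that forwards S3 events to SNS and returns at once.

    sns_client is expected to behave like boto3.client("sns");
    topic_arn is the ARN of a topic you have created (assumption).
    """

    def handler(event, context):
        published = 0
        for record in event.get("Records", []):
            # Forward only a reference to the object, not its content.
            message = {
                "bucket": record["s3"]["bucket"]["name"],
                "key": record["s3"]["object"]["key"],
            }
            sns_client.publish(TopicArn=topic_arn, Message=json.dumps(message))
            published += 1
        # The long-running work happens in the SNS subscriber, not here.
        return {"published": published}

    return handler
```

The handler only waits for the `publish` calls, so its duration does not depend on how long the downstream process takes.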
Related
I need to asynchronously invoke Lambda functions from my EC2 instance. At a high level, several services come to mind (most likely all of them support what I need):
AWS state machines (not sure), Step Functions, ActiveMQ, SQS, SNS. I am aware of the pros and cons of each at a high level, but I am not sure which one I should go for. Please let me know your feedback.
PS: We expect thousands of invocations per second at peak, for very short periods. Concurrency for the Lambda functions is not an issue, as we can ask Amazon to increase the limit along with the burst.
If you want to invoke asynchronously, you cannot use SQS, because with an event source mapping Lambda polls the queue and invokes the function synchronously.
Of the options you listed above, SNS can invoke a Lambda function asynchronously.
A better option would be to write a small piece of code with whichever AWS SDK you are comfortable with, and call the Lambda function asynchronously from that code.
Example of an asynchronous invocation in Python using boto3:
Pass Event as the InvocationType to invoke the Lambda function asynchronously, or RequestResponse to invoke it synchronously.
import boto3

client = boto3.client('lambda')
# payload3 is the JSON-encoded request body (bytes or str) prepared by the caller
response = client.invoke(
    FunctionName="loadSpotsAroundPoint",
    InvocationType='Event',  # 'Event' = asynchronous, 'RequestResponse' = synchronous
    Payload=payload3,
)
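To confirm that the asynchronous request was actually accepted, you can check the status code of the response: Lambda answers 202 for `Event` invocations. A small sketch, with a hypothetical helper name `invoke_async` and the client passed in as a parameter (in practice `boto3.client("lambda")`):

```python
import json


def invoke_async(lambda_client, function_name, payload):
    """Queue an asynchronous invocation; return True if Lambda accepted it."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # For InvocationType="Event" Lambda replies 202 (Accepted) and does not
    # return the function's result.
    return response["StatusCode"] == 202
```

A `True` result only means the event was queued; it says nothing about whether the function itself later succeeds.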
I have a scheduled error handling lambda, I would like to use Serverless technology here as opposed to a spring boot service or something.
The lambda will read from an S3 bucket and process accordingly. The problem is that at times the S3 bucket may hold a high volume of data to be processed, and long-running operations aren't suited to lambdas.
One solution I can think of is to have the lambda read and process one item from the bucket and, on success, trigger another instance of the same lambda unless the bucket is empty/fully processed. The thing I don't like is that this is sequential and quite slow. I also need to be careful not to run too many lambdas at the same time, as we are hitting a REST endpoint as part of the error flow and don't want to overload it with too many requests.
I am thinking it would be nice to have maybe 3 instances of the lambda running at the same time until the bucket is empty, but I am not really sure. Does anyone have any nice patterns that could be used here, or suggestions on best practices?
Thanks
Create an S3 bucket for processing your files.
Enable an S3 -> Lambda trigger. The lambda will then be invoked for every new file in the bucket, and every file is processed separately. https://docs.aws.amazon.com/AmazonS3/latest/user-guide/enable-event-notifications.html
Once a file is processed, you can either delete it or move it somewhere else.
For concurrency, please have a look at reserved concurrency, which caps how many instances of a function can run at the same time: https://docs.aws.amazon.com/lambda/latest/dg/configuration-concurrency.html
Update:
As you still plan to use a scheduler lambda and S3, an alternative is:
Lambda reads/lists only the filenames and puts messages into SQS to process the file.
A new Lambda to consume SQS messages and process the file.
Note: if the files/messages are not too big, I would recommend putting the content itself into SQS to begin with. SQS has built-in recovery mechanics (DLQ, delays, visibility timeout, etc.) that you benefit from more than with plain S3 storage. The second way is to create a message containing only a reference to the file and still use SQS.
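The "file reference" variant of the scheduler lambda might look something like this. It is a sketch under assumptions: the function names and the message format are invented for illustration, and the clients are parameters (in practice `boto3.client("s3")` and `boto3.client("sqs")`). It relies on `send_message_batch` accepting at most 10 entries per call.

```python
import json


def chunked(items, size):
    """Yield consecutive slices of items with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def enqueue_bucket_keys(s3_client, sqs_client, bucket, queue_url):
    """List every key in the bucket and enqueue one SQS message per key."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent when a page (or the bucket) is empty.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))

    # SQS batch sends are limited to 10 messages per request.
    for batch in chunked(keys, 10):
        entries = [
            {"Id": str(i), "MessageBody": json.dumps({"bucket": bucket, "key": key})}
            for i, key in enumerate(batch)
        ]
        sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
    return len(keys)
```

A production version would also inspect the `Failed` list in each batch response and retry those entries.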
I'd separate the lambda that is called by the scheduler from the lambda that is doing the actual processing. When the scheduler calls the first lambda, it can look at the contents of the bucket, then spawn the worker lambdas to process the objects. This way you have control over how many objects you want per worker.
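A rough sketch of that dispatcher, assuming the object keys have already been listed: `dispatch_workers`, the `{"keys": [...]}` payload shape, and the worker function name are all hypothetical, and the client is injected (in practice `boto3.client("lambda")`).

```python
import json


def dispatch_workers(lambda_client, worker_name, keys, keys_per_worker):
    """Asynchronously invoke one worker Lambda per batch of object keys."""
    invoked = 0
    for start in range(0, len(keys), keys_per_worker):
        batch = keys[start:start + keys_per_worker]
        lambda_client.invoke(
            FunctionName=worker_name,
            InvocationType="Event",  # fire and forget; the dispatcher does not wait
            Payload=json.dumps({"keys": batch}).encode("utf-8"),
        )
        invoked += 1
    return invoked
```

Choosing `keys_per_worker` controls how many objects each worker handles; it does not by itself limit how many workers run at once.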
Given your requirements, I would recommend:
Configure an Amazon S3 Event so that a message is pushed to an Amazon SQS queue when the objects are created in the S3 bucket
Schedule an AWS Lambda function at regular intervals that will:
Check that the external service is working
Invoke a Lambda function to process one message from the queue, and keep looping
The hard part would be throttling the second Lambda function so that it doesn't try to send all requests at once (which might impact that external service).
You could probably do this by using a Step Function to trigger Lambda and then, if it was successful, trigger another Lambda function. This could even be done in parallel, such as allowing up to three parallel Lambda executions. The benefit of using Step Functions is that there is no cost for "waiting" for each Lambda to finish executing.
So, the Step Function flow would be something like:
Invoke a "check external service" Lambda function
If it fails, then quit the flow
Invoke the "processing" Lambda function
Get one message
Process the message
If successful, remove the message from the queue
Return success/fail
If it was successful, keep looping until the queue is empty
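The "processing" Lambda function in the flow above could be sketched like this. It is only an illustration: `process_one_message` and the returned status strings are invented here, `process` stands for whatever call you make to the external service, and the SQS client is passed in (in practice `boto3.client("sqs")`).

```python
def process_one_message(sqs_client, queue_url, process):
    """Receive one message, process it, and delete it only on success."""
    response = sqs_client.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
    messages = response.get("Messages", [])
    if not messages:
        # Nothing left: the Step Function can stop looping.
        return {"status": "empty"}

    message = messages[0]
    try:
        process(message["Body"])
    except Exception:
        # Do not delete: the message becomes visible again after the
        # queue's visibility timeout and can be retried.
        return {"status": "failed"}

    sqs_client.delete_message(
        QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
    )
    return {"status": "ok"}
```

The state machine would then branch on the returned status: loop on `ok`, stop on `empty`, and decide how to handle `failed`.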
I have been researching AWS Documentation on how to invoke lambda functions, and I've come across different ways to do that. Mainly, Lambda invocation is done by calling Invoke() function which can be used to invoke lambda functions synchronously or asynchronously.
Currently I am invoking my Lambda functions via HTTP request (as a REST API), but the HTTP request times out after 30 seconds, while asynchronous calls, as far as I know, time out after 15 minutes.
What are the advantages, besides the timeout I have already mentioned, of asynchronous Lambda invocation compared to invoking Lambda with an HTTP request? Also, what are the best (recommended) ways to invoke lambdas in production? In the AWS docs (SDK for Go - https://docs.aws.amazon.com/sdk-for-go/api/service/lambda/#InvokeAsyncInput) I see that InvokeAsyncInput and InvokeAsyncOutput have been deprecated, so I am wondering what an async implementation would actually look like.
Lambda really is about event-driven-computing. This means Lambda always gets triggered in response to an event. This event can originate from a wide range of AWS Services as well as the AWS CLI and SDK.
All of these events invoke the Lambda function and pass some kind of information in the form of an event and a context object. What this event looks like depends on the service that triggered Lambda. You can find more information about the context in this documentation.
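As an illustration of how the event differs by trigger, here is a small sketch that guesses the source from the first record. The helper name `event_source` is made up. It relies on S3 and SQS records carrying an `eventSource` field and SNS records carrying `EventSource` (capital E); events from a direct SDK or CLI invocation contain whatever the caller sent, so they fall through to "unknown".

```python
def event_source(event):
    """Return the triggering service of a record-based event, or "unknown"."""
    records = event.get("Records")
    if not records:
        return "unknown"
    first = records[0]
    # S3 and SQS use "eventSource"; SNS uses "EventSource".
    return first.get("eventSource") or first.get("EventSource") or "unknown"


def handler(event, context):
    # A handler can branch on the source if one function serves several triggers.
    return {"source": event_source(event)}
```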
There is no real "best" way to invoke Lambda - this mostly depends on your use case - if you're building a webservice, let API Gateway invoke Lambda for you. If you want to process new files on S3 - let S3 trigger Lambda. If you're just testing the Lambda function you can invoke it via the CLI. If you have custom software that needs to trigger a Lambda function you can use the SDK. If you want to run Lambda on a schedule, configure CloudWatch events...
Please provide more information about your use case if you require a more detailed evaluation of the available options - right now this is very broad.
I have an AWS Lambda function which does a couple of API calls then saves whatever information it gets to a DynamoDB table.
Would it be good practice for me to send a message to an SQS queue after completing whatever the Lambda was doing? The queue would then trigger the Lambda function to start another process again.
So with regards to processing and the costs involved, is this a good idea or not?
The other idea I had was to trigger the Lambda function using a CloudWatch Event, but the problem is that I want it to start the new process only once the old one has completed. If the Lambda function gets triggered while it is still processing, it's going to stuff up my records in DynamoDB.
So if anyone has a better solution or alternative let me know.
I have a service that uses a JSON file on an S3 bucket for its configuration.
I would like to be able to modify this file, but I'm going to run into a concurrency issue as multiple administrators will be able to write in this file at the same time.
I'm going to use an SNS Topic to trigger a lambda that will write the config changes.
For the moment, I'm going to check the queue every minute and then handle the messages, so that I am sure that I don't have multiple instances of lambda running at the same time and writing in the same file.
Is there any way to have an SNS topic to trigger a lambda function for each message, and then wait for this message to be handled and then move on to the next one?
Cheers,
Julien
You can achieve this by setting the max concurrent executions of your Lambda function to 1. See the documentation for more details about managing concurrency for Lambdas.
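Setting that limit from code could look roughly like this, using the Lambda API's `put_function_concurrency` call. The wrapper name `serialize_function` is hypothetical and the client is a parameter (in practice `boto3.client("lambda")`); the same setting is available in the console as reserved concurrency.

```python
def serialize_function(lambda_client, function_name):
    """Cap the function at one concurrent execution; return the applied limit."""
    response = lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=1,
    )
    return response["ReservedConcurrentExecutions"]
```

With the limit at 1, messages that arrive while an invocation is running are throttled rather than processed in parallel, so it is worth checking how your trigger retries throttled events.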