Can AWS Lambda functions be invoked as a background service?

I am new to Lambda functions, and am using the serverless framework to create them, specifically using python. I am trying to create a function that will do some heavy processing of a file. The type of work is not important, but each job might take several minutes to complete.
I'd prefer not to have an invocation request open for so long; instead, I'd like the Lambda to return immediately but continue processing after it has returned. Once the job is done, it could, for example, email me the processed file or send me a notification via SNS.
Is this possible with AWS Lambda?

AWS Lambda supports two invocation types:
RequestResponse
Event
RequestResponse will wait until the Lambda invocation is finished before returning the response. Event will invoke the Lambda function asynchronously and immediately return a status indicating that the invocation was successfully queued. You can read more about this in the AWS Lambda documentation on invocation types.

In actual fact, Lambda functions are almost always invoked asynchronously.
For your specific use case, you can have the Lambda be triggered when a file is uploaded to S3 (instead of invoking it programmatically). This means every time a file is uploaded to S3, the Lambda is triggered and will process the file.
That way, no connection is held open for the duration of that processing. The function can then notify you once it has completed.
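As a rough sketch (the bucket layout, topic ARN, and processing step are placeholders), an S3-triggered handler in Python could look something like this:

import boto3

s3 = boto3.client('s3')
sns = boto3.client('sns')

def handler(event, context):
    # S3 delivers one or more records per event; each describes an uploaded object.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']

        # Download the uploaded file into the function's temp storage and process it.
        local_path = '/tmp/' + key.split('/')[-1]
        s3.download_file(bucket, key, local_path)
        # ... heavy processing of local_path goes here ...

        # Hypothetical notification once the job is done (topic ARN is made up).
        sns.publish(TopicArn='arn:aws:sns:us-east-1:123456789012:processing-done',
                    Message='Finished processing ' + key)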

Related

Have Lambda function dispatch a task and return response right away

I'm a little confused since AWS has a lot of features and I do not know what to use.
So, I was creating a Lambda function that does a lot of stuff with a remote website; the process could last at least a minute.
So my idea was to create an API that calls this Lambda, have the Lambda create a unique ID, return a response right away to the client with a token, and save this token to a DB.
Then have the Lambda process all this stuff with the remote website and, when it finishes, save the results to the DB and a bucket (a file), so the result is ready to deliver when the client makes another call to another API that queries the DB for the status of this process.
Thing is, it seems that if a response is returned from the handler, Lambda terminates the execution, so I'm afraid the processing with the remote website will never finish.
I have read that Step Functions is the way to go, but I can't figure out which service will take over the processing. Maybe another Lambda?
Is there another service that is more suitable for this type of work? The process involves scraping a page and downloading files, and is written in Python.
I have read that Step Functions is the way to go, but I can't figure
out which service will take over the processing. Maybe another Lambda?
Yes, another Lambda function would do the background processing. You could also just have the first Lambda invoke a second Lambda directly, without using Step Functions. Or the first Lambda could place the task info in an SQS queue that another Lambda polls.
Is there another service that is more suitable for this type of work?
Lambda functions are fine for this. You just need to split your task into multiple functions somehow, either by one Lambda calling the other directly, or by using Step Functions, or an SQS queue or something.
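As a rough illustration of that split (the table name, worker function name, and payload shape are all assumptions), the API-facing Lambda could record a token, hand the job to a worker Lambda asynchronously, and return immediately:

import json
import uuid

import boto3

dynamodb = boto3.resource('dynamodb')
lambda_client = boto3.client('lambda')

def api_handler(event, context):
    # Create a token the client can use to poll for the result later.
    token = str(uuid.uuid4())
    dynamodb.Table('jobs').put_item(Item={'token': token, 'status': 'PENDING'})

    # Fire-and-forget invocation of the worker Lambda that does the scraping.
    lambda_client.invoke(FunctionName='scrape-worker',
                         InvocationType='Event',
                         Payload=json.dumps({'token': token, 'url': event.get('url')}).encode())

    # Return right away; the worker updates the 'jobs' table when it finishes.
    return {'statusCode': 202, 'body': json.dumps({'token': token})}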

Lambda Low Latency Messaging Options

I have a Lambda that requires messages to be sent to another Lambda to perform some action. In my particular case it is passing a message to a Lambda in order for it to perform HTTP requests and refresh cache entries.
Currently I am relying on the AWS SDK to send an SQS message. The mechanics of this are working fine. The concern I have is that the SQS send call takes around 50ms on average to complete. Since I'm in a Lambda, I am unable to perform this in the background and expect it to complete before the Lambda returns and is frozen.
This is further compounded if I need to make multiple SQS send calls, which is particularly bad as the Lambda is responsible for responding to low-latency HTTP requests.
Are there any alternatives in AWS for communicating between Lambdas that does not require a synchronous API call, and that exhibits more of a fire and forget and asynchronous behavior?
Though there are several approaches to trigger one Lambda from another, in my experience one of the fastest methods is to invoke the target Lambda directly by its name or ARN.
Did you try invoking one Lambda from the other using AWS SDKs?
(e.g. in Python using Boto3, I achieved it like this).
See below; the parameter InvocationType='Event' invokes the target Lambda asynchronously.
The code below takes two parameters: name, which can be either your target Lambda function's name or its ARN, and params, a JSON-serializable object with the input you want to pass. Try it out!
import json

import boto3
from botocore.exceptions import ClientError

def invoke_lambda(name, params):
    """Asynchronously invoke the Lambda identified by name (or its ARN) with params."""
    lambda_client = boto3.client('lambda')
    # The Payload must be bytes, so serialize the input first.
    params_bytes = json.dumps(params).encode()
    try:
        # LogType='Tail' only returns logs for synchronous invocations; it is ignored for 'Event'.
        response = lambda_client.invoke(FunctionName=name,
                                        InvocationType='Event',
                                        LogType='Tail',
                                        Payload=params_bytes)
    except ClientError as e:
        print(e)
        return None
    return response
Hope it helps!
For more, refer to the Lambda invoke method in the Boto3 docs.
Alternatively, you can use Lambda's Async Invoke as well.
It's difficult to give exact answers without knowing what language you are writing the Lambda function in. To at least make "warm" function invocations faster, I would make sure you are creating the SQS client outside of the Lambda event handler so it can reuse the connection. The AWS SDK should use an HTTP connection pool, so it doesn't have to re-establish a connection and go through the SSL handshake and all that every time you make an SQS request, as long as you reuse the SQS client.
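For example, here is a minimal sketch of that connection-reuse idea in Python (the queue URL is a placeholder):

import boto3

# Created once per execution environment, outside the handler, so warm invocations
# reuse the existing HTTP connection instead of re-establishing it each time.
sqs = boto3.client('sqs')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/cache-refresh-tasks'

def handler(event, context):
    # Only the send itself happens inside the handler.
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody='refresh /some/path')
    return {'statusCode': 200}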
If that's still not fast enough, I would have the Lambda function handling the HTTP request pass off the "background" work to another Lambda function, via an asynchronous call. Then the first Lambda function can return an HTTP response, while the second Lambda function continues to do work.
You might also try Lambda Destinations, depending on your use case. With this you don't need to put things in a queue manually.
https://aws.amazon.com/blogs/compute/introducing-aws-lambda-destinations/
But it limits your flexibility. From my point of view, chaining Lambdas directly is an antipattern; if you need that, go for Step Functions.
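For reference, a destination can be configured outside the function's code, for example with Boto3 (the function name and queue ARNs below are placeholders):

import boto3

lambda_client = boto3.client('lambda')

# Route the result of asynchronous invocations to downstream targets automatically,
# instead of pushing to a queue from inside the function.
lambda_client.put_function_event_invoke_config(
    FunctionName='cache-refresher',
    DestinationConfig={
        'OnSuccess': {'Destination': 'arn:aws:sqs:us-east-1:123456789012:refresh-succeeded'},
        'OnFailure': {'Destination': 'arn:aws:sqs:us-east-1:123456789012:refresh-failed'},
    })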

How do you run ('invoke') an AWS lambda function?

I have written and saved a lambda function. I see:
Congratulations! Your Lambda function "lambda_name" has been
successfully created. You can now change its code and configuration.
Choose Test to input a test event when you want to test your function.
Now how do I run it? I cannot see a 'run' or 'invoke' button as I would expect.
Note
The lambda doesn't accept any arguments (it's extremely simple - for the purposes of this question, please presume it's simply 2 * 2, so when I run it, it should not require any inputs and should return 4).
Also note
I can see a tonne of different ways to run the lambda here. I just want the simplest way (preferably a button in the browser).
Sending a test message via the Lambda console will run your Lambda function. The test message that you configure will define what is in the event parameter of your lambda handler function.
Since you are not doing anything with that message, you can send any arbitrary test message and it should work for you. You can just use the default hello world message and give it an arbitrary name.
It should then show you the results: any logs or returned objects right in the AWS Lambda console.
Further reading is available in the AWS Lambda developer guide.
AWS Lambda functions are typically triggered by an event, such as an object being uploaded to Amazon S3 or a message being sent to an Amazon SNS topic.
This is because Lambda functions are great at doing a small task very often. Often, Lambda functions only run for a few seconds, or even less than a second! Thus, they are normally triggered in response to something else happening. It's a bit like when somebody rings your phone, which triggers you to answer the phone. You don't normally answer your phone when it isn't ringing.
However, it is also possible to directly invoke an AWS Lambda function using the Invoke() command in the AWS SDK. For convenience, you can also use the AWS Command-Line Interface (CLI) aws lambda invoke command. When directly invoking an AWS Lambda function, you can receive a return value. This is in contrast to situations where a Lambda function is triggered by an event, in which case there is nowhere to 'return' a value since it was not directly invoked.
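For instance, here is a minimal Python sketch of a direct, synchronous invocation that reads the return value (the function name is the one from the question; the empty payload is an assumption):

import json

import boto3

lambda_client = boto3.client('lambda')

# Synchronous (RequestResponse) invocation: waits for the function to finish
# and returns its result.
response = lambda_client.invoke(FunctionName='lambda_name',
                                InvocationType='RequestResponse',
                                Payload=json.dumps({}).encode())

# The function's return value comes back in the Payload stream.
result = json.loads(response['Payload'].read())
print(result)  # e.g. 4 for the "2 * 2" function described above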

Make Lambda function execute now, and/or in an hour

I'm trying to implement an AWS Lambda function that should send an HTTP request. If that request fails (the response is anything but status 200), I should wait another hour before retrying (longer than the Lambda stays warm). What's the best way to implement this?
What comes to mind is to persist my HTTP request in some way and be able to trigger the Lambda function again after a specified amount of time whenever there is a persisted HTTP request. But I'm not completely sure which AWS service would provide that functionality for me. Is SQS an option that can help here?
Or, can I dynamically schedule Lambda execution for this? Note that the request to be retried should be identical to the first one.
Any other suggestions? What's the best practice for this?
(A Lambda function is my only option; EC2 or similar is not possible.)
You can't directly trigger Lambda functions from SQS (at the time of writing, anyhow).
You could potentially handle the non-200 errors by writing the request data (with appropriate timestamp) to a DynamoDB table that's configured for TTL. You can use DynamoDB Streams to detect when DynamoDB deletes a record and that can trigger a Lambda function from the stream.
This is obviously a roundabout way to achieve what you want but it should be simple to test.
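A rough sketch of that approach (the table and attribute names are made up, and it assumes the table has TTL enabled on expires_at and a stream configured to emit old images):

import time

import boto3

dynamodb = boto3.resource('dynamodb')
retry_table = dynamodb.Table('http-retries')

def schedule_retry(request_data):
    # TTL attribute one hour in the future; DynamoDB deletes the item some time
    # after this epoch timestamp passes (deletion is not precise to the minute).
    retry_table.put_item(Item={'id': request_data['id'],
                               'request': request_data,
                               'expires_at': int(time.time()) + 3600})

def stream_handler(event, context):
    # Triggered by DynamoDB Streams; REMOVE records correspond to expired, deleted items.
    for record in event['Records']:
        if record['eventName'] == 'REMOVE':
            expired_item = record['dynamodb']['OldImage']
            # ... rebuild the original HTTP request from expired_item and retry it ...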
As jarmod mentioned, you cannot trigger Lambda functions directly by SQS. But a workaround (one I've used personally) would be to do the following:
If the request fails, push an item to an SQS delay queue.
This SQS message will only become visible on the queue after a certain delay (you mentioned an hour; note that SQS caps the delay at 15 minutes, so for a full hour you would need to re-queue the message a few times).
Then have a second, scheduled Lambda function that is triggered on a cron schedule with a smaller interval (I used a minute).
This second function would then poll the SQS queue and, if an item is on the queue, call your first Lambda function (either via SNS or with the AWS SDK) to retry it.
PS: Note that you can put data in an SQS message; since you mentioned the retried request should be identical, you can store your first function's input there to be reused after an hour.
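Here is a minimal sketch of those steps (the queue URL and function name are placeholders; 900 seconds is the per-message maximum delay mentioned above):

import json

import boto3

sqs = boto3.client('sqs')
lambda_client = boto3.client('lambda')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/http-retry-queue'

def on_failure(original_request):
    # Step 1: push the failed request onto the delay queue (max 900 seconds per hop).
    sqs.send_message(QueueUrl=QUEUE_URL,
                     MessageBody=json.dumps(original_request),
                     DelaySeconds=900)

def scheduled_poller(event, context):
    # Step 2: a cron-scheduled Lambda polls the queue and retries anything visible.
    messages = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10).get('Messages', [])
    for message in messages:
        lambda_client.invoke(FunctionName='http-request-function',
                             InvocationType='Event',
                             Payload=message['Body'].encode())
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message['ReceiptHandle'])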
I suggest that you take a closer look at AWS Step Functions for this. Basically, Step Functions is a state machine service that allows you to execute a Lambda function, i.e. a task, in each step.
More information can be found if you log in to your AWS Console and choose "Step Functions" from the "Services" menu. By pressing the Get Started button, several example implementations of different Step Functions are presented. First, I would take a closer look at the "Choice state" example (to determine whether or not the HTTP request was successful). If it was not, then proceed with the "Wait state" example.

aws lambda function triggering multiple times for a single event

I am using an AWS Lambda function to convert an uploaded WAV file in a bucket to MP3 format and later move the file to another bucket. It is working correctly. But there's a problem with triggering. When I upload small WAV files, the Lambda function is called once. But when I upload a large WAV file, this function is triggered multiple times.
I have googled this issue and found that Lambda is stateless, so it may be called multiple times (not sure whether the triggers are for multiple uploads or for the same upload).
https://aws.amazon.com/lambda/faqs/
Is there any method to call this function once for a single upload?
Short version:
Try increasing timeout setting in your lambda function configuration.
Long version:
I guess you are running into the lambda function being timed out here.
S3 events are asynchronous in nature, and a Lambda function listening to S3 events is attempted up to three times before the event is discarded. You mentioned your Lambda function executes only once (with no error) for smaller uploads, upon which you do the conversion and re-upload. There is a possibility that the time required for conversion and re-upload in your code is greater than the timeout setting of your Lambda function.
Therefore, you might want to try increasing the timeout setting in your lambda function configuration.
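If you manage the function programmatically, one way to raise the timeout is via the SDK (the function name and the 300-second value are just illustrative; the console and most deployment tools expose the same setting):

import boto3

lambda_client = boto3.client('lambda')

# Give the conversion job more headroom, e.g. 5 minutes instead of the 3-second default.
lambda_client.update_function_configuration(FunctionName='wav-to-mp3-converter',
                                            Timeout=300)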
By the way, one way to confirm that your Lambda function is invoked multiple times is to look in the CloudWatch logs for occurrences of the request id (e.g. 67jh48x4-abcd-11e5-1111-6bqw43hkbea3):
START RequestId: 67jh48x4-abcd-11e5-1111-6bqw43hkbea3 Version: $LATEST
This request id identifies the specific event for which Lambda was invoked, and it stays the same across all executions (retries) triggered by the same S3 event.
Also, you can look at the execution time (Duration) in the following log line, which marks the end of one Lambda execution:
REPORT RequestId: 67jh48x4-abcd-11e5-1111-6bqw43hkbea3 Duration: 244.10 ms Billed Duration: 300 ms Memory Size: 128 MB Max Memory Used: 20 MB
If not a solution, it will at least point your debugging in the right direction. Let me know how it goes.
A Lambda function executing several times for the same event is due to the retry behavior of Lambda, as specified in the AWS documentation:
Your code might raise an exception, time out, or run out of memory. The runtime executing your code might encounter an error and stop. You might run out of concurrency and be throttled.
There could be some error in the Lambda function which makes the client or service invoking it retry.
Use CloudWatch Logs to find the error; resolving it could resolve the problem.
I too faced the same problem; in my case it was because of an application error, and resolving it helped me.
AWS Lambda recently added a setting to change the default retry behavior: set Retry attempts to 0 (default 2) under the Asynchronous invocation settings.
For some in-depth understanding on this issue, you should look into message delivery guarantees. Then you can implement a solution using the idempotent consumers pattern.
The context object contains information on which request ID you are currently handling. This ID won't change even if the same event fires multiple times. You could save this ID for every time an event triggers and then check that the ID hasn't already been processed before processing a message.
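A minimal sketch of that idempotency check (it assumes a DynamoDB table, here called processed-requests, keyed on request_id):

import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb')
processed = dynamodb.Table('processed-requests')

def handler(event, context):
    try:
        # Record the request ID only if we haven't seen it; the write fails on duplicates.
        processed.put_item(Item={'request_id': context.aws_request_id},
                           ConditionExpression='attribute_not_exists(request_id)')
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            return  # Duplicate delivery of the same event; skip processing.
        raise

    # ... process the file exactly once here ...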
In the Lambda configuration, look for "Asynchronous invocation"; there is an option "Retry attempts", which is the maximum number of times to retry when the function returns an error.
Here you can also configure a dead-letter queue.
Multiple retries can also happen due to a read timeout. I fixed it with '--cli-read-timeout 0'.
e.g. if you are invoking the Lambda with the AWS CLI or a Jenkins execute-shell step:
aws lambda invoke --cli-read-timeout 0 --invocation-type RequestResponse --function-name ${functionName} --region ${region} --log-type Tail --payload '{}' out
I was also facing this issue earlier; try setting the retry count to 0 under 'Asynchronous invocation'.