Here's what I know, or think I know.
In AWS Lambda, the first time you call a function is commonly called a "cold start" -- this is akin to starting up your program for the first time.
If you make a second function invocation relatively quickly after your first, this cold start won't happen again. This is colloquially known as a "warm start".
If a function is idle for long enough, the execution environment goes away, and the next request will need to cold start again.
It's also possible to have a single AWS Lambda function with multiple triggers, for example a single function that handles both API Gateway requests and SQS messages.
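A minimal sketch of what such a dual-trigger handler might look like in Python, using the shape of the incoming event to tell the triggers apart (the process_message and handle_api_request helpers are hypothetical placeholders):

```python
import json

def handler(event, context):
    # SQS deliveries arrive as a batch under "Records", each record
    # tagged with eventSource == "aws:sqs".
    if event.get("Records") and event["Records"][0].get("eventSource") == "aws:sqs":
        for record in event["Records"]:
            process_message(json.loads(record["body"]))  # hypothetical helper
        return  # SQS invocations don't return an HTTP-style response

    # Otherwise, assume an API Gateway proxy event.
    result = handle_api_request(event)  # hypothetical helper
    return {"statusCode": 200, "body": json.dumps(result)}

def process_message(message):
    print("SQS message:", message)

def handle_api_request(event):
    return {"path": event.get("path"), "method": event.get("httpMethod")}
```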
My question: Will AWS Lambda reuse (warm start) an execution environment when different event triggers come in? Or will each event trigger have its own cold start? Or is this behavior not guaranteed by Lambda?
Yes, different triggers will reuse the same containers, since the execution environment is the same for different triggers; the only difference is the event that is passed to your Lambda.
You can verify this by executing your Lambda with two different types of triggers (e.g. API Gateway and the Test function on the Lambda console) and looking at the CloudWatch logs. Each Lambda container creates its own log stream inside your Lambda's log group. You should see both event logs going to the same log stream, which means the second event successfully used the warm container created by the first event.
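To make that comparison easier, you can have the function print its own stream name; in the Python runtime the context object exposes it directly:

```python
def handler(event, context):
    # Stays the same for every invocation served by the same warm container.
    print("log stream:", context.log_stream_name)
    # Changes on every invocation, even within one container.
    print("request id:", context.aws_request_id)
    return "ok"
```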
Related
I have an AWS Lambda function. When it receives only one trigger, it always succeeds. But when it receives more than one trigger, it sometimes throws an error. The first trigger always succeeds.
Can I configure one AWS Lambda function to receive only one trigger?
Can one AWS Lambda function handle multiple triggers at once?
Yes, Lambda functions can handle multiple triggers at once.
when it receives more than one trigger, it sometimes throws an error
This is most probably related to your implementation. Are you doing something different based on the inputs? Is the code behaving differently based on time?
Can I configure one AWS Lambda function to receive only one trigger?
You can limit the concurrency of the Lambda function. If you set it to 1, at most one instance of the Lambda function can be running at any given time.
See: Set Concurrency Limits on Individual AWS Lambda Functions
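If you'd rather set this programmatically than in the console, a boto3 call along these lines should do it (the function name is a placeholder):

```python
import boto3

lambda_client = boto3.client("lambda")

# Reserve a concurrency of 1 so at most one instance of the function
# runs at any given time ("my-function" is a placeholder name).
lambda_client.put_function_concurrency(
    FunctionName="my-function",
    ReservedConcurrentExecutions=1,
)
```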
I have a scheduled error-handling Lambda, and I would like to use serverless technology here as opposed to a Spring Boot service or something similar.
The Lambda will read from an S3 bucket and process accordingly. The problem is that at times the S3 bucket may have a high volume of data to be processed, and long-running operations aren't suited to Lambdas.
One solution I can think of is to have the Lambda read and process one item from the bucket and, on success, trigger another instance of the same Lambda unless the bucket is empty/fully processed. The thing I don't like is that this is synchronous and quite slow. I also need to be conscious of running too many Lambdas at the same time, as we hit a REST endpoint as part of the error flow and don't want to overload it with too many requests.
I am thinking it would be nice to have maybe three instances of the Lambda running at the same time until the bucket is empty, but I'm not really sure. I am wondering if anyone has any nice patterns that could be used here, or suggestions on best practices?
Thanks
Create an S3 bucket for processing your files.
Enable an S3 -> Lambda trigger: on every new file in the bucket, the Lambda will be invoked to process that file, with every file processed separately (see the sketch after these steps). https://docs.aws.amazon.com/AmazonS3/latest/user-guide/enable-event-notifications.html
Once a file is processed, you can either delete it or move it somewhere else.
Regarding concurrency, please have a look at provisioned concurrency: https://docs.aws.amazon.com/lambda/latest/dg/configuration-concurrency.html
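If you prefer to wire up the S3 -> Lambda notification in code rather than in the console, a boto3 sketch along these lines should work (bucket name and function ARN are placeholders, and the function must already have a resource policy allowing S3 to invoke it):

```python
import boto3

s3 = boto3.client("s3")

# Invoke the Lambda for every newly created object in the bucket.
# The bucket name and function ARN are placeholders.
s3.put_bucket_notification_configuration(
    Bucket="my-processing-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-file",
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```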
Update:
As you still plan to use a scheduler Lambda and S3:
The scheduler Lambda reads/lists only the filenames and puts a message into SQS for each file to process.
A new Lambda consumes the SQS messages and processes the files.
Note: I would recommend using SQS from the start if the files/messages are not too big; it has built-in recovery mechanics (DLQ, delays, visibility timeouts, etc.) which give you more than plain S3 storage does. If the files are big, the second way is to just create a message with a file reference and still use SQS.
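A rough sketch of that scheduler Lambda in Python (the bucket name and queue URL are placeholders): it only lists keys and enqueues one message per file, leaving the actual work to the consumer Lambda.

```python
import json
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

BUCKET = "error-files-bucket"  # placeholder
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/error-files"  # placeholder

def handler(event, context):
    # List only the keys; the consumer Lambda fetches and processes each file.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            sqs.send_message(
                QueueUrl=QUEUE_URL,
                MessageBody=json.dumps({"bucket": BUCKET, "key": obj["Key"]}),
            )
```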
I'd separate the Lambda that is called by the scheduler from the Lambda that does the actual processing. When the scheduler calls the first Lambda, it can look at the contents of the bucket and then spawn the worker Lambdas to process the objects. This way you have control over how many objects you want per worker.
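A sketch of that fan-out, assuming asynchronous invocation and a batch size of three objects per worker (the bucket name, worker function name, and batch size are all illustrative choices, not requirements):

```python
import json
import boto3

s3 = boto3.client("s3")
lambda_client = boto3.client("lambda")

def handler(event, context):
    # List the bucket and split the keys into batches of three
    # ("my-bucket" and "worker-function" are placeholder names).
    response = s3.list_objects_v2(Bucket="my-bucket")
    keys = [obj["Key"] for obj in response.get("Contents", [])]
    for i in range(0, len(keys), 3):
        lambda_client.invoke(
            FunctionName="worker-function",
            InvocationType="Event",  # asynchronous: fire and forget
            Payload=json.dumps({"keys": keys[i:i + 3]}),
        )
```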
Given your requirements, I would recommend:
Configure an Amazon S3 Event so that a message is pushed to an Amazon SQS queue when the objects are created in the S3 bucket
Schedule an AWS Lambda function at regular intervals that will:
Check that the external service is working
Invoke a Lambda function to process one message from the queue, and keep looping
The hard part would be throttling the second Lambda function so that it doesn't try to send all requests at once (which might impact that external service).
You could probably do this by using a Step Function to trigger Lambda and then, if it was successful, trigger another Lambda function. This could even be done in parallel, such as allowing up to three parallel Lambda executions. The benefit of using Step Functions is that there is no cost for "waiting" for each Lambda to finish executing.
So, the Step Function flow would be something like:
Invoke a "check external service" Lambda function
If it fails, then quit the flow
Invoke the "processing" Lambda function
Get one message
Process the message
If successful, remove the message from the queue
Return success/fail
If it was successful, keep looping until the queue is empty
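The "processing" Lambda in that flow might look roughly like this in Python (the queue URL and the process helper are placeholders); returning a status lets the Step Function decide whether to keep looping:

```python
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"  # placeholder

def handler(event, context):
    # Pull a single message; an empty response means the queue is drained.
    response = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1)
    messages = response.get("Messages", [])
    if not messages:
        return {"status": "empty"}

    message = messages[0]
    process(message["Body"])  # hypothetical processing helper

    # Only remove the message from the queue once processing succeeded.
    sqs.delete_message(QueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"]) if False else \
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
    return {"status": "processed"}

def process(body):
    print("processing:", body)
```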
If a Lambda function has a concurrency>1, and there are several instances running, does a CloudWatch event Lambda trigger get sent to all the running instances?
The question wording is a little bit ambiguous. I will try my best to make it more clear.
If a Lambda function has a concurrency>1, and there are several instances running
I think OP is talking about reserved concurrency which is set to a value that's greater than 1. In other words, the function is not throttled by default and can run multiple instances in parallel.
does a CloudWatch event Lambda trigger get sent to all the running instances?
This part is ambiguous. #hephalump provided one interpretation in the question comment.
I have another interpretation. If you are asking whether the currently-running lambda containers will be reused after the job is done, then here is the answer:
Based on #hephalump's comment, it's now clear that one CloudWatch event will only trigger one Lambda instance to run. If multiple events come in during a short period of time, multiple Lambda instances will be triggered to run in parallel. Back to the question: if all existing instances of that function are busy running, no container will be reused, and a new instance will be spun up to handle the event. If one of the running instances has just finished its job, that container, along with its execution environment, will be reused to handle the incoming event from CloudWatch.
Hope this helps.
We have created a Lambda function which has to trigger every minute. It is working as expected and showing the correct result. But the log streams we get through CloudWatch Events contain multiple Lambda trigger logs in a single CloudWatch log stream.
Event Rule: (screenshot of the one-minute schedule rule)
Is it possible to create one CloudWatch log stream per Lambda trigger?
As per the AWS Lambda documentation here, a log stream represents an instance of your function. In other words, a log stream represents the logs from a single execution environment for your Lambda function. The execution environment is also called the context (one of the arguments passed to your handler), and the reason you're not getting a new log stream with each invocation is because of the context in which Lambda functions execute.
When you invoke your Lambda function, AWS loads the container that holds your function code and provisions the resources required for your function to execute: CPU, memory, networking, etc. These all form the function's execution environment, which is also called the context. These resources take time to provision, and this results in increased latency for the execution of the function. This is commonly known as a "cold start".
To mitigate this undesired latency on every invocation, after your function has completed its initial execution, AWS keeps the container and execution environment running instead of terminating them, with resources such as CPU, memory, and networking provisioned and ready in anticipation of the next invocation. This is known as keeping the function "warm". When the container is warm, subsequent invocations of your function are executed in the same execution environment, or context, as the previous invocation. Because the invocation was executed by the same instance of the function, the logs are written to the same log stream as the previous invocation(s), which is the log stream that represents that instance/execution environment/context of the function.
Notwithstanding this, it's worth pointing out that AWS does not keep the container running indefinitely. If there is no subsequent invocation within a given period of time (there's no exact figure, but it's generally considered to be between 30 and 45 minutes; source), AWS will terminate the container and release the resources for use by another function. The next time the Lambda function is invoked, AWS will repeat the provisioning process, a new execution environment will be created, and your function's logs will be written to a new log stream that represents the new execution environment/context/instance of your function.
You can read more about Lambda's execution context here.
Rachit,
Your Lambda function comes with a CloudWatch Logs log group, with a log stream for each instance of your function. The runtime sends details about each invocation to the log stream, and relays logs and other output from your function's code.
Moreover, from the AWS CloudWatch documentation you can see that a log stream is created each time logs come from a different event source. In the case of Lambda, it's one stream per Lambda container, where each container might process multiple events.
A log stream is a sequence of log events that share the same source. Each separate source of logs into CloudWatch Logs makes up a separate log stream.
Ref:
https://docs.aws.amazon.com/lambda/latest/dg/nodejs-prog-model-logging.html
I have an API Gateway with Lambdas behind it. For some of the endpoints I want to schedule an execution in the future, to run once. For example, if the REST call was made at time T, I want that Lambda to schedule an execution ONCE at T+20min.
The only solution I found to achieve this is to use boto3 and CloudWatch to set up a cron rule at the moment the REST call is made, send an event with the payload, and then have the delayed Lambda remove the rule when it runs.
I find this very heavy. Is there any other way to achieve such a pattern?
Edit: It is NOT A RECURRING Lambda, just to run ONCE.
One option is to use AWS Step Functions to trigger the AWS Lambda function after a given delay.
Step Functions has a Wait state that can schedule or delay execution, so you can implement a fairly simple Step Functions state machine that puts a delay in front of calling a Lambda function. No database required!
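As a concrete illustration (all ARNs below are placeholders, and the 20-minute figure comes from the question): the state machine is just a Wait state followed by a Task, and the API handler starts one execution per REST call instead of creating a CloudWatch rule.

```python
import json
import boto3

# Amazon States Language definition: wait 20 minutes, then invoke the
# target Lambda once. The function ARN is a placeholder.
definition = {
    "StartAt": "Wait20Minutes",
    "States": {
        "Wait20Minutes": {"Type": "Wait", "Seconds": 1200, "Next": "RunOnce"},
        "RunOnce": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:delayed-task",
            "End": True,
        },
    },
}

sfn = boto3.client("stepfunctions")

# One-time setup (roleArn is a placeholder):
# sfn.create_state_machine(
#     name="delay-20min",
#     definition=json.dumps(definition),
#     roleArn="arn:aws:iam::123456789012:role/step-functions-role",
# )

# In the API Gateway handler: start one execution per REST call,
# passing the request payload through to the delayed Lambda.
sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:delay-20min",
    input=json.dumps({"payload": "from the REST call"}),
)
```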
For an example of the concept (slightly different, but close enough), see:
Using AWS Step Functions To Schedule Or Delay SNS Message Publication - Alestic.com
Task Timer - AWS Step Functions