Behaviour of AWS Lambda triggered by schedule function 'rate' with concurrency > 1 - amazon-web-services

I need to create a Lambda that processes records in a DynamoDB table and is triggered by the schedule expression rate(1 minute). Depending on the amount of data it needs to process, this Lambda could take anywhere between 1 second and 5 minutes, give or take.
My understanding is that if I set concurrency to 1:
The lambda will try to trigger every minute
If the previous lambda instance has not finished (running longer than 1 minute), it will cancel the attempt since an instance is already running, and it will try again a minute later
My question is what happens if I set the concurrency > 1, so for example, 2:
Will there be 2 lambdas triggered every minute ?
Or will it trigger 1 lambda every minute, but
lambda#minute0 is created and takes 3 minutes
lambda#minute1 is also created and takes 2 minutes
lambda#minute2 will not be created since we already have 2 instances
I will try to answer my own question once I do the tests if there is no answer before that.

It will be as you describe, except for the "Will there be 2 lambdas triggered every minute ?" part. One invocation is triggered per minute, and two can run at the same time only if there is "free" concurrency to be consumed. So if you already have two functions that have been running for a few minutes, there is no "free" concurrency to start a third function.
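The behaviour described above can be sketched as a small simulation. This is only a toy model under stated assumptions: `simulate_schedule`, `durations` and `concurrency` are hypothetical names, a throttled tick is simply dropped, and any retry behaviour the Lambda service may apply to throttled invocations is ignored.

```python
def simulate_schedule(durations, concurrency, interval=1):
    """Return, per schedule tick, True if the invocation started, False if throttled.

    durations[i] is how long (in minutes) the invocation fired at tick i would run.
    Assumption: a throttled tick is dropped and never retried.
    """
    running_until = []  # end times of the invocations currently in flight
    started = []
    for tick, duration in enumerate(durations):
        now = tick * interval
        # Forget invocations that have already finished by this tick.
        running_until = [end for end in running_until if end > now]
        if len(running_until) < concurrency:
            running_until.append(now + duration)
            started.append(True)
        else:
            started.append(False)
    return started


# The scenario from the question: minute 0 runs 3 minutes, minute 1 runs 2 minutes.
print(simulate_schedule([3, 2, 1, 1], concurrency=2))
print(simulate_schedule([3, 2, 1, 1], concurrency=1))
```

With a concurrency of 2, the tick at minute 2 finds both slots occupied and is skipped, which matches the lambda#minute2 case in the question.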

Related

When a state of a step function times out, does the lambda execution correlated to it continue to be performed?

I want to know if a Lambda execution continues to run even if the state of the step function correlated to it times out. If it does, how can I stop it?
There is no way to kill a running Lambda. However, you can set its concurrency limit to 0 to stop it from starting further executions.
Standard StepFunctions have a max timeout of 1 year. (yes! One year)
As such any individual task also has a max timeout of 1 year.
(Express StepFunctions have a maximum duration of 5 minutes, mind you.)
Lambdas have a maximum timeout of 15 minutes.
If you need your Lambda to complete in a certain amount of time, you are best served by setting your Lambda timeout to that, not your state machine timeout. (I see in your comments you say you cannot pass a value for this? If you cannot change it, then you have no choice but to let it run its course.)
Consider StepFunctions and state machines to be orchestrators, but they have very little control over the individual components. They tell who to act and when but otherwise are stuck waiting on those components to reply before continuing.
If your Lambda times out, it will cause your state machine to fail that task, as it receives a Lambda service error. You can then handle that in the StepFunction without failing the entire process, see:
https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html
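You could specifically use TimeoutSeconds (or TimeoutSecondsPath, to read the value from the state input) in your definition and catch the resulting States.Timeout error to set a specific result if the task times out.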
But as stated, no, once a lambda begins execution it will continue until it finishes or it times out at 15 mins / its set timeout.
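As a rough illustration of the error handling mentioned above, here is a minimal sketch that builds a state machine definition as a Python dict and serialises it to JSON. The state names, the `$.error` result path and the ARN are hypothetical placeholders. Note that this catcher only matches the state-level timeout; a timeout inside the Lambda itself surfaces as a different error, which a broader catcher would have to cover.

```python
import json


def build_definition(lambda_arn, timeout_seconds):
    """Build a definition whose task state gives up after timeout_seconds.

    The state timing out does not stop the Lambda; it only lets the
    state machine move on to the fallback state.
    """
    return {
        "StartAt": "ProcessRecords",
        "States": {
            "ProcessRecords": {
                "Type": "Task",
                "Resource": lambda_arn,
                "TimeoutSeconds": timeout_seconds,
                "Catch": [
                    {
                        # Raised by Step Functions when the state exceeds TimeoutSeconds.
                        "ErrorEquals": ["States.Timeout"],
                        "ResultPath": "$.error",
                        "Next": "HandleTimeout",
                    }
                ],
                "End": True,
            },
            "HandleTimeout": {
                "Type": "Pass",
                "Result": {"status": "timed_out"},
                "End": True,
            },
        },
    }


# Hypothetical ARN, for illustration only.
definition = build_definition(
    "arn:aws:lambda:us-east-1:123456789012:function:process-records", 60
)
print(json.dumps(definition, indent=2))
```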

Create a parallel step function with a lambda

I have a question on the step function part of AWS
I have a function that watches and updates data in databases. Because we can have anywhere from 1,000 to 1,000,000 items to update, I would like to process them in batches of 10,000 or 100,000 with a Lambda.
The optimal solution would be to process the batches in parallel, so that all the data is updated at the same time and the batches finish together.
For that, I would like to create a Lambda function that uses the aws-sdk to create a parallel step function with X tasks, where each task handles 10,000 or 100,000 items of the database.
But when I read the aws-sdk documentation, it looks like there is no way to create a parallel step function, even from a template.
So my question is: is it possible to create a parallel step function from a Lambda function with the aws-sdk? Or do you have a better solution to my problem?
Thanks in advance
Update: To give you more information, my problem is that I will have to update or insert an unknown amount of data in my DB on the first day of each month, and I need to call an API that takes 15 seconds to return the data (it's not our API, so I cannot improve its response time).
If I just use a Lambda function, it will time out after 15 minutes.
So I thought of using a Step Function to execute the Lambda function for each item. But if we have a lot of data, it might take more than 24 hours, and I would like to find a solution where I can execute my Lambda function in parallel to reduce the time, so I thought about the parallel task of Step Functions.
But because the amount of data will change every month, I don't know how to dynamically increase or decrease the number of branches of my step function, and that's why I thought of generating the step function from another Lambda.
I have a function to watch and update data in databases.
I suppose what you need to watch is some kind of user/data events? What is there to watch? What is there to update?
Can you provide more info so that I can give you some architectural suggestions?
By the way, it is Step Functions that orchestrates/invokes Lambda functions, not the other way around.
Updated answer:
So you seem to be facing the hard 15-minute limit on Lambda execution time. There are 3 approaches I can see:
1. Instead of using a Lambda function, use an ECS container or EC2 instance to handle the large volume of data processing and database writing. However, this requires a substantial code rewrite and infrastructure/architectural change.
2. Figure out a way to break down the input data so you can fan out the handling to multiple Lambda function instances, i.e.: input data -> Lambda to break down the task -> SQS messages -> Lambda to handle each task. My concern is that the task of breaking down the input data may itself need substantial time.
3. Before the Lambda execution times out, mark the current processed position and invoke the same Lambda function with the original event + position offset. The next Lambda instance picks up the data processing from where the previous execution stopped. https://medium.com/swlh/processing-large-s3-files-with-aws-lambda-2c5840ae5c91
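The fan-out approach (breaking the input into tasks that become SQS messages) could look roughly like the sketch below. It only builds the message bodies; actually sending them to a queue is left out. The `offset`/`limit` message shape and the function name are assumptions made for illustration, not something the asker's system is known to use.

```python
import json


def build_batch_messages(total_items, batch_size):
    """Split total_items into batches and return one JSON message body per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    messages = []
    for offset in range(0, total_items, batch_size):
        # The last batch may be smaller than batch_size.
        limit = min(batch_size, total_items - offset)
        messages.append(json.dumps({"offset": offset, "limit": limit}))
    return messages


# 25,000 items in batches of 10,000 yield three messages; the last covers 5,000 items.
for body in build_batch_messages(25_000, 10_000):
    print(body)
```

Each worker Lambda would then read its own `offset` and `limit` and handle only that slice, so the number of parallel workers follows the amount of data without changing any state machine definition.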

Is there a way to maintain lambda concurrency from SQS while maintaining serial execution

I have a path from SQS to Lambda. The Lambda code does some time-consuming preprocessing before doing the actual work.
The issue is as follows.
For example, at time 0 a message arrives in SQS and starts executing in Lambda. Suppose it needs 2 units of time for warm-up and execution. It will complete at time 2.
Now, at time 1 another message arrives. As lambda 1 is busy, another lambda spins up. It again needs 2 units of time, so it will complete at time 3.
Here comes the issue: if another message arrives at time 2.01, it will be picked up by lambda 1, which finished its work at time 2. Suppose lambda 1 finishes again at 2.99.
The final execution sequence is 1, 3, 2 when it was supposed to be 1, 2, 3.
Is there a way to maintain serial order without sacrificing concurrency?
Note: the Lambda finishes its job with a DynamoDB write.

S3 Lambda trigger double invocation after exactly 10 minutes

We are experiencing double invocations of Lambdas triggered by S3 ObjectCreated events. Those double invocations happen exactly 10 minutes after the first invocation: not 10 minutes after the first try is complete, but 10 minutes after the first invocation happened. The original invocation takes anywhere between 0.1 and 5 seconds. No invocation results in an error; they all complete successfully.
We are aware of the fact that SQS for example does not guarantee exactly-once but at-least-once delivery of messages and we would accept some of the lambdas getting invoked a second time due to results of the distributed system underneath. A delay of 10 minutes however sounds very weird.
Of about 10k messages 100-200 result in double invocations.
The AWS Support basically says "the 10 minute wait time is by design but we cannot tell you why", which is not at all helpful.
Has anyone else experienced this behaviour before?
How did you solve the issue or did you simply ignore it (which we could do)?
One proposed solution is not to use direct S3-lambda-triggers, but let S3 put its event on SNS and subscribe a Lambda to that. Any experience with that approach?
example log: two invocations, 10 minutes apart, same RequestId
START RequestId: f9b76436-1489-11e7-8586-33e40817cb02 Version: 13
2017-03-29 14:14:09 INFO ImageProcessingLambda:104 - handle 1 records
and
START RequestId: f9b76436-1489-11e7-8586-33e40817cb02 Version: 13
2017-03-29 14:24:09 INFO ImageProcessingLambda:104 - handle 1 records
After a couple of rounds with the AWS support and others and a few isolated trial runs it seems like this is simply "by design". It is not clear why, but it simply happens. The problem is neither S3 nor SQS / SNS but simply the lambda invocation and how the lambda service dispatches the invocations to lambda instances.
The double invocations happen for somewhere between 1% and 3% of all invocations, 10 minutes after the first invocation. Surprisingly, there are even triple (and probably quadruple) invocations, at rates that are powers of the base probability, so basically 0.09%, ... The triple invocations happened 20 minutes after the first one.
If you encounter this, you simply have to work around it using whatever you have access to. We, for example, now store the already processed entities in Cassandra with a TTL of 1 hour and only respond to messages from the Lambda if the entity has not been processed yet. The double and triple invocations all happen within this one-hour timeframe.
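The idea of remembering processed entities for an hour can be sketched as follows. This is an in-memory stand-in, not the Cassandra setup from the answer: `ProcessedStore` and `handle_record` are hypothetical names, and a real deployment would need a store shared by all Lambda instances, since each instance has its own memory.

```python
import time


class ProcessedStore:
    """In-memory stand-in for a TTL-backed store of processed entity IDs."""

    def __init__(self, ttl_seconds=3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at = {}

    def mark_if_new(self, entity_id):
        """Record entity_id and return True, or return False if seen within the TTL."""
        now = self._clock()
        expiry = self._expires_at.get(entity_id)
        if expiry is not None and expiry > now:
            return False
        self._expires_at[entity_id] = now + self._ttl
        return True


def handle_record(store, entity_id, process):
    """Process entity_id unless it was already handled within the TTL window."""
    if not store.mark_if_new(entity_id):
        return "skipped"
    process(entity_id)
    return "processed"
```

A duplicate that arrives 10 or 20 minutes after the first invocation falls inside the one-hour window and is skipped.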
Not wanting to spin up a data store like DynamoDB just to handle this, I did two things to solve our use case:
Write a lock file per function into S3 (which we were already using for this one) and check for its existence on function entry, aborting if present; for this function we only ever want one of it running at a time. The lock file is removed before we call callback on error or success.
Write a request time in the initial event payload and check the request time on function entry; if the request time is too old then abort. We don't want Lambda retries on error unless they're done quickly, so this handles the case where a duplicate or retry is sent while another invocation of the same function is not already running (which would be stopped by the lock file) and also avoids the minimal overhead of the S3 requests for the lock file handling in this case.
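The request-time check from the second point might look something like this. It is a minimal sketch: the `requestTime` field name, the ISO 8601 format and the five-minute threshold are all assumptions, since the answer does not say how the time is stored or how old is "too old".

```python
from datetime import datetime, timedelta, timezone


def should_run(event, now=None, max_age=timedelta(minutes=5)):
    """Return False when the event's request time is older than max_age.

    Assumes event["requestTime"] is an ISO 8601 string with a UTC offset,
    written by whoever created the original event.
    """
    now = now or datetime.now(timezone.utc)
    requested_at = datetime.fromisoformat(event["requestTime"])
    return now - requested_at <= max_age


# A duplicate delivered 10 minutes after the original request is rejected.
late = datetime(2017, 3, 29, 14, 24, 9, tzinfo=timezone.utc)
print(should_run({"requestTime": "2017-03-29T14:14:09+00:00"}, now=late))
```

The function would call `should_run(event)` on entry and return early when it yields `False`, before touching the S3 lock file at all.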

TimerTrigger Schedules and code execution time

What happens if a function gets invoked by a TimerTrigger every 5 minutes and for some reason the code takes more than 5 minutes to complete?
Does this result in my function running twice at the same time?
Or does the interval start when the triggered code execution is completed?
I could not find an answer myself in the docs.
I have to ensure that my function always runs as a singleton.
Thanks,
Alex
If your function execution takes longer than the timer interval, another execution won't be triggered until after the current invocation completes. The next execution is scheduled after the execution completes. You can see this in the code here. You can prove this to yourself by trying a simple local example - create a function that runs every 5 seconds, and put a sleep in there for a minute. You won't see another function start until the first finishes.
As far as running as a singleton goes, the above shows that only a single function invocation runs at a given time on the same instance (VM). The SDK further ensures that no other functions are running across scaled-out instances. You can read more about that here. To see this in action, you can simulate it by starting two instances of your console app locally: one will run the schedule, the other will not. However, if you kill the one running the schedule, the other one will pick it up after a short time (within a minute).
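The non-overlapping behaviour described in this answer can be illustrated with a toy model. It is written in Python purely for illustration and is not the SDK's code. It assumes that the next run starts at the first interval boundary at or after the previous run's completion; the answer only states that the next execution is scheduled after the current one completes, so the exact alignment is a guess.

```python
import math


def simulate_timer(interval, durations):
    """Return (start, end) pairs for consecutive runs that never overlap.

    Assumption: after a run ends, the next one starts at the first
    interval boundary at or after that end time.
    """
    runs = []
    start = 0
    for duration in durations:
        end = start + duration
        runs.append((start, end))
        next_tick = math.ceil(end / interval) * interval
        # Never start again before at least one full interval has elapsed.
        start = max(next_tick, start + interval)
    return runs


# A 5-minute timer whose first run takes 7 minutes: the tick at minute 5 is skipped.
print(simulate_timer(5, [7, 2, 2]))
```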