What happens if a function gets invoked by a TimerTrigger every 5 minutes and for some reason the code takes more than 5 minutes to complete?
Does this result in my function running twice at the same time?
Or does the interval start when the triggered code execution is completed?
I could not find an answer myself in the docs.
I have to ensure that my function always runs as a singleton.
Thanks,
Alex
If your function execution takes longer than the timer interval, another execution won't be triggered until after the current invocation completes. The next execution is scheduled after the execution completes. You can see this in the code here. You can prove this to yourself by trying a simple local example - create a function that runs every 5 seconds, and put a sleep in there for a minute. You won't see another function start until the first finishes.
As for running as a singleton, the above shows that only a single function invocation runs at a given time on the same instance (VM). The SDK further ensures that no other functions are running across scaled-out instances. You can read more about that here. To see this in action, you can simulate it by starting two instances of your console app locally - one will run the schedule, the other will not. However, if you kill the one running the schedule, the other one will pick it up after a short time (within a minute).
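The no-overlap behavior can be sketched with a small simulation (plain Python, not the WebJobs SDK; the tick arithmetic is a simplification of the real schedule logic): the next occurrence is computed only after the current invocation returns, so a job that overruns its interval delays the next run instead of overlapping it.

```python
import math

def run_serialized_schedule(job_duration, interval, total_runs):
    """Simulate a timer that computes the next occurrence only
    after the current invocation completes, so runs never overlap."""
    clock = 0.0
    starts = []
    for _ in range(total_runs):
        starts.append(clock)
        clock += job_duration  # the job runs to completion first
        # next start: first schedule tick at or after completion
        clock = float(math.ceil(clock / interval) * interval)
    return starts

# A 60-second job on a 5-second schedule starts at 0, 60, 120 -
# never two at once.
print(run_serialized_schedule(job_duration=60, interval=5, total_runs=3))
# -> [0.0, 60.0, 120.0]
```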
I'm a developer who is new to AWS.
While configuring Step Functions, I found there are some scenarios where multiple executions of a step function could run at once because of a timer I've set.
The step function I configured follows this process:
wait 2 minutes
execute a Lambda function
Since there is a timer in my step function, there could be cases where the step function is invoked multiple times at once.
The thing is, I want to guarantee that only one step function execution is running at a time.
So if another execution gets invoked while one is already running (timing), I want to terminate the execution that was just invoked. Is there any way to list the step function executions that are currently running?
You can't prevent an execution from starting, but you can list the executions at the start of your Step Function, and exit early if a running execution is found.
The ListExecutions API lists the executions for a given state machine ARN. Call it in a Task, setting the statusFilter to RUNNING to return only in-progress executions. You'll get back a list of matching execution items. Note that the execution doing the checking will itself appear in that list, so either exclude its own ARN (available from the context object as $$.Execution.Id) or check whether the count is greater than 1.
Finally, insert a Choice state. If there are running items, exit. If no running items, continue with the execution.
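A sketch of that check in Python (function and variable names are mine; the client would be a Step Functions client such as `boto3.client("stepfunctions")`, passed in from the Lambda behind the Task state):

```python
def other_execution_running(sfn, state_machine_arn, own_execution_arn):
    """Return True if any RUNNING execution other than our own exists.

    `sfn` is a Step Functions client, e.g. boto3.client("stepfunctions").
    Ignores pagination, which is fine for a handful of executions.
    """
    resp = sfn.list_executions(
        stateMachineArn=state_machine_arn,
        statusFilter="RUNNING",
    )
    return any(e["executionArn"] != own_execution_arn
               for e in resp["executions"])

# Example wiring inside the checking Lambda (hypothetical event shape):
# sfn = boto3.client("stepfunctions")
# if other_execution_running(sfn, machine_arn, event["executionArn"]):
#     return {"shouldExit": True}  # consumed by the Choice state
```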
I want to know if a Lambda execution continues to run even if the Step Function state that invoked it times out. If that happens, how can I stop it?
There is no way to kill a running Lambda. However, you can set its concurrency limit to 0 to stop further executions from starting.
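Roughly, with boto3 (hypothetical function names; the real client is `boto3.client("lambda")`, passed in here so the logic is easy to exercise):

```python
def pause_lambda(lambda_client, function_name):
    """Block new invocations by reserving zero concurrency.
    Executions already in flight still run to completion."""
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=0,
    )

def resume_lambda(lambda_client, function_name):
    """Remove the reserved-concurrency limit again."""
    lambda_client.delete_function_concurrency(FunctionName=function_name)
```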
Standard StepFunctions have a max timeout of 1 year. (yes! One year)
As such any individual task also has a max timeout of 1 year.
(Express StepFunctions, mind you, have a maximum duration of 5 minutes)
Lambdas have a max timeout of 15 minutes.
If you need your lambda to complete in a certain amount of time, you are best served by setting your lambda timeout to that - not your state machine's. (I see in your comments you say you cannot pass a value for this? If you cannot change it, then you have no choice but to let it run its course.)
Consider StepFunctions and state machines to be orchestrators, but they have very little control over the individual components. They tell each component when to act, but otherwise are stuck waiting on those components to reply before continuing.
If your lambda times out, it will cause your state machine to fail that task, as it receives a Lambda service error. You can then handle that in the StepFunction without failing the entire process; see:
https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html
You could specifically use TimeoutSeconds (or TimeoutSecondsPath, which reads the value from the state input) in your task definition so that a slow task fails with a States.Timeout error you can catch and handle.
But as stated, no: once a lambda begins execution it will continue until it finishes or it times out at its configured timeout, up to the 15-minute maximum.
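For example, a Task state along these lines gives the Lambda two minutes and routes a timeout to a handler state instead of failing the whole execution (state names are illustrative; the Parameters block selecting the function is omitted):

```json
"InvokeWorker": {
  "Type": "Task",
  "Resource": "arn:aws:states:::lambda:invoke",
  "TimeoutSeconds": 120,
  "Catch": [
    {
      "ErrorEquals": ["States.Timeout"],
      "Next": "HandleTimeout"
    }
  ],
  "Next": "NextStep"
}
```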
Scenario
I'm looking for a way to create an instance of a step function that waits for me to start it. Pseudo code would look like this.
StateMachine myStateMachine = new();
string executionArn = myStateMachine.ExecutionArn;
myStateMachine.Start();
Use Case
We need a way to reliably store the Execution ARN of a step function to a database. If we fail to write the Execution ARN to the database, we won't call the Start method and the step function should time out. If the starting of the step function fails, the database operation would be rolled back.
These are the steps we plan to take
A local transaction is started
The step function instance is created, but not started
The ExecutionArn of the created step function instance is recorded in a database
The step function is started
The local transaction is committed
Is there a simple way to start a step function like this?
Below is the result of some research I've done on this so far.
Manual Callbacks
Following information in this article https://aws.amazon.com/blogs/compute/implementing-serverless-manual-approval-steps-in-aws-step-functions-and-amazon-api-gateway/,
I created an empty activity, then used this activity as the first step in the step function with a timeout of 30 seconds on the activity step. The expectation was that if I didn't send a success to that activity task, the step would time out and the workflow would fail, but it isn't doing that. Even though I set the timeout to 30 seconds, the step is not timing out. I'm guessing the timeout governs how long the step function waits to be able to schedule the activity, not how long it waits to move on from the activity step.
I've also considered using an SQS SendMessage step with Wait for callback checked and with a similar timeout, but that would require I create a throw-away SQS queue just to contain messages I never intend to read, plus I'm guessing the timeout functionality would work the same here as in an activity.
Wait State
There may be something I can do with a Wait state and parallel branches by following the accepted answer in this SO article: Does AWS Step Functions have a timeout feature?, but before I go down that route I want to see if something simpler can be done.
Global Timeout
I have found that step functions have a global timeout, and that is useful in this case if I use it in conjunction with a step that pauses until my application explicitly resumes it, but the global timeout is only useful if it can be reasonably low (like 20 minutes) and still have the step function viable for all use cases. For instance, if the maximum time it should take to run the step function is 2 or 3 minutes, then all is fine. But if I have another step in the step function that can take longer than 20 minutes then I can't use the global timer anymore or I have to start setting it to something very high, which I don't want to do.
Is there anything I can do here easily that I'm overlooking?
Thanks
Two-phase initialization of a step function cannot be done. We've worked around this by:
Our Application: Writing a row in our DB to indicate the intent to start a step function
Our Application: Start the step function
Our Application: Record the ExecutionArn of the step function instance in the created row
Step Function: Have the step function's first step wait indefinitely on an SQS step (wait for callback)
Our Application: Poll the SQS queue and either abort the step function or allow it to proceed to the next step by sending a callback to the SQS step. (This is the second phase.)
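The application side of that workaround, sketched in Python (the `db` object and method names are hypothetical; the real client is `boto3.client("stepfunctions")`, passed in here):

```python
import uuid

def start_tracked_execution(sfn, db, state_machine_arn):
    """Record intent first, then start, then record the ARN.
    The execution itself blocks on its first (SQS callback) step
    until we later send it a task token response."""
    intent_id = str(uuid.uuid4())
    db.record_intent(intent_id)          # 1. intent row, before starting
    resp = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        name=intent_id,                  # execution names are unique per machine
    )
    db.record_execution_arn(intent_id, resp["executionArn"])  # 2. record ARN
    return resp["executionArn"]
```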
I have a path from SQS to Lambda. The Lambda code has some time-consuming preprocessing before doing the actual work.
The issue is,
For example, at time 0 a message comes to sqs and then onto start executing in lambda. Suppose it needs 2 units of time for warmup and execution. It will complete at time 2.
Now, at time 1 another message comes, as lambda 1 is busy another lambda spins up. It again needs 2 units of time. It will complete at time 3.
Here comes the issue: if at time 2.01 another message comes, it will be picked up by lambda 1, as it finished its work at time 2. Suppose lambda 1 finishes again at 2.99.
The final execution sequence is 1, 3, 2 when it was supposed to be 1, 2, 3.
Is there a way to maintain serial order without sacrificing concurrency?
Note: Lambda finishes job with a dynamo db write.
My goal is to have a workflow which periodically (every 30 seconds) adds the same activity (doing nothing but sleeping for 1 minute) to the taskList. I also have multiple machines hosting activity workers that poll the taskList simultaneously. When the activity gets scheduled, one of the workers can poll it and execute it.
I tried to use a cron decorator to create a DynamicActivityClient and use DynamicActivityClient.scheduleActivity() to schedule the activity periodically. However, it seems the activity will not be scheduled until the last activity is finished. In my case, the activity gets scheduled every 1 minute rather than every 30 seconds as set in the cron pattern.
The package structure is almost the same as aws sdk sample code: cron
Is there any other structure recommended to achieve this? I am very new to SWF. Any suggestion is highly appreciated.
You may do so by writing much simpler workflow code that uses the workflow clock and a timer. Refer to the example at the link below.
http://docs.aws.amazon.com/amazonswf/latest/awsflowguide/executioncontext.html
Also remember one thing: the maximum number of events allowed in a workflow execution is 25,000, so the cron job will not run forever. You will have to write code to start a new workflow execution after some time. Refer to the continuous workflow example provided at the link below.
http://docs.aws.amazon.com/amazonswf/latest/awsflowguide/continuous.html
The cron decorator internally relies on AsyncScheduledExecutor, which is by design written to wait for all asynchronous code in the invoked method to complete before calling the cron again. So the behavior you are witnessing is expected. The workaround is to not invoke the activity from the code under cron, but from code in a different scope. Something like:
// This is a field
Settable&lt;Void&gt; invokeNextActivity = new Settable&lt;&gt;();

void executeCron() {
    scheduledExecutor.execute(new AsyncRunnable() {
        @Override
        public void run() throws Throwable {
            // Instead of executing the activity here, just unblock
            // its execution in a different scope.
            invokeNextActivity.set(null);
        }
    });
    // Recursive loop with each activity invocation
    // gated on invokeNextActivity
    executeActivityLoop(invokeNextActivity);
}

@Asynchronous
void executeActivityLoop(Promise&lt;Void&gt; waitFor) {
    // @Asynchronous defers the body until waitFor is ready
    activityClient.executeMyActivityOnce();
    invokeNextActivity = new Settable&lt;&gt;();
    executeActivityLoop(invokeNextActivity);
}
I recommend reading the TryCatchFinally documentation to get an understanding of error handling and scopes.
Another option is to rewrite AsyncScheduledExecutor to call invoked.set(lastInvocationTime) not from doFinally but immediately after calling command.run().