Cannot pass array to next task in AWS StepFunction

I'm working on an AWS Step Functions state machine that gets an array of dates from a Lambda call, then passes it to a Task that should take that array as a parameter to pass into another Lambda.
The Get Date Range task works fine and outputs the date array:
{
"rng": [
"2019-05-07",
"2019-05-09"
]
}
...and the array gets passed into the ProcessDateRange task, but I cannot assign the array to the range parameter.
It literally tries to pass this: "$.rng" instead of this:
[
"2019-05-07",
"2019-05-09"
]
Here's the StateMachine:
{
"StartAt": "Try",
"States": {
"Try": {
"Type": "Parallel",
"Branches": [{
"StartAt": "Get Date Range",
"States": {
"Get Date Range": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:get-date-range",
"Parameters": {
"name": "thename",
"date_query": "SELECT date from sch.tbl_dates;",
"database": "the_db"
},
"ResultPath": "$.rng",
"TimeoutSeconds": 900,
"Next": "ProcessDateRange"
},
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
"Exit": {
"Type": "Succeed"
}
}
}],
"Catch": [{
"ErrorEquals": ["States.ALL"],
"ResultPath": "$.Error",
"Next": "Failed"
}],
"Next": "Succeeded"
},
"Failed": {
"Type": "Fail",
"Cause": "There was an error. Please review the logs.",
"Error": "error"
},
"Succeeded": {
"Type": "Succeed"
}
}
}

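This is because you are using the wrong syntax for passing input to a Lambda task: a plain key in Parameters is sent as a literal string. To select the input by path, you can set the InputPath key, which passes the array at that path as the task's entire input, for example: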
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"InputPath": "$.rng",
"ResultPath": "$",
"Next": "Exit"
},

If you want a parameter to be interpreted as a JSON path instead of a literal string, add ".$" to the end of the parameter name. To modify your example:
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range.$": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
Relevant docs here: https://docs.aws.amazon.com/step-functions/latest/dg/connectors-parameters.html#connectors-parameters-path
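To make the difference between "range" and "range.$" concrete, here is a minimal Python sketch that mimics how a Parameters block is evaluated. It is only an illustration: resolve_path and resolve_parameters are hypothetical helper names, and the path handling covers just the small subset of JSONPath used on this page, not everything Step Functions supports.

```python
import re
from typing import Any


def resolve_path(path: str, data: Any) -> Any:
    """Resolve a tiny JSONPath subset: "$", "$.a.b" and "$[0].a"."""
    if not path.startswith("$"):
        raise ValueError(f"not a JSON path: {path!r}")
    current = data
    # Each token is either a ".name" field access or an "[n]" array index.
    for name, index in re.findall(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]", path[1:]):
        current = current[name] if name else current[int(index)]
    return current


def resolve_parameters(parameters: dict, state_input: Any) -> dict:
    """Mimic how a Parameters block is evaluated against the state input."""
    resolved = {}
    for key, value in parameters.items():
        if key.endswith(".$"):
            # Key ends in ".$": the value is a path and the suffix is dropped.
            resolved[key[:-2]] = resolve_path(value, state_input)
        elif isinstance(value, dict):
            # Nested objects are evaluated with the same rule.
            resolved[key] = resolve_parameters(value, state_input)
        else:
            # Plain key: the value is passed through as a literal.
            resolved[key] = value
    return resolved


state_input = {"rng": ["2019-05-07", "2019-05-09"]}
literal = resolve_parameters({"range": "$.rng"}, state_input)
by_path = resolve_parameters({"range.$": "$.rng"}, state_input)
print(literal)  # the string "$.rng" is passed unchanged
print(by_path)  # the array stored under "rng" is passed
```

The first call reproduces the behaviour in the question (the literal string is forwarded), and the second shows what the ".$" suffix changes.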

Related

Break a Map loop execution in AWS Step Functions

I'm trying to build a step function with a loop (Map) inside that can be stopped whenever a specific error is thrown, something like this:
"Job": {
"Type": "Map",
"InputPath": "$.content",
"ItemsPath": "$.data",
"MaxConcurrency": 0,
"Iterator": {
"StartAt": "Validate",
"States": {
"Validate": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:ship-val",
"Catch": [
{
"ErrorEquals": [
"ErrorOne"
],
"Next": "BreakLoop"
},
{
"ErrorEquals": ["States.ALL"],
"Next": "FailUncaughtError"
}
],
"End": true
},
"FailUncaughtError":{
"Type": "Fail",
"Error": "Uncaught error"
},
"BreakLoop":{
"Type": "Fail",
"Error": "the loop should be stopped"
}
}
},
"ResultPath": "$.content.data",
"End": true
}
I tried to point the Next field of the Catch at a state outside the Map, but I couldn't because a Map only accepts transitions to states within it. Moreover, AFAIK there is no mention of a feature like this in the AWS docs.
Instead of catching the error inside the Map state, don't catch it there and let the Map state fail.
Then add a Catch to the Map state itself; if the error matches the one you are looking for, continue to the next step:
{
"StartAt": "Map",
"States": {
"Map": {
"Type": "Map",
"ItemsPath": "$.array",
"Iterator": {
"StartAt": "FaultyLambda",
"States": {
"FaultyLambda": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"FunctionName": "your function arn",
"Payload": {
"a": 1
}
},
"End": true
}
}
},
"Catch": [
{
"ErrorEquals": ["ErrorOne"],
"Next": "BreakLoop"
}
],
"Next": "BreakLoop"
},
"BreakLoop": {
"Type": "Pass",
"End": true
}
}
}
Any other error will not be caught and will fail your entire execution.
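As a rough analogy only, the same control flow can be sketched in Python: the per-item task does not handle the error, so the first failure aborts the whole iteration and is handled once at the outer level. ErrorOne, faulty_task and run_map are hypothetical names, and the sketch runs items serially, whereas a real Map state may run iterations concurrently.

```python
class ErrorOne(Exception):
    """Hypothetical custom error raised by the per-item task."""


def faulty_task(item: int) -> int:
    # Stand-in for the Lambda inside the Iterator; fails on a chosen item.
    if item == 3:
        raise ErrorOne(f"item {item} triggered the stop condition")
    return item * 10


def run_map(items: list) -> dict:
    results = []
    try:
        # No try/except around the individual call: the first failure
        # aborts the whole "Map" and surfaces at this outer level.
        for item in items:
            results.append(faulty_task(item))
    except ErrorOne as err:
        # Counterpart of the Map-level Catch that routes to "BreakLoop".
        return {"next": "BreakLoop", "error": str(err)}
    # Any exception other than ErrorOne is not handled and propagates.
    return {"next": "BreakLoop", "results": results}


print(run_map([1, 2]))
print(run_map([1, 2, 3, 4]))
```

In the second call the results collected before the failure are dropped, which matches the idea that the Map state as a whole fails and only the error is handed to the Catch.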

Workflow has no terminal state

I am creating a workflow with AWS Step Functions where I first check whether a record exists in the database. Based on the result there are two branches, and each of them ends at either a Succeed or a Fail state, but I am still getting a "Workflow has no end state" error.
The following is the JSON for the workflow:
{
"Comment": "A demo state machine",
"StartAt": "FindCategory",
"States": {
"FindCategory": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxxxxxx:function:xxxxxx",
"Next": "Exists?"
},
"Exists?": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.exists",
"BooleanEquals": true,
"Next": "Yes"
},
{
"Variable": "$.exists",
"BooleanEquals": false,
"Next": "No"
}
]
},
"Yes": {
"Type": "Pass",
"Next": "GetQuestions"
},
"GetQuestions": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxxxxxxxxxxxxx",
"Next": "ReplyWithPolls"
},
"ReplyWithPolls": {
"Type": "Map",
"MaxConcurrency": 2,
"Iterator": {
"StartAt": "SendPoll",
"States": {
"SendPoll": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxxxxxxxxxxx:xxxxxxx",
"Next": "SendPoll"
}
}
},
"Next": "Succeed"
},
"No": {
"Type": "Pass",
"Next": "FailState"
},
"Succeed": {
"Type": "Succeed"
},
"FailState": {
"Type": "Fail",
"Error": "404",
"Cause": "Category not found"
}
}
}
I believe the problem is that your SendPoll state results in an infinite loop. It references itself as next. Instead, the state in the iterator should be a terminal state.
Replace the "Next" field in "SendPoll" state with an "End" field.
"SendPoll": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:xxxxxxxxxxxx:xxxxxxx",
"End": true
}
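A quick way to spot this kind of problem before deploying is to scan the definition for Map iterators that contain no terminal state. The Python sketch below is a simplified check under stated assumptions: both function names are hypothetical, only Map iterators are inspected, and a state counts as terminal if it has "End": true or is of type Succeed or Fail.

```python
def has_terminal_state(states: dict) -> bool:
    """Return True if at least one state can end the (sub-)workflow."""
    for state in states.values():
        if state.get("End") is True or state.get("Type") in ("Succeed", "Fail"):
            return True
    return False


def find_iterators_without_terminal(states: dict, prefix: str = "") -> list:
    """List Map states whose Iterator contains no terminal state."""
    problems = []
    for name, state in states.items():
        if state.get("Type") == "Map":
            inner = state["Iterator"]["States"]
            if not has_terminal_state(inner):
                problems.append(prefix + name)
            # Recurse so nested Map states are checked as well.
            problems.extend(
                find_iterators_without_terminal(inner, prefix + name + ".")
            )
    return problems


# Trimmed-down copy of the ReplyWithPolls state from the question.
broken = {
    "ReplyWithPolls": {
        "Type": "Map",
        "Iterator": {
            "StartAt": "SendPoll",
            "States": {"SendPoll": {"Type": "Task", "Next": "SendPoll"}},
        },
        "Next": "Succeed",
    },
    "Succeed": {"Type": "Succeed"},
}

# Same state with the suggested fix applied.
fixed = {
    "ReplyWithPolls": {
        "Type": "Map",
        "Iterator": {
            "StartAt": "SendPoll",
            "States": {"SendPoll": {"Type": "Task", "End": True}},
        },
        "Next": "Succeed",
    },
    "Succeed": {"Type": "Succeed"},
}

print(find_iterators_without_terminal(broken))
print(find_iterators_without_terminal(fixed))
```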

AWS Step-Function: pass a specific value from one AWS lambda to another in step function parallel state

I have the state machine below. The requirement is to have a Lambda query the DB and get all the ids. Next I have a Parallel state that calls more than five Lambdas at once. Instead of passing all the fetched ids to every Lambda, I need to pass only the respective id to each Lambda.
In the state language below, the first call is DB_CALL. Let's say it returns {id1, id2, id3, id4, id5, id6}; I want to pass only id1 to First_Lambda, id2 to Second_Lambda, and so on.
The entire id object should not get passed to all Lambdas. Please suggest a way to achieve this.
{
"Comment": "Concurrent Lambda calls",
"StartAt": "StarterLambda",
"States": {
"StarterLambda": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:DB_CALL",
"Next": "ParallelCall"
},
"State": {
"ParallelCall": {
"Type": "Parallel",
"End": true,
"Branches": [
{
"StartAt": "First",
"States": {
"First": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:First_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
},
{
"StartAt": "Second",
"States": {
"Second": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Second_Lambda",
"Retry": [ {
"ErrorEquals": ["States.TaskFailed"],
"IntervalSeconds": 1,
"MaxAttempts": 2,
"BackoffRate": 2.0
} ],
"End": true
}
}
},
{
"StartAt": "Third",
"States": {
"Third": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Third_Lambda",
"Catch": [ {
"ErrorEquals": ["States.TaskFailed"],
"Next": "CatchHandler"
} ],
"End": true
},
"CatchHandler": {
"Type": "Pass",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:CATCH_HANDLER",
"End": true
}
}
},
{
"StartAt": "Fourth",
"States": {
"Fourth": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Fourth_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
},
{
"StartAt": "Fifth",
"States": {
"Fifth": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Fifth_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
},
{
"StartAt": "Sixth",
"States": {
"Sixth": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Sixth_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
}
]
}
}
}
}
You can use the Step Functions Parameters option.
This allows you to send a specific value or JSON object to the next Lambda:
"Parameters": {
"toprocess.$": "$.MetaData.CorrelationId"
},
The input to this Lambda is then a smaller object than the input to your first Lambda. When returning a value from this Lambda, avoid overwriting the whole state with it; write it to its own ResultPath instead:
"OutputPath": "$",
"ResultPath": "$.PartialResult",
What you are looking for is the Map state. You point it at an array (via ItemsPath), in your case the list of ids, and the Map state runs its Iterator once for each item in the list. The Iterator is a full state machine, so it can call a Lambda or any other state. MaxConcurrency limits how many iterations run at once if that is needed.

Parallel States Merge the output in Step Function

Is it possible to have the following kind of Step Function graph, i.e. one combined state that follows the output of 2 parallel states?
If yes, what would the JSON for this look like? If not, why?
A parallel task always outputs an array (containing one entry per branch).
You can tell AWS step functions to append the output into new (or existing) property in the original input with "ResultPath": "$.ParallelOut" in your parallel state definition, but this is not what you seem to be trying to achieve.
To merge the output of parallel task, you can leverage the "Type": "Pass" state to define transformations to apply to the JSON document.
For example, in the state machine below, I'm transforming a JSON array...
[
{
"One": 1,
"Two": 2
},
{
"Foo": "Bar",
"Hello": "World"
}
]
...into a few properties
{
"Hello": "World",
"One": 1,
"Foo": "Bar",
"Two": 2
}
{
"Comment": "How to convert an array into properties",
"StartAt": "warm-up",
"States": {
"warm-up": {
"Type": "Parallel",
"Next": "array-to-properties",
"Branches": [
{
"StartAt": "numbers",
"States": {
"numbers": {
"Type": "Pass",
"Result": {
"One": 1,
"Two" : 2
},
"End": true
}
}
},
{
"StartAt": "words",
"States": {
"words": {
"Type": "Pass",
"Result": {
"Foo": "Bar",
"Hello": "World"
},
"End": true
}
}
}
]
},
"array-to-properties": {
"Type": "Pass",
"Parameters": {
"One.$": "$[0].One",
"Two.$": "$[0].Two",
"Foo.$": "$[1].Foo",
"Hello.$": "$[1].Hello"
},
"End": true
}
}
}
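The Pass state above lists every key by hand. If the key names are not known in advance, the same merge could be done in a small Lambda placed after the Parallel state. The Python sketch below shows only the merge logic; merge_branch_outputs is a hypothetical helper name, and letting later branches win on key collisions is an assumption of this sketch, not something Step Functions does for you.

```python
def merge_branch_outputs(branch_outputs: list) -> dict:
    """Flatten the array produced by a Parallel state into one object."""
    merged = {}
    for output in branch_outputs:
        # Later branches overwrite earlier ones if keys collide.
        merged.update(output)
    return merged


parallel_output = [
    {"One": 1, "Two": 2},
    {"Foo": "Bar", "Hello": "World"},
]
merged = merge_branch_outputs(parallel_output)
print(merged)
```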
It is possible.
The Parallel state should look like this:
"MyParallelState": {
"Type": "Parallel",
"InputPath": "$",
"OutputPath": "$",
"ResultPath": "$.ParallelResultPath",
"Next": "SetCartCompleteStatusState",
"Branches": [
{
"StartAt": "UpdateMonthlyUsageState",
"States": {
"UpdateMonthlyUsageState": {
"Type": "Task",
"InputPath": "$",
"OutputPath": "$",
"ResultPath": "$.UpdateMonthlyUsageResultPath",
"Resource": "LambdaARN",
"End": true
}
}
},
{
"StartAt": "QueueTaxInvoiceState",
"States": {
"QueueTaxInvoiceState": {
"Type": "Task",
"InputPath": "$",
"OutputPath": "$",
"ResultPath": "$.QueueTaxInvoiceResultPath",
"Resource": "LambdaARN",
"End": true
}
}
}
]
}
The output of MyParallelState is an array with one entry per branch. It is placed under the ParallelResultPath key and passed to the Next state:
{
"ParallelResultPath": [
{
"UpdateMonthlyUsageResultPath": "Some Output"
},
{
"QueueTaxInvoiceResultPath": "Some Output"
}
]
}
We can use ResultSelector and ResultPath to combine the results into one object.
Suppose we have a Parallel state like this:
{
"StartAt": "ParallelBranch",
"States": {
"ParallelBranch": {
"Type": "Parallel",
"ResultPath": "$",
"InputPath": "$",
"OutputPath": "$",
"ResultSelector": {
"UsersResult.$": "$[1].UsersUpload",
"CustomersResult.$": "$[0].customersDataUpload"
},
"Branches": [
{
"StartAt": "customersDataUpload",
"States": {
"customersDataUpload": {
"Type": "Pass",
"ResultPath": "$.customersDataUpload.Output",
"Result": {
"CompletionStatus": "success",
"CompletionDetails": null
},
"Next": "Wait2"
},
"Wait2": {
"Comment": "A Wait state delays the state machine from continuing for a specified time.",
"Type": "Wait",
"Seconds": 2,
"End": true
}
}
},
{
"StartAt": "UsersUpload",
"States": {
"UsersUpload": {
"Type": "Pass",
"Result": {
"CompletionStatus": "success",
"CompletionDetails": null
},
"ResultPath": "$.UsersUpload.Output",
"Next": "Wait1"
},
"Wait1": {
"Comment": "A Wait state delays the state machine from continuing for a specified time.",
"Type": "Wait",
"Seconds": 1,
"End": true
}
}
}
],
"End": true
}
},
"TimeoutSeconds": 129600,
"Version": "1.0"
}
And the output will be like:
{
"UsersResult": {
"Output": {
"CompletionStatus": "success",
"CompletionDetails": null
}
},
"CustomersResult": {
"Output": {
"CompletionStatus": "success",
"CompletionDetails": null
}
}
}
Your diagram is technically wrong because no state can name multiple states in its Next field, and a state machine cannot list multiple state names in StartAt. Even if it were possible, I don't see why you would want to run two Parallel states instead of one Parallel state containing all the branches you would otherwise split across the two.
This worked for me:
"Transform And Freeze": {
"Type": "Parallel",
"InputPath": "$",
"Branches": [
{
"StartAt": "Transform Status",
"States": {
"Transform Status": {
"Type": "Map",
"ItemsPath": "$",
"MaxConcurrency": 25,
"Iterator": {
"StartAt": "Transform",
"States": {
"Transform": {
"Type": "Task",
"Resource": "${TransformFunction}",
"End": true
}
}
},
"End": true
}
}
},
{
"StartAt": "Freeze Status",
"States": {
"Freeze Status": {
"Type": "Map",
"MaxConcurrency": 25,
"Iterator": {
"StartAt": "Freeze Transactions",
"States": {
"Freeze Transactions": {
"Type": "Task",
"Resource": "${FreezeFunction}",
"End": true
}
}
},
"End": true
}
}
}
],
"ResultPath" : "$.parts",
"Next": "SetParallelOutput",
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.exception",
"Next": "Error Handler"
}
]
},
"SetParallelOutput": {
"Type": "Pass",
"Parameters": {
"foo.$": "$.foo",
"bar.$": "$.bar",
"parts.$": "$.parts[0]"
},
"Next": "Target Type"
},

Loop inside a Step Function

I am trying to call a couple of steps in my step function in a loop but I am unable to get my head around how I need to do this. Here's what I have till now: I need to add another lambda function(GetReviews) which will then call CreateReview, SendNotification in a loop. How would I go about doing this?
I am referring to the "Iterating a Loop Using Lambda" document, which shows it is possible.
Step function definition:
{
"Comment": "Scheduling Engine",
"StartAt": "CreateReview",
"States": {
"CreateReview": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateReview",
"Next": "CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateReviewResult",
"OutputPath": "$"
},
"CreateNotification": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateNotificationResult",
"OutputPath": "$",
"End": true
}
}
}
I am contributing to this answer because I am using a little different approach to be able to loop inside of a step-function without having to rely on a lambda to increment. If someone in the future needs a generic solution, this can be a good reference.
Here is the example with code:
{
"Comment": "A description of my state machine",
"StartAt": "InitVariables",
"States": {
"InitVariables": {
"Type": "Pass",
"Parameters": {
"index": 0,
"incrementor": 1,
"ArrayLength.$": "States.ArrayLength($.inputArray)"
},
"ResultPath": "$.iterator",
"Next": "LoopChoice"
},
"LoopChoice": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.iterator.ArrayLength",
"NumericGreaterThanPath": "$.iterator.index",
"Next": "IncrementVariable"
}
],
"Default": "End"
},
"IncrementVariable": {
"Type": "Pass",
"Parameters": {
"index.$": "States.MathAdd($.iterator.index, $.iterator.incrementor)",
"incrementor": 1,
"ArrayLength.$": "$.iterator.ArrayLength"
},
"ResultPath": "$.iterator",
"Next": "LoopChoice"
},
"End": {
"Type": "Pass",
"End": true
}
} }
This is the base for the loop. I use the States.MathAdd($.iterator.index, $.iterator.incrementor) intrinsic function to add two values, in this case to increment the index by the increment amount defined in the InitVariables state. The length of the array to loop over comes from another intrinsic function, States.ArrayLength($.path.to.array), with the path written without quotes. The array is passed in the input.
To get the value of the array we can use the intrinsic function, States.ArrayGetItem($.inputArray, $.iterator.index).
All the custom logic should be put between the loopChoice state and the IncrementVariable State.
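For readers who find the state machine hard to follow, here is the same loop written as a plain Python sketch. It is only an analogy: run_loop is a hypothetical name, the upper-casing stands in for whatever custom logic sits between LoopChoice and IncrementVariable, and the comments map each step back to the state it imitates.

```python
def run_loop(input_array: list) -> list:
    # InitVariables: index, incrementor and States.ArrayLength($.inputArray)
    iterator = {"index": 0, "incrementor": 1, "ArrayLength": len(input_array)}
    processed = []
    # LoopChoice: keep going while ArrayLength is greater than index
    while iterator["ArrayLength"] > iterator["index"]:
        # Custom logic: States.ArrayGetItem($.inputArray, $.iterator.index)
        item = input_array[iterator["index"]]
        processed.append(item.upper())  # hypothetical work on the item
        # IncrementVariable: States.MathAdd(index, incrementor)
        iterator["index"] += iterator["incrementor"]
    # Default branch of LoopChoice: fall through to the final state
    return processed


print(run_loop(["a", "b", "c"]))
```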
Hope this helps someone in the future.
Sorry for the late reply. You've probably solved it in between, but here you are
So, when looping in Step Functions, I quite simply add a Choice State (see Choice State Rules).
One of your states would need to output whether or not you have finished looping, or the number of items iterated on and the total number of items.
In the first case, it would be something like
{
"Comment": "Scheduling Engine",
"StartAt": "CreateReview",
"States": {
"GetReviews": {
whatever
"Next": "LoopChoiceState"
},
"LoopChoiceState": {
"Type" : "Choice",
"Choices": [
{
"Variable": "$.loopCompleted",
"BooleanEquals": false,
"Next": "GetReviews"
}
],
"Default": "YourEndState"
},
"CreateReview": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateReview",
"Next": "CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateReviewResult",
"OutputPath": "$"
},
"CreateNotification": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateNotificationResult",
"OutputPath": "$",
"End": true
}
}
}
Second case:
{
"Comment": "Scheduling Engine",
"StartAt": "CreateReview",
"States": {
"GetReviews": {
whatever
"Next": "LoopChoiceState"
},
"LoopChoiceState": {
"Type" : "Choice",
"Choices": [
{
"Variable": "$.iteratedItemsCount",
"NumericEqualsPath": "$.totalItemsCount",
"Next": "CreateNotification"
}
],
"Default": "CreateReview"
},
"CreateReview": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateReview",
"Next": "CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateReviewResult",
"OutputPath": "$"
},
"CreateNotification": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateNotificationResult",
"OutputPath": "$",
"End": true
}
}
}
You could also use indexes (current index and last index) instead of the number of items iterated over; it would help you keep track of where you are in creating reviews.
Three options are presented below.
#1 Map: Repeat a set of steps for each element of an array (without a loop, optionally concurrently)
Map State is an alternative to looping when you want to run a set of steps for each element of an input array. Each element runs in parallel by default. Set MaxConcurrency to 1 to mimic the serial execution of @NunoGamaFreire's loop-based solution.
{
"StartAt": "MockArray",
"States": {
"MockArray": {
"Type": "Pass",
"Result": [ { "name": "Zaphod" }, { "name": "Arthur" }, { "name": "Trillian" } ],
"ResultPath": "$.Items",
"Next": "MapState"
},
"MapState": {
"Type": "Map",
"ResultPath": "$.MapResult",
"End": true,
"Iterator": {
"StartAt": "MockWork",
"States": {
"MockWork": {
"Type": "Pass",
"Parameters": {
"output.$": "States.Format('Hello, {}!', $.name)"
},
"OutputPath": "$.output",
"End": true
}
}
},
"ItemsPath": "$.Items"
}
}
}
One output is produced for each element in the MockArray, processed concurrently:
"MapResult": [ "Hello, Zaphod!", "Hello, Arthur!", "Hello, Trillian!" ]
#2 Repeat a set of steps X times (loop without lambda, serially)
This option involves proper looping! Repeat a set of tasks serially X number of times until a Choice State determines that an incrementing counter variable has reached X. Use the new States.MathAdd intrinsic function to increment without a Lambda. This option is suited for custom retry logic or other cases when you may want to break the loop early.
{
"StartAt": "InitializeCounter",
"States": {
"InitializeCounter": {
"Type": "Pass",
"Comment": "Initialize the counter at 0. Move all inputs to Payload.Input",
"Parameters": {
"Counter": 0
},
"Next": "IncrementCounter"
},
"IncrementCounter": {
"Type": "Pass",
"Comment": "Increment the Counter by 1",
"Parameters": {
"Counter.$": "States.MathAdd($.Counter, 1)"
},
"Next": "MockWork"
},
"MockWork": {
"Type": "Pass",
"Comment": "Simulate some work. Optionally break early from the loop with ExitNow: true",
"Result": false,
"ResultPath": "$.ExitNow",
"Next": "Loop?"
},
"Loop?": {
"Type": "Choice",
"Choices": [{ "Or": [
{ "Variable": "$.Counter", "NumericGreaterThanEqualsPath": "$$.Execution.Input.workCount" },
{ "Variable": "$.ExitNow", "BooleanEquals": true } ],
"Next": "Success"
}
],
"Default": "IncrementCounter"
},
"Success": {
"Type": "Succeed"
}
},
"TimeoutSeconds": 3
}
Counter iterates by one for each loop. The loop breaks if a task returns ExitNow: true.
{ "Counter": 4, "ExitNow": false }
#3 Repeat a set of steps X times (without a loop, *concurrently*)
This option is a hybrid of the first two. Like #2, we start with a desired number of iterations from the $.workCount input. Like #1, we map over an array concurrently. This time, though, the state machine creates the array with another intrinsic function, States.ArrayRange(1, $.workCount, 1).
{
"StartAt": "Iterations",
"States": {
"Iterations": {
"Type": "Pass",
"Parameters": {
"Iterations.$": "States.ArrayRange(1, $.workCount, 1)"
},
"Next": "MapState"
},
"MapState": {
"Type": "Map",
"ResultPath": "$.MapResult",
"End": true,
"Iterator": {
"StartAt": "MockWork",
"States": {
"MockWork": {
"Type": "Pass",
"Parameters": {
"output.$": "States.Format('Hello from iteration #{}', States.JsonToString($))"
},
"OutputPath": "$.output",
"End": true
}
}
},
"ItemsPath": "$.Iterations"
}
}
}
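As a rough Python analogy for the definition above: States.ArrayRange includes its end value, so the array for workCount 4 is [1, 2, 3, 4]. In the sketch below, array_range and run_iterations are hypothetical names, only positive steps are handled, and the work is done serially where the Map state would run it concurrently.

```python
def array_range(first: int, last: int, step: int) -> list:
    """Mimic States.ArrayRange for positive steps; the end value is inclusive."""
    return list(range(first, last + 1, step))


def run_iterations(work_count: int) -> dict:
    # "Iterations" Pass state: build the array to map over.
    iterations = array_range(1, work_count, 1)
    # "MapState": one piece of mock work per array element.
    map_result = [f"Hello from iteration #{i}" for i in iterations]
    return {"Iterations": iterations, "MapResult": map_result}


print(run_iterations(4))
```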
Tasks run concurrently, once for each item in Iterations.
"Iterations": [ 1, 2, 3, 4 ],
"MapResult": [ "Hello from iteration #1", "Hello from iteration #2", "Hello from iteration #3", "Hello from iteration #4" ]