How to reuse a state definition across AWS state machines?

I have a state machine like the one below. If it has 1000 messages to notify, it spreads the notifications across 15 minutes.
Now, if I have a TwoHourStateMachine with the exact same state flow but its own set of lambdas, how can I reuse the states so that I don't duplicate the definition?
State machine:
FifteenMinuteStateMachine:
  Type: "AWS::StepFunctions::StateMachine"
  Properties:
    StateMachineName: "FifteenMinuteStateMachine"
    DefinitionString:
      Fn::Sub: |-
        {
          "Comment": "A 15 minute state machine",
          "StartAt": "Initialize",
          "TimeoutSeconds": 900,
          "States": {
            "Initialize": {
              "Type": "Task",
              "Resource": "${InitFifteenMinuteLambda.Arn}",
              "TimeoutSeconds": 15,
              "Retry": [{
                "ErrorEquals": ["States.Timeout", "Lambda.Unknown"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2
              }],
              "Catch": [{
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.errorOutput",
                "Next": "Update Status"
              }],
              "Next": "Notification Job"
            },
            "Notification Job": {
              "Type": "Task",
              "Resource": "${NotificationFifteenMinuteLambda.Arn}",
              "TimeoutSeconds": 15,
              "Retry": [{
                "ErrorEquals": ["States.Timeout", "Lambda.Unknown"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2
              }],
              "Catch": [{
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.errorOutput",
                "Next": "Update Status"
              }],
              "Next": "All Notifications sent?"
            },
            "All Notifications sent?": {
              "Type": "Choice",
              "Choices": [
                {
                  "Variable": "$.status",
                  "StringEquals": "IN_PROGRESS",
                  "Next": "Wait X Seconds"
                },
                {
                  "Variable": "$.status",
                  "StringEquals": "SUCCEEDED",
                  "Next": "Update Status"
                }
              ],
              "Default": "Wait X Seconds"
            },
            "Wait X Seconds": {
              "Type": "Wait",
              "SecondsPath": "$.notificationIntervalInSeconds",
              "Next": "Notification Job"
            },
            "Update Status": {
              "Type": "Task",
              "Resource": "${StatusUpdateFifteenMinuteLambda.Arn}",
              "TimeoutSeconds": 15,
              "End": true
            }
          }
        }
    RoleArn:
      Fn::GetAtt: [StepFunctionExecutionRole, Arn]

Ashok,
If you can frame the problem with one set of lambda functions, I believe the solution is already done for you in your example. Are you required to call different lambda functions? Ideally you could reuse the same lambda functions in both definitions. Unfortunately, you cannot currently use ARN variables at runtime, which I believe is what you're asking for.
Hope this helps!
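One pattern worth considering, if the lambdas must differ per machine, is the list form of Fn::Sub, which takes an explicit substitution map. This is only a sketch under the assumption that the shared template string is factored out (e.g. kept in a nested stack or generated by a build step); the resource names, the two-state definition, and the 7200-second timeout are illustrative, not from the original template:

```yaml
TwoHourStateMachine:
  Type: "AWS::StepFunctions::StateMachine"
  Properties:
    StateMachineName: "TwoHourStateMachine"
    DefinitionString:
      # Fn::Sub with a substitution map: the same template string can be
      # reused by each machine, with its own lambda ARNs and timeout.
      Fn::Sub:
        - |-
          {
            "Comment": "${Comment}",
            "StartAt": "Initialize",
            "TimeoutSeconds": ${MachineTimeout},
            "States": {
              "Initialize": {
                "Type": "Task",
                "Resource": "${InitLambdaArn}",
                "TimeoutSeconds": 15,
                "Next": "Update Status"
              },
              "Update Status": {
                "Type": "Task",
                "Resource": "${StatusUpdateLambdaArn}",
                "TimeoutSeconds": 15,
                "End": true
              }
            }
          }
        - Comment: "A 2 hour state machine"
          MachineTimeout: 7200
          InitLambdaArn: !GetAtt InitTwoHourLambda.Arn
          StatusUpdateLambdaArn: !GetAtt StatusUpdateTwoHourLambda.Arn
    RoleArn:
      Fn::GetAtt: [StepFunctionExecutionRole, Arn]
```

Note that CloudFormation alone still requires each machine to reference the template string somewhere; the substitution map only removes the per-machine ARN duplication from the body of the definition.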

Related

Step function with Redshift cluster

I'm building a step function to orchestrate an ETL pipeline but keep getting the error below. Here is my code, which follows these AWS docs:
https://docs.aws.amazon.com/step-functions/latest/dg/sample-etl-orchestration.html
"GetStateOfCluster": {
"Type": "Task",
"Resource": "lambda",
"TimeoutSeconds": 180,
"HeartbeatSeconds": 60,
"Next": "IsClusterAvailable",
"InputPath": "$",
"ResultPath": "$.clusterStatus"
},
"IsClusterAvailable": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.clusterStatus",
"StringEquals": "available",
"Next": "runetljobs"
},
{
"Variable": "$.clusterStatus",
"StringEquals": "unavailable",
"Next": "ClusterUnavailable"
},
{
"Variable": "$.clusterStatus",
"StringEquals": "paused",
"Next": "InitializeResumeCluster"
},
{
"Variable": "$.clusterStatus",
"StringEquals": "resuming",
"Next": "ClusterWait"
}
],
"Default": "DefaultState"
},
"DefaultState": {
"Type": "Fail",
"Error": "DefaultStateError",
"Cause": "No Matches!"
},
"ClusterUnavailable": {
"Type": "Fail",
"Cause": "Redshift cluster is not available",
"Error": "Error"
},
"ClusterWait": {
"Type": "Wait",
"Seconds": 900,
"Next": "InitializeCheckCluster"
},
"InitializeResumeCluster": {
"Type": "Pass",
"Next": "ResumeCluster",
"Result": {
"input": {
"redshift_cluster_id": "redshift cluster id",
"operation": "resume"
}
}
},
"ResumeCluster": {
"Type": "Task",
"Resource": "lambda",
"TimeoutSeconds": 180,
"HeartbeatSeconds": 60,
"Next": "ClusterWait",
"InputPath": "$",
"ResultPath": "$"
},
It goes directly to the default state even when the cluster status shows 'available'; it should go to the runetljobs stage instead. The example in the doc doesn't have a default state, and if I don't add one, the error is:
"cause": "An error occurred while executing the state 'IsClusterAvailable' (entered at the event id #14). Failed to transition out of the state. The state does not point to a next state."
The state "runetljobs" is not defined in your state definition, so the Choice state has no valid state to transition to.
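One way to resolve the transition error is to define the missing state. A minimal sketch, with a placeholder ARN:

```json
"runetljobs": {
  "Type": "Task",
  "Resource": "<your run-etl-jobs lambda arn>",
  "TimeoutSeconds": 180,
  "End": true
}
```

Separately, if execution still falls through to the default even when the cluster is available, it's worth checking what the lambda actually returns: with "ResultPath": "$.clusterStatus", the entire lambda result is stored at that path, so if the lambda returns an object such as {"Status": "available"} rather than a bare string, the Choice comparisons would need a variable like "$.clusterStatus.Status".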

AWS Step function process output from lambda

I have an AWS step function that needs to call 3 lambdas in sequence, but at the end of each call the step function needs to process the response from the lambda and determine the next lambda to call.
So how can the step function process the response from a lambda? Can you show an example please?
Assuming you have called a lambda function as the first step of your step function, and based on its response you need to decide which lambda should be triggered next:
This workaround is pretty straightforward: return an attribute (e.g. next_state) in the lambda response, create a "Choice" state in the step function, and give this next_state attribute as its input.
A "Choice" state is nothing but an if-else condition, and you can redirect the next step to the expected lambda.
For example, the definition will look something like this:
{
"Comment": "A description of my state machine",
"StartAt": "Lambda Invoke 1",
"States": {
"Lambda Invoke 1": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"OutputPath": "$.Payload",
"Parameters": {
"Payload.$": "$",
"FunctionName": "<your lambda name>"
},
"Retry": [
{
"ErrorEquals": [
"Lambda.ServiceException",
"Lambda.AWSLambdaException",
"Lambda.SdkClientException"
],
"IntervalSeconds": 2,
"MaxAttempts": 6,
"BackoffRate": 2
}
],
"Next": "Choice"
},
"Choice": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.next_state",
"StringEquals": "Lambda Invoke 2",
"Next": "Lambda Invoke 2"
},
{
"Variable": "$.next_state",
"StringEquals": "Lambda Invoke 3",
"Next": "Lambda Invoke 3"
}
],
"Default": "Lambda Invoke 3"
},
"Lambda Invoke 2": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"OutputPath": "$.Payload",
"Parameters": {
"Payload.$": "$",
"FunctionName": "<your lambda name>"
},
"Retry": [
{
"ErrorEquals": [
"Lambda.ServiceException",
"Lambda.AWSLambdaException",
"Lambda.SdkClientException"
],
"IntervalSeconds": 2,
"MaxAttempts": 6,
"BackoffRate": 2
}
],
"End": true
},
"Lambda Invoke 3": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"OutputPath": "$.Payload",
"Parameters": {
"Payload.$": "$",
"FunctionName": "<your lambda name>"
},
"Retry": [
{
"ErrorEquals": [
"Lambda.ServiceException",
"Lambda.AWSLambdaException",
"Lambda.SdkClientException"
],
"IntervalSeconds": 2,
"MaxAttempts": 6,
"BackoffRate": 2
}
],
"End": true
}
}
}
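For instance, the first lambda might return a payload shaped like the one below (the data field is illustrative; only next_state is required by the Choice state). Since the task sets "OutputPath": "$.Payload", this object becomes the input to the Choice state:

```json
{
  "next_state": "Lambda Invoke 2",
  "data": {
    "order_id": 123
  }
}
```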
There are two ways of handling the response of a lambda function from a step function (the examples below use the CDK in Python):
1. Use add_retry and add_catch to handle any exception from the lambda function, e.g.
.start(record_ip_task
.add_retry(errors=["States.TaskFailed"],
interval=core.Duration.seconds(2),
max_attempts=2)
.add_catch(errors=["States.ALL"], handler=notify_failure_job)) \
2. Return a value from the lambda function, e.g. return {"Result": True}; the step function then checks that value to decide the next task, e.g.
.next(
is_block_succeed
.when(step_fn.Condition.boolean_equals('$.Result', False), notify_failure_job)
.otherwise(send_slack_task)
)
Ref: https://dev.to/vumdao/aws-guardduty-combine-with-security-hub-and-slack-17eh
https://github.com/vumdao/aws-guardduty-to-slack

Is it possible to execute Step Concurrency for AWS EMR through AWS STEP FUNCTION without Lambda?

This is my scenario: I'm trying to create 4 AWS EMR clusters, where each cluster will be assigned 2 jobs, so it'll be 4 clusters with 8 jobs orchestrated using a Step Function.
My flow should be:
4 clusters start at the same time, running 8 jobs in parallel, where each cluster runs 2 jobs in parallel.
Recently, AWS launched the StepConcurrencyLevel feature in EMR to run 2 or more jobs in a single cluster simultaneously and reduce the runtime of the cluster; it can be set through the EMR console, the AWS CLI, or an AWS lambda.
But I want to launch 2 or more jobs in parallel in a single cluster using AWS Step Functions' state machine language, in the format referred to here: https://docs.aws.amazon.com/step-functions/latest/dg/connect-emr.html
I've looked through many sites and found solutions for doing it through the console or through boto3 in AWS lambda, but I couldn't find a solution for doing it through Step Functions itself.
Is there any solution for this?
Thanks in advance.
So, I went through a few more sites and found a solution for my issue.
The issue I faced was with StepConcurrencyLevel, which I could set using the AWS console, the AWS CLI, or Python with boto3, but I was expecting a solution using the state machine language, and I found one.
All we have to do is specify StepConcurrencyLevel while creating the cluster in the state machine language, e.g. 2 or 3 (the default is 1), then create the steps under that cluster and run the state machine.
The cluster recognizes the concurrency level that was set and runs the steps accordingly.
My Sample Process:
-> JSON Script of my orchestration
{
"StartAt": "Create_A_Cluster",
"States": {
"Create_A_Cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "WorkflowCluster",
"StepConcurrencyLevel": 2,
"Tags": [
{
"Key": "Description",
"Value": "process"
},
{
"Key": "Name",
"Value": "filename"
},
{
"Key": "Owner",
"Value": "owner"
},
{
"Key": "Project",
"Value": "project"
},
{
"Key": "User",
"Value": "user"
}
],
"VisibleToAllUsers": true,
"ReleaseLabel": "emr-5.28.1",
"Applications": [
{
"Name": "Spark"
}
],
"ServiceRole": "EMR_DefaultRole",
"JobFlowRole": "EMR_EC2_DefaultRole",
"LogUri": "s3://prefix/prefix/log.txt/",
"Instances": {
"KeepJobFlowAliveWhenNoSteps": true,
"InstanceFleets": [
{
"InstanceFleetType": "MASTER",
"TargetSpotCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m4.xlarge",
"BidPriceAsPercentageOfOnDemandPrice": 90
}
]
},
{
"InstanceFleetType": "CORE",
"TargetSpotCapacity": 1,
"InstanceTypeConfigs": [
{
"InstanceType": "m4.xlarge",
"BidPriceAsPercentageOfOnDemandPrice": 90
}
]
}
]
}
},
"Retry": [
{
"ErrorEquals": [
"States.ALL"
],
"IntervalSeconds": 5,
"MaxAttempts": 1,
"BackoffRate": 2.5
}
],
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"Next": "Fail_Cluster"
}
],
"ResultPath": "$.cluster",
"OutputPath": "$.cluster",
"Next": "Add_Steps_Parallel"
},
"Fail_Cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish",
"Parameters": {
"TopicArn": "arn:aws:sns:us-west-2:919490798061:rsac_error_notification",
"Message.$": "$.Cause"
},
"Next": "Terminate_Cluster"
},
"Add_Steps_Parallel": {
"Type": "Parallel",
"Branches": [
{
"StartAt": "Step_One",
"States": {
"Step_One": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
"Parameters": {
"ClusterId.$": "$.ClusterId",
"Step": {
"Name": "The first step",
"ActionOnFailure": "TERMINATE_CLUSTER",
"HadoopJarStep": {
"Jar": "command-runner.jar",
"Args": [
"spark-submit",
"--deploy-mode",
"cluster",
"--master",
"yarn",
"--conf",
"spark.dynamicAllocation.enabled=true",
"--conf",
"maximizeResourceAllocation=true",
"--conf",
"spark.shuffle.service.enabled=true",
"--py-files",
"s3://prefix/prefix/pythonfile.py",
"s3://prefix/prefix/pythonfile.py"
]
}
}
},
"Retry": [
{
"ErrorEquals": [
"States.ALL"
],
"IntervalSeconds": 5,
"MaxAttempts": 1,
"BackoffRate": 2.5
}
],
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.err_mgs",
"Next": "Fail_SNS"
}
],
"ResultPath": "$.step1",
"Next": "Terminate_Cluster_1"
},
"Fail_SNS": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish",
"Parameters": {
"TopicArn": "arn:aws:sns:us-west-2:919490798061:rsac_error_notification",
"Message.$": "$.err_mgs.Cause"
},
"ResultPath": "$.fail_cluster",
"Next": "Terminate_Cluster_1"
},
"Terminate_Cluster_1": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:terminateCluster.sync",
"Parameters": {
"ClusterId.$": "$.ClusterId"
},
"End": true
}
}
},
{
"StartAt": "Step_Two",
"States": {
"Step_Two": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
"Parameters": {
"ClusterId.$": "$.ClusterId",
"Step": {
"Name": "The second step",
"ActionOnFailure": "TERMINATE_CLUSTER",
"HadoopJarStep": {
"Jar": "command-runner.jar",
"Args": [
"spark-submit",
"--deploy-mode",
"cluster",
"--master",
"yarn",
"--conf",
"spark.dynamicAllocation.enabled=true",
"--conf",
"maximizeResourceAllocation=true",
"--conf",
"spark.shuffle.service.enabled=true",
"--py-files",
"s3://prefix/prefix/pythonfile.py",
"s3://prefix/prefix/pythonfile.py"
]
}
}
},
"Retry": [
{
"ErrorEquals": [
"States.ALL"
],
"IntervalSeconds": 5,
"MaxAttempts": 1,
"BackoffRate": 2.5
}
],
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.err_mgs_1",
"Next": "Fail_SNS_1"
}
],
"ResultPath": "$.step2",
"Next": "Terminate_Cluster_2"
},
"Fail_SNS_1": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish",
"Parameters": {
"TopicArn": "arn:aws:sns:us-west-2:919490798061:rsac_error_notification",
"Message.$": "$.err_mgs_1.Cause"
},
"ResultPath": "$.fail_cluster_1",
"Next": "Terminate_Cluster_2"
},
"Terminate_Cluster_2": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:terminateCluster.sync",
"Parameters": {
"ClusterId.$": "$.ClusterId"
},
"End": true
}
}
}
],
"ResultPath": "$.steps",
"Next": "Terminate_Cluster"
},
"Terminate_Cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:terminateCluster.sync",
"Parameters": {
"ClusterId.$": "$.ClusterId"
},
"End": true
}
}
}
In this state machine language script, while creating the cluster I've set StepConcurrencyLevel to 2 and added 2 Spark jobs as steps under the cluster.
When I ran this script in Step Functions, I was able to orchestrate the cluster and run 2 steps concurrently in a single cluster without configuring it directly in the AWS EMR console, through the AWS CLI, or through boto3, and without any help from other services like lambda or the Livy API.
This is how the flow diagram looks:
[Flow diagram: AWS Step Function workflow for concurrent step execution]
To be precise, here is where I inserted StepConcurrencyLevel in the above state machine language:
"Create_A_Cluster": {
"Type": "Task",
"Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
"Parameters": {
"Name": "WorkflowCluster",
"StepConcurrencyLevel": 2,
"Tags": [
{
"Key": "Description",
"Value": "process"
},
Under Create_A_Cluster.
Thank You.

Cannot pass array to next task in AWS StepFunction

I'm working on an AWS StepFunction that gets an array of dates from a Lambda call, then passes it to a Task that should take that array as a parameter to pass into another lambda.
The Get Date Range task works fine and outputs the date array:
{
"rng": [
"2019-05-07",
"2019-05-09"
]
}
...and the array gets passed into the ProcessDateRange task, but I cannot assign the array to the range parameter.
It literally tries to pass this: "$.rng", instead of this:
[
"2019-05-07",
"2019-05-09"
]
Here's the StateMachine:
{
"StartAt": "Try",
"States": {
"Try": {
"Type": "Parallel",
"Branches": [{
"StartAt": "Get Date Range",
"States": {
"Get Date Range": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:get-date-range",
"Parameters": {
"name": "thename",
"date_query": "SELECT date from sch.tbl_dates;",
"database": "the_db"
}
,
"ResultPath": "$.rng",
"TimeoutSeconds": 900,
"Next": "ProcessDateRange"
},
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
"Exit": {
"Type": "Succeed"
}
}
}],
"Catch": [{
"ErrorEquals": ["States.ALL"],
"ResultPath": "$.Error",
"Next": "Failed"
}],
"Next": "Succeeded"
},
"Failed": {
"Type": "Fail",
"Cause": "There was an error. Please review the logs.",
"Error": "error"
},
"Succeeded": {
"Type": "Succeed"
}
}
}
This is because you are using the wrong syntax for Lambda tasks. To specify the input you need to set the InputPath key, for example:
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"InputPath": "$.rng",
"ResultPath": "$",
"Next": "Exit"
},
If you want a parameter to be interpreted as a JSON path instead of a literal string, add ".$" to the end of the parameter name. To modify your example:
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range.$": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
Relevant docs here: https://docs.aws.amazon.com/step-functions/latest/dg/connectors-parameters.html#connectors-parameters-path
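To illustrate the difference: with the "range.$" key, the path is resolved against the state input, so (assuming the date-range output shown in the question) the process-date-range lambda receives:

```json
{
  "range": [
    "2019-05-07",
    "2019-05-09"
  ]
}
```

whereas with the original "range" key it would receive the literal string, i.e. {"range": "$.rng"}.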

How to capture output of a parallel state machine in AWS Step function

I need to run an AWS Step Function with a parallel state that runs, say, two branches. My requirement is to check the final execution status of the parallel state and, if there is any failure, invoke SNS to send out an email. Pretty standard stuff, but for the life of me I can't figure out how to capture the combined error of a parallel state. This sample parallel machine runs:
a "passtask" that is just a simple lambda pass function,
and
a "failtask" that has a 5-second sleep timer and is supposed to fail after 5 seconds.
If I execute this machine, it correctly shows passtask as succeeded, failtask as cancelled, the overall Parallel Task as succeeded (?), the Notify Failure task as cancelled, and the overall execution of the state machine as failed.
I'd like to see passtask as succeeded, failtask as failed, the overall Parallel Task as failed, and the Notify Failure task as succeeded.
{
"Comment": "Parallel Example",
"StartAt": "Parallel Task",
"TimeoutSeconds": 120,
"States": {
"Parallel Task": {
"Type": "Parallel",
"Branches": [
{
"StartAt": "passtask",
"States": {
"passtask": {
"Type": "Task",
"Resource":"arn:xxxxxxxxxxxxxxx:function:passfunction",
"End": true
}
}
},
{
"StartAt": "failtask",
"States": {
"failtask": {
"Type": "Task",
"Resource":"arn: xxxxxxxxxxxxxxx:function:failfunction",
"End": true
}
}
}
],
"ResultPath": "$.status",
"Catch": [
{
"ErrorEquals": ["States.ALL"],
"Next": "Notify Failure"
}
],
"Next": "Notify Success"
},
"Notify Failure": {
"Type": "Pass",
"InputPath": "$.input.Cause",
"End": true
},
"Notify Success": {
"Type": "Pass",
"Result": "This is a fallback from a task success",
"End": true
}
}
}
From your requirement, "My requirement is to check the final execution status of the parallel machine and if there is any failure, invoke an SNS service to send out an email", I understand that the "failtask" is just for debugging purposes and in the future it won't necessarily fail.
So the problem is: the moment Step Functions detects a failure in a branch, all other branches are terminated and their outputs discarded; only the failed branch's output is used. If you want to preserve the output of each branch and check whether a failure occurred, you need to handle the errors inside each branch rather than report the whole branch as failed.
Additionally, you need to add an output field to each branch that says whether there was a failure or not (a Choice state will give an error if a field does not exist). Also remember that the output of a Parallel state is an array with the output of each branch.
For example, this state machine should let each branch finish execution and handle the errors correctly:
{
"Comment": "Parallel Example",
"StartAt": "Parallel Task",
"TimeoutSeconds": 120,
"States": {
"Parallel Task": {
"Type": "Parallel",
"Branches": [{
"StartAt": "passtask",
"States": {
"passtask": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:XXXXXXXXXXXXXXXXX",
"Next": "SuccessBranch1",
"Catch": [{
"ErrorEquals": ["States.ALL"],
"Next": "FailBranch1"
}]
},
"SuccessBranch1": {
"Type": "Pass",
"Result": {
"Error": false
},
"ResultPath": "$.Status",
"End": true
},
"FailBranch1": {
"Type": "Pass",
"Result": {
"Error": true
},
"ResultPath": "$.Status",
"End": true
}
}
},
{
"StartAt": "failtask",
"States": {
"failtask": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:XXXXXXXXXXXXXXXXX",
"Next": "SuccessBranch2",
"Catch": [{
"ErrorEquals": ["States.ALL"],
"Next": "FailBranch2"
}]
},
"SuccessBranch2": {
"Type": "Pass",
"Result": {
"Error": false
},
"ResultPath": "$.Status",
"End": true
},
"FailBranch2": {
"Type": "Pass",
"Result": {
"Error": true
},
"ResultPath": "$.Status",
"End": true
}
}
}
],
"ResultPath": "$.ParralelOutput",
"Catch": [{
"Comment": "This catch should never catch any errors, as the error handling is done in the individual Branches",
"ErrorEquals": ["States.ALL"],
"ResultPath": "$.ParralelOutput",
"Next": "ChoiceStateX"
}],
"Next": "ChoiceStateX"
},
"ChoiceStateX": {
"Type": "Choice",
"Choices": [{
"Or": [{
"Variable": "$.ParralelOutput[0].Status.Error",
"BooleanEquals": true
},
{
"Variable": "$.ParralelOutput[1].Status.Error",
"BooleanEquals": true
}
],
"Next": "Notify Failure"
}],
"Default": "Notify Success"
},
"Notify Failure": {
"Type": "Pass",
"End": true
},
"Notify Success": {
"Type": "Pass",
"Result": "This is a fallback from a task success",
"End": true
}
}
}
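To make this concrete: if passtask succeeds and failtask fails, and assuming the execution started with an empty input {}, the input to ChoiceStateX would look roughly like the following (the task output and error fields are illustrative). ChoiceStateX then matches $.ParralelOutput[1].Status.Error and routes to Notify Failure:

```json
{
  "ParralelOutput": [
    {
      "taskResult": "pass output",
      "Status": { "Error": false }
    },
    {
      "Error": "States.TaskFailed",
      "Cause": "failfunction raised an exception",
      "Status": { "Error": true }
    }
  ]
}
```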
For a more general (although more complex) version of the above, as asked by Nisman in the comments: instead of hardcoding the Choice state to check every branch, we can add a Pass state with some JSONPath tricks to check for conditions not currently possible with a Choice state alone.
Inside this Pass state we use Parameters to restructure our data so that, when we apply a JSONPath filter expression to it via OutputPath, we are left with an array of either 2 elements (if no branch failed) or 3 elements (if some branch failed), where the first element always contains the original input data and the second/third contains at least one key with the same name to be used by the Choice state. Here's the state machine JSON:
{
"Comment": "Parallel Example",
"StartAt": "Parallel Task",
"States": {
"Parallel Task": {
"Type": "Parallel",
"Branches": [
{
"StartAt": "passtask",
"States": {
"passtask": {
"Type": "Task",
"Resource": "<TASK RESOURCE>",
"End": true,
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error-info",
"Next": "FailBranch1"
}
]
},
"FailBranch1": {
"Type": "Pass",
"Parameters": {
"BranchOutput.$": "$",
"BranchError": true
},
"End": true
}
}
},
{
"StartAt": "failtask",
"States": {
"failtask": {
"Type": "Task",
"Resource": "<TASK RESOURCE>",
"End": true,
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error-info",
"Next": "FailBranch2"
}
]
},
"FailBranch2": {
"Type": "Pass",
"Parameters": {
"BranchOutput.$": "$",
"BranchError": true
},
"End": true
}
}
}
],
"ResultPath": "$.ParralelOutput",
"Next": "Pre-Process"
},
"Pre-Process": {
"Type": "Pass",
"Parameters": {
"OrderedArray": [
{
"OriginalData": {
"Input.$": "$",
"ShouldFilterData": false
}
},
{
"ValuesToCheck": {
"ListBranchErrors.$": "$.ParralelOutput[?(@.BranchError==true)].BranchError",
"BranchFailures": true
}
},
{
"DefaultAlwaysFalse": {
"ShouldFilterData": false,
"BranchFailures": false
}
}
]
},
"OutputPath": "$..[?(@.ShouldFilterData == false || @.ListBranchErrors[0] == true)]",
"Next": "ChoiceStateX"
},
"ChoiceStateX": {
"Type": "Choice",
"OutputPath": "$.[0].Input",
"Choices": [
{
"Variable": "$[1].BranchFailures",
"BooleanEquals": true,
"Next": "NotifyFailure"
},
{
"Variable": "$[1].BranchFailures",
"BooleanEquals": false,
"Next": "NotifySuccess"
}
],
"Default": "NotifyFailure"
},
"NotifyFailure": {
"Type": "Pass",
"End": true
},
"NotifySuccess": {
"Type": "Pass",
"Result": "This is a fallback from a task success",
"End": true
}
}
}
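To make the trick concrete, here is a sketch of what the Pre-Process state would emit, assuming failtask failed, passtask succeeded, and the execution started with input {} (the branch output fields are illustrative). ChoiceStateX then reads $[1].BranchFailures, finds true, and routes to NotifyFailure; had no branch failed, the ValuesToCheck element would be filtered out entirely, leaving a two-element array where $[1].BranchFailures comes from DefaultAlwaysFalse and is false:

```json
[
  {
    "Input": {
      "ParralelOutput": [
        { "taskResult": "pass output" },
        {
          "BranchOutput": {
            "error-info": { "Error": "States.TaskFailed", "Cause": "failfunction raised an exception" }
          },
          "BranchError": true
        }
      ]
    },
    "ShouldFilterData": false
  },
  {
    "ListBranchErrors": [true],
    "BranchFailures": true
  },
  {
    "ShouldFilterData": false,
    "BranchFailures": false
  }
]
```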