Send Input as Output on error for AWS Step Function

I'd like my state machine to continue execution even if a state fails early on. Most of my Lambda functions output the same thing they take as input, so I'd like to pass the input of the Lambda that encountered the error on as output to the next state. I tried
{
"DeleteStuff": {
"Type": "Task",
"Resource": "MY_ARN",
"Catch": [ {
"ErrorEquals": ["States.ALL"],
"ResultPath": "$InputPath",
"Next": "FailedState"
}],
"Next": "checkStuff"
}, ...
without any luck. Has anyone done this, or can anyone offer some assistance?
Thanks!

So the solution is to set ResultPath to null. Changing my state machine to
{
"DeleteStuff": {
"Type": "Task",
"Resource": "MY_ARN",
"Catch": [ {
"ErrorEquals": ["States.ALL"],
"ResultPath": null,
"Next": "FailedState"
}],
"Next": "checkStuff"
}, ...
gave me the desired behaviour.

If you set ResultPath to a new path, the error is added to the input under that path:
{
"ErrorEquals": ["States.ALL"],
"ResultPath": "$.error",
"Next": "Catch All Error Handler"
}
so if your input was:
{
"data_a" : "aaa",
"data_b" : "bbb"
}
output will be:
{
"data_a" : "aaa",
"data_b" : "bbb",
"error" : "<error description>"
}
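The two answers above can be checked with a small local simulation. This is only a sketch of how ResultPath combines a state's input with a caught error: `apply_result_path` is a hypothetical helper (not an AWS API), it handles only null, `$`, and single-level `$.key` paths, and it assumes the caught error is an object with `Error` and `Cause` fields.

```python
import copy

def apply_result_path(state_input, result, result_path):
    """Simplified model of ResultPath (hypothetical helper, not an AWS API)."""
    if result_path is None:
        # "ResultPath": null discards the result and passes the input through.
        return copy.deepcopy(state_input)
    if result_path == "$":
        # "$" (the default) replaces the whole input with the result.
        return copy.deepcopy(result)
    if not result_path.startswith("$.") or "." in result_path[2:]:
        raise ValueError("this sketch only supports null, '$' and '$.key'")
    # "$.key" keeps the input and stores the result under that key.
    output = copy.deepcopy(state_input)
    output[result_path[2:]] = copy.deepcopy(result)
    return output

state_input = {"data_a": "aaa", "data_b": "bbb"}
caught_error = {"Error": "States.TaskFailed", "Cause": "<error description>"}

passed_through = apply_result_path(state_input, caught_error, None)
merged = apply_result_path(state_input, caught_error, "$.error")
```

With null the next state sees exactly the original input; with `$.error` it sees the original input plus the error details, which is useful when a later state needs to know that something failed.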

Related

How to catch exception from lambda in state machine?

I am using state machines and raising a custom error, but my state machine is not able to catch that exception.
Below are the Lambda snippet and the state machine definition. Instead of going to the Catch block and the error task, it throws the following error at the ResultSelector attribute:
the JSONPath '$.Payload.tables' specified for the field 'tables.$' could not be found in the input
How can I ignore the ResultSelector attribute when an exception occurs?
My lambda code snippet -
if schema is None:
    raise Exception("schema is not configured")
My statemachine -
"ResultSelector": {
"tables.$": "$.Payload.tables"
},
"ResultPath": "$.export_tables",
"Catch": [
{
"ErrorEquals": [
"States.Runtime"
],
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error",
"Next": "error state"
}
],
"Next": "Export Tables"
},
"error state": {
"Type": "Fail"
},
"Export Tables": {
"Type": "Map",
"End": true,
"ItemsPath": "$.export.tables",
"Parameters": {
"product.$": "$.product",
"table_export_def.$": "$$.Map.Item.Value"
},
You can catch custom errors by specifying the ErrorEquals and Next attributes with a fallback step, as described in the AWS docs.
For example, if you want to catch a runtime error:
"Catch": [ {
"ErrorEquals": ["States.Runtime"],
"Next": "CustomErrorFallback"
},
...
]
Specify your custom fallback step to handle the error with a custom error message:
"CustomErrorFallback": {
"Type": "Pass",
"Result": "schema is not configured",
"End": true
},
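For a Python Lambda, the error name that Step Functions compares against ErrorEquals is normally the name of the raised exception class, so a dedicated class gives the Catch block something specific to match. A minimal sketch, assuming a hypothetical `SchemaNotConfiguredError` class and a handler that reads the schema from the event (neither is from the original post):

```python
class SchemaNotConfiguredError(Exception):
    """Hypothetical custom error; its class name is what ErrorEquals would match."""

def lambda_handler(event, context):
    schema = event.get("schema")  # assumption: the schema arrives in the event
    if schema is None:
        raise SchemaNotConfiguredError("schema is not configured")
    return {"tables": event.get("tables", [])}

# Catch clause matching the class name above, expressed as a Python dict.
catch_clause = {
    "ErrorEquals": [SchemaNotConfiguredError.__name__],
    "ResultPath": "$.error",
    "Next": "CustomErrorFallback",
}
```

Naming the error explicitly keeps the fallback tied to this one failure instead of every possible error.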

AWS StepFunction: Passing callback token as output from catch

I am building a step function that publishes to SNS and waits for a callback. If the state times out, I want the $$.Task.Token to be passed as part of the output to the next state. I've been reading the documentation and looking at posts, but I haven't found anything that seems to do this. Is this possible?
My (simplified) state machine definition looks something like the following. I want the Timeout state to have access to the callback token from the SNS Publish state (I am not very picky about structure/naming).
{
"StartAt": "SNS Publish",
"States": {
"SNS Publish": {
"Type": "Task",
"Resource": "arn:aws:states:::sns:publish.waitForTaskToken",
"Parameters": {
"TopicArn": "arn:aws:sns:XXXXXX:XXXXXX:XXXXXX",
"Message.$": "$$.Task.Token"
},
"End": true,
"TimeoutSeconds": 1,
"Catch": [
{
"ErrorEquals": [
"States.Timeout"
],
"ResultPath": "$.error",
"Next": "Timeout"
}
]
},
"Timeout": {
"Type": "Pass",
"End": true,
"Parameters": {
"Identifier.$": "$.TaskToken",
"IsSuccess": false
}
}
}
}

Failing AWS Step Functions after Catching

I have 3 stages in my AWS Step Function:
Stage 1 - Lambda
Stage 2 - AWS Batch
Stage 3 - AWS Batch (Mandatory Cleanup)
Everything works fine in that if Stage 1 fails then it moves to the Cleanup stage. However, since the cleanup stage always passes, the Step Function's final result is always a Pass, whereas if Stage 1 or 2 fails, I need the Cleanup to be performed, yet the Step Function final result should be a fail.
Options investigated:
One way to solve this is to maintain a flag in a cache whether there is an error, but was wondering if there is an inbuilt way for this.
Another option is to use the Result Path to check for an error but I am not sure how to access this result from an AWS Batch.
Appreciate any advice on this, thanks.
I have added the following Catch block in Stage 1 and 2:
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"Next": "Cleanup"
}
]
The Cleanup stage is as follows:
"Cleanup": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "arn:aws:batch:<region>:<account>:job-definition/MyCleanupJob",
"JobName": "cleanup",
"JobQueue": "arn:aws:batch:<region>:<account>:job-queue/MyCleanupQueue",
"ContainerOverrides": {
"Command": [
"java",
"-jar",
"cleanup.jar" ############ need to specify if an error occurred as a command line parameter ###########
]
}
},
"End": true
}
I used the mechanism below; credit to #LRutten for pointing me down this path.
For every stage that can succeed, set a ResultPath so the response is appended to the state; otherwise the previous results are overwritten.
On an exception, write the error to the ResultPath of the Catch block.
Use a Choice state to decide whether the step function should fail, based on the presence of the error element.
Here is the resulting definition:
"MyLambda": {
"Type": "Task",
"Resource": "arn:aws:lambda:<region>:<account>:function:MyLambda",
"ResultPath": "$.mylambda", #### All results from the lambda are added to "mylambda" in the JSON
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error", #### If an error occurs it is appended to the result path as an "error" element
"Next": "Cleanup"
}
],
"Next": "MyBatch"
},
"MyBatch": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "arn:aws:batch:<region>:<account>:job-definition/MyBatchJob",
"JobName": "cleanup",
"JobQueue": "arn:aws:batch:<region>:<account>:job-queue/MyBatchQueue",
"ContainerOverrides": {
"Command": [
"java",
"-jar",
"mybatch.jar"
]
}
},
"ResultPath": "$.mybatch",
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error",
"Next": "Cleanup"
}
],
"Next": "Cleanup"
},
"Cleanup": {
"Type": "Task",
"ResultPath": "$.cleanup",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "arn:aws:batch:<region>:<account>:job-definition/MyCleanupJob",
"JobName": "cleanup",
"JobQueue": "arn:aws:batch:<region>:<account>:job-queue/MyCleanupQueue",
"ContainerOverrides": {
"Command": [
"java",
"-jar",
"cleanup.jar"
]
}
},
"Next": "Should Fail"
},
"Should Fail" :{
"Type" : "Choice",
"Choices" : [
{
"Variable" : "$.error", #### If an error element is present it means it is a Failure
"IsPresent": true,
"Next" : "Fail"
}
],
"Default" : "Pass"
},
"Fail" : {
"Type" : "Fail",
"Cause": "Step function failed"
},
"Pass" : {
"Type" : "Pass",
"Result": "Step function passed",
"End" : true
}
}
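The control flow of this definition can be modelled locally. The sketch below is not Step Functions code: `run_workflow` and the three stage functions are hypothetical stand-ins for the Lambda, the Batch job, and the cleanup job. It only illustrates the pattern of recording the error, always cleaning up, and then choosing the final status.

```python
def run_workflow(stages, cleanup):
    """Local model: run stages in order, always clean up, fail if any stage errored."""
    state = {}
    for name, stage in stages:
        try:
            # Like "ResultPath": "$.<name>", keep each result under its own key.
            state[name] = stage(state)
        except Exception as exc:
            # Like the Catch block: record the error and jump straight to cleanup.
            state["error"] = {"Error": type(exc).__name__, "Cause": str(exc)}
            break
    state["cleanup"] = cleanup(state)
    # Like the Choice state: the presence of "error" decides the final status.
    status = "FAILED" if "error" in state else "SUCCEEDED"
    return status, state

def my_lambda(state):
    return {"ok": True}

def my_batch(state):
    raise RuntimeError("batch job failed")

def cleanup(state):
    return {"cleaned": True}

status, final_state = run_workflow(
    [("mylambda", my_lambda), ("mybatch", my_batch)], cleanup
)
```

Because each result lives under its own key, the error recorded by a failing stage survives the cleanup step and is still there when the final decision is made.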

AWS Step Functions does not catch error when Lambda function returns an error

I have been struggling with AWS Step Functions for hours now. The use case is quite simple as I want to get gradually familiar with AWS Step Functions. However, I think I do not understand how they handle errors that come back from a failed lambda function.
Here is the corresponding code:
{
"Comment": "A simple AWS Step Functions for managing users with in the context of the AWS Training Initiative at AXA.",
"StartAt": "Process-All-Deletion",
"States": {
"Process-All-Deletion": {
"Type": "Map",
"InputPath": "$",
"ItemsPath": "$.Users",
"MaxConcurrency": 0,
"Iterator": {
"StartAt": "DeleteAccessKeys",
"States": {
"DeleteAccessKeys": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-central-1:###:function:listUserAccessKeys",
"Next": "DetachUserPolicy",
"Catch": [
{
"ErrorEquals": ["NoSuchEntityException"],
"ResultPath": "$.DeleteAccessKeysError",
"Next": "CatchDeleteAccessKeysError"
}
]
},
"DetachUserPolicy": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-central-1:###:function:detachUserPolicy",
"Next": "DeleteIamUser",
"Catch": [
{
"ErrorEquals": ["States.TaskFailed"],
"ResultPath": "$.ErrorDescription",
"Next": "CatchDeleteUserPolicyError"
}
]
},
"DeleteIamUser": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-central-1:###:function:deleteIamUser",
"End": true,
"Catch": [
{
"ErrorEquals": ["States.TaskFailed"],
"ResultPath": "$.ErrorDescription",
"Next": "CatchDeleteIamUserError"
}
]
},
"CatchDeleteIamUserError": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-central-1:###:function:errorHandler",
"End": true
},
"CatchDeleteAccessKeysError": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-central-1:###:function:errorHandler",
"Next": "DetachUserPolicy"
},
"CatchDeleteUserPolicyError": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-central-1:###:function:errorHandler",
"Next": "DeleteIamUser"
}
}
},
"ResultPath": "$.Result",
"End": true
}
}
}
So basically the state machine should catch the error properly, and the status of 'DeleteAccessKeys' should turn orange ('Caught Error'). Instead it turns green.
This is the code of my lambda function:
import boto3
import botocore
print('Loading deleteUserAccessKeys function...')
def deleteUserAccessKeys(message, context):
    # Get IAM client
    client = boto3.client('iam')
    item = message['Name']
    try:
        # List all keys associated with the user
        result = client.list_access_keys(UserName=item)
        accessKeyIds = [element['AccessKeyId'] for element in result['AccessKeyMetadata']]
        # Exit if there are no access keys
        if not accessKeyIds:
            return message
        # Delete all keys associated with the user
        for element in accessKeyIds:
            client.delete_access_key(
                UserName=item,
                AccessKeyId=element
            )
        message['DeletedAccessKeys'] = len(accessKeyIds)
        print(message)
        return message
    except botocore.exceptions.ClientError as error:
        print(error.response)
        if error.response['Error']['Code'] == 'NoSuchEntity':
            print('Entity not found exception')
            raise error
        else:
            raise Exception("Failed! Check the error!")
What might be the issue or what did I wrongly configure?
You need to check the exact exception name returned from your Lambda. Check the Lambda's logs to confirm this.
If you want to quickly check whether that's the problem, change the ErrorEquals value of the Catch under DeleteAccessKeys to States.ALL (the name is case-sensitive). It is a wildcard that matches any known error name.
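To see why the exact name matters, here is a deliberately simplified model of how a catcher is selected. `find_catcher` is a hypothetical helper, not an AWS API; it assumes an exact, case-sensitive comparison plus the States.ALL wildcard, and it does not model other reserved error names.

```python
def find_catcher(catchers, error_name):
    """Return the first catcher whose ErrorEquals matches error_name, else None.

    Simplified model: exact, case-sensitive comparison plus the States.ALL
    wildcard. Other reserved names with special behaviour are not modelled.
    """
    for catcher in catchers:
        names = catcher["ErrorEquals"]
        if "States.ALL" in names or error_name in names:
            return catcher
    return None

catchers = [
    {"ErrorEquals": ["NoSuchEntityException"], "Next": "CatchDeleteAccessKeysError"},
]
hit = find_catcher(catchers, "NoSuchEntityException")
miss = find_catcher(catchers, "NoSuchEntity")  # the error code, not the exception name

catch_all = [{"ErrorEquals": ["States.ALL"], "Next": "CatchDeleteAccessKeysError"}]
fallback = find_catcher(catch_all, "SomeUnexpectedError")
```

If the name reported by the Lambda differs from the one in ErrorEquals by even one character, no catcher matches and the state fails instead of moving to the handler.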
I found the reason myself.
I removed the "Map" type and tried it with a single input, without any iteration.
{
"Comment": "A simple AWS Step Functions for managing users with in the context of the AWS Training Initiative at AXA.",
"StartAt": "DeleteAccessKeys",
"States": {
"DeleteAccessKeys": {
"Type": "Task",
"InputPath": "$.Users",
"Resource": "arn:aws:lambda:eu-central-1:####:function:listUserAccessKeys",
"End": true,
"Catch": [
{
"ErrorEquals": [
"NoSuchEntityException"
],
"ResultPath": "$.DeleteAccessKeysError",
"Next": "CatchDeleteAccessKeysError"
}
]
},
"CatchDeleteAccessKeysError": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-central-1:####:function:errorHandler",
"End": true
}
}
}
In the web GUI it is then correctly displayed as "Caught Error" if, e.g., the entity does not exist (NoSuchEntityException).
If you iterate over input values as in the example in my first post, caught errors are always displayed as "Succeeded".

Cannot pass array to next task in AWS StepFunction

I am working on an AWS Step Function that gets an array of dates from a Lambda call, then passes it to a Task that should take that array as a parameter to pass into a Lambda.
The Get Date Range task works fine and outputs the date array:
{
"rng": [
"2019-05-07",
"2019-05-09"
]
}
...and the array gets passed into the ProcessDateRange task, but I cannot assign the array to the range parameter.
It literally tries to pass this: "$.rng" instead of this:
[
"2019-05-07",
"2019-05-09"
]
Here's the StateMachine:
{
"StartAt": "Try",
"States": {
"Try": {
"Type": "Parallel",
"Branches": [{
"StartAt": "Get Date Range",
"States": {
"Get Date Range": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:get-date-range",
"Parameters": {
"name": "thename",
"date_query": "SELECT date from sch.tbl_dates;",
"database": "the_db"
}
,
"ResultPath": "$.rng",
"TimeoutSeconds": 900,
"Next": "ProcessDateRange"
},
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
"Exit": {
"Type": "Succeed"
}
}
}],
"Catch": [{
"ErrorEquals": ["States.ALL"],
"ResultPath": "$.Error",
"Next": "Failed"
}],
"Next": "Succeeded"
},
"Failed": {
"Type": "Fail",
"Cause": "There was an error. Please review the logs.",
"Error": "error"
},
"Succeeded": {
"Type": "Succeed"
}
}
}
This is because you are using the wrong syntax for Lambda tasks. To specify the input you need to set the InputPath key, for example:
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"InputPath": "$.rng",
"ResultPath": "$",
"Next": "Exit"
},
If you want a parameter to be interpreted as a JSON path instead of a literal string, add ".$" to the end of the parameter name. To modify your example:
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range.$": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
Relevant docs here: https://docs.aws.amazon.com/step-functions/latest/dg/connectors-parameters.html#connectors-parameters-path
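The difference between `"range"` and `"range.$"` can be illustrated with a small local model of how Parameters are resolved. This is a sketch only: `resolve_path` and `resolve_parameters` are hypothetical helpers, and the path support is limited to `$` and dotted keys such as `$.a.b` (no filters, indexes, or context-object paths).

```python
def resolve_path(path, data):
    """Resolve a simple JSONPath such as "$" or "$.a.b" (no filters or indexes)."""
    if path == "$":
        return data
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    for key in path[2:].split("."):
        data = data[key]
    return data

def resolve_parameters(parameters, state_input):
    """Model of Parameters: keys ending in ".$" are paths, others are literals."""
    resolved = {}
    for key, value in parameters.items():
        if isinstance(value, dict):
            # Nested objects are resolved recursively.
            resolved[key] = resolve_parameters(value, state_input)
        elif key.endswith(".$"):
            # Strip the ".$" suffix and replace the value with the selected data.
            resolved[key[:-2]] = resolve_path(value, state_input)
        else:
            resolved[key] = value
    return resolved

state_input = {"rng": ["2019-05-07", "2019-05-09"]}
literal = resolve_parameters({"range": "$.rng"}, state_input)
by_path = resolve_parameters({"range.$": "$.rng"}, state_input)
```

Here `literal` keeps the string `"$.rng"` unchanged, which is the behaviour described in the question, while `by_path` carries the date array under the key `range`.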