How does the MaxConcurrency attribute work for the Map Task in AWS Step Functions?

Update: Creating a step function from the Map State step template and running that also throws an error. This is strong evidence that the MaxConcurrency attribute together with the Parameters value is not working.
I am not able to use the MaxConcurrency attribute successfully in the step function definition.
This can be demonstrated by using the example provided in the documentation for the Map Task (new as of 18 Sept 2019):
{
"StartAt": "ExampleMapState",
"States": {
"ExampleMapState": {
"Type": "Map",
"MaxConcurrency": 2,
"Parameters": {
"ContextIndex.$": "$$.Map.Item.Index",
"ContextValue.$": "$$.Map.Item.Value"
},
"Iterator": {
"StartAt": "TestPass",
"States": {
"TestPass": {
"Type": "Pass",
"End": true
}
}
},
"End": true
}
}
}
By executing the step function with the following input:
[
{
"who": "bob"
},
{
"who": "meg"
},
{
"who": "joe"
}
]
We can observe in the Execution event history that we get:
ExecutionStarted
MapStateEntered
MapStateStarted
MapIterationStarted (index 0)
MapIterationStarted (index 1)
PassStateEntered (index 0)
PassStateExited (index 0)
MapIterationSucceeded (index 0)
ExecutionFailed
The step function fails.
The ExecutionFailed step has the following output (execution id omitted):
{
"error": "States.Runtime",
"cause": "Internal Error (omitted)"
}
Trying to catch the error with a Catch step has no effect.
What am I doing wrong here? Is this a bug?

Response to a private ticket submitted to AWS this morning:
Thank you for contacting AWS Premium Support. My name is Akanksha and
I will be assisting you with this case.
I understand that you have been working with the new Map state feature
of step functions and have noticed that when we use Parameters along
with MaxConcurrency set to lower value than the number of iterations
(with only first iteration successful) it fails with ‘States.Runtime’
and looks like a bug with the functionality.
Thank you for providing the details. It helped me during
troubleshooting. In order to confirm the behavior, I used the below
state machine example with Pass:
{
"StartAt": "Map State",
"TimeoutSeconds": 3600,
"States": {
"Map State": {
"Type": "Map",
"Parameters": {
"ContextValue.$": "$$.Map.Item.Value"
},
"MaxConcurrency": 1,
"Iterator": {
"StartAt": "Run Task",
"States": {
"Run Task": {
"Type": "Pass",
"End": true
}
}
},
"Next": "Final State"
},
"Final State": {
"Type": "Pass",
"End": true
}
} }
I tested with multiple input lists and MaxConcurrency values and below
are my observations:
Input list size 4: MaxConcurrency 1/2/3 fails; MaxConcurrency 0/4/5 or above works
Input list size 3: MaxConcurrency 1/2 fails; MaxConcurrency 0/3/4 or above works
Similarly, I performed tests by removing the parameters from state machine as well and could see that it works as expected with different
MaxConcurrency values.
I also tested the same by changing the Task type of “Pass” with “Lambda” and observed the same behavior.
Hence, I can confirm that the state machine fails when we have
parameters in the code and specify MaxConcurrency value as anything
other than zero or the number greater than or equal to the list size.
After doing some research regarding this behavior to check if this is
intended, I could not find much information regarding the same as this
is a new feature. So, I will be reaching out to the internal team with
all the details and the example state machine that you have provided.
Thank you for bringing this to our notice. I will get back to you as
soon as I have an update from the internal team. Please be assured
that I will regularly follow up with the team and work with them to
investigate further.
Meanwhile, if you have any other queries or concerns, please do let me
know.
Have a great day ahead!
I will update here when I get more information.
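Going by the observations quoted above, the documentation example was reported to run when MaxConcurrency is 0 (or at least the input size). A minimal sketch of that variant follows. It is the question's definition with only MaxConcurrency changed, and the Comment text is added here for illustration. Note that 0 places no limit on concurrency, so this sidesteps the failure rather than throttling iterations, and it reflects the behaviour reported at the time of the ticket.

```json
{
  "Comment": "Sketch only: the question's example with MaxConcurrency set to 0, which the support reply reported as working",
  "StartAt": "ExampleMapState",
  "States": {
    "ExampleMapState": {
      "Type": "Map",
      "MaxConcurrency": 0,
      "Parameters": {
        "ContextIndex.$": "$$.Map.Item.Index",
        "ContextValue.$": "$$.Map.Item.Value"
      },
      "Iterator": {
        "StartAt": "TestPass",
        "States": {
          "TestPass": {
            "Type": "Pass",
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
```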

Related

Step Functions - Access State from previous Map Iteration

How can I get the results from previous Map Iterations in the next iteration when using MaxConcurrency: 1 in Amazon Step Functions?
Here's an example of the code I have
{
"StartAt": "UploadUsers",
"States": {
"UploadUsers": {
"Type": "Map",
"MaxConcurrency": 1,
"ItemsPath": "$.data.users",
"Parameters": {
"data.$": "$$.Map.Item.Value.data",
"friends.$": "$.?????? Get created users ids"
},
"Iterator": {
"StartAt": "UploadUser",
"States": {
"UploadUser": {
"End": true,
"Parameters": {
"FunctionName": "${FnUploadUser}",
"Payload": {
"data.$": "$.user_data",
"friends.$": "$.??????"
}
},
"Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
"ResultPath": "$.data. ???",
"Type": "Task"
}
}
},
"End": true,
"ResultPath": "$.data.UploadUsers",
"ResultSelector": {
"result.$": "$"
}
}
}
}
Suppose FnUploadUser is a lambda that returns the id of the created user.
And I want to get the ids of the previously created users and use that value for the next user I'm about to create.
You can't. Map State iterations don't share state. Two workarounds:
(1) Manage the shared state externally: Each Map iteration writes and reads from, say, a DynamoDB table.
(2) Refactor to a "for" loop and keep the shared state in the execution output.
Instead of using Map, insert a Choice State (after UploadUser) that checks for a "done" condition. If "done", finish, else loop back to UploadUser.
1. UploadUser accepts the user_data array as input. It appends its output to, say, the uploaded output array.
2. Each UploadUser iteration identifies the next user_data item by comparing it to the uploaded array. The iteration that processes the last item can also output done: true to signal to Choice that work is done.
3. The Choice State loops back to UploadUser while there are more to process (i.e. while done is not present).
There are other ways to build steps 2-3. For instance, you could add next_item and total_items keys on the output to keep track of progress. The important point is that Choice loops until an exit condition is met.
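The loop described above could be sketched roughly as follows. This is an illustration, not the asker's definition: the state names CheckDone and Finished are hypothetical, it uses a plain lambda:invoke instead of the question's waitForTaskToken variant, and it assumes the Lambda returns the whole working state (user_data, uploaded, and done: true only after the last item). The execution would be started with an input such as user_data plus an empty uploaded array.

```json
{
  "Comment": "Sketch only: CheckDone, Finished and the Lambda's return shape are assumptions",
  "StartAt": "UploadUser",
  "States": {
    "UploadUser": {
      "Comment": "Assumes the Lambda returns user_data and uploaded, and adds done: true after the last item",
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "${FnUploadUser}",
        "Payload.$": "$"
      },
      "OutputPath": "$.Payload",
      "Next": "CheckDone"
    },
    "CheckDone": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.done",
          "IsPresent": true,
          "Next": "Finished"
        }
      ],
      "Default": "UploadUser"
    },
    "Finished": {
      "Type": "Succeed"
    }
  }
}
```

Because OutputPath keeps only the Lambda's payload, each pass through UploadUser sees the ids collected so far in uploaded, which is the shared state that Map iterations cannot provide.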

AWS Step Functions - How to ignore "ResultSelector" when an error is caught?

I have an AWS Step Functions state machine whose first state starts a Lambda function. This function does something and then returns a JSON like { temp_a: "temporary a" }.
This output should be sent to the second state of this state machine but, I don't want to send temp_a as key, rather I'd like to rename it a, so the result of the first state should be { a: "temporary a" }.
This is trivial and can be done using ResultSelector. For this, the Step Functions will look like this:
{
"StartAt": "State1",
"States": {
"State1": {
"Next": "State2",
"Resource": "arn:aws:lambda:eu-west-1:XXX:function:sfexample-LambdaFunction",
"ResultSelector": {
"a.$": "$.temp_a"
},
"Type": "Task"
},
"State2": {
"End": true,
"Type": "Pass"
}
}
}
and the Lambda is as simple as possible, since it contains just a single instruction: return { temp_a: "temporary a" };.
Once the state machine has been started, everything works like a charm since the temp_a is successfully renamed into a (thanks to the ResultSelector) and then it is sent to the State2. Great!
Occasionally, that Lambda can throw a CustomError exception that I want to catch in the state machine. When the error is caught, the flow has to be diverted to the CustomErrorState state.
To make this possible, I've added a Catch statement to State1 and another state called CustomErrorState of type Fail.
{
"StartAt": "State1",
"States": {
"CustomErrorState": {
"Cause": "Error happens",
"Type": "Fail"
},
"State1": {
"Catch": [
{
"ErrorEquals": [
"CustomError"
],
"Next": "CustomErrorState"
}
],
"Next": "State2",
"Resource": "arn:aws:lambda:eu-west-1:XXX:function:sfexample-LambdaFunction",
"ResultSelector": {
"a.$": "$.temp_a"
},
"Type": "Task"
},
"State2": {
"End": true,
"Type": "Pass"
}
}
}
This seems reasonable, but when the Lambda throws the CustomError I get a runtime error, because State1 cannot perform what I've specified in the ResultSelector property.
What does this mean? If an error is caught, how can I handle the result? Is it possible to ignore ResultSelector's instructions when I catch an error in a state?
Further detail:
Here you can grab all the necessary files to test it in your account.
I am an engineer from Step Functions and would like to update this post to let you know that the issue has been fixed.
ResultSelector field will not be applied to caught error output, and the previous behaviour was a bug.
Thanks for bringing this up.
The issue has been fixed. See post below.
I have posted the same question on AWS discussion forum, signalling that the behaviour looks like a bug. They confirm the problem, in fact:
The "ResultSelector" field should only be applied to the successful
result of a Task, Map or Parallel State. In the case of a caught
error, "ResultSelector" should NOT be applied.
Fortunately, I encountered the same problem today. I believe you have to handle the exception manually in your code; there is no way to ignore ResultSelector. Your code has to return the same dict/JSON shape even when it fails; only then will the state machine run to completion. Here is what I tried:
Intentionally I am raising an Exception to check the behaviour of State Machine.
Below is the State Machine's execution when I ran that code:

AWS Step Functions is not catching States.Runtime error

The step function below runs in AWS. When a required parameter is missing, it cancels the flow and throws a States.Runtime error. The state has a Catch for this error, but the error is not caught as stated.
The step function is defined as follows:
{
"StartAt": "Log Start Step Function",
"Comment": "Executed with inputs",
"States": {
"Log Start Step Function": {
"Type": "Task",
"Resource": "arn:aws:lambda:eu-west-1:0000000:function:update",
"Parameters": {
"body": {
"itemID.$": "$.itemID",
"functionName.$": "$.stepFunctionName ",
"executionARN.$": "$$.Execution.Id",
"complete": false,
"inprogress": true,
"error": false
}
},
"Catch": [
{
"ErrorEquals": [
"States.Runtime"
],
"ResultPath": "$.taskresult",
"Next": "Log Failed Module"
},
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.taskresult",
"Next": "Log Failed Module"
}
],
"ResultPath": "$.taskresult",
"Next": "Evaluate Module PA1"
}
}
}
Below is the step function,
And the error thrown is as below,
The runtime error does not trigger the Log Failed Module state, despite this catcher:
{
"ErrorEquals": [
"States.Runtime"
],
"ResultPath": "$.taskresult",
"Next": "Log Failed Module"
},
Is this an AWS error, is something wrong with my configuration, or is there another way to validate parameters in AWS Step Functions?
From https://docs.aws.amazon.com/step-functions/latest/dg/concepts-error-handling.html
A States.Runtime error is not retriable, and will always cause the execution to fail. A retry or catch on States.ALL will not catch States.Runtime errors.
Your state machine is expecting the following as input:
"Parameters": {
"body": {
"itemID.$": "$.itemID",
"functionName.$": "$.stepFunctionName ",
"executionARN.$": "$$.Execution.Id",
"complete": false,
"inprogress": true,
"error": false
}
},
You need to pass them when you start a new execution instead of:
{
"Comment": "Insert your JSON here"
}
Which you are currently passing because it comes by default as the input body of a new execution in the AWS Console.
Read more about InputPath and Parameters here: https://docs.aws.amazon.com/step-functions/latest/dg/input-output-inputpath-params.html
I have the same problem.
I am beginning to think that the runtime error happens when the input path is processed, and before the catcher can be initialized. This means that try / catch to test for parameters present in the input is not possible. I also tried ChoiceState, to no avail.
So I think there is no solution but to provide every parameter you refer to in the state machine definition. But the documentation is not clear on this.
This caught me out too. My scenario was setting the output based on the results of S3 ListObjectVersions, with the versions to be deleted in a later task. In this case, $.Versions didn't exist because there was nothing in the bucket so States.Runtime was thrown.
{
"bucket.$": "$.Name",
"objects.$": "$.Versions"
}
To work around this:
1. I don't transform the result of the ListObjectVersions task with ResultSelector. Instead, this state simply outputs the unedited result.
2. I added a Choice state underneath with a rule to check if $.Versions is present.
   - If it is present, move to a Pass state and transform the input in exactly the same way as I was originally transforming the result of the ListObjectVersions task with ResultSelector (because the Pass state doesn't transform on output, only input).
   - If it is not present, move to a Success state because there is nothing to delete.
Here is a screen grab of the relevant section, just in case it's helpful to visualise.
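In state machine terms, the workaround above might look something like the sketch below. The state names, the aws-sdk resource ARN and the $.bucket input field are assumptions made for illustration; only the $.Name and $.Versions mapping comes from the post. The delete task that would follow the Pass state is left out.

```json
{
  "Comment": "Sketch only: state names, the aws-sdk resource ARN and the $.bucket input field are assumptions",
  "StartAt": "ListObjectVersions",
  "States": {
    "ListObjectVersions": {
      "Comment": "No ResultSelector here, so the raw result is passed on unchanged",
      "Type": "Task",
      "Resource": "arn:aws:states:::aws-sdk:s3:listObjectVersions",
      "Parameters": {
        "Bucket.$": "$.bucket"
      },
      "Next": "HasVersions"
    },
    "HasVersions": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.Versions",
          "IsPresent": true,
          "Next": "ShapeVersions"
        }
      ],
      "Default": "NothingToDelete"
    },
    "ShapeVersions": {
      "Comment": "Same mapping the ResultSelector used; a delete task would follow here",
      "Type": "Pass",
      "Parameters": {
        "bucket.$": "$.Name",
        "objects.$": "$.Versions"
      },
      "End": true
    },
    "NothingToDelete": {
      "Type": "Succeed"
    }
  }
}
```

The point of the Choice state is that $.Versions is only dereferenced on the branch where IsPresent has already confirmed it exists, so the path lookup that raised States.Runtime is never attempted on an empty bucket.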

Pass multiple inputs into Map State in AWS Step Function

I am trying to use AWS Step Functions to trigger operations on many S3 files via Lambda. To do this, I am invoking a step function with an input that has the base S3 key of the file and the part names of each file (each parallel iteration would operate on a different S3 file). The input looks something like
{
"job-spec": {
"base_file_name": "some_s3_key-",
"part_array": [
"part-0000.tsv",
"part-0001.tsv",
"part-0002.tsv", ...
]
}
}
My Step function is very simple, takes that input and maps it out, however I can't seem to get both the file and the array as input to my lambda. Here is my step function definition
{
"Comment": "An example of the Amazon States Language using a map state to process elements of an array with a max concurrency of 2.",
"StartAt": "Map",
"States": {
"Map": {
"Type": "Map",
"ItemsPath": "$.job-spec",
"ResultPath": "$.part_array",
"MaxConcurrency": 2,
"Next": "Final State",
"Iterator": {
"StartAt": "My Stage",
"States": {
"My Stage": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"FunctionName": "arn:aws:lambda:us-east-1:<>:function:some-lambda:$LATEST",
"Payload": {
"Input.$": "$.part_array"
}
},
"End": true
}
}
}
},
"Final State": {
"Type": "Pass",
"End": true
}
}
}
As written above, it complains that job-spec is not an array for the ItemsPath. If I change that to $.job-spec.part_array, I get the array I'm looking for in my lambda, but the base key is missing.
Essentially, I want each Python lambda to get the base file key and one entry from the array, so it can stitch together the complete file name. I can't just put the complete file names in the array because of the limit on how much data I can pass around in Step Functions, and that also seems like a waste of data.
It looks like the Parameters value can be used for this, but I can't quite get the syntax right.
I was finally able to get the syntax right:
"ItemsPath": "$.job-spec.part_array",
"Parameters": {
"part_name.$": "$$.Map.Item.Value",
"base_file_name.$": "$.job-spec.base_file_name"
},
It seems that Parameters can be used to build a custom input for each iteration. The $$ prefix reads from the context object rather than from the state's input: ItemsPath selects the array, and each element is then exposed to its iteration as $$.Map.Item.Value, while plain $ paths still refer to the Map state's own input.
UPDATE Here is some AWS Documentation showing this being used from the comments below
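Put together, the question's definition with the answer's two fields applied might look like the sketch below. The Payload keys inside the iterator are an assumption about what the Lambda would want to receive; the ResultPath and Final State from the question are omitted for brevity. With the question's sample input, the first iteration would receive part_name "part-0000.tsv" and base_file_name "some_s3_key-".

```json
{
  "Comment": "Sketch only: the question's definition with the answer's ItemsPath and Parameters applied; Payload keys are assumptions",
  "StartAt": "Map",
  "States": {
    "Map": {
      "Type": "Map",
      "ItemsPath": "$.job-spec.part_array",
      "Parameters": {
        "part_name.$": "$$.Map.Item.Value",
        "base_file_name.$": "$.job-spec.base_file_name"
      },
      "MaxConcurrency": 2,
      "Iterator": {
        "StartAt": "My Stage",
        "States": {
          "My Stage": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {
              "FunctionName": "arn:aws:lambda:us-east-1:<>:function:some-lambda:$LATEST",
              "Payload": {
                "base_file_name.$": "$.base_file_name",
                "part_name.$": "$.part_name"
              }
            },
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
```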

Utterances to test lambda function not working (but lambda function itself executes)

I have a lambda function that executes successfully with an intent called GetEvent that returns a specific string. I've created one utterance for this intent for testing purposes (one that is simple and doesn't require any of the optional slots for invoking the skill), but when using the service simulator to test the lambda function with this utterance for GetEvent I'm met with a lambda response that says "The response is invalid". Here is what the interaction model looks like:
#Intent Schema
{
"intents": [
{
"intent": "GetVessel",
"slots": [
{
"name": "boat",
"type": "LIST_OF_VESSELS"
},
{
"name": "location",
"type": "LIST_OF_LOCATIONS"
},
{
"name": "date",
"type": "AMAZON.DATE"
},
{
"name": "event",
"type": "LIST_OF_EVENTS"
}
]
},
{
"intent": "GetLocation",
"slots": [
{
"name": "event",
"type": "LIST_OF_EVENTS"
},
{
"name": "date",
"type": "AMAZON.DATE"
},
{
"name": "boat",
"type": "LIST_OF_VESSELS"
},
{
"name": "location",
"type": "LIST_OF_LOCATIONS"
}
]
},
{
"intent": "GetEvent",
"slots": [
{
"name": "event",
"type": "LIST_OF_EVENTS"
},
{
"name": "location",
"type": "LIST_OF_LOCATIONS"
}
]
}
]
}
With the appropriate custom skill type syntax and,
#First test Utterances
GetVessel what are the properties of {boat}
GetLocation where did {event} occur
GetEvent get me my query
When giving Alexa the utterance get me my query the lambda response should output the string as it did in the execution. I'm not sure why this isn't the case; this is my first project with the Alexa Skills Kit, so I am pretty new. Is there something I'm not understanding with how the lambda function, the intent schema and the utterances are all pieced together?
UPDATE: Thanks to some help from AWSSupport, I've narrowed the issue down to the area in the json request where new session is flagged as true. For the utterance to work this must be set to false (this works when inputting the json request manually, and this is also the case during the lambda execution). Why is this the case? Does Alexa really care about whether or not it is a new session during invocation? I've cross-posted this to the Amazon Developer Forums as well a couple of days ago, but have yet to get a response from someone.
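For orientation, the flag in question sits under session in the request that the skill's Lambda receives. The following is a rough sketch of a manually entered test request for the GetEvent intent with new set to false. Every id and the timestamp are placeholders, and the exact set of fields the simulator sends may differ.

```json
{
  "version": "1.0",
  "session": {
    "new": false,
    "sessionId": "PLACEHOLDER-session-id",
    "application": {
      "applicationId": "PLACEHOLDER-application-id"
    },
    "user": {
      "userId": "PLACEHOLDER-user-id"
    }
  },
  "request": {
    "type": "IntentRequest",
    "requestId": "PLACEHOLDER-request-id",
    "timestamp": "2016-01-01T00:00:00Z",
    "intent": {
      "name": "GetEvent",
      "slots": {
        "event": { "name": "event" },
        "location": { "name": "location" }
      }
    }
  }
}
```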
This may or may not have changed -- the last time I used the service simulator (about two weeks ago at the time of writing) it had a pretty severe bug which would lead to requests being mapped to your first / wrong intent, regardless of actual simulated speech input.
So even if you typed in something random like wafaaefgae it simply tries to map that to the first intent you have defined, providing no slots to said intent which may lead to unexpected results.
Your issue could very well be related to this, triggering the same unexpected / buggy behavior because you aren't using any slots in your sample utterance.
Before spending more time debugging this, I'd recommend trying the Intent using an actual echo or alternatively https://echosim.io/ -- interaction via actual speech works as expected, unlike the 'simulator'