Access Map State (item) in Step Functions

I am trying to access item properties which I am iterating over using Map state. I keep getting this error:
Value ($['Map'].snapshot_id.S) for parameter snapshotId is invalid. Expected: 'snap-...'. (Service: Ec2, Status Code: 400, Request ID: 6fc02935-c161-49df-8606-bc6f3e2934a6)
I have gone through the docs, which seem to suggest accessing it with $.Map.snapshot_id.S, but that doesn't seem to work. Meanwhile, an input item to Map is:
{
"snapshot_id": {
"S": "snap-01dd5ee46df84119e"
},
"file_type": {
"S": "bash"
},
"id": {
"S": "64e6893261d94669b7a8ca425233d68b"
},
"script_s3_link": {
"S": "df -h"
}
}
Map state definition:
"Map": {
"Type": "Map",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "INLINE"
},
"StartAt": "Parallel State",
"States": {
"Parallel State": {
"Comment": "A Parallel state can be used to create parallel branches of execution in your state machine.",
"Type": "Parallel",
"Branches": [
{
"StartAt": "CreateVolume",
"States": {
"CreateVolume": {
"Type": "Task",
"Parameters": {
"AvailabilityZone": "us-east-2b",
"SnapshotId": "$$.Map.snapshot_id.S"
},
"Resource": "arn:aws:states:::aws-sdk:ec2:createVolume",
"Next": "AttachVolume"
},
"AttachVolume": {
"Type": "Task",
"Parameters": {
"Device": "MyData",
"InstanceId": "MyData",
"VolumeId": "MyData"
},
"Resource": "arn:aws:states:::aws-sdk:ec2:attachVolume",
"End": true
}
}
}
],
"End": true
}
}
},
"Next": "Final Result",
"ItemsPath": "$.Items",
"MaxConcurrency": 40
},

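TL;DR "SnapshotId.$": "$.snapshot_id.S"
By default, each inline map iteration's input is one item from the ItemsPath array, accessible simply as $. (The $$ prefix addresses the context object instead, where the current item is $$.Map.Item.Value; there is no $$.Map.snapshot_id.)
Your state machine definition implies that $.Items is an array of objects with a snapshot_id key (and other keys). If so, each iteration's snapshot_id object is at $.snapshot_id, and with the DynamoDB-style item shown in the question, the ID string itself is at $.snapshot_id.S.
Finally, adding .$ to the parameter name (SnapshotId.$) tells Step Functions the value is not a literal, but rather a path value to be substituted.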
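As a minimal sketch, the CreateVolume state from the question could be rewritten along these lines. It assumes each item has the shape shown in the question, where the snapshot ID string is nested under the S key, and it relies on the Parallel state handing its own input (the Map item) to each branch unchanged.

```json
{
  "CreateVolume": {
    "Type": "Task",
    "Resource": "arn:aws:states:::aws-sdk:ec2:createVolume",
    "Parameters": {
      "AvailabilityZone": "us-east-2b",
      "SnapshotId.$": "$.snapshot_id.S"
    },
    "Next": "AttachVolume"
  }
}
```

If the items were plain objects rather than DynamoDB-style attribute maps, the path would shorten to $.snapshot_id.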

Related

Can I access the TaskToken from a Map state with ItemSelector where the iteration step uses lambda:invoke.waitForTaskToken?

I am using AWS step function to iterate over a list in an input document where for each iteration, I need to invoke an external service. So I want to iterate over each item and run a step using lambda:invoke.waitForTaskToken and pass the TaskToken into the execution of each iteration.
The problem I'm running into is how to use an ItemSelector at the Map state level and also inject the TaskToken in the inner step. I need to use an ItemSelector because I want each item to also contain information from the input to the Map state. The AWS Docs state:
The ItemSelector field replaces the Parameters field within the Map state. If you use the Parameters field in your Map state definitions to create custom input, we highly recommend that you replace them with ItemSelector.
But they also say:
During an execution, the context object is populated with relevant data for the Parameters field from where it is accessed. The value for a Task field is null if the Parameters field is outside of a task state.
These two statements seem to imply that what I'm trying to do is impossible.
So, what I want is something like:
{
"StartAt": "ExampleMapState",
"States": {
"ExampleMapState": {
"Type": "Map",
"ItemsPath": "$.items",
"ItemSelector": {
"dynamic.$": "$.dynamic",
"ContextIndex.$": "$$.Map.Item.Index",
"ContextValue.$": "$$.Map.Item.Value"
},
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "INLINE"
},
"StartAt": "TestPass",
"States": {
"TestPass": {
"Type": "Task",
"Parameters": {
"FunctionName": "arn:aws:lambda:us-west-2:123456789012:function:echo-lambda",
"Payload": {
"item.$": "$",
"token.$": "$$.Task.Token"
}
},
"Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
"End": true
}
}
},
"End": true
}
}
}
But this doesn't work because the ItemSelector overrides the Payload of the internal TestPass state. Is there a way to get this to work?
ETA: I figured I would try putting $$.Task.Token in ItemSelector just in case it would magically work but it ended up throwing an error because $$.Task does not exist in the context object at that level.
Example with this (invalid) configuration:
{
"StartAt": "ExampleMapState",
"States": {
"ExampleMapState": {
"Type": "Map",
"ItemsPath": "$.items",
"ItemSelector": {
"dynamic.$": "$.dynamic",
"ContextIndex.$": "$$.Map.Item.Index",
"ContextValue.$": "$$.Map.Item.Value",
"token.$": "$$.Task.Token"
},
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "INLINE"
},
"StartAt": "TestPass",
"States": {
"TestPass": {
"Type": "Task",
"Parameters": {
"FunctionName": "arn:aws:lambda:us-west-2:123456789012:function:echo-lambda"
},
"Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
"End": true
}
}
},
"End": true
}
}
}
Based on my research, I don't think what I'm trying to do is possible. What I ended up implementing is a workaround. I modified the function providing input to this step function to put the dynamic info that I needed into every item in the list I am iterating over. So my step function definition now looks something like this:
{
"StartAt": "ExampleMapState",
"States": {
"ExampleMapState": {
"Type": "Map",
"ItemsPath": "$.items",
"ItemProcessor": {
"ProcessorConfig": {
"Mode": "INLINE"
},
"StartAt": "TestPass",
"States": {
"TestPass": {
"Type": "Task",
"Parameters": {
"FunctionName": "arn:aws:lambda:us-west-2:123456789012:function:echo-lambda",
"Payload": {
"item.$": "$",
"token.$": "$$.Task.Token"
}
},
"Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
"End": true
}
}
},
"End": true
}
}
}
And an example input to this step function looks like:
{
"dynamic": "info",
"items": [
{
"dynamic": "info",
"resize": "true",
"format": "jpg"
},
{
"dynamic": "info",
"resize": "false",
"format": "png"
},
{
"dynamic": "info",
"resize": "true",
"format": "jpg"
}
]
}
It's not great because I have to repeat info into every item ahead of time but it works.

Error in using InputPath to select parts of input in a Step Functions workflow

I am creating a Step Functions workflow which has various steps. I am referring to this topic in their documentation InputPath, ResultPath and OutputPath Examples. I am trying to check the identity and address of a person in my workflow as they've shown in their document. I'm passing the input for the Verify identity step within the state machine definition inside Parameters. My workflow looks like this.
When I run this, I get the following error: An error occurred while executing the state 'Verify identity' (entered at the event id #19). Invalid path '$.identity' : Property ['identity'] not found in path $
What am I doing wrong here? Can someone please explain?
Thanks..
{
"StartAt": "Step1",
"States": {
"Step1": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
...something...
},
"Next": "Step2"
},
"Step2": {
"Type": "Choice",
"Choices": [
Do something...
],
"Default": "Step3.1"
},
"Step3.1": {
"Type": "Task",
...something...
}
},
"Next": "Step3.3"
},
...something...,
"Step4": {
"Type": "Parallel",
"Branches": [
{
"StartAt": "Verify identity",
"States": {
"Verify identity": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"InputPath": "$.identity",
"Parameters": {
"Payload": {
"identity": {
"email": "jdoe@example.com",
"ssn": "123-45-6789"
},
"firstName": "Jane",
"lastName": "Doe"
},
"FunctionName": "{Lambda ARN}"
},
"End": true
}
}
},
{
"StartAt": "Verify address",
"States": {
"Verify address": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"Payload": {
"street": "123 Main St",
"city": "Columbus",
"state": "OH",
"zip": "43219"
},
"FunctionName": "{Lambda ARN}"
},
"End": true
}
}
}
],
"Next": "Step5"
},
"Step5": {
"Type": "Task",
"Parameters": {
something...
},
"End": true
}
}
Your example has no explicit transition into Step4, but assuming the order you have defined (Step1 -> Step2 -> Step3.1 -> Step3.3 -> Step4), the output from Step3.3 should be something like
{
"cat": "meow",
"dog": "woof",
"identity": { // this is what's missing
"email": "jdoe@example.com",
"ssn": "123-45-6789"
"ssn": "123-45-6789"
}
}
this is what will get passed to each branch of your parallel state (Step4)
However, since you have an InputPath defined for Step4."Verify identity", the effective input to the task becomes
{
"email": "jdoe@example.com",
"ssn": "123-45-6789"
}
The error you're seeing
An error occurred while executing the state 'Verify identity' (entered at the event id #19). Invalid path '$.identity' : Property ['identity'] not found in path $
means the "identity" key (aka $.identity) isn't getting added to the output of Step3.3 (aka $)
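One detail worth spelling out: InputPath is applied to the state's raw input before Parameters is evaluated, so a Payload hard-coded inside Parameters cannot satisfy $.identity. If the payload really is static, as in the question, a minimal sketch is simply to drop InputPath ({Lambda ARN} is the placeholder from the question, not a real value):

```json
{
  "Verify identity": {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
      "FunctionName": "{Lambda ARN}",
      "Payload": {
        "identity": {
          "email": "jdoe@example.com",
          "ssn": "123-45-6789"
        },
        "firstName": "Jane",
        "lastName": "Doe"
      }
    },
    "End": true
  }
}
```

If the identity should instead come from an earlier step, keep "InputPath": "$.identity", make sure Step3.3 actually emits that key, and pass the filtered input through with "Payload.$": "$".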

AWS Step-Function: pass a specific value from one AWS lambda to another in step function parallel state

I have the below state machine. The requirement is to have a lambda to query DB and get all the ids. Next I have a parallel state call that calls more than five lambdas at once. Instead of passing all the ids fetched to all the lambdas, I need to pass the respective ids to each lambda.
In the below state language, first call is DB_CALL, lets say it returns {id1, id2, id3, id4, id5, id6}, I want to pass only id1 to First_Lambda and id2 to Second_Lambda etc...
The entire id object should not get passed to all lambdas. Please suggest a way to achieve this.
{
"Comment": "Concurrent Lambda calls",
"StartAt": "StarterLambda",
"States": {
"StarterLambda": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:DB_CALL",
"Next": "ParallelCall"
},
"State": {
"ParallelCall": {
"Type": "Parallel",
"End": true,
"Branches": [
{
"StartAt": "First",
"States": {
"First": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:First_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
},
{
"StartAt": "Second",
"States": {
"Second": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Second_Lambda",
"Retry": [ {
"ErrorEquals": ["States.TaskFailed"],
"IntervalSeconds": 1,
"MaxAttempts": 2,
"BackoffRate": 2.0
} ],
"End": true
}
}
},
{
"StartAt": "Third",
"States": {
"Third": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Third_Lambda",
"Catch": [ {
"ErrorEquals": ["States.TaskFailed"],
"Next": "CatchHandler"
} ],
"End": true
},
"CatchHandler": {
"Type": "Pass",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:CATCH_HANDLER",
"End": true
}
}
},
{
"StartAt": "Fourth",
"States": {
"Fourth": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Fourth_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
},
{
"StartAt": "Fifth",
"States": {
"Fifth": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Fifth_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
},
{
"StartAt": "Sixth",
"States": {
"Sixth": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789012:function:Sixth_Lambda",
"TimeoutSeconds": 120,
"End": true
}
}
}
]
}
}
}
}
You can use the Parameters field of a state.
It lets you send a specific value or JSON fragment to the next Lambda.
"Parameters": {
"toprocess.$": "$.MetaData.CorrelationId"
},
The input to this Lambda is then a smaller object than the output of your first Lambda. When this Lambda returns, avoid overwriting the whole state with its result; write it under a sub-path instead.
"OutputPath": "$",
"ResultPath": "$.PartialResult",
What you are looking for is the Map State. With this state, you pass in the iterator, in your case the path to the ids. The map state will run once for each item in the list. Within the map state, you have a full state machine, so you can call a Lambda or any other state. It has controls to limit how many are running at once if that is needed.
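For the Parameters suggestion, a rough sketch of the Parallel state might look like the following. It assumes DB_CALL returns an object such as {"ids": ["id1", "id2", ...]}; the ids key and the id parameter name are hypothetical, and only the first two branches are shown.

```json
{
  "ParallelCall": {
    "Type": "Parallel",
    "End": true,
    "Branches": [
      {
        "StartAt": "First",
        "States": {
          "First": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:First_Lambda",
            "Parameters": {
              "id.$": "$.ids[0]"
            },
            "End": true
          }
        }
      },
      {
        "StartAt": "Second",
        "States": {
          "Second": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:Second_Lambda",
            "Parameters": {
              "id.$": "$.ids[1]"
            },
            "End": true
          }
        }
      }
    ]
  }
}
```

Every branch still receives the full output of DB_CALL as its input; Parameters only narrows what each Lambda is invoked with. A Map state fits better when the same Lambda should run once per id.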

Can AWS Step Function describe this kind of dataflow?

It can not be described with Parallel State in AWS Step Function.
B and C should be in parallel.
C sends messages to both D and E.
D and E should be in parallel.
{
"StartAt": "A",
"States": {
"A": {
"Type": "Pass",
"Next": "Parallel State 1"
},
"Parallel State 1": {
"Type": "Parallel",
"Branches": [{
"StartAt": "B",
"States": {
"B": {
"Type": "Pass",
"End": true
}
}
},
{
"StartAt": "C",
"States": {
"C": {
"Type": "Pass",
"End": true
}
}
}
],
"Next": "Parallel State 2"
},
"Parallel State 2": {
"Type": "Parallel",
"Branches": [{
"StartAt": "D",
"States": {
"D": {
"Type": "Pass",
"End": true
}
}
},
{
"StartAt": "E",
"States": {
"E": {
"Type": "Pass",
"End": true
}
}
}
],
"Next": "F"
},
"F": {
"Type": "Pass",
"End": true
}
}
}
The answer is no: in Step Functions, no state can name multiple states in its Next field (that is, invoke both successors). Likewise, StartAt cannot start a state machine from multiple state names.
You can adjust your logic and use the Parallel state to achieve the same result. If you share your use case, it may be easier to suggest a solution.
How to specify multiple result path values in AWS Step Functions
A Parallel state provides each branch with a copy of its own input
data (subject to modification by the InputPath field). It generates
output that is an array with one element for each branch, containing
the output from that branch.
https://aws.amazon.com/blogs/aws/new-step-functions-support-for-dynamic-parallelism/
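Because a Parallel state's output is an array with one element per branch, the next state can be given named access to each branch's result. A minimal sketch using ResultSelector on the question's first Parallel state (the keys fromB and fromC are hypothetical names chosen for illustration):

```json
{
  "Parallel State 1": {
    "Type": "Parallel",
    "Branches": [
      { "StartAt": "B", "States": { "B": { "Type": "Pass", "End": true } } },
      { "StartAt": "C", "States": { "C": { "Type": "Pass", "End": true } } }
    ],
    "ResultSelector": {
      "fromB.$": "$[0]",
      "fromC.$": "$[1]"
    },
    "Next": "Parallel State 2"
  }
}
```

With this shape, every branch of the following Parallel state receives both results and can pick the one it needs, for example with "InputPath": "$.fromC".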
Example of state function
{
"Comment": "An example of the Amazon States Language using a choice state.",
"StartAt": "FirstState",
"States": {
"FirstState": {
"Type": "Task",
"Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME",
"Next": "ChoiceState"
},
"ChoiceState": {
"Type" : "Choice",
"Choices": [
{
"Variable": "$.foo",
"NumericEquals": 1,
"Next": "FirstMatchState"
},
{
"Variable": "$.foo",
"NumericEquals": 2,
"Next": "SecondMatchState"
}
],
"Default": "DefaultState"
},
"FirstMatchState": {
"Type" : "Task",
"Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:OnFirstMatch",
"Next": "NextState"
},
"SecondMatchState": {
"Type" : "Task",
"Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:OnSecondMatch",
"Next": "NextState"
},
"DefaultState": {
"Type": "Fail",
"Error": "DefaultStateError",
"Cause": "No Matches!"
},
"NextState": {
"Type": "Task",
"Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME",
"End": true
}
}
}
https://docs.aws.amazon.com/step-functions/latest/dg/connect-to-resource.html#connect-wait-example
https://sachabarbs.wordpress.com/2018/10/30/aws-step-functions/
As I answered in How to simplify complex parallel branch interdependencies for Step Functions, what you describe is better modeled as a DAG than as a state machine.
Depending on your use case, you might be able to work around it (as in @horatiu-jeflea's answer), but it is still a workaround rather than the straightforward way.

Loop inside a Step Function

I am trying to call a couple of steps in my step function in a loop but I am unable to get my head around how I need to do this. Here's what I have till now: I need to add another lambda function(GetReviews) which will then call CreateReview, SendNotification in a loop. How would I go about doing this?
I am referring to the "Iterating a Loop Using Lambda" document, which shows it is possible.
Step function definition:
{
"Comment": "Scheduling Engine",
"StartAt": "CreateReview",
"States": {
"CreateReview": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateReview",
"Next": "CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateReviewResult",
"OutputPath": "$"
},
"CreateNotification": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateNotificationResult",
"OutputPath": "$",
"End": true
}
}
}
I am contributing to this answer because I am using a little different approach to be able to loop inside of a step-function without having to rely on a lambda to increment. If someone in the future needs a generic solution, this can be a good reference.
Here is the example with code:
{
"Comment": "A description of my state machine",
"StartAt": "InitVariables",
"States": {
"InitVariables": {
"Type": "Pass",
"Parameters": {
"index": 0,
"incrementor": 1,
"ArrayLength.$": "States.ArrayLength($.inputArray)"
},
"ResultPath": "$.iterator",
"Next": "LoopChoice"
},
"LoopChoice": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.iterator.ArrayLength",
"NumericGreaterThanPath": "$.iterator.index",
"Next": "IncrementVariable"
}
],
"Default": "End"
},
"IncrementVariable": {
"Type": "Pass",
"Parameters": {
"index.$": "States.MathAdd($.iterator.index, $.iterator.incrementor)",
"incrementor": 1,
"ArrayLength.$": "$.iterator.ArrayLength"
},
"ResultPath": "$.iterator",
"Next": "LoopChoice"
},
"End": {
"Type": "Pass",
"End": true
}
} }
This is the base for the loop. I use the States.MathAdd($.iterator.index, $.iterator.incrementor) intrinsic function to add two values, in this case to increment the index by the amount defined in the InitVariables state. The length of the array to loop over comes from another intrinsic function, States.ArrayLength($.path.to.array). The array is passed in the input.
To get the current element of the array, use the intrinsic function States.ArrayGetItem($.inputArray, $.iterator.index).
All the custom logic should be put between the LoopChoice state and the IncrementVariable state.
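As a sketch of where that custom logic could go, the fragment below points LoopChoice at a hypothetical GetCurrentItem state that reads the current element before handing over to IncrementVariable. The state name GetCurrentItem and the keys value and currentItem are assumptions for illustration; the rest mirrors the definition above.

```json
{
  "LoopChoice": {
    "Type": "Choice",
    "Choices": [
      {
        "Variable": "$.iterator.ArrayLength",
        "NumericGreaterThanPath": "$.iterator.index",
        "Next": "GetCurrentItem"
      }
    ],
    "Default": "End"
  },
  "GetCurrentItem": {
    "Type": "Pass",
    "Parameters": {
      "value.$": "States.ArrayGetItem($.inputArray, $.iterator.index)"
    },
    "ResultPath": "$.currentItem",
    "Next": "IncrementVariable"
  }
}
```

Any Task states that process $.currentItem.value would sit between GetCurrentItem and IncrementVariable.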
Hope this helps someone in the future.
Sorry for the late reply. You've probably solved it in between, but here you are
So, when looping in Step Functions, I quite simply add a Choice State (see Choice State Rules).
One of your states would need to output whether or not you have finished looping, or the number of items iterated over and the total number of items.
In the first case, it would be something like
{
"Comment": "Scheduling Engine",
"StartAt": "CreateReview",
"States": {
"GetReviews": {
whatever
"Next": "LoopChoiceState"
},
"LoopChoiceState": {
"Type" : "Choice",
"Choices": [
{
"Variable": "$.loopCompleted",
"BooleanEquals": false,
"Next": "GetReviews"
}
],
"Default": "YourEndState"
},
"CreateReview": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateReview",
"Next": "CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateReviewResult",
"OutputPath": "$"
},
"CreateNotification": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateNotificationResult",
"OutputPath": "$",
"End": true
}
}
}
Second case:
{
"Comment": "Scheduling Engine",
"StartAt": "CreateReview",
"States": {
"GetReviews": {
whatever
"Next": "LoopChoiceState"
},
"LoopChoiceState": {
"Type" : "Choice",
"Choices": [
{
"Variable": "$.iteratedItemsCount",
"NumericEqualsPath": "$.totalItemsCount",
"Next": "CreateNotification"
}
],
"Default": "CreateReview"
},
"CreateReview": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateReview",
"Next": "CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateReviewResult",
"OutputPath": "$"
},
"CreateNotification": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateNotification",
"InputPath": "$",
"ResultPath": "$.CreateNotificationResult",
"OutputPath": "$",
"End": true
}
}
}
You could also use indexes (current index and last index) instead of the number of items iterated over; that would help you keep track of where you are in creating reviews.
Three options are presented below.
#1 Map: Repeat a set of steps for each element of an array (without a loop, optionally concurrently)
Map State is an alternative to looping when you want to run a set of steps for each element of an input array. Elements are processed in parallel by default. Set MaxConcurrency: 1 to mimic the serial execution of @NunoGamaFreire's loop-based solution.
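Applied to the question's CreateReview and CreateNotification steps, a serial Map might be sketched as follows. It assumes a preceding GetReviews task leaves the array under $.reviews; that key and the state name ForEachReview are hypothetical. It uses the ItemProcessor field, the newer name for Iterator.

```json
{
  "ForEachReview": {
    "Type": "Map",
    "ItemsPath": "$.reviews",
    "MaxConcurrency": 1,
    "ResultPath": "$.MapResult",
    "ItemProcessor": {
      "ProcessorConfig": { "Mode": "INLINE" },
      "StartAt": "CreateReview",
      "States": {
        "CreateReview": {
          "Type": "Task",
          "Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateReview",
          "ResultPath": "$.CreateReviewResult",
          "Next": "CreateNotification"
        },
        "CreateNotification": {
          "Type": "Task",
          "Resource": "arn:aws:lambda:us-west-2:529627678433:function:CreateNotification",
          "ResultPath": "$.CreateNotificationResult",
          "End": true
        }
      }
    },
    "End": true
  }
}
```

The generic, self-contained version of the same pattern: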
{
"StartAt": "MockArray",
"States": {
"MockArray": {
"Type": "Pass",
"Result": [ { "name": "Zaphod" }, { "name": "Arthur" }, { "name": "Trillian" } ],
"ResultPath": "$.Items",
"Next": "MapState"
},
"MapState": {
"Type": "Map",
"ResultPath": "$.MapResult",
"End": true,
"Iterator": {
"StartAt": "MockWork",
"States": {
"MockWork": {
"Type": "Pass",
"Parameters": {
"output.$": "States.Format('Hello, {}!', $.name)"
},
"OutputPath": "$.output",
"End": true
}
}
},
"ItemsPath": "$.Items"
}
}
}
One output is produced for each element in the MockArray, processed concurrently:
"MapResult": [ "Hello, Zaphod!", "Hello, Arthur!", "Hello, Trillian!" ]
#2 Repeat a set of steps X times (loop without lambda, serially)
This option involves proper looping! Repeat a set of tasks serially X number of times until a Choice State determines that an incrementing counter variable has reached X. Use the new States.MathAdd intrinsic function to increment without a Lambda. This option is suited for custom retry logic or other cases when you may want to break the loop early.
{
"StartAt": "InitializeCounter",
"States": {
"InitializeCounter": {
"Type": "Pass",
"Comment": "Initialize the counter at 0.",
"Parameters": {
"Counter": 0
},
"Next": "IncrementCounter"
},
"IncrementCounter": {
"Type": "Pass",
"Comment": "Increment the Counter by 1",
"Parameters": {
"Counter.$": "States.MathAdd($.Counter, 1)"
},
"Next": "MockWork"
},
"MockWork": {
"Type": "Pass",
"Comment": "Simulate some work. Optionally break early from the loop with ExitNow: true",
"Result": false,
"ResultPath": "$.ExitNow",
"Next": "Loop?"
},
"Loop?": {
"Type": "Choice",
"Choices": [{ "Or": [
{ "Variable": "$.Counter", "NumericGreaterThanEqualsPath": "$$.Execution.Input.workCount" },
{ "Variable": "$.ExitNow", "BooleanEquals": true } ],
"Next": "Success"
}
],
"Default": "IncrementCounter"
},
"Success": {
"Type": "Succeed"
}
},
"TimeoutSeconds": 3
}
Counter increments by one on each pass through the loop. The loop breaks early if a task returns ExitNow: true.
{ "Counter": 4, "ExitNow": false }
#3 Repeat a set of steps X times (without a loop, *concurrently*)
This option is a hybrid of the first two. Like #2, we start with a desired number of iterations from the $.workCount input. Like #1, we map over an array concurrently. This time, though, the state machine creates the array with another intrinsic function, States.ArrayRange(1, $.workCount, 1).
{
"StartAt": "Iterations",
"States": {
"Iterations": {
"Type": "Pass",
"Parameters": {
"Iterations.$": "States.ArrayRange(1, $.workCount, 1)"
},
"Next": "MapState"
},
"MapState": {
"Type": "Map",
"ResultPath": "$.MapResult",
"End": true,
"Iterator": {
"StartAt": "MockWork",
"States": {
"MockWork": {
"Type": "Pass",
"Parameters": {
"output.$": "States.Format('Hello from iteration #{}', States.JsonToString($))"
},
"OutputPath": "$.output",
"End": true
}
}
},
"ItemsPath": "$.Iterations"
}
}
}
Tasks run concurrently, once for each item in Iterations.
"Iterations": [ 1, 2, 3, 4 ],
"MapResult": [ "Hello from iteration #1", "Hello from iteration #2", "Hello from iteration #3", "Hello from iteration #4" ]