AWS Step Function: ignore input from previous state, and use "Parameters" as input - amazon-web-services

So I have a state defined as below.
"process-abc": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:my-lambda...",
"Parameters": {
"type": "my-type",
},
However, when I run step function, I don't see the "type": "my-type", in the state input, I only see the input as something from the previous state output.
How can I only pass "type": "my-type" as the only input into the current state?

If you did not define any result path, by default input to the next state is the output of the previous state.
See https://docs.aws.amazon.com/step-functions/latest/dg/input-output-resultpath.html
You can set the result path of the state 1 to null to ignore the state 1 result and pass the state 1 input to the state 2.

Related

Step Functions - Access State from previous Map Iteration

How can I get the results from previous Map Iterations in the next iteration when using MaxConcurrency: 1 in Amazon Step Functions?
Here's an example of the code I have
{
"StartAt": "UploadUsers",
"States": {
"UploadUsers": {
"Type": "Map",
"MaxConcurrency": 1,
"ItemsPath": "$.data.users",
"Parameters": {
"data.$": "$$.Map.Item.Value.data",
"friends.$": "$.?????? Get created users ids"
},
"Iterator": {
"StartAt": "UploadUser",
"States": {
"UploadUser": {
"End": true,
"Parameters": {
"FunctionName": "${FnUploadUser}",
"Payload": {
"data.$": "$.user_data",
"friends.$": "$.??????"
}
},
"Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
"ResultPath": "$.data. ???",
"Type": "Task"
}
}
},
"End": true,
"ResultPath": "$.data.UploadUsers",
"ResultSelector": {
"result.$": "$"
}
}
}
}
Suppose FnUploadUser is a lambda that returns the id of the created user.
And I want to get the ids of the previously created users and use that value for the next user I'm about to create.
You can't. Map State iterations don't share state. Two workarounds:
(1) Manage the shared state externally: Each Map iteration writes and reads from, say, a DynamoDB table.
(2) Refactor to a "for" loop and keep the shared state in the execution output.
Instead of using Map, insert a Choice State (after UploadUser) that checks for a "done" condition. If "done", finish, else loop back to UploadUser.
UploadUser accepts the user_data array as input. It appends its output to, say, the uploaded output array.
Each UploadUser iteration identifies the next user_data item by comparing it to the uploaded array. The iteration that processes the last item can also output done: true to signal to Choice that work is done.
The Choice State loops back to UploadUser while there are more to process (i.e. while done is not present).
There are other ways to build steps 2-3. For instance, you could add next_item and total_items keys on the output to keep track of progress. The important point is that Choice loops until an exit condition is met.

How to access input of state machine in any node at AWS Step Functions

Let's say I have this state machine in AWS Step Function:
And I had started it with this input:
{
"item1": 1,
"item2": 2,
"item3": 3
}
It's clear for me that Action A is receiving the input payload. But, how can Action C access the state machine input to get the value of item3? Is it possible?
Thanks!!
Typically, the data available in Action C will be dependent on what the result/output of Action B is.
However, if you just care about the original input to the state machine execution, you can set the payload of Action C using the Context Object.
// roughly
"Action C": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"Payload.$": "$$.Execution.Input",
"FunctionName": "<action c lambda>"
},
Check out the AWS documentation for Context Object

AWS Step function parameters are not moving to the next step

I am running a step function with many different step yet I am still stuck on the 2nd step.
The first step is a Java Lambda that gets all the input parameters and does what it needs to do.
The lambda returns null as it doesn't need to return anything.
The next step is a call for API gateway which needs to use one of the parameters in the URL.
However, I see that neither the URL has the needed parameter nor do I actually get the parameters into the step. ("input": null under TaskStateEntered)
The API gateway step looks as follows: (I also tried "Payload.$": "$" instead of the "Input.$": "$")
"API Gateway start": {
"Type": "Task",
"Resource": "arn:aws:states:::apigateway:invoke",
"Parameters": {
"Input.$": "$",
"ApiEndpoint": "aaaaaa.execute-api.aa-aaaa-1.amazonaws.com",
"Method": "GET",
"Headers": {
"Header1": [
"HeaderValue1"
]
},
"Stage": "start",
"Path": "/aaa/aaaa/aaaaa/aaaa/$.scenario",
"QueryParameters": {
"QueryParameter1": [
"QueryParameterValue1"
]
},
"AuthType": "IAM_ROLE"
},
"Next": "aaaaaa"
},
But when my step function gets to this stage it fails and I see the following in the logs:
{
"name": "API Gateway start",
"input": null,
"inputDetails": {
"truncated": false
}
}
And eventually:
{
"error": "States.Runtime",
"cause": "An error occurred while executing the state 'API Gateway start' (entered at the event id #9). Unable to apply Path transformation to null or empty input."
}
What am I missing here? Note that part of the path is a value that I enter at the step function execution. ("Path": "/aaa/aaaa/aaaaa/aaaa/$.scenario")
EDIT:
As requested by #lynkfox, I am adding the lambda definition that comes before the API gateway step:
And to answer the question, yes its standard and I see no input.
"Run tasks": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"OutputPath": "$.Payload",
"Parameters": {
"Payload.$": "$",
"FunctionName": "arn:aws:lambda:aaaaaa-1:12345678910:function:aaaaaaa-aaa:$LATEST"
},
"Retry": [
{
"ErrorEquals": [
"Lambda.ServiceException",
"Lambda.AWSLambdaException",
"Lambda.SdkClientException"
],
"IntervalSeconds": 2,
"MaxAttempts": 6,
"BackoffRate": 2
}
],
"Next": "API Gateway start"
},
So yes, as I commented, I believe the problem is the OutputPath of your lambda task definition. What this is saying is Take whatever comes out of this lambda (which is nothing!) and cut off everything other than the key Payload.
Well you are returning nothing, so this causes nothing to be sent to the next task.
I am assuming your incoming vent already has a key in the Json that is named Payload, so what you want to do is remove the OutputPath from your lambda. It doesn't need to return anything so it doesn't need an Output or Result path.
Next, on your API task, assuming again that your initializing event has a key of Payload, you would have "InputPath": "$.Payload" - if you have your headers or parameters in the initializing json Event then, you can reference those keys in the Parameters section of the definition.
Every AWS Service begins with an Event and ends with an Event. Each Event is a JSON object. (Which I'm sure you know). With State Machines, this continues - the State Machine/Step Function is just the controller for passing Events from one Task to the next.
So any given task can have an InputPath, OutputPath, or Result Path - These three definition parameters can decide what values go into the Task and what are sent onto the Next Task. State machines are, by definition, for maintaining State between Tasks, and these help control that 'State' (and there is pretty much only one 'state' at any given time, the event heading to the next Task(s)
The ResultPath is where, in that overall Event, the task puts the data. If you put ResultPath: "$.MyResult" by itself it appends this key to the incoming event
If you add OutputPath, it ONLY passes that key from the output event of the Task onto the next step in the Step Functions.
These three give you a lot of control.
Want to Take an Event into a Lambda and respond with something completely different - you don't need the incoming data - you combine OutputPath and ResultPath with the same value (and your Lambda needs to respond with a Json Object) then you can replace the event wholesale.
If you have ResultPath of some value and OutputPath: "$." you create a new json object with a single Key that contains the result of your task (the key being the definition set in ResultPath
InputPath allows you to set what goes into the Task. I am not 100% certain but I'm pretty sure it does not remove anything from the next Task in the chain.
More information can be found here but it can get pretty confusing.
My quick guide:
ResultPath by itself if you want to append the data to the event
ResultPath + OutputPath of the same value if you want to cut off the Input and only have the output of the task continue (and it returns a JSON style object)

Passthrough input to output in AWS Step Functions

How can I passthrough the input to a Task state in an AWS Step Functions to the output?
After reading the Input and Output Processing page in the AWS docs, I have played with various combinations of InputPath, ResultPath and OutputPath.
State definition:
"First State": {
"Type": "Task",
"Resource": "[My Lambda ARN]",
"Next": "Second State",
"InputPath": "$.someKey",
"OutputPath": "$"
}
Input:
{
"someKey": "someValue"
}
Expected Result
I would like the output of the First State (and thus the input of Second State) to be
{
"someKey": "someValue"
}
Actual Result
[empty]
What if the input is more complicated, e.g.
{
"firstKey": "firstValue",
"secondKey": "secondValue"
}
I would like to forward all of it without worrying about (sub) paths.
In the Amazon States Language spec it is stated that:
If the value of ResultPath is null, that means that the state’s own raw output is discarded and its raw input becomes its result.
Consequently, I updated my state definition to
"First State": {
"Type": "Task",
"Resource": "[My Lambda ARN]",
"Next": "Second State",
"ResultPath": null
}
As a result, when passing the input example Task input payload will be copied to the output, even for rich objects like:
{
"firstKey": "firstValue",
"secondKey": "secondValue"
}
For those who find themselves here using CDK, the solution is to use the explicit aws_stepfunctions.JsonPath.DISCARD enum rather than None/null.
from aws_cdk import (
aws_stepfunctions,
aws_stepfunctions_tasks,
)
aws_stepfunctions_tasks.LambdaInvoke(
self,
"my_function",
lambda_function=lambda_function,
result_path=aws_stepfunctions.JsonPath.DISCARD,
)
https://docs.aws.amazon.com/cdk/api/latest/docs/#aws-cdk_aws-stepfunctions.JsonPath.html#static-discard
I was looking for a solution from passing input from one parallel state to another parallel state and the above option worked really good.
For example my step function is like this...tas1->parallel task2 -> parallel trask3 -> task4. So when it start with parallel task3, the input values are wiped out, so ptask3 is failing. With the above option, i was able to pass in same input from ptask2 to ptas3.

How to specify multiple result path values in AWS Step Functions

I have a AWS Step Function State formatted as follows:
"MyState": {
"Type": "Task",
"Resource": "<MyLambdaARN>",
"ResultPath": "$.value1"
"Next": "NextState"
}
I want to add a second value but can't find out how anywhere. None of the AWS examples display multiple ResultPath values being added to the output.
Would I just add a comma between them?
"MyState": {
"Type": "Task",
"Resource": "<MyLambdaARN>",
"ResultPath": "$.value1, $.value2"
"Next": "NextState"
}
Or is there a better way to format these?
Let's answer this straight up: you can't specify multiple ResultPath values, because it doesn't make sense. Amazon does do a pretty bad job of explaining how this works, so I understand why this is confusing.
You can, however, return multiple result values from a State in your State Machine.
General Details
The input to any State is a JSON object. The output of the State is a JSON object.
ResultPath directs the State Machine what to do with the output (result) of the State. Without specifying ResultPath, it defaults to $ which means all the input to the State is lost, replaced by the output of the State.
If you want to allow data from the input JSON to pass through your State, you specify a ResultPath to describe a property to add/overwrite on the input JSON to pass to the next State.
Your scenario
In your case, $.value1 means the output JSON of your State is the input JSON with a new/overwritten property value1 containing the output JSON of your lambda.
If you want multiple values in your output, your lambda should return a JSON object containing the multiple values, which will be the value of the value1 property.
If you don't care about allowing input values passing through your State, leave the ResultPath as the default $ by omitting it. The output JSON containing your multiple values will be the input to the next State.
Support scenario
Here's a simple State machine I use to play with the inputs and outputs:
{
"StartAt": "State1",
"States": {
"State1": {
"Type": "Pass",
"Result": { "Value1": "Yoyo", "Value2": 1 },
"ResultPath": "$.Result",
"Next": "State2"
},
"State2": {
"Type": "Pass",
"Result": { "Value2": 5 },
"ResultPath": "$.Result",
"Next": "State3"
},
"State3": {
"Type": "Pass",
"Result": "Done",
"End": true
}
}
}
Execute this with the following input:
{
"Input 1": 10000,
"Input 2": "YOLO",
"Input 3": true
}
Examine the inputs and outputs of each Stage. You should observe the following:
The input is passed all the way through, because the ResultPath always directs output to a Result property of the input.
The output of State1 is overwritten by the Output of State2. The net effect is Result.Value1 disappears and Result.Value2 is "updated".
Hopefully this clarifies how to use ResultPath effectively.
You cannot specify several values in ResultPath, because ResultPath defines the path of your result value in the json. The close analogy for ResultPath is a return value of a function, as your step can return only 1 value it should be put into 1 node in the resulting json.
If you have an input json
{
"myValue": "value1",
"myArray": [1,2,3]
}
And define your ResultPath as $.myResult the overall resulting json will be
{
"myValue": "value1",
"myArray": [1,2,3],
"myResult": "result"
}
Now you can truncate this json to pass only part of it to the next step in your function using OutputPath (e.g. OutputPath: "$.myResult")
InputPath and OutputPath can have several nodes in their definition, but ResultPath should always have only 1 node.
Seems like today it can be done by using ResultsSelector.
I think it would be like the following, but I haven't actually done this so I can't say for sure.
"ResultPath": {
"var1" : "$.value1",
"var2" : "$.value2"
},
=====
After looking further into this, I am convinced that there is no direct way to do what you want to do. Here is a way that could give you the results that you want.
1) You would omit the InputPath, OutputPath, and ResultPath from your step. This would mean that all of $. would be passed in as an input to your step function and that all of the output from the lambda function would be stored as $. In the lambda function you could set the results fields to be whatever you want them to be. The lambda function must return the modified input as it output.