What's the proper way to send part of a Step Function's input to a Batch Job?
I've tried setting an env var using Parameters.ContainerOverrides.Environment like this:
"Parameters": {
"ContainerOverrides": {
"Environment": [
{
"Name": "PARAM_1",
"Value": "$.param_1"
}
]
}
}
Step function input looks like this:
{
"param_1": "value-goes-here"
}
But the Batch job just ends up being invoked with the literal "$.param_1" in the PARAM_1 env var.
Fixed. The Value key simply needed the ".$" suffix, which tells Step Functions to treat the value as a JSON path rather than a literal string:
"Parameters": {
"ContainerOverrides": {
"Environment": [
{
"Name": "PARAM_1",
"Value.$": "$.param_1"
}
]
}
}
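On the container side, the override arrives as an ordinary environment variable. A minimal Python sketch of how the job's entrypoint might read it (the function name read_param_1 and the "unset" fallback are illustrative, not part of the original setup):

```python
import os


def read_param_1(default="unset"):
    """Return the PARAM_1 value injected through ContainerOverrides.Environment."""
    # os.environ.get avoids a KeyError when the job runs without the override.
    return os.environ.get("PARAM_1", default)


if __name__ == "__main__":
    print(f"PARAM_1={read_param_1()}")
```

If the job still prints the literal "$.param_1", the state definition is most likely missing the ".$" suffix on the Value key.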
Pass it in a "Parameters" block nested within the parent "Parameters". Note that all parameter values must be strings.
"MyStepTask": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "myjobdef",
"JobName": "myjobname",
"JobQueue": "myjobqueue",
"Parameters": { "p_param1":"101",
"p_param2":"201"
}
},
"Next": "MyNextStepTask"
}
If you want to pass parameters to Batch, add a Parameters section inside the parent Parameters section (not great naming!). Batch parameters are a flat map of names to string values, so use the parameter name itself as the key, with the .$ suffix to read its value from the state input:
"MyStepTask": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "myjobdef",
"JobName": "myjobname",
"JobQueue": "myjobqueue",
"Parameters": {
"PARAM_1.$": "$.param_1"
}
},
"Next": "MyNextStepTask"
}
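Batch substitutes these parameters into Ref::name placeholders in the job definition's command. A simplified Python model of that substitution, for illustration only (the helper substitute_refs is hypothetical, handles only whole-token placeholders, and is not Batch's actual implementation):

```python
def substitute_refs(command, parameters):
    """Replace whole-token Ref::<name> placeholders with parameter values."""
    prefix = "Ref::"
    resolved = []
    for token in command:
        if token.startswith(prefix):
            name = token[len(prefix):]
            # Assumption for this sketch: unresolved placeholders are left as-is.
            resolved.append(parameters.get(name, token))
        else:
            resolved.append(token)
    return resolved


command = ["python", "run.py", "Ref::p_param1", "Ref::p_param2"]
print(substitute_refs(command, {"p_param1": "101", "p_param2": "201"}))
```

The point of the sketch is that the parameter names in the state definition have to match the placeholder names used in the job definition's command.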
{
"Comment": "A description of my state machine",
"StartAt": "Batch SubmitJob",
"States": {
"Batch SubmitJob": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "arn:aws:batch:us-east-1:XXXXXXXXXXX:job-definition/clientcopyjobdef:1",
"JobQueue": "arn:aws:batch:us-east-1:XXXXXXXXXX:job-queue/copyclientjq",
"ContainerOverrides": {
"Command.$": [
"dotnet",
"CopyClientJob.dll",
"$.input"
]
},
"JobName.$": "$.input"
},
"End": true
}
}
}
I was trying to create this state machine. It works fine when the command is set directly in the job definition, but here I want to override the command and pass a command argument from the state input, so I tried the code above. "JobName.$": "$.input" works, but
"Command.$": [
"dotnet",
"CopyClientJob.dll",
"$.input"
]
does not work: the command is passed to AWS Batch as is, without the parameter being resolved. Can someone help with this?
Thanks
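Got the solution. A key ending in .$ must have a single string value (a path or an intrinsic function), not an array, so I needed to combine all three elements using States.Array, like below: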
{
"Comment": "A description of my state machine",
"StartAt": "Batch SubmitJob",
"States": {
"Batch SubmitJob": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "arn:aws:batch:us-east-1:XXXXXXXXX:job-definition/clientcopyjobdef:1",
"JobQueue": "arn:aws:batch:us-east-1:XXXXXXXX:job-queue/copyclientjq",
"ContainerOverrides": {
"Command.$": "States.Array('dotnet', 'CopyClientJob.dll', $.input)"
},
"JobName": "test"
},
"End": true
}
}
}
How can I use input in my state machine in the middle of a string?
For example, I want to build a command from input params like this:
["python test.py", "--name=$.name", "--age=$.age"]
But as per the AWS documentation, I can't pass it like that. I can only pass something like:
{"Command.$": "$.age"}
Why is this strange structure required? Why do I need to use .$ in the key? Why can't I use $.age freely anywhere?
What I want to achieve is something like this.
{
"Comment": "A description of my state machine",
"StartAt": "Run Pipeline",
"States": {
"Run Pipeline": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "aaaaaaaaaaaaaa",
"TaskDefinition": "aaaaaaaaaaaaaa",
"Overrides": {
"ContainerOverrides": [{
"Name": "my_container",
"Command": [
"python test.py",
"--name=$.name",
"--age=$.age"
]
}]
}
},
"End": true
}
}
}
where my step function input is
{"name": "Rahul", "age": 25}
The .$ tells Step Functions that you are passing a path and not a literal string value.
See the Input/Output Path Parameters documentation.
If you want to construct a string from inputs, you should be able to use intrinsic string formatting:
States.Format('python test.py --name={} --age={}', $.name, $.age)
See the Intrinsic Functions documentation.
{
"Comment": "A description of my state machine",
"StartAt": "Run Pipeline",
"States": {
"Run Pipeline": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "aaaaaaaaaaaaaa",
"TaskDefinition": "aaaaaaaaaaaaaa",
"Overrides": {
"ContainerOverrides": [{
"Name": "my_container",
"Command.$": "States.Array(States.Format('python test.py --name={} --age={}', $.name, $.age))"
}]
}
},
"End": true
}
}
}
States.Array makes an array out of a list of inputs.
States.Format lets you construct a string using literal and interpolated (variable) inputs.
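To see what shape that Command value takes, here is a rough Python stand-in for the two intrinsics. The helpers states_format and states_array are hypothetical names for this sketch; they only mimic the basic behavior described above and ignore details such as escaping rules:

```python
def states_format(template, *args):
    """Rough stand-in for States.Format: fill each {} with the next argument."""
    return template.format(*args)


def states_array(*items):
    """Rough stand-in for States.Array: collect the arguments into a list."""
    return list(items)


state_input = {"name": "Rahul", "age": 25}
command = states_array(
    states_format(
        "python test.py --name={} --age={}",
        state_input["name"],
        state_input["age"],
    )
)
print(command)
```

Note that the result is a one-element array holding the whole command line. Whether the container accepts that depends on its entrypoint; if it expects one argument per element, one States.Format call per argument inside States.Array would produce that shape instead.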
I'm sure someone will point me to an immediate solution, but I've been at this for hours, so I'm just going to ask.
I cannot get a State Machine to accept an initial input. The intent is to set up an EventBridge trigger pointed at the State Machine, with a static JSON passed to the SM so it starts with the proper parameters. In development, I'm just using the Step Functions option to pass a JSON as the initial input when you select "New Execution".
This is the input:
{"event":{
"country": "countryA",
"landing_bucket": "aws-glue-countryA-inputs",
"landing_key": "countryA-Bucket/prefix/filename.csv",
"forecast_bucket": "aws-forecast-countryA",
"forecast_key": "inputs/",
"date_start": "2018-01-01",
"validation": "False",
"validation_size": 90
}
}
This is what the ExecutionStarted log entry shows as passed:
{
"input": {
"country": "countryA",
"landing_bucket": "aws-glue-countryA-inputs",
"landing_key": "countryA-Bucket/prefix/filename.csv",
"forecast_bucket": "aws-forecast-countryA",
"forecast_key": "inputs/",
"date_start": "2018-01-01",
"validation": "False",
"validation_size": 90
}
,
"inputDetails": {
"truncated": false
},
"roleArn": "arn:aws:iam::a-valid-service-role"
}
This is the State Machine:
{
"Comment": "A pipeline!",
"StartAt": "Invoke Preprocessor",
"States": {
"Invoke Preprocessor": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"InputPath": "$.input",
"Parameters": {
"FunctionName": "arn:aws:lambda:my-lambda-arn:$LATEST"
},
"Next": "EndSM"
},
"EndSM": {
"Type": "Pass",
"Result": "Ended",
"End": true
}
}
}
I've tried nearly everything I can think of, from changing the InputPath to assigning the "input" dictionary directly to a variable:
"$.event":"$.input"
to drilling down to the individual variables and assigning those directly, like:
"$.country": "$.country"
I've also used the new Step Functions Data Flow Simulator and can't get anywhere. If anyone has any thoughts, I'd really appreciate it.
Thanks!
Edited for correct solution:
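You need to set the Payload.$ parameter to $, which passes the entire input object to the Lambda. The InputPath of "$.input" is also removed, because the execution input has no top-level input key (input is only the name of the field in the log entry).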
{
"Comment": "A pipeline!",
"StartAt": "Invoke Preprocessor",
"States": {
"Invoke Preprocessor": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"FunctionName": "arn:aws:lambda:my-lambda-arn:$LATEST",
"Payload.$": "$"
},
"Next": "EndSM"
},
"EndSM": {
"Type": "Pass",
"Result": "Ended",
"End": true
}
}
}
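With "Payload.$": "$", the Lambda receives the execution input unchanged. A minimal Python handler sketch under that assumption (the returned fields, including landing_uri, are made up for illustration; only the input keys come from the question):

```python
def lambda_handler(event, context):
    """Receive the whole execution input, forwarded by "Payload.$": "$"."""
    # The execution input wraps everything in a top-level "event" key.
    params = event["event"]
    return {
        "country": params["country"],
        "landing_uri": f"s3://{params['landing_bucket']}/{params['landing_key']}",
    }
```

Calling the handler locally with the same JSON used for "New Execution" is a quick way to check the key names before wiring up EventBridge.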
Another option is to build the payload yourself in Parameters, which lets you pass in all of the JSON or only selected parts of it. With the lambda:invoke resource, the custom keys go inside Payload:
{
"Comment": "A pipeline!",
"StartAt": "Invoke Preprocessor",
"States": {
"Invoke Preprocessor": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"InputPath": "$",
"Parameters": {
"FunctionName": "arn:aws:lambda:my-lambda-arn:$LATEST",
"Payload": {
"input_event.$": "$.event"
}
},
"Next": "EndSM"
},
"EndSM": {
"Type": "Pass",
"Result": "Ended",
"End": true
}
}
}
In the Lambda code, you can then read it like this (Python):
input_event = event['input_event']
I have a Batch job with a single job definition whose behavior depends on a parameter passed in the container's command.
The original value is "--param2=XXX", but I need it to be dynamic, driven by the Step Functions input:
{
"param2": "--param2=YYY"
}
I haven't been able to replace the value in the Step Function with the input value:
{
"Step1": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "arn:aws:batch:us-east-2:zzzzzzzzz:job-definition/XXXXXX",
"JobQueue": "arn:aws:batch:us-east-2:zzzzzzzz:job-queue/YYYYYY",
"JobName": "Step1",
"ContainerOverrides": {
"Environment": [
{
"Name": "envparam",
"Value": "0"
}
],
"Command": [
"python",
"run.py",
"--param=val",
"$.param2"
]
}
},
"Next": "Step2"
}
}
I found a solution: add a parameter to the Batch Parameters block and reference it in the command as Ref::Param2.
This is the complete code:
{
"Step1": {
"Type": "Task",
"Resource": "arn:aws:states:::batch:submitJob.sync",
"Parameters": {
"JobDefinition": "arn:aws:batch:us-east-2:zzzzzzzzz:job-definition/XXXXXX",
"JobQueue": "arn:aws:batch:us-east-2:zzzzzzzz:job-queue/YYYYYY",
"JobName": "Step1",
"Parameters": {
"Param2.$": "$.param2"
},
"ContainerOverrides": {
"Environment": [
{
"Name": "envparam",
"Value": "0"
}
],
"Command": [
"python",
"run.py",
"--param=val",
"Ref::Param2"
]
}
},
"Next": "Step2"
}
}
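After Batch resolves Ref::Param2, the script sees an ordinary command-line flag. A sketch of how run.py might parse it with argparse (the contents of run.py are not shown in the question, so the parser below is an assumption, as is the "XXX" default taken from the original value):

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that the Batch command passes to run.py."""
    parser = argparse.ArgumentParser(description="Hypothetical run.py entry point")
    parser.add_argument("--param", default=None)
    parser.add_argument("--param2", default="XXX")
    return parser.parse_args(argv)


# Simulates the command once Ref::Param2 has been replaced by "--param2=YYY".
args = parse_args(["--param=val", "--param2=YYY"])
print(args.param, args.param2)
```

Because the state input carries the whole flag ("--param2=YYY"), the placeholder expands to a complete argument rather than just its value.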
I'm working on an AWS Step Functions state machine that gets an array of dates from a Lambda call, then passes it to a Task that should take that array as a parameter for another Lambda.
The Get Date Range task works fine and outputs the date array:
{
"rng": [
"2019-05-07",
"2019-05-09"
]
}
...and the array gets passed into the ProcessDateRange task, but I cannot assign the array to the range parameter.
It literally passes the string "$.rng" instead of this:
[
"2019-05-07",
"2019-05-09"
]
Here's the StateMachine:
{
"StartAt": "Try",
"States": {
"Try": {
"Type": "Parallel",
"Branches": [{
"StartAt": "Get Date Range",
"States": {
"Get Date Range": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:get-date-range",
"Parameters": {
"name": "thename",
"date_query": "SELECT date from sch.tbl_dates;",
"database": "the_db"
}
,
"ResultPath": "$.rng",
"TimeoutSeconds": 900,
"Next": "ProcessDateRange"
},
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
"Exit": {
"Type": "Succeed"
}
}
}],
"Catch": [{
"ErrorEquals": ["States.ALL"],
"ResultPath": "$.Error",
"Next": "Failed"
}],
"Next": "Succeeded"
},
"Failed": {
"Type": "Fail",
"Cause": "There was an error. Please review the logs.",
"Error": "error"
},
"Succeeded": {
"Type": "Succeed"
}
}
}
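The literal "$.rng" is passed because the parameter name lacks the .$ suffix. One alternative is to drop Parameters and set the InputPath key, which passes the array itself as the Lambda input (not wrapped in a range key), for example: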
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"InputPath": "$.rng",
"ResultPath": "$",
"Next": "Exit"
},
If you want a parameter to be interpreted as a JSON path instead of a literal string, add ".$" to the end of the parameter name. To modify your example:
"ProcessDateRange": {
"Type": "Task",
"Resource": "arn:aws:lambda:us-east-1:123456789:function:process-date-range",
"Parameters": {
"range.$": "$.rng"
},
"ResultPath": "$",
"Next": "Exit"
},
Relevant docs here: https://docs.aws.amazon.com/step-functions/latest/dg/connectors-parameters.html#connectors-parameters-path
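For the "range.$" variant, the process-date-range function receives a real list under the range key. A minimal Python handler sketch under that assumption (the summary it returns is invented for illustration; the actual function body is not shown in the question):

```python
from datetime import date


def lambda_handler(event, context):
    """Handle the payload built by "range.$": "$.rng"."""
    # With the .$ suffix the function gets a list, not the string "$.rng".
    dates = [date.fromisoformat(d) for d in event["range"]]
    return {
        "start": min(dates).isoformat(),
        "end": max(dates).isoformat(),
        "count": len(dates),
    }
```

With the InputPath variant instead, event itself would be the list, so the handler would iterate over event directly.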