How to send a notification on every task execution in a state machine on AWS Step Functions? - amazon-web-services

I am working with AWS Step Functions to orchestrate a workflow of multiple Batch jobs. The requirement is that the Batch jobs be executed sequentially and that, whenever the workflow transitions from one job to the next, a notification with the execution status of the task is sent to an SNS topic. I need to send a notification for both SUCCESS and FAILURE of a task.
I have tried Execution Events with CloudWatch Events rules, but Execution Events only give information about the state machine's execution, not about the execution of individual tasks.

As you have found, state transitions aren't published to CloudWatch Events, so you need to add the notification as a separate step; there is no way around this. Add a notify step that either invokes a Lambda function or publishes to SNS directly.
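A minimal sketch of the notify-step approach, written in Python so that the definition can be generated for any number of jobs. It assumes the Batch jobs run through the `batch:submitJob.sync` service integration and that notifications are sent through the `sns:publish` integration. The topic ARN, job queue, job definition names, and the helpers `notify_state` and `build_definition` are hypothetical placeholders, not values from the question.

```python
import json

# Hypothetical placeholders; substitute your own resources.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:batch-status"
JOB_QUEUE = "my-job-queue"


def notify_state(message):
    """Build a Task state that publishes a status message to SNS."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::sns:publish",
        "Parameters": {"TopicArn": TOPIC_ARN, "Message": message},
        "ResultPath": None,  # serialized as null: discard the SNS response
    }


def build_definition(job_definitions):
    """Chain Batch jobs sequentially, with a SUCCESS and a FAILURE notice each."""
    if not job_definitions:
        raise ValueError("at least one job definition is required")
    names = [f"Job{i + 1}" for i in range(len(job_definitions))]
    states = {"JobFailed": {"Type": "Fail", "Error": "BatchJobFailed"}}
    for i, (name, job_def) in enumerate(zip(names, job_definitions)):
        states[name] = {
            "Type": "Task",
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": name,
                "JobQueue": JOB_QUEUE,
                "JobDefinition": job_def,
            },
            "ResultPath": None,
            # Any error in the job routes to the failure notice.
            "Catch": [{
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.error",
                "Next": f"{name}FailureNotice",
            }],
            "Next": f"{name}SuccessNotice",
        }
        success = notify_state(f"{name}: SUCCESS")
        if i + 1 < len(names):
            success["Next"] = names[i + 1]  # continue with the next job
        else:
            success["End"] = True  # last job: finish the workflow
        states[f"{name}SuccessNotice"] = success
        failure = notify_state(f"{name}: FAILURE")
        failure["Next"] = "JobFailed"  # stop the workflow after notifying
        states[f"{name}FailureNotice"] = failure
    return {"StartAt": names[0], "States": states}


if __name__ == "__main__":
    print(json.dumps(build_definition(["job-def-a", "job-def-b"]), indent=2))
```

The printed JSON can be pasted into the state machine definition. Whether a failed job should stop the workflow or let it continue is a design choice; this sketch stops after sending the FAILURE notice.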
There is also another way to do this, because step functions can be composed of other step functions. You have a parent step function and a child step function. The child step function could be the Batch job itself, and you can then make use of CloudWatch Events on the batch-job-step-function step function:
"BatchJob": {
  "Comment": "This snippet is in the parent step function. It will kick off another step function, called: batch-job-step-function. Use Next instead of End if more states follow.",
  "Type": "Task",
  "Resource": "arn:aws:states:::states:startExecution.sync",
  "Parameters": {
    "StateMachineArn": "arn:aws:states:us-east-1:TODO:stateMachine:batch-job-step-function",
    "Input": {
      "batchJobInput.$": "$$.Execution.Input.batchJobInput"
    }
  },
  "End": true
}
Now you can put CloudWatch Events rules against: arn:aws:states:us-east-1:TODO:stateMachine:batch-job-step-function
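As a rough sketch, the event pattern for such a rule could be generated like this. It relies on the "Step Functions Execution Status Change" events that Step Functions emits under the `aws.states` source; the helper name `execution_status_pattern` and the default list of statuses are illustrative choices, and the ARN is the placeholder used above.

```python
import json


def execution_status_pattern(state_machine_arn,
                             statuses=("SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED")):
    """Build an event pattern matching status changes of one state machine."""
    return {
        "source": ["aws.states"],
        "detail-type": ["Step Functions Execution Status Change"],
        "detail": {
            "stateMachineArn": [state_machine_arn],
            "status": list(statuses),
        },
    }


if __name__ == "__main__":
    arn = "arn:aws:states:us-east-1:TODO:stateMachine:batch-job-step-function"
    print(json.dumps(execution_status_pattern(arn), indent=2))
```

A rule with this pattern and an SNS topic as its target would then notify on each completed child execution, which corresponds to one Batch job in this layout.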

Related

DS SDK - AWS Step Functions Lambda job cancelled immediately

I am getting a weird result when I try to deploy the Lambda step. If I define my Lambda step like this:
lambda_step = steps.compute.LambdaStep(
    "Query Training Results",
    parameters={
        "FunctionName": execution_input["LambdaFunctionName"],
        "Payload": {"TrainingJobName.$": "$.TrainingJobName"},
    },
)
For some reason, the box is automatically greyed out, meaning the job is cancelled immediately. If I simply remove the "Payload" part it works, but the Lambda step still fails because it does not know the training job name that I am trying to pass in the Payload.
I followed this example to the T here. Any suggestions would be greatly appreciated.

SNS email notification when a step is not kicked off within a threshold timeframe

I have an EMR step that is submitted through a Step Functions state machine. During the run I can see that the task is submitted, but the EMR step is not executed and the EMR console doesn't show any information.
How can I debug this?
How can I send an SNS notification when a step doesn't start executing within a threshold timeframe? In my case, Step Functions shows the EMR task as submitted, but there is no information in the EMR console, and the pipeline keeps running for more than half an hour without failing.
You could start debugging from the Step Functions execution log to identify the specific step that has failed, and then move on to the EMR console or whichever service has failed. Usually, when the EMR step doesn't appear in the EMR console, it is due to a runtime error caused by an exception raised when calling the EMR step.
For this scenario, you can use the error handling that Step Functions provides through the Catch and TimeoutSeconds fields; you can find more details in the AWS documentation here.
Basically, you need to add these fields as shown below:
{
  "StartAt": "EmrStep",
  "States": {
    "EmrStep": {
      "Type": "Task",
      "Resource": "arn:aws:emr:execute-X-step",
      "Comment": "This is your EMR step",
      "TimeoutSeconds": 10,
      "Catch": [ {
        "ErrorEquals": ["States.Timeout"],
        "Next": "ShutdownClusterAndSendSNS"
      } ],
      "End": true
    },
    "ShutdownClusterAndSendSNS": {
      "Type": "Pass",
      "Comment": "This step handles the timeout exception raised",
      "Result": "You can shutdown the EMR cluster to avoid increased cost here and later send a sns notification!",
      "End": true
    }
  }
}
Note: To catch the timeout exception, you have to catch the error States.Timeout, but you can also define the same Catch field for other types of errors.
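To illustrate catching several error types on the same state, here is a small sketch that adds a timeout and an ordered list of catchers to an existing Task state. The helper `with_timeout_and_catch` and the `NotifyFailure` handler state are hypothetical names; the 1800-second timeout only mirrors the half hour mentioned in the question.

```python
import copy


def with_timeout_and_catch(state, timeout_seconds, catchers):
    """Return a copy of a Task state with a timeout and Catch rules added.

    `catchers` is a list of (error_names, next_state) pairs, kept in order.
    """
    updated = copy.deepcopy(state)
    updated["TimeoutSeconds"] = timeout_seconds
    updated["Catch"] = [
        {"ErrorEquals": list(errors), "Next": next_state}
        for errors, next_state in catchers
    ]
    return updated


emr_step = {
    "Type": "Task",
    "Resource": "arn:aws:emr:execute-X-step",  # placeholder from the answer
    "End": True,
}

guarded = with_timeout_and_catch(
    emr_step,
    timeout_seconds=1800,
    catchers=[
        (["States.Timeout"], "ShutdownClusterAndSendSNS"),
        # Hypothetical catch-all handler; States.ALL has to stand alone
        # and come last in the Catch list.
        (["States.ALL"], "NotifyFailure"),
    ],
)
```

Note that TimeoutSeconds bounds the whole task, so for a synchronous EMR step it covers the running time of the step as well as the time it takes to start.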

AWS Step Function Synchronous Task Token

I have a use case which I want to use Step Functions to solve but I can't find a way to solve this problem. Your help would be greatly appreciated.
The problem goes like this: I have an Amazon API Gateway which has a /start endpoint. A POST to this endpoint should start a data processing session and return a URL to an app which the API client can use to capture some data. Once data capture is complete, some processing takes place before the final response is sent to the API client via a callback.
My thinking, as you can see below, is to generate a task token and send it to the Data Capture Service. Then, when the user data capture is complete, the service can send a request to the Step Function API to say that stage is complete. The problem with this is how can I return the URL to the client from within the Step Function? I don't want to use a callback to do this.
One option is to create the data capture session within the 'Step Function Initiator' Lambda but then how do I provide the Data Capture Service with a task token?
Really, what I need is some mechanism of synchronously returning something (either a URL from that call or the task token from the first stage) from within the Step Function to the Lambda which started the execution. Is this possible? How would you solve this?
In the Step Function initiator Lambda, you are presumably calling start-execution, which returns an executionArn.
Next, you can loop and call the get-execution-history API; the task token will be part of the 'capture data' task parameters. Since this is the first step, it should be scheduled within a couple of seconds, so you can run the loop every second until the desired step has started and the task token can be obtained.
Take this example, where I am passing the task token from the current step function to another step function call.
{
  "StartAt": "ChildTask",
  "States": {
    "ChildTask": {
      "End": true,
      "Type": "Task",
      "Resource": "arn:aws:states:::states:startExecution.waitForTaskToken",
      "Parameters": {
        "Input": {
          "token.$": "$$.Task.Token",
          "foo": "bar"
        },
        "StateMachineArn": "arn:aws:states:us-east-1:110011001100:stateMachine:ChildStateMachine",
        "Name": "MyExecutionName"
      }
    }
  }
}
Get Execution history:
aws stepfunctions get-execution-history --execution-arn arn:aws:states:us-east-1:110011001100:execution:ParentStateMachine:667102b3-b19c-b7ab-b119-9ec6cf23e505
Result:
The task token appears in the parameters of one of the first few entries in the execution history. Once it shows up, we can exit the loop, grab the token, and send it back to API Gateway.
{
  "timestamp": "2021-03-12T13:56:58.097000-05:00",
  "type": "TaskScheduled",
  "id": 3,
  "previousEventId": 2,
  "taskScheduledEventDetails": {
    "resourceType": "states",
    "resource": "startExecution.waitForTaskToken",
    "region": "us-east-1",
    "parameters": "{\"Input\":{\"foo\":\"bar\",\"token\":\"o6QVQ9gls.......=\"},\"StateMachineArn\":\"arn:aws:states:us-east-1:110011001100:stateMachine:ChildStateMachine\",\"Name\":\"MyExecutionName\"}"
  }
}
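The polling loop described above could be sketched as follows. The function takes any client object that behaves like `boto3.client("stepfunctions")`, so it can be tried without AWS access. The name `find_task_token`, the retry defaults, and the assumption that the token sits under `Input.token` (as in the sample event) are illustrative. Only the first page of history is inspected, which should be enough when the task is one of the first states.

```python
import json
import time


def find_task_token(sfn_client, execution_arn, attempts=10, delay_seconds=1.0):
    """Poll the execution history until a TaskScheduled event exposes a token.

    Returns the token string, or None if it did not appear in time.
    """
    for attempt in range(attempts):
        response = sfn_client.get_execution_history(executionArn=execution_arn)
        for event in response.get("events", []):
            if event.get("type") != "TaskScheduled":
                continue
            details = event.get("taskScheduledEventDetails", {})
            # `parameters` is a JSON-encoded string, as in the sample event.
            parameters = json.loads(details.get("parameters", "{}"))
            token = parameters.get("Input", {}).get("token")
            if token:
                return token
        if attempt + 1 < attempts:
            time.sleep(delay_seconds)  # wait before asking again
    return None
```

In the initiator Lambda this would be called right after start-execution, for example `find_task_token(boto3.client("stepfunctions"), execution_arn)`, with the result returned to API Gateway.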

How long does AWS Step Functions keep an execution running?

I am new to AWS Step Functions. I have created a basic step function with an activity worker in the back end. How long does Step Functions keep the execution alive, without timing out, if the task has still not been picked up by the activity worker?
1 year.
You can specify TimeoutSeconds in the activity task, which is also the recommended approach:
"ActivityState": {
  "Type": "Task",
  "Resource": "arn:aws:states:us-east-1:123456789012:activity:HelloWorld",
  "TimeoutSeconds": 300,
  "HeartbeatSeconds": 60,
  "Next": "NextState"
}
Step Functions can keep the task in the queue for a maximum of 1 year. You can find more information about Step Functions limits on this page.
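For context, a bare-bones activity worker might look like the sketch below. It expects a client that behaves like `boto3.client("stepfunctions")`; the function name `poll_once`, the worker name, and the handler are hypothetical. Heartbeats are left out, so with HeartbeatSeconds set as above the handler would have to finish, or call send_task_heartbeat itself, within that interval.

```python
import json


def poll_once(sfn_client, activity_arn, handler, worker_name="worker-1"):
    """Fetch at most one activity task and report its outcome.

    Returns True if a task was handled (successfully or not),
    False if the long poll came back without a task.
    """
    task = sfn_client.get_activity_task(
        activityArn=activity_arn, workerName=worker_name
    )
    token = task.get("taskToken")
    if not token:
        return False  # no task was waiting during the poll
    try:
        result = handler(json.loads(task["input"]))
    except Exception as exc:
        # Report the failure so the state machine can retry or catch it.
        sfn_client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    else:
        sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    return True
```

A worker process would call `poll_once` in a loop. Until some worker picks the task up, the execution simply waits, bounded by TimeoutSeconds if it is set.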

How can I create an "aws.cloudformation" CloudWatch event type for a specific CloudFormation stack?

I need to create an aws.cloudformation event type for a specific CloudFormation stack. For example when StackA receives the UpdateStack event, I need to be able to catch that event.
Through the console I was able to create the following event rule (which is an AWS API Call via CloudTrail type event):
{
  "source": [
    "aws.cloudformation"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "cloudformation.amazonaws.com"
    ],
    "eventName": [
      "UpdateStack",
      "CreateStack"
    ]
  }
}
However, this event isn't for any specific CloudFormation stack, and I don't see any option for adding anything specific (such as whenever StackA gets an UpdateStack call).
The documentation for event types gives examples of other event types and how we can add a specific resource that triggers the event. For example, with the aws.codepipeline event, you can specify a pipeline value equal to PipelineA, and then the event would get triggered whenever PipelineA gets to the state you specified in the State parameter.
How can I do something similar with an aws.cloudformation event type?
Unfortunately, the only way (as far as I have found) to get stack-specific events is the notification configuration of the stack itself, which can only be provided on creation or update.
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-console-add-tags.html
I'm looking for a similar solution and found this, which should work:
https://aws.amazon.com/ru/blogs/mt/tracking-aws-service-catalog-products-provisioned-by-individual-saml-users/
Step 5
aws events put-rule --name "sc-add-user" --event-pattern '{"source":["aws.cloudformation"],"detail-type":["AWS API Call via CloudTrail"],"detail":{"eventSource":["cloudformation.amazonaws.com"],"eventName":["CreateStack"]}}'
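One thing that may be worth trying for the stack-specific case is to narrow the CloudTrail-based pattern by request parameters. The sketch below assumes that the CloudTrail record for UpdateStack carries the stack name under detail.requestParameters.stackName; that field path is an assumption to verify against a real event, since the caller may pass either the stack name or the full stack ID. The helper name `stack_event_pattern` is hypothetical.

```python
import json


def stack_event_pattern(stack_name, event_names=("UpdateStack",)):
    """Build a CloudTrail-based event pattern limited to one stack.

    Assumption: the stack name is recorded at
    detail.requestParameters.stackName in the CloudTrail event.
    """
    return {
        "source": ["aws.cloudformation"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["cloudformation.amazonaws.com"],
            "eventName": list(event_names),
            "requestParameters": {"stackName": [stack_name]},
        },
    }


if __name__ == "__main__":
    # The printed JSON can be passed to `aws events put-rule --event-pattern`.
    print(json.dumps(stack_event_pattern("StackA")))
```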