aws swf - get workflow execution id from within the workflow - amazon-web-services

I am using Amazon SWF service to automate some recurring tasks.
I am trying to use signals to execute some commands on remote machines. After the commands are finished executing, I'd like to send a signal back to the workflow to indicate success or failure.
The question is: how can I find the workflow execution ID programmatically? The remote machines need it in order to send a signal back.
Thanks

Per http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/SimpleWorkflow/WorkflowExecution.html, shouldn't
your_workflow_execution_variable.run_id
get you exactly what you're looking for?
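For completeness, here is a rough sketch (in Python with boto3, not the Ruby SDK referenced above) of how a remote machine could signal back once the workflow passes it its workflow_id and run_id. The domain name and signal name are made-up placeholders:

import boto3

swf = boto3.client("swf", region_name="us-east-1")  # region is an assumption

def signal_back(workflow_id, run_id, success):
    # Called on the remote machine once the command finishes.
    swf.signal_workflow_execution(
        domain="my-domain",                 # assumed SWF domain
        workflowId=workflow_id,             # workflow_id of the execution
        runId=run_id,                       # run_id of the execution
        signalName="command-finished",      # hypothetical signal name
        input='{"status": "success"}' if success else '{"status": "failure"}',
    )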

Related

How to complete a service task using camunda rest api

I am using Camunda workflows to automate various processes. I have come across a scenario where the process is not moving from a service task. Usually, we call the task/{taskid}/complete to complete the task, but since the process is stuck on a service task, I am not able to complete that task. Can anybody help me find a way to complete the service task?
You are using a service task. That basically means "a machine should do something". The "normal" implementation is to provide code (a Java delegate or a connector endpoint) that is called by the process engine to execute this task.
The alternative is to use the "external task" pattern. Think of external tasks as "user tasks for computers". So the process waits, tells subscribed clients that a job is to be done, and waits for their completion.
I suppose your process uses the second option? (You can check in the Modeler under "Implementation".) If so, completion can be done through the external task API; see the docs.
/external-task/{id}/complete
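For illustration, a minimal sketch of the external task flow against the Camunda REST API using Python's requests library; the base URL, worker ID and topic name are assumptions:

import requests

BASE = "http://localhost:8080/engine-rest"   # assumed Camunda REST endpoint
WORKER = "worker-1"                          # hypothetical worker id

# fetch and lock up to one external task for an assumed topic
tasks = requests.post(f"{BASE}/external-task/fetchAndLock", json={
    "workerId": WORKER,
    "maxTasks": 1,
    "topics": [{"topicName": "my-topic", "lockDuration": 60000}],
}).json()

for task in tasks:
    # ... do the actual work here ...
    # then complete the task so the process instance moves on
    requests.post(f"{BASE}/external-task/{task['id']}/complete",
                  json={"workerId": WORKER})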
If it is a connector, then checking the log will likely show that retries occurred and the transaction rolled back. After addressing the underlying issue, the service task (email) should execute without being triggered explicitly, and the following user task (Approval) should be created.

can I run azure webjob after other webjobs finish

say I have this:
Step 1: An Azure WebJob triggered by a timer; this job will create 1000 messages and put them in a queue.
Step 2: Another Azure WebJob triggered by the above message queue; this WebJob will process these messages.
Step 3: The final WebJob should only be triggered when all messages have been processed by step 2.
It looks like Azure Queue storage doesn't support ordering, and the only way is to use Service Bus. I am wondering, is that really the only way?
What I am thinking is this kind of process:
Put all these messages into an Azure table, with a GUID as the key and the status set to 0.
After finishing step 2, change the status of that message to 1 (i.e. finished), and trigger step 3 once every message has been done.
Will it work? Or maybe there are some NuGet packages that I can use to achieve what I want?
The simplest way, I think, is a combination of Azure Logic Apps and Azure Functions.
A Logic App is an automated, scalable workflow; you can trigger it with a timer, an HTTP request, etc. And Azure Functions is a serverless compute service that enables you to run code on demand without having to explicitly provision or manage infrastructure.
Logic Apps supports adding code with Functions, and using a Function is similar to using a WebJob. So I think you could create a Logic App with three Functions, and they will run one by one.
As for the WebJob, yes, the QueueTrigger doesn't support ordering. The Service Bus you mentioned does meet some of your requirements with its FIFO feature. However, you need to make sure your step 3 is triggered only after step 1, because the queue is already empty before the messages are created.
Hope my answer helps you.
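As a rough sketch of the table-status approach from the question (using the azure-data-tables Python package; the connection string, table name and helper are assumptions), step 2 could mark each message done and kick off step 3 once nothing is left pending:

from azure.data.tables import TableClient

CONN_STR = "..."  # your storage connection string (placeholder)
table = TableClient.from_connection_string(CONN_STR, table_name="jobstatus")

def mark_done_and_maybe_trigger(batch_id, message_id):
    # mark this message as finished (Status 1)
    entity = table.get_entity(partition_key=batch_id, row_key=message_id)
    entity["Status"] = 1
    table.update_entity(entity)

    # if no message in this batch is still pending, trigger step 3
    pending = table.query_entities(
        f"PartitionKey eq '{batch_id}' and Status eq 0")
    if not any(pending):
        trigger_step3(batch_id)  # hypothetical: enqueue the message that starts the final WebJob

Note that two workers finishing at the same time could both see an empty result, so step 3 should be made idempotent.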

Approach to crashed workers in amazon swf

We're currently implementing a workflow in Amazon SWF where we submit jobs/workflow executions from our web application. Everything was fairly quick and painless to get set up using the Ruby Flow framework. As long as the deciders/activity workers don't crash we seem to be able to handle most issues/exceptions gracefully.
My question is, what is common practice for the scenario where the decider process crashes midway through a workflow execution? If the task fails in that way, is it possible to push an SNS notification (I've seen no examples) or something to indicate to another process that there's been an unexpected failure/crash?
There are various types of "decider" failures.
The workflow worker crashes while processing a decision. The decision task is automatically rescheduled after the specified timeout, so make sure the workflow type's defaultTaskStartToCloseTimeout is not set too high. If the crash is not related to code correctness, the rescheduled task is processed and the workflow execution continues normally.
The workflow worker doesn't crash, but the workflow execution itself fails. In this case you can use ListClosedWorkflowExecutions to count such failed workflows (see the sketch after this list).
The workflow worker doesn't crash, but a decision task cannot complete because RespondDecisionTaskCompleted fails due to a bug in the Flow framework. Since, from SWF's point of view, the task is never completed, it is eventually marked as timed out and rescheduled. As the bug is still present, the new task again never completes and is rescheduled, and so on. A workflow execution experiencing this issue has a history whose tail consists of repeated "decision task scheduled, decision task timed out" events. If your workflow has a known execution time limit, the best way to catch this issue is to set a reasonable executionStartToCloseTimeout and look for timed-out workflow executions. If the decision task timeout is set too low, such workflows can also hit the limit on history size before the execution timeout.
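A sketch of counting failed and timed-out executions with boto3 (the domain name and time window are placeholders, and pagination is ignored for brevity):

from datetime import datetime, timedelta
import boto3

swf = boto3.client("swf")

def count_closed(status):
    # count executions closed with the given status in the last 24 hours
    window = {"oldestDate": datetime.utcnow() - timedelta(days=1)}
    resp = swf.list_closed_workflow_executions(
        domain="my-domain",                    # assumed domain name
        closeTimeFilter=window,
        closeStatusFilter={"status": status},  # e.g. FAILED or TIMED_OUT
    )
    return len(resp["executionInfos"])

print("failed:", count_closed("FAILED"))
print("timed out:", count_closed("TIMED_OUT"))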
All SWF metrics are now published to CloudWatch, so completed and failed workflows will send metrics to CloudWatch, where you can create alarms that send you notifications when any workflow fails.

Break out of loop in AWS SWF activity

I'm running a permanent loop in an SWF activity, say a web crawler crawling the website www.example1.com. However, I don't want to wait until it finishes crawling; at a certain point I want to terminate the activity and switch it to crawl www.example2.com instead.
I have tried to 'try-cancel' and 'terminate' the workflow by workflow ID. It seems like that just sends a signal to SWF marking the task as finished in the AWS console, but the activity process on the worker keeps running.
Any solution for this?
When an activity is cancelled, a heartbeat call returns a flag indicating that. So your activity loop should include heartbeating code to support cancellation. See the "Activity Heartbeat" section of the "Error Handling" page of the AWS Flow Framework for Java Developer Guide for an example.
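The same idea expressed outside the Flow Framework, as a rough boto3 sketch (crawl_one_page is a made-up placeholder for your unit of work):

import boto3

swf = boto3.client("swf")

def crawl_with_cancellation(task_token, urls):
    for url in urls:
        crawl_one_page(url)  # hypothetical: crawl one page, then check in with SWF
        # the heartbeat tells SWF the activity is alive and returns the cancellation flag
        hb = swf.record_activity_task_heartbeat(taskToken=task_token)
        if hb["cancelRequested"]:
            # acknowledge the cancellation so the workflow can move on
            swf.respond_activity_task_canceled(taskToken=task_token)
            return
    swf.respond_activity_task_completed(taskToken=task_token, result="done")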

How to kill /re-start a long running task

Is there a way to kill / re-start a long-running task in AWS SWF? Sometimes some of our tasks run for a long duration, and we would like to manually kill a certain task (either via the UI or programmatically) and re-start it if possible. How can we achieve this?
The console is one option to manually kill a workflow.
You can also set timeouts on the whole workflow execution or on individual activities. These can be set when you register your activity type or when you schedule the activity (e.g. defaultTaskStartToCloseTimeout).
It's not clear what language you're using.
If you're using Java, you should look into Exponential Retry in the Flow Framework. This makes the SDK restart your activity if it fails.
A long-running activity is expected to heartbeat using RecordActivityTaskHeartbeat. If the activity process hangs or crashes, this leads to a timeout failure after the short heartbeat interval instead of after the long task execution timeout.
The workflow code (decider) can always request activity cancellation through the RequestCancelActivityTask decision. The cancellation request is returned as output of the RecordActivityTaskHeartbeat call. The activity implementation should then cancel itself and report back to the service using the RespondActivityTaskCanceled API call (see the sketch at the end of this answer).
See Error Handling section of AWS Flow Framework Developer Guide for the AWS Flow Framework way of cancelling activities.
Sometimes an activity implementation cannot support heartbeating and self-cancellation. The solution is to execute another "kill" activity that terminates the first activity's execution. For example, under Unix such a kill activity could issue a "kill -9" command against the process that implements the first one.
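For the non-Flow route, a sketch of the decider side requesting cancellation with boto3; the activityId is whatever you used when scheduling the activity, and the function name is a placeholder:

import boto3

swf = boto3.client("swf")

def cancel_long_running_activity(decision_task_token, activity_id):
    # issued from the decider after it polls a decision task
    swf.respond_decision_task_completed(
        taskToken=decision_task_token,
        decisions=[{
            "decisionType": "RequestCancelActivityTask",
            "requestCancelActivityTaskDecisionAttributes": {
                "activityId": activity_id,  # the id used in ScheduleActivityTask
            },
        }],
    )

The activity then sees cancelRequested=True on its next RecordActivityTaskHeartbeat call and should call RespondActivityTaskCanceled, as in the heartbeat sketch above.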