Break out of loop in AWS SWF activity - amazon-web-services

I'm running a permanent loop in an SWF activity, for example a web crawler crawling the website www.example1.com. However, I don't want to wait until it finishes crawling; at a certain point I want to terminate the activity and switch it to crawl www.example2.com instead.
I have tried 'try-cancel' and 'terminate' on the workflow by workflow-id. It seems this only sends a signal to SWF so that the task is shown as finished in the AWS console, but the activity process on the worker keeps running.
Any solution for this?

When an activity is cancelled, the heartbeat call returns a flag that indicates this. So your activity loop should include heartbeating code to support cancellation. See the "activity heartbeat" section of the "error handling" page of the AWS Flow Framework for Java Developer Guide for an example.
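
For workers that call the SWF API directly instead of going through the Flow Framework, the same idea could look roughly like the sketch below. It assumes `swf` is a boto3 SWF client (`boto3.client("swf")`), whose `record_activity_task_heartbeat` response contains a `cancelRequested` flag. The function name `crawl_until_cancelled` and the `fetch_next_page` callable are hypothetical and only stand in for the real crawl step.

```python
import time


def crawl_until_cancelled(swf, task_token, fetch_next_page, heartbeat_every=5.0):
    """Run a crawl loop that stops once SWF reports a cancellation request.

    swf             -- assumed to be a boto3 SWF client (boto3.client("swf"))
    task_token      -- token received from poll_for_activity_task
    fetch_next_page -- hypothetical callable: crawls one page, returns False
                       when there is nothing left to crawl
    """
    last_beat = None
    pages = 0
    while True:
        now = time.monotonic()
        if last_beat is None or now - last_beat >= heartbeat_every:
            # The heartbeat response carries the cancellation flag.
            status = swf.record_activity_task_heartbeat(
                taskToken=task_token, details=f"pages={pages}"
            )
            last_beat = now
            if status["cancelRequested"]:
                # Report that the activity really stopped, then leave the loop.
                swf.respond_activity_task_canceled(
                    taskToken=task_token, details=f"stopped after {pages} pages"
                )
                return "canceled", pages
        if not fetch_next_page():
            return "finished", pages
        pages += 1
```

When the function returns "finished", the caller would still need to report completion itself, for example with `respond_activity_task_completed`. The workflow can then schedule a new activity for www.example2.com once it sees the cancellation.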

Related

Does billing of ACI continue to happen even when my python code is waiting for messages on service bus subscription?

I have simple Python code that subscribes to a Service Bus subscription. I have containerized it and deployed it to ACI on Azure.
When a message arrives on the Service Bus subscription, the code runs its logic and then waits indefinitely for another message to appear.
The code is what Azure has provided in its documentation for python sdk here
Since ACI is serverless and bills per second, I just wanted confirmation: will I be billed even when it is not executing my code and is only waiting for a message to appear on the topic/subscription (event-based)?
Yes, of course. You are charged as long as any container instance is in the running state; the cost stops only once you stop all container instances. So even if your code is just waiting, the instance is running and is billed.

Running Task In The Background

What technology allows a web application to process a task in the background without making the user wait until the task finishes?
For example, as a user:
1. I want to submit a form which requires heavy processing. (Assume it involves checks or other actions, document uploads, etc.)
2. After I submit the form, the task runs in the background, so I can go to another page and do something else.
2.1 At the same time, I might submit another form to the server.
The requests can be processed at the same time or placed in a queue.
3. I will receive a notification from the system whenever the server returns a response (regardless of whether it is a success or a failure).
This feature is similar to what Google Cloud Platform offers.
Try Kue or a similar library. The term to "google" is "[language] task queue".
You can of course roll your own, though it will be much easier if you make use of an existing server such as Redis or RabbitMQ, so that the queuing part is handled for you by the server and you can concentrate on your business logic.
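
To make the pattern concrete, here is a minimal in-process sketch using only the Python standard library. It is not Kue and not a replacement for Redis or RabbitMQ (the queue lives in memory and is lost on restart); the names `start_worker`, `heavy_processing` and the job tuples are hypothetical. It only shows the shape: the request handler enqueues a job and returns at once, a background worker processes jobs, and a callback delivers the success or failure notification.

```python
import queue
import threading


def start_worker(tasks, notify):
    """Start one background thread that processes queued jobs in order."""
    def run():
        while True:
            job = tasks.get()
            if job is None:           # sentinel: shut the worker down
                tasks.task_done()
                break
            job_id, func, args = job
            try:
                notify(job_id, "success", func(*args))
            except Exception as exc:  # report failures instead of crashing
                notify(job_id, "failure", exc)
            finally:
                tasks.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def heavy_processing(n):
    # Stand-in for the slow work triggered by the form submission.
    return sum(range(n))


tasks = queue.Queue()
results = []
start_worker(tasks, lambda job_id, status, value: results.append((job_id, status, value)))

# A request handler would only enqueue the job and return immediately.
tasks.put(("form-1", heavy_processing, (10,)))
tasks.put(("form-2", heavy_processing, ("oops",)))  # fails inside the worker
tasks.put(None)
tasks.join()  # a web app would not block here; this is only for the demo
```

With an external broker, `tasks.put` would become a publish call and the worker would run in a separate process, which is what lets jobs survive a web server restart.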

Approach to crashed workers in amazon swf

We're currently implementing a workflow in Amazon SWF where we submit jobs/workflow executions from our web application. Everything was fairly quick and painless to get set up using the Ruby Flow framework. As long as the deciders/activity workers don't crash we seem to be able to handle most issues/exceptions gracefully.
My question is, what is common practice for the scenario where the decider process crashes midway through a workflow execution? If the task fails in that way, is it possible to push an SNS notification (I've seen no examples) or something to indicate to another process that there's been an unexpected failure/crash?
There are various types of "decider" failures.
The workflow worker crashes while processing a decision. The decision task is automatically rescheduled after the specified timeout. Make sure that the workflow type's defaultTaskStartToCloseTimeout is not set too high. If the crash is not related to code correctness, the rescheduled task is processed and the workflow execution continues normally.
The workflow worker doesn't crash but the workflow execution itself fails. In this case you can use ListClosedWorkflowExecutions to count such failed workflows.
The workflow worker doesn't crash but a decision task cannot complete because RespondDecisionTaskCompleted fails due to a bug in the Flow framework. From SWF's point of view the task is never completed, so at some point it is marked as timed out and rescheduled. As the bug is still present, the new task again never completes and is rescheduled, and so on. A workflow execution that is experiencing this issue has a history whose tail consists of repeated "decision task scheduled, decision task timed out" events. If your workflow has a known execution time limit, the best way to catch this issue is to set a reasonable executionStartToCloseTimeout and look for timed-out workflow executions. If the decision task timeout is set too low, such workflows can also hit the limit on history size before the execution timeout.
All SWF metrics are now published to CloudWatch. So all completed and failed workflows send their metrics to CloudWatch, where you can create alarms that notify you when any workflow fails.
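
As a rough illustration of the ListClosedWorkflowExecutions approach mentioned above, the sketch below collects closed executions with a given close status. It assumes `swf` is a boto3 SWF client (`boto3.client("swf")`); the function name `closed_executions` and the 24-hour default window are hypothetical choices made for the example.

```python
from datetime import datetime, timedelta, timezone


def closed_executions(swf, domain, status, since=None):
    """Collect closed workflow executions with the given close status.

    swf    -- assumed to be a boto3 SWF client (boto3.client("swf"))
    status -- an SWF close status such as "FAILED" or "TIMED_OUT"
    since  -- oldest close time to include (defaults to the last 24 hours)
    """
    if since is None:
        since = datetime.now(timezone.utc) - timedelta(days=1)
    kwargs = {
        "domain": domain,
        "closeTimeFilter": {"oldestDate": since},
        "closeStatusFilter": {"status": status},
    }
    found = []
    while True:
        page = swf.list_closed_workflow_executions(**kwargs)
        found.extend(page["executionInfos"])
        token = page.get("nextPageToken")
        if not token:
            return found
        kwargs["nextPageToken"] = token  # fetch the next page of results
```

A monitoring process could call this periodically for "FAILED" and "TIMED_OUT" and publish its own notification (for example to SNS) when the result is not empty.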

aws swf - get workflow execution id from within the workflow

I am using Amazon SWF service to automate some recurring tasks.
I am trying to use signals to execute some commands on remote machines. After the commands are finished executing, I'd like to send a signal back to the workflow to indicate success or failure.
The question is how can I find the workflow execution id programmatically? This is required for the remote machines to send a signal.
Thanks
Per http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/SimpleWorkflow/WorkflowExecution.html, shouldn't
your_workflow_execution_variable.run_id
get you exactly what you're looking for?
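
Once the ids are known, they have to reach the remote machine somehow, for example as part of the command input. The sketch below shows how the remote side might then signal the workflow. It is written in Python rather than Ruby and assumes `swf` is a boto3 SWF client (`boto3.client("swf")`); the function name `report_result`, the signal name `commandFinished` and the JSON payload layout are hypothetical.

```python
import json


def report_result(swf, domain, workflow_id, run_id, succeeded, output=""):
    """Signal a specific workflow execution with the outcome of a command.

    swf is assumed to be a boto3 SWF client (boto3.client("swf")).
    workflow_id and run_id must be handed to the remote machine beforehand.
    """
    swf.signal_workflow_execution(
        domain=domain,
        workflowId=workflow_id,
        runId=run_id,
        signalName="commandFinished",  # hypothetical signal name
        input=json.dumps({"succeeded": succeeded, "output": output}),
    )
```

Note that SWF identifies an execution by the workflow id plus the run id, so passing both avoids signaling a later run that reuses the same workflow id.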

How to kill /re-start a long running task

Is there a way to kill / re-start a long running task in AWS SWF? Sometimes some of our tasks run for a longer duration and we would like to manually kill a certain task (either via UI or programmatically) and re-start the task if possible. How to achieve this?
The console is one option for manually killing a workflow.
You can also set timeouts for the whole workflow execution or for individual activities. These can be set when you register your activity or when you start your activity (defaultTaskStartToCloseTimeoutSeconds).
It's not clear what language you're using.
If you're using Java, then you should look into Exponential Retry in the Flow Framework. This makes the SDK restart your activity if it fails.
A long-running activity is expected to heartbeat using RecordActivityTaskHeartbeat. If the activity process hangs or crashes, this leads to a timeout failure after the short heartbeat interval instead of after the long task execution timeout.
The workflow code (decider) can always request activity cancellation through the RequestCancelActivityTask decision. The cancellation request is returned as output of the RecordActivityTaskHeartbeat call. The activity implementation should then cancel itself and report back to the service using the RespondActivityTaskCanceled API call.
See the Error Handling section of the AWS Flow Framework Developer Guide for the AWS Flow Framework way of cancelling activities.
Sometimes an activity implementation cannot support heartbeating and self-cancellation. The solution is to execute another "kill" activity that terminates the first activity execution. For example, under Unix such a kill activity could issue a "kill -9" command for the process that implements the first one.
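
The body of such a kill activity could be as small as the sketch below. It is Unix only and assumes the kill activity runs on the same host as the first activity and already knows its process id (how the PID is shared, for example through the workflow input, is left open here). The function name `kill_process` is hypothetical, and the child process is only a stand-in for the stuck activity.

```python
import os
import signal
import subprocess
import sys


def kill_process(pid):
    """Forcefully terminate a process by PID, like `kill -9 <pid>` (Unix only)."""
    try:
        os.kill(pid, signal.SIGKILL)
        return True
    except ProcessLookupError:
        return False  # the process had already exited


# Demo: start a long-running child as a stand-in for the stuck activity.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
killed = kill_process(child.pid)
child.wait(timeout=10)  # reap the child so it does not linger as a zombie
```

Because SIGKILL gives the process no chance to clean up or report back, the first activity will normally be closed by its timeout rather than by an explicit response to SWF.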