AWS Flow Framework: Can we run activity worker and activity task on different EC2 instances

I am a newbie to AWS and want to use the Simple Workflow Service.
So far, this is what I know from
URL: http://docs.aws.amazon.com/amazonswf/latest/awsflowguide/awsflow-basics-application-structure.html
1) The workflow starter, workflow worker (decider), and activities worker can run on the same EC2 instance, or each one can run on a different EC2 instance.
2) The activities worker executes the activity tasks (activity methods).
My question is:
1) Can I run the workflow starter, workflow worker (decider), and activities worker on one EC2 instance,
and the activity tasks (activity methods) on a different EC2 instance?
Example:
EC2 instance 1 -> workflow starter, workflow worker (decider), activities worker
EC2 instance 2 -> activity tasks (activity methods)
If this is possible, can anyone point me to an example?
I have looked at the AWS Hello World distributed application, but it
runs the workflow worker and the activities worker on different EC2 instances, where the activity task runs on the same machine as the activities worker.
URL: http://docs.aws.amazon.com/amazonswf/latest/awsflowguide/getting-started-example-helloworldworkflowdistributed.html
Requirement
At the beginning, only one EC2 instance will be running:
1) On this running EC2 instance, I will have the workflow starter and the workflow worker (decider).
2) Based on the task received in the decision task list, the decider will execute custom logic, and based on the
outcome of that logic I want to create a new EC2 instance and run the activity worker on it.
The issue is that I read somewhere that the workflow worker (decider) and the activity worker must be started before a task arrives in the decision task list, and in my case I cannot keep two EC2 instances running from the beginning for cost reasons.
Hence, to solve this issue, my solution was:
1) Start the decider and the activity worker on the running EC2 instance.
2) Once the activity worker receives a task from the activity task list, it will create a new EC2 instance and execute the activity on it.
Thanks

Yes, you can definitely do this. You can also have multiple workers (both workflow and activity) on multiple machines. You should not really care where the workflow and activities run, as long as the work is getting done.
Here is a sample I wrote to show how to use the Flow Framework with Java 1.8:
https://github.com/mirceal/swf-flow-java18-sample
In App.java, comment out line 42 (aw.start()) and build the sample. After that, comment out line 43 (wfw.start()) instead and build the sample again. Copy the two built samples to two different machines and you should achieve what you are asking for.
Line 51 starts the workflow.
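As a rough illustration of that split (this is not the exact sample code; the domain, task list, and implementation class names are placeholders), each machine simply starts only the worker it is responsible for:

```java
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflow;
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClientBuilder;
import com.amazonaws.services.simpleworkflow.flow.ActivityWorker;
import com.amazonaws.services.simpleworkflow.flow.WorkflowWorker;

public class App {
    public static void main(String[] args) throws Exception {
        AmazonSimpleWorkflow swf = AmazonSimpleWorkflowClientBuilder.defaultClient();
        String domain = "SampleDomain";     // placeholder SWF domain
        String taskList = "SampleTaskList"; // placeholder task list

        // Build one copy with only this block active: it polls the activity
        // task list and executes the activity methods on that machine.
        ActivityWorker aw = new ActivityWorker(swf, domain, taskList);
        aw.addActivitiesImplementation(new SampleActivitiesImpl()); // placeholder activities class
        aw.start();

        // Build a second copy with only this block active: it polls the
        // decision task list and runs the decider on the other machine.
        WorkflowWorker wfw = new WorkflowWorker(swf, domain, taskList);
        wfw.addWorkflowImplementationType(SampleWorkflowImpl.class); // placeholder workflow class
        wfw.start();
    }
}
```

Both workers poll the same SWF domain, so it does not matter which physical machine each one runs on.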
Edit
Think of the workers as entities that have the ability to perform a task. The workers constantly poll SWF and, if there is work, they pick it up and execute it.
There is nothing preventing you from running workflow and activity workers on the same machine. There is also nothing preventing you from spinning up EC2 instances from an activity (see the sketch below).
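For example, here is a hedged sketch of an activity implementation that launches a new EC2 instance with the AWS SDK for Java. The AMI ID and instance type are placeholders, and the AMI is assumed to start your activity worker on boot:

```java
import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.RunInstancesRequest;
import com.amazonaws.services.ec2.model.RunInstancesResult;

public class ProvisioningActivitiesImpl {

    // Runs on the always-on instance; it only provisions the second
    // instance, the real activity work then runs over there.
    public String launchWorkerInstance() {
        AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();
        RunInstancesRequest request = new RunInstancesRequest()
                .withImageId("ami-xxxxxxxx")   // placeholder AMI with your activity worker baked in
                .withInstanceType("t2.micro")  // placeholder instance type
                .withMinCount(1)
                .withMaxCount(1);
        RunInstancesResult result = ec2.runInstances(request);
        return result.getReservation().getInstances().get(0).getInstanceId();
    }
}
```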
If you need to run only some activities on one machine and other activities on other machines (the ones you spin up), you would typically use different SWF task lists.
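A minimal sketch of that routing, assuming the Flow Framework for Java; the domain, task list name, and implementation classes are placeholders:

```java
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflow;
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClientBuilder;
import com.amazonaws.services.simpleworkflow.flow.ActivitySchedulingOptions;
import com.amazonaws.services.simpleworkflow.flow.ActivityWorker;

public class TaskListRouting {

    // On the machine you spun up: poll only the dedicated task list.
    public static void startHeavyWorker() throws Exception {
        AmazonSimpleWorkflow swf = AmazonSimpleWorkflowClientBuilder.defaultClient();
        ActivityWorker worker = new ActivityWorker(swf, "SampleDomain", "heavy-work-task-list");
        worker.addActivitiesImplementation(new SampleActivitiesImpl()); // placeholder activities class
        worker.start();
    }

    // In the decider: pass these options as the last argument when calling an
    // activity through its generated client, so that activity is routed to the
    // dedicated task list; everything else stays on the default task list.
    public static ActivitySchedulingOptions heavyWorkOptions() {
        return new ActivitySchedulingOptions().withTaskList("heavy-work-task-list");
    }
}
```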

Related

Make Azure Webjob Timer Trigger Not Run as a Singleton

I have a set of worker functions that are spun up as needed to pull from Service Bus Topic subscriptions when they are created. New workers are created when a new subscription is created, by way of a queued provisioning message that triggers a job to spin up the new worker to listen to the subscription. The problem is that now I want to be able to scale out the workers listening to the subscriptions when the app is scaled out. Since the provisioning job only creates the worker on a single instance, the effectiveness of scaling out is significantly reduced.
My thought was to create a second provisioning job that runs from a timer trigger to synchronize the running jobs with the current list of subscriptions. I run into the same problem with the timer job as with the Service Bus trigger, though, because it runs as a singleton in the web job and will likely run on the same instance each time, meaning I will still likely have one, maybe two, instances of a job per subscription no matter how much I scale out.
My question is: is it possible to create a timer job that is not run as a singleton? Meaning, can I configure a timer job that, for each instance of the scaled-out web job, will run on a set interval?
The Singleton attribute ensures that only one instance runs, with the help of distributed locking; this is part of the WebJobs SDK.
There are also singleton listeners: with a few additional settings you can ensure that your function runs as a singleton on a single instance. To ensure that only a single instance of the function is running when the web app scales out to multiple instances, apply a listener-level singleton lock on the function ([Singleton(Mode = SingletonMode.Listener)]). Listener locks are acquired when the JobHost starts, so if three scaled-out instances all start at the same time, only one of them acquires the lock and only one listener starts.
If instead you do not want this singleton behavior, so that the trigger runs on every instance, refer to the Multiple Instances documentation on MS Docs.

Auto scaling service in AWS without duplicating cron jobs

I have a (golang web server) service running on AWS on an EC2 instance (no auto scaling). This service has a few cron jobs that run throughout the day, and these jobs start when the service starts.
I would like to take advantage of auto scaling in some form on AWS. I have been looking at ECS and Beanstalk.
When I add auto scaling, I need the cron jobs to execute on only one of the scaled instances because of rate limits on external APIs. Right now the cron jobs are tightly coupled to the service, and I am looking for an option that does not require moving them to their own service.
How can I achieve this in a good way using AWS?
You're going to hit this problem as a general issue in any scalable application where crons cannot / should not run multiple times; it's not really AWS specific. I'm not sure to what extent you want to keep things coupled or how your crons are currently run, but here are a few suggestions that might work for you:
Create a "cron runner" instance with a limit to run crons on
You could create a separate ECS service which has no autoscaling and a fixed value of 1 instance. This instance would run the same copy of your code as your "normal" instances and would run crons. You would turn crons off on your "normal" instances. You might find that this can be a very small instance since it doesn't handle any web traffic.
Create a "cron trigger" instance which fires off crons remotely
Here you create one "trigger" instance which sends a request to your normal instances through an ALB. Because your ALB will route the request to 1 of the servers behind it the cron only gets run once. One watch out with this is that if your cron is long running, you may need to consider your request timeouts. You'll also have to think about retries etc but I assume you already have a process that can be adapted for that.
The above solutions can be adapted with message queues etc but the basis of both is that there is another instance of some kind which starts the cron and is separate from your normal servers. Depending on when your cron runs, you may only need to run this cron instance for a few hours per day so it can be cost efficient to do things like this.
Personally, I have used both methods in a multi-tenant application, and I had to go with running the crons like this due to the number of tenants and the time / resources it took to run the crons for all of them at once:
A CloudWatch schedule triggers a Lambda, which sends a message to SQS to queue a cron run for each tenant individually.
Cron servers (totally separate from the main web servers but running the same / similar code) pull the messages and run the cron for each tenant individually. For crons that are vital to run only once, they store a key in Redis so that SQS's "at least once" delivery can't cause a cron to run twice (see the sketch below).
This also helps handle failures, with retry policies and dead-letter queues managed in SQS.
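A minimal sketch of that "run once" guard, assuming the AWS SDK for Java for SQS and Jedis for Redis; the queue URL, Redis host, and key format are placeholders:

```java
import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class CronWorker {
    private static final String QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/cron-jobs"; // placeholder queue

    public static void main(String[] args) {
        AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
        Jedis redis = new Jedis("redis-host"); // placeholder Redis host

        while (true) {
            for (Message message : sqs.receiveMessage(QUEUE_URL).getMessages()) {
                // The message body is assumed to identify the tenant and the cron cycle.
                String cronKey = "cron:" + message.getBody();
                // SET NX with a TTL: only the first worker to claim the key runs the cron,
                // so "at least once" delivery cannot trigger it twice.
                String claimed = redis.set(cronKey, "running", SetParams.setParams().nx().ex(3600));
                if ("OK".equals(claimed)) {
                    runCronForTenant(message.getBody()); // placeholder for the real cron work
                }
                sqs.deleteMessage(QUEUE_URL, message.getReceiptHandle());
            }
        }
    }

    private static void runCronForTenant(String tenantId) { /* actual cron logic */ }
}
```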
Ultimately you need to kick off these crons from one place. If possible, change your crons so that it doesn't matter if they run twice; that makes it easier to deal with retries and things like that.

AWS Data Pipeline - Can we re-use the EC2 instance created during 'on-demand' pipeline activation?

Question -
Can I reuse the EC2 resource created during the first on-demand run of the data pipeline in subsequent on-demand runs as well?
Description -
I have configured an 'on-demand' AWS data pipeline which needs to be activated many times during a day (say 3 times within an hour).
(I cannot go with cron or time-series style scheduling since I have to pass different parameters to the pipeline at each execution.)
On each on-demand activation, Data Pipeline seems to create a new EC2 resource. Is this the case?
Can I reuse the EC2 resource created during the first on-demand run in subsequent runs as well?
The AWS documentation provides the following information, but it's not clear whether it applies to 'on-demand' pipelines as well:
AWS Data Pipeline allows you to maximize the efficiency of resources
by supporting different schedule periods for a resource and an
associated activity.
For example, consider an activity with a 20-minute schedule period. If
the activity's resource were also configured for a 20-minute schedule
period, AWS Data Pipeline would create three instances of the resource
in an hour and consume triple the resources necessary for the task.
Instead, AWS Data Pipeline lets you configure the resource with a
different schedule; for example, a one-hour schedule. When paired with
an activity on a 20-minute schedule, AWS Data Pipeline creates only
one resource to service all three instances of the activity in an
hour, thus maximizing usage of the resource.
This isn't possible with Data-Pipeline-managed resources. For this scenario, you would need to spin up the EC2 instance yourself and configure TaskRunner:
You can install Task Runner on computational resources that you
manage, such as an Amazon EC2 instance, or a physical server or
workstation. Task Runner can be installed anywhere, on any compatible
hardware or operating system, provided that it can communicate with
the AWS Data Pipeline web service.
To connect a Task Runner that you've installed to the pipeline
activities it should process, add a workerGroup field to the object,
and configure Task Runner to poll for that worker group value. You do
this by passing the worker group string as a parameter (for example,
--workerGroup=wg-12345) when you run the Task Runner JAR file.
This way Data Pipeline will not create any resources for you, and all activities will run on the EC2 instance that you provided.
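As a rough illustration (all values are placeholders), the activity in your pipeline definition would then reference the worker group instead of a runsOn resource, matching the --workerGroup value you pass when starting Task Runner on your own instance:

```json
{
  "id": "MyActivity",
  "type": "ShellCommandActivity",
  "command": "echo running on my own instance",
  "workerGroup": "wg-12345"
}
```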

How to finish long-running task on Tomcat when AWS AutoScaling is terminating the EC2 instance?

I have an application deployed to Tomcat 8 that is hosted in an Elastic Beanstalk environment with auto scaling enabled. In the application I have long-running jobs that must be finished, and all changes must be committed to a database.
The problem is that AWS might kill any EC2 instance during scale-in, and then some jobs might not be finished as expected. By default, AWS waits just 30 seconds and then kills the Tomcat process.
I've already changed the /etc/tomcat8/tomcat8.conf file: I set the SHUTDOWN_WAIT parameter to 3600 (the default is 60). But it didn't fix the issue: the whole instance is killed after 20-25 minutes.
Then I tried to configure a lifecycle hook via an .ebextensions file (as explained here). But I couldn't confirm that the lifecycle hook really postpones termination of the instance (still waiting for an answer from AWS support about that).
So the question is: do you know any "legal" ways to postpone or cancel instance termination when the autoscaling group scales in?
I want to have something like that:
AWS starts to scale in the autoscaling group
the autoscaling group sends a shutdown signal to the EC2 instance
the EC2 instance starts to stop all active processes
the Tomcat process receives a signal to shut down, but waits until the active job is finished
the application commits the job result (which might take up to 60 minutes)
the Tomcat process is terminated
the EC2 instance is terminated
Elastic Beanstalk consists of two parts: API and Worker. The API tier is auto scaled, so its instances can go down at any time. The Worker tier is meant for things that run longer. You can communicate between them with SQS; that is how it is designed.
For sure you can tweak the system. It is platform as a service, so you can force the auto scaling group not to scale in by setting the minimum number of instances equal to the maximum. You can also turn off the health check, which can likewise kill an instance. But that is hacking, and the latter can come back to bite you.
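If the lifecycle hook route mentioned in the question does pan out, the instance-side code would look roughly like this. This is only a sketch: it assumes a Terminating:Wait hook is already attached to the auto scaling group, that something on the instance is notified when the hook fires (omitted here), and the hook and group names are placeholders:

```java
import com.amazonaws.services.autoscaling.AmazonAutoScaling;
import com.amazonaws.services.autoscaling.AmazonAutoScalingClientBuilder;
import com.amazonaws.services.autoscaling.model.CompleteLifecycleActionRequest;
import com.amazonaws.services.autoscaling.model.RecordLifecycleActionHeartbeatRequest;
import com.amazonaws.util.EC2MetadataUtils;

public class GracefulShutdown {

    public static void finishJobsThenTerminate(Runnable longRunningJob) {
        AmazonAutoScaling autoScaling = AmazonAutoScalingClientBuilder.defaultClient();
        String instanceId = EC2MetadataUtils.getInstanceId();

        // Keep the instance in Terminating:Wait while the job commits its results.
        autoScaling.recordLifecycleActionHeartbeat(new RecordLifecycleActionHeartbeatRequest()
                .withAutoScalingGroupName("my-asg")          // placeholder group name
                .withLifecycleHookName("wait-for-tomcat")    // placeholder hook name
                .withInstanceId(instanceId));

        longRunningJob.run(); // e.g. let Tomcat drain and commit to the database

        // Tell the auto scaling group it may now terminate the instance.
        autoScaling.completeLifecycleAction(new CompleteLifecycleActionRequest()
                .withAutoScalingGroupName("my-asg")
                .withLifecycleHookName("wait-for-tomcat")
                .withInstanceId(instanceId)
                .withLifecycleActionResult("CONTINUE"));
    }
}
```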

Elastic beanstalk periodic tasks on autoscaled environment

On an autoscaled environment running a periodic task, if the environment is scaled up, do the periodic tasks get run on each instance? Or, more specifically, does each instance then post to the queue, leading to multiple "periodic tasks" running?
Yes. If there is some periodic task that should only be triggered once, you should have a separate auto scaled environment with a minimum and maximum of one instance to either perform the task or trigger it on one of your servers (for example, make a request to your load balancer and one of your instances will perform the task).
Yes, behind the scenes it's just a cron job on all your instances. The default scenario for using periodic tasks is to read the tasks from the SQS queue on the worker nodes.
So yes, if you are doing some kind of posting that has to happen only once, then you either need to put some logic in between or use a different solution.
(For example, generate some kind of time-based ID which identifies the cycle of the cron job, so that messages from the same cycle have the same ID; then it is easy to filter them and ignore everything after the first.)
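A small sketch of that time-based ID idea; the five-minute period and the in-memory set are placeholders, and a shared store such as Redis or DynamoDB would be needed when several consumers filter the duplicates:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashSet;
import java.util.Set;

public class CronCycleDedup {
    private static final Duration PERIOD = Duration.ofMinutes(5); // placeholder cron period
    private final Set<String> seenCycles = new HashSet<>();       // use a shared store across consumers

    // All instances posting during the same cron cycle compute the same ID,
    // because the timestamp is truncated to the cycle's time bucket.
    static String cycleId(Instant now) {
        long bucket = now.getEpochSecond() / PERIOD.getSeconds();
        return "cycle-" + bucket;
    }

    // Process only the first message seen for a given cycle, ignore the rest.
    boolean shouldProcess(String cycleId) {
        return seenCycles.add(cycleId);
    }
}
```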