Akka Cluster Scheduler - what happens when a node goes down

I want to have a scheduler in my cluster that sends some messages after some time. From what I can see, the scheduler is per actor system, and from my tests it is local to that actor system, not the cluster. So if I schedule something on one node and that node goes down, all of its scheduled tasks are discarded.
If I create a Cluster Singleton responsible for scheduling, would the already-made schedules survive it being recreated on some other node? Or should I keep it as a persistent actor that stores metadata about the created schedules and, in its preStart phase, reschedules everything that was persisted?

A cluster singleton will reincarnate on another node if the node it was previously on is downed or leaves the cluster.
That reincarnation will start with a clean slate: it won't remember its "past lives".
However, if it's a persistent actor (or, equivalently, its behavior is an EventSourcedBehavior in Akka Typed), it will recover its state on startup from the event stream (and/or snapshots). For a persistent actor, this typically doesn't require anything to be done in preStart: the persistence implementation will take care of replaying the events.
Depending on how many tasks are scheduled and if you want the schedule to be discarded on a full cluster restart, it may be possible to use Akka Distributed Data to have the schedule metadata distributed around the cluster (with tuneable consistency) and then have a cluster singleton scheduling actor read that metadata.
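For illustration, here is a minimal sketch of that persistent-singleton approach in Akka Typed (assuming akka-persistence-typed and akka-cluster-typed are on the classpath). The command, event and state names are made up for this example, not part of any Akka API:

```scala
import java.time.Instant
import scala.concurrent.duration._
import akka.actor.typed.{ActorSystem, Behavior}
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.typed.{ClusterSingleton, SingletonActor}
import akka.persistence.typed.{PersistenceId, RecoveryCompleted}
import akka.persistence.typed.scaladsl.{Effect, EventSourcedBehavior}

object SchedulerSingleton {
  sealed trait Command
  final case class Schedule(id: String, fireAt: Instant) extends Command
  private final case class Fire(id: String) extends Command

  sealed trait Event
  final case class Scheduled(id: String, fireAt: Instant) extends Event
  final case class Fired(id: String) extends Event

  final case class State(pending: Map[String, Instant])

  private def delayUntil(fireAt: Instant) =
    math.max(0L, fireAt.toEpochMilli - System.currentTimeMillis()).millis

  def apply(): Behavior[Command] =
    Behaviors.withTimers { timers =>
      EventSourcedBehavior[Command, Event, State](
        persistenceId = PersistenceId.ofUniqueId("scheduler-singleton"),
        emptyState = State(Map.empty),
        commandHandler = (_, cmd) =>
          cmd match {
            case Schedule(id, fireAt) =>
              // Persist first, then arm an in-memory timer for the remaining delay.
              Effect.persist(Scheduled(id, fireAt)).thenRun(_ => timers.startSingleTimer(id, Fire(id), delayUntil(fireAt)))
            case Fire(id) =>
              // Send the actual message to its destination here, then record completion.
              Effect.persist(Fired(id))
          },
        eventHandler = (state, evt) =>
          evt match {
            case Scheduled(id, fireAt) => State(state.pending + (id -> fireAt))
            case Fired(id)             => State(state.pending - id)
          }
      ).receiveSignal {
        // After recovery (including reincarnation on another node) re-arm timers
        // for everything that was persisted but has not fired yet.
        case (state, RecoveryCompleted) =>
          state.pending.foreach { case (id, fireAt) =>
            timers.startSingleTimer(id, Fire(id), delayUntil(fireAt))
          }
      }
    }

  // Run it as a cluster singleton so at most one instance exists in the cluster.
  def init(system: ActorSystem[_]) =
    ClusterSingleton(system).init(SingletonActor(apply(), "scheduler"))
}
```

Because every Scheduled/Fired event is journaled, the reincarnated singleton replays them and the RecoveryCompleted handler re-arms the timers, which is essentially the preStart rescheduling idea expressed the Akka Typed way.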

Related

How to handle a million individual actors with an Akka Cluster setup?

I have a use case where data for a million clients is processed by individual actors, and all the actors are created across multiple nodes and formed into a cluster.
When I receive a message, I have to send it to the particular actor in the cluster that is responsible for it. How can I map the message to that actor with an Akka cluster? I don't want it sent to other actors.
Is this use case achievable with Akka Cluster?
How will failure handling happen here?
I can't understand what a cluster singleton is; the docs say it will only be created on the oldest node. In my case I only want all million actors as singletons.
How is a particular actor in the cluster mapped to a message?
How can I create actors like this in a cluster?
Assuming that each actor is responsible for some uniquely identifiable part of state, it sounds like you want to use cluster sharding to balance the actors across the cluster, and route messages by ID.
Note that since the overhead of an actor is fairly small, it's not implausible to host millions of actors on a single node.
See the Akka JVM or Akka.Net docs.
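As a rough sketch of what that can look like with Akka Cluster Sharding in Akka Typed (entity, message and helper names here are illustrative, not prescribed by Akka):

```scala
import akka.actor.typed.{ActorSystem, Behavior}
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.sharding.typed.scaladsl.{ClusterSharding, Entity, EntityTypeKey}

// One sharded actor per client ID; sharding balances them across the cluster nodes.
object ClientEntity {
  sealed trait Command
  final case class Process(payload: String) extends Command

  val TypeKey: EntityTypeKey[Command] = EntityTypeKey[Command]("Client")

  def apply(clientId: String): Behavior[Command] =
    Behaviors.receiveMessage { case Process(payload) =>
      // This client's state lives only in this actor.
      println(s"client $clientId processing $payload")
      Behaviors.same
    }
}

object ClientSharding {
  def init(system: ActorSystem[_]) =
    ClusterSharding(system).init(Entity(ClientEntity.TypeKey)(ctx => ClientEntity(ctx.entityId)))

  // Routing by ID: sharding resolves which node currently hosts the entity and delivers there.
  def send(system: ActorSystem[_], clientId: String, payload: String): Unit =
    ClusterSharding(system).entityRefFor(ClientEntity.TypeKey, clientId) ! ClientEntity.Process(payload)
}
```

Entities are created on demand when the first message for their ID arrives, and if a node is lost they can be re-created on another node when the next message for them comes in, which also covers the failure-handling question.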

Failed cron job handling with elastic beanstalk and SQS

I have two elastic beanstalk environments.
One is the 'primary' web server environment and the other is a worker environment that handles cron jobs.
I have 12 cron jobs, set up via a cron.yaml file, that all point at API endpoints on the primary web server.
Previously my cron jobs were all running on the web server environment, but of course this created duplicate cron jobs when that environment scaled up.
My new implementation works nicely, but when a cron job fails to run as expected it repeats, generally within a minute or so.
I would rather avoid this behaviour and just attempt to run the cron job again at the next scheduled interval.
Is there a way to configure the worker environment/SQS so that failed jobs do not repeat?
Simply configure a CloudWatch event to take over your cron, and have it create an SQS message (either directly or via a Lambda function).
Your workers will now just have to handle SQS jobs and if needed, you will be able to scale the workers as well.
http://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html
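As a rough sketch of the "directly" option via the AWS SDK for Java v2, in Scala (the rule name, schedule expression and queue ARN are placeholders, and the queue also needs a resource policy allowing CloudWatch Events/EventBridge to send to it, which is not shown here):

```scala
import software.amazon.awssdk.services.cloudwatchevents.CloudWatchEventsClient
import software.amazon.awssdk.services.cloudwatchevents.model.{PutRuleRequest, PutTargetsRequest, Target}

object CronRuleToSqs {
  def main(args: Array[String]): Unit = {
    val events = CloudWatchEventsClient.create()

    // Placeholder values for this sketch.
    val ruleName = "nightly-report-cron"
    val queueArn = "arn:aws:sqs:us-east-1:123456789012:worker-queue"

    // The scheduled rule takes over the cron.yaml entry.
    events.putRule(
      PutRuleRequest.builder()
        .name(ruleName)
        .scheduleExpression("cron(0 3 * * ? *)") // 03:00 UTC daily
        .build()
    )

    // Deliver each scheduled event straight to the SQS queue the workers poll.
    events.putTargets(
      PutTargetsRequest.builder()
        .rule(ruleName)
        .targets(Target.builder().id("worker-queue").arn(queueArn).build())
        .build()
    )
  }
}
```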
Yes, you can set the Max retries parameter in the Elastic Beanstalk environment and the Maximum Receives parameter in the SQS queue to 1. This will ensure that the message is executed once, and if it fails, it will get sent to the dead letter queue.
With this approach, your instance may turn yellow if there are any failed jobs, because the messages end up in the dead letter queue. You can simply observe and ignore them, but it may be annoying if you need all environments to be green. You can set the Message Retention Period parameter for the dead letter queue to something short so that they go away sooner, though.
An alternative approach, if you're interested, is to return a status 200 OK in your code regardless of how the job ran. This will ensure that the SQS daemon deletes the message in the queue, so that it won't get picked up again.
Of course, the downside is that you would have to modify your code, but I can see how this would make sense if you don't care about the result.
Here's a link to AWS documentation that explains all of the parameters.
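If you'd rather set the SQS side of this in code than in the console, here is a minimal sketch using the AWS SDK for Java v2 from Scala (the queue URL and dead letter queue ARN are placeholders); a maxReceiveCount of 1 corresponds to the Maximum Receives setting mentioned above:

```scala
import scala.jdk.CollectionConverters._
import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.{QueueAttributeName, SetQueueAttributesRequest}

object ConfigureWorkerQueue {
  def main(args: Array[String]): Unit = {
    val sqs = SqsClient.create()

    // Placeholders: substitute your worker queue URL and dead letter queue ARN.
    val workerQueueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/worker-queue"
    val dlqArn = "arn:aws:sqs:us-east-1:123456789012:worker-queue-dlq"

    // maxReceiveCount = 1: each message is attempted once; on failure it moves to the
    // dead letter queue instead of being redelivered to the worker environment.
    val redrivePolicy = s"""{"deadLetterTargetArn":"$dlqArn","maxReceiveCount":"1"}"""

    sqs.setQueueAttributes(
      SetQueueAttributesRequest.builder()
        .queueUrl(workerQueueUrl)
        .attributes(Map(QueueAttributeName.REDRIVE_POLICY -> redrivePolicy).asJava)
        .build()
    )
  }
}
```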

How can I make two separate SQS queues in the same EC2 instance

I am using EC2 and SQS. How can I make two separate SQS queues in the same EC2 instance? Let's say the case is like this: I am using SQS to process my tasks in a queue, and each task takes a little long. Suddenly I got a requirement where I have to process 50k queued messages, which will take a minimum of a week. Here I want to make a new queue/thread for these 50k messages so that they do not make other incoming messages wait until they are processed, and so the main thread doesn't get delayed for newly arriving messages.
Your question doesn't quite read correctly, because SQS queues do not belong to EC2 instances. Queues are created at the account level, and an EC2 instance can use the AWS SDK client to create queues as needed.
From what you are saying, one approach to handle a sudden burst of messages in a queue would be to keep the messages in one queue and define an EC2 Auto Scaling group configured to scale EC2 instances up and down based on the queue length; see the AWS documentation for instructions.
Alternatively, if this queue has messages that need to be separated because back pressure from one message type shouldn't impact the other, then you should create multiple queues (either using the console or the SDK) and poll them independently. You could poll from multiple threads, poll from one thread and fan the work out to multiple threads, poll from multiple processes, or use completely different EC2 instances to poll from. You have a lot of options open to you here.
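From the application's point of view, the multiple-queue option can be as simple as the following sketch with the AWS SDK for Java v2 from Scala (queue names and handler logic are illustrative):

```scala
import scala.jdk.CollectionConverters._
import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.{CreateQueueRequest, DeleteMessageRequest, ReceiveMessageRequest}

object TwoQueueWorker {
  private val sqs = SqsClient.create()

  // Queues belong to the account, not the instance; the instance just creates and polls them.
  private val mainQueueUrl  = sqs.createQueue(CreateQueueRequest.builder().queueName("main-tasks").build()).queueUrl()
  private val batchQueueUrl = sqs.createQueue(CreateQueueRequest.builder().queueName("bulk-50k-tasks").build()).queueUrl()

  private def pollLoop(queueUrl: String)(handle: String => Unit): Unit =
    while (true) {
      val messages = sqs.receiveMessage(
        ReceiveMessageRequest.builder()
          .queueUrl(queueUrl)
          .maxNumberOfMessages(10)
          .waitTimeSeconds(20) // long polling
          .build()
      ).messages().asScala

      messages.foreach { m =>
        handle(m.body())
        sqs.deleteMessage(DeleteMessageRequest.builder().queueUrl(queueUrl).receiptHandle(m.receiptHandle()).build())
      }
    }

  def main(args: Array[String]): Unit = {
    // One polling thread per queue, so the 50k backlog never blocks the main queue.
    new Thread(() => pollLoop(mainQueueUrl)(body => println(s"main: $body"))).start()
    new Thread(() => pollLoop(batchQueueUrl)(body => println(s"bulk: $body"))).start()
  }
}
```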

Is the KCL for AWS Kinesis processing thread safe?

We have an application which processes data from Kinesis and maintains some state for a few seconds. We are concerned that the maintained state could be affected by the multithreaded nature of the KCL.
Can anybody tell us whether the RecordProcessor from the KCL is thread safe?
KCL is a wrapper library around your custom logic that processes your records.
The purpose of the library is to manage the Kinesis side of things while you focus on the record processing logic. KCL will align your EC2 workers to a certain shard or shards (usually 1 EC2 worker to 1 shard) and maintain a DynamoDB table which stores the checkpointed sequence numbers.
Your custom application logic is responsible for maintaining state and thread-safety.
By default, a list of Kinesis records (target size is defined by you) that you have picked up from your shard is passed to your code to be processed. You can do this sequentially or fork them to threads if you wish. Not until you return from this processing method will KCL request more records from the shard for you.
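To make that concrete, here is a minimal KCL 2.x record processor sketch in Scala (class and field names are illustrative). KCL creates one processor instance per shard it owns and does not call processRecords again until the previous call returns, so state kept inside the processor is only touched by one thread unless you hand work off to your own threads:

```scala
import java.nio.charset.StandardCharsets
import scala.jdk.CollectionConverters._
import software.amazon.kinesis.lifecycle.events.{
  InitializationInput, LeaseLostInput, ProcessRecordsInput, ShardEndedInput, ShutdownRequestedInput
}
import software.amazon.kinesis.processor.ShardRecordProcessor

class MyRecordProcessor extends ShardRecordProcessor {
  // Per-shard state; only accessed from the single thread KCL uses for this processor.
  private var recentState: Map[String, String] = Map.empty

  override def initialize(input: InitializationInput): Unit = ()

  override def processRecords(input: ProcessRecordsInput): Unit = {
    input.records().asScala.foreach { record =>
      val data = StandardCharsets.UTF_8.decode(record.data()).toString
      recentState = recentState + (record.partitionKey() -> data)
    }
    // Checkpoint after the batch so KCL records progress in DynamoDB.
    input.checkpointer().checkpoint()
  }

  override def leaseLost(input: LeaseLostInput): Unit = ()
  override def shardEnded(input: ShardEndedInput): Unit = input.checkpointer().checkpoint()
  override def shutdownRequested(input: ShutdownRequestedInput): Unit = ()
}
```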

Which is a better scheduler: AWS Data Pipeline or AWS SWF?

I have a situation where I have to trigger my workflow based on this condition: "it has to process all files in S3 and then start again when there are files in S3". However, I found that Data Pipeline starts at every scheduled interval, while SWF starts and ends the job, which also shuts down my EMR cluster. Neither of them is suitable in this case. So, for a process which has to start or be triggered based on a condition, neither is suitable, is what I found. Is there any alternative? Or could one of SWF and Data Pipeline perform my task?
This is more of a corollary to @Chris's answer. You still make use of Lambda - listen to an S3 Put event trigger - so every time there is a new object being created, the Lambda function will be called.
The Lambda Function can pick up the S3 object's key and put it in SQS; you can run a separate Worker Process which can pick items from the Queue.
To reiterate your statement:
It has to process all files in S3 [can be done by Lambda]
and then start again when there are files in S3 [can be done by SQS & EC2]
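A minimal sketch of that Lambda function in Scala (assuming the aws-lambda-java-events S3Event type and the AWS SDK for Java v2 SQS client; the environment variable name is an assumption of this sketch):

```scala
import scala.jdk.CollectionConverters._
import com.amazonaws.services.lambda.runtime.{Context, RequestHandler}
import com.amazonaws.services.lambda.runtime.events.S3Event
import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.SendMessageRequest

// Triggered by the S3 Put event: push each new object key onto a work queue
// that a separate worker process (e.g. on EC2) drains to do the actual processing.
class S3ToSqsHandler extends RequestHandler[S3Event, String] {
  private val sqs = SqsClient.create()
  private val queueUrl = sys.env("WORK_QUEUE_URL") // assumed to be configured on the function

  override def handleRequest(event: S3Event, context: Context): String = {
    event.getRecords.asScala.foreach { rec =>
      // Note: keys arrive URL-encoded in S3 event notifications.
      val key = rec.getS3.getObject.getKey
      sqs.sendMessage(SendMessageRequest.builder().queueUrl(queueUrl).messageBody(key).build())
    }
    "ok"
  }
}
```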
Look at Lambda. You can set up a trigger so that your code is invoked each time a new object is uploaded to S3.
Data Pipeline supports the concept of preconditions, which can trigger your execution based on conditions. The S3KeyExists precondition seems like what you're looking for. This will begin the execution of your activity when a particular S3 key exists.
Data Pipeline will also manage the creation and termination of your resource (EC2 or EMR) based on the activity's execution. If you wish to use your own EC2 instance or EMR cluster you can look into worker groups. Worker group resources are managed by you and will not be terminated by the service.