Is there a way to create actors and add them to an existing Cluster Shard in Akka?
1) Create/start the Cluster Shard when the app starts
2) Create an actor for each API request
3) Add them to the existing shard
Thanks!
If you use Cluster Sharding, it will take care of the actor lifecycle for you. That is, you don't create an actor yourself; you ask the ShardRegion to give you an actor for an ID, and you will get one (placed in an existing shard). So yes, you could create a new ID on every API request and have the ShardRegion give you a (new) actor for it.
Cluster Sharding is described in some detail at http://doc.akka.io/docs/akka/snapshot/scala/cluster-sharding.html , which should clear things up a little.
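For illustration, here is a minimal sketch using the classic Cluster Sharding API; the Counter actor, the EntityEnvelope wrapper and the shard count of 100 are made up for the example, and cluster configuration (akka.actor.provider = cluster) is assumed:

import akka.actor.{Actor, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

final case class EntityEnvelope(id: Long, payload: Any)

class Counter extends Actor {
  // per-entity state and logic would live here
  def receive: Receive = { case _ => () }
}

val system = ActorSystem("ClusterSystem")

val region = ClusterSharding(system).start(
  typeName = "Counter",
  entityProps = Props[Counter](),
  settings = ClusterShardingSettings(system),
  // pull the entity ID and the payload out of an incoming message
  extractEntityId = { case EntityEnvelope(id, payload) => (id.toString, payload) },
  // map an entity ID onto one of (here) 100 shards
  extractShardId = { case EntityEnvelope(id, _) => (id % 100).toString })

// a fresh ID per API request yields a fresh actor, placed in an existing shard
region ! EntityEnvelope(42L, "hello")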
Related
I have a use case in which data for a million clients is processed, each client by its own actor. The actors are created across multiple nodes, which form a cluster.
When I receive a message, I have to send it to the particular actor in the cluster that is responsible for it. How can I map a message to that actor with an Akka cluster? I don't want it to be sent to other actors.
Is this use case achievable with Akka Cluster?
How will failure handling happen here?
I can't understand what a cluster singleton is; the docs say it will only be created on the oldest node. In my case I want each of the million actors to be a singleton for its client.
How is a particular actor in the cluster mapped to a message?
How can I create actors like this in a cluster?
Assuming that each actor is responsible for some uniquely identifiable part of state, it sounds like you want to use cluster sharding to balance the actors across the cluster, and route messages by ID.
Note that since the overhead of an actor is fairly small, it's not implausible to host millions of actors on a single node.
See the Akka JVM or Akka.Net docs.
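As a rough sketch of what that looks like with Akka Typed Cluster Sharding on the JVM (the Client entity, its Process message and the IDs are made-up names; cluster configuration is assumed):

import akka.actor.typed.{ActorSystem, Behavior}
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.sharding.typed.scaladsl.{ClusterSharding, Entity, EntityTypeKey}

object Client {
  sealed trait Command
  final case class Process(data: String) extends Command

  val TypeKey: EntityTypeKey[Command] = EntityTypeKey[Command]("Client")

  def apply(clientId: String): Behavior[Command] =
    Behaviors.receiveMessage { case Process(data) =>
      // all state and processing for this one client lives here;
      // no other actor receives messages for this clientId
      Behaviors.same
    }
}

val system = ActorSystem(Behaviors.empty[Any], "ClusterSystem")

// register the entity type once per node at startup
val sharding = ClusterSharding(system)
sharding.init(Entity(Client.TypeKey)(ctx => Client(ctx.entityId)))

// sharding routes this to the single entity responsible for "client-42",
// wherever in the cluster it lives, starting it on demand if necessary
sharding.entityRefFor(Client.TypeKey, "client-42") ! Client.Process("payload")

On failure handling: if a node dies, sharding recreates its entities on the surviving nodes; their in-memory state is lost unless you combine this with Akka Persistence. Cluster Singleton is a different tool (a single actor for the whole cluster) and isn't what you want for per-client actors.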
In a high-throughput transaction system, we route transactions to instances in an instance group based on some condition, to ensure that the transactions are processed one after the other. For example, there might be a routing rule that says transactions with 'cancel' in the data are routed to instance C while 'new' ones go to instance A. This is relevant for some business logic.
However, in the serverless world we cannot name an instance, because we don't know where it is running and how. How do we implement this kind of logic in such cases? Or does it go against the serverless paradigm?
You can publish the event messages to Pub/Sub with an ordering key. That way, the messages are delivered in order even if they aren't processed on the same instance.
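A minimal sketch with the Google Cloud Pub/Sub Java client (the project, topic and ordering-key names are made up); note that ordering must be explicitly enabled on the publisher for the key to have any effect:

import com.google.cloud.pubsub.v1.Publisher
import com.google.protobuf.ByteString
import com.google.pubsub.v1.{PubsubMessage, TopicName}

val topic = TopicName.of("my-project", "transactions")

// without this flag the ordering key is ignored by the publisher
val publisher = Publisher.newBuilder(topic)
  .setEnableMessageOrdering(true)
  .build()

val message = PubsubMessage.newBuilder()
  .setData(ByteString.copyFromUtf8("""{"type":"cancel","txId":"42"}"""))
  // all messages with the same ordering key are delivered in publish order
  .setOrderingKey("account-123")
  .build()

publisher.publish(message)

Messages sharing an ordering key are then delivered sequentially, while different keys can still be handled in parallel on different instances.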
I have to set up JBoss on AWS EC2 Windows servers, and this will scale up as per the requirements. We are using ELK for infrastructure monitoring, for which we will be installing Beats here; these will send the data to our on-prem Logstash, where we onboard the servers with their hostname and IP.
Now the problem is: in case of autoscaling, how can we achieve this?
Please advise.
Thanks,
Abhishek
If you create one EC2 instance and build an AMI from it for the autoscaling to launch from, the config can be part of the image.
If by onboarding you mean adding the server to an allowed list, you could use Direct Connect or a VPC with a custom CIDR block defined, and add that subnet to the allowed list up front.
AFAIK you need to change the Logstash config file on disk to add new hosts; it should notice the updated config automatically and "just work".
I would suggest a local script that can read/write the config file and that polls an SQS queue, "listening" for autoscaling events. You can have your ASG send SNS messages when it scales and then subscribe an SQS queue to receive them. Messages will be retained for up to 14 days, and there are options to add delays if required. The message you receive from SQS will indicate the region, instance-id and operation (launched or terminated), from which you can look up the IP address/hostname to add to or remove from the config file (and the message should be deleted from the queue when processed successfully). Editing the config file is just simple string operations to locate the right line and insert the new one. This approach only requires outbound HTTPS access for your local script to work, plus some IAM permissions, but there is a (probably trivial) cost implication.
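A rough sketch of such a script on the AWS SDK for Java v2, callable from Scala (the queue URL and the updateLogstashConfig helper are placeholders):

import software.amazon.awssdk.services.sqs.SqsClient
import software.amazon.awssdk.services.sqs.model.{DeleteMessageRequest, ReceiveMessageRequest}
import scala.jdk.CollectionConverters._

// hypothetical helper that rewrites the Logstash config on disk
def updateLogstashConfig(notificationJson: String): Unit = ???

val sqs = SqsClient.create()
val queueUrl = "https://sqs.eu-west-1.amazonaws.com/123456789012/asg-events" // placeholder

while (true) {
  val messages = sqs.receiveMessage(
    ReceiveMessageRequest.builder()
      .queueUrl(queueUrl)
      .waitTimeSeconds(20) // long polling, see below
      .build()).messages().asScala

  messages.foreach { msg =>
    // the body is the SNS-wrapped autoscaling notification (JSON) carrying
    // the region, instance-id and operation (launched or terminated)
    updateLogstashConfig(msg.body())

    // delete only after the config file was updated successfully
    sqs.deleteMessage(DeleteMessageRequest.builder()
      .queueUrl(queueUrl)
      .receiptHandle(msg.receiptHandle())
      .build())
  }
}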
Another option is a UserData script that's executed on each instance at startup (part of the Launch Template of your Auto Scaling group). Exactly how it might communicate with your on-prem setup depends on your architecture/capabilities - anything's possible. You could write a simple webservice to manage the config file and have the instances call it, but that's a lot more effort and somewhat risky in my opinion.
FYI - if you use SQS, look at Long Polling if you're checking the queues frequently or want the messages to propagate as quickly as possible (TL;DR - faster and cheaper than polling any more than twice a minute). It's good practice to use a dead-letter queue with SQS - messages that get retrieved but not removed from the queue end up there. Set up alarms on the queue and dead-letter queue to alert you via email if there are messages failing to be processed or not getting picked up in a sensible time (i.e. your script has crashed etc.).
I want to have a scheduler in my cluster that would send some messages after some time. From what I see, the scheduler is per actor system - and from my tests, only for the local actor system, not the cluster one. So if I schedule something on one node and that node goes down, all scheduled tasks are discarded.
If I create a Cluster Singleton responsible for scheduling, could the already created schedules survive recreation on some other node? Or should I keep it as a persistent actor holding the metadata of already created schedules, and in the preStart phase reschedule everything that was persisted?
A cluster singleton will reincarnate on another node if the node it was previously on is downed or leaves the cluster.
That reincarnation will start with a clean slate: it won't remember its "past lives".
However, if it's a persistent actor (or, equivalently, its behavior is an EventSourcedBehavior in Akka Typed), it will on startup recover its state from the event stream (and/or snapshots). For a persistent actor, this typically doesn't require anything to be done in preStart: the persistence implementation will take care of replaying the events.
Depending on how many tasks are scheduled and if you want the schedule to be discarded on a full cluster restart, it may be possible to use Akka Distributed Data to have the schedule metadata distributed around the cluster (with tuneable consistency) and then have a cluster singleton scheduling actor read that metadata.
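As a hedged sketch of the persistent-actor route with an EventSourcedBehavior (names are illustrative; the actual firing of the timers is elided):

import akka.actor.typed.Behavior
import akka.persistence.typed.{PersistenceId, RecoveryCompleted}
import akka.persistence.typed.scaladsl.{Effect, EventSourcedBehavior}

object PersistentScheduler {
  sealed trait Command
  final case class Schedule(taskId: String, delayMillis: Long) extends Command

  final case class TaskScheduled(taskId: String, delayMillis: Long) // event

  final case class State(tasks: Map[String, Long] = Map.empty)

  def apply(): Behavior[Command] =
    EventSourcedBehavior[Command, TaskScheduled, State](
      persistenceId = PersistenceId.ofUniqueId("scheduler-singleton"),
      emptyState = State(),
      commandHandler = (_, cmd) =>
        cmd match {
          case Schedule(id, delay) =>
            // persisting the event is what lets the schedule survive reincarnation
            Effect.persist(TaskScheduled(id, delay))
        },
      eventHandler = (state, evt) =>
        state.copy(tasks = state.tasks + (evt.taskId -> evt.delayMillis))
    ).receiveSignal {
      case (state, RecoveryCompleted) =>
        // events are replayed before this signal fires, so no manual preStart
        // logic is needed; re-arm an in-memory timer for each entry in state.tasks
        ()
    }
}

Running it via something like ClusterSingleton(system).init(SingletonActor(PersistentScheduler(), "scheduler")) then gives you exactly one live incarnation, which recovers its schedule wherever it reincarnates.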
All materials on Cluster Sharding with Akka imply sending messages from outside the cluster to entities in the cluster. However, can entities (actors) in different sharding regions/shards of the same cluster communicate between each other? Is there some sample code available for this? (on how we send a message from one entity to another within a cluster)
The short answer is "yes".
Let's elaborate:
You can view an EntityRef as an ActorRef that's known to be sharded, so what you need, in any case, is a mechanism to obtain that EntityRef. That mechanism is the ClusterSharding extension. So, using:
val sharding = ClusterSharding(system)
you obtain the sharding extension which you can then use:
val counterOne: EntityRef[Counter.Command] = sharding.entityRefFor(TypeKey, "counter-1")
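The same works from inside an entity, which is what entity-to-entity messaging comes down to; a sketch (the Counter behavior and its Increment/Mirror commands are made-up names):

import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.sharding.typed.scaladsl.{ClusterSharding, EntityTypeKey}

object Counter {
  sealed trait Command
  case object Increment extends Command
  final case class Mirror(otherId: String) extends Command

  val TypeKey: EntityTypeKey[Command] = EntityTypeKey[Command]("Counter")

  def apply(entityId: String): Behavior[Command] =
    Behaviors.setup { ctx =>
      // the extension is just as available inside an entity,
      // so one entity can look up and message another
      val sharding = ClusterSharding(ctx.system)

      Behaviors.receiveMessage {
        case Increment =>
          Behaviors.same
        case Mirror(otherId) =>
          // goes to an entity that may live in a different shard, on a different node
          sharding.entityRefFor(TypeKey, otherId) ! Increment
          Behaviors.same
      }
    }
}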