Akka PersistentActor with RoundRobinPool

I am trying to implement event sourcing using Akka persistent actors. The receiver actors are persistent: they persist each message before processing it. I have a round-robin pool of these persistent receiver actors. Since the persistence ID is the same for every actor in the pool, how should recovery be handled? More generally, I want to understand the correct way of using persistence with a pool of actors.
I was thinking of using the property 'akka.persistence.max-concurrent-recoveries = 1'.
NOTE: I am using Java.

According to docs:
Note persistenceId must be unique to a given entity in the journal
(database table/keyspace). When replaying messages persisted to the
journal, you query messages with a persistenceId. So, if two different
entities share the same persistenceId, message-replaying behavior is
corrupted.
It seems that you need Akka Cluster Sharding, with a unique persistenceId for every entity actor.
Also see:
Can I Read/Write from separate actors with same PersistenceId?
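To make the quoted note concrete, here is a minimal plain-Java sketch of a journal keyed by persistenceId. It is only a model, not the Akka Persistence API: the class JournalSketch, its methods, and the "pool name plus index" ID scheme in routeeId are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a journal keyed by persistenceId. Not the Akka Persistence API.
final class JournalSketch {
    private final Map<String, List<String>> eventsById = new HashMap<>();

    // Appends an event to the stream identified by persistenceId.
    void persist(String persistenceId, String event) {
        eventsById.computeIfAbsent(persistenceId, id -> new ArrayList<>()).add(event);
    }

    // Replays every event stored under persistenceId, in write order.
    List<String> replay(String persistenceId) {
        return new ArrayList<>(eventsById.getOrDefault(persistenceId, Collections.emptyList()));
    }

    // Hypothetical naming scheme: one stable ID per pool member.
    static String routeeId(String poolName, int index) {
        return poolName + "-" + index;
    }
}
```

If two routees both persist under "receiver", a replay of "receiver" returns the events of both, so neither routee can rebuild its own state. With routeeId("receiver", 0) and routeeId("receiver", 1) each replay only returns that routee's events. Note that unique IDs alone do not decide which routee a message lands on; a round-robin router still spreads messages regardless of entity, which is why the answer points to sharding.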

Related

Akka Durable State restore

I have an existing Akka Typed application, and am considering adding in support for persistent actors, using the Durable State feature. I am not currently using cluster sharding, but plan to implement that sometime in the future (after implementing Durable State).
I have read the documentation on how to implement Durable State to persist the actor's state, and that all makes sense. However, there does not appear to be any information in that document about how/when an actor's state gets recovered, and I'm not quite clear as to what I would need to do to recover persisted actors when the entire service is restarted.
My current architecture consists of an HTTP service (using AkkaHTTP), a "dispatcher" actor (which is the ActorSystem's guardian actor, and currently a singleton), and N number of "worker" actors, which are children of the dispatcher. Both the dispatcher actor and the worker actors are stateful.
The dispatcher actor's state contains a map of requestId->ActorRef. When a new job request comes in from the HTTP service, the dispatcher actor creates a worker actor, and stores its reference in the map. Future requests for the same requestId (i.e. status and result queries) are forwarded by the dispatcher to the appropriate worker actor.
Currently, if the entire service is restarted, the dispatcher actor is recreated as a blank slate, with an empty worker map. None of the worker actors exist anymore, and their status/results can no longer be retrieved.
What I want to accomplish when the service is restarted is that the dispatcher gets recreated with its last-persisted state. All of the worker actors that were in the dispatcher's worker map should get restored with their last-persisted states as well. I'm not sure how much of this is automatic, simply by refactoring my actors as persistent actors using Durable State, and what I need to do explicitly.
Questions:
Upon restart, if I create the dispatcher (guardian) actor with the same name, is that sufficient for Akka to know to restore its persisted state, or is there something more explicit that I need to do to tell it to do that?
Since persistent actors require the state to be serializable, will this work with the fact that the dispatcher's worker map references the workers by ActorRef? Are those serializable, or do I need to switch it to referencing them by name?
If I leave the references to the worker actors as ActorRefs, and the service is restarted, will those ActorRefs (that were restored as part of the dispatcher's persisted state) continue to work, and will the worker actors' persisted states be automatically restored? Or, again, do I need to do something explicit to tell it to revive those actors and restore their states.
Currently, since all of the worker actors are not persisted, I assume that their states are all held in memory. Is that true? I currently keep all workers around indefinitely so that the results of their work (which is part of their state) can be retrieved in the future. However, I'm worried about running out of memory on the server. I'd like to have workers that are done with their work be able to be persisted to disk only, kind of "putting them to sleep", so that the results of their work can be retrieved in the future, without taking up memory, days or weeks later. I'd like to have control over when an actor is "in memory", and when it's "on disk only". Can this Durable State persistence serve as a mechanism for this? If so, can I kill an actor, and then revive it on demand (and restore its state) when I need it?

The durable state is stored keyed by an akka.persistence.typed.PersistenceId. There's no necessary relationship between the actor's name and its persistence ID.
ActorRefs are serializable: the included Jackson serializers (CBOR or JSON) handle them out of the box, and a custom serializer will need to use the ActorRefResolver. In the persistence case, though, this isn't necessarily that useful, because there's no guarantee that the actor pointed to by the ref is still there (consider, for instance, the JVM hosting that actor system having stopped between when the state was saved and when it was read back).
Non-persistent actors keep all their state in memory until they're stopped, assuming they're not themselves directly interacting with some persistent data store. There's nothing stopping you from having an actor that reads state on startup from somewhere else (possibly stashing incoming commands until that read completes) and writes state changes; that's basically all durable state is under the hood. Stopping an actor to free its memory is typically called "passivation": in typed you typically have a Passivate command in the actor's protocol. Bringing it back is then often called "rehydration". Both event-sourced and durable-state persistence are very useful for implementing this.
Note that it's absolutely possible to run a single-node Akka Cluster and have sharding. Sharding brings a notion of an "entity", which has a string name and is conceptually immortal/eternal (unlike an actor, which has a defined birth-to-death lifecycle). Sharding then has a given entity be incarnated by at most one actor at any given time in a cluster (I'm ignoring the multiple-datacenter case: if multiple datacenters are in use, you're probably going to want event sourced persistence). Once you have an EntityRef from sharding, the EntityRef will refer to whatever the current incarnation is: if a message is sent to the EntityRef and there's no living incarnation, a new incarnation is spawned. If the behavior for that TypeKey which was provided to sharding is a persistent behavior, then the persisted state will be recovered. Sharding can also implement passivation directly (with a few out-of-the-box strategies supported).
You can implement similar functionality yourself (for situations where there aren't many children of the dispatcher, a simple map in the dispatcher and asks/watches will work).
The Akka Platform Guide tutorial works an example using cluster sharding and persistence (in this case, it's event sourced, but the durable state APIs are basically the same, especially if you ignore the CQRS bits).
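As a rough illustration of the "simple map in the dispatcher" approach with passivation and rehydration, here is a plain-Java sketch. It deliberately has no Akka dependency: WorkerRegistrySketch, Worker, the "worker|" ID prefix, and the map standing in for the durable state store are all assumptions made for the example, and the dispatcher keys workers by requestId rather than by ActorRef.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of a dispatcher that passivates and rehydrates workers.
final class WorkerRegistrySketch {
    static final class Worker {
        final String persistenceId;
        String result; // null until the job has finished

        Worker(String persistenceId, String result) {
            this.persistenceId = persistenceId;
            this.result = result;
        }
    }

    // Stand-in for the durable state store, keyed by persistence ID.
    private final Map<String, String> durableStore;
    // Workers that are currently "in memory", keyed by requestId.
    private final Map<String, Worker> live = new HashMap<>();

    WorkerRegistrySketch(Map<String, String> durableStore) {
        this.durableStore = durableStore;
    }

    // Hypothetical scheme: the persistence ID is derived from the requestId,
    // not from the actor's name or its ActorRef.
    static String persistenceIdFor(String requestId) {
        return "worker|" + requestId;
    }

    // Returns the live worker, rehydrating it from the store on demand.
    Worker workerFor(String requestId) {
        return live.computeIfAbsent(requestId, id -> {
            String persistenceId = persistenceIdFor(id);
            return new Worker(persistenceId, durableStore.get(persistenceId));
        });
    }

    // Persists the result first, then updates the in-memory state.
    void complete(String requestId, String result) {
        Worker worker = workerFor(requestId);
        durableStore.put(worker.persistenceId, result);
        worker.result = result;
    }

    // Drops the in-memory worker; its state remains in the store.
    void passivate(String requestId) {
        live.remove(requestId);
    }

    boolean isLive(String requestId) {
        return live.containsKey(requestId);
    }
}
```

A service restart corresponds to building a new WorkerRegistrySketch over the same store: the live map starts empty, and each worker comes back with its last persisted result the first time it is asked for.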

Is there a way to achieve service downgrade in akka cluster sharding?

I'm trying to build up an Akka cluster ShardRegion that might need to be downgraded in the production environment when a bug occurs. However, instead of unregistering it by calling
ClusterClientReceptionist.get(nodeActorSystem).unregisterService(shardRegion)
which will terminate the ShardRegion and its child actors after all messages are consumed before PoisonPill, my sharding child actors have their internal state and purposes that need to be accomplished. I need an elegant way to slowly downgrade the process with the ShardRegion to let any session in-between finish, e.g. any new message with a different EntityId will be sent elsewhere.
I haven't yet found any means to downgrade it, or to simply stop any new sharded actors from popping up on the ShardRegion. Is this even achievable in an Akka Cluster ShardRegion?

You can accomplish part of this by specifying a custom stopMessage. The shard region will send this command to the entity actors when they are to be passivated or rebalanced. The default is PoisonPill, but a custom one allows the entity actors to do whatever they need to do to shut down (they do need to eventually stop themselves in this scenario).
If you're triggering a rebalance, the messages to the shard will be buffered until all the active entities in that shard have stopped, which may qualify as "any new message with a different entity ID will be sent elsewhere". Note that messages which are being sent outside of cluster sharding (i.e. directly between entity actors) will still be delivered normally (until said entity actors stop).
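What "do whatever they need to do to shut down" could look like is sketched below as a plain-Java state machine. It does not use Akka; DrainingEntitySketch and its three message names are hypothetical, with GRACEFUL_STOP playing the role of the custom stop message and the open-session counter standing in for whatever internal work the entity must finish.

```java
// Plain-Java model of an entity reacting to a custom stop message.
final class DrainingEntitySketch {
    // GRACEFUL_STOP is a hypothetical custom stop message.
    enum Message { START_SESSION, FINISH_SESSION, GRACEFUL_STOP }

    private int openSessions = 0;
    private boolean draining = false;
    private boolean stopped = false;

    // Returns true if the message was accepted, false if it was rejected.
    boolean receive(Message message) {
        if (stopped) {
            return false;
        }
        switch (message) {
            case START_SESSION:
                if (draining) {
                    return false; // new work has to go elsewhere
                }
                openSessions++;
                return true;
            case FINISH_SESSION:
                if (openSessions == 0) {
                    return false;
                }
                openSessions--;
                stopIfDrained();
                return true;
            case GRACEFUL_STOP:
                draining = true;
                stopIfDrained();
                return true;
            default:
                return false;
        }
    }

    // The entity stops itself once it is draining and has no open sessions.
    private void stopIfDrained() {
        if (draining && openSessions == 0) {
            stopped = true;
        }
    }

    boolean isStopped() {
        return stopped;
    }
}
```

The point of the design is that the stop message only flips the entity into a draining mode; the entity itself decides when it is done and stops, which matches the requirement that entity actors eventually stop themselves.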

Akka Typed ActorSelection/Receptionist

I have a question about ActorSelection/Receptionist in the new Akka Typed.
Before Akka Typed I didn't use ActorSelection, because I read somewhere that it was not performant, so I kept references to the actors in HashMaps. Now that I am reading the Akka Typed documentation, I see that another mechanism, the Receptionist, exists. So my question is: does it suffer from the same problems as ActorSelection, so that I should stick to my old pattern of keeping references to actors in a HashMap, or is the Receptionist now the way to go?
My specific scenario: my actor spawns several child actors, and if the parent actor is passivated or restored via Akka Persistence, it should find the references to these child actors again.
So what do you think, would I experience performance problems if I convert to the Receptionist?
Thx for answers...

ActorSelection resolves the actor path to an ActorRef each time you use it, which is somewhat costly if used for a high-throughput actor. The upside is that if the actor is stopped and a new actor is later started at the same path, the ActorSelection will deliver messages to the new actor, whereas an ActorRef points specifically to the actor instance that is now stopped, so messages end up in dead letters.
The receptionist is quite different and is more like a registry of actors that you can subscribe to. When the set of ActorRefs registered for a key changes you get an update message with the new set, there is no extra overhead per message sent, you are dealing directly with the ActorRefs of the recipients.
Note that you can use the GroupRouter for delivery to actors registered with the receptionist and to avoid having to implement the subscription part in your actor.
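The difference in cost is easier to see in a small model of the "registry you subscribe to" idea. The sketch below is plain Java and not the Akka Receptionist API: RegistrySketch is a hypothetical class, and plain strings stand in for service keys and ActorRefs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Plain-Java model of a subscribe-based registry. Strings stand in for ActorRefs.
final class RegistrySketch {
    private final Map<String, Set<String>> refsByKey = new HashMap<>();
    private final Map<String, List<Consumer<Set<String>>>> subscribers = new HashMap<>();

    // The subscriber gets the current listing right away, then one update per change.
    void subscribe(String key, Consumer<Set<String>> subscriber) {
        subscribers.computeIfAbsent(key, k -> new ArrayList<>()).add(subscriber);
        subscriber.accept(listing(key));
    }

    void register(String key, String ref) {
        if (refsByKey.computeIfAbsent(key, k -> new HashSet<>()).add(ref)) {
            publish(key);
        }
    }

    void deregister(String key, String ref) {
        Set<String> refs = refsByKey.get(key);
        if (refs != null && refs.remove(ref)) {
            publish(key);
        }
    }

    // Defensive copy, so subscribers never see later mutations.
    private Set<String> listing(String key) {
        return new HashSet<>(refsByKey.getOrDefault(key, Collections.emptySet()));
    }

    private void publish(String key) {
        Set<String> current = listing(key);
        for (Consumer<Set<String>> subscriber : subscribers.getOrDefault(key, Collections.emptyList())) {
            subscriber.accept(current);
        }
    }
}
```

A subscriber caches the latest set it was handed and sends to those references directly, so the lookup cost is paid only when the set changes, not on every message.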

How to externalize akka sharded actor state to redis or ignite?

I am very new to Akka clustering and am working on a proof of concept. In my case I have an actor running on a cluster, and the actor has state held as a Map[String,Any]. For any request the actor receives, it creates a new entity actor and the data map based on the incoming message. The problem is that the map is in memory right now. Is it possible to store the sharded actor state somewhere else, such as Redis or Ignite?

You should probably start by having a look at akka-persistence (the persistence module included in akka). The snapshotting part is meant to persist the state directly, but you have to start with the command/event-sourcing part, the snapshotting part being an optional enhancement.
Then you can combine this with automatic passivation of your sharded actors after a certain inactivity timeout.
With the above, you'll have a solution that persists the state of your actors in an external storage system to free up memory, restoring your actor's state whenever they come back to life.
The last step would be to see which storage backends are available for akka-persistence and match your requirements; you can implement your own, of course.
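To show how events, an optional snapshot, and passivation fit together for a map-shaped state, here is a plain-Java sketch. It is not the akka-persistence API: EventSourcedMapSketch, the Put event, and the in-memory journal and snapshot fields (which stand in for the external store) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of an event-sourced map with an optional snapshot.
final class EventSourcedMapSketch {
    static final class Put {
        final String key;
        final Object value;

        Put(String key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    // These three fields stand in for the external store.
    private final List<Put> journal = new ArrayList<>();
    private Map<String, Object> snapshot = new HashMap<>();
    private int snapshotSeqNr = 0; // number of journal events the snapshot covers

    // In-memory state; null while the entity is passivated.
    private Map<String, Object> state = new HashMap<>();

    void put(String key, Object value) {
        Map<String, Object> current = state(); // recover first if passivated
        Put event = new Put(key, value);
        journal.add(event);                    // persist the event...
        current.put(event.key, event.value);   // ...then apply it to the state
    }

    Object get(String key) {
        return state().get(key);
    }

    // Optional optimisation: recovery replays only events after the snapshot.
    void saveSnapshot() {
        snapshot = new HashMap<>(state());
        snapshotSeqNr = journal.size();
    }

    // Frees the in-memory state; everything needed for recovery is stored.
    void passivate() {
        state = null;
    }

    boolean isInMemory() {
        return state != null;
    }

    // Recovery: start from the snapshot, then replay the later events.
    private Map<String, Object> state() {
        if (state == null) {
            Map<String, Object> recovered = new HashMap<>(snapshot);
            for (Put event : journal.subList(snapshotSeqNr, journal.size())) {
                recovered.put(event.key, event.value);
            }
            state = recovered;
        }
        return state;
    }
}
```

Without any call to saveSnapshot the sketch still recovers correctly by replaying the whole journal, which mirrors the point above that snapshotting is an enhancement on top of the event-sourcing part.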

How to manage Akka Actor's paths in distributed system?

Suppose I have the following two Actors:
Store
Product
Every Store can have multiple Products and I want to dynamically split the Store into StoreA and StoreB on high traffic on multiple machines. The splitting of Store will also split the Products evenly between StoreA and StoreB.
My question is: what are the best practices for knowing where to send all future BuyProduct requests (StoreA or StoreB) after the split? The reason I'm asking is that if a request to buy ProductA is received, I want to send it to the right store, which already has that Product's state in memory.
Solution: The only solution I can think of is to store the path of each Product, as a Map[productId:Long, storePath:String], in a ProductsPathActor every time a new Product is created. For every BuyProduct request I would query the ProductsPathActor, which returns the correct Store's path, and then send the BuyProduct request to that Store.
Is there another way of managing this in Akka, or is my solution correct?

One good way to do this is with Akka Cluster Sharding. From the docs:
Cluster sharding is useful when you need to distribute actors across
several nodes in the cluster and want to be able to interact with them
using their logical identifier, but without having to care about their
physical location in the cluster, which might also change over time.
There is an Activator Template that demonstrates it here.
Applied to your problem, StoreA and StoreB would each be a ShardRegion, mapping 1:1 to your cluster nodes. The ShardCoordinator manages distribution between these nodes and acts as the conduit between regions.
For its part, your Request Handler talks to a ShardRegion, which routes the message if necessary in conjunction with the coordinator. Presumably, there is a JVM-local ShardRegion for each Request Handler to talk to, but there's no reason that it could not be a remote actor.
When there is a change in the number of nodes, the ShardCoordinator needs to move the shards (i.e. the collections of entities) that were managed by a ShardRegion that is going to shut down, in a process called "rebalancing". During that period, the entities within those shards are unavailable, but the messages to those entities will be buffered until they are available again. Here, "being available" means that the new ShardRegion responds to a directed message for that entity.
It's up to you to bring that entity back to life on the new node. Akka Persistence makes this very easy, but requires you to use the Event Sourcing pattern in the process. This isn't a bad thing, as it can lead to web-scale performance much more easily. This is especially true when the database in use is something like Apache Cassandra. You will see that entities are "passivated", which is essentially just caching off to disk so they can be restored on request, and Akka Persistence works with that passivation to transparently restore the entities under the control of the new ShardRegion – essentially a "move".
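The routing idea behind this can be sketched in plain Java: an entity ID always maps to the same shard, and only the shard-to-region assignment changes when nodes come and go. ShardingSketch is a hypothetical class, not Akka's API, and the round-robin assignment in allocate is a simplification chosen for the example rather than the strategy Akka actually uses.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of entity -> shard -> region routing.
final class ShardingSketch {
    private final int numberOfShards;
    private final Map<Integer, String> regionByShard = new HashMap<>();

    ShardingSketch(int numberOfShards, List<String> regions) {
        this.numberOfShards = numberOfShards;
        allocate(regions);
    }

    // The shard of an entity depends only on its ID, so it never changes.
    int shardOf(String entityId) {
        return Math.floorMod(entityId.hashCode(), numberOfShards);
    }

    // Callers address an entity by ID; the current location is looked up.
    String regionOf(String entityId) {
        return regionByShard.get(shardOf(entityId));
    }

    // Stand-in for rebalancing: reassign shards across the current regions.
    // Round-robin is an assumption made for this sketch.
    void allocate(List<String> regions) {
        for (int shard = 0; shard < numberOfShards; shard++) {
            regionByShard.put(shard, regions.get(shard % regions.size()));
        }
    }
}
```

In terms of the question, a BuyProduct request only needs the product ID: regionOf(productId) answers "StoreA or StoreB" without a separate ProductsPathActor, and after a split only the assignment table changes while the product-to-shard mapping stays fixed.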
