Akka Daemon Services

Most beginner Akka examples seem to advocate calling the actor system's stop() and shutdown() methods, like so:
import akka.actor.ActorSystem

object Main extends App {
  // create the ActorSystem
  val system = ActorSystem("HelloSystem")
  // put your actors to work here ...
  // shut down the ActorSystem when the work is finished
  // (system.stop(ref) stops an individual actor; shutdown() stops the whole system)
  system.shutdown()
}
However, what if your Akka app is meant to be a long-running service that should (conceivably) live forever? Meaning it starts, the actor system is created, and actors simply idle until work (perhaps coming in from connected clients, etc.) needs to be done?
Is it OK to just initialize/start the actor system and leave it be (that is, omit invoking stop and shutdown altogether)? Why/why not?

Yes, it is OK. This is similar to the way AkkaHTTP is implemented: in AkkaHTTP, you start actors which open a socket and then wait for requests.
One possible issue comes to mind: if you need some short-lived actors (inside your long-running service) to process a single request, you should stop them once they are no longer needed (to free resources), especially if the actors are stateful.
I wrote a blog post about that issue: https://mikulskibartosz.name/always-stop-unused-akka-actors-a2ceeb1ed41
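For illustration, here is a minimal sketch of that shape of service (classic actors; Dispatcher, Worker, and the string messages are made-up names): a long-lived dispatcher that spawns a short-lived child per request and never shuts the system down.

import akka.actor.{Actor, ActorSystem, Props}

// Hypothetical per-request worker: handles one request, replies, then stops itself
class Worker extends Actor {
  def receive = {
    case request: String =>
      sender() ! s"processed: $request"
      context.stop(self) // free the worker's resources as soon as the work is done
  }
}

// Long-lived dispatcher: idles until work arrives, lives for the lifetime of the service
class Dispatcher extends Actor {
  def receive = {
    case request: String =>
      context.actorOf(Props[Worker]).forward(request) // one short-lived child per request
  }
}

object Main extends App {
  val system = ActorSystem("HelloSystem")
  system.actorOf(Props[Dispatcher], "dispatcher")
  // no stop/shutdown here: Akka's threads are non-daemon by default,
  // so the JVM keeps running until the service is terminated externally
}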

Related

Akka Durable State restore

I have an existing Akka Typed application, and am considering adding in support for persistent actors, using the Durable State feature. I am not currently using cluster sharding, but plan to implement that sometime in the future (after implementing Durable State).
I have read the documentation on how to implement Durable State to persist the actor's state, and that all makes sense. However, there does not appear to be any information in that document about how/when an actor's state gets recovered, and I'm not quite clear as to what I would need to do to recover persisted actors when the entire service is restarted.
My current architecture consists of an HTTP service (using AkkaHTTP), a "dispatcher" actor (which is the ActorSystem's guardian actor, and currently a singleton), and N number of "worker" actors, which are children of the dispatcher. Both the dispatcher actor and the worker actors are stateful.
The dispatcher actor's state contains a map of requestId->ActorRef. When a new job request comes in from the HTTP service, the dispatcher actor creates a worker actor, and stores its reference in the map. Future requests for the same requestId (i.e. status and result queries) are forwarded by the dispatcher to the appropriate worker actor.
Currently, if the entire service is restarted, the dispatcher actor is recreated as a blank slate, with an empty worker map. None of the worker actors exist anymore, and their status/results can no longer be retrieved.
What I want to accomplish when the service is restarted is that the dispatcher gets recreated with its last-persisted state. All of the worker actors that were in the dispatcher's worker map should get restored with their last-persisted states as well. I'm not sure how much of this is automatic, simply by refactoring my actors as persistent actors using Durable State, and what I need to do explicitly.
Questions:
Upon restart, if I create the dispatcher (guardian) actor with the same name, is that sufficient for Akka to know to restore its persisted state, or is there something more explicit that I need to do to tell it to do that?
Since persistent actors require the state to be serializable, will this work with the fact that the dispatcher's worker map references the workers by ActorRef? Are those serializable, or do I need to switch it to referencing them by name?
If I leave the references to the worker actors as ActorRefs, and the service is restarted, will those ActorRefs (that were restored as part of the dispatcher's persisted state) continue to work, and will the worker actors' persisted states be automatically restored? Or, again, do I need to do something explicit to tell it to revive those actors and restore their states?
Currently, since all of the worker actors are not persisted, I assume that their states are all held in memory. Is that true? I currently keep all workers around indefinitely so that the results of their work (which is part of their state) can be retrieved in the future. However, I'm worried about running out of memory on the server. I'd like to have workers that are done with their work be able to be persisted to disk only, kind of "putting them to sleep", so that the results of their work can be retrieved in the future, without taking up memory, days or weeks later. I'd like to have control over when an actor is "in memory", and when it's "on disk only". Can this Durable State persistence serve as a mechanism for this? If so, can I kill an actor, and then revive it on demand (and restore its state) when I need it?
The durable state is stored keyed by an akka.persistence.typed.PersistenceId. There's no necessary relationship between the actor's name and its persistence ID.
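As a rough sketch (Worker, its protocol, and its state are made-up names here), a durable-state worker keyed by the request ID might look like this:

import akka.actor.typed.Behavior
import akka.persistence.typed.PersistenceId
import akka.persistence.typed.state.scaladsl.{DurableStateBehavior, Effect}

object Worker {
  sealed trait Command
  final case class SetResult(value: String) extends Command

  final case class State(result: Option[String] = None)

  def apply(requestId: String): Behavior[Command] =
    DurableStateBehavior[Command, State](
      // the state is keyed by this PersistenceId, not by the actor's name
      persistenceId = PersistenceId("Worker", requestId),
      emptyState = State(),
      commandHandler = (state, command) =>
        command match {
          case SetResult(value) => Effect.persist(State(Some(value)))
        })
}

Whenever a behavior with the same PersistenceId is spawned again (including after a full service restart), the last persisted state is recovered before it starts handling commands; the actor's name doesn't matter.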
ActorRefs are serializable (the included Jackson serializations (CBOR or JSON) do it out of the box; if using a custom serializer, you will need to use the ActorRefResolver), though in the persistence case, this isn't necessarily that useful: there's no guarantee that the actor pointed to by the ref is still there (consider, for instance, if the JVM hosting that actor system has stopped between when the state was saved and when it was read back).
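If you do end up serializing refs yourself (the custom-serializer case), the round trip via ActorRefResolver is roughly the following sketch (reusing the hypothetical Worker.Command protocol from above):

import akka.actor.typed.{ActorRef, ActorSystem}
import akka.actor.typed.scaladsl.ActorRefResolver

def toWire(ref: ActorRef[Worker.Command])(implicit system: ActorSystem[_]): String =
  ActorRefResolver(system).toSerializationFormat(ref)

def fromWire(serialized: String)(implicit system: ActorSystem[_]): ActorRef[Worker.Command] =
  ActorRefResolver(system).resolveActorRef[Worker.Command](serialized)

But, as noted above, a ref deserialized after a restart may well point at an actor that no longer exists.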
Non-persistent actors keep all their state in memory until they're stopped (assuming they're not themselves directly interacting with some persistent data store: there's nothing stopping you from having an actor that reads its state on startup from somewhere else, possibly stashing incoming commands until that read completes, and writes out its state changes... that's basically all durable state is under the hood). Stopping an actor to free its memory is typically called "passivation": in typed, you typically have a Passivate command in the actor's protocol. Bringing it back later is often called "rehydration". Both event-sourced and durable-state persistence are very useful for implementing this.
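A hand-rolled version of that might look like the following sketch (PassivatingWorker and its messages are made-up names; the dispatcher would send Passivate once the worker's result has been persisted and retrieved):

import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

object PassivatingWorker {
  sealed trait Command
  final case class DoWork(payload: String) extends Command
  case object Passivate extends Command

  def apply(): Behavior[Command] =
    Behaviors.receiveMessage {
      case DoWork(payload) =>
        // ... do the work, keeping state in memory (or persisting it) ...
        Behaviors.same
      case Passivate =>
        // stop the incarnation to free memory; only persisted state survives
        Behaviors.stopped
    }
}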
Note that it's absolutely possible to run a single-node Akka Cluster and have sharding. Sharding brings a notion of an "entity", which has a string name and is conceptually immortal/eternal (unlike an actor, which has a defined birth-to-death lifecycle). Sharding then has a given entity be incarnated by at most one actor at any given time in a cluster (I'm ignoring the multiple-datacenter case: if multiple datacenters are in use, you're probably going to want event sourced persistence). Once you have an EntityRef from sharding, the EntityRef will refer to whatever the current incarnation is: if a message is sent to the EntityRef and there's no living incarnation, a new incarnation is spawned. If the behavior for that TypeKey which was provided to sharding is a persistent behavior, then the persisted state will be recovered. Sharding can also implement passivation directly (with a few out-of-the-box strategies supported).
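Wiring that up looks roughly like this (reusing the hypothetical Worker durable-state behavior from above; a single-node cluster is fine):

import akka.actor.typed.ActorSystem
import akka.cluster.sharding.typed.scaladsl.{ClusterSharding, Entity, EntityTypeKey}

object WorkerSharding {
  val TypeKey: EntityTypeKey[Worker.Command] = EntityTypeKey[Worker.Command]("Worker")

  def init(system: ActorSystem[_]): Unit =
    ClusterSharding(system).init(Entity(TypeKey) { entityContext =>
      Worker(entityContext.entityId) // the entity id doubles as the requestId
    })

  // No ActorRef needs to be stored or persisted by the dispatcher: if no
  // incarnation is alive, messaging the EntityRef spawns one, and a persistent
  // behavior recovers its durable state before handling the message.
  def send(system: ActorSystem[_], requestId: String, msg: Worker.Command): Unit =
    ClusterSharding(system).entityRefFor(TypeKey, requestId) ! msg
}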
You can implement similar functionality yourself (for situations where there aren't many children of the dispatcher, a simple map in the dispatcher and asks/watches will work).
The Akka Platform Guide tutorial works an example using cluster sharding and persistence (in this case, it's event sourced, but the durable state APIs are basically the same, especially if you ignore the CQRS bits).

akka persistence, resume on failure, at least once semantics

I have a small mess in my head.
Avoid having a poisoned mailbox (http://doc.akka.io/docs/akka/2.4.2/general/supervision.html):
The new actor then resumes processing its mailbox, meaning that the restart is not visible outside of the actor itself, with the notable exception that the message during which the failure occurred is not re-processed.
My case: an actor receives a "command" to run something. The actor tries to reach a remote service. The service is unavailable and an exception is thrown. I want the actor to keep trying to contact the remote service; I don't want the actor to skip the input command which caused the exception. Will Resume help me force the actor to keep going?
override val supervisorStrategy: SupervisorStrategy =
  OneForOneStrategy(maxNrOfRetries = -1, withinTimeRange = 5.minutes) {
    case _: RemoteServiceIsDownException => Resume
    case _                               => Stop
  }
By Resume, I mean retry the invocation that caused the exception to be thrown. I suspect that Akka's Resume means keep the actor instance, but do not retry the failed invocation.
Does Akka persistence mean durable mailboxes?
Extending the first case: the actor tries to reach the remote service, but now the actor is persistent. The SupervisorStrategy forces the actor to keep contacting the remote service. Then the whole JVM shuts down and the Akka app is restarted. Will the actor resume from the point where it was desperately trying to reach the remote service?
Does Akka persistence mean at-least-once semantics?
An actor receives a message, and then the JVM crashes. Will the actor re-receive the message it was processing during the crash?
Expanding my comment:
Will Resume help me force the actor to keep going? ... By Resume, I mean retry the invocation that caused the exception to be thrown. I suspect that Akka's Resume means keep the actor instance, but do not retry the failed invocation
No, I do not believe so. The Resume directive will keep the actor chugging along after your message-processing failure. One way to retry the message, though, is to use Restart and take advantage of an Actor's preRestart hook:
override def preRestart(reason: Throwable, msgBeforeFailure: Option[Any]): Unit = {
  reason match {
    case _: RemoteServiceIsDownException if msgBeforeFailure.isDefined =>
      // re-enqueue the message that triggered the failure so it is retried after the restart
      self ! msgBeforeFailure.get
    case _ => // nothing to retry
  }
  super.preRestart(reason, msgBeforeFailure) // keep the default restart behavior
}
When the actor crashes, it will run this hook and offer you an opportunity to handle the message that caused it to fail.
Does Akka persistence mean durable mailboxes?
Not necessarily. Using a persistent actor just means that the actor's domain events, and consequently its internal state, are durable, not the mailbox it uses to process messages. That being said, it is pretty easy to approximate a durable mailbox; see below.
Does Akka persistence mean at-least-once semantics?
Again, not necessarily, but the toolkit does have a trait called AtLeastOnceDelivery to let you achieve this (and a durable-mailbox-like behavior)!
See http://doc.akka.io/docs/akka/current/scala/persistence.html#At-Least-Once_Delivery
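A rough sketch of that pattern, following the linked docs (Envelope, Confirm, and the event classes are made-up names; destination would point at your remote-service gateway actor):

import akka.actor.ActorPath
import akka.persistence.{AtLeastOnceDelivery, PersistentActor}

// hypothetical wire messages
final case class Envelope(deliveryId: Long, payload: String) // sent to the destination
final case class Confirm(deliveryId: Long)                   // ack sent back by the destination

// hypothetical persisted events
final case class MsgSent(payload: String)
final case class MsgConfirmed(deliveryId: Long)

class ReliableSender(destination: ActorPath) extends PersistentActor with AtLeastOnceDelivery {
  override def persistenceId: String = "reliable-sender"

  override def receiveCommand: Receive = {
    case payload: String     => persist(MsgSent(payload))(updateState)
    case Confirm(deliveryId) => persist(MsgConfirmed(deliveryId))(updateState)
  }

  // after a JVM crash the events are replayed, and unconfirmed deliveries are retried
  override def receiveRecover: Receive = {
    case event: MsgSent      => updateState(event)
    case event: MsgConfirmed => updateState(event)
  }

  private def updateState(event: Any): Unit = event match {
    case MsgSent(payload)         => deliver(destination)(deliveryId => Envelope(deliveryId, payload))
    case MsgConfirmed(deliveryId) => confirmDelivery(deliveryId)
  }
}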

Akka actor read state from database

I would like to ask about a pattern for when an actor's state has to be initialized from a database. I have a DAO which returns a Future[...], and normally, to stay non-blocking, a message would be sent to the actor when the future completes. But this case is different: I cannot process any message from the actor's mailbox before initialization is complete. Is the only way to block the actor's thread while waiting for the database future to complete?
The most trivial approach would be to define two receive methods, e.g. initializing and initialized, starting with def receive = initializing. Inside your initializing state, you could simply send back a message like InitializationNotReady to tell the other actor that it should try again later. Once your actor is initialized, you switch with context become initialized to the new state, where you can operate normally.
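As a sketch (MyDao, InitialState, and InitializationNotReady are made-up names; the DAO future is piped back to the actor as a message, so nothing blocks):

import akka.actor.Actor
import akka.pattern.pipe
import scala.concurrent.Future

trait MyDao { def loadState(): Future[Map[String, String]] }

case object InitializationNotReady
final case class InitialState(data: Map[String, String])

class MyActor(dao: MyDao) extends Actor {
  import context.dispatcher

  // kick off the load without blocking; the result arrives as a message to self
  override def preStart(): Unit = dao.loadState().map(InitialState).pipeTo(self)

  def receive: Receive = initializing

  def initializing: Receive = {
    case InitialState(data) => context.become(initialized(data))
    case _                  => sender() ! InitializationNotReady // ask the caller to retry later
  }

  def initialized(data: Map[String, String]): Receive = {
    case msg =>
      () // operate normally, using `data`
  }
}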
Another good approach could be to have a look at Akka Persistence. It enables stateful actors to persist their internal state so that it can be recovered when an actor is started, restarted after a JVM crash or by a supervisor, or migrated in a cluster.
In your case, you can restore your state from a database, since there are multiple storage plugins for Akka Persistence (see the persistence plugins listed in the Akka documentation). After that recovery, you can receive messages the way you're used to.

Should supervised actors in Akka receive messages directly or via its supervisor?

I recently watched a great video by Riccardo Terrell about Akka.NET and F# but the question I am going to ask is related to Akka in general. I was puzzled with a part where he discusses supervisor and supervision strategy on error. In his examples clients send messages to supervisor that forwards them to a worker actor by calling worker.Tell(msg, mailbox.Sender()). I wonder how common is such practice. In our system there are a few places where a client first Ask a supervisor to obtain a worker instance and then make Tell calls directly to a worker. I am not happy about using Ask but in our case we need worker affinity, i.e. messages from the same client may need to be routed to the same worker so giving a client a worker instance simplifies this. But again, Ask is bad. So in case of supervised actors, are they supposed to receive messages via the supervisor (to avoid Ask) or "it depends"?
One solution could be to use a consistent-hashing pool with a simple hashing function; then you can eliminate sending worker references back to the asker.
new ConsistentHashingPool(5).WithHashMapping(o =>
{
    if (o is IHasCustomKey)
        return ((IHasCustomKey)o).Key;

    return null;
});
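On the JVM side of Akka, a roughly equivalent sketch would be (HasCustomKey, Worker, and the sizes are made-up names):

import akka.actor.{Actor, ActorSystem, Props}
import akka.routing.ConsistentHashingPool
import akka.routing.ConsistentHashingRouter.ConsistentHashMapping

// hypothetical: messages carry their own affinity key
trait HasCustomKey { def key: String }

class Worker extends Actor {
  def receive = {
    case msg =>
      () // messages with the same key always land on the same worker instance
  }
}

object Example extends App {
  val hashMapping: ConsistentHashMapping = {
    case m: HasCustomKey => m.key
  }

  val system = ActorSystem("example")
  // clients just Tell this router; no Ask for a worker reference is needed
  val workers = system.actorOf(
    ConsistentHashingPool(nrOfInstances = 5, hashMapping = hashMapping).props(Props[Worker]),
    "workers")
}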
Any comments welcome!

Configuring spray-servlet to avoid request bottleneck

I have an application which uses spray-servlet to bootstrap my custom Spray routing Actor via spray.servlet.Initializer. The requests are then handed off to my Actor via spray.servlet.Servlet30ConnectorServlet.
From what I can gather, the Servlet30ConnectorServlet simply retrieves my Actor out of the ServletContext that the Initializer had set when the application started, and hands the HttpServletRequest to my Actor's receive method. This leads me to believe that only one instance of my Actor will have to handle all requests. If my Actor blocks in its receive method, then subsequent requests will queue waiting for it to complete.
Now I realize that I can code my routing Actor to use detach() or a complete that returns a Future; however, most of the documentation never alludes to having to do this.
If my above assumption is true (single Actor instance handling all requests), is there a way to configure the Servlet30ConnectorServlet to perhaps load balance the incoming requests amongst multiple instances of my routing Actor instead of just the one? Or is this something I'll have to roll myself by subclassing Servlet30ConnectorServlet?
I did some research and now I understand better how spray-servlet works. It's not spray-servlet that dictates how many Request Handler Actors are created, but rather the plumbing code provided with the example I based my application on.
My assumption all along was that spray-servlet would essentially work like a traditional Java EE application dispatcher in a handler-per-request type of fashion (or some reasonable variant of that notion). That is not the case because it is routing the request to an Actor with a mailbox, not some singleton HttpServlet.
I am now delegating the requests to a pool of actors in order to reduce the potential for a bottleneck when our system is under load.
val serviceActor = system.actorOf(
  RoundRobinPool(config.SomeReasonableSize).props(Props[BootServiceActor]),
  "my-route-actors")
I am still a bit baffled by the fact that the examples and documentation assume everyone will be writing non-blocking Request Handler Actors under spray. All of their documentation essentially demonstrates complete calls that don't return Futures, yet there is no mention in their literature that maybe, just maybe, you might want to create a reasonably sized pool of Request Handler Actors to prevent a slew of requests from bottlenecking the poor single overworked Actor. Or it's possible I've overlooked it.
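For reference, the Future-based alternative mentioned above looks roughly like this in spray-routing (a sketch; doExpensiveWork is a made-up stand-in for the blocking call):

import scala.concurrent.Future
import spray.routing.HttpService

trait MyService extends HttpService {
  // execution context for the futures below (the actor system's dispatcher)
  implicit def executionContext = actorRefFactory.dispatcher

  val route =
    path("work") {
      get {
        // completing with a Future frees the routing actor immediately;
        // the response is written when the future finishes
        complete(Future {
          doExpensiveWork()
        })
      }
    }

  def doExpensiveWork(): String = "result" // hypothetical blocking call
}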