Akka equivalent of Spring InitializingBean - akka

I have written some actor classes and I find that I have to get a handle into the lifecycle of these entities. For example whenever my actor is initialized I would like a method to be called so that I can setup some listeners on message queues (or open db connections etc).
Is there an equivalent of this? The closest equivalents I can think of are Spring's InitializingBean and DisposableBean.

This is a typical scenario where you would override methods like preStart(), postStop(), etc. I don't see anything wrong with this.
Of course you have to be aware of the details - for example postStop() is called asynchronously after actor.stop() is invoked while preStart() is called when an Actor is started. This means that potentially slow/blocking things like DB interaction should be kept to a minimum.
You can also use the Actor's constructor for initialization of data.
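A minimal sketch of the lifecycle hooks mentioned above, assuming the classic Akka actor API. FakeConnection and QueueWorker are hypothetical names invented for this illustration; a real actor would open its DB connection or queue listener where the placeholder is created.

```scala
import akka.actor.Actor

// Hypothetical resource standing in for a DB connection or queue listener.
final class FakeConnection {
  @volatile var open: Boolean = true
  def close(): Unit = open = false
}

class QueueWorker extends Actor {
  // Assigned in preStart so that every started (or restarted) instance gets a fresh one.
  private var connection: FakeConnection = _

  // Called when the actor is started, before it processes its first message.
  override def preStart(): Unit = {
    connection = new FakeConnection
  }

  // Called asynchronously after the actor has been stopped; release resources here.
  override def postStop(): Unit = {
    if (connection != null) connection.close()
  }

  def receive: Receive = {
    case "status" => sender() ! connection.open
  }
}
```

With the default preRestart and postRestart implementations, a supervisor-triggered restart also runs postStop on the old instance and preStart on the new one, so the same two hooks cover restarts as well.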
As Matthew mentioned, supervision plays a big part in Akka, so you can have a supervisor react to specific events. One example is the so-called DeathWatch: you are notified when one of the actors you are watching dies:
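import akka.actor.Terminated

context.watch(child) // register to receive Terminated when child stops
...
def receive = {
  // lastSender is an ActorRef saved earlier (e.g. whoever asked for the child to be stopped)
  case Terminated(`child`) => lastSender ! "finished"
}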

An Actor is basically two methods -- a constructor, and a message handler (receive in the Scala API, onReceive(Object): void in the Java UntypedActor API).
There's nothing in its lifecycle that naturally provides for "wiring" behavior, which leaves you with a few options.
1. Use a Supervisor actor to create your other actors. A Supervisor is responsible for watching, starting and restarting Actors on failure -- and therefore it is often valuable to have a Supervisor that understands the state of integrated systems to avoid continuously restarting. This Supervisor would create and manage Service objects (possibly via Spring) and pass them to Actors.
2. Use your preferred initialization technique at the time of Actor construction. It's tricky, but you can certainly combine Spring with Actors. Just be aware that should a Supervisor restart your actor, you'll need to be able to resurrect its desired state from whatever content you placed in the Props object you used to start it in the first place.
3. Wire everything on demand. Open connections on demand when an Actor starts (and cache them as necessary). I find I do this fairly often -- and I let the Actor fail when its connections no longer work. The supervisor will restart the Actor, which will recreate all connections.
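The on-demand approach above, combined with the point about rebuilding state from Props, might be sketched as follows. This is only an illustration under assumptions: Connection, Sender, SenderSupervisor and the "boom" failure trigger are all hypothetical, and the restart limits are arbitrary.

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.Restart
import java.io.IOException
import scala.concurrent.duration._

// Hypothetical connection; a real one would wrap JDBC, JMS, a socket, etc.
final class Connection(val url: String) {
  def send(msg: String): Unit =
    if (msg == "boom") throw new IOException(s"connection to $url lost")
}

// Everything needed to rebuild the actor after a restart arrives via Props (here: url).
class Sender(url: String) extends Actor {
  // Opened on first use; a restarted instance opens its own fresh connection.
  private lazy val connection = new Connection(url)

  def receive: Receive = {
    case msg: String =>
      connection.send(msg) // an IOException here fails the actor
      sender() ! "sent"
  }
}

class SenderSupervisor(url: String) extends Actor {
  // Restart the child on I/O failures, at most 5 times per minute (arbitrary limits).
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 5, withinTimeRange = 1.minute) {
      case _: IOException => Restart
    }

  private val child = context.actorOf(Props(classOf[Sender], url), "sender")

  def receive: Receive = {
    case msg => child.forward(msg) // forward keeps the original sender
  }
}
```

The message that caused the failure is not retried by default, but the child's mailbox survives the restart, so later messages are handled by the new instance.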
Some important things to remember:
The intent of Actor model is that Actors don't run continuously -- they only run when there are messages provided to them. If you add a message listener to an Actor, you are essentially adding new threads that can access that actor. This can be a problem if you use supervision -- a restarted actor may leak that thread and this may in turn cause the actor not to be garbage collected. It can also be a problem because it introduces a race condition, and part of the value of actors is avoiding that.
An Actor that does I/O is, from the perspective of the actor system, blocking. If you have too many Actors doing I/O at the same time, you will exhaust your Dispatcher's thread pool and lock up the system.
A given Actor instance can operate on many different threads over its lifetime, but will only operate on one thread at a time. This can be confusing to some messaging systems -- for example, the JMS spec asserts that a Session must not be used on multiple threads, and many JMS implementations interpret this as "can only run on the thread on which it was started." You may see warnings, or even exceptions, resulting from this.
For these reasons, I prefer to use non-actor code to do some of my I/O. For example, I'll have an incoming message listener object whose responsibility is to take JMS messages off a queue, use them to create POJO messages, and send them to the Actor system with tell. Alternatively, I'll use an Actor, but place that actor on a custom Dispatcher that has thread pinning enabled. This ensures that the Actor runs on its own dedicated thread and won't block the threads that other non-I/O actors are using.
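A sketch of the pinned-dispatcher variant, assuming Akka's built-in PinnedDispatcher type. The dispatcher name io-pinned-dispatcher and the JmsBridge actor are made up for this example. The allow-core-timeout = off line is there because, by default, the pinned thread may be replaced after an idle period; drop it if that is acceptable.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import com.typesafe.config.ConfigFactory

// Hypothetical I/O actor; it simply reports the thread it runs on.
class JmsBridge extends Actor {
  def receive: Receive = {
    case "thread" => sender() ! Thread.currentThread().getName
  }
}

object PinnedDispatcherSketch {
  // "io-pinned-dispatcher" is a name chosen for this sketch.
  val config = ConfigFactory.parseString(
    """
      |io-pinned-dispatcher {
      |  type = PinnedDispatcher
      |  executor = "thread-pool-executor"
      |  thread-pool-executor.allow-core-timeout = off
      |}
    """.stripMargin).withFallback(ConfigFactory.load())

  def start(): (ActorSystem, ActorRef) = {
    val system = ActorSystem("pinned-sketch", config)
    // The actor gets its own thread instead of sharing the default dispatcher's pool.
    val bridge = system.actorOf(
      Props(classOf[JmsBridge]).withDispatcher("io-pinned-dispatcher"), "jms-bridge")
    (system, bridge)
  }
}
```

In a real application the dispatcher block would normally live in application.conf rather than in a string literal.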

Related

Best practice to identify and deal with idle actors in Akka

I'm new to the Akka framework and I'm building a group chat application with it. My application may have 10 million actor instances (an actor instance for each group chat) of the same type, only 5% of which are highly active, and 60% of which can be idle (not receiving any messages) for days.
I want to know:
Is there any best practice to identify these idle actors?
What is the best practice to deal with them? Is stopping them enough?
Is there any best practice to identify these idle actors?
An actor's ActorContext has a setReceiveTimeout method that defines an inactivity threshold for the actor: if the actor hasn't received a message in the given amount of time, then an akka.actor.ReceiveTimeout message is sent to the actor. For example:
import akka.actor.{ Actor, ReceiveTimeout }
import scala.concurrent.duration._

class ChatActor extends Actor {
  // request a ReceiveTimeout message after two hours without any message
  context.setReceiveTimeout(2.hours)

  def receive = {
    case ReceiveTimeout =>
      // do something
    // other case clauses
  }
}
The above ChatActor will receive a ReceiveTimeout message if it hasn't received a message for two hours. (However, as the documentation states: "the receive timeout might fire and enqueue the ReceiveTimeout message right after another message was enqueued; hence it is not guaranteed that upon reception of the receive timeout there must have been an idle period beforehand as configured via this method.")
What is the best practice to deal with them?
It's a good idea to stop inactive actors; otherwise you could have a memory leak. Here are a few approaches for stopping these actors:
The inactive actor throws an exception, which is handled in a supervisor strategy defined in the actor's parent. In the supervisor strategy, the parent stops the idle actor (e.g., via context stop sender()).
The inactive actor sends its self reference to a "reaper" actor that collects references to idle actors and culls (i.e., stops) these actors on a periodic basis (perhaps using a scheduler).
The inactive actor stops itself (via context stop self).
More information about stopping actors is found here.
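The third approach in the list above (the idle actor stops itself) might look like the following sketch. IdleAwareChat and its echo behaviour are placeholders; the idle threshold is passed in so that it can be kept short in tests.

```scala
import akka.actor.{Actor, ReceiveTimeout}
import scala.concurrent.duration._

class IdleAwareChat(idleAfter: FiniteDuration) extends Actor {
  // Ask for a ReceiveTimeout message once no message has arrived for idleAfter.
  context.setReceiveTimeout(idleAfter)

  def receive: Receive = {
    case ReceiveTimeout =>
      // Idle for too long: stop; postStop runs and the instance can be collected.
      context.stop(self)
    case text: String =>
      sender() ! s"echo: $text" // placeholder for real chat handling
  }
}
```

Whoever holds the ActorRef (for example a parent that maps chat ids to actors) can watch the actor and drop the stale reference when Terminated arrives.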
Is stopping them enough?
When an actor is stopped, its ActorRef essentially becomes invalid. From the documentation:
After stopping an actor, its postStop hook is called, which may be used e.g. for deregistering this actor from other services. This hook is guaranteed to run after message queuing has been disabled for this actor, i.e. messages sent to a stopped actor will be redirected to the deadLetters of the ActorSystem.
At this point, the underlying actor instance to which the now-stale ActorRef points is eligible for garbage collection. In other words, an actor must be stopped in order for it to be eligible for garbage collection. Therefore, in regard to freeing up memory, stopping the actor is enough. You could also remove the invalid ActorRef itself after the actor has been stopped. Note that removing an ActorRef does not automatically stop the actor:
It is important to note that Actors do not stop automatically when no longer referenced, every Actor that is created must also explicitly be destroyed.
Is there any best practice to identify these idle actors?
The only way is to have each actor record the time at which it was last active. Then, to speed up finding the actors that have been idle the longest, you can organize an index-like structure, e.g. a PriorityQueue. A dedicated actor then wakes up periodically and removes from that structure the actors that have been idle longer than some predefined period of time.
What is the best practice to deal with them? Is stopping them enough?
An idle actor does not consume any resources except memory. If you have plenty of memory, the best practice is to do nothing. If you want to save that memory, store the actor's state in a database (after some period of inactivity) and read it back from there on demand.

What is the purpose of stopping actors in Akka?

I have read the Akka docs on fault tolerance & supervision, and I think I totally get them, with one big exception (no pun intended).
Why would you ever want/need to stop a child actor???
The only clue in the docs is:
Closer to the Erlang way is the strategy to just stop children when they fail and then take corrective action in the supervisor...
But to me, stopping a child is the same as saying "don't execute this code any longer", which to me, is effectively the same as deploying new changes to the code which has that actor removed entirely:
Every Actor plays some critical role in the actor system
To simply stop the actor means that actor currently doesn't have a role any longer, and presumes the system can now somehow (magically) work without it
So again, to me, this is no different than refactoring the code to not even have the actor any more, and then deploying those changes
I'm sure I'm just not seeing the forest for the trees on this one, but I just don't see any use cases where I'd have this big complex actor system, where each actor does critical work and then hands it off to the next critical actor, but then I stop an actor, and magically the whole system keeps on working perfectly.
In short: stopping an actor (to me) is like ripping the transmission out of a moving vehicle. How can this ever be a good/desirable thing?!?
The essence of the "error kernel" pattern is to delegate risky operations and protect essential state. It is common to spawn child actors for one-off operations; when such an operation is completed and its result has been sent off somewhere else, either the child actor or the parent actor needs to stop it (otherwise the child actor will remain alive and leak).
If the child actor is running a longer process that can be terminated safely, such as video encoding or some kind of file transformation, and you have to deploy a new build, a termination signal is useful for stopping the running processes gracefully.
Every Actor plays some critical role in the actor system
This is where you are running into trouble. I can create a child actor to do a single job, for example to execute a query against a database or to maintain the state of a connected user, and that is its only purpose.
Once the database query is complete or the user has gracefully disconnected, the child actor no longer has any role to play and should be stopped so that it releases any resources it holds.
To simply stop the actor means that actor currently doesn't have a role any longer, and presumes the system can now somehow (magically) work without it
The system is able to continue because I can create new child actors if/when they are needed.
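A sketch of such a one-off child, using the classic actor API. RunQuery, QueryResult, QueryWorker and QueryService are hypothetical names, and the "query" is a placeholder string rather than real database access.

```scala
import akka.actor.{Actor, Props}

final case class RunQuery(sql: String)
final case class QueryResult(rows: Seq[String])

// One-off worker: handles a single request, replies, then stops itself.
class QueryWorker extends Actor {
  def receive: Receive = {
    case RunQuery(sql) =>
      sender() ! QueryResult(Seq(s"result of: $sql")) // placeholder for real DB work
      context.stop(self) // the job is done, so release this actor
  }
}

class QueryService extends Actor {
  def receive: Receive = {
    case query: RunQuery =>
      // A fresh, anonymous child per request; forward keeps the original sender.
      context.actorOf(Props(classOf[QueryWorker])).forward(query)
  }
}
```

The parent stays up the whole time; only the short-lived workers come and go, which is why stopping them does not take anything essential out of the system.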

When is Akka's default system ready in Play?

I was writing an application in Play 2.3.7 and when trying to create an actor (using the default Akka.system() of Play) inside the overridden beforeStart method of the Global object, the application crashes with an infinite recursive call of beforeStart, ultimately throwing an exception due to the Global object not being initialized. If I create this actor inside the onStart method, then everything goes well.
My "intuition" was: "ok, this actor must be ready before the application receives the first request, so it must be created on beforeStart, not in onStart".
When is Akka.system() ready to use?
Akka.system returns an ActorSystem held by the AkkaPlugin. Therefore, if you want to use it, you must do so after the AkkaPlugin has been initialized. The AkkaPlugin is given priority 1000, which means it is started after most other internal plugins (database, evolutions, ...). The Global plugin has priority 10000, which means the AkkaPlugin is available there (and for any plugin with priority > 1000).
Note the warning in the docs about beforeStart:
Called before the application starts.
Resources managed by plugins, such as database connections, are likely not available at this point.
You have to start this in onStart() because beforeStart() is called too early - way before anything like Akka (which is actually a plugin) or any database connections are created. In fact, the documentation for GlobalSettings states:
Resources managed by plugins, such as database connections, are likely not available at this point.
The general guidance (confirmed by this thread) is that onStart() is the place to create your actors. And in practice, that has worked for me as well.

akka actor failure affect on the os process

If, say, the code that my actor uses (code I have no control over) throws an unhandled exception, could that cause the whole actor system process to crash, or is each actor running in some kind of special container?
To clarify, in my use case I want each actor to load (at run time) some user-written code/lib and call some interface methods on it. These libs may be buggy and could potentially cause my actor system's OS process to die or hang. I mean, what if the code that the actor calls does something that hangs (like accessing a remote resource through a buggy client, or an infinite loop), or even calls Environment.exit() or something else of that nature?
I mean if my requirement is to allow each actor to load code that I do not have control over, how can I guard my actor system against them? Do I even have to do this?
One way I can think of for the whole actor system OS process to guard itself against this third-party code is to run each actor inside some kind of container, or even to have one actor system per actor on the local machine that my actor controls. Do I have to go this far, or does Akka already take care of this for me, so that a failure at the actor level would not jeopardize the whole actor system and its process?
If the JVM process dies, the JVM process dies. You get around that by using Akka Cluster so you can observe and react to node failures.
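For the narrower case of an ordinary exception thrown by third-party code, supervision keeps the failure local to the actor. The sketch below assumes a hypothetical Plugin interface for the user-written code, with PluginRunner and PluginHost as invented names. It does nothing against code that loops forever or exits the JVM; those cases still need process-level isolation as described above.

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.Resume

// Hypothetical interface implemented by the user-written code.
trait Plugin {
  def handle(input: String): String
}

// Runs the untrusted code; an exception thrown by it fails only this actor.
class PluginRunner(plugin: Plugin) extends Actor {
  def receive: Receive = {
    case input: String => sender() ! plugin.handle(input)
  }
}

class PluginHost(plugin: Plugin) extends Actor {
  // The failing message is dropped; the runner keeps its state and carries on.
  override val supervisorStrategy: SupervisorStrategy =
    OneForOneStrategy() {
      case _: Exception => Resume
    }

  private val runner = context.actorOf(Props(classOf[PluginRunner], plugin), "runner")

  def receive: Receive = {
    case msg => runner.forward(msg)
  }
}
```

Returning Restart or Stop instead of Resume would be the choice if the plugin's state cannot be trusted after a failure.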

How to deal with too much actor creation

EDIT: One important thing I forgot mentioning: the actor creation described below depends on the data - sometimes few processing actors are required, and sometimes many.
One component I'm working on needs to create a number of actors (possibly round-robin routed ones) that each get a rather large number of messages to process. Each of those actors belongs to a "processing batch" which has the same initialization parameters.
When I'm running this on the production machine with many messages, I quickly get a number of actor creation timeouts. I'm creating the actors directly with ActorSystem.actorOf().
What's surprising me though is that all in all there aren't that many actors being created I'd think (8 "processing sinks" with 5 round-robin routed actors would be 40 actors, which doesn't seem very much).
I'm shutting down the actors once they're not needed anymore by having another actor (which counts the amount of successes and failures that it gets via the "processing" actors) send them a PoisonPill so I'd think that they are all shut down correctly.
Am I perhaps doing something wrong here in the way that I am creating those actors, e.g. should I perhaps create them differently? Or would an appropriate strategy be to wait for some of the batches to be done before creating new actors?
Since you did not specify which version you are using I’m assuming that you will be interested in reading this:
http://doc.akka.io/docs/akka/2.0.3/scala/actors.html#Creating_Actors_with_default_constructor (especially the warning)
Besides the technical argument, creating your actors at top level is not a good design, missing out on the fault handling benefits.
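As an illustration of the design point above, the processing actors can be created as children of one parent per batch instead of directly on the ActorSystem. This sketch assumes a newer Akka release than the 2.0.3 docs linked above (RoundRobinPool comes from akka.routing in Akka 2.3 and later); Work, Done, Processor and BatchParent are hypothetical names.

```scala
import akka.actor.{Actor, Props}
import akka.routing.RoundRobinPool

final case class Work(item: String)
final case class Done(item: String)

class Processor extends Actor {
  def receive: Receive = {
    case Work(item) => sender() ! Done(item) // placeholder for real processing
  }
}

// One parent per processing batch; it owns and supervises its routed workers.
class BatchParent(workers: Int) extends Actor {
  private val pool = context.actorOf(
    RoundRobinPool(workers).props(Props(classOf[Processor])), "pool")

  def receive: Receive = {
    case work: Work => pool.forward(work) // replies go back to the original sender
  }
}
```

Stopping the BatchParent also stops the pool and all of its routees, so a finished batch can be torn down with a single stop or PoisonPill.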