Is it possible to prioritize (give a priority to) a specific Akka Actor? - akka

I've been researching the Akka framework,
and I would like to know:
Is it possible to give a priority to a specific actor?
I mean, actors do their work when they get a "let" message from the queue.
Is there an option to let an actor work even when it's not its turn to work yet?

Effectively, yes.
One of the parts of your Actor configuration is which Dispatcher those actors will use. A dispatcher is what connects the actor to the actual threads that will execute the work. (Dispatchers default to ForkJoinPools, but can also be dedicated thread pools or even threads dedicated to a specific actor.)
So the typical way you give an Actor "priority" is to give it a dedicated dispatcher, and thereby dedicated threads. For example, Akka itself does this for its internal messages: they run on a dedicated dispatcher so that even if you deploy a bunch of poorly written actors that block the threads, Akka itself can still function.
I put "priority" in quotes, because you aren't guaranteeing a specific order of processing. (There are other ways to do that, but not across Actors.) But you are solving the case where you want specific actors to always have greater access to resources and/or specific actors to get executed promptly.
(In theory, you could take this even further and create a ThreadPoolExecutor with higher priority threads, and then create a Dispatcher based on that ThreadPoolExecutor. That would truly give OS-level priority to an Actor, but that would only be likely relevant in very unusual circumstances.)
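As a rough illustration of the ThreadPoolExecutor idea above, here is a minimal plain-Java sketch. It only builds the pool; wiring it into an Akka dispatcher is not shown. The class name HighPriorityPool and its methods are made up for this example, and thread priority is only a hint that the OS may ignore.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not an Akka API: a pool whose worker threads
// request the highest JVM scheduling priority.
class HighPriorityPool {

    static ThreadPoolExecutor create(int threads) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "high-priority-worker");
            t.setPriority(Thread.MAX_PRIORITY); // a hint; the OS may ignore it
            t.setDaemon(true);
            return t;
        };
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(), factory);
    }

    // Runs one task on the pool and reports the priority of the thread that ran it.
    static int observedPriority() {
        ThreadPoolExecutor pool = create(2);
        try {
            Callable<Integer> task = () -> Thread.currentThread().getPriority();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker priority = " + observedPriority());
    }
}
```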
EDIT TO RESPOND TO "do mailboxes and dispatchers are the same" [sic]?
No. Each actor has a mailbox. So sometimes we talk about the behavior of mailboxes when discussing the behavior of actors, as the behavior of the mailbox governs the ordering of the actor's message processing.
But dispatchers are a distinct concept. Actors have a dispatcher, but it is many to one. (i.e. each Actor has one mailbox, but there may be many actors associated with a single dispatcher.)
For example, a real world situation might be:
System actors are processed by the internal dispatcher. To quote the docs "To protect the internal Actors that are spawned by the various Akka modules, a separate internal dispatcher is used by default." i.e. no matter how badly screwed up your own code might be, you can't screw up the heartbeat processing and other system messages because they are running on their own dispatcher, and thus their own threads.
Most actors (millions of them perhaps) are processed by the default dispatcher. Huge numbers of actors, as long as they are well behaved, can be handled with a tiny number of threads. So they might all be configured to use the default dispatcher.
Badly behaved actors (such as those that block) might be configured to be processed by a dedicated "blocking" dispatcher. By isolating blocking actors into a separate dispatcher they don't impact the response time of the default dispatcher.
Although I don't see this often, you might also have a dispatcher for extremely response time sensitive actors that gives them a dedicated thread pool. Or even a "pinned" dispatcher that gives an actor a dedicated thread.
As I mentioned, this isn't really "priority", this is "dedicated resources". Because one of the critical aspects of actors is that they are location independent. So if Actor A is on Node A, and Actor B is on Node B, I can't guarantee that Actor A will ALWAYS act first. Because doing so would involve an ASTRONOMICAL amount of overhead between nodes. All I can reasonably do is give Actor A dedicated resources so that I know that Actor A should always be able to act quickly.
Note that this is what the internal dispatcher does as well. We don't guarantee that heartbeat messages are always processed first, but we do make sure that there are always threads available to process system messages, even if some bad user code has blocked the default dispatcher.
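The "dedicated resources" idea can be sketched without Akka at all. In this plain-Java sketch, two thread pools stand in for two dispatchers. This is an assumption made for illustration, not how Akka is implemented, and DedicatedPoolDemo is a made-up name. Every thread of the shared pool is blocked, yet the task on the dedicated pool still runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class DedicatedPoolDemo {

    // Returns true if a task on the dedicated pool finishes while every
    // thread of the shared pool is stuck in a blocking call.
    static boolean dedicatedPoolStaysResponsive() {
        ExecutorService sharedPool = Executors.newFixedThreadPool(2);    // stands in for the default dispatcher
        ExecutorService dedicatedPool = Executors.newFixedThreadPool(1); // stands in for the internal dispatcher
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch bothBlocked = new CountDownLatch(2);
        try {
            // "Badly behaved" work: occupy both shared threads until released.
            for (int i = 0; i < 2; i++) {
                sharedPool.execute(() -> {
                    bothBlocked.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            bothBlocked.await();
            // Queued behind the blocked tasks, so it cannot start yet.
            Future<String> starved = sharedPool.submit(() -> "user message");
            // The dedicated pool has its own thread, so it is unaffected.
            Future<String> heartbeat = dedicatedPool.submit(() -> "heartbeat");
            String result = heartbeat.get(5, TimeUnit.SECONDS);
            return "heartbeat".equals(result) && !starved.isDone();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // unblock the shared pool so it can shut down
            sharedPool.shutdown();
            dedicatedPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("dedicated pool responsive: " + dedicatedPoolStaysResponsive());
    }
}
```

Note that nothing here orders the heartbeat ahead of the user message. The heartbeat simply never has to wait for a thread.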

Related

Actors, ForkJoinPool, and ordering of messages

I need help understanding how an Actor system can use ForkJoinPool and maintain ordering guarantees.
I have been playing with Actr https://github.com/zakgof/actr which is a simple small actor system. I think my question applies to Akka as well. I have a simple bit of code that sends one Actor numbers 1 to 10. The Actor just prints the messages; and the messages are not in order. I get 1,2,4,3,5,6,8,7,9,10.
I think this has to do with the ForkJoinPool. Actr wraps a message into a Runnable and sends it to the ForkJoin Executor. When the task executes it puts the message onto the destination Actor's queue and processes it. My understanding of ForkJoinPool is that tasks are distributed to multiple threads. I've added logging and the messages 1,2,3,... are being distributed to different threads and the messages are put on to the Actor's queue out of order.
Am I missing something? Actr's Scheduler is similar to Akka's Dispatcher and it can be found here: https://github.com/zakgof/actr/blob/master/src/main/java/com/zakgof/actr/impl/ExecutorBasedScheduler.java
The ExecutorBasedScheduler is constructed with a ForkJoinPool.commonPool like so:
public static IActorScheduler newForkJoinPoolScheduler(int throughput) {
    return new ExecutorBasedScheduler(ForkJoinPool.commonPool(), throughput);
}
How can an Actor use ForkJoinPool and keep messages in order?
I can't speak to Actr at all, but in Akka the individual messages are not created as ForkJoinPool tasks. (One task per message seems like a very bad approach for many reasons, not just ordering issues. Namely that messages can typically be processed very quickly and if you had one task per message the overhead would be awfully high. You want to have some batching, at least under load, so that you get better thread locality and less overhead.)
Essentially, in Akka, the actor mailboxes are queues within an object. When a message is received by the mailbox it will check if it has already scheduled a task, if not, it will add a new task to the ForkJoinPool. So the ForkJoinPool task isn't "process this message", but instead "process the Runnable associated with this specific Actor's mailbox". Some period of time then obviously passes before the task gets scheduled and the Runnable runs. When the Runnable runs, the mailbox may have received many more messages. But they will just have been added to the queue and the Runnable will then just process as many of them as it is configured to do, in the order in which they were received.
This is why, in Akka, you can guarantee the order of messages within a mailbox, but cannot guarantee the order of messages sent to different Actors. If I send message A to Actor Alpha, then message B to Actor Beta, then message C to Actor Alpha, I can guarantee that A will be before C. But B might happen before, after, or at the same time as A and C. (Because A and C will be handled by the same task, but B will be a different task.)
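To make the "one task per mailbox, not one task per message" idea concrete, here is a toy sketch in plain Java. It is loosely modelled on the description above and is not Akka's actual code. ToyMailbox, tell, and the throughput parameter are names chosen for this example. The mailbox is a Runnable that is submitted to the pool only when it is not already scheduled, and each run drains a batch in arrival order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Toy mailbox for illustration only; not Akka's implementation.
class ToyMailbox<T> implements Runnable {
    private final ConcurrentLinkedQueue<T> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final Executor executor;
    private final int throughput;       // max messages handled per run
    private final Consumer<T> behavior; // the "actor" code

    ToyMailbox(Executor executor, int throughput, Consumer<T> behavior) {
        this.executor = executor;
        this.throughput = throughput;
        this.behavior = behavior;
    }

    // Enqueue first, then make sure exactly one task is scheduled for this mailbox.
    void tell(T message) {
        queue.add(message);
        registerForExecution();
    }

    private void registerForExecution() {
        if (!queue.isEmpty() && scheduled.compareAndSet(false, true)) {
            executor.execute(this); // the task is the mailbox, not the message
        }
    }

    @Override
    public void run() {
        try {
            for (int i = 0; i < throughput; i++) {
                T message = queue.poll();
                if (message == null) break;
                behavior.accept(message); // one message at a time, in FIFO order
            }
        } finally {
            scheduled.set(false);
            registerForExecution(); // reschedule only if work remains
        }
    }

    // Sends 1..count from a single sender and returns the order in which they were processed.
    static List<Integer> demo(int count) {
        ForkJoinPool pool = new ForkJoinPool(4);
        List<Integer> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(count);
        ToyMailbox<Integer> mailbox = new ToyMailbox<>(pool, 5, msg -> {
            received.add(msg);
            done.countDown();
        });
        try {
            for (int i = 1; i <= count; i++) mailbox.tell(i);
            if (!done.await(10, TimeUnit.SECONDS)) throw new IllegalStateException("timed out");
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(10));
    }
}
```

Different runs of the mailbox may land on different pool threads, but only one run is in flight at a time, so the queue order is the processing order.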
Messaging Ordering Docs : More details on what is guaranteed and what isn't regarding ordering.
Dispatcher Docs : Dispatchers are the connection between Actors and the actual execution. ForkJoinPool is only one implementation (although a very common one).
EDIT: Just thought I'd add some links to the Akka source to illustrate. Note that these are all internal APIs. tell is how you use it, this is all behind the scenes. (I'm using permalinks so that my links don't bitrot, but be aware that Akka may have changed in the version you are using.)
The key bits are in akka.dispatch.Dispatcher.scala
Your tell will go through some hoops to get to the right mailbox. But eventually:
dispatch: this method gets called to enqueue the message. This is very simple, just enqueue and call the registerForExecution method.
registerForExecution: this method actually checks to see if scheduling is needed first. If it needs scheduling it uses the executorService to schedule it. Note that the executorService is abstract, but execute is called on that service providing the mailbox as an argument.
execute: if we assume the implementation is ForkJoinPool, this is the executorService execute method we end up in. Essentially we just create a ForkJoinTask with the supplied argument (the mailbox) as the runnable.
run: the Mailbox is conveniently a Runnable so the ForkJoinPool will eventually call this method once scheduled. You can see that it processes special system messages then calls processMailbox then (in a finally) calls registerForExecution again. Note that registerForExecution checks if it needs scheduling first so this isn't an infinite loop, it's just checking if there is remaining work to do. While we are in the Mailbox class you can also look at some of the methods that we used in the Dispatcher to see if scheduling is needed, to actually add messages to the queue, etc.
processMailbox: this is essentially just a loop over calling actor.invoke except that it has to do lots of checking to see if it has system messages, if it's out of work, if it's passed a threshold, if it has been interrupted, etc.
invoke: this is where the code you write (the receiveMessage) actually gets called.
If you actually click through all of those links you'll see that I'm simplifying a lot. There's lots of error handling and code to make sure everything is thread safe, super efficient, and bulletproof. But that's the gist of the code flow.

How does Akka provide concurrency for a single actor?

I have read that:
Akka ensures that each instance of an actor runs in its own lightweight thread and its messages are processed one at a time.
If it is the case that an Akka actor processes its messages sequentially, then how does Akka provide concurrency for a single Actor?
Actors are independent agents of computation, each one is executed strictly sequentially but many actors can be executed concurrently. You can view an Actor as a Thread that costs only about 0.1% of what a normal thread costs and that also has an address to which you can send messages—you can of course manage a queue in your own Thread and use that for message passing but you’d have to implement all that yourself.
If Akka—or the Actor Model—stopped here, then it would indeed not be very useful. The trick is that giving stable addresses (ActorRef) to the Actors enables them to communicate even across machine boundaries, over a network, in a cluster. It also allows them to be supervised for principled failure handling—when a normal Thread throws an exception it simply terminates and nothing is done to fix it.
It is this whole package of encapsulation (provided by hiding everything behind ActorRef), message-based communication that is location transparent, and support for failure handling that makes the Actor Model a perfect fit for expressing distributed systems. And today there is a distributed system of many CPU cores within even the smallest devices.

When to use various Akka Mailbox types

I'm trying to understand when and where to use the different built-in Akka mailboxes as well as when it is appropriate to roll your own. However, nowhere on that page does it explain what a "bounded mailbox" actually is, or how it behaves differently from an unbounded mailbox. Also, that page categorizes mailboxes as "blocking" vs "non-blocking". And while I have a strong idea of what they mean by this (a message can't be sent to a mailbox unless the mailbox is first emptied) I'm not 100% sure that I understand this. So seeing that I have no idea what the docs mean when they categorize a mailbox as bounded or blocking, it's tough for me to tell when I should be using each type.
Also, it seems like it is the default Akka behavior to clear out an actor's mailbox if that actor is restarted. I'd like to prevent this, but I'm not sure if the solution is to use one of these built-in mailbox types (message persistence is not mentioned on this page) or to somehow use persistent actors to accomplish such losslessness.
First, if an actor crashes and is restarted you only lose the current message that was being processed and not the entire mailbox.
A bounded mailbox has a limit on the number of messages it can queue. Once it is full, it blocks the sender, and it rejects the message if the queue doesn't drain while the sender is still trying to enqueue it. If you have concerns about memory and you can deal with potential message loss then you might want something like this. An unbounded mailbox has no limit on capacity at all, so it could possibly suffer memory issues if it gets flooded.
Whether it's bounded or not will affect whether or not it blocks. Blocking is generally not great for performance and should be avoided if the situation does not call for a bounded mailbox. That's why the default mailbox is unbounded; it will yield much better performance than a bounded counterpart.
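The bounded vs unbounded difference can be sketched with plain java.util.concurrent queues. This is an analogy rather than Akka's mailbox code. BoundedMailboxDemo and pushTimeoutMillis are names made up for this example. With nobody consuming, the bounded queue accepts messages up to its capacity, then makes the sender wait for the timeout and refuses the rest. The unbounded queue accepts everything.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class BoundedMailboxDemo {

    // Tries to enqueue `attempts` messages while nobody is consuming.
    // Returns how many were accepted before the push timeout gave up.
    static int accepted(BlockingQueue<String> mailbox, int attempts, long pushTimeoutMillis) {
        int ok = 0;
        try {
            for (int i = 0; i < attempts; i++) {
                // offer(...) blocks the sender for up to the timeout if the queue is full
                if (mailbox.offer("msg-" + i, pushTimeoutMillis, TimeUnit.MILLISECONDS)) {
                    ok++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ok;
    }

    public static void main(String[] args) {
        BlockingQueue<String> bounded = new ArrayBlockingQueue<>(3); // capacity of 3 messages
        BlockingQueue<String> unbounded = new LinkedBlockingQueue<>(); // no practical limit
        System.out.println("bounded accepted:   " + accepted(bounded, 5, 50));
        System.out.println("unbounded accepted: " + accepted(unbounded, 5, 50));
    }
}
```

The two rejected sends each cost the sender the full timeout, which is the performance concern with blocking mailboxes.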
The single consumer unbounded mailbox will most likely be the fastest because it is optimized to only have one consumer ever taking things off the queue. This means that you cannot use a dispatcher that allows an actor instance to steal items from another actor instance's mailbox (work distributing/stealing) but if you don't care about that then this mailbox might be the best bet for performance.
The priority based mailboxes allow you to provide code that lets the placement within the queue vary depending on some attributes of the message itself. This allows you to define the priority of the messages yourself, and this will then shift higher priority items to the front of the queue regardless of the normal FIFO rules.
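Here is a small plain-Java sketch of the priority idea, built on PriorityBlockingQueue. It is not Akka's priority mailbox API. The Envelope class and the priorityOf rule (messages starting with "urgent" go first) are assumptions for this example. A sequence number keeps equal-priority messages in arrival order, which a bare priority queue does not guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

class PriorityMailboxDemo {

    // Wraps a message with its priority and its arrival position.
    static final class Envelope {
        final String message;
        final int priority; // lower value = handled sooner
        final long seq;     // arrival order, used as a tie-breaker

        Envelope(String message, int priority, long seq) {
            this.message = message;
            this.priority = priority;
            this.seq = seq;
        }
    }

    // Hypothetical rule: anything starting with "urgent" jumps the queue.
    static int priorityOf(String message) {
        return message.startsWith("urgent") ? 0 : 1;
    }

    // Enqueues all arrivals, then returns them in the order a consumer would see them.
    static List<String> drainInPriorityOrder(List<String> arrivals) {
        Comparator<Envelope> byPriorityThenArrival =
                Comparator.<Envelope>comparingInt(e -> e.priority).thenComparingLong(e -> e.seq);
        PriorityBlockingQueue<Envelope> mailbox = new PriorityBlockingQueue<>(11, byPriorityThenArrival);
        long seq = 0;
        for (String m : arrivals) {
            mailbox.add(new Envelope(m, priorityOf(m), seq++));
        }
        List<String> processed = new ArrayList<>();
        Envelope next;
        while ((next = mailbox.poll()) != null) {
            processed.add(next.message);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(drainInPriorityOrder(Arrays.asList("a", "urgent-1", "b", "urgent-2")));
    }
}
```

The reordering only affects messages still waiting in the queue. A message that is already being processed is not interrupted.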

When is it safe to block in an Akka 2 actor?

I know that it is not recommended to block in the receive method of an actor, but I believe it can be done (as long as it is not done in too many actors at once).
This post suggests blocking in preStart as one way to solve a problem, so presumably blocking in preStart is safe.
However, I tried to block in preRestart (not preStart) and everything seemed to just hang - no more messages were logged as received.
Also, in cases where it is not safe to block, what is a safe alternative?
It's relatively safe to block in receive when:
1. The number of blocked actors in total is much smaller than the number of total worker threads. By default there are ten worker threads, so 1-2 blocked actors are fine.
2. The blocking actor has its own, dedicated dispatcher (thread pool). Other actors are not affected.
When it's not safe to block, a good alternative is to... not block ;-). If you are working with a legacy API that is inherently blocking you can either have a separate thread pool maintained inside some actor (feels wrong) or use approach 2. above: dedicate a few threads to a subset of actors that need to block.
Never ever block an actor.
If your actor is part of an actor hierarchy (and it should be), the actor system is not able to stop it while it is blocked.
The actor's life-cycle (supervision, watching etc.) is done by messaging.
Stopping a parent actor of a blocking child will not work.
Maybe there are ways to couple the blocking condition with the actor's lifecycle.
But this would lead to a lot of complications and bad style.
So, the best way is to do the blocking part outside of that actor.
E.g. you could run the blocking code via an executor service in a separate thread.
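A minimal plain-Java sketch of that last suggestion follows. The names OffloadBlockingDemo and slowLegacyCall are hypothetical, and a BlockingQueue stands in for the actor's mailbox. The blocking call runs on a dedicated executor, and its result comes back later as an ordinary message. A latch simulates the slow call so that the example is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class OffloadBlockingDemo {

    // Hypothetical stand-in for a legacy blocking API: it blocks until the latch opens.
    static String slowLegacyCall(String request, CountDownLatch mayFinish) {
        try {
            mayFinish.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result for " + request;
    }

    // Returns the messages in the order the "actor" processed them.
    static List<String> demo() {
        ExecutorService blockingPool = Executors.newFixedThreadPool(2); // dedicated to blocking calls
        BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();    // stands in for the actor's mailbox
        CountDownLatch legacyMayFinish = new CountDownLatch(1);
        List<String> processed = new ArrayList<>();
        try {
            // Hand the blocking call to the dedicated pool and return immediately;
            // the result is delivered to the mailbox when it is ready.
            CompletableFuture
                    .supplyAsync(() -> slowLegacyCall("query-1", legacyMayFinish), blockingPool)
                    .thenAccept(mailbox::add);

            mailbox.add("other message");
            processed.add(mailbox.take());                     // handled while the slow call is still blocked
            legacyMayFinish.countDown();                       // now let the slow call complete
            processed.add(mailbox.poll(5, TimeUnit.SECONDS));  // its result arrives as a message
            return processed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            blockingPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the actor's own thread never waits on the legacy call, lifecycle messages such as stop requests can still be processed in the meantime.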

Akka and Session Beans

The Typesafe whitepaper (v5) states:
"In different scenarios, actors may be an alternative to: a thread; a Java EE session bean; ..."
I don't understand how an actor is an alternative to a session bean, because they work completely differently: an actor is called serially by passing messages to it and it processes the messages one at a time in the order in which they are sent. That means the running of any business logic inside the actor is synchronised. Session beans on the other hand are pooled - there is a number of them and multiple threads can run the same business logic at any time meaning that the logic is run concurrently.
Can anyone clear up my misunderstanding of this statement?
You can pool Actors (as children) or put them behind Akka Routers (also technically children), so that way you can tune "concurrency".
Too much ejb concurrency can often be a cause of various lock contention and performance degradation.
Meanwhile Akka is aimed at async processing and NIO. This approach benefits most when the number of threads is near the number of CPU cores.
Note that akka doesn't enforce exactly one processing thread. See e.g. Akka control threadpool threads