My team is debating how granular our actors should be.
As an example, we have an actor that is responsible for deserializing a JSON string into an object. The argument in favour of making it an actor is that deserialization can fail, and actors and their supervision model can be used for control flow.
Is it a good idea to use actors for this and other small tasks?
Yes, it is a good idea to delegate tasks that are routinely prone to failure to child actors that handle only that specific task. This pattern is referred to as the Character Actor pattern on the Petabridge blog, and is reproduced below in case the link breaks in the future.
The Character Actor Pattern is used when an application has some risky but critical operation to execute, but needs to protect critical state contained in other actors and ensure that there are no negative side effects.
It’s often cheaper, faster, and more reliable to simply delegate these risky operations to a purpose-built, but trivially disposable actor whose only job is to carry out the operation successfully or die trying.
These brave, disposable actors are Character Actors.
Character actors can be temporary or long-running actors, but typically they’re designed to carry out only one specific type of risky operation. Often times character actors can be re-used throughout an application, belonging to many different types of parents. For example, you may have a utility character actor that handles making external network requests, and is then used by parent actors throughout your application for their own purposes.
Use Cases
The Character Actor pattern is broadly applicable. Use it any time you need to do something risky such as network calls, file I/O, parsing malformed content, and so on. Any of these operations is a good candidate for a character actor.
Character actors are most effective when used to provide protection and fault isolation to some other important type of actor, typically one containing some important state.
Benefits
There are three key benefits to using the Character Actor Pattern:
Insulates stateful and critical actors from failures and risky operations;
Makes it easy to cleanly introduce retry / backoff / undo semantics specific to each type of risky operation: since you have a character actor specific to each risky task, you have a well-defined place to put retry handling and related operations. These are specific to the task the Character Actor was created for and don’t need to be shared with the rest of your actors, meaning that the pattern…
Reduces code by letting the SupervisionStrategy and actor lifecycle do most of the heavy lifting, you don’t need all sorts of exception handling code in your parent actors. Just let it crash, baby.
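The delegation described above can be sketched without any actor library. The following is a minimal illustration in plain Python threads, not Akka or Akka.NET code: `DocumentStore`, `run_child`, `parse_json` and `max_restarts` are hypothetical names, and the small wrapper in `run_child` stands in for what a real supervision strategy would do for you. The risky function itself contains no error handling; it either succeeds or crashes.

```python
import json
import queue
import threading


def parse_json(raw):
    """The risky operation. No error handling here: succeed or crash."""
    return json.loads(raw)


def run_child(task, arg, outbox):
    """Stand-in for the actor runtime: run one disposable child and report how it ended."""
    def body():
        try:
            outbox.put(("done", task(arg)))
        except Exception as exc:  # the child "crashed"; tell the supervisor
            outbox.put(("crashed", exc))

    child = threading.Thread(target=body)
    child.start()
    return child


class DocumentStore:
    """Parent holding the critical state. It never parses anything itself."""

    def __init__(self, max_restarts=0):
        self.documents = []   # critical state, only touched by the parent
        self.dropped = 0
        self.max_restarts = max_restarts  # hypothetical retry budget

    def submit(self, raw):
        for _attempt in range(1 + self.max_restarts):
            outbox = queue.Queue()
            run_child(parse_json, raw, outbox).join()
            status, payload = outbox.get()
            if status == "done":
                self.documents.append(payload)
                return True
        # Supervision decision once the retry budget is spent: drop the input.
        self.dropped += 1
        return False
```

Retrying makes little sense for malformed JSON, so the retry budget defaults to zero here; for transient failures such as network calls a non-zero budget with a backoff would be the natural place to extend this sketch.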
I've been researching the Akka framework, and I would like to know:
Is it possible to give priority to a specific actor?
I mean, actors do their work when they receive a message from the queue.
Is there an option to let an actor work even when it is not yet its turn?
Effectively, yes.
One of the parts of your Actor configuration is which Dispatcher those actors will use. A dispatcher is what connects the actor to the actual threads that will execute the work. (Dispatchers default to ForkJoinPools, but can also be dedicated thread pools or even threads dedicated to a specific actor.)
So the typical way you give an Actor "priority" is to give it a dedicated dispatcher, and thereby dedicated threads. For example, Akka itself does this for its internal messages: they run on a dedicated dispatcher so that even if you deploy a bunch of poorly written actors that block the threads, Akka itself can still function.
I put "priority" in quotes, because you aren't guaranteeing a specific order of processing. (There are other ways to do that, but not across Actors.) But you are solving the case where you want specific actors to always have a greater access to resources and/or specific actors to get executed promptly.
(In theory, you could take this even further and create a ThreadPoolExecutor with higher priority threads, and then create a Dispatcher based on that ThreadPoolExecutor. That would truly give OS-level priority to an Actor, but it would likely be relevant only in very unusual circumstances.)
EDIT TO RESPOND TO "do mailboxes and dispatchers are the same" [sic]?
No. Each actor has a mailbox. So sometimes we talk about the behavior of mailboxes when discussing the behavior of actors, as the behavior of the mailbox governs the ordering of the actor's message processing.
But dispatchers are a distinct concept. Actors have a dispatcher, but it is many to one. (i.e. each Actor has one mailbox, but there may be many actors associated with a single dispatcher.)
For example, a real world situation might be:
System actors are processed by the internal dispatcher. To quote the docs "To protect the internal Actors that are spawned by the various Akka modules, a separate internal dispatcher is used by default." i.e. no matter how badly screwed up your own code might be, you can't screw up the heartbeat processing and other system messages because they are running on their own dispatcher, and thus their own threads.
Most actors (millions of them perhaps) are processed by the default dispatcher. Huge numbers of actors, as long as they are well behaved, can be handled with a tiny number of threads. So they might all be configured to use the default dispatcher.
Badly behaved actors (such as those that block) might be configured to be processed by a dedicated "blocking" dispatcher. By isolating blocking actors in a separate dispatcher, they don't impact the response time of the default dispatcher.
Although I don't see this often, you might also have a dispatcher for extremely response time sensitive actors that gives them a dedicated thread pool. Or even a "pinned" dispatcher that gives an actor a dedicated thread.
As I mentioned, this isn't really "priority", this is "dedicated resources", because one of the critical aspects of actors is that they are location independent. So if Actor A is on Node A, and Actor B is on Node B, I can't guarantee that Actor A will ALWAYS act first, because doing so would involve an ASTRONOMICAL amount of overhead between nodes. All I can reasonably do is give Actor A dedicated resources so that I know that Actor A should always be able to act quickly.
Note that this is what the internal dispatcher does as well. We don't guarantee that heartbeat messages are always processed first, but we do make sure that there are always threads available to process system messages, even if some bad user code has blocked the default dispatcher.
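The "dedicated resources" idea can be illustrated with two ordinary thread pools standing in for dispatchers. This is only an analogy in plain Python, not Akka configuration: the pool sizes, the pool names and the `demo` function are assumptions made for the sketch. Blocking work saturates its own pool, and a well-behaved task on the other pool still runs promptly.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def demo():
    release = threading.Event()

    def badly_behaved():
        release.wait()  # blocks its thread until released
        return "blocked work done"

    def well_behaved():
        return threading.current_thread().name

    # Two separate pools play the role of two dispatchers.
    with ThreadPoolExecutor(2, thread_name_prefix="default") as default_pool, \
         ThreadPoolExecutor(2, thread_name_prefix="blocking") as blocking_pool:
        # Four blocking tasks saturate the two "blocking" threads...
        blockers = [blocking_pool.submit(badly_behaved) for _ in range(4)]
        # ...yet the "default" pool still has free threads for quick work.
        ran_on = well_behaved_future = default_pool.submit(well_behaved)
        ran_on = well_behaved_future.result(timeout=5)
        still_blocked = not any(b.done() for b in blockers)
        release.set()  # let the blocked tasks finish so the pools can shut down
        results = [b.result(timeout=5) for b in blockers]
    return ran_on, still_blocked, results
```

Had all five tasks shared one two-thread pool, the quick task would have waited behind the blocked ones, which is the situation a separate dispatcher is meant to avoid.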
One of the biggest advantages of the actor model is the removal of locking (actors operate independently and serially). Does this mean that we cannot have any shared/global state at all in an actor system (because accessing/updating that will bring in locks)?
In more practical terms, what happens to updates to an entity (e.g. DB) from multiple actors in such a system?
The actor model is intended to solve the problem of mutable shared state in a different way: an actor should encapsulate it. So if you need something to be shared between actors, it should be an actor that owns this state and exposes a protocol for working with it. If you would like to update a DB from different actors, extract an actor responsible for this, and provide an API or protocol for other actors to update the DB. Or create several actors to handle DB updates and route messages between them (for more details, see https://doc.akka.io/docs/akka/current/typed/routers.html).
As a general approach, think of shared state as an actor shared between actors (via ActorRef), and of the state's API as the messages this actor accepts.
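A minimal sketch of "shared state as an actor", in plain Python rather than Akka: `CounterActor`, its message shapes (`"add"`, `"get"`, `"stop"`) and `run_writers` are hypothetical names chosen for the illustration. Many threads update the state, but only by sending messages; the state itself is touched by a single thread.

```python
import queue
import threading


class CounterActor:
    """Owns the shared state; everyone else only sends it messages."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            kind = message[0]
            if kind == "add":
                self._count += message[1]
            elif kind == "get":
                message[1].put(self._count)  # reply on the sender's queue
            elif kind == "stop":
                return


def run_writers(writers=8, updates=1000):
    counter = CounterActor()

    def writer():
        for _ in range(updates):
            counter.tell(("add", 1))

    threads = [threading.Thread(target=writer) for _ in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    reply = queue.Queue()
    counter.tell(("get", reply))  # queued after every "add" above
    total = reply.get(timeout=5)
    counter.tell(("stop",))
    return total
```

The mailbox (`queue.Queue`) does synchronize internally, so locking has not vanished; it has moved out of the application code and into one small, well-tested place.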
Usually, having shared/global state in an actor system is not the preferred approach. A very central idea when working with actors is to not share any mutable state. Instead, mutable state is encapsulated inside the actors, as pointed out in the documentation:
Do not pass mutable objects between actors. In order to ensure that,
prefer immutable messages. If the encapsulation of actors is broken by
exposing their mutable state to the outside, you are back in normal
Java concurrency land with all the drawbacks.
Actors are made to be containers for behavior and state, embracing
this means to not routinely send behavior within messages (which may
be tempting using Scala closures). One of the risks is to accidentally
share mutable state between actors, and this violation of the actor
model unfortunately breaks all the properties which make programming
in actors such a nice experience.
Moreover, if one actor needs to know something about the state of another actor, it asks for it using immutable messages and gets an immutable reply. One of the key features of Akka actors is their ability to manage state in a thread-safe manner, and having shared mutable state would violate this property.
Usually, DB operations (CRUD) can be performed by an actor. To do this, make one actor responsible for them, and use it from the other actors.
Let me know if it helps!!
I know that Erlang uses the Actor model to support concurrency and that Erlang processes are the same as Actors: they send messages, their data is immutable, and so on. But according to Carl Hewitt, one important thing in the Actor Model is indeterminism and Arbiters (given an arbiter, you can have multiple inputs (e.g. I0 and I1) into the arbiter at the same time, but only one of the possible outcomes (e.g. O0 or O1) will come out on the other end).
So I'm interested in how Erlang implements this concept. What plays the role of Arbiters in the Erlang concurrency model/actor model implementation?
This gets pretty deeply philosophical (see e.g. https://en.wikipedia.org/wiki/Indeterminacy_in_concurrent_computation), but as far as I can tell, he's saying that in the Actor Model, whenever an actor has multiple inputs, there's a magic box that decides the ordering of the incoming messages any way it wants to, even if it means delaying some of the messages for an arbitrarily long (but finite) time. I.e., you can never rely on any particular order or time for receiving parallel messages, even if the program structure seems to favour a certain arrival order. (Note that this is a theoretical concept for reasoning about actor programs - you wouldn't try to make a system unnecessarily random in practice, except for testing purposes.)
The semantics of Erlang message passing say pretty much the same thing: whenever two processes send a message each to a third process, and there is no ordering constraint on the individual send events, you can never rely on which message will end up first in the receiver's mailbox. They could be arbitrarily delayed, even if all processes run within the same Erlang VM. Again, this is about what guarantees you get as a programmer (none), not about making the Erlang VM insert random delays. (Random delays can be introduced naturally by other things, such as OS-level pauses for page faults.)
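The lack of an ordering guarantee between independent senders can be demonstrated with a small analogy in Python (not Erlang): two threads write into one queue that plays the role of the receiver's mailbox. `run_once` and `per_sender_order_holds` are hypothetical helper names. The only property the sketch checks is that messages from the same sender stay in order relative to each other, which Erlang also guarantees between a pair of processes; how the two streams interleave is left to the scheduler and may differ from run to run.

```python
import queue
import threading


def run_once(n=100):
    """Two senders, one mailbox. Returns messages in arrival order."""
    mailbox = queue.Queue()

    def sender(name):
        for i in range(n):
            mailbox.put((name, i))

    threads = [threading.Thread(target=sender, args=(name,)) for name in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [mailbox.get() for _ in range(2 * n)]


def per_sender_order_holds(received):
    """True if each sender's own messages arrived in the order they were sent."""
    for name in ("A", "B"):
        seq = [i for sender_name, i in received if sender_name == name]
        if seq != sorted(seq):
            return False
    return True
```

A program that is correct only for one particular interleaving of the A and B messages is relying on a guarantee that neither this sketch nor Erlang provides.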
I have read that:
Akka ensures that each instance of an actor runs in its own lightweight thread and its messages are processed one at a time.
If Akka actors process their messages sequentially, how does Akka provide concurrency for a single actor?
Actors are independent agents of computation, each one is executed strictly sequentially but many actors can be executed concurrently. You can view an Actor as a Thread that costs only about 0.1% of what a normal thread costs and that also has an address to which you can send messages—you can of course manage a queue in your own Thread and use that for message passing but you’d have to implement all that yourself.
If Akka—or the Actor Model—stopped here, then it would indeed not be very useful. The trick is that giving stable addresses (ActorRef) to the Actors enables them to communicate even across machine boundaries, over a network, in a cluster. It also allows them to be supervised for principled failure handling—when a normal Thread throws an exception it simply terminates and nothing is done to fix it.
It is this whole package of encapsulation (provided by hiding everything behind ActorRef), message-based communication that is location transparent, and support for failure handling that makes the Actor Model a perfect fit for expressing distributed systems. And today there is a distributed system of many CPU cores within even the smallest devices.
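To make "each actor strictly sequential, many actors concurrent" concrete, here is a toy sketch in plain Python. It is loosely inspired by the idea of multiplexing many actors onto a few threads and is not how Akka is actually implemented; `MiniActor`, `tell`, `_drain` and `run_demo` are hypothetical names. Each actor has its own mailbox, and a flag ensures that at most one pool thread drains a given mailbox at any time.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class MiniActor:
    """Toy actor: many instances share a small thread pool."""

    def __init__(self, pool, handler):
        self._pool = pool
        self._handler = handler
        self._mailbox = deque()
        self._lock = threading.Lock()  # guards the mailbox and the flag only
        self._scheduled = False

    def tell(self, message):
        with self._lock:
            self._mailbox.append(message)
            if self._scheduled:
                return  # a drain is already running or queued
            self._scheduled = True
        self._pool.submit(self._drain)

    def _drain(self):
        # Only one _drain per actor is ever active, so the handler
        # sees one message at a time, in arrival order.
        while True:
            with self._lock:
                if not self._mailbox:
                    self._scheduled = False
                    return
                message = self._mailbox.popleft()
            self._handler(message)


def run_demo(actors=20, messages=50, threads=4):
    logs = [[] for _ in range(actors)]  # each list is private to one actor
    with ThreadPoolExecutor(max_workers=threads) as pool:
        group = [MiniActor(pool, logs[i].append) for i in range(actors)]
        for n in range(messages):
            for actor in group:
                actor.tell(n)
    # Leaving the "with" block waits for all queued drains to finish.
    return logs
```

Twenty actors run on four threads here, and each actor's log is written without any locking in the handler because no two threads ever run the same actor at once.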
The active object design pattern, as I understand it, ties the lifetime of a (private/dedicated) thread to an object and makes it work on independent data. From some of the documentation I read, this kind of paradigm evolved for two reasons: first, managing raw threads is a pain, and second, many threads contending for a shared resource using mutexes and locks doesn't scale well. While I agree with the first reason, I do not fully comprehend the second. Making an object active just makes the object independent, but problems like contention for a lock/mutex are still there (as we still have a shared queue/buffer); the object has just delegated the sharing responsibility to the message queue. The only advantage of this design pattern, as I see it, is the case where I have to perform a long async task on the shared object (now that I am just passing a message to a shared queue, the threads no longer have to block for long on mutexes/locks, but they will still block and contend when publishing messages/tasks). Other than this case, could someone describe more scenarios where this kind of design pattern has other advantages?
The second question I have (I just started digging into design patterns) is: what is the conceptual difference between the active object, reactor and proactor design patterns? How do you decide which design pattern is more efficient and better suits your requirements? It would be really nice if someone could demonstrate examples showing how the three design patterns behave and which one has a comparative advantage/disadvantage in different scenarios.
I am kind of confused, as I have used both active object (which used a shared thread-safe buffer) and boost::asio (Proactor) to do similar kinds of async stuff. I would like to know if anyone has more insights on the applicability of the different patterns when approaching a problem.
The ACE website has some very good papers on the Active Object, Proactor and Reactor design patterns. A short summary of their intents:
The Active Object design pattern decouples method execution
from method invocation to enhance concurrency and
simplify synchronized access to an object that resides in its
own thread of control. Also known as: Concurrent Object, Actor.
The Proactor pattern supports the demultiplexing and dispatching
of multiple event handlers, which are triggered by the completion
of asynchronous events. This pattern is heavily used in Boost.Asio.
The Reactor design pattern handles service requests that are delivered
concurrently to an application by one or more clients. Each service
in an application may consist of several methods and is represented by
a separate event handler that is responsible for dispatching service-specific
requests. Also known as: Dispatcher, Notifier.
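Of the three patterns, Active Object is the easiest to show in a few lines. The sketch below is an illustration in plain Python and is not taken from the ACE papers; `ActiveAccount` and its methods are hypothetical. Calling `deposit` only enqueues a request and returns a future (method invocation); the object's own thread later runs the method (method execution).

```python
import queue
import threading
from concurrent.futures import Future


class ActiveAccount:
    """Active object: callers enqueue requests, one private thread executes them."""

    def __init__(self, balance=0):
        self._balance = balance  # touched only by the servant thread
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        while True:
            item = self._requests.get()
            if item is None:  # stop signal
                return
            future, method, args = item
            try:
                future.set_result(method(*args))
            except Exception as exc:
                future.set_exception(exc)

    def _invoke(self, method, *args):
        # Creating a Future by hand is fine for a sketch; executors
        # normally create them for you.
        future = Future()
        self._requests.put((future, method, args))
        return future

    # Public proxy methods: return immediately with a future.
    def deposit(self, amount):
        return self._invoke(self._do_deposit, amount)

    def balance(self):
        return self._invoke(lambda: self._balance)

    def stop(self):
        self._requests.put(None)
        self._thread.join()

    # Executed on the servant thread only.
    def _do_deposit(self, amount):
        self._balance += amount
        return self._balance
```

Callers still contend briefly on the request queue, as the question points out, but they never wait for the duration of the operation itself unless they choose to call `result()` on the future.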