I know that Erlang uses the Actor model to support concurrency, and that Erlang processes correspond to actors: they send messages, are immutable, and so on. But according to Carl Hewitt, one important thing in the Actor model is indeterminism and arbiters (given an arbiter, you can have multiple inputs, e.g. I0 and I1, into the arbiter at the same time, but only one of the possible outcomes, e.g. O0 or O1, will come out on the other end).
So I'm interested in how Erlang implements this concept, and what plays the role of arbiters in Erlang's implementation of the Actor/concurrency model.
This gets pretty deeply philosophical (see e.g. https://en.wikipedia.org/wiki/Indeterminacy_in_concurrent_computation), but as far as I can tell, he's saying that in the Actor Model, whenever an actor has multiple inputs, there's a magic box that decides the ordering of the incoming messages any way it wants to, even if it means delaying some of the messages for an arbitrarily long (but bounded) time. I.e., you can never rely on any particular order or time for receiving parallel messages, even if the program structure seems to favour a certain arrival order. (Note that this is a theoretical concept for reasoning about actor programs - you wouldn't try to make a system unnecessarily random in practice, except for testing purposes.)
The semantics of Erlang message passing say pretty much the same thing: whenever two processes send a message each to a third process, and there is no ordering constraint on the individual send events, you can never rely on which message will end up first in the receiver's mailbox. They could be arbitrarily delayed, even if all processes run within the same Erlang VM. Again, this is about what guarantees you get as a programmer (none), not about making the Erlang VM insert random delays. (Random delays can be introduced naturally by other things, such as OS-level pauses for page faults.)
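To make the point concrete, here is a minimal sketch in Java (not Erlang) of the same idea: two senders post to a shared mailbox with no ordering constraint between them, so a correct consumer must not depend on which message arrives first. The class and names are illustrative only.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ArrivalOrderDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
        // Two concurrent senders with no ordering constraint between them.
        Thread a = new Thread(() -> mailbox.add("from A"));
        Thread b = new Thread(() -> mailbox.add("from B"));
        a.start();
        b.start();
        a.join();
        b.join();
        // Either message may legitimately come out first; the "arbiter"
        // (here, the queue's internal synchronization) picks one outcome.
        System.out.println(mailbox.take());
        System.out.println(mailbox.take());
    }
}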
I've done some research on the Akka framework, and I would like to know: is it possible to give priority to a specific actor? I mean, actors do their work when they get a message from the queue. Is there an option to let an actor work even when it's not yet its turn?
Effectively, yes.
One of the parts of your Actor configuration is which Dispatcher those actors will use. A dispatcher is what connects the actor to the actual threads that will execute the work. (Dispatchers default to ForkJoinPools, but can also be dedicated thread pools or even threads dedicated to a specific actor.)
So the typical way you give an Actor "priority" is to give it a dedicated dispatcher, and thereby dedicated threads. For example, Akka itself does this for its internal messages: they run on a dedicated dispatcher so that even if you deploy a bunch of poorly written actors that block their threads, Akka itself can still function.
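As a hedged sketch of what that looks like with the classic Akka Java API (the dispatcher name "blocking-io-dispatcher" and its settings are placeholders you would define yourself in application.conf):

import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;

public class DedicatedDispatcherExample {
    // A trivial actor whose work we want isolated on its own threads.
    static class Worker extends AbstractActor {
        @Override
        public Receive createReceive() {
            return receiveBuilder()
                .matchAny(msg -> System.out.println("got: " + msg))
                .build();
        }
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("demo");
        // "blocking-io-dispatcher" must be defined in application.conf,
        // e.g. (placeholder values):
        //   blocking-io-dispatcher {
        //     type = Dispatcher
        //     executor = "thread-pool-executor"
        //     thread-pool-executor { fixed-pool-size = 16 }
        //   }
        ActorRef worker = system.actorOf(
            Props.create(Worker.class).withDispatcher("blocking-io-dispatcher"),
            "worker");
        worker.tell("hello", ActorRef.noSender());
    }
}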
I put "priority" in quotes, because you aren't guaranteeing a specific order of processing. (There are other ways to do that, but not across Actors.) But you are solving the case where you want specific actors to always have a greater access to resources and/or specific actors to get executed promptly.
(In theory, you could take this even further and create a ThreadPoolExecutor with higher-priority threads, and then create a Dispatcher based on that ThreadPoolExecutor. That would truly give OS-level priority to an Actor, but it would likely only be relevant in very unusual circumstances.)
EDIT TO RESPOND TO "do mailboxes and dispatchers are the same?" [sic]
No. Each actor has a mailbox. So sometimes we talk about the behavior of mailboxes when discussing the behavior of actors, as the behavior of the mailbox governs the ordering of the actor's message processing.
But dispatchers are a distinct concept. Actors have a dispatcher, but the relationship is many-to-one (i.e. each Actor has one mailbox, but there may be many actors associated with a single dispatcher).
For example, a real world situation might be:
System actors are processed by the internal dispatcher. To quote the docs "To protect the internal Actors that are spawned by the various Akka modules, a separate internal dispatcher is used by default." i.e. no matter how badly screwed up your own code might be, you can't screw up the heartbeat processing and other system messages because they are running on their own dispatcher, and thus their own threads.
Most actors (millions of them perhaps) are processed by the default dispatcher. Huge numbers of actors, as long as they are well behaved, can be handled with a tiny number of threads. So they might all be configured to use the default dispatcher.
Badly behaved actors (such as those that block) might be configured to run on a dedicated "blocking" dispatcher. By isolating blocking actors onto a separate dispatcher, they don't impact the response time of the default dispatcher.
Although I don't see this often, you might also have a dispatcher for extremely response time sensitive actors that gives them a dedicated thread pool. Or even a "pinned" dispatcher that gives an actor a dedicated thread.
As I mentioned, this isn't really "priority", this is "dedicated resources", because one of the critical aspects of actors is that they are location independent. So if Actor A is on Node A, and Actor B is on Node B, I can't guarantee that Actor A will ALWAYS act first; doing so would involve an ASTRONOMICAL amount of overhead between nodes. All I can reasonably do is give Actor A dedicated resources so that I know Actor A should always be able to act quickly.
Note that this is what the internal dispatcher does as well. We don't guarantee that heartbeat messages are always processed first, but we do make sure that there are always threads available to process system messages, even if some bad user code has blocked the default dispatcher.
I need help understanding how an Actor system can use ForkJoinPool and maintain ordering guarantees.
I have been playing with Actr https://github.com/zakgof/actr which is a simple, small actor system. I think my question applies to Akka as well. I have a simple bit of code that sends one Actor the numbers 1 to 10. The Actor just prints the messages, and they are not in order: I get 1,2,4,3,5,6,8,7,9,10.
I think this has to do with the ForkJoinPool. Actr wraps each message in a Runnable and sends it to the ForkJoin executor. When the task executes, it puts the message onto the destination Actor's queue and processes it. My understanding of ForkJoinPool is that tasks are distributed across multiple threads. I've added logging, and the messages 1,2,3,... are indeed being distributed to different threads and put onto the Actor's queue out of order.
Am I missing something? Actr's Scheduler is similar to Akka's Dispatcher and can be found here: https://github.com/zakgof/actr/blob/master/src/main/java/com/zakgof/actr/impl/ExecutorBasedScheduler.java
The ExecutorBasedScheduler is constructed with ForkJoinPool.commonPool(), like so:
public static IActorScheduler newForkJoinPoolScheduler(int throughput) {
    return new ExecutorBasedScheduler(ForkJoinPool.commonPool(), throughput);
}
How can an Actor use ForkJoinPool and keep messages in order?
I can't speak to Actr at all, but in Akka the individual messages are not created as ForkJoinPool tasks. (One task per message seems like a very bad approach for many reasons, not just ordering issues. Namely, messages can typically be processed very quickly, and if you had one task per message the overhead would be awfully high. You want some batching, at least under load, so that you get better thread locality and less overhead.)
Essentially, in Akka, the actor mailboxes are queues within an object. When a message is received by the mailbox, it checks whether it has already scheduled a task; if not, it adds a new task to the ForkJoinPool. So the ForkJoinPool task isn't "process this message", but instead "process the Runnable associated with this specific Actor's mailbox". Some period of time then obviously passes before the task gets scheduled and the Runnable runs. When the Runnable runs, the mailbox may have received many more messages. But they will just have been added to the queue, and the Runnable will then process as many of them as it is configured to, in the order in which they were received.
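To illustrate the idea, here is a simplified sketch (not Akka's actual code) of a self-scheduling mailbox: the task handed to the pool is "drain this mailbox", and an atomic flag ensures at most one drain task is in flight at a time.

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

class Mailbox<T> implements Runnable {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final ExecutorService pool;
    private final Consumer<T> onMessage; // the actor's receive logic
    private final int throughput;        // max messages per scheduling run

    Mailbox(ExecutorService pool, Consumer<T> onMessage, int throughput) {
        this.pool = pool;
        this.onMessage = onMessage;
        this.throughput = throughput;
    }

    void tell(T msg) {
        queue.add(msg);
        trySchedule();
    }

    private void trySchedule() {
        // Only one drain task may be in flight at a time; this is what
        // preserves per-actor ordering even on a multi-threaded pool.
        if (!queue.isEmpty() && scheduled.compareAndSet(false, true)) {
            pool.execute(this);
        }
    }

    @Override
    public void run() {
        try {
            T msg;
            for (int i = 0; i < throughput && (msg = queue.poll()) != null; i++) {
                onMessage.accept(msg);
            }
        } finally {
            scheduled.set(false);
            trySchedule(); // reschedule if messages remain (or arrived meanwhile)
        }
    }
}

Because only one drain task per mailbox is ever in flight, per-actor FIFO order survives even though the pool's threads pick up tasks in arbitrary order.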
This is why, in Akka, you can guarantee the order of messages within a mailbox, but cannot guarantee the order of messages sent to different Actors. If I send message A to Actor Alpha, then message B to Actor Beta, then message C to Actor Alpha, I can guarantee that A will be before C. But B might happen before, after, or at the same time as A and C. (Because A and C will be handled by the same task, but B will be a different task.)
Messaging Ordering Docs: more details on what is guaranteed and what isn't regarding ordering.
Dispatcher Docs: dispatchers are the connection between Actors and the actual execution. ForkJoinPool is only one implementation (although a very common one).
EDIT: Just thought I'd add some links to the Akka source to illustrate. Note that these are all internal APIs. tell is how you use it, this is all behind the scenes. (I'm using permalinks so that my links don't bitrot, but be aware that Akka may have changed in the version you are using.)
The key bits are in akka.dispatch.Dispatcher.scala
Your tell will go through some hoops to get to the right mailbox. But eventually:
dispatch method gets called to enqueue it. This is very simple: just enqueue and call the registerForExecution method.
registerForExecution This method actually checks first to see if scheduling is needed. If it needs scheduling, it uses the executorService to schedule it. Note that the executorService is abstract, but execute is called on that service with the mailbox as an argument.
execute If we assume the implementation is a ForkJoinPool, this is the executorService execute method we end up in. Essentially we just create a ForkJoinTask with the supplied argument (the mailbox) as the runnable.
run The Mailbox is conveniently a Runnable, so the ForkJoinPool will eventually call this method once scheduled. You can see that it processes special system messages, then calls processMailbox, then (in a finally) calls registerForExecution again. Note that registerForExecution checks whether scheduling is needed first, so this isn't an infinite loop; it's just checking whether there is remaining work to do. While we are in the Mailbox class, you can also look at some of the methods the Dispatcher uses to see if scheduling is needed, to actually add messages to the queue, etc.
processMailbox is essentially just a loop over calling actor.invoke, except that it has to do lots of checking to see if it has system messages, if it's out of work, if it's passed a threshold, if it has been interrupted, etc.
invoke is where the code you write (the receiveMessage) actually gets called.
If you actually click through all of those links you'll see that I'm simplifying a lot. There's lots of error handling and code to make sure everything is thread safe, super efficient, and bulletproof. But that's the gist of the code flow.
I'm studying event sourcing and command/query segregation and I have a few doubts that I hope someone with more experience will easily answer:
A) should a command handler work with more than one aggregate? (a.k.a. should they coordinate things between several aggregates?)
B) If my command handler generates more than one event to store, how do you guys push all those events atomically to the event store? (How can I guarantee no other command handler will "interleave" events in between?)
C) In many articles I read, people suggest using optimistic locking to write the new events generated, but in my use case I will have around 100 requests/second. This makes me think that a lot of requests will just fail at huge rates (a lot of ConcurrencyExceptions). How do you guys deal with this?
D) How to deal with the fact that the command handler can crash after storing the events in the event store but before publishing them to the event bus? (how to eventually push those "confirmed" events back to the event bus?)
E) How do you guys deal with the eventual consistency in the projections? Do you just live with it? Or do people lock things there too in some cases (waiting for an update, for example)?
I made a sequence diagram to better illustrate all those questions. (And sorry for the bad English.)
If my command handler generates more than one event to store, how do you guys push all those events atomically to the event store?
Most reasonable event store implementations will allow you to batch multiple events into the same transaction.
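For illustration only, here is a sketch of what that contract often looks like. The EventStore interface and its names are hypothetical, not any specific product's API:

import java.util.List;

// Hypothetical event-store contract: one append call persists a whole batch
// atomically, guarded by the version the writer last saw (optimistic locking).
interface EventStore {
    long currentVersion(String streamId);

    // Appends all events in one transaction, or fails with
    // ConcurrencyException if the stream has moved past expectedVersion.
    void append(String streamId, long expectedVersion, List<Object> events)
            throws ConcurrencyException;
}

class ConcurrencyException extends Exception {
}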
In many articles I read people suggest using optimistic locking to write the new events generated, but in my use case I will have around 100 requests / second.
If you have lots of parallel threads trying to maintain a complex invariant, something has gone badly wrong.
For "events" that aren't expected to establish or maintain any invariant, then you are just writing things to the end of a stream. In other words, you are probably not trying to write an event into a specific position in the stream. So you can probably use batching to reduce the number of conflicting writes, and a simple retry mechanism. In effect, you are using the same sort of "fan-in" patterns that appear when you have concurrent writers inserting into a queue.
For the cases where you are establishing/maintaining an invariant, you don't normally have many concurrent writers. Instead, specific writers have authority to write events (think "sharding"); the concurrency controls there are primarily to avoid making a mess in abnormal conditions.
How to deal with the fact that the command handler can crash after storing the events in the event store but before publishing them to the event bus?
Use pull, rather than push, as the primary subscription mechanism. Make sure that subscribers can handle duplicate messages safely (aka "idempotent"). Don't use a message subscription that can re-order events when you need events strictly ordered.
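As a sketch of the "idempotent subscriber" part (assuming each event carries a unique id; the names here are illustrative):

import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Redelivered events are detected by id and skipped, so processing the
// same event twice has no extra effect.
class IdempotentSubscriber {
    private final Set<UUID> processed = ConcurrentHashMap.newKeySet();

    void handle(UUID eventId, Runnable effect) {
        if (processed.add(eventId)) { // add() returns false if already seen
            effect.run();
        }
    }
}

A real subscriber would persist the processed-id set alongside its projection, but the shape is the same.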
How do you guys deal with the eventual consistency in the projections? Do you just live with it?
Pretty much. Views and reports carry metadata to let you know at what fixed point in "time" the report was accurate.
Unless you lock out all writers while a report is being consumed, there's a potential for any data being out of date, regardless of whether you are using events vs some other data model, regardless of whether you are using a single data model or several.
It's all part of the tradeoff; we accept that there will be a larger window between report time and current time in exchange for lower response latency, an "immutable" event history, etc.
should a command handler work with more than one aggregate?
Probably not, which isn't the same thing as "absolutely never".
The usual framing goes something like this: aggregate isn't a domain-modeling pattern, like entity; it's a lifecycle pattern, used to make sure that all of the changes we make at one time are consistent.
In the case where you find that you want a command handler to modify multiple domain entities at the same time, and those entities belong to different aggregates, then have you really chosen the correct aggregate boundaries?
What you can do sometimes is have a single command handler that manages multiple transactions, updating a different aggregate in each. But it might be easier, in the long run, to have two different command handlers that each receive a copy of the command and decide what to do, independently.
I'm trying to understand when and where to use the different built-in Akka mailboxes, as well as when it is appropriate to roll your own. However, nowhere on that page does it explain what a "bounded mailbox" actually is, or how it behaves differently from an unbounded mailbox. Also, that page categorizes mailboxes as "blocking" vs "non-blocking". While I have a rough idea of what they mean by this (a message can't be sent to a full mailbox unless it is first drained), I'm not 100% sure I understand it. So seeing that I have no idea what the docs mean when they categorize a mailbox as bounded or blocking, it's tough for me to tell when I should use each type.
Also, it seems to be the default Akka behavior to clear out an actor's mailbox if that actor is restarted. I'd like to prevent this, but I'm not sure whether the solution is to use one of these built-in mailbox types (no mention of message persistence appears on that page) or to somehow use persistent actors to accomplish such losslessness.
First, if an actor crashes and is restarted you only lose the current message that was being processed and not the entire mailbox.
A bounded mailbox has a limit on the number of messages it can have queued; beyond that it starts blocking the sender, rejecting the item if the queue doesn't go down while the sender is trying to put it on. If you have concerns about memory and can tolerate potential message loss, you might want something like this. An unbounded mailbox has no capacity limit at all, so it could suffer memory issues if it gets flooded.
Whether it's bounded or not will affect whether or not it blocks. Blocking is generally not great for performance and should be avoided if the situation does not call for a bounded mailbox. That's why the default mailbox is unbounded; it will yield much better performance than a bounded counterpart.
The single-consumer unbounded mailbox will most likely be the fastest because it is optimized for only ever having one consumer taking things off the queue. This means you cannot use a dispatcher that allows an actor instance to steal items from another actor instance's mailbox (work distributing/stealing), but if you don't care about that, this mailbox might be the best bet for performance.
The priority-based mailboxes allow you to provide code that varies a message's placement within the queue based on attributes of the message itself. This lets you define the priority of messages yourself, shifting higher-priority items to the front of the queue regardless of the normal FIFO rules.
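For example, a priority mailbox along the lines of the example in the Akka docs (Java, classic API); the mailbox class is then referenced by name in configuration, and the message-string matching here is purely illustrative:

import akka.actor.ActorSystem;
import akka.dispatch.PriorityGenerator;
import akka.dispatch.UnboundedPriorityMailbox;
import com.typesafe.config.Config;

// Lower numbers are processed first; FIFO only applies within a priority level.
public class MyPrioMailbox extends UnboundedPriorityMailbox {
    public MyPrioMailbox(ActorSystem.Settings settings, Config config) {
        super(new PriorityGenerator() {
            @Override
            public int gen(Object message) {
                if ("highpriority".equals(message)) return 0; // jump the queue
                if ("lowpriority".equals(message)) return 2;  // go to the back
                return 1;                                     // everything else
            }
        });
    }
}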
When I write a message-driven app, much like a standard Windows app except that it uses messaging extensively for internal operations, what would be the best approach regarding threading?
As I see it, there are basically three approaches (if you have any other setup in mind, please share):
Having a single thread process all of the messages.
Having separate threads for separate message types (General, UI, Networking, etc.).
Having multiple threads that share and process a single message queue.
So, would there be any significant performance differences between the three?
Here are some general thoughts:
Obviously, the last two options benefit from a situation where there's more than one processor. Plus, if any thread is waiting for an external event, other threads can still process unrelated messages. But ignoring that, it seems that multiple threads only add overhead (thread switches, not to mention more complicated synchronization situations).
And another question: would you recommend implementing such a system on top of the standard Windows messaging system, or implementing a separate queue mechanism, and why?
The specific choice of threading model should be driven by the nature of the problem you are trying to solve. There isn't necessarily a single "correct" approach to designing the threading model for such an application. However, if we adopt the following assumptions:
messages arrive frequently
messages are independent and don't rely too heavily on shared resources
it is desirable to respond to an arriving message as quickly as possible
you want the app to scale well across processing architectures (i.e. multicore/multi-CPU systems)
scalability is the key design requirement (e.g. more messages at a faster rate)
resilience to thread failure / long operations is desirable
In my experience, the most effective threading architecture would be to employ a thread pool. All messages arrive on a single queue, multiple threads wait on the queue and process messages as they arrive. A thread pool implementation can model all three thread-distribution examples you have.
#1 Single thread processes all messages => thread pool with only one thread
#2 Thread per N message types => thread pool with N threads, each thread peeks at the queue to find appropriate message types
#3 Multiple threads for all messages => thread pool with multiple threads
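A minimal sketch of the pooled variant in Java (option #3; set workers to 1 and you get option #1); the class name and messages are placeholders:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class MessagePump {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        int workers = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        // All workers block on the same queue; whichever is
                        // free takes the next message.
                        String msg = queue.take();
                        System.out.println(Thread.currentThread().getName()
                                + " processing " + msg);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // exit on shutdown
                }
            });
        }
        for (int i = 0; i < 10; i++) {
            queue.add("message-" + i);
        }
        Thread.sleep(500);  // let the demo drain the queue
        pool.shutdownNow(); // interrupts the workers' take()
        pool.awaitTermination(1, TimeUnit.SECONDS);
    }
}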
The benefit of this design is that you can scale the number of threads in the pool in proportion to the processing environment or the message load. The number of threads can even scale at runtime to adapt to the real-time message load being experienced.
There are many good thread pooling libraries available for most platforms, including .NET, C++/STL, Java, etc.
As to your second question, whether to use the standard Windows message dispatch mechanism: this mechanism comes with significant overhead and is really only intended for pumping messages through a Windows application's UI loop. Unless this is the problem you are trying to solve, I would advise against using it as a general message-dispatching solution. Furthermore, Windows messages carry very little data: it is not an object-based model. Each Windows message has a code and a 32-bit parameter. This may not be enough to base a clean messaging model on. Finally, the Windows message queue is not designed to handle cases like queue saturation, thread starvation, or message re-queuing; these are cases that often arise when implementing a decent message queuing solution.
We can't tell you much for sure without knowing the workload (i.e., the statistical distribution of events over time), but in general:
a single queue with multiple servers is at least as fast, and usually faster, so options 1 and 3 would be preferable to 2.
multiple threads in most languages add complexity because of the need to avoid contention and multiple-writer problems
long-duration processes can block processing of other things that could get done more quickly.
So my horseback guess is that having a single event queue, with several server threads taking events off the queue, might be a little faster.
Make sure you use a thread-safe data structure for the queue.
It all depends.
For example:
Events in a GUI queue are best handled by a single thread, as there is an implied order in the events and they therefore need to be processed serially. That is why most GUI apps have a single thread to handle events, though potentially multiple threads to create them (and it does not preclude the event thread from creating a job and handing it off to a worker pool; see below).
Events on a socket can potentially be handled in parallel (assuming HTTP), as each request is stateless and can thus be processed independently (OK, I know that is oversimplifying HTTP).
Work jobs, where each job is independent and placed on a queue: this is the classic case for a set of worker threads. Each thread performs a potentially long operation independently of the other threads, and on completion comes back to the queue for another job.
In general, don't worry about the overhead of threads. It's not going to be an issue if you're talking about merely a handful of them. Race conditions, deadlocks, and contention are a bigger concern, and if you don't know what I'm talking about, you have a lot of reading to do before you tackle this.
I'd go with option 3, using whatever abstractions my language of choice offers.
Note that there are two different performance goals, and you haven't stated which you are targeting: throughput and responsiveness.
If you're writing a GUI app, the UI needs to be responsive. You don't care how many clicks per second you can process, but you do care about showing some response within a tenth of a second or so (ideally less). This is one of the reasons it's best to have a single thread devoted to handling the GUI (other reasons have been mentioned in other answers). The GUI thread needs to basically convert Windows messages into work items and let your worker queue handle the heavy work. Once the worker is done, it notifies the GUI thread, which then updates the display to reflect any changes. It does things like painting a window, but not rendering the data to be displayed. This gives the app the quick "snappiness" that most users want when they talk about performance. They don't care if it takes 15 seconds to do something hard, as long as when they click on a button or a menu, it reacts instantly.
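A sketch of that hand-off in Java, using Swing as a stand-in UI (doHeavyWork is a placeholder for whatever the real work is):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

public class ResponsiveUi {
    static final ExecutorService workers = Executors.newFixedThreadPool(4);

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("demo");
            JLabel label = new JLabel("working...");
            frame.add(label);
            frame.setSize(200, 100);
            frame.setVisible(true);
            // The heavy work goes to the pool; only the final UI update
            // hops back onto the event thread.
            workers.submit(() -> {
                String result = doHeavyWork();
                SwingUtilities.invokeLater(() -> label.setText(result));
            });
        });
    }

    // Placeholder for whatever actually takes a long time.
    static String doHeavyWork() {
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "done";
    }
}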
The other performance characteristic is throughput. This is the number of jobs you can process in a specific amount of time. Usually this type of performance tuning is only needed on server-type applications or other heavy-duty processing. This measures how many web pages can be served up in an hour, or how long it takes to render a DVD. For these sorts of jobs, you want to have one active thread per CPU. Fewer than that, and you're going to be wasting idle clock cycles; more than that, and the threads will be competing for CPU time and tripping over each other. Take a look at the second graph in this DDJ article for the trade-off you're dealing with. Note that the ideal thread count is higher than the number of available CPUs due to things like blocking and locking. The key is the number of active threads.
A good place to start is to ask yourself why you need multiple threads.
The well-thought-out answer to this question will lead you to the best answer to the subsequent question, "how should I use multiple threads in my application?"
And that must be a subsequent question, not a primary question. The first question must be why, not how.
I think it depends on how long each thread will be running. Does each message take the same amount of time to process, or will certain messages take a few seconds, for example? If I knew that message A was going to take 10 seconds to complete, I would definitely use a new thread, because why would I want to hold up the queue for one long-running task...
My 2 cents.
I think option 2 is the best. Having each thread doing independent tasks would give you the best results. The third approach can cause more delays if multiple threads are doing I/O operations such as disk reads, reads on shared sockets, and so on.
Whether to use the Windows messaging framework for processing requests depends on the workload each thread would have. I think Windows restricts the number of messages that can be queued to at most 10,000. For most cases this should not be an issue, but if you have lots of messages to be queued, it might be something to take into consideration.
A separate queue gives better control, in the sense that you may reorder it the way you want (perhaps depending on priority).
Yes, there will be performance differences between your choices.
(1) introduces a bottleneck for message processing.
(3) introduces locking contention because you'll need to synchronize access to your shared queue.
(2) is starting to go in the right direction... though a queue for each message type is a little extreme. I'd probably recommend starting with a queue for each model in your app and adding queues where it makes sense to do so for improved performance.
If you like option #2, it sounds like you would be interested in implementing a SEDA architecture. It is going to take some reading to understand what is going on, but I think the architecture fits well with your line of thinking.
BTW, Yield is a good C++/Python hybrid implementation.
I'd have a thread pool servicing the message queue, and make the number of threads in the pool easily configurable (perhaps even at runtime). Then test it out with expected load.
That way you can see what the actual correlation is - and if your initial assumptions change, you can easily change your approach.
A more sophisticated approach would be for the system to introspect its own performance traits and adapt its use of resources, threads in particular, as it goes. Probably overkill for most custom application code, but I'm sure there are products out there that do that.
As for the Windows events question: I think that's probably an application-specific question with no right or wrong answer in the general case. That said, I usually implement my own queue, as I can tailor it to the specific characteristics of the task at hand. Sometimes that might involve routing events via the Windows message queue.