EDIT: One important thing I forgot to mention: the actor creation described below depends on the data - sometimes few processing actors are required, and sometimes many.
One component I'm working on needs to create a number of actors (possibly round-robin routed ones) that each get a rather large number of messages to process. Each of those actors belongs to a "processing batch" which shares the same initialization parameters.
When I'm running this on the production machine with many messages, I quickly get a number of actor creation timeouts. I'm creating the actors directly with ActorSystem.actorOf().
What surprises me, though, is that not that many actors are being created overall (8 "processing sinks" with 5 round-robin routed actors each would be 40 actors, which doesn't seem like much).
I'm shutting down the actors once they're no longer needed by having another actor (which counts the number of successes and failures it receives from the "processing" actors) send them a PoisonPill, so I'd think that they are all shut down correctly.
Am I doing something wrong in the way I am creating those actors, e.g. should I create them differently? Or would an appropriate strategy be to wait for some of the batches to be done before creating new actors?
Since you did not specify which version you are using, I'm assuming you will be interested in reading this:
http://doc.akka.io/docs/akka/2.0.3/scala/actors.html#Creating_Actors_with_default_constructor (especially the warning)
Besides the technical argument, creating your actors at the top level is not a good design, since it misses out on the fault-handling benefits of supervision.
I am looking to use ZeroMQ to facilitate IPC in my embedded systems application; however, I'm not able to find many examples of using multiple 0MQ socket types in the same process.
For example, say I have a process called "antenna_mon" that monitors an antenna. I want to be able to send messages to this process and get responses back - a classic REQ-REP pattern. However, I also have a "cm" process, that publishes configuration changes to subscribers. I want antenna_mon to also subscribe to antenna configuration changes - PUB-SUB.
I found this example of reading from multiple sockets in the same process, but it seems suboptimal, because you no longer block waiting for messages; instead you inefficiently check for messages constantly and go back to sleep.
Has anyone encountered this problem before? Am I just thinking about it wrong? Maybe I should have two threads - one for CM changes, one for REQ-REP servicing?
I would love any insights or examples of solving this type of problem.
Welcome to the very nature of distributed computing!
Yes, there are new problems one has to solve once assembling a project for a multi-agent domain, where more than one process works and communicates with its respective peers ad hoc.
A knowledge base acquired from soft real-time systems or embedded systems design experience will help a lot here. If no such experience is available, some similarities might also be drawn from GUI design, where the centerpiece is something like a lightweight .mainloop() scheduler, most of the hard work is embedded into round-robin polled GUI devices, and internal state changes or external MMI events are marshalled into event-triggered handlers.
The ZeroMQ infrastructure gives one all the tools needed for such a non-blocking, controllably poll-able ( scalable, with variable or adaptively adjustable poll timeouts, so as not to exceed the round-trip duration the design defines for the controller's .mainloop() ) and transport-agnostic, asynchronously operated message dispatcher ( with thread-mapped performance scaling & priority tuning ).
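As a minimal sketch of what that looks like in practice (Python / pyzmq here purely for brevity; the endpoints and topic prefix are invented), a single antenna_mon process can service its REQ-REP endpoint and its configuration subscription from one blocking poll loop, with no busy-waiting:

import zmq

ctx = zmq.Context.instance()

# REP socket answering the classic REQ-REP requests sent to antenna_mon
rep = ctx.socket(zmq.REP)
rep.bind("tcp://*:5555")                         # assumed endpoint

# SUB socket receiving configuration changes published by the "cm" process
sub = ctx.socket(zmq.SUB)
sub.connect("tcp://localhost:5556")              # assumed cm endpoint
sub.setsockopt_string(zmq.SUBSCRIBE, "antenna")  # assumed topic prefix

poller = zmq.Poller()
poller.register(rep, zmq.POLLIN)
poller.register(sub, zmq.POLLIN)

while True:
    # poll() with no timeout blocks until at least one socket is readable,
    # so there is no periodic wake-up / sleep cycle.
    events = dict(poller.poll())

    if rep in events:
        request = rep.recv()
        rep.send(b"OK")                          # placeholder reply

    if sub in events:
        update = sub.recv()
        # apply the configuration change here

A finite timeout can be passed to poll() if the loop also needs to do periodic housekeeping between messages.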
What else one may need?
Well, just imagination and a lot of self-discipline to adhere to the Zero-Copy, Zero-Sharing and Zero-Blocking design maxims.
The rest is in your hands.
Many "academic" examples may seem trivial and simplified, so as to illustrate just the currently discussed, or a feature demonstrated in some narrow perspective.
Not so in the real-life situations.
As an example, my distributed ML-engine uses a tandem of several PUSH/PULL pipelines for moving state-data updates and prediction forecasts, another PUSH/PULL for a remote keyboard, a reversed .bind()/.connect() on PUB/SUB for easy broadcasting of distributed agents' telemetry to a remote, centrally operated syslog, and some additional PAIR/PAIR pipes, as processing requires.
( nota bene: one should always bear in mind that robust and error-resilient systems ought to avoid using the default REQ/REP Scalable Formal Communication Pattern, as there is a non-zero probability of the pairwise-stepped REQ/REP dual-FSA falling into an unsalvageable deadlock. Do not hesitate to read more about this smart tool. )
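For completeness, one widely used remedy when REQ/REP is unavoidable is the zguide's "Lazy Pirate" approach: poll the REQ socket with a timeout instead of blocking on recv(), and throw the socket away and reconnect when a reply never arrives. A rough sketch (Python / pyzmq; the endpoint and retry policy are invented):

import zmq

ENDPOINT = "tcp://localhost:5555"   # assumed server address
TIMEOUT_MS = 2500
RETRIES = 3

ctx = zmq.Context.instance()

def fresh_socket():
    s = ctx.socket(zmq.REQ)
    s.setsockopt(zmq.LINGER, 0)     # don't hang on close with unsent data
    s.connect(ENDPOINT)
    return s

client = fresh_socket()
reply = None
for attempt in range(RETRIES):
    client.send(b"ping")
    if client.poll(TIMEOUT_MS, zmq.POLLIN):
        reply = client.recv()
        break
    # No reply in time: the REQ state machine is now stuck waiting for a
    # reply, so the only safe move is to discard the socket and start over.
    client.close()
    client = fresh_socket()

if reply is None:
    raise RuntimeError("server unreachable after retries")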
I am creating a chatbot and need a solution to send messages to the user in the future after a specific delay. I have my system set up with Nginx, Gunicorn and Django. The idea is that if the bot needs to send the user several messages, it can delay each subsequent message by a certain amount of time before it sends it to seem more 'human'.
However, a simple threading.Timer approach won't work, because the user might interrupt this process at any moment, prompting future messages to change, but the timer threads might not be available to be stopped since they may live on a different worker. So far I have come across two solutions:
Use threading.Timer blindly to check a to-send list in the database; this can create problems with lots of unneeded threads and also makes the database less clean/organized.
Use Celery or some other task system to execute these future tasks. This seems like overkill and over-engineering of a simple problem, since the tasks will always just be delayed function calls. It is also a hassle dealing with which messages belong to which conversation.
What would be the best solution for this problem?
Also, a more generic question:
Ideally the best solution would be a framework where I can 'simulate' a new bot for each conversation so it acts as its own entity and holds all the state/message queue information in memory for itself. It would be necessary for this framework to only allocate resources to a bot when it needs to do something based on a preset delay or incoming message. Is there anything that exists like this?
Personally I would use Celery for this; executing delayed function calls is exactly its job. And I don't see why tracking which messages belong to which conversation would be any more of a problem there than it would be in a thread.
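A rough sketch of what that looks like with Celery (the task name, arguments and delay below are all invented for illustration):

# tasks.py
from celery import shared_task

@shared_task
def send_bot_message(conversation_id, text):
    # deliver the message to the user here (e.g. via the chat platform's API)
    ...

# In the code handling the conversation: schedule the reply to go out a few
# seconds from now so the bot seems more "human".
def schedule_reply(conversation_id, text, delay_seconds=5):
    result = send_bot_message.apply_async(
        args=[conversation_id, text], countdown=delay_seconds)
    return result.id   # store the task id with the conversation

# If the user interrupts, a pending (not yet started) delivery can be
# cancelled by its task id:
#   from celery.result import AsyncResult
#   AsyncResult(task_id).revoke()

Storing the pending task ids alongside each conversation also takes care of the "which messages belong to which conversation" bookkeeping.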
But you might also want to investigate the new Django-Channels work that Andrew Godwin is doing, since that is intended to support async background tasks.
I have written some actor classes and I find that I need to hook into the lifecycle of these entities. For example, whenever my actor is initialized I would like a method to be called so that I can set up some listeners on message queues (or open DB connections, etc.).
Is there an equivalent of this? The closest equivalent I can think of is Spring's InitializingBean and DisposableBean.
This is a typical scenario where you would override methods like preStart(), postStop(), etc. I don't see anything wrong with this.
Of course you have to be aware of the details - for example, postStop() is called asynchronously after actor.stop() is invoked, while preStart() is called when an Actor is started. This means that potentially slow/blocking things like DB interaction should be kept to a minimum.
You can also use the Actor's constructor for initialization of data.
As Matthew mentioned, supervision plays a big part in Akka - so you can instruct the supervisor to perform specific stuff on events. For example the so-called DeathWatch - you can be notified if one of the actors "you are watching upon" dies:
context.watch(child)
...
def receive = {
  case Terminated(`child`) => lastSender ! "finished"
}
An Actor is basically two methods -- a constructor, and onMessage(Object): void.
There's nothing in its lifecycle that naturally provides for "wiring" behavior, which leaves you with a few options.
Use a Supervisor actor to create your other actors. A Supervisor is responsible for watching, starting and restarting Actors on failure -- and therefore it is often valuable to have a Supervisor that understands the state of integrated systems, to avoid continuously restarting. This Supervisor would create and manage Service objects (possibly via Spring) and pass them to Actors.
Use your preferred Initialization technique at the time of Actor construction. It's tricky but you can certainly combine Spring with Actors. Just be aware that should a Supervisor restart your actor, you'll need to be able to resurrect its desired state from whatever content you placed in the Props object you used to start it in the first place.
Wire everything on-demand. Open connections on demand when an Actor starts (and cache them as necessary). I find I do this fairly often -- and I let the Actor fail when its connections no longer work. The supervisor will restart the Actor, which will recreate all connections.
Some important things to remember:
The intent of the Actor model is that Actors don't run continuously -- they only run when messages are provided to them. If you add a message listener to an Actor, you are essentially adding new threads that can access that actor. This can be a problem if you use supervision -- a restarted actor may leak that thread, and this may in turn prevent the actor from being garbage collected. It can also be a problem because it introduces a race condition, and part of the value of actors is avoiding that.
An Actor that does I/O is, from the perspective of the actor system, blocking. If you have too many Actors doing I/O at the same time, you will exhaust your Dispatcher's thread pool and lock up the system.
A given Actor instance can operate on many different threads over its lifetime, but will only operate on one thread at a time. This can be confusing to some messaging systems -- for example, the JMS spec asserts that a Session must not be used on multiple threads, and many JMS implementations interpret this as "can only run on the thread on which it was started." You may see warnings, or even exceptions, resulting from this.
For these reasons, I prefer to use non-actor code to do some of my I/O. For example, I'll have an incoming message listener object whose responsibility is to take JMS messages off a queue, use them to create POJO messages, and send tells to the Actor system. Alternatively, I'll use an Actor, but place that actor on a custom Dispatcher that has thread pinning enabled. This ensures that that Actor will only run on a specific thread and won't tie up the system that the other, non-I/O actors are using.
[as a small context provider: I am new to networking and ZeroMQ, but I did spend quite a bit of time on the guide and examples]
I have the following challenge (done in C++, but irrelevant to the question). I have a single source that generates tasks. I have multiple engines that need to process those tasks, and send back the result.
First attempt:
I created a client with a ZMQ_PUSH socket. The engines have a ZMQ_PULL socket. To get the answers back to the client, I created the reverse: a ZMQ_PUSH on the workers and a ZMQ_PULL on the client. It worked out of the box, only for me to find out that after some time the client ran out of memory, since I was pushing far more requests than the workers could process. I need some backpressure.
Second attempt:
I added a counter on the client that took care of only pushing when no more than, say, 1000 tasks were 'in progress'. The out-of-memory issue was solved, since I never had more than 1000 'in progress' tasks. But... some workers were slower than others. Since PUSH/PULL uses fair queueing, the amount of work for a slow worker kept increasing and increasing... until the slowest worker had all 1000 requests queued and the others were starved. I was not using my workers effectively.
Now, what architecture could I use that solves the issue of 'workers with different speeds'? Is the 'count the number of in-progress tasks' approach a good way of balancing the number of pushed requests? Or is there a way I can PUSH tasks to the workers and have the push block at a predefined point? Can I do that with the HWM?
I am sure this problem is of such a generic nature that I should be able to easily deal with this. Can anyone point me in the right direction?
Thanks!
We used the Paranoid Pirate Protocol (http://rfc.zeromq.org/spec:6),
but in the case of many very small jobs, where the overhead of communication might be high, a credit-based flow-control pattern might be more efficient: http://unprotocols.org/blog:15
In both cases it is necessary for the requester to directly assign jobs to individual workers. This is abstracted away, of course, and depending on the use case it could be made available as a synchronous call which returns when all tasks have been processed.
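A rough sketch of the "requester assigns jobs to individual workers" idea, in the spirit of the zguide's load-balancing pattern (Python / pyzmq here only for brevity; endpoints and framing are invented). Each worker asks for work over a REQ socket and only ever holds one task, so a slow worker simply asks less often and never builds a private backlog:

import zmq

ctx = zmq.Context.instance()

# --- requester / task source process ---------------------------------------
frontend = ctx.socket(zmq.ROUTER)
frontend.bind("tcp://*:5557")                    # assumed endpoint

tasks = (str(i).encode() for i in range(10000))  # stand-in task source

for task in tasks:
    # Block until some worker announces itself or returns a finished result.
    # A REQ message seen by a ROUTER arrives as [identity, empty, payload].
    worker_id, empty, payload = frontend.recv_multipart()
    if payload != b"READY":
        pass                                     # payload is a result; collect it
    # Hand the next task to exactly that (now idle) worker.
    frontend.send_multipart([worker_id, b"", task])

# --- worker process (one of many, run separately) ---------------------------
worker = ctx.socket(zmq.REQ)
worker.connect("tcp://localhost:5557")
worker.send(b"READY")                            # ask for the first task
while True:
    task = worker.recv()
    result = task                                # do the real work here
    worker.send(result)                          # doubles as "ready for more"

The Paranoid Pirate Protocol adds heartbeating on top of essentially this skeleton, so that dead workers can be detected and their tasks reassigned.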
I'm building my first web application after many years of desktop application development (I'm using Django/Python but maybe this is a completely generic question, I'm not sure). So please beware - this may be an ultra-newbie question...
One of my user processes involves heavy processing on the server (i.e. the user inputs something, and the server needs ~10 minutes to process it). In a desktop application, what I would do is throw the user input into a queue protected by a mutex, and have a dedicated background thread running at low priority, blocking on the queue using that mutex.
However in the web application everything seems to be oriented towards synchronization with the HTTP requests.
Assuming I will use the database as my queue, what is best practice architecture for running a background process?
There are two schools of thought on this (at least).
Throw the work on a queue and have something else outside your web-stack handle it.
Throw the work on a queue and have something else in your web-stack handle it.
In either case, you create work units in a queue somewhere (e.g. a database table) and let some process take care of them.
I typically work with number 1 where I have a dedicated windows service that takes care of these things. You could also do this with SQL jobs or something similar.
The advantage of item 2 is that you can more easily keep all your code in one place -- in the web tier. You'd still need something that triggers the execution (e.g. loading the web page that processes work units, with a sufficiently high timeout), but that could easily be accomplished with various mechanisms.
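A rough sketch of option 1 in Django terms (the model, field and app names below are all invented): work units go into a table, and a separate process started with manage.py, run under cron/supervisord/a service manager rather than the web server, drains it:

# myapp/models.py
from django.db import models

class WorkUnit(models.Model):
    payload = models.TextField()
    done = models.BooleanField(default=False)
    created = models.DateTimeField(auto_now_add=True)

# myapp/management/commands/process_work.py
import time
from django.core.management.base import BaseCommand
from myapp.models import WorkUnit

class Command(BaseCommand):
    help = "Process pending work units from the database queue."

    def handle(self, *args, **options):
        while True:
            unit = WorkUnit.objects.filter(done=False).order_by("created").first()
            if unit is None:
                time.sleep(5)            # queue empty; back off briefly
                continue
            self.process(unit)           # the ~10 minute computation lives here
            unit.done = True
            unit.save()

    def process(self, unit):
        ...                              # real work goes here

The views then just insert WorkUnit rows and return immediately; a status or percent-complete column can be added if the pages need to report progress.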
Since:
1) This is a common problem,
2) You're new to your platform
-- I suggest that you look in the contributed libraries for your platform to find a solution to handle the task. In addition to queuing and processing the jobs, you'll also want to consider:
1) Status communications between the worker and the web stack. This will enable web pages that show the percentage complete for the job, assure the human that the job is progressing, etc.
2) How to ensure that the worker process does not die.
3) If a job has an error, will the worker process automatically retry it periodically?
Will you or an operations person be notified if a job fails?
4) As the number of jobs increase, can additional workers be added to gain parallelism?
Or, even better, can workers be added on other servers?
If you can't find a good solution in Django/Python, you can also consider porting a solution from another platform to yours. I use delayed_job for Ruby on Rails. The worker process is managed by runit.
Regards,
Larry
Speaking generally, I'd look at running background processes on a different server, especially if your web server has any kind of load.
Running long processes in Django: http://iraniweb.com/blog/?p=56