Notifying a task from multiple other tasks without extra work

Notifying a task from multiple other tasks without extra work - concurrency

My application is futures-based with async/await, and has the following structure within one of its components:
a "manager", which is responsible for starting/stopping/restarting "workers", based both on external input and on the current state of "workers";
a dynamic set of "workers", which perform some continuous work, but may fail or be stopped externally.
A worker is just a spawned task which does some I/O work. Internally it is a loop which is intended to be infinite, but it may exit early due to errors or other reasons, and in this case the worker must be restarted from scratch by the manager.
The manager is implemented as a loop which awaits on several channels, including one returned by async_std::stream::interval, which essentially makes the manager into a poller - and indeed, I need this because I do need to poll some Mutex-protected external state. Based on this state, the manager, among everything else, creates or destroys its workers.
Additionally, the manager stores a set of async_std::task::JoinHandles representing live workers, and it uses these handles to check whether any workers has exited, restarting them if so. (BTW, I do this currently using select(handle, future::ready()), which is totally suboptimal because it relies on the select implementation detail, specifically that it polls the left future first. I couldn't find a better way of doing it; something like race() would make more sense, but race() consumes both futures, which won't work for me because I don't want to lose the JoinHandle if it is not ready. This is a matter for another question, though.)
You can see that in this design workers can only be restarted when the next poll "tick" in the manager occurs. However, I don't want to use a too small interval for polling, because in most cases polling just wastes CPU cycles. Large intervals, however, can delay restarting a failed/canceled worker by too much, leading to undesired latencies. Therefore, I though I'd set up another channel of ()s back from each worker to the manager, which I'd add to the main manager loop, so when a worker stops due to an error or otherwise, it will first send a message to its channel, resulting in the manager being woken up earlier than the next poll in order to restart the worker right away.
Unfortunately, with any kinds of channels this might result in more polls than needed, in case two or more workers stop at approximately the same time (which due to the nature of my application, is somewhat likely to happen). In such case it would make sense to only run the manager loop once, handling all of the stopped workers, but with channels it will necessarily result in the number of polls equal to the number of stopped workers, even if additional polls don't do anything.
Therefore, my question is: how do I notify the manager from its workers that they are finished, without resulting in extra polls in the manager? I've tried the following things:
As explained above, regular unbounded channels just won't work.
I thought that maybe bounded channels could work - if I used a channel with capacity 0, and there was a way to try and send a message into it but just drop the message if the channel is full (like the offer() method on Java's BlockingQueue), this seemingly would solve the problem. Unfortunately, the channels API, while providing such a method (try_send() seems to be like it), also has this property of having capacity larger than or equal to the number of senders, which means it can't really be used for such notifications.
Some kind of atomic or a mutex-protected boolean flag also look as if it could work, but there is no atomic or mutex API which would provide a future to wait on, and would also require polling.
Restructure the manager implementation to include JoinHandles into the main select somehow. It might do the trick, but it would result in large refactoring which I'm unwilling to make at this point. If there is a way to do what I want without this refactoring, I'd like to use that first.
I guess some kind of combination of atomics and channels might work, something like setting an atomic flag and sending a message, and then skipping any extra notifications in the manager based on the flag (which is flipped back to off after processing one notification), but this also seems like a complex approach, and I wonder if anything simpler is possible.

I recommend using the FuturesUnordered type from the futures crate. This collection allows you to push many futures of the same type into a collection and wait for any one of them to complete at once.
It implements Stream, so if you import StreamExt, you can use unordered.next() to obtain a future that completes once any future in the collection completes.
If you also need to wait for a timeout or mutex etc., you can use select to create a future that completes once either the timeout or one of the join handles completes. The future returned by next() implements Unpin, so it is usable with select without problems.

Related

How to implement long running gRPC async streaming data updates in C++ server

I'm creating an async gRPC server in C++. One of the methods streams data from the server to clients - it's used to send data updates to clients. The frequency of the data updates isn't predictable. They could be nearly continuous or as infrequent as once per hour. The model used in the gRPC example with the "CallData" class and the CREATE/PROCESS/FINISH states doesn't seem like it would work very well for that. I've seen an example that shows how to create a 'polling' loop that sleeps for some time and then wakes up to check for new data, but that doesn't seem very efficient.
Is there another way to do this? If I use the "CallData" method can it block in the 'PROCESS' state until there's data (which probably wouldn't be my first choice)? Or better, can I structure my code so I can notify a gRPC handler when data is available?
Any ideas or examples would be appreciated.

In a server-side streaming example, you probably need more states, because you need to track whether there is currently a write already in progress. I would add two states, one called WRITE_PENDING that is used when a write is in progress, and another called WRITABLE that is used when a new message can be sent immediately. When a new message is produced, if you are in state WRITABLE, you can send immediately and go into state WRITE_PENDING, but if you are in state WRITE_PENDING, then the newly produced message needs to go into a queue to be sent after the current write finishes. When a write finishes, if the queue is non-empty, you can grab the next message from the queue and immediately start a write for it; otherwise, you can just go into state WRITABLE and wait for another message to be produced.
There should be no need to block here, and you probably don't want to do that anyway, because it would tie up a thread that should otherwise be polling the completion queue. If all of your threads wind up blocked that way, you will be blind to new events (such as new calls coming in).
An alternative here would be to use the C++ sync API, which is much easier to use. In that case, you can simply write straight-line blocking code. But the cost is that it creates one thread on the server for each in-progress call, so it may not be feasible, depending on the amount of traffic you're handling.
I hope this information is helpful!

Is there a limit to the number of created events?

I'm developing a C++14 Windows DLL on VS2015 that runs on all Windows version >= XP.
TL;DR
Is there a limit to the number of events, created with CreateEvent, with different names of course?
Background
I'm writing a thread pool class.
The class interface is simple:
void AddTask(std::function<void()> task);
Task is added to a queue of tasks and waiting workers (vector <thread>) activate the task when available.
Requirement
Wait (block) for a task for a little bit before continuing with the flow. Meaning, some users of ThreadPool, after calling AddTask, may want to wait for a while (say 1 second) for the task to end, before continuing with the flow. If the task is not done yet, they will continue with the flow anyways.
Problem
ThreadPool class cannot provide Wait interface. Not its responsibility.
Solution
ThreadPool will SetEvent when task is done.
Users of ThreadPool will wait (or not. depend on their need) for the event to be signaled.
So, I've changed the return value of ThreadPool::AddTask from void to int where int is a unique task ID which is essentially the name of the event to be singled when a task is done.
Question
I don't expect more than ~500 tasks but I'm afraid that creating hundreds of events is not possible or even a bad practice.
So is there a limit? or a better approach?

Of course there is a limit (if nothing else; at some point the system runs out of memory).
In reality, the limit is around 16 million per process.
You can read more details here: https://blogs.technet.microsoft.com/markrussinovich/2009/09/29/pushing-the-limits-of-windows-handles/

You're asking the wrong question. Fortunately you gave enough background to answer your real question. But before we get to that:
First, if you're asking what's the maximum number of events a process can open or a system can hold, you're probably doing something very very wrong. Same goes for asking what's the maximum number of files a process can open or what's the maximum number of threads a process can create.
You can create 50, 100, 200, 500, 1000... but where does it stop? If you're even considering creating that many of them that you have to ask about a limit, you're on the wrong track.
Second, the answer depends on too many implementation details: OS version, amount of RAM installed, registry settings, and maybe more. Other programs running also affect that "limit".
Third, even if you knew the limit - even if you could somehow calculate it at runtime based on all the relevant factors - it wouldn't allow you to do anything that you can't already do now.
Lets say you find out the limit is L and you have created exactly L events by now. Another task come in. What do you do? Throw away the task? Execute the task without signaling an event? Wait until there are fewer than L events and only then create an event and start executing the task? Crash the process?
Whatever you decide you can do it just the same when CreateEvent fails. All of this is completely pointless. And this is yet another indication that you're asking the wrong question.
But maybe the most wrong thing you're doing is saying "the thread pool class can't provide wait because it's not its responsibility, so lets have the thread pool class provide an event for each task that the thread pool will signal when the task ends" (in paraphrase).
It looks like by the end of the sentence you forgot the premise from the beginning: It's not the thread pool's responsibility!
If you want to wait for the task to finish have the task itself signal when it's done. There's no reason to complicate the thread pool because someone, sometimes want to wait on tasks. Signaling that the task is done is the task's job:
event evt; ///// this
thread_pool.queue([evt] {
// whatever
evt.signal(); ///// and this
});
auto reason = wait(evt, 1s);
if (reason == timeout) {
log("bummer");
}
The event class could be anything you want - a Windows event, and std::promise and std::future pair, or anything else.
This is so simple and obvious.
Complicating the thread pool infrastructure, taking up valuable system resources for nothing, and signaling synchronization primitives even when no one's listening just to save the two marked code lines above in the few cases where you actually want to wait for the task is unjustifiable.

How do I add simple delayed tasks in Django?

I am creating a chatbot and need a solution to send messages to the user in the future after a specific delay. I have my system set up with Nginx, Gunicorn and Django. The idea is that if the bot needs to send the user several messages, it can delay each subsequent message by a certain amount of time before it sends it to seem more 'human'.
However, a simple threading.Timer approach won't work because the user might interrupt this process at any moment prompting future messages to be changed, but the timer threads might not be available to be stopped as they are on a different worker. So far I have come across two solutions:
Use threading.Timer blindly to check a to-send list in the database, can create problems with lots of unneeded threads. Also makes the database less clean/organized.
Use celery or some other system to execute these future tasks. Seems like overkill and over-engineering a simple problem. Tasks will always just be delayed function calls. Also a hassle dealing with which messages belong to which conversation.
What would be the best solution for this problem?
Also, a more generic question:
Ideally the best solution would be a framework where I can 'simulate' a new bot for each conversation so it acts as its own entity and holds all the state/message queue information in memory for itself. It would be necessary for this framework to only allocate resources to a bot when it needs to do something based on a preset delay or incoming message. Is there anything that exists like this?

Personally I would use Celery for this; executing delayed function calls is its job. And I don't know why knowing what messages belong where would be more of a problem there than doing it in a thread.
But you might also want to investigate the new Django-Channels work that Andrew Godwin is doing, since that is intended to support async background tasks.

What is Ember RunLoop and how does it work?

I am trying to understand how Ember RunLoop works and what makes it tick. I have looked at the documentation, but still have many questions about it. I am interested in understanding better how RunLoop works so I can choose appropriate method within its name space, when I have to defer execution of some code for later time.
When does Ember RunLoop start. Is it dependant on Router or Views or Controllers or something else?
how long does it approximately take (I know this is rather silly to asks and dependant on many things but I am looking for a general idea, or maybe if there is a minimum or maximum time a runloop may take)
Is RunLoop being executed at all times, or is it just indicating a period of time from beginning to end of execution and may not run for some time.
If a view is created from within one RunLoop, is it guaranteed that all its content will make it into the DOM by the time the loop ends?
Forgive me if these are very basic questions, I think understanding these will help noobs like me use Ember better.

Update 10/9/2013: Check out this interactive visualization of the run loop: https://machty.s3.amazonaws.com/ember-run-loop-visual/index.html
Update 5/9/2013: all the basic concepts below are still up to date, but as of this commit, the Ember Run Loop implementation has been split off into a separate library called backburner.js, with some very minor API differences.
First off, read these:
http://blog.sproutcore.com/the-run-loop-part-1/
http://blog.sproutcore.com/the-run-loop-part-2/
They're not 100% accurate to Ember, but the core concepts and motivation behind the RunLoop still generally apply to Ember; only some implementation details differ. But, on to your questions:
When does Ember RunLoop start. Is it dependant on Router or Views or Controllers or something else?
All of the basic user events (e.g. keyboard events, mouse events, etc) will fire up the run loop. This guarantees that whatever changes made to bound properties by the captured (mouse/keyboard/timer/etc) event are fully propagated throughout Ember's data-binding system before returning control back to the system. So, moving your mouse, pressing a key, clicking a button, etc., all launch the run loop.
how long does it approximately take (I know this is rather silly to asks and dependant on many things but I am looking for a general idea, or maybe if there is a minimum or maximum time a runloop may take)
At no point will the RunLoop ever keep track of how much time it's taking to propagate all the changes through the system and then halt the RunLoop after reaching a maximum time limit; rather, the RunLoop will always run to completion, and won't stop until all the expired timers have been called, bindings propagated, and perhaps their bindings propagated, and so on. Obviously, the more changes that need to be propagated from a single event, the longer the RunLoop will take to finish. Here's a (pretty unfair) example of how the RunLoop can get bogged down with propagating changes compared to another framework (Backbone) that doesn't have a run loop: http://jsfiddle.net/jashkenas/CGSd5/ . Moral of the story: the RunLoop's really fast for most things you'd ever want to do in Ember, and it's where much of Ember's power lies, but if you find yourself wanting to animate 30 circles with Javascript at 60 frames per second, there might be better ways to go about it than relying on Ember's RunLoop.
Is RunLoop being executed at all times, or is it just indicating a period of time from beginning to end of execution and may not run for some time.
It is not executed at all times -- it has to return control back to the system at some point or else your app would hang -- it's different from, say, a run loop on a server that has a while(true) and goes on for infinity until the server gets a signal to shut down... the Ember RunLoop has no such while(true) but is only spun up in response to user/timer events.
If a view is created from within one RunLoop, is it guaranteed that all its content will make it into the DOM by the time the loop ends?
Let's see if we can figure that out. One of the big changes from SC to Ember RunLoop is that, instead of looping back and forth between invokeOnce and invokeLast (which you see in the diagram in the first link about SproutCore's RL), Ember provides you a list of 'queues' that, in the course of a run loop, you can schedule actions (functions to be called during the run loop) to by specifying which queue the action belongs in (example from the source: Ember.run.scheduleOnce('render', bindView, 'rerender');).
If you look at run_loop.js in the source code, you see Ember.run.queues = ['sync', 'actions', 'destroy', 'timers'];, yet if you open your JavaScript debugger in the browser in an Ember app and evaluate Ember.run.queues, you get a fuller list of queues: ["sync", "actions", "render", "afterRender", "destroy", "timers"]. Ember keeps their codebase pretty modular, and they make it possible for your code, as well as its own code in a separate part of the library, to insert more queues. In this case, the Ember Views library inserts render and afterRender queues, specifically after the actions queue. I'll get to why that might be in a second. First, the RunLoop algorithm:
The RunLoop algorithm is pretty much the same as described in the SC run loop articles above:
You run your code between RunLoop .begin() and .end(), only in Ember you'll want to instead run your code within Ember.run, which will internally call begin and end for you. (Only internal run loop code in the Ember code base still uses begin and end, so you should just stick with Ember.run)
After end() is called, the RunLoop then kicks into gear to propagate every single change made by the chunk of code passed to the Ember.run function. This includes propagating the values of bound properties, rendering view changes to the DOM, etc etc. The order in which these actions (binding, rendering DOM elements, etc) are performed is determined by the Ember.run.queues array described above:
The run loop will start off on the first queue, which is sync. It'll run all of the actions that were scheduled into the sync queue by the Ember.run code. These actions may themselves also schedule more actions to be performed during this same RunLoop, and it's up to the RunLoop to make sure it performs every action until all the queues are flushed. The way it does this is, at the end of every queue, the RunLoop will look through all the previously flushed queues and see if any new actions have been scheduled. If so, it has to start at the beginning of the earliest queue with unperformed scheduled actions and flush out the queue, continuing to trace its steps and start over when necessary until all of the queues are completely empty.
That's the essence of the algorithm. That's how bound data gets propagated through the app. You can expect that once a RunLoop runs to completion, all of the bound data will be fully propagated. So, what about DOM elements?
The order of the queues, including the ones added in by the Ember Views library, is important here. Notice that render and afterRender come after sync, and action. The sync queue contains all the actions for propagating bound data. (action, after that, is only sparsely used in the Ember source). Based on the above algorithm, it is guaranteed that by the time the RunLoop gets to the render queue, all of the data-bindings will have finished synchronizing. This is by design: you wouldn't want to perform the expensive task of rendering DOM elements before sync'ing the data-bindings, since that would likely require re-rendering DOM elements with updated data -- obviously a very inefficient and error-prone way of emptying all of the RunLoop queues. So Ember intelligently blasts through all the data-binding work it can before rendering the DOM elements in the render queue.
So, finally, to answer your question, yes, you can expect that any necessary DOM renderings will have taken place by the time Ember.run finishes. Here's a jsFiddle to demonstrate: http://jsfiddle.net/machty/6p6XJ/328/
Other things to know about the RunLoop
Observers vs. Bindings
It's important to note that Observers and Bindings, while having the similar functionality of responding to changes in a "watched" property, behave totally differently in the context of a RunLoop. Binding propagation, as we've seen, gets scheduled into the sync queue to eventually be executed by the RunLoop. Observers, on the other hand, fire immediately when the watched property changes without having to be first scheduled into a RunLoop queue. If an Observer and a binding all "watch" the same property, the observer will always be called 100% of the time earlier than the binding will be updated.
scheduleOnce and Ember.run.once
One of the big efficiency boosts in Ember's auto-updating templates is based on the fact that, thanks to the RunLoop, multiple identical RunLoop actions can be coalesced ("debounced", if you will) into a single action. If you look into the run_loop.js internals, you'll see the functions that facilitate this behavior are the related functions scheduleOnce and Em.run.once. The difference between them isn't so important as knowing they exist, and how they can discard duplicate actions in queue to prevent a lot of bloated, wasteful calculation during the run loop.
What about timers?
Even though 'timers' is one of the default queues listed above, Ember only makes reference to the queue in their RunLoop test cases. It seems that such a queue would have been used in the SproutCore days based on some of the descriptions from the above articles about timers being the last thing to fire. In Ember, the timers queue isn't used. Instead, the RunLoop can be spun up by an internally managed setTimeout event (see the invokeLaterTimers function), which is intelligent enough to loop through all the existing timers, fire all the ones that have expired, determine the earliest future timer, and set an internal setTimeout for that event only, which will spin up the RunLoop again when it fires. This approach is more efficient than having each timer call setTimeout and wake itself up, since in this case, only one setTimeout call needs to be made, and the RunLoop is smart enough to fire all the different timers that might be going off at the same time.
Further debouncing with the sync queue
Here's a snippet from the run loop, in the middle of a loop through all the queues in the run loop. Note the special case for the sync queue: because sync is a particularly volatile queue, in which data is being propagated in every direction, Ember.beginPropertyChanges() is called to prevent any observers from being fired, followed by a call to Ember.endPropertyChanges. This is wise: if in the course of flushing the sync queue, it's entirely possible that a property on an object will change multiple times before resting on its final value, and you wouldn't want to waste resources by immediately firing observers per every single change.
if (queueName === 'sync')
{
log = Ember.LOG_BINDINGS;
if (log)
{
Ember.Logger.log('Begin: Flush Sync Queue');
}
Ember.beginPropertyChanges();
Ember.tryFinally(tryable, Ember.endPropertyChanges);
if (log)
{
Ember.Logger.log('End: Flush Sync Queue');
}
}
else
{
forEach.call(queue, iter);
}

Processing messages is too slow, resulting in a jerky, unresponsive UI - how can I use multiple threads to alleviate this?

I'm having trouble keeping my app responsive to user actions. Therefore, I'd like to split message processing between multiple threads.
Can I simply create several threads, reading from the same message queue in all of them, and letting which ever one is able process each message?
If so, how can this be accomplished?
If not, can you suggest another way of resolving this problem?

You cannot have more than one thread which interacts with the message pump or any UI elements. That way lies madness.
If there are long processing tasks which can be farmed out to worker threads, you can do it that way, but you'll have to use another thread-safe queue to manage them.

If this were later in the future, I would say use the Asynchronous Agents APIs (plug for what I'm working on) in the yet to be released Visual Studio 2010 however what I would say given todays tools is to separate the work, specifically in your message passing pump you want to do as little work as possible to identify the message and pass it along to another thread which will process the work (hopefully there isn't Thread Local information that is needed). Passing it along to another thread means inserting it into a thread safe queue of some sort either locked or lock-free and then setting an event that other threads can watch to pull items from the queue (or just pull them directly). You can look at using a 'work stealing queue' with a thread pool for efficiency.
This will accomplish getting the work off the UI thread, to have the UI thread do additional work (like painting the results of that work) you need to generate a windows message to wake up the UI thread and check for the results, an easy way to do this is to have another 'work ready' queue of work objects to execute on the UI thread. imagine an queue that looks like this: threadsafe_queue<function<void(void)> basically you can check if it to see if it is non-empty on the UI thread, and if there are work items then you can execute them inline. You'll want the work objects to be as short lived as possible and preferably not do any blocking at all.
Another technique that can help if you are still seeing jerky movement responsiveness is to either ensure that you're thread callback isn't executing longer that 16ms and that you aren't taking any locks or doing any sort of I/O on the UI thread. There's a series of tools that can help identify these operations, the most freely available is the 'windows performance toolkit'.

Create the separate thread when processing the long operation i.e. keep it simple, the issue is with some code you are running that is taking too long, that's the code that should have a separate thread.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js