Long loop blocks application - Swift 3

I have the following loop in my app:
var maxIterations: Int = 0

func calculatePoint(cn: Complex) -> Int {
    let threshold: Double = 2
    var z: Complex = .init(re: 0, im: 0)
    var z2: Complex = .init(re: 0, im: 0)
    var iteration: Int = 0
    repeat {
        z2 = self.pow2ForComplex(cn: z)
        z.re = z2.re + cn.re
        z.im = z2.im + cn.im
        iteration += 1
    } while self.absForComplex(cn: z) <= threshold && iteration < self.maxIterations
    return iteration
}
and the spinning rainbow wheel appears while the loop is executing. How can I make sure the app still responds to UI actions?
Note: I update an NSProgressIndicator in a different part of the code, and it is not updated (no progress is shown) while the loop is running.
I suspect this has something to do with dispatching, but I'm quite "green" with that. I'd appreciate any help.
Thanks.

To dispatch something asynchronously, call async on the appropriate queue. For example, you might change this method to do the calculation on a global background queue, and then report the result back on the main queue. By the way, when you do that, you shift from returning the result immediately to using a completion handler closure which the asynchronous method will call when the calculation is done:
func calculatePoint(_ cn: Complex, completionHandler: @escaping (Int) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        // do your complicated calculation here, which calculates `iteration`
        DispatchQueue.main.async {
            completionHandler(iteration)
        }
    }
}
And you'd call it like so:
// start NSProgressIndicator here

calculatePoint(point) { iterations in
    // use `iterations` here, noting that this closure is called asynchronously (i.e. later)
    // stop NSProgressIndicator here
}

// don't use `iterations` here, because the above closure is likely not yet done by the time we get here;
// we'll get here almost immediately, but the above completion handler is called when the asynchronous
// calculation is done.
Martin has surmised that you are calculating a Mandelbrot set. If so, dispatching the calculation of each point to a global queue is not a good idea (because these global queues dispatch their blocks to worker threads, but those worker threads are quite limited).
If you want to avoid using up all of these global queue worker threads, one simple choice is to take the async call out of your routine that calculates an individual point, and just dispatch the whole routine that iterates through all of the complex values to a background thread:
DispatchQueue.global(qos: .userInitiated).async {
    for row in 0 ..< height {
        for column in 0 ..< width {
            let c = ...
            let m = self.mandelbrotValue(c)
            pixelBuffer[row * width + column] = self.color(for: m)
        }
    }
    let outputCGImage = context.makeImage()!
    DispatchQueue.main.async {
        completionHandler(NSImage(cgImage: outputCGImage, size: NSSize(width: width, height: height)))
    }
}
That solves the "get it off the main thread" problem and the "don't use up the worker threads" problem, but now we've swung from using too many worker threads to using only one, not fully utilizing the device. We really want to do as many calculations in parallel as we can without exhausting the worker threads.
One approach, when running a for loop of complex calculations, is to use dispatch_apply (called concurrentPerform in Swift 3). This is like a for loop, but it performs each of the loop's iterations concurrently with respect to the others (and, at the end, waits for all of those concurrent iterations to finish). To do this, replace the outer for loop with concurrentPerform:
DispatchQueue.global(qos: .userInitiated).async {
    DispatchQueue.concurrentPerform(iterations: height) { row in
        for column in 0 ..< width {
            let c = ...
            let m = self.mandelbrotValue(c)
            pixelBuffer[row * width + column] = self.color(for: m)
        }
    }
    let outputCGImage = context.makeImage()!
    DispatchQueue.main.async {
        completionHandler(NSImage(cgImage: outputCGImage, size: NSSize(width: width, height: height)))
    }
}
The concurrentPerform (formerly known as dispatch_apply) will perform the various iterations of that loop concurrently, but it will automatically optimize the number of concurrent threads for the capabilities of your device. On my MacBook Pro, this made the calculation 4.8 times faster than the simple for loop. Note, I still dispatch the whole thing to a global queue (because concurrentPerform runs synchronously, and we never want to perform slow, synchronous calculations on the main thread), but concurrentPerform will run the calculations in parallel. It's a great way to enjoy concurrency in a for loop in such a way that you won't exhaust GCD worker threads.
By the way, you mentioned that you are updating an NSProgressIndicator. Ideally, you want to update it as every pixel is processed, but if you do that, the UI may get backlogged, unable to keep up with all of those updates. You'll end up slowing down the final result while the UI catches up on all of those progress indicator updates.
The solution is to decouple the UI updates from the progress updates. You want the background calculation to inform you as each pixel is processed, but you want the progress indicator to be updated less often, each update effectively saying "update the progress with however many pixels were calculated since the last time I checked". There are cumbersome manual techniques to do that, but GCD provides a really elegant solution: a dispatch source, or more specifically, a DispatchSourceUserDataAdd.
So define properties for the dispatch source and a counter to keep track of how many pixels have been processed thus far:
let source = DispatchSource.makeUserDataAddSource(queue: .main)
var pixelsProcessed: UInt = 0
And then set up an event handler for the dispatch source, which updates the progress indicator:
source.setEventHandler { [unowned self] in
    self.pixelsProcessed += self.source.data
    self.progressIndicator.doubleValue = Double(self.pixelsProcessed) / Double(width * height)
}
source.resume()
And then, as you process the pixels, you can simply add to your source from the background thread:
DispatchQueue.concurrentPerform(iterations: height) { row in
    for column in 0 ..< width {
        let c = ...
        let m = self.mandelbrotValue(for: c)
        pixelBuffer[row * width + column] = self.color(for: m)
        self.source.add(data: 1)
    }
}
If you do this, it will update the UI with the greatest frequency possible, but it will never get backlogged with a queue of updates. The dispatch source will coalesce these add calls for you.

Related

How to delay a function that is called in a while loop without delaying the loop

Imagine I have something like this:
void color(int a)
{
    if (a > 10)
    {
        return;
    }
    square[a].red();
    sleep(1second);
    color(a + 1);
}

while (programIsRunning())
{
    color(1);
    updateProgram();
}
but with something that actually requires a recursive function. How can I call this recursive function so the squares are colored one by one? On its own it's too fast, and since the program is updated every frame, the squares get colored instantly when I want them colored one by one (with a delay).
sleep() will cause the current thread to stop. That makes it a bad candidate for human-perceptible delays from the main thread.
You "could" have a thread that only handles that process, but threads are expensive, and creating/managing one just to color squares in a sequence is completely overkill.
Instead, you could do something along the lines of: every time the program updates, check if it's the appropriate time to color the next square.
const std::chrono::steady_clock::duration color_delay = std::chrono::milliseconds(100);
auto last_color_time = std::chrono::steady_clock::now();
bool coloring_squares = true;

while (programIsRunning()) {
    if (coloring_squares) {
        auto now = std::chrono::steady_clock::now();
        // This will "catch up" as needed.
        while (now - last_color_time >= color_delay) {
            last_color_time += color_delay;
            coloring_squares = color_next_square();
        }
    }
    updateProgram();
}
How color_next_square() works is up to you. You could possibly "pre-bake" a list of squares to color using your recursive function, and iterate through it.
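For instance, a minimal sketch of the pre-baking idea (bake() and the to_color / next_square names are hypothetical; square is the array from your snippet, and bake(1) is called once before the main loop):

#include <cstddef>
#include <vector>

// Pre-bake the order in which the squares should be colored by running
// the recursive logic once, with no sleep, and recording the indices.
static std::vector<int> to_color;
static std::size_t next_square = 0;

void bake(int a)
{
    if (a > 10)
        return;
    to_color.push_back(a);
    bake(a + 1);
}

// Called once per tick by the loop above; returns false when done.
bool color_next_square()
{
    if (next_square >= to_color.size())
        return false;
    square[to_color[next_square++]].red();
    return true;
}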
Also, obviously, this example just uses the code you posted. You'll want to organise all this as part of updateProgram(), possibly in some sort of stateful wrapper like class SquareAnim {};.
N.B. If your program has little jitter, i.e. it has consistent time between updates, and the delay is low, using the following instead can lead to a slightly smoother animation:
if (now - last_color_time >= color_delay) {
    last_color_time = now;
    // ...
}

Check for a condition periodically without blocking

In my project, the function updateClips reads some facts which are set by CLIPS without the interference of my C++ code. Based on the facts it reads, updateClips calls the needed function.
void updateClips(void)
{
    // read clipsAction
    switch (clipsAction)
    {
    case ActMove:
        goToPosition(0, 0, clipsActionArg);
        break;
    }
}
In the goToPosition function, a message is sent to the vehicle to move to the specified position, and then a while loop is used to wait until the vehicle reaches that position.
void goToPosition(float north, float east, float down)
{
    // Prepare and send the message.
    do
    {
        // Read new location information.
    } while (/* specified position not reached yet? */);
}
The problem is that updateClips should be called every 500 ms and when the goToPosition function is called, the execution is blocked until the target location is reached. During this waiting period, something may happen that requires the vehicle to stop. Therefore, updateClips should be called every 500 ms no matter what, and it should be able to stop executing goToPosition if it's running.
I tried using threads as follows, but it didn't work for me and was difficult to debug. I think this can be done in a simpler and cleaner way.
case ActMove:
    std::thread t1(goToPosition, 0, 0, clipsActionArg);
    t1.detach();
    break;
My question is, how can I check if the target location is reached without blocking the execution, i.e., without using while?
You probably want an event-driven model.
In an event-driven model, your main engine is a tight loop that reads events, updates state, then waits for more events.
Some events are time based, others are input based.
The only code that is permitted to block your main thread is the main loop, where it blocks until a timer hits or a new event arrives.
It might very roughly look like this:
using namespace std::literals::chrono_literals;

void main_loop(engine_state* state) {
    bool bContinue = true;
    while (bContinue) {
        update_ui(state);
        while (bContinue && process_message(state, 10ms)) {
            bContinue = update_state(state);
        }
        bContinue = update_state(state);
    }
}
update_ui provides feedback to the user, if required.
process_message(state, duration) looks for a message to process, or for 10ms to pass. If it sees a message (like goToPosition), it modifies state to reflect that message (for example, it might store the desired destination). It does not block, nor does it take lots of time.
If no message is received within duration, it returns anyhow without modifying state (I'm assuming you want things to happen even if no new input/messages occur).
update_state takes the state and evolves it. state might have a last-updated timestamp; update_state would then make the "physics" reflect the time since the last update. Or do any other updates.
The point is that process_message doesn't do work on the state (it encodes desires), while update_state advances "reality".
update_state is called once for every process_message call, and it returns false if the main loop should exit.
updateClips being called every 500ms can be encoded as a repeated automatic event in the queue of messages process_message reads.
bool process_message(engine_state* state, std::chrono::milliseconds ms) {
    bool got_message = false;
    auto start = std::chrono::high_resolution_clock::now();
    while (start + ms > std::chrono::high_resolution_clock::now()) {
        // engine_state::delayed is a priority_queue of timestamp/action
        // pairs, ordered so the earliest timestamp is on top:
        while (!state->delayed.empty()) {
            auto stamp = state->delayed.top().stamp;
            if (stamp <= std::chrono::high_resolution_clock::now()) {
                auto f = state->delayed.top().action;
                state->delayed.pop();
                f(stamp, state);
                got_message = true;
            } else {
                break;
            }
        }
        // engine_state::queue is a std::queue<std::function<void(engine_state*)>>
        if (!state->queue.empty()) {
            auto f = state->queue.front();
            state->queue.pop();
            f(state);
            got_message = true;
        }
    }
    return got_message;
}
The repeated polling is implemented as a delayed action that, as its first operation, inserts a new delayed action due 500ms after this one. We pass in the time the action was due to run.
"Normal" events can be instead pushed into the normal action queue, which is a sequence of std::function<void(engine_state*)> and executed in order.
If there is nothing to do, the above function busy-waits for ms time and then returns. In some cases, we might want to go to sleep instead.
This is just a sketch of an event loop. There are many, many on the internet.
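To make the self-rescheduling part concrete, here is a minimal sketch; the type names are hypothetical, chosen to match the fields used in process_message above:

#include <chrono>
#include <functional>
#include <queue>

using timestamp = std::chrono::high_resolution_clock::time_point;

struct engine_state;

struct delayed_action {
    timestamp stamp;
    std::function<void(timestamp, engine_state*)> action;
    // Order the priority_queue so the earliest stamp is on top.
    bool operator<(delayed_action const& o) const { return stamp > o.stamp; }
};

struct engine_state {
    std::priority_queue<delayed_action> delayed;
    std::queue<std::function<void(engine_state*)>> queue;
    // ... vehicle state, etc.
};

void updateClips();  // the periodic routine from the question

// The repeated 500ms event: its first operation re-arms itself, due
// 500ms after the time *this* occurrence was due (not "now"), so the
// period doesn't drift; then it does the periodic work.
void update_clips_action(timestamp due, engine_state* state) {
    state->delayed.push({due + std::chrono::milliseconds(500), update_clips_action});
    updateClips();
}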

Threading and Mutex

I'm working on a program that simulates a gas station. Each car at the station is its own thread. Each car must loop through a single bitmask to check if a pump is open, and if it is, update the bitmask, fill up, and notify other cars that the pump is now open. My current code works, but there are some issues with load balancing. Ideally all the pumps are used the same amount and all cars get equal fill-ups.
EDIT: My program basically takes a number of cars, pumps, and a length of time to run the test for. During that time, cars will check for an open pump by constantly calling this function.
int Station::fillUp()
{
    // Loop through the pumps, using the bitmask to check if they are available.
    for (int i = 0; i < pumpsInStation; i++)
    {
        // Check bitmask to see if pump is open.
        stationMutex->lock();
        if ((freeMask & (1 << i)) == 0)
        {
            // Turning the bit on
            freeMask |= (1 << i);
            stationMutex->unlock();

            // Sleeps thread for 30ms and increments counts
            pumps[i].fillTankUp();

            // Turning the bit back off
            stationMutex->lock();
            freeMask &= ~(1 << i);
            stationCondition->notify_one();
            stationMutex->unlock();

            // Sleep long enough for all cars to have a chance to fill up first.
            this_thread::sleep_for(std::chrono::milliseconds((((carsInStation - 1) * 30) / pumpsInStation) - 30));
            return 1;
        }
        stationMutex->unlock();
    }

    // If no pumps are available, wait until one becomes available.
    stationCondition->wait(std::unique_lock<std::mutex>(*stationMutex));
    return -1;
}
I feel the issue has something to do with locking the bitmask when I read it. Do I need to have some sort of mutex or lock around the if check?
It looks like every car checks the availability of pump #0 first, and if that pump is busy it then checks pump #1, and so on. Given that, it seems expected to me that pump #0 would service the most cars, followed by pump #1 serving the second-most cars, all the way down to pump #(pumpsInStation-1) which only ever gets used in the (relatively rare) situation where all of the pumps are in use simultaneously at the time a new car pulls in.
If you'd like to get better load-balancing, you should probably have each car choose a different random ordering to iterate over the pumps, rather than having them all check the pumps' availability in the same order.
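For example, a sketch of that idea (assuming C++11 and the member names from the question; the order vector and the RNG are new here):

#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Each car scans the pumps in its own random order, so pump #0 is no
// longer everyone's first choice.
std::vector<int> order(pumpsInStation);
std::iota(order.begin(), order.end(), 0);  // 0, 1, ..., pumpsInStation-1
std::shuffle(order.begin(), order.end(), std::mt19937(std::random_device{}()));

for (int i : order)
{
    // ... check and claim pump `i` exactly as in the original loop ...
}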
Normally I wouldn't suggest refactoring as it's kind of rude and doesn't go straight to the answer, but here I think it would help you a bit to break your logic into three parts, like so, to better show where the contention lies:
int Station::acquirePump()
{
    // Loop through the pumps, using the bitmask to check if they are available.
    ScopedLocker locker(&stationMutex);
    for (int i = 0; i < pumpsInStation; i++)
    {
        // Check bitmask to see if pump is open
        if ((freeMask & (1 << i)) == 0)
        {
            // Turning the bit on
            freeMask |= (1 << i);
            return i;
        }
    }
    return -1;
}
void Station::releasePump(int n)
{
    ScopedLocker locker(&stationMutex);
    freeMask &= ~(1 << n);
    stationCondition->notify_one();
}
bool Station::fillUp()
{
    // If a pump is available:
    int i = acquirePump();
    if (i != -1)
    {
        // Sleeps thread for 30ms and increments counts
        pumps[i].fillTankUp();
        releasePump(i);

        // Sleep long enough for all cars to have a chance to fill up first.
        this_thread::sleep_for(std::chrono::milliseconds((((carsInStation - 1) * 30) / pumpsInStation) - 30));
        return true;
    }

    // If no pumps are available, wait until one becomes available.
    stationCondition->wait(std::unique_lock<std::mutex>(*stationMutex));
    return false;
}
Now that the code is in this form, the load-balancing issue is easier to see (and it is important to fix if you don't want to wear out one pump, or if acquiring a pump might itself involve a lock). The issue lies in acquirePump, where you check the pumps' availability in the same order for every car. A simple tweak to balance it better:
int Station::acquirePump()
{
    // Loop through the pumps, using the bitmask to check if they are available.
    ScopedLocker locker(&stationMutex);
    for (int n = 0, i = startIndex; n < pumpsInStation; ++n, i = (i + 1) % pumpsInStation)
    {
        // Check bitmask to see if pump is open
        if ((freeMask & (1 << i)) == 0)
        {
            // Change the starting index used to search for a free pump
            // for the next car.
            startIndex = (startIndex + 1) % pumpsInStation;

            // Turning the bit on
            freeMask |= (1 << i);
            return i;
        }
    }
    return -1;
}
Another thing I have to ask is whether it's really necessary (e.g., for memory efficiency) to use bit flags to indicate whether a pump is in use. If you can use an array of bool instead, you'll be able to avoid locking completely and simply use atomic operations to acquire and release pumps, and that'll avoid creating a traffic jam of locked threads.
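A sketch of that lock-free idea (assuming C++11 atomics; pumpTaken and kMaxPumps are hypothetical names, and a fixed-size array sidesteps the fact that std::atomic is neither copyable nor movable):

#include <atomic>

const int kMaxPumps = 32;  // hypothetical upper bound on pump count

// One atomic flag per pump; true = taken. Zero-initialized at
// namespace scope, so all pumps start out free.
std::atomic<bool> pumpTaken[kMaxPumps];

int Station::acquirePump()
{
    for (int i = 0; i < pumpsInStation; i++)
    {
        bool expected = false;
        // Atomically claim pump i if (and only if) it is currently free.
        if (pumpTaken[i].compare_exchange_strong(expected, true))
            return i;
    }
    return -1;  // all pumps busy right now
}

void Station::releasePump(int n)
{
    pumpTaken[n].store(false);
}

Note that a car that finds no free pump still needs a retry or wait strategy; the atomics remove the lock around the flags themselves, not the waiting.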
Imagine that the mutex has a queue associated with it, containing the waiting threads. Now, one of your threads manages to get the mutex that protects the bitmask of occupied stations, checks if one specific place is free. If it isn't, it releases the mutex again and loops, only to go back to the end of the queue of threads waiting for the mutex. Firstly, this is unfair, because the first one to wait is not guaranteed to get the next free slot, only if that slot happens to be the one on its loop counter. Secondly, it causes an extreme amount of context switches, which is bad for performance. Note that your approach should still produce correct results in that no two cars collide while accessing a single filling station, but the behaviour is suboptimal.
What you should do instead is this (a sketch follows the list):
1. lock the mutex to get exclusive access to the possible filling stations
2. locate the next free filling station
3. if none of the stations are free, wait for the condition variable and restart at point 2
4. mark the slot as occupied and release the mutex
5. fill up the car (this is where the sleep in the simulation actually makes sense, the other one doesn't)
6. lock the mutex
7. mark the slot as free and signal the condition variable to wake up others
8. release the mutex again
Just in case that part isn't clear to you, waiting on a condition variable implicitly releases the mutex while waiting and reacquires it afterwards!
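A minimal sketch of those steps, assuming the member names from the question but with stationMutex and stationCondition held by value (std::mutex and std::condition_variable) rather than through pointers:

#include <condition_variable>
#include <mutex>

int Station::acquirePump()
{
    // Steps 1-4: lock, find a free slot, wait if there is none, then
    // mark the slot occupied; the unique_lock releases on return.
    std::unique_lock<std::mutex> lock(stationMutex);
    for (;;)
    {
        for (int i = 0; i < pumpsInStation; i++)
        {
            if ((freeMask & (1 << i)) == 0)
            {
                freeMask |= (1 << i);  // mark occupied
                return i;
            }
        }
        // Step 3: no free pump; wait() releases the mutex while
        // waiting and reacquires it before returning, then rescan.
        stationCondition.wait(lock);
    }
}

void Station::releasePump(int i)
{
    // Steps 6-8: lock, mark the slot free, wake a waiter, unlock.
    std::lock_guard<std::mutex> lock(stationMutex);
    freeMask &= ~(1 << i);
    stationCondition.notify_one();
}

// Step 5 happens between the two calls, without holding the mutex:
//   int i = acquirePump();
//   pumps[i].fillTankUp();   // the simulated fill-up sleep
//   releasePump(i);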

Getting item sequence numbers from a QtConcurrent Threaded Calculation

The QtConcurrent namespace is really great for simplifying the management of multi-threaded calculations. Overall this works great and I have been able to use QtConcurrent run(), map(), and other variants in the way they are described in the API.
Overall Goal:
I would like to query, cancel(), or pause() a numerically intensive calculation from QML. So far this is working the way I would like, except that I cannot access the sequence numbers in the calculation. Here is a link that describes a similar QML setup.
Below is an image from a small test app that I created to encapsulate what I am trying to do. In the example above, the calculation has nearly completed and all the cores have been enqueued with work properly, as can be seen from a system query.
But what I really would like to do is use the sequence numbers from a given list of the items IN THE multi-threaded calculation itself. E.g., one approach might be to simply set up the sequence numbers directly in a QList or QVector (other C++ STL containers could work as well), like this:
void TaskDialog::mapTask()
{
    // Number of times the map function will be called:
    int N = 5;

    // Prepare the vector that we operate on with mapFunction:
    QList<int> vectorOfInts;
    for (int i = 0; i < N; i++) {
        vectorOfInts << i;
    }

    // Start the calc:
    QFuture<void> future = QtConcurrent::map(vectorOfInts, mapFunction);
    _futureWatcher.setFuture(future);
    //_futureWatcher.waitForFinished();
}
The calculation is non-blocking with the line _futureWatcher.waitForFinished(); commented out, as shown in the code above. Note that when set up as a non-blocking calculation, the GUI thread is responsive and the progress bar updates as desired.
But when the values in the QList container are queried during the calculation, what comes back appears to be the uninitialized garbage values that one would expect when an array is not properly initialized.
Below is the example function I am calling:
void mapFunction(int& n)
{
    // Check the n values:
    qDebug() << "n = " << n;

    /* Below is an arbitrary task, but note that we left out n
     * (although normally we would want to use it): */
    const long work = 10000 * 10000 * 10;
    long s = 0;
    for (long j = 0; j < work; j++)
        s++;
}
And the output of qDebug() is:
n = 30458288
n = 204778
n = 270195923
n = 0
n = 270385260
The n-values are useless but the sum values, s, are correct (although not shown) when the calculation is mapped in this fashion (non-blocking).
Now, if I uncomment the _futureWatcher.waitForFinished(); line then I get the expected values (the order is irrelevant):
n = 0
n = 2
n = 4
n = 3
n = 1
But in this case, with _futureWatcher.waitForFinished(); enabled, my GUI thread is blocked and the progress bar does not update.
What then would be the advantage of using QtConcurrent::map() with blocking enabled, if the goal is to not block the main GUI thread?
Secondly, how can I get the correct values of n in the non-blocking case, allowing the GUI to remain responsive and the progress bar to keep updating?
My only option may be to use QThread directly but I wanted to take advantage of all the nice tools setup for us in QtConcurrent.
Thoughts? Suggestions? Other options? Thanks.
EDIT: Thanks to user2025983 for the insight which helped me to solve this. The bottom line is that I first needed to dynamically allocate the QList:
QList<int>* vectorOfInts = new QList<int>;
for (int i = 0; i < N; i++)
    vectorOfInts->push_back(i);
Next, the vectorOfInts is passed by reference to the map function by de-referencing the pointer, like this:
QFuture<void> future = QtConcurrent::map(*vectorOfInts, mapFunction);
Note also that the prototype of the mapFunction remains the same:
void mapFunction(int& n)
And then it all worked properly: the GUI remained responsive, the progress bar updated, the values of n were all correct, etc., WITHOUT the need to add blocking through the function:
_futureWatcher.waitForFinished();
Hope these extra details can help someone else.
The problem here is that your QList goes out of scope when mapTask() finishes.
Since mapFunction(int &n) takes its parameter by reference, it receives references to integer values which are part of an array that is now out of scope! The computer is then free to do whatever it likes with that memory, which is why you see garbage values. If you are just using integer parameters, I would recommend passing the parameters by value; then everything should work.
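A sketch of the by-value version (the body is illustrative; only the signature change matters):

// Taking `n` by value gives each worker its own copy, so nothing
// dangles even if the original QList has been destroyed.
void mapFunction(int n)
{
    qDebug() << "n = " << n;
    // ... numerically intensive work using n ...
}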
Alternatively, if you must pass by reference, you can have the future watcher delete the array when it's finished:
QList<int>* vectorOfInts = new QList<int>;
// push back into structure
// QList is not a QObject (so it has no deleteLater() slot); instead,
// delete it from a lambda when the watcher reports it is finished:
connect(&_futureWatcher, &QFutureWatcher<void>::finished,
        [vectorOfInts]() { delete vectorOfInts; });
// launch stuff
QtConcurrent::map...
// profit

Nodejs: parallel request, serial response

I have done Nodejs: How to write high performance async loop for stormjs; you can check the stormjs serial loop demo.
But there is still the problem of the parallel loop. E.g., we have a function requestContext(index, callback(err, context)) which remotely gets the context from 'http://host/post/{index}', and we need to get the context for each index in [0-99] and push the contexts into an array in order: [context0 ... context99].
But obviously this doesn't work with the stormjs parallel loop.
I still want to know how noders do this task: the requests must be made in parallel, not one by one; it should be parallel requests with serial pushes.
var counter = 0;

// create an array with numbers 0 to 99
_.range(0, 100).forEach(function(key, value, arr) {
    // for each of them, request context
    requestContext(key, function(err, context) {
        // add the context to the array under the correct key
        if (!err) {
            arr[key] = context;
        }
        // increment the counter; if all have finished, then fire finished()
        if (++counter === 100) {
            finished(arr);
        }
    });
});

function finished(results) {
    // do stuff
}
No storm required. If you want an execution / flow control library, I would recommend Futures, because it doesn't compile your code or "hide the magic".
Previously you recursed through each one and executed them in serial order, pushing them into the array one after another.
This time you execute them all in parallel and tell each one to assign its own value to the correct key in the array.
_.range creates an array containing the values 0 to 99.