condition_variable: notifying after releasing lock optimization effect

condition_variable: notifying after releasing lock optimization effect - c++

It was already asked few times about whether one should notify condition_variable while holding a lock:
lk.lock();
// change state
cv.notify_one();
lk.unlock();
or after releasing it:
lk.lock();
// change state
lk.unlock();
cv.notify_one();
Here's one of such questions: Do I have to acquire lock before calling condition_variable.notify_one()?
Answers point out that both are safe, as long as the lock is held at all when condition is changed, and the later form is better for performance: awaken thread is immediately unlocked.
I'm wondering how much this performance effect is generic and significant:
Does this apply to Windows CONDITION_VARIABLE (that is avaliable since Vista+, but I'm nostly asking about Windows 10 implementation)? If so, how much significant the effect is?
Does this apply to Qt QWaitCondition on Windows?
Does this apply to boost::condition_variable on Windows?
Does it apply to most Linux systems where POSIX condition variable is the underlying implementation? Is there a measure of it?
The practical application is that I have to use the first form, where notification is done under the lock, as a structure that manages condition variables needs protection as well, so I want to know how much this approach worth avoiding.

Related

C++ std::scoped_lock: What happens if the lock blocks?

I am interested to know more about how the std::scoped_lock operates.
I am making some modifications to some thread-unsafe code by adding a mutex around a critical section.
I am using a std::scoped_lock to do this.
There are two possible things which can happen when the line of code std::scoped_lock lock(mutex) is reached:
The mutex is locked successfully. Nothing to be concerned with here.
The mutex is being held by another thread (A). The mutex cannot be locked, and presumably the current thread (B) blocks.
Regarding point 2, what happens when the thread which was able to lock the mutex unlocks it again? ie: When thread A unlocks the mutex (scoped_lock goes out of scope) what does thread B do? Does it automatically wake up and try to lock the mutex again? (Does thread B sleep when it is not able to lock the mutex? Presumably it does not sit in an infinite while(1) like loop hogging the CPU.)
As you can see I don't have a complete understanding of how a scoped_lock works. Is anyone able to enlighten me?

Your basic assumptions are pretty much correct - block, don't hog the CPU too much (a bit is OK), and let the implementation deal with thread wake-ups.
There's one special case that should be mentioned: when a mutex is used not just to protect a shared resource, but specifically to coordinate between threads A&B, a std::condition_variable can help with the thread synchronization. This does not replace the mutex; in fact you need to lock a mutex in order to wait on a condition variable. Many Operating Systems can benefit from condition variables by waking up the right threads faster.

pthread_mutex_lock and pthread_mutex_lock in another thread

I called a pthread_mutex_lock(&th) in a thread then I want to unlock the mutex in another thread pthread_mutex_unlock(&th)
Is it possible to do that?
Or the mutex should be unlocked in the same thread ?

It should be unlocked in the same thread. From the man page: "If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behavior results." (http://pubs.opengroup.org/onlinepubs/009604499/functions/pthread_mutex_lock.html)

I just wanted to add to Guijt's answer:
When a thread locks a mutex, it is assumed it is inside a critical section. If we allow another thread to unlock that mutex, the first thread might still be inside the critical section, resulting in problems.
I can see several solutions to your problem:
Option 1: Rethink your algorithm
Try to understand why you need to unlock from a different thread, and see if you can get the unlocking to be done within the locking thread. This is the best solution, as it typically produces the code that is simplest to understand, and simplest to prove it is actually doing what you believe it is doing. With multithreaded programming being so complicated, the price it is worth paying for such simplicity should be quite high.
Option 2: Synchronize the threads with an event
One might argue it is just a method to implement option 1 above. The idea is that when the locking thread finishes with the critical section, it does not go out to do whatever, but waits on an event. When the second thread wishes to release the lock, it instead signals the event. The first thread then releases the lock.
This procedure has the advantage that thread 2 cannot inadvertently release the lock too soon.
Option 3: Don't use a mutex
If neither one of the above options work for you, you most likely are not using the mutex for mutual exclusion, but for synchronizations. If such is the case, you are likely using the wrong construct.
The construct most resembling a mutex is a semaphore. In fact, for years the Linux kernel did not have a mutex, claiming that it's just a semaphore with a maximal value of 1. A semaphore, unlike a mutex, does not require that the same thread lock and release.
RTFM on sem_init and friends for how to use it.
Please be reminded that you must first model your problem, and only then choose the correct synchronization construct to use. If you do it the other way around, you are almost certain to introduce lots of bugs that are really really really difficult to find and fix.

Whole purpose of using Mutex is achieve mutual exclusion in a critical section with ownership being tracked by the kernel. So mutex has to be unlocked in the same thread which has acquired it

Bottleneck in Threads C++

So I am just trying to verify my understanding and hope that you guys will be able to clear up any misunderstandings. So essentially I have two threads which use the same lock and perform calculations when they hold the lock but the interesting thing is that within the lock I will cause the thread to sleep for a short time. For both threads, this sleep time will be slightly different for either thread. Because of the way locks work, wont the faster thread be bottlenecked by the slower thread as it will have to wait for it to complete?
For example:
Thread1() {
lock();
usleep(10)
lock();
}
-
Thread2() {
lock();
sleep(100)
lock();
}
Now because Thread2 holds onto the lock longer, this will cause a bottleneck. And just to be sure, this system should have a back and forth happens on who gets the lock, right?
It should be:
Thread1 gets lock
Thread1 releases lock
Thread2 gets lock
Thread2 releases lock
Thread1 gets lock
Thread1 releases lock
Thread2 gets lock
Thread2 releases lock
and so on, right? Thread1 should never be able to acquire the lock right after it releases it, can it?

Thread1 should never be able to acquire the lock right after it releases it, can it?
No, Thread1 could reacquire the lock, right after it releases it, because Thread2 could still be suspended (sleeps because of the scheduler)
Also sleep only guarantees that the thread will sleep at least the wanted amount, it can and will often be more.
In practice you would not hold a lock while calculating a value, you would get the lock, get the needed values for calculation, unlock, calculate it, and then get the lock again, check if the old values for the calculation are still valid/wanted, and then store/return your calculated results.
For this purpose, the std::future and atomic data types were invented.
...this system should have a back and forth happens on who gets the lock, right?
Mostly The most of the time it will be a back and forth but some times there could/will be two lock/unlock cycles by Thread1. It depends on your scheduler and any execution and cycle will probably vary.

Absolutely nothing prevents either thread from immediately reacquiring the lock after releasing it. I have no idea what you think prevents this from happening, but nothing does.
In fact, in many implementations, a thread that is already running has an advantage in acquiring a lock over threads that have to be made ready-to-run. This is a sensible optimization to minimize context switches.
If you're using a sleep as a way to simulate work and think this represents some real world issue with lock fairness, you are wrong. Threads that sleep are voluntarily yielding the remainder of their timeslice and are treated very differently from threads that exhaust their timeslice doing work. If these threads were actually doing work, eventually one thread would exhaust its timeslice.

Depending on what you are trying to achieve there are several possibilities.
If you want your threads to run in a particular order then have a look here.
There are basically 2 options:
- one is to use events where a thread is signaling the next one it has done his job and so the next one could start.
- the other one is to have a scheduler thread that handle the ordering with events or semaphores.
If you want your threads to run independently but have a lock mechanism where the order of attempting to get the lock is preserved you can have a look here. The last part of the answer uses a queue of one condition variable per thread seem good.
And as it was said in previous answers and comments, using sleep for scheduling is a bad idea.
Also lock is just a mutual exclusion mechanism and has no guarentee on the execution order.
A lock is usually intended for preventing concurrent access on a critical resource so it should just do that. The smaller the critical section is, the better.
Finally yes trying to order threads is making "bottlenecks". In this particular case if all calculations are made in the locked sections the threads won't do anything in parallel so you can question the utility of using threads.
Edit :
Just on more warning: be careful, with threads it's not because is worked (was scheduled as you wanted to) 10 times on your machine that it always will, especially if you change any of the context (machine, workload...). You have to be sure of it by design.

lock vs (try_lock, sleep, repeat) performance

I am updating some code and I came across several mutexs that were using the lines:
while (!mutex_.try_lock())
sleep_for(milliseconds(1));
instead of just locking the mutex straight away:
mutex_.lock();
Is there any performance difference positive or negative to using the try lock and sleep approach vs straight locking the mutex or is it just wasted instructions?

lock() will block if the lock is not available, while try_lock() returns straight away even if the lock is not available.
The first form polls the lock until it is available, which has a couple of disadvantages:
Depending on the delay in the loop, there can be unnecessary latency. The lock might become available just after the try_lock attempt, but your process still waits for the preset delay.
It is a waste of CPU cycles polling the lock
In general you should only use try_lock if there is something useful that the thread can do while it is waiting for the lock, in which case you would not want the thread to block waiting for the lock.
Another use case for try_lock would be to attempt locking with a timeout, so that if the lock did not become available within a certain time the program could abort.
cppreference says that:
lock() is usually not called directly: std::unique_lock and std::lock_guard are used to manage exclusive locking.
Using these classes may be more appropriate depending on the circumstances.

mutex lock priority

In multithreading (2 thread) program, I have this code:
while(-1)
{
m.lock();
(...)
m.unlock();
}
m is a mutex (in my case a c++11 std::mutex, but I think it'doesn't change if I use different library).
Assuming that the first thread owns the mutex and it's done something in (...) part. The second thread tried to acquire the mutex, but it's waiting that the first thread release m.
The question is: when thread 1 ends it's (...) execution and unlocks the mutex, can we be sure that thread 2 acquires the mutex or thread 1 can re-acquire again the mutex before thread 2, leaving it stucked in lock()?

The C++ standard doesn't make any guarantee about the order locks to a mutex a granted. Thus, it is entirely possible that the active thread keeps unlock()ing and lock()ing the std::mutex m without another thread trying to acquire the lock ever getting to it. I don't think the C++ standard provides a way to control thread priorities. I don't know what you are trying to do but possibly there is another approach which avoids the problem you encounter.

If both threads are equal priority, there is no such guarantee by standard mutex implementations. Some OS's have a lis of "who's waiting", and will pick the "longest waiting" when you release something, but that is an implementation detail, not something you can reliably depend on.
And imagine that you have two threads, each running something like this:
m.lock();
(...)
m.unlock();
(...) // Clearly not the same code as above (...)
m.lock();
(...) // Some other code that needs locking against updates.
m.unlock();
Would you want the above code to switch thread on the second lock, every time?
By the way, if both threads run with lock for the entire loop, what is the point of a lock?

There are no guarantees, as the threads are not ordered in any way with respect to each other. In fact, the only synchronisation point is the mutex locking.
It's entirely possible that the first thread reacquires the lock immediately if for example it is running the function in a tight loop. Typical implementations have a notification and wakeup mechanism if any thread is sleeping on a mutex, but there may also be a bias for letting the running thread continue rather than performing a context switch... it's very much up to the implementation and the details of the platform at the time.

There are no guarantees provided by C++ or underlying OS.
However, there is some reasonable degree of fairness determined by the thread arrival time to the critical region (mutex in this case). This fairness can be expressed as statistical probability, but not a strict guarantee. Most likely this choice will be down to OS execution scheduler, which will also consider many other factors.

It's not a good idea to rely on such code, so you should probably change your design.
However, on some operating systems, sleep(0) will yield the thread. (Sleep(0) on Windows)
Again, it's best not to rely on this.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js