What will happen if we signal a semaphore without a wait? - c++

I came across this scenario while implementing my project. I have a binary semaphore that is taken by one thread. While that thread is executing, the semaphore is signaled multiple times by another thread. Is that a problem, or will it cause undefined behavior?

It is an error to signal a semaphore without a corresponding wait. What happens if you do this is implementation dependent.
If a call to ReleaseSemaphore on a Windows Semaphore object would result in the maximum count being exceeded, ReleaseSemaphore returns FALSE. It will not throw an exception or cause a fatal runtime error.
Under Linux, a call to sem_post that would exceed the maximum count returns -1, and errno is set to EOVERFLOW. Again, this will not be fatal to your application.
Under .NET, a call to Release that would exceed the semaphore's maximum value results in SemaphoreFullException being thrown.
It's a logic error to release a semaphore more often than it's acquired. If your program does that, you have a latent bug. It might be okay in your particular situation, but if you try this with anything other than a binary semaphore, you're likely to end up with some very difficult to find bugs.
I would strongly recommend that you check the return value when you release the semaphore, and treat a failure as a fatal exception.
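As a minimal sketch of that recommendation, assuming a POSIX unnamed semaphore on Linux: the helper names post_or_die and current_count are made up for illustration. A POSIX semaphore has no per-semaphore maximum (only the system-wide SEM_VALUE_MAX), so extra posts on a "binary" semaphore raise the count above 1 rather than failing.

```cpp
#include <semaphore.h>

#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: post to the semaphore and treat any failure
// (for example EOVERFLOW) as fatal, as the answer suggests.
void post_or_die(sem_t* sem) {
    assert(sem != nullptr);
    if (sem_post(sem) == -1) {
        std::perror("sem_post");
        std::abort();
    }
}

// Hypothetical helper: read the current count of the semaphore.
int current_count(sem_t* sem) {
    assert(sem != nullptr);
    int value = 0;
    if (sem_getvalue(sem, &value) == -1) {
        std::perror("sem_getvalue");
        std::abort();
    }
    return value;
}
```

Calling post_or_die twice on a semaphore initialised to 0, with no waiter in between, leaves current_count at 2, which is the latent bug described above: the next two waits both succeed without blocking.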

A binary semaphore is called a Mutex.
Nothing will happen, the mutex can either be taken or not taken, those are the only two states. Releasing a non acquired mutex has no negative effects.
However, take into account that the logic may be affected. It doesn't seem like you're controlling very well when and how you signal the mutex, which may result in you releasing a mutex that another thread has acquired.

Is there a way to check if std::future state is ready in a guaranteed wait-free manner?

I know that I can check the state of the std::future the following way:
my_future.wait_for(std::chrono::seconds(0)) == std::future_status::ready
But according to cppreference.com std::future::wait_for may block in some cases:
This function may block for longer than timeout_duration due to scheduling or resource contention delays.
Is it still the case when timeout_duration is 0 ? If so, is there another way to query the state in a guaranteed wait-free manner ?
The quote from cppreference is simply there to remind you that the OS scheduler is a factor here and that other tasks requiring platform resources could be using the CPU-time your thread needs in order to return from wait_for() -- regardless of the specified timeout duration being zero or not. That's all. You cannot technically be guaranteed to get more than that on a non-realtime platform. As such, the C++ Standard says nothing about this, but you can see other interesting stuff there -- see the paragraph for wait_for() under [futures.unique_future¶21]:
Effects: None if the shared state contains a deferred function
([futures.async]), otherwise blocks until the shared state is ready or
until the relative timeout ([thread.req.timing]) specified by
rel_time has expired.
There is no mention of the additional delay here, but it does say that you are blocked, and it remains implementation-dependent whether wait_for() yield()s the thread [1] as soon as it blocks or returns immediately when the timeout duration is zero. In addition, an implementation might need to synchronize access to the future's status in a locking manner, and that lock would have to be taken before checking whether an immediate return is possible. Hence, you don't even have a guarantee of lock-freedom here, let alone wait-freedom.
Note that the same applies for calling wait_until with a time in the past.
Is it still the case when timeout_duration is 0 ? If so, is there
another way to query the state in a guaranteed wait-free manner ?
So yes, implementation of wait_for() notwithstanding, this is still the case. As such, this is the closest to wait-free you're going to get for checking the state.
[1] In simple terms, this means "releasing" the CPU and putting your thread at the back of the scheduler's queue, giving other threads some CPU-time.
To answer your second question, there is currently no way to check if the future is ready other than waiting. We will likely get this at some point: https://en.cppreference.com/w/cpp/experimental/future/is_ready. If your runtime library supports the concurrency extensions and you don't mind using experimental in your code, then you can use is_ready() now. That being said, I know of few cases where you must check a future's state. Are you sure it's necessary?
Is it still the case when timeout_duration is 0 ?
Yes. That's true for any operation. The OS scheduler could pause the thread (or the whole process) to allow another thread to run on the same CPU.
If so, is there another way to query the state in a guaranteed wait-free manner ?
No. Using a zero timeout is the correct way.
There's not even a guarantee that the shared state of a std::future doesn't lock a mutex to check if it's ready, so it would be impossible to guarantee it was wait-free.
For GCC's implementation the ready flag is an atomic so there's no mutex lock needed, and if it's ready then wait_for returns immediately. If it's not ready then there are some more atomic operations and then a check to see if the timeout has passed already, then a system call. So for a zero timeout there are just some atomic loads and function calls (no system call).
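The zero-timeout check discussed in these answers can be wrapped in a small helper. This is only a sketch: is_ready is a made-up name here, not a standard library function, and the call carries no wait-free or lock-free guarantee, for the reasons given above.

```cpp
#include <cassert>
#include <chrono>
#include <future>

// Hypothetical helper: poll a future with a zero timeout.
// The future must be valid(); calling wait_for on an invalid future is undefined.
template <class T>
bool is_ready(const std::future<T>& f) {
    assert(f.valid());
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

With a std::promise<int> p and its future f, is_ready(f) is false before p.set_value(...) and true afterwards. For a task launched with std::launch::deferred that has not yet run, wait_for reports future_status::deferred rather than ready, so the helper returns false.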

unlock the mutex after condition_variable::notify_all() or before?

Looking at several videos and the documentation example, we unlock the mutex before calling notify_all(). Would it be better to unlock it after the call instead?
The common way:
Inside the Notifier thread:
//prepare data for several worker-threads;
//and now, awaken the threads:
std::unique_lock<std::mutex> lock2(sharedMutex);
_threadsCanAwaken = true;
lock2.unlock();
_conditionVar.notify_all(); //awaken all the worker threads;
//wait until all threads completed;
//cleanup:
_threadsCanAwaken = false;
//prepare new batches once again, etc, etc
Inside one of the worker threads:
while(true){
// wait for the next batch:
std::unique_lock<std::mutex> lock1(sharedMutex);
_conditionVar.wait(lock1, [](){ return _threadsCanAwaken; });
lock1.unlock(); //let sibling worker-threads work on their part as well
//perform the final task
//signal the notifier that one more thread has completed;
//loop back and wait until the next task
}
Notice how the lock2 is unlocked before we notify the condition variable - should we instead unlock it after the notify_all() ?
Edit
From my comment below: My concern is this: what if the worker wakes spuriously, sees that the mutex is unlocked, completes the task super-quickly, and loops back to the start of the while loop? The slow-poke Notifier then finally calls notify_all(), causing the worker to loop an additional time (excessive and undesired).
There are no advantages to unlocking the mutex before signaling the condition variable unless your implementation is unusual. There are two disadvantages to unlocking before signaling:
1. If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal are an atomic operation and that race conditions and deadlocks are impossible.
2. There is a performance penalty for unlocking before signaling that is avoided by unlocking after signaling. If you signal before you unlock, a good implementation will know that your signal cannot possibly render any thread ready-to-run, because the mutex is held by the calling thread and any thread affected by the condition variable necessarily cannot make forward progress without the mutex. This permits a significant optimization (often called "wait morphing") that is not possible if you unlock first.
So signal while holding the lock unless you have some unusual reason to do otherwise.
should we instead unlock it after the notify_all() ?
It is correct to do it either way, but you may see different behavior in different situations. It is quite difficult to predict how it will affect the performance of your program: I've seen both positive and negative effects in different applications. So it is better to profile your program and decide for your particular situation based on the results.
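To make the two orderings concrete, here is a minimal sketch. The function run_once and its variables are hypothetical and are not taken from the question's code. The waiter uses a predicate, so it observes the same state whichever ordering the notifier picks; only timing can differ.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical demo: returns the payload value the worker observed.
int run_once(bool notify_while_locked) {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    int payload = 0;
    int seen = -1;

    std::thread worker([&] {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return ready; });  // the predicate guards against spurious wakeups
        seen = payload;
    });

    {
        std::unique_lock<std::mutex> lock(m);
        payload = 42;
        ready = true;                 // shared state changes under the lock in both variants
        if (notify_while_locked) {
            cv.notify_all();          // variant A: notify, then unlock at scope exit
        } else {
            lock.unlock();            // variant B: unlock, then notify
            cv.notify_all();
        }
    }

    worker.join();
    return seen;
}
```

Both run_once(true) and run_once(false) return 42. Which variant is faster depends on the platform, which is why profiling is suggested above.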
As mentioned on cppreference.com:
The notifying thread does not need to hold the lock on the same mutex
as the one held by the waiting thread(s); in fact doing so is a
pessimization, since the notified thread would immediately block
again, waiting for the notifying thread to release the lock.
That said, the documentation for wait states:
At the moment of blocking the thread, the function automatically calls
lck.unlock(), allowing other locked threads to continue.
Once notified (explicitly, by some other thread), the function
unblocks and calls lck.lock(), leaving lck in the same state as when
the function was called. Then the function returns (notice that this
last mutex locking may block again the thread before returning).
So when notified, wait will re-attempt to acquire the lock, and in the process it will block again until the notifying thread releases the lock.
So I suggest releasing the lock before calling notify, as done in the example on cppreference.com, and most importantly:
Don't be Pessimistic.
David's answer seems to me wrong.
First, assuming the simple case of two threads, one waiting for the other on a condition variable, unlocking first by the notifier will not waken the other waiting thread, as the signal has not arrived. Then the notify call will immediately waken the waiting thread. You do not need any special optimizations.
On the other hand, signalling first has the potential of waking up a thread and making it sleep immediately again, as it cannot hold the lock—unless wait morphing is implemented.
Wait morphing does not exist in Linux at least, according to the answer under this StackOverflow question: Which OS / platforms implement wait morphing optimization?
The cppreference example also unlocks first before signalling: https://en.cppreference.com/w/cpp/thread/condition_variable/notify_all
It explicitly says:
The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s). Doing so may be a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock, though some implementations recognize the pattern and do not attempt to wake up the thread that is notified under lock.
should we instead unlock it after the notify_all() ?
After reading several related posts, I've formed the opinion that it's purely a performance issue. If the OS supports "wait morphing", unlock after; otherwise, unlock before.
I'm adding an answer here to augment that of @DavidSchwartz. In particular, I'd like to clarify his point 1.
If you unlock before you signal, the signal may wake a thread that chose to block on the condition variable after you unlocked. This can lead to a deadlock if you use the same condition variable to signal more than one logical condition. This kind of bug is hard to create, hard to diagnose, and hard to understand. It is trivially avoided by always signaling before unlocking. This ensures that the change of shared state and the signal are an atomic operation and that race conditions and deadlocks are impossible.
The first thing I'd say is that, because it's a CV and not a mutex, a better term for the so-called "deadlock" might be "sleep paralysis": a mistake some programs make is that a thread that's supposed to wake goes back to sleep because it did not recheck the condition it had been waiting for before waiting again.
The second thing is that, when waking other thread(s), the default choice should be broadcast/notify_all (broadcast is the POSIX term, equivalent to its C++ counterpart). signal/notify_one is an optimized special case for when only one other thread is waiting.
Finally, third, David is adamant that it's better to unlock after notify, because it can avoid the "deadlock" which I've been referring to as "sleep paralysis".
If it's unlock then notify, there's a window where another thread (let's call it the "wrong" thread) may i.) acquire the mutex, ii.) go into wait, and iii.) wake up. If steps i., ii., and iii. happen quickly enough, the "wrong" thread consumes the signal, leaving the intended (let's call it "correct") thread asleep.
I discussed this extensively with David; he clarified that the "sleep paralysis" would occur only when all 3 points are violated (1. the condvar is associated with several separate conditions and/or the condition isn't checked before waiting again; 2. only 1 thread is signaled/notified when more than 1 other thread is using the condvar; 3. unlocking before notify creates a window for a race condition).
Finally, my recommendation is that points 1 and 2 are essential for the correctness of the program, and fixing issues associated with 1 and 2 should be prioritized over 3, which should only be an augmentative "last resort".
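Points 1 and 2 can be sketched as follows. The function wake_two_waiters and its variable names are hypothetical. Two waiters share one condition variable but wait on different logical conditions; each re-checks its own predicate, and the notifier uses notify_all, so neither waiter can be left asleep by a wakeup that was meant for the other.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical demo: returns how many waiters were released.
int wake_two_waiters() {
    std::mutex m;
    std::condition_variable cv;
    bool a_ready = false;
    bool b_ready = false;
    int woken = 0;

    std::thread ta([&] {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return a_ready; });  // point 1: always re-check the condition
        ++woken;                                 // still holding the lock here
    });
    std::thread tb([&] {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return b_ready; });
        ++woken;
    });

    {
        std::lock_guard<std::mutex> lock(m);
        a_ready = true;
        b_ready = true;
    }
    cv.notify_all();  // point 2: broadcast when several threads use the same condvar

    ta.join();
    tb.join();
    return woken;
}
```

The function returns 2 regardless of whether the waiters started waiting before or after the flags were set, because the predicate is evaluated under the lock before each wait.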
For reference, the manpages for signal/broadcast and wait contain some info from version 3 of the Single Unix Specification that explains points 1 and 2, and partly 3. Although specified for POSIX/Unix/Linux in C, its concepts are applicable to C++.
As of this writing (2023-01-31), the 2018 edition of version 4 of the Single Unix Specification has been released, and the drafting of version 5 is underway.

When can std::thread::join fail due to no_such_process?

std::thread::join() is permitted to fail, throwing a std::system_error for no_such_process if the thread is "not valid". Note that the no_such_process case is distinct from a thread that is not joinable (for which the error code is invalid_argument).
In what circumstances might that happen? Alternatively, what must I do to ensure that join() does not fail for that reason? I want a destructor to join() some threads it manages, and of course I want the destructor to never throw exceptions. What can make a (properly constructed and not destroyed) thread "not valid"?
In what circumstances might that happen?
On *nix systems, it happens when you try to join a thread whose ID is not in the thread table, meaning the thread does not exist (anymore). This might happen when a thread has already been joined and terminated, or if your thread variable's memory has been corrupted.
Alternatively, what must I do to ensure that join() does not fail for that reason?
You might test std::thread::joinable(), but it might also fail [1]. Just don't mess with your thread variables, and you're good to go. Simply ignore this possibility; if you encounter such an error, your program had better dump core and let you analyse the bug.
[1] By fail, I mean report true instead of false or the other way around, not throw or crash.
The no_such_process error code corresponds to a ESRCH POSIX error code. On a POSIX system std::thread::join() probably delegates to pthread_join().
Issue 7 of POSIX removed the possibility of an ESRCH.
On Linux, pthread_join may give ESRCH if no thread with the given thread ID could be found. The ID of a C++ thread is private data, so the only way the ID could be not found would be if this does not point to a properly constructed std::thread.
I conclude that this error condition can only occur as a result of an earlier action that had undefined behaviour, such as a bad reinterpret_cast or use of a dangling pointer.
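For the destructor described in the question, a minimal sketch could look like the following. ThreadGroup and run_group are hypothetical names, and the sketch assumes no thread in the group ever tries to destroy the group itself (which would be a self-join). It relies on the conclusion above that no_such_process cannot occur without earlier undefined behaviour.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical owner of a set of threads that joins them on destruction.
class ThreadGroup {
public:
    ThreadGroup() = default;
    ThreadGroup(const ThreadGroup&) = delete;
    ThreadGroup& operator=(const ThreadGroup&) = delete;

    template <class F>
    void spawn(F&& f) {
        threads_.emplace_back(std::forward<F>(f));
    }

    ~ThreadGroup() {
        for (std::thread& t : threads_) {
            // joinable() filters out moved-from or already-joined threads,
            // which is the invalid_argument case, not no_such_process.
            if (t.joinable()) {
                t.join();
            }
        }
    }

private:
    std::vector<std::thread> threads_;
};

// Hypothetical usage: start n threads and let the destructor join them.
int run_group(int n) {
    std::atomic<int> counter{0};
    {
        ThreadGroup group;
        for (int i = 0; i < n; ++i) {
            group.spawn([&counter] { ++counter; });
        }
    }  // ~ThreadGroup joins every thread here
    return counter.load();
}
```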

Why doesn't locking a std::mutex block the thread?

I wrote the following code to test my understanding of std::mutex:
#include <mutex>
using namespace std;
int main() {
mutex m;
m.lock();
m.lock(); // expect to block the thread
}
And then I got a system_error: device or resource busy. Isn't the second m.lock() supposed to block the thread?
From std::mutex:
A calling thread must not own the mutex prior to calling lock or try_lock.
and from std::mutex::lock:
If lock is called by a thread that already owns the mutex, the program may deadlock. Alternatively, if an implementation can detect the deadlock, a resource_deadlock_would_occur error condition may be observed.
and the exceptions clause:
Throws std::system_error when errors occur, including errors from the underlying operating system that would prevent lock from meeting its specifications. The mutex is not locked in the case of any exception being thrown.
Therefore it is not supposed to block the thread. On your platform, the implementation appears to be able to detect when a thread is already the owner of a lock and raise an exception. This may not happen on other platforms, as indicated in the descriptions.
Isn't the second m.lock() supposed to block the thread?
No, it gives undefined behaviour. The second m.lock() breaks this requirement:
C++11 30.4.1.2/7 Requires: If m is of type std::mutex or std::timed_mutex, the calling thread does not own the mutex.
It looks like your implementation is able to detect that the calling thread owns the mutex and gives an error; others may block indefinitely, or fail in other ways.
(std::mutex wasn't mentioned in the question when I wrote this answer.)
It depends on the mutex library and mutex type you're using - you haven't told us. Some systems provide a "recursive mutex" that is allowed to be called multiple times like this only if it happens from the same thread (then you must have a matching number of unlocks before another thread can lock it), other libraries consider this an error and may fail gracefully (as yours has) or have undefined behaviour.
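Two well-defined alternatives to the double lock can be sketched as follows. The function names are made up for illustration. std::recursive_mutex permits re-locking from the owning thread, and probing a held std::mutex is legal only from a different thread.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical demo: the owning thread locks a recursive mutex twice.
int relock_depth() {
    std::recursive_mutex m;
    int depth = 0;
    m.lock();
    ++depth;
    m.lock();    // allowed: same thread, recursive mutex
    ++depth;
    m.unlock();  // every lock needs a matching unlock
    m.unlock();
    return depth;
}

// Hypothetical demo: while this thread holds a plain std::mutex,
// another thread's try_lock() fails instead of blocking.
bool other_thread_acquired_while_held() {
    std::mutex m;
    bool acquired = true;
    m.lock();
    std::thread t([&] {
        acquired = m.try_lock();
        if (acquired) {
            m.unlock();
        }
    });
    t.join();
    m.unlock();
    return acquired;
}
```

relock_depth() returns 2 and other_thread_acquired_while_held() returns false. Calling try_lock() from the thread that already owns a std::mutex would be undefined behaviour, just like the second lock() in the question.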

Not locking mutex for pthread_cond_timedwait and pthread_cond_signal (on Linux)

Is there any downside to calling pthread_cond_timedwait without taking a lock on the associated mutex first, and also not taking a mutex lock when calling pthread_cond_signal ?
In my case there is really no condition to check, I want a behavior very similar to Java wait(long) and notify().
According to the documentation, there can be "unpredictable scheduling behavior". I am not sure what that means.
An example program seems to work fine without locking the mutexes first.
The first is not OK:
The pthread_cond_timedwait() and
pthread_cond_wait() functions shall
block on a condition variable. They
shall be called with mutex locked by
the calling thread or undefined
behavior results.
http://opengroup.org/onlinepubs/009695399/functions/pthread_cond_timedwait.html
The reason is that the implementation may want to rely on the mutex being locked in order to safely add you to a waiter list. And it may want to release the mutex without first checking it is held.
The second is disturbing:
if predictable scheduling behaviour is
required, then that mutex is locked by
the thread calling
pthread_cond_signal() or
pthread_cond_broadcast().
http://www.opengroup.org/onlinepubs/007908775/xsh/pthread_cond_signal.html
Off the top of my head, I'm not sure what the specific race condition is that messes up scheduler behaviour if you signal without taking the lock. So I don't know how bad the undefined scheduler behaviour can get: for instance maybe with broadcast the waiters just don't get the lock in priority order (or however your particular scheduler normally behaves). Or maybe waiters can get "lost".
Generally, though, with a condition variable you want to set the condition (at least a flag) and signal, rather than just signal, and for this you need to take the mutex. The reason is that otherwise, if you're concurrent with another thread calling wait(), then you get completely different behaviour according to whether wait() or signal() wins: if the signal() sneaks in first, then you'll wait for the full timeout even though the signal you care about has already happened. That's rarely what users of condition variables want, but may be fine for you. Perhaps this is what the docs mean by "unpredictable scheduler behaviour" - suddenly the timeslice becomes critical to the behaviour of your program.
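The "set a flag and signal" pattern can be sketched in C++ as follows. This swaps the pthread calls for std::mutex and std::condition_variable, and the Event class and its member names are hypothetical. Because the flag is set under the mutex, a signal that arrives before the wait is not lost.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical auto-reset event, roughly comparable to Java's wait(long)/notify()
// but with a flag so that an early signal is remembered.
class Event {
public:
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;  // the condition itself, protected by the mutex
        }
        cv_.notify_one();
    }

    // Returns true if the event was signaled before the timeout elapsed.
    // A successful wait consumes the signal.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_);
        if (!cv_.wait_for(lock, timeout, [this] { return signaled_; })) {
            return false;  // timed out with the flag still clear
        }
        signaled_ = false;
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};
```

With an Event e, a wait_for(10 ms) before any signal() times out and returns false; after e.signal(), the next wait_for returns true immediately even though nobody was waiting when the signal was sent.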
Btw, in Java you have to have the lock in order to notify() or notifyAll():
This method should only be called by a
thread that is the owner of this
object's monitor.
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Object.html#notify()
The Java synchronized {/}/wait/notify/notifyAll behaviour is analogous to pthread_mutex_lock/pthread_mutex_unlock/pthread_cond_wait/pthread_cond_signal/pthread_cond_broadcast, and not by coincidence.
Butenhof's excellent "Programming with POSIX Threads" discusses this right at the end of chapter 3.3.3.
Basically, signalling the condvar without locking the mutex is a potential performance optimisation: if the signalling thread has the mutex locked, then the thread waking on the condvar has to immediately block on the mutex that the signalling thread has locked even if the signalling thread is not modifying any of the data the waiting thread will use.
The reason that "unpredictable scheduler behavior" is mentioned is this: if you have a high-priority thread waiting on the condvar (which another thread is going to signal, waking up the high-priority thread), any other lower-priority thread can come along and lock the mutex, so that when the condvar is signalled and the high-priority thread is awakened, it has to wait for the lower-priority thread to release the mutex. If the mutex is locked whilst signalling, then the higher-priority thread will be scheduled on the mutex before the lower-priority thread: basically, you know that when you "awaken" the high-priority thread it will awaken as soon as the scheduler allows it (of course, you might have to wait on the mutex before signalling the high-priority thread, but that's a different issue).
The point of waiting on a condition variable paired with a mutex is to atomically enter the wait and release the lock, i.e. to allow other threads to modify the protected state, and then again atomically receive notification of the state change and acquire the lock. What you describe can be done with many other methods like pipes, sockets, signals, or (probably the most appropriate) semaphores.
I think this should work (note untested code):
// initialize a semaphore
sem_t sem;
sem_init(&sem,
0, // not shared
0 // initial value of 0
);
// thread A (needs <semaphore.h>, <time.h>, <errno.h>; msecs is the timeout in milliseconds)
struct timespec tm;
// sem_timedwait takes an absolute CLOCK_REALTIME deadline; ftime() is obsolete
clock_gettime(CLOCK_REALTIME, &tm);
tm.tv_sec += msecs / 1000;
tm.tv_nsec += (msecs % 1000) * 1000000L;
if(tm.tv_nsec >= 1000000000L) {
tm.tv_nsec -= 1000000000L;
tm.tv_sec++;
}
// wait until timeout or woken up, retrying if interrupted by a signal handler
int rc;
while((rc = sem_timedwait(&sem, &tm)) == -1 && errno == EINTR) {
continue;
}
return rc == -1 && errno == ETIMEDOUT; // returns true if a timeout occurred
// thread B
sem_post(&sem); // wake up Thread A early
Conditions should be signaled outside of the mutex whenever possible. Mutexes are a necessary evil in concurrent programming. Their use leads to contention which robs the system of the maximum performance that it can gain from the use of multiple processors.
The purpose of a mutex is to guard access to some shared variables in the program so that they behave atomically. When a signaling operation is done inside a mutex, it pulls hundreds of irrelevant machine cycles into the critical section, cycles which have nothing to do with guarding the shared data. Potentially, it calls from user space all the way into the kernel.
The notes about "predictable scheduler behavior" in the standard are completely bogus.
When we want the machine to execute statements in a predictable, well-defined order, the tool for that is the sequencing of statements within a single thread of execution: S1 ; S2. Statement S1 is "scheduled" before S2.
We use threads when we realize that some actions are independent and their scheduling order is not important, and there are performance benefits to be realized, like more timely response to real time events or computing on multiple processors.
At times when scheduling orders do become important among multiple threads, this falls under a concept called priority. Priority resolves what happens first when any one of N statements could potentially be scheduled to execute. Another tool for ordering under multithreading is queuing. Events are placed into a queue by one or more threads and a single service thread processes the events in the queue order.
The bottom line is, the placement of pthread_cond_broadcast is not an appropriate tool for controlling execution order. It will not make execution order predictable in the sense that the program suddenly has exactly the same, reproducible behavior on every platform.
"unpredictable scheduling behavior" means just that. You don't know what's going to happen.
Nor does the implementation. It could work as expected. It could crash your app. It could work fine for years, then a race condition makes your app go monkey. It could deadlock.
Basically, if any docs suggest that anything undefined/unpredictable can happen unless you do what the docs tell you to do, you'd better do it. Otherwise stuff might blow up in your face.
(And it won't blow up until you put the code into production, just to annoy you even more. At least that's my experience.)