pthread_cond_wait not waking up from pthread_cond_broadcast - c++

In my program, one part of the code waits to be woken up by another part.
Here's the part that goes to sleep:
void flush2device(int task_id) {
if (pthread_mutex_lock(&id2cvLock) != SUCCESS) {
cerr << "system error - exiting!!!\n";
exit(1);
}
map<int,pthread_cond_t*>::iterator it;
it = id2cv.find(task_id);
if(it == id2cv.end()){
if (pthread_mutex_unlock(&id2cvLock) != SUCCESS) {
cerr << "system error\n UNLOCKING MUTEX flush2device\n";
exit(1);
}
return;
}
cout << "Waiting for CV signal" <<endl;
if(pthread_cond_wait(it->second, &id2cvLock)!=SUCCESS){
cerr << "system error\n COND_WAIT flush2device - exiting!!!\n";
exit(1);
}
cout << "should be right after " << task_id << " signal" << endl;
if (pthread_mutex_unlock(&id2cvLock) != SUCCESS) {
cerr << "system error\n UNLOCKING MUTEX flush2device -exiting!!!\n";
exit(1);
}
}
In another part of code, there's the waking up part (signaling):
//id2cv is a map <int, pthread_cond_t*> variable. - the value is a pointer to the cv on
//which we call with the broadcast method.
if(pthread_mutex_lock(&id2cvLock)!=SUCCESS){
cerr <<"system error\n";
exit(1);
}
id2cv.erase(nextBuf->_taskID);
cout << "In Thread b4 signal, i'm tID " <<nextBuf->_taskID << endl;
if (pthread_cond_broadcast(nextBuf->cv) != 0) {
cerr << "system error SIGNAL_CV doThreads\n";
exit(1);
}
cout << "In doThread, after erasing id2cv " << endl;
if(pthread_mutex_unlock(&id2cvLock)!=SUCCESS){
cerr <<"system error\n";
exit(1);
}
Most runs work just fine, but once in a while the program just stops "reacting": the first method (above) never gets past the cond_wait call. It seems that nobody sends it the signal in time (or it fails for some other reason), while the other method (which the last piece of code is part of) keeps running.
Where do I go wrong in the logic of mutexes and signaling? I've already checked that the pthread_cond_t variable is still "alive" before the calls to cond_wait and cond_broadcast, and nothing in that area seems to be at fault.

Despite its name, pthread_cond_wait is an unconditional wait for a condition. You must not call pthread_cond_wait unless you have confirmed that there is something to wait for, and the thing it's waiting for must be protected by the associated mutex.
Condition variables are stateless and it is the application's responsibility to store the state of the thing being waited for, called a 'predicate'.
The canonical pattern is:
pthread_mutex_lock(&mutex);
while(!ready_for_me_to_do_something)
pthread_cond_wait(&condvar, &mutex);
do_stuff();
ready_for_me_to_do_something=false; // this may or may not be appropriate
pthread_mutex_unlock(&mutex);
and:
pthread_mutex_lock(&mutex);
ready_for_me_to_do_something=true;
pthread_cond_broadcast(&condvar);
pthread_mutex_unlock(&mutex);
Notice how this code maintains the state in the ready_for_me_to_do_something variable and the waiting thread waits in a loop until that variable is true. Notice how the mutex protects that shared variable, and it protects the condition variable (because that is also shared between the threads).
This is not the only correct way to use a condition variable, but it is very easy to run into trouble with any other use. You call pthread_cond_wait even if there is no reason to wait. If you wait for your sister to get home with the car before you use it, and she has already returned, you will be waiting a long time.

Your use of pthread_cond_wait() is not correct. If a condition variable is signalled while no threads are waiting, the signal has no effect. It's not saved for the next time a thread waits. This means that correct use of pthread_cond_wait() looks like:
pthread_mutex_lock(&mutex);
/* ... */
while (!should_wake_up)
pthread_cond_wait(&cond, &mutex);
The should_wake_up condition might just be a simple test of a flag variable, or it might be something like a more complicated test for a buffer being empty or full, or something similar. The mutex must be locked to protect against concurrent modifications that might change the result of should_wake_up.
It is not clear what that test should be in your program - you might need to add a specific flag variable.
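A minimal sketch of how such a flag could be attached to each task, assuming a hypothetical TaskState record that pairs the condition variable with its predicate. The names TaskState, id2state, id2stateLock, register_task and mark_flushed are illustrative and not from the question, and return-code checks are left out for brevity:

```cpp
#include <pthread.h>
#include <map>

// Hypothetical per-task record: the condition variable plus the predicate it guards.
struct TaskState {
    pthread_cond_t cv;
    bool flushed;   // predicate: becomes true once the task has been written out
};

static pthread_mutex_t id2stateLock = PTHREAD_MUTEX_INITIALIZER;
static std::map<int, TaskState*> id2state;   // protected by id2stateLock

// Register a task before handing it to the worker thread.
// (Cleanup of finished entries is omitted in this sketch.)
void register_task(int task_id) {
    TaskState* st = new TaskState;
    pthread_cond_init(&st->cv, nullptr);
    st->flushed = false;
    pthread_mutex_lock(&id2stateLock);
    id2state[task_id] = st;
    pthread_mutex_unlock(&id2stateLock);
}

// Waiter: returns at once if the task is unknown or was already flushed.
void flush2device(int task_id) {
    pthread_mutex_lock(&id2stateLock);
    auto it = id2state.find(task_id);
    if (it != id2state.end()) {
        TaskState* st = it->second;
        while (!st->flushed)                       // re-check after every wakeup
            pthread_cond_wait(&st->cv, &id2stateLock);
    }
    pthread_mutex_unlock(&id2stateLock);
}

// Signaler: record the state change first, then wake any waiters.
void mark_flushed(int task_id) {
    pthread_mutex_lock(&id2stateLock);
    auto it = id2state.find(task_id);
    if (it != id2state.end()) {
        it->second->flushed = true;
        pthread_cond_broadcast(&it->second->cv);
    }
    pthread_mutex_unlock(&id2stateLock);
}
```

Because the flag is set under the mutex, it no longer matters whether mark_flushed runs before or after flush2device: a waiter that arrives late sees flushed == true and never blocks.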

I don't think there's enough code in the "waking up" part, but my initial guess is that the pthread_cond_wait hasn't been entered at the time pthread_cond_broadcast is issued.
Another possibility is that pthread_cond_wait is in the middle of a spurious wakeup and misses the signal completely.
I'm pretty sure that most uses of condition variables also have an external predicate that must be checked after every wakeup to see if there is work to be done.


How to stop one thread from two parallel threads?

I have a program in which we can monitor 2 objects at same time.
myThread = new thread (thred1, id);
vec.push_back (myThread);
In the thred1 function, I use a Boolean function to read the stored values from a different vector, and the threads run in parallel like this:
element found 2 -- hj
HUMIDITY-1681692777 DISPLAYED IN RH
element found 1 -- hj
TEMPERATURE--1714636915 IN DEGREE CELSIUS
This keeps on running as that is what my program should do.
I have a case where I need to get an ID from the user and stop that particular thread, while the other one keeps running until I stop it. Can someone help me with that?
void thred1 (int id)
{
bool err = false;
while (stopThread == false)
{
for (size_t i = 0; i < v.size (); i++)
{
if (id == v[i]->id)
{
cout << "element found " << v[i]->id << " -- " << v[i]->name << endl;
v[i]->Read ();
this_thread::sleep_for (chrono::seconds (4));
err = true;
break;
}
}
if (!err)
{
cout << "element not found" << endl;
break;
}
}
}
Suspension
1. If you only want to pause the monitor thread temporarily (for example, while making changes to the shared data), a mutex is enough. Lock it before accessing the shared vector and unlock it when you're done, so that only one thread can access the data at a time.
2. You can actively suspend the thread, and resume it when it's ready, using OS support such as SuspendThread and ResumeThread on Windows.
Termination
1. You could use an event for each monitor thread; naming it after the ID would work. At each iteration the monitor checks for its termination event and ends the thread if the event is set.
2. Pass some variable to each thread, store them in a map with the thread handle being the key, and similar to the previous option just check the value for each iteration.
3. Store all threads in a map with the handle as key, terminating it directly with OS support.
Honestly there are a ton of ways to do this, the best implementation depends on why exactly you want to stop the monitor thread. Any sort of synchronization object like a mutex should be fine if you're reading from one thread and writing from another. Otherwise, just storing all threads with the internal ID as key and the thread as the value should be fine for terminating monitor threads on demand.
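A minimal sketch of the "one variable per thread, stored in a map" idea, keyed here by the monitor ID rather than the thread handle. The names Monitor, monitors, monitor_loop, start_monitor and stop_monitor are hypothetical, and the reads counter merely stands in for the real v[i]->Read() call:

```cpp
#include <atomic>
#include <chrono>
#include <map>
#include <memory>
#include <thread>

// Hypothetical bookkeeping: one stop flag and one thread object per monitor ID.
struct Monitor {
    std::atomic<bool> stop{false};
    std::atomic<int> reads{0};      // stands in for the real Read() call
    std::thread worker;
};

// Only the controlling thread touches this map, so it needs no lock here.
std::map<int, std::unique_ptr<Monitor>> monitors;

void monitor_loop(Monitor* m) {
    while (!m->stop.load()) {       // checked once per iteration
        ++m->reads;
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

// Returns false if a monitor with that ID is already running.
bool start_monitor(int id) {
    if (monitors.count(id) != 0) return false;
    std::unique_ptr<Monitor> m(new Monitor);
    m->worker = std::thread(monitor_loop, m.get());
    monitors[id] = std::move(m);
    return true;
}

// Returns false if no monitor with that ID is running.
bool stop_monitor(int id) {
    auto it = monitors.find(id);
    if (it == monitors.end()) return false;
    it->second->stop = true;        // ask only this thread to finish
    it->second->worker.join();      // wait until it has actually ended
    monitors.erase(it);
    return true;
}
```

Calling stop_monitor(1) ends only the thread started for ID 1; the others keep running until their own ID is passed in. With the 4-second sleep from the question, the join may take up to one sleep interval to return.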

Deadlock using std::mutex to protect cout in multiple threads

Using cout in multiple threads might result in interleaved output.
So I tried to protect cout with a mutex.
The following code starts 10 background threads with std::async. When a thread starts, it prints "Started thread ...".
The main thread iterates over the futures of the background threads in the order in which they were created and prints out "Done thread ..." when the corresponding thread finished.
The output is synchronized correctly, but after some threads have started and some have finished (see output below), a deadlock occurs. All remaining background threads and the main thread are waiting for the mutex.
What is the reason for the deadlock?
When the print function is left or one iteration of the for loop ends, the lock_guard should unlock the mutex, so that one of the waiting threads would be able to proceed.
Why are all the threads left starving?
Code
#include <future>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>
using namespace std;
std::mutex mtx; // mutex for critical section
int print_start(int i) {
lock_guard<mutex> g(mtx);
cout << "Started thread" << i << "(" << this_thread::get_id() << ") " << endl;
return i;
}
int main() {
vector<future<int>> futures;
for (int i = 0; i < 10; ++i) {
futures.push_back(async(print_start, i));
}
//retrieve and print the value stored in the future
for (auto &f : futures) {
lock_guard<mutex> g(mtx);
cout << "Done thread" << f.get() << "(" << this_thread::get_id() << ")" << endl;
}
cin.get();
return 0;
}
Output
Started thread0(352)
Started thread1(14944)
Started thread2(6404)
Started thread3(16884)
Done thread0(16024)
Done thread1(16024)
Done thread2(16024)
Done thread3(16024)
Your problem lies in the use of future::get:
Returns the value stored in the shared state (or throws its exception)
when the shared state is ready.
If the shared state is not yet ready (i.e., the provider has not yet
set its value or exception), the function blocks the calling thread
and waits until it is ready.
http://www.cplusplus.com/reference/future/future/get/
So if the thread behind the future didn't get to run yet, the function blocks until that thread finishes. However, you take ownership of the mutex before calling future::get, so whichever thread you're waiting for will not be able to attain the mutex for itself.
This should fix your deadlock problem:
int value = f.get();
lock_guard<mutex> g(mtx);
cout << "Done thread" << value << "(" << this_thread::get_id() << ")" << endl;
You lock the mutex and then wait for one of the futures, which in turn requires a lock on the mutex itself. Simple rule: Don't wait with locked mutexes.
BTW: Locking output streams is not very effective, because it can easily be circumvented by code you don't even control. Rather than using those globals, give a stream to code that needs to output something (dependency injection) and then collect the data from that stream in a threadsafe way. Or use a logging library, because that's probably what you wanted to do anyway.
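As a rough sketch of the stream-injection idea combined with the fix above, assuming a hypothetical LineLogger helper (not part of any library) that owns the mutex and writes whole lines to whatever stream it is given:

```cpp
#include <functional>
#include <future>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: serialises whole lines onto a stream supplied by the caller.
class LineLogger {
public:
    explicit LineLogger(std::ostream& out) : out_(out) {}
    void line(const std::string& text) {
        std::lock_guard<std::mutex> g(mtx_);   // held only while writing
        out_ << text << '\n';
    }
private:
    std::ostream& out_;
    std::mutex mtx_;
};

int print_start(LineLogger& log, int i) {
    log.line("Started thread " + std::to_string(i));
    return i;
}

// Runs n tasks and returns the sum of their results.
int run_tasks(LineLogger& log, int n) {
    std::vector<std::future<int>> futures;
    for (int i = 0; i < n; ++i)
        futures.push_back(std::async(std::launch::async, print_start, std::ref(log), i));
    int sum = 0;
    for (auto& f : futures) {
        int value = f.get();                   // wait first, with no lock held
        log.line("Done thread " + std::to_string(value));
        sum += value;
    }
    return sum;
}

// Convenience wrapper: runs n tasks against an in-memory stream and returns the text.
std::string run_and_capture(int n) {
    std::ostringstream out;
    LineLogger log(out);
    run_tasks(log, n);
    return out.str();
}
```

The lock lives inside line() and is released before anything can block, so no caller can end up waiting on a future while holding it. Passing std::cout to the LineLogger constructor gives the console behaviour from the question.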
It is good that the cause was spotted from the source. Quite often, though, the error is not so easy to locate, and the cause may be different too. Fortunately, in the case of a deadlock you can use a debugger to investigate it.
I compiled and ran your example, then after attaching to it with gdb (gcc 4.9.2/Linux), there is a backtrace (noisy implementation details skipped):
#0 __lll_lock_wait ()
...
#5 0x0000000000403140 in std::lock_guard<std::mutex>::lock_guard (
this=0x7ffe74903320, __m=...) at /usr/include/c++/4.9/mutex:377
#6 0x0000000000402147 in print_start (i=0) at so_deadlock.cc:9
...
#23 0x0000000000409e69 in ....::_M_complete_async() (this=0xdd4020)
at /usr/include/c++/4.9/future:1498
#24 0x0000000000402af2 in std::__future_base::_State_baseV2::wait (
this=0xdd4020) at /usr/include/c++/4.9/future:321
#25 0x0000000000404713 in std::__basic_future<int>::_M_get_result (
this=0xdd47e0) at /usr/include/c++/4.9/future:621
#26 0x0000000000403c48 in std::future<int>::get (this=0xdd47e0)
at /usr/include/c++/4.9/future:700
#27 0x000000000040229b in main () at so_deadlock.cc:24
This is just what is explained in the other answers: the code in the locked section (so_deadlock.cc:24) calls future::get(), which in turn (by forcing the result) tries to acquire the lock again.
It might not be that simple in other cases, where there are usually several threads, but it's all there.

Unexpected behavior with mutex::try_lock()

I have tried the mutex::try_lock() member in a program, which does the following:
1) It deliberately locks a mutex in a parallel thread.
2) In the main thread, it tries to lock the mutex using try_lock().
a) If the lock isn't acquired, it adds chars to a string.
b) When the lock is acquired, it prints the string.
I have tested this program on 2 online compilers:
1) On coliru, (which has a thread::hardware_concurrency() of 1), the program is here:
int main()
{
/// Lock the mutex for 1 nanosecond.
thread t {lock_mutex};
job();
t.join();
}
/// Lock the mutex for 1 nanosecond.
void lock_mutex()
{
m.lock();
this_thread::sleep_for(nanoseconds {1});
m.unlock();
}
void job()
{
cout << "starting job ..." << endl;
int lock_attempts {};
/// Try to lock the mutex.
while (!m.try_lock())
{
++lock_attempts;
/// Lock not acquired.
/// Append characters to the string.
append();
}
/// Unlock the mutex.
m.unlock();
cout << "lock attempts = " << lock_attempts
<< endl;
/// Lock acquired.
/// Print the string.
print();
}
/// Append characters to the string
void append()
{
static int count = 0;
s.push_back('a');
/// For every 5 characters appended,
/// append a space.
if (++count == 5)
{
count = 0;
s.push_back(' ');
}
}
/// Print the string.
void print()
{
cout << s << endl;
}
Here, the program output is as expected:
starting job ...
lock attempts = 2444
aaaaa aaaaa aaaaa ...
However, here, if I remove the following statement from the program:
cout << "starting job ..." << endl;
the output shows:
lock attempts = 0
Why does this happen?
2) On the other hand when I try this program (even locking for 1 second rather than 1 nanosecond) on ideone - here - I always get an output showing:
lock attempts = 0
This happens even if the diagnostic "starting job" is present in the program.
ideone has a thread::hardware_concurrency() of 8.
In other words, I successfully get the lock immediately. Why does this happen?
Note that this is NOT a case of try_lock() spuriously failing. In that case, though there is no existing lock on the mutex, the member returns false, indicating an unsuccessful locking attempt.
Here, the OPPOSITE appears to be happening. Though a lock (apparently) exists on the mutex, the member returns true, indicating a new lock has been successfully taken! Why?
Streaming std::endl into cout flushes the stream. The flush is a call into the kernel, which gives the lock_mutex thread plenty of time (some nanoseconds :) ) to start running and take the lock. Without that output statement, lock_mutex has not started yet when the main thread calls try_lock(), so the first attempt succeeds.
Because of that call into the kernel, you might see this even on a single-core system.
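To take thread start-up timing out of the experiment, the locking thread can report that it holds the mutex before the main thread starts trying. The following is a sketch under that assumption; hold_mutex and count_failed_attempts are hypothetical names, and the promise/future pair is only one of several ways to do the handshake:

```cpp
#include <future>
#include <mutex>
#include <thread>
#include <utility>

std::mutex m;

// Locks m, reports that it owns the lock, and keeps it until told to release.
void hold_mutex(std::promise<void> locked, std::future<void> release) {
    std::lock_guard<std::mutex> g(m);
    locked.set_value();     // tell the main thread the mutex is now owned
    release.wait();         // keep owning it until the main thread is done
}

// Calls try_lock() `tries` times while another thread owns m and
// returns how many of those calls failed.
int count_failed_attempts(int tries) {
    std::promise<void> locked;
    std::promise<void> release;
    std::future<void> is_locked = locked.get_future();
    std::thread t(hold_mutex, std::move(locked), release.get_future());

    is_locked.wait();       // from here on the other thread definitely owns m
    int failed = 0;
    for (int i = 0; i < tries; ++i) {
        if (m.try_lock())
            m.unlock();     // cannot happen while the other thread owns m
        else
            ++failed;
    }
    release.set_value();    // let the other thread unlock and finish
    t.join();
    return failed;
}
```

With the handshake in place every attempt fails, regardless of output statements or the number of cores, because try_lock() is never called before the other thread has acquired the mutex.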

Synchronising main thread and worker thread

In Qt, from the main (GUI) thread I am creating a worker thread to perform a certain operation, which accesses a resource shared by both threads. On a certain action in the GUI, the main thread has to manipulate the resource. I tried using QMutex to lock that particular resource. The resource is continuously used by the worker thread; how do I notify the main thread about this?
Tried using QWaitCondition but it was crashing the application.
Is there any other option to notify and achieve synchronisation between threads?
Attached the code snippet.
void WorkerThread::IncrementCounter()
{
qDebug() << "In Worker Thread IncrementCounter function" << endl;
while(stop == false)
{
mutex.lock();
for(int i = 0; i < 100; i++)
{
for(int j = 0; j < 100; j++)
{
counter++;
}
}
qDebug() << counter;
mutex.unlock();
}
qDebug() << "In Worker Thread Aborting " << endl;
}
//Manipulating the counter value by main thread.
void WorkerThread::setCounter(int value)
{
waitCondition.wait(&mutex);
counter = value;
waitCondition.notify_one();
}
You are using the wait condition completely wrong.
I urge you to read up on mutexes and conditions, and maybe look at some examples.
wait() will block execution until either notify_one() or notify_all() is called somewhere. Which of course cannot happen in your code.
You cannot wait() a condition on one line and then expect the next two lines to ever be called if they contain the only wake up calls.
What you want is to wait() in one thread and notify_XXX() in another.
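A sketch of that split, written with the standard library's std::mutex and std::condition_variable as stand-ins for QMutex and QWaitCondition so that it compiles without Qt. The names counterMutex, valueReady, hasValue and waitForValue are assumptions made for the illustration:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex counterMutex;
std::condition_variable valueReady;
bool hasValue = false;   // predicate guarded by counterMutex
int counter = 0;         // shared resource guarded by counterMutex

// Waiting side: blocks until another thread has supplied a value.
int waitForValue() {
    std::unique_lock<std::mutex> lock(counterMutex);
    valueReady.wait(lock, [] { return hasValue; });   // loops on the predicate
    hasValue = false;
    return counter;
}

// Notifying side: changes the shared state under the lock, then wakes the waiter.
void setCounter(int value) {
    {
        std::lock_guard<std::mutex> lock(counterMutex);
        counter = value;
        hasValue = true;
    }
    valueReady.notify_one();
}
```

The wait and the notify sit in different functions that are called from different threads, so the notifying call is always reachable. If the main thread only needs to overwrite the counter, locking the same mutex in setCounter may already be enough, without any wait condition.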
You could use shared memory from within the same process. Each thread could lock it before writing it, like this:
QSharedMemory *shared=new QSharedMemory("Test Shared Memory");
if(shared->create(1,QSharedMemory::ReadWrite))
{
shared->lock();
// Copy some data to it
char *to = (char*)shared->data();
const char *from = &dataBuffer;
memcpy(to, from, dataSize);
shared->unlock();
}
You should also lock it for reading. If you need strings, reading them can be easier than writing them, provided they are zero-terminated. Convert with .toLatin1() to get a zero-terminated string whose size you can query. You might get access that multiple threads can read from with shared->attach(), but that's more for reading the shared memory of a different process.
You might just use this instead of mutexes. I think that if you try to lock it while something else already holds the lock, the call will just block until the other process unlocks it.

Thread is blocking whole program

I'm trying to make a multiclient server. I have this thread:
void client_thread(int new_socket)
{
int size;
char inbuffer[BUF];
do
{
cout << "Waiting for messages: " << endl;
size = recv(new_socket, inbuffer, BUF, 0);
} while (true);
}
and this main procedure:
int main()
{
while (true)
{
//waiting for clients
cout << "Waiting for connections..." << endl;
new_socket = accept ( create_socket, (struct sockaddr *) &cliaddress, &addrlen );
//new client connected
if (new_socket > 0)
{
//start thread
thread(client_thread, new_socket).join();
}
}
return 0;
}
When the first client connects, the thread starts and the server is waiting for messages from him. But the server doesn't wait for new clients anymore. I don't know why. Is it because of the infinite do-while loop inside the thread-function? What's the point of threads if they block your whole program if they contain infinite loops?
The main routine blocks, because it waits for the thread to finish: join().
If you don't want to block, then don't join() your client_thread.
This exception might come from the destruction of your anonymous thread object. When you leave the scope of if() all objects in this scope are destroyed. From http://en.cppreference.com/w/cpp/thread/thread/~thread you can see, the destructor calls terminate(). To avoid it, you can call detach(). So instead of thread(client_thread, new_socket).join();, you must say thread(client_thread, new_socket).detach();.
You are supposed to create a thread and keep a reference to it until you have joined it. In your code, the thread object is destroyed right after being created, hence the error if you don't call join immediately.
To achieve this properly, the best way is to allocate the object on the heap, using the operator new and store the pointer in a list somewhere. When the thread is done it may remove itself from the list (don't forget to "mutex" it), or you could have another dedicated thread do that: perhaps you could simply have your main thread join all the other threads before exiting.
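One way to keep the thread objects alive without blocking the accept loop is sketched below. Note that it swaps in a std::vector of std::thread objects instead of pointers allocated with new, and that client_thread here is only a stand-in that counts calls instead of looping on recv(). The names handled and serve are hypothetical:

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> handled{0};

// Stand-in for the real client_thread, which would loop on recv().
void client_thread(int new_socket) {
    (void)new_socket;   // unused in this sketch
    ++handled;
}

// Starts one thread per accepted "connection" without waiting for it,
// then joins every thread at shutdown. Returns the number of clients handled.
int serve(const std::vector<int>& sockets) {
    std::vector<std::thread> clients;
    for (int s : sockets)
        clients.emplace_back(client_thread, s);   // no join() here, so the loop continues
    for (std::thread& t : clients)
        t.join();                                 // join only when shutting down
    return handled.load();
}
```

In the real server, the first loop corresponds to the accept() loop and the second one runs only when the server exits. Since every thread object stays in the vector until it is joined, no destructor runs on a joinable thread.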