Mutex blocking the thread in a producer-consumer problem - c++

I have a fundamental question regarding the producer consumer problem. Please consider the below pseudo-code.
// Consumer. This is in the thread I created for asynchronous log writing
// so that I don't block the important threads with my silly log-writing
// operation
run()
{
    mutex.lock(); // Line A
    retrieveFromQueueAndWriteToFile();
    mutex.unlock();
}
// Producer. This function gets log messages from 'x' number of threads
add( string mylog )
{
    mutex.lock(); // Line B, consider internal call to pthread_mutex_lock
    Queue.push(mylog);
    mutex.unlock();
}
While the log-writing operation is in progress in the consumer function, the mutex is held there. So when a new log message comes in, the mutex cannot be obtained at Line B in the add function. This blocks the important threads while the log is being written.
Is this not the same as writing the log to the file from the important threads themselves? I don't see the point of creating a new thread to write the logs to a file when the important threads are being blocked anyway.
Any help appreciated.

You can split retrieveFromQueueAndWriteToFile into two functions: retrieveFromQueue and writeLogToFile.
Something like this:
run()
{
    mutex.lock(); // Line A
    auto log = retrieveFromQueue();
    mutex.unlock();
    writeLogToFile(log); // this operation does not need a lock
}
Note:
If run is called by only one thread in an infinite loop, only the push to and the pop from the queue happen under the lock; the write to the file is done by that thread without holding it.
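As a minimal sketch of this split, assuming C++11 standard-library primitives instead of the question's pseudo-code (the AsyncLog class and its drainTo method are hypothetical names), the consumer can swap the whole queue out under the lock and do the I/O afterwards:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <ostream>
#include <queue>
#include <sstream>   // only needed if you drain into a string stream
#include <string>
#include <utility>

// Hypothetical logger; the names are illustrative, not from the question.
class AsyncLog {
public:
    // Producer side, called by the "important" threads.
    // The lock is held only for the duration of one queue push.
    void add(std::string msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(msg));
    }

    // Consumer side: one pass of the writer thread's loop.
    // Returns the number of messages written to 'out'.
    std::size_t drainTo(std::ostream& out) {
        std::queue<std::string> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(batch, queue_);   // take everything queued so far
        }                               // lock released before any I/O
        std::size_t written = 0;
        for (; !batch.empty(); batch.pop(), ++written)
            out << batch.front() << '\n';
        return written;
    }

private:
    std::mutex mutex_;
    std::queue<std::string> queue_;
};
```

With this shape, add() can be delayed only by another push or by the swap, never by a file write.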

Related

How to know whether detached std::thread has finished its execution?

I have a function like following in which threads acquire a lock by using std::lock_guard mutex and write to the file via ofstream.
When the current file size exceeds the max size, I create an independent thread that compresses the file and should then terminate.
If the log file is big in size (say ~500MB), it takes around 25+ seconds to compress.
I detach the compress thread since no other thread (or main) wants to wait for this thread to finish.
But I need to know that the compress thread is not running before the execution of following line:
_compress_thread(compress_log, _logfile).detach();
Sample code snippet:
void log (std::string message)
{
    // Lock using mutex
    std::lock_guard<std::mutex> lck(mtx);
    _outputFile << message << std::endl;
    _outputFile.flush();
    _sequence_number++;
    _curr_file_size = _outputFile.tellp();
    if (_curr_file_size >= max_size) {
        // Code to close the file stream, rename the file, and reopen
        ...
        // Create an independent thread to compress the file since
        // it takes some time to compress huge files.
        if (the_compress_thread_is_not_already_running) //pseudo code
        {
            _compress_thread(compress_log, _logfile).detach();
        }
    }
}
In the above if condition i.e. the_compress_thread_is_not_already_running, how can I be sure that the compress thread is not running?
void compress_log (std::string s)
{
    // Compress the file
    // ...
}
It is not possible to detect whether a detached thread of execution has terminated.
If for some reason you need to guarantee that at most one thread is compressing at a time, a simple solution is to use std::async. It returns a future object, which you can query to find out whether the associated task has finished. The same effect can be achieved in a less structured way with a detached thread by modifying a shared variable at the end of the function (note that access to the shared variable must be synchronised).
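A rough sketch of the std::async variant follows. The Compressor wrapper and the empty compress_log body are placeholders, and tryStart is assumed to be called with the logger's mutex held (as inside log() in the question), so the future itself needs no extra locking:

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>

// Placeholder for the real compression routine.
void compress_log(std::string /*path*/) {}

// Hypothetical helper that remembers the future of the last compression.
class Compressor {
public:
    // Launches a compression unless the previous one is still in progress.
    // Returns true if a new task was started.
    bool tryStart(const std::string& path) {
        if (isRunning())
            return false;
        task_ = std::async(std::launch::async, compress_log, path);
        return true;
    }

    // A zero-length wait_for polls the future without blocking.
    bool isRunning() const {
        return task_.valid() &&
               task_.wait_for(std::chrono::seconds(0)) != std::future_status::ready;
    }

    // Blocks until the current task, if any, has finished.
    void wait() {
        if (task_.valid())
            task_.wait();
    }

private:
    std::future<void> task_;
};
```

Keep in mind that a future obtained from std::async waits for its task in its destructor, so whatever object owns the Compressor will block on destruction until a running compression has finished.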
Another approach could be to constantly keep alive a compression thread, but block it as long as there is no work to be done. The thread can be notified using a condition variable to start its work and once finished, resume blocking until next notification.
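The keep-alive approach might look roughly like the sketch below. CompressionWorker is a made-up name, and the actual compression is passed in as a callable so that the example stays self-contained; the worker sleeps on a condition variable until a file name is submitted:

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class CompressionWorker {
public:
    explicit CompressionWorker(std::function<void(const std::string&)> compress)
        : compress_(std::move(compress)), thread_([this] { run(); }) {}

    // Finishes any pending files, then stops the thread.
    ~CompressionWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        wake_.notify_one();
        thread_.join();
    }

    // Called by the logger after it has renamed the full log file.
    void submit(std::string path) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(path));
        }
        wake_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // Sleep until there is work or a stop request.
            wake_.wait(lock, [this] { return stop_ || !pending_.empty(); });
            if (pending_.empty())
                return;                 // stop requested and nothing left
            std::string path = std::move(pending_.front());
            pending_.pop();
            lock.unlock();              // do not hold the lock while compressing
            compress_(path);
            lock.lock();
        }
    }

    std::function<void(const std::string&)> compress_;
    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::string> pending_;
    bool stop_ = false;
    std::thread thread_;                // declared last: started after the rest exists
};
```

Because a single thread drains the queue, at most one compression runs at a time, and the logger never needs to ask whether the previous one has finished.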
P.S. You might want to first close the file stream, rename the file, and reopen while you hold the lock so that other threads may keep logging into a fresh file while the previous logs - now in the renamed file - are being compressed.

How to block a thread while other threads are waiting

I have a very specific problem to solve. I'm pretty sure someone else in the world has already encountered and solved it but I didn't find any solutions yet.
Here it is :
I have a thread that pops commands from a queue and executes them asynchronously.
From any other thread, I can call a function that executes a command synchronously, bypassing the queue mechanism, returning a result, and taking priority of execution (after the current execution is over).
I have a mutex protecting command execution, so only one command is executed at a time.
The problem is that, with a simple mutex, I have no certainty that a synchronous call will get the mutex before the asynchronous thread when they conflict. In fact, our tests show that the allocation is very unfair and that the asynchronous thread always wins.
So I want to block the asynchronous thread while a synchronous call is waiting. I don't know in advance how many synchronous calls can be made, and I don't control the threads that make the calls (so any solution using a pool of threads is not possible).
I'm using C++ and the Microsoft libraries. I know the basic synchronization objects, but maybe there is a more advanced object or method suitable for my problem that I don't know about.
I'm open to any idea!
OK, so I finally got the chance to close this. I tried some of the solutions proposed here and in the link posted.
In the end, I combined a mutex for the command execution with a counter of waiting sync calls (the counter is also protected by a mutex, of course).
The async thread checks the counter before trying to get the mutex and waits for the counter to reach 0. Also, to avoid a loop with sleep, I added an event that is set when the counter reaches 0. The async thread waits for this event before trying to get the mutex.
void incrementSyncCounter()
{
    DLGuardThread guard(_counterMutex);
    _synchCount++;
}
void decrementSyncCounter()
{
    DLGuardThread guard(_counterMutex);
    _synchCount--;
    // If the counter is 0, it means that no other sync call is waiting, so we notify the main thread by setting the event
    if(_synchCount == 0)
    {
        _counterEvent.set();
    }
}
unsigned long getSyncCounter()
{
    DLGuardThread guard(_counterMutex);
    return _synchCount;
}
bool executeCommand(Command* command)
{
    // Increment the sync call counter so the main thread can be locked while at least one sync call is waiting
    incrementSyncCounter();
    // Execute the command using mutex protection
    DLGuardThread guard(_theCommandMutex);
    bool res = command->execute();
    guard.release();
    // Decrement the sync call counter so the main thread can be unlocked if there is no sync call waiting
    decrementSyncCounter();
    return res;
}
void main ()
{
    [...]
    // Infinite loop
    while(!_bStop)
    {
        // While the synchronous call counter is not 0, this main thread is locked to give priority to the sync calls.
        // _counterEvent will be set when the counter is decremented to 0, then this thread will check the value once again to be sure no other call has arrived in between.
        while(getSyncCounter() > 0)
        {
            ::WaitForSingleObject (_counterEvent.hEvent(), INFINITE);
        }
        // Take mutex
        DLGuardThread guard(_theCommandMutex);
        status = command->execute();
        // Release mutex
        guard.release();
    }
}
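For readers without the DLGuardThread and event wrappers, roughly the same scheme can be sketched with standard C++11 primitives. This is an approximation of the code above, not a translation of it: a std::condition_variable stands in for the Windows event, commands are modelled as std::function<bool()>, and PriorityExecutor, executeSync and executeAsync are made-up names.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>

class PriorityExecutor {
public:
    // Called from any thread: announces itself, then competes for the
    // command mutex while the queue thread stays out of the way.
    bool executeSync(const std::function<bool()>& cmd) {
        {
            std::lock_guard<std::mutex> lock(counterMutex_);
            ++waitingSync_;
        }
        bool result;
        {
            std::lock_guard<std::mutex> exec(commandMutex_);
            result = cmd();   // assumed not to throw in this sketch
        }
        {
            std::lock_guard<std::mutex> lock(counterMutex_);
            --waitingSync_;
        }
        noSyncWaiting_.notify_all();
        return result;
    }

    // Called only from the queue-processing thread: waits until no
    // synchronous call is pending before taking the command mutex.
    bool executeAsync(const std::function<bool()>& cmd) {
        {
            std::unique_lock<std::mutex> lock(counterMutex_);
            noSyncWaiting_.wait(lock, [this] { return waitingSync_ == 0; });
        }
        std::lock_guard<std::mutex> exec(commandMutex_);
        return cmd();
    }

    unsigned long pendingSync() {
        std::lock_guard<std::mutex> lock(counterMutex_);
        return waitingSync_;
    }

private:
    std::mutex commandMutex_;          // one command at a time
    std::mutex counterMutex_;          // protects waitingSync_
    std::condition_variable noSyncWaiting_;
    unsigned long waitingSync_ = 0;
};
```

As in the original, a synchronous call that arrives between the counter check and the lock on the command mutex still has to wait for one asynchronous command to finish.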

How can I use Boost condition variables in producer-consumer scenario?

EDIT: below
I have one thread responsible for streaming data from a device in buffers. In addition, I have N threads doing some processing on that data. In my setup, I would like the streamer thread to fetch data from the device, and wait until the N threads are done with the processing before fetching new data or a timeout is reached. The N threads should wait until new data has been fetched before continuing to process. I believe that this framework should work if I don't want the N threads to repeat processing on a buffer and if I want all buffers to be processed without skipping any.
After careful reading, I found that condition variables are what I need. I have followed tutorials and other Stack Overflow questions, and this is what I have:
global variables:
boost::condition_variable cond;
boost::mutex mut;
member variables:
std::vector<double> buffer;
std::vector<bool> data_ready; // Size equal to number of threads
data receiver loop (1 thread runs this):
while (!gotExitSignal())
{
    {
        boost::unique_lock<boost::mutex> ll(mut);
        while(any(data_ready))
            cond.wait(ll);
    }
    receive_data(buffer);
    {
        boost::lock_guard<boost::mutex> ll(mut);
        set_true(data_ready);
    }
    cond.notify_all();
}
data processing loop (N threads run this)
while (!gotExitSignal())
{
    {
        boost::unique_lock<boost::mutex> ll(mut);
        while(!data_ready[thread_id])
            cond.wait(ll);
    }
    process_data(buffer);
    {
        boost::lock_guard<boost::mutex> ll(mut);
        data_ready[thread_id] = false;
    }
    cond.notify_all();
}
These two loops are in their own member functions of the same class. The variable buffer is a member variable, so it can be shared across threads.
The receiver thread will be launched first. The data_ready variable is a vector of bools of size N. data_ready[i] is true if data is ready to be processed and false if the thread has already processed data. The function any(data_ready) outputs true if any of the elements of data_ready is true, and false otherwise. The set_true(data_ready) function sets all of the elements of data_ready to true. The receiver thread will check if any processing thread still is processing. If not, it will fetch data, set the data_ready flags, notify the threads, and continue with the loop which will stop at the beginning until processing is done. The processing threads will check their respective data_ready flag to be true. Once it is true, the processing thread will do some computations, set its respective data_ready flag to 0, and continue with the loop.
If I have only one processing thread, the program runs fine. Once I add more threads, I run into issues where the output of the processing is garbage. In addition, the order of the processing threads matters for some reason; in other words, the LAST thread I launch will output correct data whereas the previous threads will output garbage, no matter what the input parameters are for the processing (assuming valid parameters). I don't know if the problem is due to my threading code or if there is something wrong with my device or data processing setup. I tried adding couts at the processing and receiving steps, and with N processing threads I see the expected output:
receive data
process 1
process 2
...
process N
receive data
process 1
process 2
...
Is the usage of the condition variables correct? What could be the problem?
EDIT: I followed fork's suggestions and changed the code to:
data receiver loop (1 thread runs this):
while (!gotExitSignal())
{
    if(!any(data_ready))
    {
        receive_data(buffer);
        boost::lock_guard<boost::mutex> ll(mut);
        set_true(data_ready);
        cond.notify_all();
    }
}
data processing loop (N threads run this)
while (!gotExitSignal())
{
    // boost::unique_lock<boost::mutex> ll(mut);
    boost::mutex::scoped_lock ll(mut);
    cond.wait(ll);
    process_data(buffer);
    data_ready[thread_id] = false;
}
It works somewhat better. Am I using the correct locks?
I did not read your whole story, but looking at the code quickly I can see that you are using condition variables incorrectly.
A condition is like a state: once you put a thread into a waiting condition, it gives up the CPU. So your thread will effectively stop running until some other thread notifies it.
In your code you have a while loop, and each time you check for data you wait. Rechecking the predicate in a loop around the wait is needed in itself, because a wait may wake up spuriously; the problem is where the check happens. The check for data should be done somewhere else, and your worker thread should put itself into the waiting condition after it has done its work.
Your worker threads are the consumers. And the producers are the ones that deliver the data.
I think a better construction would be to make a thread check if there is data and notify the worker(s).
PSEUDO CODE:
//producer
while (true) {
    1. lock mutex
    2. is data available
    3. unlock mutex
    if (dataAvailableVariable) {
        4. notify a worker
        5. set waiting condition
    }
}
//consumer
while (true) {
    1. lock mutex
    2. do some work
    3. unlock mutex
    4. notify producer that work is done
    5. set wait condition
}
You should also make sure that at least one thread always stays runnable: if all threads end up in the waiting condition at the same time, you have a deadlock.
I hope that helps you a little.
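For comparison, here is a sketch of the hand-off described in the question (one receiver, N processors, one ready flag per processor), written with the standard-library equivalents of the Boost types so that it compiles without Boost. Every wait goes through a predicate, because a wait may return spuriously and a notification sent before the wait starts would otherwise be missed. The runRounds function and its fake "received" data are illustrative assumptions, not the asker's code.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Runs 'rounds' hand-offs between one receiver and 'n' processors.
// Returns, per processor, the sum of all buffer values it saw.
std::vector<long> runRounds(std::size_t n, int rounds) {
    std::mutex mut;
    std::condition_variable cond;
    std::vector<double> buffer(4, 0.0);
    std::vector<char> dataReady(n, 0);   // one flag per processor
    std::vector<long> sums(n, 0);
    bool done = false;

    auto allIdle = [&] {
        return std::none_of(dataReady.begin(), dataReady.end(),
                            [](char f) { return f != 0; });
    };

    auto processor = [&](std::size_t id) {
        for (;;) {
            {
                std::unique_lock<std::mutex> lock(mut);
                cond.wait(lock, [&] { return dataReady[id] != 0 || done; });
                if (dataReady[id] == 0)
                    return;              // shutdown, nothing left to process
            }
            // No lock needed here: the receiver leaves 'buffer' alone
            // while any ready flag is still set.
            for (double v : buffer)
                sums[id] += static_cast<long>(v);
            {
                std::lock_guard<std::mutex> lock(mut);
                dataReady[id] = 0;
            }
            cond.notify_all();
        }
    };

    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n; ++i)
        threads.emplace_back(processor, i);

    for (int r = 1; r <= rounds; ++r) {
        {
            std::unique_lock<std::mutex> lock(mut);
            cond.wait(lock, allIdle);    // wait for the previous round
        }
        std::fill(buffer.begin(), buffer.end(), static_cast<double>(r));
        {
            std::lock_guard<std::mutex> lock(mut);
            std::fill(dataReady.begin(), dataReady.end(), char(1));
        }
        cond.notify_all();
    }
    {
        std::unique_lock<std::mutex> lock(mut);
        cond.wait(lock, allIdle);
        done = true;
    }
    cond.notify_all();
    for (auto& t : threads)
        t.join();
    return sums;
}
```

Each processor sees every round exactly once. If real code still produces garbage with this structure, the cause is more likely in how buffer is filled or read than in the locking.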

invoke methods in thread in c++

I have a class that reads from a message queue. This class also has a thread inside it. Depending on the type of the message in the queue, it needs to execute different functions inside that thread, because the main thread in the class always keeps waiting on the message queue. As soon as it reads a message from the queue, it checks its type, calls the appropriate method to be executed in the thread, and then goes back to reading (reading in a while loop).
I am using the Boost message queue and Boost threads.
How can I do this?
It's something like this:
while(!quit) {
    try
    {
        ptime now(boost::posix_time::microsec_clock::universal_time());
        ptime timeout = now + milliseconds(100);
        if (mq.timed_receive(&msg, sizeof(msg), recvd_size, priority, timeout))
        {
            switch(msg.type)
            {
            case collect:
                {
                    // need to call collect method in thread
                }
                break;
            case query:
                {
                    // need to call query method in thread
                }
                break;
and so on.
Can it be done?
If it can be done, what happens when the thread is, say, executing the collect method and the main thread gets a query message and wants to call the query method?
Thanks in advance.
Messages arriving while the receiving thread is executing long operations will be stored for later (in the queue, waiting to be processed).
If the thread is done with its operation, it will come back and call the receive function again, and immediately get the first of the messages that arrived while it was not looking and can process it.
If the main thread needs the result of the message processing operation, it will block until the worker thread is done and delivers the result.
Make sure you do not do anything inside the worker thread that in turn waits on the main thread's actions, otherwise there is the risk of a deadlock.
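One way to realise "call a method in the thread" is to hand the worker a queue of callables. The sketch below uses the C++11 standard library rather than Boost; Worker, MsgType and dispatch are hypothetical names, the lambdas stand in for the real collect and query methods, and the message-queue receive loop is left out.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// A thread that executes posted tasks one after another, in order.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}
    ~Worker() {
        post(std::function<void()>());   // an empty task means "quit"
        thread_.join();
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return !tasks_.empty(); });
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            if (!task)
                return;
            task();                      // runs outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    std::thread thread_;                 // declared last: started after the rest exists
};

enum class MsgType { collect, query };

// Receiver side: map the message type to a call executed by the worker.
void dispatch(Worker& worker, MsgType type, std::vector<std::string>& log) {
    switch (type) {
    case MsgType::collect:
        worker.post([&log] { log.push_back("collect"); });
        break;
    case MsgType::query:
        worker.post([&log] { log.push_back("query"); });
        break;
    }
}
```

A query that arrives while collect is still running simply waits in tasks_ until the worker picks it up, so the receiving loop never blocks on the worker.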

c++ Handling multiple threads in a main thread

I am a bit new to multi threading, so forgive me if these questions are too trivial.
My application needs to create multiple threads in a thread and perform actions from each thread.
For example, I have a set of files to read, say 50 and I create a thread to read these files using CreateThread() function.
Now this main thread creates 4 threads to access the file. 1st thread is given file 1, second file 2 and so on.
After the 1st thread has finished reading file 1 and has given the main thread the required data, the main thread needs to invoke it with file 5 and obtain data from it. The same goes for all other threads until all 50 files are read.
After that, each thread is destroyed and finally my main thread is destroyed.
The issues I am facing are:
1) How do I stop a thread from exiting after it has read a file?
2) How do I invoke the thread again with another file name?
3) How would my child thread give information to the main thread?
4) After a thread has finished reading a file and returned data to the main thread, how would the main thread know which thread provided the data?
Thanks
This is a very common problem in multi-threaded programming. You can view this as a producer-consumer problem: the main thread "produces" tasks which are "consumed" by the worker threads (see e.g. http://www.mario-konrad.ch/blog/programming/multithread/tutorial-06.html). You might also want to read about "thread pools".
I would highly recommend reading up on Boost's synchronization facilities (http://www.boost.org/doc/libs/1_50_0/doc/html/thread.html) and using Boost's threading functionality, as it is platform independent and good to use.
To be more specific to your question: you should create a queue of operations to be done (usually it's the same queue for all worker threads; if you really want to ensure that thread 1 performs tasks 1, 5, 9 ... you might want to have one queue per worker thread). Access to this queue must be synchronized by a mutex, and waiting threads can be notified through condition_variables when new data is added to the queue.
1.) don't exit the thread function but wait until a condition is fired and then restart using a while ([exit condition not true]) loop
2.) see 1.
3.) through any variable to which both have access and which is secured by a mutex (e.g. a result queue)
4.) by adding this information as the result written to the result queue.
Another piece of advice: it's always hard to get multi-threading correct, so try to be as careful as possible and write tests to detect deadlocks and race conditions.
The typical solution for this kind of problem is using a thread pool and a queue. The main thread pushes all files/filenames to a queue, then starts a thread pool, i.e. several threads, each of which takes an item from the queue and processes it. When one item is processed, the thread goes on to the next one (if by then the queue is not yet empty). The main thread knows everything is processed when the queue is empty and all threads have exited.
So, 1) and 2) are somewhat conflicting: you don't stop the thread and invoke it again, it just keeps running as long as it finds items on the queue.
For 3) you can again use a queue in which the thread puts information, and from which the main thread reads. For 4) you could give each thread an id and put that together with the data. However normally the main thread should not need to know which thread exactly processed data.
Some very basic pseudocode to give you an idea, locking for threadsafety omitted:
//main
for( all filenames )
    queue.push_back( filename );
//start some threads
threadPool.StartThreads( 4, CreateThread( queue ) );
//wait for threads to end
threadPool.Join();
//thread
class Thread
{
public:
    Thread( queue& q ) : q( q ) {}
    void Start();
    bool Join();
    void ThreadFun()
    {
        //keep taking items until the queue is empty
        while( auto nextQueueItem = q.pop_back() )
            ProcessItem( nextQueueItem );
    }
private:
    queue& q;
};
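With the locking filled in, the same idea could be sketched as follows using C++11 threads. processAll and Result are invented for illustration, and taking the length of the file name stands in for actually reading the file; each result carries the id of the worker that produced it, which covers question 4.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

struct Result {
    std::size_t worker;   // id of the thread that produced this result
    std::string file;
    std::size_t bytes;    // placeholder for "the required data"
};

std::vector<Result> processAll(std::vector<std::string> files, std::size_t workers) {
    std::queue<std::string> work;
    for (auto& f : files)
        work.push(std::move(f));

    std::vector<Result> results;
    std::mutex mutex;     // guards both 'work' and 'results'

    auto worker = [&](std::size_t id) {
        for (;;) {
            std::string file;
            {
                std::lock_guard<std::mutex> lock(mutex);
                if (work.empty())
                    return;              // nothing left: the thread ends
                file = std::move(work.front());
                work.pop();
            }
            // Stand-in for reading the file; done without the lock.
            Result r{id, file, file.size()};
            std::lock_guard<std::mutex> lock(mutex);
            results.push_back(std::move(r));
        }
    };

    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < workers; ++i)
        pool.emplace_back(worker, i);
    for (auto& t : pool)
        t.join();                        // all files are processed after this
    return results;
}
```

Because the queue is filled before the threads start, no condition variable is needed here: a worker that finds the queue empty can simply exit.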
Whether or not you use a thread pool to execute your synchronous file reads, it boils down to a chain of functions, or groups of functions, that have to run serialized. So let's assume you have found a way to execute functions in parallel (be it by starting one thread per function or by using a thread pool). To wait for the first 4 files to be read, you can use a queue that the reading threads push their results into; the fifth function then pulls 4 results out of the queue (the queue blocks when empty) and processes them. If there are more dependencies between functions, you can add more queues between them. Sketch:
void read_file( const std::string& name, queue& q )
{
    file_content f = .... // read file
    q.push( f );
}
void process4files( queue& q )
{
    std::vector< file_content > result;
    for ( int i = 0; i != 4; ++i )
        result.push_back( q.pop() );
    // now 4 files are read ...
    assert( result.size() == 4u );
}
queue q;
// std::ref: a thread copies its arguments, but all threads must share one queue
thread t1( &read_file, "file1", std::ref( q ) );
thread t2( &read_file, "file2", std::ref( q ) );
thread t3( &read_file, "file3", std::ref( q ) );
thread t4( &read_file, "file4", std::ref( q ) );
thread t5( &process4files, std::ref( q ) );
t1.join(); t2.join(); t3.join(); t4.join();
t5.join();
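The sketch relies on a queue whose pop blocks while the queue is empty. A minimal version of such a queue, assuming C++11 and with BlockingQueue, FileContent and readAll as made-up names (the "file contents" are faked), could look like this:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

template <typename T>
class BlockingQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        notEmpty_.notify_one();
    }

    // Blocks until an item is available.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable notEmpty_;
    std::queue<T> items_;
};

using FileContent = std::string;   // placeholder for real file data

void readFile(const std::string& name, BlockingQueue<FileContent>& q) {
    q.push("contents of " + name); // stand-in for real file I/O
}

// Starts one reader per name and collects exactly that many results.
std::vector<FileContent> readAll(const std::vector<std::string>& names) {
    BlockingQueue<FileContent> q;
    std::vector<std::thread> readers;
    for (const std::string& name : names)
        readers.emplace_back(readFile, name, std::ref(q));

    std::vector<FileContent> result;
    for (std::size_t i = 0; i != names.size(); ++i)
        result.push_back(q.pop()); // blocks until a reader delivers
    for (std::thread& t : readers)
        t.join();
    return result;
}
```

The results arrive in completion order, not in the order of names, so anything that needs to match content to file should push the name along with the data.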
I hope you get the idea.
Torsten