I don't know if this is good practice or not but I am doing work on a real time stream of input data and using pthreads in lockstep order to allow one thread at a time to do different operations at the same. This is my program flow for each thread:
void * my_thread() {
pthread_mutex_lock(&read_mutex);
/*
read data from a stream such as stdin into global buffer
*/
pthread_mutex_lock(&operation_mutex);
pthread_mutex_unlock(&read_mutex);
/*
perform some work on the data you read
*/
pthread_mutex_lock(&output_mutex);
pthread_mutex_unlock(&operation_mutex);
/*
Write the data to output such as stdout
*/
pthread_mutex_unlock(&output_mutex);
}
I know there is pthread conditional lock, but is my approach a good idea or a bad idea? I tested this on various size streams and I am trying to think of corner cases to make this deadlock, produce race condition, or both. I know mutexes don't guarantee thread order execution but I need help to think of scenarios that will break this.
UPDATE:
I stepped away from this, but had sometime recently to rethink about this. I rewrote the code using C++ threads and mutexes. I am trying to use condition variables but have no such luck. This is my approach to the problem:
void my_thread_v2() {
//Let only 1 thread read in at a time
std::unique_lock<std::mutex> stdin_lock(stdin_mutex);
stdin_cond.wait(stdin_lock);
/*
Read from stdin stream
*/
//Unlock the stdin mutex
stdin_lock.unlock();
stdin_cond.notify_one();
//Lock step
std::unique_lock<std::mutex> operation_lock(operation_mutex);
operation_cond.wait(operation_lock);
/*
Perform work on the data that you read in
*/
operation_lock.unlock();
operation_cond.notify_one();
std::unique_lock<std::mutex> stdout_lock(stdout_mutex);
stdout_cond.wait(stdout_lock);
/*
Write the data out to stdout
*/
//Unlock the stdout mutex
stdout_lock.unlock();
stdout_cond.notify_one();
}
I know the issue with this code is that there is no way to signal the first condition. I definitely am not understanding the proper use of the condition variable. I looked at various examples on cpp references, but can't seem to get away from the thought that the initial approach maybe the only way of doing what I want to do which is to lock step the threads. Can someone shed some light on this?
UPDATE 2:
So I implemented a simple Monitor class that utilizes C++ condition_variable and unique_lock:
class ThreadMonitor{
public:
ThreadMonitor() : is_occupied(false) {}
void Wait() {
std::unique_lock<std::mutex> lock(mx);
while(is_occupied) {
cond.wait(lock);
}
is_occupied = true;
}
void Notify() {
std::unique_lock<std::mutex> lock(mx);
is_occupied = false;
cond.notify_one();
}
private:
bool is_occupied;
std::condition_variable cond;
std::mutex mx;
};
This is my initial approach assuming i have three ThreadMonitors called stdin_mon, operation_mon, and stdout_mon:
void my_thread_v3() {
//Let only 1 thread read in at a time
stdin_mon.Wait();
/*
Read from stdin stream
*/
stdin_mon.Notify();
operation_mon.Wait();
/*
Perform work on the data that you read in
*/
operation_mon.Notify();
stdout_mon.Wait();
/*
Write the data out to stdout
*/
//Unlock the stdout
stdout_mon.notify();
}
The issue with this was that the data was still being corrupted so I had to change back to the original logic of lock stepping the threads:
void my_thread_v4() {
//Let only 1 thread read in at a time
stdin_mon.Wait();
/*
Read from stdin stream
*/
operation_mon.Wait();
stdin_mon.Notify();
/*
Perform work on the data that you read in
*/
stdout_mon.Wait();
operation_mon.Notify();
/*
Write the data out to stdout
*/
//Unlock the stdout
stdout_mon.notify();
}
I am beginning to suspect that if thread order matters that this is the only way to handle it. I am also questioning what the benefit is of using a Monitor that utilizes condition_variable over just using a mutex.
The problem with your approach is that you still can modify the data while another thread is reading it:
Thread A acquired read, then operation and released read again, and starts writing some data, but is interrupted.
Now thread B operates, acquires read and can read the partially modified, possibly inconsistent data!
I assume you want to allow multiple threads reading the same data without blocking, but as soon as writing, the data shall be protected. Finally, while outputting data, we are just reading the modified data again and thus can do this concurrently again, but need to prevent simultaneous write.
Instead of having multiple mutex instances, you can do this better with a read/write mutex:
Any function only reading the data acquires the read lock.
Any function intending to write acquires write lock right from the start (be aware that first acquiring read, then write lock without releasing the read lock in between can result in dead-lock; if you release read lock in between, though, your data handling needs to be robust against data being modified by another thread in between as well!).
Reducing write lock to shared without releasing in between is safe, so we can do so now before outputting. If data must not be modified in between writing data and outputting it, we even need to do this without entirely releasing the lock.
Last point is problematic as not supported neither by C++ standard's thread support library nor by pthreads library.
For C++ boost provides a solution; if you don't want to or cannot (C!) use boost a simple, but possibly not most efficient approach would be protecting acquiring write lock via another mutex:
acquire standard (non-rw) mutex protecting the read write mutex
acquire RW mutex for writing
release protecting mutex
read data, write modified data
acquire protecting mutex
release RW mutex
re-acquire RW mutex for reading; it does not matter if another thread acquired for reading as well, we only need to protect against locking for write here
release protecting mutex
output
release RW mutex (no need to protect)...
Non-modifying functions can just acquire the read lock without any further protection, there aren't any conflicts with...
In C++, you'd prefer using the thread support library and additionally gain platform independent code for free, in C, you would use a standard pthread mutex for protecting acquiring the write lock just as you did before and use the RW variants from pthread for the read write lock.
Related
This question already has an answer here:
C++ thread safety without atomics [closed]
(1 answer)
Closed 1 year ago.
If I have a struct with some members in it, and I want to have differents threads that read into these structs. In this case, I don't need to do anything because I only read, but if I ever want to write to a member, do I have to use atomics and force everyone to explicitly load the atomic everywhere? Just because of one single write that I occasionally do?
Make the variables as private and access them through getters and setters, and guard their access with a std::scoped_lock:
in cpp:
std::mutex aMutex;
int getInteger() const
{
std::scoped_lock<std::mutex> aLock(aMutex);
return m_integer;
}
void setInteger(int iInteger)
{
std::scoped_lock<std::mutex> aLock(aMutex);
m_integer = iInteger;
}
If you have multi reader and writer you should use synchronization methods.
in your case because your read and write not in proper order you should synchronize them. imagine you read and write value in same time if you dont have enough luck your reader thread will read fast and store value in register(or other location) and immediately you write new value and therefore your value updated and previous value is gone.
most famous synchronization method is mutex.
example:
#include <mutex>
std::mutex mutex;
mutex.lock();
//enter critical section
mutex.unlock(); //exit critical section
another good synchronization method is reader and writer lock.
in this method if you going to read value you can lock multi times and read value from multi thread but if you going to write when you call lock on writer side other thread cannot read or write value because it is currently under writing.
another good synchronization method is spin lock in this method you just run busy loop until you can gain access for manipulate variable. the good point of this method is that because we run busy loop therefore it's faster than mutex because in mutex you switching to kernel mode which is slow(but beware you should hold spinlock for just little amount of time)
they are many other synchronization method but i recommend you to use mutex or atomic variables.
I have a situation where I have multiple threads accessing the clipboard and performing an action with some sleeps. Basically I don't want to copy the wrong thing to the clipboard and perform something in an incorrect fashion.
Currently, I have an idea of using a mutex to lock a std::string variable and store the required data in it, then to the clipboard. After I finish the action, I unlock it. Then the other thread accesses the variable, then clipboard, and action.
As I have never used a mutex, my question is - will it work by just doing the above?
I'm using the following library to manage the clipboard: https://github.com/dacap/clip
From a quick look at the library, you may not need to use a mutex at all. For eg. the set_text function:
bool set_text(const std::string& value) {
lock l;
if (l.locked()) {
l.clear();
return l.set_data(text_format(), value.c_str(), value.size());
}
else
return false;
}
You can see that a lock is being created (which locks the resource based on the os implementation at construction of lock l) which should provide you some thread safety.
Currently, I have an idea of using a mutex to lock a std::string variable and store the required data in it, then to the clipboard.
From the way you explain this it is clear you do not understand how it works. Mutex does not lock a variable whatever type it has. Mutex is locked itself. And mutex has a property that only one thread can lock it at a time. Using that property you can organize your program the way so only one thread will access some critical data and that will make such access thread safe. It is not that you link a mutex to an unrelated variable, then that mutex would somehow lock that variable and make it thread safe. Now if you understand what is different btw those you would understand that yes you can make access to your std::string variable thread safe but that will require all threads accessing it doing that under mutex lock, not that only one thread locks it and all other will obey that lock magically.
I am trying to use read/write lock in C++ using shared_mutex
typedef boost::shared_mutex Lock;
typedef boost::unique_lock< Lock > WriteLock;
typedef boost::shared_lock< Lock > ReadLock;
class Test {
Lock lock;
WriteLock writeLock;
ReadLock readLock;
Test() : writeLock(lock), readLock(lock) {}
readFn1() {
readLock.lock();
/*
Some Code
*/
readLock.unlock();
}
readFn2() {
readLock.lock();
/*
Some Code
*/
readLock.unlock();
}
writeFn1() {
writeLock.lock();
/*
Some Code
*/
writeLock.unlock();
}
writeFn2() {
writeLock.lock();
/*
Some Code
*/
writeLock.unlock();
}
}
The code seems to be working fine but I have a few conceptual questions.
Q1. I have seen the recommendations to use unique_lock and shared_lock on http://en.cppreference.com/w/cpp/thread/shared_mutex/lock, but I don't understand why because shared_mutex already supports lock and lock_shared methods?
Q2. Does this code have the potential to cause write starvation? If yes then how can I avoid the starvation?
Q3. Is there any other locking class I can try to implement read write lock?
Q1: use of a mutex wrapper
The recommendation to use a wrapper object instead of managing the mutex directly is to avoid unfortunate situation where your code is interrupted and the mutex is not released, leaving it locked forever.
This is the principle of RAII.
But this only works if your ReadLock or WriteLock are local to the function using it.
Example:
readFn1() {
boost::unique_lock< Lock > rl(lock);
/*
Some Code
==> imagine exception is thrown
*/
rl.unlock(); // this is never reached if exception thrown
} // fortunately local object are destroyed automatically in case
// an excpetion makes you leave the function prematurely
In your code this won't work if one of the function is interupted, becaus your ReadLock WriteLock object is a member of Test and not local to the function setting the lock.
Q2: Write starvation
It is not fully clear how you will invoke the readers and the writers, but yes, there is a risk:
as long as readers are active, the writer is blocked by the unique_lock waiting for the mutex to be aquirable in exclusive mode.
however as long as the wrtier is waiting, new readers can obtain access to the shared lock, causing the unique_lock to be further delayed.
If you want to avoid starvation, you have to ensure that waiting writers do get the opportunity to set their unique_lock. For example att in your readers some code to check if a writer is waiting before setting the lock.
Q3 Other locking classes
Not quite sure what you're looking for, but I have the impression that condition_variable could be of interest for you. But the logic is a little bit different.
Maybe, you could also find a solution by thinking out of the box: perhaps there's a suitable lock-free data structure that could facilitate coexistance of readers and writers by changing slightly the approach ?
The types for the locks are ok but instead of having them as member functions create then inside the member functions locktype lock(mymutex). That way they are released on destruction even in the case of an exception.
Q1. I have seen the recommendations to use unique_lock and shared_lock on http://en.cppreference.com/w/cpp/thread/shared_mutex/lock, but I don't understand why because shared_mutex already supports lock and lock_shared methods?
Possibly because unique_lock has been around since c++11 but shared_lock is coming onboard with c++17. Also, [possibly] unique_lock can be more efficient. Here's the original rationale for shared_lock [by the creator] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2406.html and I defer to that.
Q2. Does this code have the potential to cause write starvation? If yes then how can I avoid the starvation?
Yes, absolutely. If you do:
while (1)
writeFn1();
You can end up with a time line of:
T1: writeLock.lock()
T2: writeLock.unlock()
T3: writeLock.lock()
T4: writeLock.unlock()
T5: writeLock.lock()
T6: writeLock.unlock()
...
The difference T2-T1 is arbitrary, based on amount of work being done. But, T3-T2 is near zero. This is the window for another thread to acquire the lock. Because the window is so small, it probably won't get it.
To solve this, the simplest method is to insert a small sleep (e.g. nanosleep) between T2 and T3. You could do this by adding it to the bottom of writeFn1.
Other methods can involve creating a queue for the lock. If a thread can't get the lock, it adds itself to a queue and the first thread on the queue gets the lock when the lock is released. In the linux kernel, this is implemented for a "queued spinlock"
Q3. Is there any other locking class I can try to implement read write lock?
While not a class, you could use pthread_mutex_lock and pthread_mutex_unlock. These implement recursive locks. You could add your own code to implement the equivalent of boost::scoped_lock. Your class can control the semantics.
Or, boost has its own locks.
My problem is quite common I suppose, but it drives me crazy:
I have a multi-threaded application with 5 threads. 4 of these threads do their job, like network communication and local file system access, and then all write their output to a data structure of this form:
struct Buffer {
std::vector<std::string> lines;
bool has_been_modified;
}
The 5th thread prints these buffer/structures to the screen:
Buffer buf1, buf2, buf3, buf4;
...
if ( buf1.has_been_modified ||
buf2.has_been_modified ||
buf3.has_been_modified ||
buf4.has_been_modified )
{
redraw_screen_from_buffers();
}
How do I protect the buffers from being overwritten while they are either being read from or written to?
I can't find a proper solution, although I think this has to be a quiet common problem.
Thanks.
You should use a mutex. The mutex class is std::mutex. With C++11 you can use std::lock_guard<std::mutex> to encapsulate the mutex using RAII. So you would change your Buffer struct to
struct Buffer {
std::vector<std::string> lines;
bool has_been_modified;
std::mutex mutex;
};
and whenever you read or write to the buffer or has_been_modified you would do
std::lock_guard<std::mutex> lockGuard(Buffer.mutex); //Do this for each buffer you want to access
... //Access buffer here
and the mutex will be automatically released by the lock_guard when it is destroyed.
You can read more about mutexes here.
You can use a mutex (or mutexes) around the buffers to ensure that they're not modified by multiple threads at the same time.
// Mutex shared between the multiple threads
std::mutex g_BufferMutex;
void redraw_screen_from_buffers()
{
std::lock_guard<std::mutex> bufferLockGuard(g_BufferMutex);
//redraw here after mutex has been locked.
}
Then your buffer modification code would have to lock the same mutex when the buffers are being modified.
void updateBuffer()
{
std::lock_guard<std::mutex> bufferLockGuard(g_BufferMutex);
// update here after mutex has been locked
}
This contains some mutex examples.
What appears you want to accomplish is to have multiple threads/workers and one observer. The latter needs to do its job only when all workers are done/signal. If this is the case then check code in this SO q/a. std::condition_variable - Wait for several threads to notify observer
mutex are a very nice thing when trying to avoid dataraces, and I'm sure the answer posted by #Phantom will satisfy most people. However, one should know that this is not scalable to large systems.
By locking you are synchronising your threads. As only one at a time can be accessing the vector, on thread writting to the container will cause the other one to wait for it to finish ... with may be good for you but causes serious performance botleneck when high performance is needed.
The best solution would be to use a more complexe lock free structure. Unfortunatelly I don't think there is any standart lockfree structure in the STL. One exemple of lockfree queue is available here
Using such a structure, your 4 working threads would be able to enqueue messages to the container while the 5th one would dequeue them, without any dataraces
More on lockfree datastructure can be found here !
If I have the following code:
#include <boost/date_time.hpp>
#include <boost/thread.hpp>
boost::shared_mutex g_sharedMutex;
void reader()
{
boost::shared_lock<boost::shared_mutex> lock(g_sharedMutex);
boost::this_thread::sleep(boost::posix_time::seconds(10));
}
void makeReaders()
{
while (1)
{
boost::thread ar(reader);
boost::this_thread::sleep(boost::posix_time::seconds(3));
}
}
boost::thread mr(makeReaders);
boost::this_thread::sleep(boost::posix_time::seconds(5));
boost::unique_lock<boost::shared_mutex> lock(g_sharedMutex);
...
the unique lock will never be acquired, because there are always going to be readers. I want a unique_lock that, when it starts waiting, prevents any new read locks from gaining access to the mutex (called a write-biased or write-preferred lock, based on my wiki searching). Is there a simple way to do this with boost? Or would I need to write my own?
Note that I won't comment on the win32 implementation because it's way more involved and I don't have the time to go through it in detail. That being said, it's interface is the same as the pthread implementation which means that the following answer should be equally valid.
The relevant pieces of the pthread implementation of boost::shared_mutex as of v1.51.0:
void lock_shared()
{
boost::this_thread::disable_interruption do_not_disturb;
boost::mutex::scoped_lock lk(state_change);
while(state.exclusive || state.exclusive_waiting_blocked)
{
shared_cond.wait(lk);
}
++state.shared_count;
}
void lock()
{
boost::this_thread::disable_interruption do_not_disturb;
boost::mutex::scoped_lock lk(state_change);
while(state.shared_count || state.exclusive)
{
state.exclusive_waiting_blocked=true;
exclusive_cond.wait(lk);
}
state.exclusive=true;
}
The while loop conditions are the most relevant part for you. For the lock_shared function (read lock), notice how the while loop will not terminate as long as there's a thread trying to acquire (state.exclusive_waiting_blocked) or already owns (state.exclusive) the lock. This essentially means that write locks have priority over read locks.
For the lock function (write lock), the while loop will not terminate as long as there's at least one thread that currently owns the read lock (state.shared_count) or another thread owns the write lock (state.exclusive). This essentially gives you the usual mutual exclusion guarantees.
As for deadlocks, well the read lock will always return as long as the write locks are guaranteed to be unlocked once they are acquired. As for the write lock, it's guaranteed to return as long as the read locks and the write locks are always guaranteed to be unlocked once acquired.
In case you're wondering, the state_change mutex is used to ensure that there's no concurrent calls to either of these functions. I'm not going to go through the unlock functions because they're a bit more involved. Feel free to look them over yourself, you have the source after all (boost/thread/pthread/shared_mutex.hpp) :)
All in all, this is pretty much a text book implementation and they've been extensively tested in a wide range of scenarios (libs/thread/test/test_shared_mutex.cpp and massive use across the industry). I wouldn't worry too much as long you use them idiomatically (no recursive locking and always lock using the RAII helpers). If you still don't trust the implementation, then you could write a randomized test that simulates whatever test case you're worried about and let it run overnight on hundreds of thread. That's usually a good way to tease out deadlocks.
Now why would you see that a read lock is acquired after a write lock is requested? Difficult to say without seeing the diagnostic code that you're using. Chances are that the read lock is acquired after your print statement (or whatever you're using) is completed and before state_change lock is acquired in the write thread.