here is the situation
boost::shared_mutex rwlock;
void test()
{
boost::unique_lock < boost::shared_mutex > writelock(rwlock);
// here we have deadlock
}
int main()
{
boost::shared_lock < boost::shared_mutex > readlock(rwlock);
test();
}
I know that we can do something like that:
{
boost::upgrade_lock<boost::shared_mutex> upgradeable_lock(rwlock); // here we obtain readlock
{
boost::upgrade_to_unique_lock<boost::shared_mutex> uniqueLock(upgradeable_lock); // right now we upgrade readlock to writelock
}
}
but if like in my first example we have another scope we don't see upgradeable_lock. How to solve that issue ?
I assume, that the real code is a lot more difficult, with acquiring read locks all over the place, multiple times in the call stack, and then you need to write somewhere, never anticipated.
I am a guessing a bit here, but if that is true, and you don't want to change that, you'll have to walk up the call path from your writing function, and always release the shared_lock before you do the relevant call, and acquire it again afterwards.
Read / write locks are great, but they tend to mislead developers to use read locks inflationary.
As soon as you can refactor, try and reduce the read locks to just those places, where you really have to read. Keep the critical section as short as possible, and avoid function calls inside, that might acquire that lock as well.
When you have done that, a change to a function, that has to write now as well, will not be a big issue any more. BTW, this will also improve the concurrency, because a writer will have more chances to find a moment, where no reader holds a read lock. You might prefer to do that refactoring now, because it will make life a lot easier afterwards.
Another guess: In case you use these read-locks to have a stable state of the data during a longer process, you might want to rethink that choice now. What you really want then, is some kind of software transactional memory.
Related
I'd like to write a function that is accessible only by a single thread at a time. I don't need busy waits, a brutal 'rejection' is enough if another thread is already running it. This is what I have come up with so far:
std::atomic<bool> busy (false);
bool func()
{
if (m_busy.exchange(true) == true)
return false;
// ... do stuff ...
m_busy.exchange(false);
return true;
}
Is the logic for the atomic exchange correct?
Is it correct to mark the two atomic operations as std::memory_order_acq_rel? As far as I understand a relaxed ordering (std::memory_order_relaxed) wouldn't be enough to prevent reordering.
Your atomic swap implementation might work. But trying to do thread safe programming without a lock is most always fraught with issues and is often harder to maintain.
Unless there's a performance improvement that's needed, then std::mutex with the try_lock() method is all you need, eg:
std::mutex mtx;
bool func()
{
// making use of std::unique_lock so if the code throws an
// exception, the std::mutex will still get unlocked correctly...
std::unique_lock<std::mutex> lck(mtx, std::try_to_lock);
bool gotLock = lck.owns_lock();
if (gotLock)
{
// do stuff
}
return gotLock;
}
Your code looks correct to me, as long as you leave the critical section by falling out, not returning or throwing an exception.
You can unlock with a release store; an RMW (like exchange) is unnecessary. The initial exchange only needs acquire. (But does need to be an atomic RMW like exchange or compare_exchange_strong)
Note that ISO C++ says that taking a std::mutex is an "acquire" operation, and releasing is is a "release" operation, because that's the minimum necessary for keeping the critical section contained between the taking and the releasing.
Your algo is exactly like a spinlock, but without retry if the lock's already taken. (i.e. just a try_lock). All the reasoning about necessary memory-order for locking applies here, too. What you've implemented is logically equivalent to the try_lock / unlock in #selbie's answer, and very likely performance-equivalent, too. If you never use mtx.lock() or whatever, you're never actually blocking i.e. waiting for another thread to do something, so your code is still potentially lock-free in the progress-guarantee sense.
Rolling your own with an atomic<bool> is probably good; using std::mutex here gains you nothing; you want it to be doing only this for try-lock and unlock. That's certainly possible (with some extra function-call overhead), but some implementations might do something more. You're not using any of the functionality beyond that. The one nice thing std::mutex gives you is the comfort of knowing that it safely and correctly implements try_lock and unlock. But if you understand locking and acquire / release, it's easy to get that right yourself.
The usual performance reason to not roll your own locking is that mutex will be tuned for the OS and typical hardware, with stuff like exponential backoff, x86 pause instructions while spinning a few times, then fallback to a system call. And efficient wakeup via system calls like Linux futex. All of this is only beneficial to the blocking behaviour. .try_lock leaves that all unused, and if you never have any thread sleeping then unlock never has any other threads to notify.
There is one advantage to using std::mutex: you can use RAII without having to roll your own wrapper class. std::unique_lock with the std::try_to_lock policy will do this. This will make your function exception-safe, making sure to always unlock before exiting, if it got the lock.
Does is make sense to do something like putting a std::lock_guard in an extra scope so that the locking period is as short as possible?
Pseudo code:
// all used variables beside the lock_guard are created and initialized somewhere else
...// do something
{ // open new scope
std::lock_guard<std::mutex> lock(mut);
shared_var = newValue;
} // close the scope
... // do some other stuff (that might take longer)
Are there more advantages besides having a short lock duration?
What might be negative side effects?
Yes, it certainly makes sense to limit the scope of lock guards to be as short as possible, but not shorter.
The longer you hold a lock, the more likely it is that a thread will block waiting for that lock, which impacts performance as is thus usually considered a bad thing.
However, you must make sure that the program is still correct and that the lock is held at all times when it must be, i.e. when the shared resource protected by the lock is accessed or modified.
There may be one more point to consider (I do not have enough practical experience here to speak with certainty). Locking/releasing a mutex can potentially be an operation with nontrivial performance costs itself. Therefore, it may turn out that keeping a lock for a slightly longer period instead of unlocking & re-locking it several times in the course of one operation can actually improve overall performace. This is something which profiling could show you.
There might be a disadvantage: you cannot protect initializations this way. For example:
{
std::lock_guard<std::mutex> lock(mut);
Some_resource var{shared_var};
} // oops! var is lost
you have to use assignment like this:
Some_resource var;
{
std::lock_guard<std::mutex> lock(mut);
var = shared_Var;
}
which may be suboptimal for some types, for which default initialization followed by assignment is less efficient than directly initializing. Moreover, in some situations, you cannot change the variable after initialization. (e.g. const variables)
user32434999 pointed out this solution:
// use an immediately-invoked temporary lambda
Some_resource var {
[&] {
std::lock_guard<std::mutex> lock(mut);
return shared_var;
} () // parentheses for invoke
};
This way, you can protect the retrieval process, but the initialization itself is still not guarded.
Yes, it makes sense.
There are no other advantages, and there are no side-effects (it is a good way to write it).
An even better way, is to extract it into a private member function (if you have an operation that is synchronized this way, you might as well give the operation its own name):
{
// all used variables beside the lock_guard are created and initialized somewhere else
...// do something
set_var(new_value);
... // do some other stuff (that might take longer)
}
void your_class::set_value(int new_value)
{
std::lock_guard<std::mutex> lock(mut);
shared_var = new_value;
}
Using an extra scope specifically to limit the lifetime of an std::lock_guard object is indeed good practice. As the other answers point out, locking your mutex for the shortest period of time will reduce the chances that another thread will block on the mutex.
I see one more point that was not mentioned in the other answers: transactional operations. Let's use the classical example of a money transfer between two bank accounts. For your banking program to be correct, the modification of the two bank account's balance must be done without unlocking the mutex in between. Otherwise, it would be possible for another thread to lock the mutex while the program is in a weird state where only one of the accounts was credited/debited while the other account's balance was untouched!
With this in mind, it is not enough to ensure that the mutex is locked when each shared resource is modified. Sometimes, you must keep the mutex locked for a period of time spanning the modification of all the shared resources that form a transaction.
EDIT:
If for some reason keeping the mutex locked for the whole duration of the transaction is not acceptable, you can use the following algorithm:
1. Lock mutex, read input data, unlock mutex.
2. Perform all needed computations, save results locally.
3. Lock mutex, check that input data has not changed, perform the transaction with readily available results, unlock the mutex.
If the input data has changed during the execution of step 2, throw away the results and start over with the fresh input data.
I don't see the reason to do it.
If you do something so simple as "set one variable" - use atomic<> and you don't need mutex and lock at all. If you do something complicated - extract this code into new function and use lock in its first line.
I am developing an application in Qt/C++. At some point, there are two threads : one is the UI thread and the other one is the background thread. I have to do some operation from the background thread based on the value of an extern variable which is type of bool. I am setting this value by clicking a button on UI.
header.cpp
extern bool globalVar;
mainWindow.cpp
//main ui thread on button click
setVale(bool val){
globalVar = val;
}
backgroundThread.cpp
while(1){
if(globalVar)
//do some operation
else
//do some other operation
}
Here, writing to globalVar happens only when the user clicks the button whereas reading happens continuously.
So my question is :
In a situation like the one above, is mutex mandatory?
If read and write happens at the same time, does this cause the application to crash?
If read and write happens at same time, is globalVar going to have some value other than true or false?
Finally, does the OS provide any kind of locking mechanism to prevent the read/write operation to access a memory location at the same time by a different thread?
The loop
while(1){
if(globalVar)
//do some operation
else
//do some other operation
}
is busy waiting, which is extremely wasteful. Thus, you're probably better off with some classic synchronization that will wake the background thread (mostly) when there is something to be done. You should consider adapting this example of std::condition_variable.
Say you start with:
#include <thread>
#include <mutex>
#include <condition_variable>
std::mutex m;
std::condition_variable cv;
bool ready = false;
Your worker thread can then be something like this:
void worker_thread()
{
while(true)
{
// Wait until main() sends data
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return ready;});
ready = false;
lk.unlock();
}
The notifying thread should do something like this:
{
std::lock_guard<std::mutex> lk(m);
ready = true;
}
cv.notify_one();
Since it is just a single plain bool, I'd say a mutex is overkill, you should just go for an atomic integer instead. An atomic will read and write in a single CPU clock so no worries there, and it will be lock free, which is always better if possible.
If it is something more complex, then by all means go for a mutex.
It won't crash from that alone, but you can get data corruption, which may crash the application.
The system will not manage that stuff for you, you do it manually, just make sure all access to the data goes through the mutex.
Edit:
Since you specify a number of times that you don't want a complex solution, you may opt for simply using a mutex instead of the bool. There is no need to protect the bool with a mutex, since you can use the mutex as a bool, and yes, you could go with an atomic, but that's what the mutex already does (plus some extra functionality in the case of recursive mutexes).
It also matters what is your exact workload, since your example doesn't make a lot of sense in practice. It would be helpful to know what those some operations are.
So in your ui thread you could simply val ? mutex.lock() : mutex.unlock(), and in your secondary thread you could use if (mutex.tryLock()) doStuff; mutex.unlock(); else doOtherStuff;. Now if the operation in the secondary thread takes too long and you happen to be changing the lock in the main thread, that will block the main thread until the secondary thread unlocks. You could use tryLock(timeout) in the main thread, depending on what you prefer, lock() will block until success, while tryLock(timeout) will prevent blocking but the lock may fail. Also, take care not to unlock from a thread other than the one you locked with, and not to unlock an already unlocked mutex.
Depending on what you are actually doing, maybe an asynchronous event driven approach would be more appropriate. Do you really need that while(1)? How frequently do you perform those operations?
In situation like above does mutex is necessary?
A mutex is one tool that will work. What you actually need are three things:
a means of ensuring an atomic update (a bool will give you this as it's mandated to be an integral type by the standard)
a means of ensuring that the effects of a write made by one thread is actually visible in the other thread. This may sound counter-intuitive but the c++ memory model is single-threaded and optimisations (software and hardware) do not need to consider cross-thread communication, and...
a means of preventing the compiler (and CPU!!) from re-ordering the reads and writes.
The answer to the implied question is 'yes'. You will need something at does all of these things (see below)
If read and write happend at the same time does this cause to crash the application?
not when it's a bool, but the program won't behave as you expect. In fact, because the program is now exhibiting undefined behaviour you can no longer reason about its behaviour at all.
If read and write happens at same time, is globalVar going to have some value other thantrue or false?
not in this case because it's an intrinsic (atomic) type.
And is it going to happen the access(read/write) of a memory location at same time by different thread, does OS providing any kind of locking mechanism to prevent it?
Not unless you specify one.
Your options are:
std::atomic<bool>
std::mutex
std::atomic_signal_fence
Realistically speaking, as long as you use an integer type (not bool), make it volatile, and keep inside of its own cache line by properly aligning its storage, you don't need to do anything special at all.
In situation like above does mutex is necessary?
Only if you want to keep the value of the variable synchronized with other state.
If read and write happed at the same time does this cause to crash the application?
According to C++ standard, it's undefined behavior. So anything can happen: e.g. your application might not crash, but its state might be subtly corrupted. In real life, though, compilers often offer some sane implementation defined behavior and you're fine unless your platform is really weird. Anything commonplace, like 32 and 64 bit intel, PPC and ARM will be fine.
If read and write happens at same time, is globalVar going to have some value other thantrue or false?
globalVar can only have these two values, so it makes no sense to speak of any other values unless you're talking about its binary representation. Yes, it could happen that the binary representation is incorrect and not what the compiler would expect. That's why you shouldn't use a bool but a uint8_t instead.
I wouldn't love to see such flag in a code review, but if a uint8_t flag is the simplest solution to whatever problem you're solving, I say go for it. The if (globalVar) test will treat zero as false, and anything else as true, so temporary "gibberish" is OK and won't have any odd effects in practice. According to the standard, you'll be facing undefined behavior, of course.
And is it going to happen the access(read/write) of a memory location at same time by different thread, does OS providing any kind of locking mechanism to prevent it?
It's not the OS's job to do that.
Speaking of practice, though: on any reasonable platform, the use of a std::atomic_bool will have no overhead over the use of a naked uint8_t, so just use that and be done.
I have some data that is both read and updated by multiple threads. Both reads and writes must be atomic. I was thinking of doing it like this:
// Values must be read and updated atomically
struct SValues
{
double a;
double b;
double c;
double d;
};
class Test
{
public:
Test()
{
m_pValues = &m_values;
}
SValues* LockAndGet()
{
// Spin forver until we got ownership of the pointer
while (true)
{
SValues* pValues = (SValues*)::InterlockedExchange((long*)m_pValues, 0xffffffff);
if (pValues != (SValues*)0xffffffff)
{
return pValues;
}
}
}
void Unlock(SValues* pValues)
{
// Return the pointer so other threads can lock it
::InterlockedExchange((long*)m_pValues, (long)pValues);
}
private:
SValues* m_pValues;
SValues m_values;
};
void TestFunc()
{
Test test;
SValues* pValues = test.LockAndGet();
// Update or read values
test.Unlock(pValues);
}
The data is protected by stealing the pointer to it for every read and write, which should make it threadsafe, but it requires two interlocked instructions for every access. There will be plenty of both reads and writes and I cannot tell in advance if there will be more reads or more writes.
Can it be done more effective than this? This also locks when reading, but since it's quite possible to have more writes then reads there is no point in optimizing for reading, unless it does not inflict a penalty on writing.
I was thinking of reads acquiring the pointer without an interlocked instruction (along with a sequence number), copying the data, and then having a way of telling if the sequence number had changed, in which case it should retry. This would require some memory barriers, though, and I don't know whether or not it could improve the speed.
----- EDIT -----
Thanks all, great comments! I haven't actually run this code, but I will try to compare the current method with a critical section later today (if I get the time). I'm still looking for an optimal solution, so I will get back to the more advanced comments later. Thanks again!
What you have written is essentially a spinlock. If you're going to do that, then you might as well just use a mutex, such as boost::mutex. If you really want a spinlock, use a system-provided one, or one from a library rather than writing your own.
Other possibilities include doing some form of copy-on-write. Store the data structure by pointer, and just read the pointer (atomically) on the read side. On the write side then create a new instance (copying the old data as necessary) and atomically swap the pointer. If the write does need the old value and there is more than one writer then you will either need to do a compare-exchange loop to ensure that the value hasn't changed since you read it (beware ABA issues), or a mutex for the writers. If you do this then you need to be careful how you manage memory --- you need some way to reclaim instances of the data when no threads are referencing it (but not before).
There are several ways to resolve this, specifically without mutexes or locking mechanisms. The problem is that I'm not sure what the constraints on your system is.
Remember that atomic operations is something that often get moved around by the compilers in C++.
Generally I would solve the issue like this:
Multiple-producer-single-consumer by having 1 single-producer-single-consumer per writing thread. Each thread writes into their own queue. A single consumer thread that gathers the produced data and stores it in a single-consumer-multiple-reader data storage. The implementation for this is a lot of work and only recommended if you are doing a time-critical application and that you have the time to put in for this solution.
There are more things to read up about this, since the implementation is platform specific:
Atomic etc operations on windows/xbox360:
http://msdn.microsoft.com/en-us/library/ee418650(VS.85).aspx
The multithreaded single-producer-single-consumer without locks:
http://www.codeproject.com/KB/threads/LockFree.aspx#heading0005
What "volatile" really is and can be used for:
http://www.drdobbs.com/cpp/212701484
Herb Sutter has written a good article that reminds you of the dangers of writing this kind of code:
http://www.drdobbs.com/cpp/210600279;jsessionid=ZSUN3G3VXJM0BQE1GHRSKHWATMY32JVN?pgno=2
I have a class that is accessed from multiple threads. Both of its getter and setter functions are guarded with locks.
Are the locks for the getter functions really needed? If so, why?
class foo {
public:
void setCount (int count) {
boost::lock_guard<boost::mutex> lg(mutex_);
count_ = count;
}
int count () {
boost::lock_guard<boost::mutex> lg(mutex_); // mutex needed?
return count_;
}
private:
boost::mutex mutex_;
int count_;
};
The only way you can get around having the lock is if you can convince yourself that the system will transfer the guarded variable atomicly in all cases. If you can't be sure of that for one reason or another, then you'll need the mutex.
For a simple type like an int, you may be able to convince yourself this is true, depending on architecture, and assuming that it's properly aligned for single-instruction transfer. For any type that's more complicated than this, you're going to have to have the lock.
If you don't have a mutex around the getter, and a thread is reading it while another thread is writing it, you'll get funny results.
Is the mutex really only protecting a single int? It makes a difference -- if it is a more complex datatype you definitely need locking.
But if it is just an int, and you are sure that int is an atomic type (i.e., the processor will not have to do two separate memory reads to load the int into a register), and you have benchmarked the performance and determined you need better performance, then you may consider dropping the lock from both the getter and the setter. If you do that, make sure to qualify the int as volatile. And write a comment explaining why you do not have mutex protection, and under what conditions you would need it if the class changes.
Also, beware that you don't have code like this:
void func(foo &f) {
int temp = f.count();
++temp;
f.setCount(temp);
}
That is not threadsafe, regardless of whether you use a mutex or not. If you need to do something like that, the mutex protection has to be outside the setter/getter functions.
The synchronization concern is already covered in other answers (specifically David Schwartz's).
There's another concern I don't see addressed, though: this is usually a bad design.
Consider David's example code, assuming we have a correctly-synchronized version of foo
{
foo j;
some_func(j);
while (j.count() == 0)
{
// do we still expect (j.count() == 0) here?
bar();
}
}
The code suggests that the while condition still holds in the body. That's how single-threaded code works, after all.
But of course, even if we correctly synchronize the implementation of a getter, the setter can still be called from another thread, between our while condition succeeding and the first instruction of the loop body executing.
So, if any logic in the loop body can't depend on the condition being true, what was the point of testing it?
Sometimes it makes perfect sense, such as
while (foo.shouldKeepRunning())
{
// foo event loop or something
}
where it's OK if our shouldKeepRunning state changes during the loop body, because we only need to test it periodically. However, if you're going to do something with count, you need a longer-lived lock, and an interface to support it:
{
auto guard = j.lock_guard();
while (j.count(guard) == 0) // prove to count that we're locked
{
// now we _know_ count is zero in the body
// (but bar should release and re-acquire the lock or that can never change)
bar(j);
}
} // guard goes out of scope and unlocks
in you case probably not, if your cpu is 32 bit, however if count is a complex object or cpu needs more than one instruction to update its value, then yes
The lock is necessary to serialize access to shared resource. In your specific case you might get away with just atomic integer operations but in general, for larger objects that require more then one bus transaction, you do need locks to guarantee that reader always sees a consistent object.
It depends on the exact implementation of the object being locked. However, in general you do not want someone modifying (setting?) an object while someone else is in the process of reading (getting?) it. The easiest way to prevent that is to have a reader lock it.
In more complicated setups the lock will be implemented in such a way that any number of folks can read at once, but nobody can write to it while anyone is reading, and nobody can read while a write is going on.
They are really needed.
Imagine if you have an instance of class foo that's completely local to some piece of code. And you have something like this:
{
foo j;
some_func(j); // this stashes a reference to j where another thread can find it
while (j.count() == 0)
bar();
}
Suppose the optimizer looks carefully at the code to bar and sees that it can't possibly modify j.count_. This allows the optimizer to rewrite the code as follows:
{
foo j;
some_func(j); // this stashes a reference to j where another thread can find it
if (j.count() == 0)
{
while (1)
bar();
}
}
Clearly this is a disaster. Another thread might call j.setCount(5) and the thread wouldn't exit to loop.
The compiler can prove that bar can't modify the return value of j.count(). If it was required to assume that another thread could modify every memory value it accesses, it could never stash anything in a register ever, which would clearly be an untenable situation.
So, yes, the lock is needed. Alternatively, you need to use some other construct that provides similar guarantees.
Do not ever write code that relies on compilers not being able to make any optimization that they are permitted to make unless you really have no other practical choice. I have seen this cause a lot of pain over the many years I've been programming. Optimizers today can do things that would have been considered absurdly implausible a decade ago and lots of code lasts longer than you expect.