For example, consider:
class ProcessList {
private:
    std::vector<std::shared_ptr<SomeObject>> list;
    Mutex mutex;
public:
    void Add(std::shared_ptr<SomeObject> o) {
        Locker locker(&mutex); // Start of critical section. When the Locker is released, so is the mutex - in-house stuff
        list.push_back(o);
    }
    void Remove(std::shared_ptr<SomeObject> o) {
        Locker locker(&mutex); // Start of critical section. When the Locker is released, so is the mutex - in-house stuff
        // Code to remove said object, indirectly modifying the reference count held by the copy below
    }
    void Process() {
        std::vector<std::shared_ptr<SomeObject>> copy;
        {
            Locker locker(&mutex);
            copy = std::vector<std::shared_ptr<SomeObject>>(
                list.begin(), list.end()
            );
        }
        for (auto it = copy.begin(); it != copy.end(); ++it) {
            (*it)->Process(); // This may take time and may add/remove on the list
        }
    }
};
One thread runs Process; multiple threads run Add/Remove.
Will the reference count be safe and always correct, or should a mutex be placed around it?
Yes, the standard (at §20.8.2.2, at least as of N3997) contains wording that's intended to require that the reference counting be thread-safe.
For your simple cases like Add:
void Add(std::shared_ptr<SomeObject> o) {
    Locker locker(&mutex);
    list.push_back(o);
}
...the guarantees in the standard are strong enough that you shouldn't need the mutex, so you can just have:
void Add(std::shared_ptr<SomeObject> o) {
    list.push_back(o);
}
For some of your operations, it's not at all clear that thread-safe reference counting necessarily obviates your mutex though. For example, inside of your Process, you have:
{
    Locker locker(&mutex);
    copy = std::vector<std::shared_ptr<SomeObject>>(
        list.begin(), list.end()
    );
}
This carries out the entire copy as an atomic operation--nothing else can modify the list during the copy. This assures that your copy gives you a snapshot of the list precisely as it was when the copy was started. If you eliminate the mutex, the reference counting will still work, but your copy might reflect changes made while the copy is being made.
In other words, the thread safety of the shared_ptr only assures that each individual increment or decrement is atomic--it doesn't assure that manipulations of the entire list are atomic as the mutex does in this case.
Since your list is actually a vector, you should be able to simplify the copying code a bit to just copy = list.
Also note that your Locker seems to be a subset of what std::lock_guard provides. It appears you could use:
std::lock_guard<std::mutex> locker(mutex);
...in its place quite easily.
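Putting those two simplifications together, Process could be reduced to something like this (a sketch only, assuming the in-house Mutex/Locker pair is replaced with a std::mutex member named mutex):
void Process() {
    std::vector<std::shared_ptr<SomeObject>> copy;
    {
        std::lock_guard<std::mutex> locker(mutex);
        copy = list; // snapshot taken atomically under the lock
    }
    for (auto &p : copy) {
        p->Process(); // may take time and may Add/Remove on the real list
    }
}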
It would be overhead to have a mutex for reference counting.
Internally, mutexes are built on atomic operations; in effect, a mutex already does thread-safe bookkeeping of its own. So you can just use atomics for your reference counting directly, instead of using a mutex and essentially doing the work twice.
Unless your CPU architecture has an atomic increment/decrement and you used that for the reference count, then no, it's not safe; C++ makes no guarantees about the thread safety of x++/x-- operations on any of its standard types.
Use std::atomic<int> if your compiler supports it (C++11); otherwise you'll need the lock.
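To make the "atomics instead of a mutex" suggestion concrete, a minimal intrusive reference count might look like the sketch below (illustrative only, not the questioner's in-house code; the memory orders are the conventional choice for reference counting):
#include <atomic>

struct RefCounted {
    std::atomic<int> refs{1};

    void add_ref() {
        // Incrementing only needs atomicity, not ordering.
        refs.fetch_add(1, std::memory_order_relaxed);
    }

    void release() {
        // The final decrement must synchronize with all prior uses
        // of the object before it is destroyed.
        if (refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete this; // last reference gone
    }
};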
Further references:
https://www.threadingbuildingblocks.org/docs/help/tbb_userguide/Atomic_Operations.htm
Related
I'm learning about mutexes and threading right now. I was wondering if there's anything dangerous or inherently wrong with automating a mutex with a class like this:
class AutoMutex
{
private:
    std::mutex& m_Mutex;

public:
    AutoMutex(std::mutex& m) : m_Mutex(m)
    {
        m_Mutex.lock();
    }

    ~AutoMutex()
    {
        m_Mutex.unlock();
    }
};
And then, of course, you would use it like this:
void SomeThreadedFunc()
{
    AutoMutex m(Mutex); // With 'Mutex' being some global mutex.
    // Do stuff
}
The idea is that, on construction of an AutoMutex object, it locks the mutex. Then, when it goes out of scope, the destructor automatically unlocks it.
You could even just put it in scopes if you don't need it for an entire function. Like this:
void SomeFunc()
{
    // Do stuff
    {
        AutoMutex m(Mutex);
        // Do race condition stuff.
    }
    // Do other stuff
}
Is this okay? I don't personally see anything wrong with it, but as I'm not the most experienced, I feel there's something I may be missing.
It's safe to use a RAII wrapper, and in fact safer than using the mutex member functions directly, but it's also unnecessary to write yourself, since the standard library already provides this. It's called std::lock_guard.
However, your implementation isn't entirely safe, because it's copyable, and a copy will attempt to re-unlock the mutex, which leads to undefined behaviour. std::lock_guard resolves this issue by being non-copyable.
There's also std::unique_lock, which is very similar but allows things such as releasing the lock within its lifetime. std::scoped_lock should be used if you need to lock multiple mutexes; using multiple lock guards may lead to deadlock. std::scoped_lock is also fine to use with a single mutex, so you can replace all uses of lock guard with it.
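For illustration, the smallest change that fixes the copyability problem in the wrapper above is to delete the copy operations, which is essentially what std::lock_guard does:
class AutoMutex
{
private:
    std::mutex& m_Mutex;

public:
    explicit AutoMutex(std::mutex& m) : m_Mutex(m) { m_Mutex.lock(); }
    ~AutoMutex() { m_Mutex.unlock(); }

    AutoMutex(const AutoMutex&) = delete;            // a copy would double-unlock
    AutoMutex& operator=(const AutoMutex&) = delete;
};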
I have a global reference-counted object obj that I want to protect from data races by using atomic operations:
T* obj; // initially nullptr
std::atomic<int> count; // initially zero
My understanding is that I need to use std::memory_order_release after I write to obj, so that the other threads will be aware of it being created:
void increment()
{
    if (count.load(std::memory_order_relaxed) == 0)
        obj = std::make_unique<T>();

    count.fetch_add(1, std::memory_order_release);
}
Likewise, I need to use std::memory_order_acquire when reading the counter, to ensure the thread has visibility of obj being changed:
void decrement()
{
    count.fetch_sub(1, std::memory_order_relaxed);

    if (count.load(std::memory_order_acquire) == 0)
        obj.reset();
}
I am not convinced that the code above is correct, but I'm not entirely sure why. I feel like after obj.reset() is called, there should be a std::memory_order_release operation to inform other threads about it. Is that correct?
Are there other things that can go wrong, or is my understanding of atomic operations in this case completely wrong?
It is wrong regardless of memory ordering.
As @MaartenBamelis pointed out, with concurrent calls to increment the object is constructed twice. The same is true for concurrent decrement: the object is reset twice (which may result in a double destructor call).
Note that there's a disagreement between the T* obj; declaration and its use as a unique_ptr, but neither a raw pointer nor a unique pointer is safe for concurrent modification. In practice, reset or delete will check the pointer for null, then delete it and set it to null, and these steps are not atomic.
fetch_add and fetch_sub are fetch-and-op rather than just op for a reason: if you don't use the value observed during the operation, it is likely to be a race.
This code is inherently racy. If two threads call increment at the same time when count is initially 0, both will see count as 0, and both will create obj (and race to see which copy is kept; given unique_ptr has no special threading protections, terrible things can happen if two of them set it at once).
If two threads decrement at the same time (holding the last two references), and finish the fetch_sub before either calls load, both will reset obj (also bad).
And if a decrement finishes the fetch_sub (to 0), then another thread increments before the decrement load occurs, the increment will see the count as 0 and reinitialize. Whether the object is cleared after being replaced, or replaced after being cleared, or some horrible mixture of the two, will depend on whether increment's fetch_add runs before or after decrement's load.
In short: If you find yourself using two separate atomic operations on the same variable, and testing the result of one of them (without looping, as in a compare and swap loop), you're wrong.
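For reference, the looping pattern alluded to above looks roughly like this (a generic compare-exchange sketch, not a fix for this particular design):
// Retry until we swap in a value computed from a value we actually observed.
int expected = count.load(std::memory_order_relaxed);
while (!count.compare_exchange_weak(expected, expected + 1,
                                    std::memory_order_acquire,
                                    std::memory_order_relaxed)) {
    // On failure, expected is reloaded with the current value; retry.
}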
More correct code would look like:
void increment() // Still not safe
{
    // acquire is good for the != 0 case, for a later read of obj
    // or would be if the other writer did a release *after* constructing an obj
    if (count.fetch_add(1, std::memory_order_acquire) == 0)
        obj = std::make_unique<T>();
}

void decrement()
{
    if (count.fetch_sub(1, std::memory_order_acquire) == 1)
        obj.reset();
}
but even then it's not reliable; there's no guarantee that, when count is 0, two threads couldn't call increment, both of them fetch_add at once, and while exactly one of them is guaranteed to see the count as 0, said 0-seeing thread might end up delayed while the one that saw it as 1 assumes the object exists and uses it before it's initialized.
I'm not going to swear there's no mutex-free solution here, but dealing with the issues involved with atomics is almost certainly not worth the headache.
It might be possible to confine the mutex to inside the if() branches, but taking a mutex is also an atomic RMW operation (and not much more than that for a good lightweight implementation) so this doesn't necessarily help a huge amount. If you need really good read-side scaling, you'd want to look into something like RCU instead of a ref-count, to allow readers to truly be read-only, not contending with other readers.
I don't really see a simple way of implementing a reference-counted resource with atomics. Maybe there's some clever way that I haven't thought of yet, but in my experience, clever does not equal readable.
My advice would be to implement it first using a mutex. Then you simply lock the mutex, check the reference count, do whatever needs to be done, and unlock again. It's guaranteed correct:
std::mutex mutex;
int count;
std::unique_ptr<T> obj;

void increment()
{
    auto lock = std::scoped_lock{mutex};
    if (++count == 1) // Am I the first reference?
        obj = std::make_unique<T>();
}

void decrement()
{
    auto lock = std::scoped_lock{mutex};
    if (--count == 0) // Was I the last reference?
        obj.reset();
}
Although at this point, I would just use a std::shared_ptr instead of managing the reference count myself:
std::mutex mutex;
std::weak_ptr<T> obj;

std::shared_ptr<T> acquire()
{
    auto lock = std::scoped_lock{mutex};
    auto sp = obj.lock();
    if (!sp)
        obj = sp = std::make_shared<T>();
    return sp;
}
I believe this also makes it safe when exceptions may be thrown when constructing the object.
Mutexes are surprisingly performant, so I expect that locking code is plenty quick unless you have a highly specialized use case where you need code to be lock-free.
I have an object, and all its functions should be executed in sequential order.
I know it is possible to do that with a mutex like
#include <mutex>

class myClass {
private:
    std::mutex mtx;

public:
    void exampleMethod();
};

void myClass::exampleMethod() {
    std::unique_lock<std::mutex> lck(mtx); // lock mutex until end of scope

    // do some stuff
}
but with this technique a deadlock occurs when another mutex-locked method is called within exampleMethod.
So I'm searching for a better solution.
The default std::atomic access is sequentially consistent, so it's not possible to read and write this object at the same time. But when I access my object and call a method, is the whole function call also atomic, or is it more like this:
object* obj = atomicObj.load(); // read atomic
obj->doSomething();             // call method on the non-atomic object
If yes, is there a better way than locking most of the functions with a mutex?
Stop and think about when you actually need to lock a mutex. If you have some helper function that is called within many other functions, it probably shouldn't try to lock the mutex, because the caller will already have locked it.
If in some contexts it is not called by another member function, and so does need to take a lock, provide a wrapper function that actually does that. It is not uncommon to have 2 versions of member functions, a public foo() and a private fooNoLock(), where:
public:
    void foo() {
        std::lock_guard<std::mutex> l(mtx);
        fooNoLock();
    }

private:
    void fooNoLock() {
        // do stuff that operates on some shared resource...
    }
In my experience, recursive mutexes are a code smell that indicate the author hasn't really got their head around the way the functions are used - not always wrong, but when I see one I get suspicious.
As for atomic operations, they are really only applicable to small arithmetic operations, say incrementing an integer or swapping two pointers. Such operations are not atomic by default, but they are the sorts of things atomic operations can be used for. You certainly can't have any reasonable expectations about two separate operations on a single atomic object: anything could happen in between them.
You could use a std::recursive_mutex instead. This will allow a thread that already owns the mutex to reacquire it without blocking. However, if another thread tries to acquire the lock, it will block.
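A minimal sketch of that approach, reusing the class from the question (illustrative only; otherMethod is a hypothetical second member function):
#include <mutex>

class myClass {
private:
    std::recursive_mutex mtx;

public:
    void exampleMethod() {
        std::lock_guard<std::recursive_mutex> l(mtx);
        otherMethod(); // re-locks mtx on the same thread without deadlocking
    }

    void otherMethod() {
        std::lock_guard<std::recursive_mutex> l(mtx);
        // do some stuff
    }
};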
As @BoBTFish properly indicated, it is better to separate your class's public interface, whose member functions acquire a non-recursive lock, from the private methods, which don't. Your code must then assume a lock is always held when a private method runs.
To be safe about this, you may add a reference to std::unique_lock<std::mutex> to each method that requires the lock to be held.
Thus, even if you happen to call one private method from another, you would need to make sure the mutex is locked before execution:
class myClass
{
    std::mutex mtx;

    void i_exampleMethod(std::unique_lock<std::mutex> &)
    {
        // execute method
    }

public:
    void exampleMethod()
    {
        std::unique_lock<std::mutex> lock(mtx);
        i_exampleMethod(lock);
    }
};
I have a std::map that is going to be modified from multiple threads, and I have a mutex to lock these writes. However, the map must occasionally be used during a long-running operation, and it would not be good to hold the lock during this entire operation, blocking all of those writes.
I have two questions about this:
Is it thread-safe to perform that operation on a copy of the map while other threads are writing to the original?
Is it thread-safe to copy the map while other threads are mutating it?
For example:
class Foo {
    void writeToMap(Bar &bar, Baz &baz) {
        // lock mutex
        map[bar] = baz;
        // unlock mutex
    }

    std::map<Bar, Baz> getMapCopy() {
        // lock mutex (Is this necessary?)
        std::map<Bar, Baz> copy(map);
        // unlock mutex
        return copy;
    }

    std::map<Bar, Baz> map;
};

void expensiveOperation(Foo &f) {
    std::map<Bar, Baz> mapCopy = f.getMapCopy();
    // Can I safely read mapCopy?
}
The copy operation itself is not thread-safe; the issue is the atomicity of copying 64-bit values. You might copy the first 4 bytes of a value while the second 4 are being modified by another thread, leaving you with an inconsistent 8 bytes.
Please have a look here: Is copy thread-safe?
If you manage to create a consistent copy, however, I do not see why reading it would be a problem...
You have undefined behavior in that code, as you return a reference to a local variable. This local variable will be destroyed once the function returns, leaving you with a reference to a destroyed object.
If you want to return a copy then you have to return by value, so it would be e.g.
std::map<Bar, Baz> getMapCopy() {
    return map;
}
And if you use the mutexes and locks from the C++11 standard thread library, you don't need an explicit unlock; the mutex will be unlocked when the lock is destroyed.
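Putting both points together, the accessor could look like this (a sketch, assuming the class's "lock mutex" comments stand for a std::mutex member named mutex):
std::map<Bar, Baz> getMapCopy() {
    std::lock_guard<std::mutex> lock(mutex); // released automatically on return
    return map;                              // the copy is made while the lock is held
}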
I am trying to make my_class thread-safe like so.
class my_class
{
public:
    const std::vector<double>&
    get_data() const
    { // lock so that we cannot get_data() while setting data
        lock l(m_mutex);
        return m_data;
    }

    void
    run()
    {
        std::vector<double> tmp;
        // some calculations on tmp.
        { // lock so that we cannot get_data() while setting m_data
            lock l(m_mutex);
            m_data = tmp; // set the data
        }
    }

private:
    std::vector<double> m_data;
    mutable mutex m_mutex;
    my_class(const my_class&); // non-copyable
};
run() and get_data() may be called by different OpenMP threads, so I introduced a lock.
(Since I am using OpenMP, m_mutex and lock are RAII wrappers around omp_init_lock(); etc. commands.)
However, the lock in get_data() is expensive to create and destroy (it is the most expensive operation when I profile my code, and I call get_data() a lot).
Is it possible to reorganise my_class to remove the lock in get_data()? Or is this lock the unavoidable cost of parallelising the code?
The first step would be to look into read-write locks: that way multiple readers will not block each other.
The next step would be using lock-free or wait-free operations. There are plenty of resources online describing them better than I would be able to. Just one note: lock-free approaches deal with atomic (interlocked) operations, which means the data size needs to be small. If you go this route, you'll be atomically replacing a pointer to your vector, not the whole vector. This means your class will get a bit more complex and will deal with some pointers and memory management.
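One way to sketch the "atomically replace a pointer to your vector" idea uses the C++11 atomic free functions for shared_ptr (an assumption on my part, not the answerer's code; std::atomic<std::shared_ptr<T>> supersedes these in C++20, and neither form is guaranteed lock-free):
#include <memory>
#include <vector>

class my_class {
    std::shared_ptr<const std::vector<double>> m_data;

public:
    std::shared_ptr<const std::vector<double>> get_data() const {
        return std::atomic_load(&m_data); // no mutex on the read side
    }

    void run() {
        auto tmp = std::make_shared<std::vector<double>>();
        // ... some calculations on *tmp ...
        std::atomic_store(&m_data,
                          std::shared_ptr<const std::vector<double>>(std::move(tmp)));
    }
};
Note that callers now receive a shared_ptr snapshot rather than a reference, so a concurrent run() cannot pull the data out from under them.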
It may be cheaper to use a critical section around the get_data/run functions; you will not incur additional setup/teardown overhead (as the critical section is statically initialized), but this would also synchronize other instances of the class.