I was reading the Mir project source code and I stumbled upon this piece of code :
void mir::frontend::ResourceCache::free_resource(google::protobuf::Message* key)
{
    std::shared_ptr<void> value;
    {
        std::lock_guard<std::mutex> lock(guard);
        auto const& p = resources.find(key);
        if (p != resources.end())
        {
            value = p->second;
        }
        resources.erase(key);
    }
}
I have seen this before in other projects as well. It holds a reference to the value in the map before its erasure, even though the block is protected by a lock_guard. I'm not sure why they hold a reference to the value through the std::shared_ptr value.
What are the repercussions if we remove the value = p->second?
Will someone please enlighten me?
This is the code http://bazaar.launchpad.net/~mir-team/mir/trunk/view/head:/src/frontend/resource_cache.cpp
My guess is that this is done to avoid running the destructor of value inside the locked code. The lock is meant to protect modification of the map, and running arbitrary code, such as the destructor of another object, while it is held is neither needed nor wanted.
Just imagine that the destructor of value indirectly accesses, for whatever reason, the map or another thread-shared structure. Chances are that you end up in a deadlock.
The bottom line is: run as little code as possible while holding a lock, but no less. And never call an external, unknown function (such as the shared_ptr deleter or a callback) from locked code.
The goal is to move the actual execution of the shared_ptr deleter till after the lock is released. This way, if the deleter (or the destructor using the default deleter) takes a long time, the lock is not held for that operation.
If you were to remove the value = p->second, the value would be destroyed while the lock is held. Since the lock protects the map, but not the actual values, it would hold the lock longer than is strictly necessary.
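To make the deadlock scenario concrete, here is a minimal sketch. It assumes a hypothetical Cache class with int keys (not the actual Mir class) and a hypothetical Reentrant value whose destructor calls back into the cache. Because free_resource lets the shared_ptr die only after the lock scope ends, the destructor can take the mutex again without trouble; destroying the value inside the lock scope would lock a std::mutex the thread already owns.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the cache in the question; int keys keep it short.
class Cache {
public:
    void add(int key, std::shared_ptr<void> value) {
        std::lock_guard<std::mutex> lock(guard_);
        resources_.emplace(key, std::move(value));
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(guard_);
        return resources_.size();
    }

    void free_resource(int key) {
        std::shared_ptr<void> value;  // declared outside the lock scope
        {
            std::lock_guard<std::mutex> lock(guard_);
            auto it = resources_.find(key);
            if (it != resources_.end()) {
                value = std::move(it->second);  // take over the map's reference
                resources_.erase(it);
            }
        }  // mutex released here
    }      // `value` is destroyed here, so the deleter runs unlocked

private:
    std::mutex guard_;
    std::map<int, std::shared_ptr<void>> resources_;
};

// Hypothetical value type whose destructor calls back into the cache.
struct Reentrant {
    Reentrant(Cache& c, std::size_t& s) : cache(c), seen(s) {}
    ~Reentrant() { seen = cache.size(); }  // locks the cache mutex
    Cache& cache;
    std::size_t& seen;
};
```

The design choice is simply the order of declarations: value is declared before the inner block, so it outlives the lock_guard.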
I have a global reference-counted object obj that I want to protect from data races by using atomic operations:
T* obj; // initially nullptr
std::atomic<int> count; // initially zero
My understanding is that I need to use std::memory_order_release after I write to obj, so that the other threads will be aware of it being created:
void increment()
{
    if (count.load(std::memory_order_relaxed) == 0)
        obj = std::make_unique<T>();
    count.fetch_add(1, std::memory_order_release);
}
Likewise, I need to use std::memory_order_acquire when reading the counter, to ensure the thread has visibility of obj being changed:
void decrement()
{
    count.fetch_sub(1, std::memory_order_relaxed);
    if (count.load(std::memory_order_acquire) == 0)
        obj.reset();
}
I am not convinced that the code above is correct, but I'm not entirely sure why. I feel like after obj.reset() is called, there should be a std::memory_order_release operation to inform other threads about it. Is that correct?
Are there other things that can go wrong, or is my understanding of atomic operations in this case completely wrong?
It is wrong regardless of memory ordering.
As @MaartenBamelis pointed out, with concurrent calls to increment the object is constructed twice. The same is true for concurrent decrement: the object is reset twice (which may result in a double destructor call).
Note that there's a disagreement between the T* obj; declaration and its use as a unique_ptr, but neither a raw pointer nor a unique_ptr is safe for concurrent modification. In practice, reset or delete will check the pointer for null, then delete and set it to null, and these steps are not atomic.
fetch_add and fetch_sub are fetch and op instead of just op for a reason: if you don't use the value observed during operation, it is likely to be a race.
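A small sketch of that last point, using a hypothetical helper count_last_owners that is not part of the question's code. Each thread decrements once and tests the value returned by the same fetch_sub; since the read and the write are one atomic step, exactly one thread can observe the previous value 1. This only shows how to pick the last owner reliably; it does not make creating or using obj safe.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Starts a counter at n, lets n threads decrement it once each, and returns
// how many of them identified themselves as the last owner.
int count_last_owners(int n) {
    std::atomic<int> count{n};
    std::atomic<int> last_owners{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back([&count, &last_owners] {
            // fetch_sub returns the value held *before* the decrement, so
            // only one thread can see 1, whatever the interleaving.
            if (count.fetch_sub(1, std::memory_order_acq_rel) == 1)
                last_owners.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& t : threads) t.join();
    return last_owners.load();
}
```

With a separate load after the fetch_sub, as in the question, several threads could see 0 and the count returned here could exceed 1.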
This code is inherently racy. If two threads call increment at the same time when count is initially 0, both will see count as 0, and both will create obj (and race to see which copy is kept; given unique_ptr has no special threading protections, terrible things can happen if two of them set it at once).
If two threads decrement at the same time (holding the last two references), and finish the fetch_sub before either calls load, both will reset obj (also bad).
And if a decrement finishes the fetch_sub (to 0), then another thread increments before the decrement load occurs, the increment will see the count as 0 and reinitialize. Whether the object is cleared after being replaced, or replaced after being cleared, or some horrible mixture of the two, will depend on whether increment's fetch_add runs before or after decrement's load.
In short: If you find yourself using two separate atomic operations on the same variable, and testing the result of one of them (without looping, as in a compare and swap loop), you're wrong.
More correct code would look like:
void increment() // Still not safe
{
    // acquire is good for the != 0 case, for a later read of obj
    // or would be if the other writer did a release *after* constructing an obj
    if (count.fetch_add(1, std::memory_order_acquire) == 0)
        obj = std::make_unique<T>();
}
void decrement()
{
    if (count.fetch_sub(1, std::memory_order_acquire) == 1)
        obj.reset();
}
but even then it's not reliable. When count is 0, two threads can call increment at once; exactly one of them is guaranteed to see the count as 0 from its fetch_add, but that thread might be delayed while the other one, which saw 1, assumes the object exists and uses it before it's initialized.
I'm not going to swear there's no mutex-free solution here, but dealing with the issues involved with atomics is almost certainly not worth the headache.
It might be possible to confine the mutex to inside the if() branches, but taking a mutex is also an atomic RMW operation (and not much more than that for a good lightweight implementation) so this doesn't necessarily help a huge amount. If you need really good read-side scaling, you'd want to look into something like RCU instead of a ref-count, to allow readers to truly be read-only, not contending with other readers.
I don't really see a simple way of implementing a reference-counted resource with atomics. Maybe there's some clever way that I haven't thought of yet, but in my experience, clever does not equal readable.
My advice would be to implement it first using a mutex. Then you simply lock the mutex, check the reference count, do whatever needs to be done, and unlock again. It's guaranteed correct:
std::mutex mutex;
int count;
std::unique_ptr<T> obj;
void increment()
{
    auto lock = std::scoped_lock{mutex};
    if (++count == 1) // Am I the first reference?
        obj = std::make_unique<T>();
}
void decrement()
{
    auto lock = std::scoped_lock{mutex};
    if (--count == 0) // Was I the last reference?
        obj.reset();
}
Although at this point, I would just use a std::shared_ptr instead of managing the reference count myself:
std::mutex mutex;
std::weak_ptr<T> obj;
std::shared_ptr<T> acquire()
{
    auto lock = std::scoped_lock{mutex};
    auto sp = obj.lock();
    if (!sp)
        obj = sp = std::make_shared<T>();
    return sp;
}
I believe this also makes it safe when exceptions may be thrown when constructing the object.
Mutexes are surprisingly performant, so I expect that locking code is plenty quick unless you have a highly specialized use case where you need code to be lock-free.
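For illustration, here is how the weak_ptr version behaves from the caller's side. Resource, resource_mutex and resource_slot are hypothetical names introduced for this sketch, and std::lock_guard is used instead of std::scoped_lock only so that it also compiles before C++17.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical resource type that counts how many instances are alive.
struct Resource {
    static std::atomic<int> live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
std::atomic<int> Resource::live{0};

std::mutex resource_mutex;
std::weak_ptr<Resource> resource_slot;  // observes the object, does not own it

std::shared_ptr<Resource> acquire() {
    std::lock_guard<std::mutex> lock(resource_mutex);
    auto sp = resource_slot.lock();  // empty if no owner is left
    if (!sp) {
        sp = std::make_shared<Resource>();
        resource_slot = sp;
    }
    return sp;
}
```

Callers never call a decrement: dropping the last shared_ptr destroys the Resource, and the next acquire() finds the weak_ptr expired and builds a fresh one.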
In the C++ Seasoning video by Sean Parent https://youtu.be/W2tWOdzgXHA at 33:41 when starting to talk about “no raw synchronization primitives”, he brings an example to show that with raw synchronization primitives we will get it wrong. The example is a bad copy on write class:
template <typename T>
class bad_cow {
    struct object_t {
        explicit object_t(const T& x) : data_m(x) { ++count_m; }
        atomic<int> count_m;
        T data_m;
    };
    object_t* object_m;
public:
    explicit bad_cow(const T& x) : object_m(new object_t(x)) { }
    ~bad_cow() { if (0 == --object_m->count_m) delete object_m; }
    bad_cow(const bad_cow& x) : object_m(x.object_m) { ++object_m->count_m; }
    bad_cow& operator=(const T& x) {
        if (object_m->count_m == 1) {
            // label #2
            object_m->data_m = x;
        } else {
            object_t* tmp = new object_t(x);
            --object_m->count_m; // bug #1
            // this solves bug #1:
            // if (0 == --object_m->count_m) delete object_m;
            object_m = tmp;
        }
        return *this;
    }
};
He then asks the audience to find the bug, which is the bug #1 as he confirms.
But a more obvious bug I guess, is when some thread is about to proceed to execute a line of code that I have denoted with label #2, while all of a sudden, some other thread just destroys the object and the destructor is called, which deletes object_m. So, the first thread will encounter a deleted memory location.
Am I right? I don’t seem so!
some other thread just destroys the object and the destructor is
called, which deletes object_m. So, the first thread will encounter a
deleted memory location.
Am I right? I don’t seem so!
Assuming the rest of the program isn't buggy, that shouldn't happen, because each thread should have its own reference-count object referencing the data_m object. Therefore, if thread B has a bad_cow object that references the data-object, then thread A cannot (or at least should not) ever delete that object, because the count_m field can never drop to zero as long as there remains at least one reference-count object pointing to it.
Of course, a buggy program might encounter the race condition you suggest -- for example, a thread might be holding only a raw pointer to the data-object, rather than a bad_cow that increments its reference count; or a buggy thread might call delete on the object explicitly rather than relying on the bad_cow class to handle deletion properly.
Your objection doesn't hold because *this at that moment is pointing to the object and the count is 1. The counter cannot get to 0 unless someone is not playing this game correctly (but in that case anything can happen anyway).
Another similar objection could be that, while you're assigning to *this and executing inside the #2 branch, another thread makes a copy of *this; even if this second thread is only reading, it may see the pointed-to object suddenly mutate because of the assignment. The problem in this case is that count was 1 when the mutating thread entered the if, but increased immediately afterwards.
This is also however a bad objection because this code handles concurrency to the pointed-to object (like for example std::shared_ptr does) but you are not allowed to mutate and read a single instance of bad_cow class from different threads. In other words a single instance of bad_cow cannot be used from multiple threads if some of them are writers without adding synchronization. Distinct instances of bad_cow pointing to the same storage are instead safe to be used from different threads (after the fix #1, of course).
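As a sketch of what the class looks like with fix #1 applied: the name cow and the accessors get() and use_count() are additions for illustration and are not in the talk. Two further assumptions are labeled in the comments: the counter is initialized to 1 explicitly rather than incremented from an unspecified starting value, and copy assignment is deleted to keep the sketch short.

```cpp
#include <atomic>
#include <cassert>

template <typename T>
class cow {
    struct object_t {
        // Assumption: start the count at 1 explicitly.
        explicit object_t(const T& x) : count_m(1), data_m(x) {}
        std::atomic<int> count_m;
        T data_m;
    };
    object_t* object_m;

public:
    explicit cow(const T& x) : object_m(new object_t(x)) {}
    cow(const cow& x) : object_m(x.object_m) { ++object_m->count_m; }
    cow& operator=(const cow&) = delete;  // left out of this sketch
    ~cow() { if (0 == --object_m->count_m) delete object_m; }

    cow& operator=(const T& x) {
        if (object_m->count_m == 1) {
            object_m->data_m = x;  // sole owner: mutate in place
        } else {
            object_t* tmp = new object_t(x);
            // Fix #1: another owner may have released its reference in the
            // meantime, so this decrement can be the one that reaches zero.
            if (0 == --object_m->count_m) delete object_m;
            object_m = tmp;
        }
        return *this;
    }

    // Illustrative accessors, not part of the original class.
    const T& get() const { return object_m->data_m; }
    int use_count() const { return object_m->count_m.load(); }
};
```

As the answers say, this only protects the shared storage; a single cow instance still must not be written and read from different threads without extra synchronization.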
I've a multithreaded C++ application that could call from any thread a function like the following, to get an Object from a list/vector.
class GlobalClass{
public:
    MyObject* getObject(int index) const
    {
        /* mutex lock & unlock */
        if (m_list.hasValueAt(index))
            return m_list[index];
        else
            return 0;
    }
    List<MyObject*> m_list;
};
//Thread function
MyObject* obj = globalClass->getObject(0);
if (!obj) return;
obj->doSomething();
Note: the scope here is to understand some best practice related to function returns by reference, value or pointer, so forgive some pseudo-code or missing declarations (I make use of lock/unlock, GlobalClass is a global singleton, etc...).
The issue here is that if the MyObject at that index is deleted inside GlobalClass, at a certain point I'm using a bad pointer (obj).
So I was thinking about returning a copy of the object:
MyObject GlobalClass::getObject(int index) const
{
    /* mutex lock & unlock */
    if (m_list.hasValueAt(index))
        return MyObject(*m_list[index]);
    else
        return MyObject();
}
The issue here is that the object (MyObject) being returned is a large enough object that returning a copy is not efficient.
Finally, I would like to return a reference to that object (better a const reference):
const MyObject& GlobalClass::getObject(int index) const
{
    /* mutex lock & unlock */
    if (m_list.hasValueAt(index))
        return *m_list[index];
    else{
        MyObject* obj = new MyObject();
        return *obj ;
    }
}
Since my list might not contain an object at that index, I'm introducing a memory leak.
What's the best solution to deal with this?
Must I fall back to returning a copy even if it is less efficient, or is there something I'm missing in returning a reference?
You have multiple choices:
1. Use a std::shared_ptr if "Get" passes ownership of the object to the caller. This way the object cannot go away while the caller holds it. Of course, the caller is then unaware of when the object is removed from the list.
2. Use a std::weak_ptr. This has the same meaning as 1., but the pointer can be reset. In this case the caller can detect whether the object was deleted.
3. Use std::optional as suggested in a comment, and return a copy or a reference. Wrapping a reference in an optional doesn't avoid the problem of the object being deleted, so the reference can become invalid as well. A copy would avoid this, but it may be too expensive, as said.
Reading between the lines, you seem to suggest that the caller will use the pointer immediately after the call, and for a limited span of time. So 1. and 2. are equivalent and seem to fit your needs.
See this introduction to smart pointers for more details.
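A minimal sketch of the shared_ptr and weak_ptr options, assuming the list is changed to hold std::shared_ptr<MyObject>. The members add, watchObject and remove are hypothetical additions for the example, and std::vector stands in for the question's List type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

struct MyObject {  // hypothetical payload
    int value = 0;
};

class GlobalClass {
public:
    std::size_t add(int value) {
        auto obj = std::make_shared<MyObject>();
        obj->value = value;
        std::lock_guard<std::mutex> lock(mutex_);
        list_.push_back(std::move(obj));
        return list_.size() - 1;
    }

    // shared_ptr option: the caller shares ownership; empty means "no such index".
    std::shared_ptr<MyObject> getObject(std::size_t index) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index < list_.size()) return list_[index];
        return nullptr;
    }

    // weak_ptr option: the caller only observes and must lock() before use.
    std::weak_ptr<MyObject> watchObject(std::size_t index) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index < list_.size()) return list_[index];
        return {};
    }

    void remove(std::size_t index) {
        std::shared_ptr<MyObject> doomed;  // destroyed after the lock is released
        std::lock_guard<std::mutex> lock(mutex_);
        if (index < list_.size()) {
            doomed = std::move(list_[index]);
            list_.erase(list_.begin() + static_cast<std::ptrdiff_t>(index));
        }
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::shared_ptr<MyObject>> list_;
};
```

A caller holding the shared_ptr keeps the object alive after remove(); a caller holding only the weak_ptr finds it expired once the last owner is gone.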
If you want to avoid copying the object, there are only two possible cases:
The m_list entry that is returned by getObject is/can be deleted concurrently by another thread. If you don't copy that object beforehand, there is nothing you can do within getObject to prevent another thread from suddenly having a reference/pointer dangle. However, you could make each entry of m_list be a std::shared_ptr<MyObject> and return that directly. The memory management will happen automatically (but beware of the potential overhead in the reference counting of shared_ptr, as well as the possibility of deadlocks).
You have (or add) some mechanism to ensure that objects can only be deleted from m_list if no other thread currently holds some pointer/reference to them. This very much depends on your algorithm, but it might e.g. be possible to mark objects for deletion only and then delete them later in a synchronous section.
Your issue seems to stem from the fact that your program is multithreaded. Another way forward (and, for the versions returning a raw pointer or a std::optional reference, perhaps the only way forward short of a complete redesign) is to expose the mutex outside the function scope. You can accomplish this in multiple ways; the simplest way to illustrate it is the following:
/*mutex lock*/
const MyObject& obj = globalClass.get(index);
/*do stuff with obj*/
/*mutex unlock*/
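One possible way to get the same effect without handing the mutex itself to callers is to pass the work in as a callback, so the reference never outlives the lock. This is only a sketch: withObject, add and clear are hypothetical members, and std::vector replaces the question's List type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

struct MyObject {  // hypothetical payload
    explicit MyObject(int v) : value(v) {}
    int value;
};

class GlobalClass {
public:
    void add(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        list_.push_back(std::make_unique<MyObject>(value));
    }

    void clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        list_.clear();
    }

    // Runs use(const MyObject&) while the mutex is held, so the object cannot
    // be deleted during the call. Returns false if nothing is stored at index.
    // The callback must not call back into GlobalClass, or it will deadlock.
    template <typename F>
    bool withObject(std::size_t index, F&& use) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index >= list_.size()) return false;
        const MyObject& obj = *list_[index];
        use(obj);
        return true;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<MyObject>> list_;
};
```

The trade-off is that the lock is held for as long as the callback runs, so this suits short operations only.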
This article by Jeff Preshing states that the double-checked locking pattern (DCLP) is fixed in C++11. The classical example used for this pattern is the singleton pattern but I happen to have a different use case and I am still lacking experience in handling "atomic<> weapons" - maybe someone over here can help me out.
Is the following piece of code a correct DCLP implementation as described by Jeff under "Using C++11 Sequentially Consistent Atomics"?
class Foo {
    std::shared_ptr<B> data;
    std::mutex mutex;
    void detach()
    {
        if (data.use_count() > 1)
        {
            std::lock_guard<std::mutex> lock{mutex};
            if (data.use_count() > 1)
            {
                data = std::make_shared<B>(*data);
            }
        }
    }
public:
    // public interface
};
No, this is not a correct implementation of DCLP.
The thing is that your outer check data.use_count() > 1 accesses the object (of type B with reference count), which can be deleted (unreferenced) in the mutex-protected part. No memory fence can help there.
Why data.use_count() accesses the object:
Assume these operations have been executed:
shared_ptr<B> data1 = make_shared<B>(...);
shared_ptr<B> data = data1;
Then you have the following layout (weak_ptr support is not shown here):
data1                   [allocated with B::new()]      data
                        --------------------------
[pointer type] ref; --> |atomic<int> m_use_count;| <-- [pointer type] ref
                        |B obj;                  |
                        --------------------------
Each shared_ptr object is just a pointer to an allocated memory region. This memory region embeds an object of type B plus an atomic counter reflecting the number of shared_ptrs pointing to that object. When this counter becomes zero, the memory region is freed (and the B object is destroyed). It is exactly this counter that shared_ptr::use_count() returns.
UPDATE: An execution that can lead to accessing memory which is already freed (initially, two shared_ptrs point to the same object, so .use_count() is 2):
/* Thread 1 */                  /* Thread 2 */                  /* Thread 3 */
Enter detach()                  Enter detach()
Found `data.use_count()` > 1
Enter critical section
Found `data.use_count()` > 1
                                Dereference `data`,
                                found old object.
Unreference old `data`,
`use_count` becomes 1
                                                                Delete other shared_ptr,
                                                                old object is deleted
Assign new object to `data`
                                Access old object
                                (for check `use_count`)
                                !! But object is freed !!
The outer check should only read the pointer to the object in order to decide whether the lock needs to be acquired.
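For comparison, here is a sketch of double-checked locking where the outer check reads only an atomic pointer, in the style of the lazily created singleton that the pattern is usually shown with. Widget, widget_ptr, widget_mutex and get_widget are hypothetical names; this is not a drop-in replacement for detach(), whose outer check has to look inside the shared control block.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct Widget {  // hypothetical lazily created object
    static std::atomic<int> constructed;
    Widget() { ++constructed; }
};
std::atomic<int> Widget::constructed{0};

std::atomic<Widget*> widget_ptr{nullptr};
std::mutex widget_mutex;

Widget* get_widget() {
    // Outer check: reads only the pointer value, never the pointee.
    Widget* p = widget_ptr.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(widget_mutex);
        // Inner check: another thread may have created it while we waited.
        p = widget_ptr.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Widget;  // never freed in this sketch
            // The release store publishes the fully constructed Widget.
            widget_ptr.store(p, std::memory_order_release);
        }
    }
    return p;
}
```

This works because the pointer is only ever written once and the object is never destroyed; neither holds for data in Foo.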
BTW, even if your implementation were correct, it would make little sense:
If data (and detach) can be accessed from several threads at the same time, the object's uniqueness gives no advantage, since it can still be accessed from several threads. If you want to change the object, all accesses to data should be protected by an outer mutex, and in that case detach() cannot be executed concurrently.
If data (and detach) can be accessed by only a single thread at a time, the detach implementation doesn't need any locking at all.
This constitutes a data race if two threads invoke detach on the same instance of Foo concurrently, because std::shared_ptr<B>::use_count() (a read-only operation) would run concurrently with the std::shared_ptr<B> move-assignment operator (a modifying operation), which is a data race and hence a cause of undefined behavior. If Foo instances are never accessed concurrently, on the other hand, there is no data race, but then the std::mutex would be useless in your example. The question is: how does data's pointee become shared in the first place? Without this crucial bit of information, it is hard to tell if the code is safe even if a Foo is never used concurrently.
According to your source, I think you still need to add thread fences before the first test and after the second test.
std::shared_ptr<B> data;
std::mutex mutex;
void detach()
{
    std::atomic_thread_fence(std::memory_order_acquire);
    if (data.use_count() > 1)
    {
        auto lock = std::lock_guard<std::mutex>{mutex};
        if (data.use_count() > 1)
        {
            std::atomic_thread_fence(std::memory_order_release);
            data = std::make_shared<B>(*data);
        }
    }
}
Let's say I have a container (std::vector) of pointers used by a multi-threaded application. When adding new pointers to the container, the code is protected using a critical section (boost::mutex). All well and good. The code should be able to return one of these pointers to a thread for processing, but another separate thread could choose to delete one of these pointers, which might still be in use. e.g.:
thread1()
{
    foo* p = get_pointer();
    ...
    p->do_something();
}
thread2()
{
    foo* p = get_pointer();
    ...
    delete p;
}
So thread2 could delete the pointer whilst thread1 is using it. Nasty.
So instead I want to use a container of Boost shared ptrs. IIRC these pointers will be reference counted, so as long as I return shared ptrs instead of raw pointers, removing one from the container WON'T actually free it until the last use of it goes out of scope. i.e.
std::vector<boost::shared_ptr<foo> > my_vec;
thread1()
{
    boost::shared_ptr<foo> sp = get_ptr(0);
    ...
    sp->do_something();
}
thread2()
{
    boost::shared_ptr<foo> sp = get_ptr(0);
    ...
    my_vec.erase(my_vec.begin());
}
boost::shared_ptr<foo> get_ptr(int index)
{
    lock_my_vec();
    return my_vec[index];
}
In the above example, if thread1 gets the pointer before thread2 calls erase, will the object pointed to still be valid? It won't actually be deleted when thread1 completes? Note that access to the global vector will be via a critical section.
I think this is how shared_ptrs work but I need to be sure.
For the thread safety of boost::shared_ptr you should check this link. It's not guaranteed to be safe, but on many platforms it works. Modifying the std::vector is not safe AFAIK.
In the above example, if thread1 gets the pointer before thread2 calls erase, will the object pointed to still be valid? It won't actually be deleted when thread1 completes?
In your example, if thread1 gets the pointer before thread2, then thread2 will have to wait at the beginning of the function (because of the lock). So, yes, the object pointed to will still be valid. However, you might want to make sure that my_vec is not empty before accessing its first element.
If in addition, you synchronize the accesses to the vector (as in your original raw pointer proposal), your usage is safe. Otherwise, you may fall foul of example 4 in the link provided by the other respondent.
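A minimal sketch of the synchronized version, written with std::shared_ptr and std::mutex on the assumption that they behave like their Boost counterparts for this purpose. The helpers add_foo and erase_first, and the destroyed counter in foo, are hypothetical additions used only to observe lifetime.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

struct foo {
    static std::atomic<int> destroyed;  // instrumentation for the example
    ~foo() { ++destroyed; }
    int do_something() const { return 42; }
};
std::atomic<int> foo::destroyed{0};

std::mutex vec_mutex;
std::vector<std::shared_ptr<foo>> my_vec;

void add_foo() {
    std::lock_guard<std::mutex> lock(vec_mutex);
    my_vec.push_back(std::make_shared<foo>());
}

// Copies the shared_ptr under the lock; empty if the index is out of range.
std::shared_ptr<foo> get_ptr(std::size_t index) {
    std::lock_guard<std::mutex> lock(vec_mutex);
    if (index >= my_vec.size()) return nullptr;
    return my_vec[index];
}

void erase_first() {
    std::shared_ptr<foo> doomed;  // released after the lock, if last owner
    std::lock_guard<std::mutex> lock(vec_mutex);
    if (!my_vec.empty()) {
        doomed = std::move(my_vec.front());
        my_vec.erase(my_vec.begin());
    }
}
```

Erasing the element only drops the vector's reference; the foo is destroyed when the last shared_ptr copy, here the one held by the caller, goes away.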