Let's say I have a container (std::vector) of pointers used by a multi-threaded application. When adding new pointers to the container, the code is protected using a critical section (boost::mutex). All well and good. The code should be able to return one of these pointers to a thread for processing, but another separate thread could choose to delete one of these pointers, which might still be in use. e.g.:
thread1()
{
foo* p = get_pointer();
...
p->do_something();
}
thread2()
{
foo* p = get_pointer();
...
delete p;
}
So thread2 could delete the pointer whilst thread1 is using it. Nasty.
So instead I want to use a container of Boost shared ptrs. IIRC these pointers will be reference counted, so as long as I return shared ptrs instead of raw pointers, removing one from the container WON'T actually free it until the last use of it goes out of scope. i.e.
std::vector<boost::shared_ptr<foo> > my_vec;
thread1()
{
boost::shared_ptr<foo> sp = get_ptr[0];
...
sp->do_something();
}
thread2()
{
boost::shared_ptr<foo> sp = get_ptr[0];
...
my_vec.erase(my_vec.begin());
}
boost::shared_ptr<foo> get_ptr(int index)
{
lock_my_vec();
return my_vec[index];
}
In the above example, if thread1 gets the pointer before thread2 calls erase, will the object pointed to still be valid? It won't actually be deleted when thread1 completes? Note that access to the global vector will be via a critical section.
I think this is how shared_ptrs work but I need to be sure.
For the threading safety of boost::shared_ptr you should check this link. It's not guarantied to be safe, but on many platforms it works. Modifying the std::vector is not safe AFAIK.
In the above example, if thread1 gets the pointer before thread2 calls erase, will the object pointed to still be valid? It won't actually be deleted when thread1 completes?
In your example, if thread1 gets the pointer before thread2, then thread2 will have to wait at the beginning of the function (because of the lock). So, yes, the object pointed to will still be valid. However, you might want to make sure that my_vec is not empty before accessing its first element.
If in addition, you synchronize the accesses to the vector (as in your original raw pointer proposal), your usage is safe. Otherwise, you may fall foul of example 4 in the link provided by the other respondent.
Related
As it's well known that shared_ptr only guarantees access to underlying control block is thread
safe and no guarantee made for accesses to owned object.
Then why there is a race condition in the code snippet below:
std::shared_ptr<int> g_s = std::make_shared<int>(1);
void f1()
{
std::shared_ptr<int>l_s1 = g_s; // read g_s
}
void f2()
{
std::shared_ptr<int> l_s2 = std::make_shared<int>(3);
std::thread th(f1);
th.detach();
g_s = l_s2; // write g_s
}
In the code snippet above, the owned object of the shared pointer named g_s are not accessed indeed.
I am really confused now. Could somebody shed some light on this matter?
std::shared_ptr<T> guarantees that access to its control block is thread-safe, but not access to the std::shared_ptr<T> instance itself, which is generally an object with two data members: the raw pointer (the one returned by get()) and the pointer to the control block.
In your code, the same std::shared_ptr<int> instance may be concurrently accessed by the two threads; f1 reads, and f2 writes.
If the two threads were accessing two different shared_ptr instances that shared ownership of the same object, there would be no data race. The two instances would have the same control block, but accesses to the control block would be appropriately synchronized by the library implementation.
If you need concurrent, race-free access to a single std::shared_ptr<T> instance from multiple threads, you can use std::atomic<std::shared_ptr<T>>. (There is also an older interface that can be used prior to C++20, which is deprecated in C++20.)
This article by Jeff Preshing states that the double-checked locking pattern (DCLP) is fixed in C++11. The classical example used for this pattern is the singleton pattern but I happen to have a different use case and I am still lacking experience in handling "atomic<> weapons" - maybe someone over here can help me out.
Is the following piece of code a correct DCLP implementation as described by Jeff under "Using C++11 Sequentially Consistent Atomics"?
class Foo {
std::shared_ptr<B> data;
std::mutex mutex;
void detach()
{
if (data.use_count() > 1)
{
std::lock_guard<std::mutex> lock{mutex};
if (data.use_count() > 1)
{
data = std::make_shared<B>(*data);
}
}
}
public:
// public interface
};
No, this is not a correct implementation of DCLP.
The thing is that your outer check data.use_count() > 1 accesses the object (of type B with reference count), which can be deleted (unreferenced) in mutex-protected part. Any sort of memory fences cannot help there.
Why data.use_count() accesses the object:
Assume these operations have been executed:
shared_ptr<B> data1 = make_shared<B>(...);
shared_ptr<B> data = data1;
Then you have following layout (weak_ptr support is not shown here):
data1 [allocated with B::new()] data
--------------------------
[pointer type] ref; --> |atomic<int> m_use_count;| <-- [pointer type] ref
|B obj; |
--------------------------
Each shared_ptr object is just a pointer, which points to allocated memory region. This memory region embeds object of type B plus atomic counter, reflecting number of shared_ptr's, pointed to given object. When this counter becomes zero, memory region is freed(and B object is destroyed). Exactly this counter is returned by shared_ptr::use_count().
UPDATE: Execution, which can lead to accessing memory which is already freed (initially, two shared_ptr's point to the same object, .use_count() is 2):
/* Thread 1 */ /* Thread 2 */ /* Thread 3 */
Enter detach() Enter detach()
Found `data.use_count()` > 1
Enter critical section
Found `data.use_count()` > 1
Dereference `data`,
found old object.
Unreference old `data`,
`use_count` becomes 1
Delete other shared_ptr,
old object is deleted
Assign new object to `data`
Access old object
(for check `use_count`)
!! But object is freed !!
Outer check should only take a pointer to object for decide, whether to need aquire lock.
BTW, even your implementation would be correct, it has a little sence:
If data (and detach) can be accessed from several threads at the same time, object's uniqueness gives no advantages, since it can be accessed from the several threads. If you want to change object, all accesses to data should be protected by outer mutex, in that case detach() cannot be executed concurrently.
If data (and detach) can be accessed only by single thread at the same time, detach implementation doesn't need any locking at all.
This constitutes a data race if two threads invoke detach on the same instance of Foo concurrently, because std::shared_ptr<B>::use_count() (a read-only operation) would run concurrently with the std::shared_ptr<B> move-assignment operator (a modifying operation), which is a data race and hence a cause of undefined behavior. If Foo instances are never accessed concurrently, on the other hand, there is no data race, but then the std::mutex would be useless in your example. The question is: how does data's pointee become shared in the first place? Without this crucial bit of information, it is hard to tell if the code is safe even if a Foo is never used concurrently.
According to your source, I think you still need to add thread fences before the first test and after the second test.
std::shared_ptr<B> data;
std::mutex mutex;
void detach()
{
std::atomic_thread_fence(std::memory_order_acquire);
if (data.use_count() > 1)
{
auto lock = std::lock_guard<std::mutex>{mutex};
if (data.use_count() > 1)
{
std::atomic_thread_fence(std::memory_order_release);
data = std::make_shared<B>(*data);
}
}
}
I was reading the Mir project source code and I stumbled upon this piece of code :
void mir::frontend::ResourceCache::free_resource(google::protobuf::Message* key)
{
std::shared_ptr<void> value;
{
std::lock_guard<std::mutex> lock(guard);
auto const& p = resources.find(key);
if (p != resources.end())
{
value = p->second;
}
resources.erase(key);
}
}
I have seen this before in other projects as well. It holds a reference to the value in the map before its erasure, even when the bloc is protected by a lock_guard. I'm not sure why they hold a reference to the value by using std::shared_ptr value.
What are the repercussions if we remove the value = p->second ?
Will someone please enlighten me ?
This is the code http://bazaar.launchpad.net/~mir-team/mir/trunk/view/head:/src/frontend/resource_cache.cpp
My guess it that this is done to avoid running the destructor of value inside the locked code. This lock is meant to protect the modification of the map, and running some arbitrary code, such as the destructor of another object with it locked, is not needed nor wanted.
Just imagine that the destructor of value accesses indirectly, for whatever reason to the map, or to another thread-shared structure. There are chances that you end up in a deadlock.
The bottom end is: run as little code as possible from locked code, but not less. And never call a external, unknown, function (such as the shared_ptr deleter or a callback) from locked code.
The goal is to move the actual execution of the shared_ptr deleter till after the lock is released. This way, if the deleter (or the destructor using the default deleter) takes a long time, the lock is not held for that operation.
If you were to remove the value = p->second, the value would be destroyed while the lock is held. Since the lock protects the map, but not the actual values, it would hold the lock longer than is strictly necessary.
I have a vector of pointers to objects created with new. Multiple threads access this vector in a safe manner with various gets/sets. However, a thread may delete one of the objects, in which case another thread's pointer to the object is no longer valid. How can a method know if the pointer is valid? Options 1 and 2 actually seem to work well. I don't know how they will scale. What is the best approach? Is there a portable version 3?
Testing for pointer validity examples that work:
1. Use integers instead of pointers. A hash (std::map) checks to see if the pointer is still valid. Public methods look like:
get(size_t iKey)
{
if((it = mMap.find(iKey)) != mMap.end())
{
TMyType * pMyType = it->second;
// do something with pMyType
}
}
2. Have a vector of shared_ptr. Each thread tries to call lock() on its weak_ptr. If the returned shared_ptr is null we know someone deleted it while we were waiting. Public methods looks like:
get(boost::weak_ptr<TMyType> pMyType)
{
boost::shared_ptr<TMyType> pGOOD = pMyType.lock();
if (pGOOD != NULL)
{
// Do something with pGOOD
}
}
3. Test for null on plain raw pointers? Is this possible?
get(TMyType * pMyType)
{
if(pMyType != NULL){ //do something }
}
#3 will not work. Deleting a pointer and setting it to NULL doesn't affect other pointers that are pointing to the same object. There is nothing you can do with raw pointers to detect if the object is deleted.
#1 is effectively a pointer to a pointer. If you always access it through that pointer and are able to lock it. If not, what happens if it's deleted in another thread after you successfully get it?
#2 is the standard implementation of this kind of idea, and the pattern is used in many libraries. Lock a handle, getting a pointer. If you get it, use it. If not, it's gone.
There is a scenario that i need to solve with shared_ptr and weak_ptr smart pointers.
Two threads, thread 1 & 2, are using a shared object called A. Each of the threads have a reference to that object. thread 1 decides to delete object A but at the same time thread 2 might be using it. If i used shared_ptr to hold object A's references in each thread, the object wont get deleted at the right time.
What should i do to be able to delete the object when its supposed to and prevent an error in other threads that using that object at the same time?
There's 2 cases:
One thread owns the shared data
If thread1 is the "owner" of the object and thread2 needs to just use it, store a weak_ptr in thread2. Weak pointers do not participate in reference counting, instead they provide a way to access a shared_ptr to the object if the object still exists. If the object doesn't exist, weak_ptr will return an empty/null shared_ptr.
Here's an example:
class CThread2
{
private:
boost::weak_ptr<T> weakPtr
public:
void SetPointer(boost::shared_ptr<T> ptrToAssign)
{
weakPtr = ptrToAssign;
}
void UsePointer()
{
boost::shared_ptr<T> basePtr;
basePtr = weakPtr.lock()
if (basePtr)
{
// pointer was not deleted by thread a and still exists,
// so it can be used.
}
else
{
// thread1 must have deleted the pointer
}
}
};
My answer to this question (link) might also be useful.
The data is truly owned by both
If either of your threads can perform deletion, than you can not have what I describe above. Since both threads need to know the state of the pointer in addition to the underlying object, this may be a case where a "pointer to a pointer" is useful.
boost::shared_ptr< boost::shared_ptr<T> >
or (via a raw ptr)
shared_ptr<T>* sharedObject;
or just
T** sharedObject;
Why is this useful?
You only have one referrer to T (in fact shared_ptr is pretty redundant)
Both threads can check the status of the single shared pointer (is it NULL? Was it deleted by the other thread?)
Pitfalls:
- Think about what happens when both sides try to delete at the same time, you may need to lock this pointer
Revised Example:
class CAThread
{
private:
boost::shared_ptr<T>* sharedMemory;
public:
void SetPointer(boost::shared_ptr<T>* ptrToAssign)
{
assert(sharedMemory != NULL);
sharedMemory = ptrToAssign;
}
void UsePointer()
{
// lock as needed
if (sharedMemory->get() != NULL)
{
// pointer was not deleted by thread a and still exists,
// so it can be used.
}
else
{
// other thread must have deleted the pointer
}
}
void AssignToPointer()
{
// lock as needed
sharedMemory->reset(new T);
}
void DeletePointer()
{
// lock as needed
sharedMemory->reset();
}
};
I'm ignoring all the concurrency issues with the underlying data, but that's not really what you're asking about.
Qt has a QPointer class that does this. The pointers are automatically set to 0 if what they're pointed at is deleted.
(Of course, this would only work if you're interested in integrating Qt into your project.)