I have a double-free bug when using std::shared_ptr and am trying to understand why. I am using shared_ptr in a multithreaded environment: one thread sometimes replaces an element in a global array
std::shared_ptr<Bucket> globalTable[100]; // global elements storage
using:
globalTable[idx].reset(newBucket);
and another thread sometimes reads this table using:
std::shared_ptr<Bucket> bkt(globalTable[pIdx]);
// do calculations with bkt-> items
After this I am getting a double-free error, and AddressSanitizer says that the second piece of code tries to free an object that was already destroyed by the first one. How is this possible? As far as I know, shared_ptr is supposed to be completely thread-safe.
reset() does not guarantee thread safety.
Assignments and reference counting are thread-safe, as explained here:
To satisfy thread safety requirements, the reference counters are
typically incremented using an equivalent of std::atomic::fetch_add
with std::memory_order_relaxed (decrementing requires stronger
ordering to safely destroy the control block).
If multiple threads access the same shared_ptr instance, you can have a race condition:
If multiple threads of execution access the same shared_ptr without
synchronization and any of those accesses uses a non-const member
function of shared_ptr then a data race will occur; the shared_ptr
overloads of atomic functions can be used to prevent the data race.
Your reset call is a non-const member function, so it falls into that category. You need to use a mutex or another synchronization mechanism.
http://en.cppreference.com/w/cpp/memory/shared_ptr
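As a concrete illustration of those atomic overloads, here is a minimal sketch against the globalTable from the question (the function names are mine; std::atomic_load/std::atomic_store for shared_ptr are C++11 features, deprecated in C++20 in favor of std::atomic<std::shared_ptr>):
#include <memory>

struct Bucket { /* ... */ };

std::shared_ptr<Bucket> globalTable[100];

// Writer thread: publish a replacement bucket as one atomic operation.
void replaceBucket(int idx, Bucket* newBucket) {
    std::atomic_store(&globalTable[idx], std::shared_ptr<Bucket>(newBucket));
}

// Reader thread: take an atomic snapshot. The returned copy keeps the
// old Bucket alive even if the writer replaces the slot right afterwards.
std::shared_ptr<Bucket> readBucket(int pIdx) {
    return std::atomic_load(&globalTable[pIdx]);
}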
Not all operations on a std::shared_ptr are thread-safe.
Specifically, the reference counts are managed atomically, but it's your responsibility to make sure the std::shared_ptr instance you access is not concurrently modified.
You fail that responsibility, resulting in a data-race and the expected undefined behavior, manifesting as a double-free in your case.
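A plain mutex works just as well; a minimal sketch, again using the globals from the question (the mutex and function names are mine):
#include <memory>
#include <mutex>

struct Bucket { /* ... */ };

std::shared_ptr<Bucket> globalTable[100];
std::mutex tableMutex; // guards every read and write of globalTable

void replaceBucket(int idx, Bucket* newBucket) {
    std::lock_guard<std::mutex> lock(tableMutex);
    globalTable[idx].reset(newBucket);
}

std::shared_ptr<Bucket> readBucket(int pIdx) {
    std::lock_guard<std::mutex> lock(tableMutex);
    return globalTable[pIdx]; // the copy is made while the lock is held
}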
Related
For the producer/consumer model there is a built-in mechanism to avoid data races: a queue.
But for a global flag there seems to be no ready-to-use type that avoids data races, other than attaching a mutex to each global flag, even one as simple as a boolean or an int.
I came across shared pointers. Is it true that while one pointer operates on the variable, another is prohibited from accessing it?
Or will a unique pointer guarantee the absence of data races?
Example scenario:
One thread updates the number of visits when serving a new visitor, while another thread periodically reads that number out (possibly copying it) and saves it to a log. They will be accessing the same memory on the heap that stores that number, and the race condition is that they access it at the same time from different CPU cores, which could cause a crash.
For the producer/consumer model there is a built-in mechanism to avoid data races: a queue.
The standard library has no thread-safe queue; std::queue and the other containers cannot be used from multiple threads without explicit synchronization.
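If you want one, you have to wrap a queue yourself; a minimal sketch of a mutex-plus-condition-variable wrapper (the class name is illustrative):
#include <condition_variable>
#include <mutex>
#include <queue>

template <typename T>
class SyncQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(value));
        }
        cv_.notify_one(); // wake one waiting consumer
    }

    T pop() { // blocks until an element is available
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        T value = std::move(q_.front());
        q_.pop();
        return value;
    }

private:
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
};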
I came across shared pointers. Is it true that while one pointer operates on the variable, another is prohibited from accessing it?
std::shared_ptr (or any other standard library smart pointer) does not in any way prevent multiple threads from accessing the managed object without synchronization. std::shared_ptr only guarantees that the reference count is managed atomically, so that destruction of the managed object is thread-safe.
Or will a unique pointer guarantee the absence of data races?
std::unique_ptr cannot be copied, so you cannot have multiple std::unique_ptr instances (and hence multiple threads) managing the object. None of the smart pointers guarantees that access to the smart pointer object itself is free of data races.
One thread updates the number of visits when serving a new visitor, while another thread periodically reads that number out (possibly copying it) and saves it to a log. They will be accessing the same memory on the heap that stores that number, and the race condition is that they access it at the same time from different CPU cores, which could cause a crash.
That can simply be a std::atomic<int> or similar. Unsynchronized access to a std::atomic is allowed. There can of course still be race conditions if you rely on a particular order in which the accesses should happen, but in your example that doesn't seem to be the case. In contrast to non-atomic objects, however, there will at least be no undefined behavior due to the unsynchronized access (data race).
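A minimal sketch of the visit-counter scenario (names are mine); relaxed ordering suffices here because the logger only needs some recent value, not ordering relative to other data:
#include <atomic>

std::atomic<int> visits{0};

// Serving thread: count a new visitor.
void onVisit() {
    visits.fetch_add(1, std::memory_order_relaxed);
}

// Logging thread: reading the current value is race-free even while
// the serving thread is incrementing it.
int snapshotForLog() {
    return visits.load(std::memory_order_relaxed);
}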
Assume I have shared_ptr<T> a and two threads running concurrently where one does:
a.reset();
and another does:
auto b = a;
if the operations are atomic, then I either end up with two empty shared_ptrs, or with a being empty and b pointing to what a pointed to. I am fine with either outcome; however, due to the interleaving of the instructions, these operations might not be atomic. Is there any way I can ensure that?
To be more precise I only need a.reset() to be atomic.
UPD: as pointed out in the comments my question is silly if I don't get more specific. It is possible to achieve atomicity with a mutex. However, I wonder if, on the implementation level of shared_ptr, things are already taken care of. From cppreference.com, copy assignment and copy constructors are thread-safe. So auto b = a is alright to run without a lock. However, from this it's unclear if a.reset() is also thread-safe.
UPD1: it would be great if there is some document that specifies which methods of shared_ptr are thread-safe. From cppreference:
If multiple threads of execution access the same shared_ptr without synchronization and any of those accesses uses a non-const member function of shared_ptr then a data race will occur
It is unclear to me which of the methods are non-const.
Let the other thread use a weak_ptr. The lock() operation on a weak pointer is documented to be atomic.
Create:
std::shared_ptr<A> a = std::make_shared<A>();
std::weak_ptr<A> a_weak = std::weak_ptr<A>(a);
Thread 1:
a.reset();
Thread 2:
std::shared_ptr<A> b = a_weak.lock(); // weak_ptr has no get(); lock() atomically yields a shared_ptr (or null)
if (b != nullptr)
{
...
}
std::shared_ptr<T> is what some call a "thread-compatible" class, meaning that as long as each instance of a std::shared_ptr<T> can only have one thread calling its member functions at a given point in time, such member function invocations do not cause race conditions, even if multiple threads are accessing shared_ptrs that share ownership with each other.
std::shared_ptr<T> is not a thread-safe class; it is not safe for one thread to call a non-const method of an std::shared_ptr<T> instance while another thread is also accessing the same instance. If you need potentially concurrent reads and writes to not race, then synchronize them using a mutex.
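To make the distinction concrete, here is a small sketch contrasting the two cases (safe: each thread uses its own instance, even though the instances share one control block; unsafe: two threads use the same instance, one through a non-const member):
#include <memory>
#include <thread>

void fine() {
    auto p = std::make_shared<int>(42);
    std::shared_ptr<int> q = p; // two instances, one control block
    // Each thread copies only its own instance; the shared reference
    // count is updated atomically, so this is well-defined.
    std::thread t1([p] { auto local = p; });
    std::thread t2([q] { auto local = q; });
    t1.join();
    t2.join();
}

void racy() {
    auto p = std::make_shared<int>(42);
    std::thread t1([&p] { p.reset(); });      // writes the instance p
    std::thread t2([&p] { auto local = p; }); // reads p concurrently: data race
    t1.join();
    t2.join();
}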
struct MyClass {
    ~MyClass() {
        // Asynchronously invoke deletion (erase) of entries from my_map;
        // different entries are deleted in different threads.
        // Need to spin, as 'this' object is shared among threads and
        // destruction of the object would result in seg faults.
        while (my_map.size() > 0); // This spins forever due to compiler optimization.
    }
    unordered_map<key, value> my_map;
};
I have the above class, in which elements of the unordered map are deleted asynchronously in the destructor, and I must spin/sleep because the object is shared among other threads. I cannot declare my_map as volatile, as that results in compilation errors. What else can I do here? How do I tell the compiler that my_map.size() will become 0 at some point in time? Please do not tell me why/how this design is bad; I cannot change the design, for reasons I cannot explain unless I write thousands of lines of code here.
Edit: my_map is protected by a version of a spinlock, so threads do grab the spinlock before erasing entries. The while (my_map.size() > 0); loop was the only "naive" spin in the code. I converted it to grab the spinlock and then check the size (in a loop), and it worked. Though using a condition_variable would be the right way of doing it, we use an asynchronous programming model (like SEDA), which binds us to avoid any sleeping/yielding calls.
volatile is not the solution to this problem. volatile has exactly three uses: 1. accessing memory-mapped devices in a driver, 2. signal handlers, 3. setjmp usage.
Read the following, over and over until it sinks in. volatile is useless in multithreading.
A naive spin lock like that has three problems:
The compiler is permitted to cache results, which is why you get the "spin forever" behavior you're seeing.
In the classic case, you have the risk of a race condition: thread A may check the lock variable, find the resource is accessible, but then get pre-empted before setting the lock variable. Along comes thread B who also finds the lock variable showing the resource as accessible, so it then locks it and starts to access the resource, Then thread A wakes back up, locks the variable again, and also accesses the resource.
There is a data write-order problem. If a protected variable is written to and then a lock variable is changed, you have no guarantee that a different thread that sees the changed lock variable will also see the change to the protected variable. Both the compiler and out-of-order execution on the CPU are permitted to reorder these writes.
volatile only solves the first of these problems; it does nothing to address the other two. One caveat: by default, MSVC on x86/x64 adds a memory fence to volatile accesses even though the standard doesn't require it. That happens to solve the third problem, but it still doesn't fix the second one.
The only solution to all three of these problems is to use correct synchronization primitives: std::atomic<> if you really must spin-lock, or preferably std::mutex, and perhaps std::condition_variable, for a lock that puts the thread to sleep until something interesting happens.
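For instance, if the spin really cannot be replaced with a sleeping wait, the outstanding work could be tracked in a std::atomic counter instead of calling my_map.size(); a sketch under that assumption (the counter and the hook function are hypothetical):
#include <atomic>
#include <cstddef>

struct MyClass {
    std::atomic<std::size_t> pending_{0}; // set to the number of scheduled erases

    // Called by each erasing thread after it removes its entry.
    void onEntryErased() {
        pending_.fetch_sub(1, std::memory_order_release);
    }

    ~MyClass() {
        // The atomic load cannot be cached by the compiler, and acquire
        // ordering synchronizes with the release decrements, so the spin
        // terminates once all erasing threads are done.
        while (pending_.load(std::memory_order_acquire) != 0) {
            // busy-wait
        }
    }
};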
Consider the following implementation of a trivial thread pool written in C++14.
threadpool.h
threadpool.cpp
Observe that each thread sleeps until it is notified to awaken (or wakes spuriously) and the following predicate evaluates to true:
std::unique_lock<mutex> lock(this->instance_mutex_);
this->cond_handle_task_.wait(lock, [this] {
return (this->destroy_ || !this->tasks_.empty());
});
Furthermore, observe that a ThreadPool object uses the data member destroy_ to determine whether it's being destroyed, i.e., whether the destructor has been called. Toggling this data member to true notifies each worker thread that it's time to finish its current task and any other queued tasks, and then to synchronize with the thread that's destroying the object; it also disables the enqueue member function.
For your convenience, the implementation of the destructor is below:
ThreadPool::~ThreadPool() {
{
std::lock_guard<mutex> lock(this->instance_mutex_); // this line.
this->destroy_ = true;
}
this->cond_handle_task_.notify_all();
for (auto &worker : this->workers_) {
worker.join();
}
}
Q: I do not understand why it's necessary to lock the object's mutex while toggling destroy_ to true in the destructor. Furthermore, is it only necessary for setting its value or is it also necessary for accessing its value?
BQ: Can this thread pool implementation be improved or optimized while maintaining its original purpose: a thread pool that can pool N threads and distribute tasks to them to be executed concurrently?
This thread pool implementation is forked from Jakob Progsch's C++11 thread pool repository with a thorough code step through to understand the purpose behind its implementation and some subjective style changes.
I am introducing myself to concurrent programming and there is still much to learn -- I am a novice concurrent programmer as it stands right now. If my questions are not worded correctly then please make the appropriate correction(s) in your provided answer. Moreover, if the answer can be geared towards a client who is being introduced to concurrent programming for the first time then that would be best -- for myself and any other novices as well.
If the owning thread of the ThreadPool object is the only thread that atomically writes to the destroy_ variable, and the worker threads only atomically read from it, then no, a mutex is not needed to protect destroy_ in the ThreadPool destructor. Typically a mutex is necessary when an atomic set of operations must take place that can't be accomplished through a single atomic instruction on the platform (i.e., operations beyond an atomic swap, etc.). That being said, the author of the thread pool may be trying to force some type of acquire semantics on the destroy_ variable without resorting to atomic operations (i.e., via a memory-fence operation), and/or the setting of the flag itself may not be an atomic operation on the given platform. Some other options include declaring the variable volatile to prevent it from being cached, etc. You can see this thread for more info.
Without some sort of synchronization operation in place, the worst-case scenario could end up with a worker that never completes because the destroy_ variable is cached on a thread. On platforms with weaker memory-ordering models, that's always a possibility if you allow a benign memory race condition to exist...
C++ defines a data race as multiple threads potentially accessing an object simultaneously, with at least one of those accesses being a write. Programs with data races have undefined behavior. If you were to write to destroy_ in your destructor without holding the mutex, your program would have undefined behavior, and we could not predict what would happen.
If you were to read destroy_ elsewhere without holding the mutex, that read could happen while the destructor is writing to it, which is also a data race.
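To see how the two concerns interact, here is a sketch with hypothetical names: making the flag std::atomic<bool> removes the data race, yet the writer still takes the mutex before storing, because a worker could otherwise evaluate the wait predicate as false, get preempted before blocking, miss the notify_all, and sleep forever (a lost wakeup):
#include <atomic>
#include <condition_variable>
#include <mutex>

struct PoolState {
    std::mutex m;
    std::condition_variable cv;
    std::atomic<bool> destroy{false};

    void shutdown() {
        {
            // Holding the mutex orders this store against the predicate
            // check inside cv.wait, preventing the lost wakeup.
            std::lock_guard<std::mutex> lock(m);
            destroy.store(true);
        }
        cv.notify_all();
    }

    void workerWait() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return destroy.load(); });
    }
};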
I have a question about boost::shared_ptr<T>.
There are lots of threads.
using namespace boost;
class CResource
{
// xxxxxx
};
class CResourceBase
{
public:
void SetResource(shared_ptr<CResource> res)
{
m_Res = res;
}
shared_ptr<CResource> GetResource()
{
return m_Res;
}
private:
shared_ptr<CResource> m_Res;
};
CResourceBase base;
//----------------------------------------------
// Thread_A:
while (true)
{
//...
shared_ptr<CResource> nowResource = base.GetResource();
nowResource->doSomeThing();
//...
}
// Thread_B:
shared_ptr<CResource> nowResource;
base.SetResource(nowResource);
//...
Q1
If Thread_A does not care whether nowResource is the newest, will this part of the code have a problem?
I mean, while Thread_B has not yet completed SetResource(), could Thread_A get a corrupted smart pointer from GetResource()?
Q2
What does thread-safe mean?
If I do not care whether the resource is the newest, will the shared_ptr<CResource> nowResource crash the program when it is released, or will the problem corrupt the shared_ptr<CResource> itself?
boost::shared_ptr<> offers a certain level of thread safety. The reference count is manipulated in a thread safe manner (unless you configure boost to disable threading support).
So you can copy a shared_ptr around, and the ref count is maintained correctly. What you cannot do safely is modify the actual shared_ptr object instance itself from multiple threads (such as calling reset() on it). So your usage is not safe: you're modifying the actual shared_ptr instance in multiple threads, so you'll need to provide your own protection.
In my code, shared_ptr's are generally locals or parameters passed by value, so there's no issue. Getting them from one thread to another I generally use a thread-safe queue.
Of course none of this addresses the thread safety of accessing the object pointed to by the shared_ptr - that's also up to you.
From the boost documentation:
shared_ptr objects offer the same level of thread safety as built-in types. A shared_ptr instance can be "read" (accessed using only const operations) simultaneously by multiple threads. Different shared_ptr instances can be "written to" (accessed using mutable operations such as operator= or reset) simultaneously by multiple threads (even when these instances are copies, and share the same reference count underneath.)
Any other simultaneous accesses result in undefined behavior.
So your usage is not safe, since it involves a simultaneous read and write of m_Res. Example 3 in the boost documentation also illustrates this.
You should use a separate mutex that guards access to m_Res in SetResource/GetResource.
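For example, a sketch of the same class with that mutex added (assuming Boost.Thread; the mutex name is mine):
#include <boost/shared_ptr.hpp>
#include <boost/thread/mutex.hpp>

class CResource { /* ... */ };

class CResourceBase
{
public:
    void SetResource(boost::shared_ptr<CResource> res)
    {
        boost::mutex::scoped_lock lock(m_ResMutex);
        m_Res = res;
    }
    boost::shared_ptr<CResource> GetResource()
    {
        boost::mutex::scoped_lock lock(m_ResMutex);
        return m_Res; // the copy is made while the lock is held
    }
private:
    boost::mutex m_ResMutex; // guards m_Res
    boost::shared_ptr<CResource> m_Res;
};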
Well, the documentation of tr1::shared_ptr (which is based on boost) tells a different story, implying that resource management is thread-safe whereas access to the resource is not.
"...
Thread Safety
The Thread Safety section of the Boost shared_ptr documentation says "shared_ptr objects offer the same level of thread safety as built-in types." The implementation must ensure that concurrent updates to separate shared_ptr instances are correct even when those instances share a reference count e.g.
shared_ptr a(new A);
shared_ptr b(a);
// Thread 1 // Thread 2
a.reset(); b.reset();
The dynamically-allocated object must be destroyed by exactly one of the threads. Weak references make things even more interesting. The shared state used to implement shared_ptr must be transparent to the user and invariants must be preserved at all times. The key pieces of shared state are the strong and weak reference counts. Updates to these need to be atomic and visible to all threads to ensure correct cleanup of the managed resource (which is, after all, shared_ptr's job!) On multi-processor systems memory synchronisation may be needed so that reference-count updates and the destruction of the managed resource are race-free.
..."
see
http://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html#std.util.memory.shared_ptr
m_Res is not thread-safe, because it is simultaneously read and written; you need the boost::atomic_store/atomic_load functions to protect it.
//--- Example 3 ---
// thread A
p = p3; // reads p3, writes p
// thread B
p3.reset(); // writes p3; undefined, simultaneous read/write
Additionally, your class risks a cyclic-reference condition; in that case the shared_ptr<CResource> m_Res can't be a member of CResourceBase. You can use weak_ptr instead.
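A sketch of the same accessors using those free functions instead of a mutex (assuming a Boost release that ships boost::atomic_load/boost::atomic_store for shared_ptr, i.e. 1.53 or later):
#include <boost/shared_ptr.hpp>

class CResource { /* ... */ };

class CResourceBase
{
public:
    void SetResource(boost::shared_ptr<CResource> res)
    {
        boost::atomic_store(&m_Res, res); // atomic write of the shared_ptr
    }
    boost::shared_ptr<CResource> GetResource()
    {
        return boost::atomic_load(&m_Res); // atomic read; returns a copy
    }
private:
    boost::shared_ptr<CResource> m_Res;
};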