C++11 shared_ptr released multiple times under multithreading

C++11 shared_ptr released multiple times under multithreading - c++

Under g++ 4.8.5 compiling, it is found that improper use of sharedptr will cause multiple destruction of shared_ptr.
Fake code：
#include<memory>
class Demo
{
public:
~Demo()
{
// Do something and cost some milliseconds
}
};
typedef std::shared_ptr<Demo> DemoPtr;
DemoPtr global_demo;
DemoPtr instance() {return global_demo;}
// Main thread
int main()
{
global_demo = std::make_shared<Demo>();
// Do something
}
// Thread A
void thread_func()
{
// Do something
if(instance() != nullptr)
{
// Do something
}
// Do something
}
When the main thread ends, the global_demo reference count is reduced to 0, and global_demo begins to be destructed. When global_demo is being destructed, thread A calls instance() and makes a judgment, which causes the reference count of global_demo to increase by one again, and then when the local variable is released, the reference count is reduced to 0 again, resulting in the destruction of the object pointed to by global_demo The function is called again.
View gcc source code：
//*************__shared_count***************//
__shared_count&
operator=(const __shared_count& __r) noexcept
{
_Sp_counted_base<_Lp>* __tmp = __r._M_pi;
if (__tmp != _M_pi)
{
if (__tmp != 0)
__tmp->_M_add_ref_copy();
if (_M_pi != 0)
_M_pi->_M_release();
_M_pi = __tmp;
}
return *this;
}
//************_Sp_counted_base*****************//
void
_M_add_ref_copy()
{ __gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); }
So, this is a GCC bug?
Should I use std::weak_ptr to solve this problem in this case?
So, my instance() method code like this?
DemoPtr instance()
{
std::weak_ptr<Demo> w(global_demo);
if(!w.expired())
{
return w.lock();
}
return nullptr;
}

So, this is a GCC bug?
No. It is a bug in the program:
global_demo is being destructed
thread A calls instance()
DemoPtr instance() {return global_demo;}
You are making a copy (return global_demo;) of an object whose lifetime has ended (is being destructed). The behaviour of the program is undefined.
Should I use std::weak_ptr to solve this problem in this case?
This would not fix the bug. What you must do is join any threads that depend on static variables before returning from main. It may technically be OK to join a thread after main has returned, within a destructor of a static object as long as that object is guaranteed to be destroyed before the depended static object. But good luck with that.
for some threads I cannot control the end.
Then you must avoid using any static variables in those threads. In the example case, create a thread local copy of global_demo, and use that within the thread instead.

Related

What makes a singleton thread-unsafe?

I read somewhere that a singleton was thread-unsafe. I'm trying to understand why this is. If I have a singleton object like this:
class singleton final
{
public:
static singleton& instance()
{
static singleton unique;
return unique;
}
private:
singleton() = default;
singleton(singleton const&) = delete;
singleton& operator=(singleton const&) = delete;
};
And if I have code like this:
singleton *p1, *p2;
auto t1 = std::thread([] { p1 = &singleton::instance(); });
auto t2 = std::thread([] { p2 = &singleton::instance(); });
t1.join();
t2.join();
Is it possible for p1 and p2 to point to two different singleton instances? If unique is static, does its "static" nature not take affect until it's fully initialized? If that's so, does that mean that a static object's initialization can be accessed concurrently and thus allowing the creation of multiple static objects?

In C++98/03 a file local static:
X& instance()
{
static X x;
return x;
}
meant that your code would do something like this:
bool __instance_initialized = false;
alignas(X) char __buf_instance[sizeof(X)];
// ...
X& instance()
{
if (!__instance_initialized)
{
::new(__buf_instance) X;
__instance_initialized = true;
}
return *static_cast<X*>(__buf_instance);
}
Where the "__"-prefixed names are compiler supplied.
But in the above code, nothing is stopping two threads from entering the if at the same time, and both trying to construct the X at the same time. The compiler might try to combat that problem by writing:
bool __instance_initialized = false;
alignas(X) char __buf_instance[sizeof(X)];
// ...
X& instance()
{
if (!__instance_initialized)
{
__instance_initialized = true;
::new(__buf_instance) X;
}
return *static_cast<X*>(__buf_instance);
}
But now it is possible for one thread to set __instance_initialized to true and start constructing the X, and have the second thread test and skip over the if while the first thread is still busy constructing X. The second thread would then present uninitialized memory to its client until the first thread finally completes the construction.
In C++11 the language rules were changed such that the compiler must set up the code such that the second thread can not run past, nor start the construction of X until the first thread successfully finishes the construction. This may mean that the second thread has to wait an arbitrary amount of time before it can proceed ... until the first thread finishes. If the first thread throws an exception while trying to construct X, the second thread will wake up and try its hand at constructing it.
Here is the Itanium ABI specification for how the compiler might accomplish that.

pthread_key_create destructor not getting called

As per pthread_key_create man page we can associate a destructor to be called at thread shut down. My problem is that the destructor function I have registered is not being called. Gist of my code is as follows.
static pthread_key_t key;
static pthread_once_t tls_init_flag = PTHREAD_ONCE_INIT;
void destructor(void *t) {
// thread local data structure clean up code here, which is not getting called
}
void create_key() {
pthread_key_create(&key, destructor);
}
// This will be called from every thread
void set_thread_specific() {
ts = new ts_stack; // Thread local data structure
pthread_once(&tls_init_flag, create_key);
pthread_setspecific(key, ts);
}
Any idea what might prevent this destructor being called? I am also using atexit() at moment to do some cleanup in the main thread. Is there any chance that is interfering with destructor function being called? I tried removing that as well. Still didn't work though. Also I am not clear if I should handle the main thread as a separate case with atexit. (It's a must to use atexit by the way, since I need to do some application specific cleanup at application exit)

This is by design.
The main thread exits (by returning or calling exit()), and that doesn't use pthread_exit(). POSIX documents pthread_exit calling the thread-specific destructors.
You could add pthread_exit() at the end of main. Alternatively, you can use atexit to do your destruction. In that case, it would be clean to set the thread-specific value to NULL so in case the pthread_exit was invoked, the destruction wouldn't happen twice for that key.
UPDATE Actually, I've solved my immediate worries by simply adding this to my global unit test setup function:
::atexit([] { ::pthread_exit(0); });
So, in context of my global fixture class MyConfig:
struct MyConfig {
MyConfig() {
GOOGLE_PROTOBUF_VERIFY_VERSION;
::atexit([] { ::pthread_exit(0); });
}
~MyConfig() { google::protobuf::ShutdownProtobufLibrary(); }
};
Some of the references used:
http://www.resolvinghere.com/sof/6357154.shtml
https://sourceware.org/ml/pthreads-win32/2008/msg00007.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_key_create.html
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_exit.html
PS. Of course c++11 introduced <thread> so you have better and more portable primitves to work with.

It's already in sehe's answer, just to present the key points in a compact way:
pthread_key_create() destructor calls are triggered by a call to pthread_exit().
If the start routine of a thread returns, the behaviour is as if pthread_exit() was called (i. e., destructor calls are triggered).
However, if main() returns, the behaviour is as if exit() was called — no destructor calls are triggered.
This is explained in http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_create.html. See also C++17 6.6.1p5 or C11 5.1.2.2.3p1.

I wrote a quick test and the only thing I changed was moving the create_key call of yours outside of the set_thread_specific.
That is, I called it within the main thread.
I then saw my destroy get called when the thread routine exited.

I call destructor() manually at the end of main():
void * ThreadData = NULL;
if ((ThreadData = pthread_getspecific(key)) != NULL)
destructor(ThreadData);
Of course key should be properly initialized earlier in main() code.
PS. Calling Pthread_Exit() at the end to main() seems to hang entire application...

Your initial thought of handling the main thread as a separate case with atexit worked best for me.
Be ware that pthread_exit(0) overwrites the exit value of the process. For example, the following program will exit with status of zero even though main() returns with number three:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
class ts_stack {
public:
ts_stack () {
printf ("init\n");
}
~ts_stack () {
printf ("done\n");
}
};
static void cleanup (void);
static pthread_key_t key;
static pthread_once_t tls_init_flag = PTHREAD_ONCE_INIT;
void destructor(void *t) {
// thread local data structure clean up code here, which is not getting called
delete (ts_stack*) t;
}
void create_key() {
pthread_key_create(&key, destructor);
atexit(cleanup);
}
// This will be called from every thread
void set_thread_specific() {
ts_stack *ts = new ts_stack (); // Thread local data structure
pthread_once(&tls_init_flag, create_key);
pthread_setspecific(key, ts);
}
static void cleanup (void) {
pthread_exit(0); // <-- Calls destructor but sets exit status to zero as a side effect!
}
int main (int argc, char *argv[]) {
set_thread_specific();
return 3; // Attempt to exit with status of 3
}

I had similar issue as yours: pthread_setspecific sets a key, but the destructor never gets called. To fix it we simply switched to thread_local in C++. You could also do something like if that change is too complicated:
For example, assume you have some class ThreadData that you want some action to be done on when the thread finishes execution. You define the destructor something on these lines:
void destroy_my_data(ThreadlData* t) {
delete t;
}
When your thread starts, you allocate memory for ThreadData* instance and assign a destructor to it like this:
ThreadData* my_data = new ThreadData;
thread_local ThreadLocalDestructor<ThreadData> tld;
tld.SetDestructorData(my_data, destroy_my_data);
pthread_setspecific(key, my_data)
Notice that ThreadLocalDestructor is defined as thread_local. We rely on C++11 mechanism that when the thread exits, the destructor of ThreadLocalDestructor will be automatically called, and ~ThreadLocalDestructor is implemented to call function destroy_my_data.
Here is the implementation of ThreadLocalDestructor:
template <typename T>
class ThreadLocalDestructor
{
public:
ThreadLocalDestructor() : m_destr_func(nullptr), m_destr_data(nullptr)
{
}
~ThreadLocalDestructor()
{
if (m_destr_func) {
m_destr_func(m_destr_data);
}
}
void SetDestructorData(void (*destr_func)(T*), T* destr_data)
{
m_destr_data = destr_data;
m_destr_func = destr_func;
}
private:
void (*m_destr_func)(T*);
T* m_destr_data;
};

Is it possible to define an std::thread and initialize it later?

My aim is to keep an std::thread object as data member, and initialize it when needed.
I'm not able to do this (as in my code below) because the copy constructor of the std::thread class is deleted. Is there any other way to do it?
class MyClass
{
public:
MyClass():DiskJobThread(){};
~MyClass();
void DoDiskJobThread();
private:
int CopyThread(const std::wstring & Source, const std::wstring & Target);
int MoveThread(const std::wstring & Source, const std::wstring & Target);
std::thread DiskJobThread;
};
MyClass::~MyClass()
{
DiskJobThread.join();
}
void MyClass::DoDiskJobThread()
{
std::wstring Source = GetSource();
std::wstring Target = GetTarget();
int m_OperationType = GetOperationType();
if (m_OperationType == OPERATION_COPY)
{
DiskJobThread = std::thread(&MyClass::CopyThread, *this, Source, Target);
}
else if (m_OperationType == OPERATION_MOVE)
{
DiskJobThread = std::thread(&MyClass::MoveThread, *this, Source, Target);
}
}

How about wrapping it in a pointer?
std::unique_ptr<std::thread> thread_ptr;
// Look into std::make_unique if possible
thread_ptr = std::unique_ptr<std::thread>(new std::thread(...));
Edit: And yes, the others have mentioned it and I didn't feel the need to add it here, but in order to avoid more downvote piling, I'll say it: You are passing *this and not this thereby copying an instance of your class. (Problems arise because it's non-copyable. Pass this and you should be good to go.)

Your problem is something else - you're passing an instance of MyClass into the thread instead of the pointer to MyClass which the member functions expect. Simply change DoDiskJobThread() like this (do not dereference this):
void MyClass::DoDiskJobThread()
{
std::wstring Source = GetSource();
std::wstring Target = GetTarget();
int m_OperationType = GetOperationType();
if (m_OperationType == OPERATION_COPY)
{
DiskJobThread = std::thread(&MyClass::CopyThread, this, Source, Target);
}
else if (m_OperationType == OPERATION_MOVE)
{
DiskJobThread = std::thread(&MyClass::MoveThread, this, Source, Target);
}
}
You were getting the error because *this resulted in trying to copy MyClass into the thread function, and the copy ctor of your class is deleted (because that of std::thread is deleted). However, the member functions CopyThread and MoveThread require a pointer as the first (hidden) argument anyway.
Live demonstration

You can't initialize the thread object after it's created; by definition, initialization occurs when an object is created. But you can use swap to move a thread object into another:
std::thread thr1; // no thread of execution
std::thread thr2(my_function_object); // creates thread of execution
thr1.swap(thr2); // thr1 is now running the thread created as thr2
// and thr2 has no thread of execution

My aim is to keep an std::thread object as data member, and initialize it when needed.
Since a default-constructed std::thread object doesn't have an associated thread of execution, you can achieve that by using such an object as the target for a (move) assignment operation. However, note that the following is not initialization, but assignment:
std::thread th; // no thread of execution associated with th object
// ...
th = std::thread(func);
The temporary std::thread object created with std::thread(func) has an associated thread of execution. The ownership of this thread of execution is transferred to th through the move assignment – i.e., th steals the ownership of that thread of execution from the temporary.
Note that if th had an associated thread of execution at the moment of the assignment, std::terminate() would have been called.

Double checked locking on C++: new to a temp pointer, then assign it to instance

Anything wrong with the following Singleton implementation?
Foo& Instance() {
if (foo) {
return *foo;
}
else {
scoped_lock lock(mutex);
if (foo) {
return *foo;
}
else {
// Don't do foo = new Foo;
// because that line *may* be a 2-step
// process comprising (not necessarily in order)
// 1) allocating memory, and
// 2) actually constructing foo at that mem location.
// If 1) happens before 2) and another thread
// checks the foo pointer just before 2) happens, that
// thread will see that foo is non-null, and may assume
// that it is already pointing to a a valid object.
//
// So, to fix the above problem, what about doing the following?
Foo* p = new Foo;
foo = p; // Assuming no compiler optimisation, can pointer
// assignment be safely assumed to be atomic?
// If so, on compilers that you know of, are there ways to
// suppress optimisation for this line so that the compiler
// doesn't optimise it back to foo = new Foo;?
}
}
return *foo;
}

No, you cannot even assume that foo = p; is atomic. It's possible that it might load 16 bits of a 32-bit pointer, then be swapped out before loading the rest.
If another thread sneaks in at that point and calls Instance(), you're toasted because your foo pointer is invalid.
For true security, you will have to protect the entire test-and-set mechanism, even though that means using mutexes even after the pointer is built. In other words (and I'm assuming that scoped_lock() will release the lock when it goes out of scope here (I have little experience with Boost)), something like:
Foo& Instance() {
scoped_lock lock(mutex);
if (foo != 0)
foo = new Foo();
return *foo;
}
If you don't want a mutex (for performance reasons, presumably), an option I've used in the past is to build all singletons before threading starts.
In other words, assuming you have that control (you may not), simply create an instance of each singleton in main before kicking off the other threads. Then don't use a mutex at all. You won't have threading problems at that point and you can just use the canonical don't-care-about-threads-at-all version:
Foo& Instance() {
if (foo != 0)
foo = new Foo();
return *foo;
}
And, yes, this does make your code more dangerous to people who couldn't be bothered to read your API docs but (IMNSHO) they deserve everything they get :-)

Why not keep it simple?
Foo& Instance()
{
scoped_lock lock(mutex);
static Foo instance;
return instance;
}
Edit: In C++11 where threads is introduced into the language. The following is thread safe. The language guarantees that instance is only initialized once and in a thread safe manor.
Foo& Instance()
{
static Foo instance;
return instance;
}
So its lazily evaluated. Its thread safe. Its very simple. Win/Win/Win.

This depends on what threading library you're using. If you're using C++0x you can use atomic compare-and-swap operations and write barriers to guarantee that double-checked locking works. If you're working with POSIX threads or Windows threads, you can probably find a way to do it. The bigger question is why? Singletons, it turns out, are usually unnecessary.

the new operator in c++ always invovle 2-steps process :
1.) allocating memory identical to simple malloc
2.) invoke constructor for given data type
Foo* p = new Foo;
foo = p;
the code above will make the singleton creation into 3 step, which is even vulnerable to problem you trying to solve.

Why don't you just use a real mutex ensuring that only one thread will attempt to create foo?
Foo& Instance() {
if (!foo) {
pthread_mutex_lock(&lock);
if (!foo) {
Foo *p = new Foo;
foo = p;
}
pthread_mutex_unlock(&lock);
}
return *foo;
}
This is a test-and-test-and-set lock with free readers. Replace the above with a reader-writer lock if you want reads to be guaranteed safe in a non-atomic-replacement environment.
edit: if you really want free readers, you can write foo first, and then write a flag variable fooCreated = 1. Checking fooCreated != 0 is safe; if fooCreated != 0, then foo is initialized.
Foo& Instance() {
if (!fooCreated) {
pthread_mutex_lock(&lock);
if (!fooCreated) {
foo = new Foo;
fooCreated = 1;
}
pthread_mutex_unlock(&lock);
}
return *foo;
}

It has nothing wrong with your code. After the scoped_lock, there will be only one thread in that section, so the first thread that enters will initialize foo and return, and then second thread(if any) enters, it will return immediately because foo is not null anymore.
EDIT: Pasted the simplified code.
Foo& Instance() {
if (!foo) {
scoped_lock lock(mutex);
// only one thread can enter here
if (!foo)
foo = new Foo;
}
return *foo;
}

Thanks for all your input. After consulting Joe Duffy's excellent book, "Concurrent Programming on Windows", I am now thinking that I should be using the code below. It's largely the code from his book, except for some renames and the InterlockedXXX line. The following implementation uses:
volatile keyword on both the temp and "actual" pointers to protect against re-ordering from the compiler.
InterlockedCompareExchangePointer to protect against reordering from
the CPU.
So, that should be pretty safe (... right?):
template <typename T>
class LazyInit {
public:
typedef T* (*Factory)();
LazyInit(Factory f = 0)
: factory_(f)
, singleton_(0)
{
::InitializeCriticalSection(&cs_);
}
T& get() {
if (!singleton_) {
::EnterCriticalSection(&cs_);
if (!singleton_) {
T* volatile p = factory_();
// Joe uses _WriterBarrier(); then singleton_ = p;
// But I thought better to make singleton_ = p atomic (as I understand,
// on Windows, pointer assignments are atomic ONLY if they are aligned)
// In addition, the MSDN docs say that InterlockedCompareExchangePointer
// sets up a full memory barrier.
::InterlockedCompareExchangePointer((PVOID volatile*)&singleton_, p, 0);
}
::LeaveCriticalSection(&cs_);
}
#if SUPPORT_IA64
_ReadBarrier();
#endif
return *singleton_;
}
virtual ~LazyInit() {
::DeleteCriticalSection(&cs_);
}
private:
CRITICAL_SECTION cs_;
Factory factory_;
T* volatile singleton_;
};

efficient thread-safe singleton in C++

The usual pattern for a singleton class is something like
static Foo &getInst()
{
static Foo *inst = NULL;
if(inst == NULL)
inst = new Foo(...);
return *inst;
}
However, it's my understanding that this solution is not thread-safe, since 1) Foo's constructor might be called more than once (which may or may not matter) and 2) inst may not be fully constructed before it is returned to a different thread.
One solution is to wrap a mutex around the whole method, but then I'm paying for synchronization overhead long after I actually need it. An alternative is something like
static Foo &getInst()
{
static Foo *inst = NULL;
if(inst == NULL)
{
pthread_mutex_lock(&mutex);
if(inst == NULL)
inst = new Foo(...);
pthread_mutex_unlock(&mutex);
}
return *inst;
}
Is this the right way to do it, or are there any pitfalls I should be aware of? For instance, are there any static initialization order problems that might occur, i.e. is inst always guaranteed to be NULL the first time getInst is called?

If you are using C++11, here is a right way to do this:
Foo& getInst()
{
static Foo inst(...);
return inst;
}
According to new standard there is no need to care about this problem any more. Object initialization will be made only by one thread, other threads will wait till it complete.
Or you can use std::call_once. (more info here)

Your solution is called 'double checked locking' and the way you've written it is not threadsafe.
This Meyers/Alexandrescu paper explains why - but that paper is also widely misunderstood. It started the 'double checked locking is unsafe in C++' meme - but its actual conclusion is that double checked locking in C++ can be implemented safely, it just requires the use of memory barriers in a non-obvious place.
The paper contains pseudocode demonstrating how to use memory barriers to safely implement the DLCP, so it shouldn't be difficult for you to correct your implementation.

Herb Sutter talks about the double-checked locking in CppCon 2014.
Below is the code I implemented in C++11 based on that:
class Foo {
public:
static Foo* Instance();
private:
Foo() {}
static atomic<Foo*> pinstance;
static mutex m_;
};
atomic<Foo*> Foo::pinstance { nullptr };
std::mutex Foo::m_;
Foo* Foo::Instance() {
if(pinstance == nullptr) {
lock_guard<mutex> lock(m_);
if(pinstance == nullptr) {
pinstance = new Foo();
}
}
return pinstance;
}
you can also check complete program here: http://ideone.com/olvK13

Use pthread_once, which is guaranteed that the initialization function is run once atomically.
(On Mac OS X it uses a spin lock. Don't know the implementation of other platforms.)

TTBOMK, the only guaranteed thread-safe way to do this without locking would be to initialize all your singletons before you ever start a thread.

Your alternative is called "double-checked locking".
There could exist multi-threaded memory models in which it works, but POSIX does not guarantee one

ACE singleton implementation uses double-checked locking pattern for thread safety, you can refer to it if you like.
You can find source code here.

Does TLS work here? https://en.wikipedia.org/wiki/Thread-local_storage#C_and_C++
For example,
static _thread Foo *inst = NULL;
static Foo &getInst()
{
if(inst == NULL)
inst = new Foo(...);
return *inst;
}
But we also need a way to delete it explicitly, like
static void deleteInst() {
if (!inst) {
return;
}
delete inst;
inst = NULL;
}

The solution is not thread safe because the statement
inst = new Foo();
can be broken down into two statements by compiler:
Statement1: inst = malloc(sizeof(Foo));
Statement2: inst->Foo();
Suppose that after execution of statement 1 by one thread context switch occurs. And 2nd thread also executes the getInstance() method. Then the 2nd thread will find that the 'inst' pointer is not null. So 2nd thread will return pointer to an uninitialized object as constructor has not yet been called by the 1st thread.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

C++11 shared_ptr released multiple times under multithreading - c++

Related

What makes a singleton thread-unsafe?

pthread_key_create destructor not getting called

Is it possible to define an std::thread and initialize it later?

Double checked locking on C++: new to a temp pointer, then assign it to instance

efficient thread-safe singleton in C++

Categories

Resources