Shouldn't I use _endthreadex() in thread procedure for stack unwinding? - c++

I examined stack unwinding in a thread procedure in the Win32 environment.
My test code is the following.
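class Dummy
{
public:
    Dummy() { wcout << L"dummy ctor" << endl; }
    ~Dummy() { wcout << L"dummy dtor" << endl; }
};
void InnerFunc()
{
    Dummy dm;
    for (;;) {  // assumed loop: keep allocating until new throws bad_alloc
        char *buf = new char[100000000];
    }
}
unsigned WINAPI ThreadFunc(void *arg)
{
    Dummy dm;
    try { InnerFunc(); }
    catch(const bad_alloc &e)
    {
        wcout << e.what() << endl;
        _endthreadex(0);  // assumed position of the call under discussion
    }
    return 0;
}
void OuterFunc()
{
    Dummy dm;
    HANDLE hThread = (HANDLE)_beginthreadex(0, 0, ThreadFunc, 0, 0, 0);
    WaitForSingleObject(hThread, INFINITE);
}
int _tmain(int argc, _TCHAR* argv[])
{
    try { OuterFunc(); }
    catch(const bad_alloc &e) { wcout << e.what() << endl; }
    return 0;
}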
Output result:
dummy ctor
dummy ctor
dummy ctor
dummy dtor
bad allocation
dummy dtor
As you can see, the constructor and destructor output is not paired. I think that _endthreadex() signals the thread handle and skips stack unwinding for the thread.
When I tested again without _endthreadex(), I got the result I expected.
In this case, if I need stack unwinding in the thread, should I avoid calling _endthreadex() in the thread procedure?

I would guess the destructor is never called for the instance created in ThreadFunc. However, you should add a way to distinguish each constructor and destructor call to be sure.
Assuming that's what's happening, it seems pretty clear that _endthreadex terminates the thread immediately without cleaning up the stack. The docs explicitly state that _endthreadex is called automatically when ThreadFunc returns, so why bother calling it explicitly here?
This is definitely a case where I'd use boost::thread instead. It will do the right thing in terms of thread creation and cleanup without making you worry about the win32-specific details.
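A minimal sketch of the same experiment, with std::thread standing in for boost::thread (my substitution; the two interfaces are very similar). The names Tracker, inner, worker, and run_worker are hypothetical, and the counters replace the printed trace. The thread function handles the exception and simply returns, so every constructor is matched by a destructor.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>
#include <thread>

// Hypothetical counters replacing the "dummy ctor"/"dummy dtor" trace.
std::atomic<int> g_ctors{0};
std::atomic<int> g_dtors{0};

struct Tracker {
    Tracker() { ++g_ctors; }
    ~Tracker() { ++g_dtors; }
};

void inner() {
    Tracker t;
    throw std::runtime_error("simulated failure");  // stands in for bad_alloc
}

void worker() {
    Tracker t;
    try {
        inner();
    } catch (const std::exception&) {
        // Handle the error, then fall through to a normal return.
    }
}  // returning unwinds this frame, so t is destroyed

void run_worker() {
    Tracker t;
    std::thread th(worker);
    th.join();
    assert(!th.joinable());  // the thread has finished and been joined
}
```

After run_worker() returns, both counters should read 3: one Tracker each in run_worker, worker, and inner.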

Your problem is:
char *buf = new char[100000000];
You have created a memory leak: on each iteration you allocate a new array and lose the only reference to the previous one.
Stack unwinding destroys all the local objects in the scopes it leaves.
Dummy dm;
is an object allocated in local storage inside InnerFunc(). Stack unwinding rightly destroys this object, and the single destructor call trace you see is due to this.
Stack unwinding does not deallocate dynamic memory. Each pointer allocated with new[] has to be explicitly deallocated by calling delete[] on the same address.
I don't see how this is related to any of the Windows thread functions (I am not much into Windows), but as I already stated, you have a problem there.
The simple solution to handling cleanups during exceptions is RAII.
You should use a Smart pointer to wrap your raw pointer and then the Smart pointer ensures that your object memory gets appropriately deallocated once the scope ends.
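As a rough illustration of the RAII point, here is a sketch that wraps the buffer in std::unique_ptr. The CountingDeleter, the g_frees counter, and the function name are hypothetical; they exist only to make the release observable. The thrown std::bad_alloc simulates a later allocation failing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

int g_frees = 0;  // hypothetical counter, only here to make the release visible

struct CountingDeleter {
    void operator()(char* p) const {
        delete[] p;
        ++g_frees;
    }
};

// Returns true if an exception unwound the scope that owned the buffer.
bool allocate_and_fail(std::size_t bytes) {
    bool unwound = false;
    try {
        std::unique_ptr<char[], CountingDeleter> buf(new char[bytes]);
        assert(buf);
        throw std::bad_alloc();  // simulate a later allocation failing
    } catch (const std::bad_alloc&) {
        unwound = true;  // buf has already been released at this point
    }
    return unwound;
}
```

In real code the default deleter is enough; std::unique_ptr<char[]> calls delete[] on its own when the scope is left.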


How to terminate a thread safely? (with the usage of pointer) c++

I am currently learning multithreading in C++11 and I am confused about how to terminate a thread safely.
In C++, I know how to create threads and use thread.join() to make main() wait for all threads to finish before it exits.
However, I found that some multithreaded code that manages threads through pointers runs even without calling thread.join().
#include <iostream>
#include <thread>

class Greating
{
public:
    Greating(const int& _i):i_(_i){}
    int i_;
    void say()
    {
        std::cout << "Hello World" << i_ << std::endl;
    }
};

int main(){
    Greating greating1(1);
    Greating greating2(2);
    std::thread t1(&Greating::say, greating1);
    std::thread t2(&Greating::say, greating2);
    return 0;
}
The code shown above reliably reports the error "terminate called without an active exception
Aborted (core dumped)", because I did not call t1.join() and t2.join().
However, I found that in some code that uses a pointer to manage the thread, this does not become a problem, as shown below.
#include <iostream>
#include <thread>

class Greating
{
public:
    Greating(const int& _i):i_(_i){}
    int i_;
    void say()
    {
        std::cout << "Hello World" << i_ << std::endl;
    }
};

int main(){
    Greating greating1(1);
    Greating greating2(2);
    std::thread* tt1 = new std::thread(&Greating::say, greating1);
    std::thread* tt2 = new std::thread(&Greating::say, greating2);
    return 0;
}
The output is:
Hello WorldHello World12
Hello World12
There is no error reported. This confused me.
So my questions are:
Why, when we use a pointer to manage the thread, can we get away without calling thread.join()?
How do I correctly terminate a thread? (Probably by waiting for the callable to finish?)
Thanks very much!
When creating objects with dynamic allocation, you have to deallocate the memory with operator delete so it calls appropriate destructor.
In the first example, two std::thread objects are created. At the end of main function, the destructor std::thread::~thread is called. Since the threads are not joined, the destructor reports an error.
On the other hand, in the second example, you called operator new so you create objects with dynamic allocation. But, you didn't call operator delete, so the destructor is not called. That is, the program didn't check whether the threads are joined.
Therefore, the only way to correctly terminate a thread is to call std::thread::join. If you want to use pointers, you have to do the following:
std::thread *th = new std::thread(foo);
th->join();
delete th;
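If the thread object really has to live on the heap, one option is to let std::unique_ptr do the delete. This is only a sketch; store_answer and run_and_join are hypothetical names. The join still has to happen before the std::thread destructor runs.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical worker: stores a value so the caller can observe completion.
void store_answer(int* out) { *out = 42; }

int run_and_join() {
    int result = 0;
    // unique_ptr deletes the std::thread object at scope exit.
    std::unique_ptr<std::thread> th(new std::thread(store_answer, &result));
    assert(th->joinable());
    th->join();  // must happen before the std::thread destructor runs
    return result;
}
```

Without the join() call, the destructor invoked by unique_ptr would hit the same std::terminate as in the first example of the question.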

Will there be any leak in below C++ shared_ptr usage?

Is the allocated memory managed by a smart pointer guaranteed to be freed up in event of an exception, such as below?
#include <memory>

void test( std::shared_ptr<int> sptr )
{
    throw "exception";
}

int main()
{
    std::shared_ptr<int> ptr( new int(1) );
    test( ptr );
    return 0;
}
I tried executing the code, putting breakpoint at shared_ptr destructor but I did not see it getting called. I think the memory should be cleaned up by itself. Am I right, or won't it be cleaned up?
The language standard states that:
If no matching handler is found, the function std::terminate() is
called; whether or not the stack is unwound before this call to
std::terminate() is implementation-defined
So your program isn't guaranteed to clean up after itself, but most (if not all) modern operating systems will do it post-mortem.
Had you caught the exception, the shared_ptr's instance would've been destroyed properly, ensuring no leaks.
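A small sketch of the caught case, assuming the same test() function as in the question. The std::weak_ptr observer and the function name freed_after_catch are additions for illustration; the observer lets us check that the int is gone once the handler has run.

```cpp
#include <cassert>
#include <memory>

void test(std::shared_ptr<int> sptr) {
    throw "exception";
}

// Returns true if the int was freed once the exception had been handled.
bool freed_after_catch() {
    std::weak_ptr<int> observer;
    try {
        std::shared_ptr<int> ptr(new int(1));
        observer = ptr;
        assert(observer.use_count() == 1);  // one owner before the call
        test(ptr);
    } catch (const char*) {
        // ptr and the by-value parameter were both destroyed during unwinding.
    }
    return observer.expired();
}
```

Because a matching handler exists, unwinding is guaranteed here and the last shared_ptr releases the memory.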
Here is a better example for understanding:
#include <iostream>
#include <memory>
using namespace std;
class A
{
public:
    A() { cout << "Constructor" << endl; }
    ~A() { cout << "destructor" << endl; }
};
void test(std::shared_ptr<A> sptr)
{
    throw "exception";
}
void function()
{
    std::shared_ptr<A> ptr(new A);
    test(ptr);
}
int main()
{
    function();
    return 0;
}
Before the program crashes, only the constructor gets called, which shows that no destruction takes place.
But if we debug in Visual Studio and choose to continue after the exception, the destructor gets called as well.

pthread_key_create destructor not getting called

As per pthread_key_create man page we can associate a destructor to be called at thread shut down. My problem is that the destructor function I have registered is not being called. Gist of my code is as follows.
static pthread_key_t key;
static pthread_once_t tls_init_flag = PTHREAD_ONCE_INIT;

void destructor(void *t) {
    // thread local data structure clean up code here, which is not getting called
}

void create_key() {
    pthread_key_create(&key, destructor);
}

// This will be called from every thread
void set_thread_specific() {
    ts_stack *ts = new ts_stack; // Thread local data structure
    pthread_once(&tls_init_flag, create_key);
    pthread_setspecific(key, ts);
}
Any idea what might prevent this destructor being called? I am also using atexit() at moment to do some cleanup in the main thread. Is there any chance that is interfering with destructor function being called? I tried removing that as well. Still didn't work though. Also I am not clear if I should handle the main thread as a separate case with atexit. (It's a must to use atexit by the way, since I need to do some application specific cleanup at application exit)
This is by design.
The main thread exits (by returning or calling exit()), and that doesn't use pthread_exit(). POSIX documents pthread_exit calling the thread-specific destructors.
You could add pthread_exit() at the end of main. Alternatively, you can use atexit to do your destruction. In that case, it would be clean to set the thread-specific value to NULL so in case the pthread_exit was invoked, the destruction wouldn't happen twice for that key.
UPDATE Actually, I've solved my immediate worries by simply adding this to my global unit test setup function:
::atexit([] { ::pthread_exit(0); });
So, in context of my global fixture class MyConfig:
struct MyConfig {
    MyConfig() {
        ::atexit([] { ::pthread_exit(0); });
    }
    ~MyConfig() { google::protobuf::ShutdownProtobufLibrary(); }
};
PS. Of course C++11 introduced <thread>, so you have better and more portable primitives to work with.
It's already in sehe's answer, just to present the key points in a compact way:
pthread_key_create() destructor calls are triggered by a call to pthread_exit().
If the start routine of a thread returns, the behaviour is as if pthread_exit() was called (i. e., destructor calls are triggered).
However, if main() returns, the behaviour is as if exit() was called — no destructor calls are triggered.
This follows from the language rules for returning from main(); see C++17 6.6.1p5 or the equivalent rule in C11.
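The second key point can be checked with a short sketch. All names here (g_key, start_routine, count_destructor_calls, and so on) are hypothetical, and error handling is kept to a minimum. The worker's start routine simply returns, which should trigger the key destructor before pthread_join() comes back.

```cpp
#include <cassert>
#include <pthread.h>

static pthread_key_t g_key;
static pthread_once_t g_once = PTHREAD_ONCE_INIT;
static int g_destructor_calls = 0;  // written by the worker, read after join

static void destructor(void* p) {
    delete static_cast<int*>(p);
    ++g_destructor_calls;
}

static void create_key() { pthread_key_create(&g_key, destructor); }

static void* start_routine(void*) {
    pthread_once(&g_once, create_key);
    int rc = pthread_setspecific(g_key, new int(7));
    assert(rc == 0);
    (void)rc;
    return nullptr;  // behaves as if pthread_exit() had been called
}

// Runs one worker thread and returns how many times the destructor fired.
int count_destructor_calls() {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, start_routine, nullptr) != 0) return -1;
    pthread_join(tid, nullptr);
    return g_destructor_calls;
}
```

Setting the same key from main() and then returning from main() would not bump the counter, which is the case the question ran into.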
I wrote a quick test and the only thing I changed was moving the create_key call of yours outside of the set_thread_specific.
That is, I called it within the main thread.
I then saw my destroy get called when the thread routine exited.
I call destructor() manually at the end of main():
void * ThreadData = NULL;
if ((ThreadData = pthread_getspecific(key)) != NULL)
    destructor(ThreadData);
Of course key should be properly initialized earlier in main() code.
PS. Calling pthread_exit() at the end of main() seems to hang the entire application...
Your initial thought of handling the main thread as a separate case with atexit worked best for me.
Beware that pthread_exit(0) overwrites the exit value of the process. For example, the following program will exit with a status of zero even though main() returns three:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

class ts_stack {
public:
    ts_stack () {
        printf ("init\n");
    }
    ~ts_stack () {
        printf ("done\n");
    }
};

static void cleanup (void);

static pthread_key_t key;
static pthread_once_t tls_init_flag = PTHREAD_ONCE_INIT;

void destructor(void *t) {
    // thread local data structure clean up code here, which is not getting called
    delete (ts_stack*) t;
}

void create_key() {
    pthread_key_create(&key, destructor);
    atexit(cleanup);
}

// This will be called from every thread
void set_thread_specific() {
    ts_stack *ts = new ts_stack (); // Thread local data structure
    pthread_once(&tls_init_flag, create_key);
    pthread_setspecific(key, ts);
}

static void cleanup (void) {
    pthread_exit(0); // <-- Calls destructor but sets exit status to zero as a side effect!
}

int main (int argc, char *argv[]) {
    set_thread_specific();
    return 3; // Attempt to exit with status of 3
}
I had a similar issue: pthread_setspecific sets a key, but the destructor never gets called. To fix it we simply switched to thread_local in C++. If that change is too complicated, you could also do something like the following.
For example, assume you have some class ThreadData on which some action should be performed when the thread finishes execution. You define the destructor along these lines:
void destroy_my_data(ThreadData* t) {
    delete t;
}
When your thread starts, you allocate memory for a ThreadData instance and assign a destructor to it like this:
ThreadData* my_data = new ThreadData;
thread_local ThreadLocalDestructor<ThreadData> tld;
tld.SetDestructorData(my_data, destroy_my_data);
pthread_setspecific(key, my_data);
Notice that ThreadLocalDestructor is defined as thread_local. We rely on C++11 mechanism that when the thread exits, the destructor of ThreadLocalDestructor will be automatically called, and ~ThreadLocalDestructor is implemented to call function destroy_my_data.
Here is the implementation of ThreadLocalDestructor:
template <typename T>
class ThreadLocalDestructor
{
public:
    ThreadLocalDestructor() : m_destr_func(nullptr), m_destr_data(nullptr) {}
    ~ThreadLocalDestructor() {
        if (m_destr_func) {
            m_destr_func(m_destr_data);
        }
    }
    void SetDestructorData(void (*destr_func)(T*), T* destr_data) {
        m_destr_data = destr_data;
        m_destr_func = destr_func;
    }
private:
    void (*m_destr_func)(T*);
    T* m_destr_data;
};
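For comparison, the plain thread_local switch mentioned at the start of this answer might look roughly like the sketch below. PerThreadState, state(), and the counter are hypothetical names; the destructor of the thread_local object plays the role that the pthread key destructor was supposed to play.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> g_cleanups{0};  // hypothetical counter to observe the cleanup

// Hypothetical per-thread state; its destructor replaces the key destructor.
struct PerThreadState {
    int value = 0;
    ~PerThreadState() { ++g_cleanups; }
};

PerThreadState& state() {
    thread_local PerThreadState s;  // constructed on first use in each thread
    return s;
}

int run_thread_and_count_cleanups() {
    std::thread th([] {
        assert(state().value == 0);  // fresh instance for this thread
        state().value = 1;
    });
    th.join();
    return g_cleanups.load();
}
```

Only threads that actually call state() construct an instance, so the calling thread does not contribute to the count here.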

pthread_exit vs return in posix thread

Here is my program just to find the difference between pthread_exit and return from a thread.
struct foo{
    int a,b,c,d;
    ~foo(){cout<<"foo destructor called"<<endl;}
};
//struct foo foo={1,2,3,4};
void printfoo(const char *s, const struct foo *fp)
{
    cout<<s;
    cout<<"struct at "<<(const void *)fp<<endl;
}
void *thr_fn1(void *arg)
{
    struct foo foo={1,2,3,4};
    pthread_exit((void *)&foo);
    //return((void *)&foo);
}
int main(int argc, char *argv[])
{
    int err;
    pthread_t tid1,tid2;
    struct foo *fp;
    err=pthread_create(&tid1,NULL,thr_fn1,NULL);
    if(err!=0)
        cout<<"can't create thread 1"<<endl;
    err=pthread_join(tid1,(void **)&fp);
    if(err!=0)
        cout<<"can't join with thread 1"<<endl;
    return 0;
}
In the thread function thr_fn1() I created an object foo.
According to the page "pthread_exit vs. return",
when I exit the thread function thr_fn1() using "return((void *)&foo);" it should call the destructor for the object foo, but it should not call the destructor when I call "pthread_exit((void *)&foo);" to return to main from thr_fn1().
But in both cases, whether I use "return((void *)&foo);" or "pthread_exit((void *)&foo);", the destructor of the local object foo in thr_fn1() is called.
This is not the behaviour I expected. The destructor should be called in the "return((void *)&foo);" case only.
Please correct me if I am wrong.
Your code has a serious problem. Specifically, you're using a local variable as the exit value for pthread_exit():
void *thr_fn1(void *arg)
{
    struct foo foo={1,2,3,4};
    pthread_exit((void *)&foo);
    //return((void *)&foo);
}
Per the Pthreads spec, "After a thread has terminated, the result of access to local (auto) variables of the thread is undefined."
Therefore, returning the address of a stack-allocated variable from your thread function as the thread exit value (in your case, pthread_exit((void *)&foo) ) will cause problems for any code that retrieves and attempts to dereference this address.
Yes, that's right. pthread_exit() immediately exits the current thread, without calling any destructors of objects higher up on the stack. If you're coding in C++, you should make sure to either always return from your thread procedure, or only call pthread_exit() from one of the bottommost stack frames with no objects with destructors still alive in that frame or any higher frames; otherwise, you will leak resources or cause other bad problems.
With GCC and glibc, pthread_exit() throws an exception, which causes the stack to unwind and destructors to be called for locals.
The exception thrown is of type abi::__forced_unwind (from cxxabi.h); an Internet search can give you more details.
Note: as other answers and comments have mentioned, returning the address of a local wouldn't work anyway, but that is beside the point of the question. You get the same behavior regarding the destruction of foo if some other valid address (or the null pointer) is returned instead of &foo.
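One way to hand a result back without pointing into the dead thread's stack is to allocate it on the heap and let the joining thread take ownership. The sketch below assumes that approach; thr_fn_heap and join_and_sum are hypothetical names, and foo is reduced to plain data.

```cpp
#include <cassert>
#include <pthread.h>

struct foo {
    int a, b, c, d;
};

// The result lives on the heap, so it outlives the thread's stack.
static void* thr_fn_heap(void*) {
    foo* result = new foo{1, 2, 3, 4};
    return result;  // the joiner takes ownership
}

// Joins the worker and returns the sum of the fields, or -1 on failure.
int join_and_sum() {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, thr_fn_heap, nullptr) != 0) return -1;
    void* raw = nullptr;
    if (pthread_join(tid, &raw) != 0) return -1;
    assert(raw != nullptr);
    foo* fp = static_cast<foo*>(raw);
    int sum = fp->a + fp->b + fp->c + fp->d;
    delete fp;
    return sum;
}
```

Returning from the thread function rather than calling pthread_exit() also sidesteps the question of whether destructors run, since a normal return always unwinds the frame.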

Pointer Value is changed when passed through Threadcreate

I am creating a new thread and passing it an object of class demo.
The class demo is defined in a .h file.
int threadentry(void* data)
{
    demo* inst=(demo*) data;
    cout << "Value of inst "<<hex << &inst<< endl;//value is different from below
}
int main()
{
    demo* inst=new demo();
    cout << "Value of inst "<<hex << &inst<< endl; //value is coming different from above
    HANDLE threads;
    DWORD threadId1;
    if ((threads = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)threadentry,
        (void *)inst, 0, &threadId1)) == NULL)
        return -1;
    delete inst;
}
I think the values should be different because the address is copied into the data variable of threadentry. How can I check that the same object is being passed?
The code is printing the address of a pointer, not the address of the object. There are two pointer variables involved (one is declared in main() and the other is the argument of the thread function) so the output is different. Drop the &, address of operator, from the output statements:
cout << "Value of inst "<<hex << inst << endl;
Give ownership of the supplied object to the thread as it knows when it has finished using it. In the posted code the object is deleted after the thread is created, possibly resulting in the thread using a dangling pointer. Move the delete of object from main into the thread.
The signature of the thread function is:
DWORD WINAPI ThreadProc(
    _In_ LPVOID lpParameter
);
and it must return a value, which the posted code does not.
The code also has a resource leak as the handle returned from CreateThread() is not being closed. Either CloseHandle() immediately if the thread does not need to be joined with or store the thread handles, in a std::vector for example, to be joined with (using WaitForSingleObject for example) and closed later.
You might get a race condition. You delete your class instance right after CreateThread. At this point threadentry() might not have started executing yet.
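The ownership advice from the answers can be sketched portably with std::thread and std::unique_ptr in place of CreateThread and a raw pointer (my substitution, not what the question uses). The demo struct and function names below are hypothetical stand-ins. The object is moved into the thread, so main can no longer delete it too early.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

struct demo {
    int value = 5;
};

// The thread owns the object and destroys it when it is finished with it.
void threadentry(std::unique_ptr<demo> inst, int* out) {
    assert(inst != nullptr);
    *out = inst->value;
}  // inst is deleted here, inside the thread

int run_with_owned_object() {
    int seen = 0;
    std::unique_ptr<demo> inst(new demo());
    std::thread th(threadentry, std::move(inst), &seen);
    assert(!inst);  // ownership has left this scope
    th.join();      // plays the role of WaitForSingleObject plus CloseHandle
    return seen;
}
```

With the Win32 API the same idea applies: pass the raw pointer through lpParameter and have the thread function delete it, instead of deleting it in main.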