I've got a setup a bit like this:
void* work(void*) { while (true) {/*do work*/} return 0;}
class WorkDoer
{
private:
pthread_t id;
public:
WorkDoer() { pthread_create(&id, NULL, work, (void*)this); }
void Shutdown() { pthread_join(id, NULL); /*other cleanup*/ }
};
There's some cases where Shutdown() is called from the main thread, and some other cases where I want to call shutdown from within the thread itself (returning from that thread right after).
The documentation for pthread_join() says that it may fail with EDEADLK if the calling thread is the same as the one passed.
My question is: Is that an okay thing to do, and if so is it safe? (thus ignoring the join fail, because I'll be nicely ending the thread in a moment anyways?) Or, is it something that should be avoided?
You can certainly call pthread_join() from the running thread itself, and as you have found out the call itself will handle it properly giving you an error code. However, there are a few problems:
It doesn't make any sense because the join won't join anything. It will merely tell you that you are doing something wrong.
The thread itself won't exit upon calling pthread_join() on itself.
Even if the thread exits, its state won't be cleaned up properly. Some other thread (e.g. your application's main thread) should call pthread_join() unless the thread was created as “detached”.
So in other words this approach is as acceptable as a bug in your program.
As a solution I would recommend revisiting the design and making sure that Shutdown() is called from the right place and at the right time. After all, the name “shutdown” doesn't make a lot of sense here because it doesn't shut down a thing. All it does is wait for a thread to finish and clean up its state after that happens.
When you want to end the worker thread, either return from the thread routine or call pthread_exit(). Then make sure that whoever started a thread cleans things up by calling pthread_join().
If you want to force the thread to stop, consider using pthread_kill() to signal the thread or, alternatively, implement some sort of message passing that you can use to "tell" the thread to stop doing whatever it is doing.
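To make the "tell the thread to stop, then let the owner join it" idea concrete, here is a minimal sketch. It is not the asker's real class: RequestStop(), Join() and the iteration counter are names invented for illustration, and it assumes a C++11 compiler for std::atomic. The point is only that requesting a stop is safe from any thread (including the worker), while pthread_join() stays with the thread that created the worker.

```cpp
#include <pthread.h>
#include <atomic>

// Hypothetical rework of WorkDoer: stopping and joining are separate steps.
class WorkDoer {
public:
    WorkDoer() : stop_(false), iterations_(0) {
        pthread_create(&id_, NULL, &WorkDoer::Run, this);
    }

    // Safe to call from any thread, including the worker itself.
    void RequestStop() { stop_.store(true); }

    // Must be called by the owning thread, never by the worker.
    // Returns 0 on success, or the pthread_join() error code.
    int Join() { return pthread_join(id_, NULL); }

    long Iterations() const { return iterations_.load(); }

private:
    static void* Run(void* self) {
        WorkDoer* w = static_cast<WorkDoer*>(self);
        while (!w->stop_.load()) {
            /* do one bounded unit of work */
            w->iterations_.fetch_add(1);
        }
        return NULL;  // returning ends the thread; the owner then joins it
    }

    pthread_t id_;
    std::atomic<bool> stop_;
    std::atomic<long> iterations_;
};
```

With this split, the worker can call RequestStop() on itself and simply fall out of its loop, and the EDEADLK case never comes up because the worker never joins itself.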
The pthread_join() function may fail if:
EDEADLK
A deadlock was detected or the value of thread specifies the
calling thread.
I would say use at your own risk.
Why not just let the thread call pthread_detach(pthread_self()); and then exit? Then there's no need to call pthread_join() anymore and risk having it fail.
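A rough sketch of the self-detach approach, in case it helps. The function names and the g_finished flag are made up for the example (the flag exists only so a caller can observe that the thread got to the end). Keep in mind that once a thread has detached itself, no other thread may pthread_join() it, so a Shutdown() that joins from the main thread would have to go.

```cpp
#include <pthread.h>
#include <atomic>

// Hypothetical flag, present only so callers can observe the exit.
std::atomic<bool> g_finished(false);

void* self_detaching_worker(void*) {
    // After this call nobody may pthread_join() this thread; its resources
    // are released automatically when it terminates.
    pthread_detach(pthread_self());
    /* do work, then decide to stop */
    g_finished.store(true);
    return NULL;
}

// Starts the worker; returns 0 on success or the pthread_create() error code.
int start_self_detaching_worker() {
    pthread_t id;
    return pthread_create(&id, NULL, self_detaching_worker, NULL);
}
```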
Related
I have some code, roughly:
pthread_create(thread_timeout, NULL, handleTimeOut, NULL);
void handleTimeOut()
{
/*...*/
pthread_cancel(thread_timeout);
/*...*/
}
But as I noticed in the pthread manual, cancellation must be requested from another thread. I have tried to use the pthread_exit() function instead, but the thread hangs again...
How should thread termination be handled correctly? Will the thread terminate successfully if the function handleTimeOut() just ends without any special pthread functions?
Killing a thread without its cooperation is a recipe for problems. The right solution will be one that allows an external thread to request the thread to clean up and terminate, and has the thread periodically examine this state and, once termination has been requested, follow through with the request. Such a request can be made through anything that all threads can share.
If a thread wants to finish, it can either call pthread_exit() or it can return from the initial thread function. These are equivalent.
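A small sketch of the two ways of finishing, showing that pthread_join() sees the same kind of result either way. The function names and the two global result objects are invented for the example. One hedge: in C++ code, returning is usually the less surprising choice, because whether destructors of local objects run on pthread_exit() depends on the implementation.

```cpp
#include <pthread.h>

// Hypothetical result objects that outlive the threads.
static int g_result_a = 1;
static int g_result_b = 2;

void* finish_by_return(void*) {
    return &g_result_a;  // value handed to pthread_join()
}

void* finish_by_exit(void*) {
    pthread_exit(&g_result_b);  // same effect as returning &g_result_b
    return NULL;                // not reached
}

// Runs fn on a new thread and returns the int it handed back, or -1 on error.
int run_and_collect(void* (*fn)(void*)) {
    pthread_t id;
    void* result = NULL;
    if (pthread_create(&id, NULL, fn, NULL) != 0) return -1;
    if (pthread_join(id, &result) != 0) return -1;
    return *static_cast<int*>(result);
}
```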
I don't see any reason why a thread couldn't call pthread_cancel() on itself, but this would be highly unusual.
pthread_create(&thread, NULL, AcceptLoop, (void *)this);
I have declared it like this, and inside the AcceptLoop function I have an infinite while loop. I'd like to close this thread when the server is closed. I have read about pthread_cancel and pthread_join but I am not sure which one is better and safer. I would like to hear some detailed instructions or tutorials. Thanks in advance.
You don't need to do anything, just returning from the thread function will end the thread cleanly. You can alternatively call pthread_exit() but I'd rather return.
pthread_cancel() is scary and complicated/hard to get right. Stay clear if possible.
pthread_join() is mostly needed if you want to know when thread finishes and are interested in the return value.
Ooops, I'm wrong. It's been some time. In order for what I said to be true, you must detach from your thread. Otherwise you'll need to call pthread_join:
Either pthread_join(3) or pthread_detach() should be called for each thread that an application creates, so that system resources for the thread can be released. (But note that the resources of all threads are freed when the process terminates.)
http://www.kernel.org/doc/man-pages/online/pages/man3/pthread_detach.3.html
I believe you would like to exit the worker thread by signalling from the main thread.
Inside AcceptLoop, instead of looping infinitely, you loop on a condition that the main thread can set. You will have to use some synchronization for this variable. Once the variable is set from the main thread, the worker thread running AcceptLoop breaks out of the loop and can then call pthread_exit.
If you would like your main thread to wait for the child thread to exit, you can use pthread_join to do so.
In general, a child thread can end in three ways:
It calls pthread_exit.
It is cancelled with pthread_cancel (normally by another thread).
The thread function returns.
In my code, I use QueueUserAPC to interrupt the main thread from its current work in order to invoke some callback first before going back to its previous work.
std::string buffer;
std::tr1::shared_ptr<void> hMainThread;
VOID CALLBACK myCallback (ULONG_PTR dwParam) {
FILE * f = fopen("somefile", "a");
fprintf(f, "CALLBACK WAS INVOKED!\n");
fclose(f);
}
void AdditionalThread () {
// download some file using synchronous wininet and store the
// HTTP response in buffer
QueueUserAPC(myCallback, hMainThread.get(), (ULONG_PTR)0);
}
void storeHandle () {
HANDLE hUnsafe;
DuplicateHandle(GetCurrentProcess(), GetCurrentThread(),
GetCurrentProcess(), &hUnsafe, 0, FALSE, DUPLICATE_SAME_ACCESS);
hMainThread.reset(hUnsafe, CloseHandle);
}
void startSecondThread () {
CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)AdditionalThread, 0, 0, NULL);
}
storeHandle and startSecondThread are exposed to a Lua interpreter which is running in the main thread along with other things. What I do now, is
invoke storeHandle from my Lua interpreter. DuplicateHandle returns a non-zero value and therefore succeeds.
invoke startSecondThread from my Lua interpreter. The additional thread gets started properly, and QueueUserAPC returns a nonzero value, stating, that all went well.
as far as I understood QueueUserAPC, myCallback should now get called from the main thread. However, it doesn't.
If QueueUserAPC is the correct way to accomplish my goal (==> see my other question):
How can I get this working?
If I should use some other method to interrupt the main thread:
What other method should I use? (Note that I don't want to use a pull-ing method in the main thread for this, like WaitForSingleObject or polling. I want the additional thread to push its data straight into the main thread, as soon as possible.)
Yeah, QueueUserAPC is not the solution here. Its callback will only run when the thread blocks and the programmer has explicitly allowed the wait to be alertable. That's unlikely.
I hesitate to post the solution because it is going to get you into enormous trouble. You can implement a thread interrupt with SuspendThread(), GetThreadContext(), SetThreadContext() and ResumeThread(). The key is to save the CONTEXT.Eip value on the thread's call stack and replace it with the address of the interrupt function.
The reason you cannot make this work is because you'll have horrible re-entrancy problems. There is no way you can guess at which point of execution you'll interrupt the thread. It may well be right in the middle of it mutating state, the state that you need so badly that you are contemplating doing this. There is no way to not fall into this trap, you can't block it with a mutex or whatnot. It is also extremely hard to diagnose because it will work so well for so long, then randomly fail when the interrupt timing just happens to be unlucky.
A thread must be in a well-known state before it can safely run injected code. The traditional one has been mentioned many times before: when a thread is pumping a message loop it is implicitly idle and not doing anything dangerous. QueueUserAPC takes the same approach: a thread explicitly signals the operating system that it is in a state where the callback can be safely executed, both by blocking (not executing dangerous code) and by setting the bAlertable flag.
A thread has to explicitly signal that it is in a safe state. There is no safe push model, only pull.
From what I can understand in MSDN, the callback is not invoked until the thread enters an alertable state, and this is done by calling SleepEx, SignalObjectAndWait, WaitForSingleObjectEx, WaitForMultipleObjectsEx, or MsgWaitForMultipleObjectsEx.
So if you really don't want to do some polling, I don't think this method is adapted to your case.
Is it possible to implement a "message pump" (or rather an event listener) in your main thread and to delegate all its current work to another thread? In this case, the main thread waits for any events that are set by the other threads.
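For what it's worth, the message-pump idea can be sketched portably with the C++11 standard library instead of Win32 calls. Everything here (EventQueue, Post, RunOne, pump_demo) is a made-up name for illustration, not an existing API. Worker threads post callbacks, and the main thread runs them only when it asks for the next event, which is the point where it knows it is in a safe state.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical event queue: worker threads post callbacks, the main thread
// runs them only when it calls RunOne(), i.e. at a point it knows is safe.
class EventQueue {
public:
    void Post(std::function<void()> fn) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            events_.push(std::move(fn));
        }
        ready_.notify_one();
    }

    // Blocks until an event is available, then runs it on the calling thread.
    void RunOne() {
        std::function<void()> fn;
        {
            std::unique_lock<std::mutex> lock(mutex_);
            ready_.wait(lock, [this] { return !events_.empty(); });
            fn = std::move(events_.front());
            events_.pop();
        }
        fn();  // run outside the lock so the callback may Post() again
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()> > events_;
};

// Demo: a worker posts one callback; the caller's thread executes it.
// Returns the value the callback stored.
int pump_demo() {
    EventQueue queue;
    int seen = 0;
    std::thread worker([&queue, &seen] {
        queue.Post([&seen] { seen = 42; });
    });
    queue.RunOne();
    worker.join();
    return seen;
}
```

This is still a pull model in the sense described above: the callback runs on the main thread, but only because the main thread chose to block in RunOne().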
How can I wait for a detached thread to finish in C++?
I don't care about an exit status, I just want to know whether or not the thread has finished.
I'm trying to provide a synchronous wrapper around an asynchronous thirdparty tool. The problem is a weird race condition crash involving a callback. The progression is:
I call the thirdparty, and register a callback
when the thirdparty finishes, it notifies me using the callback -- in a detached thread I have no real control over.
I want the thread from (1) to wait until (2) is called.
I want to wrap this in a mechanism that provides a blocking call. So far, I have:
class Wait {
public:
void callback() {
pthread_mutex_lock(&m_mutex);
m_done = true;
pthread_cond_broadcast(&m_cond);
pthread_mutex_unlock(&m_mutex);
}
void wait() {
pthread_mutex_lock(&m_mutex);
while (!m_done) {
pthread_cond_wait(&m_cond, &m_mutex);
}
pthread_mutex_unlock(&m_mutex);
}
private:
pthread_mutex_t m_mutex;
pthread_cond_t m_cond;
bool m_done;
};
// elsewhere...
Wait waiter;
thirdparty_utility(&waiter);
waiter.wait();
As far as I can tell, this should work, and it usually does, but sometimes it crashes. As far as I can determine from the corefile, my guess as to the problem is this:
When the callback broadcasts the end of m_done, the wait thread wakes up
The wait thread is now done here, and Wait is destroyed. All of Wait's members are destroyed, including the mutex and cond.
The callback thread tries to continue from the broadcast point, but is now using memory that's been released, which results in memory corruption.
When the callback thread tries to return (above the level of my poor callback method), the program crashes (usually with a SIGSEGV, but I've seen SIGILL a couple of times).
I've tried a lot of different mechanisms to try to fix this, but none of them solve the problem. I still see occasional crashes.
EDIT: More details:
This is part of a massively multithreaded application, so creating a static Wait isn't practical.
I ran a test, creating Wait on the heap, and deliberately leaking the memory (i.e. the Wait objects are never deallocated), and that resulted in no crashes. So I'm sure it's a problem of Wait being deallocated too soon.
I've also tried a test with a sleep(5) after the unlock in wait, and that also produced no crashes. I hate to rely on a kludge like that though.
EDIT: ThirdParty details:
I didn't think this was relevant at first, but the more I think about it, the more I think it's the real problem:
The thirdparty stuff I mentioned, and why I have no control over the thread: this is using CORBA.
So, it's possible that CORBA is holding onto a reference to my object longer than intended.
Yes, I believe that what you're describing is happening (race condition on deallocate). One quick way to fix this is to create a static instance of Wait, one that won't get destroyed. This will work as long as you don't need to have more than one waiter at the same time.
You will also permanently use that memory, it will not deallocate. But it doesn't look like that's too bad.
The main issue is that it's hard to coordinate lifetimes of your thread communication constructs between threads: you will always need at least one leftover communication construct to communicate when it is safe to destroy (at least in languages without garbage collection, like C++).
EDIT:
See comments for some ideas about refcounting with a global mutex.
To the best of my knowledge there's no portable way to directly ask a thread if it's done running (i.e. no pthread_ function). What you are doing is the right way to do it, at least as far as having a condition that you signal. If you are seeing crashes that you are sure are due to the Wait object being deallocated when the thread that creates it quits (and not some other subtle locking issue, which is all too common), then you need to make sure the Wait isn't deallocated too early by managing it from a thread other than the one that does the notification. Put it in global memory, or dynamically allocate it and share it with that thread. Most simply, don't have the thread being waited on own the memory for the Wait; have the thread doing the waiting own it.
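One way to "dynamically allocate it and share it" is reference counting, sketched below with std::shared_ptr (C++11). This is only an illustration under a big assumption: that the callback side can be handed its own reference and will release it after the callback has completely returned. With a CORBA runtime in the middle that may not be possible as written. SharedWait, callback_thread and blocking_call are invented names, and callback_thread merely stands in for the third-party thread.

```cpp
#include <pthread.h>
#include <memory>

// Same idea as the Wait class in the question, with init/destroy added.
class SharedWait {
public:
    SharedWait() : done_(false) {
        pthread_mutex_init(&mutex_, NULL);
        pthread_cond_init(&cond_, NULL);
    }
    ~SharedWait() {
        pthread_cond_destroy(&cond_);
        pthread_mutex_destroy(&mutex_);
    }
    void Notify() {
        pthread_mutex_lock(&mutex_);
        done_ = true;
        pthread_cond_broadcast(&cond_);
        pthread_mutex_unlock(&mutex_);
    }
    void WaitUntilDone() {
        pthread_mutex_lock(&mutex_);
        while (!done_) pthread_cond_wait(&cond_, &mutex_);
        pthread_mutex_unlock(&mutex_);
    }
private:
    SharedWait(const SharedWait&);             // non-copyable
    SharedWait& operator=(const SharedWait&);
    pthread_mutex_t mutex_;
    pthread_cond_t cond_;
    bool done_;
};

// Stand-in for the third-party callback thread (hypothetical). It owns one
// reference, passed as a heap-allocated shared_ptr, and drops it when finished.
void* callback_thread(void* arg) {
    std::unique_ptr<std::shared_ptr<SharedWait> > ref(
        static_cast<std::shared_ptr<SharedWait>*>(arg));
    (*ref)->Notify();
    return NULL;  // ref is destroyed here, after Notify() has fully returned
}

// Returns true once the callback has fired. The SharedWait is freed by
// whichever thread releases its reference last.
bool blocking_call() {
    std::shared_ptr<SharedWait> waiter = std::make_shared<SharedWait>();
    std::shared_ptr<SharedWait>* for_callback =
        new std::shared_ptr<SharedWait>(waiter);
    pthread_t id;
    if (pthread_create(&id, NULL, callback_thread, for_callback) != 0) {
        delete for_callback;
        return false;
    }
    pthread_detach(id);
    waiter->WaitUntilDone();
    return true;
}
```

Because the mutex and condition variable are only destroyed when the last reference goes away, the waiting thread returning early can no longer pull them out from under the callback thread.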
Are you initializing and destroying the mutex and condition var properly?
Wait::Wait()
{
pthread_mutex_init(&m_mutex, NULL);
pthread_cond_init(&m_cond, NULL);
m_done = false;
}
Wait::~Wait()
{
assert(m_done);
pthread_mutex_destroy(&m_mutex);
pthread_cond_destroy(&m_cond);
}
Make sure that you aren't prematurely destroying the Wait object -- if it gets destroyed in one thread while the other thread still needs it, you'll get a race condition that will likely result in a segfault. I'd recommend making it a global static variable that gets constructed on program initialization (before main()) and gets destroyed on program exit.
If your assumption is correct then third party module appears to be buggy and you need to come up with some kind of hack to make your application work.
A static Wait is not feasible. How about a Wait pool (it may even grow on demand)? Is your application using a thread pool to run?
There will still be a chance that the same Wait gets reused while the third-party module is still using it, but you can minimize that chance by properly queuing vacant Waits in your pool.
Disclaimer: I am in no way an expert in thread safety, so consider this post as a suggestion from a layman.
I am working on a multithreaded program using C++ and Boost. I am using a helper thread to eagerly initialize a resource asynchronously. If I detach the thread and all references to the thread go out of scope, have I leaked any resources? Or does the thread clean up after itself (i.e. its stack and any other system resources needed for the thread itself)?
From what I can see in the docs (and what I recall from pthreads 8 years ago), there's no explicit "destroy thread" call that needs to be made.
I would like the thread to execute asynchronously and when it comes time to use the resource, I will check if an error has occurred. The rough bit of code would look something like:
//Assume this won't get called frequently enough that next_resource won't get promoted
//before the thread finishes.
PromoteResource() {
current_resource_ptr = next_resource_ptr;
next_resource_ptr.reset(new Resource());
// Assumes Initialize is a non-static member function of Resource.
// boost::thread starts running as soon as it is constructed; it has no start() member.
boost::thread t(boost::bind(&Resource::Initialize, next_resource_ptr));
}
Of course--I understand that normal memory-handling problems still exist (forget to delete, bad exception handling, etc)... I just need confirmation that the thread itself isn't a "leak".
Edit: A point of clarification, I want to make sure this isn't technically a leak:
void Run() {
sleep(10 seconds);
}
void DoSomething(...) {
thread t(Run);
t.run();
} //thread detaches, will clean itself up--the thread itself isn't a 'leak'?
I'm fairly certain everything is cleaned up after 10 seconds-ish, but I want to be absolutely certain.
The thread's stack gets cleaned up when it exits, but not anything else. This means that anything it allocated on the heap or anywhere else (in pre-existing data structures, for example) will get left when it quits.
Additionally any OS-level objects (file handle, socket etc) will be left lying around (unless you're using a wrapper object which closes them in its destructor).
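As an illustration of the "wrapper object which closes them in its destructor" remark, here is a minimal sketch. ScopedFile, thread_body and the g_open_files counter are invented for the example (the counter exists only to make the cleanup observable), and a temporary file from std::tmpfile() stands in for whatever OS-level object the thread really uses.

```cpp
#include <pthread.h>
#include <atomic>
#include <cstdio>

// Hypothetical counter, only to make the cleanup observable.
std::atomic<int> g_open_files(0);

// Hypothetical wrapper: closes its FILE* when it goes out of scope.
class ScopedFile {
public:
    ScopedFile() : file_(std::tmpfile()) {
        if (file_) g_open_files.fetch_add(1);
    }
    ~ScopedFile() {
        if (file_) {
            std::fclose(file_);
            g_open_files.fetch_sub(1);
        }
    }
    FILE* get() const { return file_; }
private:
    ScopedFile(const ScopedFile&);             // non-copyable
    ScopedFile& operator=(const ScopedFile&);
    FILE* file_;
};

void* thread_body(void*) {
    ScopedFile scratch;
    if (scratch.get()) std::fputs("scratch data\n", scratch.get());
    return NULL;  // scratch's destructor closes the file before the thread ends
}

// Runs the thread to completion and reports how many files are still open.
int open_files_after_thread() {
    pthread_t id;
    if (pthread_create(&id, NULL, thread_body, NULL) != 0) return -1;
    pthread_join(id, NULL);
    return g_open_files.load();
}
```

The destructor runs because the thread function returns normally; a thread that is killed from outside would not get that chance.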
But programs which frequently create / destroy threads should probably mostly free everything that they allocate in the same thread as it's the only way of keeping the programmer sane.
If I'm not mistaken, on Windows XP all resources used by a process will be released when the process terminates, but that isn't true for threads.
Yes, the resources are automatically released upon thread termination. This is a perfectly normal and acceptable thing to do to have a background thread.
To clean up after a thread you must either join it, or detach it (in which case you can no longer join it).
Here's a quote from the boost thread docs that somewhat explains that (but not exactly).
When the boost::thread object that represents a thread of execution is destroyed the thread becomes detached. Once a thread is detached, it will continue executing until the invocation of the function or callable object supplied on construction has completed, or the program is terminated. A thread can also be detached by explicitly invoking the detach() member function on the boost::thread object. In this case, the boost::thread object ceases to represent the now-detached thread, and instead represents Not-a-Thread.
In order to wait for a thread of execution to finish, the join() or timed_join() member functions of the boost::thread object must be used. join() will block the calling thread until the thread represented by the boost::thread object has completed. If the thread of execution represented by the boost::thread object has already completed, or the boost::thread object represents Not-a-Thread, then join() returns immediately. timed_join() is similar, except that a call to timed_join() will also return if the thread being waited for does not complete when the specified time has elapsed.
In Win32, as soon as the thread's main function, called ThreadProc in the documentation, finishes, the thread is cleaned up. Any resources allocated by you inside the ThreadProc you'll need to clean up explicitly, of course.