My code is calling a function from a third-party library before the exit of the program. Unfortunately the called function blocks the main thread, which is caused by pthread_join() in the .so library.
Since it is inside the library, which is out of my control, I am wondering how to interrupt it so the main thread can proceed.
Attaching the info from using gdb:
0x00007ffff63cd06d in pthread_join (threadid=140737189869312, thread_return=0x0)
at pthread_join.c:89
89 lll_wait_tid (pd->tid);
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 openssl-1.0.1e-57.el6.x86_64
Thanks in advance.
The library is designed to have the calling thread wait for something to finish. Since you can't change the design of the library, just call the library from a thread that has nothing else to do.
By the way you design the interaction, you can then get whatever semantics you want. If you want the calling thread to get the results at its convenience later, you can use a promise/future. You can design the calling thread to wait a certain amount of time and then timeout. In the timeout case, you can ignore the result if you don't need it or you can design some way to check and get the result later. You can also have the thread that calls the library do whatever needs to be done with the result so that the calling thread doesn't have to worry about it.
Just quarantine the code you can't control and write whatever code around it you need to get the behavior your code needs. The library needs the thread that calls it to wait until it's done, so isolate the thread that calls it and let the library have what it wants.
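As a rough illustration of this quarantine idea, the sketch below runs a blocking call on a detached helper thread and lets the caller wait for the result with a timeout. The name call_with_timeout and its out parameter are made up for this example and are not part of any library; the callable stands in for the third-party function. It assumes a C++11 compiler and linking with the thread library.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: runs blocking_call on a detached thread and waits
// at most `timeout` for its result. Returns true and fills `out` if the
// result arrived in time; returns false and leaves `out` untouched otherwise.
template <typename F, typename R>
bool call_with_timeout(F blocking_call, std::chrono::milliseconds timeout, R& out) {
    std::packaged_task<R()> task(std::move(blocking_call));
    std::future<R> result = task.get_future();

    // The helper thread owns the task; the library can block it for as long
    // as it likes without holding up the caller.
    std::thread(std::move(task)).detach();

    if (result.wait_for(timeout) != std::future_status::ready) {
        return false;  // timed out; the helper thread keeps running on its own
    }
    out = result.get();
    return true;
}
```

On a timeout the helper thread is not stopped; it simply finishes (or keeps blocking) in the background and its result is discarded. Whether that is acceptable depends on what the library call does.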
If you call exit, the process is terminated without shutting down the other threads.
If you have a pthread_t handle for the thread that is being waited on, you can perhaps call pthread_cancel on it, but if the application and libraries are not prepared to handle thread cancellation, it will cause other problems. (Canceling the thread that calls pthread_join will not help, because the shutdown will then block on the same thread that pthread_join waits on.)
In general, it is probably a better idea to figure out why the pthread_join call is waiting indefinitely in your environment (that is, why the other thread is not terminating), and fix that.
Related
I am working on a project where we have used pthread_create to create several child threads.
The thread creation logic is not in my control, as it's implemented by some other part of the project.
Each thread performs some operation which takes more than 30 seconds to complete.
Under normal conditions the program works perfectly fine.
But the problem occurs at the time of termination of the program.
I need to exit from main as quickly as possible when I receive the SIGINT signal.
When I call exit() or return from main, the exit handlers and global objects' destructors are called. I believe these operations race with the running threads, and I believe there are many such race conditions, which makes it hard to solve all of them.
The way I see it there are two solutions.
1. Call _exit() and forget all de-allocation of resources.
2. When SIGINT arrives, close/kill all threads and then call exit() from the main thread, which will release resources.
I think the 1st option will work, but I do not want to abruptly terminate the process.
So I want to know if it is possible to terminate all child threads as quickly as possible so that exit handler & destructor can perform required clean-up task and terminate the program.
I have gone through this post, let me know if you know other ways: POSIX API call to list all the pthreads running in a process
Also, let me know if there is any other solution to this problem
What is it that you need to do before the program quits? If the answer is 'deallocate resources', then you don't need to worry. If you call _exit then the program will exit immediately and the OS will clean up everything for you.
Be aware also that what you can safely do in a signal handler is extremely limited, so attempting to perform any cleanup yourself is not recommended. If you're interested, there's a list of what you can do here. But you can't flush a file to disk, for example (which is about the only thing I can think of that you might legitimately want to do here). That's off limits.
I need to exit from main as quickly as possible when I receive the SIGINT signal.
How is that defined? Because there's no way to "exit quickly as possible" when you receive one signal like that.
You can either set flag(s), post to semaphore(s), or similar to set a state that tells other threads it's time to shut down, or you can kill the entire process.
If you elect to set flag(s) or similar to tell the other threads to shut down, you set those flags and return from your signal handler and hope the threads behave and the process shuts down cleanly.
If you elect to kill threads, there's effectively no difference in killing a thread, killing the process, or calling _exit(). You might as well just keep it simple and call _exit().
That's all you can choose between when you have to make your decision in a single signal handler call. Pick one.
A better solution is to use escalating signals. For example, when you get SIGQUIT or SIGINT, you set flag(s) or otherwise tell threads it's time to clean up and exit the process - or else. Then, say five seconds later whatever is shutting down your process sends SIGTERM and the "or else" happens. When you get SIGTERM, your signal handler simply calls _exit() - those threads had their chance and they messed it up and that's their fault. Or you can call abort() to generate a core file and maybe provide enough evidence to fix the miscreant threads that won't shut down.
And finally, five seconds later the managing process will nuke the process from orbit with SIGKILL just to be sure.
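The escalation described above could be sketched roughly as follows, assuming a POSIX system. The names g_stop_requested, install_shutdown_handlers and stop_requested are invented for this example. The polite handler only writes a volatile sig_atomic_t flag, and the final handler only calls _exit(), both of which are permitted inside a signal handler.

```cpp
#include <cassert>
#include <signal.h>
#include <unistd.h>

// Written from the signal handler, polled by worker threads (hypothetical name).
volatile sig_atomic_t g_stop_requested = 0;

// First stage: SIGINT/SIGQUIT only record that a shutdown was requested.
extern "C" void on_polite_signal(int) { g_stop_requested = 1; }

// Second stage: SIGTERM means the grace period is over, so leave immediately
// without running exit handlers or destructors.
extern "C" void on_final_signal(int) { _exit(1); }

// Installs both stages; returns true if every sigaction() call succeeded.
bool install_shutdown_handlers() {
    struct sigaction polite = {};
    polite.sa_handler = on_polite_signal;
    sigemptyset(&polite.sa_mask);

    struct sigaction last_resort = {};
    last_resort.sa_handler = on_final_signal;
    sigemptyset(&last_resort.sa_mask);

    return sigaction(SIGINT, &polite, nullptr) == 0 &&
           sigaction(SIGQUIT, &polite, nullptr) == 0 &&
           sigaction(SIGTERM, &last_resort, nullptr) == 0;
}

// Threads call this between units of work and wind down once it returns true.
bool stop_requested() { return g_stop_requested != 0; }
```

SIGKILL cannot be caught, so the last step of the escalation needs no code in the process itself.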
I'm new to multi-threading and I need to understand the whole idea of "join": do I need to join every thread in my application, and how does that work with multi-threading?
No, you can detach a thread if you want to leave it alone.
If you start a thread, either you detach it or you join it before the program ends, otherwise this is undefined behaviour.
To know whether a thread needs to be detached, ask yourself this question: "Do I want the thread to run after the program's main function is finished?" Here are some examples:
When you do File/New you create a new thread and you detach it: the thread will be closed when the user closes the document. Here you don't need to join the threads.
When you do a Monte Carlo simulation, some distributed computing, or any divide-and-conquer type algorithm, you launch all the threads and you need to wait for all the results so that you can combine them. Here you explicitly need to join the threads before combining the results.
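For the divide-and-conquer case, a minimal sketch might look like this. The function parallel_sum is a made-up example; it splits the work in two, and the join() is what makes it safe to read the helper's partial result.

```cpp
#include <cassert>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` using one helper thread for the first half and the calling
// thread for the second half (illustrative name, not a library function).
long parallel_sum(const std::vector<int>& data) {
    auto mid = data.begin() + data.size() / 2;

    long first_half = 0;
    std::thread helper([&] {
        first_half = std::accumulate(data.begin(), mid, 0L);
    });

    long second_half = std::accumulate(mid, data.end(), 0L);

    // The partial result may only be read after the helper has finished.
    helper.join();
    return first_half + second_half;
}
```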
Not joining a thread is like not deleting all the memory you new. It can be harmless, or it could be a bad habit.
A thread you have not synchronized with is in an unknown state of execution. If it is a file writing thread, it could be half way through writing a file and then the app finishes. If it is a network communications thread, it could be half way through a handshake.
The downside to joining every thread is if one of them has gotten into a bad state and has blocked, your app can hang.
In general you should try to send a message to your outstanding threads to tell them to exit and clean up. Then you should wait a modest amount of time for them to finish or otherwise respond that they are good to die, and then shut down the app. Now prior to this you should signify your program is no longer open for business -- shut down GUI windows, respond to requests from other processes that you are shutting down, etc -- so if this takes longer than anticipated the user is not bothered. Finally if things go imperfectly -- if threads refuse to respond to your request that they shut down and you give up on them -- then you should log errors as well, so you can fix what may be a symptom of a bigger problem.
The last time a worker thread unexpectedly hung, I initially thought it was a problem with a network outage and a bug in the timeout code. Upon deeper inspection it was because one of the objects in use was deleted prior to the shutdown synchronization: the undefined behaviour that resulted just looked like a hang in my reproduction cases. Had we not carefully joined, that bug would have been harder to track down (now, the right thing to do would have been to use a shared resource that we could not delete: but mistakes happen).
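A sketch of this "ask, wait a while, then give up and log" pattern is shown below. All names (WorkerState, Worker, start_worker, stop_worker) are invented for the example, and the sleep stands in for one unit of real work. The state is held through a shared_ptr so that a worker the caller has given up on can never touch destroyed memory.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <iostream>
#include <memory>
#include <thread>

// State shared between the owner and the worker thread.
struct WorkerState {
    std::atomic<bool> stop_requested{false};
    std::promise<void> finished;  // fulfilled by the worker just before it exits
};

struct Worker {
    std::shared_ptr<WorkerState> state;
    std::future<void> done;
    std::thread thread;
};

Worker start_worker(std::chrono::milliseconds step) {
    Worker w;
    w.state = std::make_shared<WorkerState>();
    w.done = w.state->finished.get_future();
    std::shared_ptr<WorkerState> state = w.state;
    w.thread = std::thread([state, step] {
        do {
            std::this_thread::sleep_for(step);  // stand-in for one unit of work
        } while (!state->stop_requested.load());
        state->finished.set_value();
    });
    return w;
}

// Asks the worker to stop and waits up to `grace`. Returns true if it was
// joined cleanly; otherwise logs the problem, detaches it, and returns false.
bool stop_worker(Worker& w, std::chrono::milliseconds grace) {
    w.state->stop_requested.store(true);
    if (w.done.wait_for(grace) == std::future_status::ready) {
        w.thread.join();
        return true;
    }
    std::cerr << "worker ignored the shutdown request for "
              << grace.count() << " ms; giving up\n";
    w.thread.detach();
    return false;
}
```

The grace period is a policy decision; the point of the log line is that a worker that misses it is treated as a bug to investigate rather than silently ignored.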
The pthread_join() function suspends execution of the calling thread
until the target thread terminates, unless the target thread has
already terminated. On return from a successful pthread_join() call
with a non-NULL value_ptr argument, the value passed to pthread_exit()
by the terminating thread is made available in the location referenced
by value_ptr. When a pthread_join() returns successfully, the target
thread has been terminated. The results of multiple simultaneous calls
to pthread_join() specifying the same target thread are undefined. If
the thread calling pthread_join() is canceled, then the target thread
will not be detached.
So pthread_join does two things:
1. Wait for the thread to finish.
2. Clean up any resources associated with the thread.
This means that if you exit the process without calling pthread_join, then (2) will be done for you by the OS (although it won't run thread cancellation cleanup handlers), and (1) will not be done.
So whether you need to call pthread_join depends whether you need (1) to happen.
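To make the two effects concrete, here is a small sketch of a joined thread whose return value is collected through the second argument of pthread_join. The functions square and run_and_join are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

// Thread function: squares the integer smuggled through the void* argument
// and hands the result back as the thread's return value.
extern "C" void* square(void* arg) {
    std::intptr_t n = reinterpret_cast<std::intptr_t>(arg);
    return reinterpret_cast<void*>(n * n);
}

// Starts the thread, waits for it, and returns its result (-1 on error).
long run_and_join(long n) {
    pthread_t tid;
    void* arg = reinterpret_cast<void*>(static_cast<std::intptr_t>(n));
    if (pthread_create(&tid, nullptr, square, arg) != 0) return -1;

    void* result = nullptr;
    // Blocks until the thread has terminated, stores its return value in
    // `result`, and releases the thread's resources.
    if (pthread_join(tid, &result) != 0) return -1;
    return static_cast<long>(reinterpret_cast<std::intptr_t>(result));
}
```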
Detached thread
If you don't need to wait for the thread to finish, then you may as well pthread_detach it. A detached thread cannot be joined (so you can't wait on its completion), but its resources are freed automatically when it completes.
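A minimal sketch of detaching, assuming POSIX threads: since a detached thread cannot be joined, the example signals completion through an atomic flag instead. The names g_ran, mark_ran, start_detached and wait_until_ran are invented for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <pthread.h>
#include <thread>

std::atomic<bool> g_ran{false};

extern "C" void* mark_ran(void*) {
    g_ran.store(true);
    return nullptr;  // nobody can collect this value from a detached thread
}

// Creates a thread and detaches it right away; returns true on success.
bool start_detached() {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, mark_ran, nullptr) != 0) return false;
    return pthread_detach(tid) == 0;
}

// pthread_join is not available here, so completion is observed by polling
// the flag until it is set or the timeout expires.
bool wait_until_ran(std::chrono::milliseconds timeout) {
    auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!g_ran.load() && std::chrono::steady_clock::now() < deadline) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return g_ran.load();
}
```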
do I need to join every thread in my application ?
Not necessarily - depends on your design and OS. Join() is actively hazardous in GUI apps. If you don't need to know, or don't care, whether one thread has terminated from another thread, you don't need to join it.
I try very hard to not join/WaitFor any threads at all. Pool threads, app-lifetime threads and the like often do not require any explicit termination - depends on OS and whether the thread/s own, or are explicitly bound to, any resources that need explicit termination/close/whatever.
Threads can be either joinable or detached. Detached threads should not be joined. On the other hand, if you don't join a joinable thread, your app leaks some memory and some thread structures. A C++11 std::thread calls std::terminate if the thread object goes out of scope while still joinable, that is, without .join() or .detach() having been called. See pthread_detach and pthread_create. This is much like processes: when a child exits, it stays a zombie until its creator calls waitpid. The reason for this behavior is that the creator of a thread or process might want to know its exit code.
Update: if pthread_create is called with the attribute argument equal to NULL (default attributes are used), a joinable thread is created. To create a detached thread, you can use attributes:
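pthread_attr_t attrs;
pthread_attr_init(&attrs);
pthread_attr_setdetachstate(&attrs, PTHREAD_CREATE_DETACHED);
/* thread is a pthread_t*; pthread_create takes a pointer to the attributes */
pthread_create(thread, &attrs, callback, arg);
pthread_attr_destroy(&attrs); /* the attribute object is no longer needed */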
You can also detach an already created thread by calling pthread_detach on it. If you try to join a detached thread, pthread_join returns the EINVAL error code. glibc has a non-portable extension, pthread_getattr_np, that retrieves the attributes of a running thread, so you can check whether a thread is detached with pthread_attr_getdetachstate.
Question 1:
I read that when you call join after creating a thread, it blocks the calling thread until the thread function has returned. I'm trying to build a multi-client server which can accept clients and create a thread for each one. The problem is that after the first client connects, its thread is created, and join is called, the listening thread hangs until that thread is done. What can I do to make this thread run without blocking the calling thread? (In C# I would just call Start() and the calling thread would keep running as usual.)
Question 2:
In general (Im probably missing something), why would someone want a blocking thread? What's the point of that? Wouldn't it be easier and faster to just call a regular function?
If someone could explain to me how to achieve the same thing as the threads in C#, it would be great!
Thanks in Advance! Sorry for my bad english.
What can I do to make this thread run without blocking the calling thread
You can create the thread and then invoke detach() on it, so that the destructor of the thread object won't throw an exception if the thread has not terminated yet. I would honestly advise to think twice before adopting this kind of fire-and-forget design. In C++11, you may want to call std::async instead (and in that case you may want to take a look at this Q&A, where a workaround is proposed for a current drawback of that function).
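As a small illustration of the std::async route, the sketch below starts a task on its own thread and returns a future for the result. The functions handle_client and serve_async are hypothetical stand-ins for the per-client work in the question.

```cpp
#include <cassert>
#include <future>
#include <string>

// Stand-in for whatever work is done for one client.
std::string handle_client(int client_id) {
    return "handled client " + std::to_string(client_id);
}

// Starts the work on a separate thread and returns immediately; the caller
// collects the result later with get().
std::future<std::string> serve_async(int client_id) {
    return std::async(std::launch::async, handle_client, client_id);
}
```

The caller has to keep the returned future alive: a future obtained from std::async waits for the task in its destructor, so discarding it immediately makes the call effectively synchronous.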
In general (Im probably missing something), why would someone want a blocking thread? What's the point of that? Wouldn't it be easier and faster to just call a regular function?
Well, if your program has absolutely nothing else to do than waiting for the task to be completed, then yes - I would say, just use a synchronous call. But it might be the case that your program wants to do something in parallel, and once it is done it may need to wait for the end of the asynchronous computation in order to continue. In that case, it would need to join with the thread.
Don't call join(). You join a thread only when you want to make sure that the thread has finished execution (for instance, when you destroy your connection manager class that owns the threads, you want to make sure that the threads have finished execution).
See answer one on when to call join().
So I have some main function. 24 times a second it opens a boost thread A with a function. That function takes in a buffer with data. It starts up a boost timer. It opens another thread B with a function and sends the buffer into it. I need thread A to kill thread B if it is executing for way too long. Of course, if thread B has finished in time I do not need to kill it; it should end by itself. What boost function can help me kill the created thread (not join, but stop/kill or something like that)?
BTW I cannot affect the speed of the function I am executing in thread B; that's why I need to be capable of killing it when needed.
There's no clean way to kill a thread, so if you need to do something like this, your clean choices are to either use a function that includes some cancellation capability, or use a separate process for it, since you can kill a process cleanly.
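The separate-process option could be sketched like this on a POSIX system. The helper run_in_child_with_deadline is a made-up name; it forks, runs the work in the child, and kills the child with SIGKILL if it has not exited by the deadline. Getting a result back from the child (for example through a pipe) is left out.

```cpp
#include <cassert>
#include <chrono>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>

// Runs `work` in a child process. Returns true if the child exited normally
// with status 0 before the deadline; returns false if it had to be killed,
// exited abnormally, or could not be started.
bool run_in_child_with_deadline(void (*work)(), std::chrono::milliseconds deadline) {
    pid_t pid = fork();
    if (pid < 0) return false;  // fork failed
    if (pid == 0) {
        work();
        _exit(0);  // leave the child without running the parent's exit handlers
    }

    auto give_up_at = std::chrono::steady_clock::now() + deadline;
    int status = 0;
    while (std::chrono::steady_clock::now() < give_up_at) {
        pid_t r = waitpid(pid, &status, WNOHANG);  // poll without blocking
        if (r == pid) return WIFEXITED(status) && WEXITSTATUS(status) == 0;
        if (r < 0) return false;
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
    }

    kill(pid, SIGKILL);        // the deadline passed: stop the child
    waitpid(pid, &status, 0);  // reap it so it does not linger as a zombie
    return false;
}
```

Forking from a process that already has several threads has its own pitfalls, so this fits best when the child is started early or immediately executes a separate program.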
Other than that, my immediate reaction is that instead of "opening" (do you mean creating?) thread A 24 times a second, you'd be better off with thread A reading a buffer, sending it on to thread B, then sleeping until it's ready to read another buffer. Creating and killing threads isn't terribly expensive, but doing it at a rate of 24 (or, apparently, 48) a second strikes me as a bit excessive.
The term you are looking for is "cancellation", as in pthread_cancel(3). Cancellation is troublesome, because the cancelled thread might not execute C++ destructors or release locks on the way out ... but then again it might; the uncertainty is actually worse than a definitive no.
Because of this, boost threads do not support cancellation (see for instance this older question) but they do support interruption, which you might be able to bend to fit. Interruption works by way of a regular C++ exception so it has predictable semantics.
Please don't kill threads at random unless you completely control their execution (and in that case, just signal the threads to exit gracefully). You never know whether the other thread is inside a critical section of some library you have never heard of; if it is killed there, the critical section is never exited and your program will end up stalling on it.
Like in the above graph, all other threads will automatically exit once the main thread is dead.
Is it possible to create a thread that never dies?
You can, but you probably shouldn't; it will just end up confusing people. Here is a good explanation of how this works with Win32 and the CRT.
You can end the main() function's thread without returning from main() by calling ExitThread() on it. This will end your main thread, but the CRT shutdown code that comes after main() will not be executed, and thus, ExitProcess() will not be called, and all your other threads will continue to live on.
Although in this case, you must take care of ending all the other threads correctly. The process will not terminate while there is at least one thread that is not "background".
If main() is careful not to call ExitProcess() (or whatever it's called that happens when main returns) until all threads have terminated, that is easily done. Just don't exit main until it's done.
Not really. The CRT startup code calls main(), then calls exit(). That terminates the program, regardless of any other threads.
You would have to prevent main() from returning. Normally done with WaitForSingleObject() on the thread handle.
In this specific case, if you see the threads still running when you trace through main's return then you forgot to release/close the Win32 resource you are using.
It looks like it might not be possible, as you can see from the other answers. The question is: why would you want to do this? By doing this you are going against the intended design of the OS.
If you look at this : http://msdn.microsoft.com/en-us/library/ms684841(VS.85).aspx
You will see that a thread is meant to execute within the context of a process, and that in turn a fiber is intended to operate within the context of a thread.
By violating these premises you will potentially end up having issues with future operating system upgrades and your code will be brittle.
Why do you not spawn another process and keep it in the background? That way you can terminate your original process as desired, and your users will still be able to terminate the spawned process if they wish.