How will pthread_cancel() respond when the cancellation request is queued? - c++

This is a basic question, but the answer seems to be eluding me. In any case, here's the background information:
According to the man pages, pthread_cancel()'s return values are as follows:
On success, pthread_cancel() returns 0; on error, it returns a
nonzero error number.
Depending on the cancellation state of the thread to be cancelled, it may terminate immediately or the request may be queued. In my case, the cancellation will be deferred and will run through a few clean up handlers. Within my main thread, I want to validate the return value. That is, a simple approach would be to add a line such as
assert(pthread_cancel(tID) == 0);
From what I can tell, it seems like pthread_cancel() simply returns 0 if the request was successfully queued and not if the thread was cancelled. In other words, will the above line of code be non-blocking? My concern is that if I misinterpreted the man pages, and I have a particularly long deferment period in the child thread, my main thread will be stuck on the assertion because pthread_cancel() is blocking.

pthread_cancel() never blocks. It also does not tell you whether the thread was successfully cancelled.
In deferred mode: It just sets a flag (a cancel request) which the thread in question has to actively query (potentially implicitly, inside system calls that are cancellation points), and then the thread exits cooperatively.
In asynchronous mode: The thread can be cancelled at any point in time (usually immediately). This is only safe in pure CPU-bound loops that do not call any system or library function and do not allocate memory, directly or indirectly. Note that this is also not safe if the thread synchronizes with other threads using mutexes or related thread primitives, as cancelling the thread may leave any mutex it holds locked or in an inconsistent state.
In short: Cancelling asynchronously is generally unsafe, unclean by design and should generally be avoided. There are only very few use cases where this can be used in a clean way (for example 100% CPU bound code communicating with other threads only through acquire/release semantics (lock free)).

From my man page:
The cancellation processing in the target thread runs asynchronously with respect to the calling thread returning from pthread_cancel().
So yes, your assert will not block.
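The behaviour described in these answers can be checked with a small experiment. The following is a minimal sketch, assuming a Linux/glibc-style pthreads implementation; the flag and function names (worker_started, allow_cancel, deferring_worker, cancel_is_nonblocking) are made up for illustration. The worker disables cancellation for a while, so the request can only be queued, yet pthread_cancel() still returns 0 straight away.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <pthread.h>
#include <thread>

// Hypothetical flags used only for this illustration.
static std::atomic<bool> worker_started{false};
static std::atomic<bool> allow_cancel{false};

// Worker that keeps cancellation disabled until the main thread releases it.
static void* deferring_worker(void*) {
    int old_state = 0;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    worker_started = true;
    while (!allow_cancel) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old_state);
    pthread_testcancel();  // a queued request is acted on here
    return nullptr;        // not reached when a request was queued
}

// Returns true if pthread_cancel() reported success while the worker was
// still running, and the later join reported PTHREAD_CANCELED.
bool cancel_is_nonblocking() {
    worker_started = false;
    allow_cancel = false;
    pthread_t tid;
    if (pthread_create(&tid, nullptr, deferring_worker, nullptr) != 0) {
        return false;
    }
    while (!worker_started) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    // The worker cannot terminate before allow_cancel is set below, so this
    // call must return while the request is merely queued.
    const int cancel_rc = pthread_cancel(tid);

    allow_cancel = true;
    void* status = nullptr;
    const int join_rc = pthread_join(tid, &status);
    return cancel_rc == 0 && join_rc == 0 && status == PTHREAD_CANCELED;
}
```

Note that the return value is stored in a variable rather than evaluated inside assert(): when NDEBUG is defined, the expression inside assert() is not evaluated at all, so the cancel request would never be sent.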

Related

How do I signal a std::thread to exit gracefully?

Using C++17, for a worker thread with a non-blocking loop in it that performs some task, I see three ways to signal the thread to exit:
A std::atomic_bool that the thread checks in a loop. If it is set to true, the thread exits. The main thread sets it to true before invoking std::thread::join().
A std::condition_variable with a bool. This is similar to the above, except it allows you to invoke std::condition_variable::wait_for() to effectively "sleep" the thread (to lower CPU usage) while it waits for a potential exit signal (via setting the bool, which is checked in the predicate, the 3rd argument to wait_for()). The main thread would lock a mutex, change the bool to true, and invoke std::condition_variable::notify_all() before invoking std::thread::join() to signal the thread to exit.
A std::future and std::promise. The main thread holds a std::promise<void> while the worker thread holds the corresponding std::future<void>. The worker thread uses std::future::wait_for() similar to the step above. Main thread invokes std::promise::set_value() before calling std::thread::join().
My thoughts on each:
This is simple, but lacks the ability to "slow down" the worker thread loop without explicitly calling std::this_thread::sleep_for(). Seems like an "old fashioned" way of doing thread signals.
This one is comprehensive, but very complicated, because you need a condition variable plus a boolean variable.
This one seems like the best option, because it has the simplicity of #1 without the verbosity of #2. But I have no personal experience with std::future and std::promise yet, so I am not sure if it's the ideal solution. In my mind, promise & future are meant to transfer values across threads, not really be used as signals. So I'm not sure if there are efficiency concerns.
I see multiple ways of signaling a thread to exit. And sadly, my Google searching has only introduced more as I keep looking, without actually coming to a general consensus on the "modern" and/or "best" way of doing this with C++17.
I would love to see some light shed on this confusion. Is there a conclusive, definitive way of doing this? What is the general consensus? What are the pros/cons of each solution, if there is no "one size fits all"?
If you have a busy worker thread that only needs a one-way notification telling it to stop, the best way is to just use an atomic<bool>. It is up to the worker thread whether it wants to slow down or not. The requirement to "throttle" the worker thread is completely orthogonal to thread cancellation and, in my opinion, should not be considered together with the cancellation itself. This approach, to my knowledge, has 2 drawbacks: you can't pass back the result (if any) and you can't pass back an exception (if any). But if you do not need either of those, then use atomic<bool> and don't bother with anything else. It is as modern as any; there is nothing old-fashioned about it.
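As a minimal sketch of the atomic<bool> approach (the function name run_worker_until_stopped and the counter standing in for real work are illustrative assumptions, not part of the answer):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs a busy worker until it is asked to stop and returns how many
// iterations it completed. The counter is a stand-in for real work.
unsigned long run_worker_until_stopped() {
    std::atomic<bool> stop_requested{false};
    std::atomic<unsigned long> iterations{0};

    std::thread worker([&] {
        while (!stop_requested.load(std::memory_order_acquire)) {
            iterations.fetch_add(1, std::memory_order_relaxed);
        }
    });

    // Let the worker make some progress before asking it to stop.
    while (iterations.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }

    stop_requested.store(true, std::memory_order_release);  // one-way signal
    worker.join();
    return iterations.load();
}
```

The worker decides for itself how often it checks the flag; any throttling would be added inside the loop without touching the stop logic.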
condition_variable is part of the consumer/producer pattern. So there is something which produces work and there is something that consumes what was produced. To avoid busy waiting in the consumer while there is nothing to consume, the condition_variable is a great option. It is just a perfect primitive for such tasks. But it doesn't make sense for the thread cancellation process. And you will have to use another variable anyway because you can't rely on condition_variable alone. It might spuriously wake up the thread. You might "set" it before the thread starts waiting, losing the "set" completely, and so on. It just can't be used alone, so we are back to square one, but now with an atomic<bool> variable to accompany our condition_variable.
The future/promise pair is good when you need to know the result of the operation done on the other thread. So it is not a replacement of the approach with the atomic<bool> but it rather complements it. So to remove the drawbacks described in the first paragraph you add future/promise to the equation. You provide the calling side with the future extracted from the promise which lives within the thread. That promise gets set once the thread is finished:
Because exception is thrown.
Because thread has done its work and completed on its own.
Because we asked it to stop by setting the atomic<bool> variable.
So as you see, the future/promise pair just helps to provide some feedback to the caller; it has nothing to do with the cancellation itself.
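One way this combination might look is sketched below. The names counting_worker and run_and_collect, and the fail switch that forces an exception, are hypothetical and exist only to exercise both outcomes.

```cpp
#include <atomic>
#include <cassert>
#include <exception>
#include <functional>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical worker: counts until asked to stop, or throws when fail is
// true. Either way the promise is fulfilled so the caller gets feedback.
void counting_worker(std::atomic<bool>& stop, bool fail, std::promise<long> done) {
    try {
        if (fail) {
            throw std::runtime_error("worker failed");
        }
        long count = 0;
        do {
            ++count;  // stand-in for one unit of real work
        } while (!stop.load());
        done.set_value(count);
    } catch (...) {
        done.set_exception(std::current_exception());
    }
}

// Starts the worker, requests a stop through the atomic flag, and returns
// whatever the future delivers.
long run_and_collect(bool fail) {
    std::atomic<bool> stop{false};
    std::promise<long> done;
    std::future<long> result = done.get_future();

    std::thread worker(counting_worker, std::ref(stop), fail, std::move(done));
    stop.store(true);  // cancellation is still done by the flag
    worker.join();
    return result.get();  // rethrows the worker's exception, if any
}
```

Calling run_and_collect(false) returns the count, while run_and_collect(true) rethrows the std::runtime_error on the calling thread.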
P.S. You can always use an electric sledgehammer to crack a nut but it doesn't make the approach any more modern.
I can't say that this is conclusive, or definitive, but since this is somewhat an opinion question, I'll give an answer that it is based upon a lot of trial and error to solve the kind of problem you are asking about (I think).
My preferred pattern is to signal the thread to stop using atomic bool, and control the 'loop' timing with a condition variable.
We ran into the requirement for running repeating tasks on worker threads so often that we created a class that we called 'threaded_worker'. This class handles the complexities of aborting the thread, and timing the calls to the worker function.
The abort is handled via a method that sets the atomic bool 'abort' signal which tells the thread to stop calling the work function and terminate.
The loop timing can be controlled by methods that set the wait time for the condition variable. The thread can be released to continue via a method that calls notify on the condition variable.
We use the class as a base class for all kinds of objects that have some function that needs to execute on a separate thread. The class is designed to run the 'work' function once, or in a loop.
We use the bool for the abort, because it is simple and suitable to do the job. We use the condition variable for loop timing, because it has the benefit of being notified to 'short circuit' the timing. This is very useful when the threaded object is a consumer. When a producer has work for the threaded object, it can queue the work and notify that the work is available. The threaded object immediately continues, instead of waiting for the specified wait time on the condition variable.
The reason for both (the abort signal, and the condition variable) is that I see terminating the thread as one function, and timing the loop as another.
We used to time loops by putting the thread to sleep for some duration. This made it almost impossible to get predictable loop timing on Windows computers. Some computers will return from sleep(1) in about 1ms, but others will return in 15ms. Our performance was highly dependent on the specific hardware. Using condition variables we have greatly improved the timing of critical tasks. The added benefit of notifying a waiting thread when work is available is more than worth the complexity of the condition variable.
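The 'threaded_worker' class itself is not shown, so the following is only a rough sketch of the pattern described above, not the author's implementation. The class name PeriodicWorker, its methods, and the run counter standing in for the work function are all assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Sketch only: an atomic abort flag stops the thread, and a condition
// variable times the loop and allows an early wake-up.
class PeriodicWorker {
public:
    explicit PeriodicWorker(std::chrono::milliseconds period)
        : period_(period), thread_([this] { run(); }) {}

    ~PeriodicWorker() { stop(); }

    // Wake the worker early, e.g. when a producer has queued work.
    void notify() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            wake_ = true;
        }
        cv_.notify_one();
    }

    // Set the abort signal, wake the worker, and wait for it to finish.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            abort_ = true;
        }
        cv_.notify_one();
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    int runs() const { return runs_.load(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!abort_) {
            // Wait one period, but return early on notify() or stop().
            cv_.wait_for(lock, period_, [this] { return wake_ || abort_; });
            wake_ = false;
            if (abort_) {
                break;
            }
            ++runs_;  // stand-in for calling the real work function
        }
    }

    std::chrono::milliseconds period_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::atomic<bool> abort_{false};
    bool wake_ = false;
    std::atomic<int> runs_{0};
    std::thread thread_;  // declared last so it starts after the other members exist
};

// Uses a one-hour period so that only notify() can trigger a run.
int runs_after_one_notify() {
    PeriodicWorker worker(std::chrono::hours(1));
    worker.notify();
    while (worker.runs() == 0) {
        std::this_thread::yield();
    }
    worker.stop();
    return worker.runs();
}
```

A real work function would normally run with the mutex released so that notify() never waits on it; the sketch keeps the lock only because the "work" here is a single increment.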

Do I need to join every thread in my application ?

I'm new to multi-threading and I need to understand the whole idea of "join". Do I need to join every thread in my application, and how does that work with multi-threading?
No, you can detach a thread if you want to leave it alone.
If you start a thread, either you detach it or you join it before the program ends, otherwise this is undefined behaviour.
To know whether a thread needs to be detached, ask yourself this question: "do I want the thread to run after the program's main function is finished?". Here are some examples:
When you do File/New you create a new thread and you detach it: the thread will be closed when the user closes the document. Here you don't need to join the threads.
When you do a Monte Carlo simulation, some distributed computing, or any Divide And Conquer type algorithms, you launch all the threads and you need to wait for all the results so that you can combine them. Here you explicitly need to join the thread before combining the results
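The second case might look like the following sketch, where a sum is split across several threads and every thread is joined before the partial results are combined. The function name parallel_sum and the chunking scheme are illustrative choices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums data by giving each of num_threads workers one contiguous chunk.
long parallel_sum(const std::vector<int>& data, std::size_t num_threads) {
    if (num_threads == 0) {
        num_threads = 1;
    }
    std::vector<long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t i = 0; i < num_threads; ++i) {
        const std::size_t begin = std::min(i * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, &partial, i, begin, end] {
            // Each worker writes only to its own slot, so no lock is needed.
            partial[i] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0L);
        });
    }

    // The partial results are only complete once every worker has been joined.
    for (std::thread& worker : workers) {
        worker.join();
    }
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

Reading partial before the joins would race with the workers; the join is what makes the combined result well defined.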
Not joining a thread is like not deleting all the memory you new. It can be harmless, or it could be a bad habit.
A thread you have not synchronized with is in an unknown state of execution. If it is a file writing thread, it could be half way through writing a file and then the app finishes. If it is a network communications thread, it could be half way through a handshake.
The downside to joining every thread is if one of them has gotten into a bad state and has blocked, your app can hang.
In general you should try to send a message to your outstanding threads to tell them to exit and clean up. Then you should wait a modest amount of time for them to finish or otherwise respond that they are good to die, and then shut down the app. Now prior to this you should signify your program is no longer open for business -- shut down GUI windows, respond to requests from other processes that you are shutting down, etc -- so if this takes longer than anticipated the user is not bothered. Finally if things go imperfectly -- if threads refuse to respond to your request that they shut down and you give up on them -- then you should log errors as well, so you can fix what may be a symptom of a bigger problem.
The last time a worker thread unexpectedly hung, I initially thought it was a problem with a network outage and a bug in the timeout code. Upon deeper inspection it was because one of the objects in use was deleted prior to the shutdown synchronization: the undefined behaviour that resulted just looked like a hang in my reproduction cases. Had we not carefully joined, that bug would have been harder to track down (now, the right thing to do would have been to use a shared resource that we could not delete: but mistakes happen).
The pthread_join() function suspends execution of the calling thread
until the target thread terminates, unless the target thread has
already terminated. On return from a successful pthread_join() call
with a non-NULL value_ptr argument, the value passed to pthread_exit()
by the terminating thread is made available in the location referenced
by value_ptr. When a pthread_join() returns successfully, the target
thread has been terminated. The results of multiple simultaneous calls
to pthread_join() specifying the same target thread are undefined. If
the thread calling pthread_join() is canceled, then the target thread
will not be detached.
So pthread_join does two things:
(1) Wait for the thread to finish.
(2) Clean up any resources associated with the thread.
This means that if you exit the process without calling pthread_join, then (2) will be done for you by the OS (although it won't do thread cancellation cleanup), and (1) will not be done.
So whether you need to call pthread_join depends whether you need (1) to happen.
Detached thread
If you don't need to wait for the thread to finish, then you may as well pthread_detach it. A detached thread cannot be joined (so you can't wait on its completion), but its resources are freed automatically when it completes.
do I need to join every thread in my application ?
Not necessarily - it depends on your design and OS. Join() is actively hazardous in GUI apps. If you don't need to know, or don't care, whether one thread has terminated from another thread, you don't need to join it.
I try very hard to not join/WaitFor any threads at all. Pool threads, app-lifetime threads and the like often do not require any explicit termination - depends on OS and whether the thread/s own, or are explicitly bound to, any resources that need explicit termination/close/whatever.
Threads can be either joinable or detached. Detached threads should not be joined. On the other hand, if you don't join a joinable thread, your app will leak some memory and some thread structures. A C++11 std::thread calls std::terminate if the thread object goes out of scope while still joinable, that is, without .join() or .detach() having been called. See pthread_detach and pthread_create. This is much like processes: when a child exits, it stays a zombie until its creator calls waitpid. The reason for this behavior is that the creator of a thread or process might want to know its exit code.
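To avoid running into std::terminate by accident, the join can be tied to scope exit. Below is a minimal sketch; the JoiningThread wrapper is a hypothetical helper written for this illustration (C++20's std::jthread offers comparable join-on-destruction behaviour).

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical RAII helper: joins the owned thread on scope exit, so the
// std::thread destructor never runs on a joinable thread.
class JoiningThread {
public:
    explicit JoiningThread(std::thread t) : thread_(std::move(t)) {}

    ~JoiningThread() {
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    JoiningThread(const JoiningThread&) = delete;
    JoiningThread& operator=(const JoiningThread&) = delete;

private:
    std::thread thread_;
};

// Returns the value written by the worker once the guard has joined it.
int value_after_scope_exit() {
    int value = 0;
    {
        JoiningThread guard(std::thread([&value] { value = 42; }));
    }  // guard's destructor joins the worker here
    return value;
}
```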
Update: if pthread_create is called with attribute argument equal to NULL (default attributes are used), joinable thread will be created. To create a detached thread, you can use attributes:
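pthread_t thread;
pthread_attr_t attrs;
pthread_attr_init(&attrs);
pthread_attr_setdetachstate(&attrs, PTHREAD_CREATE_DETACHED);
// pthread_create takes the addresses of the thread handle and the attributes.
pthread_create(&thread, &attrs, callback, arg);
pthread_attr_destroy(&attrs);  // the attribute object is no longer needed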
Also, you can detach an already created thread by calling pthread_detach on it. If you try to join a detached thread, pthread_join will return the EINVAL error code. glibc has a non-portable extension, pthread_getattr_np, that retrieves the attributes of a running thread, so you can check whether a thread is detached with pthread_attr_getdetachstate.

When is a thread actually terminated when calling TerminateThread?

If I terminate a thread on Windows using the TerminateThread function, is that thread actually terminated once the function returns, or is termination asynchronous?
Define "actually terminated". The documentation says the thread can not execute any more user-mode code, so effectively: yes, it is terminated, nothing of your code is going to be executed by that thread any more.
If you "WaitForSingleObject" on it right after terminating, I guess there could still be some slight delay because of cleanup that Windows is doing.
By the way: TerminateThread is the worst way of ending a thread. Try using some other means of synchronization, like a global variable that tells the thread to stop, or an event for example.
Terminating a thread is akin to killing a process, only on a per-thread level. It may in fact be implemented by raising an (uncatchable) signal in the targeted thread.
The result is essentially the same: Your program is not in any particular, predictable state. There's not much you can do with the dead thread. The control flow of your program becomes generally indeterminate, and thus it is extremely hard to reason about your program's behaviour in the presence of thread termination.
Basically, unless your thread is doing something extremely narrow, specific and restricted (e.g. increment an atomic counter once every second), there's no good model for the need to terminate a thread, and for the state of the program after the thread termination.
Don't do it. Design your threads so that you can communicate with them and so that their entry functions can return. Design your program so that you can always join all threads eventually and account for everything.
It is a synchronous call. That does not mean that it necessarily returns quickly - there may be some blocking involved if the OS has to resort to using its inter-core driver to stop the thread (i.e., the thread is actually running on a different core than the thread requesting the termination).
There are issues with calling TerminateThread from user code during an app run, (as distinct from the kernel using it during app/process termination), as clearly posted by others.
I try very hard to never terminate threads at all during an app run, with TerminateThread or by any other means. App-lifetime threads and thread pools often do not require any explicit termination before the OS destroys them on app close.

Do you need to join a cancelled thread? (pthreads)

I'm a little confused about clean-up order when you're using PThreads with regard to cancellation. Normally, if your thread is detached, it automatically cleans up when it terminates. If it's not detached, you need to join it to reclaim the system resources.
The textbook I'm reading states the following which strangely sounds like joining is optional with regard to cancellation:
"If you need to know when the thread has actually terminated, you must
join with it by calling pthread_join after cancelling it."
So, do I need to join a cancelled thread to free its resources - and if not, then why?
TLPI says this:
Upon receiving a cancellation request, a thread whose cancelability is
enabled and deferred terminates when it next reaches a cancellation
point. If the thread was not detached, then some other thread in the
process must join with it, in order to prevent it from becoming a
zombie thread.
Also, since canceling a thread isn't usually done immediately (read more about "cancellation points") without joining you can't be sure the thread was actually canceled.
From man pthread_join:
After a canceled thread has terminated, a join with that thread using
pthread_join(3) obtains PTHREAD_CANCELED as the thread's exit status.
(Joining with a thread is the only way to know that cancellation has
completed.)
It seems that joining is not necessary for the cancellation to happen, but it is necessary if you want to know whether what you did actually succeeded.
From the documentation of pthread_cancel():
After a canceled thread has terminated, a join with that thread using pthread_join(3) obtains PTHREAD_CANCELED as the thread's exit status. (Joining with a thread is the only way to know that cancellation has completed.)
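The quoted behaviour can be confirmed with a short experiment. This is a minimal sketch, assuming a Linux/glibc-style implementation where sleep() is a cancellation point and cleanup handlers run during cancellation; the names on_cancel, sleeping_worker and cancel_and_confirm are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <pthread.h>
#include <thread>
#include <unistd.h>

// Hypothetical flags used only for this illustration.
static std::atomic<bool> handler_installed{false};
static std::atomic<bool> handler_ran{false};

// Cleanup handler: runs while the cancelled thread is being torn down.
static void on_cancel(void*) { handler_ran = true; }

static void* sleeping_worker(void*) {
    pthread_cleanup_push(on_cancel, nullptr);
    handler_installed = true;
    for (;;) {
        sleep(1);  // sleep() is a cancellation point
    }
    pthread_cleanup_pop(0);  // never reached, but push/pop must be paired
    return nullptr;
}

// Cancels the worker, joins it, and reports whether the join returned
// PTHREAD_CANCELED after the cleanup handler ran.
bool cancel_and_confirm() {
    handler_installed = false;
    handler_ran = false;
    pthread_t tid;
    if (pthread_create(&tid, nullptr, sleeping_worker, nullptr) != 0) {
        return false;
    }
    while (!handler_installed) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    const int cancel_rc = pthread_cancel(tid);  // only queues the request
    void* status = nullptr;
    const int join_rc = pthread_join(tid, &status);  // waits for termination

    return cancel_rc == 0 && join_rc == 0 &&
           status == PTHREAD_CANCELED && handler_ran;
}
```

Only the pthread_join() result says that the thread is really gone; the zero returned by pthread_cancel() says no more than that the request was accepted.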
A pthread can be in one of the following cancelability states:
PTHREAD_CANCEL_ENABLE
PTHREAD_CANCEL_DISABLE
If you try to cancel a thread you do not 100% know if the thread will really get cancelled. Using a join delivers the information to you if the thread was really cancelled or not. There are also cancel types to be considered and respective pthread functions for setting the cancel type and state:
int pthread_setcancelstate (int state, int *oldstate);
int pthread_setcanceltype (int type, int *oldtype);
For sample code, see http://www.ijon.de/comp/tutorials/threads/cancel.html
If something goes wrong in a thread, or it is somehow stopped from within, it will always be tidied up by the OS. So it's all nice and safe.
You only need to join the thread if you have to be sure it has actually stopped executing, like when merging two parallel tasks. (E.g. if you have various threads working on various parts of a split structure, you need to join them all, as in wait until they are all finished, when you want to combine the structure again.)

Linux C++: Does a return from main() cause a multithreaded app to terminate?

This question seems like it's probably a duplicate, but I was unable to find one. If I missed a previous question, apologies.
In Java, where I have most of my experience, if your main() forks a thread and immediately returns the process continues to run until all (non-daemon) threads in the process have stopped.
In C++, this appears not to be the case - as soon as the main thread returns, the process terminates with other threads still running. For my current app this is easily solved with the application of pthread_join(), but I'm wondering what causes this behavior. Is this compiler (gcc) specific, pthreads specific, or is this kind of behavior shared across most/all platforms for which C++ has been implemented? Is this behavior configurable within pthreads? (I've looked through the pthread API at the pthread_attr_*() functions and didn't see anything that looked relevant.)
Completely separate question, but while you're here ... what would one use pthread_detach() for?
Yes. In modern linux (more importantly newer versions of GNU libc) exit_group is the system call used when main returns, not plain exit. exit_group is described as follows:
This system call is equivalent to
exit(2) except that it terminates not
only the calling thread, but all
threads in the calling process's
thread group.
It is worth noting that the current C++ standard (C++03 at the time of writing) makes no mention of threads, so this behavior is not C++ specific, but instead is specific to your particular implementation. That said, every implementation I've personally seen kills all threads when the main thread terminates.
EDIT: It is also worth noting Jonathan Leffler's answer which points out that the POSIX standard does indeed specify this behavior, so it is certainly normal for an application using pthreads for its threading.
EDIT: To answer the follow up about pthread_detach. Basically it is considered a resource leak if you do not join a non-detached thread. If you have a long running task which you have no need to "wait for", and it just "ends when it ends" then you should detach it which will not have a resource leak when it terminates with no join. The man page says the following:
The pthread_detach() function marks
the thread identified by thread as
detached. When a detached thread
terminates, its resources are
automatically released back to the
system without the need for another
thread to join with the terminated
thread.
So a quick and dirty answer is: "when you don't care when it ends, detach it. If another thread cares when it ends and must wait for it to terminate, then don't."
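A minimal sketch of the detach case follows, assuming a POSIX system; the names background_task and run_detached are hypothetical. Because a detached thread cannot be joined, the example uses an atomic flag purely so that the caller can observe that the task finished.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <pthread.h>
#include <thread>

// Hypothetical completion flag; a detached thread cannot be joined, so this
// is the only way the caller can observe that the task ran.
static std::atomic<bool> background_done{false};

static void* background_task(void*) {
    background_done = true;  // stand-in for a long-running task
    return nullptr;
}

// Starts a thread, detaches it, and waits until the task reports completion.
bool run_detached() {
    background_done = false;
    pthread_t tid;
    if (pthread_create(&tid, nullptr, background_task, nullptr) != 0) {
        return false;
    }
    // After this call the thread's resources are released automatically
    // when it terminates; pthread_join() must not be used on it any more.
    if (pthread_detach(tid) != 0) {
        return false;
    }
    while (!background_done) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return true;
}
```

In a real program the waiting loop would usually be absent, which is the point of detaching; keep in mind that returning from main() still ends the process, detached threads included.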
Yes
The POSIX standard says:
§3.297 Process Termination
There are two kinds of process termination:
Normal termination occurs by a return from main(), when requested with the exit(), _exit(), or _Exit() functions; or when the last thread in the process terminates by returning from its start function, by calling the pthread_exit() function, or through cancellation.
Abnormal termination occurs when requested by the abort() function or when some signals are received.
The first normal termination condition applies. (Note that the C++ (1998, 2003) standard says nothing about threads.)
Regarding pthread_detach()
The POSIX standard (again) says:
The pthread_detach() function shall indicate to the implementation that storage for the thread thread can be reclaimed when that thread terminates. If thread has not terminated, pthread_detach() shall not cause it to terminate.
And the rationale says:
The pthread_join() or pthread_detach() functions should eventually be called for every thread that is created so that storage associated with the thread may be reclaimed.
It has been suggested that a "detach" function is not necessary; the detachstate thread creation attribute is sufficient, since a thread need never be dynamically detached. However, need arises in at least two cases:
In a cancellation handler for a pthread_join() it is nearly essential to have a pthread_detach() function in order to detach the thread on which pthread_join() was waiting. Without it, it would be necessary to have the handler do another pthread_join() to attempt to detach the thread, which would both delay the cancellation processing for an unbounded period and introduce a new call to pthread_join(), which might itself need a cancellation handler. A dynamic detach is nearly essential in this case.
In order to detach the "initial thread" (as may be desirable in processes that set up server threads).
This is not compiler specific and is standard behavior; the application terminates when main() exits, so if you want to prevent the application from terminating, you need main() to block until all threads have terminated, which you do by joining those threads. When you invoke pthread_create, it allocates resources for that thread. The resources are not deallocated unless you do a pthread_join (which blocks until the thread terminates) or pthread_detach (which causes the thread to automatically release resources when that thread exits). You should use pthread_detach whenever you launch a background thread that will terminate when its task is completed and for which you do not need to wait.
To make this a little bit more concrete, suppose you have several threads that perform a piece of a computation, and then you aggregate the result in some way. That would be a case where you would use join, because you need the results of the threads to proceed. Now, consider a case where a thread listens on a socket and processes incoming requests, until a flag indicates that the thread should quit. In this case, you would use pthread_detach, since nothing needs the thread to terminate in order to proceed, and so the resources associated with that thread should go away automatically.