How does std::notify_all_at_thread_exit work? - c++

According to cppref:
std::notify_all_at_thread_exit provides a mechanism to notify other
threads that a given thread has completely finished, including
destroying all thread_local objects.
I know the exact semantics of std::notify_all_at_thread_exit. What puzzles me is:
How do you register a callback function that will be called after a given thread has finished and destroyed all of its thread-local objects?

std::notify_all_at_thread_exit takes a condition variable as its first parameter, by reference. When the thread exits, it will call notify_all on that condition variable, waking up threads that are waiting for the condition variable to be notified.
There doesn't appear to be a direct way to truly register a callback for this; you'll likely need a thread waiting for the condition variable to be notified (using the same lock as the one passed into std::notify_all_at_thread_exit). When the CV is notified, the waiting thread should verify that the wakeup isn't spurious, and then execute the desired code.
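A minimal sketch of that pattern, assuming the "callback" is simply the code run after the wait returns (the names worker and worker_done are illustrative):

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool worker_done = false;   // guards against spurious wakeups

void worker() {
    std::unique_lock<std::mutex> lk(m);
    worker_done = true;
    // Ownership of the lock is transferred; it is unlocked and cv is notified
    // only after this thread's thread_local objects have been destroyed.
    std::notify_all_at_thread_exit(cv, std::move(lk));
}

int main() {
    std::thread t(worker);
    t.detach();

    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [] { return worker_done; });  // re-check to ignore spurious wakeups
    // The "callback" goes here: the worker has completely finished by now.
    std::cout << "worker finished, thread_locals destroyed\n";
}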
More info about how this is implemented:
At least in LLVM's libc++, std::notify_all_at_thread_exit calls __thread_struct_imp::notify_all_at_thread_exit, which stores the parameters as a pair in a vector (_Notify). Upon thread death, the destructor of __thread_struct_imp iterates over this vector and notifies all of the condition variables that have been registered in this way.
Meanwhile, GNU libstdc++ uses a similar approach: a notifier object is created and registered with __at_thread_exit; it is designed so that its destructor runs at thread exit, and that destructor actually performs the notification. I'd need to investigate __at_thread_exit more closely, as I don't fully understand its inner workings just yet.
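In both cases the observable effect at thread exit is what the standard specifies: the lock that was handed in is unlocked and notify_all is called. A rough, purely illustrative sketch of that mechanism (not the actual library code):

#include <condition_variable>
#include <mutex>
#include <utility>
#include <vector>

// Conceptual model only: each thread keeps a record of the
// (condition_variable, mutex) pairs registered via notify_all_at_thread_exit.
// The record is destroyed after the thread's thread_local objects, and its
// destructor performs the notifications.
struct thread_exit_notifications {
    std::vector<std::pair<std::condition_variable*, std::mutex*>> registered;

    ~thread_exit_notifications() {
        for (auto& entry : registered) {
            entry.second->unlock();     // release the lock the caller handed in
            entry.first->notify_all();  // wake every thread waiting on the CV
        }
    }
};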

Why is HANDLE event object assumed valid in thread function?

Why is a HANDLE event object (a synchronization object created by the CreateEvent function) in the Win32 API assumed to be valid in a thread function?
From multithreading examples to Microsoft docs code examples, this event object is passed to the WaitForSingleObject function without any protection.
I've been doing the same. And today I started wondering how I can deal with this "branch" safely, in the sense of branch coverage from a code perspective.
In the strict sense, this event object is shared among multiple threads, at least the thread which calls SetEvent and the thread which calls WaitForSingleObject.
Therefore, it has to be classified as a type of shared resource. Then, all shared resources must be protected by a "lock", such as a mutex or critical section.
Also, it is possible to deliberately call CloseHandle after SetEvent while the thread is alive, which will lead to passing a closed event handle to WaitForSingleObject in the thread function. (Maybe the event object won't be deleted due to deferred deletion.)
Acquiring the lock and calling WaitForSingleObject in the thread function, while another thread tries to acquire the lock in order to call SetEvent, would definitely lead to deadlock.
[EDIT]
Maybe I muddled my point by mentioning "assumed" and a particular code example. What I wonder is how to do a thread-safe validity check for a HANDLE event object, treating the HANDLE as a variable.
According to Synchronizing Execution of Multiple Threads, there are a number of objects whose handles can be used to synchronize multiple threads. These objects include:
Console input buffers
Events
Mutexes
Processes
Semaphores
Threads
Timers
The state of each of these objects is either signaled or not signaled, and that state is managed atomically by the system.
As far as the handle is concerned, the WaitForSingleObject documentation says: "If this handle is closed while the wait is still pending, the function's behavior is undefined."
For an invalid handle, it is the programmer's responsibility to track down where the handle became invalid (a bug).
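For context, the usual pattern the question is about looks roughly like this (a hedged sketch with error handling omitted; names are illustrative). The handle stays valid simply because nobody closes it until no thread can still use it:

#include <windows.h>
#include <process.h>

HANDLE g_event;  // shared between threads, but the handle value itself never changes

unsigned __stdcall worker(void*) {
    // No lock around the handle: it is an opaque value set before the thread
    // started, and the signaled/non-signaled state is managed atomically by
    // the kernel.
    WaitForSingleObject(g_event, INFINITE);
    return 0;
}

int main() {
    g_event = CreateEvent(nullptr, TRUE, FALSE, nullptr);  // manual-reset, initially non-signaled
    HANDLE thread = (HANDLE)_beginthreadex(nullptr, 0, worker, nullptr, 0, nullptr);

    SetEvent(g_event);                      // signal the worker
    WaitForSingleObject(thread, INFINITE);  // close handles only after no thread can use them
    CloseHandle(thread);
    CloseHandle(g_event);
}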

C++. std::condition_variable and multiple wait-threads

I have a class with a queue of std::function<void()> as a member, and methods Push and Pop.
I want to implement an additional method, PushAndWaitUntilExecuted. It is easy when you have one consumer thread (which calls Pop) and one producer thread (which calls Push) - a simple std::condition_variable is enough.
But my application has a dynamic number of threads which can execute the same lines of code, calling the PushAndWaitUntilExecuted function in parallel and waiting until the consumer thread executes the pushed std::function object.
I have the idea of passing std::pair<uint64_t, std::function<void()>> to the queue instead of just std::function<void()>, where the uint64_t is the producer thread's ID (boost::this_thread::get_id()). Then the consumer thread calls std::condition_variable::notify_all() and all threads check whether the executed std::function carries their own ID or not.
Is this an OK solution, or can something better be implemented?
More than just a condition variable needs to be introduced here, in order to avoid several different race conditions. A mutex and a job completion flag are also required.
At this point, it becomes cleaner to replace your std::function<void()> with a small class that contains this closure, as well as all the additional baggage:
struct job {
    std::function<void()> implementation;
    std::mutex m;
    std::condition_variable flag;
    bool completed = false;
};
Your queue becomes a queue of std::shared_ptr<job>s, instead of a queue of std::functions, with the jobs constructed in dynamic scope (since, of course, mutexes and condition variables are not copyable or movable, and these objects get accessed from both of your threads).
After your worker thread finishes executing the implementation, it:
locks the mutex.
sets completed to true
signals the condition variable.
And your PushAndWaitUntilExecuted, after it executes the push:
locks the mutex
waits on the condition variable, until completed is set
You must thoroughly understand that C++ gives you absolutely no guarantees, whatsoever, that after you push a new closure into your job queue, some thread doesn't immediately grab it, execute it, and finish it, before the original thread (the one that pushed it) gets around to looking at the condition variable. By now, nobody will be signaling the condition variable any more. If all you have to work with is just a condition variable here, you'll be waiting for the condition variable to get signaled until our sun explodes.
Which is why you need more than just a condition variable: add a mutex and an explicit flag, and use the above approach, in order to correctly handle inter-thread sequencing.
This is a fairly classical, routine approach. You should find examples of many similar implementations in every good C++ textbook on this subject matter.
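A minimal sketch of the whole approach, assuming a simple deque-based queue guarded by its own mutex (the queue, ConsumerLoop and PushAndWaitUntilExecuted names are illustrative):

#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>

struct job {
    std::function<void()> implementation;
    std::mutex m;
    std::condition_variable flag;
    bool completed = false;
};

std::deque<std::shared_ptr<job>> queue;
std::mutex queue_m;
std::condition_variable queue_cv;

void PushAndWaitUntilExecuted(std::function<void()> f) {
    auto j = std::make_shared<job>();
    j->implementation = std::move(f);
    {
        std::lock_guard<std::mutex> lk(queue_m);
        queue.push_back(j);
    }
    queue_cv.notify_one();

    std::unique_lock<std::mutex> lk(j->m);
    j->flag.wait(lk, [&] { return j->completed; });  // safe even if the job already ran
}

void ConsumerLoop() {
    for (;;) {
        std::shared_ptr<job> j;
        {
            std::unique_lock<std::mutex> lk(queue_m);
            queue_cv.wait(lk, [] { return !queue.empty(); });
            j = queue.front();
            queue.pop_front();
        }
        j->implementation();
        {
            std::lock_guard<std::mutex> lk(j->m);
            j->completed = true;      // set the flag under the job's mutex...
        }
        j->flag.notify_all();         // ...then wake the producer waiting on it
    }
}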

Possible race condition in std::condition_variable?

I've looked into the VC++ implementation of std::condition_variable::wait(lock, pred); basically, it looks like this:
template<class _Predicate>
void wait(unique_lock<mutex>& _Lck, _Predicate _Pred)
{   // wait for signal and test predicate
    while (!_Pred())
        wait(_Lck);
}
Basically, the naked wait calls _Cnd_waitX, which calls _Cnd_wait, which calls do_wait, which calls cond->_get_cv()->wait(cs); (all of these are in the file cond.c).
cond->_get_cv() returns a Concurrency::details::stl_condition_variable_interface.
If we go to the file primitives.h, we see that under Windows 7 and above we have the class stl_condition_variable_win7, which contains the good old Win32 CONDITION_VARIABLE, and whose wait calls __crtSleepConditionVariableSRW.
Doing a bit of assembly debugging, __crtSleepConditionVariableSRW just extracts the SleepConditionVariableSRW function pointer and calls it.
Here's the thing: as far as I know, the Win32 CONDITION_VARIABLE is not a kernel object, but a user-mode one. Therefore, if some thread notifies this variable and no thread is actually sleeping on it, you lose the notification, and the thread will remain sleeping until the timeout is reached or some other thread notifies it. A small program can actually prove it - if you miss the point of notification, your thread will remain sleeping although some other thread notified it.
My question goes like this:
one thread waits on a condition variable and the predicate returns false. Then, the whole chain of calls explained above takes place. During that time, another thread changes the environment so the predicate will return true and notifies the condition variable. We have already evaluated the predicate in the original thread, but we still haven't reached SleepConditionVariableSRW - the call chain is very long.
So, although we notified the condition variable and the predicate put on the condition variable will definitely return true (because the notifier made it so), we are still blocking on the condition variable, possibly forever.
Is this how it should behave? It seems like a big ugly race condition waiting to happen. If you notify a condition variable and its predicate returns true, the thread should unblock. But if we're in the limbo between checking the predicate and going to sleep, we are blocked forever. std::condition_variable::wait is not an atomic function.
What does the standard say about it, and is it really a race condition?
You've violated the contract so all bets are off. See: http://en.cppreference.com/w/cpp/thread/condition_variable
TLDR: It's impossible for the predicate to change by someone else while you're holding the mutex.
You're supposed to change the underlying variable of the predicate while holding a mutex and you have to acquire that mutex before calling std::condition_variable::wait (both because wait releases the mutex, and because that's the contract).
In the scenario you described, the change happened after while (!_Pred()) saw that the predicate doesn't hold, but before wait(_Lck) had a chance to release the mutex. This means that you changed the thing the predicate checks without holding the mutex. You have violated the rules, and a race condition or an infinite wait are still not the worst kinds of UB you can get. At least these are local and related to the rules you violated, so you can find the error...
If you play by the rules, either:
The waiter takes hold of the mutex first
Goes into std::condition_variable::wait. (Recall the notifier still waits on the mutex.)
Checks the predicate and sees that it doesn't hold. (Recall the notifier still waits on the mutex.)
Calls some implementation-defined magic to release the mutex and wait, and only now may the notifier proceed.
The notifier finally managed to take the mutex.
The notifier changes whatever needs to change for the predicate to hold true.
The notifier calls std::condition_variable::notify_one.
or:
The notifier acquires the mutex. (Recall that the waiter is blocked on trying to acquire the mutex.)
The notifier changes whatever needs to change for the predicate to hold true. (Recall that the waiter is still blocked.)
The notifier releases the mutex. (Somewhere along the way the notifier will call std::condition_variable::notify_one, but once the mutex is released...)
The waiter acquires the mutex.
The waiter calls std::condition_variable::wait.
The waiter checks while (!_Pred()) and voilà! the predicate is true.
The waiter doesn't even go into the internal wait, so whether or not the notifier managed to call std::condition_variable::notify_one or didn't manage to do that yet is irrelevant.
That's the rationale behind the requirement on cppreference.com:
Even if the shared variable is atomic, it must be modified under the mutex in order to correctly publish the modification to the waiting thread.
Note that this is a general rule for condition variables (including Windows CONDITION_VARIABLEs, POSIX pthread_cond_ts, etc.) rather than a special requirement for std::condition_variable.
Recall that the wait overload that takes a predicate is just a convenience function so that the caller doesn't have to deal with spurious wakeups. The standard (§30.5.1/15) explicitly says that this overload is equivalent to the while loop in Microsoft's implementation:
Effects: Equivalent to:
while (!pred())
    wait(lock);
Does the simple wait work? Do you test the predicate before and after calling wait? Great. You're doing the same. Or are you questioning void std::condition_variable::wait( std::unique_lock<std::mutex>& lock ); too?
Windows Critical Sections and Slim Reader/Writer Locks being user-mode facilities rather than kernel objects is immaterial and irrelevant to the question. There are alternative implementations. If you're interested in how Windows manages to atomically release a CS/SRWL and enter a wait state (which is what naive pre-Vista user-mode implementations with Mutexes and Events got wrong), that's a different question.
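In code, the contract described above looks like this (a minimal sketch; ready, waiter and notifier are illustrative names for the shared state behind the predicate and the two threads):

#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool ready = false;   // the state the predicate checks

void waiter() {
    std::unique_lock<std::mutex> lk(m);
    // Equivalent to: while (!ready) cv.wait(lk);
    // The predicate is only ever tested while the mutex is held, and wait()
    // releases the mutex atomically with going to sleep, so the notifier
    // cannot flip `ready` in the gap between the test and the sleep.
    cv.wait(lk, [] { return ready; });
}

void notifier() {
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true;     // modify the shared state under the mutex
    }
    cv.notify_one();      // then notify (with or without the mutex held)
}

int main() {
    std::thread t(waiter);
    notifier();
    t.join();
}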

Boost thread object lifetime and thread lifetime

I would like the boost thread object to be deleted together with exiting from the thread entry function. Is there anything wrong with starting the thread function and passing it a shared pointer to the object which owns the thread object instance, so that when the thread function exits it destroys this object together with the thread object at the same time?
EDIT:
Maybe I should describe why I want to do that. I have to use the low-level dbus API. What I want to do is create an adapter class which will start its own thread and wait for incoming messages until the DISCONNECT message arrives. If it arrives, I want to close the thread and kill the Adapter itself. The adapter is an Active Object, which runs the methods sent to its scheduler. These methods put themselves on the scheduler queue once again after reading a message from dbus. But if it is a DISCONNECT message, they should not re-post the method but just exit the scheduler thread, destroying the Adapter object. Hmm, looks like it is too complicated...
From the Boost.Thread documentation you can see that a thread object that is joinable should not be deleted, otherwise std::terminate will be called.
So you should ensure that if the thread is joinable, either join() or detach() is called in the destructor of the object owning the thread. Note: if the thread itself is destroying the object, join() is not an option - the thread would attempt to join itself, resulting in a deadlock.
However, if you keep these restrictions in mind, you can destroy a thread from within its own thread of execution.
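A minimal sketch of that rule (ThreadOwner is an illustrative name; in the self-destruction scenario from the question only detach() is safe in the destructor):

#include <boost/thread.hpp>

struct ThreadOwner {
    boost::thread worker;   // started elsewhere, e.g. worker = boost::thread(fn);

    ~ThreadOwner() {
        // A joinable boost::thread must be joined or detached before it is
        // destroyed, otherwise std::terminate is called. If this destructor
        // can run on the worker thread itself (the question's scenario),
        // join() would deadlock, so detach.
        if (worker.joinable())
            worker.detach();
    }
};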
You can do this, but you probably should not.
The main purpose of the boost::thread object is that you can monitor the associated thread. Having a thread monitor itself does not make much sense in most scenarios.
As was suggested by the other answers, you could just detach the thread and throw the boost::thread object away. Doing this is usually considered bad style, unless the monitoring responsibility has been transferred to another object first. For example, many simple worker threads set a future upon completion. The future already provides all the monitoring we need, so we can detach the thread.
You should never detach a thread completely such that you lose all means of monitoring it. You must at least be able to guarantee a clean shutdown, which becomes impossible for all but the most trivial threads if you detach them completely.
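For example, a minimal sketch of the "future as the monitor" idea mentioned above, using std::thread and std::promise (the Boost equivalents work analogously; the returned value 42 is just an illustration):

#include <future>
#include <thread>

std::future<int> start_worker() {
    std::promise<int> result;
    std::future<int> monitor = result.get_future();

    std::thread worker([p = std::move(result)]() mutable {
        // ... do the actual work ...
        p.set_value(42);   // the future is the completion signal
    });
    worker.detach();       // safe to detach: the future is how we monitor it

    return monitor;
}

int main() {
    std::future<int> f = start_worker();
    int value = f.get();   // blocks until the detached worker has finished
    (void)value;
}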
I am not sure if that addresses your use case but it sounds to me like you don't have to do this.
The lifetime of the boost::thread object does not necessarily coincide with that of the thread itself. Meaning that if you don't care, you can just as well start the thread, call detach() on it and let the object run out of scope. Then it is deleted, but the thread will still run until its function is finished. The only thing is, you won't be able to join it, so if your program finishes while the thread still runs, it will crash.
In case you do care about this stuff, the question might be moot, because in that case you would store the objects and call join() on them before deleting.

Do threads clean-up after themselves in Win32/MFC and POSIX?

I am working on a multithreaded program using C++ and Boost. I am using a helper thread to eagerly initialize a resource asynchronously. If I detach the thread and all references to the thread go out of scope, have I leaked any resources? Or does the thread clean up after itself (i.e. its stack and any other system resources needed for the thread itself)?
From what I can see in the docs (and what I recall from pthreads 8 years ago), there's no explicit "destroy thread" call that needs to be made.
I would like the thread to execute asynchronously and when it comes time to use the resource, I will check if an error has occured. The rough bit of code would look something like:
// Assume this won't get called frequently enough that next_resource won't get
// promoted before the thread finishes.
void PromoteResource() {
    current_resource_ptr = next_resource_ptr;
    next_resource_ptr.reset(new Resource());
    // Bind the initialization to the newly created resource.
    boost::function<void()> callable =
        boost::bind(&Resource::Initialize, next_resource_ptr);
    boost::thread t(callable);  // the thread starts on construction; there is no start()
    t.detach();
}
Of course, I understand that normal memory-handling problems still exist (forgetting to delete, bad exception handling, etc.)... I just need confirmation that the thread itself isn't a "leak".
Edit: A point of clarification, I want to make sure this isn't technically a leak:
void Run() {
    boost::this_thread::sleep(boost::posix_time::seconds(10));
}

void DoSomething(...) {
    boost::thread t(Run);
    t.detach();
} // thread detaches, will clean itself up; the thread itself isn't a 'leak'?
I'm fairly certain everything is cleaned up after 10 seconds-ish, but I want to be absolutely certain.
The thread's stack gets cleaned up when it exits, but not anything else. This means that anything it allocated on the heap or anywhere else (in pre-existing data structures, for example) will get left when it quits.
Additionally, any OS-level objects (file handles, sockets, etc.) will be left lying around (unless you're using a wrapper object which closes them in its destructor).
But programs which frequently create / destroy threads should probably free everything they allocate in the same thread, as that's the only way of keeping the programmer sane.
If I'm not mistaken, on Windows XP all resources used by a process will be released when the process terminates, but that isn't true for threads.
Yes, the resources are automatically released upon thread termination. This is a perfectly normal and acceptable thing to do to have a background thread.
To clean up after a thread you must either join it, or detach it (in which case you can no longer join it).
Here's a quote from the boost thread docs that somewhat explains that (but not exactly).
When the boost::thread object that represents a thread of execution is destroyed the thread becomes detached. Once a thread is detached, it will continue executing until the invocation of the function or callable object supplied on construction has completed, or the program is terminated. A thread can also be detached by explicitly invoking the detach() member function on the boost::thread object. In this case, the boost::thread object ceases to represent the now-detached thread, and instead represents Not-a-Thread.
In order to wait for a thread of execution to finish, the join() or timed_join() member functions of the boost::thread object must be used. join() will block the calling thread until the thread represented by the boost::thread object has completed. If the thread of execution represented by the boost::thread object has already completed, or the boost::thread object represents Not-a-Thread, then join() returns immediately. timed_join() is similar, except that a call to timed_join() will also return if the thread being waited for does not complete when the specified time has elapsed.
In Win32, as soon as the thread's main function (called ThreadProc in the documentation) finishes, the thread is cleaned up. Any resources you allocate yourself inside the ThreadProc you'll need to clean up explicitly, of course.
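A small sketch of that: the stack and per-thread OS bookkeeping are reclaimed when ThreadProc returns, but heap allocations made inside it are not, and the thread HANDLE held by the creator still has to be closed (illustrative example, error handling omitted):

#include <windows.h>

DWORD WINAPI ThreadProc(LPVOID) {
    int* buffer = new int[1024];
    // ... use buffer ...
    delete[] buffer;   // heap allocations are NOT freed for you at thread exit
    return 0;          // stack and per-thread bookkeeping are reclaimed here
}

int main() {
    HANDLE h = CreateThread(nullptr, 0, ThreadProc, nullptr, 0, nullptr);
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);    // the creator still owns this handle and must close it
}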