How can I avoid threading + optimizer == infinite loop? [duplicate] - c++

In a code review today, I stumbled across the following bit of code (slightly modified for posting):
while (!initialized)
{
// The thread can start before the constructor has finished initializing the object.
// Can lead to strange behavior.
continue;
}
These are the first few lines of code that run in a new thread. In another thread, once initialization is complete, initialized is set to true.
I know that the optimizer could turn this into an infinite loop, but what's the best way to avoid that?
volatile - considered harmful
calling an isInitialized() function instead of using the variable directly - would this guarantee a memory barrier? What if the function was declared inline?
Are there other options?
Edit:
Should have mentioned this sooner, but this is portable code that needs to run on Windows, Linux, Solaris, etc. We mostly use Boost.Thread as our portable threading library.

Calling a function won't help at all; even if a function is not declared inline, its body can still be inlined (barring something extreme, like putting your isInitialized() function in another library and dynamically linking against it).
Two options that come to mind:
Declare initialized as an atomic flag (in C++0x, you can use std::atomic_flag; otherwise, you'll want to consult the documentation for your threading library for how to do this)
Use a semaphore; acquire it in the other thread and wait for it in this thread.
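A minimal sketch of the first option, assuming a C++11 compiler. It uses std::atomic<bool> rather than std::atomic_flag, because before C++20 atomic_flag offers no plain read operation. The names Worker, run, value and start_and_wait are hypothetical and only serve the illustration.

```cpp
#include <atomic>
#include <thread>

// Hypothetical object whose worker thread must not touch `value`
// until the constructing thread has finished setting it up.
struct Worker {
    std::atomic<bool> initialized{false};
    int value = 0;

    // First lines run by the new thread: spin until initialization is published.
    int run() {
        while (!initialized.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // the atomic load is re-read on every iteration
        }
        return value;  // sees the write made before the release store
    }
};

// Starts the thread first, then finishes initialization, as in the question.
int start_and_wait() {
    Worker w;
    int seen = -1;
    std::thread t([&] { seen = w.run(); });
    w.value = 42;                                          // the "initialization"
    w.initialized.store(true, std::memory_order_release);  // publish it
    t.join();
    return seen;
}
```

The loop still burns CPU while it waits, so for anything but a very short wait a blocking primitive is probably the better choice.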

@Karl's comment is the answer. Don't start processing in thread A until thread B has finished initialization. The key to doing this is sending a signal from thread B to thread A that it is up & running.
You mentioned no OS, so I will give you some Windows-ish pseudocode. Transcode to the OS/library of your choice.
First create a Windows Event object. This will be used as the signal:
Thread A:
HANDLE running = CreateEvent(0, TRUE, FALSE, 0);
Then have Thread A start Thread B, passing the event along to it:
Thread A:
DWORD thread_b_id = 0;
HANDLE thread_b = CreateThread(0, 0, ThreadBMain, (void*)running, 0, &thread_b_id);
Now in Thread A, wait until the event is signaled:
Thread A:
DWORD rc = WaitForSingleObject(running, INFINITE);
if( rc == WAIT_OBJECT_0 )
{
// thread B is up & running now...
// MAGIC HAPPENS
}
Thread B's startup routine does its initialization, and then signals the event:
Thread B:
DWORD WINAPI ThreadBMain(void* param)
{
HANDLE running = (HANDLE)param;
do_expensive_initialization();
SetEvent(running); // this will tell Thread A that we're good to go
return 0;
}

Synchronization primitives are the solution to this problem, not spinning in a loop... But if you must spin in a loop and can't use a semaphore, event, etc, you can safely use volatile. It's considered harmful because it hurts the optimizer. In this case that's exactly what you want to do, no?

There is a boost equivalent of atomic_flag which is called once_flag in boost::once. It may well be what you want here.
Effectively, if you want something to be constructed the first time it is called (e.g. lazy loading), and that first call can happen in multiple threads, you get boost::call_once to call your function the first time it is reached. The post-condition is that it has been initialized, so there is no need for any kind of looping or locking.
What you do need to ensure is that your initialization logic does not throw exceptions.
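A rough sketch of the same idea using the C++11 standard counterpart, std::call_once with std::once_flag, instead of the Boost version. The names initialize, get_resource, run_threads and the globals are hypothetical.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical lazily initialized resource shared by several threads.
std::once_flag init_flag;
int init_count = 0;   // how many times the initializer actually ran
int resource = 0;

void initialize() {   // runs exactly once, even under contention
    ++init_count;
    resource = 7;
}

int get_resource() {
    std::call_once(init_flag, initialize);  // concurrent callers wait, later callers skip
    return resource;                        // post-condition: initialized
}

// Calls get_resource() from n threads and returns the initializer count.
int run_threads(int n) {
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back([] { get_resource(); });
    for (auto& t : threads) t.join();
    return init_count;
}
```

With std::call_once, an exception thrown by the initializer leaves the flag unset, so a later call tries again; whether that is acceptable depends on the initialization logic.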

This is a well known problem when working with threads. Creation/Initialization of objects takes relatively little time. When the thread actually starts running though... That can take quite a long time in terms of executed code.
Everyone keeps mentioning semaphores...
You may want to look at POSIX 1003.1b semaphores. Under Linux, try man sem_init. E.g.:
http://manpages.ubuntu.com/manpages/dapper/man3/sem_init.3.html
http://www.skrenta.com/rt/man/sem_init.3.html
http://docs.oracle.com/cd/E23824_01/html/821-1465/sem-init-3c.html
These semaphores have the advantage that, once Created/Initialized, one thread can block indefinitely until signaled by another thread. More critically, that signal can occur BEFORE the waiting thread starts waiting. (A significant difference between Semaphores and Condition Variables.) Also, they can handle the situation where you receive multiple signals before waking up.
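To illustrate the "signal before wait" and "multiple signals" properties without depending on a particular platform, here is a sketch of a counting semaphore built from a C++11 mutex and condition variable. This is not the POSIX API itself; Semaphore, post, wait and post_before_wait are hypothetical names standing in for sem_post()/sem_wait().

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal counting semaphore; a stand-in for sem_post()/sem_wait().
class Semaphore {
public:
    void post() {
        std::lock_guard<std::mutex> lock(m_);
        ++count_;               // the signal is remembered in the count
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Posts twice before anyone waits, then consumes both signals.
int post_before_wait() {
    Semaphore sem;
    sem.post();
    sem.post();
    int woken = 0;
    std::thread waiter([&] { sem.wait(); ++woken; sem.wait(); ++woken; });
    waiter.join();
    return woken;
}
```

Because the count persists, neither signal is lost even though both arrive before the waiter exists. A bare condition variable without a predicate would not give that guarantee.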

Related

How to control thread lifetime using C++11 atomics

Following on from this question, I'd like to know what's the recommended approach we should take to replace the very common pattern we have in legacy code.
We have plenty of places where a primary thread is spawning one or more background worker threads and periodically pumping out some work for them to do, using a suitably synchronized queue. So the general pattern for a worker thread will look like this:
There will be an event HANDLE and a bool defined somewhere (usually as member variables) -
HANDLE hDoSomething = CreateEvent(NULL, FALSE, FALSE, NULL);
volatile bool bEndThread = false;
Then the worker thread function waits for the event to be signalled before doing work, but checks for a termination request inside the main loop -
unsigned int ThreadFunc(void *pParam)
{
// typical legacy implementation of a worker thread
while (true)
{
// wait for event
WaitForSingleObject(hDoSomething, INFINITE);
// check for termination request
if (bEndThread) break;
// ... do background work ...
}
// normal termination
return 0;
}
The primary thread can then give some work to the background thread like this -
// ... put some work on a synchronized queue ...
// pulse worker thread to do the work
SetEvent(hDoSomething);
And it can finally terminate the worker thread like so -
// to terminate the worker thread
bEndThread = true;
SetEvent(hDoSomething);
// wait for worker thread to die
WaitForSingleObject(hWorkerThreadHandle, dwSomeSuitableTimeOut);
In some cases, we've used two events (one for work, one for termination) and WaitForMultipleObjects instead, but the general pattern is the same.
So, looking at replacing the volatile bool with a C++11 standard equivalent, is it as simple as replacing this
volatile bool bEndThread = false;
with this?
std::atomic<bool> bEndThread{false};
I'm sure it will work, but it doesn't seem enough. Also, it doesn't affect the case where we use two events and no bool.
Note, I'm not intending to replace all this legacy stuff with the PPL and/or Concurrency Runtime equivalents because although we use these for new development, the legacy codebase is end-of-life and just needs to be compatible with the latest development tools (the original question I linked above shows where my concern arose).
Can someone give me a rough example of C++11 standard code we could use for this simple thread management pattern to rewrite our legacy code without too much refactoring?
If it ain't broken don't fix it (especially if this is a legacy code base)
VS-style volatile will be around for a few more years. Given that MFC isn't dead, this won't be dead any time soon. A cursory Google search says you can control it with /volatile:ms.
Atomics might do the job of volatile; especially if this is a counter, there might be little performance overhead.
Many Windows native functions have different performance characteristics when compared to their C++11 counterparts. For example, Windows timer queues and multimedia timers have a precision that is not possible to achieve with C++11.
For example, sleep_for with a 5 ms duration can sleep for about 15 ms (and not 5 or 6). This can be solved with a mysterious call to timeBeginPeriod. Another example is that waking up on a condition variable can be slow to respond. Interfaces to fix these aren't exposed to C++11 on Windows.
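For reference, a rough sketch of what the worker pattern from the question could look like in standard C++11, assuming the event HANDLE is replaced by a std::condition_variable and the volatile bool by a flag guarded by the same mutex as the queue. The class name Worker and its members are hypothetical. One deliberate difference from the legacy loop: this version drains the queue before honouring the termination request.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical worker that sums the integers pushed to it.
class Worker {
public:
    Worker() : thread_(&Worker::run, this) {}
    ~Worker() { stop(); }

    void push(int item) {               // replaces "queue work + SetEvent"
        { std::lock_guard<std::mutex> lock(m_); queue_.push(item); }
        cv_.notify_one();
    }
    void stop() {                       // replaces "bEndThread = true + SetEvent + wait"
        { std::lock_guard<std::mutex> lock(m_); end_ = true; }
        cv_.notify_one();
        if (thread_.joinable()) thread_.join();
    }
    int total() const { return total_; }  // valid only after stop()

private:
    void run() {
        std::unique_lock<std::mutex> lock(m_);
        for (;;) {
            cv_.wait(lock, [this] { return end_ || !queue_.empty(); });
            if (queue_.empty()) break;  // end requested and nothing left to do
            int item = queue_.front();
            queue_.pop();
            lock.unlock();
            total_ += item;             // ... do background work ...
            lock.lock();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> queue_;
    bool end_ = false;                  // guarded by m_, so no atomic is needed
    int total_ = 0;
    std::thread thread_;                // declared last: starts after the rest is built
};
```

Since the flag is only read and written under the mutex, std::atomic<bool> is not required here; it would be needed if the flag were checked outside the lock.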

C++: How to call a synchronous library call asynchronously?

I am working with a library that has a blocking call that never times out if it does not succeed. I would like to be able to handle this error condition more gracefully. I know there must be a way to wrap the call in a worker thread (or some other type of delegate object), wait x amount of seconds, and then throw an exception if x amount of seconds have passed. I only need to do this for one function in the library. How do I go about implementing this? I see similar examples all over the net but none that are doing exactly what I'm trying to do. Thanks!
My answer is "do not attempt to do this".
Sure, you can probably find some hack that will appear to work in your particular case. But the race conditions here are very hard to fix.
The obvious approach is to have thread A make the blocking call, then set up thread B to kill A if a timeout expires.
But... What if the timeout expires at the same time A is returning from the blocking call? Specifically, what if B thinks it is time to kill A, then your OS scheduler decides to run A for a while, then your OS decides to run the B code that kills A?
Bottom line: You wind up killing A at some indeterminate point in its execution. (For example, maybe it just deducted $500 from the savings account but has not yet added $500 to the checking account. The possibilities are endless...)
OK, so you can have thread A exist for the sole purpose of running the library call, and then signal a condition or whatever when it finishes. At least it is possible to make this work in principle. But even then, what if the library itself has some internal state that gets left in an inconsistent state should A get killed at an inopportune moment?
There are good reasons asynchronous thread cancellation was omitted from the C++11 standard. Just say no. Fix the library routine. Whatever that costs, it is almost certainly cheaper in the long run than what you are attempting.
With C++11, launching a thread explicitly for that call could look like:
// API
T call(int param);
// asynchronous call with 42 as parameter
auto future = std::async(std::launch::async, call, 42);
// let's wait for 40 ms
auto constexpr duration = std::chrono::milliseconds(40);
if(future.wait_for(duration) == std::future_status::timeout) {
// We waited for 40ms and had a timeout, now what?
} else {
// Okay, result is available through future.get()
// if call(42) threw an exception then future.get() will
// rethrow that exception so it's worth doing even if T is void
future.get();
}
As you can see in case of a timeout you have a big problem as you're stuck with a blocked thread forever. This is arguably not a fault of the C++11 std::future: a fair number of thread abstractions will provide at best cooperative cancellation, and that would still not be enough to save you.
If you're not using C++11 then Boost.Thread has a very similar interface with boost::unique_future (where wait_for is instead timed_wait, and returns bool), although it doesn't have something akin to std::async so you have to do some of the busywork yourself (with e.g. boost::packaged_task + boost::thread). Details available in the documentation.
Obviously the thread within which the blocking call is made cannot kill itself - it will be blocked.
One approach would be to launch a thread A that makes the blocking call, then launch another thread B that sleeps for the timeout then kills thread A. A mutex protected shared flag can indicate whether the operation succeeded, based on which an exception can be thrown or not.
A second approach (very similar) would be to launch a thread A, which in turn launches thread B, sleeps for the timeout, then kills thread B.
The specifics of your preferred threading library (such as which threads are allowed to kill each other) and the nature of the blocking function will impact exactly how you go about this.
On Windows, you will want to do something like this:
//your main thread
DWORD threadID;
HANDLE h = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)&ThreadProc, 0, 0, &threadID);
DWORD ret = 0xFFFFFF;
for (int i = 0; i < /*some timeout*/; i++) {
ret = WaitForSingleObject(h, 100);
if (ret == WAIT_OBJECT_0) break;
}
if (ret != WAIT_OBJECT_0) {
TerminateThread(h, 1); // second argument is the exit code; you will want to stop the thread as it isn't exiting.
/*throw*/;
}
And
//Thread Routine
DWORD WINAPI ThreadProc (LPVOID threadParam) {
//call your function here
return 0;
}
The idea here is to spin up a thread to do the work you want. You can then wait on that thread in 100 ms increments (or whatever you want). If it doesn't end within a certain time period, you can throw an exception.
There are some problems. First, does the library hold any internal state that will be left unusable by a failed library call? If so, you are stuck, because calls following the failed call that blocked will also fail or, worse, generate erroneous results without any exception or other notification.
If the library is safe, then you could indeed try to thread off the call and wait on some event with a timeout. It's now that you need to handle the concerns of @Nemo: you need to take care over how you handle the return of results. How exactly you do this depends on, well, how you intend to return results from the thread that calls the library. Typically, both threads would enter a critical section to safely arbitrate between the lib thread returning results and the timeout thread instructing the lib thread to never return anything (e.g. by setting a flag in it), and just exit if the lib call ever returns.
Orphaning the lib thread in such a way will result in a thread leak if the lib call never returns. Whether you can absorb such leaks, or safely resort to eventual forced termination of the orphaned threads, is between you and your app :)
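A sketch of the orphaning approach in C++11, assuming the library call can be wrapped in a callable. Here the shared state of a std::packaged_task/std::future pair does the arbitration between "result arrived" and "caller gave up", so no hand-written critical section is needed. The helper name call_with_timeout is hypothetical.

```cpp
#include <chrono>
#include <future>
#include <stdexcept>
#include <thread>

// Runs `blocking_call` on a detached thread and waits at most `timeout`.
// On timeout the thread is orphaned, not killed; if the call never
// returns, that thread leaks for the lifetime of the process.
template <typename F>
auto call_with_timeout(F blocking_call, std::chrono::milliseconds timeout)
    -> decltype(blocking_call()) {
    using R = decltype(blocking_call());
    std::packaged_task<R()> task(std::move(blocking_call));
    std::future<R> result = task.get_future();
    std::thread(std::move(task)).detach();   // the task keeps the shared state alive
    if (result.wait_for(timeout) == std::future_status::timeout) {
        throw std::runtime_error("library call timed out");
    }
    return result.get();   // rethrows if the call itself threw
}
```

A late result is simply stored in the shared state and discarded when the orphaned thread finishes. This does nothing about inconsistent internal library state, which remains the caller's problem.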

kill boost thread after n seconds

I am looking for the best way to solve the following (C++) problem. I have a function given by some framework, which returns an object. Sometimes it takes just milliseconds, but on some occasions it takes minutes. So I want to stop the execution if it takes longer than, let's say, 2 seconds.
I was thinking about doing it with boost threads. Important side note: if the function returns faster than the 2 seconds, the program should not wait.
So I was thinking about 2 threads:
1.thread: execute function a
2.thread: run timer
if(thread 2 exited before thread 1) kill thread 1
else do nothing
I am struggling a bit with the practical implementation. Especially,
how do I return an object from a child boost thread to the main thread?
how do I kill a thread in boost?
is my idea even a good one, or is there a better way to solve the problem in C++ (with or without boost)?
As for waiting, just use thread::timed_join() inside your main thread, this will return false, if the thread didn't complete within the given time.
Killing the thread is not feasible if your third-party library is not aware of boost::thread. Also, you almost certainly don't want to 'kill' the thread without giving the function the possibility to clean up.
I'd suggest that you wait for, say, 2 seconds and then continue with some kind of error message, letting the framework function finish its work and just ignoring the result if it came too late.
As for returning a value, I'd suggest something like
struct myfunction {
MyObj returnValue;
void operator() () {
// ...
returnValue = theComputedReturnValue;
}
};
// ...
myfunction f;
boost::thread t = boost::thread(boost::ref(f));
t.join(); // or t.timed_join()...
use(f.returnValue);
// ...
I have done something similar in the past and it works (even though it is not ideal).
To get the return value, just "share" a variable (that could be just a pointer (initially null) to the returned value, or a full object with a state, etc.) and make your thread read/update it. Don't forget to protect it with a mutex if needed. That should be quite straightforward.
Expanding on what James has said above, "kill a thread" is such a harsh term! :) But interruption is not so easy either: typically with boost threads, there needs to be an interruption point where the running thread can be interrupted. There is a set of these interruptible functions (unfortunately they are boost specific), such as wait/sleep etc. One option you have is, in the first thread, to liberally scatter calls to boost::this_thread::interruption_point(), such that when you call interrupt() once thread 2 dies, thread 1 will throw a boost::thread_interrupted exception at the next interruption point.
Threads are in the same process space, thus you can have shared state between multiple threads as long as there is synchronized access to that shared state.
EDIT: just noticed that the OP has already looked into this... will leave the answer up anyway I guess...
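For readers without Boost, the interruption-point idea can be approximated by hand with a std::atomic<bool> that the worker checks between units of work. This is a sketch of the concept only, not Boost's implementation; Interrupted, interruption_point, long_job and run_and_interrupt are hypothetical names. As with Boost, it only helps if the long-running code is yours to modify.

```cpp
#include <atomic>
#include <chrono>
#include <stdexcept>
#include <thread>

struct Interrupted : std::runtime_error {
    Interrupted() : std::runtime_error("interrupted") {}
};

std::atomic<bool> interrupt_requested{false};

// Hand-rolled interruption point: throws once a stop has been requested.
void interruption_point() {
    if (interrupt_requested.load()) throw Interrupted();
}

// Long-running job with interruption points scattered through it.
// Returns true if it ran to completion, false if it was interrupted.
bool long_job(int steps) {
    try {
        for (int i = 0; i < steps; ++i) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // one unit of work
            interruption_point();
        }
        return true;
    } catch (const Interrupted&) {
        return false;   // the stack unwinds normally, so destructors run
    }
}

// Starts the job, requests interruption after `delay`, and reports the outcome.
bool run_and_interrupt(int steps, std::chrono::milliseconds delay) {
    interrupt_requested = false;
    bool completed = true;
    std::thread t([&] { completed = long_job(steps); });
    std::this_thread::sleep_for(delay);
    interrupt_requested = true;
    t.join();
    return completed;
}
```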

Wait for a detached thread to finish in C++

How can I wait for a detached thread to finish in C++?
I don't care about an exit status, I just want to know whether or not the thread has finished.
I'm trying to provide a synchronous wrapper around an asynchronous third-party tool. The problem is a weird race condition crash involving a callback. The progression is:
1. I call the third party and register a callback.
2. When the third party finishes, it notifies me using the callback, in a detached thread I have no real control over.
3. I want the thread from (1) to wait until (2) is called.
I want to wrap this in a mechanism that provides a blocking call. So far, I have:
class Wait {
public:
void callback() {
pthread_mutex_lock(&m_mutex);
m_done = true;
pthread_cond_broadcast(&m_cond);
pthread_mutex_unlock(&m_mutex);
}
void wait() {
pthread_mutex_lock(&m_mutex);
while (!m_done) {
pthread_cond_wait(&m_cond, &m_mutex);
}
pthread_mutex_unlock(&m_mutex);
}
private:
pthread_mutex_t m_mutex;
pthread_cond_t m_cond;
bool m_done;
};
// elsewhere...
Wait waiter;
thirdparty_utility(&waiter);
waiter.wait();
As far as I can tell, this should work, and it usually does, but sometimes it crashes. As far as I can determine from the corefile, my guess as to the problem is this:
When the callback sets m_done and broadcasts, the wait thread wakes up
The wait thread is now done here, and Wait is destroyed. All of Wait's members are destroyed, including the mutex and cond.
The callback thread tries to continue from the broadcast point, but is now using memory that's been released, which results in memory corruption.
When the callback thread tries to return (above the level of my poor callback method), the program crashes (usually with a SIGSEGV, but I've seen SIGILL a couple of times).
I've tried a lot of different mechanisms to try to fix this, but none of them solve the problem. I still see occasional crashes.
EDIT: More details:
This is part of a massively multithreaded application, so creating a static Wait isn't practical.
I ran a test, creating Wait on the heap, and deliberately leaking the memory (i.e. the Wait objects are never deallocated), and that resulted in no crashes. So I'm sure it's a problem of Wait being deallocated too soon.
I've also tried a test with a sleep(5) after the unlock in wait, and that also produced no crashes. I hate to rely on a kludge like that though.
EDIT: ThirdParty details:
I didn't think this was relevant at first, but the more I think about it, the more I think it's the real problem:
The thirdparty stuff I mentioned, and why I have no control over the thread: this is using CORBA.
So, it's possible that CORBA is holding onto a reference to my object longer than intended.
Yes, I believe that what you're describing is happening (race condition on deallocate). One quick way to fix this is to create a static instance of Wait, one that won't get destroyed. This will work as long as you don't need to have more than one waiter at the same time.
You will also permanently use that memory, it will not deallocate. But it doesn't look like that's too bad.
The main issue is that it's hard to coordinate lifetimes of your thread communication constructs between threads: you will always need at least one leftover communication construct to communicate when it is safe to destroy (at least in languages without garbage collection, like C++).
EDIT:
See comments for some ideas about refcounting with a global mutex.
To the best of my knowledge there's no portable way to directly ask a thread if it's done running (i.e. no pthread_ function). What you are doing is the right way to do it, at least as far as having a condition that you signal. If you are seeing crashes that you are sure are due to the Wait object being deallocated when the thread that creates it quits (and not some other subtle locking issue, which is all too common), the issue is that you need to make sure the Wait isn't being deallocated, by managing it from a thread other than the one that does the notification. Put it in global memory, or dynamically allocate it and share it with that thread. Most simply, don't have the thread being waited on own the memory for the Wait; have the thread doing the waiting own it.
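One way to make "dynamically allocate it and share it" concrete is shared ownership: both the waiter and the callback hold a std::shared_ptr to the synchronization state, so it is destroyed only after whichever of them finishes last. This is a C++11 sketch under the assumption that the callback can carry such a pointer; thirdparty_utility below is a hypothetical stand-in for the real tool, and WaitState and call_and_wait are made-up names.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Synchronization state kept alive by whoever still holds a shared_ptr to it.
struct WaitState {
    std::mutex mutex;
    std::condition_variable cond;
    bool done = false;
};

// Hypothetical stand-in for the third-party tool: runs the callback on a detached thread.
template <typename Callback>
void thirdparty_utility(Callback cb) {
    std::thread(std::move(cb)).detach();
}

// Blocking wrapper: returns once the callback has fired.
bool call_and_wait() {
    auto state = std::make_shared<WaitState>();
    thirdparty_utility([state] {            // the callback holds its own reference
        std::lock_guard<std::mutex> lock(state->mutex);
        state->done = true;
        state->cond.notify_all();
    });
    std::unique_lock<std::mutex> lock(state->mutex);
    state->cond.wait(lock, [&] { return state->done; });
    return state->done;
}
```

Even if the waiter returns while the callback thread is still inside its unlock, the mutex and condition variable stay valid until the callback's copy of the pointer is released.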
Are you initializing and destroying the mutex and condition var properly?
Wait::Wait()
{
pthread_mutex_init(&m_mutex, NULL);
pthread_cond_init(&m_cond, NULL);
m_done = false;
}
Wait::~Wait()
{
assert(m_done);
pthread_mutex_destroy(&m_mutex);
pthread_cond_destroy(&m_cond);
}
Make sure that you aren't prematurely destroying the Wait object -- if it gets destroyed in one thread while the other thread still needs it, you'll get a race condition that will likely result in a segfault. I'd recommend making it a global static variable that gets constructed on program initialization (before main()) and gets destroyed on program exit.
If your assumption is correct, then the third-party module appears to be buggy and you need to come up with some kind of hack to make your application work.
A static Wait is not feasible. How about a Wait pool (it may even grow on demand)? Is your application using a thread pool to run?
There will still be a chance that the same Wait will be reused while the third-party module is still using it, but you can minimize that chance by properly queuing vacant Waits in your pool.
Disclaimer: I am in no way an expert in thread safety, so consider this post as a suggestion from a layman.

C++ Thread question - setting a value to indicate the thread has finished

Is the following safe?
I am new to threading and I want to delegate a time consuming process to a separate thread in my C++ program.
Using the boost libraries I have written code something like this:
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
Where finished_flag is a boolean member of my class. When the thread is finished it sets the value and the main loop of my program checks for a change in that value.
I assume that this is okay because I only ever start one thread, and that thread is the only thing that changes the value (except for when it is initialised before I start the thread)
So is this okay, or am I missing something, and need to use locks and mutexes, etc
You never mentioned the type of finished_flag...
If it's a straight bool, then it might work, but it's certainly bad practice, for several reasons. First, some compilers will cache the reads of the finished_flag variable, since the compiler doesn't always pick up the fact that it's being written to by another thread. You can get around this by declaring the bool volatile, but that's taking us in the wrong direction. Even if reads and writes are happening as you'd expect, there's nothing to stop the OS scheduler from interleaving the two threads half way through a read / write. That might not be such a problem here where you have one read and one write op in separate threads, but it's a good idea to start as you mean to carry on.
If, on the other hand, it's a thread-safe type, like a CEvent in MFC (or the equivalent in boost), then you should be fine. This is the best approach: use thread-safe synchronization objects for inter-thread communication, even for simple flags.
Instead of using a member variable to signal that the thread is done, why not use a condition? You are already using the boost libraries, and condition is part of the thread library.
Check it out. It allows the worker thread to 'signal' that it has finished, and the main thread can check during execution if the condition has been signaled and then do whatever it needs to do with the completed work. There are examples in the link.
As a general case I would never make the assumption that a resource will only be modified by the thread. You might know what it is for, however someone else might not, causing no end of grief as the main thread thinks that the work is done and tries to access data that is not correct! It might even delete it while the worker thread is still using it, causing the app to crash. Using a condition will help with this.
Looking at the thread documentation, you could also call thread.timed_join in the main thread. timed_join will wait for a specified amount of time for the thread to 'join' (join means that the thread has finished).
I don't mean to be presumptive, but it seems like the purpose of your finished_flag variable is to pause the main thread (at some point) until the thread thrd has completed.
The easiest way to do this is to use boost::thread::join
// launch the thread...
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
// ... do other things maybe ...
// wait for the thread to complete
thrd->join();
If you really want to get into the details of communication between threads via shared memory, declaring a variable volatile won't be enough, even if the compiler does use appropriate access semantics to ensure that it won't get a stale version of data after checking the flag. The CPU can issue reads and writes out of order (x86 usually doesn't, but PPC definitely does), and there is nothing in C++98/C++03 that allows the compiler to generate code to order memory accesses appropriately.
Herb Sutter's Effective Concurrency series has an extremely in depth look at how the C++ world intersects the multicore/multiprocessor world.
Having the thread set a flag (or signal an event) before it exits is a race condition. The thread has not necessarily returned to the OS yet, and may still be executing.
For example, consider a program that loads a dynamic library (pseudocode):
lib = loadLibrary("someLibrary");
fun = getFunction("someFunction");
fun();
unloadLibrary(lib);
And let's suppose that this library uses your thread:
void someFunction() {
volatile bool finished_flag = false;
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
while(!finished_flag) { // ignore the polling loop, it's beside the point
sleep();
}
delete thrd;
}
void myclass::mymethod(volatile bool* finished_flag) {
// do stuff
*finished_flag = true;
}
When myclass::mymethod() sets finished_flag to true, myclass::mymethod() hasn't returned yet. At the very least, it still has to execute a "return" instruction of some sort (if not much more: destructors, exception handler management, etc.). If the thread executing myclass::mymethod() gets pre-empted before that point, someFunction() will return to the calling program, and the calling program will unload the library. When the thread executing myclass::mymethod() gets scheduled to run again, the address containing the "return" instruction is no longer valid, and the program crashes.
The solution would be for someFunction() to call thrd->join() before returning. This would ensure that the thread has returned to the OS and is no longer executing.
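A sketch of how the two ideas can be combined in C++11, assuming std::thread in place of boost::thread: the flag tells the main loop that the work is done, and a join guarantees that the thread itself is gone before anything it depends on is torn down. BackgroundJob and its members are hypothetical names.

```cpp
#include <atomic>
#include <thread>

// Hypothetical owner of a background job, polled from a main loop.
class BackgroundJob {
public:
    BackgroundJob() : thread_([this] {
        result_ = 123;                                      // time-consuming work
        finished_.store(true, std::memory_order_release);   // "work is done"
    }) {}

    ~BackgroundJob() { if (thread_.joinable()) thread_.join(); }

    // Non-blocking check for the main loop. Once the flag is seen, join so
    // that the thread has really exited, not merely set the flag.
    bool poll() {
        if (!finished_.load(std::memory_order_acquire)) return false;
        if (thread_.joinable()) thread_.join();
        return true;
    }
    int result() const { return result_; }   // valid once poll() has returned true

private:
    int result_ = 0;
    std::atomic<bool> finished_{false};
    std::thread thread_;                      // declared last: starts after the flag exists
};
```

The join after the flag is normally very short, since the thread has nothing left to do but return.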