I wrote a DLL, MyDLL.dll, with Visual C++ 2008, as follows:
(1) MFC statically linked
(2) Using the multithreaded runtime library.
In the DLL, there is a global data item m_Data shared by two exported functions, as follows:
ULONGLONG WINAPI MyFun1(LPVOID *lpCallbackFun1)
{
...
Write m_Data(using Critical section to protect)
…
return xxx;
}
ULONGLONG WINAPI MyFun2(LPVOID *lpCallbackFun2)
{
...
Suspend MyThread1 to prevent conflict.
Read m_Data(using Critical section to protect)
Resume MyThread1.
…
return xxx;
}
In my main application, it first calls LoadLibrary to load MyDLL.dll, then gets the addresses of MyFun1 and MyFun2, then does the following:
(1) Start a new thread MyThread1, which invokes MyFun1 to do a time-consuming task.
(2) Start a new thread MyThread2, which invokes MyFun2 several times, as follows:
for (nIndex = 0; nIndex < 20; nIndex++)
{
nResult2 = MyFun2(lpCallbackFun2);
NextStatement2;
}
Although MyThread1 and MyThread2 use a critical section to protect the shared data m_Data, I still suspend MyThread1 before accessing the shared data, to prevent any possible conflicts.
The problem is:
(1) On the first invocation of MyFun2, everything is OK, and the return value of MyFun2 (that is, nResult2) is 1, as expected.
(2) On the second, third, and fourth invocations of MyFun2, the operations in MyFun2 execute successfully, but the return value of MyFun2 (nResult2) is a random value instead of the expected 1. I traced into MyFun2 with the debugger and confirmed that the final return statement really does return 1, yet the caller receives a random value instead of 1 when inspecting nResult2.
(3) After the fourth invocation of MyFun2, on returning to the statement following the call, I always get a “buffer overrun detected” error, whatever the next statement is.
This looks like stack corruption to me, so I ran some tests:
I confirmed that the /GS (stack security check) feature in the compiler is ON.
If MyFun2 is invoked only after MyFun1 in MyThread1 has completed, everything is OK.
In debug mode, the line in MyFun2 that reads the shared data m_Data does not cause any errors or exceptions, and neither does the line in MyFun1 that writes it.
So, how can I solve this problem?
Thank you!
I suppose at this line
Suspend MyThread1 to prevent conflict.
you are using SuspendThread() function. That's what its documentation says:
This function is primarily designed for use by debuggers. It is not intended to be used for thread synchronization. Calling SuspendThread on a thread that owns a synchronization object, such as a mutex or critical section, can lead to a deadlock if the calling thread tries to obtain a synchronization object owned by a suspended thread. To avoid this situation, a thread within an application that is not a debugger should signal the other thread to suspend itself. The target thread must be designed to watch for this signal and respond appropriately.
So, in short: don't use it. Critical sections and other synchronization objects do their job just fine.
Never use SuspendThread!!! NEVER!
SuspendThread should only be used for debugging purposes.
The reason is simple: you don't know where you are suspending the thread. It may be just at the moment the thread holds a resource you want to use. Also, a number of CRT functions use thread synchronization internally.
Just use critical sections or mutexes.
Just see the simple sample here: http://blog.kalmbachnet.de/?postid=6 and here
http://blog.kalmbachnet.de/?postid=16
Since this is a Windows program, you could use a Windows mutex or semaphore and WaitForSingleObject when reading or writing the shared data.
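To make the suggestion concrete, here is a minimal sketch of what the answers above recommend: protect m_Data with one CRITICAL_SECTION in both exported functions, and drop the SuspendThread/ResumeThread calls entirely. Only the names MyFun1, MyFun2 and m_Data come from the question; everything else (g_csData, the dummy bodies, the type of m_Data) is illustrative.

#include <windows.h>

// Initialized once with InitializeCriticalSection, e.g. on DLL_PROCESS_ATTACH.
static CRITICAL_SECTION g_csData;
static ULONGLONG        m_Data = 0;   // the shared data from the question (type assumed)

ULONGLONG WINAPI MyFun1(LPVOID *lpCallbackFun1)
{
    EnterCriticalSection(&g_csData);
    m_Data += 1;                       // write the shared data under the lock
    LeaveCriticalSection(&g_csData);
    return 1;
}

ULONGLONG WINAPI MyFun2(LPVOID *lpCallbackFun2)
{
    EnterCriticalSection(&g_csData);
    ULONGLONG copy = m_Data;           // read the shared data under the lock
    LeaveCriticalSection(&g_csData);
    // work with the local copy outside the lock
    return 1;
}

No thread is ever suspended here; the critical section alone serializes access to m_Data.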
I'm using the boost interprocess library to create server and client programs for passing OpenCV Mat objects around in shared memory. Each server and client process has two boost threads which are members of a boost::thread_group. One handles command-line IO while the other manages data processing. Shared memory access is synchronized using boost::interprocess condition_variables.
Since this program involves shared memory, I need to do some manual cleanup before exiting. My problem is that if the server terminates prematurely, the processing thread on the client blocks at the wait() call, since the server is responsible for sending notifications. I need to somehow interrupt the thread stuck at wait() to initiate shared memory destruction. I understand that calling interrupt() (in my case, thread_group.interrupt_all()) on the thread will cause the boost::thread_interrupted exception to be thrown upon reaching an interruption point (such as wait()), which, if left unhandled, would allow the shared memory destruction to proceed. However, when I try to interrupt the thread while it is in wait(), nothing seems to happen. For instance, this prints nothing to the command line:
try {
shared_mat_header->new_data_condition.wait(lock);
} catch (...) {
std::cout << "Thread interrupt occurred\n";
}
I am not at all sure, but it seems like the interrupt() call needs to occur before the thread enters wait() for the exception to be thrown. Is this true? If not, then what is the proper way to interrupt a boost thread that is blocked by a condition_variable.wait() call?
Thanks for any insight.
Edit
I accepted Chris Desjardins' answer, which does not answer the question directly, but has the intended effect. Here I'm translating his code snippet for use with boost::interprocess condition variables, which have slightly different syntax than boost::thread condition variables:
while (_running) {
boost::system_time timeout = boost::get_system_time() + boost::posix_time::milliseconds(1);
if (shared_mat_header->new_data_condition.timed_wait(lock, timeout))
{
//process data
}
}
I prefer to wait with timeouts and then check the return code of the wait call to see whether it timed out or not. In fact, I have a thread pattern I like to use that resolves this situation (and other common problems with threads in C++).
http://blog.chrisd.info/how-to-run-threads/
The main point for you is to not block infinitely in a thread, so your thread would look like this:
while (_running == true)
{
if (shared_mat_header->new_data_condition.wait_for(lock, boost::chrono::milliseconds(1)) == boost::cv_status::no_timeout)
{
// process data
}
}
Then in your destructor you set _running = false; and join the thread(s).
Try using the "notify function". Keep a pointer to your condition variable and call that instead of interrupting the threads. Interrupting is much more costly than a notify call.
So instead of doing
thread_group.interrupt_all()
call this instead
new_data_condition_pointer->notify_one()
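Putting the two answers together, the client's shutdown path might look roughly like the sketch below. The names _running, _thread_group and shared_mat_header->new_data_condition come from the snippets above; the stop() function and the Client class are hypothetical.

// Hypothetical shutdown sketch: stop the polling loop and wake any thread
// blocked in (timed_)wait with a notify instead of interrupting it.
void Client::stop()
{
    _running = false;                                     // checked on every loop iteration
    shared_mat_header->new_data_condition.notify_all();   // wake a waiter immediately
    _thread_group.join_all();                             // wait for the worker threads to exit
    // ...only now destroy/unmap the shared memory...
}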
In Microsoft Visual C++ I can call CreateThread() to create a thread by starting a function with one void * parameter. I pass a pointer to a struct as that parameter, and I see a lot of other people do that as well.
My question is: if I am passing a pointer to my struct, how do I know that the structure members have actually been written to memory before CreateThread() was called? Is there any guarantee they won't just be cached? For example:
struct bigapple { string color; int count; } apple;
apple.count = 1;
apple.color = "red";
hThread = CreateThread( NULL, 0, myfunction, &apple, 0, NULL );
DWORD WINAPI myfunction( void *param )
{
struct bigapple *myapple = (struct bigapple *)param;
// how do I know that apple's struct was actually written to memory before CreateThread?
cout << "Apple count: " << myapple->count << endl;
}
This afternoon while I was reading, I saw a lot of Windows code on this website and others that passes data that is not volatile to a thread, and there doesn't seem to be any memory barrier or anything else. I know C++, or at least its older revisions, is not "thread aware", so I'm wondering if maybe there's some other reason. My guess would be that the compiler sees I've passed the pointer &apple in a call to CreateThread(), so it knows to write out the members of apple before the call.
Thanks
No. The relevant Win32 thread functions all take care of the necessary memory barriers. All writes prior to CreateThread are visible to the new thread. Obviously the reads in that newly created thread cannot be reordered before the call to CreateThread.
volatile would not add any extra useful constraints on the compiler, and would merely slow down the code. In practice this wouldn't be noticeable compared to the cost of creating a new thread, though.
No, it should not be volatile. At the same time, you are pointing at a valid issue. The detailed operation of the caches is described in the Intel/ARM/etc. papers.
Nevertheless, you can safely assume that the data WILL BE WRITTEN. Otherwise too many things would be broken. Several decades of experience tell us that this is so.
If the thread scheduler starts the thread on the same core, the state of the cache will be fine; if not, the kernel will flush the cache. Otherwise, nothing would work.
Never use volatile for interaction between threads. It is an instruction on how to handle data inside the thread only (use a register copy or always reread it, etc.).
First, I think the optimizer cannot change the order at the expense of correctness. CreateThread() is a function; parameter binding for a function call happens before the call is made.
Secondly, volatile is not very helpful for the purpose you intend. Check out this article.
You're struggling with a non-problem, and are creating at least two others...
Don't worry about the parameters given to CreateThread: if they exist at the time the thread is created, they exist until CreateThread returns. And since the thread that creates them does not destroy them, they are also available to the other thread.
The problem now becomes who destroys them, and when: you create them with new, so they will exist until a delete is called (or until the process terminates: a nice memory leak!).
The process terminates when its main thread terminates (and all other threads will be terminated by the OS as well!). And there is nothing in your main that makes it wait for the other thread to complete.
Beware when using a low-level API like CreateThread from a language that has its own library that also interfaces with threads. The C runtime has _beginthreadex. It calls CreateThread and also performs other initialization tasks for the C/C++ library that you would otherwise miss. Some C (and C++) library functions may not work properly without those initializations, which are also required to properly free the runtime's resources at termination. Using CreateThread is like using malloc in a context where delete is then used to clean up.
The proper main-thread behavior should be:
// create the data
// create the other thread
// perform other tasks
// wait for the other thread to terminate
// destroy the data
What the Win32 API documentation doesn't say clearly is that every HANDLE is waitable, and becomes signaled when the associated resource is freed.
To wait for the other thread's termination, your main thread just has to call
WaitForSingleObject(hthread,INFINITE);
So the main thread will be more properly:
{
data* pdata = new data;
HANDLE hthread = (HANDLE)_beginthreadex(0,0,yourprocedure, pdata,0,0);
WaitForSingleObject(hthread,INFINITE);
delete pdata;
}
or even
{
data d;
HANDLE hthread = (HANDLE)_beginthreadex(0,0,yourprocedure, &d,0,0);
WaitForSingleObject(hthread,INFINITE);
}
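For completeness, here is a self-contained sketch of that second variant. The struct, the procedure name and its body are made up for illustration; only the _beginthreadex/WaitForSingleObject pattern is the point.

#include <windows.h>
#include <process.h>   // _beginthreadex

struct data { int count; };

// The thread procedure must use the unsigned __stdcall signature
// expected by _beginthreadex.
unsigned __stdcall yourprocedure(void* param)
{
    data* pdata = static_cast<data*>(param);
    pdata->count += 1;                      // ... use pdata ...
    return 0;
}

int main()
{
    data d = { 1 };
    HANDLE hthread = reinterpret_cast<HANDLE>(
        _beginthreadex(0, 0, yourprocedure, &d, 0, 0));
    WaitForSingleObject(hthread, INFINITE); // wait for the thread to terminate
    CloseHandle(hthread);                   // the handle still has to be closed
    return 0;
}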
I think the question is valid in another context.
As others have pointed out, using a struct and its contents this way is safe (although access to the data should be synchronized).
However, I think the question is valid if you have an atomic variable (or a pointer to one) that can be changed outside the thread. My opinion is that volatile should be used in that case.
Edit:
I think the examples on the wiki page are a good explanation http://en.wikipedia.org/wiki/Volatile_variable
I have a thread A that creates another thread B; thread A then waits using WaitForSingleObject until thread B dies.
The problem is that even though thread B returns from its thread_func, thread A does not get signaled!
I know that because I added traces (OutputDebugString) to the end of thread_func (thread B's main function), and I can see that thread B finishes its execution, but thread A never comes out of the WaitForSingleObject.
Now, I must also add that this code is in a COM object, and the scenario described above happens when I'm calling regsvr32.exe (it gets stuck!), so I believe that thread A is coming from DllMain.
Any ideas why thread A does not get signaled?!
You could be hitting a problem with the loader lock. Windows has an internal critical section that gets locked whenever a DLL is loaded/unloaded or a thread is started/stopped (DllMain is always called inside that lock). If your waiting thread A holds that critical section (i.e., you are waiting somewhere inside DllMain) while another thread B tries to shut down and attempts to acquire the loader critical section, you have your deadlock.
To see where the deadlock happens, just run your app from the VS IDE debugger and after it gets stuck, break execution. Then look at all running threads, and note the stack of each one. You should be able to follow each stack and see what each thread is waiting on.
I think #DXM is right. The documentation on exactly what you can or can't do inside of DllMain is sparse and hard to find, but the bottom line is that you should generally keep it to a bare minimum -- initialize internal variables and such, but that's about it.
The other point I'd make is that you generally should not "call" regsvr32.exe -- ever.
RegSvr32 is basically just a wrapper that loads a DLL into its address space with LoadLibrary, calls GetProcAddress to get the address of a function named DllRegisterServer, then calls that function. It's much cleaner (and ultimately easier) to do the job on your own, something like this:
HMODULE mod = LoadLibrary(your_COM_file);
HRESULT (__stdcall *register_DLL)() =
    (HRESULT (__stdcall *)())GetProcAddress(mod, "DllRegisterServer");
if (register_DLL == NULL) {
    // must not really be a COM object...
}
if (S_OK != register_DLL()) {
    // registration failed.
}
How can I wait for a detached thread to finish in C++?
I don't care about an exit status, I just want to know whether or not the thread has finished.
I'm trying to provide a synchronous wrapper around an asynchronous third-party tool. The problem is a weird race-condition crash involving a callback. The sequence is:
I call the third-party tool and register a callback.
When the third-party tool finishes, it notifies me using the callback -- in a detached thread I have no real control over.
I want the thread from (1) to wait until (2) is called.
I want to wrap this in a mechanism that provides a blocking call. So far, I have:
class Wait {
public:
void callback() {
pthread_mutex_lock(&m_mutex);
m_done = true;
pthread_cond_broadcast(&m_cond);
pthread_mutex_unlock(&m_mutex);
}
void wait() {
pthread_mutex_lock(&m_mutex);
while (!m_done) {
pthread_cond_wait(&m_cond, &m_mutex);
}
pthread_mutex_unlock(&m_mutex);
}
private:
pthread_mutex_t m_mutex;
pthread_cond_t m_cond;
bool m_done;
};
// elsewhere...
Wait waiter;
thirdparty_utility(&waiter);
waiter.wait();
As far as I can tell, this should work, and it usually does, but sometimes it crashes. As far as I can determine from the corefile, my guess as to the problem is this:
When the callback broadcasts the end of m_done, the wait thread wakes up
The wait thread is now done here, and Wait is destroyed. All of Wait's members are destroyed, including the mutex and cond.
The callback thread tries to continue from the broadcast point, but is now using memory that's been released, which results in memory corruption.
When the callback thread tries to return (above the level of my poor callback method), the program crashes (usually with a SIGSEGV, but I've seen SIGILL a couple of times).
I've tried a lot of different mechanisms to try to fix this, but none of them solve the problem. I still see occasional crashes.
EDIT: More details:
This is part of a massively multithreaded application, so creating a static Wait isn't practical.
I ran a test, creating Wait on the heap, and deliberately leaking the memory (i.e. the Wait objects are never deallocated), and that resulted in no crashes. So I'm sure it's a problem of Wait being deallocated too soon.
I've also tried a test with a sleep(5) after the unlock in wait, and that also produced no crashes. I hate to rely on a kludge like that though.
EDIT: ThirdParty details:
I didn't think this was relevant at first, but the more I think about it, the more I think it's the real problem:
The thirdparty stuff I mentioned, and why I have no control over the thread: this is using CORBA.
So, it's possible that CORBA is holding onto a reference to my object longer than intended.
Yes, I believe that what you're describing is happening (race condition on deallocate). One quick way to fix this is to create a static instance of Wait, one that won't get destroyed. This will work as long as you don't need to have more than one waiter at the same time.
You will also permanently use that memory; it will never be deallocated. But it doesn't look like that's too bad.
The main issue is that it's hard to coordinate lifetimes of your thread communication constructs between threads: you will always need at least one leftover communication construct to communicate when it is safe to destroy (at least in languages without garbage collection, like C++).
EDIT:
See comments for some ideas about refcounting with a global mutex.
To the best of my knowledge there's no portable way to directly ask a thread if it's done running (i.e., no pthread_ function for it). What you are doing is the right way to do it, at least as far as having a condition that you signal. If you are seeing crashes that you are sure are due to the Wait object being deallocated when the thread that created it quits (and not some other subtle locking issue -- all too common), then you need to make sure the Wait isn't deallocated too soon, by managing it from a thread other than the one that does the notification. Put it in global memory, or dynamically allocate it and share it with that thread. Most simply, don't have the thread being waited on own the memory for the Wait; have the thread doing the waiting own it.
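As an illustration of the refcounting idea mentioned in the comments, here is one possible sketch: each side (the waiter and the callback) holds one reference, and whichever side releases last deletes the object. The reference count is guarded by a global mutex. The class layout mirrors the Wait class from the question, but the refcounting details are an assumption of mine, not the poster's code.

#include <pthread.h>

class Wait {
public:
    Wait() : m_refs(2), m_done(false) {      // one reference for each side
        pthread_mutex_init(&m_mutex, NULL);
        pthread_cond_init(&m_cond, NULL);
    }
    void callback() {                        // called from the third-party/CORBA thread
        pthread_mutex_lock(&m_mutex);
        m_done = true;
        pthread_cond_broadcast(&m_cond);
        pthread_mutex_unlock(&m_mutex);
        release();                           // the callback side is finished with the object
    }
    void wait() {                            // called from the waiting thread
        pthread_mutex_lock(&m_mutex);
        while (!m_done)
            pthread_cond_wait(&m_cond, &m_mutex);
        pthread_mutex_unlock(&m_mutex);
        release();                           // the waiting side is finished with the object
    }
private:
    ~Wait() {                                // only release() may destroy the object
        pthread_cond_destroy(&m_cond);
        pthread_mutex_destroy(&m_mutex);
    }
    void release() {
        pthread_mutex_lock(&s_refMutex);     // global mutex guards the count
        bool last = (--m_refs == 0);
        pthread_mutex_unlock(&s_refMutex);
        if (last) delete this;               // safe: the other side has already released
    }
    static pthread_mutex_t s_refMutex;
    int m_refs;
    pthread_mutex_t m_mutex;
    pthread_cond_t m_cond;
    bool m_done;
};

pthread_mutex_t Wait::s_refMutex = PTHREAD_MUTEX_INITIALIZER;

The call site would then become Wait* waiter = new Wait; thirdparty_utility(waiter); waiter->wait(); with no explicit delete, since the last release() performs it.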
Are you initializing and destroying the mutex and condition var properly?
Wait::Wait()
{
pthread_mutex_init(&m_mutex, NULL);
pthread_cond_init(&m_cond, NULL);
m_done = false;
}
Wait::~Wait()
{
assert(m_done);
pthread_mutex_destroy(&m_mutex);
pthread_cond_destroy(&m_cond);
}
Make sure that you aren't prematurely destroying the Wait object -- if it gets destroyed in one thread while the other thread still needs it, you'll get a race condition that will likely result in a segfault. I'd recommend making it a global static variable that gets constructed on program initialization (before main()) and gets destroyed on program exit.
If your assumption is correct, then the third-party module appears to be buggy and you need to come up with some kind of hack to make your application work.
A static Wait is not feasible. How about a Wait pool (it could even grow on demand)? Is your application using a thread pool?
There will still be a chance that the same Wait will be reused while the third-party module is still using it, but you can minimize that chance by properly queuing vacant Waits in your pool.
Disclaimer: I am in no way an expert in thread safety, so consider this post as a suggestion from a layman.
Is the following safe?
I am new to threading and I want to delegate a time consuming process to a separate thread in my C++ program.
Using the boost libraries I have written code something like this:
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
Where finished_flag is a boolean member of my class. When the thread is finished, it sets the value, and the main loop of my program checks for a change in that value.
I assume that this is okay because I only ever start one thread, and that thread is the only thing that changes the value (except for when it is initialised before I start the thread).
So is this okay, or am I missing something and need to use locks, mutexes, etc.?
You never mentioned the type of finished_flag...
If it's a straight bool, then it might work, but it's certainly bad practice, for several reasons. First, some compilers will cache reads of the finished_flag variable, since the compiler doesn't always pick up the fact that it's being written to by another thread. You can get around this by declaring the bool volatile, but that's taking us in the wrong direction. Even if reads and writes happen as you'd expect, there's nothing to stop the OS scheduler from interleaving the two threads halfway through a read/write. That might not be such a problem here, where you have one read and one write op in separate threads, but it's a good idea to start as you mean to carry on.
If, on the other hand, it's a thread-safe type, like a CEvent in MFC (or its equivalent in boost), then you should be fine. This is the best approach: use thread-safe synchronization objects for inter-thread communication, even for simple flags.
Instead of using a member variable to signal that the thread is done, why not use a condition? You are already are using the boost libraries, and condition is part of the thread library.
Check it out. It allows the worker thread to 'signal' that is has finished, and the main thread can check during execution if the condition has been signaled and then do whatever it needs to do with the completed work. There are examples in the link.
As a general rule, I would never assume that a resource will only be modified by one thread. You might know what it is for; however, someone else might not, causing no end of grief as the main thread thinks the work is done and tries to access data that is not correct! It might even delete the data while the worker thread is still using it, causing the app to crash. Using a condition will help with this.
Looking at the thread documentation, you could also call thread.timed_join in the main thread. timed_join will wait for a specified amount of time for the thread to 'join' (join means that the thread has finished).
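For reference, a rough sketch of the condition-based approach might look like this. The class and member names are illustrative, not the poster's code, and it assumes boost::condition_variable used with boost::mutex:

#include <boost/thread.hpp>

class myclass {
public:
    myclass() : m_finished(false) {}

    // Runs in the worker thread.
    void mymethod() {
        // ... the time-consuming work ...
        boost::lock_guard<boost::mutex> lock(m_mutex);
        m_finished = true;
        m_cond.notify_one();             // tell the main thread we are done
    }

    // Called from the main thread when it needs the result.
    void wait_until_finished() {
        boost::unique_lock<boost::mutex> lock(m_mutex);
        while (!m_finished)
            m_cond.wait(lock);           // releases the mutex while waiting
    }

private:
    boost::mutex m_mutex;
    boost::condition_variable m_cond;
    bool m_finished;
};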
I don't mean to be presumptive, but it seems like the purpose of your finished_flag variable is to pause the main thread (at some point) until the thread thrd has completed.
The easiest way to do this is to use boost::thread::join
// launch the thread...
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
// ... do other things maybe ...
// wait for the thread to complete
thrd->join();
If you really want to get into the details of communication between threads via shared memory, even declaring a variable volatile won't be enough, even if the compiler does use appropriate access semantics to ensure that it won't get a stale version of data after checking the flag. The CPU can issue reads and writes out of order (x86 usually doesn't, but PPC definitely does), and there is nothing in C++9x that allows the compiler to generate code to order memory accesses appropriately.
Herb Sutter's Effective Concurrency series has an extremely in depth look at how the C++ world intersects the multicore/multiprocessor world.
Having the thread set a flag (or signal an event) before it exits is a race condition. The thread has not necessarily returned to the OS yet, and may still be executing.
For example, consider a program that loads a dynamic library (pseudocode):
lib = loadLibrary("someLibrary");
fun = getFunction("someFunction");
fun();
unloadLibrary(lib);
And let's suppose that this library uses your thread:
void someFunction() {
volatile bool finished_flag = false;
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
while(!finished_flag) { // ignore the polling loop, it's besides the point
sleep();
}
delete thrd;
}
void myclass::mymethod() {
// do stuff
finished_flag = true;
}
When myclass::mymethod() sets finished_flag to true, myclass::mymethod() hasn't returned yet. At the very least, it still has to execute a "return" instruction of some sort (if not much more: destructors, exception handler management, etc.). If the thread executing myclass::mymethod() gets pre-empted before that point, someFunction() will return to the calling program, and the calling program will unload the library. When the thread executing myclass::mymethod() gets scheduled to run again, the address containing the "return" instruction is no longer valid, and the program crashes.
The solution would be for someFunction() to call thrd->join() before returning. This would ensure that the thread has returned to the OS and is no longer executing.
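In terms of the pseudocode above, the fixed version might look like the sketch below; the worker function here is an illustrative stand-in for whatever ends up calling myclass::mymethod(), since that wiring isn't shown in the original.

#include <boost/thread.hpp>

// Illustrative stand-in for the code that runs myclass::mymethod().
void worker(volatile bool* finished_flag)
{
    // ... do stuff ...
    *finished_flag = true;
}

void someFunction()
{
    volatile bool finished_flag = false;
    boost::thread* thrd = new boost::thread(worker, &finished_flag);

    while (!finished_flag) {      // polling loop, as in the example above
        boost::this_thread::yield();
    }

    thrd->join();   // the thread has now really returned; unloading the library is safe
    delete thrd;
}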