Couldn't terminate thread (error 6) - c++

We have a huge, complex wxWidgets application written in C++. I added an extra background thread. When the user clicks "go", the thread starts. When they click "stop", the thread stops. For reasons beyond my comprehension, clicking "stop" also causes the following message to be displayed:
Can not wait for thread termination (error 6: the handle is invalid.)
Couldn't terminate thread (error 6: the handle is invalid.)
Why the hell is this happening?? And more importantly, how do I make this go away immediately?
The thread is started here:
_worker = new WorkerThread();
_worker->Create();
_worker->Run();
I know for a fact that the thread is running, because I can see the disk files it's writing.
The thread is stopped here:
if (_worker)
{
    _worker->Delete();
    _worker = NULL;
}
The WorkerThread class only overrides Entry(). It is definitely a detachable thread.
The documentation is full of dire warnings about how a detachable thread can delete itself at any moment, and everything must always be wrapped in a critical section. But my worker thread runs forever, until I tell it to stop. I can't see why I would need a critical section for anything.
Is the thread taking too long to stop? Is that the problem? (It only checks TestDestroy() once per second. Is that too slow?)
I really can't figure out how the hell to solve this.

You can "make it go away" by using wxLogNull, as with any other message generated by wxWidgets. You should not do this, however, because you seem to have a real bug somewhere in your code: the thread handle obviously should not be invalid, and if it is, something is clearly not working the way you think it is. By sweeping the error under the carpet you all but guarantee that it will reappear in a different guise at the worst possible moment, typically on a client's machine where you will be unable to debug it. It is much better to track it down now.
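The stop protocol described in the question (request a stop, let the worker notice it at its next poll, then wait for it to finish) can be sketched without wxWidgets. The following is only a portable analogy built on std::thread, not wxThread code: PollingWorker and all of its members are hypothetical names, and the flag merely plays the role that TestDestroy() plays in the question. It does not explain the invalid-handle error; it only shows an owner that keeps a thread object it can always safely wait on.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the worker in the question. It uses std::thread,
// not wxThread, so it only mirrors the "poll a stop flag" idea.
class PollingWorker {
public:
    PollingWorker() = default;
    ~PollingWorker() { stop(); }

    void start() {
        assert(!thread_.joinable());  // starting twice would be a bug
        stop_requested_ = false;
        thread_ = std::thread([this] { run(); });
    }

    // Ask the worker to stop, then wait until it has really finished.
    // Calling stop() again afterwards is harmless.
    void stop() {
        stop_requested_ = true;
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    int iterations() const { return iterations_.load(); }

private:
    void run() {
        // Plays the role of a `while (!TestDestroy())` loop.
        while (!stop_requested_) {
            ++iterations_;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> stop_requested_{false};
    std::atomic<int> iterations_{0};
    std::thread thread_;
};
```

How often the flag is polled only affects how long stop() blocks; in this sketch a slow poll cannot invalidate the thread object, because the owner never gives it away.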


Rare EXCEPTION_ACCESS_VIOLATION when debugging any process started with CREATE_SUSPENDED

While writing an x86 WinAPI-based debugger, I've encountered a rare condition where the debuggee (which usually works well) suddenly terminates with EXCEPTION_ACCESS_VIOLATION after I attach to it with my native debugger. I can reliably reproduce this with seemingly any application (I tried a .NET Hello World-style application and notepad.exe on multiple Windows 10 machines).
Essentially I've written a simple WaitForDebugEvent loop:
CreateProcessW(L"C:\\Windows\\SYSWOW64\\notepad.exe", […], CREATE_SUSPENDED, […]);
DebugActiveProcess(processId);
DEBUG_EVENT debugEvent = {};
while (WaitForDebugEvent(&debugEvent, INFINITE)) {
    switch (debugEvent.dwDebugEventCode) {
        // log all the events
    }
    ContinueDebugEvent(debugEvent.dwProcessId, debugEvent.dwThreadId, DBG_EXCEPTION_NOT_HANDLED);
}
DebugActiveProcessStop(processId);
(here's the full listing: I won't paste it all here, because there's some additional non-essential boilerplate there; the MCVE is 136 lines long)
For the sake of an example, I'll just log all the debugger events and detect whether the debuggee is ready to "proceed normally" or it will terminate due to an exception.
Most of the time, my debugging session looks like this:
CREATE_PROCESS_DEBUG_EVENT (which reports creation of both the process and its initial thread)
LOAD_DLL_DEBUG_EVENT (I was never able to get the name for this DLL, but this is documented in MSDN)
CREATE_THREAD_DEBUG_EVENT (which, I suspect, is a thread injected by debugger)
LOAD_DLL_DEBUG_EVENT […] — after this, many DLLs get loaded into the target process and everything looks okay, the process works as intended
But sometimes (in about 1.5% of all runs), the event sequence changes:
CREATE_PROCESS_DEBUG_EVENT
LOAD_DLL_DEBUG_EVENT
CREATE_THREAD_DEBUG_EVENT
EXCEPTION_DEBUG_EVENT: EXCEPTION_ACCESS_VIOLATION (which I never was able to gather details for: it reports a DEP violation, and the address is empty)
After that, I cannot proceed with debugging, because my debuggee is in exception state and will terminate soon. I was never able to catch notepad.exe crash without my debugger attached (and I doubt it is that bad and will crash for no reason), so I suspect that my debugger causes these exceptions.
One bizarre detail is that I could "fix" the situation by calling Sleep(1) immediately after WaitForDebugEvent. So, this is possibly some sort of race condition, but race condition between what? Between the debugger thread and other threads in the debuggee? Is it a thing? How are we supposed to debug other applications, then? How could actual debuggers work if it is a thing?
I couldn't reproduce the issue with the same code compiled for x64 CPU (and debugging an x64 process).
What could actually cause this erroneous behavior? I've carefully read the documentation about the API functions I call, and checked some other debugger examples online, but still wasn't able to find what's wrong with my debugger: it looks like I follow all the right conventions.
I have tried to debug my debuggee with WinDBG while it is still paused in my debugger, but had no luck. First of all, it's difficult to attach to the debuggee with another debugger (WinDBG only allows non-intrusive mode, which seems to be less functional), and the call stacks for the process's threads usually aren't meaningful.
Steps to reproduce
Check out this repository, compile with MSVC and then execute in cmd:
Debug\NetRuntimeWaiter.exe > log.txt
It is important to redirect output to the log file and not show it in the terminal: without that, timings for the log writer get changed, and the issue won't reproduce (due to a possible race condition I mentioned earlier?).
Usually the program will start and terminate 1000 notepads in about 10 seconds, and 10-15 of 1000 invocations will hold the error condition (i.e. EXCEPTION_ACCESS_VIOLATION).
DebugActiveProcess (and the undocumented DbgUiDebugActiveProcess, which DebugActiveProcess calls internally) has a serious design problem: after calling NtDebugActiveProcess, it creates a remote thread in the target process via DbgUiIssueRemoteBreakin. This new thread runs DbgUiRemoteBreakin, which calls DbgBreakPoint and then RtlExitUserThread.
None of this is documented or explained. There is only this note in the DebugActiveProcess documentation:
After all of this is done, the system resumes all threads in the
process. When the first thread in the process resumes, it executes a
breakpoint instruction that causes an EXCEPTION_DEBUG_EVENT
debugging event to be sent to the debugger.
This is wrong, of course. Why would DbgUiRemoteBreakin be the first (??) thread? Which thread resumes first is undefined. Why not state it exactly: an additional (not the first) thread is created in the process, and this thread executes the breakpoint?
When the process is already running, creating this additional thread causes no problems. But if we create the process in a suspended state and then just call DebugActiveProcess, DbgUiRemoteBreakin really does become the first thread to execute in the process, and process initialization is done on this thread instead of on the initially created thread. On XP this always caused process initialization to fail at the connect-to-csrss phase (csrss expects the connection only from the first thread created in the process). On later systems this is fixed and the process can run as usual. But it also may not, because the thread on which it was initialized has exited. This can cause subtle problems.
The solution is to call NtDebugActiveProcess instead of DebugActiveProcess.
The debug object can be created either via DbgUiConnectToDbg() and then retrieved via DbgUiGetThreadDebugObject() (the system stores the debug object in the thread's TEB), or directly by calling NtCreateDebugObject.
Also, if we create the debuggee process from another process (B), we can do the following:
1) Duplicate the debug object from the debugger process into process B.
2) Call DbgUiSetThreadDebugObject(hDbg) just before calling CreateProcessW with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS. The system will use DbgUiGetThreadDebugObject() to get the debug object from your thread and pass it to the low-level process creation API.
3) Remove the debug object from your thread via DbgUiSetThreadDebugObject(0).
It really doesn't matter who creates the process with the debug object. What matters is who handles the events posted to this debug object.
All the undocumented API definitions can be taken from ntdbg.h; then link with ntdll.lib or ntdllp.lib.

Qt: The relation between Worker thread and GUI Events

I have an ordinary GUI Thread (Main Window) and want to attach a Worker thread to it. The Worker thread will be instantiated, moved to its own thread and then fired away to run on its own independently, running a messaging routine (non-blocking).
This is where the worker is created:
void MainWindow::on_connectButton_clicked()
{
    Worker* workwork;
    workwork= new Worker();
    connect(workwork,SIGNAL(invokeTestResultsUpdate(int,quint8)),
            this,SLOT(updateTestResults(int,quint8)),Qt::QueuedConnection);
    connect(this,SIGNAL(emitInit()),workwork,SLOT(init()));
    workwork->startBC();
}
This is where the Worker starts:
void Worker::startBC()
{
    t1553 = new QThread();
    this->moveToThread(t1553);
    connect(t1553,SIGNAL(started()),this,SLOT(run1553Process()));
    t1553->start();
}
I have two problems here, regarding the event queue of the new thread:
The first and minor problem is that, while I can receive the signals from the Worker thread (namely invokeTestResultsUpdate), I cannot invoke the init method by emitting the emitInit signal from MainWindow. It just doesn't fire unless I call it directly or connect it via Qt::DirectConnection. Why is this happening? Is it because I have to start the Worker thread's own messaging loop explicitly? Or is it something else I'm not aware of? (I really fail to wrap my head around the concept of the thread/event loop/signal-slot mechanism and how they relate to each other, even though I try. I welcome any fresh perspective here too.)
The second and more obscure problem is: run1553process method does some heavy work. By heavy work, I mean a very high rate of data. There is a loop running, and I try to receive the data flowing from a device (real-time) as soon as it lands in the buffer, using mostly extern API functions. Then throw the mentioned invokeTestResultsUpdate signal towards the GUI each time it receives a message, updating the message number box. It's nothing more than that.
What I'm experiencing is weird: normally the messaging routine is mostly unhindered, but when I resize the main window, move it, or hide/show it, the Worker thread skips many messages. The resizing itself is also really slow (the window does not respond quickly). It's driving me crazy.
(Note: I have tried subclassing QThread before, it did not mitigate the problem.)
I've been reading all the "Thread Affinity" topics and tried to apply them, but the Worker still behaves as if it is somehow interrupted by the GUI thread's events at some point. I can understand MainWindow's troubles, since there are many messages in the queue to be executed (both the invoked slots and the GUI events). But I cannot see why a background thread is affected by the GUI events. I really need an extremely robust and unhindered message routine running separately in the background, firing and forgetting the signals and not giving a damn about anything.
I'm really desperate for any help right now, so any bit of information is useful for me. Please do not hesitate to throw ideas.
TL;DR: call QCoreApplication::processEvents(); periodically inside run1553Process.
Full explanation:
Signals from the main thread are put in a queue and executed once the event loop in the second thread takes control. In your implementation you call run1553Process as soon as the thread starts. Control will not go back to the event loop until that function returns or QCoreApplication::processEvents is invoked manually, so the signals will just sit there waiting for the event loop to pick them up.
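The effect described here can be imitated without Qt. The sketch below is a single-threaded toy model, not Qt code: ToyEventLoop, post, processEvents and runLongLoop are hypothetical names, with processEvents only loosely standing in for QCoreApplication::processEvents. It shows a queued call sitting in the queue for the whole duration of a long loop unless the loop pumps the queue itself.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy event loop: queued "slot calls" wait here until someone pumps the queue.
class ToyEventLoop {
public:
    void post(std::function<void()> task) { pending_.push(std::move(task)); }

    // Loose analogue of processEvents(): run everything that is queued right now.
    void processEvents() {
        while (!pending_.empty()) {
            std::function<void()> task = std::move(pending_.front());
            pending_.pop();
            task();
        }
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};

// Stand-in for run1553Process(): a long loop that optionally pumps the queue.
// `log` must outlive `loop`, because the queued task writes to it.
void runLongLoop(ToyEventLoop& loop, std::vector<std::string>& log,
                 int iterations, bool pump) {
    assert(iterations >= 0);
    for (int i = 0; i < iterations; ++i) {
        if (i == 1) {
            // Simulates the GUI thread emitting emitInit() while the loop is busy.
            loop.post([&log] { log.push_back("init"); });
        }
        log.push_back("message " + std::to_string(i));
        if (pump) {
            loop.processEvents();  // give queued calls a chance to run
        }
    }
}
```

With pump set to false, the "init" entry is only appended once something calls processEvents() after the loop has returned; with pump set to true, it is handled at the end of the iteration in which it was posted.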
P.S.
you are leaking both the worker and the thread in the code above
P.P.S.
Data streams from devices normally provide an asynchronous API instead of making you poll them indefinitely.
I finally found the problem.
The crucial mistake was connecting the QThread's built-in started() signal to the run1553Process() slot. I had thought of this as replacing run() with this method, and expected everything to be fine. But this caused the actual run() method to get blocked, preventing the event loop from starting.
As stated in qthread.cpp:
void QThread::run()
{
    (void) exec();
}
To fix this, I left the original started() signal alone and connected a separate signal to my run1553Process() instead. I first started the thread normally, allowed the event loop to start, and then fired my other signals. That did it; now my Worker can receive all the messages.
I think now I understand the relation between threads and events better.
By the way, this solution did not take care of the message skipping problem entirely, but I feel that's caused by another factor (like my message reading implementation).
Thanks everyone for the ideas. I hope the solution helps some other poor guy like me.

How does the message loop use threads?

I'm somewhat confused and wondering if I've been misinformed. In a separate post I was told "New threads are only created when you make them explicitly. C++ programs are by default single threaded." When I open my program, which doesn't explicitly create new threads, in OllyDbg, I often see 2 threads running. I wanted to understand how the message loop works without blocking execution; the explanation I got did not really explain how it works.
Does the message loop create a new thread or does it take up the main thread? If it takes the main thread does it do so after everything else has been executed regardless of code order? If it doesn't do this but still takes up the main thread does it spawn a new thread so that the program can execute instead of getting stuck in the message loop?
EDIT: Solved most of my questions with experimentation. The message loop occupies the main thread and any code after the code:
while (GetMessage (&messages, NULL, 0, 0))
{
    TranslateMessage(&messages);
    DispatchMessage(&messages);
}
return messages.wParam;
Will not execute unless something special is done to cause it to execute because the program is stuck in the message loop. Putting an infinite loop in a window procedure that gets executed causes the program to crash. I still don't understand the mystery of the multiple threads when in olly to the degree I would prefer though.
Perhaps the place to start is to realize that "the message loop" isn't a thing as such; it's really just something that a thread does.
Threads in windows generally fall into one of two categories: those that own UI, and those that do background work (eg network operations).
A simple UI app typically has just one thread, which is a UI thread. For the UI to work, the thread needs to wait until there's some input to handle (mouse click, keyboard input, etc), handle the input (eg. update the state and redraw the window), and then go back to waiting for more input. This whole act of "wait for input, process it, repeat" is the message loop. (Also worth mentioning at this stage is the message queue: each thread has its own input queue which stores up the input messages for a thread; and the act of a thread "waiting for input" is really about checking if there's anything in the queue, and if not, waiting till there is.) In win32 speak, if a thread is actively processing input this way, it's also said to be "pumping messages".
A typical simple windows app's mainline code will first do basic initialization, create the main window, and then do the wait-for-input-and-process-it message loop. It does this usually until the user closes the main window, at which point the thread exits the loop, and carries on executing the code that comes afterwards, which is usually cleanup code.
A common architecture in windows apps is to have a main UI thread - usually this is the main thread - and it creates and owns all the UI, and has a message loop that dispatches messages for all of the UI that the thread created. If an app needs to do something that could potentially block, such as reading from a socket, a worker thread is often used for that purpose: you don't want the UI thread to block (eg. while waiting for input from a socket), as it wouldn't be processing input during that time and the UI would end up being unresponsive.
You could write an app that had more than one UI thread in it - and each thread that creates windows would then need its own message loop - but it's a fairly advanced technique and not all that useful for most basic apps.
The other threads you are seeing are likely some sort of helper threads that are created by Windows to do background tasks; and for the most part, you can ignore them. If you initialize COM, for example, windows may end up creating a worker thread to manage some COM internal stuff, and it may also create some invisible HWNDs too.
Typically the thread that starts the program only runs the message loop, taking up the main thread. Anything not part of handling messages or updating the UI is typically done by other threads. The additional thread that you see even if your application doesn't create any threads could be created by a library or the operating system. Windows will create threads inside your process to handle things like dispatching events to your message loop.

C++ - Totally suspend windows application

I am developing a simple WinAPI application and started from writing my own assertion system.
I have a macro defined like ASSERT(X), which does pretty much the same thing as assert(X), but with more information, more options, etc.
At some moment (when that assertion system was already running and working) I realized there is a problem.
Suppose I wrote code that performs some action on a timer, and (just a simple example) this action is done while handling the WM_TIMER message. Now suppose something changes so that this code starts triggering an assert. The assert message would be shown every TIMER_RESOLUTION milliseconds and would simply flood the screen.
Options for solving this situation could be:
1) Totally pause application running (probably also, suspend all threads) when the assertion messagebox is shown and continue running after it is closed
2) Make a static counter for the shown asserts and don't show asserts when one of them is already showing (but this doesn't pause application)
3) Group similar asserts and show only one for each assert type (but this also doesn't pause application)
4) Modify the application code (for example, Get / Translate / Dispatch message loop) so that it suspends itself when there are any asserts. This is good, but not universal and looks like a hack.
To my mind, option number 1 is the best. But I don't know how this can be achieved. What I'm looking for is a way to pause the runtime (something similar to the Pause button in the debugger). Does somebody know how to achieve this?
Also, if somebody knows an efficient way to handle this problem - I would appreciate your help. Thank you.
It is important to understand how Windows UI programs work, to answer this question.
At the core of the Windows UI programming model is of course "the message queue". Messages arrive in message queues and are retrieved using message pumps. A message pump is not special. It's merely a loop that retrieves one message at a time, blocking the thread if none are available.
Now why are you getting all these dialogs? Dialog boxes, including MessageBox also have a message pump. As such, they will retrieve messages from the message queue (It doesn't matter much who is pumping messages, in the Windows model). This allows paints, mouse movement and keyboard input to work. It will also trigger additional timers and therefore dialog boxes.
So, the canonical Windows approach is to handle each message whenever it arrives. They are a fact of life and you deal with them.
In your situation, I would consider a slight variation. You really want to save the state of your stack at the point where the assert happened. That's a particularity of asserts that deserves to be respected. Therefore, spin off a thread for your dialog, and create it without a parent HWND. This gives the dialog an isolated message queue, independent of the original window. Since there's also a new thread for it, you can suspend the original thread, the one where WM_TIMER arrives.
Don't show a prompt - either log to a file/debug output, or just forcibly break the debugger (usually platform specific, eg. Microsoft's __debugbreak()). You have to do something more passive than show a dialog if there are threads involved which could fire lots of failures.
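A passive assertion of the kind suggested here might look like the sketch below. Everything in it is an assumption made for illustration: the assert_sketch namespace, setSink, reportFailure, failureCount and the SKETCH_ASSERT macro are hypothetical names, and the sink defaults to std::cerr rather than a log file or the debugger output. The point is only that a failure is recorded and execution continues, so a timer firing repeatedly cannot stack up dialogs.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <mutex>
#include <ostream>
#include <sstream>  // not needed by the sketch itself; handy for capturing output
#include <string>

namespace assert_sketch {

inline std::mutex& sinkMutex() {
    static std::mutex m;
    return m;
}

inline std::ostream*& sinkPtr() {
    static std::ostream* sink = &std::cerr;  // default destination (an assumption)
    return sink;
}

inline std::atomic<int>& failureCount() {
    static std::atomic<int> count{0};
    return count;
}

// Redirect failure reports, e.g. to a log file stream.
inline void setSink(std::ostream* sink) {
    assert(sink != nullptr);
    std::lock_guard<std::mutex> lock(sinkMutex());
    sinkPtr() = sink;
}

// Records the failure and returns immediately: no dialog, no message pump.
inline void reportFailure(const char* expr, const char* file, int line) {
    ++failureCount();
    std::lock_guard<std::mutex> lock(sinkMutex());
    *sinkPtr() << file << ':' << line << ": assertion failed: " << expr << '\n';
}

}  // namespace assert_sketch

#define SKETCH_ASSERT(cond)                                                \
    do {                                                                   \
        if (!(cond)) {                                                     \
            assert_sketch::reportFailure(#cond, __FILE__, __LINE__);       \
        }                                                                  \
    } while (false)
```

Because reportFailure() takes a lock and never pumps messages, it can be called from a timer handler or a worker thread without introducing a re-entrancy point; breaking into the debugger could be added at the same place.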
Create a worker thread for your debugging code. When an assert happens, send a message to the worker thread. The worker thread would call SuspendThread on each thread in the process (except itself) to stop it, and then display a message box.
To get the threads in a process - create a dll and monitor the DllMain for Thread Attach (and Detach) - each call will be done in the context of a thread being created (or destroyed) so you can get the current thread id and create a handle to use with SuspendThread.
Or, the toolhelp debug api will help you find out the threads to pause.
The reason I prefer this approach is that I don't like asserts that cause side effects. Too often I've had asserts fire from asynchronous socket processing code or window message processing code. The assert message box is then created on that thread, which can corrupt the state of the thread through a totally unexpected re-entrancy point. MessageBox also discards any messages posted to the thread, so it messes up any worker threads that use thread message queues to queue jobs.
My own ASSERT implementation calls DebugBreak() or as alternative INT 3 (__asm int 3 in MS VC++). An ASSERT should break on the debugger.
Use the MessageBox function. This will block until the user clicks "ok". After this is done, you could choose to discard extra assertion failure messages or still display them as your choice.

Shutting down multithreaded NSDocument

I have an NSDocument-based Cocoa app and I have a couple of secondary threads that I need to terminate gracefully (wait for them to run through the current loop) when the user closes the document window or when the application quits. I'm using canCloseDocumentWithDelegate to send a flag to the threads when the document is closing and then when they're done, one of them calls [NSDocument close]. This seems to work peachy keen when the user closes the document window, but when you quit the app, it goes all kinds of wrong (crashes before it calls anything). What is the correct procedure for something like this?
The best possible way is for the threads to own the objects necessary for the thread to finish doing whatever it is doing to the point of being able to abort processing and terminate as quickly as possible.
Under non-GC, this means a -retain that the thread -releases when done. For GC, it is just a hard reference to the object(s) desired.
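The ownership idea can be illustrated outside Cocoa. The following is a C++ analogy rather than Objective-C: std::shared_ptr stands in for -retain/-release, and DocumentData and startWorker are hypothetical names. The worker holds its own reference to the data it needs, so the data stays alive until the worker has noticed the closing flag and returned, even if the document side has already dropped its reference.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical state shared between a document and its worker thread.
struct DocumentData {
    std::atomic<bool> closing{false};       // set when the document is closing
    std::atomic<int> chunksProcessed{0};
};

// The thread takes its own shared_ptr copy (the "retain"). The copy is
// destroyed when the thread function finishes (the "release").
std::thread startWorker(std::shared_ptr<DocumentData> data) {
    assert(data != nullptr);
    return std::thread([data = std::move(data)] {
        while (!data->closing) {
            ++data->chunksProcessed;   // one pass of the work loop
            std::this_thread::yield();
        }
    });
}
```

A caller would typically set closing, join the thread, and only then continue tearing down whatever else the worker might have touched.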
If there is some kind of lengthy processing that must go on and must complete before the document is closed, then drop a sheet with a progress bar and leave the document modal until done (both Aperture and iPhoto do exactly this).