How to detect why a thread exits on Windows? - c++

I have a C++ process on Windows 2008 R2 with several threads in it. During the process's startup, there is a chance that one of the threads will exit. I haven't found a way to detect what happens; any suggestions?
Based on my investigation, the thread just exits without an exception. Access to a null pointer can cause a similar issue, but I didn't find such a place in the process. In fact, it would be better if the process just crashed, because then I could get a dump file; but nothing happens, just one thread exits.
I tried the User Mode Process Dumper tool, but it does not work on the Windows version that this process runs on.
I tried Process Monitor to check the thread exit event, but Process Monitor throws an exception when I try to reproduce the issue by starting the process again and again.
Thanks in advance.

Found the root cause at last: a string is accessed by more than one thread, and one thread just exits. String is not thread safe.
Process Monitor helped to get the thread exit call stack on a powerful host, which made the root cause clear.
Thanks all for your suggestions.
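For readers hitting the same root cause: a std::string that one thread modifies while another reads or modifies it is a data race, and the behavior is undefined. A minimal sketch of one common remedy follows, assuming the shared string can be hidden behind a small wrapper. The names SharedText and appendFromManyThreads are hypothetical and not from the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical wrapper: every access to the shared string takes the same mutex.
class SharedText {
public:
    void append(const std::string& piece) {
        std::lock_guard<std::mutex> lock(mutex_);
        text_ += piece;
    }

    // Returns a copy that is made while the lock is held.
    std::string snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return text_;
    }

private:
    mutable std::mutex mutex_;
    std::string text_;
};

// Starts `threads` writers that each append `perThread` characters,
// joins them, and returns the final length of the shared string.
std::size_t appendFromManyThreads(int threads, int perThread) {
    assert(threads > 0 && perThread >= 0);
    SharedText shared;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&shared, perThread] {
            for (int i = 0; i < perThread; ++i) shared.append("x");
        });
    }
    for (auto& w : workers) w.join();
    return shared.snapshot().size();
}
```

With the lock in place, no append is lost and no reader sees a half-updated buffer. Whether a mutex, a per-thread copy, or message passing fits best depends on how the string is used in the real program.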

Related

Rare EXCEPTION_ACCESS_VIOLATION when debugging any process started with CREATE_SUSPENDED

While writing an x86 WinAPI-based debugger, I've encountered a rare condition where the debuggee (which usually works well) suddenly terminates with EXCEPTION_ACCESS_VIOLATION after I attach to it with my native debugger. It seems I can reliably reproduce this with any application (I tried a .NET Hello World-style application and notepad.exe on multiple Windows 10 machines).
Essentially I've written a simple WaitForDebugEvent loop:
CreateProcessW(L"C:\\Windows\\SYSWOW64\\notepad.exe", […], CREATE_SUSPENDED, […]);
DebugActiveProcess(processId);
DEBUG_EVENT debugEvent = {};
while (WaitForDebugEvent(&debugEvent, INFINITE)) {
    switch (debugEvent.dwDebugEventCode) {
        // log all the events
    }
    ContinueDebugEvent(debugEvent.dwProcessId, debugEvent.dwThreadId, DBG_EXCEPTION_NOT_HANDLED);
}
DebugActiveProcessStop(processId);
(here's the full listing: I won't paste it all here, because there's some additional non-essential boilerplate there; the MCVE is 136 lines long)
For the sake of an example, I'll just log all the debugger events and detect whether the debuggee is ready to "proceed normally" or it will terminate due to an exception.
Most of the time, my debugging session looks like this:
CREATE_PROCESS_DEBUG_EVENT (which reports creation of both the process and its initial thread)
LOAD_DLL_DEBUG_EVENT (I was never able to get the name for this DLL, but this is documented in MSDN)
CREATE_THREAD_DEBUG_EVENT (which, I suspect, is a thread injected by debugger)
LOAD_DLL_DEBUG_EVENT […] — after this, many DLLs get loaded into the target process and everything looks okay, the process works as intended
But sometimes (in about 1.5% of all runs), the event sequence changes:
CREATE_PROCESS_DEBUG_EVENT
LOAD_DLL_DEBUG_EVENT
CREATE_THREAD_DEBUG_EVENT
EXCEPTION_DEBUG_EVENT: EXCEPTION_ACCESS_VIOLATION (which I never was able to gather details for: it reports a DEP violation, and the address is empty)
After that, I cannot proceed with debugging, because my debuggee is in exception state and will terminate soon. I was never able to catch notepad.exe crash without my debugger attached (and I doubt it is that bad and will crash for no reason), so I suspect that my debugger causes these exceptions.
One bizarre detail is that I could "fix" the situation by calling Sleep(1) immediately after WaitForDebugEvent. So, this is possibly some sort of race condition, but race condition between what? Between the debugger thread and other threads in the debuggee? Is it a thing? How are we supposed to debug other applications, then? How could actual debuggers work if it is a thing?
I couldn't reproduce the issue with the same code compiled for x64 CPU (and debugging an x64 process).
What could actually cause this erroneous behavior? I've carefully read the documentation about the API functions I call, and checked some other debugger examples online, but still wasn't able to find what's wrong with my debugger: it looks like I follow all the right conventions.
I have tried to debug my debuggee with WinDBG while it is still paused in my debugger, but had no luck. First of all, it's difficult to attach to the debuggee with another debugger (WinDBG only allows non-intrusive mode, which seems to be less functional), and the call stacks for the process's threads usually aren't meaningful.
Steps to reproduce
Check out this repository, compile with MSVC and then execute in cmd:
Debug\NetRuntimeWaiter.exe > log.txt
It is important to redirect output to the log file and not show it in the terminal: without that, timings for the log writer get changed, and the issue won't reproduce (due to a possible race condition I mentioned earlier?).
Usually the program will start and terminate 1000 notepads in about 10 seconds, and 10-15 of 1000 invocations will hit the error condition (i.e. EXCEPTION_ACCESS_VIOLATION).
DebugActiveProcess (and the undocumented DbgUiDebugActiveProcess, which DebugActiveProcess calls internally) has a serious design problem: after calling NtDebugActiveProcess, it creates a remote thread in the target process via a DbgUiIssueRemoteBreakin call. As a result, a new thread, DbgUiRemoteBreakin, is created in the target process; this thread calls DbgBreakPoint and then RtlExitUserThread.
None of this is documented or explained; there is only this note in the DebugActiveProcess documentation:
After all of this is done, the system resumes all threads in the
process. When the first thread in the process resumes, it executes a
breakpoint instruction that causes an EXCEPTION_DEBUG_EVENT
debugging event to be sent to the debugger.
Of course, this is wrong. Why would DbgUiRemoteBreakin be the first (??) thread? Which thread resumes first is undefined. Why not write exactly what happens: an additional (but not first) thread is created in the process, and this thread executes the breakpoint?
However, when the process is already running, creating this additional thread causes no problems. But if we create the process in a suspended state and then just call DebugActiveProcess, DbgUiRemoteBreakin really does become the first thread to execute in the process, and process initialization is done on this thread instead of on the first created thread. On XP this always made process initialization fail at the connect-to-csrss phase (csrss waits for the connection only on the first thread created in the process). On later systems this is fixed and the process can execute as usual. But it also may not, because the thread on which it was initialized has exited. This can cause subtle problems.
The solution here is to use NtDebugActiveProcess in place of DebugActiveProcess.
The debug object can be created either via DbgUiConnectToDbg() and then retrieved via DbgUiGetThreadDebugObject() (the system stores the debug object in the thread's TEB), or directly by calling NtCreateDebugObject.
Also, if we create the debuggee process from another process (B), we can do the following:
duplicate the debug object from the debugger process into process B;
call DbgUiSetThreadDebugObject(hDdg) just before calling CreateProcessW with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS; the system will use DbgUiGetThreadDebugObject() to get the debug object from your thread and pass it to the low-level process creation API;
remove the debug object from your thread via DbgUiSetThreadDebugObject(0).
It really does not matter who creates the process with the debug object; what matters is who handles the events posted to this debug object.
All the undocumented API definitions can be taken from ntdbg.h; then link with ntdll.lib or ntdllp.lib.

Couldn't terminate thread (error 6)

We have a huge, complex wxWidgets application written in C++. I added an extra background thread. When the user clicks "go", the thread starts. When they click "stop", the thread stops. For reasons beyond my comprehension, clicking "stop" also causes the following message to be displayed:
Can not wait for thread termination (error 6: the handle is invalid.)
Couldn't terminate thread (error 6: the handle is invalid.)
Why the hell is this happening?? And more importantly, how do I make this go away immediately?
The thread is started here:
_worker = new WorkerThread();
_worker->Create();
_worker->Run();
I know for a fact that the thread is running, because I can see the disk files it's writing.
The thread is stopped here:
if (_worker)
{
    _worker->Delete();
    _worker = NULL;
}
The WorkerThread class only overrides Entry(). It is definitely a detachable thread.
The documentation is full of dire warnings about how a detachable thread can delete itself at any moment, and everything must always be wrapped in a critical section. But my worker thread runs forever, until I tell it to stop. I can't see why I would need a critical section for anything.
Is the thread taking too long to stop? Is that the problem? (It only checks TestDestroy() once per second. Is that too slow?)
I really can't figure out how the hell to solve this.
You may "make it go away" by using wxLogNull, as with any other messages generated by wxWidgets. You should not do this, however, as you seem to have a real bug somewhere in your code: the thread handle obviously should not be invalid, and if it is, something clearly isn't going the way you think it is. By sweeping the error under the carpet you all but guarantee that it will reappear in a different guise at the worst possible moment, typically on a client's machine where you will be unable to debug it. Better to really fix it now.

Using C++, how to detect a process right before it is terminated

I'm using Visual C++.
I'm trying to monitor another process.
Is there a way to detect when the process is terminated? I mean that right before it's terminated, the program can raise an event. After that event, the process will be terminated.
I want my code to run before the process is terminated.
The reason I want to do that is that I use WMI to detect the process start. But some processes end too quickly: my code has not run yet, but the process has already ended.
You would use the DebugActiveProcess function, and then use a loop which starts with WaitForDebugEvent - when the process exits, you get an EXIT_PROCESS_DEBUG_EVENT.
You will probably get a bunch of other debug events [it depends on when you attach to the process and what the process does after that point]. For those, you will just issue a call to ContinueDebugEvent - if it was an exception, DBG_EXCEPTION_NOT_HANDLED should be used, otherwise, DBG_CONTINUE.
Once you see your EXIT_PROCESS_DEBUG_EVENT, you do your thing, then issue DBG_CONTINUE. You will also need to handle LOAD_DLL_DEBUG_EVENT by closing the handle given, or you'll leak handles.
I haven't used DebugActiveProcess in exactly this manner, but I believe this will work.
See these functions for more details:
Windows Debugging Functions

Is it possible to detect 'end process' externally?

Is there some way to detect that a program was ended by windows task manager's "end process"?
I know that it's kinda impossible to do that from within the application being ended (other than to build your app as a driver and hook ZwTerminateProcess), but I wonder if there is a way to notice it from outside.
I don't want to stop the program from terminating, just to know that it was ended by "end process" (and not by any other way).
There might be a better way - but how about using a simple flag?
Naturally, you'd have to persist this flag somewhere outside of the process/program's memory, like the registry, a database, or the file system. Essentially, when the app starts up, you set the flag to 'True'; when the app shuts down through the normal means, you set the flag to 'False'.
Each time the application starts you can check the flag to see if it was not shut down correctly the previous time it was executed.
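A minimal sketch of this flag idea, assuming the flag is kept as a marker file: the file being present plays the role of 'True', and deleting it plays the role of 'False'. The function names and the marker path are hypothetical. Note that this only tells you the previous run did not shut down normally; it cannot tell a Task Manager kill from a crash or a power loss.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// True if the marker from a previous run is still there, i.e. that run
// never reached markCleanShutdown(). Check this before markRunning().
bool previousRunEndedAbnormally(const std::string& markerPath) {
    std::ifstream in(markerPath);
    return in.good();
}

// Call once at startup: creates (or truncates) the marker file.
// Returns false if the marker could not be written.
bool markRunning(const std::string& markerPath) {
    assert(!markerPath.empty());
    std::ofstream out(markerPath, std::ios::trunc);
    out << "running\n";
    return out.good();
}

// Call on every normal exit path: removes the marker file.
// A missing file is not treated as an error.
void markCleanShutdown(const std::string& markerPath) {
    std::remove(markerPath.c_str());
}
```

At startup the application would call previousRunEndedAbnormally() first, then markRunning(); the normal shutdown code calls markCleanShutdown() last.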
Open up a handle to the process with OpenProcess, and then wait on that handle using one of the wait functions such as WaitForSingleObject. You can get the exit status of the process using GetExitCodeProcess. If you need your program to remain responsive to user input while waiting, then make sure to wait on a separate thread (or you can periodically poll using a timeout of zero, but remember the performance consequences of polling -- not recommended).
When you're done, don't forget to call CloseHandle. The process object won't be fully deleted from the OS until all of its handles are closed, so you'll leak resources if you forget to call CloseHandle.
Note that there's no way to distinguish between a process exiting normally or being terminated forcefully. Even if you have a convention that your program only ever exits with a status of 0 (success) or 1 (failure) normally, some other process could call TerminateProcess(YourProcess, 1), and that would be indistinguishable from your ordinary failure mode.
According to the documentation, ExitProcess calls the entry point of all loaded DLLs with DLL_PROCESS_DETACH, whereas TerminateProcess does not. (Exiting the main function results in a call to ExitProcess, as do most unhandled exceptions.)
You might also want to look into Application Recovery and Restart.
One option might be to create a "watchdog" application (installed as a service, perhaps) that monitors WMI events for stopping a process via the ManagementEventWatcher class (in the System.Management namespace).
You could query for the death of your process on an interval or come up with some event driven way to alert of your process's demise.
Here's sort of an example (it's in C# though) that could get you started.

How to check from its pid whether a process is running, segfaulted, or terminated on Linux, from my main() in C++

I am starting several processes from my main and I can get the pid of each. Now I want to wait until all of these processes have finished and then clear the shared memory block from my parent process. Also, if any of the processes has not finished but has segfaulted, I want to kill it. So how can I check, from the pids in my parent process code, whether a process finished without any error or broke down because of a runtime error or some other cause, so that I can kill that process?
Also, what if I want to see the status of some other process which is not a child process but whose pid is known?
Code is appreciated (I am not looking for a script, but code).
Look into waitpid(2) with WNOHANG option. Check the "fate" of the process with macros in the manual page, especially WIFSIGNALED().
Also, a segfaulted process is already dead (unless SIGSEGV is specifically handled by the process, which is usually not a good idea).
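Since code was asked for, here is a minimal sketch of the waitpid(2) approach with WNOHANG, assuming the pid belongs to a direct child of the calling process. ChildFate, pollChild, waitForChild and diedFromSegfault are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// How a child ended, decoded from the waitpid() status word.
struct ChildFate {
    enum Kind { StillRunning, Exited, Signaled, Unknown } kind;
    int detail;  // exit code for Exited, signal number for Signaled, else 0
};

// Non-blocking check of one child. Reports StillRunning while it is alive.
ChildFate pollChild(pid_t child) {
    int status = 0;
    pid_t r = waitpid(child, &status, WNOHANG);
    if (r == 0) return {ChildFate::StillRunning, 0};
    if (r == child) {
        if (WIFEXITED(status)) return {ChildFate::Exited, WEXITSTATUS(status)};
        if (WIFSIGNALED(status)) return {ChildFate::Signaled, WTERMSIG(status)};
    }
    // waitpid failed (for example, not our child) or an unexpected status.
    return {ChildFate::Unknown, 0};
}

// Polls until the child is gone. A real program would do useful work
// between polls instead of only sleeping.
ChildFate waitForChild(pid_t child) {
    assert(child > 0);
    for (;;) {
        ChildFate fate = pollChild(child);
        if (fate.kind != ChildFate::StillRunning) return fate;
        usleep(1000);  // 1 ms
    }
}

// True if the child was killed by SIGSEGV.
bool diedFromSegfault(const ChildFate& fate) {
    return fate.kind == ChildFate::Signaled && fate.detail == SIGSEGV;
}
```

Once waitpid() has reported a child as exited or signaled, that child has been reaped and its pid must not be waited on again.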
From your updates, it looks like you also want to check on other processes, which are not children of your current process.
You can look at /proc/{pid}/status to get an overview of what a process is currently doing. It's going to be one of:
Running
Stopped
Sleeping
Disk (D) sleep (I/O bound, uninterruptible)
Zombie
However, once a process dies (fully, unless zombied) so does its entry in /proc. There's no way to tell if it exited successfully, segfaulted, caught a signal that could not be handled, or failed to handle a signal that could be handled. Not unless its parent logs that information somewhere.
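A minimal sketch of reading that state from code, assuming a Linux system with /proc mounted and the usual "State:" line in the status file. The function name processState is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Returns the one-letter state code from the "State:" line of
// /proc/<pid>/status (for example 'R', 'S', 'D', 'T' or 'Z').
// Returns '\0' if the entry cannot be read, which usually means
// that no such process exists any more.
char processState(pid_t pid) {
    assert(pid > 0);
    std::ifstream status("/proc/" + std::to_string(pid) + "/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "State:") == 0) {
            // Skip the whitespace between the label and the state letter.
            std::size_t pos = line.find_first_not_of(" \t", 6);
            return pos == std::string::npos ? '\0' : line[pos];
        }
    }
    return '\0';
}
```

Keep in mind that pids are reused, so a state read for a pid you did not start may belong to a different process than the one you had in mind.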
It sounds like you're writing a watchdog for other processes that you did not start, rather than keeping track of child processes.
If a program segfaults, you won't need to kill it. It's dead already.
Use the wait and waitpid calls to wait for children to finish and check the status for some idea of how they exited. See here for details on how to use these functions. Note especially the WIFSIGNALED and WTERMSIG macros.
Call waitpid() from a SIGCHLD handler to catch the moment when the application terminates. Note that if you start multiple processes you have to loop on waitpid() with WNOHANG until it returns 0.
Use kill() with signal 0 to check whether the process is still running. IIRC zombies still qualify as processes, so you need a proper SIGCHLD handler for that to work.
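Both points can be sketched in a few lines. This is an illustration under assumptions, not a complete watchdog: reapFinishedChildren and processExists are hypothetical names, and the reaping loop is written as an ordinary function that you would call once a SIGCHLD has been observed.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Reaps every child that has already terminated and returns how many
// were reaped. waitpid(-1, ..., WNOHANG) returns > 0 for a reaped child,
// 0 if children exist but none has finished, and -1 if there are no
// children left (or on error).
int reapFinishedChildren() {
    int reaped = 0;
    int status = 0;
    while (waitpid(-1, &status, WNOHANG) > 0) ++reaped;
    return reaped;
}

// True if a process with this pid currently exists. Signal 0 performs
// only the existence and permission checks; nothing is delivered.
// An unreaped zombie still counts as existing.
bool processExists(pid_t pid) {
    assert(pid > 0);  // 0 and negative values address process groups
    if (kill(pid, 0) == 0) return true;
    return errno == EPERM;  // exists but we may not signal it; ESRCH means gone
}
```

This also shows why the reaping matters: until the parent has collected a finished child with waitpid(), processExists() keeps returning true for its pid.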