Rare EXCEPTION_ACCESS_VIOLATION when debugging any process started with CREATE_SUSPENDED - c++

While writing an x86 WinAPI-based debugger, I've encountered a rare condition in which the debuggee (which usually works well) suddenly terminates with EXCEPTION_ACCESS_VIOLATION after I attach to it with my native debugger. I can reliably reproduce this with seemingly any application (I tried a .NET Hello World-style application and notepad.exe on multiple Windows 10 machines).
Essentially I've written a simple WaitForDebugEvent loop:
CreateProcessW(L"C:\\Windows\\SYSWOW64\\notepad.exe", […], CREATE_SUSPENDED, […]);
DebugActiveProcess(processId);
DEBUG_EVENT debugEvent = {};
while (WaitForDebugEvent(&debugEvent, INFINITE)) {
    switch (debugEvent.dwDebugEventCode) {
        // log all the events
    }
    ContinueDebugEvent(debugEvent.dwProcessId, debugEvent.dwThreadId, DBG_EXCEPTION_NOT_HANDLED);
}
DebugActiveProcessStop(processId);
(here's the full listing: I won't paste it all here, because there's some additional non-essential boilerplate there; the MCVE is 136 lines long)
For the sake of an example, I'll just log all the debugger events and detect whether the debuggee is ready to "proceed normally" or it will terminate due to an exception.
Most of the time, my debugging session looks like this:
CREATE_PROCESS_DEBUG_EVENT (which reports creation of both the process and its initial thread)
LOAD_DLL_DEBUG_EVENT (I was never able to get the name for this DLL, but this is documented in MSDN)
CREATE_THREAD_DEBUG_EVENT (which, I suspect, is a thread injected by debugger)
LOAD_DLL_DEBUG_EVENT […] — after this, many DLLs get loaded into the target process and everything looks okay, the process works as intended
But sometimes (in about 1.5% of all runs), the event sequence changes:
CREATE_PROCESS_DEBUG_EVENT
LOAD_DLL_DEBUG_EVENT
CREATE_THREAD_DEBUG_EVENT
EXCEPTION_DEBUG_EVENT: EXCEPTION_ACCESS_VIOLATION (which I never was able to gather details for: it reports a DEP violation, and the address is empty)
After that, I cannot proceed with debugging, because my debuggee is in exception state and will terminate soon. I was never able to catch notepad.exe crash without my debugger attached (and I doubt it is that bad and will crash for no reason), so I suspect that my debugger causes these exceptions.
One bizarre detail is that I could "fix" the situation by calling Sleep(1) immediately after WaitForDebugEvent. So, this is possibly some sort of race condition, but race condition between what? Between the debugger thread and other threads in the debuggee? Is it a thing? How are we supposed to debug other applications, then? How could actual debuggers work if it is a thing?
I couldn't reproduce the issue with the same code compiled for x64 CPU (and debugging an x64 process).
What could actually cause this erroneous behavior? I've carefully read the documentation about the API functions I call, and checked some other debugger examples online, but still wasn't able to find what's wrong with my debugger: it looks like I follow all the right conventions.
I have tried to debug my debuggee with WinDBG while it is still paused in my debugger, but had no luck doing that. First of all, it's difficult to attach to the debuggee with another debugger (WinDBG only allows non-intrusive mode, which seems to be less functional), and the call stacks for the process's threads usually aren't meaningful.
Steps to reproduce
Check out this repository, compile it with MSVC, and then execute in cmd:
Debug\NetRuntimeWaiter.exe > log.txt
It is important to redirect output to the log file and not show it in the terminal: without that, timings for the log writer get changed, and the issue won't reproduce (due to a possible race condition I mentioned earlier?).
Usually the program will start and terminate 1000 notepads in about 10 seconds, and 10-15 of the 1000 invocations will hit the error condition (i.e. EXCEPTION_ACCESS_VIOLATION).

DebugActiveProcess (and the undocumented DbgUiDebugActiveProcess, which DebugActiveProcess calls internally) has a serious design problem: after calling NtDebugActiveProcess, it creates a remote thread in the target process via a DbgUiIssueRemoteBreakin call. As a result, a new thread running DbgUiRemoteBreakin is created in the target process; this thread calls DbgBreakPoint and then RtlExitUserThread.
None of this is documented or explained; there is only this note in the DebugActiveProcess documentation:
After all of this is done, the system resumes all threads in the
process. When the first thread in the process resumes, it executes a
breakpoint instruction that causes an EXCEPTION_DEBUG_EVENT
debugging event to be sent to the debugger.
Of course, this is wrong. Why would DbgUiRemoteBreakin be the first (??) thread? Which thread resumes first is undefined. Why not state exactly what happens: an additional (not the first) thread is created in the process, and that thread executes the breakpoint?
When the process is already running, creating this additional thread causes no problems. But if we create the process in a suspended state and then just call DebugActiveProcess, DbgUiRemoteBreakin really does become the first thread to execute in the process, and process initialization is done on this thread instead of on the originally created first thread. On XP this always made process initialization fail at the connect-to-csrss phase (csrss expects the connection only from the first thread created in the process). On later systems this is fixed and the process can run as usual. But it also may not, because the thread on which the process was initialized has exited, and that can cause subtle problems.
The solution is to call NtDebugActiveProcess in place of DebugActiveProcess.
The debug object can be created either via DbgUiConnectToDbg() and then retrieved via DbgUiGetThreadDebugObject() (the system stores the debug object in the thread's TEB), or directly by calling NtCreateDebugObject.
Also, if the debuggee process is created from another process (B), we can do the following:
duplicate the debug object from the debugger process into process B
call DbgUiSetThreadDebugObject(hDdg) just before calling CreateProcessW with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS; the system will use DbgUiGetThreadDebugObject() to get the debug object from your thread and pass it to the low-level process creation API
remove the debug object from your thread via DbgUiSetThreadDebugObject(0)
It really does not matter who creates the process with the debug object; what matters is who handles the events posted to that debug object.
All the undocumented API definitions can be taken from ntdbg.h; then link with ntdll.lib or ntdllp.lib.

Related

How to detect why thread exit on windows?

I have a C++ process on Windows 2008 R2 with several threads in it. During the process's startup, there is a chance that one of the threads will exit. I haven't found a way to detect what happens; any suggestions?
Based on my investigation, the thread just exits without an exception. Accessing a null pointer can cause a similar issue, but I didn't find such a place in the process. In fact, it would be better if the process just crashed, because then I could get a dump file; but nothing happens, just one thread exits.
I tried the User Mode Process Dumper tool, but it does not work on the Windows version this process runs on.
I tried Process Monitor to check the thread exit event, but Process Monitor throws an exception when I try to reproduce the issue by starting the process again and again.
Thanks in advance.
Found the root cause at last: the string is accessed by more than one thread, and one thread just exits. String is not thread safe.
Process Monitor helped to get the thread exit call stack on a powerful host, this makes the root cause clear.
Thanks all for your suggestions.

Using C++, How to detect process right before terminated

I'm using Visual C++.
I'm trying to monitor another process.
Is there a way to detect when the process is terminated? I mean right before it's terminated, the program can raise an event. After that event, the process will be terminated.
I want my code to run before the process is terminated.
The reason I want to do this is that I use WMI to detect that the process has started. But some processes end too quickly: my code has not run yet, but the process has already ended.
You would use the DebugActiveProcess function, and then use a loop which starts with WaitForDebugEvent - when the process exits, you get an EXIT_PROCESS_DEBUG_EVENT.
You will probably get a bunch of other debug events [it depends on when you attach to the process and what the process does after that point]. For those, you will just issue a call to ContinueDebugEvent - if it was an exception, DBG_EXCEPTION_NOT_HANDLED should be used, otherwise, DBG_CONTINUE.
Once you see your EXIT_PROCESS_DEBUG_EVENT, you do your thing, then issue DBG_CONTINUE. You will also need to handle LOAD_DLL_DEBUG_EVENT by closing the handle given, or you'll leak handles.
I haven't used DebugActiveProcess in exactly this manner, but I believe this will work.
See these functions for more details:
Windows Debugging Functions
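The continue-status rule described in this answer can be sketched as a few small pure functions. This is only a minimal sketch, not the answerer's code: the names continueStatusFor, shouldCloseFileHandle and isFinalEvent are hypothetical, and the #else branch is an assumption that mirrors the Windows SDK constant values purely so the decision logic also compiles away from Windows.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
// Assumption: stand-ins mirroring the Windows SDK values, present only so
// this sketch compiles on other platforms. On Windows the real headers win.
using DWORD = std::uint32_t;
constexpr DWORD EXCEPTION_DEBUG_EVENT     = 1;
constexpr DWORD EXIT_PROCESS_DEBUG_EVENT  = 5;
constexpr DWORD LOAD_DLL_DEBUG_EVENT      = 6;
constexpr DWORD DBG_CONTINUE              = 0x00010002;
constexpr DWORD DBG_EXCEPTION_NOT_HANDLED = 0x80010001;
#endif

// Hypothetical helper: the status to pass to ContinueDebugEvent.
// Exceptions are handed back to the debuggee; everything else just continues.
constexpr DWORD continueStatusFor(DWORD eventCode) {
    return eventCode == EXCEPTION_DEBUG_EVENT
        ? static_cast<DWORD>(DBG_EXCEPTION_NOT_HANDLED)
        : static_cast<DWORD>(DBG_CONTINUE);
}

// Hypothetical helper: LOAD_DLL_DEBUG_EVENT carries a file handle that the
// debugger has to close, otherwise handles leak.
constexpr bool shouldCloseFileHandle(DWORD eventCode) {
    return eventCode == LOAD_DLL_DEBUG_EVENT;
}

// Hypothetical helper: true for the event after which monitoring can stop.
constexpr bool isFinalEvent(DWORD eventCode) {
    return eventCode == EXIT_PROCESS_DEBUG_EVENT;
}
```

Inside a WaitForDebugEvent loop on Windows, this would presumably be used as ContinueDebugEvent(ev.dwProcessId, ev.dwThreadId, continueStatusFor(ev.dwDebugEventCode)), with CloseHandle(ev.u.LoadDll.hFile) called first whenever shouldCloseFileHandle returns true, and the "do your thing" step run when isFinalEvent returns true.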

Is it possible to detect 'end process' externally?

Is there some way to detect that a program was ended by windows task manager's "end process"?
I know that it's kinda impossible to do that from within the application being ended (other than to build your app as a driver and hook ZwTerminateProcess), but I wonder if there is a way to notice it from outside.
I don't want to stop the program from terminating, just to know that it was ended by "end process" (and not by any other way).
There might be a better way - but how about using a simple flag?
Naturally, you'd have to persist this flag somewhere outside of the process/program's memory, like the registry, a database, or the file system. Essentially, when the app starts up, you set the flag to 'True'; when the app shuts down through the normal means, you set the flag to 'False'.
Each time the application starts you can check the flag to see if it was not shut down correctly the previous time it was executed.
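The flag idea above can be sketched with a small text file as the persistent store. This is a minimal sketch under stated assumptions: the helper names previousRunWasUnclean and setRunningFlag and the file-based storage are illustrative choices, not something the answer prescribes (it equally allows the registry or a database).

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: returns true when the persisted flag is still 'True',
// i.e. the previous run never reached its normal shutdown path.
// A missing or unreadable file is treated as "no unclean shutdown recorded".
inline bool previousRunWasUnclean(const std::string& flagPath) {
    std::ifstream in(flagPath);
    std::string value;
    return (in >> value) && value == "True";
}

// Hypothetical helper: persists the flag. Call with true at startup and with
// false on the normal shutdown path. Returns false if the write failed.
inline bool setRunningFlag(const std::string& flagPath, bool running) {
    assert(!flagPath.empty());
    std::ofstream out(flagPath, std::ios::trunc);
    out << (running ? "True" : "False") << '\n';
    return static_cast<bool>(out);
}
```

At startup the application would first call previousRunWasUnclean(path) and then setRunningFlag(path, true); on a normal exit it would call setRunningFlag(path, false). Note that this only tells an orderly shutdown apart from everything else.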
Open up a handle to the process with OpenProcess, and then wait on that handle using one of the wait functions such as WaitForSingleObject. You can get the exit status of the process using GetExitCodeProcess. If you need your program to remain responsive to user input while waiting, then make sure to wait on a separate thread (or you can periodically poll using a timeout of zero, but remember the performance consequences of polling -- not recommended).
When you're done, don't forget to call CloseHandle. The process object won't be fully deleted from the OS until all of its handles are closed, so you'll leak resources if you forget to call CloseHandle.
Note that there's no way to distinguish between a process exiting normally or being terminated forcefully. Even if you have a convention that your program only ever exits with a status of 0 (success) or 1 (failure) normally, some other process could call TerminateProcess(YourProcess, 1), and that would be indistinguishable from your ordinary failure mode.
According to the documentation, ExitProcess calls the entry point of all loaded DLLs with DLL_PROCESS_DETACH, whereas TerminateProcess does not. (Exiting the main function results in a call to ExitProcess, as do most unhandled exceptions.)
You might also want to look into Application Recovery and Restart.
One option might be to create a "watchdog" application (installed as a service, perhaps) that monitors WMI events for stopping a process via the ManagementEventWatcher class (in the System.Management namespace).
You could query for the death of your process on an interval or come up with some event driven way to alert of your process's demise.
Here's sort of an example (it's in C# though) that could get you started.

command to suspend a thread with GDB

I'm a little new to GDB. I'm hoping someone can help me with something that should be quite simple, I've used Google/docs but I'm just missing something.
What is the 'normal' way folks debug threaded apps with GDB? I'm using pthreads. I'm wanting to watch only one thread - the two options I see are
a) tell the debugger somehow to attach to a particular thread, such that stepping won't result in jumping threads on each context switch
b) tell the debugger to suspend/free any 'uninteresting' threads
I'd prefer to go route b) - reading the help for GDB I don't see a command for this, tips?
See documentation for set scheduler-locking on.
Beware: if you suspend other threads, and if one of them holds a lock, and if your interesting thread needs that lock at some point while stepping, you'll deadlock.
What is the 'normal' way folks debug threaded apps
You can never debug thread correctness, you can only design it in. In my experience, most of debugging of threaded apps is putting in assertions, and examining state of the world when one of the assertions is violated.
First, you need to enable debugger behavior that is comfortable for multi-threading with the following commands. No idea why it's disabled by default.
set target-async 1
set non-stop on
I personally put those commands into the .gdbinit file. They make every command apply only to the currently focused thread. Note: the thread might be running, so you have to pause it.
To see the focused thread, execute thread.
To switch to another thread append the number of the thread, e.g. thread 2.
To see all threads with their numbers issue info thread.
To apply a command to a particular thread issue something like thread apply threadnum command. E.g. thread apply 4 bt will apply backtrace command to a thread number 4. thread apply all continue continues all paused threads.
There is a small problem though: many commands need the thread to be paused. I know a few ways of doing that:
interrupt command: interrupts the thread execution, accepts a number of a thread to pause, without an argument breaks the focused one.
Setting a breakpoint somewhere. Note that you may set a breakpoint to a particular thread, so that other threads will ignore it, like break linenum thread threadnum. E.g. break 25 thread 4.
You may also find it very useful that you can set a list of commands to be executed when a breakpoint is hit, through the command commands; e.g. you may quickly print interesting values, then continue execution.

How can I perform network IO at the very end of a process' lifetime?

I'm developing a DLL in C++ which needs to write some data via a (previously established) TCP/IP connection using the write() call. To be precise, the DLL should send a little 'Process 12345 is terminating at 2007-09-27 15:30:42, value of i is 131' message over the wire when the process goes down.
Unfortunately, all the ways I know for detecting that the process is ending are apparently too late for any network calls to succeed. In particular, I tried the following approaches and the write() call returned -1 in every case:
Calling write() from the destructor of a global object.
Calling write() from a callback function registered using atexit().
Calling write() from DllMain (in case the reason argument is DLL_PROCESS_DETACH). I know that this is not a safe thing to do, but I'm getting a bit desperate. :-)
I'm aware that a DLL can't detect any process shutdown (it might have been unloaded long before the process terminates) but since the shutdown data which the DLL needs to send depends on other code in the DLL, that's acceptable. I'm basically looking for the latest moment at which I can safely perform network IO.
Does anybody know how to do this?
Consider monitoring the process from a separate watchdog process.
Determining If a Process Has Exited: http://msdn.microsoft.com/en-us/library/y111seb2(v=VS.71).aspx
Tutorial: Managing a Windows Process: http://msdn.microsoft.com/en-us/library/s9tkk4a3(v=VS.71).aspx
Consider using Windows Job Objects.
Your main program (the monitoring program, which will call, for example, send()) can start the child process suspended, place it into a job, and then resume it. The child then runs inside the job object. You can register for notifications via SetInformationJobObject with JobObjectAssociateCompletionPortInformation. You will then be notified when a child process is created in the job and when a process inside the job ends, so you can send everything you need from the monitoring process. When you debug a program in Visual Studio, it also uses job objects to keep control over your process and all the child processes you start.
I have successfully used this technique in C++ and in C#, so if you have problems with the implementation I can post a code example.
I suggest taking option 3. Just do your DLL loading/unloading properly and you're fine. Calling write() should work, I can't explain why it's not in your case. Is it possible that the call fails for a different reason that is unrelated?
Does it work if you call your DLL function manually from the host app?
Why? Just close the socket. If that's the only close in the program, which by your description it must be, that tells the other end that this end is exiting, and you can send the process ID information at the beginning instead of the end. You shouldn't do anything time-consuming or potentially blocking in an exit hook or static destructor.
Where is Winsock being shut down using WSACleanup? You need to make sure that your I/O completes before this happens.
You should be able to work out if this is happening by placing a breakpoint on the Win32 call in Winsock2.dll. Unload of DLLs is displayed in the output in the debug window.