Phantom Input When Running Green Hills Debugger - c++

I'm running on a Marvell Monahans PXA320 under Green Hills INTEGRITY 5.0.10. I'm using MULTI 4.2.3 for development and an RTSERV connection for debugging. I've been asked to take over a menu-driven program.
I've noticed that if I halt the program (to modify breakpoints) and then resume it, the task gets into an infinite loop displaying the menu in the debugger I/O tab. After each instance of the menu that gets printed, it says that I have made an illegal selection. So, some input is apparently being fed into the task as if I had typed it in (and this input obviously corresponds to an invalid menu selection). I do not see on the display what this phantom input is.
Is there anything I can do to prevent a halt / resume from screwing up the I/O?
Thanks,
Dave

My first guess is that getc() (or your equivalent) is returning -1. This can happen if your input buffers overflowed as a result of halting the application. I/O keeps flowing while the application is halted...
It is generally not a good idea to halt the program when debugging with INTEGRITY. You're generally better off attaching the debugger to a single thread (something idle or infrequently used), setting an "any-task" breakpoint in that thread, and then resuming the thread. (Don't close the window! Doing so will delete the breakpoint.) You'll see a "DebugBrk" status on the thread that hits the breakpoint -- then you can double-click and attach to that specific thread.
Following that alternate procedure should (hopefully!) prevent the I/O error.
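If the getc() guess above is right, the menu loop is treating a failed read as if it were a keystroke. Here is a minimal sketch of separating the two cases, assuming the program reads its input with standard C getc(). The names read_menu_choice, MenuRead and max_choice are made up for illustration, and whether INTEGRITY's I/O layer actually reports the overflow as EOF or as a stream error is an assumption.

```cpp
#include <cstdio>

enum class MenuRead { Choice, Invalid, EndOfInput, Error };

struct MenuResult {
    MenuRead kind;
    int choice;  // meaningful only when kind == MenuRead::Choice
};

// Reads one character from `in` and classifies it.
// `max_choice` is a hypothetical menu size: valid entries are '1'..'max_choice' (at most 9).
MenuResult read_menu_choice(std::FILE* in, int max_choice) {
    int c = std::getc(in);
    if (c == EOF) {
        // EOF (-1) is not a keystroke: find out why the read failed,
        // then clear the stream flags so a later read can succeed.
        bool failed = std::ferror(in) != 0;
        std::clearerr(in);
        return {failed ? MenuRead::Error : MenuRead::EndOfInput, 0};
    }
    if (c >= '1' && c < '1' + max_choice) {
        return {MenuRead::Choice, c - '0'};
    }
    return {MenuRead::Invalid, 0};
}
```

A menu loop built on this would print "illegal selection" only for MenuRead::Invalid, and could wait or retry on Error and EndOfInput instead of redrawing the menu forever.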

Related

Rare EXCEPTION_ACCESS_VIOLATION when debugging any process started with CREATE_SUSPENDED

While writing an x86 WinAPI-based debugger, I've encountered a rare condition where the debuggee (which usually works well) suddenly terminates with EXCEPTION_ACCESS_VIOLATION after I attach to it with my native debugger. I can reliably reproduce this with seemingly any application (I tried a .NET Hello World-style application and notepad.exe on multiple Windows 10 machines).
Essentially I've written a simple WaitForDebugEvent loop:
    CreateProcessW(L"C:\\Windows\\SYSWOW64\\notepad.exe", […], CREATE_SUSPENDED, […]);
    DebugActiveProcess(processId);
    DEBUG_EVENT debugEvent = {};
    while (WaitForDebugEvent(&debugEvent, INFINITE)) {
        switch (debugEvent.dwDebugEventCode) {
            // log all the events
        }
        ContinueDebugEvent(debugEvent.dwProcessId, debugEvent.dwThreadId, DBG_EXCEPTION_NOT_HANDLED);
    }
    DebugActiveProcessStop(processId);
(here's the full listing: I won't paste it all here, because there's some additional non-essential boilerplate there; the MCVE is 136 lines long)
For the sake of an example, I'll just log all the debugger events and detect whether the debuggee is ready to "proceed normally" or it will terminate due to an exception.
Most of the time, my debugging session looks like this:
CREATE_PROCESS_DEBUG_EVENT (which reports creation of both the process and its initial thread)
LOAD_DLL_DEBUG_EVENT (I was never able to get the name for this DLL, but this is documented in MSDN)
CREATE_THREAD_DEBUG_EVENT (which, I suspect, is a thread injected by debugger)
LOAD_DLL_DEBUG_EVENT […] — after this, many DLLs get loaded into the target process and everything looks okay, the process works as intended
But sometimes (in about 1.5% of all runs), the event sequence changes:
CREATE_PROCESS_DEBUG_EVENT
LOAD_DLL_DEBUG_EVENT
CREATE_THREAD_DEBUG_EVENT
EXCEPTION_DEBUG_EVENT: EXCEPTION_ACCESS_VIOLATION (which I never was able to gather details for: it reports a DEP violation, and the address is empty)
After that, I cannot proceed with debugging, because my debuggee is in exception state and will terminate soon. I was never able to catch notepad.exe crash without my debugger attached (and I doubt it is that bad and will crash for no reason), so I suspect that my debugger causes these exceptions.
One bizarre detail is that I could "fix" the situation by calling Sleep(1) immediately after WaitForDebugEvent. So, this is possibly some sort of race condition, but race condition between what? Between the debugger thread and other threads in the debuggee? Is it a thing? How are we supposed to debug other applications, then? How could actual debuggers work if it is a thing?
I couldn't reproduce the issue with the same code compiled for x64 CPU (and debugging an x64 process).
What could actually cause this erroneous behavior? I've carefully read the documentation about the API functions I call, and checked some other debugger examples online, but still wasn't able to find what's wrong with my debugger: it looks like I follow all the right conventions.
I have tried to debug my debuggee with WinDBG while it is still paused in my debugger, but had no luck. First of all, it's difficult to attach to the debuggee with another debugger (WinDBG only allows non-intrusive mode, which seems to be less functional), and the call stacks for the process's threads usually aren't meaningful.
Steps to reproduce
Check out this repository, compile with MSVC and then execute in cmd:
Debug\NetRuntimeWaiter.exe > log.txt
It is important to redirect output to the log file and not show it in the terminal: without that, timings for the log writer get changed, and the issue won't reproduce (due to a possible race condition I mentioned earlier?).
Usually the program will start and terminate 1000 notepads in about 10 seconds, and 10-15 of 1000 invocations will hold the error condition (i.e. EXCEPTION_ACCESS_VIOLATION).
DebugActiveProcess (and the undocumented DbgUiDebugActiveProcess, which DebugActiveProcess calls internally) has a serious design problem: after calling NtDebugActiveProcess, it creates a remote thread in the target process via a DbgUiIssueRemoteBreakin call. As a result, a new thread, DbgUiRemoteBreakin, is created in the target process; this thread calls DbgBreakPoint and then RtlExitUserThread.
None of this is documented or explained, apart from this note in the DebugActiveProcess documentation:
After all of this is done, the system resumes all threads in the
process. When the first thread in the process resumes, it executes a
breakpoint instruction that causes an EXCEPTION_DEBUG_EVENT
debugging event to be sent to the debugger.
Of course, this is wrong. Why would DbgUiRemoteBreakin be the first (??) thread? Which thread resumes first is undefined. Why not write exactly what happens: an additional (not the first) thread is created in the process, and this thread executes the breakpoint?
When the process is already running, creating this additional thread causes no problems. But if we create the process in a suspended state and then just call DebugActiveProcess, DbgUiRemoteBreakin really does become the first thread to execute in the process, and process initialization is done on this thread instead of on the first created thread. On XP this always made process initialization fail at the connect-to-csrss phase (csrss waits for a connection only from the first created thread in the process). On later systems this is fixed and the process can run as usual. But it also may not, because the thread on which it was initialized has exited. This can cause subtle problems.
The solution here is to call NtDebugActiveProcess in place of DebugActiveProcess.
The debug object can be created either via DbgUiConnectToDbg() and then retrieved via DbgUiGetThreadDebugObject() (the system stores the debug object in the thread's TEB), or directly by calling NtCreateDebugObject.
Also, if the debuggee process is created from another process (B), we can do the following:
duplicate the debug object from the debugger process into process B
call DbgUiSetThreadDebugObject(hDdg) just before calling CreateProcessW with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS; the system will use DbgUiGetThreadDebugObject() to get the debug object from your thread and pass it to the low-level process creation API
remove the debug object from your thread via DbgUiSetThreadDebugObject(0)
It really does not matter who creates the process with the debug object. What matters is who handles the events posted to this debug object.
All the undocumented API definitions can be taken from ntdbg.h; then link with ntdll.lib or ntdllp.lib.

How does TSTP (polite pause) interact with my C++ program in linux?

I have written a C++ program and I am executing in the gnome terminal (I am on Ubuntu). I press Ctrl + Z, which suspends the process. Later on, I execute % on the same terminal, which resumes execution.
From what I've read, Ctrl+Z sends a TSTP signal to the process, which tells it to stop execution. But TSTP is polite, in the sense that the process is allowed to continue until it decides it can stop. In my C++ program code, I didn't do anything to explicitly deal with TSTP signals. So, my question is, what things inside my C++ code will continue running in spite of the TSTP signal? For example, if I have a file stream open, will it wait until it is closed? I expect an overall answer, not too deep or covering all the details. I just want an idea of how this happens.
Your program continues running while the SIGTSTP handler executes. Since you haven't set one up, you get the default signal handling behavior, which is for the process to be stopped.
While your process is stopped, it simply isn't scheduled for execution. Files don't get closed, nor is stopping delayed until files get closed (unless done in the signal handler).
This website looks like it has a helpful explanation of how a handler can be installed to perform some tasks and then have the default stopping behavior:
http://man7.org/tlpi/code/online/dist/pgsjc/handling_SIGTSTP.c.html
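A rough sketch of that pattern, assuming a POSIX system: the handler does its pre-stop work, restores the default action, and re-raises SIGTSTP so the process really stops. The names on_tstp and install_tstp_handler are made up for illustration, and the message written is only a placeholder for whatever cleanup the program needs.

```cpp
#include <cerrno>
#include <signal.h>
#include <unistd.h>

extern "C" void on_tstp(int /*sig*/) {
    int saved_errno = errno;  // the calls below may clobber errno

    // Pre-stop work goes here, restricted to async-signal-safe calls.
    const char msg[] = "SIGTSTP caught, stopping\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    // Restore the default disposition and raise the signal again.
    signal(SIGTSTP, SIG_DFL);
    raise(SIGTSTP);

    // SIGTSTP is blocked while its handler runs; unblocking it delivers the
    // pending signal, and the process stops inside this call.
    sigset_t tstp_mask, prev_mask;
    sigemptyset(&tstp_mask);
    sigaddset(&tstp_mask, SIGTSTP);
    sigprocmask(SIG_UNBLOCK, &tstp_mask, &prev_mask);

    // Execution continues here after SIGCONT (for example `fg` in the shell).
    sigprocmask(SIG_SETMASK, &prev_mask, nullptr);

    // Re-install the handler for the next Ctrl+Z.
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = on_tstp;
    sigaction(SIGTSTP, &sa, nullptr);

    errno = saved_errno;
}

// Returns 0 on success, -1 on failure (as sigaction does).
int install_tstp_handler() {
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = on_tstp;
    return sigaction(SIGTSTP, &sa, nullptr);
}
```

Calling install_tstp_handler() once at startup is enough; without it, Ctrl+Z simply stops the process with no code of yours running first.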

Qt check if external process crashes

I'm building a failsafe application for professional video. The Qt application checks the 4 corners of the 2nd screen and if they are a certain RGB value (I use a special background) the Qt program knows it crashed so it sends a signal to the videomixer to fade to the other input.
Now I also want to add a check to see whether the video program has crashed (it may be that the video program stops responding but still shows an output, so I can't see the desktop on the 2nd screen). I know I can use QProcess to start an external process. It's not that easy to hook it up to a process that is already running.
Now the question: how can I check if the program crashed (so "not responding") and see this as quick as possible so I can fade to the other video input. And what happens when my Qt program crashes, will it also exit the child process?
Thanks!
Using QProcess creates an attached process, so unfortunately it will be killed when your process dies. When you create a detached process using the static method QProcess::startDetached, you don't get the monitoring functionality.
You need to write a little platform-specific monitoring class that can launch a detached process and inform you of changes in its status. You need to use the native APIs in implementing that. QProcess's sources can be a good inspiration for where to start.
@KubaOber is partially correct in his statement. If you start and detach a process, you do indeed lose the Qt way of communicating with it and monitoring what it does. However, your OS offers plenty of solutions for overseeing what happens with it.
On Linux you can use:
pgrep to check whether the process is running (execute the command as a child process and see if it returns 0 (process is running) or 1 (process is no longer running))
you can use the proc filesystem to see when a process terminates (see here) and then use $? or a variable (as described in the link) to check its exit status
kill allows you a great amount of control possibilities along with pipes
You should note, however, that especially on Windows there are plenty of programs that do not follow the Unix convention for exit codes (0 = exited normally, anything else = an error has occurred). Also, a crash is just an error state that the process ended up in. The exit code tells you that an error has occurred, but in terms of a crash you will probably not be able to tell the difference just by looking at it.
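One narrow exception on POSIX systems: if the monitor itself started the program as a child and collects it with waitpid(), the wait status (not the bare exit code) separates a normal exit from termination by a signal such as SIGSEGV. A minimal sketch, with describe_wait_status as a made-up helper name; it does not apply to detached or unrelated processes, and it cannot detect a process that is hung but still alive.

```cpp
#include <string>
#include <sys/wait.h>

// Turns a status value filled in by waitpid() into a readable description.
std::string describe_wait_status(int status) {
    if (WIFEXITED(status)) {
        // The process called exit() or returned from main().
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    }
    if (WIFSIGNALED(status)) {
        // The process was terminated by a signal, which is what a crash looks like.
        return "killed by signal " + std::to_string(WTERMSIG(status));
    }
    return "stopped or continued";
}
```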

Multiple CUDA streams crashing GPU

This is a continuation of this post.
It seems as though a special case has been solved by adding volatile, but now something else has broken. If I add anything between the two kernel calls, the system reverts to the old behavior, namely freezing and printing everything at once. This behavior is shown by adding sleep(2); between set_flag and read_flag. Also, when put in another program, this causes the GPU to lock up. What am I doing wrong now?
Thanks again.
There is an interaction with X and the display driver, as well as with the standard output queue and its interaction with the graphical display driver.
A few experiments you can try, (with the sleep(2); added between the set_flag and read_flag kernels):
Log into your machine over the network via ssh from another machine. I think your program will work. (X is not involved in the display in this case)
Comment out the line that prints out "Starting..." I think your program will then work. (This avoids the display driver/print queue deadlock, see below.)
Add a sleep(2); in between the "Starting..." print line and the first kernel. I think your program will then work. (This allows the display driver to fully service the first printout before the first kernel is launched, so no CPU thread stall.)
Stop X and run from a console. I think your program will work.
When the GPU is both hosting an X display and also running CUDA tasks, it has to switch between the two. For the duration of the CUDA task, ordinary display processing is suspended. You can read more about this here.
The problem here is that when running X, the first printout is getting sent to the print queue but not actually displayed before the first kernel is launched. This is evident because you don't see the printout before the display freeze. After that, the CPU thread is getting stalled waiting for the display of the text. The second kernel is not starting. The intervening sleep(2); and its interaction with the OS is enough for this stall to occur. And the executing first kernel has the display driver "stopped" for ordinary display tasks, so the OS never gets past its stall, so the 2nd kernel doesn't get launched, leading to the apparent hang.
Note that options 1, 2, or 3 in the linked custhelp article would be effective in your case. Option 4 would not.

C++ - Totally suspend windows application

I am developing a simple WinAPI application and started from writing my own assertion system.
I have a macro defined like ASSERT(X) which does pretty much the same thing as assert(X), but with more information, more options, etc.
At some moment (when that assertion system was already running and working) I realized there is a problem.
Suppose I wrote code that performs some action using a timer and (just a simple example) this action is done while handling the WM_TIMER message. Now suppose something changes so that this code starts triggering an assert. The assert message would be shown every TIMER_RESOLUTION milliseconds and would simply flood the screen.
Options for solving this situation could be:
1) Totally pause application running (probably also, suspend all threads) when the assertion messagebox is shown and continue running after it is closed
2) Make a static counter for the shown asserts and don't show asserts when one of them is already showing (but this doesn't pause application)
3) Group similar asserts and show only one for each assert type (but this also doesn't pause application)
4) Modify the application code (for example, Get / Translate / Dispatch message loop) so that it suspends itself when there are any asserts. This is good, but not universal and looks like a hack.
To my mind, option number 1 is the best. But I don't know how this can be achieved. What I'm looking for is a way to pause the runtime (something similar to the Pause button in the debugger). Does somebody know how to achieve this?
Also, if somebody knows an efficient way to handle this problem - I would appreciate your help. Thank you.
It is important to understand how Windows UI programs work, to answer this question.
At the core of the Windows UI programming model is of course "the message queue". Messages arrive in message queues and are retrieved using message pumps. A message pump is not special. It's merely a loop that retrieves one message at a time, blocking the thread if none are available.
Now why are you getting all these dialogs? Dialog boxes, including MessageBox also have a message pump. As such, they will retrieve messages from the message queue (It doesn't matter much who is pumping messages, in the Windows model). This allows paints, mouse movement and keyboard input to work. It will also trigger additional timers and therefore dialog boxes.
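To make the re-entrancy concrete, here is a toy model of the situation in portable C++. It is not the Win32 API: ToyApp, Msg, pump and dispatch are made-up names, the queue is a plain std::deque, and the pump returns when the queue runs dry where a real pump would block. What it shows is that a "dialog" running its own pump on the same queue ends up dispatching the next timer message before the first handler has returned.

```cpp
#include <deque>
#include <string>
#include <vector>

enum class Msg { Timer, CloseDialog, Quit };

struct ToyApp {
    std::deque<Msg> queue;         // stands in for the thread's message queue
    std::vector<std::string> log;  // records what happened, in order
    int open_dialogs = 0;          // counts every dialog that was ever opened

    // A pump: take one message at a time and dispatch it. Returns when it
    // retrieves `stop` or the queue is empty (a real pump would block instead).
    void pump(Msg stop) {
        while (!queue.empty()) {
            Msg m = queue.front();
            queue.pop_front();
            if (m == stop) {
                return;
            }
            dispatch(m);
        }
    }

    void dispatch(Msg m) {
        if (m == Msg::Timer) {
            // The timer handler "asserts" and shows a modal dialog, and the
            // dialog runs its own pump on the same queue until it is closed.
            ++open_dialogs;
            log.push_back("dialog " + std::to_string(open_dialogs) + " opened");
            pump(Msg::CloseDialog);
            log.push_back("dialog closed");
        }
    }
};
```

With three Timer messages queued ahead of three CloseDialog messages and a final Quit, calling pump(Msg::Quit) opens all three dialogs, nested inside one another, before any of them is closed. That is the flood described in the question.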
So, the canonical Windows approach is to handle each message whenever it arrives. They are a fact of life and you deal with them.
In your situation, I would consider a slight variation. You really want to save the state of your stack at the point where the assert happened. That's a particularity of asserts that deserves to be respected. Therefore, spin off a thread for your dialog, and create it without a parent HWND. This gives the dialog an isolated message queue, independent of the original window. Since there's also a new thread for it, you can suspend the original thread, the one where WM_TIMER arrives.
Don't show a prompt - either log to a file/debug output, or just forcibly break the debugger (usually platform specific, eg. Microsoft's __debugbreak()). You have to do something more passive than show a dialog if there are threads involved which could fire lots of failures.
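A sketch of such a passive assert, under a few assumptions: the names MY_ASSERT, report_assert and break_hook are made up for illustration, the log goes to stderr, and the break uses __debugbreak() on MSVC and SIGTRAP elsewhere (which presumes a POSIX system). Each assert site reports only its first failure, so a timer-driven failure cannot flood the log, much like option 3 in the question.

```cpp
#include <atomic>
#include <cstdio>
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <signal.h>
#endif

using BreakFn = void (*)();

// Default action: stop in the debugger. Without a debugger attached this
// terminates the process on most platforms.
inline void default_break() {
#if defined(_MSC_VER)
    __debugbreak();
#else
    raise(SIGTRAP);  // assumption: POSIX signal available on this platform
#endif
}

// Hypothetical hook so the break action can be replaced, e.g. in tests.
inline BreakFn& break_hook() {
    static BreakFn fn = default_break;
    return fn;
}

// Logs the failure and breaks, but only the first time a given site fails.
// Returns true if this call did the reporting.
inline bool report_assert(std::atomic<bool>& seen, const char* expr,
                          const char* file, int line) {
    if (seen.exchange(true)) {
        return false;  // this site has already reported once
    }
    std::fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    break_hook()();
    return true;
}

// Each expansion gets its own static flag, so suppression is per call site.
#define MY_ASSERT(cond)                                                  \
    do {                                                                 \
        static std::atomic<bool> seen_{false};                           \
        if (!(cond)) report_assert(seen_, #cond, __FILE__, __LINE__);    \
    } while (0)
```

No dialog is created, so no extra message pump runs on the failing thread; the trade-off is that later failures at the same site are silently dropped.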
Create a worker thread for your debugging code. When an assert happens, send a message to the worker thread. The worker thread would call SuspendThread on each thread in the process (except itself) to stop it, and then display a message box.
To get the threads in a process - create a dll and monitor the DllMain for Thread Attach (and Detach) - each call will be done in the context of a thread being created (or destroyed) so you can get the current thread id and create a handle to use with SuspendThread.
Or, the toolhelp debug api will help you find out the threads to pause.
The reason I prefer this approach is that I don't like asserts that cause side effects. Too often I've had asserts fire from asynchronous socket processing or window message processing code. The assert message box is then created on that thread, which either corrupts the thread's state through a totally unexpected re-entrancy point or, because MessageBox also discards any messages sent to the thread, messes up any worker threads that use thread message queues to queue jobs.
My own ASSERT implementation calls DebugBreak() or, as an alternative, INT 3 (__asm int 3 in MS VC++). An ASSERT should break into the debugger.
Use the MessageBox function. This will block until the user clicks "ok". After this is done, you could choose to discard extra assertion failure messages or still display them as your choice.