Displaying exception debug information to users

Displaying exception debug information to users - c++

I'm currently working on adding exceptions and exception handling to my OSS application. Exceptions have been the general idea from the start, but I wanted to find a good exception framework and in all honesty, understand C++ exception handling conventions and idioms a bit better before starting to use them. I have a lot of experience with C#/.Net, Python and other languages that use exceptions. I'm no stranger to the idea (but far from a master).
In C# and Python, when an unhandled exception occurs, the user gets a nice stack trace and in general a lot of very useful priceless debugging information. If you're working on an OSS application, having users paste that info into issue reports is... well let's just say I'm finding it difficult to live without that. For this C++ project, I get "The application crashed", or from more informed users, "I did X, Y and Z, and then it crashed". But I want that debugging information too!
I've already (and with great difficulty) made my peace with the fact that I'll never see a cross-platform and cross-compiler way of getting a C++ exception stack trace, but I know I can get the function name and other relevant information.
And now I want that for my unhandled exceptions. I'm using boost::exception, and they have this very nice diagnostic_information thingamajig that can print out the (unmangled) function name, file, line and most importantly, other exception specific information the programmer added to that exception.
Naturally, I'll be handling exceptions inside the code whenever I can, but I'm not that naive to think I won't let a couple slip through (unintentionally, of course).
So what I want to do is wrap my main entry point inside a try block with a catch that creates a special dialog that informs the user that an error has occurred in the application, with more detailed information presented when the user clicks "More" or "Debug info" or whatever. This would contain the string from diagnostic_information. I could then instruct the users to paste this information into issue reports.
But a nagging gut feeling is telling me that wrapping everything in a try block is a really bad idea. Is what I'm about to do stupid? If it is (and even if it's not), what's a better way to achieve what I want?

Putting a try/catch block in main() is okay, it doesn't cause any problems. The program is dead on an unhandled exception anyway. It isn't going to be helpful at all in your quest to get the all-important stack trace though. That info is gonzo when the catch block traps the exception.
Catching a C++ exception won't be very helpful either. The odds that the program dies on a an exception derived from std::exception are pretty slim. Although it could happen. Much more likely in a C/C++ app is death due to hardware exceptions, AccessViolation being numero uno. Trapping those requires the __try and __except keywords in your main() method. Again, very little context is available, you've basically only got an exception code. An AV also tells you which exact memory location caused the exception.
This is not just a cross-platform issue btw, you can't get a good stack trace on any platform. There is no reliable way to walk the stack, there are too many optimizations (like framepointer omission) that make this a perilous journey. It is the C/C++ way: make it as fast as possible, leave no clue what happened when it blows up.
What you need to do is debug these kind of problems the C/C++ way. You need to create a minidump. It is roughly analogous to the "core dump" of old, a snapshot of the process image at the time the exception happens. Back then, you actually got a complete dump of the core. There's been progress, nowadays it is "mini", somewhat necessary because a complete core dump would take close to 2 gigabytes. It actually works pretty well to diagnose the program state.
On Windows, that starts by calling SetUnhandledExceptionFilter(), you provide a callback function pointer to a function that will run when your program dies on an unhandled exception. Any exception, C++ as well as SEH. Your next resource is dbghelp.dll, available in the Debugging Tools for Windows download. It has an entrypoint called MiniDumpWriteDump(), it creates a minidump.
Once you get the file created by MiniDumpWriteDump(), you're pretty golden. You can load the .dmp file in Visual Studio, almost like it's a project. Press F5 and VS grinds away for a while trying to load .pdb files for the DLLs loaded in the process. You'll want to setup the symbol server, that's very important to get good stack traces. If everything works, you'll get a "debug break" at the exact location where the exception was thrown". With a stack trace.
Things you need to do to make this work smoothly:
Use a build server to create the binaries. It needs to push the debugging symbols (.pdb files) to a symbol server so they are readily available when you debug the minidump.
Configure the debugger so it can find the debugging symbols for all modules. You can get the debugging symbols for Windows from Microsoft, the symbols for your code needs to come from the symbol server mentioned above.
Write the code to trap the unhandled exception and create the minidump. I mentioned SetUnhandledExceptionFilter() but the code that creates the minidump should not be in the program that crashed. The odds that it can write the minidump successfully are fairly slim, the state of the program is undetermined. Best thing to do is to run a "guard" process that keeps an eye on a named Mutex. Your exception filter can set the mutex, the guard can create the minidump.
Create a way for the minidump to get transferred from the client's machine to yours. We use Amazon's S3 service for that, terabytes at a reasonable rate.
Wire the minidump handler into your debug database. We use Jira, it has a web-service that allows us to verify the crash bucket against a database of earlier crashes with the same "signature". When it is unique, or doesn't have enough hits, we ask the crash manager code to upload the minidump to Amazon and create the bug database entry.
Well, that's what I did for the company I work for. Worked out very well, it reduced crash bucket frequency from thousands to dozens. Personal message to the creators of the open source ffdshow component: I hate you with a passion. But you're no longer crashing our app anymore! Buggers.

Wrapping all your code in one try/catch block is a-ok. It won't slow down the execution of anything inside it, for example. In fact, all my programs have (code similar to) this framework:
int execute(int pArgc, char *pArgv[])
{
// do stuff
}
int main(int pArgc, char *pArgv[])
{
// maybe setup some debug stuff,
// like splitting cerr to log.txt
try
{
return execute(pArgc, pArgv);
}
catch (const std::exception& e)
{
std::cerr << "Unhandled exception:\n" << e.what() << std::endl;
// or other methods of displaying an error
return EXIT_FAILURE;
}
catch (...)
{
std::cerr << "Unknown exception!" << std::endl;
return EXIT_FAILURE;
}
}

No it's not stupid. It's a very good idea, and it costs virtually nothing at runtime until you hit an unhandled exception, of course.
Be aware that there is already an exception handler wrapping your thread, provided by the OS (and another one by the C-runtime I think). You may need to pass certain exceptions on to these handlers to get correct behavior. In some architectures, accessing mis-aligned data is handled by an exception handler. so you may want to special case EXCEPTION_DATATYPE_MISALIGNMENT and let it pass on to the higher level exception handler.
I include the registers, the app version and build number, the exception type and a stack dump in hex annotated with module names and offsets for hex values that could be addresses to code. Be sure to include the version number and build number/date of your exe.
You can also use VirtualQuery to turn stack values into "ModuleName+Offset" pretty easily. And that, combined with a .MAP file will often tell you exactly where you crashed.
I found that I could train beta testers to send my the text pretty easily, but in the early days what I got was a picture of the error dialog rather than the text. I think that's because a lot of users don't know you can right click on any Edit control to get a menu with "Select All" and "Copy". If I was going to do it again, I would add a button
that copied that text to the clipboard so that it can easily be pasted into an email.
Even better if you want to go to the trouble of haveing a 'send error report' button, but just giving users a way to get the text into their own emails gets you most of the way there, and doesn't raise any red flags about "what information am I sharing with them?"

In fact, boost::diagnostic_information has been designed specifically to be used in a "global" catch(...) block, to display information about exceptions which should not have reached it. However, note that the string returned by boost::diagnostic_information is NOT user-friendly.

Related

If application crashes (eg. segfault or unhandled exception), since some Win10 update it now seems to die silently

In good old time, invalid memory access or unhandled exception in application resulted in some form messagebox displayed.
It seems to me that recently this stopped to be true. I can create a small application that does nothing else than write to NULL pointer, run it from the windows shell and it just dies silently.
I have Visual C++ commandline tools installed and using to compile that small app (plain C++ and win32 SDK). App is compiled in 64bit mode.
Any clue what is going on? I am really missing those crash messageboxes...

It's true by default this message boxes are disabled. You can do a few things about it:
1. (Re)Enable the messagebox (most probably what you are looking for)
Press Start and type gpedit.msc. Than navigate to Computer Configuration -> Administrative Templates -> Windows Components -> Windows Error Reporting -> Prevent display of the user interface for critical errors and select Disabled.
This will bring back at least some error messages if your application crashes.
2. Setup an Unhandled Exception Filter (probably dangerous)
Install an exception handler filter and filter for your desired exceptions. The drawback here is, the filter is called on every thrown exception.
3. Setup a signalhandler (also dangerous)
Basically like this.
void SignalHandler(int signal)
{
printf("Signal %d",signal);
throw "!Access Violation!";
}
int main()
{
typedef void (*SignalHandlerPointer)(int);
SignalHandlerPointer previousHandler;
previousHandler = signal(SIGSEGV , SignalHandler);
}
4. Use Windows Error Reporting
As mentioned by IInspectable and described in his answer.
Option 2 and 3 can become quite tricky and dangerous. You need some basic understanding in SEH exceptions, since different options can lead to different behavior. Also, not everything is allowed in the exception handlers, e.g: writing into files is cosidered extremly dangerous, or even printing to the terminal. Plus, since you are handling this exceptions, your program won't be terminated, means after the handler, it will jump right back to the erroneous code.

If you want your process to always show the error UI, you can call WerSetFlags with a value of WER_FAULT_REPORTING_ALWAYS_SHOW_UI. Or use any other applicable option offered by Windows Error Reporting, that suits your needs (like automatically creating a crash dump on unhandled exceptions).

C++ / WIndows - Can't catch exceptions

I'm a beginner with the windows api so there must be something i don't understand here. In my main function i'm using a try-catch to catch all uncaught exceptions, but for some reason, exceptions i throw from somewhere else in the code are never caught. My application uses a single (main) thread.
I'm throwing like this :
throw "ClassName::methodName() - Error message";
And catching the exceptions outside of the message loop :
try {
while(GetMessage(args...)) {
TranslateMessage(args...);
DispatchMessage(args...);
}
}
catch( const char * sExc ) {
::MessageBox(args...);
}
I first thought it was a problem of types mismatching, but then i added a catch(...) with ellipsis and i still caught nothing. If you ask, yes, i'm sure the exception is thrown. Isn't it a problem related to some kind of asynchronousness or something like that ?
Thanks for your help !

It depends on the specific message that's getting dispatched. But no, not all of them allow the stack to be unwound through Windows internal code. Particularly the messages that involve the window manager, WM_CREATE for example. There's a backstop inside Windows that prevents the stack from being unwound past that critical code. There's also an issue with exceptions in 32-bit code that run on the 64-bit version of Windows 7, they can get swallowed when the message needs to traverse the Wow64 boundary several times. Fixed in Windows 8.
On later Windows versions this can also activate "self-healing" code, automatically activating an appcompat shim that swallows the exception. You'll get a notification for that, easy to dismiss. You'll then see the first-chance exception notification in the VS Output window but your program keeps running. Okayish for the user perhaps but not great while you are debugging of course. Run Regedit.exe and navigate to HKCU\Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\Compatibility Assistant\Persisted and check if your program is listed there. Just delete the entry.
Long story short, it is not safe to catch exceptions outside of the message loop. You have to do it inside the window procedure.

You are talking about "Windows Structured Exception Handling" (http://msdn.microsoft.com/en-us/library/windows/desktop/ms680657%28v=vs.85%29.aspx). C++ exceptions are not thrown.
If you want to go the troublesome route: _set_se_translator
See also: Can a C program handle C++ exceptions? (The windows API is not C++)

Can one prevent Microsoft Error Reporting for a single app?

We have an unmanaged C++ application that utilizes 3rd party APIs to read CAD files. On certain corrupted CAD files, the 3rd party library crashes and brings our EXE down with it. Because of this our main application is a separate EXE and in this way it does not get affected by the crash. Howevever, we end up with annoying Microsoft Error Reporting dialogs.
I do not want to disable Microsoft Error Reporting system wide. Is there a way to turn off error reporting for a single application, so that if it crashes, it crashes silently without error popup dialogs?

On Vista and above, the WerAddExcludedApplication API function can be used to exclude a specified application executable from error reporting. As far as I'm aware, there's no similar option on XP and other legacy OS versions.
However, since WER will only kick in on unhandled application exceptions, you should be able to suppress it by adding a "catch-all" exception handler to your EXE. See vectored exception handling for some ideas on how to achieve that.
Note that suppressing all unhandled exceptions is generally a bad idea (and will, for example, cause your app to fail Windows Logo certification), so you should not use this technique indiscriminately...

Yeah, there's something you can do. Call SetUnhandledExceptionFilter() in your main() method to register a callback. It will be called when nobody volunteers to handle the exception, just before the Microsoft WER dialog shows up.
Actually doing something in that callback is fraught with trouble. The program has died with, invariably, something nasty like an AccessViolation exception. Which often is tripped by heap corruption. Trying to do something like displaying a message box to let the user know is troublesome when the heap is toast. Deadlock is always lurking around the corner too, ready to just lock up the program without any diagnostic at all.
The only safe thing to do is to have a helper process that guards your main process. Wake it up by signaling a named event in your callback. Give it the exception info it needs with a memory-mapped file. The helper can do pretty much anything it wants when it sees the event signaled. Including showing a message, taking a minidump. And terminating the main process.
Which is exactly how the Microsoft WerFault helper works.

Hans Passant's answer about SetUnhandledExceptionFilter is on the right track. He also makes some good points about not being able to do too much within the callback because various parts of the process could be in an unstable state.
However, from the way the issue is described, it doesn't sound like you want to do anything except tell the system not to put up the normal crash dialog. In that case, it's easy and should be safe regardless of what parts of the process the crash may have affected.
Make a function something like this:
LONG WINAPI UnhandledExceptionCallback(PEXCEPTION_POINTERS pExceptPtrs)
{
if (IsDebuggerPresent())
// Allow normal crash handling, which means the debugger will take over.
return EXCEPTION_CONTINUE_SEARCH;
else
// Say we've handled it, so that the standard crash dialog is inhibited.
return EXCEPTION_EXECUTE_HANDLER;
}
And somewhere in your program (probably as early as possible) set the callback:
SetUnhandledExceptionFilter(UnhandledExceptionCallback);
That should do what you want - allow any crashes of that particular program to die silently.
However, there's something else to note about this: Any time you bring in 3rd-party components (DLLs, OCXs, etc) there is a risk that one of them may also call SetUnhandledExceptionFilter and thus replace your callback with their own. I once encountered an ActiveX control that would set its own callback when instantiated. And even worse, it failed to restore the original callback when it was destroyed. That seemed to be a bug in their code, but regardless I had to take extra steps to ensure that my desired callback was at least restored when it was supposed to be after their control was shutdown. So if you find this doesn't appear to work for you sometimes, even when you know you've set the callback correctly, then you may be encountering something similar.

I found myself in exactly this situation while developing a Delphi application. I found that I needed two things to reliably suppress the "app has stopped working" dialog box.
Calling SetErrorMode(SEM_NOGPFAULTERRORBOX); suppresses the "app has stopped working" dialog box. But then Delphi's exception handler shows a message box with a runtime error message instead.
To suppress Delphi's exception handler I call SetUnhandledExceptionFilter with a custom handler that terminates the process by calling Halt.
So the skeleton for a Delphi client app that runs code prone to crashes becomes:
function HaltOnException(const ExceptionInfo: TExceptionPointers): Longint; stdcall;
begin
Halt;
Result := 1; // Suppress compiler warning
end;
begin
SetErrorMode(SEM_NOGPFAULTERRORBOX);
SetUnhandledExceptionFilter(#HaltOnException);
try
DoSomethingThatMightCrash;
except
on E: Exception do
TellServerWeFailed(E.Message);
end;
end.

I'm not at all sure, but perhaps SetErrorMode or SetThreadErrorMode will work for you?

How can I debug a win32 process that unexpectedly terminates silently?

I have a Windows application written in C++ that occasionally evaporates. I use the word evaporate because there is nothing left behind: no "we're sorry" message from Windows, no crash dump from the Dr. Watson facility...
On the one occasion the crash occurred under the debugger, the debugger did not break---it showed the application still running. When I manually paused execution, I found that my process no longer had any threads.
How can I capture the reason this process is terminating?

You could try using the adplus utility in the windows debugging tool package.
adplus -crash -p yourprocessid
The auto dump tool provides mini dumps for exceptions and a full dump if the application crashes.

If you are using Visual Studio 2003 or later, you should enable the debuggers "First Chance Exception" handler feature by turning on ALL the Debug Exception Break options found under the Debug Menu | Exceptions Dialog. Turn on EVERY option before starting the debug build of the process within the debugger.
By default most of these First Chance Exception handlers in the debugger are turned off, so if Windows or your code throws an exception, the debugger expects your application to handle it.
The First Chance Exception system allows debuggers to intercept EVERY possible exception thrown by the Process and/or System.
http://support.microsoft.com/kb/105675

All the other ideas posted are good.
But it also sounds like the application is calling abort() or terminate().
If you run it in the debugger set a breakpoint on both these methods and exit() just for good measure.
Here is a list of situations that will cause terminate to be called because of exceptions going wrong.
See also:
Why destructor is not called on exception?
This shows that an application will terminate() if an exceptions is not caught. So stick a catch block in main() that reports the error (to a log file) then re-throw.
int main()
{
try
{
// Do your code here.
}
catch(...)
{
// Log Error;
throw; // re-throw the error for the de-bugger.
}
}

Well, the problem is you are getting an access violation. You may want to attach with WinDBG and turn on all of the exception filters. It may still not help - my guess is you are getting memory corruption that isn't throwing an exception.
You may want to look at enabling full pageheap checking
You might also want to check out this older question about heap corruption for some ideas on tools.

The most common cause for this kind of sudden disappearance is a stack overflow, usually caused by some kind of infinite recursion (which may, of course, involve a chain of several functions calling each other).
Is that a possibility in your app?

You could check the Windows Logs in Event Viewer on Windows.

First of all I want to say that I've only a moderate experience on windows development.
After that I think this is a typical case where a log may help.
Normally debugging and logging supply orthogonal info. If your debugger is useless probably the log will help you.

This could be a call to _exit() or some Windows equivalent. Try setting a breakpoint on _exit...

Have you tried PC Lint etc and run it over your code?
Try compiling with maximum warnings
If this is a .NET app - use FX Cop.

Possible causes come to mind.
TerminateProcess()
Stack overflow exception
Exception while handling an exception
The last one in particular results in immediate failure of the application.
The stack overflow - you may get a notification of this, but unlikely.
Drop into the debugger, change all exception notifications to "stop always" rather than "stop if not handled" then do what you do to cause the program failure. The debugger will stop if you get an exception and you can decide if this is the exception you are looking for.

I spent a couple hours trying to dig into this on Visual Studio 2017 running a 64-bit application on Windows 7. I ended up having to set a breakpoint on the RtlReportSilentProcessExit function, which lives in the ntdll.dll file. Just the base function name was enough for Visual Studio to find it.
That said, after I let Visual Studio automatically download symbols for the C standard library, it also automatically stopped on the runtime exception that caused the problem.

How to get a stack trace when C++ program crashes? (using msvc8/2005)

Sometimes my c++ program crashes in debug mode, and what I got is a message box saying that an assertion failed in some of the internal memory management routines (accessing unallocated memory etc.). But I don't know where that was called from, because I didn't get any stack trace. How do I get a stack trace or at least see where it fails in my code (instead of library/ built-in routines)?

If you have a crash, you can get information about where the crash happened whether you have a debug or a release build. And you can see the call stack even if you are on a computer that does not have the source code.
To do this you need to use the PDB file that was built with your EXE. Put the PDB file inside the same directory as the EXE that crashed. Note: Even if you have the same source code, building twice and using the first EXE and the second PDB won't work. You need to use the exact PDB that was built with your EXE.
Then attach a debugger to the process that crashed. Example: windbg or VS.
Then simply checkout your call stack, while also having your threads window open. You will have to select the thread that crashed and check on the callstack for that thread. Each thread has a different call stack.
If you already have your VS debugger attached, it will automatically go to the source code that is causing the crash for you.
If the crash is happening inside a library you are using that you don't have the PDB for. There is nothing you can do.

If you run the debug version on a machine with VS, it should offer to bring it up and let you see the stack trace.
The problem is that the real problem is not on the call stack any more. If you free a pointer twice, that can result in this problem somewhere else unrelated to the program (the next time anything accesses the heap datastructures)
I wrote this blog on some tips for getting the problem to show up in the call stack so you can figure out what is going on.
http://www.atalasoft.com/cs/blogs/loufranco/archive/2007/02/06/6-_2200_Pointers_2200_-on-Debugging-Unmanaged-Code.aspx
The best tip is to use the gflags utility to make pointer issues cause immediate problems.

You can trigger a mini-dump by setting a handler for uncaught exceptions. Here's an article that explains all about minidumps
Google actually implemented their own open source crash handler called BreakPad, which also mozilla use I think (that's if you want something more serious - a rich and robust crash handler).

If I remember correctly that message box should have a button which says 'retry'. This should then break the program (in the debugger) at the point where the assertion happened.

CrashFinder can help you locate the place of the exception given the DLL and the address of the exception reported.
You can take this code and integrate it into your application to have a stack trage automatically generated when there is an uncaught exception. This is generally performed using __try{} __except{} or with a call to SetUnhandledExceptionFilter which allows you to specify a callback to all unhandled exceptions.

You can also have a post-mortem debugger installed on the client system. This is a decent, general way to get information when you do not have dump creation built into your application (maybe for an older version for which you must still get information).
Dr. Watson on Windows can be installed by running: drwtsn32 -i Running drwtsn32 (without any options) will bring up the configuration dialog. This will allow the creation of crash dump files, which you can later analyze with WinDbg or something similar.

You can use Poppy for this. You just sprinkle some macros across your code and it will gather the stack trace, together with the actual parameter values, local variables, loop counters, etc. It is very lightweight so it can be left in the release build to gather this information from crashes on end-user machines

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js