I have an utterly baffling problem with a C++ program I'm compiling with MinGW. At a certain, deterministic point in the program some exception handlers get disabled, and any future exceptions thrown are no longer handled. I can track down the precise line of source that does this, it's an inoffensive assignment to an array on the heap. The pointer isn't corrupt, nor am I writing beyond its bounds. What's more the same code is called in a bunch of different circumstances, even with most of the same arguments without triggering the bug. If I fiddle with the code to make the value it writes always zero, the bug is never triggered. I'm at a loss to know what's happening. It also only disables some exception handlers. exception handlers further down the callstack get disabled, while ones higher up stay active. It's baffling.
So, how do should I go about debugging this? I really don't have a good grasp of how exceptions actually work in MinGW's version of GCC. What could be happening to cause this weird set of symptoms?
Never mind, turns out that taking some time out and thinking about the symptoms suggested the answer.
Turns out a terrible library I'm using can throw an exception during cleanup, so I was getting an exception within an exception during stack unwinding.
I should have realised that when handlers further up the call stack behaved properly. Duh.
Related
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What is the easiest way to make a C++ program crash?
There's a construct I see frequently in our codebase, where the program gets into an invalid state somehow, the code will do something intentionally wrong, just to force a crash. It usually goes something like this:
if(<something is wrong>)
{
int *ptr = NULL;
*ptr = 0;
}
This of course causes a null reference exception and crashes the program in an unrecoverable manner. I'm just wondering if this is really the best way to do this? First of all, it doesn't read well. Without a comment, you might not realize that the crash that occurs here is intended. Secondly, there's pretty much no way to recover from this. It doesn't throw an exception, so it can't be handled by other code. It just kills the program dead with no way of backtracking. It also doesn't give much of a clue as to why it had to crash here. And it will crash in all builds, unlike, say, an assert. (We do have a pretty robust assert system available, but it isn't always used in cases like this.)
It's the style we're using all over the place, and I'm in no position to try and convince anyone otherwise. I'm just curious how common this is in the industry.
You can't "crash" a program in a deliberate way, because by its very definition a crash is when a program goes wrong and fails to behave deterministically.
The standard way to terminate execution is via std::terminate; the usual way to achieve this is to call std::abort, which raises an unblockable signal against the process (leading to std::terminate automatically) and also causing many operating systems to produce a core dump.
You should throw an exception, which is basically causing a crash deliberately and in a controlled manner. Here is an example with help from this question.
throw string("something_went_wrong");
Better yet is that the error is caught or fixed. Assert would also be a fine choice.
I guess that this is the way to trigger a core dump in the situations where you are not debugging. Core dump then gives you enough information to analyze the problem. In case of "programmer errors" (or bugs) this is better than throwing an exception because stack unwinding would not allow you to build a reasonable core dump. A similar effect could be achieved in a more elegant way by calling std::terminate having previously registered (with std::set_terminate) a function that generates a core dump or something similar. See this article for more detailed explanation.
I have a Win32 C++ app developed in VS2005. There is a try {} catch (...) {} wrapped around a block of code, and yet 3 functions deep, when the algorithm breaks down and tries to reference off the end of a std::vector, instead of catching the exception, the program drops into the VS debugger, tells me I have an unhandled win32 exception, and the following is found on the call stack above my function:
msvcr80.dll!:inavlid_parameter_noinfo()
msvcr80.dll!:invoke_watson(....)
msvcr80.dll!:_crt_debugger_hook(...)
How can I prevent the debugger being called? This occurs at the end of a 30 minute simulation, at which point I lose all my results unless I can catch and log the exception. This and similar try/catch constructs have been working in the past - are there compiler settings which affect this? Help?
You may want to convert non-C++ exceptions into C++ exceptions. Here's an example of how to do it.
Apologies for the delay - unforeseen circumstances - but the real answer appears to be the following.
First, the application is also wrapped in a __try {} __except () {} construct, and uses unhandled exception filtering based on XCrashReport. This picks up the non C++ exceptions for the reasons many have pointed out above.
Second, however, as one can find here and elsewhere, since CRT 8.0, for security reasons Microsoft bypasses unhandled exception handling on buffer overruns (and certain other situations), and passes the issue directly to Dr Watson.
This is really annoying - I want to know where my buffer overruns are occurring, and why, and to fix them. I also want my program to crash, leave an audit trail, and then be automatically restarted, 24/7. Having MS Visual Studio on the PC seems to mean that Dr Watson will pause and offer me the option of using MSVC to debug the issue. However, until I respond to the dialog box, nothing will happen. Comments on workarounds greatly appreciated...
Just because you have a catch(...) doesn't mean you can catch and recover from all exceptions. Not all exceptions are recoverable.
You problem might be that the first exception that pin points the exact problem is getting obscured by your catch statement
You need to attach the debugger to the program and have it break on all exceptions and fix the code
Standard Library containers do not check for any logic errors and hence themselves never emit any exceptions.
Specifically, for std::vector only method that throws an exception is std::vector::at().
Any Undefined Behavior w.r.t to them will most likely lead your program to crash.
You will need to use Windows' SEH and C++ Exception Handling for catching Windows specific exceptions, which are non C++ standard btw.
This is an unusual question to ask but here goes:
In my code, I accidentally dereference NULL somewhere. But instead of the application crashing with a segfault, it seems to stop execution of the current function and just return control back to the UI. This makes debugging difficult because I would normally like to be alerted to the crash so I can attach a debugger.
What could be causing this?
Specifically, my code is an ODBC Driver (ie. a DLL). My test application is ODBC Test (odbct32w.exe) which allows me to explicitly call the ODBC API functions in my DLL. When I call one of the functions which has a known segfault, instead of crashing the application, ODBC Test simply returns control to the UI without printing the result of the function call. I can then call any function in my driver again.
I do know that technically the application calls the ODBC driver manager which loads and calls the functions in my driver. But that is beside the point as my segfault (or whatever is happening) causes the driver manager function to not return either (as evidenced by the application not printing a result).
One of my co-workers with a similar machine experiences this same problem while another does not but we have not been able to determine any specific differences.
Windows has non-portable language extensions (known as "SEH") which allow you to catch page faults and segmentation violations as exceptions.
There are parts of the OS libraries (particularly inside the OS code that processes some window messages, if I remember correctly) which have a __try block and will make your code continue to run even in the face of such catastrophic errors. Likely you are being called inside one of these __try blocks. Sad but true.
Check out this blog post, for example: The case of the disappearing OnLoad exception – user-mode callback exceptions in x64
Update:
I find it kind of weird the kind of ideas that are being attributed to me in the comments. For the record:
I did not claim that SEH itself is bad.I said that it is "non-portable", which is true. I also claimed that using SEH to ignore STATUS_ACCESS_VIOLATION in user mode code is "sad". I stand by this. I should hope that I had the nerve to do this in new code and you were reviewing my code that you would yell at me, just as if I wrote catch (...) { /* Ignore this! */ }. It's a bad idea. It's especially bad for access violation because getting an AV typically means your process is in a bad state, and you shouldn't continue execution.
I did not argue that the existence of SEH means that you must swallow all errors.Of course SEH is a general mechanism and not to blame for every idiotic use of it. What I said was that some Windows binaries swallow STATUS_ACCESS_VIOLATION when calling into a function pointer, a true and observable fact, and that this is less than pretty. Note that they may have historical reasons or extenuating circumstances to justify this. Hence "sad but true."
I did not inject any "Windows vs. Unix" rhetoric here. A bad idea is a bad idea on any platform. Trying to recover from SIGSEGV on a Unix-type OS would be equally sketchy.
Dereferencing NULL pointer is an undefined behavior, which can produce almost anything -- a seg.fault, a letter to IRS, or a post to stackoverflow :)
Windows 7 also have its Fault Tollerant Heap (FTH) which sometimes does such things. In my case it was also a NULL-dereference. If you develop on Windows 7 you really want to turn it off!
What is Windows 7's Fault Tolerant Heap?
http://msdn.microsoft.com/en-us/library/dd744764%28v=vs.85%29.aspx
Read about the different kinds of exception handlers here -- they don't catch the same kind of exceptions.
Attach your debugger to all the apps that might call your dll, turn on the feature to break when an excption is thrown not just unhandled in the [debug]|[exceptions] menu.
ODBC is most (if not all) COM as such unhandled exceptions will cause issues, which could appear as exiting the ODBC function strangely or as bad as it hang and never return.
How can I specify a separate function which will be called automatically at Run-Time error to prevent program crash?
Best thing as mentioned already is to identify the area where the crash ic occuring and then fix the piece of code. This is the idealistic approach.
In case you are unable to find that out another alternative is to do structured exception handling in the areas where you suspect crashes to occur. Once the crash occurs you capture what ever data you want and process it. Meanwhile you can change the settings in windows service manager to restart your application whenever it crashes. Hope this answers your question.
Also in case you are looking for methods to capture the crash and analyse debugdiag and windbg are some of the standard tools that people use to take the crash dumps.
If your in Windows you need to write your own custom runtime check handler. Use:
_RTC_SetErrorFunc
to install your custom function in place of
_CrtDbgReport.
Here is a good article on how to do it:
http://msdn.microsoft.com/en-us/library/40ky6s47(VS.71).aspx
If by run-time error you mean an unhandled exception, don't do it. If an unhanded exception propagates through your code, you want your app to crash. Maybe create a dumpfile on the way down, sure. But the last thing you want to do is catch the exception, do nothing, and continue running as if nothing had happened in the first place.
When an exception is generated, whatever caused the problem could have had other effects. Like blowing the stack or corrupting the heap. If you were to silently ignore these exceptions and continue running anyway, your app might run for a while and seem OK, but something underneath is unstable. Memory could be corrupt, resources might not be available, who knows. You will eventually crash. Bu ignoring unhandled exceptions and running anyway, all you do is delay the inevitable, and make it that much more difficult to diagnose the real problem, because when you do crash, the stack will be in a totally different and probably unrelated place from what caused the initial problem.
Background
I have an application with a Poof-Crash[1]. I'm fairly certain it is due to a blown stack.
The application is Multi-Threaded.
I am compiling with "Enable C++ Exceptions: Yes With SEH Exceptions (/EHa)".
I have written an SE Translator function and called _set_se_translator() with it.
I have written functions for and setup set_terminate() and set_unexpected().
To get the Stack Overflow, I must run in release mode, under heavy load, for several days. Running under a debugger is not an option as the application can't perform fast enough to achieve the runtime necessary to see the issue.
I can simulate the issue by adding infinite recursion on execution of one of the functions, and thus test the catching of the EXCEPTION_STACK_OVERFLOW exception.
I have WinDBG setup as the crash dump program, and get good information for all other crash issues but not this one. The crash dump will only contain one thread, which is 'Sleep()'ing. All other threads have exited.
The Question
None of the things I've tried has resulted in picking up the EXCEPTION_STACK_OVERFLOW exception.
Does anyone know how to guarantee getting a a chance at this exception during runtime in release mode?
Definitions
Poof-Crash: The application crashes by going "poof" and disappearing without a trace.
(Considering the name of this site, I'm kind of surprised this question isn't on here already!)
Notes
An answer was posted briefly about adjusting the stack size to potentially force the issue sooner and allow catching it with a debugger. That is a clever thought, but unfortunately, I don't believe it would help. The issue is likely caused by a corner case leading to infinite recursion. Shortening the stack would not expose the issue any sooner and would likely cause an unrelated crash in validly deep code. Nice idea though, and thanks for posting it, even if you did remove it.
Everything prior to windows xp would not (or would be harder) generally be able to trap stack overflows. With the advent of xp, you can set vectored exception handler that gets a chance at stack overflow prior to any stack-based (structured exception) handlers (this is being the very reason - structured exception handlers are stack-based).
But there's really not much you can do even if you're able to trap such an exception.
In his blog, cbrumme (sorry, do not have his/her real name) discusses a stack page neighboring the guard page (the one, that generates the stack overflow) that can potentially be used for backout. If you can squeeze your backout code to use just one stack page - you can free as much as your logic allows. Otherwise, the application is pretty much dead upon encountering stack overflow. The only other reasonable thing to do, having trapped it, is to write a dump file for later debugging.
Hope, it helps.
I'm not convinced that you're on the right track in diagnosing this as a stack overflow.
But in any case, the fact that you're getting a poof!, plus what you're seeing in WinDbg
The crash dump will only contain one thread, which is 'Sleep()'ing. All other threads have exited.
suggests to me that somebody has called the C RTL exit() function, or possibly called the Windows API TerminateProcess() directly. That could have something to do with your interrupt handlers or not. Maybe something in the exception handling logic has a re-entrance check and arbitrarily decides to exit() if it's reentered.
My suggestion is to patch your executables to put maybe an INT 3 debug at the entry point to exit (), if it's statically linked, or if it's dynamically linked, patch up the import and also patch up any imports of kernel32::TerminateProcess to throw a DebugBreak() instead.
Of course, exit() and/or TerminateProcess() may be called on a normal shutdown, too, so you'll have to filter out the false alarms, but if you can get the call stack for the case where it's just about to go proof, you should have what you need.
EDIT ADD: Just simply writing your own version of exit() and linking it in instead of the CRTL version might do the trick.
I remember code from a previous workplace that sounded similar having explicit bounds checks on the stack pointer and throwing an exception manually.
It's been a while since I've touched C++ though, and even when I did touch it I didn't know what I was doing, so caveat implementor about portability/reliability of said advice.
Have you considered ADPlus from Debugging Tools for Windows?
ADPlus attaches the CDB debugger to a process in "crash" mode and will generate crash dumps for most exceptions the process generates. Basically, you run "ADPlus -crash -p yourPIDhere", it performs an invasive attach and begins logging.
Given your comment above about running under a debugger, I just wanted to add that CDB adds virtually zero overhead in -crash mode on a decent (dual-core, 2GB RAM) machine, so don't let that hold you back from trying it.
You can generate debugging symbols without disabling optimizations. In fact, you should be doing that anyways. It just makes debugging harder.
And the documentation for _set_se_translator says that each thread has its own SE translator. Are you setting one for each thread?
set_unexpected is probably a no-op, at least according to the VS 2005 documentation. And each thread also has its own terminate handler, so you should install that per thread as well.
I would also strongly recommend NOT using SE translation. It takes hardware exceptions that you shouldn't ignore (i.e., you should really log an error and terminate) and turns them into something you can ignore (C++ exceptions). If you want to catch this kind of error, use a __try/__except handler.