Why catch an Access Violation? - c++

Unlike C++ exceptions, an Access Violation indicates your applications runtime is compromised and the state of your application is, therefore, undefined. The best thing to do in that circumstance is to exit your application (generally done for you because it crashes).
I note that it is possible to catch one of these exceptions. In Microsoft Visual C++, for example, you can use /EHa or __try/__catch to do so.
So, what are the reasons you would want to catch them? As I understand it, there's no way for your application to recover.

You can recover from an access violation.
For example, you can create a dynamic array by allocating some address space with VirtualAlloc, but marking the memory it points at as not-present. Then, when you attempt to use some memory, you catch the access violation, map a page of memory where the access took place, then re-try the instruction that caused the violation.

One reason might be to write a crash dump file; you would have more control over it and be able to write the exact type you wanted. In Windows, for example, you could call MiniDumpWriteDump to do this.

Your app is not guaranteed to be stable after all access violations. But it can be stable or recover after some, so catching the access violation gives you the opportunity to:
Inform the user something went bad
Let the user attempt to save work
Attempt to recover
Log diagnostic information
Exit in the way you want, rather than disappearing.
A typical example of this is a host application that catches exceptions from a plugin. This way the host app (Photoshop, for example) can tell the user that "Plugin X crashed, and Photoshop is unstable... you should save your work, and restart Photoshop."
Note that this is different than C++ exception handling, which does not indicate unrecoverable errors at all, but is more of a stack unwinding feature.

I think it would be better to gracefully crash and let the user know what's going on, rather than just having the application disappear.

If your application gets an access violation you may or may not be able to recover. If your application is left in an undefined state, you clobber your stack for instance, you're done.
Read from a runaway pointer probably won't corrupt your application and from that you likely can recover.
Either way, the fact that it is occurring indicates a bug in your code so instead of trying to recover you should catch the error and dump what state you can to help you debug the problem.

Related

Productive use of SIGSEGV

Are there many productive uses of handling SIGSEGV, other than a last ditch "something bad happened"?
From the SIGSEGV wiki page, debuggers use it to catch errors in a user's program and inform the user of what happened. The way that I see it, it's a way to query the virtual memory system, and since we have a virtual memory system, I feel like SIGSEGV could be used in a more productive way. One thing I thought of was that you could have a stack somewhere, try to put things onto it, and then when you catch a SIGSEGV, increase the size of the stack, and retry the operation. I feel like this could be useful if, for example, you are processing messages and yet don't know the size of your messages but due to speed, you want to be optimistic about writing them to memory.
Is this a legitimate use of the signal? Are there other ways to incorporate it into your software design as a method of control flow and execution rather than as a last ditch method of error handling?
Debuggers don't "use" SIGSEGV to catch faults in a user program. Debuggers trap the event so that you (the programmer) can see what has happened.
You really don't want to use SIGSEGV like that. It would be incredibly slow. When a memory violation happens, the O/S has to check if it is a valid access to memory that is currently paged out of the system, as opposed to something you shouldn't be accessing.
You'd have to make similar checks after the operating system had decided you shouldn't be doing that, to check whether or not you were accessing the wrong place on the stack, and decide whether or not it was because the stack needed expanding. Moreover it is unlikely you'd be able to return to the instruction that caused the access violation to re-execute it. You normally have to be the O/S to do that.
If it did work, the code needed would be highly compiler, O/S and architecture specific.
Basically, don't do it.
A typical usage is to catch access to specific memory locations.
You set up memory access rights with mprotect, then you handle the fault, give access, and continue the execution. This can be used for debugging, creative logging, custom i/o-memory mapping, ...
The productive uses of SIGSEGV and other signals are typically limited in production environments due to portability concerns. The list of things you can do safely in a signal handler is pretty short, and the interaction between threads and signals is largely unspecified.
For this reason, the use of SIGSEGV is usually limited to debuggers and similar programs that have to run unknown code without changing it.
In the production environment context it is useful to handle SIGSEGV - to log a message and handle the shutdown gracefully (shutdown network connections etc, perhaps having a restart).

Why won't my code segfault on Windows 7?

This is an unusual question to ask but here goes:
In my code, I accidentally dereference NULL somewhere. But instead of the application crashing with a segfault, it seems to stop execution of the current function and just return control back to the UI. This makes debugging difficult because I would normally like to be alerted to the crash so I can attach a debugger.
What could be causing this?
Specifically, my code is an ODBC Driver (ie. a DLL). My test application is ODBC Test (odbct32w.exe) which allows me to explicitly call the ODBC API functions in my DLL. When I call one of the functions which has a known segfault, instead of crashing the application, ODBC Test simply returns control to the UI without printing the result of the function call. I can then call any function in my driver again.
I do know that technically the application calls the ODBC driver manager which loads and calls the functions in my driver. But that is beside the point as my segfault (or whatever is happening) causes the driver manager function to not return either (as evidenced by the application not printing a result).
One of my co-workers with a similar machine experiences this same problem while another does not but we have not been able to determine any specific differences.
Windows has non-portable language extensions (known as "SEH") which allow you to catch page faults and segmentation violations as exceptions.
There are parts of the OS libraries (particularly inside the OS code that processes some window messages, if I remember correctly) which have a __try block and will make your code continue to run even in the face of such catastrophic errors. Likely you are being called inside one of these __try blocks. Sad but true.
Check out this blog post, for example: The case of the disappearing OnLoad exception – user-mode callback exceptions in x64
Update:
I find it kind of weird the kind of ideas that are being attributed to me in the comments. For the record:
I did not claim that SEH itself is bad.I said that it is "non-portable", which is true. I also claimed that using SEH to ignore STATUS_ACCESS_VIOLATION in user mode code is "sad". I stand by this. I should hope that I had the nerve to do this in new code and you were reviewing my code that you would yell at me, just as if I wrote catch (...) { /* Ignore this! */ }. It's a bad idea. It's especially bad for access violation because getting an AV typically means your process is in a bad state, and you shouldn't continue execution.
I did not argue that the existence of SEH means that you must swallow all errors.Of course SEH is a general mechanism and not to blame for every idiotic use of it. What I said was that some Windows binaries swallow STATUS_ACCESS_VIOLATION when calling into a function pointer, a true and observable fact, and that this is less than pretty. Note that they may have historical reasons or extenuating circumstances to justify this. Hence "sad but true."
I did not inject any "Windows vs. Unix" rhetoric here. A bad idea is a bad idea on any platform. Trying to recover from SIGSEGV on a Unix-type OS would be equally sketchy.
Dereferencing NULL pointer is an undefined behavior, which can produce almost anything -- a seg.fault, a letter to IRS, or a post to stackoverflow :)
Windows 7 also have its Fault Tollerant Heap (FTH) which sometimes does such things. In my case it was also a NULL-dereference. If you develop on Windows 7 you really want to turn it off!
What is Windows 7's Fault Tolerant Heap?
http://msdn.microsoft.com/en-us/library/dd744764%28v=vs.85%29.aspx
Read about the different kinds of exception handlers here -- they don't catch the same kind of exceptions.
Attach your debugger to all the apps that might call your dll, turn on the feature to break when an excption is thrown not just unhandled in the [debug]|[exceptions] menu.
ODBC is most (if not all) COM as such unhandled exceptions will cause issues, which could appear as exiting the ODBC function strangely or as bad as it hang and never return.

Is it not possible to make a C++ application "Crash Proof"?

Let's say we have an SDK in C++ that accepts some binary data (like a picture) and does something. Is it not possible to make this SDK "crash-proof"? By crash I primarily mean forceful termination by the OS upon memory access violation, due to invalid input passed by the user (like an abnormally short junk data).
I have no experience with C++, but when I googled, I found several means that sounded like a solution (use a vector instead of an array, configure the compiler so that automatic bounds check is performed, etc.).
When I presented this to the developer, he said it is still not possible.. Not that I don't believe him, but if so, how is language like Java handling this? I thought the JVM performs everytime a bounds check. If so, why can't one do the same thing in C++ manually?
UPDATE
By "Crash proof" I don't mean that the application does not terminate. I mean it should not abruptly terminate without information of what happened (I mean it will dump core etc., but is it not possible to display a message like "Argument x was not valid" etc.?)
You can check the bounds of an array in C++, std::vector::at does this automatically.
This doesn't make your app crash proof, you are still allowed to deliberately shoot yourself in the foot but nothing in C++ forces you to pull the trigger.
No. Even assuming your code is bug free. For one, I have looked at many a crash reports automatically submitted and I can assure you that the quality of the hardware out there is much bellow what most developers expect. Bit flips are all too common on commodity machines and cause random AVs. And, even if you are prepared to handle access violations, there are certain exceptions that the OS has no choice but to terminate the process, for example failure to commit a stack guard page.
By crash I primarily mean forceful termination by the OS upon memory access violation, due to invalid input passed by the user (like an abnormally short junk data).
This is what usually happens. If you access some invalid memory usually OS aborts your program.
However the question what is invalid memory... You may freely fill with garbage all the memory in heap and stack and this is valid from OS point of view, it would not be valid from your point of view as you created garbage.
Basically - you need to check the input data carefully and relay on this. No OS would do this for you.
If you check your input data carefully you would likely to manage the data ok.
I primarily mean forceful termination
by the OS upon memory access
violation, due to invalid input passed
by the user
Not sure who "the user" is.
You can write programs that won't crash due to invalid end-user input. On some systems, you can be forcefully terminated due to using too much memory (or because some other program is using too much memory). And as Remus says, there is no language which can fully protect you against hardware failures. But those things depend on factors other than the bytes of data provided by the user.
What you can't easily do in C++ is prove that your program won't crash due to invalid input, or go wrong in even worse ways, creating serious security flaws. So sometimes[*] you think that your code is safe against any input, but it turns out not to be. Your developer might mean this.
If your code is a function that takes for example a pointer to the image data, then there's nothing to stop the caller passing you some invalid pointer value:
char *image_data = malloc(1);
free(image_data);
image_processing_function(image_data);
So the function on its own can't be "crash-proof", it requires that the rest of the program doesn't do anything to make it crash. Your developer also might mean this, so perhaps you should ask him to clarify.
Java deals with this specific issue by making it impossible to create an invalid reference - you don't get to manually free memory in Java, so in particular you can't retain a reference to it after doing so. It deals with a lot of other specific issues in other ways, so that the situations which are "undefined behavior" in C++, and might well cause a crash, will do something different in Java (probably throw an exception).
[*] let's face it: in practice, in large software projects, "often".
I think this is a case of C++ codes not being managed codes.
Java, C# codes are managed, that is they are effectively executed by an Interpreter which is able to perform bound checking and detect crash conditions.
With the case of C++, you need to perform bound and other checking yourself. However, you have the luxury of using Exception Handling, which will prevent crash during events beyond your control.
The bottom line is, C++ codes themselves are not crash proof, but a good design and development can make them to be so.
In general, you can't make a C++ API crash-proof, but there are techniques that can be used to make it more robust. Off the top of my head (and by no means exhaustive) for your particular example:
Sanity check input data where possible
Buffer limit checks in the data processing code
Edge and corner case testing
Fuzz testing
Putting problem inputs in the unit test for regression avoidance
If "crash proof" only mean that you want to ensure that you have enough information to investigate crash after it occurred solution can be simple. Most cases when debugging information is lost during crash resulted from corruption and/or loss of stack data due to illegal memory operation by code running in one of threads. If you have few places where you call library or SDK that you don't trust you can simply save the stack trace right before making call into that library at some memory location pointed to by global variable that will be included into partial or full memory dump generated by system when your application crashes. On windows such functionality provided by CrtDbg API.On Linux you can use backtrace API - just search doc on show_stackframe(). If you loose your stack information you can then instruct your debugger to use that location in memory as top of the stack after you loaded your dump file. Well it is not very simple after all, but if you haunted by memory dumps without any clue what happened it may help.
Another trick often used in embedded applications is cycled memory buffer for detailed logging. Logging to the buffer is very cheap since it is never saved, but you can get idea on what happen milliseconds before crash by looking at content of the buffer in your memory dump after the crash.
Actually, using bounds checking makes your application more likely to crash!
This is good design because it means that if your program is working, it's that much more likely to be working /correctly/, rather than working incorrectly.
That said, a given application can't be made "crash proof", strictly speaking, until the Halting Problem has been solved. Good luck!

Unhandled Win32 exception

At runtime, when myApp.exe crashes i receive "Unhandled Win32 exception" but how would i know which exception was occurred? where did something went wrong?
For a Native C++ app see my earlier answer here: Detect/Redirect core dumps (when a software crashes) on Windows for catching the unhandled exception (that also gives code for creating a crash dump that you can use to analyse the crash later. If the crash is happening on a development system then in Visual Studio (I'm assuming you're using that, if not other IDEs will have something similar), in Debug/Exceptions tick the 'Thrown' box for 'Win32 Exceptions'.
Typically, Windows will give you several hexadecimal numbers as well. Chances are that the exception code will be 0xC0000005. This is the code of an Access Violation. When that happens, you also will have three additional bits of information: the violating address, the violated address, and the type of violation (read, write or execute).
Windows won't narrow it down any further than that, and often it couldn't anyway. For instance, if you walk past the end of an array in your program, Windows likely won't realize that you were even iterating over an array. It just sees "read:OK, read:OK, read:out of bounds => page fault => ACCESS VIOLATION". You will have to figure that out from the violating address (your array iteration code) and the violated address (the out-of-bounds address behind your array).
If it's a .Net app you could try to put in a handledr for the UnhandledException event. You can find more information about it and some sample code here.
In general it's a good sign that your exception handling is broken though, so might be worth going through your code and finding places that could throw but where you don't handle exceptions.
Use the debugger. You can run the program and see what exception is been thrown that kills your application. It might be able to pinpoint the location of the throw. I have not used the VS debugger for this, but in gdb you can use catch throw to force a breakpoint when an exception is thrown, there should be something similar.

run-time error and program crash

How can I specify a separate function which will be called automatically at Run-Time error to prevent program crash?
Best thing as mentioned already is to identify the area where the crash ic occuring and then fix the piece of code. This is the idealistic approach.
In case you are unable to find that out another alternative is to do structured exception handling in the areas where you suspect crashes to occur. Once the crash occurs you capture what ever data you want and process it. Meanwhile you can change the settings in windows service manager to restart your application whenever it crashes. Hope this answers your question.
Also in case you are looking for methods to capture the crash and analyse debugdiag and windbg are some of the standard tools that people use to take the crash dumps.
If your in Windows you need to write your own custom runtime check handler. Use:
_RTC_SetErrorFunc
to install your custom function in place of
_CrtDbgReport.
Here is a good article on how to do it:
http://msdn.microsoft.com/en-us/library/40ky6s47(VS.71).aspx
If by run-time error you mean an unhandled exception, don't do it. If an unhanded exception propagates through your code, you want your app to crash. Maybe create a dumpfile on the way down, sure. But the last thing you want to do is catch the exception, do nothing, and continue running as if nothing had happened in the first place.
When an exception is generated, whatever caused the problem could have had other effects. Like blowing the stack or corrupting the heap. If you were to silently ignore these exceptions and continue running anyway, your app might run for a while and seem OK, but something underneath is unstable. Memory could be corrupt, resources might not be available, who knows. You will eventually crash. Bu ignoring unhandled exceptions and running anyway, all you do is delay the inevitable, and make it that much more difficult to diagnose the real problem, because when you do crash, the stack will be in a totally different and probably unrelated place from what caused the initial problem.