When can the problem actually be fixed by catching an exception? - c++

Here's the thing. There's something I don't quite understand about exceptions, and to me they seem like a construct that almost works, but can't be used cleanly.
I have a simple question. When has catching an exception been a useful or necessary component of solving the root cause of the problem? I.e. when have you been able to write code that fixes a problem signaled through an exception? I am looking for factual data, or experience you have had.
Here's what I mean. A normal program does work. If some piece of work can't be completed for reason X, the function responsible for doing the work throws an exception. But who catches the exception? As I see it, there are three reasons you might want to catch an exception:
1. You catch it because you want to change its type and rethrow it. (This happens when you translate mechanical exceptions, such as std::out_of_range, into business exceptions, such as could_not_complete_transaction.)
2. You catch it because you want to log it, or let the user know about the problem, before aborting.
3. You catch it because you actually know how to solve the problem.
It is point 3 that I'm skeptical about. I have never actually caught an exception knowing what to do to solve it. When you get a std::bad_alloc, what are you supposed to do with it? It's not like you can barter with the operating system to get more memory. That's just not something you can fix. And it's not just std::bad_alloc; there are also business-class exceptions that suffer from this. Think about a potential connection_error exception: what can you do to fix this except wait, retry later, and hope it fixes itself?
Now, to be fair, I do know of one case in which code does catch an exception and tries to fix the problem. I know that there are certain Win32 SEH handlers that catch a Stack Overflow exception and try to fix the problem by enlarging the size of the thread stack if it's possible. However, this works because SEH has try-resume semantics, which C++ exceptions don't have (you can't resume at the point the exception occurred).
The main part of the question is over. However, there's another problem I have with exceptions that, to me, seems to be exactly the reason why you don't have catch clauses that fix the problem: the code that catches the exception necessarily has to be coupled with the code that throws it, because, in order to fix the problem, it must have domain-specific knowledge about what caused it. But when some library documents that "if this function fails, an internal_error exception will be thrown", how am I supposed to be able to fix the problem when I don't know how the library works internally?
PS: Please note that this is not an "exceptions vs. error codes" kind of question; I am well aware that error codes suck as an error handling mechanism. They actually suffer from the same problem I have explained for exceptions.

I think your problem is that you equate "solve the problem" with "make the program keep going correctly". That is the wrong way to think of exceptions, or error handling in general.
Error handling code of any kind should not deal with conditions that the program could have fixed internally. That is, error handling logic (like catching exceptions) should not be entered because of programming mistakes.
If the user gives you a non-existent filename, that's not a programming mistake; that's a user-error. You cannot "fix" that without going back to the user and getting an existing file. But exceptions do allow you to undo what you were trying to do, restore the program to a valid state, and then communicate what happened to the user.
A connection_error is similarly not a programming mistake. Unlike the above, it's not necessarily a user error either. It's something that's expected to be able to happen, and different programs will handle it in different ways. Some will want to try again. Others will want to halt and let the user know.
The point is, because there is no one means to handle this condition, it cannot be done by the library. The error must be given to the caller of the library to figure out what to do.
If you have a function that parses integers, and you are given text that doesn't conform to an integer, it's not that function's job to figure out what to do next. The caller needs to be notified that the string they provided is malformed and that something ought to be done.
The caller needs to handle the error.
You don't abort most programs because a file that was supposed to contain integers didn't contain integers. But your parsing function does need to communicate this fact to the caller, and the caller does need to deal with that possibility.
That's what "catching exceptions" is for.
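A minimal sketch of that division of labour, assuming a caller whose policy is to skip malformed lines. The names parse_int and sum_valid_lines are made up for illustration; std::stoi and the exceptions it throws are standard.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical parser: it reports malformed input but does not decide what happens next.
int parse_int(const std::string& text) {
    std::size_t used = 0;
    int value = std::stoi(text, &used);  // throws std::invalid_argument or std::out_of_range
    if (used != text.size())
        throw std::invalid_argument("trailing characters in \"" + text + "\"");
    return value;
}

// Hypothetical caller: its policy is to skip bad lines and count them.
// Another program could just as well stop at the first bad line.
int sum_valid_lines(const std::vector<std::string>& lines, int& skipped) {
    int sum = 0;
    skipped = 0;
    for (const auto& line : lines) {
        try {
            sum += parse_int(line);
        } catch (const std::logic_error&) {  // base class of both exceptions above
            ++skipped;
        }
    }
    return sum;
}
```

The catch block fixes nothing about the input; it only restores the loop to a state where it can carry on and tell someone how many lines were bad.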
Now, unexpected environmental conditions like OOM are a different story. This is not usually external code's fault, but it's also not usually a programming error. And if it is a programming error (i.e., a memory leak), it's not one you can deal with in most cases. P0709 has an entire section on the ability (or lack thereof) of programs to respond to OOM in general. The result is that, even when programs are coded defensively against OOM exceptions, they're usually still broken when they run out of memory.
This is especially true on OSes that don't commit pages to memory until you actually use them.

Here is my take.
There are more reasons to catch exceptions. For example, in a critical application, such as the ones found in power substations, you may catch an exception for which there is no known recovery or solution. In that case you may want a controlled shutdown that protects certain modules and connected embedded systems, instead of just letting the system crash on its own. The latter could be disastrous.
I.e. when have you been able to write code that fixes a problem signaled through an exception?
When you get a std::bad_alloc, what are you supposed to do with it? It's not like you can barter with the operating system to get more memory.
Actually I feel like that was my primary coding style for a while. An example: a system I worked on did not have a huge amount of memory and the system was dedicated, so, it was only my app and nothing else. Whenever I had an out_of_memory type of exception, I'd just kill the older process and open the one with the higher priority. Of course I'd wait for the kill to happen in a controlled fashion.
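A rough sketch of that "free something, then try again" style, assuming the failure surfaces as std::bad_alloc. The helper name retry_after_release and both callables are hypothetical; in the system described above, the release step would be the controlled kill of the lower-priority process.

```cpp
#include <new>

// Hypothetical helper: run `work`; if it runs out of memory, call
// `release_memory` once and try again. A second failure propagates to the caller.
template <typename Work, typename Release>
auto retry_after_release(Work work, Release release_memory) -> decltype(work()) {
    try {
        return work();
    } catch (const std::bad_alloc&) {
        release_memory();  // e.g. drop a cache or stop a lower-priority task
    }
    return work();         // second attempt; std::bad_alloc escapes if it fails again
}
```

This only helps when the program really does own memory it can give back, which is plausible on a dedicated system and much less so on a general-purpose one.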
Think about a potential connection_error exception: what can you do to fix this except wait and retry later and hope it fixes itself?
I'd try to connect through another medium such as Bluetooth, fiber, or a bus. Normally, of course, there would be a primary medium of contact, and the others wouldn't be used unless there is an exception.
But when some library documents that "if this function fails, an internal_error exception will be thrown", how am I supposed to be able to fix the problem when I don't know how the library works internally?
Most often an exception in a dedicated library has different consequences for your system than for the library itself. You may not need to read the library and its internal workings to fix the problem. You just need to study its effect on your software and handle that situation. That's probably the easiest solution. And that is a lot easier to do if the library raises a known exception instead of just crashing or giving gibberish answers.

One obvious thing that came to mind was socket connections.
You try to connect to Server A and the program finds that it can't.
So it tries connecting to Server B.
The other examples regarding user input are equally valid, if not more so.
I admit that seeing something along the lines of
try
{
    connectToServerA();
}
catch (const cantConnectToServer&)
{
    connectToServerB();
}
would look like a bit of a weird pattern in real-world code. It might make more sense if the function takes an address and we iterate through a list of potential addresses.
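The list-of-addresses variant could be sketched roughly like this. Everything here is hypothetical: the cant_connect exception, the Connection type, and the connect callable stand in for whatever the real networking code provides.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type and connection handle, for illustration only.
struct cant_connect : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct Connection { std::string address; };

// Try each address in order and return the first connection that succeeds.
// `connect` is any callable that returns a Connection or throws cant_connect.
template <typename Connect>
Connection connect_to_first_available(const std::vector<std::string>& addresses,
                                      Connect connect) {
    for (const auto& address : addresses) {
        try {
            return connect(address);
        } catch (const cant_connect&) {
            // This server is unreachable; move on to the next candidate.
        }
    }
    throw cant_connect("no server reachable");
}
```

Only when every candidate has failed does the error travel further up, to code that can decide whether to tell the user or shut down.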
Broadly speaking I agree with you: often all you want to do is log the error and terminate. But some systems, which have to be robust and "always on", shouldn't just terminate if they encounter a problem.
Web servers are one obvious example. You don't just terminate because one user's connection falters, because that would drop the session for all the other connected users. There might be parts of the code where raising an exception is the simplest way to deal with such a failure, however.

Related

How to wrap my C code in C++ exception handling?

I have an old C based project, which I would like to port from an Atmel processor to Raspberry Pi.
At the time that it was written, C++ was not an option, and it would be too much effort, almost a rewrite, to convert it all to C++.
Some problems/crashes can't be (easily) caught by C, so sometimes my program will just die, and I would like to send a last-chance cry for help before expiring. No attempt at recovery, and I can even live without details of the error, just so long as I get a message telling me to visit the equipment.
Long story short, I think that I could have better error detection if I had exception handling.
I am thinking of using exception handling as a chance to alert me to go to the device and fetch the complete error log, reset the hardware, etc. C won't always give me that last-gasp chance to do something if my code goes bang.
Since I don't want to do a total C++ rewrite, would it be enough just to wrap main() in try / catch?
Is that technically enough, or do I need to do more?
Other than more detailed error reporting, is there anything to gain by wrapping every (major) function in its own try / catch?
Other than more detailed error reporting, is there anything to gain by wrapping every (major) function in its own try / catch?
Firstly, only catch exceptions where you are in a position to alter the behaviour of the program in response to them (unless you're simply looking to add more contextual information via std::throw_with_nested()).
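For the std::throw_with_nested() case, a small sketch of how the context gets added and read back. The functions read_sensor, poll_device and describe are invented for the example; std::throw_with_nested and std::rethrow_if_nested are the standard facilities.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical low-level step that fails.
void read_sensor() { throw std::runtime_error("sensor timeout"); }

// Adds context; the original exception travels along inside the new one.
void poll_device() {
    try {
        read_sensor();
    } catch (...) {
        std::throw_with_nested(std::runtime_error("poll_device failed"));
    }
}

// Flattens a chain of nested exceptions into "outer: inner".
std::string describe(const std::exception& e) {
    std::string text = e.what();
    try {
        std::rethrow_if_nested(e);  // does nothing if no exception is nested
    } catch (const std::exception& inner) {
        text += ": " + describe(inner);
    }
    return text;
}
```

A top-level handler that catches std::exception can then log describe(e) and get the whole chain in one line.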
Secondly, a C program will not exhibit RAII, so throwing exceptions in this circumstance is likely to leak resources unless you wrap all your handles and memory allocations in smart pointers or RAII-enabled handle classes.
You should do that before you consider adding exception handling.
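As an illustration of what such wrapping might look like for one kind of C handle, here is a sketch around FILE*. The names FileCloser, FilePtr and open_file are assumptions for the example, not part of any existing code base.

```cpp
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter so std::unique_ptr closes the C FILE* on every exit path,
// including stack unwinding caused by an exception.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept { if (f) std::fclose(f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: opens a file or throws. The returned handle cannot leak.
FilePtr open_file(const std::string& path, const char* mode) {
    FilePtr file(std::fopen(path.c_str(), mode));
    if (!file)
        throw std::runtime_error("cannot open " + path);
    return file;
}
```

Existing C functions can still be called with file.get(), so the wrapper can be introduced one call site at a time.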
If the program is likely to be actively maintained into the future, there is probably mileage in doing this. If not, it is probably better to let sleeping dogs lie.

C++: Should I catch all exceptions or let the program crash?

I have a Windows service written in (Visual) C++ with very detailed logging functionality that has often helped me find the cause of errors customers are sometimes experiencing. Basically I check every return value and log what is going on and where errors are coming from.
Ideally, I would like to have the same level of detailed visibility into exceptions (like array out of range, division by zero, and so on). In other words: I want to know exactly where an exception is coming from. For reasons of readability and practicality I do not want to wrap every few lines of code into separate try/catch blocks.
What I have today is one general catch-all that catches everything and logs an error before shutting down the program. This is good from the user's point of view - clean shutdown instead of app crash - but bad for me because I only get a generic message from the exception (e.g. "array out of range") but have no idea where that is coming from.
Wouldn't it be better to remove the catch-all and let the program crash instead? I could direct the customer to have Windows create an application crash dump. With the dump file WinDbg would point me exactly to the position in the code where the exception was thrown.
You can register a custom, vectored exception handler by calling AddVectoredExceptionHandler.
This will get called whenever an exception gets thrown, and in it you can generate a stack trace that you can then save off for logging purposes.
Writing the code to do this is not completely trivial but not rocket surgery either.
I've never personally done it in C++, but I would be surprised if there weren't ready-built libraries that do this available somewhere, if you don't have the time or inclination to do it on your own.
You can throw exceptions with a description of where the error occurred and why:
throw std::string("could not open this file");
If you do not want to write a different description for every possible error, you can use the standard macros __FILE__ and __LINE__:
// Named MY_ERROR because identifiers that start with an underscore and a capital letter are reserved.
#define MY_ERROR (std::string("error in " __FILE__ ":") + std::to_string(__LINE__))
// ...
throw MY_ERROR;
If the source file name and line of the error are not enough and you need more information, for example a stack trace or memory values, your program can generate a debug report. Google Breakpad is a C++ library that allows you to do that in a portable way. The wxDebugReport class from the wxWidgets library is an alternative. On Windows, debug reports may include a minidump file that can be loaded in Visual Studio and allows you to analyse the error in a way similar to debugging.
Wouldn't it be better to remove the catch-all and let the program crash instead?
You can keep the catch-all and:
Write a (more personal) message about the fatal error that occurred, forcing the application to be closed. Do not let the program continue: you don't know what happened, where. Continuing might cause damage to the user's data, follow up errors, etc.
Tell the user to contact you with specifics as to what they did and what happened.
Tell the user to include the log file your application has generated.
If you don't do something like this, then you might just as well remove the catch-all.
For reasons of readability and practicality I do not want to wrap every few lines of code into separate try/catch blocks.
And yet if you want your program to be able to recover, this is exactly what you have to do. What you could do is:
1. Tell the user what happened, perhaps what was wrong with the input that may have caused it. Don't make it sound technical.
2. Save any data the user has entered so their work is not completely lost.
3. You know at which step the failure happened. Undo that step, i.e. throw away objects / data, and go back to the point before the exception.
4. Restore the data saved in step 2 so the user doesn't need to repeat actions all over again.
The point being that your program can return to a valid state.

Should exceptions ever be caught

No doubt exceptions are useful, as they show the programmer where he's using functions incorrectly or where something bad happens in the environment, but is there a real need to catch them?
Uncaught exceptions terminate the program, but you can still see where the problem is. In well-designed libraries every "unexpected" situation actually has a workaround: for example, using map::find instead of map::at, or checking that your int variable is smaller than vector::size before using the index operator.
Why would anyone need to do it (excluding people using libraries that enforce it)? Basically, if you are writing a handler for a given exception, you could just as well write code that prevents it from happening.
Not all exceptions are fatal. They may be unusual and, therefore, "exceptions," but a point higher in the call stack can be implemented to either retry or move on. In this way, exceptions are used to unwind the stack through a nested series of function or method calls to a point in the program which can actually handle the cause of the exception, even if only to clean up some resources, log an error, and continue on as before.
You can't always write code that prevents an exception. Just for an obvious example, consider concurrent code. Let's assume I attempt to verify that i is between (say) 0 and 20, then use i to index into some array. So, I check and i == 12, so I proceed to use it to index into the array. Unfortunately, in between the test and the indexing operation, some other thread added 20 to i, so by the time it's used as an index, it's not in range any more.
The concurrency has led to a race condition, so the attempt at assuring against an exceptional condition has failed. While it's possible to prevent this by (for example) wrapping each such test/use sequence in a critical section (or similar), it's often impractical to do so--first, getting the code correct will often be quite difficult, and second even if you do get it correct, the consequences on execution speed may be unacceptable.
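The critical-section approach could look roughly like this, assuming every thread that writes the index takes the same mutex. SharedIndex and read_at_shared_index are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <mutex>

// Hypothetical shared state: an index that other threads may change.
struct SharedIndex {
    std::mutex m;
    std::size_t i = 0;
};

// The test and the use happen under one lock, so no other thread that also
// locks `m` can change `i` between the range check and the array access.
bool read_at_shared_index(SharedIndex& shared, const std::array<int, 21>& data, int& out) {
    std::lock_guard<std::mutex> lock(shared.m);
    if (shared.i >= data.size())
        return false;
    out = data[shared.i];
    return true;
}
```

The guarantee holds only if all writers cooperate, which is part of why getting this right across a large code base is hard, and the lock is paid for on every access.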
Exceptions also decouple code that detects an exceptional condition from code that reacts to that exceptional condition. This is why exception handling is so popular with library writers. The code in the library doesn't have a clue of the correct way to react to a particular exceptional condition. Just for a really trivial example, let's assume it can't read from a file. Should it print a message to stderr, pop up a MessageBox, or write to a log?
In reality, it should do none of these. At least two (and possibly all three) will be wrong for any given program. So, what it should do is throw an exception, and let code at a higher level determine the appropriate way to respond. For one program it may make sense to log the error and continue with other work, but for another the file may be sufficiently critical that its only reasonable reaction is to abort execution entirely.
Exceptions are very expensive performance-wise when they are thrown; thus, wherever performance matters you will want to write exception-free code (using "plain C" techniques for error propagation).
However, if performance is not of immediate concern, then exceptions allow you to develop less cluttered code, as error handling can be postponed (but then you will have to deal with non-local transfer of control, which may be confusing in itself).
I have used exceptions extensively as a method to transfer control to specific positions depending on event handling.
Exceptions may also be a method to transfer control to a "labeled" position along the tree of calling functions.
When an exception happens, the code may be thought of as backtracking one level at a time, checking whether that level has an active exception handler and executing it.
The real problem with exceptions is that you don't really know where these will happen.
The code that runs into an exception usually doesn't know why there is a problem, so quickly returning to a known state is a good action.
Let's take an example: you are in Venice, walking through small streets while looking at the map, and at some moment you arrive somewhere that you aren't able to find on the map.
Essentially you are confused and you don't understand where you are.
If you have Ariadne's "μιτος" (thread), you may go back to a known point and try again to get where you want.
I think you should treat error handling only as a control structure that allows going back to any signaled level (by the error handling routine and the error code).

How should a software product handle an access violation

We have a software product in C++ that, due to documented problems with the compiler, generates faulty code (yes, I know it is horrible in itself). This, among other bugs, causes Access Violations to be thrown.
Our response to that is to catch the error and continue running.
My question is, is this a responsible approach? Is it responsible to let an application live when it has failed so disastrously? Would it be more responsible to alert the user and die?
Edit:
One of the arguments for leaving the exception unhandled is that an Access Violation shows that the program was prevented from doing harm, and probably hasn't done any either. I am not sure if I buy that. Are there any views on this?
I'm with Ignacio: It's imperative to get a fix for that compiler ASAP, or if such a fix is not forthcoming, to jump ship. Naturally there may be barriers to doing so, and I'm guessing you're looking for a short-term solution en route to achieving that goal. :-)
If the faulty code problem is not very narrowly constrained to a known, largely harmless situation, then I'd tend to think continuing to produce and ship the product with the faulty code could be considered irresponsible, regardless of how you handle the violation.
If it's a very narrowly constrained, known situation, how you handle it depends on the situation. You seem to know what the fault is, so you're in the position to know whether you can carry on in the face of that fault or not. I would tend to lean toward report and exit, but again, it totally depends on what the fault actually is.
Really it goes without saying, but it's irresponsible to act like the program did something it didn't (when it should have set some value somewhere which actually was a dangling pointer), or didn't do something it shouldn't have (when it randomizes some variable somewhere unfortunate enough to be the destination of a dangling pointer).
Damage minimization/mitigation strategies might be to checksum files (but not in a trivial way; actually verify that untouched data within the file is unmodified) and auto-save often.
Do you think the customer is aware of the problem?

To catch or not to catch

Should an application catch "bad" signals such as SIGSEGV and SIGBUS?
Those signals are produced in "should never happen" circumstances, when your program is in an undefined state. If you did catch them, continuing execution would be extremely problematic, as it would almost certainly cause more, possibly even more severe, errors. Also, if you don't catch them, the OS may be able to do things like produce useful diagnostics such as core dumps. So I would say "no", unless you don't want the core dump, and your error handling does something very simple such as write to a log and terminate.
Only if you have something more meaningful to do than the default action. You can't do very much more than abort quite rapidly, but sometimes trying to save the current work is adequate. But be careful not to overwrite existing files: users don't like having good files, even outdated ones, replaced with garbage.
No, you should not. I know it's tempting. But there are only very few reasons why you would ever want to catch fatal signals such as SIGSEGV and SIGBUS.
One of the few exceptions might be to have some extra signalling/postmortem code which tells that your program has failed. Even this should only be done in controlled environments, not in code that ships to hundreds of thousands of users.
You have to be prepared for your postmortem code itself to crash, though, because SIGSEGV and SIGBUS are signs of defective code or data.
There can be situations where you must catch signals like SIGSEGV and SIGBUS. One such example: your pointer points to an mmapped region of memory, you are doing *ptr=x, and this memory address belongs to a network file while your network is throwing errors. At this point the only way to do error checking is to catch the signal and retry or do something else.