C++ Boost iostreams, error handling

Is it possible to make a custom stream work like the standard ones with regard to errors? That is, by default use the good/fail/bad/eof bits rather than exceptions?
The Boost docs only mention throwing std::ios_base::failure for stream errors and letting other errors propagate (e.g. a std::bad_alloc from trying to allocate a buffer). However, the Boost code does not seem to catch these, instead relying on the user code to handle them, but all my existing code relies on the good(), bad(), etc. methods and the clear() method in cases where it needs to try again after an error.

From http://www.trip.net/~bobwb/cppnotes/lec08.htm
The error state can be set using:
void clear(iostate = 0);
The default value of zero results in ios_base::goodbit being set.
clear();
is therefore equivalent to
clear(0);
which is equivalent to
clear(ios_base::goodbit);
Note that ios_base::goodbit has the value zero. clear() might be used to set one of the other bits as part of a programmer's code for operator>>() for a particular object. For example:
if (bad_char) is.clear(ios_base::badbit); // set istream's badbit
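For reference, here is a minimal sketch of the bit-based handling the question wants to preserve - failure checked via the stream's state, recovery via clear() - using only standard streams:
#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::istringstream in("abc 42");

    int value = 0;
    if (!(in >> value)) {            // "abc" is not an int: failbit is set
        in.clear();                  // back to goodbit so we can try again
        std::string junk;
        in >> junk;                  // consume the offending token
        in >> value;                 // retry: reads 42
    }
    std::cout << value << '\n';      // prints 42
}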

Related

In C++, Is it possible to force the user to catch exceptions?

In short, is it possible to get C++ to force the invoker of a method to put a try...catch block?
(To clarify:
I don't necessarily mean the immediate invoker, I mean forcing the fact that it's caught somewhere. Also, I'm talking about forcing at compile time.)
The long:
I've read that it's not recommended to use exception specifications and that they don't work properly anyway (http://4thmouse.com/mystuff/articles/UsingExceptionsEffectively.html)
But the general consensus seems to favor the use of exceptions to return errors over writing methods that return error codes.
So if I'm writing say a library, what's to stop the user from calling my method without putting any try...catch blocks, and then having his program crash when my code throws an exception?
(To be clear, I only require the exception to be caught somewhere in the user's stack, not necessarily in the immediate calling code, and the compiler to complain if this is not the case.)
No, it is not.
Indeed, there is no mechanism to force the caller of a function (anywhere in the call stack) to handle any kind of error. At least, not via a compilation failure. Return values can be discarded. Even bundling error codes with return values (via expected<T, E>) doesn't issue a compile-time error if the user doesn't actually check to see if the value is available before fetching it.
C++17 may give us the [[nodiscard]] attribute, which allows compilers to issue a warning if a return value (presumably an error code) is discarded by the caller. But a compile-time warning will be as close as you can get.
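A minimal sketch of that attribute in use (Status and load_config are made-up names, not a real API):
#include <cstdio>

// Hypothetical error-code API; [[nodiscard]] (C++17) makes compilers warn
// when a caller simply drops the returned code.
enum class Status { ok, not_found };

[[nodiscard]] Status load_config(const char* path) {
    return path ? Status::ok : Status::not_found;   // stand-in implementation
}

int main() {
    load_config("app.cfg");                     // typically warns: result discarded
    if (load_config("app.cfg") != Status::ok)   // the intended usage
        std::puts("could not load config");
}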
In short, is it possible to get C++ to force the invoker of a method
to put a try...catch block?
No. This would defeat the whole purpose of exceptions. Exceptions are specifically made for the use case of propagating errors across multiple layers without the intermediate layers being aware of them.
Let's say you have a call hierarchy like A -> B -> C -> D -> E, and an error occurs in E. A can handle the error. B, C and D do not need to be aware of the error at all. This is exactly what exceptions are good for!
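A toy illustration of that call chain (the function names and error text are made up):
#include <iostream>
#include <stdexcept>

// Only E fails, only A handles the failure; the layers in between stay unaware.
void E() { throw std::runtime_error("disk full"); }
void D() { E(); }
void C() { D(); }
void B() { C(); }

void A() {
    try {
        B();
    } catch (const std::runtime_error& e) {
        std::cerr << "recovering from: " << e.what() << '\n';
    }
}

int main() { A(); }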
If you want to return an error directly to the caller because handling the error is indeed the caller's concern, then an exception is often the wrong design and a return value might be the better choice.
"Enforced" exceptions of a certain form have been tried in Java, but I'd consider it a failed experiment, as it usually results in code like this:
try {
    method();
} catch (SomeCheckedException ex) {
    // ignore
}
That C++ does not encourage this should be considered a feature.
I've read that it's not recommended to use exception specifications and that they don't work properly anyway
Exactly. The only exception specification which was ever useful and which worked was throw() to signal that no exception at all is thrown, and that one has been superseded in C++11 by noexcept.
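For reference, the replacement looks like this (clear_all is just an example name):
#include <vector>

// The old "throws nothing" specification, removed in C++20:
//     void clear_all(std::vector<int>& v) throw();
// and its C++11 replacement:
void clear_all(std::vector<int>& v) noexcept {
    v.clear();   // does not throw
}

int main() {
    std::vector<int> v{1, 2, 3};
    clear_all(v);
}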
But the general consensus seems to favor the use of exceptions to return errors over writing methods that return error codes.
See above. It depends on whether you want an error to propagate or if the caller can and should handle it.
So if I'm writing say a library, what's to stop the user from calling my method without putting any try...catch blocks, and then having his program crash when my code throws an exception?
A library which requires its user to surround all function calls with try blocks has a bad interface and should be redesigned accordingly.
Also... you assume that a "program" will use your library. But this assumption will not always be true. The library client may itself be a library. There may be a lot of different library layers between the program and your library. You use exceptions if you do not care which layer handles them.
There's a general consensus? Not that I'm aware of. As for the exceptions, no. The compiler cannot enforce that somebody catches the exception somewhere up the call stack. At compile time, the compiler has no idea who may be calling your function, and your function may throw any arbitrary exception, as may any function that your function calls. The linker might have a chance, but it would have to maintain a lot of extra information dealing with what exceptions a function may throw, as well as what exceptions a function may catch. This gets even uglier when you start to talk about dynamically loaded libraries (DLL/.so) as that would have to get resolved at runtime.

Can exceptions in a custom implementation of the std::streambuf be delivered to the stream user?

Is this standard-mandated behavior: when an exception is thrown in a custom std::streambuf method (e.g. xsgetn), it is caught (some status bits are set) but not rethrown? Is there any method to alter this (or to pass the error message along without some dirty tricks)?
There is a general guarantee that things like operator<< and operator>> will not raise an exception unless asked to. You can ask it to by specifying the exception mask:
stream.exceptions( std::ios_base::badbit );
(You can specify any or all of the error conditions here, but badbit is probably the only one you'd ever want.) If this is set, and a streambuf function exits through an exception, that exception will be rethrown.
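A small sketch of that behaviour, using a hypothetical streambuf whose underflow() throws:
#include <iostream>
#include <istream>
#include <streambuf>
#include <stdexcept>

// Made-up buffer whose read operation fails by throwing.
class ThrowingBuf : public std::streambuf {
protected:
    int_type underflow() override {
        throw std::runtime_error("device read failed");
    }
};

int main() {
    ThrowingBuf buf;
    std::istream in(&buf);

    // Default behaviour: the exception is caught by the stream,
    // badbit is set, nothing is rethrown.
    in.get();
    std::cout << std::boolalpha << in.bad() << '\n';   // true

    // Opt in to rethrowing by putting badbit in the exception mask.
    in.clear();
    in.exceptions(std::ios_base::badbit);
    try {
        in.get();
    } catch (const std::runtime_error& e) {
        std::cout << "caught: " << e.what() << '\n';
    }
}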
Another possibility is to maintain information concerning the error in the streambuf, with a function to extract it. Then, when the client code detects an error, it can use dynamic_cast<MyStreambuf*>( stream.rdbuf() ) to get the derived streambuf, and call its member functions to get the error information.
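That second approach might look roughly like this (MyStreambuf and last_error() are made-up names):
#include <iostream>
#include <streambuf>
#include <string>

// Hypothetical buffer that records its last device error instead of throwing.
class MyStreambuf : public std::streambuf {
public:
    const std::string& last_error() const { return last_error_; }
protected:
    int_type underflow() override {
        last_error_ = "device read failed";   // stand-in for a real error source
        return traits_type::eof();            // the stream will set eofbit/failbit
    }
private:
    std::string last_error_;
};

int main() {
    MyStreambuf buf;
    std::istream in(&buf);
    int x = 0;
    if (!(in >> x)) {
        if (auto* b = dynamic_cast<MyStreambuf*>(in.rdbuf()))
            std::cerr << "error: " << b->last_error() << '\n';
    }
}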

Is GetLastError() a kind of design pattern? Is it a good mechanism?

Windows APIs use the GetLastError() mechanism to retrieve information about an error or failure. I am considering the same mechanism to handle errors as I am writing APIs for a proprietary module. My question is: is it better for an API to return the error code directly instead? Does GetLastError() have any particular advantage? Consider the simple Win32 API example below:
HANDLE hFile = CreateFile(sFile,
                          GENERIC_WRITE, FILE_SHARE_READ,
                          NULL, CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
if (hFile == INVALID_HANDLE_VALUE)
{
    DWORD lrc = GetLastError();
    if (lrc == ERROR_FILE_EXISTS)
    {
        // msg box and so on
    }
}
As I was writing my own APIs I realized the GetLastError() mechanism means that CreateFile() must set the last error code at all exit points. This can be a little error prone if there are many exit points and one of them may be missed. Dumb question, but is this how it is done, or is there some kind of design pattern for it?
The alternative would be to provide an extra parameter to the function which can fill in the error code directly, so a separate call to GetLastError() will not be needed. Yet another approach could be as below. I will stick with the above Win32 API, which is a good example to analyze this. Here I am changing the format to this (hypothetically):
result = CreateFile(hFile, sFile,
                    GENERIC_WRITE, FILE_SHARE_READ,
                    NULL, CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
if (result == SUCCESS)
{
    // hFile has correct value, process it
}
else if (result == FILE_ALREADY_EXISTS)
{
    // display message accordingly
    return;
}
else if (result == INVALID_PATH)
{
    // display message accordingly.
    return;
}
My ultimate question is: what is the preferred way to return an error code from an API, or even just a function, since they are both the same?
Overall, it's a bad design. This is not specific to Windows' GetLastError function; Unix systems have the same concept with a global errno variable. It's bad because it's an implicit output of the function. This has a few nasty consequences:
Two functions being executed at the same time (in different threads) may overwrite the global error code. So you may need to have a per-thread error code. As pointed out by various comments to this answer, this is exactly what GetLastError and errno do - and if you consider using a global error code for your API then you'll need to do the same in case your API should be usable from multiple threads.
Two nested function calls may throw away error codes if the outer function overwrites an error code set by the inner.
It's very easy to ignore the error code. In fact, it's harder to actually remember that it's there because not every function uses it.
It's easy to forget setting it when you implement a function yourself. There may be many different code paths, and if you don't pay attention one of them may allow the control flow to escape without setting the global error code correctly.
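To illustrate the per-thread error code mentioned in the first point above, here is a rough sketch of a GetLastError-style facility with thread-local storage; all of the names (myapi, set_last_error, open_file, ...) are made up:
#include <cstdio>

namespace myapi {
    thread_local int last_error = 0;    // one error slot per thread

    void set_last_error(int code) { last_error = code; }
    int  get_last_error()         { return last_error; }

    bool open_file(const char* path) {
        if (path == nullptr) {       // stand-in failure condition
            set_last_error(2);       // e.g. "file not found"
            return false;            // every failing exit point must set the code
        }
        return true;
    }
}

int main() {
    if (!myapi::open_file(nullptr))
        std::printf("failed, error %d\n", myapi::get_last_error());
}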
Usually, error conditions are exceptional. They don't happen very often, but they can. A configuration file you need may not be readable - but most of the time it is. For such exceptional errors, you should consider using C++ exceptions. Any C++ book worth its salt will give a list of reasons why exceptions in any language (not just C++) are good, but there's one important thing to consider before getting all excited:
Exceptions unroll the stack.
This means that when you have a function which yields an exception, it gets propagated to all the callers (until it's caught by someone, possibly the C runtime system). This in turn has a few consequences:
All caller code needs to be aware of the presence of exceptions, so all code which acquires resources must be able to release them even in the face of exceptions (in C++, the 'RAII' technique is usually used to tackle them).
Event loop systems usually don't allow exceptions to escape event handlers. There's no good concept of dealing with them in this case.
Programs dealing with callbacks (plain function pointers for instance, or even the 'signal & slot' system used by the Qt library) usually don't expect that a called function (a slot) can yield an exception, so they don't bother trying to catch it.
The bottom line is: use exceptions if you know what they are doing. Since you seem to be rather new to the topic, rather stick to return codes of functions for now but keep in mind that this is not a good technique in general. Don't go for a global error variable/function in either case.
The GetLastError pattern is by far the most prone to error and the least preferred.
Returning a status code enum is a better choice by far.
Another option which you did not mention, but is quite popular, would be to throw exceptions for the failure cases. This requires very careful coding if you want to do it right (and not leak resources or leave objects in half-set-up states) but leads to very elegant-looking code, where all the core logic is in one place and the error handling is neatly separated out.
I think GetLastError is a relic from the days before multi-threading. I don't think that pattern should be used any more except in cases where errors are extraordinarily rare. The problem is that the error code has to be per-thread.
The other irritation with GetLastError is that it requires two levels of testing. You first have to check the return code to see if it indicates an error and then you have to call GetLastError to get the error. This means you have to do one of two things, neither particularly elegant:
1) You can return a boolean indicating success or failure. But then, why not just return the error code with zero for success?
2) You can have a different return value test for each function based on a value that is illegal as its primary return value. But then what of functions where any return value is legal? And this is a very error-prone design pattern. (Zero is the only illegal value for some functions, so you return zero for error in that case. But where zero is legal, you may need to use -1 or some such. It's easy to get this test wrong.)
I have to say, I think the global error handler style (with proper thread-local storage) is the most realistically applicable when exception-handling cannot be used. This is not an optimal solution for sure, but I think if you are living in my world (a world of lazy developers who don't check for error status as often as they should), it's the most practical.
Rationale: developers just tend to not check error return values as often as they should. How many examples can we point to in real world projects where a function returned some error status only for the caller to ignore it? Or how many times have we seen a function that wasn't even correctly returning error status even though it was, say, allocating memory (something which can fail)? I've seen too many examples like these, and going back and fixing them can sometimes even require massive design or refactoring changes through the codebase.
The global error handler is a lot more forgiving in this respect:
If a function failed to return a boolean or some ErrorStatus type to indicate failure, we don't have to modify its signature or return type to indicate failure and change the client code all over the application. We can just modify its implementation to set a global error status. Granted, we still have to add the checks on the client side, but if we miss an error immediately at a call site, there's still opportunity to catch it later.
If a client fails to check the error status, we can still catch the error later. Granted, the error may be overwritten by subsequent errors, but we still have an opportunity to see that an error occurred at some point whereas calling code that simply ignored error return values at the call site would never allow the error to be noticed later.
While being a sub-optimal solution, if exception-handling can't be used and we're working with a team of code monkeys who have a terrible habit of ignoring error return values, this is the most practical solution as far as I see.
Of course, exception-handling with proper exception-safety (RAII) is by far the superior method here, but sometimes exception-handling cannot be used (ex: we should not be throwing out of module boundaries). While a global error handler like the Win API's GetLastError or OpenGL's glGetError sounds like an inferior solution from a strict engineering standpoint, it's a lot more forgiving to retrofit into a system than to start making everything return some error code and start forcing everything calling those functions to check for them.
If this pattern is applied, however, one must take careful note to ensure it can work properly with multiple threads, and without significant performance penalties. I actually had to design my own thread-local storage system to do this, but our system predominantly uses exception-handling and only this global error handler to translate errors across module boundaries into exceptions.
All in all, exception-handling is the way to go, but if this is not possible for some reason, I have to disagree with the majority of the answers here and suggest something like GetLastError for larger, less disciplined teams (I'd say return errors through the call stack for smaller, more disciplined ones) on the basis that if a returned error status is ignored, this allows us to at least notice an error later, and it allows us to retrofit error-handling into a function that wasn't properly designed to return errors by simply modifying its implementation without modifying the interface.
If your API is in a DLL and you wish to support clients that use a different compiler than you, then you cannot use exceptions. There is no binary interface standard for exceptions.
So you pretty much have to use error codes. But don't model the system using GetLastError as your exemplar. If you want a good example of how to return error codes look at COM. Every function returns an HRESULT. This allows callers to write concise code that can convert COM error codes into native exceptions. Like this:
Check(pIntf->DoSomething());
where Check() is a function, written by you, that receives an HRESULT as its single parameter and raises an exception if the HRESULT indicates failure. It is the fact that the return value of the function indicates status that allows this more concise coding. Imagine the alternative of returning the status via a parameter:
pIntf->DoSomething(&status);
Check(status);
Or, even worse, the way it is done in Win32:
if (!pIntf->DoSomething())
    Check(GetLastError());
On the other hand, if you are prepared to dictate that all clients use the same compiler as you, or you deliver the library as source, then use exceptions.
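For illustration, the Check() helper this answer describes might look something like the following. ComError is an assumed exception type, not something COM provides; HRESULT, FAILED and E_FAIL come from the Windows headers.
#include <stdexcept>
#include <string>
#include <windows.h>

struct ComError : std::runtime_error {
    explicit ComError(HRESULT hr)
        : std::runtime_error("COM call failed, HRESULT " + std::to_string(hr)),
          hr(hr) {}
    HRESULT hr;
};

inline HRESULT Check(HRESULT hr) {
    if (FAILED(hr))
        throw ComError(hr);   // turn the failing status into an exception
    return hr;                // pass the value through on success
}

int main() {
    try {
        Check(E_FAIL);        // simulate a failing COM call
    } catch (const ComError&) {
        return 1;             // handle / report the failure
    }
}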
Exception handling in unmanaged code is not recommended. Dealing with memory leaks without exceptions is a big issue; with exceptions it becomes a nightmare.
A thread-local variable for the error code is not such a bad idea, but as some of the other people said, it is a bit error prone.
I personally prefer every method to return an error code. This creates inconvenience for functional methods because instead of:
int a = foo();
you will need to write:
int a;
HANDLE_ERROR(foo(a));
Here HANDLE_ERROR could be a macro that checks the code returned from foo and if it is an error propagates it up (returning it).
If you prepare a good set of macros to handle different situations, writing code with good error handling but without exception handling becomes possible.
Now, when your project starts growing, you will notice that call stack information for the error is very important. You could extend your macros to store the call stack info in a thread-local storage variable. That is very useful.
Then you will notice that even the call stack is not enough. In many cases an error code for "File not found" at a line that says fopen(path, ...) does not give you enough information to find out what the problem is, namely which file was not found. At this point you could extend your macros to be able to store messages as well, and then you could provide the actual path of the file that was not found.
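A rough sketch of the macro-based propagation described above; ErrorCode, HANDLE_ERROR and foo() are illustrative names, not a real library, and the __FILE__/__LINE__ output hints at where call-site information could be recorded:
#include <cstdio>

enum ErrorCode { OK = 0, OUT_OF_RANGE = 1 };

#define HANDLE_ERROR(call)                                   \
    do {                                                     \
        ErrorCode err_ = (call);                             \
        if (err_ != OK) {                                    \
            std::fprintf(stderr, "error %d at %s:%d\n",      \
                         static_cast<int>(err_),             \
                         __FILE__, __LINE__);                \
            return err_;                                     \
        }                                                    \
    } while (0)

ErrorCode foo(int& out) {     // the result comes back through a parameter,
    out = 42;                 // the return value carries the error code
    return OK;
}

ErrorCode caller() {
    int a = 0;
    HANDLE_ERROR(foo(a));     // propagates foo's error code if it fails
    std::printf("a = %d\n", a);
    return OK;
}

int main() { return caller(); }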
The question is why bother with all of this when you could just use exceptions. Several reasons:
Again, exception handling in unmanaged code is hard to do right
The macro-based code (if done right) happens to be smaller and faster than the code needed for exception handling
It is way more flexible. You can enable or disable features.
In the project that I am working on at the moment I implemented such error handling. It took me 2 days to bring it to a level where I could start using it. And for about a year now I have probably spent about 2 weeks of total time maintaining it and adding features to it.
You should also consider an object/structure-based error code variable, like the stdio C library does for FILE streams.
On some of my I/O objects, for example, I just skip all further operations when the error state is set, so that the user is fine just checking the error once after a sequence of operations.
This pattern allows you to fine-tune the error handling scheme much better.
One of the bad designs of C/C++ comes to full light here when comparing it, for example, with Google's Go language: the return of just one value from a function. Go does not use exceptions; instead, functions conventionally return two values, the result and an error code.
There is a minority of people who think that exceptions are misused and bad most of the time, because errors are not exceptions but something you have to expect, and it hasn't been proved that software becomes more reliable and easier to write with them. Especially in C++, where the only sane way to program nowadays is with RAII techniques.

Alternative to exceptions for methods that return objects

I have a class that can perform many types of transformations to a given object.
All data is supplied by the users (via hand-crafted command files), but for the most part there is no way to tell if the file commands are valid until we start transforming the object (if we checked everything first we'd end up doing the same amount of work as the transformations themselves).
These transformation methods are called on the object and return a newly transformed object. However, if there is a problem we throw an exception, since returning an object (even a blank object) could be confused with success, whereas an exception sends a clear signal that there was a problem and an object cannot be returned. It also means we don't need to call a "get last error" type function to find out exactly what went wrong (error code, message, statement, etc.).
However, from reading numerous answers on SO it seems this is incorrect, since invalid user input is not an exceptional circumstance, and due to the complexity of this thing it's not uncommon for the user to submit incorrect command files.
Performance is not really an issue here because if we hit an error we have to stop and go back to the user.
If I use return codes I'd have to take the output object as a parameter, and then to check the return codes I'd need long nested if blocks (i.e. the way you check HRESULT from COM).
How would I go about avoiding exceptions in this case while still keeping the code reasonably tidy?
The design of your problem really lends itself to exceptions. The fact that program execution will halt or be invalid once the user supplies a bad input file is a sign of an "exceptional circumstance". Sure, a lot of program runs will end in an exception being thrown (and caught), but this is in one execution of the program.
The things you are reading about exceptions being slow when used for every other circumstance, is only valid if the program can recover, and has to recover often (e.g. a compiler that fails to find a header in a search directory, and has to look at the next directory in the list of search directories, which really happens a lot).
Use exceptions. When in doubt, measure if this is killing performance. Cleaner code >> following what people say on SO. But then again, that would be a reason to ignore what I just said.
Well, I wouldn't (avoid exceptions). If you really need an "error reporting" mechanism (and you do), there is not much besides return values and exceptions. I think the whole argument about what is exceptional enough (and therefore deserves exceptions) and what is not, is not that important and shouldn't keep you from solving your problem. Especially if performance isn't an issue. But if you REALLY want to avoid exceptions, you could use some kind of global queue to queue your error information. But that is utterly ugly, if you ask me.

Safely overloading stream operator>>

There's a ton of information available on overloading operator<< to mimic a toString()-style method that converts a complex object to a string. I'm interested in also implementing the inverse, operator>> to deserialize a string into an object.
By inspecting the STL source, I've gathered that:
istream &operator>>(istream &, Object &);
would be the correct function signature for deserializing an object of type Object. Unfortunately, I have been at a loss for how to properly implement this - specifically how to handle errors:
How to indicate invalid data in the stream? Throw an exception?
What state should the stream be in if there is malformed data in the stream?
Should any flags be reset before returning the reference for operator chaining?
How to indicate invalid data in the stream? Throw an exception?
You should set the fail bit. If the user of the stream wants exceptions to be thrown, he can configure the stream (using istream::exceptions), and the stream will throw accordingly. I would do it like this, then:
stream.setstate(ios_base::failbit);
What state should the stream be in if there is malformed data in the stream?
For malformed data that doesn't fit the format you want to read, you usually should set the fail bit. For internal stream specific errors, the bad bit is used (such as, if there is no buffer connected to the stream).
Should any flags be reset before returning the reference for operator chaining?
I haven't heard of such a thing.
For checking whether the stream is in a good state, you can use the istream::sentry class. Create an object of it, passing the stream and true (to tell it not to skip whitespace immediately). The sentry will evaluate to false if the eof, fail or bad bit is set.
istream::sentry s(stream, true);
if(!s) return stream;
// now, go on extracting data...
Some additional notes:
- when implementing operator>>, you should probably consider working with the streambuf directly rather than through other overloads of operator>>;
- exceptions occurring during the operation should be translated into the failbit or the badbit (members of streambuf may throw, depending on the class used);
- setting the state may itself throw; if you set the state after catching an exception, you should propagate the original exception and not the one thrown by setstate;
- width is a field you should pay attention to. If you take it into account, you should reset it to 0. If you use other operator>> overloads to do the basic work, you have to compute the width you pass on from the one you received;
- consider taking the locale into account.
Langer and Kreft (Standard C++ IOStreams and Locales) cover this in even more detail. They give template code for the error handling which takes about one page.
As for flags, I don't know if there is any standard somewhere, but it is a good idea to reset them.
Boost has neat RAII wrappers for that: IO State Savers
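Pulling the advice above together, a sketch of operator>> for a hypothetical Point type might look like this. It reports malformed input via failbit and leaves the target untouched on failure; width handling and exception translation are omitted for brevity:
#include <iostream>
#include <sstream>

struct Point { int x = 0, y = 0; };

std::istream& operator>>(std::istream& is, Point& p) {
    std::istream::sentry s(is);          // checks the state, honours skipws
    if (!s)
        return is;                       // stream already unusable, do nothing

    Point tmp;
    char open = 0, comma = 0, close = 0;
    if (is >> open >> tmp.x >> comma >> tmp.y >> close
        && open == '(' && comma == ',' && close == ')') {
        p = tmp;                         // commit only on complete success
    } else {
        is.setstate(std::ios_base::failbit);   // malformed data: failbit
    }
    return is;
}

int main() {
    std::istringstream in("(1,2) oops");
    Point p;
    in >> p;       // succeeds
    in >> p;       // fails: failbit is set, p keeps its old value
    std::cout << p.x << ',' << p.y << ' '
              << std::boolalpha << in.fail() << '\n';   // 1,2 true
}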