String conversion errors: exceptions or error codes? - c++

In writing a function to convert between strings of different encodings (e.g. from UTF-8 to UTF-16), what would be the best way to handle errors (e.g. invalid input UTF-8 byte sequence)? Throwing an exception or returning an error code (even a bool)?
// Throws a C++ exception on error.
std::wstring ConvertFromUtf8ToUtf16(const std::string& utf8);
// Returns true on success, false on error.
bool ConvertFromUtf8ToUtf16(std::wstring& utf16, const std::string& utf8);
Using exceptions, it would be possible to do chained function calls (when the function return value is used as input for other functions/methods).
But I'm not sure that using exceptions in this case is good; I was thinking of what Eric Lippert in his quality blog post calls vexing exceptions (and related Int32.Parse()/TryParse() example).
For example, if exceptions are used, the caller should be forced to wrap the function call in try/catch blocks to check the case of invalid UTF-8 input:
try
{
wstring utf16 = ConvertFromUtf8ToUtf16(utf8);
}
catch(const Utf8ConversionException& e)
{
// Bad UTF-8 byte sequence
...
}
Which seems not ideal to me.
Maybe the best thing to do is to just provide both overloads (implementing the conversion code in the non-throwing overload, and in the throwing overload just call the non-throwing version, and in case of error return code throw an exception)?

One guideline is to consider what will happen if users ignore or don't know that they should check your returned error code.
If the code could theoretically continue in the face of an error, returning an error could be considered reasonable. And as you mention, the code looks cleaner.
If ignoring the error would likely lead to Very Bad Behavior later, it's probably a better idea to throw the exception.
A third potential choice which somewhat balances the terseness of error codes and forcing the programmer to be aware of potential errors is to make the function require a reference to the error code. This will also work well in exported libraries and with (mostly older) compilers that don't handle exceptions efficiently.
StringConversionResult result; // Could be a "success" bool
wstring utf16 = ConvertFromUtf8ToUtf16(utf8, result);

If this function is exported from a library, use return code. Throwing exception from exported function may crash the program in the case, when library and client are built with different C/C++ runtime libraries. Generally, this is undefined behavior.
For internal use, I believe, exception is a better choice. The case you are talking about, when caller doesn't use catch block, crashes the program immediately (unhandled exception). This is better, then continuing program execution with undefined results in some future point.

There are only three choices. The first is "Replace all failures by error codepoint"- the Unicode Standard provides for a couple of replacement codepoints. This is fine in some scenarios. The second is to throw an exception. The third is to provide an error function object, to be called on failure. For example,
bool fail = false;
std::u16string str = ConvertFromUTF8ToUTF16(utf8, [&] {
return u16"default";
// or
throw std::runtime_error("fail");
// or
fail = true;
});
The point is that in no scenario do you depend on the user to check for failure- if he does nothing, then either his function does not continue, the compiler cries, or it's OK for the function to continue.
Returning an error code is not an option- this is plain hideously error prone.

Related

How to return an error from a function that returns signed integers

I have some psuedocode for a function to display a value and get a value
int getValue(){
int value;
// open file
// read line into "value"
if(error occurs){
// if file doesn't open or line was not an integer
/* Normally I would return something such as -1
but -1 in this case would be a valid value*/
value = ?
}
return value;
}
void displayValue(){
int value = getValue();
if(value is valid)
display(value);
}
As described in the code above, I would like to return that there was an error and let displayValue know that there was an error. But i want to accept negative,positive, and 0 from getValue.
Is there a better way to go about this? Does anyone have any advice?
Throw an exception. One of the advantages of C++ over C is that, when you have an error, you don't have to smuggle error codes out of the function, you can just throw an exception. If it's a truly exceptional case, most of the time the caller won't have anything useful to do with it anyway, so forcing them to check for it manually, then pass the error up the call chain is pointless. If they do know what to do with it, they can catch it.
This solution is also more composable. Imagine a scenario where A returns int, B calls A and returns a std::string based on it, and C calls B and returns class Foo based on that. If A has an exceptional condition that requires it to return an error, you can either:
Come up with some way to smuggle the error out of A as an int (or std::optional<int> or std::pair<bool, int> or whatever), then check for and convert that smuggled error to a different smuggled error for B, then check for and convert that to yet another smuggled error for C, then the caller of C still needs to check for that smuggled error and all three layers have to pay the price of the checks every time, even when all three layers succeeded, or...
You throw an exception in A, neither B nor C have anything useful to do with it (so they don't write any additional code at all), and the caller of C can choose to catch the exception and produce a friendlier error message if they so choose.
On modern architectures, the cost in the success case for #2 should be pretty negligible; the failure case might be more costly than the "check at every level case", but for something like "file doesn't exist" or "file contains corrupt data", it hardly matters if performance suffers, since you're probably about to exit the program (so speed doesn't count) or pop a dialog the user needs to respond to (the user is slower than the computer by many orders of magnitude).
There are several error handling approaches in C++:
The traditional way popular in C API's (also used by std::string algorithms) is to reserve at least one value as "invalid", which when returned would signal that there was an error, or that the value represents "no result". In case of error, the C API's would use the global errno to inform what error happened.
One of the features C++ introduced over the C language is exceptions. You can simply throw when error occurs. This is most appropriate for unexpected errors or pre/post-condition violations, rather than "no result exists" type situations.
Yet another way is to return both the value, and information about whether the result is valid. Old fashioned approach might be to return a pair of integer and boolean (or a named class that achieves the same). Alternatively, either value or error state can written into object passed through indirection. std::optional has been introduced into the standard library just for this kind of situation an is a great way of representing lack of result.
Latter approach can be further extended to not only return a boolean, but actual information about the error in similar way to the way exceptions do. The error information can also be wrapped with the value in a "variant" type so that they can share the space, as only one of them can exist at any time. This approach is similar to Maybe type in Haskell. There is a proposal to introduce a template for this purpose into the standard library.
Each of these approaches have their benefits and drawbacks. Choose one that is appropriate for your use case.
One option is to throw an exception when an error occurs. It's highly dependent on the rest of your project. Are Exceptions used all around ? Personally, I prefer more conventional old-school approaches. Mostly because people will start throwing exception everywhere, where it's not really exceptional and then it makes debugging much harder as the debugger keeps stopping for non-exceptional situations.
Another option is to return a pair std::pair<bool, int>. Some people love it, some people hate it.
My preference would be bool attemptGetValue(int& outValue). You return false if there's an error, in which case you don't touch outValue. Your return true otherwise and modify outValue
You can also use std::optional, but old timers might not be familiar wiht it.
Other than throwing an exception, returning a std::optional, or a std::pair, there is a precedent here: std::string::npos is normally set to a particularly large std::string::size_type value, normally -1 (wrapped around of course) and is used by some std::string functions to indicate a failure.
If you're willing to give up one legitimate return value then you could do something similar in your case. In reality though, typical (perhaps all) strings will be significantly smaller than npos; if that's not the case for you then perhaps one of the alternatives already mentioned would be better.

What is the better way to use an error code in C++?

I'm a member of project which uses c++11. I'm not sure when I should use error code for return values. I found RVO in c++ works great even if string and struct data are returned directly. But if I use a return code, I cannot get the benefit of RVO and the code is a little redundant.
So every time I declare function, I couldn't decide which I should use for a return value. How should I keep a consistency of my code? Any advice will help me.
// return string
MyError getString(string& out);
string getString();
// return struct
MyError getStructData(MyStruct& out);
MyStruct getStructData();
Usually, using exceptions instead of error codes is the preferred alternative in C++. This has several reasons:
As you already observed, using error codes as return values prevents you from using the return values for something more "natural"
Global error codes, are not threadsafe and have several other problems
Error codes can be ignored, exceptions can not
Error codes have to be evaluated after every function call that could possibly fail, so you have to litter your code with error handling logic
Exceptions can be thrown and passed several layers up in the call stack without extra handling
Of course there are environments where exceptions are not available, often due to platform restrictions, e.g. in embedded programming. In those cases error codes are the only option.
However, if you use error codes, be consistent with how you pass them. The most appealing use of error codes I have seen that does not occupy the return value spot and is still thread safe was passign a reference to a context object in each and every function. The context object would have global or per-thread information, including error codes for the last functions.

How to retrieve error from function?

Suppose I need to get value from config.
What function is more correctly?
int ret;
string value = config.getStringValue(string name, &ret);
or
string value;
int ret = config.getValue(string name, &value);
or maybe
string value = config.getStringValue(string name);
int ret = config.getResultCode();
And what var name for result code is more correctly: ret, error, etc?
Update:
Additional to #computerfreaker's comment: there is no exceptions in same platforms like bada
Neither solutions you proposed are the correct C++ way. What you provided is just C. In C++, use exceptions
The way you think is: "I have to send some status code to the caller"... this is the way you usually handle errors in C, but since there are exceptions in C++, it's much cleaner and wiser to do:
#include <exception>
std::string getValue() {
if (...)
throw std::exception("Unable to retrieve value.");
}
and caller would do:
try {
std::string val = getValue();
} catch (std::exception& e) { ... }
Just remember the rule: Throw by value, catch by reference.
Regarding "exceptions are meant for handling exceptional states" - it's true. Exceptions should be used in situations when something unexpected / exceptional happens. If function getValue relies on the fact that the value exists and it can be retrieved, then the situation when your code for some reason fails to retrieve this value is exceptional and thus suitable for handling it using exceptions.
C++ offers several ways of reporting errors from functions which return a value:
Throw an exception. This should be done when the cause of the error is with some external resource and does not normally happen. Perfect example: out of memory.
Return a special value to indicate failure, and follow this convention at the call site. For example, return "" for errors and have callers check for "".
Use std::optional or a similar technique. That's an advanced version of your first example. The basic idea is to return a special object which contains the original object and a boolean flag indicating success. The special object is used with the rule that the original object may only be accessed if the boolean flag indicates success. Other names of this idiom which I've heard are "Fallible" and "Box". This solution and the previous one are good candidates when error cases are expected and frequent -- usually a perfect match for user input.
Abort the program with assert. This is a good solution if an error indicates that your own code is wrong. In this case, the best thing to do is usually terminating the program as quickly as possible before it can do any harm.
Use global error state and have callers check it. That's your third example. C code fancies doing that a lot with errno. In C++, however, this is typically not considered a good solution. It's bad for the same reasons that any kind of global variable is typically bad.
Do not return the value itself but make it an out parameter with a reference. Return an error flag instead. That's your second example. It is better than the previous approach but still very C-like. I would not recommend doing it because it will force callers to name every received value.

What is the purpose of std::exception::what()?

I can only think of the following situations where std::exception::what() is used:
For debug purpose. In my Visual Studio to see e.what() I have to manually add it to the watch list. Isn't it better to have a member std::string (so the debugger directly shows it in the object inspector), and only include it in non-NDEBUG builds? At least they should disable what() in NDEBUG builds.
Output it, e.g. MessageBox(e.what()) or cout << e.what(). As far as I know these messages are useless for many users. For example when I try to rename a file that doesn't exist:
boost::filesystem::rename: 系统找不到指定的文件。: "D:\MyDesktop\4", "D:\MyDesktop\5"
(The Chinese words means "The system cannot find the file specified.") How can the users decrypt the mixed things? Also, it is a const char* instead of something like const platform_char*, which may have unicode problems in Windows.
Extract data from it, e.g. std::regex_match(e.what()...). I think it's a terrible idea that shows design flaws.
So where should I use std::exception::what()? Is it useless?
A programmer is supposed to derive a class from std::exception and taylor what() to the specific requirements. Then it can be very useful.
It's also useful to report something back (e.g. in plain text for logging), which is why the standard mandates a concrete std::exception::what() rather than a pure virtual function.
what() is generic in the sense that it means what you want it to mean for your own exception classes. In many cases it is used only for logging, but in other cases it may provide information that can be used to recover from an exceptional situation.
So where should I use std::exception::what()? Is it useless?
The error message of std::exception is a char* because it's supposed to be an as-simple-as-possible user-facing diagnostics message giving details on the error.
For boost::system_error (and std::system_error), you can also get the OS-level error code (for which the user-friendly message is "file not found").
Valid uses:
if you want to identify the error type, catch the specialization std::system_error or boost::system_error and perform a switch/if on the code() function (i.e. instead of running a regex on the message).
if you want to display an explanation of the error for diagnostics (logging) or user friendliness (to console/GUI) just display the error message (what()).
std::exception is a base class (i.e. it has a virtual destructor). If you implement your own exception classes, follow the same pattern as std::system_error: use an error code to easily identify an error, and create your exception's base class (std::exception, std::runtime_error or std::logic_error usually) with a text message depending on the error type.
Ultimately, the type of the error message is char* instead of std::string, wstring or your-platform-specific-char-type because it sacrifices flexibility for reliability: with a char* everybody knows how to use it, that it has no encoding (except ASCII), that it's null terminated and once allocated, it does not fail (it doesn't generate new exceptions).
Using something that could fail (in any way) when constructing an exception would be disastrous: you would fail while creating/using a diagnostics message of your error handling code.
This means you could either not throw the exception (and your application will remain in an invalid state), or throw an exception while creating an exception instance (you really don't want that as it would discard the exception you wanted to throw (hide errors) or throw an exception with a missing or corrupted error code (which could waste months of development time in various projects to track down the wrong error).
Consider this example (unnecessarily complicated, but it makes a point):
int * p = new(nothrow) int(10); // I want to throw a complicated
// string message exception
if(nullptr == p)
throw complicated_exception(std::string("error message here"));
The throw line will presumably fail: the application has so little memory available that it won't even allocate an int*, let alone a string. The result of this code is that in low memory conditions I will still get a std::out_of_memory error - one generated by std::string constructor.
std::exception is a good implementation: it provides a minimal extendable interface with a good management/restriction of failure points and it gives a good interface for extensions (such as std::system_error).

In C++ what are the benefits of using exceptions and try / catch instead of just returning an error code?

I've programmed C and C++ for a long time and so far I've never used exceptions and try / catch. What are the benefits of using that instead of just having functions return error codes?
Possibly an obvious point - a developer can ignore (or not be aware of) your return status and go on blissfully unaware that something failed.
An exception needs to be acknowledged in some way - it can't be silently ignored without actively putting something in place to do so.
The advantage of exceptions are two fold:
They can't be ignored. You must deal with them at some level, or they will terminate your program. With error code, you must explicitly check for them, or they are lost.
They can be ignored. If an error can't be dealt with at one level, it will automatically bubble up to the next level, where it can be. Error codes must be explicitly passed up until they reach the level where it can be dealt with.
The advantage is that you don't have to check the error code after each potentially failing call. In order for this to work though, you need to combine it with RAII classes so that everything gets automatically cleaned up as the stack unwinds.
With error messages:
int DoSomeThings()
{
int error = 0;
HandleA hA;
error = CreateAObject(&ha);
if (error)
goto cleanUpFailedA;
HandleB hB;
error = CreateBObjectWithA(hA, &hB);
if (error)
goto cleanUpFailedB;
HandleC hC;
error = CreateCObjectWithA(hB, &hC);
if (error)
goto cleanUpFailedC;
...
cleanUpFailedC:
DeleteCObject(hC);
cleanUpFailedB:
DeleteBObject(hB);
cleanUpFailedA:
DeleteAObject(hA);
return error;
}
With Exceptions and RAII
void DoSomeThings()
{
RAIIHandleA hA = CreateAObject();
RAIIHandleB hB = CreateBObjectWithA(hA);
RAIIHandleC hC = CreateCObjectWithB(hB);
...
}
struct RAIIHandleA
{
HandleA Handle;
RAIIHandleA(HandleA handle) : Handle(handle) {}
~RAIIHandleA() { DeleteAObject(Handle); }
}
...
On first glance, the RAII/Exceptions version seems longer, until you realize that the cleanup code needs to be written only once (and there are ways to simplify that). But the second version of DoSomeThings is much clearer and maintainable.
DO NOT try and use exceptions in C++ without the RAII idiom, as you will leak resources and memory. All your cleanup needs to be done in destructors of stack-allocated objects.
I realize there are other ways to do the error code handling, but they all end up looking somewhat the same. If you drop the gotos, you end up repeating clean up code.
One point for error codes, is that they make it obvious where things can fail, and how they can fail. In the above code, you write it with the assumption that things are not going to fail (but if they do, you'll be protected by the RAII wrappers). But you end up paying less heed to where things can go wrong.
Exception handling is useful because it makes it easy to separate the error handling code from the code written to handle the function of the program. This makes reading and writing the code easier.
return an error code when an error condition is expected in some cases
throw an exception when an error condition is not expected in any cases
in the former case the caller of the function must check the error code for the expected failure; in the latter case the exception can be handled by any caller up the stack (or the default handler) as is appropriate
Aside from the other things that were mentioned, you can't return an error code from a constructor. Destructors either, but you should avoid throwing an exception from a destructor too.
I wrote a blog entry about this (Exceptions make for Elegant Code), which was subsequently published in Overload. I actually wrote this in response to something Joel said on the StackOverflow podcast!
Anyway, I strongly believe that exceptions are preferable to error codes in most circumstances. I find it really painful to use functions that return error codes: you have to check the error code after each call, which can disrupt the flow of the calling code. It also means you can't use overloaded operators as there is no way to signal the error.
The pain of checking error codes means that people often neglect to do so, thus rendering them completely pointless: at least you have to explicitly ignore exceptions with a catch statement.
The use of destructors in C++ and disposers in .NET to ensure that resources are correctly freed in the presence of exceptions can also greatly simplify code. In order to get the same level of protection with error codes you either need lots of if statements, lots of duplicated cleanup code, or goto calls to a common block of cleanup at the end of a function. None of these options are pleasant.
Here's a good explanation of EAFP ("Easier to Ask for Forgiveness than Permission."), which I think applies here even if it's a Python page in Wikipedia. Using exceptions leads to a more natural style of coding, IMO -- and in the opinion of many others, too.
When I used to teach C++, our standard explanation was that they allowed you to avoid tangling sunny-day and rainy-day scenarios. In other words, you could write a function as if everything would work ok, and catch the exception in the end.
Without exceptions, you would have to get a return value from each call and ensure that it is still legitimate.
A related benefit, of course, is that you don't "waste" your return value on exceptions (and thus allow methods that should be void to be void), and can also return errors from constructors and destructors.
Google's C++ Style Guide has a great, thorough analysis of the pros and cons of exception use in C++ code. It also indicates some of the larger questions you should be asking; i.e. do I intend to distribute my code to others (who may have difficulty integrating with an exception-enabled code base)?
Sometimes you really have to use an exception in order to flag an exceptional case. For example, if something goes wrong in a constructor and you find it makes sense to notify the caller about this then you have no choice but to throw an exception.
Another example: Sometimes there is no value your function can return to denote an error; any value the function may return denotes success.
int divide(int a, int b)
{
if( b == 0 )
// then what? no integer can be used for an error flag!
else
return a / b;
}
The fact that you have to acknowledge exceptions is correct but this can also be implemented using error structs.
You could create a base error class that checks in its dtor whether a certain method ( e.g. IsOk ) has been called. If not, you could log something and then exit, or throw an exception, or raise an assert, etc...
Just calling the IsOk on the error object without reacting to it, would then be the equivalent of writing catch( ... ) {}
Both statement would display the same lack of programmer good will.
The transport of the error code up to the correct level is a greater concern. You would basically have to make almost all methods return an error code for the sole reason of propagation.
But then again, a function or method should always be annotated with the exceptions it can generate. So basically you have to same problem, without an interface to support it.
As #Martin pointed out throwing exceptions forces the programmer to handle the error. For example, not checking return codes is one of the biggest sources of security holes in C programs. Exceptions make sure that you handle the error (hopefully) and provide some kind of recover path for your program. And if you choose to ignore an exception rather than introduce a security hole your program crashes.