I'm currently building a library that parses XML definitions for hardware configurations (obtained from the manufacturer).
I've mapped the XML types to C++ classes, and I use std::optional wherever an XML member is optional and it is semantically correct for that piece of data to be missing.
I'm now trying to come up with a good error-handling strategy for my datatypes.
Sometimes, the XML may be missing some information that is marked as required by the schema, or a required element might not be found (which would be a different error to an element which is missing required data).
The basic idea for all the types follows this (example) class:
class TMyXmlType {
std::string name;
std::optional<int> factor;
std::optional<int> minFactor;
std::optional<int> maxFator;
public:
TMyXmlType(const xml_node & root){
if(root){ // Check if the element exists
name = root.value();
if(root.has_child("factor")){ factor = root.child("factor").value(); }
if(root.has_child("minFactor")){ minFactor = root.child("minFactor").value(); }
if(root.has_child("maxFator")){ maxFator = root.child("maxFator").value(); }
}else{
// What do I do here?
}
}
operator Json::Value() const {
if(/*object constructed correctly?*/){
Json::Value asJson;
asJson["name"] = name;
if(factor.has_value()){ asJson["factor"] = factor.value(); }
if(minFactor.has_value()){ asJson["minFactor"] = minFactor.value(); }
if(maxFator.has_value()){ asJson["maxFator"] = maxFator.value(); }
return asJson;
}else{
// return error object?
}
}
};
So far, so good. The optional members are being taken care of.
However, root might be an empty node (the XML parsing library returns an empty node if the element wasn't found).
I basically want to return an error object instead of the value of the class (in my operator function) if one or more required XML nodes weren't found.
As far as I was able to find, in modern C++ you're supposed to throw an exception if the constructor can't construct the object correctly. However, if I throw an exception, my aggregate datatypes will have massive constructors with a try/catch block for each required data member, which would make the codebase a pain to read and to maintain.
So now, the question is: What would be the cleanest way to have the operator return an error object instead of the class data if a required member is missing?
I don't need the constructor to explicitly fail, it currently also won't ever throw (as far as I know) and I really want error objects to give to the caller instead of return codes or bubbling exceptions.
Plain and simple: when you don't want an error to fire, don't do anything fancy. E.g. just do:
class TMyXmlType {
std::string name;
std::optional<int> factor;
std::optional<int> minFactor;
std::optional<int> maxFator;
public:
TMyXmlType(const xml_node & root){
if(root){ // Check if the element exists
name = root.value();
if(root.has_child("factor")){ factor = root.child("factor").value(); }
if(root.has_child("minFactor")){ minFactor = root.child("minFactor").value(); }
if(root.has_child("maxFator")){ maxFator = root.child("maxFator").value(); }
}else{
name = "NODE_ERROR";
}
}
};
That being said, reconsider your design: passing an invalid node to the constructor is probably an error you don't want to silence, and it should probably throw an exception!
(But catch the exception at a higher level, not around every construction. Perhaps that is your underlying misconception.)
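A minimal sketch of catching at a higher level. Everything in it is an assumption made for illustration: `Node` is a hypothetical stand-in for the XML library's node type, and `Name`, `Device` and `describe` are invented names. The point is only that the aggregate constructor contains no try/catch, and a single handler at the boundary turns the failure into an error value.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the XML library's node type.
struct Node {
    std::string name;
    std::string text;
    bool exists = false;
    std::vector<Node> children;

    explicit operator bool() const { return exists; }

    // Returns an empty (non-existing) node when no child matches.
    Node child(const std::string& wanted) const {
        for (const Node& c : children) {
            if (c.name == wanted) return c;
        }
        return Node{};
    }
};

// A leaf type: throws when its required element is missing.
struct Name {
    std::string value;
    explicit Name(const Node& n) {
        if (!n) throw std::runtime_error("required element <name> is missing");
        value = n.text;
    }
};

// The aggregate just constructs its members; no try/catch in here.
struct Device {
    Name name;
    explicit Device(const Node& root) : name(root.child("name")) {}
};

// One handler at the boundary converts the exception into an error string.
std::string describe(const Node& root) {
    try {
        Device d(root);
        return "ok: " + d.name.value;
    } catch (const std::exception& e) {
        return std::string("error: ") + e.what();
    }
}
```

If a member of `Device` throws, construction of `Device` is abandoned and the exception travels straight to `describe`, so adding more required members does not add more handlers.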
Checking whether a search found a node should be done at the level that calls the search. E.g. the following code seems sensible to me:
const auto& node_found = searchNode(...);
const auto optXmlType = node_found ? std::make_optional<TMyXmlType>(node_found) : std::nullopt;
However, you might want to wrap that in a function.
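One way that wrapper might look, as a sketch only: `Node` is a hypothetical stand-in for the XML library's node type, `parseIfFound` is an invented name, and `TMyXmlType` is reduced to a single member so the example stands alone. It needs C++17 for std::optional.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the XML library's node type.
struct Node {
    bool found = false;
    std::string text;
    explicit operator bool() const { return found; }
};

// Reduced version of the class from the question.
struct TMyXmlType {
    std::string name;
    explicit TMyXmlType(const Node& root) : name(root.text) {}
};

// Constructs a T only when the search actually found a node.
template <typename T>
std::optional<T> parseIfFound(const Node& node) {
    if (!node) return std::nullopt;
    return T(node);
}
```

With this, the constructor never sees an empty node, and the caller decides what a missing element means.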
I have a class which is loaded from an external file, so ideally I would want its constructor to load from a given path. If the load fails, I want to throw an error when the file is not found or not readable (throwing errors from constructors is not a horrible idea, see the ISO C++ FAQ).
There is a problem with this though, I want to handle errors myself in some controlled manner, and I want to do that immediately, so I need to put a try-catch statement around the constructor for this object ... and if I do that, the object is not declared outside the try statement, i.e.:
//in my_class.hpp
class my_class
{
...
public:
my_class(string path); // Throws file not found, or some other error
...
};
//anywhere my_class is needed
try
{
my_class my_object(path);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
//Problem... now my_object doesn't exist anymore
I have tried a number of ways of getting around it, but I don't really like any of them:
Firstly, I could use a pointer to my_class instead of the class itself:
my_class* my_pointer;
try
{
my_pointer = new my_class(path);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
The problem is that the instance of this object doesn't always end up in the same object which created it, so deleting all pointers correctly would be easy to do wrong, and besides, I personally think it is ugly to have some objects be pointers to objects, and have most others be "regular objects".
Secondly, I could use a vector with only one element in much the same way:
std::vector<my_class> single_vector;
try
{
single_vector.push_back(my_class(path));
single_vector.shrink_to_fit();
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
I don't like the idea of having a lot of single-element vectors though.
Thirdly, I can create an empty faux constructor and use another loading function, i.e.
//in my_class.hpp
class my_class
{
...
public:
my_class() {}// Faux constructor which does nothing
void load(string path);//All the code in the constructor has been moved here
...
};
//anywhere my_class is needed
my_class my_object;
try
{
my_object.load(path);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
This works, but largely defeats the purpose of having a constructor, so I don't really like this either.
So my question is: which of these methods for constructing an object whose constructor may throw is the best (or least bad)? And are there better ways of doing this?
Edit: Why don't you just use the object within the try-statement
Because the object may need to be created as the program is first started, and stopped much later. In the most extreme case (which I do actually need in this case also) that would essentially be:
int main()
{
try
{
//... things which might fail
//A few hundred lines of code
}
catch(/*whatever*/)
{
}
}
I think this makes my code hard to read since the catch statement will be very far from where things actually went wrong.
One possibility is to wrap the construction and error handling in a function that returns the constructed object. Example:
#include <string>
class my_class {
public:
my_class(std::string path);
};
my_class make_my_object(std::string path)
{
try {
return {std::move(path)};
}
catch(...) {
// Handle however you want
}
}
int main()
{
auto my_object = make_my_object("this path doesn't exist");
}
But beware that the example is incomplete because it isn't clear what you intend to do when construction fails. The catch block has to either return something, throw or terminate.
If you could return a different instance, one with a "bad" or "default" state, you could have just initialized your instance to that state in my_class(std::string path) when it was determined the path is invalid. So in that case, the try/catch block is not needed.
If you rethrow the exception, then there is no point in catching it in the first place. In that case, the try/catch block is also not needed, unless you want to do a bit of extra work, like logging.
If you want to terminate, you can just let the exception go uncaught. Again, in that case, the try/catch block is not needed.
The real solution here is probably to not use a try/catch block at all, unless there is actually error handling you can do that shouldn't be implemented as part of my_class which isn't made apparent in the question (maybe a fallback path?).
and if I do that, the object is not declared outside the try statement
I have tried a number of ways of getting around it
That doesn't need to be a problem. There isn't necessarily a need to work around it. Simply use the object within the try statement.
If you really cannot have the try block around the entire lifetime, then this is a use case for std::optional:
std::optional<my_class> maybe_my_object;
try {
maybe_my_object.emplace(path);
} catch(...) {}
The problem is that the instance of this object doesn't always end up in the same object which created it, so deleting all pointers correctly would be easy to do wrong,
A pointer returned by new is correct to delete. In the error case, simply set the pointer to null and there would be no problem. That said, use a smart pointer instead for dynamic allocation, if you were to use this approach.
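A sketch of the smart-pointer variant, assuming C++14 for std::make_unique. The `my_class` here is a hypothetical stub whose constructor throws on an empty path, and `try_make` is an invented helper name.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stub: the constructor throws when loading fails.
class my_class {
public:
    explicit my_class(const std::string& path) {
        if (path.empty()) throw std::runtime_error("cannot load: empty path");
    }
};

// Returns an owning pointer, or nullptr if construction failed.
std::unique_ptr<my_class> try_make(const std::string& path) {
    try {
        return std::make_unique<my_class>(path);
    } catch (const std::exception&) {
        // Error handling would go here.
        return nullptr;
    }
}
```

Because the std::unique_ptr owns the object, it is destroyed exactly once wherever the pointer ends up, which addresses the worry about deleting raw pointers correctly.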
single_vector.push_back(my_class(string));
single_vector.shrink_to_fit();
Don't push and shrink when you know the number of objects that are going to be in the vector. Use reserve instead if you were to use this approach.
The object creation can fail because a resource is unavailable. It's not the creation which fails; it is a prerequisite which is not fulfilled.
Consequently, separate these two concerns: First obtain all resources and then, if that succeeded, create the object with these resources and use it. The object creation as such in this design cannot fail, the constructor is nothrow; it is trivial boilerplate code (copy data etc.). If, on the other hand, resource acquisition failed, object creation and object use are both skipped: Your problem with existing but unusable objects is gone.
Responding to your edit about try/catch comprising the entire program: Exceptions as error indicators are better suited for things which are done in many places at various times in a program because they guarantee error handling (by default through an abort) while separating it from the normal control flow. This is impossible to do with classic return value examination, which leaves us with a choice between unreadable or unreliable programs.
But if you have long-lived objects which are created only rarely (in your example: only at startup) you don't need exceptions. As you said, constructor exceptions guarantee that only properly initialized objects can be used. But if such an object is only created at startup this danger is low. You check for success one way or another and exit the program which cannot perform its purpose if the initial resource acquisition failed. This way the error is handled where it occurred. Even in less extreme cases (e.g. when an object is created at the beginning of a large function other than main) this may be the simpler solution.
In code, my suggestion looks like this:
struct T1;
struct T2;
struct myEx { myEx(const char *); };
void exit(int);
T1 *acquireResource1(); // e.g. read file
T2 *acquireResource2(); // e.g. connect to db
void log(const char *what);
class ObjT
{
public:
struct RsrcT
{
T1 *mT1;
T2 *mT2;
operator bool() const { return mT1 && mT2; }
};
ObjT(const RsrcT& res) noexcept
{
// initialize from file data etc.
}
// more member functions using data from file and db
};
int main()
{
ObjT::RsrcT rsrc = { acquireResource1(), acquireResource2() };
if(!rsrc)
{
log("bummer");
exit(1);
}
///////////////////////////////////////////////////
// all resources are available. "Real" code starts here.
///////////////////////////////////////////////////
ObjT obj(rsrc);
// 1000 lines of code using obj
}
In Go, a common way to do error handling and still return a value is to use tuples.
I was wondering whether doing the same in C++ using std::tie would be a good idea when exceptions are not applicable.
Like this:
std::tie(errorcode, data) = loadData();
if(errorcode)
...//error handling
Are there any downsides to doing so (performance or otherwise)? I suppose with return value optimization it doesn't really make a difference but maybe I'm wrong.
One potential problematic case that I could see is the use in a cross-compiler API but that's not specific to this use.
The current way I do this is
errorcode = loadData(&data);
if(errorcode)
...//error handling
but that allows the caller to pass in a value for data.
The errorcode itself is something that is already defined and that I can't change.
Edit: I'm using/have to use C++11
Sometimes output parameters are very handy. Suppose that loadData returns std::vector<T> and is called in a loop:
std::pair<ErrorCode, std::vector<T>> loadData();
for (...) {
ErrorCode errorcode;
std::vector<T> data;
std::tie(errorcode, data) = loadData();
}
In this case loadData will have to allocate memory on each iteration. However, if you pass data as the output parameter, previously allocated space can be reused:
ErrorCode loadData(std::vector<T>&);
std::vector<T> data;
for (...) {
ErrorCode errorcode = loadData(data);
}
If the above is of no concern, then you might want to take a look at expected<T, E>. It represents either
a value of type T, the expected value type; or
a value of type E, an error type used when an unexpected outcome occurred.
With expected, loadData() signature might look like:
expected<Data, ErrorCode> loadData();
C++11 implementation is available: https://github.com/TartanLlama/expected
There are multiple competing strategies for error handling. I will not go into it, as it is beyond the scope of the question, but error handling by return error codes is only one option. Consider alternatives like std::optional or exceptions, which are both common in C++, but not in Go.
If you have a function that is intended to return a Go-style error code plus value, then your std::tie solution is perfectly fine in C++11 or C++14, although in C++17 you would prefer structured bindings instead.
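For comparison, a sketch of the structured-bindings form. It requires C++17, so it does not apply to the C++11 constraint in the question. `ErrorCode` as a plain int, the stub body of `loadData`, and `sumOrMinusOne` are assumptions for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

using ErrorCode = int;  // assumption: 0 means success

// Hypothetical stub: fails on an empty source name.
std::pair<ErrorCode, std::vector<int>> loadData(const std::string& source) {
    if (source.empty()) return {1, std::vector<int>{}};
    return {0, std::vector<int>{1, 2, 3}};
}

int sumOrMinusOne(const std::string& source) {
    // Names both parts of the returned pair without declaring them first.
    auto [errorcode, data] = loadData(source);
    if (errorcode) return -1;  // error handling
    int sum = 0;
    for (int v : data) sum += v;
    return sum;
}
```

Unlike std::tie, the bindings refer to the returned pair itself, so no extra copy or move into pre-declared variables takes place.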
Are there any downsides to doing so (performance or otherwise)?
Yes. With tie, a copy or move of the returned values is required that would not be required if you avoid tie:
auto result = loadData();
if (std::get<0>(result))
...//error handling
Of course, if you would later copy or move the data somewhere else anyway, like in
data = std::move(std::get<1>(result));
then use tie because it is shorter.
I am often using the common
if (Value * value = getValue())
{
// do something with value
}
else
{
// handle lack of value
}
Now, I also often do
QString error = someFunctionReturningAnErrorString(arg);
if (!error.isEmpty())
{
// handle the error
}
// empty error means: no error
That's all fine but I would like the error variable to be scoped to the if-block. Is there a nice idiom for that? Obviously, I can just wrap the whole part inside another block.
This, obviously, does not work:
if(QString error = someFunctionReturningAnErrorString(arg), !error.isEmpty())
{
// handle the error
}
// empty error means: no error
And unfortunately (but for good reasons) the QString cannot be converted to bool, so this does not work either:
if(QString error = someFunctionReturningAnErrorString(arg))
{
// handle the error
}
// empty error means: no error
Any suggestions?
No. There is no idiom like this, and there is no syntax like this!
Besides, you have reached the point at which it is no longer worthwhile to make your code more and more obfuscated.
Simply write it as you do now.
If you really don't want the scope leakage, introduce a new scope:
{
const QString error = someFunctionReturningAnErrorString(arg);
if (!error.isEmpty()) {
// handle the error
}
}
// The above-declared `error` doesn't exist down here
I use this pattern quite a lot, though I've been fairly accused of scope-addiction, so take that as you will.
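For readers who can use C++17: that standard added an if statement with an initializer, `if (init; condition)`, which scopes the variable to the if/else without an extra block. A sketch using std::string in place of QString so that it stands alone; the stub body and the `process` function are assumptions.

```cpp
#include <string>

// Hypothetical stub: returns an empty string when there is no error.
std::string someFunctionReturningAnErrorString(int arg) {
    return arg < 0 ? "negative argument" : "";
}

bool process(int arg) {
    // `error` is visible only inside this if statement (C++17).
    if (std::string error = someFunctionReturningAnErrorString(arg); !error.empty()) {
        return false;  // handle the error
    }
    return true;  // empty error means: no error
}
```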
The only way to use that idiom while still keeping your code understandable is if your function returns an object that is convertible to bool in a way that true indicates that you want to take the branch and false means that you do not care about it. Anything else is just going to lead to write-only code.
One such object which may be relevant happens to be boost::optional. Given:
boost::optional<QString> someFunctionReturningAnErrorString(T arg);
You could use the idiom you want in a natural way:
if (auto error = someFunctionReturningAnErrorString(arg)) {
// ...
}
This also has the added benefit where I'd consider an optional error message more semantically meaningful than having to check for an empty error message.
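The same idea sketched with std::optional from C++17 swapped in for boost::optional, and std::string in place of QString. `validate` and `report` are hypothetical names; an engaged optional carries the error message.

```cpp
#include <optional>
#include <string>

// Hypothetical function: returns a message only when something went wrong.
std::optional<std::string> validate(int arg) {
    if (arg < 0) return "argument must not be negative";
    return std::nullopt;
}

std::string report(int arg) {
    // The optional converts to true exactly when it holds an error message.
    if (auto error = validate(arg)) {
        return "failed: " + *error;
    }
    return "ok";
}
```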
There is basically no clean way to do that.
I'd recommend you just define an extra block around the if, but if you really want to have that exact syntax, a solution could be to declare your own class wrapping QString:
struct ErrorString
{
ErrorString(QString&& s) : s{std::move(s)} {}
operator bool() const {return !s.isEmpty();}
QString s;
};
And then you could write:
if(ErrorString error = someFunctionReturningAnErrorString(arg))
{
// handle the error
}
// empty error means: no error
But I'm not particularly fond of this solution.
You could use:
for(QString error = someFunctionReturningAnErrorString(arg); !error.isEmpty(); /* too bad 'break' is invalid here */)
{
// handle the error
break;
}
but this is ugly, and makes your code hard to read. So please don't.
if(auto message = maybe_filter( getError(arg), [](auto&&str){
return !str.isEmpty();
})) {
}
where maybe_filter takes a T and a test function and returns optional<T>. The optional<T> is empty if evaluating the test function on the T gives you false, and contains the T otherwise.
Or really, modify your error getting API to return an optional string.
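A possible implementation of maybe_filter, sketched with std::optional (C++17) and std::string instead of QString. The stub `getError` and the wrapper `firstError` are assumptions added so the example stands alone.

```cpp
#include <optional>
#include <string>
#include <utility>

// Holds the value if test(value) is true, and is empty otherwise.
template <typename T, typename Test>
std::optional<T> maybe_filter(T value, Test test) {
    if (test(value)) return std::optional<T>(std::move(value));
    return std::nullopt;
}

// Hypothetical stub standing in for the error-returning function.
std::string getError(int arg) {
    return arg < 0 ? "negative argument" : "";
}

std::string firstError(int arg) {
    if (auto message = maybe_filter(getError(arg),
                                    [](const std::string& s) { return !s.empty(); })) {
        return *message;  // the non-empty error text
    }
    return "none";
}
```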
You can use a lambda.
auto error_string_handler = [](QString && error) {
if (error.isEmpty()) return;
//...
};
error_string_handler(someFunctionReturningAnErrorString(arg));
I'm currently learning C++ and practicing my knowledge by implementing a simple address book application. I started with an Entry class and an AddressBook class, which uses an STL map to access the entries by the persons' last names. Now I have arrived at the following code:
Entry AddressBook::get_by_last_name(string last_name){
if(this->addr_map.count(last_name) != 0){
//What can I do here?
} else {
return addr_map[last_name];
}
}
In scripting languages I would just return something like -1 or an error message (a list in Python) to indicate that the function failed. I don't want to throw an exception, because this is part of the application logic. The calling class should be able to react to the request by printing something on the console or opening a message box. Now I thought about implementing the scripting-language approach in C++ by introducing some kind of invalid state to the class Entry. But isn't that bad practice in C++? Could it be that my whole class design is just not appropriate? I appreciate any help. Please keep in mind that I'm still learning C++.
Some quick notes about your code:
if(this->addr_map.count(last_name) != 0){
//What can I do here?
You probably wanted it the other way:
if(this->addr_map.count(last_name) == 0){
//handle error
But your real problem lies here:
return addr_map[last_name];
Two things to note here:
The operator[] for map can do two things: if the element exists, it returns it; if the element doesn't exist, it creates a new (key, value) pair with the specified key and a default-constructed value. Probably not what you wanted. However, if your if statement from before had been the right way around, the latter would never happen, because we would know beforehand that the key exists.
In calling count() before, you effectively tell map to try and find the element. By calling operator[], you are telling map to find it again. So, you're doing twice the work to retrieve a single value.
A better (faster) way to do this involves iterators, and the find method:
YourMap::iterator it = addr_map.find(last_name); //find the element (once)
if (it == addr_map.end()) //element not found
{
//handle error
}
return it->second; //return element
Now, back to the problem at hand. What to do if last_name is not found?
As other answers noted:
Simplest solution would be to return a pointer (NULL if not found)
Use boost::optional.
Simply return the YourMap::iterator but it seems that you are trying to "hide" the map from the user of AddressBook so that's probably a bad idea.
Throw an exception. But wait, now you'll have to first check that calling this method is 'safe' (or handle the exception when appropriate). This check requires a boolean method like lastNameExists which would have to be called before calling get_by_last_name. Of course, then we're back to square one: we're performing two find operations to retrieve a single value. It's safe, but if you're making A LOT of calls to get_by_last_name then this is potentially a good place to optimize with a different solution (besides, arguably the exception is not very constructive: what's wrong with searching for something that isn't there, huh?).
Create a dummy member for Entry indicating that it is not a real Entry, but that is very poor design (unmanageable, counterintuitive, wasteful, you name it).
As you can see, the first two solutions are by far preferable.
One dead-simple option is to change the return type to Entry* (or const Entry*) and then return either the address of the Entry if found, or NULL if not.
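A sketch of the pointer-returning lookup. The members of `Entry` and the `add` helper are assumptions, since the original classes are not shown in full.

```cpp
#include <map>
#include <string>

// Hypothetical layout; the real Entry class is not shown in the question.
struct Entry {
    std::string first_name;
    std::string last_name;
};

class AddressBook {
    std::map<std::string, Entry> addr_map;
public:
    void add(const Entry& e) { addr_map[e.last_name] = e; }

    // Returns a pointer into the map, or nullptr when the name is unknown.
    const Entry* get_by_last_name(const std::string& last_name) const {
        auto it = addr_map.find(last_name);  // a single lookup
        return it == addr_map.end() ? nullptr : &it->second;
    }
};
```

The returned pointer stays valid until that element is erased from the map, so callers should not hold on to it across modifications that might remove the entry.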
If you use Boost, you could return a boost::optional<Entry>, in which case your success code would be the same, but on not-found you'd say return boost::none. This is fancier, but does about the same thing as using a pointer return type.
Throwing an exception is definitely the 'correct' C++ thing to do, based on your function return type.
You might want a function like this to help you, though:
bool AddressBook::lastNameExists(const string &last_name)
{
return addr_map.count(last_name) > 0;
}
Note that your current code returns the entry 'by value' so modifying the returned entry won't update the map. Not sure if this is by accident or design...
Other answers have given various approaches, most of them valid. I didn't see this one yet:
You could add a second parameter with a default value:
Entry AddressBook::get_by_last_name(string last_name, const Entry& default_value){
if(this->addr_map.count(last_name) == 0){
return default_value;
} else {
return addr_map[last_name];
}
}
In this particular instance, there might not be a sensible default value for a non-existing last name, but in many situations there is.
In C++ you have several ways of signalling that an issue happened in your function.
You can return a special value which the calling code will recognize as an invalid value. This can be a NULL pointer if the function should return a pointer, or a negative value if your function returns an index in an array, or, in the case of a custom class (e.g. your Entry class) you can define a special Entry::invalid value or something similar that can be detected by the calling function.
Your calling code could look like
if ( entryInstance->get_by_last_name("foobar") != Entry::invalid)
{
// here goes the code for the case where the name is valid
} else {
// here goes the code for the case where the name is invalid
}
On the other hand, you can use the C++ exception mechanism and make your function throw an exception. For this you can create your own exception class (or use one defined in the standard library, deriving from std::exception). Your function will throw the exception and your calling code will have to catch it with a try...catch statement.
try
{
entryInstance->get_by_last_name("foobar");
}
catch (const Exception& e)
{
// here goes the code for the case where the name is invalid
}
// here goes the code for the case where the name is valid
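A compact sketch of the exception route that uses a standard exception rather than a custom class: std::map::at throws std::out_of_range when the key is absent. The `Entry` layout, the `add` helper and `phoneOrMessage` are assumptions for illustration.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical layout; the real Entry class is not shown in the question.
struct Entry {
    std::string phone;
};

class AddressBook {
    std::map<std::string, Entry> addr_map;
public:
    void add(const std::string& last_name, const Entry& e) { addr_map[last_name] = e; }

    // std::map::at throws std::out_of_range when the key is absent.
    const Entry& get_by_last_name(const std::string& last_name) const {
        return addr_map.at(last_name);
    }
};

std::string phoneOrMessage(const AddressBook& book, const std::string& last_name) {
    try {
        return book.get_by_last_name(last_name).phone;
    } catch (const std::out_of_range&) {
        return "no such entry";  // the caller decides how to report it
    }
}
```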
Apart from the fact that you could have more than one entry per surname.
Eliminate the getter, and you've solved the problem, or at least shifted it elsewhere.
Tell the AddressBook to display people with given surnames. If there aren't any it can do nothing.
AddressBookRenderer renderer;
AddressBook contacts;
contacts.renderSurnames("smith", renderer);
contacts.renderCompletions("sm", renderer);
//etc
You can do what std::map (and the other containers do).
You return an iterator from your search function.
If the search does not find a value that is useful return an iterator to end().
class AddressBook
{
typedef <Your Container Type> Container;
public:
typedef Container::iterator iterator;
iterator get_by_last_name(std::string const& lastName) {return addr_map.find(lastName);}
iterator end() {return addr_map.end();}
};
Your address book is a container like object.
Not finding an item in a search is likely to happen but it does not have enough context to incorporate error handling code (As the address book could be used from lots of places and each place would have different error handling ideas).
So you must move the test for not found state out of your address book.
Just as in Python, we return a marker. In C++ this is usually an iterator to end(), which the calling code can check in order to take the appropriate action.
AddressBook& ab = getAddressBookRef();
AddressBook::iterator find = ab.get_by_last_name("cpp_hobbyist");
if (find != ab.end())
{
Entity& person = *find; // Here you have a reference to your entity.
// you can now manipulate as you want.
}
else
{
// Display appropriate error message
}
I've stumbled across this great post about validating parameters in C#, and now I wonder how to implement something similar in C++. The main thing I like about it is that it does not cost anything until the first validation fails, as the Begin() function returns null, and the other functions check for this.
Obviously, I can achieve something similar in C++ using Validate* v = 0; IsNotNull(v, ...).IsInRange(v, ...) and have each of them pass on the v pointer, plus return a proxy object for which I duplicate all functions.
Now I wonder whether there is a similar way to achieve this without temporary objects, until the first validation fails. Though I'd guess that allocating something like a std::vector on the stack should be for free (is this actually true? I'd suspect an empty vector does no allocations on the heap, right?)
Other than the fact that C++ does not have extension methods (which prevents you from adding new validations as easily), it shouldn't be too hard.
class Validation
{
vector<string> *errors;
void AddError(const string &error)
{
if (errors == NULL) errors = new vector<string>();
errors->push_back(error);
}
public:
Validation() : errors(NULL) {}
~Validation() { delete errors; }
const Validation &operator=(const Validation &rhs)
{
if (errors == NULL && rhs.errors == NULL) return *this;
if (rhs.errors == NULL)
{
delete errors;
errors = NULL;
return *this;
}
vector<string> *temp = new vector<string>(*rhs.errors);
std::swap(temp, errors);
delete temp; // free the previous error list (may be NULL)
return *this;
}
void Check()
{
if (errors)
throw exception();
}
template <typename T>
Validation &IsNotNull(T *value)
{
if (value == NULL) AddError("Cannot be null!");
return *this;
}
template <typename T, typename S>
Validation &IsLessThan(T valueToCheck, S maxValue)
{
if (!(valueToCheck < maxValue)) AddError("Value is too big!");
return *this;
}
// etc..
};
class Validate
{
public:
static Validation Begin() { return Validation(); }
};
Use..
Validate::Begin().IsNotNull(somePointer).IsLessThan(4, 30).Check();
Can't say much to the rest of the question, but I did want to point out this:
Though I'd guess that allocating something like a std::vector on the stack should be for free (is this actually true? I'd suspect an empty vector does no allocations on the heap, right?)
No. You still have to allocate any other variables in the vector (such as storage for length), and I believe it's up to the implementation whether any room for vector elements is pre-allocated upon construction. Either way, you are allocating SOMETHING, and while it may not be much, allocation is never "free", regardless of whether it takes place on the stack or the heap.
That being said, I would imagine that the time taken to do such things will be so minimal that it will only really matter if you are doing it many many times over in quick succession.
I recommend taking a look at Boost.Exception, which provides basically the same functionality (adding arbitrary detailed exception information to a single exception object).
Of course you'll need to write some utility methods so you can get the interface you want. But beware: Dereferencing a null-pointer in C++ results in undefined behavior, and null-references must not even exist. So you cannot return a null-pointer in a way as your linked example uses null-references in C# extension methods.
For the zero-cost thing: A simple stack-allocation is quite cheap, and a boost::exception object does not do any heap-allocation itself, but only if you attach any error_info<> objects to it. So it is not exactly zero cost, but nearly as cheap as it can get (one vtable-ptr for the exception-object, plus sizeof(intrusive_ptr<>)).
Therefore this should be the last part where one tries to optimize further...
Re the linked article: Apparently, the overhead of creating objects in C# is so great that function calls are free in comparison.
I'd personally propose a syntax like
Validate().ISNOTNULL(src).ISNOTNULL(dst);
Validate() constructs a temporary object which is basically just a std::list of problems. Empty lists are quite cheap (no nodes, size=0). ~Validate will throw if the list is not empty. If profiling shows even this is too expensive, then you just change the std::list to a hand-rolled list. Remember, a pointer is an object too. You're not saving an object just by sticking to the unfortunate syntax of a raw pointer. Conversely, the overhead of wrapping a raw pointer with a nice syntax is purely a compile-time price.
PS. ISNOTNULL(x) would be a #define for IsNotNull(x,#x) - similar to how assert() prints out the failed condition, without having to repeat it.
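A rough sketch of that macro idea. It deviates from the proposal in one respect, which is an assumption of this example: instead of throwing from the destructor (which needs noexcept(false) and some care), the class exposes the collected problems through a hypothetical `problems()` accessor.

```cpp
#include <list>
#include <string>

class Validate {
    std::list<std::string> problems_;  // stays empty while every check passes
public:
    // `name` is the stringified expression supplied by the macro below.
    template <typename T>
    Validate& IsNotNull(const T* value, const char* name) {
        if (value == nullptr) problems_.push_back(std::string(name) + " is null");
        return *this;
    }

    const std::list<std::string>& problems() const { return problems_; }
};

// #x turns the argument expression into a string literal, as assert() does.
#define ISNOTNULL(x) IsNotNull((x), #x)
```

With this, `Validate().ISNOTNULL(src).ISNOTNULL(dst)` records "dst is null" when only `dst` is a null pointer, without the caller having to repeat the variable name.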