Error handling when parsing text file to object

I want to parse a simple text file and create an object from the data it contains. I'm using C++11 for this (and I'm not fluent).
In case of any kind of error (e.g. missing file or invalid text) I wish to tell the caller of my parsing function what went wrong, providing information like what kind of error occurred and where in the file.
I don't consider errors during parsing to be exceptional, so exceptions don't seem to be the way to go.
I thought of returning a struct with all the info, including the resulting parsed object in case of success:
struct ParsingResult
{
    bool success;
    int errorCode;
    int errorLine;
    ParsedObject object;
};
However, I'm not convinced by this solution because, in case of errors, I must still provide a ParsedObject. I can define a default constructor for that, of course, but by its nature a ParsedObject makes sense only when the parsing is successful.
I could change ParsedObject to ParsedObject*, but I'm reluctant to use pointers when not necessary, and I wonder if this can be avoided.
My question: can you suggest a better solution to this problem? What is it?

struct Obj
{
    // your object from the data...
};
struct ParseError
{
    int errorCode;
    int errorLine;
};
class Parser
{
    ParseError m_error;
    // other things
public:
    Parser() : m_error() {} // start with a zeroed error state
    bool parse(const std::string& filename, Obj& parsedObject)
    {
        // Open file, etc...
        // parsedObject.property1 = some_value;
        // if something is wrong, set m_error and return false;
        return true; // everything parsed fine
    }
    ParseError getLastError() const { return m_error; }
};
// in your code
Parser p;
Obj o;
if (!p.parse("filename", o))
{
    ParseError error = p.getLastError();
    // inspect error.errorCode and error.errorLine
}

You can use a
unique_ptr<ParsedObject>
or
shared_ptr<ParsedObject>
which, when default-constructed, compares equal to nullptr and so can signal unsuccessful parsing.
Avoiding raw pointers will free you from having to free memory :)
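To make the idea concrete, here is a minimal C++11 sketch that keeps the error details from the question but replaces the embedded ParsedObject with a std::unique_ptr. The toy file format (one integer per line), the ParseErrorCode values and the sum member are assumptions made up for the example, not part of the question.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for the question's ParsedObject.
struct ParsedObject {
    int sum;
};

enum class ParseErrorCode { None, EmptyInput, NotANumber };

// Error details plus an owning pointer that stays null on failure,
// so no dummy ParsedObject is ever constructed.
struct ParsingResult {
    ParseErrorCode errorCode;
    int errorLine;                         // 1-based; 0 when not tied to a line
    std::unique_ptr<ParsedObject> object;  // null unless parsing succeeded
};

// Toy format (an assumption): one integer per line; the object stores their sum.
ParsingResult parse(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    int lineNo = 0;
    int sum = 0;
    while (std::getline(in, line)) {
        ++lineNo;
        std::istringstream fields(line);
        int n = 0;
        if (!(fields >> n)) {
            return ParsingResult{ParseErrorCode::NotANumber, lineNo, nullptr};
        }
        sum += n;
    }
    if (lineNo == 0) {
        return ParsingResult{ParseErrorCode::EmptyInput, 0, nullptr};
    }
    std::unique_ptr<ParsedObject> object(new ParsedObject{sum});
    return ParsingResult{ParseErrorCode::None, 0, std::move(object)};
}

// Usage sketch: check the pointer before touching the object.
void parse_usage_example() {
    ParsingResult good = parse("1\n2\n3\n");
    assert(good.object && good.object->sum == 6);

    ParsingResult bad = parse("1\nabc\n");
    assert(!bad.object && bad.errorLine == 2);
}
```

The caller gets the error code and line on failure, and a failed parse never has to invent a ParsedObject.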

Related

Generic logger vs. exception pattern

Consider a class foo which has one or more functions which can report failure either through a logger or by throwing an exception if no logger was provided:
struct logger
{
    // ...
};
struct foo
{
    void set_logger(std::shared_ptr<logger> logger) { m_logger = std::move(logger); }
    bool bar(const std::filesystem::path& path)
    {
        // Check path validity
        if (not std::filesystem::exists(path)) {
            if (m_logger) {
                m_logger->warn("path does not exist.");
                return false;
            }
            else {
                throw std::runtime_error("path does not exist.");
            }
        }
        // Some random operation
        try {
            // Do something here that might throw
        }
        catch (const std::exception& e) {
            if (m_logger) {
                m_logger->warn("operation bar failed.");
                return false;
            }
            else {
                throw; // rethrow the original exception; `throw e;` would slice it
            }
        }
        return true;
    }
private:
    std::shared_ptr<logger> m_logger;
};
This not only looks ugly but is also extremely error prone, and as more functions are added to foo the code will become repetitive.
Is there any kind of pattern or paradigm to abstract this logic away? Some kind of wrapper template I can construct to use inside foo's various functions when error reporting is needed?
Anything up to C++20 would be acceptable.

Is there any kind of pattern or paradigm to abstract this logic away? Some kind of wrapper template I can construct to use inside foo's various functions when error reporting is needed?
One thing you could do is to simply remove all the else { throw... } code and provide a default logger that throws an exception containing the logged message. If the client provides a different logger, fine, your code will use that; if not, it uses the default one. This scheme eliminates about half your error handling code and simplifies the flow while providing the same behavior, which seems like a positive outcome.
It's important to remember that logging a message is different from throwing an exception: it's often useful to be able to log a message that doesn't stop the flow of the program. So be sure to give your logger class methods that log without throwing as well; you might use a different log level for messages that are equivalent to exceptions.
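A rough sketch of the default-logger idea, assuming the logger becomes an interface with a virtual warn(). The names throwing_logger and collecting_logger are invented for the example, and bar takes a plain bool in place of the real path check so the snippet stays self-contained.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical logger interface; only warn() is taken from the question.
struct logger {
    virtual ~logger() = default;
    virtual void warn(const std::string& message) = 0;
};

// Default logger: "logging" a failure means throwing it.
struct throwing_logger : logger {
    void warn(const std::string& message) override {
        throw std::runtime_error(message);
    }
};

// Example of a client-supplied logger that only records messages.
struct collecting_logger : logger {
    std::vector<std::string> messages;
    void warn(const std::string& message) override { messages.push_back(message); }
};

struct foo {
    foo() : m_logger(std::make_shared<throwing_logger>()) {}

    void set_logger(std::shared_ptr<logger> l) {
        // Fall back to the throwing default so m_logger is never null.
        if (!l) l = std::make_shared<throwing_logger>();
        m_logger = std::move(l);
    }

    // `valid` stands in for the real check, e.g. std::filesystem::exists(path).
    bool bar(bool valid) {
        if (!valid) {
            m_logger->warn("path does not exist.");  // throws with the default logger
            return false;
        }
        return true;
    }

private:
    std::shared_ptr<logger> m_logger;
};

void logger_usage_example() {
    foo f;  // no logger supplied: failures throw
    bool threw = false;
    try {
        f.bar(false);
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);

    auto log = std::make_shared<collecting_logger>();
    f.set_logger(log);  // client logger: failures are recorded, bar returns false
    assert(!f.bar(false));
    assert(log->messages.size() == 1);
}
```

Each error site shrinks to one warn() call plus a return, and the if/else on m_logger disappears.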

Since your error case handling seems to follow the same structure, you could add a simple function that will do just that:
template <class Exception>
bool log_or_throw(logger * logger_, std::string const& message, Exception const& exception) {
    if (logger_) {
        logger_->warn(message);
        return false;
    }
    else {
        throw exception; // Exception is the concrete type, so nothing is sliced
    }
}
And then you can change your error handling to a single line, probably making the flow a lot more readable:
if (not std::filesystem::exists(path)) {
    return log_or_throw(m_logger.get(), "path does not exist.", std::runtime_error("path does not exist."));
}

Function Return Type: Pointer, Reference or something else?

Let us assume the direct return type of my function always has to be an error code (success or failure of the calculation), so the actual results have to be returned through parameters. Is it better to define them as references (and create them empty beforehand) or to return pointers?
Edit: I should be more precise: the error code is obligatory because I have to stick to the given coding guideline.
Possibility A:
ErrorCode func( some_parameters ... , ReturnType & result)
...
ReturnType result; // empty constructor, probably not good practice
func( some_parameters ..., result);
Possibility B:
ErrorCode func( some_parameters ... , ReturnType *& result){
...
result = new ReturnType(...); // pointer is taken by reference so the caller sees it
...
}
...
ReturnType * result = NULL; // null pointer, filled in by func
func( some_parameters ..., result);
...
delete result; // this is needed if I do not use a smart pointer
Even better: Maybe you have a more appropriate solution?
Edit: Please indicate which standard you are using, since unfortunately (guidelines) I have to stick to C++98.

I would do the following (and in general do)
1.) throw an exception instead of returning error codes
if this is not possible (for any reason)
2.) return the pointer directly (either raw or std::unique_ptr) and return nullptr for failure
if the return type has to be bool, or not all returned objects are pointers / heap allocated
3.) return your error code (bool or enum class) and accept a reference parameter for each object that must be initialized, and a pointer parameter for each object that may optionally be created / initialized
if the object cannot be created in advance of the call (e.g. because it is not default constructible)
4.) pass a reference to a pointer (raw or std::unique_ptr) or a pointer to a pointer, which will then be filled by the function
std::optional (or similar) may be an option if you only have a true/false return code.
I don't like returning std::pair or std::tuple because it can make your code look quite annoying if you have to start using .first/.second or std::get<>() to access your different return types. Using std::tie() can reduce this a little bit, but it is not (yet) very comfortable to use and prevents the use of const.
Examples:
std::unique_ptr<MyClass> func1() { /* ... */ throw MyException("..."); }
std::unique_ptr<MyClass> func2() { /* ... */ }
ErrorCode func3(MyClass& obj, std::string* info = nullptr) { /* ... */ }
ErrorCode func4(std::unique_ptr<MyClass>& obj) { /* ... */ }
int main()
{
    try
    {
        auto myObj1 = func1();
        // use ...
    }
    catch(const MyException& e)
    {
        // handle error ...
    }
    if(auto myObj2 = func2())
    {
        // use ...
    }
    MyClass myObj3;
    std::string info;
    ErrorCode error3 = func3(myObj3, &info);
    if(error3 == ErrorCode::NoError)
    {
        // use ...
    }
    std::unique_ptr<MyClass> myObj4;
    ErrorCode error4 = func4(myObj4); // distinct name: `error` cannot be declared twice in one scope
    if(error4 == ErrorCode::NoError)
    {
        // use ...
    }
}
Edit: And in general it is advisable to keep your API consistent, so if you already have a medium or large codebase, which makes use of one or the other strategy you should stick to that (if you do not have good reasons not to).

This is a typical use case for std::optional. Sadly it only becomes part of the standard with C++17, so until then you want to use boost::optional.
This is assuming that the result is always either "success with result" or "fail without result". If your result code is more complicated you can return
std::pair<ResultCode, std::optional<ReturnType>>.
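As an illustration only, here is a small sketch of that return type, assuming a compiler with C++17 (the same shape works with boost::optional). ResultCode's values and the parse_int function are made up for the example.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <utility>

enum class ResultCode { Ok, Empty, NotANumber };  // hypothetical codes

// The optional holds a value only on success; the code says why it failed.
std::pair<ResultCode, std::optional<int>> parse_int(const std::string& text) {
    if (text.empty()) {
        return {ResultCode::Empty, std::nullopt};
    }
    std::istringstream in(text);
    int value = 0;
    char extra = 0;
    // Fail if no integer can be read, or if anything but whitespace follows it.
    if (!(in >> value) || (in >> extra)) {
        return {ResultCode::NotANumber, std::nullopt};
    }
    return {ResultCode::Ok, value};
}

void parse_int_usage_example() {
    auto result = parse_int("42");
    assert(result.first == ResultCode::Ok);
    if (result.second) {
        assert(*result.second == 42);  // safe: the optional is engaged here
    }
}
```

No ReturnType has to be default-constructed up front, and the caller cannot read the value without first looking at the optional.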

It would be good style to use the return value for all return information. For example:
std::tuple<bool, ReturnType> func(input_args....)
Alternatively, the return type could be std::optional (or its precursor) if the status code is boolean, with an empty optional indicating that the function failed.
However, if the calculation is supposed to normally succeed, and only fail in rare circumstances, it would be better style to just return ReturnType, and throw an exception to indicate failure.
Code is much easier to read when it doesn't have error-checking on every single return value; but to be robust code those errors do need to be checked somewhere or other. Exceptions let you handle a range of exceptional conditions in a single place.

I don't know if it's applicable in your situation, but if your return code has only two states, maybe just return a pointer from your function and then test whether it is nullptr?

Is there an idiom like `if (Value * value = getValue())` when you branch on an expression of the retrieved value?

I am often using the common
if (Value * value = getValue())
{
// do something with value
}
else
{
// handle lack of value
}
Now, I also often do
QString error = someFunctionReturningAnErrorString(arg);
if (!error.isEmpty())
{
// handle the error
}
// empty error means: no error
That's all fine but I would like the error variable to be scoped to the if-block. Is there a nice idiom for that? Obviously, I can just wrap the whole part inside another block.
This, obviously, does not work:
if(QString error = someFunctionReturningAnErrorString(arg), !error.isEmpty())
{
// handle the error
}
// empty error means: no error
And unfortunately (but for good reasons) the QString cannot be converted to bool, so this does not work either:
if(QString error = someFunctionReturningAnErrorString(arg))
{
// handle the error
}
// empty error means: no error
Any suggestions?

No. There is no idiom like this, and there is no syntax like this!
Besides, you have reached the point at which it is no longer worthwhile to make your code more and more obfuscated.
Simply write it as you do now.
If you really don't want the scope leakage, introduce a new scope:
{
const QString error = someFunctionReturningAnErrorString(arg);
if (!error.isEmpty()) {
// handle the error
}
}
// The above-declared `error` doesn't exist down here
I use this pattern quite a lot, though I've been fairly accused of scope-addiction, so take that as you will.
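A note for newer compilers: C++17 added an if statement with an initializer, which gives this scoping without the extra block. A minimal sketch, assuming C++17 and using std::string in place of QString so that it is self-contained. The names validate and process are made up for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for someFunctionReturningAnErrorString(arg):
// an empty string means "no error".
std::string validate(int arg) {
    return arg < 0 ? std::string("negative argument") : std::string();
}

// Returns true when an error was found and handled.
bool process(int arg) {
    // `error` is visible only inside this if statement (and its else, if any).
    if (const std::string error = validate(arg); !error.empty()) {
        // handle the error
        return true;
    }
    // `error` does not exist down here
    return false;
}

void process_usage_example() {
    assert(process(-1));   // error string was non-empty
    assert(!process(3));   // empty error means: no error
}
```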

The only way to use that idiom while still keeping your code understandable is if your function returns an object that is convertible to bool in a way that true indicates that you want to take the branch and false means that you do not care about it. Anything else is just going to lead to write-only code.
One such object which may be relevant happens to be boost::optional. Given:
boost::optional<QString> someFunctionReturningAnErrorString(T arg);
You could use the idiom you want in a natural way:
if (auto error = someFunctionReturningAnErrorString(arg)) {
// ...
}
An added benefit: I'd consider an optional error message more semantically meaningful than having to check for an empty error message.

There is basically no clean way to do that.
I'd recommend you just define an extra block around the if, but if you really want to have that exact syntax, a solution could be to declare your own class wrapping QString:
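struct ErrorString
{
    ErrorString(QString&& s) : s{std::move(s)} {}
    // explicit: usable in an if condition, but no accidental conversions elsewhere
    explicit operator bool() const { return !s.isEmpty(); }
    QString s;
};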
And then you could write:
if(ErrorString error = someFunctionReturningAnErrorString(arg))
{
// handle the error
}
// empty error means: no error
But I'm not particularly fond of this solution.

You could use:
for(QString error = someFunctionReturningAnErrorString(arg); !error.isEmpty(); /* too bad 'break' is invalid here */)
{
// handle the error
break;
}
but this is ugly, and makes your code hard to read. So please don't.

if(auto message = maybe_filter( getError(arg), [](auto&&str){
    return !str.isEmpty();
})) {
}
where maybe_filter takes a T and a test function and returns optional<T>. The optional<T> is empty if evaluating the test function on the T gives you false, and holds the T otherwise.
Or really, modify your error getting API to return an optional string.
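One possible shape for maybe_filter, as a sketch assuming C++17's std::optional (boost::optional would work the same way). std::string stands in for QString, and get_error and handle are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Keeps `value` only if `test(value)` is true; otherwise returns an empty optional.
template <class T, class Test>
std::optional<T> maybe_filter(T value, Test test) {
    if (test(value)) {
        return std::optional<T>(std::move(value));
    }
    return std::nullopt;
}

// Hypothetical stand-in for getError(arg); an empty string means no error.
std::string get_error(int arg) {
    return arg < 0 ? std::string("negative argument") : std::string();
}

// Returns true when an error was found and handled.
bool handle(int arg) {
    if (auto message = maybe_filter(get_error(arg),
                                    [](const std::string& s) { return !s.empty(); })) {
        // *message is the non-empty error text, scoped to this if.
        return !message->empty();
    }
    return false;
}

void maybe_filter_usage_example() {
    assert(handle(-1));
    assert(!handle(1));
}
```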

You can use a lambda.
auto error_string_handler = [](QString && error) {
    if (error.isEmpty()) return;
    //...
};
error_string_handler(someFunctionReturningAnErrorString(arg));

Stop process if there is an error in the constructor

In a utility class file, I want to open a file to read or write it.
If I can't open it, I don't want to continue the process.
FileUtility::FileUtility(const char *fileName) {
    ifstream in_stream;
    in_stream.open(fileName);
}
FileUtility fu = FileUtility("bob.txt");
fu.read();
fu.write();
The file bob.txt doesn't exist, so I don't want the read and write methods to run.
Is there a clean way to do it?

When construction of an object fails in C++, you should throw an exception, or propagate the exception from the failed construction of the subobject.
FileUtility(const char* filename) {
std::ifstream in_stream;
in_stream.exceptions(std::ios_base::failbit);
in_stream.open(filename); // will throw if file can't be opened
}
In the calling code you can choose to handle the exception:
try {
    FileUtility fu = FileUtility("bob.txt");
} catch (const std::ios_base::failure&) {
    printf("Failed to open bob.txt\n");
    exit(EXIT_FAILURE);
}
// do other stuff
Or, if you don't catch the exception, the runtime will just call std::terminate(), which will print out its own error message, which may or may not be helpful:
terminate called after throwing an instance of 'std::ios_base::failure'
what(): basic_ios::clear
Aborted
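For completeness, a sketch of how the class might look if the stream is kept as a member, so that read() can still use it after the constructor returns. The member name in_stream_ and the read_line method are assumptions; the question does not show them.

```cpp
#include <cassert>
#include <fstream>
#include <string>

class FileUtility {
public:
    explicit FileUtility(const char* fileName) {
        assert(fileName != nullptr);
        // Make a failed open on this stream throw std::ios_base::failure.
        in_stream_.exceptions(std::ios_base::failbit);
        in_stream_.open(fileName);  // throws if the file cannot be opened
        // Back to the default, so a later read hitting end-of-file does not throw.
        in_stream_.exceptions(std::ios_base::goodbit);
    }

    // Reads the next line; only reachable on a successfully constructed object.
    std::string read_line() {
        std::string line;
        std::getline(in_stream_, line);
        return line;
    }

private:
    std::ifstream in_stream_;  // a member, so it outlives the constructor
};
```

Because the constructor throws, a FileUtility object never exists in a "file not open" state, so read and write need no extra checks.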

There are generally four ways error state can be communicated from a callee to a caller:
1. Direct return value (return code or OUT parameter).
A return code is not possible for a constructor call, although an OUT parameter is. However, it's somewhat invasive to require every function to provide its return code or an OUT parameter for this purpose, so I don't like this solution in general, although it is certainly heavily used in various libraries and APIs. You could use this approach by adding a pointer or reference parameter to your constructor, to which the caller could provide the address of some local error variable, into which the constructor could store a possible return value. I don't recommend this.
2. Exceptions.
There is a somewhat polarized debate on the pros and cons of exceptions, in both C++ code and in other languages. I may take some downvotes for saying this, but my personal opinion is that exceptions should be avoided like the plague. See http://www.joelonsoftware.com/items/2003/10/13.html for someone who shares my view. But this is a workable solution if you're so inclined. See @Brian's answer for a good demonstration of this solution.
3. Object attribute.
The std::ifstream object actually does this, so you can leverage that. (Actually, from your example code, you define your std::ifstream as a local variable in the constructor, which implies it won't persist after the call, but since you call some kind of read() and write() methods on the constructed object, that implies that you do persist it after the call, so I'm going to assume the latter is the correct inference.) You can leverage that, by calling std::ifstream::is_open(). If you want to maintain encapsulation of the std::ifstream, you can define your own is_open() equivalent on FileUtility that will simply return in_stream.is_open();, again, assuming it is retained as an attribute on your FileUtility class.
struct FileUtility {
ifstream ifs;
FileUtility(const char* fileName);
bool is_open(void) const;
};
FileUtility::FileUtility(const char* fileName) { ifs.open(fileName); }
bool FileUtility::is_open(void) const { return ifs.is_open(); }
FileUtility fu = FileUtility("bob.txt");
if (!fu.is_open()) return 1;
Alternatively, you could create a whole new error state layer just for the FileUtility class, and propagate the std::ifstream error through that. For example:
struct FileUtility {
static const int ERROR_NONE = 0;
static const int ERROR_BADFILE = 1;
ifstream ifs;
int error;
FileUtility(const char* fileName);
};
FileUtility::FileUtility(const char* fileName) : error(ERROR_NONE) {
ifs.open(fileName);
if (!ifs.is_open()) { error = ERROR_BADFILE; return; }
}
FileUtility fu = FileUtility("bob.txt");
if (fu.error != FileUtility::ERROR_NONE) return 1;
These are reasonable solutions.
4. Global error state.
I wouldn't be surprised if some programmers were to respond with a "that sounds like a bad idea" reaction to this possible solution, but the truth is that many extremely successful and prominent code bases use this solution for communicating error state. Perhaps the best examples are the errno variable used by the C Standard Library (although it should be mentioned that errno sort of works in conjunction with direct return codes), and the GetLastError() system used by the Windows C API. I suppose some might argue that that's really the "C approach", and exceptions are the "C++ approach", but again, I avoid exceptions like the plague.
As an aside, multithreadedness is not a problem for this solution, because errno and GetLastError() both use thread-local error state, rather than true global error state.
I like this solution best, because it's simple, extremely uninvasive, and can easily be reused by different code bases, provided of course that you define the error framework (basically the thread-local variable and possibly the ERROR_NONE macro/global; see below) in its own library, in which case your code gains a consistency when it comes to error handling.
Example:
#define ERROR_NONE 0
thread_local int error = ERROR_NONE;
struct FileUtility {
static const int ERROR_BADFILE = 1;
ifstream ifs;
FileUtility(const char* fileName);
};
FileUtility::FileUtility(const char* fileName) {
ifs.open(fileName);
if (!ifs.is_open()) { error = ERROR_BADFILE; return; }
}
FileUtility fu = FileUtility("bob.txt");
if (error != ERROR_NONE) return 1;
This is the solution I'd recommend; otherwise I'd go with an object attribute solution.

Have I to throw an exception?

Suppose I have a static method of my class that returns an object of the same type as my class. To create the object, this method has to, for example, parse a string:
class C
{
public:
static C get_obj(const std::string& str)
{
C obj;
// Parse the string and set obj properties
return obj;
}
};
If I get an error while parsing the string and the object can't be constructed as a valid object, do I have to throw an exception, or is there another way?

Given that there is a possibility of failure in get_obj the failure must be reported back to the caller in some manner. This is typically either done by
Throwing an exception
Communicating the failure in the output of the method
In this particular case the only output of the method is a C instance. Given that, throwing an exception is probably the best option for a method of this signature. The only other choice is to embed the success / failure inside the C object, which you almost certainly don't want to do.
Another way to approach this problem is the try_parse pattern: let a bool return value indicate success / failure, and return the constructed object on success through a reference parameter:
bool try_parse(const std::string& str, C& obj) {
if (string is valid) {
obj = C(...);
return true;
}
return false;
}
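To show the pattern with a real validity check, here is a minimal sketch in which C holds two integers parsed from a string of the form "x,y". That format and the members x_ and y_ are assumptions for illustration only.

```cpp
#include <cassert>
#include <sstream>
#include <string>

class C {
public:
    C() : x_(0), y_(0) {}
    C(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }

    // Expects "x,y" with two integers; leaves `obj` untouched on failure.
    static bool try_parse(const std::string& str, C& obj) {
        std::istringstream in(str);
        int x = 0, y = 0;
        char comma = 0, extra = 0;
        // Fail on missing numbers, a wrong separator, or trailing characters.
        if (!(in >> x >> comma >> y) || comma != ',' || (in >> extra)) {
            return false;
        }
        obj = C(x, y);
        return true;
    }

private:
    int x_, y_;
};

void try_parse_usage_example() {
    C c;
    if (C::try_parse("3,4", c)) {
        assert(c.x() == 3 && c.y() == 4);
    }
    assert(!C::try_parse("3;4", c));  // wrong separator: c keeps its old value
    assert(c.x() == 3 && c.y() == 4);
}
```

The trade-off is that C must be constructible before parsing (here via a default constructor), which is exactly what the throwing version avoids.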

I'd say you should throw an exception. This way you notify the client that the obj could not be obtained, and force him to deal with this.
If not important (not critical), you could return a special C that would act as a sentinel value indicating that something went wrong. The client will choose whether to do something about it or not.
I'd go with the exception. The second approach is not recommended.

Yes it is perfectly valid to throw an exception.
It's the same reasoning as when constructing an object: if you cannot proceed with the construction, you have very little choice but to throw an exception.

Yes, you do need to throw an exception.
class C
{
public:
    static C get_obj(const std::string& str)
    {
        try
        {
            C obj;
            // Parse the string and set obj properties
            return obj;
        }
        catch (int x)
        {
            std::cout << "blahblah";
            throw; // rethrow so the caller still learns about the failure
        }
    }
};
If the object cannot be constructed, you risk ending up with a zero-valued object, which can cause a lot of trouble further on.
