So, I've been doing some library development and came to a dilemma. The library is private, so I can't share it, but I feel this could be a meaningful question.
The dilemma presented itself as an issue over why there is no default constructor for a resource-handling class in the library. The class handles a specific file structure, which is not really important, so let's call the class Quake3File.
The request was then to implement a default constructor and the "appropriate" Open/Close methods. My line of thinking is RAII style: if you create an instance of the class, you must give it a resource that it handles. Doing this ensures that any and all successfully constructed handles are valid and, IMO, eliminates a whole class of bugs.
My suggestion was to keep a (smart) pointer around: instead of implementing Open/Close and opening a can of worms, the user creates the object on the free store to "Open" it and deletes it to "Close" it. Using a smart pointer will even "Close" it for you when it goes out of scope.
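To make the suggestion concrete, here is a minimal sketch of what I have in mind; the real parsing is elided and the member names are hypothetical:

#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

class Quake3File {
public:
    explicit Quake3File(const std::string& path)
        : file_(std::fopen(path.c_str(), "rb"))
    {
        if (!file_)
            throw std::runtime_error("cannot open " + path);
        // ...parse the file structure here...
    }
    ~Quake3File() { std::fclose(file_); }

    // Non-copyable: exactly one handle owns the resource.
    Quake3File(const Quake3File&) = delete;
    Quake3File& operator=(const Quake3File&) = delete;

private:
    std::FILE* file_;
};

int main() {
    // "Open" by creating on the free store, "Close" by destroying:
    std::unique_ptr<Quake3File> f(new Quake3File("maps/q3dm17.bsp"));
    f.reset();   // explicit "Close"; also happens automatically at scope exit
}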
This is where the conflict comes in: I like to mimic the STL's class design, since that makes my classes easier to use. I'm making a class that essentially deals with files, and if I take std::fstream as a guide, then I'm not sure whether I should implement a default constructor. The fact that the entire std::fstream hierarchy has one points me to a Yes, but my own thinking says No.
So the questions are more or less:
Should resource handles really have default constructors?
What is a good way to implement a default constructor on a class that deals with files? Just set the internal state to an invalid one, and if a user misuses it by not giving it a resource, have it result in undefined behavior? It seems strange to want to take it down this route.
Why does the STL implement the fstream hierarchy with default constructors? Legacy reasons?
Hope my question is understood. Thanks.
One could say that there are two categories of RAII classes: "always valid" and "maybe empty" classes. Most classes in standard libraries (or near-standard libraries like Boost) are of the latter category, for a couple of reasons that I'll explain here. By "always valid", I mean classes that must be constructed into a valid state and then remain valid until destruction. By "maybe empty", I mean classes that could be constructed in an invalid (or empty) state, or become invalid (or empty) at some point. In both cases, the RAII principle remains: the class owns the resource and manages it automatically, releasing it upon destruction. So, from a user's perspective, both enjoy the same protection against leaking resources. But there are some key differences.
First thing to consider is that one key aspect of almost any resource I can think of is that resource acquisition can always fail. For example, you could fail to open a file, fail to allocate memory, fail to establish a connection, fail to create a context for the resource, etc. So, you need a method to handle this potential failure. In an "always valid" RAII class, you have no choice but to report that failure by throwing an exception from the constructor. In a "maybe empty" class, you can choose to report that failure either by leaving the object in an empty state or by throwing an exception. This is probably one of the main reasons why the IO-streams library uses that pattern: it makes exception-throwing an optional feature of its classes (probably because of many people's reluctance to use exceptions too much).
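To illustrate the two reporting styles side by side (a toy sketch; the acquire_ok flag stands in for a real acquisition that might fail):

#include <stdexcept>

// "Always valid": the constructor is the only acquisition point,
// so failure can only be reported by throwing.
class AlwaysValid {
public:
    explicit AlwaysValid(bool acquire_ok) {
        if (!acquire_ok)
            throw std::runtime_error("acquisition failed");
    }
};

// "Maybe empty": construction never fails; failure just leaves
// the object empty, and the user checks before use.
class MaybeEmpty {
public:
    MaybeEmpty() : valid_(false) {}
    bool open(bool acquire_ok) { valid_ = acquire_ok; return valid_; }
    bool is_open() const { return valid_; }
private:
    bool valid_;
};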
Second thing to consider is that "always valid" classes cannot be movable classes. Moving a resource from one object to another implies making the source object "empty". This means that an "always valid" class will have to be non-copyable and non-movable, which could be a bit of an annoyance for the users, and could also limit your own ability to provide an easy-to-use interface (e.g., factory functions, etc.). It will also require the user to allocate the object on the free store whenever he needs to move it around.
(EDIT)
As pointed out by DyP below, you could have an "always valid" class that is movable, as long as you can put the object in a destructible state. In other words, any other subsequent use of the object would be UB, only destruction will be well-behaved. It remains, however, that a class that enforces an "always valid" resource will be less flexible and cause some annoyance for the user.
(END EDIT)
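A sketch of that compromise, with a hypothetical file handle: the move constructor leaves the source null, and the destructor is the only member that tolerates that state:

#include <cstdio>
#include <stdexcept>

class FileHandle {
public:
    explicit FileHandle(const char* path)
        : f_(std::fopen(path, "rb"))
    {
        if (!f_)
            throw std::runtime_error("open failed");
    }
    FileHandle(FileHandle&& other) noexcept : f_(other.f_) {
        other.f_ = nullptr;   // moved-from object is now only destructible
    }
    ~FileHandle() {
        if (f_)               // the single check needed for the moved-from state
            std::fclose(f_);
    }

    FileHandle(const FileHandle&) = delete;
    FileHandle& operator=(const FileHandle&) = delete;

private:
    std::FILE* f_;
};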
Obviously, as you pointed out, an "always valid" class will be, in general, more fool-proof in its implementation because you don't need to consider the case where the resource is empty. In other words, when you implement a "maybe empty" class you have to check, within each member function, if the resource is valid (e.g., if the file is open). But remember that "ease of implementation" is not a valid reason to dictate a particular choice of interface; the interface faces the user. However, this problem is also true for code on the user's side. When the user deals with a "maybe empty" object, he always has to check validity, and that can become troublesome and error-prone.
On the other hand, an "always valid" class will have to rely solely on exception mechanisms to report its errors (i.e., error conditions don't disappear because of the postulate of "always valid"), and thus can present some interesting challenges in its implementation. In general, you will have to have strong exception safety guarantees for all your functions involving that class, including both implementation code and user-side code. For example, if you postulate that the object is "always valid", and you attempt an operation that fails (like reading beyond the end of a file), then you need to roll back that operation and bring the object back to its original valid state, to enforce your "always valid" postulate. The user will, in general, be forced to do the same when relevant. This may or may not be compatible with the kind of resource you are dealing with.
(EDIT)
As pointed out by DyP below, there are shades of grey between those two types of RAII classes. So, please note that this explanation is describing two pole-opposites or two general classifications. I am not saying that this is a black-and-white distinction. Obviously, many resources have varying degrees of "validity" (e.g., an invalid file-handler could be in a "not opened" state or a "reached end-of-file" state, which could be handled differently, i.e., like a "always opened", "maybe at EOF", file-handler class).
(END EDIT)
Should resource handles really have default constructors?
Default constructors for RAII classes are generally understood as creating the object into an "empty" state, meaning that they are only valid for "maybe empty" implementations.
What is a good way to implement a default constructor on a class that deals with files? Just set the internal state to an invalid one, and if a user misuses it by not giving it a resource, have it result in undefined behavior? It seems strange to want to take it down this route.
Most resources that I have ever encountered have a natural way to express "emptiness" or "invalidity", whether it be a null pointer, null file-handle, or just a flag to mark the state as being valid or not. So, that's easy. However, this does not mean that a misuse of the class should trigger "undefined behavior". It would be absolutely terrible to design a class like that. As I said earlier, there are error conditions that can occur, and making the class "always valid" does not change that fact, only the means by which you deal with them. In both cases, you must check for error conditions and report them, and fully specify the behavior of your class in case they happen. You cannot just say "if something goes wrong, the code has 'undefined behavior'", you must specify the behavior of your class (one way or another) in case of error conditions, period.
Why does the STL implement the fstream hierarchy with default constructors? Legacy reasons?
First, the IO-stream library is not part of the STL (Standard Template Library); that's a common mistake. Anyway, if you read my explanations above you will probably understand why the IO-stream library chose to do things the way it does. I think it essentially boils down to avoiding exceptions as a required, fundamental mechanism in its implementation. It allows exceptions as an option but doesn't make them mandatory, and I think that must have been a requirement for many people, especially back when it was written, and probably still today.
I think that each case should be considered separately, but for a file class I'd certainly consider the introduction of an "invalid state" like "file cannot be opened" (or "no file attached to the wrapper handler class").
For example, if you don't have this "invalid file" state, you force a file-loading method or function to throw an exception when a file cannot be opened. I don't like that, because the caller then has to wrap file-loading code in lots of try/catch blocks, when a good old boolean check would be just fine.
// *** I don't like this: ***
try
{
    File f1 = loadFile("foo1");
}
catch (FileException& e)
{
    // ... handle load failure, e.g. use some defaults for f1
}
doSomething();
try
{
    File f2 = loadFile("foo2");
}
catch (FileException& e)
{
    // ... handle load failure for f2
}
I prefer this style:
File f1 = loadFile("foo");
if (! f1.valid())
... handle load failure, e.g. use some default settings for f1...
doSomething();
File f2 = loadFile("foo2");
if (! f2.valid())
... handle load failure
Moreover, it may also make sense to make the File class movable (so you can put File instances in containers, e.g. std::vector<File>), and in this case you must have an "invalid" state for a moved-from File instance.
So, for a File class, I'd consider the introduction of an invalid state to be just fine.
I also wrote a RAII template wrapper to raw resources, and I implemented an invalid state there as well. Again, this makes it possible to properly implement move semantics, too.
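Something along these lines (a hedged sketch, not the actual wrapper; Handle{} serves as the "invalid" value, which doubles as the moved-from state):

#include <cstdio>

template <typename Handle, typename Closer>
class Resource {
public:
    Resource() : h_(Handle{}) {}                    // default == invalid state
    explicit Resource(Handle h) : h_(h) {}
    Resource(Resource&& o) noexcept : h_(o.h_) { o.h_ = Handle{}; }
    Resource& operator=(Resource&& o) noexcept {
        if (this != &o) { reset(); h_ = o.h_; o.h_ = Handle{}; }
        return *this;
    }
    ~Resource() { reset(); }

    bool valid() const { return h_ != Handle{}; }
    void reset() { if (valid()) Closer()(h_); h_ = Handle{}; }

    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

private:
    Handle h_;
};

// Example instantiation:
struct FileCloser { void operator()(std::FILE* f) const { std::fclose(f); } };
using File = Resource<std::FILE*, FileCloser>;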
At least IMO, your thinking on this subject is probably better than that shown in iostreams. Personally, if I were creating an analog of iostreams from scratch today, it probably would not have a default ctor and separate open. When I use an fstream, I nearly always pass the file name to the ctor rather than default-constructing followed by using open.
Nearly the only point in favor of having a default ctor for a class like this is that it makes putting them into a collection easier. With move semantics and the ability to emplace objects, that becomes much less compelling though. It was never truly necessary, and is now almost irrelevant.
After adding the comment "// not null" to a raw pointer for the Nth time I wondered once again whatever happened to the not_null template.
The C++ Core Guidelines were created quite some time ago now, and a few things from them have made it into the standard, including for example std::span (some, like string_view and std::array, originated before the Core Guidelines themselves but are sometimes conflated with them). Given its relative simplicity, why hasn't not_null (or anything similar) made it into the standard yet?
I scan the ISO mailings regularly (but perhaps not thoroughly) and I am not even aware of a proposal for it.
Possibly answering my own question: I do not recall coming across any cases where it would have prevented a bug in code I've worked on, as we try not to write code that way.
The guidelines themselves are quite popular, making it into clang-tidy and sonar for example. The support libraries seem a little less popular.
For example, Boost has been available as a package on Linux almost from the start. I am not aware of any implementation of the GSL that is. Though I presume it is bundled with Visual C++ on Windows.
Since people have asked in the comments: I myself would use it to document intent.
A construct like not_null<> potentially has semantic value which a comment does not.
Enforcing it is secondary though I can see its place. This would preferably be done with zero overhead (perhaps at compile time only for a limited number of cases).
I was thinking mainly about the case of a raw pointer member variable. I had forgotten about the case of passing a pointer to a function for which I always use a reference to mean not-null and also to mean "I am not taking ownership".
Likewise (for class members) we could also document ownership with owned<> / not_owned<>.
I suppose there is also the question of whether the associated object is allowed to be changed. This might be too high-level, though. You could use reference members instead of pointers to document this. I avoid reference members myself because I almost always want copyable and assignable types. However, see for example Should I prefer pointers or references in member data? for some discussion of this.
Another dimension is whether another entity can modify a variable.
"const" says I promise not to modify it. In multi-threaded code we would like to say almost the opposite. That is "other code promises not to modify it while we are using it" (without an explicit lock) but this is way off topic...
There is one big technical issue that is likely unsolvable which makes standardizing not_null a problem: it cannot work with move-only smart pointers.
The most important use case for not_null is with smart pointers (for raw pointers a reference usually is adequate, but even then, there are times when a reference won't work). not_null<shared_ptr<T>> is a useful thing that says something important about the API that consumes such an object.
But not_null<unique_ptr<T>> doesn't work. It cannot work. The reason is that moving from a unique_ptr leaves the old object null, which is exactly what not_null is expected to prevent. Therefore, not_null<T> always forces a copy of its contained T. Which... you can't do with a unique_ptr, because that's the whole point of the type.
Being able to say that the unique_ptr consumed by an API is not null is good and useful. But you can't actually do that with not_null, which puts a hole in its utility.
So long as move-only smart pointers can't work with not_null, standardizing the class becomes problematic.
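A short sketch of the hole (consume and the wrapper are hypothetical; the point is what moving does to the source):

#include <memory>
#include <utility>

void consume(std::unique_ptr<int> p) { /* takes ownership */ }

void demo() {
    std::unique_ptr<int> p(new int(42));
    consume(std::move(p));
    // p is now null. If p had been wrapped in a not_null-like type, the move
    // itself would have violated the "never null" invariant; that is why
    // not_null insists on copying, which unique_ptr cannot do.
}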
There's a problem I've been running into lately, and since I'm a self-taught C++ programmer I'd really like to know how professionals in the real world solve it.
Is it a good idea to write a default constructor for all classes? Aren't there certain parts of the STL that won't work if your classes don't have default constructors?
IF SO, then how does one write a default constructor that does sensible things? That is, how can I assign default values to my private members if there simply are no sensible default values? I can only think of two solutions:
Use pointers (or unique_ptrs) for each member and that way a nullptr means the field is uninitialized.
OR
Add extra fields/logic/methods to do the work of checking to see whether or not a field has been initialized and rely on that (think kind of like unique_ptr's "reset" method).
How do people solve problems like this in the real world?
If it doesn't make sense for your data type to have a default constructor, then don't write one.
(STL is long dead, but I assume you mean the standard library.) Most standard library containers work well even if the contained type doesn't have a default constructor. Some notable gotchas:
std::vector<T>::resize(n) requires T to have a default constructor. But without one, you can use erase and insert instead.
std::map<K,V>::operator[] and std::unordered_map<K,V>::operator[] require V to have a default constructor. But without one, you can use find and insert instead. (Both workarounds are sketched below.)
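Both workarounds in a compilable sketch, assuming a type NoDefault with no default constructor:

#include <map>
#include <utility>
#include <vector>

struct NoDefault {
    explicit NoDefault(int v) : value(v) {}
    int value;
};

int main() {
    std::vector<NoDefault> v;
    // v.resize(3);                       // error: needs a default constructor
    v.insert(v.end(), 3, NoDefault(7));   // grow with an explicit value instead
    v.erase(v.begin(), v.begin() + 2);    // shrink explicitly

    std::map<int, NoDefault> m;
    // m[1];                              // error: operator[] default-constructs V
    m.insert(std::make_pair(1, NoDefault(7)));  // insert explicitly
    auto it = m.find(1);                  // look up without creating an entry
    return it != m.end() ? it->second.value : 0;
}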
Is it a good idea to write a default constructor for all classes?
No. If there is no sensible “default value” for your type, you should not write a default constructor. Doing so would violate the principle of least surprise.
That said, many types do have a sensible default value.
the number zero (more generally: the neutral element)
the empty string
an empty list
a 0 × 0 matrix
the time-zone UTC ± 00:00
…
For such types, you really should define a default constructor.
Other types don't have a natural default value but have an “empty” state that can be reached by performing certain operations. It is sensible to default-construct such an object to have that state.
an I/O stream disconnected from any source / sink that fails every operation (can be reached by reaching the end of the file or encountering an I/O error)
a lock guard that doesn't hold a lock (can be reached by releasing the lock)
a smart pointer that doesn't own an object (can be reached by releasing the managed object)
…
For these types, it is a trade-off whether to define a default constructor. Doing so does no harm but makes your type slightly more complicated. If it is a public type found in a library, then it is probably worth the trouble. If you're going to implement a move constructor (and assignment operator) anyway, you can equally well define the default constructor to construct the object in a state that would be reached by moving away from it.
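For instance, a simplified lock-guard-like type (a sketch in the spirit of std::unique_lock, not its real interface) where the default-constructed and moved-from states coincide:

#include <mutex>

class ScopedLock {
public:
    ScopedLock() : m_(nullptr) {}                               // "holds no lock"
    explicit ScopedLock(std::mutex& m) : m_(&m) { m_->lock(); }
    ScopedLock(ScopedLock&& o) noexcept : m_(o.m_) { o.m_ = nullptr; }  // same state
    ~ScopedLock() { if (m_) m_->unlock(); }

    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    std::mutex* m_;
};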
For other types, you just cannot define a sensible default.
a day of the week
a name for a baby
…
Do not invent an artificial "null" state for these types just to make them default-constructible. Doing so complicates the code and forces you to provide less useful class invariants than you could without that artificial state. If somebody really needs that additional state, they can easily use an optional<T>, but going the other way round is not possible.
Aren't there certain parts of the STL that won't work if your classes don't have default constructors?
Yes. std::vector::resize is probably the most prominent example. But don't worry about these. If your type does not have a sensible default value, performing that operation isn't meaningful either. It's not the fault of your type. It's inherent to the nature of the concept you're trying to model.
Is it a good idea to write a default constructor for all classes?
No. Sometimes there is no sense in having a default value for an object.
Aren't there certain parts of the STL that won't work if your classes don't have default constructors?
There are some parts which require DefaultConstructible objects. And there are ways to circumvent that: overloads which take an object to use instead of a default-constructed one, as sketched below.
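For example, the standard containers often provide an overload that takes the value to use (a small sketch with a hypothetical NoDefault type):

#include <vector>

struct NoDefault {
    explicit NoDefault(int v) : value(v) {}
    int value;
};

int main() {
    // The single-argument resize(n) requires a default constructor;
    // the two-argument overload copies the value you pass instead.
    std::vector<NoDefault> v(3, NoDefault(1));   // constructor overload
    v.resize(5, NoDefault(2));                   // resize overload
    return static_cast<int>(v.size());
}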
Background
Suppose we have an implementation of the abstract factory which is invoked as follows:
std::string ObjectName = "An Object Name";
std::string Args = "Argument passed directly to constructor by the factory";
std::unique_ptr<MyBaseClass> MyPtr(ObjectFactory::Instance().Construct(ObjectName,Args));
The factory uses a std::map to turn "An Object Name" into a constructor, which itself takes a std::string as an argument. The idea is that the user knows more about the constructed objects than I do, so I should get out of the way and let the user pass any information they want to the constructor.
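For context, a condensed sketch of such a factory (hypothetical names; Construct returns the raw pointer that the caller wraps, as in the snippet above):

#include <functional>
#include <map>
#include <string>

struct MyBaseClass { virtual ~MyBaseClass() {} };

class ObjectFactory {
public:
    static ObjectFactory& Instance() { static ObjectFactory f; return f; }

    typedef std::function<MyBaseClass*(const std::string&)> Maker;

    void Register(const std::string& name, Maker maker) {
        makers_[name] = maker;
    }

    MyBaseClass* Construct(const std::string& name, const std::string& args) const {
        auto it = makers_.find(name);
        return it != makers_.end() ? it->second(args) : nullptr;
    }

private:
    std::map<std::string, Maker> makers_;
};

// Registering one concrete type (args forwarded straight to its constructor):
struct SomeObject : MyBaseClass {
    explicit SomeObject(const std::string& /*args*/) {}
};
// ObjectFactory::Instance().Register("An Object Name",
//     [](const std::string& a) { return new SomeObject(a); });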
Question
This works fine when Args is in exactly the form expected but I don't know the most idiomatic way of handling duff inputs. What should happen if a user supplies an invalid argument string?
I can think of the following:
have the object's constructor throw an exception
require the object provides a bool Validate(std::string x) method, which checks whether x is a valid argument string
Have the factory use the default constructor, and then call an initialisation method afterwards (begs the question: what if the init method fails?)
Set a bool member variable which, if true, means "this object is not in a sane state"
Some other option I've not thought of
Throw an exception. You're constructing an object (albeit in a different way than just new); construction failure is an exceptional condition that needs to be handled if it can happen.
"Solution 2" has nothing to do with handling this issue, it's more of a how to determine bad input. For such it can be an acceptable solution, but again it has nothing to do with the question at hand.
Solution 3 leaves the object in an indeterminate state in case of failure, which is unacceptable.
Solution 4 is another way to do it, and the only way if you don't have exception support in your language; but we do. Exceptions in this case are strictly better in my opinion, because failing to construct an object is an action so destructive that it should require either alternative code paths or the death of the program.
The first solution is the standard solution for constructor failure.
Solutions #2 and #4 are pretty error-prone (the user might forget to check).
Solution #3 doesn't solve the problem
If you're making a library, or just writing code that might be re-used in a "no exceptions allowed" environment, you could consider returning std::unique_ptr{nullptr}. Even if it is also error-prone, that's the other standard way of signaling construction failure.
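A sketch of that no-exceptions path (TryConstruct and the validation are hypothetical):

#include <memory>
#include <string>

struct MyBaseClass { virtual ~MyBaseClass() {} };

// Returns an empty unique_ptr instead of throwing on bad input.
std::unique_ptr<MyBaseClass> TryConstruct(const std::string& args) {
    if (args.empty())                           // stand-in for real validation
        return std::unique_ptr<MyBaseClass>();  // signal failure without throwing
    return std::unique_ptr<MyBaseClass>(new MyBaseClass);
}

int main() {
    auto p = TryConstruct("");
    if (!p) {
        // handle construction failure with a plain check, no try/catch
    }
}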
Maybe the easiest way is to return a nullptr.
Whatever you do, do not throw an exception from a constructor (as per Item #10 in More Effective C++): the destructor will not run for a partially constructed object, which leads to newbie memory leaks, and it forces users to worry about exception handling.
I would just assert and return a Null Object, with dummy factory method implementations, or return nullptr.
As you might know, C++11 has the noexcept keyword. Now, the ugly part about it is this:
Note that a noexcept specification on a function is not a compile-time
check; it is merely a method for a programmer to inform the compiler
whether or not a function should throw exceptions.
http://en.cppreference.com/w/cpp/language/noexcept_spec
So is this a design failure on the committee's part, or did they just leave it as an exercise for the compiler writers :) in the sense that decent compilers will enforce it while bad ones can still be compliant?
BTW, if you ask why there isn't a third option (aka "can't be done"): the reason is that I can easily think of a (slow) way to check whether a function can throw or not. The problem, of course, is if you limit the input to 5 and 7 (aka "I promise the file won't contain anything besides 5 and 7") and it only throws when you give it 33; but that is not a realistic problem IMHO.
The committee pretty clearly considered the possibility that code that (attempted to) throw an exception not allowed by an exception specification would be considered ill-formed, and rejected that idea. According to §15.4/11:
An implementation shall not reject an expression merely because when executed it throws or might throw an exception that the containing function does not allow. [ Example:
extern void f() throw(X, Y);

void g() throw(X) {
    f();    // OK
}
the call to f is well-formed even though when called, f might throw exception Y that g does not allow. —end example ]
Regardless of what prompted the decision, or what else it may have been, it seems pretty clear that this was not a result of accident or oversight.
As for why this decision was made, at least some goes back to interaction with other new features of C++11, such as move semantics.
Move semantics can make exception safety (especially the strong guarantee) much harder to enforce/provide. When you do copying, if something goes wrong, it's pretty easy to "roll back" the transaction -- destroy any copies you've made, release the memory, and the original remains intact. Only if/when the copy succeeds, you destroy the original.
With move semantics, this is harder -- if you get an exception in the middle of moving things, anything you've already moved needs to be moved back to where it was to restore the original to order -- but if the move constructor or move assignment operator can throw, you could get another exception in the process of trying to move things back to try to restore the original object.
Combine this with the fact that C++11 can/does generate move constructors and move assignment operators automatically for some types (though there is a long list of restrictions). These don't necessarily guarantee against throwing an exception. If you're explicitly writing a move constructor, you almost always want to ensure that it can't throw, and that's usually even pretty easy to do (since you're normally "stealing" content, you're typically just copying a few pointers, which is easy to do without exceptions). It can get a lot harder in a hurry for templates though, even for simple ones like std::pair. A pair of something that can be moved with something that needs to be copied becomes difficult to handle well.
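This tension is visible in the standard library itself: std::move_if_noexcept is the mechanism that lets containers such as std::vector keep the strong guarantee during reallocation by falling back to copying when a move might throw. A small demonstration:

#include <iostream>
#include <utility>

struct SafeMove {
    SafeMove() {}
    SafeMove(const SafeMove&) { std::cout << "copy\n"; }
    SafeMove(SafeMove&&) noexcept { std::cout << "move\n"; }
};

struct RiskyMove {
    RiskyMove() {}
    RiskyMove(const RiskyMove&) { std::cout << "copy\n"; }
    RiskyMove(RiskyMove&&) { std::cout << "move\n"; }   // not noexcept
};

int main() {
    SafeMove s;
    RiskyMove r;
    SafeMove s2(std::move_if_noexcept(s));    // prints "move"
    RiskyMove r2(std::move_if_noexcept(r));   // prints "copy": the move might throw
}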
That meant, if they'd decided to make nothrow (and/or throw()) enforced at compile time, some unknown (but probably pretty large) amount of code would have been completely broken -- code that had been working fine for years suddenly wouldn't even compile with the new compiler.
Along with this was the fact that dynamic exception specifications remain in the language (deprecated in C++11, but still present), so they were going to end up enforcing at least some exception specifications at run time anyway.
So, their choices were:
Break a lot of existing code
Restrict move semantics so they'd apply to far less code
Continue (as in C++03) to enforce exception specifications at run time.
I doubt anybody liked any of these choices, but the third apparently seemed the least bad.
One reason is simply that compile-time enforcement of exception specifications (of any flavor) is a pain in the ass. It means that if you add debugging code you may have to rewrite an entire hierarchy of exception specifications, even if the code you added won't throw exceptions. And when you're finished debugging you have to rewrite them again. If you like this kind of busywork you should be programming in Java.
The problem with compile-time checking: it's not really possible in any useful way.
See the next example:
void foo(std::vector<int>& v) noexcept
{
    if (!v.empty())
        ++v.at(0);    // at(0) throws only when the index is out of range,
                      // which the emptiness check has already ruled out
}
Can this code throw?
Clearly not. Can we check automatically? Not really.
Java's way of doing things like this is to put the body in a try-catch block, but I don't think that is better than what we have now...
As I understand things (admittedly somewhat fuzzily), the entire idea of throw specifications was found to be a nightmare when it actually came time to try to use it in a useful way.
Calling functions that don't specify what they throw or don't throw must be assumed to potentially throw anything at all! So if the compiler were to require that you neither throw nor call anything that might throw outside of the specification you provide, and were actually to enforce such a thing, your code could call almost nothing whatsoever; no library in existence would be of any use to you or anyone else trying to make any use of throw specifications.
And since it is impossible for a compiler to tell the difference between "this function may throw an X" and "this function may throw an X, but the caller is calling it in such a way that it will never throw anything at all", one would forever be hamstrung by this language "feature".
So... I believe the only genuinely useful thing to come of it was the idea of nothrow, which indicates that a function is safe to call from destructors, moves, swaps and so on. It is a notation that, like const, is more about giving your users an API contract than about making the compiler responsible for telling whether you violate that contract (as with most things in C/C++, the intelligence is assumed to be on the part of the programmer, not a nanny compiler).
Should scoped objects (with complementary logic implemented in the constructor and destructor) only be used for resource cleanup (RAII)?
Or can I use it to implement certain aspects of the application's logic?
A while ago I asked about Function hooking in C++. It turns out that Bjarne addressed this problem and the solution he proposes is to create a proxy object that implements operator-> and allocates a scoped object there. The "before" and "after" are implemented in the scoped object's constructor and destructor respectively.
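For readers who haven't seen it, here is a condensed sketch of that technique (simplified from Bjarne's "Wrapping C++ Member Function Calls"; the names are mine):

#include <iostream>

template <typename T>
class CallProxy {
public:
    explicit CallProxy(T& t) : t_(t) { std::cout << "before\n"; }   // "before" logic
    ~CallProxy() { std::cout << "after\n"; }                        // "after" logic
    T* operator->() { return &t_; }
private:
    T& t_;
};

template <typename T>
class Wrap {
public:
    explicit Wrap(T& t) : t_(t) {}
    CallProxy<T> operator->() { return CallProxy<T>(t_); }  // proxy lives for one call
private:
    T& t_;
};

struct Widget { void work() { std::cout << "work\n"; } };

int main() {
    Widget w;
    Wrap<Widget> ww(w);
    ww->work();   // prints: before, work, after
}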
The problem is that destructors should not throw. So you have to wrap the destructor in a try { /* ... */ } catch(...) { /*empty*/ } block. This severely limits the ability to handle errors in the "after" code.
Should scoped objects only be used to cleanup resources or can I use it for more than that? Where do I draw the line?
If you pedantically consider the definition of RAII, anything you do using scoping rules and destructor invocation that doesn't involve resource deallocation simply isn't RAII.
But, who cares? Maybe what you're really trying to ask is,
I want X to happen every time I leave function Y. Is it
abusive to use the same scoping rules and destructor invocation that
RAII uses in C++ if X isn't resource deallocation?
I say, no. Scratch that, I say heck no. In fact, from a code clarity point of view, it might be better to use destructor calls to execute a block of code if you have multiple return points or possibly exceptions. I would document the fact that your object is doing something non-obvious on destruction, but this can be a simple comment at the point of instantiation.
Where do you draw the line? I think the KISS principle can guide you here. You could probably write your entire program in the body of a destructor, but that would be abusive. Your Spidey Sense will tell you that is a Bad Idea, anyway. Keep your code as simple as possible, but not simpler. If the most natural way to express certain functionality is in the body of a destructor, then express it in the body of a destructor.
You want a scenario where, guaranteed, the suffix is always done. That sounds exactly like the job of RAII to me. I however would not necessarily actually write it that way. I'd rather use a method chain of templated member functions.
I think with C++11 you can semi-safely allow the suffix() call to throw. The strict rule isn't "never throw from a destructor", although that's good advice, instead the rule is:
never throw an exception from a destructor while processing another
exception
In the destructor you can now check std::uncaught_exception() (note that std::current_exception only reports an exception while it is being handled inside a catch block, so it won't detect stack unwinding here), which covers the "while processing another exception" element of the destructor+exception rule. With this you could do:
~Call_proxy() noexcept(false) {    // C++11 destructors are noexcept by default
    if (std::uncaught_exception()) {
        // Another exception is in flight: swallow anything suffix() throws,
        // or std::terminate would be called.
        try {
            suffix();
        }
        catch (...) {
            // Not good, but not fatal perhaps?
            // Just don't rethrow and you're ok
        }
    }
    else {
        suffix();    // no active exception, so letting this throw is survivable
    }
}
I'm not sure that's actually a good idea in practise though; you still have a hard problem to deal with if suffix() throws while another exception is in flight.
As for the "is it abuse" I don't think it's abusive anymore than metaprogramming or writing a?b:c instead of a full blown if statement if it's the right tool for the job! It's not subverting any language rules, simply exploiting them, within the letter of the law. The real issue is the predictability of the behaviour to readers unfamiliar with the code and the long term maintainability, but that's an issue for all designs.