What to inherit from when I want to create a new fstream? - c++

I want to define a new type of filestream in C++. What should I inherit from?

Inherit from basic_filebuf, or basic_streambuf if you're writing the I/O parts from scratch. You might also want a class derived from basic_[i/o]fstream, but that is strictly optional, for convenience. If templating is not required, drop the basic_ and inherit from the classes, not the templates.
The *stream classes all dispatch I/O through a polymorphic pointer which you can get and set using the rdbuf() method. So unless/until you implement the convenience class, you can test by instantiating std::iostream and calling rdbuf with your pointer.
It is very useful to have a copy of the Standard handy, to work through the requirements for the derived class. Your main functionality will be in the virtual functions overflow and underflow.
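As a rough sketch (UpperBuf is an illustrative name, only overflow is shown, and the buffer is handed to a plain std::ostream at construction rather than via rdbuf(), which amounts to the same test; a real buffer would also implement underflow, sync, and proper buffering):

    #include <cctype>
    #include <iostream>
    #include <streambuf>

    // Minimal sketch: a streambuf that forwards output to another buffer,
    // upper-casing characters on the way. Only overflow() is overridden here.
    class UpperBuf : public std::streambuf {
    public:
        explicit UpperBuf(std::streambuf* dest) : dest_(dest) {}
    protected:
        int_type overflow(int_type ch) override {
            if (traits_type::eq_int_type(ch, traits_type::eof()))
                return traits_type::not_eof(ch);
            char c = traits_type::to_char_type(ch);
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            return dest_->sputc(c);    // forward to the buffer that does the real I/O
        }
    private:
        std::streambuf* dest_;
    };

    int main() {
        UpperBuf buf(std::cout.rdbuf());
        std::ostream out(&buf);   // no convenience class yet: plug the buffer
        out << "hello\n";         // into a plain ostream and test it (prints HELLO)
    }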

Related

If I need to do work before or after calling a virtual function, is the NVI idiom always better than other interfaces?

From what I understand, the NVI idiom gives the base class firm control and places constraints on derived classes.
Behaviors such as the pre-action, the post-action, and when the virtual function is called are implemented in the base class, so all derived classes are bound by them and cannot change them.
In other words, it is an interface that is difficult for users (the authors of derived classes) to misuse.
So this idiom looks useful for any base class that can be derived from and needs to do work before or after calling a virtual function.
I've looked for downsides to applying this idiom, but it only seems more useful.
Some docs say that misuse of this idiom will make your code bigger, but I don't understand why. Wouldn't it rather help reduce duplicate code?
If a derived class needs to call the base class's virtual function, the base class can make that virtual function protected.
Perhaps this idiom prevents the derived class from implementing special behavior (because it imposes constraints on the derived class). But I'm not sure there is any special behavior that applying this idiom makes impossible; maybe a design that requires such special behavior is wrong in the first place.
So my question is: is this idiom really useful for any base class? Should I always apply it? What are the disadvantages of applying it?
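For reference, a minimal sketch of the idiom being discussed (class and member names are illustrative):

    #include <iostream>

    // NVI: the public, non-virtual run() fixes the pre/post work;
    // derived classes can only customize the private virtual step.
    class Task {
    public:
        void run() {
            std::cout << "pre-action\n";
            do_run();                       // the customizable step
            std::cout << "post-action\n";
        }
        virtual ~Task() = default;
    private:
        virtual void do_run() = 0;          // derived classes override only this
    };

    class PrintTask : public Task {
        void do_run() override { std::cout << "doing the work\n"; }
    };

    int main() {
        PrintTask t;
        t.run();   // prints: pre-action, doing the work, post-action
    }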

Example for non-virtual multiple inheritance

Is there a real-world example where non-virtual multiple inheritance is being used? I'd like to have one mostly for didactic reasons. Slapping around classes named A, B, C, and D, where B and C inherit from A and D inherits from B and C, is perfectly fine for explaining the question "Does/Should a D object have one or two A sub-objects?", but says nothing about why we even have both options. Many examples explain why we do want virtual inheritance, but why would we not want virtual inheritance?
I know what virtual base classes are and how to express that stuff in code. I know about diamond inheritance and examples of multiple inheritance with a virtual base class are abundant.
The best I could find is vehicles. The base class is Vehicle, which is inherited by Car and Boat. Among other things, a Vehicle has occupants() and a max_speed(). So an Amphibian that inherits from both Car and Boat inherits a different max_speed() on land and water, which makes sense, but also different occupants(), which does not make sense. So the Vehicle sub-objects aren't really independent; that is another problem which might be interesting to solve, but it is not this question.
Is there an example, that makes sense as a real-world model, where the two sub-objects are really independent?
You're thinking like an OOP programmer, trying to design abstract models of things. C++ multiple inheritance, like many things in C++, is a tool that has a particular effect. Whether it maps onto some OOP model is irrelevant next to the utility of the tool itself. To put it another way, you don't need a "real-world model" to justify non-virtual inheritance; you just need a real-world use case.
Because a derived class inherits the members of a base class, inheritance often is used in C++ as a means of collecting a set of common functionality together, sometimes with minimal interaction from the derived class, and injecting this functionality directly into the derived class.
The Curiously Recurring Template Pattern and other mixin-like constructs are mechanisms for doing this. The idea is that you have a base class that is a template, and its template parameter is the derived class that uses it. This allows the base class to have some access to the derived class itself without virtual functions.
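A minimal sketch of the mechanism (Printable and Point are illustrative names, not standard classes):

    #include <iostream>
    #include <string>

    // CRTP: the base knows the derived type at compile time, so it can call
    // into it without any virtual functions.
    template <class Derived>
    struct Printable {
        void print() const {
            // the downcast is safe because Derived inherits from Printable<Derived>
            std::cout << static_cast<const Derived&>(*this).to_string() << '\n';
        }
    };

    struct Point : Printable<Point> {
        int x = 1, y = 2;
        std::string to_string() const {
            return "(" + std::to_string(x) + "," + std::to_string(y) + ")";
        }
    };

    int main() {
        Point p;
        p.print();   // uses Point::to_string through the CRTP base
    }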
The simplest example I can think of in C++ is enable_shared_from_this, which allows an object whose lifetime is currently managed by a shared_ptr to retrieve a shared_ptr to itself just from a pointer/reference to the object. It uses CRTP to add to the derived class the various members and interfaces needed to make shared_from_this possible. And since the inheritance is public, it also allows shared_ptr's various functions that "enable shared_from_this" to detect that a particular type has the shared_from_this machinery in it and to initialize it properly.
enable_shared_from_this doesn't need virtual inheritance, and indeed would probably not work very well with it.
Now imagine that I have some other CRTP class that injects some other functionality into an object. This functionality has nothing to do with shared_ptr, but it uses CRTP and inheritance.
If I now write some type that wants to inherit from both enable_shared_from_this and this other functionality, that works just fine. There is no need for virtual inheritance, and in fact using it would only make such composition that much harder.
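A sketch of that combination (Tagged and Session are illustrative names; only enable_shared_from_this is a standard class):

    #include <iostream>
    #include <memory>
    #include <string>

    // An illustrative CRTP mixin, independent of shared_ptr.
    template <class Derived>
    struct Tagged {
        std::string tag() const { return static_cast<const Derived&>(*this).name; }
    };

    // Two CRTP bases combined through ordinary, non-virtual multiple inheritance.
    struct Session : std::enable_shared_from_this<Session>, Tagged<Session> {
        std::string name = "session-42";
    };

    int main() {
        auto s = std::make_shared<Session>();
        std::cout << s->tag() << '\n';                      // from Tagged<Session>
        std::shared_ptr<Session> again = s->shared_from_this(); // from enable_shared_from_this
    }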
Virtual inheritance is not free. It fundamentally changes a bunch of things about how a type relates to its base classes. If you inherit from such a type, your constructors have to initialize any virtual base classes directly. The layout of such a type is very odd and is highly unlikely to be standardized. And various other things. C++ tries not to make programmers pay for functionality they don't use, so if you don't need the special properties of virtual inheritance, you shouldn't be using it.
It's the same reason C++ has non-virtual methods: the implementation is simpler and more efficient if you use non-virtual inheritance, so you need to explicitly ask for virtual inheritance if you want it. Since you don't need it if your classes never use multiple inheritance, non-virtual is the default.
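To illustrate the constructor requirement mentioned above (class names are illustrative):

    #include <string>

    struct Base {
        explicit Base(std::string id) : id(std::move(id)) {}
        std::string id;
    };

    struct Left  : virtual Base { Left()  : Base("left")  {} };
    struct Right : virtual Base { Right() : Base("right") {} };

    // With virtual inheritance, the most-derived class must initialize Base
    // itself; the Base("left") / Base("right") calls above are ignored here.
    struct Both : Left, Right {
        Both() : Base("both") {}
    };

    int main() {
        Both b;          // b.id == "both"; there is exactly one Base sub-object
    }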

If a class might be inherited, should every function be virtual?

In C++, a coder doesn't know whether other coders will inherit his class. Should he make every function in that class virtual? Are there any drawbacks? Or is it just not acceptable at all?
In C++, you should only make a class inheritable if you intend for it to be used polymorphically. The way you treat polymorphic objects in C++ is very different from how you treat other objects. You don't tend to put polymorphic classes on the stack, or pass them to or return them from functions by value, since this can lead to slicing. Polymorphic objects tend to be heap-allocated, passed around and returned by pointer or by reference, etc.
If you design a class to not be inherited from and then inherit from it, you cause all sorts of problems. If the destructor isn't marked virtual, you can't delete the object through a base class pointer without causing undefined behavior. Without the member functions marked virtual, they can't be overridden in a derived class.
As a general rule in C++, when you design a class, determine whether you want it to be inherited from. If you do, mark the appropriate functions virtual and give it a virtual destructor. You might also disable the copy assignment operator to avoid slicing. Similarly, if you want the class not to be inheritable, don't give it any of these functions. In most cases it's a logic error to inherit from a class that wasn't designed to be inherited from, and in most cases where you'd want to, you can use composition instead of inheritance to achieve the same effect.
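A minimal sketch of both choices (class names are illustrative):

    #include <memory>

    // Designed as a polymorphic base: virtual destructor, copying disabled.
    class Shape {
    public:
        virtual ~Shape() = default;       // deleting through Shape* is well-defined
        virtual double area() const = 0;
        Shape(const Shape&) = delete;     // prevent accidental slicing via copies
        Shape& operator=(const Shape&) = delete;
    protected:
        Shape() = default;
    };

    // `final` makes the "not designed to be inherited from" case explicit.
    class Circle final : public Shape {
    public:
        explicit Circle(double r) : r_(r) {}
        double area() const override { return 3.14159265 * r_ * r_; }
    private:
        double r_;
    };

    int main() {
        std::unique_ptr<Shape> s = std::make_unique<Circle>(2.0);
        double a = s->area();             // dispatched virtually; destruction is safe
        (void)a;
    }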
No, not usually.
A non-virtual function enforces class-invariant behavior. A virtual function doesn't. As such, the person writing the base class should think about whether the behavior of a particular function is/should be class invariant or not.
While it's possible for a design to allow all behaviors to vary in derived classes, it's fairly unusual. It's usually a pretty good clue that the person who wrote the class either didn't think much about its design or lacked the resolve to make a decision.
In C++ you design your class to be used either as a value type or a polymorphic type. See, for example, C++ FAQ.
If you are making a class to be used by other people, you should put a lot of thought into your interface and try to work out how your class will be used. Then make the decisions like which functions should be virtual.
Or better yet, write a test case for your class, using it how you expect it to be used, and then make the interface work for that. You might be surprised what you find out doing it. Things you thought were absolutely necessary might turn out to be rarely needed, and things that you thought were not going to be used might turn out to be the most useful methods. Doing it this way around will save you time by avoiding unnecessary work in the long run, and you will end up with a more solid design.
Jerry Coffin and Dominic McDonnell have already covered the most important points.
I'll just add an observation: in the time of MFC (the mid-1990s) I was very annoyed with the lack of ways to hook into things. For example, the documentation suggested copying MFC's source code for printing and modifying it, instead of overriding behavior, because nothing was virtual there.
There are of course a zillion+1 ways to provide "hooks", but virtual methods are one easy way. They're needed in badly designed classes, so that the client code can fix things, but in those badly designed classes the methods are not virtual. For classes with better design there is not so much need to override behavior, so for those classes making methods virtual by default (and non-virtual only as an active choice) can be counter-productive; as Jerry remarked, virtuals provide opportunities for derived classes to screw up.
There are design patterns that can be employed to minimize the possibilities of screw-ups.
For example, wrapping internal virtuals in exposed non-virtual methods with sanity checks, and, for example, using decoupled event handling (where appropriate) instead of virtuals.
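A small sketch of that first pattern, an exposed non-virtual method with sanity checks around an internal virtual (class and member names are illustrative):

    #include <cassert>
    #include <cstddef>
    #include <vector>

    class Buffer {
    public:
        void resize(std::size_t n) {
            assert(n <= max_size());    // pre-condition checked once, in the base
            do_resize(n);
            assert(size() == n);        // post-condition: the derived class honored it
        }
        virtual std::size_t size() const = 0;
        virtual std::size_t max_size() const = 0;
        virtual ~Buffer() = default;
    private:
        virtual void do_resize(std::size_t n) = 0;  // the only override point
    };

    class VectorBuffer : public Buffer {
    public:
        std::size_t size() const override { return data_.size(); }
        std::size_t max_size() const override { return 1024; }
    private:
        void do_resize(std::size_t n) override { data_.resize(n); }
        std::vector<unsigned char> data_;
    };

    int main() {
        VectorBuffer b;
        b.resize(100);   // the checks run in the base class, not in every override
    }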
Cheers & hth.,
When you create a class, and you want that class to be used polymorphically you have to consider that the class has two different interfaces. The user interface is defined by the set of public functions that are available in your base class, and that should pretty much cover all operations that users want to perform on objects of your class. This interface is defined by the access qualifiers, and in particular the public qualifier.
There is a second interface, that defines how your class is to be extended. At that level you have to think on what behavior you want to be overridden by extending classes, and what elements of your object you want to provide to extending classes. You offer access to derived classes by means of the protected qualifier, and you offer extension points by means of virtual functions.
You should try to follow the Non-Virtual Interface idiom whenever possible. That idiom (google for it) basically tries to fully separate the two interfaces by not having public virtual functions. Users call non-virtual functions, and those in turn call on configurable functionalities by means of protected/private virtual functions. This clearly separates extension points from the class interface.
There is a single case, where virtual has to be part of the user interface: the destructor. If you want to offer your users the ability to destroy derived objects through pointers to the base, then you have to provide a virtual destructor. Else you just provide a protected non-virtual one.
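A sketch of the two destructor options (class names are illustrative):

    // Option 1: users may delete derived objects through a pointer to the base.
    class PolymorphicBase {
    public:
        virtual ~PolymorphicBase() = default;
    };

    // Option 2: deleting through the base is not part of the user interface,
    // so the destructor is non-virtual but protected.
    class MixinBase {
    protected:
        ~MixinBase() = default;
    };

    class User : public MixinBase {};

    int main() {
        User u;                // fine: destroyed as a User
        // MixinBase* p = new User;
        // delete p;           // would not compile: the destructor is protected
    }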
He should code the functions as they are; he shouldn't make them virtual at all in the circumstances you specified.
The reasons being:
1. The class coder obviously has a particular use in mind for the functions he is writing.
2. The inheriting class may or may not make use of these functions, as required.
3. Any function may be redefined in the derived class without errors.

protected ifstream member

I am close to completing my first OOP project, coming from a C background. I was wondering about a design issue related to an ifstream object that I use in the base class to open a file. After that, I would like to use the same stream for further operations in the derived classes. I defined this member as protected so I could reach it in the derived classes, but protected breaks encapsulation (I would like to develop good habits). Should I define a getter function that returns a reference to the stream object? Since ifstream objects are not copyable, that might be a problem, the first one I see...
Best,
Umut
protected is ideal for preserving encapsulation, if it's integral to your design that the derived classes have the same I/O functionality as the base class.
Encapsulation does not mean everything has to be private, it means that each data or code member of a given class is visible only to the minimal set of class users to achieve the class's designed purpose. In other words, don't make everything public just because that makes it easier to code.
You would only need a public getter if you wanted to expose the I/O function of the base and derived classes to code outside of the hierarchy. Returning a reference does not imply any copy, by the way.
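A minimal sketch of the protected-member option, with the getter alternative noted in a comment (class names and the file name are illustrative):

    #include <fstream>
    #include <string>

    class Parser {
    public:
        explicit Parser(const char* path) : in_(path) {}
    protected:
        std::ifstream in_;   // derived classes use the stream directly
        // Alternative: keep in_ private and expose it instead as
        //   std::ifstream& input() { return in_; }   // returns a reference, no copy
    };

    class CsvParser : public Parser {
    public:
        using Parser::Parser;                          // inherit the constructor
        bool read_line(std::string& line) {
            return static_cast<bool>(std::getline(in_, line));
        }
    };

    int main() {
        CsvParser p("data.csv");        // hypothetical file name
        std::string line;
        while (p.read_line(line)) { /* process line */ }
    }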

Deriving from streambuf without rewriting a corresponding stream

Some days ago, I decided that it would be fun to write a streambuf subclass that would use mmap and read-ahead.
I looked at how my STL (SGI) implemented filebuf and realized that basic_filebuf contains a FILE*. So inheriting from basic_filebuf is out of the question.
So I inherited from basic_streambuf. Then I wanted to bind my mmapbuf to an fstream.
I thought the only thing that I would have to do would be to copy the implicit interface of filebuf... but that was a clear mistake. In the SGI implementation, basic_fstream owns a basic_filebuf. No matter whether I call the inherited std::ios::rdbuf(streambuf*) on the basic_fstream, the file stream completely ignores it and uses its own filebuf.
So now I'm a bit confused... sure, I could create my own mmfstream that would be an exact copy/paste of fstream, but that doesn't sound very DRY.
What I can't understand is: why is fstream so tightly coupled with filebuf that it is not possible to use anything other than a filebuf? The whole point of separating streams and stream buffers is that one can use a stream with a different buffer.
Solutions:
=> The file stream should rely on the implicit interface of filebuf. That is, fstream should be templated on a streambuf class. That would allow everyone to provide their own streambuf subclass to an fstream as long as it implements filebuf's implicit interface. Problem: we cannot add a template parameter to fstream, since it would break template selectors that use fstream as a template template parameter.
=> filebuf should be a pure virtual class without any additional attributes. So that one can inherit from it without carrying all its FILE* garbage.
Your ideas on the subject?
In the IO streams' design, most of the actual streams' functionality (as opposed to the stream buffers' functionality) is implemented in std::basic_istream, std::basic_ostream, and their base classes. The string and file stream classes are more or less just convenience wrappers which make sure a stream with the right type of buffer is instantiated.
If you want to extend the streams, you almost always want to provide your own stream buffer class, and you almost never need to provide your own stream class.
Once you have your own stream buffer type, you can make it the buffer for any stream object you happen to have around. Or you derive your own classes from std::basic_istream, std::basic_ostream, and std::basic_iostream, which instantiate your stream buffer and pass it to their base classes.
The latter is more convenient for users, but requires you to write some boiler-plate code for the buffer's instantiation (namely constructors for the stream class).
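A sketch of that boiler-plate (mmap_buf stands in for your own stream buffer class and its internals are omitted; the file name in main is hypothetical):

    #include <istream>
    #include <streambuf>
    #include <string>

    class mmap_buf : public std::streambuf {
    public:
        explicit mmap_buf(const char* /*path*/) { /* map the file, set up the get area */ }
    };

    class mmistream : public std::istream {
    public:
        explicit mmistream(const char* path)
            : std::istream(nullptr),   // initialize the base first, without a buffer
              buf_(path)
        {
            rdbuf(&buf_);              // attach our buffer; this also clears the state
        }
    private:
        mmap_buf buf_;                 // owned by the stream, like filebuf in ifstream
    };

    int main() {
        mmistream in("data.txt");
        std::string word;
        while (in >> word) { /* ... */ }
    }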
To answer your question: File streams and file buffer are coupled so tightly because the former only exists to ease the creation of the latter. Using a file stream makes it easy to set it all up.
Using your own stream class to wrap construction of your own stream buffer shouldn't be a problem, since you shouldn't be passing around file streams anyway, but only references to the base classes.
Check out mapped_file in the Boost.Iostreams library. I've never used it myself, but it seems like it might already do what you need.
EDIT: Oops, reread your questions and I see you're doing this for fun. Perhaps you can draw inspiration from Boost.Iostreams?
fstream in itself is not a big class. It inherits from basic_iostream to provide support for all the << and >> operations, contains a specialized streambuf that has to be initialized, and provides the corresponding constructors to pass their parameters on to the streambuf constructor.
In a sense, what you wrote about your templated solution is OK. But basic_iostream can also be derived into a tcp_stream, for example. In that case, the constructors of fstream are a bit useless. So you need to provide a new tcpstream class, inheriting from basic_iostream, with constructors taking the right parameters to be able to create the tcp stream buffer. In the end, you wouldn't use anything from fstream. Creating this new tcpstream is a matter of writing only 3 or 4 functions.
In the end, you would be deriving from the fstream class without any real reason to; it would only add unneeded coupling to the class hierarchy.
The whole point of std::fstream is that it is a file-based stream. If you want an ordinary stream backed by your mmstreambuf, then you should create an mmstreambuf and pass it to the std::istream (or std::ostream/std::iostream) constructor that takes a std::streambuf*.