C++ design: subclass, or are there better ways? - c++

I have a Cheese class. In my program, I deal a lot with collections of cheeses, mostly vector<Cheese> objects.
I want to be able to eat() a cheese collection, something like this:
vector<Cheese> cheeses;
//cheeses = ...
cheeses.eat();
How to do this? How do I add a new member function to the vector<Cheese> class? Should I just subclass the vector<Cheese> class, name the subclass CheeseCollection and add the member function there, or are there any better ways?
Coming from Objective-C, I'm used to categories, which allowed me to add functions ("methods") to classes. Is something like that available in C++, or is it considered more natural to subclass like crazy in C++?

In C++ you simply wouldn’t use a member function for this – use a free function:
void eat(std::vector<Cheese> const& cheeses) {
// …
}
This is a close equivalent to those Obj-C categories even though the syntax differs (and you’re not using member access).
The standard library container classes weren’t designed to be subclassed (for one thing, they have no virtual destructor), so that approach is fragile. What you could do is use composition instead of inheritance – i.e. have a CheeseCollection class which contains a vector of cheeses as a member. This may have some advantages, depending on your overall design. However, in general the above is the most C++ic solution.
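A minimal sketch of the free-function approach, assuming a hypothetical Cheese with an `eaten` flag (the question does not show the real class). Unlike the signature above, this version takes a non-const reference, on the assumption that eating modifies the cheeses:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's Cheese class.
struct Cheese {
    bool eaten = false;
};

// Eat a single cheese.
void eat(Cheese& cheese) { cheese.eaten = true; }

// Overload for a whole collection: a plain free function, no subclassing.
// Returns how many cheeses this call actually ate.
std::size_t eat(std::vector<Cheese>& cheeses) {
    std::size_t count = 0;
    for (Cheese& c : cheeses) {
        if (!c.eaten) {
            eat(c);
            ++count;
        }
    }
    return count;
}
```

The call site then reads eat(cheeses); rather than cheeses.eat();, which is the only visible difference from a member function.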

Neither -- what you want is an algorithm. Assuming you have an eat that already knows how to eat one Cheese object, applying it to an entire collection would be something like:
std::for_each(cheeses.begin(), cheeses.end(), eat);
Unlike some other languages, C++ does not maintain a slavish adherence to object orientation, even when it makes no real sense.
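As a rough illustration of the algorithm approach, here is a sketch in which Cheese, its `bites_left` member, and the `eat_all` wrapper are all made up for the example. Note that `eat` must not be overloaded here, otherwise passing it by name to std::for_each becomes ambiguous:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical Cheese; the real class is not shown in the question.
struct Cheese {
    int bites_left;
};

// Knows how to eat exactly one cheese.
void eat(Cheese& cheese) { cheese.bites_left = 0; }

// Applies eat to every element of the collection.
void eat_all(std::vector<Cheese>& cheeses) {
    std::for_each(cheeses.begin(), cheeses.end(), eat);
}
```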

One thing you can do is define your own class which uses (encapsulates) the vector:
class Cheeses
{
    vector<Cheese> v;
public:
    void eat()
    {
        v.clear(); // erase() needs iterator arguments; clear() empties the vector
    }
    // plus other methods which delegate to the contained vector
};
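A slightly fuller sketch of the composition idea, with a few delegating methods filled in. The `grams` member, the method names, and the choice to have eat() return the total weight are assumptions made for illustration only:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Cheese with a weight, just so eat() has something to report.
struct Cheese {
    int grams;
};

class CheeseCollection {
public:
    // Delegating methods: expose only what callers actually need.
    void add(const Cheese& cheese) { cheeses_.push_back(cheese); }
    std::size_t size() const { return cheeses_.size(); }
    bool empty() const { return cheeses_.empty(); }

    // Eats every cheese and returns the total weight consumed.
    int eat() {
        int total = 0;
        for (const Cheese& c : cheeses_) total += c.grams;
        cheeses_.clear();
        return total;
    }

private:
    std::vector<Cheese> cheeses_;  // contained, not inherited from
};
```

The price of this design is that every vector operation you want has to be forwarded by hand, which is why the free function is usually less work.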

Related

Is there any way to avoid declaring virtual methods when storing (children) pointers?

I have run into an annoying problem lately, and I am not satisfied with my own workaround: I have a program that maintains a vector of pointers to a base class, and I store all kinds of child-object pointers there. Now, each child class has methods of its own, and the main program may or may not call these methods, depending on the type of object (note though that they all heavily use common methods of the base class, so this justifies inheritance).
I have found it useful to have an "object identifier" to check the class type (and then either call the method or not), which is already not very beautiful, but this is not the main inconvenience. The main inconvenience is that, if I want to actually be able to call a derived class method using the base class pointer (or even just store the pointer in the pointer array), then I need to declare the derived methods as virtual in the base class.
This makes sense from the C++ coding point of view, but it is not practical in my case (from the development point of view), because I am planning to create many different children classes in different files, perhaps made by different people, and I don't want to tweak/maintain the base class each time, to add virtual methods!
How to do this? Essentially, what I am asking (I guess) is how to implement something like Objective-C NSArrays - if you send a message to an object that does not implement the method, well, nothing happens.
regards
Instead of this:
// variant A: declare everything in the base class
void DoStuff_A(Base* b) {
    if (b->TypeId() == DERIVED_1)
        b->DoDerived1Stuff();
    else if (b->TypeId() == DERIVED_2)
        b->DoDerived2Stuff();
}
or this:
// variant B: declare nothing in the base class
void DoStuff_B(Base* b) {
    if (b->TypeId() == DERIVED_1)
        (dynamic_cast<Derived1*>(b))->DoDerived1Stuff();
    else if (b->TypeId() == DERIVED_2)
        (dynamic_cast<Derived2*>(b))->DoDerived2Stuff();
}
do this:
// variant C: declare the right thing in the base class
b->DoStuff();
Note there's a single virtual function in the base per stuff that has to be done.
If you find yourself in a situation where you are more comfortable with variants A or B than with variant C, stop and rethink your design. You are coupling components too tightly and in the end it will backfire.
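Variant C might look roughly like this when spelled out. The class names follow the snippets above; the string return values and the DoAllStuff helper are assumptions added so the behaviour can be observed:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

class Base {
public:
    virtual ~Base() = default;
    // The single virtual entry point: "do whatever is right for your type".
    virtual std::string DoStuff() = 0;
};

class Derived1 : public Base {
public:
    std::string DoStuff() override { return DoDerived1Stuff(); }
private:
    // Type-specific work stays private to the derived class.
    std::string DoDerived1Stuff() { return "derived1"; }
};

class Derived2 : public Base {
public:
    std::string DoStuff() override { return DoDerived2Stuff(); }
private:
    std::string DoDerived2Stuff() { return "derived2"; }
};

// The caller never inspects the concrete type.
std::vector<std::string> DoAllStuff(
        const std::vector<std::unique_ptr<Base>>& objects) {
    std::vector<std::string> log;
    for (const auto& obj : objects) log.push_back(obj->DoStuff());
    return log;
}
```

Adding a Derived3 later means writing one new class; neither Base nor DoAllStuff changes.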
I am planning to create many different children classes in different
files, perhaps made by different people, and I don't want to
tweak/maintain the base class each time, to add virtual methods!
You are OK with tweaking DoStuff each time a derived class is added, but tweaking Base is a no-no. May I ask why?
If your design does not fit in either A, B or C pattern, show what you have, for clairvoyance is a rare feat these days.
You can do what you describe in C++, but not using functions. It is, by the way, kind of horrible but I suppose there might be cases in which it's a legitimate approach.
First way of doing this:
Define a function with a signature something like boost::variant parseMessage(std::string, std::vector<boost::variant>); and perhaps a string of convenience functions with common signatures on the base class and include a message lookup table on the base class which takes functors. In each class constructor add its messages to the message table and the parseMessage function then parcels off each message to the right function on the class.
It's ugly and slow but it should work.
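A stripped-down sketch of the message-table idea, using std::function and a single int argument in place of boost::variant. Every name here (send, registerMessage, Counter) is hypothetical:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

class Base {
public:
    virtual ~Base() = default;

    // Dispatches a message by name. Unknown messages are ignored and
    // reported as "not handled", much like messaging nil in Objective-C.
    bool send(const std::string& message, int arg) {
        auto it = handlers_.find(message);
        if (it == handlers_.end()) return false;
        it->second(arg);
        return true;
    }

protected:
    // Derived constructors call this to publish their messages.
    void registerMessage(const std::string& name,
                         std::function<void(int)> handler) {
        handlers_[name] = std::move(handler);
    }

private:
    std::map<std::string, std::function<void(int)>> handlers_;
};

class Counter : public Base {
public:
    Counter() {
        // Captures this, so copying a Counter would be unsafe; a real
        // design would delete or customise the copy operations.
        registerMessage("add", [this](int n) { total_ += n; });
    }
    int total() const { return total_; }

private:
    int total_ = 0;
};
```

The base class never learns which messages exist, at the cost of a map lookup and the loss of compile-time checking on every call.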
Second way of doing this:
Define the virtual functions further down the hierarchy so if you want to add int foo(bar*); you first add a class that defines it as virtual and then ensure every class that wants to define int foo(bar*); inherits from it. You can then use dynamic_cast to ensure that the pointer you are looking at inherits from this class before trying to call int foo(bar*);. Possibly these interface-adding classes could be pure virtual so they can be mixed in at various points using multiple inheritance, but that may have its own problems.
This is less flexible than the first way and requires the classes that implement a function to be linked to each other. Oh, and it's still ugly.
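The second way could be sketched as follows, with a hypothetical Fooable interface and an int argument standing in for bar*. The try_foo helper is also an invention for the example:

```cpp
#include <cassert>

class Base {
public:
    virtual ~Base() = default;  // makes Base polymorphic, so dynamic_cast works
};

// Hypothetical optional interface; only classes that want foo inherit it.
class Fooable {
public:
    virtual ~Fooable() = default;
    virtual int foo(int x) = 0;
};

class Plain : public Base {};

class WithFoo : public Base, public Fooable {
public:
    int foo(int x) override { return x * 2; }
};

// Calls foo only if the object implements Fooable; otherwise does nothing.
bool try_foo(Base* b, int arg, int& result) {
    if (Fooable* f = dynamic_cast<Fooable*>(b)) {
        result = f->foo(arg);
        return true;
    }
    return false;
}
```

Base stays untouched when a new capability is added; only the new interface class and the classes that implement it are involved.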
But mostly I suggest you try and write C++ code like C++ code not Objective-C code.
This can be solved by adding some sort of introspection capabilities and meta object system. This talk Metadata and reflection in C++ — Jeff Tucker demonstrates how to do this using c++'s template meta programming.
If you don't want to go to the trouble of implementing one yourself, then it would be easier to use an existing one such as Qt's meta object system. Note that this solution does not work with multiple inheritance due to limitations in the meta object compiler: QObject Multiple Inheritance.
With that installed, you can query for the presence of methods and call them. This is quite tedious to do by hand, so the easiest way to call such methods is using the signal and slot mechanism.
There is also GObject, which is quite similar, and there are others.
If you are planning to create many different children classes in different files, perhaps made by different people, I would guess you also don't want to change your main code for every child class. In that case, I think what you need to do in your base class is define several (not too many) virtual functions with empty implementations, BUT those functions should mark a point in the logic where they are called, like "AfterInsert" or "BeforeSorting", etc.
Usually there are not too many places in the logic where you want derived classes to perform their own logic.
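One possible shape for such hook functions is sketched below. The Collection class, the Tracked class, and its counter are assumptions made up to show the calls happening:

```cpp
#include <cassert>
#include <vector>

class Base {
public:
    virtual ~Base() = default;
    // Hooks with empty defaults: derived classes override only what they need.
    virtual void AfterInsert() {}
    virtual void BeforeSorting() {}
};

// Hypothetical derived class that only cares about insertion.
class Tracked : public Base {
public:
    void AfterInsert() override { ++inserts; }
    int inserts = 0;
};

class Collection {
public:
    void insert(Base* item) {
        items_.push_back(item);
        item->AfterInsert();  // fixed point in the logic
    }
    void sort() {
        for (Base* item : items_) item->BeforeSorting();
        // ... the actual sorting would go here ...
    }

private:
    std::vector<Base*> items_;  // non-owning pointers
};
```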

Sharing function between classes

I have three classes which each store their own array of double values. To populate the arrays I use a fairly complex function, let's say foo(), which takes in several parameters and calculates the appropriate values for the array.
Each of my three classes uses the same function with only minor adjustments (i.e. the input parameters vary slightly). Each of the classes is actually quite similar although they each perform separate logic when retrieving the values of the array.
So I am wondering how should I 'share' the function so that all classes can use it, without having to duplicate the code?
I was thinking of creating a base class which contained the function foo() and a virtual get() method. My three classes could then inherit this base class. Alternatively, I was also thinking perhaps a global function was the way to go? maybe putting the function into a namespace?
If the classes have nothing in common besides this foo() function, it is silly to put it in a base class; make it a free function instead. C++ is not Java.
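A small sketch of the free-function option. The body of foo() is a trivial placeholder for the real calculation, and the namespace and class names are invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

namespace numerics {
// Placeholder for the complex foo(): fills n values start, start + step, ...
inline std::vector<double> foo(std::size_t n, double start, double step) {
    std::vector<double> values(n);
    for (std::size_t i = 0; i < n; ++i)
        values[i] = start + step * static_cast<double>(i);
    return values;
}
}  // namespace numerics

// Two unrelated classes share foo() and differ only in the parameters
// they pass and in how they read the values back.
class Ramp {
public:
    Ramp() : values_(numerics::foo(4, 0.0, 1.0)) {}
    double last() const { return values_.back(); }
private:
    std::vector<double> values_;
};

class HalfSteps {
public:
    HalfSteps() : values_(numerics::foo(3, 1.0, 0.5)) {}
    double sum() const {
        return std::accumulate(values_.begin(), values_.end(), 0.0);
    }
private:
    std::vector<double> values_;
};
```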
Declaring of a function in base class sounds the most appropriate solution. Not sure if you need virtual "get" though, instead just declare the array in the base class and provide access method(s) for descendants.
The more complex part is "the input parameters vary slightly". If the parameters differ by type only, then you may write a template function. If the difference is more significant, then the only solution I see is splitting the main function into several logic blocks and using these blocks in the descendant classes to produce the final result.
If your classes are quite similar, you could create a template class with three different implementations that has the function foo<T>()
Implement that function in base class. If these classes are similar as you say, they should be derived from one base class anyway! If there are several functions like foo(), it might be reasonable in some cases to combine them into another class which is utilized by/with your classes.
If the underlying data of the class is the same (array of doubles), consider using a single class and overloading the constructor, or just use 3 different functions:
void PopulateFromString(const string&)
void PopulateFromXml(...)
void PopulateFromInteger(...)
If the data or the behavior is different in each class type, then your solution of base class is good.
You can also define a function in the same namespace as your classes as a utility function, if it has nothing to do with specific class behavior (polymorphism). Bjarne Stroustrup recommends this method, by the way.
For the purpose of this answer, I am assuming the classes you have are not common in any other outwards way; they may load the same data, but they are providing different interfaces.
There are two possible situations here, and you haven't told us which one it is. It could be more like
void foo(double* arr, size_t size) {
// Some specific code (that probably just does some preparation)
// Lots of generic code
// ...
// Some more specific code (cleanup?)
}
or something similar to
void foo(double* arr, size_t size) {
// generic_code();
// ...
// specific_code();
// generic_code();
// ...
}
In the first case, the generic code may very well be easy to put into a separate function, and then making a base class doesn't make much sense: you'll probably be inheriting from it privately, and you should prefer composition over private inheritance unless you have a good reason to. You could put the new function in its own class if it benefits from it, but it's not strictly necessary. Whether you put it in a namespace or not depends on how you're organising your code.
The second case is trickier, and in that case I would advise polymorphism. However, you don't seem to need runtime polymorphism for this, and so you could just as well do it compile-time. Using the fact that this is C++, you can use CRTP:
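template<typename IMPL>
class MyBase {
public: // foo must be callable from outside
    void foo(double* arr, size_t size) {
        // generic code
        // ...
        double importantResult = IMPL::DoALittleWork(/* args */);
        // more generic code
        // ...
    }
};
class Derived : public MyBase<Derived> {
public: // MyBase<Derived> needs access to DoALittleWork
    static double DoALittleWork(/* params */) {
        // My specific stuff
        double result = 0.0; // computed from the params
        return result;
    }
};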
This gives you the benefit of code organisation and saves you some virtual functions. On the other hand, it does make it slightly less clear what functions need to be implemented (although the error messages are not that bad).
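For a version that compiles as-is, here is a sketch in which the "specific" step is reduced to computing one value per index. FillerBase, Squares and Doubles are hypothetical names, and the loop is only a stand-in for the real generic code:

```cpp
#include <cassert>
#include <cstddef>

template <typename Impl>
class FillerBase {
public:
    void foo(double* arr, std::size_t size) {
        // Generic part: walk the array.
        for (std::size_t i = 0; i < size; ++i) {
            // Specific part: resolved at compile time, no virtual call.
            arr[i] = Impl::DoALittleWork(static_cast<double>(i));
        }
    }
};

class Squares : public FillerBase<Squares> {
public:
    static double DoALittleWork(double x) { return x * x; }
};

class Doubles : public FillerBase<Doubles> {
public:
    static double DoALittleWork(double x) { return 2.0 * x; }
};
```

If a derived class forgets DoALittleWork, the error appears where foo is instantiated, which is the slightly indirect diagnostic mentioned above.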
I would only go with the second route if making a new function (possibly within a new class) would clearly be uglier. If you're parsing different formats as Andrey says, then having a parser object (that would be polymorphic) passed in would be even nicer as it would allow you to mock things with less trouble, but you haven't given enough details to say for sure.

How can I structure my C++ code so that I only write my common methods once?

If C++.NET allowed multiple inheritance, I would have my common methods in a class and derive from it.
I have classes derived from Panel, Label, TabControl ... which have the same methods exactly.
How can I structure my C++ code so that I only write my common methods once?
Here is a simple example of a property I want to add to each derived class. Extension methods sound ideal, but don't exist in C++.
private: int panelBottomMargin;
public:
[Browsable(true)]
[CategoryAttribute("Layout"), DescriptionAttribute(
"Specify the gap between the last control and the bottom of the panel"),
DefaultValueAttribute(panelBottomMarginDefault)]
[DesignerSerializationVisibility(DesignerSerializationVisibility::Visible)]
property int PanelBottomMargin
{
int get() { return this->panelBottomMargin; }
void set(int margin) { this->panelBottomMargin = margin; }
}
I can't quite make out for sure what you mean by "common methods" here, but generally speaking namespace level non-member functions are the best way to do that (see pretty much every algorithm in the standard library).
If it actually needs access to private attributes of your class then it's probably not a common method and should be implemented in the level of inheritance where the attribute it operates on exist.
It's almost certainly an abuse of inheritance to put common methods into a class that you then inherit from: Use inheritance to extend, NOT to reuse.
Put your common methods in a Utility class, create an instance of this class (pass the object to work on to the constructor) when needed.
What is wrong with static methods? Or instantiating a new class which can operate on objects of the type given? It's best not to abuse inheritance in ways which clearly don't follow the "is-a" doctrine; use "has-a" whenever possible.
Generally, if MI is being considered as a solution to your problem which does not involve "mixin" type semantics, you should consider a new solution.
You could use .NET's "extension methods" if you don't need to access private/protected fields of an object.

Why bother with virtual functions in c++?

This is not a question about how they work or how they are declared; that, I think, is pretty much clear to me. The question is about why one would use them.
I suppose the practical reason is to simplify bunch of other code to relate and declare their variables of base type, to handle objects and their specific methods from many other subclasses?
Could this be done by templating and typechecking, like I do it in Objective C? If so, what is more efficient? I find it confusing to declare object as one class and instantiate it as another, even if it is its child.
Sorry for the stupid questions, but I haven't done any real projects in C++ yet, and since I am an active Objective-C developer (it is a much smaller language, thus relying heavily on the SDK's functionality, as on OS X and iOS) I need a clear view of the parallels between the two cousins.
Yes, this can be done with templates, but then the caller must know what the actual type of the object is (the concrete class) and this increases coupling.
With virtual functions the caller doesn't need to know the actual class - it operates through a pointer to a base class, so you can compile the client once and the implementor can change the actual implementation as much as it wants and the client doesn't have to know about that as long as the interface is unchanged.
Virtual functions implement polymorphism. I don't know Obj-C, so I cannot compare both, but the motivating use case is that you can use derived objects in place of base objects and the code will work. If you have a compiled and working function foo that operates on a reference to base you need not modify it to have it work with an instance of derived.
You could do that (assuming that you had runtime type information) by obtaining the real type of the argument and then dispatching directly to the appropriate function with a switch of sorts, but that would require either manually modifying the switch for each new type (high maintenance cost) or having reflection (unavailable in C++) to obtain the method pointer. Even then, after obtaining a method pointer you would have to call it, which is as expensive as the virtual call.
As to the cost associated to a virtual call, basically (in all implementations with a virtual method table) a call to a virtual function foo applied on object o: o.foo() is translated to o.vptr[ 3 ](), where 3 is the position of foo in the virtual table, and that is a compile time constant. This basically is a double indirection:
From the object o obtain the pointer to the vtable, index that table to obtain the pointer to the function, and then call it. The extra cost compared with a direct non-polymorphic call is just the table lookup. (In fact there can be other hidden costs when using multiple inheritance, as the implicit this pointer might have to be shifted.) Either way, the cost of the virtual dispatch is very small.
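The double indirection can be imitated by hand with a table of function pointers. This is only a model of the idea, not what any particular compiler emits; Object, Method, FOO_SLOT and the two foo functions are made up for the illustration:

```cpp
#include <cassert>

struct Object;
typedef int (*Method)(const Object&);

struct Object {
    const Method* vptr;  // points at the table for the object's dynamic type
    int value;
};

enum { FOO_SLOT = 0 };  // compile-time position of foo in every table

int base_foo(const Object& self) { return self.value; }
int derived_foo(const Object& self) { return self.value * 10; }

// One table per "class", shared by all of its objects.
const Method base_vtable[] = { base_foo };
const Method derived_vtable[] = { derived_foo };

// o.foo() becomes: load vptr, index the table, call through the pointer.
int call_foo(const Object& o) { return o.vptr[FOO_SLOT](o); }
```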
I don't know the first thing about Objective-C, but here's why you want to "declare an object as one class and instantiate it as another": the Liskov Substitution Principle.
Since a PDF is a document, and an OpenOffice.org document is a document, and a Word Document is a document, it's quite natural to write
Document *d;
if (ends_with(filename, ".pdf"))
    d = new PdfDocument(filename);
else if (ends_with(filename, ".doc"))
    d = new WordDocument(filename);
else {
    // ... more formats; you get the point
}
d->print();
Now, for this to work, print would have to be virtual, or be implemented using virtual functions, or be implemented using a crude hack that reinvents the virtual wheel. The program needs to know at runtime which of the various print methods to apply.
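A compilable sketch of the same idea follows. The ends_with helper, the open_document factory, and having print() return a string instead of printing are all assumptions made so the example is self-contained:

```cpp
#include <cassert>
#include <memory>
#include <string>

class Document {
public:
    virtual ~Document() = default;
    virtual std::string print() const = 0;  // chosen at runtime
};

class PdfDocument : public Document {
public:
    explicit PdfDocument(const std::string& name) : name_(name) {}
    std::string print() const override { return "pdf:" + name_; }
private:
    std::string name_;
};

class WordDocument : public Document {
public:
    explicit WordDocument(const std::string& name) : name_(name) {}
    std::string print() const override { return "word:" + name_; }
private:
    std::string name_;
};

// Hypothetical helper: true if s ends with suffix.
bool ends_with(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Picks the concrete class from the file name; returns null if unknown.
std::unique_ptr<Document> open_document(const std::string& filename) {
    if (ends_with(filename, ".pdf"))
        return std::unique_ptr<Document>(new PdfDocument(filename));
    if (ends_with(filename, ".doc"))
        return std::unique_ptr<Document>(new WordDocument(filename));
    return nullptr;
}
```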
Templating solves a different problem, where you determine at compile time which of the various containers you're going to use (for example) when you want to store a bunch of elements. If you operate on those containers with template functions, then you don't need to rewrite them when you switch containers, or add another container to your program.
A virtual function is important in inheritance. Think of an example where you have a CMonster class and then a CRaidBoss and CBoss class that inherit from CMonster.
Both need to be drawn. A CMonster has a Draw() function, but the way a CRaidBoss and a CBoss are drawn is different. Thus, the implementation is left to them by utilizing the virtual function Draw.
Well, the idea is simply to allow the compiler to perform checks for you.
It's like a lot of features : ways to hide what you don't want to have to do yourself. That's abstraction.
Inheritance, interfaces, etc. allow you to provide an interface to the compiler for the implementation code to match.
If you didn't have the virtual function mechanism, you would have to write:
class A
{
public:
    void do_something();
};
class B : public A
{
public:
    void do_something(); // this one "hides" A::do_something(); it replaces it.
};
void DoSomething( A* object )
{
    // calling object->do_something will ALWAYS call A::do_something()
    // that's not what you want if object is B...
    // so we have to check manually
    // (real C++ would still need A to be polymorphic for dynamic_cast to compile):
    B* b_object = dynamic_cast<B*>( object );
    if( b_object != NULL ) // ok it's a b object, call B::do_something();
    {
        b_object->do_something();
    }
    else
    {
        object->do_something(); // that's a A, call A::do_something();
    }
}
Here there are several problems :
you have to write this for each function redefined in a class hierarchy.
you have one additional if for each child class.
you have to touch this function again each time you add a definition to the whole hierarchy.
it's visible code, you can get it wrong easily, each time
So, marking functions virtual does this correctly in an implicit way, rerouting automatically, in a dynamic way, the function call to the correct implementation, depending on the final type of the object.
You don't have to write any logic, so you can't introduce errors in this code, and you have one less thing to worry about.
It's the kind of thing you don't want to bother with as it can be done by the compiler/runtime.
Theorists also classify the use of templates as a form of polymorphism. Yes, both are valid approaches to the problem. The implementation techniques employed explain why one performs better or worse than the other.
For example, Java implements generics, but through type erasure. This means that it only appears to use templates; under the surface it is plain old polymorphism.
C++ has very powerful templates. The use of templates makes code quicker, though each use of a template instantiates it for the given type. This means that, if you use an std::vector for ints, doubles and strings, you'll have three different vector classes: this means that the size of the executable will suffer.

Member functions for derived information in a class

While designing an interface for a class I normally get caught in two minds whether should I provide member functions which can be calculated / derived by using combinations of other member functions. For example:
class DocContainer
{
public:
Doc* getDoc(int index) const;
bool isDocSelected(Doc*) const;
int getDocCount() const;
//Should this method be here???
//This method returns the selected documents in the container (in selectedDocs_out)
void getSelectedDocs(std::vector<Doc*>& selectedDocs_out) const;
};
Should I provide this as a class member function or probably a namespace where I can define this method? Which one is preferred?
In general, you should probably prefer free functions. Think about it from an OOP perspective.
If the function does not need access to any private members, then why should it be given access to them? That's not good for encapsulation. It means more code that may potentially fail when the internals of the class is modified.
It also limits the possible amount of code reuse.
If you wrote the function as something like this:
template <typename T>
bool getSelectedDocs(T& container, std::vector<Doc*>&);
Then the same implementation of getSelectedDocs will work for any class that exposes the required functions, not just your DocContainer.
Of course, if you don't like templates, an interface could be used, and then it'd still work for any class that implemented this interface.
On the other hand, if it is a member function, then it'll only work for this particular class (and possibly derived classes).
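A sketch of what the template free function might look like, returning the vector by value instead of using the output parameter declared above. The DocContainer here is a deliberately tiny stand-in with an invented add() method; only getDoc, isDocSelected and getDocCount come from the question:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Doc {
    int id;
};

// Works with any container type exposing getDocCount(), getDoc(i)
// and isDocSelected(doc); it touches no private members.
template <typename Container>
std::vector<Doc*> getSelectedDocs(const Container& container) {
    std::vector<Doc*> selected;
    for (int i = 0; i < container.getDocCount(); ++i) {
        Doc* doc = container.getDoc(i);
        if (container.isDocSelected(doc)) selected.push_back(doc);
    }
    return selected;
}

// Minimal hypothetical container, just enough to exercise the function.
class DocContainer {
public:
    void add(Doc* doc, bool selected) {
        docs_.push_back(doc);
        flags_.push_back(selected);
    }
    Doc* getDoc(int index) const {
        return docs_[static_cast<std::size_t>(index)];
    }
    bool isDocSelected(Doc* doc) const {
        for (std::size_t i = 0; i < docs_.size(); ++i)
            if (docs_[i] == doc) return flags_[i];
        return false;
    }
    int getDocCount() const { return static_cast<int>(docs_.size()); }

private:
    std::vector<Doc*> docs_;   // non-owning
    std::vector<bool> flags_;  // selection state, parallel to docs_
};
```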
The C++ standard library follows the same approach. Consider std::find, for example, which is made a free function for this precise reason. It doesn't need to know the internals of the class it's searching in. It just needs some implementation that fulfills its requirements. Which means that the same find() implementation can work on any container, in the standard library or elsewhere.
Scott Meyers argues for the same thing.
If you don't like it cluttering up your main namespace, you can of course put it into a separate namespace with functionality for this particular class.
I think it's fine to have getSelectedDocs as a member function. It's a perfectly reasonable operation for a DocContainer, so it makes sense as a member. Member functions should be there to make the class useful. They don't need to satisfy some sort of minimality requirement.
One disadvantage to moving it outside the class is that people will have to look in two places when they try to figure out how to use a DocContainer: they need to look in the class and also in the utility namespace.
The STL has basically aimed for small interfaces, so in your case, if and only if getSelectedDocs can be implemented more efficiently than a combination of isDocSelected and getDoc it would be implemented as a member function.
This technique may not be applicable everywhere, but it's a good rule of thumb for preventing clutter in interfaces.
I agree with the answers from Konrad and jalf. Unless there is a significant benefit from having "getSelectedDocs" then it clutters the interface of DocContainer.
Adding this member triggers my smelly code sensor. DocContainer is obviously a container so why not use iterators to scan over individual documents?
class DocContainer
{
public:
iterator begin ();
iterator end ();
// ...
bool isDocSelected (Doc *) const;
};
Then, use a functor that creates the vector of documents as it needs to:
typedef std::vector <Doc*> DocVector;
class IsDocSelected {
public:
IsDocSelected (DocContainer const & docs, DocVector & results)
: docs (docs)
, results (results)
{}
void operator()(Doc & doc) const
{
if (docs.isDocSelected (&doc))
{
results.push_back (&doc);
}
}
private:
DocContainer const & docs;
DocVector & results;
};
void foo (DocContainer & docs)
{
DocVector results;
std :: for_each (docs.begin ()
, docs.end ()
, IsDocSelected (docs, results));
}
This is a bit more verbose (at least until we have lambdas), but an advantage to this kind of approach is that the specific type of filtering is not coupled with the DocContainer class. In the future, if you need a new list of documents that are "NotSelected" there is no need to change the interface to DocContainer, you just write a new "IsDocNotSelected" class.
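With C++11 lambdas the functor class can be dropped. The sketch below assumes the documents are reachable as a range of Doc* and that selection is a flag on Doc itself; both are simplifications, and filterDocs is a hypothetical helper:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Simplified Doc that carries its own selection flag.
struct Doc {
    bool selected;
};

// Collects every Doc* in [first, last) for which pred returns true.
template <typename Iterator, typename Predicate>
std::vector<Doc*> filterDocs(Iterator first, Iterator last, Predicate pred) {
    std::vector<Doc*> results;
    std::copy_if(first, last, std::back_inserter(results), pred);
    return results;
}
```

A "not selected" list then needs only a different lambda at the call site, for example [](Doc* d) { return !d->selected; }, and no new class.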
The answer is probably "it depends"...
If the class is part of a public interface to a library that will be used by many different callers then there's a good argument for providing a multitude of functionality to make it easy to use, including some duplication and/or crossover. However, if the class is only being used by a single upstream caller then it probably doesn't make sense to provide multiple ways to achieve the same thing. Remember that all the code in the interface has to be tested and documented, so there is always a cost to adding that one last bit of functionality.
I think this is perfectly valid if the method:
fits in the class responsibilities
is not specific to only a small share of the class's clients (say, at least 20% of them would use it)
This is especially true if the method contains complex logic/computation that would be more expensive to maintain in many places than only in the class.