Class design to avoid the need for a list of base classes - C++

I'm currently in the design phase of a class library and stumbled upon a question similar to "Managing diverse classes with a central manager without RTTI" or "pattern to avoid dynamic_cast".
Imagine there is a class hierarchy with a base class Base and two classes DerivedA and DerivedB that are subclasses of Base. Somewhere in my library there will be a class that needs to hold lists of objects of both types DerivedA and DerivedB. Further suppose that this class will need to perform actions on both types depending on the type. Obviously I will use virtual functions here to implement this behavior. But what if I will need the managing class to give me all objects of type DerivedA?
Is this an indicator of a bad class design because I have the need to perform actions only on a subset of the class hierarchy?
Or does it just mean that my managing class should not use a list of Base but two lists - one for DerivedA and one for DerivedB? So in case I need to perform an action on both types I would have to iterate over two lists. In my case the probability that new subclasses will need to be added to the hierarchy is quite low, and the current number is around 3 or 4 subclasses.

But what if I will need the managing class to give me all objects of
type DerivedA?
Is this an indicator of a bad class design because I have the need to
perform actions only on a subset of the class hierarchy?
More likely yes than no. If you often need to do this, then it is worth questioning whether the hierarchy makes sense. In that case, you should separate this into two unrelated lists.
Another possible approach is to handle it through virtual methods as well, where e.g. DerivedB has a no-op implementation for methods that don't apply to it. It is hard to tell without more information.

It certainly is a sign of bad design if you store (pointers to) objects together that have to be handled differently.
You could, however, just implement this differing behaviour as an empty function in the base class, or use the visitor pattern.
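For illustration, a minimal visitor sketch for the hierarchy described in the question (Visitor and CollectAVisitor are invented names, not part of the question's design):

#include <vector>

class DerivedA;
class DerivedB;

// Visitor interface: one overload per concrete type.
class Visitor {
public:
    virtual ~Visitor() = default;
    virtual void visit(DerivedA&) = 0;
    virtual void visit(DerivedB&) = 0;
};

class Base {
public:
    virtual ~Base() = default;
    virtual void accept(Visitor& v) = 0;
};

class DerivedA : public Base {
public:
    void accept(Visitor& v) override { v.visit(*this); }
};

class DerivedB : public Base {
public:
    void accept(Visitor& v) override { v.visit(*this); }
};

// Collects the DerivedA objects out of a mixed Base list, no dynamic_cast needed.
class CollectAVisitor : public Visitor {
public:
    void visit(DerivedA& a) override { found.push_back(&a); }
    void visit(DerivedB&) override {} // no-op for DerivedB
    std::vector<DerivedA*> found;
};

Running every element of the single Base list through a CollectAVisitor yields exactly the DerivedA subset.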

You can do it in several ways.
Try to dynamic_cast to the specific class (this is a brute-force solution; I'd use it only for interfaces, as using it for concrete classes is a kind of code smell. It'll work, though.)
Do something like:
class BaseRequest {};
class DerivedASupportedRequest : public BaseRequest {};
Then modify your classes to support the method:
// (...)
void ProcessRequest(const BaseRequest & request);
Create a virtual method bool TryDoSth() in a base class; DerivedB will always return false, while DerivedA will implement the required functionality.
Alternative to the above: Create a method Supports(Action action), where Action is an enum defining possible actions or groups of actions; in such a case, calling DoSth() on a class which does not support the given feature should result in a thrown exception.
Base class may have a method ActionXController * GetControllerForX(); DerivedA will return the actual controller, DerivedB will return nullptr.
Similarly, the base class can provide the method BaseController * GetController(Action a). (A sketch of these ideas follows below.)
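A minimal sketch combining the TryDoSth() and GetControllerForX() ideas (all names are the placeholders used above, not a real API):

class ActionXController { /* ... */ };

class Base {
public:
    virtual ~Base() = default;
    // Returns true if the action was performed, false if unsupported.
    virtual bool TryDoSth() { return false; }
    // Returns nullptr when the object has no controller for action X.
    virtual ActionXController* GetControllerForX() { return nullptr; }
};

class DerivedA : public Base {
public:
    bool TryDoSth() override { /* actual work */ return true; }
    ActionXController* GetControllerForX() override { return &controller_; }
private:
    ActionXController controller_;
};

// DerivedB simply inherits the defaults: TryDoSth() == false, no controller.
class DerivedB : public Base {};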
You asked if it is a bad design. I believe that it depends on how much functionality is common and how much is different. If you have 100 common methods and only one that differs, it would be weird to hold these objects in separate lists. However, if the number of differing methods is significant, consider changing the design of your application. This may be a general rule, but there are also exceptions; it's hard to tell without knowing the context.


Determine real type of base pointer in a big hierarchy without dynamic_cast

Suppose that I have an abstract base State class and at least two derived classes, AnimalState and PlantState (also abstract). Also, I have many classes derived from AnimalState and PlantState.
class State {};                                   // abstract
class AnimalState : public State {};              // abstract
class PlantState : public State {};               // abstract
// maybe a few more such classes here
class AnimalStateSpecific1 : public AnimalState {};
class AnimalStateSpecific2 : public AnimalState {};
// ... many of them
class PlantStateSpecific1 : public PlantState {};
class PlantStateSpecific2 : public PlantState {};
// ... many of them
Now suppose that I use them in some kind of method that operates on base State pointers. Those pointers are replaced over time with pointers to different classes from the State hierarchy. This happens by some rule, specifically within a predefined state graph.
Now to the question part. In order to determine the next state, I need to know the previous one. But since I have only base State pointers, I cannot efficiently tell what type of state I have without doing a dynamic_cast to every derived class in the hierarchy, which is not good. I could have some enum with all the kinds of states that I have, but I do not really like that, because I do not want to mix information from the two hierarchy branches, as they are really different. Also, I do not like having a different enum for every branch in the hierarchy, such as AnimalStateEnum, PlantStateEnum, etc.
What is the best solution for this problem? Maybe my design is not good from the start? I want to keep it as generic as possible and work only with base class objects, if possible.
Now to the question part. In order to determine the next state, I need to know the previous one.
The simplest solution, based on the limited information we have: an object which knows its own state creates the next state object:
class State {
public:
    ...
    virtual std::unique_ptr<State> transform( /* some data */ ) = 0;
};
Then you implement it in each class derived from State which can change its state and knows where it can move to. What data you need to pass is not a simple question - it depends on your task and may have various options, but you need to define something that can be used by all derived classes, as the signature is defined on the base class and shared by all derived ones.
What is the best solution for this problem? Maybe my design is not good from the start?
This question is not trivial and can only be answered with pretty deep knowledge of your task. If you are unsure, implement a prototype and check whether the solution fits your problem well. Unfortunately, the only way to learn how to create a good design is your own experience (except in trivial cases, of course).
You could simply have a virtual method next() inside the state class hierarchy, and then do something similar to the following example:
State *globalState = nullptr;

void foo(State *s)
{
    globalState = s->next();
}
Where each derived class will implement next() to its own meaning:
// assuming the base class declares something like: virtual State *next() = 0;
PlantStateSpecific1 *AnimalStateSpecific1::next() { return new PlantStateSpecific1; }
AnimalStateSpecific1 *PlantStateSpecific1::next() { return new AnimalStateSpecific1; }
This is more OOP than having an enum / integer descriptor of the derived class.
What you can have is an integer inside the base state class that every class below it sets in its constructor. Then you can use either a series of constants, a list of possible states where the id corresponds to the state type index, or an enumerator (see the sketch below).
The id is more flexible, as you can create state types with relative ease and add handling for them without too much difficulty, as well as create a new state from the id type.
Just one of the ways I've done this before, but there are probably many others.
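A sketch of that type-tag idea, assuming a single shared id space (the constants and their values are purely illustrative):

class State {
public:
    explicit State(int typeId) : typeId_(typeId) {}
    virtual ~State() = default;
    int typeId() const { return typeId_; }
private:
    const int typeId_; // set once by each concrete class's constructor
};

// Illustrative id constants; a real project might generate or enumerate these.
constexpr int kAnimalStateSpecific1 = 1;
constexpr int kPlantStateSpecific1  = 100;

class AnimalState : public State {
public:
    using State::State; // forward the id up from the concrete classes
};

class AnimalStateSpecific1 : public AnimalState {
public:
    AnimalStateSpecific1() : AnimalState(kAnimalStateSpecific1) {}
};

The rule graph can then switch on state->typeId() without any dynamic_cast.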

Use of making the base class polymorphic?

I know the keyword virtual makes the base class polymorphic, and if I create an object and call a virtual function, the corresponding function will be called based on the runtime type. But then why should I create the object through a pointer of a different type? I mean:
Base *ptr = new Derived;
ptr->virtualfunction(); // calls the function implemented in the Derived class
If instead I create the object as
Derived *ptr = new Derived;
ptr->virtualfunction(); // does the same without the need to make the function virtual
Because you might want to store objects of different types together:
std::vector<std::unique_ptr<Base>> v;
v.push_back(std::make_unique<DerivedA>());
v.push_back(std::make_unique<DerivedB>());
v.push_back(std::make_unique<DerivedC>());
Now, if you go over that vector:
for (auto& p : v) {
    p->foo();
}
It will call foo() of DerivedA, B, and C appropriately.
Let's go with a simple example. Say you have:
class Base {};
class Derived1 : public Base {};
class Derived2 : public Base {};
Now, let's say you want to be able to store in a vector (or any container) both Derived1 and Derived2 instances.
You have to use the base class in that case.
std::vector<Base*>
// or std::vector<std::unique_ptr<Base>>
The need for polymorphism is the need to process different data in the same manner. Rather than reimplementing the same algorithm over and over for datasets with different shapes, wouldn't it be much easier to have only one implementation of that algorithm, and parameterize it with different operators?
That's the essence of polymorphism. You start with an algorithm, establish the interface it must interact with, and then build implementations of that interface. In C++ the notion of interface is implicit in every class. Any class exposes one interface (though it may support many interfaces through its ancestors), and its descendants implement it as well. By making certain methods virtual, the descendants may override and adapt them to their own internal structures, without modifying how the object is manipulated from the outside.
So polymorphism is really that: values which may adopt different shapes, and the means to access and manipulate them uniformly. The key point in answering your question is perhaps that the algorithm does not know which implementation it is manipulating. You provide a trivial example where the code knows that it works with an instance of Derived, and thus may call its methods directly. In generic code, or code referring to an interface (so to speak), that knowledge does not exist, which forces the code to rely on the base class methods (and requires the programmer to ensure that the classes he plans to use with that code are well defined - i.e. virtual - where needed).
There are many useful applications of polymorphism, but they all derive from the above principle:
heterogeneous dataset (as illustrated by other answers),
injection (in which different implementations of the same interface may be swapped one for another at runtime),
testing (and more specifically mocking, in which classes which interact with a given class C are replaced by dummies which help test the correct behaviour of C),
to name a few. Note that compile-time polymorphism (templates) and runtime polymorphism (virtual methods and inheritance) both achieve that goal, albeit in different ways and with different pros and cons.
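To make that last point concrete, here is a minimal sketch contrasting the two (all names invented):

#include <iostream>

struct Circle { void draw() const { std::cout << "circle\n"; } };
struct Square { void draw() const { std::cout << "square\n"; } };

// Compile-time polymorphism: the algorithm is stamped out per type.
template <typename Shape>
void render(const Shape& s) { s.draw(); }

// Runtime polymorphism: the algorithm is resolved through a vtable.
struct Drawable {
    virtual ~Drawable() = default;
    virtual void draw() const = 0;
};
void renderDynamic(const Drawable& d) { d.draw(); }

In both cases the rendering function is written once against an interface and never knows the concrete type it manipulates.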

Is there any way to avoid declaring virtual methods when storing (children) pointers?

I have run into an annoying problem lately, and I am not satisfied with my own workaround: I have a program that maintains a vector of pointers to a base class, and I am storing there all kinds of child object pointers. Now, each child class has methods of its own, and the main program may or may not call these methods, depending on the type of object (note though that they all heavily use common methods of the base class, so this justifies inheritance).
I have found it useful to have an "object identifier" to check the class type (and then either call the method or not), which is already not very beautiful, but this is not the main inconvenience. The main inconvenience is that, if I want to actually be able to call a derived class method through the base class pointer (or even just store the pointer in the pointer array), one needs to declare the derived methods as virtual in the base class.
Makes sense from the C++ coding point of view, but this is not practical in my case (from the development point of view), because I am planning to create many different child classes in different files, perhaps written by different people, and I don't want to tweak/maintain the base class each time to add virtual methods!
How to do this? Essentially, what I am asking (I guess) is how to implement something like Objective-C NSArrays - if you send a message to an object that does not implement the method, well, nothing happens.
Instead of this:
// variant A: declare everything in the base class
void DoStuff_A(Base* b) {
    if (b->TypeId() == DERIVED_1)
        b->DoDerived1Stuff();
    else if (b->TypeId() == DERIVED_2)
        b->DoDerived2Stuff();
}
or this:
// variant B: declare nothing in the base class
void DoStuff_B(Base* b) {
    if (b->TypeId() == DERIVED_1)
        dynamic_cast<Derived1*>(b)->DoDerived1Stuff();
    else if (b->TypeId() == DERIVED_2)
        dynamic_cast<Derived2*>(b)->DoDerived2Stuff();
}
do this:
// variant C: declare the right thing in the base class
b->DoStuff();
Note that there is a single virtual function in the base class per piece of stuff that has to be done.
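Spelled out, variant C could look like this (a sketch reusing the invented names from variants A and B):

class Base {
public:
    virtual ~Base() = default;
    virtual void DoStuff() = 0;
};

class Derived1 : public Base {
public:
    void DoStuff() override { /* what DoDerived1Stuff() used to do */ }
};

class Derived2 : public Base {
public:
    void DoStuff() override { /* what DoDerived2Stuff() used to do */ }
};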
If you find yourself in a situation where you are more comfortable with variants A or B than with variant C, stop and rethink your design. You are coupling components too tightly and in the end it will backfire.
I am planning to create many different children classes in different
files, perhaps made by different people, and I don't want to
tweak/maintain the base class each time, to add virtual methods!
You are OK with tweaking DoStuff each time a derived class is added, but tweaking Base is a no-no. May I ask why?
If your design does not fit in either A, B or C pattern, show what you have, for clairvoyance is a rare feat these days.
You can do what you describe in C++, but not using functions. It is, by the way, kind of horrible but I suppose there might be cases in which it's a legitimate approach.
First way of doing this:
Define a function with a signature something like boost::variant parseMessage(std::string, std::vector<boost::variant>); and perhaps a series of convenience functions with common signatures on the base class, and include a message lookup table on the base class which takes functors. In each class constructor, add its messages to the message table; the parseMessage function then parcels off each message to the right function on the class (sketched below).
It's ugly and slow but it should work.
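For illustration, here is a minimal sketch of such a message table. It substitutes std::function plus std::any (C++17) for boost::variant, and all names (parseMessage, registerHandler, Args) are invented:

#include <any>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Args = std::vector<std::any>;
using Handler = std::function<std::any(const Args&)>;

class Base {
public:
    virtual ~Base() = default;
    // Returns an empty std::any when the message is not understood,
    // mimicking the "nothing happens" behaviour the question asks for.
    std::any parseMessage(const std::string& name, const Args& args) {
        auto it = handlers_.find(name);
        return it != handlers_.end() ? it->second(args) : std::any{};
    }
protected:
    void registerHandler(std::string name, Handler h) {
        handlers_[std::move(name)] = std::move(h);
    }
private:
    std::map<std::string, Handler> handlers_;
};

class Derived : public Base {
public:
    Derived() {
        registerHandler("greet", [](const Args&) -> std::any {
            return std::string("hello");
        });
    }
};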
Second way of doing this:
Define the virtual functions further down the hierarchy: if you want to add int foo(bar*);, you first add a class that declares it as virtual, and then ensure every class that wants to define int foo(bar*); inherits from it. You can then use dynamic_cast to check that the pointer you are looking at inherits from this class before trying to call int foo(bar*); (see the sketch below). Possibly these interface-adding classes could be pure virtual so they can be mixed in at various points using multiple inheritance, but that may have its own problems.
This is less flexible than the first way and requires the classes that implement a function to be linked to each other. Oh, and it's still ugly.
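A minimal sketch of this second way, using the int foo(bar*) example from above (Fooable and callFooIfPossible are invented names):

struct bar {};

// Mix-in that adds the foo() interface somewhere below Base.
class Fooable {
public:
    virtual ~Fooable() = default;
    virtual int foo(bar*) = 0;
};

class Base {
public:
    virtual ~Base() = default;
};

class Derived : public Base, public Fooable {
public:
    int foo(bar*) override { return 42; }
};

// Call foo() only when the object actually implements it.
int callFooIfPossible(Base* b, bar* arg) {
    if (auto* f = dynamic_cast<Fooable*>(b))
        return f->foo(arg);
    return 0; // "nothing happens" fallback
}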
But mostly I suggest you try and write C++ code like C++ code not Objective-C code.
This can be solved by adding some sort of introspection capabilities and a meta-object system. The talk "Metadata and reflection in C++" by Jeff Tucker demonstrates how to do this using C++'s template metaprogramming.
If you don't want to go to the trouble of implementing one yourself, then it would be easier to use an existing one such as Qt's meta-object system. Note that this solution does not work with multiple inheritance due to limitations in the meta-object compiler: QObject Multiple Inheritance.
With that installed, you can query for the presence of methods and call them. This is quite tedious to do by hand, so the easiest way to call such methods is through the signal and slot mechanism.
There is also GObject, which is quite similar, and there are others.
If you are planning to create many different child classes in different files, perhaps made by different people, then I would guess you also don't want to change your main code for every child class. In that case, I think what you need to do in your base class is to define several (not too many) virtual functions (with empty implementations), BUT those functions should mark a point in the logic where they are called, like "AfterInsert" or "BeforeSorting", etc. (sketched below).
Usually there are not too many places in the logic where you wish derived classes to perform their own logic.
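For illustration, such hook methods might look like this (a sketch; insert() and the hook names are placeholders):

class Base {
public:
    virtual ~Base() = default;
    void insert(/* item */) {
        // ... common insertion logic ...
        afterInsert(); // hook: derived classes may react here
    }
protected:
    // Empty default implementations: derived classes override only
    // the moments in the logic they care about.
    virtual void afterInsert() {}
    virtual void beforeSorting() {}
};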

Virtual event handlers from several classes: multiple inheritance or composition?

My team has written several C++ classes which implement event handling via pure virtual callbacks - for example, when a message is received from another process, the base class which handles IPC messaging calls its own pure virtual function, and a derived class handles the event in an override of that function. The base class knows the event has occurred; the derived class knows what to do with it.
I now want to combine the features provided by these base classes in a higher-level class, so for example when a message arrives from another process, my new class can then forward it on over its network connection using a similar event-driven networking class. It looks like I have two options:
(1) composition: derive classes from each of the event-handling base classes and add objects of those derived classes to my new class as members, or:
(2) multiple inheritance: make my new class a derived class of all of the event-handling base classes.
I've tried both (1) and (2), and I'm not satisfied with my implementation of either.
There's an extra complication: some of the base classes have been written using initialisation and shutdown methods instead of using constructors and destructors, and of course these methods have the same names in each class. So multiple inheritance causes function name ambiguity. Solvable with using declarations and/or explicit scoping, but not the most maintainable-looking thing I've ever seen.
Even without that problem, using multiple inheritance and overriding every pure virtual function from each of several base classes is going to make my new class very big, bordering on "God Object"-ness. As requirements change (read: "as requirements are added") this isn't going to scale well.
On the other hand, using separate derived classes and adding them as members of my new class means I have to write lots of methods on each derived class to exchange information between them. This feels very much like "getters and setters" - not quite as bad, but there's a lot of "get this information from that class and hand it to this one", which has an inefficient feel to it - lots of extra methods, lots of extra reads and writes, and the classes have to know a lot about each other's logic, which feels wrong. I think a full-blown publish-and-subscribe model would be overkill, but I haven't yet found a simple alternative.
There's also a lot of duplication of data if I use composition. For example, if my class's state depends on whether its network connection is up and running, I have to either have a state flag in every class affected by this, or have every class query the networking class for its state every time a decision needs to be made. If I had just one multiply-inherited class, I could just use a flag which any code in my class could access.
So, multiple inheritance, composition, or perhaps something else entirely? Is there a general rule-of-thumb on how best to approach this kind of thing?
From your description I think you've gone for a "template method" style approach, where the base does work and then calls a pure virtual that the derived class implements, rather than a "callback interface" approach, which is pretty much the same except that the pure virtual method is on a completely separate interface that's passed in to the "base" as a parameter to the constructor. I personally prefer the latter, as I find it considerably more flexible when the time comes to plug objects together and build higher-level objects.
I tend to go for composition with the composing class implementing the callback interfaces that the composed objects require and then potentially composing again in a similar style at a higher level.
You can then decide whether it's appropriate to compose by having the composing object implement the callback interfaces and pass them in to the "composed" objects in their constructors, OR you can implement the callback interface in its own object, possibly with a simpler and more precise callback interface that your composing object implements, and compose both the "base object" and the "callback implementation object"...
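A condensed sketch of that composition style (MessageSink, IpcChannel, and Bridge are all invented names):

#include <string>

// Callback interface the composed object talks to.
class MessageSink {
public:
    virtual ~MessageSink() = default;
    virtual void onMessage(const std::string& msg) = 0;
};

// Composed object: takes its callback interface in the constructor.
class IpcChannel {
public:
    explicit IpcChannel(MessageSink& sink) : sink_(sink) {}
    void poll() { sink_.onMessage("ipc payload"); } // toy event source
private:
    MessageSink& sink_;
};

// Higher-level object composes the channel and implements the callbacks.
class Bridge : public MessageSink {
public:
    Bridge() : ipc_(*this) {}
    void onMessage(const std::string& /*msg*/) override {
        // forward the message over the network connection here
    }
private:
    IpcChannel ipc_;
};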
Personally I wouldn't go with an "abstract event handling" interface as I prefer my code to be explicit and clear even if that leads to it being slightly less generic.
I'm not totally clear on what your new class is trying to achieve, but it sounds like you're effectively having to provide a new implementation somewhere for all of these abstract event classes.
Personally I would plump for composition. Multiple inheritance quickly becomes a nightmare, especially when things have to change, and composition keeps the existing separation of concerns.
You state that each derived object will have to communicate with the network class, but can you try to reduce this to a minimum? For instance, each derived event object is purely responsible for packaging up the event info into some kind of generic packet, and that packet is then passed to the network class to do the guts of the sending.
Without knowing exactly what your new class is doing it's hard to comment, or to suggest better patterns, but the more I code, the more I am learning to agree with the old adage "favour composition over inheritance".

Factory Pattern in C++ -- doing this correctly?

I am relatively new to "design patterns" as they are referred to in a formal sense. I've not been a professional for very long, so I'm pretty new to this.
We've got a pure virtual interface base class. This interface class obviously exists to define what functionality its derived children are supposed to provide. The current use and situation in the software dictates what type of derived child we want to use, so I recommended creating a wrapper that will communicate which type of derived child we want and return a Base pointer to a new derived object. This wrapper, to my understanding, is a factory.
Well, a colleague of mine created a static function in the Base class to act as the factory. This causes me trouble for two reasons. First, it seems to break the interface nature of the Base class. It feels wrong to me that the interface would itself need to have knowledge of the children derived from it.
Secondly, it causes more problems when I try to re-use the Base class across two different Qt projects. One project is where I am implementing the first (and probably only real implementation for this one class... though I want to use the same method for two other features that will have several different derived classes) derived class, and the second is the actual application where my code will eventually be used. My colleague has created a derived class to act as a tester for the real application while I code my part. This means that I've got to add his headers and cpp files to my project, and that just seems wrong since I'm not even using his code for the project while I implement my part (but he will use mine when it is finished).
Am I correct in thinking that the factory really needs to be a wrapper around the Base class rather than the Base acting as the factory?
You do NOT want to use your interface class as the factory class. For one, if it is a true interface class, there is no implementation. Second, if the interface class does have some implementation defined (in addition to the pure virtual functions), making a static factory method now forces the base class to be recompiled every time you add a child class implementation.
The best way to implement the factory pattern is to have your interface class separate from your factory.
A very simple (and incomplete) example is below:
class MyInterface
{
public:
    virtual ~MyInterface() = default; // virtual destructor: safe deletion through the interface
    virtual void MyFunc() = 0;
};

class MyImplementation : public MyInterface
{
public:
    virtual void MyFunc() {}
};

class MyFactory
{
public:
    static MyInterface* CreateImplementation(...);
};
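Client code then depends only on the interface and the factory. A sketch of how the remaining pieces might look (glossing over the elided parameter list):

// MyFactory.cpp -- the only file that needs the concrete headers.
MyInterface* MyFactory::CreateImplementation(/* elided args */)
{
    return new MyImplementation();
}

// Client code sees only MyInterface and MyFactory:
MyInterface* obj = MyFactory::CreateImplementation(/* elided args */);
obj->MyFunc();
delete obj; // or better, hold it in a std::unique_ptr<MyInterface>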
I'd have to agree with you. Probably one of the most important principles of object-oriented programming is to have a single responsibility for the scope of a piece of code (whether it's a method, class or namespace). In your case, your base class serves the purpose of defining an interface. Adding a factory method to that class violates that principle, opening the door to a world of shi... trouble.
Yes, a static factory method in the interface (base class) requires it to have knowledge of all possible instantiations. That way, you don't get any of the flexibility the Factory Method pattern is intended to bring.
The Factory should be an independent piece of code, used by client code to create instances. You have to decide somewhere in your program what concrete instance to create. Factory Method allows you to avoid having the same decision spread out through your client code. If later you want to change the implementation (or e.g. for testing), you have just one place to edit: this may be e.g. a simple global change, through conditional compilation (usually for tests), or even via a dependency injection configuration file.
Be careful about how client code communicates what kind of implementation it wants: that's not an uncommon way of reintroducing the dependencies factories are meant to hide.
It's not uncommon to see factory member functions in a class, but it makes my eyes bleed. Often their use has been mixed up with the functionality of the named constructor idiom. Moving the creation function(s) to a separate factory class will buy you more flexibility, also to swap factories during testing.
When the interface is just for hiding the implementation details and there will be only one implementation of the Base interface ever, it could be ok to couple them. In that case, the factory function is just a new name for the constructor of the actual implementation.
However, that case is rare. Except when explicitly designed to have only one implementation ever, you are better off assuming that multiple implementations will exist at some point, if only for testing (as you discovered).
So usually it is better to split the Factory part into a separate class.