Use of making the base class polymorphic? - c++

I know the keyword virtual makes the base class polymorphic, and that if I create an object and call a virtual function, the corresponding function is called based on the run-time type of the object. But why should I create an object through a pointer of a different type? I mean:
Base *ptr = new Derived;
ptr->virtualfunction(); // calls the function as implemented in the Derived class
If I create an object so that
Derived *ptr = new Derived;
ptr->virtualfunction(); // which does the same without the need of making the function virtual.

Because you might want to store objects of different types together:
std::vector<std::unique_ptr<Base>> v;
v.push_back(std::make_unique<DerivedA>());
v.push_back(std::make_unique<DerivedB>());
v.push_back(std::make_unique<DerivedC>());
Now, if you go over that vector:
for (auto& p : v) {
    p->foo();
}
It will call foo() of DerivedA, DerivedB, and DerivedC appropriately.
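To make the loop above concrete, here is a minimal sketch. It assumes foo() returns a label rather than printing, and the helpers make_mixed, call_all and demo are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical hierarchy; foo() returns a label so the result can be checked.
struct Base {
    virtual ~Base() = default;
    virtual std::string foo() const { return "Base"; }
};
struct DerivedA : Base { std::string foo() const override { return "DerivedA"; } };
struct DerivedB : Base { std::string foo() const override { return "DerivedB"; } };
struct DerivedC : Base { std::string foo() const override { return "DerivedC"; } };

// Builds a container holding three different dynamic types.
std::vector<std::unique_ptr<Base>> make_mixed() {
    std::vector<std::unique_ptr<Base>> v;
    v.push_back(std::make_unique<DerivedA>());
    v.push_back(std::make_unique<DerivedB>());
    v.push_back(std::make_unique<DerivedC>());
    return v;
}

// Walks the container; each call is dispatched on the element's dynamic type.
std::vector<std::string> call_all(const std::vector<std::unique_ptr<Base>>& v) {
    std::vector<std::string> names;
    for (const auto& p : v) {
        names.push_back(p->foo());
    }
    return names;
}

// Small self-check of the behaviour described in the answer.
void demo() {
    const auto names = call_all(make_mixed());
    assert(names.size() == 3);
    assert(names[0] == "DerivedA" && names[1] == "DerivedB" && names[2] == "DerivedC");
}
```

If foo() were not virtual, every call in the loop would resolve to Base::foo, because the static type of *p is Base.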

Let's go with a simple example : Let's say you have
class Base {};
class Derived1 : public Base {};
class Derived2 : public Base {};
Now, let's say you want to be able to store in a vector (or any container) both Derived1 and Derived2 instances.
You have to use the base class in that case.
std::vector<Base*> v; // or std::vector<std::unique_ptr<Base>>

The need for polymorphism is the need of processing different data in the same manner. Rather than reimplementing over and over the same algorithm for dataset with different shapes, wouldn't it be much easier to have only one implementation of that algorithm, and parameterize it with different operators?
That's the essence of polymorphism. You start with an algorithm, establish the interface it must interact with, and then build implementations of that interface. In C++ the notion of interface is implicit in every class. Any class exposes one interface (though it may support many interfaces through its ancestors), and its descendants implement it as well. By making certain methods virtual, the descendants may override and adapt them to their own internal structures, without modifying how the object is manipulated from the outside.
So polymorphism is really that: values which may adopt different shapes, and the means to access and manipulate them uniformly. The key point in answering your question is perhaps that the algorithm does not know which implementation it is manipulating. You provide a trivial example where the code knows that it works with an instance of Derived, and thus may call its methods directly. In generic code, or code referring to an interface (so to speak), that knowledge does not exist, which forces the code to rely on the base class methods (and requires the programmer to ensure that the classes they plan to use with that code are well defined, i.e. virtual, where needed).
There are many useful applications of polymorphism, but they all derive from the above principle:
heterogeneous dataset (as illustrated by other answers),
injection (in which different implementations of the same interface may be swapped one for another at runtime),
testing (and more specifically mocking, in which classes which interact with a given class C are replaced by dummies which help test the correct behaviour of C),
to name a few. Note that compile time polymorphism (templates), and runtime polymorphism (virtual methods and inheritance) both achieve that goal, albeit in a different way, and with different pros and cons.
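The injection and mocking cases listed above can be sketched roughly as follows. Everything here (Logger, RecordingLogger, NullLogger, sum_and_log, demo) is a hypothetical name chosen for illustration; the point is only that the algorithm sees the interface and nothing else.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface the algorithm depends on.
struct Logger {
    virtual ~Logger() = default;
    virtual void log(const std::string& msg) = 0;
};

// Test double: records messages instead of writing them anywhere.
struct RecordingLogger : Logger {
    std::vector<std::string> messages;
    void log(const std::string& msg) override { messages.push_back(msg); }
};

// Another implementation that simply discards everything.
struct NullLogger : Logger {
    void log(const std::string&) override {}
};

// The algorithm does not know which implementation it is talking to.
int sum_and_log(const std::vector<int>& values, Logger& logger) {
    int total = 0;
    for (int v : values) total += v;
    logger.log("sum=" + std::to_string(total));
    return total;
}

// Same algorithm, two injected implementations.
void demo() {
    RecordingLogger rec;
    NullLogger null;
    assert(sum_and_log({1, 2, 3}, rec) == 6);
    assert(sum_and_log({1, 2, 3}, null) == 6);
    assert(rec.messages.size() == 1 && rec.messages[0] == "sum=6");
}
```

A template parameter instead of Logger& would give the compile-time flavour of the same idea, at the cost of fixing the implementation at compile time.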


Example for non-virtual multiple inheritance

Is there a real-world example where non-virtual multiple inheritance is being used? I'd like to have one mostly for didactic reasons. Slapping around classes named A, B, C, and D, where B and C inherit from A and D inherits from B and C is perfectly fine for explaining the question "Does/Should a D object have one or two A sub-objects?", but bears no weight about why we even have both options. Many examples care about why we do want virtual inheritance, but why would we not want virtual inheritance?
I know what virtual base classes are and how to express that stuff in code. I know about diamond inheritance and examples of multiple inheritance with a virtual base class are abundant.
The best I could find is vehicles. The base class is Vehicle which is inherited by Car and Boat. Among other things, a Vehicle has occupants() and a max_speed(). So an Amphibian that inherits from both Car and Boat inherits different max_speed() on land and water – and that makes sense –, but also different occupants() – and that does not make sense. So the Vehicle sub-objects aren't really independent; that is another problem which might be interesting to solve, but this is not the question.
Is there an example, that makes sense as a real-world model, where the two sub-objects are really independent?
You're thinking like an OOP programmer, trying to design abstract models of things. C++ multiple inheritance, like many things in C++, is a tool that has a particular effect. Whether it maps onto some OOP model is irrelevant next to the utility of the tool itself. To put it another way, you don't need a "real-world model" to justify non-virtual inheritance; you just need a real-world use case.
Because a derived class inherits the members of a base class, inheritance often is used in C++ as a means of collecting a set of common functionality together, sometimes with minimal interaction from the derived class, and injecting this functionality directly into the derived class.
The Curiously Recurring Template Pattern and other mixin-like constructs are mechanisms for doing this. The idea is that you have a base class that is a template, and its template parameter is the derived class that uses it. This allows the base class to have some access to the derived class itself without virtual functions.
The simplest example I can think of in C++ is enable_shared_from_this, which allows an object whose lifetime is currently managed by a shared_ptr to retrieve a shared_ptr to itself from just a pointer/reference to that object. It uses CRTP to add to the derived class the various members and interfaces needed to make shared_from_this possible. And since the inheritance is public, it also allows shared_ptr's various functions that "enable shared_from_this" to detect that a particular type has the shared_from_this stuff in it and to properly initialize it.
enable_shared_from_this doesn't need virtual inheritance, and indeed would probably not work very well with it.
Now imagine that I have some other CRTP class that injects some other functionality into an object. This functionality has nothing to do with shared_ptr, but it uses CRTP and inheritance.
Well, if I now write some type that wants to inherit from both enable_shared_from_this and this other functionality, well, that works just fine. There is no need for virtual inheritance, and in fact doing so would only make composition that much harder.
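As a rough sketch of that composition: below, InstanceCounted is a hypothetical CRTP mixin I made up for illustration (it is not a standard class), combined with std::enable_shared_from_this through plain non-virtual multiple inheritance. Session and demo are likewise assumed names.

```cpp
#include <cassert>
#include <memory>

// Hypothetical CRTP mixin: counts live instances per Derived type.
template <typename Derived>
class InstanceCounted {
public:
    static int live() { return count_; }
protected:
    InstanceCounted() { ++count_; }
    InstanceCounted(const InstanceCounted&) { ++count_; }
    ~InstanceCounted() { --count_; }
private:
    static int count_;
};
template <typename Derived>
int InstanceCounted<Derived>::count_ = 0;

// Two unrelated bases, both inherited non-virtually.
class Session : public std::enable_shared_from_this<Session>,
                public InstanceCounted<Session> {
public:
    std::shared_ptr<Session> self() { return shared_from_this(); }
};

void demo() {
    assert(Session::live() == 0);
    {
        auto s = std::make_shared<Session>();
        auto again = s->self();           // second owner of the same object
        assert(again == s);
        assert(s.use_count() == 2);
        assert(Session::live() == 1);
    }
    assert(Session::live() == 0);         // both owners gone, object destroyed
}
```

The two base sub-objects are fully independent here, which is the situation the question asks about: neither base knows the other exists.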
Virtual inheritance is not free. It fundamentally changes a bunch of things about how a type relates to its base classes. If you inherit from such a type, your constructors have to initialize any virtual base classes directly. The layout of such a type is very odd and is highly unlikely to be standardized. And various other things. C++ tries not to make programmers pay for functionality they don't use, so if you don't need the special properties of virtual inheritance, you shouldn't be using it.
It's the same reason C++ has non-virtual methods: the implementation is simpler and more efficient if you use non-virtual inheritance, so you need to explicitly ask for virtual inheritance if you want it. Since you don't need it if your classes never use multiple inheritance, that is the default.
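The remark about constructors having to initialize virtual bases directly can be illustrated with the vehicle example from the question. This is only a sketch: the occupant numbers and the Car2/Boat2/Amphibian2 names are assumptions added for contrast.

```cpp
#include <cassert>

struct Vehicle {
    int occupants;
    explicit Vehicle(int n) : occupants(n) {}
};

// Virtual inheritance: Car and Boat share a single Vehicle sub-object.
struct Car  : virtual Vehicle { Car()  : Vehicle(4) {} };
struct Boat : virtual Vehicle { Boat() : Vehicle(6) {} };

// The most derived class must initialize the virtual base itself;
// the Vehicle(4) and Vehicle(6) initializers above are ignored here.
struct Amphibian : Car, Boat {
    Amphibian() : Vehicle(2), Car(), Boat() {}
};

// Non-virtual counterpart: two independent Vehicle sub-objects,
// each initialized by its own direct derived class.
struct Car2  : Vehicle { Car2()  : Vehicle(4) {} };
struct Boat2 : Vehicle { Boat2() : Vehicle(6) {} };
struct Amphibian2 : Car2, Boat2 {};

void demo() {
    Amphibian a;
    assert(a.occupants == 2);            // one shared Vehicle

    Amphibian2 b;
    assert(b.Car2::occupants == 4);      // Vehicle inside Car2
    assert(b.Boat2::occupants == 6);     // Vehicle inside Boat2
}
```

Note that Amphibian would not compile without the explicit Vehicle(2), since Vehicle has no default constructor; that is the extra coupling the answer refers to.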

Class design to avoid need for list of base classes

I'm currently in the design phase of a class library and stumbled upon a question similar to "Managing diverse classes with a central manager without RTTI" or "pattern to avoid dynamic_cast".
Imagine there is a class hierarchy with a base class Base and two classes DerivedA and DerivedB that are subclasses of Base. Somewhere in my library there will be a class that needs to hold lists of objects of both types DerivedA and DerivedB. Further suppose that this class will need to perform actions on both types depending on the type. Obviously I will use virtual functions here to implement this behavior. But what if I will need the managing class to give me all objects of type DerivedA?
Is this an indicator of a bad class design because I have the need to perform actions only on a subset of the class hierarchy?
Or does it just mean that my managing class should not use a list of Base but two lists - one for DerivedA and one for DerivedB? So in case I need to perform an action on both types I would have to iterate over two lists. In my case the probability that there will be a need to add new subclasses to the hierarchy is quite low and the current number is around 3 or 4 subclasses.
"But what if I will need the managing class to give me all objects of type DerivedA? Is this an indicator of a bad class design because I have the need to perform actions only on a subset of the class hierarchy?"
More likely yes than no. If you often need to do this, then it makes sense to question whether the hierarchy makes sense. In that case, you should separate this into two unrelated lists.
Another possible approach is to also handle it through virtual methods, where e.g. DerivedB has a no-op implementation for the methods that don't apply to it. It is hard to tell without knowing more.
It certainly is a sign of bad design if you store (pointers to) objects together that have to be handled differently.
You could however just implement this differing behaviour as an empty function in the base class or use the visitor pattern.
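For the visitor pattern mentioned above, a minimal sketch could look like this. The Visitor, CollectA, all_a and demo names are hypothetical; only Base, DerivedA and DerivedB come from the question.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct DerivedA;
struct DerivedB;

// Hypothetical visitor interface: one overload per concrete type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(DerivedA&) = 0;
    virtual void visit(DerivedB&) = 0;
};

struct Base {
    virtual ~Base() = default;
    virtual void accept(Visitor& v) = 0;
};
struct DerivedA : Base { void accept(Visitor& v) override { v.visit(*this); } };
struct DerivedB : Base { void accept(Visitor& v) override { v.visit(*this); } };

// Collects only the DerivedA objects; DerivedB is deliberately ignored.
struct CollectA : Visitor {
    std::vector<DerivedA*> found;
    void visit(DerivedA& a) override { found.push_back(&a); }
    void visit(DerivedB&) override {}
};

// "Give me all objects of type DerivedA" without any cast.
std::vector<DerivedA*> all_a(const std::vector<std::unique_ptr<Base>>& items) {
    CollectA collector;
    for (const auto& item : items) item->accept(collector);
    return collector.found;
}

void demo() {
    std::vector<std::unique_ptr<Base>> items;
    items.push_back(std::make_unique<DerivedA>());
    items.push_back(std::make_unique<DerivedB>());
    items.push_back(std::make_unique<DerivedA>());
    const auto found = all_a(items);
    assert(found.size() == 2);
    assert(found[0] == items[0].get() && found[1] == items[2].get());
}
```

The trade-off is that Visitor must list every subclass, which seems acceptable when, as in the question, the hierarchy is small and rarely grows.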
You can do it in several ways.
Try to dynamic_cast to the specific class. This is a brute-force solution; I'd use it only for interfaces, since using it for concrete classes is a kind of code smell. It'll work, though.
Do something like:
class BaseRequest {};
class DerivedASupportedRequest : public BaseRequest {};
Then modify your classes to support the method:
// (...)
void ProcessRequest(const BaseRequest & request);
Create a virtual method bool TryDoSth() in a base class; DerivedB will always return false, while DerivedA will implement the required functionality.
Alternative to the above: create a method Supports(Action action), where Action is an enum defining possible actions or groups of actions; in that case, calling DoSth() on a class which does not support the given feature should throw an exception.
Base class may have a method ActionXController * GetControllerForX(); DerivedA will return the actual controller, DerivedB will return nullptr.
Similarly, base class can provide method: BaseController * GetController(Action a)
You asked whether it is a bad design. I believe that it depends on how much functionality is common and how much is different. If you have 100 common methods and only one different, it would be weird to hold these objects in separate lists. However, if the number of differing methods is noticeable, consider changing the design of your application. This may be a general rule, but there are also exceptions. It's hard to tell without knowing the context.
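Of the options listed in this answer, the GetControllerForX() one is perhaps the easiest to sketch. The controller's contents (a run counter) and the helper run_x_where_supported are assumptions for illustration only.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical controller for the feature only DerivedA supports.
struct ActionXController {
    int runs = 0;
    void run() { ++runs; }
};

struct Base {
    virtual ~Base() = default;
    // Default: the feature is not supported.
    virtual ActionXController* GetControllerForX() { return nullptr; }
};

struct DerivedA : Base {
    ActionXController controller;
    ActionXController* GetControllerForX() override { return &controller; }
};
struct DerivedB : Base {};  // inherits the nullptr default

// Runs action X on every object that offers a controller;
// returns how many objects were handled.
int run_x_where_supported(const std::vector<std::unique_ptr<Base>>& items) {
    int handled = 0;
    for (const auto& item : items) {
        if (ActionXController* c = item->GetControllerForX()) {
            c->run();
            ++handled;
        }
    }
    return handled;
}

void demo() {
    std::vector<std::unique_ptr<Base>> items;
    items.push_back(std::make_unique<DerivedA>());
    items.push_back(std::make_unique<DerivedB>());
    assert(run_x_where_supported(items) == 1);
    assert(items[1]->GetControllerForX() == nullptr);
}
```

The managing class keeps a single list of Base, and the "only for DerivedA" logic lives behind a null check instead of a cast.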

What are the disadvantages of "upcasting"?

The purpose of an abstract class is not to let the developers create an object of the base class and then upcast it, AFAIK.
Now, even if the upcasting is not required, and I still use it, does it prove to be "disadvantageous" in some way?
More clarification:
From Thinking in C++:
Often in a design, you want the base class to present only an
interface for its derived classes. That is, you don’t want anyone to
actually create an object of the base class, only to upcast to it so that
its interface can be used. This is accomplished by making that class
By upcasting, I meant: baseClass *obj = new derived ();
Upcasting can be disadvantageous for non-polymorphic classes. For example:
class Fruit { ... }; // doesn't contain any virtual method
class Apple : public Fruit { ... };
class Blackberry : public Fruit { ... };
and you upcast it somewhere:
Fruit *p = new Apple; // oops, information gone
Now you will never know (without some manual mechanism) whether *p is an instance of Apple or of Blackberry.
[Note that dynamic_cast<> is not allowed for non-polymorphic classes.]
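One possible "manual mechanism" is a type tag stored in the base, sketched below. The Kind enum, the crunchiness member and as_apple are hypothetical additions; the original Fruit classes have elided bodies, so this is a stand-in and not the answer's code.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag recording which derived type an object really is.
enum class Kind { Apple, Blackberry };

struct Fruit {
    Kind kind;
    explicit Fruit(Kind k) : kind(k) {}
};
struct Apple : Fruit {
    int crunchiness = 7;  // illustrative member only Apple has
    Apple() : Fruit(Kind::Apple) {}
};
struct Blackberry : Fruit {
    Blackberry() : Fruit(Kind::Blackberry) {}
};

// Fruit has no virtual functions, so dynamic_cast<Apple*>(fruitPtr)
// would be rejected by the compiler.
static_assert(!std::is_polymorphic<Fruit>::value, "Fruit is not polymorphic");

// Returns the Apple behind p, or nullptr. The tag check is what makes
// the unchecked static_cast safe.
Apple* as_apple(Fruit* p) {
    return p->kind == Kind::Apple ? static_cast<Apple*>(p) : nullptr;
}

void demo() {
    Apple a;
    Blackberry b;
    Fruit* pa = &a;  // upcast: static type information is gone
    Fruit* pb = &b;
    assert(as_apple(pa) == &a);
    assert(as_apple(pa)->crunchiness == 7);
    assert(as_apple(pb) == nullptr);
}
```

This is essentially a hand-rolled version of what virtual functions and RTTI provide, which is why making the base polymorphic is usually the simpler route.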
Abstract classes are used to express concepts that are common to a set of (sub-)classes, but for which it is not sensible to create instances.
Consider a class Animal. It does not make sense to create an instance of that class, because there is no thing that is just an animal. There are ducks, dogs and elephants, each of which is a subclass of animal. By formally declaring the class animal you can capture the similarities of all types of animals, and by making it abstract you can express that it cannot be instantiated.
Upcasting is required to make use of polymorphism in statically typed languages. The guarantee that a derived object can stand in for its base is, as @Jigar Joshi pointed out in a comment, called the Liskov Substitution Principle.
Edit: Upcasting is not disadvantageous. In fact, you should use it whenever possible, making your code depend on super-classes (interfaces) instead of sub-classes (implementations). This lets you switch implementations later without having to change your code.
Upcasting is a technical tool.
Like every tool it is useful when used correctly and dangerous / disadvantageous if used inconsistently.
It can be good or bad depending on how "pure" you want your code to be in respect to a given programming paradigm.
Now, C++ is not necessarily "pure OOP", not necessarily "pure Generic", not necessarily "pure functional". And since C++ is a "pragmatic language", it is not in general an advantage to force it to fit a "one and only paradigm".
The only thing that can be said, in technical terms, is that,
A derived class is a base class plus something more
Referring to a derived object through a base pointer makes that "something more" inaccessible, unless there is a mechanism in the base to make you jump into the derived scope.
The mechanism C++ offers for that implicit jump are virtual functions.
The mechanism C++ offers for explicit jump is dynamic_cast (used in downcasting).
For non-polymorphic objects (that don't have any virtual method) static_cast (to downcast) is still available, but with no runtime check.
Advantages and disadvantages derive from consistent and inconsistent use of all of those points together. It is not a matter related to downcasting only.
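The implicit and explicit "jumps" described above might look like this in practice. Base, Derived, Other and the helper functions are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};
struct Derived : Base {
    std::string name() const override { return "Derived"; }
    int extra() const { return 42; }  // the "something more"
};
struct Other : Base {};

// Implicit jump: the virtual call lands in the derived override.
std::string describe(const Base& b) { return b.name(); }

// Explicit jump: dynamic_cast checks the dynamic type at run time
// and yields nullptr when b is not a Derived.
int extra_or_default(const Base& b) {
    if (const auto* d = dynamic_cast<const Derived*>(&b)) {
        return d->extra();
    }
    return -1;
}

void demo() {
    Derived d;
    Other o;
    assert(describe(d) == "Derived");
    assert(describe(o) == "Base");
    assert(extra_or_default(d) == 42);
    assert(extra_or_default(o) == -1);
}
```

A static_cast in extra_or_default would compile too, but it would skip the check and be undefined behaviour when handed an Other.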
One disadvantage would be the obvious loss of new functionality introduced in the derived class:
class A {
public:
    void foo();
};
class B : public A {
public:
    void foo2();
};
A* b = new B;
b->foo2(); // error: foo2() is not visible through an A*
I'm talking here about non-virtual functions.
Also, if you forget to make your destructors virtual, deleting a derived object via a pointer to a base object is undefined behaviour; in practice the derived destructor usually does not run, so you might get memory leaks.
However all these can be avoided with a good architecture.
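A small sketch of the destructor point, assuming a global counter (derived_destroyed, a name made up here) to observe whether the derived destructor runs:

```cpp
#include <cassert>
#include <memory>

int derived_destroyed = 0;  // hypothetical counter, for illustration only

struct A {
    // Without 'virtual' here, destroying a B through an A* would be
    // undefined behaviour.
    virtual ~A() = default;
    void foo() {}
};
struct B : A {
    ~B() override { ++derived_destroyed; }
    void foo2() {}
};

void demo() {
    const int before = derived_destroyed;
    {
        std::unique_ptr<A> p = std::make_unique<B>();
        p->foo();
        // p->foo2();  // would not compile: foo2() is not part of A
    }  // unique_ptr<A> deletes through A*, and ~B() still runs
    assert(derived_destroyed == before + 1);
}
```

Holding the object in a smart pointer also removes the need to remember the delete at all.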

C++: What is a class interface?

I know that in C++ there is no interface keyword or anything like it, and that an interface is more of a design pattern instead.
So, if I have an Apple class, which contains information and methods to work on apples (color, sourness, size, eat, throw)..
What would an interface to Apple look like?
What do you usually need interfaces for?
You just use pure virtual functions in a class.
class IApple {
public:
    virtual ~IApple() {} // virtual destructor, so deleting through an IApple* is safe
    virtual color getColor() = 0;
    virtual sourness getSourness() = 0;
    virtual size getSize() = 0;
    virtual void eat() = 0;
};
Martin's illustrated an interface. Re your other question - what do you usually need them for:
they can be used as base classes by functions that provide this API
an interface may be a small part of the derived class's overall functionality; a derived class can implement many interfaces
pointers or references to interfaces (possibly in containers) can be used in code to decouple that code from any particular implementation (i.e. as a base for run-time polymorphic code using virtual functions / dispatch)
this can help reduce compile times and break cyclic dependencies
the implementation might be provided by a caller or a factory method
being able to vary the implementation often makes the system overall more flexible and reusable
implementations that facilitate testing can be slotted in
the interface itself may have value as a form of usage documentation (sometimes I even create interfaces as illustrations of expected template policy parameters, although there's no actual need to derive your policy from them)
some design patterns work by changing the implementation during the lifetime of the containing object/code
they can be used as a kind of annotation or trait for a class - even without providing any actual behaviour of their own - with other code checking whether the interface is a base when deciding on appropriate behaviour
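A couple of the points in the list above (one class implementing several interfaces, and client code depending only on the interface it needs) can be sketched as follows. IEdible, IThrowable, snack and playCatch are hypothetical names, loosely based on the eat/throw operations from the question.

```cpp
#include <cassert>

// Two small, hypothetical interfaces.
struct IEdible {
    virtual ~IEdible() = default;
    virtual void eat() = 0;
};
struct IThrowable {
    virtual ~IThrowable() = default;
    virtual void throwAt(int distance) = 0;
};

// One class implementing both interfaces.
class Apple : public IEdible, public IThrowable {
public:
    void eat() override { eaten_ = true; }
    void throwAt(int distance) override { travelled_ += distance; }
    bool eaten() const { return eaten_; }
    int travelled() const { return travelled_; }
private:
    bool eaten_ = false;
    int travelled_ = 0;
};

// Each function depends only on the interface it actually uses.
void snack(IEdible& food) { food.eat(); }
void playCatch(IThrowable& item) {
    item.throwAt(5);
    item.throwAt(5);
}

void demo() {
    Apple a;
    snack(a);
    playCatch(a);
    assert(a.eaten());
    assert(a.travelled() == 10);
}
```

Neither snack nor playCatch needs to include or know about Apple, which is the decoupling the list describes.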
An interface is a set of members (e.g. functions and variables) that is shared between different classes, so you can access the members of the interface without having to know which class the object actually belongs to. As long as a class implements the interface, you can be sure it has those members.
You can use it for example to iterate through different objects calling the same function on each.

Factory Pattern in C++ -- doing this correctly?

I am relatively new to "design patterns" as they are referred to in a formal sense. I've not been a professional for very long, so I'm pretty new to this.
We've got a pure virtual interface base class. This interface class is obviously to provide the definition of what functionality its derived children are supposed to do. The current use and situation in the software dictates what type of derived child we want to use, so I recommended creating a wrapper that will communicate which type of derived child we want and return a Base pointer that points to a new derived object. This wrapper, to my understanding, is a factory.
Well, a colleague of mine created a static function in the Base class to act as the factory. This causes me trouble for two reasons. First, it seems to break the interface nature of the Base class. It feels wrong to me that the interface would itself need to have knowledge of the children derived from it.
Secondly, it causes more problems when I try to re-use the Base class across two different Qt projects. One project is where I am implementing the first derived class (and probably the only real implementation for this one class, though I want to use the same method for two other features that will have several different derived classes), and the second is the actual application where my code will eventually be used. My colleague has created a derived class to act as a tester for the real application while I code my part. This means that I've got to add his headers and cpp files to my project, and that just seems wrong since I'm not even using his code for the project while I implement my part (but he will use mine when it is finished).
Am I correct in thinking that the factory really needs to be a wrapper around the Base class rather than the Base acting as the factory?
You do NOT want to use your interface class as the factory class. For one, if it is a true interface class, there is no implementation. Second, if the interface class does have some implementation defined (in addition to the pure virtual functions), making a static factory method now forces the base class to be recompiled every time you add a child class implementation.
The best way to implement the factory pattern is to have your interface class separate from your factory.
A very simple (and incomplete) example is below:
class MyInterface {
public:
    virtual void MyFunc() = 0;
};
class MyImplementation : public MyInterface {
public:
    virtual void MyFunc() {}
};
class MyFactory {
public:
    static MyInterface* CreateImplementation(...);
};
I'd have to agree with you. Probably one of the most important principles of object oriented programming is to have a single responsibility for the scope of a piece of code (whether it's a method, class or namespace). In your case, your base class serves the purpose of defining an interface. Adding a factory method to that class violates that principle, opening the door to a world of shi... trouble.
Yes, a static factory method in the interface (base class) requires it to have knowledge of all possible instantiations. That way, you don't get any of the flexibility the Factory Method pattern is intended to bring.
The Factory should be an independent piece of code, used by client code to create instances. You have to decide somewhere in your program what concrete instance to create. Factory Method allows you to avoid having the same decision spread out through your client code. If later you want to change the implementation (or e.g. for testing), you have just one place to edit: this may be e.g. a simple global change, through conditional compilation (usually for tests), or even via a dependency injection configuration file.
Be careful about how client code communicates what kind of implementation it wants: that's not an uncommon way of reintroducing the dependencies factories are meant to hide.
It's not uncommon to see factory member functions in a class, but it makes my eyes bleed. Often their use has been mixed up with the functionality of the named constructor idiom. Moving the creation function(s) to a separate factory class also buys you the flexibility to swap factories during testing.
When the interface is just for hiding the implementation details and there will be only one implementation of the Base interface ever, it could be ok to couple them. In that case, the factory function is just a new name for the constructor of the actual implementation.
However, that case is rare. Unless the design explicitly calls for only one implementation ever, you are better off assuming that multiple implementations will exist at some point in time, if only for testing (as you discovered).
So usually it is better to split the Factory part into a separate class.
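A sketch of that split, under a few assumptions of mine: MyFunc returns a string so the result can be checked, the Kind enum is a hypothetical way for client code to say what it wants, and the factory is a free function returning std::unique_ptr rather than a raw pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>

// The interface knows nothing about its implementations.
struct MyInterface {
    virtual ~MyInterface() = default;
    virtual std::string MyFunc() = 0;
};

// Concrete classes; in a real project these could live in separate
// files that only the factory's source file includes.
struct RealImplementation : MyInterface {
    std::string MyFunc() override { return "real"; }
};
struct TestImplementation : MyInterface {
    std::string MyFunc() override { return "test"; }
};

enum class Kind { Real, Test };  // hypothetical selector

// The factory is the only code that names the concrete classes.
std::unique_ptr<MyInterface> CreateImplementation(Kind kind) {
    switch (kind) {
        case Kind::Real: return std::make_unique<RealImplementation>();
        case Kind::Test: return std::make_unique<TestImplementation>();
    }
    return nullptr;  // unreachable for valid Kind values
}

void demo() {
    assert(CreateImplementation(Kind::Real)->MyFunc() == "real");
    assert(CreateImplementation(Kind::Test)->MyFunc() == "test");
}
```

With this layout, the project that only implements the real class never has to pull in the tester's headers; it just needs the interface header.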