Overhead of enforcing member function implementation - C++

I have a Base class and a Derived class. The only goal of the Base class is to make sure Derived implements a member function.
struct Base
{
    virtual void f() = 0;
};

struct Derived : Base
{
    void f() override final {}
};
I don't use this class polymorphically, i.e., I just instantiate objects of type Derived on the stack, like this:
Derived obj;
and I need to do this millions of times.
Edit: only a few instances exist at the same time (no stack overflow).
Is a vtable created here (during compilation time I guess)? Does it matter to me if a vtable is created or not, since I don't use it (or do I somehow)? Is there any overhead I should consider using this design? Maybe there is another way to make sure the compiler complains if Derived doesn't implement f()?

Is a vtable created here?
Yes, since you have virtual member functions.
Does it matter to me if a vtable is created or not, since I don't use it?
Even though you never call f() through a Base pointer, it still matters in the sense that the vtable pointer increases the size of your Derived structure.
Here your Derived structure has size 8 (the size of one vtable pointer on a typical 64-bit platform). Without the virtual function, it would have size 1 (the minimum size of an empty class).
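A quick way to check this on your own platform (exact sizes are implementation-defined; 8 and 1 are typical on 64-bit targets):
#include <iostream>

struct WithVtable    { virtual void f() {} };
struct WithoutVtable { void f() {} };

int main()
{
    // Typically prints "8 1" on a 64-bit implementation:
    // one vtable pointer vs. the minimum size of an empty class.
    std::cout << sizeof(WithVtable) << ' ' << sizeof(WithoutVtable) << '\n';
}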
Maybe there is another way to make sure the compiler complains if Derived doesn't implement f()?
To be honest, I think your solution using Base as an interface in order to force every derived class to implement the f() function is totally fine since it is the exact use-case for the use of interfaces.
But if the size of the Derived structure is a concern (because you said you want to instantiate it millions of times), perhaps you would be interested in the std::is_member_function_pointer type trait.
I have no idea how you intend to instantiate your Derived structure, so I cannot provide code that would exactly suit your needs.
But the idea I'm thinking about is equivalent to the following (generic example):
#include <type_traits>

template <typename T>
void instantiate_a_lot_of_times(std::size_t nb_times)
{
    // Check that the f() member function exists
    static_assert(std::is_member_function_pointer<decltype(&T::f)>::value,
                  "Error: The T::f() member function must be defined");
    for (std::size_t i = 0; i < nb_times; ++i)
    {
        T t;
        // Do something with t
    }
}
But keep in mind that this approach has the drawback of delaying the check.
The compilation will not fail when the structure definition is encountered but when the static_assert is evaluated.
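One way to reduce that delay, sketched under the same assumptions: put the static_assert at namespace scope, immediately after the class definition, so the check fires as soon as the definition has been seen. (Note that if f is missing entirely, the decltype itself produces a hard error rather than the static_assert message.)
#include <type_traits>

struct Derived
{
    void f() {} // no base class, no vtable: sizeof(Derived) == 1
};

// Fails to compile right here if Derived::f is not a member function.
static_assert(std::is_member_function_pointer<decltype(&Derived::f)>::value,
              "Error: The Derived::f() member function must be defined");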

Related

Calling a member function from a base class pointer with varying parameters depending on the derived class

I'm pretty experienced in C++, but I find myself struggling with this particular design problem.
I want to have a base class, that I can stuff in a std::map, with a virtual function that can be called generically by a method that is querying the map. But I want to be able to call that function from a base class pointer with different parameters, depending on what the derived type is. Something functionally similar to the following wildly illegal example:
class Base
{
    virtual void doThing() = 0;
};

class Derived1 : public Base
{
    void doThing(int i, const std::string& s) {} // can't do that
};

class Derived2 : public Base
{
    void doThing(double d, std::vector<int>& v) {} // can't do that either
};

enum class ID {
    DERIVED1,
    DERIVED2
};

std::map<ID, std::unique_ptr<Base>> thingmap = { ... };

std::unique_ptr<Base>& getThing(ID i) { return thingmap[i]; }

int main(int argc, const char* argv[])
{
    auto& baseptr = getThing(ID::DERIVED1);
    baseptr->doThing(42, "hello world");
}
I don't want the caller to have to know what the derived type is, only that a Derived1 takes an int and a string. Downcasting isn't an option because the whole point of this is that I don't want the caller to have to specify the derived type explicitly. And C-style variable argument lists are yucky. :-)
Edited to clarify: I know exactly why the above can't possibly work, thank you. :-) This is library code and I'm trying to conceal the internals of the library from the caller to the greatest extent possible. If there's a solution it probably involves a variadic template function.
You can't do that.
Your map is filled with pointers to Base, and Base DOES NOT declare the overloaded prototypes you define in Derived1 or Derived2... And defining overloads does not implement the pure virtual method doThing, so Derived1 and Derived2 are still abstract classes and therefore cannot be instantiated.
Worse, your getThing function only deals with Base, so the compiler will NEVER allow you to use the overloaded signatures, since they don't exist AT ALL in Base. There is no way to know the real class behind the pointer, since you don't use templates and implicit template argument deduction.
Your pattern cannot be done this way, period. Since you want to use neither downcasting nor explicitly specified child classes, you're stuck.
Even if you added all possible prototypes to Base, they would be pure virtual methods, so both derived classes would still be abstract. And if they weren't, you would never be able to know which one is a no-op and which one is implemented, since that would require downcasting!
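A minimal sketch of the abstract-class problem, stripped down from the question's example (this deliberately does not compile):
#include <string>

struct Base
{
    virtual void doThing() = 0;
};

struct Derived1 : Base
{
    // An overload with a different signature: it hides Base::doThing()
    // but does NOT override it.
    void doThing(int i, const std::string& s) {}
};

int main()
{
    Derived1 d; // error: Derived1 is abstract, doThing() is still pure virtual
}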
I think you made a common mistake, one that even expert developers make sometimes: you went straight into design BEFORE determining your real ROOT needs.
What you are asking for looks like the core of a factory, and this is really not a good way to implement that design pattern and/or to design the specialized derived classes.

What if I must override a non-virtual member function

Say we have a library which provides a class
struct Base { int foo() { return 42; } };
I cannot change that class.
99% of the people never want to override foo, hence it has not been made virtual by the library designers.
But I need to override it:
struct MyClass : Base { int foo() { return 73; } };
Even worse, the library has interfaces accepting pointers to Base.
I want to plug in MyClass, but of course, since foo is not virtual, the code behind the interface always calls Base::foo. I want it to call MyClass::foo.
What can I do about it? Is there a common pattern to make Base::foo appear to be virtual?
In reality, Base::foo is QAbstractProxyModel::sourceModel.
I'm implementing a ProxyChain, to abstract many proxy models to a single one.
QAbstractProxyModel::setSourceModel is virtual, but QAbstractProxyModel::sourceModel isn't and that makes a lot of trouble.
void ProxyChain::setSourceModel(QAbstractItemModel* source_model)
{
    for (auto* proxy : m_proxies) {
        proxy->setSourceModel(source_model);
        source_model = proxy;
    }
    QIdentityProxyModel::setSourceModel(source_model);
}

QAbstractItemModel* ProxyChain::sourceModel() const
{
    return m_proxies.front()->sourceModel();
}
What can I do about it?
Nothing.
This is why guidelines tell us to use virtual if we want other people to be able to "pretend" that their classes are versions of our classes.
The author of Base did not do that, so you do not have that power.
That's it.
What can I do about it?
Nothing. If a member function is non-virtual, then it is non-virtual. This means that any code, anywhere in the code base, which takes a Base pointer or reference and calls base->foo will be calling exactly and only Base::foo. This call is statically (compile-time) bound to the function it calls.
You cannot reach into someone else's code and make them use dynamic binding. If they didn't choose to participate in dynamic binding, then you can't make them. You can create your own derived class and write your own version of foo which hides the base class version. But this will not affect the behavior of any code which gets a pointer/reference to Base.
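A minimal sketch of that name hiding in action, using the Base and MyClass definitions from the question:
#include <iostream>

struct Base { int foo() { return 42; } };
struct MyClass : Base { int foo() { return 73; } }; // hides Base::foo, does not override it

// Stand-in for the library interface: the call is statically bound to Base::foo.
void library_code(Base* b) { std::cout << b->foo() << "\n"; }

int main()
{
    MyClass m;
    std::cout << m.foo() << "\n"; // 73: static type is MyClass
    library_code(&m);             // 42: static type is Base*
}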
In your specific case, your best bet is to call the base class's setSourceModel with the object you want sourceModel to return, every time something changes what sourceModel should return.

Why do we need a pointer of superclass to point to an object of a subclass?

I'm learning C++ now and I've read a lot of material about using a superclass pointer to point to a subclass object, especially in the case of (pure) virtual classes. Since I don't have a lot of experience, could anyone help me understand why we need to do that? Thanks a lot!
You don't need to. You can use a pointer to the derived type if that's what you really want.
The Liskov substitution principle says that we should always be able to use a derived type wherever the base type is expected. The idea is that a derived type should be no more restrictive than its base class. That way, the derived really is-a base type, and can be used wherever the base type would be used. The base type defines the interface and the derived type should meet that same interface. The derived type can augment the interface, if it likes.
The type of pointer that your function should take depends on exactly what you want to be able to accept. If you have a hierarchy with two Widgets, Button and List, for example, then if your function is happy to take any kind of Widget, it should take a Widget*. If the function specifically requires a Button, however, it should take a Button*. The reason for this is that the function probably requires some functionality that only the Button can provide.
When a member function is called through a pointer and the compiler sees that that function is virtual, the compiler ensures that the dynamic type of the object is used to determine which function to call. That is, imagine you have a Widget* parameter but actually pass a pointer to a Button object. The static type of the object is Widget (if the compiler were to only look at the parameter type), but its dynamic type is Button. If you call widget->draw(), where draw is a virtual function, it will see that the dynamic type is Button and ensure that Button::draw is called.
However, I don't recommend using raw pointers in general, so prefer references (Widget&) or smart pointers if you can.
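A minimal sketch of that dispatch, using the hypothetical Widget/Button names from above:
#include <iostream>

struct Widget
{
    virtual ~Widget() = default;
    virtual void draw() { std::cout << "Widget::draw\n"; }
};

struct Button : Widget
{
    void draw() override { std::cout << "Button::draw\n"; }
};

void render(Widget& widget)
{
    widget.draw(); // virtual call: the dynamic type decides which draw() runs
}

int main()
{
    Button button;
    render(button); // static type Widget&, dynamic type Button: prints "Button::draw"
}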
Here's an example:
struct base { virtual void do_stuff() = 0; };

struct specialization1 : base {
    void do_stuff() override { std::cout << "doing concrete stuff"; }
};
Consider that you have client code that wants to call do_stuff.
First implementation (this is how not to do it):
void client_do_stuff( specialization1& s ) { s.do_stuff(); }
This function works. If you decide (four months from now) to add to your code base:
struct specialization2 : base {
    void do_stuff() override { std::cout << "doing other concrete stuff"; }
};
You may want to call client_do_stuff with an instance of specialization2. You could duplicate client_do_stuff with a specialization2 reference, but that would be unnecessary code duplication.
A better solution would be to change client_do_stuff to take a reference to the base class, and use the same implementation with both specializations.
Second implementation:
void client_do_stuff( base& b ) { b.do_stuff(); }
client code:
specialization1 s1;
specialization2 s2;
client_do_stuff(s1); // works
client_do_stuff(s2); // works
This implementation of client_do_stuff is implemented in terms of the public interface of the base class, instead of a specialization. This makes the function "future-proof" (the principle is sometimes called "program to an interface, not an implementation").
The idea is as follows: An object has the following interface (the pure virtual class). I will hand a concrete object to your code, which adheres to this interface, but the internal details of said object I will keep to myself (encapsulation). Thus your code can make no assumptions on the precise size etc. of the object. Therefore when compiling your code, you have to use pointers or references when manipulating the object.

What is the advantage of using dynamic_cast instead of conventional polymorphism?

We can use Polymorphism (inheritance + virtual functions) in order to generalize different types under a common base-type, and then refer to different objects as if they were of the same type.
Using dynamic_cast appears to be the exact opposite approach, as in essence we are checking the specific type of an object before deciding what action we want to take.
Is there any known example for something that cannot be implemented with conventional polymorphism as easily as it is implemented with dynamic_cast?
Whenever you find yourself wanting a member function like "IsConcreteX" in a base class (edit: or, more precisely, a function like "ConcreteX *GetConcreteX"), you are basically implementing your own dynamic_cast. For example:
class Movie
{
    // ...
    virtual bool IsActionMovie() const = 0;
};

class ActionMovie : public Movie
{
    // ...
    virtual bool IsActionMovie() const { return true; }
};

class ComedyMovie : public Movie
{
    // ...
    virtual bool IsActionMovie() const { return false; }
};

void f(Movie const &movie)
{
    if (movie.IsActionMovie())
    {
        // ...
    }
}
This may look cleaner than a dynamic_cast, but on closer inspection, you'll soon realise that you've not gained anything except for the fact that the "evil" dynamic_cast no longer appears in your code (provided you're not using an ancient compiler which doesn't implement dynamic_cast! :)). It's even worse - the "self-written dynamic cast" approach is verbose, error-prone and repetitive, while dynamic_cast will work just fine with no additional code whatsoever in the class definitions.
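For comparison, here is a sketch of f() rewritten with dynamic_cast, reusing the Movie classes above; nothing has to be added to the class definitions:
void f(Movie const &movie)
{
    if (dynamic_cast<ActionMovie const *>(&movie) != nullptr)
    {
        // ...
    }
}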
So the real question should be whether there are situations where it makes sense that a base class knows about a concrete derived class. The answer is: usually it doesn't, but you will doubtlessly encounter such situations.
Think, in very abstract terms, about a component of your software which transmits objects from one part (A) to another (B). Those objects are of type Class1 or Class2, where Class2 is-a Class1.
Class1
  ^
  |
  |
Class2

A - - - - - - - -> B
      (objects)
B, however, has some special handling only for Class2. B may be a completely different part of the system, written by different people, or legacy code. In this case, you want to reuse the A-to-B communication without any modification, and you may not be in a position to modify B, either. It may therefore make sense to explicitly ask whether you are dealing with Class1 or Class2 objects at the other end of the line.
void receiveDataInB(Class1 &object)
{
    normalHandlingForClass1AndAnySubclass(object);
    if (typeid(object) == typeid(Class2))
    {
        additionalSpecialHandlingForClass2(dynamic_cast<Class2 &>(object));
    }
}
Here is an alternative version which does not use typeid:
void receiveDataInB(Class1 &object)
{
    normalHandlingForClass1AndAnySubclass(object);
    Class2 *ptr = dynamic_cast<Class2 *>(&object);
    if (ptr != 0)
    {
        additionalSpecialHandlingForClass2(*ptr);
    }
}
This might be preferable if Class2 is not a leaf class (i.e. if there may be classes further deriving from it).
In the end, it often comes down to whether you are designing a whole system with all its parts from the beginning or have to modify or adapt parts of it at a later stage. But if you ever find yourself confronted with a problem like the one above, you may come to appreciate dynamic_cast as the right tool for the right job in the right situation.
It allows you to do things which you can only do to the derived type. But this is usually a hint that a redesign is in order.
struct Foo
{
    virtual ~Foo() {}
};

struct Bar : Foo
{
    void bar() const {}
};

int main()
{
    Foo* f = new Bar();
    Bar* b = dynamic_cast<Bar*>(f);
    if (b) b->bar();
    delete f;
}
I can't think of any case where it's not possible to use virtual functions (other than such things as boost::any and similar "lost the original type" work).
However, I have found myself using dynamic_cast a few times in the Pascal compiler I'm currently writing in C++. Mostly because it's a "better" solution than adding a dozen virtual functions to the base class that are ONLY used in one or two places, when you already (should) know what type the object is. Currently, out of roughly 4300 lines of code, there are 6 instances of dynamic_cast - one of which can probably be "fixed" by actually storing the type as the derived type rather than the base type.
In a couple of places, I use things like ArrayDecl* a = dynamic_cast<ArrayDecl*>(type); to determine that type is indeed an array declaration, and not someone using a non-array type as a base, when accessing an index (and I also need a to access the array type information later). Again, adding all the virtual functions to the base TypeDecl class would give lots of functions that mostly return nothing useful (e.g. NULL), and that aren't called except when you already know that the class is (or at least should be) one of the derived types. For example, getting the range/size of an array is useless for types that aren't arrays.
No advantages really. Sometimes dynamic_cast is useful for a quick hack, but generally it is better to design classes properly and use polymorphism. There may be cases when due to some reasons it is not possible to modify the base class in order to add necessary virtual functions (e.g. it is from a third-party which we do not want to modify), but still dynamic_cast usage should be an exception, not a rule.
The often-used argument that it is not convenient to add everything to the base class does not really hold, since the Visitor pattern (see e.g. http://sourcemaking.com/design_patterns/visitor/cpp/2) solves this problem in a more organised way, purely with polymorphism - using Visitor you can keep the base class small and still use virtual functions without casting.
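For illustration, a minimal Visitor sketch along those lines, reusing the Movie hierarchy from the first answer (the visitor name and its logic here are hypothetical):
struct ActionMovie;
struct ComedyMovie;

struct MovieVisitor
{
    virtual ~MovieVisitor() = default;
    virtual void visit(ActionMovie&) = 0;
    virtual void visit(ComedyMovie&) = 0;
};

struct Movie
{
    virtual ~Movie() = default;
    virtual void accept(MovieVisitor& v) = 0;
};

struct ActionMovie : Movie
{
    void accept(MovieVisitor& v) override { v.visit(*this); }
};

struct ComedyMovie : Movie
{
    void accept(MovieVisitor& v) override { v.visit(*this); }
};

// Type-specific handling lives in a visitor, not in the Movie base class.
struct GenreReporter : MovieVisitor
{
    void visit(ActionMovie&) override { /* action-specific handling */ }
    void visit(ComedyMovie&) override { /* comedy-specific handling */ }
};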
dynamic_cast is needed on a base class pointer to downcast when a member function is available only in the derived class, not in the base class. There is no performance advantage to using it; it is simply a way to downcast safely when no virtual function covers the operation. Check the return value for a null pointer. You are correct that it is used where virtual dispatch is not available.

Would using a virtual destructor make non-virtual functions do v-table lookups?

Just what the topic asks. I also want to know why none of the usual examples of CRTP mention a virtual dtor.
EDIT:
Guys, please post about the CRTP problem as well, thanks.
Only virtual functions require dynamic dispatch (and hence vtable lookups), and not even in all cases. If the compiler is able to determine at compile time what the final overrider for a method call is, it can elide performing the dispatch at runtime. User code can also disable the dynamic dispatch if it so desires:
#include <iostream>

struct base {
    virtual void foo() const { std::cout << "base" << std::endl; }
    void bar() const { std::cout << "bar" << std::endl; }
};

struct derived : base {
    virtual void foo() const { std::cout << "derived" << std::endl; }
};

void test(base const & b) {
    b.foo();       // requires runtime dispatch: the type of the referred
                   // object is unknown at compile time
    b.base::foo(); // runtime dispatch manually disabled: output will be "base"
    b.bar();       // non-virtual, no runtime dispatch
}

int main() {
    derived d;
    d.foo();   // the type of the object is known, the compiler can substitute
               // the call with d.derived::foo()
    test(d);
}
On whether you should provide virtual destructors in all cases of inheritance, the answer is no, not necessarily. The virtual destructor is required only if code deletes objects of the derived type held through pointers to the base type. The common rule is that you should
provide a public virtual destructor or a protected non-virtual destructor
The second part of the rule ensures that user code cannot delete your object through a pointer to the base, and this implies that the destructor need not be virtual. The advantage is that if your class does not contain any virtual method, this will not change any of the properties of your class --the memory layout of the class changes when the first virtual method is added-- and you will save the vtable pointer in each instance. Of the two reasons, the first is the more important one.
#include <memory>

struct base1 {};

struct base2 {
    virtual ~base2() {}
};

struct base3 {
protected:
    ~base3() {}
};

typedef base1 base;

struct derived : base { int x; };
struct other { int y; };

int main() {
    std::unique_ptr<derived> d(new derived()); // ok: deleting at the right level
    std::unique_ptr<base> b(new derived());    // error: deleting through a base
                                               // pointer with non-virtual destructor
}
The problem in the last line of main can be resolved in two different ways. If the typedef is changed to base2, then the destructor will correctly be dispatched to the derived object and the code will not cause undefined behavior. The cost is that derived now requires a virtual table and each instance requires a pointer. More importantly, derived is no longer layout-compatible with other. The other solution is changing the typedef to base3, in which case the problem is solved by having the compiler yell at that line. The shortcoming is that you cannot delete through pointers to base; the advantage is that the compiler can statically ensure that there will be no undefined behavior.
In the particular case of the CRTP pattern (excuse the redundant "pattern"), most authors do not even care to make the destructor protected, as the intention is not to hold objects of the derived type through references to the base (templated) type. To be on the safe side, they should mark the destructor as protected, but that is rarely an issue.
Very unlikely indeed. There's nothing in the standard to stop compilers doing whole classes of stupidly inefficient things, but a non-virtual call is still a non-virtual call, regardless of whether the class has virtual functions too. It has to call the version of the function corresponding to the static type, not the dynamic type:
#include <iostream>

struct Foo {
    void foo() { std::cout << "Foo\n"; }
    virtual void virtfoo() { std::cout << "Foo\n"; }
};

struct Bar : public Foo {
    void foo() { std::cout << "Bar\n"; }
    void virtfoo() { std::cout << "Bar\n"; }
};

int main() {
    Bar b;
    Foo* pf = &b;  // static type of *pf is Foo, dynamic type is Bar
    pf->foo();     // MUST print "Foo"
    pf->virtfoo(); // MUST print "Bar"
}
So there's absolutely no need for the implementation to put non-virtual functions in the vtable, and indeed in the vtable for Bar you'd need two different slots in this example for Foo::foo() and Bar::foo(). That means it would be a special-case use of the vtable even if the implementation wanted to do it. In practice it doesn't want to do it, it wouldn't make sense to do it, don't worry about it.
CRTP base classes really ought to have destructors that are non-virtual and protected.
A virtual destructor is required if the user of the class might take a pointer to the object, cast it to the base class pointer type, then delete it. A virtual destructor means this will work. A protected destructor in the base class stops them trying it (the delete won't compile since there's no accessible destructor). So either one of virtual or protected solves the problem of the user accidentally provoking undefined behavior.
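A minimal sketch of the protected non-virtual option (class names here are illustrative):
template <class Derived>
class Base {
protected:
    ~Base() = default; // protected and non-virtual: derived objects destroy fine,
                       // but user code cannot delete through a Base<Derived>*
};

class Widget : public Base<Widget> {};

int main() {
    Widget w;                        // fine: destroyed as a Widget
    Base<Widget>* b = new Widget;
    // delete b;                     // error: ~Base is protected
    delete static_cast<Widget*>(b);  // fine: deleting at the derived level
}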
See guideline #4 here, and note that "recently" in this article means nearly 10 years ago:
http://www.gotw.ca/publications/mill18.htm
No user will create a Base<Derived> object of their own, that isn't a Derived object, since that's not what the CRTP base class is for. They just don't need to be able to access the destructor - so you can leave it out of the public interface, or to save a line of code you can leave it public and rely on the user not doing something silly.
The reason it's undesirable for it to be virtual, given that it doesn't need to be, is just that there's no point giving a class virtual functions if it doesn't need them. Some day it might cost something, in terms of object size, code complexity or even (unlikely) speed, so it's a premature pessimization to make things virtual always. The preferred approach among the kind of C++ programmer who uses CRTP, is to be absolutely clear what classes are for, whether they are designed to be base classes at all, and if so whether they are designed to be used as polymorphic bases. CRTP base classes aren't.
The reason that the user has no business casting to the CRTP base class, even if it's public, is that it doesn't really provide a "better" interface. The CRTP base class depends on the derived class, so it's not as if you're switching to a more general interface if you cast Derived* to Base<Derived>*. No other class will ever have Base<Derived> as a base class, unless it also has Derived as a base class. It's just not useful as a polymorphic base, so don't make it one.
The answer to your first question: No. Only calls to virtual functions will cause an indirection via the virtual table at runtime.
The answer to your second question: The Curiously recurring template pattern is commonly implemented using private inheritance. You don't model an 'IS-A' relationship and hence you don't pass around pointers to the base class.
For instance, in
template <class Derived> class Base
{
};
class Derived : Base<Derived>
{
};
You don't have code which takes a Base<Derived>* and then goes on to call delete on it. So you never attempt to delete an object of a derived class through a pointer to the base class. Hence, the destructor doesn't need to be virtual.
Firstly, I think the answer to the OP's question has been answered quite well - that's a solid NO.
But, is it just me going insane or is something going seriously wrong in the community? I felt a bit scared to see so many people suggesting that it's useless/rare to hold a pointer/reference to Base. Some of the popular answers above suggest that we don't model IS-A relationship with CRTP, and I completely disagree with those opinions.
It's widely known that there's no dedicated interface construct in C++. So to write testable/mockable code, a lot of people use an abstract base class (ABC) as an "interface". For example, you have a function void MyFunc(Base* ptr) and you can use it this way: MyFunc(ptr_derived). This is the conventional way to model an IS-A relationship, and it requires vtable lookups when you call any virtual functions inside MyFunc. So this is pattern one for modelling an IS-A relationship.
In some domains where performance is critical, there exists another way (pattern two) to model an IS-A relationship in a testable/mockable manner - via CRTP. And really, the performance boost can be impressive (600% in the article) in some cases; see this link. So MyFunc will look like this: template<typename Derived> void MyFunc(Base<Derived> *ptr). When you use MyFunc, you do MyFunc(ptr_derived); the compiler generates a copy of MyFunc() that matches best with the parameter type of ptr_derived - MyFunc(Base<Derived> *ptr). Inside MyFunc, we may well assume some function defined by the interface is called, and the pointers are statically cast at compile time (check out the impl() function in the link), so there is no overhead for vtable lookups.
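A sketch of pattern two with an impl() helper in the spirit of the linked article (the names here are illustrative, not taken from the article):
#include <iostream>

template <typename Derived>
struct Base {
    void interface() { impl()->implementation(); } // resolved at compile time
private:
    Derived* impl() { return static_cast<Derived*>(this); }
};

struct Derived : Base<Derived> {
    void implementation() { std::cout << "Derived::implementation\n"; }
};

// One copy of MyFunc is stamped out per derived type; no vtable is involved.
template <typename D>
void MyFunc(Base<D>* ptr) { ptr->interface(); }

int main() {
    Derived d;
    MyFunc(&d);
}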
Now, can someone please tell me whether I am talking nonsense, or whether the answers above simply did not consider this second pattern for modelling an IS-A relationship with CRTP?