Cannot remember where I saw it now- but somewhere I read that dynamic polymorphism prevents the compiler from making various optimizations.
Besides inlining, could somebody please englighten me with any examples of such "missed" optimization opportunities which polymorphism prevents the compiler from making?
With:
Derived d;
d.vMethod(); // that will call Derived::vMethod statically (allowing inlining).
With (unless one of Derived or Derived::vMethod is declared final in C++11):
void foo(Derived& d)
{
d.vMethod(); // this will call virtually vMethod (disallowing inlining).
}
Virtual call has an additional cost (as indirection through the vtable).
C++11 introduces final keyword which may turn the last example in static call.
At least in C++, polymorphic objects must be in the form of pointers or references. Sometimes this prevents the possibility of putting them on stack variables, or List types, you need to use List. Stack variables spare dynamic allocations etc.
A call to Poly.vmethod() is always resolved at compile time, even if vmethod() is virtual, while Poly->vmethod() consults the virtual method table. (Well, if the method is virtual it is meant to be polymorphic. Static methods are statically resolved in either case.)
Return value optimization (RVO) is another trick that does not have place when returning pointers or references. RVO is typically implemented by passing a hidden parameter: a pointer to a memory region, that is filled with the "returned" object. The size and the type of this region must be perfectly known at compile time.
Related
final is an excellent keyword. It lets me prevent inheritance from a class. It also lets the compiler skip the runtime dispatch mechanisms when calling virtual functions or accessing virtual bases.
Suppose now that I have some non-final class T with virtual functions and/or base classes. Suppose also that I have a reference to an instance of this class. How could I tell the compiler that this particular reference is to the fully-derived complete object, and not to a base sub-object of a more derived type?
My motivations are classes like optional and vector. If I invoke optional<T>::operator*(), I get a T&. However, I know with certainty that this reference really is a T, not some more derived object. The same applies to vector<T> and all the ways I have of accessing its elements.
I think it would be a great optimization to skip the dynamic dispatch in such cases, especially in debug mode and on compilers not smart enough to look through the optional and vector implementations and devirtualize the calls.
Formally, you can do this:
void final(A &a) {
static_cast<A*>(dynamic_cast<void*>(&a))->foo();
}
dynamic_cast<void*> returns a pointer to the most-derived type (and static_cast from void* cannot select a polymorphic base class), so the compiler can know that A::foo is being called. However, compilers don’t seem to take advantage of this information; they even generate the obvious extra instructions to actually perform the dynamic cast (even though it’ll of course be undefined behavior if it fails).
You can, certainly, devirtualize yourself by actually writing a.A::foo() whenever verbosity and genericity permit.
The above was an interview question.
I understand that as part of the virtual dispatch mechanism, the compiler creates a vTable for each class and inserts an extra pointer (vptr) during compilation. But when exactly does it assign the class' virtual table to this vptr?
How is the vptr initialized at compile time?
Whatever I read over the internet says that the compiler initializes the vptr at compile time, but initialization is a run-time mechanism. Am I wrong?
I don't understand how a compiler would initialize it.
Strictly speaking, this is not covered by the C++ standard. But enough implementations go for this to consider it common wisdom. I'm going to address only single inheritance, since multiple inheritance is way more complicated.
The compiler knows in advance where a type's virtual function table is located (it's the compiler that allocates it). It also knows all of that class type's constructors. So what it has to do is fairly simple, at the beginning of each constructor, add the following (illustrative):
this->_vptr = /*VTable's Address*/;
That's it. This is exceptionally simple, and even works intuitively when overriding. Because a derived class's constructor will just overwrite the pointer value.
And yes, that assignment, naturally, happens at run-time. Even though the table itself may be populated beforehand.
What is the performance difference between calling a virtual function from a derived class pointer directly vs from a base class pointer to the same derived class?
In the derived pointer case, will the call be statically bound, or dynamically bound? I think it'll be dynamically bound because there's no guarantee the derived pointer isn't actually pointing to a further derived class. Would the situation change if I have the derived class directly by value (not through pointer or reference)? So the 3 cases:
base pointer to derived
derived pointer to derived
derived by value
I'm concerned about performance because the code will be run on a microcontroller.
Demonstrating code
struct Base {
// virtual destructor left out for brevity
virtual void method() = 0;
};
struct Derived : public Base {
// implementation here
void method() {
}
}
// ... in source file
// call virtual method from base class pointer, guaranteed vtable lookup
Base* base = new Derived;
base->method();
// call virtual method from derived class pointer, any difference?
Derived* derived = new Derived;
derived->method();
// call virtual method from derived class value
Derived derivedValue;
derived.method();
In theory, the only C++ syntax that makes a difference is a member function call that uses qualified member name. In terms of your class definitions that would be
derived->Derived::method();
This call ignores the dynamic type of the object and goes directly to Derived::method(), i.e. it's bound statically. This is only possible for calling methods declared in the class itself or in one of its ancestor classes.
Everything else is a regular virtual function call, which is resolved in accordance with the dynamic type of the object used in the call, i.e. it is bound dynamically.
In practice, compilers will strive to optimize the code and replace dynamically-bound calls with statically-bound calls in contexts where the dynamic type of the object is known at compile time. For example
Derived derivedValue;
derivedValue.method();
will typically produce a statically-bound call in virtually every modern compiler, even though the language specification does not provide any special treatment for this situation.
Also, virtual method calls made directly from constructors and destructors are typically compiled into statically-bound calls.
Of course, a smart compiler might be able to bind the call statically in a much greater variety of contexts. For example, both
Base* base = new Derived;
base->method();
and
Derived* derived = new Derived;
derived->method();
can be seen by the compiler as trivial situations that easily allow for statically-bound calls.
Virtual functions must be compiled to work as if they were always called virtually. If your compiler compiles a virtual call as a static call, that's an optimization that must satisfy this as-if rule.
From this, it follows that the compiler must be able to prove the exact type of the object in question. And there are some valid ways in which it can do this:
If the compiler sees the creation of the object (the new expression or the automatic variable from which the address is taken) and can prove that that creation is actually the source of the current pointer value, that gives it the precise dynamic type it needs. All your examples fall into this category.
While a constructor runs, the type of the object is exactly the class containing the running constructor. So any virtual function call made in a constructor can be resolved statically.
Likewise, while a destructor runs, the type of the object is exactly the class containing the running destructor. Again, any virtual function call can be resolved statically.
Afaik, these are all the cases that allow the compiler to convert a dynamic dispatch into a static call.
All of these are optimizations, though, the compiler may decide to perform the runtime vtable lookup anyway. But good optimizing compilers should be able to detect all three cases.
There should be no difference between the first two cases, since the very idea of virtual functions is to call always the actual implementation. Leaving compiler optimisations aside (which in theory could optimise all virtual function calls away if you construct the object in the same compilation unit and there is no way the pointer can be altered in between), the second call must be implemented as a indirect (virtual) call as well, since there could be a third class inheriting from Derived and implementing that function as well. I would assume that the third call will not be virtual, since the compiler knows the actual type already at compile time. Actually you could make sure of this by not defining the function as virtual, if you know you will always do the call on the derived class directly.
For really lightweight code running on a small microcontroller I would recommend avoiding defining functions as virtual at all. Usually there is no runtime abstraction required. If you write a library and need some kind of abstraction, you can maybe work with templates instead (which give you some compile-time abstraction).
At least on PC CPUs I often find virtual calls one of the most expensive indirections you can have (probably because branch prediction is more difficult). Sometimes one can also transform the indirection to the data level, e.g. you keep one generic function which operates on different data which is indirected with pointers to the actual implementation. Of course this will work only in some very specific cases.
At run-time.
BUT: Performance as compared to what? It isn't valid to compare a virtual function call to a non-virtual function call. You need to compare it to a non-virtual function call plus an if, a switch, an indirection, or some other means of providing the same function. If the function doesn't embody a choice among implementations, i.e. doesn't need to be virtual, don't make it virtual.
I am reading something about virtual table. When it comes to pointer __vptr,
it is stated that by the author
Unlike the *this pointer, which is actually a function parameter used by the compiler to resolve self-references, *__vptr is a real pointer. Consequently, it makes each class object allocated bigger by the size of one pointer.
What does it mean here by saying this is actually a function parameter? And this is not a real pointer?
Both pointers are real in the sense that they store an address of something else in memory. By "real" the author means "stored within the class", as opposed to this pointer, which is passed to member functions without being stored in the object itself. Essentially, the pointer to __vptr is part of the object, while this pointer is not.
this is always a hidden implicit formal argument. Practically speaking, every non static member function of a class is getting an implicit first argument which is this
so in
class Foo {
int x; // a field, i.e. an instance variable
void bar(double x);
};
the Foo::bar function has two arguments, exactly as if it was the C (not C++) function
void Foo__bar(Foo* mythis, double x);
And actually, name mangling and the compiler is transforming the first into a very close equivalent of the second. (I am using mythis instead of this because this is a keyword in C++).
In principle, the ABI of your implementation could mandate a different passing convention for this (e.g. use another machine register) and for other explicit arguments. In practice, it often does not. On my Linux system the x86-64 ABI (its figure 3.4 page 21) defines a calling convention that passes this (and first pointer formal argument to C function) in %rdi processor register.
Practically speaking, in C++, most -but not all- member functions are small (defined inside the class) and inlined by the optimizing compiler (and the latest C++11 and C++14 standards have been written with optimizing compilers in mind; see also this). In that case, the question of where is this stored becomes practically meaningless... (because of the inlining).
The virtual method table (vtable) is generally an implicit first pointer field (or instance variable) of objects, but things could become more complex, e.g. with virtual multiple inheritance. the vtable data itself (the addresses of virtual functions) is generated by the compiler. See also this answer.
In theory, a C++ implementation could provide the dynamic method dispatching by another mechanism than vtable. In practice, I know no C++ implementation doing that.
If i call an inherited method on a derived class instance does the code require the use of a vtable? Or can the method calls be 'static' (Not sure if that is the correct usage of the word)
For example:
Derived derived_instance;
derived_instance.virtual_method_from_base_class();
I am using msvc, but i guess that most major compilers implement this roughly the same way.
I am (now) aware that the behavior is implementation-specific, i'm curious about the implementation.
EDIT:
I should probaby add that the reason that we are interested is that the function is called a lot of times, and it is very simple, and i am not allowed to edit the function itself in any way, i was just wondering if would be possible, and if there would be any benifit to eliminating the dynamic-dispach anyway.
I have profiled and counted functions etc etc before you all get on my backs about optomization.
Both of your examples will require that Derived has a constructor accepting a Base and create a new instance of Derived. Assuming that you have such a constructor and that this is what you want, then the compiler would "probably" be able to determine the dynamic object type statically and avoid the virtual call (if it decides to make such optimizations).
Note that the behavior is not undefined, it's just implementation-specific. There's a huge difference between the two.
If you want to avoid creating a new instance (or, more likely, that's not what you want) then you could use a reference cast static_cast<Derived&>(base_instance).virtual_method_from_base_class(); but while that avoids creating a new object it won't allow you to avoid the virtual call.
If you really want to cast at compile time what you're looking for is most likely the CRTP http://en.wikipedia.org/wiki/Curiously_recurring_template_pattern which allows you to type everything at compile time, avoiding virtual calls.
EDIT for updated question: In the case you've shown now, I would suspect many compilers capable of statically determining the dynamic type and avoiding the virtual call.
Vtable only come into play when you use pointers or references. For objects, it's always the specific class method which is invoked.
You can simply qualify the call, then there is no virtual function dispatch:
Derived derived_instance;
derived_instance.Derived::virtual_method_from_base_class();
However, I suspect that that would be premature optimization.
Do measure.