When virtual functions are invoked statically? - c++

What is the performance difference between calling a virtual function from a derived class pointer directly vs from a base class pointer to the same derived class?
In the derived pointer case, will the call be statically bound, or dynamically bound? I think it'll be dynamically bound because there's no guarantee the derived pointer isn't actually pointing to a further derived class. Would the situation change if I have the derived class directly by value (not through pointer or reference)? So the 3 cases:
base pointer to derived
derived pointer to derived
derived by value
I'm concerned about performance because the code will be run on a microcontroller.
Demonstrating code
struct Base {
// virtual destructor left out for brevity
virtual void method() = 0;
};
struct Derived : public Base {
// implementation here
void method() {
}
}
// ... in source file
// call virtual method from base class pointer, guaranteed vtable lookup
Base* base = new Derived;
base->method();
// call virtual method from derived class pointer, any difference?
Derived* derived = new Derived;
derived->method();
// call virtual method from derived class value
Derived derivedValue;
derived.method();

In theory, the only C++ syntax that makes a difference is a member function call that uses qualified member name. In terms of your class definitions that would be
derived->Derived::method();
This call ignores the dynamic type of the object and goes directly to Derived::method(), i.e. it's bound statically. This is only possible for calling methods declared in the class itself or in one of its ancestor classes.
Everything else is a regular virtual function call, which is resolved in accordance with the dynamic type of the object used in the call, i.e. it is bound dynamically.
In practice, compilers will strive to optimize the code and replace dynamically-bound calls with statically-bound calls in contexts where the dynamic type of the object is known at compile time. For example
Derived derivedValue;
derivedValue.method();
will typically produce a statically-bound call in virtually every modern compiler, even though the language specification does not provide any special treatment for this situation.
Also, virtual method calls made directly from constructors and destructors are typically compiled into statically-bound calls.
Of course, a smart compiler might be able to bind the call statically in a much greater variety of contexts. For example, both
Base* base = new Derived;
base->method();
and
Derived* derived = new Derived;
derived->method();
can be seen by the compiler as trivial situations that easily allow for statically-bound calls.

Virtual functions must be compiled to work as if they were always called virtually. If your compiler compiles a virtual call as a static call, that's an optimization that must satisfy this as-if rule.
From this, it follows that the compiler must be able to prove the exact type of the object in question. And there are some valid ways in which it can do this:
If the compiler sees the creation of the object (the new expression or the automatic variable from which the address is taken) and can prove that that creation is actually the source of the current pointer value, that gives it the precise dynamic type it needs. All your examples fall into this category.
While a constructor runs, the type of the object is exactly the class containing the running constructor. So any virtual function call made in a constructor can be resolved statically.
Likewise, while a destructor runs, the type of the object is exactly the class containing the running destructor. Again, any virtual function call can be resolved statically.
Afaik, these are all the cases that allow the compiler to convert a dynamic dispatch into a static call.
All of these are optimizations, though, the compiler may decide to perform the runtime vtable lookup anyway. But good optimizing compilers should be able to detect all three cases.

There should be no difference between the first two cases, since the very idea of virtual functions is to call always the actual implementation. Leaving compiler optimisations aside (which in theory could optimise all virtual function calls away if you construct the object in the same compilation unit and there is no way the pointer can be altered in between), the second call must be implemented as a indirect (virtual) call as well, since there could be a third class inheriting from Derived and implementing that function as well. I would assume that the third call will not be virtual, since the compiler knows the actual type already at compile time. Actually you could make sure of this by not defining the function as virtual, if you know you will always do the call on the derived class directly.
For really lightweight code running on a small microcontroller I would recommend avoiding defining functions as virtual at all. Usually there is no runtime abstraction required. If you write a library and need some kind of abstraction, you can maybe work with templates instead (which give you some compile-time abstraction).
At least on PC CPUs I often find virtual calls one of the most expensive indirections you can have (probably because branch prediction is more difficult). Sometimes one can also transform the indirection to the data level, e.g. you keep one generic function which operates on different data which is indirected with pointers to the actual implementation. Of course this will work only in some very specific cases.

At run-time.
BUT: Performance as compared to what? It isn't valid to compare a virtual function call to a non-virtual function call. You need to compare it to a non-virtual function call plus an if, a switch, an indirection, or some other means of providing the same function. If the function doesn't embody a choice among implementations, i.e. doesn't need to be virtual, don't make it virtual.

Related

Why does an abstract class have a vtable?

Regarding this post:
For implementations that use vtable, the answer is: Yes, usually. You
might think that vtable isn't required for abstract classes because
the derived class will have its own vtable, but it is needed during
construction: While the base class is being constructed, it sets the
vtable pointer to its own vtable. Later when the derived class
constructor is entered, it will use its own vtable instead.
I'm assuming the answer is correct, but I don't quite get it. Why is the vtable needed exactly for construction?
Because the standard says so.
[class.cdtor]/4
When a virtual function is called directly or indirectly from a
constructor or from a destructor, including during the construction or
destruction of the class's non-static data members, and the object to
which the call applies is the object (call it x) under construction or
destruction, the function called is the final overrider in the
constructor's or destructor's class and not one overriding it in a
more-derived class.
The rationale is that first the base class is constructed, then the derived one. If a virtual function is called inside the base class' constructor, it would be bad to call the derived class, since the derived class isn't initialized yet.
Remember that an abstract class may have non-pure virtual functions. Also, for debugging purposes, it is good to point pure virtual functions to a debugging trap (e.g. MSVC calls _purecall()).
If all virtual functions are pure, in MSVC you can omit the vtable with __declspec(novtable). If you use a lot of interface classes, this can lead to significant savings because you omit vfptr initialization. But if you accidentally call a pure virtual function, you'll get a hard to debug access violation.
vtables are implementation issues in C++, they are not part of the standard.
vtables are used for both dynamic dispatching of methods and for RTTI. While a nullptr vtable pointer would work for dynamic dispatching (as the vtable pointer is only used when you have an instance of that type) in a pure-abstract class, a dynamic_cast to a pure abstract class is legal, and it may require that the vtable itself exist.
Designers of the C++ implementation and ABI might have simply given the purely abstract class (a class with no implemented methods, just =0 ones) a vtable to make their implementation simpler. Every class has a vtable, and the vtable pointer gets set during construction of that class. Code can then rely on the fact that the vtable pointer exists and does not have to check for null every time. Code doesn't have to ask questions like "is this a purely abstract class".
For a non-pure abstract class (where some methods have implementations but some are pure virtual), during construction/destruction you can have defined (if unexpected) behavior that involves invoking exactly this class's version of a given method, and not the base class method or an inherited method. For this to work, you need to have a vtable set up. With a pure abstract class, there is no defined result of such a call, so the vtable is redundant, but for an abstract class that isn't totally abstract this does not hold.
When your class has a pure virtual function, that does not mean you cannot also have an implementation for it (!!). So that implies you can have an abstract class, which is also fully implemented. The constructor of your abstract class has to be able to call all functions - even the pure virtual ones, because of this point - that exist for it so far.
If you'd have substituted the client one, you'd get different behaviour for the base class constructor depending on the deriving class - not a great idea, so that's not allowed. You could put in place no vtable and statically resolve all function calls - that works, but it implies handling the constructor specially compared to all other functions and requires inlining all other functions to do this (since a function called from the constructor may also call a virtual etc.) - not very practical.
So it just implements a vtable for the constructor and destructor to use during construction and destruction. It allows you to use typeid and dynamic_cast in the c'tor and d'tor with the predictable result and get reliable behaviour out of the virtual functions you have. No alternative solution would do that.

"Direct" vs "virtual" call to a virtual function

I am self-taught, and therefore am not familiar with a lot of terminology. I cannot seem to find the answer to this by googling: What is a "virtual" vs a "direct" call to a virtual function?
This pertains to terminology, not technicality. I am asking for when a call is defined as being made "directly" vs "virtually".
It does not pertain to vtables, or anything else that has to do with the implementation of these concepts.
The answer to your question is different at different conceptual levels.
At conceptual language level the informal term "virtual call" usually refers to calls resolved in accordance with the dynamic type of the object used in the call. According to C++ language standard, this applies to all calls to virtual functions, except for calls that use qualified name of the function. When qualified name of the method is used in the call, the call is referred to as "direct call"
SomeObject obj;
SomeObject *pobj = &obj;
SomeObject &robj = obj;
obj.some_virtual_function(); // Virtual call
pobj->some_virtual_function(); // Virtual call
robj.some_virtual_function(); // Virtual call
obj.SomeObject::some_virtual_function(); // Direct call
pobj->SomeObject::some_virtual_function(); // Direct call
robj.SomeObject::some_virtual_function(); // Direct call
Note that you can often hear people say that calls to virtual functions made through immediate objects are "not virtual". However, the language specification does not support this point of view. According to the language, all non-qualified calls to virtual functions are the same: they are resolved in accordance with the dynamic type of the object. In that [conceptual] sense they are all virtual.
At implementation level the term "virtual call" usually refers to calls dispatched through some implementation-defined mechanism, that implements the standard-required functionality of virtual functions. Typically it is implemented through Virtual Method Table (VMT) associated with the object used in the call. However, smart compilers will only use VMT to perform calls to virtual functions when they really have to, i.e. when the dynamic type of the object is not known at compile time. In all other cases the compiler will strive to call the method directly, even if the call is formally "virtual" at the conceptual level.
For example, most of the time, calls to virtual functions made with an immediate object (as opposed to a pointer or a reference to object) will be implemented as direct calls (without involving VMT dispatch). The same applies to immediate calls to virtual functions made from object's constructor and destructor
SomeObject obj;
SomeObject *pobj = &obj;
SomeObject &robj = obj;
obj.some_virtual_function(); // Direct call
pobj->some_virtual_function(); // Virtual call in general case
robj.some_virtual_function(); // Virtual call in general case
obj.SomeObject::some_virtual_function(); // Direct call
pobj->SomeObject::some_virtual_function(); // Direct call
robj.SomeObject::some_virtual_function(); // Direct call
Of course, in this latter sense, nothing prevents the compiler from implementing any calls to virtual functions as direct calls (without involving VMT dispatch), if the compiler has sufficient information to determine the dynamic type of the object at compile time. In the above simplistic example any modern compiler should be able to implement all calls as direct calls.
Suppose you have this class:
class X {
public:
virtual void myfunc();
};
If you call the virtual function for a plain object of type X, the compiler will generate a direct call, i.e. refer directly to X::myfunct():
X a; // object of known type
a.myfunc(); // will call X::myfunc() directly
If you'd call the virtual function via a pointer dereference, or a reference, it is not clear which type the object pointed to will really have. It could be X but it could also be a type derived from X. Then the compiler will make a virtual call, i.e. use a table of pointers to the function address:
X *pa; // pointer to a polymorphic object
... // initialise the pointer to point to an X or a derived class from X
pa->myfunc(); // will call the myfunc() that is related to the real type of object pointed to
Here you have an online simulation of the code. You'll see that in the first case, the generated assembly calls the address of the function, whereas in the second case, the compiler loads something in a register and make an indirect call using this register (i.e. the called address is not "hard-wired" and will be determined dynamically at run time).

Does a pure-virtual object have a pointer to the vtbl?

Does a pure-virtual object have a pointer to the vtbl?
(that probably points to NULL?)
thanks, i'm a little bit confused with all the virtual mechanism.
Don't worry about it. Virtual tables are an implementation detail, and aren't even guaranteed to exist. The more you worry about how it might be done, the less you learn about the actual language.
That said, yes. A concrete class will then set that pointer to point to the correct virtual table.
There isn't technically such a thing as a 'pure-virtual object'. I assume you mean an object with pure-virtual methods? But you can't actually create such an object because it would be abstract and the compiler would complain.
Having said that, while the object is being constructed it is briefly an instance of the abstract class before becoming an instance of the derived class. It will in such a case have a virtual table set the functions it defines. It will probably have NULL for the pure virtual methods. If you try calling that the program will crash.
You can try this out by calling virtual methods in the constructor. You'll find they invoke the base class version if you call the methods in the base class. If you call a pure virtual method it'll crash. (In some cases the compiler will figure out what you are doing and complain instead).
The take home is:
Don't call virtual functions in your constructor, its just likely to be confusing. In fact, in most cases it is best if your constructor just sets its internal status up and does not do anything too complicated.

virtual functions are determine during the compilation?

i tried to look up whether virtual function determine during compilation or while running.
while looking i found something as dynamic linking/late binding
but i didn't understand if it means that the function itself determine during compilation before the executable or during the executable.
can someone please explain?
For virtual functions resolution is done at runtime. When you have an instance of an object the resolution of which method to call is known only when the program is running because only at runtime you know the exact type of this instance. For non-virtual functions this resolution can be done at compile time because it is known that only this method can be called and there cannot be child classes overriding it. Also that's why virtual method calls are a bit slower (absolutely negligibly but slower than non-virtual method calls). This article explains the concept in more details.
Usually virtual functions are resolved during runtime. The reasons are obvious: you usually don't know what actual object will be called at the call site.
Base *x; Derived *y;
Call1(y);
void Call1(Base *ptr)
{
ptr->virtual_member();
// will it be Base::virtual_member or Derived::virtual_member ?
//runtime resolution needed
}
Such situation, when it's not clear what function will be called at the certain place of code, and only in runtime it's actually determined, is called late binding.
However, in certain cases, you may know the function you're going to call. For example, if you don't call by pointer:
Base x; Derived y;
Call2(y);
void Call2(Base ptr)
{
ptr.virtual_member();
// It will always be Base::virtual_member even if Derived is passed!
//No dynamic binding necessary
}
The name lookup, overload resolution and access check for a virtual function call happens at compile time in the 'static' type of the object expression used to invoke the virtual function call (i.e if the object expression is of type pointer or a reference to a polymorphic base class).
The actual function called at run time however depends on the dynamic type of the object expression pointed to by the base class pointer or reference.

Virtual function and Classes

I need some answers to basic questions. I'm lost again. :(
q1 - Is this statement valid:
Whenever we define the function to be pure virtual function,
this means that function has no body.
q2 - And what is the concept of Dynamic Binding? I mean if the Compiler optimizes the code using VTABLEs and VPTRs then how is it Run-Time Polymorphism?
q3 - What are VTABLES AND VPTRs and how do their sizes change?
q4 - Please see this code:
class base
{
public:
virtual void display()
{
cout<<"Displaying from base";
}
};
class derived:public base
{
public:
void display(){cout<<"\nDisplaying from derived";}
};
int main()
{
base b,*bptr;
derived d;
bptr=&b;
bptr->display();
bptr=&d;
bptr->display();
}
Output:
Displaying from base
Displaying from derieved
Please can somebody answer why a pointer of base class can point the member function of a derived class and the vice-versa is not possible, why ?
False. It just means any derived classes must implement said function. You can still provide a definition for the function, and it can be called by Base::Function().*
Virtual tables are a way of implementing virtual functions. (The standard doesn't mandate this is the method, though.) When making a polymorphic call, the compiler will look up the function in the function table and call that one, enabling run-time binding. (The table is generated at compile time.)
See above. Their sizes change as there are more virtual functions. However, instances don't store a table but rather a pointer to the table, so class size only has a single size increase.
Sounds like you need a book.
*A classic example of this is here:
struct IBase
{
virtual ~IBase(void) = 0;
};
inline IBase::~IBase(void) {}
This wouldn't be an abstract class without a pure virtual function, but a destructor requires a definition (since it will be called when derived classes destruct.)
1) Not necessarily. There are times when you provide body for pure virtual functions
2) The function to be called is determined at run time.
False. It only means that derived classes must implement the method and that the method definition (if present) at that level will not be consider an override of the virtual method.
The vtable is implemented at compile time, but used at runtime. The compiler will redirect the call through the vtable, and that depends on the runtime type of the object (a pointer to base has static type base*, but might point to an object of type derived at runtime).
vptrs are pointers to an vtable, they do not change size. vtables are tables of pointers to code (might point to methods or to some adapter code) and have one entry for each virtual method declared in the class.
After the edit in the code:
The pointer refers to an object of type base during the first call, but it points to an object of type derived at the second call position. The dynamic dispatch mechanism (vtable) routes the call to the appropriate method.
A common implementation, which may help you understand is that in each class that declares virtual functions the compiler reserves space for a pointer to a virtual table, and it also generates the virtual table itself, where it adds pointers to the definition of each virtual method. The memory layout of the object only has that extra pointer.
When a derived class overrides any of the base class methods, the compiler generates a different vtable with pointers to the final overriders at that level. The memory layout of both the base and the derived class coincide in the base subobject part (usually the beginning), but the value of the vptr of a base object will point to the base vtable, while the value of the vptr in the derived object will point to the derived vtable.
When the compiler sees a call like bptr->display(), it checks the definition of the base class, and sees that it is the first virtual method, then it redirects the call as: bptr->hidden_vptr[0](). If the pointer is referring to a real base instance, that will be a pointer to base::display, while in the case of a derived instance it will point to derived::display.
Note that there is a lot of hand-waving in this answer. All this is implementation defined (the language does not specify the dispatch mechanism), and in most cases the dispatch mechanism is more complex. For example, when multiple inheritance takes place, the vtable will not point directly to the final overrider, but to an adapter block of code that resets the first implicit this parameter offset, as the base subobject of all but the first base are unaligned with the most derived object in memory --this is well beyond the scope of the question, just remember that this answer is a rough idea and that there is added complexity in real systems.
Is this statement valid
Not exactly: it might have a body. A more accurate definition is "Whenever we define a method to be pure virtual, this means that the method must be defined (overriden) in a concrete subclass."
And what is the concept of Dynamic Binding? I mean if the Compiler optimizes the code using VTABLEs and VPTRs then how is it Run-Time Polymorphism?
If you have an instance of a superclass (e.g. Shape) at run-time, you don't/needn't necessarily know which of its subclasses (e.g. Circle or Square) it is.
What are VTABLES AND VPTRs and how do their sizes change?
There's one vtable per class (for any class which has one or more virtual methods). The vtable contains pointers to the addresses of the class's virtual methods.
There's one vptr per object (for any object which has one or more virtual methods). The vptr points to the vtable for that object's class.
The size of the vtable increases with the number of virtual functions in the class. The size of the vptr is probably constant.
Please can somebody answer why a pointer of base class can point the member function of a derived class and the vice-versa is not possible, why ?
If you want to invoke the base class function then (because it's virtual, and the default behaviour for virtual is to call the most-derived version via the vptr/vtable) then you have to say so explicitly, e.g. like this:
bptr->base::display();