Clarification Needed on C++ Virtual Call Implementation - c++

I have some doubts about virtual functions, or rather run-time polymorphism. This is how I assume it works:
A virtual table (V-Table) is created for every class that has at least one virtual member function. I believe this is a static table, so it is created once per class and not once per object. Please correct me if I am wrong here.
This V-Table holds the addresses of the virtual functions. If the class has 4 virtual functions, the table has 4 entries pointing to the corresponding 4 functions.
The compiler adds a virtual pointer (V-Ptr) as a hidden member of the class. This virtual pointer points to the start of the virtual table.
Assume I have a program like this:
class Base
{
public:
virtual void F1();
virtual void F2();
virtual void F3();
virtual void F4();
};
class Der1 : public Base //Overrides only first 2 functions of Base class
{
public:
void F1(); //Overrides Base::F1()
void F2(); //Overrides Base::F2()
};
class Der2 : public Base //Overrides remaining functions of Base class
{
public:
void F3(); //Overrides Base::F3()
void F4(); //Overrides Base::F4()
};
int main()
{
Base* p1 = new Der1; //Believe Vtable will be populated at compile time itself
Base* p2 = new Der2;
p1->F1(); //how does it call Der1::F1()
p2->F3(); //how does it call Der2::F3()
}
If the V-Table is populated at compile time, why do we call it run-time polymorphism? Please explain how many vtables and vptrs there are, and how they work, using the above example. As I understand it, there will be 3 vtables, for the Base, Der1 and Der2 classes. The Der1 vtable has the addresses of its own F1() and F2(), whereas for F3() and F4() the addresses point to the Base class versions. Also, 3 vptrs will be added as hidden members in the Base, Der1 and Der2 classes. If everything is decided at compile time, what exactly happens at run time? Please correct me if I have the concept wrong.

It's obviously implementation defined, but most implementations
are fairly similar, more or less along the lines you describe.
Your first point (one static vtable per class, not per object)
is correct.
On your second point: vtables contain more than just pointers to functions.
There's usually an entry pointing to the RTTI information, and
often some information concerning how to fix up the this pointer
when calling the function (although this can also be done using
trampolines). In the case of virtual bases, there could also be
an offset to the virtual base.
Your third point (the hidden vptr) is also correct. Note that during construction and
destruction, the compiler will change the vptr as the dynamic
type of the object changes, and that in the case of multiple
inheritance (with or without virtual bases), there will be more
than one vptr. (The vptr is at a fixed offset with
respect to the base address of the class, and in the case of
multiple inheritance, not all classes can have the same base
address.)
As to your final remarks: the vtables are populated at compile
time, and are static. But the vptr's are set at runtime,
according to the dynamic type, and the function call uses it to
find the vtable and dispatch the call.
In your (very simple) example, there are three vtables, one for
each class. Because only simple inheritance is involved, there
is only one vptr per instance, shared between Base and the
derived class. The vtable for Base will contain four slots,
pointing to Base::F1, Base::F2, Base::F3 and Base::F4.
The vtable for Der1 will also contain four slots, pointing to
Der1::F1, Der1::F2, Base::F3 and Base::F4. The vtable
for Der2 will point to Base::F1, Base::F2, Der2::F3 and
Der2::F4. The constructor for Base will set the vptr to the
table of Base; the constructor for the derived classes will
first call the constructor for the base class, then set the vptr
to the vtable corresponding to its type. (In practice, in such
simple cases, the compiler is probably capable of determining
that the vptr is never used in the constructor to Base, and so
skip setting it. In more complicated cases, where the compiler
cannot see all of the behavior of the base class constructor,
however, this is not the case.)
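The vptr switch during construction can be observed indirectly. The following is a minimal sketch, not taken from the original answer; the names Shape, Circle and seen_in_ctor are hypothetical. While the base constructor runs, the dynamic type is still the base class, so a virtual call made there reaches the base version.

```cpp
#include <string>

struct Shape {
    std::string seen_in_ctor;  // records what name() returned during Shape()

    // While this constructor body runs, the object's dynamic type is Shape,
    // so the virtual call below resolves to Shape::name().
    Shape() { seen_in_ctor = name(); }
    virtual ~Shape() = default;

    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    // After Shape() finishes, a typical implementation points the vptr at
    // Circle's table, so later virtual calls reach this override.
    std::string name() const override { return "Circle"; }
};
```

Constructing a Circle and then calling name() through a Shape reference gives "Circle", while seen_in_ctor holds "Shape".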
As to why it is called runtime polymorphism, consider
a function:
void f(Base* p)
{
p->F1();
}
The function actually called will be different, depending on
whether p points to a Der1 or a Der2. In other words, it
will be determined at runtime.
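To make this concrete, here is a compilable sketch of the question's hierarchy. The string return values and the helpers callF1, callF3 and make are assumptions added for illustration; they are not part of the original code. The point is that callF1 and callF3 are compiled once, without knowing which class they will receive.

```cpp
#include <memory>
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string F1() const { return "Base::F1"; }
    virtual std::string F2() const { return "Base::F2"; }
    virtual std::string F3() const { return "Base::F3"; }
    virtual std::string F4() const { return "Base::F4"; }
};

class Der1 : public Base {  // overrides only the first two functions
public:
    std::string F1() const override { return "Der1::F1"; }
    std::string F2() const override { return "Der1::F2"; }
};

class Der2 : public Base {  // overrides the remaining two functions
public:
    std::string F3() const override { return "Der2::F3"; }
    std::string F4() const override { return "Der2::F4"; }
};

// Compiled once; the function actually reached depends on the object passed in.
std::string callF1(const Base& b) { return b.F1(); }
std::string callF3(const Base& b) { return b.F3(); }

// Picks the dynamic type from a run-time value (hypothetical helper).
std::unique_ptr<Base> make(int kind) {
    if (kind == 1) return std::make_unique<Der1>();
    if (kind == 2) return std::make_unique<Der2>();
    return std::make_unique<Base>();
}
```

With make(1), callF1 reaches Der1::F1 and callF3 falls back to Base::F3; with make(2) it is the other way around.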

The C++ standard doesn't specify how virtual function calls have to be implemented, but here's a simplified example of the approach that virtually every mainstream compiler uses.
From a high-level perspective, the v-tables would look like this:
Base:
Index | Function Address
------|------------------
0 | Base::F1
1 | Base::F2
2 | Base::F3
3 | Base::F4
Der1:
Index | Function Address
------|------------------
0 | Der1::F1
1 | Der1::F2
2 | Base::F3
3 | Base::F4
Der2:
Index | Function Address
------|------------------
0 | Base::F1
1 | Base::F2
2 | Der2::F3
3 | Der2::F4
When you create p1 and p2, the objects they point to get a vptr that points to Der1's vtable and Der2's vtable, respectively.
The call to p1->F1 basically means "call function 0 on p1's virtual table".
vptr[0] is Der1::F1, so it gets called.
It's called run-time polymorphism because the function that will be called for a specific object is determined at run-time (by making a look-up in the object's vtable).
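The tables above can be emulated by hand. This is only a sketch of what a typical compiler generates, assuming a single vptr and plain function pointers; real vtables hold more (RTTI, offsets). All names here (Object, Slot, callSlot, the make functions) are hypothetical.

```cpp
#include <string>

struct Object;                                 // forward declaration
using Slot = std::string (*)(const Object*);   // one vtable entry

struct Object {
    const Slot* vptr;  // stands in for the hidden pointer the compiler adds
};

// "Member functions" written as free functions with an explicit this pointer.
std::string Base_F1(const Object*) { return "Base::F1"; }
std::string Base_F2(const Object*) { return "Base::F2"; }
std::string Base_F3(const Object*) { return "Base::F3"; }
std::string Base_F4(const Object*) { return "Base::F4"; }
std::string Der1_F1(const Object*) { return "Der1::F1"; }
std::string Der1_F2(const Object*) { return "Der1::F2"; }
std::string Der2_F3(const Object*) { return "Der2::F3"; }
std::string Der2_F4(const Object*) { return "Der2::F4"; }

// One static table per class, fixed before the program runs.
const Slot Base_vtable[4] = { Base_F1, Base_F2, Base_F3, Base_F4 };
const Slot Der1_vtable[4] = { Der1_F1, Der1_F2, Base_F3, Base_F4 };
const Slot Der2_vtable[4] = { Base_F1, Base_F2, Der2_F3, Der2_F4 };

// The run-time part: each "constructor" points the object at its class's table.
Object makeBase() { return Object{Base_vtable}; }
Object makeDer1() { return Object{Der1_vtable}; }
Object makeDer2() { return Object{Der2_vtable}; }

// p->Fn() becomes: load the vptr, index the slot, call with p as this.
std::string callSlot(const Object* p, int index) { return p->vptr[index](p); }
```

Calling callSlot on an object from makeDer1() with index 0 reaches Der1_F1, while index 2 reaches Base_F3, matching the Der1 table above.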

It's implementation defined. When programming in C++, the only thing that should concern you is that if you declare a method virtual, the run-time contents of the object behind the pointer or reference will decide what code will be called.
Perhaps you should read about that topic first. Here is the C++ specific stuff.

I'm not going to go through four virtual functions and three derived types. Suffice it to say: for the ultimate base class, the vtable has pointers that point to the base class' version of all the virtual functions. For derived classes, the vtable has pointers to all of the derived class's virtual functions; when the derived class overrides a base class function, the function pointer for that function points to the derived class' version of that virtual function; when the derived class inherits a virtual function, the function pointer points to the inherited function.

Related

Function resolution from vtable in C++

I have a confusion regarding vtable after reading more about name mangling.
for ex:
class Base
{
public:
virtual void print()
{
}
};
class A : public Base
{
public:
void hello()
{
....
}
void print()
{
}
};
A obj;
obj.hello();
Base* test = new A();
test->print();
As per my understanding, after name mangling the call obj.hello() will be converted to something like _ZASDhellov(&obj). Now,
how will the virtual functions be invoked from the vtable?
Is my wild guess, test->__vtable[_ZASDprintv](&test (dynamic cast to derived???)), correct?
How are the function names resolved from the vtable?
Firstly, vtables are not in any way part of the C++ language, but rather an implementation detail used by particular compilers. Below I describe one way it is commonly used as such.
Second, your function hello is not virtual. To make it virtual, you would simply prepend virtual to its declaration.
Assuming it is now virtual: Your guess is quite close. In fact, the vtable (to which a pointer is stored with every instance of a virtual class) is an array of function pointers. The way that a particular function is looked up in it is by its ordinal. The first declared virtual function in A is the first entry in its vtable, the second one is the second entry and so on. If A had a base class, the index of A's first (non-override) virtual function in the table would be n+1, where n is the index of the last virtual function of its base class. If A has more than one base class, their entries precede A's entries in order of their declaration as base classes of A.
If A uses virtual inheritance, the picture is a bit more complicated than that, I won't elaborate unless you're specifically interested.
UPDATE: I'll add a very brief description for the virtual inheritance case as requested. If A had Base as a virtual base class, A's vtable would store at the very beginning (before the function addresses) the byte offset of where Base's data starts within the A object. This is necessary because, unlike in normal inheritance, a base class does not have its data precede the derived class's data - instead it follows it. So in effect, any function call to a virtual function defined in Base has to have its this pointer offset by that amount. Additionally, Base would have to have its own vtable pointer, right at the beginning of its data where it expects to find it. Thus the full A object would contain two vtable pointers instead of one. The actual vtable pointed to by this second pointer would be the same one as the first vtable pointer, except advanced to skip the offset entry described above (so that any Base code using the vtable would find the first virtual function at the beginning where it is expected). Apart from these differences, the vtable itself is the same as before.
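As a rough illustration of the difference between the two calls in the question, here is a sketch with the functions changed to return strings so the result can be checked. The return values and the helpers printThrough and printBaseOnly are assumptions added for the example. The non-virtual hello() is bound by its (mangled) name at compile time, while print() is bound through the object at run time.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string print() const { return "Base::print"; }
};

class A : public Base {
public:
    // Non-virtual: the compiler emits a direct call to this function.
    std::string hello() const { return "A::hello"; }

    // Virtual (inherited from Base): reached through the object's vtable slot.
    std::string print() const override { return "A::print"; }
};

// Dispatches on the dynamic type of b.
std::string printThrough(const Base& b) { return b.print(); }

// A qualified call names the function explicitly and skips virtual dispatch.
std::string printBaseOnly(const Base& b) { return b.Base::print(); }
```

Passing an A to printThrough yields "A::print"; printBaseOnly yields "Base::print" for the same object.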

Resolving of vptr

class base {
public:
virtual void fn(){}
};
class der : public base {};
I know that the compiler provides a hidden member called VPTR in the class, which the constructor initialises at run time to point to the right VTABLE. I have 2 questions:
1) Which class holds the VPTR? Or does every class have a separate VPTR?
2) When executing the statement der d; how is the VPTR resolved at run time?
A vtable is created for the class that contains a virtual function and for the classes derived from it. This means that in your program a vtable will be created for the base class and for the der class. Each of these vtables contains the address of the virtual function void fn(). Note that der doesn't define void fn(), hence its vtable contains the address of the base class's void fn(). Thus if you make a call like d.fn(); the base class's void fn() gets executed.
Note: a virtual table and a virtual pointer are implementation details, though all the C++ compilers I know use them, they are not mandated by the Standard, only the results are.
To answer your specific question: each instance of a class with virtual methods (either its own, or inherited ones) or a class with (somewhere) a virtual inheritance relationship will need at least one virtual-pointer.
There can be several (when virtual inheritance or multi-inheritance are involved).
In your example, a single virtual pointer is sufficient. However, it does not make sense to speak of it as being part of a class. The virtual pointer is part of the instance (object), and lives outside the rules for classes, because those apply to the language and the virtual pointer is an implementation mechanism.
1) Which class holds the VPTR? Or does every class have a separate VPTR?
Every object of a polymorphic class (i.e. one that contains a virtual function or has virtual inheritance) has its own vptr. In this case both classes have a virtual function.
2) When executing the statement der d; how is the VPTR resolved at run time?
The statement der d; only defines an object of type der (its constructor sets the vptr to der's vtable). Even if you then call a function on d directly, the call is resolved at compile time, because the compiler knows the exact type. Virtual function resolution comes into the picture only when the function is called through a pointer or reference.
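The "one vptr per object, shared by base and derived" point can be checked indirectly with sizeof. This is a sketch under the assumption of a typical vtable-based ABI; the standard does not guarantee these sizes, and the Plain types and constant names are hypothetical additions for comparison.

```cpp
#include <cstddef>

// Non-polymorphic pair for comparison (hypothetical).
struct PlainBase { int value = 0; };
struct PlainDer : PlainBase {};

// Polymorphic pair, mirroring the question's base/der.
struct base {
    int value = 0;
    virtual ~base() = default;
    virtual void fn() {}
};
struct der : base {};

// On common implementations the polymorphic object carries one extra
// hidden pointer, and der does not add a second one of its own.
constexpr std::size_t plainSize = sizeof(PlainDer);
constexpr std::size_t polySize = sizeof(der);
constexpr bool derAddsNothing = sizeof(der) == sizeof(base);
```

On the usual 64-bit toolchains polySize is larger than plainSize by at least a pointer's worth (plus padding), and derAddsNothing is true.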

Which function to call? (delegate to a sister class)

I just read about this in the C++ FAQ Lite
[25.10] What does it mean to "delegate to a sister class" via virtual inheritance?
class Base {
public:
virtual void foo() = 0;
virtual void bar() = 0;
};
class Der1 : public virtual Base {
public:
virtual void foo();
};
void Der1::foo()
{ bar(); }
class Der2 : public virtual Base {
public:
virtual void bar();
};
class Join : public Der1, public Der2 {
public:
...
};
int main()
{
Join* p1 = new Join();
Der1* p2 = p1;
Base* p3 = p1;
p1->foo();
p2->foo();
p3->foo();
}
"Believe it or not, when Der1::foo() calls this->bar(), it ends up calling Der2::bar(). Yes, that's right: a class that Der1 knows nothing about will supply the override of a virtual function invoked by Der1::foo(). This "cross delegation" can be a powerful technique for customizing the behavior of polymorphic classes. "
My question is:
What is happening behind the scene.
If I add a Der3 (virtually inherited from Base), what will happen? (I don't have a compiler here, so I couldn't test it right now.)
What is happening behind the scene.
The simple explanation is that, because inheritance from Base is virtual in both Der1 and Der2, there is a single Base subobject in the most derived object, Join. At compile time, assuming (as is the common case) virtual tables as the dispatch mechanism, the compiler translates the call to bar() inside Der1::foo into a call through the vtable.
The next question is how the compiler generates the vtables for each class. The vtable for Base will contain two null pointers, the vtable for Der1 will contain Der1::foo and a null pointer, and the vtable for Der2 will contain a null pointer and Der2::bar [*].
Now, because of virtual inheritance at the previous level, when the compiler processes Join it will create a single Base subobject, and thus a single vtable for the Base subobject of Join. It effectively merges the vtables of Der1 and Der2 and produces a vtable that contains pointers to Der1::foo and Der2::bar.
So the code in Der1::foo will dispatch through Join's vtable to the final overrider, which in this case is in a different branch of the virtual inheritance hierarchy.
If you add a Der3 class, and that class defines either of the virtual functions, the compiler will not be able to cleanly merge the three vtables and will complain, with some error relating to the ambiguity of the multiply defined method (none of the overriders can be considered to be the final overrider). If you add the same method to Join, then the ambiguity will no longer be a problem, as the final overrider will be the member function defined in Join, so the compiler is able to generate the virtual table.
[*] Most compilers will not write null pointers here, but rather a pointer to a generic function that will print an error message and terminate the application, allowing for better diagnostics than a plain segmentation fault.
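The delegation described above can be reproduced with a small compilable version of the FAQ example. The string return values are an assumption added so the call path is visible; the class structure is the one from the question.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string foo() = 0;
    virtual std::string bar() = 0;
};

class Der1 : public virtual Base {
public:
    // Der1 knows nothing about Der2; bar() is dispatched through the
    // vtable of whatever most derived object this is part of.
    std::string foo() override { return "Der1::foo -> " + bar(); }
};

class Der2 : public virtual Base {
public:
    std::string bar() override { return "Der2::bar"; }
};

// Join has exactly one Base subobject; its final overriders are
// Der1::foo and Der2::bar, so it is not abstract.
class Join : public Der1, public Der2 {};
```

Calling foo() on a Join through a Join*, Der1* or Base* gives the same result, "Der1::foo -> Der2::bar". Some compilers warn about inheritance via dominance here; that is a warning, not an error.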
If you add a Der3 what will happen depends on which class it inherits from.
As you know, instantiating a class is only possible when all virtual functions have been defined; otherwise you can only make pointers to them. This is to prevent constructing partially defined objects.
In your example you cannot instantiate Der1 nor Der2 directly because in Der1, bar() is still pure virtual and in Der2, foo() is pure virtual.
Your Join class can be instantiated because it inherits from both and has therefore no pure virtual function.
Once you have made an instance of a class, you can obtain pointers to its non-instantiable base classes, either by implicit conversion or by a cast such as dynamic_cast.
Once a class has been instantiated, the virtual function mechanism, which works with a table of pointers to functions, will keep calling the functions that were defined for the class that was instantiated.
So the key here is that when you create your object, you create an instance of Join. Its virtual functions are defined because you are able to create the object. From that moment, you can call the virtual functions with any pointer to a base class.
I see why this is interesting to explore. In real code, however, it would probably be of little use. As others pointed out, virtual inheritance is more of a fix-this-bad-design-to-work-somehow tool than a valid design tool.
Your code produces warnings in VS2010: the compiler is letting you know that dominance is being used. Of course that's not a show stopper, but it is another discouragement from using this.
If you introduce Der3 like this
class Der3 : public virtual Base {
public:
void bar() {}
};
class Join : public Der1, public Der2, public Der3 {};
the code fails to compile because of ambiguous inheritance of 'void Base::bar(void)'
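As another answer notes, the ambiguity goes away once Join itself overrides the contested function. A minimal sketch, with string return values added for illustration and the choice of Der3::bar inside Join::bar being an arbitrary assumption:

```cpp
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string foo() = 0;
    virtual std::string bar() = 0;
};

class Der1 : public virtual Base {
public:
    std::string foo() override { return "Der1::foo -> " + bar(); }
};

class Der2 : public virtual Base {
public:
    std::string bar() override { return "Der2::bar"; }
};

class Der3 : public virtual Base {
public:
    std::string bar() override { return "Der3::bar"; }
};

class Join : public Der1, public Der2, public Der3 {
public:
    // Without this override, Der2::bar and Der3::bar would both claim to be
    // the final overrider and the class would not compile.
    std::string bar() override { return "Join::bar picks " + Der3::bar(); }
};
```

With the override in place, foo() called through a Base* yields "Der1::foo -> Join::bar picks Der3::bar".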
One point is missing from the discussion (which is nonetheless quite informative, thanks to all).
When you virtually inherit from a class, most compilers keep a pointer to the virtual base class (different compilers may implement this in different ways). So if you take the size of Der1 or Der2, it will be at least 4 bytes on 32-bit and 8 bytes on 64-bit, because they hold a pointer to the virtual base class; this is why there is no ambiguity. That is why, when you create a Join object, it first calls the constructor of the virtual base class (not really the first call, but in its constructor it first initializes the pointers it received through Der1 and Der2). In Join the compiler can check the pointer name/type and make sure that only one virtual base class pointer reaches it from Der1 and Der2. You can check this with the sizeof operator. As we know, the compiler silently inserts these calls into the constructor, so it calls the virtual base class's constructor first, in depth-first order (this can be checked by making all the base classes virtual). The rest has already been explained.
This is a pretty stupid example imo and a perfect example of academics making themselves look clever. If this situation ever came up, it would almost CERTAINLY be because of a bug, specifically forgetting to make Der1::foo() virtual.
Edit:
I misread the class definitions. Which is exactly the problem with this type of design. It takes a lot of thought to determine exactly what would happen in each of these cases, which is bad. Making your code readable is by far better than being "clever" like this.

virtual function - vtable

Let's say I have class A who inherits from class B and C (multiple inheritance).
How many vtable members class A would have ?
What's the case in single inheritance ?
In addition, suppose:
class A : public B {};
and:
B* test = new A();
Where does test get its vtable from? What is assigned?
I assume it gets B's part of A's vtable, but does A's constructor change its parent's (B's) vtable too?
First, vtables are implementation specific. In fact, nowhere in the standard is it specified that vtables must exist at all.
Anyway, in most usual cases, you would get one vtable pointer per base class with virtual functions. And, as Yuval explained, nobody "fills" the vtables when an object is constructed; you have one vtable per class with virtual functions, and objects just have pointers to their correct vtable (or vtables, in the case of multiple inheritance). In your single-inheritance example, test would have a pointer to A's vtable, assuming that A has at least one virtual function (inherited from B or newly declared in A).
Generally speaking - you need at least one vtable entry for each virtual function you inherit. If you have no virtual functions, you have no vtable.
Generally speaking, a subclass will have a vtable pointer to each of the multiple superclasses it inherits from (assuming, obviously, that each of those classes have at least one virtual function).
I'm not quite sure I understood your second question. When building an object, part of the construction process is setting the relevant vtable pointers. This is done implicitly by the C++ compiler, by static analysis of the inheritance hierarchy. None of the vtables change; they are merely pointed at.
When a class defines virtual functions, the compiler silently inserts a hidden vPtr data member for each supported interface.
The vPtr points to the correct vTable for the object.
The vTable contains a list of addresses which point to function implementations.
Here's an example of sorts.
class Foo: public Bar, public Baz
{
VTable* bar_vPtr;
VTable* baz_vPtr;
// Bar overrides/implementations
void barOverride();
// Baz overrides/implementations
void bazOverride();
};
VTable:
&barOverride() // address of implementation
VTable:
&bazOverride() // address of implementation
When an object is created, the memory is created for the data members and not the methods. The methods will be in a common location to be accessible by all the objects. This [ one per class ] applies to VTable as it merely consists of a function pointer for each virtual function in the class.
As mentioned by Steven, the compiler adds a hidden pointer to the base class, which is set when a class instance is created so that it points to the virtual table for that class. This pointer is also inherited by derived classes.
When the address of a derived class object is assigned to a base class pointer, nothing is replaced: the hidden pointer inside the object was already set to the derived class's vtable by the derived class's constructor, so calls through the base pointer find the derived class's functions.
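The multiple-inheritance case can be sketched in real C++ as follows. The string return values and the helpers viaBar, viaBaz and sideways are assumptions added for the example, and the sizeof check only reflects typical vtable-based implementations, not a guarantee of the standard.

```cpp
#include <string>

struct Bar {
    virtual ~Bar() = default;
    virtual std::string bar() const { return "Bar::bar"; }
};

struct Baz {
    virtual ~Baz() = default;
    virtual std::string baz() const { return "Baz::baz"; }
};

struct Foo : Bar, Baz {
    std::string bar() const override { return "Foo::bar"; }
    std::string baz() const override { return "Foo::baz"; }
};

// Typical implementations store one vptr in the Bar subobject and one in
// the Baz subobject, so Foo needs room for at least two pointers.
constexpr bool hasRoomForTwoVptrs = sizeof(Foo) >= 2 * sizeof(void*);

// Each function only sees one base interface, yet reaches Foo's override.
std::string viaBar(const Bar& b) { return b.bar(); }
std::string viaBaz(const Baz& b) { return b.baz(); }

// Cross-cast between sibling bases; it succeeds only when the object's
// dynamic type actually contains a Baz subobject.
const Baz* sideways(const Bar* b) { return dynamic_cast<const Baz*>(b); }
```

For a Foo object, both viaBar and viaBaz reach the Foo overrides, and sideways returns the address of its Baz subobject; for a plain Bar it returns a null pointer.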

When is a vtable created in C++?

When exactly does the compiler create a virtual function table?
1) when the class contains at least one virtual function.
OR
2) when the immediate base class contains at least one virtual function.
OR
3) when any parent class at any level of the hierarchy contains at least one virtual function.
A related question to this:
Is it possible to give up dynamic dispatch in a C++ hierarchy?
e.g. consider the following example.
#include <iostream>
using namespace std;
class A {
public:
virtual void f();
};
class B: public A {
public:
void f();
};
class C: public B {
public:
void f();
};
Which classes will contain a V-Table?
Since B does not declare f() as virtual, does class C get dynamic polymorphism?
Beyond "vtables are implementation-specific" (which they are), if a vtable is used: there will be unique vtables for each of your classes. Even though B::f and C::f are not declared virtual, because there is a matching signature on a virtual method from a base class (A in your code), B::f and C::f are both implicitly virtual. Because each class has at least one unique virtual method (B::f overrides A::f for B instances and C::f similarly for C instances), you need three vtables.
You generally shouldn't worry about such details. What matters is whether you have virtual dispatch or not. You don't have to use virtual dispatch, by explicitly specifying which function to call, but this is generally only useful when implementing a virtual method (such as to call the base's method). Example:
struct B {
virtual void f() {}
virtual void g() {}
};
struct D : B {
virtual void f() { // would be implicitly virtual even if not declared virtual
B::f();
// do D-specific stuff
}
virtual void g() {}
};
int main() {
{
B b; b.g(); b.B::g(); // both call B::g
}
{
D d;
B& b = d;
b.g(); // calls D::g
b.B::g(); // calls B::g
// b.D::g(); // not allowed: D is not a base of B, so this does not compile
d.D::g(); // calls D::g
void (B::*p)() = &B::g;
(b.*p)(); // calls D::g
// calls through a function pointer always use virtual dispatch
// (if the pointed-to function is virtual)
}
return 0;
}
Some concrete rules that may help; but don't quote me on these, I've likely missed some edge cases:
If a class has virtual methods or virtual bases, even if inherited, then instances must have a vtable pointer.
If a class declares non-inherited virtual methods (such as when it doesn't have a base class), then it must have its own vtable.
If a class has a different set of overriding methods than its first base class, then it must have its own vtable, and cannot reuse the base's. (Destructors commonly require this.)
If a class has multiple base classes, with the second or later base having virtual methods:
If no earlier bases have virtual methods and the Empty Base Optimization was applied to all earlier bases, then treat this base as the first base class.
Otherwise, the class must have its own vtable.
If a class has any virtual base classes, it must have its own vtable.
Remember that a vtable is similar to a static data member of a class, and instances have only pointers to these.
Also see the comprehensive article C++: Under the Hood (March 1994) by Jan Gray. (Try Google if that link dies.)
Example of reusing a vtable:
struct B {
virtual void f();
};
struct D : B {
// does not override B::f
// does not have other virtuals of its own
void g(); // still might have its own non-virtuals
int n; // and data members
};
In particular, notice B's dtor isn't virtual (and this is likely a mistake in real code), but in this example, D instances will point to the same vtable as B instances.
The answer is, 'it depends'. It depends on what you mean by 'contain a vtbl' and it depends on the decisions made by the implementor of the particular compiler.
Strictly speaking, no 'class' ever contains a virtual function table. Some instances of some classes contain pointers to virtual function tables. However, that's just one possible implementation of the semantics.
In the extreme, a compiler could hypothetically put a unique number into the instance that indexed into a data structure used for selecting the appropriate virtual function instance.
If you ask, 'What does GCC do?' or 'What does Visual C++ do?' then you could get a concrete answer.
@Hassan Syed's answer is probably closer to what you were asking about, but it is really important to keep the concepts straight here.
There is behavior (dynamic dispatch based on what class was new'ed) and there's implementation. Your question used implementation terminology, though I suspect you were looking for a behavioral answer.
The behavioral answer is this: any class that declares or inherits a virtual function will exhibit dynamic behavior on calls to that function. Any class that does not, will not.
Implementation-wise, the compiler is allowed to do whatever it wants to accomplish that result.
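The behavioral rule can be checked without referring to vtables at all, using the standard type trait std::is_polymorphic. A minimal sketch; the four struct names and the helper dispatchesDynamically are hypothetical:

```cpp
#include <type_traits>

struct Plain { void f() {} };                 // no virtual functions anywhere
struct PlainChild : Plain {};                 // inherits none either

struct WithVirtual {                          // declares a virtual function
    virtual ~WithVirtual() = default;
    virtual void f() {}
};
struct Inherits : WithVirtual {};             // only inherits one

// True exactly when T declares or inherits at least one virtual function.
template <class T>
constexpr bool dispatchesDynamically() {
    return std::is_polymorphic<T>::value;
}

static_assert(!dispatchesDynamically<Plain>(), "no virtuals declared");
static_assert(!dispatchesDynamically<PlainChild>(), "no virtuals inherited");
static_assert(dispatchesDynamically<WithVirtual>(), "declares a virtual");
static_assert(dispatchesDynamically<Inherits>(), "inherits a virtual");
```

The static_asserts are evaluated at compile time, so the snippet compiling at all confirms the rule for these four types.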
Answer
A vtable is created when a class declaration contains a virtual function. A vtable is introduced when a parent, anywhere in the hierarchy, has a virtual function; let's call this parent Y. Any parent of Y WILL NOT have a vtable (unless it has a virtual function for some other reason in its own hierarchy).
Read on for discussion and tests
-- explanation --
When you specify a member function as virtual, there is a chance that you may try to use sub-classes via a base-class polymorphically at run-time. To maintain c++'s guarantee of performance over language design they offered the lightest possible implementation strategy -- i.e., one level of indirection, and only when a class might be used polymorphically at runtime, and the programmer specifies this by setting at least one function to be virtual.
You do not incur the cost of the vtable if you avoid the virtual keyword.
-- edit : to reflect your edit --
Only when a base class contains a virtual function do any other sub-classes contain a vtable. The parents of said base class do not have a vtable.
In your example all three classes will have a vtable, this is because you can try to use all three classes via an A*.
--test - GCC 4+ --
#include <iostream>
class test_base
{
public:
void x(){std::cout << "test_base" << "\n"; };
};
class test_sub : public test_base
{
public:
virtual void x(){std::cout << "test_sub" << "\n"; } ;
};
class test_subby : public test_sub
{
public:
void x() { std::cout << "test_subby" << "\n"; }
};
int main()
{
test_sub sub;
test_base base;
test_subby subby;
test_sub * psub;
test_base *pbase;
test_subby * psubby;
pbase = &sub;
pbase->x();
psub = &subby;
psub->x();
return 0;
}
output
test_base
test_subby
test_base does not have a virtual table, therefore anything converted to it will use the x() from test_base. test_sub, on the other hand, changes the nature of x(), and calls through its pointer will go indirectly through a vtable; this is shown by test_subby's x() being executed.
So, a vtable is only introduced in the hierarchy when the keyword virtual is used. Older ancestors do not have a vtable, and if a conversion to such an ancestor occurs, calls will be hardwired to the ancestor's functions.
You made an effort to make your question very clear and precise, but there's still a bit of information missing. You probably know that in implementations that use a V-Table, the table itself is normally an independent data structure, stored outside the polymorphic objects, while the objects themselves only store an implicit pointer to the table. So, what is it you are asking about? It could be:
When does an object get an implicit pointer to V-Table inserted into it?
or
When is a dedicated, individual V-Table created for a given type in the hierarchy?
The answer to the first question is: an object gets an implicit pointer to V-Table inserted into it when the object is of polymorphic class type. The class type is polymorphic if it contains at least one virtual function, or any of its direct or indirect parents are polymorphic (this is answer 3 from your set). Note also, that in case of multiple inheritance, an object might (and will) end up containing multiple V-Table pointers embedded into it.
The answer to the second question could be the same as to the first (option 3), with a possible exception. If some polymorphic class in a single-inheritance hierarchy has no virtual functions of its own (no new virtual functions, no overrides of a parent's virtual functions), an implementation might decide not to create an individual V-Table for this class, and instead use its immediate parent's V-Table for it as well (since the two would be identical anyway). I.e. in this case both objects of the parent type and objects of the derived type will store the same value in their embedded V-Table pointers. This is, of course, highly dependent on the implementation. I checked GCC and MS VS 2005 and they don't act that way. They both create an individual V-Table for the derived class in this situation, but I seem to recall hearing about implementations that don't.
The C++ standard doesn't mandate using V-Tables to create the illusion of polymorphic classes. Most of the time, implementations use V-Tables to store the extra information needed. In short, this extra information is added when a class has at least one virtual function.
The behavior is defined in chapter 10.3, paragraph 2 of the C++ language specification:
If a virtual member function vf is declared in a class Base and in a class Derived, derived directly or indirectly from Base, a member function vf with the same name and same parameter list as Base::vf is declared, then Derived::vf is also virtual ( whether or not it is so declared ) and it overrides Base::vf.
The relevant phrase is "whether or not it is so declared". Thus, if your compiler creates v-tables in the usual sense, then all classes will have a v-table, since all their f() methods are virtual.
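The quoted rule can be seen in action with the A, B, C hierarchy from the question. This is a sketch with string return values and two hypothetical helpers added so the result can be checked; B and C deliberately omit the virtual keyword.

```cpp
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string f() const { return "A::f"; }
};

struct B : A {
    // Not marked virtual, but it matches A::f, so it is virtual
    // and overrides A::f.
    std::string f() const { return "B::f"; }
};

struct C : B {
    // Same again: implicitly virtual, overrides B::f.
    std::string f() const { return "C::f"; }
};

std::string callThroughA(const A& a) { return a.f(); }
std::string callThroughB(const B& b) { return b.f(); }
```

A C object passed to either helper yields "C::f", so dynamic dispatch is not given up by leaving out the keyword in B. In new code, writing override on B::f and C::f makes the intent explicit.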