Function resolution from vtable in C++ - c++

I am confused about vtables after reading more about name mangling.
For example:
class Base
{
public:
virtual void print()
{
}
};
class A : public Base
{
public:
void hello()
{
....
}
void print()
{
}
};
A obj;
obj.hello();
Base* test = new A();
test->print();
As per my understanding, after name mangling the obj.hello() call will be converted to something like _ZASDhellov(&obj). Now,
how will the virtual function be invoked through the vtable?
Is my wild guess, test->__vtable[_ZASDprintv](&test(dynamic cast to derived???)), correct?
How are the function names resolved from the vtable?

First, vtables are not in any way part of the C++ language, but rather an implementation detail used by particular compilers. Below I describe one common way they are implemented.
Second, your function hello is not virtual. To make it virtual, you would simply prepend virtual to the declaration.
Assuming it is now virtual: your guess is quite close. In fact, the vtable (to which a pointer is stored in every instance of a class with virtual functions) is an array of function pointers. A particular function is looked up in it by its ordinal, not by its name. The first declared virtual function in A is the first entry in its vtable, the second one is the second entry and so on. If A had a base class, the index of A's first (non-override) virtual function in the table would be n+1, where n is the index of the last virtual function of its base class. If A has more than one base class, their entries precede A's entries in order of their declaration as base classes of A.
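The lookup by ordinal described above can be imitated by hand. This is only a rough sketch of what a compiler might generate, not what any particular compiler actually emits; Object, Slot, PRINT_SLOT, call_print and the *_vtable arrays are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Hand-written stand-in for compiler-generated machinery; every name in
// this sketch (Object, Slot, call_print, ...) is hypothetical.
struct Object;
using Slot = std::string (*)(Object*);   // type of one vtable entry

struct Object {
    const Slot* vptr;                    // hidden pointer to the class's table
};

// The "virtual functions", written as free functions taking an explicit this.
std::string Base_print(Object*) { return "Base::print"; }
std::string A_print(Object*)    { return "A::print"; }

// One static table per class. print() is the first virtual function
// declared, so it occupies slot 0 in both tables.
enum { PRINT_SLOT = 0 };
const Slot Base_vtable[] = { &Base_print };
const Slot A_vtable[]    = { &A_print };

// Roughly what test->print() turns into: index by ordinal, pass this.
std::string call_print(Object* obj) {
    assert(obj != nullptr && obj->vptr != nullptr);
    return obj->vptr[PRINT_SLOT](obj);
}
```

An Object whose vptr is A_vtable makes call_print reach A_print, and one whose vptr is Base_vtable reaches Base_print. No mangled name is involved at the call site, only the slot number.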
If A uses virtual inheritance, the picture is a bit more complicated than that; I won't elaborate unless you're specifically interested.
UPDATE: I'll add a very brief description for the virtual inheritance case as requested. If A had Base as a virtual base class, A's vtable would store at the very beginning (before the function addresses) the byte offset of where Base's data starts within the A object. This is necessary because, unlike in normal inheritance, a base class does not have its data precede the derived class's data - instead it follows it. So in effect, any function call to a virtual function defined in Base has to have its this pointer offset by that amount. Additionally, Base would have to have its own vtable pointer, right at the beginning of its data where it expects to find it. Thus the full A object would contain two vtable pointers instead of one. The actual vtable pointed to by this second pointer would be the same one as the first vtable pointer, except advanced to skip the offset entry described above (so that any Base code using the vtable would find the first virtual function at the beginning where it is expected). Apart from these differences, the vtable itself is the same as before.

Related

Clarification Needed on C++ Virtual Call Implementation

I have some doubts regarding virtual functions, or rather run-time polymorphism. My understanding of how it works is as follows:
A Virtual Table (V-Table) will be created for every class that has at least one virtual member function. I believe this is a static table, so it is created once per class and not once per object. Please correct me if I am wrong here.
This V-Table has the address of the virtual function. If the class has 4 virtual functions, then this table has 4 entries pointing to the corresponding 4 functions.
Compiler will add a virtual pointer (V-Ptr) as a hidden member of the class. This virtual pointer will point to the starting address in the virtual table.
Assume I have program like this,
class Base
{
public:
    virtual void F1();
    virtual void F2();
    virtual void F3();
    virtual void F4();
};
class Der1 : public Base // Overrides only the first 2 functions of Base
{
public:
    void F1(); // Overrides Base::F1()
    void F2(); // Overrides Base::F2()
};
class Der2 : public Base // Overrides the remaining functions of Base
{
public:
    void F3(); // Overrides Base::F3()
    void F4(); // Overrides Base::F4()
};
int main()
{
    Base* p1 = new Der1; // I believe the vtable is populated at compile time
    Base* p2 = new Der2;
    p1->F1(); // how does it call Der1::F1()?
    p2->F3(); // how does it call Der2::F3()?
}
If the V-Table gets populated at compile time, why do we call it run-time polymorphism? Please explain how many vtables and vptrs there are and how it works, using the above example. As I understand it, there will be 3 vtables, for the Base, Der1 and Der2 classes. Der1's vtable has the addresses of its own F1() and F2(), whereas for F3() and F4() the addresses point to the Base class versions. Also, 3 vptrs will be added as hidden members in the Base, Der1 and Der2 classes. If everything is decided at compile time, what exactly happens at run time? Please correct me if I am wrong in the concept.
It's obviously implementation defined, but most implementations
are fairly similar, more or less along the lines you describe.
On your first point (one static table per class, not per object): this is correct.
On the second: vtables contain more than just pointers to functions.
There's usually an entry pointing to the RTTI information, and
often some information concerning how to fix up the this pointer
when calling the function (although this can also be done using
trampolines). In the case of virtual bases, there could also be
an offset to the virtual base.
On the third (the hidden vptr): this is also correct. Note that during construction and
destruction, the compiler will change the vptr as the dynamic
type of the object changes, and that in the case of multiple
inheritance (with or without virtual bases), there will be more
than one vptr. (The vptr is at a fixed offset with
respect to the base address of the class, and in the case of
multiple inheritance, not all classes can have the same base
address.)
As to your final remarks: the vtables are populated at compile
time, and are static. But the vptr's are set at runtime,
according to the dynamic type, and the function call uses it to
find the vtable and dispatch the call.
In your (very simple) example, there are three vtables, one for
each class. Because only simple inheritance is involved, there
is only one vptr per instance, shared between Base and the
derived class. The vtable for Base will contain four slots,
pointing to Base::F1, Base::F2, Base::F3 and Base::F4.
The vtable for Der1 will also contain four slots, pointing to
Der1::F1, Der1::F2, Base::F3 and Base::F4. The vtable
for Der2 will point to Base::F1, Base::F2, Der2::F3 and
Der2::F4. The constructor for Base will set the vptr to the
table of Base; the constructor for the derived classes will
first call the constructor for the base class, then set the vptr
to the vtable corresponding to its type. (In practice, in such
simple cases, the compiler is probably capable of determining
that the vptr is never used in the constructor to Base, and so
skip setting it. In more complicated cases, where the compiler
cannot see all of the behavior of the base class constructor,
however, this is not the case.)
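The effect of the vptr being switched during construction is observable in standard C++, whatever mechanism the compiler uses. Here is a small sketch, assuming a trimmed-down Base and Der1 with a single virtual function name(); the member seen_in_ctor and the helper observe() are made up for the illustration.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;
    // While this constructor runs, the object's dynamic type is Base, so the
    // virtual call below reaches Base::name even when a Der1 is being built.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Der1 : Base {
    std::string name() const override { return "Der1"; }
};

// Hypothetical helper: returns what the Base constructor observed, followed
// by what a virtual call reports once construction has finished.
std::string observe() {
    Der1 d;
    Base& b = d;
    assert(b.name() == "Der1");   // after construction, dispatch reaches Der1
    return d.seen_in_ctor + "," + b.name();
}
```

observe() returns "Base,Der1": the same object answers differently before and after the derived constructor has taken over, which is what one would expect if the vptr is first set to Base's table and then to Der1's.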
As to why it is called runtime polymorphism, consider
a function:
void f(Base* p)
{
    p->F1();
}
The function actually called will be different, depending on
whether p points to a Der1 or a Der2. In other words, it
will be determined at runtime.
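To make that concrete, here is a minimal sketch in which the concrete type is chosen from a run-time value. The function bodies returning strings and the factory make() are assumptions added for the illustration; the question only declares F1 to F4.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string F1() { return "Base::F1"; }
    virtual std::string F2() { return "Base::F2"; }
    virtual std::string F3() { return "Base::F3"; }
    virtual std::string F4() { return "Base::F4"; }
};
struct Der1 : Base {
    std::string F1() override { return "Der1::F1"; }
    std::string F2() override { return "Der1::F2"; }
};
struct Der2 : Base {
    std::string F3() override { return "Der2::F3"; }
    std::string F4() override { return "Der2::F4"; }
};

// Compiled once, with no knowledge of which derived class p points to.
std::string f(Base* p) {
    assert(p != nullptr);
    return p->F1() + " " + p->F3();
}

// Hypothetical factory: the concrete type depends on a run-time value.
std::unique_ptr<Base> make(bool want_der1) {
    if (want_der1) return std::unique_ptr<Base>(new Der1);
    return std::unique_ptr<Base>(new Der2);
}
```

f(make(true).get()) yields "Der1::F1 Base::F3" and f(make(false).get()) yields "Base::F1 Der2::F3", even though f itself is a single piece of compiled code. The tables are fixed at compile time; which table a given object points to is only known when the program runs.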
The C++ standard doesn't specify how virtual function calls have to be implemented, but here's a simplified example of the approach that most compilers use.
From a high-level perspective, the v-tables would look like this:
Base:
Index | Function Address
------|------------------
0 | Base::F1
1 | Base::F2
2 | Base::F3
3 | Base::F4
Der1:
Index | Function Address
------|------------------
0 | Der1::F1
1 | Der1::F2
2 | Base::F3
3 | Base::F4
Der2:
Index | Function Address
------|------------------
0 | Base::F1
1 | Base::F2
2 | Der2::F3
3 | Der2::F4
When you create p1 and p2, they get a pointer that points to Der1's vtable and Der2's vtable, respectively.
The call to p1->F1 basically means "call function 0 on p1's virtual table".
vptr[0] is Der1::F1, so it gets called.
It's called run-time polymorphism because the function that will be called for a specific object is determined at run-time (by making a look-up in the object's vtable).
It's implementation defined. When programming in C++, the only thing that should concern you is that if you declare a method virtual, the run-time contents of the object behind the pointer or reference will decide what code will be called.
Perhaps you should read about that topic first. Here is the C++ specific stuff.
I'm not going to go through four virtual functions and three derived types. Suffice it to say: for the ultimate base class, the vtable has pointers that point to the base class' version of all the virtual functions. For derived classes, the vtable has pointers to all of the derived class's virtual functions; when the derived class overrides a base class function, the function pointer for that function points to the derived class' version of that virtual function; when the derived class inherits a virtual function, the function pointer points to the inherited function.

Virtual multiple inheritance - final overrider

While trying to analyse the inheritance mechanism of C++ in greater depth, I stumbled upon the following example:
#include<iostream>
using namespace std;
class Base {
public:
virtual void f(){
cout << "Base.f" << endl;
}
};
class Left : public virtual Base {
};
class Right : public virtual Base{
public:
virtual void f(){
cout << "Right.f" << endl;
}
};
class Bottom : public Left, public Right{
};
int main(int argc,char **argv)
{
Bottom* b = new Bottom();
b->f();
}
The above, somehow, compiles and calls Right::f(). I see what might be going on in the compiler: it understands that there is one shared Base object, and that Right overrides f(). But in my understanding there should be two methods: Left::f() (inherited from Base::f()) and Right::f(), which overrides Base::f(). Since Bottom then inherits two separate methods with the same signature, I would expect a clash.
Could anyone explain which specification detail of C++ deals with this case and how it does it from the low-level perspective?
In the dreaded diamond there is a single base, from which the two intermediate objects derive and then the fourth type closes the diamond with multiple inheritance from both types in the intermediate levels.
Your question seems to be: how many f functions are declared in the previous example? The answer is one.
Let's start with the simpler example of a linear hierarchy of just base and derived:
struct base {
virtual void f() {}
};
struct derived : base {
virtual void f() {}
};
In this example there is a single f declared for which there are two overrides, base::f and derived::f. In an object of type derived, the final overrider is derived::f. It is important to note that both f functions represent a single function that has multiple implementations.
Now, going back to the original example: on the right-hand branch, Base::f and Right::f are likewise the same function, overridden. So for an object of type Right, the final overrider is Right::f. For a complete object of type Left, the final overrider is Base::f, as Left does not override the function.
When the diamond is closed, and because inheritance is virtual there is a single Base object, that declares a single f function. In the second level of inheritance, Right overrides that function with its own implementation and that is the final overrider for the most derived type Bottom.
You might want to look at this outside of the standard and take a look at how this is actually implemented by compilers. When the compiler lays out the Base object, it adds a hidden pointer, vptr, to the virtual table. The virtual table holds pointers to thunks (for simplicity, just assume that the table holds pointers to the function's final overriders, [1]). In this case, the Base object will contain no member data and just a pointer to a table that holds a pointer to the function Base::f.
When Left extends Base, a new vtable is created for Left and the pointer in that vtable is set to the final overrider of f at this level, which is incidentally Base::f, so the pointers in both vtables (ignoring the trampoline) jump to the same actual implementation. When an object of type Left is being constructed, the Base subobject is initialized first, and then, prior to initialization of the members of Left (if there were any), the Base::vptr pointer is updated to refer to Left::vtable (i.e. the pointer stored in Base refers to the table defined for Left).
On the other side of the diamond, the vtable that is created for Right contains a single thunk that ends up calling Right::f. If an object of type Right were to be created, the same initialization process would happen and the Base::vptr would refer to Right's vtable, whose entry leads to Right::f.
Now we get to the final object Bottom. Again, a vtable is generated for the type Bottom and that vtable, as is the case in all others, contains a single entry that represents f. The compiler analyzes the hierarchy of inheritance and determines that Right::f overrides Base::f, and there is no equivalent override on the left branch, so in Bottom's vtable the pointer representing f refers to Right::f. Again, during construction of the Bottom object, the Base::vptr is updated to refer to Bottom's vtable.
As you see, all four vtables have a single entry for f, there is a single f in the program, even if the value stored in each vtable is different (the final overriders differ).
[1] The thunk is a small piece of code that adapts the this pointer if needed (multiple inheritance usually implies it is needed) and then forwards the call to the actual override. In the event of single inheritance, the this pointer does not need to be updated and the thunk disappears, with the entry in the vtable pointing directly to the actual function.
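The final-overrider rule can be checked without looking at any vtable. The following is a small sketch of the same diamond, with f() returning a string instead of printing so the result can be inspected; via_base, via_left and demo are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string f() { return "Base.f"; }
};
struct Left : virtual Base {};
struct Right : virtual Base {
    std::string f() override { return "Right.f"; }
};
struct Bottom : Left, Right {};

// Calls made through different static types of the same object.
std::string via_base(Base& b) { return b.f(); }
std::string via_left(Left& l) { return l.f(); }

std::string demo() {
    Bottom bottom;
    Left left;
    assert(bottom.f() == "Right.f");   // unambiguous: one f, one final overrider
    return via_base(bottom) + "," + via_left(bottom) + "," + via_left(left);
}
```

demo() returns "Right.f,Right.f,Base.f". Even when a Bottom is viewed as a Left, which never mentions f, the call lands in Right::f, because there is only one f in the shared Base subobject and Right::f is its final overrider in Bottom. A standalone Left still uses Base::f.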

Resolving of vptr

class base {
public:
virtual void fn(){}
};
class der : public base {};
I know that the compiler provides a member called VPTR in the class, which is initialised with the exact VTABLE at run time by the constructor. I have 2 questions:
1) Which class holds the VPTR? Or does every class have a separate VPTR?
2) When executing the statement der d; how is the VPTR resolved at run time?
A vtable is created for the class that contains a virtual function and for the classes derived from it. This means that in your program a vtable will be created for the base class and for the der class. Each of these vtables contains the address of the virtual function void fn(). Note that the der class doesn't define void fn(), hence its vtable contains the address of the base class's void fn(). Thus, if you make a call like d.fn(); the void fn() function of the base class gets executed.
Note: a virtual table and a virtual pointer are implementation details, though all the C++ compilers I know use them, they are not mandated by the Standard, only the results are.
To answer your specific question: each instance of a class with virtual methods (either its own, or inherited ones) or a class with (somewhere) a virtual inheritance relationship will need at least one virtual-pointer.
There can be several (when virtual inheritance or multi-inheritance are involved).
In your example, a single virtual pointer is sufficient. However, it does not make sense to speak of it as being part of a class. The virtual pointer is part of the instance (object), and lives outside the rules for classes because those apply to the language, and the virtual pointer is an implementation mechanism.
1) Which class holds the VPTR? Or does every class have a separate VPTR?
Every object has its own vptr if its class is polymorphic (i.e. declares or inherits a virtual function, or has virtual inheritance). In this case both classes have a virtual function.
2) When executing the statement der d; how is the VPTR resolved at run time?
That statement only defines an object of type der. Even if you then call a function on d directly, the call is resolved at compile time. Virtual function resolution comes into the picture only when the function is called through a pointer or reference.
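The vptr itself cannot be inspected portably, but its effects can. As a hedged sketch: the standard trait std::is_polymorphic reports whether a class declares or inherits a virtual function, and typeid applied to a reference reports the dynamic type, which typical implementations look up through the vptr. The helpers is_really_der and check are made up for this example.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

class base {
public:
    virtual void fn() {}
};
class der : public base {};

// Both classes are polymorphic, so on typical implementations every object
// of either class carries a vptr.
static_assert(std::is_polymorphic<base>::value, "base declares a virtual function");
static_assert(std::is_polymorphic<der>::value, "der inherits one");

// typeid on a reference to a polymorphic type reports the dynamic type.
bool is_really_der(const base& b) {
    return typeid(b) == typeid(der);
}

// A freshly defined der object already carries its dynamic type: whatever
// the constructor stored in the object identifies it as a der.
void check() {
    der d;
    base plain;
    assert(is_really_der(d));
    assert(!is_really_der(plain));
}
```

Nothing is "resolved" later for der d; by the time the definition has finished executing, the object already identifies itself as a der when viewed through a base reference.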

virtual function - vtable

Let's say I have a class A that inherits from classes B and C (multiple inheritance).
How many vtable members would class A have?
What's the case in single inheritance?
In addition, suppose:
class A : public B {};
and:
B* test = new A();
Where does test get its vtable from? What is assigned?
I assume it gets B's part of A's vtable, but does A's constructor change its parent's (B's) vtable too?
First, vtables are implementation specific. In fact, nowhere does the standard specify that vtables must exist at all.
Anyway, in most usual cases, you would get one vtable pointer per base class with virtual functions. And, as Yuval explained, nobody "fills" the vtables when an object is constructed; you have one vtable per class with virtual functions, and objects just have pointers to their correct vtable (or vtables, in case of multiple inheritance). In your single-inheritance example, the object that test points to would have a pointer to A's vtable, assuming that A has at least one virtual function (inherited from B or newly declared in A).
Generally speaking - you need at least one vtable entry for each virtual function you inherit. If you have no virtual functions, you have no vtable.
Generally speaking, a subclass will have a vtable pointer for each of the multiple superclasses it inherits from (assuming, obviously, that each of those classes has at least one virtual function).
I'm not quite sure I understood your second question. When building an object, part of the construction process is setting the relevant vtable pointers; this is done implicitly by the C++ compiler through static analysis of the inheritance hierarchy. None of the vtables change, they are merely pointed at.
When a class defines virtual functions, the compiler silently inserts a hidden vPtr data member for each supported interface.
The vPtr points to the correct vTable for the object.
The vTable contains a list of addresses which point to function implementations.
Here's an example of sorts.
class Foo: public Bar, public Baz
{
VTable* bar_vPtr;
VTable* baz_vPtr;
// Bar overrides/implementations
void barOverride();
// Baz overrides/implementations
void bazOverride();
};
VTable:
&barOverride() // address of implementation
VTable:
&bazOverride() // address of implementation
When an object is created, memory is allocated for the data members and not the methods. The methods live in a common location accessible to all the objects. The same one-per-class rule applies to the vtable, as it merely consists of a function pointer for each virtual function in the class.
As mentioned by Steven, the compiler adds a hidden pointer to the base class, which is set when a class instance is created so that it points to the virtual table for that class. This pointer is also inherited by derived classes.
When an object of the derived class is constructed, the hidden pointer in its base class part is set to the address of the derived class's vtable. Assigning the object's address to a base class pointer afterwards does not change it.
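With multiple inheritance, the two base subobjects sit at different places inside the A object, which is why implementations typically need one vtable pointer per base. The sketch below makes that visible; the virtual functions b() and c(), the data members, and the helpers whole_object and subobjects_differ are assumptions added for the illustration. That the two addresses differ is what common ABIs do for non-empty bases, not something the question's code states.

```cpp
#include <cassert>

struct B {
    virtual ~B() = default;
    virtual int b() { return 1; }
    int b_data = 0;
};
struct C {
    virtual ~C() = default;
    virtual int c() { return 2; }
    int c_data = 0;
};
struct A : B, C {
    int b() override { return 10; }
    int c() override { return 20; }
};

// dynamic_cast<const void*> yields the address of the most derived object.
const void* whole_object(const B* p) { return dynamic_cast<const void*>(p); }
const void* whole_object(const C* p) { return dynamic_cast<const void*>(p); }

// Converting A* to B* and to C* gives pointers to two distinct subobjects.
bool subobjects_differ(A& a) {
    B* pb = &a;
    C* pc = &a;
    assert(pb->b() == 10 && pc->c() == 20);   // both views dispatch to A
    return static_cast<void*>(pb) != static_cast<void*>(pc);
}
```

For an A object, subobjects_differ returns true, while whole_object gives the same address for both base pointers. Neither conversion modifies the object or any vtable; each base pointer simply refers to the part of A that has the layout its static type expects.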

When is a vtable created in C++?

When exactly does the compiler create a virtual function table?
1) when the class contains at least one virtual function.
OR
2) when the immediate base class contains at least one virtual function.
OR
3) when any parent class at any level of the hierarchy contains at least one virtual function.
A related question to this:
Is it possible to give up dynamic dispatch in a C++ hierarchy?
e.g. consider the following example.
#include <iostream>
using namespace std;
class A {
public:
virtual void f();
};
class B: public A {
public:
void f();
};
class C: public B {
public:
void f();
};
Which classes will contain a V-Table?
Since B does not declare f() as virtual, does class C get dynamic polymorphism?
Beyond "vtables are implementation-specific" (which they are), if a vtable is used: there will be unique vtables for each of your classes. Even though B::f and C::f are not declared virtual, because there is a matching signature on a virtual method from a base class (A in your code), B::f and C::f are both implicitly virtual. Because each class has at least one unique virtual method (B::f overrides A::f for B instances and C::f similarly for C instances), you need three vtables.
You generally shouldn't worry about such details. What matters is whether you have virtual dispatch or not. You don't have to use virtual dispatch, by explicitly specifying which function to call, but this is generally only useful when implementing a virtual method (such as to call the base's method). Example:
struct B {
virtual void f() {}
virtual void g() {}
};
struct D : B {
virtual void f() { // would be implicitly virtual even if not declared virtual
B::f();
// do D-specific stuff
}
virtual void g() {}
};
int main() {
{
B b; b.g(); b.B::g(); // both call B::g
}
{
D d;
B& b = d;
b.g(); // calls D::g
b.B::g(); // calls B::g
// b.D::g(); // not allowed: D is not a base class of B, so this does not compile
d.D::g(); // calls D::g
void (B::*p)() = &B::g;
(b.*p)(); // calls D::g
// calls through a pointer to member function always use virtual dispatch
// (if the pointed-to function is virtual)
}
return 0;
}
Some concrete rules that may help; but don't quote me on these, I've likely missed some edge cases:
If a class has virtual methods or virtual bases, even if inherited, then instances must have a vtable pointer.
If a class declares non-inherited virtual methods (such as when it doesn't have a base class), then it must have its own vtable.
If a class has a different set of overriding methods than its first base class, then it must have its own vtable, and cannot reuse the base's. (Destructors commonly require this.)
If a class has multiple base classes, with the second or later base having virtual methods:
If no earlier bases have virtual methods and the Empty Base Optimization was applied to all earlier bases, then treat this base as the first base class.
Otherwise, the class must have its own vtable.
If a class has any virtual base classes, it must have its own vtable.
Remember that a vtable is similar to a static data member of a class, and instances have only pointers to these.
Also see the comprehensive article C++: Under the Hood (March 1994) by Jan Gray. (Try Google if that link dies.)
Example of reusing a vtable:
struct B {
virtual void f();
};
struct D : B {
// does not override B::f
// does not have other virtuals of its own
void g(); // still might have its own non-virtuals
int n; // and data members
};
In particular, notice B's dtor isn't virtual (and this is likely a mistake in real code), but in this example, D instances will point to the same vtable as B instances.
The answer is, 'it depends'. It depends on what you mean by 'contain a vtbl' and it depends on the decisions made by the implementor of the particular compiler.
Strictly speaking, no 'class' ever contains a virtual function table. Some instances of some classes contain pointers to virtual function tables. However, that's just one possible implementation of the semantics.
In the extreme, a compiler could hypothetically put a unique number into the instance that indexed into a data structure used for selecting the appropriate virtual function instance.
If you ask, 'What does GCC do?' or 'What does Visual C++ do?' then you could get a concrete answer.
@Hassan Syed's answer is probably closer to what you were asking about, but it is really important to keep the concepts straight here.
There is behavior (dynamic dispatch based on what class was new'ed) and there's implementation. Your question used implementation terminology, though I suspect you were looking for a behavioral answer.
The behavioral answer is this: any class that declares or inherits a virtual function will exhibit dynamic behavior on calls to that function. Any class that does not, will not.
Implementation-wise, the compiler is allowed to do whatever it wants to accomplish that result.
Answer
A vtable is created when a class declaration contains a virtual function. A vtable is introduced when a parent, anywhere in the hierarchy, has a virtual function; let's call this parent Y. Any parent of Y WILL NOT have a vtable (unless it has a virtual function for some other reason in its own hierarchy).
Read on for discussion and tests
-- explanation --
When you specify a member function as virtual, there is a chance that you may try to use sub-classes via a base class polymorphically at run time. To maintain C++'s guarantee of performance over language design, they offered the lightest possible implementation strategy, i.e. one level of indirection, and only when a class might be used polymorphically at run time; the programmer specifies this by declaring at least one function virtual.
You do not incur the cost of the vtable if you avoid the virtual keyword.
-- edit : to reflect your edit --
Only when a base class contains a virtual function do any other sub-classes contain a vtable. The parents of said base class do not have a vtable.
In your example all three classes will have a vtable, this is because you can try to use all three classes via an A*.
--test - GCC 4+ --
#include <iostream>
class test_base
{
public:
void x(){std::cout << "test_base" << "\n"; };
};
class test_sub : public test_base
{
public:
virtual void x(){std::cout << "test_sub" << "\n"; } ;
};
class test_subby : public test_sub
{
public:
void x() { std::cout << "test_subby" << "\n"; }
};
int main()
{
test_sub sub;
test_base base;
test_subby subby;
test_sub * psub;
test_base *pbase;
test_subby * psubby;
pbase = &sub;
pbase->x();
psub = &subby;
psub->x();
return 0;
}
output
test_base
test_subby
test_base does not have a virtual table, therefore anything cast to it will use the x() from test_base. test_sub, on the other hand, changes the nature of x(), and calls through a pointer to it go through a vtable; this is shown by test_subby's x() being executed.
So, a vtable is only introduced in the hierarchy when the keyword virtual is used. Older ancestors do not have a vtable, and after a conversion to such an ancestor type (an upcast), calls are hardwired to the ancestor's functions.
You made an effort to make your question very clear and precise, but there's still a bit of information missing. You probably know that in implementations that use a V-Table, the table itself is normally an independent data structure, stored outside the polymorphic objects, while the objects themselves only store an implicit pointer to the table. So, what is it you are asking about? It could be:
When does an object get an implicit pointer to V-Table inserted into it?
or
When is a dedicated, individual V-Table created for a given type in the hierarchy?
The answer to the first question is: an object gets an implicit pointer to V-Table inserted into it when the object is of polymorphic class type. The class type is polymorphic if it contains at least one virtual function, or any of its direct or indirect parents are polymorphic (this is answer 3 from your set). Note also, that in case of multiple inheritance, an object might (and will) end up containing multiple V-Table pointers embedded into it.
The answer to the second question could be the same as to the first (option 3), with a possible exception. If some polymorphic class in a single-inheritance hierarchy has no virtual functions of its own (no new virtual functions, no overrides of parent virtual functions), the implementation might decide not to create an individual V-Table for this class, but instead use its immediate parent's V-Table for this class as well (since it is going to be the same anyway). I.e. in this case both objects of the parent type and objects of the derived type will store the same value in their embedded V-Table pointers. This is, of course, highly dependent on the implementation. I checked GCC and MS VS 2005 and they don't act that way. They both create an individual V-Table for the derived class in this situation, but I seem to recall hearing about implementations that don't.
The C++ standard doesn't mandate using V-Tables to create the illusion of polymorphic classes. Most of the time implementations use V-Tables to store the extra information needed. In short, these extra pieces of information are added when you have at least one virtual function.
The behavior is defined in chapter 10.3, paragraph 2 of the C++ language specification:
If a virtual member function vf is declared in a class Base and in a class Derived, derived directly or indirectly from Base, a member function vf with the same name and same parameter list as Base::vf is declared, then Derived::vf is also virtual (whether or not it is so declared) and it overrides Base::vf.
I italicized the relevant phrase ("whether or not it is so declared"). Thus, if your compiler creates v-tables in the usual sense, then all classes will have a v-table since all their f() methods are virtual.
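The rule quoted above can be observed directly. The following is a minimal sketch of the A, B, C hierarchy from the question, with f() returning a string so the chosen override is visible; through_A, through_B and demo are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

struct A {
    virtual ~A() = default;
    virtual std::string f() { return "A::f"; }
};
struct B : A {
    std::string f() { return "B::f"; }   // no virtual keyword, still virtual
};
struct C : B {
    std::string f() { return "C::f"; }   // likewise
};

// Calls made through references to the two base types.
std::string through_A(A& a) { return a.f(); }
std::string through_B(B& b) { return b.f(); }

std::string demo() {
    B b;
    C c;
    assert(through_B(c) == "C::f");      // B did not "give up" dynamic dispatch
    return through_A(b) + "," + through_A(c) + "," + through_B(c);
}
```

demo() returns "B::f,C::f,C::f". Omitting virtual in B changes nothing: a C object viewed as a B still dispatches to C::f, so dynamic dispatch cannot be switched off further down the hierarchy this way.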