How does the vtable of virtual functions work - C++

I have a small question about the virtual table. Whenever the compiler encounters virtual functions in a class, it creates a vtable and places the addresses of the virtual functions there. The same happens for any class that inherits from it. Does it create a new pointer in each class that points to each vtable? If not, how does it reach the virtual function when a new instance of the derived class is created and assigned to a base pointer?

Each time you create a class that contains virtual functions, or you
derive from a class that contains virtual functions, the compiler
creates a unique VTABLE for that class.
If you
don’t override a function that was declared virtual in the base class,
the compiler uses the address of the base-class version in the
derived class.
Then it places the VPTR into
the class. There is only one VPTR for each object when using simple
inheritance. The VPTR must be initialized to point to the
starting address of the appropriate VTABLE. (This happens in the
constructor.)
Once the VPTR is initialized to the proper VTABLE, the object in
effect “knows” what type it is. But this self-knowledge is worthless
unless it is used at the point a virtual function is called.
When you call a virtual function through a base class address (the
situation when the compiler doesn’t have all the information
necessary to perform early binding), something special happens.
Instead of performing a typical function call, which is simply an
assembly-language CALL to a particular address, the compiler
generates different code to perform the function call.

For each class with virtual functions, a vtable is created. Then, when an object of a class with a vtable is created using a constructor, the constructor stores the address of the appropriate vtable in the object. So each object has a pointer to its vtable (or, in the case of multiple inheritance, when necessary, a pointer to each of its vtables). The compiler knows where in the object the vtable pointer is, so when it needs to call a virtual method, it emits code to dereference the vtable pointer, look up the appropriate method, and jump to its address.
In the simple case of single inheritance, a child class starts with a copy of the parent class's vtable and then gets an overridden entry for each virtual method in the child class that overrides a parent class's method (and it also gets a new entry for every virtual function in the child class that does not override a parent class method).
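
The mechanism described above can be imitated by hand. This is only a rough sketch, assuming a single-inheritance layout similar to what common compilers produce; the names AnimalVTable, Animal, Dog, call_speak and call_legs are invented for illustration, and a real compiler follows its ABI rather than this exact shape.

```cpp
#include <cassert>
#include <string>

// Hypothetical hand-rolled equivalent of compiler-generated dispatch.
struct Animal;

// One "vtable" per class: a record of function pointers.
struct AnimalVTable {
    std::string (*speak)(const Animal*);
    int (*legs)(const Animal*);
};

// Every object carries one hidden pointer to its class's table.
struct Animal {
    const AnimalVTable* vptr;
};

std::string animal_speak(const Animal*) { return "generic noise"; }
int animal_legs(const Animal*) { return 4; }

// Table for the base class.
const AnimalVTable animal_vtable = { animal_speak, animal_legs };

// "Derived" class: the base part sits at offset zero.
struct Dog {
    Animal base;
};

std::string dog_speak(const Animal*) { return "Woof"; }

// The derived table starts as a copy of the base table; only the
// overridden slot changes, the inherited slot keeps the base address.
const AnimalVTable dog_vtable = { dog_speak, animal_legs };

// The "constructors" store the address of the right table in the object.
Animal make_animal() { return Animal{ &animal_vtable }; }
Dog make_dog() { return Dog{ Animal{ &dog_vtable } }; }

// A "virtual call": fetch the vptr, fetch the slot, call through it.
std::string call_speak(const Animal* a) { return a->vptr->speak(a); }
int call_legs(const Animal* a) { return a->vptr->legs(a); }
```

Calling call_speak with the address of a Dog's base part reaches dog_speak even though the caller only sees an Animal pointer, while call_legs still lands in the inherited animal_legs.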

When the program is compiled, a virtual table is created for each class with virtual functions, so vtables exist on a per-class basis.
At run time, when an object is created, its constructor sets the object's vptr to point to the virtual table of the object's class.
In short, the vptr exists on a per-object basis.
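
The per-object cost can be observed with sizeof. The sketch below assumes a typical vtable-based implementation (the standard does not promise any of these sizes), and the struct names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

struct Plain {                  // no virtual functions
    int value;
};

struct Poly {                   // one virtual function
    int value;
    virtual ~Poly() = default;
};

struct PolyMany {               // several virtual functions
    int value;
    virtual ~PolyMany() = default;
    virtual void a() {}
    virtual void b() {}
    virtual void c() {}
};

struct DerivedPoly : Poly {     // single inheritance, adds a virtual function
    virtual void extra() {}
};

// Extra bytes a polymorphic object carries compared with the plain struct.
// The exact number depends on pointer size and padding.
constexpr std::size_t overhead() { return sizeof(Poly) - sizeof(Plain); }
```

On common compilers sizeof(PolyMany) and sizeof(DerivedPoly) both equal sizeof(Poly): more virtual functions make the per-class table longer, but each object still holds a single pointer.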

Related

Can derived classes have more than one pointer to a virtual table?

I am watching the BackToBasics talk: Virtual Dispatch and Its Alternatives
from CppCon2019. The presenter says and the slide shows (assuming I haven't misunderstood) that a derived class inherits a vtable pointer from the base class and additionally has its own vptr.
Of course, technically this isn't mandated by the standard but I am getting myself a bit confused and my experiments with sizeof() also seem to imply there should only need to be one pointer. Please can someone clarify if there are any situations where multiple vptrs are needed?
Thanks
P.S. Just to be clear, in this context we are considering the more common public inheritance and not virtual or multiple inheritance (the presenter explicitly mentions this in an earlier part of the talk).
A vtable contains the address of each virtual function for the class at a known offset.
[Remark: In practice, unlike a regular class, vtables have members at negative offsets, much like a pointer into the middle of an array. That is just a convention that doesn't change implementation freedom much. Anyway, the only issue is that the placement of information in a vtable is dictated by a convention (the ABI), and compilers that follow the same one produce compatible code for polymorphic classes.]
What happens when you have additional functions in a derived class? (not just the functions "inherited" from the base class)
Once you accept the idea that a pointer to a structure both points to the whole object and to its first member, you have the idea that a pointer to derived class points to a base class that is appropriately located at offset zero. So you can have the exact same pointer value, as represented as a void*, that can be used alternatively for a derived object or a base under this convention for single inheritance.
Now you can apply that to any data structure and even to a vtable which is really not a table (array of elements of the same type, or values that can be interpreted in the same way) but a record (of objects of unrelated type or meaning); you can see that a vtable for such derived class can just be derived from the vtable of its unique base in the exact same way.
(Note that if you compile C++ to C, you might run into type aliasing rules when you do such things. Of course assembly has no such problem, nor naively compiled "high level assembler" C.)
So for single inheritance the base is integrated and optimized into the derived class:
for data members of the instance (of a class type)
and for the virtual functions members, that is the data members of the vtable (or members of the meta class if you imagine one).
Note that placing the base at offset zero allows you to place vtable base at zero offset, which in turn allows you to use the same vptr but does not imply it; conversely sharing the vptr with a base implies that the base vtable is at offset zero (vtable layout = meta class level) so the base must be at offset zero (data members layout = class level).
And multiple inheritance is actually single inheritance plus, as one class is always treated as privileged: it is placed at offset zero so the pointers are the same, so the vtable can be placed at offset zero (because the pointers are the same); other bases, not so.
As we see, all but one of the inherited polymorphic classes are placed at a non zero offset in multiple inheritance. Each one carries an additional "inherited" vptr in the derived class; that (hidden) pointer member must be correctly filled by any derived constructor.
These additional vptrs are for base classes that occur at non zero offset, so a pointer to an inherited base must be adjusted (add a positive constant to convert to base pointer, remove it to convert back). That a compiler needs to produce code to perform an implicit conversion is a trivial remark (converting an integer to a floating point type is a much more involved task); but here the conversion of this is between a function call on a given base type and landing in the function that is an overrider in a base or derived class: the difference is that the adjustment depends on function overriding, which is only known for a class (an instance of a meta type). So the vptr needs to point to distinct vtable information: one that knows how to deal with these base to derived pointer conversions.
As instances of the "meta type", vtables have all the information to do all pointer adjustments automatically. (These depend on the specific class types involved, and on no other factor.)
So at implementation level, the two types of inheritances are:
zero offset inheritance; sharing the vptr; called a primary base class in some vtable and ABI descriptions;
arbitrary offset inheritance; having another vptr; called secondary base class.
This is for the basic stuff. Virtual inheritance is a lot more subtle at the implementation level, and even the concept of primary isn't so clear, as virtual bases can be "primary" of a derived class only in some more derived classes!
Suppose we have two classes, each with at least one virtual function, Possession and Vehicle. To invoke those virtual functions for an instance of a derived class of either of those, a pointer to a virtual table is needed. Since these two classes are independent, their virtual tables will be completely different.
Now imagine that OwnedVehicle derives from both Possession and Vehicle. To call a virtual function in Possession for an instance of OwnedVehicle requires a pointer to a virtual function table of the type required by Possession. Similarly, to call a virtual function in Vehicle for an instance of OwnedVehicle requires a pointer to a virtual function table of the type required by Vehicle.
Typical implementations handle this by building a virtual function table for OwnedVehicle that contains one part for OwnedVehicle virtual functions (if any), one for Vehicle virtual functions and one for Possession virtual functions. Then, when calling a virtual function from a pointer to an object of a different type, all that the compiler has to do is apply an applicable delta to the virtual function table pointer to point to the correct part of it.
While the multiple inheritance case is more complex, the same occurs with just single inheritance. The virtual function table for OwnedVehicle contains inside it a virtual function table for Vehicle and would do so even if Possession were not involved.
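
The pointer adjustment for non-primary bases can be made visible in a small experiment. This is a sketch that reuses the class names from the answer above; the members (owner, wheels, value, speed) and the two helper functions are assumptions added for illustration, and which base ends up at offset zero is up to the implementation.

```cpp
#include <cassert>
#include <string>

struct Possession {
    virtual ~Possession() = default;
    virtual std::string owner() const { return "nobody"; }
    int value = 0;
};

struct Vehicle {
    virtual ~Vehicle() = default;
    virtual int wheels() const { return 0; }
    int speed = 0;
};

struct OwnedVehicle : Possession, Vehicle {
    std::string owner() const override { return "Alice"; }
    int wheels() const override { return 4; }
};

// True when the two base subobjects live at different addresses,
// i.e. at least one derived-to-base conversion adds an offset.
bool bases_at_different_addresses(OwnedVehicle& ov) {
    Possession* p = &ov;   // implicit derived-to-base conversion
    Vehicle* v = &ov;      // the compiler may add a constant here
    return static_cast<void*>(p) != static_cast<void*>(v);
}

// dynamic_cast<void*> yields the address of the most derived object,
// whichever base pointer we start from.
bool both_lead_back_to_whole_object(OwnedVehicle& ov) {
    Possession* p = &ov;
    Vehicle* v = &ov;
    return dynamic_cast<void*>(p) == static_cast<void*>(&ov)
        && dynamic_cast<void*>(v) == static_cast<void*>(&ov);
}
```

Both base pointers still dispatch to the OwnedVehicle overriders, which is why each base subobject needs a vptr that leads to a table describing the adjustment back to the whole object.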

Why does an abstract class have a vtable?

Regarding this post:
For implementations that use vtable, the answer is: Yes, usually. You
might think that vtable isn't required for abstract classes because
the derived class will have its own vtable, but it is needed during
construction: While the base class is being constructed, it sets the
vtable pointer to its own vtable. Later when the derived class
constructor is entered, it will use its own vtable instead.
I'm assuming the answer is correct, but I don't quite get it. Why is the vtable needed exactly for construction?
Because the standard says so.
[class.cdtor]/4
When a virtual function is called directly or indirectly from a
constructor or from a destructor, including during the construction or
destruction of the class's non-static data members, and the object to
which the call applies is the object (call it x) under construction or
destruction, the function called is the final overrider in the
constructor's or destructor's class and not one overriding it in a
more-derived class.
The rationale is that first the base class is constructed, then the derived one. If a virtual function is called inside the base class' constructor, it would be bad to call the derived class, since the derived class isn't initialized yet.
Remember that an abstract class may have non-pure virtual functions. Also, for debugging purposes, it is good to point pure virtual functions to a debugging trap (e.g. MSVC calls _purecall()).
If all virtual functions are pure, in MSVC you can omit the vtable with __declspec(novtable). If you use a lot of interface classes, this can lead to significant savings because you omit vfptr initialization. But if you accidentally call a pure virtual function, you'll get a hard to debug access violation.
vtables are implementation issues in C++, they are not part of the standard.
vtables are used for both dynamic dispatching of methods and for RTTI. While a nullptr vtable pointer would work for dynamic dispatching (as the vtable pointer is only used when you have an instance of that type) in a pure-abstract class, a dynamic_cast to a pure abstract class is legal, and it may require that the vtable itself exist.
Designers of the C++ implementation and ABI might have simply given the purely abstract class (a class with no implemented methods, just =0 ones) a vtable to make their implementation simpler. Every class has a vtable, and the vtable pointer gets set during construction of that class. Code can then rely on the fact that the vtable pointer exists and does not have to check for null every time. Code doesn't have to ask questions like "is this a purely abstract class".
For a non-pure abstract class (where some methods have implementations but some are pure virtual), during construction/destruction you can have defined (if unexpected) behavior that involves invoking exactly this class's version of a given method, and not the base class method or an inherited method. For this to work, you need to have a vtable set up. With a pure abstract class, there is no defined result of such a call, so the vtable is redundant, but for an abstract class that isn't totally abstract this does not hold.
When your class has a pure virtual function, that does not mean you cannot also have an implementation for it (!!). So that implies you can have an abstract class, which is also fully implemented. The constructor of your abstract class has to be able to call all functions - even the pure virtual ones, because of this point - that exist for it so far.
If you'd have substituted the client one, you'd get different behaviour for the base class constructor depending on the deriving class - not a great idea, so that's not allowed. You could put in place no vtable and statically resolve all function calls - that works, but it implies handling the constructor specially compared to all other functions and requires inlining all other functions to do this (since a function called from the constructor may also call a virtual etc.) - not very practical.
So it just implements a vtable for the constructor and destructor to use during construction and destruction. It allows you to use typeid and dynamic_cast in the c'tor and d'tor with the predictable result and get reliable behaviour out of the virtual functions you have. No alternative solution would do that.
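
The construction-time behaviour quoted from [class.cdtor] can be checked directly. The following sketch uses invented names (Shape, Circle, describe, g_log, construct_and_log); Shape is abstract but has one non-pure virtual function, which is the case where its own vtable is needed while its constructor runs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Collects what each constructor observed (illustrative global).
std::vector<std::string> g_log;

struct Shape;
std::string describe(const Shape& s);   // defined after Shape

struct Shape {
    // Indirect virtual call while only the Shape part exists.
    Shape() { g_log.push_back(describe(*this)); }
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
    virtual double area() const = 0;     // makes Shape abstract
};

std::string describe(const Shape& s) { return s.name(); }

struct Circle : Shape {
    Circle() { g_log.push_back(describe(*this)); }
    std::string name() const override { return "Circle"; }
    double area() const override { return 3.0; }   // placeholder value
};

// Builds a Circle and returns what was logged along the way.
std::vector<std::string> construct_and_log() {
    g_log.clear();
    Circle c;
    g_log.push_back(describe(c));   // after construction: normal dispatch
    return g_log;
}
```

The first entry logged is "Shape", because the call made during the base constructor resolves to the base class's final overrider; the later two entries are "Circle".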

C++ -- vptr & vtbl associated with object or class?

vptr -- virtual table pointer
vtbl -- virtual table
Question 1> Is it correct that vptr is associated with the object of a class?
Question 2> Is it correct that vtbl is associated with the class?
Question 3> How do they really work together?
Thank you
Note that vptr and vtbl are implementation details; the C++ standard does not even talk about them. However, most known compilers implement dynamic dispatch through a vptr and vtbl.
Question 1: Is it correct that vptr is associated with the object of a class?
YES
vptr is associated with each object of a class that contains at least one virtual member function. The compiler adds a vptr to every object of a polymorphic class (one that contains at least one virtual member function). In many implementations the vptr is stored at the start of the object and has the size of a pointer (4 bytes on a 32-bit target).
Question 2: Is it correct that vtbl is associated with the class?
YES
Vtbl is associated with a Class. A vtbl gets created for each Polymorphic class.
Question 3: How do they really work together?
The compiler adds a vptr to every object of a polymorphic class and also creates a vtbl for each such class. The vptr points to the vtbl. The vtbl contains a list of addresses of all the virtual functions in that class. If a derived class overrides a virtual function of the base class, then the entry for that particular function in the derived class's vtbl is replaced by the address of the overriding function. At run time the generated code reads the vtbl through the vptr stored in the object, so the function is selected by the object's actual class (Base or Derived) rather than by the type of the pointer, and it calls the function address found in the vtbl. Thus dynamic polymorphism is achieved.
The cost of this dynamic polymorphism is:
fetch (the vptr inside this), fetch (the function address from the list of functions in the vtbl), call (the function)
as against a single call (a direct call to the function, since it is statically bound).
The virtual table pointer is just a pointer inside every object of your class, pointing to the correct virtual table. Inside the virtual table are the addresses of the virtual functions of your class. The generated code looks in the virtual table to get the correct function when invoking a function through a base class pointer.

When is VTable in C++ created?

I would like to know when a vtable is created.
Is it in the startup code before main(), or at some other point in time?
A vtable isn't a C++ concept, so whether vtables are used at all, and when they are created if they are, depends on the implementation.
Typically, vtables are structures created at compile time (because they can be determined at compile time). When objects of a particular type are created at runtime they will have a vptr which will be initialized to point at a static vtable at construction time.
The vtable is created at compile time. When a new object is created during run time, the hidden vtable pointer is set to point to the vtable.
Keep in mind, though, that you can't make reliable use of the virtual functions until the object is fully constructed. (A virtual function called from a constructor will not dispatch to an override in a more-derived class.)
EDIT
I thought I'd address the questions in the comments.
As has been pointed out, the exact details of how the vtable is created and used are left up to the implementation. The C++ specification only provides specific behaviors that must be guaranteed, so there's plenty of wiggle room for the implementation. It doesn't have to use vtables at all (though most do). Generally, you don't need to know those details. You just need to know that when you call a virtual function, it does what you expect regardless of how it does it.
That said, I'll clarify a couple points about a typical implementation. A class with virtual functions has a hidden pointer (we'll call vptr) that points to the vtable for that class. Assume we have an employee class:
class Employee {
public:
    virtual void work();
};
This class will have a vptr in its structure, so conceptually it may look like this:
class Employee {
public:
    vtbl *vptr; // hidden pointer added by the compiler (illustrative pseudocode)
    virtual void work();
};
When we derive from this class, it will also have a vptr, and it must be in the same spot (in this case, at the beginning). That way when a function is called, regardless of the type of derived class, it always uses the vptr at the beginning to find the right vtable.

Virtual function and Classes

I need some answers to basic questions. I'm lost again. :(
q1 - Is this statement valid:
Whenever we define the function to be pure virtual function,
this means that function has no body.
q2 - And what is the concept of Dynamic Binding? I mean if the Compiler optimizes the code using VTABLEs and VPTRs then how is it Run-Time Polymorphism?
q3 - What are VTABLES AND VPTRs and how do their sizes change?
q4 - Please see this code:
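#include <iostream>
using namespace std;

class base
{
public:
    virtual void display()
    {
        cout << "Displaying from base";
    }
};
class derived : public base
{
public:
    // Overrides base::display
    void display() { cout << "\nDisplaying from derived"; }
};
int main()
{
    base b, *bptr;
    derived d;
    bptr = &b;
    bptr->display();   // calls base::display
    bptr = &d;
    bptr->display();   // calls derived::display
}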
Output:
Displaying from base
Displaying from derived
Please can somebody answer why a pointer of base class can point the member function of a derived class and the vice-versa is not possible, why ?
1) False. It just means any derived classes must implement said function. You can still provide a definition for the function, and it can be called by Base::Function().*
2) Virtual tables are a way of implementing virtual functions. (The standard doesn't mandate this is the method, though.) When making a polymorphic call, the compiler will look up the function in the function table and call that one, enabling run-time binding. (The table is generated at compile time.)
3) See above. Their sizes change as there are more virtual functions. However, instances don't store a table but rather a pointer to the table, so the size of an object grows by only a single pointer.
4) Sounds like you need a book.
*A classic example of this is here:
struct IBase
{
virtual ~IBase(void) = 0;
};
inline IBase::~IBase(void) {}
This wouldn't be an abstract class without a pure virtual function, but a destructor requires a definition (since it will be called when derived classes destruct.)
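
A pure virtual function other than a destructor can carry a body in the same way. The sketch below uses invented names (Logger, ShoutingLogger, format_via_base) to show the two points made in the answer: derived classes must still override the function, and the base body is only reached through an explicitly qualified call.

```cpp
#include <cassert>
#include <string>

struct Logger {
    virtual ~Logger() = default;
    // Pure virtual: Logger is abstract and derived classes must override.
    virtual std::string format(const std::string& msg) const = 0;
};

// A pure virtual function may still have an out-of-class definition.
std::string Logger::format(const std::string& msg) const {
    return "[log] " + msg;
}

struct ShoutingLogger : Logger {
    std::string format(const std::string& msg) const override {
        // The explicitly qualified call reaches the base-class body.
        return Logger::format(msg) + "!";
    }
};

// Goes through normal virtual dispatch to the final overrider.
std::string format_via_base(const Logger& logger, const std::string& msg) {
    return logger.format(msg);
}
```

Passing a ShoutingLogger and the message "hi" to format_via_base returns "[log] hi!".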
1) Not necessarily. There are times when you provide body for pure virtual functions
2) The function to be called is determined at run time.
1) False. It only means that derived classes must implement the method and that the method definition (if present) at that level will not be considered an override of the virtual method.
2) The vtable is built at compile time, but used at runtime. The compiler will redirect the call through the vtable, and that depends on the runtime type of the object (a pointer to base has static type base*, but might point to an object of type derived at runtime).
3) vptrs are pointers to a vtable; they do not change size. vtables are tables of pointers to code (they might point to methods or to some adapter code) and have one entry for each virtual method declared in the class.
After the edit in the code:
The pointer refers to an object of type base during the first call, but it points to an object of type derived at the second call position. The dynamic dispatch mechanism (vtable) routes the call to the appropriate method.
A common implementation, which may help you understand is that in each class that declares virtual functions the compiler reserves space for a pointer to a virtual table, and it also generates the virtual table itself, where it adds pointers to the definition of each virtual method. The memory layout of the object only has that extra pointer.
When a derived class overrides any of the base class methods, the compiler generates a different vtable with pointers to the final overriders at that level. The memory layout of both the base and the derived class coincide in the base subobject part (usually the beginning), but the value of the vptr of a base object will point to the base vtable, while the value of the vptr in the derived object will point to the derived vtable.
When the compiler sees a call like bptr->display(), it checks the definition of the base class, and sees that it is the first virtual method, then it redirects the call as: bptr->hidden_vptr[0](). If the pointer is referring to a real base instance, that will be a pointer to base::display, while in the case of a derived instance it will point to derived::display.
Note that there is a lot of hand-waving in this answer. All this is left to the implementation (the language does not specify the dispatch mechanism), and in most cases the dispatch mechanism is more complex. For example, when multiple inheritance takes place, the vtable may not point directly to the final overrider, but to an adapter block of code that adjusts the implicit this parameter by an offset, because the base subobjects of all but the first base do not start at the same address as the most derived object in memory. This is well beyond the scope of the question; just remember that this answer is a rough idea and that there is added complexity in real systems.
Is this statement valid
Not exactly: it might have a body. A more accurate definition is "Whenever we define a method to be pure virtual, this means that the method must be defined (overridden) in a concrete subclass."
And what is the concept of Dynamic Binding? I mean if the Compiler optimizes the code using VTABLEs and VPTRs then how is it Run-Time Polymorphism?
If you have an instance of a superclass (e.g. Shape) at run-time, you don't/needn't necessarily know which of its subclasses (e.g. Circle or Square) it is.
What are VTABLES AND VPTRs and how do their sizes change?
There's one vtable per class (for any class which has one or more virtual methods). The vtable contains the addresses of the class's virtual methods.
There's one vptr per object (for any object whose class has one or more virtual methods). The vptr points to the vtable for that object's class.
The size of the vtable increases with the number of virtual functions in the class. The size of the vptr is probably constant.
Please can somebody answer why a pointer of base class can point the member function of a derived class and the vice-versa is not possible, why ?
If you want to invoke the base class function then, because it's virtual and the default behaviour for virtual is to call the most-derived version via the vptr/vtable, you have to say so explicitly, e.g. like this:
bptr->base::display();
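
A small sketch of the difference between the two call forms, assuming the classes from the question reworked to return strings instead of printing; the helpers virtual_call and qualified_call are invented for the example.

```cpp
#include <cassert>
#include <string>

struct base {
    virtual ~base() = default;
    virtual std::string display() const { return "Displaying from base"; }
};

struct derived : base {
    std::string display() const override { return "Displaying from derived"; }
};

// Ordinary virtual call: resolved through the object's dynamic type.
std::string virtual_call(const base* bptr) { return bptr->display(); }

// Qualified call: virtual dispatch is suppressed, base::display runs.
std::string qualified_call(const base* bptr) { return bptr->base::display(); }
```

With a derived object behind the pointer, virtual_call returns the derived text while qualified_call returns the base text.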