I would like to know when is a vtable created?
Whether its in the startup code before main() or is it at some other point of time??
A vtable isn't a C++ concept so if they are used and when they are created if they are used will depend on the implementation.
Typically, vtables are structures created at compile time (because they can be determined at compile time). When objects of a particular type are created at runtime they will have a vptr which will be initialized to point at a static vtable at construction time.
The vtable is created at compile time. When a new object is created during run time, the hidden vtable pointer is set to point to the vtable.
Keep in mind, though, that you can't make reliable use if the virtual functions until the object is fully constructed. (No calling virtual functions in the constructor.)
EDIT
I thought I'd address the questions in the comments.
As has been pointed out, the exact details of how the vtable is created and used is left up to the implementation. The c++ specification only provides specific behaviors that must be guaranteed, so there's plenty of wiggle room for the implementation. It doesn't have to use vtables at all (though most do). Generally, you don't need to know those details. You just need to know that when you call a virtual function, it do what you expect regardless of how it does it.
That said, I'll clarify a couple points about a typical implementation. A class with virtual functions has a hidden pointer (we'll call vptr) that points to the vtable for that class. Assume we have an employee class:
class Employee {
public:
virtual work();
};
This class will have a vptr in it's structure, so it actually may look like this:
class Employee {
public:
vtble *vptr; // hidden pointer
virtual work();
};
When we derive from this class, it will also have a vptr, and it must be in the same spot (in this case, at the beginning). That way when a function is called, regardless of the type of derived class, it always uses the vptr at the beginning to find the right vtable.
Related
What is the performance difference between calling a virtual function from a derived class pointer directly vs from a base class pointer to the same derived class?
In the derived pointer case, will the call be statically bound, or dynamically bound? I think it'll be dynamically bound because there's no guarantee the derived pointer isn't actually pointing to a further derived class. Would the situation change if I have the derived class directly by value (not through pointer or reference)? So the 3 cases:
base pointer to derived
derived pointer to derived
derived by value
I'm concerned about performance because the code will be run on a microcontroller.
Demonstrating code
struct Base {
// virtual destructor left out for brevity
virtual void method() = 0;
};
struct Derived : public Base {
// implementation here
void method() {
}
}
// ... in source file
// call virtual method from base class pointer, guaranteed vtable lookup
Base* base = new Derived;
base->method();
// call virtual method from derived class pointer, any difference?
Derived* derived = new Derived;
derived->method();
// call virtual method from derived class value
Derived derivedValue;
derived.method();
In theory, the only C++ syntax that makes a difference is a member function call that uses qualified member name. In terms of your class definitions that would be
derived->Derived::method();
This call ignores the dynamic type of the object and goes directly to Derived::method(), i.e. it's bound statically. This is only possible for calling methods declared in the class itself or in one of its ancestor classes.
Everything else is a regular virtual function call, which is resolved in accordance with the dynamic type of the object used in the call, i.e. it is bound dynamically.
In practice, compilers will strive to optimize the code and replace dynamically-bound calls with statically-bound calls in contexts where the dynamic type of the object is known at compile time. For example
Derived derivedValue;
derivedValue.method();
will typically produce a statically-bound call in virtually every modern compiler, even though the language specification does not provide any special treatment for this situation.
Also, virtual method calls made directly from constructors and destructors are typically compiled into statically-bound calls.
Of course, a smart compiler might be able to bind the call statically in a much greater variety of contexts. For example, both
Base* base = new Derived;
base->method();
and
Derived* derived = new Derived;
derived->method();
can be seen by the compiler as trivial situations that easily allow for statically-bound calls.
Virtual functions must be compiled to work as if they were always called virtually. If your compiler compiles a virtual call as a static call, that's an optimization that must satisfy this as-if rule.
From this, it follows that the compiler must be able to prove the exact type of the object in question. And there are some valid ways in which it can do this:
If the compiler sees the creation of the object (the new expression or the automatic variable from which the address is taken) and can prove that that creation is actually the source of the current pointer value, that gives it the precise dynamic type it needs. All your examples fall into this category.
While a constructor runs, the type of the object is exactly the class containing the running constructor. So any virtual function call made in a constructor can be resolved statically.
Likewise, while a destructor runs, the type of the object is exactly the class containing the running destructor. Again, any virtual function call can be resolved statically.
Afaik, these are all the cases that allow the compiler to convert a dynamic dispatch into a static call.
All of these are optimizations, though, the compiler may decide to perform the runtime vtable lookup anyway. But good optimizing compilers should be able to detect all three cases.
There should be no difference between the first two cases, since the very idea of virtual functions is to call always the actual implementation. Leaving compiler optimisations aside (which in theory could optimise all virtual function calls away if you construct the object in the same compilation unit and there is no way the pointer can be altered in between), the second call must be implemented as a indirect (virtual) call as well, since there could be a third class inheriting from Derived and implementing that function as well. I would assume that the third call will not be virtual, since the compiler knows the actual type already at compile time. Actually you could make sure of this by not defining the function as virtual, if you know you will always do the call on the derived class directly.
For really lightweight code running on a small microcontroller I would recommend avoiding defining functions as virtual at all. Usually there is no runtime abstraction required. If you write a library and need some kind of abstraction, you can maybe work with templates instead (which give you some compile-time abstraction).
At least on PC CPUs I often find virtual calls one of the most expensive indirections you can have (probably because branch prediction is more difficult). Sometimes one can also transform the indirection to the data level, e.g. you keep one generic function which operates on different data which is indirected with pointers to the actual implementation. Of course this will work only in some very specific cases.
At run-time.
BUT: Performance as compared to what? It isn't valid to compare a virtual function call to a non-virtual function call. You need to compare it to a non-virtual function call plus an if, a switch, an indirection, or some other means of providing the same function. If the function doesn't embody a choice among implementations, i.e. doesn't need to be virtual, don't make it virtual.
Say I have an object that exists in high quantity, stores little data about itself, but requires several larger functions to act upon itself.
class Foo
{
public:
bool is_dead();
private:
float x, y, z;
bool dead;
void check_self();
void update_self();
void question_self();
};
What behavior can I expect from the compiler - would every new Foo object cause duplicates of its methods to be copied into memory?
If yes, what are good options for managing class-specific (private-like) functions while avoiding duplication?
If not, could you elaborate on this a little?
C++ methods are simply functions (with a convention about this which often becomes the implicit first argument).
Functions are mostly machine code, starting at some specific address. The start address is all that is needed to call the function.
So objects (or their vtable) need at most the address of called functions.
Of course a function takes some place (in the text segment).
But an object won't need extra space for that function. If the function is not virtual, no extra space per object is needed. If the function is virtual, the object has a single vtable (per virtual class). Generally, each object has, as its first field, the pointer to the vtable. This means 8 bytes per object on x86-64/Linux. Each object (assuming single inheritance) has one vtable pointer, independently of the number or of the code size of the virtual
functions.
If you have multiple, perhaps virtual, inheritance with virtual methods in several superclasses you'll need several vtable pointers per instance.
So for your Foo example, there is no virtual function (and no superclass containing some of them), so instances of Foo contain no vtable pointer.
If you add one (or many hundreds) of virtual functions to Foo (then you should have a virtual destructor, see rule of three in C++), each instance would have one vtable pointer.
If you want a behavior to be specific to instances (so instances a and b could have different behavior) without using the class machinery for that, you need some member function pointers (in C++03) or (in C++11) some std::function (perhaps anonymous closures). Of course they need space in every instance.
BTW, to know the size of some type or class, use sizeof .... (it does include the vtable[s] pointer[s] if relevant).
Methods exists for every class in program, not for every object.
Try to read some good books about c++ to know so easy facts about language.
I know that, how to implement virtual function call resolution is not part of C++ standrads nor it says anything about vptr or v-table, but let me ask this question here.
I've heard that v-table is a common technique used by compilers to implement virtual function call resolution. My understading about this is that there is only virtual table is needed per class, per process.
What I am wondering is, when is the v-table is created for a class?
Is it When the class of a given type (which needs a v-table) is created in a process space for the first time?
All other subsequently created objects of that type in that process space, refers to the v-table that is already created?
When would this v-table will be deleted?
I am sorry if this is too subjective or discussion type of question, but these questions lingers in my mind for a while and I feel its OK asking it here.
The v-table is statically allocated and is never deleted, nor is it explicitly allocated. The pointers within any given specific object are constants.
The C++ FAQ provides a simplified explanation of the vtable mechanism. You should give it a read, although you will probably have to go through your particular compiler documentation for more details.
The most important ideas from my point of view :
The vtable for a type is static and built at compile time
Each of the type instances contains a pointer to this table
Because this pointer is initialized at construction time, a virtual member function should never be called from the constructor
The vtable is static data so available immediately at load. BTW, it is usually bundled in the compilation unit which contains the definition for the first non-inline virtual function on the class (and that heuristic leads to problem when there is only one virtual function which is inline).
I believe it's all implementation defined, so it's difficultto give a universal answer to this question. I believe the vtable should be a some sort of a static class member.
Does a pure-virtual object have a pointer to the vtbl?
(that probably points to NULL?)
thanks, i'm a little bit confused with all the virtual mechanism.
Don't worry about it. Virtual tables are an implementation detail, and aren't even guaranteed to exist. The more you worry about how it might be done, the less you learn about the actual language.
That said, yes. A concrete class will then set that pointer to point to the correct virtual table.
There isn't technically such a thing as a 'pure-virtual object'. I assume you mean an object with pure-virtual methods? But you can't actually create such an object because it would be abstract and the compiler would complain.
Having said that, while the object is being constructed it is briefly an instance of the abstract class before becoming an instance of the derived class. It will in such a case have a virtual table set the functions it defines. It will probably have NULL for the pure virtual methods. If you try calling that the program will crash.
You can try this out by calling virtual methods in the constructor. You'll find they invoke the base class version if you call the methods in the base class. If you call a pure virtual method it'll crash. (In some cases the compiler will figure out what you are doing and complain instead).
The take home is:
Don't call virtual functions in your constructor, its just likely to be confusing. In fact, in most cases it is best if your constructor just sets its internal status up and does not do anything too complicated.
I need some answers to basic questions. I'm lost again. :(
q1 - Is this statement valid:
Whenever we define the function to be pure virtual function,
this means that function has no body.
q2 - And what is the concept of Dynamic Binding? I mean if the Compiler optimizes the code using VTABLEs and VPTRs then how is it Run-Time Polymorphism?
q3 - What are VTABLES AND VPTRs and how do their sizes change?
q4 - Please see this code:
class base
{
public:
virtual void display()
{
cout<<"Displaying from base";
}
};
class derived:public base
{
public:
void display(){cout<<"\nDisplaying from derived";}
};
int main()
{
base b,*bptr;
derived d;
bptr=&b;
bptr->display();
bptr=&d;
bptr->display();
}
Output:
Displaying from base
Displaying from derieved
Please can somebody answer why a pointer of base class can point the member function of a derived class and the vice-versa is not possible, why ?
False. It just means any derived classes must implement said function. You can still provide a definition for the function, and it can be called by Base::Function().*
Virtual tables are a way of implementing virtual functions. (The standard doesn't mandate this is the method, though.) When making a polymorphic call, the compiler will look up the function in the function table and call that one, enabling run-time binding. (The table is generated at compile time.)
See above. Their sizes change as there are more virtual functions. However, instances don't store a table but rather a pointer to the table, so class size only has a single size increase.
Sounds like you need a book.
*A classic example of this is here:
struct IBase
{
virtual ~IBase(void) = 0;
};
inline IBase::~IBase(void) {}
This wouldn't be an abstract class without a pure virtual function, but a destructor requires a definition (since it will be called when derived classes destruct.)
1) Not necessarily. There are times when you provide body for pure virtual functions
2) The function to be called is determined at run time.
False. It only means that derived classes must implement the method and that the method definition (if present) at that level will not be consider an override of the virtual method.
The vtable is implemented at compile time, but used at runtime. The compiler will redirect the call through the vtable, and that depends on the runtime type of the object (a pointer to base has static type base*, but might point to an object of type derived at runtime).
vptrs are pointers to an vtable, they do not change size. vtables are tables of pointers to code (might point to methods or to some adapter code) and have one entry for each virtual method declared in the class.
After the edit in the code:
The pointer refers to an object of type base during the first call, but it points to an object of type derived at the second call position. The dynamic dispatch mechanism (vtable) routes the call to the appropriate method.
A common implementation, which may help you understand is that in each class that declares virtual functions the compiler reserves space for a pointer to a virtual table, and it also generates the virtual table itself, where it adds pointers to the definition of each virtual method. The memory layout of the object only has that extra pointer.
When a derived class overrides any of the base class methods, the compiler generates a different vtable with pointers to the final overriders at that level. The memory layout of both the base and the derived class coincide in the base subobject part (usually the beginning), but the value of the vptr of a base object will point to the base vtable, while the value of the vptr in the derived object will point to the derived vtable.
When the compiler sees a call like bptr->display(), it checks the definition of the base class, and sees that it is the first virtual method, then it redirects the call as: bptr->hidden_vptr[0](). If the pointer is referring to a real base instance, that will be a pointer to base::display, while in the case of a derived instance it will point to derived::display.
Note that there is a lot of hand-waving in this answer. All this is implementation defined (the language does not specify the dispatch mechanism), and in most cases the dispatch mechanism is more complex. For example, when multiple inheritance takes place, the vtable will not point directly to the final overrider, but to an adapter block of code that resets the first implicit this parameter offset, as the base subobject of all but the first base are unaligned with the most derived object in memory --this is well beyond the scope of the question, just remember that this answer is a rough idea and that there is added complexity in real systems.
Is this statement valid
Not exactly: it might have a body. A more accurate definition is "Whenever we define a method to be pure virtual, this means that the method must be defined (overriden) in a concrete subclass."
And what is the concept of Dynamic Binding? I mean if the Compiler optimizes the code using VTABLEs and VPTRs then how is it Run-Time Polymorphism?
If you have an instance of a superclass (e.g. Shape) at run-time, you don't/needn't necessarily know which of its subclasses (e.g. Circle or Square) it is.
What are VTABLES AND VPTRs and how do their sizes change?
There's one vtable per class (for any class which has one or more virtual methods). The vtable contains pointers to the addresses of the class's virtual methods.
There's one vptr per object (for any object which has one or more virtual methods). The vptr points to the vtable for that object's class.
The size of the vtable increases with the number of virtual functions in the class. The size of the vptr is probably constant.
Please can somebody answer why a pointer of base class can point the member function of a derived class and the vice-versa is not possible, why ?
If you want to invoke the base class function then (because it's virtual, and the default behaviour for virtual is to call the most-derived version via the vptr/vtable) then you have to say so explicitly, e.g. like this:
bptr->base::display();