Does an abstract class have a vtable? - c++

Do we have a virtual table for an abstract class?

First of all, usage of vtables is implementation defined and not mandated by the standard.
For implementations that use vtables, the answer is: yes, usually. You might think that a vtable isn't required for abstract classes because the derived class will have its own vtable, but it is needed during construction: while the base class is being constructed, it sets the vtable pointer to its own vtable. Later, when the derived class constructor is entered, it will use its own vtable instead.
That said, in some cases this isn't needed and the vtable can be optimized away. For example, MS Visual C++ provides the __declspec(novtable) attribute to disable vtable generation on pure interface classes.
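The construction-time behaviour described above can be observed with standard C++ alone. The following is a minimal sketch; the class names Shape and Square are made up for illustration. It relies only on the language rule that a virtual call made from a base-class constructor resolves to the base-class version, which is what a per-class vtable pointer set during construction typically implements.

```cpp
#include <string>

// Abstract base: it has a pure virtual function, but also an ordinary
// virtual function that its own constructor calls.
struct Shape {
    std::string name_seen_by_ctor;

    Shape() {
        // While Shape's constructor runs, the object's dynamic type is Shape,
        // so this call resolves to Shape::name even when building a Square.
        name_seen_by_ctor = name();
    }
    virtual ~Shape() = default;

    virtual std::string name() const { return "Shape"; }
    virtual double area() const = 0;  // pure: makes Shape abstract
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    std::string name() const override { return "Square"; }
    double area() const override { return side * side; }
};
```

After construction has finished, the same call made through a Shape reference reaches Square::name, because the object's dynamic type is then Square.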

There seems to be a common misconception here, and I think traces of its sources can still be found online. Paul DiLascia wrote sometime in 2000 that -
...see that the compiler still
generates a vtable all of whose
entries are NULL and still generates
code to initialize the vtable in the
constructor or destructor for A.
That may actually have been true then, but certainly isn't now.
Yes, abstract classes do have vtables, including entries for pure virtual methods (which can in fact be implemented and called), and yes, their constructors do initialize the pure entries to a specific value. For VC++ at least, that value is the address of the CRT function _purecall. You can in fact control what happens there, either by supplying your own _purecall or by using _set_purecall_handler.

We have a virtual table for any class that has at least one virtual function.
That virtual function can also be pure.
This means an abstract class can have a vtable.
In the case of abstract classes, the vtable entry for a pure virtual function may be NULL (in practice, compilers usually store the address of an error-reporting function instead).
Note that an attempt to instantiate an abstract class is not detected by inspecting the vtable: the compiler knows from the class declaration that a pure virtual function is still unimplemented.
It therefore reports the error at compile time.
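Whether a class is abstract follows from its declaration, so the compiler can answer that question without looking at any table at run time. A small sketch using the standard std::is_abstract trait; the class names Abstract and Concrete and the helper read_value are hypothetical:

```cpp
#include <type_traits>

struct Abstract {
    virtual ~Abstract() = default;
    virtual int value() const = 0;  // pure virtual: Abstract cannot be instantiated
};

struct Concrete : Abstract {
    int value() const override { return 7; }
};

// Both facts are established at compile time, from the declarations alone.
static_assert(std::is_abstract<Abstract>::value, "Abstract has an unimplemented pure virtual");
static_assert(!std::is_abstract<Concrete>::value, "Concrete overrides every pure virtual");

// Abstract a;  // would be rejected by the compiler, not at run time

// Using the abstract type through a reference is fine.
int read_value(const Abstract& a) { return a.value(); }
```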

Related

C++ Method Override and Overloading (Compiler level)

I know the difference between the two. Overriding basically lets you "redefine" a method in a child class, and overloading basically lets you "redefine" your method with different arguments or parameters. I'm a little confused about what's going on under the hood, though. I read that when you overload a method, the compiler collects all the overloaded methods and finds the best match, or reports an error if none exists. This is obviously done at compile time, but I'm confused about how overriding works. I've read that handling overrides is extremely hard because you have to check whether the return type matches across the class hierarchy, and there can be a lot of class levels to check
(i.e. class Living is the super class of Human and Animal. Human and Animal can have many derived classes, which means we will have a deep level of classes).
Without getting too detailed, how does overriding work at the compiler level and why is it that overriding is done during run time and not compile time?
It depends on whether the overridden method is virtual or not. If the overridden method is not virtual, then under the hood it usually works the same way as overloading: the compiler looks at the static type of the object and calls the correct function based on that.
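To make the static-type rule concrete, here is a short sketch contrasting a non-virtual member with a virtual one. All names (Base, Derived, plain, virt, call_plain, call_virt) are invented for the example.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    // Non-virtual: the call target is chosen from the static type.
    std::string plain() const { return "Base::plain"; }
    // Virtual: the call target is chosen from the dynamic type.
    virtual std::string virt() const { return "Base::virt"; }
};

struct Derived : Base {
    // Hides Base::plain; it does not override it.
    std::string plain() const { return "Derived::plain"; }
    std::string virt() const override { return "Derived::virt"; }
};

// Inside these helpers the static type of b is Base, whatever was passed in.
std::string call_plain(const Base& b) { return b.plain(); }
std::string call_virt(const Base& b) { return b.virt(); }
```

Passing a Derived object to call_plain still runs Base::plain, while call_virt reaches Derived::virt.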
For objects with virtual methods, a vtable is usually used. This is a collection of function pointers to the virtual methods. The reason this is done at run time is to allow for runtime polymorphism. The usual way a vtable is generated is that the compiler creates a single vtable for each class, populates it with the required pointers at compile time, and includes it in the executable. The constructor then sets a hidden pointer in the object to point to the correct vtable. When looking up a method, the generated code first dereferences the hidden pointer to find the vtable and then reads the correct slot from the vtable.
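The mechanism just described can be imitated by hand. The sketch below is not what any particular compiler emits; it only spells out the idea of one table per class, a hidden pointer per object, and a lookup through that pointer. Every name in it (Animal, AnimalVTable, make_dog and so on) is hypothetical.

```cpp
#include <string>

struct Animal;

// One table of function pointers per "class", shared by all of its objects.
struct AnimalVTable {
    std::string (*speak)(const Animal* self);
    int (*legs)(const Animal* self);
};

// Every object carries a pointer to the table of its class.
struct Animal {
    const AnimalVTable* vptr;
};

std::string dog_speak(const Animal*) { return "Woof"; }
int dog_legs(const Animal*) { return 4; }
std::string bird_speak(const Animal*) { return "Tweet"; }
int bird_legs(const Animal*) { return 2; }

// The tables are filled in once and live in the program image.
const AnimalVTable dog_vtable = { dog_speak, dog_legs };
const AnimalVTable bird_vtable = { bird_speak, bird_legs };

// The "constructors" store the address of the right table in the object.
Animal make_dog() { return Animal{ &dog_vtable }; }
Animal make_bird() { return Animal{ &bird_vtable }; }

// A "virtual call": follow vptr, pick the slot, call through it.
std::string speak(const Animal& a) { return a.vptr->speak(&a); }
int legs(const Animal& a) { return a.vptr->legs(&a); }
```

The caller of speak never names dog_speak or bird_speak; which one runs depends only on the table the object points to.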

C++ Interview: vtable for a class with a pure virtual function

I was asked this interview question today!! (it was a really awkward telephonic interview..):
What is the difference between the vtable for a class with virtual
functions and a class with pure virtual functions?
Now, I know the C++ standard doesn't specify anything about vtables, or even the existence of a v-table ..however theoretically speaking what would the answer be?
I blurted out that the class with a pure virtual function could have a vtable and its vtable entry for the pure virtual function will point to the derived class implementation. Is this assumption correct? I did not get a positive answer from the interviewer.
Will a hypothetical compiler create a vtable for a class with only pure virtual functions? What if the class contains pure virtual functions with definitions? (as shown in : http://www.gotw.ca/gotw/031.htm).
In the case of non-pure virtual functions, each entry in the vtable will refer to the final-overrider or a thunk that adapts the this pointer if needed. In the case of a pure-virtual function, the entry in the vtable usually contains a pointer to a generic function that complains and aborts the program with some sensible message (pure virtual function called within this context or similar error message).
Will a hypothetical compiler create a vtable for a class with only pure virtual functions?
Yes, it will, the difference will be in the contents stored in the table, not in the shape of the table. In a simplistic approach, a NULL pointer for pure virtual functions, non-NULL for virtual functions. Realistically, a pointer to a generic function that will complain and abort() with usual compilers.
What if the class contains pure virtual functions with definitions?
This will not affect the vtable. The vtable is only used for dynamic dispatch, and a call will never be dynamically dispatched to the definition of a pure virtual function. You can only dispatch to the pure virtual function manually, by disabling dynamic dispatch through qualifying the name with the type: x.base::f() will call base::f even if it is pure virtual, but x.f() will never be dispatched to base::f if it is pure virtual.
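A short sketch of a pure virtual function that also has a definition, and of the qualified call that reaches it. The names Base, Derived and f are placeholders chosen for the example.

```cpp
struct Base {
    virtual ~Base() = default;
    virtual int f() = 0;  // pure virtual, so Base stays abstract...
};

// ...but a definition may still be supplied out of class.
int Base::f() { return 1; }

struct Derived : Base {
    // The qualified call Base::f() bypasses dynamic dispatch.
    int f() override { return Base::f() + 10; }
};
```

With a Derived object x, the unqualified x.f() runs Derived::f, while x.Base::f() runs the definition attached to the pure virtual function.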
An implementation can do pretty much anything in such cases, because if
your code ends up calling a pure virtual function in a context where
dynamic resolution is required, and it would resolve to a pure virtual
function, the behavior is undefined. I've seen several different
solutions: the compiler inserts the address of a function which
terminates with an error message (the preferred solution from a
quality of implementation point of view), the compiler inserts a null
pointer, or the compiler inserts the address of the function from some
derived class. I've also seen cases where the compiler will insert the
address of the function if you provide an implementation. The only
correct answer to the question is that you can't count on any particular
behavior.
I can tell you that "pure" abstract classes (classes with only pure virtual functions) are used by Microsoft (and MS VC++) for their COM interfaces. Perhaps he was speaking of that. The "internal" representation of a COM object is a pointer to a vtable. Pure abstract classes in MS VC++ are implemented in the same way, so you can use them to represent COM objects. Clearly, if your class has other virtual functions, you can't simply overwrite its vtable with the COM vtable :-)

Virtual dispatch implementation details

First of all, I want to make myself clear that I do understand that there is no notion of vtables and vptrs in the C++ standard. However I think that virtually all implementations implement the virtual dispatch mechanism in pretty much the same way (correct me if I am wrong, but this isn't the main question). Also, I believe I know how virtual functions work, that is, I can always tell which function will be called, I just need the implementation details.
Suppose someone asked me the following:
"You have base class B with virtual functions v1, v2, v3 and derived class D:B which overrides functions v1 and v3 and adds a virtual function v4. Explain how virtual dispatch works".
I would answer like this:
For each class with virtual functions(in this case B and D) we have a separate array of pointers-to-functions called vtable.
The vtable for B would contain
&B::v1
&B::v2
&B::v3
The vtable for D would contain
&D::v1
&B::v2
&D::v3
&D::v4
Now the class B contains a member pointer vptr. D naturally inherits it and therefore contains it too. In the constructor and destructor of B, B sets vptr to point to B's vtable. In the constructor and destructor of D, D sets it to point to D's vtable.
Any call to a virtual function f on an object x of polymorphic class X is interpreted as a call to x.vptr[f's position in vtables]
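The hierarchy from the question can be written out and exercised directly. This sketch only shows the observable dispatch results; the table layout itself is not visible from standard C++. The helper call_all is made up for the example, and the virtual destructor is an addition that would occupy table entries not listed above.

```cpp
#include <string>

struct B {
    virtual ~B() = default;  // not in the question; added for safe polymorphic use
    virtual std::string v1() const { return "B::v1"; }
    virtual std::string v2() const { return "B::v2"; }
    virtual std::string v3() const { return "B::v3"; }
};

struct D : B {
    std::string v1() const override { return "D::v1"; }
    std::string v3() const override { return "D::v3"; }
    virtual std::string v4() const { return "D::v4"; }  // new slot, D only
};

// Calls the three functions known to B through a B reference.
std::string call_all(const B& b) {
    return b.v1() + " " + b.v2() + " " + b.v3();
}
```

For a D object the result combines D::v1, B::v2 and D::v3, matching the table contents listed in the question.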
The questions are:
1. Do I have any errors in the above description?
2. How does the compiler know f's position in vtable (in detail, please)
3. Does this mean that if a class has two bases then it has two vptrs? What is happening in this case? (try to describe in a similar manner as I did, in as much detail as possible)
4. What's happening in a diamond hierarchy with A on top B,C in the middle and D at the bottom? (A is a virtual base class of B and C)
Thanks in advance.
1. Do I have any errors in the above description?
All good. :-)
2. How does the compiler know f's position in vtable
Each vendor will have their own way of doing this, but I always think of the vtable as a map from member function signature to memory offset. So the compiler just maintains this list.
3. Does this mean that if a class has two bases then it has two vptrs? What is happening in this case?
Typically, compilers compose a new vtable which consists of all the vtables of the virtual bases appended together in the order they were specified, along with the vtable pointer of the virtual base. They follow this with the vtable functions of the deriving class. This is extremely vendor-specific, but for class D : B1, B2, you typically see D._vptr[0] == B1._vptr.
The image that accompanied this answer showed how the member fields of an object are composed, but vtables can be composed by the compiler in exactly the same way (as far as I understand it).
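One visible consequence of having several polymorphic bases is that converting a pointer to the second base usually changes its value. The sketch below assumes a typical layout (as used by GCC, Clang and MSVC) in which each polymorphic base subobject carries its own vptr; none of this is required by the standard. The names B1, B2, D and b2_is_offset are hypothetical.

```cpp
struct B1 {
    virtual ~B1() = default;
    virtual int f() const { return 1; }
};

struct B2 {
    virtual ~B2() = default;
    virtual int g() const { return 2; }
};

struct D : B1, B2 {
    int f() const override { return 10; }
    int g() const override { return 20; }
};

// Returns true when the B2 subobject does not start at the address of the
// whole D object, i.e. when the conversion D* -> B2* adjusts the pointer.
bool b2_is_offset(const D& d) {
    const void* whole = &d;
    const void* as_b2 = static_cast<const B2*>(&d);
    return whole != as_b2;
}
```

On such implementations sizeof(D) is at least two pointers wide even though D declares no data members, and calls through a B2 reference still reach D::g.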
4. What's happening in a diamond hierarchy with A on top B,C in the middle and D at the bottom? (A is a virtual base class of B and C)
The short answer? Absolute hell. Did you virtually inherit both the bases? Just one of them? Neither of them? Ultimately, the same techniques of composing a vtable for the class are used, but how this is done varies far too wildly, since how it should be done is not at all set in stone. There is a decent explanation of solving the diamond-hierarchy problem here, but, like most of this, it is quite vendor-specific.
1. Looks good to me.
2. Implementation specific, but most are just in source code order (meaning the order they appear in the class), starting with the base class, then adding on new virtual functions from the derived. As long as the compiler has a deterministic way of doing this, then anything it wants to do is fine. However, on Windows, to create COM-compatible vtables, it has to be in source order.
3. (not sure)
4. (guess) A diamond just means that you could have two copies of a base class B. Virtual inheritance will merge them into one instance. So if you set a member via D1, you can read it via D2 (with C derived from D1 and D2, each of them derived from B). I believe that in both cases the vtables would be identical, as the function pointers are the same; the memory for data members is what is merged.
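The merging of the shared base can be checked with a few lines of standard C++. This sketch uses the naming from the question (A on top, B and C in the middle, D at the bottom); the helper shares_single_a is made up for the example.

```cpp
struct A {
    int value = 0;
    virtual ~A() = default;
};

// Both middle classes inherit A virtually, so D contains exactly one A.
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};

// Reaches the A subobject along both inheritance paths and compares them.
bool shares_single_a(D& d) {
    A* via_b = static_cast<B*>(&d);
    A* via_c = static_cast<C*>(&d);
    return via_b == via_c;
}
```

Writing value through the B path and reading it back through the C path yields the same number, because both paths lead to the single shared A.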
Comments:
I don't think destructors come into it!
A call such as e.g. D d; d.v1(); will probably not be implemented via the vtable, as the compiler can resolve the function address at compile/link-time.
The compiler knows f's position because it put it there!
Yes, a class with multiple base classes will typically have multiple vptrs (assuming virtual functions in each base class).
Scott Meyers' "Effective C++" books explain multiple inheritance and diamonds better than I can; I'd recommend reading them for this (and many other) reasons. Consider them essential reading!

Inheritance in C++ internals

Can someone explain to me how inheritance is implemented in C++?
Does the base class actually get copied to that location, or does the derived class just refer to that location?
What happens if a function in the base class is overridden in the derived class? Does the new function replace it, or is it copied to another location in the derived class's memory?
First of all, you need to understand that C++ is quite different from e.g. Java, because there is no notion of a "Class" retained at runtime. All OO features are compiled down to things which could also be achieved in plain C or assembler.
Having said this, what actually happens is that the compiler generates a kind of struct whenever you use your class definition. And when you invoke a "method" on your object, the compiler actually just encodes a call to a function which resides somewhere in the generated executable.
Now, if your class inherits from another class, the compiler somehow includes the fields of the base class in the struct it uses for the derived class. E.g. it could place these fields at the front and place the fields belonging to the derived class after them. Please note: you must not make any assumptions regarding the concrete memory layout the C++ compiler uses. If you do so, you're basically on your own and lose any portability.
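A small sketch of the "base fields are part of the derived object" idea, using only what the language guarantees: a Derived object can be used wherever a Base is expected, and it holds storage for the fields of both. The names Base, Derived, sum_fields and read_x are hypothetical, and nothing here depends on where exactly the compiler places each field.

```cpp
struct Base {
    int x = 0;
};

struct Derived : Base {
    int y = 0;
};

// The derived object holds both the inherited field and its own.
int sum_fields(const Derived& d) { return d.x + d.y; }

// A Derived can be viewed as a Base: the Base part lives inside it.
int read_x(const Base& b) { return b.x; }
```

In practice sizeof(Derived) is larger than sizeof(Base) here, because the object has to store both integers.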
How is inheritance implemented? Well, it depends!
If you use a normal function, then the compiler will use the concrete type it has figured out and just encode a jump to the right function.
If you use a virtual function, the compiler will generate a vtable and generate code to look up a function pointer from that vtable, depending on the run-time type of the object.
This distinction is very important in practice. Note that it is not true that inheritance is always implemented through a vtable in C++ (this is a common gotcha). Only if you mark a certain member function as virtual (or have done so for the same member function in a base class) will you get a call which is directed to the right function at run time. Because of this, a virtual function call tends to be slower than a non-virtual call: it costs an extra indirection through the vtable and usually cannot be inlined, although the actual overhead depends on the compiler and the hardware.
Inheritance in C++ is often accomplished via the vtable. The linked Wikipedia article is a good starting point for your questions. If I went into more detail in this answer, it would essentially be a regurgitation of it.

DECLSPEC_NOVTABLE on pure virtual classes?

This is probably habitual programming redundancy. I have noticed DECLSPEC_NOVTABLE ( __declspec(novtable) ) on a bunch of interfaces defined in headers:
struct DECLSPEC_NOVTABLE IStuff : public IObject
{
    virtual void method1() = 0;
    virtual void method2() = 0;
};
The MSDN article on this __declspec extended attribute says that adding this guy will remove the constructor and destructor vtable entries and thus result in "significant code size reduction" (because the vtable will be removed entirely).
This just doesn't make much sense to me. These guys are pure virtual, why wouldn't the compiler just do this by default?
The article also says that if you do this, and then try and instantiate one of these things, you will get a run time access violation. But when I tried this with a few compilers (with or without the __declspec extension), they don't compile (as I would have expected).
So I guess to summarize:
Does the compiler strip out the vtable regardless for pure virtual interfaces, or have I missed something fundamental here?
What is the MSDN article talking about ?
The compiler strips out the only reference to the vtable, which would have been during construction of the class. Therefore, the linker can optimize it away since there is no longer a reference in the code to it.
Also by the way, I have made a habit of declaring an empty constructor as protected, and also using Microsoft's extension abstract keyword, to avoid that access violation at runtime. This way, the compiler catches the problem at compile time instead (since only a base class can instantiate the interface through the protected constructor). The derived class will of course fill in the vtable during its construction.
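A hedged sketch of how such an interface might be declared so that it also builds on other compilers. __declspec(novtable) is the Microsoft attribute discussed above; the macro name INTERFACE_NOVTABLE, the class Stuff and the helper sum_through_interface are invented for this example, and the protected constructor follows the habit described in this answer.

```cpp
// Expands to the Microsoft attribute where it is available, to nothing elsewhere.
#if defined(_MSC_VER)
#  define INTERFACE_NOVTABLE __declspec(novtable)
#else
#  define INTERFACE_NOVTABLE
#endif

struct INTERFACE_NOVTABLE IStuff {
    virtual int method1() = 0;
    virtual int method2() = 0;

protected:
    // Only derived classes can construct or destroy the interface part.
    IStuff() = default;
    ~IStuff() = default;
};

// The concrete class sets up the vtable pointer in its own constructor.
struct Stuff final : IStuff {
    int method1() override { return 1; }
    int method2() override { return 2; }
};

// Calls go through the interface; the implementation comes from Stuff.
int sum_through_interface(IStuff& s) { return s.method1() + s.method2(); }
```

Because the interface's constructor and destructor are protected, code outside the hierarchy cannot create or destroy an IStuff directly, regardless of the attribute.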
It's a bit of handholding for a dumb compiler/linker. The compiler should not insert any reference to this vtable, as it is quite obvious that there is no need for this vtable. The compiler could also mark the reference in such a way that the linker can eliminate the vtable, but that's more complex of course.