How is inheritance implemented at the memory level? - c++

Suppose I have
class A { public: void print(){cout<<"A"; }};
class B: public A { public: void print(){cout<<"B"; }};
class C: public A { };
How is inheritance implemented at the memory level?
Does C copy the print() code into itself, or does it hold a pointer to code that lives somewhere in the A part of the program?
How does the same thing happen when we override the previous definition, for example in B (at the memory level)?

Compilers are allowed to implement this however they choose. But they generally follow CFront's old implementation.
For classes/objects without inheritance
Consider:
#include <iostream>
class A {
public:
void foo()
{
std::cout << "foo\n";
}
static int bar()
{
return 42;
}
};
A a;
a.foo();
A::bar();
The compiler changes those last three lines into something similar to:
struct A a = <compiler-generated constructor>;
A_foo(&a); // &a is passed as the hidden "this" pointer. As far as the assembly
// code is concerned there are no objects; member functions (i.e., methods)
// are simply functions that take a hidden this pointer.
A_bar(); // since bar() is static, there is no need to pass the this pointer
Once upon a time I would have guessed that this was handled with pointers-to-functions in each A object created. However, that approach would mean that every A object would contain identical information (pointer to the same function) which would waste a lot of space. It's easy enough for the compiler to take care of these details.
For classes/objects with non-virtual inheritance
Of course, that wasn't really what you asked. But we can extend this to inheritance, and it's what you'd expect:
class B : public A {
public:
void blarg()
{
// who knows, something goes here
}
int bar()
{
return 5;
}
};
B b;
b.blarg();
b.foo();
b.bar();
The compiler turns the last four lines into something like:
struct B b = <compiler-generated constructor>;
B_blarg(&b);
A_foo(&b.A_portion_of_object);
B_bar(&b);
Notes on virtual methods
Things get a little trickier when you talk about virtual methods. In that case, each class gets a class-specific array of pointers-to-functions, one such pointer for each virtual function. This array is called the vtable ("virtual table"), and each object created has a pointer to the relevant vtable. Calls to virtual functions are resolved by looking up the correct function to call in the vtable.

Check out the C++ ABI for any questions regarding the in-memory layout of things. It's labelled "Itanium C++ ABI", but it has become the de facto standard ABI for GCC, Clang, and most other compilers outside of Windows; Microsoft's compiler uses its own ABI.

I don't think the standard makes any guarantees. Compilers can choose to make multiple copies of functions, combine copies that happen to access the same memory offsets on totally different types, etc. Inlining is just one of the more obvious cases of this.
But most compilers will not generate a copy of the code for A::print to use when called through a C instance. There may be a pointer to A in the compiler's internal symbol table for C, but at runtime you're most likely going to see that:
A a; C c; a.print(); c.print();
has turned into something much along the lines of:
A a;
C c;
ECX = &a; /* set up 'this' pointer */
call A::print;
ECX = up_cast<A*>(&c); /* set up 'this' pointer */
call A::print;
with both call instructions jumping to the exact same address in code memory.
Of course, since you've asked the compiler to inline A::print, the code will most likely be copied to every call site (but since it replaces the call A::print, it's not actually adding much to the program size).

No information describing a member function is stored in an object.
aobject.print();
bobject.print();
cobject.print();
The compiler simply converts the statements above into direct calls to the appropriate print function; essentially nothing is stored in the object.
The pseudo-assembly will look something like this:
00B5A2C3 call print(006de180)
Since print is a member function, it takes an additional parameter: the this pointer. It is passed just like any other argument to the function.

In your example here, there's no copying of anything. Generally an object doesn't know what class it's in at runtime -- what happens is, when the program is compiled, the compiler says "hey, this variable is of type C, let's see if there's a C::print(). No, ok, how about A::print()? Yes? Ok, call that!"
Virtual methods work differently, in that pointers to the right functions are stored in a "vtable"* referenced in the object. That still doesn't matter if you're working directly with a C, cause it still follows the steps above. But for pointers, it might say like "Oh, C::print()? The address is the first entry in the vtable." and the compiler inserts instructions to grab that address at runtime and call to it.
* Technically, this is not required to be true. I'm pretty sure you won't find any mention in the standard of "vtables"; it's by definition implementation-specific. It just happens to be the method the first C++ compilers used, and happens to work better all-around than other methods, so it's the one nearly every C++ compiler in existence uses.

Related

pointer to access member function through virtual pointer

I came across articles that explain the vptr and the vtable.
I know that, for a class with virtual functions, the first pointer stored in an object is the vptr to the vtable, and that the vtable's entries are pointers to the functions in the same order as they are declared in the class (which I have verified with my test program).
But I am trying to understand what code the compiler must generate in order to call the appropriate function.
Example:
class Base
{
virtual void func1()
{
cout << "Called me" << endl;
}
};
int main()
{
Base obj;
Base *ptr;
ptr=&obj;
// void* is not needed. func1 can be accessed directly with obj or ptr using vptr/vtable
void* ptrVoid=ptr;
// I can call the first virtual function in the following way (non-portable:
// this assumes 32-bit pointers and that the vptr is the first word of the object):
void (*firstfunc)()=(void (*)(void))(*(int*)*(int*)ptrVoid);
firstfunc();
}
Questions:
1. What I am really trying to understand is how the compiler replaces the call to ptr->func1() with a lookup through the vptr.
If I were to simulate the call, what should I do? Should I overload the -> operator? Even that would not help, as I would not know what the name func1 really is. Even if the compiler accesses the vtable through the vptr, how does it know that the entry for func1 is the first element of the array and the entry for func2 is the second? There must be some mapping from function names to array elements.
2. How can I simulate it? Can you provide the actual syntax that the compiler uses to call func1 (how does it replace ptr->func1())?
Don't think of a vtable as an array. It's only an array if you strip it of everything C++ knows about it other than the size of its members. Instead, think of it as a second struct whose members are all pointers to functions.
Suppose I have a class like this:
struct Foo {
virtual void bar();
virtual int baz(int qux);
int quz;
};
int callSomeFun(Foo* foo) {
foo->bar();
return foo->baz(2);
}
Breaking it down 1 step:
struct Foo;
// adding Foo* parameter to simulate the this pointer, which
// in the above would be a pointer to foo.
struct FooVtable {
void (*bar)(Foo* foo);
int (*baz)(Foo* foo, int qux);
};
struct Foo {
FooVtable* vptr;
int quz;
};
int callSomeFun(Foo* foo) {
foo->vptr->bar(foo);
return foo->vptr->baz(foo, 2);
}
I hope that's what you're looking for.
The background:
After compilation (without debug info), C/C++ binaries contain no names, and names aren't required at runtime; it's only machine code.
You can think of the vptr as a classic C function pointer, in the sense that the type, argument list, etc. are known.
It isn't important at which positions func1, func2, etc. are placed; the only requirement is that the order is always the same (so all parts of a multi-file C++ program must be compiled the same way, with the same compiler settings, etc.). Let's imagine the position follows declaration order: FIRST the parent class's virtuals, then the ones newly declared in the derived class, BUT reimplemented virtuals stay at the lower positions they had in the parent.
That is only a mental picture. The implementation must correctly dispatch to overrides for calls such as classAPointer->methodReimplementedInB().
C++ compilers usually have or had (my knowledge dates from the years of the 16/32-bit migration) 2-4 options to optimize vtables for speed, size, etc. The classic C sizeof() was fairly easy to understand (size of the data plus possible alignment); in C++ sizeof is bigger, and by how much (2, 4, or 8 bytes) depends on the implementation.
A few conversion tools can convert "object" files, e.g. from MS format to Borland, but usually only classic C was possible/safe, because of the unknown machine-code implementations of the vtable.
It is hard to touch the vtable from high-level code; use analysers for intermediate files (.obj, etc.) instead.
EDIT: story about runtime is different than about compilation. My answer is about compiled code & runtime
EDIT2: quasi-assembler code (from memory)
load ax, 2
call vt[ax]
vt:
0x123456
0x126785 // virtual parent func1()
derived:
vt:
0x123456
0x126999 // overridden func1()
0x456788 // new method
EDIT3: BTW I can't totally agree that C++ is always faster than the JVM/.NET because "these are interpreted". C++ has its own share of "interpretation", and the interpreted part is growing: real component/GUI frameworks have interpreted connections between components too (a map, for example). Outside the scope of this discussion: which memory model is better, C++ delete or GC?

get the real address(or index in vTable) of virtual member function

In c++ is there any way to get the real address of member function, or the index in vTable ?
Updated:
I don't know the INDEX in vTable and
I don't know the address
Here's why I want to know this:
I want to hook the function ID3DXFont->DrawText of DirectX. If I know the index of DrawText in the vTable, I can replace the entry to install the hook. But how do I get the index? If it's possible to get the real address, I can search for it in the vTable to get the index.
And not particularly ID3DXFont->DrawText, maybe some other functions in the future, so I'm trying to write a generic hook function.
Here's what I've tried so far:
#include <iostream>
#include <windows.h> // for DWORD
using namespace std;
struct cls {
virtual int fn1() {
cout << "fn1 called" << endl;
return 1;
}
virtual int fn2() {
cout << "fn2 called" << endl;
return 2;
}
};
template <typename fn_t>
DWORD fn_to_addr(fn_t fn) { // convert function to DWORD for printing
union U {
fn_t fn;
DWORD addr;
};
U u;
u.fn = fn;
return u.addr;
}
int main() {
cls c;
DWORD addr = fn_to_addr(&cls::fn2);
cout << hex << addr << endl;
}
In debug mode, the code above outputs the address of a jump table.
In release mode, &cls::fn2 returns 0x00401058, which points to some optimized code:
00401058 . mov eax, dword ptr [ecx] // get vptr
0040105A . jmp dword ptr [eax+4] // jmp to the second function (fn2)
Neither is the real address. Is there any way to get it?
Thanks.
Don't give up so easily!
While the other answers are correct in saying that the C++ language doesn't allow you to do this in a portable way, there's an important factor in your particular case that may make this a more reasonable thing to do.
The key is that ID3DXFont is a COM interface and the exact binary details of how those work are specified separately from the language used to access them. So while C++ doesn't say what you'll find at the other end of that pointer, COM does say that there's a vtable there with an array of function pointers in a specified order and with a specified calling convention. This allows me to tell you that the index of the DrawText function is 14 (DrawTextA) or 15 (DrawTextW) and that this will still be true in Visual C++ 28.0 many years from now. Or in GCC 8.3.1 for that matter: since COM is a binary interface specification, all compilers are supposed to implement it the same way (if they claim to support COM).
Have a look at the second link below for a ready-made implementation of COM function hooking using two different methods. Approach#2 is the closest to what you're asking for but I think you may want to consider the first one instead because it involves less voodoo.
Sources:
[http://msdn.microsoft.com/en-us/library/ms680573(v=vs.85).aspx]
[http://www.codeproject.com/Articles/153096/Intercepting-Calls-to-COM-Interfaces]
[http://goodrender.googlecode.com/svn/trunk/include/d3dx9core.h]
There's nothing anywhere near portable. Your attempt using
&cls::fn2 can't work, since the results must work in cases
like (pCls->*fn)() even when pCls points to a derived class
which overrides the function. (Pointers to member functions are
complicated beasts, which identify whether the function is
virtual or not, and provide different information depending on
this. And if you're experimenting with MSC, be aware that you
have to specify /vmg for pointers to member functions to work
correctly.)
Even for a given implementation, you need an instance of the
correct type. Given that, if you know the class layout, and
the layout of the virtual function table, you can track it down.
Typically, the pointer to the virtual function table is the
first word in the class, although this is not guaranteed. And
usually, the functions will appear in the order they are
declared. Along with additional information, however, like
pointers to the RTTI, and possibly offset information required
to fix up the this pointer when calling the function (although
many compilers will use a trampoline for this). For 64 bit g++
under Windows (CygWin version):
struct C
{
virtual ~C() {}
virtual void fn1() const { std::cout << "In C::fn1\n"; }
virtual void fn2() const {}
};
void const*
fn1ToAddr( C const* pC )
{
void const* const* vPtr = *reinterpret_cast<void const* const* const*>(pC);
return vPtr[2];
}
fn1ToAddr returns the address of fn1 for the object passed
to it; if the object is of type C, it returns the address of
C::fn1, and if it is of a derived type which overrides fn1,
it returns the address of the overriding function.
Whether this works all of the time or not, I cannot say; I think
g++ uses trampolines in cases of multiple inheritance, for
example (in which case, the returned address would be the
address of the trampoline). And it might not work in the next
major release of g++. (For the version of MSC I have at hand,
replacing the index 2 with 1 seems to work. But again,
I only tried very simple cases. There are absolutely no
guarantees.)
Basically, you would never want to do anything like this in
production code. It can be useful, however, if you're trying to
understand how the compiler works.
EDIT:
Re your edit with the why? Just because you have the address
(maybe), it doesn't mean that you can call the function. You
cannot call a member function without an object, and depending
on any number of things, you may not be able to pass the
function the object. (With MSC, for example, the object will
usually be passed in ECX.)
As mentioned on this wiki page:
Whenever a class defines a virtual function (or method), most
compilers add a hidden member variable to the class which points to a
so-called virtual method table (VMT or Vtable). This VMT is basically
an array of pointers to (virtual) functions.
As far as I know, the language gives you no portable access to the vtable; its layout and number of entries are internal details of the compiler.

Can subclass inline a pure virtual method that is not inline in the base?

As I understand it, the compiler can inline a virtual function call when it knows at compile time what the type of the object will be at runtime (C++ faq).
What happens, however, when one is implementing a pure virtual method from a base class? Do the same rules apply? Will the following function call be inlined?
class base
{
public:
virtual void print() = 0;
virtual void callPrint()
{
print(); // will this be inline?
}
};
class child : public base
{
public:
void print() { cout << "hello\n"; }
};
int main()
{
child c;
c.callPrint();
return 0;
}
EDIT:
I think my original example code was actually a poor representation of what I wanted to ask. I've updated the code, but the question remains the same.
The compiler is never required to inline a function call. In this case, it is permitted to inline the function call, because it knows the concrete type of c (since it is not indirected through a pointer or reference, the compiler can see where it was allocated as a child). As such, the compiler knows which implementation of print() is used, and can choose not to perform vtable indirection, and further choose to inline the implementation of the function.
However, the compiler is also free to not inline it; it might insert a direct call to child::print(), or indirect through the vtable, if it decides to do so.
These optimizations in general boil down to the 'as-if' rule - the compiler must behave as-if it was doing a full vtable indirect - this means that the result must be the same, but the compiler can choose a different method of achieving the result if the result is the same. This includes inlining, etc.
The answer is of course "it depends", but in principle there's no obstruction to optimization. In fact, you're not even doing anything polymorphic here, so this is really straight-forward.
The question would be more interesting if you had code like this:
child c;
base & b = c;
b.print();
The point is that the compiler knows at this point what the ultimate target of the dynamic dispatch will be (namely child::print()), so this is eligible for optimization. (There are two separate opportunities for optimization, of course: one by avoiding the dynamic dispatch, and one coming from having the function body of the target visible in the TU.)
There are only a couple of rules you should be aware of:
1) The compiler is never forced to inline - even using the directive or defining a method in the header.
2) Polymorphism MUST ALWAYS WORK. This means that the compiler will prefer calling the function via the vftable rather than inlining it when the possibility of dynamic calls exists.

LTO, Devirtualization, and Virtual Tables

Comparing virtual functions in C++ and virtual tables in C, do compilers in general (and for sufficiently large projects) do as good a job at devirtualization?
Naively, it seems like virtual functions in C++ have slightly more semantics, thus may be easier to devirtualize.
Update: Mooing Duck mentioned inlining devirtualized functions. A quick check shows missed optimizations with virtual tables:
#include <stdio.h>
struct vtab {
int (*f)();
};
struct obj {
struct vtab *vtab;
int data;
};
int f()
{
return 5;
}
int main()
{
struct vtab vtab = {f};
struct obj obj = {&vtab, 10};
printf("%d\n", obj.vtab->f());
}
My GCC will not inline f, although it is called directly, i.e., devirtualized. The equivalent in C++,
class A
{
public:
virtual int f() = 0;
};
class B : public A
{
public:
int f() {return 5;}
};
int main()
{
B b;
printf("%d\n", b.f());
}
does inline f. So there's a first difference between C and C++, although I don't think that the added semantics in the C++ version are relevant in this case.
Update 2: In order to devirtualize in C, the compiler has to prove that the function pointer in the virtual table has a certain value. In order to devirtualize in C++, the compiler has to prove that the object is an instance of a particular class. It would seem that the proof is harder in the first case. However, virtual tables are typically modified in only very few places, and most importantly: just because it looks harder, doesn't mean that compilers aren't as good in it (for otherwise you might argue that xoring is generally faster than adding two integers).
The difference is that in C++, the compiler can guarantee that the virtual table address never changes. In C then it's just another pointer and you could wreak any kind of havoc with it.
"However, virtual tables are typically modified in only very few places"
The compiler doesn't know that in C. In C++, it can assume that it never changes.
I tried to summarize in http://hubicka.blogspot.ca/2014/01/devirtualization-in-c-part-2-low-level.html why generic optimizations have a hard time devirtualizing. Your test case gets inlined for me with GCC 4.8.1, but in a slightly less trivial test case, where you pass a pointer to your "object" out of main, it will not.
The reason is that, to prove that the virtual table pointer in obj and the virtual table itself did not change, the alias analysis module has to track all possible places that can point to them. In non-trivial code where you pass things outside of the current compilation unit, this is often a lost game.
C++ gives you more information on when the type of an object may change and when it is known. GCC makes use of it and it will make a lot more use of it in the next release. (I will write on that soon, too).
Yes, if it is possible for the compiler to deduce the exact type of a virtualized type, it can "devirtualize" (or even inline!) the call. A compiler can only do this if it can guarantee that no matter what, this is the function needed.
The major concern is basically threading. In the C++ example, the guarantees hold even in a threaded environment. In C, that can't be guaranteed, because the object could be grabbed by another thread/process, and overwritten (deliberately or otherwise), so the function is never "devirtualized" or called directly. In C the lookup will always be there.
#include <iostream>
struct A {
virtual void func() {std::cout << "A";}
};
struct B : A {
virtual void func() {std::cout << "B";}
};
int main() {
B b;
b.func(); // this will inline in optimized builds.
}
It depends on what you are comparing compiler inlining to. Compared to link time or profile guided or just in time optimizations, compilers have less information to use. With less information, the compile time optimizations will be more conservative (and do less inlining overall).
A compiler will still generally be pretty decent at inlining virtual functions as it is equivalent to inlining function pointer calls (say, when you pass a free function to an STL algorithm function like sort or for_each).

Why C++ virtual function defined in header may not be compiled and linked in vtable?

The situation is the following. I have a shared library which contains a class definition:
class QueueClass : public IClassInterface
{
virtual void LOL() { /* do some magic */ }
};
My shared library initializes a global instance:
QueueClass *globalMember = new QueueClass();
My shared library exports a C function which returns a pointer to globalMember:
void * getGlobalMember(void) { return globalMember;}
My application uses globalMember like this:
((IClassInterface*)getGlobalMember())->LOL();
Now the really strange part: if I do not reference LOL from within the shared library, then LOL is not linked in, and calling it from the application raises an exception. The reason: the vtable contains null in place of the pointer to the LOL() function.
When I move the LOL() definition from the .h file to a .cpp file, it suddenly appears in the vtable and everything works just fine.
What explains this behavior? (gcc compiler + ARM architecture)
The linker is the culprit here. When a function is inline it has multiple definitions, one in each cpp file where it is referenced. If your code never references the function it is never generated.
However, the vtable layout is determined at compile time with the class definition. The compiler can easily tell that the LOL() is a virtual function and needs to have an entry in the vtable.
When it gets to link time for the app it tries to fill in all the values of the QueueClass::_VTABLE but doesn't find a definition of LOL() and leaves it blank(null).
The solution is to reference LOL() in a file in the shared library. Something as simple as &QueueClass::LOL;. You may need to assign it to a throw away variable to get the compiler to stop complaining about statements with no effect.
I disagree with @sechastain.
Inlining is far from being automatic. Whether or not the method is defined in place or a hint (inline keyword or __forceinline) is used, the compiler is the only one to decide if the inlining will actually take place, and uses complicated heuristics to do so. One particular case however, is that it shall not inline a call when a virtual method is invoked using runtime dispatch, precisely because runtime dispatch and inlining are not compatible.
To understand the precision of "using runtime dispatch":
QueueClass* i = /**/;
i->LOL(); // runtime dispatch
i->QueueClass::LOL(); // compile time dispatch, inline is possible
@0xDEAD BEEF: I find your design brittle to say the least.
The use of C-Style casts here is wrong:
QueueClass* p = /**/;
IClassInterface* q = p;
assert( ((void*)p) == ((void*)q) ); // may fire or not...
Fundamentally there is no guarantee that the 2 addresses are equal: it is implementation defined, and unlikely to resist change.
If you wish to be able to safely cast the void* pointer to an IClassInterface* pointer, then you need to create it from an IClassInterface* originally, so that the C++ compiler may perform the correct pointer arithmetic depending on the layout of the objects.
Of course, I shall also underline that the use of global variables... you probably know it.
As for the reason for the absence? I honestly don't see any apart from a bug in the compiler/linker. I've seen inlined definitions of virtual functions a few times (more specifically, the clone method) and it never caused issues.
EDIT: Since "correct pointer arithmetic" was not so well understood, here is an example
#include <iostream>
struct Base1 { char mDum1; };
struct Base2 { char mDum2; };
struct Derived: Base1, Base2 {};
int main(int argc, char* argv[])
{
Derived d;
Base1* b1 = &d;
Base2* b2 = &d;
std::cout << "Base1: " << b1
<< "\nBase2: " << b2
<< "\nDerived: " << &d << std::endl;
return 0;
}
And here is what was printed:
Base1: 0x7fbfffee60
Base2: 0x7fbfffee61
Derived: 0x7fbfffee60
Note the difference between the value of b2 and &d, even though they refer to one entity. This can be understood if one thinks of the memory layout of the object.
Derived
Base1 Base2
+-------+-------+
| mDum1 | mDum2 |
+-------+-------+
When converting from Derived* to Base2*, the compiler will perform the necessary adjustment (here, increment the pointer address by one byte) so that the pointer ends up effectively pointing to the Base2 part of Derived and not to the Base1 part mistakenly interpreted as a Base2 object (which would be nasty).
This is why using C-Style casts is to be avoided when downcasting. Here, if you have a Base2 pointer you can't reinterpret it as a Derived pointer. Instead, you will have to use static_cast<Derived*>(b2), which will decrement the pointer by one byte so that it correctly points to the beginning of the Derived object.
Manipulating pointers is usually referred to as pointer arithmetic. Here the compiler will automatically perform the correct adjustment... on the condition that it is aware of the type.
Unfortunately the compiler cannot perform these adjustments when converting from a void*, so it is up to the developer to make sure this is handled correctly. The simple rule of thumb is the following: T* -> void* -> T* with the same type appearing on both sides.
Therefore, you should (simply) correct your code by declaring IClassInterface* globalMember, and you would not have any portability issue. You'll probably still have maintenance issues, but that's the problem of using C with OO code: C is not aware of any object-oriented stuff going on.
My guess is that GCC is taking the opportunity to inline the call to LOL. I'll see if I can find a reference for you on this...
I see sechastain beat me to a more thorough description and I could not google up the reference I was looking for. So I'll leave it at that.
Functions defined in header files are in-lined on usage. They're not compiled as part of the library; instead where the call is made, the code of the function simply replaces the code of the call, and that is what gets compiled.
So, I'm not surprised to see that you are not finding a v-table entry (what would it point to?), and I'm not surprised to see that moving the function definition to a .cpp file suddenly makes things work. I'm a little surprised that creating an instance of the object with a call in the library makes a difference, though.
I'm not sure if it's haste on your part, but from the code provided IClassInterface does not necessarily contain LOL, only QueueClass. But you're casting to a IClassInterface pointer to make the LOL call.
If this example is simplified, and your actual inheritance tree uses multiple inheritance, this might be easily explained. When you do a typecast on an object pointer, the compiler needs to adjust the pointer so that the proper vtable is referenced. Because you're returning a void *, the compiler doesn't have the necessary information to do the adjustment.
Edit: There is no standard for C++ object layout, but for one example of how multiple inheritance might work see this article from Bjarne Stroustrup himself: http://www-plan.cs.colorado.edu/diwan/class-papers/mi.pdf
If this is indeed your problem, you might be able to fix it with one simple change:
IClassInterface *globalMember = new QueueClass();
The C++ compiler will do the necessary pointer modifications when it makes the assignment, so that the C function can return the correct pointer.