How does a C++ object access its member functions? - c++

How does a C++ object know where its member function definitions are? I am quite confused, as the object itself does not contain the function pointers.
sizeof on the object proves this.
So how is the object-to-function mapping done by the runtime environment? Where is a class's member function-pointer table maintained?

If you're calling non-virtual functions, there's no need for a function-pointer table; the compiler can resolve the function addresses at compile-time. So:
A a;
a.func();
translates to something along the lines of:
A a;
A_func(&a);
Calling a virtual function through a base-class pointer typically uses a vtable. So:
A *p_a = new B();
p_a->func();
translates to something along the lines of:
A *p_a = new B();
p_a->p_vtbl->func(p_a);
where p_vtbl is a compiler-implemented pointer to the vtable specific to the actual class of *p_a.
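To make the two translations concrete, here is a rough hand-written imitation of what a compiler might generate. It is only a sketch: Obj, VTable, A_vtbl, B_vtbl, call_static and call_virtual are made-up names, and real vtable layouts are implementation-defined.

```cpp
#include <cassert>
#include <string>

// Hand-written imitation of compiler output; every name here is hypothetical.
struct Obj;

struct VTable {
    std::string (*func)(const Obj* self);  // one slot per virtual function
};

struct Obj {
    const VTable* p_vtbl;  // the hidden pointer a compiler would add
};

// "Member functions" as free functions taking an explicit this pointer.
std::string A_func(const Obj*) { return "A::func"; }
std::string B_func(const Obj*) { return "B::func"; }

// One table per class, shared by every object of that class.
const VTable A_vtbl{&A_func};
const VTable B_vtbl{&B_func};

// Non-virtual call: the target is fixed at compile time.
std::string call_static(const Obj* a) { return A_func(a); }

// Virtual call: the target is read from the object's table at run time.
std::string call_virtual(const Obj* p) { return p->p_vtbl->func(p); }

void check_dispatch() {
    Obj a{&A_vtbl};
    Obj b{&B_vtbl};
    assert(call_static(&b) == "A::func");   // the call site decides
    assert(call_virtual(&a) == "A::func");
    assert(call_virtual(&b) == "B::func");  // the object's table decides
}
```

Note that the object only carries one pointer, no matter how many slots the table has.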

There are generally two ways that an object and its member functions are associated:
For a non-virtual function, the compiler determines the appropriate function at compile time. Non-static member functions are usually passed a hidden parameter that contains the this pointer, which takes care of the association of the object and the class member function.
For virtual functions, most compilers tend to use a lookup table that is usually referenced via the object's this pointer or a similar mechanism. This table, normally called the vtable, contains the function pointer for the virtual functions only.
As C++ is not a dynamic language, the compiler can do most of the object/function/symbol resolution at compile time with the exception of some virtual functions. In some cases, it's even possible for the compiler to determine exactly which instance of a virtual function gets called and skip the resolution via the vtable.

Member functions are not part of the object - they are defined statically, in one place, just like any other function. There is no magic look-up needed.
Virtual functions are different, but I don't think your question is about that...

For non-virtual functions there is no per-object table at all: each function exists once per class, and every call on any instance is bound to that one address at compile time. Since it's the same for all instances, there is nothing to duplicate in each of them.
For virtual functions, resolution is done at runtime, and the object will typically contain a hidden pointer to a per-class function table for them. Try that and look at your object again.
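A quick way to "look at your object again" is to compare sizeof for a class with and without virtual functions. This is a sketch with made-up class names (Plain, OneVirtual, ManyVirtual); the exact sizes are implementation-defined, and the checks below only assume what mainstream compilers do.

```cpp
#include <cassert>

struct Plain {           // no virtual functions
    int a;
    void fun() {}
};

struct OneVirtual {      // one virtual function
    int a;
    virtual void fun() {}
};

struct ManyVirtual {     // several virtual functions
    int a;
    virtual void f1() {}
    virtual void f2() {}
    virtual void f3() {}
};

void check_sizes() {
    // Only the data member takes space.
    assert(sizeof(Plain) == sizeof(int));
    // A hidden vtable pointer is added (true of mainstream compilers).
    assert(sizeof(OneVirtual) >= sizeof(int) + sizeof(void*));
    // More virtual functions grow the shared table, not the object.
    assert(sizeof(ManyVirtual) == sizeof(OneVirtual));
}
```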

Related

memory layout of C++ object

As far as I understand, the code for all member functions is created once, in separate memory, when the class is defined, and is shared by all objects; only the member variables are created individually for each object. But how is a member function executed when it is called on an object?
Where is the address of these member functions stored?
#include <iostream>

class B{
public:
int a;
void fun(){
}
};
int main(){
B b;
std::cout<<sizeof(b)<<std::endl;
}
If I execute this program, I get 4 as the output (the size of the member variable only). But calling b.fun() calls the member function correctly. How is it called without its address being stored in the object? Where are the member function addresses stored?
Is there anything like class memory layout where these addresses will be stored?
Non-virtual member functions are very much like regular non-member functions; the only difference is that a pointer to the class instance is passed as the very first argument on invocation.
This is done automatically by the compiler, so (in pseudo-code) your call b.fun() can be compiled into
B::fun(&b);
where B::fun can be seen as an ordinary function. The address of this function does not have to be stored in the actual object (all objects of this class use the same function), and thus the size of the class does not include it.
Is there anything like class memory layout where these addresses will be stored?
There is for functions declared virtual, yes. In this case, the addresses of said functions are stored in a table and looked up at runtime. This in turn allows your code to dispatch to the correct function depending on the object's type when the function is called.
Non-virtual functions do not work this way. They're stored in the same way as free (i.e. non-member) functions, with the function name prefixed by the name of the class. No storage space within the object itself is required.
In both cases, a hidden this pointer is passed to the called function. This is what 'connects' it to your object.
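If it helps, the "one function, many objects" idea can be checked directly. The following is a small sketch: B_fun is a hypothetical hand-written equivalent of the member function (the compiler does not expose such a name), and the member initializer on a is added only so the counts are well defined.

```cpp
#include <cassert>

class B {
public:
    int a = 0;
    void fun() { ++a; }   // compiled once, shared by every B
};

// Roughly what b.fun() becomes: the object is passed explicitly.
// B_fun is a hypothetical name, not something the compiler exposes.
void B_fun(B* self) { ++self->a; }

void check_shared_function() {
    B x, y;
    void (B::*pf)() = &B::fun;  // a single value identifies fun for all B objects
    (x.*pf)();                  // x supplies the hidden this
    (y.*pf)();
    (y.*pf)();
    B_fun(&x);                  // same effect, written by hand
    assert(x.a == 2);
    assert(y.a == 2);
    assert(sizeof(B) == sizeof(int));  // no function address stored in the object
}
```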

Does each object in c++ contain a different version of the class's member functions?

I was just curious: does the creation of an object in C++ allocate space for a new copy of its member functions? At the assembly or machine code level, where no classes exist, do all calls for a specific function from different objects of the same class actually refer to the same function pointer, or are there multiple function blocks in memory and therefore different pointers for each and every member function of every object instantiated from the same class?
Usually languages implement features as simply as possible.
Under the hood, class methods are just ordinary functions that take an object pointer as an argument; an object is in effect a data structure plus the functions that can operate on it.
Normally the compiler knows which function should operate on the object.
With polymorphism, however, a function may be overridden.
Then the compiler doesn't know the actual type of the object; it may be Derived1 or Derived2.
In that case the compiler adds to the object a hidden pointer to a vtable, which contains pointers to the functions that could have been overridden.
For overridable methods, the program then looks up this table to see which function should be executed.
You can see how it can be implemented by seeing how polymorphism can be implemented in C:
How can I simulate OO-style polymorphism in C?
No, it does not. Functions are class-wide. When you allocate an object in C++, it contains space for all its data members plus, if the class has virtual functions, a pointer to a vtable holding pointers to those virtual functions, whether declared in its own class or inherited from parent classes.
When you call a virtual method on that object, you essentially perform a look-up in that vtable and the appropriate method is called; non-virtual methods are called directly, without any look-up.

When are virtual functions invoked statically?

What is the performance difference between calling a virtual function from a derived class pointer directly vs from a base class pointer to the same derived class?
In the derived pointer case, will the call be statically bound, or dynamically bound? I think it'll be dynamically bound because there's no guarantee the derived pointer isn't actually pointing to a further derived class. Would the situation change if I have the derived class directly by value (not through pointer or reference)? So the 3 cases:
base pointer to derived
derived pointer to derived
derived by value
I'm concerned about performance because the code will be run on a microcontroller.
Demonstrating code
struct Base {
// virtual destructor left out for brevity
virtual void method() = 0;
};
struct Derived : public Base {
// implementation here
void method() {
}
};
// ... in source file
// call virtual method from base class pointer, guaranteed vtable lookup
Base* base = new Derived;
base->method();
// call virtual method from derived class pointer, any difference?
Derived* derived = new Derived;
derived->method();
// call virtual method from derived class value
Derived derivedValue;
derivedValue.method();
In theory, the only C++ syntax that makes a difference is a member function call that uses qualified member name. In terms of your class definitions that would be
derived->Derived::method();
This call ignores the dynamic type of the object and goes directly to Derived::method(), i.e. it's bound statically. This is only possible for calling methods declared in the class itself or in one of its ancestor classes.
Everything else is a regular virtual function call, which is resolved in accordance with the dynamic type of the object used in the call, i.e. it is bound dynamically.
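The difference between the two kinds of call shows up once a further-derived class exists. In this sketch MoreDerived is a hypothetical class added for illustration, and method() returns a string (unlike the void in the question) so the result can be inspected.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string method() = 0;
};

struct Derived : Base {
    std::string method() override { return "Derived"; }
};

// Hypothetical further-derived class, not part of the question's code.
struct MoreDerived : Derived {
    std::string method() override { return "MoreDerived"; }
};

// Ordinary call: resolved by the dynamic type of *d.
std::string call_dynamic(Derived* d) { return d->method(); }

// Qualified call: always Derived::method, whatever *d really is.
std::string call_qualified(Derived* d) { return d->Derived::method(); }

void check_qualified_call() {
    MoreDerived m;
    assert(call_dynamic(&m) == "MoreDerived");
    assert(call_qualified(&m) == "Derived");
}
```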
In practice, compilers will strive to optimize the code and replace dynamically-bound calls with statically-bound calls in contexts where the dynamic type of the object is known at compile time. For example
Derived derivedValue;
derivedValue.method();
will typically produce a statically-bound call in virtually every modern compiler, even though the language specification does not provide any special treatment for this situation.
Also, virtual method calls made directly from constructors and destructors are typically compiled into statically-bound calls.
Of course, a smart compiler might be able to bind the call statically in a much greater variety of contexts. For example, both
Base* base = new Derived;
base->method();
and
Derived* derived = new Derived;
derived->method();
can be seen by the compiler as trivial situations that easily allow for statically-bound calls.
Virtual functions must be compiled to work as if they were always called virtually. If your compiler compiles a virtual call as a static call, that's an optimization that must satisfy this as-if rule.
From this, it follows that the compiler must be able to prove the exact type of the object in question. And there are some valid ways in which it can do this:
If the compiler sees the creation of the object (the new expression or the automatic variable from which the address is taken) and can prove that that creation is actually the source of the current pointer value, that gives it the precise dynamic type it needs. All your examples fall into this category.
While a constructor runs, the type of the object is exactly the class containing the running constructor. So any virtual function call made in a constructor can be resolved statically.
Likewise, while a destructor runs, the type of the object is exactly the class containing the running destructor. Again, any virtual function call can be resolved statically.
Afaik, these are all the cases that allow the compiler to convert a dynamic dispatch into a static call.
All of these are optimizations, though; the compiler may decide to perform the runtime vtable lookup anyway. But good optimizing compilers should be able to detect all three cases.
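The constructor case is observable from the program itself, independent of any optimization. A minimal sketch (the member seen_in_ctor and the function name() are invented for the illustration; the base version is deliberately not pure, since calling a pure virtual from a constructor is undefined behavior):

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;
    // While this constructor runs, the object's type is exactly Base,
    // so name() resolves to Base::name even when a Derived is being built.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

void check_ctor_dispatch() {
    Derived d;
    Base& r = d;
    assert(d.seen_in_ctor == "Base");  // call made inside Base::Base()
    assert(r.name() == "Derived");     // normal dynamic dispatch afterwards
}
```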
There should be no difference between the first two cases, since the very idea of virtual functions is to always call the actual implementation. Leaving compiler optimisations aside (which in theory could optimise all virtual function calls away if you construct the object in the same compilation unit and there is no way the pointer can be altered in between), the second call must be implemented as an indirect (virtual) call as well, since there could be a third class inheriting from Derived and implementing that function as well. I would assume that the third call will not be virtual, since the compiler knows the actual type already at compile time. Actually you could make sure of this by not defining the function as virtual, if you know you will always do the call on the derived class directly.
For really lightweight code running on a small microcontroller I would recommend avoiding defining functions as virtual at all. Usually there is no runtime abstraction required. If you write a library and need some kind of abstraction, you can maybe work with templates instead (which give you some compile-time abstraction).
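As an illustration of the template suggestion, here is a sketch of compile-time abstraction. Led, Buzzer and toggle_twice are hypothetical names; the point is only that each call is bound at compile time and neither struct carries a vtable pointer.

```cpp
#include <cassert>

// Two unrelated device types with the same member function name.
struct Led {
    int state = 0;
    void toggle() { state ^= 1; }
};

struct Buzzer {
    int beeps = 0;
    void toggle() { ++beeps; }
};

// The template is instantiated once per device type; each d.toggle()
// is an ordinary, statically bound call.
template <typename Device>
void toggle_twice(Device& d) {
    d.toggle();
    d.toggle();
}

void check_templates() {
    Led led;
    Buzzer buzzer;
    toggle_twice(led);
    toggle_twice(buzzer);
    assert(led.state == 0);     // toggled on, then off
    assert(buzzer.beeps == 2);
    assert(sizeof(Led) == sizeof(int));  // no hidden pointer in the object
}
```

The trade-off is that the concrete type must be known where the template is used, so this does not replace a heterogeneous container of base pointers.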
At least on PC CPUs I often find virtual calls one of the most expensive indirections you can have (probably because branch prediction is more difficult). Sometimes one can also transform the indirection to the data level, e.g. you keep one generic function which operates on different data which is indirected with pointers to the actual implementation. Of course this will work only in some very specific cases.
At run-time.
BUT: Performance as compared to what? It isn't valid to compare a virtual function call to a non-virtual function call. You need to compare it to a non-virtual function call plus an if, a switch, an indirection, or some other means of providing the same function. If the function doesn't embody a choice among implementations, i.e. doesn't need to be virtual, don't make it virtual.
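To show what a fair comparison looks like, here are the same two-way choice written once with a tag and a switch and once with a virtual function. All names are hypothetical, and the sketch says nothing about which variant is faster on a given target; that has to be measured.

```cpp
#include <cassert>

// Variant 1: hand-rolled dispatch with a type tag and a switch.
enum class Kind { Triangle, Square };
struct TaggedShape { Kind kind; };

int sides_switch(const TaggedShape& s) {
    switch (s.kind) {
        case Kind::Triangle: return 3;
        case Kind::Square:   return 4;
    }
    return 0;  // not reached for valid tags
}

// Variant 2: the same choice expressed with a virtual function.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape { int sides() const override { return 3; } };
struct Square   : Shape { int sides() const override { return 4; } };

int sides_virtual(const Shape& s) { return s.sides(); }

void check_same_result() {
    Triangle tr;
    Square sq;
    assert(sides_switch(TaggedShape{Kind::Triangle}) == sides_virtual(tr));
    assert(sides_switch(TaggedShape{Kind::Square}) == sides_virtual(sq));
}
```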

Are methods duplicated in memory for every instance of an object? If so, can this be avoided?

Say I have an object that exists in high quantity, stores little data about itself, but requires several larger functions to act upon itself.
class Foo
{
public:
bool is_dead();
private:
float x, y, z;
bool dead;
void check_self();
void update_self();
void question_self();
};
What behavior can I expect from the compiler - would every new Foo object cause duplicates of its methods to be copied into memory?
If yes, what are good options for managing class-specific (private-like) functions while avoiding duplication?
If not, could you elaborate on this a little?
C++ methods are simply functions (with a convention about this which often becomes the implicit first argument).
Functions are mostly machine code, starting at some specific address. The start address is all that is needed to call the function.
So objects (or their vtable) need at most the address of called functions.
Of course a function takes some place (in the text segment).
But an object won't need extra space for that function. If the function is not virtual, no extra space per object is needed. If the function is virtual, the object has a single vtable pointer (the vtable itself exists once per polymorphic class). Generally, each object has, as its first field, the pointer to the vtable. This means 8 bytes per object on x86-64/Linux. Each object (assuming single inheritance) has one vtable pointer, independently of the number or of the code size of the virtual functions.
If you have multiple, perhaps virtual, inheritance with virtual methods in several superclasses you'll need several vtable pointers per instance.
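The effect of multiple inheritance on object size can be observed with sizeof. This is a sketch with invented class names (Walker, Swimmer, Duck, Dog); the standard does not fix these sizes, and the checks assume the layout used by mainstream ABIs.

```cpp
#include <cassert>

struct Walker {
    virtual ~Walker() = default;
    virtual void walk() {}
};

struct Swimmer {
    virtual ~Swimmer() = default;
    virtual void swim() {}
};

// Single inheritance: one vtable pointer per object.
struct Dog : Walker {
    void walk() override {}
};

// Two polymorphic bases: typically one vtable pointer per base subobject.
struct Duck : Walker, Swimmer {
    void walk() override {}
    void swim() override {}
};

void check_multiple_inheritance() {
    assert(sizeof(Dog) >= sizeof(void*));
    assert(sizeof(Duck) >= 2 * sizeof(void*));
    assert(sizeof(Duck) > sizeof(Dog));
}
```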
So for your Foo example, there is no virtual function (and no superclass containing some of them), so instances of Foo contain no vtable pointer.
If you add one (or many hundreds) of virtual functions to Foo (then you should have a virtual destructor, see rule of three in C++), each instance would have one vtable pointer.
If you want a behavior to be specific to instances (so instances a and b could have different behavior) without using the class machinery for that, you need some member function pointers (in C++03) or (in C++11) some std::function (perhaps anonymous closures). Of course they need space in every instance.
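For the per-instance case, a sketch with std::function might look like this. Unit, hp and on_update are made-up names; the point is that the callable is a data member, so it costs space in every instance.

```cpp
#include <cassert>
#include <functional>

struct Unit {
    int hp = 10;
    // Stored in every instance, unlike an ordinary member function.
    std::function<void(Unit&)> on_update;

    void update() {
        if (on_update) on_update(*this);  // skip if no behavior was assigned
    }
};

void check_per_instance_behavior() {
    Unit a, b;
    a.on_update = [](Unit& u) { u.hp -= 1; };
    b.on_update = [](Unit& u) { u.hp += 5; };
    a.update();
    b.update();
    assert(a.hp == 9);
    assert(b.hp == 15);
    assert(sizeof(Unit) > sizeof(int));  // the callable is part of the object
}
```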
BTW, to know the size of some type or class, use sizeof .... (it does include the vtable[s] pointer[s] if relevant).
Methods exist once per class in the program, not once per object.
Try to read some good books about C++ to learn such basic facts about the language.

c++: Does a vtable contain pointers to non-virtual functions?

A vtable contains pointers to the virtual functions of that class. Does it also contain pointers to non-virtual functions?
Thx!
It's an implementation detail, but no. If an implementation put pointers to non-virtual functions into a vtable it couldn't use these pointers for making function calls because it would often cause incorrect non-virtual functions to be called.
When a non-virtual function is called the implementation must use the static type of the object on which the function is being called to determine the correct function to call. A function stored in a vtable accessed by a vptr will be dependent on the dynamic type of the object, not any static type of a reference or pointer through which it is being accessed.
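The static-versus-dynamic distinction is easy to observe when a derived class declares a non-virtual function with the same name as one in its base. A small sketch (class and function names are invented for the example):

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    std::string plain() const { return "Base::plain"; }          // non-virtual
    virtual std::string virt() const { return "Base::virt"; }    // virtual
};

struct Derived : Base {
    std::string plain() const { return "Derived::plain"; }       // hides Base::plain
    std::string virt() const override { return "Derived::virt"; }
};

void check_static_vs_dynamic() {
    Derived d;
    Base* p = &d;
    assert(p->plain() == "Base::plain");    // chosen by the static type Base*
    assert(p->virt() == "Derived::virt");   // chosen by the dynamic type Derived
    assert(d.plain() == "Derived::plain");  // static type is Derived here
}
```

If plain() were looked up through the vtable, the first call would wrongly reach Derived::plain, which is the incorrect behavior described above.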
No, it doesn't.
As calls to non-virtual methods can be resolved during compilation (since compiler knows the addresses of non virtual functions), the compiler generates instructions to call them 'directly' (i.e. statically).
There is no reason to go through vtable indirection mechanism for methods which are known during compiling.
Whether or not a "vtable" is used by any implementation isn't defined by the standard. Most implementations use a table of function pointers although the functions pointed to are typically not directly those being called (instead, the pointed to function may adjust the pointer before calling the actual function).
Whether or not non-virtual functions show up in this table is also not defined by the standard. After all, the standard doesn't even require the existence of a vtable. Normally, non-virtual functions are not in a virtual function table, since any necessary pointer adjustments and the call can be resolved at compile- or link-time. I could imagine an implementation treating all functions similarly and, thus, using a pointer in the virtual function table in all cases. It wouldn't necessarily be very popular. However, it might be a good way to implement C++ in an environment where it seamlessly interacts with a more flexible object system, e.g., languages where individual functions can be replaced at run-time (my understanding is that something like this is possible, e.g., in Python).
No. A vtable only contains pointers to the virtual functions of its class, including those inherited from base classes.