Memory layout of a C++ object

As far as I understand, the code for all member functions is created once, in separate memory, when the class is defined, and is shared by all objects. Only the member variables are created individually for each object. But how is a member function executed when it is called through an object?
Where are the addresses of these member functions stored?
#include <iostream>
class B{
public:
    int a;
    void fun(){
    }
};
int main(){
    B b;
    // prints the size of the object, which covers only the data member
    std::cout<<sizeof(b)<<std::endl;
}
If I run this program, the output is 4 (the size of the member variable alone). Yet calling b.fun() calls the member function correctly. How can it be called without its address being stored in the object? Where are the member function addresses stored?
Is there anything like class memory layout where these addresses will be stored?

Non-virtual member functions are very much like regular non-member functions; the only difference is that a pointer to the class instance is passed as a hidden first argument on invocation.
The compiler does this automatically, so (in pseudo-code) your call b.fun() can be compiled into
B::fun(&b);
where B::fun can be seen as an ordinary function. The address of this function does not have to be stored in the object itself (all objects of this class use the same function), so the size of the class does not include it.
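The rewrite described above can be sketched by hand. This is only an illustration: twice() and B_twice are hypothetical names (the original fun() is empty), and B_twice stands in for whatever mangled symbol a real compiler would emit.

```cpp
#include <cassert>

// Hypothetical variant of the question's class, with a member function
// that actually uses its data so the two call styles can be compared.
struct B {
    int a;
    int twice() { return 2 * a; }   // receives &b as the hidden `this`
};

// Roughly what the compiler generates: an ordinary function that takes
// the object's address explicitly. The name B_twice is illustrative.
int B_twice(B* self) { return 2 * self->a; }

// Both calls run one shared piece of code; nothing about it lives in b.
bool same_result(int value) {
    B b{value};
    return b.twice() == B_twice(&b);
}

void demo() {
    assert(same_result(21));
    // On mainstream implementations the object holds only its data member.
    assert(sizeof(B) == sizeof(int));
}
```

Calling demo() from main exercises both checks; the sizeof comparison reflects common compilers rather than a guarantee of the language.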

Is there anything like class memory layout where these addresses will be stored?
There is for functions declared virtual, yes. In this case, the addresses of said functions are stored in a table and looked up at runtime. This in turn allows your code to dispatch to the correct function depending on the object's type when the function is called.
Non-virtual functions do not work this way. They're stored in the same way as free (i.e. non-member) functions, with the function name prefixed by the name of the class. No storage space within the object itself is required.
In both cases, a hidden this pointer is passed to the called function. This is what 'connects' it to your object.
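A quick way to see the difference is to compare sizeof for two otherwise identical types. Plain and Virtual are hypothetical names, and the size increase is what mainstream compilers do (one hidden vtable pointer per object), not something the language specifies.

```cpp
#include <cassert>

// Hypothetical types for illustration; neither appears in the question.
struct Plain   { int a; void fun() {} };          // non-virtual member function
struct Virtual { int a; virtual void fun() {} };  // same data, virtual function

void demo() {
    // Non-virtual member functions add nothing to the object.
    assert(sizeof(Plain) == sizeof(int));
    // A virtual function typically adds one hidden pointer to the class's
    // vtable, so the object grows (plus any alignment padding).
    assert(sizeof(Virtual) > sizeof(Plain));
}
```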

Related

Does each object in c++ contain a different version of the class's member functions?

I was just curious: does the creation of an object in C++ allocate space for a new copy of its member functions? At the assembly or machine code level, where no classes exist, do all calls to a specific function from different objects of the same class refer to the same function pointer, or are there multiple function blocks in memory, and therefore different pointers, for each member function of every object of the same class?
Languages usually implement features as simply as possible.
Under the hood, class methods are just ordinary functions that take a pointer to the object as an argument; the object itself is just a data structure, and the methods are functions that operate on it.
Normally the compiler knows at compile time which function should operate on the object.
Polymorphism is the exception: a virtual function may be overridden in a derived class.
In that case the compiler may not know the object's actual class; it could be Derived1 or Derived2.
The compiler therefore typically adds to the object a hidden pointer to a vtable, a per-class table of pointers to the functions that can be overridden.
For overridable (virtual) methods, the program looks up this table at run time to find which function to execute.
You can see how it can be implemented by seeing how polymorphism can be implemented in C:
How can I simulate OO-style polymorphism in C?
No, it does not. Functions are class-wide. When you allocate an object in C++ it contains space for its data members and, if the class has virtual functions, typically a hidden pointer to a per-class vtable holding pointers to those virtual functions, whether they come from its own class or are inherited from parent classes.
When you call a virtual method on that object, you essentially perform a look-up in that vtable and the appropriate method is called; calls to non-virtual methods are resolved at compile time.

pointer to function call vs pointer to member function call

I always wondered why there is a stylistic difference between calling a pointer to function vs calling a pointer to a member function in terms of using the de-referencing operator *, i.e.
void(*fptr)()=&f; // pointer to function f()
fptr(); // call f() via pointer to function
Foo foo; // instance of Foo
void (Foo::*fptr)()=&Foo::f; // pointer to member function f()
(foo.*fptr)(); // call foo.f() via pointer to member function
In the first part, you don't need the * operator to call f() via the function pointer, but you must use it when calling the member function via the pointer, (foo.*fptr)(). Why this difference? Why not just use (foo.fptr)() for consistency? Is there a profound reason, or is this just the way C++ was designed?
I'll be quoting stuff from C++ Common Knowledge. The related chapter is titled Pointers to member functions are not pointers. There the author explains why pointers to member functions cannot be implemented as pointer to functions. I'll be using this (the fact that they're different things) as a justification for the different syntax :
When you take the address of a non-static member function, you don't get an address; you get a pointer to member function.
(...)
As with a pointer to data member, we need an object or pointer to an object in order to dereference a pointer to member function. (...) In the case of a pointer to member function, we need the object's address to use as (or to calculate) the value of the this pointer for the function call and possibly for other reasons as well.
Note that there is no such thing as a "virtual" pointer to member function. Virtualness is a property of the member function itself, not the pointer that refers to it.
This is one reason why a pointer to member function cannot be implemented, in general, as a simple pointer to function. The implementation of the pointer to member function must store within itself information as to whether the member function to which it refers is virtual or nonvirtual, information about where to find the appropriate virtual function table pointer, an offset to be added to or subtracted from the function's this pointer and possibly other information. A pointer to member function is commonly implemented as a small structure that contains this information, although many other implementations are also in use.
Dereferencing and calling a pointer to member function usually involves examining the stored information and conditionally executing the appropriate virtual or nonvirtual function calling sequence.
IMHO considering these differences one can justify the syntax difference, although historic design choices could have played their role.
If we simply wrote foo.fptr, it would look exactly like a call to a member function named fptr, which would complicate name lookup in the compiler. fptr is not actually a member of foo, and the .* operator makes that difference explicit. A compiler could perhaps be made to cope, but I think this is simply how C++ was designed.
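The name-lookup point can be made concrete with a small sketch. The return values and the extra member Foo::fptr are assumptions added for illustration; they are not in the question's code.

```cpp
#include <cassert>

int f() { return 1; }

struct Foo {
    int f()    { return 2; }
    int fptr() { return 3; }   // a real member that happens to be named fptr
};

void demo() {
    int (*fp)() = &f;               // pointer to function
    int (Foo::*fptr)() = &Foo::f;   // pointer to member function (a local variable)
    Foo foo;
    Foo* p = &foo;

    assert(fp() == 1);              // the dereference is implicit here...
    assert((*fp)() == 1);           // ...but writing it out is also valid
    assert((foo.*fptr)() == 2);     // .* binds the local pointer to an object
    assert((p->*fptr)() == 2);      // ->* does the same through an object pointer
    assert(foo.fptr() == 3);        // plain . looks up the member named fptr
}
```

Because foo.fptr already means "the member of Foo called fptr", a separate operator is needed to say "apply this pointer, which lives outside the class, to foo".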

Definition of class member

We can declare a class member like this:
class Test {
public:
int a;
};
This is how we declare it, but I want to know where the variable a is defined.
I know that a static class member is a static variable, so it can't be defined in the class and must be defined outside it. So I think the normal class member should have a place to be defined, I guess it's the constructor where the normal member is defined implicitly. Is that right?
For non-static data members, the declaration and definition are one and the same.
So I think the normal class member should have a place to be defined, I guess it's the constructor where the normal member is defined implicitly.
I think I can see where you're coming from. For each static data member, there is only one variable instance per type (for templates - each template instantiation creates a distinct type) - and that's why the declaration is more like an extern declaration for normal variables - it's saying "this variable will have an address somewhere - ask the linker to stitch in the address later". The definition is where the program asks the compiler to reserve actual memory for the variable in that specific translation unit's object file, which will be found by the linker and made accessible to the code in other translation units that know of and access the variable based on the declaration. (It's a little more complicated for templates.) So, loosely speaking and from a programmer's perspective, the static data member definition appears to be the line of source code triggering the allocation of memory and arranging for the constructor to run. Once you've written the definition, allocation and construction are all sorted.
For non-static data members it's quite different though - when the class definition is parsed by the compiler there's still no actual request for those non-static data members to be given any memory anywhere, as there's not yet an instance object of that class type. Only when some other code indicates the need for an object instance will the compiler need to arrange memory (if not using placement new) and construction. Put another way, for non-static data-members definition and allocation/construction are generally decoupled - with separate source code.
This all applies recursively: when an object instance is itself static or of file/namespace scope, the memory and construction (including that of the data members inside the class) will be arranged (not necessarily performed) when the definition is seen, as above. But very often object instances are on the stack or heap. Either way, the allocation and construction code for the data members is driven by the way the containing object is created, and is unrelated to the data member's definition.
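The contrast between the two kinds of member can be sketched as follows. Counter, constructed and value are hypothetical names chosen for this example, and the out-of-class definition shown is the classic pre-C++17 form.

```cpp
#include <cassert>

struct Counter {
    static int constructed;  // declaration only: no storage is reserved here
    int value = 0;           // non-static: storage exists only inside each object
    Counter() { ++constructed; }
};

// The single out-of-class definition is what reserves the storage.
int Counter::constructed = 0;

// Creates two objects and reports how many constructions happened meanwhile.
int constructed_by_two_objects() {
    int before = Counter::constructed;
    Counter a;               // memory for a.value is arranged here...
    Counter b;               // ...and separately for b.value here
    a.value = 7;             // changes a's copy only
    assert(b.value == 0);
    return Counter::constructed - before;
}
```

The static member exists once, whether or not any Counter is ever created; each value comes into being only when its containing object does.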
Every instance of an object is given some reserved space in memory for that object. Possibly in heap storage or on the stack.
It's at a specific location within that space that every member variable of that object is stored.
a can be given a value right at its declaration (a default member initializer), in the constructor, or from outside the class. Here is an example showing all three:
class Test {
public:
int a = 5;
Test() {
a = 5;
}
};
int main() {
Test foo;
foo.a = 5;
return 0;
}
As a good practice, you should encapsulate your data members and manage access to them through dedicated methods such as SetA() and GetA(); you can give default values in the constructor.

c++ member function and class size

When I first learnt about inheritance, my teacher remarked that, as opposed to data members, member functions don't change the class size. That is, if class B inherits from class A then B's size will be larger than A's size if and only if at least one data member will be added, and won't be changed with respect to function members quantity.
Is this correct? If so, how does this mechanism work? It seems like both kinds of members should be held in the heap and should therefore cost space.
Thank you,
Guy
Member variables are stored as part of each class instance. But member functions are not.
Each instance of a class must have a separate copy of each member variable. This keeps each object unique.
But member functions are code. And no matter how many instances you have of a particular class, there is absolutely no reason to have multiple copies of the code. The same code simply operates on the instance data.
Since code is shared rather than per-instance, it is not part of the size of a class instance, and sizeof(myClass) will not include bytes occupied by code.
You can imagine that calling a member function myClass.myMethod(x) is similar to calling myMethod(myClass, x). There is no need for that function to exist once per instance of the class. You can access this "hidden" first argument using the this keyword.
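The teacher's remark can be checked with sizeof. MoreFunctions and MoreData are hypothetical names, and the results are those of mainstream compilers; note that the remark covers non-virtual functions only.

```cpp
#include <cassert>

struct A { int x; void f() {} };

// Adds only (non-virtual) member functions to A.
struct MoreFunctions : A { void g() {} void h() {} };

// Adds a data member to A.
struct MoreData : A { int y; };

void demo() {
    assert(sizeof(MoreFunctions) == sizeof(A));  // functions cost nothing per object
    assert(sizeof(MoreData) > sizeof(A));        // data members do
}
```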
if class B inherits from class A then B's size will be larger than
A's size if and only if at least one data member will be added, and
won't be changed with respect to function members quantity.
Yes, it is correct for non-virtual member functions: adding them to B does not change its size.
Each class instance has its own copy of the data members. If the class has virtual functions, it typically also holds a hidden pointer to a per-class table of member function pointers; the functions themselves are stored once, outside any object.
The code of the member functions is shared between different instances of the same class. If B doesn't override any member function of the base class A, both A and B can share the same methods.
When you override a member function in a subclass, you create a new definition of the overridden member function that applies only to the subclass where you defined it (e.g. B).
Normally a member function is bound at compile time. So if you have an instance of the subclass B referenced through a pointer to the base class A, and you call a member function foo() defined in both classes, the function called will be the one implemented in the base class.
You can force a member function to be bound at run time by declaring it virtual (the member function called will then be the one belonging to the actual type of the object pointed to through the base pointer). This causes an additional table (the virtual method table, or vtable) to be used to store pointers to the virtual methods, and a double pointer indirection for each call.
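A minimal sketch of the two binding modes, using the A, B and foo() names from the text; bar() and the two helper functions are hypothetical additions so that both cases can be shown side by side.

```cpp
#include <cassert>

struct A {
    int foo() { return 1; }          // non-virtual: bound at compile time
    virtual int bar() { return 1; }  // virtual: bound at run time
    virtual ~A() = default;
};

struct B : A {
    int foo() { return 2; }           // hides A::foo, does not override it
    int bar() override { return 2; }  // overrides A::bar
};

// Both helpers see only the static type A.
int call_foo(A* p) { return p->foo(); }  // always A::foo
int call_bar(A* p) { return p->bar(); }  // dispatches on the actual object type

void demo() {
    B b;
    assert(call_foo(&b) == 1);  // base version, chosen from the pointer's type
    assert(call_bar(&b) == 2);  // derived version, chosen from the object's type
}
```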

How does a C++ object access its member functions?

How does a C++ object know where its member function definitions are? I am quite confused, as the object itself does not contain the function pointers.
sizeof on the Object proves this.
So how is the object to function mapping done by the Runtime environment? where is a class's member function-pointer table maintained?
If you're calling non-virtual functions, there's no need for a function-pointer table; the compiler can resolve the function addresses at compile-time. So:
A a;
a.func();
translates to something along the lines of:
A a;
A_func(&a);
Calling a virtual function through a base-class pointer typically uses a vtable. So:
A *p_a = new B();
p_a->func();
translates to something along the lines of:
A *p_a = new B();
p_a->p_vtbl->func(p_a);
where p_vtbl is a compiler-implemented pointer to the vtable specific to the actual class of *p_a.
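The vtable translation above can be imitated by hand to show what the hidden machinery roughly looks like. Everything here is an assumption made for illustration: VTable, make_A, make_B, call_func and the data field are hypothetical, and real compilers use their own layouts.

```cpp
#include <cassert>

struct A;                                  // forward declaration for the slot type
struct VTable { int (*func)(A* self); };   // one slot per virtual function

struct A {
    const VTable* p_vtbl;   // the hidden pointer a compiler would add
    int data;
};

// The two implementations of "func", written as ordinary functions.
int A_func(A* self) { return self->data; }
int B_func(A* self) { return self->data * 2; }

// One table per class, shared by every instance of that class.
const VTable A_vtable{&A_func};
const VTable B_vtable{&B_func};

A make_A(int d) { return A{&A_vtable, d}; }
A make_B(int d) { return A{&B_vtable, d}; }  // a "B" reuses A's layout with B's table

// Hand-written equivalent of p_a->func() for a virtual func.
int call_func(A* p_a) { return p_a->p_vtbl->func(p_a); }
```

With A a = make_A(5) and A b = make_B(5), call_func(&a) yields 5 and call_func(&b) yields 10, even though call_func only ever sees an A*.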
There are generally two ways that an object and its member functions are associated:
For a non-virtual function, the compiler determines the appropriate function at compile time. Non-static member functions are usually passed a hidden parameter that contains the this pointer, which takes care of the association of the object and the class member function.
For virtual functions, most compilers tend to use a lookup table that is usually referenced via the object's this pointer or a similar mechanism. This table, normally called the vtable, contains the function pointer for the virtual functions only.
As C++ is not a dynamic language, the compiler can do most of the object/function/symbol resolution at compile time with the exception of some virtual functions. In some cases, it's even possible for the compiler to determine exactly which instance of a virtual function gets called and skip the resolution via the vtable.
Member functions are not part of the object - they are defined statically, in one place, just like any other function. There is no magic look-up needed.
Virtual functions are different, but I don't think your question is about that...
For non-virtual functions no per-object table is needed: there is one copy of each function per class, which all instances use. Since the target is the same for all of them and known at compile time, you would not want anything duplicated in each instance.
For virtual functions, resolution is done at run time and the object will typically contain a hidden pointer to a function table for them. Add a virtual function and look at the size of your object again.