Class members and member functions memory location - c++

Here is a simple C++ class, named A:
class A
{
public:
explicit A() : m_a(0) { }
explicit A(int a) : m_a(a) { }
int getA() const { return m_a; }
void setA(int a) { m_a = a; }
private:
int m_a;
};
This is what I know so far:
When you declare an object of a class, memory gets allocated for that object. The allocated memory is equivalent to the memory of its members summed up. So in my case, it is: sizeof(A) = sizeof(int) = sizeof(m_a)
All member functions of class A are stored somewhere in memory and all instances of class A use the same member functions.
This is what I don't know:
Where are member functions stored and how are they actually stored? Let's say that an int, for example, is stored on 4 bytes; I can imagine the RAM layout with 4 contiguous cells each storing a part of that int. How can I imagine this layout for a function? (This could sound silly, but I imagine functions must have a place in memory because you can have a pointer point to them.) Also, how and where are function instructions stored? My first perception was that functions and function instructions are stored in the program executable (and its dynamic or static libraries), but if this is true, what happens when you create a function pointer? AFAIK function pointers point to locations in RAM; can they point to locations in program binaries? If yes, how does this work?
Can anyone explain to me how this works and point out if what I know is right or wrong?

First, you need to understand the role of the linker and what are executables (usually executed in virtual memory) and address spaces & processes. On Linux, read about ELF and the execve(2) syscall. Read also Levine's Linkers & Loaders book and Operating Systems: Three Easy Pieces, and the C++11 standard n3337, and this draft report and a good C++ programming book, with this reference website.
Member functions can be virtual or plain functions.
A plain (non virtual) member function is just like a C function (except that it has this as an implicit, often first, parameter). For example your getA method is implemented like the following C function (outside of the object, e.g. in the code segment of the binary executable) :
int C$getA(const A* thisptr) { return thisptr->m_a; }
then imagine that the compiler is translating p->getA() into C$getA(p)
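To make that translation concrete, here is a minimal sketch in valid C++. The free function A_getA and its explicit self parameter are hypothetical names standing in for what the compiler generates; real compilers use mangled names and an ABI-specific calling convention.

```cpp
// Sketch only: A_getA is a hypothetical stand-in for compiler-generated code.
class A {
public:
    explicit A(int a) : m_a(a) {}
    int getA() const { return m_a; }    // real member function
    friend int A_getA(const A* self);   // hand-written equivalent (needs access to m_a)
private:
    int m_a;
};

// Roughly what the member function looks like once `this` is made explicit.
int A_getA(const A* self) { return self->m_a; }

// Neither function is stored inside the object: it holds only m_a.
static_assert(sizeof(A) == sizeof(int), "no per-object storage for functions");
```

Calling a.getA() and A_getA(&a) on the same object yields the same value, since both read m_a through a pointer to the object.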
A virtual member function is generally implemented through a vtable (virtual method table). An object with some virtual member functions (including a virtual destructor) generally has, as its first (implicit) member field, a pointer to such a table (generated elsewhere by the compiler). Your class A doesn't have any virtual method, but imagine it had an additional virtual void print(std::ostream&); method; then your class A would have the same layout as
struct A$ {
struct A$virtualmethodtable* _vptr;
int m_a;
};
and the virtual table might be
struct A$virtualmethodtable {
void (*print$fun) (struct A$*, std::ostream*);
};
(so adding other virtual functions simply means adding slots inside that vtable);
and then a call like p->print(std::cout); would be translated almost like
p->_vptr->print$fun(p, &std::cout); ... In addition, the compiler would generate the various virtual method tables (one per class) as constant tables.
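The same idea can be written by hand as compilable C++. This is only a sketch of the mechanism: every name in it (VTable, Object, print_A, print_B, call_print) is hypothetical, and a real compiler's table layout is dictated by its ABI.

```cpp
#include <string>

// Sketch only: all names here are hypothetical.
struct Object;                                  // forward declaration

struct VTable {
    std::string (*print)(const Object* self);   // one slot per virtual function
};

struct Object {
    const VTable* vptr;   // plays the role of the hidden vptr member
    int m_a;
};

std::string print_A(const Object* self) { return "A:" + std::to_string(self->m_a); }
std::string print_B(const Object* self) { return "B:" + std::to_string(self->m_a); }

// One constant table per "class", shared by all of its objects.
const VTable vtable_A = { &print_A };
const VTable vtable_B = { &print_B };

// What a virtual call boils down to: load vptr, load slot, call indirectly.
std::string call_print(const Object* p) { return p->vptr->print(p); }
```

Two objects whose vptr members point at different tables run different code through the same call_print, which is the essence of dynamic dispatch.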
NB: things are more complex with multiple or virtual inheritance.
In both cases, member functions don't eat any additional space in the object. If it is non-virtual, it is just a plain function (in the code segment). If it is virtual, it occupies a slot in the virtual method table, which is shared by all objects of the class.
NB. If you compile with a recent GCC (i.e. with g++) you could pass it e.g. the -fdump-tree-all flag: it will produce hundreds of dump files showing, in textual form, some of the compiler's internal representations, which you could inspect with a pager (e.g. less) or a text editor. You could also use MELT or look at the assembly code produced with g++ -S -fverbose-asm -O1 .... (clang++ also accepts -S, but -fdump-tree-all is a GCC option.)

The code of all functions, virtual or not, is stored in the code/text segment. Local non-static variables live on the stack.
All static and global variables are saved in static data segment.
A class that has virtual functions, or inherits from a class with virtual functions, gets a vptr pointer inserted by the compiler. The vptr points to a virtual function table with a number of function slots. Each slot contains the address of a function whose code is stored in the code segment.

To understand this you need to learn about the memory layout of a program. The code is shared by all objects, and each object has its own copy of the data.
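A small sketch of "shared code, separate data". Counter and code_shared_data_separate are hypothetical names used only for illustration: a single pointer to member function is applied to two objects, and each object's own count changes independently.

```cpp
// Sketch only: Counter is a hypothetical class used for illustration.
class Counter {
public:
    void increment() { ++count; }
    int count = 0;   // each object gets its own copy of this member
};

// Returns true when two objects share one function but keep separate data.
bool code_shared_data_separate() {
    Counter x, y;
    void (Counter::*f)() = &Counter::increment;  // one function for the whole class
    (x.*f)();                                    // runs the shared code on x's data
    (x.*f)();
    (y.*f)();                                    // same code, now on y's data
    return x.count == 2 && y.count == 1 && &x.count != &y.count;
}
```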

Related

Where member functions are stored? [duplicate]

Whenever an object of a class is created, memory is allocated for it. So my question is: is memory allocated only for the member variables, or for the member functions as well? If memory is allocated for member functions, where are they stored?
Traditionally executable files had three sections. One for initialized data, one for uninitialized data, and one for code. This traditional partitioning is still very much in use, and code, no matter where it comes from, is placed in a separate section.
When an operating system loads an executable file into memory, it puts the code in a separate place in memory that it marks as executable (on modern memory-protected systems), and all code is stored there, separate from the objects themselves.
Member functions are just code located in the code segment. They are present exactly once, no matter how many objects you have.
They are nearly the same as ordinary functions, except that their first parameter is the this pointer, which is hidden in the language but present as a parameter in the executable code.
But there are two kinds of member functions:
"normal"
virtual
There is no difference between them in terms of code size; however, they are called differently. Calls to normal functions can be resolved at compile time, while calls to virtual functions are indirect calls through function pointers.
If a class has virtual member functions (the class is "polymorphic"), the compiler needs to create a "vtable" for this class (not for each object).
Each object contains a pointer to the vtable of its class. This is necessary to reach the correct virtual function when the object is accessed through a pointer to a base class type.
Example:
class A{
public: bool doSomething();
int i;
};
class B:public A {
public: bool doSomething();
int j;
};
//
B b;
A* a = &b;
a->doSomething(); // <- A::doSomething() is called;
//
Neither of these classes needs a vtable.
Example 2:
class A{
public: virtual bool doSomething();
int i;
};
class B:public A {
public: bool doSomething();
int j;
};
//
B b;
A* a = &b;
a->doSomething(); // <- B::doSomething() is called;
//
A (and all classes derived from it) gets a vtable. When an object is created, its vtable pointer is set to the correct table, so that the correct function is called regardless of the type of the pointer.
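The two examples can be combined into one compilable sketch. The names Base, Derived, plain and dynamic are hypothetical, and the functions return strings rather than bool so the chosen overload is visible.

```cpp
#include <string>

// Sketch only: class and function names are hypothetical.
struct Base {
    std::string plain() const { return "Base::plain"; }               // resolved at compile time
    virtual std::string dynamic() const { return "Base::dynamic"; }   // resolved through the vtable
    virtual ~Base() = default;
};

struct Derived : Base {
    std::string plain() const { return "Derived::plain"; }            // hides Base::plain
    std::string dynamic() const override { return "Derived::dynamic"; }
};

// The parameter's static type is Base; the object passed in may be a Derived.
std::string call_plain(const Base& b)   { return b.plain(); }
std::string call_dynamic(const Base& b) { return b.dynamic(); }
```

Passing a Derived object to both helpers gives "Base::plain" from the non-virtual call and "Derived::dynamic" from the virtual one.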
Only the member variables (plus padding between and after them) contribute to the sizeof of a class.
So in that sense regular functions do not take up space as far as an object is concerned. Member functions are little more than regular functions with an implicit this pointer as a hidden argument.
That said, a virtual function table might be the way an implementation deals with polymorphic types, and that will take up some space, but probably only a pointer in each object to a table shared by all objects of that particular class.
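A sketch of the sizeof claim, using hypothetical struct names. The exact sizes depend on the target ABI, so the checks only compare sizes with each other instead of hard-coding byte counts.

```cpp
// Sketch only: the struct names are hypothetical; exact sizes depend on the ABI.
struct Padded {
    char c;   // typically followed by padding so that i is suitably aligned
    int  i;
};

struct WithFunctions {
    char c;
    int  i;
    int  sum() const { return c + i; }   // non-virtual: adds nothing to the object
    static int zero() { return 0; }      // static: adds nothing either
};

static_assert(sizeof(Padded) >= sizeof(char) + sizeof(int),
              "members plus possible padding");
static_assert(sizeof(WithFunctions) == sizeof(Padded),
              "member functions do not change the size");
```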

How does the compiler decide when to use a vptr to invoke a function

How to identify whether vptr will be used to invoke a virtual function?
Consider the below hierarchy:
class A
{
int n;
public:
virtual void funcA()
{std::cout <<"A::funcA()" << std::endl;}
};
class B: public A
{
public:
virtual void funcB()
{std::cout <<"B::funcB()" << std::endl;}
};
A* obj = new B();
obj->funcB(); //1. this does not even compile
typedef void (*fB)();
fB* func;
int* vptr = (int*)obj; //2. Accessing the vptr
func = (fB*)(*vptr);
func[1](); //3. Calling funcB using vptr.
Statement 1. i.e. obj->funcB(); does not even compile although Vtable has an entry for funcB where as on accessing vPtr indirectly funcB() can be invoked successfully.
How does compiler decide when to use the vTable to invoke a function?
In the statement A* obj = new B(); since I am using a base class pointer so I believe vtable should be used to invoke the function.
Below is the memory layout when vptr is accessed indirectly.
So there are two answers to your question:
The short one is:
obj->funcB() is only a legal call if the static type of obj (in this case A) has a function funcB with the appropriate signature (either directly or through a base class). Only if that is the case does the compiler decide whether to translate it to a direct or a dynamic function call (e.g. using a vtable), based on whether funcB is declared virtual in the declaration of A (or its base type).
The longer one is this:
When the compiler sees obj->funcB() it has no way of knowing (optimizations aside), what the runtime type of obj is and especially it doesn't know, whether a derived class that implements funcB() exists, at all. obj might e.g. be created in another translation unit or it might be a function parameter.
And no, that information is usually not stored in the virtual function table:
The vtable is just an array of addresses, and without the prior knowledge that a specific address corresponds to a function called funcB, the compiler can't use it to implement the call obj->funcB(). To be more precise: the standard does not allow it to do so. That prior knowledge can only be provided by a virtual function declaration in the static type of obj (or its base classes).
The reason why you have that information available in the debugger (whose behavior lies outside of the standard anyway) is that it has access to the debugging symbols, which are usually not part of the distributed release binary. Storing that information in the vtable by default would be a waste of memory and performance, as the program isn't allowed to make use of it in standard C++ in the way you describe anyway. For extensions like C++/CLI that might be a different story.
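For completeness, one standard way to reach funcB through an A* without touching the vtable by hand is a checked downcast. The sketch below mirrors the hierarchy from the question but returns strings instead of printing; call_funcB is a hypothetical helper, and the virtual destructor is an addition not present in the original classes.

```cpp
#include <string>

// Sketch only: mirrors the A/B hierarchy from the question.
struct A {
    virtual std::string funcA() { return "A::funcA()"; }
    virtual ~A() = default;   // added here; not in the original class
};

struct B : A {
    virtual std::string funcB() { return "B::funcB()"; }
};

// obj->funcB() would not compile here because A declares no funcB.
// dynamic_cast checks the runtime type and yields nullptr if obj is not a B.
std::string call_funcB(A* obj) {
    if (B* b = dynamic_cast<B*>(obj)) {
        return b->funcB();
    }
    return "not a B";
}
```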
Adding to Barry's comment, adding the line virtual void funcB() = 0; to class A seems to fix the problem.

Memory static function vs member function

class A (say), having all static member functions only
class B(say) having only member functions
If I create 1000 instances of class A: as the class contains only static member functions, the memory does not increase whether there is 1 instance or 1000 instances.
However, for class B: if I create 1000 instances, will there be an increase in memory (even the slightest, maybe a pointer in each object pointing to the set of member functions)?
If no, then how does the compiler keep tracks of member function information for a particular object ?
Will there be an increase of memory (even the slightest, may be a pointer for each object pointing to set of member functions)?
NO.
Non virtual Member functions do not contribute towards size of objects of a class.
However, presence of a virtual member function will typically increase the size of an class object.
Note that the latter is purely an implementation-specific detail, but since all known compilers implement the virtual mechanism using a v-table and v-ptr, it is reasonable to assume that almost all compilers will show the same behavior of adding a v-ptr to every object of that polymorphic class, thus increasing the size of the class object by the size of a v-ptr.
For starters, you might try outputting sizeof(A) and sizeof(B). But several things to keep in mind:
Regardless of the number or types of members, C++ forbids a class to have a size of 0, so static members or not, each instance of A will take some memory; and
The resolution of non-virtual functions is done entirely at compile time, so there is no need for the compiler to add anything to the class for it. (Virtual functions will typically add the size of one pointer to the class, regardless of how many virtual functions your class has.)
If we're just talking about member functions, the imprint will be the same. A member function does not take up more memory the more times the class it is contained within is instantiated (as the this pointer is passed to it). Only the data members of the class are going to take up more memory with each class instantiation as they are unique to each instance of the class.
So to answer your second question, it keeps "track" by the use of the this pointer, which is passed when calling a non-static member function of a class.
Things get a bit more complicated with virtual methods, but your question has not covered that particular idiom.
You can use the sizeof operator to test whether functions occupy memory in a class object.
class A{};
class B{
void foo(){};
};
class C{
static void foo();
};
class D{
virtual void foo();
};
class E{
virtual void foo1();
virtual void foo2();
};
sizeof(A)=1
sizeof(B)=1
sizeof(C)=1
sizeof(D)=4
sizeof(E)=4
At first glance, objects of classes A, B and C would need zero bytes. But if their size were zero, the compiler could not give distinct objects distinct addresses, so it makes each object occupy at least one byte. So:
sizeof(A)=1
sizeof(B)=1
sizeof(C)=1
So you can see that member functions and static member functions don't occupy memory in the object, and they do not increase its size.
But if a class has a virtual function, its size grows by 4 bytes (the size of a pointer on the 32-bit target used here). It grows by only those 4 bytes no matter how many virtual functions it has, because only a vptr is added, pointing to the virtual table that holds the virtual functions' addresses.
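The same observations can be written as compile-time checks that do not hard-code 4 bytes. The class names are hypothetical, and the last two equalities are what mainstream compilers (GCC, Clang, MSVC) produce, not something the standard guarantees.

```cpp
// Sketch only: class names are hypothetical. The last two checks hold on
// mainstream compilers but are not guaranteed by the C++ standard.
class Empty {};
class OneVirtual  { public: virtual void f1() {} };
class TwoVirtuals { public: virtual void f1() {} virtual void f2() {} };

static_assert(sizeof(Empty) >= 1,
              "every complete object type has a nonzero size");
static_assert(sizeof(OneVirtual) == sizeof(TwoVirtuals),
              "one vptr regardless of the number of virtual functions");
static_assert(sizeof(OneVirtual) == sizeof(void*),
              "typically one pointer: 4 bytes on 32-bit, 8 on 64-bit targets");
```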
At a lower level, there are no objects, just function calls. An implicit this parameter is passed to member functions. Inside void B::f(){ mem1(); }, the compiler treats the call as B::mem1(this). So even if you have millions of objects of class B, there will be a single function mem1 expecting an object of type class B.
Virtual functions are different. They are looked up from a table, and the lookup result depends on the actual (dynamic) type of the object that this points to, not on the static type of the pointer.
The difference between a static member function and a normal member function is simple. When you call a normal member function, the ecx register is set to the address of the class instance (this is the case with MSVC's thiscall convention on 32-bit x86; other compilers and platforms pass this differently). If you disassemble the call, you can see an additional
lea ecx, instance_name[ebp]
The member function uses this register to access the object, so the memory usage will not increase, but the call costs slightly more time.

Can GCC compile classes to work as structs?

Is there any way to force a compiler (well GCC specifically) to make a class compile to object oriented C? Specifically what I want to achieve is to write this:
class Foo {
public:
float x, y, z;
float bar();
int other();
...etc
};
Foo f;
float result = f.bar();
int obSize = sizeof(Foo);
Yet compile to exactly the same as:
struct Foo { float x, y, z; };
float Foo_bar(Foo *this);
Foo f;
float result = Foo_bar(&f);
int obSize = sizeof(Foo);
My motivation is to increase readability, yet not suffer a memory penalty for each object of Foo. I'd imagine that with the class implementation obSize would normally be
obSize = sizeof(float)*3 + sizeof(void*)*number_of_class_methods
Mostly to use c++ classes in memory constrained microcontrollers. However I'd imagine if I got it to work I'd use it for network serialization as well (on same endian machines of course).
Your compiler actually does exactly that for you. It might even be able to optimize optimistically by putting the this pointer in a register instead of pushing it onto the stack (this is at least what MSVC does on Windows), which you wouldn't be able to do with standard C calling convention.
As for:
obSize = sizeof(float)*3 + sizeof(void*)*number_of_class_methods
It is plain false. Did you even try it ?
Even if you had virtual functions, only one pointer to a table of functions would be added to each object (one table per class). With no virtual functions, nothing is added to an object beyond its members (and no function table is generated).
void* represents a pointer to data, not a pointer to code (they need not have the same size)
There is no guarantee that the size of the equivalent C struct is 3 * sizeof(float).
C++ already does what you're talking about for non-polymorphic classes (classes without a virtual method).
Generally speaking, a C++ class will have the same size as a C struct, unless the class contains one or more virtual methods, in which case the overhead will be a single pointer (often called the vptr) for each class instance.
There will also be a single instance of a 'vtbl' that has a set of pointers for each virtual function - but that vtbl will be shared among all objects of that class type (ie., there's a single vtbl per-class type, and the various vptrs for objects of that class will point to the same vtbl instance).
In summary, if your class has no virtual methods, it will be no larger than the same C struct. This fits with the C++ philosophy of not paying for what you don't use.
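A sketch that checks this for the Foo from the question. FooC and Foo_bar are hypothetical names for the C-style version, and the bodies of bar and other are made up for illustration since the question does not show them.

```cpp
#include <type_traits>

// Sketch only: FooC and Foo_bar are hypothetical names for the C-style version,
// and the function bodies are invented for illustration.
struct FooC { float x, y, z; };
inline float Foo_bar(const FooC* self) { return self->x + self->y + self->z; }

class Foo {
public:
    float x, y, z;
    float bar() const { return x + y + z; }
    int other() const { return 0; }
};

static_assert(sizeof(Foo) == sizeof(FooC),
              "member functions add no per-object storage");
static_assert(std::is_standard_layout<Foo>::value,
              "layout is compatible with the equivalent C struct");
static_assert(std::is_trivially_copyable<Foo>::value,
              "objects can be copied byte-wise, e.g. with memcpy");
```

The two type-trait checks are the properties that matter for the microcontroller and serialization use cases mentioned in the question; adding a virtual function to Foo would make both size and layout checks fail.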
However, note that non-static member functions in a C++ class do take an extra parameter (the this pointer) that isn't explicitly mentioned in the parameter list - that is essentially what you discuss in your question.
footnote: in C++ classes and structs are the same except for the minor difference of default member accessibility. In the above answer, when I use the term 'class', the behavior applies just as well to structs in C++. When I use the term 'struct' I'm talking about C structs.
Also note that if your classes use inheritance, the 'overhead' of that inheritance depends on the exact variety of inheritance. But as in the difference between polymorphic and non-polymorphic classes, whatever that cost might be, it's only brought in if you use it.
No, your imagination is wrong. Class methods take up no space at all in an object. Why not write a class, and take the sizeof. Then add a few more methods and print the sizeof again. You will see that it hasn't changed. Something like this
First program
#include <iostream>
class X
{
public:
int y;
void method1() {}
};
int main()
{
std::cout << sizeof(X) << '\n'; // prints 4 (where int is 4 bytes)
}
Second program
#include <iostream>
class X
{
public:
int y;
void method1() {}
void method2() {}
void method3() {}
void method4() {}
void method5() {}
void method6() {}
};
int main()
{
std::cout << sizeof(X) << '\n'; // also prints 4 (where int is 4 bytes)
}
Actually, I believe there is no specific memory penalty for using classes, since member functions are stored once for the whole class, not once per instance. So your memory footprint would be more like (code of the member functions, stored once) + N*sizeof(float)*3, where you have N instances of Foo.
The only time you get an additional penalty is when using virtual functions in which case each object carries around a pointer to a vtable with it.
You need to test, but as far as I know a class instance only stores a pointer related to its methods if some of those methods are virtual; otherwise, a struct and a class will take roughly the same amount of memory (bar different alignment done by different compilers etc).