In C++, where in memory are class functions put?

I'm trying to understand what kind of memory hit I'll incur by creating a large array of objects. I know that each object - when created - will be given space in the HEAP for member variables, and I think that all the code for every function that belongs to that type of object exists in the code segment in memory - permanently.
Is that right?
So if I create 100 objects in C++, I can estimate that I will need space for all the member variables that object owns, multiplied by 100 (possible alignment issues here), and then I need space in the code segment for a single copy of the code for each member function of that type of object (not 100 copies of the code).
Do virtual functions, polymorphism, inheritance factor into this somehow?
What about objects from dynamically linked libraries? I assume dlls get their own stack, heap, code and data segments.
Simple example:
// parent class
class Bar
{
public:
    Bar() {}
    virtual ~Bar() {}   // virtual: Bar is meant to be used polymorphically
    // pure virtual function
    virtual void doSomething() = 0;
protected:
    // a protected variable
    int mProtectedVar;
};
// our object class that we'll create multiple instances of
class Foo : public Bar
{
public:
    Foo() {}
    ~Foo() {}
    // implement the pure virtual function
    void doSomething() { mPrivate = 0; }
    // a couple of public functions
    int getPrivateVar() { return mPrivate; }
    void setPrivateVar(int v) { mPrivate = v; }
    // a couple of public variables
    int mPublicVar;
    char mPublicVar2;
private:
    // a couple of private variables
    int mPrivate;
    char mPrivateVar2;
};
About how much memory should 100 dynamically allocated objects of type Foo take including room for the code and all variables?

It's not necessarily true that "each object - when created - will be given space in the HEAP for member variables". Each object you create will take some nonzero space somewhere for its member variables, but where depends on how you allocate the object itself. If the object has automatic (stack) allocation, so too will its data members. If the object is allocated on the free store (heap), so too will be its data members. After all, what is the allocation of an object other than that of its data members?
If a stack-allocated object contains a pointer or other type which is then used to allocate on the heap, that allocation will occur on the heap regardless of where the object itself was created.
For objects with virtual functions, each will have a vtable pointer allocated as if it were an explicitly-declared data member within the class.
As for member functions, the code for those is likely no different from free-function code in terms of where it goes in the executable image. After all, a member function is basically a free function with an implicit "this" pointer as its first argument.
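To make that concrete, here's a minimal sketch of the mental model (the commented-out free function is purely illustrative, not real compiler output):

#include <iostream>

class Counter {
    int n = 0;
public:
    void bump() { ++n; }               // member function: has an implicit 'this'
    int value() const { return n; }
};

// Conceptually, the compiler emits bump() once in the text segment as if it
// were a free function, something like (name and signature illustrative):
//     void Counter_bump(Counter* self) { ++self->n; }

int main() {
    Counter c;
    c.bump();                          // conceptually: Counter_bump(&c)
    std::cout << c.value() << '\n';    // prints 1
}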
Inheritance doesn't change much of anything.
I'm not sure what you mean about DLLs getting their own stack. A DLL is not a program, and should have no need for a stack (or heap), as objects it allocates are always allocated in the context of a program which has its own stack and heap. That there would be code (text) and data segments in a DLL does make sense, though I am no expert in the implementation of such things on Windows (which I assume you're using, given your terminology).

Code exists in the text segment, and how much code is generated based on classes is reasonably complex. A boring class with no virtual inheritance ostensibly has some code for each member function (including those that are implicitly created when omitted, such as copy constructors) just once in the text segment. The size of any class instance is, as you've stated, generally the sum size of the member variables.
Then, it gets somewhat complex. A few of the issues are...
The compiler can, if it wants or is instructed, inline code. So even though it might be a simple function, if it's used in many places and chosen for inlining, a lot of code can be generated (spread all over the program code).
Virtual functions (and virtual inheritance) increase the size of each polymorphic object. A VTABLE (virtual table) exists for each class that uses virtual methods, containing the information needed for runtime dispatch. This table can grow quite large if you have many virtual functions or multiple (virtual) inheritance. Clarification: the VTABLE is per class, but a pointer to the VTABLE is stored in each instance (possibly more than one pointer, depending on the ancestral type structure of the object).
Templates can cause code bloat. Every use of a templated class with a new set of template parameters can generate brand new code for each member. Modern compilers try and collapse this as much as possible, but it's hard.
Structure alignment/padding can cause simple class instances to be larger than you expect, as the compiler pads the structure for the target architecture.
When programming, use the sizeof operator to determine object size - never hard code. Use the rough metric of "Sum of member variable size + some VTABLE (if it exists)" when estimating how expensive large groups of instances will be, and don't worry overly about the size of the code. Optimise later, and if any of the non-obvious issues come back to mean something, I'll be rather surprised.
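For example, a minimal sketch using the Foo/Bar classes from the question (the numbers in the comments are one plausible 64-bit outcome, not a guarantee):

#include <iostream>

class Bar {
public:
    virtual ~Bar() {}
    virtual void doSomething() = 0;
protected:
    int mProtectedVar;
};

class Foo : public Bar {
public:
    void doSomething() override { mPrivate = 0; }
    int mPublicVar;
    char mPublicVar2;
private:
    int mPrivate;
    char mPrivateVar2;
};

int main() {
    std::cout << "sizeof(Foo) = " << sizeof(Foo) << '\n';   // e.g. 32 on a common
                                                            // 64-bit ABI: 8-byte vptr
                                                            // + 3 ints + 2 chars + padding
    std::cout << "100 Foos need " << 100 * sizeof(Foo)
              << " bytes of instance data, excluding allocator overhead\n";
}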

Although some aspects of this are compiler-vendor dependent, all compiled code goes into a section of memory that on most systems is called the text segment. This is separate from both the heap and stack sections (a fourth section, data, holds most constants). Instantiating many instances of a class incurs run-time space only for its instance variables, not for any of its functions. If you make use of virtual methods, you will get an additional, but small, bit of memory set aside for the virtual look-up table (or equivalent, for compilers that use some other mechanism); its size is roughly proportional to the number of virtual methods times the number of classes that use them, and is independent of the number of instances at run-time.
This is true of statically and dynamically linked code. The actual code all lives in a text region. Most operating systems can also share DLL code across multiple applications, so if multiple applications are using the same DLLs, only one copy resides in memory and all of those applications can use it. Obviously there is no additional saving from shared memory if only one application uses the linked code.

You can't completely accurately say how much memory a class or X objects will take up in RAM.
However, to answer your questions: you are correct that code exists only in one place; it is never "allocated". The code is therefore per-class, and exists whether you create objects or not. The size of the code is determined by your compiler, and even then compilers can often be told to optimize for code size, leading to differing results.
Virtual functions are no different, save the (small) added overhead of a virtual method table, which is usually per-class.
Regarding DLLs and other libraries... the rules are no different depending on where the code has come from, so this is not a factor in memory usage.

The information given above is of great help and gave me some insight into C++ memory structure. What I would like to add is that, no matter how many virtual functions a class has, there is only 1 VPTR per object and 1 VTABLE per class (multiple inheritance can introduce additional VPTRs, but the count never depends on the number of virtual functions). After all, the VPTR points to the VTABLE, so there is no need for more than one VPTR in the case of multiple virtual functions.

Your estimate is accurate in the base case you've presented. Each polymorphic object also carries a pointer to its class's vtable, so expect an extra pointer's worth of memory per object; the vtable itself exists once per class and holds an entry per virtual function.
Member variables (and virtual functions) from any base classes are also part of the class, so include them.
Just as in C, you can use the sizeof operator to get the size in bytes of a class (or any type).

Yes, that's right, code isn't duplicated when an object instance is created. As far as virtual functions go, the proper function call is determined using the vtable, but that doesn't affect object creation per se.
DLLs (shared/dynamic libraries in general) are memory-mapped into the process's memory space. Every modification is carried out as copy-on-write (COW): a single DLL is loaded only once into memory, and for every write into a mutable area a copy of that area is created (generally page-sized).

If compiled as 32-bit, Bar's data members should come to 4 bytes (one int).
Foo adds another 10 bytes (2 ints + 2 chars).
Since Foo inherits from Bar, that is at least 4 + 10 = 14 bytes per instance.
GCC has attributes for packing structs so that there is no padding. In this case 100 entries would take up 1400 bytes, plus a tiny overhead for aligning the allocation and some overhead for memory management.
If no packed attribute is specified, it depends on the compiler's alignment rules.
But this doesn't consider how much memory the vtable pointer adds to each instance, the vtable itself, or the size of the compiled code.
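If you're curious, here's a sketch of the GCC/Clang packing attribute in action (note that packed structs can make field access slower on some targets, and packing a class that has a vptr is generally a bad idea):

#include <cstdio>

struct Plain {                           // default alignment: padding inserted
    int a;
    char b;
    int c;
    char d;
};

struct __attribute__((packed)) Packed {  // GCC/Clang extension: no padding
    int a;
    char b;
    int c;
    char d;
};

int main() {
    std::printf("sizeof(Plain)  = %zu\n", sizeof(Plain));   // typically 16
    std::printf("sizeof(Packed) = %zu\n", sizeof(Packed));  // 10
}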

It's very difficult to give an exact answer to your question, as this is implementation dependent, but approximate values for a 32-bit implementation might be:
int Bar::mProtectedVar; // 4 bytes
int Foo::mPublicVar; // 4 bytes
char Foo::mPublicVar2; // 1 byte
There are alignment issues here and the final total may well be 12 bytes. You will also have a vptr - say another 4 bytes. So the total size for the data is around 16 bytes per instance. It's impossible to say how much space the code will take up, but you are correct in thinking there is only one copy of the code shared between all instances.
When you ask
I assume dlls get their own stack, heap, code and data segments.
the answer is that there really isn't much difference between data in a DLL and data in an app - basically they share everything between them. This has to be so when you think about it - if they had different stacks (for example), how could function calls work?

Related

Enforcing a vftable entry in windbg "x /2" results, what to consider?

(This is quite a large question about software design. In case it's not suited for StackOverflow I'm willing to copy it to the Software-Engineering community)
I'm working with heap_stat, a script which investigates dumps. This script is based on the idea that, for any object which has a virtual function, the vftable field is always the first one (which makes it possible to find the memory address of the class of the object).
In my applications there are some objects having vftable entries (typically every STL object has one), but there are also quite a few objects that don't.
In order to force the presence of a vftable field, I've done the following test:
Create a nonsense class having a virtual function, and let my class inherit from this nonsense class:
class NONSENSE {
    virtual int nonsense() { return 0; }
};
class Own_Class : public NONSENSE, ...
This, as expected, created a vftable entry in the symbols, which I could find (using Windbg's x /2 *!Own_Class*vftable* command):
00000000`012da1e0 Own_Application!Own_Class::`vftable'
I also saw a difference in memory usage:
sizeof(a normal Own_Class object) = 2928
sizeof(inherited Own_Class object) = 2936
=> 8 bytes have been added for this object.
There's a catch: apparently quite a few objects are defined as:
class ATL_NO_VTABLE Own_Class
This ATL_NO_VTABLE blocks the creation of the vftable entry, which means the following (ATL_NO_VTABLE equals __declspec(novtable)):
// __declspec(novtable) is used on a class declaration to prevent the vtable
// pointer from being initialized in the constructor and destructor for the
// class. This has many benefits because the linker can now eliminate the
// vtable and all the functions pointed to by the vtable. Also, the actual
// constructor and destructor code are now smaller.
In my opinion, this means that the vftable does not get created, so object methods are called more directly, which has an impact on the speed of method execution and on stack handling. Allowing the vftable to be created has the following impact:
Not to be taken into account:
There is one more call on the stack; this only has an impact on systems which are already at the limit of their memory usage. (I have no idea how the linker points to a particular method.)
The CPU usage increase will be too small to be seen.
The speed decrease will be too small to be seen.
To be taken into account:
As mentioned before, the memory usage of the application increases by 8 bytes per object. When a regular object has a size of some 1000 bytes, this means a memory usage increase of about 1%, but for objects smaller than 80 bytes, this might cause a memory usage increase of 10% or more.
Now I have the following questions:
Is my analysis on the impact correct?
Is there a better way to force the creation of the vftable field, having less impact?
Did I miss anything?
Thanks in advance
Is my analysis on the impact correct?
No. __declspec(novtable) omits generation of the vtable itself for a given class; the pointer to the vtable still exists, so sizeof will not change.
__declspec(novtable) is meant to be used on base classes that have derived classes: the constructor of the derived class sets the vtable pointer to the derived vtable, so the base vtable is never needed.
So, this optimization eliminates one pointer assignment (in the generated part of the constructor code) and a bit of space for the vtable itself. It is not very useful for your goal of per-object optimization, as it only makes a small per-class optimization.
It works only if you don't create base instances on their own, and don't call virtual methods in the constructor/destructor.
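A minimal MSVC-specific sketch of that intended usage (illustrative, not a recommendation):

// __declspec(novtable) is a Microsoft extension.
struct __declspec(novtable) Base {
    virtual void f() = 0;
    virtual ~Base() {}
};

struct Derived : Base {
    void f() override {}
};

int main() {
    Derived d;   // fine: Derived's constructor installs Derived's vtable pointer
    d.f();
    // Dangerous: calling a virtual function from Base's constructor or
    // destructor would go through an uninitialized vtable pointer.
}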
Omitting virtual function calls by making them non-virtual is a completely separate story. It is called devirtualization. When the compiler can be sure which class an instance is, it replaces virtual calls with non-virtual ones.
__declspec(novtable) does not help devirtualization in any way. The final / sealed keywords may help, as they promise that there are no further derived classes/methods.
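For example, a sketch of the kind of call final lets the compiler optimize (whether it actually devirtualizes is implementation-dependent):

struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() {}
};

// 'final' promises no class derives from Circle, so calls through
// Circle pointers/references may be turned into direct calls.
struct Circle final : Shape {
    double r;
    explicit Circle(double radius) : r(radius) {}
    double area() const override { return 3.14159265358979 * r * r; }
};

double f(const Circle& c) {
    return c.area();   // candidate for devirtualization
}

int main() {
    Circle c(2.0);
    return f(c) > 0 ? 0 : 1;
}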
Regarding the assumption that the vtable pointer is the first member: this may be wrong. The vtable pointer will not be first if your base classes have no vtable but do have data members. Also, there may be more than one vtable pointer.
To analyze structures in a dump programmatically, I would recommend using a proper API. There are two: the DIA SDK and the dbghelp functions. They are similar, but the first is object-based (COM) and the second is a flat API, so the first may be easier to use.
As the approach with the heap_stat script is inherently limited, I would recommend using UMDH for heap analysis instead; it does not rely on the vtable at all and shows all kinds of objects.
In the meantime, I've found a terribly easy way to force vftable entries for every class: just declare every destructor as virtual.
In order to find all destructors that are not yet virtual, I've launched the following command in my Ubuntu app within my development directory:
find ./ -name "*.h" -exec fgrep "~" {} /dev/null \; | grep -v "virtual"
After having declared all destructors as virtual, I'm planning to do some performance testing (I believe that declaring a method as virtual might have an impact on speed, as the way the method is called has changed, especially for a server application under heavy load). I'll keep this post up to date.

How can class definitions not occupy memory?

So I have read this about whether class definitions occupy memory and this about whether functions occupy memory. This is what I do not get: how come class definitions do not occupy memory if functions, or at least their code, do? I mean, class definitions are also code, so shouldn't they occupy memory just like function code does?
It is not entirely correct to say that class definitions do not occupy memory: any class with member functions may place some code in memory, although the amount of code and its actual placement depends heavily on function inlining.
The Q&A at the first link talks about sizeof, which shows a per-instance memory requirement of the class, which excludes memory requirements for storing member functions, static members, inlined functions, dispatch tables, and so on. This is because all these elements are shared among all instances of the class.
You don't need to keep the class definition anywhere, because the details of how to create an instance of a class are encoded in its constructors.
(In a sense, the class definition is code, it's just not represented explicitly.)
All you need to know in order to create an object is
How big it is,
Which constructor to use for creating it, and
What its virtual functions are.
To create an instance of class A:
Reserve a piece of memory of size sizeof(A) (or be handed one),
Associate that piece of memory with the virtual functions of A, if any (usually held in a table in a predetermined location), and
Tell the relevant A constructor where the A should be created, and then let it do the actual work.
You don't need to know a thing about the types of member variables or anything like that, the constructors know what to do once they know where the object is to be created.
(Every member variable can be found at an offset from the beginning of the object, so the constructor knows where things must be.)
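Incidentally, those three steps are more or less what placement new exposes directly; a minimal sketch:

#include <new>       // placement new
#include <cstdlib>   // std::malloc, std::free

struct A {
    int x;
    A() : x(42) {}   // the constructor does the actual work of creating an A
};

int main() {
    void* mem = std::malloc(sizeof(A));   // step 1: reserve sizeof(A) bytes
    A* a = new (mem) A;                   // steps 2+3: install the vtable pointer
                                          // (if A had one) and run the constructor
    int v = a->x;                         // the object is now usable
    a->~A();                              // destroy explicitly: we own the raw memory
    std::free(mem);
    return v == 42 ? 0 : 1;
}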
To create a function, on the other hand, you would need to store its definition in some form and then generate the code at runtime. (This is usually called "Just-in-time" compilation.)
This requires a compiler, which means that you need to either
Include a compiler in every executable, or
Provide (or require everyone to install) a shared compiler for all executables (Java VMs usually contain at least one).
C++ compilers instead generate the functions in advance.
Abusing terminology a little, you could say that the functions are "instantiated" by the compilation process, with the source code as a blueprint.

Do classes take memory?

#include <iostream>
using namespace std;

class Test
{
    int x;
};

int main()
{
    cout << sizeof(Test);
    return 0;
}
Output: 4
I just want to ask: even though I have not created any object of class Test, why does it print 4?
sizeof(X) is the number of bytes an X takes when created. A call to new tends to use a few more bytes for memory-management overhead, but an array X[N] with automatic or static storage (on-stack, local, global, static, etc.) will take N*sizeof(X) memory in practice (perhaps a little extra for function-local statics due to thread-safety requirements).
It has nothing to do with the amount of memory the type itself takes.
Classes themselves use memory if they have methods that are not optimized away, or if they have a vtable (caused by use of the virtual keyword), or similar. Memory storing code or virtual function tables then exists outside of the memory costs of the instances of the class.
Within the C++ language itself, there is no way to determine how much memory the class itself takes, nor any reliable way to determine what the new overhead is. You can usually puzzle that out by looking at the runtime behaviour, or at the code of the compiler or runtime libraries, for a given platform.
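You can, however, observe the per-instance cost of a vtable pointer directly with sizeof; a small sketch (the numbers in the comments are typical for a 64-bit build, not guaranteed):

#include <iostream>

struct Plain   { int x; };
struct Virtual { int x; virtual ~Virtual() {} };

int main() {
    std::cout << sizeof(Plain)   << '\n';  // typically 4
    std::cout << sizeof(Virtual) << '\n';  // typically 16 on 64-bit:
                                           // 8-byte vptr + 4-byte int + padding
}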

Alternatives for polymorphic data storage

I'm storing a large amount of computed data and I'm currently using a polymorphic type to reduce the amount of storage required. Everything is extremely fast except for deleting the objects when I'm finished, and I think there must be a better alternative. The code computes the state at each step and, depending on the conditions present, it needs to store certain values. The worst case is storing the full object state and the best case is storing almost nothing. The (very simplified) setup is as follows:
class BaseClass
{
public:
    virtual ~BaseClass() { }
    double time;
    unsigned int section;
};

class VirtualSmall : public BaseClass
{
public:
    double values[2];
    int othervalue;
};

class VirtualBig : public BaseClass
{
public:
    double values[16];
    int othervalues[5];
};

...

std::vector<BaseClass*> results(10000);
The appropriate object type is generated during computation and a pointer to it is stored in the vector. The overhead from vtable+pointer is overall much smaller than the size difference between the largest and smallest object (which is at least 200 bytes according to sizeof). Since the smallest object can often be used instead of the largest, and there are potentially many tens of millions of them stored, it can save a few gigabytes of memory usage. The results can then be searched extremely fast, as the base class contains the information necessary to find the correct item, which can then be dynamic_cast back to its real type. It works very well for the most part.
The only issue is with delete. It takes a few seconds to free all of the memory when there are many tens of millions of objects. The delete code iterates through each object and calls delete results[i], which invokes the virtual destructor. While it's not impossible to work around, I think there must be a more elegant solution.
It could definitely be done by allocating largish contiguous blocks of memory (with malloc or similar), keeping track of them, and having something generate correct pointers to the next batch of free memory inside a block. Each such pointer is then stored in the vector. To free the memory, the smaller number of large blocks have free() called on them. There is no more vtable (it can be replaced by a smaller type field to ensure the correct cast), which saves space as well. It is very much a C-style solution though, and not particularly pretty.
Is there a C++ style solution to this type of problem I'm overlooking?
You can overload operator new (i.e. void* VirtualSmall::operator new(size_t)) for your classes, and implement them to obtain memory from custom allocators. I would use one block allocator for each derived class, so that each block size is a multiple of the size of the class it's supposed to store.
When it's time to clean up, tell each allocator to release all of its blocks. No destructors will be called, so make sure you don't need them.
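A minimal sketch of the idea (the Arena class and all names here are invented for illustration; a real version needs block-size checks, thread safety, and per-class tuning):

#include <cstddef>
#include <cstdlib>
#include <vector>

// Minimal bump allocator: hands out slots from large malloc'd blocks and
// frees everything in one sweep. Destructors are deliberately never run.
class Arena {
    std::vector<void*> blocks_;
    char*       cur_  = nullptr;
    std::size_t left_ = 0;
    static const std::size_t kBlockSize = 1 << 20;   // 1 MiB per block
public:
    void* allocate(std::size_t n) {
        // round the request up so every slot stays suitably aligned
        const std::size_t a = alignof(std::max_align_t);
        n = (n + a - 1) & ~(a - 1);
        if (n > left_) {
            cur_ = static_cast<char*>(std::malloc(kBlockSize));
            blocks_.push_back(cur_);
            left_ = kBlockSize;
        }
        void* p = cur_;
        cur_  += n;
        left_ -= n;
        return p;
    }
    ~Arena() {
        for (void* b : blocks_)
            std::free(b);   // bulk release: few free() calls, no per-object deletes
    }
};

Arena arena;   // one global arena, just for the example

struct Result {
    double time;
    unsigned int section;
    void* operator new(std::size_t n) { return arena.allocate(n); }
    void  operator delete(void*) {}   // no-op: the arena frees in bulk
};

int main() {
    std::vector<Result*> results;
    for (int i = 0; i < 1000000; ++i)
        results.push_back(new Result());   // fast bump allocation
    // no slow delete loop: the arena's destructor frees the blocks at exit
}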

Can you write a polymorphic class to disk and survive?

Firstly, I know that writing a class to disk is bad, but you should see some of our other code. D:
My question is: can I write a polymorphic class to disk and then read it in later and not get undefined behaviour? I am going to guess not because of vtables (I think these are generated at runtime and unique to the object?)
I.e.
class A {
public:
    virtual ~A() {}
    virtual void foo() = 0;
};

class B : public A {
public:
    virtual ~B() {}
    virtual void foo() {}
};

A* a = new B;
fwrite( a, 1, sizeof( B ), fp );
delete a;
a = new B;
fread( a, 1, sizeof( B ), fp );
a->foo();
delete a;
Thank-you!
I suggest you take a look at Boost Serialization.
we use the term "serialization" to mean the reversible deconstruction of an arbitrary set of C++ data structures to a sequence of bytes. Such a system can be used to reconstitute an equivalent structure in another program context. Depending on the context, this might be used to implement object persistence, remote parameter passing or other facilities.
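A minimal sketch of what that can look like for a polymorphic hierarchy like yours (the archive type, file name, and payload member are all just for illustration):

#include <fstream>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/export.hpp>

struct A {
    virtual ~A() {}
    virtual void foo() = 0;
    template <class Archive>
    void serialize(Archive&, unsigned) {}          // no data members here
};

struct B : A {
    int payload = 0;
    void foo() override {}
    template <class Archive>
    void serialize(Archive& ar, unsigned) {
        ar & boost::serialization::base_object<A>(*this);
        ar & payload;
    }
};
BOOST_CLASS_EXPORT(B)   // registers B so it can be saved/restored through an A*

int main() {
    {
        std::ofstream ofs("obj.txt");
        boost::archive::text_oarchive oa(ofs);
        const A* out = new B;
        oa << out;          // writes a portable representation, not raw memory bytes
        delete out;
    }
    A* in = nullptr;
    std::ifstream ifs("obj.txt");
    boost::archive::text_iarchive ia(ifs);
    ia >> in;               // reconstructs a B behind the A*
    in->foo();
    delete in;
}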
You might be able to get away with it if such objects are always read back during the same execution of the program that wrote them (though I really don't recommend it). But if the data in the file must persist between different executions of the program, then using the raw bytes of the in-memory objects will almost certainly lead to significant problems.
Each vtable itself is generated at compile time and stored somewhere in the resulting executable. What each object instance contains is just a pointer to the appropriate vtable, and that pointer does not change for the lifetime of any given object. (Multiple inheritance can be a little more complicated, but for this discussion those details aren't relevant. The pointers are still constant.)
So if an object has a vtable pointer and you write the raw bytes of that object to disk, then the vtable pointer is written to disk as well. If you then read back those bytes during the same execution of the program and push them into an appropriate object, it may work since the vtable will still be in the same location and thus the vtable pointer will still be correct.
(However note that everything I just explained there is an implementation detail. While many compilers typically implement virtual functions in that manner, I don't think any of the exact details are guaranteed by the C++ standard. So there could be additional potential problems.)
Now, if this might be possible, why not store such objects for longer durations? Because you have no guarantee that any particular virtual table will be in the same memory location.
Some operating systems may change the memory layout for each execution of the same program. I don't know whether or not this actually affects virtual table locations, but that's certainly a serious risk.
Furthermore, if you ever compile a new version of the program, the location of each virtual table is completely up to the whims of the compiler. Changes to seemingly unrelated parts of the code may cause the compiler to place the relevant virtual tables in different locations. Obviously, that happening would completely break this scheme. And you have no way to prevent it from happening.
(And beyond the vtables, what if new data members need to be added to those objects in subsequent versions of the program? You might have to deal with reading past versions of raw objects' bytes into new versions that have new members or a different layout of members. That can get complicated and ugly as well as error prone.)
Now, even if you only intend to store the objects temporarily for each execution of the program, I still don't think it's a good idea. You are highly restricted as to what kinds of variables these objects can contain. No smart objects (std::string, std::vector, etc.). No pointers to memory allocated per object. Any strings must therefore be stored in raw character arrays. Other dynamic allocation would have to be turned into fixed members or member arrays. That means you lose a lot of C++'s benefits everywhere these objects are used.
Furthermore, these objects and the scheme of writing this directly to disk would need to be accompanied by comments and documentation warning of all the dangers I've described. Otherwise, some future programmer might unknowingly decide to add the wrong kind of data member. Or even worse, they might decide to try storing such objects longer than the execution of the program, opening them up to serious crashes and failures that might not happen until much later in the future (and probably at the worst possible time).
In the end, I strongly suggest using a scheme that stores the data in a format specifically intended for the file. As someone else already mentioned, Boost Serialization is a good option. If not that, there may be other usable serialization libraries. Or else, depending on your needs, you may be able to roll your own mechanism without too much trouble.
The problem is not the vtable. It is stored per class type, not per instance, so you won't write it to file. Basically your code should work (haven't tried it).
However, you should keep in mind that reading pointers/handles from file does not work.
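To see why, consider what actually lands on disk when an object containing a pointer is written raw (a small sketch):

#include <cstdio>

struct Record {
    char* name;     // a pointer member
    int   id;
};

int main() {
    char text[] = "hello";
    Record r = { text, 7 };
    std::FILE* fp = std::fopen("rec.bin", "wb");
    std::fwrite(&r, sizeof r, 1, fp);   // stores the *address* held in r.name,
    std::fclose(fp);                    // not the characters it points to
    // A later run that freads this Record gets an address into a process
    // that no longer exists; dereferencing r.name is undefined behaviour.
}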