How does GCC store member functions in memory? - c++

I am trying to minimise the size my class occupies in memory (both data and instructions). I know how to minimise data size, but I am not too familiar with how GCC places member functions.
Are they stored in memory, the same order they are declared in the class?

For the purpose of in-memory data representation, a C++ class can have either plain or static member functions, or virtual member functions (including some virtualdestructor, if any).
Plain or static member functions do not take any space in data memory, but of course their compiled code take some resource, e.g. as binary code in the text or code segment of your executable or your process. Of course, they can also require static data (or thread-local data), or local data (e.g. local variables) on the call stack.
My answer is Linux oriented. I don't know Windows, and don't know how GCC work on it.
Virtual member functions are very often implemented thru virtual method table (or vtable); a class having some virtual member functions usually have instances with a single (assuming single-inheritance) vtable-pointer pointing to that vtable (which is practically some data packed in the text segment).
Notice that vtables are not mandatory and are not required by C++11 standard. But I don't know any C++ implementation not using them.
When you are using multiple-inheritance things become more complex, objects might have several vtable pointers.
So if you have a class (either a root class, or using single-inheritance), the consumption for virtual member functions is one vtable pointer per instance (plus the small space needed by the single vtable itself). It won't change (for each instance) if you have only one virtual member function (or destructor) or a thousand of them (what would change is the vtable itself). Each class has its own single vtable (unless it has no virtual member function), and each instance has generally one (for single-inheritance case) vtable pointer.
The GCC compiler is free to organize the vtable as it wishes (and its order and layout is an implementation detail you should not care about); see also this. In practice (for single-inheritance) for most recent GCC versions, the vtable pointer is the first word of the object, and the vtable contain function pointers in the order of virtual method declaration, but you should not depend on such details.
The GCC compiler is free to organize the functions in the code segment as it wishes, and it would actually reorder them (e.g. for optimizations). Last time I looked, it ordered them in reverse order. But you certainly should not depend on that order! BTW GCC can inline functions (even when not marked inline) and clone functions when optimizing. You could also compile and link with link-time optimizations (e.g. make CXX='g++ -flto -Os'), and you could ask for profile-guided optimizations (for GCC: -fprofile-generate, -fprofile-use, -fauto-profile etc...)
You should not depend on how the compiler (and linker) is organizing function code or vtables. Leave the optimizations to the compiler (and such optimizations depend upon your target machine, your compiler flags, and the compiler version). You might also use function attributes to give hints to the GCC (or Clang/LLVM) compiler (e.g. __attribute__((cold)), __attribute__((noinline)) etc etc....)
If you really need to know how functions are placed (which IMHO is very wrong), study the generated assembly code (e.g. using g++ -O -fverbose-asm -S) and be aware that it could vary with compiler versions!
If you need on Linux and Posix systems at runtime to find out the address of a function from its name, consider using dlsym (for Linux, see dlsym(3), which also documents dladdr). Be aware of name mangling, which you can disable by declaring such functions as extern "C" (see C++ dlopen minihowto).
BTW, you might compile and link with -rdynamic (which is very useful for dlopen etc...). If you really need to know the address of functions, use nm(1) as nm -C your-executable.
You might also read the ABI specification and calling conventions for your target platform (and compiler), e.g. Linux x86-64 ABI spec.

Let's say we have a type T with 4 instance methods.
class T {
public:
void member_function_1() { ... }
void member_function_2() { ... }
void member_function_3() { ... }
void member_function_4() { ... }
};
The amount of memory that those methods take up is the same if we instantiate 1 copy of T, or if we instantiate 1 million copies of T.

Related

How can I force generation of a non-inlined *copy* of an inlined constructor in some unit?

In a DLL (built with MSVC) I have a legacy published class in a C++ header, that declares a non-illine constructor (and non-inline, virtual destructor), defined in a unit. That means that library users expect the constructor implementation in the library.
Due to performance reasons, I would like the constrictor to be inlined for the internal code. However, just defining it inline in the header would mean that users might not find the non-inlined definition in the library. At least that is my understanding - which might be wrong; is it guaranteed by something (in MSVC, or in the standard) that the implementation is emitted for such class?
So the question is: if it's possible to guarantee emitting the implementation of inlined constructor in the DLL in some way, so that inlining would work for the newly compiled code, but existing binaries will still keep finding non-inlined implementation. Since I can't take the address of the constructor, I can't use this trick.
Another way would be to create another constructor, with a dummy argument, to be used in internal code, and leave the old default constructor alone; but if possible, I'd like to avoid increasing the complexity here.

Is the compiler allowed to optimise out private data members?

If the compiler can prove that a (private) member of a class is never used, including by potential friends, does the standard allow the compiler to remove this member from the memory footprint of the class?
It is self-evident that this not possible for protected or public members at compile time, but there could be circumstances where it is possible regarding private data members for such a proof to be constructed.
Related questions:
Behind the scenes of public, private and protected (sparked this question)
Is C++ compiler allowed to optimize out unreferenced local objects (about automatic objects)
Will a static variable always use up memory? (about static objects)
Possible in theory (along with unused public members), but not with the kind of compiler ecosystem we're used to (targeting a fixed ABI that can link separately-compiled code). Removing unused members could only be done with whole-program optimization that forbids separate libraries1.
Other compilation units might need to agree on sizeof(foo), but that wouldn't be something you could derive from a .h if it depended on verifying that no implementation of a member function's behaviour depended on any private members.
Remember C++ only really specifies one program, not a way to do libraries. The language ISO C++ specifies is compatible with the style of implementation we're used to (of course), but implementations that take all the .cpp and .h files at once and produce a single self-contained non-extensible executable are possible.
If you constrain the implementation enough (no fixed ABI), aggressive whole-program application of the as-if rule becomes possible.
Footnote 1: I was going to add "or exports the size information somehow to other code being compiled" as a way to allow libraries, if the compiler could already see definitions for every member function declared in the class. But #PasserBy's answer points out that a separately-compiled library could be the thing that used the declared private members in ways that ultimately produce externally-visible side effects (like I/O). So we'd have to fully rule them out.
Given that, public and private members are equivalent for the purposes of such an optimization.
If the compiler can prove that a (private) member of a class is never used
The compiler cannot prove that, because private members can be used in other compilation units. Concretely, this is possible in the context of a pointer to member in a template argument according to [temp.spec]/6 of the standard, as originally described by Johannes Schaub.
So, in summary: no, the compiler must not optimise out private data members any more than public or protected members (subject to the as-if rule).
No, because you can subvert the access control system legally.
class A
{
int x;
};
auto f();
template<auto x>
struct cheat
{
friend auto f() { return x; }
};
template struct cheat<&A::x>; // see [temp.spec]/6
int& foo(A& a)
{
return a.*f(); // returns a.x
}
Given that the compiler must fix the ABI when A is first used, and that it can never know whether some future code may access x, it must fix the memory of A to contain x.

What the benefits by using the template<int size> than dynamic allocate?

I'm reading the pbrt and it has defined a type:
template <int nSpectrumSamples>
class CoefficientSpectrum;
class RGBSpectrum : public CoefficientSpectrum<3> {
using CoefficientSpectrum<3>::c;
typedef RGBSpectrum Spectrum;
// typedef SampledSpectrum Spectrum;
And the author said:
"We have not written the system such that the selection of which Spectrum implementation to use could be resolved at run time; to switch to a different representation, the entire system must be recompiled. One advantage to this design is that many of the various Spectrum methods can be implemented as short functions that can be inlined by the compiler, rather than being left as stand-alone functions that have to be invoked through the relatively slow virtual method call mechanism. Inlining frequently used short functions like these can give a substantial improvement in performance."
1.Why template can inline function but normal way can not?
2.Why do normal way has to use the virtual method?
Linkage to the entire header file:
https://github.com/mmp/pbrt-v3/blob/master/src/core/spectrum.h
To inline a function call, the compiler has to know 1. which function is called and 2. the exact code of that function. The whole purpose of virtual functions is to defer the choice which function is called to run-time, so compilers can obtain the above pieces of information only with sophisticated optimization techniques that require very specific circumstances1.
Both templates and virtual functions (i.e. polymorphy) are tools for encoding abstraction. The code that uses a CoefficientSpectrum does not really care about the implementation details of the spectrum, only that you can e.g. convert it to and from RGB - that's why it uses an abstraction (to avoid repeating the code for each kind of spectrum). As explained in the comment you quoted, using polymorphy for abstraction here would mean that the compiler has a hard time optimizing the code because it fundamentally defers the choice of implementation to run-time (which is sometimes useful but not strictly necessary here). By requiring the choice of implementation to be made at compile-time, the compiler can easily optimize (i.e. inline) the code.
1For example, some compilers are able to optimize away the std::function abstraction, which generally uses polymorphy for type erasure. Of course, this can only work if all the necessary information is available.

Will adding enum definition inside a class break its binary-backward-compatibility?

I know adding static member function is fine, but how about an enum definition? No new data members, just it's definition.
A little background:
I need to add a static member function (in a class), that will recognize (the function) the version of an IP address by its string representation. The first thing, that comes to my mind is to declare a enum for IPv4, IPv6 and Unknown and make this enum return code of my function.
But I don't want to break the binary backward compatibility.
And a really bad question (for SO) - is there any source or question here, I can read more about that? I mean - what breaks the binary compatibility and what - does not. Or it depends on many things (like architecture, OS, compiler..)?
EDIT: Regarding the #PeteKirkham 's comment: Okay then, at least - is there a way to test/check for changed ABI or it's better to post new question about that?
EDIT2: I just found a SO Question : Static analysis tool to detect ABI breaks in C++ . I think it's somehow related here and answers the part about tool to check binary compatibility. That's why I relate it here.
The real question here, is obviously WHY make it a class (static) member ?
It seems obvious from the definition that this could perfectly be a free function in its own namespace (and probably header file) or if the use is isolated define in an anonymous namespace within the source file.
Although this could still potentially break ABI, it would really take a funny compiler to do so.
As for ABI breakage:
modifying the size of a class: adding data members, unless you manage to stash them into previously unused padding (compiler specific, of course)
modifying the alignment of a class: changing data members, there are tricks to artificially inflate the alignment (union) but deflating it requires compiler specific pragmas or attributes and compliant hardware
modifying the layout of a vtable: adding a virtual method may change the offsets of previous virtual methods in the vtable. For gcc, the vtable is layed out in the order of declaration, so adding the virtual method at the end works... however it does not work in base classes as vtable layout may be shared with derived classes. Best considered frozen
modyfing the signature of a function: the name of the symbol usually depends both on the name of the function itself and the types of its arguments (plus for methods the name of the class and the qualifiers of the method). You can add a top-level const on an argument, it's ignored anyway, and you can normally change the return type (this might entails other problems though). Note that adding a parameter with a default value does break the ABI, defaults are ignored as far as signatures are concerned. Best considered frozen
removing any function or class that previously exported symbols (ie, classes with direct or inherited virtual methods)
I may have forgotten one or two points, but that should get you going for a while already.
Example of what an ABI is: the Itanium ABI.
Formally... If you link files which were compiled against two different
versions of your class, you've violated the one definition rule, which
is undefined behavior. Practically... about the only things which break
binary compatibilty are adding data members or virtual functions
(non-virtual functions are fine), or changing the name or signature of a
function, or anything involving base classes. And this seems to be
universal—I don't know of a compiler where the rules are
different.

C++ vtable query

I have a query in regard to the explaination provided here http://www.parashift.com/c++-faq/virtual-functions.html#faq-20.4
In the sample code the function mycode(Base *p), calls virt3 method as p->virt3(). Here how exactly the compiler know that virt3 is found in third slot of vtable? How do it compare and with what ?
When the compiler sees the definition of Base it decides the layout of its vtable according to some algorithm1, which is common to all its derived classes as far as methods inherited from Base are concerned (derived classes may add other virtual methods, but they are put into the vtable after the stuff inherited from Base).
Thus, when the compiler sees p->virt3(), it already knows that for any object that inherits from Base the pointer to the correct virt3 is e.g. in the third slot of the vtable (because that's how it laid out the vtable of Base at the moment of its definition), so it can correctly generate the code for the virtual call.
Long story short (driving inspiration from #David Rodríguez's comment): it knows where it stays because he decided it before.
1. The standard do not mandate any particular algorithm (actually, it doesn't say anything about how the C++ ABI should be implenented), but there are several widespread C++ ABI specifications, notably the COM ABI on Windows and the Itanium ABI on Linux (and in general for gcc). Obviously, given the same class definition, the algorithm must give the same vtable layout every time, otherwise it would be impossible to link together different object modules.
The layout of the vtable is specified by the Itanium C++ ABI, followed by many compilers including GCC. The compiler itself doesn't decide where the function pointers go (though I suppose it does decide to abide by the ABI!).
The order of the virtual function pointers in a virtual table is the order of declaration of the corresponding member functions in the class.
(Example.)
COM — used by Visual Studio — also emits vtable pointers in source order (though I can't find normative documentation to prove that).
Also, because the function name doesn't even exist at runtime (but a function pointer), the layout of the vtable at compile-time doesn't really matter. The function call translation works in just the same way that a normal function call translation works: the compiler is already mapping the function name to an address in its internal machinery. The only difference is that the mapping here is to a location in the vtable, rather than to the start of the actual function code.
This also addresses your concern about interoperability, to some extent.
Do remember, though, that this is all implementation machinery and C++ itself has no knowledge that virtual tables even exist.
The compiler has a well defined algorithm for allocating the entries in the vtable so that the order of the entries will always be the same regardless of which translation unit is being processed. Internal to the compiler is a mapping between the function names and their location in the vtable so the compiler can do the correct transformation between function call and vtable index.
It is important, therefore, that changes to the definition of a class with virtual functions causes all source files that are dependent on the class to be recompiled, otherwise bad things could happen.