Suppose I have two translation-units:
foo.cpp
#include <vector>
void foo() {
auto v = std::vector<int>();
}
bar.cpp
#include <vector>
void bar() {
auto v = std::vector<int>();
}
When I compile these translation-units, each will instantiate std::vector<int>.
My question is: how does this work at the linking stage?
Do both instantiations have different mangled names?
Does the linker remove them as duplicates?
C++ requires that an inline function definition
be present in every translation unit that references the function. Template member
functions are implicitly inline, but also by default are instantiated with external
linkage. Hence the duplication of definitions that will be visible to the linker when
the same template is instantiated with the same template arguments in different
translation units. How the linker copes with this duplication is your question.
Your C++ compiler is subject to the C++ Standard, but your linker is not subject
to any codified standard as to how it shall link C++: it is a law unto itself,
rooted in computing history and indifferent to the source language of the object
code it links. Your compiler has to work with what a target linker
can and will do so that you can successfully link your programs and see them do
what you expect. So I'll show you how the GCC C++ compiler interworks with
the GNU linker to handle identical template instantiations in different translation units.
This demonstration exploits the fact that while the C++ Standard requires -
by the One Definition Rule
- that the instantiations in different translation units of the same template with
the same template arguments shall have the same definition, the compiler -
of course - cannot enforce any requirement like that on relationships between different
translation units. It has to trust us.
So we'll instantiate the same template with the same parameters in different
translation units, but we'll cheat by injecting a macro-controlled difference into
the implementations in different translation units that will subsequently show
us which definition the linker picks.
If you suspect this cheat invalidates the demonstration, remember: the compiler
cannot know whether the ODR is ever honoured across different translation units,
so it cannot behave differently on that account, and there's no such thing
as "cheating" the linker. Anyhow, the demo will demonstrate that it is valid.
First we have our cheat template header:
thing.hpp
#ifndef THING_HPP
#define THING_HPP
#ifndef ID
#error ID undefined
#endif
template<typename T>
struct thing
{
T id() const {
return T{ID};
}
};
#endif
The value of the macro ID is the tracer value we can inject.
Next a source file:
foo.cpp
#define ID 0xf00
#include "thing.hpp"
unsigned foo()
{
thing<unsigned> t;
return t.id();
}
It defines function foo, in which thing<unsigned> is
instantiated to define t, and t.id() is returned. By being a function with
external linkage that instantiates thing<unsigned>, foo serves the purposes
of:-
obliging the compiler to do that instantiating at all
exposing the instantiation in linkage so we can then probe what the
linker does with it.
Another source file:
boo.cpp
#define ID 0xb00
#include "thing.hpp"
unsigned boo()
{
thing<unsigned> t;
return t.id();
}
which is just like foo.cpp except that it defines boo in place of foo and
sets ID = 0xb00.
And lastly a program source:
main.cpp
#include <iostream>
extern unsigned foo();
extern unsigned boo();
int main()
{
std::cout << std::hex
<< '\n' << foo()
<< '\n' << boo()
<< std::endl;
return 0;
}
This program will print, as hex, the return value of foo() - which our cheat should make
= f00 - then the return value of boo() - which our cheat should make = b00.
Now we'll compile foo.cpp, and we'll do it with -save-temps because we want
a look at the assembly:
g++ -c -save-temps foo.cpp
This writes the assembly in foo.s and the portion of interest there is
the definition of thing<unsigned int>::id() const (mangled = _ZNK5thingIjE2idEv):
.section .text._ZNK5thingIjE2idEv,"axG",@progbits,_ZNK5thingIjE2idEv,comdat
.align 2
.weak _ZNK5thingIjE2idEv
.type _ZNK5thingIjE2idEv, @function
_ZNK5thingIjE2idEv:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
movl $3840, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
Three of the directives at the top are significant:
.section .text._ZNK5thingIjE2idEv,"axG",@progbits,_ZNK5thingIjE2idEv,comdat
This one puts the function definition in a linkage section of its own called
.text._ZNK5thingIjE2idEv that will be output, if it's needed, merged into the
.text (i.e. code) section of the program in which the object file is linked. A
linkage section like that, i.e. .text.<function_name>, is called a function-section.
It's a code section that contains only the definition of function <function_name>.
The directive:
.weak _ZNK5thingIjE2idEv
is crucial. It classifies thing<unsigned int>::id() const as a weak symbol.
The GNU linker recognises strong symbols and weak symbols. For a strong symbol, the
linker will accept only one definition in the linkage. If there are more, it will give a multiple
-definition error. But for a weak symbol, it will tolerate any number of definitions,
and pick one. If a weakly defined symbol also has (just one) strong definition in the linkage then the
strong definition will be picked. If a symbol has multiple weak definitions and no strong definition,
then the linker can pick any one of the weak definitions, arbitrarily.
The directive:
.type _ZNK5thingIjE2idEv, @function
classifies thing<unsigned int>::id() as referring to a function - not data.
Then in the body of the definition, the code is assembled at the address
labelled by the weak global symbol _ZNK5thingIjE2idEv, the same one locally
labelled .LFB2. The code returns 3840 ( = 0xf00).
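As an aside, the strong-beats-weak rule described for the .weak directive above is easy to see without templates. Here is a minimal sketch using GCC's __attribute__((weak)); the file and function names are my own and not part of the original demo:
weak_def.cpp
// Weak definition: used only if no strong definition is linked.
__attribute__((weak)) int answer() { return 1; }
strong_def.cpp
// Strong definition of the same symbol.
int answer() { return 2; }
answer_main.cpp
#include <cstdio>
int answer();
int main() { std::printf("%d\n", answer()); return 0; }
Linking with g++ answer_main.cpp weak_def.cpp strong_def.cpp should print 2, because the single strong definition wins. Link only answer_main.cpp and weak_def.cpp and it should print 1, the weak definition then being the only one available.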
Next we'll compile boo.cpp the same way:
g++ -c -save-temps boo.cpp
and look again at how thing<unsigned int>::id() is defined in boo.s
.section .text._ZNK5thingIjE2idEv,"axG",@progbits,_ZNK5thingIjE2idEv,comdat
.align 2
.weak _ZNK5thingIjE2idEv
.type _ZNK5thingIjE2idEv, @function
_ZNK5thingIjE2idEv:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
movl $2816, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
It's identical, except for our cheat: this definition returns 2816 ( = 0xb00).
While we're here, let's note something that might or might not go without saying:
Once we're in assembly (or object code), classes have evaporated. Here,
we're down to: -
data
code
symbols, which can label data or label code.
So nothing here specifically represents the instantiation of thing<T> for
T = unsigned. All that's left of thing<unsigned> in this instance is
the definition of _ZNK5thingIjE2idEv, a.k.a. thing<unsigned int>::id() const.
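If you want to check that reading of the mangled name, c++filt will decode it:
$ c++filt _ZNK5thingIjE2idEv
thing<unsigned int>::id() const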
So now we know what the compiler does about instantiating thing<unsigned>
in a given translation unit. If it is obliged to instantiate a thing<unsigned>
member function, then it assembles the definition of the instantiated member
function at a weakly global symbol that identifies the member function, and it
puts this definition into its own function-section.
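A quick way to confirm the weak binding without reading assembly is nm, which marks defined weak symbols with a W; on the setup assumed above, something like this should do it:
$ nm -C foo.o | grep 'thing<unsigned'
0000000000000000 W thing<unsigned int>::id() const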
Now let's see what the linker does.
First we'll compile the main source file.
g++ -c main.cpp
Then link all the object files, requesting a diagnostic trace on _ZNK5thingIjE2idEv,
and a linkage map file:
g++ -o prog main.o foo.o boo.o -Wl,--trace-symbol='_ZNK5thingIjE2idEv',-Map=prog.map
foo.o: definition of _ZNK5thingIjE2idEv
boo.o: reference to _ZNK5thingIjE2idEv
So the linker tells us that the program gets the definition of _ZNK5thingIjE2idEv from
foo.o and calls it in boo.o.
Running the program shows it's telling the truth:
./prog
f00
f00
Both foo() and boo() are returning the value of thing<unsigned>().id()
as instantiated in foo.cpp.
What has become of the other definition of thing<unsigned int>::id() const
in boo.o? The map file shows us:
prog.map
...
Discarded input sections
...
...
.text._ZNK5thingIjE2idEv
0x0000000000000000 0xf boo.o
...
...
The linker chucked away the function-section in boo.o that
contained the other definition.
Let's now link prog again, but this time with foo.o and boo.o in the
reverse order:
$ g++ -o prog main.o boo.o foo.o -Wl,--trace-symbol='_ZNK5thingIjE2idEv',-Map=prog.map
boo.o: definition of _ZNK5thingIjE2idEv
foo.o: reference to _ZNK5thingIjE2idEv
This time, the program gets the definition of _ZNK5thingIjE2idEv from boo.o and
calls it in foo.o. The program confirms that:
$ ./prog
b00
b00
And the map file shows:
...
Discarded input sections
...
...
.text._ZNK5thingIjE2idEv
0x0000000000000000 0xf foo.o
...
...
that the linker chucked away the function-section .text._ZNK5thingIjE2idEv
from foo.o.
That completes the picture.
The compiler emits, in each translation unit, a weak definition of
each instantiated template member in its own function section. The linker
then just picks the first of those weak definitions that it encounters
in the linkage sequence when it needs to resolve a reference to the weak
symbol. Because each of the weak symbols addresses a definition, any
one of them - in particular, the first one - can be used to resolve all references
to the symbol in the linkage, and the rest of the weak definitions are
expendable. The surplus weak definitions must be ignored, because
the linker can only link one definition of a given symbol. And the surplus
weak definitions can be discarded by the linker, with no collateral
damage to the program, because the compiler placed each one in a linkage section all by itself.
By picking the first weak definition it sees, the linker is effectively
picking at random, because the order in which object files are linked is arbitrary.
But this is fine, as long as we obey the ODR across multiple translation units,
because if we do, then all of the weak definitions are indeed identical. The usual practice of #include-ing a class template everywhere from a header file (and not macro-injecting any local edits when we do so) is a fairly robust way of obeying the rule.
Different implementations use different strategies for this.
The GNU compiler, for example, marks template instantiations as weak symbols. Then at link time, the linker can throw away all definitions but one of the same weak symbol.
The Sun Solaris compiler, on the other hand, does not instantiate templates at all during normal compilation. Then at link time, the linker collects all template instantiations needed to complete the program, and then goes ahead and calls the compiler in a special template-instantiation mode. Thus exactly one instantiation is produced for each template. There are no duplicates to merge or get rid of.
Each approach has its own advantages and disadvantages.
Consider a non-template class definition, say class Bar {...};, defined in a header that is included in multiple translation units. After the compilation phase you have two object files with two definitions, right? Do you think the linker will create two binary definitions for the class in your final binary? Of course not: you have two definitions in two translation units and one final definition in the final binary after the linkage phase is done. This is called linkage collapsing. It is not forced by the standard; the standard only enforces the ODR, which does not say how the linker resolves the duplication - that is up to the linker - but the only resolution I have ever seen is the collapsing one. Of course the linker could keep both definitions, but I cannot imagine why, since the standard requires those definitions to be identical in their semantics (see the ODR rule link above for more details), and if they are not, the program is ill-formed. Now imagine it was not Bar but std::vector<int>. Templates are just a way of generating code in this case; everything else is the same.
Related
In both C and C++, inline functions with external linkage can of course have multiple definitions available at link-time, the assumption being that these definitions are all (hopefully) identical. (I am of course referring to functions declared with the inline linkage specification, not to functions that the compiler or link-time-optimizer actually inlines.)
So what do common linkers typically do when they encounter multiple definitions of a function? In particular:
Are all definitions included in the final executable or shared-library?
Do all invocations of the function link against the same definition?
Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?
P.S. Yes, I know C and C++ are separate languages, but they both support inline, and their compiler-output can typically be linked by the same linker (e.g. GCC's ld), so I believe there cannot be any difference between them in this aspect.
If the function is, in fact, inlined, then there's nothing to link. It's only when, for whatever reason, the compiler decides not to expand the function inline that it has to generate an out-of-line version of the function. If the compiler generates an out-of-line version of the function for more than one translation unit you end up with more than one object file having definitions for the same "inline" function.
The out-of-line definition gets compiled into the object file, and it's marked so that the linker won't complain if there is more than one definition of that name. If there is more than one, the linker simply picks one. Usually the first one it saw, but that's not required, and if the definitions are all the same, it doesn't matter. And that's why it's undefined behavior to have two or more different definitions of the same inline function: there's no rule for which one to pick. Anything can happen.
The linker just has to figure out how to deduplicate all the definitions. That is of course provided that any function definitions have been emitted at all; inline functions may well be inlined. But should you take the address of an inline function with external linkage, you always get the same address (cf. [dcl.fct.spec]/4).
Inline functions aren't the only construct that requires linker support; templates are another, as are inline variables (in C++17).
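A C++17 inline variable gets the same treatment on this front. A minimal sketch, assuming a header shared between two translation units (all names here are mine):
counter.hpp
#ifndef COUNTER_HPP
#define COUNTER_HPP
inline int counter = 0; // C++17: a definition may appear in every TU that includes this
#endif
a.cpp
#include "counter.hpp"
int* addr_a() { return &counter; }
b.cpp
#include "counter.hpp"
int* addr_b() { return &counter; }
check_main.cpp
#include <iostream>
int* addr_a();
int* addr_b();
int main() { std::cout << (addr_a() == addr_b()) << '\n'; } // expected to print 1
Each translation unit emits a definition of counter (typically a weak/COMDAT symbol on ELF), the linker keeps one, and both translation units end up referring to the same object, which is why the two addresses compare equal.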
inline or no inline, C does not permit multiple external definitions of the same name among the translation units contributing to the same program or library. Furthermore, it does not permit multiple definitions of the same name in the same translation unit, whether internal, external, or inline. Therefore, there can be at most two available definitions of a given function in scope in any given translation unit: one internal and/or inline, and one external.
C 2011, 6.7.4/7 has this to say:
Any function with internal linkage can be an inline function. For a function with external linkage, the following restrictions apply: If a function is declared with an inline function specifier, then it shall also be defined in the same translation unit. If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.
(Emphasis added.)
In specific answer to your questions, then, as they pertain to C:
Are all definitions included in the final executable or shared-library?
Inline definitions are not external definitions. They may or may not be included as actual functions, as inlined code, both, or neither, depending on the foibles of the compiler and linker and on details of their usage. They are not in any case callable by name by functions from different translation units, so whether they should be considered "included" is a bit of an abstract question.
Do all invocations of the function link against the same definition?
C does not specify, but it allows for the answer to be "no", even for different calls within the same translation unit. Moreover, inline functions are not external, so no inline function defined in one translation unit is ever called (directly) by a function defined in a different translation unit.
Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?
My answers are based on the current C standard to the extent that it addresses the questions, but as you will have seen, those answers are not entirely prescriptive. Moreover, the standard does not directly address any question of object code or linking, so you may have noticed that my answers are not, for the most part, couched in those terms.
In any case, it is not safe to assume that any given C system is consistent even with itself in these regards for different functions or in different contexts. Under some circumstances it may inline every call to an internal or inline function, so that that function does not appear as a separate function at all. At other times it may indeed emit a function with internal linkage, but that does not prevent it from inlining some calls to that function anyway. In any case, internal functions are not eligible to be linked to functions from other translation units, so the linker is not necessarily involved with linking them at all.
I think the correct answer to your question is "it depends".
Consider following pieces of code:
File x.c (or x.cc):
#include <stdio.h>
void otherfunction(void);
inline void inlinefunction(void) {
printf("inline 1\n");
}
int main(void) {
inlinefunction();
otherfunction();
return 0;
}
File y.c (or y.cc)
#include <stdio.h>
inline void inlinefunction(void) {
printf("inline 2\n");
}
void otherfunction(void) {
printf("otherfunction\n");
inlinefunction();
}
As the inline keyword is only a "suggestion" for the compiler to inline the function, different compilers with different flags behave differently. For example, it looks like the C compiler always "exports" inline functions and does not allow multiple definitions:
$ gcc x.c y.c && ./a.out
/tmp/ccy5GYHp.o: In function `inlinefunction':
y.c:(.text+0x0): multiple definition of `inlinefunction'
/tmp/ccQkn7m4.o:x.c:(.text+0x0): first defined here
collect2: ld returned 1 exit status
while C++ allows it:
$ g++ x.cc y.cc && ./a.out
inline 1
otherfunction
inline 1
More interesting - let's switch the order of the files (and so the order of linking):
$ g++ y.cc x.cc && ./a.out
inline 2
otherfunction
inline 2
Well... it looks like the first one counts! But... let's add an optimization flag:
$ g++ y.cc x.cc -O1 && ./a.out
inline 1
otherfunction
inline 2
And that's the behavior we'd expect: the function got inlined. A different order of files changes nothing:
$ g++ x.cc y.cc -O1 && ./a.out
inline 1
otherfunction
inline 2
Next we can extend our x.c (x.cc) source with a prototype of void anotherfunction(void) and call it in our main function. Let's place the definition of anotherfunction in a z.c (z.cc) file:
#include <stdio.h>
void inlinefunction(void);
void anotherfunction(void) {
printf("anotherfunction\n");
inlinefunction();
}
We do not define the body of inlinefunction this time. Compilation and execution for C++ gives the following results:
$ g++ x.cc y.cc z.cc && ./a.out
inline 1
otherfunction
inline 1
anotherfunction
inline 1
Different order:
$ g++ y.cc x.cc z.cc && ./a.out
inline 2
otherfunction
inline 2
anotherfunction
inline 2
Optimization:
$ g++ x.cc y.cc z.cc -O1 && ./a.out
/tmp/ccbDnQqX.o: In function `anotherfunction()':
z.cc:(.text+0xf): undefined reference to `inlinefunction()'
collect2: ld returned 1 exit status
So the conclusion is: it's best to declare inline together with static, which narrows the scope in which the function can be used, because "exporting" a function that we'd like to have inlined makes no sense.
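For reference, the pattern that conclusion points at is a header-only static inline function; a minimal sketch with my own names (the same idea works in C99 and later as well as in C++):
say_hello.hpp
#ifndef SAY_HELLO_HPP
#define SAY_HELLO_HPP
#include <cstdio>
// static inline: every TU that includes this header gets its own internal-linkage copy,
// so the linker never sees competing external definitions of say_hello.
static inline void say_hello() {
std::printf("hello\n");
}
#endif
Any source file can #include "say_hello.hpp" and call say_hello(); if a call isn't inlined, the out-of-line copy stays local to that object file. The trade-off is possible duplication of the emitted code across object files.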
When inline functions don't end up being inlined, the behavior differs between C++ and C.
In C++ they behave like regular functions, but with an additional symbol flag that allows duplicate definitions, and the linker can select any one of them.
In C, the actual function body gets ignored, and they behave just like external functions.
On ELF targets, the linker behavior needed for C++ is implemented with weak symbols.
Note that weak symbols are often used in combination with regular (strong) symbols, where a strong symbol overrides the weak ones (this is the main use case mentioned in the Wikipedia article on weak symbols). They can also be used to implement optional references (the linker inserts a null value for a weak symbol reference if no definition is found). But for C++ inline functions they provide exactly what we need: given multiple weak symbols defined with the same name, the linker will select one of them - in my tests always the one from the file appearing first in the list of files passed to the linker.
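The "optional reference" use of weak symbols mentioned above looks roughly like this on a GCC/ELF toolchain; the hook name is invented for the sketch:
hook_main.cpp
#include <cstdio>
// Weak declaration: the link succeeds even if nothing defines debug_hook,
// in which case its address is null.
extern void debug_hook() __attribute__((weak));
int main() {
if (debug_hook)
debug_hook();
else
std::printf("no hook linked in\n");
return 0;
}
Linked by itself this should print "no hook linked in"; add another object file that defines void debug_hook() and the hook gets called instead.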
Here are some examples showing the behavior in C++ and then in C:
$ cat c1.cpp
void __attribute__((weak)) func_weak() {}
void func_regular() {}
void func_external();
void inline func_inline() {}
void test() {
func_weak();
func_regular();
func_external();
func_inline();
}
$ g++ -c c1.cpp
$ readelf -s c1.o | c++filt | grep func
11: 0000000000000000 11 FUNC WEAK DEFAULT 2 func_weak()
12: 000000000000000b 11 FUNC GLOBAL DEFAULT 2 func_regular()
13: 0000000000000000 11 FUNC WEAK DEFAULT 6 func_inline()
16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_external()
We're compiling without an optimization flag, so the inline function is not inlined. We see that the inline function func_inline gets emitted as a weak symbol, just like func_weak, which is defined explicitly as weak using the GCC attribute.
Compiling the same program in C, we see that func_inline is a regular external function, same as func_external:
$ cp c1.cpp c1.c
$ gcc -c c1.c
$ readelf -s c1.o | grep func
9: 0000000000000000 11 FUNC WEAK DEFAULT 1 func_weak
10: 000000000000000b 11 FUNC GLOBAL DEFAULT 1 func_regular
13: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_external
14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_inline
So in C, in order to resolve this external reference, one has to designate a single file that contains the actual function definition.
When we use an optimization flag, the inline function actually gets inlined, and no symbol is emitted at all:
$ g++ -O1 -c c1.cpp
$ readelf -s c1.o | c++filt | grep func_inline
$ gcc -O1 -c c1.c
$ readelf -s c1.o | grep func_inline
$
In C and C++ we can manipulate a variable's linkage. There are three kinds of linkage: no linkage, internal linkage, and external linkage. My question is probably related to why these are called "linkage" (How is that related to the linker).
I understand a linker is able to handle variables with external linkage, because references to this variable is not confined within a single translation unit, therefore not confined within a single object file. How that actually works under the hood is typically discussed in courses on operating systems.
But how does the linker handle variables (1) with no linkage and (2) with internal linkage? What are the differences in these two cases?
As far as C++ itself goes, this does not matter: the only thing that matters is the behavior of the system as a whole. Variables with no linkage should not be linked; variables with internal linkage should not be linked across translation units; and variables with external linkage should be linked across translation units. (Of course, as the person writing the C++ code, you must obey all of your constraints as well.)
Inside a compiler and linker suite of programs, however, we certainly do have to care about this. The method by which we achieve the desired result is up to us. One traditional method is pretty simple:
Identifiers with no linkage are never even passed through to the linker.
Identifiers with internal linkage are not passed through to the linker either, or are passed through to the linker but marked "for use within this one translation unit only". That is, there is no .global declaration for them, or there is a .local declaration for them, or similar.
Identifiers with external linkage are passed through to the linker, and if internal-linkage identifiers are also passed through, the external-linkage symbols are marked differently, e.g., they have a .global declaration or lack a .local declaration.
If you have a Linux or Unix like system, run nm on object (.o) files produced by the compiler. Note that some symbols are annotated with uppercase letters like T and D for text and data: these are global. Other symbols are annotated with lowercase letters like t and d: these are local. So these systems are using the "pass internal linkage to the linker, but mark them differently from external linkage" method.
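To see that marking concretely, here is a tiny sketch (file and variable names are mine):
linkage_demo.cpp
int ext_var = 1; // external linkage
static int int_var = 2; // internal linkage
int sum() { return ext_var + int_var; } // keeps both variables referenced
Compile with g++ -c linkage_demo.cpp and run nm linkage_demo.o: ext_var should be listed with an uppercase D (global, initialized data), while the internal-linkage variable appears with a lowercase d under a compiler-local name such as _ZL7int_var, so no other translation unit can refer to it by name.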
The linker isn't normally involved in either internal linkage or no linkage--they're resolved entirely by the compiler, before the linker gets into the act at all.
Internal linkage means two declarations at different scopes in the same translation unit can refer to the same thing.
No Linkage
No linkage means two declarations at different scopes in the same translation unit can't refer to the same thing.
So, if I have something like:
int f() {
static int x; // no linkage
}
...no other declaration of x in any other scope can refer to this x. The linker is involved only to the degree that it typically has to produce a field in the executable telling it the size of static space needed by the executable, and that will include space for this variable. Since it can never be referred to by any other declaration, there's no need for the linker to get involved beyond that though (in particular, the linker has nothing to do with resolving the name).
Internal linkage
Internal linkage means declarations at different scopes in the same translation unit can refer to the same object. For example:
static int x; // a namespace scope, so `x` has internal linkage
int f() {
extern int x; // declaration in one scope
}
int g() {
extern int x; // declaration in another scope
}
Assuming we put these all in one file (i.e., they end up as a single translation unit), the declarations in both f() and g() refer to the same thing--the x that's defined as static at namespace scope.
For example, consider code like this:
#include <iostream>
static int x; // a namespace scope, so `x` has internal linkage
int f()
{
extern int x;
++x;
}
int g()
{
extern int x;
std::cout << x << '\n';
}
int main() {
g();
f();
g();
}
This will print:
0
1
...because the x being incremented in f() is the same x that's being printed in g().
The linker's involvement here can be (and usually is) pretty much the same as in the no linkage case--the variable x needs some space, and the linker specifies that space when it creates the executable. It does not, however, need to get involved in determining that when f() and g() both declare x, they're referring to the same x--the compiler can determine that.
We can see this in the generated code. For example, if we compile the code above with gcc, the relevant bits for f() and g() are these.
f:
movl _ZL1x(%rip), %eax
addl $1, %eax
movl %eax, _ZL1x(%rip)
That's the increment of x (it uses the name _ZL1x for it).
g:
movl _ZL1x(%rip), %eax
[...]
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_c@PLT
So that's basically loading up x, then sending it to std::cout (I've left out code for other parameters we don't care about here).
The important part is that the code refers to _ZL1x--the same name as f used, so both of them refer to the same object.
The linker isn't really involved, because all it sees is that this file has requested space for one statically allocated variable. It makes space for that, but doesn't have to do anything to make f and g refer to the same thing--that's already handled by the compiler.
My question is probably related to why these are called "linkage" (How is that related to the linker).
According to the C standard,
An identifier declared in different scopes or in the same scope more
than once can be made to refer to the same object or function by a
process called linkage.
The term "linkage" seems reasonably well fitting -- different declarations of the same identifier are linked together so that they refer to the same object or function. That being the chosen terminology, it's pretty natural that a program that actually makes linkage happen is conventionally called a "linker".
But how does the linker handle variables (1) with no linkage and (2) with internal linkage? What are the differences in these two cases?
The linker does not have to do anything with identifiers that have no linkage. Every such declaration of an object identifier declares a distinct object (and function declarations always have internal or external linkage).
The linker does not necessarily do anything with identifiers having internal linkage, either, as the compiler can generally do everything that needs to be done with these. Nevertheless, identifiers with internal linkage can be declared multiple times in the same translation unit, with those identifiers all referring to the same object or function. The most common case is a static function with a forward declaration:
static void internal(void);
// ...
static void internal(void) {
// do something
}
File-scope variables can also have internal linkage and multiple declarations that are all linked to refer to the same object, but the multiple declaration part is not as useful for variables.
I wrote a simple C++ program which defines a class like the one below:
#include <iostream>
using namespace std;
class Computer
{
public:
Computer();
~Computer();
};
Computer::Computer()
{
}
Computer::~Computer()
{
}
int main()
{
Computer compute;
return 0;
}
When I use g++ (the test is on x86 32-bit Linux with g++ 4.6.3) to produce the ctors and dtors, I get these definitions at the end of the .ctors section.
.globl _ZN8ComputerC1Ev
.set _ZN8ComputerC1Ev,_ZN8ComputerC2Ev
.globl _ZN8ComputerD1Ev
.set _ZN8ComputerD1Ev,_ZN8ComputerD2Ev
After digging into the assembly code produced, I figured out that _ZN8ComputerC1Ev should be the function name that is used when class Computer is constructed, while _ZN8ComputerC2Ev is the name of class Computer's constructor. The same thing happens with the declaration and invocation of Computer's destructor.
It seems that a table is built, linking the constructor and its implementation.
So my questions are:
What actually is this constructor/destructor information for?
Where can I find them in ELF format?
I dumped the related .ctors and .init_array sections, but I just cannot find the metadata that defines the relation between _ZN8ComputerC1Ev and _ZN8ComputerC2Ev...
There is no table here. .globl and .set are so-called assembler directives or pseudo ops. They signal something to the assembler, but do not necessarily result in production of actual code or data. From the docs:
.global symbol, .globl symbol
.global makes the symbol visible to ld. If you define symbol in your
partial program, its value is made available to other partial programs
that are linked with it. Otherwise, symbol takes its attributes from a
symbol of the same name from another file linked into the same
program.
.set symbol, expression
Set the value of symbol to expression. This changes symbol's value and
type to conform to expression.
So the fragment you quote just ensures that the constructor is available for linking in case it's referenced by other compilation units. The only effect of it you normally see in the final ELF is the presence of those symbols in the symbol table (if it has not been stripped).
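One way to see that for yourself, assuming the source above is saved as computer.cpp, is to look at the object file's symbol table:
$ g++ -c computer.cpp
$ nm computer.o | grep Computer
This should list both _ZN8ComputerC1Ev and _ZN8ComputerC2Ev (and likewise _ZN8ComputerD1Ev and _ZN8ComputerD2Ev) as defined text symbols with the same value, because .set has made one an alias of the other rather than creating a second function body or any table.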
Now, you may be curious about why you have two different names for the constructor (e.g. _ZN8ComputerC1Ev and _ZN8ComputerC2Ev). The answer is somewhat complicated so I will refer you to another SO question which addresses it in some detail:
Dual emission of constructor symbols