What do linkers actually do with multiply-defined `inline` functions? - c++

In both C and C++, inline functions with external linkage can of course have multiple definitions available at link-time, the assumption being that these definitions are all (hopefully) identical. (I am of course referring to functions declared with the inline function specifier, not to functions that the compiler or link-time-optimizer actually inlines.)
So what do common linkers typically do when they encounter multiple definitions of a function? In particular:
Are all definitions included in the final executable or shared-library?
Do all invocations of the function link against the same definition?
Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?
P.S. Yes, I know C and C++ are separate languages, but they both support inline, and their compiler output can typically be linked by the same linker (e.g. GNU ld), so I believe there cannot be any difference between them in this aspect.

If the function is, in fact, inlined, then there's nothing to link. It's only when, for whatever reason, the compiler decides not to expand the function inline that it has to generate an out-of-line version of the function. If the compiler generates an out-of-line version of the function for more than one translation unit you end up with more than one object file having definitions for the same "inline" function.
The out-of-line definition gets compiled into the object file, and it's marked so that the linker won't complain if there is more than one definition of that name. If there is more than one, the linker simply picks one. Usually the first one it saw, but that's not required, and if the definitions are all the same, it doesn't matter. And that's why it's undefined behavior to have two or more different definitions of the same inline function: there's no rule for which one to pick. Anything can happen.
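One common way to keep every definition identical is to write it once in a header and include that header everywhere. A minimal sketch, assuming a hypothetical header `clamp.h` and a hypothetical function `clamp_to_byte` (neither name comes from the answer above):

```cpp
// clamp.h (hypothetical header name, for illustration only)
#ifndef CLAMP_H
#define CLAMP_H

// Every translation unit that includes this header compiles the same
// token sequence, so whichever out-of-line copy the linker keeps is
// equivalent to all the others.
inline int clamp_to_byte(int value) {
    if (value < 0) return 0;
    if (value > 255) return 255;
    return value;
}

#endif // CLAMP_H
```

Any source file can then include the header and call `clamp_to_byte(300)`; if several object files end up with an out-of-line copy, it does not matter which one the linker selects.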

The linker just has to figure out how to deduplicate all the definitions. That is of course provided that any function definitions have been emitted at all; inline functions may well be inlined. But should you take the address of an inline function with external linkage, you always get the same address (cf. [dcl.fct.spec]/4).
Inline functions aren't the only constructs that require linker support; templates are another, as are inline variables (since C++17).
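The single-address guarantee has a counterpart for local statics: a static local variable in an inline function with external linkage refers to one object program-wide. A minimal sketch, with the hypothetical names `next_id` and `next_id_address`; in a real program the definitions would sit in a header shared by several translation units:

```cpp
// Hypothetical header-style definitions; the names are illustrative.
inline int next_id() {
    // One counter for the whole program, even if several translation
    // units each compile their own out-of-line copy of next_id().
    static int counter = 0;
    return ++counter;
}

using id_fn = int (*)();

// Any translation unit that takes the address observes the same value.
inline id_fn next_id_address() { return &next_id; }
```

Calling `next_id()` from different source files would therefore keep counting from a single shared counter, which is only possible because the duplicate definitions are merged.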

inline or no inline, C does not permit multiple external definitions of the same name among the translation units contributing to the same program or library. Furthermore, it does not permit multiple definitions of the same name in the same translation unit, whether internal, external, or inline. Therefore, there can be at most two available definitions of a given function in scope in any given translation unit: one internal and/or inline, and one external.
C 2011, 6.7.4/7 has this to say:
Any function with internal linkage can be an inline function. For a function with external linkage, the following restrictions apply: If a function is declared with an inline function specifier, then it shall also be defined in the same translation unit. If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.
(Emphasis added.)
In specific answer to your questions, then, as they pertain to C:
Are all definitions included in the final executable or shared-library?
Inline definitions are not external definitions. They may or may not be included as actual functions, as inlined code, both, or neither, depending on the foibles of the compiler and linker and on details of their usage. They are not in any case callable by name by functions from different translation units, so whether they should be considered "included" is a bit of an abstract question.
Do all invocations of the function link against the same definition?
C does not specify, but it allows for the answer to be "no", even for different calls within the same translation unit. Moreover, inline definitions are not external definitions, so no inline definition in one translation unit is ever called (directly) by a function defined in a different translation unit.
Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?
My answers are based on the current C standard to the extent that it addresses the questions, but as you will have seen, those answers are not entirely prescriptive. Moreover, the standard does not directly address any question of object code or linking, so you may have noticed that my answers are not, for the most part, couched in those terms.
In any case, it is not safe to assume that any given C system is consistent even with itself in these regards for different functions or in different contexts. Under some circumstances it may inline every call to an internal or inline function, so that that function does not appear as a separate function at all. At other times it may indeed emit a function with internal linkage, but that does not prevent it from inlining some calls to that function anyway. In any case, internal functions are not eligible to be linked to functions from other translation units, so the linker is not necessarily involved with linking them at all.

I think the correct answer to your question is "it depends".
Consider the following pieces of code:
File x.c (or x.cc):
#include <stdio.h>
void otherfunction(void);
inline void inlinefunction(void) {
    printf("inline 1\n");
}
int main(void) {
    inlinefunction();
    otherfunction();
    return 0;
}
File y.c (or y.cc):
#include <stdio.h>
inline void inlinefunction(void) {
    printf("inline 2\n");
}
void otherfunction(void) {
    printf("otherfunction\n");
    inlinefunction();
}
Because the inline keyword is only a "suggestion" to the compiler, different compilers with different flags behave differently. For example, this GCC in C mode (apparently using its older GNU89 inline semantics, judging by the output) "exports" inline functions and does not allow multiple definitions:
$ gcc x.c y.c && ./a.out
/tmp/ccy5GYHp.o: In function `inlinefunction':
y.c:(.text+0x0): multiple definition of `inlinefunction'
/tmp/ccQkn7m4.o:x.c:(.text+0x0): first defined here
collect2: ld returned 1 exit status
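while C++ allows it (strictly speaking, the two bodies differ, so the program violates the one-definition rule and its behavior is undefined; the toolchain simply does not diagnose it):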
$ g++ x.cc y.cc && ./a.out
inline 1
otherfunction
inline 1
More interesting: let's switch the order of the files (and thus the order of linking):
$ g++ y.cc x.cc && ./a.out
inline 2
otherfunction
inline 2
Well... it looks like the first one wins! But let's add some optimization flags:
$ g++ y.cc x.cc -O1 && ./a.out
inline 1
otherfunction
inline 2
And that's the behavior we'd expect: the function got inlined, so each file uses its own definition. A different order of files changes nothing:
$ g++ x.cc y.cc -O1 && ./a.out
inline 1
otherfunction
inline 2
Next we can extend our x.c (x.cc) source with a prototype of void anotherfunction(void) and call it in our main function. Let's place the definition of anotherfunction in a z.c (z.cc) file:
#include <stdio.h>
void inlinefunction(void);
void anotherfunction(void) {
    printf("anotherfunction\n");
    inlinefunction();
}
We do not define the body of inlinefunction this time. Compiling and running as C++ gives the following results:
$ g++ x.cc y.cc z.cc && ./a.out
inline 1
otherfunction
inline 1
anotherfunction
inline 1
Different order:
$ g++ y.cc x.cc z.cc && ./a.out
inline 2
otherfunction
inline 2
anotherfunction
inline 2
Optimization:
$ g++ x.cc y.cc z.cc -O1 && ./a.out
/tmp/ccbDnQqX.o: In function `anotherfunction()':
z.cc:(.text+0xf): undefined reference to `inlinefunction()'
collect2: ld returned 1 exit status
So the conclusion is: it is best to declare inline together with static, which limits the function to its own translation unit, because "exporting" a function that we'd like to be used inline makes no sense.
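A minimal sketch of that recommendation, using the hypothetical helpers `square` and `sum_of_squares`. With `static`, each translation unit gets its own private copy, so no symbol has to be merged at link time; the trade-off is that any local static inside the helper would also be duplicated per translation unit.

```cpp
// Hypothetical helper with internal linkage: nothing is exported, so the
// linker never sees competing definitions of square().
static inline int square(int x) {
    return x * x;
}

// Ordinary external function that uses the helper.
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}
```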

When inline functions don't end up being inlined, the behavior differs between C++ and C.
In C++ they behave like regular functions, but with an additional symbol flag that allows duplicate definitions, and the linker can select any one of them.
In C (with C99 inline semantics, as in the test below), no out-of-line symbol is emitted for the inline definition; a call that is not inlined is treated as a reference to an external function that must be defined elsewhere.
On ELF targets, the linker behavior needed for C++ is implemented with weak symbols.
Note that weak symbols are often used in combination with regular (strong) symbols, where a strong symbol overrides a weak one (this is the main use case mentioned in the Wikipedia article on weak symbols). They can also be used to implement optional references (the linker inserts a null value for a weak symbol reference if no definition is found). But for C++ inline functions, they provide exactly what we need: given multiple weak symbols defined with the same name, the linker will select one of them; in my tests, always the one from the file appearing first in the list of files passed to the linker.
Here are some examples showing the behavior in C++ and then in C:
$ cat c1.cpp
void __attribute__((weak)) func_weak() {}
void func_regular() {}
void func_external();
void inline func_inline() {}
void test() {
    func_weak();
    func_regular();
    func_external();
    func_inline();
}
$ g++ -c c1.cpp
$ readelf -s c1.o | c++filt | grep func
11: 0000000000000000 11 FUNC WEAK DEFAULT 2 func_weak()
12: 000000000000000b 11 FUNC GLOBAL DEFAULT 2 func_regular()
13: 0000000000000000 11 FUNC WEAK DEFAULT 6 func_inline()
16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_external()
We're compiling without an optimization flag, so the inline function does not get inlined. We see that the inline function func_inline is emitted as a weak symbol, the same as func_weak, which is defined explicitly as weak using a GCC attribute.
Compiling the same program as C, we see that func_inline is an undefined external reference, the same as func_external:
$ cp c1.cpp c1.c
$ gcc -c c1.c
$ readelf -s c1.o | grep func
9: 0000000000000000 11 FUNC WEAK DEFAULT 1 func_weak
10: 000000000000000b 11 FUNC GLOBAL DEFAULT 1 func_regular
13: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_external
14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_inline
So in C, in order to resolve this external reference, one has to designate a single file that contains the actual function definition.
When we use an optimization flag, the inline function actually gets inlined, and no symbol is emitted at all:
$ g++ -O1 -c c1.cpp
$ readelf -s c1.o | c++filt | grep func_inline
$ gcc -O1 -c c1.c
$ readelf -s c1.o | grep func_inline
$

Related

Why is the symbol of member function weak?

I compiled this C++ code with g++ (7.3.0).
struct C {
__attribute__((noinline)) int foo() { return 0; };
};
int bar(C &c) { return c.foo(); }
And using nm I found that foo is weak. Is that due to the C++ specification or a decision of GCC?
The experiment flow:
$ g++ test.cpp -c -Og
$ nm test.o
0000000000000000 T _Z3barR1C
0000000000000000 W _ZN1C3fooEv
Is that C++ specification or decision of GCC?
A bit of both. Not that the C++ specification deals with strong or weak symbols (it doesn't). But it does affect GCC's behavior, since usually extensions try to build upon standard mandated behavior.
When you define a member function in the class body, it's automatically an inline function. In the C++ language, an inline function is a function that can have multiple definitions, so long as they appear in different translation units. It's not a problem, because in C++, an inline function must have the exact same definition in every translation unit (down to the token sequence). So even though they are "different" definitions, they are identical, and so can be reduced to a single one at some point.
The function doesn't have to actually be inlined at the call site. As far as C++ is concerned, inline is about the one-definition rule and linkage. So, according to the spec, there can be multiple definitions of C::foo.
And then you specify __attribute__((noinline)), so the function cannot be physically inlined, and so every TU must contain a symbol for the function.
This leaves only one choice. The symbol must be weak, otherwise the different symbols in different translation units will interfere with each other. By making the symbol weak, you can have your mandatory definition and avoid physically inlining the function, while the C++ specification is upheld. Those different symbols stand for the exact same function, and so any one of them can "win" when linking.
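The implicit-inline rule can be sketched as follows, assuming a hypothetical class `Counter` that would live in a header. A member defined in the class body is inline automatically, while one defined outside the class body needs the keyword spelled out to get the same multiple-definition allowance:

```cpp
// Hypothetical class, for illustration only.
struct Counter {
    int value = 0;

    // Defined in the class body: implicitly inline.
    int get() const { return value; }

    // Only declared here; defined below.
    void bump();
};

// Defined outside the class body: needs an explicit `inline` if this
// definition lives in a header included by several translation units.
inline void Counter::bump() { ++value; }
```

With GCC on an ELF target, both members would be expected to show up as weak symbols when they are emitted out of line, just like `C::foo` in the question.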

Are template variables allowed within multiple translation units and effectively merged?

See the following:
https://en.cppreference.com/w/cpp/language/definition#One_Definition_Rule
http://eel.is/c++draft/basic.def.odr#12
It states that multiple definitions of class templates, static data members of class templates, partial template specializations, etc are allowed and will act as one single definition. Great... but it does not mention variable templates anywhere?
If I have the following in multiple translation units:
template<typename T>
T my_data{};
inline void test() {
my_data<int> = 1;
}
Will each translation unit be given its own definition of my_data, resulting in multiple symbols, or will they all be effectively merged into a single definition within the program, so that calling test() in one translation unit modifies the variable for another translation unit?
Where in the standard does it mention this behavior?
According to the C++14 standard, [basic.def.odr]/4:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program outside of a discarded statement; no diagnostic required.
So if my_data<T> is odr-used with the same template argument in multiple translation units, you have an ODR violation (without any diagnostic). Inline variables appeared in C++17 to solve that problem, and this is why the type_traits *_v family of variable templates is declared inline.
In practice, with GCC and Clang (at least; I cannot check other compilers), you will not get any ODR violation because variable templates have "vague linkage" (as if they were declared inline).
You can check it with nm. If you run the command line g++ -c test.cpp -std=c++14 && nm test.o | c++filt | grep my_data, you should see that my_data<int> is a symbol of category u, which according to the nm documentation means:
The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the dynamic linker will make sure that in the entire process there is just one symbol with this name and type in use.
In Core issue #1849 one can read this obscure sentence:
The description in 6.2 [basic.def.odr] paragraph 6 of when entities can be multiply-declared in a program does not, but should, discuss variable templates.
I would bet that if all compilers give variable templates vague linkage, a future revision of the standard may reflect that. But for now we should use the inline specifier, as is done in the standard library.
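A minimal sketch of that advice applied to the question's snippet, assuming a C++17 compiler; the only change from the original code is the added `inline` on the variable template:

```cpp
// Requires C++17 (inline variables).
template <typename T>
inline T my_data{};

// As in the question: writes to the int instantiation.
inline void test() {
    my_data<int> = 1;
}
```

With the specifier in place, every translation unit that includes this code refers to the same `my_data<int>` object, so a call to `test()` in one of them is visible in all the others.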

Manually create gnu_unique_object symbols

Consider the case of a class member function defined in a header, with a local variable of static storage duration. When you include the header in multiple compilation units, each of them emits its own copy of the static variable. However, the toolchain will fix this for you and just pick one of the emitted variables (note that this is different from inlining). This is enabled by a GNU-specific extension to the possible types of a symbol, namely gnu_unique_object (these show up as "u" in nm, whose man page calls them "unique global symbols").
The question is, how do you force the compiler to produce this kind of symbols for arbitrary variables? I'm tired of wrapping things in a class to get this behavior.
If you declare the global variable in a header file, then you'll get a different instantiation of that variable in each compilation unit. The extern keyword is what you're after. This keyword makes what looks like an instantiation into a forward declaration. It can be thought of as a promise that a variable of that name is instantiated in a different compilation unit, and will be findable upon linking.
MyTrueGlobals.h
extern int global_variable_1;
MyTrueGlobals.cpp
int global_variable_1 = 0;
Since C++17, one can use inline variables. They produce gnu_unique_object symbols:
$ cat c1.cpp
inline int var1 = 123;
void f1() {
int x = var1;
}
$ g++ -std=c++1z -c c1.cpp
$ nm -C c1.o
0000000000000000 T f1()
0000000000000000 u var1

Can GCC optimize things better when I compile everything in one step?

gcc optimizes code when I pass it the -O2 flag, but I'm wondering how well it can actually do that if I compile all source files to object files and then link them afterwards.
Here's an example:
// in a.h
int foo(int n);
// in foo.cpp
int foo(int n) {
    return n;
}
// in main.cpp
#include "a.h"
int main(void) {
    return foo(5);
}
// code used to compile it all
gcc -c -O2 foo.cpp -o foo.o
gcc -c -O2 main.cpp -o main.o
gcc -O2 foo.o main.o -o executable
Normally, gcc should inline foo because it's a small function and -O2 enables -finline-small-functions, right? But here, gcc only sees the code of foo and main independently before it creates the object files, so there won't be any optimizations like that, right? So, does compiling like this really make code slower?
However, I could also compile it like this:
gcc -O2 foo.cpp main.cpp -o executable
Would that be faster? If not, would it be faster this way?
// in foo.cpp
int foo(int n) {
    return n;
}
// in main.cpp
#include "foo.cpp"
int main(void) {
    return foo(5);
}
Edit: I looked at objdump, and its disassembled code showed that only the #include "foo.cpp" thing worked.
It seems that you have rediscovered on your own the issue about the separate compilation model that C and C++ use. While it certainly eases memory requirements (which was important at the time of its creation), it does so by exposing only minimal information to the compiler, meaning that some optimizations (like this one) cannot be performed.
Newer languages, with their module systems, can expose as much information as necessary, and we can hope to reap those benefits if modules get into the next version of C++...
In the meantime, the simplest thing to go for is called Link-Time Optimization. The idea is that you still perform as much optimization as possible on each TU (Translation Unit) to obtain an object file, but you also enrich the traditional object file (which contains machine code) with IR (Intermediate Representation, used by compilers to optimize) for some or all functions.
When the linker is invoked to merge those object files, instead of just merging the files together, it merges the IR representations, re-executes a number of optimization passes (constant propagation, inlining, ...) and then generates machine code on its own. It means that instead of being just a linker, it is in fact a backend optimizer.
Of course, like all optimization passes, this has a cost, so it makes for longer builds. It also means that both the compiler and the linker must be passed a special option to trigger this behavior; in the case of GCC, it would be -flto (passed at both the compile and the link step).
You may be looking for Link-Time Optimization (LTO), aka Whole Program Optimization.
Since you're using GCC, you can use the C99 inline function specifier mechanism. This is from ISO/IEC 9899:1999.
§ 6.7.4 Function specifiers
Syntax
¶1 function-specifier:
inline
Constraints
¶2 Function specifiers shall be used only in the declaration of an identifier for a function.
¶3 An inline definition of a function with external linkage shall not contain a definition of a
modifiable object with static storage duration, and shall not contain a reference to an
identifier with internal linkage.
¶4 In a hosted environment, the inline function specifier shall not appear in a declaration
of main.
Semantics
¶5 A function declared with an inline function specifier is an inline function. The
function specifier may appear more than once; the behavior is the same as if it appeared
only once. Making a function an inline function suggests that calls to the function be as
fast as possible.118) The extent to which such suggestions are effective is
implementation-defined.119)
¶6 Any function with internal linkage can be an inline function. For a function with external
linkage, the following restrictions apply: If a function is declared with an inline
function specifier, then it shall also be defined in the same translation unit. If all of the
file scope declarations for a function in a translation unit include the inline function
specifier without extern, then the definition in that translation unit is an inline
definition. An inline definition does not provide an external definition for the function,
and does not forbid an external definition in another translation unit. An inline definition
provides an alternative to an external definition, which a translator may use to implement
any call to the function in the same translation unit. It is unspecified whether a call to the
function uses the inline definition or the external definition.120)
¶7 EXAMPLE The declaration of an inline function with external linkage can result in either an external
definition, or a definition available for use only within the translation unit. A file scope declaration with
extern creates an external definition. The following example shows an entire translation unit.
inline double fahr(double t)
{
    return (9.0 * t) / 5.0 + 32.0;
}
inline double cels(double t)
{
    return (5.0 * (t - 32.0)) / 9.0;
}
extern double fahr(double); // creates an external definition
double convert(int is_fahr, double temp)
{
    /* A translator may perform inline substitutions */
    return is_fahr ? cels(temp) : fahr(temp);
}
¶8 Note that the definition of fahr is an external definition because fahr is also declared with extern, but
the definition of cels is an inline definition. Because cels has external linkage and is referenced, an
external definition has to appear in another translation unit (see 6.9); the inline definition and the external
definition are distinct and either may be used for the call.
118) By using, for example, an alternative to the usual function call mechanism, such as "inline
substitution". Inline substitution is not textual substitution, nor does it create a new function.
Therefore, for example, the expansion of a macro used within the body of the function uses the
definition it had at the point the function body appears, and not where the function is called; and
identifiers refer to the declarations in scope where the body occurs. Likewise, the function has a
single address, regardless of the number of inline definitions that occur in addition to the external
definition.
119) For example, an implementation might never perform inline substitution, or might only perform inline
substitutions to calls in the scope of an inline declaration.
120) Since an inline definition is distinct from the corresponding external definition and from any other
corresponding inline definitions in other translation units, all corresponding objects with static storage
duration are also distinct in each of the definitions.
Note that GCC also had inline functions in C before they were standardized. Read the GCC manual for details if you need that notation.
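The question's files are C++ (foo.cpp, main.cpp), where an inline function needs no separate external definition. A minimal sketch, assuming the body of `foo` is moved from foo.cpp into the question's header `a.h`; the include guard is an addition that is not in the original:

```cpp
// a.h (from the question), now carrying the definition instead of only
// the declaration.
#ifndef A_H
#define A_H

// `inline` permits this definition to appear in every translation unit
// that includes the header, so each one can see the body.
inline int foo(int n) {
    return n;
}

#endif // A_H
```

main.cpp would then only include "a.h" and call `foo(5)`; because the body is visible there, the compiler is free to inline the call at -O2 without any link-time optimization.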