Manually create gnu_unique_object symbols

Consider a class member function defined in a header that contains a local variable with static storage duration. When the header is included in multiple compilation units, each object file ends up with its own copy of the static variable. The toolchain resolves this for you by picking just one of the emitted variables (note that this is different from inlining). This is enabled by a GNU-specific extension to the possible types of a symbol, namely gnu_unique_object (such symbols show up as "u" in nm, whose man page calls them "unique global symbols").
The question is, how do you force the compiler to produce this kind of symbol for arbitrary variables? I'm tired of wrapping things in a class to get this behavior.
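The pattern described in the question can be sketched roughly as follows. The names Registry, counter and next_id are illustrative assumptions, not taken from the original post, and a single file can only show the shape of the code, not the cross-unit merging itself.

```cpp
// Hypothetical header contents (all names are illustrative).
struct Registry {
    // Defined inside the class, so implicitly inline: every translation unit
    // that includes the header may emit its own copy of the function and of
    // `count`, and the linker is expected to keep a single one.
    static int& counter() {
        static int count = 0;
        return count;
    }
};

// Increments the shared counter and returns the new value.
inline int next_id() {
    return ++Registry::counter();
}
```

Running nm on an object file built from code like this with GCC on an ELF target should show whether `count` was emitted as a "u" symbol.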

If you define a global variable in a header file, you'll get a separate definition of that variable in each compilation unit that includes it. The extern keyword is what you're after. This keyword turns what looks like a definition into a declaration only. It can be thought of as a promise that a variable of that name is defined in a different compilation unit and will be found at link time.
MyTrueGlobals.h
extern int global_variable_1;
MyTrueGlobals.cpp
int global_variable_1 = 0;

Since C++17, one can use inline variables. As the GCC output below shows, they produce gnu_unique_object symbols:
$ cat c1.cpp
inline int var1 = 123;
void f1() {
int x = var1;
}
$ g++ -std=c++1z -c c1.cpp
$ nm -C c1.o
0000000000000000 T f1()
0000000000000000 u var1
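As a minimal sketch of how inline variables replace the class-wrapping workaround, the following shows both a namespace-scope inline variable and an inline static data member. It assumes a C++17 compiler; log_level, Settings and budget are hypothetical names used only for illustration.

```cpp
// C++17 sketch; all names here are illustrative.
inline int log_level = 2;            // namespace-scope inline variable

struct Settings {
    // Inline static data member: no separate out-of-class definition needed.
    static inline int retries = 3;
};

// Reads both variables; every translation unit sees the same two objects.
inline int budget() {
    return log_level * Settings::retries;
}
```

Both declarations can live in a header that is included from many source files.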

Related

How do inline functions resolve multiple function definitions? [duplicate]

In both C and C++, inline functions with external linkage can of course have multiple definitions available at link-time, the assumption being that these definitions are all (hopefully) identical. (I am of course referring to functions declared with the inline linkage specification, not to functions that the compiler or link-time-optimizer actually inlines.)
So what do common linkers typically do when they encounter multiple definitions of a function? In particular:
Are all definitions included in the final executable or shared-library?
Do all invocations of the function link against the same definition?
Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?
P.S. Yes, I know C and C++ are separate languages, but they both support inline, and their compiler-output can typically be linked by the same linker (e.g. GCC's ld), so I believe there cannot be any difference between them in this aspect.

If the function is, in fact, inlined, then there's nothing to link. It's only when, for whatever reason, the compiler decides not to expand the function inline that it has to generate an out-of-line version of the function. If the compiler generates an out-of-line version of the function for more than one translation unit you end up with more than one object file having definitions for the same "inline" function.
The out-of-line definition gets compiled into the object file, and it's marked so that the linker won't complain if there is more than one definition of that name. If there is more than one, the linker simply picks one. Usually the first one it saw, but that's not required, and if the definitions are all the same, it doesn't matter. And that's why it's undefined behavior to have two or more different definitions of the same inline function: there's no rule for which one to pick. Anything can happen.

The linker just has to figure out how to deduplicate all the definitions. That is of course provided that any function definitions have been emitted at all; inline functions may well be inlined. But should you take the address of an inline function with external linkage, you always get the same address (cf. [dcl.fct.spec]/4).
Inline functions aren't the only construction which require linker support; templates are another, as are inline variables (in C++17).
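A small sketch of the entities mentioned above, assuming nothing beyond standard C++: a function template with a static local and an inline function. The names slot, twice and addresses_agree are made up for this example. A single file can only check that addresses agree within one translation unit; the cross-unit guarantee is what the linker's deduplication provides.

```cpp
// Function template: each instantiation may be emitted in several
// translation units and must be merged, just like an inline function.
template <typename T>
T& slot() {
    static T value{};   // one object per T in the whole program
    return value;
}

inline int twice(int x) {
    return 2 * x;
}

// Returns true when two separately taken addresses of the same entity agree.
inline bool addresses_agree() {
    int (*first)(int) = &twice;
    int (*second)(int) = &twice;
    return first == second && &slot<int>() == &slot<int>();
}
```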

inline or no inline, C does not permit multiple external definitions of the same name among the translation units contributing to the same program or library. Furthermore, it does not permit multiple definitions of the same name in the same translation unit, whether internal, external, or inline. Therefore, there can be at most two available definitions of a given function in scope in any given translation unit: one internal and/or inline, and one external.
C 2011, 6.7.4/7 has this to say:
Any function with internal linkage can be an inline function. For a function with external linkage, the following restrictions apply: If a function is declared with an inline function specifier, then it shall also be defined in the same translation unit. If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.
(Emphasis added.)
In specific answer to your questions, then, as they pertain to C:
Are all definitions included in the final executable or shared-library?
Inline definitions are not external definitions. They may or may not be included as actual functions, as inlined code, both, or neither, depending on the foibles of the compiler and linker and on details of their usage. They are not in any case callable by name by functions from different translation units, so whether they should be considered "included" is a bit of an abstract question.
Do all invocations of the function link against the same definition?
C does not specify, but it allows for the answer to be "no", even for different calls within the same translation unit. Moreover, inline functions are not external, so no inline function defined in one translation unit is ever called (directly) by a function defined in a different translation unit.
Are the answers to the above questions required by one or more of the C and C++ ISO standards, and if not, do most common platforms do the same thing?
My answers are based on the current C standard to the extent that it addresses the questions, but as you will have seen, those answers are not entirely prescriptive. Moreover, the standard does not directly address any question of object code or linking, so you may have noticed that my answers are not, for the most part, couched in those terms.
In any case, it is not safe to assume that any given C system is consistent even with itself in these regards for different functions or in different contexts. Under some circumstances it may inline every call to an internal or inline function, so that that function does not appear as a separate function at all. At other times it may indeed emit a function with internal linkage, but that does not prevent it from inlining some calls to that function anyway. In any case, internal functions are not eligible to be linked to functions from other translation units, so the linker is not necessarily involved with linking them at all.

I think the correct answer to your question is "it depends".
Consider following pieces of code:
File x.c (or x.cc):
#include <stdio.h>
void otherfunction(void);
inline void inlinefunction(void) {
printf("inline 1\n");
}
int main(void) {
inlinefunction();
otherfunction();
return 0;
}
File y.c (or y.cc)
#include <stdio.h>
inline void inlinefunction(void) {
printf("inline 2\n");
}
void otherfunction(void) {
printf("otherfunction\n");
inlinefunction();
}
Since the inline keyword is only a "suggestion" to the compiler to inline the function, different compilers with different flags behave differently. E.g. it looks like the C compiler always "exports" inline functions and does not allow multiple definitions:
$ gcc x.c y.c && ./a.out
/tmp/ccy5GYHp.o: In function `inlinefunction':
y.c:(.text+0x0): multiple definition of `inlinefunction'
/tmp/ccQkn7m4.o:x.c:(.text+0x0): first defined here
collect2: ld returned 1 exit status
while C++ allows it:
$ g++ x.cc y.cc && ./a.out
inline 1
otherfunction
inline 1
More interesting: let's switch the order of the files (and so the order of linking):
$ g++ y.cc x.cc && ./a.out
inline 2
otherfunction
inline 2
Well... it looks like the first one counts! But... let's add some optimization flags:
$ g++ y.cc x.cc -O1 && ./a.out
inline 1
otherfunction
inline 2
And that's the behavior we'd expect: the function got inlined. A different order of files changes nothing:
$ g++ x.cc y.cc -O1 && ./a.out
inline 1
otherfunction
inline 2
Next we can extend our x.c (x.cc) source with a prototype of void anotherfunction(void) and call it in our main function. Let's place the definition of anotherfunction in a z.c (z.cc) file:
#include <stdio.h>
void inlinefunction(void);
void anotherfunction(void) {
printf("anotherfunction\n");
inlinefunction();
}
We do not define the body of inlinefunction this time. Compilation and execution for C++ give the following results:
$ g++ x.cc y.cc z.cc && ./a.out
inline 1
otherfunction
inline 1
anotherfunction
inline 1
Different order:
$ g++ y.cc x.cc z.cc && ./a.out
inline 2
otherfunction
inline 2
anotherfunction
inline 2
Optimization:
$ g++ x.cc y.cc z.cc -O1 && ./a.out
/tmp/ccbDnQqX.o: In function `anotherfunction()':
z.cc:(.text+0xf): undefined reference to `inlinefunction()'
collect2: ld returned 1 exit status
So the conclusion is: it is best to declare inline together with static, which narrows the scope of the function's usage, because "exporting" a function that we'd like to be used inline makes no sense.
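The recommendation above might look like this in a header. The helper clamp_to_byte is a hypothetical example, not part of the original answer.

```cpp
// Internal linkage: each translation unit that includes this gets a private
// copy, so there is no cross-file symbol for the linker to merge or miss.
static inline int clamp_to_byte(int value) {
    if (value < 0) return 0;
    if (value > 255) return 255;
    return value;
}
```

The trade-off is that a copy which does not get inlined is duplicated in every object file that uses it.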

When inline functions don't end up being inlined, behavior differs between C++ and C.
In C++ they behave like regular functions, but with an additional symbol flag that allows for duplicate definitions, and the linker can select any one of them.
In C, the actual function body gets ignored, and they behave just like external functions.
On ELF targets, the linker behavior needed for C++ is implemented with weak symbols.
Note that weak symbols are often used in combination with regular (strong) symbols, where strong symbols override weak symbols (this is the main use case mentioned in the Wikipedia article on weak symbols). They can also be used for implementing optional references (the linker inserts a null value for a weak symbol reference if a definition is not found). But for C++ inline functions, they provide exactly what we need: given multiple weak symbols defined with the same name, the linker will select one of them, in my tests always the one from the file appearing first in the list of files passed to the linker.
Here are some examples showing the behavior in C++ and then in C:
$ cat c1.cpp
void __attribute__((weak)) func_weak() {}
void func_regular() {}
void func_external();
void inline func_inline() {}
void test() {
func_weak();
func_regular();
func_external();
func_inline();
}
$ g++ -c c1.cpp
$ readelf -s c1.o | c++filt | grep func
11: 0000000000000000 11 FUNC WEAK DEFAULT 2 func_weak()
12: 000000000000000b 11 FUNC GLOBAL DEFAULT 2 func_regular()
13: 0000000000000000 11 FUNC WEAK DEFAULT 6 func_inline()
16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_external()
We're compiling without an optimization flag, so the inline function does not get inlined. We see that the inline function func_inline gets emitted as a weak symbol, the same as func_weak, which is defined explicitly as weak using a GCC attribute.
Compiling the same program in C, we see that func_inline is a regular external function, same as func_external:
$ cp c1.cpp c1.c
$ gcc -c c1.c
$ readelf -s c1.o | grep func
9: 0000000000000000 11 FUNC WEAK DEFAULT 1 func_weak
10: 000000000000000b 11 FUNC GLOBAL DEFAULT 1 func_regular
13: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_external
14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND func_inline
So in C, in order to resolve this external reference, one has to designate a single file that contains the actual function definition.
When we use an optimization flag, the inline function actually gets inlined, and no symbol is emitted at all:
$ g++ -O1 -c c1.cpp
$ readelf -s c1.o | c++filt | grep func_inline
$ gcc -O1 -c c1.c
$ readelf -s c1.o | grep func_inline
$
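The "optional reference" use of weak symbols mentioned in this answer can be sketched as below. This assumes GCC or Clang on an ELF target, since `__attribute__((weak))` is a compiler extension; optional_hook and run_hook_if_present are hypothetical names, and no definition of the hook is provided anywhere in the sketch.

```cpp
// Weak declaration: if no object file defines optional_hook, the linker is
// expected to resolve its address to null instead of reporting an error.
extern "C" void optional_hook() __attribute__((weak));

// Calls the hook when some object file defines it; reports whether it ran.
bool run_hook_if_present() {
    if (&optional_hook != nullptr) {
        optional_hook();
        return true;
    }
    return false;
}
```

Linking in another object file that defines optional_hook would make the same call site pick it up without any source change here.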

Link an externally defined static function in C++ Application code

I have a set of predefined C source files that define a lot of static functions; they are just coded up in .c files and not declared in any .h header file.
Now I am trying to make use of those functions in my C++ application code:
Cmethods.c
static int amethod(int oppcheck)
{
}
A library is created using the C source code files:
$ nm --demangle userlib.a | grep amethod
00000000000001b6 t amethod
CppApp.h
extern "C" { int amethod(int oppcheck); }
CppApp.cpp
#include "CppApp.h"
void callme()
{
amethod(check);
}
However, during the build, even though userlib.a is linked, I get the error below:
: undefined reference to `amethod'
$ nm --demangle userappcode.a | grep amethod
00000000000001b6 t amethod
U amethod
A further finding is that for functions in the C source files that are declared in C header files, the linker error never occurs.
Note that I cannot touch the C source files: they are provided by a third-party community and we cannot break the license.
How can I resolve the issue?

I have a set of pre defined C source code files that declares defines a lot of static functions.
Now I am trying to make use of those functions in my C++ application code:
Remove the static from those functions you want to call from some other translation unit.
If you cannot do that, you can't use these functions from outside. And the compiler could even optimize them to the point of removing them from your object file.
(a dirty trick that I do not recommend could be to compile that C code with gcc -Dstatic= to have the preprocessor replace static by nothing)
Note I cannot touch the C source code files.
Then your task is impossible.
You could "augment" the translation units, perhaps by appending to them a public (non static) function calling the static one. For example, you might compile something like
// include the original C code, which contains only `static` functions
#include "Cmethods.c"
// code a public wrapper calling the static method
extern int public_amethod(int oppcheck);
int public_amethod(int oppcheck) { return amethod(oppcheck); }
Note I cannot touch the C source code files - they are provided by third party community and we cannot break the license -
It looks like you might not be legally allowed to compile that code, or that you cannot distribute its object file. There are no technical tricks to overcome a legal prohibition. If that goes in court, you'll probably lose!
Your issue is not technical, but social and legal. You may need a lawyer. You could also talk with the provider of the original code, and ask him if you are allowed to do what you want.
(without more motivation and context, your question looks weird)

static functions in C and static member functions in C++ are two different things. In C, a static function is limited to its translation unit and invisible outside it; in practice, this means the object file (.o/.obj).
In C++, static can also apply to member functions and data members of classes. A static data member is also called a "class variable", while a non-static data member is an "instance variable".

If you define amethod with static, the function has internal linkage, which means you can't link to this function from other source files.
internal linkage.
The name can be referred to from all scopes in the current translation unit. Any of the following names declared at namespace scope have internal linkage:
variables, functions, or function templates declared static;
non-volatile non-inline (since C++17) const-qualified variables (including constexpr) that aren't declared extern and aren't previously declared to have external linkage;
data members of anonymous unions.
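To make the linkage rules quoted above concrete, here is a short sketch with one declaration per case. It assumes C++17 for the inline variable. All identifiers are invented for illustration, and the unnamed-namespace case is an addition that is not in the quoted list.

```cpp
static int call_count = 0;            // internal: declared static
const int max_items = 10;             // internal: const and not declared extern
extern const int shared_max = 20;     // external: explicitly declared extern
inline const int inline_max = 30;     // external: inline variable (C++17)

namespace {
int hidden = 40;                      // internal: member of an unnamed namespace
}

// Uses every variable so that none of them is flagged as unused.
int total() {
    ++call_count;
    return max_items + shared_max + inline_max + hidden;
}
```

Only shared_max, inline_max and total could be referenced by name from another translation unit.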

You may not be able to touch the C source, but a hacky solution is to #include it into a different translation unit that contains functions forwarding to the statics. This is not something I would recommend as a general answer but ...
file1.c:
static void function1(int)
{ ...}
file2.c:
#include "file1.c"
void fwd_function1(int x)
{
function1(x);
}
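The forwarding idea from the answers above can be condensed into a single file as a rough sketch. Here the third-party function is replaced by a stand-in whose body is purely hypothetical (the question elides the real one), and public_amethod is the illustrative wrapper name used in one of the answers.

```cpp
// Stand-in for the third-party code: internal linkage, so the name is not
// visible to other translation units. The body is a made-up placeholder.
static int amethod(int oppcheck) {
    return oppcheck + 1;
}

// Public forwarder with C linkage; this is the symbol that application code
// in other translation units would declare and call.
extern "C" int public_amethod(int oppcheck) {
    return amethod(oppcheck);
}
```

In the real setup the stand-in would be replaced by an #include of the unmodified .c file, and the wrapper file would be compiled in place of it.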

Can GCC optimize things better when I compile everything in one step?

gcc optimizes code when I pass it the -O2 flag, but I'm wondering how well it can actually do that if I compile all source files to object files and then link them afterwards.
Here's an example:
// in a.h
int foo(int n);
// in foo.cpp
int foo(int n) {
return n;
}
// in main.cpp
#include "a.h"
int main(void) {
return foo(5);
}
// code used to compile it all
gcc -c -O2 foo.cpp -o foo.o
gcc -c -O2 main.cpp -o main.o
gcc -O2 foo.o main.o -o executable
Normally, gcc should inline foo because it's a small function and -O2 enables -finline-small-functions, right? But here, gcc only sees the code of foo and main independently before it creates the object files, so there won't be any optimizations like that, right? So, does compiling like this really make code slower?
However, I could also compile it like this:
gcc -O2 foo.cpp main.cpp -o executable
Would that be faster? If not, would it be faster this way?
// in foo.cpp
int foo(int n) {
return n;
}
// in main.cpp
#include "foo.cpp"
int main(void) {
return foo(5);
}
Edit: I looked at objdump, and its disassembled code showed that only the #include "foo.cpp" thing worked.

It seems that you have rediscovered on your own the issue about the separate compilation model that C and C++ use. While it certainly eases memory requirements (which was important at the time of its creation), it does so by exposing only minimal information to the compiler, meaning that some optimizations (like this one) cannot be performed.
Newer languages, with their module systems, can expose as much information as necessary, and we can hope to reap those benefits if modules get into the next version of C++...
In the meantime, the simplest thing to go for is called Link-Time Optimization. The idea is that you perform as much optimization as possible on each TU (Translation Unit) to obtain an object file, but you also enrich the traditional object file (which contains assembly) with IR (Intermediate Representation, used by compilers to optimize) for some or all functions.
When the linker is invoked to merge those object files together, instead of just merging the files, it merges the IR representations, re-executes a number of optimization passes (constant propagation, inlining, ...) and then creates assembly on its own. It means that instead of being just a linker, it is in fact a backend optimizer.
Of course, like all optimization passes, this has a cost, so it makes for longer compilation. Also, it means that both the compiler and the linker must be passed a special option to trigger this behavior; in the case of gcc, it is -flto.

You may be looking for Link-Time Optimization (LTO), aka Whole Program Optimization.
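Besides LTO, one commonly used alternative for small functions is to put the definition itself in the header as an inline function, so that every includer sees the body. The sketch below reuses foo from the question; the file layout in the comments and the helper run are assumptions made for illustration.

```cpp
// a.h (hypothetical layout): the definition, not just the declaration,
// lives in the header, so every includer can see the body and inline it.
inline int foo(int n) {
    return n;
}

// main.cpp would include a.h; shown here in one file for brevity.
int run() {
    return foo(5);
}
```

Whether the call is actually expanded inline remains the optimizer's decision, but the body is at least visible without the #include "foo.cpp" trick.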

Since you're using GCC, you can use the C99 inline function specifier mechanism. This is from ISO/IEC 9899:1999.
§ 6.7.4 Function specifiers
Syntax
¶1 function-specifier:
inline
Constraints
¶2 Function specifiers shall be used only in the declaration of an identifier for a function.
¶3 An inline definition of a function with external linkage shall not contain a definition of a
modifiable object with static storage duration, and shall not contain a reference to an
identifier with internal linkage.
¶4 In a hosted environment, the inline function specifier shall not appear in a declaration
of main.
Semantics
¶5 A function declared with an inline function specifier is an inline function. The
function specifier may appear more than once; the behavior is the same as if it appeared
only once. Making a function an inline function suggests that calls to the function be as
fast as possible.118) The extent to which such suggestions are effective is
implementation-defined.119)
¶6 Any function with internal linkage can be an inline function. For a function with external
linkage, the following restrictions apply: If a function is declared with an inline
function specifier, then it shall also be defined in the same translation unit. If all of the
file scope declarations for a function in a translation unit include the inline function
specifier without extern, then the definition in that translation unit is an inline
definition. An inline definition does not provide an external definition for the function,
and does not forbid an external definition in another translation unit. An inline definition
provides an alternative to an external definition, which a translator may use to implement
any call to the function in the same translation unit. It is unspecified whether a call to the
function uses the inline definition or the external definition.120)
¶7 EXAMPLE The declaration of an inline function with external linkage can result in either an external
definition, or a definition available for use only within the translation unit. A file scope declaration with
extern creates an external definition. The following example shows an entire translation unit.
inline double fahr(double t)
{
    return (9.0 * t) / 5.0 + 32.0;
}
inline double cels(double t)
{
    return (5.0 * (t - 32.0)) / 9.0;
}
extern double fahr(double); // creates an external definition
double convert(int is_fahr, double temp)
{
    /* A translator may perform inline substitutions */
    return is_fahr ? cels(temp) : fahr(temp);
}
¶8 Note that the definition of fahr is an external definition because fahr is also declared with extern, but
the definition of cels is an inline definition. Because cels has external linkage and is referenced, an
external definition has to appear in another translation unit (see 6.9); the inline definition and the external
definition are distinct and either may be used for the call.
118) By using, for example, an alternative to the usual function call mechanism, such as "inline
substitution". Inline substitution is not textual substitution, nor does it create a new function.
Therefore, for example, the expansion of a macro used within the body of the function uses the
definition it had at the point the function body appears, and not where the function is called; and
identifiers refer to the declarations in scope where the body occurs. Likewise, the function has a
single address, regardless of the number of inline definitions that occur in addition to the external
definition.
119) For example, an implementation might never perform inline substitution, or might only perform inline
substitutions to calls in the scope of an inline declaration.
120) Since an inline definition is distinct from the corresponding external definition and from any other
corresponding inline definitions in other translation units, all corresponding objects with static storage
duration are also distinct in each of the definitions.
Note that GCC also had inline functions in C before they were standardized. Read the GCC manual for details if you need that notation.

Why does C++ put same-named variables defined in separate modules at the same address in memory?

Let's take header file var.h
#include <iostream>
class var
{public:
var () {std::cout << "Creating var at " << this << std::endl; }
~var () {std::cout << "Deleting var at " << this << std::endl; }
};
and two source files, first lib.cpp
#include "var.h"
var A;
and second app.cpp
#include "var.h"
var A;
int main ()
{return 0;
}
then, if I attempt to compile them
g++ -c app.cpp
g++ -c lib.cpp
g++ -o app app.o lib.o
the linker returns a multiply-defined variable error. But if I compile it into a shared library plus a main app
g++ -fPIC -c lib.cpp
g++ --shared -o liblib.so lib.o
g++ -fPIC -c app.cpp
g++ -o app -llib -L . app.o
it links without error. However, the program doesn't work properly:
./app
Creating var at 0x6013c0
Creating var at 0x6013c0
Deleting var at 0x6013c0
Deleting var at 0x6013c0
so different variables were created at the same memory address! This can cause serious trouble, for example when the library and the application expect them to have different values (values of object fields in this case).
If class var allocates and frees memory, valgrind warns about accessing memory in a recently freed block.
Yes, I know I could write static var A; instead of var A; and both ways of compiling would work properly. My question is: why can't one use same-named variables (or even functions?) in different libraries? Library authors may know nothing about the names each other use, and nothing warns them to use static. Why doesn't the GNU linker warn about this conflict?
And, BTW, could dlopen cause the same trouble?
UPD. Thank you all for explaining namespaces and extern. I see why the same symbols are placed at the same memory address, but I still don't understand why, in the second case, no link error or even a warning about a doubly defined variable is shown and wrong code is produced instead.

My question is: why can't one use same-named variables (or even functions?) in different libraries?
You can. The thing you're missing is that the declarations
var A;
aren't just defining the symbol A for use within the library. They define a symbol that is exported so that any other compilation unit can reference it!
e.g. if, in app.cpp, you declared
extern var A;
this would declare that "A is a variable of type var that some other compilation unit is going to define and export". With this modification to your setup, app.cpp would explicitly request to use the object named A that lib.cpp exported.
The problem with your setup is that you have two different compilation units both trying to export the same symbol A, which leads to a conflict.
Why doesn't the GNU linker warn about this conflict?
Because the GNU toolchain can't know that you wanted A to be a variable private to your compilation unit unless you tell it so. That's what static means in this context.
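As a minimal sketch of the distinction drawn above, the following single file puts an extern declaration, its one definition, and a static variable side by side. The names shared_counter, private_counter, bump_both and check_counters are hypothetical and not part of the question's code; in a real project the definition of shared_counter would sit in exactly one .cpp file.

```cpp
#include <cassert>

// Declaration only: a promise that exactly one translation unit defines it.
extern int shared_counter;

// Internal linkage: every translation unit containing this line gets its
// own private object, which is not exported to other translation units.
static int private_counter = 0;

// Increments both counters and returns their sum.
int bump_both() {
    ++shared_counter;
    ++private_counter;
    return shared_counter + private_counter;
}

// The single definition; normally this lives in exactly one .cpp file
// (hypothetically lib.cpp), never in a header.
int shared_counter = 10;

// Small self-check of the behaviour described above.
void check_counters() {
    int sum = bump_both();
    assert(sum == 12);            // 11 (shared) + 1 (private)
    assert(shared_counter == 11);
    assert(private_counter == 1);
    (void)sum;                    // silence unused warning under NDEBUG
}
```

If the static line were copied into a second source file, each file would update its own private_counter, while both would keep sharing the one shared_counter.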

It's not clear if you're asking whether this is supposed to happen or what the rationale is.
First, it is required behavior. Per the "one definition rule", section 3.2 of the C++ standard, if multiple translation units contain identical definitions (and certain other requirements are met), then the program shall behave as if there were a single definition. In any other case where there are multiple definitions, the behavior is undefined.
If you're asking what the rationale for this rule is, it's that it's usually what you want. Your compiler may have an option to alert if more than one definition isn't marked extern.
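Note that the one definition rule allows repeated identical definitions only for certain entities, such as inline functions and (since C++17) inline variables; an ordinary namespace-scope variable like var A; is not among them. The sketch below shows what such header content could look like. The names next_id, default_limit and check_inline_entities are illustrative assumptions, and the file needs to be compiled as C++17 or later.

```cpp
#include <cassert>

// Hypothetical header content. An inline function may be defined in every
// translation unit that includes the header; the program behaves as if
// there were one definition, so the static local below is a single object.
inline int next_id() {
    static int last_id = 0;
    return ++last_id;
}

// C++17 inline variable: may also be defined in a header and still denote
// one object for the whole program.
inline int default_limit = 64;

// Small self-check: consecutive calls see the same static counter.
void check_inline_entities() {
    int first = next_id();
    int second = next_id();
    assert(second == first + 1);
    assert(default_limit == 64);
    (void)first;
    (void)second;
}
```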

Different libraries should have differently named global variables and global functions, otherwise very unpleasant things happen (e.g. when dlopen-ing it several times...).
Conventionally, well-behaved libraries use a common prefix (like gtk) in C, or a namespace in C++.
And libraries should minimize global state (in C++, it probably should be static data inside classes).
You could also use the visibility function attribute accepted by GCC.
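A rough sketch of both suggestions follows, assuming two hypothetical libraries named liba and libb. The macro DEMO_HIDDEN and the function demo_internal_helper are made up for illustration; the visibility attribute is a GCC/Clang extension, so it is guarded and expands to nothing elsewhere.

```cpp
#include <cassert>

// GCC/Clang extension: keep a symbol out of a shared library's exported
// symbol table. On other compilers the macro expands to nothing.
#if defined(__GNUC__)
#  define DEMO_HIDDEN __attribute__((visibility("hidden")))
#else
#  define DEMO_HIDDEN
#endif

// Each library wraps its globals in its own namespace, so the two objects
// named A have different mangled names and never collide at link time.
namespace liba {
int A = 1;
}

namespace libb {
int A = 2;
}

// A helper meant for internal use only; with hidden visibility it is not
// exported when this code is built into a shared object.
DEMO_HIDDEN int demo_internal_helper() {
    return liba::A + libb::A;
}
```

With this layout, liba::A and libb::A are distinct objects even when both libraries are loaded into the same process.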

Symbols with external linkage (which is the default in this case) are visible to other translation units. This is to allow interfaces between source files, libraries, etc.
The existence or non-existence of a definition does not change which object is accessed. The programmer is responsible for arranging declarations and definitions such that an object is always declared before use and always defined exactly once (the one-definition rule).
The best solution is to put private globals into unnamed namespaces, so that definitions that look the same can still be different.
lib.cpp
#include "var.h"
namespace { // unnamed namespace
var A; // object inaccessible to other translation units
}
app.cpp
#include "var.h"
namespace { // different unnamed namespace
var A; // different object
}
int main ()
{return 0;}

Somewhat simplified answer: "libraries" are an implementation detail. All object files are combined (linked) to a single unit (executable) prior to execution. After linking has been done, there is no trace of libraries, original source files, etc. any longer -- all that matters is the final executable.
Now, you seem to be surprised that the same global name in the program (= the final result of linking everything together) always refers to the same object. Wouldn't it be confusing if it were otherwise?
If file1.cpp and file2.cpp both defined a variable A with external linkage, how are the compiler and linker supposed to know whether you want one or two different objects? More importantly, how is the human reading the code supposed to know whether the original author wanted to create one or two objects?
