For example:
code1.c / .cpp
int a;
// ... and so on
code2.c / .cpp
int a;
int main(void) {
return 0;
}
Then compile:
$ gcc code1.c code2.c # this is fine
$
$ g++ code1.cpp code2.cpp # this fails
/tmp/ccLY66HQ.o:(.bss+0x0): multiple definition of `a'
/tmp/ccnIOmPC.o:(.bss+0x0): first defined here
collect2: ld returned 1 exit status
Is there any global variable linkage difference between C & C++?
It's not strictly legal. int a; is a tentative definition in C. You are allowed multiple tentative definitions and at most one non-tentative definition per translation unit of each object with external linkage in C, but only one definition across all translation units in a program.
It is a commonly implemented extension to allow tentative definitions across multiple translation units in C so long as not more than one translation unit contains a non-tentative definition, but it's not strictly standard.
In C++ int a; is just a definition - there's no concept of tentative - and it's still illegal to have multiple definitions of an object across the translation units of a program.
For the C case, you may wish to look at this question.
It's illegal in both, but C compilers generally implement an extension. See this answer.
There are three ways to resolve the problem:
If variable a is meant to be the same object in both files, define it in one file and declare it as extern in all the others. The extern keyword tells the compiler that the name is defined in another file.
You may use the static keyword to give the variable internal linkage, which limits it to the file in which it is declared.
Or, in C++, you may use an unnamed namespace.
The g++ compiler is stricter than the gcc compiler.
It also depends on the gcc version: newer versions (GCC 10 onwards, which default to -fno-common) give the same error.
Use extern to avoid it.
I ran into a situation where I declared two (separate) global variables with the same name in two separate files, without using static, volatile or extern on them.
One file was a .c file and the other a .cpp file.
The build environment was ESP-IDF (GCC), and the two declarations even had different data types.
mqtt_ssl.c:
esp_mqtt_client_handle_t client;
mb_master.cpp:
PL::ModbusClient client(port, PL::ModbusProtocol::rtu, 1);
At runtime I experienced a lot of problems with reboots of the ESP32, until I found out that the same name, shared by two actually separate variables, was causing the issue.
My guess is that the compiler used the same memory region for both of them.
I expected them to be handled as two separate objects, each with its own region in memory, which is obviously not the case.
After reading some questions here and there, my understanding now is that the behavior is undefined if a variable with the same name is declared without static, extern or volatile in two separate C/C++ files.
It took me quite a while to figure that out.
If declaring it like this is not allowed, why didn't the compiler or linker report an error?
Is there a GCC option that treats such a situation as an error, to prevent it in the future?
Edit 1:
This is a reproducible example with xtensa-esp32-elf-gcc.exe (crosstool-NG esp-2021r2-patch3) 8.4.0.
app_main.c
#include <stdint.h>
#include <stdio.h>
void test_sub(void);
uint16_t test1;
void app_main(void)
{
    test1 = 1;
    printf("app_main:test1=%d\n", test1);
    test_sub();
    printf("app_main:test1=%d\n", test1);
}
test_sub.c
#include <stdint.h>
#include <stdio.h>
int16_t test1;
void test_sub(void)
{
    test1 = 2;
    printf("test_sub:test1=%d\n", test1);
}
Result:
app_main:test1=1
test_sub:test1=2
app_main:test1=2
test1 in app_main() got overwritten by test_sub(), because it had the same name in both files.
esp_mqtt_client_handle_t client; is a tentative definition. In spite of its name, it is not a definition, just a declaration, but it will cause a definition to be created at the end of the translation unit if there is no regular definition in the translation unit.
The C standard allows C implementations to choose how they resolve multiple definitions (because it says it does not define the behavior, so compilers and linkers may define it). Prior to GCC version 10, the default in GCC was to mark definitions from tentative definitions as “common,” meaning they could be coalesced with other definitions. This results in the linker not complaining about such multiple definitions.
Giving the -fno-common switch to GCC instructs it not to do this; definitions will not be marked as “common,” and the linker will complain if it sees multiple definitions.
Is your problem really that the variable is shadowed? Often with global variables the problem can be that the variables are not initialized in the right order.
int a;
int a=3; //error as cpp compiled with clang++-7 compiler but not as C compiled with clang-7;
int main() {
}
For C, the compiler seems to merge these symbols into one global symbol but for C++ it is an error.
Demo
file1:
int a = 2;
file2:
#include<stdio.h>
int a;
int main() {
printf("%d", a); //2
}
As C files compiled with clang-7, the linker does not produce an error and I assume it converts the uninitialised global symbol 'a' to an extern symbol (treating it as if it were compiled as an extern declaration). As C++ files compiled with clang++-7, the linker produces a multiple definition error.
Update: the linked question does answer the first example in my question, specifically 'In C, If an actual external definition is found earlier or later in the same translation unit, then the tentative definition just acts as a declaration.' and 'C++ does not have “tentative definitions”'.
As for the second scenario, if I printf a, then it does print 2, so obviously the linker has linked it correctly (but I previously would have assumed that a tentative definition would be initialised to 0 by the compiler as a global definition and would cause a link error).
It turns out that an int i[]; tentative definition in both files also gets linked to one definition. int i[5]; is also a tentative definition in .common, just with a different size expressed to the assembler. The former is known as a tentative definition with an incomplete type, whereas the latter is a tentative definition with a complete type.
What happens with the C compiler is that int a is made a strong-bound weak global in .common and left uninitialised in the symbol table (.common implies a weak global), whereas extern int a would be an extern symbol. The linker then makes the necessary decision: it ignores all weak-bound globals defined using #pragma weak if there is a strong-bound global with the same identifier in a translation unit, and two strong-bounds would be a multiple definition error. If it finds no strong-bounds and one weak-bound, the output is a single weak-bound; if it finds no strong-bounds but two weak-bounds, it chooses the definition in the first file on the command line and outputs the single weak-bound. Though two weak-bounds are two definitions to the linker (because they are initialised to 0 by the compiler), it is not a multiple definition error, because they are both weak-bound. The linker then resolves all .common symbols to point to the strong/weak-bound strong global. https://godbolt.org/z/Xu_8tY https://docs.oracle.com/cd/E19120-01/open.solaris/819-0690/chapter2-93321/index.html
As baz is declared with #pragma weak, it is weak-bound and gets zeroed by the compiler and put in .bss (even though it is a weak global, it doesn't go in .common, because it is weak-bound; all weak-bound variables go in .bss if uninitialised and get initialised by the compiler, or .data if they are initialised). If it were not declared with #pragma weak, baz would go in common and the linker will zero it if no weak/strong-bound strong global symbol is found.
C++ compiler makes int a a strong-bound strong global in .bss and initialises it to 0: https://godbolt.org/z/aGT2-o, therefore the linker treats it as a multiple definition.
Update 2:
GCC 10.1 defaults to -fno-common. As a result, global variable accesses are more efficient on various targets. In C, global variables with multiple tentative definitions now result in linker errors (like C++). With -fcommon such definitions are silently merged during linking.
I'll address the C end of the question, since I'm more familiar with that language and you seem to already be pretty clear on why the C++ side works as it does. Someone else is welcome to add a detailed C++ answer.
As you noted, in your first example, C treats the line int a; as a tentative definition (see 6.9.2 in N2176). The later int a = 3; is a declaration with an initializer, so it is an external definition. As such, the earlier tentative definition int a; is treated as merely a declaration. So, retroactively, you have first declared a variable at file scope and later defined it (with an initializer). No problem.
In your second example, file2 also has a tentative definition of a. There is no external definition in this translation unit, so
the behavior is exactly as if the translation
unit contains a file scope declaration of that identifier, with the composite type as of the end of the
translation unit, with an initializer equal to 0. [6.9.2 (1)]
That is, it is as if you had written int a = 0; in file2. Now you have two external definitions of a in your program, one in file1 and another in file2. This violates 6.9 (5):
If an identifier declared with external linkage is used in an expression
(other than as part of the operand of a sizeof or _Alignof operator whose result is an integer
constant), somewhere in the entire program there shall be exactly one external definition for the
identifier; otherwise, there shall be no more than one.
So under the C standard, the behavior of your program is undefined and the compiler is free to do as it likes. (But note that no diagnostic is required.) With your particular implementation, instead of summoning nasal demons, what your compiler chooses to do is what you described: use the common feature of your object file format, and have the linker merge the definitions into one. Although not required by the standard, this behavior is traditional at least on Unix, and is mentioned by the standard as a "common extension" (no pun intended) in J.5.11.
This feature is quite convenient, in my opinion, but since it's only possible if your object file format supports it, we couldn't really expect the C standard authors to mandate it.
clang doesn't document this behavior very clearly, as far as I can see, but gcc, which has the same behavior, describes it under the -fcommon option. On either compiler, you can disable it with -fno-common, and then your program should fail to link with a multiple definition error.
When I declare a global variable in two different source files and only define it in one of the source files, I get different results compiling for C++ than for C. See the following example:
main.c
#include <stdio.h>
#include "func.h" // only contains declaration of void print();
int def_var = 10;
int main() {
printf("%d\n", def_var);
return 0;
}
func.c
#include <stdio.h>
#include "func.h"
/* extern */int def_var; // extern needed for C++ but not for C?
void print() {
printf("%d\n", def_var);
}
I compile with the following commands:
gcc/g++ -c main.c -o main.o
gcc/g++ -c func.c -o func.o
gcc/g++ main.o func.o -o main
g++/clang++ complain about multiple definition of def_var (this is the behaviour I expected, when not using extern).
gcc/clang compile just fine. (using gcc 7.3.1 and clang 5.0)
According to this link:
A tentative definition is a declaration that may or may not act as a definition. If an actual external definition is found earlier or later in the same translation unit, then the tentative definition just acts as a declaration.
So my variable def_var should be defined at the end of each translation unit and then result in multiple definitions (as it is done for C++). Why is that not the case when compiling with gcc/clang?
This isn't valid C either, strictly speaking. The standard says as much in
6.9 External definitions - p5
An external definition is an external declaration that is also a
definition of a function (other than an inline definition) or an
object. If an identifier declared with external linkage is used in an
expression (other than as part of the operand of a sizeof or _Alignof
operator whose result is an integer constant), somewhere in the entire
program there shall be exactly one external definition for the
identifier; otherwise, there shall be no more than one.
You have two definitions for an identifier with external linkage. That violates the requirement, so the behavior is undefined. The fact that the program links and works does not contradict that: no diagnostic is required.
And it's worth noting that C++ is no different in that regard.
[basic.def.odr]/4
Every program shall contain exactly one definition of every non-inline
function or variable that is odr-used in that program outside of a
discarded statement; no diagnostic required. The definition can appear
explicitly in the program, it can be found in the standard or a
user-defined library, or (when appropriate) it is implicitly defined
(see [class.ctor], [class.dtor] and [class.copy]). An inline function
or variable shall be defined in every translation unit in which it is
odr-used outside of a discarded statement.
Again, a "shall" requirement, and it says explicitly that no diagnostic is required. As you may have noticed, there's quite a bit more machinery that this paragraph can apply to. So the front ends for GCC and Clang probably need to work harder, and as such are able to diagnose it, despite not being required to.
The program is ill-formed either way.
As M.M pointed out in a comment, the C standard has an informative section that mentions the very extension in zwol's answer.
J.5.11 Multiple external definitions
There may be more than one external definition for the identifier of
an object, with or without the explicit use of the keyword extern; if
the definitions disagree, or more than one is initialized, the
behavior is undefined (6.9.2).
I believe you are observing an extension to C known as "common symbols", implemented by most, but not all, Unix-lineage C compilers, originally (IIUC) for compatibility with FORTRAN. The extension generalizes the "tentative definitions" rule described in StoryTeller's answer to multiple translation units. All external object definitions with the same name and no initializer,
int foo; // at file scope
are collapsed into one, even if they appear in more than one TU, and if there exists an external definition with an initializer for that name,
int foo = 1; // different TU, also file scope
then all of the external definitions with no initializers are treated as external declarations. C++ compilers do not implement this extension, because (oversimplifying) nobody wanted to figure out what it should do in the presence of templates. For GCC and Clang, you can disable the extension with -fno-common, but other Unix C compilers may not have any way to turn it off.
I have tested the following code:
in file a.c/a.cpp
int a;
in file b.c/b.cpp
int a;
int main() { return 0; }
When I compile the source files with gcc *.c -o test, it succeeds.
But when I compile the source files with g++ *.cpp -o test, it fails:
ccIJdJPe.o:b.cpp:(.bss+0x0): multiple definition of 'a'
ccOSsV4n.o:a.cpp:(.bss+0x0): first defined here
collect2.exe: error: ld returned 1 exit status
I'm really confused about this. Is there any difference between the global variables in C and C++?
Here are the relevant parts of the standard. See my explanation below the standard text:
§6.9.2/2 External object definitions
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
ISO C99 §6.9/5 External definitions
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
With the C version, the 'a' global variables are 'merged' into one, so at the end of the day you have only one variable, which is declared twice. This is OK because it dates from a time when extern was not needed, or perhaps did not exist. Hence, it is kept for historical and compatibility reasons, to build old code. This is a gcc extension for this legacy feature.
It basically makes gcc allocate memory for a single variable with the name 'a', so there can be more than one declaration, but only one definition. That is why the code below will not work even with gcc.
This is also called a tentative definition. There is no such thing in C++, and that is why it fails to compile there. C++ has no concept of a tentative definition.
A tentative definition is any external data declaration that has no storage class specifier and no initializer. A tentative definition becomes a full definition if the end of the translation unit is reached and no definition has appeared with an initializer for the identifier. In this situation, the compiler reserves uninitialized space for the object defined.
Note however that the following code will not build even with gcc, because these are no longer tentative definitions once values are assigned:
in file "a.c/a.cpp"
int a = 1;
in file "b.c/b.cpp"
int a = 2;
int main() { return 0; }
Let us go beyond this with further examples. The following statements show normal definitions and tentative definitions. Note that static makes a difference: it gives the identifier internal linkage, so it is no longer external.
int i1 = 10; /* definition, external linkage */
static int i2 = 20; /* definition, internal linkage */
extern int i3 = 30; /* definition, external linkage */
int i4; /* tentative definition, external linkage */
static int i5; /* tentative definition, internal linkage */
int i1; /* valid tentative definition */
int i2; /* not legal, linkage disagreement with previous */
int i3; /* valid tentative definition */
int i4; /* valid tentative definition */
int i5; /* not legal, linkage disagreement with previous */
Further details can be found on the following page:
http://c0x.coding-guidelines.com/6.9.2.html
See also this blog post for further details:
http://ninjalj.blogspot.co.uk/2011/10/tentative-definitions-in-c.html
gcc implements a legacy feature where uninitialized global variables are placed in a common block.
Although in each translation unit the definitions are tentative, in ISO C, at the end of the translation unit, tentative definitions are "upgraded" to full definitions if they haven't already been merged into a non-tentative definition.
In standard C, it is always incorrect to have the same variable with external linkage defined in more than one translation unit, even if these definitions came from tentative definitions.
To get the same behaviour as C++, you can use the -fno-common switch with gcc and this will result in the same error. (If you are using the GNU linker and don't use -fno-common you might also want to consider using the --warn-common / -Wl,--warn-common option to highlight the link time behaviour on encountering multiple common and non-common symbols with the same name.)
From the gcc man page:
-fno-common
In C code, controls the placement of uninitialized global
variables. Unix C compilers have traditionally permitted multiple
definitions of such variables in different compilation units by
placing the variables in a common block. This is the behavior
specified by -fcommon, and is the default for GCC on most
targets. On the other hand, this behavior is not required by ISO
C, and on some targets may carry a speed or code size penalty on
variable references. The -fno-common option specifies that the
compiler should place uninitialized global variables in the data
section of the object file, rather than generating them as common
blocks. This has the effect that if the same variable is declared
(without extern) in two different compilations, you will get a
multiple-definition error when you link them. In this case, you
must compile with -fcommon instead. Compiling with
-fno-common is useful on targets for which it provides better
performance, or if you wish to verify that the program will work
on other systems which always treat uninitialized variable
declarations this way.
gcc's behaviour is a common one and it is described in Annex J of the standard (which is not normative) which describes commonly implemented extensions to the standard:
J.5.11 Multiple external definitions
There may be more than one external definition for the identifier of an object, with or
without the explicit use of the keyword extern; if the definitions disagree, or more than
one is initialized, the behavior is undefined (6.9.2).
...
#include "test1.h"
int main(..)
{
cout << aaa << endl;
}
aaa is defined in test1.h, and I didn't use the extern keyword, but I can still reference aaa.
So I wonder: is extern really necessary?
extern has its uses. But it mainly involves "global variables" which are frowned upon. The main idea behind extern is to declare things with external linkage. As such it's kind of the opposite of static. But external linkage is in many cases the default linkage so you don't need extern in those cases. Another use of extern is: It can turn definitions into declarations. Examples:
extern int i; // Declaration of i with external linkage
// (only tells the compiler about the existence of i)
int i; // Definition of i with external linkage
// (actually reserves memory, should not be in a header file)
const int f = 3; // Definition of f with internal linkage (due to const)
// (This applies to C++ only, not C. In C f would have
// external linkage.) In C++ it's perfectly fine to put
// something like this into a header file.
extern const int g; // Declaration of g with external linkage
// could be placed into a header file
extern const int g = 3; // Definition of g with external linkage
// Not supposed to be in a header file
static int t; // Definition of t with internal linkage.
// may appear anywhere. Every translation unit that
// has a line like this has its very own t object.
You see, it's rather complicated. There are two orthogonal concepts: Linkage (external vs internal) and the matter of declaration vs definition. The extern keyword can affect both. With respect to linkage it's the opposite of static. But the meaning of static is also overloaded and -- depending on the context -- does or does not control linkage. The other thing it does is to control the life-time of objects ("static life-time"). But at global scope all variables already have a static life-time and some people thought it would be a good idea to recycle the keyword for controlling linkage (this is me just guessing).
Linkage basically is a property of an object or function declared/defined at "namespace scope". If it has internal linkage, it won't be directly accessible by name from other translation units. If it has external linkage, there shall be only one definition across all translation units (with exceptions, see one-definition-rule).
I've found the best way to organise your data is to follow two simple rules:
Only declare things in header files.
Define things in C (or cpp, but I'll just use C here for simplicity) files.
By declare, I mean notify the compiler that things exist, but don't allocate storage for them. This includes typedef, struct, extern and so on.
By define, I generally mean "allocate space for", like int and so on.
If you have a line like:
int aaa;
in a header file, every compilation unit (basically defined as an input stream to the compiler - the C file along with everything it brings in with #include, recursively) will get its own copy. That's going to cause problems if you link two object files together that have the same symbol defined (except under certain limited circumstances like const).
A better way to do this is to define that aaa variable in one of your C files and then put:
extern int aaa;
in your header file.
Note that if your header file is only included in one C file, this isn't a problem. But, in that case, I probably wouldn't even have a header file. Header files are, in my opinion, only for sharing things between compilation units.
If your test1.h has the definition of aaa and you include the header file in more than one translation unit, you will run into a multiple definition error, unless aaa is constant.
It is better to define aaa in a cpp file and put an extern declaration in a header file that other files can include.
Rule of thumb for variables and constants in a header file:
extern int a ;//Data declarations
const float pi = 3.141593 ;//Constant definitions
Since constants have internal linkage in C++, any constant defined in a translation unit is not visible to other translation units. That is not the case for variables: they have external linkage, i.e. they are visible to other translation units. Putting the definition of a variable in a header that is shared between translation units therefore leads to a multiple definition error.
In that case, extern is not necessary. extern is needed when the symbol is defined in another compilation unit.
When you use the #include preprocessing directive, the contents of the included file are copied in place of the directive. In this case you don't need extern because the compiler already knows aaa.
If aaa is not defined in another compilation unit you don't need extern; otherwise you do.