I see a common pattern in many C++ codebases:
Header.h:
static const int myConstant = 1;
Source1.cpp:
#include "Header.h"
Source2.cpp:
#include "Header.h"
Based on:
3.5 Program and linkage
...
(2.1) — When a name has external linkage , the entity it denotes can be referred to by names from scopes of
other translation units or from other scopes of the same translation unit.
(2.2) — When a name has internal linkage , the entity it denotes can be referred to by names from other scopes
in the same translation unit.
...
3 A name having namespace scope (3.3.6) has internal linkage if it is the name of
(3.1) — a variable, function or function template that is explicitly declared static; or,
myConstant is accessible only from the same translation unit and the compiler will generate multiple instances of it, one for each translation unit that included Header.h.
Is my understanding correct - multiple instances of myConstant are created? If this is the case can you please point me to better alternatives of using constants in C++
EDIT:
Some suggested to make myConstant extern in the header and define it in one cpp file. Is this a good practice? I guess this will make the value invisible to the compiler and prevent many optimizations, for example when the value appears in arithmetic operations.
What you're doing should be fine. The optimizer will probably avoid creating any storage for the constants, and will instead replace any uses of it with the value, as long as you never take the address of the variable (e.g. &myConstant).
A pattern static const int myConstant = 1 arising in header files is a little bit strange, because keyword static restricts the scope of a variable definition to the specific translation unit. Hence, this variable can then not be accessed from other translation units. So I don't see why someone might expose a variable in a header file though this variable can never be addressed from "outside".
Note that if different translation units include the header, then each translation unit will define its own, somewhat "private" instance of this variable.
I think that the common pattern should be:
In the header file:
extern const int myConstant;
In exactly one implementation file of the whole program:
const int myConstant = 1;
The comments say, however, that this will prevent the compiler from optimisations, as the value of the constant is not know at the time a translation unit is compiled (and this sounds reasonable).
So it seems that "global/shared" constants are not possible and that one might have to live with the - somewhat contradicting - keyword static in a header file.
Additionally, I'd use constexr to indicate a compile time constant (though the compiler might derive this anyway):
static constexpr int x = 1;
Because the static-keyword still disturbs me somehow, I did some research and experiments on constexpr without a static keyword but with an extern keyword. Unfortunately, an extern constexpr still requires an initialisation (which makes it a definition then and leads to duplicate symbol errors). Interestingly , at least with my compiler, I can actually define constexpr int x = 1 in different translation units without introducing a compiler/linker error. But I do not find a support for this behaviour in the standard. But defining constexpr int x = 1 in a header file is even more curious than static constexpr int x = 1.
So - many words, few findings. I think static constexpr int x = 1 is the best choice.
Related
I have the following code in a header file, that is included in 2 different cpp files:
constexpr int array[] = { 11, 12, 13, 14, 15 };
inline const int* find(int id)
{
auto it = std::find(std::begin(array), std::end(array), id);
return it != std::end(array) ? &*it : nullptr;
}
I then call find(13) in each of the cpp files. Will both pointers returned by find() point to the same address in memory?
The reason I ask is because I have similar code in my project and sometimes it works and sometimes it doesn't. I assumed both pointers would point to the same location, but I don't really have a basis for that assumption :)
In C++11 and C++14:
In your example array has internal linkage (see [basic.link]/3.2), which means it will have different addresses in different translation units.
And so it's an ODR violation to include and call find in different translation units (since its definition is different).
A simple solution is to declare it extern.
In C++17:
[basic.link]/3.2 has changed such that constexpr can be inline in which case there will be no effect on the linkage anymore.
Which means that if you declare array inline it'll have external linkage and will have the same address across translation units. Of course like with any inline, it must have identical definition in all translation units.
I can't claim to be an expert in this, but according to this blog post, the code you have there should do what you want in C++17, because constexpr then implies inline and that page says (and I believe it):
A variable declared inline has the same semantics as a function declared inline: it can be defined, identically, in multiple translation units, must be defined in every translation unit in which it is used, and the behavior of the program is as if there was exactly one variable.
So, two things to do:
make sure you are compiling as C++17
declare array as constexpr inline to force a compiler error on older compilers (and to ensure that you actually get the semantics you want - see comments below)
I believe that will do it.
Learning from this: By default, non-const variables declared outside of a block are assumed to be external. However, const variables declared outside of a block are assumed to be internal.
But if I write this inside MyTools.h:
#ifndef _TOOLSIPLUG_
#define _TOOLSIPLUG_
typedef struct {
double LN20;
} S;
S tool;
#endif // !_TOOLSIPLUG_
and I include MyTools.h 3 times (from 3 different .cpp) it says that tool is already defined (at linker phase). Only if I change in:
extern S tool;
works. Isn't external default?
The declaration S tool; at namespace scope, does declare an extern linkage variable. It also defines it. And that's the problem, since you do that in three different translation units: you're saying to the linker that you have accidentally named three different global variables, of the same type, the same.
One way to achieve the effect you seem to desire, a single global shared variable declared in the header, is to do this:
inline auto tool_instance()
-> S&
{
static S the_tool; // One single instance shared in all units.
return the_tool;
}
static S& tool = tool_instance();
The function-accessing-a-local-static is called a Meyers' singleton, after Scott Meyers.
In C++17 and later you can also just declare an inline variable.
Note that global variables are considered Evil™. Quoting Wikipedia on that issue:
” The use of global variables makes software harder to read and understand. Since any code anywhere in the program can change the value of the variable at any time, understanding the use of the variable may entail understanding a large portion of the program. Global variables make separating code into reusable libraries more difficult. They can lead to problems of naming because a global variable defined in one file may conflict with the same name used for a global variable in another file (thus causing linking to fail). A local variable of the same name can shield the global variable from access, again leading to harder-to-understand code. The setting of a global variable can create side effects that are hard to locate and predict. The use of global variables makes it more difficult to isolate units of code for purposes of unit testing; thus they can directly contribute to lowering the quality of the code.
Disclaimer: code not touched by compiler's hands.
There's a difference between a declaration and a definition. extern int i; is a declaration: it says that there's a variable named i whose type is int, and extern implies that it will be defined somewhere else. int i;, on the other hand, is a definition of the variable i. When you write a definition, it tells the compiler to create that variable. If you have definitions of the same variable in more than one source file, you've got multiple definitions, and the compiler (well, in practice, the linker) should complain. That's why putting the definition S tool; in the header creates problems: each source file that #include's that header ends up defining tool, and the compiler rightly complains.
The difference between a const and a not-const definition is, as you say, that const int i = 3; defines a variable named i that is local to the file being compiled. There's no problem having the same definition in more than one source file, because those guys all have internal linkage, that is, they aren't visible outside the source file. When you don't have the const, for example, with int i = 3;, that also defines a variable named i, but it has external linkage, that it, it's visible outside the source file, and having the same definition in multiple files gives you that error. (technically, it's definitions of the same name that lead to problems; int i = 3; and double i = 3.0; in two different source files still are duplicate definitions).
By default, non-const variables declared outside of a block are assumed to be external. However, const variables declared outside of a block are assumed to be internal.
That statement is still correct in your case.
In your MyTools.h, tool is external.
S tool; // define tool, external scope
In your cpp file, however, the extern keyword merely means that S is defined else where.
extern S tool; // declare tool
In C++, say we have this header file:
myglobals.h
#ifndef my_globals_h
#define my_globals_h
int monthsInYear = 12;
#endif
and we include it in multiple implementation files, then we will get compilation errors since we end up with 'monthsInYear' defined multiple times, once in each file that monthsInYear is included in.
Fine. So if we modify our header and make our global const, like so:
const int monthsInYear = 12;
then our compilation errors go away. The explanation for this, as I understand it, and as given here for example, is that the const keyword here changes the linkage of monthsInYear to internal, which means that now each compilation unit that includes the header now has its own copy of the variable with internal linkage, hence we no longer have multiple definitions.
Now, an alternative would be to merely declare the variable in the header with extern, i.e.:
extern int monthsInYear;
and then define it in one of the implementation files that includes the header, i.e.:
#include "myglobals.h"
...
int monthsInYear = 12;
...
and then everywhere that includes it is dealing with one variable with external linkage.
This is fine, but the thing I'm a bit puzzled by is that using const gives it internal linkage, which fixes the problem, and using extern gives it external linkage, which also fixes the problem! It's like saying that any linkage will do, as long as we specify it. On top of that, when we just wrote:
int monthsInYear = 12;
wasn't the linkage already external? Which is why adding 'const' changed the linkage to internal?
It seems to me that the reason using 'extern' here actually fixes the problem isn't because it gives us external linkage (we already had that) but rather because it allows us to merely declare the variable without defining it, which we otherwise wouldn't be able to do, since:
int monthsInYear = 12;
both declares and defines it, and since including the header effectively copy-pastes the code into the implemenation file we end up with multiple definitions. Conversely, since extern allows us to merely declare it we end up with multiple declarations every time we include the header, which is fine since we're allowed to have multiple declarations, just not multiple definitions.
Is my understanding here correct? Sorry if this is massively obvious, but it seems that extern does at least three things:
specifies external linkage
allows declaration of objects without defining them
specifies static storage duration
but a lot of sources I look at don't make this clear, and when talking about using extern to stop 'multiple definition' errors with global variables don't make it clear that the key thing is separating the declaration from the definition.
For a non-const variable, extern has the effect of specifying that the variable has external linkage (which is the default), but it also converts the definition into a declaration if there's no initializer - i.e. extern int i; does not actually define i. (extern int i = 5; would, and would hopefully generate a compiler warning).
When you write extern int monthsInYear; in multiple source files (or #include it into them, which is equivalent), none of them define it. int monthsInYear = 12; defines the variable in that source file only.
The storage class specifiers are a part of the decl-specifier-seq of a declaration syntax. They control two independent properties of the names introduced by the declaration: their storage duration and their linkage.
extern - static or thread storage duration and external linkage
This is what cppreference says which matches your understanding perfectly.
I have tested the following code:
in file a.c/a.cpp
int a;
in file b.c/b.cpp
int a;
int main() { return 0; }
When I compile the source files with gcc *.c -o test, it succeeds.
But when I compile the source files with g++ *.c -o test, it fails:
ccIJdJPe.o:b.cpp:(.bss+0x0): multiple definition of 'a'
ccOSsV4n.o:a.cpp:(.bss+0x0): first defined here
collect2.exe: error: ld returned 1 exit status
I'm really confused about this. Is there any difference between the global variables in C and C++?
Here are the relevant parts of the standard. See my explanation below the standard text:
§6.9.2/2 External object definitions
A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0.
ISO C99 §6.9/5 External definitions
An external definition is an external declaration that is also a definition of a function (other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
With the C version, the 'g' global variables are 'merged' into one, so you will only have one in the end of the day which is declared twice. This is OK due to the time when extern was not needed, or perhaps did not exits. Hence, this is for historical and compatibility reason to build old code. This is a gcc extension for this legacy feature.
It basically makes gcc allocate memory for a variable with the name 'a', so there can be more than one declarations, but only one definition. That is why the code below will not work even with gcc.
This is also called tentative definition. There is no such a thing with C++, and that is while it compiles. C++ has no concept of tentative declaration.
A tentative definition is any external data declaration that has no storage class specifier and no initializer. A tentative definition becomes a full definition if the end of the translation unit is reached and no definition has appeared with an initializer for the identifier. In this situation, the compiler reserves uninitialized space for the object defined.
Note however that the following code will not compile even with gcc because this is tentative definition/declaration anymore with values assigned:
in file "a.c/a.cpp"
int a = 1;
in file "b.c/b.cpp"
int a = 2;
int main() { return 0; }
Let us go even beyond this with further examples. The following statements show normal definitions and tentative definitions. Note, static would make it a bit difference since that is file scope, and would not be external anymore.
int i1 = 10; /* definition, external linkage */
static int i2 = 20; /* definition, internal linkage */
extern int i3 = 30; /* definition, external linkage */
int i4; /* tentative definition, external linkage */
static int i5; /* tentative definition, internal linkage */
int i1; /* valid tentative definition */
int i2; /* not legal, linkage disagreement with previous */
int i3; /* valid tentative definition */
int i4; /* valid tentative definition */
int i5; /* not legal, linkage disagreement with previous */
Further details can be on the following page:
http://c0x.coding-guidelines.com/6.9.2.html
See also this blog post for further details:
http://ninjalj.blogspot.co.uk/2011/10/tentative-definitions-in-c.html
gcc implements a legacy feature where uninitialized global variables are placed in a common block.
Although in each translation unit the definitions are tentative, in ISO C, at the end of the translation unit, tentative definitions are "upgraded" to full definitions if they haven't already been merged into a non-tentative definition.
In standard C, it is always incorrect to have the same variables with external linkage defined in more that one translation unit even if these definitions came from tentative definitions.
To get the same behaviour as C++, you can use the -fno-common switch with gcc and this will result in the same error. (If you are using the GNU linker and don't use -fno-common you might also want to consider using the --warn-common / -Wl,--warn-common option to highlight the link time behaviour on encountering multiple common and non-common symbols with the same name.)
From the gcc man page:
-fno-common
In C code, controls the placement of uninitialized global
variables. Unix C compilers have traditionally permitted multiple
definitions of such variables in different compilation units by
placing the variables in a common block. This is the behavior
specified by -fcommon, and is the default for GCC on most
targets. On the other hand, this behavior is not required by ISO
C, and on some targets may carry a speed or code size penalty on
variable references. The -fno-common option specifies that the
compiler should place uninitialized global variables in the data
section of the object file, rather than generating them as common
blocks. This has the effect that if the same variable is declared
(without extern) in two different compilations, you will get a
multiple-definition error when you link them. In this case, you
must compile with -fcommon instead. Compiling with
-fno-common is useful on targets for which it provides better
performance, or if you wish to verify that the program will work
on other systems which always treat uninitialized variable
declarations this way.
gcc's behaviour is a common one and it is described in Annex J of the standard (which is not normative) which describes commonly implemented extensions to the standard:
J.5.11 Multiple external definitions
There may be more than one external definition for the identifier of an object, with or
without the explicit use of the keyword extern; if the definitions disagree, or more than
one is initialized, the behavior is undefined (6.9.2).
...
#include "test1.h"
int main(..)
{
count << aaa <<endl;
}
aaa is defined in test1.h,and I didn't use extern keyword,but still can reference aaa.
So I doubt is extern really necessary?
extern has its uses. But it mainly involves "global variables" which are frowned upon. The main idea behind extern is to declare things with external linkage. As such it's kind of the opposite of static. But external linkage is in many cases the default linkage so you don't need extern in those cases. Another use of extern is: It can turn definitions into declarations. Examples:
extern int i; // Declaration of i with external linkage
// (only tells the compiler about the existence of i)
int i; // Definition of i with external linkage
// (actually reserves memory, should not be in a header file)
const int f = 3; // Definition of f with internal linkage (due to const)
// (This applies to C++ only, not C. In C f would have
// external linkage.) In C++ it's perfectly fine to put
// somethibng like this into a header file.
extern const int g; // Declaration of g with external linkage
// could be placed into a header file
extern const int g = 3; // Definition of g with external linkage
// Not supposed to be in a header file
static int t; // Definition of t with internal linkage.
// may appear anywhere. Every translation unit that
// has a line like this has its very own t object.
You see, it's rather complicated. There are two orthogonal concepts: Linkage (external vs internal) and the matter of declaration vs definition. The extern keyword can affect both. With respect to linkage it's the opposite of static. But the meaning of static is also overloaded and -- depending on the context -- does or does not control linkage. The other thing it does is to control the life-time of objects ("static life-time"). But at global scope all variables already have a static life-time and some people thought it would be a good idea to recycle the keyword for controlling linkage (this is me just guessing).
Linkage basically is a property of an object or function declared/defined at "namespace scope". If it has internal linkage, it won't be directly accessible by name from other translation units. If it has external linkage, there shall be only one definition across all translation units (with exceptions, see one-definition-rule).
I've found the best way to organise your data is to follow two simple rules:
Only declare things in header files.
Define things in C (or cpp, but I'll just use C here for simplicity) files.
By declare, I mean notify the compiler that things exist, but don't allocate storage for them. This includes typedef, struct, extern and so on.
By define, I generally mean "allocate space for", like int and so on.
If you have a line like:
int aaa;
in a header file, every compilation unit (basically defined as an input stream to the compiler - the C file along with everything it brings in with #include, recursively) will get its own copy. That's going to cause problems if you link two object files together that have the same symbol defined (except under certain limited circumstances like const).
A better way to do this is to define that aaa variable in one of your C files and then put:
extern int aaa;
in your header file.
Note that if your header file is only included in one C file, this isn't a problem. But, in that case, I probably wouldn't even have a header file. Header files are, in my opinion, only for sharing things between compilation units.
If your test1.h has the definition of aaa and you wanted to include the header file into more than one translation unit you will run into multiple definition error, unless aaa is constant.
Better you define the aaa in a cpp file and add extern definition in header file that could be added to other files as header.
Thumb rule for having variable and constant in header file
extern int a ;//Data declarations
const float pi = 3.141593 ;//Constant definitions
Since constant have internal linkage in c++ any constant that is defined in a translation unit will not be visible to other translation unit, but it is not the case for variable they have external linkage i.e., they are visible to other translation unit. Putting the definition of a variable in a header, that is shared in other translation unit would lead to multiple definition of a variable, leading to multiple definition error.
In that case, extern is not necessary. Extern is needed when the symbol is declared in another compilation unit.
When you use the #include preprocessing directive, the included file is copied out in place of the directive. In this case you don't need extern because the compiler already know aaa.
If aaa is not defined in another compilation unit you don't need extern, otherwise you do.