In C and C++ we can manipulate a variable's linkage. There are three kinds of linkage: no linkage, internal linkage, and external linkage. My question is probably related to why these are called "linkage" (How is that related to the linker).
I understand a linker is able to handle variables with external linkage, because references to such a variable are not confined to a single translation unit, and therefore not confined to a single object file. How that actually works under the hood is typically discussed in courses on operating systems.
But how does the linker handle variables (1) with no linkage and (2) with internal linkage? What are the differences in these two cases?
As far as C++ itself goes, this does not matter: the only thing that matters is the behavior of the system as a whole. Variables with no linkage should not be linked; variables with internal linkage should not be linked across translation units; and variables with external linkage should be linked across translation units. (Of course, as the person writing the C++ code, you must obey all of your constraints as well.)
Inside a compiler and linker suite of programs, however, we certainly do have to care about this. The method by which we achieve the desired result is up to us. One traditional method is pretty simple:
Identifiers with no linkage are never even passed through to the linker.
Identifiers with internal linkage are not passed through to the linker either, or are passed through to the linker but marked "for use within this one translation unit only". That is, there is no .global declaration for them, or there is a .local declaration for them, or similar.
Identifiers with external linkage are passed through to the linker, and if internal linkage identifiers are seen by the linker, these external linkage symbols are marked differently, e.g., have a .global declaration or no .local declaration.
If you have a Linux or Unix like system, run nm on object (.o) files produced by the compiler. Note that some symbols are annotated with uppercase letters like T and D for text and data: these are global. Other symbols are annotated with lowercase letters like t and d: these are local. So these systems are using the "pass internal linkage to the linker, but mark them differently from external linkage" method.
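As a rough illustration of the three cases, here is a small sketch. The file name and every identifier are hypothetical, and the nm letters in the comments are what GNU-style toolchains typically print, not something the language guarantees.

```cpp
// linkage_kinds.cpp -- hypothetical file name; all identifiers are illustrative.

int exported_counter = 1;         // external linkage; nm typically shows 'D'
static int private_counter = 2;   // internal linkage; nm typically shows 'd'

int bump()                        // external linkage; nm typically shows 'T'
{
    static int calls = 0;         // no linkage; if it shows up at all, it is as a
                                  // local symbol with a compiler-chosen name
    int scratch = exported_counter + private_counter;  // no linkage, no symbol
    return scratch + ++calls;
}
```

Compiling this with g++ -c and running nm on the resulting object file should show the uppercase/lowercase split described above, although the exact names and letters depend on the platform.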
The linker isn't normally involved in either internal linkage or no linkage--they're resolved entirely by the compiler, before the linker gets into the act at all.
Internal linkage means two declarations at different scopes in the same translation unit can refer to the same thing.
No Linkage
No linkage means two declarations at different scopes in the same translation unit can't refer to the same thing.
So, if I have something like:
int f() {
    static int x; // no linkage
    return x;     // a non-void function must return a value
}
...no other declaration of x in any other scope can refer to this x. The linker is involved only to the degree that it typically has to produce a field in the executable telling it the size of static space needed by the executable, and that will include space for this variable. Since it can never be referred to by any other declaration, there's no need for the linker to get involved beyond that though (in particular, the linker has nothing to do with resolving the name).
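A minimal sketch of that point, using two hypothetical functions: each block-scope static named x is its own object, and nothing outside the enclosing function can name it.

```cpp
// Two block-scope statics that happen to share the name `x`.
int next_even()
{
    static int x = 0;   // no linkage: private to next_even()
    x += 2;
    return x;
}

int next_odd()
{
    static int x = -1;  // a different object, despite the same name
    x += 2;
    return x;
}
```

Calling the two functions alternately yields two independent sequences, which is only possible because the two declarations of x are never linked to each other.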
Internal linkage
Internal linkage means declarations at different scopes in the same translation unit can refer to the same object. For example:
static int x; // a namespace scope, so `x` has internal linkage
void f() {
    extern int x; // declaration in one scope
}
void g() {
    extern int x; // declaration in another scope
}
Assuming we put these all in one file (i.e., they end up as a single translation unit), the declarations in both f() and g() refer to the same thing--the x that's defined as static at namespace scope.
For example, consider code like this:
#include <iostream>
static int x; // a namespace scope, so `x` has internal linkage
void f() // void: the function does not return a value
{
    extern int x;
    ++x;
}
void g()
{
    extern int x;
    std::cout << x << '\n';
}
int main() {
g();
f();
g();
}
This will print:
0
1
...because the x being incremented in f() is the same x that's being printed in g().
The linker's involvement here can be (and usually is) pretty much the same as in the no linkage case--the variable x needs some space, and the linker specifies that space when it creates the executable. It does not, however, need to get involved in determining that when f() and g() both declare x, they're referring to the same x--the compiler can determine that.
We can see this in the generated code. For example, if we compile the code above with gcc, the relevant bits for f() and g() are these.
f:
movl _ZL1x(%rip), %eax
addl $1, %eax
movl %eax, _ZL1x(%rip)
That's the increment of x (it uses the name _ZL1x for it).
g:
movl _ZL1x(%rip), %eax
[...]
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_c@PLT
So that's basically loading up x, then sending it to std::cout (I've left out code for other parameters we don't care about here).
The important part is that the code refers to _ZL1x--the same name as f used, so both of them refer to the same object.
The linker isn't really involved, because all it sees is that this file has requested space for one statically allocated variable. It makes space for that, but doesn't have to do anything to make f and g refer to the same thing--that's already handled by the compiler.
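The same conclusion can be checked from inside the program rather than from the assembly. This is only a sketch; the helper names seen_by_f, seen_by_g, and same_object are made up for illustration.

```cpp
static int x;   // namespace scope: internal linkage

const int* seen_by_f()
{
    extern int x;   // redeclares the namespace-scope x
    return &x;
}

const int* seen_by_g()
{
    extern int x;   // a second block-scope declaration of the same x
    return &x;
}

bool same_object()
{
    // All three names denote one object, so the addresses compare equal.
    return seen_by_f() == seen_by_g() && seen_by_f() == &::x;
}
```

The compiler resolves all of this within the translation unit; nothing here requires the linker to match names.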
My question is probably related to why these are called "linkage" (How is that related to the linker).
According to the C standard,
An identifier declared in different scopes or in the same scope more
than once can be made to refer to the same object or function by a
process called linkage.
The term "linkage" seems reasonably well fitting -- different declarations of the same identifier are linked together so that they refer to the same object or function. That being the chosen terminology, it's pretty natural that a program that actually makes linkage happen is conventionally called a "linker".
But how does the linker handle variables (1) with no linkage and (2) with internal linkage? What are the differences in these two cases?
The linker does not have to do anything with identifiers that have no linkage. Every such declaration of an object identifier declares a distinct object (and function declarations always have internal or external linkage).
The linker does not necessarily do anything with identifiers having internal linkage, either, as the compiler can generally do everything that needs to be done with these. Nevertheless, identifiers with internal linkage can be declared multiple times in the same translation unit, with those identifiers all referring to the same object or function. The most common case is a static function with a forward declaration:
static void internal(void);
// ...
static void internal(void) {
// do something
}
File-scope variables can also have internal linkage and multiple declarations that are all linked to refer to the same object, but the multiple declaration part is not as useful for variables.
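One situation where the forward declaration is actually required is mutual recursion between two internal functions. A small sketch with hypothetical names, written so that it compiles as C++:

```cpp
static bool is_odd(unsigned n);   // forward declaration, internal linkage

static bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

static bool is_odd(unsigned n)    // definition; linked to the declaration above
{
    return n == 0 ? false : is_even(n - 1);
}
```

Both declarations of is_odd refer to the same function, and neither name is usable from another translation unit.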
Related
int a;
int a=3; //error as cpp compiled with clang++-7 compiler but not as C compiled with clang-7;
int main() {
}
For C, the compiler seems to merge these symbols into one global symbol but for C++ it is an error.
Demo
file1:
int a = 2;
file2:
#include<stdio.h>
int a;
int main() {
printf("%d", a); //2
}
As C files compiled with clang-7, the linker does not produce an error and I assume it converts the uninitialised global symbol 'a' to an extern symbol (treating it as if it were compiled as an extern declaration). As C++ files compiled with clang++-7, the linker produces a multiple definition error.
Update: the linked question does answer the first example in my question, specifically 'In C, If an actual external definition is found earlier or later in the same translation unit, then the tentative definition just acts as a declaration.' and 'C++ does not have “tentative definitions”'.
As for the second scenario, if I printf a, then it does print 2, so obviously the linker has linked it correctly (but I previously would have assumed that a tentative definition would be initialised to 0 by the compiler as a global definition and would cause a link error).
It turns out that an int i[]; tentative definition in both files also gets linked to one definition. int i[5]; is also a tentative definition in .common, just with a different size expressed to the assembler. The former is known as a tentative definition with an incomplete type, whereas the latter is a tentative definition with a complete type.
What happens with the C compiler is that int a is made a strong-bound weak global in .common and left uninitialised in the symbol table (.common implies a weak global), whereas extern int a would be an extern symbol. The linker then makes the necessary decision. If there is a strong-bound global with the same identifier in a translation unit, it ignores all weak-bound globals defined using #pragma weak; two strong-bounds would be a multiple definition error. If it finds no strong-bounds and one weak-bound, the output is that single weak-bound. If it finds no strong-bounds but two weak-bounds, it chooses the definition in the first file on the command line and outputs a single weak-bound; although two weak-bounds are two definitions to the linker (because they are initialised to 0 by the compiler), this is not a multiple definition error, because they are both weak-bound. Finally, it resolves all .common symbols to point to the strong/weak-bound strong global. https://godbolt.org/z/Xu_8tY https://docs.oracle.com/cd/E19120-01/open.solaris/819-0690/chapter2-93321/index.html
As baz is declared with #pragma weak, it is weak-bound and gets zeroed by the compiler and put in .bss (even though it is a weak global, it doesn't go in .common, because it is weak-bound; all weak-bound variables go in .bss if uninitialised and get initialised by the compiler, or .data if they are initialised). If it were not declared with #pragma weak, baz would go in common and the linker will zero it if no weak/strong-bound strong global symbol is found.
The C++ compiler makes int a a strong-bound strong global in .bss and initialises it to 0: https://godbolt.org/z/aGT2-o, therefore the linker treats it as a multiple definition.
Update 2:
GCC 10.1 defaults to -fno-common. As a result, global variable accesses are more efficient on various targets. In C, global variables with multiple tentative definitions now result in linker errors (like C++). With -fcommon such definitions are silently merged during linking.
I'll address the C end of the question, since I'm more familiar with that language and you seem to already be pretty clear on why the C++ side works as it does. Someone else is welcome to add a detailed C++ answer.
As you noted, in your first example, C treats the line int a; as a tentative definition (see 6.9.2 in N2176). The later int a = 3; is a declaration with an initializer, so it is an external definition. As such, the earlier tentative definition int a; is treated as merely a declaration. So, retroactively, you have first declared a variable at file scope and later defined it (with an initializer). No problem.
In your second example, file2 also has a tentative definition of a. There is no external definition in this translation unit, so
the behavior is exactly as if the translation
unit contains a file scope declaration of that identifier, with the composite type as of the end of the
translation unit, with an initializer equal to 0. [6.9.2 (1)]
That is, it is as if you had written int a = 0; in file2. Now you have two external definitions of a in your program, one in file1 and another in file2. This violates 6.9 (5):
If an identifier declared with external linkage is used in an expression
(other than as part of the operand of a sizeof or _Alignof operator whose result is an integer
constant), somewhere in the entire program there shall be exactly one external definition for the
identifier; otherwise, there shall be no more than one.
So under the C standard, the behavior of your program is undefined and the compiler is free to do as it likes. (But note that no diagnostic is required.) With your particular implementation, instead of summoning nasal demons, what your compiler chooses to do is what you described: use the common feature of your object file format, and have the linker merge the definitions into one. Although not required by the standard, this behavior is traditional at least on Unix, and is mentioned by the standard as a "common extension" (no pun intended) in J.5.11.
This feature is quite convenient, in my opinion, but since it's only possible if your object file format supports it, we couldn't really expect the C standard authors to mandate it.
clang doesn't document this behavior very clearly, as far as I can see, but gcc, which has the same behavior, describes it under the -fcommon option. On either compiler, you can disable it with -fno-common, and then your program should fail to link with a multiple definition error.
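If you would rather not depend on the common extension at all, the portable arrangement is one definition plus extern declarations everywhere else. The sketch below collapses three hypothetical files (shared.h, file1, file2) into a single listing so that it compiles on its own; it is valid as both C and C++.

```cpp
// --- shared.h (hypothetical header): declaration only, no storage ---
extern int a;

// --- file1 (hypothetical): the single definition in the whole program ---
int a = 2;

// --- file2 (hypothetical): would include shared.h and just use `a` ---
int read_a()
{
    return a;
}
```

With this layout there is exactly one external definition of a, so the program links the same way with -fcommon and with -fno-common.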
I understand that there are three possible linkage values for a variable in C++ - no linkage, internal linkage and external linkage.
So external linkage means that the variable identifier is accessible in multiple files, and internal linkage means that it is accessible within the same file. But what is the point of internal linkage? Why not just have two possible linkages for an identifier - no linkage and external linkage? To me it seems like global (or file) scope and internal linkage serve the same purpose.
Is there any use case where internal linkage is actually useful that is not covered by global scope?
In the example below, I have two pieces of code: the first one links to the static int i11 (which has internal linkage), and the second one does not. Both pretty much do the same thing, since main already has access to the variable i11 due to its file scope. So why have a separate linkage called internal linkage?
#include <iostream>
static int i11 = 10;
int main()
{
    extern int i11;
    std::cout << ::i11;
    return 0;
}
gives the same result as
#include <iostream>
static int i11 = 10;
int main()
{
    std::cout << ::i11;
    return 0;
}
EDIT: Just to add more clarity, as per HolyBlackCat's definition below, internal linkage really means you can forward-declare a variable within the same translation unit. But why would you even need to do that for a variable that is already globally accessible within the file? Is there any use case for this feature?
Examples of each:
External linkage:
foo.h
extern int foo; // Declaration
foo.cpp
extern int foo = 42; // Definition
bar.cpp
#include "foo.h"
int bar() { return foo; } // Use
Internal linkage:
foo.cpp
static int foo = 42; // No relation to foo in bar.cpp
bar.cpp
static int foo = -43; // No relation to foo in foo.cpp
No linkage:
foo.cpp
int foo1() { static int foo = 42; foo++; return foo; }
int foo2() { static int foo = -43; foo++; return foo; }
Surely you will agree that the foo variables in functions foo1 and foo2 have to have storage. This means they probably have to have names because of how assemblers and linkers work. Those names cannot conflict and should not be accessible by any other code. The way the C++ standard encodes this is as "no linkage." There are a few other cases where it is used as well, but for things where it is a little less obvious what the storage is used for. (E.g. for class you can imagine the vtable has storage, but for a typedef it is mostly a matter of language specification minutiae about access scope of the name.)
C++ specifies somewhat of a least common denominator linkage model that can be mapped onto the richer models of actual linkers on actual platforms. In practice this is highly imperfect and lots of real systems end up using attributes, pragmas, or compiler flags to gain greater control of linkage type. In order to do this and still provide a reasonably useful language, one gets into name mangling and other compiler techniques. If C++ were ever to try and provide a greater degree of compiled code interop, such as Java or .NET virtual machines do, it is very likely the language would gain clearer and more elaborate control over linkage.
EDIT: To more clearly answer the question... The standard has to define how this works for both access to identifiers in the source language and linkage of compiled code. The definition must be strong enough so that correctly written code never produces errors for things being undefined or multiply defined. There are certainly better ways to do this than C++ uses, but it is largely an evolved language and the specification is somewhat influenced by the substrate it is compiled onto. In effect the three different types of linkage are:
External linkage: The entire program agrees on this name and it can be accessed anywhere a declaration is visible.
Internal linkage: A single file agrees on this name and it can be accessed in any scope the declaration is visible.
No linkage: The name is for one scope only and can only be accessed within this scope.
In the assembly, these tend to map into a global declaration, a file local declaration, and a file local declaration with a synthesized unique name.
It is also relevant for cases where the same name is declared with different linkage in different parts of the program and in determining what extern int foo refers to from a given place.
External linkage is for when you have files being compiled independently of each other (#included .h, .c, and .cpp files are not compiled independently of each other). An extern variable is special in that it can be used between files being compiled separately.
The standard seems to imply that there is no restriction on the number of definitions of a variable if it is not odr-used (§3.2/3):
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.
It does say that any variable can't be defined multiple times within a translation unit (§3.2/1):
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template.
But I can't find a restriction for non-odr-used variables across the entire program. So why can't I compile something like the following:
// other.cpp
int x;
// main.cpp
int x;
int main() {}
Compiling and linking these files with g++ 4.6.3, I get a linker error for multiple definition of 'x'. To be honest, I expect this, but since x is not odr-used anywhere (as far as I can tell), I can't see how the standard restricts this. Or is it undefined behaviour?
Your program violates the linkage rules. C++11 §3.5[basic.link]/9 states:
Two names that are the same and that are declared in different scopes shall denote the same
variable, function, type, enumerator, template or namespace if
both names have external linkage or else both names have internal linkage and are declared in the same translation unit; and
both names refer to members of the same namespace or to members, not by inheritance, of the same class; and
when both names denote functions, the parameter-type-lists of the functions are identical; and
when both names denote function templates, the signatures are the same.
(I've cited the complete paragraph, for reference. The second two bullets do not apply here.)
In your program, there are two names x, which are the same. They are declared in different scopes (in this case, they are declared in different translation units). Both names have external linkage and both names refer to members of the same namespace (the global namespace).
These two names do not denote the same variable. The declaration int x; defines a variable. Because there are two such definitions in the program, there are two variables in the program. The name "x" in one translation unit denotes one of these variables; the name "x" in the other translation unit denotes the other. Therefore, the program is ill-formed.
You're correct that the standard is at fault in this regard. I have a feeling that this case falls into the gap between 3.2p1 (at most one definition per translation unit, as in your question) and 3.2p6 (which describes how classes, enumerations, inline functions, and various templates can have duplicate definitions across translation units).
For comparison, in C, 6.9p5 requires that (my emphasis):
An external definition is an external declaration that is also a definition of a function
(other than an inline definition) or an object. If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one.
If the standard does not say anything about definitions of unused variables, then you cannot infer that there may be multiple:
Undefined behavior may also be expected when this International
Standard omits the description of any explicit definition of
behavior.
So it may compile and run nicely, or it may stop during translation with an error message, or it may crash at run time, etc.
EDIT: See James McNellis's answer; the standard does indeed have rules about it.
There is no error in compiling that; the error is in linking. By default, global variables and functions are visible to other files (they have external linkage), so when the linker combines your object files it sees two definitions of x and cannot choose between them. If you do not use the x of main.cpp in other.cpp and vice versa, make them static (which means each is visible only in the file that contains it):
// other.cpp
static int x;
// main.cpp
static int x;
Why doesn't the following compile?
...
extern int i;
static int i;
...
but if you reverse the order, it compiles fine.
...
static int i;
extern int i;
...
What is going on here?
This is specifically given as an example in the C++ standard when it's discussing the intricacies of declaring external or internal linkage. It's in section 7.1.1.7, which has this excerpt:
static int b ; // b has internal linkage
extern int b ; // b still has internal linkage
extern int d ; // d has external linkage
static int d ; // error: inconsistent linkage
Section 3.5.6 discusses how extern should behave in this case.
What's happening is this: static int i (in this case) is a definition, where the static indicates that i has internal linkage. When extern occurs after the static the compiler sees that the symbol already exists and accepts that it already has internal linkage and carries on. Which is why your second example compiles.
The extern on the other hand is a declaration, it implicitly states that the symbol has external linkage but doesn't actually create anything. Since there's no i in your first example the compiler registers i as having external linkage but when it gets to your static it finds the incompatible statement that it has internal linkage and gives an error.
In other words it's because declarations are 'softer' than definitions. For example, you could declare the same thing multiple times without error, but you can only define it once.
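A small sketch of both points, with hypothetical names: declarations may repeat, a definition may not, and an extern that follows a static simply inherits the internal linkage.

```cpp
extern int total;      // declaration: external linkage, no storage yet
extern int total;      // repeating a declaration is fine
int total = 5;         // the one definition

static int hidden = 7; // definition with internal linkage
extern int hidden;     // OK: inherits the internal linkage declared above

// Reversing the last two lines (extern first, then static) is the
// inconsistent-linkage error from the standard's example.

int sum()
{
    return total + hidden;
}
```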
Whether this is the same in C, I do not know (but netcoder's answer below informs us that the C standard contains the same requirement).
For C, quoting the standard, in C11 6.2.2: Linkage of identifiers:
3) If the declaration of a file scope identifier for an object or a function contains the storage-class specifier static, the identifier has internal linkage.
4) For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.
(emphasis-mine)
That explains the second example (i will have internal linkage). As for the first one, I'm pretty sure it's undefined behavior:
7) If, within a translation unit, the same identifier appears with both internal and external linkage, the behavior is undefined.
...because extern appears before the identifier is declared with internal linkage, 6.2.2/4 does not apply. As such, i has both internal and external linkage, so it's UB.
If the compiler issues a diagnostic, well, lucky you I guess. It could compile both without errors and still be compliant with the standard.
C++:
7.1.1 Storage class specifiers [dcl.stc]
7) A name declared in a namespace scope without a storage-class-specifier has external linkage unless it has
internal linkage because of a previous declaration and provided it is not declared const. Objects declared
const and not explicitly declared extern have internal linkage.
So, the first one first attempts to give i external linkage, and then internal linkage afterwards.
The second one gives it internal linkage first, and the second line doesn't attempt to give it external linkage because it was previously declared as internal.
8) The linkages implied by successive declarations for a given entity shall agree. That is, within a given scope,
each declaration declaring the same variable name or the same overloading of a function name shall imply
the same linkage. Each function in a given set of overloaded functions can have a different linkage, however.
[ Example:
[...]
static int b; // b has internal linkage
extern int b; // b still has internal linkage
[...]
extern int d; // d has external linkage
static int d; // error: inconsistent linkage
[...]
In Microsoft Visual Studio, both versions compile just fine.
On GNU C++ you get an error.
I'm not sure which compiler is "correct". Either way, having both lines doesn't make much sense.
extern int i means that the integer i is defined in some other module (object file or library). This is a declaration. The compiler will not allocate storage for i in this object file, but it will recognize the variable when you use it somewhere else in the program.
int i tells the compiler to allocate storage for i. This is a definition. If other C++ (or C) files have int i, the linker will complain that int i is defined twice.
static int i is similar to the above, with the extra property that i is local. It cannot be accessed from another module, even if that module declares extern int i. People use the keyword static (in this context) to keep i local.
Hence having i both declared as being defined somewhere else, AND defined as static within the module seems like an error. Visual Studio is silent about it, and g++ is silent only in a specific order, but either way you just shouldn't have both lines in the same source code.
I want to understand the external linkage and internal linkage and their difference.
I also want to know the meaning of
const variables internally link by default unless otherwise declared as extern.
When you write an implementation file (.cpp, .cxx, etc) your compiler generates a translation unit. This is the source file from your implementation plus all the headers you #included in it.
Internal linkage refers to everything only in scope of a translation unit.
External linkage refers to things that exist beyond a particular translation unit. In other words, accessible through the whole program, which is the combination of all translation units (or object files).
As dudewat said external linkage means the symbol (function or global variable) is accessible throughout your program and internal linkage means that it is only accessible in one translation unit.
You can explicitly control the linkage of a symbol by using the extern and static keywords. If the linkage is not specified then the default linkage is extern (external linkage) for non-const symbols and static (internal linkage) for const symbols.
// In namespace scope or global scope.
int i; // extern by default
const int ci = 0; // static by default (a const object must be initialized)
extern const int eci; // explicitly extern
static int si; // explicitly static
// The same goes for functions (but there are no const functions).
int f(); // extern by default
static int sf(); // explicitly static
Note that instead of using static (internal linkage), it is better to use anonymous namespaces, into which you can also put classes. In C++03, names declared there formally have external linkage but are unreachable from other translation units, which makes their linkage effectively static; since C++11, they have internal linkage outright.
namespace {
int i; // unreachable from other translation units (internal linkage since C++11)
class C; // likewise
}
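To tie the defaults together, here is a short sketch; every identifier in it is hypothetical. It shows a const object getting internal linkage by default, the explicit extern override, and an unnamed namespace used for translation-unit-private state.

```cpp
const int buffer_size = 64;          // internal linkage by default in C++
extern const int shared_limit = 128; // external linkage: other translation units
                                     // may declare `extern const int shared_limit;`
constexpr int kDoubled = buffer_size * 2;  // usable in constant expressions

namespace {
    int call_count = 0;              // private to this translation unit
}

int record_call()
{
    return ++call_count + kDoubled;
}
```

Because buffer_size has internal linkage, a header containing that line can be included from several source files without causing a multiple definition error; shared_limit, by contrast, must be defined in exactly one of them.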
A global variable has external linkage by default. Its scope can be extended to files other than containing it by giving a matching extern declaration in the other file.
The scope of a global variable can be restricted to the file containing its declaration by prefixing the declaration with the keyword static. Such variables are said to have internal linkage.
Consider following example:
1.cpp
void f(int i);
extern const int max = 10;
int n = 0;
int main()
{
int a = 0; // initialized so that f(a) does not read an indeterminate value
//...
f(a);
//...
f(a);
//...
}
The declaration of function f gives f external linkage (the default). Its definition must be provided later in this file or in another translation unit (given below).
max is defined as an integer constant. The default linkage for constants is internal. Its linkage is changed to external with the keyword extern. So now max can be accessed in other files.
n is defined as an integer variable. The default linkage for variables defined outside function bodies is external.
2.cpp
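#include <iostream>
// No `using namespace std;` here: with common standard libraries it would make
// the unqualified name max ambiguous with std::max.
extern const int max;
extern int n;
static float z = 0.0;
void f(int i)
{
static int nCall = 0;
int a;
//...
nCall++;
n++;
//...
a = max * z;
//...
std::cout << "f() called " << nCall << " times." << std::endl;
}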
max is declared to have external linkage. A matching definition for max (with external linkage) must appear in some file. (As in 1.cpp)
n is declared to have external linkage.
z is defined as a global variable with internal linkage.
The definition of nCall specifies nCall to be a variable that retains its value across calls to function f(). Unlike local variables with the default auto storage class, nCall will be initialized only once at the first invocation of f(). The storage class specifier static affects the lifetime of the local variable and not its scope.
NB: The keyword static plays a double role. When used in the definitions of global variables, it specifies internal linkage. When used in the definitions of the local variables, it specifies that the lifetime of the variable is going to be the duration of the program instead of being the duration of the function.
In terms of 'C' (Because static keyword has different meaning between 'C' & 'C++')
Let's talk about the different scopes in 'C'
SCOPE: Basically, how far and for how long I can see something.
Local variable : Scope is only inside a function. It resides in the STACK area of RAM.
Which means that every time a function gets called all the variables
that are the part of that function, including function arguments are
freshly created and are destroyed once the control goes out of the
function. (Because the stack is flushed every time function returns)
Static variable: Scope of this is for a file. It is accessible every where in the file
in which it is declared. It resides in the DATA segment of RAM. Since
this can only be accessed inside a file and hence INTERNAL linkage. Any
other files cannot see this variable. In fact STATIC keyword is the
only way in which we can introduce some level of data or function
hiding in 'C'
Global variable: Scope of this is for an entire application. It is accessible form every
where of the application. Global variables also resides in DATA segment
Since it can be accessed every where in the application and hence
EXTERNAL Linkage
By default all functions are global. If you need to
hide some functions in a file from the outside, you can prefix them with the static
keyword. :-)
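The hiding described above can be sketched as follows. The code is written to compile as C++, and the parts that matter (the static variable and the static function) read the same in C. All names here (helper_calls, square, sum_of_squares, helper_call_count, demo_hiding) are hypothetical.

```cpp
#include <cassert>

// Internal linkage: visible only inside this file.
static int helper_calls = 0;

// Internal linkage: a "private" helper that other files cannot call by name.
static int square(int x) {
    ++helper_calls;
    return x * x;
}

// External linkage (the default for functions): callable from any file.
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}

// External accessor, so that callers can observe the hidden counter.
int helper_call_count() {
    return helper_calls;
}

// Hypothetical driver.
void demo_hiding() {
    assert(sum_of_squares(3, 4) == 25);
    assert(helper_call_count() == 2);  // square() ran once per argument
}
```

Another file could declare and call sum_of_squares, but a declaration of square there would fail to link against this definition.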
Before talking about the question, it is better to know precisely what the terms translation unit and program mean, along with some basic concepts of C++ (linkage is actually one of them). You will also have to know what a scope is.
I will emphasize some key points, esp. those missing in previous answers.
Linkage is a property of a name, which is introduced by a declaration. Different names can denote the same entity (typically, an object or a function). So talking about the linkage of an entity is usually nonsense, unless you are sure that the entity will only be referred to by the unique name from some specific declarations (usually one declaration, though).
Note that an object is an entity, but a variable is not. When we talk about the linkage of a variable, what is actually meant is the name of the denoted entity (which is introduced by a specific declaration). The linkage of the name is one of three kinds: no linkage, internal linkage or external linkage.
Different translation units can share the same declaration by header/source file (yes, it is the standard's wording) inclusion. So you may refer to the same name in different translation units. If the declared name has external linkage, the identity of the entity referred to by the name is also shared. If the declared name has internal linkage, the same name in different translation units denotes different entities, but you can refer to the entity from different scopes of the same translation unit. If the name has no linkage, you simply cannot refer to the entity from other scopes.
(Oops... I found what I have typed was somewhat just repeating the standard wording ...)
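A small sketch of the point that a name with linkage can be reached from another scope, while a name with no linkage cannot. It stays within one translation unit because a single snippet cannot show two; the names shared_value, sum_local_and_global and demo_scopes are made up for the example.

```cpp
#include <cassert>

int shared_value = 7;  // namespace scope, external linkage

int sum_local_and_global() {
    int shared_value = 1;       // no linkage: a new, unrelated local object
    int local = shared_value;   // reads the local, 1
    {
        // A block-scope declaration with linkage: it denotes the
        // namespace-scope object again, even though the local hides it.
        extern int shared_value;
        return local + shared_value;
    }
}

// Hypothetical driver.
void demo_scopes() {
    assert(sum_local_and_global() == 8);  // 1 (local) + 7 (namespace scope)
}
```

The local shared_value cannot be named from any other scope, whereas the namespace-scope one can be redeclared and reached wherever a matching declaration appears.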
There are also some other confusing points which are not covered by the language specification.
Visibility (of a name). It is also a property of declared name, but with a meaning different to linkage.
Visibility (of a side effect). This is not related to this topic.
Visibility (of a symbol). This notion can be used by actual implementations. In such implementations, a symbol with a specific visibility in object (binary) code is usually what an entity definition is mapped to when the entity's names have the corresponding linkage in the source (C++) code. However, the mapping is usually not guaranteed to be one-to-one. For example, a symbol in a dynamic library image can be specified, from source code (via extensions, typically __attribute__ or __declspec) or via compiler options, to be shared only within that image; the image is neither the whole program nor the object file translated from a single translation unit, so no standard concept describes it accurately. Since symbol is not a normative term in C++, it is only an implementation detail, even though the related dialect extensions may have been widely adopted.
Accessibility. In C++, this is usually about property of class members or base classes, which is again a different concept unrelated to the topic.
Global. In C++, "global" refers to something in the global namespace or at global namespace scope. The latter is roughly equivalent to file scope in the C language. In both C and C++, linkage has nothing to do with scope, although scope (like linkage) is also closely tied to an identifier (in C) or a name (in C++) introduced by some declaration.
The linkage rule for a namespace-scope const variable is special (and notably different from that of a const object declared at file scope in C, which also has the concept of linkage of identifiers). Since C++ enforces the ODR, there must be no more than one definition of the same variable or function in the whole program, except for inline functions. Without this special rule for const, the simplest declaration of a const variable with an initializer (e.g. = xxx) in a header or source file (often a "header file") included by multiple translation units (or included by one translation unit more than once, though rarely) would violate the ODR, which would make it impossible to use const variables as replacements for some object-like macros.
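As a rough illustration of the const rule, here is the kind of line that typically sits in a header in place of an object-like macro. The names kMaxUsers, user_slots, slot_count and demo_const are hypothetical, and the "header" is only simulated by a comment.

```cpp
#include <cassert>
#include <cstddef>

// Imagine this line lives in a header included by many .cpp files.
// In C++ a namespace-scope const variable has internal linkage, so each
// translation unit gets its own kMaxUsers and the definitions do not clash.
const std::size_t kMaxUsers = 64;  // instead of: #define MAX_USERS 64

// Usable where a constant expression is required, just like the macro was.
static int user_slots[kMaxUsers];

std::size_t slot_count() {
    return sizeof(user_slots) / sizeof(user_slots[0]);
}

// Hypothetical driver.
void demo_const() {
    assert(slot_count() == kMaxUsers);
}
```

In C the same header line would define an object with external linkage in every file that includes it, which is why the C++ rule differs.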
I think Internal and External Linkage in C++ gives a clear and concise explanation:
A translation unit refers to an implementation (.c/.cpp) file and all
header (.h/.hpp) files it includes. If an object or function inside
such a translation unit has internal linkage, then that specific
symbol is only visible to the linker within that translation unit. If
an object or function has external linkage, the linker can also see it
when processing other translation units. The static keyword, when used
in the global namespace, forces a symbol to have internal linkage. The
extern keyword results in a symbol having external linkage.
The compiler defaults the linkage of symbols such that:
Non-const global variables have external linkage by default
Const global variables have internal linkage by default
Functions have external linkage by default
Basically:
A variable with external linkage is visible in all files.
A variable with internal linkage is visible in a single file.
Explanation: const variables have internal linkage by default unless they are declared extern.
By default, a global variable has external linkage,
but a const global variable has internal linkage.
In addition, an extern const global variable has external linkage.
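The default rules listed above, written out as declarations. This is only a sketch: the names g_counter, kLocalLimit, kSharedLimit, step and demo_defaults are invented for the example.

```cpp
#include <cassert>

int g_counter = 0;                   // non-const global: external linkage
const int kLocalLimit = 10;          // const global: internal linkage (C++)
extern const int kSharedLimit = 20;  // extern const global: external linkage

// Functions have external linkage by default.
int step() {
    g_counter += kLocalLimit;
    if (g_counter > kSharedLimit) {
        g_counter = kSharedLimit;    // never exceed the shared limit
    }
    return g_counter;
}

// Hypothetical driver.
void demo_defaults() {
    assert(step() == 10);
    assert(step() == 20);
}
```

Another translation unit could declare `extern int g_counter;` or `extern const int kSharedLimit;` and link against these definitions, but it has no way to name this file's kLocalLimit.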
A pretty good piece of material about linkage in C++:
http://www.goldsborough.me/c/c++/linker/2016/03/30/19-34-25-internal_and_external_linkage_in_c++/
Linkage determines whether identifiers that have identical names refer to the same object, function, or other entity, even if those identifiers appear in different translation units. The linkage of an identifier depends on how it was declared.
There are three types of linkage:
Internal linkage : identifiers can only be seen within a translation unit.
External linkage : identifiers can be seen (and referred to) in other translation units.
No linkage : identifiers can only be seen in the scope in which they are defined.
Linkage does not affect scoping
C++ only : You can also have linkage between C++ and non-C++ code fragments, which is called language linkage.
Source: IBM Program Linkage
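To make the quoted categories concrete, here is a short sketch with one declaration per kind of linkage, plus a function with C language linkage. The names shared_total, private_total, c_add, record and demo_linkage are assumptions for the example.

```cpp
#include <cassert>

int shared_total = 0;          // external linkage
static int private_total = 0;  // internal linkage

// Language linkage: declared so that C code can call it by this name.
extern "C" int c_add(int a, int b) {
    return a + b;
}

int record(int amount) {
    int doubled = 2 * amount;  // no linkage: exists only in this block
    shared_total = c_add(shared_total, amount);
    private_total = c_add(private_total, doubled);
    return shared_total + private_total;
}

// Hypothetical driver.
void demo_linkage() {
    assert(record(1) == 3);  // totals are 1 and 2
    assert(record(2) == 9);  // totals are 3 and 6
}
```

On a typical Unix-like toolchain, running nm on the resulting object file should show c_add without C++ name mangling, though the exact output depends on the platform.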
In C++
Any variable at file scope that is not nested inside a class or function is visible throughout all translation units in a program. This is called external linkage because at link time the name is visible to the linker everywhere, external to that translation unit.
Global variables and ordinary functions have external linkage.
A static object or function name at file scope is local to its translation unit. That is
called internal linkage.
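From the linker's point of view, linkage matters only for elements that have addresses at link/load time; thus, local variables have no linkage, and a class declaration by itself produces nothing to link (although, in the C++ standard's terms, a class name declared at namespace scope does have external linkage).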