Extern variable declaration does not have internal linkage - c++

I'm asking this question as a follow-up to this post. It says that the block-scope extern declaration has external linkage rather than internal linkage, but I'm not sure why:
static int i = 0; // #1
void g() {
extern int i; // #3 external linkage
}
Why doesn't the extern declaration take the linkage of i (internal linkage)? The quote in the post seems to allow that. In the example after the OP's quote it has:
static void f();
void g() {
extern void f(); // internal linkage
// ...
}
and it says that the extern declaration has internal linkage. Why is there a difference when using variables and functions?

Because the variable i has static storage duration. So, in terms of your snippet:
- omitting the static specifier will produce "no linkage";
- applying the static specifier will produce "external linkage" (static storage is quite a special part of the C/C++ runtime infrastructure).
Also, you may find this discussion interesting:
Understanding static storage class in C
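As a hedged illustration of the rule being asked about, the sketch below covers the case where the file-scope static declarations are still visible at the block-scope extern declarations; the variable and the function then both refer to the entities with internal linkage. Note that the standard's full example (reproduced further down this page) has an extra block-scope `int i; // #2` between #1 and #3 that hides #1, and the snippet in the question leaves that line out; under the C++11 wording, that hiding appears to be why #3 ends up with external linkage there. The names `read_i`, `call_f` and `check_block_extern` are invented for this example.

```cpp
#include <cassert>

static int i = 7;                 // #1: internal linkage
static int f() { return 1; }      // internal linkage

// #1 is visible here, so the block-scope declaration redeclares it
// instead of introducing a new entity with external linkage.
int read_i() {
    extern int i;
    return i;
}

// Same situation for the function: the static f is visible.
int call_f() {
    extern int f();
    return f();
}

// Small self-check that can be called from main().
void check_block_extern() {
    assert(read_i() == 7);
    assert(call_f() == 1);
}
```

Because nothing here needs a definition from another translation unit, this compiles and links as a single file.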

Related

C++ namespace function definition only has static keyword

I understand that a static namespace function may be declared static, and the static keyword omitted from the definition of the function. What I have is the opposite, e.g.:
// in foo.h
namespace Foo
{
void bar();
}
// in foo.cpp
static void Foo::bar()
{
}
Does the static keyword have any effect here, or is it simply ignored? It compiles either way, and the functions are available to other translation units.
The given program is ill-formed, as can be seen from [dcl.stc], which states:
The linkages implied by successive declarations for a given entity shall agree. That is, within a given scope, each declaration declaring the same variable name or the same overloading of a function name shall imply the same linkage.
(end quote)
Now, let's apply this to your example.
// in foo.h
namespace Foo
{
void bar(); //#1 external linkage
}
// in foo.cpp
static void Foo::bar() //#2 tries to give Foo::bar internal linkage
{
}
In the above code, the first declaration (#1) of bar gives it external linkage, while the definition in the source file foo.cpp (#2) tries to give it internal linkage. According to the statement quoted above, the given example is therefore not valid.
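For contrast, here is a minimal sketch of the arrangement that is well-formed: `static` appears on the first declaration and is omitted from the definition, so both declarations imply the same linkage. Everything is collapsed into one file for the illustration, and the `int` return type, the value 42 and the `check_bar` helper are placeholders that are not in the original code.

```cpp
#include <cassert>

namespace Foo {
static int bar();   // first declaration: internal linkage
}

// No 'static' here: the redeclaration keeps the internal linkage
// given by the first declaration, so the linkages agree.
int Foo::bar() {
    return 42;
}

// Small self-check that can be called from main().
void check_bar() {
    assert(Foo::bar() == 42);
}
```

Keep in mind that a function with internal linkage can only be called from the translation unit that defines it, so this arrangement would not make Foo::bar available to other files.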

Does declaring functions in an anonymous namespace with "static" reduce linking time and memory by not polluting the symbol table

I am in a discussion from one of our code review sessions.
The debate is about whether functions in an anonymous namespace should be declared static or not. As an example, I am being told that the square3 function, which is declared with the static keyword, has advantages.
namespace {
int square2(int num) {
return num * num;
}
static int square3(int num) {
return num * num;
}
}
I have been told by my colleagues that static is beneficial because:
A static function does not require an entry in the function table and
thus does not require time or memory when linking.
static means it is not exposed when linking. It improves linking time
and reduces memory usage when linking. Anonymous namespace does not
provide this.
I think by "function table" my colleagues mean "symbol table". I couldn't find any documentation about that except the least upvoted answer of another S.O question which says
It is usually better to prefer static linkage, as that doesn't
pollute the symbol table
I am trying to check whether that is true using Compiler Explorer, but I couldn't find a way to get to the symbol table. All I see is that the function names are mangled in the same way.
So I have two questions:
1 - Does using the static keyword on a function in an anonymous namespace help with memory/linking time?
2 - Is there a way to check the symbol table in compiler explorer?
It improves linking time and reduces memory usage when linking.
Even if this were true for a given implementation, from a pure language perspective a static function in an unnamed namespace is redundant and arguably an anti-pattern that is likely to confuse other developers, who may already struggle with the way C++ has overloaded what static means in terms of linkage and the entirely orthogonal storage duration.
// .cpp
namespace top {
int a{}; // external linkage, static storage duration
const int b{}; // internal linkage, static storage duration
static int c{}; // internal linkage, static storage duration
namespace {
int d{}; // internal linkage, static storage duration
const int e{}; // internal linkage, static storage duration
static int f{}; // internal linkage, static storage duration
// ^^^ static is redundant (-> and may confuse devs)
int g() { return 0; } // internal linkage
static int h() { return 0; } // internal linkage
// ^^^ static is redundant (-> and may confuse devs)
} // namespace
int k() { return 0; } // external linkage
static int l() { return 0; } // internal linkage
} // namespace top
[...] except the least upvoted answer of another S.O question which says
It is usually better to prefer static linkage, as that doesn't pollute the symbol table
The answer you refer to was based on C++03, where entities in an unnamed namespace did not have internal linkage by default. This means that in C++03, even if an entity in an unnamed namespace was not accessible to other translation units (due to the unique naming of the unnamed namespace), it could still have external linkage and thus end up in the symbol table.
From C++11 onwards, entities in an unnamed namespace have internal linkage. From the C++ language perspective, the examples above with internal linkage are therefore equivalent with respect to linkage. It could arguably even be the case that your colleagues are mixing up these concepts (which is common, given the overloaded meaning of static), or that their argument is based on the state of affairs prior to C++11.
1 - Does using the static keyword on a function in an anonymous namespace help with memory/linking time?
As far as the language is concerned, static does not affect functions in an anonymous namespace, because being in an anonymous namespace already achieves what declaring the function static would achieve. (Note that declaring a member function static has an entirely different meaning.)
2 - Is there a way to check the symbol table in compiler explorer?
I don't know about Compiler Explorer, but you can use, for example, the nm program to list symbols: http://coliru.stacked-crooked.com/a/0281bc487044ec02
namespace {
void foo1(){} // internal linkage
static void foo2(){} // internal linkage
}
void foo3(){} // external linkage
static void foo4(){} // internal linkage
command:
nm main.o | c++filt
output:
0000000000000000 T foo3()

Why did they insist in using the `extern` specifier in the examples below?

[basic.link]/6 (my emphasis):
The name of a function declared in block scope and the name of a variable declared by a block scope extern declaration have linkage.
...
static void f();
static int i = 0;
void g() {
extern void f(); // internal linkage
int i; // #2 i has no linkage
{
extern void f(); // internal linkage <--
extern int i; // #3 external linkage
}
}
[basic.link]/7:
...
namespace X {
void p() {
q(); // error: q not yet declared
extern void q(); // q is a member of namespace X <--
}
void middle() {
q(); // error: q not yet declared
}
void q() { /* ... */ } // definition of X::q
}
void q() { /* ... */ } // some other, unrelated q
The extern specifiers pointed to by the arrows are not necessary, given the very first sentence of paragraph [basic.link]/6 highlighted above. Or am I missing something?
The externs are there to emphasize the respective comments, pointing out that the extern has no effect in certain circumstances (due to the rules outlined in that paragraph).
In the first example, f has internal linkage despite being declared extern, because it was first declared static at namespace-scope.
In the second example, extern has no effect on the declaration because q is also declared at namespace scope without it (and X::q takes precedence over ::q).
I think the examples stave off some plausible but wrong ideas one might otherwise come up with.
In paragraph 6 one might expect f() to have external linkage since that is what extern "normally" (i.e. at file scope) means, but it's actually internal linkage because of the static declaration further up.
In paragraph 7 someone might expect extern void q(); to make q() available outside of (or in loose speech external to) p(), so it could be called in middle(), but that doesn't happen either.
Both would still be true without the extern keyword, but then it wouldn't be surprising to people expecting extern to mean something different.
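A small sketch of the point that the block-scope declaration behaves the same without extern: the local declaration of q below still names the member of the enclosing namespace X, not the global q. The `int` return types, the values 1 and 2, and the `check_q` helper are arbitrary additions made so that the result can be observed.

```cpp
#include <cassert>

namespace X {
int p() {
    int q();        // block-scope function declaration, no 'extern'
    return q();     // calls X::q, which is defined below
}
int q() { return 1; }   // definition of X::q
}

int q() { return 2; }   // some other, unrelated ::q

// Small self-check that can be called from main().
void check_q() {
    assert(X::p() == 1);
    assert(::q() == 2);
}
```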

Difference between internal and no linkage

Please refer to the following code that is in the same translation unit:
static int global_var; // file scope in C and global namespace scope in C++
// internal linkage
void f(void)
{
static int local_var; // block scope in C and local scope in C++
// no linkage
}
My understanding is this:
I can refer to global_var from anywhere in the translation unit because it has global scope.
I can refer to local_var only inside function f because it has local scope.
My questions:
What is the difference between the two variables, in relation to linkage?
Can you provide one example where internal and no linkage makes a difference, and the difference is derived not only from scope?
EDIT
After the answer and comments of James Kanze, I am now able to construct an example that shows the difference between the internal and no linkage attributes:
static int i; // definition
// static storage
// internal linkage
void f(void)
{
extern int i; // declaration
// refers to the static i at file scope
// note that even though the specifier is extern
// its linkage is internal (this is legal in both C and C++)
{
int i; // definition
// automatic storage
// no linkage
}
}
Some articles that do a good job at explaining the concepts involved:
- Scope regions in C and C++
- Storage class specifiers and storage duration
- Linkage in C and C++
First: in addition to type, variables have three other
characteristics: linkage, scope and lifetime. All four
attributes are sort of orthogonal, but linked in the way they
are expressed in the language, and do interact in some ways.
With regards to linkage: linkage really affects the symbol which
is being declared, and not the object itself. If there is no
linkage, all declarations of the symbol bind to different
objects, e.g.:
int
func()
{
int i;
{
int i;
}
}
The symbol i has no linkage, and the two symbols i are bound
to two different entities. Generally speaking, local variables
(variables declared at block scope) and function arguments have
no linkage, regardless of type and lifetime.
Internal and external linkage are similar, in that repeated
declarations of the symbol bind to the same entity: internal
linkage binds only within the translation unit, external across
the entire program. So given:
static int i; // internal linkage...
in several translation units, the i binds to a separate entity
in each translation unit. Without the static, you have external
linkage, and all of the i bind to the same entity.
Note that this only holds at namespace scope; all entities
which are members of a non-local class have external linkage.
And that type has an impact: variables which are const
implicitly have internal linkage:
int const i = 42; // same as static int const i...
extern int const j = 42; // external linkage.
Finally, all declarations which bind to the same entity must
declare it to have the same type. If you violate this rule in
a single translation unit (e.g.:
extern int i;
// ...
double i;
in the same namespace scope), then the compiler should complain.
If the two declarations are in different translation units,
however, it is undefined behavior, and who knows what will
happen. (In theory, the linker could complain, but most don't.)
EDIT:
One additional point: linkage is determined by the first
declaration which can refer to the entity. So if I write:
static int i;
void
func()
{
extern int i;
}
Both i refer to the same entity, which has internal linkage.
(Why one would ever write the second declaration is beyond me,
but it is legal.)
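One way to observe the difference described above is to compare addresses: declarations of a name with internal linkage bind to a single entity, while two block-scope declarations with no linkage are separate objects. This is only a sketch, and the helper names are invented for the example.

```cpp
#include <cassert>

static int i = 1;                     // internal linkage

const int* file_scope_i() { return &i; }

const int* block_extern_i() {
    extern int i;                     // binds to the same entity as above
    return &i;
}

bool locals_are_distinct() {
    int i = 0;                        // no linkage
    const int* outer = &i;
    bool distinct = false;
    {
        int i = 0;                    // no linkage: a different object
        distinct = (outer != &i);
    }
    return distinct;
}

// Small self-check that can be called from main().
void check_linkage() {
    assert(file_scope_i() == block_extern_i());
    assert(locals_are_distinct());
}
```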
Generally, static variables have internal linkage. You can't access a static variable or function from another file (when compiling multiple files), because its visibility is limited to that file (internal linkage).
Normally, auto and register variables have no linkage.
As I said above, auto and register variables have no linkage, and you can't declare them at global scope. static variables have internal linkage; their scope depends on where they are declared, but they cannot be accessed from another file. extern variables have external linkage, so they can be accessed from another file.
For more, see: Storage classes
global_var can be accessed from another function, say void g(), in the same compilation unit; that is the difference.

block scope extern declaration

The C++11 standard gives the code snippet below (I deleted unrelated code) and says that the name i has external linkage (clause 3.5, paragraph 6).
static int i = 0; // #1
void g() {
extern int i; // #3 external linkage
}
Why do they do this? Did I misunderstand something? The two i refer to the same object in VS2012, and when I use i somewhere else, I get an unresolved external error. I have no idea whether VS2012 supports this feature or not.
Edit:
I think VS2012 is doing the right thing. The i in #3 only needs to refer to an i that has linkage. If the compiler can't find one, then the i should be defined in another translation unit. So the two i should refer to the same object in the code snippet above.
The quote from the standard:
If there is a visible declaration of an entity with linkage having the
same name and type, ignoring entities declared outside the innermost
enclosing namespace scope, the block scope declaration declares that
same entity and receives the linkage of the previous declaration. if
no matching entity is found, the block scope entity receives external
linkage.
But why do people need this feature?
#3 is only a declaration; it states that a variable called i exists somewhere in the program, with external linkage, but does not define that variable. The declaration allows you to use that, rather than the static variable from #1, within the scope of g.
You will also need to define it, in the namespace that contains g. In this case, it will have to be in a different translation unit, so that it doesn't conflict with the static variable with the same name.
To be clear, there are two different variables called i here, as explained in the paragraph following the example. #1 is defined here; #3 is only declared, and needs a separate definition.
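To illustrate "only declared, and needs a separate definition", here is a single-file sketch with the static variable left out, so that the definition can live in the same file. In the situation described above, the definition `int i = 5;` would instead sit in another translation unit. The `int` return type of g, the value 5 and the `check_g` helper are assumptions made for the example.

```cpp
#include <cassert>

int g() {
    extern int i;   // declaration only: says that ::i exists somewhere
    return i;
}

int i = 5;          // the definition that satisfies the declaration

// Small self-check that can be called from main().
void check_g() {
    assert(g() == 5);
}
```

If the definition is removed and i is not defined anywhere else in the program, the linker reports an unresolved symbol, which matches the error described in the question.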
static int i = 0; // #1
void g() {
extern int i; // #3 external linkage
}
The first i, declared static, is defined here and is visible only in the current source file.
extern int i;
tells the compiler: I don't mean this static i, but another i defined somewhere else. If you haven't defined it somewhere else (in another translation unit), you will get an undefined reference.
And this doesn't break the ODR, because the static i is visible only in this unit.
extern int i;
promises the compiler that an int i will be provided.
The static int i=0; is not that promised variable; you have to define an int i somewhere else, where that extern declaration can refer to it.
In other words, extern int i; and static int i=0; declare two unrelated variables.