Please refer to the following code that is in the same translation unit:
static int global_var; // file scope in C and global namespace scope in C++
// internal linkage
void f(void)
{
static int local_var; // block scope in C and local scope in C++
// no linkage
}
My understanding is this:
I can refer to global_var from anywhere in the translation unit because it has global scope.
I can refer to local_var only inside function f because it has local scope.
My questions:
What is the difference between the two variables, in relation to linkage?
Can you provide one example where internal and no linkage makes a difference, and the difference is derived not only from scope?
EDIT
After the answer and comments of James Kanze, I am now able to construct an example that shows the difference between the internal and no linkage attributes:
static int i; // definition
// static storage
// internal linkage
void f(void)
{
extern int i; // declaration
// refers to the static i at file scope
// note that even though the specifier is extern
// its linkage is internal (this is legal in both C and C++)
{
int i; // definition
// automatic storage
// no linkage
}
}
Some articles that do a good job at explaining the concepts involved:
- Scope regions in C and C++
- Storage class specifiers and storage duration
- Linkage in C and C++
First: in addition to type, variables have three other
characteristics: linkage, scope and lifetime. All four
attributes are sort of orthogonal, but linked in the way they
are expressed in the language, and do interact in some ways.
With regard to linkage: linkage really affects the symbol which
is being declared, and not the object itself. If there is no
linkage, all declarations of the symbol bind to different
objects, e.g.:
void
func()
{
int i;
{
int i;
}
}
The symbol i has no linkage, and the two symbols i are bound
to two different entities. Generally speaking, local variables
(variables declared at block scope) and function arguments have
no linkage, regardless of type and lifetime.
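A minimal sketch of the point above, in C++: because neither declaration of i has linkage, the outer and inner i are separate objects with different addresses. The function name distinct_objects is hypothetical and exists only for this illustration.

```cpp
// Returns true if the outer and inner `i` are different objects.
bool distinct_objects()
{
    int i = 1;                  // outer i: no linkage
    int* outer = &i;
    bool different = false;
    {
        int i = 2;              // inner i: no linkage, hides the outer i
        different = (outer != &i);
    }
    return different && i == 1; // the outer i was never modified
}
```

Both objects are alive at the same time inside the inner block, so their addresses are guaranteed to differ.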
Internal and external linkage are similar, in that repeated
declarations of the symbol bind to the same entity: internal
linkage binds only within the translation unit, external across
the entire program. So given:
static int i; // internal linkage...
in several translation units, the i binds to a separate entity
in each translation unit. Without the static, you have external
linkage, and all of the i bind to the same entity.
Note that this only holds at namespace scope; all entities
which are members of a non-local class have external linkage.
And that type has an impact: variables which are const
implicitly have internal linkage:
int const i = 42; // same as static int const i...
extern int const j = 42; // external linkage.
Finally, all declarations which bind to the same entity must
declare it to have the same type. If you violate this rule in
a single translation unit (e.g.:
extern int i;
// ...
double i;
in the same namespace scope), then the compiler should complain.
If the two declarations are in different translation units,
however, it is undefined behavior, and who knows what will
happen. (In theory, the linker could complain, but most don't.)
EDIT:
One additional point: linkage is determined by the first
declaration which can refer to the entity. So if I write:
static int i;
void
func()
{
extern int i;
}
Both i refer to the same entity, which has internal linkage.
(Why one would ever write the second declaration is beyond me,
but it is legal.)
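As a rough sketch of this rule in C++: a write made through the block-scope extern declaration is observable through the file-scope name, because both names denote the same object. The helper names set_through_extern and read_file_scope_i are made up for this example.

```cpp
static int i = 0;            // internal linkage, defined at namespace scope

void set_through_extern(int value)
{
    extern int i;            // redeclares the static i above; linkage stays internal
    i = value;
}

int read_file_scope_i()
{
    return i;                // names the same object as the extern declaration
}
```

Calling set_through_extern(42) and then read_file_scope_i() yields 42, and no definition of i in another translation unit is needed.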
Generally, static variables declared at file scope have internal linkage. You can't access such a static variable or function from another file (when compiling multiple files), because its name is visible only within this file (internal linkage).
Normally, auto and register variables have no linkage.
As I said above, auto and register variables have no linkage, and you can't declare them at global scope. static variables at file scope have internal linkage; their scope depends on where they are declared, but they can't be accessed from another file. extern variables have external linkage, so they can be accessed from another file.
For more details, see Storage classes
global_var could be accessed from void g() in the same compilation unit, that is the difference.
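A short sketch of that difference, assuming C++ and reusing the names from the question. Here f and g return int only so the effect can be observed, whereas the question declares f as void.

```cpp
static int global_var = 0;    // internal linkage: nameable anywhere below in this file

int f()
{
    static int local_var = 0; // no linkage: nameable only inside f
    ++global_var;
    return ++local_var;       // value persists across calls (static storage duration)
}

int g()
{
    ++global_var;             // fine: g is in the same translation unit
    // ++local_var;           // error if uncommented: local_var is not in scope here
    return global_var;
}
```

Both variables have static storage duration; only the set of places where the name can be used differs.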
Related
I am in a discussion in one of the code review sessions.
The debate is about whether functions in an anonymous namespace should be declared static. As an example, I am being told that the square3 function, which is declared with the "static" keyword, has advantages.
namespace {
int square2(int num) {
return num * num;
}
static int square3(int num) {
return num * num;
}
}
I have been told by my colleagues that static is beneficial because:
A static function does not require an entry in the function table and
thus does not require time or memory when linking.
static means it is not exposed when linking. It improves linking time
and reduces memory usage when linking. Anonymous namespace does not
provide this.
I think by "function table" my colleagues mean "symbol table". I couldn't find any documentation about that except the least upvoted answer of another S.O question which says
It is usually better to prefer static linkage, as that doesn't
pollute the symbol table
I am trying to check whether that is true by using Compiler Explorer, but I couldn't find a way to get to the symbol table. All I see are the function names, and everything is mangled in the same way.
So I have two questions:
1 - Does using a function with the static keyword in an anonymous namespace help with memory/linking time?
2 - Is there a way to check the symbol table in compiler explorer?
It improves linking time and reduces memory usage when linking.
Even if this were true for a given implementation, from a pure language perspective a static function in an unnamed namespace is redundant and arguably an anti-pattern that is likely to confuse other developers, who may already struggle with the way C++ has overloaded what static means in terms of linkage and the entirely orthogonal storage duration.
// .cpp
namespace top {
int a{}; // external linkage, static storage duration
const int b{}; // internal linkage, static storage duration
static int c{}; // internal linkage, static storage duration
namespace {
int d{}; // internal linkage, static storage duration
const int e{}; // internal linkage, static storage duration
static int f{}; // internal linkage, static storage duration
// ^^^ static is redundant (-> and may confuse devs)
int g() { return 0; } // internal linkage
static int h() { return 0; } // internal linkage
// ^^^ static is redundant (-> and may confuse devs)
} // namespace
int k() { return 0; } // external linkage
static int l() { return 0; } // internal linkage
} // namespace top
[...] except the least upvoted answer of another S.O question which says
It is usually better to prefer static linkage, as that doesn't pollute the symbol table
The answer you refer to was based on C++03, where entities in an unnamed namespace did not have internal linkage by default. This means that in C++03, even if an entity in an unnamed namespace was not accessible to other translation units (due to the unique naming nature of the unnamed namespace), it could still have external linkage and thus end up in the symbol table.
From C++11 onwards, entities in an unnamed namespace have internal linkage. This means that, from the C++ language perspective w.r.t. linkage, the examples above (with internal linkage) are equivalent. It could arguably even be the case that your colleagues mix up these concepts (as is common due to their overloaded meaning), or that your colleagues' argument is based on the state of affairs prior to C++11.
1 - Does using a function with the static keyword in an anonymous namespace help with memory/linking time?
As far as the language is concerned, static has no effect on functions in an anonymous namespace. This is because being in an anonymous namespace already achieves the same thing as declaring the function static. (Note that declaring a member function static has an entirely different meaning.)
2 - Is there a way to check the symbol table in compiler explorer?
I don't know about Compiler Explorer, but you can use, for example, the nm program to list symbols: http://coliru.stacked-crooked.com/a/0281bc487044ec02
namespace {
void foo1(){} // internal linkage
static void foo2(){} // internal linkage
}
void foo3(){} // external linkage
static void foo4(){} // internal linkage
command:
nm main.o | c++filt
output:
0000000000000000 T foo3()
I have read this and it says that
names of classes, their member functions, static data members (const
or not), nested classes and enumerations, and functions first
introduced with friend declarations inside class bodies
have external linkage by default. But what about the variables declared inside the class body that are not specified static? Also, it starts out with
Any of the following names declared at namespace scope have external
linkage
, but is class scope considered a namespace scope? Class scope and namespace scope are different, so why do they start out by saying that the list applies to names declared at namespace scope? For example, member functions are declared at class scope, yet they are mentioned as if they were declared at namespace scope.
Following example:
class C
{
public:
int n;
};
C e;
namespace { C i; }
e has external linkage, i internal. How much sense would it make to speak of linkage of n now? If at all, you could consider n inheriting the linkage of the containing object, thus e.n would have external, i.n internal linkage – for better understanding only, I do not consider this as correct wording...
Quote from the standard:
A name is said to have linkage when it might denote the same object, reference, function, type, template, namespace or value as a name introduced by a declaration in another scope
Translation to plain English:
If you can redeclare it in another scope, it has linkage. Otherwise, nope.
You cannot redeclare a non-static class data member in another scope, so it has no linkage.
I'm asking this question as a follow up from this post. They say that the extern block declaration has external linkage and not internal linkage, but I'm not sure why:
static int i = 0; // #1
void g() {
extern int i; // #3 external linkage
}
Why doesn't the extern declaration take the linkage of i (internal linkage)? The quote in the post seems to allow that. In the example after the OP's quote it has:
static void f();
void g() {
extern void f(); // internal linkage
// ...
}
and it says that the extern declaration has internal linkage. Why is there a difference when using variables and functions?
Because variable "i" has static storage. So, in terms of your snippet,
omitting the "static" statement will produce "no linkage"
applying "static" statement will produce "external linkage" (as static storage is quite special part of the C/C++ runtime infrastructure).
Also, you may find this discussion interesting:
Understanding static storage class in C
What is the C++ equivalent of a translation-unit-local static function in C?
For example having the following in bar.c:
static void bar() {
// ...
}
In C++, would this be written as a private member function like
class foo {
void bar();
};
void foo::bar() {
// ...
}
A private member function implicitly introduces the this pointer as a parameter, so it's not really comparable to the C-style static function. But even a private static member function bar() would be seen in the public interface (and stay accessible to the linker), so it isn't comparable either.
While accessible scope of those functions seems to be similar, these options don't look like good replacements for the mentioned C style static function syntax.
Is the equivalent a function in an unnamed namespace, that's visible to the current translation unit only?
namespace {
void bar() {
// ...
}
}
[C] Static function with file scope.
static void bar() { ... }
This will create a function named bar that has internal linkage.
[C++] Static function with file scope
static void bar() { ... }
This will create a function named bar that has internal linkage.
[C++] Unnamed namespace
namespace {
void bar() { ... }
}
This will create a function named bar that has internal linkage.
Conclusions
They are all identical. I'd probably recommend using the unnamed namespace in C++, because it gets rid of some of the overloading of the static keyword. But from the perspective of what your code does, it doesn't matter.
Sidebar: What does internal linkage mean?
In C and C++, we have three kinds of linkage: External, Internal and No linkage. To define these, I'm going to quote from C++ 2011 Section 3.5 Paragraph 2:
A name is said to have linkage when it might denote the same object, reference, function, type, template, namespace or value as a name introduced by a declaration in another scope:
When a name has external linkage , the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit.
When a name has internal linkage , the entity it denotes can be referred to by names from other scopes in the same translation unit.
When a name has no linkage , the entity it denotes cannot be referred to by names from other scopes.
C 2011 has similar language at Section 6.2.2 Paragraph 2:
In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function. Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function. Each declaration of an identifier with no linkage denotes a unique entity.
So names that have internal linkage are only visible in the translation unit that they were found in.
Sidebar: Let's include an example of how internal linkage works in practice:
Let's create 2 c++ files. bar.cc will contain just a function with internal linkage:
static void bar() {}
We'll also create main.cc, which will try to use that bar().
extern void bar();
int main() {
bar();
}
If we compile this, our linker will complain: there is no function named bar that can be found from the main.cc translation unit. This is the expected behavior of internal linkage.
Undefined symbols for architecture x86_64:
"bar()", referenced from:
_main in main-c16bef.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The C++11 standard gives the code snippet below (I deleted unrelated code) and says that the name i has external linkage (clause 3.5.6).
static int i = 0; // #1
void g() {
extern int i; // #3 external linkage
}
Why do they do this? Did I misunderstand something? The two i refer to the same object in VS2012. And when I use i somewhere else, I get an unresolved external error. I have no idea whether VS2012 supports this feature or not.
Edit:
I think VS2012 is doing the right thing. The i in #3 only needs to refer to an i that has linkage. If the compiler can't find one, then the i should be defined in another translation unit. So the two i should refer to the same object in the code snippet above.
The quote from the standard:
If there is a visible declaration of an entity with linkage having the
same name and type, ignoring entities declared outside the innermost
enclosing namespace scope, the block scope declaration declares that
same entity and receives the linkage of the previous declaration. if
no matching entity is found, the block scope entity receives external
linkage.
But why do people need this feature?
#3 is only a declaration; it states that a variable called i exists somewhere in the program, with external linkage, but does not define that variable. The declaration allows you to use that, rather than the static variable from #1, within the scope of g.
You will also need to define it, in the namespace that contains g. In this case, it will have to be in a different translation unit, so that it doesn't conflict with the static variable with the same name.
To be clear, there are two different variables called i here, as explained in the paragraph following the example. #1 is defined here; #3 is only declared, and needs a separate definition.
static int i = 0; // #1
void g() {
extern int i; // #3 external linkage
}
The first static i is a declaration and is visible only in the current source file.
extern int i;
tells the compiler: I don't mean this static i, but another i defined somewhere else. If you haven't defined it somewhere else (in another translation unit), you will get an undefined reference.
And this doesn't break the ODR because this (the static) i is static (visible only in this unit).
extern int i;
promises the compiler that an int i will be provided.
The static int i=0; is not that promised variable, and you have to define an int i somewhere else that is visible to that extern variable declaration.
In other words, extern int i; and static int i=0; are two unrelated variables.