Local static variable linkage in a template class static member function - c++

I have a question similar to Local static/thread_local variables of inline functions?
Does the standard guarantee that return value is always 1, meaning that static int x is the same across translation units?
// TU1
template <int X>
struct C {
static int* f() { static int x = X; return &x; }
};
extern int* a1;
extern int* a2;
void sa() { a1 = C<1>::f(); a2 = C<2>::f(); }
// TU2
template <int X>
struct C {
static int* f() { static int x = X; return &x; }
};
extern int* b1;
extern int* b2;
void sb() { b1 = C<1>::f(); b2 = C<2>::f(); }
// TU3
int *a1, *a2, *b1, *b2;
void sa();
void sb();
int main() { sa(); sb(); return a1 == b1 && a2 == b2; }

Does the standard guarantee that return value is always 1, meaning that static int x is the same across translation units?
The standard requires that.
Static local variables:
Function-local static objects in all definitions of the same inline function (which may be implicitly inline) all refer to the same object defined in one translation unit.
The compiler generates a copy of that function local static in each translation unit and then the linker picks one and discards the duplicates. When shared libraries are involved they may have their own copies of the object but the runtime linker (ld.so) resolves all references to the one that was discovered first. This method is known as vague linkage:
Local static variables and string constants used in an inline function are also considered to have vague linkage, since they must be shared between all inlined and out-of-line instances of the function.

In your example, it does. We need to examine [basic.def.odr] ¶6:
There can be more than one definition of a [..] class template,
non-static function template, static data member of a class template,
member function of a class template, or template specialization for
which some template parameters are not specified ([temp.spec],
[temp.class.spec]) in a program provided that each definition appears
in a different translation unit, and provided the definitions satisfy
the following requirements. Given such an entity named D defined in
more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and [...]
If D is a template and is defined in more than one translation unit,
then the preceding requirements shall apply both to names from the
template's enclosing scope used in the template definition
([temp.nondep]), and also to dependent names at the point of
instantiation ([temp.dep]). If the definitions of D satisfy all these
requirements, then the behavior is as if there were a single
definition of D. If the definitions of D do not satisfy these
requirements, then the behavior is undefined.
Your templates are okay in that regard, it's as if they were included from the same header. There are other bullets in that clause that also need to hold, but they are not relevant to your example.
Now, since it is as if there were only one definition of the template and its members in your program, that static variable is the same variable in both TUs.
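A minimal single-file sketch of what the guarantee looks like from the caller's side. A single file cannot demonstrate the cross-TU behavior itself, so the helpers from_sa, from_sb and other_specialization are hypothetical stand-ins for code that would normally live in separate translation units and include the template from a shared header.

```cpp
// Stand-in for a shared header: the template from the question.
template <int X>
struct C {
    // One static local per specialization C<X>, not one per caller.
    static int* f() { static int x = X; return &x; }
};

// Hypothetical helpers playing the roles of sa() and sb() from the question.
int* from_sa() { return C<1>::f(); }
int* from_sb() { return C<1>::f(); }

// A different specialization owns a different static local.
int* other_specialization() { return C<2>::f(); }
```

Both helpers that use C<1> see the same object, while C<2> has its own.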

Related

Why we need a separate definition for a static const data member?

Hello if I have a static const data member then I can provide an in-class initializer for it and I don't need to define it again outside of the class body.
But that is true only if that constant is used within the class scope and if used outside, a separate definition outside must be supplied otherwise any reference to it causes a link-time error: "undefined reference to static object: x".
#include <array>
struct Foo{
static int const sz_ = 100;
std::array<int, sz_> ai_100{};
};
//int const Foo::sz_;
int main(){
float pts[Foo::sz_]{}; // ok
void bar(int const&); // defined later on
bar(Foo::sz_); // undefined reference to Foo::sz_
}
void bar(int const&){
//do_something
}
Why is it OK to use the static const data member sz_ outside the class scope as an array size?
Why does the linker fail and complain about a missing definition of Foo::sz_ when I pass it to the function bar, which takes an lvalue reference to const?
The function bar takes a const int&, which can be initialized from an rvalue, so why does the definition of the initializer Foo::sz_ matter?
Why we need a separate definition for a static const data member?
First, we must remember the One Definition Rule. It's actually a set of rules, but a simplified version of the rule that we are interested in this context is: There must be exactly one definition of every variable.
Secondly, we should consider that classes are often used in more than one translation unit.
If the declaration of the static member variable were a definition, then every translation unit that contains the class definition would also contain that variable definition. That would be contrary to the ODR, and the linker wouldn't know what to do.
As such, there must be a separate definition of the variable, which allows the class definition to be included in multiple translation units while the variable definition stays in exactly one translation unit.
While language implementations of the past may not have been able to deal with multiple definitions (back before templates were in C++), these days they can, and the language was extended in C++17 to allow inline definitions of variables. The inline keyword has the same meaning for variables as it has for functions: it relaxes the One Definition Rule, allowing (and also requiring) the variable to be defined in every TU where it is odr-used.
Mostly for historical reasons. Starting with C++17, you can have inline data members, and constexpr static data members are implicitly inline.
Your non-linking example can be simplified to:
struct Foo{
static int const sz_ = 100;
};
int main(){
int const *p = &Foo::sz_; // undefined reference to Foo::sz_
}
In other words, if we take the address of Foo::sz_, or bind a reference to it (which is a very similar operation under the hood), then we need to define int const Foo::sz_; somewhere.
Why is this? Well, if we just use the value of Foo::sz_, then the compiler doesn't need a variable. It can just use 100 as, effectively, a literal. But if we want to take the address of it then we need an actual variable to refer to.
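As a sketch of the fix, here is the simplified example with the out-of-class definition supplied (the line that was commented out in the question). The helper names read_by_reference and address_of_size are hypothetical and exist only to force odr-use of the member.

```cpp
struct Foo {
    static int const sz_ = 100;  // declaration with an in-class initializer
};

// The single out-of-class definition. It repeats no initializer and
// belongs in exactly one translation unit.
int const Foo::sz_;

// Binding a reference odr-uses the member, so a definition must exist.
int read_by_reference(int const& value) { return value; }

// Taking the address odr-uses the member as well.
int const* address_of_size() { return &Foo::sz_; }
```

With the definition present, both the reference binding and the address-of link without an undefined reference.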

Should `const` and `constexpr` variables in headers be `inline` to prevent ODR violations?

Consider the following header and assume it is used in several TUs:
static int x = 0;
struct A {
A() {
++x;
printf("%d\n", x);
}
};
As this question explains, this is an ODR violation and, therefore, UB.
Now, there is no ODR violation if our inline function refers to a non-volatile const object and we do not odr-use it within that function (plus the other provisions), so this still works fine in a header:
constexpr int x = 1;
struct A {
A() {
printf("%d\n", x);
}
};
But if we do happen to odr-use it, we are back at square one with UB:
constexpr int x = 1;
struct A {
A() {
printf("%p\n", &x);
}
};
Thus, given we have now inline variables, should not the guideline be to mark all namespace-scoped variables as inline in headers to avoid all problems?
constexpr inline int x = 1;
struct A {
A() {
printf("%p\n", &x);
}
};
This also seems easier to teach, because we can simply say "inline-everything in headers" (i.e. both function and variable definitions), as well as "never static in headers".
Is this reasoning correct? If yes, are there any disadvantages whatsoever of always marking const and constexpr variables in headers as inline?
As you have pointed out, the first and third examples do indeed violate the ODR, as per [basic.def.odr]/12.2.1:
[..] in each definition of D, corresponding names, looked up according to [basic.lookup], shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution and after matching of partial template specialization, except that a name can refer to
a non-volatile const object with internal or no linkage if the object
is not odr-used in any definition of D, [..]
Is this reasoning correct?
Yes, inline variables with external linkage are guaranteed to refer to the same entity even when they are odr-used, as long as all the definitions are the same:
[dcl.inline]/6
An inline function or variable shall be defined in every translation unit in which it is odr-used and shall have exactly the same definition in every case ([basic.def.odr]). [..] An inline function or variable with external linkage shall have the same address in all translation units.
The last example is OK because it satisfies the requirements quoted above.
are there any disadvantages whatsoever of always marking const and constexpr variables in headers as inline?
I can't think of any. If we keep the promise of having the exact same definition of an inline variable with external linkage across TUs, the compiler is free to pick any of them to refer to the variable. Technically, this is the same as having just one TU with a global variable declared in a header with appropriate header guards.
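A small sketch of the header pattern discussed above, assuming C++17 or later. It is written as a single file, so it only shows the shape of the header and not the cross-TU address guarantee itself. The member function address is a hypothetical replacement for the printf call so that the result can be inspected.

```cpp
// Stand-in for a header that several TUs would include (requires C++17).
// inline gives the variable external linkage and one shared identity.
inline constexpr int x = 1;

struct A {
    // Taking the address odr-uses x. This is fine for an inline variable,
    // because every TU that includes the header refers to the same object.
    const int* address() const { return &x; }
};
```

Without inline, a namespace-scope constexpr variable has internal linkage, so each TU would get its own x and the same member function would return a different address in each TU.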

Struct vs. Function Definitions in Scope

So, as far as I know, this is legal in C:
foo.c
struct foo {
int a;
};
bar.c
struct foo {
char a;
};
But the same thing with functions is illegal:
foo.c
int foo() {
return 1;
}
bar.c
int foo() {
return 0;
}
and will result in linking error (multiple definition of function foo).
Why is that? What's the difference between struct names and function names that makes C unable to handle one but not the other?
Also does this behavior extend to C++?
Why is that?
struct foo {
int a;
};
defines a template for creating objects. It does not create any objects or functions. Unless struct foo is used somewhere in your code, as far as the compiler/linker is concerned, those lines of code may as well not exist.
Please note that there is a difference in how C and C++ deal with incompatible struct definitions.
The differing definitions of struct foo in your posted code are OK in a C program as long as you don't mix their usage.
However, it is not legal in C++. In C++, they have external linkage and must be defined identically. See 3.2 One definition rule/5 for further details.
The distinguishing concept in this case is called linkage.
In C, struct, union, and enum tags have no linkage. They are effectively local to their scope.
6.2.2 Linkages of identifiers
6 The following identifiers have no linkage: an identifier declared to be anything other than
an object or a function; an identifier declared to be a function parameter; a block scope
identifier for an object declared without the storage-class specifier extern.
They cannot be re-declared in the same scope (except for so called forward declarations). But they can be freely re-declared in different scopes, including different translation units. In different scopes they may declare completely independent types. This is what you have in your example: in two different translation units (i.e. in two different file scopes) you declared two different and unrelated struct foo types. This is perfectly legal.
Meanwhile, functions have linkage in C. In your example these two definitions define the same function foo with external linkage, and you are not allowed to provide more than one definition of any external linkage function in your entire program:
6.9 External definitions
5 [...] If an identifier declared with external
linkage is used in an expression (other than as part of the operand of a sizeof or _Alignof operator whose result is an integer constant), somewhere in the entire
program there shall be exactly one external definition for the identifier; otherwise, there
shall be no more than one.
In C++ the concept of linkage is extended: it assigns specific linkage to a much wider variety of entities, including types. In C++ class types have linkage. Classes declared in namespace scope have external linkage. And One Definition Rule of C++ explicitly states that if a class with external linkage has several definitions (across different translation units) it shall be defined equivalently in all of these translation units (http://eel.is/c++draft/basic.def.odr#12). So, in C++ your struct definitions would be illegal.
Your function definitions remain illegal in C++ as well because of C++ ODR rule (but essentially for the same reasons as in C).
Your function definitions both declare an entity called foo with external linkage, and the C standard says there must not be more than one definition of an entity with external linkage. The struct types you defined are not entities with external linkage, so you can have more than one definition of struct foo.
If you declared objects with external linkage using the same name then that would be an error:
foo.c
struct foo {
int a;
};
struct foo obj;
bar.c
struct foo {
char a;
};
struct foo obj;
Now you have two objects called obj that both have external linkage, which is not allowed.
It would still be wrong even if one of the objects is only declared, not defined:
foo.c
struct foo {
int a;
};
struct foo obj;
bar.c
struct foo {
char a;
};
extern struct foo obj;
This is undefined, because the two declarations of obj refer to the same object, but they don't have compatible types (because struct foo is defined differently in each file).
C++ has similar, but more complex rules, to account for inline functions and inline variables, templates, and other C++ features. In C++ the relevant requirements are known as the One-Definition Rule (or ODR). One notable difference is that C++ doesn't even allow the two different struct definitions, even if they are never used to declare objects with external linkage or otherwise "shared" between translation units.
The two declarations for struct foo are incompatible with each other because the types of the members are not the same. Using them both within each translation unit is fine as long as you don't do anything to confuse the two.
If for example you did this:
foo.c:
struct foo {
char a;
};
void bar_func(struct foo *f);
void foo_func()
{
struct foo f;
bar_func(&f);
}
bar.c:
struct foo {
int a;
};
void bar_func(struct foo *f)
{
f->a = 1000;
}
You would be invoking undefined behavior because the struct foo that bar_func expects is not compatible with the struct foo that foo_func is supplying.
The compatibility of structs is detailed in section 6.2.7 of the C standard:
1 Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are
described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers,
and in 6.7.6 for declarators. Moreover, two structure, union, or
enumerated types declared in separate translation units are compatible
if their tags and members satisfy the following requirements: If one
is declared with a tag, the other shall be declared with the same tag.
If both are completed anywhere within their respective translation
units, then the following additional requirements apply: there shall
be a one-to-one correspondence between their members such that each
pair of corresponding members are declared with compatible types; if
one member of the pair is declared with an alignment specifier, the
other is declared with an equivalent alignment specifier; and if one
member of the pair is declared with a name, the other is declared with
the same name. For two structures, corresponding members shall be
declared in the same order. For two structures or unions,
corresponding bit-fields shall have the same widths. For two
enumerations, corresponding members shall have the same values.
2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
To summarize, the two instances of struct foo must have members with the same name and type and in the same order to be compatible.
Such rules are needed so that a struct can be defined once in a header file and that header is subsequently included in multiple source files. This results in the struct being defined in multiple source files, but with each instance being compatible.
The difference isn't so much in the names as in existence; a struct definition isn't stored anywhere and its name only exists during compilation.
(It is the programmer's responsibility to ensure that there is no conflict in the uses of identically named structs. Otherwise, our dear old friend Undefined Behaviour comes calling.)
On the other hand, a function needs to be stored somewhere, and if it has external linkage, the linker needs its name.
If you make your functions static, so they're "invisible" outside their respective compilation unit, the linking error will disappear.
To hide the function definition from the linker use the keyword static.
foo.c
static int foo() {
return 1;
}
bar.c
static int foo() {
return 0;
}

Class template overloading across TUs

Consider the following C++11 application:
A.cpp:
template<typename T>
struct Shape {
T x;
T area() const { return x*x; }
};
int testA() {
return Shape<int>{2}.area();
}
B.cpp:
template<typename T, typename U = T>
struct Shape {
T x;
U y;
U area() const { return x*y; }
};
int testB() {
return Shape<int,short>{3,4}.area();
}
Main.cpp:
int testA();
int testB();
int main() {
return testA() + testB();
}
Although it compiles (as long as A and B are in separate TUs), it doesn't look right, and I'm having trouble figuring out why.
Hence my Question: Does this violate ODR, overloading, or any other rule, and if so, what sections of the Standard are violated and why?
It is an ODR violation. Template names have linkage. And both those template names have external linkage, as [basic.link]/4 says:
An unnamed namespace or a namespace declared directly or indirectly
within an unnamed namespace has internal linkage. All other namespaces
have external linkage. A name having namespace scope that has not been
given internal linkage above has the same linkage as the enclosing
namespace if it is the name of
[...]
a template.
And on account of that, since both templates share a name, it means that [basic.def.odr]/5 applies:
There can be more than one definition of a [...] class template
(Clause [temp]) [...] in a program provided that each definition
appears in a different translation unit, and provided the definitions
satisfy the following requirements. Given such an entity named D
defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
[...]
If D is a template and is defined in more than one translation unit,
then the preceding requirements shall apply both to names from the
template's enclosing scope used in the template definition
([temp.nondep]), and also to dependent names at the point of
instantiation ([temp.dep]). If the definitions of D satisfy all these
requirements, then the program shall behave as if there were a single
definition of D. If the definitions of D do not satisfy these
requirements, then the behavior is undefined.
The two definitions are not the same sequence of tokens, by a wide margin.
You can easily resolve it, as Jarod42 suggested, by putting both definitions of the templates into an unnamed namespace, thus giving them internal linkage.
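A sketch of that fix for A.cpp, assuming the same Shape and testA as in the question. B.cpp would wrap its own two-parameter Shape in an unnamed namespace in the same way.

```cpp
// A.cpp (sketch): the template now has internal linkage, so it is a
// different entity from any Shape defined in another translation unit.
namespace {

template <typename T>
struct Shape {
    T x;
    T area() const { return x * x; }
};

}  // unnamed namespace

// testA keeps external linkage so that Main.cpp can still call it.
int testA() {
    return Shape<int>{2}.area();
}
```

Only the templates move into the unnamed namespace. The functions called from Main.cpp stay outside it so the linker can still find them.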

What means "obey ODR" in case of inline and constexpr function?

I just read that constexpr and inline functions obey the one-definition rule, but their definitions must be identical. So I tried it:
inline void foo() {
return;
}
inline void foo() {
return;
}
int main() {
foo();
};
error: redefinition of 'void foo()',
and
constexpr int foo() {
return 1;
}
constexpr int foo() {
return 1;
}
int main() {
constexpr int x = foo();
};
error: redefinition of 'constexpr int foo()'
So what exactly does it mean that constexpr and inline functions obey the ODR?
I just read that constexpr and inline functions obey the one-definition rule, but their definitions must be identical.
This is in reference to inline functions in different translations units. In your example they are both in the same translation unit.
This is covered in the draft C++ standard 3.2 One definition rule [basic.def.odr] which says:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with
external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member
of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for
which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition
appears in a different translation unit, and provided the definitions satisfy the following requirements. Given
such an entity named D defined in more than one translation unit, then
and includes the following bullet:
each definition of D shall consist of the same sequence of tokens; and
You are defining functions repeatedly in one translation unit. This is always forbidden:
No translation unit shall contain more than one definition of any variable, function, class type, enumeration
type, or template. (C++11 3.2/1)
For inline functions, you are allowed to define the same function in exactly the same way in more than one translation unit (read: .cpp file). In fact, you must define it in every translation unit in which it is odr-used (which is usually done by defining it in a header file):
An inline function shall be defined in every translation unit in which it is odr-used. (C++11 3.2/3)
For "normal" (non-inline, non-constexpr, non-template, etc.) functions with external linkage (non-static), defining them more than once will usually (no diagnostic required) lead to a linker error.
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used
in that program; no diagnostic required. (C++11 3.2/3)
To sum up:
Never define anything multiple times in one translation unit (which is a .cpp file and all directly or indirectly included headers).
You may put a certain number of things into header files, where they will be included once in several different translation units, for example:
inline functions
class types and templates
static data members of a class template.
If you have:
file1.cpp:
#include <iostream>
inline void foo() { std::cout << "Came to foo in file1.cpp" << std::endl; }
and
file2.cpp:
#include <iostream>
inline void foo() { std::cout << "Came to foo in file2.cpp" << std::endl; }
and you link those files together into an executable, you are violating the one-definition rule, since the two versions of the inline function are not the same.
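One common way to stay within the rule is to write the inline function once, in a header, and include that header from every .cpp file that calls it. The following single-file sketch illustrates the idea. The functions call_from_file1 and call_from_file2 are hypothetical stand-ins for code in file1.cpp and file2.cpp, and foo returns its message instead of printing it so that the result can be compared.

```cpp
#include <string>

// Stand-in for a shared header: one definition, identical token for token
// in every translation unit that includes it.
inline std::string foo() { return "Came to foo"; }

// Stand-ins for callers that would live in file1.cpp and file2.cpp.
std::string call_from_file1() { return foo(); }
std::string call_from_file2() { return foo(); }
```

Because both callers see the same definition, it does not matter which copy the linker keeps.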