How does template affect linkage of const global variable? - c++

As the doc says (emphasis mine):
Any of the following names declared at namespace scope have internal linkage:
non-volatile non-template non-inline const-qualified variables (including constexpr) that aren't declared extern and aren't previously declared to have external linkage;
So I'd expect const template variables to have external linkage. So I did a test:
// main.cpp
#include <iostream>
void other();
template<class T> T var = 1;
template<class T> const T constVar = 1;
int main() {
std::cout << var<int> << ' ' << constVar<int> << std::endl;
other();
}
// other.cpp
#include <iostream>
template<class T> T var = 2;
template<class T> const T constVar = 2;
void other() {
std::cout << var<int> << ' ' << constVar<int> << std::endl;
}
And the output is:
1 1
1 2
The second column is for constVar, and it differs for different rows (printed from different translation units). This makes me think that it actually has internal linkage, despite being a template.
I understand that I do violate ODR, but only to understand what's happening.
So does constVar actually have internal linkage? If yes, what does the highlighted fragment of the doc mean? If no, then what's happening, and why do we need this highlighted fragment?

C++14 N4296 §14.4
A template name has linkage (3.5). [...] Template definitions shall obey the one definition rule (3.2).
C++14 N4296 §3.2.6
Given
such an entity named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; [...]
If D is a template and is defined in more than one translation unit, then the preceding requirements shall
apply both to names from the template’s enclosing scope used in the template definition (14.6.3), and also to
dependent names at the point of instantiation (14.6.2). If the definitions of D satisfy all these requirements,
then the behavior is as if there were a single definition of D. If the definitions of D do not satisfy these
requirements, then the behavior is undefined.
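A template name at namespace scope has external linkage unless it is given internal linkage (static or an unnamed namespace). Much like an inline entity, a template may be defined in several translation units only if all the definitions are identical.
What's happening is undefined behaviour: the two translation units define var and constVar with different token sequences.
The compiler is not required to diagnose your code, and it may produce unexpected or inconsistent results.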
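One way to stay clear of the undefined behaviour is to let every translation unit see the same definitions, for example through a shared header. The following is only a minimal sketch: the header name vars.hpp and the four helper functions are hypothetical and do not appear in the question.

```cpp
// vars.hpp (hypothetical header): the single token sequence every TU sees.
template <class T> T var = 1;
template <class T> const T constVar = 1;

// Stand-ins for code in main.cpp and other.cpp, each of which would
// include vars.hpp instead of writing its own definitions.
int const_seen_by_main()  { return constVar<int>; }
int const_seen_by_other() { return constVar<int>; }
int var_seen_by_main()    { return var<int>; }
int var_seen_by_other()   { return var<int>; }
```

With identical definitions, both translation units read the value 1 regardless of whether constVar<int> turns out to be one object for the whole program or one per translation unit.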

Related

Identity of unnamed enums with no enumerators

Consider a program with the following two translation units:
// TU 1
#include <typeinfo>
struct S {
enum { } x;
};
const std::type_info& ti1 = typeid(decltype(S::x));
// TU 2
#include <iostream>
#include <typeinfo>
struct S {
enum { } x;
};
extern const std::type_info& ti1;
const std::type_info& ti2 = typeid(decltype(S::x));
int main() {
std::cout << (ti1 == ti2) << '\n';
}
I compiled it with GCC and Clang and in both cases the result was 1, and I'm not sure why. (GCC also warns that "ISO C++ forbids empty unnamed enum", which I don't think is true.)
[dcl.enum]/11 states that if an unnamed enumeration does not have a typedef name for linkage purposes but has at least one enumerator, then it has its first enumerator as its name for linkage purposes. These enums have no enumerators, so they have no name for linkage purposes. The same paragraph also has the following note which seems to be a natural consequence of not giving the enums names for linkage purposes:
[Note 3: Each unnamed enumeration with no enumerators is a distinct type. — end note]
Perhaps both compilers have a bug. Or, more likely, I just misunderstood the note. The note is non-normative anyway, so let's look at some normative wording.
[basic.link]/8
Two declarations of entities declare the same entity if, considering declarations of unnamed types to introduce their names for linkage purposes, if any ([dcl.typedef], [dcl.enum]), they correspond ([basic.scope.scope]), have the same target scope that is not a function or template parameter scope, and
[irrelevant]
[irrelevant]
they both declare names with external linkage.
[basic.scope.scope]/4
Two declarations correspond if they (re)introduce the same name, both declare constructors, or both declare destructors, unless [irrelevant]
It seems that, when an unnamed enum is not given a typedef name for linkage purposes, and has no enumerators, it can't be the same type as itself in a different translation unit.
So is it really just a compiler bug? One last thing I was thinking is that if the two enum types really are distinct, then the multiple definitions of S violate the one-definition rule and make the program ill-formed NDR. But I couldn't find anything in the ODR that actually says that.
This program is well-formed and prints 1, as seen. Because S is defined identically in both translation units with external linkage, it is as if there is one definition of S ([basic.def.odr]/14) and thus only one enumeration type is defined. (In practice it is mangled based on the name S or S::x.)
This is just the same phenomenon as static local variables and lambdas being shared among the definitions of an inline function:
// foo.hh
#include <vector>
inline int* f() {static int x; return &x;}
inline auto g(int *p) {return [p] {return p;};}
inline std::vector<decltype(g(nullptr))> v;
// bar.cc
#include"foo.hh"
void init() {v.push_back(g(f()));}
// main.cc
#include"foo.hh"
void init();
int main() {
init();
return v.front()()!=f(); // 0
}
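The single-translation-unit half of that argument can be sketched as follows. This is an illustration only: round_trip is a hypothetical helper, and a local vector stands in for the inline variable v so the sketch does not need C++17. It shows that the closure type belongs to the one function g, so values produced by g can be stored in a container typed via decltype(g(nullptr)).

```cpp
#include <vector>

inline int* f() { static int x; return &x; }
inline auto g(int* p) { return [p] { return p; }; }

// Hypothetical helper: store a closure produced by g, then check that
// calling it yields the address of f's static local.
inline bool round_trip() {
    std::vector<decltype(g(nullptr))> v;  // element type is g's closure type
    v.push_back(g(f()));
    return v.front()() == f();
}
```

The cross-translation-unit guarantee itself cannot be shown in one file; the sketch only makes the types involved concrete.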

Should `const` and `constexpr` variables in headers be `inline` to prevent ODR violations?

Consider the following header and assume it is used in several TUs:
static int x = 0;
struct A {
A() {
++x;
printf("%d\n", x);
}
};
As this question explains, this is an ODR violation and, therefore, UB.
Now, there is no ODR violation if our inline function refers to a non-volatile const object and we do not odr-use it within that function (plus the other provisions), so this still works fine in a header:
constexpr int x = 1;
struct A {
A() {
printf("%d\n", x);
}
};
But if we do happen to odr-use it, we are back at square one with UB:
constexpr int x = 1;
struct A {
A() {
printf("%p\n", &x);
}
};
Thus, given that we now have inline variables, shouldn't the guideline be to mark all namespace-scope variables in headers as inline to avoid these problems?
constexpr inline int x = 1;
struct A {
A() {
printf("%p\n", &x);
}
};
This also seems easier to teach, because we can simply say "inline-everything in headers" (i.e. both function and variable definitions), as well as "never static in headers".
Is this reasoning correct? If yes, are there any disadvantages whatsoever of always marking const and constexpr variables in headers as inline?
As you have pointed out, the first and third examples do indeed violate the ODR, per [basic.def.odr]/12.2.1:
[..] in each definition of D, corresponding names, looked up according to [basic.lookup], shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution and after matching of partial template specialization, except that a name can refer to
a non-volatile const object with internal or no linkage if the object
is not odr-used in any definition of D, [..]
Is this reasoning correct?
Yes, inline variables with external linkage are guaranteed to refer to the same entity even when they are odr-used, as long as all the definitions are the same:
[dcl.inline]/6
An inline function or variable shall be defined in every translation unit in which it is odr-used and shall have exactly the same definition in every case ([basic.def.odr]). [..] An inline function or variable with external linkage shall have the same address in all translation units.
The last example is OK because it satisfies, and does not violate, the bold part of the quote above.
are there any disadvantages whatsoever of always marking const and constexpr variables in headers as inline?
I can't think of any. If we keep the promise of having exactly the same definition of an inline variable with external linkage across TUs, the compiler is free to pick any of them to refer to the variable. Technically, this is the same as having just one TU with a global variable declared in a header with appropriate header guards.
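A small sketch of what this looks like in practice, assuming C++17. The member function where and the free function address_seen_elsewhere are hypothetical names added for illustration; they stand in for code in two different translation units that both include the header.

```cpp
// Header contents (sketch): one entity program-wide, safe to odr-use.
inline constexpr int x = 1;

struct A {
    // Taking the address odr-uses x; with inline this refers to the
    // same object in every translation unit.
    const int* where() const { return &x; }
};

// Stand-in for code in another translation unit including the header.
inline const int* address_seen_elsewhere() { return &x; }
```

In a single file the two addresses are trivially equal; the point of inline is that [dcl.inline]/6 extends that equality across translation units.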

Can you violate ODR with structured bindings on a class type

The structured bindings feature uses the tuple-like decomposition if std::tuple_size is a complete type for the type being decomposed. What happens when std::tuple_size is a complete type for the given type at one point in the program and is not complete at another point?
#include <iostream>
#include <tuple>
using std::cout;
using std::endl;
class Something {
public:
template <std::size_t Index>
auto get() {
cout << "Using member get" << endl;
return std::get<Index>(this->a);
}
std::tuple<int> a{1};
};
namespace {
auto something = Something{};
}
void foo() {
auto& [one] = something;
std::get<0>(one)++;
cout << std::get<0>(one) << endl;
}
namespace std {
template <>
class tuple_size<Something> : public std::integral_constant<std::size_t, 1> {};
template <>
class tuple_element<0, Something> {
public:
using type = int;
};
}
int main() {
foo();
auto& [one] = something;
cout << one << endl;
}
(Reproduced here https://wandbox.org/permlink/4xJUEpTAyUxrizyU)
In the above program the type Something is decomposed via the public data members at one point in the program and falls back to the tuple-like decomposition at another. Are we violating ODR with the implicit "is std::tuple_size complete" check behind the scenes?
I don't see any reason to believe that the program in question is ill-formed. Simply having something in the code depend on the completeness of a type, then having something else later on depend on the completeness of the same type where the type has since been completed, does not violate the standard.
A problem arises if we have something like
inline Something something; // external linkage
inline void foo() {
auto& [one] = something;
}
defined in multiple translation units, where, in some of those, std::tuple_size<Something> is already complete at the point where foo is defined, and in others, it isn't. This seems like it should definitely violate the ODR, since the entity one receives different types in different copies of foo, however, I can't actually find a place in the standard that says so. The criteria for the multiple definitions to be merged into one are:
each definition of D shall consist of the same sequence of tokens; and
in each definition of D, corresponding names, looked up according to 6.4, shall refer to an entity defined
within the definition of D, or shall refer to the same entity, after overload resolution (16.3) and after
matching of partial template specialization (17.8.3), except that a name can refer to
a non-volatile const object with internal or no linkage if the object
has the same literal type in all definitions of D,
is initialized with a constant expression (8.20),
is not odr-used in any definition of D, and
has the same value in all definitions of D,
or
a reference with internal or no linkage initialized with a constant expression such that the reference
refers to the same entity in all definitions of D;
and
in each definition of D, corresponding entities shall have the same language linkage; and
in each definition of D, the overloaded operators referred to, the implicit calls to conversion functions,
constructors, operator new functions and operator delete functions, shall refer to the same function, or
to a function defined within the definition of D; and
in each definition of D, a default argument used by an (implicit or explicit) function call is treated as if
its token sequence were present in the definition of D; that is, the default argument is subject to the
requirements described in this paragraph (and, if the default argument has subexpressions with default
arguments, this requirement applies recursively); and
if D is a class with an implicitly-declared constructor (15.1), it is as if the constructor was implicitly
defined in every translation unit where it is odr-used, and the implicit definition in every translation
unit shall call the same constructor for a subobject of D.
If there's a rule here that makes my code ill-formed, I don't know which one it is. Perhaps the standard needs to be amended, because it cannot have been intended that this was allowed.
Another way to make the program ill-formed NDR involves the use of a template:
template <int unused>
void foo() {
auto& [one] = something;
}
// define tuple_element and tuple_size
foo<42>(); // instantiate foo
This would run afoul of [temp.res]/8.4, according to which
The program is ill-formed, no diagnostic required, if ... the interpretation of [a construct that does not depend on a template parameter] in [the hypothetical instantiation of a template immediately following its definition] is different from the interpretation of the corresponding construct in any actual instantiation of the template
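One way to sidestep both problems is to make the tuple-like protocol visible before any structured binding of the type, so every binding is interpreted the same way. The following is a minimal sketch assuming C++17. It simplifies the question's class (get returns a reference and does not print), and bump_and_read is a hypothetical helper.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

class Something {
public:
    template <std::size_t Index>
    auto& get() { return std::get<Index>(a); }
    std::tuple<int> a{1};
};

// Specializations appear before the first structured binding, so
// std::tuple_size<Something> is complete everywhere it is consulted.
namespace std {
template <>
struct tuple_size<Something> : std::integral_constant<std::size_t, 1> {};
template <>
struct tuple_element<0, Something> { using type = int; };
}

// Hypothetical helper: binds through the tuple-like protocol.
inline int bump_and_read() {
    Something s;
    auto& [one] = s;  // calls s.get<0>(), so one refers to the int in s.a
    ++one;
    return std::get<0>(s.a);
}
```

Because the binding goes through the member get, one is an int reference here rather than a reference to the std::tuple<int> member, which is what the data-member decomposition in the question's foo produced.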

Class template overloading across TUs

Consider the following C++11 application:
A.cpp:
template<typename T>
struct Shape {
T x;
T area() const { return x*x; }
};
int testA() {
return Shape<int>{2}.area();
}
B.cpp:
template<typename T, typename U = T>
struct Shape {
T x;
U y;
U area() const { return x*y; }
};
int testB() {
return Shape<int,short>{3,4}.area();
}
Main.cpp:
int testA();
int testB();
int main() {
return testA() + testB();
}
Although it compiles (as long as A and B are in separate TUs), it doesn't look right, and I'm having trouble figuring out why.
Hence my Question: Does this violate ODR, overloading, or any other rule, and if so, what sections of the Standard are violated and why?
It is an ODR violation. Template names have linkage. And both those template names have external linkage, as [basic.link]/4 says:
An unnamed namespace or a namespace declared directly or indirectly
within an unnamed namespace has internal linkage. All other namespaces
have external linkage. A name having namespace scope that has not been
given internal linkage above has the same linkage as the enclosing
namespace if it is the name of
[...]
a template.
And on account of that, since both templates share a name, it means that [basic.def.odr]/5 applies:
There can be more than one definition of a [...] class template
(Clause [temp]) [...] in a program provided that each definition
appears in a different translation unit, and provided the definitions
satisfy the following requirements. Given such an entity named D
defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
[...]
If D is a template and is defined in more than one translation unit,
then the preceding requirements shall apply both to names from the
template's enclosing scope used in the template definition
([temp.nondep]), and also to dependent names at the point of
instantiation ([temp.dep]). If the definitions of D satisfy all these
requirements, then the program shall behave as if there were a single
definition of D. If the definitions of D do not satisfy these
requirements, then the behavior is undefined.
The two definitions of Shape are not the same sequence of tokens, by a wide margin.
You can easily resolve it, as Jarod42 suggested, by putting both definitions of the templates into an unnamed namespace, thus giving them internal linkage.
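A sketch of that fix for A.cpp, using the definitions from the question. B.cpp would wrap its own, different Shape in an unnamed namespace in the same way; it is not shown here because a single file has only one unnamed namespace.

```cpp
// A.cpp (sketch): Shape now has internal linkage, so it cannot clash
// with a different Shape defined in another translation unit.
namespace {

template <typename T>
struct Shape {
    T x;
    T area() const { return x * x; }
};

}  // unnamed namespace

// testA keeps external linkage so Main.cpp can still call it.
int testA() {
    return Shape<int>{2}.area();
}
```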

What does "obey ODR" mean in the case of inline and constexpr functions?

I just read that constexpr and inline functions obey the one-definition rule, but their definitions must be identical. So I tried it:
inline void foo() {
return;
}
inline void foo() {
return;
}
int main() {
foo();
};
error: redefinition of 'void foo()',
and
constexpr int foo() {
return 1;
}
constexpr int foo() {
return 1;
}
int main() {
constexpr int x = foo();
};
error: redefinition of 'constexpr int foo()'
So what exactly does it mean that constexpr and inline functions obey the ODR?
I just read that constexpr and inline functions obey the one-definition rule, but their definitions must be identical.
This is in reference to inline functions in different translation units. In your example they are both in the same translation unit.
This is covered in the draft C++ standard 3.2 One definition rule [basic.def.odr] which says:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with
external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member
of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for
which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition
appears in a different translation unit, and provided the definitions satisfy the following requirements. Given
such an entity named D defined in more than one translation unit, then
and includes the following bullet:
each definition of D shall consist of the same sequence of tokens; and
You are defining functions repeatedly in one translation unit. This is always forbidden:
No translation unit shall contain more than one definition of any variable, function, class type, enumeration
type, or template. (C++11 3.2/1)
For inline functions, you are allowed to define the same function in exactly the same way in more than one translation unit (read: .cpp file). In fact, you must define it in every translation unit in which it is odr-used (which is usually done by defining it in a header file):
An inline function shall be defined in every translation unit in which it is odr-used. (C++11 3.2/3)
For "normal" functions with external linkage (non-inline, non-constexpr, non-template, non-static, etc.), defining the function in more than one translation unit will usually lead to a linker error, although no diagnostic is required:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used
in that program; no diagnostic required. (C++11 3.2/3)
To sum up:
Never define anything multiple times in one translation unit (which is a .cpp file and all directly or indirectly included headers).
You may put a certain number of things into header files, where they will be included once in several different translation units, for example:
inline functions
class types and templates
static data members of a class template.
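As a rough sketch, a header following these rules might look like this. The file name shapes.hpp and every name inside it are hypothetical, chosen only to show one example of each category from the list.

```cpp
// shapes.hpp (hypothetical): safe to include from several .cpp files.
#ifndef SHAPES_HPP
#define SHAPES_HPP

// Inline function: may be defined identically in every TU.
inline int square(int v) { return v * v; }

// Class type: may be defined identically in every TU.
struct Point {
    int x;
    int y;
};

// Class template with a static data member, which may also be
// defined in the header.
template <typename T>
struct Counter {
    static int count;
};

template <typename T>
int Counter<T>::count = 0;

#endif  // SHAPES_HPP
```

The include guard covers the first rule (nothing is defined twice within one translation unit), and the kinds of entity chosen cover the second.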
If you have:
file1.cpp:
inline void foo() { std::cout << "Came to foo in file1.cpp" << std::endl; }
and
file2.cpp:
inline void foo() { std::cout << "Came to foo in file2.cpp" << std::endl; }
and you link those files together into an executable, you are violating the one-definition rule, since the two versions of the inline function are not the same.