Can using a lambda in header files violate the ODR? - c++

Can the following be written in a header file:
inline void f () { std::function<void ()> func = [] {}; }
or
class C { std::function<void ()> func = [] {}; C () {} };
I guess in each source file, the lambda's type may be different and therefore the contained type in std::function (target_type's results will differ).
Is this an ODR (One Definition Rule) violation, despite looking like a common pattern and a reasonable thing to do? Does the second sample violate the ODR every time or only if at least one constructor is in a header file?

This boils down to whether or not a lambda's type differs across translation units. If it does, it may affect template argument deduction and potentially cause different functions to be called - in what are meant to be consistent definitions. That would violate the ODR (see below).
However, that isn't intended. In fact, this problem was already touched on a while ago by core issue 765, which specifically names inline functions with external linkage, such as f:
7.1.2 [dcl.fct.spec] paragraph 4 specifies that local static variables and string literals appearing in the body of an inline function with
external linkage must be the same entities in every translation unit
in the program. Nothing is said, however, about whether local types
are likewise required to be the same.
Although a conforming program could always have determined this by use
of typeid, recent changes to C++ (allowing local types as template
type arguments, lambda expression closure classes) make this question
more pressing.
Notes from the July, 2009 meeting:
The types are intended to be the same.
Now, the resolution incorporated the following wording into [dcl.fct.spec]/4:
A type defined within the body of an extern inline function is the same type in every translation unit.
(NB: MSVC doesn't implement the above wording yet, although it might in the next release.)
Lambdas inside such functions' bodies are therefore safe, since the closure type's definition is indeed at block scope ([expr.prim.lambda]/3).
Hence multiple definitions of f have always been well-defined.
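As a rough illustration of the pattern being discussed, here is a minimal sketch of header-style code. The names make_callback and closure_type_is_stable are made up for this example. A single block can only show the single-TU case; the cross-TU guarantee comes from the wording quoted above, not from anything this snippet proves.

```cpp
#include <cassert>
#include <functional>
#include <typeinfo>

// Hypothetical header content: an inline function with external linkage
// whose body contains a lambda, as in the question's f.
inline std::function<void()> make_callback() {
    return [] {};
}

// Compares the closure type seen through two separate calls. Within one
// translation unit both calls evaluate the same lambda expression, so the
// stored target has the same type each time.
inline bool closure_type_is_stable() {
    const std::function<void()> first = make_callback();
    const std::function<void()> second = make_callback();
    return first.target_type() == second.target_type();
}

inline void demo_closure_type() {
    assert(closure_type_is_stable());
}
```

If the closure type did differ between translation units, target_type() is exactly where the difference would become observable.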
This resolution certainly doesn't cover all scenarios, as there are many more kinds of entities with external linkage that can make use of lambdas, function templates in particular - this should be covered by another core issue.
In the meantime, Itanium already contains appropriate rules to ensure that such lambdas' types coincide in more situations, hence Clang and GCC should already mostly behave as intended.
Standardese on why differing closure types are an ODR violation follows. Consider bullet points (6.2) and (6.4) in [basic.def.odr]/6:
There can be more than one definition of […]. Given such an entity named D defined in more than one translation unit, then each definition of D shall consist of the
same sequence of tokens; and
(6.2) - in each definition of D, corresponding names, looked up
according to [basic.lookup], shall refer to an entity defined within
the definition of D, or shall refer to the same entity, after
overload resolution ([over.match]) and after matching of partial
template specialization ([temp.over]), […]; and
(6.4) - in each definition of D, the overloaded operators referred to,
the implicit calls to conversion functions, constructors,
operator new functions and operator delete functions, shall refer to
the same function, or to a function defined within the definition of
D; […]
What this effectively means is that any functions called in the entity's definition shall be the same in all translation units - or have been defined inside its definition, like local classes and their members. I.e. usage of a lambda per se is not problematic, but passing it to function templates clearly is, since these are defined outside the definition.
In your example with C, the closure type is defined within the class (whose scope is the smallest enclosing one). If the closure type differs in two TUs, which the standard may unintentionally imply with the uniqueness of a closure type, the constructor instantiates and calls different specializations of function's constructor template, violating (6.4) in the above quote.
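To make the "different specializations" point concrete, here is a small sketch under my own assumptions: specialization_id and the two helper functions are hypothetical names, and a single translation unit with two separate lambda expressions stands in for the two-TU situation. It only shows that distinct closure types select distinct specializations of a function template.

```cpp
#include <cassert>

// One static object per specialization: its address identifies which
// specialization of the template was instantiated and called.
template <class Callable>
const void* specialization_id(const Callable&) {
    static char marker;
    return &marker;
}

// Two token-identical but separate lambda expressions.
inline const void* id_from_first_lambda()  { return specialization_id([] {}); }
inline const void* id_from_second_lambda() { return specialization_id([] {}); }

inline void demo_distinct_specializations() {
    // Same lambda expression evaluated twice: one closure type, one specialization.
    assert(id_from_first_lambda() == id_from_first_lambda());
    // Separate lambda expressions: two closure types, two specializations.
    assert(id_from_first_lambda() != id_from_second_lambda());
}
```

The constructor template of std::function is selected in the same way, which is why a closure type that differed per TU would run into (6.4).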

UPDATED
After all, I agree with @Columbo's answer, but want to add my practical five cents :)
Although the ODR violation sounds dangerous, it's not really a serious problem in this particular case. The lambda classes created in different TUs are equivalent except for their typeids. So unless you have to cope with the typeid of a header-defined lambda (or a type depending on the lambda), you are safe.
Now, when the ODR violation is reported as a bug, there is a good chance that it will be fixed in compilers that have the problem, e.g. MSVC and probably some others that don't follow the Itanium ABI. Note that Itanium ABI conformant compilers (e.g. gcc and clang) are already producing ODR-correct code for header-defined lambdas.

Related

It seems the current standard draft cannot explain why two structured binding declarations conflict with each other

struct A{
int a;
};
struct B{
int b;
};
auto&& [x] = A{}; //#1
auto&& [x] = B{}; //#2
int main(){
}
In this example, all compilers give an error that the x at #2 conflicts with the one introduced at #1. However, IIUC, there's no rule in the post-C++20 working draft that explains why.
First, in my opinion, the declaration at #2 and the declaration at #1 declare the same entity. They correspond due to:
basic.scope#scope-3
Two declarations correspond if they (re)introduce the same name, both declare constructors, or both declare destructors, unless
[...]
They declare the same entity per basic.link#8
Two declarations of entities declare the same entity if, considering declarations of unnamed types to introduce their names for linkage purposes, if any ([dcl.typedef], [dcl.enum]), they correspond ([basic.scope.scope]), have the same target scope that is not a function or template parameter scope, and either
they appear in the same translation unit, or
[...]
So far, then, they declare the same entity, and they shouldn't be considered to potentially conflict per basic.scope#scope-4
Two declarations potentially conflict if they correspond and cause their shared name to denote different entities([basic.link]). The program is ill-formed if, in any scope, a name is bound to two declarations that potentially conflict and one precedes the other ([basic.lookup]).
Since they denote the same entity, as aforementioned, they do not potentially conflict.
They still do not violate this rule:
basic.link#11
For any two declarations of an entity E:
If one declares E to be a variable or function, the other shall declare E as one of the same type.
[...]
Since structured bindings are not mentioned in this list, they do not violate this rule. Similarly, they do not violate the one-definition rule:
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, template, default argument for a parameter (for a function in a given scope), or default template argument.
At least according to what the relevant rules say, the two declarations in this example shouldn't make the program ill-formed. Unless I'm missing some other rule, can the standard be considered vague here, in that it cannot explain why the two structured binding declarations conflict with each other? This case is even more underspecified in N4861.
This is just a missing case in [basic.link]/11: that if one (of the two declarations) declares a structured binding, the program is ill-formed. (One could alternatively merely require that the other also declare a structured binding and then extend the list in [basic.def.odr]/1, but that’s more complicated and suggests that it might be possible to redefine it in another translation unit.)
You are citing text from a working draft of a post-C++20 version of the language. As such, the behavior it describes is not likely implemented by any compiler currently existing. As it is a working draft, it likely contains a number of language defects and/or bugs, so trying to learn from it is not a productive activity.
All of the "correspond" language you cite is adopted from P1787, which is not part of any C++ standard actual compilers implement. As such, compilers are providing you with the C++20 functionality, and under those rules, these clearly conflict.
There may be some defective wording in P1787, but that's expected with complex proposals and working drafts of a standard. File a defect report on it.
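For contrast with the conflicting namespace-scope declarations in the question, here is a minimal sketch (C++17 or later) of reusing the binding name x without a conflict. The function name sum_of_bindings and the initial values are my own additions. The two declarations have different target scopes, so they cannot declare the same entity or clash.

```cpp
#include <cassert>

struct A { int a; };
struct B { int b; };

inline int sum_of_bindings() {
    int total = 0;
    {
        auto&& [x] = A{1};  // x names member a of the lifetime-extended temporary
        total += x;
    }
    {
        auto&& [x] = B{2};  // a different x, declared in a different block scope
        total += x;
    }
    return total;
}

inline void demo_bindings() {
    assert(sum_of_bindings() == 3);
}
```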

Why does the same class being defined in multiple .cpp files not cause a linker multiple definition error?

I'm getting a strange behavior which I don't understand. So I have two different classes with the same name defined in two different cpp files. I understand that this will not cause any error during the compilation of the translation units as they don't know about each other. But shouldn't the linker throw some error when it links these files together?
You're thinking of the one definition rule. I'm quoting from there (boldface is emphasis of my choosing, not a part of the original document).
Your understanding would be correct--it's illegal to define the same function in multiple compilation units:
One and only one definition of every non-inline function or variable that is odr-used (see below) is required to appear in the entire program (including any standard and user-defined libraries). The compiler is not required to diagnose this violation, but the behavior of the program that violates it is undefined.
However, this isn't the case for classes, which can be defined multiple times (up to once in each compilation unit), as long as the definitions are all identical. If they are identical, then you can safely pass instances of that class from one compilation unit to another, since all compilation units have compatible, identical definitions with compatible sizes and memory layouts.
Only one definition of any variable, function, class type, enumeration type, concept (since C++20) or template is allowed in any one translation unit (some of these may have multiple declarations, but only one definition is allowed).
...
There can be more than one definition in a program, as long as each definition appears in a different translation unit, of each of the following: class type, enumeration type, inline function with external linkage, inline variable with external linkage (since C++17), class template, non-static function template, static data member of a class template, member function of a class template, partial template specialization, concept (since C++20), as long as all of the following is true:
each definition consists of the same sequence of tokens (typically, appears in the same header file)
name lookup from within each definition finds the same entities (after overload-resolution), except that constants with internal or no linkage may refer to different objects as long as they are not ODR-used and have the same values in every definition.
overloaded operators, including conversion, allocation, and deallocation functions refer to the same function from each definition (unless referring to one defined within the definition)
the language linkage is the same (e.g. the include file isn't inside an extern "C" block)
the three rules above apply to every default argument used in each definition
if the definition is for a class with an implicitly-declared constructor, every translation unit where it is odr-used must call the same constructor for the base and members
if the definition is for a template, then all these requirements apply to both names at the point of definition and dependent names at the point of instantiation
If all these requirements are satisfied, the program behaves as if there is only one definition in the entire program. Otherwise, the behavior is undefined.
The bullet points are a fancy and highly precise way of specifying that the definitions must be the same, in letter and in effective result.
The one-definition rule specifically permits this, as long as those definitions are completely, unadulteratedly, identical.
And I do mean absolutely identical. Even if you swap the token struct for the token class, in a case where it would otherwise not matter, your program has undefined behaviour.
And it's for good reason: typically we define classes in headers, and we typically include such headers into multiple translation units; it would be very awkward if this were not allowed.
The same applies to inline function definitions for the same reason.
As for why you don't get an error: well, like I said, undefined behaviour. It would technically be possible for the toolchain to diagnose this, but since multiple class definitions with the same name are a totally commonplace thing to do (per above), it's arguably a waste of time to come up with what would be quite complicated logic for the linker of all things to try to diagnose "accidents". Ultimately, as with many things in this language, it's left up to you to try to get it right.
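If the two classes are genuinely meant to be different and private to their .cpp files, one common way to stay clear of the rule is an unnamed namespace. The sketch below shows what one such .cpp file might contain. Helper and use_helper are hypothetical names, not taken from the question.

```cpp
#include <cassert>

// Hypothetical contents of one .cpp file. The unnamed namespace makes Helper
// local to this translation unit, so another .cpp file can define its own,
// different Helper without the two ever being treated as the same class.
namespace {
struct Helper {
    int value() const { return 1; }
};
}  // namespace

// Only this function is visible to other translation units.
int use_helper() {
    return Helper{}.value();
}

inline void demo_helper() {
    assert(use_helper() == 1);
}
```

With this layout there is nothing for the linker to mix up, because neither Helper has external linkage.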

C++ ODR-rule for default arguments of template functions

In the current draft of the C++ standard, there is the following paragraph ([temp] p.6):
A template name has linkage. Specializations (explicit or implicit) of a template that has internal linkage are distinct from all specializations in other translation units. A template, a template explicit specialization, and a class template partial specialization shall not have C linkage. Use of a linkage specification other than "C" or "C++" with any of these constructs is conditionally-supported, with implementation-defined semantics. Template definitions shall obey the one-definition rule. [ Note: Default arguments for function templates and for member functions of class templates are considered definitions for the purpose of template instantiation ([temp.decls]) and must also obey the one-definition rule. — end note ]
I do not understand what the highlighted part (the note at the end) means. How could I break the one definition rule using default arguments? Is there a way to "re-define" them?
The OP seems to be asking a few different questions:
What does the first part ("considered definitions for the purpose of template instantiation") mean?
What does the second part ("must also obey the one-definition rule") mean?
Isn't the part about the one-definition rule redundant with [basic.def.odr]/14.4 (which, at the time, was known as [basic.def.odr]/12.1)?
Assuming that the intent is e.g. to prohibit the same function template from having two different default arguments for the same parameter in different translation units such that the two default arguments don't have the same value, is this issue specific to templates? If not, then why does it need to be mentioned here at all?
Answer to question 1
That part of the note references [temp.decls], and [temp.decls.general]/3 elaborates:
For purposes of name lookup and instantiation, default arguments, type-constraints, requires-clauses (13.1), and noexcept-specifiers of function templates and of member functions of class templates are considered definitions; each default argument, type-constraint, requires-clause, or noexcept-specifier is a separate definition which is unrelated to the templated function definition or to any other default arguments, type-constraints, requires-clauses, or noexcept-specifiers. For the purpose of instantiation, the substatements of a constexpr if statement (8.5.2) are considered definitions.
Thus, it seems to be simply reminding the reader that default arguments are subject to instantiation separately from the function templates (or member functions of class templates) that they belong to.
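A small sketch of what "instantiated separately" looks like in practice. The names pick and WithFallback are invented for this example. The default argument names T::fallback, which only needs to exist for those T where the default argument is actually used.

```cpp
#include <cassert>

// The default argument is its own definition for instantiation purposes:
// it is instantiated only when a call actually relies on it.
template <typename T>
int pick(T, int n = T::fallback) {
    return n;
}

struct WithFallback {
    static constexpr int fallback = 3;
};

inline void demo_default_argument_instantiation() {
    // T = int has no member fallback, but the second argument is supplied,
    // so the default argument is never instantiated.
    assert(pick(0, 7) == 7);
    // T = WithFallback: the default argument is instantiated and used.
    assert(pick(WithFallback{}) == 3);
}
```

Calling pick(0) instead would be ill-formed, and the error would point at the default argument rather than at the function body.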
Answer to question 2
The part about the one-definition rule appears to be reminding the reader that if the same default argument of a function template (or member function of class template) appears in multiple translation units, those multiple definitions must be "the same" in the sense of the one-definition rule ([basic.def.odr]/14). xskxzr already gave an example of how this could potentially be violated:
// TU 1
template <typename T> void foo(int = 0);
// TU 2
template <typename T> void foo(int = 1);
Answer to question 3
In the current standard, default arguments are a kind of "definable item" ([basic.def.odr]/1.6) so the rule that multiple definitions in different translation units must be identical applies to them just as it does to class definitions, inline function definitions, and so on ([basic.def.odr]/14). This does make the note redundant, which is fine. Notes are non-normative, so they ought to be redundant. Notes are often used to clarify normative text or to draw the reader's attention to a possibly surprising consequence of the normative text.
Answer to question 4
The first half, about instantiation, is obviously specific to templates. The second half is not now specific to templates, but it used to be. In C++20, it was made illegal for the same function parameter to be given different default arguments by different translation units; prior to C++20, it was legal. (This is [diff.cpp17.dcl.dcl]/2 in the "Compatibility" section of C++20.)
To be specific, from C++98 through C++17, it was permitted for non-template functions to have different default arguments in different translation units, but it was not permitted for function templates; therefore, the rule about default arguments being definitions that are subject to the one-definition rule was specific to function templates (and member functions of class templates). The wording of the note has not changed since C++98.
It might be more clear if the wording of the note were now changed to:
Default arguments for function templates and for member functions of class templates are considered definitions for the purpose of template instantiation ([temp.decls]) and, like default arguments of non-templated functions, must also obey the one-definition rule.
I don't think this change is necessary, though. The standard cannot be expected to be a particularly well-written document, as it is maintained by volunteers who already have their hands full with determining what the normative rules should be. You should expect that there will be times when you will look at something in the standard and wonder "why isn't this written in a different way?" And usually there will be some historical reason, but even if you can't figure out what that reason is, you shouldn't let it detract from your ability to understand what the standard is saying.

Example of non odr-used variable defined in the program [duplicate]

This just came up in the context of another question.
Apparently member functions in class templates are only instantiated if they are ODR-used.
Could somebody explain what exactly that means. The wikipedia article on One Definition Rule (ODR) doesn't mention "ODR-use".
However the standard defines it as
A variable whose name appears as a potentially-evaluated expression
is odr-used unless it is an object that satisfies the requirements for
appearing in a constant expression (5.19) and the lvalue-to-rvalue
conversion (4.1) is immediately applied.
in [basic.def.odr].
Edit: Apparently this is the wrong part and the entire paragraph contains multiple definitions for different things. This might be the relevant one for class template member function:
A non-overloaded function whose name appears as a
potentially-evaluated expression or a member of a set of candidate
functions, if selected by overload resolution when referred to from a
potentially-evaluated expression, is odr-used, unless it is a pure
virtual function and its name is not explicitly qualified.
I do however not understand, how this rule works across multiple compilation units? Are all member functions instantiated if I explicitly instantiate a class template?
It's just an arbitrary definition, used by the standard to
specify when you must provide a definition for an entity (as
opposed to just a declaration). The standard doesn't say just
"used", because this can be interpreted diversely depending on
context. And some ODR-use doesn't really correspond to what one
would normally associate with "use"; for example, a virtual
function is always ODR-used unless it is pure, even if it isn't
actually called anywhere in the program.
The full definition is in §3.2, second paragraph, although this
contains references to other sections to complete the
definition.
With regards to templates, ODR-used is only part of the question;
the other part is instantiation. In particular, §14.7 covers
when a template is instantiated. But the two are related: while
the text in §14.7.1 (implicit instantiation) is fairly long, the
basic principle is that a template will only be instantiated if
it is used, and in this context, used means ODR-used. Thus,
a member function of a class template will only be instantiated
if it is called, or if it is virtual and the class itself is
instantiated. The standard itself counts on this in many
places: the std::list<>::sort uses < on the individual
elements, but you can instantiate a list over an element type
which doesn't support <, as long as you don't call sort on
it.
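The std::list point can be sketched roughly like this. NoLess and count_items are hypothetical names. The element type has no operator<, yet the list can be instantiated and used because sort is never called, and so never instantiated.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

struct NoLess { int id; };  // hypothetical element type without operator<

inline std::size_t count_items() {
    std::list<NoLess> items;
    items.push_back(NoLess{1});
    items.push_back(NoLess{2});
    // items.sort();  // would not compile: sort() needs operator< on NoLess
    return items.size();
}

inline void demo_list() {
    assert(count_items() == 2);
}
```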
In plain words, odr-used means that something (a variable or function) is used in a context where its definition must be present.
e.g.,
struct F {
static const int g_x = 2;
};
int g_x_plus_1 = F::g_x + 1; // in this context, only the value of g_x is needed.
// so it's OK without the definition of g_x
std::vector<int> vi;
vi.push_back( F::g_x ); // Error: this is an odr-use; push_back(const int & t) expects
// a const lvalue, so its definition must be present
Note, the above push_back passed in MSVC 2013. This behavior is not standard compliant - both gcc 4.8.2 and clang 3.8.0 failed, with error message:
undefined reference to F::g_x
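One way to make the push_back call link is to provide the missing definition at namespace scope. This is a sketch of that fix. The function name make_values is my own addition, and in a real project the definition line would go in exactly one .cpp file.

```cpp
#include <cassert>
#include <vector>

struct F {
    static const int g_x = 2;  // declaration with an in-class initializer
};

// Namespace-scope definition (no initializer repeated here); this is what
// the "undefined reference" message is asking for.
const int F::g_x;

inline std::vector<int> make_values() {
    std::vector<int> vi;
    vi.push_back(F::g_x);  // binds a const int& to F::g_x: an odr-use
    return vi;
}

inline void demo_values() {
    const std::vector<int> values = make_values();
    assert(values.size() == 1 && values[0] == 2);
}
```

Since C++17, declaring the member as static constexpr int g_x = 2; makes it implicitly inline, which removes the need for the separate definition.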
