Template static member definition depends on order passed to linker - c++

The following code contains two definitions of a static data member of a class template; each one defines template1<int>::x with a different value.
One would expect the linker to reject such redefinitions, since the values differ.
However, compilation and linking succeed with both g++ and MSVC, and which definition is used depends on the order in which the sources are passed to the linker.
Is this behavior compliant with the C++ standard, undefined behavior, or a linker bug?
my_template.h
template <class T>
class template1
{
public:
static int x;
};
Src2.cpp
#include <stdio.h>
#include "my_template.h"
template <class T>
int template1<T>::x = 2;
void my_func() // definition
{
printf("my_func: template1<int>::x = %d\n", template1<int>::x); // definition of X to 2.
printf("my_func: template1<char>::x = %d\n", template1<char>::x); // definition of X to 2.
}
Main.cpp
#include <cstdio>
#include "my_template.h"
template <class T>
int template1<T>::x = 1;
void my_func();
int main()
{
printf("main: template1<int>::x = %d\n", template1<int>::x); // definition of X to 1.
my_func();
return 0;
}
Compile with g++ (MinGW.org GCC Build-20200227-1) 9.2.0+
Compile1
g++ -o prog Src2.cpp Main.cpp
Output1
main: template1<int>::x = 2
my_func: template1<int>::x = 2
my_func: template1<char>::x = 2
Compile2
g++ -o prog Main.cpp Src2.cpp
Output2
main: template1<int>::x = 1
my_func: template1<int>::x = 1
my_func: template1<char>::x = 2
Observed also with
Microsoft (R) C/C++ Optimizing Compiler Version 19.25.28612 for x86
When I generated the assembly with the -S flag, each compilation unit defined the same symbol name.
Co-work with Nightra.

This violates the ODR, which requires that an entity have exactly one definition if it is used. So the program has UB.
The compiler couldn't diagnose this, because each translation unit is fine. In theory, the linker could diagnose this, but in practice it won't do that.

Is this behavior compliant to the C++ standard, undefined behavior, or a linker bug?
This is undefined behaviour (UB).
From [basic.def.odr]/4 of N4659 [emphasis mine]:
Every program shall contain exactly one definition of every
non-inline function or variable that is odr-used in that program
outside of a discarded statement; no diagnostic required. The
definition can appear explicitly in the program, it can be found in
the standard or a user-defined library, or (when appropriate) it is
implicitly defined (see [class.ctor], [class.dtor] and [class.copy]).
An inline function or variable shall be defined in every translation
unit in which it is odr-used outside of a discarded statement.
Non-constexpr static member variables of templates are not implicitly inline, and thus this is UB, no diagnostic required.
We may also turn to [basic.def.odr]/6 for an even stronger statement (not even requiring ODR-use) [quoting a selected extract, emphasis mine]:
There can be more than one definition of a [...] static data member
of a class template [...] in a program provided that each definition
appears in a different translation unit, and provided the
definitions satisfy the following requirements. Given such an entity
named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
[...]
If the definitions of D satisfy all these requirements, then the
behavior is as if there were a single definition of D. If the
definitions of D do not satisfy these requirements, then the
behavior is undefined.
With two different definitions of D (in your case, template1<int>::x) "each definition of D shall consist of the same sequence of tokens" is not fulfilled, and it follows that we naturally cannot possibly fulfill "[...] as if there were a single definition of D"; thus UB.
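One way to stay within [basic.def.odr]/6 is to keep the single definition of the static member in the header, next to the class template, so that every translation unit sees the same token sequence. The following is a minimal sketch under that assumption, not the asker's code; the helper names read_x_for_int and check_single_definition are hypothetical and only there for illustration.

```cpp
#include <cassert>

// my_template.h (sketch): the declaration and the one definition live together.
template <class T>
class template1
{
public:
    static int x;
};

// Every translation unit that includes the header sees this exact token
// sequence, which is what [basic.def.odr]/6 asks for.
template <class T>
int template1<T>::x = 1;

// Hypothetical helper standing in for code in Src2.cpp or Main.cpp.
inline int read_x_for_int()
{
    return template1<int>::x;
}

// Hypothetical self-check: both specializations start from the shared value.
inline void check_single_definition()
{
    assert(read_x_for_int() == 1);
    assert(template1<char>::x == 1);
}
```

With this layout, Src2.cpp and Main.cpp would only include the header and would no longer carry their own, differing definitions. In C++17 the member could instead be declared inline static inside the class, which has the same effect.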

Related

conflicting C++ classes in different modules get mixed up without link error [duplicate]

Consider the following example:
// usedclass1.hpp
#include <iostream>
class UsedClass
{
public:
UsedClass() { }
void doit() { std::cout << "UsedClass 1 (" << this << ") doit hit" << std::endl; }
};
// usedclass2.hpp
#include <iostream>
class UsedClass
{
public:
UsedClass() { }
void doit() { std::cout << "UsedClass 2 (" << this << ") doit hit" << std::endl; }
};
// object.hpp
class Object
{
public:
Object();
};
// object.cpp
#include "object.hpp"
#include "usedclass2.hpp"
Object::Object()
{
UsedClass b;
b.doit();
}
// main.cpp
#include "usedclass1.hpp"
#include "object.hpp"
int main()
{
Object obj;
UsedClass a;
a.doit();
}
The code compiles without any compiler or linker errors. But the output looks strange to me:
gcc (Red Hat 4.6.1-9) on Fedora x86_64 with no optimization [EG1]:
UsedClass 1 (0x7fff0be4a6ff) doit hit
UsedClass 1 (0x7fff0be4a72e) doit hit
same as [EG1] but with -O2 option enabled [EG2]:
UsedClass 2 (0x7fffcef79fcf) doit hit
UsedClass 1 (0x7fffcef79fff) doit hit
msvc2005 (14.00.50727.762) on Windows XP 32bit with no optimization [EG3]:
UsedClass 1 (0012FF5B) doit hit
UsedClass 1 (0012FF67) doit hit
same as [EG3] but with /O2 (or /Ox) enabled [EG4]:
UsedClass 1 (0012FF73) doit hit
UsedClass 1 (0012FF7F) doit hit
I would expect either a linker error (assuming ODR rule is violated) or the output as in [EG2] (code is inlined, nothing is exported from the translation unit, ODR rule is held). Thus my questions:
Why are outputs [EG1], [EG3], [EG4] possible?
Why do I get different results from different compilers or even from the same compiler? That makes me think that the standard somehow doesn't specify the behaviour in this case.
Thank you for any suggestions, comments and standard interpretations.
Update
I would like to understand the compiler's behaviour. More precisely, why there are no errors generated if the ODR is violated. A hypothesis is that since all functions in classes UsedClass1 and UsedClass2 are marked as inline (and therefore C++03 3.2 is not violated) the linker doesn't report errors, but in this case outputs [EG1], [EG3], [EG4] seem strange.
This is the rule that prohibits what you're doing (the C++11 wording), from section 3.2 of the Standard:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
in each definition of D, corresponding names, looked up according to 3.4, shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (13.3) and after matching of partial template specialization (14.8.3), except that a name can refer to a const object with internal or no linkage if the object has the same literal type in all definitions of D, and the object is initialized with a constant expression (5.19), and the value (but not the address) of the object is used, and the object has the same value in all definitions of D; and
in each definition of D, corresponding entities shall have the same language linkage; and
in each definition of D, the overloaded operators referred to, the implicit calls to conversion functions, constructors, operator new functions and operator delete functions, shall refer to the same function, or to a function defined within the definition of D; and
in each definition of D, a default argument used by an (implicit or explicit) function call is treated as if its token sequence were present in the definition of D; that is, the default argument is subject to the three requirements described above (and, if the default argument has sub-expressions with default arguments, this requirement applies recursively).
if D is a class with an implicitly-declared constructor (12.1), it is as if the constructor was implicitly defined in every translation unit where it is odr-used, and the implicit definition in every translation unit shall call the same constructor for a base class or a class member of D.
In your program, you're violating the ODR for class UsedClass because the tokens are different in different compilation units. You could fix that by moving the definition of UsedClass::doit() outside the class body, but the same rule applies to the body of inline functions.
Your program violates the One Definition Rule and invokes undefined behavior.
The standard does not mandate a diagnostic message if you break the ODR, but the behavior is undefined.
C++03 3.2 One definition rule
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template.
...
Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8). An inline function shall be defined in every translation unit in which it is used.
Further, the standard defines specific requirements for the existence of multiple definitions of a symbol; those are defined in paragraph 5 of 3.2.
There can be more than one definition of a class type (clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (clause 14), non-static function template (14.5.5), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.4) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
— each definition of D shall consist of the same sequence of tokens; and
...
Why are outputs [EG1], [EG3], [EG4] possible?
The simple answer is that the behaviour is undefined, so anything is possible.
Most compilers handle an inline function by generating a copy in each translation unit in which it's defined; the linker then arbitrarily chooses one to include in the final program. This is why, with optimisations disabled, it calls the same function in both cases. With optimisations enabled, the function might be inlined by the compiler, in which case each inlined call will use the version defined in the current translation unit.
That makes me think that the standard somehow doesn't specify the behaviour in this case.
That's correct. Breaking the one definition rule gives undefined behaviour, and no diagnostic is required.
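If the two classes really are meant to be different types, one common way out is to give them different names, for example by placing each in its own namespace (or, for a class used in a single .cpp file, in an unnamed namespace). The sketch below assumes that approach; the namespace names module_main and module_object and the helper functions are hypothetical, and doit() returns a string instead of printing so the result can be checked.

```cpp
#include <cassert>
#include <string>

// Hypothetical namespace for the class from usedclass1.hpp.
namespace module_main {
class UsedClass
{
public:
    std::string doit() const { return "UsedClass 1"; }
};
}

// Hypothetical namespace for the class from usedclass2.hpp.
namespace module_object {
class UsedClass
{
public:
    std::string doit() const { return "UsedClass 2"; }
};
}

// Stands in for Object::Object() in object.cpp.
inline std::string object_side()
{
    module_object::UsedClass b;
    return b.doit();
}

// Stands in for main() in main.cpp.
inline std::string main_side()
{
    module_main::UsedClass a;
    return a.doit();
}

// Hypothetical self-check: each caller reaches its own class.
inline void check_distinct_classes()
{
    assert(object_side() == "UsedClass 2");
    assert(main_side() == "UsedClass 1");
}
```

Because module_main::UsedClass::doit and module_object::UsedClass::doit are different entities, the linker has nothing to merge, and the result no longer depends on inlining or optimization level.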

Possible ODR-violations when using a constexpr variable in the definition of an inline function (in C++14)

(Note! This question particularly covers the state of C++14, before the introduction of inline variables in C++17)
TLDR; Question
What constitutes odr-use of a constexpr variable used in the definition of an inline function, such that multiple definitions of the function violates [basic.def.odr]/6?
(... likely [basic.def.odr]/3; but could this silently introduce UB in a program as soon as, say, the address of such a constexpr variable is taken in the context of the inline function's definition?)
TLDR example: does a program where doMath() defined as follows:
// some_math.h
#pragma once
#include <algorithm> // std::max
// Forced by some guideline abhorring literals.
constexpr int kTwo{2};
inline int doMath(int arg) { return std::max(arg, kTwo); }
// std::max(const int&, const int&)
have undefined behaviour as soon as doMath() is defined in two different translation units (say by inclusion of some_math.h and subsequent use of doMath())?
Background
Consider the following example:
// constants.h
#pragma once
constexpr int kFoo{42};
// foo.h
#pragma once
#include "constants.h"
inline int foo(int arg) { return arg * kFoo; } // #1: kFoo not odr-used
// a.cpp
#include "foo.h"
int a() { return foo(1); } // foo odr-used
// b.cpp
#include "foo.h"
int b() { return foo(2); } // foo odr-used
compiled for C++14, particularly before inline variables and thus before constexpr variables were implicitly inline.
The inline function foo (which has external linkage) is odr-used in both translation units (TU) associated with a.cpp and b.cpp, say TU_a and TU_b, and shall thus be defined in both of these TU's ([basic.def.odr]/4).
[basic.def.odr]/6 covers the requirements for when such multiple definitions (different TU's) may appear, and particularly /6.1 and /6.2 is relevant in this context [emphasis mine]:
There can be more than one definition of a [...] inline function with external linkage [...] in a program
provided that each definition appears in a different translation unit,
and provided the definitions satisfy the following requirements. Given
such an entity named D defined in more than one translation unit, then
/6.1 each definition of D shall consist of the same sequence of tokens; and
/6.2 in each definition of D, corresponding names, looked up according to [basic.lookup], shall refer to an entity defined within
the definition of D, or shall refer to the same entity, after overload
resolution ([over.match]) and after matching of partial template
specialization ([temp.over]), except that a name can refer to a
non-volatile const object with internal or no linkage if the object
has the same literal type in all definitions of D, and the object is
initialized with a constant expression ([expr.const]), and the object
is not odr-used, and the object has the same value in all definitions
of D; and
...
If the definitions of D do not satisfy these requirements, then the behavior is undefined.
/6.1 is fulfilled.
/6.2 is fulfilled if kFoo in foo:
[OK] is const with internal linkage
[OK] is initialized with a constant expression
[OK] is of same literal type over all definitions of foo
[OK] has the same value in all definitions of foo
[??] is not odr-used.
I interpret 5 as particularly "not odr-used in the definition of foo"; this could arguably have been clearer in the wording. However if kFoo is odr-used (at least in the definition of foo) I interpret it as opening up for odr-violations and subsequent undefined behavior, due to violation of [basic.def.odr]/6.
Afaict [basic.def.odr]/3 governs whether kFoo is odr-used or not,
A variable x whose name appears as a potentially-evaluated expression ex is odr-used by ex unless applying the lvalue-to-rvalue conversion ([conv.lval]) to x yields a constant expression ([expr.const]) that does not invoke any non-trivial functions and, if x is an object, ex is an element of the set of potential results of an expression e, where either the lvalue-to-rvalue conversion ([conv.lval]) is applied to e, or e is a discarded-value expression (Clause [expr]). [...]
but I'm having a hard time understanding whether kFoo is considered odr-used if, e.g., its address is taken within the definition of foo, and whether taking its address outside of the definition of foo affects whether [basic.def.odr]/6.2 is fulfilled.
Further details
Particularly, consider if foo is defined as:
// #2
inline int foo(int arg) {
std::cout << "&kFoo in foo() = " << &kFoo << "\n";
return arg * kFoo;
}
and a() and b() are defined as:
int a() {
std::cout << "TU_a, &kFoo = " << &kFoo << "\n";
return foo(1);
}
int b() {
std::cout << "TU_b, &kFoo = " << &kFoo << "\n";
return foo(2);
}
then running a program which calls a() and b() in sequence produces:
TU_a, &kFoo = 0x401db8
&kFoo in foo() = 0x401db8 // <-- foo() in TU_a:
// &kFoo from TU_a
TU_b, &kFoo = 0x401dbc
&kFoo in foo() = 0x401db8 // <-- foo() in TU_b:
// !!! &kFoo from TU_a
namely the address of the TU-local kFoo when accessed from the different a() and b() functions, but pointing to the same kFoo address when accessed from foo().
Does this program (with foo and a/b defined as per this section) have undefined behaviour?
A real life example would be where these constexpr variables represent mathematical constants, and where they are used, from within the definition of an inline function, as arguments to utility math functions such as std::max(), which takes its arguments by reference.
In the OP's example with std::max, an ODR violation does indeed occur, and the program is ill-formed NDR. To avoid this issue, you might consider one of the following fixes:
give the doMath function internal linkage, or
move the declaration of kTwo inside doMath
A variable that is used by an expression is considered to be odr-used unless there is a certain kind of simple proof that the reference to the variable can be replaced by the compile-time constant value of the variable without changing the result of the expression. If such a simple proof exists, then the standard requires the compiler perform such a replacement; consequently the variable is not odr-used (in particular, it does not require a definition, and the issue described by the OP would be avoided because none of the translation units in which doMath is defined would actually reference a definition of kTwo). If the expression is too complicated, however, then all bets are off. The compiler might still replace the variable with its value, in which case the program may work as you expect; or the program may exhibit bugs or crash. That's the reality with IFNDR programs.
The case where the variable is immediately passed by reference to a function, with the reference binding directly, is one common case where the variable is used in a way that is too complicated and the compiler is not required to determine whether or not it may be replaced by its compile-time constant value. This is because doing so would necessarily require inspecting the definition of the function (such as std::max<int> in this example).
You can "help" the compiler by writing int(kTwo) and using that as the argument to std::max as opposed to kTwo itself; this prevents an odr-use since the lvalue-to-rvalue conversion is now immediately applied prior to calling the function. I don't think this is a great solution (I recommend one of the two solutions that I previously mentioned) but it has its uses (GoogleTest uses this in order to avoid introducing odr-uses in statements like EXPECT_EQ(2, kTwo)).
If you want to know more about how to understand the precise definition of odr-use, involving "potential results of an expression e...", that would be best addressed with a separate question.
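The options mentioned in this answer can be sketched as follows. This is only an illustration under the answer's assumptions (C++14, kTwo declared at namespace scope in a header); the function names doMathWithCopy, doMathWithLocal, doMathInternal and check_do_math are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Forced by some guideline abhorring literals (as in the question).
constexpr int kTwo{2};

// Workaround: int(kTwo) is a prvalue, so the lvalue-to-rvalue conversion is
// applied to kTwo immediately and std::max binds its reference parameter to
// a temporary instead of to kTwo itself.
inline int doMathWithCopy(int arg)
{
    return std::max(arg, int(kTwo));
}

// Fix: declare the constant inside the function, so the name refers to an
// entity defined within the definition of the inline function.
inline int doMathWithLocal(int arg)
{
    constexpr int kLocalTwo{2};
    return std::max(arg, kLocalTwo);
}

// Fix: internal linkage. Each translation unit gets its own function, so
// the multiple-definition rule for inline functions does not apply to it.
static int doMathInternal(int arg)
{
    return std::max(arg, kTwo);
}

// Hypothetical self-check covering all three variants.
inline void check_do_math()
{
    assert(doMathWithCopy(1) == 2);
    assert(doMathWithLocal(7) == 7);
    assert(doMathInternal(-3) == 2);
}
```

All three variants compute the same value; they differ only in whether the namespace-scope kTwo is odr-used inside an inline function with external linkage.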
Does a program where doMath() defined as follows: [...] have undefined behaviour as soon as doMath() is defined in two different translation units (say by inclusion of some_math.h and subsequent use of doMath())?
Yes; this particular issue was highlighted in LWG2888 and LWG2889 which were both resolved for C++17 by P0607R0 (Inline Variables for the Standard Library) [emphasis mine]:
2888. Variables of library tag types need to be inline variables
[...]
The variables of library tag types need to be inline variables.
Otherwise, using them in inline functions in multiple translation
units is an ODR violation.
Proposed change: Make piecewise_construct, allocator_arg, nullopt,
(the in_place_tags after they are made regular tags), defer_lock,
try_to_lock and adopt_lock inline.
[...]
[2017-03-12, post-Kona] Resolved by p0607r0.
2889. Mark constexpr global variables as inline
The C++ standard library provides many constexpr global variables.
These all create the risk of ODR violations for innocent user code.
This is especially bad for the new ExecutionPolicy algorithms, since
their constants are always passed by reference, so any use of those
algorithms from an inline function results in an ODR violation.
This can be avoided by marking the globals as inline.
Proposed change: Add inline specifier to: bind placeholders _1, _2,
..., nullopt, piecewise_construct, allocator_arg, ignore, seq, par,
par_unseq in
[...]
[2017-03-12, post-Kona] Resolved by p0607r0.
Thus, in C++14, prior to inline variables, this risk is present both for your own global variables as well as library ones.
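For comparison, here is a minimal sketch of how the same header might look once C++17 is available, assuming the constant can simply be marked inline. It requires a C++17 compiler mode; check_do_math is a hypothetical helper added for illustration.

```cpp
#include <algorithm>
#include <cassert>

// some_math.h (C++17 sketch): an inline variable is one entity shared by
// all translation units, so kTwo names the same object everywhere.
inline constexpr int kTwo{2};

// Binding kTwo to the reference parameters of std::max still odr-uses it,
// but every definition of doMath now refers to the same entity.
inline int doMath(int arg)
{
    return std::max(arg, kTwo);
}

// Hypothetical self-check.
inline void check_do_math()
{
    assert(doMath(1) == 2);
    assert(doMath(9) == 9);
}
```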

Does the usage of header only libraries with different versions result in UB

Let's assume I have a library somelib.a that is distributed as a binary by the package manager, and that this library makes use of the header-only library anotherlib.hpp.
If I now link my program against somelib.a, and also use anotherlib.hpp but with a different version, then this can result in UB, if somelib.a uses parts of the anotherlib.hpp in its include headers.
But what happens if somelib.a references anotherlib.hpp only in its .cpp files (so I don't know that it uses it)? Will the linking step between my application and somelib.a ensure that somelib.a and my application each use their own version of anotherlib.hpp?
The reason I ask is if I link the individual compilation units of my program to the final program, then the linker removes duplicate symbols (depending on if it is internal linkage or not). So a header only library is normally written in a way that removing duplicate symbols can be done.
A minimal example
somelib.a is built on a system with nlohmann/json.hpp version 3.2
somelib/somelib.h
#include <string>
namespace somelib {
struct config {
// some members
};
config read_configuration(const std::string &path);
}
somelib.cpp
#include <somelib/somelib.h>
#include <nlohmann/json.hpp>
#include <fstream>
namespace somelib {
config read_configuration(const std::string &path)
{
nlohmann::json j;
std::ifstream i(path);
i >> j;
config c;
// populate c based on j
return c;
}
}
The application is built on another system with nlohmann/json.hpp version 3.5. Versions 3.2 and 3.5 are not compatible, and the application is then linked against the somelib.a that was built on the system with version 3.2.
application.cpp
#include <somelib/somelib.h>
#include <nlohmann/json.hpp>
#include <fstream>
int main() {
auto c = somelib::read_configuration("config.json");
nlohmann::json j;
std::ifstream i("another.json");
i >> j;
return 0;
}
It hardly makes any difference that you are using a static library.
The C++ standard states that if a program contains multiple definitions of an inline function (or class template, or variable, etc.) and the definitions are not all the same, then you have UB.
Practically, it means that unless the changes between the 2 versions of the header library are very limited you will have UB.
For instance, if the only changes are whitespace changes, comments, or adding new symbols, then you will not have undefined behavior. However, if a single line of code in an existing function was changed, then it is UB.
From the C++17 final working draft (n4659.pdf):
6.2 One-definition rule
[...]
There can be more than one definition of a class type (Clause 12),
enumeration type (10.2), inline function with external linkage
(10.1.6), inline variable with external linkage (10.1.6), class
template (Clause 17), non-static function template (17.5.6), static
data member of a class template (17.5.1.3), member function of a class
template (17.5.1.1), or template specialization for which some
template parameters are not specified in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the
following requirements.
Given such an entity named D defined in more than one translation
unit, then
each definition of D shall consist of the same
sequence of tokens; and
in each definition of D, corresponding
names, looked up according to 6.4, shall refer to an entity defined
within the definition of D, or shall refer to the same entity, after
overload resolution (16.3) and after matching of partial template
specialization (17.8.3), except that a name can refer to
a non-volatile const object with internal or no linkage if the object
has the same literal type in all definitions of D,
is initialized with a constant expression (8.20),
is not odr-used in any definition of D, and
has the same value in all definitions of D,
or
a reference with internal or no linkage initialized with a constant expression
such that the reference refers to the same entity in all definitions
of D; and
in each definition of D, corresponding entities
shall have the same language linkage; and
in each definition
of D, the overloaded operators referred to, the implicit calls to
conversion functions, constructors, operator new functions and
operator delete functions, shall refer to the same function, or to a
function defined within the definition of D; and
in each definition of
D, a default argument used by an (implicit or explicit) function call
is treated as if its token sequence were present in the definition of
D; that is, the default argument is subject to the requirements
described in this paragraph (and, if the default argument has
subexpressions with default arguments, this requirement applies
recursively).
if D is a class with an implicitly-declared
constructor (15.1), it is as if the constructor was implicitly defined
in every translation unit where it is odr-used, and the implicit
definition in every translation unit shall call the same constructor
for a subobject of D.
If D is a template and is defined in more than one translation unit,
then the preceding requirements shall apply both to names from the
template’s enclosing scope used in the template definition (17.6.3),
and also to dependent names at the point of instantiation (17.6.2). If
the definitions of D satisfy all these requirements, then the behavior
is as if there were a single definition of D. If the definitions of D
do not satisfy these requirements, then the behavior is undefined.
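A technique that header-only libraries sometimes use to reduce this risk, which is not part of the answer above, is to wrap each release in a versioned inline namespace so that entities from different releases have different names. The sketch below is hypothetical: anotherlib, v3_2, v3_5 and default_indent are made-up names, and it says nothing about what any real library does. In a real build each release's header would mark its own version namespace inline; here only the newer one is inline so that both can sit in one file.

```cpp
#include <cassert>

// What the older release of the hypothetical header would declare.
namespace anotherlib {
namespace v3_2 {
inline int default_indent() { return 2; }
}
}

// What the newer release of the hypothetical header would declare.
// The inline namespace lets users keep writing anotherlib::default_indent().
namespace anotherlib {
inline namespace v3_5 {
inline int default_indent() { return 4; }
}
}

// Hypothetical self-check: the two functions are distinct entities, so
// neither definition has to match the other's token sequence.
inline void check_versioned_names()
{
    assert(anotherlib::default_indent() == 4);
    assert(anotherlib::v3_2::default_indent() == 2);
}
```

This only helps when the library author adopts it for every entity that changes between releases; it does not make mixing two arbitrary header versions safe.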

C++11: ill-formed calls are undefined behavior?

§ 14.6.4.2 from N3485 states the following about dependent candidate function lookup:
If the call would be ill-formed or would find a better match had the lookup within the associated namespaces considered all the function declarations with external linkage introduced in those namespaces in all translation units, not just considering those declarations found in the template definition and template instantiation contexts, then the program has undefined behavior.
What exactly does it mean for a call to be "ill-formed", and how would an ill-formed call be selected by the lookup? Also, why does it matter that a better match would be found if all translation units were considered?
What exactly does it mean for a call to be "ill-formed"
Formally, ill-formed is defined by [defns.ill.formed] as not well-formed, and a well-formed program is defined by [defns.well.formed] as:
C++ program constructed according to the syntax rules, diagnosable semantic rules, and the One Definition Rule (3.2).
So an ill-formed call is one with invalid syntax or a diagnosable error such as passing the wrong number of arguments, or arguments which cannot be converted to the parameter types, or an overload ambiguity.
how would an ill-formed call be selected by the lookup?
I think it's saying "if (the call would be ill-formed || would find a better match) had the lookup within the associated namespaces considered all the function declarations with external linkage ...", which means you have undefined behaviour if considering other functions would have found equal or better matches. Equally good matches would make the call ambiguous, i.e. ill-formed, and better matches would have resulted in a different function being called.
So if in another context the call would have been ambiguous or caused another sort of error, but succeeds due to only considering a limited set of names in the instantiation and definition contexts, it's undefined. And if in another context the call would have chosen a better match, that's also undefined.
Also, why does it matter that a better match would be found if all translation units were considered?
I think the reason for the rule is to disallow situations where instantiating the same template specialization in two different contexts results in it calling two different functions, e.g. if in one translation unit the call finds one function, and in another translation unit it finds a different function, you'll get two different instantiations of the same template, which violates the ODR, and only one instantiation will be kept by the linker, so the instantiation that's not kept by the linker will get replaced by one which calls a function that wasn't even visible where the template was instantiated.
That's similar to (if not already covered by) the last sentence of the previous paragraph:
A specialization for any template may have points of instantiation in multiple translation units. If two different points of instantiation give a template specialization different meanings according to the one definition rule (3.2), the program is ill-formed, no diagnostic required.
Page 426 of the C++ ARM (Ellis & Stroustrup) gives a bit of context for that text (and I believe for 14.6.4.2 as well) and explains it more concisely and clearly than I did above:
This would seem to imply that a global name used from within a template could be bound to different objects or functions in different compilation units or even at different points within a compilation unit. However, should that happen, the resulting template function or class is rendered illegal by the "one-definition" rule (§7.1.2).
There's another related formulation of the same rules in [basic.def.odr]/6
The problem is that namespaces can be defined piecemeal, so there is no one place that is guaranteed to define all of the members of a namespace. As a result, different translation units can see different sets of namespace members. What this section says is that if the part that isn't seen would affect lookup, the behavior is undefined. For example:
namespace mine {
void f(double);
}
void first() { mine::f(2); } // seems okay: only f(double) is visible here
namespace mine {
void f(char);
}
void second() { mine::f(2); } // ambiguous (int -> double vs. int -> char), therefore ill-formed
The rule says that the first call to f(2) produces undefined behavior because it would have been ill-formed if all of the overloads in mine had been visible at that point.
Building on @tletnes' partial answer, I think I've come up with a simple program that triggers this particular undefined behavior. Of course it uses multiple translation units.
cat >alpha.cc <<EOF
#include <stdio.h>
void customization_point(int,int) { puts("(int,int)"); }
#include "beta.h"
extern void gamma();
int main() {
beta(42);
gamma();
}
EOF
cat >gamma.cc <<EOF
#include <stdio.h>
void customization_point(int,double) { puts("(int,double)"); }
#include "beta.h"
void gamma() { beta(42); }
EOF
cat >beta.h <<EOF
template<typename T>
void beta(T t) {
customization_point(t, 3.14);
}
EOF
Compiling this program with different optimization levels changes its behavior. This is all right, according to the Standard, because the call in "alpha.cc" invokes undefined behavior.
$ clang++ alpha.cc gamma.cc -O1 -w ; ./a.out
(int,int)
(int,int)
$ clang++ alpha.cc gamma.cc -O2 -w ; ./a.out
(int,int)
(int,double)
When I read this rule I imagine the code similar to the following is at least part of what was being considered:
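#include <cstdio>
void foo(int a, int b){ printf("A"); }
int main(){
foo(1, 1.0); // only foo(int, int) is visible here
}
void foo(int a, double b){ printf("B"); } // would have been the better match
or
#include <cstdio>
void foo(int a);
int main(){
foo(1);
}
void foo(int a, double b){ printf("B"); }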

C++: Different classes with the same name in different translation units

Consider the following example:
// usedclass1.hpp
#include <iostream>
class UsedClass
{
public:
UsedClass() { }
void doit() { std::cout << "UsedClass 1 (" << this << ") doit hit" << std::endl; }
};
// usedclass2.hpp
#include <iostream>
class UsedClass
{
public:
UsedClass() { }
void doit() { std::cout << "UsedClass 2 (" << this << ") doit hit" << std::endl; }
};
// object.hpp
class Object
{
public:
Object();
};
// object.cpp
#include "object.hpp"
#include "usedclass2.hpp"
Object::Object()
{
UsedClass b;
b.doit();
}
// main.cpp
#include "usedclass1.hpp"
#include "object.hpp"
int main()
{
Object obj;
UsedClass a;
a.doit();
}
The code compiles without any compiler or linker errors, but the output looks strange to me:
gcc (Red Hat 4.6.1-9) on Fedora x86_64 with no optimization [EG1]:
UsedClass 1 (0x7fff0be4a6ff) doit hit
UsedClass 1 (0x7fff0be4a72e) doit hit
same as [EG1] but with -O2 option enabled [EG2]:
UsedClass 2 (0x7fffcef79fcf) doit hit
UsedClass 1 (0x7fffcef79fff) doit hit
msvc2005 (14.00.50727.762) on Windows XP 32bit with no optimization [EG3]:
UsedClass 1 (0012FF5B) doit hit
UsedClass 1 (0012FF67) doit hit
same as [EG3] but with /O2 (or /Ox) enabled [EG4]:
UsedClass 1 (0012FF73) doit hit
UsedClass 1 (0012FF7F) doit hit
I would expect either a linker error (assuming the ODR is violated) or the output in [EG2] (the code is inlined, nothing is exported from the translation unit, and the ODR holds). Thus my questions:
Why are outputs [EG1], [EG3], [EG4] possible?
Why do I get different results from different compilers or even from the same compiler? That makes me think that the standard somehow doesn't specify the behaviour in this case.
Thank you for any suggestions, comments and standard interpretations.
Update
I would like to understand the compiler's behaviour. More precisely, why there are no errors generated if the ODR is violated. A hypothesis is that since all functions in classes UsedClass1 and UsedClass2 are marked as inline (and therefore C++03 3.2 is not violated) the linker doesn't report errors, but in this case outputs [EG1], [EG3], [EG4] seem strange.
This is the rule that prohibits what you're doing (the C++11 wording), from section 3.2 of the Standard:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
in each definition of D, corresponding names, looked up according to 3.4, shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (13.3) and after matching of partial template specialization (14.8.3), except that a name can refer to a const object with internal or no linkage if the object has the same literal type in all definitions of D, and the object is initialized with a constant expression (5.19), and the value (but not the address) of the object is used, and the object has the same value in all definitions of D; and
in each definition of D, corresponding entities shall have the same language linkage; and
in each definition of D, the overloaded operators referred to, the implicit calls to conversion functions, constructors, operator new functions and operator delete functions, shall refer to the same function, or to a function defined within the definition of D; and
in each definition of D, a default argument used by an (implicit or explicit) function call is treated as if its token sequence were present in the definition of D; that is, the default argument is subject to the three requirements described above (and, if the default argument has sub-expressions with default arguments, this requirement applies recursively).
if D is a class with an implicitly-declared constructor (12.1), it is as if the constructor was implicitly defined in every translation unit where it is odr-used, and the implicit definition in every translation unit shall call the same constructor for a base class or a class member of D.
In your program, you're violating the ODR for class UsedClass because the tokens are different in different compilation units. You could fix that by moving the definition of UsedClass::doit() outside the class body, but the same rule applies to the body of inline functions.
Your program violates the One Definition Rule and invokes undefined behavior.
The standard does not mandate a diagnostic message if you break the ODR, but the behavior is undefined.
C++03 3.2 One definition rule
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template.
...
Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8). An inline function shall be defined in every translation unit in which it is used.
Further, the standard sets specific requirements for when multiple definitions of a symbol may exist; those are given in paragraph 5 of 3.2.
There can be more than one definition of a class type (clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (clause 14), non-static function template (14.5.5), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.4) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
— each definition of D shall consist of the same sequence of tokens; and
...
Why are outputs [EG1], [EG3], [EG4] possible?
The simple answer is that the behaviour is undefined, so anything is possible.
Most compilers handle an inline function by generating a copy in each translation unit in which it's defined; the linker then arbitrarily chooses one to include in the final program. This is why, with optimisations disabled, it calls the same function in both cases. With optimisations enabled, the function might be inlined by the compiler, in which case each inlined call will use the version defined in the current translation unit.
That makes me think that the standard somehow doesn't specify the behaviour in this case.
That's correct. Breaking the one definition rule gives undefined behaviour, and no diagnostic is required.