Inline constructors and One Definition Rule - c++

Consider the following source files:
1.cpp
#include <iostream>
using namespace std;
struct X
{
X()
{
cout << "1" << endl;
}
};
void bar();
void foo()
{
X x;
}
int main()
{
foo();
bar();
return 0;
}
2.cpp
#include <cstdio>
struct X
{
X()
{
printf("2\n");
}
};
void bar()
{
X x;
}
Is a program compiled from these files well-formed? What should its output be?
I expected either a linker error due to a violation of the One Definition Rule, or the output "1 2". However, it prints "1 1" when compiled with g++ 3.4 and VC 8.0.
How can this be explained?

This does violate the ODR (3.2): you may have more than one definition of an inline function, but those definitions must be identical (3.2/5). The result is undefined behavior, so anything may happen, and the compiler/linker is not required to diagnose it. The most likely reason for what you see is that inline functions are emitted so that duplicate copies are merged at link time rather than reported as conflicts. No link error is emitted, and one of the two constructors (here the one from 1.cpp) ends up serving both calls.

It is undefined behaviour (with no required diagnostic) if inlined functions (such as your class constructor) have different definitions in different translation units.
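As a rough single-file sketch of the usual remedy (giving each X its own scope), the two named namespaces below stand in for the unnamed namespace that each of 1.cpp and 2.cpp would use in a real project. The names file1 and file2 and the tag member are assumptions made for illustration; the constructors record a tag instead of printing so the result is easy to check.

```cpp
#include <string>

// Stand-in for 1.cpp. In a real project this would be an unnamed
// namespace, which gives X a distinct identity per translation unit.
namespace file1 {
struct X {
    std::string tag;
    X() : tag("1") {}
};
std::string foo() {
    X x;
    return x.tag;
}
}  // namespace file1

// Stand-in for 2.cpp, with its own, unrelated X.
namespace file2 {
struct X {
    std::string tag;
    X() : tag("2") {}
};
std::string bar() {
    X x;
    return x.tag;
}
}  // namespace file2
```

Calling file1::foo() and then file2::bar() yields "1" and "2", which is the output the question expected.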

Related

Static variable in an inline method across compilation units

Consider the following header file Sample.h:
#pragma once
template<typename T> class Sample {
public:
static T Method() {
static T var = T(0);
var++;
return var;
}
};
int U1Test();
int U2Test();
And 2 compilation units, U1.cpp:
#include "Sample.h"
int U1Test() { return Sample<int>::Method(); }
And U2.cpp:
#include "Sample.h"
int U2Test() { return Sample<int>::Method(); }
Then in another unit Main.cpp:
#include "Sample.h"
#include <iostream>
using namespace std;
int main() {
cout << U1Test() << endl;
cout << U2Test() << endl;
return 0;
}
When compiled, it gives me the following output:
1
2
But I'm not sure how the compiler does this, because the method is inline header-only and there is no compilation unit for it. So I would expect each compilation unit (like U1.cpp and U2.cpp) to receive its own copy of var because the method is inlined in that compilation unit.
Is there a subtle change that would make the variable separate in each compilation unit? I'm asking because code like this in a larger program seems to lead to crashes, so perhaps my reproducer is not enough (the reproducer works according to the C++ standard, AFAIK).
The compiler is g++ (conda-forge gcc 10.3.0-16) 10.3.0 on Ubuntu 20.04.
So I would expect each compilation unit (like U1.cpp and U2.cpp) to receive its own copy of var because the method is inlined in that compilation unit.
(The implicit) inline means that there may be multiple copies of the same function, but the linker will ignore all but one of them. Thus, all compilation units access that one function instantiation.
Is there a subtle change that would make the variable separate in each compilation unit?
Yes, make the function (freestanding) static rather than (implicitly) inline:
template<typename T>
static T Method() {
static T var = T(0);
var++;
return var;
}
Note that this function is not in a class.
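A single-file sketch of both behaviors, under stated assumptions: Sample mirrors the header from the question, while TaggedSample and the tag types U1Tag and U2Tag are hypothetical additions (not from the answer) that show another way to get separate counters while staying inside a class. Every distinct template argument list is a distinct instantiation, so each tag gets its own var.

```cpp
// Mirrors Sample.h: Method is implicitly inline, so the whole program
// shares one instance of `var` per T.
template <typename T>
class Sample {
public:
    static T Method() {
        static T var = T(0);
        var++;
        return var;
    }
};

// Hypothetical variant: the extra Tag parameter makes each tag a separate
// instantiation with its own `var`.
template <typename T, typename Tag>
class TaggedSample {
public:
    static T Method() {
        static T var = T(0);
        var++;
        return var;
    }
};

struct U1Tag {};
struct U2Tag {};

// Stand-ins for the functions in U1.cpp and U2.cpp.
int U1Test() { return Sample<int>::Method(); }
int U2Test() { return Sample<int>::Method(); }

int U1Tagged() { return TaggedSample<int, U1Tag>::Method(); }
int U2Tagged() { return TaggedSample<int, U2Tag>::Method(); }
```

Called once each in this order, U1Test() and U2Test() return 1 and 2, while U1Tagged() and U2Tagged() both return 1. In a multi-file program the tag types would themselves need to be unique per compilation unit for this to hold.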

C++ class name collision

For the following C++ code, I'm getting unexpected behavior. The behavior was verified with recent GCC, Clang and MSVC++. To trigger it, it is required to split the code among several files.
def.h
#pragma once
template<typename T>
struct Base
{
void call() {hook(data);}
virtual void hook(T& arg)=0;
T data;
};
foo.h
#pragma once
void foo();
foo.cc
#include "foo.h"
#include <iostream>
#include "def.h"
struct X : Base<int>
{
virtual void hook(int& arg) {std::cout << "foo " << arg << std::endl;}
};
void foo()
{
X x;
x.data=1;
x.call();
}
bar.h
#pragma once
void bar();
bar.cc
#include "bar.h"
#include <iostream>
#include "def.h"
struct X : Base<double>
{
virtual void hook(double& arg) {std::cout << "bar " << arg << std::endl;}
};
void bar()
{
X x;
x.data=1;
x.call();
}
main.cc
#include "foo.h"
#include "bar.h"
int main()
{
foo();
bar();
return 0;
}
Expected output:
foo 1
bar 1
Actual output:
bar 4.94066e-324
bar 1
What I expected to happen:
Inside of foo.cc, an instance of X defined within foo.cc is made and through calling call(), the implementation of hook() within foo.cc is called. Same for bar.
What actually happens:
An instance of X as defined in foo.cc is made in foo(). But when calling call, it dispatches not to hook() defined in foo.cc but to hook() defined in bar.cc. This leads to corruption, as the argument to hook is still an int, not a double.
The problem can be solved by putting the definition of X in foo.cc in a different namespace from the definition of X in bar.cc.
So finally the question: there is no compiler warning about this. Neither GCC, Clang, nor MSVC++ showed even a warning. Is that behavior valid per the C++ standard?
The situation seems a bit contrived, but it happened in a real-world scenario. I was writing tests with rapidcheck, where possible actions on a unit to be tested are defined as classes.
Most container classes have similar actions, so when writing tests for a queue and a vector, classes with names like "Clear", "Push" or "Pop" may come up several times. As these are only required locally, I've put them directly in the source files where the tests are executed.
The program is ill-formed, because it violates the One Definition Rule by having two different definitions of class X. So it is not a valid C++ program. Note that the standard specifically allows compilers not to diagnose this violation. So the compilers are conforming, but the program is not valid C++ and as such has undefined behaviour when executed (and thus anything can happen).
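The odd value 4.94066e-324 in the question's actual output fits the picture of hook(double&) being handed an int. It is the smallest positive subnormal double, whose bit pattern is the integer 1. The sketch below assumes an IEEE 754 64-bit double, and that the bytes next to the int happened to be zero on a little-endian machine; the helper name bits_to_double is hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret a 64-bit pattern as a double. memcpy is used so the
// sketch itself does not break the aliasing rules.
double bits_to_double(std::uint64_t bits) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "sketch assumes a 64-bit double");
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Under those assumptions, bits_to_double(1) compares equal to std::numeric_limits<double>::denorm_min(), which prints as 4.94066e-324 at the default stream precision.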
You have two identically named, but different, classes X in different compilation units, rendering the program ill-formed, as there are now two symbols with the same name. Since the problem can only be detected during linking, compilers are not able (and not required) to report this.
The only way to avoid this sort of thing is to put any code that is not meant to be exported (in particular, all code that has not been declared in a header file) into an anonymous (unnamed) namespace:
#include "foo.h"
#include <iostream>
#include "def.h"
namespace {
struct X : Base<int>
{
virtual void hook(int& arg) {std::cout << "foo " << arg << std::endl;}
};
}
void foo()
{
X x;
x.data=1;
x.call();
}
and equivalently for bar.cc. In fact, this is the main (sole?) purpose of unnamed namespaces.
Simply re-naming your classes (e.g. fooX and barX) may work for you in practice, but is not a stable solution, because there is no guarantee that these symbol names are not used by some obscure third-party library loaded at link- or run-time (now or at some point in the future).

"vtable" linker error (involving a virtual destructor with "=default") - potential bug in Clang 3.1?

I'm getting a linker error in my code. I have pinpointed it down to the bare essentials below.
This code gives the linker error "vtable for Foo", referenced from: Foo::Foo()
class Foo {
public:
Foo();
virtual ~Foo() = default;
};
Foo::Foo() { }
But this code doesn't give any errors:
class Foo {
public:
Foo();
virtual ~Foo() { }
};
Foo::Foo() { }
Why? I thought the = default was supposed to basically do the same thing as those empty braces.
Update: I'm using the "Apple LLVM compiler 4.1", a part of Xcode 4.5.2. Could it be a bug in this compiler? It may possibly work on the latest GCC (which Apple isn't shipping anymore though). See comments below for a discussion on compilers.
Update 2: As discussed below, changing the line to virtual inline ~Foo() = default; gets rid of this error. Doesn't this simply have to be a bug? Looks like the compiler doesn't recognize an inline function in this case without explicitly writing out inline.
It appears to be a bug in clang which has been fixed already. You asked at a good time, as a new release should be coming shortly: release candidates are already available. Please give them a try, your example works in the i386-linux binary release, and should work in all of them.
In the Itanium ABI, the v-table (and other RTTI information) is emitted for the translation unit containing the definition of the first virtual method not defined inline in the class, or if there are only virtual methods defined inline, for every translation unit that includes the class. It's then up to the linker to merge the redundant symbols.
It is possible that by specifying = default, Clang does not realize that you have defined the virtual method inline in the class and that each and every TU that includes your file should define the v-table and RTTI info, and instead is waiting for the definition to appear somewhere.
May I suggest only declaring the destructor in the class (virtual ~Foo();) and putting the definition outside the class? => Foo::~Foo() = default;
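A minimal sketch of that suggestion: the destructor is only declared in the class and defaulted out of line, which makes it a non-inline virtual function. Under the Itanium ABI rule described above, the v-table is then emitted in the translation unit that contains that definition. Derived and destroys_through_base are hypothetical names added only so the effect can be checked.

```cpp
class Foo {
public:
    Foo();
    virtual ~Foo();  // declared here, defined below
};

Foo::Foo() {}
Foo::~Foo() = default;  // out-of-line, non-inline defaulted definition

// Hypothetical helper type: records when its destructor runs.
struct Derived : Foo {
    bool* destroyed;
    explicit Derived(bool* flag) : destroyed(flag) {}
    ~Derived() override { *destroyed = true; }
};

// Deleting through a Foo* must reach ~Derived via the v-table.
bool destroys_through_base() {
    bool destroyed = false;
    Foo* p = new Derived(&destroyed);
    delete p;
    return destroyed;
}
```

In the three-file layout from the other answer, the Foo::~Foo() = default; line would go in Foo.cpp next to the constructor.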
It works for me with g++ 4.7.2. But I have the same problem as you with clang 3.1.
I have 3 files.
Foo.h:
#ifndef FOO_H
#define FOO_H
class Foo {
public:
Foo();
virtual ~Foo() = default;
};
#endif // FOO_H
Foo.cpp:
#include "Foo.h"
Foo::Foo() { }
main.cpp:
#include <iostream>
#include "Foo.h"
using namespace std;
int main()
{
Foo foo;
return 0;
}
But if it is like this, it works with clang as well:
Foo.cpp is empty.
main.cpp
#include <iostream>
#include "Foo.h"
using namespace std;
Foo::Foo() { }
int main()
{
Foo foo;
return 0;
}
So I guess clang has a bug when generating the object file.

Instantiation of function object with different inline function definitions depends on order of linkage

Please help me understand the root cause of the following behaviour.
In file a.cpp I have:
namespace NS {
struct Obj {
void pong(){ cout << "X in " __FILE__ << endl; }
double k;
};
X::X() { Obj obj; obj.pong(); }
void X::operator()() { cout << "X says hello" << endl; }
}
In file b.cpp I have:
namespace NS {
struct Obj {
void pong(){ cout << "Y in " __FILE__ << endl; }
bool m;
};
Y::Y() { Obj obj; obj.pong(); }
void Y::operator()() { cout << "Y says hello" << endl; }
}
My main creates an X, an Y and calls their operator()s:
int main( int argc, char *argv[] )
{
NS::X x;
x();
NS::Y y;
y();
return 0;
}
The output of this program depends on whether a.cpp or b.cpp gets compiled first: in the first case the Obj from a.cpp is instantiated also within NS::Y's constructor, in the second case the Obj from b.cpp is instantiated in both NS::X and NS::Y.
% g++ b.cpp a.cpp main.cpp
% ./a.out
X in a.cpp
X says hello
Y in b.cpp
Y says hello
% g++ b.cpp a.cpp main.cpp
% ./a.out
Y in b.cpp
X says hello
Y in b.cpp
Y says hello
No warnings from the linker either on Linux or Visual Studio (2005). If I define Obj::pong() outside the declaration of the struct I get a linker error telling me that the Obj::pong function is multiply defined.
I experimented a bit further and found that the cause must be related to whether or not the function is inlined, because if I compile with -O3, each object uses the Obj from its own translation unit.
So then the question changes to: what happens to the second definition of the inline function during non-optimized compilation? Are they silently ignored?
This is undefined behavior: your two class definitions define the same class type, so they have to be identical. For the linker, it means it can choose an arbitrary definition as the one that gets emitted.
If you want them to be separate types, you have to nest them in an unnamed namespace. This will cause anything in that namespace to be unique to that translation unit:
namespace NS {
namespace {
struct Obj {
void pong(){ cout << "Y in " __FILE__ << endl; }
bool m;
};
}
Y::Y() { Obj obj; obj.pong(); }
void Y::operator()() { cout << "Y says hello" << endl; }
}
So then the question changes to: what happens to the second definition of the inline function during non-optimized compilation? Are they silently ignored?
Yes, for inline functions (functions defined within class definitions are inline, even if not explicitly declared inline), the same principle applies: They can be defined multiple times in the program, and the program behaves as if it was defined only once. To the linker it means again it can discard all but one definition. Which one it chooses is unspecified.
The linker deals with mangled names.
Please have a look here: http://en.wikipedia.org/wiki/Name_mangling
So, as Johannes said, the behaviour is undefined, but details may be clarified:
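If pong() is defined outside the struct (without the inline keyword), it is an ordinary non-inline function. Both object files then export a strong definition under the same mangled name, and the linker correctly complains.
But if pong() is defined inside the struct, it is implicitly inline. Each translation unit emits its copy as a mergeable (weak or COMDAT) symbol with the same mangled name, so, as you have figured out, the linker does not complain. It just keeps one symbol.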
That's it.
I think, it is not specified and is implementation-specific for any compiler/linker.

several definitions of the same class

Playing around with MSVC++ 2005, I noticed that if the same class is defined several times, the program still happily links, even at the highest warning level. I find it surprising; how come this is not an error?
module_a.cpp:
#include <iostream>
struct Foo {
const char * Bar() { return "MODULE_A"; }
};
void TestA() { std::cout << "TestA: " << Foo().Bar() << std::endl; }
module_b.cpp:
#include <iostream>
struct Foo {
const char * Bar() { return "MODULE_B"; }
};
void TestB() { std::cout << "TestB: " << Foo().Bar() << std::endl; }
main.cpp:
void TestA();
void TestB();
int main() {
TestA();
TestB();
}
And the output is:
TestA: MODULE_A
TestB: MODULE_A
It is an error - the code breaks the C++ One Definition Rule. If you do that, the standard says you get undefined behaviour.
The code links, because if you had:
struct Foo {
const char * Bar() { return "MODULE_B"; }
};
in both modules there would NOT be an ODR violation - after all, this is basically what #including a header does. The violation arises because your definitions are different (the other one contains the string "MODULE_A"), but there is no way for the linker (which just looks at class/function names) to detect this.
The compiler might consider that the object is useless besides its use in the Test#() function and hence inline the whole thing. That way, the linker would never see that either class even existed! Just an idea, though.
Or somehow, linking between TestA and class Foo[#] would be done inside compilation. There would be a conflict if the linker was looking for class Foo (multiple definition), but the linker simply does not look for it!
Do you get linker errors when compiling in debug mode with no optimizations enabled?