When should linkers generate multiply defined X warnings? - c++

Never turn your back on C++. It'll getcha.
I'm in the habit of writing unit tests for everything I do. As part of this I frequently define classes with names like A and B in the .cxx of the test to exercise code, safe in the knowledge that i) because this code never becomes part of a library or is used outside of the test, name collisions are likely very rare and ii) the worst that could happen is that the linker will complain about a multiply defined A::A() or whatever and I'll fix that error. How wrong I was.
Here are two compilation units:
#include <iostream>
using namespace std;
// Fwd decl.
void runSecondUnit();
class A {
public:
A() : version( 1 ) {
cerr << this << " A::A() --- 1\n";
}
virtual ~A() {
cerr << this << " A::~A() --- 1\n";
}
int version;
};
void runFirstUnit() {
A a;
// Reports 1, correctly.
cerr << " a.version = " << a.version << endl;
// If you uncomment these, you will call
// secondCompileUnit: A::getName() instead of A::~A !
//A* a2 = new A;
//delete a2;
}
int main( int argc, char** argv ) {
cerr << "firstUnit BEGIN\n";
runFirstUnit();
cerr << "firstUnit END\n";
cerr << "secondUnit BEGIN\n";
runSecondUnit();
cerr << "secondUnit END\n";
}
and
#include <iostream>
using namespace std;
void runSecondUnit();
// Uncomment to fix all the errors:
//#define USE_NAMESPACE
#if defined( USE_NAMESPACE )
namespace mySpace
{
#endif
class A {
public:
A() : version( 2 ) {
cerr << this << " A::A() --- 2\n";
}
virtual const char* getName() const {
cerr << this << " A::getName() --- 2\n"; return "A";
}
virtual ~A() {
cerr << this << " A::~A() --- 2\n";
}
int version;
};
#if defined(USE_NAMESPACE )
} // mySpace
using namespace mySpace;
#endif
void runSecondUnit() {
A a;
// Reports 1. Not 2 as above!
cerr << " a.version = " << a.version << endl;
cerr << " a.getName()=='" << a.getName() << "'\n";
}
Ok, ok. Obviously I shouldn't have declared two classes called A. My bad. But I bet you can't guess what happens next...
I compiled each unit, and linked the two object files (successfully) and ran. Hmm...
Here's the output (g++ 4.3.3):
firstUnit BEGIN
0x7fff0a318300 A::A() --- 1
a.version = 1
0x7fff0a318300 A::~A() --- 1
firstUnit END
secondUnit BEGIN
0x7fff0a318300 A::A() --- 1
a.version = 1
0x7fff0a318300 A::getName() --- 2
a.getName()=='A'
0x7fff0a318300 A::~A() --- 1
secondUnit END
So there are two separate A classes. In the second use, the constructor and destructor of the first one were used, even though only the second one was visible in its compilation unit. Even more bizarre, if I uncomment the lines in runFirstUnit, instead of calling either A::~A, A::getName is called. Clearly in the first use, the object gets the vtable for the second definition (getName is the second virtual function in the second class, the destructor the second in the first). And it even correctly gets the constructor from the first.
So my question is, why didn't the linker complain about the multiply defined symbols?
It appears to choose the first match. Reordering the objects in the link step confirms this.
The behavior is identical in Visual Studio, so I'm guessing that this is some standard-defined behavior. My question is, why? Clearly it would be easy for the linker to barf given the duplicate names.
If I add,
void f() {}
to both files it complains. Why not for my class constructors and destructors?
EDIT The problem isn't, "what should I have done to avoid this", or "how is the behavior explained". It is, "why don't linkers catch it?" Projects may have thousands of compile units. Sensible naming practices don't really solve this issue -- they only make the problem obscure and only then if you can train everyone to follow them.
The above example leads to ambiguous behavior that is easily and definitively solvable by compiler tools. So, why do they not? Is this simply a bug? (I suspect not.)
** EDIT ** See litb's answer below. I'm repeating it back to make sure my understanding's right:
Linkers only generate multiply-defined warnings for strong symbols.
Because we have shared headers, inline function definitions (i.e. where declaration and definition are made at the same place, or template functions) are compiled into the object file of every TU that sees them. Because there's no easy way to restrict the generation of this code to a single object file, the linker has the job of choosing one of many definitions. So that the linker does not generate errors, the symbols for these compiled definitions are tagged as weak symbols in the object file.

The compiler and linker rely on both classes being exactly the same. In your case, they are different and so strange things happen. The one definition rule says that the result is undefined behavior, so behavior is not at all required to be consistent among compilers. I suspect that in runFirstUnit, in the delete line, it puts a call to the first virtual table entry (because in its translation unit, the destructor may occupy the first entry).
In the second translation unit, this entry happens to point to A::getName, but in the first translation unit (where you execute the delete), the entry points to A::~A. Since these two are differently named (A::~A vs A::getName) you don't get a name clash (you will have code emitted for both the destructor and getName). But since their class name is the same, their v-tables will clash on purpose: because both classes have the same name, the linker will think they are the same class and assume the same contents.
Notice that all member functions were defined in-class, which means they are all inline functions. These functions can be defined multiple times in a program. In the case of in-class definitions, the rationale is that you may include the same class definition into different translation units from their header files. Your test function, however, isn't an inline function and thus including it into different translation units will trigger a linker error.
If you enable namespaces, there will be no clash whatsoever, because ::A and ::mySpace::A are different classes, and of course will get different v-tables.
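As a rough single-file sketch of that last point, the two class definitions from the question can live side by side once one of them sits in mySpace. The helper checkBothVersions() is a hypothetical name added for illustration, and the stream output of the original classes is left out to keep the sketch short.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Stand-in for the class from the first compilation unit (global namespace).
class A {
public:
    A() : version(1) {}
    virtual ~A() {}
    int version;
};

// Stand-in for the class from the second unit, built with USE_NAMESPACE.
namespace mySpace {
class A {
public:
    A() : version(2) {}
    virtual const char* getName() const { return "A"; }
    virtual ~A() {}
    int version;
};
}  // namespace mySpace

// ::A and ::mySpace::A are unrelated types, so nothing is merged between them.
static_assert(!std::is_same< ::A, mySpace::A>::value,
              "the namespace makes these distinct classes");

// Hypothetical helper: uses both classes, including deletion through a
// pointer, which dispatches through each class's own virtual table.
void checkBothVersions() {
    ::A* first = new ::A;
    mySpace::A* second = new mySpace::A;
    assert(first->version == 1);
    assert(second->version == 2);
    assert(std::string(second->getName()) == "A");
    delete first;
    delete second;
}
```

Because the qualified names differ, each class gets its own constructor, destructor and v-table symbols, so there is nothing for the linker to fold together.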

A simple way to restrict each class to the current translation unit is to enclose it in an anonymous namespace:
// a.cpp
namespace {
class A {
// ...
};
}
// b.cpp
namespace {
class A {
// ...
};
}
This is perfectly legal. Because the two classes are in separate translation units, and are inside anonymous namespaces, they won't conflict.

The functions are defined as inline. inline functions can be defined multiple times in the program. See point 3 in the summary here:
http://en.wikipedia.org/wiki/One_Definition_Rule
The important point is:
For a given entity, each definition must be the same.
Try not defining the functions as inline. The linker should start to give duplicate symbol errors then.
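A minimal sketch of what "not inline" could look like for the class in the question: the members are only declared in the class and defined outside it without the inline keyword. The helper checkVersion() is a hypothetical name used only to exercise the class, and the stream output is omitted.

```cpp
#include <cassert>

class A {
public:
    A();            // declared only, so no longer implicitly inline
    virtual ~A();
    int version;
};

// Out-of-class definitions without `inline` are ordinary external
// definitions; a program may contain only one of each.
A::A() : version(1) {}
A::~A() {}

// Hypothetical helper that exercises the class.
void checkVersion() {
    A a;
    assert(a.version == 1);
}
```

If a second translation unit supplied its own non-inline A::A() and A::~A() in the same way, the link step would be expected to fail with duplicate symbol errors, much like the f() example in the question.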


Wrong symbols linked. Why? [duplicate]

This question already has answers here:
C++ Multiple Definition of Struct
(2 answers)
Why is there no multiple definition error when you define a class in a header file?
(3 answers)
Closed 1 year ago.
The C++ compiler seems to use the correctly declared struct of the same name in each file, but then the linker mismatches them without any warning or error! This also leads to UB because, at the very least, the wrong ctor/dtor are used for the memory region.
Here is minimal sandbox code. Each struct Test should be treated as some internal non-public structure used only in one own .cpp file.
file1.cpp
#include <iostream>
using namespace std;
void someFunc();
struct Test
{
Test() { std::cout << "1 "; }
~Test() { std::cout << "~1" << std::endl; }
};
int main()
{
{
Test test;
}
someFunc();
return 0;
}
file2.cpp
#include <iostream>
struct Test {
Test() { std::cout << "2 "; }
~Test() { std::cout << "~2" << std::endl; }
};
void someFunc() {
Test test;
}
(Downloadable and buildable CMake-project just in case: https://file.io/dzafv409B2t0)
Output will be:
1 ~1
1 ~1
So, I expected:
Successful build with output: 1 ~1 2 ~2
Or failed build with multiple definition error
Yes, I can resolve the problem if I:
Rename the struct
Put the struct into anonymous namespace - force internal linkage
...but this doesn't answer the main question:
Why does the linker behave this way? Why does it silently link to the first available matching symbol (among several) instead of reporting a multiple definition error?
Update: As I understand it, this mechanism allows a header with a class declaration (with inline code) to be included in several different source files without a multiple definition problem.
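For reference, the anonymous-namespace workaround mentioned above might look roughly like this for file2.cpp. As an assumption made purely so the output can be checked, the struct writes to a stream passed in by the caller instead of std::cout, and checkSomeFunc() is a hypothetical helper.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

namespace {
// Internal linkage: this Test cannot be merged with a Test from another file.
struct Test {
    explicit Test(std::ostream& out) : out_(out) { out_ << "2 "; }
    ~Test() { out_ << "~2\n"; }
    std::ostream& out_;
};
}  // namespace

// Still externally visible, but it now uses this file's own Test.
void someFunc(std::ostream& out) {
    Test test(out);
}

// Hypothetical helper: captures what someFunc writes and checks it.
void checkSomeFunc() {
    std::ostringstream captured;
    someFunc(captured);
    assert(captured.str() == "2 ~2\n");
}
```

With the struct in an unnamed namespace, its constructor and destructor are local to the translation unit, so the linker has no same-named definition from file1.cpp to substitute.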

Why can I not define member functions that have been forward declared in the same file?

I am working on a project, and I have used Xcode in the past, and it has been "acting up" lately (might be me). The following code is a test code for this question (not my project).
(Assume that all lexical/preprocessing/namespace directives are all there.)
In Foo.hpp
class Foo {
public:
Foo();
};
Foo::Foo() {
cout << "constructive" << endl;
}
Now, if I run a main that constructs a Foo object, it gives a linker error of duplicate symbol. How should I fix this?
The quick and dirty fix is to either write
inline Foo::Foo(){
or fully define the function in the class definition:
public:
Foo(){cout << "constructive" << endl;}
The better fix is to ensure that the constructor definition is only compiled in exactly one translation unit; i.e. put it in a source file.
You need to declare the function as inline:
class Foo {
public:
inline Foo();
};
or put it in a .cpp file, to ensure that it is defined in only one translation unit:
// foo.cpp
Foo::Foo() {
cout << "constructive" << endl;
}
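Put together, the inline variant of the header could look something like the sketch below. The include guard name FOO_HPP, the stream parameter on the constructor, and the helper checkFoo() are assumptions added here so the example is self-contained and checkable; the original constructor takes no arguments and prints to cout.

```cpp
#ifndef FOO_HPP
#define FOO_HPP

#include <cassert>
#include <ostream>
#include <sstream>

class Foo {
public:
    explicit Foo(std::ostream& out);
};

// `inline` allows this definition to appear in every translation unit
// that includes the header, provided all copies are identical.
inline Foo::Foo(std::ostream& out) {
    out << "constructive" << '\n';
}

// Hypothetical helper: constructs a Foo and checks what it printed.
inline void checkFoo() {
    std::ostringstream captured;
    Foo foo(captured);
    assert(captured.str() == "constructive\n");
}

#endif  // FOO_HPP
```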
Oh, looks like I found the answer. All I did is delete Foo.cpp and it solved the problem. Also, I could have put the definitions in the .cpp file, but sending that to people isn't the best.
My new question is why does this work?

The same function in two different CPP files. How do I accomplish this?

For my homework, this is my assignment:
Create 5 files. Driver.cpp, f.h, f.cpp, g.h, g.cpp. f and g should implement a function called hello. Driver should
call hello from f and g.
Example Output:
hello from f
hello from g
Press any key to continue . . .
I have all these files created, but what I don't understand is how the same function hello() can exist in two files and be called from the driver.cpp file. Any help would be greatly appreciated!
edit: The error I get is "fatal error LNK1169: one or more multiply defined symbols found". This is referring to the two hello() functions. How do I fix this?
Globally visible entities are allowed to have only one definition. Thus, you can't have the same function hello() defined in multiple translation units. There are a few separate approaches to defining equally named functions multiple times:
Overloaded functions can have the same name as long as they differ in their arguments in some way. For example, you could have each of the hello() functions take an argument which differs between the different versions (note: I'm not suggesting this approach). For example:
void hello(bool) { std::cout << "hello(bool)\n"; }
void hello(int) { std::cout << "hello(int)\n"; }
You can define the names in different namespaces. This makes the fully qualified name actually different, i.e., the conflict is prevented by just using a different scope, e.g.:
namespace a { void hello() { std::cout << "a::hello()\n"; } }
namespace b { void hello() { std::cout << "b::hello()\n"; } }
Assuming you call your function from a function in the local file, you can move the function from being globally visible to being only locally visible using the static keyword. Functions with local visibility do not conflict between different translation units. For example:
// file a.cpp
static void hello() { std::cout << "a.cpp:hello()\n"; }
void a() { hello(); }
// file b.cpp
static void hello() { std::cout << "b.cpp:hello()\n"; }
void b() { hello(); }
Which of these versions your teacher is actually after, I don't know. Each one has its use, though, and it is useful to know the different variations.
In case someone claims that for completeness I should have included virtual functions: note that overriding a function is actually also creating an overload (the virtual function in the base and the overriding function differ in the implicitly passed object), i.e., the use of virtual functions is already covered.
You should use namespaces:
In f.h:
namespace mynamespace {
void hello();
}
In f.cpp
void mynamespace::hello()
{
// ... function definition here
}
In main()
int main()
{
mynamespace :: hello(); // calls hello defined in f.cpp
}
For a good introduction to namespaces, see Namespaces.
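Applied to the assignment, the namespace approach might be sketched as follows. Everything is collapsed into one file here; in the real project the declarations would go in f.h and g.h and the definitions in f.cpp and g.cpp. The namespace names f and g, the std::string return type, and runDriver() are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Would be declared in f.h and defined in f.cpp.
namespace f {
std::string hello() { return "hello from f"; }
}  // namespace f

// Would be declared in g.h and defined in g.cpp.
namespace g {
std::string hello() { return "hello from g"; }
}  // namespace g

// Stand-in for Driver.cpp: calls both functions by qualified name.
std::string runDriver() {
    return f::hello() + "\n" + g::hello() + "\n";
}

// Hypothetical helper: checks the combined text the driver would print.
void checkDriver() {
    assert(runDriver() == "hello from f\nhello from g\n");
}
```

Since f::hello and g::hello are different qualified names, the linker sees two distinct symbols and the multiply defined symbol error should not arise.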

same class, different size...?

See the code, and then you will understand what I'm confused about.
Test.h
class Test {
public:
#ifndef HIDE_VARIABLE
int m_Test[10];
#endif
};
Aho.h
class Test;
int GetSizeA();
Test* GetNewTestA();
Aho.cpp
//#define HIDE_VARIABLE
#include "Test.h"
#include "Aho.h"
int GetSizeA() { return sizeof(Test); }
Test* GetNewTestA() { return new Test(); }
Bho.h
class Test;
int GetSizeB();
Test* GetNewTestB();
Bho.cpp
#define HIDE_VARIABLE // important!
#include "Test.h"
#include "Bho.h"
int GetSizeB() { return sizeof(Test); }
Test* GetNewTestB() { return new Test(); }
TestPrj.cpp
#include <iostream>
#include <tchar.h>
#include "Aho.h"
#include "Bho.h"
#include "Test.h"
int _tmain(int argc, _TCHAR* argv[]) {
int a = GetSizeA();
int b = GetSizeB();
Test* pA = GetNewTestA();
Test* pB = GetNewTestB();
pA->m_Test[0] = 1;
pB->m_Test[0] = 1;
// output : 40 1
std::cout << a << '\t' << b << std::endl;
char temp;
std::cin >> temp;
return 0;
}
Aho.cpp does not #define HIDE_VARIABLE, so GetSizeA() returns 40, but
Bho.cpp does #define HIDE_VARIABLE, so GetSizeB() returns 1.
But, Test* pA and Test* pB both have member variable m_Test[].
If the size of class Test from Bho.cpp is 1, then pB is weird, isn't it?
I don't understand what's going on, please let me know.
Thanks, in advance.
Environment:
Microsoft Visual Studio 2005 SP1 (or SP2?)
You violated the requirements of One Definition Rule (ODR). The behavior of your program is undefined. That's the only thing that's going on here.
According to ODR, classes with external linkage have to be defined identically in all translation units.
Your code exhibits undefined behavior. You are violating the one definition rule (class Test is defined differently in two places). Therefore the compiler is allowed to do whatever it wants, including "weird" behavior.
In addition to ODR.
Most of the grief is caused by including the headers only in the cpp files, allowing you to change the definition between the compilation units.
But, Test* pA and Test* pB both have member variable m_Test[].
No, pB doesn't have m_Test[]; however, the TestPrj compilation unit doesn't know that and applies the wrong class layout, so it compiles.
Unless you compile in debug mode with memory overrun detection, most of the time you would not see a problem.
pB->m_Test[9] = 1; would write to memory that was not allocated for pB, which may or may not be valid space for you to write to.
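The size mismatch can be made visible without violating the ODR by giving the two layouts different names. In this sketch TestVisible and TestHidden are hypothetical stand-ins for Test as seen without and with HIDE_VARIABLE, and the helper functions are likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Layout of Test when HIDE_VARIABLE is not defined (Aho.cpp, TestPrj.cpp).
struct TestVisible {
    int m_Test[10];
};

// Layout of Test when HIDE_VARIABLE is defined (Bho.cpp).
struct TestHidden {
};

// Offset one past the last byte that a write to m_Test[9] touches,
// measured from the start of the object under the visible layout.
std::size_t endOfLastElement() {
    return offsetof(TestVisible, m_Test) + 10 * sizeof(int);
}

// True when that write would land beyond an object that was allocated
// with the hidden (empty) layout.
bool writeWouldOverrun() {
    return endOfLastElement() > sizeof(TestHidden);
}

// Hypothetical helper that checks the relationship described above.
void checkLayouts() {
    assert(sizeof(TestVisible) >= 10 * sizeof(int));
    assert(writeWouldOverrun());
}
```

This is the situation for pB in the question: the object was allocated by code that saw the empty layout, while the caller indexes into it using the larger one.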
As many people have said here, you've violated the so-called One Definition Rule (ODR).
It's important to realize how C/C++ programs are assembled. The translation units (cpp files) are compiled separately, without any connection to each other. Next, the linker assembles the executable from the symbols and the pieces of code generated by the compiler. It doesn't have any high-level type information, hence it is unable to detect the problem (and is not expected to).
So you've actually cheated the compiler, beaten yourself, shot yourself in the foot, whatever you like.
One point that I'd like to mention is that the ODR is actually violated very frequently due to subtle changes in the various included header files and miscellaneous defines, but usually there's no problem and people don't even realize it.
For instance, a structure may have a member of LPCTSTR type, which is a pointer to either char or wchar_t, depending on the defines, includes, etc. But this type of violation is "almost ok". As long as you don't actually use this member in differently compiled translation units there's no problem.
There're also many other common examples. Some arise from the in-class implemented member functions (inlined), which actually compile into different code within different translation units (due to different compiler options for different translation units for instance).
However this is usually ok. In your case however the memory layout of the structure has changed. And here we have a real problem.

several definitions of the same class

Playing around with MSVC++ 2005, I noticed that if the same class is defined several times, the program still happily links, even at the highest warning level. I find it surprising; how come this is not an error?
module_a.cpp:
#include <iostream>
struct Foo {
const char * Bar() { return "MODULE_A"; }
};
void TestA() { std::cout << "TestA: " << Foo().Bar() << std::endl; }
module_b.cpp:
#include <iostream>
struct Foo {
const char * Bar() { return "MODULE_B"; }
};
void TestB() { std::cout << "TestB: " << Foo().Bar() << std::endl; }
main.cpp:
void TestA();
void TestB();
int main() {
TestA();
TestB();
}
And the output is:
TestA: MODULE_A
TestB: MODULE_A
It is an error - the code breaks the C++ One Definition Rule. If you do that, the standard says you get undefined behaviour.
The code links, because if you had:
struct Foo {
const char * Bar() { return "MODULE_B"; }
};
in both modules there would NOT be an ODR violation - after all, this is basically what #including a header does. The violation comes because your definitions are different (the other one contains the string "MODULE_A") but there is no way for the linker (which just looks at class/function names) to detect this.
The compiler might consider that the object is useless apart from its use in the Test#() function and hence inline the whole thing. That way, the linker would never see that either class even existed! Just an idea, though.
Or somehow, the linking between TestA and class Foo[#] would be done during compilation. There would be a conflict if the linker was looking for class Foo (multiple definition), but the linker simply does not look for it!
Do you get linking errors if you compile in debug mode with no optimizations enabled?