Understanding what causes this multiple definition error - c++

I have a base class with a pure virtual method implemented by two classes:
// base_class.hpp
class base_class {
public:
virtual std::string hello() = 0;
};
// base_implementer_1.hpp
class base_implementer1 : base_class {
public:
std::string hello();
};
// base_implementer_2.hpp
class base_implementer2 : base_class {
public:
std::string hello();
};
// base_implementer_1.cpp
std::string hello() {
return(std::string("Hello!"));
}
// base_implementer_2.cpp
std::string hello() {
return(std::string("Hola!"));
}
Note the lack of base_implementer1:: and base_implementer2:: in the implementations. This is deliberate.
By adding in the base_implementer1:: and base_implementer2:: I do not get a multiple definition error. However, leaving them off the linker complains I have two definitions of the same function (hello()).
Since these two implementations are not featured in the header files, I would think that (even though they are not correct in terms of ACTUALLY implementing hello()) they would be allowed since there's no reason you couldn't have two hello() functions in two distinct .cpp files. But this doesn't seem to be the case. Can anyone tell me what's happening in the linker to make this multiple definition error happen?

One-Definition-Rule defines rules for two scopes, i.e. translation unit scope and program scope.
The following rule with translation unit scope states that the same translation unit must not comprise two different definitions of the same function:
Only one definition of any variable, function, class type, enumeration
type, or template is allowed in any one translation unit (some of
these may have multiple declarations, but only one definition is
allowed).
So, if you have two different .cpp-files, than you have two different translation units, and each of them may have their own definition of hello(); ODR is not violated in the scope of a translation unit.
The following rule with program scope defines that an odr-used function must be defined exactly once in the program:
One and only one definition of every non-inline function or variable
that is odr-used (see below) is required to appear in the entire
program (including any standard and user-defined libraries). The
compiler is not required to diagnose this violation, but the behavior
of the program that violates it is undefined.
The definition of odr-used informally states that for every function that is called or which's address is taken must be defined in the program:
Informally, an object is odr-used if its address is taken, or a
reference is bound to it, and a function is odr-used if a function
call to it is made or its address is taken. If an object or a
function is odr-used, its definition must exist somewhere in the
program; a violation of that is a link-time error.
So, if more than one .cpp-file exposes an implementation of hello(), and if this function is called or referenced, then ODR from program scope is clearly violated.
If the respective function is not odr-used (i.e. called or referenced), ODR should - to my understanding - not be violated;
If a compiler complains about duplicate symbols, than this is because the program violates linkage rules (please confer also SO answer concerning "If I don't odr-use a variable"). C++11 §3.5[basic.link]/9 states:
Two names that are the same and that are declared in different scopes
shall denote the same variable, function, type, enumerator, template
or namespace if
both names have external linkage or else both names have internal linkage and are declared in the same translation unit; and ...
To avoid this, make sure that at most one implementation of hello() is exposed, and make all others static or use an unnamed namespace.
In the C programming language, static is used with global variables and functions to set their scope to the containing file, i.e. it does not expose this implementation and name clashes with other binaries are avoided.
So a reasonable suggestion would be: Make function definitions, that are solely used within a translation unit, visible only to this translation unit; and define functions that are exposed within a namespace or class in order to avoid unintended or unforeseeable name clashes / duplicate symbol problems in the linker.

You have two different definitions of a function named hello in two different translation units. When it comes to link time, the linker has no idea which hello function to link to.
Consider:
A.cpp
#include <string>
std::string hello() {
return "A";
}
B.cpp
#include <string>
std::string hello() {
return "B";
}
C.cpp
#include <iostream>
std::string hello();
int main() {
std::cout << hello() << '\n';
}
How could the linker possibly know which hello to call in main? It can't because the One Definition Rule has been violated.

You define a global function called hello in base_implementor_1.cpp. You define another global function called hello in base_implementor_2.cpp. This results in the multiple definition and the required error for violation of the ODR. Why is this a problem? If you have a 3rd source file that calls hello(), which function should be called?
If you want to define distinct functions with the same name in multiple source files, you can preface them with the static keyword
static void hello() { }
or within an anonymous namespace
namespace {
void hello() { }
}

Related

When a method is defined inside class scope, project won't compile unless method is invoked somewhere else in the original cpp file

Here's the body of main.cpp
#include "Header.h"
int main()
{
auto f = Foo();
f.bar();
return 0;
}
Here's the body of Header.h
class Foo {
public:
void bar();
};
Here's the body of Source.cpp
#include <iostream>
class Foo {
public:
void bar() {
std::cout << "Foo.bar() called\n";
}
};
void t() {
auto f = Foo();
f.bar(); // If this line isn't here, the project won't compile
}
When I comment out f.bar(); in Source.cpp, I receive the following error upon compilation. It's telling me that f.bar() in main() is unresolved:
Error LNK2019 unresolved external symbol "public: void __thiscall
Foo::bar(void)" (?bar#Foo##QAEXXZ) referenced in function _main ...
I understand that it's common to define methods outside of class scope, and indeed the following version of Source.cpp compiles successfully--
#include <iostream>
#include "Header.h"
void Foo::bar() {
std::cout << "Foo.bar() called\n";
}
Nonetheless, I don't understand what's wrong with the original version. It feels like something mysterious and magical is going on that I don't fully understand.
This is an ODR violation either way, because the class Foo as defined in the header, and the class Foo as defined in the cpp file are not token by token identical.
out of line definition of bar should look like
#include <iostream>
#include "Header.h"
void Foo::bar() {
std::cout << "Foo.bar() called\n";
}
Note that ODR violations are 'no diagnostics required'
(this is why one of the examples compiles, the compiler can't always detect ODR violations)
The best way to not accidentally create errors like this is to always include the header-file in the file/files where you implement the methods.
When we are talking about good practices, don't forget the include-guard in the header-file.
If a method is short enough to be inline defined, (style may vary, but for me that means a short oneliner), then that inline definition need to be present in the header-file, so that the compiler sees an identical type every time it encounters it.
Your original program violates the One Definition Rule (ODR), defined in C++14, 3.2. The relevant paragraph is number 6:
There can be more than one definition of a class type [...] in a program
provided that each definition appears in a different translation unit,
and provided the definitions satisfy the following requirements. Given such
an entity name D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
[...]
You have two translation units: main.cpp with all its header files (including Header.h), and Source.cpp with all its header files (which do not include Header.h). Both translation units contain a definition of the class ::Foo: main.cpp contains the one from Header.h, and Source.cpp has its own.
However, the two do not consist of the same sequence of tokens. The one in Source.cpp contains an inline function definition.
The reason for having one definition of a class in a header and only getting it by including this header is two-fold:
You save the effort of duplicating it, and more importantly, finding every duplicate if you have to change it.
Barring preprocessor shenanigans, with there being only one textual form of the definition, you cannot run afoul of the ODR.
Note that ODR violations are "no diagnostic required" - in your case you get an error, but as you saw, the error disappears if you call the function in Source.cpp. That doesn't mean your program became correct; it just means that the compiler was no longer capable of discovering the error; nonetheless, very strange behavior could occur. Your program has undefined behavior.
Take a look at the C++ standard 9.3.2 (n4296):
" ... A member function may be defined (8.4) in its class definition, in which case it is an inline member function
(7.1.2), or it may be defined outside of its class definition if it has already been declared but not defined
in its class definition. A member function definition that appears outside of the class definition shall appear
in a namespace scope enclosing the class definition. Except for member function definitions that appear
outside of a class definition, and except for explicit specializations of member functions of class templates
and member function templates (14.7) appearing outside of the class definition, a member function shall not
be redeclared ..."
& 9.3.3 :
" ...An inline member function (whether static or non-static) may also be defined outside of its class definition
provided either its declaration in the class definition or its definition outside of the class definition declares
the function as inline.[ Note: Member functions of a class in namespace scope have the linkage of that
class. Member functions of a local class (9.8) have no linkage. See 3.5. —end note ] ..."
Since bar() is defined inside Foo { }; it is implicitly inline.
I believe taking the above clauses into consideration together implies that Foo::bar has local linkage in Source.cpp & is not visible to anyone else. What I find confusing is that calling f.bar() changes the linkage allowing your other example to compile.
Compiler take the cpp files one by one, there's no way the compiler will remember what else he compiled in another file
main.cpp - include header. Header contains a class declaration. main.obj expects that foo() will be implemented as a separate symbol available at linking time. Done, write the object file and forget
Source.cpp redeclare the class without including the header. As such, the compiler cannot complain of the double declaration. As in the Source.cpp declaration the foo method is implemented inline, the compiler (not knowing anything about what main expects) does not generate a linkage symbol for the method, Done, write the object and forget
linker loads both objects, see that the 'main.obj' needs a linkage symbol for the foo method, searches everywhere and cannot find one - because the Source.obj does not contain one - the compiler was instructed the method will be inline.
Makes sense?

c++ emitting inline functions

Let's say that I have a library which contains a public definition of function void foo();. The library calls this function internally. To get the best performance I want internal calls to be inlined. I also want to prevent external code from seeing the definition so that later I can change the implementation without breaking the ABI. Here is a piece of code:
MyLib.h:
void foo();
MyLibInlined.h:
inline void foo() { code here }
MyLib.cpp
#define inline
#include "MyLibInlined.h"
The question is does it break the ODR or is it considered bad practice?
EDIT:
What if foo was a member function?
The question is does it break the ODR or is it considered bad practice?
It doesn't break the ODR, but it breaks the rules in [dcl.fct.spec]:
If a function with external linkage is
declared inline in one translation unit, it shall be declared inline in all translation units in which it appears;
no diagnostic is required.
Instead you should have a public version of the function, which is not declared inline, and have an internal version which you use inside your library:
// MyLibInlined.h
inline void foo_impl() { }
Then inside the library define foo as a call to the internal one:
// MyLib.cpp
#include "MyLibInlined.h"
void foo() { foo_impl(); }
Alternatively, if all the calls to foo() are in a single file you don't need to worry at all, just define it as a non-inline function, and let the compiler inline it in the file where the definition is visible:
// MyLib.h
void foo();
// MyLib.cpp
void foo() { code here }
// use foo ...
The inline keyword doesn't mean the function will be inlined, it means the definition is provided inline in headers. The compiler doesn't need that keyword to be able to inline it within the file where it's defined, because it can see the definition. You only need the inline keyword to allow the definition to appear in multiple translation units without causing a multiple definition error.
AFAIK it does break the ODR, since inline is not so much a rule as it is a guideline. The compiler is allowed to not inline functions despite them being declared so.
On the other hand compilers are also allowed to inline functions that are not declared inline, and are likely to do so for small functions in internal calls (it can do so at link-time in some cases), so just don't worry about it.
Alternatively declare the inline version in a separate namespace and use inline namespaces to resolve it at compile-time (or using or whatever)(http://en.cppreference.com/w/cpp/language/namespace#Inline_namespaces)
It seems to be illegal based on this (C++14 3.2/6)
There can be more than one definition of a [...] inline function with
external linkage [...] in a program provided that each definition
appears in a different translation unit, and provided the definitions satisfy the following requirements. Given
such an entity named D defined in more than one translation unit, then
[...]
— each definition of D shall consist of the same sequence of tokens
Section 3.2 is the section on the one definition rule.
This might be a cleaner variation on what you're doing:
// foo_pub.h -- public interface
#define foo() foo_pub()
void foo_pub();
// foo_private.h -- internal used by library
#define foo() foo_inline()
inline foo_inline() { ... }
// foo_pub.c -- definition for public function
void
foo_pub()
{
foo_inline()
}

Why should types be put in unnamed namespaces?

I understand the use of unnamed namespaces to make functions and variables have internal linkage. Unnamed namespaces are not used in header files; only source files. Types declared in a source file cannot be used outside. So what's the use of putting types in unnamed namespaces?
See these links where it's mentioned that types can be put in unnamed namespaces:
Superiority of unnamed namespace over static?
Unnamed/anonymous namespaces vs. static functions
Why an unnamed namespace is a "superior" alternative to static?
Where do you want to put local types other than the unnamed namespace? Types can't have a linkage specifier like static. If they are not publicly known, e.g., because they are declared in a header, there is a fair chance that names of local types conflict, e.g., when two translation units define types with the same name. In that case you'd end up with an ODR violation. Defining the types inside an unnamed namespace eliminates this possibility.
To be a bit more concrete. Consider you have
// file demo.h
int foo();
double bar();
// file foo.cpp
struct helper { int i; };
int foo() { helper h{}; return h.i; }
// file bar.cpp
struct helper { double d; }
double bar() { helper h{}; return h.d; }
// file main.cpp
#include "demo.h"
int main() {
return foo() + bar();
}
If you link these three translation units, you have mismatching definitions of helper from foo.cpp and bar.cpp. The compiler/linker is not required to detect these but each type which is used in the program needs to have a consistent definition. Violating this constraints is known as violation of the "one definition rule" (ODR). Any violation of the ODR rule results in undefined behavior.
Given the comment it seems a bit more convincing is needed. The relevant section of the standard is 3.2 [basic.def.odr] paragraph 6:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member
of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then each definition of D shall consist of the same sequence of tokens; and
[...]
There are plenty of further constraints but "shall consist of the same sequence of tokens" is clearly sufficient to rule out e.g. the definitions in the demo above from being legal.
So what's the use of putting types in unnamed namespaces?
You can create short, meaningful classes with names that maybe used in more than one file without the problem of name conflicts.
For example, I use two classes often in unnamed namespaces - Initializer and Helper.
namespace
{
struct Initializer
{
Initializer()
{
// Take care of things that need to be initialized at static
// initialization time.
}
};
struct Helper
{
// Provide functions that are useful for the implementation
// but not exposed to the users of the main interface.
};
// Take care of things that need to be initialized at static
// initialization time.
Initializer initializer;
}
I can repeat this pattern of code in as many files as I want without the names Initializer and Helper getting in the way.
Update, in response to comment by OP
file-1.cpp:
struct Initializer
{
Initializer();
};
Initializer::Initializer()
{
}
int main()
{
Initializer init;
}
file-2.cpp:
struct Initializer
{
Initializer();
};
Initializer::Initializer()
{
}
Command to build:
g++ file-1.cpp file-2.cpp
I get linker error message about multiple definitions of Initializer::Initializer(). Please note that the standard does not require the linker to produce this error. From section 3.2/4:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.
The linker does not produce an error if the functions are defined inline:
struct Initializer
{
Initializer() {}
};
That's OK for a simple case like this since the implementations are identical. If the inline implementations are different, the program is subject to undefined behavior.
I might be a bit late for answering the question the OP made but since I think the answer is not fully clear, I would like to help future readers.
Lets try a test... compile the following files:
//main.cpp
#include <iostream>
#include "test.hpp"
class Test {
public:
void talk() {
std::cout<<"I'm test MAIN\n";
}
};
int main()
{
Test t;
t.talk();
testfunc();
}
//test.hpp
void testfunc();
//test.cpp
#include <iostream>
class Test {
public:
void talk()
{
std::cout<<"I'm test 2\n";
}
};
void testfunc() {
Test t;
t.talk();
}
Now run the executable.
You would expect to see:
I'm test MAIN
I'm test 2
What you should see thought is:
I'm test MAIN
I'm test MAIN
What happened?!?!!
Now try putting an unnamed namespace around the "Test" class in "test.cpp" like so:
#include <iostream>
#include "test.hpp"
namespace{
class Test {
public:
void talk()
{
std::cout<<"I'm test 2\n";
}
};
}
void testfunc() {
Test t;
t.talk();
}
Compile it again and run.
The output should be:
I'm test MAIN
I'm test 2
Wow! It works!
As it turns out, it is important to define classes inside unnamed namespaces so that you get the proper functionality out of them when two class names in different translation units are identical.
Now as to why that is the case, I haven't done any research on it (maybe someone could help here?) and so I can't really tell you for sure. I'm answering purely from a practical standpoint.
What I would suspect though is that, while it is true that C structs are indeed local to a translation unit, they are a bit different from classes since classes in c++ usually have behavior assigned to them. Behavior means functions and as we know, functions are not local to the translation unit.
This is just my assumption.

Why do you _have_ to initialize a C++ static member variable?

I know that you generally initialize a static member variable from within a .cpp file. But my question is: why do you have to?
Here's an example:
#include <vector>
using namespace std;
class A {
public:
static vector<int> x;
};
main() {
int sz = A::x.size();
}
This gives a compiler error: undefined reference to 'A::x'
However, this:
#include <vector>
using namespace std;
class A {
public:
static vector<int> x;
};
// Initialize static member
vector<int> A::x;
main() {
int sz = A::x.size();
}
compiles and runs fine.
I can understand if I was initializing the vector using something other than the default constructor, but I'm not. I just want a vector of size 0 created. Surely, any static members will have to be allocated memory on program initialization, so why doesn't the compiler just use the default constructor?
That's not about initialization, it's about definition.
Or more precisely : it's about knowing which compilation unit (.cpp) will hold the object (that have to be uniquely defined SOMEWHERE)
So, what's needed is simply to put the definition somewhere, in a unique place, that is a cpp, to let the compiler know that when the class's static object is called, it's defined there and nowhere else. (if you try to define your static in a header, each cpp including this header will have a definition, making impossible to know where it should be defined - and manually initialized if it's required for you use)
.
You are looking at it from the point of a single compilation unit.
But the language has to assume there can potentially by multiple compilation units. So now in which compilation unit is the static object created? Basically the compiler is not allowed to make that decision and the engineer must make the decision.
undefined reference to 'A::x' isn't a compiler error; it's a linker error. It means that a definition of A::x can't be found in any of the translation units that are being linked together to form your program. Static member variables have external linkage and must be defined in exactly one translation unit. Anything with external linkage will not have a definition generated by the compiler unless you write one.
What you call initialization is definition. You need to define the static member somewhere. The part inside the class is just a declaration.
This is mostly because having a definition inside the header would cause huge problems (since you couldn't include that header into more then one translation unit without causing multiple definitions).

Isn't C++'s inline totally optional?

I have a class that had an inline member, but I later decided that I wanted to remove the implementation from the headers so I moved the members body of the functions out to a cpp file. At first I just left the inlined signature in the header file (sloppy me) and the program failed to link correctly. Then I fixed my header and it all works fine, of course.
But wasn't inline totally optional?
In code:
First:
//Class.h
class MyClass
{
void inline foo()
{}
};
Next changed to (won't link):
//Class.h
class MyClass
{
void inline foo();
};
//Class.cpp
void MyClass::foo()
{}
And then to (will work fine):
//Class.h
class MyClass
{
void foo();
};
//Class.cpp
void MyClass::foo()
{}
I thought inline was optional, and imagined I might get by with a warning for my sloppiness, but didn't expect a linking error. What's the correct/standard thing a compiler should do in this case, did I deserve my error according to the standard?
Indeed, there is this one definition rule saying that an inline function must be defined in every translation unit it is used. Gory details follow. First 3.2/3:
Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8).
An inline function shall be defined in every translation unit in which it is used.
And of course 7.1.2/4:
An inline function shall be defined in every translation unit in which it is used and shall have exactly the same definition in every case (3.2). [Note: a call to the inline function may be encountered before its definition appears in the translation unit. ] If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. An inline function with external linkage shall have the same address in all translation units. A static local variable in an extern inline function always refers to the same object. A string literal in an extern inline function is the same object in different translation units.
However, if you define your function within the class definition, it is implicitly declared as inline function. That will allow you to include the class definition containing that inline function body multiple times in your program. Since the function has external linkage, any definition of it will refer to the same function (or more gory - to the same entity).
Gory details about my claim. First 3.5/5:
In addition, a member function, static data member, class or enumeration of class scope has external linkage if the name of the class has external linkage.
Then 3.5/4:
A name having namespace scope has external linkage if it is the name of [...] a named class (clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes.
This "name for linkage purposes" is this fun thing:
typedef struct { [...] } the_name;
Since now you have multiple definitions of the same entity in your programs, another thing of the ODR happens to restrict you. 3.2/5 follows with boring stuff.
There can be more than one definition of a class type (clause 9), enumeration type (7.2), inline function with external linkage (7.1.2) [...] in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
in each definition of D, corresponding names, looked up according to 3.4, shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (13.3) and after matching of partial template specialization (14.8.3) [...]
I cut off some unimportant stuff now. The above are the two important one to remember about inline functions. If you define an extern inline function multiple times, but do define it differently, or if you define it and names used within it resolve to different entities, then you are doing undefined behavior.
The rule that the function has to be defined in every TU in which it is used is easy to remember. And that it is the same is also easy to remember. But what about that name resolution thingy? Here some example. Consider a static function assert_it:
static void assert_it() { [...] }
Now, since static will give it internal linkage, when you include it into multiple translation units, then each definition will define a different entity. This means that you are not allowed to use assert_it from an extern inline function that's going to be defined multiple times in the program: Because what happens is that the inline function will refer to one entity called assert_it in one TU, but to another entity of the same name in another TU. You will find that this all is boring theory and compilers won't probably complain, but i found this example in particular shows the relation between the ODR and entities.
What follows is getting back to your particular problem again.
Following are the same things:
struct A { void f() { } };
struct A { inline void f(); }; void A::f() { } // same TU!
But this one is different, since the function is non-inline. You will violate the ODR, since you have more than one definition of f if you include the header more than once
struct A { void f(); }; void A::f() { } // evil!
Now if you put inline on the declaration of f inside the class, but then omit defining it in the header, then you violate 3.2/3 (and 7.1.2/4 which says the same thing, just more elaborating), since the function isn't defined in that translation unit!
Note that in C (C99), inline has different semantics than in C++. If you create an extern inline function, you should first read some good paper (preferably the Standard), since those are really tricky in C (basically, any used inline-definition of a function will need another, non-inline function definition in another TU. static inline functions in C are easy to handle. They behave like any other function, apart of having the usual "inline substitution" hint. static inline in both C and C++ serve only as a inline-substitution hint. Since static will already create a different entity any time it's used (because of internal linkage), inline will just add the inline-substitution hint - not more.
Whether or not the method is actually inlined is at the sole discretion of the compiler. However the presence of the inline keyword will also affect the linkage of the method.
C++ linkage is not my specialty so I'll defer to the links for a better explanation.
http://publib.boulder.ibm.com/infocenter/zos/v1r9/index.jsp?topic=/com.ibm.zos.r9.cbclx01/inline_linkage.htm
http://en.wikipedia.org/wiki/Inline_function
Alternately you can just wait for litb to provide the gory details in an hour or so ;)
Point to note: when method is declared inline, its definition MUST be together with its declaration.
Regarding harshath.jr's answer, a method need not be declared inline if its definition has the "inline" keyword, and that definition is available in the same header, i.e.:
class foo
{
void bar();
};
inline void foo::bar()
{
...
}
This is useful for conditionally inlining a method depending on whether or not the build is "debug" or "release" like so:
// Header - foo.h
class foo
{
void bar(); // Conditionally inlined.
};
#ifndef FOO_DEBUG
# include "foo.inl"
#endif
The "inline" file could look like:
// Inline Functions/Methods - foo.inl
#ifndef FOO_DEBUG
# define FOO_INLINE inline
#else
# define FOO_INLINE
#endif
FOO_INLINE void foo::bar()
{
...
}
and the implementation could like the following:
// Implementation file - foo.cpp
#ifdef FOO_DEBUG
# include "foo.inl"
#endif
...
It's not exactly pretty but it has it's uses when aggressive inline becomes a debugging headache.