A virtual member function is used if it is not pure? - c++

C++03 3.2.2 ...An object or non-overloaded function is used if its name appears in a potentially-evaluated expression. A virtual member function is used if it is not pure...
And then later in 3.2.3 we have: Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8).
An inline function shall be defined in every translation unit in which it is used.
The way I read these lines: a pure virtual function is not used, and the ODR applies only to functions which are used. Doesn't this imply that the following would be legal? I am guessing the answer is no, it doesn't, but then I can't understand why.
//x.h
struct A
{
virtual void f() = 0;
};
//y.cpp
#include "x.h"
void A::f()
{
}
//z.cpp
#include "x.h"
#include <iostream>
void A::f()
{
std::cout << "Hello" << std::endl;
}
//main.cpp
#include "x.h"
struct B:A
{
virtual void f()
{
A::f();
}
};
int main()
{
A* p = new B;
p->f();
}

The two clauses are not mutually exclusive.
That a virtual function is used if it is not pure, does not mean that the converse holds. If a virtual function is pure it does not mean that it is necessarily not used. It may still be used "if its name appears in a potentially evaluated expression" such as in your example: A::f();.
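A minimal single-file sketch of this point, assuming hypothetical class names Shape and Circle and a helper describe that are not part of the question: the pure virtual function still gets used because a derived override names it in a qualified call, so it needs exactly one definition.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    // Pure virtual: not used merely by virtue of being virtual...
    virtual std::string name() const = 0;
};

// ...but it may still be defined, and then only once in the whole program.
std::string Shape::name() const { return "shape"; }

struct Circle : Shape {
    std::string name() const override {
        // The qualified call names Shape::name in a potentially-evaluated
        // expression, so Shape::name is used and needs its definition.
        return "circle/" + Shape::name();
    }
};

// Virtual dispatch through the base reference reaches Circle::name.
std::string describe(const Shape& s) { return s.name(); }
```

Removing the definition of Shape::name would typically produce an unresolved-symbol error at link time, while adding a second definition in another file would reproduce the situation from the question.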

This code violates ODR. A::f is multiply defined. Hence it has UB.
Multiple definitions across translation units are only allowed for the following, as per §3.2/5:
There can be more than one definition of a class type (clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (clause 14), non-static function template (14.5.5), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.4) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements.

As @Charles Bailey pointed out, your A::f is in fact used even though it's pure virtual. But that's beside the main point.
It's not accurate that the One Definition Rule does not apply to functions that are not used. We have:
3.2p1 No translation unit shall contain more than one definition of any variable, function, class type, enumeration type or template.
3.2p3 Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required.
Together, these requirements seem to imply that a used function must have exactly one definition, and an unused function (including a pure virtual function which is never explicitly called) may have either no definition or a single definition. In either case, multiple definitions of a non-inline function make the program ill-formed.
At least, I'm quite certain that's the intent. But you may be on to a hole in the phrasing, since a very literal reading does not say anywhere that multiple different definitions of the same unused function in different translation units is ill-formed.
// x.cpp
void f() {}
void g() {}
// y.cpp
#include <iostream>
void f() {
std::cout << "Huh" << std::endl;
}
void h() {}
// z.cpp
void g();
void h();
int main() {
g();
h();
return 0;
}

This is related but off-topic: judging from the citations, there does seem to be a hole in the Standard. It should also say that a pure virtual destructor is used and must be defined, at least if any derived-class objects are destroyed or if a derived-class destructor is defined, since the derived-class destructor must call the base destructor and implicitly does so as if with the qualified-id syntax. The definition of such a destructor is usually trivial, but it cannot be elided and cannot be generated.
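As a rough illustration of the destructor case, here is a sketch with hypothetical names (Base, Derived, destroy_one, and a log string used only to make the call order observable). The pure virtual destructor has to be defined out of class because the derived destructor calls it implicitly.

```cpp
#include <string>

struct Base {
    explicit Base(std::string& log) : log_(log) {}
    virtual ~Base() = 0;   // pure: Base is abstract
    std::string& log_;     // records the order of destructor calls
};

// Still required: ~Derived() calls this implicitly.
Base::~Base() { log_ += "~Base;"; }

struct Derived : Base {
    using Base::Base;      // inherit the logging constructor
    ~Derived() override { log_ += "~Derived;"; }
};

// Destroys one Derived through a Base pointer and returns the call order.
std::string destroy_one() {
    std::string log;
    Base* p = new Derived(log);
    delete p;
    return log;
}
```

Without the definition of Base::~Base, a typical toolchain reports an undefined reference as soon as Derived is destroyed.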

[class.abstract]: "A pure virtual function need be defined only if called with, or as if with (12.4), the qualified-id syntax (5.1)."
Your A::f is called by B::f, so there must be a single definition of A::f.


Understanding what causes this multiple definition error

I have a base class with a pure virtual method implemented by two classes:
// base_class.hpp
class base_class {
public:
virtual std::string hello() = 0;
};
// base_implementer_1.hpp
class base_implementer1 : base_class {
public:
std::string hello();
};
// base_implementer_2.hpp
class base_implementer2 : base_class {
public:
std::string hello();
};
// base_implementer_1.cpp
std::string hello() {
return(std::string("Hello!"));
}
// base_implementer_2.cpp
std::string hello() {
return(std::string("Hola!"));
}
Note the lack of base_implementer1:: and base_implementer2:: in the implementations. This is deliberate.
If I add the base_implementer1:: and base_implementer2:: qualifiers, I do not get a multiple definition error. However, if I leave them off, the linker complains that I have two definitions of the same function (hello()).
Since these two implementations are not featured in the header files, I would think that (even though they are not correct in terms of ACTUALLY implementing hello()) they would be allowed since there's no reason you couldn't have two hello() functions in two distinct .cpp files. But this doesn't seem to be the case. Can anyone tell me what's happening in the linker to make this multiple definition error happen?
The One Definition Rule defines rules at two scopes: translation unit scope and program scope.
The following rule with translation unit scope states that the same translation unit must not comprise two different definitions of the same function:
Only one definition of any variable, function, class type, enumeration
type, or template is allowed in any one translation unit (some of
these may have multiple declarations, but only one definition is
allowed).
So, if you have two different .cpp files, then you have two different translation units, and each of them may have its own definition of hello(); the ODR is not violated at the scope of a translation unit.
The following rule with program scope defines that an odr-used function must be defined exactly once in the program:
One and only one definition of every non-inline function or variable
that is odr-used (see below) is required to appear in the entire
program (including any standard and user-defined libraries). The
compiler is not required to diagnose this violation, but the behavior
of the program that violates it is undefined.
The definition of odr-used informally states that every function that is called, or whose address is taken, must be defined in the program:
Informally, an object is odr-used if its address is taken, or a
reference is bound to it, and a function is odr-used if a function
call to it is made or its address is taken. If an object or a
function is odr-used, its definition must exist somewhere in the
program; a violation of that is a link-time error.
So, if more than one .cpp-file exposes an implementation of hello(), and if this function is called or referenced, then ODR from program scope is clearly violated.
If the respective function is not odr-used (i.e. neither called nor referenced), the ODR should, to my understanding, not be violated.
If the linker nevertheless complains about duplicate symbols, then this is because the program violates linkage rules (see also the SO answer concerning "If I don't odr-use a variable"). C++11 §3.5[basic.link]/9 states:
Two names that are the same and that are declared in different scopes
shall denote the same variable, function, type, enumerator, template
or namespace if
both names have external linkage or else both names have internal linkage and are declared in the same translation unit; and ...
To avoid this, make sure that at most one implementation of hello() is exposed, and make all others static or use an unnamed namespace.
In the C programming language, static is used with global variables and functions to set their scope to the containing file, i.e. it does not expose this implementation and name clashes with other binaries are avoided.
So a reasonable suggestion would be: make function definitions that are used solely within one translation unit visible only to that translation unit, and define functions that are exposed inside a namespace or class, in order to avoid unintended or unforeseeable name clashes and duplicate-symbol problems in the linker.
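A small sketch of that suggestion, showing the contents of a single .cpp file. The namespace name greetings and the function english are hypothetical and chosen only for illustration.

```cpp
#include <string>

namespace {
// Internal linkage: another .cpp file may define its own hello()
// without clashing with this one at link time.
std::string hello() { return "Hello!"; }
}  // namespace

namespace greetings {
// The only externally visible name from this file; the enclosing
// namespace keeps it apart from unrelated functions elsewhere.
std::string english() { return hello(); }
}  // namespace greetings
```

A second file could follow the same pattern with its own private hello() and expose, say, greetings::spanish(), and the two helpers would never meet in the linker.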
You have two different definitions of a function named hello in two different translation units. When it comes to link time, the linker has no idea which hello function to link to.
Consider:
A.cpp
#include <string>
std::string hello() {
return "A";
}
B.cpp
#include <string>
std::string hello() {
return "B";
}
C.cpp
#include <iostream>
std::string hello();
int main() {
std::cout << hello() << '\n';
}
How could the linker possibly know which hello to call in main? It can't because the One Definition Rule has been violated.
You define a global function called hello in base_implementer_1.cpp. You define another global function called hello in base_implementer_2.cpp. This results in the multiple definition and the resulting error for violating the ODR. Why is this a problem? If you have a third source file that calls hello(), which function should be called?
If you want to define distinct functions with the same name in multiple source files, you can preface them with the static keyword
static void hello() { }
or within an anonymous namespace
namespace {
void hello() { }
}
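For the classes in the question, the intended fix is to qualify each definition so that it defines a member function rather than a free function. The sketch below puts everything in one file for brevity; the public inheritance, the virtual destructor, and the override specifiers are assumptions added here and are not in the original code.

```cpp
#include <string>

class base_class {
public:
    virtual ~base_class() = default;
    virtual std::string hello() = 0;
};

class base_implementer1 : public base_class {
public:
    std::string hello() override;
};

class base_implementer2 : public base_class {
public:
    std::string hello() override;
};

// Qualified names: these define two different member functions,
// not two copies of one global hello().
std::string base_implementer1::hello() { return "Hello!"; }
std::string base_implementer2::hello() { return "Hola!"; }
```

Because base_implementer1::hello and base_implementer2::hello are distinct entities, each has exactly one definition and the linker has nothing to complain about.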

Why should types be put in unnamed namespaces?

I understand the use of unnamed namespaces to make functions and variables have internal linkage. Unnamed namespaces are not used in header files; only source files. Types declared in a source file cannot be used outside. So what's the use of putting types in unnamed namespaces?
See these links where it's mentioned that types can be put in unnamed namespaces:
Superiority of unnamed namespace over static?
Unnamed/anonymous namespaces vs. static functions
Why an unnamed namespace is a "superior" alternative to static?
Where do you want to put local types other than the unnamed namespace? Types can't have a linkage specifier like static. If they are not publicly known (as they would be if they were declared in a header), there is a fair chance that names of local types conflict, e.g., when two translation units define types with the same name. In that case you'd end up with an ODR violation. Defining the types inside an unnamed namespace eliminates this possibility.
To be a bit more concrete. Consider you have
// file demo.h
int foo();
double bar();
// file foo.cpp
struct helper { int i; };
int foo() { helper h{}; return h.i; }
// file bar.cpp
struct helper { double d; };
double bar() { helper h{}; return h.d; }
// file main.cpp
#include "demo.h"
int main() {
return foo() + bar();
}
If you link these three translation units, you have mismatching definitions of helper from foo.cpp and bar.cpp. The compiler/linker is not required to detect this, but each type which is used in the program needs to have a consistent definition. Violating this constraint is known as a violation of the "one definition rule" (ODR). Any violation of the ODR results in undefined behavior.
Given the comment it seems a bit more convincing is needed. The relevant section of the standard is 3.2 [basic.def.odr] paragraph 6:
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then each definition of D shall consist of the same sequence of tokens; and
[...]
There are plenty of further constraints but "shall consist of the same sequence of tokens" is clearly sufficient to rule out e.g. the definitions in the demo above from being legal.
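A sketch of how foo.cpp from the demo could be written to avoid the clash, assuming the same names as above. Wrapping helper in an unnamed namespace makes it a type unique to this translation unit, so the token-sequence requirement no longer relates it to the helper in bar.cpp.

```cpp
namespace {
// Unique to this translation unit, so an unrelated 'helper' in
// another .cpp file is a different type and cannot conflict.
struct helper { int i; };
}  // namespace

int foo() {
    helper h{};   // empty braces value-initialize i to 0
    return h.i;
}
```

bar.cpp would wrap its own helper the same way; foo and bar themselves keep external linkage and remain callable from main.cpp.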
So what's the use of putting types in unnamed namespaces?
You can create short, meaningful classes with names that may be used in more than one file without the problem of name conflicts.
For example, I use two classes often in unnamed namespaces - Initializer and Helper.
namespace
{
struct Initializer
{
Initializer()
{
// Take care of things that need to be initialized at static
// initialization time.
}
};
struct Helper
{
// Provide functions that are useful for the implementation
// but not exposed to the users of the main interface.
};
// Take care of things that need to be initialized at static
// initialization time.
Initializer initializer;
}
I can repeat this pattern of code in as many files as I want without the names Initializer and Helper getting in the way.
Update, in response to comment by OP
file-1.cpp:
struct Initializer
{
Initializer();
};
Initializer::Initializer()
{
}
int main()
{
Initializer init;
}
file-2.cpp:
struct Initializer
{
Initializer();
};
Initializer::Initializer()
{
}
Command to build:
g++ file-1.cpp file-2.cpp
I get a linker error message about multiple definitions of Initializer::Initializer(). Please note that the standard does not require the linker to produce this error. From section 3.2/4:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.
The linker does not produce an error if the functions are defined inline:
struct Initializer
{
Initializer() {}
};
That's OK for a simple case like this since the implementations are identical. If the inline implementations are different, the program is subject to undefined behavior.
I might be a bit late for answering the question the OP made but since I think the answer is not fully clear, I would like to help future readers.
Let's try a test... compile the following files:
//main.cpp
#include <iostream>
#include "test.hpp"
class Test {
public:
void talk() {
std::cout<<"I'm test MAIN\n";
}
};
int main()
{
Test t;
t.talk();
testfunc();
}
//test.hpp
void testfunc();
//test.cpp
#include <iostream>
class Test {
public:
void talk()
{
std::cout<<"I'm test 2\n";
}
};
void testfunc() {
Test t;
t.talk();
}
Now run the executable.
You would expect to see:
I'm test MAIN
I'm test 2
What you will most likely see, though, is:
I'm test MAIN
I'm test MAIN
What happened?!?!!
Now try putting an unnamed namespace around the "Test" class in "test.cpp" like so:
#include <iostream>
#include "test.hpp"
namespace{
class Test {
public:
void talk()
{
std::cout<<"I'm test 2\n";
}
};
}
void testfunc() {
Test t;
t.talk();
}
Compile it again and run.
The output should be:
I'm test MAIN
I'm test 2
Wow! It works!
As it turns out, it is important to define classes inside unnamed namespaces so that you get the proper functionality out of them when two class names in different translation units are identical.
Now as to why that is the case, I haven't done any research on it (maybe someone could help here?) and so I can't really tell you for sure. I'm answering purely from a practical standpoint.
What I would suspect though is that, while it is true that C structs are indeed local to a translation unit, they are a bit different from classes since classes in c++ usually have behavior assigned to them. Behavior means functions and as we know, functions are not local to the translation unit.
This is just my assumption.

Are all unused undefined methods allowed?

Here’s a class with an undefined method. It seems compilers allow instances of this class to be constructed, so long as the undefined member function is never called:
struct A {
void foo();
};
int main() {
A a; // <-- Works in both VC2013 and g++
a.foo(); // <-- Error in both VC2013 and g++
}
Here’s a similar situation, but one that involves inheritance. Subclass Bar extends base class Foo. Foo defines a method g(). Bar declares the same-named method but does not define it:
#include <iostream>
struct Foo {
void g() { std::cout << "g\n"; }
};
struct Bar : Foo {
void g();
};
int main() {
Bar b; // Works in both VC2013 and g++
b.Foo::g(); // Works in both VC2013 and g++
b.g(); // Error in both VC2013 and g++
}
Here's a variation of the above. The only difference here is that g() is virtual to both Foo and Bar:
#include <iostream>
struct Foo {
virtual void g() { std::cout << "g\n"; }
};
struct Bar : Foo {
virtual void g();
};
int main() {
Bar b; // Works in g++. But not in VC2013, which gives
// 'fatal error LNK1120: 1 unresolved externals'
b.Foo::g(); // Works in g++, but VC2013 already failed on b's construction
b.g(); // Error in g++, but VC2013 already failed on b's construction
}
See the code comments for contrast of different behavior between VC2013 and g++.
Which compiler is correct, if any?
Why does VC2013's compiler have some different complaints in its version with the virtual keyword compared to the one in its version without the virtual keyword?
Are unused undefined methods always allowed? If not, what are all the cases in which they're not allowed?
Does Bar’s declaration of g() count as overriding even when Bar doesn't provide a definition?
Which compiler is correct, if any?
They are both right. Your code is wrong, no diagnostic required. [class.virtual]/11
A virtual function declared in a class shall be defined, or declared
pure (10.4) in that class, or both; but no diagnostic is required
(3.2).
[intro.compliance]/2:
If a program contains a violation of a rule for which no diagnostic is
required, this International Standard places no requirement on
implementations with respect to that program.
Have a look at your optimization settings for GCC, they may influence the behavior.
Are unused undefined methods always allowed?
A member function needs a definition only if it is odr-used. [basic.def.odr]/3:
Every program shall contain exactly one definition of every non-inline
function or variable that is odr-used in that program; no diagnostic
required.
Now consider [basic.def.odr]/2:
An expression is potentially evaluated unless it is an unevaluated operand (Clause 5) or a subexpression thereof.
[…]
A virtual member function is odr-used if it is not pure.
A non-overloaded function whose name appears as a potentially-evaluated expression or a member of a set of candidate functions, if selected by overload resolution when referred to from a potentially-evaluated expression, is odr-used, unless it is a pure virtual function and its name is not explicitly qualified.
You are still allowed to use undefined non-virtual member functions inside decltype or sizeof. But non-pure virtual functions are odr-used simply because they are not pure.
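A short sketch of the unevaluated-operand case, assuming a hypothetical alias FooResult and helper foo_result_size. A::foo is declared but never defined, and the program should still compile and link because neither decltype nor sizeof odr-uses it.

```cpp
#include <cstddef>
#include <type_traits>

struct A {
    int foo();  // declared but deliberately never defined
};

// The operand of decltype is unevaluated, so A::foo is not odr-used here.
using FooResult = decltype(A{}.foo());
static_assert(std::is_same<FooResult, int>::value, "foo() returns int");

// Likewise for sizeof: only the type of the call expression matters.
constexpr std::size_t foo_result_size() { return sizeof(A{}.foo()); }
```

Declaring foo as a non-pure virtual function instead would make it odr-used regardless of these expressions, and a definition would then be required.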
Does Bar’s declaration of g() count as overriding even when Bar doesn't provide a definition?
Yes.
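To make the overriding visible, here is a sketch in which Bar::g is actually defined. The string return type, the override specifier, and the helpers call_virtually and call_base are assumptions added for illustration.

```cpp
#include <string>

struct Foo {
    virtual ~Foo() = default;
    virtual std::string g() { return "Foo::g"; }
};

struct Bar : Foo {
    // 'override' asks the compiler to verify that this overrides Foo::g.
    std::string g() override;
};

// Non-pure virtual, so it must be defined whether or not it is called.
std::string Bar::g() { return "Bar::g"; }

// Virtual dispatch through a Foo reference selects the final overrider.
std::string call_virtually(Foo& f) { return f.g(); }

// A qualified call bypasses virtual dispatch and reaches the base version.
std::string call_base(Bar& b) { return b.Foo::g(); }
```

The declaration alone already makes Bar::g the final overrider; the definition is what the vtable for Bar ends up referring to, which is why a missing definition can fail at link time even when g is never called.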

How does linker deal with virtual functions defined in multiple headers?

Suppose I have
Base.h
class Base
{
virtual void foo() {...}
};
Derived1.h
class Derived1 : public Base
{
virtual void foo() {...}
};
Derived2.h
class Derived2 : public Base
{
virtual void foo() {...}
};
Header Derived1.h is included in multiple source files, and the Derived1 class is also used through the Base interface. Since foo is virtual and is used polymorphically, it cannot be inlined, so it will be compiled into multiple object files. How does the linker then resolve this situation?
Member functions defined within the class definition are implicitly inline (C++03 7.1.2/3).
Whether the function body actually gets inlined at the point of call is immaterial. What matters is that inline allows you to have multiple definitions of a function, as long as all the definitions are the same (C++03 7.1.2/2), which the One Definition Rule would otherwise disallow. The standard requires that a program containing many of these definitions still links to a single function (C++03 7.1.2/4).
How does the linker do this?
The standard provides for this in two ways:
It mandates that the function definition should be present in each translation unit. All the linker has to do is link to the definition found in that translation unit.
It mandates that all definitions of this function should be exactly same, this removes any ambiguity of linking to a particular definition, if different definitions were to exist.
C++03 7.1.2 Function specifiers:
Para 2:
A function declaration (8.3.5, 9.3, 11.4) with an inline specifier declares an inline function. The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call; however, even if this inline substitution is omitted, the other rules for inline functions defined by 7.1.2 shall still be respected.
Para 3:
A function defined within a class definition is an inline function. The inline specifier shall not appear on a block scope function declaration
Para 4:
An inline function shall be defined in every translation unit in which it is used and shall have exactly the same definition in every case (3.2).
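As a sketch of what such a header amounts to, the block below gives the virtual functions concrete bodies (the string return values are an assumption replacing the elided {...}) and adds a hypothetical helper call_through_base. Every translation unit that includes this text gets identical inline definitions, which is exactly the case the quoted paragraphs permit.

```cpp
#include <string>

// Imagine this is the content of a header included by several .cpp files.
class Base {
public:
    virtual ~Base() = default;
    // Defined in the class body, hence implicitly inline.
    virtual std::string foo() { return "Base"; }
};

class Derived1 : public Base {
public:
    // Also implicitly inline; identical tokens in every translation unit.
    std::string foo() override { return "Derived1"; }
};

// A free function in a header needs an explicit 'inline' for the same reason.
inline std::string call_through_base(Base& b) { return b.foo(); }
```

The call in call_through_base is dispatched virtually, so the compiler usually emits an out-of-line copy of each foo in every object file that needs one, and common toolchains then keep a single copy at link time.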

Isn't C++'s inline totally optional?

I have a class that had an inline member, but I later decided that I wanted to remove the implementation from the headers, so I moved the bodies of the member functions out to a .cpp file. At first I just left the inlined signature in the header file (sloppy me) and the program failed to link correctly. Then I fixed my header and it all works fine, of course.
But wasn't inline totally optional?
In code:
First:
//Class.h
class MyClass
{
void inline foo()
{}
};
Next changed to (won't link):
//Class.h
class MyClass
{
void inline foo();
};
//Class.cpp
void MyClass::foo()
{}
And then to (will work fine):
//Class.h
class MyClass
{
void foo();
};
//Class.cpp
void MyClass::foo()
{}
I thought inline was optional, and imagined I might get by with a warning for my sloppiness, but didn't expect a linking error. What's the correct/standard thing a compiler should do in this case, did I deserve my error according to the standard?
Indeed, there is this one definition rule saying that an inline function must be defined in every translation unit it is used. Gory details follow. First 3.2/3:
Every program shall contain exactly one definition of every non-inline function or object that is used in that program; no diagnostic required. The definition can appear explicitly in the program, it can be found in the standard or a user-defined library, or (when appropriate) it is implicitly defined (see 12.1, 12.4 and 12.8).
An inline function shall be defined in every translation unit in which it is used.
And of course 7.1.2/4:
An inline function shall be defined in every translation unit in which it is used and shall have exactly the same definition in every case (3.2). [Note: a call to the inline function may be encountered before its definition appears in the translation unit. ] If a function with external linkage is declared inline in one translation unit, it shall be declared inline in all translation units in which it appears; no diagnostic is required. An inline function with external linkage shall have the same address in all translation units. A static local variable in an extern inline function always refers to the same object. A string literal in an extern inline function is the same object in different translation units.
However, if you define your function within the class definition, it is implicitly declared as inline function. That will allow you to include the class definition containing that inline function body multiple times in your program. Since the function has external linkage, any definition of it will refer to the same function (or more gory - to the same entity).
Gory details about my claim. First 3.5/5:
In addition, a member function, static data member, class or enumeration of class scope has external linkage if the name of the class has external linkage.
Then 3.5/4:
A name having namespace scope has external linkage if it is the name of [...] a named class (clause 9), or an unnamed class defined in a typedef declaration in which the class has the typedef name for linkage purposes.
This "name for linkage purposes" is this fun thing:
typedef struct { [...] } the_name;
Since now you have multiple definitions of the same entity in your programs, another thing of the ODR happens to restrict you. 3.2/5 follows with boring stuff.
There can be more than one definition of a class type (clause 9), enumeration type (7.2), inline function with external linkage (7.1.2) [...] in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
in each definition of D, corresponding names, looked up according to 3.4, shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution (13.3) and after matching of partial template specialization (14.8.3) [...]
I cut off some unimportant stuff now. The above are the two important ones to remember about inline functions. If you define an extern inline function multiple times but define it differently, or if names used within it resolve to different entities, then the behavior is undefined.
The rule that the function has to be defined in every TU in which it is used is easy to remember. And that it is the same is also easy to remember. But what about that name resolution thingy? Here is an example. Consider a static function assert_it:
static void assert_it() { [...] }
Now, since static will give it internal linkage, when you include it into multiple translation units, each definition will define a different entity. This means that you are not allowed to use assert_it from an extern inline function that's going to be defined multiple times in the program, because the inline function would refer to one entity called assert_it in one TU, but to another entity of the same name in another TU. You will find that this is all boring theory and compilers probably won't complain, but I found that this example in particular shows the relation between the ODR and entities.
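One way to stay within the rule, sketched here under the assumption that assert_it takes a condition and returns it (the original body is elided) and with a hypothetical caller checked_div: give the helper external linkage by making it inline as well, so that the name resolves to the same entity in every translation unit.

```cpp
// Header-style content, assumed to be included in several translation units.

// 'inline' rather than 'static': external linkage, so every translation
// unit refers to the same entity named assert_it.
inline bool assert_it(bool condition) { return condition; }

inline int checked_div(int a, int b) {
    // This name resolves to the same assert_it everywhere, which keeps the
    // multiple definitions of checked_div consistent with each other.
    return assert_it(b != 0) ? a / b : 0;
}
```

With a static assert_it, each copy of checked_div would name a different function, which is the mismatch described above.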
What follows is getting back to your particular problem again.
The following two are the same thing:
struct A { void f() { } };
struct A { inline void f(); }; void A::f() { } // same TU!
But this one is different, since the function is non-inline. You will violate the ODR, since you have more than one definition of f if you include the header in more than one translation unit:
struct A { void f(); }; void A::f() { } // evil!
Now if you put inline on the declaration of f inside the class, but then omit defining it in the header, then you violate 3.2/3 (and 7.1.2/4, which says the same thing in more detail), since the function isn't defined in that translation unit!
Note that in C (C99), inline has different semantics than in C++. If you create an extern inline function, you should first read some good paper (preferably the Standard), since those are really tricky in C (basically, any used inline definition of a function will need another, non-inline function definition in another TU). static inline functions in C are easy to handle: they behave like any other function, apart from having the usual "inline substitution" hint. static inline in both C and C++ serves only as an inline-substitution hint. Since static will already create a different entity any time it's used (because of internal linkage), inline will just add the inline-substitution hint, nothing more.
Whether or not the method is actually inlined is at the sole discretion of the compiler. However the presence of the inline keyword will also affect the linkage of the method.
C++ linkage is not my specialty so I'll defer to the links for a better explanation.
http://publib.boulder.ibm.com/infocenter/zos/v1r9/index.jsp?topic=/com.ibm.zos.r9.cbclx01/inline_linkage.htm
http://en.wikipedia.org/wiki/Inline_function
Alternately you can just wait for litb to provide the gory details in an hour or so ;)
Point to note: when a method is declared inline, its definition must be available in every translation unit that uses it, so in practice it belongs in the header together with its declaration.
Regarding harshath.jr's answer, a method need not be declared inline if its definition has the "inline" keyword, and that definition is available in the same header, i.e.:
class foo
{
void bar();
};
inline void foo::bar()
{
...
}
This is useful for conditionally inlining a method depending on whether or not the build is "debug" or "release" like so:
// Header - foo.h
class foo
{
void bar(); // Conditionally inlined.
};
#ifndef FOO_DEBUG
# include "foo.inl"
#endif
The "inline" file could look like:
// Inline Functions/Methods - foo.inl
#ifndef FOO_DEBUG
# define FOO_INLINE inline
#else
# define FOO_INLINE
#endif
FOO_INLINE void foo::bar()
{
...
}
and the implementation could look like the following:
// Implementation file - foo.cpp
#ifdef FOO_DEBUG
# include "foo.inl"
#endif
...
It's not exactly pretty, but it has its uses when aggressive inlining becomes a debugging headache.