Why defining classes in header files works but not functions - c++

I have this little piece of code :
File modA.h
#ifndef MODA
#define MODA
class CAdd {
public:
CAdd(int a, int b) : result_(a + b) { }
int getResult() const { return result_; }
private:
int result_;
};
/*
int add(int a, int b) {
return a + b;
}
*/
#endif
File calc.cpp
#include "modA.h"
void doSomeCalc() {
//int r = add(1, 2);
int r = CAdd(1, 2).getResult();
}
File main.cpp
#include "modA.h"
int main() {
//int r = add(1, 2);
int r = CAdd(1, 2).getResult();
return 0;
}
If I understand correctly, we can't define a function in a header file and use it in different translation units (unless the function is declared static). The macro MODA wouldn't be defined in each translation unit and thus the include guard wouldn't prevent the header from being copied in place of every #include "modA.h". This would cause the function to be defined in several places and the linker would complain about it. Is that correct?
But then why is it possible to do so with a class, and also with the methods of a class? Why doesn't the linker complain about it?
Isn't it a redefinition of the class?
Thank you

When member functions are defined in the body of the class definition, they are inline by default. If you qualify the non-member functions inline in the .h file, it will work fine.
Without the inline qualifier, non-member functions defined in .h files are compiled in every .cpp file that the .h file is included in. That violates the following rule from the standard:
3.2 One definition rule
3 Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; ...
You will get the same error if you define member functions outside the body of the class definition in a .h file and do not add the inline qualifier explicitly.
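As a minimal sketch of the fix described above, the header from the question could look like this, with add marked inline and one member function defined outside the class body. Moving getResult out of the class is an illustrative variation, not something the original header does.

```cpp
// modA.h (sketch)
#ifndef MODA
#define MODA

class CAdd {
public:
    CAdd(int a, int b) : result_(a + b) { }  // defined in-class: implicitly inline
    int getResult() const;                   // only declared here
private:
    int result_;
};

// Defined outside the class body, but still in the header:
// needs an explicit inline to be safe in several translation units.
inline int CAdd::getResult() const { return result_; }

// Non-member function defined in a header: needs an explicit inline too.
inline int add(int a, int b) {
    return a + b;
}

#endif
```

With this header, both calc.cpp and main.cpp can call add(1, 2) without a multiple-definition error at link time.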

Multiple translation units might need the definition of the class at compile time since it is not possible to know, for example, the types of members of the class (or even whether they exist) unless the definition of the class is available. (Therefore, you must be allowed to define a class in multiple translation units.) On the other hand, a translation unit only needs the declaration of a function because as long as it knows how to call the function, the compiler can leave the job of inserting the actual address of the function to the linker.
But this comes at a price: if a class is defined multiple times in a program, all the definitions must be identical, and if they're not, then you may get strange linker errors, or if the program links, it will probably segfault.
For functions you don't have this problem. If the function is defined multiple times, the linker will let you know. This is good, because it avoids accidentally defining multiple functions with the same name and signature in a given program. If you want to override this, you can declare the function inline. Then the same rule as that for classes applies: the function has to be defined in each translation unit in which it's used in a certain way (odr-used, to be precise) and all the definitions must be identical.
If a function is defined within a class definition, there's a special rule that it's implicitly inline. If this were not the case, then it would make it impossible to have multiple definitions of a class as long as there's at least one function defined in the class definition unless you went to the trouble of marking all such functions inline.
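The conventional non-inline layout implied above can be sketched as follows. The file modA.cpp is hypothetical (the question has no such file), and both parts are shown in one listing for brevity: the header carries only the declaration, and exactly one source file carries the definition.

```cpp
// modA.h (sketch): every translation unit that includes this
// sees only a declaration, which is enough to compile a call.
int add(int a, int b);

// modA.cpp (hypothetical file): the one and only definition in the
// program. The linker resolves every call site to this function.
int add(int a, int b) {
    return a + b;
}
```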

"If I understand well, we can't define a function in a header file and use it in different translation units (unless the function is declared static)"
That's incorrect. You can, and in the presented code CAdd::getResult is one such function.
In order to support general use of a header in multiple translation units, which gives multiple competing definitions of the function, it needs to be inline. A function defined in the class definition, like getResult, is automatically inline. A function defined outside the class definition needs to be explicitly declared inline.
In practical terms the inline specifier tells the linker to just arbitrarily select one of the definitions, if there are several.
There is unfortunately no simple syntax in C++11 to do the same for data. That is, data can't just be declared as inline (C++17 later added inline variables for this purpose). However, there is an exemption for static data members of class templates, and an inline function with external linkage can contain static local variables, so compilers are required to support effectively the same mechanism for data.
An inline function has external linkage by default. Since inline also serves as an optimization hint, it's possible to have an inline static function. For the case of the default external linkage, be aware that the standard then requires the function to be defined, identically, in every translation unit where it's used.
The part of the standard dealing with this is called the One Definition Rule, usually abbreviated as the ODR.
In C++11 the ODR is §3.2 “One definition rule”. Specifically, C++11 §3.2/3 specifies the requirement about definitions of an inline function in every relevant translation unit. This requirement is, however, repeated in C++11 §7.1.2/4 about “Function specifiers”.
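The static-local-variable mechanism mentioned above can be sketched like this. The header name and the functions sharedCounter and nextId are hypothetical, chosen only for illustration.

```cpp
// counter.h (hypothetical header)
#ifndef COUNTER_H
#define COUNTER_H

// An inline function with external linkage may be defined in every
// translation unit that includes this header. Its static local variable
// is a single object shared by the whole program.
inline int& sharedCounter() {
    static int value = 0;
    return value;
}

// Convenience wrapper: increments the shared object and returns the new value.
inline int nextId() {
    return ++sharedCounter();
}

#endif
```

Every translation unit that calls sharedCounter() gets a reference to the same int, which is how header-only data can be expressed without an out-of-line definition.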

Related

How to simultaneously inline function and make it dynamic symbol in C/C++? [duplicate]

I can't make sense of the following behavior: one header with some basic types, and another header in which I use these types in several functions. Afterward I started constructing classes based on my defined types and functions. In the function header if I leave the following signature:
void whateverFunction(parameters)
The linker points out that there are multiple definitions of whateverFunction. Now if change it to:
inline void whateverFunction(parameters)
the linkage problem is gone and everything compiles and links well. What I know about inline is that it replaces every function call with its code; other than that it's pretty dark to me, so my question is:
How does the linker treat inline functions in C++?
When the function in the header is not inline, multiple definitions of this function (e.g. in multiple translation units) are a violation of the ODR.
Inline functions have external linkage by default. Hence, as a consequence of the ODR (quoted below), such multiple definitions (e.g. in multiple translation units) are okay:
§3.2/5: "There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
- each definition of D shall consist of the same sequence of tokens; and [...]"
How the linker treats inline functions is pretty much an implementation-level detail. Suffice it to know that the implementation accepts such multiple definitions within the limitations of the ODR.
Note that if the function declaration in the header is changed to 'static inline....', then the inline function explicitly has internal linkage and each translation unit has its own copy of the static inline function.
The linker may not see inline functions at all. They are usually compiled straight into the code that calls them (i.e., the code is used in place of a function call).
If the compiler chooses not to inline the function (since it is merely a hint), I'm not sure, but I think the compiler emits it as a normal non-inline function and somehow annotates it so the linker just picks the first copy it sees and ignores the others.
The inline just masks the problem. Having multiple definitions points to a problem somewhere.
Just be careful about how you use your headers. Don't forget to:
- Use include guards (#ifndef HEADER_NAME / #define HEADER_NAME / #endif) to avoid multiple inclusion.
- Avoid relying on indirect inclusion: if you use a type in a file, include the corresponding header, even if another header in the same file already includes it.

Why does defining inline global function in 2 different cpp files cause a magic result?

Suppose I have two .cpp files file1.cpp and file2.cpp:
// file1.cpp
#include <iostream>
inline void foo()
{
std::cout << "f1\n";
}
void f1()
{
foo();
}
and
// file2.cpp
#include <iostream>
inline void foo()
{
std::cout << "f2\n";
}
void f2()
{
foo();
}
And in main.cpp I have forward declared the f1() and f2():
void f1();
void f2();
int main()
{
f1();
f2();
}
Result (doesn't depend on build, same result for debug/release builds):
f1
f1
Whoa: the compiler somehow picks only the definition from file1.cpp and also uses it in f2(). What is the exact explanation of this behavior?
Note, that changing inline to static is a solution for this problem. Putting the inline definition inside an unnamed namespace also solves the problem and the program prints:
f1
f2
This is undefined behavior, because the two different definitions of the same inline function with external linkage break the C++ requirement for entities that can be defined in several places, known as the One Definition Rule:
3.2 One definition rule
...
There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14),[...] in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
6.1 each definition of D shall consist of the same sequence of tokens; [...]
This is not an issue with static functions, because one definition rule does not apply to them: C++ considers static functions defined in different translation units to be independent of each other.
The compiler may assume that all definitions of the same inline function are identical across all translation units because the standard says so. So it can choose any definition it wants. In your case, that happened to be the one with f1.
Note that you cannot rely on the compiler always picking the same definition; violating the aforementioned rule makes the program ill-formed. The compiler could also diagnose that and error out.
If the function is static or in an anonymous namespace, you have two distinct functions called foo and the compiler must pick the one from the right file.
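A minimal sketch of the unnamed-namespace variant, shown for file1.cpp only. To keep the result easy to check, foo returns a string here instead of printing to std::cout; that change is an assumption for illustration and not part of the question's code.

```cpp
// file1.cpp (sketch)
#include <string>

namespace {

// Internal linkage: this foo belongs to file1.cpp alone, so a different
// foo in file2.cpp's own unnamed namespace is a separate function.
inline std::string foo() {
    return "f1";
}

}  // namespace

// External linkage: callable from main.cpp through a forward declaration.
std::string f1() {
    return foo();
}
```

file2.cpp would wrap its own foo the same way, and f2() would then be bound to that translation unit's foo.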
Relevant standardese for reference:
An inline function shall be defined in every translation unit in which it is odr-used and shall have exactly
the same definition in every case (3.2). [...]
7.1.2/4 in N4141, emphasis mine.
As others have noted, the compilers are in compliance with the C++ standard because the One definition rule states that you shall have only one definition of a function, except if the function is inline then the definitions must be the same.
In practice, what happens is that the function is flagged as inline, and at linking stage if it runs into multiple definitions of an inline flagged token, the linker silently discards all but one. If it runs into multiple definitions of a token not flagged inline, it instead generates an error.
This property is called inline because, prior to LTO (link time optimization), taking the body of a function and "inlining" it at the call site required that the compiler have the body of the function. inline functions could be put in header files, and each cpp file could see the body and "inline" the code into the call site.
It doesn't mean that the code is actually going to be inlined; rather, it makes it easier for compilers to inline it.
However, I am unaware of a compiler that checks that the definitions are identical before discarding duplicates. This includes compilers that otherwise check definitions of function bodies for being identical, such as MSVC's COMDAT folding. This makes me sad, because it leads to a really subtle set of bugs.
The proper way around your problem is to place the function in an anonymous namespace. In general, you should consider putting everything in a source file in an anonymous namespace.
Another really nasty example of this:
// A.cpp
#include <vector>
struct Helper {
std::vector<int> foo;
Helper() {
foo.reserve(100);
}
};
// B.cpp
struct Helper {
double x, y;
Helper():x(0),y(0) {}
};
Methods defined in the body of a class are implicitly inline, so the ODR applies. Here we have two different Helper::Helper() constructors, both inline, and they differ.
The sizes of the two classes differ. In one case, we initialize two doubles with 0 (the floating-point zero is all-zero bytes in most situations).
In the other, we first initialize three sizeof(void*) worth of bytes with zero, then call .reserve(100) on those bytes, interpreting them as a vector.
At link time, one of these two implementations is discarded and the surviving one is used by both translation units. What's more, which one is discarded is likely to be pretty deterministic in a full build. In a partial build, the order could change.
So now you have code that might build and work "fine" in a full build, but a partial build causes memory corruption. And changing the order of files in makefiles could cause memory corruption, or even changing the order lib files are linked, or upgrading your compiler, etc.
If both cpp files had a namespace {} block containing everything except the stuff you are exporting (which can use fully qualified namespace names), this could not happen.
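A minimal sketch of that namespace {} layout for A.cpp. The exported function helperCapacityInA is hypothetical and exists only so that something outside the unnamed namespace uses Helper.

```cpp
// A.cpp (sketch)
#include <cstddef>
#include <vector>

namespace {

// Everything private to this file lives in the unnamed namespace, so this
// Helper and its implicitly inline constructor cannot collide with a
// different Helper defined in B.cpp.
struct Helper {
    std::vector<int> foo;
    Helper() {
        foo.reserve(100);
    }
};

}  // namespace

// Exported function (hypothetical name): the only symbol from this file
// that other translation units can link against.
std::size_t helperCapacityInA() {
    Helper h;
    return h.foo.capacity();
}
```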
I've caught exactly this bug in production multiple times. Given how subtle it is, I do not know how many times it slipped through, waiting for its moment to pounce.
POINT OF CLARIFICATION:
Although the answer rooted in C++ inline rule is correct, it only applies if both sources are compiled together. If they are compiled separately, then, as one commentator noted, each resulting object file would contain its own 'foo()'. HOWEVER: If these two object files are then linked together, then because both 'foo()'-s are non-static, the name 'foo()' appears in the exported symbol table of both object files; then the linker has to coalesce the two table entries, hence all internal calls are re-bound to one of the two routines (presumably the one in the first object file processed, since it is already bound [i.e the linker would treat the second record as 'extern' regardless of binding]).

Inline keyword vs header definition

What's the difference between using the inline keyword before a function and just declaring the whole function in the header?
so...
int whatever() { return 4; }
vs
.h:
inline int whatever();
.cpp:
inline int myClass::whatever()
{
return 4;
}
for that matter, what does this do:
inline int whatever() { return 4; }
There are several facets:
Language
When a function is marked with the inline keyword, then its definition should be available in the TU or the program is ill-formed.
Any function defined right in the class definition is implicitly marked inline.
A function marked inline (implicitly or explicitly) may be defined in several TUs (respecting the ODR), whereas it is not the case for regular functions.
Template functions (not fully specialized) get the same treatment as inline ones.
Compiler behavior
A function marked inline will be emitted as a weak symbol in each object file where it is necessary, this may increase their size (look up template bloat).
Whether the compiler actually inlines the call (i.e., copies the code to the point of use instead of performing a regular function call) is entirely at the compiler's discretion. The presence of the keyword may or may not influence the decision, but it is, at best, a hint.
Linker behavior
Weak symbols are merged together to have a single occurrence in the final library. A good linker could check that the multiple definitions concur but this is not required.
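The three language-level cases listed above can be sketched in a single header. The header name, twice and Counter are hypothetical names introduced for illustration; whatever is taken from the question.

```cpp
// util.h (hypothetical header)
#ifndef UTIL_H
#define UTIL_H

// Explicitly inline: may be defined in every translation unit
// that includes this header, as long as the definitions are identical.
inline int whatever() { return 4; }

// Function template: multiple identical definitions are allowed
// without writing the inline keyword.
template <typename T>
T twice(T value) {
    return value + value;
}

struct Counter {
    // Defined inside the class definition: implicitly inline.
    int next() { return ++count_; }
    int count_ = 0;
};

#endif
```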
Without inline, you will likely end up with multiple exported symbols if the function is defined at namespace or global scope (which results in linker errors).
However, for a class (as seen in your example), most compilers implicitly declare the method as inline (-fno-default-inline will disable that default on GCC).
If you declare a function as inline, the compiler may expect to see its definition in the translation unit. Therefore, you should reserve it for the times the definition is visible.
At a higher level: a definition in the class declaration is frequently visible to more translation units. This can result in better optimization, and it can result in increased compile times.
Unless hand optimization and fast compiles are both important, it's unusual to use the keyword in a class declaration these days.
The purpose of inline is to allow a function to be defined in more than one translation unit, which is necessary for some compilers to be able to inline it wherever it's used. It should be used whenever you define a function in a header file, although you can omit it when defining a template, or a function inside a class definition.
Defining it in a header without inline is a very bad idea; if you include the header from more than one translation unit, then you break the One Definition Rule; your code probably won't link, and may exhibit undefined behaviour if it does.
Declaring it in a header with inline but defining it in a source file is also a very bad idea; the definition must be available in any translation unit that uses it, but by defining it in a source file it is only available in one translation unit. If another source file includes the header and tries to call the function, then your program is invalid.
This question explains a lot about inline functions: What does __inline__ mean ? (even though it was about the inline keyword.)
Basically, it has nothing to do with the header. Declaring the whole function in the header just changes which source file the source of the function is in. The inline keyword modifies where the resulting compiled function will be put: in its own place, so that every call will go there, or in place of every call (better for performance). However, compilers sometimes choose for themselves which functions or methods to inline, and the keyword is simply a suggestion to the compiler. Even functions which were not specified inline can be chosen by the compiler to become inline, if that gives better performance.
If you are linking multiple objects into an executable, there should normally only be one object that contains the definition of the function. For int whatever() { return 4; }, any translation unit that is used to produce an object will contain a definition (i.e. executable code) for the whatever function. The linker won't know which one to direct callers to. If inline is provided, then the executable code may or may not be inlined at the call sites, but if it's not, the linker is allowed to assume that all the definitions are the same, and pick one arbitrarily to direct callers to. If somehow the definitions were not the same, then it's considered YOUR fault and you get undefined behaviour. To use inline, the definition must be known when compiling the call, so your idea of putting an inline declaration in a header and the inline definition in a .cpp file will only work if all the callers happen to be later in that same .cpp file. In general it's broken, and you'd expect the (nominally) inline function's definition to appear in the header that declares it (or for there to be a single definition without prior declaration).

strange redefined symbols

I included this header into one of my own: http://codepad.org/lgJ6KM6b
When I compiled I started getting errors like this:
CMakeFiles/bin.dir/SoundProjection.cc.o: In function `Gnuplot::reset_plot()':
/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.4/include/g++-v4/new:105: multiple definition of `Gnuplot::reset_plot()'
CMakeFiles/bin.dir/main.cc.o:project/gnuplot-cpp/gnuplot_i.hpp:962: first defined here
CMakeFiles/bin.dir/SoundProjection.cc.o: In function `Gnuplot::set_smooth(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
project/gnuplot-cpp/gnuplot_i.hpp:1041: multiple definition of `Gnuplot::set_smooth(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
CMakeFiles/bin.dir/main.cc.o:project/gnuplot-cpp/gnuplot_i.hpp:1041: first defined here
CMakeFiles/bin.dir/SoundProjection.cc.o:/usr/include/eigen2/Eigen/src/Core/arch/SSE/PacketMath.h:41: multiple definition of `Gnuplot::m_sGNUPlotFileName'
I know it's hard to see in this mess, but look at where the redefinitions are taking place. They take place in files like /usr/lib/gcc/x86_64-pc-linux-gnu/4.3.4/include/g++-v4/new:105. How is the new operator getting information about a gnuplot header? I can't even edit that file. How could that even be possible? I'm not even sure how to start debugging this. I hope I've provided enough information. I wasn't able to reproduce this in a small project. I'm mostly just looking for tips on how to find out why this is happening, and how to track it down.
Thanks.
You're obviously violating the "one definition rule". You have lots of definitions in your header file. Some of them are classes or class templates (which is fine), some of them are inline functions or function templates (which is also fine) and some of them are "normal" functions and static members of non-templates (which is not fine).
class foo; // declaration of foo
class foo { // definition of foo
static int x; // declaration of foo::x
};
int foo::x; // definition of foo::x
void bar(); // declaration
void bar() {} // definition
The one definition rule says that your program shall contain at most one definition of an entity. The exceptions are classes, inline functions, function templates, static members of class templates (I probably forgot something). For those entities multiple definitions may exist as long as no two definitions of the same entity are in the same translation unit. So, including this header file into more than one translation unit leads to multiple definitions.
Looks like you include conflicting headers. Try to check your include paths. They usually are defined in -I directive (at least in gcc) or in an environment variable.
Reading compiler errors usually helps. You should learn to understand what the compiler is telling you. The fact that it is complaining about a symbol being redefined means that you are breaking the One Definition Rule. It even tells you what the symbols are:
class Gnuplot {
//...
Gnuplot& reset_plot(); // <-- declaration
//...
};
//...
Gnuplot& Gnuplot::reset_plot() { // <-- Definition
nplots = 0;
return *this;
}
You can declare a symbol as many times as you wish within a program, but you can only define it once (unless it is declared inline). In this case reset_plot is compiled and defined in all translation units that include the header, violating the One Definition Rule.
The simplest way out of it is declaring it inline, so that it can appear in more than one compilation unit and let the linker remove the redundant copies (if any) later on.
A little more problematic are the static members, which must be declared within the class and defined (only once) in a translation unit. For that you can create a .cpp file to define those variables (and any function/method that you don't need inlined in the header).
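Both remedies can be sketched on a cut-down stand-in for the class. Plotter, plotCount, tempFileName and the file names are hypothetical and only mirror the shape of the real Gnuplot header; the two files are shown in one listing for brevity.

```cpp
// plotter.hpp (hypothetical header)
#include <string>

class Plotter {
public:
    Plotter& reset_plot();                     // declaration
    int plotCount() const { return nplots; }   // in-class: implicitly inline
    static std::string tempFileName;           // declaration only
private:
    int nplots = 3;
};

// Out-of-class definition kept in the header: marked inline so that
// every translation unit including the header may contain it.
inline Plotter& Plotter::reset_plot() {
    nplots = 0;
    return *this;
}

// plotter.cpp (hypothetical file): the single definition of the
// static data member, compiled into exactly one object file.
std::string Plotter::tempFileName = "plot.tmp";
```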