Why should I avoid redundant declarations in C++? - c++

I am learning multiple file compilation in C++ and found practice like this:
#ifndef MY_LIB_H
#define MY_LIB_H
void func(int a, int b);
#endif
Some people say that this practice is adopted to avoid repeating declarations.
But I try to declare a function twice and the code just runs well without any compilation error (like below).
int func();
int func();
int func()
{
return 1;
}
So is it really necessary to avoid repeating declarations? Or is there another reason for using #ifndef?

Some people say that this practice is adopted to avoid repeating declarations.
If some people say that then what they say is misleading. Header guards are used to avoid repeating definitions in order to conform to the One Definition Rule.

Repeating declarations is okay. Repeating definitions is not.
int func(); // declaration
int func(); // declaration; repetition is okay
class X; // declaration
class X; // declaration; repetition is okay
class Y {}; // definition
class Y {}; // definition; repetition is not okay
If a header consists only of declarations it can be included multiple times. But that's inefficient: the compiler has to compile each declaration, determine that it's just a duplicate, and ignore it. And, of course, even if it consists only of declarations at the moment, some future maintainer (including you) will, at some point, change it.

So is it really necessary to avoid repeating declarations?
You can have multiple declarations for a given entity (name). That is, you can repeat declarations in a given scope.
is there another reason for using #ifndef?
The main reason for using header guards is to ensure that the second time a header file is #included, its contents are discarded, thereby avoiding the duplicate definition of a class, inline entity, template, and so on, that it may contain.
In other words, so that the program conforms to the One Definition Rule (aka ODR).
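As a rough illustration of the guard mechanism, the sketch below pastes the contents of a hypothetical header (point.h, with the made-up guard macro POINT_H and struct Point) twice into one translation unit, which is effectively what two #include directives would do. The second copy is skipped by the preprocessor, so Point is defined only once.

```cpp
// First inclusion of the hypothetical "point.h": POINT_H is not yet defined,
// so the preprocessor keeps everything up to #endif.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Second inclusion of the same header: POINT_H is now defined, so the
// preprocessor discards the body and Point is not defined a second time.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif

// Uses the single definition of Point.
int sum(const Point& p) {
    return p.x + p.y;
}
```

Removing the guard lines should make the compiler report a redefinition of Point, which is the error header guards exist to prevent.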

Related

Basic ODR violation: member functions in .h files

Disclaimer: This is probably a basic question, but I'm a theoretical physicist by training trying to learn to code properly, so please bear with me.
Let's say that I want to model a fairly involved physical system. In my understanding, one way of modelling this system is to introduce it as a class. However, since the system is involved, the class will be large, with potentially many data members, member functions and subclasses. Having the main program and this class in one file would be very cluttered, so to give a better overview of the project I tend to put the class in a separate .h file, such that I'd have something like:
//main.cpp
#include "tmp.h"
int main()
{
myclass aclass;
aclass.myfunction();
return 0;
}
and
// tmp.h
class myclass
{
// data members
double foo;
double bar;
public:
// function members
double myfunction();
};
double myclass::myfunction()
{
return foo + bar;
}
This however, amounts to the following compiler warning in my new compiler: function definitions in header files can lead to ODR violations. My question then is this: what is actually the preferred way of dealing with a situation like this? I guess I can just make tmp.h into tmp.cpp, but to the best of my understanding, this is the intended use of .h files?
Normally, a class definition goes in an ".h" file and its member functions' definitions go in a ".cpp" file.
If you want to define member functions in the header, you need to either declare them inline, or write them inside the class definition (which makes them implicitly inline).
Adding to the other answers:
This is a function definition:
double myfunction()
{
return foo + bar;
}
This is a function declaration:
double myfunction();
The purpose of the declaration is to declare the unique signature of the function to other code. The function can be declared many times, but can only have one definition, hence the ODR (One Definition Rule).
As a basic rule to start with, put function declarations in header files, and put definitions in source files.
Unfortunately in C++, things rapidly get more complicated.
The problem with just having the function signature available is that you can't easily optimise the code in the function, because you can't see it. To solve that problem, C++ allows function definitions to be put in headers in several circumstances.
You'll have to either use the inline keyword or put the definition of myclass::myfunction in a .cpp file.
myclass.hpp
#ifndef MY_CLASS_H
#define MY_CLASS_H
class myclass
{
public:
double myfunction( );
private:
double foo;
double bar;
};
#endif
myclass.cpp
#include "myclass.hpp"
double myclass::myfunction( )
{
return foo + bar;
}
Or you can define the function in the header (myclass.hpp) using inline.
#ifndef MY_CLASS_H
#define MY_CLASS_H
class myclass
{
public:
double myfunction( );
private:
double foo;
double bar;
};
inline double myclass::myfunction( )
{
return bar + foo;
}
#endif
If you define myfunction inside the class definition, you can omit the inline keyword.
#ifndef MY_CLASS_H
#define MY_CLASS_H
class myclass
{
public:
double myfunction( )
{
return foo + bar;
}
private:
double foo;
double bar;
};
#endif
ODR stands for One Definition Rule. It means that everything should have one and only one definition. Now, if you define a function in a header file, every translation unit that includes that header file will get a definition. That obviously violates the ODR. That's what the compiler is warning about. You have a couple of ways to work around that:
Move the function definition to a cpp file. That way there is a single definition.
Make the function inline. This means that there may be multiple definitions of this function, but you're sure that all are identical and the linker may use one of those and ignore the rest.
Make the function static (doesn't apply to class methods). When a function is static, each translation unit gets its own copy of the function, but they are all different functions and each belongs to one and only one compilation unit. So it's OK.
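The difference between the inline and static options can be sketched with two hypothetical header functions (next_shared_id and next_local_id are made-up names). Assuming this code sits in a header included by several .cpp files, the inline version is one function with one counter for the whole program, while the static version gives every translation unit its own private function and its own counter.

```cpp
// Hypothetical header contents, e.g. "ids.h" (header guards omitted for brevity).

// inline: every translation unit may contain this definition, but they all
// denote the same function, so the local static counter is shared program-wide.
inline int next_shared_id() {
    static int counter = 0;
    return ++counter;
}

// static: internal linkage. Each translation unit that includes the header
// gets a separate function with a separate counter.
static int next_local_id() {
    static int counter = 0;
    return ++counter;
}
```

Within a single .cpp file the two behave the same; the difference only shows up once a second translation unit includes the header.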

Declarations vs. Definitions at the top of your program (Best Practice)

Using a function declaration at the top of your program, then later defining it under main seems a bit redundant. Considering the practices of DRY, is it standard practice in the C++ community to simply define the function fully at the top (or even just define it in a separate functions.cpp file, with declarations in the header.hpp file), rather than declare then later define?
I realize that all ways would produce the same result, but I would like to hone my formatting skills and eliminate redundancy. Why declare then later define as opposed to just defining at the top before main? Perhaps this is a tabs vs. spaces type of debate, but maybe that's all I need to know lol
Thanks!
There are cases where you have no choice other than first providing a declaration and then the definition. For example when there is mutual dependency:
int bar(int x);
int foo(int x) {
if (x == 1) return bar(x);
return x;
}
int bar(int x) {
if (x==0) return foo(x);
return x;
}
Separating declaration and definition is not considered a violation of DRY or redundant. Probably the most common arrangement is to have declarations in headers and definitions in source files.
There may be some cases that you have to provide function signature in advance, and its implementation only later at some point. The best example is the cyclic dependency of functions, so one needs to be defined before the other one, making a closed loop.
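Another small sketch of such a cycle, using the made-up functions is_even and is_odd: each one calls the other, so whichever is defined first needs a prior declaration of the second.

```cpp
// Forward declaration: is_even below calls is_odd before is_odd is defined.
bool is_odd(unsigned n);

// Even if n is zero, otherwise even exactly when n - 1 is odd.
bool is_even(unsigned n) {
    return n == 0 ? true : is_odd(n - 1);
}

// Odd exactly when n is non-zero and n - 1 is even.
bool is_odd(unsigned n) {
    return n == 0 ? false : is_even(n - 1);
}
```

Deleting the first line should make the call to is_odd inside is_even fail to compile, since the name has not been declared at that point.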

c++ classes: different ways to write a function

what's the difference between the following 3 cases:
1) in point.h:
class point
{
int x,y;
public:
int getX();
};
int point::getX() {
return this->x;
}
2) in point.h:
class point
{
int x,y;
public:
int getX()
{
return this->x;
}
};
3) in point.h:
class point
{
int x,y;
public:
int getX();
};
in point.cpp:
int point::getX() {
return this->x;
}
Note: I read that it's somehow connected to inline, but I'm not sure which one of them makes the compiler treat int getX() as inline int getX()
Avoid this first one:
struct point
{
int x,y;
int getX();
};
int point::getX() {
return this->x;
}
If multiple source files include point.h, you will get multiple definitions of point::getX, leading to a violation of the One Definition Rule (and modern linkers will give an error message).
For the second one:
struct point
{
int x,y;
int getX()
{
return this->x;
}
};
This implicitly inlines the function. This means that the function definition may be copy-pasted everywhere it is used, instead of resolving a function call. There are a few trade offs here. On one hand, by providing definitions in headers, you can more easily distribute your library. Additionally, in some cases you may see performance improvements due to the locality of the code. On the other hand, you may actually hurt performance due to instruction cache misses (more instructions around == it won't all fit in cache). And the size of your binaries may grow as the inlined function gets copied around.
Another tradeoff is that, should you ever need to change your implementation, all clients must rebuild.
Finally, depending on the sensitivity of the function, you may be revealing trade secrets through your headers (that is, there is absolutely no hiding of your secret sauce) (note: one can always decompile your binary and reverse engineer an implementation, so putting the def in the .cpp file won't stop a determined programmer, but it keeps honest people honest).
The third one, which separates a definition into a .cpp file:
// point.h
struct point
{
int x,y;
int getX();
};
// point.cpp
int point::getX() {
return this->x;
}
This will cause a function to get exported to your library (at least for gcc. In Windows, you need to be explicit by using __declspec directives to import/export). Again, there are tradeoffs here.
Changing the implementation does not require clients to recompile; you can distribute a new library for them to link to instead (the new library is ABI-compatible if you only change the impl details in the .cpp file). However, it is more difficult to distribute your library, as your binaries now need to be built for each platform.
You may see a performance decrease due to the requirement to resolve function pointers into a library for running code. You may also see a performance increase over inlining due to the fact that your code may be friendlier to the instruction cache.
In the end, there is a lot to consider. My recommendation is to go with #3 by default unless you are writing templates. When you want to look at improving performance, you can start to measure what inlining does for you (binary size as well as runtime perf). Of course you may have other information up front that makes approach #2 or #3 better suited for the task (e.g., you have a Point class, and you know that accessing X will happen everywhere and it's a really small function, so you decide to inline it).
what's the difference between the following 3 cases
1. The function definition is outside of the class definition. Note that in this example you've defined a non-inline function in a header. Including this header into more than one translation unit violates the One Definition Rule. This is most likely a bug.
2. The function definition is inside of the class definition. In this case, the function is implicitly inline. As such, it is fine to include it into multiple translation units.
3. The function definition is outside of the class definition again. The function is not declared inline. This time the function is defined in a separate translation unit, thereby conforming to the ODR even if the header is included into multiple translation units.
what's the problem if both b.cpp & a.cpp includes my header file
The problem is that then both b.cpp and a.cpp will define a non-inline function. The One Definition Rule says that there must be at most one definition of any non-inline function in the program. Two is more than one. Therefore doing this violates the ODR, and such a program would be ill-formed.
I'm too much confused why it's an error to write the same function in two different cpp files?
It is an "error" because the rules of the language (explained above) say that it is an "error".
what if both want to use that function?
Then declare the function in both translation units. Only define the function in one translation unit unless you declare the function inline, in which case define the function in all translation units (where the function is used) instead. Look at the examples 2. and 3. of your question to see how that can be done.
so the code in method 1 is not automatically inlined?
No. Functions are not automatically declared inline. A function is declared inline only if A. the inline keyword is used, or if B. it is a member function that is defined within the class definition (or in a case involving constexpr that I shall omit here). Neither of those cases applies to example 1, therefore it is not an inline function.
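The ways a function ends up inline can be sketched in one hypothetical header. The names add_one, counter and square are made up for illustration, and the constexpr function stands in for the constexpr case mentioned above (constexpr functions are implicitly inline).

```cpp
// A. Free function explicitly declared inline.
inline int add_one(int v) {
    return v + 1;
}

class counter {
    int value_ = 0;
public:
    // B. Member function defined inside the class definition: implicitly inline.
    int get() const { return value_; }

    // Only declared here; the definition follows outside the class.
    void bump();
};

// Defined outside the class, so it is inline only because of the keyword.
// Without 'inline', this would be a non-inline definition in a header.
inline void counter::bump() {
    value_ = add_one(value_);
}

// constexpr functions are implicitly inline as well.
constexpr int square(int v) {
    return v * v;
}
static_assert(square(4) == 16, "square is usable at compile time");
```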

How the Inline functions having multiple lines of code are treated as a single line?

I understand that defining normal (non-member) functions globally with the inline keyword increases performance if the snippet is small. My doubt is: how do member functions defined inside classes also give the same performance and get considered inline?
Actually inline functions contain a single line of code
This statement is wrong. There's no such constraint.
Inline function merely means that all the function definition code is placed directly where it is declared.
but member functions defined inside a class contain multiple code instead treated as inline why?
If you're referring to the inline keyword there's also no constraint that functions marked with that keyword can only contain a single line of code.
If it's actually inlined by the compiler (i.e. assembly code directly inserted in place, without a function call) is left to its decision, and mostly depends on compiler optimization strategies chosen in the optimization flags.
You need to provide the inline keyword for non-member functions if they are completely defined in a header file, to avoid ODR violation errors.
Here's an example (header file assumed):
class foo {
int x_;
public:
// Inside the class declaration 'inline' is assumed as default
int x() const { return x_; }
int y() const {
int result = 0;
// Do some complicated calculation spanning
// a load of code lines
return result;
}
};
inline int bar() { // inline is required here, otherwise the compiler
// will see multiple definitions of that function
// in every translation unit (.cpp) that includes
// that header file.
return 42;
}
Inline doesn't mean it's just a single line of code. It means the whole body, whether it is a single line or multiple lines, may be inserted at the call site, thereby reducing function call overhead.
Look at this code, I know it's not C++ but basics are the same.
#include <stdio.h>
#define inlineMacro(x) ((x) = (x) + 1); ((x) = (x) * 2)
int main()
{
int i = 5;
inlineMacro(i);
printf("%i",i);
return 0;
}
Outputs:
12
You can put all of your code on a single line. So don't be distracted by the keyword inline; it's just for the compiler.
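For comparison, a C++ version of the same idea might use an inline function instead of a macro. This is only a sketch, and increment_then_double is a made-up name. The body spans several statements, which is perfectly fine for an inline function.

```cpp
// An inline function may contain any number of statements.
inline int increment_then_double(int x) {
    x = x + 1;  // same first step as the macro
    x = x * 2;  // same second step as the macro
    return x;
}
```

Called with 5, it returns 12, matching the macro example, but with ordinary function semantics: the argument is evaluated once and the caller's variable is not modified.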
I have got a clear explanation from all of you guys.
INTUITION in 4 points:
1. For normal functions (not methods declared/defined in classes), the inline keyword asks the compiler to insert the function's code at the call site, thereby avoiding repeated function calls.
2. For methods defined inside classes, the performance would be the same as for normal functions declared with the inline keyword (no class concept), for a small snippet.
3. A method defined inside a class is implicitly inline.
4. A normal function is inline only if explicitly declared so (if needed).

Why One Definition Rule, not One Declaration Rule?

I have read materials below:
https://www.wikiwand.com/en/One_Definition_Rule
http://en.cppreference.com/w/cpp/language/definition
What is the difference between a definition and a declaration?
But still, can't figure out why it is One Definition Rule rather than One Declaration Rule?
I maintain that declaration is a subset of definition, so One Definition Rule is enough.
One declaration rule would be too strict, preventing programs that use the same header more than once from compiling. It would also make it impossible to define data structures with back references.
A simple way to see the first point (using headers) is to consider a program composed of two translation units, A.cpp and B.cpp, which both include <string> header.
Translation units A.cpp and B.cpp are translated independently. By including <string>, both translation units acquire a declaration of std::string.
As for the second point (data structures with back references) consider an example of defining a tree in which each node has a back reference to its parent tree:
// Does not compile
struct tree {
node *root;
};
struct node {
node *left;
node *right;
tree *owner;
};
This example would not compile, because node from node *root is undeclared. Switching the order of the node and tree definitions wouldn't help, because then tree from tree *owner would be undeclared. The solution is to introduce a second declaration for either of the two structs. (Writing the member as struct node *root also works, because that elaborated form is itself a declaration of node.)
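A minimal sketch of the fix, assuming a forward declaration of node is the extra declaration chosen: node is declared once up front and then defined later, which is exactly the kind of repeated declaration a "One Declaration Rule" would forbid.

```cpp
// Forward declaration: tells the compiler that node names a type,
// without defining it yet.
struct node;

struct tree {
    node* root;  // a pointer to an incomplete type is allowed
};

// Definition of node; this is also a second declaration of the same name.
struct node {
    node* left;
    node* right;
    tree* owner;  // back reference to the owning tree
};
```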
Because the same declaration, in a .h file, may be included in multiple compilation units, and because multiple definitions is definitely a programming error, whereas multiple declarations isn't.
Definition is a subset of declaration, not the other way around. Every definition is a declaration, and there are declarations that are not definitions.
int i = 3; // definition and declaration
extern int i; // ok: (re)declaration
int i = 4; // error: redefinition
extern int j; // declaration
extern int j; // ok: (re)declaration
int j = 5; // ok: (re)declaration and definition
int j = 6; // error: redefinition
Because when you have declared a function in a header file
// header toto.h
int f(void);
and you want to define it in the compilation unit where it belongs, you'd do
#include "toto.h"
int f(void) {
return 0;
}
The definition is also a declaration, so this compilation unit sees two declarations, one in the header and one in the .c or .cpp file.
In short, allowing multiple declarations lets the compiler check for consistency between different source files.
The reason is really that the C++ translation model can easily deal with conflicting multiple declarations; it just requires the compiler part of the toolset to detect errors like this in the source code:
int X();
void X(); // error
A compiler can easily do that.
And when there are no such errors in any translation units, then there's no problem; every X() call in every translation unit is identical; what remains to do is for the linker to link every call to the one correct destination. The declarations have done their job and no longer play a role.
Now with multiple definitions, it's not that easy. Definitions are something which concerns multiple translation units and which goes beyond the scope of the compilation phase.
We've already seen that in the example above. The X() calls are in place, but now we need the guarantee that they all end up at the same destination, the same definition of X().
That there can be only one such definition should be clear, but how to enforce it? Put in simple terms, when it's time to link the object code together, the source code has already been dealt with.
The answer is that C++ basically chooses to put the burden on the programmer. Forcing compiler/linker implementors to check all multiple definitions for equality and detect differences would be beyond the capabilities of C++ toolsets in most real-life situations or completely break the way those tools work, so the pragmatic solution is to just forbid it and/or force the programmer to make sure that they are all identical or else get undefined behaviour.