Why does a struct declaration violate the ODR in C++? - c++

I am trying to get the compiler to react to some code that I believe does not violate the one-definition-rule in C++. Inside a header file, I have two declarations: one for a struct and one function, like this:
struct TestStruct {
int a;
double d;
};
int k();
Then I intentionally include the header file twice in another file with main() in it, to see what happens.
To my surprise, the compiler complains about multiple definitions for the struct. I was expecting the compiler not to raise any multiplicity error at all since both the struct and function have pure declarations.
It is only after I put the struct in a header-guard that the compiler stops complaining. But, there is no memory allocated for the struct. It is not a definition. Then why is the compiler mad?

You can't define a struct more than once in a single translation unit.
You can define it in several translation units, but then the definitions have to be the same. (Source: cppreference/ODR).
To avoid this problem, you need to have an include guard in your header. It will silently prevent the header from being included more than once in each translation unit.
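A minimal single-file sketch of what the guard does, assuming the header's contents are pasted twice, the way a double #include would paste them. The guard name TEST_STRUCT_H and the body of k() are made up for illustration; they are not from the question.

```cpp
// First "inclusion" of the header: the guard macro is not yet defined,
// so the struct definition and the function declaration are seen.
#ifndef TEST_STRUCT_H
#define TEST_STRUCT_H
struct TestStruct {
    int a;
    double d;
};
int k();
#endif

// Second "inclusion": TEST_STRUCT_H is already defined, so the
// preprocessor skips everything up to #endif and the struct is
// not defined a second time.
#ifndef TEST_STRUCT_H
#define TEST_STRUCT_H
struct TestStruct {
    int a;
    double d;
};
int k();
#endif

// Pure declarations may be repeated freely, guard or no guard.
struct TestStruct;
int k();

// Hypothetical definition of k(); it would normally live in a .cpp file.
int k() { return 42; }
```

Without the guard, the second copy of the struct body would be a redefinition, while the repeated int k(); would still be accepted.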

Use include guards or, if your compiler supports it, #pragma once.
#ifndef PATH_TO_FILE_FILENAME_H
#define PATH_TO_FILE_FILENAME_H
struct TestStruct {
int a;
double d;
};
int k();
#endif
or (much better if available!)
#pragma once
struct TestStruct {
int a;
double d;
};
int k();
Also might be worth using namespaces to avoid polluting the global namespace
#pragma once
namespace Test
{
struct TestStruct {
int a;
double d;
};
int k();
};
Note that to avoid multiple-definition errors at link time you'll also need to declare k() inline should you decide to provide its definition in the header (defining functions in the header can be unavoidable when you use templates without explicit instantiation).
#pragma once
namespace Test
{
struct TestStruct {
int a;
double d;
};
template<typename T>
inline int k() // inline is optional for a template; a non-template k() defined here has to be inline or static.
{
// Some implementation
return 0;
}
};
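As a rough sketch of the header-definition rule, assuming everything below sits in a header included by several .cpp files: the non-template function needs inline, the template does not. The name size_in_bytes and both function bodies are hypothetical.

```cpp
namespace Test {

// A non-template function defined in a header needs `inline`; otherwise
// every .cpp file that includes the header emits its own external
// definition and the linker reports a multiple-definition error.
inline int k() { return 1; }

// A function template may be defined in a header without `inline`;
// the language already permits one definition per translation unit,
// provided all of them are identical.
template <typename T>
int size_in_bytes() { return static_cast<int>(sizeof(T)); }

} // namespace Test
```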
Edit: As an aside, the difference between a declaration and a definition for a struct/class is much the same as for a function:
void TestFunction(); // The compiler now knows that a function called TestFunction exists; the linker later resolves calls to its definition, wherever that lives.
In this case we're not writing the body of the function, just saying that it exists. Since the compiler knows the signature (what the function promises to take and return), it can continue happily. In TestStruct's case the forward declaration (without a body) would be
struct TestStruct;

Related

Declaration of function prototypes within classes in C++? [duplicate]

I have been programming in C++ for quite some time and I never thought about this until today.
Consider the following code:
struct foo
{
// compiles fine
void bar()
{
a = 1;
my_int_type b;
b = 5;
}
// Just a declaration, this fails to compile, which leads me to assume that
// even though the bar() method is declared and defined all at once, the
// compiler looks/checks-syntax-of the class interface first, and then compiles
// the respective definitions...?
void bar2(my_int_type); // COMPILE ERROR
my_int_type b; // COMPILE ERROR because it comes before the typedef declaration
typedef int my_int_type;
my_int_type a;
void bar3(my_int_type); // compiles fine
};
int main()
{
foo a;
a.bar();
return 0;
}
Is my understanding of why the errors occur (see bar2() comment above) correct/incorrect? Either way, I would appreciate an answer with a simplistic overview of how a single-pass C++ compiler would compile the code given above.
For the most part, a C++ file is parsed top-to-bottom, so entities must be declared before they are used.
In your class, bar2 and b are invalid because they both make use of my_int_type, which has not yet been declared.
One exception to the "top-to-bottom" parsing rule is member functions that are defined inside the definition of their class. When such a member function definition is parsed, it is parsed as if it appeared after the definition of the class. This is why your usage of my_int_type in bar is valid.
Effectively, this:
struct foo
{
void bar()
{
my_int_type b;
}
typedef int my_int_type;
};
is the same as:
struct foo
{
void bar();
typedef int my_int_type;
};
inline void foo::bar()
{
my_int_type b;
}
The compiler simply works its way down a block. Any name it has not seen yet is treated as undeclared. This is the reason function declarations and header files exist.
You can think of the compiler as first collecting all the member declarations of the class; the bar() method then compiles correctly because those declarations are already known when its body is processed.
It has a lot to do with visibility. I think your confusion may come from assuming a single pass. Think of class parsing as done in two stages.
1. Parse the class definition.
2. Parse the method implementations.
The advantage to this is we have visibility of the entire class from within class member functions.
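The two-stage view can be checked with a small sketch based on the question's struct. As an assumption for illustration, bar() returns int here (it is void in the question) so that its result can be observed.

```cpp
struct foo {
    // The body is compiled as if it appeared after the closing brace of
    // the class, so it can use my_int_type and a, which are declared below.
    int bar() {
        a = 1;
        my_int_type b = 5;
        return a + b;
    }

    // Declarations (as opposed to function bodies) are processed top to
    // bottom, so the typedef must come before any member that names it.
    typedef int my_int_type;
    my_int_type a;
    int bar3(my_int_type x) { return x; }
};
```

Moving the declaration of bar3 above the typedef would bring back the compile error from the question, because a parameter type is part of the declaration, not of the body.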

how to define global variable without causing multiple declaration of same variable problem?

I have the following definition of a global variable in a header that is included in several .cpp files. I do have #pragma once in it.
// A.h // only declarations and variable definition here.
#pragma once
int i = 1;
class one{public: one(); int j, p=1;};
class two{public: two(); int m, q=10;};
// B.cpp // definition of class one methods
#include "A.h"
one::one(){j=i;}
// C.cpp // definition of class two methods
#include "A.h"
two::two(){m=i;}
// D.cpp
#include "A.h"
int main()
{
one b();
two a();
}
When compiling, the compiler reports two errors, both saying that int i has already been defined in D.obj.
There are several problems in your code (some may be typos you made while writing the question):
you include headers, whereas you should #include them;
main has to have a return value type of int;
you declare classes, so without you writing public: before members, they are private by default; this results in a private constructor, for instance; you better write classes in a more readable way, by the way:
class one {
public:
one();
private:
int j;
int p = 1;
};
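when declaring a and b in main, you should guard against the most vexing parse (MVP): one b(); declares a function named b that returns a one, not an object; one way to avoid this is writing one b{}; and two a{}; instead.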
it's good practice putting #pragma once at the top of the headers so that they don't get included more than once in each translation unit;
furthermore, when you define a variable in a header file, each cpp file including the header will contain its own definition of that variable, and the linker then sees several definitions of the same name; you need to declare those variables inline (C++17); the documentation of the inline specifier says:
An inline function or inline variable (since C++17) has the following properties:
...
An inline function or variable (since C++17) with external linkage (e.g. not declared static) has the following additional properties:
There may be more than one definition of an inline function or variable (since C++17) in the program as long as each definition appears in a different translation unit and (for non-static inline functions and variables (since C++17)) all definitions are identical. For example, an inline function or an inline variable (since C++17) may be defined in a header file that is #include'd in multiple source files.
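A minimal sketch of the header from the question with the variable made inline, assuming a C++17 compiler. As a further assumption, the constructors are defined inside the classes (which makes them implicitly inline) so that the sketch fits in one file; in the question they live in B.cpp and C.cpp.

```cpp
// A.h (sketch): `inline` lets this definition appear in every translation
// unit that includes the header while still naming one shared variable.
inline int i = 1;

class one {
public:
    one() : j(i) {}   // reads the shared global
    int j;
    int p = 1;
};

class two {
public:
    two() : m(i) {}   // reads the same global
    int m;
    int q = 10;
};
```

Before C++17 the usual alternative is to put extern int i; in the header and int i = 1; in exactly one .cpp file.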

Declaring an object in a header when its definition is in a different header

I am new to cpp, looking at an existing project.
There are several places in the project where an object is defined in a namespace in some header file and then declared in the same namespace and used in other declarations in some other header file. Example:
MyStruct.hpp:
namespace A { struct MyStruct { int i; }; }
MyClass.hpp:
namespace A { struct MyStruct; }
namespace B {
class MyClass {
private:
void foo(A::MyStruct& s);
};
}
MyClass.cpp:
#include "MyClass.hpp"
#include "MyStruct.hpp"
namespace B {
// Define the member function only; repeating the class body here would redefine MyClass.
void MyClass::foo(A::MyStruct& s) { /* ... */ }
}
I might have expected there to be an #include "MyStruct.hpp" in MyClass.hpp, but that is not the case. I guess when the two header files are isolated there is not yet any relationship between the MyStruct objects in that case. Why does this not create some conflict though when the two headers are loaded together in the implementation file? I cannot, for instance, write int j; int j = 0;. Does the order of the #includes matter? And what might be the motivation for doing this?
What you are seeing is a forward declaration being used.
Since foo() takes a MyStruct by reference, MyClass.hpp doesn't need the full definition of MyStruct in order to declare foo(); it only needs to know that MyStruct exists somewhere. The compiler matches the forward declaration to the definition once it has seen both in the same translation unit.
MyClass.cpp, on the other hand, needs the full definition of MyStruct in order for foo()'s implementation to access the contents of its s parameter. So MyStruct.hpp is included there.
This also means that if the contents of MyStruct.hpp are ever modified, any source files using MyClass.hpp don't have to be recompiled since the forward declaration of MyStruct won't change (unless the namespace is changed, that is). Only files that are using MyStruct.hpp will need recompiling, like MyClass.cpp.
MyClass.hpp contains a forward declaration of MyStruct; it’s just telling the compiler that a type named struct MyStruct exists, but not providing any details about what’s in the struct. That’s enough for things like const MyStruct & to compile, but not enough to actually do anything with the struct (since its size and contents are still unknown to the compiler).
As for why someone would do that instead of putting in an #include: it’s a little bit faster to compile, but the main reason would be to avoid cyclic dependencies (e.g. if MyStruct.hpp needs to reference MyClass but MyClass.hpp also needs to reference MyStruct, that would be difficult to achieve using only #include directives).
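The sequence the compiler sees in MyClass.cpp can be sketched in a single file. As assumptions for illustration, foo() is public and returns int here (it is private and void in the question) so that the result can be observed.

```cpp
// --- what MyClass.hpp contributes ---
namespace A { struct MyStruct; }   // forward declaration: incomplete type

namespace B {
class MyClass {
public:
    // A reference to an incomplete type is fine in a declaration.
    int foo(A::MyStruct& s);
};
}

// --- what MyStruct.hpp contributes ---
namespace A { struct MyStruct { int i; }; }   // the one definition

// --- what MyClass.cpp itself contains ---
namespace B {
// MyStruct is complete here, so its members can be accessed.
int MyClass::foo(A::MyStruct& s) { return s.i * 2; }
}
```

A forward declaration followed by the definition is not a conflict, which is why the include order does not matter here; only a second definition of the struct body would be an error.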

Basic ODR violation: member functions in .h files

Disclaimer: This is probably a basic question, but I'm a theoretical physicist by training trying to learn to code properly, so please bear with me.
Let's say that I want to model a fairly involved physical system. In my understanding, one way of modelling this system is to introduce it as a class. However, since the system is involved, the class will be large, with potentially many data members, member functions and subclasses. Having the main program and this class in one file will be very cluttered, so to give a better overview of the project I tend to put the class in a separate .h file, such that I'd have something like:
//main.cpp
#include "tmp.h"
int main()
{
myclass aclass;
aclass.myfunction();
return 0;
}
and
// tmp.h
class myclass
{
// data members
double foo;
double bar;
public:
// function members
double myfunction();
};
double myclass::myfunction()
{
return foo + bar;
}
This however, amounts to the following compiler warning in my new compiler: function definitions in header files can lead to ODR violations. My question then is this: what is actually the preferred way of dealing with a situation like this? I guess I can just make tmp.h into tmp.cpp, but to the best of my understanding, this is the intended use of .h files?
Normally, a class definition goes in an ".h" file and its member functions' definitions go in a ".cpp" file.
If you want to define member functions in the header, you need to either declare them inline, or write them inside the class definition (which makes them implicitly inline).
Adding to the other answers:
This is a function definition:
double myfunction()
{
return foo + bar;
}
This is a function declaration:
double myfunction();
The purpose of the declaration is to declare the unique signature of the function to other code. The function can be declared many times, but can only have one definition, hence the ODR (One Definition Rule).
As a basic rule to start with, put function declarations in header files, and put definitions in source files.
Unfortunately in C++, things rapidly get more complicated.
The problem with just having the function signature available is that you can't easily optimise the code in the function, because you can't see it. To solve that problem, C++ allows function definitions to be put in headers in several circumstances.
You'll have to either use the inline keyword or put the definition of myfunction in a .cpp file.
myclass.hpp
#ifndef MY_CLASS_H
#define MY_CLASS_H
class myclass
{
public:
double myfunction( );
private:
double foo;
double bar;
};
#endif
myclass.cpp
#include "myclass.hpp"
double myclass::myfunction( )
{
return foo + bar;
}
Or you can define the function in the header (myclass.hpp) using inline.
#ifndef MY_CLASS_H
#define MY_CLASS_H
class myclass
{
public:
double myfunction( );
private:
double foo;
double bar;
};
inline double myclass::myfunction( )
{
return bar + foo;
}
#endif
If you define myfunction inside the class definition, then you can omit the inline keyword.
#ifndef MY_CLASS_H
#define MY_CLASS_H
class myclass
{
public:
double myfunction( )
{
return foo + bar;
}
private:
double foo;
double bar;
};
#endif
ODR stands for One Definition Rule. It means that everything should have one and only one definition. Now, if you define a function in a header file, every translation unit that includes that header file will get a definition. With more than one such translation unit, that violates the ODR. That's what the compiler is warning about. You have a couple of ways to work around that:
Move the function definition to a .cpp file. That way there is a single definition.
Make the function inline. This means that there may be multiple definitions of this function, but you guarantee that all are identical, and the linker may use one of those and ignore the rest.
Make the function static (doesn't apply to class methods). When a function is static, each translation unit gets its own copy of the function, but they are all different functions, each belonging to one and only one translation unit. So it's OK.
