I think I understand that an anonymous namespace can be used to make symbols local to the current translation unit. But what about structure definitions: can I assume that they refer to the same type?
MyClass.h:
namespace {
class MyClass {};
}
A.h:
#include "MyClass.h"
class A {
MyClass* impl;
void op();
};
A.cpp translation unit 1:
#include "A.h"
void A::op() {
// Let *this->impl refer to a type X.
}
B.cpp translation unit 2:
#include "A.h"
void global_op(const A& a) {
// Can I assume that *a.impl refers to the same type X?
}
No, they do not refer to the same type. The header MyClass.h contains a definition of a class type MyClass inside an unnamed namespace. An unnamed namespace basically makes everything inside it (yes, types too) have internal linkage [basic.link]/6. You have two translation units, each (indirectly) includes MyClass.h, and each gets its own unnamed namespace with its own MyClass [basic.link]/11.
Think of an unnamed namespace as being a namespace that has a distinct name for each translation unit. So the MyClass in translation unit A is actually $somerandomstringA$::MyClass, while the MyClass in translation unit B is actually $somerandomstringB$::MyClass…
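As a rough single-file illustration of that mental model, the sketch below replaces the two translation units with two explicitly named namespaces. The names tu_a and tu_b are hypothetical stand-ins for the compiler-chosen unique names; a real unnamed namespace cannot be spelled out like this.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the unique namespace names that A.cpp and
// B.cpp would each get for their own copy of the unnamed namespace.
namespace tu_a { class MyClass {}; }
namespace tu_b { class MyClass {}; }

// Same spelling and same (empty) layout, but two unrelated types.
constexpr bool same_type =
    std::is_same<tu_a::MyClass, tu_b::MyClass>::value;
static_assert(!same_type, "the two MyClass types are distinct");

// The pointer types differ as well, which is why a member declared as
// MyClass* means something different in each translation unit.
constexpr bool same_pointer_type =
    std::is_same<tu_a::MyClass*, tu_b::MyClass*>::value;
static_assert(!same_pointer_type, "the two MyClass* types are distinct");
```

Both static_assert checks pass at compile time, so the snippet compiles only because the types really are different.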
As discussed down in the comments to this answer, be aware of the fact that the program you described above will contain an ODR violation (specifically [basic.def.odr]/12.2) as a result of your class A being defined to contain a member of type MyClass*, which has a different meaning in different translation units.
This program has undefined behavior, since each translation unit defines class ::A but with two different meanings.
An anonymous namespace has internal linkage ([basic.link]/4). The type MyClass has the same linkage as its namespace, so it also has internal linkage ([basic.link]/4.3). And internal linkage means that the type can only be named from within the same translation unit, so the two translation units formed from A.cpp and B.cpp define two different types named MyClass. This isn't a problem, yet.
But the global namespace and class A have external linkage. There are two definitions of the single type ::A, but they give the member impl two different types. This is a One Definition Rule violation.
(Although we often say "the ODR", there are really essentially two flavors: [basic.def.odr]/10 applies to things like objects and functions which are namespace members and not marked inline, and says the program can only have one definition, in one TU; so we usually put those in source files. [basic.def.odr]/12 applies to things like types, things marked inline, and declarations with template parameters, and says multiple TUs may each have one definition, but all must have the same token spelling (after preprocessing) and the same meaning; so we often put those in header files so that multiple TUs can use a common definition.)
Specifically here, the program violates [basic.def.odr]/12.2:
There can be more than one definition of a class type, ... in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
...; and
in each definition of D, corresponding names, looked up according to [basic.lookup], shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution and after matching of partial template specialization ([temp.over]), except that a name can refer to
a non-volatile const object with internal or no linkage if ..., or
a reference with internal or no linkage initialized with a constant expression such that ...;
and ....
... If the definitions of D do not satisfy these requirements, then the behavior is undefined.
Here MyClass is a name within the definitions of class ::A but referring to two different entities, and not falling into any of the specifically permitted categories.
This might work in practice for many systems, since both translation units will see the same member names, types, and offsets within their own MyClass types. But if MyClass ever ends up being used in "name mangling", that will go wrong. And anyway, it's safest to avoid undefined behavior whenever you can.
For example:
class B;
class A {
public:
B f();
};
int main(int, char**) {
A a; // no exception and error
return 0;
}
The class A may be an incomplete type.
Why can class A be instantiated in this example?
I know that the program will fail to compile when the member function f is called in the code.
In C++, entities don't need to be defined in the same translation unit in which they are used (exceptions apply). So the member function not being defined in one translation unit doesn't mean that you won't define it in another translation unit.
Also, it is not necessary to define entities that are not ODR-used (e.g. called) at all in C++.
This is exactly the same for free functions. You also only need to define a free function if you actually ODR-use (e.g. call) it and the definition needs to be only in one translation unit.
The class A may be an incomplete type.
A is complete after the class A { /*...*/ }; definition. Whether member functions are defined doesn't affect completeness.
I know that the program will fail to compile when the member function f is called in the code.
But it will fail only in the linker step when all translation units are linked together, because before that the compiler can't know that the function isn't defined and instantiated somewhere else. If the function isn't ODR-used (e.g. called), then the linker has no reason to look for a definition for it anyway.
Note that B being incomplete is also not a problem. Return types need to be complete only in the definition of a function, not its declaration, and you can have B be completed before you define the member function in another translation unit.
B also needs to be complete when calling the function, however not in general for other ODR-uses. This applies to all return and parameter types. Still, B can be completed without defining the function. That can still happen in another translation unit.
ODR in the above means "one definition rule". This is the rule that there shall be exactly one definition of a given entity in the whole program. The rule doesn't apply in full to all entities, and in particular templated entities can be defined once in each translation unit as long as the definitions are identical (the exact rules are more complicated).
"ODR-use" refers to a use of an entity that would trigger the one definition rule to require (at least) one definition of the entity to exist somewhere in the program. These are for example calling a function or taking the address of a function.
See https://en.cppreference.com/w/cpp/language/definition for details.
Why can class A be instantiated in this example?
Because when defining a non-static local variable like A a; the requirement is that the defined type is a complete type, and since A is complete at the point of the definition A a;, it satisfies that requirement.
Also note that B f(); is a declaration and not a definition, and when declaring a function/member function, the return type can be of incomplete type like B.
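The points above can be put together in one small sketch. It is only an illustration: the member value and the helper call_f are hypothetical names that do not appear in the question, and the later definitions of B and A::f are placed in the same file purely for convenience (they could equally live in another translation unit).

```cpp
#include <type_traits>

class B;                 // forward declaration: B is incomplete here

class A {
public:
    B f();               // declaration only; an incomplete return type is allowed
};

// A is complete as soon as its closing brace is reached, even though
// A::f has no definition yet and B is still incomplete.
static_assert(sizeof(A) > 0, "A is a complete type");
static_assert(std::is_default_constructible<A>::value, "A a; is well-formed");

// B must be complete before A::f is defined or called.
class B {
public:
    int value = 42;      // hypothetical member, for illustration only
};

B A::f() { return B{}; }

// Hypothetical helper that odr-uses A::f, so a definition of A::f
// must exist somewhere in the program.
int call_f() {
    A a;
    return a.f().value;
}
```

If the definition of A::f were removed while call_f stayed, the file would still compile, and the failure would show up only at link time.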
If I write something like this:
// A.h
#ifndef A_h
#define A_h
class A
{
public:
void f();
};
void A::f()
{
}
#endif //A_h
// B.cpp
#include "A.h"
void foo()
{
A a;
a.f();
}
// C.cpp
#include "A.h"
void bar()
{
A b;
b.f();
}
// main.cpp
#include "B.cpp"
#include "C.cpp"
using namespace std;
int main()
{
foo();
bar();
return 0;
}
I get a linker error as such:
error LNK2005: "public: void __thiscall A::f(void)" (?f@A@@QAEXXZ)
already defined in B.obj
Why does the same problem not happen when the A class is a class template? Eventually it becomes a plain class (a non-template one) during compilation, right? For this reason, I expect the same behavior as a non-template class, i.e. a linker error.
There are two separate effects at work here:
A member function definition that's out of line is a normal function definition, and by one definition rule (ODR) it must occur precisely once in the link. A member function defined inline is implicitly inline, and the ODR allows inline function definitions to be repeated:
That is, it's OK to put the following code in a header and include it repeatedly:
struct Foo {
void bar() {} // "inline" implied
};
But if you have the definition out of line, it must be in a single translation unit.
Function templates can be defined repeatedly, even when they are not inline. The templating mechanism in general already needs to deal with repeated instantiations of templates, and with their deduplication at link time.
Member functions of class templates are themselves function templates, and therefore it doesn't matter whether you declare them inline.
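A minimal sketch of the three header-safe forms, assuming everything shown sits in a header that several .cpp files include. The return types are changed to int so the calls have something to observe, and Box and sum_all are hypothetical names introduced only for this illustration.

```cpp
// Everything below could sit in a header included by several .cpp files.

struct Foo {
    int bar() { return 1; }      // defined in class: implicitly inline
};

class A {
public:
    int f();
};

// Out-of-class definition in a header: the inline keyword is what
// permits one definition per translation unit.
inline int A::f() { return 2; }

// Hypothetical class template; its member functions may be defined
// out of class in a header without the inline keyword.
template <typename T>
class Box {
public:
    T g();
};

template <typename T>
T Box<T>::g() { return T(3); }

// Hypothetical helper exercising all three forms.
inline int sum_all() {
    Foo foo;
    A a;
    Box<int> box;
    return foo.bar() + a.f() + box.g();
}
```

Dropping inline from the definition of A::f reproduces the situation in the question: each .cpp file that includes the header emits its own A::f, and the linker reports a duplicate definition.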
Why is it that the non template functions are treated differently to templates when it comes to multiple definitions?
There are historical and compatibility issues involved here. Some of the requirements come from C, that is the way it worked. There are also reasons related to what templates are, they are code generators; when required, the compiler needs to generate the code, consequently it needs to see the code when it generates it. This has a knock on effect that there will be multiple definitions, so rules are required to resolve those issues.
Simply put; templates behave (w.r.t. linking) as if they had a single definition in the program, hence they do not behave the same during compilation and linking as non-templates (that are not declared with inline) - in particular w.r.t. functions. If the non-templates are declared inline, similar behaviour is seen.
Standard references here include:
Some background, most of the issues here relate to linkage, what is linkage? §3.5/2 [basic.link]
A name is said to have linkage when it might denote the same object, reference, function, type, template, namespace or value as a name introduced by a declaration in another scope:
When a name has external linkage, the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit.
When a name has internal linkage, the entity it denotes can be referred to by names from other scopes in the same translation unit.
When a name has no linkage, the entity it denotes cannot be referred to by names from other scopes.
Some general rules relating to functions and variables, for the program as a whole and for each of the translation units.
§3.2/1 [basic.def.odr]
No translation unit shall contain more than one definition of any
variable, function, class type, enumeration type, or template.
And
§3.2/4 [basic.def.odr]
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program...
§3.2/6 [basic.def.odr]
There can be more than one definition of a class type (Clause [class]), enumeration type ([dcl.enum]), inline function with external linkage ([dcl.fct.spec]), class template (Clause [temp]), non-static function template ([temp.fct]), static data member of a class template ([temp.static]), member function of a class template ([temp.mem.func]), or template specialization for which some template parameters are not specified ([temp.spec], [temp.class.spec]) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements....
If D is a template and is defined in more than one translation unit, then the preceding requirements shall apply both to names from the template's enclosing scope used in the template definition ([temp.nondep]), and also to dependent names at the point of instantiation ([temp.dep]). If the definitions of D satisfy all these requirements, then the behavior is as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.
Some informal observations on the above list including the classes, templates etc. These are typical elements often found in header files (of course not exclusively or limited to headers). They are given these special rules to make everything work as expected.
What about class member functions? §9.3 [class.mfct]
1/ A member function may be defined ([dcl.fct.def]) in its class definition, in which case it is an inline member function ([dcl.fct.spec]), or it may be defined outside of its class definition if it has already been declared but not defined in its class definition. A member function definition that appears outside of the class definition shall appear in a namespace scope enclosing the class definition...
2/ An inline member function (whether static or non-static) may also be defined outside of its class definition provided either its declaration in the class definition or its definition outside of the class definition declares the function as inline or constexpr. [ Note: Member functions of a class in namespace scope have the linkage of that class. Member functions of a local class ([class.local]) have no linkage. See [basic.link]. — end note ]
So basically, member functions that are not defined in the class definition are not implicitly inline; hence the "normal" rules apply and such a definition can appear only once in the program.
And template, what does it say about linkage ? §14/4 [temp]
A template name has linkage ([basic.link]). Specializations (explicit or implicit) of a template that has internal linkage are distinct from all specializations in other translation units... Template definitions shall obey the one-definition rule ([basic.def.odr]).
Templates are not code; they are patterns for creating code. They must be visible wherever they are used, so the compiler has to have special rules for using them. The key special rules here are that the compiler generates code wherever a template is used and that the linker ignores duplicates.
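A single file cannot show the linker discarding duplicates, but one observable consequence of the "as if there were a single definition" rule can be sketched: every use of the same specialization refers to the same function and therefore to the same function-local static. All names below are hypothetical, and the two bump_from_* helpers merely stand in for code that would normally live in two different .cpp files.

```cpp
// Hypothetical function template, as it might appear in a header.
template <typename T>
int& counter() {
    static int count = 0;   // one object per specialization, program-wide
    return count;
}

// Hypothetical helpers standing in for code in two different .cpp files.
inline int bump_from_first_file()  { return ++counter<int>(); }
inline int bump_from_second_file() { return ++counter<int>(); }

// A different specialization has its own, separate static.
inline int bump_other_type()       { return ++counter<long>(); }
```

Calling bump_from_first_file and then bump_from_second_file yields 1 and then 2, because both go through the one counter<int>; bump_other_type starts again at 1.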
Maybe it's a lame question, but I don't get it!
If I include <string> or <vector> in multiple translation units (different .cpp files), why doesn't it break the ODR?
As far as I know, each .cpp is compiled separately, so the code for vector's methods will be generated for each object file separately, right?
So the linker should detect it and complain.
Even if it doesn't (I suspect there is a special case for templates), will the program use a single copy of the code or a separate cloned copy in each unit when I link everything together?
The same way any template definitions don't break the ODR — the ODR specifically says that template definitions may be duplicated across translation units, as long as they are literally duplicates (and, since they are duplicates, no conflict or ambiguity is possible).
[C++14: 3.2/6]: There can be more than one definition of a class type (Clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (Clause 14), non-static function template (14.5.6), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.5) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements [..]
Multiple inclusions of <vector> within the same translation unit are expressly permitted and effectively elided, more than likely by "#ifndef" header guards.
The standard has a special exception for templates that allows for duplication of functions that would otherwise violate the ODR (such as functions with external linkage and non-inline member functions). From C++11 3.2/5:
If D is a template and is defined in more than one translation unit,
then the preceding requirements shall apply both to names from the
template’s enclosing scope used in the template definition (14.6.3),
and also to dependent names at the point of instantiation (14.6.2). If
the definitions of D satisfy all these requirements, then the program
shall behave as if there were a single definition of D. If the
definitions of D do not satisfy these requirements, then the behavior
is undefined.
The ODR doesn't state that a struct may only be defined once across all translation units; it states that if you define a struct in multiple translation units, it has to be the same struct. Violating the ODR would be if you had two separate vector types with the same name but different contents. At that point the linker would get confused and you'd get mixed-up code and/or errors.
Suppose I have two cpp files:
//--a.cpp--//
class A
{
public:
void bar()
{
printf("class A");
}
};
//--b.cpp--//
class A
{
public:
void bar()
{
printf("class A");
}
};
When I compile and link these files together, I get no errors. But if I write the following:
//--a.cpp--//
int a;
//--b.cpp--//
int a;
After compiling and linking these sources I get an error about the redefinition of a. In the case of the classes I have a redefinition too, but no error is raised. I'm confused.
Classes are types. For the most part, they are compile-time artifacts; global variables, on the other hand, are runtime artifacts.
In your first example, each translation unit has its own definition of class A. Since the translation units are separate from each other, and because they do not produce global runtime artifacts with identical names, this is OK. The standard requires that there be at most one definition of a class per translation unit, and exactly one if the class is used in a way that requires it to be complete - see sections 3.2.1 and 3.2.4:
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template.
Exactly one definition of a class is required in a translation unit if the class is used in a way that requires the class type to be complete.
However, the standard permits multiple class definitions in separate translation units - see section 3.2.6:
There can be more than one definition of a class type, enumeration type, inline function with external linkage, class template, non-static function template, static data member of a class template, member function of a class template, or template specialization for which some template parameters are not specified in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. [...]
What follows is a long list of requirements, which boils down to this: the two class definitions need to be the same; otherwise, the behavior of the program is undefined.
In your second example you are defining a global runtime artifact (variable int a) in two translation units. When the linker tries to produce the final output (an executable or a library) it finds both of these, and issues a redefinition error. Note that the rule 3.2.6 above does not include variables with external linkage.
If you declare your variables static, your program will compile and link, because static variables are local to the translation unit in which they are defined.
Although both programs would compile, the reasons why they compile are different: in case of multiple class definitions the compiler assumes that the two classes are the same; in the second case, the compiler considers the two variables independent of each other.
There are actually two different flavors of the One Definition Rule.
One flavor, which applies to global and namespace variables, static class members, and functions without the inline keyword, says that there can only be one definition in the entire program. These are the things that typically go in *.cpp files.
The other flavor, which applies to type definitions, functions ever declared with the inline keyword, and anything with a template parameter, says that the definition can appear once per translation unit but must be defined with the same source and have the same meaning in each. It's legal to copy-paste into two *.cpp files as you did, but typically you would put these things in a header file and #include that header from all the *.cpp files that need them.
Classes can't be used (except in very limited ways) unless the definition is available within the translation unit that uses it. This means that you need multiple definitions in order to use it in multiple units, and so the language allows that - as long as all the definitions are identical. The same rules apply to various other entities (such as templates and inline functions) for which a definition is needed at the point of use.
Usually, you would share the definition by putting it in a header, and including that wherever it's needed.
Variables can be used with only a declaration, not the definition, so there's no need to allow multiple definitions. In your case, you could fix the error by making one of them a pure declaration:
extern int a;
so that there is only one definition. Again, it's common for such declarations to go in headers, to make sure they're the same in every file that uses them.
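A minimal sketch of that arrangement, collapsed into a single file for illustration. In a real project the extern declaration would sit in a shared header and the definition in exactly one .cpp file; read_a and the value 7 are hypothetical.

```cpp
// What a shared header would contain: a declaration, not a definition.
extern int a;

// Code in any translation unit can use a with only the declaration visible.
int read_a() { return a; }

// Exactly one translation unit provides the definition.
int a = 7;   // the value 7 is arbitrary, chosen for illustration
```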
For the full, gory details of the One Definition Rule, see C++11 3.2, [basic.def.odr].
Do classes have external linkage? If so, why couldn't I get the following to work on gcc?
// mycpp1.cpp
class MyClass
{
int x;
// ......
};
// mycpp2.cpp
class MyClass; // why doesn't this work?
I've done some googling around and came around this,
C++ Standard
(3.5/4) a named class (clause 9), or an unnamed class defined in a
typedef declaration in which the class has the typedef name for
linkage purposes (7.1.3);
I could just include the file, but what would be the purpose of linkage?
While I half agree with Nawez that you shouldn't worry too much about
linkage, it is nice to understand it, so you know why the coding pattern
you're using works (and why other patterns don't). Linkage defines how
a name (a symbol) is bound to an entity (variable, function, type,
etc.). External linkage means that the name (or the qualified version
of it) binds to the same entity in all of the translation units: in your
case, that MyClass is the same class in all of the translation units.
It doesn't mean that all translation units automatically know how it is
defined; just that MyClass in mycpp1.cpp is the same type as
MyClass in mycpp2.cpp. For various reasons related to compiler and
linker technology, you still have to provide a definition in every source
(preferably by means of an include) that uses it, but these definitions
are required to be identical (which is why the include is preferred),
because the compiler will generate code treating them as identical.
You should be doing this:
Header File : declaration should go in .h file
// mycpp1.h
class MyClass
{
int x;
void f(); //only declaration
};
Source File : definition should go in .cpp file
// mycpp1.cpp
#include "mycpp1.h"
void MyClass::f() //definition of member function
{
//code
}
and then use MyClass as
// mycpp2.cpp
#include "mycpp1.h"
MyClass instance; // it should work!
I would suggest that you pick up a good introductory book on C++. Here is a comprehensive list of good books on C++:
The Definitive C++ Book Guide and List
class MyClass; // why doesn't this work?
This is a forward declaration - it declares that the class exists, but doesn't provide the information needed to do much with it.
With just a forward declaration, you can do a few things, including declaring a pointer or reference to it, and declaring a function with it as an argument or return value. You can't instantiate it, inherit from it, access any of its members, or do various other things without the full definition, for which you need to include the header file.
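A short sketch of what a forward declaration does and does not allow. The functions make, read_x and copy_of are hypothetical, and x is made public with an arbitrary value only so the example has something to read.

```cpp
class MyClass;                       // forward declaration only

// Allowed while MyClass is incomplete:
MyClass* make();                     // pointer return type
int read_x(const MyClass& m);        // reference parameter
MyClass copy_of(const MyClass& m);   // by-value return, in a declaration only

// Not allowed until the definition is visible (each would fail to compile):
//   MyClass instance;
//   class Derived : public MyClass {};
//   int bad(MyClass* p) { return p->x; }

// The full definition, normally supplied by including the header.
class MyClass {
public:
    int x = 5;   // hypothetical value; public for illustration
};

// With the definition visible, the declared functions can be defined.
MyClass* make() { static MyClass m; return &m; }
int read_x(const MyClass& m) { return m.x; }
MyClass copy_of(const MyClass& m) { return m; }
```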
I could just include the file, but what would be the purpose of linkage?
A header file often contains declarations of functions and objects, which can then be fully defined in a separate source file and not included by files using them. That is the purpose of linkage, but it doesn't extend to allowing class definitions to be in separate source files.