Linkage of classes - c++

Do classes have external linkage? If so, why couldn't I get the following to work on gcc?
// mycpp1.cpp
class MyClass
{
int x;
// ......
};
// mycpp2.cpp
class MyClass; // why doesn't this work?
I've done some googling and came across this,
C++ Standard
(3.5/4) a named class (clause 9), or an unnamed class defined in a
typedef declaration in which the class has the typedef name for
linkage purposes (7.1.3);
I could just include the file, but what would be the purpose of linkage?

While I half agree with Nawez that you shouldn't worry too much about
linkage, it is nice to understand it, so you know why the coding pattern
you're using works (and why other patterns don't). Linkage defines how
a name (a symbol) is bound to an entity (variable, function, type,
etc.). External linkage means that the name (or the qualified version
of it) binds to the same entity in all of the translation units: in your
case, that MyClass is the same class in all of the translation units.
It doesn't mean that all translation units automatically know how it is
defined; just that MyClass in mycpp1.cpp is the same type as
MyClass in mycpp2.cpp. For various reasons related to compiler and
linker technology, you still have to provide a definition in every source
(preferably by means of an include) that uses it, but these definitions
are required to be identical (which is why the include is preferred),
because the compiler will generate code treating them as identical.

You should be doing this:
Header File : declaration should go in .h file
// mycpp1.h
class MyClass
{
int x;
void f(); //only declaration
};
Source File : definition should go in .cpp file
// mycpp1.cpp
#include "mycpp1.h"
void MyClass::f() //definition of member function
{
//code
}
and then use MyClass as
// mycpp2.cpp
#include "mycpp1.h"
MyClass instance; // it should work!
I would suggest you pick up a good introductory book on C++. Here is a comprehensive list of good books on C++:
The Definitive C++ Book Guide and List

class MyClass; // why doesn't this work?
This is a forward declaration - it declares that the class exists, but doesn't provide the information needed to do much with it.
With just a forward declaration, you can do a few things, including declaring a pointer or reference to it, and declaring a function with it as an argument or return value. You can't instantiate it, inherit from it, access any of its members, or do various other things without the full definition, for which you need to include the header file.
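As a rough single-file sketch of what an incomplete type does and does not allow: the function read_x and the constructor are hypothetical additions for illustration, and in a real project the full class definition would come from an included header rather than sit in the same file.

```cpp
#include <cassert>

class MyClass;                          // forward declaration: incomplete type

// Allowed while MyClass is incomplete: pointers, references,
// and function declarations that mention the type.
int read_x(const MyClass& obj);         // declaration only

// Not allowed while MyClass is incomplete (each would fail to compile):
//   MyClass instance;                  // needs the size of MyClass
//   struct Derived : MyClass {};       // needs the full definition
//   int n = sizeof(MyClass);           // likewise

// From here on the type is complete (normally via #include "mycpp1.h").
class MyClass
{
public:
    explicit MyClass(int value) : x(value) {}
    int get() const { return x; }
private:
    int x;
};

// Member access is only possible now that the definition is visible.
int read_x(const MyClass& obj) { return obj.get(); }
```

Moving the definition of read_x above the class definition would make the call to get() fail to compile, which is the same error mycpp2.cpp runs into.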
I could just include the file, but what would be the purpose of linkage?
A header file often contains declarations of functions and objects, which can then be fully defined in a separate source file and not included by files using them. That is the purpose of linkage, but it doesn't extend to allowing class definitions to be in separate source files.

Related

Is anonymous namespace structure unique

I think I understood that anonymous namespace can be used to make the symbols local to current translation unit. But what about structure definitions, can I assume that they do refer to the same type ?
MyClass.h:
namespace {
class MyClass {};
}
A.h:
#include "MyClass.h"
class A {
MyClass* impl;
void op();
};
A.cpp translation unit 1:
#include "A.h"
void A::op() {
// Let *this->impl refer to a type X.
}
B.cpp translation unit 2:
#include "A.h"
void global_op(const A& a) {
// Can I assume that *a.impl refers to the same type X ?
}
No, they do not refer to the same type. The header MyClass.h contains a definition of a class type MyClass inside an unnamed namespace. An unnamed namespace basically makes everything inside it (yes, types too) have internal linkage [basic.link]/6. You have two translation units, each (indirectly) includes MyClass.h, each gets its own unnamed namespace with its own MyClass [basic.link]/11.
Think of an unnamed namespace as being a namespace that has a distinct name for each translation unit. So the MyClass in translation unit A is actually $somerandomstringA$::MyClass, while the MyClass in translation unit B is actually $somerandomstringB$::MyClass…
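That mental model can be imitated in a single file. In this sketch the two named namespaces are hypothetical stand-ins for the hidden per-translation-unit names; a real compiler picks its own internal names.

```cpp
#include <type_traits>

// Stand-ins for the unique, hidden namespace name of each translation unit.
namespace unique_to_tu_a { class MyClass {}; }
namespace unique_to_tu_b { class MyClass {}; }

// Same spelling and same (empty) layout, yet two unrelated types.
constexpr bool same_type =
    std::is_same<unique_to_tu_a::MyClass, unique_to_tu_b::MyClass>::value;

static_assert(!same_type, "the two MyClass types are distinct");
```

A pointer to one of these types therefore has a different type from a pointer to the other, which is what happens to the member impl of class A across the two translation units.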
As discussed down in the comments to this answer, be aware of the fact that the program you described above will contain an ODR violation (specifically [basic.def.odr]/12.2) as a result of your class A being defined to contain a member of type MyClass*, which has a different meaning in different translation units.
This program has undefined behavior, since each translation unit defines class ::A but with two different meanings.
An anonymous namespace has internal linkage ([basic.link]/4). The type MyClass has the same linkage as its namespace, so also internal linkage ([basic.link]/4.3). And internal linkage means that the type can only be named from within the same translation unit, so the two translation units formed from A.cpp and B.cpp define two different types named MyClass. This isn't a problem, yet.
But the global namespace and class A have external linkage. There are two definitions of the single type ::A, but they give the member impl two different types. This is a One Definition Rule violation.
(Although we often say "the ODR", there are really essentially two flavors: [basic.def.odr]/10 applies to things like objects and functions which are namespace members and not marked inline, and says the program can only have one definition, in one TU; so we usually put those in source files. [basic.def.odr]/12 applies to things like types, things marked inline, and declarations with template parameters, and says multiple TUs may each have one definition, but all must have the same token spelling (after preprocessing) and the same meaning; so we often put those in header files so that multiple TUs can use a common definition.)
Specifically here, the program violates [basic.def.odr]/12.2:
There can be more than one definition of a class type, ... in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
...; and
in each definition of D, corresponding names, looked up according to [basic.lookup], shall refer to an entity defined within the definition of D, or shall refer to the same entity, after overload resolution and after matching of partial template specialization ([temp.over]), except that a name can refer to
a non-volatile const object with internal or no linkage if ..., or
a reference with internal or no linkage initialized with a constant expression such that ...;
and ....
... If the definitions of D do not satisfy these requirements, then the behavior is undefined.
Here MyClass is a name within the definitions of class ::A but referring to two different entities, and not falling into any of the specifically permitted categories.
This might work in practice for many systems, since both translation units will see the same member names, types, and offsets within their own MyClass types. But if MyClass ever ends up being used in "name mangling", that will go wrong. And anyway, it's safest to avoid undefined behavior whenever you can.

Why can static member function definitions not have the keyword 'static'?

As per this link on the 'static' keyword in C++ :
The static keyword is only used with the declaration of a static
member, inside the class definition, but not with the definition of
that static member.
Why is the static keyword prohibited on member function definitions? I do understand that re-declaring a function as 'static' at its definition is redundant. But using it should be harmless during compilation of the function definition as it does not result in any kind of ambiguity. So why do compilers prohibit it?
There's ambiguity alright. The same definition need not be for a member function at all.
Consider this:
namespace foo {
static void bar();
}
static void foo::bar() {
}
foo::bar is required to be defined with the same linkage specifier.
For member functions, however, static is not a linkage specifier. If it were allowed, the correctness of the definition of foo::bar would depend heavily on what foo is. Disallowing static in fact eases the burden on the compiler.
Extending it to members in general, as opposed to just member functions, is a matter of consistency.
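A small sketch of the two cases side by side may help. The struct Foo and the return values are made up for illustration; the namespace part mirrors the snippet above.

```cpp
#include <cassert>

namespace foo {
    static int bar();          // here 'static' means internal linkage
}

// Repeating 'static' is fine: it restates the linkage of foo::bar.
static int foo::bar() { return 1; }

struct Foo {
    static int bar();          // here 'static' means "no this pointer"
};

// No 'static' on the out-of-class definition; adding it is a compile error.
int Foo::bar() { return 2; }
```

The two definitions look almost the same at the point of definition, which is why the compiler would have to look up what the qualifier names before it could tell what a leading static means.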
The point is, that static has several, very different meanings:
class Foo {
static void bar();
};
Here the static keyword means that the function bar is associated with the class Foo, but it is not called on an instance of Foo. This meaning of static is strongly connected to object orientation. However, the declaration
static void bar();
means something very different: It means that bar is only visible in file scope, the function cannot be called directly from other compilation units.
You see, if you say static in the class declaration, it does not make any sense to later restrict the function to file scope. And if you have a static function (with file scope), it does not make sense to publish it as part of a class definition in a public header file. The two meanings are so different, that they practically exclude each other.
static has even more, distinct meanings:
void bar() {
static int hiddenGlobal = 42;
}
is another meaning, that is similar, but not identical to
class Foo {
static const int classGlobal = 6*7; // an in-class initializer needs const here
};
When programming, words don't always have the same meaning in all contexts.
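For the last two meanings, a compilable sketch could look like this. The function name next_ticket is hypothetical, and the static data member is written as a non-const member with an out-of-class definition to show where static may and may not appear.

```cpp
#include <cassert>

// Static local: one object for the whole program, but its name is
// visible only inside the function.
int next_ticket()
{
    static int hiddenGlobal = 42;   // initialised once, on the first call
    return hiddenGlobal++;
}

// Static data member: one object shared by all Foo instances,
// reachable from outside as Foo::classGlobal.
class Foo
{
public:
    static int classGlobal;         // declaration, with 'static'
};

int Foo::classGlobal = 6 * 7;       // the single definition, without 'static'
```

Both objects live for the whole program; what differs is which scope their names belong to.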
You have to understand the difference between declaration and implementation, and that will answer your question:
Declaration: Is how C++ functions and methods are seen before compiling the program. It's put in a header file (.h file).
Implementation: Is how the compiler links a declaration to a real task in binary code. The implementation can be compiled on the fly (from source files, .cpp or .cxx or .cc), or can be already compiled (from shared libraries or object files).
Now going back to your question, when you declare something as static, it's something not related to the implementation, but related to how the compiler sees the declaration while compiling the code. For example, if you label functions in source files "static", then that's meaningless, because that information cannot be carried to compiled objects and shared libraries. Why allow it? On the contrary, it could only cause ambiguity.
For the exact same reason, default parameters must go into the header, not the source files. Because source files (that contain implementations), cannot carry the default parameter information to a compiled object.
Wild guess but if the definition has static it could be interpreted as a file-scope variable in the C sense.

Why is class redefinition in several cpp files permitted?

Suppose I have two cpp files:
//--a.cpp--//
#include <cstdio>
class A
{
public:
void bar()
{
printf("class A");
}
};
//--b.cpp--//
#include <cstdio>
class A
{
public:
void bar()
{
printf("class A");
}
};
When I compile and link these files together, I get no errors. But if I write the following:
//--a.cpp--//
int a;
//--b.cpp--//
int a;
After compiling and linking these sources I get an error about the redefinition of a. But in the case of classes I have a redefinition too, and no error is raised. I'm confused.
Classes are types. For the most part, they are compile-time artifacts; global variables, on the other hand, are runtime artifacts.
In your first example, each translation unit has its own definition of class A. Since the translation units are separate from each other, and because they do not produce global runtime artifacts with identical names, this is OK. The standard allows at most one definition of a class per translation unit, and requires exactly one wherever the class type must be complete - see sections 3.2.1 and 3.2.4:
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template.
Exactly one definition of a class is required in a translation unit if the class is used in a way that requires the class type to be complete.
However, the standard permits multiple class definitions in separate translation units - see section 3.2.6:
There can be more than one definition of a class type, enumeration type, inline function with external linkage, class template, non-static function template, static data member of a class template, member function of a class template, or template specialization for which some template parameters are not specified in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. [...]
What follows is a long list of requirements, which boils down to requiring that the class definitions be the same; otherwise, the behavior is undefined, and the compiler and linker are not required to diagnose it.
In your second example you are defining a global runtime artifact (variable int a) in two translation units. When the linker tries to produce the final output (an executable or a library) it finds both of these, and issues a redefinition error. Note that the rule 3.2.6 above does not include variables with external linkage.
If you declare your variables static, your program will compile, because static variables are local to a translation unit in which they are defined.
Although both programs would compile, the reasons why they compile are different: in case of multiple class definitions the compiler assumes that the two classes are the same; in the second case, the compiler considers the two variables independent of each other.
There are actually two different flavors of the One Definition Rule.
One flavor, which applies to global and namespace variables, static class members, and functions without the inline keyword, says that there can only be one definition in the entire program. These are the things that typically go in *.cpp files.
The other flavor, which applies to type definitions, functions ever declared with the inline keyword, and anything with a template parameter, says that the definition can appear once per translation unit but must be defined with the same source and have the same meaning in each. It's legal to copy-paste into two *.cpp files as you did, but typically you would put these things in a header file and #include that header from all the *.cpp files that need them.
Classes can't be used (except in very limited ways) unless the definition is available within the translation unit that uses it. This means that you need multiple definitions in order to use it in multiple units, and so the language allows that - as long as all the definitions are identical. The same rules apply to various other entities (such as templates and inline functions) for which a definition is needed at the point of use.
Usually, you would share the definition by putting it in a header, and including that wherever it's needed.
Variables can be used with only a declaration, not the definition, so there's no need to allow multiple definitions. In your case, you could fix the error by making one of them a pure declaration:
extern int a;
so that there is only one definition. Again, it's common for such declarations to go in headers, to make sure they're the same in every file that uses them.
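Collapsed into one file, the usual split might be sketched as follows. The helper twice and the member bar returning a are hypothetical; the comments mark which part would normally live in a shared header and which in a single source file.

```cpp
#include <cassert>

// --- what a shared header would hold (assumed layout) ---
extern int a;                       // declaration: says "a exists somewhere"
extern int a;                       // repeating a declaration is harmless

class A                             // class definition: allowed once per TU
{
public:
    int bar() const { return a; }   // implicitly inline member function
};

inline int twice(int v) { return 2 * v; }   // inline: also once per TU

// --- what exactly one .cpp file would hold ---
int a = 5;                          // the single definition of a
```

Every file that includes the header sees the class, the inline function and the declaration of a, while the storage for a is emitted only once.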
For the full, gory details of the One Definition Rule, see C++11 3.2, [basic.def.odr].

Declaring static data members of normal class and class template

I read the reason for defining static data members in the source file is because if they were in the header file and multiple source files included the header file- the definitions would get output multiple times. I can see why this would be a problem for the static const data member, but why is this a problem for the static data member?
I'm not too sure I fully understand why there is a problem if the definition is written in the header file...
The multiple definition problem for variables is due to two main deficiencies in the language definition.
As shown below you can easily work around it. There is no technical reason why there is no direct support. The feature has simply not been in sufficiently high demand for people on the committee to make it a priority.
First, why multiple definitions in general are a problem. Since C++ lacks support for separately compiled modules (deficiency #1), programmers have to emulate that feature by using textual preprocessing etc. And then it's easy to inadvertently introduce two or more definitions of the same name, which would most likely be in error.
For functions this was solved by the inline keyword and property. A freestanding function can only be explicitly inline, while a member function can be implicitly inline by being defined in the class definition. Either way, if a function is inline then it can be defined in multiple translation units, and it must be defined in every translation unit where it's used, and those definitions must be equivalent.
Mainly that solution allowed classes to be defined in header files.
No such language feature was needed to support data, variables defined in header files, so it just isn't there: you can't have inline variables. This is language deficiency #2.
However, you can obtain the effect of inline variables via a special exemption for static data members of class templates. The reason for the exemption is that class templates generally have to be fully defined in header files (unless the template is only used internally in a translation unit), and so for a class template to be able to have static data members, it's necessary with either an exemption from the general rules, or some special support. The committee chose the exemption-from-the-rules route.
template< class Dummy >
struct math_
{
static double const pi;
};
template< class Dummy >
double const math_<Dummy>::pi = 3.14;
typedef math_<void> math;
The above has been referred to as the templated const trick. As far as I know I was the one who once introduced it, in the [comp.lang.c++] Usenet group, so I can't give credit to someone else. I've also posted it a few times here on SO.
Anyway, this means that every C++ compiler and linker internally supports and must support the machinery needed for inline data, and yet the language doesn't have that feature.
However, on the third hand, C++11 has constexpr, where you can write the above as just
struct math
{
static double constexpr pi = 3.14;
};
Well, there is a difference, that you can't take the address of the C++11 math::pi, but that's a very minor limitation.
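For readers on newer compilers: C++17, which postdates this answer, added inline variables to the language. A minimal sketch, assuming a C++17 compiler and using the hypothetical names math17, call_count and address_of_pi:

```cpp
#include <cassert>

struct math17
{
    // Declaration and definition in one place; safe in a header
    // that is included from many translation units.
    static inline const double pi = 3.14;
};

// Namespace-scope inline variable: one object shared program-wide.
inline int call_count = 0;

// Taking the address works, because math17::pi has a definition.
inline const double* address_of_pi() { return &math17::pi; }
```

With this, neither the class template workaround nor a separate definition in a source file is needed.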
I think you're confusing two things: static data members and global variables marked as static.
The latter have internal linkage, which means that if you put their definition in a header file that multiple translation units #include, each translation unit will receive a private copy of those variables.
Global variables marked as const have internal linkage by default, so you won't need to specify static explicitly for those. Hence, the linker won't complain about multiple definitions of global const variable or of global non-const variables marked as static, while it will complain in the other cases (because those variables would have external linkage).
Concerning static data members, this is what Paragraph 9.4.2/5 of the C++11 Standard says:
static data members of a class in namespace scope have external linkage (3.5). A local class shall not have
static data members.
This means that if you put their definition in a header file #included by multiple translation units, you will end up with multiple definitions of the same symbol in the corresponding object files (exactly like non-const global variables), no matter what their const-qualification is. In that case, your program would violate the One Definition Rule.
Also, this Q&A on StackOverflow may give you a clearer understanding of the subject.

Why is there no multiple definition error when you define a class in a header file?

I'm not sure if I asked the question correctly, but let me explain.
First, I read this article that explains the difference between declarations and definitions:
http://www.cprogramming.com/declare_vs_define.html
Second, I know from previous research that it is bad practice to define variables and functions in a header file, because during the linking phase you might have multiple definitions for the same name which will throw an error.
However, how come this doesn't happen for classes? According to another SO answer (What is the difference between a definition and a declaration?), the following would be a class DEFINITION:
class MyClass {
private:
public:
};
If the above definition is in a header file, then presumably you can have multiple .cpp files that #include that header. This means the class is defined multiple times after compilation in multiple .o files, but that doesn't seem to cause any problems...
On the other hand, if it was a function being defined in the header file, it would cause problems apparently...from what I understand... maybe?
So what's so special about class definitions?
The one-definition rule (3.2, [basic.def.odr]) applies differently to classes and functions:
1 - No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template.
[...]
4 - Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program [...]
So while (non-inline) functions may be defined at most once in the whole program (and exactly once if they are called or otherwise odr-used), classes may be defined as many times as you have translation units (source files), but no more than once per translation unit.
The reason for this is that since classes are types, their definitions are necessary to be able to share data between translation units. Originally, classes (structs in C) did not have any data requiring linker support; C++ introduces virtual member functions and virtual inheritance, which require linker support for the vtable, but this is usually worked around by attaching the vtable to (the definition of) a member function.
A class definition is just a kind of a blueprint for the objects of that class. It's been the same with struct since the C days. No classes or structures actually exists in the code as such.
Your class definition defines the class, but does not define any objects of that class. It's OK to have the class (or structure) defined in multiple files, because you're just defining a type, not a variable of that type. If you just had the definition, no code would be emitted by the compiler.
The compiler actually emits code only after you declare an object (i.e. variable) of this type:
class MyClass myvar;
or:
class MyOtherClass {
public: ...
private: ...
} myvar; // note the variable name, it instantiates a MyOtherClass
That is what you do NOT want to do in headers because it will cause multiple instances of myvar to be instantiated.
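If an object of the class does need to be shared, one common arrangement is sketched below in a single file. The file names in the comments and the member value are assumptions made for illustration.

```cpp
#include <cassert>

// --- header-like part (hypothetical myclass.h) ---
class MyClass                 // defines a type only; no storage is emitted
{
public:
    int value = 0;
};

extern MyClass myvar;         // declares an object without defining it

// --- source-like part (hypothetical myclass.cpp) ---
MyClass myvar;                // the one and only definition of myvar
```

Each file that includes the header can refer to myvar, and the linker sees only the one definition from the source file.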