Is it valid to put an extern template declaration in a header file and then do the explicit template instantiation in a single compilation unit?
For example, in the compiling example below (for g++), does this avoid instantiating nothing<int> twice? Why doesn't anybody write it like this, and why do people prefer to copy-paste the extern template line into each .cpp file?
A.hpp:
#ifndef HEADERC_A
#define HEADERC_A
template< typename T > struct nothing {};
extern template struct nothing<int>;
#endif
A.cpp:
#include "A.hpp"
template struct nothing<int>;
main.cpp:
#include "A.hpp"
#include <iostream>
int main()
{
nothing<int> n;
return 0;
}
Well, this is certainly "valid" insofar as gcc will compile this and do pretty much what you expect.
As for why not everyone does this: once you go beyond a trivial situation like this and start managing a large collection of widely used templates, it quickly becomes impractical to keep track of every parameter each of your templates is used with so that it can be instantiated explicitly in this manner.
It's going to be much easier to let the compiler keep track of it for you.
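A minimal single-file sketch of the layout from the question, with the three files marked by comments. The member `twice` and the helper functions are hypothetical additions (the original `nothing` is empty) so that there is actual code to instantiate. An explicit instantiation declaration followed by the explicit instantiation definition in the same translation unit is allowed, which is what A.cpp ends up seeing:

```cpp
#include <cassert>

// ---- A.hpp ----
template <typename T>
struct nothing {
    T twice(T value) const;  // hypothetical member, so there is code to instantiate
};

template <typename T>
T nothing<T>::twice(T value) const { return value + value; }

// Explicit instantiation declaration: a TU that sees this line does not
// implicitly instantiate the members of nothing<int>.
extern template struct nothing<int>;

// ---- A.cpp (would #include "A.hpp") ----
// Explicit instantiation definition: the one place where the members of
// nothing<int> are generated.
template struct nothing<int>;

// ---- main.cpp (would #include "A.hpp") ----
int use_nothing() {
    nothing<int> n;
    return n.twice(21);
}

// Small self-check; call it from your own main().
void check_example() { assert(use_nothing() == 42); }
```

In a real build each section lives in its own file, and main.cpp relies on the linker to find the `nothing<int>` members emitted from A.cpp.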
As previously stated this is a perfectly valid use case and if it fits your programming model then you should use it. But buyer beware:
There are several reasons why extern templates are not commonly declared in header files and then explicitly instantiated in the cpp files.
A very common model for implementing template classes/functions is to place the definitions in the header file and the implementation in an "inl" or similarly named file, and then include that file at the bottom of the header file. There are reams of code that use this approach to resolve the template/header/implementation separation problem. Putting an "extern" at the top of the implementation makes the code much easier to read and maintain, especially when multiple classes get involved. Here's an example:
A.hpp
#pragma once
template< typename T > struct nothing {
void donothing(T input); // fastest func around
};
#include "A.inl"
A.inl
// does NOT include A.hpp
extern template struct nothing<int>; // save time and space
template<typename T> void nothing<T>::donothing(T input) { return; }
Instance.h
#include "A.hpp"
template struct nothing<int>; // compiler generates code now
But there is a hidden caveat in all this...
If this gets implemented as you suggest then what happens when another person comes along and wants:
nothing<float> mynothing;
The compiler will see the header file but never find an implementation for float. So it may compile just fine, but at link time there will be unresolvable symbols.
So they try this:
template struct nothing<float>;
nothing<float> mynothing;
WRONG! Now the compiler can't find the implementation and all you get is MORE errors.
Now you could go back to your A.cpp file and add another instance for float... can you say maintenance, headache, carpal tunnel nightmare? With the commonly used solution you get to have your cake and eat it too (mostly).
Now you may be thinking why even bother with the extern? Because as your post implies there is a typical use case where most of the time "nothing" will be used with an int template type. Having this occur in potentially hundreds of files can lead to serious compile time and code size ramifications.
Why doesn't the standards committee do something about this mess? They did! They added extern templates! In all fairness it is a difficult issue to resolve after the fact.
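To make the trade-off above concrete, here is a single-file sketch, assuming the A.hpp/A.inl layout where the member definitions stay visible to every includer. The member `echo` and the functions `use_int`, `use_float` and `check_both` are hypothetical. `nothing<int>` is instantiated once explicitly, while `nothing<float>` still works because the compiler can instantiate it implicitly from the visible definition:

```cpp
#include <cassert>

// ---- A.hpp (declarations) ----
template <typename T>
struct nothing {
    T echo(T input) const;  // hypothetical member for illustration
};

// ---- A.inl (definitions, visible to every includer) ----
extern template struct nothing<int>;  // common case: instantiated once, elsewhere

template <typename T>
T nothing<T>::echo(T input) const { return input; }

// ---- Instance.cpp ----
template struct nothing<int>;  // the single explicit instantiation for int

// ---- user code ----
int use_int() {
    nothing<int> n;       // uses the explicit instantiation above
    return n.echo(3);
}

float use_float() {
    nothing<float> n;     // no explicit instantiation exists for float...
    return n.echo(1.5f);  // ...so it is implicitly instantiated right here
}

// Small self-check; call it from your own main().
void check_both() {
    assert(use_int() == 3);
    assert(use_float() == 1.5f);
}
```

If the definition of `echo` were hidden in a .cpp file instead, `use_float` would still compile but fail at link time, which is the caveat described above.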
Related
I have a template class that I want to separate from my main.cpp file, where I'll be conducting tests, etc. The problem I have is that there's no easy way to separate the template class into .h/.cpp files because the compiler needs to know which type(s) the object may potentially be.
I came across this resource and I'm primarily looking at "Method 2." Is it bad practice to include a .cpp file in the file that holds main?
I've always been told that it was generally unadvised, but what would be the best solution here?
Yes, it's bad practice.
It's also bad practice to include it in the corresponding header.
What if someone tries to compile the project by giving all .cpp files as input to the compiler?
Good practice would be to mark those special files with a special extension, usually .inl.
Best: let the class template implementation be part of the header.
Then you have
MyClass.hpp
Only separate the implementation out as a distinct file if there ever will be a need for the header's code without the implementation code.
Because if the code sans implementation will not ever be needed, there is no rationale for using a distinct file.
However, if you separate the implementation as a distinct file, make that a header too, because that's how it's intended to be used. It's not intended to be separately compiled. And you don't want some too-smart IDE to do that.
Then, because headers should be self-contained, the implementation should include the purely declarative header, like this:
MyClass.fwd.hpp
Pure declarations.
MyClass.hpp
Includes the above pure header.
This then is structured very much like the standard library's <iosfwd>, which corresponds to MyClass.fwd.hpp above.
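The original file contents are not shown, so the following is only a guess at what such a pair of headers might look like, condensed into one file. `read_value` and the members of `MyClass` are hypothetical names chosen for illustration:

```cpp
#include <cassert>

// ---- MyClass.fwd.hpp: pure declarations ----
template <class T>
class MyClass;  // enough for pointers, references and function signatures

template <class T>
T read_value(const MyClass<T>& obj);  // declared using only the forward declaration

// ---- MyClass.hpp: would #include "MyClass.fwd.hpp", then define everything ----
template <class T>
class MyClass {
public:
    explicit MyClass(T value) : value_(value) {}
    T get() const { return value_; }
private:
    T value_;
};

template <class T>
T read_value(const MyClass<T>& obj) { return obj.get(); }

// Small self-check; call it from your own main().
void check_fwd() { assert(read_value(MyClass<int>(5)) == 5); }
```

Code that only passes `MyClass<T>` around by reference or pointer can include the small forward header; code that creates or inspects objects includes the full one.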
Why not #include the .cpp into the corresponding .h? This will solve the template compilation problem and at the same time follow the standard approach with non-templated classes.
SomeClass.h:
#ifndef SOMECLASS_H
#define SOMECLASS_H
template<class T>
class SomeClass { /* ... */ };
#include "SomeClass.cpp"
#endif
SomeClass.cpp:
template<class T> void SomeClass<T>::foo() // void return type assumed; foo must also be declared in the class
/*...*/
main.cpp:
#include "SomeClass.h"
int main() {
/* ... */
}
Just don't forget that you do not need to actually compile your SomeClass.cpp. As suggested by others, you might choose a different extension (not .cpp) for the implementation file.
UPD: on reading the article you linked, I see that this is called "Method 3".
I'm pretty clear on when I can/can't use forward declaration but I'm still not sure about one thing.
Let's say I know that I have to include a header sooner or later to de-reference an object of class A.
I'm not clear on whether it's more efficient to do something like..
class A;
class B
{
A* a;
void DoSomethingWithA();
};
and then in the cpp have something like..
#include "A.hpp"
void B::DoSomethingWithA()
{
a->FunctionOfA();
}
Or might I as well just include A's header in B's header file in the first place?
If the former is more efficient then I'd appreciate it if someone clearly explained why as I suspect it has something to do with the compilation process which I could always do with learning more about.
Use forward declarations (as in your example) whenever possible. This reduces compile times, but more importantly minimizes header and library dependencies for code that doesn't need to know and doesn't care for implementation details. In general, no code other than the actual implementation should care about implementation details.
Here is Google's rationale on this: Header File Dependencies
When you use a forward declaration, you explicitly say "class B doesn't need to know anything about the internal implementation of class A, it only needs to know that a class named A exists". If you can avoid including that header, then avoid it: using a forward declaration instead is good practice because it eliminates redundant dependencies.
Also note, that when you change the header file, it causes all files that include it to be recompiled.
These questions will also help you:
What are the drawbacks of forward declaration?
What is the purpose of forward declaration?
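A single-file sketch of the arrangement from the question, with file boundaries marked in comments. The constructor of `B` and the return value of `FunctionOfA` are assumptions added so the example can be exercised:

```cpp
#include <cassert>

// ---- B.hpp: only needs to know that A exists ----
class A;  // forward declaration, no #include "A.hpp"

class B {
public:
    explicit B(A* a) : a_(a) {}
    int DoSomethingWithA();  // defined in B.cpp, where A is complete
private:
    A* a_;  // a pointer to an incomplete type is fine
};

// ---- A.hpp ----
class A {
public:
    int FunctionOfA() { return 42; }  // return value is an illustrative assumption
};

// ---- B.cpp: would #include "B.hpp" and "A.hpp" ----
int B::DoSomethingWithA() {
    return a_->FunctionOfA();  // dereferencing requires the full definition of A
}

// Small self-check; call it from your own main().
void check_forward_declaration() {
    A a;
    B b(&a);
    assert(b.DoSomethingWithA() == 42);
}
```

With this split, a change to A.hpp forces B.cpp to recompile but not the files that include only B.hpp.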
Don't try to make your compilation efficient. There be dragons. Just include A.hpp in B.hpp.
The standard practice for C and C++ header files is to wrap all of the header file in an #ifndef to make sure it is compiled only once:
#ifndef A_HPP // names starting with an underscore and a capital letter are reserved
#define A_HPP
// all your definitions
#endif
That way, if you #include "A.hpp" in B.hpp, you can have a program that includes both, and it won't break because it won't try to define anything twice.
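The effect of the guard can be sketched in a single file by pasting the same guarded text twice, which is roughly what the preprocessor sees when a header is reached through two different include paths. `widget.hpp`, `WIDGET_HPP` and `Widget` are hypothetical names:

```cpp
#include <cassert>

// First inclusion of the (hypothetical) header widget.hpp: WIDGET_HPP is not
// yet defined, so the body is compiled and the macro is set.
#ifndef WIDGET_HPP
#define WIDGET_HPP
struct Widget { int id; };
#endif

// Second inclusion of the same text (e.g. via another header): WIDGET_HPP is
// already defined, so the preprocessor skips the body and Widget is not
// defined a second time.
#ifndef WIDGET_HPP
#define WIDGET_HPP
struct Widget { int id; };
#endif

int widget_id() {
    Widget w{7};
    return w.id;
}

// Small self-check; call it from your own main().
void check_guard() { assert(widget_id() == 7); }
```

Removing the `#ifndef`/`#endif` pairs turns the second copy into a redefinition error.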
I am learning C++ and it hasn't been an enjoyable experience (compared to Java or VBA at least). I have the following code:
//This is in a number.h file
#pragma once
template <class T>
class number{
public:
T value1, value2, result;
public:
T add();
number(T value1_in, T value2_in);
};
//This is in a number.cpp file
template <class T>
number<T>::number(T value1_in, T value2_in){
value1 = value1_in;
value2 = value2_in;
}
template <class T>
T number<T>::add(){
result = value1 + value2;
return result;
}
//This is in main.cpp
#include "number.h"
#include <iostream>
#include <cstdlib> // for system() and EXIT_SUCCESS
using namespace std;
int main(){
int a = 2, b =3;
number<int> n1(a,b);
cout << n1.add();
system("pause");
return EXIT_SUCCESS;
}
Which of course gives me an error. Even though I am pretty sure it should work. More specifically I get a linker error. After 3 hours of looking at this I decided to include number.cpp in main.cpp and that magically made it work. What the hell is going on? I thought I only need to include the header file (I wrote a matrix class with a bunch of linear solvers for different algorithms before this and only included header files in the whole project). Is this C++ specific or compiler specific? (I am using Dev-C++ 4.9.9.2 which has Mingw I guess)
You do not have to put the entire definition into the header file; don't listen to what others tell you :-)
That is the conventional solution, and you and I will do it quite often, but it's not the only solution.
The other option is to simply place this line at the end of your number.cpp file, in order to force that particular template class to be instantiated and fully compiled there.
template class number<int>;
In short, there are two valid solutions. You can use this line, or you can copy the definition into the header file. Depending on context, one approach might be better than the other, but both are valid. See this answer of mine for a more comprehensive discussion of both approaches:
https://stackoverflow.com/a/8752879/146041
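A single-file sketch of the explicit-instantiation approach applied to the code from the question, with file boundaries marked in comments. The function `add_two_and_three` is a hypothetical stand-in for main, and the data members are made private here for brevity:

```cpp
#include <cassert>

// ---- number.h ----
template <class T>
class number {
public:
    number(T value1_in, T value2_in);
    T add();
private:
    T value1, value2, result;
};

// ---- number.cpp: would #include "number.h" ----
template <class T>
number<T>::number(T value1_in, T value2_in)
    : value1(value1_in), value2(value2_in), result() {}

template <class T>
T number<T>::add() {
    result = value1 + value2;
    return result;
}

// Explicit instantiation: forces the members of number<int> to be compiled
// into this translation unit so that main.cpp can link against them.
template class number<int>;

// ---- main.cpp: would only #include "number.h" ----
int add_two_and_three() {
    number<int> n1(2, 3);
    return n1.add();
}

// Small self-check; call it from your own main().
void check_number() { assert(add_two_and_three() == 5); }
```

In one file the explicit instantiation is redundant because the definitions are visible anyway. It only matters once number.cpp and main.cpp are compiled separately, and then any type other than int still produces a linker error.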
Templated classes and functions generally need to stay in the header file (the exception being types that are explicitly instantiated in a source file).
The reason is that whenever you instantiate a template, the compiler (not the preprocessor) generates new code for exactly that instantiation (e.g. number<double>). That's why the classes number<double> and number<int> will not share any relationship: they will be two completely different classes although both were generated from the same template.
For the compiler to be able to generate this code, it must know the whole template definition, not only its declaration. That's why a template normally needs to stay in the header in full.
Including the cpp file in your main.cpp did the trick, as it effectively became a header.
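The claim that two instantiations are unrelated classes can be checked at compile time. This is a small sketch using a simplified, hypothetical `number` with a single member:

```cpp
#include <type_traits>

// Simplified stand-in for the number template from the question.
template <class T>
struct number {
    T value;
};

// Each instantiation is its own class; nothing relates number<int> to
// number<double> beyond having been generated from the same template.
static_assert(!std::is_same<number<int>, number<double>>::value,
              "different template arguments give different types");
static_assert(!std::is_convertible<number<int>, number<double>>::value,
              "no implicit conversion exists between the two instantiations");
static_assert(!std::is_base_of<number<int>, number<double>>::value,
              "neither instantiation derives from the other");
```

The file compiles only if all three assertions hold, so there is nothing to run.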
In C++, templates are just what their name suggests: templates for a class, function etc. As the concrete type of the template parameter is not known in advance, and the compiled object code depends on the actual parameter, they are not compiled (as normal classes and functions are) until you use them in some other source file and the compiler gets to know which type it should substitute for the template parameter.
That's why template functions normally have to be defined in the header file.
See the last section of this documentation or the answers for this similar question for further explanation.
Instead of doing
#include "MyClass.cpp"
I would like to do
#include "MyClass.h"
I've read online that not doing so is considered bad practice.
Separate compilation in a nutshell
First, let's get some quick examples out there:
struct ClassDeclaration; // 'class' / 'struct' mean almost the same thing here
struct ClassDefinition {}; // the only difference is default accessibility
// of bases and members
void function_declaration();
void function_definition() {}
extern int global_object_declaration;
int global_object_definition;
template<class T> // cannot replace this 'class' with 'struct'
struct ClassTemplateDeclaration;
template<class T>
struct ClassTemplateDefinition {};
template<class T>
void function_template_declaration();
template<class T>
void function_template_definition() {}
Translation Unit
A translation unit (TU) is a single source file (should be a *.cpp file) and all the files it includes, and they include, etc. In other words: the result of preprocessing a single file.
Headers
Include guards are a hack to work around the lack of a real module system, making headers into a kind of limited module; to this end, including the same header more than once must not have an adverse effect.
Include guards work by making subsequent #includes no-ops, with the definitions available from the first include. Because of their limited nature, macros which control header options should be consistent throughout a project (oddball headers like <assert.h> cause problems) and all #includes of public headers should be outside of any namespace, class, etc., usually at the top of any file.
See my include guard naming advice, including a short program to generate include guards.
Declarations
Classes, functions, objects, and templates may be declared almost anywhere, may be declared any number of times, and must be declared before referring to them in any way. In a few weird cases, you can declare classes as you use them; won't cover that here.
Definitions
Classes may be defined at most once[1] per TU; this typically happens when you include a header for a particular class. Functions and objects must be defined once in exactly one TU; this typically happens when you implement them in a *.cpp file. However, inline functions, including implicitly inline functions inside class definitions, may be defined in multiple TUs, but the definitions must be identical.
For practical purposes[2], templates (both class templates and function templates) are defined only in headers, and if you want to use a separate file, then use another header[3].
[1] Because of the at-most-once restriction, headers use include guards to prevent multiple inclusion and thus multiple definition errors.
[2] I won't cover the other possibilities here.
[3] Name it blahblah_detail.hpp, blahblah_private.hpp, or similar if you want to document that it's non-public.
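A short sketch of the declaration and definition rules above within a single TU. All names here (`Point`, `area`, `counter`, `next_id`) are hypothetical:

```cpp
#include <cassert>

// Declarations may be repeated any number of times.
struct Point;
struct Point;
int area(int w, int h);
int area(int w, int h);
extern int counter;
extern int counter;

// A class is defined at most once per TU.
struct Point {
    int x, y;
    int sum() const { return x + y; }  // defined in-class: implicitly inline
};

// Non-inline functions and objects are defined once in the whole program.
int area(int w, int h) { return w * h; }
int counter = 0;

// An inline function may be defined in a header included by many TUs,
// as long as every definition is identical.
inline int next_id() { return ++counter; }

// Small self-check; call it from your own main().
void check_definitions() {
    Point p{1, 2};
    assert(p.sum() == 3);
    assert(area(2, 3) == 6);
}
```

Duplicating the definition of `area` or `counter` in a second .cpp file would be a link-time error, whereas duplicating `next_id` or `Point::sum` through a shared header is fine.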
Guidelines
So, while I'm sure everything above is all a big ball of mud so far, it's less than a page on what should take up a few chapters, so use it as a brief reference. Understanding the concepts above, however, is important. Using those, here's a short list of guidelines (but not absolute rules):
Always name headers consistently in a single project, such as *.h for C and *.hpp for C++.
Never include a file which is not a header.
Always name implementation files (which are going to be directly compiled) consistently, such as *.c and *.cpp.
Use a build system which can compile your source files automatically. make is the canonical example, but there are many alternatives. Keep it simple in simple cases. For example, make can be used with its built-in rules, even without a makefile.
Use a build system which can generate header dependencies. Some compilers can generate this with command-line switches, such as -M, so you can make a surprisingly useful system easily.
Build Process
(Here's the tiny bit that answers your question, but you need most of the above in order to get here.)
When you build, the build system will then go through several steps, of which the important ones for this discussion are:
compile each implementation file as a TU, producing an object file (*.o, *.obj)
each is compiled independently of the others, which is why each TU needs declarations and definitions
link those files, along with libraries specified, into a single executable
I recommend you learn the rudiments of make, as it is popular, well-understood, and easy to get started with. However, it's an old system with several problems, and you'll want to switch to something else at some point.
Choosing a build system is almost a religious experience, like choosing an editor, except you'll have to work with more people (everyone working on the same project) and will likely be much more constrained by precedent and convention. You can use an IDE which handles the same details for you, but this has no real benefit over using a comprehensive build system instead, and you really should still know what it's doing under the hood.
File Templates
example.hpp
#ifndef EXAMPLE_INCLUDE_GUARD_60497EBE580B4F5292059C8705848F75
#define EXAMPLE_INCLUDE_GUARD_60497EBE580B4F5292059C8705848F75
// all project-specific macros for this project are prefixed "EXAMPLE_"
#include <ostream> // required headers/"modules"/libraries from the
#include <string> // stdlib, this project, and elsewhere
#include <vector>
namespace example { // main namespace for this project
template<class T>
struct TemplateExample { // for practical purposes, just put entire
void f() {} // definition of class and all methods in header
T data;
};
struct FooBar {
FooBar(); // declared
int size() const { return v.size(); } // defined (& implicitly inline)
private:
std::vector<TemplateExample<int> > v;
};
int main(std::vector<std::string> args); // declared
} // example::
#endif
example.cpp
#include "example.hpp" // include the headers "specific to" this implementation
// file first, helps make sure the header includes anything it needs (is
// independent)
#include <algorithm> // anything additional not included by the header
#include <iostream>
namespace example {
FooBar::FooBar() : v(42) {} // define ctor
int main(std::vector<std::string> args) { // define function
using namespace std; // use inside function scope, if desired, is always okay
// but using outside function scope can be problematic
cout << "doing real work now...\n"; // no std:: needed here
return 42;
}
} // example::
main.cpp
#include <exception> // std::exception
#include <iostream>
#include "example.hpp"
int main(int argc, char const** argv) try {
// do any global initialization before real main
return example::main(std::vector<std::string>(argv, argv + argc));
}
catch (std::exception& e) {
std::cerr << "[uncaught exception: " << e.what() << "]\n";
return 1; // or EXIT_FAILURE, etc.
}
catch (...) {
std::cerr << "[unknown uncaught exception]\n";
return 1; // or EXIT_FAILURE, etc.
}
This is called the separate compilation model. You include class declarations into each module where they are needed, but define them only once.
In addition to hiding implementation details in cpp files (check other replies), you can additionally hide structure details by class forward declaration.
class FooPrivate;
class Foo
{
public:
// public stuff goes here
private:
FooPrivate *foo_private;
};
The declaration class FooPrivate; says that FooPrivate is completely defined somewhere else (preferably in the same file where Foo's implementation resides, before Foo's own code). This way you make sure that implementation details of Foo(Private) aren't exposed via the header file.
You needn't include .c or .cpp files - the compiler will compile them regardless of whether they're #included in other files or not. However, the code in the .c/.cpp files is useless if the other files are unaware of the classes/methods/functions/global vars/whatever that's contained in them. And that's where headers come into play. In the headers, you only put declarations, such as this one:
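A single-file sketch of how the hidden class might be filled in on the implementation side. The constructor, destructor, the `value` accessor and the `secret` member are hypothetical additions; copying is disabled only to keep the raw-pointer version short:

```cpp
#include <cassert>

// ---- Foo.hpp ----
class FooPrivate;  // declared only; its layout stays hidden from includers

class Foo {
public:
    Foo();
    ~Foo();
    Foo(const Foo&) = delete;             // copying disabled to keep the sketch short
    Foo& operator=(const Foo&) = delete;
    int value() const;                    // hypothetical accessor
private:
    FooPrivate* foo_private;
};

// ---- Foo.cpp: would #include "Foo.hpp" ----
class FooPrivate {
public:
    int secret = 42;  // implementation detail, invisible to users of Foo.hpp
};

Foo::Foo() : foo_private(new FooPrivate) {}
Foo::~Foo() { delete foo_private; }
int Foo::value() const { return foo_private->secret; }

// Small self-check; call it from your own main().
void check_foo() {
    Foo f;
    assert(f.value() == 42);
}
```

Changing the members of `FooPrivate` then only requires recompiling Foo.cpp, not every file that includes Foo.hpp.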
//myfile.hpp
class MyClass {
public:
MyClass (void);
void myMethod (void);
static int myStaticVar;
private:
int myPrivateVar;
};
Now, all .c/.cpp files that will #include "myfile.hpp" will be able to create instances of MyClass, operate on myStaticVar and call MyClass::myMethod(), even though there's no actual implementation here! See?
The implementation (the actual code) goes into myfile.cpp, where you tell the compiler what all your stuff does:
//myfile.cpp
int MyClass::myStaticVar = 0;
MyClass::MyClass (void) {
myPrivateVar = 0;
}
void MyClass::myMethod (void) {
myPrivateVar++;
}
You never include this file anywhere, it's absolutely not necessary.
A tip: create a main.hpp (or main.h, if you prefer - makes no difference) file and put all the #includes there. Each .c/.cpp file will then only need to have this line: #include "main.hpp". This is enough to have access to all classes, methods etc. you declared in your entire project :).
You should not include a source file (.c or .cpp). Instead you should include the corresponding header file (.h) containing the declarations. The source files need to be compiled separately and linked together to get the final executable.
Cpp files should be listed in your build script so that they are compiled into object files.
What IDE are you using?
I am going to assume you are compiling with GCC, so here is the command to compile two .cpp files into one executable (g++ is the GCC driver that also links the C++ standard library):
g++ -o myclasses.out myclass.cpp myotherclass.cpp
You should only use #include to include class definitions, not the implementation.
One thing you will want to watch out for when including your class declarations from a .h/.hpp is to make sure it only ever gets included once per source file. If you don't do this you will get some possibly cryptic compiler errors that will drive you up the wall.
To do this you need to tell the compiler, using a #define, to include the file only if the #define does not already exist.
For example (MyClass.h):
#ifndef MYCLASS_H
#define MYCLASS_H
class MyClass
{
// Members and methods
};
#endif
// End of file
This will guarantee your class declaration only gets included once even if you have it included in many different .cpp files.
It used to be that to use a template class in C++, the implementations had to be in the header file or #included into the header file at the bottom.
I've not used C++ templates for a few years; I just started using them again and observed that this behavior seems to persist. Is this still the case? Or are the compilers smart enough now to have the implementation separate from the interface?
Technically they do not need to be in the header file.
An example of this usage is when you have a template class with a fixed set of versions (let's say, for argument's sake, char and wchar_t). Then you can put all the method definitions into a source file and explicitly instantiate these two versions. This has the safety that others cannot instantiate the template for types it was not meant to be used for.
// X.h
template<typename T>
class X
{
// DECLARATION ONLY OF STUFF
public:
X(T const& t);
private:
T m_t;
};
// X.cpp
#include "X.h"
// DEFINITION OF STUFF
template<typename T>
X<T>::X(T const& t)
:m_t(t)
{}
// INSTANTIATE the versions you want.
template class X<char>;
template class X<wchar_t>;
// Main.cpp
#include "X.h"
int main()
{
X<char> x1('a');
X<wchar_t> x2(L'A');
// X<int> x3(5); // Uncomment for a linker failure.
}
Assuming people can't just directly include X.cpp (because it is not provided by the distribution), others cannot use X<int> or X<float> etc. But the above classes are fully defined.
I have also seen this technique used to reduce compilation time. Because each compilation unit is not regenerating the same version of X, we only get the definition in one place (thus one compilation cost). The downside is that you must manually instantiate each separate version of X that you use.
To separate the implementation from the declaration the standard forces you to use the export keyword. As far as I know there's only one compiler that knows how to handle it: Comeau.
However, C++0x will include a mechanism that tells the compiler not to instantiate certain specializations automatically (extern templates). So, if you want to cut compilation time you will be able to do so by explicitly instantiating some specializations in one compilation unit and declaring them in the header as extern.
You are referring to exported templates (using the export keyword), which seem to be supported only by Comeau C++ (according to this section of the C++ FAQ Lite).
A common technique to keep the interface devoid of implementation code is to put the inline function definitions into a separate "implementation" header that can be included at the end of the declaration header.
Export is only supported by the EDG frontend, commercially only available in the Comeau compiler as far as I know.
Export doesn't eliminate the need for source disclosure, nor does it reduce compile dependencies, while it requires a massive effort from compiler builders.
So Herb Sutter himself asked compiler builders to 'forget about' export, as the time investment needed would be better spent elsewhere. I don't think export will ever be implemented by other compilers after they saw how long it took and how little was gained.
The paper is called "Why we can't afford export". It's listed on Sutter's blog but there's no PDF there (a quick Google search should turn it up though). It's six years old now; I suppose they all listened and never bothered :)
Many people use two header files (e.g. .hpp and .ipp), one with only the declaration, and one with the definitions, then it's simply a matter of including one in the other.
foo.hpp
#ifndef MY_TEMPLATES_HPP
#define MY_TEMPLATES_HPP
template< class T >
void foo(T & t);
#include "foo.ipp"
#endif
foo.ipp
#ifdef MY_TEMPLATES_IPP
nonsense here, that will generate compiler error
#else
#define MY_TEMPLATES_IPP
template< class T >
void foo(T & t) {
... // long function
}
#endif
This only gains some clarity of course, nothing really changes compared to simply inlining everything in one header file.
GCC goes through a lengthy collect stage unless you explicitly instantiate all templates. VC++ seems to cope, but I prefer to avoid this step anyway, and in cases where I know how a template is going to be used (which is usually the case for applications, not so much for libraries), I put template definitions into a separate file. This also makes code more readable by making declarations less cluttered with implementation details.