Automatically separate class definitions from declarations? - c++

I am using a library that consists almost entirely of templated classes and functions in header files, like this:
// foo.h
template<class T>
class Foo {
Foo(){}
void computeXYZ() { /* heavy code */ }
};
template<class T>
void processFoo(const Foo<T>& foo) { /* more heavy code */ }
Now this is bad because compile times are unbearable whenever I include one of those header files (and actually I include many of them in each of my compilation units).
Since as a template parameter I only use one or two types anyway I am planning to create, for each library header file, a file that contains only declarations, without the heavy code, like this:
// NEW: fwd-foo.h
template<class T>
class Foo {
Foo();
void computeXYZ();
};
template<class T>
void processFoo(const Foo<T>& foo);
And then one file that creates all the instantiations that I'll need. That file can be compiled separately once and for all:
// NEW: foo.cpp
#include "foo.h"
template class Foo<int>;
template class Foo<double>;
template void processFoo(const Foo<int>& foo);
template void processFoo(const Foo<double>& foo);
Now I can just include fwd-foo.h in my code and have short compile times. I'll link against foo.o at the end.
The downside, of course, is that I have to create these new fwd-foo.h and foo.cpp files myself. And of course it's a maintenance problem: When a new library version is released I have to adapt them to that new version. Are there any other downsides?
And my main question is:
Is there any chance I can create these new files, especially fwd-foo.h, automatically from the original foo.h? I have to do this for many library header files (maybe 20 or so), and an automatic solution would be best especially in case a new library version is released and I have to do this again with the new version. Are any tools available for this task?
EDIT:
Additional question: How can the newly supported extern keyword help me in this case?

We use lzz which splits out a single file into a separate header and translation unit. By default, it would normally put the template definitions into the header too, however, you can specify that you don't want this to happen.
To show you how you might use it consider the following:
// t.cc
#include "b.h"
#include "c.h"
template <typename T>
class A {
void foo () {
C c;
c.foo ();
b.foo ();
}
B b;
};
Take the above file and copy it to 't.lzz' file. Place any #include directives into separate $hdr and $src blocks as necessary:
// t.lzz
$hdr
#include "b.h"
$end
$src
#include "c.h"
$end
template <typename T>
class A {
void foo () {
C c;
c.foo ();
b.foo ();
}
B b;
}
Now finally, run lzz over the file specifying that it places the template definitions into the source file. You can either do this using a $pragma in the source file, or you can use the command line option "-ts":
This will result in the following files being generated:
// t.h
//
#ifndef LZZ_t_h
#define LZZ_t_h
#include "b.h"
#undef LZZ_INLINE
#ifdef LZZ_ENABLE_INLINE
#define LZZ_INLINE inline
#else
#define LZZ_INLINE
#endif
template <typename T>
class A
{
void foo ();
B b;
};
#undef LZZ_INLINE
#endif
And:
// t.cpp
//
#include "t.h"
#include "c.h"
#define LZZ_INLINE inline
template <typename T>
void A <T>::foo ()
{
C c;
c.foo ();
b.foo ();
}
#undef LZZ_INLINE
You can then run these through some grep/sed commands to remove the LZZ helper macros.

Try using precompiled headers. I know GCC and MSVC support this feature. Usage is vendor-specific, though.

I have been working on the very same issue for quite a while now. In the solution you're proposing, you are defining your template classes twice. It will be OK as long as both definitions declare the same stuff (in the same order), but you're bound to have problems sooner or later.
What I have come up with is to consider the problem the other way around. As long as you are not specializing your implementation, it works gracefully.
It uses two macros, which avoids having to update the template arguments in the implementation file (be careful, though, if you want to add default template arguments to the class).
// foo.h
#define FOO_TEMPLATE template<typename T>
#define FOO_CLASS Foo<T>
FOO_TEMPLATE
class Foo {
Foo();
void computeXYZ();
};
// foo_impl.h
#include "foo.h"
FOO_TEMPLATE
FOO_CLASS::Foo(){}
FOO_TEMPLATE
void FOO_CLASS::computeXYZ() { /* heavy code */ }
By doing this, you essentially work the same way you do with non-template classes (you can do the same thing with template functions, of course).
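The caveat about default template arguments can be made concrete. A default argument may appear on the class template's declaration but not in the template parameter list of an out-of-class member definition, so a single FOO_TEMPLATE macro cannot serve both places once a default is added. A minimal sketch of one way around it, assuming a hypothetical third macro FOO_TEMPLATE_DECL and a hypothetical member sizeOfT() (neither is part of the answer above):

```cpp
#include <cassert>

#define FOO_TEMPLATE_DECL template<typename T = int>  // hypothetical: class declaration only
#define FOO_TEMPLATE      template<typename T>        // member definitions
#define FOO_CLASS         Foo<T>

// foo.h
FOO_TEMPLATE_DECL
class Foo {
public:
    Foo();
    int sizeOfT() const;  // hypothetical member for the demo
};

// foo_impl.h
FOO_TEMPLATE
FOO_CLASS::Foo() {}

FOO_TEMPLATE
int FOO_CLASS::sizeOfT() const { return static_cast<int>(sizeof(T)); }

// Demo: Foo<> picks up the default argument from the declaration.
int default_arg_demo() {
    Foo<> f;  // T defaults to int
    assert(f.sizeOfT() == static_cast<int>(sizeof(int)));
    return f.sizeOfT();
}
```

The member definitions keep using the plain FOO_TEMPLATE, so only the class declaration has to change when a default is introduced.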
EDIT : about the extern keyword in c++0x
I believe the extern keyword in c++0x will help, but it won't solve everything magically !
From this article,
Extern Templates
Every module that instantiates a template essentially creates a copy of it in the object code. Then, it's up to the linker to dispose of all of the redundant object code at the very last stage, thus slowing the critical edit-compile-link cycle that makes up a programmer's day (or sometimes daydreams). To short-circuit this object code garbage collection, a number of compiler vendors have already implemented an extern keyword that can go in front of templates. This is a case of standardization codifying existing industry practice (pun intended). In practice, this is implemented by sending a notice to the compiler basically to "not instantiate this here":
extern template class std::vector;
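Applied to the Foo class from the question, the idea might look like the sketch below. An explicit instantiation declaration has to name a concrete specialization such as Foo<int>. Everything is shown in one file for brevity, and twice() and use_foo() are hypothetical stand-ins for the library's heavy code and for client code:

```cpp
#include <cassert>

// Stand-in for the library's foo.h (simplified; twice() replaces the heavy code).
template <class T>
class Foo {
public:
    T twice(T x) const;
};

template <class T>
T Foo<T>::twice(T x) const { return x + x; }

// Goes in a header every client includes: an explicit instantiation
// declaration, telling the compiler not to instantiate Foo<int> here.
extern template class Foo<int>;

// Goes in exactly one .cpp file (foo.cpp in the question): the explicit
// instantiation definition that actually emits the code.
template class Foo<int>;

// Hypothetical client code: uses Foo<int> without instantiating it again.
int use_foo() {
    Foo<int> f;
    int result = f.twice(21);
    assert(result == 42);
    return result;
}
```

Unlike the hand-written fwd-foo.h, the client still parses the full template definitions, so this mainly saves instantiation and code generation time rather than parsing time.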

C++0x will fix your compile time issues with extern templates. I don't know an automatic way to do what you ask, though.

Related

Declaration and Definition Separation for C++ Templates

So lately I have been making progress on building a personal library using Template Metaprogramming (using the book Modern C++ Design as my current reference).
I have been thinking about the best way to layout my template code. Obviously, the compiler wants the definitions of the template visible in the same file it is declared. However, I want to separate the declaration from the definition so I have less to look at when I just want to see a method prototype or something.
So to solve this, what I have been starting to do can be illustrated by the following example:
SomeClass.hpp
#ifndef _someclass_hpp
#define _someclass_hpp
template<typename T>
class SomeClass {
public:
...
private:
...
};
#include "SomeClass_Implementation.hpp"
#endif
SomeClass_Implementation.hpp
#ifndef _someclass_impl_hpp
#define _someclass_impl_hpp
#include "SomeClass.hpp"
/* SomeClass Implementation... */
#endif
I personally like this more than having everything in one file, but I am curious if anyone has any tips on approaching this or any reasoning that might make me consider just dumping it all into one file.
One approach I like is to do this:
Header.hpp:
#ifndef FileGuard_whatever // not reserved: only names starting with an underscore
#define FileGuard_whatever // plus a capital letter (or containing __) are
template <typename T>
class A
{
void go ( T const& value ) ;
} ;
#include "Header.hxx" // implementation details
#endif // FileGuard
Header.hxx:
template <typename T>
void A<T>::go ( T const& value ) { }
Which is pretty much exactly what you do. I omit the additional include guard in the other file and give it a name that lets people know not to include it directly.
I've had good luck with this. It takes people a second to figure it out, but the .hxx extension helps at least a bit with that ("It's different; OK, how is it different?").

Can member functions of class templates be inlined in case of explicit instantiation?

Can doSth() be inlined given this configuration?
// A.h
template<typename T>
struct A
{
void doSth();
};
// A.cpp
template<typename T>
void A<T>::doSth() { /* do something */ }
template class A<bool>;
template class A<int>;
// main.cpp
#include "A.h"
int main()
{
A<bool> a;
a.doSth();
}
If the answer is negative I'd go define my member functions in a .tpp file and include that at the end of "A.h" but that would just look weird with the non-inline versions in a .cpp file so I'd want to avoid that.
Most compilers cannot inline with that arrangement of code. The ICC compiler documentation claims it can be invoked in a way that enables that inlining, but only after you first build a different way, gather profiling data, and then feed that data back into a cross-module optimizing build. I made only modest attempts to get that to work, and it worked only in toy-sized projects, not in anything real.
For use with ordinary compilation, you should have that extra file for the function definitions you want inline, but you should probably not include it in the end of A.h, rather include A.h in the beginning of it and include it in select cpp files that really need it.
I prefer
// A.h
#ifndef A_H
#define A_H
template<typename T>
struct A
{
inline void doSth();
};
#endif
// A.tpp
#ifndef A_TPP
#define A_TPP
#include "A.h"
template<typename T>
inline void A<T>::doSth() { /* do something */ }
#endif
// Various other .h files that need to know what is declared in A
#include "A.h"
// Only cpp files that need what is defined in A.tpp
#include "A.tpp"
I forget the option as well as which compilers have such an option, but that nearly redundant use of inline in the .h file goes with a compiler option saying that in case the function is declared that way and used and not defined, throw a compile time error.
Without that option, the link time error is harder to read but does tell you which .cpp needed to include the .tpp but missed it.
You most definitely can. You also forgot the ; at the end of your struct definition.
struct a
{
// Whatever
};
Is how it should be.
Visual Studio does allow in-lining in this way but it limits the types that you can instantiate to just those that you defined in your A.cpp file.

return a typedef type when using separate compilation

Here are the files I am working on:
class.h
#include <vector>
using std::vector;
template<class T>
class test {
private:
vector<T> data;
public:
typedef vector<T> vt;
typedef typename vt::iterator it;
test() {
}
;
it find(T x);
}
and class.cpp
#include "class.h"
it test::find(T x) {
return find(data.begin(), data.end(), x);
}
The codes work if I put the implementation of find inside the class declaration.
However, when I separate the implementation from the class, the compiler reports an error "expected initializer before test".
How to fix it? Is the problem related to the scope of typedef/typename?
Sorry for my poor English, it is my secondary language.
Please point out any error in my codes as well as my english
Thank you for your help.:D
When the compiler sees it, it can't yet know that you mean test<T>::it. So you have to tell it:
template<class T> typename test<T>::it test<T>::find(T x) {
// The following line doesn't compile, but that's another issue:
// return find(data.begin(), data.end(), x);
}
See http://ideone.com/Rtho2 for a working program.
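As for the commented-out line: inside the member function, the unqualified name find most likely resolves to the member function itself, which hides std::find. A minimal sketch of one way to make it compile, assuming the algorithm is called with an explicit std:: qualification; add() and end() are hypothetical helpers added only so the example can be exercised:

```cpp
#include <algorithm>  // std::find
#include <cassert>
#include <vector>

template <class T>
class test {
public:
    typedef std::vector<T> vt;
    typedef typename vt::iterator it;

    void add(T x) { data.push_back(x); }  // hypothetical helper, not in the question
    it end() { return data.end(); }       // hypothetical helper, not in the question
    it find(T x);

private:
    vt data;
};

// Out-of-class definition: the return type must be spelled test<T>::it,
// with typename because it names a type that depends on T.
template <class T>
typename test<T>::it test<T>::find(T x) {
    // Qualify the call so the member find() does not hide the algorithm.
    return std::find(data.begin(), data.end(), x);
}

void demo_find() {
    test<int> t;
    t.add(3);
    t.add(7);
    assert(*t.find(7) == 7);       // present
    assert(t.find(5) == t.end());  // absent
}
```

Note that this definition still has to be visible wherever test<T>::find is instantiated, so it belongs in the header (or needs an explicit instantiation in class.cpp).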
Keep declaration and definitions separate (NOT SHARING implementation)
To avoid sharing the template code, you have to explicitly instantiate the template in a .cc/.cpp file for every type it can be used with, like:
template class Foo< int >;
template class Foo< double >;
PROS:
Separation of concerns, implementation and declarations are in separate files.
No need to share your implementation
CONS:
Not very generic: you have to know beforehand which types are required.

Template Linking and Separation of .h and .cc Files

I've been doing some reading into designing template code and have a question about it. Most of the solutions to problems relating to designing code as templates seem to be one of the following:
Put definitions of prototypes into the header file
Use the export keyword like this (which requires an extra compiler option)
Specifically lay out how the templates will be used in the .cc/.cpp file.
For example:
// File: foo_impl.cc
// We're working with Class Foo
#include "foo.cc"
template class Foo <int>;
template class Foo <string>;
// etc.
None of these methods seem very effective. Unless if I'm missing something, they don't seem to offer the ability for a user to simply import the header file and link the template code (in a .cc file) without doing extra work. I was wondering if people could take a look at what I'm doing with my own code and tell me if these violate some kind of best practices protocol or if they could cause an issue that I'm just not seeing. Here's what I've been doing...
In main.cc:
#include <iostream>
#include "foo.h"
using namespace std;
int main (void) {
Foo <string> f ("hello world");
string s = f.get ();
cout << s << endl;
return 0;
}
In foo.h:
#ifndef FOO_H
#define FOO_H
template <class T>
class Foo {
public:
Foo (T);
T get ();
private:
T data;
};
#endif
#include "foo.cc"
In foo.cc:
#ifndef FOO_CC
#define FOO_CC
#include "foo.h"
template <class T>
Foo <T> :: Foo (T stuff) {
data = stuff;
}
template <class T>
T Foo <T> :: get () {
return data;
}
#endif
I've been able to compile the code with all warnings in gcc 4.1.2. Thank you!
I personally prefer to put the declaration (.h) in a separate file from the definition (your .cc file). Also, I'd avoid including the .cc file in the .h file. Methods in a .h file should only be inline methods.
Let's say in your example you also had a header file (bar.h) that simply declares a class that has a Foo data member.
Every time you modified the definition of the Foo class, you would cause a recompile of everyone who includes bar.h, even though they couldn't care less about the definition of Foo. However, bar.cpp is probably where you actually implement stuff, and that file DOES need to include the implementation of your template. This seems trivial in small projects, but it becomes a source of headaches in big projects that constantly recompile files for no reason. I've seen people throwing SSDs and Incredibuild at stuff that could be fixed by simple forward declarations and better header management.
Personally, I use .imp.h for the implementation of my templates. Including cc files or cpp files seems yucky to me.
For example ( sorry for compilation errors. ;) )
// foo.h
#ifndef foo_h
#define foo_h
template< typename T >
struct Foo
{
Foo( T value );
void print();
T _value;
};
#endif
//foo.imp.h
#ifndef foo_imp_h
#define foo_imp_h
#include "foo.h"
#include <iostream>
template< typename T >
Foo< T >::Foo( T value ) : _value( value ) {}
template< typename T >
void Foo< T >::print() { std::cout << _value << std::endl; }
#endif
// bar.h
#ifndef bar_h
#define bar_h
#include "foo.h"
struct Bar {
Foo< int > _intFoo;
Foo< double > _doubleFoo;
void print();
};
#endif
// bar.cpp
#include "bar.h"
#include "foo.imp.h"
void Bar::print()
{
_intFoo.print();
_doubleFoo.print();
}
// foobar.cpp
#include "bar.h"
void foobar()
{
Bar bar;
bar.print();
}
Had the definition of Foo been included in or by foo.h, both bar.cpp and foobar.cpp would have been recompiled whenever Foo's implementation changed. Since only bar.cpp is concerned with Foo's implementation, splitting the definition and declaration of Foo into two files and not having foo.h include foo.imp.h at the end saved me a recompile of foobar.cpp.
This is something that happens all the time in projects and can be very easily avoided by following the .h/.imp.h rule I explained above. The reason you never see this in stuff like STL or boost is because you are not modifying those files. It doesn't matter if they are in one or two files. But in your own projects, you will be constantly modifying the definitions of your templates and this is how you reduce recompilation times.
If you already know beforehand which types are actually going to be used with your template, then do not even bother with the .imp.h file. Put everything in a .cpp and do this at the end:
// foo.cpp
// Implementation goes here.
// You might need to put something in front so that it gets exported from your DLL,
// depending on the platform
template struct Foo< int >;
template struct Foo< double >;
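Put together, the "everything in a .cpp plus explicit instantiations" layout could look like the following sketch. It is shown as one file with comments marking where each part would live; get() and client() are hypothetical additions for the demo:

```cpp
#include <cassert>

// --- would live in foo.h: declarations only ---
template <typename T>
struct Foo {
    Foo(T value);
    T get() const;  // hypothetical accessor added for the demo
    T _value;
};

// --- would live in foo.cpp: definitions plus explicit instantiations ---
template <typename T>
Foo<T>::Foo(T value) : _value(value) {}

template <typename T>
T Foo<T>::get() const { return _value; }

template struct Foo<int>;     // code for Foo<int> is emitted here
template struct Foo<double>;  // code for Foo<double> is emitted here

// --- would live in a client .cpp that only includes foo.h ---
int client() {
    Foo<int> f(3);
    assert(f.get() == 3);
    return f.get();
}
```

In the real multi-file layout, a client that tries Foo<char> compiles but fails at link time, because no translation unit ever instantiated it.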
Let's start with the basic ideology of .h and .cc files. When building libraries, the idea is to share only your header files and not your implementation (that is, the .cc files). This is also the basis of OOP's encapsulation, abstraction, etc.: hiding the implementation details.
Templates in C++ violate this principle because C++ is a compiled language and the compiler generates all the needed code during compilation. To adhere to OOP we end up with fragile templates that are not 100% generic in nature.
Keep declaration and definitions separate (SHARING implementation)
If you just want to keep things clean and in order, you can put your implementation in another header file and include it. I think it should be a header file, as this follows the basic convention that we share .h files and do not share .cc files (unless you are sharing the code itself). Here is how the files look.
foo.h
This is a simple file that ends by including foo_impl.h.
#ifndef FOO_H
#define FOO_H
template <class T>
class Foo {
public:
Foo (T);
T get();
private:
T data;
};
#include "foo_impl.h"
#endif
foo_impl.h
This one is a bit different from the norm. Here we do not guard the header file's content. Instead, we raise an error if someone includes foo_impl.h directly (which in our case does not make sense).
#ifndef FOO_H
#error 'foo_impl.h' is not supposed to be included directly. Include 'foo.h' instead.
#endif
template <class T>
Foo <T> :: Foo (T stuff) {
data = stuff;
}
template <class T>
T Foo <T> :: get () {
return data;
}
Now anyone who tries to include foo_impl.h directly will get an error like:
foo_impl.h:2:2: error: #error 'foo_impl.h' is not supposed to be included directly. Include 'foo.h' instead.
PROS:
Separation of concerns: implementation and declarations are in separate files.
Guarding the implementation file prevents accidental inclusion.
The header file that users include is not bloated with implementation code.
CONS:
As mentioned above, you have to share the implementation.
NOTE: To avoid sharing the template code, I think you already know that you have to explicitly instantiate the template for all the types with which the end user can use it.
Including .cc files is bad news and defeats the purpose of separating implementation from declaration.
Define templates in headers:
#ifndef FOO_H
#define FOO_H
template <class T>
class Foo {
public:
Foo (T);
T get ();
private:
T data;
};
// implementation:
template <class T>
Foo <T> :: Foo (T stuff) {
data = stuff;
}
template <class T>
T Foo <T> :: get () {
return data;
}
#endif
If you really prefer 2 files then make the second one a .h too. Name it foo_impl.h or something.
It is common to effect the separation of interface and implementation for templates by #include-ing the implementation of the templates at the end of the header, but there are three problems with how you are doing it:
You're including the .h file in the .cc file. Don't; there should be nothing but function definitions in the implementation file.
Your .cc file should not be named .cc; it should be named .template or something similar to let people know that it should not be compiled (just as headers should not be compiled).
The #include "foo.cc" in foo.h should be inside the include guards, not outside.
Done this way, there is no extra work for the user to be done. All you do is #include the header, and you're done. You don't compile the implementation.
Since you are including foo.cc in foo.h, you'll make your life simpler by putting all the code into foo.h and getting rid of foo.cc. There is no advantage to be gained from splitting the code into two pieces.
The export keyword for templates was removed from the language in C++11, so code that relies on it is a dead end. Put your definitions in the header file itself.

When are includes not needed?

I admit I'm a bit naive when it comes to includes. My understanding is that if you use it in your class you either need to include it or forward declare it. But then I was going through some code and I saw this:
// file: A.cpp
#include "Helper.h"
#include "B.h"
#include "C.h"
#include "A.h"
// ...
// file: A.h
B b;
C c;
// ...
// file: B.h
Helper h;
// ...
// file: C.h
Helper h;
// ...
Can someone explain to me why B and C do not need to include Helper? Also, what are the advantages/disadvantages to organizing includes this way? (Besides the obvious less typing.)
Thanks.
When you #include some header (or other) file into a .cpp file, that #include statement is simply replaced by the content of the header file. For example:
//header.h
int i;
float f;
// file.cpp
#include"header.h"
int main()
{}
After preprocessing stage, file.cpp will look like,
int i;
float f;
int main()
{}
You can see this with g++ by running g++ -E file.cpp > TEMP, which writes just the preprocessed source to TEMP.
In the context of your question, Helper.h is included in A.cpp before B.h and C.h, so by the time the compiler reaches the Helper objects declared in those headers, the definition of Helper is already visible.
Also, it's not good practice to rely on the order of header files to get the code working, because once you alter the order even a little, the whole hierarchy collapses with several compilation errors.
Instead, have each file #include everything it uses, and use #ifndef guards to avoid multiple inclusion:
//helper.h
#ifndef HELPER_H
#define HELPER_H
// content of helper file
#endif
If the class definitions for B and C don't actually refer to any of the members of the Helper class, then the compiler doesn't need to see the full definition of the Helper class in their header files. A forward declaration of the Helper class is sufficient.
For example, if the definition of the B class only uses pointers or references to Helper, then a forward declaration is recommended:
class Helper;
class B {
// <SNIP>
Helper* helper;
// <SNIP>
void help(const Helper& helper);
// <SNIP>
};
If the definition of the B class uses an instance of the Helper class (ie. it needs to know the size of the Helper instance), or otherwise refers to the definition of the Helper class (in template functions eg.), then you need to make the full definition of the Helper class visible (most likely by including the header file that defines the Helper class) :
#include "helper.h"
class B {
// <SNIP>
Helper helper;
// <SNIP>
void help(Helper helper);
// <SNIP>
};
The rule of when to use includes versus forward declarations is relatively straightforward : use forward declarations when you can, and includes when you have to.
The advantage of this is clear: the fewer includes you have, the fewer dependencies there are between header files (and the more you'll speed up compilation).
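The split between "forward declaration in the header" and "full definition where members are used" can be sketched in a single file as follows. The comments mark which file each part would normally live in, and value() and forward_decl_demo() are hypothetical names used only for illustration:

```cpp
#include <cassert>

// --- B.h: only a pointer to Helper is stored, so a forward declaration suffices ---
class Helper;

class B {
public:
    explicit B(const Helper* helper) : helper_(helper) {}
    int help() const;  // defined where Helper is a complete type
private:
    const Helper* helper_;
};

// --- helper.h: the full definition (value() is a hypothetical member) ---
class Helper {
public:
    int value() const { return 7; }
};

// --- B.cpp: includes helper.h, so Helper's members can be used here ---
int B::help() const { return helper_->value(); }

int forward_decl_demo() {
    Helper h;
    B b(&h);
    assert(b.help() == 7);
    return b.help();
}
```

With this layout, files that include only B.h are not recompiled when Helper's definition changes.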
Think of #include as literally including the text of the other file in this one - it's exactly the same as if you had copied it in. So in this case, the reason B and C don't need to include Helper is because you've included it in the same "compilation unit", which is what the combination of a .cpp file and all its includes is called.
With templates, a forward declaration might be enough even if the members of the forward declared type are referred to in the header file.
For example, the boost::shared_ptr<T> implementation only forward declares boost::weak_ptr<T> even though it is used in two constructors. Here is a code snippet from http://www.boost.org/doc/libs/1_47_0/boost/smart_ptr/shared_ptr.hpp:
namespace boost
{
// ...
template<class T> class weak_ptr;
// ...
template<class T> class shared_ptr
{
// ...
public:
// ...
template<class Y>
explicit shared_ptr(weak_ptr<Y> const & r): pn(r.pn) // may throw
{
// it is now safe to copy r.px, as pn(r.pn) did not throw
px = r.px;
}
template<class Y>
shared_ptr( weak_ptr<Y> const & r, boost::detail::sp_nothrow_tag ): px( 0 ), pn( r.pn, boost::detail::sp_nothrow_tag() ) // never throws
{
if( !pn.empty() )
{
px = r.px;
}
}
In this case, a forward declaration of boost::weak_ptr<T> is enough since the two constructors do not get instantiated unless the definition for the forward declared boost::weak_ptr<T> has been included in the compilation unit that is using those constructors.
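The same effect can be reproduced with a much smaller pair of classes. The sketch below uses hypothetical, heavily simplified stand-ins named Shared and Weak (not the Boost classes) to show that a constructor template may touch members of a type that is only forward declared at that point:

```cpp
#include <cassert>

template <class T> class Weak;  // forward declaration only

template <class T>
class Shared {
public:
    Shared() : px(0) {}

    // r.px depends on Y, so it is only checked when this constructor
    // is actually instantiated, by which time Weak<Y> must be complete.
    template <class Y>
    explicit Shared(Weak<Y> const& r) : px(r.px) {}

    T* px;
};

// Full definition, needed only by code that uses the constructor above.
template <class T>
class Weak {
public:
    explicit Weak(T* p) : px(p) {}
    T* px;
};

int template_forward_decl_demo() {
    int value = 5;
    Weak<int> w(&value);
    Shared<int> s(w);  // the constructor template is instantiated here
    assert(*s.px == 5);
    return *s.px;
}
```

Code that never calls the Weak-taking constructor can use Shared<T> without ever seeing the definition of Weak.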