Separating template interface and implementation in C++ - c++

This is a follow up question to:
Using export keyword with templates
As mentioned in the answers of the original questions 'export' is deprecated in C++0x and rarely supported by compilers even for C++03. Given this situation, in what way can one hide actual implementations in lib files and just expose declarations through header files, So that end user can know what are the signatures of the exposed API but not have access to the source code implementing the same?

In practice you cannot.
Only if you have a certain set of specializations, you can put these in a library. The base template cannot be put there.
On the other hand, using export did not hide the source. The compiler still needed it to instantiate new classes from the template.

In short, you can't. The export keyword was a failed attempt to achieve something akin to non-source template libraries (though not even approaching the level of obfuscation that binary code achieves), and there is no replacement in the offing.

One thing I have often noticed is that a good chunk of template code, is not so template in fact, and can be moved to non-template functions.
It also happens that function template specialization are considered as regular functions: you can either define them inline (and mark them so) or declare them in a header and implement them in a source file.
Of course, specialization means that you know with which type it will be executed...
Note that what you are asking for is somewhat antithetic.
The very goal of template is to create a "pattern" so that the compiler can generate classes and functions for a multitude of unrelated types. If you hide this pattern, how do you expect the compiler to be able to generate those classes and functions ?

You can use extern template in most recent compilers : http://en.wikipedia.org/wiki/C%2B%2B0x#Extern_template
However, it's unperfect as it only limit template instantiation. The idea is that you separate the template declaration and implementation in two seperate files.
Then when you need the template, you use extern template first, to make sure it's not instantiated yet. Then for each instantiation you need ( one for std::vector, one for std::vector, etc) , put the instantiation in a typedef that will be in a unique cpp.
As it makes the code clearly harder to understand, it's not the best solution yet. But it does works : it helps minimize template instantiations.

Related

How does the compilation of templates work?

I'm reading a book about how templates work, and I'm having difficulty understanding this explanation of templates.
It says
When the compiler sees the definition of a template, it does not generate code. It generates code only when we instantiate a specific instance of the template. The fact that code is generated only when we use a template (and not when we define it) affects how we organize our source code and when errors are detected...To generate an instantiation, the compiler needs to have the code that defines a function template or class template member function. As a result, unlike non-template code, headers for templates typically include definitions as well as declarations.
What exactly does it mean by "generate code"? I don't understand what is different when you compile function templates or class templates compared to regular functions or classes.
The compiler generates the code for the specific types given in the template class instantiation.
If you have for instance a template class declaration as
template<typename T>
class Foo
{
public:
T& bar()
{
return subject;
}
private:
T subject;
};
as soon you have for example the following instantiations
Foo<int> fooInt;
Foo<double> fooDouble;
these will effectively generate the same linkable code as you would have defined classes like
class FooInt
{
public:
int& bar()
{
return subject;
}
private:
int subject;
}
and
class FooDouble
{
public:
double& bar()
{
return subject;
}
private:
double subject;
}
and instantiate the variables like
FooInt fooInt;
FooDouble fooDouble;
Regarding the point that template definitions (don't confuse with declarations regardless of templates) need to be seen with the header (included) files, it's pretty clear why:
The compiler can't generate this code without seeing the definition. It can refer to a matching instantiation that appeared first at linking stage though.
What does a non-template member function have that allows for it to
be defined outside of the header that a template function doesn't
have?
The declaration of a non-template class/member/function gives a predefined entry point for the linker. The definition can be drawn from a single implementation seen in a compiled object file (== .cpp == compilation unit).
In contrast the declaration of a templated class/member/function might be instantiated from arbitrary compilation units given the same or varying template parameters. The definition for these template parameters need's to be seen at least once. It can be either generic or specialized.
Note that you can specialize template implementations for particular types anyway (included with the header or at a specific compilation unit).
If you would provide a specialization for your template class in one of your compilation units, and don't use your template class with types other than specialized, that also should suffice for linking it all together.
I hope this sample helps clarifying what's the difference and efforts done from the compiler.
A template is a pattern for creating code. When the compiler sees the definition of a template it makes notes about that pattern. When it sees a use of that template it digs out its notes, figures out how to apply the pattern at the point where it's being used, and generates code according to the pattern.
What is the compiler suppose to do when it sees a template? Generate all the machine code for all possible data types - ints, doubles, float, strings, ... Could take a lot of time. Or just be a little lazy and generate the machine code for what it requires.
I guess the latter option is the better solution and gets the job done.
The main point here is that compiler does not treat a template definition until it meets a certain instance of the template. (Then it can proceed, I guess, like it have a usual class, which is a specific case of the template class, with fixed template parameters.)
The direct answer to your question is: Compiler generates machine code from users c++ code, I think this is wat is meant here by word "generate code".
The template declaration must be in header file because when compiler compiles some source, which use template it HAVE only header file (included in source with #include macro), but it NEED whole template definition. So logical conclusion is that template definition must be in header.
When you create a function and compile it, the compiler generates code for it. Many compilers will not generate code for static functions that are not used.
If you create a templated function and nothing uses the template (such as std::sort), the code for the function will not be generated.
Remember, templates are like stencils. The templates tell how to generate a class or function using the given template parameters. If the stencil is not used, nothing is generated.
Consider also that the compiler doesn't know how to implement or use the template until it sees all the template parameters resolved.
It won't straight away generate code. Only generates the class or template code when it comes across an instantiation of that template. That is, if you are actually creating an object of that template definition.
In essence, templates allow you abstract away from types. If you need two instantiations of the template class for example for an int and a double the compiler will literally create two of these classes for you when you need them. That is what makes templates so powerful.
Your C++ is read by the compiler and turned into assembly code, before being turned in machine code.
Templates are designed to allow generic programming. If your code doesn't use your template at all, the compiler won't generate the assembly code associated. The more data types you associate your template with in your program, the more assembly code it will generate.

Elegant way to move templated method definitions out of header file

In one of my C++ projects, I make use of a many templates. Since I cannot put them in *.cpp files, they whole functions live in the headers at the moment.
But that messes up header files on the one hand, and leads to long compile times on the other hand. How can I handle implementations of templated functions in a clean way?
There isn't really a requirement that templates have to be in the header. They can very well be in a translation unit. The only requirement is that compiler is either able to instantiate them implicitly when they are used or that they get explicitly instantiated.
Whether separating the templatized code into headers and non-headers based on that is feasible depends pretty much on what is being done. It works rather well, e.g., for the IOStreams library because it is, in practice, only instantiated for the character types char and wchar_t. Writing the explicit instantiations is fairly straight forward and even if there are couple of more character types, e.g., char16_t and char32_t, it stays feasible. On the other hand, separating templates like std::vector<T> in a similar way is rather infeasible.
Using general templates like std::vector<T> in interfaces between subsystems quickly starts to become a major problem: while concrete instantiations or selected instantiations are OK, as the subsystem can be implemented without being a template, using arbitrary instantiations would force an entire system to be all templates. Doing so is infeasible in any real-world application which are often a couple of million lines of code on the small end.
What this amounts to is to use compilation-firewalls which are fully typed and don't use arbitrary templates between subsystems. To ease the use of the subsystem interfaces there may be thin template wrappers which, e.g., convert one container type into another container or which type-erase the template parameter where feasible. It needs to be recognized, however, that the compilation separation generally comes at a run-time performance cost: calling a virtual function is a lot more expensive than calling an inline function. Thus, the abstractions between subsystems may be very different from those within subsystems. For example, iterators are great internal abstractions. Between subsystems, a specific container, e.g., a std::vector<X> for some type X, tends to be more effective.
Note that the build-time interactions of templates are inherent in their dependency on specific instantiations. That is, even if there is a rather different system for declarations and template definitions, e.g., in the form of a module system rather than using header files, it won't become feasible to make everything a template. Using templates for local flexibility works great but they don't work without fixing the instantiations globally in large projects.
Finally a plug: here is a write-up on how to organize sources implementing templates.
You can just create a new header file library_detail.hpp and include it in your original header.
I've seen some people naming this implementation-header file using .t or .template extensions as well.
One thing that can help is to factor out functionality that doesn't depend on the template arguments into non templated helper functions that can be defined in an implementation file.
As an example, if you were implementing your own vector class you could factor out most of the memory management into a non template class that just works with an untyped array of bytes and have the definitions of its member functions in a .cpp file. The functions that actually need to know the type of the elements are implemented in the templated class in a header and delegate much of the work to the non-templated helpers.

About C++ template and explicit declaration

I just spent about an 20 minutes trying to figure out why some template methods of mine passed compilation but not linkage.
Turns out I needed to explicitly declare my template method.
It was something of this kind :
class Test {
template<class Source> void Save(Source& obj);
};
Then I would use it like this somewhere :
Test t;
ClassDerivedFromInterface obj;
t.Save(obj);
It compiled fine but didn't link. Until I added :
template void Test::Save(ClassDerivedFromInterface);
I would like to understand in which case an explicit declaration is necessary.
Thanks
In a nutshell, you need to have the entire body (the definition) of a template function visible to the translation unit that instantiates the template. So when you say t.Save(obj);, that translation unit should have access to the definition of Save. Usually you achieve this by including the definitions of function templates in the header file itself.
The reason for this is that templates aren't ordinary code that gets compiled and can later be linked at will. Rather, templates are a code generation tool that generates the necessary code on demand - an automatic version of copy/paste followed by search-and-replace, if you will.
Therefore, the actual compilable code for your function Save(ClassDerivedFromInterface&) doesn't come into existence until you write that line. If only the declaration of the function template is visible, then the template only produces the declaration of the concrete function, but not its body, and so at link time you notice that the function is missing.
To recap, templates themselves cannot be compiled, it is only their concrete instances that can, and you have to pay attention to ensure that the concrete instances are always available when you instantiate them. Explicit instantiation as you have it works and allows you to package a few specific instances into a separate TU, but generally that's hard to maintain and not scalable, and there are other drawbacks to explicit instantiation that you avoid when you let the compiler instantiate implicitly. So usually it's best to package your entire definitions into the header file.
You will need to explicitly declare the template if the template source is not visible at the time of compilation. This link covers it pretty well, also an awesome site in general:
C++ FAQ 35.13
You need explicit declaration when your template definitions are not accesible from the code that is using them, consider the following:
template.h - template declarations
template.cpp - template definitions
main.cpp - template usage
template.cpp is not included in main.cpp and thus unreachable by the template user so you need explicit declarations.
But if your structure is:
template.h - template declarations and definitions
main.cpp - template usage
the template declarations are reachable by the template user so you don't need explicit declarations.
The compiler needs to know what types you're going to be using with your template. If you create a template class and then use it with, for example, int, char, and double, then the compiler will create methods for the template for those types. If you compile the template method in a separate compilation unit from where you use it, the compiler will not instantiate your template for the type you need. But if you explicitly instantiate the template, the compiler will create whatever you tell it to.

implicit template method instantion

I have a class containing a template method with a non-type template parameter. The code size got really big, so I tried avoiding the inlining by putting it into the .cpp file. But I can only manage to instantiate it explictly for each non-type parameter.
Is an implicit instantiation possible? What would it look like? In an other related question this link http://www.parashift.com/c++-faq-lite/templates.html is provided but I can't find a solution for implicit instantiation (if there is something like this)...
class Example
{
public:
template<enumExample T_ENUM> void Foo(void);
};
I get linker errors for Foo (unresolved external symbol) when using it.
Your problem is that the template code needs to be visible at the point at which it is instantiated. See C++ FAQ 35.13
Which basically means you can't do what you are trying to. There is an export keyword that makes this possible, but it is very poorly supported and I believe has been removed from the standard in C++0x. See C++ FAQ 35.14 for more information.
For implicit instantiation, the compiler needs to see the implementation of the function template. Usually this means that the implementation needs to be in a header file. If you just want to avoid inlining, you could try writing the implementation of the function template in the header, but outside of the class declaration (although I'm not sure inlining is your real problem).
To reduce code size you can try to reduce the dependencies by implementing, when appropriate, the pimpl idiom.

Why do C++ template definitions need to be in the header? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Why should the implementation and the declaration of a template class be in the same header file?
e.g when defining a template class why do the implementations of the class methods need to be in the header? Why can't they be in a implementation file (cpp/cxx)?
A template class is not a class, it's a template that can be used to create a class. When you instantiate such a class, e.g. MyTemplate<int>, the compiler creates the class on the spot. In order to create it, it has to see all the templated member functions (so that it can use the templates to create actual member functions such as MyTemplate<int>::foo() ), and therefore these templated member functions must be in the header.
If the members are not in the header, the compiler will simply assume that they exist somewhere else and just create actual function declarations from the templated function declarations, and this gives you linker errors.
The "export" keyword is supposed to fix this, but few compilers support it (I only know of Comeau).
You can also explicitly instantiate MyTemplate<int> - then the compiler will create actual member functions for MyTemplate<int> when it compiles the cpp files containing the MyTemplate member function definition templates.
They need to be visible for the compiler when they are instantiated. That basically means that if you are publishing the template in a header, the definitions have to be visible by all translation units that include that header if you depend on implicit instantiation.
They need not be defined in the header if you are going to explicitly instantiate the templates, but this is in most cases not a good idea.
As to the reason, it basically boils down to the fact that templates are not compiled when the compiler parses the definition, but rather when they are instantiated, and then they are compiled for the particular instantiation type.
If your compiler supports export, then it doesn't. Only EDG-based compilers support export, and it's going to be removed from C++0x because of that.
Non-exported templates require that the compiler can see the full template definition, in order to instantiate it for the particular types you supply as arguments. For example:
template<typename T>
struct X {
T t;
X(int i): t(i) {}
};
Now, when you write X<float>(5) in some translation unit, the compiler as part of compiling that translation unit must check that the constructor of X is type-correct, generate the code for it, and so on. Hence it must see the definition of X, so that it can permit X<float>(5) but forbid X<char*>(5).
The only sensible way to ensure that the compiler sees the same template definition in all translation units that use it, is to put the definition in a header file. As far as the standard is concerned, though, you're welcome to copy-and-paste it manually, or to define a template in a cpp file that is used only in that one translation unit.
export in effect tells the compiler that it must output a parsed form of the template definition into a special kind of object file. Then the linker performs template instantiation. With normal toolchains, the compiler is smart enough to perform template instantiation and the linker isn't. Bear in mind that template instantiation has to do pretty much everything that the compiler does beyond basic parsing.
They can be in a CPP file.
The problem arises from the fact that the compiler builds the code for a specific instantiation of a template class (eg std::vector< int >) on a per translation unit basis. The problem with defining the functions in a CPP file is that you will need to define every possible form in that CPP file (this is called template specialization).
So for that int vector exampled above you could define a function in a CPP file for the int case using specialization.
e.g
template<> void std::vector< int >::push_back( int& intVal )
Of course doing this can produce the advantage of optimisation for specific cases but it does give you an idea of just how much code bloat can be introduced by STL! At least all the functions aren't defined as inline as a certain compiler used to do ;)
That aspect of template is called the compilation model, not to be confused with the instantiation mechanism which was the subject of How does C++ link template instances.
The instantiation mechanism is the answer to the question "When is the instantiation generated?", the instantiation model is the answer to "Where the source are found?"
There are two standards compilation model:
inclusion, the one that you know, where the definition must be available,
separated, which allows to put the definition somewhere else with the help of the keyword export. That one has been removed from the standard and won't be available in C++0X. One of the raison for removal was that it wasn't widely implemented (only one implementation).
See C++ Templates, The Complete Guide by David Vandevoorde and Nicolai Josuttis or http://www.bourguet.org/v2/cpplang/export.pdf for more information, the separated compilation model being the subject of that later paper.