When will implicit instantiation cause problems? - c++

I'm reading C++ Primer, and it says:
"If a member function isn't used, it is not instantiated. The fact that members are instantiated only if we use them lets us instantiate a class with a type that may not meet the requirements for some of the template’s operations."
I don't know why this is a problem. If some operations are required, why doesn't compiler instantiate those operations? Can someone give an example?

That's an ease-of-use feature, not a pitfall.
Lazy instantiation serves to simplify templates. You can implement the set of all possible member functions that any specialization might have, even if some of the functions don't work for some specializations.
It also reduces compile time, since the compiler never needs to instantiate what you don't use.
To prevent lazy instantiation, use explicit instantiation:
template class my_template< some_arg >;
This will immediately instantiate all the members of the class (except members which are templates, inherited, or not yet defined). For templates that are slow to compile, you can do the above in one source file (translation unit) and then use the linker to bypass instantiation in other source files, by putting a declaration in the header:
extern template class my_template< some_arg >;

Related

Class template can be instantiated without members?

The Wikipedia article says this:
instantiating a class template does not cause its member definitions to be instantiated.
I can't imagine any class in C++ being instantiated, whether from a template or not, where that classes members were not also instantiated?
Many early C++ compilers instantiated all member functions, whether you ever called them or not.
Consider, for example, std::list, which has a sort member function. With a current, properly functioning compiler, you can instantiate list over a type that doesn't support comparison. If you try to use list::sort, it will fail, because you don't support comparison. As long as you don't call sort for that list, it's all fine though, because list<T>::sort won't be instantiated unless you call it.
With those older, poorly functioning compilers, however, trying to create list<T> meant that list<T>::sort was instantiated even though you never used it. The existence of list::sort meant that you needed to implement < for T, just to create a list<T>, even if you never actually used sort on a list of that type at all.
The standard clearly says that (both non-template and template) member methods instantiation should happen only when used.
An excerpt from C++ standard (N3690 - 14.7.1(2) Implicit instantiation)
2 Unless a member of a class template or a member template has been explicitly instantiated or explicitly specialized, the specialization of the member is implicitly instantiated when the specialization is referenced in a context that requires the member definition to exist; in particular, the initialization (and any associated side-effects) of a static data member does not occur unless the static data member is itself used in a way that requires the definition of the static data member to exist.
Methods of a class are also members. Class Template methods are instantiated when they are called for that instantiated class. So it is possible that those member methods are never instantiated.

How to use extern template

I've been looking through the N3291 working draft of C++0x. And I was curious about extern template. Section 14.7.3 states:
Except for inline functions and class template specializations, explicit instantiation declarations have the effect of suppressing the implicit instantiation of the entity to which they refer.
FYI: the term "explicit instantiation declaration" is standard-speak for extern template. That was defined back in section 14.7.2.
This sounds like it's saying that if you use extern template std::vector<int>, then doing any of the things that would normally implicitly instantiate std::vector<int> will not do so.
The next paragraph is more interesting:
If an entity is the subject of both an explicit instantiation declaration and an explicit instantiation definition in the same translation unit, the definition shall follow the declaration. An entity that is the subject of an explicit instantiation declaration and that is also used in a way that would otherwise cause an implicit instantiation (14.7.1) in the translation unit shall be the subject of an explicit instantiation definition somewhere in the program; otherwise the program is ill-formed, no diagnostic required.
FYI: the term "explicit instantiation definition" is standard speak for these things: template std::vector<int>. That is, without the extern.
To me, these two things say that extern template prevents implicit instantiation, but it does not prevent explicit instantiation. So if you do this:
extern template std::vector<int>;
template std::vector<int>;
The second line effectively negates the first by explicitly doing what the first line prevented from happening implicitly.
The problem is this: Visual Studio 2008 doesn't seem to agree. The way I want to use extern template is to prevent users from implicitly instantiating certain commonly-used templates, so that I can explicitly instantiate them in the .cpp files to cut down on compile time. The templates would only be instantiated once.
The problem is that I have to basically #ifdef around them in VS2008. Because if a single translation unit sees the extern and non-extern version, it will make the extern version win and nobody would ever instantiate it. And then come the linker errors.
So, my questions are:
What is the correct behavior according to C++0x? Should extern template prevent explicit instantiation or not?
If the answer to the previous question is that it should not, then VS2008 is in error (granted, it was written well before the spec, so it's not like it's their fault). How does VS2010 handle this? Does it implement the correct extern template behavior?
It says
Except for ...class template specializations
So it does not apply to std::vector<int>, but to its members (members that aren't inline member functions and presumably that aren't nested classes. Unfortunately, there isn't a one term that catches both of "class template specialization and specializations of member classes of class templates". So there are some places that use only the former but mean to also include the latter). So std::vector<int> and its nested classes (like std::vector<int>::iterator, if it is defined as a nested class) will still be implicitly instantiated if needed.

Why is this C++ explicit template specialization code illegal?

(Note: I know how it is illegal, I'm looking for the reason that the language make it so.)
template<class c> void Foo(); // Note: no generic version, here or anywhere.
int main(){
Foo<int>();
return 0;
}
template<> void Foo<int>();
Error:
error: explicit specialization of 'Foo<int>' after instantiation
A quick pass with Google found this citation of the spec but that offers only the what and not the why.
Edit:
Several responses have forwarded the argument (e.g. confirmed my speculation) that the rule is this way because to do otherwise would violate the One Definition Rule (ODR). However, this is a very weak argument because it doesn't hold, in this case, for two reaons:
Moving the explicit specialization to another translation unit solves the problem and doesn't seem to violate the ODR (or so the linker says).
The short form of the ODR (as applied to functions) is that you can't have more than one body for any given function, and I don't. The only place the body of the function is ever defined is in the explicit specialization, so the call to Foo<int> can't define a generic specialization of the template because there is no generic body to be specialized.
Speculation on the matter:
A guess as to why the rule exist at all: if the first line offered a definition (as opposed to a declaration), an explicit specialization after an instantiation would be a problem because you would get multiple definitions. But in this case, the only definition in sight is the explicit specialization.
Oddly the following (or something like it in the real code I'm working on) works:
File A:
template<class c> void Foo();
int main(){
Foo<int>();
return 0;
}
File B:
template<class c> void Foo();
template<> void Foo<int>();
But to use that in general starts to create a spaghetti imports structure.
A guess as to why the rule exist at
all: if the first line offered a
definition (as opposed to a
declaration), an explicit
specialization after an instantiation
would be a problem because you would
get multiple definitions. But in this
case, the only definition in sight is
the explicit specialization.
But you do have multiple definitions. You have already defined Foo< int > when you instantiate it and after that you try to specialize the template function for int, which is already defined.
int main(){
Foo<int>(); // Define Foo<int>();
return 0;
}
template<> void Foo<int>(); // Trying to specialize already defined Foo<int>
This code is illegal because explicit specialization appears after the instantiation. Basically, the compiler first saw the generic template, then it saw its instantiation and specialized that generic template with a specific type. After that it saw a specific implementation of the generic template that was already instantiated. So what is the compiler supposed to do? Go back and re-compile the code? That's why it is not allowed to do that.
You have to think of an explicit specialization as a function declaration. Just like if you had two overloaded functions (non-templated), if only one declaration can be found before you try to make a call to the second version, the compiler is going to say that it cannot find the required overloaded version. The difference with templates is that the compiler can generate that specialization based on the general function template. So, why is it forbidden to do this? Because the full template specialization violates the ODR when it is seen, since, by that time, there already exists a template specialization for the same type. When a template is instantiated (implicitly or not), the corresponding template specialization is also created, such that later use (in the same translation unit) of the same specialization will be able to reuse the instantiation and not duplicate the template code for every instantiations. Obviously, the ODR applies just as well to template specializations as it applies elsewhere.
So, when the quoted text says "no diagnostic is required", it just means that the compiler is not required to provide you with the insightful remark that the problem is due to an instantiation of the template occurring sometime before the explicit specialization. But, if it doesn't do that, the other option is to give the standard ODR violation error, i.e., "multiple definitions of 'Foo' specialization for [T = int]" or something like that, which wouldn't be as helpful as the more clever diagnostic.
RESPONSE TO EDIT
1) Although the saying goes that all template function definitions (i.e. implementation) must be visible at the point of instantiation (such that the compiler can substitute the template argument(s) and instantiate the function template). However, implicit instantiation of the function template only requires that the declaration of the function be available. So, in your case, splitting it into two translation units works, because it does not violate ODR (since in that TU, there is only one declaration of Foo<int>), the declaration if Foo<int> is available at the implicit instantiation point (through Foo<T>), and the definition of Foo<int> is available to the linker within TU B. So, no one has argued that this second example is "not supposed to work", it works as it is supposed to. Your question is about a rule that applies within one translation unit, don't counter the arguments by saying that the error doesn't occur when you split it into two TUs (especially when it clearly should work fine in two TUs, according to the rules).
2) In your first example, either there will be an error because the compiler cannot find the general function template (the non-specialized implementation) and thus cannot instantiate Foo<int> from the general template. Or, the compiler will find a definition for the general template, use it to instantiate Foo<int>, and then throw an error because a second template specialization Foo<int> is encountered. You seem to think that the compiler will find your specialization before it gets to it, it doesn't. C++ compilers compile the code from top to bottom, they don't go back and forth to substitute stuff here and there. When the compiler gets to the first use of Foo<int>, it sees only the general template at that point, assumes there will be an implementation of that general template that can be used to instantiate Foo<int>, it is not expecting a specialized implementation for Foo<int>, it is expecting and will use the general one. Then, it sees the specialization and throws the error, because it already had made its mind that the general version was to be used, so, it does see two distinct definitions for the same function, and yes, it does violate ODR. It's as simple as that.
WHY OH WHY!!!
The 2 TU case has to work because you should be able to share template instantiations between TUs, that's a feature of C++, and a useful one (in case when you have a small number of possible instantiations, you can pre-compile them).
The 1 TU case cannot be allowed because declaring something in C++ tells the compiler "there is this thing defined somewhere". You tell the compiler "there is a general definition of the template somewhere", then say "I want to use the general definition to make the function Foo<int>", and finally, you say "whenever Foo<int> is called, it should use this special definition". That's a flat out contradiction! That's why the ODR exists and applies to this context, to forbid such contradictions. It doesn't matter whether the general definition "to-be-found" is not present, the compiler expects it, and it has to assume that it does exist and that it is different from the specialization. It cannot go on and say "ok, so, I'll look everywhere else in the code for the general definition, and if I cannot find it, then I will come back and 'approve' this specialization to be used instead of the general definition, but if I find it I will flag this specialization as an error". Nor can it go on and flatly ignore the desire of the programmer by changing code that clear shows intent to use the general template (since the specialization is not declared yet), for code that uses a specialization that appears later. I can't possibly explain the "why" any more clearly than that.
The 2 TU case is completely different. When the compiler is compiling TU A (that uses Foo<int>), it will look for the general definition, fail to find it, assume that it will be linked-in later as Foo<int>, and leaves a symbol placeholder. Then, since the linker will not look for templates (templates are NOT exportable, in practice), it will look for a function that implements Foo<int>, and it doesn't care whether it is a specialized version or not. The linker is happy as long as it finds the same symbol to link to. This is so, because it would be a nightmare to do it otherwise (look up discussions on "exported templates") for both the programmers (not being able to easily change functions in their compiled libraries) and for the compiler vendors (having to implement this linking crazy scheme).

Why do C++ template definitions need to be in the header? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Why should the implementation and the declaration of a template class be in the same header file?
e.g when defining a template class why do the implementations of the class methods need to be in the header? Why can't they be in a implementation file (cpp/cxx)?
A template class is not a class, it's a template that can be used to create a class. When you instantiate such a class, e.g. MyTemplate<int>, the compiler creates the class on the spot. In order to create it, it has to see all the templated member functions (so that it can use the templates to create actual member functions such as MyTemplate<int>::foo() ), and therefore these templated member functions must be in the header.
If the members are not in the header, the compiler will simply assume that they exist somewhere else and just create actual function declarations from the templated function declarations, and this gives you linker errors.
The "export" keyword is supposed to fix this, but few compilers support it (I only know of Comeau).
You can also explicitly instantiate MyTemplate<int> - then the compiler will create actual member functions for MyTemplate<int> when it compiles the cpp files containing the MyTemplate member function definition templates.
They need to be visible for the compiler when they are instantiated. That basically means that if you are publishing the template in a header, the definitions have to be visible by all translation units that include that header if you depend on implicit instantiation.
They need not be defined in the header if you are going to explicitly instantiate the templates, but this is in most cases not a good idea.
As to the reason, it basically boils down to the fact that templates are not compiled when the compiler parses the definition, but rather when they are instantiated, and then they are compiled for the particular instantiation type.
If your compiler supports export, then it doesn't. Only EDG-based compilers support export, and it's going to be removed from C++0x because of that.
Non-exported templates require that the compiler can see the full template definition, in order to instantiate it for the particular types you supply as arguments. For example:
template<typename T>
struct X {
T t;
X(int i): t(i) {}
};
Now, when you write X<float>(5) in some translation unit, the compiler as part of compiling that translation unit must check that the constructor of X is type-correct, generate the code for it, and so on. Hence it must see the definition of X, so that it can permit X<float>(5) but forbid X<char*>(5).
The only sensible way to ensure that the compiler sees the same template definition in all translation units that use it, is to put the definition in a header file. As far as the standard is concerned, though, you're welcome to copy-and-paste it manually, or to define a template in a cpp file that is used only in that one translation unit.
export in effect tells the compiler that it must output a parsed form of the template definition into a special kind of object file. Then the linker performs template instantiation. With normal toolchains, the compiler is smart enough to perform template instantiation and the linker isn't. Bear in mind that template instantiation has to do pretty much everything that the compiler does beyond basic parsing.
They can be in a CPP file.
The problem arises from the fact that the compiler builds the code for a specific instantiation of a template class (eg std::vector< int >) on a per translation unit basis. The problem with defining the functions in a CPP file is that you will need to define every possible form in that CPP file (this is called template specialization).
So for that int vector exampled above you could define a function in a CPP file for the int case using specialization.
e.g
template<> void std::vector< int >::push_back( int& intVal )
Of course doing this can produce the advantage of optimisation for specific cases but it does give you an idea of just how much code bloat can be introduced by STL! At least all the functions aren't defined as inline as a certain compiler used to do ;)
That aspect of template is called the compilation model, not to be confused with the instantiation mechanism which was the subject of How does C++ link template instances.
The instantiation mechanism is the answer to the question "When is the instantiation generated?", the instantiation model is the answer to "Where the source are found?"
There are two standards compilation model:
inclusion, the one that you know, where the definition must be available,
separated, which allows to put the definition somewhere else with the help of the keyword export. That one has been removed from the standard and won't be available in C++0X. One of the raison for removal was that it wasn't widely implemented (only one implementation).
See C++ Templates, The Complete Guide by David Vandevoorde and Nicolai Josuttis or http://www.bourguet.org/v2/cpplang/export.pdf for more information, the separated compilation model being the subject of that later paper.

Undefined template methods trick?

A colleague of mine told me about a little piece of design he has used with his team that sent my mind boiling. It's a kind of traits class that they can specialize in an extremely decoupled way.
I've had a hard time understanding how it could possibly work, and I am still unsure of the idea I have, so I thought I would ask for help here.
We are talking g++ here, specifically the versions 3.4.2 and 4.3.2 (it seems to work with both).
The idea is quite simple:
1- Define the interface
// interface.h
template <class T>
struct Interface
{
void foo(); // the method is not implemented, it could not work if it was
};
//
// I do not think it is necessary
// but they prefer free-standing methods with templates
// because of the automatic argument deduction
//
template <class T>
void foo(Interface<T>& interface) { interface.foo(); }
2- Define a class, and in the source file specialize the interface for this class (defining its methods)
// special.h
class Special {};
// special.cpp
#include "interface.h"
#include "special.h"
//
// Note that this specialization is not visible outside of this translation unit
//
template <>
struct Interface<Special>
{
void foo() { std::cout << "Special" << std::endl; }
};
3- To use, it's simple too:
// main.cpp
#include "interface.h"
class Special; // yes, it only costs a forward declaration
// which helps much in term of dependencies
int main(int argc, char* argv[])
{
Interface<Special> special;
foo(special);
return 0;
};
It's an undefined symbol if no translation unit defined a specialization of Interface for Special.
Now, I would have thought this would require the export keyword, which to my knowledge has never been implemented in g++ (and only implemented once in a C++ compiler, with its authors advising anyone not to, given the time and effort it took them).
I suspect it's got something to do with the linker resolving the templates methods...
Do you have ever met anything like this before ?
Does it conform to the standard or do you think it's a fortunate coincidence it works ?
I must admit I am quite puzzled by the construct...
Like #Steward suspected, it's not valid. Formally it's effectively causing undefined behavior, because the Standard rules that for a violation no diagnostic is required, which means the implementation can silently do anything it wants. At 14.7.3/6
If a template, a member template or the member of a class template is explicitly specialized then that specialization shall be declared before the first use of that specialization that would cause an implicit instantiation to take place, in every translation unit in which such a use occurs; no diagnostic is required.
In practice at least on GCC, it's implicitly instantiating the primary template Interface<T> since the specialization wasn't declared and is not visible in main, and then calling Interface<T>::foo. If its definition is visible, it instatiates the primary definition of the member function (which is why when it is defined, it wouldn't work).
Instantiated function name symbols have weak linkage because they could possibly be present multiple times in different object files, and have to be merged into one symbol in the final program. Contrary, members of explicit specializations that aren't templates anymore have strong linkage so they will dominate weak linkage symbols and make the call end up in the specialization. All this is implementation detail, and the Standard has no such notion of weak/strong linkage. You have to declare the specialization prior to creating the special object:
template <>
struct Interface<Special>;
The Standard lays it bare (emphasize by me)
The placement of explicit specialization declarations for function templates, class templates, member functions of class templates, static data members of class templates, member classes of class templates, member class templates of class templates, member function templates of class templates, member functions of member templates of class templates, member functions of member templates of non-template classes, member function templates of member classes of class templates, etc., and the placement of partial specialization declarations of class templates, member class templates of non-template classes, member class templates of class templates, etc., can affect whether a program is well-formed according to the relative positioning of the explicit specialization declarations and their points of instantiation in the translation unit as specified above and below. When writing a specialization, be careful about its location; or to make it compile will be such a trial as to kindle its self-immolation.
Thats pretty neat. I'm not sure if it is guaranteed to work everywhere though. It looks like what they're doing is having a deliberately undefined template method, and then defining a specialization tucked away in its own translation unit. They're depending on the compiler using the same name mangling for both the original class template method and the specialization, which is the bit I think is probably non-standard. The linker will then look for the method of the class template, but instead find the specialization.
There are a few risks with this though. No one, not even the linker, will pick up multiple implementations of the method for example. The template methods will be marked selectany because template implies inline so if the linker sees multiple instances, instead of issuing an error it will pick whichever one is most convenient.
Still a nice trick though, although unfortunately it does seem to be a lucky coincidence that it works.