Why is this C++ explicit template specialization code illegal? - c++

(Note: I know how it is illegal, I'm looking for the reason that the language make it so.)
template<class c> void Foo(); // Note: no generic version, here or anywhere.
int main(){
Foo<int>();
return 0;
}
template<> void Foo<int>();
Error:
error: explicit specialization of 'Foo<int>' after instantiation
A quick pass with Google found this citation of the spec but that offers only the what and not the why.
Edit:
Several responses have forwarded the argument (e.g. confirmed my speculation) that the rule is this way because to do otherwise would violate the One Definition Rule (ODR). However, this is a very weak argument because it doesn't hold, in this case, for two reaons:
Moving the explicit specialization to another translation unit solves the problem and doesn't seem to violate the ODR (or so the linker says).
The short form of the ODR (as applied to functions) is that you can't have more than one body for any given function, and I don't. The only place the body of the function is ever defined is in the explicit specialization, so the call to Foo<int> can't define a generic specialization of the template because there is no generic body to be specialized.
Speculation on the matter:
A guess as to why the rule exist at all: if the first line offered a definition (as opposed to a declaration), an explicit specialization after an instantiation would be a problem because you would get multiple definitions. But in this case, the only definition in sight is the explicit specialization.
Oddly the following (or something like it in the real code I'm working on) works:
File A:
template<class c> void Foo();
int main(){
Foo<int>();
return 0;
}
File B:
template<class c> void Foo();
template<> void Foo<int>();
But to use that in general starts to create a spaghetti imports structure.

A guess as to why the rule exist at
all: if the first line offered a
definition (as opposed to a
declaration), an explicit
specialization after an instantiation
would be a problem because you would
get multiple definitions. But in this
case, the only definition in sight is
the explicit specialization.
But you do have multiple definitions. You have already defined Foo< int > when you instantiate it and after that you try to specialize the template function for int, which is already defined.
int main(){
Foo<int>(); // Define Foo<int>();
return 0;
}
template<> void Foo<int>(); // Trying to specialize already defined Foo<int>

This code is illegal because explicit specialization appears after the instantiation. Basically, the compiler first saw the generic template, then it saw its instantiation and specialized that generic template with a specific type. After that it saw a specific implementation of the generic template that was already instantiated. So what is the compiler supposed to do? Go back and re-compile the code? That's why it is not allowed to do that.

You have to think of an explicit specialization as a function declaration. Just like if you had two overloaded functions (non-templated), if only one declaration can be found before you try to make a call to the second version, the compiler is going to say that it cannot find the required overloaded version. The difference with templates is that the compiler can generate that specialization based on the general function template. So, why is it forbidden to do this? Because the full template specialization violates the ODR when it is seen, since, by that time, there already exists a template specialization for the same type. When a template is instantiated (implicitly or not), the corresponding template specialization is also created, such that later use (in the same translation unit) of the same specialization will be able to reuse the instantiation and not duplicate the template code for every instantiations. Obviously, the ODR applies just as well to template specializations as it applies elsewhere.
So, when the quoted text says "no diagnostic is required", it just means that the compiler is not required to provide you with the insightful remark that the problem is due to an instantiation of the template occurring sometime before the explicit specialization. But, if it doesn't do that, the other option is to give the standard ODR violation error, i.e., "multiple definitions of 'Foo' specialization for [T = int]" or something like that, which wouldn't be as helpful as the more clever diagnostic.
RESPONSE TO EDIT
1) Although the saying goes that all template function definitions (i.e. implementation) must be visible at the point of instantiation (such that the compiler can substitute the template argument(s) and instantiate the function template). However, implicit instantiation of the function template only requires that the declaration of the function be available. So, in your case, splitting it into two translation units works, because it does not violate ODR (since in that TU, there is only one declaration of Foo<int>), the declaration if Foo<int> is available at the implicit instantiation point (through Foo<T>), and the definition of Foo<int> is available to the linker within TU B. So, no one has argued that this second example is "not supposed to work", it works as it is supposed to. Your question is about a rule that applies within one translation unit, don't counter the arguments by saying that the error doesn't occur when you split it into two TUs (especially when it clearly should work fine in two TUs, according to the rules).
2) In your first example, either there will be an error because the compiler cannot find the general function template (the non-specialized implementation) and thus cannot instantiate Foo<int> from the general template. Or, the compiler will find a definition for the general template, use it to instantiate Foo<int>, and then throw an error because a second template specialization Foo<int> is encountered. You seem to think that the compiler will find your specialization before it gets to it, it doesn't. C++ compilers compile the code from top to bottom, they don't go back and forth to substitute stuff here and there. When the compiler gets to the first use of Foo<int>, it sees only the general template at that point, assumes there will be an implementation of that general template that can be used to instantiate Foo<int>, it is not expecting a specialized implementation for Foo<int>, it is expecting and will use the general one. Then, it sees the specialization and throws the error, because it already had made its mind that the general version was to be used, so, it does see two distinct definitions for the same function, and yes, it does violate ODR. It's as simple as that.
WHY OH WHY!!!
The 2 TU case has to work because you should be able to share template instantiations between TUs, that's a feature of C++, and a useful one (in case when you have a small number of possible instantiations, you can pre-compile them).
The 1 TU case cannot be allowed because declaring something in C++ tells the compiler "there is this thing defined somewhere". You tell the compiler "there is a general definition of the template somewhere", then say "I want to use the general definition to make the function Foo<int>", and finally, you say "whenever Foo<int> is called, it should use this special definition". That's a flat out contradiction! That's why the ODR exists and applies to this context, to forbid such contradictions. It doesn't matter whether the general definition "to-be-found" is not present, the compiler expects it, and it has to assume that it does exist and that it is different from the specialization. It cannot go on and say "ok, so, I'll look everywhere else in the code for the general definition, and if I cannot find it, then I will come back and 'approve' this specialization to be used instead of the general definition, but if I find it I will flag this specialization as an error". Nor can it go on and flatly ignore the desire of the programmer by changing code that clear shows intent to use the general template (since the specialization is not declared yet), for code that uses a specialization that appears later. I can't possibly explain the "why" any more clearly than that.
The 2 TU case is completely different. When the compiler is compiling TU A (that uses Foo<int>), it will look for the general definition, fail to find it, assume that it will be linked-in later as Foo<int>, and leaves a symbol placeholder. Then, since the linker will not look for templates (templates are NOT exportable, in practice), it will look for a function that implements Foo<int>, and it doesn't care whether it is a specialized version or not. The linker is happy as long as it finds the same symbol to link to. This is so, because it would be a nightmare to do it otherwise (look up discussions on "exported templates") for both the programmers (not being able to easily change functions in their compiled libraries) and for the compiler vendors (having to implement this linking crazy scheme).

Related

What does the compiler do when you make a function template? [duplicate]

I'm reading a book about how templates work, and I'm having difficulty understanding this explanation of templates.
It says
When the compiler sees the definition of a template, it does not generate code. It generates code only when we instantiate a specific instance of the template. The fact that code is generated only when we use a template (and not when we define it) affects how we organize our source code and when errors are detected...To generate an instantiation, the compiler needs to have the code that defines a function template or class template member function. As a result, unlike non-template code, headers for templates typically include definitions as well as declarations.
What exactly does it mean by "generate code"? I don't understand what is different when you compile function templates or class templates compared to regular functions or classes.
The compiler generates the code for the specific types given in the template class instantiation.
If you have for instance a template class declaration as
template<typename T>
class Foo
{
public:
T& bar()
{
return subject;
}
private:
T subject;
};
as soon you have for example the following instantiations
Foo<int> fooInt;
Foo<double> fooDouble;
these will effectively generate the same linkable code as you would have defined classes like
class FooInt
{
public:
int& bar()
{
return subject;
}
private:
int subject;
}
and
class FooDouble
{
public:
double& bar()
{
return subject;
}
private:
double subject;
}
and instantiate the variables like
FooInt fooInt;
FooDouble fooDouble;
Regarding the point that template definitions (don't confuse with declarations regardless of templates) need to be seen with the header (included) files, it's pretty clear why:
The compiler can't generate this code without seeing the definition. It can refer to a matching instantiation that appeared first at linking stage though.
What does a non-template member function have that allows for it to
be defined outside of the header that a template function doesn't
have?
The declaration of a non-template class/member/function gives a predefined entry point for the linker. The definition can be drawn from a single implementation seen in a compiled object file (== .cpp == compilation unit).
In contrast the declaration of a templated class/member/function might be instantiated from arbitrary compilation units given the same or varying template parameters. The definition for these template parameters need's to be seen at least once. It can be either generic or specialized.
Note that you can specialize template implementations for particular types anyway (included with the header or at a specific compilation unit).
If you would provide a specialization for your template class in one of your compilation units, and don't use your template class with types other than specialized, that also should suffice for linking it all together.
I hope this sample helps clarifying what's the difference and efforts done from the compiler.
A template is a pattern for creating code. When the compiler sees the definition of a template it makes notes about that pattern. When it sees a use of that template it digs out its notes, figures out how to apply the pattern at the point where it's being used, and generates code according to the pattern.
What is the compiler suppose to do when it sees a template? Generate all the machine code for all possible data types - ints, doubles, float, strings, ... Could take a lot of time. Or just be a little lazy and generate the machine code for what it requires.
I guess the latter option is the better solution and gets the job done.
The main point here is that compiler does not treat a template definition until it meets a certain instance of the template. (Then it can proceed, I guess, like it have a usual class, which is a specific case of the template class, with fixed template parameters.)
The direct answer to your question is: Compiler generates machine code from users c++ code, I think this is wat is meant here by word "generate code".
The template declaration must be in header file because when compiler compiles some source, which use template it HAVE only header file (included in source with #include macro), but it NEED whole template definition. So logical conclusion is that template definition must be in header.
When you create a function and compile it, the compiler generates code for it. Many compilers will not generate code for static functions that are not used.
If you create a templated function and nothing uses the template (such as std::sort), the code for the function will not be generated.
Remember, templates are like stencils. The templates tell how to generate a class or function using the given template parameters. If the stencil is not used, nothing is generated.
Consider also that the compiler doesn't know how to implement or use the template until it sees all the template parameters resolved.
It won't straight away generate code. Only generates the class or template code when it comes across an instantiation of that template. That is, if you are actually creating an object of that template definition.
In essence, templates allow you abstract away from types. If you need two instantiations of the template class for example for an int and a double the compiler will literally create two of these classes for you when you need them. That is what makes templates so powerful.
Your C++ is read by the compiler and turned into assembly code, before being turned in machine code.
Templates are designed to allow generic programming. If your code doesn't use your template at all, the compiler won't generate the assembly code associated. The more data types you associate your template with in your program, the more assembly code it will generate.

When will implicit instantiation cause problems?

I'm reading C++ Primer, and it says:
"If a member function isn't used, it is not instantiated. The fact that members are instantiated only if we use them lets us instantiate a class with a type that may not meet the requirements for some of the template’s operations."
I don't know why this is a problem. If some operations are required, why doesn't compiler instantiate those operations? Can someone give an example?
That's an ease-of-use feature, not a pitfall.
Lazy instantiation serves to simplify templates. You can implement the set of all possible member functions that any specialization might have, even if some of the functions don't work for some specializations.
It also reduces compile time, since the compiler never needs to instantiate what you don't use.
To prevent lazy instantiation, use explicit instantiation:
template class my_template< some_arg >;
This will immediately instantiate all the members of the class (except members which are templates, inherited, or not yet defined). For templates that are slow to compile, you can do the above in one source file (translation unit) and then use the linker to bypass instantiation in other source files, by putting a declaration in the header:
extern template class my_template< some_arg >;

How does the compilation of templates work?

I'm reading a book about how templates work, and I'm having difficulty understanding this explanation of templates.
It says
When the compiler sees the definition of a template, it does not generate code. It generates code only when we instantiate a specific instance of the template. The fact that code is generated only when we use a template (and not when we define it) affects how we organize our source code and when errors are detected...To generate an instantiation, the compiler needs to have the code that defines a function template or class template member function. As a result, unlike non-template code, headers for templates typically include definitions as well as declarations.
What exactly does it mean by "generate code"? I don't understand what is different when you compile function templates or class templates compared to regular functions or classes.
The compiler generates the code for the specific types given in the template class instantiation.
If you have for instance a template class declaration as
template<typename T>
class Foo
{
public:
T& bar()
{
return subject;
}
private:
T subject;
};
as soon you have for example the following instantiations
Foo<int> fooInt;
Foo<double> fooDouble;
these will effectively generate the same linkable code as you would have defined classes like
class FooInt
{
public:
int& bar()
{
return subject;
}
private:
int subject;
}
and
class FooDouble
{
public:
double& bar()
{
return subject;
}
private:
double subject;
}
and instantiate the variables like
FooInt fooInt;
FooDouble fooDouble;
Regarding the point that template definitions (don't confuse with declarations regardless of templates) need to be seen with the header (included) files, it's pretty clear why:
The compiler can't generate this code without seeing the definition. It can refer to a matching instantiation that appeared first at linking stage though.
What does a non-template member function have that allows for it to
be defined outside of the header that a template function doesn't
have?
The declaration of a non-template class/member/function gives a predefined entry point for the linker. The definition can be drawn from a single implementation seen in a compiled object file (== .cpp == compilation unit).
In contrast the declaration of a templated class/member/function might be instantiated from arbitrary compilation units given the same or varying template parameters. The definition for these template parameters need's to be seen at least once. It can be either generic or specialized.
Note that you can specialize template implementations for particular types anyway (included with the header or at a specific compilation unit).
If you would provide a specialization for your template class in one of your compilation units, and don't use your template class with types other than specialized, that also should suffice for linking it all together.
I hope this sample helps clarifying what's the difference and efforts done from the compiler.
A template is a pattern for creating code. When the compiler sees the definition of a template it makes notes about that pattern. When it sees a use of that template it digs out its notes, figures out how to apply the pattern at the point where it's being used, and generates code according to the pattern.
What is the compiler suppose to do when it sees a template? Generate all the machine code for all possible data types - ints, doubles, float, strings, ... Could take a lot of time. Or just be a little lazy and generate the machine code for what it requires.
I guess the latter option is the better solution and gets the job done.
The main point here is that compiler does not treat a template definition until it meets a certain instance of the template. (Then it can proceed, I guess, like it have a usual class, which is a specific case of the template class, with fixed template parameters.)
The direct answer to your question is: Compiler generates machine code from users c++ code, I think this is wat is meant here by word "generate code".
The template declaration must be in header file because when compiler compiles some source, which use template it HAVE only header file (included in source with #include macro), but it NEED whole template definition. So logical conclusion is that template definition must be in header.
When you create a function and compile it, the compiler generates code for it. Many compilers will not generate code for static functions that are not used.
If you create a templated function and nothing uses the template (such as std::sort), the code for the function will not be generated.
Remember, templates are like stencils. The templates tell how to generate a class or function using the given template parameters. If the stencil is not used, nothing is generated.
Consider also that the compiler doesn't know how to implement or use the template until it sees all the template parameters resolved.
It won't straight away generate code. Only generates the class or template code when it comes across an instantiation of that template. That is, if you are actually creating an object of that template definition.
In essence, templates allow you abstract away from types. If you need two instantiations of the template class for example for an int and a double the compiler will literally create two of these classes for you when you need them. That is what makes templates so powerful.
Your C++ is read by the compiler and turned into assembly code, before being turned in machine code.
Templates are designed to allow generic programming. If your code doesn't use your template at all, the compiler won't generate the assembly code associated. The more data types you associate your template with in your program, the more assembly code it will generate.

How to use extern template

I've been looking through the N3291 working draft of C++0x. And I was curious about extern template. Section 14.7.3 states:
Except for inline functions and class template specializations, explicit instantiation declarations have the effect of suppressing the implicit instantiation of the entity to which they refer.
FYI: the term "explicit instantiation declaration" is standard-speak for extern template. That was defined back in section 14.7.2.
This sounds like it's saying that if you use extern template std::vector<int>, then doing any of the things that would normally implicitly instantiate std::vector<int> will not do so.
The next paragraph is more interesting:
If an entity is the subject of both an explicit instantiation declaration and an explicit instantiation definition in the same translation unit, the definition shall follow the declaration. An entity that is the subject of an explicit instantiation declaration and that is also used in a way that would otherwise cause an implicit instantiation (14.7.1) in the translation unit shall be the subject of an explicit instantiation definition somewhere in the program; otherwise the program is ill-formed, no diagnostic required.
FYI: the term "explicit instantiation definition" is standard speak for these things: template std::vector<int>. That is, without the extern.
To me, these two things say that extern template prevents implicit instantiation, but it does not prevent explicit instantiation. So if you do this:
extern template std::vector<int>;
template std::vector<int>;
The second line effectively negates the first by explicitly doing what the first line prevented from happening implicitly.
The problem is this: Visual Studio 2008 doesn't seem to agree. The way I want to use extern template is to prevent users from implicitly instantiating certain commonly-used templates, so that I can explicitly instantiate them in the .cpp files to cut down on compile time. The templates would only be instantiated once.
The problem is that I have to basically #ifdef around them in VS2008. Because if a single translation unit sees the extern and non-extern version, it will make the extern version win and nobody would ever instantiate it. And then come the linker errors.
So, my questions are:
What is the correct behavior according to C++0x? Should extern template prevent explicit instantiation or not?
If the answer to the previous question is that it should not, then VS2008 is in error (granted, it was written well before the spec, so it's not like it's their fault). How does VS2010 handle this? Does it implement the correct extern template behavior?
It says
Except for ...class template specializations
So it does not apply to std::vector<int>, but to its members (members that aren't inline member functions and presumably that aren't nested classes. Unfortunately, there isn't a one term that catches both of "class template specialization and specializations of member classes of class templates". So there are some places that use only the former but mean to also include the latter). So std::vector<int> and its nested classes (like std::vector<int>::iterator, if it is defined as a nested class) will still be implicitly instantiated if needed.

Why do C++ template definitions need to be in the header? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Why should the implementation and the declaration of a template class be in the same header file?
e.g when defining a template class why do the implementations of the class methods need to be in the header? Why can't they be in a implementation file (cpp/cxx)?
A template class is not a class, it's a template that can be used to create a class. When you instantiate such a class, e.g. MyTemplate<int>, the compiler creates the class on the spot. In order to create it, it has to see all the templated member functions (so that it can use the templates to create actual member functions such as MyTemplate<int>::foo() ), and therefore these templated member functions must be in the header.
If the members are not in the header, the compiler will simply assume that they exist somewhere else and just create actual function declarations from the templated function declarations, and this gives you linker errors.
The "export" keyword is supposed to fix this, but few compilers support it (I only know of Comeau).
You can also explicitly instantiate MyTemplate<int> - then the compiler will create actual member functions for MyTemplate<int> when it compiles the cpp files containing the MyTemplate member function definition templates.
They need to be visible for the compiler when they are instantiated. That basically means that if you are publishing the template in a header, the definitions have to be visible by all translation units that include that header if you depend on implicit instantiation.
They need not be defined in the header if you are going to explicitly instantiate the templates, but this is in most cases not a good idea.
As to the reason, it basically boils down to the fact that templates are not compiled when the compiler parses the definition, but rather when they are instantiated, and then they are compiled for the particular instantiation type.
If your compiler supports export, then it doesn't. Only EDG-based compilers support export, and it's going to be removed from C++0x because of that.
Non-exported templates require that the compiler can see the full template definition, in order to instantiate it for the particular types you supply as arguments. For example:
template<typename T>
struct X {
T t;
X(int i): t(i) {}
};
Now, when you write X<float>(5) in some translation unit, the compiler as part of compiling that translation unit must check that the constructor of X is type-correct, generate the code for it, and so on. Hence it must see the definition of X, so that it can permit X<float>(5) but forbid X<char*>(5).
The only sensible way to ensure that the compiler sees the same template definition in all translation units that use it, is to put the definition in a header file. As far as the standard is concerned, though, you're welcome to copy-and-paste it manually, or to define a template in a cpp file that is used only in that one translation unit.
export in effect tells the compiler that it must output a parsed form of the template definition into a special kind of object file. Then the linker performs template instantiation. With normal toolchains, the compiler is smart enough to perform template instantiation and the linker isn't. Bear in mind that template instantiation has to do pretty much everything that the compiler does beyond basic parsing.
They can be in a CPP file.
The problem arises from the fact that the compiler builds the code for a specific instantiation of a template class (eg std::vector< int >) on a per translation unit basis. The problem with defining the functions in a CPP file is that you will need to define every possible form in that CPP file (this is called template specialization).
So for that int vector exampled above you could define a function in a CPP file for the int case using specialization.
e.g
template<> void std::vector< int >::push_back( int& intVal )
Of course doing this can produce the advantage of optimisation for specific cases but it does give you an idea of just how much code bloat can be introduced by STL! At least all the functions aren't defined as inline as a certain compiler used to do ;)
That aspect of template is called the compilation model, not to be confused with the instantiation mechanism which was the subject of How does C++ link template instances.
The instantiation mechanism is the answer to the question "When is the instantiation generated?", the instantiation model is the answer to "Where the source are found?"
There are two standards compilation model:
inclusion, the one that you know, where the definition must be available,
separated, which allows to put the definition somewhere else with the help of the keyword export. That one has been removed from the standard and won't be available in C++0X. One of the raison for removal was that it wasn't widely implemented (only one implementation).
See C++ Templates, The Complete Guide by David Vandevoorde and Nicolai Josuttis or http://www.bourguet.org/v2/cpplang/export.pdf for more information, the separated compilation model being the subject of that later paper.