Confusions around explicit template instantiation - c++

Well, I think I just get extremely confused by explicit template instantiation ~>_<~
Could an explicit instantiation declaration exploit an implicit
instantiation definition?
What if both explicit and implicit instantiation definitions exist
in a program? Will they ultimately collapse into a single one?
Does an explicit instantiation declaration have any effect when placed after an
implicit instantiation definition?
Also, see the following code:
#include <iostream>
#include <vector>
std::vector<int> a; // Implicit instantiation definition.
// Explicit instantiation declaration.
extern template class std::vector<int>;
int main() {
std::cout << std::vector<int>().size(); // So what?
}
It causes the link error
/tmp/ccQld7ol.o: In function `_GLOBAL__sub_I_a':
main.cpp:(.text.startup+0x6e): undefined reference to `std::vector<int, std::allocator<int> >::~vector()'
collect2: error: ld returned 1 exit status
with GCC 5.2, but builds fine with clang 3.6. Which one is correct according to the standard?
I hope there is an insightful way to understand explicit template instantiation so that answers to all the questions above can be logically deduced and explained.

[temp.explicit]/p11:
An entity that is the subject of an explicit instantiation declaration
and that is also used in a way that would otherwise cause an implicit
instantiation (14.7.1) in the translation unit shall be the subject of
an explicit instantiation definition somewhere in the program;
otherwise the program is ill-formed, no diagnostic required.

First of all, it seems like you are overthinking explicit instantiation. There is nothing special about it. All it does is allow someone to use a function or class template without having the template definition visible. It does so by creating an instance of the function or class for the specified template arguments, so that it is no longer a template but an actual usable entity. It can be used, for example, when you have a class template but want to hide the actual code in a .cpp file that you never provide to users; instead you give them the compiled .o file. To make this work, you explicitly instantiate your template with the types you believe your users are going to need as template arguments. (Of course, it is rare for the set of types to be known in advance like this.) There is nothing more to it.
Implicit and explicit instantiations for the same type can live together. On typical implementations, an implicit instantiation produces a weak symbol and an explicit one produces a 'strong' symbol. Strong symbols override weak symbols, and there is no violation of the ODR. Everything will be OK.
As for the error you have, you need to remove 'extern' from your explicit instantiation.
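As a rough illustration of implicit and explicit instantiations of the same specialization coexisting, here is a minimal single-file sketch. Counter, implicit_use and explicit_use are hypothetical names made up for this example; they are not from the question.

```cpp
// Counter is a hypothetical class template used only for illustration.
template <class T>
struct Counter {
    T n{};
    T bump() { return ++n; }      // used below, so implicitly instantiated
    T peek() const { return n; }  // not used before the explicit instantiation
};

// Implicit instantiation: Counter<int> and Counter<int>::bump are
// instantiated here because they are used.
int implicit_use() {
    Counter<int> c;
    c.bump();
    return c.bump();
}

// Explicit instantiation definition: instantiates every member of
// Counter<int>, including peek(), in this translation unit.
template struct Counter<int>;

int explicit_use() {
    Counter<int> c;
    c.bump();
    return c.peek();
}
```

Note that there is no extern here: an explicit instantiation definition placed after an implicit instantiation is allowed, whereas an explicit instantiation declaration (with extern) additionally promises that a definition exists somewhere in the program.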

Related

When will implicit instantiation cause problems?

I'm reading C++ Primer, and it says:
"If a member function isn't used, it is not instantiated. The fact that members are instantiated only if we use them lets us instantiate a class with a type that may not meet the requirements for some of the template’s operations."
I don't know why this is a problem. If some operations are required, why doesn't the compiler instantiate those operations? Can someone give an example?
That's an ease-of-use feature, not a pitfall.
Lazy instantiation serves to simplify templates. You can implement the set of all possible member functions that any specialization might have, even if some of the functions don't work for some specializations.
It also reduces compile time, since the compiler never needs to instantiate what you don't use.
To prevent lazy instantiation, use explicit instantiation:
template class my_template< some_arg >;
This will immediately instantiate all the members of the class (except members which are templates, inherited, or not yet defined). For templates that are slow to compile, you can do the above in one source file (translation unit) and then use the linker to bypass instantiation in other source files, by putting a declaration in the header:
extern template class my_template< some_arg >;
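To make the lazy-instantiation point concrete, here is a small sketch. Holder, NoLess and holder_demo are hypothetical names invented for this example. NoLess has no operator<, yet Holder<NoLess> is usable as long as less_than is never called.

```cpp
// NoLess deliberately provides no operator<.
struct NoLess { int id; };

// Holder is a hypothetical template with one member that needs operator<.
template <class T>
struct Holder {
    T item;
    const T& get() const { return item; }
    // Only compiles for a given T if this member is actually instantiated
    // and T supports operator<.
    bool less_than(const Holder& other) const { return item < other.item; }
};

int holder_demo() {
    Holder<NoLess> h{NoLess{7}};  // fine: less_than is never instantiated
    return h.get().id;
}

// Uncommenting the next line forces instantiation of every member,
// including less_than, and would fail to compile for NoLess:
// template struct Holder<NoLess>;
```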

Can I use `-fno-implicit-templates` feature only for one template?

Is it possible to forbid implicit instantiation, like -fno-implicit-templates does, but only for one template?
I have a problem with implicit instantiation of an incomplete template, which causes a compilation failure (part of the implementation is hidden in a source file, and I don't want to have it in other TUs). -fno-implicit-templates solves the problem, but at the cost of problems with using the STL and other templates.
You can try to use explicit template instantiation. Put an explicit instantiation declaration, extern template class TemplateClass<ArgumentsSet>;, in your header file, where ArgumentsSet is the set of TemplateClass arguments for which you want to avoid implicit instantiation (you can add one such declaration per argument set). Then put the explicit instantiation definition, template class TemplateClass<ArgumentsSet>;, in your source file to explicitly instantiate TemplateClass for ArgumentsSet in that translation unit.
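The header/source split described above could look roughly like the following sketch. It is collapsed into one file so it can be compiled as-is; the comments mark which part would live where. Widget, doubled and widget_demo are hypothetical names, not taken from the question.

```cpp
// ---- widget.h: what every other translation unit would see ----
template <class T>
class Widget {
public:
    explicit Widget(T v);
    T doubled() const;  // declared here, defined only in widget.cpp
private:
    T v_;
};

// Explicit instantiation declaration: tells users not to instantiate
// Widget<int> members implicitly, the definitions live elsewhere.
extern template class Widget<int>;

// ---- user code: only needs the declarations above ----
int widget_demo() {
    Widget<int> w(21);
    return w.doubled();
}

// ---- widget.cpp: the hidden part of the implementation ----
template <class T>
Widget<T>::Widget(T v) : v_(v) {}

template <class T>
T Widget<T>::doubled() const { return v_ + v_; }

// Explicit instantiation definition: emits the Widget<int> members once.
template class Widget<int>;
```

With this layout, using Widget with an argument set that was not explicitly instantiated typically shows up as an undefined reference at link time rather than as a compile error.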

How to use extern template

I've been looking through the N3291 working draft of C++0x. And I was curious about extern template. Section 14.7.3 states:
Except for inline functions and class template specializations, explicit instantiation declarations have the effect of suppressing the implicit instantiation of the entity to which they refer.
FYI: the term "explicit instantiation declaration" is standard-speak for extern template. That was defined back in section 14.7.2.
This sounds like it's saying that if you use extern template class std::vector<int>, then doing any of the things that would normally implicitly instantiate std::vector<int> will not do so.
The next paragraph is more interesting:
If an entity is the subject of both an explicit instantiation declaration and an explicit instantiation definition in the same translation unit, the definition shall follow the declaration. An entity that is the subject of an explicit instantiation declaration and that is also used in a way that would otherwise cause an implicit instantiation (14.7.1) in the translation unit shall be the subject of an explicit instantiation definition somewhere in the program; otherwise the program is ill-formed, no diagnostic required.
FYI: the term "explicit instantiation definition" is standard-speak for these things: template class std::vector<int>. That is, without the extern.
To me, these two things say that extern template prevents implicit instantiation, but it does not prevent explicit instantiation. So if you do this:
extern template class std::vector<int>;
template class std::vector<int>;
The second line effectively negates the first by explicitly doing what the first line prevented from happening implicitly.
The problem is this: Visual Studio 2008 doesn't seem to agree. The way I want to use extern template is to prevent users from implicitly instantiating certain commonly-used templates, so that I can explicitly instantiate them in the .cpp files to cut down on compile time. The templates would only be instantiated once.
The problem is that I basically have to #ifdef around them in VS2008, because if a single translation unit sees both the extern and non-extern versions, it makes the extern version win and nobody ever instantiates the template. And then come the linker errors.
So, my questions are:
What is the correct behavior according to C++0x? Should extern template prevent explicit instantiation or not?
If the answer to the previous question is that it should not, then VS2008 is in error (granted, it was written well before the spec, so it's not like it's their fault). How does VS2010 handle this? Does it implement the correct extern template behavior?
It says
Except for ...class template specializations
So it does not apply to std::vector<int> itself, but to its members (those that aren't inline member functions and, presumably, aren't nested classes). Unfortunately, there isn't a single term that covers both "class template specializations" and "specializations of member classes of class templates", so some places use only the former but mean to include the latter as well. So std::vector<int> and its nested classes (like std::vector<int>::iterator, if it is defined as a nested class) will still be implicitly instantiated if needed.
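A small sketch of that distinction, using a hypothetical Pair template (Pair, sum, first and pair_demo are names made up for this example): after the explicit instantiation declaration, the class itself is still implicitly instantiated when a complete type is needed, while the non-inline member sum() is left to the explicit instantiation definition.

```cpp
// Pair is a hypothetical class template used only for illustration.
template <class T>
struct Pair {
    T a, b;
    T sum() const;                 // non-inline: implicit instantiation is suppressed
    T first() const { return a; }  // inline: may still be instantiated implicitly
};

template <class T>
T Pair<T>::sum() const { return a + b; }

// Explicit instantiation declaration.
extern template struct Pair<int>;

// The class specialization itself is still implicitly instantiated:
// sizeof needs a complete type.
static_assert(sizeof(Pair<int>) >= 2 * sizeof(int), "Pair<int> is complete");

int pair_demo() {
    Pair<int> p{20, 22};
    return p.first() + p.sum();  // sum() comes from the definition below
}

// Explicit instantiation definition, following the declaration as required.
template struct Pair<int>;
```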

Why is this C++ explicit template specialization code illegal?

(Note: I know how it is illegal; I'm looking for the reason that the language makes it so.)
template<class c> void Foo(); // Note: no generic version, here or anywhere.
int main(){
Foo<int>();
return 0;
}
template<> void Foo<int>();
Error:
error: explicit specialization of 'Foo<int>' after instantiation
A quick pass with Google found this citation of the spec but that offers only the what and not the why.
Edit:
Several responses have put forward the argument (i.e. confirmed my speculation) that the rule is this way because to do otherwise would violate the One Definition Rule (ODR). However, this is a very weak argument because it doesn't hold in this case, for two reasons:
Moving the explicit specialization to another translation unit solves the problem and doesn't seem to violate the ODR (or so the linker says).
The short form of the ODR (as applied to functions) is that you can't have more than one body for any given function, and I don't. The only place the body of the function is ever defined is in the explicit specialization, so the call to Foo<int> can't define a generic specialization of the template because there is no generic body to be specialized.
Speculation on the matter:
A guess as to why the rule exists at all: if the first line offered a definition (as opposed to a declaration), an explicit specialization after an instantiation would be a problem because you would get multiple definitions. But in this case, the only definition in sight is the explicit specialization.
Oddly the following (or something like it in the real code I'm working on) works:
File A:
template<class c> void Foo();
int main(){
Foo<int>();
return 0;
}
File B:
template<class c> void Foo();
template<> void Foo<int>();
But to use that in general starts to create a spaghetti imports structure.
"A guess as to why the rule exists at all: if the first line offered a definition (as opposed to a declaration), an explicit specialization after an instantiation would be a problem because you would get multiple definitions. But in this case, the only definition in sight is the explicit specialization."
But you do have multiple definitions. You have already defined Foo< int > when you instantiate it and after that you try to specialize the template function for int, which is already defined.
int main(){
Foo<int>(); // Define Foo<int>();
return 0;
}
template<> void Foo<int>(); // Trying to specialize already defined Foo<int>
This code is illegal because explicit specialization appears after the instantiation. Basically, the compiler first saw the generic template, then it saw its instantiation and specialized that generic template with a specific type. After that it saw a specific implementation of the generic template that was already instantiated. So what is the compiler supposed to do? Go back and re-compile the code? That's why it is not allowed to do that.
You have to think of an explicit specialization as a function declaration. It is just like having two overloaded (non-template) functions: if only one declaration can be found before you try to call the second version, the compiler is going to say that it cannot find the required overload. The difference with templates is that the compiler can generate that specialization based on the general function template. So, why is it forbidden to do this? Because the full template specialization violates the ODR when it is seen, since, by that time, there already exists a template specialization for the same type. When a template is instantiated (implicitly or not), the corresponding template specialization is also created, so that later uses (in the same translation unit) of the same specialization can reuse the instantiation and not duplicate the template code for every instantiation. Obviously, the ODR applies just as well to template specializations as it applies elsewhere.
So, when the quoted text says "no diagnostic is required", it just means that the compiler is not required to provide you with the insightful remark that the problem is due to an instantiation of the template occurring sometime before the explicit specialization. But, if it doesn't do that, the other option is to give the standard ODR violation error, i.e., "multiple definitions of 'Foo' specialization for [T = int]" or something like that, which wouldn't be as helpful as the more clever diagnostic.
RESPONSE TO EDIT
1) The saying goes that all function template definitions (i.e. implementations) must be visible at the point of instantiation, so that the compiler can substitute the template argument(s) and instantiate the function template. However, implicit instantiation of the function template only requires that the declaration of the function be available. So, in your case, splitting it into two translation units works because it does not violate the ODR (in that TU, there is only one declaration of Foo<int>), the declaration of Foo<int> is available at the implicit instantiation point (through Foo<T>), and the definition of Foo<int> is available to the linker within TU B. No one has argued that this second example is "not supposed to work"; it works as it is supposed to. Your question is about a rule that applies within one translation unit, so don't counter the arguments by saying that the error doesn't occur when you split it into two TUs (especially when it clearly should work fine in two TUs, according to the rules).
2) In your first example, one of two things happens. Either there will be an error because the compiler cannot find the general function template (the non-specialized implementation) and thus cannot instantiate Foo<int> from it, or the compiler will find a definition for the general template, use it to instantiate Foo<int>, and then throw an error because a second template specialization Foo<int> is encountered. You seem to think that the compiler will find your specialization before it gets to it; it doesn't. C++ compilers compile the code from top to bottom; they don't go back and forth to substitute stuff here and there. When the compiler gets to the first use of Foo<int>, it sees only the general template at that point and assumes there will be an implementation of that general template that can be used to instantiate Foo<int>. It is not expecting a specialized implementation for Foo<int>; it is expecting, and will use, the general one. Then it sees the specialization and throws the error, because it had already made up its mind that the general version was to be used. So it does see two distinct definitions for the same function, and yes, it does violate the ODR. It's as simple as that.
WHY OH WHY!!!
The 2 TU case has to work because you should be able to share template instantiations between TUs, that's a feature of C++, and a useful one (in case when you have a small number of possible instantiations, you can pre-compile them).
The 1 TU case cannot be allowed because declaring something in C++ tells the compiler "there is this thing defined somewhere". You tell the compiler "there is a general definition of the template somewhere", then say "I want to use the general definition to make the function Foo<int>", and finally, you say "whenever Foo<int> is called, it should use this special definition". That's a flat-out contradiction! That's why the ODR exists and applies to this context, to forbid such contradictions. It doesn't matter whether the general definition "to-be-found" is not present; the compiler expects it, and it has to assume that it does exist and that it is different from the specialization. It cannot go on and say "ok, so, I'll look everywhere else in the code for the general definition, and if I cannot find it, then I will come back and 'approve' this specialization to be used instead of the general definition, but if I find it I will flag this specialization as an error". Nor can it go on and flatly ignore the desire of the programmer by changing code that clearly shows intent to use the general template (since the specialization is not declared yet) into code that uses a specialization that appears later. I can't possibly explain the "why" any more clearly than that.
The 2 TU case is completely different. When the compiler is compiling TU A (which uses Foo<int>), it will look for the general definition, fail to find it, assume that it will be linked in later as Foo<int>, and leave a symbol placeholder. Then, since the linker will not look for templates (templates are NOT exportable, in practice), it will look for a function that implements Foo<int>, and it doesn't care whether it is a specialized version or not. The linker is happy as long as it finds the same symbol to link to. This is so because it would be a nightmare to do it otherwise (look up discussions on "exported templates"), both for programmers (not being able to easily change functions in their compiled libraries) and for compiler vendors (having to implement this crazy linking scheme).
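For reference, the single-TU ordering that the rule does accept is to declare the explicit specialization before the first use. The sketch below assumes Foo returns int instead of void, purely so the result can be observed; call_foo is a hypothetical helper.

```cpp
// Primary template: declared, never defined (as in the question).
template <class T> int Foo();

// Declaring the explicit specialization BEFORE the first use tells the
// compiler not to implicitly instantiate Foo<int> from the primary template.
template <> int Foo<int>();

// First use: refers to the explicit specialization declared above.
int call_foo() { return Foo<int>(); }

// The definition may come later, or live in another translation unit.
template <> int Foo<int>() { return 42; }
```

Moving the specialization declaration below call_foo reproduces the "explicit specialization after instantiation" error from the question.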

Undefined template methods trick?

A colleague of mine told me about a little piece of design he has used with his team that sent my mind boiling. It's a kind of traits class that they can specialize in an extremely decoupled way.
I've had a hard time understanding how it could possibly work, and I am still unsure of the idea I have, so I thought I would ask for help here.
We are talking g++ here, specifically the versions 3.4.2 and 4.3.2 (it seems to work with both).
The idea is quite simple:
1- Define the interface
// interface.h
template <class T>
struct Interface
{
void foo(); // the method is not implemented, it could not work if it was
};
//
// I do not think it is necessary
// but they prefer free-standing methods with templates
// because of the automatic argument deduction
//
template <class T>
void foo(Interface<T>& interface) { interface.foo(); }
2- Define a class, and in the source file specialize the interface for this class (defining its methods)
// special.h
class Special {};
// special.cpp
#include <iostream>
#include "interface.h"
#include "special.h"
//
// Note that this specialization is not visible outside of this translation unit
//
template <>
struct Interface<Special>
{
void foo() { std::cout << "Special" << std::endl; }
};
3- To use, it's simple too:
// main.cpp
#include "interface.h"
class Special; // yes, it only costs a forward declaration
// which helps much in term of dependencies
int main(int argc, char* argv[])
{
Interface<Special> special;
foo(special);
return 0;
}
It's an undefined symbol if no translation unit defined a specialization of Interface for Special.
Now, I would have thought this would require the export keyword, which to my knowledge has never been implemented in g++ (and only implemented once in a C++ compiler, with its authors advising anyone not to, given the time and effort it took them).
I suspect it's got something to do with the linker resolving the template methods...
Have you ever met anything like this before?
Does it conform to the standard, or do you think it's a fortunate coincidence that it works?
I must admit I am quite puzzled by the construct...
Like @Steward suspected, it's not valid. Formally it's effectively causing undefined behavior, because the Standard rules that for a violation no diagnostic is required, which means the implementation can silently do anything it wants. From 14.7.3/6:
If a template, a member template or the member of a class template is explicitly specialized then that specialization shall be declared before the first use of that specialization that would cause an implicit instantiation to take place, in every translation unit in which such a use occurs; no diagnostic is required.
In practice, at least on GCC, it's implicitly instantiating the primary template Interface<T>, since the specialization wasn't declared and is not visible in main, and then calling Interface<T>::foo. If that member's definition is visible, it instantiates the primary definition of the member function (which is why the trick would not work if the member were defined).
Instantiated function name symbols have weak linkage because they could possibly be present multiple times in different object files, and have to be merged into one symbol in the final program. In contrast, members of explicit specializations that aren't templates anymore have strong linkage, so they will dominate weak linkage symbols and make the call end up in the specialization. All this is implementation detail, and the Standard has no such notion of weak/strong linkage. You have to declare the specialization prior to creating the special object:
template <>
struct Interface<Special>;
The Standard lays it bare (emphasis mine):
The placement of explicit specialization declarations for function templates, class templates, member functions of class templates, static data members of class templates, member classes of class templates, member class templates of class templates, member function templates of class templates, member functions of member templates of class templates, member functions of member templates of non-template classes, member function templates of member classes of class templates, etc., and the placement of partial specialization declarations of class templates, member class templates of non-template classes, member class templates of class templates, etc., can affect whether a program is well-formed according to the relative positioning of the explicit specialization declarations and their points of instantiation in the translation unit as specified above and below. When writing a specialization, be careful about its location; or to make it compile will be such a trial as to kindle its self-immolation.
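One way to keep most of the decoupling while respecting 14.7.3/6 is to specialize only the member function and declare that specialization where every user can see it. This is a sketch under assumptions, not the colleague's actual design: foo() returns std::string here instead of printing so the result can be checked, use_special is a hypothetical helper, and the three files are collapsed into one.

```cpp
#include <string>

// ---- interface.h ----
template <class T>
struct Interface {
    std::string foo();  // no generic definition
};

template <class T>
std::string foo(Interface<T>& i) { return i.foo(); }

// ---- special.h ----
struct Special {};

// Declaration of the member specialization, visible before any use.
template <>
std::string Interface<Special>::foo();

// ---- main.cpp ----
std::string use_special() {
    Interface<Special> s;
    return foo(s);  // refers to the declared explicit specialization
}

// ---- special.cpp ----
template <>
std::string Interface<Special>::foo() { return "Special"; }
```

The price is that users must include the header carrying the specialization declaration, so a bare forward declaration of Special is no longer enough.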
That's pretty neat. I'm not sure if it is guaranteed to work everywhere though. It looks like what they're doing is having a deliberately undefined template method, and then defining a specialization tucked away in its own translation unit. They're depending on the compiler using the same name mangling for both the original class template method and the specialization, which is the bit I think is probably non-standard. The linker will then look for the method of the class template, but instead find the specialization.
There are a few risks with this though. No one, not even the linker, will pick up multiple implementations of the method for example. The template methods will be marked selectany because template implies inline so if the linker sees multiple instances, instead of issuing an error it will pick whichever one is most convenient.
Still a nice trick though, although unfortunately it does seem to be a lucky coincidence that it works.