Undefined template methods trick? - c++

A colleague of mine told me about a little piece of design he has used with his team that sent my mind boiling. It's a kind of traits class that they can specialize in an extremely decoupled way.
I've had a hard time understanding how it could possibly work, and I am still unsure of the idea I have, so I thought I would ask for help here.
We are talking g++ here, specifically the versions 3.4.2 and 4.3.2 (it seems to work with both).
The idea is quite simple:
1- Define the interface
// interface.h
template <class T>
struct Interface
{
void foo(); // the method is not implemented, it could not work if it was
};
//
// I do not think it is necessary
// but they prefer free-standing methods with templates
// because of the automatic argument deduction
//
template <class T>
void foo(Interface<T>& interface) { interface.foo(); }
2- Define a class, and in the source file specialize the interface for this class (defining its methods)
// special.h
class Special {};
// special.cpp
#include <iostream>
#include "interface.h"
#include "special.h"
//
// Note that this specialization is not visible outside of this translation unit
//
template <>
struct Interface<Special>
{
void foo() { std::cout << "Special" << std::endl; }
};
3- To use, it's simple too:
// main.cpp
#include "interface.h"
class Special; // yes, it only costs a forward declaration
// which helps much in term of dependencies
int main(int argc, char* argv[])
{
Interface<Special> special;
foo(special);
return 0;
}
It's an undefined symbol if no translation unit defined a specialization of Interface for Special.
Now, I would have thought this would require the export keyword, which to my knowledge has never been implemented in g++ (and only implemented once in a C++ compiler, with its authors advising anyone not to, given the time and effort it took them).
I suspect it has something to do with the linker resolving the template methods...
Have you ever come across anything like this before?
Does it conform to the standard, or do you think it's a fortunate coincidence that it works?
I must admit I am quite puzzled by the construct...

Like @Steward suspected, it's not valid. Formally, it causes undefined behavior: the Standard says that no diagnostic is required for this violation, which means the implementation can silently do anything it wants. From 14.7.3/6:
If a template, a member template or the member of a class template is explicitly specialized then that specialization shall be declared before the first use of that specialization that would cause an implicit instantiation to take place, in every translation unit in which such a use occurs; no diagnostic is required.
In practice, at least on GCC, it's implicitly instantiating the primary template Interface<T>, since the specialization wasn't declared and is not visible in main, and then calling Interface<T>::foo. If the definition of foo is visible, it instantiates the primary definition of the member function (which is why it wouldn't work if the method were defined).
Instantiated function name symbols have weak linkage because they could be present multiple times in different object files and have to be merged into one symbol in the final program. In contrast, members of explicit specializations are no longer templates and have strong linkage, so they override the weak symbols and the call ends up in the specialization. All this is implementation detail, and the Standard has no such notion of weak/strong linkage. You have to declare the specialization prior to creating the special object:
template <>
struct Interface<Special>;
The Standard lays it bare (emphasis mine):
The placement of explicit specialization declarations for function templates, class templates, member functions of class templates, static data members of class templates, member classes of class templates, member class templates of class templates, member function templates of class templates, member functions of member templates of class templates, member functions of member templates of non-template classes, member function templates of member classes of class templates, etc., and the placement of partial specialization declarations of class templates, member class templates of non-template classes, member class templates of class templates, etc., can affect whether a program is well-formed according to the relative positioning of the explicit specialization declarations and their points of instantiation in the translation unit as specified above and below. When writing a specialization, be careful about its location; or to make it compile will be such a trial as to kindle its self-immolation.
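One conforming arrangement can be sketched in a single file. Rather than specializing the whole class, this variant declares an explicit specialization of the member function before its first use (the declaration would sit in a header that main.cpp includes) and defines it in one source file, so callers still need only a forward declaration of Special. The std::string return type and the describe() helper are assumptions made for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <string>

// interface.h: primary template; foo() is declared but never defined.
template <class T>
struct Interface
{
    std::string foo();
};

template <class T>
std::string foo(Interface<T>& interface) { return interface.foo(); }

// special.h: a forward declaration is still enough for callers.
class Special;

// Declared before any use that would trigger implicit instantiation,
// which is what 14.7.3/6 requires.
template <>
std::string Interface<Special>::foo();

// main.cpp (hypothetical helper): only the declaration above is visible here.
inline std::string describe()
{
    Interface<Special> special;
    return foo(special);
}

// special.cpp: the single definition, resolved by the linker as usual.
template <>
std::string Interface<Special>::foo() { return "Special"; }

inline void describe_demo()
{
    assert(describe() == "Special");
}
```

Because the specialization is declared in every translation unit that uses it, the compiler never instantiates the primary foo(), and the result no longer depends on weak versus strong symbols.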

That's pretty neat. I'm not sure it is guaranteed to work everywhere, though. It looks like what they're doing is having a deliberately undefined template method, and then defining a specialization tucked away in its own translation unit. They're depending on the compiler using the same name mangling for both the original class template method and the specialization, which is the bit I think is probably non-standard. The linker will then look for the method of the class template, but instead find the specialization.
There are a few risks with this, though. For example, no one, not even the linker, will catch multiple implementations of the method. The template methods will be marked selectany (template members, like inline functions, may be defined in several translation units), so if the linker sees multiple instances, it will pick whichever one is most convenient instead of issuing an error.
Still a nice trick though, although unfortunately it does seem to be a lucky coincidence that it works.

Related

When will implicit instantiation cause problems?

I'm reading C++ Primer, and it says:
"If a member function isn't used, it is not instantiated. The fact that members are instantiated only if we use them lets us instantiate a class with a type that may not meet the requirements for some of the template’s operations."
I don't know why this is a problem. If some operations are required, why doesn't compiler instantiate those operations? Can someone give an example?
That's an ease-of-use feature, not a pitfall.
Lazy instantiation serves to simplify templates. You can implement the set of all possible member functions that any specialization might have, even if some of the functions don't work for some specializations.
It also reduces compile time, since the compiler never needs to instantiate what you don't use.
To prevent lazy instantiation, use explicit instantiation:
template class my_template< some_arg >;
This will immediately instantiate all the members of the class (except members which are templates, inherited, or not yet defined). For templates that are slow to compile, you can do the above in one source file (translation unit) and then use the linker to bypass instantiation in other source files, by putting a declaration in the header:
extern template class my_template< some_arg >;
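A minimal sketch of lazy instantiation, assuming a hypothetical Box template and a NoCompare type that deliberately lacks operator<. The class can be used as long as the member that needs the comparison is never called:

```cpp
#include <cassert>

template <typename T>
struct Box
{
    T value;

    const T& get() const { return value; }

    // Requires T to support operator<; only checked if this member is used.
    bool less_than(const Box& other) const { return value < other.value; }
};

struct NoCompare { int id; };   // deliberately has no operator<

inline void lazy_demo()
{
    Box<NoCompare> box{NoCompare{7}};   // fine: less_than is never instantiated
    assert(box.get().id == 7);

    Box<int> a{1};
    Box<int> b{2};
    assert(a.less_than(b));             // instantiated here, and int has operator<
}

// Uncommenting the next line forces every member to be instantiated and
// makes compilation fail, because NoCompare has no operator<:
// template struct Box<NoCompare>;
```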

C++ template explicit specialization - calling existing member function

I'm using explicit template specialization to initialize a std::vector with information but only for a specific type of std::vector, thus the explicit specialization. Within the constructor, if I try to call push_back or any other existing function in std::vector, compilation fails. What is the problem and how do I fix it?
simplified example:
namespace std
{
template<>
class vector<int>
{
public:
vector(void)
{
int value = 5;
push_back(value);
}
};
}
compiler message:
In constructor 'std::vector<int>::vector()':
error: 'push_back' was not declared in this scope
push_back(value);
^
Explicit specializations are completely different classes that are separate from the primary template. You have to rewrite everything.
In normal situations where you control the primary template, you would typically have some sort of common base class or base class template to collect common structures.
With a given library, it is generally a very bad idea to add specializations (unless the library explicitly says it's OK). With the C++ standard library, this is outright undefined behaviour.
(The main problem is that other translation units may be using the template instantiation which you're specializing without seeing your specialization, which violates the one-definition rule.)
Template specializations are unrelated types from both the primary template and any other specialization. It is unclear what you are attempting to do, as it is also illegal to provide specializations of templates in the std namespace unless the specialization uses your own user defined type.
If you can explain the problem to solve, you might get other options, like specializing a member function rather than the template itself...
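As a rough illustration of the last suggestion, here is a sketch that leaves std::vector alone and specializes a single member function of a user-defined wrapper. The names Prefilled, items and init are hypothetical:

```cpp
#include <cassert>
#include <vector>

template <typename T>
struct Prefilled
{
    std::vector<T> items;

    Prefilled() { init(); }

    void init() {}   // primary version: nothing to add
};

// Explicit specialization of one member only. The rest of Prefilled<int>,
// including items, still comes from the primary template.
template <>
inline void Prefilled<int>::init()
{
    int value = 5;
    items.push_back(value);
}

inline void prefilled_demo()
{
    assert(Prefilled<int>().items.size() == 1);
    assert(Prefilled<int>().items[0] == 5);
    assert(Prefilled<double>().items.empty());
}
```

Since only init() is specialized, push_back is called on an ordinary std::vector<int> member and nothing has to be rewritten.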

Template function in non-template class - Division between H and CPP files

I was (and have been for a long time) under the impression that you had to fully define all template functions in your .h files to avoid multiple definition errors that occur due to the template compilation process (non C++11).
I was reading a co-worker's code, and he had a non-template class that had a template function declared in it, and he separated the function declaration from the function definition (declared in H, defined in CPP). It compiles and works fine to my surprise.
Is there a difference between how a template function in a non template class is compiled, and how a function in a template class is compiled? Can someone explain what that difference is or where I might be confused?
The interesting bit is how and when the template gets instantiated. If the instantiations can be found at link time, the template definition doesn't need to be visible in the header file.
Sometimes explicit instantiations are used, like this:
header :
struct X {
// function template _declaration_
template <typename T> void test(const T&);
};
cpp:
#include <string>
#include "X.h"
// function template _definition_:
template <typename T>
void X::test(const T&)
{
}
// explicit function template _instantiation(s)_:
template void X::test<int>(const int&);
template void X::test<std::string>(const std::string&);
Using this sample, linking will succeed unless other translation units use instantiations of the template that were not explicitly instantiated here.
There is no difference between function templates defined in namespace scope or in class scope. It also doesn't matter whether the function is inside a class template or not. What matters is that at some point in the project any used function template (whether member or non-member) is instantiated. Let's go over the different situations:
Unused function templates don't need to be instantiated and thus their implementation doesn't need to be visible to compiler at any point in time. This sounds boring but is important e.g. when using SFINAE approaches where class or function templates are declared but not defined.
Any function template which is defined where it is used will be instantiated by the compiler in a form which allows multiple definitions across different translation units: only one of the instantiations is retained. It is important that all the different definitions are merged because you could detect differences if you took the address of a function template or used a state variable inside the function template: there shall be only one of these for each instantiation.
The most interesting setup is where the definition of the function template is not seen when the function template is used: in this case the compiler cannot instantiate it. When the compiler sees a definition of the function template in a different translation unit it wouldn't know which template arguments to instantiate! A Catch 22? Well, you can always explicitly instantiate a template once. Having multiple explicit instantiations would create multiply defined symbols.
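The first situation can be sketched with a function template that is declared but never defined and is only used in an unevaluated context. The names make_value and sum_type are hypothetical:

```cpp
#include <cassert>
#include <type_traits>

// Declared but never defined: fine as long as it is never actually called.
template <typename T>
T make_value();

// decltype is an unevaluated context, so no definition is required.
template <typename T, typename U>
struct sum_type
{
    typedef decltype(make_value<T>() + make_value<U>()) type;
};

inline bool sum_of_int_and_double_is_double()
{
    return std::is_same<sum_type<int, double>::type, double>::value;
}

inline void sum_type_demo()
{
    assert(sum_of_int_and_double_is_double());
}
```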
These are roughly the important options. There are often good reasons not to have the definition of a function template in the header. For example, you don't necessarily want to drag in dependencies you wouldn't have otherwise. Putting the definition of the function template somewhere else and explicitly instantiating it is a good thing. Also, you might want to reduce compile time, e.g. by avoiding instantiating essentially the entire I/O stream and locale library in every translation unit that uses a stream. C++ 2011 introduced extern templates, which declare that a particular template instantiation (of either a function or a class template) is defined externally once for the entire program, so there is no need to instantiate it in every translation unit that uses particularly common template arguments.
For a longer version of what I just said, including examples, have a look at a blog post I wrote last weekend on this topic.
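A compact sketch of the C++11 extern template mechanism, shown in one file with comments marking where each part would normally live. The function twice and the file names are hypothetical:

```cpp
#include <cassert>
#include <string>

// twice.h (hypothetical header): the definition, plus a declaration saying
// the std::string instantiation is provided elsewhere in the program.
template <typename T>
T twice(const T& value) { return value + value; }

extern template std::string twice<std::string>(const std::string&);

// twice.cpp: the one explicit instantiation definition for the program.
template std::string twice<std::string>(const std::string&);

inline void twice_demo()
{
    assert(twice(std::string("ab")) == "abab");
    assert(twice(21) == 42);   // int is still instantiated implicitly
}
```

Translation units that include the header and see the extern template declaration do not instantiate twice<std::string> implicitly; they link against the copy emitted by twice.cpp.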

What should happen to template class static member variables with definition in the .h file

If a template class definition contains a static member variable that depends on the template type, I'm unsure of what the reliable behavior should be?
In my case it is desirable to place the definition of that static member in the same .h file as the class definition, since
I want the class to be general for many template data types that I don't currently know.
I want only one instance of the static member to be shared throughout my program for each given template type (one for all MyClass<int>, one for all MyClass<double>, etc.).
I can be most brief by saying that the code listed at this link behaves exactly as I want when compiled with gcc 4.3. Is this behavior according to the C++ Standard so that I can rely on it when using other compilers?
That link is not my code, but a counter example posted by CodeMedic to the discussion here. I've found several other debates like this one but nothing I consider conclusive.
I think the linker is consolidating the multiple definitions found ( in the example a.o and b.o ).
Is this the required/reliable linker behavior?
From N3290, 14.6:
A [...] static data member of a class template shall be defined in
every translation unit in which it is implicitly instantiated [...], unless the corresponding specialization is explicitly instantiated [...] .
Typically, you put the static member definition in the header file, along with the template class definition:
template <typename T>
class Foo
{
static int n; // declaration
};
template <typename T> int Foo<T>::n; // definition
To expand on the concession: If you plan on using explicit specializations in your code, like:
template <> int Foo<int>::n = 12;
then you must not put the templated definition in the header if Foo<int> is also used in TUs other than the one containing the explicit specialization, since you'd then get multiple definitions.
However, if you do need to set an initial value for all possible parameters without using explicit specialization, you have to put that in the header, e.g. with TMP:
// in the header
template <typename T> int Foo<T>::n = GetInitialValue<T>::value; // definition + initialization
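A small sketch of the resulting behavior, assuming a hypothetical Counter template: each instantiation gets its own static member, and an explicitly specialized member can carry a different initial value.

```cpp
#include <cassert>

template <typename T>
struct Counter
{
    static int n;                        // declaration
    static int bump() { return ++n; }
};

// Definition in the header: the toolchain merges the copies so that each
// Counter<T> has exactly one n in the whole program.
template <typename T>
int Counter<T>::n = 0;

// Explicit specialization of the member for one type. In a multi-file
// program this definition belongs in exactly one .cpp file, with the
// declaration "template <> int Counter<long>::n;" in the header.
template <>
int Counter<long>::n = 12;

inline void counter_demo()
{
    assert(Counter<int>::bump() == 1);
    assert(Counter<int>::bump() == 2);
    assert(Counter<double>::bump() == 1);   // separate variable per type
    assert(Counter<long>::bump() == 13);    // starts from the specialized value
}
```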
This is wholly an addition to @Kerrek SB's excellent answer. I'd add it as a comment, but there are many of them already, so new comments are hidden by default.
So, his and other examples I saw are "easy" in the sense that the type of the static member variable is known beforehand. It's easy because the compiler, for example, knows the storage size for any template instantiation, so one may think that the compiler could use a funky mangling scheme, output the variable definition once, and offload the rest to the linker, and that might even work.
But it's a bit amazing that it works when the static member type depends on a template parameter. For example, the following works:
template <typename width = uint32_t>
class Ticks : public ITimer< width, Ticks<width> >
{
protected:
volatile static width ticks;
};
template <typename width> volatile width Ticks<width>::ticks;
(Note that the out-of-class definition of the static variable doesn't need, or allow, the default argument for "width".)
So it suggests that the C++ compiler has to do quite a lot of processing: to instantiate a template, it needs not only the template itself but also all the out-of-class [static member] definitions (one may only wonder why they were made separate syntactic constructs rather than something spelled out within the template class).
As for the implementation of this at the linker level, for GNU binutils it's "common symbols": http://sourceware.org/binutils/docs/as/Comm.html#Comm . (For Microsoft toolchains, it's named COMDAT, as another answer says.)
The linker handles such cases almost exactly the same as for non-template class static members with __declspec(selectany) declaration applied, like this:
class X {
public:
X(int i){};
};
__declspec(selectany) X x(1);//works in msvc, for gcc use __attribute__((weak))
And as msdn says: "At link time, if multiple definitions of a COMDAT are seen, the linker picks one and discards the rest... For dynamically initialized, global objects, selectany will discard an unreferenced object's initialization code, as well."

Why do C++ template definitions need to be in the header? [duplicate]

Possible Duplicate:
Why should the implementation and the declaration of a template class be in the same header file?
E.g. when defining a template class, why do the implementations of the class methods need to be in the header? Why can't they be in an implementation file (cpp/cxx)?
A template class is not a class, it's a template that can be used to create a class. When you instantiate such a class, e.g. MyTemplate<int>, the compiler creates the class on the spot. In order to create it, it has to see all the templated member functions (so that it can use the templates to create actual member functions such as MyTemplate<int>::foo() ), and therefore these templated member functions must be in the header.
If the members are not in the header, the compiler will simply assume that they exist somewhere else and just create actual function declarations from the templated function declarations, and this gives you linker errors.
The "export" keyword is supposed to fix this, but few compilers support it (I only know of Comeau).
You can also explicitly instantiate MyTemplate<int> - then the compiler will create actual member functions for MyTemplate<int> when it compiles the cpp files containing the MyTemplate member function definition templates.
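A sketch of that last option in one file, with comments marking what would go in the header and what in the cpp file. MyTemplate and foo come from the text above; the body of foo is an assumption for illustration:

```cpp
#include <cassert>

// MyTemplate.h: declarations only; no member definitions are visible here.
template <typename T>
class MyTemplate
{
public:
    T foo(T x);
};

// MyTemplate.cpp: the member definition template...
template <typename T>
T MyTemplate<T>::foo(T x) { return x + x; }

// ...and an explicit instantiation, which emits MyTemplate<int>::foo into
// this object file so that other translation units can link against it.
template class MyTemplate<int>;

inline void my_template_demo()
{
    MyTemplate<int> m;
    assert(m.foo(21) == 42);
}
```

In a real multi-file layout, a translation unit using MyTemplate<double> would still fail to link, because only the int instantiation was emitted.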
They need to be visible for the compiler when they are instantiated. That basically means that if you are publishing the template in a header, the definitions have to be visible by all translation units that include that header if you depend on implicit instantiation.
They need not be defined in the header if you are going to explicitly instantiate the templates, but this is in most cases not a good idea.
As to the reason, it basically boils down to the fact that templates are not compiled when the compiler parses the definition, but rather when they are instantiated, and then they are compiled for the particular instantiation type.
If your compiler supports export, then it doesn't. Only EDG-based compilers support export, and it's going to be removed from C++0x because of that.
Non-exported templates require that the compiler can see the full template definition, in order to instantiate it for the particular types you supply as arguments. For example:
template<typename T>
struct X {
T t;
X(int i): t(i) {}
};
Now, when you write X<float>(5) in some translation unit, the compiler as part of compiling that translation unit must check that the constructor of X is type-correct, generate the code for it, and so on. Hence it must see the definition of X, so that it can permit X<float>(5) but forbid X<char*>(5).
The only sensible way to ensure that the compiler sees the same template definition in all translation units that use it, is to put the definition in a header file. As far as the standard is concerned, though, you're welcome to copy-and-paste it manually, or to define a template in a cpp file that is used only in that one translation unit.
export in effect tells the compiler that it must output a parsed form of the template definition into a special kind of object file. Then the linker performs template instantiation. With normal toolchains, the compiler is smart enough to perform template instantiation and the linker isn't. Bear in mind that template instantiation has to do pretty much everything that the compiler does beyond basic parsing.
They can be in a CPP file.
The problem arises from the fact that the compiler builds the code for a specific instantiation of a template class (e.g. std::vector< int >) on a per-translation-unit basis. The problem with defining the functions in a CPP file is that you will need to provide every form you use in that CPP file (through explicit instantiation or explicit specialization).
So for the int vector example above you could define a function in a CPP file for the int case using specialization.
e.g.
template<> void std::vector< int >::push_back( int& intVal )
Of course doing this can produce the advantage of optimisation for specific cases but it does give you an idea of just how much code bloat can be introduced by STL! At least all the functions aren't defined as inline as a certain compiler used to do ;)
That aspect of template is called the compilation model, not to be confused with the instantiation mechanism which was the subject of How does C++ link template instances.
The instantiation mechanism is the answer to the question "When is the instantiation generated?"; the compilation model is the answer to "Where is the source found?"
There are two standard compilation models:
inclusion, the one that you know, where the definition must be available,
separated, which allows the definition to be put somewhere else with the help of the keyword export. That one has been removed from the standard and won't be available in C++0x. One of the reasons for removal was that it wasn't widely implemented (only one implementation).
See C++ Templates: The Complete Guide by David Vandevoorde and Nicolai Josuttis or http://www.bourguet.org/v2/cpplang/export.pdf for more information, the separated compilation model being the subject of the latter paper.