Where are template functions instantiated? - c++

I believe that there are 4 situations where my question may have different answers. These situations are sorted by member vs. non-member functions and within vs. without a library.
Non-member function within a library
Suppose that I have defined a template function func in header func.h.
// func.h
template <typename T>
int func(T t){
//definition
}
I #include "func.h" in two cpp files of the same project/library/executable and call
//a.cpp
#include "func.h"
//stuff
int m = func<int>(3);
//stuff
and
//b.cpp
#include "func.h"
//stuff
int n = func<int>(27);
//stuff
My understanding is that these two cpp files should compile into their own object files. In what object file is func<int> instantiated? Why will the One Definition Rule not be violated? For this basic application of templates, is there any benefit to explicitly instantiating func<int> separate from its use?
Member function within a library
Suppose that func is instead a member function of some class Func.
// func.h
class Func {
template <typename T>
int func(T t){
//definition
}
};
Where will func be instantiated? Will func<int> be linked to or placed inline?
Member and Non-Member functions across libraries
Suppose that a.cpp and b.cpp are in different libraries that are compiled separately and later linked into an executable. Will the different libraries have their own definitions of func<int>? At link time, why will the One Definition Rule not be violated?
Note: There is a related question of the same title here, but in a specific situation with one cpp file.

In what object file is func<int> instantiated?
In every object file (aka translation unit) that invokes it or takes an address of it when the template definition is available.
Why will the One Definition Rule not be violated?
Because the standard says so in [basic.def.odr].13.
Also see https://en.cppreference.com/w/cpp/language/definition
There can be more than one definition in a program of each of the following: class type, enumeration type, inline function, inline variable (since C++17), templated entity (template or member of template, but not full template specialization), as long as all of the following is true...
For this basic application of templates, is there any benefit to explicitly instantiating func<int> separate from its use?
In this case you get no inlining but possibly smaller code. If you use link-time code generation, then inlining may still happen.

The answers to all your questions are essentially the same. Regardless of whether the template function is a member or a free function, it is going to be instantiated on first use (with given types) in each compilation unit (.cpp file).
From the compiler's standpoint, ODR is not violated here, since there are no two prohibited definitions of a templated function. The standard explicitly allows several definitions of templated functions.
Yet your intuition that you end up with the definition of the instantiated function twice in the object files is correct. Luckily, ODR doesn't apply at this point. Instead, those definitions are emitted as so-called 'weak' symbols, telling the linker that the two symbols are identical and that it is free to pick one (or none, and perform link-time optimization!).
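The observable consequence of that merging can be sketched in a single file. This is only an illustration: the counter body of func and the helper names address_seen_by_a and address_seen_by_b are hypothetical, and in a real project each helper would live in its own .cpp file that includes func.h.

```cpp
#include <cassert>

// func.h (the body is a hypothetical stand-in for "//definition")
template <typename T>
int func(T t) {
    static int calls = 0;  // one counter per instantiation, shared program-wide
    (void)t;
    return ++calls;
}

using IntFunc = int (*)(int);

// a.cpp would contain this helper
IntFunc address_seen_by_a() { return &func<int>; }

// b.cpp would contain this helper
IntFunc address_seen_by_b() { return &func<int>; }

// Both "translation units" must agree on the address and on the static counter.
void check_single_instance() {
    assert(address_seen_by_a() == address_seen_by_b());
    int first = address_seen_by_a()(3);
    int second = address_seen_by_b()(27);
    assert(second == first + 1);
}
```

If the linker kept two separate copies of func<int>, the addresses and the counters would diverge, which is exactly what the standard forbids.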

Related

Symbol table entry for templates

I have a doubt about template functions.
If I write normal functions like
int function1(int x);
int function1(int x, int y);
two symbol table entries will be made for function1.
Each entry represents one overloaded function.
In the case of a template function, what exactly happens and how is it handled by the compiler?
template<class X>
int function1(X a);
How many symbol table entries will be present for template functions?
There is no such thing as a "template function" in C++. There are "function templates," that is, templates (or perhaps blueprints) for creating functions. Once you accept this, the answer becomes easier to discover.
A template is purely a compilation construct. Each function instantiated from the template is (naturally) a function and thus it will have its own symbol.
Just as with inline functions, if one and the same function (= using the same set of template arguments) is instantiated from the template in different translation units (= different .cpp files), the compiler & linker must ensure these are merged into one, because their addresses must be the same. How they do it is an implementation detail of theirs; the standard just mandates that they have to do it. And I am afraid I don't know the technical details of how this can be done, so I can't provide an example.
When you create an instance of the function template using specific template parameter, the compiler generates one overloaded method. So when you use the method
template<class X>
int function1(X a);
to use like
function1<int>(5);
function1<double>(5.0);
Compiler generates
int function1(int a);
int function1(double a);
These are overloaded and there will be 2 symbols.
In fact, templates are much like macros with strong type checking. At compile time, function templates are replaced with actual instances. In object code the function template itself does not exist, only the instantiated overloads (if any).
In the case that nobody is using the template to create a compiled function, in fact no entry is created in the symbol table. Otherwise, for each different use of the template (for each type or type combination—in case there are several type parameters to it) one more compiled function and thus one more entry in the symbol table is created.
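A small sketch of "one function per instantiation", under the assumption that function1 simply counts its own calls (that body, the two pointer variables, and the check function are made up for illustration):

```cpp
#include <cassert>

// Hypothetical body: count how often each instantiation has been called.
template <class X>
int function1(X a) {
    static int calls = 0;  // a separate variable for every instantiation
    (void)a;
    return ++calls;
}

// Each instantiation is an ordinary function with its own type and address.
int (*int_version)(int) = &function1<int>;
int (*double_version)(double) = &function1<double>;

void check_separate_instantiations() {
    assert(int_version(5) == 1);       // first call of function1<int>
    assert(double_version(5.0) == 1);  // first call of function1<double>
    assert(int_version(6) == 2);       // second call of function1<int>
}
```

Only function1<int> and function1<double> are generated here; an instantiation such as function1<char>, which nothing uses, produces no code and no symbol.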
Templates are evaluated on compile time only; after compiling they do not exist anymore.
For sure "one for each instantiation".
A function template by itself is not a type, or a function, or any other entity.
No code is generated from a source file that contains only template definitions.
In order for any code to appear, a template must be instantiated: the template
arguments must be determined so that the compiler can generate an actual function

multiple definition of a function in templated class

I saw an interesting thing but couldn't understand why.
template<class dataType>
Class A
{
AFnc();
}
template<> A<int>::AFnc() { }
Using only specialized template generates an error saying multiple definition of the same function. And it says it was generated at the same place.
But if I add
template<class dataType>
A<dataType>::AFnc()
{
}
Then it gets rid of the error.
Why ? Could someone please explain this behavior.
(You need to clean up your syntax. I assume that the actual code does not have all those syntax errors.)
An explicit specialization of a function template is no longer a template, since it does not depend on any template parameters anymore. From the point of view of the One Definition Rule (ODR) it is an "ordinary" function. And, as an "ordinary" function, it has to be declared in a header file and defined only once in some implementation file. You apparently defined your specialization in a header file, which leads to an ODR violation if the header file gets included into multiple translation units (hence your "multiple definition" errors).
In your example, template<> void A<int>::AFnc() (I added void as return type) is no longer a template. This means that this definition
template<> void A<int>::AFnc() { }
must be moved from the header file to some implementation file. Meanwhile, in the header file you have to keep a non-defining declaration for this function
template<> void A<int>::AFnc(); // <- note, no function body
to let the compiler know that such specialization exists.
In general, remember the simple rule: if your function template still depends on some unspecified template parameters, it is a true template and it has to be defined in header file. But once you "fix" all the parameters (by explicit specialization) it is no longer a template. It becomes an ordinary function that has to be declared in header file and defined only once in some implementation file.
P.S. The above applies to non-inline functions. Inline functions can be (and are usually supposed to be) defined in header files.
P.P.S. The same logic applies to explicit specializations of static data members of template classes.
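Put together, the layout described above might look like the following single-file sketch. The file boundaries are marked with comments; the int return type, the return values, and the extra A<long> specialization are assumptions added only so the effect can be observed.

```cpp
#include <cassert>

// ---- A.h ----
template <class dataType>
class A {
public:
    int AFnc();  // returns int here only so the result can be checked
};

// Primary definition: still a template, so it stays in the header.
template <class dataType>
int A<dataType>::AFnc() { return 0; }

// Non-defining declaration of the specialization: stays in the header.
template <> int A<int>::AFnc();

// Alternative from the P.S.: an inline specialization may live in the header.
template <> inline int A<long>::AFnc() { return 2; }

// ---- A.cpp ----
// The one and only definition of the non-inline specialization.
template <> int A<int>::AFnc() { return 1; }

void check_specializations() {
    assert(A<double>().AFnc() == 0);  // primary template
    assert(A<int>().AFnc() == 1);     // specialization defined in A.cpp
    assert(A<long>().AFnc() == 2);    // inline specialization
}
```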
I guess you put the explicit specialization in a header file. Then its code is emitted in every translation unit that includes that file. Just move this code
template<> void A<int>::AFnc() { }
to a .cpp file and it will be emitted only once.
You don't get this error with the templated method because the rules for implicit instantiation are different.

Template function in non-template class - Division between H and CPP files

I was (and have been for a long time) under the impression that you had to fully define all template functions in your .h files to avoid multiple definition errors that occur due to the template compilation process (non C++11).
I was reading a co-worker's code, and he had a non-template class that had a template function declared in it, and he separated the function declaration from the function definition (declared in H, defined in CPP). It compiles and works fine to my surprise.
Is there a difference between how a template function in a non template class is compiled, and how a function in a template class is compiled? Can someone explain what that difference is or where I might be confused?
The interesting bit is how and when the template gets instantiated. If the instantiations can be found at link time, the template definition doesn't need to be visible in the header file.
Sometimes explicit instantiations are used, like this:
header :
struct X {
// function template _declaration_
template <typename T> void test(const T&);
};
cpp:
#include "X.h"
#include <string>
// function template _definition_:
template <typename T>
void X::test(const T&)
{
}
// explicit function template _instantiation(s)_:
template void X::test<int>(const int&);
template void X::test<std::string>(const std::string&);
Using this sample, linking will succeed unless specializations that were not explicitly instantiated are used in other translation units.
There is no difference between function templates defined at namespace scope or at class scope. It also doesn't matter whether it is inside a class template or not. What matters is that at some point in the project any used function template (whether member or non-member) is instantiated. Let's go over the different situations:
Unused function templates don't need to be instantiated and thus their implementation doesn't need to be visible to compiler at any point in time. This sounds boring but is important e.g. when using SFINAE approaches where class or function templates are declared but not defined.
Any function template which is defined where it is used will be instantiated by the compiler in a form which allows multiple definitions across different translation units: only one of the instantiations is retained. It is important that all the different definitions are merged because you could detect differences if you took the address of a function template or used a state variable inside the function template: there shall be only one of these for each instantiation.
The most interesting setup is where the definition of the function template is not seen when the function template is used: in this case the compiler cannot instantiate it. When the compiler sees a definition of the function template in a different translation unit it wouldn't know which template arguments to instantiate! A Catch 22? Well, you can always explicitly instantiate a template once. Having multiple explicit instantiations would create multiply defined symbols.
These are roughly the important options. There are often good reasons not to have the definition of a function template in the header. For example, you don't necessarily want to drag in dependencies you wouldn't have otherwise. Putting the definition of the function template somewhere else and explicitly instantiating it is a good thing. Also, you might want to reduce compile time, e.g. by avoiding instantiating essentially the entire I/O stream and locale library in every translation unit using the stream. In C++ 2011, extern templates were introduced, which allow you to declare that a particular template specialization (of either a function or a class template) is instantiated externally once for the entire program, so there isn't any need to instantiate it in every translation unit that uses particularly common template arguments.
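A minimal sketch of the extern template mechanism, collapsed into one file. The square function and the file names square.h, square.cpp and user.cpp are hypothetical; the comments mark where each part would normally live.

```cpp
#include <cassert>

// ---- square.h (hypothetical header) ----
template <typename T>
T square(T value) { return value * value; }

// Explicit instantiation declaration (C++11): translation units that include
// this header are told that square<int> is instantiated elsewhere.
extern template int square<int>(int);

// ---- square.cpp (hypothetical source file) ----
// Explicit instantiation definition: the single place where square<int>
// is actually generated.
template int square<int>(int);

// ---- user.cpp ----
int area_of_side(int side) { return square<int>(side); }

void check_square() {
    assert(area_of_side(4) == 16);     // uses the explicit instantiation
    assert(square<double>(1.5) == 2.25);  // still instantiated implicitly
}
```

Types that are not covered by an extern template declaration, such as double here, keep the usual implicit instantiation in every translation unit that uses them.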
For a longer version of what I just said, including examples, have a look at a blog post I wrote last weekend on this topic.

What should happen to template class static member variables with definition in the .h file

If a template class definition contains a static member variable that depends on the template type, I'm unsure of what the reliable behavior should be?
In my case it is desirable to place the definition of that static member in the same .h file as the class definition, since:
I want the class to be general for many template data types that I don't currently know.
I want only one instance of the static member to be shared throughout my program for each given template type (one for all MyClass<int>, one for all MyClass<double>, etc.).
I can be most brief by saying that the code listed at this link behaves exactly as I want when compiled with gcc 4.3. Is this behavior according to the C++ Standard so that I can rely on it when using other compilers?
That link is not my code, but a counter example posted by CodeMedic to the discussion here. I've found several other debates like this one but nothing I consider conclusive.
I think the linker is consolidating the multiple definitions found ( in the example a.o and b.o ).
Is this the required/reliable linker behavior?
From N3290, 14.6:
A [...] static data member of a class template shall be defined in
every translation unit in which it is implicitly instantiated [...], unless the corresponding specialization is explicitly instantiated [...] .
Typically, you put the static member definition in the header file, along with the template class definition:
template <typename T>
class Foo
{
static int n; // declaration
};
template <typename T> int Foo<T>::n; // definition
To expand on the concession: if you plan on using explicit specializations in your code, like:
template <> int Foo<int>::n = 12;
then you must not put the templated definition in the header if Foo<int> is also used in TUs other than the one containing the explicit specialization, since you'd then get multiple definitions.
However, if you do need to set an initial value for all possible parameters without using explicit specialization, you have to put that in the header, e.g. with TMP:
// in the header
template <typename T> int Foo<T>::n = GetInitialValue<T>::value; // definition + initialization
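A self-contained sketch of that header-only pattern follows. The GetInitialValue trait bodies and the chosen values are assumptions, and n is made public here only so it can be inspected from outside the class.

```cpp
#include <cassert>

// Hypothetical trait supplying the initial value for each type.
template <typename T>
struct GetInitialValue { static const int value = 0; };

template <>
struct GetInitialValue<double> { static const int value = 7; };

template <typename T>
class Foo {
public:
    static int n;  // declaration
};

// Definition + initialization, kept in the header next to the class.
template <typename T>
int Foo<T>::n = GetInitialValue<T>::value;

void check_static_members() {
    assert(Foo<int>::n == 0);
    assert(Foo<double>::n == 7);
    Foo<int>::n = 42;             // changes only the int instance
    assert(Foo<double>::n == 7);  // the double instance is untouched
}
```

Every Foo<T> gets exactly one n for the whole program, even when this header is included from many translation units.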
This is wholly an addition to @Kerrek SB's excellent answer. I'd add it as a comment, but there are many of them already, so new comments are hidden by default.
So, his and other examples I saw are "easy" in the sense that type of static member variable is known beforehand. It's easy because compiler for example knows storage size for any template instantiation, so one may think that compiler could use funky mangling scheme, output variable definition once, and offload the rest to linker, and that might even work.
But it's a bit amazing that it works when the static member type depends on a template parameter. For example, the following works:
template <typename width = uint32_t>
class Ticks : public ITimer< width, Ticks<width> >
{
protected:
volatile static width ticks;
};
template <typename width> volatile width Ticks<width>::ticks;
(Note that the out-of-class definition of the static variable doesn't need (or allow) the default argument for "width".)
So, it brings more thoughts, that C++ compiler has to do quite a lot of processing - in particular, to instantiate a template, not only a template itself is needed, but it must also collect all [static member] explicit instantiations (one may only wonder then why they were made separate syntactic constructs, not something to be spelled out within the template class).
As for the implementation of this at the linker level, for GNU binutils it's "common symbols":
http://sourceware.org/binutils/docs/as/Comm.html#Comm . (For Microsoft toolchains, it's named COMDAT, as another answer says).
The linker handles such cases almost exactly the same as for non-template class static members with __declspec(selectany) declaration applied, like this:
class X {
public:
X(int i){};
};
__declspec(selectany) X x(1);//works in msvc, for gcc use __attribute__((weak))
And as msdn says: "At link time, if multiple definitions of a COMDAT are seen, the linker picks one and discards the rest... For dynamically initialized, global objects, selectany will discard an unreferenced object's initialization code, as well."

Why declaration/definition must both be in source file for template class in c++?

Can anyone elaborate on the reason?
Source files are compiled independently of one another into object code, then later linked into the main program. Template functions, on the other hand, cannot be compiled without the template parameters. So the file that uses them needs to have that code in order for it to be compiled. Therefore the functions need to be visible in the header file.
Promised example:
template<class T>
void swap(T & a, T & b)
{
T temp = a;
a = b;
b = temp;
}
The only requirements of class T here are that it has a public copy constructor and a public assignment operator (=). That's just about every class that has ever been implemented or conceived. However, each class implements these operations in its own way. The same machine code cannot be generated for swap<int>, swap<double> and swap<string>. Each one of those functions has to be unique. At the same time, the compiler cannot possibly anticipate all the myriad different types that you might pass to this function, so it can't generate the functions ahead of time. So it has to wait until the function is called with a concrete type, and only then can it be compiled.
For example, let's say I have that function above defined in "swap.h". Then in "main.cpp", I do this:
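#include <string>
#include "swap.h"
int main()
{
int a=5, b=10;
double c=3.5, d=7.9;
std::string s1="hello";
std::string s2="world";
// qualified calls, so that argument-dependent lookup cannot pick std::swap instead
::swap(a,b);
::swap(c,d);
::swap(s1,s2);
}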
In this example, 3 different functions were created. One to swap ints, one to swap doubles, and one to swap strings. In order to create those functions, the compiler needed to be able to see the template code. If it was in a separate source file, "swap.cpp" for example, the compiler wouldn't be able to see it, because like I said before, each source file is compiled independently of one another.
Are you asking why template bodies have to be in header files? It's because the compiler needs to know both the body and the template parameter(s) at the same time in order to generate machine code. The template parameters are known where the template is used (instantiated). This gives you one trivial case and two non-trivial ones:
(Trivial) The template is only used in one source file, so the body can be in that same source file.
Make the body available at every use, which often means in a header file.
In the source file which contains the body, explicitly instantiate every needed combination of template parameters.
The short answer to your question is that there is no obligation for declaration and definition of template classes to be in the same source files.
In fact, I consider this a bad thing, but it's completely understandable, given that it's pretty difficult to use them separately (but it can be done!).
EDIT
Suppose you have
myTemplateClass.h which declares a template class MyTemplateClass
myTemplateClass.hpp which defines its class members (includes myTemplateClass.h)
use of MyTemplateClass inside main.cpp
Simply include myTemplateClass.h in main.cpp and create myTemplateClassInt.cpp as follows :
#include "myTemplateClass.hpp"
template class MyTemplateClass<int>;
Doing that, you tell the compiler to instantiate all template methods of MyTemplateClass for template parameter "int". Since it has access to myTemplateClass.hpp, such methods will be generated flawlessly... And the linker won't complain.
Of course, this approach requires a dedicated place, such as myTemplateClassInt.cpp above, where the instantiated versions of your template classes are defined.
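The three-file layout can be sketched in one listing. The members of MyTemplateClass (a constructor and get) and the read_back helper are invented for illustration; comments mark which file each part would belong to.

```cpp
#include <cassert>

// ---- myTemplateClass.h: declares the class template ----
template <typename T>
class MyTemplateClass {
public:
    explicit MyTemplateClass(T value);
    T get() const;
private:
    T value_;
};

// ---- myTemplateClass.hpp: defines the members ----
template <typename T>
MyTemplateClass<T>::MyTemplateClass(T value) : value_(value) {}

template <typename T>
T MyTemplateClass<T>::get() const { return value_; }

// ---- myTemplateClassInt.cpp: explicit instantiation for int ----
template class MyTemplateClass<int>;

// ---- main.cpp: would only need to include myTemplateClass.h ----
int read_back(int value) {
    MyTemplateClass<int> holder(value);
    return holder.get();
}

void check_read_back() {
    assert(read_back(5) == 5);
}
```

With the files really split this way, using MyTemplateClass<double> from main.cpp would fail at link time, because no translation unit instantiates it.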