Static variable inside template function - c++

In C++, if you define this function in header.hpp
void incAndShow()
{
static int myStaticVar = 0;
std::cout << ++myStaticVar << " " << std::endl;
}
and you include header.hpp in at least two .cpp files, then you will get multiple definitions of incAndShow(), which is expected. However, if you make the function a template
template <class T>
void incAndShow()
{
static int myStaticVar = 0;
std::cout << ++myStaticVar << " " << std::endl;
}
then you won't get any multiple-definition error. Likewise, two different .cpp files calling the function with the same template argument (e.g. incAndShow<int>()) will share myStaticVar. Is this normal? I'm asking because I rely on this "feature" (sharing the static variable) and I want to be sure that it is not only my implementation that does this.

You can rely on this. The ODR (One Definition Rule) says at 3.2/5 in the Standard, where D stands for the non-static function template (emphasis mine):
If D is a template, and is defined in more than one translation unit, then the last four requirements from the list above shall apply to names from the template’s enclosing scope used in the template definition (14.6.3), and also to dependent names at the point of instantiation (14.6.2). If the definitions of D satisfy all these requirements, then the program shall behave as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.
Of the last four requirements, the two most important are roughly
each definition of D shall consist of the same sequence of tokens
names in each definition shall refer to the same things ("entities")
Edit
I figure that this alone is not sufficient to guarantee that your static variables in the different instantiations are all the same. The above only guarantees that the multiple definitions of the template are valid. It doesn't say anything about the specializations generated from it.
This is where linkage kicks in. If the name of a function template specialization (which is a function) has external linkage (3.5/4), then a name that refers to such a specialization refers to the same function. For a template that was declared static, functions instantiated from it have internal linkage, because of
Entities generated from a template with internal linkage are distinct from all entities generated in other translation units. -- 14/4
A name having namespace scope (3.3.6) has internal linkage if it is the name of [...] an object, reference, function or function template that is explicitly declared static -- 3.5/3
If the function template wasn't declared with static, then it has external linkage (that, by the way, is also the reason that we have to follow the ODR at all. Otherwise, D would not be multiply defined at all!). This can be derived from 14/4 (together with 3.5/3)
A non-member function template can have internal linkage; any other template name shall have external linkage. -- 14/4.
Finally, we come to the conclusion that a function template specialization generated from a function template with external linkage has itself external linkage by 3.5/4:
A name having namespace scope has external linkage if it is the name of [...] a function, unless it has internal linkage -- 3.5/4
When it has internal linkage is explained by 3.5/3 for functions provided by explicit specializations, and by 14/4 for generated specializations (template instantiations). Since your template name has external linkage, all your specializations have external linkage: if you use their name (incAndShow<T>) from different translation units, they will refer to the same functions, which means your static objects will be the same in each case.

Just so I understand your question: you are asking whether it is normal for each version of the templated function to have its own instance of myStaticVar (for example, incAndShow<int> vs. incAndShow<float>). The answer is yes.
Your other question is whether two files that include the header containing the template function will still share the static variable for a given T. I would say yes.

The difference when you create the function template is that it has external linkage. The same incAndShow will be accessible from all translation units.
Paraphrasing from C++ standard working draft N2798 (2008-10-04):
14 part 4: a non member function template can have internal linkage, others always have external linkage.
14.8 point 2: every specialization will have its own copy of the static variable.
Your function template should have external linkage unless you declare it in the unnamed namespace or something. So, for each T that you use with your function template, you should get one static variable used throughout the program. In other words, it's OK to rely on having only one static variable in the program for each instantiation of the template (one for T==int, one for T==short, etc).
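To make the "one static variable per instantiation" point concrete, here is a minimal single-file sketch. It assumes a variant of the question's function, called incAndCount here (a made-up name), that returns the counter instead of printing it so the values can be checked; demoPerSpecializationCounters is likewise just an illustrative driver. A single file cannot show the sharing across translation units, only that each T gets its own counter.

```cpp
#include <cassert>

// Same shape as incAndShow from the question, but returning the new count
// instead of printing it (an assumption made so the result can be checked).
template <class T>
int incAndCount()
{
    static int myStaticVar = 0;  // one object per specialization
    return ++myStaticVar;
}

// Hypothetical driver, standing in for code called once from main().
void demoPerSpecializationCounters()
{
    assert(incAndCount<int>() == 1);
    assert(incAndCount<int>() == 2);    // same specialization, same counter
    assert(incAndCount<short>() == 1);  // different T, separate counter
    assert(incAndCount<int>() == 3);    // unaffected by the <short> call
}
```

If the template lived in a header with external linkage, the same counters would be expected to be shared by every .cpp file that includes it.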
As an aside, this can lead to weird situations if you define incAndShow differently in different translation units. E.g., if you define it to increment in one file and to decrement in another (without specifying internal linkage, for example by putting the function into the unnamed namespace), both files will end up sharing the same function. Which one is used is effectively arbitrary and is decided when the program is linked (with g++ it depends on the order the object files are given on the command line). Formally this violates the ODR, so the behavior is undefined.

Templates are instantiated as needed, which means that the compiler (and, in this case, the linker as well?) will make sure that you don't end up with multiple instances of the same template, and that only the instances you need are created. In your case only incAndShow<int>() is instantiated and nothing else (otherwise the compiler would have to try to instantiate for every type, which does not make sense).
So I assume that the same mechanism it uses to figure out which types to instantiate the template for also prevents it from instantiating twice for the same type, e.g. there is only one instance of incAndShow<int>().
This is different from non-template code.

Yes, it is "normal", but whatever you are trying to achieve with this "feature" may be questionable. Try to explain why you want to use a local static variable; maybe we can come up with a cleaner way to do it.
The reason this is normal is the way function templates are compiled and linked. Each translation unit (the two .cpp files in your case) gets its own copy of incAndShow, and when the program is linked together, the two copies are merged into one. If you declare your regular function inline in the header file, you'll get a similar effect.
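As a rough illustration of the inline alternative, the sketch below marks a non-template function inline so that it could sit in a header. The name incAndCountInline and the int return value are assumptions made for the example; the question's function prints instead of returning.

```cpp
#include <cassert>

// header.hpp (sketch): "inline" allows this definition to appear in every
// .cpp file that includes the header. The program still behaves as if
// there were one function and one myStaticVar.
inline int incAndCountInline()
{
    static int myStaticVar = 0;
    return ++myStaticVar;
}

// some .cpp file (sketch): hypothetical driver, called once from main().
void demoInlineCounter()
{
    assert(incAndCountInline() == 1);
    assert(incAndCountInline() == 2);
}
```

Without the inline keyword, the same header included from two .cpp files would bring back the multiple-definition error from the question.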

Take this example that shows the behaviour is absolutely expected:
#include <iostream>
template <class T> class Some
{
public:
    static int stat;  // one stat per Some<T> specialization
};
template <class T>
int Some<T>::stat = 10;
int main()
{
    Some<int>::stat = 5;  // changes only the int specialization
    std::cout << Some<int>::stat << std::endl;
    std::cout << Some<char>::stat << std::endl;
    std::cout << Some<float>::stat << std::endl;
    std::cout << Some<long>::stat << std::endl;
}
You get: 5 10 10 10
The above shows that the change to the static variable affects only type int, and hence in your case you don't see any problem.

Templates are only turned into code once they're instantiated (i.e. used).
Headers are normally meant for declarations rather than implementation code; template definitions are the usual exception, because they have to be visible wherever they are instantiated.

Related

Where are template functions instantiated?

I believe that there are 4 situations where my question may have different answers. These situations are sorted by member vs. non-member functions and within vs. without a library.
Non-member function within a library
Suppose that I have defined a template function func in header func.h.
// func.h
template <typename T>
int func(T t){
//definition
}
I #include "func.h" in two cpp files of the same project/library/executable and call
//a.cpp
#include "func.h"
//stuff
int m = func<int>(3);
//stuff
and
//b.cpp
#include "func.h"
//stuff
int n = func<int>(27);
//stuff
My understanding is that these two cpp files should compile into their own object files. In what object file is func<int> instantiated? Why will the One Definition Rule not be violated? For this basic application of templates, is there any benefit to explicitly instantiating func<int> separate from its use?
Member function within a library
Suppose that func is instead a member function of some class Func.
// func.h
class Func {
template <typename T>
int func(T t){
//definition
}
};
Where will func be instantiated? Will func<int> be linked to or placed inline?
Member and Non-Member functions across libraries
Suppose that a.cpp and b.cpp are in different libraries that are compiled separately and later linked into an executable. Will the different libraries have their own definitions of func<int>? At link time, why will the One Definition Rule not be violated?
Note: There is a related question of the same title here, but in a specific situation with one cpp file.
In what object file is func<int> instantiated?
In every object file (i.e. every translation unit) that calls it or takes its address while the template definition is visible.
Why will the One Definition Rule not be violated?
Because the standard says so in [basic.def.odr]/13.
Also see https://en.cppreference.com/w/cpp/language/definition
There can be more than one definition in a program of each of the following: class type, enumeration type, inline function, inline variable (since C++17), templated entity (template or member of template, but not full template specialization), as long as all of the following is true...
For this basic application of templates, is there any benefit to explicitly instantiating func<int> separate from its use?
In this case you get no inlining but possibly smaller code. If you use link-time code generation, then inlining may still happen.
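For reference, explicit instantiation of func<int> could look roughly like the sketch below. The body of func is a placeholder (the question elides it), callFromA and callFromB are hypothetical stand-ins for the code in a.cpp and b.cpp, and everything is shown in one file even though the pieces would normally be split up as the comments indicate.

```cpp
#include <cassert>

// func.h (sketch)
template <typename T>
int func(T t)
{
    return static_cast<int>(t) + 1;  // placeholder body, not from the question
}

// func.h (sketch): explicit instantiation declaration (C++11). It tells
// including files not to instantiate func<int> implicitly.
extern template int func<int>(int);

// func.cpp (sketch): explicit instantiation definition. Only one
// translation unit in the program should contain this line.
template int func<int>(int);

// a.cpp and b.cpp (sketch): both simply call the function.
int callFromA() { return func<int>(3); }
int callFromB() { return func<int>(27); }

// Hypothetical driver, standing in for code called from main().
void demoExplicitInstantiation()
{
    assert(callFromA() == 4);
    assert(callFromB() == 28);
}
```

With this layout, a.o and b.o would be expected to reference func<int> rather than each carrying its own copy.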
Your questions do not really differ when it comes to the answer. Regardless of whether the template function is a member or a free function, it is instantiated on first use (with the given types) in each compilation unit (.cpp file).
From the compiler's standpoint, the ODR is not violated here, since there are no two prohibited definitions of the templated function. The standard explicitly allows several definitions of templated functions.
Yet your intuition that you end up with the definition of the instantiated function twice in the object files is correct. Luckily, the ODR doesn't apply at this point. Instead, those definitions are emitted as so-called 'weak' symbols, telling the linker that the two symbols are identical and that it is free to pick one (or none, and perform link-time optimization!).

Why are templates not redefined, why is it all written in the header file?

If I write something like this:
// A.h
#ifndef A_h
#define A_h
class A
{
public:
void f();
};
void A::f()
{
}
#endif //A_h
// B.cpp
#include "A.h"
void foo()
{
A a;
a.f();
}
// C.cpp
#include "A.h"
void bar()
{
A b;
b.f();
}
// main.cpp
#include "B.cpp"
#include "C.cpp"
using namespace std;
int main()
{
foo();
bar();
return 0;
}
I get a linker error as such:
error LNK2005: "public: void __thiscall A::f(void)" (?f@A@@QAEXXZ)
already defined in B.obj
Why does the same problem not happen when the A class is a class template? Eventually it becomes a plain class (a non-template one) during compilation, right? For this reason, I expect the same behavior as a non-template class, i.e. a linker error.
There are two separate effects at work here:
A member function definition that's out of line is a normal function definition, and by the one definition rule (ODR) it must occur precisely once in the link. A member function defined inside the class definition is implicitly inline, and the ODR allows inline function definitions to be repeated.
That is, it's OK to put the following code in a header and include it repeatedly:
struct Foo {
void bar() {} // "inline" implied
};
But if you have the definition out of line, it must be in a single translation unit.
Function templates can be defined repeatedly, even when they are not inline. The templating mechanism in general already needs to deal with repeated instantiations of templates, and with their deduplication at link time.
Member functions of class templates are themselves function templates, and therefore it doesn't matter whether you declare them inline.
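The two effects can be put side by side in a short sketch. It assumes the question's class A, with f changed to return an int so the calls can be checked; ATemplate and demoHeaderSafeDefinitions are made-up names for illustration.

```cpp
#include <cassert>

// A.h (sketch): non-template class with an out-of-class member definition.
class A
{
public:
    int f();
};

// Without "inline" this is the LNK2005 case from the question. With it,
// the definition may be repeated in every file that includes the header.
inline int A::f() { return 1; }

// Class template: the out-of-class member definition needs no "inline",
// because it is itself a templated definition.
template <class T>
class ATemplate
{
public:
    int f();
};

template <class T>
int ATemplate<T>::f() { return 2; }

// Hypothetical driver, standing in for code called from main().
void demoHeaderSafeDefinitions()
{
    A a;
    ATemplate<int> t;
    assert(a.f() == 1);
    assert(t.f() == 2);
}
```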
Why is it that the non template functions are treated differently to templates when it comes to multiple definitions?
There are historical and compatibility issues involved here. Some of the requirements come from C; that is simply the way it worked. Other reasons relate to what templates are: they are code generators. When required, the compiler needs to generate the code, and consequently it needs to see the template's definition when it does so. This has the knock-on effect that there will be multiple definitions, so rules are required to resolve those issues.
Simply put, templates behave (w.r.t. linking) as if they had a single definition in the program, hence they do not behave the same during compilation and linking as non-templates (that are not declared with inline), in particular w.r.t. functions. If the non-templates are declared inline, similar behaviour is seen.
Standard references here include:
Some background, most of the issues here relate to linkage, what is linkage? §3.5/2 [basic.link]
A name is said to have linkage when it might denote the same object, reference, function, type, template, namespace or value as a name introduced by a declaration in another scope:
When a name has external linkage, the entity it denotes can be referred to by names from scopes of other translation units or from other scopes of the same translation unit.
When a name has internal linkage, the entity it denotes can be referred to by names from other scopes in the same translation unit.
When a name has no linkage, the entity it denotes cannot be referred to by names from other scopes.
Some general rules relating to functions and variables, for the program as a whole and for each of the translation units.
§3.2/1 [basic.def.odr]
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template.
And
§3.2/4 [basic.def.odr]
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program...
§3.2/6 [basic.def.odr]
There can be more than one definition of a class type (Clause [class]), enumeration type ([dcl.enum]), inline function with external linkage ([dcl.fct.spec]), class template (Clause [temp]), non-static function template ([temp.fct]), static data member of a class template ([temp.static]), member function of a class template ([temp.mem.func]), or template specialization for which some template parameters are not specified ([temp.spec], [temp.class.spec]) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements....
If D is a template and is defined in more than one translation unit, then the preceding requirements shall apply both to names from the template's enclosing scope used in the template definition ([temp.nondep]), and also to dependent names at the point of instantiation ([temp.dep]). If the definitions of D satisfy all these requirements, then the behavior is as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.
An informal observation on the above list (classes, templates, etc.): these are typical elements found in header files (though of course not exclusively). They are given these special rules to make everything work as expected.
What about class member functions? §9.3 [class.mfct]
1/ A member function may be defined ([dcl.fct.def]) in its class definition, in which case it is an inline member function ([dcl.fct.spec]), or it may be defined outside of its class definition if it has already been declared but not defined in its class definition. A member function definition that appears outside of the class definition shall appear in a namespace scope enclosing the class definition...
2/ An inline member function (whether static or non-static) may also be defined outside of its class definition provided either its declaration in the class definition or its definition outside of the class definition declares the function as inline or constexpr. [ Note: Member functions of a class in namespace scope have the linkage of that class. Member functions of a local class ([class.local]) have no linkage. See [basic.link]. — end note ]
So basically, member functions that are not defined in the class definition are not implicitly inline; hence the "normal" rules apply and they can appear only once in the program.
And what does the standard say about the linkage of templates? §14/4 [temp]
A template name has linkage ([basic.link]). Specializations (explicit or implicit) of a template that has internal linkage are distinct from all specializations in other translation units... Template definitions shall obey the one-definition rule ([basic.def.odr]).
Templates are not code; they are patterns for creating code. They must be visible wherever they are used, so the compiler has to have special rules for using them. The key special rules here are that the compiler generates code wherever a template is used and that the linker ignores duplicates.

When does a non-member function template have internal linkage?

C++11 draft, 14/4:
A non-member function template can have internal linkage; any other template name shall have external linkage.
This query is a consequence of separating template declaration and definition. For example, we can write the following in a header file.
template <typename T>
bool operator==(T const & l, T const & r);
In a single source file, destined to become a single translation unit, we write the definition. We also instantiate it, either implicitly or explicitly, in the same translation unit, for type foo.
template <typename T>
bool operator==(T const & l, T const & r)
{
return extract(l) == extract(r); // extract is uninteresting
}
In a second translation unit, which can only see the declaration from the header, we attempt to use foo{} == foo{}, that is, to call the operator== that is instantiated elsewhere.
Currently, this "works". The linker patches the two translation units as I hoped it would.
However, if the function template has internal linkage, the link can fail. For example, we can force this by instantiating within an anonymous namespace.
Does the "can" in the spec indicate that the source code controls the linkage (e.g. by namespace {}) or that the compiler is permitted to choose whether the instantiation will have internal or external linkage?
I don't believe there is any undefined behaviour here, but I am struggling to convince myself that the linkage chosen is not an implementation detail. Can I rely on the symbol being visible from other translation units, if it has been instantiated in at least one TU in a context that suggests it will be external?
edit: DR1603 (thanks Eugene Zavidovsky!) contains a recommendation to erase exactly the sentence quoted above, alongside general rationalisation of linkage rules.
Does the "can" in the spec indicate that the source code controls the linkage (e.g. by namespace {}) or that the compiler is permitted to choose whether the instantiation will have internal or external linkage?
It is the source code that controls the linkage. A function generated from a function template has external linkage unless the template is declared static or is declared in an unnamed namespace (since C++11).
In other words, one would have to explicitly ask for internal linkage.
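The three spellings might look like the sketch below. All function names are invented for illustration, and a single file can only show the syntax: the observable difference (shared versus per-file counters) would only appear once the templates are used from several translation units.

```cpp
#include <cassert>

// External linkage (the default): every translation unit that calls
// sharedCounter<int>() refers to the same function and the same static.
template <class T>
int sharedCounter()
{
    static int n = 0;
    return ++n;
}

// Internal linkage, requested explicitly with "static": each translation
// unit gets its own private specializations.
template <class T>
static int perFileCounter()
{
    static int n = 0;
    return ++n;
}

// Internal linkage via an unnamed namespace (since C++11).
namespace {
template <class T>
int alsoPerFileCounter()
{
    static int n = 0;
    return ++n;
}
}  // namespace

// Hypothetical driver, standing in for code called once from main().
void demoLinkageForms()
{
    assert(sharedCounter<int>() == 1);
    assert(perFileCounter<int>() == 1);
    assert(alsoPerFileCounter<int>() == 1);
}
```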

Why is class redefinition in several cpp files permitted?

Suppose I have two cpp files:
//--a.cpp--//
class A
{
public:
void bar()
{
printf("class A");
}
};
//--b.cpp--//
class A
{
public:
void bar()
{
printf("class A");
}
};
When I compile and link these files together, I get no errors. But if I write the following:
//--a.cpp--//
int a;
//--b.cpp--//
int a;
After compiling and linking these sources I get an error about the redefinition of a. In the case of the classes there is a redefinition too, but no error is raised. I'm confused.
Classes are types. For the most part, they are compile-time artifacts; global variables, on the other hand, are runtime artifacts.
In your first example, each translation unit has its own definition of class A. Since the translation units are separate from each other, and because they do not produce global runtime artifacts with identical names, this is OK. The standard requires that there be exactly one definition of a class per translation unit; see sections 3.2/1 and 3.2/4:
No translation unit shall contain more than one definition of any variable, function, class type, enumeration type, or template.
Exactly one definition of a class is required in a translation unit if the class is used in a way that requires the class type to be complete.
However, the standard permits multiple class definitions in separate translation units; see section 3.2/6:
There can be more than one definition of a class type, enumeration type, inline function with external linkage, class template, non-static function template, static data member of a class template, member function of a class template, or template specialization for which some template parameters are not specified in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. [...]
What follows is a long list of requirements, which boils down to the two class definitions needing to be the same; otherwise, the behavior is undefined.
In your second example you are defining a global runtime artifact (variable int a) in two translation units. When the linker tries to produce the final output (an executable or a library) it finds both of these, and issues a redefinition error. Note that rule 3.2/6 above does not include variables with external linkage.
If you declare your variables static, your program will build, because static variables are local to the translation unit in which they are defined.
Although both programs would then build, the reasons are different: with multiple class definitions the compiler assumes that the two classes are the same; with the static variables, the compiler considers the two variables independent of each other.
There are actually two different flavors of the One Definition Rule.
One flavor, which applies to global and namespace variables, static class members, and functions without the inline keyword, says that there can only be one definition in the entire program. These are the things that typically go in *.cpp files.
The other flavor, which applies to type definitions, functions ever declared with the inline keyword, and anything with a template parameter, says that the definition can appear once per translation unit but must be defined with the same source and have the same meaning in each. It's legal to copy-paste into two *.cpp files as you did, but typically you would put these things in a header file and #include that header from all the *.cpp files that need them.
Classes can't be used (except in very limited ways) unless the definition is available within the translation unit that uses it. This means that you need multiple definitions in order to use it in multiple units, and so the language allows that - as long as all the definitions are identical. The same rules apply to various other entities (such as templates and inline functions) for which a definition is needed at the point of use.
Usually, you would share the definition by putting it in a header, and including that wherever it's needed.
Variables can be used with only a declaration, not the definition, so there's no need to allow multiple definitions. In your case, you could fix the error by making one of them a pure declaration:
extern int a;
so that there is only one definition. Again, it's common for such declarations to go in headers, to make sure they're the same in every file that uses them.
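A minimal sketch of that split follows. The file names in the comments, the value 42, and the helpers readA, writeA and demoSharedVariable are all illustrative assumptions; the pieces are shown in one file only so that the example is complete.

```cpp
#include <cassert>

// a.h (sketch): a pure declaration, safe to include from every file.
extern int a;

// a.cpp (sketch): the single definition in the whole program.
int a = 42;  // 42 is an arbitrary illustrative value

// b.cpp (sketch): sees only the declaration, yet uses the same object.
int readA() { return a; }
void writeA(int value) { a = value; }

// Hypothetical driver, standing in for code called from main().
void demoSharedVariable()
{
    writeA(7);
    assert(a == 7);
    assert(readA() == 7);
}
```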
For the full, gory details of the One Definition Rule, see C++11 3.2, [basic.def.odr].

How is a situation when different implementations of an inline function are linked into one executable classified?

According to One Definition Rule (ODR) I can't have a function
void function()
{
}
defined more than once in one executable; the linker will object. However, the ODR is ignored for inline functions:
inline void function()
{
}
can be defined in a header file that will be #included into multiple .cpp files, so when the resulting .obj files are linked together the linker sees that there are several instances of that function and intentionally ignores that. It assumes it is the very same function and just uses one of the instances. Since the program behavior is preserved, no one cares.
But if, for any reason (use of the preprocessor included), those instances happen to have different implementations, the linker will again pick one of the functions, and the developer won't even know which one is picked until he thoroughly tests his program.
How is the latter situation, in which the linker picks one of the functions and they happen to have different implementations, classified? Is this undefined behavior or some other kind of situation?
Yes, it is UB for inline functions with external linkage (I think that's what the OP intends).
§3.2/5:
There can be more than one definition of a class type (clause 9), enumeration type (7.2), inline function with external linkage (7.1.2), class template (clause 14), non-static function template (14.5.5), static data member of a class template (14.5.1.3), member function of a class template (14.5.1.1), or template specialization for which some template parameters are not specified (14.7, 14.5.4) in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
- each definition of D shall consist of the same sequence of tokens; and
The same paragraph states at the end that failure to meet these requirements leads to UB.
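For illustration, the preprocessor case from the question might arise from a header like the sketch below. USE_FAST_PATH is a hypothetical macro, and the function returns an int here (the question's version returns void) so the chosen body is visible. Within one translation unit this is perfectly well defined; the problem only appears when some .cpp files are compiled with the macro and others without it.

```cpp
#include <cassert>

// function.h (sketch)
inline int function()
{
#ifdef USE_FAST_PATH
    return 1;  // token sequence seen by files built with -DUSE_FAST_PATH
#else
    return 2;  // token sequence seen by all other files
#endif
}

// Hypothetical driver for a file compiled without USE_FAST_PATH.
void demoSingleTranslationUnit()
{
    assert(function() == 2);
}
```

Linking objects built both ways would give two definitions with different token sequences, which is the situation the quoted paragraph classifies as undefined behavior.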