This question already has answers here:
How does the compilation of templates work?
(7 answers)
Closed last year.
As some of you may know from my recent posts i am studying for a C++ exam which the content for the class was delivered very poorly. I am basically having to self teach everything myself so bear with me here.
This is an exam question:
(i)Explain the concepts of templates as defined in the C++ language.
Be sure to differentiate between what the programmer does and what the
compiler does.
My current rationale:
(i) A template allows a function or class to operate using generics. This allows the programmer to effective program X functionality once, and be able to use this functionality with many different data types without having to rewrite the application or parts of the application multiple times.
My problem is i have no idea how the compiler handles the use of templates.
I am unsure what the compiler does at this stage, if somebody could clear this up it would be helpful.
Templates in C++ are implemented through substitution. It's not like Java generics which just type check the code which involves the generics class and then compiles it using raw references (type erasure).
Basically C++ creates a different class/method for each actual template argument used in your code. If you have your
template<typename T>
void myMethod(T t)
{
//
}
what happens at compile time is that a different method is compiled for each type the template is actually used. If you use it on myMethod(50) and myMethod("foo") then two overloaded version of the method will be available at runtime. Intuitively this means that templates could generate code bloating but in practice the same expressiveness is obtained by a larger codebase without templates with less readability so that's not a real concern.
So there is no black magic behind them (ok there is if you consider meta programming or partial specialization).
let's say you write a function using templates:
template <typename T>
void function(T t){
doSomething();
}
for each data type you call this function, the compiler simply replaces the 'T' with that data type, say 'int' and generates code for that like you've written this function with 'int' instead of 'T' since the beginning.
This is probably the right (but not the complete) answer if others agreed.
For each instance of an object of a different type that you create or in case of functions the different type of arguments that you use, the compiler simply makes an overloaded version at compile time. So if you have a template function like a sort function and use that function for int and double arrays, then the compiler have actually made two functions: one using int and the other using double. This is the simplest explanation I could give. Hope it's useful.
Related
Sorry for the newbie question, but I have a feeling I am missing something here:
If I have a certain class template which looks like this (basically the only way to pass a lambda to a function in C++, unless I am mistaken):
template<typename V, typename F>
class Something
{
public:
int some_method(V val, F func) {
double intermediate = val.do_something();
return func(intermediate);
}
}
By reading the implementation of this class, I can see that the V class must implement double do_something(), and that F must be a function/functor with the signature int F(double).
However, in languages like Java or C#, the constraints for the generic parameters are explicitly stated in the generic class signature, so they are obvious without having to look at the source code, e.g.
class Something<V> where V : IDoesSomething // interface with the DoSomething() method
{
// func delegate signature is explicit
public int SomeMethod(V val, Func<double, int> func)
{
double intermediate = val.DoSomething();
return func(intermediate);
}
}
My question is: how do I know how to implement more complex input arguments in practice? Can this somehow be documented using code only, when writing a library with template classes in C++, or is the only way to parse the code manually and look for parameter usage?
(or third possibility, add methods to the class until the compiler stops failing)
C# and Java Generics have similar syntax and some common uses with C++ templates, but they are very different beasts.
Here is a good overview.
In C++, by default template checking was done by instantiation of code, and requrements are in documentation.
Note that much of the requirements of C++ templates is semantic not syntactic; iterators need not only have the proper operations, those operations need to have the proper meaning.
You can check syntactic properties of types in C++ templates. Off the top of my head, there are 6 basic ways.
You can have a traits class requirement, like std::iterator_traits.
You can do SFINAE, an accidentally Turing-complete template metaprogramming technique.
You can use concepts and/or requires clauses if your compiler is modern enough.
You can generate static_asserts to check properties
You can use traits ADL functions, like begin.
You can just duck type, and use it as if it had the properties you want. If it quacks like a duck, it is a duck.
All of these have pluses and minuses.
The downside to all of them is that they can be harder to set up than "this parameter must inherit from the type Foo". Concepts can handle that, only a bit more verbose than Java.
Java style type erasure can be dominated using C++ templates. std::function is an example of a duck typed type eraser that allows unrelated types to be stored as values; doing something as restricted as Java is rarely worthwhile, as the machinery to do it is harder than making something more powerful.
C# reification cannot be fully duplicated by C++, because C#'s runtime environment ships with a compiler, and can effectively compile a new type when you instantiate at runtime. I have seen people ship compilers with C++ programs, compile dynamic libraries, then load and execute them, but that isn't something I'd advise.
Using modern c++20, you can:
template<Number N>
struct Polynomial;
where Number is a concept which checks N (a type) against its properties. This is a bit like the Java signature stuff on steroids.
And by c++23 you'll be able to use compile time reflection to do things that make templates look like preprocessor macros.
I have a library where template classes/functions often access explicit members of the input type, like this:
template <
typename InputType>
bool IsSomethingTrue(
InputType arg1) {
typename InputType::SubType1::SubType2 &a;
//Do something
}
Here, SubType1 and SubType2 are themselves generic types that were used to instantiate InputType. Is there a way to quickly find all the types in the library that are valid to pass in for InputType (likewise for SubType1 and SubType2)? So far I have just been searching the entire code base for classes containing the appropriate members, but the template input names are reused in a lot of places so it is very cumbersome.
From a coding perspective, what is the point of using a template like this when there is only a limited set of valid input types that are probably already defined? Why not just overload this function with explicit types rather than making them generic?
From a coding perspective, what is the point of using a template like this when there is only a limited set of valid input types that are probably already defined? Why not just overload this function with explicit types rather than making them generic?
First of all, because those overload would have the exact same body, or very similar ones. If the body of the function is long enough, having more versions of it is a problem for maintenance. When you need to change the algorithm, you now have to do it N times and hope you won't make mistakes. Most of the times, redundancy is bad.
Moreover, even though now there could be just a few such types which satisfy the syntactic requirements of your function, there may be more in future. Having a function template allows you to let your algorithm work with new types without the need to write a new overload every time one new such type is introduced.
The advantage of using generic types is not on the template end: if you're willing to explicitly name them and edit the template code every time, it's the same.
What happens, however, when you introduce a subclass or variant of a type accepted by the template? No modification needed on the other end.
In other words, when you say that all types are known beforehand, you are excluding code modifications and extensions, which is half the point of using templates.
I have a simple template struct associating a string with a value
template<typename T> struct Field
{
std::string name; T self;
}
I have a function that I want to accept 1-or-more Fields of any type, and the Fields may be of possible different types, so I'm using a std::initializer_list because C++, to my knowledge, lacks typed variadic arguments, cannot determine the size of variadic arguments, and must have at least one other argument to determine where to start.
The problem is that I don't know how to tell it to accept Fields that may be of different types. In Java, I would just use foo(Field<?> bar, Field<?>... baz), but C++ lacks both typed variadic arguments and wildcards. My only other idea is to make the parameter of type
std::initializer_list<Field<void*>>, but that seems like a bad solution... Is there a better way to do it?
A couple of things...
C++11 (which you seem to have since you are talking about std::initializer_list) does have typed variadic arguments, in particular they are named variadic templates
Java generics and C++ templates are completely different beasts. Java generics create a single type that stores a reference to Object and provides automatic casting in and out to the types in the interface, but the important bit is that it performs type erasure.
I would recommend that you explain the problem you want to solve and get suggestions for solutions to your problem that are idiomatic in C++. If you want to really mimic the behavior in Java (which, I cannot insist enough is a different language and has different idioms) you can use type erasure in C++ manually (i.e. use boost::any). But I have very rarely feel the need for full type erasure in a program... using a variant type (boost::variant) is a bit more common.
If your compiler has support for variadic templates (not all compilers do), you can always play with that, but stashing the fields for later in a vector may be a bit complicated for a fully generic approach unless you use type erasure. (Again, what is the problem to solve? There might be simpler solutions...)
Java generics are closer to just stuffing a boost::any into the self variable than to C++ templates. Give that a try. C++ templates create types that have no runtime or dynamic relarionship to each other by default.
You can introduce such a relationship manually, say via a common parent and type erasure and judicious use of pImpl and smart pointers.
C type variardic arguments are out of style in C++11. Variardic template arguments are very type safe, so long as your compiler has support for them (Nov 2012 CTP for MSVC 2012 has support for them (not update 1, the CTP), as does clang, and non-ancient versions of gcc).
Templates in C++ is a kind of metaprogramming, closer to writing a program that writes a program than it is to Java Generics. A Java Generic has one shared "binary" implementation, while each instance of a C++ template is a completely different "program" (which, via procedures like COMDAT folding, can be reduced to one binary implementation), whose details are described by the template code.
template<typename T>
struct Field {
T data;
};
is a little program that says "here is how to create Field types". When you pass in an int and double, the compiler does something roughly like this:
struct Field__int__ {
int data;
};
struct Field__double__ {
double data;
};
and you wouldn't expect these two types to be convertible between.
Java generics, on the other hand, create something like this:
struct Field {
boost::any __data__;
template<typename T>
T __get_data() {
__data__.get<T>();
}
template<typename T>
void __set_data(T& t) {
__data__.set(t);
}
property data; // reading uses __get_data(), writing uses __set_data()
};
where boost::any is a container that can hold an instance of any type, and access to the data field redirects through those accessors.
C++ provides means to write something equivalent to Java generics using template metaprogramming. To write something like C++ templates in Java, you'd have to have your Java program output custom Java byte or source code, then run that code in a way that allows a debugger to connect back to the code that writes the code as the source of the bugs.
There is no need to use wildcards in C++ templates, since in C++ it always knows the type, and is not "erased" like in Java. To write void foo(Field<?> bar, Field<?>... baz) method(or function) in C++, you would write:
template<class T, class... Ts>
void foo(Field<T> bar, Field<Ts>... baz);
Each Field<Ts> can be a different type. To use the variadic parameters inside the function, you just use baz.... So say you want to call another function:
template<class T, class... Ts>
void foo(Field<T> bar, Field<Ts>... baz)
{
foo2(baz...);
}
You can also expand the type with Field<Ts>..., so if you want to put it in a tuple(you can't put them in array since they can be different types):
template<class T, class... Ts>
void foo(Field<T> bar, Field<Ts>... baz)
{
std::tuple<Field<Ts>...> data(baz...);
}
This is not very idiomatic for C++. It can be done, perhaps; Coplien's book might have some ideas. But C++ is strongly typed because it believes in typing; trying to turn it into Smalltalk or fold it like a pheasant may lead to tears.
template <typename T>
class A
{
// use the type parameter T in various ways here
}
Is there any way to automatically synthesize a workable class definition for T, as used by the template A? My expectation is a tool or a compiler trick that could generate the boiler plate code for the type parameter T, which I could tweak further to my needs.
I know if I wrote the class A, I could provide some hints to the "user" using boost concepts checks etc... But it's an unfamiliar code base where I didn't have the luxury of writing class A. So far I build the needed parameter class T manually, by reading the code for class A and with the able assistance from the compiler (with its terse messages).
Is there a better way?
If I understand you correctly, you are looking for a way to automatically generate a concept archetype for a given template class. Currently, this is not possible and maybe it never will be.
The main problem here is that it is very hard to say anything about the semantics of As code without any a priori knowledge of T. Dave Abrahams wrote a blog post not too long ago, where he showed that it is possible to to call an unconstrained function from code that is constrained by concepts and the compiler will still be able to perform the concept checks correctly.
But what you are asking for is a compiler that synthesizes concept checks out of thin air. I'm not much of a compiler person but I can't think of a way to make this possible with today's tools. Although it surely would be very cool if this became possible some day.
I'm going to outline my problem in detail to explain what I'm trying to achieve, the question is in the last paragraph if you wish to ignore the details of my problem.
I have a problem with a class design in which I wish to pass a value of any type into push() and pop() functions which will convert the value passed into a string representation that will be appended to a string inside the class, effectively creating a stream of data. The reverse will occur for pop(), taking the stream and converting several bytes at the front of the stream back into a specified type.
Making push() and pop() templates tied with stringstream is an obvious solution. However, I wish to use this functionality inside a DLL in which I can change the way the string is stored (encryption or compression, for example) without recompilation of clients. A template of type T would need to be recompiled if the algorithm changes.
My next idea was to just use functions such as pushByte(), pushInt(), popByte(), popInt() etc. This would allow me to change the implementation without recompilation of clients, since they rely only on a static interface. This would be fine. However, it isn't so flexible. If a value was changed from a byte to a short, for example, all instances of pushByte() corresponding to that value would need to be changed to pushShort(), similarly for popByte() to popShort(). Overloading pop() and push() to combat this would cause conflictions in types (causing explicit casting, which would end up causing the same problem anyway).
With the above ideas, I could create a working class. However, I wondered how specialized templates are compiled. If I created push<byte>() and push<short>(), it would be a type specific overload, and the change from byte to short would automatically switch the template used, which would be ideal.
Now, my question is, if I used specialized templates only to simulate this kind of overloading (without a template of type T), would all specializations compile into my DLL allowing me to dispatch a new implementation without client recompilation? Or are specialized templates selected or dropped in the same way as a template of type T at client compilation time?
First of all, you can't just have specialized templates without a base template to specialize. It's just not allowed. You have to start with a template, then you can provide specializations of it.
You can explicitly instantiate a template over an arbitrary set of types, and have all those instantiations compiled into your DLL, but I'm not sure this will really accomplish much for you. Ultimately, templates are basically a compile-time form of polymorphism, and you seem to need (at least a limited form of) run-time polymorphism.
I'd probably just use overloading. The problem that I'd guess you're talking about arises with something on the order of:
int a;
byte b;
a = pop();
b = pop();
Where you'd basically just be overloading pop on the return type (which, as we all know, isn't allowed). I'd avoid that pretty simply -- instead of returning the value, pass a reference to the value to be modified:
int a;
byte b;
pop(a);
pop(b);
This not only lets overload resolution work, but at least to me looks cleaner as well (though maybe I've just written too much assembly language, so I'm accustomed to things like "pop ax").
It sounds like you have 2 opposing factors:
You want your clients to be able to push/pop/etc. every numeric type. Templates seem like a natural solution, but this is at odds with a consistent (only needs to be compiled once) implementation.
You don't want your clients to have to recompile when you change implementation aspects. The pimpl idiom seems like a natural solution, but this is at odds with a generic (works with any type) implementation.
From your description, it sounds like you only care about numeric types, not arbitrary T's. You can declare specializations of your template for each of them explicitly in a header file, and define them in a source file, and clients will use the specializations you've defined rather than compiling their own. The specializations are a form of compile time polymorphism. Now you can combine it with runtime polymorphism -- implement the specializations in terms of an implementation class that is type agnostic. Your implementation class could use boost::variant to do this since you know the range of possible T's ahead of time (boost::variant<int, short, long, ...>). If boost isn't an option for you, you can come up with a similar scheme yourself so long as you have a finite number of Ts you care about.