C++ support for last call optimization in template metaprogramming

I am reading about C++ templates and would like to contrast two different implementations of a function that computes the sum from 0 to N.
Unfortunately, I have run into problems and would like to address a few questions through examples:
Code for naive sum:
#include <stdio.h>

template<int N>
struct Sum {
    // Copied the implementation idea from Scott Meyers' book
    // "Effective C++". Is there a better way?
    enum { value = N + Sum<N - 1>::value };
};

template<>
struct Sum<0> {
    enum { value = 0 };
};

int main() {
    // Works well in this case, but gives a compilation error if
    // it's called with a larger value, such as 10000
    // ("error: template instantiation depth exceeds maximum of 900").
    // How can the program be improved so that it does not
    // give a compile-time error?
    printf("%d\n", Sum<100>::value);
}
Now my idea for an improvement is to use an accumulator:
template<int Acc, int N>
struct Sum {
    enum { value = Sum<Acc + N, N - 1>::value };
};

// Is that an appropriate way of writing the base case?
template<int Acc>
struct Sum<Acc, 0> {
    enum { value = Acc };
};
However, when compiled with plain g++ on Ubuntu:
int main() {
    // Still gives the "depth exceeded" error.
    printf("%d\n", Sum<0, 1000>::value);
}
Hence, my main concern is:
Does any modern C++ compiler support last call optimisation for template metaprogramming? If yes, what is an appropriate way to write code for such optimisation?

Does any modern C++ compiler support last call optimisation for template metaprogramming? If yes, what is an appropriate way to write code for such optimisation?
No, and it wouldn't make sense. Template instantiations are not function calls, so last/tail call optimisation has no relevance here. Unlike function calls, template instantiations are not transient, with automatic variables to reclaim; rather, each template instantiation becomes a new type in the compiler's state.

The whole point of template metaprogramming is that all of these "calls" will be optimised out of your program; they are "executed" during the build.
That doesn't change the fact that there is an implementation-defined limit to the amount of recursion you can use during this process. This is the limit you've hit.
So, no, there's no "optimisation" to work around it.

Short answer: incorporating LCO is not worth the trouble.
Longer explanation:
C++ template metaprogramming is Turing-complete. In theory it would be possible to compute any computable function at compile time using only templates (given enough resources). LCO would make such computation more efficient.
That does not mean templates should be used for sophisticated computations; that is what run time is for. C++ templates mainly help you avoid writing duplicated code.
In fact, doing complicated computation through templates is discouraged, because you get little compiler support: the compiler essentially just expands templated code into more code, and relatively little checking or diagnostic help is available while that happens.
So I think the designers of C++ have more interesting things to add to the language than optimising template metaprogramming. Maybe in 20 years we will have LCO support; currently there is none.

Related

C++ code example that makes the compile loop forever

Given that the C++ template system is not context-free and is also Turing-complete, can anyone provide a non-trivial example of a program that makes the g++ compiler loop forever?
For more context: if the C++ template system is Turing-complete, it can recognize all recursively enumerable languages and decide all recursive ones. That made me think about the acceptance problem and its more famous sibling, the halting problem. I also imagine that g++ must decide whether the input belongs to the C++ language during syntactic analysis. But it must also resolve all templates, and since template resolution is only recursively enumerable, there should be a C++ program that makes g++'s analysis run forever, since it cannot decide whether the program belongs to the C++ grammar or not.
I would also like to know how g++ deals with such things.
While this is true in theory for the unlimited language, compilers in practice have implementation limits for recursive behavior (e.g. how deeply template instantiations can be nested or how many operations can be evaluated in a constant expression), so it is probably not straightforward to find such a case, even if we somehow ignore the obvious problem of bounded memory. The standard specifically permits such limits, so if you want to be pedantic I am not even sure that any given implementation has to satisfy these theoretical concepts.
Also, infinitely recursive template instantiation is specifically forbidden by the language. A program with such a construct has undefined behavior, and the compiler can simply refuse to compile it when detected (although of course it cannot be detected in general).
This shows the limits for clang (Apple clang version 13.1.6, clang-1316.0.21.2.5):
#include <iostream>

template<int V>
struct Count
{
    static constexpr int value = Count<V - 1>::value + 1;
};

template<>
struct Count<1>
{
    static constexpr int value = 1;
};

int main()
{
#ifdef WORK
    int v = Count<1026>::value; // This works.
#else
    int v = Count<1027>::value; // This will fail to compile.
#endif
    std::cout << "V: " << v << "\n";
}

What is compile time function in C++?

I've searched this question here (on SO), and as far as I can tell all the answers assume the reader already knows what a compile-time function is, but it is almost impossible for a beginner to know what that means, because resources explaining it are quite rare.
I found a short Wikipedia article which shows how to write incomprehensible code using a never-seen-before use of enums in C++, and a video about the future of the feature which explains very little about it.
It seems to me that there are two ways to write a compile-time function in C++:
constexpr
template<>
I've been through a short introduction to both of them, but I have no idea how they pop up here.
Can anyone explain compile-time functions with a sufficiently good example, one that encompasses most of their relevant features?
In C++, as you mention, there are two ways of evaluating code at compile time: constexpr functions and template metaprogramming.
There are a few differences between these solutions. The template option is older, and therefore supported by a wider range of compilers. Additionally, templates are guaranteed to be evaluated at compile time, while constexpr is somewhat like inline: it only suggests to the compiler that the work can be done while compiling. Templates usually take their arguments via the template parameter list, while constexpr functions take arguments like regular functions (which they actually are). Constexpr functions have the advantage that they can also be called as regular functions at runtime.
Now the similarity: for either to be evaluated at compile time, the arguments must themselves be known at compile time, so they must be literals or the results of other compile-time computations.
Having said all that let's look at compile time max function:
template<int a, int b>
struct max_template {
    static constexpr int value = a > b ? a : b;
};

constexpr int max_fun(int a, int b) {
    return a > b ? a : b;
}

int main() {
    int x = 2;
    int y = 3;
    int foo = max_fun(3, 2);             // can be evaluated at compile time
    int bar = max_template<3, 2>::value; // is surely evaluated at compile time
    // int bar2 = max_template<x, y>::value; // won't compile: x and y are not compile-time constants
    int baz = max_fun(x, y);             // will be evaluated at runtime
    return 0;
}
A "compile time function" as you have seen the term used is not a C++ construct, it's just the idea of computing stuff (hence, function) at compile-time (as opposed to computing at runtime or via a separate build tool outside the compiler). C++ makes this possible in several ways, of which you have found two:
Templates can indeed be used to compute arbitrary stuff, a set of techniques called "template metaprogramming". That's mostly by accident as they weren't designed for this purpose at all, hence the crazy syntax and struggles with old compilers. But in C++03 and before, that's all we had.
constexpr has been added in C++11 after seeing the need for compile-time calculations, and brings them back into somewhat saner territory. Its toolbelt has been expanding ever since, allowing more and more normal-looking code to be run at compile-time by just tacking a constexpr in the right place.
One could also mention macro metaprogramming, of which Boost.Preprocessor is a good example. But it's even more wonky and abhorrently arcane than old-school template metaprogramming, so you probably don't want to use it if you have a choice.

Constructing const array out of constexpr

Suppose for the sake of argument I have following private constexpr in a class:
static constexpr uint16_t square_it(uint16_t x)
{
    return std::pow(x, 2);
}
Then I want to construct a static constant array of these values for the integers up to 255 in the same section of the same class using the above constexpr:
static const uint16_t array_of_squares[256] =
{
    // something
};
I'd like the array to be constructed at compile time, not at runtime, if possible. I think the first problem is that using functions like std::pow in a constexpr function is not valid ISO C++ (though perhaps allowed as an extension by arm-gcc?), since it can raise a domain error. The actual expression I want to use is a somewhat complicated function involving std::exp.
Note that I don't have much of the std library available as I'm compiling for a small microprocessor, the Cortex M4.
Is there a more appropriate way to do this, say using preprocessor macros? I'd very much like to avoid using something like an external Python script to calculate the table each time it needs to be modified during development, and then pasting it in.
The problem, as you say, is that C standard library functions are in general not marked constexpr.
The best workaround here, if you need to use std::exp, is to write your own implementation that can run at compile time. Since it runs at compile time, optimizing it isn't really necessary; it only needs to be accurate and moderately efficient.
Someone asked a question about how to do that here a long time ago. You could reuse the idea from there and rewrite it as a constexpr function in C++11, although you'd have to refactor it to avoid the for loop. In C++14, less refactoring would be required.
You could also try doing it strictly via templates, but it would be more painful, and since double cannot be a template parameter it would be more complicated.
How about something like this?
#include <cstddef>
#include <cstdint>
#include <utility>

constexpr uint16_t square_it(uint16_t v) { return v * v; }

template <size_t N, class = std::make_index_sequence<N>>
struct gen_table;

template <size_t N, size_t... Is>
struct gen_table<N, std::index_sequence<Is...>> {
    // Must be constexpr (not just const) for the in-class initializer;
    // in C++17, static constexpr data members are implicitly inline.
    static constexpr uint16_t values[N] = {square_it(Is)...};
};

constexpr auto&& array_of_squares = gen_table<256>::values;
I have no idea whether that microprocessor supports this sort of operation. It may not have make_index_sequence in your standard library (though you can find implementations on SO), and maybe that template instantiation will take too much memory. But at least it's something that works somewhere.

What is induction method when it comes to C++ template metaprogramming?

People keep saying to solve the problem using induction when it comes to template metaprograms. For example, see this answer: https://stackoverflow.com/a/11811486/4882052
I know induction proofs and so on, but how is this theory used to write a metaprogram? I'd love to see this explained with examples :)
"Induction" is just recursion looked at from a different point of view. In each you need one or more base cases in which a problem can be solved without recursion and you need a recursive case in which a problem can be solved by using the solutions of related problem(s) that are closer to base cases.
In run-time recursive programming, base cases can be detected by run-time conditionals. In recursive metaprogramming, even compile-time conditionals are not quite enough to handle base cases; you need separate definitions, using overloading or specialization, to cover them.
The first time I used it myself was in a rather messy situation, which I can't quote in full, but the general idea might be instructive. The compiler did various optimizations before unwinding short loops and various other optimizations after unwinding short loops, but I really needed one of those "before" optimizations done after. So I needed to force the compiler to unwind some short loops earlier in compilation, roughly:
template<unsigned N>
struct unwind {
    void operator()(X* p) { unwind<N - 1>()(p); work(p[N]); }
};

template<>
struct unwind<0> {
    void operator()(X* p) { work(p[0]); }
};
When you use that compile-time recursion instead of a run-time loop, the compiler unwinds the whole loop before doing any of its optimization, so optimizations that are normally performed before loop unwinding can be applied to code that, in my real code, would not be visible until after unwinding.
As observed in one of the comments under the OP, the TMP technique is essentially recursive, which I guess could be seen as a form of 'reverse induction' (an idea originally due to Fermat). The idea is that, for some N, you define the corresponding thing you want in terms of some lesser N, eventually terminating at some base case.
Consider the following TMP code for factorial:
#include <iostream>

template <int N>
struct Factorial {
    enum { value = N * Factorial<N - 1>::value };
};

template <>
struct Factorial<0> {
    enum { value = 1 };
};

void foo() {
    std::cout << Factorial<0>::value << "," << Factorial<3>::value;
    // outputs 1,6
}
So the general case (N) is given by a template whose value is defined in terms of lesser values of (potentially more specialised) templates, terminating at some lower bound.
An inductive proof typically has the structure:
Show that X is (usually trivially) true for some value Y.
Show that if X is true for Y, then it remains true for some other value Y + delta.
Therefore conclude that X is true for all Y + delta * N.
(...and in a lot of cases, it's really handy if delta is 1, so we can say "X is true for all non-negative integers", or something on that order). In a fair number of cases, it's also handy to extend the proof in both directions, so we can say that X is true for all integers (for one obvious example).
Most purely recursive solutions (whether template meta programming or otherwise) tend to follow roughly the same structure. In particular, we start with processing for some trivial case, then define the more complex cases in terms of an application of the base case plus some extending step.
Ignoring template metaprogramming for the moment, this is probably most easily seen in recursive algorithms for preorder, inorder and postorder traversal of trees. For these we define a base case for processing the data in a single node of the tree. This is usually sort of irrelevant to the tree traversal itself, so we often just treat it as a function named process or something similar. With this given, we can define tree traversals something like:
void in_order(Tree *t) {
    if (nullptr == t)
        return;
    in_order(t->left);
    process(t);
    in_order(t->right);
}
// preorder and postorder are the same except for the order of `process` vs. the recursive calls.
The reason many people think of this as being unique to (or at least unusually applicable to) template metaprogramming is that it's an area where C++ really only allows purely recursive solutions: unlike normal C++, you have no loops or mutable variables. There have been other languages like that for quite some time, but most of them haven't really reached the mainstream. There are quite a few more languages that tend to follow that style even though they don't truly require it, but while some of them have gotten closer to the mainstream, most are still somewhat on the fringes.

constexpr vs template for compile-time maths functions?

I'm quite confused by the new keyword constexpr in C++11. I would like to know where to use constexpr and where to use template metaprogramming when I write compile-time functions (especially maths functions). For example, if we take an integer pow function:
// 1:
template <int N> inline double tpow(double x)
{
    return x * tpow<N - 1>(x);
}
template <> inline double tpow<0>(double x)
{
    return 1.0;
}

// 2:
constexpr double cpow(double x, int N)
{
    return (N > 0) ? (x * cpow(x, N - 1)) : (1.0);
}

// 3:
template <int N> constexpr double tcpow(double x)
{
    return x * tcpow<N - 1>(x);
}
template <> constexpr double tcpow<0>(double x)
{
    return 1.0;
}
Are the 2nd and 3rd functions equivalent ?
What is the best solution? Do they produce the same result:
if x is known at compile-time
if x is not known at compile-time
When to use constexpr and when to use template metaprogramming ?
EDIT 1 : code modified to include specialization for templates
I probably shouldn't be answering a template metaprogramming question this late. But, here I go.
Firstly, constexpr isn't implemented in Visual Studio 2012. If you want to develop for windows, forget about it. I know, it sucks, I hate Microsoft for not including it.
With that out of the way: there are lots of things you can declare as constant, but they aren't really "constant" in the sense of being usable at compile time. For instance:
const int foo[5] = { 2, 5, 1, 9, 4 };
const int bar = foo[foo[2]]; // Fail!
You'd think you could read from that at compile time, right? Nope. But you can if you make it a constexpr.
constexpr int foo[5] = { 2, 5, 1, 9, 4 };
constexpr int bar = foo[foo[2]]; // Woohoo!
Constexpr is really good for "constant propagation" optimization. That means that if you have a variable X whose value is determined at compile time (perhaps via metaprogramming), declaring it constexpr lets the compiler know it can "safely" use that value during optimization: for example, replacing an instruction like a = (X * y); with a = 0; if X evaluated to 0 (and other conditions are met).
Obviously this is great because for many mathematical functions, constant propagation can give you an easy (to use) premature optimization.
Their main use, other than rather esoteric things (such as enabling me to write a compile-time byte-code interpreter a lot easier), is to be able to make "functions" or classes that can be called and used both at compile-time and at runtime.
Basically they just sort of fill a hole in C++03 and help with optimization by the compiler.
So which of your 3 is "best"?
2 can be called at run-time, whereas the others are compile-time only. That's pretty sweet.
There's a bit more to it. Wikipedia gives you a very basic summary of "constexpr allows this," but template metaprogramming can be complicated. Constexpr makes parts of it a lot easier. I wish I had a clear example for you other than say, reading from an array.
A good mathematical example, I suppose, would be if you wanted to implement a user-defined complex number class. It would be an order of magnitude more complex to code that with only template metaprogramming and no constexpr.
So when should you not use constexpr? Honestly, constexpr is basically "const, except MORE CONST." You can generally use it anywhere you'd use const, with a few caveats, such as that a constexpr function called at runtime with non-constant input behaves like an ordinary (non-constexpr) function.
Um. OK, that's all for now. I'm too overtired to say more. I hope I was helpful; feel free to downvote me if I wasn't and I'll delete this.
The 1st and 3rd are incorrect: the compiler will try to instantiate tpow<N-1> before it evaluates (N > 0) ?, and you will get infinite template recursion. You need a specialisation for N == 1 (or == 0) to make it work. The 2nd works for x known at compile time and at run time.
Added after your specialization-for-0 edit: now all the functions work for compile-time or run-time x. The 1st always returns a non-constexpr value. The 2nd and 3rd return constexpr values if x and N are constexpr. The 2nd even works if N is not constexpr; the others need a constexpr N (so the 2nd and 3rd are not equivalent).
constexpr is used in two cases. When you write int N = 10;, the value of N is known at compile time, but it is not a constant expression and cannot be used, for example, as a template argument. The keyword constexpr explicitly tells the compiler that N is safe to use as a compile-time value.
The second use is constexpr functions. They use a subset of C++ to conditionally produce constexpr values and can dramatically simplify equivalent template functions. One drawback of constexpr functions is that compile-time evaluation is not guaranteed in every context: the compiler can choose to do the evaluation at run time. With a templated implementation you are guaranteed compile-time evaluation.