Wondering if the following surprises anyone, as it did me? Alex Allain's article here on using constexpr shows the following factorial example:
constexpr factorial (int n)
{
return n > 0 ? n * factorial( n - 1 ) : 1;
}
And states:
Now you can use factorial(2) and when the compiler sees it, it can
optimize away the call and make the calculation entirely at compile
time.
I tried this in VS2015 in Release mode with full optimizations on (/Ox) and stepped through the code in the debugger viewing the assembly and saw that the factorial calculation was not done at compilation.
Using GCC v5.4.0 with --std=C++14, I must use /O2 or /O3 before the calculation is performed at compile time. I was surprised thought that using just /O the calculation did not occur at compilation time.
Main main question is: Why is VS2015 not performing this calculation at compilation time?
It depends on the context of the function call.
For example, the following obviously could never be calculated at compile time:
int x;
std::cin >> x;
std::cout << factorial(x);
On the other hand, this context would require the answer at compile time:
class Foo {
int x[factorial(4)];
};
constexpr functions are only guaranteed to be evaluated at compile time if they are called from a constexpr context; otherwise it is up to the compiler to choose whether or not to eval at compile time (assuming such an optimization is possible, again, depending on the context).
You have to use it in const expression, as:
constexpr auto res = factorial(2);
else computation can be done at runtime.
constexpr is neither necessary nor sufficient to compile time evaluation of a function.
It's not sufficient, even aside from the fact that the arguments obviously also have to be constant expressions. Even if that is true, a conforming compiler does not have to evaluate it at compile time. It only has to be evaluated at compile time if it is in a constexpr context. Such as, assigning the result of the computation to a constexpr variable, or using the value as an array size, or as a non-type template parameter.
The other point, is that the compiler is completely capable of evaluating things at compile time, even without constexpr. There is a lot of confusion about this, and it's not clear why. compile time evaluation of constexpr functions fundamentally just boils down to constant propagation, and compilers have been doing this optimization since forever: https://godbolt.org/g/Sy214U.
int factorial(int n) {
if (n <= 1) return 1;
return n * factorial(n-1);
}
int foo() { return factorial(5); }
On gcc 6.3 with O3 (and 14) yields:
foo():
mov eax, 120
ret
In essence, outside of the specific case where you absolutely force compile time evaluation by assigning a constexpr function to another constexpr variable, compile time evaluation has more to do with the quality of your optimizer than the standard.
Related
This question already has answers here:
When does a constexpr function get evaluated at compile time?
(2 answers)
Closed 2 years ago.
This code, when compiled with g++ -O3, does not seem to evaluate get_fibonacci(50) at compile time - as it runs for a very long time.
#include <iostream>
constexpr long long get_fibonacci(int num){
if(num == 1 || num == 2){return 1;}
return get_fibonacci(num - 1) + get_fibonacci(num - 2);
}
int main()
{
std::cout << get_fibonacci(50) << std::endl;
}
Replacing the code with
#include <iostream>
constexpr long long get_fibonacci(int num){
if(num == 1 || num == 2){return 1;}
return get_fibonacci(num - 1) + get_fibonacci(num - 2);
}
int main()
{
long long num = get_fibonacci(50);
std::cout << num << std::endl;
}
worked perfectly fine. I don't know exactly why this is occurring, but my guess is that get_fibonacci(50) is not evaluated at compile-time in the first scenario because items given std::cout are evaluated at runtime. Is my reasoning correct, or is something else happening? Can somebody please point me in the right direction?
Actually, both versions of your code do not have the Fibonnaci number computed at compile-time, with typical compilers and compilation flags. But, interestingly enough, if you reduce the 50 to be, say, 30, both versions of your program do have the compile-time evaluation.
Proof: GodBolt
At the link, your first program is compiled and run first with 50 as the argument to get_fibbonacci(), then with 30, using GCC 10.2 and clang 11.0.
What you're seeing is the limits of the compiler's willingness to evaluate code at compile-time. Both compilers engage in the recursive evaluation at compile time - until a certain depth, or certain evaluation time cap, has elapsed. They then give up and leave it for run-time evaluation.
I don't know exactly why this is occurring, but my guess is that get_fibonacci(50) is not evaluated at compile-time in the first scenario because items given std::cout are evaluated at runtime
Your function can be computed compile-time, because receive a compile-time know value (50), but can also computed run-time, because the returned value is send to standard output so it's used run-time.
It's a gray area where the compiler can choose both solutions.
To impose (ignoring the as-if rule) the compile-time computation, you can place the returned value in a place where the value is required compile-time.
For example, in a template parameter, in your first example
std::cout << std::integral_constant<long long, get_fibonacci(50)>::value
<< std::endl;
or in a constexpr variable, in your second example
constexpr long long num = get_fibonacci(50);
But remember there is the "as-if rule", so the compiler (in this case, also using constexpr or std::integral_constant) can select the run-time solution because this "do not change the observable behavior of the program".
Assign to a constexpr to get the compiler to spit out an error message
constexpr auto val = get_fibonacci(50);
constexpr functions are evaluated at compile time only in constexpr context, which includes assignment to constexpr variables, template parameter, array size...
Regular function/operator call is not such context.
std::cout << get_fibonacci(50);
is done at runtime.
Now, compiler might optimize any (constexpr or not, inline or not) functions with the as-if rule, resulting in a constant, a simpler loop, ...
I am using constexpr to get the Fibonacci number
Enumeration is used to calculate the fibonacci in compile time
#include <iostream>
constexpr long fibonacci(long long n)
{
return n < 1 ? -1 :
(n == 1 || n == 2 ? 1 : fibonacci(n - 1) + fibonacci(n - 2));
}
enum Fibonacci
{
Ninth = fibonacci(9),
Tenth = fibonacci(10),
Thirtytwo = fibonacci(32)
};
int main()
{
std::cout << Fibonacci(Thirtytwo);
// std::cout << fibonacci(32);
return 0;
}
I am getting the following error on execution:
1>c:\users\hsingh\documents\visual studio 2017\projects\consoleapplication4\consoleapplication4\source.cpp(12): note: while evaluating 'fibonacci(30)'
1>c:\users\hsingh\documents\visual studio 2017\projects\consoleapplication4\consoleapplication4\source.cpp(6): note: while evaluating 'fibonacci(31)'
1>c:\users\hsingh\documents\visual studio 2017\projects\consoleapplication4\consoleapplication4\source.cpp(12): note: while evaluating 'fibonacci(31)'
1>c:\users\hsingh\documents\visual studio 2017\projects\consoleapplication4\consoleapplication4\source.cpp(12): error C2131: expression did not evaluate to a constant
1>c:\users\hsingh\documents\visual studio 2017\projects\consoleapplication4\consoleapplication4\source.cpp(5): note: failure was caused by control reaching the end of a constexpr function
1>c:\users\hsingh\documents\visual studio 2017\projects\consoleapplication4\consoleapplication4\source.cpp(12): note: while evaluating 'fibonacci(32)'
1>c:\users\hsingh\documents\visual studio 2017\projects\consoleapplication4\consoleapplication4\source.cpp(14): error C2057: expected constant expression
1>Done building project "ConsoleApplication4.vcxproj" -- FAILED.
But when I use run time
int x=30, y=2;
std::cout << fibonacci(x+y);//fibonacci is calculated on run time
I won't say I have a question but I have few confusions like:
Is memory used by compile time and run time use of constexpr different?
How do I know where to stop exploiting or using compile time data?
Which I am still trying to do is how to do Fibo-like calculation using both the advantage of compile time and run time together (use compile till it can be and after that let rest of calculation be done on run time).
Any example or reference if available will be helpful.
Memory used by the constexpr function at compile time is implementation dependent, but in general should be comparable to runtime (most compilers will compile and execute the statement).
In theory, you should use compile time evaluated expressions whenever possible. In practice, it's a judgment call (perhaps a good topic for an SE question), as the down sides are increased compile time (and perhaps memory) and lack of debugging.
It appears you are reaching the maximum recursion limit allowed by MSVC in compile time expressions. I can't find any documentation on what this limit is, but it's configurable on other compilers. Your error is the result of the enum requiring it to be fully evaluated at compile time, where as the cout call allows it be executed at compile time and/or run time (if you generate assembly, you should see compile time constants generated for the lower number calls, and a recursive function being used for the high number calls).
The reason for error is that you are trying to calculate Fib number of Fib number of 32. This is way to much!
You hit the maximum limit of constexpr function recursion, and thus you see a compilation error.
At runtime your program would crash at this point, but it doesn't, because your run-time expression is different - it is a Fib of 32.
I found this code, and wondering whether I should really implement something like this in my real project or not.
And things confusing me are
It will take more compile time, but I should not bother about
compile time at the cost of run time.
And what if N gets really a very big number then? Is there any file
size limit in source code?
or its something good to know, not to implement?
#include <iostream>
using namespace std;
template<int N>
class Factorial {
public:
static const int value = N * Factorial<N-1>::value;
};
template <>
class Factorial<1> {
public:
static const int value = 1;
};
int main() {
Factorial<5L> f;
cout << "5! = " << f.value << endl;
}
outPut:
5! = 120
slight modification in question, as i was playing with the code, found that
Factorial<12> f1; // works
Factorial<13> f2; // doesn't work
error:
undefined reference to `Factorial<13>::value'
is it like, can go up-to 12 depth not further?
The answer to 1 is that it depends, template meta programming essentially involves a trade-off between doing the calculation at compile-time with the benefit that it does not have to be done at run-time. In general this technique can lead to hard to read and maintain code. So the answer ultimately depends on your need for faster run-time performance over slower compiler times and possibly harder to maintain code.
The article Want speed? Use constexpr meta-programming! explains how in modern C++ you can use constexpr functions many times as a replacement for template meta programming. This in general leads to code that is more readable and perhaps faster. Compare this template meta programming method to the constexpr example:
constexpr int factorial( int n )
{
return ( n == 0 ? 1 : n*factorial(n-1) ) ;
}
which is more concise, readable and will be executed at compile time for arguments that are constant expressions although as the linked answer explains the standard does not actually say it must be but in practice current implementations definitely do.
It is also worth noting that since the result will quickly overflow value that another advantage of constexpr is that undefined behavior is not a valid constant expression and at least the current implementations of gcc and clang will turn undefined behavior within a constexpr into an error for most cases. For example:
constexpr int n = factorial(13) ;
for me generates the following error:
error: constexpr variable 'n' must be initialized by a constant expression
constexpr int n = factorial(13) ;
^ ~~~~~~~~~~~~~
note: value 6227020800 is outside the range of representable values of type 'int'
return ( n == 0 ? 1 : n*factorial(n-1) ) ;
^
This is also why you example:
Factorial<13> f2;
fails because a constant expression is required and gcc 4.9 gives a useful error:
error: overflow in constant expression [-fpermissive]
static const int value = N * Factorial<N-1>::value;
^
although older versions of gcc give you the less than helpful error you are seeing.
For question 2, compilers have a template recursion limit, which can usually be configured but eventually you will run out of system resources. For example the flag for gcc would be -ftemplate-depth=n:
Set the maximum instantiation depth for template classes to n. A limit
on the template instantiation depth is needed to detect endless
recursions during template class instantiation. ANSI/ISO C++
conforming programs must not rely on a maximum depth greater than 17
(changed to 1024 in C++11). The default value is 900, as the compiler
can run out of stack space before hitting 1024 in some situations.
As in your specific problem you will need to worry about signed integer overflow, which is undefined behavior before you have system resource issues.
The below code calculates Fibonacci numbers by an exponentially slow algorithm:
#include <cstdlib>
#include <iostream>
#define DEBUG(var) { std::cout << #var << ": " << (var) << std::endl; }
constexpr auto fib(const size_t n) -> long long
{
return n < 2 ? 1: fib(n - 1) + fib(n - 2);
}
int main(int argc, char *argv[])
{
const long long fib91 = fib(91);
DEBUG( fib91 );
DEBUG( fib(45) );
return EXIT_SUCCESS;
}
And I am calculating the 45th Fibonacci number at run-time, and the 91st one at compile time.
The interesting fact is that GCC 4.9 compiles the code and computes fib91 in a fraction of a second, but it takes a while to spit out fib(45).
My question: If GCC is smart enough to optimize fib(91) computation and not to take the exponentially slow path, what stops it to do the same for fib(45)?
Does the above mean GCC produces two compiled versions of fib function where one is fast and the other exponentially slow?
The question is not how the compiler optimizes fib(91) calculation (yes! It does use a sort of memoization), but if it knows how to optimize the fib function, why does it not do the same for fib(45)? And, are there two separate compilations of the fib function? One slow, and the other fast?
GCC is likely memoizing constexpr functions (enabling a Θ(n) computation of fib(n)). That is safe for the compiler to do because constexpr functions are purely functional.
Compare the Θ(n) "compiler algorithm" (using memoization) to your Θ(φn) run time algorithm (where φ is the golden ratio) and suddenly it makes perfect sense that the compiler is so much faster.
From the constexpr page on cppreference (emphasis added):
The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time.
The constexpr specifier does not declare that it is required to evaluate the value of the function or variable at compile time. So one can only guess what heuristics GCC is using to choose whether to evaluate at compile time or run time when a compile time computation is not required by language rules. It can choose either, on a case-by-case basis, and still be correct.
If you want to force the compiler to evaluate your constexpr function at compile time, here's a simple trick that will do it.
constexpr auto compute_fib(const size_t n) -> long long
{
return n < 2 ? n : compute_fib(n - 1) + compute_fib(n - 2);
}
template <std::size_t N>
struct fib
{
static_assert(N >= 0, "N must be nonnegative.");
static const long long value = compute_fib(N);
};
In the rest of your code you can then access fib<45>::value or fib<91>::value with the guarantee that they'll be evaluated at compile time.
At compile-time the compiler can memoize the result of the function. This is safe, because the function is a constexpr and hence will always return the same result of the same inputs.
At run-time it could in theory do the same. However most C++ programmers would frown at optimization passes that result in hidden memory allocations.
When you ask for fib(91) to give a value to your const fib91 in the source code, the compiler is forced to compute that value from you const expr. It does not compile the function (as you seem to think), just it sees that to compute fib91 it needs fib(90) and fib(89), to compute the it needs fib(87)... so on until he computes fib(1) which is given. This is an $O(n)$ algorithm and the result is computed fast enough.
However when you ask to evaluate fib(45) in runtime the compiler has to choose wether using the actual function call or precompute the result. Eventually it decides to use the compiled function. Now, the compiled function must execute exactly the exponential algorithm that you have decided there is no way the compiler could implement memoization to optimize a recursive function (think about the need to allocate some cache and to understand how many values to keep and how to manage them between function calls).
I'm using gcc 4.6.1 and am getting some interesting behavior involving calling a constexpr function. This program runs just fine and straight away prints out 12200160415121876738.
#include <iostream>
extern const unsigned long joe;
constexpr unsigned long fib(unsigned long int x)
{
return (x <= 1) ? 1 : (fib(x - 1) + fib(x - 2));
}
const unsigned long joe = fib(92);
int main()
{
::std::cout << "Here I am!\n";
::std::cout << joe << '\n';
return 0;
}
This program takes forever to run and I've never had the patience to wait for it to print out a value:
#include <iostream>
constexpr unsigned long fib(unsigned long int x)
{
return (x <= 1) ? 1 : (fib(x - 1) + fib(x - 2));
}
int main()
{
::std::cout << "Here I am!\n";
::std::cout << fib(92) << '\n';
return 0;
}
Why is there such a huge difference? Am I doing something wrong in the second program?
Edit: I'm compiling this with g++ -std=c++0x -O3 on a 64-bit platform.
joe is an Integral Constant Expression; it must be usable in array bounds. For that reason, a reasonable compiler will evaluate it at compile time.
In your second program, even though the compiler could calculate it at compile time, there's no reason why it must.
My best guess would be that program number one is had fib(92) evaluated at compile time, with lots of tables and stuff for the compiler to keep track of what values has already been evaluated... making running the program almost trivial,
Where as the second version is actually evaluated at run-time without lookup tables of evaluated constant expressions, meaning that the evaluating of fib(92) makes something like 2**92 recursive calls.
In other words the compiler does not optimize the fact that fib(92) is a constant expression.
There's wiggle room for the compiler to decide not to evaluate at compile time if it thinks something is "too complicated". That's in cases where it's not being absolutely forced to do the evaluation in order to generate a correct program that can actually be run (as #MSalters points out).
I thought perhaps the decision affecting compile-time laziness would be the recursion depth limit. (That's suggested in the spec as 512, but you can bump it up with the command line flag -fconstexpr-depth if you wanted it to.) But rather that would control it giving up in any cases...even when a compile time constant was necessary to run the program. So no effect on your case.
It seems if you want a guarantee in the code that it will do the optimization then you've found a technique for that. But if constexpr-depth can't help, I'm not sure if there are any relevant compiler flags otherwise...
I also wanted to see how gcc did optimize the code for this new constexpr keyword, and actually it's just because you are calling fib(92) as parameter of the ofstream::operator<<
::std::cout << fib(92) << '\n';
that it isn't evaluated at the compilation time, if you try calling it not as a parameter of another function (like you did in)
const unsigned long joe = fib(92);
it is evaluated at compile time, I did a blog post about this if you want more info, I don't know if this should be mentioned to gcc developers.