Do compilers reduce simple functions given constant arguments into unique instructions? - c++

This is something I've always thought to be true but have never had any validation. Consider a very simple function:
int subtractFive(int num) {
return num -5;
}
If a call to this function uses a compile time constant such as
subtractFive(5);
A compiler with optimizations turned on will very likely inline this. What is unclear to me however, is if the num - 5 will be evaluated at runtime or compile time. Will expression simplification extend recursively through inlined functions in this manner? Or does it not transcend functions?

We can simply look at the generated assembly to find out. This code:
int subtractFive(int num) {
return num -5;
}
int main(int argc, char *argv[]) {
return subtractFive(argc);
}
compiled with g++ -O2 yields
leal -5(%rdi), %eax
ret
So the function call was indeed reduced to a single instruction. This optimization technique is known as inlining.
One can of course use the same technique to see how far a compiler will go with that, e.g. the slightly more complicated
int subtractFive(int num) {
return num -5;
}
int foo(int i) {
return subtractFive(i) * 5;
}
int main(int argc, char *argv[]) {
return foo(argc);
}
still gets compiled to
leal -25(%rdi,%rdi,4), %eax
ret
so here both functions were eliminated at compile time. If the input to foo is known at compile time, the function call will (in this case) simply be replaced by the resulting constant at compile time.
The compiler can also combine this inlining with constant folding, to replace the function call with its fully evaluated result if all arguments are compile time constants. For example,
int subtractFive(int num) {
return num -5;
}
int foo(int i) {
return subtractFive(i) * 5;
}
int main() {
return foo(7);
}
compiles to
mov eax, 10
ret
which is equivalent to
int main () {
return 10;
}
A compiler will do this wherever it judges it worthwhile, and it is (usually) far better at optimizing code at this low level than you are.
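If you would rather not read assembly, the same arithmetic can be checked by the compiler itself. This is only a sketch: it assumes you are free to mark the two functions constexpr (the original code does not), and the static_assert message is made up for illustration.

```cpp
// Same functions as in the answer above, with constexpr added (an assumption,
// not part of the original code) so the result can be checked at compile time.
constexpr int subtractFive(int num) {
    return num - 5;
}

constexpr int foo(int i) {
    return subtractFive(i) * 5;
}

// (7 - 5) * 5 == 10, the same constant that shows up in `mov eax, 10`.
static_assert(foo(7) == 10, "foo(7) should fold to 10");
```

Note that the static_assert only proves the value is computable at compile time; whether an ordinary call such as return foo(7); is folded is still up to the optimizer.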

It's easy to do a little test; consider the following
int foo(int);
int bar(int x) { return x-5; }
int baz() { return foo(bar(5)); }
Compiling with g++ -O3 the asm output for function baz is
xorl %edi, %edi
jmp _Z3fooi
This code loads 0 into the first parameter register and then jumps into the code of foo. So the code from bar has completely disappeared, and the computation of the value to pass to foo has been done at compile time.
In addition returning the value of calling the function became just a jump to the function code (this is called "tail call optimization").

A smart compiler will evaluate this at compile time and will replace the subtractFive(5) call with its result, because the call can never produce a different result. None of the variables are volatile.

Related

Curious missed optimization of recursive constexpr function by Clang

Today I wanted to test how Clang would transform a recursive power-of-two function, and noticed that even with a known exponent the recursion is not optimized away, even when using constexpr.
#include <array>
constexpr unsigned int pow2_recursive(unsigned int exp) {
if(exp == 0) return 1;
return 2 * pow2_recursive(exp-1);
}
unsigned int pow2_5() {
return pow2_recursive(5);
}
pow2_5 is compiled as a call to pow2_recursive.
pow2_5(): # @pow2_5()
mov edi, 5
jmp pow2_recursive(unsigned int) # TAILCALL
However, when I use the result in a context that requires it to be known at compile time, it will correctly compute the result at compile time.
unsigned int pow2_5_arr() {
std::array<int, pow2_recursive(5)> a;
return a.size();
}
is compiled to
pow2_5_arr(): # @pow2_5_arr()
mov eax, 32
ret
Here is the link to the full example in Godbolt: https://godbolt.org/z/fcKef1
So, am I missing something here? Is there something that can change the result at runtime and a reason, that pow2_5 cannot be optimized in the same way as pow2_5_arr?

When is a constexpr evaluated at compile time?

What assurances do I have that a core constant expression (as in [expr.const].2) possibly containing constexpr function calls will actually be evaluated at compile time and on which conditions does this depend?
The introduction of constexpr implicitly promises runtime performance improvements by moving computations into the translation stage (compile time).
However, the standard does not (and presumably cannot) mandate what code a compiler produces. (See [expr.const] and [dcl.constexpr]).
These two points appear to be at odds with each other.
Under which circumstances can one rely on the compiler resolving a core constant expression (which might contain an arbitrarily complicated computation) at compile time rather than deferring it to runtime?
At least under -O0 gcc appears to actually emit code and call for a constexpr function. Under -O1 and up it doesn't.
Do we have to resort to trickery such as this, that forces the constexpr through the template system:
template <auto V>
struct compile_time_h { static constexpr auto value = V; };
template <auto V>
inline constexpr auto compile_time = compile_time_h<V>::value;
constexpr int f(int x) { return x; }
int main() {
for (int x = 0; x < compile_time<f(42)>; ++x) {}
}
When a constexpr function is called and the result is assigned to a constexpr variable, it will always be run at compile time.
Here's a minimal example:
// Compile with -std=c++14 or later
constexpr int fib(int n) {
int f0 = 0;
int f1 = 1;
for(int i = 0; i < n; i++) {
int hold = f0 + f1;
f0 = f1;
f1 = hold;
}
return f0;
}
int main() {
constexpr int blarg = fib(10);
return blarg;
}
When compiled at -O0, gcc outputs the following assembly for main:
main:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], 55
mov eax, 55
pop rbp
ret
Despite all optimization being turned off, there's never any call to fib in the main function itself.
This applies going all the way back to C++11; however, in C++11 the fib function would have to be rewritten to use recursion to avoid the use of mutable variables.
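For reference, here is roughly what a C++11-compatible version could look like. It is a sketch only: fib_step and fib11 are hypothetical names, and the helper simply carries the (f0, f1) pair from the loop above as parameters so that each function body is a single return statement.

```cpp
// C++11 constexpr rules: one return statement, no mutable locals.
// fib_step is a hypothetical helper that threads the loop state
// (f0, f1) through its parameters instead of mutating variables.
constexpr int fib_step(int n, int f0, int f1) {
    return n == 0 ? f0 : fib_step(n - 1, f1, f0 + f1);
}

constexpr int fib11(int n) {
    return fib_step(n, 0, 1);
}

// Same value as the iterative fib(10) above.
static_assert(fib11(10) == 55, "fib11(10) should be 55");
```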
Why does the compiler include the assembly for fib in the executable sometimes? A constexpr function can be used at runtime, and when invoked at runtime it will behave like a regular function.
Used properly, constexpr can provide some performance benefits in specific cases, but the push to make everything constexpr is more about writing code that the compiler can check for Undefined Behavior.
What's an example of constexpr providing performance benefits? When implementing a function like std::visit, you need to create a lookup table of function pointers. Creating the lookup table every time std::visit is called would be costly, and assigning the lookup table to a static local variable would still result in measurable overhead because the program has to check if that variable's been initialized every time the function is run.
Thankfully, you can make the lookup table constexpr, and the compiler will actually inline the lookup table into the assembly code for the function so that the contents of the lookup table is significantly more likely to be inside the instruction cache when std::visit is run.
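To make the lookup-table idea concrete, here is a small sketch. It is not how any particular standard library implements std::visit; the handler functions, kHandlers and dispatch are all hypothetical names chosen for illustration.

```cpp
#include <cstddef>

// Hypothetical handlers; a real visitor would generate one per
// alternative type instead of having them written by hand.
inline int op_increment(int x) { return x + 1; }
inline int op_double(int x)    { return x * 2; }
inline int op_negate(int x)    { return -x; }

using Handler = int (*)(int);

// The table is a compile-time constant: nothing is built at run time and
// there is no "has this been initialized yet?" check on each call.
constexpr Handler kHandlers[] = { op_increment, op_double, op_negate };
constexpr std::size_t kHandlerCount = sizeof(kHandlers) / sizeof(kHandlers[0]);

// Picks a handler by index, much like dispatching on a variant's index.
// The caller must pass index < kHandlerCount.
inline int dispatch(std::size_t index, int value) {
    return kHandlers[index](value);
}

static_assert(kHandlerCount == 3, "three handlers are registered");
```

Compare this with a function-local static table, where the compiler may have to emit a guard that is checked on every call.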
Does C++20 provide any mechanisms for guaranteeing that something runs at compiletime?
If a function is consteval, then the standard specifies that every call to the function must produce a compile-time constant.
This can be trivially used to force the compile-time evaluation of any constexpr function:
template<class T>
consteval T run_at_compiletime(T value) {
return value;
}
Anything given as a parameter to run_at_compiletime must be evaluated at compile-time:
constexpr int fib(int n) {
int f0 = 0;
int f1 = 1;
for(int i = 0; i < n; i++) {
int hold = f0 + f1;
f0 = f1;
f1 = hold;
}
return f0;
}
int main() {
// fib(10) will definitely run at compile time
return run_at_compiletime(fib(10));
}
Never; the C++ standard permits almost the entire compilation to occur at "runtime". Some diagnostics have to be done at compile time, but nothing prevents insanity on the part of the compiler.
Your binary could be a copy of the compiler with your source code appended, and C++ wouldn't say the compiler did anything wrong.
What you are looking at is a QoI - Quality of Implementation - issue.
In practice, constexpr variables tend to be compile time computed, and template parameters are always compile time computed.
consteval can also be used to mark functions that must be evaluated at compile time.

What happens when there's an error in constexpr function?

I learnt that constexpr functions are evaluated at compile time. But look at this example:
constexpr int fac(int n)
{
return (n>1) ? n*fac(n-1) : 1;
}
int main()
{
const int a = 500000;
cout << fac(a);
return 0;
}
Apparently this code should produce an error, but since constexpr functions are evaluated at compile time, why do I see no error when I compile and link?
Further on, I disassembled this code, and it turned out this function isn't evaluated but rather called as a normal function:
(gdb) x/10i $pc
=> 0x80007ca <main()>: sub $0x8,%rsp
0x80007ce <main()+4>: mov $0x7a11f,%edi
0x80007d3 <main()+9>: callq 0x8000823 <fac(int)>
0x80007d8 <main()+14>: imul $0x7a120,%eax,%esi
0x80007de <main()+20>: lea 0x20083b(%rip),%rdi # 0x8201020 <_ZSt4cout@@GLIBCXX_3.4>
0x80007e5 <main()+27>: callq 0x80006a0 <_ZNSolsEi@plt>
0x80007ea <main()+32>: mov $0x0,%eax
0x80007ef <main()+37>: add $0x8,%rsp
0x80007f3 <main()+41>: retq
However, if I call like fac(5):
constexpr int fac(int n)
{
return (n>1) ? n*fac(n-1) : 1;
}
int main()
{
const int a = 5;
cout << fac(a);
return 0;
}
The assembly code turned into:
(gdb) x/10i $pc
=> 0x80007ca <main()>: sub $0x8,%rsp
0x80007ce <main()+4>: mov $0x78,%esi
0x80007d3 <main()+9>: lea 0x200846(%rip),%rdi # 0x8201020 <_ZSt4cout@@GLIBCXX_3.4>
0x80007da <main()+16>: callq 0x80006a0 <_ZNSolsEi@plt>
0x80007df <main()+21>: mov $0x0,%eax
0x80007e4 <main()+26>: add $0x8,%rsp
0x80007e8 <main()+30>: retq
The fac function is evaluated at compile time.
Can Anyone explain this?
Compiling command:
g++ -Wall test.cpp -g -O1 -o test
And with g++ version 7.4.0, gdb version 8.1.0
I learnt that constexpr functions are evaluated at compile time
No, constexpr can be evaluated at compile time, but also at runtime.
further reading:
Difference between `constexpr` and `const`
https://en.cppreference.com/w/cpp/language/constexpr
Purpose of constexpr
Apparently this code would throw an error
No, no errors are thrown. For large input, the result will overflow, which is undefined behavior. This doesn't mean an error will be thrown or displayed. It means anything can happen. And when I say anything, I do mean anything. The program can crash, hang, appear to work with strange results, display weird characters, or literally anything.
further reading:
https://en.cppreference.com/w/cpp/language/ub
https://en.wikipedia.org/wiki/Undefined_behavior
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
And, as pointed out by nathanoliver
when invoked in a constant expression, a constexpr function must check and error out on UB http://coliru.stacked-crooked.com/a/43ccf2039dc511d5
In other words there can't be any UB at compile time. What at runtime would be UB, at compile time it's a hard error.
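A small sketch of that difference, using the fac function from the question. It assumes a 32-bit int, so that 13! no longer fits; the variable names ok and bad are made up for illustration.

```cpp
constexpr int fac(int n) {
    return (n > 1) ? n * fac(n - 1) : 1;
}

// Forced compile-time evaluation with a result that fits in an int.
constexpr int ok = fac(5);
static_assert(ok == 120, "5! is 120");

// Assuming a 32-bit int, 13! overflows. In a constant expression that is
// not silent UB: the compiler is expected to reject the line below.
// constexpr int bad = fac(13);
```

Without the constexpr on the variable, the same fac(13) call would compile and simply be undefined behavior at run time.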
constexpr means that the function can be evaluated at compile time, not that it will be. The compiler is forced to do the evaluation at compile time only if you use the result where a compile-time constant is required (e.g. the size of an array).
On the other hand, for small values g++, for example, is smart enough to compute the result at compile time (even without constexpr).
For example with:
int fact(int n) {
return n < 2 ? 1 : n*fact(n-1);
}
int bar() {
return fact(5);
}
the code generated by g++ -O3 for bar is:
bar():
mov eax, 120
ret
Note that overflowing the call stack (e.g. infinite or excessive recursion) or even overflowing signed integer arithmetic is undefined behavior in C++, and anything can happen. It doesn't mean you'll get a nice "error" or even a segfault, but that ANYTHING can happen (including, unfortunately, nothing evident). Basically it means that compiler authors can simply ignore those cases, because you're not supposed to make this kind of mistake.

How to test if constexpr is evaluated correctly

I have used constexpr to calculate hash codes at compile time. The code compiles and runs correctly, but I don't know whether the hash values are computed at compile time or at run time. If I trace the code at run time, I don't step into the constexpr functions. But those are not traced even for run-time values (calculating the hash of a string generated at run time, with the same methods).
I have tried to look at the disassembly, but I don't quite understand it.
For debug purposes, my hash code is only string length, using this:
constexpr inline size_t StringLengthCExpr(const char * const str) noexcept
{
return (*str == 0) ? 0 : StringLengthCExpr(str + 1) + 1;
};
I have ID class created like this
class StringID
{
public:
constexpr StringID(const char * key);
private:
const unsigned int hashID;
};
constexpr inline StringID::StringID(const char * key)
: hashID(StringLengthCExpr(key))
{
}
If I do this in program main method
StringID id("hello world");
I got this disassembled code (part of it - there is a lot of more from inlined methods and other stuff in main)
;;; StringID id("hello world");
lea eax, DWORD PTR [-76+ebp]
lea edx, DWORD PTR [id.14876.0]
mov edi, eax
mov esi, edx
mov ecx, 4
mov eax, ecx
shr ecx, 2
rep movsd
mov ecx, eax
and ecx, 3
rep movsb
// another code
How can I tell from this that the "hash value" is computed at compile time? I don't see any constant like 11 being moved to a register. I am not very good with ASM, so maybe it is correct, but I am not sure what to check or how to be sure from this code that the "hash code" values are computed at compile time and not at run time.
(I am using Visual Studio 2013 + Intel C++ 15 Compiler - VS Compiler is not supporting constexpr)
Edit:
If I change my code and do this
const int ix = StringLengthCExpr("hello world");
mov DWORD PTR [-24+ebp], 11 ;55.15
I have got the correct result
Even with this
change private hashID to public
StringID id("hello world");
// mov DWORD PTR [-24+ebp], 11 ;55.15
printf("%i", id.hashID);
// some other ASM code
But If I use private hashID and add Getter
inline uint32 GetHashID() const { return this->hashID; };
to ID class, then I got
StringID id("hello world");
//see original "wrong" ASM code
printf("%i", id.GetHashID());
// some other ASM code
The most convenient way is to use your constexpr in a static_assert statement. The code will not compile when it is not evaluated during compile time and the static_assert expression will give you no overhead during runtime (and no unnecessary generated code like with a template solution).
Example:
static_assert(_StringLength("meow") == 4, "The length should be 4!");
This also checks whether your function is computing the result correctly or not.
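Applied to the length function from the question, that could look like the sketch below. The assertion messages are made up, and the stray semicolon after the function body has been dropped.

```cpp
#include <cstddef>

// The debug "hash" from the question: just the string length.
constexpr std::size_t StringLengthCExpr(const char* str) noexcept {
    return (*str == 0) ? 0 : StringLengthCExpr(str + 1) + 1;
}

// Both checks are evaluated by the compiler; if either call could not be
// evaluated at compile time, or gave another value, the build would fail.
static_assert(StringLengthCExpr("hello world") == 11, "length should be 11");
static_assert(StringLengthCExpr("") == 0, "empty string has length 0");
```

Keep in mind that this only verifies the calls inside the static_assert; it says nothing about how other call sites in the program are compiled.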
If you want to ensure that a constexpr function is evaluated at compile time, use its result in something which requires compile-time evaluation:
template <size_t N>
struct ForceCompileTimeEvaluation { static constexpr size_t value = N; };
constexpr inline StringID::StringID(const char * key)
: hashID(ForceCompileTimeEvaluation<StringLength(key)>::value)
{}
Notice that I've renamed the function to just StringLength. Names which start with an underscore followed by an uppercase letter, or which contain two consecutive underscores, are not legal in user code. They're reserved for the implementation (compiler & standard library).
In the future (C++20) you can use the consteval specifier to declare a function that must be evaluated at compile time, thus requiring a constant expression context.
The consteval specifier declares a function or function template to be
an immediate function, that is, every call to the function must
(directly or indirectly) produce a compile time constant expression.
An example from cppreference (see consteval):
consteval int sqr(int n) {
return n*n;
}
constexpr int r = sqr(100); // OK
int x = 100;
int r2 = sqr(x); // Error: Call does not produce a constant
consteval int sqrsqr(int n) {
return sqr(sqr(n)); // Not a constant expression at this point, but OK
}
constexpr int dblsqr(int n) {
return 2*sqr(n); // Error: Enclosing function is not consteval and sqr(n) is not a constant
}
There are a few ways to force compile-time evaluation. But these aren't as flexible and easy to set up as what you'd expect when using constexpr. And they don't help you find out whether the compile-time constants are actually being used.
What you want is for constexpr to work where you expect it to be beneficial, so you try to meet its requirements. But then you need to test whether the code you expect to be generated at compile time has actually been generated, and whether the users consume the generated result or trigger the function at runtime.
I've found two ways to detect if a class or (member)function is using the compile-time or runtime evaluated path.
Using the property that the noexcept operator (bool noexcept( expression )) returns true for a call to a constexpr function that is evaluated at compile time, since the result is then a compile-time constant. This method is quite accessible and usable with unit testing.
(Be aware that marking these functions explicitly noexcept will break the test.)
Source: cppreference.com (2017/3/3)
Because the noexcept operator always returns true for a constant expression, it can be used to check if a particular invocation of a constexpr function takes the constant expression branch (...)
(less convenient) Using a debugger: By putting a break-point inside the function marked constexpr. Whenever the break-point isn't triggered, the compiler evaluated result was used. Not the easiest, but possible for incidental checking.
Source: Microsoft documentation (2017/3/3)
Note: In the Visual Studio debugger, you can tell whether a constexpr function is being evaluated at compile time by putting a breakpoint inside it. If the breakpoint is hit, the function was called at run-time. If not, then the function was called at compile time.
I've found both these methods to be useful while experimenting with constexpr. Although I haven't done any testing with environments outside VS2017. And haven't been able to find an explicit statement supporting this behaviour in the current draft of the standard.
The following trick can help to check if the constexpr function has been evaluated during compile time only:
With gcc you can compile the source file with assembly listing + c sources; given that both the constexpr and its calls are in source file try.cpp
gcc -std=c++11 -O2 -Wa,-a,-ad try.cpp | c++filt >try.lst
If the constexpr function has been evaluated at run time, you will see the compiled function and a call instruction (call function_name on x86) in the assembly listing try.lst (note that the c++filt command has demangled the linker names).
Interestingly, I always see a call when compiling without optimization (without the -O2 or -O3 option).
Simply put it in constexpr variable.
constexpr StringID id("hello world");
constexpr int ix = StringLengthCExpr("hello world");
A constexpr variable is always a real constant expression. If it compiles, it is computed at compile time.
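Here is a self-contained sketch of that approach for the StringID class from the question. The constexpr getter and the static_cast are additions made for illustration, not part of the original class.

```cpp
#include <cstddef>

constexpr std::size_t StringLengthCExpr(const char* str) noexcept {
    return (*str == 0) ? 0 : StringLengthCExpr(str + 1) + 1;
}

class StringID {
public:
    constexpr StringID(const char* key)
        : hashID(static_cast<unsigned int>(StringLengthCExpr(key))) {}

    // A constexpr getter (an addition for this sketch) keeps hashID private
    // while still allowing it to be read in constant expressions.
    constexpr unsigned int GetHashID() const { return hashID; }

private:
    unsigned int hashID;
};

// Declaring the object constexpr requires the constructor call to be a
// constant expression, so the hash cannot be left to run time.
constexpr StringID id("hello world");
static_assert(id.GetHashID() == 11, "hash of \"hello world\" should be 11");
```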

Confused about the function return value

#include<iostream>
using namespace std;
int Fun(int x)
{
int sum=1;
if(x>1)
sum=x*Fun(x-1);
else
return sum;
}
int main()
{
cout<<Fun(1)<<endl;
cout<<Fun(2)<<endl;
cout<<Fun(3)<<endl;
cout<<Fun(4)<<endl;
cout<<Fun(5)<<endl;
}
This function is meant to compute the factorial of an integer. In the x>1 branch there is no return statement, so the function should not return the correct answer.
But when Fun(4) or other examples are tested, the right answers come out unexpectedly. Why?
The assembly code of this function (for the call Fun(4)) is:
0x004017E5 push %ebp
0x004017E6 mov %esp,%ebp
0x004017E8 sub $0x28,%esp
0x004017EB movl $0x1,-0xc(%ebp)
0x004017F2 cmpl $0x1,0x8(%ebp)
0x004017F6 jle 0x40180d <Fun(int)+40>
0x004017F8 mov 0x8(%ebp),%eax
0x004017FB dec %eax
0x004017FC mov %eax,(%esp)
0x004017FF call 0x4017e5 <Fun(int)>
0x00401804 imul 0x8(%ebp),%eax
0x00401808 mov %eax,-0xc(%ebp)
0x0040180B jmp 0x401810 <Fun(int)+43>
0x0040180D mov -0xc(%ebp),%eax
0x00401810 leave
0x00401811 ret
Maybe this is the reason: the value of sum is saved in register eax, and the return value is saved in eax too, so Fun returns the correct result.
Usually, the EAX register is used to store the return value, and it is also used for other things.
So whatever has been loaded to that register just before the function returns will be the return value, even if you don't intend to do so.
You can use the -S option to generate assembly code and see what happened to EAX right before the "ret" instruction.
When your program takes the if branch, no return statement ends the function. The number you got is the result of undefined behavior.
int Fun(int x)
{
int sum=1.0;
if(x>1)
sum=x*Fun(x-1);
else
return sum;
return sum; // return the computed value on this path too
}
Just remove else from your code:
int Fun(int x)
{
int sum=1;
if(x>1)
sum=x*Fun(x-1);
return sum;
}
The code you have has a couple of errors:
you have an int being assigned the value 1.0 (which will be implicitly cast/converted), not an error as such but inelegant.
you have a return statement only in the else branch of a conditional, so the function returns a value only when the if condition is false
If you fix the issue with the return by removing the else, then all will be fine :)
As to why it works with 4 as an input, that is down to chance or some property of your environment: for every x > 1 the function flows off the end without executing a return statement, so the code as posted cannot be relied on to work.
As an aside, here is a more concise/terse function: for so straightforward a function you might consider the ternary operator and use a function like:
int factorial(int x){ return (x>1) ? (x * factorial(x-1)) : 1;}
this is the function I use for my factorials and have had on library for the last 30 or so years (since my C days) :)
From the C++ standard:
Flowing off the end of a function is equivalent to a return with no
value; this results in undefined behavior in a value-returning
function.
Your situation is the same as this one:
int fun1(int x)
{
int sum = 1;
if(x > 1)
sum++;
else
return sum;
}
int main()
{
int b = fun1(3);
printf("%d\n", b);
return 0;
}
It prints 2 on my machine.
This is calling convention and architecture dependent. The return value is the result of last expression evaluation, stored in the eax register.
As stated in the comment this is undefined behaviour. With g++ I get the following warning.
warning: control reaches end of non-void function [-Wreturn-type]
On Visual C++, the warning is promoted to an error by default
error C4716: 'Fun' : must return a value
When I disabled the warning and ran the resulting executable, Fun(4) gave me 1861810763.
So why might it work under g++? During compilation conditional statements are turned into tests and jumps (or gotos). The function has to return something, and the simplest possible code for the compiler to produce is along the following lines.
int Fun(int x)
{
int sum=1.0;
if(!(x>1))
goto RETURN;
sum=x*Fun(x-1);
RETURN:
return sum;
}
This is consistent with your disassembly.
Of course you can't rely on undefined behaviour, as illustrated by the behaviour in Visual C++. Many shops have a policy to treat warnings as errors for this reason (also as suggested in a comment).