How to test if constexpr is evaluated correctly - c++

I have used constexpr to calculate hash codes in compile times. Code compiles correctly, runs correctly. But I dont know, if hash values are compile time or run time. If I trace code in runtime, I dont do into constexpr functions. But, those are not traced even for runtime values (calculate hash for runtime generated string - same methods).
I have tried to look into dissassembly, but I quite dont understand it
For debug purposes, my hash code is only string length, using this:
constexpr inline size_t StringLengthCExpr(const char * const str) noexcept
{
return (*str == 0) ? 0 : StringLengthCExpr(str + 1) + 1;
};
I have ID class created like this
class StringID
{
public:
constexpr StringID(const char * key);
private:
const unsigned int hashID;
}
constexpr inline StringID::StringID(const char * key)
: hashID(StringLengthCExpr(key))
{
}
If I do this in program main method
StringID id("hello world");
I got this disassembled code (part of it - there is a lot of more from inlined methods and other stuff in main)
;;; StringID id("hello world");
lea eax, DWORD PTR [-76+ebp]
lea edx, DWORD PTR [id.14876.0]
mov edi, eax
mov esi, edx
mov ecx, 4
mov eax, ecx
shr ecx, 2
rep movsd
mov ecx, eax
and ecx, 3
rep movsb
// another code
How can I tell from this, that "hash value" is a compile time. I donĀ“t see any constant like 11 moved to register. I am not quite good with ASM, so maybe it is correct, but I am not sure what to check or how to be sure, that "hash code" values are compile time and not computed in runtime from this code.
(I am using Visual Studio 2013 + Intel C++ 15 Compiler - VS Compiler is not supporting constexpr)
Edit:
If I change my code and do this
const int ix = StringLengthCExpr("hello world");
mov DWORD PTR [-24+ebp], 11 ;55.15
I have got the correct result
Even with this
change private hashID to public
StringID id("hello world");
// mov DWORD PTR [-24+ebp], 11 ;55.15
printf("%i", id.hashID);
// some other ASM code
But If I use private hashID and add Getter
inline uint32 GetHashID() const { return this->hashID; };
to ID class, then I got
StringID id("hello world");
//see original "wrong" ASM code
printf("%i", id.GetHashID());
// some other ASM code

The most convenient way is to use your constexpr in a static_assert statement. The code will not compile when it is not evaluated during compile time and the static_assert expression will give you no overhead during runtime (and no unnecessary generated code like with a template solution).
Example:
static_assert(_StringLength("meow") == 4, "The length should be 4!");
This also checks whether your function is computing the result correctly or not.

If you want to ensure that a constexpr function is evaluated at compile time, use its result in something which requires compile-time evaluation:
template <size_t N>
struct ForceCompileTimeEvaluation { static constexpr size_t value = N; };
constexpr inline StringID::StringID(const char * key)
: hashID(ForceCompileTimeEvaluation<StringLength(key)>::value)
{}
Notice that I've renamed the function to just StringLength. Name which start with an underscore followed by an uppercase letter, or which contain two consecutive underscores, are not legal in user code. They're reserved for the implementation (compiler & standard library).

In the future(c++20) you can use the consteval specifier to declare a function, which must be evaluated at compile time, thus requiring a constant expression context.
The consteval specifier declares a function or function template to be
an immediate function, that is, every call to the function must
(directly or indirectly) produce a compile time constant expression.
An example from cppreference(see consteval):
consteval int sqr(int n) {
return n*n;
}
constexpr int r = sqr(100); // OK
int x = 100;
int r2 = sqr(x); // Error: Call does not produce a constant
consteval int sqrsqr(int n) {
return sqr(sqr(n)); // Not a constant expression at this point, but OK
}
constexpr int dblsqr(int n) {
return 2*sqr(n); // Error: Enclosing function is not consteval and sqr(n) is not a constant
}

There are a few ways to force compile-time evaluation. But these aren't as flexible and easy to setup as what you'd expect when using constexpr. And they don't help you in finding if the compile-time constants are actually been used.
What you'd want for constexpr is to work where you expect it to beneficial. Therefor you try to meet its requirements. But Then you need to test if the code you expect to be generated at compile-time has been generated, and if the users actually consume the generated result or trigger the function at runtime.
I've found two ways to detect if a class or (member)function is using the compile-time or runtime evaluated path.
Using the property of constexpr functions returning true from the noexcept operator (bool noexcept( expression )) if evaluated at compile-time. Since the generated result will be a compile-time constant. This method is quite accessible and usable with Unit-testing.
(Be aware that marking these functions explicitly noexcept will break the test.)
Source: cppreference.com (2017/3/3)
Because the noexcept operator always returns true for a constant expression, it can be used to check if a particular invocation of a constexpr function takes the constant expression branch (...)
(less convenient) Using a debugger: By putting a break-point inside the function marked constexpr. Whenever the break-point isn't triggered, the compiler evaluated result was used. Not the easiest, but possible for incidental checking.
Soure: Microsoft documentation (2017/3/3)
Note: In the Visual Studio debugger, you can tell whether a constexpr function is being evaluated at compile time by putting a breakpoint inside it. If the breakpoint is hit, the function was called at run-time. If not, then the function was called at compile time.
I've found both these methods to be useful while experimenting with constexpr. Although I haven't done any testing with environments outside VS2017. And haven't been able to find an explicit statement supporting this behaviour in the current draft of the standard.

The following trick can help to check if the constexpr function has been evaluated during compile time only:
With gcc you can compile the source file with assembly listing + c sources; given that both the constexpr and its calls are in source file try.cpp
gcc -std=c++11 -O2 -Wa,-a,-ad try.cpp | c++filt >try.lst
If the constexpr function has been evaluated during run time time then you will see the compiled function and a call instruction (call function_name on x86) in the assembly listing try.lst (note that c++filt command has undecorated the linker names)
Interesting that it I always see a call if compiled without optimization (without -O2 or -O3 option).

Simply put it in constexpr variable.
constexpr StringID id("hello world");
constexpr int ix = StringLengthCExpr("hello world");
A constexpr variable is always a real constant expression. If it compiles, it is computed on compile time.

Related

When is a constexpr evaluated at compile time?

What assurances do I have that a core constant expression (as in [expr.const].2) possibly containing constexpr function calls will actually be evaluated at compile time and on which conditions does this depend?
The introduction of constexpr implicitly promises runtime performance improvements by moving computations into the translation stage (compile time).
However, the standard does not (and presumably cannot) mandate what code a compiler produces. (See [expr.const] and [dcl.constexpr]).
These two points appear to be at odds with each other.
Under which circumstances can one rely on the compiler resolving a core constant expression (which might contain an arbitrarily complicated computation) at compile time rather than deferring it to runtime?
At least under -O0 gcc appears to actually emit code and call for a constexpr function. Under -O1 and up it doesn't.
Do we have to resort to trickery such as this, that forces the constexpr through the template system:
template <auto V>
struct compile_time_h { static constexpr auto value = V; };
template <auto V>
inline constexpr auto compile_time = compile_time_h<V>::value;
constexpr int f(int x) { return x; }
int main() {
for (int x = 0; x < compile_time<f(42)>; ++x) {}
}
When a constexpr function is called and the output is assigned to a constexpr variable, it will always be run at compiletime.
Here's a minimal example:
// Compile with -std=c++14 or later
constexpr int fib(int n) {
int f0 = 0;
int f1 = 1;
for(int i = 0; i < n; i++) {
int hold = f0 + f1;
f0 = f1;
f1 = hold;
}
return f0;
}
int main() {
constexpr int blarg = fib(10);
return blarg;
}
When compiled at -O0, gcc outputs the following assembly for main:
main:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], 55
mov eax, 55
pop rbp
ret
Despite all optimization being turned off, there's never any call to fib in the main function itself.
This applies going all the way back to C++11, however in C++11 the fib function would have to be re-written to use conversion to avoid the use of mutable variables.
Why does the compiler include the assembly for fib in the executable sometimes? A constexpr function can be used at runtime, and when invoked at runtime it will behave like a regular function.
Used properly, constexpr can provide some performance benefits in specific cases, but the push to make everything constexpr is more about writing code that the compiler can check for Undefined Behavior.
What's an example of constexpr providing performance benefits? When implementing a function like std::visit, you need to create a lookup table of function pointers. Creating the lookup table every time std::visit is called would be costly, and assigning the lookup table to a static local variable would still result in measurable overhead because the program has to check if that variable's been initialized every time the function is run.
Thankfully, you can make the lookup table constexpr, and the compiler will actually inline the lookup table into the assembly code for the function so that the contents of the lookup table is significantly more likely to be inside the instruction cache when std::visit is run.
Does C++20 provide any mechanisms for guaranteeing that something runs at compiletime?
If a function is consteval, then the standard specifies that every call to the function must produce a compile-time constant.
This can be trivially used to force the compile-time evaluation of any constexpr function:
template<class T>
consteval T run_at_compiletime(T value) {
return value;
}
Anything given as a parameter to run_at_compiletime must be evaluated at compile-time:
constexpr int fib(int n) {
int f0 = 0;
int f1 = 1;
for(int i = 0; i < n; i++) {
int hold = f0 + f1;
f0 = f1;
f1 = hold;
}
return f0;
}
int main() {
// fib(10) will definitely run at compile time
return run_at_compiletime(fib(10));
}
Never; the C++ standard permits almost the entire compilation to occur at "runtime". Some diagnostics have to be done at compile time, but nothing prevents insanity on the part of the compiler.
Your binary could be a copy of the compiler with your source code appended, and C++ wouldn't say the compiler did anything wrong.
What you are looking at is a QoI - Quality of Implrmentation - issue.
In practice, constexpr variables tend to be compile time computed, and template parameters are always compile time computed.
consteval can also be used to markup functions.

What happens when there's an error in constexpr function?

I learnt that constexpr functions are evaluated at compile time. But look at this example:
constexpr int fac(int n)
{
return (n>1) ? n*fac(n-1) : 1;
}
int main()
{
const int a = 500000;
cout << fac(a);
return 0;
}
Apparently this code would throw an error, but since constexpr functions are evaluated at compiling time, why I see no error when compile and link?
Further on, I disassembled this code, and it turned out this function isn't evaluated but rather called as a normal function:
(gdb) x/10i $pc
=> 0x80007ca <main()>: sub $0x8,%rsp
0x80007ce <main()+4>: mov $0x7a11f,%edi
0x80007d3 <main()+9>: callq 0x8000823 <fac(int)>
0x80007d8 <main()+14>: imul $0x7a120,%eax,%esi
0x80007de <main()+20>: lea 0x20083b(%rip),%rdi # 0x8201020 <_ZSt4cout##GLIBCXX_3.4>
0x80007e5 <main()+27>: callq 0x80006a0 <_ZNSolsEi#plt>
0x80007ea <main()+32>: mov $0x0,%eax
0x80007ef <main()+37>: add $0x8,%rsp
0x80007f3 <main()+41>: retq
However, if I call like fac(5):
constexpr int fac(int n)
{
return (n>1) ? n*fac(n-1) : 1;
}
int main()
{
const int a = 5;
cout << fac(a);
return 0;
}
The assemble code turned into:
(gdb) x/10i $pc
=> 0x80007ca <main()>: sub $0x8,%rsp
0x80007ce <main()+4>: mov $0x78,%esi
0x80007d3 <main()+9>: lea 0x200846(%rip),%rdi # 0x8201020 <_ZSt4cout##GLIBCXX_3.4>
0x80007da <main()+16>: callq 0x80006a0 <_ZNSolsEi#plt>
0x80007df <main()+21>: mov $0x0,%eax
0x80007e4 <main()+26>: add $0x8,%rsp
0x80007e8 <main()+30>: retq
The fac function is evaluated at compile time.
Can Anyone explain this?
Compiling command:
g++ -Wall test.cpp -g -O1 -o test
And with g++ version 7.4.0, gdb version 8.1.0
I learnt that constexpr functions are evaluated at compile time
No, constexpr can be evaluated at compile time, but also at runtime.
further reading:
Difference between `constexpr` and `const`
https://en.cppreference.com/w/cpp/language/constexpr
Purpose of constexpr
Apparently this code would throw an error
No, no errors thrown. For large input, the result will overflow which is undefined behavior. This doesn't mean an error will be thrown or displayed. It means anything can happen. And when I say anything, I do mean anything. The program can crash, hang, appear to work with a strange results, display weird characters, or literally anything.
further reading:
https://en.cppreference.com/w/cpp/language/ub
https://en.wikipedia.org/wiki/Undefined_behavior
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
And, as pointed out by nathanoliver
when invoked in a constant expression, a constexpr function must check and error out on UB http://coliru.stacked-crooked.com/a/43ccf2039dc511d5
In other words there can't be any UB at compile time. What at runtime would be UB, at compile time it's a hard error.
constexpr it means that it can be evaluated at compile time, not that it will be evaluated at compile time. The compiler will be forced to do the evaluation compile-time if you use it where a compile time constant is expected (e.g. the size of an array).
On the other hand for small values g++ is for example smart enough to compute the result compile time (even without constexpr).
For example with:
int fact(int n) {
return n < 2 ? 1 : n*fact(n-1);
}
int bar() {
return fact(5);
}
the code generated by g++ -O3 for bar is:
bar():
mov eax, 120
ret
Note that overflowing the call stack (e.g. infinite or excessive recursion) or even overflowing signed integer arithmetic is in C++ undefined behavior and anything can happen. It doesn't mean you'll get a nice "error" or even a segfault... but that ANYTHING can happen (including, unfortunately, nothing evident). Basically it means that the authors of compilers can just ignore to handle those cases because you're not supposed to do this kind of mistakes.

const vs non-const variable with no change in value once assign

In C++, if value of a variable never gets changed once assigned in whole program VS If making that variable as const , In which case executable code is faster?
How compiler optimize executable code in case 1?
A clever compiler can understand that the value of a variable is never changed, thus optimizing the related code, even without the explicit const keyword by the programmer.
As for your second, question, when you mark a variable as const, then the follow might happen: the "compiler can optimize away this const by not providing storage to this variable rather add it in symbol table. So, subsequent read just need indirection into the symbol table rather than instructions to fetch value from memory". Read more in What kind of optimization does const offer in C/C++? (if any).
I said might, because const does not mean that this is a constant expression for sure, which can be done by using constexpr instead, as I explain bellow.
In general, you should think about safer code, rather than faster code when it comes to using the const keyword. So unless, you do it for safer and more readable code, then you are likely a victim of premature optimization.
Bonus:
C++ offers the constexpr keyword, which allows the programmer to mark a variable as what the Standard calls constant expressions. A constant expression is more than merely constant.
Read more in Difference between `constexpr` and `const` and When should you use constexpr capability in C++11?
PS: Constness prevents moving, so using const too liberally may turn your code to execute slower.
In which case executable code is faster?
The code is faster in case on using const, because compiler has more room for optimization. Consider this snippet:
int c = 5;
[...]
int x = c + 5;
If c is constant, it will simply assign 10 to x. If c is not a constant, it depend on compiler if it will be able to deduct from the code that c is de-facto constant.
How compiler optimize executable code in case 1?
Compiler has harder time to optimize the code in case the variable is not constant. The broader the scope of the variable, the harder for the compiler to make sure the variable is not changing.
For simple cases, like a local variables, the compiler with basic optimizations will be able to deduct that the variable is a constant. So it will treat it like a constant.
if (...) {
int c = 5;
[...]
int x = c + 5;
}
For broader scopes, like global variables, external variables etc., if the compiler is not able to analyze the whole scope, it will treat it like a normal variable, i.e. allocate some space, generate load and store operations etc.
file1.c
int c = 5;
file2.c
extern int c;
[...]
int x = c + 5;
There are more aggressive optimization options, like link time optimizations, which might help in such cases. But still, performance-wise, the const keyword helps, especially for variables with wide scopes.
EDIT:
Simple example
File const.C:
const int c = 5;
volatile int x;
int main(int argc, char **argv)
{
x = c + 5;
}
Compilation:
$ g++ const.C -O3 -g
Disassembly:
5 {
6 x = c + 5;
0x00000000004003e0 <+0>: movl $0xa,0x200c4a(%rip) # 0x601034 <x>
7 }
So we just move 10 (0xa) to x.
File nonconst.C:
int c = 5;
volatile int x;
int main(int argc, char **argv)
{
x = c + 5;
}
Compilation:
$ g++ nonconst.C -O3 -g
Disassembly:
5 {
6 x = c + 5;
0x00000000004003e0 <+0>: mov 0x200c4a(%rip),%eax # 0x601030 <c>
0x00000000004003e6 <+6>: add $0x5,%eax
0x00000000004003e9 <+9>: mov %eax,0x200c49(%rip) # 0x601038 <x>
7 }
We load c, add 5 and store to x.
So as you can see even with quite aggressive optimization (-O3) and the shortest program you can write, the effect of const is quite obvious.
g++ version 5.4.1

Constexpr Factorial Compilation Results in VS2015 and GCC 5.4.0

Wondering if the following surprises anyone, as it did me? Alex Allain's article here on using constexpr shows the following factorial example:
constexpr factorial (int n)
{
return n > 0 ? n * factorial( n - 1 ) : 1;
}
And states:
Now you can use factorial(2) and when the compiler sees it, it can
optimize away the call and make the calculation entirely at compile
time.
I tried this in VS2015 in Release mode with full optimizations on (/Ox) and stepped through the code in the debugger viewing the assembly and saw that the factorial calculation was not done at compilation.
Using GCC v5.4.0 with --std=C++14, I must use /O2 or /O3 before the calculation is performed at compile time. I was surprised thought that using just /O the calculation did not occur at compilation time.
Main main question is: Why is VS2015 not performing this calculation at compilation time?
It depends on the context of the function call.
For example, the following obviously could never be calculated at compile time:
int x;
std::cin >> x;
std::cout << factorial(x);
On the other hand, this context would require the answer at compile time:
class Foo {
int x[factorial(4)];
};
constexpr functions are only guaranteed to be evaluated at compile time if they are called from a constexpr context; otherwise it is up to the compiler to choose whether or not to eval at compile time (assuming such an optimization is possible, again, depending on the context).
You have to use it in const expression, as:
constexpr auto res = factorial(2);
else computation can be done at runtime.
constexpr is neither necessary nor sufficient to compile time evaluation of a function.
It's not sufficient, even aside from the fact that the arguments obviously also have to be constant expressions. Even if that is true, a conforming compiler does not have to evaluate it at compile time. It only has to be evaluated at compile time if it is in a constexpr context. Such as, assigning the result of the computation to a constexpr variable, or using the value as an array size, or as a non-type template parameter.
The other point, is that the compiler is completely capable of evaluating things at compile time, even without constexpr. There is a lot of confusion about this, and it's not clear why. compile time evaluation of constexpr functions fundamentally just boils down to constant propagation, and compilers have been doing this optimization since forever: https://godbolt.org/g/Sy214U.
int factorial(int n) {
if (n <= 1) return 1;
return n * factorial(n-1);
}
int foo() { return factorial(5); }
On gcc 6.3 with O3 (and 14) yields:
foo():
mov eax, 120
ret
In essence, outside of the specific case where you absolutely force compile time evaluation by assigning a constexpr function to another constexpr variable, compile time evaluation has more to do with the quality of your optimizer than the standard.

c++ template instantiation

I have a template class like below.
template<int S> class A
{
private:
char string[S];
public:
A()
{
for(int i =0; i<S; i++)
{
.
.
}
}
int MaxLength()
{
return S;
}
};
If i instantiate the above class with different values of S, will the compiler create different instances of A() and MaxLenth() function? Or will it create one instance and pass the S as some sort of argument?
How will it behave if i move the definition of A and Maxlength to a different cpp file.
The template will be instantiated for each different values of S.
If you move the method implementations to a different file, you'll need to #include that file. (Boost for instance uses the .ipp convention for such source files that need to be #included).
If you want to minimise the amount of code that is generated with the template instantiation (and hence needs to be made available in the .ipp file) you should try to factor it out by removing the dependency on S. So for example you could derive from a (private) base class which provides member functions with S as a parameter.
Actually this is fully up to the compiler. It's only required to generate correct code for its inputs. In order to do so it must follow the C++ standard as that explains what is correct. In this case it says that the compiler must at one step in the process instantiate templates with different arguments as different types, these types may later be represented by the same code, or not, it's fully up to the compiler.
It's most probable that the compiler would inline at least MaxLength() but possibly also your ctor. Otherwise it may very well generate a single instance of your ctor and pass/have it retrieve S from elsewhere. The only way to know for sure is to examine the output of the compiler.
So in order to know for sure I decided to list what VS2005 does in a release build. The file I have compiled looks like this:
template <int S>
class A
{
char s_[S];
public:
A()
{
for(int i = 0; i < S; ++i)
{
s_[i] = 'A';
}
}
int MaxLength() const
{
return S;
}
};
extern void useA(A<5> &a, int n); // to fool the optimizer
extern void useA(A<25> &a, int n);
void test()
{
A<5> a5;
useA(a5, a5.MaxLength());
A<25> a25;
useA(a25, a25.MaxLength());
}
The assembler output is the following:
?test##YAXXZ PROC ; test, COMDAT
[snip]
; 25 : A<5> a5;
mov eax, 1094795585 ; 41414141H
mov DWORD PTR _a5$[esp+40], eax
mov BYTE PTR _a5$[esp+44], al
; 26 : useA(a5, a5.MaxLength());
lea eax, DWORD PTR _a5$[esp+40]
push 5
push eax
call ?useA##YAXAAV?$A#$04##H#Z ; useA
As you can see both the ctor and the call to MaxLength() are inlined. And as you may now guess it does the same with the A<25> type:
; 28 : A<25> a25;
mov eax, 1094795585 ; 41414141H
; 29 : useA(a25, a25.MaxLength());
lea ecx, DWORD PTR _a25$[esp+48]
push 25 ; 00000019H
push ecx
mov DWORD PTR _a25$[esp+56], eax
mov DWORD PTR _a25$[esp+60], eax
mov DWORD PTR _a25$[esp+64], eax
mov DWORD PTR _a25$[esp+68], eax
mov DWORD PTR _a25$[esp+72], eax
mov DWORD PTR _a25$[esp+76], eax
mov BYTE PTR _a25$[esp+80], al
call ?useA##YAXAAV?$A#$0BJ###H#Z ; useA
It's very interesting to see the clever ways the compiler optimizes the for-loop. For all those premature optimizers out there using memset(), I would say fool on you.
How will it behave if i move the definition of A and Maxlength to a different cpp file.
It will probably not compile (unless you only use A in that cpp-file).
If i instantiate the above class with
different values of S, will the
compiler create different instances of
A() and MaxLenth() function? Or will
it create one instance and pass the S
as some sort of argument?
The compiler will instantiate a different copy of the class template for each different value of the parameter. Regarding the member functions, it will instantiate a different copy of each, for each different value of S. But unlike member functions of non-template classes, they will only be generated if they are actually used.
How will it behave if i move the definition of A and Maxlength to a different cpp file.
You mean if you put the definition of A into a header file, but define the member function MaxLength in a cpp file? Well, if users of your class template want to call MaxLength, the compiler wants to see its code, since it wants to instantiate a copy of it with the actual value of S. If it doesn't have the code available, it assumes the code is provided otherwise, and doesn't generate any code:
A.hpp
template<int S> class A {
public:
A() { /* .... */ }
int MaxLength(); /* not defined here in the header file */
};
A.cpp
template<int S> int
A<S>::MaxLength() { /* ... */ }
If you now only include include A.hpp for the code using the class template A, then the compiler won't see the code for MaxLength and won't generate any instantiation. You have two options:
Include the file A.cpp too, so the compiler sees the code for it, or
Provide an explicit instantiation for values of S you know you will use. For those values, you won't need to provide the code of MaxLength
For the second option, this is done by putting a line like the following inside A.cpp:
template class A<25>;
The compiler will be able to survive without seeing code for the member functions of A<25> now, since you explicitly instantiated a copy of your template for S=25. If you don't do any of the two options above, the linker will refuse to create a final executable, since there is still code missing that is needed.
A<S>::MaxLength() is so trivial it will be fully inlined. Hence, there will be 0 copies. A<S>::A() looks more complex, so it likely will cause multiple copies to be generated. The compiler may of course decide not to, as long as the code works as intended.
You might want to see if you can move the loop to A_base::A_base(int S).
Wherever S is used, a separate version of that function will get compiled into your code for each different S you instantiate.
It will create two different versions of A() and MaxLength() that will return compile-time constants. The simple return S; will be compiled efficiently and even inlined where possible.
Compiler will create different instances of the class if you instantiate it for different types or parameters.