I am wondering whether it can be guaranteed (by the compiler's implementation) that certain 'if' statements are never going to end up in the binary code, i.e., that no jump instructions are ever going to be emitted for some given 'if' statements.
This can be motivated with examples, but let me be brief. Assume you have a class like this one:
template<bool N = true>
class A {
public:
// ...
void f() {
// do some work
if (N) {
// do some optional work
}
// do more work
}
// ...
};
Is the 'if' statement inside the function f ever going to be implemented by a modern compiler when class A is instantiated as A<false>? By "implemented" I mean whether branching (or jump) instructions are going to be produced.
It seems to me that the answer is negative, i.e., the compiler will remove it (probably because it does not make much sense to implement if (false) { .. }). But is this guaranteed to happen in all compilers? How do I know if a compiler does this kind of optimization? Is that 'if' statement going to be removed only when optimization flags (e.g., -O1 or higher in g++) are passed to the compiler? In other words, is it also removed when there is no optimization at all?
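One practical way to answer the "how do I know" part for your own toolchain is to emit the assembly for both instantiations and compare, e.g. (file name hypothetical):
g++ -O0 -S main.cpp -o main_O0.s
g++ -O2 -S main.cpp -o main_O2.s
If the generated code for A<false>::f() contains no conditional jump at a given optimization level, that level removes the branch.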
Maybe
if constexpr (condition)
syntax would be an interesting solution, without a template
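A minimal sketch of that suggestion applied to the class above (C++17): the false branch of if constexpr is discarded during compilation, so no jump can survive even at -O0.
template<bool N = true>
class A {
public:
    void f() {
        // do some work
        if constexpr (N) {
            // do some optional work; discarded entirely when N is false
        }
        // do more work
    }
};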
Related
Given that the C++ template system is not context-free and is also Turing-complete, can anyone provide me with a non-trivial example of a program that makes the g++ compiler loop forever?
For more context: I imagine that if the C++ template system is Turing-complete, it can recognize all recursively enumerable languages and decide all recursive ones. So it made me think about the acceptance problem, and its more famous brother, the halting problem. I also imagine that g++ must decide whether the input belongs to the C++ language (a decision problem) during syntactic analysis. But it must also resolve all templates, and since templates are only recursively enumerable, there must be a C++ program that makes g++'s analysis run forever, since it cannot decide whether the program belongs to the C++ grammar or not.
I would also like to know how g++ deals with such things.
While this is true in theory for the unlimited language, compilers in practice have implementation limits for recursive behavior (e.g. how deeply template instantiations can be nested or how many instructions can be evaluated in a constant expression), so it is probably not straightforward to find such a case, even if we somehow ignore the obvious problem of bounded memory. The standard specifically permits such limits, so if you want to be pedantic I am not even sure that any given implementation has to satisfy these theoretical concepts.
Also, infinitely recursive template instantiation specifically is forbidden by the language. A program containing such a construct has undefined behavior, and the compiler can simply refuse to compile it if it is detected (although of course it cannot be detected in general).
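A minimal sketch of such a forbidden construct (hypothetical names): there is no base-case specialization, so each instantiation demands another one, and a compiler will hit its depth limit and report an error rather than loop forever.
template<int N>
struct Forever {
    static constexpr int value = Forever<N - 1>::value; // nothing ever terminates the recursion
};

int x = Forever<0>::value; // requires Forever<-1>, Forever<-2>, ... without end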
This shows the limits for clang (Apple clang version 13.1.6, clang-1316.0.21.2.5):
#include <iostream>
template<int V>
struct Count
{
static constexpr int value = Count<V-1>::value + 1;
};
template<>
struct Count<1>
{
static constexpr int value = 1;
};
int main()
{
#ifdef WORK
int v = Count<1026>::value; // This works.
#else
int v = Count<1027>::value; // This will fail to compile.
#endif
std::cout << "V: " << v << "\n";
}
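For what it's worth, both GCC and Clang accept a flag to raise the instantiation depth limit (defaults and exact accounting vary by version), so the threshold observed above can be moved, e.g. (file name hypothetical):
clang++ -std=c++17 -ftemplate-depth=2048 count.cpp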
I've searched this question here (on SO), and as far as I can tell, all the questions assume the reader already knows what compile-time functions are, but that is almost impossible for a beginner to know, because resources explaining it are quite rare.
I have found a short Wikipedia article, which shows how to write incomprehensible code using a never-before-seen use of enums in C++, and a video about its future, which explains very little about the concept itself.
It seems to me that there are two ways to write a compile-time function in C++:
constexpr
template<>
I've been through a short introduction of both of them, but I have no idea how they pop up here.
Can anyone explain compile-time functions with a sufficiently good example, one that encompasses most of their relevant features?
In C++, as you mentioned, there are two ways of evaluating code at compile time - constexpr functions and template metaprogramming.
There are a few differences between those solutions. The template option is older and therefore supported by a wider range of compilers. Additionally, templates are guaranteed to be evaluated at compile time, while constexpr is somewhat like inline - it only tells the compiler that it is possible to do the work while compiling. Also, for templates the arguments are usually passed via the template parameter list, while constexpr functions take arguments like regular functions (which they actually are). Constexpr functions have the advantage that they can also be called as regular functions at runtime.
Now the similarities - it must be possible to evaluate their arguments at compile time. So the arguments must be either literals or results of other compile-time functions.
Having said all that let's look at compile time max function:
template<int a, int b>
struct max_template {
static constexpr int value = a > b ? a : b;
};
constexpr int max_fun(int a, int b) {
return a > b ? a : b;
}
int main() {
int x = 2;
int y = 3;
int foo = max_fun(3, 2); // can be evaluated at compile time
int bar = max_template<3, 2>::value; // is surely evaluated at compile time
// int bar2 = max_template<x, y>::value; // won't compile: x and y are not compile-time constants
int baz = max_fun(x, y); // will be evaluated at runtime
return 0;
}
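One addition to the above: you can force a constexpr function to be evaluated at compile time by using its result where a constant expression is required, for example:
constexpr int qux = max_fun(3, 2); // guaranteed compile time: initializer of a constexpr variable
static_assert(max_fun(3, 2) == 3, "also a constant-expression context");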
A "compile time function" as you have seen the term used is not a C++ construct, it's just the idea of computing stuff (hence, function) at compile-time (as opposed to computing at runtime or via a separate build tool outside the compiler). C++ makes this possible in several ways, of which you have found two:
Templates can indeed be used to compute arbitrary stuff, a set of techniques called "template metaprogramming". That's mostly by accident as they weren't designed for this purpose at all, hence the crazy syntax and struggles with old compilers. But in C++03 and before, that's all we had.
constexpr has been added in C++11 after seeing the need for compile-time calculations, and brings them back into somewhat saner territory. Its toolbelt has been expanding ever since, allowing more and more normal-looking code to be run at compile-time by just tacking a constexpr in the right place.
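For instance (a minimal sketch; C++14 relaxed the rules so that loops and local variables are allowed in constexpr functions):
constexpr int factorial(int n) {
    int result = 1; // local mutable state, legal in constexpr since C++14
    for (int i = 2; i <= n; ++i)
        result *= i;
    return result;
}
static_assert(factorial(5) == 120, "evaluated at compile time");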
One could also mention macro metaprogramming, of which Boost.Preprocessor is a good example. But it's even more wonky and abhorrently arcane than old-school template metaprogramming, so you probably don't want to use it if you have a choice.
I wanted to do a couple of sanity tests for a pair of convenience functions that split a 64-bit integer in two 32-bit integers, or do the reverse. The intent is that you don't do the bit shifts and logic ops all over again with the potential of a typo somewhere. The sanity tests were supposed to make 100% sure that the pair of functions, although pretty trivial, indeed works as intended.
Nothing fancy, really... so as the first thing I added this:
static constexpr auto joinsplit(uint64_t h) noexcept { auto [a,b] = split(h); return join(a,b); }
static_assert(joinsplit(0x1234) == 0x1234);
... which works perfectly well, but is less "exhaustive" than I'd like. Of course I can follow up with another 5 or 6 tests with different patterns, copy-paste to the rescue. But seriously... wouldn't it be nice to have the compiler check a dozen or so values, within a pretty little function? No copy-paste? Now that would be cool.
With a recursive variadic template, this can be done (and it's what I'm using in lack of something better), but it's in my opinion needlessly ugly.
Given the power of constexpr functions and range-based for, wouldn't it be cool to have something nice and readable like:
constexpr bool test()
{
for(constexpr auto value : {1,2,3}) // other numbers of course
{
constexpr auto [a,b] = split(value);
static_assert(value == join(a,b));
}
return true; // never used
}
static_assert(test()); // invoke test
A big plus of this solution would be that in addition to being much more readable, it would be obvious from the failing static_assert not just that the test failed in general, but also the exact value for which it failed.
This, however, doesn't work for two reasons:
You cannot declare value as constexpr because, as stated by the compiler: "The value of __for_begin is not usable in a constant expression". The reason for that is also explained by the compiler: "note: __for_begin was not declared constexpr". Fair enough, that is a reason, silly as it may be.
A decomposition declaration cannot be declared constexpr (which is promptly followed by a "non-constexpr condition for static_assert" error).
In both cases, I wonder if there is truly a hindrance to allowing these to be constexpr. I understand why it doesn't work (see above!), but the interesting question is: why is it like that?
I acknowledge that declaring value as constexpr is a lie to begin with since its value obviously is not constant (it's different in each iteration). On the other hand, any value that it ever takes is from a compiletime constant set of values, yet without the constexpr keyword the compiler refuses to treat it as such, i.e. the result of split is non-constexpr and not usable with static_assert although it really is, by all means.
OK, well... I'm probably really asking too much if I want to declare something that has a changing value as constant. Even though, from some point of view, it is constant within each iteration's scope. Somehow... is the language missing a concept here?
I acknowledge that range-based for is, like lambdas, really just a hack that mostly works, and mostly works invisibly, not a true language feature -- the mention of __for_begin is a dead giveaway on its implementation. I also acknowledge that it's generally tricky (forbidding) to allow the counter in a normal for loop being constexpr, not only because it's not constant, but because you can in principle have any kind of expressions in there, and it truly cannot be easily told in advance what values in general will be generated (not with reasonable effort during compiletime, anyway).
On the other hand, given an exact finite sequence of literals (which is as compiletime-constant as it can get), the compiler should be able to do a number of iterations, each iteration of the loop with a different, compiletime-constant value (unroll the loop if you will). Somehow, in a readable (non-recursive-template) manner, such thing should be possible?
Am I asking too much there?
I acknowledge that a decomposition declaration is not an altogether "trivial" thing. It might for example require calling get on a tuple, which is a class template (that could in principle be anything). But, whatever, get happens to be constexpr (so that's no excuse), and also in my concrete example, an anonymous temporary of an anonymous struct with two members is returned, so public direct member binding (to a constexpr struct) is used.
Ironically, the compiler even does exactly the right thing in the first example, too (and with recursive templates as well). So apparently, it's quite possible. Only just, for some reason, not in the second example.
Again, am I asking too much here?
The likely correct answer will be "The standard doesn't provide that".
Apart from that, are there any true, technical reasons why this cannot, could not, or should not work? Is that an oversight, an implementation deficiency, or intentionally forbidden?
I can't answer your theoretical questions ("is the language missing a concept here?", "such a thing should be possible? Am I asking too much there?", "are there any true, technical reasons why this cannot, could not, or should not work? Is that an oversight, an implementation deficiency, or intentionally forbidden?") but, from the practical point of view...
With a recursive variadic template, this can be done (and it's what I'm using in lack of something better), but it's in my opinion needlessly ugly.
I think that variadic templates are the right way and, given that you tagged C++17, there is no reason to make it recursive: you can use folding.
For example:
template <uint64_t ... Is>
static constexpr void test () noexcept
{ static_assert( ((joinsplit(Is) == Is) && ...) ); }
The following is a full compiling example
#include <utility>
#include <cstdint>
static constexpr std::pair<uint32_t, uint32_t> split (uint64_t h) noexcept
{ return { h >> 32 , h }; }
static constexpr uint64_t join (uint32_t h1, uint32_t h2) noexcept
{ return (uint64_t{h1} << 32) | h2; }
static constexpr auto joinsplit (uint64_t h) noexcept
{ auto [a,b] = split(h); return join(a, b); }
template <uint64_t ... Is>
static constexpr void test () noexcept
{ static_assert( ((joinsplit(Is) == Is) && ...) ); }
int main()
{
test<1, 2, 3>();
}
-- EDIT -- Bonus answer
Folding (C++17) is great, but never underestimate the power of the comma operator.
You can obtain the same result (well... almost the same) in C++14 with a helper function and the initialization of an unused array
template <uint64_t I>
static constexpr void test_helper () noexcept
{ static_assert( joinsplit(I) == I, "!" ); }
template <uint64_t ... Is>
static constexpr void test () noexcept
{
using unused = int[];
(void)unused { 0, (test_helper<Is>(), 0)... };
}
Obviously after a little change in joinsplit() to make it C++14 compliant (structured bindings are C++17):
static constexpr auto joinsplit (uint64_t h) noexcept
{ auto p = split(h); return join(p.first, p.second); }
Say I have a class C that I want to be able to implicitly cast to bool to use in if statements.
class C {
public:
...
operator bool() { return data ? true : false; }
private:
void * data;
};
and
C c;
...
if (c) ...
But the cast operator has a conditional, which is technically overhead (even if relatively insignificant). If data were public I could do if (c.data) instead, which is entirely possible and does not involve any conditionals. I doubt that the compiler will do any implicit conversion involving a conditional in the latter scenario, since it will likely generate a "jump if zero" or "jump if not zero" instruction, which doesn't really need any Boolean value - and the CPU most likely has no notion of Booleans anyway.
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Note that I did establish that if the typecast directly returns data it also works, probably using the same type of implicit (hypothetical and not really happening in practice) conversion that would be used in the case of if (c.data).
Edit: Just to clarify, the point of the matter is actually a bit hypothetical. The dilemma is that Boolean is itself a hypothetical construct (which didn't initially exist in C/C++); in reality it is just integers. As I mentioned, the typecast can directly return data or use != instead, but that is really not very readable, and even that is not the issue. I don't really know how to word it to make better sense of it: class C has a void * that is an integer, the CPU has conditional jumps which use integers, and the issue is that abiding by the hypothetical Boolean construct that sits in the middle mandates the extra conditional. Dunno if that "clarification" made things any clearer though...
My question is whether the typecast operator overload will indeed be less efficient than directly using the data member.
Only examining your compiler's output - with the specific optimisation flags you'd like to use - can tell you for sure, and then it might change after some seemingly irrelevant change like adding an extra variable somewhere in the calling context, or perhaps with the next compiler release, etc.
More generally, C++ wouldn't be renowned for speed if the optimisers didn't tend to handle this kind of situation perfectly, so your odds are very good.
Further, write working code then profile it and you'll learn a lot more about what performance problems are actually significant.
It depends on how smart your compiler's optimizer is. It should be smart enough to remove the useless ? true : false operation, because the typecast operation should be inlined.
Or you could just write this and not worry about it:
operator bool() { return data; }
Since there's a built-in implicit conversion from void* to bool, data gets converted on the way out of the function.
I don't remember if the conditional in if expects bool or void*; at one point, before C++ added bool, it was the latter. (operator! in the iostream classes returned void* back then.)
On modern compilers these two functions produce the same machine code:
bool toBool1(void* ptr) {
return ptr ? true : false;
}
bool toBool2(void* ptr) {
return ptr;
}
So it really doesn't matter.
I'm browsing through some code and I found a few ternary operators in it. This code is a library that we use, and it's supposed to be quite fast.
I'm wondering whether we're saving anything except space there.
What's your experience?
Performance
The ternary operator shouldn't differ in performance from a well-written equivalent if/else statement... they may well resolve to the same representation in the Abstract Syntax Tree, undergo the same optimisations etc..
Things you can only do with ? :
If you're initialising a constant or reference, or working out which value to use inside a member initialisation list, then if/else statements can't be used but ? : can be:
const int x = f() ? 10 : 2;
X::X() : n_(n > 0 ? 2 * n : 0) { }
Factoring for concise code
Key reasons to use ? : include localisation, and avoiding redundant repetition of other parts of the same statements/function-calls, for example:
if (condition)
return x;
else
return y;
...is only preferable to...
return condition ? x : y;
...on readability grounds if dealing with very inexperienced programmers, or some of the terms are complicated enough that the ? : structure gets lost in the noise. In more complex cases like:
fn(condition1 ? t1 : f1, condition2 ? t2 : f2, condition3 ? t3 : f3);
An equivalent if/else:
if (condition1)
if (condition2)
if (condition3)
fn(t1, t2, t3);
else
fn(t1, t2, f3);
else if (condition3)
fn(t1, f2, t3);
else
fn(t1, f2, f3);
else
if (condition2)
...etc...
That's a lot of extra function calls that the compiler may or may not optimise away.
Further, ? : allows you to select an object, then use a member thereof:
(f() ? a : b).fn((g() ? c : d).field_name);
The equivalent if/else would be:
if (f())
if (g())
a.fn(c.field_name);
else
a.fn(d.field_name);
else
if (g())
b.fn(c.field_name);
else
b.fn(d.field_name);
Can't named temporaries improve the if/else monstrosity above?
If the expressions t1, f1, t2 etc. are too verbose to type repeatedly, creating named temporaries may help, but then:
To get performance matching ? : you may need to use std::move, except when the same temporary is passed to two && parameters in the function called: then you must avoid it. That's more complex and error-prone.
c ? x : y evaluates c, then either but not both of x and y, which makes it safe to, say, test that a pointer isn't nullptr before using it, while providing some fallback value/behaviour. The code only gets the side effects of whichever of x and y is actually selected. With named temporaries, you may need if/else around (or ? : inside) their initialisation to prevent unwanted code executing, or code executing more often than desired.
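A minimal sketch of that short-circuit point (names hypothetical):
// Only the selected operand is evaluated: p->size() never runs when p is null.
std::size_t n = p ? p->size() : 0;
A named temporary initialized from p->size() ahead of the selection would dereference a null pointer.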
Functional difference: unifying result type
Consider:
void is(int) { std::cout << "int\n"; }
void is(double) { std::cout << "double\n"; }
void f(bool expr)
{
is(expr ? 1 : 2.0);
if (expr)
is(1);
else
is(2.0);
}
In the conditional operator version above, 1 undergoes a Standard Conversion to double so that its type matches 2.0, meaning the is(double) overload is called even in the true/1 situation. The if/else statement doesn't trigger this conversion: the true/1 branch calls is(int).
Nor can you mix an expression of type void with a non-void one in a conditional operator (unless the void operand is a throw-expression), whereas the branches of an if/else are statements and need no common type.
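A minimal sketch of that restriction (hypothetical functions):
#include <iostream>
void log_error() { std::cout << "error\n"; }
int  parse()     { return 42; }

int main()
{
    bool ok = true;
    // ok ? parse() : log_error();     // ill-formed: int and void cannot be unified
    if (ok) parse(); else log_error(); // fine: branches are statements, no common type needed
}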
Emphasis: value-selection before/after action needing values
There's a different emphasis:
An if/else statement emphasises the branching first and what's to be done is secondary, while a ternary operator emphasises what's to be done over the selection of the values to do it with.
In different situations, either may better reflect the programmer's "natural" perspective on the code and make it easier to understand, verify and maintain. You may find yourself selecting one over the other based on the order in which you consider these factors when writing the code - if you've launched into "doing something" then find you might use one of a couple (or few) values to do it with, ? : is the least disruptive way to express that and continue your coding "flow".
The only potential benefit to ternary operators over plain if statements in my view is their ability to be used for initializations, which is particularly useful for const:
E.g.
const int foo = (a > b ? b : a - 10);
Doing this with an if/else block is impossible without using a function call as well. If you happen to have lots of cases of const things like this, you might find there's a small gain from initializing a const properly over assignment with if/else. Measure it! It probably won't even be measurable, though. The reason I tend to do this is that marking it const lets the compiler catch me if I later do something that could/would accidentally change something I thought was fixed.
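For completeness, the function call alluded to above can be an immediately-invoked lambda; a sketch, assuming a and b are in scope as in the snippet above:
const int foo = [&] {
    if (a > b) return b;
    return a - 10;
}();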
Effectively what I'm saying is that ternary operator is important for const-correctness, and const correctness is a great habit to be in:
This saves a lot of your time by letting the compiler help you spot mistakes you make
This can potentially let the compiler apply other optimizations
Well...
I did a few tests with GCC and this function call:
add(argc, (argc > 1)?(argv[1][0] > 5)?50:10:1, (argc > 2)?(argv[2][0] > 5)?50:10:1, (argc > 3)?(argv[3][0] > 5)?50:10:1);
The resulting assembler code with gcc -O3 had 35 instructions.
The equivalent code with if/else + intermediate variables had 36. With nested if/else using the fact that 3 > 2 > 1, I got 44. I did not even try to expand this into separate function calls.
Now I did not do any performance analysis, nor did I do a quality check of the resulting assembler code, but for something simple like this with no loops etc. I believe shorter is better.
It appears that there is some value to ternary operators after all :-)
That is only if code speed is absolutely crucial, of course. If/else statements are much easier to read when nested than something like c1 ? c2 ? c3 ? c4 ? 1 : 2 : 3 : 4 : 5. And having huge expressions as function arguments is not fun.
Also keep in mind that nested ternary expressions make refactoring the code - or debugging by placing a bunch of handy printf()s at a condition - a lot harder.
If you're worried about it from a performance perspective then I'd be very surprised if there was any difference between the two.
From a look 'n feel perspective it's mainly down to personal preference. If the condition is short and the true/false parts are short then a ternary operator is fine, but anything longer tends to be better in an if/else statement (in my opinion).
You assume that there must be a distinction between the two when, in fact, there are a number of languages which forgo the "if-else" statement in favor of an "if-else" expression (in which case they may not even have the ternary operator, which is no longer needed).
Imagine:
x = if (t) a else b
Anyway, the ternary operator is an expression in some languages (C, C#, C++, Java, etc.) which do not have "if-else" expressions, and thus it serves a distinct role there.