Templates and inlining - optimizing multiplication - C++

I don't understand this example I came across:
int Multiply(int x, int m) {
    return x * m;
}

template <int m>
int MultiplyBy(int x) {
    return x * m;
}

int a, b;
a = Multiply(10, 8);
b = MultiplyBy<8>(10);
In the above example the template function is faster than the simple function because the compiler knows that it can multiply by a power of 2 by using a shift operation. x*8 is replaced by x << 3, which is faster. In the case of the simple function, the compiler doesn't know the value of m and therefore cannot do the optimization unless the function can be inlined.
As I understand it, the reason the template can optimize is because the compiler knows the value of the argument (8) at compile-time, whereas the simple function will not know the value of x (or m) until run-time. So how is inlining the simple function going to change this fact? Inlining doesn't provide any run-time knowledge of the argument value??

Inlining doesn't provide any run-time knowledge of the argument value??
Inlining per se doesn't. However, since the second argument is a compile-time constant, the compiler can propagate that constant into the inlined function.
Since the first argument is also a compile-time constant, the entire
a = Multiply(10,8);
can be replaced with
a = 80;
In fact, this is precisely what my compiler (gcc 4.7.2) does when I turn on optimizations.

a = Multiply(10,8);
Let's manually inline the function call:
a = 10 * 8;
Now, of course 8 is a compile-time constant here, so the compiler can use the bit-shift optimization as described. However, it would likely perform a better optimization and just replace 10 * 8 with 80. Compilers are very smart - given a constant expression, such as 10 * 8, they can work out the result at compile time.
That would be different had you done:
int x;
std::cin >> x;
a = Multiply(10,x);
If you inline Multiply here, you get:
int x;
std::cin >> x;
a = 10 * x;
The compiler doesn't know the value of x at compile-time and therefore can't optimize this in the same way.

Related

If `v` is a class, can `v + 1 - 1` be optimized by the compiler to `v`?

Consider the following code:
struct S {
    int a, b;
    S() : a(0), b(10) {}
};

S operator + (const S& v, int n) {
    S r = v;
    r.a += n;
    return r;
}

S operator - (const S& v, int n) {
    S r = v;
    r.b -= n;
    return r;
}

S v = S() + 1 - 1;
Is it possible that S() + 1 - 1 will be optimized to S() ?
How does a compiler even determine when these kinds of things can be optimized?
No, a compiler is not allowed to optimize
S() + 1 - 1
into
S()
because those two expressions are not equivalent for your class S.
Even though it might be non-idiomatic, it's completely valid for S to be written as you have done, in a way that operator+ and operator- don't cancel each other out (in your case, by modifying different member variables in the two operators).
The compiler is required to evaluate S() + 1 before using that result in the evaluation of - 1, and is only allowed to optimize that expression if it can prove that the resulting expression would be the same. In other words, the program must follow the as-if rule. Obviously, in this case the results would be different, and so the optimization is not allowed.
By the same token, if the compiler can see that an expression, and this can be any expression at all, is equivalent to some other expression, then it is allowed to optimize it as much as it wants. But again, the compiler must adhere to the as-if rule, i.e. the resulting program must behave exactly as if none of those optimizations were ever made at all.
Note that while your implementation of S is valid according to the language rules, it is quite strange, and will surprise users of the class. I suggest only doing that for types that are very clearly domain specific, and where users of the type naturally expect that + and - are not necessarily inverses of each other.
To avoid confusion, let's write it like this:
S a;
S b = a + 1 - 1;
To better see what happens in the last line, you can rewrite the operator calls explicitly (they are free functions here):
S b = operator-( operator+( a, 1 ), 1 );
Already here we see that the compiler cannot simply skip the +1 -1. To do that, it would first have to prove that applying +1 and then -1 to an S is the same as not applying any operation.
Usually that is the case, since typically + and - are inverses. However, your operators are not inverses of each other, as they increment/decrement different members.
Is it possible that S() + 1 - 1 will be optimized to S() ?
No, because the result would not be the same.
How does a compiler even determine when these kinds of things can be optimized?
This is covered by the so-called as-if rule. If there is no observable difference, then compilers are allowed to transform the code to get an optimization. On the other hand, an optimization may never change observable behavior (there are a few exceptions, but they don't apply here).

Says it cannot be used as constant

Visual studio 2019
const int inputs = 1, layers = 2, layerN = 3, output = 1;
const int how = inputs * layerN + pow(layerN,layers) + output * layerN;
float w[how];
it says on w[how] that it must be a "const" expression (but it is??)
I cannot run the program with this error.
Help.
how is not a constant expression. Its value is not known to the compiler at compile-time, it is calculated dynamically at runtime because of the function call to pow(). As such, it cannot be used to declare a fixed length array. You will have to use new[] or a std::vector instead:
float *w = new float[how];
...
delete[] w;
or:
#include <vector>
std::vector<float> w(how);
Perhaps it is clearer if you consider this example:
int x;
std::cin >> x;
const int y = x;
float w[y]; // error
y is const, i.e. during runtime its value cannot possibly change. However, to allocate memory for an array, the compiler needs to know the size at compile time. Making y const alone is not sufficient to achieve that. In your case it is not user input but the call to pow that prevents the compiler from knowing the value.
PS: You might want to read about constexpr which is a much stronger guarantee than const. Unfortunately there is no constexpr version of pow.
The how variable is indeed const, but it is not a constant expression. A constant expression is a value or function call whose result is known to the compiler at compile time, before the program runs.
You can annotate your code to tell the compiler which variable should have a known value at compile time. This is what the constexpr keyword is for.
However, the pow function is not marked as constexpr, so it's only usable at runtime and not at compile time. You must implement your own function instead:
constexpr auto power(int value, int exponent) -> int {
    int result = 1;
    for (int i = 0; i < exponent; i++) {
        result *= value;
    }
    return result;
}
constexpr int inputs = 1, layers = 2, layerN = 3, output = 1;
constexpr int how = inputs * layerN + power(layerN,layers) + output * layerN;
Then it will be usable as an array size:
float w[how];
Also note that with this power function we created, we can revert back to const and the compiler will still accept it:
const int how = inputs * layerN + power(layerN, layers) + output * layerN;
float w[how];
The difference is that constexpr enforces compile-time knowability, while const does not, so with const you only get the error later, at the point where a constant expression is required.
it says on w[how] that it must be a "const" expression (but it is??)
It probably doesn't say that (in future, avoid paraphrasing error messages). I assume that it actually says that it must be a constant expression. The distinction may seem subtle, but is significant.
"Constant expression" is a specific term defined by the C++ language. There are many rules that specify whether an expression is constant, but a concise way to describe it is: Expression whose value is determined at translation time.
As confusing as it may be, an id-expression that names a const variable is not necessarily a constant expression. Constness of a type by itself merely implies that the variable is constant at runtime. And in this case, how specifically is not a constant expression.
The reason why how is not a constant expression is that its initialiser inputs * layerN + pow(layerN,layers) + output * layerN is not a constant expression. And that is because it contains a call to the non-constexpr function pow.
Here is a simple implementation of a constexpr pow function for int:
constexpr int
constant_pow(int base, int exp)
{
    return exp ? (base * constant_pow(base, exp - 1))
               : 1;
}
Variable-length arrays (VLAs) are an optional feature of C, not C++. Visual Studio 2019 does not support them at all (neither in C nor C++ mode). However, some other compilers support VLAs even in C++ mode. These are compiler-specific extensions, though, and not part of the official C++ standard.
In cases where VLAs are not supported by the compiler, the value of how must be known at compile-time. The reason the compiler cannot determine the value of how at compile-time is that the expression contains a function call to pow(). Replacing the expression pow( layerN, layers ) with 3*3 (which can be evaluated at compile-time) will make your code work.
In C++, a function declared constexpr can be evaluated at compile-time, and must be when it is used in a context that requires a constant expression. However, the C++ standard library function pow is not declared as constexpr.
It doesn't work with const. The const keyword denotes a read-only variable, but not necessarily a compile-time constant, which is what is needed to specify the size of an array.
In other words, the only thing const guarantees is that the value of the variable will not change, not that it is known before the program executes. To declare an array, the compiler needs to know the size when it compiles.
However, C++11 introduced the constexpr keyword for this very purpose. constexpr tells the compiler that the value of the variable is known at compile time.
This code uses constexpr.
constexpr int inputs = 1, layers = 2, layerN = 3, output = 1;
constexpr int how = inputs * layerN + pow(layerN,layers) + output * layerN;
float w[how];
See it live here
This code will work provided that the pow function is declared as a constexpr function (a function declared as constexpr guarantees that it can compute its value at compile-time provided that its arguments are themselves known).
This seems to be the case in the GCC version I linked above but it is not guaranteed to work everywhere. If you want portable code, you can define your own version of the power function which will be constexpr.

Is variable defined by const int determined at compilation time?

I'm just wondering whether statements like const int N=10 are evaluated at compilation time. The reason I'm asking is that the following code works.
int main()
{
    const int N = 10;
    int a[N] = {};
    return 0;
}
But this one wouldn't.
int main()
{
    int N = 10;
    int a[N] = {};
    return 0;
}
The compiler must generate code "as if" the expression was evaluated at compile time, but the const itself isn't sufficient for this. In order to be used as the dimension of an array, for example, the expression N must be a "constant integral expression". A const int is a constant integral expression only if it is initialized with a constant integral expression, and the initialization is visible to the compiler. (Something like extern int const N;, for example, can't be used in a constant integral expression.)

To be a constant integral expression, however, the variable must be const; in your second example, the behavior of the compiler and the resulting program must be "as if" the expression were only evaluated at runtime (which means that it cannot be used as the dimension of an array). In practice, at least with optimization, the compiler likely would evaluate N at compile time, but it still has to pretend it can't, and refuse to compile the code.
The compiler will probably evaluate both of the examples you provided at compile time, since even though the int N = 10; isn't const, you're not changing it anywhere and you're only assigning a constant value to it, which means the compiler can optimize this.
I recommend you take a look at the constexpr keyword introduced in C++11, which is exactly about being able to evaluate things at compile time.
Compilers will resolve const variables to literals at compile time (and also const expressions, see constant folding). The reason that the first method works is that compiler knows how much space to allocate (10*sizeof(int)) to a in the first method. In the second method the value of N is not known at compile time, and as such there is no way for the compiler to know how much space to allocate for a. Hope that helps.
This sort of thing is an implementation detail that technically is up to the compiler to choose. It could be different on different platforms.
In practice, with the most common compilers:
const int sometimes is and sometimes isn't baked at compile time. For example, the compiler clearly can't hardcode the value of a below into the object file:
int foo( int x )
{
    const int a = x + 1;
    return a * 2;
}
In that function, const means it is only constant within the scope of foo(), but it is still a local stack variable.
On the other hand, const int x = 5 seems to be a literal that is usually resolved at compile time by GCC and MSVC (except sometimes they don't turn it into a literal for reasons unclear). I've seen some other compilers that won't turn it into a literal, and always put const int on the stack like an ordinary local variable.
const static int is different, because it has static storage duration: it outlives the function it is declared in, which means it will never change over the life of the program. Every compiler I've ever worked with has turned const static primitives into compile-time literals.
Objects with constructors, however, will still need to be initialized at runtime; so
struct Foo {
    Foo() { CallGlobalFunction(); }
};
const static Foo g_whatever;
cannot be optimized into a literal by the compiler.

constexpr vs const: Will using constexpr instead of const help the compiler optimize better?

Will using constexpr instead of const help the compiler optimize better? I have some values that are constant. I could use an enum instead, but they aren't all of the same type and I don't want to use #defines for obvious reasons. They are declared like the code below. My question is: if I use constexpr, does it increase the chances that the compiler will replace the value where it's used with the literal constant value, or does it make no difference compared to const? I know it's up to each compiler whether it does the optimization, but a "general" answer describing how most compilers behave is very welcome. Also, assuming the results are different for simple locals like this, is it different again if they are instead members of a struct/class? Is constant folding harder to perform in those cases?
e.g.:
int foo()
{
    constexpr int x = 10;
    constexpr int y = x * 3;
    do_something(y + n);
}
versus
int foo()
{
    const int x = 10;
    const int y = x * 3;
    do_something(y + n);
}
Today, constexpr is still pretty new and might fall through some optimization cracks. Calculation of the variable's own value is required to be performed at compile time, of course. In the long run, I'd expect the same optimization opportunities for each. Obviously when that happens is compiler-specific.
Just use whichever one has clearer meaning to you and anyone else working on the code, not only in terms of behavior but also intent.
Among other things, note that constexpr guarantees the constant has the same value always, whereas const allows it to have a different value each time initialization is reached (it can't be changed except by initialization, of course). But nearly every compiler on the planet will determine that for your example, the values really are constant.
One more thing: constexpr compile-time evaluation (like template argument calculation) is active even when optimizations are disabled.

Do "#define" and inline behave the same?

I have some short defines in one of my headers like this:
#define ROUND_DOWN(a,b) (a)-(a)%(b)
e.g.
ROUND_DOWN(178,32) = 160
But if I pass this to it:
ROUND_DOWN(160*2, 32);
then it gets compiled like this?
(160*2)-(160*2)%(32),
which means more work, as it evaluates 160*2 twice.
I'm wondering if inline functions behave in the same way? e.g.
inline int RoundDown(int a, int b)
{
    return (a) - (a) % (b);
}
Would 160*2 get stored in "int a" as 320 and then the calculation would work, or would it behave the same as the define?
A better example is calling:
RoundDown((x+x2)*zoom, tile_width);
Do “#define” and inline behave the same?
No, they don't!
There are a number of differences between a macro and an inline function.
- Number of evaluations
Expressions passed as arguments to inline functions are evaluated once.
In some cases, expressions passed as arguments to macros can be evaluated more than once.
Every time you use an argument in a macro, that argument is evaluated.
A Code sample:
#define max(a,b) (a > b ? a : b)

int main()
{
    int a = 0;
    int b = 1;
    int c = max(a++, b++);
    cout << a << endl << b << endl;
    return 0;
}
The intention probably was to print 1 and 2, but macro expands to:
int c = a++ > b++ ? a++ : b++;
b gets incremented twice, and the program prints 1 and 3.
- Who evaluates them
Inline functions are processed by the compiler, while macros are expanded by the preprocessor before compilation proper.
- Type checking
Inline functions follow all the protocols of type safety enforced on normal functions.
Argument types are checked, and necessary conversions are performed correctly.
The compiler checks the return type and the function signature before putting an inline function into the symbol table.
They can be overloaded to perform the right kind of operation for the right kind of data.
Macros are more error prone than inline functions: the parameters are not typed (the macro works for any objects of arithmetic type).
No error checking is done during compilation.
A Code Sample:
#define MAX(a, b) ((a < b) ? b : a)

int main(void)
{
    cout << "Maximum of 10 and 20 is " << MAX("20", "10") << endl;
    return 0;
}
One can pass strings to a macro that does some integer arithmetic and a macro won't complain!
- Suggestion or Command?
Inline is just a suggestion to the compiler. It is the compiler’s decision whether to expand the function inline or not.
Macros will always be expanded.
- How about Debugging?
Inline functions can be debugged easily because you can put a break point at the inline function definition and step into the method for debugging step by step.
Macros cannot be stepped through in a debugger, as they are expanded before compilation.
First, you should pretty much assume that all constant expressions are evaluated at compile-time, so that multiplication never survives to be executed when you run the program.
Second, you can't depend on inline having any effect at all, it's just a hint to the compiler, not a requirement.
But even if the function is not inlined, the expression would not be evaluated twice since argument passing requires it to be evaluated before the body of the function runs.
#defines are simple textual substitutions, so (as you noticed) you may need to be careful with parentheses, etc. inline parameters are parsed normally.
There's a related issue with respect to conditions.
Nominally, the function argument 160*2 is evaluated exactly once, and the result is then used in the body of the function, whereas the macro evaluates 160*2 twice. If the argument expression has side-effects, then you can see this[*]: ROUND_DOWN(printf("hi!\n"), 1); vs RoundDown(printf("hi!\n"), 1);
In practice, whether the function is inlined or the macro expanded, it's just integer arithmetic in the expression, with no side-effects. An optimizing compiler can work out the result of the whole macro/function call, and just stick the answer in the emitted code. So you might find that your macro and your inline function result in exactly the same code being executed, and so int a = ROUND_DOWN(160*2, 32); and int a = RoundDown(160*2, 32); might both be the same as int a = 320;.
Where there are no side-effects, optimization can also store and re-use intermediate results. So int c = ROUND_DOWN(a*2, b); might end up emitting code that looks as though you've written:
int tmp = a*2;
int c = tmp - tmp % b;
Note that whether to actually inline a function is a decision made by the compiler based on its own optimization rules. Those rules might take account of whether the function is marked inline or not, but quite likely don't unless you're using compiler options to force inlining or whatever.
So, assuming a decent compiler there's no reason to use a macro for this - for your macro in particular you're just begging for someone to come along and write:
int a = ROUND_DOWN(321, 32) * 2;
and then waste a few minutes wondering why the result is 319.
[*] Although don't get carried away - for some expressions with side-effects, for example i++ where i is an integer, the macro has undefined behavior due to lack of sequence points.
Well in the example you've given with constants, on any reasonable compiler both versions will compute the constant at compile time.
Assuming you're actually asking about cases where variables are passed in, I would expect the compiler's optimizer to generate the same code in both cases (it wouldn't do the multiplication twice if it's more efficient to save off the result). The inline function also gives the compiler the option to make an actual function call if that would improve performance.
Finally, note that I wouldn't worry about micro-optimizations like this, because 99% of the time they'll have no effect on the performance of your program; I/O will be your bottleneck.