C++ for loop optimization question

I have the following code in VC++:
for (int i = (a - 1) * b; i < a * b && i < someObject->someFunction(); i++)
{
// ...
}
As far as I know, compilers optimize these arithmetic operations so they aren't re-evaluated on each iteration, but I'm not sure whether they can also tell that the function above returns the same value each time and doesn't need to be called repeatedly.
Is it better practice to save all the calculations into variables, or should I just rely on compiler optimizations and keep the code more readable?
int start = (a - 1) * b;
int expra = a * b;
int exprb = someObject->someFunction();
for (int i = start; i < expra && i < exprb; i++)
{
// ...
}

Short answer: it depends. If the compiler can deduce that running someObject->someFunction() every time and caching the result once both produce the same effects, it is allowed (but not guaranteed) to do so. Whether this static analysis is possible depends on your program: specifically, what the static type of someObject is and what its dynamic type is expected to be, as well as what someFunction() actually does, whether it's virtual, and so on.
In general, if it only needs to be done once, write your code in such a way that it can only be done once, bypassing the need to worry about what the compiler might be doing:
int start = (a - 1) * b;
int expra = a * b;
int exprb = someObject->someFunction();
for (int i = start; i < expra && i < exprb; i++)
// ...
Or, if you're into being concise:
for (int i = (a - 1) * b, expra = a * b, exprb = someObject->someFunction();
i < expra && i < exprb; i++)
// ...

From my experience, the VC++ compiler won't optimize the function call out of the loop unless it can see the function's implementation at the point where the calling code is compiled. So moving the call outside the loop is a good idea.
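For instance (a hypothetical two-file sketch), when only a declaration is visible, the compiler has to assume the call could return a different value each time:

// widget.h - only the declaration is visible to callers
struct Widget {
    int limit() const; // defined in widget.cpp, opaque in other translation units
};

// caller.cpp
#include "widget.h"
int sumBelowLimit(const Widget& w) {
    int sum = 0;
    // Without seeing limit()'s body (or using link-time optimization),
    // the compiler must call it on every iteration.
    for (int i = 0; i < w.limit(); ++i)
        sum += i;
    return sum;
}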

If a function resides within the same compilation unit as its caller, the compiler can often deduce some facts about it - e.g. that its output might not change for subsequent calls. In general, however, that is not the case.
In your example, introducing variables for these simple arithmetic expressions does not really change anything with regard to the produced object code and, in my opinion, makes the code less readable. Unless you have a bunch of long expressions that cannot reasonably be put on a line or two, you should avoid temporary variables - if for no other reason, then to reduce namespace pollution.
Using temporary variables implies a significant management overhead for the programmer, in order to keep them separate and avoid unintended side-effects. It also makes reusing code snippets harder.
On the other hand, assigning the result of the function to a variable can help the compiler optimise your code better by explicitly avoiding multiple function calls.
Personally, I would go with this:
int expr = someObject->someFunction();
for (int i = (a - 1) * b; i < a * b && i < expr; i++)
{
// ...
}

The compiler cannot make any assumptions about whether your function will return the same value each time. Imagine that your object is a socket: how could the compiler possibly know what its output will be?
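To make that concrete, here is a hypothetical sketch where hoisting the call out of the condition would change the program's meaning, because the loop body itself affects what the function returns:

class Socket {                 // hypothetical stand-in for an I/O object
    int pending_ = 64;
public:
    int bytesAvailable() const { return pending_; }
    void consume() { --pending_; }
};

void drain(Socket& s) {
    // bytesAvailable() must be re-read on each iteration: consume() changes
    // its result, so caching the first value would be a different program.
    for (int i = 0; i < s.bytesAvailable(); ++i)
        s.consume();
}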
Also, the optimizations a compiler can make in such loops depend strongly on whether a and b are declared const and whether they are local. With advanced optimization schemes, it may be able to infer that a and b are modified neither in the loop nor in your function (again, imagine that your object holds a reference to them).
Well, in short: go for the second version of your code!

It is very likely that the compiler will call the function each time.
If you are concerned with the readability of code, what about using:
int maxindex = std::min(expra, exprb);
for (int i = start; i < maxindex; i++)
IMHO, long lines do not improve readability.
Writing short lines and taking multiple steps to get a result does not hurt performance; that is exactly why we use compilers.

Effectively, what you are asking is whether the compiler will inline someFunction() and whether it can see that someObject is the same instance on each iteration; if it does both, it can potentially "cache" the return value instead of re-evaluating it.
Much of this depends on which optimisation settings you use, with VC++ as well as any other compiler, although I am not sure VC++ gives you quite as many flags as GCC.
I often find it incredible that programmers rely on compilers to optimise things they can easily optimise themselves. If you know the expression will evaluate the same each time, just move it into the initialisation section of the for loop and don't rely on the compiler:
for (int i = (a - 1) * b, iMax = someObject->someFunction();
i < a * b && i < iMax; ++i)
{
// body
}

Related

Performance: function call vs. multiplication by 1

Look at this function:
float process(float in) {
float out = in;
for (int i = 0; i < 31; ++i) {
if (biquads_[i]) {
out = biquads_[i]->filter(out);
}
}
return out;
}
biquads_ is a std::optional<Biquad>[31].
In this case I check every optional to see whether it's empty, and only then call the biquad's filter function. If instead I called the filter function unconditionally, changing an empty stage to multiply by 1 or simply return the input value, would that be more efficient?
Most likely it won't make a shred of difference (guessing somewhat, since your question is not entirely clear), for two reasons: 1) unless the code is in a very hot path, it won't matter even if one way is a few nanoseconds faster than the other; 2) most likely your compiler's optimizer will be clever enough to generate code that performs close to (if not identical to) the same in both cases. Did you test it? Did you benchmark/profile it? If not, do so - with optimization enabled.
Strive to write clear, readable, maintainable code. Worry about micro-optimization later when you actually have a problem and your profiler points to your function as a hot-spot.
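If you do want to try the unconditional-call variant, here is a minimal sketch (assuming a hypothetical Biquad whose default coefficients make filter() a pass-through); only a benchmark can tell you whether it actually wins:

#include <array>

struct Biquad {                 // hypothetical minimal interface
    // Default coefficients form an identity filter: filter(x) returns x.
    float b0 = 1.0f, b1 = 0.0f, a1 = 0.0f, z1 = 0.0f;
    float filter(float x) {
        float y = b0 * x + z1;
        z1 = b1 * x - a1 * y;
        return y;
    }
};

struct Chain {
    // Every slot always holds a biquad; "disabled" stages are identity
    // filters, so process() needs no per-stage emptiness check.
    std::array<Biquad, 31> stages{};

    float process(float in) {
        float out = in;
        for (auto& s : stages)
            out = s.filter(out); // unconditional call
        return out;
    }
};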

Should I define constants for repeating literals in the code?

I have example code like this, in which the literal 1 appears several times.
foo(x - 1);
y = z + 1;
bar[1] = y;
Should I define a constant ONE, and replace the literals with it?
constexpr int ONE = 1;
foo(x - ONE);
y = z + ONE;
bar[ONE] = y;
Would this replacement bring any performance improvement and/or reduce machine code size, at the cost of reduced code readability? Would the number of repetitions of the literal change the answer?
It will not bring you any performance or memory improvements. However, you should keep your code clean of magic numbers. So if a constant repeats in several places in your code, and in all those places it is logically the same constant, it is better to make it a named constant.
Example:
const int numberOfParticles = 10; //This is just an example, it's better not to use global variables.
void processParticlesPair(int i, int j) {
for (int iteration = 0; iteration < 10; ++iteration) {
//note that I didn't replace the "10" in the line above, because it is not numberOfParticles
//but the number of iterations - a different constant from a logical point of view.
//Do stuff
}
}
void displayParticles() {
for (int i = 0; i < numberOfParticles; ++i) {
for (int j = 0; j < numberOfParticles; ++j) {
if (i != j) {
processParticlesPair(i, j);
}
}
}
}
Depends. If you just have 1s in your code and you ask whether you should replace them: DON'T. Keep your code clean. You will not gain any performance or memory advantage - even worse, you might increase build time.
If the 1, however, is a build-time parameter: Yes, please introduce a constant! But choose a better name than ONE!
Should I define a constant ONE, and replace the literals with it?
No, absolutely not. If you have a name that indicates the meaning of the number (e.g. NumberOfDummyFoos), if its value can change and you want to prevent having to update it in a dozen locations, then you can use a constant for that, but a constant ONE adds absolutely no value over a literal 1.
Would this replacement make any performance improvement and/or reduce machine code size in the favor of reducing code readability?
In any realistic implementation, it does not.
Replacing literals with named constants only makes sense if the meaning of the constant is special. Replacing 1 with ONE is just overhead in most cases and adds no useful information for the reader, especially when it is used in different roles (an index, part of a calculation, etc.). If entry 1 of an array is somehow special, a constant like THE_SPECIAL_INDEX = 1 would make sense.
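A small hypothetical example of that distinction:

constexpr int THE_CHECKSUM_INDEX = 1; // hypothetical: slot 1 of a packet is special

void setChecksum(unsigned char (&packet)[8], unsigned char sum) {
    packet[THE_CHECKSUM_INDEX] = sum; // the name documents why index 1 is used
}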
For the compiler it usually does not make any difference.
In assembly, one constant value generally takes the same amount of memory as any other. Setting a constant value in your source code is more of a convenience for humans than an optimization.
In this case, using ONE in such a way is neither a performance nor a readability enhancement. That's why you've probably never seen it in source code before ;)

Calculations inside the `for (...)` statement

A lot of times I see code like:
int s = a / x;
for (int i = 0; i < s; i++)
// do something
If inside the for loop, neither a nor x is modified, can I then simply write:
for (int i = 0; i < a / x; i++)
// do something
and then assume that the compiler optimizes a / x, i.e. replaces it with a constant?
The most important part of int s = a / x is the variable name. It gives your syntax semantics, and lets you remember 12 months later why you were dividing one thing by another. You can't name the expression in the for statement, so you lose that self-documenting nature.
const auto monthlyAmount = static_cast<int>(yearlyAmount) / numberOfMonths;
for (auto i = 0; i < monthlyAmount; ++i)
// do something
In this way, extracting the variable isn't for a compiler optimization, it's a human maintainability optimization.
If the compiler can be sure that the variables used in the expression in the middle of your for loop will not change between iterations, it can optimize the calculation to be performed once at the beginning of the loop, instead of every iteration.
However, suppose the variables used are globals, or references to variables outside the function, and your for loop calls a function: that function could change these variables. If the compiler can see enough of the code at that point, it can find out whether this is the case and decide whether to optimize. However, compilers are only willing to look so far (otherwise compilation would take even longer), so in general you cannot assume the optimization is performed.
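As a hypothetical sketch of that situation:

int a = 100, x = 2;      // globals

void doSomething();      // defined in another translation unit

void run() {
    // The compiler cannot prove doSomething() leaves a and x alone,
    // so it must recompute a / x before every iteration.
    for (int i = 0; i < a / x; ++i)
        doSomething();
}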
The concern for optimization probably stems from the fact that the condition is evaluated before each iteration. If this is a potentially expensive operation and you don't need to do it over and over again, you can extract it out of the loop:
const std::size_t size = s.size(); // just an example
for (std::size_t i = 0; i < size; ++i)
{
}
For inexpensive operations this is probably a premature optimization and the compiler might generate the same code. The only way to be sure is to check the generated assembly code.
The problem with questions like this is that they cannot be generalized. Which optimizations the compiler performs and which it does not can only be determined by case-by-case analysis.
I'd certainly expect the compiler to do this if one of the following holds true:
1) Both a and x are local variables whose addresses are never taken.
2) The code in the loop is completely inlined.
In practice the second requirement isn't as hard as it looks, because if the functions in the body cannot be inlined, their runtime will likely dwarf the time needed to re-compute the bound anyway.

Is it always faster to store multiple class calls in a variable?

If you have a method such as this:
float method(myClass foo)
{
return foo.PrivateVar() + foo.PrivateVar();
}
is it always faster/better to do this instead?
float method(myClass foo)
{
float temp = foo.PrivateVar();
return temp + temp;
}
I know you're not supposed to put a call like foo.PrivateVar() in a for loop, for example, because it is evaluated many times when in some cases you actually only need the value once:
for (int i = 0; i < foo.PrivateInt(); i++)
{
//iterate through stuff with 'i'
}
From this I assumed I should change code like the first example into the second, but then people told me not to try to be smarter than the compiler, and that it could very well inline the calls.
I don't want to profile anything; I just want a few simple rules of good practice here. I'm writing a demo for a job application and I don't want anyone to look at the code and spot a rookie mistake.
That completely depends on what PrivateVar() is doing and where it's defined etc. If the compiler has access to the code in PrivateVar() and can guarantee that there are no side effects by calling the function it can do CSE which is basically what you've done in your second code example.
Exactly the same is true for your for loop. So if you want to be sure it's only evaluated once because it's a hugely expensive function (which also means that guaranteeing the absence of side effects gets tricky, even if there aren't any), write it explicitly.
If PrivateVar() is just a getter, write the clearer code - even if the compiler may not do CSE the performance difference won't matter in 99.9999% of all cases.
Edit: CSE stands for Common Subexpression Elimination, and it does exactly what it says ;) The wiki page shows an example for a simple multiplication, but the same can be done for larger code constructs, such as a function call.
In all cases we have to guarantee that evaluating the code only once doesn't change the semantics, i.e. doing CSE for this code:
a = b++ * c + g;
d = b++ * c * d;
and changing it to:
tmp = b++ * c;
a = tmp + g;
d = tmp * d;
would obviously be illegal (for function calls this is a bit more complex, but it's the same principle).
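As a hypothetical sketch of the function-call case: a pure getter may be folded into a single call, while a call with a side effect may not:

struct Counter {
    int n = 0;
    int value() const { return n; } // pure: repeated calls may be folded
    int next() { return ++n; }      // side effect: every call must remain
};

int demo(Counter& c) {
    int twiceValue = c.value() + c.value(); // CSE is legal here
    int sumOfNext  = c.next() + c.next();   // CSE here would change the result
    return twiceValue + sumOfNext;
}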

Question about optimization in C++

I've read that the C++ standard allows optimization to a point where it can actually hinder expected functionality. I'm talking about return value optimization, where you might actually have some logic in the copy constructor, yet the compiler optimizes the call away.
I find this somewhat bad, as someone who doesn't know about it might spend quite some time chasing a bug that results from it.
What I want to know is whether there are any other situations where over-optimization from the compiler can change functionality.
For example, something like:
int x = 1;
x = 1;
x = 1;
x = 1;
might be optimized to a single x=1;
Suppose I have:
class A;
A a = b;
a = b;
a = b;
Could this possibly also be optimized? Probably not the best example, but I hope you know what I mean...
Eliding copy operations is the only case where a compiler is allowed to optimize to the point where side effects visibly change. Do not rely on copy constructors being called, the compiler might optimize away those calls.
For everything else, the "as-if" rule applies: The compiler might optimize as it pleases, as long as the visible side effects are the same as if the compiler had not optimized at all.
("Visible side effects" include, for example, stuff written to the console or the file system, but not runtime and CPU fan speed.)
It might be optimized, yes. But you still have some control over the process. For example, consider this code:
int x = 1;
x = 1;
x = 1;
x = 1;
volatile int y = 1;
y = 1;
y = 1;
y = 1;
Provided that neither x nor y is used below this fragment, VS 2010 generates this code:
int x = 1;
x = 1;
x = 1;
x = 1;
volatile int y = 1;
010B1004 xor eax,eax
010B1006 inc eax
010B1007 mov dword ptr [y],eax
y = 1;
010B100A mov dword ptr [y],eax
y = 1;
010B100D mov dword ptr [y],eax
y = 1;
010B1010 mov dword ptr [y],eax
That is, the optimizer strips all the lines with x and leaves all four lines with y. This is how volatile works, but the point is that you still have control over what the compiler does for you.
Whether it is a class or a primitive type, it all depends on the compiler and how sophisticated its optimizations are.
Another code fragment for study:
class A
{
private:
int c;
public:
A(int b)
{
*this = b;
}
A& operator = (int b)
{
c = b;
return *this;
}
};
int _tmain(int argc, _TCHAR* argv[])
{
int b = 0;
A a = b;
a = b;
a = b;
return 0;
}
With full optimization in a release build, Visual Studio 2010 strips all of this code away: _tmain does nothing and immediately returns zero.
This depends on how class A is implemented, whether the compiler can see the implementation, and whether it is smart enough. For example, if operator=() in class A has side effects, optimizing it out would change the program's behavior and is not allowed.
Optimization does not (properly speaking) "remove calls to copies or assignments".
It converts one finite state machine into another finite state machine with the same external behaviour.
Now, if you repeatedly call
a=b; a=b; a=b;
what the compiler does depends on what operator= actually is.
If the compiler finds that a call has no chance of altering the state of the program (where the "state of the program" is everything that lives longer than a scope and that the scope can access), it will strip the call out.
If this cannot be demonstrated, the call will stay in place.
Whatever the compiler does, don't worry too much: it cannot (by contract) change the external logic of a program or of any part of it.
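As a hypothetical sketch: assignments with no visible effect may be collapsed, while assignments with a visible effect must all remain:

#include <cstdio>

struct Silent {
    int v = 0;
    Silent& operator=(int b) { v = b; return *this; } // no visible effect
};

struct Loud {
    Loud& operator=(int) { std::puts("assigned"); return *this; } // visible effect
};

int main() {
    Silent s; s = 1; s = 1; s = 1; // may collapse to a single store, or vanish
    Loud l;   l = 1; l = 1; l = 1; // must print "assigned" three times
}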
I don't know C++ all that much, but I am currently reading Compilers: Principles, Techniques, and Tools.
Here is a snippet from its section on code optimization:
the machine-independent code-optimization phase attempts to improve intermediate code so that better target code will result. Usually better means faster, but other objectives may be desired, such as shorter code, or target code that consumes less power. For example, a straightforward algorithm generates the intermediate code (1.3), using an instruction for each operator in the tree representation that comes from the semantic analyzer. A simple intermediate code generation algorithm followed by code optimization is a reasonable way to generate good target code. The optimizer can deduce that the conversion of 60 from integer to floating point can be done once and for all at compile time, so the inttofloat operation can be eliminated by replacing the integer 60 by the floating-point number 60.0. Moreover, t3 is used only once, to transmit its value to id1, so the optimizer can transform (1.3) into the shorter sequence (1.4):
(1.3)
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3

(1.4)
t1 = id3 * 60.0
id1 = id2 + t1
All in all, I mean to say that code optimization happens at a much deeper level, and since the code here is in such a simple state, it doesn't affect what your code does.
I had some trouble with const variables and const_cast: the compiler produced incorrect results when the variable was used to calculate something else. The const variable was optimized away, and its old value was turned into a compile-time constant. Truly "unexpected behavior". Okay, perhaps not ;)
Example:
#include <iostream>

int main() {
    const int x = 2;
    const_cast<int&>(x) = 3;     // undefined behaviour: writing to a const object
    int y = x * 2;               // the compiler may fold x to 2 at compile time
    std::cout << y << std::endl; // typically prints 4, not 6
}