for loop execution within the loop condition, c++

I wanted to fill up an int array with 121 ints, from 0 to 120. What is the difference between:
for(int i = 0; i < 122; arr[i] = i, i++){} and
for(int i = 0; i < 122; i++){arr[i] = i;} ?
I checked it and, except for the warning "iteration 121u invokes undefined behavior", which I think isn't related to my question, the code compiles fine and gets the expected results.
EDIT: thanks to all who noticed the readability problem. That's true of course, but I wanted to know whether these 2 lines are interpreted differently, so I compiled both of them in C to assembly and they look identical.

None, the result will be the same.
The first construction uses a comma operator; the left side of a comma operator is sequenced before the right side, so arr[i] = i, i++ is well-defined.
The second one is easier to read, though, especially compared with the first form once someone decides to omit the {} completely:
for(int i = 0; i < 122; arr[i] = i, i++); //this ; is evil, don't write such code.
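Also, if you want to fill 121 elements (0 to 120), you should use i < 121. With i < 122 the loop also writes arr[121], which is out of bounds for a 121-element array; that is most likely what the "iteration 121u invokes undefined behavior" warning is about.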
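A minimal sketch of the comparison, assuming a 121-element std::array in place of the asker's arr (the function names and kCount are made up for illustration). Both loops use the corrected bound, so neither writes past the end:

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kCount = 121;  // values 0..120, as in the question

// Variant 1: the assignment sits in the third clause, joined by the comma operator.
std::array<int, kCount> fill_in_increment_clause() {
    std::array<int, kCount> arr{};
    for (std::size_t i = 0; i < kCount; arr[i] = static_cast<int>(i), i++) {}
    return arr;
}

// Variant 2: the assignment sits in the loop body.
std::array<int, kCount> fill_in_body() {
    std::array<int, kCount> arr{};
    for (std::size_t i = 0; i < kCount; i++) {
        arr[i] = static_cast<int>(i);
    }
    return arr;
}
```

Both functions return the same array; the only difference is where a reader has to look to find the assignment.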

The end result from both of the lines will be the same. However, the second one is better as the first one sacrifices readability for no gain.
When people read through code, they expect for loops to be written in the manner of your second line. If I were stepping through code and encountered the first line, I would have stopped for a second to look at why an empty for loop was being run, and only then realised that you are setting the variable in the for statement itself using the comma operator. It breaks the flow while reading code, so I wouldn't recommend it.

Related

is this for loop correct and infinite and why?

so i had this segment of code in my C++ test today:
for(int i=10;i++;i<15){cout<<i;}
what is this supposed to output? and why ?
Thanks!
A for loop will run until either:
its 2nd argument evaluates as 0/false.
the loop body calls break or return, or throws an exception, or otherwise exits the calling thread.
The code you have shown may or may not loop indefinitely. However, it will not loop the 5 times you might expect, because the 2nd and 3rd arguments of the for statement are reversed.
The loop should look like this:
for(int i = 10; i < 15; i++)
However, in the form you have shown:
for(int i = 10; i++; i < 15)
The loop will continue running as long as i is not 0. Depending on how the compiler interprets i++ as a loop condition, it may recognize that this will lead to overflow and just decide to ignore i and make the loop run indefinitely, or it may actually increment i and let overflow happen.
In the latter case, on every loop iteration, i will be incremented, and eventually i will overflow past the maximum value that an int can hold. At that time, i will usually wrap around to a negative value, since int is a signed type (though overflow behavior is not officially defined by the C++ standard, so the wrap is not guaranteed). Then the loop will keep running since i is not 0, incrementing i each time until it eventually reaches 0, thus breaking the loop.
In either case, the loop will end up calling cout << i a very large number of times (roughly 65 thousand times for a 16-bit int, roughly 4 billion times for a 32-bit int), or call it forever until the program is terminated.
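The wrap-around scenario can be sketched without undefined behavior by swapping in an unsigned 16-bit counter, for which wrapping is well-defined. This is an illustration only: the question uses a signed int, where the overflow is undefined, and the function names here are made up. The no-effect third clause (i < 15) is omitted.

```cpp
#include <cstdint>

// Counts how often the body of `for (T i = 10; i++; )` runs when T is an
// unsigned 16-bit type, so that wrapping past the maximum is well-defined.
std::uint32_t count_iterations_until_wrap() {
    std::uint32_t iterations = 0;
    for (std::uint16_t i = 10; i++; ) {
        ++iterations;
    }
    return iterations;
}

// The loop the test presumably intended: the body runs for i = 10..14.
int count_iterations_fixed() {
    int iterations = 0;
    for (int i = 10; i < 15; i++) {
        ++iterations;
    }
    return iterations;
}
```

The first function returns 65526 (the condition is true for the values 10 through 65535 and false once the counter has wrapped to 0); the second returns 5.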
what is this supposed to output? and why ?
That unconditionally causes a signed int overflow.
So it is undefined behavior and can do anything.
For practical purposes, with a modern optimizing compiler this loop will typically continue forever. It compiles at all because the code is syntactically correct (although very incorrect semantically).
for(int i = 10; i++; i < 15)
means: start with i equal to 10. Check if i++ is true (which it will be, since integers are convertible to booleans, with non-0 values converted to true). Proceed with loop body, on every iteration performing comparison of i and 15 (just comparing, not checking the result, this is your increment expression), incrementing i and checking if it is non-0.
Since compilers are allowed to assume that signed integer overflow never happens, i++ could never reach 0 when started from 10. As a result, an optimizing compiler will remove the check altogether and turn this into an infinite loop.
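To see which clause plays which role, here is a small sketch that records the order in which the parts of a for statement are evaluated. The trace letters and the function name are made up for illustration.

```cpp
#include <string>

// Records the evaluation order of a for statement:
// I = init, C = condition (2nd clause), B = body, S = step (3rd clause).
std::string trace_for_loop() {
    std::string trace;
    int n = 0;
    for (trace += 'I'; (trace += 'C', n < 2); trace += 'S', ++n) {
        trace += 'B';
    }
    return trace;
}
```

The function returns "ICBSCBSC": the second clause is tested before every iteration and decides whether the loop continues, while the third clause is merely evaluated after each body and its value is discarded.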
Last but not least, learn to love compiler warnings. In particular, this code produces the following:
<source>:4:29: warning: for increment expression has no effect [-Wunused-value]
for (int i = 10; i++; i < 15) {
~~^~~~
<source>:4:23: warning: iteration 2147483637 invokes undefined behavior [-Waggressive-loop-optimizations]
for (int i = 10; i++; i < 15) {
~^~
<source>:4:23: note: within this loop
for (int i = 10; i++; i < 15) {
^~

Does the upper bound in a for loop need to be saved in an extra variable?

I see that often in older code
DWORD num = someOldListImplementation.GetNum();
for (DWORD i = 0; i < num; i++)
{
someOldListImplementation.Get(i);
}
rather than
for (DWORD i = 0; i < someOldListImplementation.GetNum(); i++)
{
someOldListImplementation.Get(i);
}
I guess the first implementation is meant to prevent calls to GetNum() on each cycle. But are there cases where a C++11 compiler optimizes the second code snippet so that the num variable becomes unnecessary?
Should I always prefer the first implementation?
If this question is an exact duplicate of another one, tell me and close this one. No problem.
The C++ (C++11 in this case) standard doesn't seem to leave much room for this. It states explicitly in 6.5.3 The for statement /1 that (my emphasis):
the condition specifies a test, made before each iteration, such that the loop is exited when the condition becomes false;
However, as stated in 1.9 Program execution /1:
Conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.
This provision is sometimes called the "as-if" rule, because an implementation is free to disregard any requirement of this International Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program.
So, if an implementation can ascertain that the return value from GetNum() will not change during the loop, it can quite freely optimise all but the first call out of existence.
Whether an implementation can ascertain that depends on a great many things, and that number of things seems to expand with each new iteration of the standard. The things I'm thinking of are volatility, access from multiple threads, constexpr and so on.
You may well have to work rather hard to notify the compiler that it is free to do this (and, even then, it's not required to do so).
The only problem I can see with the first code sample you have is that num may exist for longer than is necessary (if its reason for existence is only to manage this loop). However, this can easily be fixed by explicitly restricting the scope, such as with this small change:
{
DWORD num = someOldListImplementation.GetNum();
for (DWORD i = 0; i < num; i++)
{
someOldListImplementation.Get(i);
}
}
// That num is no longer in existence.
or by incorporating it into the for statement itself:
for (DWORD num = someOldListImplementation.GetNum(), i = 0; i < num; ++i)
{
someOldListImplementation.Get(i);
}
// That num is no longer in existence.
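A minimal sketch of the difference, assuming a hypothetical CountingList class that stands in for someOldListImplementation and counts its own GetNum() calls (none of these names come from the original code):

```cpp
#include <cstddef>

// Hypothetical stand-in for someOldListImplementation: it counts how often
// GetNum() is called so the difference between the two loop forms is visible.
class CountingList {
public:
    explicit CountingList(std::size_t size) : size_(size) {}
    std::size_t GetNum() const { ++calls_; return size_; }
    std::size_t Calls() const { return calls_; }
private:
    std::size_t size_;
    mutable std::size_t calls_ = 0;
};

// GetNum() in the condition: evaluated before every iteration, plus once more
// for the final test that ends the loop.
std::size_t calls_with_bound_in_condition(std::size_t size) {
    CountingList list(size);
    for (std::size_t i = 0; i < list.GetNum(); ++i) {}
    return list.Calls();
}

// GetNum() hoisted into the init clause: evaluated exactly once.
std::size_t calls_with_hoisted_bound(std::size_t size) {
    CountingList list(size);
    for (std::size_t i = 0, num = list.GetNum(); i < num; ++i) {}
    return list.Calls();
}
```

For a list of 5 elements the first function reports 6 calls and the second reports 1. Because the call count is observable here, the as-if rule does not let the compiler drop the extra calls; with a side-effect-free GetNum() that the compiler can see through, it might.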
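The exact answer to my question is to use
for (DWORD i = 0, num = x.GetNum(); i < num; ++i) // DWORD rather than auto: with auto, i (int) and num (DWORD) would deduce different types, which does not compile
{
}
because
it prevents calls to GetNum() on each cycle
the variable num is not exposed outside of the loop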
Credits go to user Nawaz!
I also used ++i instead of i++. Reason: https://www.quora.com/Why-doesnt-it-matter-if-we-use-I++-or-++I-for-a-for-loop

Should I define constants for repeating literals in the code?

I have an example code like this, in which the literal 1 repeats several times.
foo(x - 1);
y = z + 1;
bar[1] = y;
Should I define a constant ONE, and replace the literals with it?
constexpr int ONE = 1;
foo(x - ONE);
y = z + ONE;
bar[ONE] = y;
Would this replacement make any performance improvement and/or reduce machine code size in the favor of reducing code readability? Would the number of repeating of the literal change the answer?
It will not bring you any performance or memory improvements. However, you should try to keep your code free of magic numbers. So, if a constant is repeated in several places in your code, and in all those places it means the same thing from a logical point of view, it would be better to make it a named constant.
Example:
const int numberOfParticles = 10; // This is just an example; it's better not to use global variables.
void processParticlesPair(int i, int j) {
    for (int iteration = 0; iteration < 10; ++iteration) {
        // Note that I didn't replace "10" in the line above, because it is not numberOfParticles
        // but a number of iterations, so it is a different constant from a logical point of view.
        // Do stuff
    }
}
void displayParticles() {
    for (int i = 0; i < numberOfParticles; ++i) {
        for (int j = 0; j < numberOfParticles; ++j) {
            if (i != j) {
                processParticlesPair(i, j);
            }
        }
    }
}
Depends. If you just have 1s in your code and you ask if you should replace them: DON'T. Keep your code clean. You will not get any performance or memory advantages; even worse, you might increase build time.
If the 1, however, is a build-time parameter: yes, please introduce a constant! But choose a better name than ONE!
Should I define a constant ONE, and replace the literals with it?
No, absolutely not. If you have a name that indicates the meaning of the number (e.g. NumberOfDummyFoos), if its value can change and you want to prevent having to update it in a dozen locations, then you can use a constant for that, but a constant ONE adds absolutely no value over a literal 1.
Would this replacement make any performance improvement and/or reduce machine code size in the favor of reducing code readability?
In any realistic implementation, it does not.
Replacing literals with named constants only makes sense if the meaning of the constant is special. Replacing 1 with ONE is just overhead in most cases and does not add any useful information for the reader, especially if it is used for different purposes (an index, part of a calculation, etc.). If entry 1 of an array is somehow special, using a constant THE_SPECIAL_INDEX=1 would make sense.
For the compiler it usually does not make any difference.
In assembly, one constant value generally takes the same amount of memory as any other. Setting a constant value in your source code is more of a convenience for humans than an optimization.
In this case, using ONE in such a way is neither a performance nor a readability enhancement. That's probably why you've never seen it in source code before ;)
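A short sketch of the distinction the answers draw, using a made-up table layout in which row 0 is a header (kFirstDataRow and both function names are hypothetical):

```cpp
#include <array>
#include <cstddef>

// Hypothetical layout: row 0 holds a header, data starts at row 1.
constexpr std::size_t kFirstDataRow = 1;

// Here the name explains why index 1 is used.
int first_data_value(const std::array<int, 4>& column) {
    return column[kFirstDataRow];
}

// Here 1 is just arithmetic; a constant named ONE would add nothing.
int predecessor(int x) {
    return x - 1;
}
```

The named constant documents intent and gives one place to change if the layout changes; the literal in predecessor has no such meaning to document.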

Calculations inside the `for (...)` statement

A lot of times I see code like:
int s = a / x;
for (int i = 0; i < s; i++)
// do something
If inside the for loop, neither a nor x is modified, can I then simply write:
for (int i = 0; i < a / x; i++)
// do something
and then assume that the compiler optimizes a/x, i.e replaces it with a constant?
The most important part of int s = a / x is the variable name. It gives your syntax semantics, and lets you remember 12 months later why you were dividing one thing by another. You can't name the expression in the for statement, so you lose that self-documenting nature.
const auto monthlyAmount = (int)yearlyAmount / numberOfMonths;
for (auto i = 0; i < monthlyAmount; ++i)
// do something
In this way, extracting the variable isn't for a compiler optimization, it's a human maintainability optimization.
If the compiler can be sure that the variables used in the expression in the middle of your for loop will not change between iterations, it can optimize the calculation to be performed once at the beginning of the loop, instead of every iteration.
However, consider that the variables used are global variables, or references to variables external to the function, and in your for loop you call a function. The function could change these variables. If the compiler is able to see enough of the code at that point, it could find out if this is the case to decide whether to optimize. However, compilers are only willing to look so far (otherwise things would take even longer to compile), so in general you cannot assume the optimization is performed.
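A minimal sketch of that situation, assuming globals g_a and g_x in place of the asker's a and x and a hypothetical shrink() function called from the loop body:

```cpp
// Stand-ins for the asker's a and x, made global so that code called from the
// loop body is able to modify them.
int g_a = 12;
int g_x = 3;

// Hypothetical function called from the loop body; it changes g_a.
void shrink() { g_a -= 3; }

// Bound re-evaluated before every iteration, so it sees the updated g_a.
int iterations_with_bound_in_condition() {
    g_a = 12;
    int iterations = 0;
    for (int i = 0; i < g_a / g_x; i++) {
        shrink();
        ++iterations;
    }
    return iterations;
}

// Bound computed once up front; later changes to g_a do not affect it.
int iterations_with_hoisted_bound() {
    g_a = 12;
    const int s = g_a / g_x;
    int iterations = 0;
    for (int i = 0; i < s; i++) {
        shrink();
        ++iterations;
    }
    return iterations;
}
```

The first loop runs 2 times and the second runs 4 times, so the two forms are not interchangeable here. Unless the compiler can prove that nothing in the body touches the operands, it has to keep re-evaluating the bound.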
The concern for optimization probably stems from the fact that the condition is evaluated before each iteration. If this is a potentially expensive operation and you don't need to do it over and over again, you can extract it out of the loop:
const std::size_t size = s.size(); // just an example
for (std::size_t i = 0; i < size; ++i)
{
}
For inexpensive operations this is probably a premature optimization and the compiler might generate the same code. The only way to be sure is to check the generated assembly code.
The problem with such questions is that they cannot be generalized. Which optimizations the compiler will and will not perform can only be determined by a case-by-case analysis.
I'd certainly expect the compiler to do this if one of the following holds true:
1) Both a and x are local variables whose addresses are never taken.
2) The code in the loop is completely inlined.
In practice the last requirement isn't as hard as it looks, because if the functions in the body cannot be inlined, their runtime will likely dwarf the time to re-compute the bound anyway.

assignment operation confusion

What is the output of the following code:
#include <stdio.h>
int main() {
    int k = (k = 2) + (k = 3) + (k = 5);
    printf("%d", k);
}
It does not give any error, why? I think it should give error because the assignment operations are on the same line as the definition of k.
What I mean is int i = i; cannot compile.
But it compiles. Why? What will be the output and why?
int i = i compiles because 3.3.1/1 (C++03) says
The point of declaration for a name is immediately after its complete declarator and before its initializer
So i is initialized with its own indeterminate value.
However the code invokes Undefined Behaviour because k is being modified more than once between two sequence points. Read this FAQ on Undefined Behaviour and Sequence Points
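If the intent was simply to add up the assigned values 2, 3 and 5, one well-defined way to write it is to give each assignment its own statement. This is only a sketch under that assumption; the function name is made up.

```cpp
// Each assignment is its own full expression, so k is modified only once per
// statement and the result is well-defined.
int sum_of_assigned_values() {
    int k = 0;
    int sum = 0;
    sum += (k = 2);
    sum += (k = 3);
    sum += (k = 5);
    k = sum;
    return k;
}
```

Written this way the result is 10 on every conforming compiler, unlike the single-expression version, whose result is not specified at all.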
int i = i; first defines the variable and then assigns a value to it. In C the compiler lets you read from an uninitialized variable. It's never a good idea, since the value is indeterminate, and some compilers will issue a warning message, but the code compiles.
And in C, assignments are also expressions. The output will be "10", or it would be if you had a 'k' there, instead of an 'a'.
Wow, I got 11 too. I think k is getting assigned to 3 twice and then once to 5 for the addition. Making it just int k = (k=2)+(k=3) yields 6, and int k = (k=2)+(k=4) yields 8, while int k = (k=2)+(k=4)+(k=5) gives 13. int k = (k=2)+(k=4)+(k=5)+(k=6) gives 19 (4+4+5+6).
My guess? The addition is done left to right. The first two (k=x) expressions are added, and the result is stored in a register or on the stack. However, since it is k+k for this expression, both values being added are whatever k currently is, which is the value from the second expression because it is evaluated after the other (overriding its assignment to k). After this initial add, the result is stored elsewhere, so it is now safe from tampering (changing k will not affect it). Moving from left to right, each successive addition reassigns k (not affecting the running sum) and adds k to the running sum.