Related
I tried these lines of code and found out shocking output. I am expecting some reason related to initialisation either in general or in for loop.
1.)
int i = 0;
for(i++; i++; i++){
if(i>10) break;
}
printf("%d",i);
Output - 12
2.)
int i;
for(i++; i++; i++){
if(i>10) break;
}
printf("%d",i);
Output - 1
I expected the statements "int i = 0" and "int i" to be the same.What is the difference between them?
I expected the statements "int i = 0" and "int i" to be the same.
No, that was a wrong expectation on your part. If a variable is declared outside of a function (as a "global" variable), or if it is declared with the static keyword, it's guaranteed to be initialized to 0 even if you don't write = 0. But variables defined inside functions (ordinary "local" variables without static) do not have this guaranteed initialization. If you don't explicitly initialize them, they start out containing indeterminate values.
(Note, though, that in this context "indeterminate" does not mean "random". If you write a program that uses or prints an uninitialized variable, often you'll find that it starts out containing the same value every time you run your program. By chance, it might even be 0. On most machines, what happens is that the variable takes on whatever value was left "on the stack" by the previous function that was called.)
See also these related questions:
Non-static variable initialization
Static variable initialization?
See also section 4.2 and section 4.3 in these class notes.
See also question 1.30 in the C FAQ list.
Addendum: Based on your comments, it sounds like when you fail to initialize i, the indeterminate value it happens to start out with is 0, so your question is now:
"Given the program
#include <stdio.h>
int main()
{
int i; // note uninitialized
printf("%d\n", i); // prints 0
for(i++; i++; i++){
if(i>10) break;
}
printf("%d\n", i); // prints 1
}
what possible sequence of operations could the compiler be emitting that would cause it to compute a final value of 1?"
This can be a difficult question to answer. Several people have tried to answer it, in this question's other answer and in the comments, but for some reason you haven't accepted that answer.
That answer again is, "An uninitialized local variable leads to undefined behavior. Undefined behavior means anything can happen."
The important thing about this answer is that it says that "anything can happen", and "anything" means absolutely anything. It absolutely does not have to make sense.
The second question, as I have phrased it, does not really even make sense, because it contains an inherent contradiction, because it asks, "what possible sequence of operations could the compiler be emitting", but since the program contains Undefined behavior, the compiler isn't even obliged to emit a sensible sequence of operations at all.
If you really want to know what sequence of operations your compiler is emitting, you'll have to ask it. Under Unix/Linux, compile with the -S flag. Under other compilers, I don't know how to view the assembly-language output. But please don't expect the output to make any sense, and please don't ask me to explain it to you (because I already know it won't make any sense).
Because the compiler is allowed to do anything, it might be emitting code as if your program had been written, for example, as
#include <stdio.h>
int main()
{
int i; // note uninitialized
printf("%d\n", i); // prints 0
i++;
printf("%d\n", i); // prints 1
}
"But that doesn't make any sense!", you say. "How could the compiler turn "for(i++; i++; i++) ..." into just "i++"? And the answer -- you've heard it, but maybe you still didn't quite believe it -- is that when a program contains undefined behavior, the compiler is allowed to do anything.
The difference is what you already observed. The first code initializes i the other does not. Using an unitialized value is undefined behaviour (UB) in c++. The compiler assumes UB does not happen in a correct program, and hence is allowed to emit code that does whatever.
Simpler example is:
int i;
i++;
Compiler knows that i++ cannot happen in a correct program, and the compiler does not bother to emit correct output for wrong input, hece when you run this code anything could happen.
For further reading see here: https://en.cppreference.com/w/cpp/language/ub
The is a rule of thumb that (among other things) helps to avoid uninitialized variables. It is called Almost-Always-Auto, and it suggests to use auto almost always. If you write
auto i = 0;
You cannot forget to initialize i, because auto requires an initialzer to be able to deduce the type.
PS: C and C++ are two different languages with different rules. Your second code is UB in C++, but I cannot answer your question for C.
I have an example code like this, in which the literal 1 repeats several times.
foo(x - 1);
y = z + 1;
bar[1] = y;
Should I define a constant ONE, and replace the literals with it?
constexpr int ONE = 1;
foo(x - ONE);
y = z + ONE;
bar[ONE] = y;
Would this replacement make any performance improvement and/or reduce machine code size in the favor of reducing code readability? Would the number of repeating of the literal change the answer?
It will not bring you any performance/memory improvements. However, you should try to keep your code clean from magical numbers. So, if there is a repeated constant in your code in several places, and in all those places this constant is the same from logical point of view, it would be better to make it a named constant.
Example:
const int numberOfParticles = 10; //This is just an example, it's better not to use global variables.
void processParticlesPair(int i, int j) {
for (int iteration = 0; iteration < 10; ++iteration) {
//note, that I didn't replace "10" in the line above, because it is not a numberOrParticles,
//but a number of iterations, so it is a different constant from a logical point of view.
//Do stuff
}
}
void displayParticles() {
for (int i = 0; i < numberOfParticles; ++i) {
for (int j = 0; j < numberOfParticles; ++j) {
if (i != j) {
processParticlesPair(i, j);
}
}
}
}
Depends. If you just have 1s in your code and you ask if you should replace them: DONT. Keep your code clean. You will not have any performance or memory advantages - even worse, you might increase build time
If the 1, however, is a build-time parameter: Yes, please introduce a constant! But choose a better name than ONE!
Should I define a constant ONE, and replace the literals with it?
No, absolutely not. If you have a name that indicates the meaning of the number (e.g. NumberOfDummyFoos), if its value can change and you want to prevent having to update it in a dozen locations, then you can use a constant for that, but a constant ONE adds absolutely no value over a literal 1.
Would this replacement make any performance improvement and/or reduce machine code size in the favor of reducing code readability?
In any realistic implementation, it does not.
Replacing literals with named constants make only sense,
if the meaning of the constant is special. Replacing 1 with ONE is
just overhead in most cases, and does not add any useful information
to the reader, especially if it is used in different functions (index, part of a calculation etc.). If the entry 1 of an array is somehow special, using a constant THE_SPECIAL_INDEX=1 would make sense.
For the compiler it usually does not make any difference.
In assembly, one constant value generally takes the same amount of memory as any other. Setting a constant value in your source code is more of a convenience for humans than an optimization.
In this case, using ONE in such a way is neither a performance or readability enhancement. That's why you've probably never seen it in source code before ;)
This question already has answers here:
Is Loop Hoisting still a valid manual optimization for C code?
(8 answers)
Closed 7 years ago.
My question is specifically about the gcc compiler.
Often, inside a loop, I have to use a value returned by a function that is constant during the whole loop.
I would like to know if it's better to prior store this constant return value inside a variable (let's imagine a long loop), or if a compiler like gcc is able to perform some optimizations to cache the constant value, because it would recognize it as constant threw the loop.
For example when I loop over chars in a string, I often write something like that:
bool find_something(string s, char something)
{
size_t sz = s.size();
for (size_t i = 0; i != sz; i++)
if (s[i] == something) return true;
return false;
}
but with a clever compiler, I could use the following (which is shorter and more clear):
bool find_something(string s, char something)
{
for (size_t i = 0; i != s.size(); i++)
if (s[i] == something) return true;
return false;
}
then the compiler could detect that the code inside the loop doesn't perform any change on the string object, and then would build a code that would cache the value returned by s.size(), instead of making a (slower) function call for each iteration.
Is there such optimization with gcc?
Generally there's nothing in your example that makes it impossible for the compiler to move the .size() computation before the loop. And in fact GCC 5.2.0 will produce exactly the same code for both implementations that you showed.
I would however strongly suggest against relying on optimizations like that (in really performance critical code) because a small change somewhere (GCCs optimizer, the implementation details of std::string, ...) could break GCCs ability to do this optimization.
However I don't see a point in writing the more verbose version in the usual 90% of code that's not really performance-critical.
Given a current C++ compiler though I would go for the even more concise:
bool find_something(std::string s, char something)
{
for (ch : s)
if (ch == something) return true;
return false;
}
Which BTW also yields very similar machine code with GCC 5.2.0.
The compiler has to know the object is not modified in a different thread. It can tell that the function won't change if the object doesn't change, but it can't tell that the object won't change from some other stimulus.
The compiler will unwind the call to the member with size if you include some form of whole-program optimization
It depends on whether or not the compiler can identify the function call as constant. Consider the following function that may reside in an external library which cannot be analyzed by the compiler.
int odd_size(string s) {
static int a = 0;
return a++;
}
This function will return distinct values regardless of the input parameter. The compiler can therfore not assume a constant return value even if the passed string object remains constant. No optimization will be applied.
On the other hand, if the compiler detects a constant function call, which may be the case in your example, it probably moves the constant expression out of the loop.
Older versions of gcc had an explicit option -floop-optimize which was responsible for that task. From the gcc-3.4.5 documentation:
-floop-optimize
Perform loop optimizations: move constant expressions out of loops, simplify exit test conditions and optionally do strength-reduction and loop unrolling as well.
Enabled at levels -O, -O2, -O3, -Os.
I cannot find this option in current versions of gcc, but I'm quite sure that they include this type of optimization as well.
A lot of times I see code like:
int s = a / x;
for (int i = 0; i < s; i++)
// do something
If inside the for loop, neither a nor x is modified, can I then simply write:
for (int i = 0; i < a / x; i++)
// do something
and then assume that the compiler optimizes a/x, i.e replaces it with a constant?
The most important part of int s = a / x is the variable name. It gives your syntax semantics, and lets you remember 12 months later why you were dividing one thing by another. You can't name the expression in the for statement, so you lose that self-documenting nature.
const auto monthlyAmount = (int)yearlyAmount / numberOfMonths;
for (auto i = 0; i < monthlyAmount; ++i)
// do something
In this way, extracting the variable isn't for a compiler optimization, it's a human maintainability optimization.
If the compiler can be sure that the variables used in the expression in the middle of your for loop will not change between iterations, it can optimize the calculation to be performed once at the beginning of the loop, instead of every iteration.
However, consider that the variables used are global variables, or references to variables external to the function, and in your for loop you call a function. The function could change these variables. If the compiler is able to see enough of the code at that point, it could find out if this is the case to decide whether to optimize. However, compilers are only willing to look so far (otherwise things would take even longer to compile), so in general you cannot assume the optimization is performed.
The concern for optimization probably stems from the fact that the condition is evaluated before each iteration. If this is a potentially expensive operation and you don't need to do it over and over again, you can extract it out of the loop:
const std::size_t size = s.size(); // just an example
for (std::size_t i = 0; i < size; ++i)
{
}
For inexpensive operations this is probably a premature optimization and the compiler might generate the same code. The only way to be sure is to check the generated assembly code.
The problem with such Questions is that they cannot be generalized. What optimizations the Compiler will perform and what not can only be determined by a case by case analysis.
I'd certainly expect the compiler to do this if one of the following holds true:
1) Both, A and B are local variables, whose addresses are never taken.
2) The code in the loop is completely inlined.
In practice the last requirement isn't as hard as it looks, because if the functions in the body cannot be inlined, their runtime will likely dwarf the time to re-compute the bound anyway
Let's look at the following piece of code which I unintentionally wrote:
void test (){
for (int i = 1; i <=5; ++i){
float newNum;
newNum +=i;
cout << newNum << " ";
}
}
Now, this is what I happened in my head:
I have always been thinking that float newNum would create a new variable newNum with a brand-new value for each iteration since the line is put inside the loop. And since float newNum doesn't throw a compile error, C++ must be assigning some default value (huhm, must be 0). I then expected the output to be "1 2 3 4 5". What was printed was "1 3 6 10 15".
Please help me know what's wrong with my expectation that float newNum would create a new variable for each iteration?
Btw, in Java, this piece of code won't compile due to newNum not initialized and that's probably better for me since I would know I need to set it to 0 to get the expected output.
Since newNum is not initialized explicitly, it will have a random value (determined by the garbage data contained in the memory block it is allocated to), at least on the first iteration.
On subsequent iterations, it may have its earlier values reused (as the compiler may allocate it repeatedly to the same memory location - this is entirely up to the compiler's discretion). Judging from the output, this is what actually happened here: in the first iteration newNum had the value 0 (by pure chance), then 1, 3, 6 and 10, respectively.
So to get the desired output, initialize the variable explicitly inside the loop:
float newNum = 0.0;
C++ must be assigning some default
value (huhm, must be 0)
This is the mistake in your assumptions. C++ doesn't attempt to assign default values, you must explicitly initialise everything.
Most likely it will assign the same location in memory each time around the loop and so (in this simple case) newNum will probably seem to persist from each iteration to the next.
In a more complicated scenario the memory assigned to newNum would be in an essentially random state and you could expect weird behaviour.
http://www.cplusplus.com/doc/tutorial/variables/
The float you are creating is not initialised at all. Looks like you got lucky and it turned out to be zero on the first pass, though it could have had any value in it.
In each iteration of the loop a new float is created, but it uses the same bit of memory as the last one, so ended up with the old value that you had.
To get the effect you wanted, you will need to initialise the float on each pass.
float newNum = 0.0;
The mistake in thoughts you've expressed is "C++ must be assigning some default value". It will not. newNum contains dump.
You are using an uninitialized automatic stack variable. In each loop iteration it is located at the same place on the stack, so event though it has an undefined value, in your case it will be the value of the previous iteration.
Also beware that in the first iteration it could potentialliay have any value, not only 0.0.
You might have got your expected answer in a debug build but not release as debug builds sometimes initialise variables to 0 for you. I think this is undefined behaviour - because C++ doesn't auto initialise variables for you, every time around the loop it is creating a new variable but it keeps using the same memory as it was just released and doesn't scrub out the previous value. As other people have said you could have ended up with complete nonsense printing out.
It is not a compile error to use an uninitialised variable but there should usually be a warning about it. Always good to turn warnings on and try to remove all of them incase something nastier is hidden amongst them.
Using garbage value in your code invokes Undefined Behaviour in C++. Undefined Behavior means anything can happen i.e the behavior of the code is not defined.
float newNum; //uninitialized (may contain some garbage value)
newNum +=i; // same as newNum=newNum+i
^^^^^
Whoa!!
So better try this
float newNum=0; //initialize the variable
for (int i = 1; i <=5; ++i){
newNum +=i;
cout << newNum << " ";
}
C++ must be assigning some default value (huhm, must be 0).
C++ doesn't initialize things without default constructor (it may set it to something like 0xcccccccc on debug build, though) - because as any proper tool, compiler "thinks" that if you haven't provided initialization then it is what you wanted. float doesn't have default constructor, so it is unknown value.
I then expected the output to be "1 2 3 4 5". What was printed was "1 3 6 10 15".
Please help me know what's wrong with my expectation that float newNum would create a new variable for each iteration?
Variable is a block of memory. In this variable is allocated on stack. You didn't initialize it, and each iteration it just happen to be placed on the same memory address, which is why it stores previous value. Of course, you shouldn't rely on such behavior. If you want value to persist across iterations, declare it outside of loop.
Btw, in Java, this piece of code won't compile due to newNum not initialized
BTW, in C++ normal compiler would give you a warning that variable is not initialized (Example: "warning C4700: uninitialized local variable 'f' used"). And on debug build you would get crt debug error (Example: "Run-Time check failure #3 - The variable 'f' is being used without being initialized").
and that's probably better
Negative. You DO need uninitialized variables from time to time (normally - to initialize them without "standard" assignment operator - by passing into function by pointer/reference, for example), and forcing me to initialize every one of them will be a waste of my time.
I don't do too much C++ since last month, but:
The float values is newly allocated for each iteration. I'm a bit surprised about the initial zero value, though. The thing is, after each iteration, the float value runs out of scope, the next step (the reenter of the loop scope) first reallocates the float memory and this will often return the same memory block that was just freed.
(I'm waiting for any bashing :-P)