Confusing output? - c++

What I fail to understand is why the code prints '3' in VS2010 (release build), whether I leave the declaration of 'r' in or comment it out.
#include <iostream>
using std::cout;

int main() {
    int arr1[2];
    int &r = arr1[0];
    int arr2[2];
    cout << (&arr1[1] - &arr2[0]);
}
So, three questions:
a. why does the code print 3?
b. why does it print 3 even if the declaration of 'r' is present? (Is it because, in C++, whether a reference occupies storage or not is implementation-defined?)
c. Does this code have undefined behavior or implementation defined behavior?

Because in a Release build the variable r is removed. An unused variable of built-in type is removed because a Release build is compiled with optimizations. Try to use it later, and the result may change. Some variables may be placed in CPU registers rather than on the stack, which also changes the distance between other local variables.
On the other hand, an unused class instance is not removed, because creating a class instance may have side effects: its constructor is invoked.
The result depends on the implementation, because the compiler is free to place variables wherever it sees fit; formally, subtracting pointers into two different arrays is undefined behavior.

a. The order of the variables in memory is:
arr2[0]
arr2[1]
arr1[0]
arr1[1]
The code prints 3 because it's using pointer arithmetic. Subtracting &arr2[0] from &arr1[1] gives a difference of 3 ints.
b. Since r is never referenced, the C++ compiler is free to optimize it out.
c. I'm not positive, but I don't believe the C++ standard defines an explicit order for variables on the stack. The compiler is therefore free to reorder these variables, even putting extra space between them as it sees fit. So, yes, it's implementation-specific. A different compiler could just as easily have given -1 as the answer.

&arr1[1] - &arr2[0]
Pointer arithmetic is only well-defined within the same array. It does not matter what you think this code snippet does or should do, you are invoking undefined behavior. Your program could do anything.
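For contrast, here is a minimal sketch of the well-defined case, where both pointers point into the same array. The function name element_distance is made up for illustration and is not part of the original question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance, in elements, between two elements of the SAME array.
// Subtracting pointers is only defined when both point into one array
// (or one past its end).
std::ptrdiff_t element_distance() {
    int arr[4] = {10, 20, 30, 40};
    int* first = &arr[0];
    int* last = &arr[3];
    std::ptrdiff_t d = last - first;  // well-defined: both pointers are into arr
    assert(d == 3);
    return d;
}
```

The result is a count of elements, not bytes, which is why the original program printed a small number such as 3.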

How function is able to read the value of user input variable from main function without passing the value as param? [duplicate]

From "C++ Primer" by Lippman,
When we define a variable, we should give it an initial value unless we are certain that the initial value will be overwritten before the variable is used for any other purpose. If we cannot guarantee that the variable will be reset before being read, we should initialize it.
What happens if an uninitialized variable is used in say an operation? Will it crash/ will the code fail to compile?
I searched the internet for an answer but found differing 'claims'. Hence the following questions:
Will C and C++ standards differ in how they treat an uninitialized variable?
Regarding similar queries, how and where can I find an 'official' answer? Is it practical for an amateur to look up the C and C++ standards?
Q.1) What happens if an uninitialized variable is used in say an operation? Will it crash/ will the code fail to compile?
Many compilers try to warn you about code that improperly uses the value of an uninitialized variable. Many compilers have an option that says "treat warnings as errors". So depending on the compiler you're using and the option flags you invoke it with and how obvious it is that a variable is uninitialized, the code might fail to compile, although we can't say that it will fail to compile.
If the code does compile, and you try to run it, it's obviously impossible to predict what will happen. In most cases the variable will start out containing an "indeterminate" value. Whether that indeterminate value will cause your program to work correctly, or work incorrectly, or crash, is anyone's guess. If the variable is an integer and you try to do some math on it, you'll probably just get a weird answer. But if the variable is a pointer and you try to indirect on it, you're quite likely to get a crash.
It's often said that uninitialized local variables start out containing "random garbage", but that can be misleading, as evidenced by the number of people who post questions here pointing out that, in their program where they tried it, the value wasn't random, but was always 0 or was always the same. So I like to say that uninitialized local variables never start out holding what you expect. If you expected them to be random, you'll find that (at least on any given day) they're repeatable and predictable. But if you expect them to be predictable (and, god help you, if you write code that depends on it), then by jingo, you'll find that they're quite random.
Whether use of an uninitialized variable makes your program formally undefined turns out to be a complicated question. But you might as well assume that it does, because it's a case you want to avoid just as assiduously as you avoid any other dangerous, imperfectly-defined behavior.
See this old question and this other old question for more (much more!) information on the fine distinctions between undefined and indeterminate behavior in this case.
Q.2) Will C and C++ standards differ in how they treat an uninitialized variable?
They might differ. As I alluded to above, and at least in C, it turns out that not all uses of uninitialized local variables are formally undefined. (Some are merely "indeterminate".) But the passages quoted from the C++ standards by other answers here make it sound like it's undefined there all the time. Again, for practical purposes, the question probably doesn't matter, because as I said, you'll want to avoid it no matter what.
Q.3) Regarding similar queries, how and where can I find an 'official' answer? Is it practical for an amateur to look up the C and C++ standards?
It is not always easy to obtain copies of the standards (let alone official ones, which often cost money), and the standards can be difficult to read and to properly interpret, but yes, given effort, anyone can obtain, read, and attempt to answer questions using the standards. You might not always make the correct interpretation the first time (and you may therefore need to ask for help), but I wouldn't say that's a reason not to try. (For one thing, anyone can read any document and end up not making the correct interpretation the first time; this phenomenon is not limited to amateur programmers reading complex language standards documents!)
C++
The C++ Standard, [dcl.init], paragraph 12 [ISO/IEC 14882-2014], states the following:
If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced. If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:
(end quote)
So using an uninitialized variable will result in undefined behavior.
Undefined behavior means anything[1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior. The program may give your expected output or it may crash.
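A minimal sketch of two ways to stay clear of this: value-initialize the variable, or make sure every path assigns it before it is read. Both function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: returns a value-initialized int.
int zero_initialized() {
    int x{};          // value-initialization: x is 0
    assert(x == 0);
    return x;
}

// Hypothetical helper: the variable is assigned on every path before it is read.
int assigned_before_read(bool flag) {
    int y;            // indeterminate here, but never read in this state
    if (flag) {
        y = 1;
    } else {
        y = 2;
    }
    return y;         // y has a definite value on both paths
}
```

The second form is legal but relies on the reader checking every path, which is the burden the Primer quote warns about.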
C
The C Standard, 6.7.9, paragraph 10, specifies [ISO/IEC 9899:2011]
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.
Uninitialized automatic variables and dynamically allocated memory have indeterminate values, which for objects of some types can be a trap representation. Reading such trap representations is undefined behavior.
[1] For a more technically accurate definition of undefined behavior see this, where it is mentioned that there are no restrictions on the behavior of the program.
What happens if an uninitialized variable is used in say an operation?
It depends. If the operation uses the value of the variable, and the type of the variable and the expression aren't excepted from it, then the behaviour of the program is undefined.
If the value isn't used, as in the case of the sizeof operator, then nothing happens that wouldn't also happen with an initialised variable.
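As a small sketch of the sizeof case (the function name is an assumption made for this example): the operand of sizeof is not evaluated, so the indeterminate value is never read.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: sizeof does not evaluate its operand,
// so the indeterminate value of x is never read.
std::size_t size_without_reading() {
    int x;                       // deliberately left uninitialized
    std::size_t n = sizeof x;    // unevaluated operand: only the type matters
    assert(n == sizeof(int));
    return n;
}
```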
Will C and C++ standards differ in how they treat an uninitialized variable?
They use different wording, but are essentially quite similar in this regard.
C defines the undefined behaviour of indeterminate values through the concept of "trap representation" and specifies types that are guaranteed to not have trap representations.
C++ doesn't define the concept of "trap representation", but rather lists exceptional cases where producing an indeterminate value doesn't result in undefined behaviour. These cases have some overlap with the exceptional types in C, but aren't exactly the same.
Regarding similar queries, how and where can I find an 'official' answer?
The official answer - if there is one - is always in the language standard document.
The standard says it is undefined.
However, on a Unix- or VMS-based system (GNU/Linux, UNIX, BSD, MS-Windows XP/NT or later, Mac OS X or later) fresh stack and heap memory is typically initialised to zero (this is done for security reasons, not to make your code work).
However, if you go up and down the stack, or free and then malloc, the data will be random rubbish. (There may be other causes of random rubbish. Don't rely on undefined behaviour.)
Could the program crash? (By this, you mean detect error at run-time.)
Probably not, but again this is undefined behaviour. A C interpreter may do this.
Note also, some C++ types have a constructor that does well-defined initialisation.
You have tagged both C and C++. In C, an uninitialized variable probably has junk bits. Often your compiler will put zero bits there, but you cannot count on it. So if you use that variable without explicitly initializing it, the result may be sensible or it may not. And strictly speaking this is undefined behavior, so anything at all may happen.
C++ has the same for simple variables, but there is an interesting exception: while an int x[3] contains junk, std::vector<int> x(3) contains zeros.
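The vector case can be sketched as follows; the function names are hypothetical. The second function shows that a built-in array can also be zeroed, but only if you ask for it with an initializer.

```cpp
#include <cassert>
#include <vector>

// std::vector<int>(n) value-initializes its elements, so each one is 0.
std::vector<int> three_zeros() {
    std::vector<int> v(3);
    assert(v.size() == 3);
    return v;
}

// A built-in array is zeroed only when given an (empty) initializer.
int sum_of_zeroed_array() {
    int a[3] = {};   // all three elements are 0
    return a[0] + a[1] + a[2];
}
```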

What are the dangers of uninitialised variables?

In a program I am writing I currently have several uninitialised variables in my .h files, all of which are initialised at run-time. However, Visual Studio warns me every time I do this to "Always initialise a member variable", however pointless it seems to do so. I am well aware that attempting to use a variable while it is uninitialised leads to undefined behaviour, but as far as I know, this can be avoided by simply not doing so. Am I overlooking something?
Thanks.
These variables could contain any value if you don't initialize them, and reading them in an uninitialized state is undefined behavior (except if they are zero-initialized).
And if you forget to initialize one of them, and reading from it by accident happens to yield the value you expect on your current system configuration (due to undefined behavior), then your program might behave unpredictably after a system update, on a different system, or when you change your code.
And these kinds of errors are hard to debug. So even if you set them at runtime it is suggested to initialize them to known values so that you have a controlled environment with predictable behavior.
There are a few exceptions, e.g. if you set the variable right after you declared it and you can't set it directly, like if you set its value using a streaming operator.
You have not included the source, so we have to guess why it happens. I can see several possible reasons, each with a different solution (other than just zero-initializing everything):
You don't initialize at the start of the constructor, but combine member initialization with other code that calls functions on the not fully initialized object. That's a mess, and you never know when one function will call another that uses a non-initialized member. If you really need this, don't pass the entire object, only the parts you need (this might require more refactoring).
You have the initialization in an Init function. Use delegating constructors (available since C++11), where one constructor calls another, instead.
You don't initialize some members in the constructor, but only later. If you really don't want to initialize such a member up front, hold that data through a pointer (or std::unique_ptr) and create it when needed, or don't keep it in the object at all.
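The constructor-related suggestions above can be sketched roughly as follows. The class Connection and all of its members are invented for illustration; the asker's real class is not shown in the question.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class, for illustration only.
class Connection {
public:
    // Delegates to the two-argument constructor instead of calling an Init function.
    Connection() : Connection("localhost", 80) {}

    Connection(std::string host, int port)
        : host_(std::move(host)), port_(port) {
        assert(port_ > 0);
    }

    const std::string& host() const { return host_; }
    int port() const { return port_; }
    int retries() const { return retries_; }

private:
    std::string host_;
    int port_ = 0;      // default member initializer: never indeterminate
    int retries_ = 3;   // applies in every constructor that does not override it
};
```

With default member initializers in place, a constructor added later cannot leave a member indeterminate by omission, which is likely what the Visual Studio warning is trying to enforce.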
Not allowing uninitialized variables is a safety measure, which is a good thing. But if you are sure of what you are doing and you make sure your variables are always initialized before use, you can turn this off: right-click on your project in Solution Explorer -> Properties -> C/C++ -> SDL checks, and set it to NO. It is set to YES by default.
Note that these compile-time checks do more than just check for uninitialized variables, so before you turn this off I advise reading https://learn.microsoft.com/en-us/cpp/build/reference/sdl-enable-additional-security-checks?view=vs-2019
You can also disable a specific warning in your code using a warning pragma.
Personally I keep these on because IMO in the tradeoff safety/annoyance I prefer safety, but I reckon that someone else can have a different opinion.
There are two parts to this question: first, is reading uninitialized variables dangerous? And second, is defining variables uninitialized dangerous even if I make sure I never access them while uninitialized?
What are the dangers of accessing uninitialized variables?
With very few exceptions, accessing an uninitialized variable makes the whole program have Undefined Behavior. There is a common misconception (which unfortunately is taught) that uninitialized variables have "garbage values" and so reading an uninitialized variable will result in reading some value. This is completely false. Undefined Behavior means the program can have any behavior: it can crash, it can behave as if the variable has some value, it can pretend the variable doesn't even exist, or it can show all sorts of weird behaviors.
For instance:
void foo();
void bar();

void test(bool cond)
{
    int a; // uninitialized
    if (cond)
    {
        a = 24;
    }
    if (a == 24)
    {
        foo();
    }
    else
    {
        bar();
    }
}
What is the result of calling the above function with true? What about with false?
test(true) will clearly call foo().
What about test(false)? If you answer: "Well it depends on what garbage value is in variable a, if it is 24 it will call foo, else it will call bar" Then you are completely wrong.
If you call test(false) the program accesses an uninitialized variable and has Undefined Behavior, it is an illegal path and so the compilers are free to assume cond is never false (because otherwise the program would be illegal). And surprise surprise both gcc and clang with optimizations enabled actually do this and generate this assembly for the function:
test(bool):
jmp foo()
So don't do this! Never access an uninitialized variable! It is undefined behavior, and that is much, much worse than "the variable has some garbage value". Furthermore, it could work as you expect on your system, yet behave in unexpected ways on other systems or with other compiler flags.
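For comparison, a sketch of the same function with the variable initialized. To keep the example self-contained, foo() and bar() are replaced by a hypothetical enum that records which branch was taken; Called and test_fixed are not names from the original code.

```cpp
#include <cassert>

// Hypothetical stand-in for calling foo() or bar(): records which branch ran.
enum class Called { Foo, Bar };

Called test_fixed(bool cond) {
    int a = 0;               // initialized, so both paths below are well-defined
    if (cond) {
        a = 24;
    }
    assert(cond || a == 0);  // when cond is false, a keeps its initial value
    return (a == 24) ? Called::Foo : Called::Bar;
}
```

Now the compiler has no license to assume cond is always true, and test_fixed(false) reliably takes the bar branch.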
What are the dangers of defining uninitialized variables if I make sure I always initialize them later, before accessing them?
Well, the program is correct in this respect, but the source code is prone to errors. You have to burden yourself with always checking whether you actually initialized the variable somewhere. And if you did forget to initialize a variable, finding the bug will be difficult, because you have a lot of variables in your code that are defined uninitialized.
By contrast, if you always initialize your variables, you and the programmers after you have a much easier job and peace of mind.
It's just a very very good practice.

What happens when I do int*p=p in c/cpp?

The code below compiles with MinGW. How does it compile? How is it possible to assign from a variable that has not yet been created?
int main()
{
int*p=p;
return 0;
}
How does it get compiled?
The point of declaration of a variable starts at the end of its declarator, but before its initialiser. This allows more legitimate self-referential declarations like
void * p = &p;
as well as undefined initialisations like yours.
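A minimal sketch of the legitimate form, wrapped in a hypothetical function so it can be checked: the initializer uses only the address of p, never its indeterminate value.

```cpp
#include <cassert>

// Hypothetical helper: returns true when a pointer initialized with its own
// address really holds that address.
bool points_to_itself() {
    void* p = &p;   // p is in scope in its own initializer; only its address is used
    bool same = (p == static_cast<void*>(&p));
    assert(same);
    return same;
}
```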
How is it possible to assign a variable which is not yet created?
There is no assignment here, just initialisation.
The variable has been created (in the sense of having storage allocated for it), but not initialised. You initialise it from whatever indeterminate value happened to be in that storage, with undefined behaviour.
Most compilers will give a warning or error about using uninitialised values, if you ask them to.
Let's take a look at what happens with the int*p=p; statement:
The compiler allocates space on the stack to hold the yet uninitialized value of variable p
Then the compiler initializes p with its uninitialized value
So, essentially there should be no problem with the code except that it assigns a variable an uninitialized value.
Actually, it is not much different from the following code:
int *q; // define a pointer and do not initialize it
int *p = q; // initialize another pointer with the value of the uninitialized pointer
The likely result ("what it compiles to") will be the declaration of a pointer variable that is not initialized at all (which is subsequently optimized out since it is not used, so the net result would be "empty main").
The pointer is declared and initialized. So far, this is an ordinary and legal thing. However, it is initialized to itself, and its value is only in a valid, initialized state after the end of the statement (that is, at the location of the semicolon).
This, unsurprisingly, makes the statement undefined behavior.
By definition, invoking undefined behavior could in principle cause just about anything (although the often-quoted dramatic effects, like formatting your hard drive or setting the computer on fire, are exaggerated).
The compiler might actually generate an instruction that moves a register (or memory location) to itself, which would be a no-op on most architectures, but could cause a hardware exception that kills your process on some exotic architectures that have special validating registers for pointers (in case the "random" value happens to be an invalid address).
The compiler will however not insert any "format harddisk" statements.
In practice, optimizing compilers will nowadays often assume "didn't happen" when they encounter undefined behavior, so it is most likely that the compiler will simply honor the declaration, and do nothing else.
This is, in every sense, perfectly allowable in the light of undefined behavior. Further, it is the easiest and least troublesome option for the compiler.

Yet another question related to sequence points

Yes, I read the article on sequence points. However, I could not understand why ++i = 2 would invoke undefined behavior. The final value of i would be 2 regardless of anything, so how come the expression is UB?
code snippet
int main()
{
int i =0;
++i=2;
return 0;
}
Sorry, my English is not very good.
It looks obvious to you, because obviously i will first be assigned i+1, then second be assigned the value 2.
However, both of these assignments happen between the same pair of sequence points, so it's up to the compiler which happens first and which happens second. Different compiler implementations can therefore generate code that gives different results, so it's UB.
You observe that value will be what you claim, that's how UB can manifest itself among other possible scenarios. The program might output what you expect, output some unrelated data, crash, corrupt data or spend all your money ordering pizza. Once C++ standard says that some construct is UB you should not expect any specific behavior. Observed results can vary from one program run to another.
The undefined behavior occurs because a compiler could implement the following code:
++i = 2;
as either:
i = 2;
++i;
or
++i;
i = 2;
The language does not specify which; a compiler could choose to implement either of the above. The first would produce 3 and the second 2. So it's undefined.
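Whatever a given language revision says about ++i = 2, the two orderings described above can each be written as separate statements, which fixes the order. A short sketch, with function names made up for the example:

```cpp
#include <cassert>

// Increment first, then overwrite: two statements, so the order is fixed.
int increment_then_assign() {
    int i = 0;
    ++i;              // i is 1
    assert(i == 1);
    i = 2;            // i is 2
    return i;
}

// Assign first, then increment.
int assign_then_increment() {
    int i = 0;
    i = 2;            // i is 2
    ++i;              // i is 3
    return i;
}
```

Each statement ends at a sequence point, so there is no room for the reordering described above.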
Calling ++i = 2; does not in and of itself invoke undefined behaviour; any compiler can, if it wants, take a perfectly well-defined action upon reaching that code. However, the C++ standard states that such an operation is undefined, so a compiler may do something unexpected (like delete all the files on the C drive or send a text message to the pope) and still be a compliant compiler. The only thing that makes this UB is that the standard says it is UB.
Perhaps the most important point is that one version of a compiler may do something different from the next version of the same compiler.
From the exact same link you provided:
Furthermore, the prior value shall be accessed only to determine the value to be stored.
What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.
Here on the left hand side of operator =, the access to i is not involved in the computation of the value written.
++i (should be) an rvalue, and hence, can't be used as a lvalue, but (++i) = 2; should work fine. I don't believe this is UB, but, as always, I might be wrong.