Dealing with int arrays in c++ 2

Hi guys, could anyone explain why this program compiles correctly even though it looks a bit strange:
#include <iostream>
using namespace std;
int main()
{
int array[7]={5,7,57,77,55,2,1};
for(int i=0;i<10;i++)
cout<<i[array]<<", "; //array[i]
cout<<endl;
return 0;
}
Why does the program compile correctly?

An expression (involving fundamental types) such as this:
x[y]
is converted at compile time to this:
*(x + y)
x + y is the same as y + x
Therefore: *(x + y) is the same as *(y + x)
Therefore: x[y] is the same as y[x]

In your program, you are indexing an array beyond its bounds. This may lead to a segmentation violation, meaning that the program attempted to access memory it is not allowed to access (memory that was not allocated for the array, since the access is out of bounds). It is not guaranteed to, though: a read just past a small array often lands in memory the program owns and silently returns whatever is stored there. A segmentation violation is a runtime error, so it is not the compiler's responsibility to check for it; it is raised by the operating system once the hardware has notified it. The compiler's error-checking responsibilities cover lexical and syntactic errors, so that it can translate your code into machine code and, finally, a binary.
For more information about the segmentation violation, commonly known as a segmentation fault, look here:
http://en.wikipedia.org/wiki/Segmentation_fault

You've come across Undefined Behavior. This means that the compiler is allowed to do whatever it wants with your program -- including compiling it without warnings or errors. Furthermore, it can produce any code it wants to for the case of undefined behavior, including assuming that it does not occur (a common optimization). Accessing an array out-of-bounds is an example of undefined behavior. Signed integer overflow, data races, and invalid pointer creation/use are others.
Theoretically, the compiler could emit code that invoked the shell and performed rm -rf /* (delete every file you have permission to delete)! Of course, no reasonable compiler would do this, but you get the idea.
Simply put, a program with undefined behavior is not a valid C++ program. This is true for the entirety of the program, not just after the undefined behavior. A compiler would have been perfectly free to compile your program to a no-op.

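Adding to Benjamin Lindley's answer: compile the code below and you will see how the addresses are calculated:
#include <iostream>
using namespace std;
int main()
{
int array[7]={5,7,57,77,55,2,1};
cout<<&(array[0])<<endl;
cout<<&(array[1])<<endl;
return 0;
}
Output (for me) ;-)
0x28ff20
0x28ff24
These are just array+0 and array+1, that is, &*(array+0) and &*(array+1). The two addresses differ by sizeof(int), which is 4 bytes here.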

Related

Same array giving garbage value at one place and an unrelated value at the other place

In the following code:
#include<iostream>
using namespace std;
int main()
{
int A[5] = {10,20,30,40,50};
// Let us try to print A[5] which does NOT exist but still
cout <<"First A[5] = "<< A[5] << endl<<endl;
//Now let us print A[5] inside the for loop
for(int i=0; i<=5; i++)
{
cout<<"Second A["<<i<<"]"<<" = "<<A[i]<<endl;
}
}
Output:
The first A[5] gives one value (is that what is called a garbage value?) and the second A[5], inside the for loop, gives a different one (in this case, A[i] prints the value of i). Can anyone explain why?
Also, if I declare another variable inside the for loop, such as int sax = 100;, then A[5] takes the value 100, and I don't have the slightest clue why this is happening.
I am on Windows, using Code::Blocks with the GNU GCC compiler.

Well, you invoke Undefined Behaviour, so the behaviour is, err... undefined, and anything can happen, including what you show here.
In common implementations, the memory just past the end of an array may be used by a different object, and only implementation details of the compiler determine which one.
Here your implementation has placed the next variable (i) just after the array, so A[5] is an (invalid) accessor for i.
But please do not rely on that. Different compilers or different compilation options could give a different result. And since a compiler is free to assume that your code does not invoke UB, an optimizing compiler could just optimize out all of your code, and only you would be to blame.
TL;DR: never, ever experiment with UB: anything can happen, from consistent behaviour to an immediate crash, by way of various inconsistent outputs. And what you see may not be reproduced in a different context (where context can even be just a different run of the same code).

In your program, I think there is no syntax issue: when I run the same code with my compiler, I do not see the behaviour you describe.
It prints the same garbage value both for the direct access and inside the loop.

The problem is that when you wrote:
cout <<"First A[5] = "<< A[5] << endl<<endl;//this is Undefined behavior
In the above statement you're going out of bounds. This is because array indices start from 0 and not 1.
Since your array size is 5, you can safely access A[0], A[1], A[2], A[3] and A[4].
On the other hand, you cannot access A[5]. If you try to do so, you will get undefined behavior.
Undefined behavior means anything [1] can happen, including but not limited to the program giving your expected output. But never rely on (or draw conclusions from) the output of a program that has undefined behavior.
So the output that you're seeing is a result of undefined behavior. And as I said, don't rely on the output of a program that has UB.
So the first step to make the program correct would be to remove the UB. Then, and only then, can you start reasoning about the output of the program.
For the same reason, in your for loop you should replace i<=5 with i<5.
[1] For a more technically accurate definition of undefined behavior see this, where it is mentioned that there are no restrictions on the behavior of the program.

How are these two pieces of code different?

I tried these lines of code and got surprising output. I expect the reason is related to initialisation, either in general or in the for loop.
1.)
int i = 0;
for(i++; i++; i++){
if(i>10) break;
}
printf("%d",i);
Output - 12
2.)
int i;
for(i++; i++; i++){
if(i>10) break;
}
printf("%d",i);
Output - 1
I expected the statements "int i = 0" and "int i" to be the same. What is the difference between them?

I expected the statements "int i = 0" and "int i" to be the same.
No, that was a wrong expectation on your part. If a variable is declared outside of a function (as a "global" variable), or if it is declared with the static keyword, it's guaranteed to be initialized to 0 even if you don't write = 0. But variables defined inside functions (ordinary "local" variables without static) do not have this guaranteed initialization. If you don't explicitly initialize them, they start out containing indeterminate values.
(Note, though, that in this context "indeterminate" does not mean "random". If you write a program that uses or prints an uninitialized variable, often you'll find that it starts out containing the same value every time you run your program. By chance, it might even be 0. On most machines, what happens is that the variable takes on whatever value was left "on the stack" by the previous function that was called.)
See also these related questions:
Non-static variable initialization
Static variable initialization?
See also section 4.2 and section 4.3 in these class notes.
See also question 1.30 in the C FAQ list.
Addendum: Based on your comments, it sounds like when you fail to initialize i, the indeterminate value it happens to start out with is 0, so your question is now:
"Given the program
#include <stdio.h>
int main()
{
int i; // note uninitialized
printf("%d\n", i); // prints 0
for(i++; i++; i++){
if(i>10) break;
}
printf("%d\n", i); // prints 1
}
what possible sequence of operations could the compiler be emitting that would cause it to compute a final value of 1?"
This can be a difficult question to answer. Several people have tried to answer it, in this question's other answer and in the comments, but for some reason you haven't accepted that answer.
That answer again is, "An uninitialized local variable leads to undefined behavior. Undefined behavior means anything can happen."
The important thing about this answer is that it says that "anything can happen", and "anything" means absolutely anything. It absolutely does not have to make sense.
The second question, as I have phrased it, does not really even make sense, because it contains an inherent contradiction, because it asks, "what possible sequence of operations could the compiler be emitting", but since the program contains Undefined behavior, the compiler isn't even obliged to emit a sensible sequence of operations at all.
If you really want to know what sequence of operations your compiler is emitting, you'll have to ask it. Under Unix/Linux, compile with the -S flag. Under other compilers, I don't know how to view the assembly-language output. But please don't expect the output to make any sense, and please don't ask me to explain it to you (because I already know it won't make any sense).
Because the compiler is allowed to do anything, it might be emitting code as if your program had been written, for example, as
#include <stdio.h>
int main()
{
int i; // note uninitialized
printf("%d\n", i); // prints 0
i++;
printf("%d\n", i); // prints 1
}
"But that doesn't make any sense!", you say. "How could the compiler turn "for(i++; i++; i++) ..." into just "i++"? And the answer -- you've heard it, but maybe you still didn't quite believe it -- is that when a program contains undefined behavior, the compiler is allowed to do anything.

The difference is what you already observed. The first code initializes i, the other does not. Using an uninitialized value is undefined behaviour (UB) in C++. The compiler assumes UB does not happen in a correct program, and hence is allowed to emit code that does whatever.
Simpler example is:
int i;
i++;
The compiler knows that this i++ cannot happen in a correct program, and it does not bother to emit correct output for wrong input, hence when you run this code anything could happen.
For further reading see here: https://en.cppreference.com/w/cpp/language/ub
There is a rule of thumb that (among other things) helps to avoid uninitialized variables. It is called Almost-Always-Auto, and it suggests using auto almost always. If you write
auto i = 0;
you cannot forget to initialize i, because auto requires an initializer to be able to deduce the type.
PS: C and C++ are two different languages with different rules. Your second code is UB in C++, but I cannot answer your question for C.

Is this compiler optimization inconsistency entirely explained by undefined behaviour?

During a discussion I had with a couple of colleagues the other day I threw together a piece of code in C++ to illustrate a memory access violation.
I am currently in the process of slowly returning to C++ after a long spell of almost exclusively using languages with garbage collection and, I guess, my loss of touch shows, since I've been quite puzzled by the behaviour my short program exhibited.
The code in question is as such:
#include <iostream>
using std::cout;
using std::endl;
struct A
{
int value;
};
void f()
{
A* pa; // Uninitialized pointer
cout<< pa << endl;
pa->value = 42; // Writing via an uninitialized pointer
}
int main(int argc, char** argv)
{
f();
cout<< "Returned to main()" << endl;
return 0;
}
I compiled it with GCC 4.9.2 on Ubuntu 15.04 with -O2 compiler flag set. My expectations when running it were that it would crash when the line, denoted by my comment as "writing via an uninitialized pointer", got executed.
Contrary to my expectations, however, the program ran successfully to the end, producing the following output:
0
Returned to main()
I recompiled the code with a -O0 flag (to disable all optimizations) and ran the program again. This time, the behaviour was as I expected:
0
Segmentation fault
(Well, almost: I didn't expect a pointer to be initialized to 0.) Based on this observation, I presume that when compiling with -O2 set, the fatal instruction got optimized away. This makes sense, since no further code accesses the pa->value after it's set by the offending line, so, presumably, the compiler determined that its removal would not modify the observable behaviour of the program.
I reproduced this several times and every time the program would crash when compiled without optimization and miraculously work, when compiled with -O2.
My hypothesis was further confirmed when I added a line, which outputs the pa->value, to the end of f()'s body:
cout<< pa->value << endl;
Just as expected, with this line in place, the program consistently crashes, regardless of the optimization level, with which it was compiled.
This all makes sense, if my assumptions so far are correct.
However, where my understanding breaks somewhat is in case where I move the code from the body of f() directly to main(), like so:
int main(int argc, char** argv)
{
A* pa;
cout<< pa << endl;
pa->value = 42;
cout<< pa->value << endl;
return 0;
}
With optimizations disabled, this program crashes, just as expected. With -O2, however, the program successfully runs to the end and produces the following output:
0
42
And this makes no sense to me.
This answer mentions "dereferencing a pointer that has not yet been definitely initialized", which is exactly what I'm doing, as one of the sources of undefined behaviour in C++.
So, is this difference in the way optimization affects the code in main(), compared to the code in f(), entirely explained by the fact that my program contains UB, and thus compiler is technically free to "go nuts", or is there some fundamental difference, which I don't know of, between the way code in main() is optimized, compared to code in other routines?

Your program has undefined behaviour. This means that anything may happen. The program is not covered at all by the C++ Standard. You should not go in with any expectations.
It's often said that undefined behaviour may "launch missiles" or "cause demons to fly out of your nose", to reinforce that point. The latter is far-fetched, but the former is feasible: imagine your code runs at a nuclear launch site and the wild pointer happens to write to a piece of memory that starts global thermonuclear war.

Writing unknown pointers has always been something which could have unknown consequences. What's nastier is a currently-fashionable philosophy which suggests that compilers should assume that programs will never receive inputs that cause UB, and should thus optimize out any code which would test for such inputs if such tests would not prevent UB from occurring.
Thus, for example, given:
#include <stdint.h>
void launch_missiles(void);
uint32_t hey(uint16_t x, uint16_t y)
{
if (x < 60000)
{
launch_missiles();
return 0;
}
else
return x*y;
}
uint32_t wow(uint16_t x)
{
return hey(x,40000);
}
a 32-bit compiler could legitimately replace wow with an unconditional call to launch_missiles without regard for the value of x, since x "can't possibly" be greater than 53687 (any value beyond that would cause the calculation of x*y, which is carried out in signed int after integer promotion, to overflow). Even though the authors of C89 noted that the majority of compilers of that era would calculate the correct result in a situation like the above, the Standard doesn't impose any requirements on compilers here, so hyper-modern philosophy regards it as "more efficient" for compilers to assume programs will never receive inputs that would necessitate reliance upon such things.

Is it legal to initialize a possibly invalid reference without using it?

I would like to save typing in some loop, creating reference to an array element, which might not exist. Is it legal to do so? A short example:
#include<vector>
#include<iostream>
#include<initializer_list>
using namespace std;
int main(void){
vector<int> nn={0,1,2,3,4};
for(size_t i=0; i<10; i++){
int& n(nn[i]); // this is just to save typing, and is not used if invalid
if(i<nn.size()) cout<<n<<endl;
}
};
https://ideone.com/nJGKdW compiles and runs the code just fine (I tried locally with both g++ and clang++), but I am not sure if I can count on that.
PS: Neither gcc nor clang complains, even when compiled and run with -Wall and -g.
EDIT 2: The discussion focuses on array indexing. The real code actually uses std::list and a fragment would look like this:
std::list<int> l;
// the list contains something or not, don't know yet
const int& i(*l.begin());
if(!l.empty()) /* use i here */ ;
EDIT 3: Legal solution to what I was doing is to use iterator:
std::list<int> l;
const std::list<int>::iterator I(l.begin()); // if empty, I==l.end()
if(!l.empty()) /* use (*I) here */ ;

No, it's not legal. You are accessing the vector out of bounds in the declaration of n, and therefore your program has undefined behavior.

No, for two reasons:
1. The standard states (8.3.2):
A reference shall be initialized to refer to a valid object or function
2. std::vector::operator[] performs no bounds checking (unlike at()) and never throws, even if the index exceeds the container size. In that case, however, the behavior is undefined.
Therefore, your program is not well-formed (point 1) and invokes undefined behaviour (point 2).

I'd be surprised if this is "allowed" by the specification. However, what it does is store the address of an element that is outside the range of its allocation, which shouldn't in itself cause a problem in most cases - in extreme cases, it may overflow the pointer type, which could cause problems, I suppose.
In other words, if i is WAY outside the size of nn, it could be a problem, not necessarily saying i has to be enormous - if each element in the vector is several megabytes (or gigabytes in a 64-bit machine), you can quite quickly run into problems with address range.
But don't ask me to quote the specification - someone else will probably do that.
Edit: As per comment, since you are requesting the address of a value outside of the valid size, at least in debug builds, this may well cause the vector implementation to assert or otherwise "warn you that this is wrong".

Conditional branches

Why does this piece of code compile?
#include <iostream>
int foo(int x)
{
if(x == 10)
return x*10;
}
int main()
{
int a;
std::cin>>a;
std::cout<<foo(a)<<'\n';
}
Shouldn't the compiler give me an error like "not all code paths return a value"? What does my function return when x isn't equal to ten?

The result is undefined, so the compiler is free to choose -- you probably get what happens to sit at the appropriate stack address where the caller expects the result. Activate compiler warnings, and your compiler will inform you about your omission.

The compiler is not required to give you an error in this circumstance. Many will, some will only issue warnings. Some apparently won't notice.
This is because it's possible that your code ensures outside of this function that the condition will always be true. Therefore, it isn't necessarily bad (though it almost always is, which is why most compilers will issue at least a warning).
The specification states that flowing off the end of a function that should return a value, without returning one, is undefined behavior. A value may be returned. Or the program might crash. Or anything might happen. It's undefined.
