Value of variable changes - c++

In the code below, I am trying to store the value of the array at index 0 in the temp variable. In the line a[i-1]=a[i]-a[i-1];, when i=0, a[i-1] becomes a[-1].
Why is the compiler not giving any error?
Why is the value of the temp variable affected, becoming zero after the first iteration, even though it is assigned a value only when i=0 and temp is not used anywhere else?
For example, when I gave input as:
3 1 2 3
Output:
i:0
a[0]: 1
TEMP: 1
TEMP: 0
TEMP: 0
TEMP: 0
What's actually happening? Please explain with reference to how the compiler works. I know that if I put a condition if(i!=0) a[i-1]=a[i]-a[i-1]; the code will work normally, but I want to know why this is happening in the given scenario.
#include <bits/stdc++.h>
using namespace std;
int main()
{
    int a[10], i, n, temp;
    cin >> n;
    for (i = 0; i < n; i++) {
        cin >> a[i];
        if (i == 0) {
            temp = a[i];
            cout << "i: " << i << endl;
            cout << "a[0]: " << a[i] << endl;
        }
        cout << "TEMP: " << temp << endl;
        a[i-1] = a[i] - a[i-1];
    }
    cout << endl << "TEMP: " << temp;
}

Why is the compiler not giving any error?
The compiler is not required to give any error. Accessing an array out of bounds has undefined behaviour. It might seem obvious that the array is going to be accessed out of bounds at run time (then again, perhaps not so obvious, since the author of the program didn't catch it before running it), but it would be prohibitively expensive for the compiler to analyse execution paths in search of bugs in general.
Why is the value of the temp variable affected, becoming zero after the first iteration, even though it is assigned a value only when i=0 and temp is not used anywhere else?
Because the behaviour of the program is undefined.
It is usually pointless to analyse why a program behaves in some particular way when it is allowed to behave in any possible way. In this case, however, the most likely explanation is this: when you write out of bounds, you overwrite memory that isn't part of the array. Other variables may be located in that memory, so overwriting it may corrupt the value of some other variable. That is what you observed.
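For illustration, here is a minimal sketch (not the original code) of how the same loop could avoid the out-of-bounds write: either guard the index, as the question already suggests, or use std::vector with .at(), which throws std::out_of_range instead of silently corrupting a neighbouring variable such as temp.
#include <iostream>
#include <vector>

int main()
{
    int n;
    std::cin >> n;
    std::vector<int> a(n);
    int temp = 0;
    for (int i = 0; i < n; i++) {
        std::cin >> a.at(i);                  // at() checks the index at run time
        if (i == 0)
            temp = a.at(i);
        if (i != 0)                           // guard: a[-1] is never touched
            a.at(i - 1) = a.at(i) - a.at(i - 1);
        std::cout << "TEMP: " << temp << '\n';
    }
    std::cout << '\n' << "TEMP: " << temp << '\n';
}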

Why is the compiler not giving any error?
As #user2079303 already mentioned, such an error is too difficult for the compiler to find. There are separate tools for finding these more complicated errors: static and dynamic code analyzers. For example, the default static code analyzer in VS 2015 gives the following warnings:
Severity Code Description
Warning C4701 potentially uninitialized local variable 'temp' used
Warning C6385 Reading invalid data from 'a': the readable size is '40' bytes, but '-4' bytes may be read.
Warning C6386 Buffer overrun while writing to 'a': the writable size is '40' bytes, but '-4' bytes might be written.
Warning C6001 Using uninitialized memory 'temp'.
If you want extra safety in your projects, consider enabling all (or almost all) available warnings and running code analysis regularly.
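With GCC or Clang, a similar effect can be achieved by turning on extra warnings and the run-time sanitizers; a rough example (exact flags and output vary by compiler version):
g++ -Wall -Wextra -g -fsanitize=address,undefined program.cpp
./a.out    # AddressSanitizer reports the out-of-bounds write the moment a[-1] is touched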

Related

How is size allocated for an array in C++?

If we declare an array with 4 elements, for example:
int array[4];
can we assign values like this, given that the loop also takes 4 values:
for(int i = 5; i < 9; i++){
    cin >> array[i];
}
Out-of-bounds access on an array has undefined behaviour, which is another way of saying "unintended consequences":
int a[4];
int b[4];
for(int i = 5; i < 9; i++){
    a[i] = i;
}
In a debugger, watch what it's doing, and in particular watch what happens to b.
This may or may not crash, but it's still broken code. C++ will not always alert you to such situations; it's your responsibility as a developer to be aware of what is and isn't allowed when accessing certain structures.
Accessing an array out of bounds doesn't always cause a crash, but it is always problematic. Try with i = 999999 or i = -9 and see what happens.
The problem with undefined behaviour is that it may appear to be working but these unintended consequences eventually catch up with you. This makes debugging your code very difficult as this out-of-bounds write may stomp a variable that you need somewhere else minutes or hours after the initial mistake, and then your program crashes. Those sorts of bugs are the most infuriating to fix since the time between the cause and effect is extremely long.
It's the same as how throwing lit matches in the garbage may not cause a fire every time but when it does cause a fire you may not notice until it's too late. In C++ you must be extremely vigilant about not introducing undefined behaviour into your code.
You are mixing up two constructs - the logic used to iterate in a loop and the indexing of an array.
You may use
for(int i = 5; i < 9; i++){
    ...
}
to run the loop four times. However, you may not use those values of i to access the array. The array index has to be offset appropriately so that it is valid.
for(int i = 5; i < 9; i++){
    int index = i - 5;          // map the loop values 5..8 onto the valid indices 0..3
    std::cin >> array[index];
}
No, you get an array with 4 slots. Those slots are
array[0]
array[1]
array[2]
array[3]
So your code is incorrect. It might seem to work, but it's still wrong due to what's called Undefined Behavior; next week it might fail.
Note: you are better off using std::vector in C++.
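For instance, a small sketch of the same input loop using std::vector, which grows on demand so the loop counter never has to double as a valid array index:
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> array;          // starts empty and grows as values arrive
    for (int i = 5; i < 9; i++) {    // the loop bounds no longer matter for storage
        int value;
        std::cin >> value;
        array.push_back(value);      // always writes into a valid element
    }
    std::cout << "stored " << array.size() << " values\n";
}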
Can we assign values like this, since it also takes 4 values:
for(int i = 5; i < 9; i++){
    cin >> array[i];
}
No, we can't. Since the size of the array is 4, the only indices that can be accessed are 0, 1, 2 and 3. Accessing any other index has undefined behaviour.
Try to run it in any online compiler; it works.
The behaviour is undefined. Possible behaviours include, none of which are guaranteed:
- working
- not working
- random output
- non-random output
- the expected output
- unexpected output
- no output
- any output
- crashing at random
- crashing always
- not crashing
- corruption of data
- different behaviour, when executed on another system
- , when compiled with another compiler
- , on Tuesday
- , only when you are not looking
- same behaviour in all of the above cases
- anything else within the power of the computer (hopefully limited by the OS)

What do the memory operations malloc and free exactly do?

Recently I ran into a memory release problem. The C code is below:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int *p = (int*) malloc(5 * sizeof(int));
    int i;
    for (i = 0; i < 5; i++)
        p[i] = i;
    p[i] = i;                  /* writes p[5], one element past the allocated block */
    for (i = 0; i < 6; i++)
        printf("[%p]:%d\n", p + i, p[i]);
    free(p);
    printf("The memory has been released.\n");
}
Apparently, there is a memory-out-of-range problem here. When I use the VS2008 compiler, it gives the following output and some errors about the memory release:
[00453E80]:0
[00453E84]:1
[00453E88]:2
[00453E8C]:3
[00453E90]:4
[00453E94]:5
However, when I use the gcc 4.7.3 compiler under Cygwin, I get the following output:
[0x80028258]:0
[0x8002825c]:1
[0x80028260]:2
[0x80028264]:3
[0x80028268]:4
[0x8002826c]:51
The memory has been released.
Apparently, the code runs normally, but 5 is not written to that memory.
So there may be some differences between VS2008 and gcc in how they handle these problems.
Could you give me a professional explanation of this? Thanks in advance.
This is normal, as you never allocated any data in the memory space of p[5]. The program just prints whatever data happened to be stored in that space.
There's no deterministic "explanation on this". Writing data into the uncharted territory past the allocated memory limit causes undefined behavior. The behavior is unpredictable. That's all there is to it.
It is still strange, though, to see that 51 printed there. Typically GCC will also print 5 but then fail with a memory corruption message at free. How you managed to make this code print 51 is not exactly clear. I strongly suspect that the code you posted is not the code you ran.
It seems that you have multiple questions, so let me try to answer them separately:
As pointed out by others above, you write past the end of the array, so once you have done that you are in "undefined behavior" territory, and this means that anything could happen, including printing 5, 6 or 0xdeadbeef, or blowing up your PC.
In the first case (VS2008), free appears to report an error message on standard output. It is not obvious to me what this error message is, so it is hard to explain what is going on, but you ask later in a comment how VS2008 could know the size of the memory you release. Typically, if you allocate memory and store it in pointer p, many memory allocators (the malloc/free implementation) store at p[-1] the size of the memory allocated. In practice, it is common to also store at address p[p[-1]] a special value (say, 0xdeadbeef). This "canary" is checked upon free to see whether you have written past the end of the array. To summarize, your 5*sizeof(int) array is probably at least 5*sizeof(int) + 2*sizeof(char*) bytes long, and the memory allocator used by code compiled with VS2008 has quite a few checks built in.
In the case of gcc, I find it surprising that you get 51 printed. If you wanted to investigate why that is exactly, I would recommend getting an asm dump of the generated code as well as running this under a debugger to check whether 5 is actually written past the end of the array (gcc could well have decided not to generate that code, because it is "undefined") and, if it is, putting a watchpoint on that memory location to see who overrides it, when, and why.
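As a rough illustration of the bookkeeping idea described above (the size stored just before the block and a canary just after it, checked at free time), here is a toy sketch; real allocators such as the VS2008 debug heap or glibc malloc are considerably more elaborate, so treat this only as a model:
#include <cstdio>
#include <cstdlib>
#include <cstring>

static const unsigned kCanary = 0xDEADBEEF;    // toy end-of-block marker

void* toy_malloc(size_t n)
{
    // Layout: [size_t size][n user bytes][canary]
    char* raw = (char*) malloc(sizeof(size_t) + n + sizeof(kCanary));
    memcpy(raw, &n, sizeof(size_t));
    memcpy(raw + sizeof(size_t) + n, &kCanary, sizeof(kCanary));
    return raw + sizeof(size_t);
}

void toy_free(void* p)
{
    char* raw = (char*) p - sizeof(size_t);
    size_t n;
    memcpy(&n, raw, sizeof(size_t));
    unsigned c;
    memcpy(&c, raw + sizeof(size_t) + n, sizeof(kCanary));
    if (c != kCanary)                          // an overrun clobbered the marker
        fprintf(stderr, "toy_free: buffer overrun detected\n");
    free(raw);
}

int main()
{
    int* p = (int*) toy_malloc(5 * sizeof(int));
    p[5] = 5;                                  // one element past the end
    toy_free(p);                               // reports the corruption
}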

Buffer array overflow in for loop in C

When would a program crash in a buffer overrun case?
#include <stdio.h>
#include <stdlib.h>
int main() {
    char buff[50];
    int i = 0;
    for (i = 0; i < 100; i++)
    {
        buff[i] = i;
        printf("buff[%d]=%d\n", i, buff[i]);
    }
}
What will happen to the first 50 bytes assigned, and when would the program crash?
I see on my Ubuntu machine with gcc that a.out crashes when i is 99:
buff[99]=99
*** stack smashing detected ***: ./a.out terminated
Aborted (core dumped)
I would like to know why it does not crash when the assignment happens at buff[51] in the for loop.
It is undefined behavior. You can never predict when (or even whether) it crashes, but you cannot rely on it "not crashing" when writing an application.
Reasoning
The rationale is that there is no compile-time or run-time "index out of bounds" checking on C arrays; that is present in STL vectors, or in arrays in other higher-level languages. So when your program accesses memory beyond the allocated range, what happens depends on whether it merely corrupts another field on your program's stack, affects the memory of another program, or something else, so one can never predict a crash; it occurs only in extreme cases. The program crashes only in a state that forces the OS to intervene, or when it is no longer possible for it to continue functioning correctly.
Example
Say you were inside a function call, and immediately next to your array was the RETURN address, i.e. the address your program uses to return to the function it was called from. Suppose you corrupted that; your program then tries to return to the corrupted value, which is not a valid address, and so it would crash in such a situation.
The worst case is when you silently modify another field's value and never discover what went wrong, because no crash occurred.
Since it seems you have allocated the buffer on the stack, the app will possibly crash on the first occasion you overwrite an instruction that is to be executed, possibly somewhere in the code of the for loop... at least that's how it's supposed to be in theory.
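As a side note, the "stack smashing detected" message is produced by GCC's stack-protector machinery, which places a canary next to the saved return address and only checks it when the function returns; AddressSanitizer can flag the very first out-of-bounds write instead. Illustrative commands (flag defaults vary by distribution; the file name is just an example):
gcc -fstack-protector-all overflow.c && ./a.out     # aborts at return, when the canary check runs
gcc -g -fsanitize=address overflow.c && ./a.out     # reports the first write past buff[49]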

Initialize a variable

Is it better to declare and initialize the variable or just declare it?
What's the best and the most efficient way?
For example, I have this code:
#include <stdio.h>
int main()
{
    int number = 0;
    printf("Enter with a number: ");
    scanf("%d", &number);
    if (number < 0)
        number = -number;
    printf("The modulo is: %d\n", number);
    return 0;
}
If I don't initialize number, the code works fine, but I want to know: is it faster, better, or more efficient? Is it good to initialize the variable?
scanf can fail, in which case nothing is written to number. So if you want your code to be correct you need to initialize it (or check the return value of scanf).
The speed of incorrect code is usually irrelevant, but for your example code, if there is a difference in speed at all, I doubt you would ever be able to measure it. Setting an int to 0 is much faster than I/O.
Don't attribute speed to a language; that attribute belongs to implementations of the language. There are fast implementations and slow implementations. There are optimisations associated with fast implementations; a compiler that produces well-optimised machine code would optimise the initialisation away if it can deduce that the initialisation isn't needed.
In this case, it actually does need the initialisation. Consider what happens if scanf were to fail. When scanf fails, its return value reflects this failure. It will return either:
A value less than zero if there was a read error or EOF (which can be triggered in an implementation-defined way, typically CTRL+Z on Windows and CTRL+d on Linux),
A number less than the number of objects provided to scanf (since you've provided only one object, this failure return value would be 0) when a conversion failure occurs (for example, entering 'a' on stdin when you've told scanf to convert sequences of '0'..'9' into an integer),
The number of objects scanf managed to assign to. This is 1, in your case.
Since you aren't checking any of these return values (particularly #3), your compiler can't deduce that the initialisation is unnecessary and hence can't optimise it away. When the variable is uninitialised, failure to check these return values results in undefined behaviour. A chicken might appear to be living even when it is missing its head. It would be best to check the return value of scanf: that way, when your variable is uninitialised you avoid using an uninitialised value, and when it is initialised your compiler can optimise the initialisation away, presuming you handle erroneous return values by producing error messages rather than using the variable.
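For example, a small sketch of the same program with the return value checked, so that number is only ever read after a successful conversion (and an explicit initialisation genuinely becomes redundant):
#include <stdio.h>

int main(void)
{
    int number;                              /* deliberately left uninitialised      */
    printf("Enter with a number: ");
    if (scanf("%d", &number) != 1) {         /* conversion failure or EOF: bail out  */
        fprintf(stderr, "Invalid input.\n");
        return 1;
    }
    if (number < 0)
        number = -number;
    printf("The modulo is: %d\n", number);
    return 0;
}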
edit: On that topic of undefined behaviour, consider what happens in this code:
if (number < 0)
    number = -number;
If number is -32768, and INT_MAX is 32767, then section 6.5, paragraph 5 of the C standard applies because -(-32768) isn't representable as an int.
Section 6.5, paragraph 5 says:
If an exceptional condition occurs during the evaluation of an
expression (that is, if the result is not mathematically defined or
not in the range of representable values for its type), the behavior
is undefined.
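A minimal sketch of guarding against that exceptional case before negating, assuming the usual two's-complement limits from <limits.h>:
#include <limits.h>
#include <stdio.h>

int main(void)
{
    int number = INT_MIN;                    /* the one value whose negation overflows */
    if (number < 0) {
        if (number == INT_MIN) {             /* -INT_MIN is not representable as int   */
            fprintf(stderr, "Cannot negate INT_MIN.\n");
            return 1;
        }
        number = -number;
    }
    printf("%d\n", number);
    return 0;
}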
Suppose you don't initialize a variable and your code is buggy (e.g. you forgot to read number). Then the uninitialized value of number is garbage, and different runs will output (or behave) differently.
But if you initialize all of your variables, the program will produce a consistent result: an easy-to-trace error.
Yes, initialization will add an extra step in your code at a low level, for example mov $0, 28(%esp). But it's a one-time cost and doesn't kill your code's efficiency.
So always initializing is good practice!
With modern compilers, there isn't going to be any difference in efficiency. Coding style is the main consideration. In general, your code is more self-explanatory and less likely to have mistakes if you initialize all variables upon declaring them. In the case you gave, though, since the variable is effectively initialized by the scanf, I'd consider it better not to have a redundant initialization.
First, you need to answer these questions:
1) How many times is this function called? If you call it 10,000,000 times, it's a good idea to have the best version.
2) If I don't initialize my variable, am I sure that my code is safe and won't throw any exception?
Beyond that, an int initialization doesn't change much in your code, but a string initialization does.
Be sure that you do all the checks, because if you have an uninitialized variable your program is potentially buggy.
I can't tell you how many times I've seen simple errors because a programmer doesn't initialize a variable. Just two days ago there was another question on SO where the end result of the issue being faced was simply that the OP didn't initialize a variable and thus there were problems.
When you talk about "speed" and "efficiency", don't simply consider how much faster the code might compile or run (and in this case it's pretty much irrelevant anyway), but consider your debugging time when there's a simple mistake in the code due to the fact that you didn't initialize a variable that very easily could have been.
Note also, in my experience, when coding for larger corporations they will run your code through tools like Coverity or Klocwork, which will ding you for uninitialized variables because they present a security risk.

Related to strings

//SECTION I:
void main()
{
    char str[5] = "12345"; //---a)
    char str[5] = "1234";  //---b)
    cout << "String is: " << str << endl;
}
Output: a) Error: Array bounds Overflow.
b) 1234
//SECTION II:
void main()
{
    char str[5];
    cout << "Enter String: ";
    cin >> str;
    cout << "String is: " << str << endl;
}
I tried many different input strings and, to my surprise, got strange results:
Case I: Input String: 1234, Output: 1234 (No issue, as this is the expected behavior)
Case II: Input String: 12345, Output: 12345 (NO error reported by the compiler, but I was expecting an Error: Array bounds Overflow.)
Case III: Input String: 123456, Output: 123456 (NO error reported by the compiler, but I was expecting an Error: Array bounds Overflow.)
.................................................
.................................................
Case VI: Input String: 123456789, Output: 123456789 (Error: unhandled exception. Access Violation.)
My doubt is: when I assigned more characters than its capacity in SECTION I, the compiler reported ERROR: Array bounds Overflow.
But when I try the same thing in SECTION II, I am not getting any errors. WHY is that so? Please note: I executed this on Visual Studio.
char str[5] = "12345";
This is a compile-time error: you are assigning a string of length 6 (mind the appended null terminator) to an array of size 5.
char str[5];
cin>>str;
This may yield a runtime error. Depending on how long the string you enter is, the buffer you provide (size 5) may be too small (again, mind the null terminator).
The compiler of course can't check your user input at runtime. If you're lucky, you're notified of an access violation like this by a segmentation fault. Truly, anything can happen.
Throwing exceptions on access violations is not mandatory. To address this, you can implement array bounds checking yourself, or alternatively (probably better) use a container class that adapts its size as necessary (std::string):
std::string str;
cin >> str;
What you are seeing is undefined behavior. You are writing to the array out of bounds; anything might happen in that case (including seeing the output you expect).
I tried with many different input strings, and to my surprise, I got strange results:
This phenomenon is called undefined behaviour (UB).
As soon as you enter more characters than the char array can hold, you invite UB.
Sometimes it may work, sometimes it may not, and sometimes it may crash. In short, there is no definite pattern.
[Side note: if a compiler allows void main() to compile, then it's not standard compliant.]
It's because in the second case, the compiler can't know; only at runtime is there a bug. C/C++ offers no runtime bounds checking by default, so the error is not recognized but "breaks" your program. However, this breakage doesn't have to show up immediately: str points to a place in memory with only a fixed number of bytes reserved, and you just write more bytes anyway.
Your program's behavior will be undefined. Usually, and also in your particular case, it may continue working and you'll have overwritten some other component's memory. Once you write far enough to touch forbidden memory (or free the same memory twice), your program crashes.
char str[5] = "12345" - in this case you didn't leave room for the null terminator. So when the application tries to print str, it goes on and on as long as a null is not encountered, eventually stepping on forbidden memory and crashing.
In the cin case, the cin operation stuffs a 0 at the end of the string, so this stops the program from going too far in memory.
However, from my experience, when breaking the rules with a memory overrun, things go crazy, and looking for a reason why it works in one case but not in another doesn't lead anywhere. More often than not, the same application (with a memory overrun) will work on one PC and crash on another, due to the different memory states on them.
Because the compiler does a static check. In your SECTION II, the size is unknown in advance; it depends on your input length.
May I suggest you use an STL string?
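For example, a sketch of SECTION II rewritten with std::string, whose size adjusts to whatever is typed:
#include <iostream>
#include <string>

int main()
{
    std::string str;                          // grows to fit the input, no fixed buffer
    std::cout << "Enter String: ";
    std::cin >> str;
    std::cout << "String is: " << str << std::endl;
}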