Consider the following piece of code. This function reads some integers and strings from a file.
const int vardo_ilgis = 10;
void skaityti(int &n, int &m, int &tiriama, avys A[])
{
ifstream fd("test.txt");
fd >> n >> m >> tiriama;
fd.ignore(80, '\n');
char vard[vardo_ilgis]; // <---
for(int i = 1; i <= n; i++)
{
cout << i << ' ';
fd.get(vard, vardo_ilgis+1); // <---
cout << i << endl;
A[i].vardas = vard;
getline(fd, A[i].DNR);
}
fd.close();
}
and input:
4 6
4
Baltukas TAGCTT
Bailioji ATGCAA
Doli AGGCTC
Smarkuolis AATGAA
In this case, the variable 'vard' has length vardo_ilgis = 10, but fd.get is told the buffer size is vardo_ilgis+1 = 11 (larger than the array the data is stored in). I'm not asking how to fix the problem, because it's obvious that you should not read more than a variable can hold.
However, I really want to understand the reason for this behaviour: the loop counter gets decreased by fd.get. Why and how can this even happen? This is the output of this little piece of code:
1 0
1 0
1 0
1 0
1 1
2 2
3 3
4 4
Why did you use +1 ??
fd.get(vard, vardo_ilgis+1);
Overrunning that buffer corrupts some memory. In a simple unoptimized build, that corrupted memory could be the loop index.
the loop count variable gets decreased by fd.get. Why and how even can this happen?
Once you know why you have caused undefined behavior, many people say you aren't supposed to inquire into the details of that undefined behavior. I disagree. By understanding the details, you can improve your ability to diagnose other situations where you don't know what undefined behavior you might have invoked.
All your local variables are stored together, so overwriting one will tend to clobber another.
You describe the variable being "decreased" when in fact it was set to zero. The fact that it was 1 before being zeroed didn't affect its being zeroed. The undefined behavior happened to be equivalent to i&=~255; which for values under 256 is equal to i=0;. It is more accidental that you could see it as i--;
Hopefully it is clear why i stopped being zeroed once you ran out of input.
fd.get(vard, vardo_ilgis+1); makes buffer be written out-of-bounds.
In your case, the area where you write (and where you should not) is probably the same memory area where i is stored.
But what matters most is that you end up with the famous undefined behaviour, which means anything could happen and there is no point trying to understand why or how (what happens is platform-, compiler- and even context-specific; I don't think anyone can predict or explain it).
I allow the user to enter the number of iterations for my array. I am trying to understand what happens when you exceed the maximum size of 10 and enter a higher number, such as 255. The user can then enter a number for each iteration. The program allows a couple of extra inputs, but crashes at around 12 or 13. Why is this happening? Why can't the program allow the 255 iterations specified? I believe it has something to do with the way memory is referenced in C++, but I am not sure.
#include <iostream>
using namespace std;
int main()
{
int nums[20] = { 0 };
int a[10] = { 0 };
cout << a << endl;
cout << nums << endl;
cout << "How many numbers? (max of 10)" << endl;
cin >> nums[0];
for (int i = 0; i < nums[0]; i++)
{
cout << "Enter number " << i << endl;
cin >> a[i];
}
// Output the numbers entered
for (int i = 0; i < 10; i++)
cout << a[i] << endl;
return 0;
}
If this program is run and we enter 255 for how many numbers, and 9 for every single number, then the program does this:
How many numbers? (max of 10)
255
Enter number 0
9
Enter number 1
9
Enter number 2
9
Enter number 3
9
Enter number 4
9
Enter number 5
9
Enter number 6
9
Enter number 7
9
Enter number 8
9
Enter number 9
9
Enter number 10
9
Enter number 11
9
Enter number 12
9
//(program crashes somewhere around here.)
This:
int a[10] = { 0 };
cin >> a[i];
Is only valid if i is between 0 and 9. Any value outside of that is "undefined behavior" which means the program could do anything. Including not crashing for the first couple violations then crashing later. Including crashing immediately. Including never crashing. Or anything else. It's undefined.
What your code does is step outside the valid range of the array a[]. This is undefined behaviour, as you experienced. Even accessing a[10] is incorrect, although your program only crashed after you entered the 12th number. Undefined behaviour means the outcome can differ from machine to machine: another machine may get as far as a[16] before crashing, and another may crash right at a[10]. That is why it is called undefined behaviour: you cannot rely on any fixed pattern, including whether or when it crashes.
This behaviour comes down to how your program's memory is laid out. The array occupies a block of memory, and its address can change from run to run. The memory just past the array may hold other variables, bookkeeping data such as a return address, or nothing that is mapped at all. If your code indexes past the end of the array, it may appear to work for the next few bytes if they are accessible and hold nothing critical. As soon as a write corrupts something the program depends on, or touches memory that is inaccessible, the program is likely to crash. Since the layout depends on the compiler and platform, "how far" it gets before crashing depends entirely on what sits in the neighbouring memory.
This is a lot of theory, so if I got something wrong here by accident, please let me know. For further reading, you can visit Wikipedia
The simple answer is that your program is writing to memory that you have not allocated. It keeps doing this until it overwrites something the program itself depends on, or touches memory it is not allowed to access. That is what crashes it. You just appear to hit that point around 12 or 13.
I wrote a very trivial program to try to examine the undefined behavior attached to buffer overflows. Specifically, regarding what happens when you perform a read on data outside the allocated space.
#include <cstdlib>
#include <iomanip>
#include <iostream>
int main() {
int values[10];
for (int i = 0; i < 10; i++) {
values[i] = i;
}
std::cout << values << " ";
std::cout << std::endl;
for (int i = 0; i < 11; i++) {
//UB occurs here when values[i] is executed with i == 10
std::cout << std::setw(2) << i << "(" << (values + i) << "): " << values[i] << std::endl;
}
system("pause");
return 0;
}
When I run this program on Visual Studio, the results aren't terribly surprising: reading index 10 produces garbage:
000000000025FD70
0(000000000025FD70): 0
1(000000000025FD74): 1
2(000000000025FD78): 2
3(000000000025FD7C): 3
4(000000000025FD80): 4
5(000000000025FD84): 5
6(000000000025FD88): 6
7(000000000025FD8C): 7
8(000000000025FD90): 8
9(000000000025FD94): 9
10(000000000025FD98): -1966502944
Press any key to continue . . .
But when I fed this program into Ideone.com's online compiler, I got extremely bizarre behavior:
0xff8cac48
0(0xff8cac48): 0
1(0xff8cac4c): 1
2(0xff8cac50): 2
3(0xff8cac54): 3
4(0xff8cac58): 4
5(0xff8cac5c): 5
6(0xff8cac60): 6
7(0xff8cac64): 7
8(0xff8cac68): 8
9(0xff8cac6c): 9
10(0xff8cac70): 1
11(0xff8cac74): -7557836
12(0xff8cac78): -7557984
13(0xff8cac7c): 1435443200
14(0xff8cac80): 0
15(0xff8cac84): 0
16(0xff8cac88): 0
17(0xff8cac8c): 1434052387
18(0xff8cac90): 134515248
19(0xff8cac94): 0
20(0xff8cac98): 0
21(0xff8cac9c): 1434052387
22(0xff8caca0): 1
23(0xff8caca4): -7557836
24(0xff8caca8): -7557828
25(0xff8cacac): 1432254426
26(0xff8cacb0): 1
27(0xff8cacb4): -7557836
28(0xff8cacb8): -7557932
29(0xff8cacbc): 134520132
30(0xff8cacc0): 134513420
31(0xff8cacc4): 1435443200
32(0xff8cacc8): 0
33(0xff8caccc): 0
34(0xff8cacd0): 0
35(0xff8cacd4): 346972086
36(0xff8cacd8): -29697309
37(0xff8cacdc): 0
38(0xff8cace0): 0
39(0xff8cace4): 0
40(0xff8cace8): 1
41(0xff8cacec): 134514984
42(0xff8cacf0): 0
43(0xff8cacf4): 1432277024
44(0xff8cacf8): 1434052153
45(0xff8cacfc): 1432326144
46(0xff8cad00): 1
47(0xff8cad04): 134514984
...
//The heck?! This just ends with a Runtime Error after like 200 lines.
So apparently, with their compiler, overrunning the buffer by a single index causes the program to enter an infinite loop!
Now, to reiterate: I realize that I'm dealing with undefined behavior here. But despite that, I'd like to know what on earth is happening behind the scenes to cause this. The code that physically performs the buffer overrun is still performing a read of 4 bytes and writing whatever it reads to a (presumably better protected) buffer. What is the compiler/CPU doing that causes these issues?
There are two execution paths leading to the condition i < 11 being evaluated.
The first is before the initial loop iteration. Since i had been initialised to 0 just before the check, this is trivially true.
The second is after a successful loop iteration. Since the loop iteration caused values[i] to be accessed, and values only has 10 elements, this can only be valid if i < 10. And if i < 10, after i++, i < 11 must also be true.
This is what Ideone's compiler (GCC) is detecting. There is no way the condition i < 11 can ever be false unless you have an invalid program, therefore it can be optimised away. At the same time, your compiler doesn't go out of its way to check whether you might have an invalid program unless you provide additional options to tell it to do so (such as -fsanitize=undefined in GCC/clang).
This is a trade-off implementations must make. They can favour understandable behaviour for invalid programs, or they can favour raw speed for valid programs, or a mix of both. GCC definitely focuses greatly on the latter, at least by default.
#include <iostream>
using namespace std;
int main() {
int start_time[3];
int final_time[3];
int i;
for(i=0;i<3; i++)
cin >> start_time[i];
for(i=0;i<3;i++)
cin >> final_time[i];
int a[10];
for(i=0;i<=10;i++)
a[i]=0;
for(i=0;i<3;i++){
cout << start_time[i] << " " << final_time[i] << endl;
}
}
If I give the following input:
23 53 09
23 53 10
We see that the output is:
23 0
53 53
9 10
Why is it taking the starting input of final_time equal to 0 after I press enter?
How do I solve this?
int a[10];
for(i=0;i<=10;i++)
a[i]=0;
In this part, you are writing a 0 into a[10]. But a[10] does not exist. a[] only has space for 10 integers, indexed 0 to 9. So you are writing a zero to somewhere in memory, and you overwrote your final_time[0] by chance. Could have been something else or could have crashed your program, too.
Fix this by correcting your loop to for(i=0;i<10;i++) like your other loops.
Writing to an array outside its allocated bounds is undefined behavior in C++. That basically means the standard places no requirements on what happens, so each compiler is free to do whatever is convenient. The result is rather random and therefore bad (tm). You may get different behavior when you switch compilers, and you may even get different behavior when you restart your program. In your case, your compiler vendor (like many others) decided not to check whether the developer was correct. The program just writes that zero to the place in memory where a[10] would have been, had it existed. But it did not. And by plain chance there was another variable at that spot in memory, so that one had its value overwritten by the zero.
You're invoking undefined behaviour by indexing array a out of bounds.
The for loop
for(i=0;i<=10;i++)
~~~
should use i < 10 to index from 0 to 9 for 10 elements
I have an array of random numbers, for example
6 5 4 4 8
I need to sort it and skip duplicate numbers when printing afterwards. So I sorted everything with the bubble sort algorithm and got something like this
4 4 5 6 8
Now in order to print only different numbers I wrote this for loop
for(int i=0;i<n;i++){
if(mrst[i]!=mrst[i-1] && mrst[i]>0){
outFile << mrst[i] << " ";
}
}
My question: the array's valid indices are [0:12], but on the first iteration the loop reads index -1 to check whether the same number came before. That element doesn't really exist, and the value found there is usually a huge one. Is it possible that a 4 happens to be stored there, so that the first number won't be printed? If so, how can I prevent it and rewrite the code so it is optimal?
Perhaps, you're looking for std::unique algorithm:
std::sort(mrst, mrst + n);
auto last = std::unique(mrst, mrst + n);
for(auto elem = mrst; elem != last; ++elem)
outFile << *elem << " ";
Well, as you noted already, you cannot do the check mrst[i] != mrst[i-1] in case i == 0. So I'm sure you can think of a way of not doing that check in exactly this case ... (This looks very much like a homework assignment, so I'm not really willing to give you a complete solution, but I guess I hinted enough)
Note also that it's undefined behaviour to access memory outside the boundaries of an array, so what you're doing there can do anything from working correctly to crashing your program, entirely at the discretion of the compiler.
Basically, reading mrst[-1] just reads whatever happens to sit in memory before the array, so it may give you some garbage. But you really should avoid doing this. In your case you can just change "mrst[i]!=mrst[i-1] && mrst[i]>0" to "i==0 || mrst[i]!=mrst[i-1]".
In C++, "A || B" does not evaluate "B" if "A" is true.
I'm making a dice game in C++, and in my program I have some arrays.
die[5] = { (rand()%6)+1, (rand()%6)+1, (rand()%6)+1, (rand()%6)+1, (rand()%6)+1 };
And then I use the arrays with
cout<<"First die: "<< die[0] <<"\n"
etc
But when I run the program, the last die always prints 0. Is there a way I can fix this?
You're not really giving much information, but here's my guess:
You're stepping too far. The last element of the array is die[4], and chances are you're using die[5], which means you're accessing memory you're not supposed to. On some systems, that memory happens to contain 0.
Arrays of size N always include N elements ranging from 0 to N-1. Using array[N] accesses memory beyond the range of the array. This could be unused memory (best case) or memory assigned to something else. The result is TROUBLE. Do not do this.
In your code you have this line:
cout<<"Sixth die: " << die[5] <<"\n";
which is an invalid access as die has only 5 elements thus 0 to 4 are valid indexes.
This is actually "undefined behaviour". Your program might core-dump / give an access violation but it doesn't have to. It can instead just output some random number or zero...