C++ program...overshoots?

I'm decent at C++, but I may have missed some nuance that applies here. Or maybe I completely missed a giant concept; I have no idea. My program was instantly crashing ("blah.exe is not responding") about one in five times it was run (the other times it ran completely fine), and I tracked the problem down to a constructor for a world class that is called once at the beginning of the main function. Here is the code (in the constructor) that causes the problem:
int ii;
for(ii=0;ii<=255;ii++)
{
    cout<<"ent "<<ii<<endl;
    entity_list[ii]=NULL;
}
for(ii=0;ii<=255;ii++)
{
    cout<<"sec "<<ii<<endl;
    sector_list[ii]=NULL;
}
entity_list[0] = new Entity(0,0);
entity_list[0]->_world = this;
Specifically, the second for loop is the problem. The cout statements are new, added just to see where it was having trouble. It would print the entire "ent 0" to "ent 255" run, then "sec 0" to "sec 255", and then crash right after, as if it were going for a 257th pass through the second for loop. I changed the second for loop to go until "ii<=254", which stopped all crashes. Does C++ tend to "overshoot" for loops or something? What is causing it to crash at this specific loop, seemingly at random?
By the way, entity_list and sector_list are arrays of pointers to classes called Entity and Sector, respectively, but those loops are not constructing anything, so I didn't think that was relevant. I also have a forward declaration for the Entity class in a header for this, but since none were being constructed, I didn't think it was relevant either.

You are going beyond the bounds of your array.
Based on your comment in Charles' answer, you stated:
I just declared them in the world class as entity_list[255] and
sector_list[255]
And therein lies your problem. By declaring them with 255 elements, you can only access elements a[0] through a[254] (count them up and you'll find that's 255 elements; if index a[255] existed, there would be 256 elements).
Now for the question: Why did it act so erratically when you accessed an element outside of the bounds of the array?
The reason is that accessing elements outside the bounds of an array is undefined behavior in C++. I can't tell you what it should do, because it has been intentionally left undefined, largely so that implementations aren't forced to pay for a bounds check on every single array access.
What this means is that the results will be sporadic and unpredictable, especially when you run it on different machines.
It might work just fine. It might crash. It might wipe your hard drive! (That one is unlikely, but doing so wouldn't be a violation of the C++ standard!)
Bottom line: just because you got a strange error message, or no error at all, does NOT mean it's OK. Just don't do it.

How did you declare entity_list and sector_list? Remember that you are using 0-based indexing, so if you go from ii = 0 to ii <= 255 you need 256 buckets, not 255.
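For completeness, here is a minimal sketch of the fix under that assumption, i.e. that the members are plain C-style arrays of pointers as the question's comments describe; the World and kSlots names here are illustrative, not taken from the original code:

struct Entity;                        // forward declarations stand in for the real classes
struct Sector;

struct World {
    static const int kSlots = 256;    // one named size, used everywhere
    Entity* entity_list[kSlots];
    Sector* sector_list[kSlots];

    World() {
        // '<' rather than '<=': the valid indices are 0 through kSlots-1 (0..255)
        for (int ii = 0; ii < kSlots; ++ii) {
            entity_list[ii] = nullptr;
            sector_list[ii] = nullptr;
        }
    }
};

int main() { World w; }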

Related

Underlying reasons for different output after removing a print statement in an "out of range of vector error"

I have a minimal piece of code that causes said behavior:
vector<int> result(9);
int count = 0;
cout << "test1\n"; // removing this line causes 'core dump'
for (int j = 0; j < 12; j++)
    result[count++] = 1;
cout << "test2\n";
result is a vector of size 9, and inside the for loop I am accessing elements out of range.
Now, if I remove the test1 line, the code runs without any errors; but with this cout line in place, I get
*** Error in `./out_of_range_vector2': free(): invalid next size (fast): 0x0000000001b27c20 ***
I understand that this is telling me that free() encountered memory that was not allocated by malloc(), but what role does this cout line play here? I'd like to know a little more about what's going on. More specifically, I have two questions:
Is this caused by the heap being in a different state in these two cases? If so, what exactly is different?
Why does accessing out-of-range elements sometimes not cause an error? Is it because the access hasn't exceeded the vector's capacity?
The line
cout << "test1\n";
does a lot of things, and it may well allocate and free memory.
Writing outside the bounds of a vector is "undefined behavior", and that is very different from "getting an error". "Undefined behavior" means that whoever writes the compiler and the runtime library is free to ignore those possibilities entirely, and whatever happens, happens. They can do so because a correct program never does those things, and when a program does, the fault is 100% the programmer's. Preventing, protecting against, or even simply reporting errors when accessing std::vector elements with operator[] is not part of the contract.
Technically, what could have happened is that the out-of-bounds write destroyed some internal data structure used by the memory allocator, and this resulted in erratic behavior afterwards that may or may not show up as a segfault.
A segfault happens when things go wrong to the point that even the operating system can detect the program is not doing what it is supposed to do, because it requests access to locations that are not even mapped (so they certainly cannot be the correct locations that were supposed to hold the data being looked for).
You can, however, get undefined behavior and corrupted data without ever reaching that "segfault" point: the program simply reads or writes incorrect data at the wrong locations, possibly without any observable difference from a correct program. This is actually what happens most of the time (unfortunately).
So what happens when you read or write outside the size of an std::vector using the unchecked operator[]? Most of the time, nothing (apparently). The program may, however, do whatever it likes after that mistake, including misbehaving in places where the code is perfectly correct, a billion machine instructions later, and only when that causes serious damage. Just don't do that.
When programming in C++, you simply cannot afford this kind of mistake. There are no "runtime error angels" to protect you as there are in other, higher-level languages.
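One practical way to take the heap layout out of the picture is to build with AddressSanitizer (supported by GCC and Clang). The following is a sketch of the same snippet, unchanged apart from comments, assuming a file name of oob.cpp:

// Build and run with AddressSanitizer, e.g.:
//   g++ -g -fsanitize=address oob.cpp -o oob && ./oob
// It typically reports a heap-buffer-overflow at the first bad write
// (result[9]), whether or not the cout line is present.
#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<int> result(9);
    int count = 0;
    cout << "test1\n";
    for (int j = 0; j < 12; j++)
        result[count++] = 1;      // indices 9, 10 and 11 are out of bounds
    cout << "test2\n";
}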

C++: Creating array index larger than size not causing error

I am learning C++, and as I understand it, an array should not accept a value assigned at an index that is invalid given the size restriction placed on the array, yet the following code happily outputs 232:
int stuff[5];
stuff[7] = 232;
cout<<stuff[7];
I'm sorry if this is an incredibly stupid question, but is this just something my compiler is patching up before run-time?
Writing outside the bounds of an array is undefined behavior, which means the results could be anything. In particular, the compiler is not required to add any run-time bounds-checking that would cause an error message to be emitted (and most compilers don't, since it's more efficient not to).
So what's likely happening is that the value is being written to a memory location just past the end of the array. If something important was stored there, something bad will happen; if whatever happened to be at that location was not so important, you won't notice any particular effect. You can't rely on it being harmless, though, so don't do that. :)
Accessing an array out of bounds results in undefined behaviour. It might work, it might give the wrong answer, it might crash the program, whatever.
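Since the out-of-bounds index here is a compile-time constant, many compilers can at least warn about it (for example GCC/Clang with -Wall -Warray-bounds at -O2, though no diagnostic is required by the standard), and a container with checked access turns the mistake into a defined runtime error. A minimal sketch, assuming you are free to switch to std::array:

#include <array>
#include <iostream>

int main() {
    std::array<int, 5> stuff{};       // same five elements, value-initialized
    stuff.at(4) = 232;                // last valid index is 4
    std::cout << stuff.at(4) << "\n";
    // stuff.at(7) = 232;             // would throw std::out_of_range instead
                                      // of silently scribbling past the array
}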

Why compiler does not complain about accessing elements beyond the bounds of a dynamic array? [duplicate]

I am defining an array of size 9, but when I access array index 10 it does not give any error.
int main() {
    bool* isSeedPos = new bool[9];
    isSeedPos[10] = true;
}
I expected to get a compiler error, because there is no array element isSeedPos[10] in my array.
Why don't I get an error?
It's not a problem.
There is no bounds check on C++ arrays. You are able to access elements beyond the array's limit (doing so is undefined behavior and may or may not produce a visible error).
If you want to use an array, you have to check that you are not going out of bounds yourself (you can keep the size in a separate variable, as you did).
Of course, a better solution would be to use the standard library containers such as std::vector.
With std::vector you can either
use the myVector.at(i) method to get the i-th element (which will throw an exception if you are out of bounds)
use myVector[i] with the same syntax as C-style arrays, but then you have to do the bounds checking yourself (e.g. test if (i < myVector.size()) ... before accessing it)
Also note that in your case, std::vector<bool> is a specialized version implemented so that each bool takes only one bit of memory (therefore it uses less memory than an array of bool, which may or may not be what you want).
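As a small sketch of the second option (doing the bounds check yourself with size()), here is a hypothetical helper named checked_get; it is not part of the standard library, just an illustration:

#include <iostream>
#include <optional>
#include <vector>

// Hypothetical helper: returns a copy of the element if the index is valid,
// std::nullopt otherwise. The check is simply "i < v.size()".
template <typename T>
std::optional<T> checked_get(const std::vector<T>& v, std::size_t i) {
    if (i < v.size())
        return v[i];
    return std::nullopt;
}

int main() {
    std::vector<bool> isSeedPos(9);               // bit-packed specialization
    if (auto b = checked_get(isSeedPos, 10))
        std::cout << "value: " << *b << "\n";
    else
        std::cout << "index 10 is out of bounds\n";   // this branch is taken
}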
Use std::vector instead. Some implementations will do bounds checking in debug mode.
No, the compiler is not required to emit a diagnostic for this case. The compiler does not perform bounds checking for you.
It is your responsibility to make sure that you don't write broken code like this, because the compiler will not error on it.
Unlike in other languages such as Java and Python, array access is not bounds-checked in C or C++. That makes accessing arrays faster. It is your responsibility to make sure that you stay within bounds.
However, in a simple case such as this, some compilers can detect the error at compile time.
Also, some tools such as valgrind can help you detect such errors at run time.
What compiler/debugger are you using?
MSVC++ would complain about it and tell you that you wrote out of bounds of an array.
But it is not required to do so by the standard.
The code can crash at any time; it causes undefined behaviour.
Primitive arrays do not do bounds-checking. If you want bounds-checking, you should use std::vector instead. You are accessing invalid memory after the end of array, and purely by luck it is working.
There is no runtime check on the index you supply; accessing element 10 is incorrect but possible. Two things can happen:
if you are "unlucky", this will not crash and will return some data located after your array.
if you are "lucky", the data after the array is not allocated by your program, so access to the requested address is forbidden. This will be detected by the operating system and will produce a "segmentation fault".
There is no rule stating that memory accesses are checked in C, plain and simple. When you ask for an array of bools, it might be faster for the operating system to give you a 16-bit or 32-bit chunk instead of a 9-bit one. This means that you might not even be writing or reading into someone else's space.
C++ is fast, and one of the reasons it is fast is that there are very few checks on what you are doing. If you ask for some memory, the language assumes that you know what you are doing, and as long as the operating system does not complain, everything will run.
There is no problem! You are just accessing memory that you shouldn't access. You get access to memory after the array.
isSeedPos doesn't know how big the array is. It is just a pointer to a position in memory. When you access isSeedPos[10], the behaviour is undefined. Chances are that sooner or later this will cause a segfault, but there is no requirement for a crash, and there is certainly no standard error checking.
Writing to that position is dangerous.
But the compiler will let you do it: effectively you're writing past the last byte of memory assigned to that array, which is not a good thing.
C++ isn't like many other languages: it assumes that you know what you are doing!
Both C and C++ let you write to arbitrary areas of memory. This is because they originally derived from (and are still used for) low-level programming, where you may legitimately want to write to a memory-mapped peripheral or similar, and because it's more efficient to omit bounds checking when the programmer already knows the index will be within bounds (e.g. for a loop from 0 to N over an array, the programmer knows 0 and N are within the bounds, so checking each intermediate value is superfluous).
However, in truth, nowadays you rarely want to do that. If you use the arr[i] syntax, you essentially always want to write to the array declared in arr, and never do anything else. But you still can if you want to.
If you do write to arbitrary memory (as you do in this case), either it will be part of your program's memory, and you will change some other critical data without knowing it (either now, or later when you change the code and have forgotten what you were doing); or it will be memory not allocated to your program, and the OS will shut the program down to prevent worse problems.
Nowadays:
Many compilers will spot it if you make an obvious mistake like this one
There are tools which will test if your program writes to unallocated memory
You can and should use std::vector instead, which covers the 99% of cases where you want bounds checking (check whether you're using at() or [] to access it)
This is not Java. In C or C++ there is no bounds checking; it's pure luck that you can write to that index.

Reading off the end of an array: running in terminal vs. debugger

I encountered an error in my code where an if() statement was checking a value off the end of an array, i.e.,
int arrayX [2];
if(arrayX [2])
    FunctionCall();
This was leading to a function call that, for reasons related to the length of the above array, tried to subscript a vector with an out-of-bounds index, causing the error. However, the error only occurred when running under the Xcode debugger; whenever I ran under the terminal it didn't happen. This leads me to suspect that when I run under the terminal, memory outside the array is being zeroed, or tends to be zero for some other reason. The if statement gets tested against 80 different 'faulty' arrays per cycle, so it seems unlikely to be a coincidence that it never pops up under the terminal.
Just to be clear, my question is: why would unallocated or unrelated memory hold zeroes when run under the terminal but not when run under a debugger?
Many debuggers fill unused memory with some distinct pattern, so that exactly the behaviour you describe happens.
What exactly is the question?
Whatever the question, the answer is likely: the compiler and runtime can do that if they want to. The behavior of the sample code is undefined, so the resulting program's behavior is wholly unpredictable.
You can't really tell what data lies outside the array. If anything were zeroing that part of memory, it would more likely be the Xcode debugger than the terminal, so it's very strange to me that you had no problems in the terminal!
You said "The if statement gets tested for 80 different 'faulty' arrays per cycle". Consider this: are you sure those "different" faulty arrays actually reside in different areas of RAM? (If it's static data, the compiler may put it in one place and reuse it.) Also, the compiler may optimize your code and change how that memory is laid out.

Vector Ranges in C++

Another quick question here, I have this code:
string sa[6] = {
    "Fort Sumter", "Manassas", "Perryville",
    "Vicksburg", "Meridian", "Chancellorsville" };
vector<string> svec(sa, sa+6);
for (vector<string>::iterator iter = svec.begin(); iter != svec.end(); iter++)
{
    std::cout << *iter << std::endl;
}
Why is it that when I do svec(sa, sa+7), the code works but prints an empty line after the last word, and when I use sa+8 instead it crashes? Since the string array is only 6 elements big, shouldn't it crash at sa+7 as well?
Thanks.
Accessing past the end of a vector is undefined behavior. Anything could happen. You might have nasal demons.
You have an array of only six elements. When you try to access the supposed "seventh" element, you get undefined behavior. Technically, that means anything can happen, but that doesn't seem to me like a very helpful explanation, so let's take a closer look.
That array occupies memory, and when you accessed the element beyond the end, you were reading whatever value happened to occupy that memory. It's possible that that address doesn't belong to your process, but it probably does, so the program can usually read the sizeof(string) bytes that reside in that space without faulting.
Your program read from it and, since it was reading it through a string array, it treated that memory as though it were a real string object. (Your program can't tell the difference. It doesn't know any better. It's just trying to carry out your instructions.) Apparently, whatever data happened to be there looked enough like a real string object that your program was able to treat it like one, at least long enough to make a copy of it in the vector and then print its (empty) value. It worked this time, but that doesn't mean it will work every time.
There was no such luck with the data in the "eighth" position of the array. It did not look enough like a valid string object. A string object usually contains a pointer to the character data, along with a length. Maybe the area of the object that would normally represent that pointer didn't contain a valid address for your program. Or maybe the part that represented the length field contained a value far larger than what was available at the address in the pointer.
Your application does not crash because some standard specifies that it should crash; crashing is just one possible manifestation of undefined behaviour. You will not always get a crash when you exceed the bounds of an array, as you have found out.
Essentially, anything could happen: printing a blank line, crashing, or, as just posted, having demons fly out of your nose.
C++ doesn't do range-checking of arrays.
Reading beyond the end of an array is what's called "undefined" behaviour: i.e. it's not guaranteed to throw an exception, it's not guaranteed to not throw an exception, and it's not guaranteed to have consistent behaviour from one run to the next.
If people say that C++ is an "unsafe" language, this is part of what they mean by that. C++ doesn't check the range at run-time, because doing that at run-time takes extra CPU instructions, and part of the design philosophy of C++ is to make it no slower than C.
Your compiler might have been able to warn you at compile-time (are you using the compiler command-line options to give you the maximum possible number of warnings?), though that too isn't guaranteed/required by the language.
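One way to avoid writing sa+6 by hand (and the risk of sa+7 or sa+8) is to let the compiler compute the array's bounds. A minimal sketch using std::begin/std::end:

#include <iostream>
#include <iterator>
#include <string>
#include <vector>

int main() {
    std::string sa[6] = {
        "Fort Sumter", "Manassas", "Perryville",
        "Vicksburg", "Meridian", "Chancellorsville" };

    // std::begin/std::end deduce the array size, so the range can't overshoot.
    std::vector<std::string> svec(std::begin(sa), std::end(sa));

    for (const auto& s : svec)
        std::cout << s << '\n';
}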