Subscript of vector is sometimes out of range - c++

I have a genetic algorithm program, everything is allocated dynamically using vectors. Nowhere is the number of generations or individuals per generation set at compile time.
I tried it with 500, 1,000, and 2,000 generations, and it runs perfectly. Then I tried 10,000 generations. It gave me "debug assertion failed, vector subscript out of range" at generation 4966.
I tried again twice with the same parameters, 10,000 generations, it ran fine.
I tried it once more, I got the error at generation 7565.
It's strange that sometimes it works perfectly and sometimes I get the error, especially considering that everything is done using vectors.
Any ideas on where the problem could come from? Maybe the debug mode is buggy for some reason?

The problem comes from stack corruption or, most probably, from an out-of-bounds index. The fact that your code sometimes crashes indicates that something is wrong. If your code is multi-threaded, the bug may only surface when operations happen to execute in a particular order, at which point an access falls outside the bounds of a vector.
My advice is to run your code under valgrind and see what it reports. It usually helps with resolving issues like this.
Also note that the runs in which your code does not crash do not mean it works correctly. You may still have stack corruption or similar.

Related

How to Find Code Resulting in Inconsistent Results

I have a C++ program that processes images and tracks objects in them, using OpenCV. For the most part, it works well; however, the results I get are inconsistent. That is, approximately 10% of the time, I get slightly different output values and I cannot figure out why. I have no calls to random; I have run valgrind to look for uninitialized memory; I have run clang-tools static analysis on it. No luck. The inconsistent runs produce one of several different outputs, so they are not completely random.
Is there a tool that will show me where two runs diverge? If I run gprof or maybe cflow, can I compare them and see what was different? Is there some other tool or process I can use?
Edit: Thank you for the feedback. I believe that it is due to threading and a race condition; the suggestion was very helpful. I am currently using advice from: Ways to Find a Race Condition
Answering my own question, so that someone can see it. It was not a race condition:
The underlying problem was that we were using HOG descriptors with the incorrect parameters.
As it says in the documentation (https://docs.opencv.org/3.4.7/d5/d33/structcv_1_1HOGDescriptor.html), when you call cv::HOGDescriptor::compute and pass in a win_stride, it has to be a multiple of the block stride. Similarly, the block stride must be a multiple of the cell size. We did not set these properly. The end result was that sometimes (about 10% of the time) memory was being overwritten or otherwise corrupted. It didn't throw an error, and the results were almost correct all the time, but they were subtly different.

How to find what's stored next to a certain variable

I'm currently battling with an intermittent bug. I create a float member of my class. I initialize it to zero. And then give it a value. This variable is used several times over the course of the next few processes, and inexplicably it will sometimes change its value to a really small number and cause an error in my program. I've pinpointed the general area in my code where it happens, and I swear, there is nothing in my code that is acting upon this variable. And on top of that I'll run and compile the same exact program with the same exact code several times and this bug only pops up sometimes.
I'm thinking that one of my other arrays or pointers is occasionally stepping out of bounds (because I haven't implemented bounds checking yet) and replacing the variable's value with its own, but I have no idea which one. I was wondering if there is a way in Xcode to find out which variables are stored near or next to this variable, so I can maybe pinpoint who might be stepping on this poor little son of a gun.
You can enable "Guard Malloc" in Xcode. Guard Malloc can tell you whether your code wrote out of bounds of any allocated area. I don't know the exact way to enable it (anymore), but you'll definitely find something on the nets.
If you want to watch some memory location while debugging your code with gdb, you can use watchpoints.
Maybe you have a corrupted memory heap. Using a tool like valgrind could help.

CUDA debugging procedure for non-deterministic output

I'm debugging my CUDA 4.0/Thrust-based image reconstruction code on my Ubuntu 10.10 64-bit system, and I've been trying to figure out how to debug a run-time error in which my output images appear to contain random "noise." There is no random number generation in my code, so I expect the output to be consistent between runs, even if it's wrong. However, it's not...
I was just wondering if any one has a general procedure for debugging CUDA runtime errors such as these. I'm not using any shared memory in my cuda kernels. I've taken pains to avoid any race conditions involving global memory, but I could have missed something.
I've tried using gpu ocelot, but it has problems recognizing some of my CUDA and CUSPARSE function calls.
Also, my code generally works. It's just when I change this one setting that I get these non-deterministic results. I've checked all code associated with that setting, but I can't figure out what I'm doing wrong. If I can distill it to something that I can post here, I might do that, but at this point it's too complicated to post here.
Are you sure all of your kernels have proper blocksize/remainder handling? The one place we have seen non-deterministic results occurred when we had data elements at the end of the array not being processed.
Our kernels were originally intended for data that was known to be an integer multiple of 256 elements. So we used a block size of 256 and did a simple division to get the number of blocks. When the data was then changed to be of any length, the leftover 255 or fewer elements never got processed, and those spots in the output contained random data.

How to easily figure out where and why a program crashed?

I'm currently working on a program (in C++, using Code::Blocks) that uses a lot of random numbers and takes a while to get going; most of the time it works fine, but every now and then it performs an illegal operation and must shut down. Given the random numbers all over the place, and the fact that it currently takes ~3-5 minutes for the program to reach the stage at which the errors occur (this timeframe is normal/acceptable), reproducing the problems reliably and conveniently is extremely difficult. Reporting on every other line of code to cout to manually track things is time-consuming, visually clutters the output unrelated to bugs, and is not always helpful: even if I know when the program stops, I sometimes don't know why.
Is there some way for me to see what the last operation in the program was before it crashed, and why this operation led to a crash? Something within Code::Blocks would be best, but something third-party works too. It also needs to be something I can use every time I test the program, because I never know when a crash is going to occur.
That is what debuggers are for. Build the program with full debugging symbols, configure the system so that you get a full crash report (on Linux, a core file), and then launch the debugger with the core file. (Alternatively, run the whole program inside the debugger, but that might take a while; running inside a debugger is usually much slower than running outside of it.)
The debugger should be able to give you the state of the program when the illegal instruction happened, and you will get some insight into the state the program was in. From there, either you figure out what is wrong, or maybe you can make a couple of smaller test cases that trigger the error.
Debugging issues that cannot be reproduced systematically is a pain, good luck there!
Sounds like you want a debugger. Debugging C and C++ programs using GDB

Program crashes when run outside IDE

I'm currently working on a C++ program in Windows XP that processes large sets of data. Our largest input file causes the program to terminate unexpectedly with no sort of error message. Interestingly, when the program is run from our IDE (Code::Blocks), the file is processed without any such issues.
As the data is being processed, it's placed into a tree structure. After we finish our computations, the data is moved into a C++ STL vector before being sent off to be rendered in OpenGL.
I was hoping to gain some insight into what might be causing this crash. I've already checked out another post, which I can't post a link to since I'm a new user. The issue in that post was quite similar to mine and resulted from an out-of-bounds index into an array. However, I'm quite sure no such out-of-bounds error is occurring here.
I'm wondering if, perhaps, the size of the data set is leading to issues when allocating space for the vector. The systems I've been testing the program on should, in theory, have adequate memory to handle the data (2GB of RAM with the data set taking up approx. 1GB). Of course, if memory serves, the STL vectors simply double their allocated space when their capacity is reached.
Thanks, Eric
The fact that the code works within the IDE (presumably running within a debugger?), but not standalone suggests to me that it might be an initialisation issue.
Compile with the warning level set to max.
Then check all your warnings. My guess is that it is an uninitialized variable (one that in debug mode is being initialized to NULL/0).
Personally, I have set up my project templates so that warnings are always at max and are flagged as errors, so that compilation fails.
You'd probably find it helpful to configure the OS to create a crash dump (maybe, I don't know, still by using the Windows system software called "Dr. Watson"), to which you can then attach a debugger after the program has crashed (assuming that it is crashing).
You should also trap the various ways in which a program might exit semi-gracefully without a crash dump: atexit, set_unexpected, set_terminate and maybe others.
What does your memory model look like? Are you banging up against an index limit (e.g. the maximum value of an int)?
As it turns out, our hardware is reaching its limit. The program was hitting the system's memory limit and failing miserably. We couldn't even see the error statements being produced until I hooked cerr into a file from the command line (thanks starko). Thanks for all the helpful suggestions!
Sounds like your program is throwing an exception that you are not catching. The boost test framework has some exception handlers that could be a quick way to localise the exception location.
Are there indices in the tree structure that could overflow? Are you using indexes into the vector that are beyond the current size of the vector?
std::vector<std::string> v;  // empty vector
v.push_back("xyz");          // size is now 1
v.push_back("abc");          // size is now 2
v[0] = "xyz";                // fine
v[1] = "abc";                // fine
v[2] = "slsk";               // uh oh, index 2 is outside the vector
How large is your largest input set? Do you end up allocating size*size elements? If so, is your largest input set 65536 elements or more? (65536 * 65536 is 2^32, which overflows a 32-bit int.)
I agree the most likely reason that the IDE works fine when standalone does not is because the debugger is wiping memory to 0 or using memory guards around allocated memory.
Failing anything else, is it possible to reduce the size of your data set until you find exactly the size that works, and a slightly larger example that fails? That might be informative.
I'd recommend trying to determine approximately which line of your code causes the crash.
Since this only happens outside your IDE, you can use OutputDebugString to output the current position, and watch it with DebugView.
The behavior of a program compiled for debug can really be completely different inside and outside of the IDE. A different set of runtime libraries may be used when the program is launched from the IDE.
Recently I was bitten by a timing bug in my code: when debugging from the IDE the timing was always good and the bug was never observed, but in release mode, bam, the bug was there. This kind of bug is really a PITA to debug.