I am encountering some really odd behavior with my code.
It is throwing an access violation error when I run it and through the debugger I managed to find out the line which is causing it:
for( int x=0; x<width; x++) {
int current = edgeKernelXY(left+x,top+y,true,0);
....
}
I placed a debug point in the method edgeKernelXY and the code never even got into the method.
The next thing I checked was what the values that I am passing into it are. Left, top and y seem normal enough. However according to the debugger, x = 19840810 and current = 19840810. I don't understand how this could have happened especially if I declared x to be 0 at the beginning of the loop. Width is correct at 40.
x and current have not been declared anywhere else in scope of the forloop. What could be going wrong here?
EDIT:
I changed the code as follows:
for( int x=0; x<width; x++) {
int current = edgeKernelXY(left,top+y,true,0);
if( current > THRESHOLD &&
edgeKernelXY(left+x,top+y,true,1) > THRESHOLD &&
edgeKernelXY(left+x,top+y,true,2) > THRESHOLD ) {
} else {
current = 0.0f;
}
Specifically I changed left+x on the first call of edgeKernelXY to just left. This seems to run and the second call to edgeKernelXY shows x set correctly to 0. However the behavior is not what I want. Left + x still gets me crazy values for x which is causing an access violation.
for( int x=0; x<width; x++) {
int current;
current = 0;
current = edgeKernelXY(left+x,top+y,true,0);
Also shows problems with current.
Since you're debugging a release build you can't depend on the debugger showing the right values in watch windows, etc. Debugging release mode can be done, but is considerably trickier. Generally it requires looking at the disassembly to understand what's going on and what the state of things is (ie., x is almost certainly being kept in a register).
If you can't perform a full debug release, you can still get a much better debugging experience by building with optimizations turned off (which can be done in release builds and can be done without linking to debug libraries).
Assuming that you're building in the IDE, just go into the project settings and change the C/C++ - Optimization - Optimization value to Disabled (/Od).
If this causes a problem by setting it project-wide (which it shouldn't), you can do the same on a per-source-file basis by right clicking on the source files you're interested in debugging and setting the option in the properties for that source file. Just remember to clear those settings when you're done (and don't check them into source control) because the IDE doesn't make it obvious that there are source file specific settings so it can be confusing down the road.
Few things to check
Are there any double deletes in your program which might be messing up the heap (I know it doesn't explain the issue of the x on the stack, but something to check nevertheless)
Are there any statements between the for starting the function call edgeKernelXY? If so, please check those for any potential stack-overflows
The value of x being wrong smells like stack corruption to me which can happen due to multiple reasons
Is there an array that you created at the beginning of the function which overran the stack?
Could there be a function call that returned a pointer to an array (on its stack) which you are using assuming that the data is already available
All of the above will not give you the exact answer you are looking for, but some options for you to check. And most importantly - have you run your program through some memory profiler (like Valgrind) and checked the output? More often than not, that will show the cause of this problem.
If you do find out the cause for the problem, please do post the approach you followed. It will make the community's experience richer. Thanks.
Related
It works when, in the loop, I set every element to 0 or to entry_count-1.
It works when I set it up so that entry_count is small, and I write it by hand instead of by loop (sorted_order[0] = 0; sorted_order[1] = 1; ... etc).
Please do not tell me what to do to fix my code. I will not be using smart pointers or vectors for very specific reasons. Instead focus on the question:
What sort of conditions can cause this segfault?
Thank you.
---- OLD -----
I am trying to debug code that isn't working on a unix machine. The gist of the code is:
int *sorted_array = (int*)memory;
// I know that this block is large enough
// It is allocated by malloc earlier
for (int i = 0; i < entry_count; ++i){
sorted_array[i] = i;
}
There appears to be a segfault somewhere in the loop. Switching to debug mode, unfortunately, makes the segfault stop. Using cout debugging I found that it must be in the loop.
Next I wanted to know how far into the loop the segfault happend so I added:
std::cout << i << '\n';
It showed the entire range it was suppose to be looping over and there was no segfault.
With a little more experimentation I eventually created a string stream before the loop and write an empty string into it for each iteration of the loop and there is no segfault.
I tried some other assorted operations trying to figure out what is going on. I tried setting a variable j = i; and stuff like that, but I haven't found anything that works.
Running valgrind the only information I got on the segfault was that it was a "General Protection Fault" and something about default response to 11. It also mentions that there's a Conditional jump or move depends on uninitialized value(s), but looking at the code I can't figure out how that's possible.
What can this be? I am out of ideas to explore.
This is clearly a symptoms of invalid memory uses within your program.This would be bit difficult to find by looking out your code snippet as it is most likely be the side effect of something else bad which has already happened.
However as you have mentioned in your question that you are able to attach your program using Valgrind. as it is reproducible. So you may want to attach your program(a.out).
$ valgrind --tool=memcheck --db-attach=yes ./a.out
This way Valgrind would attach your program in the debugger when your first memory error is detected so that you can do live debugging(GDB). This should be the best possible way to understand and resolve your problem.
Once you are able to figure it out your first error, fix it and rerun it and see what are other errors you are getting.This steps should be done till no error is getting reported by Valgrind.
However you should avoid using the raw pointers in modern C++ programs and start using std::vector std::unique_ptr as suggested by others as well.
Valgrind and GDB are very useful.
The most previous one that I used was GDB- I like it because it showed me the exact line number that the Segmentation Fault was on.
Here are some resources that can guide you on using GDB:
GDB Tutorial 1
GDB Tutorial 2
If you still cannot figure out how to use GDB with these tutorials, there are tons on Google! Just search debugging Segmentation Faults with GDB!
Good luck :)
That is hard, I used valgrind tools to debug seg-faults and it usually pointed to violations.
Likely your problem is freed memory that you are writing to i.e. sorted_array gets out of scope or gets freed.
Adding more code hides this problem as data allocation shifts around.
After a few days of experimentation, I figured out what was really going on.
For some reason the machine segfaults on unaligned access. That is, the integers I was writing were not being written to memory boundaries that were multiples of four bytes. Before the loop I computed the offset and shifted the array up that much:
int offset = (4 - (uintptr_t)(memory) % 4) % 4;
memory += offset;
After doing this everything behaved as expected again.
I am using VS2010. My debug version works fine however my Release version kept on crashing. So In the release version mode I right clicked the project chose Debug and then chose start new instance. At this point I saw that an array that I had declared as such
int ma[4]= { 1,2,8,4};
Never gets initialized. Any suggestions on what might be going on.
When you build in Release, the compiler performs many optimizations on your code. Many of the optimizations include replacing variables with hard-coded values, when it is possible and correct to do so. For example, if you have something like:
int n = 42;
cout << "The answer is: " << n;
By the time the optimizer gets done with it, it will often look more like:
cout << "The answer is: " << 42;
...and the variable n is eliminated from your program completely. If you are stepping through a release version of this program and trying to examine the value of n, you may see very odd values or the debugger may report that n doesn't exist at all.
There are many other optimizations that can be applied which make debugging an optimized program quite difficult. Placing a breakpoint after the initialization of an array could yield very misleading information if the array was eliminated, or if the initialization of it was moved to someplace else.
Another common optimization is to eliminate unused variables, such as with:
int a = ma[0];
If there is no code in your program which actually uses a, the compiler will see that a is unneeded, and optimize it away so that it no longer exists.
In order to see the values that ma has been initialized with, the simplest somewhat reliable approach is to use so-called sprintf debugging:
cout << "ma values: ";
copy (ma, ma+4, ostream_iterator <int> (cout, ", "));
And see what is actually there.
If you debug release build the debugger will report bogus values or will not be able to display any values for most of your variables. The safest way to check that the value of a variable is in Release build is to use logging.
So most probably your array is initialized in Release just as in Debug build but you are not able to see that through the debugger. It seems you have some other problem that is causing the code to crash in Release. Look for some other uninitialized variable or some stack corruption/index out of bounds access.
I have a program that behaves weirdly and probably has undefined behaviour. Sometimes, the return address of a function seems to be changed, and I don't know what's causing it.
The return address is always changed to the same address, an assertion inside a function the control shouldn't be able to reach. I've been able to stop the program with a debugger to see that when it's supposed to execute a return statement, it jumps straight to the line with the assertion instead.
This code approximates how my function works.
int foo(Vector t)
double sum = 0;
for(unsgined int i=0; i<t.size();++i){
sum += t[i];
}
double limit = bar(); // bar returns a value between 0 and 1
double a=0;
for(double i=0; i<10; i++){
a += f(i)/sum; // f(1)/sum + ... + f(10)/sum = 1.0f
if(a>3)return a;
}
//shoudn'get here
assert(false); // ... then this line is executed
}
This is what I've tried so far:
Switching all std::vector [] operators with .at to prevent accidentily writing into memory
Made sure all return-by-value values are const.
Switched on -Wall and -Werror and -pedantic-errors in gcc
Ran the program with valgrind
I get a couple of invalid read of size 8, but they seem to originate from qt, so I'm not sure what to make of it. Could this be the problem?
The error happens only occasionally when I have run the program for a while and give it certain input values, and more often in a release build than in a debug build.
EDIT:
So I managed to reproduce the problem in a console application (no qt loaded) I then manages to simulate events that caused the problem.
Like some of you suggested, it turns out I misjudged what was actually causing it to reach the assertion, probably due to my lack of experience with qt's debugger. The actual problem was a floating point error in the double i used as a loop condition.
I was implementing softmax, but exp(x) got rounded to zero with particular inputs.
Now, as I have solved the problem, I might rephrase it. Is there a method for checking problems like rounding errors automatically. I.e breaking on 0/0 for instance?
The short answer is:
The most portable way of determining if a floating-point exceptional condition has occurred is to use the floating-point exception facilities provided by C in fenv.h.
although, unfortunately, this is far from being perfect.
I suggest you to read both
https://www.securecoding.cert.org/confluence/display/seccode/FLP04-C.+Check+floating-point+inputs+for+exceptional+values
and
https://www.securecoding.cert.org/confluence/display/seccode/FLP03-C.+Detect+and+handle+floating-point+errors
which concisely address the exact question you are posing:
Is there a method for checking problems like rounding errors automatically.
Now I am debugging a large project, which has a stack corruption: the application fails.
I would like to know how to find (debug) such stack corruption code with Visual Studio 2010?
Here's an example of some code which causes stack problems, how would I find less obvious cases of this type of corruption?
void foo()
{
int i = 10;
int *p = &i;
p[-2] = 100;
}
Update
Please note that this is just an example. I need to find such bad code in the current project.
There's one technique that can be very effective with these kinds of bugs, but it'll only work on a subset of them that has a few characteristics:
the corrupting value must be stable (ie., as in your example, when the corruption occurs, it's always 100), or at least something that can be readily identified in a simple expression
the corruption has to occur at a particular address on the stack
the corrupting value is unusual enough that you won't be hit with a slew of false positives
Note that the second condition may seem unlikely at first glance because the stack can be used in so many different ways depending on the runtime actions. However, stack usage is generally pretty deterministic. The problem is that a particular stack location can be used for so many different things that the problem is really item #3.
Anyway, if your bug has these characteristics, you should identify the stack address (or one of them) that gets corrupted, then set a memory breakpoint for a write to that address with a condition that causes it to break only if the value written is the corrupting value. In visual Studio, you can do this by creating a "New Data Breakpoint..." in the Breakpoints window then right clicking the breakpoint to set the condition.
If you end up getting too many false positives, it might help to narrow the scope of the breakpoint by leaving it disabled until some point in the execution path that's closer to the bug (if you can identify such a time), or set the hit count high enough to remove most of the false positives.
An additional complication is the address of the stack may change from run to run - in this case, you'll have to take care to set the breakpoint on each run (the lower bits of the address should be the same).
I believe your Questions quotes an example of stack corruption and the question you are asking is not why it crashes.
If it is so, It crashes because it creates an Undefined Behavior because the index -2 points to an unknown memory location.
To answer the question on profiling your application:
You can use Rational Purify Plus for Visual studio to check for memory overrites and access errors.
This is UB: p[-2] = 100;
You can access p with the operator[] in this(p[i]) way, but in this case i is an invalid value. So p[-2] points to an invalid memory location and causes Undefined Behaviour.
To find it you should debug your app and find where it crashes, and hopefully, it'll be at a place where something is actually wrong.
I've a peculiar issue here, which is happening both with VS2005 and 2010. I have a for loop in which an inline function is called, in essence something like this (C++, for illustrative purposes only):
inline double f(int a)
{
if (a > 100)
{
// This is an error condition that shouldn't happen..
}
// Do something with a and return a double
}
And then the loop in another function:
for (int i = 0; i < 11; ++i)
{
double b = f(i * 10);
}
Now what happens is that in debug build everything works fine. In release build with all the optimizations turned on this is, according to disassembly, compiled so that i is used directly without the * 10 and the comparison a > 100 turns into a > 9, while I guess it should be a > 10. Do you have any leads as to what might make the compiler think that a > 9 is the correct way? Interestingly, even a minor change (a debug printout for example) in the surrounding code makes the compiler use i * 10 and compare that with the literal value of 100.
I know this is somewhat vague, but I'd be grateful for any old idea.
EDIT:
Here's a hopefully reproducable case. I don't consider it too big to be pasted here, so here goes:
__forceinline int get(int i)
{
if (i > 600)
__asm int 3;
return i * 2;
}
int main()
{
for (int i = 0; i < 38; ++i)
{
int j = (i < 4) ? 0 : get(i * 16);
}
return 0;
}
I tested this with VS2010 on my machine, and it seems to behave as badly as the original code I'm having problems with. I compiled and ran this with the IDE's default empty C++ project template, in release configuration. As you see, the break should never be hit (37 * 16 = 592). Note that removing the i < 4 makes this work, just like in the original code.
For anyone interested, it turned out to be a bug in the VS compiler. Confirmed by Microsoft and fixed in a service pack following the report.
First, it'd help if you could post enough code to allow us to reproduce the issue. Otherwise you're just asking for psychic debugging.
Second, it does occasionally happen that a compiler fails to generate valid code at the highest optimization levels, but more likely, you just have a bug somewhere in your code. If there is undefined behavior somewhere in your code, that means the assumptions made by the optimizer may not hold, and then the compiler can end up generating bad code.
But without seeing your actual code, I can't really get any more specific.
The only famous bug I know with optimization (and only with highest optimization level) are occasional modifications of the orders of priority of the operations (due to change of operations performed by the optimizer, looking for the fastest way to compute). You could look in this direction (and put some parenthesis even though they are not strictly speaking necessary, which is why more parenthesis is never bad), but frankly, those kind of bugs are quite rare.
As stated, it difficult to have any precise idea without more code.
Firstly, inline assembly prevents certain optimizations, you should use the __debugbreak() intrinsic for int3 breakpointing. The compiler sees the inline function having no effect other than a breakpoint, so it divides the 600 by 16(note: this is affected by integer truncation), thus it optimizes to debugbreak to trigger with 38 > i >= 37. So it seems to work on this end