I just ran into an issue where a stack overflow in a threaded c++ program on HPUX caused a SEGV_MAPERR when a local object tried to call a very simple procedure. I was puzzled for a while, but luckily I talked to someone who recognized this as a stack size issue and we were able to fix the problem by increasing the stack size available to the threads.
How can I recognize when the stack overflows? Do the symptoms differ on windows/linux/hpux?
Assuming you're not on a platform thats going to stop your app and say "stack overflow" I suspect you'll see the same behavior that you would see from any kind of buffer overflow. The stack is just another preallocated chunk of memory for your program, and if you go outside those bounds... well good luck! Who knows what you'll stomp on!
You could write over the temperature readout from the CPU, it could be the email you're typing to Larry, it could be the bit saying that the kernel is locked, causing a fun deadlock condition! Who knows.
As for C++, there's nothing saying how the stack should be laid out in relation to other things in memory or that this thing even needs to be a stack!
How can I recognize when the stack overflows?
If you know the stack size, where the stack starts and the direction it grows in memory, you can simply check the address of the stack pointer and see if it past the end of the stack. C++ does not allow direct access to the stack pointer. You could easily write a small function in assembly to perform this analysis and link it into you program.
Exception code 0xC00000FD on Windows.
Usually it's easier to diagnose when you realize your SEH stops working.
Perhaps a bit off topic, but the analagous issue in Ada (running out of stack space in tasks) is a rather common "uncommon" error. Many compilers will stop the task (but not the main task) with a PROGRAM_ERROR exception.
In a way, you almost have to be able to sniff out this one. It tends to start with something like, "I moved this big array inside my task, and suddenly it quit working".
Output text to screen became mixed with lines of code from program under test. Also present were previous bash commands and other text of unidentified origin. Added to all that the program text became corrupted.
Related
I'm writing a C++ Software Image processing tool.The tool works fine, but suddenly it stops and it never sends any exception or any crash or nothing that can let me which line or which area that does that crash.
How can I determine that faulty code in that situation?
There's a few things you can do:
First of all though, it sounds more like an infinite loop, deadlock, or like you're using all of your system resources and it's just slowing down and taking a very (possibly infinite) long time. If that's the case, you'll have to find it with debugging.
Things that you can try - not necessarily in this order:
Look for shared variables you're using. Is there a chance that you
have a deadlock with threads and mutexes? Think about it and try to
fix it.
Check for use of uninitialized variables/pointers. Sometimes
(rarely) you can get very strange behavior when you invoke undefined
behavior - I'm not a windows C++ dev (I work on Linux), but it
wouldn't be the first time I saw a lockup from a segmentation fault.
Add error output (std::cerr/stderror) to your processing logic so
you can see how far in it crashes. After that, set a condition to
catch it around that point so you can watch it happen in the
debugger and see the state of your variables and what might be
wrong.
Do a stack trace so you can see what calls have been hit most
recently. This will at least let you know the last function chain
that executed.
Have you used stack tracing before?
Look up MSDN documentation on how to use them. They have different type of stack trace depending on your application.
You could
Use a debugger
Add logging code to your program
Cut sections of code from your program until it starts working
Start with the first.
I was having a very weird segfault that I eventually fixed today. It seems the problem was I was allocating a very large array on stack and that was causing the problem.
My question is, do you always get a SEGV signal on stack overflow? Is there any special signal that could alert there is a stack overflow problem?
I'm using g++ along with gdb.
The "signal" in the sense of Unix signals is apparently SEGV. :) If you mean signal like using a diagnostic tool that will tell you when something bad is happening, you could try valgrind, but really, your system just told you. And knowing at compile time whether the stack will be overflowed is not possible, partly because the stack size limit is a runtime parameter, and besides I imagine if you knew what it would be a priori, you'd still be stuck with something like the Halting Problem.
In Microsoft Visual C++ 2010 I created a program which delibrately causes a stack overflow. When I run the program using "start debugging" an error is thrown when stack overflow occurs. When I run it with "start without debugging" no error is thrown and the program just terminates silently as if it had successfully completed. Could someone explain to me what's going on? Also do any other compilers not throw errors on stack overflow?
(I thought this would be the right place to ask a question about stack overflow.)
C++ won't hold your hand as a managed enviroment does. Having a stack overflow means undefined behaviour.
A stack overflow is undefined behaviour. The compiler is well within it's rights to ignore it or cause any event to happen.
Because when your process stack overflows, it is no longer a valid process. Displaying an error message requires a stack.
Raymond Chen went over this recently.
As for why the debugger is able to throw such an exception, in that case the process is being kept around because it's attached in debugging mode to the debugger's process. Your process isn't displaying the error, the debugger is.
On Windows machines, you can catch the SEH exception which corresponds to a stack overflow. For an example, you can see boost::regex's source code (Google for BOOST_REGEX_HAS_MS_STACK_GUARD).
It might well be that the compiler optimized the intended stack overflow away. Consider the following pseudo-code example:
void RecursiveMethod(int n)
{
if (n % 1024 == 0)
print n;
// call recursively
RecursiveMethod(n + 1);
}
The method will call itself recursively and overflow the stack pretty quickly because there is no exit condition.
However, most compilers use tail recursion, a technique which will transfer the recursive function call into a loop construct.
It should be noted that with tail recursion, the above program would run in an endless loop and not exit silently.
Bart de Smet has a nice blog article where explains how this technique works in .NET:
The Case of The Failed Demo – StackOverflowException on x64
In a debug build, a number of stack checks are put in place to help you detect problems such as stack overflows, stack corruption etc. They are not present in release builds because they would affect the performance of the application. As others have pointed out, a stack overflow is undefined behaviour, so the compiler is not required to implement such stack checks at all.
When you're in a debugging environment, the runtime checks will help you detect problems that will also occur in your release build, therefore, if you fix all the problems detected in your debug build, then they should also be fixed in your release build . . . in theory. In practice, sometimes bugs that you see in your debug build are not present in your release build or vice versa.
Stack overflows shouldn't happen. Generally, stack overflows occur only by unintentional recursive function calling, or by allocating a large enough buffer on the stack. The former is obviously a bug, and the latter should use the heap instead.
In debugging mode, you want the overhead. You want it to detect if you broke your stack, overflowed your buffers, etc. This overhead is built into the debug instrumentation, and the debugger. In high level terms the debug instrumentation is extra code and data, put there to help flag errors, and the debugger is there to detect the flagged errors and notify the user (in addition to helping you debug, of course).
If you are running with the project compiled in release mode, or without a debugger attached, there is no one to hear the screaming of your program when it dies :) If a tree falls in the forest...
Depending on how you program, C++ is programming without training wheels. If you hit a wall, no one is going to be there to tell you that you screwed up. You will just crash and burn, or even worse, crash and keep running along in a very crippled state, without knowing anything is wrong. Because of this, it can be very fast. There are no extra checks or safe guards to keep it from blazing through your program with the full speed and potential of the processor (and, of course, how many extra steps you coded into your program).
I'm developing a game and when I do a specific action in the game, it crashes.
So I went debugging and I saw my application crashed at simple C++ statements like if, return, ... Each time when I re-run, it crashes randomly at one of 3 lines and it never succeeds.
line 1:
if (dynamic) { ... } // dynamic is a bool member of my class
line 2:
return m_Fixture; // a line of the Box2D physical engine. m_Fixture is a pointer.
line 3:
return m_Density; // The body of a simple getter for an integer.
I get no errors from the app nor the OS...
Are there hints, tips or tricks to debug more efficient and get known what is going on?
That's why I love Java...
Thanks
Random crashes like this are usually caused by stack corruption, since these are branching instructions and thus are sensitive to the condition of the stack. These are somewhat hard to track down, but you should run valgrind and examine the call stack on each crash to try and identify common functions that might be the root cause of the error.
Are there hints, tips or tricks to debug more efficient and get known what is going on?
Run game in debugger, on the point of crash, check values of all arguments. Either using visual studio watch window or using gdb. Using "call stack" check parent routines, try to think what could go wrong.
In suspicious(potentially related to crash) routines, consider dumping all arguments to stderr (if you're using libsdl or on *nixlike systems), or write a logfile, or send dupilcates of all error messages using (on Windows) OutputDebugString. This will make them visible in "output" window in visual studio or debugger. You can also write "traces" (log("function %s was called", __FUNCTION__))
If you can't debug immediately, produce core dumps on crash. On windows it can be done using MiniDumpWriteDump, on linux it is set somewhere in configuration variables. core dumps can be handled by debugger. I'm not sure if VS express can deal with them on Windows, but you still can debug them using WinDBG.
if crash happens within class, check *this argument. It could be invalid or zero.
If the bug is truly evil (elusive stack corruption in multithreaded app that leads to delayed crash), write custom memory manager, that will override new/delete, provide alternative to malloc(if your app for some reason uses it, which may be possible), AND that locks all unused memory memory using VirtualProtect (windows) or OS-specific alternative. In this case all potentially dangerous operation will crash app instantly, which will allow you to debug the problem (if you have Just-In-Time debugger) and instantly find dangerous routine. I prefer such "custom memory manager" to boundschecker and such - since in my experience it was more useful. As an alternative you could try to use valgrind, which is available on linux only. Note, that if your app very frequently allocates memory, you'll need a large amount of RAM in order to be able to lock every unused memory block (because in order to be locked, block should be PAGE_SIZE bytes big).
In areas where you need sanity check either use ASSERT, or (IMO better solution) write a routine that will crash the application (by throwing an std::exception with a meaningful message) if some condition isn't met.
If you've identified a problematic routine, walk through it using debugger's step into/step over. Watch the arguments.
If you've identified a problematic routine, but can't directly debug it for whatever reason, after every statement within that routine, dump all variables into stderr or logfile (fprintf or iostreams - your choice). Then analyze outputs and think how it could have happened. Make sure to flush logfile after every write, or you might miss the data right before the crash.
In general you should be happy that app crashes somewhere. Crash means a bug you can quickly find using debugger and exterminate. Bugs that don't crash the program are much more difficult (example of truly complex bug: given 100000 values of input, after few hundreds of manipulations with values, among thousands of outputs, app produces 1 absolutely incorrect result, which shouldn't have happened at all)
That's why I love Java...
Excuse me, if you can't deal with language, it is entirely your fault. If you can't handle the tool, either pick another one or improve your skill. It is possible to make game in java, by the way.
These are mostly due to stack corruption, but heap corruption can also affect programs in this way.
stack corruption occurs most of the time because of "off by one errors".
heap corruption occurs because of new/delete not being handled carefully, like double delete.
Basically what happens is that the overflow/corruption overwrites an important instruction, then much much later on, when you try to execute the instruction, it will crash.
I generally like to take a second to step back and think through the code, trying to catch any logic errors.
You might try commenting out different parts of the code and seeing if it affects how the program is compiled.
Besides those two things you could try using a debugger like Visual Studio or Eclipse etc...
Lastly you could try to post your code and the error you are getting on a website with a community that knows programming and could help you work through the error (read: stackoverflow)
Crashes / Seg faults usually happen when you access a memory location that it is not allowed to access, or you attempt to access a memory location in a way that is not allowed (for example, attempting to write to a read-only location).
There are many memory analyzer tools, for example I use Valgrind which is really great in telling what the issue is (not only the line number, but also what's causing the crash).
There are no simple C++ statements. An if is only as simple as the condition you evaluate. A return is only as simple as the expression you return.
You should use a debugger and/or post some of the crashing code. Can't be of much use with "my app crashed" as information.
I had problems like this before. I was trying to refresh the GUI from different threads.
If the if statements involve dereferencing pointers, you're almost certainly corrupting the stack (this explains why an innocent return 0 would crash...)
This can happen, for instance, by going out of bounds in an array (you should be using std::vector!), trying to strcpy a char[]-based string missing the ending '\0' (you should be using std::string!), passing a bad size to memcpy (you should be using copy-constructors!), etc.
Try to figure out a way to reproduce it reliably, then place a watch on the corrupted pointer. Run through the code line-by-line until you find the very line that corrupts the pointer.
Look at the disassembly. Almost any C/C++ debugger will be happy to show you the machine code and the registers where the program crashed. The registers include the Instruction Pointer (EIP or RIP on x86/x64) which is where the program was when it stopped. The other registers usually have memory addresses or data. If the memory address is 0 or a bad pointer, there is your problem.
Then you just have to work backward to find out how it got that way. Hardware breakpoints on memory changes are very helpful here.
On a Linux/BSD/Mac, using GDB's scripting features can help a lot here. You can script things so that after the breakpoint is hit 20 times it enables a hardware watch on the address of array element 17. Etc.
You can also write debugging into your program. Use the assert() function. Everywhere!
Use assert to check the arguments to every function. Use assert to check the state of every object before you exit the function. In a game, assert that the player is on the map, that the player has health between 0 and 100, assert everything that you can think of. For complicated objects write verify() or validate() functions into the object itself that checks everything about it and then call those from an assert().
Another way to write in debugging is to have the program use signal() in Linux or asm int 3 in Windows to break into the debugger from the program. Then you can write temporary code into the program to check if it is on iteration 1117321 of the main loop. That can be useful if the bug always happens at 1117322. The program will execute much faster this way than to use a debugger breakpoint.
some tips :
- run your application under a debugger, with the symbol files (PDB) together.
- How to set Visual Studio as the default post-mortem debugger?
- set default debugger for WinDbg Just-in-time Debugging
- check memory allocations Overriding new and delete, and Overriding malloc and free
One other trick: turn off code optimization and see if the crash points make more sense. Optimization is allowed to float little bits of your code to surprising places; mapping that back to source code lines can be less than perfect.
Check pointers. At a guess, you're dereferencing a null pointer.
I've found 'random' crashes when there are some reference to a deleted object. As the memory is not necessarily overwritten, in many cases you don't notice it and the program works correctly, and than crashes after the memory was updated and is not valid anymore.
JUST FOR DEBUGGING PURPOSES, try commenting out some suspicious 'deletes'. Then, if it doesn't crash anymore, there you are.
use the GNU Debugger
Refactoring.
Scan all the code, make it clearer if not clear at first read, try to understand what you wrote and immediately fix what seems incorrect.
You'll certainly discover the problem(s) this way and fix a lot of other problems too.
So I'm trying to debug this strange problem where a process ends without calling some destructors...
In the VS (2005) debugger, I hit 'Break all' and look around in the call stacks of the threads of the misteriously disappearing process, when I see this:
smells like SO http://img6.imageshack.us/img6/7628/95434880.jpg
This definitely looks like a SO in the making, which would explain why the process runs to its happy place without packing its suitcase first.
The problem is, the VS debugger's call stack only shows what you can see in the image.
So my question is: how can I find where the infinite recursion call starts?
I read somewhere that in Linux you can attach a callback to the SIGSEGV handler and get more info on what's going on.
Is there anything similar on Windows?
To control what Windows does in case of an access violation (SIGSEGV-equivalent), call SetErrorMode (pass it parameter 0 to force a popup in case of errors, allowing you to attach to it with a debugger.)
However, based on the stack trace you have already obtained, attaching with a debugger on fault may yield no additional information. Either your stack has been corrupted, or the depth of recursion has exceeded the maximum number of frames displayable by VS. In the latter case, you may want to decrease the default stack size of the process (use the /F switch or equivalent option in the Project properties) in order to make the problem manifest itself sooner, and make sure that VS will display all frames. You may, alternatively, want to stick a breakpoint in std::basic_filebuf<>::flush() and walk through it until the destruction phase (or disable it until just prior to the destruction phase.)
Well, you know what thread the problem is on - it might be a simple matter of tracing through it from inception to see where it goes off into the weeds.
Another option is to use one of the debuggers in the Debugging Tools for Windows package - they may be able to show more than the VS debugger (maybe), even if they are generally more complex and difficult to use (actually maybe because of that).
That does look at first glance like an infinite recursion, you could try putting a breakpoint at the line before the one that terminates the process. Does it get there ok? If it does, you've got two fairly easy ways to go.
Either you just step forward and see which destructors get called and when it gets caught up. Or you could put a printf/OutputDebugString in every relevant objects destructor (ONly ones which are globals should need this). If the message is the first thing the destructor does, then the last message you see is from the destructor which hangs things up.
On the other hand, if it doesn't get to that breakpoint I originally mentioned, then can do something similar, but it will be more annoying since the program is still "doing stuff".
I wouldn't rule out there being such a handler in Windows, but I've never heard of it.
I think the traceback that you're showing may be bogus. If you broke into the process after some kind of corruption had already occurred, then the traceback isn't necessarily valid. However, if you're lucky the bottom of the stack trace still has some clues about what's going on.
Try putting Sleep() calls into selected functions in your source that might be involved in the recursion. That should give you a better chance of breaking into the process before the stack has completely overflowed.
I agree with Dan Breslau. Your stack is bogus. may be simply because you don't have the right symbols, though.
If a program simply disappears without the WER handling kicking in, it's usually an out of memory condition. Have you gone investigated that possibility ?