I'm debugging an application and it segfaults at a position where it is almost impossible to determine which of the many instances causes the segfault.
I figured that if I'm able to resolve the position at which the object is created, I will know which instance is causing the problem and resolve the bug.
To be able to retrieve this information, gdb (or some other application) would of course have to override the default malloc/new/new[] implementations, but instrumenting my application with this would be alright.
One could argue that I could just put a breakpoint on the line before the one that segfaults and step into the object from there, but the problem is that this is a central message dispatcher loop which handles a lot of different messages and I'm not able to set a breakpoint condition in such a way as to trap my misbehaving object.
So, at the point where the segfault occurs, you have the object, but you don't know which of many pieces of code that create such objects created it, right?
I'd instrument all of those object-creation bits and have them log the address of each object created to a file, along with the file and line number (the __LINE__ and __FILE__ pre-defined macros can help make this easy).
Then run the app under the debugger, let it trap the segfault and look the address of the offending object up in your log to find out where it was created. Then peel the next layer of the onion.
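The logging idea above can be sketched roughly as follows. This is only a minimal sketch: track_creation, TRACKED_NEW, find_creation_site and the in-memory std::map are names and choices made up for this illustration. The suggestion itself only asks for the address, __FILE__ and __LINE__ to end up in a log.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical record of where a tracked object was created.
struct CreationSite {
    std::string file;
    int line;
};

// Address -> creation site for every object passed through track_creation().
inline std::map<const void*, CreationSite>& creation_log() {
    static std::map<const void*, CreationSite> log;
    return log;
}

// Remember the creation site and also print it to stderr, so the record
// can still be read after the process has crashed.
template <typename T>
T* track_creation(T* object, const char* file, int line) {
    assert(object != nullptr);
    CreationSite site;
    site.file = file;
    site.line = line;
    const void* key = object;
    creation_log()[key] = site;
    std::fprintf(stderr, "created %p at %s:%d\n", key, file, line);
    return object;
}

// Use at each creation site, e.g.: Message* m = TRACKED_NEW(Message(args));
#define TRACKED_NEW(...) track_creation(new __VA_ARGS__, __FILE__, __LINE__)

// Look up an address copied from the debugger; returns nullptr if unknown.
inline const CreationSite* find_creation_site(const void* address) {
    std::map<const void*, CreationSite>::const_iterator it =
        creation_log().find(address);
    return it == creation_log().end() ? nullptr : &it->second;
}
```

After the segfault, take the address of the offending object from the debugger and search the stderr output for it; the matching line names the file and line that created it.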
Have you tried using a memory debugging library (e.g. dmalloc)? Many of these already instrument new and friends and record where each allocation is made. Some are easier to access from gdb than others, though.
This product has a memory debugging feature that does what you want: http://www.allinea.com/index.php?page=48
I would first try the backtrace command in gdb when the segfault occurs. If that does not give a good clue about what is going on, I would next run the program under valgrind to check for invalid memory accesses and leaks. In my experience, these two steps are usually sufficient to narrow down and find the problem spot.
Regards.
I'm currently battling with an intermittent bug. I create a float member of my class. I initialize it to zero. And then give it a value. This variable is used several times over the course of the next few processes, and inexplicably it will sometimes change its value to a really small number and cause an error in my program. I've pinpointed the general area in my code where it happens, and I swear, there is nothing in my code that is acting upon this variable. And on top of that I'll run and compile the same exact program with the same exact code several times and this bug only pops up sometimes.
I'm thinking that one of my other arrays or pointers is occasionally stepping out of bounds (because I haven't implemented bounds checking yet) and replacing the variable's value with its own, but I have no idea which one. I was wondering if there is a way in Xcode to find out what variables are stored near or next to this variable, so I can maybe pinpoint who might be stepping on this poor little son of a gun?
You can enable "Guard Malloc" in Xcode. Guard Malloc can tell you whether your code wrote out of bounds on any allocated area. I don't know the exact way to enable it (anymore), but you'll definitely find something on the net.
If you want to watch a memory location while debugging your code with gdb, you can set a watchpoint with the watch command; gdb then stops whenever the watched value changes.
Maybe you have a corrupted memory heap. Using a tool like valgrind could help.
I'm developing a game and when I do a specific action in the game, it crashes.
So I went debugging and I saw my application crashed at simple C++ statements like if, return, ... Each time when I re-run, it crashes randomly at one of 3 lines and it never succeeds.
line 1:
if (dynamic) { ... } // dynamic is a bool member of my class
line 2:
return m_Fixture; // a line of the Box2D physical engine. m_Fixture is a pointer.
line 3:
return m_Density; // The body of a simple getter for an integer.
I get no errors from the app nor the OS...
Are there hints, tips or tricks to debug more efficiently and find out what is going on?
That's why I love Java...
Thanks
Random crashes like this are usually caused by stack corruption, since these are branching instructions and thus are sensitive to the condition of the stack. These are somewhat hard to track down, but you should run valgrind and examine the call stack on each crash to try and identify common functions that might be the root cause of the error.
Are there hints, tips or tricks to debug more efficiently and find out what is going on?
Run the game in a debugger and, at the point of the crash, check the values of all arguments, using either the Visual Studio watch window or gdb. Use the call stack to check the parent routines, and try to think about what could have gone wrong.
In suspicious routines (those potentially related to the crash), consider dumping all arguments to stderr (if you're using libsdl or are on a *nix-like system), writing a logfile, or sending duplicates of all error messages through OutputDebugString (on Windows). The latter makes them visible in the "Output" window in Visual Studio or in the debugger. You can also write "traces" (log("function %s was called", __FUNCTION__)).
If you can't debug immediately, produce core dumps on crash. On Windows this can be done using MiniDumpWriteDump; on Linux core dumps are controlled by the core file size limit (ulimit -c unlimited enables them for the current shell). Core dumps can be loaded into a debugger. I'm not sure if VS Express can deal with them on Windows, but you can still debug them using WinDbg.
If the crash happens within a class member function, check the this pointer. It could be invalid or null.
If the bug is truly evil (elusive stack corruption in a multithreaded app that leads to a delayed crash), write a custom memory manager that overrides new/delete, provides an alternative to malloc (if your app uses it for some reason), AND locks all unused memory using VirtualProtect (Windows) or the OS-specific alternative. In this case every potentially dangerous operation will crash the app instantly, which lets you debug the problem (if you have a Just-In-Time debugger) and find the dangerous routine immediately. I prefer such a "custom memory manager" to BoundsChecker and the like, since in my experience it was more useful. As an alternative you could try valgrind, which is not available on Windows. Note that if your app allocates memory very frequently, you'll need a large amount of RAM to be able to lock every unused memory block (because a block has to be PAGE_SIZE bytes big in order to be locked).
In areas where you need sanity check either use ASSERT, or (IMO better solution) write a routine that will crash the application (by throwing an std::exception with a meaningful message) if some condition isn't met.
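A rough sketch of such a "crash with a meaningful message" routine is shown here. The names require, REQUIRE and checked_divide are invented for this example, and std::runtime_error is just one convenient std::exception subclass; treat it as an illustration under those assumptions.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>

// Hypothetical always-on check: throws an exception carrying the failed
// expression and its source location.
inline void require(bool condition, const char* expression,
                    const char* file, int line) {
    if (condition) return;
    std::ostringstream message;
    message << file << ':' << line << ": check failed: " << expression;
    throw std::runtime_error(message.str());
}

#define REQUIRE(expr) require((expr), #expr, __FILE__, __LINE__)

// Made-up routine showing both styles of sanity check.
inline int checked_divide(int numerator, int denominator) {
    REQUIRE(denominator != 0);  // always checked, even in release builds
    int quotient = numerator / denominator;
    // Debug-only internal invariant; compiled out when NDEBUG is defined.
    assert(quotient * denominator + numerator % denominator == numerator);
    return quotient;
}
```

If nothing catches the exception, the program terminates at once, and a debugger set to break on thrown exceptions stops right at the failed check.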
If you've identified a problematic routine, walk through it using debugger's step into/step over. Watch the arguments.
If you've identified a problematic routine, but can't directly debug it for whatever reason, after every statement within that routine, dump all variables into stderr or logfile (fprintf or iostreams - your choice). Then analyze outputs and think how it could have happened. Make sure to flush logfile after every write, or you might miss the data right before the crash.
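The "dump after every statement and flush" advice could look something like this with iostreams. The helper names show, trace_line and DUMP, and the routine scale_and_offset, are made up for the sketch; __func__ is the standard spelling of the __FUNCTION__ macro mentioned earlier.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Format "name=value" for any streamable value.
template <typename T>
std::string show(const char* name, const T& value) {
    std::ostringstream text;
    text << name << '=' << value;
    return text.str();
}

// Write one line and flush before returning, so the line has left the
// stream buffer even if the very next statement crashes.
inline void trace_line(std::ostream& log, const char* function,
                       const std::string& text) {
    assert(function != nullptr);
    log << function << ": " << text << '\n';
    log.flush();
}

#define DUMP(log, variable) \
    trace_line((log), __func__, show(#variable, (variable)))

// Made-up routine that dumps each intermediate value as it goes.
inline int scale_and_offset(std::ostream& log, int value, int factor,
                            int offset) {
    int scaled = value * factor;
    DUMP(log, scaled);
    int result = scaled + offset;
    DUMP(log, result);
    return result;
}
```

In real use the stream would be a std::ofstream opened on the logfile (or std::cerr); the last line in the file then marks the last statement that completed.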
In general you should be happy that the app crashes somewhere. A crash means a bug you can quickly find using a debugger and exterminate. Bugs that don't crash the program are much more difficult (an example of a truly complex bug: given 100000 input values, after a few hundred manipulations of those values, among thousands of outputs, the app produces 1 absolutely incorrect result, which shouldn't have happened at all).
That's why I love Java...
Excuse me, but if you can't deal with the language, it is entirely your fault. If you can't handle the tool, either pick another one or improve your skill. It is possible to make a game in Java, by the way.
These are mostly due to stack corruption, but heap corruption can also affect programs in this way.
Stack corruption occurs most of the time because of off-by-one errors.
Heap corruption occurs because new/delete are not handled carefully, for example a double delete.
Basically, the overflow or corruption overwrites something important (a return address, a pointer, or other data the program relies on), and then much later, when the program tries to use it, it crashes.
I generally like to take a second to step back and think through the code, trying to catch any logic errors.
You might try commenting out different parts of the code and seeing if it affects how the program behaves.
Besides those two things, you could try using the debugger in an IDE such as Visual Studio or Eclipse.
Lastly, you could post your code and the error you are getting on a website with a community that knows programming and could help you work through the error (read: Stack Overflow).
Crashes / seg faults usually happen when the program accesses a memory location that it is not allowed to access, or attempts to access a memory location in a way that is not allowed (for example, attempting to write to a read-only location).
There are many memory analyzer tools, for example I use Valgrind which is really great in telling what the issue is (not only the line number, but also what's causing the crash).
There are no simple C++ statements. An if is only as simple as the condition you evaluate. A return is only as simple as the expression you return.
You should use a debugger and/or post some of the crashing code. Can't be of much use with "my app crashed" as information.
I had problems like this before. I was trying to refresh the GUI from different threads.
If the if statements involve dereferencing pointers, you're almost certainly corrupting the stack (this explains why an innocent return 0 would crash...)
This can happen, for instance, by going out of bounds in an array (you should be using std::vector!), trying to strcpy a char[]-based string missing the ending '\0' (you should be using std::string!), passing a bad size to memcpy (you should be using copy-constructors!), etc.
Try to figure out a way to reproduce it reliably, then place a watch on the corrupted pointer. Run through the code line-by-line until you find the very line that corrupts the pointer.
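To illustrate the std::vector and std::string suggestions above, here is a small sketch (function names are invented for the example). Note that only at() checks the index; operator[] on a vector is unchecked, just like a raw array.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Returns true if the container rejected the read instead of touching
// memory outside the vector.
inline bool read_is_rejected(const std::vector<int>& values,
                             std::size_t index) {
    try {
        (void)values.at(index);  // at() throws on a bad index
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// std::string tracks its own length, so there is no terminator to forget
// and no fixed-size buffer to overrun while building the text.
inline std::string make_label(const std::string& name, int id) {
    assert(id >= 0);
    return name + "#" + std::to_string(id);
}
```

With a ten-element vector, index 9 is read normally while index 10 raises std::out_of_range at the faulty line, instead of silently overwriting a neighbouring variable.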
Look at the disassembly. Almost any C/C++ debugger will be happy to show you the machine code and the registers where the program crashed. The registers include the Instruction Pointer (EIP or RIP on x86/x64) which is where the program was when it stopped. The other registers usually have memory addresses or data. If the memory address is 0 or a bad pointer, there is your problem.
Then you just have to work backward to find out how it got that way. Hardware breakpoints on memory changes are very helpful here.
On a Linux/BSD/Mac, using GDB's scripting features can help a lot here. You can script things so that after the breakpoint is hit 20 times it enables a hardware watch on the address of array element 17. Etc.
You can also write debugging into your program. Use the assert() function. Everywhere!
Use assert to check the arguments to every function. Use assert to check the state of every object before you exit the function. In a game, assert that the player is on the map, that the player has health between 0 and 100, assert everything that you can think of. For complicated objects write verify() or validate() functions into the object itself that checks everything about it and then call those from an assert().
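As a sketch of the validate()-inside-assert() idea, using the player example from above: the Player struct, its fields and apply_damage are hypothetical and only serve the illustration.

```cpp
#include <cassert>

// Hypothetical game object with a self-check.
struct Player {
    int health;
    int x;
    int y;

    // True if every invariant we can think of holds.
    bool validate(int map_width, int map_height) const {
        return health >= 0 && health <= 100 &&
               x >= 0 && x < map_width &&
               y >= 0 && y < map_height;
    }
};

// Check the arguments on entry and the object's state again on exit.
inline void apply_damage(Player& player, int amount,
                         int map_width, int map_height) {
    assert(amount >= 0);
    assert(player.validate(map_width, map_height));
    player.health -= amount;
    if (player.health < 0) player.health = 0;
    assert(player.validate(map_width, map_height));
}
```

Because assert() is compiled out when NDEBUG is defined, these checks cost nothing in a release build.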
Another way to build debugging into the program is to have it break into the debugger itself, for example with raise(SIGTRAP) on Linux or __asm int 3 on Windows. Then you can write temporary code into the program to check whether it is on iteration 1117321 of the main loop. That can be useful if the bug always happens at 1117322. The program will execute much faster this way than with a conditional debugger breakpoint.
Some tips:
- Run your application under a debugger, together with the symbol files (PDB).
- How to set Visual Studio as the default post-mortem debugger?
- Set the default debugger for WinDbg Just-in-time Debugging
- Check memory allocations: Overriding new and delete, and Overriding malloc and free
One other trick: turn off code optimization and see if the crash points make more sense. Optimization is allowed to float little bits of your code to surprising places; mapping that back to source code lines can be less than perfect.
Check pointers. At a guess, you're dereferencing a null pointer.
I've found 'random' crashes when there is a reference to a deleted object. As the memory is not necessarily overwritten right away, in many cases you don't notice it and the program works correctly, and then it crashes once the memory has been reused and is no longer valid.
JUST FOR DEBUGGING PURPOSES, try commenting out some suspicious 'deletes'. Then, if it doesn't crash anymore, there you are.
Use the GNU Debugger (gdb).
Refactoring.
Scan all the code, make it clearer if not clear at first read, try to understand what you wrote and immediately fix what seems incorrect.
You'll certainly discover the problem(s) this way and fix a lot of other problems too.
I've stumbled onto a very interesting issue where a function (has to deal with the Windows clipboard) in my app only works properly when a breakpoint is hit inside the function. This got me wondering, what exactly does the debugger do (VS2008, C++) when it hits a breakpoint?
Without directly answering your question (since I suspect the debugger's internal workings may not really be the problem), I'll offer two possible reasons this might occur that I've seen before:
First, your program does pause when it hits a breakpoint, and often that delay is enough time for something to happen (perhaps in another thread or another process) that has to happen before your function will work. One easy way to verify this is to add a pause for a few seconds beforehand and run the program normally. If that works, you'll have to look for a more reliable way of finding the problem.
Second, Visual Studio has historically (I'm not certain about 2008) over-allocated memory when running in debug mode. So, for example, if you have an array of int[10] allocated, it should, by rights, get 40 bytes of memory, but Visual Studio might give it 44 or more, presumably in case you have an out-of-bounds error. Of course, if you DO have an out-of-bounds error, this over-allocation might make it appear to be working anyway.
Typically, for software breakpoints, the debugger places an interrupt instruction at the location you set the breakpoint at. This transfers control of the program to the debugger's interrupt handler, and from there you're in a world where the debugger can decide what to do (present you with a command prompt, print the stack and continue, what have you.)
On a related note, "This works in the debugger but not when I run without a breakpoint" suggests to me that you have a race condition. So if your app is multithreaded, consider examining your locking discipline.
It might be a timing / thread synchronization issue. Do you do any multimedia or multithreading stuff in your program?
The reason your app only works properly when a breakpoint is hit might be that you have some watches with side effects still in your watch list from previous debugging sessions. When you hit the break point, the watch is executed and your program behaves differently.
http://en.wikipedia.org/wiki/Debugger
A debugger essentially allows you to step through your source code and examine how the code is working. If you set a breakpoint and run in debug mode, your code will pause at that breakpoint and allow you to step into the code. This has some distinct advantages. First, you can see what the status of your variables is in memory. Second, it allows you to make sure your code is doing what you expect it to do without having to add a whole ton of print statements. And third, it lets you make sure the logic is working the way you expect it to work.
Edit: A debugger is one of the more valuable tools in my development toolbox, and I'd recommend that you learn and understand how to use the tool to improve your development process.
I'd recommend reading the Wikipedia article for more information.
The debugger just halts execution of your program when it hits a breakpoint. If your program is working okay when it hits the breakpoint, but doesn't work without the breakpoint, that would indicate to me that you have a race condition or another threading issue in your code. The breakpoint is stopping the execution of your code, perhaps allowing another process to complete normally?
It stops the program counter for your process (the one you are debugging), and shows the current value of your variables, and uses the value of your variables at the moment to calculate expressions.
You must take into account, that if you edit some variable value when you hit a breakpoint, you are altering your process state, so it may behave differently.
Debugging is possible because the compiler inserts debugging information (such as function names, variable names, etc.) into your executable. It's possible not to include this information.
Debuggers sometimes change the way the program behaves in order to work properly.
I'm not sure about Visual Studio, but in Eclipse, for example, Java classes are not loaded the same way when run inside the IDE as when run outside of it.
You may also be having a race condition and the debugger stops one of the threads so when you continue the program flow it's at the right conditions.
More info on the program might help.
On Windows there is another difference caused by the debugger. When your program is launched by the debugger, Windows will use a different memory manager (heap manager to be exact) for your program. Instead of the default heap manager your program will now get the debug heap manager, which differs in the following points:
- it initializes allocated memory to a pattern (0xBAADF00D for the Windows debug heap; 0xCDCDCDCD, which comes to mind, is the fill used by the debug C runtime)
- it fills freed memory with another pattern
- it overallocates heap allocations (like a previous answer mentioned)
All in all it changes the memory use patterns of your program, so if you have a memory-trashing bug somewhere its behavior might change.
Two useful tricks:
- Use PageHeap to catch memory accesses beyond the end of allocated blocks.
- Build using the /RTCsu switch (older Visual C++ compilers: /GZ). This will initialize the memory for all your local variables to a nonzero bit pattern and will also raise a runtime error when an uninitialized local variable is accessed.
My application crashes after running for around 18 hours. I am not able to find the point in the code where it actually crashes. I checked the call stack, but it does not provide any useful information. The last few calls in the call stack are greyed out, meaning I cannot see the code for that part; they all belong to the MFC libraries.
However, I get this 'Microsoft Visual Studio' pop-up when it crashes, which says:
Unhandled exception at 0x7c809e8a in NIMCAsst.exe: 0xC0000005:
Access violation reading location 0x154c6000.
Could the above information be useful to understand where it is crashing? Is there any software that could tell me which variable in the code holds a particular memory address?
If you can't catch the exception sometimes you just have to go through your code line by line, very unpleasant but I'd put money on it being your code not in MFC (always is with my bugs). Check how you're using memory and what you're passing into the MFC functions extra carefully.
The crash is probably caused by a buffer overflow or another type of memory corruption. This has overwritten the part of the stack holding the return address, which has made the debugger unable to reconstruct the stack trace correctly. Alternatively, you may not have correct symbols for the code that caused the crash (this would be the case if the stack trace shows only a module name).
My first guess would be to examine the code calling the code that crashed for possible issues that might have caused it. Do you get any other exceptions or error conditions before the crash? Maybe you are ignoring an error return? Did you try using the Debug Heap? What about adplus? Application verifier to turn on heap checks?
Another possibility is to run a tool like PC-lint over the code to check for obvious memory-use issues. Are you using threads? Maybe there is a race condition. The list could go on forever really.
The above information only tells you which memory was accessed illegally.
You can use exception handling to narrow down the place where the problem occurs, but then you need at least an idea in which corner to seek.
You say that you're seeing the call stack, that suggests you're using a debugger. The source code of MFC is available (but perhaps not with all vc++ editions), so in principle one can trace through it. Which VC++ version are you using?
The fact that the bug takes so long to occur suggests that it is memory corruption. Some other function writes to a location that it doesn't own. This works for a long time, but eventually the function alters a pointer that MFC needs, and after a while MFC accesses the pointer and you are notified.
Sometimes the 'location' can be recognized as data, in which case you have a hint. For example, if the error said:
Access violation reading location 0x31323334
you'd recognize this as a part of an ASCII string "1234", and this might lead you to the culprit.
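A quick way to test an address for this pattern is to decode its bytes as characters. The helper below is a sketch with a made-up name, address_as_ascii; it assumes 32-bit addresses and reads them most significant byte first. On a little-endian machine such as x86 the bytes sit in memory in the opposite order, so the text in the corrupted buffer may actually read "4321".

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns the four bytes of a 32-bit value as text, most significant byte
// first, or an empty string if any byte is not printable ASCII.
inline std::string address_as_ascii(std::uint32_t address) {
    std::string text;
    for (int shift = 24; shift >= 0; shift -= 8) {
        const unsigned byte = (address >> shift) & 0xFFu;
        if (byte < 0x20u || byte > 0x7Eu) return std::string();
        text.push_back(static_cast<char>(byte));
    }
    assert(text.size() == 4);
    return text;
}
```

For 0x31323334 this yields "1234", while the address from the question, 0x154c6000, yields an empty string because 0x15 is not a printable character.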
As Patrick says, it's almost definitely your code giving MFC invalid values. One guess would be you're passing in an incorrect length so the library is reading too far. But there are really a multitude of possible causes.
Is the crash clearly reproducible?
If yes, use logfiles! Add a number of statements that just log the source file and line number being passed. Start with a few statements at the entry point (main event handler) and the most common execution paths. After the crash, inspect the last entry in the logfile. Then add new entries further down the path or paths that must have been taken, and so on. Usually after a few iterations of this you will find the point of failure. Given your long wait time, the log file might become huge and each iteration will take another 18 hours. You may need to add some technique such as rotating log files. But with this technique I was able to find some comparable bugs.
Some more questions:
Is your app multithreaded?
Does it use any arrays not managed by STL or comparable containers (C strings, C/C++ arrays, etc.)?
Try attaching a debugger to the process and have the debugger break on access violations.
If this isn't possible, then we use a tool called "User Mode Process Dumper" to create a memory dump of the process at the point where the access violation happened. You can download it here:
http://www.microsoft.com/downloads/details.aspx?FamilyID=E089CA41-6A87-40C8-BF69-28AC08570B7E&displaylang=en
How it works: You configure rules on a per-process (or optionally system-wide) basis, and have the tool create either a minidump or a full dump at the point where it detects any one of a list of exceptions - one of them being an access violation. After the dump has been made the application continues as normal (and so if the access violation is unhandled, you will then see this dialog).
Note that ALL access violations in your process are captured, even those that are later handled. Also, a full dump can take a while to create, depending on the amount of memory the application is using (10-20 seconds for a process consuming 100-200 MB of private memory). For this reason it's probably not a good idea to enable it system-wide.
You should then be able to analyse the dump using tools like WinDbg (http://www.microsoft.com/whdc/devtools/debugging/default.mspx) to figure out what happened. In most cases you will find that you only need a minidump, not a full dump (however, if your application doesn't use much memory, then there aren't really many drawbacks to a full dump other than its size and the time it takes to create).
Finally, be warned that debugging access violations using WinDbg can be a fairly involved and complex process. If you can get a stack trace another way, you might want to try that first.
This could be caused by a memory leak. There are various blogs that explain how to check for memory leaks in an application. As a simple first step, observe the memory usage of the process in Windows Task Manager; you may find that at some stage memory keeps increasing until the process runs out. You can also try running with the WinDbg tool to identify memory leaks in your code. I haven't used this tool myself, just giving a heads-up.
This question is pretty old, but I've had the same problem
and solved it quickly. It's all about threads:
First, note that updating the GUI can only be done on the main thread.
My problem was that I tried to handle the GUI from a worker thread (and not the main thread), and I got the same error: 0xC0000005.
I solved it by posting a message (which is handled on the main thread):
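// Application-defined message IDs must not collide with system messages,
// so start at WM_APP rather than 0.
enum MyMsg {
    WM_UPDATE_GUI = WM_APP + 1
};

// Register the handler for the message
BEGIN_MESSAGE_MAP(CMyDlg, CDlgBase)
    ON_MESSAGE(WM_UPDATE_GUI, OnUpdateGui)
END_MESSAGE_MAP()

// For this example: a function that is not invoked on the main thread
void CMyDlg::OnTimer()
{
    // PostMessage returns before the handler runs, so the string must
    // outlive this function: allocate it on the heap.
    CString* str_to_GUI = new CString("send me to gui");
    // Update_GUI(*str_to_GUI); // crashed when called from this thread
    if (!::PostMessage(hWnd, WM_UPDATE_GUI, (WPARAM)str_to_GUI, 0))
        delete str_to_GUI; // the message was not queued
}

// Runs on the main thread, which owns the window
LRESULT CMyDlg::OnUpdateGui(WPARAM wParam, LPARAM lParam)
{
    CString* posted = (CString*)wParam; // get the string from the posted message
    CString str = *posted;
    delete posted; // the handler owns the string once it has been delivered
    Update_GUI(str);
    return 0;
}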
So I'm trying to debug this strange problem where a process ends without calling some destructors...
In the VS (2005) debugger, I hit 'Break all' and look around in the call stacks of the threads of the mysteriously disappearing process, when I see this:
Screenshot of the call stack ("smells like SO"): http://img6.imageshack.us/img6/7628/95434880.jpg
This definitely looks like a stack overflow (SO) in the making, which would explain why the process runs to its happy place without packing its suitcase first.
The problem is, the VS debugger's call stack only shows what you can see in the image.
So my question is: how can I find where the infinite recursion call starts?
I read somewhere that in Linux you can attach a callback to the SIGSEGV handler and get more info on what's going on.
Is there anything similar on Windows?
To control what Windows does in case of an access violation (SIGSEGV-equivalent), call SetErrorMode (pass it parameter 0 to force a popup in case of errors, allowing you to attach to it with a debugger.)
However, based on the stack trace you have already obtained, attaching with a debugger on fault may yield no additional information. Either your stack has been corrupted, or the depth of recursion has exceeded the maximum number of frames displayable by VS. In the latter case, you may want to decrease the default stack size of the process (use the /F switch or equivalent option in the Project properties) in order to make the problem manifest itself sooner, and make sure that VS will display all frames. You may, alternatively, want to stick a breakpoint in std::basic_filebuf<>::flush() and walk through it until the destruction phase (or disable it until just prior to the destruction phase.)
Well, you know what thread the problem is on - it might be a simple matter of tracing through it from inception to see where it goes off into the weeds.
Another option is to use one of the debuggers in the Debugging Tools for Windows package - they may be able to show more than the VS debugger (maybe), even if they are generally more complex and difficult to use (actually maybe because of that).
That does look at first glance like an infinite recursion, you could try putting a breakpoint at the line before the one that terminates the process. Does it get there ok? If it does, you've got two fairly easy ways to go.
Either you just step forward and see which destructors get called and where it gets caught up, or you could put a printf/OutputDebugString in every relevant object's destructor (only the ones that are globals should need this). If the message is the first thing the destructor does, then the last message you see is from the destructor that hangs things up.
On the other hand, if it doesn't get to that breakpoint I originally mentioned, then you can do something similar, but it will be more annoying since the program is still "doing stuff".
I wouldn't rule out there being such a handler in Windows, but I've never heard of it.
I think the traceback that you're showing may be bogus. If you broke into the process after some kind of corruption had already occurred, then the traceback isn't necessarily valid. However, if you're lucky the bottom of the stack trace still has some clues about what's going on.
Try putting Sleep() calls into selected functions in your source that might be involved in the recursion. That should give you a better chance of breaking into the process before the stack has completely overflowed.
I agree with Dan Breslau. Your stack is bogus. It may be simply because you don't have the right symbols, though.
If a program simply disappears without the WER handling kicking in, it's usually an out-of-memory condition. Have you investigated that possibility?