How to Detect Wrong Write - c++

In my code, I have a object whose contents is fully garbage. I guess in other part of code a wrong write happened and by (un)luck it wrote at the address of object I mentioned.
I was wondering if there is some tool that can tell me every write that a memory address receives, so I can track the origin of this bug.

Most debuggers support breakpoints on write.
For example in visual studio you have to launch your app in debugger, break in (on a normal breakpoint close to initialization of suspect variable), then go "Debug/New Breakpoint/New Data Breakpoint" in the menu.
In SoftIce you can use BPM command. :) That one can also break on memory access.

Depending on your platform, you should be able to lock that region of memory using something like VirtualProtect (I think it is mprotect on linux). This way you'll get accessviolation/segfault when memory region is accessed improperly. And when you get accessviolation/segfault, you can catch them in debugger.
However, to lock memory region, normally region should be aligned to memory page (at least on windows), which could be a problem.
Aside from that you could use data breakpoints in your debugger.

If you're on Windows, you can use Global Flags (gflags.exe, part of Debugging Tools For Windows) and turn PageHeap on. (On the Image tab, type the name of your .exe, then when it shows up click on the the "Enable Page Heap" checkbox.) Then run your program.
Just remember to turn it off again when you're done.

Related

Why do certain things never crash whith debugger on?

My application uses GLUTesselator to tesselate complex concave polygons. It randomly crashes when I run the plain release exe, but it never crashes if I do start debugging in VS. I found this right here which is basically my problem:
The multi-thread debug CRT (/MTd) masks the problem, because, like
Windows does with processes spawned by
a debugger, it provides to your
program a debug heap, that is
initialized to the 0xCD pattern.
Probably somewhere you use some
uninitialized area of memory from the
heap as a pointer and you dereference
it; with the two debug heaps you get
away with it for some reason (maybe
because at address 0xbaadf00d and
0xcdcdcdcd there's valid allocated
memory), but with the "normal" heap
(which is often initialized to 0) you
get an access violation, because you
dereference a NULL pointer.
The problem is the crash occurs in GLU32.dll and I have no way to find out why its trying to dereference a null pointer sometimes. it seems to do this when my polygons get fairly large and have lots of points. What can I do?
Thanks
It's a fact of life that sometimes programs behave differently in the debugger. In your case, some memory is initialized differently, and it's probably laid out differently as well. Another common case in concurrent programs is that the timing is different, and race conditions often happen less often in a debugger.
You could try to manually initialize the heap to a different value (or see if there is an option for this in Visual Studio). Usually initializing to nonzero catches more bugs, but that may not be the case in your situation. You could also try to play with your program's memory mapping to arrange that the page 0xcdcdc000 is unmapped.
Visual Studio can set a breakpoint on accesses to a particular memory address, you could try this (it may slow your program significantly more than a variable breakpoint).
but it never crashes if I do start debugging in VS.
Well, I'm not sure exactly why but while debugging in visual studio program sometimes can get away with accessing some memory regions that would crash it without debugger. I do not know exact reasons, though, but sometimes 0xcdcdcdcd and 0xbaadfood doesn't have anything to do with that. It is just accessing certain addresses doesn't cause problems. When this happens, you'll need to find alternative methods of guessing the problem.
What can I do?
Possible solutions:
Install exception handler in your program (_set_se_translator, if I remember correctly). On access violation try MinidumpWriteDump. Debug it later using Visual Studio (afaik, crash dump debugging is n/a in express edition), or using windbg.
Use just-in-time debuggers. Non-express edition of visual studio have this feature. There are probably alternatives.
Write custom memory manager (that'll override new/delete and will provide malloc/free alternatives (if you use them)) that will grab large chunk of memory, lock all unused memory with VirtualProtect. In this case all invalid access will cause crashes even in debug mode. You'll need a lot of memory for such memory manager, because to be locked, each block should be aligned to pages.
Add excessive logging to all suspicious function calls. Dump a lot of text/debug information into file (or stderr) - parameter values, arrays, everything you suspect could be related to crash, flush after every write to file, otherwise some info will be lost during the crash. This way you'll be able to guess what happened before program crashed.
Try debugging release build. You should be able to do it to some extent if you enable "debug information" for release build in project settings.
Try switching on/off "basic runtime checks" and "buffer security check" in project properties (configuration properties->c/c++->code genration).
Try to find some kind of external tool - something like valgrind or bounds checker. Although, to my expereinece, #3 is more reliable than that approach. Although that really depends on the problem.
A link to an earlier question and two thoughts.
First off you may want to look at a previous question about valgrind substitutes for windows. Lots of good hints on programs that will help you.
Now the thoughts:
1) The debugger may stop your program from crashing in the code you're testing, but it's not fixing the problem. At worst you're just kicking the can down the street, there's still corruption but it's not evident from the way you're running. When you ship you can be assured someone will run into the problem again.
2) What often happens in cases like this is that the error isn't near where the problem occurs. While you may be noticing the problem in GLU32.dll, there was probably corruption earlier, maybe even in a different thread or function, which didn't cause a problem and at some later point the program came back to the corrupted region and failed.

Watch a memory location/install 'data breakpoint' from code?

We have a memory overwrite problem. At some point, during the course of our program, a memory location is being overwritten and causing our program to crash. the problem happens only in release mode. when in debug, all is well.
that is a classic C/C++ bug, and a very hard one to locate.
I wondered if there's a way to add some 'debugging code' that will watch this memory location and will call a callback once its changed. This is basically what a debugger could do in debug mode (a 'data breakpoint') but we need something similar in release.
If you can control the location of the variable then you can allocate it on a dedicated page and set the permissions of the page to allow reads only using VirtualProtect (on Windows ... not sure for Linux).
This way you will get an access violation when someone tries to write to it. With an exception translator function you could treat this as a callback.
Even if you can't move the variable directly (eg. it is a class member), maybe you could add sufficient padding around the variable to ensure it is on a dedicated page and use the same approach.
You can still generate debug symbols for a "release" piece of code. This can still be run through a debugger just like you would in "debug" mode.
I recently did something similiar with one of our release drivers so that we could run it through vtune. For the Microsfot LINK, I added the -DEBUG flag, for Microsoft CC I added -Zi. Everything works fine. MSKB link
You might find this link useful.
assuming you're using windows use windbg to debug your program and check out the ba command-this will break when the memory is accessed.
There are tools for this - like heap agent and boundschecker and many others that will discover overwrites. Basically you need some sentinels at the end of your memory allocations and they need to be checked.
Debugging APIs are platform-specific, but they do exist. Windows and UNIX APIs can be found online.

What exactly does a debugger do?

I've stumbled onto a very interesting issue where a function (has to deal with the Windows clipboard) in my app only works properly when a breakpoint is hit inside the function. This got me wondering, what exactly does the debugger do (VS2008, C++) when it hits a breakpoint?
Without directly answering your question (since I suspect the debugger's internal workings may not really be the problem), I'll offer two possible reasons this might occur that I've seen before:
First, your program does pause when it hits a breakpoint, and often that delay is enough time for something to happen (perhaps in another thread or another process) that has to happen before your function will work. One easy way to verify this is to add a pause for a few seconds beforehand and run the program normally. If that works, you'll have to look for a more reliable way of finding the problem.
Second, Visual Studio has historically (I'm not certain about 2008) over-allocated memory when running in debug mode. So, for example, if you have an array of int[10] allocated, it should, by rights, get 40 bytes of memory, but Visual Studio might give it 44 or more, presumably in case you have an out-of-bounds error. Of course, if you DO have an out-of-bounds error, this over-allocation might make it appear to be working anyway.
Typically, for software breakpoints, the debugger places an interrupt instruction at the location you set the breakpoint at. This transfers control of the program to the debugger's interrupt handler, and from there you're in a world where the debugger can decide what to do (present you with a command prompt, print the stack and continue, what have you.)
On a related note, "This works in the debugger but not when I run without a breakpoint" suggests to me that you have a race condition. So if your app is multithreaded, consider examining your locking discipline.
It might be a timing / thread synchronization issue. Do you do any multimedia or multithreading stuff in your program?
The reason your app only works properly when a breakpoint is hit might be that you have some watches with side effects still in your watch list from previous debugging sessions. When you hit the break point, the watch is executed and your program behaves differently.
http://en.wikipedia.org/wiki/Debugger
A debugger essentially allows you to step through your source code and examine how the code is working. If you set a breakpoint, and run in debug mode, your code will pause at that break point and allow you to step into the code. This has some distinct advantages. First, you can see what the status of your variables are in memory. Second, it allows you to make sure your code is doing what you expect it to do without having to do a whole ton of print statements. And, third, it let's you make sure the logic is working the way you expect it to work.
Edit: A debugger is one of the more valuable tools in my development toolbox, and I'd recommend that you learn and understand how to use the tool to improve your development process.
I'd recommend reading the Wikipedia article for more information.
The debugger just halts execution of your program when it hits a breakpoint. If your program is working okay when it hits the breakpoint, but doesn't work without the breakpoint, that would indicate to me that you have a race condition or another threading issue in your code. The breakpoint is stopping the execution of your code, perhaps allowing another process to complete normally?
It stops the program counter for your process (the one you are debugging), and shows the current value of your variables, and uses the value of your variables at the moment to calculate expressions.
You must take into account, that if you edit some variable value when you hit a breakpoint, you are altering your process state, so it may behave differently.
Debugging is possible because the compiler inserts debugging information (such as function names, variable names, etc) into your executable. Its possible not to include this information.
Debuggers sometimes change the way the program behaves in order to work properly.
I'm not sure about Visual Studio but in Eclipse for example. Java classes are not loaded the same when ran inside the IDE and when ran outside of it.
You may also be having a race condition and the debugger stops one of the threads so when you continue the program flow it's at the right conditions.
More info on the program might help.
On Windows there is another difference caused by the debugger. When your program is launched by the debugger, Windows will use a different memory manager (heap manager to be exact) for your program. Instead of the default heap manager your program will now get the debug heap manager, which differs in the following points:
it initializes allocated memory to a pattern (0xCDCDCDCD comes to mind but I could be wrong)
it fills freed memory with another pattern
it overallocates heap allocations (like a previous answer mentioned)
All in all it changes the memory use patterns of your program so if you have a memory thrashing bug somewhere its behavior might change.
Two useful tricks:
Use PageHeap to catch memory accesses beyond the end of allocated blocks
Build using the /RTCsu (older Visual C++ compilers: /GX) switch. This will initialize the memory for all your local variables to a nonzero bit pattern and will also throw a runtime error when an unitialized local variable is accessed.

Software to track several memory errors in old project?

I am programming a game since 2 years ago.
sometimes some memory errors (ie: a function returning junk instead of what it was supposed to return, or a crash that only happen on Linux, and never happen with GDB or Windows) happen seemly at random. That is, I try to fix it, and some months later the same errors return to haunt me.
There are a software (not Valgrind, I already tried it... it does not find the errors) that can help me with that problem? Or a method of solving these errors? I want to fix them permanently.
On Windows, you can automatically capture a crashing exception in a production environment and analyze it as if the error occurred on your developer PC under the debugger. This is done using a "mini-dump" file. You basically use the Windows "dbghelp.dll" DLL to generate a copy of the thread stacks, parts or all of the heap, the register values, the loaded modules, and the unhandled exception that resulted in the crash. You can launch this ".dmp" file in the MS Visual Studio debugger as if it were an executable and it will show you exactly where the crash occurred.
You can set up a trap for unhandled exceptions and delegate the creation of the mini-dump file to dbghelp.dll in that trap. You need to keep the ".pdb" files that were generated with the deployed binaries to match up memory addresses with source code locations for a better debugging experience. This topic is too deep to fully cover See Microsoft's documentation on this DLL.
You do need to be able to copy the .dmp file from the PC where it crashed to your development environment to fully debug it. If you have a hands-off relationship with your users you'll need to have the option of having a separate utility app "phone home" over the internet to tranfer the .dmp file to a location where you can access it. You can launch the app from the unhandled exception trap after the .dmp file has been generated. For user privacy, you should give the user the option of whether or not to do this.
The Totalview debugger (commercial software) may catch the crash.
Purify (commercial software) can help you find memory leaks.
Does your code compile free of compiler warnings? Did you run lint?
One thing you could try is using the Hans Boehm GC with your project. It can be used as a leak detector, allowing you to remove suspicious-looking free() or delete statements and easily see whether they cause memory leaks.
AFAIK, Boundscheck in Windows does a very good job. In one of my project, it caught some very weird errors.
To avoid this in my own projects (on Windows), I wrote my own memory allocator which simply called VirtualAlloc and VirtualFree. It allocated an extra page for each request, aligned it just to the left of the last page, and used VirtualProtect to generate an exception whenever the last page was accessed. This detected out-of-bounds accesses, even just reads, on the spot.
Disclaimer: I was by no means the first to have this idea.
For example, if pages are 4096 bytes, and new int[1] was called, the allocator would:
Allocate 8192 bytes (4 bytes are needed, which is one page, and the extra guard page brings the total to 2 pages)
Mark the last page unaccessible
Determine the address to return (the last allocated page starts at 4096... 4096 - 2 = 4092)
The following code:
main() {
int *array = new int[10];
return array[10];
}
would then generate an access violation on the spot.
It also had a (compile-time) option to detect accesses beyond the left side of the allocation (ie, array[-1]), but these kinds of errors seemed rare, so I never used the option.

Heisenbug: WinApi program crashes on some computers

Please help! I'm really at my wits' end.
My program is a little personal notes manager (google for "cintanotes").
On some computers (and of course I own none of them) it crashes with an unhandled exception just after start.
Nothing special about these computers could be said, except that they tend to have AMD CPUs.
Environment: Windows XP, Visual C++ 2005/2008, raw WinApi.
Here is what is certain about this "Heisenbug":
1) The crash happens only in the Release version.
2) The crash goes away as soon as I remove all GDI-related stuff.
3) BoundChecker has no complains.
4) Writing a log shows that the crash happens on a declaration of a local int variable! How could that be? Memory corruption?
Any ideas would be greatly appreciated!
UPDATE: I've managed to get the app debugged on a "faulty" PC. The results:
"Unhandled exception at 0x0044a26a in CintaNotes.exe: 0xC000001D: Illegal Instruction."
and code breaks on
0044A26A cvtsi2sd xmm1,dword ptr [esp+14h]
So it seems that the problem was in the "Code Generation/Enable Enhanced Instruction Set" compiler option. It was set to "/arch:SSE2" and was crashing on the machines that didn't support SSE2. I've set this option to "Not Set" and the bug is gone. Phew!
Thank you all very much for help!!
4) Writig a log shows that the crash happen on a declaration of a local int variable! how could that be? Memory corruption?
What is the underlying code in the executable / assembly? Declaration of int is no code at all, and as such cannot crash. Do you initialize the int somehow?
To see the code where the crash happened you should perform what is called a postmortem analysis.
Windows Error Reporting
If you want to analyse the crash, you should get a crash dump. One option for this is to register for Windows Error Reporting - requires some money (you need a digital code signing ID) and some form filling. For more visit https://winqual.microsoft.com/ .
Get the crash dump intended for WER directly from the customer
Another option is to get in touch witch some user who is experiencing the crash and get a crash dump intended for WER from him directly. The user can do this when he clicks on the Technical details before sending the crash to Microsoft - the crash dump file location can be checked there.
Your own minidump
Another option is to register your own exception handler, handle the exception and write a minidump anywhere you wish. Detailed description can be found at Code Project Post-Mortem Debugging Your Application with Minidumps and Visual Studio .NET article.
So it doesnnt crash when configuration is DEBUG Configuration? There are many things different than a RELEASE configruation:
1.) Initialization of globals
2.) Actual machine Code generated etc..
So first step is find out what are exact settings for each parameter in the RELEASE mode as compared to the DEBUG mode.
-AD
1) The crash happens only in the Release version.
That's usually a sign that you're relying on some behaviour that's not guaranteed, but happens to be true in the debug build. For example, if you forget to initialize your variables, or access an array out of bounds. Make sure you've turned on all the compiler checks (/RTCsuc). Also check things like relying on the order of evaluation of function parameters (which isn't guaranteed).
2) The crash goes away as soon as I remove all GDI-related stuff.
Maybe that's a hint that you're doing something wrong with the GDI related stuff? Are you using HANDLEs after they've been freed, for example?
Download the Debugging tools for Windows package. Set the symbol paths correctly, then run your application under WinDbg. At some point, it will break with an Access Violation. Then you should run the command "!analyze -v", which is quite smart and should give you a hint on whats going wrong.
Most heisenbugs / release-only bugs are due to either flow of control that depends on reads from uninitialised memory / stale pointers / past end of buffers, or race conditions, or both.
Try overriding your allocators so they zero out memory when allocating. Does the problem go away (or become more reproducible?)
Writig a log shows that the crash happens on a declaration of a local int variable! How could that be? Memory corruption?
Stack overflow! ;)
4) Writig a log shows that the crash happen on a declaration of a local int variable!how could that be? Memory corruption
I've found the cause to numerous "strange crashes" to be dereferencing of a broken this inside a member function of said object.
What does the crash say ? Access violation ? Exception ? That would be the further clue to solve this with
Ensure you have no preceeding memory corruptions using PageHeap.exe
Ensure you have no stack overflow (CBig array[1000000])
Ensure that you have no un-initialized memory.
Further you can run the release version also inside the debugger, once you generate debug symbols (not the same as creating debug version) for the process. Step through and see if you are getting any warnings in the debugger trace window.
"4) Writing a log shows that the crash happens on a declaration of a local int variable! How could that be? Memory corruption?"
This could be a sign that the hardware is in fact faulty or being pushed too hard. Find out if they've overclocked their computer.
When I get this type of thing, i try running the code through gimpels PC-Lint (static code analysis) as it checks different classes of errors to BoundsChecker. If you are using Boundschecker, turn on the memory poisoning options.
You mention AMD CPUs. Have you investigated whether there is a similar graphics card / driver version and / or configuration in place on the machines that crash? Does it always crash on these machines or just occasionally? Maybe run the System Information tool on these machines and see what they have in common,
Sounds like stack corruption to me. My favorite tool to track those down is IDA Pro. Of course you don't have that access to the user's machine.
Some memory checkers have a hard time catching stack corruption ( if it indeed that ). The surest way to get those I think is runtime analysis.
This can also be due to corruption in an exception path, even if the exception was handled. Do you debug with 'catch first-chance exceptions' turned on? You should as long as you can. It does get annoying after a while in many cases.
Can you send those users a checked version of your application? Check out Minidump Handle that exception and write out a dump. Then use WinDbg to debug on your end.
Another method is writing very detailed logs. Create a "Log every single action" option, and ask the user to turn that on and send it too you. Dump out memory to the logs. Check out '_CrtDbgReport()' on MSDN.
Good Luck!
EDIT:
Responding to your comment: An error on a local variable declaration is not surprising to me. I've seen this a lot. It's usually due to a corrupted stack.
Some variable on the stack may be running over it's boundaries for example. All hell breaks loose after that. Then stack variable declarations throw random memory errors, virtual tables get corrupted, etc.
Anytime I've seen those for a prolong period of time, I've had to go to IDA Pro. Detailed runtime disassembly debugging is the only thing I know that really gets those reliably.
Many developers use WinDbg for this kind of analysis. That's why I also suggested Minidump.
Try Rational (IBM) PurifyPlus. It catches a lot of errors that BoundsChecker doesn't.