Application crashes says : Access violation reading location - c++

My application crashes after running for around 18 hours. I am not able to debug the point in the code where it actually crashes. I checked the call stack- it does not provide any information as such. The last few calls in the call stack are greyed out-meaning I cannot see the code of that part-they all belong to MFC libraries.
However, I get this 'MicroSoft Visual Studio' pop-up when it crashes which says :
Unhandled exception at 0x7c809e8a in NIMCAsst.exe: 0xC0000005:
Access violation reading location 0x154c6000.
Could the above information be useful to understand where it is crashing.Is there any software that could tell me a particular memory address is held by which variable in the code.

If you can't catch the exception sometimes you just have to go through your code line by line, very unpleasant but I'd put money on it being your code not in MFC (always is with my bugs). Check how you're using memory and what you're passing into the MFC functions extra carefully.

Probably the crash is caused by a buffer overflow or other type of memory corruption. This has overwritten some part of the stack holding the return address which has made the debugger unable to reconstruct the stack trace correctly. Or, that the code that caused the crash, you do not have correct sybols for (if the stack trace shows a module name, this would be the case).
My first guess would be to examine the code calling the code that crashed for possible issues that might have caused it. Do you get any other exceptions or error conditions before the crash? Maybe you are ignoring an error return? Did you try using the Debug Heap? What about adplus? Application verifier to turn on heap checks?
Other possibilities include to run a tool like pclint over the code to check for obvious issues of memory use. Are you using threads? Maybe there is a race condition. The list could go on forever really.

The above information only tells you which memory was accessed illegally.
You can use exception handling to narrow down the place where the problem occurs, but then you need at least an idea in which corner to seek.
You say that you're seeing the call stack, that suggests you're using a debugger. The source code of MFC is available (but perhaps not with all vc++ editions), so in principle one can trace through it. Which VC++ version are you using?
The fact that the bug takes so long to occur suggests that it is memory corruption. Some other function writes to a location that it doesn't own. This works a long time, but finally the function alters a pointer that MCF needs, and after a while MFC accesses the pointer and you are notified.
Sometimes, the 'location' can be recognized as data, in which case you have a hint. F.e. if the error said:
Access violation reading location 0x31323334
you'd recognize this as a part of an ASCII string "1234", and this might lead you to the culprit.

As Patrick says, it's almost definitely your code giving MFC invalid values. One guess would be you're passing in an incorrect length so the library is reading too far. But there are really a multitude of possible causes.

Is the crash clearly reproducible?
If yes, Use Logfiles! You should use a logfile and add a number statements that just log the source file/line number passed. Start with a few statements at the entrypoint (main event handler) and the most common execution paths. After the crash inspect the last entry in the logfile. Then add new entries down the path/paths that must have been passed etc. Usually after a few iterations of this work you will find the point of failure. In case of your long wait time the log file might become huge and each iteration will take another 18 hours. You may need to add some technique of rotating log files etc. But with this technique i was able to find some comparable bugs.
Some more questions:
Is your app multithreaded?
Does it use any arrays not managed by stl or comparable containers (does it use C-Strings, C/C++-Arrays etc)?

Try attaching a debugger to the process and have the debugger break on access violations.
If this isnt possible then we use a tool called "User mode process dumper" to create a memory dump of the process at the point where the access violation happened. You can find this for download here:
http://www.microsoft.com/downloads/details.aspx?FamilyID=E089CA41-6A87-40C8-BF69-28AC08570B7E&displaylang=en
How it works: You configure rules on a per-process (or optionally system-wide) basis, and have the tool create either a minidump or a full dump at the point where it detects any one of a list of exceptions - one of them being an access violation. After the dump has been made the application continues as normal (and so if the access violation is unhandled, you will then see this dialog).
Note that ALL access violations in your process are captured - even those that are then later handled, also a full dump can create a while to create depending on the amount of memory the application is using (10-20 seconds for a process consuming 100-200 MB of private memory). For this reason it's probably not a good idea to enable it system-wide.
You should then be able to analyse the dump using tools like WinDbg (http://www.microsoft.com/whdc/devtools/debugging/default.mspx) to figure out what happened - in most cases you will find that you only need a minidump, not a full dump (however if your application doesnt use much memory then there arent really many drawbacks of having a full dump other than the size of the dump and the time it takes to create the dump).
Finally, be warned that debugging access violations using WinDbg can be a fairly involed and complex process - if you can get a stack trace another way then you might want to try that first.

This is the cause of possible memory leak, there are various blogs could teach on checking for memory leaks in application, you simply make observations on Physical Memory of the process from Windows Task Manager, you could find at some stage where memory keep increasing & run out of memory. You can also try running with windbg tool to identify memory leaks in your code. I havent used this tool just giving some heads up on this.

This question is pretty old, and I've had the same problem,
but I've quickly solved it - it's all about threads:
First, note that updating GUI can only be done at the Main Thread.
My problem was that I've tried to handle GUI from a Worker Thread (and not a Main Thread) and i've got the same error: 0xC0000005.
I've solved it by posting a message (which is executed at the Main Thread) - and the problem was solved:
typedef enum {
WM_UPDATE_GUI
}WM_MY_MSG
// register function callback to a message
BEGIN_MESSAGE_MAP(CMyDlg, CDlgBase)
ON_MESSAGE(WM_UPDATE_GUI, OnUpdateGui)
END_MESSAGE_MAP()
// For this example - function that is not invoked in the Main Thread:
void CMyDlg::OnTimer()
{
CString str_to_GUI("send me to gui"); // send string to gui
// Update_GUI(str_to_GUI); // crashed
::PostMessage(hWnd, MyMsg::WM_UPDATE_GUI, (WPARAM)&str_to_GUI, 0);
}
HRESULT CMyDlg::OnUpdateGui(WPARAM wParam, LPARAM lParam)
{
CString str = *(CString*)wParam; // get the string from the posted message
Update_GUI(str);
return S_OK;
}

Related

A program I support is crashing with SIGSEGV but I can't from the .dmp file

As per the title, I can't locate any dump files when this program I support is crashing.
The application's logs clearly mention its a SIGSEGV exception, but I have searched my entire hard drive, and there are no .dmp files anywhere to be found.
The developers of the program have seen similar issues elsewhere but have so far been unable to explain why this is happening - and we're kind of a bit stuck at the moment.
The last section in the application logs reads as :
Received signal SIGSEGV, segmentation violation.
OurApplication::sigHandler 11.
Removing signal handlers.
OurApplication::signalCatched.
OurApplication::sigHandler: exiting application.
Removing signal handlers.
My limited understanding of this is that our application's signal handler might be 'neutralising' the SIGSEGV exception that got thrown. And therefore no core dump is getting generated... I did raise this idea with the developers but they never really seemed have investigated if this might be the reason. The theory they raised in counter was that they think the reason the dmp isn't getting generated is because the program may be crashing twice very close together.
So the questions I have at this point are:
Are there any Windows7 parameters that control the creation of a .dmp file?
Are there any requirements/flags that need to be compiled into a program in order for it (or windows) to create a core dump file if it crashes?
I'm 99% sure it must be windows that is responsible for creating the core file, since the program itself would be dead/terminated when it crashed, correct?
Are there any other things I should be aware of, or check for, or 'evidence' I can collect and then show our developers?
Many thanks in advance
Are there any Windows7 parameters that control the creation of a .dmp file?
There are parameters which control the creation of a crash dump: see MSDN on Collecting user-mode dumps.
Are there any requirements/flags that need to be compiled into a program in order for it (or windows) to create a core dump file if it crashes?
You don't need to compile anything in for the previous answer to work. However, the program needs to terminate due to an unhandled exception, which means you need to let the exception bubble up and not being handled by the unhandled exception handler.
I'm 99% sure it must be Windows that is responsible for creating the core file, since the program itself would be dead/terminated when it crashed, correct?
As stated above, Windows can handle that and it's a good idea to have Windows handle the crash. Imagine that your program is broken due to a memory leak. Some memory has been overwritten. In that case, your unhandled exception handler can be destroyed. Windows, however, still has full control over the process, can suspend it and create a dump from outside (rather from inside).
Are there any other things I should be aware of, or check for, or 'evidence' I can collect and then show our developers?
Well, suggest letting the dump be created by Windows due to above reasons. Then they also don't need to implement a configuration setting (you don't want the crash dump file to be always created, do you?). You don't need to implement a limiting number for the files. You don't need to implement a check for disk space, etc.
And you can suggest to read the Windows Internals 6 books.
Consider creating your own minidump file programatically. Should be plenty of code around showing how to do it. You can try here:
https://stackoverflow.com/search?q=minidump
This way, you're not relying on Dr. Watson or any other settings to create a dump file. Instead you will be calling the functions in DBGHELP.DLL to create the dump file.

C++: __try...__except; hides crash in release mode?

I have a DX9 application that runs on an embedded Windows XP box. When leaving it automated overnight for soak testing it crashes after about six to eight hours. On our dev. machines (Win 7) we can't seem to reproduce this issue. I'm also fairly certain it's not a memory leak.
If we run the same application in Debug on the embedded machines, it doesn't crash.
If we place a __try/__except around the main loop update on the embedded machines, it doesn't crash.
I know in Debug, there is some additional byte padding around the local stack which may be "hiding" a local array access out of bounds, or some sort of uninitialized variable is sneaking through.
So I have two questions:
Does __try/__except behave similar to debug, even when run in release?
What kind of things should I be scanning the code for if we have a crash in Release mode, but not in Debug mode?
If you're using __try{ } __except() you shouldn't.
Those and C++ code don't mix well. (for instance, you can't have C++ objects on the stack of a function wrapped with those. You should use C++ try {} catch() {} if you use catch(...) (with ellipsis) it does basically the same as __except()
both try.. catch and __try .. __except behave the same in debug and release.
If you suspect that your problem is an unexpected exception you should read about all of the following:
SetUnhandledExceptionFilter()
_set_se_translator()
_CrtSetReportMode()
_RTC_SetErrorFunc()
_set_abort_behavior()
_set_error_mode()
_set_new_handler()
_set_new_mode()
_set_purecall_handler()
set_terminate()
set_unexpected()
_set_invalid_parameter_handler()
_controlfp()
Using one of the first two would probably allow you to pinpoint your problem pretty quickly. The rest are there if you want absolute control for all error cases possible in your process.
Specifically, with SetUnhandledExceptionFilter() you can set up a function filter which logs the address of the code which caused the exception. You can then use your debugger to pin point that code. Using the DbgHelp library and with the information given to the filter function you can write some code which prints out a full stack trace of the crash, including symbols and line numbers.
Make sure you set up your build configuration to emit debug symbols for release builds as well. They can only help and don't do anything to slow your application (but maybe make it bigger)
If we place a __try/__except around the main loop update on the embedded machines, it doesn't crash.
Then do that.
A single __try block around the whole program (as well as the entry point for each worker thread) is the recommended approach, it lets you write out a crash dump and make an error report before exiting. There's not much recovery you can do with SEH, because the exceptions just don't carry enough information to distinguish different failures usefully. Storing the whole program state and pulling it into a debugger is very useful, though.
Note: Some video drivers cause SEH exceptions that they also catch, perhaps some logic expects there to be more than one SEH scope installed, which your __try block provided.

C++: Where to start when my application crashes at random places?

I'm developing a game and when I do a specific action in the game, it crashes.
So I went debugging and I saw my application crashed at simple C++ statements like if, return, ... Each time when I re-run, it crashes randomly at one of 3 lines and it never succeeds.
line 1:
if (dynamic) { ... } // dynamic is a bool member of my class
line 2:
return m_Fixture; // a line of the Box2D physical engine. m_Fixture is a pointer.
line 3:
return m_Density; // The body of a simple getter for an integer.
I get no errors from the app nor the OS...
Are there hints, tips or tricks to debug more efficient and get known what is going on?
That's why I love Java...
Thanks
Random crashes like this are usually caused by stack corruption, since these are branching instructions and thus are sensitive to the condition of the stack. These are somewhat hard to track down, but you should run valgrind and examine the call stack on each crash to try and identify common functions that might be the root cause of the error.
Are there hints, tips or tricks to debug more efficient and get known what is going on?
Run game in debugger, on the point of crash, check values of all arguments. Either using visual studio watch window or using gdb. Using "call stack" check parent routines, try to think what could go wrong.
In suspicious(potentially related to crash) routines, consider dumping all arguments to stderr (if you're using libsdl or on *nixlike systems), or write a logfile, or send dupilcates of all error messages using (on Windows) OutputDebugString. This will make them visible in "output" window in visual studio or debugger. You can also write "traces" (log("function %s was called", __FUNCTION__))
If you can't debug immediately, produce core dumps on crash. On windows it can be done using MiniDumpWriteDump, on linux it is set somewhere in configuration variables. core dumps can be handled by debugger. I'm not sure if VS express can deal with them on Windows, but you still can debug them using WinDBG.
if crash happens within class, check *this argument. It could be invalid or zero.
If the bug is truly evil (elusive stack corruption in multithreaded app that leads to delayed crash), write custom memory manager, that will override new/delete, provide alternative to malloc(if your app for some reason uses it, which may be possible), AND that locks all unused memory memory using VirtualProtect (windows) or OS-specific alternative. In this case all potentially dangerous operation will crash app instantly, which will allow you to debug the problem (if you have Just-In-Time debugger) and instantly find dangerous routine. I prefer such "custom memory manager" to boundschecker and such - since in my experience it was more useful. As an alternative you could try to use valgrind, which is available on linux only. Note, that if your app very frequently allocates memory, you'll need a large amount of RAM in order to be able to lock every unused memory block (because in order to be locked, block should be PAGE_SIZE bytes big).
In areas where you need sanity check either use ASSERT, or (IMO better solution) write a routine that will crash the application (by throwing an std::exception with a meaningful message) if some condition isn't met.
If you've identified a problematic routine, walk through it using debugger's step into/step over. Watch the arguments.
If you've identified a problematic routine, but can't directly debug it for whatever reason, after every statement within that routine, dump all variables into stderr or logfile (fprintf or iostreams - your choice). Then analyze outputs and think how it could have happened. Make sure to flush logfile after every write, or you might miss the data right before the crash.
In general you should be happy that app crashes somewhere. Crash means a bug you can quickly find using debugger and exterminate. Bugs that don't crash the program are much more difficult (example of truly complex bug: given 100000 values of input, after few hundreds of manipulations with values, among thousands of outputs, app produces 1 absolutely incorrect result, which shouldn't have happened at all)
That's why I love Java...
Excuse me, if you can't deal with language, it is entirely your fault. If you can't handle the tool, either pick another one or improve your skill. It is possible to make game in java, by the way.
These are mostly due to stack corruption, but heap corruption can also affect programs in this way.
stack corruption occurs most of the time because of "off by one errors".
heap corruption occurs because of new/delete not being handled carefully, like double delete.
Basically what happens is that the overflow/corruption overwrites an important instruction, then much much later on, when you try to execute the instruction, it will crash.
I generally like to take a second to step back and think through the code, trying to catch any logic errors.
You might try commenting out different parts of the code and seeing if it affects how the program is compiled.
Besides those two things you could try using a debugger like Visual Studio or Eclipse etc...
Lastly you could try to post your code and the error you are getting on a website with a community that knows programming and could help you work through the error (read: stackoverflow)
Crashes / Seg faults usually happen when you access a memory location that it is not allowed to access, or you attempt to access a memory location in a way that is not allowed (for example, attempting to write to a read-only location).
There are many memory analyzer tools, for example I use Valgrind which is really great in telling what the issue is (not only the line number, but also what's causing the crash).
There are no simple C++ statements. An if is only as simple as the condition you evaluate. A return is only as simple as the expression you return.
You should use a debugger and/or post some of the crashing code. Can't be of much use with "my app crashed" as information.
I had problems like this before. I was trying to refresh the GUI from different threads.
If the if statements involve dereferencing pointers, you're almost certainly corrupting the stack (this explains why an innocent return 0 would crash...)
This can happen, for instance, by going out of bounds in an array (you should be using std::vector!), trying to strcpy a char[]-based string missing the ending '\0' (you should be using std::string!), passing a bad size to memcpy (you should be using copy-constructors!), etc.
Try to figure out a way to reproduce it reliably, then place a watch on the corrupted pointer. Run through the code line-by-line until you find the very line that corrupts the pointer.
Look at the disassembly. Almost any C/C++ debugger will be happy to show you the machine code and the registers where the program crashed. The registers include the Instruction Pointer (EIP or RIP on x86/x64) which is where the program was when it stopped. The other registers usually have memory addresses or data. If the memory address is 0 or a bad pointer, there is your problem.
Then you just have to work backward to find out how it got that way. Hardware breakpoints on memory changes are very helpful here.
On a Linux/BSD/Mac, using GDB's scripting features can help a lot here. You can script things so that after the breakpoint is hit 20 times it enables a hardware watch on the address of array element 17. Etc.
You can also write debugging into your program. Use the assert() function. Everywhere!
Use assert to check the arguments to every function. Use assert to check the state of every object before you exit the function. In a game, assert that the player is on the map, that the player has health between 0 and 100, assert everything that you can think of. For complicated objects write verify() or validate() functions into the object itself that checks everything about it and then call those from an assert().
Another way to write in debugging is to have the program use signal() in Linux or asm int 3 in Windows to break into the debugger from the program. Then you can write temporary code into the program to check if it is on iteration 1117321 of the main loop. That can be useful if the bug always happens at 1117322. The program will execute much faster this way than to use a debugger breakpoint.
some tips :
- run your application under a debugger, with the symbol files (PDB) together.
- How to set Visual Studio as the default post-mortem debugger?
- set default debugger for WinDbg Just-in-time Debugging
- check memory allocations Overriding new and delete, and Overriding malloc and free
One other trick: turn off code optimization and see if the crash points make more sense. Optimization is allowed to float little bits of your code to surprising places; mapping that back to source code lines can be less than perfect.
Check pointers. At a guess, you're dereferencing a null pointer.
I've found 'random' crashes when there are some reference to a deleted object. As the memory is not necessarily overwritten, in many cases you don't notice it and the program works correctly, and than crashes after the memory was updated and is not valid anymore.
JUST FOR DEBUGGING PURPOSES, try commenting out some suspicious 'deletes'. Then, if it doesn't crash anymore, there you are.
use the GNU Debugger
Refactoring.
Scan all the code, make it clearer if not clear at first read, try to understand what you wrote and immediately fix what seems incorrect.
You'll certainly discover the problem(s) this way and fix a lot of other problems too.

What can cause an abnormal program termination?

MFC application (uses SQLite3.dll for DB access, along with other DLLs for accessing hardware) terminates abnormally. There is no particular sequence of termination :(
My application is a
Single threaded Application
Uses exception handling
Uses more than 6 DLLs to access different hardwares
Runs on WinXP SP2
Initially i thought it might be because of Stack Overflow, later i discovered its not. Can someone tell me what are all the general causes for an abnormal program termination? If someone has come across similar problems or has any hints or clues, please pass them on.
Thanks in Advance
Generally speaking, the general causes of crashes are when you:
read memory that isn't yours
write memory that isn't yours
divide by zero
do something inside an interrupt that you shouldn't
free() a pointer more than once
Possibly also:
have an unhanded exception
found a bug in your MFC
one of your >6 hardware-access DLLs is doing any of the above
You are encountering some kind of hardware fault
Maybe you're passing a bad buffer to one of your hardware DLLs, or are forgetting to lock some memory, or you could even have a version mismatch between the DLLs and their headers.
There are so many choices :P
Since this is a run-time issue, I suggest you send debug statements to a log file. Include the function name and perhaps a timestamp. Always flush the output buffer after writing to the file, as this provides better probability that the last line was written to the file before the exception occurs.

Heisenbug: WinApi program crashes on some computers

Please help! I'm really at my wits' end.
My program is a little personal notes manager (google for "cintanotes").
On some computers (and of course I own none of them) it crashes with an unhandled exception just after start.
Nothing special about these computers could be said, except that they tend to have AMD CPUs.
Environment: Windows XP, Visual C++ 2005/2008, raw WinApi.
Here is what is certain about this "Heisenbug":
1) The crash happens only in the Release version.
2) The crash goes away as soon as I remove all GDI-related stuff.
3) BoundChecker has no complains.
4) Writing a log shows that the crash happens on a declaration of a local int variable! How could that be? Memory corruption?
Any ideas would be greatly appreciated!
UPDATE: I've managed to get the app debugged on a "faulty" PC. The results:
"Unhandled exception at 0x0044a26a in CintaNotes.exe: 0xC000001D: Illegal Instruction."
and code breaks on
0044A26A cvtsi2sd xmm1,dword ptr [esp+14h]
So it seems that the problem was in the "Code Generation/Enable Enhanced Instruction Set" compiler option. It was set to "/arch:SSE2" and was crashing on the machines that didn't support SSE2. I've set this option to "Not Set" and the bug is gone. Phew!
Thank you all very much for help!!
4) Writig a log shows that the crash happen on a declaration of a local int variable! how could that be? Memory corruption?
What is the underlying code in the executable / assembly? Declaration of int is no code at all, and as such cannot crash. Do you initialize the int somehow?
To see the code where the crash happened you should perform what is called a postmortem analysis.
Windows Error Reporting
If you want to analyse the crash, you should get a crash dump. One option for this is to register for Windows Error Reporting - requires some money (you need a digital code signing ID) and some form filling. For more visit https://winqual.microsoft.com/ .
Get the crash dump intended for WER directly from the customer
Another option is to get in touch witch some user who is experiencing the crash and get a crash dump intended for WER from him directly. The user can do this when he clicks on the Technical details before sending the crash to Microsoft - the crash dump file location can be checked there.
Your own minidump
Another option is to register your own exception handler, handle the exception and write a minidump anywhere you wish. Detailed description can be found at Code Project Post-Mortem Debugging Your Application with Minidumps and Visual Studio .NET article.
So it doesnnt crash when configuration is DEBUG Configuration? There are many things different than a RELEASE configruation:
1.) Initialization of globals
2.) Actual machine Code generated etc..
So first step is find out what are exact settings for each parameter in the RELEASE mode as compared to the DEBUG mode.
-AD
1) The crash happens only in the Release version.
That's usually a sign that you're relying on some behaviour that's not guaranteed, but happens to be true in the debug build. For example, if you forget to initialize your variables, or access an array out of bounds. Make sure you've turned on all the compiler checks (/RTCsuc). Also check things like relying on the order of evaluation of function parameters (which isn't guaranteed).
2) The crash goes away as soon as I remove all GDI-related stuff.
Maybe that's a hint that you're doing something wrong with the GDI related stuff? Are you using HANDLEs after they've been freed, for example?
Download the Debugging tools for Windows package. Set the symbol paths correctly, then run your application under WinDbg. At some point, it will break with an Access Violation. Then you should run the command "!analyze -v", which is quite smart and should give you a hint on whats going wrong.
Most heisenbugs / release-only bugs are due to either flow of control that depends on reads from uninitialised memory / stale pointers / past end of buffers, or race conditions, or both.
Try overriding your allocators so they zero out memory when allocating. Does the problem go away (or become more reproducible?)
Writig a log shows that the crash happens on a declaration of a local int variable! How could that be? Memory corruption?
Stack overflow! ;)
4) Writig a log shows that the crash happen on a declaration of a local int variable!how could that be? Memory corruption
I've found the cause to numerous "strange crashes" to be dereferencing of a broken this inside a member function of said object.
What does the crash say ? Access violation ? Exception ? That would be the further clue to solve this with
Ensure you have no preceeding memory corruptions using PageHeap.exe
Ensure you have no stack overflow (CBig array[1000000])
Ensure that you have no un-initialized memory.
Further you can run the release version also inside the debugger, once you generate debug symbols (not the same as creating debug version) for the process. Step through and see if you are getting any warnings in the debugger trace window.
"4) Writing a log shows that the crash happens on a declaration of a local int variable! How could that be? Memory corruption?"
This could be a sign that the hardware is in fact faulty or being pushed too hard. Find out if they've overclocked their computer.
When I get this type of thing, i try running the code through gimpels PC-Lint (static code analysis) as it checks different classes of errors to BoundsChecker. If you are using Boundschecker, turn on the memory poisoning options.
You mention AMD CPUs. Have you investigated whether there is a similar graphics card / driver version and / or configuration in place on the machines that crash? Does it always crash on these machines or just occasionally? Maybe run the System Information tool on these machines and see what they have in common,
Sounds like stack corruption to me. My favorite tool to track those down is IDA Pro. Of course you don't have that access to the user's machine.
Some memory checkers have a hard time catching stack corruption ( if it indeed that ). The surest way to get those I think is runtime analysis.
This can also be due to corruption in an exception path, even if the exception was handled. Do you debug with 'catch first-chance exceptions' turned on? You should as long as you can. It does get annoying after a while in many cases.
Can you send those users a checked version of your application? Check out Minidump Handle that exception and write out a dump. Then use WinDbg to debug on your end.
Another method is writing very detailed logs. Create a "Log every single action" option, and ask the user to turn that on and send it too you. Dump out memory to the logs. Check out '_CrtDbgReport()' on MSDN.
Good Luck!
EDIT:
Responding to your comment: An error on a local variable declaration is not surprising to me. I've seen this a lot. It's usually due to a corrupted stack.
Some variable on the stack may be running over it's boundaries for example. All hell breaks loose after that. Then stack variable declarations throw random memory errors, virtual tables get corrupted, etc.
Anytime I've seen those for a prolong period of time, I've had to go to IDA Pro. Detailed runtime disassembly debugging is the only thing I know that really gets those reliably.
Many developers use WinDbg for this kind of analysis. That's why I also suggested Minidump.
Try Rational (IBM) PurifyPlus. It catches a lot of errors that BoundsChecker doesn't.