Can't figure out what's wrong in my code - c++

I have a dump file which I managed to create, from my DLL which is created for any unhandled exception.
When I did something like int* tt = new int[4]; return int[n]; with n = 4, I would get the dump file, and could open it, and see at what line the error is caused. This was possible for both directly from a release exe, and a release DLL.
Now this was an easy error, and I only entered it to test my memory dump creation.
I now have a 900kb dump file, and the event log says the error comes from my .DLL, yet if I open the file, it does not display any source code.
The call stack is
KERNELBASE.dll!RaiseException() + 0x3d bytes
clr.dll!RaiseTheExceptionInternalOnly() + 0x18f bytes
clr.dll!IL_Throw() + 0xe2 bytes
000007fe81f65fd7()
00000000034d1610()
000000002d06ecb8()
436f93ce00050011()
436f93cf00110012()
000000002d06ec50()
00006d930c4f7680()
clr.dll!InlinedCallFrame::`vftable'()
000000002d06f3d8()
which does not help me at all to figure out where in the DLL my error is coming from.
Another issue with debugging this is, it only happens on a live-PC, but never on my debugging system. Can anyone help me finding a way how to debug this? It seems to happen on the calling of the DLL, but: not every time, only like every 2nd time (sometimes on 1st try, sometimes on 5th). I am completely lost on what is happening here.
Edit:
Updated the call stack with the Microsoft symbols loaded, but I still do not know where this may be coming from.

You need to load the symbols for kernelbase.dll. And possibly clr.dll.
Presumably you are using visual studio?
Set it up to access symbols from the microsoft symbol server: http://msdn.microsoft.com/en-us/library/b8ttk8zy(v=vs.80).aspx
You may need to right click on items in the callstack and tell it to load symbols.
Additionally be sure to keep a copy of the pdb file for any releases of software you make.

Related

Analyzing a crash dump file

I have a crash dump file from a customer which I have to analyze. I am new to the world of crash dump analysis. (The source code is C++).
Here is what I have tried:-
I opened the .dmp file with MS Visual Studio which indicated the following error - You cannot debug a 64-bit dump of a 32-bit process. So, I thought of giving WinDbg a try.
When I opened the file in WinDbg after setting the symbol search path, I started the getting the following - Debuggee not connected.
Can anyone point me out in the right direction? Should I be asking the customer to provide a 32-bit dump from his point or can this dump file be debugged.
Also, provide the necessary documentation to get started.
To some extent, you can debug a 64-bit dump of a 32-bit process with Windbg,
by use of the wow64exts. However if possible I think it’s best to have a 32 bits dump.
If the customer can provide a 32-bit dump , get it.
Here is a sample of the wow64exts:
0:008> k
Child-SP RetAddr Call Site
00000000`0291f128 00000000`779d263a wow64cpu!CpupSyscallStub+0x2
00000000`0291f130 00000000`7792c4f6 wow64cpu!WaitForMultipleObjects32+0x1d
00000000`0291f1e0 00000000`7792b8f5 wow64!RunCpuSimulation+0xa
00000000`0291f230 000007fe`e51fd6af wow64!Wow64LdrpInitialize+0x435
00000000`0291f770 000007fe`e519c1ae ntdll!_LdrpInitialize+0xde
00000000`0291f7e0 00000000`00000000 ntdll!LdrInitializeThunk+0xe
0:008> .load wow64exts
0:008> !sw
Switched to 32bit mode
0:008:x86> k
ChildEBP RetAddr
02a1f2dc 7783c752 ntdll_779e0000!NtWaitForMultipleObjects+0xc
02a1f460 75b956c0 KERNELBASE!WaitForMultipleObjectsEx+0x10b
02a1f4d4 75b9586a kernel32!WerpReportFaultInternal+0x1c4
02a1f4e8 75b67828 kernel32!WerpReportFault+0x6d
02a1f4f4 778c07c4 kernel32!BasepReportFault+0x19
The most useful tool for crash dump analysis is to load it into Windbg (File -> Open crash dump) and then use the
!analyze -v
command. This applies a number of heuristics to rewind slightly from the actual crash site to work out where the cause of the crash is likely to be, eg to where a null pointer dereference occurred. There's a good tutorial here. A really good site to bookmark is John Robbins' blog which has lots of great articles about Windbg.

how to get function name and line number when project crashed in release mode

i have a project in C++/MFC
when i run its in debug mode and project crashed
i can get function and line number of code
with SetUnhandledExceptionFilter function
but in release mode i can not get it
i am test this function and source
_set_invalid_parameter_handler msdn.microsoft.com/en-us/library/a9yf33zb(v=vs.80).aspx
StackWalker http://www.codeproject.com/KB/threads/StackWalker.aspx
MiniDumpReader & crashrpt http://code.google.com/p/crashrpt/
StackTracer www.codeproject.com/KB/exception/StackTracer.aspx
any way to get function and line of code when project crashed in release mode
without require pdb file or map file or source file ?
PDB files are meant to provide you this information; the flaw is that don't you want a PDB file. I can understand not wanting to release the PDB to end users, but in that case why would you want them to see stack trace information? To me your goal is conflicting with itself.
The best solution for gathering debug info from end users is via a minidump, not by piecing together a stack trace on the client.
So, you have a few options:
Work with minidumps (ideal, and quite common)
Release the PDBs (which won't contain much more info than you're already trying to deduce)
Use inline trace information in your app such as __LINE__, __FILE__, and __FUNCTION__.
Just capture the crash address if you can't piece together a meaningful stack trace.
Hope this helps!
You can get verbose output from the linker that will show you where each function is placed in the executable. Then you can use the offset from the crash report to figure out which function was executing.
In release mode, this sort of debugging information isn't included in the binary. You can't use debugging information which simply isn't there.
If you need to debug release-mode code, start manually tracing execution by writing to a log file or stdout. You can include information about where your code appears using __FUNCTION__ and __LINE__, which the compiler will replace with the function/line they appear in/on. There are many other useful predefined macros which you can use to debug your code.
Here's a very basic TR macro, which you can sprinkle through out your code to follow the flow of execution.
void trace_function(const char* function, int line) {
std::cout << "In " << function << " on line " << line << std::endl;
}
#define TR trace_function(__FUNCTION__, __LINE__)
Use it by placing TR at the top of each function or anywhere you want to be sure the flow of execution is reaching:
void my_function() {
TR();
// your code here
}
The best solution though, is to do your debugging in debug mode.
You can separate the debug symbols so that your release version is clean, then bring them together with the core dump to diagnose the problem afterwards.
This works well for GNU/Linux, not sure what the Microsoft equivalent is. Someone mentioned PDB...?

Program crashes with 0xC000000D and no exceptions - how do I debug it?

I have a Visual C++ 9 Win32 application that uses a third-party library. When a function from that library is called with a certain set of parameters the program crashes with "exception code 0xC000000D".
I tried to attach Visual Studio debugger - no exceptions are thrown (neither C++ nor structured like access violations) and terminate() is not called either. Still the program just ends silently.
How does it happen that the program just ends abnormally but without stopping in the debugger? How can I localize the problem?
That's STATUS_INVALID_PARAMETER, use WinDbg to track down who threw it (i.e. attach WinDbg, sxe eh then g.
Other answers and comments to the question helped a lot. Here's what I did.
I notices that if I run the program under Visual Studio debugger it just ends silently, but if I run it without debugger it crashes with a message box (usual Windows message box saying that I lost my unsaved data and everyone is sooo sorry).
So I started the program wihtout debugger, let it crash and then - while the message box was still there - attached the debugger and hit "Break". Here's the call stack:
ntdll.dll!_KiFastSystemCallRet#0()
ntdll.dll!_ZwWaitForMultipleObjects#20() + 0xc bytes
kernel32.dll!_WaitForMultipleObjectsEx#20() - 0x48 bytes
kernel32.dll!_WaitForMultipleObjects#16() + 0x18 bytes
faultrep.dll!StartDWException() + 0x5df bytes
faultrep.dll!ReportFault() + 0x533 bytes
kernel32.dll!_UnhandledExceptionFilter#4() + 0x55c bytes
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//OurCodeInvokingThirdPartyLibraryCode
so obviously that's some problem inside the trird-party library. According to MSDN, UnhandledExceptionFilter() is called in fatal situations and clearly the call is done because of some problem in the library code. So we'll try to work the problem out with the library vendor first.
If you don't have source and debugging information for your 3rd party library, you will not be able to step into it with the debugger. As I see it, your choices are;
Put together a simple test case illustrating the crash and send it onto the library developer
Wrap that library function in your own code that checks for illegal parameters and throw an exception / return an error code when they are passed by your own application
Rewrite the parts of the library that do not work or use an alternative
Very difficult to fix code that is provided as object only
Edit You might also be able to exit more gracefully using __try __finally around your main message loop, something like
int CMyApp::Run()
{
__try
{
int i = CWinApp::Run();
m_Exitok = MAGIC_EXIT_NO;
return i;
}
__finally
{
if (m_Exitok != MAGIC_EXIT_NO)
FaultHandler();
}
}

Function name from Windows stack trace

How do I restore the stack trace function name instead of <UNKNOWN>?
Event Type: Error
Event Source: abcd
Event Category: None
Event ID: 16
Date: 1/3/2010
Time: 10:24:10 PM
User: N/A
Computer: CMS01
Description:
Server.exe caused a in module at 2CA001B:77E4BEF7
Build 6.0.0.334
WorkingSetSize: 1291071488 bytes
EAX=02CAF420 EBX=00402C88 ECX=00000000 EDX=7C82860C ESI=02CAF4A8
EDI=02CAFB68 EBP=02CAF470 ESP=02CAF41C EIP=77E4BEF7 FLG=00000206
CS=2CA001B DS=2CA0023 SS=7C820023 ES=2CA0023 FS=7C82003B GS=2CA0000
2CA001B:77E4BEF7 (0xE06D7363 0x00000001 0x00000003 0x02CAF49C)
2CA001B:006DFAC7 (0x02CAF4B8 0x00807F50 0x00760D50 0x007D951C)
2CA001B:006DFC87 (0x00003561 0x7F6A0F38 0x008E7290 0x00021A6F)
2CA001B:0067E4C3 (0x00003561 0x00000000 0x02CAFBB8 0x02CAFB74)
2CA001B:00674CB2 (0x00003561 0x006EBAC7 0x02CAFB68 0x02CAFA64)
2CA001B:00402CA4 (0x00003560 0x00000000 0x00000000 0x02CAFBB8)
2CA001B:00402B29 (0x00003560 0x00000001 0x02CAFBB8 0x00000000)
2CA001B:00683096 (0x00003560 0x563DDDB6 0x00000000 0x02CAFC18)
2CA001B:00688E32 (0x02CAFC58 0x7C7BE590 0x00000000 0x00000000)
2CA001B:00689F0C (0x02CAFC58 0x7C7BE590 0x00000000 0x00650930)
2CA001B:0042E8EA (0x7F677648 0x7F677648 0x00CAFD6C 0x02CAFD6C)
2CA001B:004100CA (0x563DDB3E 0x00000000 0x00000000 0x008E7290)
2CA001B:0063AC39 (0x7F677648 0x02CAFD94 0x02CAFD88 0x77B5B540)
2CA001B:0064CB51 (0x7F660288 0x563DD9FE 0x00000000 0x00000000)
2CA001B:0063A648 (0x00000000 0x02CAFFEC 0x77E6482F 0x008E7290)
2CA001B:0063A74D (0x008E7290 0x00000000 0x00000000 0x008E7290)
2CA001B:77E6482F (0x0063A740 0x008E7290 0x00000000 0x00000000)
Get the exact same build of the program, objdump it, and have a look at what function is at the address. If symbol names have been stripped from the executable, this might be a little bit difficult.
If the program code is in any way dynamic, you may have to run it into a debugger to find the addresses of functions.
If the program is deliberately obfuscated and nasty, and it randomises its function addresses in some way at runtime (evasive things like viruses, or copy-protection code sometimes do this) then all bets are off.
In general:
The easiest way to find out what caused the crash is to follow the steps necessary to reproduce it in an instance of the application running in a debugger. All other approaches are much more difficult. This is why developers will often not spend time trying to tackle bugs which there are no known methods to reproduce.
You need to have the debug info file (.pdb) next to the exe when the crash occurs.
Then hopefully, your crash dumping code can load it and use the information in it.
Try to run the same built from your IDE, pause it and jump to the adresses in assembly. Then you can switch to source code and see the functions there.
At least that's how it works in Delphi. And I know of this possibility in Visual Studio as well.
If you don't have the built in your IDE, you have to use a Debugger like OllyDbg. Run the exe that caused the errors and pause the application in OllyDbg. Go to the adresses from the stack trace.
Open a similar application project in your IDE, run it and pause it. Search for the same binary pattern you see in OllyDbg and then switch to source code.
The last possibility I know is analysing the map file if you built it during your built.
I use StackWalker - just compile it into the source of your program, and if it crashes you can generate a stack trace at that time, including function names.
The most common cause is that you in fact don't have a module at the specified address. This can happen e.g. when you dereference an uninitialized function pointer, or call a virtual function using an invalid this pointer.
This is probably not the case here: "77E4BEF7" is with very high probability a Windows DLL, and 006DFAC7 an address in one of your modules. You don't need PDB files for this: Windows should always know the module name (.EXE or .DLL) - in fact, it needs first that module name to even find the proper PDB file.
Now, the remaining question is why you don't have the module information. This I can't tell from the information above. You need information about the actual system. For instance, does if have DEP? Is DEP enabled for your process?

LoadLibrary fails when including a specific file during DLL build

I'm getting really strange behavior in one of the DLLs of my C++ app. It works and loads fine until I include a single file using #include in the main file of the DLL. I then get this error message:
Loading components from D:/Targets/bin/MatrixWorkset.dll
Could not load "D:/Targets/bin/MatrixWorkset.dll": Cannot load library MatrixWorkset: Invalid access to memory location.
Now I've searched and searched through the code and google and I can't figure out what is going on. Up till now everything was in a single DLL and I've decided to split it into two smaller ones. The file that causes the problems is part of the other second library (which loads fine).
Any ideas would really be appreciated.
Thanks,
Jaco
The likely cause is a global with class type. The constructor is run from DllMain(), and DllMain() in turn runs before LoadLibrary() returns. There are quite a few restrictions on what you can do until DllMain() has returned.
Is it possible that header includes a #pragma comment(lib,"somelibrary.lib") statement somewhere? If so it's automatically trying to import a library.
To troubleshoot this I'd start by looking at the binary with depends (http://www.dependencywalker.com/), to see if there are any DLL dependencies you don't expect. If you do find something and you are in Visual Studio, you should turn on "Show Progress" AKA /VERBOSE on the linker.
Since you are getting the Invalid Access to memory location, it's possible there's something in the DLLMAIN or some static initializer that is crashing. Can you simplify the MatrixWorkset.dll (assuming you wrote it)?
The error you describe sounds like a run-time error. Is this error displayed automatically by windows or is it one that your program emits?
I say attach a debugger to your application and trace where this error is coming from. Is Windows failing to load a dependency? Is your library somehow failing on load-up?
If you want to rule in/out this header file you're including, try pre-compiling your main source file both with and without this #include and diff the two results.
I'm still not getting it going. Let me answer some of the questions asked:
1) Windows is not failing to load a dependency, I think since Dependency Walker shows everything is ok.
2) I've attached a debugger which basically prints the following when it tries to load MatrixWorkset.dll:
10:04:19.234
stdout:&"warning: Loading components from D:/ScinericSoftware/VisualWorkspace/trunk/Targets/bin/MatrixWorkset.dll\n"
10:04:19.234
stdout:&"\n"
status:Stopped: "signal-received"
status:Stopped.
10:04:19.890
stdout:30*stopped,reason="signal-received",signal-name="SIGSEGV",signal-meaning="Segmentation fault",thread-id="1",frame={addr="0x7c919994",func="towlower",args=[],from="C:\\WINDOWS\\system32\\ntdll.dll"}
input:31info shared
input:32-stack-list-arguments 2 0 0
input:33-stack-list-locals 2
input:34-stack-list-frames
input:35-thread-list-ids
input:36-data-list-register-values x
10:04:19.890
3) MSalters: I'm not sure what you mean with a "global with class type". The file that is giving the problems have been included in a different DLL in which it worked fine and the DLL loaded successfully.
This is the top of the MatrixVariable.h file:
#include "QtSF/Variable.h" // Located in depending DLL (the DLL in which this file always lived.
#include "Matrix.h" // File located in this DLL
#include "QList" // These are all files from the Qt Framework
#include "QModelIndex"
#include "QItemSelection"
#include "QObject"
using namespace Zenautics;
using namespace std;
class MatrixVariable : public Variable
{
Q_OBJECT
Q_PROPERTY(int RowCount READ rowCount WRITE setRowCount)
Q_PROPERTY(int ColumnCount READ columnCount WRITE setColumnCount)
Q_PROPERTY(int UndoPoints READ undoPoints WRITE setUndoPoints)
public:
//! Default constructor.
MatrixVariable(const QString& name, int rows, int cols, double fill_real = 0, double fill_complex = 0, bool isReal = true);
etc. etc. etc.
A possible solution is to put the MatrixVariable file back in the original DLL but that defeats the whole idea of splitting the DLL into smaller parts which is not really a option.
I get that error from GetLastError() when I fail to load a DLL from a command line EXE recently. It used to work, then I added some MFC code to the DLL. Now all bets are off.
I just had this exact same problem. A dll that had been working just fine, suddenly stopped working. I was taking an access violation in the CRT stuff that initializes static objects. Doing a rebuild all did not fix the problem. But when I manually commented out all the statics, the linker complained about a corrupt file. Link again: Worked. Now I can LoadLibrary. Then, one by one, I added the statics back in. Each time, I recompiled and tested a LoadLibrary. Each time it worked fine. Eventually, all my statics were back, and things working normally.
If I had to guess, some intermediate file used by the linker was corrupted (I see the ilk files constantly getting corrupted by link.exe). If you can, maybe wipe out all your files and do a clean build? But I'm guessing you've already figured things out since this is 6 months old ...