I am currently trying to get to the bottom of a client crash with one of our applications. I have wrapped the app in an exception handler which creates a mini-dump when the crash occurs.
The application crashes with the exception c0000139 (of which there isn't a huge amount of documentation).
The callstack looks like this
ntdll.dll!_RtlRaiseStatus@4() + 0x26 bytes
ntdll.dll!_LdrpSnapThunk@32() + 0x26f48 bytes
ntdll.dll!_LdrpSnapIAT@16() + 0xd9 bytes
ntdll.dll!_LdrpHandleOneOldFormatImportDescriptor@16() + 0x7a bytes
ntdll.dll!_LdrpHandleOldFormatImportDescriptors@16() + 0x2e bytes
ntdll.dll!_LdrpWalkImportDescriptor@8() + 0x11d bytes
ntdll.dll!_LdrpLoadDll@24() - 0x265 bytes
ntdll.dll!_LdrLoadDll@16() + 0x110 bytes
kernel32.dll!_LoadLibraryExW@12() + 0xc8 bytes
odbc32.dll!_ODBCLoadLibraryEx@12() + 0x29 bytes
odbc32.dll!_LoadDriver@12() + 0x119f bytes
odbc32.dll!_SQLDriverConnectW@32() + 0x1be bytes
odbc32.dll!_SQLDriverConnect@32() + 0x125 bytes
It looks like the program is trying to create a database connection (to Oracle via ODBC) and is either failing to find the DLL or has found a DLL that lacks the expected entry point.
I was wondering if anyone could offer advice on how to track this problem down further. If anyone has experienced this problem, I'd be interested in hearing how you solved it.
Thanks in advance
Rich
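That exception code is STATUS_ENTRYPOINT_NOT_FOUND (entry point not found): something is trying to load a DLL, and a function that the DLL imports is missing from one of the DLLs it depends on. That usually means one of the dependent DLLs is the wrong version.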
Use depends.exe to show what the DLL requires.
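For reference, the status codes that appear in crash reports like this one can be named mechanically. The following is only a minimal sketch: describe_status and status_severity are hypothetical helper names, and the table covers just a handful of well-known values (ntstatus.h in the Windows SDK has the full list).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: maps a few well-known NTSTATUS values to their names.
// The table is deliberately tiny; it is not a complete list.
inline std::string describe_status(std::uint32_t code) {
    switch (code) {
        case 0xC0000005u: return "STATUS_ACCESS_VIOLATION";
        case 0xC000000Du: return "STATUS_INVALID_PARAMETER";
        case 0xC0000135u: return "STATUS_DLL_NOT_FOUND";
        case 0xC0000139u: return "STATUS_ENTRYPOINT_NOT_FOUND";
        case 0xC0000409u: return "STATUS_STACK_BUFFER_OVERRUN";
        default:          return "unknown status";
    }
}

// The top two bits of an NTSTATUS hold the severity; 3 means error.
inline unsigned status_severity(std::uint32_t code) {
    return code >> 30;
}
```

Note the difference between 0xC0000135 (a DLL could not be found at all) and 0xC0000139 (the DLL was found, but a function it should export was not), which is what points towards a version mismatch here.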
Enable loader snaps (gflags -i yourapp.exe +sls) and have it pinpoint the library it's failing to find/load (starting the program under the debugger will spew all the loader diagnostics).
Alternatively, get the name of the library from the crash dump by examining parameters of LoadLibraryExW call.
Thanks for all the responses.
Turns out (at least this seems like it was the problem) that we had a configuration problem. Half of the software was set to load the 9i drivers and half of the software was expecting the 10g drivers.
It's early days yet and we need to test this, however it seems very likely that this was the cause.
Cheers
Rich
I have a crash dump file from a customer which I have to analyze. I am new to the world of crash dump analysis. (The source code is C++).
Here is what I have tried:-
I opened the .dmp file with MS Visual Studio which indicated the following error - You cannot debug a 64-bit dump of a 32-bit process. So, I thought of giving WinDbg a try.
When I opened the file in WinDbg after setting the symbol search path, I started getting the following - Debuggee not connected.
Can anyone point me in the right direction? Should I be asking the customer to provide a 32-bit dump from his end, or can this dump file be debugged?
Also, provide the necessary documentation to get started.
To some extent, you can debug a 64-bit dump of a 32-bit process with Windbg by using the wow64exts extension. However, if possible, I think it's best to have a 32-bit dump.
If the customer can provide a 32-bit dump, get it.
Here is a sample of the wow64exts:
0:008> k
Child-SP RetAddr Call Site
00000000`0291f128 00000000`779d263a wow64cpu!CpupSyscallStub+0x2
00000000`0291f130 00000000`7792c4f6 wow64cpu!WaitForMultipleObjects32+0x1d
00000000`0291f1e0 00000000`7792b8f5 wow64!RunCpuSimulation+0xa
00000000`0291f230 000007fe`e51fd6af wow64!Wow64LdrpInitialize+0x435
00000000`0291f770 000007fe`e519c1ae ntdll!_LdrpInitialize+0xde
00000000`0291f7e0 00000000`00000000 ntdll!LdrInitializeThunk+0xe
0:008> .load wow64exts
0:008> !sw
Switched to 32bit mode
0:008:x86> k
ChildEBP RetAddr
02a1f2dc 7783c752 ntdll_779e0000!NtWaitForMultipleObjects+0xc
02a1f460 75b956c0 KERNELBASE!WaitForMultipleObjectsEx+0x10b
02a1f4d4 75b9586a kernel32!WerpReportFaultInternal+0x1c4
02a1f4e8 75b67828 kernel32!WerpReportFault+0x6d
02a1f4f4 778c07c4 kernel32!BasepReportFault+0x19
The most useful first step in crash dump analysis is to load the dump into Windbg (File -> Open crash dump) and then use the
!analyze -v
command. This applies a number of heuristics to rewind slightly from the actual crash site and work out where the cause of the crash is likely to be, e.g. where a null pointer dereference occurred. There's a good tutorial here. A really good site to bookmark is John Robbins' blog, which has lots of great articles about Windbg.
I created a windows-console application that works fine but trying to use Winsock2 (Ws2_32.lib) in another static-library (as part of a larger project) throws an exception.
The code compiles fine and the exe runs. Calls to WSAStartup() and gethostbyname() work as expected, but then calling gethostbyname()
causes:
First-chance exception at 0x76e1c41f in TestApp.exe: 0x000006F4: A null reference pointer was passed to the stub.
which leads to:
First-chance exception at 0x7505cd99 (rpcrt4.dll) in TestApp.exe: 0xC0020043: An internal error occurred in RPC.
I've double-checked that the calling code is the same and that the correct versions of the *.h, *.dll and *.lib files are being used by the linker. As far as I can tell, they are.
I've compared the project settings for the two apps and can't see anything out of the ordinary.
I've also made sure that all the libraries in the project are using the same character set.
[EDIT : changes after discovering that the difference between the two apps is just whether the debugger exception breaks are turned on ]
I can continue past the exceptions and the code appears to run, but I no longer have valid debugging symbols in the function. It isn't a crash, but of course I'd rather not have the exceptions every time I call the function. I can obviously turn the exception breaks off, but aren't they there to tell me something is wrong?
I am currently trying to get the up-to-date symbols for the ws2_32.lib and other modules from the MSDN symbol server / SymChk.exe
[EDIT 2 - finally got symbols for the stack]
> rpcrt4.dll!_NdrClientCall2() + 0x301 bytes
FWPUCLNT.DLL!_FwppProxyEngineOpen@24() + 0x19 bytes
FWPUCLNT.DLL!_FwppSessionCreate@20() + 0xd1 bytes
FWPUCLNT.DLL!_FwpmEngineOpen0@20() + 0x29 bytes
FWPUCLNT.DLL!_FwpIsNameCacheEnabledForProcess@4() + 0x7778 bytes
FWPUCLNT.DLL!_FwpmProcessNameResolutionEvent0@16() + 0x74 bytes
FWPUCLNT.DLL!_NamespaceCallout@12() + 0x72 bytes
ws2_32.dll!PrepareNamespaceCalloutBlob() + 0x153 bytes
ws2_32.dll!getxyDataEnt() + 0x74a7 bytes
ws2_32.dll!_gethostbyname@4() + 0xe7 bytes
I was getting this exception "0x000006F4: A null reference pointer was passed to the stub."
Turns out disabling my 3rd party firewall stopped the exception being thrown. Perhaps the firewall is intercepting the request and messing something up.
Might be worth a try for you :)
I have a dump file, which my DLL creates for any unhandled exception.
When I did something like int* tt = new int[4]; return tt[n]; with n = 4, I would get the dump file, could open it, and could see which line caused the error. This worked both directly from a release exe and from a release DLL.
Now this was an easy error, and I only entered it to test my memory dump creation.
I now have a 900kb dump file, and the event log says the error comes from my .DLL, yet if I open the file, it does not display any source code.
The call stack is
KERNELBASE.dll!RaiseException() + 0x3d bytes
clr.dll!RaiseTheExceptionInternalOnly() + 0x18f bytes
clr.dll!IL_Throw() + 0xe2 bytes
000007fe81f65fd7()
00000000034d1610()
000000002d06ecb8()
436f93ce00050011()
436f93cf00110012()
000000002d06ec50()
00006d930c4f7680()
clr.dll!InlinedCallFrame::`vftable'()
000000002d06f3d8()
which does not help me at all to figure out where in the DLL my error is coming from.
Another issue with debugging this is that it only happens on a live PC, never on my debugging system. Can anyone help me find a way to debug this? It seems to happen when the DLL is called, but not every time: only about every second time (sometimes on the first try, sometimes on the fifth). I am completely lost on what is happening here.
Edit:
Updated the call stack with the Microsoft symbols loaded, but I still do not know where this may be coming from.
You need to load the symbols for kernelbase.dll. And possibly clr.dll.
Presumably you are using Visual Studio?
Set it up to access symbols from the Microsoft symbol server: http://msdn.microsoft.com/en-us/library/b8ttk8zy(v=vs.80).aspx
You may need to right click on items in the callstack and tell it to load symbols.
Additionally be sure to keep a copy of the pdb file for any releases of software you make.
We're using Fogbugz for tracking issues and I am in the middle of writing a C++ wrapper around the XML API for Fogbugz.
The best practice seems to be to use the "scout" field so that similar/same crashes are just counted but not reported again. To do that we need a unique string for a particular cause of a crash.
In Win32 - after getting a dmp file or other crash handler what is a good way to make a unique string for a crash? (we're going to create a dmp file and send it to the fogbugz server)
In previous postings, articles, etc. Joel has made various suggestions, but many of those rely on a language like C# that uses reflection and has a lot of information that is either harder or impossible to get in native code.
Have any other people gotten things like stack traces or other things to make scout entries in fogbugz?
EDIT
To clarify - we don't want a unique id for every incident - there are likely crashes that have the same code path. We want to capture that. I was thinking that we would get the last few stack calls that are in our code (not ones from win32 DLLs) - but I'm not sure how to go about doing this.
Reporting every crash as unique is not right. Reporting all crashes under the same case is not right. Different users repeating a scenario that causes a crash should map to the same incident.
EDIT
What I think we want is a general "signature" of a crash - based on what is on the stack. Similar stacks should have the same signature. For example - take the top 5 methods that are in our app and then the first call (if any) we make into an MS DLL. This would probably be sufficient for a signature and would likely correlate the crashes that are "the same".
So how does one get the list of methods on the stack? And how can you tell if they are from your own app or in another DLL?
EDIT - NOTE
We want to create a "bucket id"/signature while in the exception handler so that we can create the minidump and send it to fogbugz as a scout description. Alternatively we can load up the dump on the next start of the app and send it then with a signature we generate.
Here in my project I use the memory address of the crash as a "unique" ID.
IMO the best thing you can use is the bucket id from dump analysis. Using a properly configured Debugging Tools for Windows (windbg), you can run !analyze -v and classify your dumps into different buckets based on bucket id. The bucket id guarantees that if two dumps are the same, their bucket ids will be the same. That solves part of the puzzle.
Many times two dumps rooted in the same problem will produce different bucket ids (maybe because of a version difference, say your 1.0 and 1.1 both crash at the same point). You can use the faulting module and stack signature to correlate bugs from the same point of fault.
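One way to make such a correlation tolerant of version differences is to drop the code offsets from each frame before comparing. This is only a sketch under stated assumptions: the frames are taken to be in the "module!Function+0xNN" form that the debugger prints, and strip_offset, module_of and correlation_key are hypothetical names, not part of any debugger API.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: drops the "+0x..." offset from a frame such as
// "mymodule!FunctionC+0x7c", leaving "mymodule!FunctionC".
inline std::string strip_offset(const std::string& frame) {
    const std::string::size_type plus = frame.rfind("+0x");
    return plus == std::string::npos ? frame : frame.substr(0, plus);
}

// Returns the module part of "module!Function", or an empty string if the
// frame has no '!' (for example a raw address without symbols).
inline std::string module_of(const std::string& frame) {
    const std::string::size_type bang = frame.find('!');
    return bang == std::string::npos ? std::string() : frame.substr(0, bang);
}

// Builds a key from the faulting module (module of the top frame) followed by
// the offset-free frames, separated by '|'.
inline std::string correlation_key(const std::vector<std::string>& frames) {
    if (frames.empty()) return std::string();
    std::string key = module_of(frames.front());
    for (const std::string& frame : frames) {
        key += '|';
        key += strip_offset(frame);
    }
    return key;
}
```

With this, a 1.0 stack and a 1.1 stack that differ only in offsets map to the same key, at the cost of merging crashes that happen at different places inside the same function.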
Certain things cause very random dumps (e.g. with heap corruption, the faulting module is typically the victim). Therefore dump analysis should be considered best-effort. When you can't, you can't.
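I used something like this to generate exceptions in my last app (MSVC), so every error would get logged with the source file and line it occurred on:
#include <string>

class Error {
    //...
public:
    // __LINE__ expands to an integer literal, so the line is taken as an int.
    Error(std::string file, int line, std::string error);
};
// Captures the source file and line at the point where the macro is used.
#define ERROR(err) Error(__FILE__, __LINE__, err)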
It's probably a little bit late, but I will add my solution here, too, in case it can help other people.
You can do this using tools from "Debugging Tools for Windows", for example windbg.exe or better kd.exe.
Running the command kd.exe -z "path_to_dump.dmp" -c "kb;q" >> dumpstack.txt, you might get the following result:
Microsoft (R) Windows Debugger Version 10.0.15063.400 X86
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [d:\work\bugs\14122\myexe.exe.2624.dmp]
User Mini Dump File with Full Memory: Only application data is available
************* Symbol Path validation summary **************
Response Time (ms) Location
Deferred srv*C:\Symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*C:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 10 Version 15063 MP (4 procs) Free x86 compatible
Product: WinNt, suite: SingleUserTS
15063.0.x86fre.rs2_release.170317-1834
Machine Name:
Debug session time: Fri Oct 13 00:09:01.000 2017 (UTC + 1:00)
System Uptime: 0 days 0:18:33.797
Process Uptime: 0 days 0:03:40.000
................................................................
.....................................................
Loading unloaded module list
..............................
This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(a40.2580): Security check failure or stack buffer overrun - code c0000409 (first/second chance not available)
eax=00000001 ebx=00000000 ecx=00000007 edx=77cc4350 esi=00000000 edi=00000000
eip=62ae7666 esp=0b75e17c ebp=0b75e1a8 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
msvcr120!abort+0x28:
62ae7666 cd29 int 29h
0:068> kd: Reading initial command 'kb;q'
ChildEBP RetAddr Args to Child
0b75e178 62addc5f 935dda1f 00000000 00000000 msvcr120!abort+0x28
0b75e1a8 0b75e7d4 62a9b436 0b75e1dc 62a52aa5 msvcr120!terminate+0x33
WARNING: Frame IP not in any known module. Following frames may be wrong.
0b75e1ac 62a9b436 0b75e1dc 62a52aa5 00000000 0xb75e7d4
0b75e1b4 62a52aa5 00000000 62a59740 0b75e7d4 msvcr120!__FrameUnwindToState+0x89
0b75e1c8 62a52b33 00000000 00000000 00000000 msvcr120!_EH4_CallFilterFunc+0x12
0b75e1f4 62a5a0f3 62b1f7b8 62a4f7c6 0b75e324 msvcr120!_except_handler4_common+0x8e
0b75e214 77cd6152 0b75e324 0b75e7c4 0b75e344 msvcr120!_except_handler4+0x1e
0b75e238 77cd6124 0b75e324 0b75e7c4 0b75e344 ntdll!ExecuteHandler2+0x26
0b75e30c 77cc4266 0b75e324 0b75e344 0b75e324 ntdll!ExecuteHandler+0x24
0b75e30c 74cf28f2 0b75e324 0b75e344 0b75e324 ntdll!KiUserExceptionDispatcher+0x26
0b75e684 62a59339 e06d7363 00000001 00000003 KERNELBASE!RaiseException+0x62
0b75e6c4 6001821c 0b75e6e4 6004e1bc 946a8f2a msvcr120!_CxxThrowException+0x5b
0b75e6f8 60018042 0b75e720 946a8efa ffffffff mymodule!FunctionC+0x7c
0b75e730 60016544 946a8ece ffffffff 092889d8 mymodule!FunctionB+0x32
0b75e754 600166b8 00842338 6000588d 00000001 myothermodule!FunctionB+0x44
From this stack, you can create a unique bucket if you take, for example, only your methods from the stack and concatenate them in a string: "mymodule!FunctionC+0x7c;mymodule!FunctionB+0x32;myothermodule!FunctionB+0x44". For this to work, you need access to your own symbol server, configured either through the environment variable _NT_SYMBOL_PATH or with the -y command line switch.
You can alternatively create a string from the return addresses only (second column): "62addc5f,0b75e7d4,62a9b436,62a52aa5,62a52b33,62a5a0f3,77cd6152,77cd6124,77cc4266,74cf28f2,62a59339,6001821c,60018042,60016544,600166b8"
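The concatenation of your own frames could be automated roughly as follows. This is a sketch, not a finished tool: signature_from_kb is a hypothetical name, the set of "own" modules is something you have to supply, and the parsing assumes the symbol is the last whitespace-separated token on each frame line, as in the output above.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: builds a signature from the text that "kb" prints.
// own_modules lists the module names considered "ours" (an assumption of
// this sketch; the debugger does not know which modules are yours).
inline std::string signature_from_kb(const std::string& kb_output,
                                     const std::set<std::string>& own_modules) {
    std::istringstream lines(kb_output);
    std::string line;
    std::string signature;
    while (std::getline(lines, line)) {
        // Take the last whitespace-separated token on the line.
        std::istringstream tokens(line);
        std::string token;
        std::string last;
        while (tokens >> token) last = token;

        // Headers, warnings and raw addresses contain no "module!symbol".
        const std::string::size_type bang = last.find('!');
        if (bang == std::string::npos) continue;

        // Skip frames that belong to system or runtime modules.
        if (own_modules.count(last.substr(0, bang)) == 0) continue;

        if (!signature.empty()) signature += ';';
        signature += last;
    }
    return signature;
}
```

Symbols that contain spaces (some template or operator names) would be cut short by this tokenizing, so treat the result as a best-effort signature.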
Just use an MD5 string generated from the dump file and you will likely get a unique string for every crash.
I would start with collecting data on how often every function in your code has appeared in a crash report stack trace. Every report would have to be added to some kind of database, and every function would have to be indexed so that you could later query which functions seem to crash more often than others. (And of course, functions like main() will be in every report, but that's understandable.)
Or, if you think that external functions are only noise in the crash reports, you could just remove all those entries from the crash stack traces, and then hash the rest (your functions). That way you could see if any particular call chain of your own functions causes a crash repeatedly, no matter what external functions have been called in between.
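A minimal sketch of that hashing step is below. It assumes the frames are already available as "module!Function" strings without offsets, that you supply the set of your own module names, and it uses FNV-1a purely as a convenient non-cryptographic hash; bucket_hash is a hypothetical name.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: hashes only the frames that belong to our own modules,
// so external frames in between do not change the resulting bucket.
inline std::uint64_t bucket_hash(const std::vector<std::string>& frames,
                                 const std::set<std::string>& own_modules) {
    const std::uint64_t kPrime = 0x100000001b3ull;   // FNV-1a 64-bit prime
    std::uint64_t hash = 0xcbf29ce484222325ull;      // FNV-1a 64-bit offset basis
    for (const std::string& frame : frames) {
        const std::string::size_type bang = frame.find('!');
        if (bang == std::string::npos) continue;                     // no symbol
        if (own_modules.count(frame.substr(0, bang)) == 0) continue; // external
        for (unsigned char c : frame) {
            hash ^= c;
            hash *= kPrime;
        }
        // Mix in a separator so frame boundaries affect the hash.
        hash ^= static_cast<unsigned char>(';');
        hash *= kPrime;
    }
    return hash;
}
```

Two reports whose own call chains match produce the same value even when different system frames sit between them; the order of your own frames still matters.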
Then of course, some of the more complicated problems will not be captured this way anyway, as the stack trace will be completely different. To help that, you could record other data from your application along with the stack trace in every report, like sizes of buffers, counters, states of different parts of the application and so on... And then do some statistics on that.
I have a Visual C++ 9 Win32 application that uses a third-party library. When a function from that library is called with a certain set of parameters the program crashes with "exception code 0xC000000D".
I tried to attach Visual Studio debugger - no exceptions are thrown (neither C++ nor structured like access violations) and terminate() is not called either. Still the program just ends silently.
How does it happen that the program just ends abnormally but without stopping in the debugger? How can I localize the problem?
That's STATUS_INVALID_PARAMETER. Use WinDbg to track down who threw it (i.e. attach WinDbg, then sxe eh, then g).
Other answers and comments to the question helped a lot. Here's what I did.
I noticed that if I run the program under the Visual Studio debugger it just ends silently, but if I run it without the debugger it crashes with a message box (the usual Windows message box saying that I lost my unsaved data and everyone is sooo sorry).
So I started the program without the debugger, let it crash and then - while the message box was still there - attached the debugger and hit "Break". Here's the call stack:
ntdll.dll!_KiFastSystemCallRet@0()
ntdll.dll!_ZwWaitForMultipleObjects@20() + 0xc bytes
kernel32.dll!_WaitForMultipleObjectsEx@20() - 0x48 bytes
kernel32.dll!_WaitForMultipleObjects@16() + 0x18 bytes
faultrep.dll!StartDWException() + 0x5df bytes
faultrep.dll!ReportFault() + 0x533 bytes
kernel32.dll!_UnhandledExceptionFilter@4() + 0x55c bytes
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//OurCodeInvokingThirdPartyLibraryCode
So obviously that's some problem inside the third-party library. According to MSDN, UnhandledExceptionFilter() is called in fatal situations, and clearly the call is made because of some problem in the library code. So we'll try to work the problem out with the library vendor first.
If you don't have source and debugging information for your 3rd party library, you will not be able to step into it with the debugger. As I see it, your choices are:
Put together a simple test case illustrating the crash and send it onto the library developer
Wrap that library function in your own code that checks for illegal parameters and throw an exception / return an error code when they are passed by your own application
Rewrite the parts of the library that do not work or use an alternative
Very difficult to fix code that is provided as object only
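Edit: You might also be able to exit more gracefully using __try/__finally around your main message loop, something like:
int CMyApp::Run()
{
    __try
    {
        int i = CWinApp::Run();
        // Reached only when the message loop exits normally.
        m_Exitok = MAGIC_EXIT_NO;
        return i;
    }
    __finally
    {
        // Runs on every exit path; the flag separates a clean return from an abnormal one.
        if (m_Exitok != MAGIC_EXIT_NO)
            FaultHandler();
    }
}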