Debug Crash in DLL - c++

I'm trying to debug a crash in one of our DLLs. It is loaded into Server Manager and crashes when trying to configure Active Directory Certificate Services (the DLL is a registered provider). I know the crash is an access violation and I have the PDB file, but I don't know how to go about debugging this. I've read pages such as this and this (didn't help). I tried to glean the info using WinDbg, using lm to get the load address, which appears to be 80000000:
"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\windbg.exe" -z myKSP.dll
Then
0:000> lm
start end module name
00000001`80000000 00000001`8005e000 ...
Then, since the Event Viewer tells me:
Exception code: 0xc0000005
Fault offset: 0x000000000002a601
I tried to view that:
0:000> ln 80000000+2a601
Browse module
Set bu breakpoint
Nothing is shown.
I have VS2015, so I tried to attach to the ServerManager.exe process. Next, I tried loading symbols via Tools->Options->Debugging->Symbols and specifying the path, but when I set a breakpoint, I always receive "no symbols have been loaded". In the same symbols window, I set the cache folder, which downloaded a bunch of stuff, but that did not seem to load anything.
Clearly, I'm not using the tools correctly. How do I debug a DLL, compiled in Release mode with a PDB available, that is loaded by ServerManager.exe (or whatever sub-process it might spawn)?

Start WinDbg, press Ctrl-D to open your dump file, then type the following. That should give you either a meaningful stack after one of the kp500 commands, or at least tell you whether the PDB file doesn't match the binary.
.symfix
.sympath+ <FOLDER_WITH_YOUR_PDB>
.reload
!sym noisy
.reload /v /f myKSP.dll
!sym quiet
kp500
.ecxr
kp500
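One thing worth checking in the ln attempt from the question: lm reports the module start as 00000001`80000000, which is 0x180000000, while the command used 80000000 and so dropped the leading 1. The fault offset from Event Viewer is relative to the module base, so the address to look up is base plus offset. A minimal sketch of that arithmetic (the helper name is made up for illustration):

```cpp
#include <cstdint>

// Hypothetical helper: absolute address of a fault inside a module,
// given the module's load address (from lm) and the module-relative
// fault offset (from Event Viewer).
constexpr std::uint64_t fault_address(std::uint64_t module_base,
                                      std::uint64_t fault_offset) {
    return module_base + fault_offset;
}

// lm printed 00000001`80000000 as the start of the module.
static_assert(fault_address(0x180000000ULL, 0x2a601ULL) == 0x18002a601ULL,
              "base + offset");
```

So once the symbols are loaded, ln 180000000+2a601 is the lookup to try. In a real crash dump the module may be loaded at a different base, so take the base from lm in that session.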

Related

Reasons why my C++ Program runs extremely slow/stalls using Visual Studio debugging (F5) and fast/no stalls not using it (Ctrl+F5) in the same config

It's basically a Hello, World application with all vanilla settings with the addition of one external library and an init to it. The program takes about 21 seconds to run within Visual Studio using the debugger (F5 or Start Debugging), but runs instantly otherwise. Happens in Release and Debug.
The library is for the Julia programming language. I include its lib and header directories and simply call jl_init(). Half the stall happens before the line is even hit.
From the command line (cmd to the project dir and type x64\myprogram.exe) or Ctrl+F5 it runs instantly.
From Visual Studio, using F5 or hitting "Start Debugging", it takes about 10 seconds to even reach the jl_init() line, which is the very first line of the program, and then another 10 seconds to get through it.
#include <cstdio>
#include <julia.h>

int main(int argc, char** argv)
{
    jl_init(); // takes almost 10 seconds to reach this line, before it even runs.
    printf("Hello, World!\n"); // takes another 10 seconds to reach this line.
    return 0;
}
I'm on VS 2019 v142. Windows 10. The project is on a local SSD. I'm not sure how to tackle this problem. Any ideas?
Edit:
It could be related to loading symbols, but these files are mostly built without symbols:
'Julia.exe' (Win32): Loaded 'D:\Program Files\Julia-1.6.2\bin\libjulia.dll'. Module was built without symbols.
I'll add that I went into Tools>Options>Debugging>Symbols and selected "Load all modules, unless excluded", then added these dlls into the list of excluded modules. I also unchecked all symbol file location checking in the same dialog. I don't see any indication in the output or the modules debug window that my changes took effect. I also tried disabling ALL symbol loading in Tools>Options>Debugging>Symbols by selecting "Load only specified modules" and specifying no modules. Making these changes didn't help.
I think it's definitely related to dll loading but don't know how.
Edit 2: disabling Tools > Options > Debugging > General “Load debug symbols in external process” made the stall go from 21 seconds to about 12, which indicates it's symbol related.
One possible issue here is PDB loading.
PDBs are needed to debug libraries, and they help the debugger resolve call stacks that contain functions from those libraries. In many cases you can debug your app just as well without most of them loaded.
You can disable automatic loading, or set an allow list of modules for which you want to load PDBs, by following the Microsoft documentation: https://learn.microsoft.com/en-us/visualstudio/debugger/specify-symbol-dot-pdb-and-source-files-in-the-visual-studio-debugger?view=vs-2019#symbol-file-locations-and-loading-behavior

Get reason that LoadLibrary cannot load DLL

On Linux and Mac, when using dlopen() to load a shared library that links to another library, if linking fails because of a missing symbol, you can get the name of the missing symbol with dlerror(). It says something like
dlopen failed: cannot locate symbol "foo"
On Windows, when using LoadLibrary() to load a DLL with a missing symbol, you can only get an error code from GetLastError() which for this type of issue will always be 127. How can I figure out which symbol is missing, or a more verbose error message from LoadLibrary() that explains why the function failed?
I figured out a way using the MSYS2 terminal. Other methods might work with GUI software.
A major caveat is that this can't be done in pure C/C++ and released for end users. It's for developers only, but it's better than nothing.
Install Debugging Tools for Windows by downloading the Windows SDK and unchecking everything except Debugging Tools.
I could be wrong, but it seems that installing this software installs a hook into the Windows kernel to allow LoadLibrary() to write verbose information to stderr.
Open the MSYS2 Mingw64 terminal as an administrator and run
'/c/Program Files (x86)/Windows Kits/10/Debuggers/x64/gflags.exe' -i main.exe +sls
This prints the following to the terminal to confirm that the registry has been changed.
Current Registry Settings for main.exe executable are: 00000002
sls - Show Loader Snaps
Use -sls instead of +sls if you need to undo, since I believe that the change takes place for all programs called main.exe in Windows globally, not just for your file.
Then running main.exe should print debug information to stderr, but since I'm debugging an -mwindows application, it's not working for me.
But for some reason, running the binary with MSYS2's gdb allows this debug information to be printed to stderr.
Install mingw-w64-x86_64-gdb with MSYS2 and run gdb ./main.exe and type run or r.
Search for a section similar to the following.
warning: 1ec8:43a0 # 764081125 - LdrpNameToOrdinal - WARNING: Procedure "foo" could not be located in DLL at base 0x000000006FC40000.
warning: 1ec8:43a0 # 764081125 - LdrpReportError - ERROR: Locating export "foo" for DLL "C:\whatever\plugin.dll" failed with status: 0xc0000139.
warning: 1ec8:43a0 # 764081125 - LdrpGenericExceptionFilter - ERROR: Function LdrpSnapModule raised exception 0xc0000139
Exception record: .exr 00000000050BE5F0
Context record: .cxr 00000000050BE100
warning: 1ec8:43a0 # 764081125 - LdrpProcessWork - ERROR: Unable to load DLL: "C:\whatever\plugin.dll", Parent Module: "(null)", Status: 0xc0000139
warning: 1ec8:43a0 # 764081171 - LdrpLoadDllInternal - RETURN: Status: 0xc0000139
warning: 1ec8:43a0 # 764081171 - LdrLoadDll - RETURN: Status: 0xc0000139
Great! It says Procedure "foo" could not be located in DLL so we have our missing symbol, just like in POSIX/UNIX's dlopen().
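If you want to pick the name out of a captured log automatically, a small string search is enough. This is only a sketch: the helper name is hypothetical, and it assumes the line has the same shape as the LdrpNameToOrdinal sample above, which may differ between Windows versions.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: pull the procedure name out of a loader-snap line like
//   ... LdrpNameToOrdinal - WARNING: Procedure "foo" could not be located in DLL ...
// Returns an empty string if the line does not have that shape.
std::string missing_procedure(const std::string& line) {
    const std::string marker = "Procedure \"";
    const std::size_t start = line.find(marker);
    if (start == std::string::npos) return {};

    const std::size_t name_begin = start + marker.size();
    const std::size_t name_end = line.find('"', name_begin);
    if (name_end == std::string::npos) return {};

    // Only accept lines that actually report a failed lookup.
    if (line.find("could not be located", name_end) == std::string::npos) return {};

    return line.substr(name_begin, name_end - name_begin);
}
```

Feeding it each line of the captured stderr and keeping the non-empty results gives the list of unresolved imports.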
While the answer from Remy Lebeau is technically correct, determining the missing symbol from GetLastError() is still possible on a Windows platform. To understand what exactly is missing, understanding the terminology is critical.
Symbol:
When a DLL is compiled, its functions are referenced by symbols. These symbols directly relate to the function's name (the symbols are represented by visible and readable strings), its return type, and its parameters. The symbols can actually be read directly through a text editor, although they are difficult to find in large DLLs.
(DLL Symbols - C++ Forum)
A missing symbol implies that a function cannot be found. If this error occurs before you ever call GetProcAddress(), then it's possible that any number of functions cannot be resolved because of missing prerequisites: the library you are attempting to load may itself require another library that cannot be loaded. These levels of dependency may go on for an unknown number of layers, but all that GetLastError() can tell you is that there was a missing symbol. One way to find the culprit is to use Dependency Walker to determine which library the first library is missing. Once all required libraries are available and can be found by that library (which can be its own can of worms), it can be loaded via LoadLibrary().
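It may help to tell apart the error codes that usually come back from a failed LoadLibrary(). The sketch below is a plain lookup table: the numeric values are the documented winerror.h constants, but the helper itself and its wording are my own. Note that a dependency that cannot be found normally reports 126 rather than 127.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: describe the Win32 error codes that a failed
// LoadLibrary() most commonly reports. Values are from winerror.h.
std::string describe_load_error(unsigned long code) {
    switch (code) {
        case 126:
            return "ERROR_MOD_NOT_FOUND: the DLL or one of its dependencies was not found";
        case 127:
            return "ERROR_PROC_NOT_FOUND: an imported function is missing from a DLL";
        case 193:
            return "ERROR_BAD_EXE_FORMAT: not a valid image, e.g. a 32-bit/64-bit mismatch";
        default:
            return "other error " + std::to_string(code);
    }
}
```

On Windows you would call describe_load_error(GetLastError()) immediately after LoadLibrary() returns NULL. It still won't name the missing symbol, which is why a tool such as Dependency Walker is needed for that part.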

nsi.pdb Cannot Load Symbol Error When I am trying to toggle breakpoint C++

I get an error when I try to put a breakpoint on a line. The breakpoint is placed correctly, but when I start the program it says the breakpoint will not currently be hit because no symbols have been loaded.
When I took a look at the Modules list, I saw that nsi.pdb doesn't exist or couldn't be loaded, and I'm not sure why.
I installed Debugging Tools for Windows, both x86 and x64.
Here's the Modules window:

Check dll size in windbg

My app crashes, so I am using WinDbg to check the trace log. Here is the trace log from WinDbg:
FAILED_INSTRUCTION_ADDRESS:
cbmwk5!unloaded+161c0
727261c0 ?? ???
ANALYSIS_VERSION: 6.3.9600.17298 (debuggers(dbg).141024-1500) amd64fre
LAST_CONTROL_TRANSFER: from 423c1d76 to 72725817
IP_MODULE_UNLOADED:
cbmwk5!unloaded+15817
72725817 ?? ???
PRIMARY_PROBLEM_CLASS: BAD_INSTRUCTION_PTR_NULL
DEFAULT_BUCKET_ID: BAD_INSTRUCTION_PTR_NULL
STACK_TEXT:
023eebfc 72725817 cbmwk5!unloaded+0x15817
023eec00 423c1d76 unknown!unknown+0x0
023eec30 7272590a cbmwk5!unloaded+0x1590a
023eec58 72723a39 cbmwk5!unloaded+0x13a39
023ef120 76b65762 shell32!SHGetFolderPathW+0x180
023ef128 72720813 cbmwk5!unloaded+0x10813
023ef144 7271110e cbmwk5!unloaded+0x110e
023ef164 72715916 cbmwk5!unloaded+0x5916
023ef59c 7271636b cbmwk5!unloaded+0x636b
Could you please help me check the size of cbmwk5.dll based on the STACK_TEXT?
What is the meaning of "+0x15817" in the statement:
023eebfc 72725817 cbmwk5!unloaded+0x15817
I tried to reload using the command:
.reload /unl cbmwk5.dll
and then typed !analyze -v,
but an error about the missing cbmwk5.dll occurs:
SYMSRV: c:\localsymbols\cbmwk5.dll\506DCE083b000\cbmwk5.dll not found
SYMSRV: http://msdl.microsoft.com/download/symbols/cbmwk5.dll/506DCE083b000/cbmwk5.dll not found
DBGHELP: C:\Program Files (x86)\Windows Kits\8.1\Debuggers\cbmwk5.dll - file not found
DBGENG: cbmwk5.dll - Image mapping disallowed by non-local path.
DBGHELP: No header for cbmwk5.dll. Searching for dbg file
DBGHELP: .\cbmwk5.dbg - file not found
DBGHELP: .\dll\cbmwk5.dbg - path not found
DBGHELP: .\symbols\dll\cbmwk5.dbg - path not found
DBGHELP: cbmwk5.pdb - file not found
*** WARNING: Unable to verify timestamp for cbmwk5.dll
*** ERROR: Module load completed but symbols could not be loaded for cbmwk5.dll
DBGHELP: cbmwk5 - no symbols loaded
DBGHELP: C:\Program Files (x86)\Windows Kits\8.1\Debuggers\cbmwk5.dll - file not found
SYMSRV: C:\Program Files (x86)\Windows Kits\8.1\Debuggers\x64\sym\cbmwk5.dll\506DCE083b000\cbmwk5.dll not found
Thanks a lot.
What is the meaning of "+0x15817" in the statement
+0x15817 means that the debugger has no clue whatsoever which function was called. It doesn't know anything about the DLL, only where it was once loaded, so it can only annotate the address with the DLL name and a very large offset. As the SYMSRV trace messages show, the debugger made an attempt to locate the DLL image and its PDB file, but the symbol server doesn't know anything about the DLL. That is certainly not unusual: it is a 3rd party DLL, not Microsoft's. Even Google has never heard of it.
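Put differently, the offset is just the distance from the address where the DLL used to be loaded. Subtracting it from the frame address gives that old load address, and every cbmwk5 frame in the trace agrees on it. A minimal sketch (the helper name is made up for illustration):

```cpp
#include <cstdint>

// Hypothetical helper: the debugger prints module+offset, so the
// module's base address is the absolute address minus that offset.
constexpr std::uint32_t module_base(std::uint32_t address, std::uint32_t offset) {
    return address - offset;
}

// Two addresses from the trace, both inside the unloaded cbmwk5 module.
static_assert(module_base(0x72725817u, 0x15817u) == 0x72710000u, "top frame");
static_assert(module_base(0x727261c0u, 0x161c0u) == 0x72710000u, "failing instruction");
```

So cbmwk5.dll was loaded at 0x72710000 before it was unloaded, and 0x15817 is simply how far into the image the return address points.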
FAILED_INSTRUCTION_ADDRESS:
cbmwk5!unloaded+161c0
The unloaded annotation is your strongest clue to the problem. The code crashed because the DLL was unloaded from memory. Yet the program tried to call it anyway. With no code left to execute (note the ??), the processor is forced to give up and generates an access violation. That was the end of the program, it cannot continue operating.
023ef120 76b65762 shell32!SHGetFolderPathW+0x180
The stack trace gives a (weak) clue to the underlying problem. Beware that this is speculation. But the presence of a shell function like SHGetFolderPathW() is a strong hint that this is a misbehaving shell extension. They can do a lot of damage since they tend to be injected into your program when you use one of the common shell dialogs, like OpenFileDialog. In other words, it doesn't have anything to do with your program; it is somebody else's crappy code that made the program bomb.
You fix this kind of problem by disabling shell extensions one by one until the problem disappears. SysInternals' AutoRuns utility is the weapon of choice. It has to be done by the machine owner, there's little that you can do but give advice.
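As for the size of cbmwk5.dll: as far as I know, the path component 506DCE083b000 in the SYMSRV lines is the index symbol servers use for binaries, made of the image's link timestamp (the first eight hex digits) followed by its SizeOfImage in hex. Under that assumption, the size can be read straight from the trace. A minimal sketch with hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

struct ImageKey {
    std::uint32_t timestamp;      // TimeDateStamp from the PE header
    std::uint32_t size_of_image;  // SizeOfImage from the PE header, in bytes
};

// Hypothetical helper: split a symbol-server image index such as
// "506DCE083b000" into its two hex fields. The first 8 digits are the
// timestamp; the remaining digits are the image size.
ImageKey parse_image_index(const std::string& index) {
    if (index.size() <= 8) {
        throw std::invalid_argument("index too short");
    }
    ImageKey key{};
    key.timestamp = static_cast<std::uint32_t>(
        std::stoul(index.substr(0, 8), nullptr, 16));
    key.size_of_image = static_cast<std::uint32_t>(
        std::stoul(index.substr(8), nullptr, 16));
    return key;
}
```

For the index in the question this gives a size of 0x3b000, which is 241664 bytes. That is the size of the image as mapped in memory, which is not necessarily the size of the file on disk.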

Program crash - how to read appcompat.txt?

After the program I am debugging crashes, I am left with heap dump *.mdmp file & appcompat.txt in my Temp directory. I understand that appcompat.txt is an error report. Is there a description of its format?
My appcompat.txt lists a number of DLLs. Am I correct assuming that the reason for a crash could have only come from one of the listed DLLs? Can I limit my debugging effort to the DLLs listed in appcompat.txt?
Thanks in advance!
The minidump file is far more informative for diagnosing crashes:
Install Debugging Tools for Windows, if you don't already have it.
Set up the symbol path variable _NT_SYMBOL_PATH to point to the Microsoft symbol server
Run Windbg and do File -> Open Crash Dump and locate your .dmp or .mdmp file
Type !analyze -v.
This will try to isolate the location of the crash. Note that just because a crash occurs in a particular dll it doesn't mean that is where the bug resides - it could be because an invalid parameter has been passed in from your application code. The analysis should hopefully show you a meaningful stack and an error code which should help in working out the actual cause of the crash.
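For the symbol path step, the value usually has the form srv*<local cache>*<server URL>. Here is a small sketch that builds such a string; the helper name and the cache folder in the usage note are placeholders, not required values.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build an _NT_SYMBOL_PATH value that caches
// downloaded symbols in local_cache and fetches missing ones from the
// Microsoft public symbol server.
std::string microsoft_symbol_path(const std::string& local_cache) {
    return "srv*" + local_cache + "*https://msdl.microsoft.com/download/symbols";
}
```

For example, microsoft_symbol_path("C:\\localsymbols") yields srv*C:\localsymbols*https://msdl.microsoft.com/download/symbols, which is the string to put in the _NT_SYMBOL_PATH environment variable. The symbols for your own binaries are not on that server, so add the folder that holds your PDB files to the path as well.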