I'm trying to debug a segmentation fault using mdb on Solaris (it is the only option available to me). The following is from the core dump:
`threading model: multi-threaded
status: process terminated by SIGSEGV (Segmentation Fault)
C++ symbol demangling enabled
librax.so void RajHistory_Backend::addMessage+0x58()
librax.so void RajErrorSink::output+0xe0()
librax.so void RajErrorSink_bg::DoWrite+9()
librax.so int RajErrorSink_Backend::run+0x128()
libmtcpp.so void*startThread+0x1e()
libc.so.1 _thr_setup+0x5b()
libc.so.1 _lwp_start()`
Code snippet
//
std::deque<char*> _queue;
//
void RajHistory_Backend::addMessage(char *msg)
{
_mutex.lock();
_queue.push_front(msg);
while ( _queue.size()-1 > _size )
{
char *b = _queue.back();
_queue.pop_back();
delete [] b;
}
_mutex.unlock();
}
I'm struggling to find the reason for the crash as I'm new to mdb. I did some debugging using this link.
When I tried to print the variables in this method, I got:
> history_size/d
libenv.so`history_size:
libenv.so`history_size: 20
>msg/C
libc.so.1`msg:
libc.so.1`msg: H
>msg/s
libc.so.1`msg:
libc.so.1`msg: HGHcöHc°HÐÃUHåLeàLmèIüLuðH]ØAõL}øHì0
öIÖ|u5M
ötA>`
Does this mean the received message, char* msg, is invalid?
How do I find the exact line in the addMessage() method that caused the problem? Any hints on how I can debug this in mdb?
When mdb prints msg as libc.so.1`msg, it’s indicating that it found a symbol with that name in libc, not in your code. Similarly, it’s showing libenv.so`history_size because it found a similarly named entry in libenv. mdb won’t know the names of the arguments to your functions.
You’ll have a much easier time if you build your binaries with debugging info (-g flag to the compiler) and use the source level debugger for your compiler (gdb if you're using gcc, dbx if you're using the Sun Studio or Solaris Studio compilers).
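While tracking the crash down, it may also be worth ruling out ownership problems in addMessage itself: `delete [] b` is only valid if every msg was allocated with `new[]` and is not freed anywhere else. The following is a minimal sketch, not a diagnosis of the crash, of the same bounded-history idea with RAII locking and owned strings. `BoundedHistory` and its members are hypothetical names, C++11 is assumed, and unlike the original loop (which keeps up to `_size + 1` entries) this version keeps at most `max_size`.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for RajHistory_Backend; not the original class.
class BoundedHistory {
public:
    explicit BoundedHistory(std::size_t max_size) : max_size_(max_size) {}

    // Takes the message by value so the queue owns its own copy.
    void add_message(std::string msg) {
        std::lock_guard<std::mutex> guard(mutex_);  // released on every exit path
        queue_.push_front(std::move(msg));
        while (queue_.size() > max_size_) {
            queue_.pop_back();  // std::string releases its own buffer
        }
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return queue_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<std::string> queue_;
    std::size_t max_size_;
};
```

With owned strings there is no raw pointer for the caller and the queue to disagree about, which removes one class of invalid-free crashes from consideration.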
Related
I have a program with a large codebase, so I can't share a minimal example. What I've done is removed everything from main so that it looks like this:
int main()
{
std::cout << "here" << std::endl;
return 0;
}
But I'm still including all of the header that I was including before. When I run the debugger (GDB 9.2) it breaks before hitting main (I've a breakpoint set on the std::cout) with the following:
Starting debugger: C:\GameDev\Tools\MSYS2-32\mingw32\bin\gdb.exe -nx -fullname -quiet -args C:/GameDev/Colony/bin/Colony.exe
done
Setting breakpoints
Debugger name and version: GNU gdb (GDB) 9.2
Child process PID: 6840
In ?? () ()
Which I understand means something has happened during initialisation? I looked at the question "Debug error before main() using GDB" and did as suggested: I printed the file info with info file, set a breakpoint manually on the entry point, and ran it again. That doesn't seem to give me any additional info (same as above); or maybe I don't know what I'm looking for and how to retrieve it.
I've tried running the program through Dr Memory, and it seems to execute fine there up until shutdown, after which Dr Memory reports no errors but 2 suspected false positives. Both of these appear to point to MinGW hashtable code, which I believe comes from my use of std::unordered_map in a few places (the only place that hashtable code would come from). But none of that code is invoked because main is effectively empty.
None of that code is statically initialised either.
So, what sort of things can cause this error? I can try and track down the offending code if I know what can do it.
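One general way code can run before main is through the constructor of a namespace-scope object, including one defined in a linked library or pulled in by a header. The sketch below only illustrates that mechanism; `Registrar`, `startup_log`, and `g_texture_cache` are hypothetical names and nothing here is taken from the program in the question.

```cpp
#include <string>
#include <vector>

// Function-local static: constructed on first use, so it is safe to call
// from other objects' constructors during static initialisation.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type standing in for anything with a non-trivial constructor.
struct Registrar {
    explicit Registrar(const char* name) { startup_log().push_back(name); }
};

// Namespace-scope object: its constructor runs before main() is entered.
static Registrar g_texture_cache("texture_cache");
```

By the time main starts, `startup_log()` already holds one entry, so a fault inside such a constructor would surface before any breakpoint in main is reached. Setting a breakpoint on the suspect constructor before starting the program is one way to check this.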
The protobuf definition is:
message AttackUserInfo
{
required string name = 1;
repeated uint32 skill_ids = 2;
}
Now, there is a protobuf object attackUserInfo.
I want to clear skill_ids. What should I do in gdb?
I have called the function Clear_skill_ids(), but it has no effect.
gdb> call attackUserInfo.Clear_skill_ids() //It cored.
What is "It" in "It cored"? Did the program under GDB get SIGSEGV, or did GDB itself dump core?
If the latter, it's a bug in GDB. You should upgrade it to the latest version, and if that doesn't help, file a bug against GDB with repro instructions.
I have a 64-bit C++ server application running on Windows 7. When it does a select on the database and calls next() on the result set, the process simply dies: no exceptions, no dumps, and no debug info after ResultSet->next(). Writes to the database work with no problems, and both reads and writes work in the 32-bit version.
I'm using version 11.2 of the win64 Oracle libraries that came with the Instant Client and SDK.
EDIT: it's the simplest of codes
const std::string sql("select * from schedule_import");
std::auto_ptr<IRecordSet> query = m_conn->Open(sql);
while(query->Next()) // dies
{
const std::string key(query->GetField("bean_key"));
//...
IRecordSet is just an interface for functions common to DB drivers, such as next and getField, and it is implemented here:
bool OracleRecordSet::Next()
{
return m_pResultSet->next() != NULL; //crashes here
}
where m_pResultSet is a oracle::occi::ResultSet*
I'm not familiar with the Oracle API, but the question is whether m_pResultSet is non-NULL (a NULL pointer would explain a segmentation fault) and whether an exception is being thrown (in which case the stack would show abort() or raise()).
After a lot of tries my problem is solved.
I was linking my debug program to oraocci11.lib because I didn't have the debug version and I thought it didn't matter much. After a bit of searching I found the debug version of the library, oraocci11d.lib, with the corresponding DLL, and the crash is gone.
I have a Visual C++ 9 Win32 application that uses a third-party library. When a function from that library is called with a certain set of parameters the program crashes with "exception code 0xC000000D".
I tried to attach Visual Studio debugger - no exceptions are thrown (neither C++ nor structured like access violations) and terminate() is not called either. Still the program just ends silently.
How does it happen that the program just ends abnormally but without stopping in the debugger? How can I localize the problem?
That's STATUS_INVALID_PARAMETER. Use WinDbg to track down who threw it (i.e. attach WinDbg, run sxe eh, then g).
Other answers and comments to the question helped a lot. Here's what I did.
I noticed that if I run the program under the Visual Studio debugger it just ends silently, but if I run it without the debugger it crashes with a message box (the usual Windows message box saying that I lost my unsaved data and everyone is sooo sorry).
So I started the program without the debugger, let it crash and then, while the message box was still there, attached the debugger and hit "Break". Here's the call stack:
ntdll.dll!_KiFastSystemCallRet@0()
ntdll.dll!_ZwWaitForMultipleObjects@20() + 0xc bytes
kernel32.dll!_WaitForMultipleObjectsEx@20() - 0x48 bytes
kernel32.dll!_WaitForMultipleObjects@16() + 0x18 bytes
faultrep.dll!StartDWException() + 0x5df bytes
faultrep.dll!ReportFault() + 0x533 bytes
kernel32.dll!_UnhandledExceptionFilter@4() + 0x55c bytes
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//OurCodeInvokingThirdPartyLibraryCode
So obviously that's some problem inside the third-party library. According to MSDN, UnhandledExceptionFilter() is called in fatal situations, and clearly the call is made because of some problem in the library code. So we'll try to work the problem out with the library vendor first.
If you don't have source and debugging information for your third-party library, you will not be able to step into it with the debugger. As I see it, your choices are:
Put together a simple test case illustrating the crash and send it onto the library developer
Wrap that library function in your own code that checks for illegal parameters and throw an exception / return an error code when they are passed by your own application
Rewrite the parts of the library that do not work or use an alternative
Very difficult to fix code that is provided as object only
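The wrapping option above could be sketched roughly as follows. `third_party_resize` is a hypothetical stand-in for the real library function, and the validity rule (both arguments positive) is an assumption made only for illustration; the real checks depend on which parameter sets trigger the crash.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the third-party function that crashes on bad input.
int third_party_resize(int width, int height) {
    return width * height;
}

// Wrapper owned by the application: rejects parameters assumed to be illegal
// before they ever reach the library.
int checked_resize(int width, int height) {
    if (width <= 0 || height <= 0) {
        throw std::invalid_argument(
            "checked_resize: width and height must be positive");
    }
    return third_party_resize(width, height);
}
```

Calling the wrapper everywhere instead of the library function turns a silent process exit into an exception that the application can log and handle.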
Edit You might also be able to exit more gracefully using __try __finally around your main message loop, something like
int CMyApp::Run()
{
__try
{
int i = CWinApp::Run();
m_Exitok = MAGIC_EXIT_NO;
return i;
}
__finally
{
if (m_Exitok != MAGIC_EXIT_NO)
FaultHandler();
}
}
I am working on a migration project in which we are migrating a large set of C++ libraries from Mainframe to Solaris. We have completed the migration successfully, but when running the application it crashes in some places with 'signal SEGV (no mapping at the fault address)'.
Since the application also supports Windows, we checked it with Purify on Windows. There are no memory leaks in the application and it works fine on Windows.
Can anyone suggest what other causes could produce this type of error? Are there any tools for tracing this type of error?
It's not necessarily a memory leak. It could be that a piece of memory is referenced after it has been freed.
A friend once came to me with a piece of code that ran fine on Windows but gave a SEGV on Linux. It turned out that on Windows the memory was sometimes still valid after it had been freed (probably for a short period of time), whereas on Linux the access triggered a SEGV immediately.
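Where the code base allows it, one way to make a use-after-free visible instead of platform-dependent is to let observers hold a std::weak_ptr rather than a raw pointer. This is a minimal sketch assuming C++11 is available on the target compiler; `read_if_alive` is a hypothetical helper, not part of the application discussed here.

```cpp
#include <memory>
#include <string>

// Returns the observed string if its owner is still alive,
// or a fixed marker if the object has already been destroyed.
std::string read_if_alive(const std::weak_ptr<std::string>& observer) {
    if (std::shared_ptr<std::string> locked = observer.lock()) {
        return *locked;  // safe: 'locked' keeps the object alive here
    }
    return "<expired>";
}
```

With a raw pointer the second access would be undefined behaviour that may appear to work on one platform and fault on another; with weak_ptr the expired state can be checked explicitly.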
The line below looks wrong to me
m_BindMap[sLabel] = b; // crashes at this line at when map size
I assume you are trying to add a number to the end of the string. Try this instead
stringstream ss;
ss << ":BIND" << ns;
string sLabel = ss.str();
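A self-contained version of this suggestion might look like the sketch below. `make_bind_label` is a hypothetical helper, and the two-digit zero padding is only a guess at what `CNumStringTraits(0,2,'0')` in the question is meant to do. The concern with `":BIND"+ns` is that if ns converts implicitly to an integer, the expression is pointer arithmetic on the string literal rather than concatenation.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Builds a label such as ":BIND03" from a numeric index.
// The width of 2 and the '0' fill are assumptions, not taken from CNumString.
std::string make_bind_label(std::size_t index) {
    std::ostringstream ss;
    ss << ":BIND" << std::setw(2) << std::setfill('0') << index;
    return ss.str();
}
```

Under these assumptions, `make_bind_label(m_BindMap.size())` would replace the two lines that build sLabel from ns.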
Are you using g++? If so, recompile with the "-g" flag. Run the program in gdb. When it crashes type "bt" (for backtrace) and that should tell you where your problem is.
I am using the CC compiler on Solaris and the dbx debugger. I know the call stack where it is crashing, but it is an abnormal crash.
map<string,CDBBindParam,less<string> >m_BindMap;
CNumString ns(CNumStringTraits(0,2,'0'));
ns = m_BindMap.size();
string sLabel = ":BIND"+ns;
CDBBindParam b(sLabel,val);
m_BindMap[sLabel] = b; // crashes at this line when the map size is more than 2
return sLabel;