Cannot Debug Shared Library - Symbols Not Loading Properly - c++

I am currently writing a small library, and I want to check it for leaks (among other things); however, for some reason, gdb is not loading the library symbols. I have read many other posts on here (and various other places on the internet) about this, however, I cannot seem to find a solution. Here is what is going on:
I am compiling the shared library with the following flags (these are included in both the final shared library as well as all object files):
CFLAGS=-Wall -O0 -g -fPIC
Likewise, I am compiling the binary memtest (the client application for the library) to check for memory leaks and such with these flags
CFLAGS=-Wall -O0 -g
Now, I inserted a NULL pointer into the library to test if I could trace through it and "debug" the pointer (i.e. it's making it crash). So I try to run it through gdb, but it's a no go. The output of info sharedlibrary is the same for both the executable and the core:
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
... Some libraries I am not worried about debugging...
0x00d37340 0x00d423a4 Yes (*) /home/raged/MyLIB/memtest/../lib/libMyLIB.so.0 <--- My lib
.... and some more....
(*): Shared library is missing debugging information.
As you can see, it's not loading the debug information. I am uncertain as to why this is. I have built and linked everything with the -g flag, and I even try -ggdb and -g3 but nothing seems to work properly. When I load in a core dump, this is what I see:
...some libs...
Reading symbols from /home/raged/MyLIB/memtest/../lib/libMyLIB.so.0...done.
Loaded symbols for /home/raged/MyLIB/memtest/../lib/libMyLIB.so.0
Reading symbols from /usr/lib/libstdc++.so.6...(no debugging symbols found)...done.
...some more libs...
Notice how my library does not give a (no debugging symbols found) error - anyone have any ideas why? As I said before, I am unable to debug this through running the program gdb ./memtest or through debugging the core file.
Thanks for your help.
EDIT It may also be important to note, that (if you didn't realize by path) this library is a local shared library (i.e. I'm using -Wl,-rpath to link/load it)
EDIT2 It seems my version of GDB was out-of-date. Now, I have updated to the latest version from the CVS server (I have also tried latest release version 7.2) and it can "load" symbols. My info sharedlibrary now reads this:
0x00e418b0 0x00e4be74 Yes /home/raged/MyLIB/memtest/../lib/libMyLIB.so.0
However, I am still unable to step through any functions (in the shared library) - anyone have any ideas?
EDIT3 I have also tried to step through linking against a static library (libMyLIB.a) but it still isn't working. My OS is CentOS 5.6; does anyone know of any issues with this system? Also, just another confirmation that my symbols are being loaded (it just can't step through any shared lib function for some reason)
(gdb) sharedlibrary MyLIB
Symbols already loaded for /home/raged/MyLIB/memtest/../lib/libMyLIB.so.0

I found the reason this wasn't working: I was calling an old function call to initialize a pointer in my test executable. Since the object was never being created, I could never step into the library. Once I updated the function call, all worked well.
That said, if anyone experiences similar issues while all symbols appear to be loaded, be sure to check that all pointers are initialized properly even if they have the correct type.

Related

Linking to a shared library that links to another shared library segfaults on exit

Unfortunately I have been unable to create a minimal example, but here is the situation. I have one library that links to another, like this:
add_library(MainLib MainLib.cpp)
add_library(ChildLib ChildLib.cpp)
target_link_libraries(ChildLib MainLib)
I can do this (not using the ChildLib, but rather compiling ChildLib.cpp into the executable directly):
ADD_EXECUTABLE(TestNoLink TestNoLink.cpp ChildLib.cpp)
TARGET_LINK_LIBRARIES(TestNoLink MainLib)
and everything compiles, links, and runs with no problems.
However, if I do this (now using the ChildLib):
ADD_EXECUTABLE(TestChildLink TestChildLink.cpp)
TARGET_LINK_LIBRARIES(TestChildLink ChildLib) # no need to link to MainLib here because ChildLib already links to MainLib
Everything still compiles and links, and it actually runs fine as well, but after it is finished running, it segfaults.
Is there some concept I should be looking for here to figure out what is causing this?
The problem was apparently that I was linking to a shared AND static version of the same library (a Boost library). Shouldn't the linker notice that and produce an error?

Debugger can't match source or step code in library

I'm trying to debug some code coming from the Crypto++ library, but I'm getting non-sensical information during the session.
The function of interest is DEREncodePrivateKey. Its a member function on DL_PrivateKey_EC<T> (Crypto++ is heavily templated).
228 pk.DEREncodePrivateKey(encoder);
(gdb) s
non-virtual thunk to CryptoPP::DL_PrivateKey_EC<CryptoPP::ECP>::DEREncodePrivateKey(CryptoPP::BufferedTransformation&) const
(this=0x7fff5fbfeca0, bt=#0x7fff5fbfe540) at dll.cpp:690
Line number 690 out of range; dll.cpp has 146 lines.
dll.cpp can be found at trunk / c5 / dll.cpp, and it only has 146 lines ad gdb reported.
The object was dynamic_cast just before the line in question:
const PKCS8PrivateKey& pk = dynamic_cast<const PKCS8PrivateKey&>(key);
I built the library from sources with -O0 -g3, so I think I minimized some/ most of the typical problems.
I've tried building the library and my test programs with different compilers (g++ and clang++), and I've tried debugging it with different debuggers (gdb and lldb). But I still get the non-sensical information and the library cannot be stepped in this area. Other areas are OK (as can be seen before the issue).
I'm also certain that I'm using my version of the library. Its being linked as a static lib using the full path to the library, and info shared confirms Apple is not sneaking in a dynamic library.
I need to step DL_PrivateKey_EC<CryptoPP::ECP>::DEREncodePrivateKey to see what's going on. I think the function that's being called is in eccrypto, but I'd like to see it under the debugger.
Any ideas how to proceed?
It sounds like the line table in the debug info is incorrect. On Mac OS X you can dump the line table with dwarfdump --debug-line appname.dSYM or if you are looking at a specific object file, dwarfdump --debug-line dll.o. From within lldb you can also do target modules dump line-table dll.cpp. My guess is that some function from a header file was inlined into your dll.cpp -- from line 690 -- but the linetable may incorrectly forget to switch from dll.cpp to your header file.
It seems unlikely that you would see the same problem from both g++ and from clang++. I suspect at least part of your project was not rebuilt.
fwiw you can continue debugging here -- the only problem is that the debugger won't be showing you the source as you step through your method. But you can continue to step and I expect you'll be back in your dll.cpp sources again in no time.

Internal exceptions in shared library terminate end user application

I am building a shared library which uses Boost.thread internally. As a result, Boost.system is also used since Boost.thread depends on that. My shared library exports a C interface, so I want to hide all my internal exception handling and thread usage etc from the end user. It is supposed to be a black box so to speak. However, when I link with a client application, while the program runs fine - as soon as it is time to stop the processing by invoking a library function I get:
terminate called after throwing an instance of 'boost::thread_interrupted'
I catch this exception internally in the library, so I have no idea why it is not actually being caught. The end user's program is not meant to know about or handle Boost exceptions in any way. When building the shared library, I use static linking for both Boost.thread and Boost.system so the outside world is never meant to see them. I am on GCC 4.7 on Ubuntu 12. On Windows, I have no such problems (neither with MSVC or MinGw).
(EDIT)
I am editing the question to show a minimalistic example that reproduces the problem, as per the requests in the comments.
Here first is the code for testlib.cpp and testlib.h.
testlib.cpp:
#include <boost/thread/thread.hpp>
void thread_func()
{
while(1)
{
boost::this_thread::interruption_point();
}
}
void do_processing()
{
// Start a thread that will execute the function above.
boost::thread worker(thread_func);
// We assume the thread started properly for the purposes of this example.
// Now let's interrupt the thread.
worker.interrupt();
// And now let's wait for it to finish.
worker.join();
}
And now testlib.h:
#ifndef TESTLIB_H
#define TESTLIB_H
void do_processing();
#endif
I build this into a shared library with the following command:
g++ -static-libgcc -static -s -DNDEBUG -I /usr/boost_1_54_0 -L /usr/boost_1_54_0/stage/lib -Wall -shared -fPIC -o libtestlib.so testlib.cpp -lboost_thread -lboost_system -lpthread -O3
Then, I have the code for a trivial client program which looks as follows:
#include "testlib.h"
#include <cstdio>
int main()
{
do_processing();
printf("Execution completed properly.\n");
return 0;
}
I build the client as follows:
g++ -DNDEBUG -I /usr/boost_1_54_0 -L ./ -Wall -o client client.cpp -ltestlib -O3
When I run the client, I get:
terminate called after throwing an instance of 'boost::thread_interrupted'
Aborted (core dumped)
I am not explicitly catching the thread interruption exception, but according to the Boost documentation Boost.thread does that and terminates the given thread only. I tried explicitly catching the exception from within the thread_func function, but that made no difference.
(End OF EDIT)
(EDIT 2)
It is worth noting that even with -fexceptions turned on, the problem still persists. Also, I tried to throw and catch an exception that is defined in the same translation unit as the code that catches and throws it, with no improvement. In short, all exceptions appear to remain uncaught in the shared library even though I definitely have catch handlers for them. When I compile the client file and the testlib file as part of a single program, that is to say without making testlib into a shared library, everything works as expected.
(End OF EDIT 2)
Any tips?
I finally figured it out. The -static flag should never be specified when -shared is specified. My belief was that it merely told the linker to prefer static versions of libraries that it links, but instead it makes the generated dynamic library unsuitable for dynamic linking which is a bit ironic. But there it is. Removing -static solved all my problems, and I am able to link Boost statically just fine inside my dynamic library which handles exceptions perfectly.
Maybe this?
If you have a library L which throws E, then both L and the
application A MUST be linked against X, the library containing the
definition of E.
Try to link executable against boost, too.
A shared library that itself includes statically linked libraries is not such a good idea, and I don't think that this scenario is well supported in the GNU toolchain.
I think that your particular problem arises from the option -static-libgcc, but I've been unable to compile it in my machine with your options. Not that linking statically-dinamically to libpthread.so sounds as such a good idea either... What will happen if the main executable wants to create its own threads? Will it be compiled with -pthread? If it is, then you will link twice the thread functions; if it isn't, it will have the functions but not the precompiler macros nor the thread-safe library functions.
My advice is simply not to compile your library statically, that's just not the Linux way.
Actually that should not be a real problem, even if you don't want to rely on the distribution version of boost: compile your program against the shared boost libraries and deploy all these files (libboost_thread.so.1.54.0, libboost_system.so.1.54.0 and libtestlib.so) to the same directory. Then run your program with LD_LIBRARY_PATH=<path-to-so-files>. Since the client is not intended to use boost directly, it doesn't need the boost headers, nor link them in the compiler command. You still have your black box, but now it is formed by 3 *so files, instead of just 1.

Is it possible to symbolicate C++ code?

I have been running into trouble recently trying to symbolicate a crash log of an iOS app. For some reason the UUID of the dSYM was not indexed in Spotlight. After some manual search and a healthy dose of command line incantations, I managed to symbolicate partially the crash log.
At first I thought the dSYM might be incomplete or something like that, but then I realized that the method calls missing were the ones occurring in C++ code: this project is an Objective-C app that calls into C++ libraries (via Objective-C++) which call back to Objective-C code (again, via Objective-C++ code). The calls that I'm missing are, specifically, the ones that happen in C++ land.
So, my question is: is there some way that the symbolication process can resolve the function calls of C++ code? Which special options do I need to set, if any?
One useful program that comes with the apple sdk is atos (address to symbol). Basically, here's what you want to do:
atos -o myExecutable -arch armv7 0x(address here)
It should print out the name of the symbol at that address.
I'm not well versed in Objective-C, but I'd make sure that the C++ code is being compiled with symbols. Particularly, did you make sure to include -rdynamic and/or -g when compiling the C++ code?
try
dwarfdump --lookup=0xYOUR_ADRESS YOUR_DSYM_FILE
you will have to look up each adress manually ( or write a script to do this ) but if the symbols are ok ( your dSym file is bigger than say 20MB) this will do the job .

XCode 4.2 static libraries linking issue

I have Core static library, a few Component static libraries that relays on the Core one, and then there is an App that links against both Core and Component libraries. My App can link both against Core and Component as long as Component don't uses classes from Core (App uses classes from Core).
I got the following error in both armv6 and armv7 versions. So my problem is not the very popular linking issue that everyone has.
ld: symbol(s) not found for architecture armv6
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I added reference to Core in Component and even added it in "Link Binary With Libraries" which shouldn't be necessary for static lib.
Since I start having this issue I start doubting my design... It probably makes more sense in dynamically linking environment but still it should be doable in static one, especially since this already works under Windows with MSVC compilers.
Edit:
I made some progress! Although I still don't know where to go with it.
Here is my setup:
Core has a class cResourceManager that has a templated method GetResource<T>(int id)
Core also has class cResource
Component has class cMesh that inherits cResource
Here are some tests:
If I try from App to call rm->GetResource<cMesh>(...) I get the linking error
If I try from App to construct cMesh I get linking the linking error
If I try from App to call static method that will return new instance of cMesh I get the linking error
If I comment out the construction of cMesh but leave other member cMesh function calls the App links fine. I can even call delete mesh.
I have never seen anything like it!
If you remove the cMesh constructor, then you are then using the default (no argument, no body) cMesh constructor that is given to you. It almost sounds like there's a build error or missing code as a result of some code in your cMesh constructor and so the library isn't actually getting generated, and perhaps Xcode isn't reporting the error. Xcode is no good at reporting linker errors.
I would suggest looking at what symbols the linker says are missing and double-check that they are actually defined in your code. My guess is that you're using one of those symbols in your cMesh constructor. A lot of times with virtual base classes, you may forget to define and implement a method or two in a child class. Could be a result of missing a method based on your template, or your template isn't #included correctly. This could compile fine but result in linker errors like you're seeing.
If Xcode isn't showing you the full linker error, show the Log Navigator (Command ⌘+7), double-click the last "Build " entry, select the error, and then press the button on the far-right of the row that appears when selected. The symbols should be listed there. If not, it's time for xcodebuild in the Terminal.
If it's not that case, I'd be interested in seeing the results of whether or not the library is being built for the appropriate architecture, or maybe this can spur some progress:
In the Xcode Organizer Shift ⇧+Command ⌘+2, click Projects and find the path to the DerivedData for your project.
In the Terminal, navigate to that directory (cd ~/Library/Developer/Xcode/DerivedData/proj-<random value>/)
Remove (or move aside) the Build directory (rm -r Build)
In Xcode, try to build with the cMesh constructor present.
Find the Library product file (cd Build/Products/<scheme>-iphoneos)
Your compiled static libraries (<libname>.a) should be in this directory. If they're not there, they didn't build (unless you put your products elsewhere). If your libraries are there, let's confirm that they actually are getting built for the appropriate architecture. Run otool -vh <library>.a. You should see something like:
$ otool -vh libtesting.a
Archive : libtesting.a
libtesting.a(testing.o):
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
MH_MAGIC ARM V7 0x00 OBJECT 3 1928 SUBSECTIONS_VIA_SYMBOLS
As you can see, my test library was built for ARMv7.
Make sure you are linking them in the correct order.
If Component depends on symbols in Core, then Component needs to be first in the link order, so the linker knows which symbols to look for in Core.
In MSVC the order doesn't matter, but in most other compiler suites it does.
I don't think Clang generates code for armv6, if you're targeting devices that old you still need to use GCC.