dynamic_cast fails between "long distance" siblings on linux compilation - c++

There is a class DerivedClass which inherits from both BaseClassA and BaseClassB publicly. All classes have virtual functions to make sure the virtual table is built properly.
BaseClassA and BaseClassB are located in Library1 and DerivedClass is in Library2.
One function in Library1 retrieves a DerivedClass in the form of a BaseClassA pointer and tries to dynamic_cast to BaseClassB but it fails. The same function works in different environments and compilers (visual studio for instance).
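For reference, the cast being described can be sketched as below. The class bodies are assumptions (the question does not show them), and `toSibling` is a hypothetical name for the Library1 function. Built into a single binary this cast succeeds, which suggests the failure comes from how RTTI is shared between the libraries rather than from the cast itself.

```cpp
// Stand-ins for the classes in the question; the bodies are illustrative assumptions.
struct BaseClassA { virtual ~BaseClassA() = default; };
struct BaseClassB { virtual ~BaseClassB() = default; };
struct DerivedClass : public BaseClassA, public BaseClassB {};

// Hypothetical name for what the Library1 function does: a sibling ("cross")
// cast that relies on run-time type information. It yields nullptr when the
// runtime cannot find a BaseClassB sub-object in the dynamic type of *a.
BaseClassB* toSibling(BaseClassA* a) {
    return dynamic_cast<BaseClassB*>(a);
}
```

Passing a pointer to a plain `BaseClassA` returns nullptr, while passing a `DerivedClass` through its `BaseClassA` pointer should return the address of its `BaseClassB` sub-object.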
Extra information:
Reproduced with 4.1.2 and 4.5 gcc versions (read about the old gcc bug)
Derived class name is "Match" (thought it may collide with other library? python?)
There are tens of libraries compiling one after the other while linking between them.
nm -gC: Found the vtable address for Match
gdb (7.0.1): used to check the vtable address for the class but couldn't understand much. (This gdb version doesn't support "info vtbl", and gdb can't show direct information about the class, claiming it is a typedef.)
readelf -s: I saw the symbol
I figured it might be one of the following problems:
1. Duplicated symbol
2. HIDDEN symbol somewhere in the linked libraries
3. Duplicated vtables
4. The -E linker flag and the RTLD_GLOBAL dlopen flag didn't work (the linking is done in the makefile's linker stage and probably not through dlopen).
5. Non-inline function manipulation (didn't work either, but that could be my mistake in understanding what exactly has to be done)
Been farming the web trying to find a solution. But what I want to know first is What Is The Problem? How can I focus on it?
(5) looks promising even though I didn't manage to use it.
Any suggestions would be greatly appreciated (a solution would be great as well ;) )

Apparently #4 was the answer.
There was a hidden feature that loads functions dynamically, and it had to be handled separately. Adding the -Wl,-E flags to the linking process and changing the loader flags did the trick.
What I want to know is whether there is any Linux-y way of determining that this is the error.
Something like "ldd" command or others (top, nm, readelf, etc) that I tried but couldn't see anything that pointed to this exact error.
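Not a command-line tool, but one in-process check can be sketched in C++. It assumes the failure comes from the same class having two separate RTTI objects in different modules, which the thread does not confirm. `looksLikeDuplicatedRtti` is a hypothetical helper: each library would hand over its own `typeid(Match)` and the helper reports whether the names agree while the objects differ.

```cpp
#include <cstring>
#include <typeinfo>

// Hypothetical diagnostic: true when two type_info objects carry the same
// implementation-defined name but are distinct objects in memory. That
// combination hints that RTTI for one type was emitted (and kept) twice.
bool looksLikeDuplicatedRtti(const std::type_info& a, const std::type_info& b) {
    return &a != &b && std::strcmp(a.name(), b.name()) == 0;
}
```

A result of true would only be a hint that is worth following up with nm or readelf on the individual libraries; it is not proof of the cause.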
Thanks SOF for continuous help in various subjects

Related

"undefined reference to __dso_handle" while linking static library with -nostdlib [duplicate]

I have an unresolved symbol error when trying to compile my program which complains that it cannot find __dso_handle. Which library is this function usually defined in?
Does the following result from nm on libstdc++.so.6 mean it contains that?
I tried to link against it but the error still occurs.
nm libstdc++.so.6 | grep dso
00000000002fc480 d __dso_handle
__dso_handle is a "guard" that is used to identify dynamic shared objects during global destruction.
Realistically, you should stop reading here. If you're trying to defeat object identification by messing with __dso_handle, something is likely very wrong.
However, since you asked where it is defined: the answer is complex. To surface the location of its definition (for GCC), use iostream in a C++ file, and, after that, do extern int __dso_handle;. That should surface the location of the declaration due to a type conflict (see this forum thread for a source).
Sometimes, it is defined manually.
Sometimes, it is defined/supplied by the "runtime" installed by the compiler (in practice, the CRT is usually just a bunch of binary header/entry-point-management code, and some exit guards/handlers). In GCC (not sure if other compilers support this; if so, it'll be in their sources):
Main definition
Testing __dso_handle replacement/tracker example 1
Testing __dso_handle replacement/tracker example 2
Often, it is defined in the stdlib:
Android
BSD
Further reading:
Subtle bugs caused by __dso_handle being unreachable in some compilers
I ran into this problem. Here are the conditions which seem to reliably generate the trouble:
g++ linking without the C/C++ standard library: -nostdlib (typical small embedded scenario).
Defining a statically allocated standard library object; specific to my case is std::vector. Previously this was std::array statically allocated without any problems. Apparently not all std:: statically allocated objects will cause the problem.
Note that I am not using a shared library of any type.
GCC/ARM cross compiler is in use.
If this is your use case then merely add the command line option to your compile/link command line: -fno-use-cxa-atexit
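A plausible reason why std::vector triggers the problem while std::array did not is sketched below. It assumes that the compiler only emits an exit-time registration (the `__cxa_atexit` call that references `__dso_handle`) for static objects whose destructor actually has work to do. `needsExitRegistration` is a hypothetical helper name.

```cpp
#include <array>
#include <type_traits>
#include <vector>

// Hypothetical helper: true if a static object of type T has a non-trivial
// destructor, i.e. the compiler must arrange for it to run at program exit.
template <typename T>
constexpr bool needsExitRegistration() {
    return !std::is_trivially_destructible<T>::value;
}

// std::vector owns heap memory, so its destructor is non-trivial.
static_assert(needsExitRegistration<std::vector<int>>(),
              "vector needs a destructor call at exit");
// std::array<int, N> is a plain aggregate of ints: nothing to destroy.
static_assert(!needsExitRegistration<std::array<int, 4>>(),
              "array of ints needs no destructor call");
```

Under that assumption, any statically allocated object with a non-trivial destructor would be expected to produce the same undefined reference.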
Here is a very good link to the __dso_handle usage as 'handle to dynamic shared object'.
There appears to be a typo in the page, but I have no idea who to contact to confirm:
After you have called the objects' constructor destructors GCC automatically calls the function ...
I think this should read "Once all destructors have been called GCC calls the function" ...
One way to confirm this would be to implement the __cxa_atexit function as mentioned and then single step the program and see where it gets called. I'll try that one of these days, but not right now.
Adding to @natersoz's answer:
For me, using -Wabi-tag -D_GLIBCXX_USE_CXX11_ABI=0 alongside -fno-use-cxa-atexit helped compile an old lib. A telltale is if the C++ functions in the error message have std::__cxx11 in them, due to an ABI change.

How to debug GCC/LD linking process for STL/C++

I'm working on a bare-metal cortex-M3 in C++ for fun and profit. I use the STL library as I needed some containers. I thought that by simply providing my allocator it wouldn't add much code to the final binary, since you get only what you use.
I actually didn't even expect any linking process at all with the STL (giving my allocator), as I thought it was all template code.
I am compiling with -fno-exceptions, by the way.
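For context, "providing my allocator" could look like the minimal sketch below. The question does not show the real allocator, so the arena size, the `sketch` namespace, and the name `ArenaAllocator` are all assumptions. It hands out memory from a fixed buffer, never frees it, and aborts instead of throwing when the buffer is exhausted.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

namespace sketch {

// Fixed arena standing in for whatever memory the bare-metal target provides.
alignas(std::max_align_t) unsigned char g_arena[4096];
std::size_t g_used = 0;

// Minimal bump allocator: allocate() advances an offset, deallocate() is a no-op.
template <typename T>
struct ArenaAllocator {
    using value_type = T;

    ArenaAllocator() noexcept = default;
    template <typename U>
    ArenaAllocator(const ArenaAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        const std::size_t align = alignof(T);
        // Round the current offset up to the alignment of T.
        const std::size_t start = (g_used + align - 1) / align * align;
        // No exceptions available: abort when the request does not fit.
        if (start > sizeof(g_arena) || n > (sizeof(g_arena) - start) / sizeof(T)) {
            std::abort();
        }
        g_used = start + n * sizeof(T);
        return reinterpret_cast<T*>(g_arena + start);
    }

    void deallocate(T*, std::size_t) noexcept {}
};

// All instances share the same arena, so they always compare equal.
template <typename T, typename U>
bool operator==(const ArenaAllocator<T>&, const ArenaAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T>&, const ArenaAllocator<U>&) noexcept { return false; }

}  // namespace sketch
```

A container would then be declared as `std::vector<int, sketch::ArenaAllocator<int>>`. An allocator like this only replaces the heap calls; it does not by itself stop the linker from pulling in other parts of the standard library.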
Unfortunately, about 600KB or more are added to my binary. I looked up what symbols are included in the final binary with nm and it seemed like a joke to me. The list is so long I won't try to paste it here, although there are some weak symbols.
I also looked in the .map file generated by the linker and I even found the scanf symbols
.text
0x000158bc 0x30 /CodeSourcery/Sourcery_CodeBench_Lite_for_ARM_GNU_Linux/bin/../arm-none-linux-gnueabi/libc/usr/lib/libc.a(sscanf.o)
0x000158bc __sscanf
0x000158bc sscanf
0x000158bc _IO_sscanf
And:
$ arm-none-linux-gnueabi-nm binary | grep scanf
000158bc T _IO_sscanf
0003e5f4 T _IO_vfscanf
0003e5f4 T _IO_vfscanf_internal
000164a8 T _IO_vsscanf
00046814 T ___vfscanf
000158bc T __sscanf
00046814 T __vfscanf
000164a8 W __vsscanf
000158bc T sscanf
00046814 W vfscanf
000164a8 W vsscanf
How can I debug this? First, I wanted to understand what exactly GCC is using for linking (I'm linking through GCC). I know that if a symbol is found in a text segment, the whole segment is used, but still that's too much.
Any suggestion on how to tackle this would really be appreciated.
Thanks
Using GCC's -v and -Wl,-v options will show you the linker commands (and version info of the linker) being used.
Which version of GCC are you using? I made some changes for GCC 4.6 (see PR 44647 and PR 43863) to reduce code size to help embedded systems. There's still an outstanding enhancement request (PR 43852) to allow disabling the inclusion of the IO symbols you're seeing - some of them come from the verbose terminate handler, which prints a message when the process is terminated with an active exception. If you're not using exceptions then some of that code is useless to you.
The problem is not about the STL, it is about the Standard library.
The STL itself is pure (in a way), but the Standard Library also includes all those streams packages and it seems that you also managed to pull in the libc as well...
The problem is that the Standard Library has never been meant to be picked apart, so there might not have been much concern into re-using stuff from the C Standard Library...
You should first try to identify which files are pulled in when you compile (using strace for example), this way you can verify that you only ever use header-only files.
Then you can try and remove the linking that occurs. There are options to pass to gcc to specify that you would like a standard-library-free build, something like -nostdlib for example; however, I am not well versed enough in those to instruct you exactly here.

Is it possible to symbolicate C++ code?

I have been running into trouble recently trying to symbolicate a crash log of an iOS app. For some reason the UUID of the dSYM was not indexed in Spotlight. After some manual search and a healthy dose of command line incantations, I managed to symbolicate partially the crash log.
At first I thought the dSYM might be incomplete or something like that, but then I realized that the method calls missing were the ones occurring in C++ code: this project is an Objective-C app that calls into C++ libraries (via Objective-C++) which call back to Objective-C code (again, via Objective-C++ code). The calls that I'm missing are, specifically, the ones that happen in C++ land.
So, my question is: is there some way that the symbolication process can resolve the function calls of C++ code? Which special options do I need to set, if any?
One useful program that comes with the apple sdk is atos (address to symbol). Basically, here's what you want to do:
atos -o myExecutable -arch armv7 0x(address here)
It should print out the name of the symbol at that address.
I'm not well versed in Objective-C, but I'd make sure that the C++ code is being compiled with symbols. Particularly, did you make sure to include -rdynamic and/or -g when compiling the C++ code?
Try:
dwarfdump --lookup=0xYOUR_ADDRESS YOUR_DSYM_FILE
You will have to look up each address manually (or write a script to do this), but if the symbols are OK (your dSYM file is bigger than, say, 20MB) this will do the job.
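If a lookup returns a C++ symbol in its mangled form (for example `_Z3foov`), it can be turned back into a readable name. The sketch below uses `abi::__cxa_demangle` from `<cxxabi.h>`, which GCC and Clang toolchains provide. `demangle` is a hypothetical wrapper name, and whether your tools print mangled names at all is an assumption.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical wrapper: returns the demangled form of an Itanium-ABI mangled
// name, or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);
    return result;
}
```

For instance, `demangle("_Z3foov")` yields `foo()`.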

XCode 4.2 static libraries linking issue

I have a Core static library, a few Component static libraries that rely on the Core one, and then there is an App that links against both Core and Component libraries. My App can link against both Core and Component as long as Component doesn't use classes from Core (App uses classes from Core).
I got the following error in both armv6 and armv7 versions. So my problem is not the very popular linking issue that everyone has.
ld: symbol(s) not found for architecture armv6
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I added a reference to Core in Component and even added it in "Link Binary With Libraries", which shouldn't be necessary for a static lib.
Since I started having this issue, I have started doubting my design... It probably makes more sense in a dynamically linked environment, but it should still be doable in a static one, especially since this already works under Windows with MSVC compilers.
Edit:
I made some progress! Although I still don't know where to go with it.
Here is my setup:
Core has a class cResourceManager that has a templated method GetResource<T>(int id)
Core also has class cResource
Component has class cMesh that inherits cResource
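The setup can be sketched as follows. Only the class names and `GetResource<T>(int id)` come from the description; the storage, the `Add` method, `VertexCount`, and the use of `dynamic_cast` are assumptions made to keep the example self-contained.

```cpp
#include <map>
#include <memory>
#include <utility>

// Core: base class for everything the manager can hold.
class cResource {
public:
    virtual ~cResource() = default;
};

// Component: a concrete resource type (the member function is illustrative).
class cMesh : public cResource {
public:
    cMesh() = default;
    int VertexCount() const { return 0; }
};

// Core: manager with a templated lookup. Because GetResource is a template,
// GetResource<cMesh> is instantiated in whichever module calls it (here, App),
// so that module needs cMesh's symbols to be available at link time.
class cResourceManager {
public:
    // Hypothetical registration method, not part of the original description.
    void Add(int id, std::unique_ptr<cResource> resource) {
        resources_[id] = std::move(resource);
    }

    template <typename T>
    T* GetResource(int id) {
        auto it = resources_.find(id);
        return it == resources_.end() ? nullptr : dynamic_cast<T*>(it->second.get());
    }

private:
    std::map<int, std::unique_ptr<cResource>> resources_;
};
```

In this single-file form everything links; the open question in the real project is which library ends up providing the definitions that the App-side instantiation refers to.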
Here are some tests:
If I try from App to call rm->GetResource<cMesh>(...) I get the linking error
If I try from App to construct cMesh I get the linking error
If I try from App to call static method that will return new instance of cMesh I get the linking error
If I comment out the construction of cMesh but leave other member cMesh function calls the App links fine. I can even call delete mesh.
I have never seen anything like it!
If you remove the cMesh constructor, then you are using the default (no argument, no body) cMesh constructor that is given to you. It almost sounds like there's a build error or missing code as a result of some code in your cMesh constructor and so the library isn't actually getting generated, and perhaps Xcode isn't reporting the error. Xcode is no good at reporting linker errors.
I would suggest looking at what symbols the linker says are missing and double-check that they are actually defined in your code. My guess is that you're using one of those symbols in your cMesh constructor. A lot of times with virtual base classes, you may forget to define and implement a method or two in a child class. Could be a result of missing a method based on your template, or your template isn't #included correctly. This could compile fine but result in linker errors like you're seeing.
If Xcode isn't showing you the full linker error, show the Log Navigator (Command ⌘+7), double-click the last "Build " entry, select the error, and then press the button on the far-right of the row that appears when selected. The symbols should be listed there. If not, it's time for xcodebuild in the Terminal.
If it's not that case, I'd be interested in seeing the results of whether or not the library is being built for the appropriate architecture, or maybe this can spur some progress:
In the Xcode Organizer Shift ⇧+Command ⌘+2, click Projects and find the path to the DerivedData for your project.
In the Terminal, navigate to that directory (cd ~/Library/Developer/Xcode/DerivedData/proj-<random value>/)
Remove (or move aside) the Build directory (rm -r Build)
In Xcode, try to build with the cMesh constructor present.
Find the Library product file (cd Build/Products/<scheme>-iphoneos)
Your compiled static libraries (<libname>.a) should be in this directory. If they're not there, they didn't build (unless you put your products elsewhere). If your libraries are there, let's confirm that they actually are getting built for the appropriate architecture. Run otool -vh <library>.a. You should see something like:
$ otool -vh libtesting.a
Archive : libtesting.a
libtesting.a(testing.o):
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
MH_MAGIC ARM V7 0x00 OBJECT 3 1928 SUBSECTIONS_VIA_SYMBOLS
As you can see, my test library was built for ARMv7.
Make sure you are linking them in the correct order.
If Component depends on symbols in Core, then Component needs to be first in the link order, so the linker knows which symbols to look for in Core.
In MSVC the order doesn't matter, but in most other compiler suites it does.
I don't think Clang generates code for armv6, if you're targeting devices that old you still need to use GCC.

Undefined reference to operator new

I'm trying to build a simple unit test executable, using cpputest. I've built the cpputest framework into a static library, and am now trying to link that into an executable. However, I'm tied into a fairly complicated Makefile setup, because of the related code.
This is my command line:
/usr/bin/qcc -V4.2.4,gcc_ntoarmle_acpp-ne -lang-c++ -O2 -g -g -o Application/UnitTests/Tests/symbols/UnitTestExe -Wl,--start-group Application/UnitTests/Tests/../.objs/main.o Application/UnitTests/lib/libcpputest.a -Wl,--end-group -lm
I'm getting many errors like the following:
Application/UnitTests/lib/libcpputest.a(CommandLineTestRunner.o): In function `CommandLineTestRunner::parseArguments(TestPlugin*)':
Application/UnitTests/cpputest/src/CppUTest/.objs/../CommandLineTestRunner.cpp:114: undefined reference to `operator new(unsigned int, char const*, int)'
I can't figure out what's causing this. Don't I get operator new for free with C++?
You probably need to link with the C++ support runtime library. This happens automatically when you invoke g++. On Linux, this is achieved by adding the -lstdc++ flag to the linker. You have to figure out how to do the same on your platform.
Maybe you're calling gcc, the C compiler instead of g++, which is the C++ compiler.
There's very little information in your question to work from, but it looks like some code uses some form of placement new, and while that special operator new is declared (the compiler finds it and compiles the code using it), the linker can't find its definition.
(Since this old answer of mine seems to still get attention: See here for an extensive discussion on declaration vs. definition.)
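To illustrate the declared-but-not-defined situation: the signature in the error message, `operator new(unsigned int, char const*, int)`, is a placement form that takes a file name and a line number. A minimal sketch of such an overload with its definition present is shown below. The counter and the helper names are hypothetical, and this is not CppUTest's actual implementation.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

namespace sketch {
int g_trackedAllocations = 0;  // hypothetical bookkeeping for the example
}

// Definition of the placement form. If only a declaration like this were
// visible, code using it would compile but fail at link time.
void* operator new(std::size_t size, const char* file, int line) {
    (void)file;  // a real tracker would record where the allocation came from
    (void)line;
    void* p = std::malloc(size == 0 ? 1 : size);
    if (p == nullptr) {
        throw std::bad_alloc();
    }
    ++sketch::g_trackedAllocations;
    return p;
}

// Matching placement delete; the compiler calls it if a constructor throws.
void operator delete(void* p, const char*, int) noexcept {
    std::free(p);
}

// Hypothetical helpers showing the call syntax for the placement form.
int* makeTrackedInt(int value) {
    return new (__FILE__, __LINE__) int(value);
}

void destroyTrackedInt(int* p) {
    operator delete(p, __FILE__, __LINE__);
}
```

If the object file or library containing the two definitions is left out of the link, the call in `makeTrackedInt` would be expected to produce the same kind of undefined reference as in the question.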
You need to rebuild your code from scratch, including the library. I got this error because I inadvertently copied object files compiled on another machine (with the rest of the source) to my machine. Most likely this disturbs the linking step since there are now two types of object files, native (for modified source files) and non-native (all others). I am guessing here, but the operator 'new' means slightly different things on different architectures and that's why you are getting this error.
p.s. I know this is way too late for a useful answer but I'm still posting this for the record.
For QNX 6.5.0 I have specified flag -lang-c++ for qcc (gcc) to avoid the error.
Like the original post, in my case this error happened while trying to link software that uses the CppUTest framework.
In my case, the source of the problem seems to be related to the fact I disabled the MEMORY_LEAK_DETECTION compile option of CppUTest. I enabled it again, which solved the problem.
Sometimes adding -lstdc++ is not enough. You should add it in the right place. For example, I had a list like this, which did not work:
target_link_libraries(cfr2 pthread m stdc++ "${CMAKE_SOURCE_DIR}/compressor/libcompressor.a" )
But this one works fine:
target_link_libraries(cfr2 pthread m "${CMAKE_SOURCE_DIR}/compressor/libcompressor.a" stdc++)
It'd be great if someone explained it in the comment section.