Call ambiguous function in gdb with stripped symbols - gdb

I'm trying to call a function with
(gdb) call fun()
in a 3rd party library libFoo which I get in compiled form with stripped symbols.
(gdb) info function ^fun$
Non-debugging symbols:
0x00007ffff6d7e3b0 fun
The problem is that an unrelated system library, libBar, is also loaded and also contains fun, this time as a variable, and gdb prefers that symbol over the one I want. I suspect this is because that match is a non-stripped debugging symbol.
(gdb) info var ^fun$
File ../bar/baz.c:
256: static const int fun[18];
(gdb) info symbol fun
fun in section .rodata of libBar.so
It's a bit crazy that gdb tries to call a variable as a function, but that's what it does.
The question is: how do I disambiguate the symbol and instruct gdb to use the one from libFoo?
So far, the only way I have found requires a manual step (info function ^fun$ above) followed by call (void)0x00007ffff6d7e3b0(). This isn't good enough because the address has to be looked up by hand, so I can't script the call across different program runs.
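One way to script the manual step is to parse the address out of the info function output and build the call expression from it. The sketch below is plain Python so it can be tried outside gdb. The helper names nondebug_address and call_expression are made up for illustration, and it assumes the output has the same shape as the listing above.

```python
import re

# Sample text in the shape of the "info function ^fun$" listing above.
SAMPLE_INFO_FUNCTION = "Non-debugging symbols:\n0x00007ffff6d7e3b0  fun\n"


def nondebug_address(info_output, name):
    """Return the address listed for `name` under 'Non-debugging symbols:'.

    Returns None if the text has no such entry.
    """
    in_nondebug = False
    for line in info_output.splitlines():
        stripped = line.strip()
        if stripped == "Non-debugging symbols:":
            in_nondebug = True
            continue
        if not in_nondebug:
            continue
        # Rows in this part look like "<hex address>  <symbol name>".
        match = re.fullmatch(r"(0x[0-9a-fA-F]+)\s+(\S+)", stripped)
        if match and match.group(2) == name:
            return int(match.group(1), 16)
    return None


def call_expression(address):
    """Build the same kind of command as the manual workaround."""
    return "call (void){:#x}()".format(address)
```

Inside gdb's Python scripting, the text could come from gdb.execute("info function ^fun$", to_string=True), and the resulting command could be passed back to gdb.execute. That wiring is left out here because it only runs inside gdb.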

Related

"undefined reference to __dso_handle" while linking static library with -nostdlib [duplicate]

I have an unresolved symbol error when trying to compile my program which complains that it cannot find __dso_handle. Which library is this function usually defined in?
Does the following result from nm on libstdc++.so.6 mean it contains that?
I tried to link against it but the error still occurs.
nm libstdc++.so.6 | grep dso
00000000002fc480 d __dso_handle
__dso_handle is a "guard" that is used to identify dynamic shared objects during global destruction.
Realistically, you should stop reading here. If you're trying to defeat object identification by messing with __dso_handle, something is likely very wrong.
However, since you asked where it is defined: the answer is complex. To surface the location of its definition (for GCC), use iostream in a C++ file, and, after that, do extern int __dso_handle;. That should surface the location of the declaration due to a type conflict (see this forum thread for a source).
Sometimes, it is defined manually.
Sometimes, it is defined/supplied by the "runtime" installed by the compiler (in practice, the CRT is usually just a bunch of binary header/entry-point-management code, and some exit guards/handlers). In GCC (not sure if other compilers support this; if so, it'll be in their sources):
Main definition
Testing __dso_handle replacement/tracker example 1
Testing __dso_handle replacement/tracker example 2
Often, it is defined in the stdlib:
Android
BSD
Further reading:
Subtle bugs caused by __dso_handle being unreachable in some compilers
I ran into this problem. Here are the conditions which seem to reliably generate the trouble:
g++ linking without the C/C++ standard library: -nostdlib (typical small embedded scenario).
Defining a statically allocated standard library object; specific to my case is std::vector. Previously this was std::array statically allocated without any problems. Apparently not all std:: statically allocated objects will cause the problem.
Note that I am not using a shared library of any type.
GCC/ARM cross compiler is in use.
If this is your use case then merely add the command line option to your compile/link command line: -fno-use-cxa-atexit
Here is a very good link to the __dso_handle usage as 'handle to dynamic shared object'.
There appears to be a typo in the page, but I have no idea who to contact to confirm:
After you have called the objects' constructor destructors GCC automatically calls the function ...
I think this should read "Once all destructors have been called GCC calls the function" ...
One way to confirm this would be to implement the __cxa_atexit function as mentioned and then single step the program and see where it gets called. I'll try that one of these days, but not right now.
Adding to @natersoz's answer:
For me, using -Wabi-tag -D_GLIBCXX_USE_CXX11_ABI=0 alongside -fno-use-cxa-atexit helped compile an old lib. A telltale is if the C++ functions in the error message have std::__cxx11 in them, due to an ABI change.

Why do this linker warning and segmentation fault happen?

I recently upgraded an external library, librdkafka, from version 1.3.0 to 1.6.1.
I built the new version and linked it as a shared object.
The following warning then appeared when my program was linked.
/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/ld:
Warning: type of symbol `mtx_lock' changed from 2 to 1
in ../externals/synapfilter/lib/libsnf.a(memoryUtil.cpp.o)
A segmentation fault also occurred during program execution.
The gdb output is as follows.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000f27a80 in mtx_lock ()
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.5-7.el6_0.x86_64 cyrus-sasl-lib-2.1.23-15.el6_6.2.x86_64 glibc-2.12-1.192.el6.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-57.el6.x86_64 libcom_err-1.41.12-22.el6.x86_64 libgcc-4.4.7-17.el6.x86_64 libicu-4.2.1-14.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 libstdc++-4.4.7-17.el6.x86_64 libzstd-1.4.5-3.el6.x86_64 lz4-r131-1.el6.x86_64 nss-softokn-freebl-3.14.3-23.3.el6_8.x86_64 openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x0000000000f27a80 in mtx_lock ()
#1 0x00007f59479a38cc in rd_kafka_global_cnt_incr () at rdkafka.c:182
#2 rd_kafka_new (type=type@entry=RD_KAFKA_PRODUCER, app_conf=app_conf@entry=0x2531870, errstr=errstr@entry=0x7ffd71c7c7d0 <incomplete sequence \350>,
errstr_size=errstr_size@entry=512) at rdkafka.c:2092
I found that the name mtx_lock is defined in both of the external libraries I use.
In libsnf.a, one object file defines it as a global variable.
$ objdump -t memoryUtil.cpp.o | grep mtx_lock
0000000000000000 g O .bss 0000000000000028 mtx_lock
In librdkafka.a, one object file defines the same name as a function.
$ objdump -t tinycthread.o | grep mtx_lock
0000000000000090 g F .text 0000000000000016 mtx_lock
I wonder why this is happening and how to fix it.
In my makefile, I link libsnf.a as a static library and librdkafka.so as a dynamic library.
I wonder why this is happening
You have two separate object files: memoryUtil.cpp.o and tinycthread.o, defining the same symbol: mtx_lock. One of them defines it as a function, the other as a variable.
Normally this should result in "multiply defined" symbol error at link time, but you get a warning instead. I am not sure why; perhaps one of these symbol definitions is weak.
(In general, you should never use objdump to look at ELF symbols -- use readelf -Ws instead.)
Your program proceeds to call mtx_lock(), but gets a data variable instead, and crashes.
and how to fix it.
Since these libraries are open source, the easiest fix is to rename one (or both) of the symbols, and rebuild.
If you don't want to rebuild, you could use objcopy --redefine-sym ... to achieve the same result.
Update:
The mtx_lock() function is part of the C11 standard, which makes its use as a variable in libsnf highly problematic.
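A clash like this can be spotted before linking by comparing the symbol tables of the two objects. Here is a minimal sketch, assuming readelf -Ws output in its usual eight-column layout. The function names and the sample rows are illustrative: the values and sizes mirror the objdump listings above, while the Num and Ndx columns are invented.

```python
# Hypothetical rows in "readelf -Ws" layout: Num: Value Size Type Bind Vis Ndx Name
SAMPLE_MEMORYUTIL = """\
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     9: 0000000000000000    40 OBJECT  GLOBAL DEFAULT    4 mtx_lock
"""
SAMPLE_TINYCTHREAD = """\
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    12: 0000000000000090    22 FUNC    GLOBAL DEFAULT    1 mtx_lock
"""


def defined_symbols(readelf_output):
    """Map name -> ELF symbol type for defined GLOBAL/WEAK symbols."""
    symbols = {}
    for line in readelf_output.splitlines():
        fields = line.split()
        # Skip headers and rows without a name (for example section symbols).
        if len(fields) != 8 or not fields[0].rstrip(":").isdigit():
            continue
        _num, _value, _size, sym_type, bind, _vis, ndx, name = fields
        if bind in ("GLOBAL", "WEAK") and ndx != "UND":
            symbols[name] = sym_type
    return symbols


def type_conflicts(first_output, second_output):
    """Return {name: (type_in_first, type_in_second)} where the types differ."""
    first = defined_symbols(first_output)
    second = defined_symbols(second_output)
    return {
        name: (first[name], second[name])
        for name in first.keys() & second.keys()
        if first[name] != second[name]
    }
```

For the two sample tables, type_conflicts reports mtx_lock as OBJECT in one and FUNC in the other. The numbers in the linker warning are the corresponding ELF symbol types: 1 is STT_OBJECT and 2 is STT_FUNC.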

Symbol lookup error at runtime instead of load time

I have an application which uses a class Foo from an .so shared library. I've come across a problem where at runtime it prints
<appname>: symbol lookup error: <appname>: undefined symbol: <mangled_Foo_symbol_name>
Now, it turned out that the demangled symbol was the constructor of the class Foo, and the problem was simply that an old version of the library was loaded, one which didn't contain Foo yet.
My question isn't about resolving the error (that's obviously to use the correct library), but why it appears at runtime instead of at time of load / startup.
The line of code causing the error just instantiates an object of class Foo, so I'm not using anything like dlopen here, at least not explicitly / to my knowledge.
In contrast, if I remove the whole library from the load search path, I get this error at startup:
<appname>: error while loading shared libraries: libname.so.2: cannot open shared object file: No such file or directory
When the wrong version of gcc / libstdc++ is on the load path, an error also appears at startup:
<appname>: /path/to/gcc-4.8.0/lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by <appname>)
This "fail fast" behavior is much more desirable: I don't want to run my application for quite a while before I finally realize it's using the wrong library.
What causes the load error to appear at runtime and how can I make it appear immediately?
From the man page of ld.so:
ENVIRONMENT
LD_BIND_NOW (libc5; glibc since 2.1.1) If set to a nonempty string, causes the dynamic linker to resolve all symbols at program startup instead of deferring function call resolution to the point when they are first referenced. This is useful when using a debugger.
LD_WARN (ELF only)(glibc since 2.1.3) If set to a nonempty string, warn about unresolved symbols.
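As a rough illustration of the LD_BIND_NOW entry quoted above, a launcher can set the variable for the child process only, so that unresolved function symbols are reported at startup rather than at first call. This is a sketch; the helper names eager_binding_env and run_fail_fast are hypothetical.

```python
import os
import subprocess


def eager_binding_env(base_env=None):
    """Return a copy of the environment with LD_BIND_NOW set to a nonempty string."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_BIND_NOW"] = "1"
    return env


def run_fail_fast(argv):
    """Run argv with eager symbol binding and return its exit status."""
    return subprocess.run(argv, env=eager_binding_env()).returncode
```

Setting the variable per process keeps the rest of the session on the default lazy binding. Linking the application with -Wl,-z,now is another way to request the same behavior without relying on the environment.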
I don't think you can statically link a .so library. If you want to avoid load-time and run-time errors altogether, you have to use static libraries (.a) throughout. If you have neither a static version of the library nor its source, you could try a statifier. A quick search turned up a few statifiers, but I don't know how they work, so I'm leaving that part up to you.

How to Determine Which Shared Library a Function Belongs to in gdb?

When I get the callstack from gdb, I only get function names and source file information.
(gdb) f
#0 main (argc=1, argv=0xbffff1d4) at main.c:5
I don't get which Shared Library or Application the function belongs to.
On Windows, Windbg or Visual Studio will show callstacks with "myDll!myFunc" format, which shows you which module the function belongs to.
Currently in gdb I'm using "info address [function]" to get the address of the function symbol, and then use "info share" to manually find the range in which the function lies in memory to determine which library it is in.
Is there any way to see the library directly, without this manual process?
You can use info symbol. It prints a library name for a function.
Like this:
(gdb) info symbol f
f(double) in section .text of libmylib_gcc.so
(gdb) info symbol printf
printf in section .text of /lib64/libc.so.6
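For comparison, the manual lookup described in the question (take an address, then scan the info share ranges) amounts to the following. This is only a sketch: the addresses in the sample are invented, the function names are hypothetical, and the To column is treated as an exclusive bound, which is an assumption.

```python
import re

# Invented addresses; layout follows gdb's "info sharedlibrary" table.
SAMPLE_INFO_SHARED = """\
From                To                  Syms Read   Shared Object Library
0x00007ffff7bc0000  0x00007ffff7bd0000  Yes         libmylib_gcc.so
0x00007ffff7800000  0x00007ffff7990000  Yes (*)     /lib64/libc.so.6
"""

_ROW = re.compile(
    r"^(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(?:Yes(?: \(\*\))?|No)\s+(.+)$"
)


def parse_ranges(info_shared_output):
    """Return a list of (start, end, library) tuples from the table rows."""
    ranges = []
    for line in info_shared_output.splitlines():
        match = _ROW.match(line.strip())
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            ranges.append((start, end, match.group(3)))
    return ranges


def library_for_address(address, ranges):
    """Return the library whose range contains address, or None."""
    for start, end, library in ranges:
        # Assumption: "To" is one past the last covered address.
        if start <= address < end:
            return library
    return None
```

In practice info symbol does this lookup for you, so the sketch is mainly useful for seeing what the command saves.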

ELF file (.so) debugging, symbol not found VTable error

On Solaris, I have an executable which, per its guidelines, can be extended by adding a shared library (.so). I created lthmyplugin.so and added it as described. The utility runs perfectly fine until it calls my function; once my function is called, it fails.
Questions:
Is there any way to debug?
When I run the truss command, it identifies aa.so.
Also, ldd -d lthmyplugin.so shows no errors except:
symbol not found: __1cIMyPluginG__vtbl_ (./lthmyplugin.so)
symbol not found: __1cIThPluginG__vtbl_ (./lthmyplugin.so)
symbol not found: __1cOThLocalOptionsG__vtbl_ (./lthmyplugin.so)
symbol not found: __1cJThOptionsG__vtbl_ (./lthmyplugin.so)
Can this cause the program to fail?
FYI, I have not used any virtual functions, constructors or destructors.
What does "symbol not found: __1cIThPluginG__vtbl_" mean?
Thanks,
You can use the nm tool to see the functions exposed by the so file. You can call:
nm -g lthmyplugin.so
... To see what functionality it exposes.
Besides that, given you've tagged this as C++, I'm going to take a stab and ask: did you declare your functions with C linkage? If you didn't, the compiler will mangle the names, making them ugly, unreadable and, in 99.9% of cases, unfindable by a host that looks them up by their plain names. Note that a calling-convention attribute such as __attribute__((cdecl)) does not affect name mangling. To keep a function's name unmangled, declare it extern "C", like so:
extern "C" int not_mangled(int some_arg)
{
    return some_arg * 3;
}