Is there a simple way to dump the value of an exported object from a shared library? - gdb

For example, there is a symbol named country. I can get its information (type, address, and size) with nm -D -S:
$ nm -D libs_ma.so -S
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w __cxa_finalize
w __gmon_start__
0000000000004028 0000000000000008 D country
But how can I dump the 8 bytes stored at address 0x4028 with some Linux command (the way dlsym() and printf() would in a C program)?

how can I dump the address (4028) with length (8) by some Linux command
Your best bet is probably to use a debugger, such as radare2 or GDB.
This question shows how to do that in radare2.
Here is how you could do this using GDB:
// foo.c
long country = 0xABCDEF0123456789;
gcc -shared -fPIC -o foo.so foo.c
nm -D foo.so | grep country
0000000000004020 D country
gdb -q --batch -ex 'x/gx 0x4020' foo.so
0x4020 <country>: 0xabcdef0123456789
It is also rather easy to write a program in a language of your choice to do this. Your program would have to:
iterate over LOAD segments until it finds one with .p_vaddr <= $address < .p_vaddr + .p_memsz
mmap or read that segment, seek to offset $address - .p_vaddr within it (that is, file offset .p_offset + $address - .p_vaddr), and dump $length bytes from there.
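The two steps above can be sketched in C++ roughly as follows. This is only an illustration, not a hardened tool: read_at_vaddr is a made-up name, the whole file is assumed to be already loaded into a vector, only ELF64 files whose byte order matches the host are handled, and the range check uses .p_filesz rather than .p_memsz because bytes beyond .p_filesz (such as .bss) have no contents in the file.

```cpp
#include <elf.h>     // Elf64_Ehdr, Elf64_Phdr, PT_LOAD (Linux system header)
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `length` bytes found at virtual address `addr` out of an ELF64 file
// image held in memory. Returns false if the image is not ELF64 or if no
// PT_LOAD segment backs the whole range with file contents.
// Assumption: the file's byte order matches the host's.
bool read_at_vaddr(const std::vector<unsigned char>& image,
                   std::uint64_t addr, std::uint64_t length,
                   std::vector<unsigned char>& out)
{
    Elf64_Ehdr eh;
    if (image.size() < sizeof eh) return false;
    std::memcpy(&eh, image.data(), sizeof eh);
    if (std::memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) return false;
    if (eh.e_ident[EI_CLASS] != ELFCLASS64) return false;
    if (eh.e_phentsize != sizeof(Elf64_Phdr)) return false;

    for (std::uint64_t i = 0; i < eh.e_phnum; ++i) {
        // Bounds-check the program header before copying it out.
        std::uint64_t pos = eh.e_phoff + i * sizeof(Elf64_Phdr);
        if (pos > image.size() || image.size() - pos < sizeof(Elf64_Phdr))
            return false;
        Elf64_Phdr ph;
        std::memcpy(&ph, image.data() + pos, sizeof ph);

        if (ph.p_type != PT_LOAD) continue;
        // Only the first p_filesz bytes of a segment exist in the file.
        if (addr < ph.p_vaddr || addr - ph.p_vaddr >= ph.p_filesz) continue;

        std::uint64_t in_segment = addr - ph.p_vaddr;
        if (length > ph.p_filesz - in_segment) return false;
        std::uint64_t file_off = ph.p_offset + in_segment;
        if (file_off > image.size() || image.size() - file_off < length)
            return false;

        out.assign(image.data() + file_off, image.data() + file_off + length);
        return true;
    }
    return false;
}
```

For the foo.so example, one would read the file into a std::vector<unsigned char> (for instance with std::ifstream) and call read_at_vaddr(image, 0x4020, 8, out); the bytes come back in file order, so on a little-endian machine the least significant byte is first.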

Linking to C++ static library on linux throws linker errors while building an executable [duplicate]

We recently caught a report because of GCC 5.1, libstdc++ and Dual ABI. It seems Clang is not aware of the GCC inline namespace changes, so it generates code based on one set of namespaces or symbols, while GCC used another set of namespaces or symbols. At link time, there are problems due to missing symbols.
If I am parsing the Dual ABI page correctly, it looks like a matter of pivoting on _GLIBCXX_USE_CXX11_ABI and abi::cxx11 with some additional hardships. More reading is available on Red Hat's blog at GCC5 and the C++11 ABI and The Case of GCC-5.1 and the Two C++ ABIs.
Below is from an Ubuntu 15 machine. The machine provides GCC 5.2.1.
$ cat test.cxx
#include <string>
std::string foo __attribute__ ((visibility ("default")));
std::string bar __attribute__ ((visibility ("default")));
$ g++ -g3 -O2 -shared test.cxx -o test.so
$ nm test.so | grep _Z3
...
0000201c B _Z3barB5cxx11
00002034 B _Z3fooB5cxx11
$ echo _Z3fooB5cxx11 _Z3barB5cxx11 | c++filt
foo[abi:cxx11] bar[abi:cxx11]
How can I generate a binary with symbols using both decorations ("coexistence" as the Red Hat blog calls it)?
Or, what are the options available to us?
I'm trying to achieve an "it just works" for users. I don't care if there are two weak symbols with two different behaviors (the old std::string uses copy-on-write, while std::string[abi:cxx11] does not). Or, one can be an alias for the other.
Debian has a boatload of similar bugs at Debian Bug report logs: Bugs tagged libstdc++-cxx11. Their solution was to rebuild everything under the new ABI, but it did not handle the corner case of mixing/matching compilers modulo the ABI changes.
In the Apple world, I think this is close to a fat binary. But I'm not sure what to do in the Linux/GCC world. Finally, we don't control how the distros build the library, and we don't control what compilers are used to link an application with the library.
Disclaimer: the following is not tested in production; use at your own risk.
You can release your library under a dual ABI yourself. This is more or less analogous to an OSX "fat binary", but built entirely with C++.
The easiest way to do so would be to compile the library twice: with -D_GLIBCXX_USE_CXX11_ABI=0 and with -D_GLIBCXX_USE_CXX11_ABI=1. Place the entire library under two different namespaces depending on the value of the macro:
#if _GLIBCXX_USE_CXX11_ABI
# define DUAL_ABI cxx11 __attribute__((abi_tag("cxx11")))
#else
# define DUAL_ABI cxx03
#endif
namespace CryptoPP {
inline namespace DUAL_ABI {
// library goes here
}
}
Now your users can use CryptoPP::whatever as usual; this maps to either CryptoPP::cxx11::whatever or CryptoPP::cxx03::whatever depending on the ABI selected.
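To illustrate how the unqualified name resolves through the inline namespace, here is a small self-contained sketch. MyLib and abi_name are hypothetical names used only for this example, and the abi_tag attribute is left out so that the snippet also compiles with compilers that do not support it. Note that <string> is included before the #if, because libstdc++ only defines _GLIBCXX_USE_CXX11_ABI once one of its headers has been seen.

```cpp
#include <string>   // must come first: libstdc++ headers define _GLIBCXX_USE_CXX11_ABI

// Simplified stand-in for the DUAL_ABI macro above (no abi_tag attribute).
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
#  define DUAL_ABI cxx11
#else
#  define DUAL_ABI cxx03
#endif

// Two-step stringification so DUAL_ABI is expanded before being quoted.
#define DUAL_ABI_STR2(x) #x
#define DUAL_ABI_STR(x) DUAL_ABI_STR2(x)

namespace MyLib {                 // hypothetical library namespace
inline namespace DUAL_ABI {
// Hypothetical function: reports which inline namespace it was compiled into.
inline std::string abi_name() { return DUAL_ABI_STR(DUAL_ABI); }
}
}
```

Callers write MyLib::abi_name() and never mention the inner namespace. Building once with -D_GLIBCXX_USE_CXX11_ABI=0 and once with -D_GLIBCXX_USE_CXX11_ABI=1 should place the function in MyLib::cxx03 or MyLib::cxx11 respectively, so the two builds get distinct mangled names.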
Note, the GCC manual says that this method will change mangled names of everything defined in the tagged inline namespace. In my experience this doesn't happen.
The other method would be tagging every class, function, and variable with __attribute__((abi_tag("cxx11"))) if _GLIBCXX_USE_CXX11_ABI is nonzero. This attribute nicely adds [abi:cxx11] to the output of the demangler. I think that using a namespace works just as well though, and requires less modification to the existing code.
In theory you don't need to duplicate the entire library, only functions and classes that use std::string and std::list, and functions and classes that use these functions and classes, and so on recursively. But in practice it's probably not worth the effort, especially if the library is not very big.
Here's one way to do it, but it's not very elegant. It's also not clear to me how to make GCC automate it so I don't have to do things twice.
First, the example that's going to be turned into a library:
$ cat test.cxx
#include <string>
std::string foo __attribute__ ((visibility ("default")));
std::string bar __attribute__ ((visibility ("default")));
Then:
$ g++ -D_GLIBCXX_USE_CXX11_ABI=0 -c test.cxx -o test-v1.o
$ g++ -D_GLIBCXX_USE_CXX11_ABI=1 -c test.cxx -o test-v2.o
$ ar cr test.a test-v1.o test-v2.o
$ ranlib test.a
$ g++ -shared test-v1.o test-v2.o -o test.so
Finally, see what we got:
$ nm test.a
test-v1.o:
00000004 B bar
U __cxa_atexit
U __dso_handle
00000000 B foo
0000006c t _GLOBAL__sub_I_foo
00000000 t _Z41__static_initialization_and_destruction_0ii
U _ZNSsC1Ev
U _ZNSsD1Ev
test-v2.o:
U __cxa_atexit
U __dso_handle
0000006c t _GLOBAL__sub_I__Z3fooB5cxx11
00000018 B _Z3barB5cxx11
00000000 B _Z3fooB5cxx11
00000000 t _Z41__static_initialization_and_destruction_0ii
U _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1Ev
U _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev
And:
$ nm test.so
00002020 B bar
00002018 B __bss_start
00002018 b completed.7181
U __cxa_atexit@@GLIBC_2.1.3
w __cxa_finalize@@GLIBC_2.1.3
00000650 t deregister_tm_clones
000006e0 t __do_global_dtors_aux
00001ef4 t __do_global_dtors_aux_fini_array_entry
00002014 d __dso_handle
00001efc d _DYNAMIC
00002018 D _edata
00002054 B _end
0000087c T _fini
0000201c B foo
00000730 t frame_dummy
00001ee8 t __frame_dummy_init_array_entry
00000980 r __FRAME_END__
00002000 d _GLOBAL_OFFSET_TABLE_
000007dc t _GLOBAL__sub_I_foo
00000862 t _GLOBAL__sub_I__Z3fooB5cxx11
w __gmon_start__
000005e0 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
00001ef8 d __JCR_END__
00001ef8 d __JCR_LIST__
w _Jv_RegisterClasses
00000690 t register_tm_clones
00002018 d __TMC_END__
00000640 t __x86.get_pc_thunk.bx
0000076c t __x86.get_pc_thunk.dx
0000203c B _Z3barB5cxx11
00002024 B _Z3fooB5cxx11
00000770 t _Z41__static_initialization_and_destruction_0ii
000007f6 t _Z41__static_initialization_and_destruction_0ii
U _ZNSsC1Ev@@GLIBCXX_3.4
U _ZNSsD1Ev@@GLIBCXX_3.4
U _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1Ev@@GLIBCXX_3.4.21
U _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEED1Ev@@GLIBCXX_3.4.21

building the ta-lib library fails with undefined references from libm.so

Trying to make the ta-lib library (ta-lib-0.4.0-src.tar.gz) I get the following error:
/home/me/ta-lib/src/.libs/libta_lib.so: undefined reference to `sinh'
/home/me/ta-lib/src/.libs/libta_lib.so: undefined reference to `sincos'
/home/me/ta-lib/src/.libs/libta_lib.so: undefined reference to `ceil'
...
for a large number of maths functions.
The failing command looks like this:
gcc -g -O2 -o .libs/ta_regtest (... .o files) -L/home/me/ta-lib/src \
/home/me/ta-lib/src/.libs/libta_lib.so -lm -lpthread -ldl
The offending library (ta_lib) looks like this:
objdump -TC libta_lib.so | grep " D \*UND\*"
0000000000000000 D *UND* 0000000000000000 sinh
0000000000000000 D *UND* 0000000000000000 sincos
0000000000000000 D *UND* 0000000000000000 ceil
...
and so on for the same maths functions (the grep excludes defined functions and those that have a "w" (presumably weak) flag).
A map lists the libraries included, among them:
LOAD /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/libm.so
and a list of the symbols (objdump -TC) defined in libm.so includes:
000000000001a320 w iD .text 0000000000000020 GLIBC_2.2.5 ceil
which was one of the undefined references (they are all there). I cannot determine the meaning of GLIBC_2.2.5.
Why is the linker not finding these functions?
My system looks like this:
$ uname -a
Linux mynode 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

How to print symbol list for .so file in OSX?

I have a .so file (note: not .a, not .dylib and not .o) and I need to get symbol information from it on OSX.
I have tried
nm -gU lib.so
However, nothing is printed out.
I can't use otool because it's not an object file, and readelf does not exist on OSX. How do I get the symbol information?
Please note, that I am using this .so file in another project, and there is symbol information. I am able to load the library, and reference functions from it. However, I have yet to find a tool on OSX to let me print the symbol information from it.
As asked,
file lib.so
ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked, stripped
Try using c++filt piped from nm:
nm lib.so | c++filt -p -i
c++filt - Demangle C++ and Java symbols.
-p
--no-params
When demangling the name of a function, do not display the types of
the function's parameters.
-i
--no-verbose
Do not include implementation details (if any) in the demangled
output.
EDIT: Based upon the new (ARM) info provided in the question, try using symbols instead:
symbols lib.so -arch arm | awk '{print $4}'
I've used awk to simplify output; remove to output everything.
Manual page : Symbols
https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man1/nm.1.html
Nm displays the name list (symbol table) of each object file in the argument list. If an argument is an archive, a listing for each object file in the archive will be produced. File can be of the form libx.a(x.o), in which case only symbols from that member of the object file are listed. (The parentheses have to be quoted to get by the shell.) If no file is given, the symbols in a.out are listed.

GDB doesn't show function names

I am debugging from an embedded device using gdbserver:
./gdbserver HOST:5000 /home/test_app
In my PC, I execute gdb in this way:
arm-none-linux-gnueabi-gdb test_app
Once the application is executing, I receive the Segfault I want to debug, but it's impossible to know what line produced it:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 715]
0x31303030 in ?? ()
(gdb) bt
#0 0x31303030 in ?? ()
#1 0x0000dff8 in ?? ()
#2 0x0000dff8 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(I must say I'm totally new to GDB)
OK, this usually happens if debug symbols are missing. Just to make sure, run the following commands:
file <your_executable>
You will get information about your binary, such as its format and architecture. The last part of the output says whether the binary is stripped or not. For debugging in GDB, the binary should not be stripped.
nm --debug-sym <your_executable> | grep debug
If you get valid output like the following, debug symbols are present.
00000000 N .debug_abbrev
00000000 N .debug_aranges
00000000 N .debug_frame
00000000 N .debug_info
00000000 N .debug_line
00000000 N .debug_loc
00000000 N .debug_pubnames
00000000 N .debug_str
Further, when you invoke GDB you should see the following line:
Reading symbols from <your_executable>...done.
At this point you should be able to list sources with the list command.
Make sure gdb and gdbserver are the same version.
arm-none-linux-gnueabi-gdb --version
./gdbserver --version
If all of the above are true and you still don't get a backtrace, there is something bad going on with your stack. Try running static analysis or valgrind on your code, especially newly added code.
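One thing that may be worth checking when the stack looks corrupt: the faulting address in the question, 0x31303030, consists entirely of printable ASCII bytes. That can be a hint (not proof) that string data overran a buffer and overwrote a saved return address. A small helper to decode such a value might look like this; as_ascii_le is a hypothetical name, and a little-endian target is assumed.

```cpp
#include <cstdint>
#include <string>

// Render a 32-bit value as the four bytes it occupies in little-endian
// memory (lowest byte first), replacing non-printable bytes with '.'.
std::string as_ascii_le(std::uint32_t value)
{
    std::string out;
    for (int i = 0; i < 4; ++i) {
        unsigned char byte = (value >> (8 * i)) & 0xFF;
        out += (byte >= 0x20 && byte < 0x7F) ? static_cast<char>(byte) : '.';
    }
    return out;
}
```

Calling as_ascii_le(0x31303030) yields the string "0001", whereas an ordinary code address such as 0x0000dff8 decodes to non-printable bytes only.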
You need to build your application with debug symbols enabled. The switch for gcc is -g.
For others: if nm --debug-sym <your_executable> | grep debug prints the debug symbols but you do not get them in gdb, this might be because you are opening a core in gdb using an executable that is different from the one that generated the core.
You will need to include -g for every translation unit. For example, if you have a bunch of object files that are linked to build your final executable, you will need to include -g in each compilation command:
g++ -g file1.cpp -c -o file1.o
g++ -g file2.cpp -c -o file2.o
...
g++ -g file1.o file2.o -o main