Say that I have three libraries: libMissingSymbol.so, libMiddle.so, and libSymbolHaver.so. libMissingSymbol references a symbol defined in libSymbolHaver, but only has a dependency on libMiddle. libMiddle is supposed to have a dependency on libSymbolHaver, but it doesn't. I don't have the source code or unlinked object files that these libraries were assembled from. Is it possible for me to link libMiddle with libSymbolHaver so that libMissingSymbol can find the symbol it needs at load time? Is there any way that I can fix this using only these three shared object files and any necessary tools? I have to end up with libraries with the same contents (including SONAMEs), barring the dependency change to libMiddle, in order to not break things further down the line in my project.
Hypothetical readelf output (trimmed for relevance) to clarify:
$ readelf -s libMissingSymbol.so
123: 00000000 0 OBJECT GLOBAL DEFAULT UND MangledSymbol
$ readelf -d libMissingSymbol.so
Dynamic section at offset 0x42434 contains 37 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libMiddle.so]
0x0000000e (SONAME) Library soname: [libMissingSymbol.so]
$ readelf -d libMiddle.so
Dynamic section at offset 0x75b28 contains 29 entries:
Tag Type Name/Value
0x0000000e (SONAME) Library soname: [libMiddle.so]
$ readelf -s libSymbolHaver.so
35: 00064d0c 4 OBJECT GLOBAL DEFAULT 22 MangledSymbol
Is it possible for me to link libMiddle with libSymbolHaver so that libMissingSymbol can find the symbol it needs at load time?
No: all UNIX linkers, except the AIX one, consider .so the final link product and no further modification is possible.
Update:
viability of doing this a different way (e.g. decompiling libMiddle and rebuilding it with the correct dependencies)?
I don't believe that is viable either -- it is really hard to modify a fully-linked ELF file without violating a myriad of internal consistency constraints.
I suggest the following approach, which is very likely to just work(TM).
Abandon your "only using these three libraries" restriction. It appears to be artificial and unnecessary.
Copy libMiddle.so -> libZiddle.so (be sure to make a copy of the original libMiddle.so somewhere else in case things go wrong).
binary-patch the SONAME in libZiddle.so to match the new name. The string "libMiddle.so" is in the .dynstr section of the library, and (I believe) is not hashed in any way, so changing one letter in it will not introduce any self-inconsistencies into the new library.
Once you've done this, compare readelf -a libMiddle.so and readelf -a libZiddle.so, the SONAME should be the only difference.
Remove libMiddle.so.
Link a new libMiddle.so containing some_unused_function(), and having dynamic dependency on both libZiddle.so and libSymbolHaver.so.
Now any binary that currently links against libMiddle.so and fails with missing symbol (e.g. libMissingSymbol.so) will find the new (empty) libMiddle.so, but because the new libMiddle.so requires both libZiddle.so (where most of the symbols are) and libSymbolHaver.so, it should just work.
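The binary-patching step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names patch_soname and patch_file are made up for this example, and the code assumes the SONAME is stored as a plain NUL-terminated string that occurs exactly once in the file. It refuses to proceed otherwise, so you would still want to compare the readelf -a output afterwards as described.

```python
def patch_soname(data: bytes, old: bytes, new: bytes) -> bytes:
    """Return a copy of `data` with the NUL-terminated string `old` replaced by `new`.

    Assumption: the SONAME lives in .dynstr as an ordinary NUL-terminated string.
    """
    if len(old) != len(new):
        # A different length would shift every later offset in the file.
        raise ValueError("old and new names must have the same length")
    needle = old + b"\0"
    count = data.count(needle)
    if count != 1:
        # Zero or several matches means a blind replace is not safe.
        raise ValueError(f"expected exactly one occurrence, found {count}")
    return data.replace(needle, new + b"\0")


def patch_file(src: str, dst: str, old: str, new: str) -> None:
    """Write a patched copy of `src` to `dst`; `src` itself is left untouched."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(patch_soname(data, old.encode(), new.encode()))
```

With the names from this question, the call would be patch_file("libMiddle.so", "libZiddle.so", "libMiddle.so", "libZiddle.so"). Keeping the new name the same length as the old one is what makes the in-place edit plausible at all.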
Related
I seem to somehow have an impossible situation, that I can only assume means my analysis is somehow wrong, since the following all seem to be true:
The executable runs, so it must have all dependent functions provided
The executable depends on a function I'll call Foo::Bar::_ex
This function is not defined in any .a or .so file in the entire filesystem
One of the dependent libraries requires this undefined function
I cannot link the code into an executable because I can't find any library that provides this function
I can see the requirement of this function by the application by running ldd on the app, and seeing that it requires a library I'll call libExample.so. I can see by running objdump -T on the .so file that it requires the mystery function:
ldd APP
libExample.so => /path/to/libExample.so
objdump -T /path/to/libExample.so | c++filt | grep Foo::Bar::_ex
00000000 D *UND* 00000000 Foo::Bar::_ex
For every /path/to/libWhatever.a I collected the library path and output of objdump -t /path/to/libWhatever.a | c++filt into ~/adump.txt. Similarly, I collected the path of every .so file and output of objdump -T /path/to/libWhatever.so | c++filt into ~/sodump.txt.
When I grep adump.txt for Foo::Bar::_ex, I get only entries like the following:
00000000 *UND* 00000000 Foo::Bar::_ex
00000108 g O .data.rel.local 00000004 Foo::Bar::_ex
When I grep sodump.txt for Foo::Bar::_ex, I get only entries like the following:
00000000 D *UND* 00000000 Foo::Bar::_ex
004f9bc4 g DO .data 000000004 Base Foo::Bar::_ex
00000000009ff5f8 g DO .data 0000000000000008 Base Foo::Bar::_ex
I understand from the objdump man page that DF means defining a function, and DO means defining an object, and that if I could find a DF entry for Foo::Bar::_ex in some library, my problem would be solved, just use that library in the link command.
I don't understand what "Object" means in objdump terms - it obviously isn't function code or a runtime object, so what is it?
How does the app run without complaint about a missing function, when none of the libraries provide anything acceptable to the linker?
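For reading dumps like adump.txt and sodump.txt, a small classifier can help. In objdump's flag column, F marks a function and O marks a data object (a variable), and *UND* as the section means the symbol is only referenced, not defined. The sketch below is a hypothetical helper (classify is not part of any tool) and assumes whitespace-collapsed lines like the ones quoted above.

```python
def classify(line: str):
    """Split one `objdump -t` / `objdump -T` line into (defined, kind, section)."""
    tokens = line.split()
    # The section is the first token after the address that looks like
    # ".data" or "*UND*"; everything in between is the flag column.
    for i, tok in enumerate(tokens[1:], start=1):
        if tok.startswith((".", "*")):
            break
    else:
        raise ValueError(f"no section found in: {line!r}")
    flags = "".join(tokens[1:i])
    if "F" in flags:
        kind = "function"
    elif "O" in flags:
        kind = "object"
    else:
        kind = "unspecified"
    return tok != "*UND*", kind, tok
```

Run over the lines quoted above, the *UND* entries carry neither F nor O, while the defined entries are all data objects. That would be consistent with Foo::Bar::_ex being a variable rather than a function, in which case a DO definition may be what satisfies the reference.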
I think I found out my real problem today, and it isn't what I thought. I have one shared library that is somehow buggered in a way where it only works if you pass the path to it on the command line instead of using -L and -l.
In other words, just for this one library, g++ -L /path/to/lib/dir -l libName.so does not work. The linker says it cannot find any of the functions in it, which clearly exist. It doesn't complain about the file not being found, it just can't find the functions.
If I use g++ /path/to/libName.so, now it is happy, and links the app with the specific path given. As long as that path can be loaded at runtime, it works.
So the dorky process I use is to copy the lib to the current dir, give just the name of the library to g++, then remove the copy. The exe is then able to find the library in the usual way at runtime.
Go figure.
Well today I was going through the "fun" exercise of trying to pull in all the dependencies (read: lots of swearing, just not loud enough for anyone else to hear), and was getting frustrated with linker errors about not being able to find a compatible libstdc++ implementation.
So I rolled back the set of OS packages I was installing in my docker container thru a Dockerfile, and .... it worked! Suddenly linking succeeded.
As far as I can tell, I started bringing in some unrelated OS packages that provide libraries whose names are similar to totally unrelated libraries that were sitting in a directory (god only knows where they came from).
Linking to the unrelated OS packages starts asking for other OS stuff, and things quickly go off the rails from there. Once I linked just to the provided libraries of unknown origin, my problem went away :)
Thanks for your answers, lesson learned!
List items 1-4 are the steps that I did.
List item 5 describes the problem.
List item 6 provides additional information.
1. I have compiled a C source file, say c1.c, with the -g flag.
2. I also have a dynamic shared library, say liba1.so, built with -g for all the source files that it has.
3. I built the executable, say exe1, by linking c1.o (the c1.c object code) with liba1.so.
4. I run gdb exe1 and am able to step through the sources of c1.c. When c1 calls the shared library, I am also able to put a breakpoint on a function in the shared library.
5. However, when I try to step through the function, it says "Single stepping until exit from function foo1, which has no line number information". It should also ordinarily show the values of the parameters passed into foo1, but it does not. This happens for all functions in the shared library, including some very big ones, so the values cannot be optimized out.
6. I did an objdump -t on the shared library AND the executable - it shows the symbol table (the fact that I can set a breakpoint on the function also supports this). Also, I can see the values of the variables used in the file c1.c. So what should I do in order to ensure that I can see the values of the local variables inside the shared library? Here are the other arguments that are being used to compile the shared library: "-O2 -std=gnu99 -Werror -fno-stack-protector -Wstack-protector --param ssp-buffer-size=1 -g -nostdinc". Doing info f and trying to look at memory addresses on the frame also does not give any information.
I am looking for some suggestion to at least troubleshoot it. Can I find out, using objdump (or any other utility), whether a shared library has line number information?
I am looking for some suggestion to at least troubleshoot it.
The most likely reason for no line number information, is that there is in fact no line number information, and the most likely reason for that is that you have two copies of liba1.so -- one that has debug info, and one that doesn't, and you are loading at runtime the latter.
First step: (gdb) info shared will tell you exactly which liba1.so is loaded.
If it is in fact the version that you've just built with -g, you should verify that it does have the debug info you are expecting. The exact commands for doing so are platform specific (and you didn't tell which platform you are on). On an ELF platform, objdump -g liba1.so or readelf -w liba1.so should work.
One common reason for -g code to not have debug info is presence of -s (strip) flag on the link line; make sure you don't have "stray" flags on your link line. Some platforms also require -g to be used at link time in addition to compile time.
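As a rough way to automate the "does it have debug info" check, the sketch below scans the text printed by readelf -S for debug sections. The function names are invented for this example, and the sample text in the usage note is made up rather than taken from a real library. It assumes DWARF sections named .debug_* (or .zdebug_* for older compressed output) and would not notice debug info kept in a separate file.

```python
import re


def debug_sections(readelf_S_output: str) -> set:
    """Return the names of debug sections mentioned in `readelf -S` output."""
    # The look-behind skips names such as ".rela.debug_line" in object files.
    return set(re.findall(r"(?<!\w)\.z?debug_\w+", readelf_S_output))


def has_line_info(readelf_S_output: str) -> bool:
    """True if a line-number table section is listed."""
    names = debug_sections(readelf_S_output)
    return ".debug_line" in names or ".zdebug_line" in names
```

Feeding it the output of readelf -S for the liba1.so that info shared reports (not just the one you built) tells you quickly whether the loaded copy has a .debug_line section at all.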
I have been working on a cross-platform windowing library aimed specifically at OpenGL, currently focusing on Linux. I am making use of glload to manage OpenGL extensions, and this is being compiled, along with other libraries that I will use later, into an .so. This .so is being dynamically loaded as you would expect, but at run time the program gives the following output (manually wrapped so it is easier to read):
_dist/x64-linux-debug/bin/test: Symbol `glXCreateContextAttribsARB' has \
different size in shared object, consider re-linking
Now, obviously I have tried re-linking, going as far as rebuilding the entire project many times (testing things out, not just blindly hoping it will magically make it all better). The program does seem to be willing to run as it will produce some logging output as I would expect it to. I have used nm to confirm that the 'symbol' is in the .so
nm _dist/x64-linux-debug/lib64/libvendor.so | grep glXCreateContextAttribsARB
00000000009e0e78 B glXCreateContextAttribsARB
If I use readelf to look at the symbols being defined I get the following (again, I have manually wrapped the first three lines for formatting sake):
readelf -Ws _dist/x64-linux-debug/bin/test \
_dist/x64-linux-debug/lib64/libvendor.so | \
grep glXCreateContextAttribsARB
348: 000000000062b318 8 OBJECT GLOBAL DEFAULT 26 glXCreateContextAttribsARB
421: 000000000062b318 8 OBJECT GLOBAL DEFAULT 26 glXCreateContextAttribsARB
1370: 00000000009e0e78 8 OBJECT GLOBAL DEFAULT 25 glXCreateContextAttribsARB
17464: 00000000009e0e78 8 OBJECT GLOBAL DEFAULT 25 glXCreateContextAttribsARB
I am afraid that this is about all I can offer to help, as I really do not know what to try or look into. Like I said, I am sure more info will be needed, so please just say and I will provide what I can. I am running these commands from my project root, in case you are wondering.
wilsonmichaelpatrick's answer is mostly correct, but using gdb is likely not the fastest way to find the problem, and will likely not work at all if you have a non-debug build.
First, you should confirm that there in fact is a problem:
readelf -Ws _dist/x64-linux-debug/bin/test _dist/x64-linux-debug/lib64/libvendor.so |
grep glXCreateContextAttribsARB
This should show the symbol being defined in test and libvendor.so, with different size.
Second, re-link test and libvendor.so with -Wl,-y,glXCreateContextAttribsARB flag. That will tell you which object files (or libraries) provide the (different) definitions.
Finally, preprocess the sources that produce above object files with -E and -dD flags, and see what's different between them.
Update:
I need help digesting what it is saying
Don't be helpless. Read man readelf, or just run it by hand. You'll see something like this:
readelf -Ws /bin/date | head -5
Symbol table '.dynsym' contains 75 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __ctype_toupper_loc@GLIBC_2.3 (2)
This tells you the meaning of the data you've got. In particular, this tells you that the size of the symbol in test and in libvendor.so is the same (8). Therefore, the problem is not in these two ELF files, but somewhere else. Run readelf on your other libraries, and look for definition of glXCreateContextAttribsARB that has a different size. Then follow the rest of the procedure.
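Comparing sizes by eye gets tedious once several libraries are involved, so here is a minimal sketch that collects the Size column for one symbol from readelf -Ws output. The function name symbol_sizes is hypothetical. It assumes the column layout shown above (Num, Value, Size, Type, Bind, Vis, Ndx, Name) and the "File: <name>" headers that readelf prints when given more than one file.

```python
from collections import defaultdict


def symbol_sizes(readelf_output: str, symbol: str) -> dict:
    """Map each file in `readelf -Ws` output to the set of sizes seen for `symbol`."""
    sizes = defaultdict(set)
    current = "<single file>"  # used when readelf was run on one file only
    for line in readelf_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "File:":
            current = fields[1]
            continue
        # Symbol rows start with "<number>:" and have at least eight columns.
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        name = fields[7].split("@")[0]  # drop any @VERSION suffix
        if name == symbol:
            sizes[current].add(int(fields[2], 0))  # decimal, or 0x... for large sizes
    return dict(sizes)
```

If every file reports the same single size, as the two files in the question do (8), the mismatching definition has to be in some other library that the process loads.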
The runtime is noticing that glXCreateContextAttribsARB as compiled in the shared object, and glXCreateContextAttribsARB as compiled in the main program (or maybe even some other shared object previously linked) have different sizes. This means that, in the separate builds for the shared object and whatever else references that object, they must be looking at different code (probably in a header) where this is defined. Sometimes this occurs because they are looking at different files, sometimes this occurs because of different #defines causing different interpretations of the same file. Whatever the reason, you absolutely need to make sure that the same symbol (e.g. a structure) is defined the same way (i.e. with the same member variables and size) across everything that is linked together at runtime.
It's actually a very good thing that it is refusing to run, as this is a catastrophe when two parts of the code interpret the same bit of memory in different ways at runtime. (Not too much of an exaggeration to say anything could happen if this was allowed to proceed.)
You might want to try just loading up the executable in gdb (without running it) and typing
info types
to see where it is defined, and then load the shared object in gdb (without running it) and doing another info types there to see what each of them thinks it's looking at. If it's the same thing, check the preprocessor directives.
I have faced a tedious issue related to objects of different sizes, so I want to share my experience, even though it is clear to me that this is only one reason that might explain different object sizes, and not necessarily the OP's.
The symptoms were objects of different sizes in debug mode, none in release mode. The linker produced the corresponding warnings. The symbol names were hard to decipher but related to some unnamed static variables in instances of class templates.
The reason was the debug logging feature à la LOG("Do something.");. The LOG macro used the standard C macro __FILE__, which expanded to a different path depending on whether the header was included by the application or by the shared library. And this string was exactly the aforementioned unnamed static variable.
Even more tedious was the fact that, due to our make environment, the __FILE__ macro sometimes expanded to, let's say, C:\temp\file.h and sometimes to C:\other\..\temp\file.h, so that building the application and the library from the same place didn't solve the problem either.
I hope this piece of experience saves some of you some time.
In most cases you're probably just linking against the wrong library (a different version). For example, you have libfoo installed twice and link your executable with -L /path/to/version1 -lfoo but during runtime you link with /path/to/version2 (you can see this one with ldd yourprogram).
One reason could be that the executable was linked with -Wl,-rpath,/path/to/version1 but (as many recent toolchains do by default) this set the RUNPATH entry in the dynamic section rather than RPATH, while you have LD_LIBRARY_PATH=/path/to/version2. When RUNPATH is set, LD_LIBRARY_PATH takes precedence over it. In this case delete the library from /path/to/version2 (or remove that path from LD_LIBRARY_PATH).
EXAMPLE
$ minimal
/home/carlo/minimal: Symbol `_ZN6libcwd8libcw_doE' has different size in shared object, consider re-linking
COREDUMP : /home/carlo/projects/libcwd/libcwd/elfxx.cc:2381: void libcwd::elfxx::objfile_ct::load_dwarf(): Assertion `size == sizeof(address)' failed.
(libcwd is smart enough to see it too; aka the problem here is with libcwd):
$ ldd minimal | grep libcwd_r
libcwd_r.so.5 => /usr/local/install/6.0.0-1ubuntu2/lib/libcwd_r.so.5 (0x00007f0b69840000)
$ echo $LD_LIBRARY_PATH
/usr/local/install/6.0.0-1ubuntu2/lib
$ objdump -a -x minimal | grep PATH
RUNPATH /opt/gitache/libcwd_r/888f62c44fd64f1486176bf9e35b36f79612790017c31f95e117fc59743a54ca/lib
Unsetting LD_LIBRARY_PATH or removing libcwd from that path results in
$ unset LD_LIBRARY_PATH
$ ldd minimal | grep libcwd_r
libcwd_r.so.5 => /opt/gitache/libcwd_r/888f62c44fd64f1486176bf9e35b36f79612790017c31f95e117fc59743a54ca/lib/libcwd_r.so.5 (0x00007f11d7298000)
and things work again. Or alternatively I could add to my CMakeLists.txt of the project:
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--disable-new-dtags")
After which we get,
$ objdump -a -x minimal | grep PATH
RPATH /opt/gitache/libcwd_r/888f62c44fd64f1486176bf9e35b36f79612790017c31f95e117fc59743a54ca/lib
which now has precedence over LD_LIBRARY_PATH and therefore also solves the issue. This is not the recommended way however: if you set LD_LIBRARY_PATH you should know what you are doing. If that doesn't work, you should fix LD_LIBRARY_PATH or remove the offending library.
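The precedence rules used in this example can be written down as a tiny model. This is a simplified sketch of the order documented for the glibc dynamic loader, not its actual implementation: the function name search_order is made up, and the ld.so cache, the default directories, and secure-execution mode are left out.

```python
def search_order(rpath=(), runpath=(), ld_library_path=()):
    """Directories tried, in order, before the ld.so cache and the default paths.

    Simplified model: DT_RPATH is only honoured when DT_RUNPATH is absent.
    """
    order = []
    if rpath and not runpath:
        order += list(rpath)       # old-style RPATH beats LD_LIBRARY_PATH
    order += list(ld_library_path)
    order += list(runpath)         # RUNPATH is consulted after LD_LIBRARY_PATH
    return order
```

With a RUNPATH entry, a directory from LD_LIBRARY_PATH comes first, which matches the first ldd output above. With --disable-new-dtags the same directory ends up in RPATH and is searched before LD_LIBRARY_PATH.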
I have a static library, say mystaticlib.a. I want to see its contents, such as the number of object files inside it.
How can I do this on gcc?
On gcc, use ar -t.
The -t option of the GNU archiver (ar) writes a table of contents of the archive to standard output. Only the files specified by the file operands are included in the written list. If no file operands are specified, all files in the archive are included, in archive order.
More info here.
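To make it clearer what ar -t actually reads, here is a minimal sketch that lists member names straight from the archive bytes. It is an illustration, not a replacement for ar: list_members is a made-up name, and only the GNU archive layout is handled (60-byte member headers, the "/" symbol index, and the "//" long-name table). BSD-style "#1/<len>" names are not supported.

```python
def list_members(archive: bytes) -> list:
    """Return member names of a GNU-format `ar` archive, in archive order."""
    if not archive.startswith(b"!<arch>\n"):
        raise ValueError("not an ar archive")
    names, long_names, pos = [], b"", 8
    while pos + 60 <= len(archive):
        header = archive[pos:pos + 60]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {pos}")
        raw = header[:16].rstrip(b" ")
        size = int(header[48:58].decode().strip())
        body = archive[pos + 60:pos + 60 + size]
        if raw == b"//":
            long_names = body                   # table of names too long for the header
        elif raw in (b"/", b"/SYM64/"):
            pass                                # symbol index, not a real member
        elif raw.startswith(b"/"):
            start = int(raw[1:].decode())       # "/<offset>" into the long-name table
            end = long_names.index(b"/\n", start)
            names.append(long_names[start:end].decode())
        else:
            names.append(raw.rstrip(b"/").decode())
        pos += 60 + size + (size % 2)           # member data is padded to an even length
    return names
```

Calling list_members(open("mystaticlib.a", "rb").read()) should give the same names as ar -t mystaticlib.a for a GNU-style archive, and len() of the result is the number of object files.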
You can see the contents (the .o files that went into it) and the defined symbols by using nm. If this contains C++ code you should use the -C option to demangle the symbol names:
nm -C libschnoeck.a | less
On a Mac, simply use
nm libschnoeck.a | less
There is no -C option with the Mac version of nm.
I just stumbled over this:
You can open an archive (.a) with 7zip.
Also works for the object files in the archive.
Listing all sorts of contents like .text, .bss, .data, etc. with their offset, length, type, ...
Furthermore, it's possible to unpack everything and use a hex editor or Notepad++ to view the contents.
I tested this with an archive created with GNUToolsARMEmbedded\2018-q4-major\bin\arm-none-eabi- Toolchain
and 7Zip 16.04 (64-bit)
I just discovered that you can use readelf -a to display the contents of all the object files in a static library.
Invoke the readelf command like this: $ readelf -a mystaticlib.a.
How do I determine whether a function exists within a library, or list out the functions in a compiled library?
You can use the nm command to list the symbols in static libraries.
nm -g -C <libMylib.a>
For ELF binaries, you can use readelf:
readelf -sW a.out | awk '$4 == "FUNC"' | c++filt
-s: list symbols
-W: don't cut too long names
The awk command will then filter out all functions, and c++filt will unmangle them. That means it will convert them from an internal naming scheme so they are displayed in human readable form. It outputs names similar to this (taken from boost.filesystem lib):
285: 0000bef0 91 FUNC WEAK DEFAULT 11 boost::exception::~exception()
Without c++filt, the name is displayed as _ZN5boost9exceptionD0Ev
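If awk is not available, the filtering step of that pipeline is easy to reproduce. The sketch below is a hypothetical Python equivalent of awk '$4 == "FUNC"' that additionally returns only the name column. It does no demangling, so the names would still need to go through c++filt.

```python
def func_symbols(readelf_output: str) -> list:
    """Return the names of all FUNC rows in `readelf -sW` output."""
    names = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Column 4 is the symbol type and column 8 is the (mangled) name.
        if len(fields) >= 8 and fields[3] == "FUNC":
            names.append(fields[7])
    return names
```

Like awk, this splits on runs of whitespace, so it relies on the same column layout that readelf prints in its symbol table header.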
For Microsoft tools, "link /dump /symbols <filename>" will give you the gory details. There are probably other ways (or options) to give an easier to read listing.
Under Linux/Unix you can use objdump -T to list the exported symbols contained in a given object. Under Windows there's dumpbin (IIRC dumpbin /exports). Note that C++ function names are mangled in order to allow overloads.
EDIT: after seeing codelogic's answer I remembered that objdump also understands -C to perform de-mangling.
Use this command:
objdump -t "your-library"
It will print more than you want - not just function names, but the entire symbol table. Check the various attributes of the symbols you get, and you will be able to sort out the functions from variables and stuff.