Get dlopen to ignore undefined symbols - c++

I am compiling a dynamically generated C++ file as a shared object which contains references to symbols available only in its full build.
g++ -o tmp_form.so -fPIC -shared -lsomelib -std=gnu99 tmp_form.cc
I don't need the missing symbols for my current program, only those from the linked library. But dlopen does require them to be available or fails otherwise. The missing symbols are all variables which are being referenced in structs.
One option would be to add the weak reference attribute to the missing symbols in the generated code. But I would like to avoid making changes to the code generator if possible.
Any advice is appreciated.

Your link command is incorrect:
... -lsomelib ... tmp_form.cc
should be
... tmp_form.cc -lsomelib
The order of sources/objects and libraries on the link line does matter.
If you are using an ELF platform and a very recent build of the Gold linker, you can "downgrade" unresolved symbols to weak with the --weak-unresolved-symbols option, without modifying the source.
Otherwise, you'll have to modify the sources; there is no other way.
P.S. Function references would not have a problem with RTLD_LAZY due to lazy binding, but for data references weak unresolved is your only choice: lazy binding is not possible for them.

Try dlopen("/path/to/the/library", RTLD_LAZY);

Related

Loading classes - dynamical link library returned undefined symbol error

I'm using C++ dlopen() to link a shared library named as lib*.so (in directory A) in my main program (in directory B).
I experimented with some simple function loading. Everything works very well. However, it gave me a headache when I tried to load a class and factory functions that return a pointer to the class object. (I'm using the terms from the tutorial below)
The methodology I used was based on the examples in chapter 3.3 of this tutorial https://www.tldp.org/HOWTO/C++-dlopen/thesolution.html#externC.
There is a bit of polymorphism here ... lib*.so contains a child class that inherits a parent abstract class from the main program directory (directory B). When dlopen() tries to load lib*.so in the main program, it failed due to "undefined symbol".
I used nm command to examine the symbol tables in lib*.so and main program binary. The symbols in these binaries are:
lib*.so : U _ZTI7ParentBox
main program binary: V _ZTI7ParentBox
ParentBox is the name of the parent class inherited by ChildBox in lib*.so. Note that parent class header file is in another project in directory B.
Although there is name mangling the symbol names are exactly the same.
I'm just wondering why the dynamic linker cannot link them and gives me an undefined symbol error for dlopen()?
Am I missing the understanding of some key concepts here?
P.S. More strangely, it was able to resolve the symbols for member functions between the child class in lib*.so (U type symbol) and the parent class (T type symbol). Why is it able to do this but not able to resolve the undefined symbol for the parent class name?
(I've been searching around for a long time and tried the -rdynamic and -ldl options, though I don't fully understand what they do, but nothing worked)
Update 04 April 2019:
This is the g++ command line I used to make the main program binary.
g++ -fvisibility=hidden -pthread -static-libgcc -static-libstdc++ \
-m64 -fpic -ggdb3 -fno-var-tracking-assignments -std=c++14 \
-rdynamic \
-o ./build/main-prog \
/some_absolute_path/ParentBox.o \
/some_other_pathen/Triangle.o \
/some_other_pathen/Circle.o \
/some_other_pathen/<lots_of_depending_obj> \
/some_absolute_path/librandom.a \
-lz -ldl -lrt -lbz2
I searched every argument of this command line in https://gcc.gnu.org/onlinedocs/gcc/Option-Index.html (This seems to be a good reference site for all fellow programmers working with large projects with complicated g++ line :) )
Thanks to @Employed Russian. With his instructions, the problem narrowed down to exporting the symbols from the main program binary.
However, the main program binary has lots of dependencies as you can see from the above command, Circle, Triangle and lots of other object files.
We also need to add "-rdynamic" to the compilation of Circle, Triangle and other dependency object files. Otherwise it does not work.
In my case, I added "-rdynamic" to all files in my project to export all symbols. Not sure about "-fvisibility=hidden" doing anything good. I removed all of them in my Makefile anyway... I know this is not the best way but I will worry about speed later when everything is functionally correct. :)
More Updates:
The correct solution is in @Employed Russian's update in the answer.
My previous solution happened to work because I also removed "-fvisibility=hidden". It is not necessary (and probably wrong) to add -rdynamic to all objects used in the final link.
Please refer to @Employed Russian's explanation which addresses the core issue.
Final Update:
For fellow programmers who are interested in how a C/C++ program is executed and how libraries are linked, here is a good reference web course (Life of Binaries) by Xeno Kovah: http://opensecuritytraining.info/LifeOfBinaries.html
You can also find a playlist on YouTube. Just search for "Life of Binaries".
Although there is name mangling the symbol names are exactly the same. I'm just wondering why the dynamic linker cannot link them?
Most likely explanation: the symbol is not exported from the main binary.
Repeat your command with nm -D:
nm -AD lib*.so main-prog | grep ' _ZTI7ParentBox$'
Chances are, you'll see lib*.so: U _ZTI7ParentBox and nothing from main-prog.
This happens because normally the linker will not export any symbol from main-prog that is not referenced by some shared library participating in the link (and your lib*.so isn't linked with main-prog, or else you wouldn't need to dlopen it).
To change that behavior, you could add -Wl,--export-dynamic linker flag when linking main-prog. That instructs the linker to export everything that is linked into main-prog.
tried -rdynamic
That is equivalent to -Wl,--export-dynamic, and should have worked (assuming you added it to the main-prog link line, and not somewhere else).
Update:
Everything works now! Since main-prog also depends on some other objects, it appears that simply add -rdynamic to the final main-prog linking does not resolve the problem. We need to add "-rdynamic" to the compilation of those depending objects.
That is the wrong solution. Your problem is that -fvisibility=hidden tells the compiler to mark all symbols that go into main-prog as not exported, and -rdynamic doesn't export any hidden symbols.
The correct solution is to remove -fvisibility=hidden from any objects that define symbols you do want to export, and add -rdynamic to the final link.

Why does ld need library that my executable depends on?

I'm trying to build my executable (that depends on library utils.so) using the following command
g++ -L/path/to/libutils -lutils -I/path/to/utils_headers executable.cpp -o executable
Actually I don't have utils.so - only the header files of utils library.
I'm getting the error:
ld: cannot find -lutils
Does linker really need to access all the libraries my executable depends on in order to build my executable? If it does then I'd like to know why it needs to access them.
My executable is a shared library. I'm sure that header files of the utils lib are enough to build it (i.e without having utils.so).
The linkage option -lutils by default directs the linker to search,
first in the specified library search directories (-Ldir) and then
in its default search directories, for either of the files libutils.so
(shared library) or libutils.a (static library), preferring libutils.so
if both of them are found in the same search directory.
If such a file is found, the linker stops searching and adds that file
to the input files of the linkage, whether or not it resolves any references in
the linkage. The linker cannot know whether the file resolves any references
if it does not input the file.
If no such file is found, the linker gives the error: cannot find -lutils. Because
you told it to find libutils.{so|a} and it could not.
You say:
My executable is a shared library
But it isn't. Your compile-and-link command:
$ g++ -L/path/to/libutils -lutils -I/path/to/utils_headers executable.cpp -o executable
is not an attempt to link a shared library. It is an attempt to link a program.[1]
This would be an attempt to link a shared library:
$ g++ -shared -I/path/to/utils_headers -o libexecutable.so executable.cpp -L/path/to/libutils -lutils
You cannot link a program with unresolved references. But you can link a shared library
with unresolved references.
So, you could link a libexecutable.so like that, or you could link it simply like:
$ g++ -shared -I/path/to/utils_headers -o libexecutable.so executable.cpp
These are two different linkages: if they succeed they produce different output files.
In the first linkage, some symbols will (let's assume) be resolved to definitions provided in libutils.so or libutils.a
(whichever one is found), and this will be reflected by:
libutils.so is found: The .dynamic section of libexecutable.so contains a DT_NEEDED
structure that expresses a runtime dependency on libutils.so. libutils.so will need to be included in any linkage that includes libexecutable.so, but the output file of such a linkage will itself contain a runtime dependency only on libexecutable.so.
libutils.a is found: libexecutable.so itself contains the definitions for all the symbols
it uses that are defined by object files in libutils.a.[2] libexecutable.so may be included in subsequent linkages with no need for libutils.{so|a}.
In the second linkage, the .dynamic section of libexecutable.so will not express a runtime
dependency on libutils.so, nor will the file contain definitions of any symbols provided by libutils.{so|a}. libutils.so will (again) need to be included in any subsequent linkage that includes libexecutable.so, but the output file of such a linkage will acquire independent runtime dependencies on both libexecutable.so and libutils.so.
But, if you specify -lutils in the linkage - or any linkage - and the linker cannot find libutils.{so|a}
in any of its search directories, then you get the error you observe, because you told the linker
to input a file, whose effects on the linkage can only be determined and implemented if that file is found - and it cannot be found.
[1] An attempt that is likely to fail, because it consumes libraries before the object
files that refer to them.
[2] See static-libraries to understand why.
In general, an ELF linker needs a sufficiently accurate representation of the shared object that is linked in. It does not have to be an actually working shared object, just a sufficiently close representation of it. A few things absolutely require data that is not available in the object itself:
When compiling C programs, a reference to a global data object of incomplete type does not contain size information. The linker cannot place the object into the data segment unless it obtains the size information from somewhere. By default (when compiling for executables, including PIE) the object needs to be allocated in the data segment on many targets because of the relocations the compiler uses for compiling accesses to global data objects.
Similarly, the link editor might get the alignment of global data objects wrong if it has insufficient information.
Many libraries use symbol versioning. Symbol version information is only available when the link editor can see the shared object. If that information is missing, the link editor will not emit a symbol version, which instructs the dynamic linker to bind the symbol to the base version at run time, leading to subtle bugs.
However, if you only use C function symbols (not data symbols, or the varieties of symbols that C++ requires) and the target library does not use symbol versioning, you can use a stub library for linking. This is a library that defines all the functions you need and has the appropriate soname, but the functions are just dummies which do not actually do anything.
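A stub library along those lines can be very small. The sketch below is illustrative only: `utils_checksum` and `utils_log` are hypothetical function names, since the real interface of the utils library is not shown here. It could be built with something like `g++ -shared -fPIC -Wl,-soname,libutils.so -o libutils.so stub_utils.cpp`, where the soname must match that of the real library.

```cpp
// stub_utils.cpp: hypothetical stand-in for the real libutils.so.
// It exists only so that -lutils can be satisfied at link time;
// the real library is expected to be present at run time.

extern "C" int utils_checksum(const char* data, int length) {
    (void)data;    // dummy body: arguments are ignored
    (void)length;
    return 0;
}

extern "C" void utils_log(const char* message) {
    (void)message;  // dummy body: does nothing
}
```

Only the symbol names and the soname need to match; the bodies are never executed as long as the real library is the one found at run time.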

GCC Linking a shared library against an executable

I have the following problem. I have a shared library, which is just a bunch of translation units linked together so when I compile that shared library I won't get any linker error (undefined references, even though I might have).
The shared library gets loaded dynamically from an executable which also contains the exports which my shared library is using (The references used in my library are resolved at runtime).
The main problem is that I want the undefined reference warnings so I can fix them statically instead of waiting for the application to crash.
I read somewhere that I can pass "-Wl,--no-undefined" to gcc so I can get these errors back, indeed it worked but it also gave me all the undefined references of the executable's exports. I want to filter these warnings just to the scope of my translation units.
Is this possible? If not, how can I declare a reference to an executable that provides exports for a shared library?
you can try linking the library & main program with -Wl,-z,now. that should make the runtime ldso resolve all references immediately and throw an error if none are found.
otherwise, i'm not seeing an option off hand in the linker manual to say "allow this ELF to satisfy symbols, but don't actually list it as a DT_NEEDED".
you could try using -Wl,--no-undefined and parsing the output with a script so you can filter out symbols you know will be satisfied by the main program.
another option might be to label all the symbols you know the main program provides with __attribute__((weak)) and then still use -Wl,--no-undefined. the weak symbols won't be reported as an error.

g++ linking order dependency when linking c code to c++ code

Prior to today I had always believed that the order that objects and libraries were passed to g++ during the linking stage was unimportant. Then, today, I tried to link from c++ code to c code. I wrapped all the C headers in an extern "C" block but the linker still had difficulties finding symbols which I knew were in the C object archives.
Perplexed, I created a relatively simple example to isolate the linking error but much to my surprise, the simpler example linked without any problems.
After a little trial and error, I found that by emulating the linking pattern used in the simple example, I could get the main code to link OK. The pattern was object code first, object archives second eg:
g++ -o serverCpp serverCpp.o algoC.o libcrypto.a
Can anyone shed some light on why this might be so? I've never seen this problem when linking ordinary C++ code.
The order in which you specify object files and libraries is VERY important in GCC - if you haven't been bitten by this before you have led a charmed life. The linker searches symbols in the order that they appear, so if you have a source file that contains a call to a library function, you need to put it before the library, or the linker won't know that it has to resolve it. Complex use of libraries can mean that you have to specify the library more than once, which is a royal pain to get right.
The library order passed to gcc/g++ does actually matter. If A depends on B, A must be listed first. The reason is that the linker leaves out symbols that aren't referenced, so if it sees library B first, and no one has referenced it at that point, then it won't link in anything from it at all.
A static library is a collection of object files grouped into an archive. When linking against it, the linker only chooses the objects it needs to resolve any currently undefined symbols. Since the objects are linked in order given on the command line, objects from the library will only be included if the library comes after all the objects that depend on it.
So the link order is very important; if you're going to use static libraries, then you need to be careful to keep track of dependencies, and don't introduce cyclic dependencies between libraries.
You can use --start-group archives --end-group
and write the 2 dependent libraries instead of archives:
gcc main.o -L. -Wl,--start-group -lobj_A -lobj_b -Wl,--end-group

How to make gcc or ld report undefined symbols but not fail?

If you compile a shared library with GCC and pass the "-z defs" flag (which I think just gets passed blindly on to ld) then you get a nice report of what symbols are not defined, and ld fails (no .so file is created). On the other hand, if you don't specify "-z defs" or explicitly specify "-z nodefs" (the default), then a .so will be produced even if symbols are missing, but you get no report of what symbols were missing if any.
I'd like both! I want the .so to be created, but I'd also like any missing symbols to be reported. The only way I know of to do this so far is to run it twice, once with "-z defs" and once without. This means the potentially long linking stage is done twice though, which will make the compile/test cycle even worse.
In case you're wondering my ultimate goal -- when compiling a library, undefined symbols in a local object file indicates a dependency wasn't specified that should have been in my build environment, whereas if a symbol is missing in a library that you're linking against that's not an error (-l flags are only given for immediate dependencies, not dependencies of dependencies, under this system). I need the report for the part where it lists "referenced in file" so I can see whether the symbol was referenced by a local object or a library being linked. The --allow-shlib-undefined option almost fixes this but it doesn't work when linking against static libraries.
Preference to solutions that will work with both the GNU and Solaris linkers.
Instead of making ld report the undefined symbols during linking, you could use nm on the resulting .so file. For example:
nm --dynamic --undefined-only foo.so
EDIT: Though I guess that doesn't give you which source files the symbols are used in. Sorry I missed that part of your question.
You could still use nm for an approximate solution, along with grep:
for sym in $(nm --dynamic --undefined-only foo.so | awk '{print $NF}' | c++filt -p); do
grep -o -e "\\<$sym\\>" *.cpp *.c *.h
done
This might have problems with local symbols of the same name, etc.
From the GNU ld 2.15 NEWS file:
Improved linker's handling of unresolved symbols. The switch
--unresolved-symbols= has been added to tell the linker when it
should report them and the switch --warn-unresolved-symbols has been added to
make reports be issued as warning messages rather than errors.