Undefined symbol when trying to load a library with dlopen - c++

I'm trying to load a shared library (plugin) I was provided (closed source) with dlopen under a Linux ARM platform. I'm trying to load this way:
void* handle = dlopen(<library_path>/<library_name>, RTLD_NOW);
The result is a failure with this message:
Failed to load <library_path>/<library_name>: undefined symbol: <symbol_name>.
I tried to look inside the library with nm, but it seems the lib was stripped, no symbol could be found. I also tried using readelf -s, and, in fact, I got this result:
12663: 00000000 0 NOTYPE GLOBAL DEFAULT UND <symbol_name>
By reading around, I get that readelf -s returns all the symbols, including those symbols defined in libraries referenced by it.
The answers to this question are not completely clear to me: is this a symbol that is supposed to be in the library and is missing because the library was built incorrectly, or is it a symbol I'm supposed to find somewhere else? The output of readelf -d seems to suggest I'm providing all the needed shared libraries. Could this error be related to a mistake in the way I'm compiling my executable, or is this something unrelated to the loader?
Also, I read about the meaning of each column, but those values are quite strange. How do you interpret that symbol description? Why is address 0? Why is type NOTYPE?

undefined symbol: X always means that X should be exported by one of the loaded libraries, but is not. You need to find out which library provides the requested symbol and link against it.
Note that this message always indicates a problem with the library; it is not your fault. A library should know how to resolve all of its own symbols. If it doesn't, you can link your executable against the required library, so that the requested symbol is already known when you load your plugin.
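To see exactly which symbol the loader is missing, it helps to capture the text from dlerror() right after a failed dlopen(). The following is a minimal sketch, not the asker's code: the helper name load_error is hypothetical, and on glibc older than 2.34 the program must also be linked with -ldl.

```cpp
#include <dlfcn.h>   // dlopen, dlerror, dlclose (POSIX)
#include <string>

// Try to load a shared object and return the loader's error text,
// or an empty string when loading succeeds.
// RTLD_NOW asks the loader to resolve all undefined symbols immediately,
// so a missing function symbol is reported here instead of at first call.
// (load_error is a hypothetical helper name used only for this sketch.)
std::string load_error(const std::string& path) {
    dlerror();  // clear any stale error state
    void* handle = dlopen(path.c_str(), RTLD_NOW);
    if (handle == nullptr) {
        const char* msg = dlerror();
        return msg ? std::string(msg) : std::string("unknown dlopen failure");
    }
    dlclose(handle);
    return std::string();
}
```

A non-empty return value of the form "...: undefined symbol: X" names the symbol that some already-loaded library, or the executable itself, would have to provide.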
The cause can also be more complex. Even when both the plugin and the main application link against the same library, loading can still end with undefined symbols. This can happen when the main application and the plugin use different versions of the library (specifically, when the plugin uses a newer one). By the time the plugin is loaded, the older version is already loaded, so the loader assumes everything is fine, but the newer version may contain symbols the older one lacks. If the plugin uses them, you get undefined symbol errors.
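On the readelf -s line quoted in the question: the columns are Num, Value, Size, Type, Bind, Vis, Ndx and Name. An Ndx of UND means the symbol is only referenced by this file and must be supplied by another object at load time. That is why its value (address) is 0 and why its type is commonly NOTYPE. The sketch below splits such a row into fields. The struct and function names are hypothetical, and it assumes the plain eight-column layout shown in the question.

```cpp
#include <sstream>
#include <string>

// One row of `readelf -s` output, split into its columns.
// (SymbolRow and parse_symbol_row are hypothetical names for this sketch.)
struct SymbolRow {
    std::string num, value, size, type, bind, vis, ndx, name;

    // Ndx == "UND": the symbol is referenced here but defined elsewhere.
    bool is_undefined() const { return ndx == "UND"; }
};

// Split a whitespace-separated row; returns false if any column is missing.
bool parse_symbol_row(const std::string& line, SymbolRow& row) {
    std::istringstream in(line);
    return static_cast<bool>(in >> row.num >> row.value >> row.size >> row.type
                                >> row.bind >> row.vis >> row.ndx >> row.name);
}
```

Rows that parse with is_undefined() == true are the ones the dynamic loader has to satisfy from the executable or from other shared libraries.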

This problem also appears if the order of the static libraries in the link command for the app is wrong. The Unix ld linker requires that a library which implements a function is specified after the library which refers to that function.
I ran into this when I was trying to build the libtesseract shared library with libz taken from a custom location (not the standard libz from the host, but one manually built from source as well). I have put an example below:
Wrong linking order (-lz before -llept):
$ g++ -fPIC -DPIC -shared -nostdlib /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o -Wl,--whole-archive ....(some libs) -Wl,--no-whole-archive -L/home/build/jenkins/workspace/tesseract/zlib/bin/lib -L/home/build/jenkins/workspace/tesseract/leptonica/bin/lib -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/x86_64-linux-gnu -lz -llept -lstdc++ -lm -lc -lgcc_s /usr/lib/gcc/x86_64-linux-gnu/5/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o -g -O2 -Wl,-soname -Wl,libtesseract.so.4 -o .libs/libtesseract.so.4.0.1
Check with "nm -D":
$ nm -D .libs/libtesseract.so.4.0.1 | grep deflateInit
U deflateInit_
Check with "dlopen":
Cannot load ./tesseract/src/api/.libs/libtesseract.so.4.0.1 (./tesseract/src/api/.libs/libtesseract.so.4.0.1: undefined symbol: deflateInit_)
This happens because the linker processes the static libraries in the order they appear on the command line and skips any library that is not used by the inputs preceding it. At the moment libz.a is checked, none of the inputs processed so far uses a function from it, so the linker simply "forgets" libz.a.
Proper linking order (-lz after -llept):
$ g++ -fPIC -DPIC -shared -nostdlib /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o -Wl,--whole-archive ....(some libs) -Wl,--no-whole-archive -L/home/build/jenkins/workspace/tesseract/zlib/bin/lib -L/home/build/jenkins/workspace/tesseract/leptonica/bin/lib -L/usr/lib/gcc/x86_64-linux-gnu/5 -L/usr/lib/x86_64-linux-gnu -llept -lz -lstdc++ -lm -lc -lgcc_s /usr/lib/gcc/x86_64-linux-gnu/5/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o -g -O2 -Wl,-soname -Wl,libtesseract.so.4 -o .libs/libtesseract.so.4.0.1
Check with "nm -D":
$ nm -D .libs/libtesseract.so.4.0.1 | grep deflateInit
000000000041fb5b T deflateInit_
000000000041fba3 T deflateInit2_
"dlopen" did not show this error this time.

Related

Linker error undefined reference even if functions are defined in shared library

I'm trying to modify a piece of C++ code (developed under Linux with the gcc toolchain) so that it uses new functions defined in two libraries: a shared library (called libsio4_api.so) and a static library (called sio4_main.a). The original code is built with a makefile that first creates a shared library by linking together various object files and then links this library with the main object file.
This is the line that links the shared library:
g++ -s -shared main.o RTLinkResolution.o IORefresh.o UserProgramImpl.o UserProgramDataManager.o UserProgramCode.o -L./genericFiles/lib -lUserProgramEnvironment -lm -o libUProg.so
I modified it in this way to add to the linkage my libraries (-pthread is needed by one of them):
g++ -s -shared -pthread libsio4_api.so sio4_main.a main.o RTLinkResolution.o IORefresh.o UserProgramImpl.o UserProgramDataManager.o UserProgramCode.o -L./genericFiles/lib -lUserProgramEnvironment -lm -o libUProg.so
This link step completes without errors, and in the resulting library I can see the functions I want to use (in this example I show just one, but the others are present as well):
nm -D libUProg.so | grep sio4_async_init
000000000006d581 T sio4_async_init
000000000006dcae T sio4_async_init_data
U _Z15sio4_async_initv
The problem is that when the final linking is done the functions are not found:
g++ -o test main.o libUProg.so -pthread sio4_main.a libsio4_api.so
libUProg.so: undefined reference to `sio4_async_close(int)'
libUProg.so: undefined reference to `sio4_async_init()'
libUProg.so: undefined reference to `sio4_async_open(int, int, int*)'
I searched already for similar problems in other topics and I found out that the order in which libraries are fed to the linker is important, but even if I change the order in the final linking command those functions are not found.
Does someone have any clue about how I can proceed?
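One thing the nm output above does show: the defined symbols are unmangled (sio4_async_init), while the undefined one is _Z15sio4_async_initv, which is the C++-mangled name of sio4_async_init(). A plausible reading, offered only as an assumption since the sio4 headers are not shown, is that the library was compiled as C and its declarations reach the C++ code without extern "C". A minimal sketch with hypothetical names:

```cpp
// Hypothetical stand-in for a C library header. The real sio4 header is not
// shown in the question, so this name and signature are assumptions.
extern "C" {
    int demo_async_init(void);
}

// Stand-in definition so the sketch links on its own. In a real project
// this would come from the C library (for example a .so or .a file).
extern "C" int demo_async_init(void) {
    return 0;
}

// C++ caller: thanks to the extern "C" declaration, the compiler emits a
// reference to the unmangled symbol `demo_async_init` rather than the
// mangled `_Z15demo_async_initv`.
int call_from_cpp() {
    return demo_async_init();
}
```

If this is indeed the cause, wrapping the #include of the C header in extern "C" { ... } in the C++ sources would make the references match the symbols the libraries actually define.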

c++ & OpenMP : undefined reference to GOMP_loop_dynamic_start

I'm stuck in the following problem : at first I compile the following file cancme.cpp :
void funct()
{
int i,j,k,N;
double s;
#pragma omp parallel for default(none) schedule(dynamic,10) private(i,k,s) shared(j,N)
for(i=j+1;i<N;i++) {}
}
by:
mingw32-g++.exe -O3 -std=c++11 -mavx -fopenmp -c C:\pathtofile\cancme.cpp -o C:\pathtofile\cancme.o
Next I build a second file, test.cpp to simply link cancme.o with :
int main()
{
return(0);
}
by:
mingw32-g++.exe -O3 -std=c++11 -mavx -fopenmp -c C:\pathtofile\test.cpp -o C:\pathtofile\test.o
When linking it with cancme.o, by :
mingw32-g++.exe -o C:\pathtofile\test.exe C:\pathtofile\test.o -lgomp C:\pathtofile\cancme.o
I get the following error messages :
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x39): undefined reference to `GOMP_loop_dynamic_start'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x49): undefined reference to `GOMP_loop_dynamic_next'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x52): undefined reference to `GOMP_loop_end_nowait'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x92): undefined reference to `GOMP_parallel_start'
C:\pathtofile\cancme.o:cancme.cpp:(.text+0x9f): undefined reference to `GOMP_parallel_end'
Does anyone have an idea about what's going wrong there? The OpenMP library is linked via the -lgomp flag, but it is as if it were not recognized.
Note: I use the MinGW 4.8.1 C++ compiler under 64-bit Windows 7.
thank you
renato
The GNU linker is a single-pass linker. This means it only resolves references that it has already seen by the time it reaches the object file or library that defines the corresponding symbol. So if object file foo.o refers to symbols from library libbar.a, then ... foo.o -lbar ... links successfully, since the undefined references seen in foo.o are satisfied during the processing of libbar.a (as long as no object listed after -lbar refers to symbols from the library). The opposite order, ... -lbar foo.o ..., won't work: once the linker has processed libbar.a, it will not search it again while trying to resolve the references in foo.o.
On Unix systems that support dynamic link libraries (e.g. Linux, FreeBSD, Solaris, etc.), this is often not the case since -lbar will first look for the dynamic version of the library, e.g. libbar.so, and only if not found would try to link against the static libbar.a. When linking with dynamic libraries, the order doesn't matter since the unresolved references are handled later by the runtime link editor.
On Windows, linking against dynamic link libraries (DLLs) requires the so-called import libraries to be statically linked into the executable. Therefore, even if the external dependencies are handled by the runtime linker, one still needs to properly order the static import libraries. libgomp.a is one such library.
Note that this behaviour is specific to the GNU linker that is part of MinGW. The Microsoft linker behaves differently: when resolving a reference, it first searches the libraries listed after the object file and then the libraries listed before it.
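The left-to-right behaviour described above can be illustrated with a toy model. This is a deliberately simplified sketch, not how ld is implemented: each archive is treated as a single unit (the real linker pulls in individual members), and all type and function names are hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified model of one command-line input: an object file or a static archive.
struct Input {
    bool is_archive;
    std::set<std::string> defines;  // symbols it provides
    std::set<std::string> needs;    // symbols it references but does not define
};

// Process inputs left to right and return the symbols still undefined at the end.
// An archive is consulted only at its own position, and only if it defines
// something that is undefined at that moment; otherwise it is skipped for good.
std::set<std::string> unresolved_after_link(const std::vector<Input>& inputs) {
    std::set<std::string> defined, undefined;
    for (const Input& in : inputs) {
        if (in.is_archive) {
            bool wanted = false;
            for (const std::string& s : in.defines) {
                if (undefined.count(s)) { wanted = true; break; }
            }
            if (!wanted) continue;  // nothing pending: the archive is not used
        }
        for (const std::string& s : in.defines) {
            defined.insert(s);
            undefined.erase(s);
        }
        for (const std::string& s : in.needs) {
            if (!defined.count(s)) undefined.insert(s);
        }
    }
    return undefined;
}
```

With foo.o needing bar and libbar.a defining it, the order {foo.o, libbar.a} leaves nothing unresolved, while {libbar.a, foo.o} leaves bar undefined, which mirrors the -lgomp case in the question.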
You made a mistake in the link command:
mingw32-g++.exe -o C:\pathtofile\test.exe C:\pathtofile\test.o -lgomp C:\pathtofile\cancme.o
If you used a build-system generator such as CMake, you would not get this linking error.
Don't interleave the arguments like this: object_file_1 linker_flag_1 object_file_2.
The correct link command would be:
mingw32-g++.exe -o C:\pathtofile\test.exe C:\pathtofile\test.o C:\pathtofile\cancme.o -lgomp

dynamic library loading with linking to static library

I have a program structure that has
static library(ACE)
static library(common.a)
dynamic library plugin 1(1.so)
plugin 2(2.so) and executable
plugin1, plugin2 and executable all use both common.a and libACE.a
I followed the tutorial here: http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html.
I only link those two static libraries when building the executable, as shown below:
g++ -g -DUNIX -DLINUX -Wall -D__NUMBER_FIELD_ID__ -I/opt/ACE_wrappers -Ilib/ -I. -I./common -I./common/lib -I../inc -I/opt/pct/pctlib/inc -o acs_d acs_d.o -L../lib -Wl,--export-dynamic -rdynamic -Wl,--whole-archive /opt/ACE_wrappers/ace/libACE.a common/libcommon_d.a -Wl,--no-whole-archive -ldl -lrt -lpthread
The point is, when I use dlopen to open those two plugins, one succeeds and one fails.
The successful one uses more ACE functions, and the error complains about an undefined symbol, as shown below:
[CModuleMgr] loadCModule(): Errors occurred when opening the module. nCModuleId[1] pLibHandle[(nil)] sCModulePath[/opt/acs/adapter/libadapter_d.so] sError[/opt/acs/adapter/libadapter_d.so: undefined symbol: _ZN17ACE_Event_Handler10set_handleEi]
For the main program, I have tried to use command nm to find the symbol
$ nm acs_d | grep _ZN17ACE_Event_Handler10set_handleEi
000000000048f240 t _ZN17ACE_Event_Handler10set_handleEi
It is there, but the plugin cannot find it! I have used options like -Wl,--export-dynamic -rdynamic -Wl,--whole-archive, but it still cannot find this symbol. Any idea?
It is there, but the plugin cannot find it!
No, the symbol is not there!
Or rather, the symbol has internal linkage (t), and is not visible or usable outside of the ELF image into which it is linked. Globally visible symbols have external (T) linkage.
The most likely cause for the symbol to have t linkage is that the symbol has __attribute__((visibility("hidden"))) at the source level. Documentation here.
Why ACE developers marked it as such, I don't know.
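For illustration, this is roughly what the difference looks like at the source level with GCC or Clang. It is a sketch with hypothetical function names, not ACE code, and whether ACE actually uses this attribute is not confirmed here.

```cpp
// Hypothetical functions; the names are for illustration only.

// Default visibility: eligible to be exported, so a plugin loaded with
// dlopen can bind to it (for an executable, together with -rdynamic).
__attribute__((visibility("default"))) int plugin_api_version() {
    return 1;
}

// Hidden visibility: usable inside the same ELF image only. After linking,
// nm typically lists such a function with a lowercase `t`.
__attribute__((visibility("hidden"))) int internal_helper() {
    return 41;
}

// Both are callable from code inside the same image.
int use_both() {
    return plugin_api_version() + internal_helper();
}
```

Compiling with -fvisibility=hidden has the same effect as the hidden attribute on every symbol that is not explicitly marked default, so the build flags of the static library are worth checking too.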

Why does the library linker flag sometimes have to go at the end using GCC?

I'm writing a small C program that uses librt. I'm quite surprised that the program won't compile if I place the link flag at the start instead of at the end:
At the moment, to compile the program I do:
gcc -o prog prog.c -lrt -std=gnu99
If I were to do the following, it will fail to find the functions in librt:
gcc -std=gnu99 -lrt -o prog prog.c
Yet, this works with other libraries. I found the issue when attempting to use a simple Makefile. make actually compiled prog.c without linking first (using the -c flag) and then did the linking.
This is the Makefile:
CC = gcc
CFLAGS = -std=gnu99
LIBS= -lrt
LDFLAGS := -lrt
prog: prog.o
$(CC) -o prog prog.c -lrt -std=gnu99
The output I would get when typing make would be:
gcc -std=gnu99 -c -o prog.o prog.c
gcc -lrt prog.o -o prog
prog.o: In function `main':
prog.c:(.text+0xe6): undefined reference to `clock_gettime'
prog.c:(.text+0x2fc): undefined reference to `clock_gettime'
collect2: ld returned 1 exit status
make: *** [buff] Error 1
I have now crafted a Makefile that puts the linking at the end of the gcc line, however I'm puzzled why it doesn't work if the linking flag is at the start.
I would appreciate if anybody can explain this to me. Thanks.
As the linker processes each module (be it a library or a object file), it attempts to resolve each undefined symbol while potentially adding to its list of undefined symbols. When it gets to the end of the list of modules, it either has resolved all undefined symbols and is successful or it reports undefined symbols.
In your case, when it processed librt, it had no undefined symbols. Processing prog.o then resulted in clock_gettime being an undefined symbol. The linker will not go back and look in librt for the undefined symbols.
For that reason, you should always have your code first, followed by your libraries, followed by platform provided libraries.
Hope this helps.
From the ld (the GNU linker) documentation (http://sourceware.org/binutils/docs/ld/Options.html#Options):
The linker will search an archive only once, at the location where it is specified on the command line. If the archive defines a symbol which was undefined in some object which appeared before the archive on the command line, the linker will include the appropriate file(s) from the archive. However, an undefined symbol in an object appearing later on the command line will not cause the linker to search the archive again.
So if you specify the library too early, the linker will scan it, but not find anything of interest. Then the linker moves on to the object file produced by the compiler and finds references that need to be resolved, but it has already scanned the library and won't bother looking there again.
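As a concrete case, here is a hypothetical minimal caller of clock_gettime (the question does not show prog.c, so this is only a stand-in), with the build command in a comment placing the library after the source file:

```cpp
#include <time.h>   // clock_gettime, CLOCK_MONOTONIC (POSIX)

// Build with the library AFTER the source or object file, e.g.:
//   g++ -o prog prog.cpp -lrt
// On glibc 2.17 and later, clock_gettime lives in libc itself, so -lrt is
// no longer required there; the ordering rule still applies to other libraries.

// Return the monotonic clock reading in seconds, or -1.0 on failure.
// (monotonic_seconds is a hypothetical helper name for this sketch.)
double monotonic_seconds() {
    timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return -1.0;
    }
    return static_cast<double>(ts.tv_sec) + static_cast<double>(ts.tv_nsec) / 1e9;
}
```

In a Makefile this usually means putting libraries in a variable such as LDLIBS, which the built-in link rule places after the object files, rather than in LDFLAGS, which it places before them.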

C++ Statically linked shared library

I have a shared library used by another application beyond my control, which requires *.so objects. My library makes use of sqlite3, which needs to be statically linked with it (I absolutely need a self-contained binary).
When I try to compile and link my library:
-fpic -flto -pthread -m64
-flto -static -shared
I end up with the following error:
/usr/bin/ld: /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.1/crtbeginT.o: relocation R_X86_64_32 against `__DTOR_END__' can not be used when making a shared object; recompile with -fPIC
/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.6.1/crtbeginT.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
What is recompile with -fPIC related to? My code or CRT?
I have already tried to compile my object with -fPIC with the same result.
Thanks.
EDIT:
The problem does not seem to be related to SQLite3.
I wrote a simple one-line-do-nothing library which compiles and links like this:
g++ -c -fPIC -o bar.o bar.cpp
g++ -shared -o bar.so bar.o
but not like this:
g++ -c -fPIC -o bar.o bar.cpp
g++ -static -shared -o bar.so bar.o
The problem seems to be related to CRT (crtbeginT.o). Am I supposed to recompile GCC --with-pic or anything?
You shouldn't use the -static flag when creating a shared library; it's for creating statically linked executables.
If you only have a static version of the library, you can just link it in using -lsqlite3. But if there's both a dynamic version(.so) and a static version, the linker will prefer the dynamic one.
To instruct the linker to pick the static one, give the linker the -Bstatic flag, and make it switch back to dynamic linking for other stuff (like libc and dynamic runtime support) with -Bdynamic. That is, you use the flags:
-Wl,-Bstatic -lsqlite3 -Wl,-Bdynamic
Alternatively, you can just specify the full path of the .a file, e.g. /usr/lib/libsqlite3.a, instead of any compiler/linker flags.
With the GNU ld, you can also use -l:libsqlite3.a instead of -lsqlite3. This will force the use of the library file libsqlite3.a instead of libsqlite3.so, which the linker prefers by default.
Remember to make sure the .a file has been compiled with the -fpic flag; otherwise you normally can't embed it in a shared library.
Any code that will somehow make its way into a dynamic library should be relocatable. It means that everything that is linked with your .so, no matter statically or dynamically, should be compiled with -fPIC. Specifically, static sqlite library should also be compiled with -fPIC.
Details of what PIC means are here: http://en.wikipedia.org/wiki/Position-independent_code
I had the same problem. Apparently -static is not the same as -Bstatic. I switched to -Bstatic and everything worked.