Linking OpenGL on BSD

I have a slightly strange problem with a program that uses OpenGL.
If I try compiling on a FreeBSD machine with NVIDIA graphics using the link options -pthread -lm -lX11 -lGL -lGLU,
I get the error:
//usr/local/lib/libGL.so: undefined reference to `_nv021glcore'
//usr/local/lib/libGL.so: undefined reference to `_nv013glcore'
OpenGL demos run fine on this same machine. If I compile the same program on another machine with the same version of FreeBSD but no NVIDIA board, it compiles, but when I run that binary on the machine with the NVIDIA graphics, the program fails with
X Error of failed request: BadValue (integer parameter out of range for operation)
Major opcode of failed request: 153 (GLX)
Minor opcode of failed request: 3 (X_GLXCreateContext)
Value in failed request: 0x0
Serial number of failed request: 22
Current serial number in output stream: 2
OpenGL binary demos run fine on the same BSD machine, so there does not seem to be a problem with the GL setup, and the same program compiles and runs on Linux. I'm at a loss as to what is causing this.
EDIT
Most of these _nvXXXglcore symbols seem to be exported from libnvidia-glcore.so.1.
nm -D gives the exports:
00000000010abaa0 T _nv000glcore
00000000010aba90 T _nv001glcore
00000000010abac0 T _nv002glcore
00000000011322e0 T _nv003glcore
00000000010abae0 T _nv014glcore
00000000010d0200 T _nv015glcore
00000000010bc9e0 T _nv016glcore
0000000001c27340 B _nv017glcore
0000000001578ac0 R _nv018glcore
0000000001584480 R _nv019glcore
0000000001c292c0 B _nv020glcore
0000000001c22080 B _nv022glcore
0000000000add970 T _nv023glcore
0000000001c27860 B _nv024glcore
0000000001c292a8 B _nv027glcore
0000000001c2a2a0 B _nv028glcore
0000000001c27178 B _nv029glcore
0000000001bf0248 D _nv035glcore
000000000119ba60 T _nv042glcore
This list does not include the missing symbols, which still leaves me none the wiser as to how to solve this.
EDIT
The symbols in question are located in libnvidia-tls.so, which makes this all the stranger, since libGL.so lists that library as a dependency and so it should be found:
readelf -d /usr/local/lib/libGL.so
Dynamic section at offset 0xf0e08 contains 26 entries:
Tag Type Name/Value
0x0000000000000001 NEEDED Shared library: [libnvidia-tls.so.1]
0x0000000000000001 NEEDED Shared library: [libnvidia-glcore.so.1]
0x0000000000000001 NEEDED Shared library: [libX11.so.6]
0x0000000000000001 NEEDED Shared library: [libXext.so.6]
0x0000000000000001 NEEDED Shared library: [libc.so.7]
0x000000000000000e SONAME Library soname: [libGL.so.1]
0x0000000000000010 SYMBOLIC 0x0

I eventually found the solution.
Linking succeeds when passing -u _nv021glcore -u _nv013glcore, but that brings back the failed GLX request at run time. To solve that as well, X has to be started with startx -- +iglx to enable indirect rendering, which seems to be disabled by default these days because it is considered legacy OpenGL.

Related

lib OSMesa off-screen context creation fails in C++, but only when statically linked

I made a C++ tool for off-screen rendering of 3D models. The rendering is done using OSMesa library.
The software worked flawlessly for more than a year, and I stopped making updates to it about 6 months ago. In the meantime my development environment was updated multiple times.
Now I compiled it again and found an unexpected bug.
The plain version of the software still works as expected, but the statically linked one segfaults.
I'm assuming that the error is mine, in the OSMesa configuration/compilation/linking procedure, and not in the library code, but any advice on better debugging of the segmentation fault is appreciated.
Having tried numerous variations of the compilation process without success, I'm now quite stuck.
Can anyone see something stupid I'm doing in the steps described below?
I recompiled a static version of the OSMesa library, using the same version as the shared library that works on my system (12.0.6) and disabling all unneeded features (I'm on an Ubuntu-based system, and no static version of the OSMesa library is available from the repositories):
./configure \
--disable-xvmc \
--disable-glx \
--disable-dri \
--with-dri-drivers="" \
--with-gallium-drivers="" \
--disable-shared-glapi \
--disable-egl \
--with-egl-platforms="" \
--enable-osmesa \
--enable-gallium-llvm=no \
--disable-gles1 \
--disable-gles2 \
--enable-static \
--disable-shared
This is the compile command of my off-screen rendering tool:
g++ -std=c++11 -Wall -O3 -g -static -static-libgcc -static-libstdc++ ./src/measure_model.cpp model.o thumbnail.o -o measure_model_debug -pthread -lOSMesa -ldl -lm -lpng -lz -lcrypto
This is a warning that I was getting by statically compiling using OSMesa, and it was present even a year ago with the working static binary:
/home/XXX/XXX/backend/lambda/mesa/mesa-12.0.6/src/mesa/main/dlopen.h:52: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
This is what I get from running the tool:
Segmentation fault (core dumped)
But no segmentation fault is produced if I simply skip the OSmesa context creation step (and obviously all the 3D rendering)
This is the backtrace:
#0 0x0000000000000000 in ?? ()
#1 0x00000000004af20a in mtx_init (type=4, mtx=0xe10f70) at ../../include/c11/threads_posix.h:215
#2 _mesa_NewHashTable () at main/hash.c:135
#3 0x000000000052f295 in _mesa_alloc_shared_state (ctx=ctx@entry=0xdcc9b0) at main/shared.c:67
#4 0x000000000046e717 in _mesa_initialize_context (ctx=ctx@entry=0xdcc9b0, api=api@entry=API_OPENGL_COMPAT, visual=<optimized out>, share_list=share_list@entry=0x0, driverFunctions=driverFunctions@entry=0x7fffffffcd40) at main/context.c:1192
#5 0x000000000046c870 in OSMesaCreateContextAttribs (attribList=attribList@entry=0x7fffffffd290, sharelist=<optimized out>) at osmesa.c:834
#6 0x000000000046ccdc in OSMesaCreateContextExt (format=<optimized out>, depthBits=<optimized out>, stencilBits=<optimized out>, accumBits=<optimized out>, sharelist=<optimized out>) at osmesa.c:660
#7 0x0000000000468742 in generate_thumbnail(Model*, Json::Value) ()
#8 0x0000000000401c7d in main (argc=<optimized out>, argv=<optimized out>) at ./src/measure_model.cpp:107
A statically linked binary is a strict requirement.
The segmentation fault is happening on the same machine I use to compile the tool (OSmesa static lib is compiled in the same machine too), but no segmentation fault in the non-statically linked version of the same tool.
This is what I get from running the tool:
Segmentation fault (core dumped)
But no segmentation fault is produced if I simply skip the OSmesa context creation step (and obviously all the 3D rendering)
So there is some problem in OSMesa context creation. Your backtrace shows that the top frame was executing at an instruction pointer of zero (a jump to or call of NULL), so mtx_init, which is part of OSMesa context creation, called some function whose address was NULL.
#0 0x0000000000000000 in ?? ()
#1 0x00000000004af20a in mtx_init (type=4, mtx=0xe10f70) at ../../include/c11/threads_posix.h:215
#2 _mesa_NewHashTable () at main/hash.c:135
#3 0x000000000052f295 in _mesa_alloc_shared_state (ctx=ctx@entry=0xdcc9b0) at main/shared.c:67
#4 0x000000000046e717 in _mesa_initialize_context (ctx=ctx@entry=0xdcc9b0, api=api@entry=API_OPENGL_COMPAT, visual=<optimized out>, share_list=share_list@entry=0x0, driverFunctions=driverFunctions@entry=0x7fffffffcd40) at main/context.c:1192
#5 0x000000000046c870 in OSMesaCreateContextAttribs (attribList=attribList@entry=0x7fffffffd290, sharelist=<optimized out>) at osmesa.c:834
#6 0x000000000046ccdc in OSMesaCreateContextExt (format=<optimized out>, depthBits=<optimized out>, stencilBits=<optimized out>, accumBits=<optimized out>, sharelist=<optimized out>) at osmesa.c:660
#7 0x0000000000468742 in generate_thumbnail(Model*, Json::Value) ()
#8 0x0000000000401c7d in main (argc=<optimized out>, argv=<optimized out>) at ./src/measure_model.cpp:107
Which function was it? According to the online source of include/c11/threads_posix.h on GitHub, mtx_init() only calls pthread_mutex_init, pthread_mutexattr_init and several other mutex-related functions of libpthread (-lpthread).
Why was a call to NULL produced instead of a call to the real function? Probably because glibc and/or libpthread were linked statically. The exact problem is still unidentified at this point (I was able to find a report of libpthread.a being statically linked into a shared library, which is incorrect and will never work).
In your case there is only a (strong) alias for pthread_mutex_init in glibc/nptl/pthread_mutex_init.c (line 150), strong_alias (__pthread_mutex_init, pthread_mutex_init), and there may be some weak alias of the symbol in glibc itself, probably uninitialized. Something was wrong in your linking options and/or in ld's decisions, and it did not find or link nptl/pthread_mutex_init.o (part of the libpthread.a archive), which holds the real symbol, into the final executable (ld often skips unused or unneeded objects in .a archives and doesn't link them), leaving the relocation pointing to NULL. A glibc expert may know more; Employed Russian is one such expert on SO.
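The "call through NULL" mechanism can be reproduced in isolation. The following is a minimal sketch in C, assuming GCC or Clang on an ELF platform such as Linux; maybe_missing_fn is a hypothetical symbol that is deliberately never defined. It does not reproduce the Mesa crash itself, it only shows how an unresolved weak reference ends up as a null address.

```c
#include <stddef.h>

/* Hypothetical function, declared weak and deliberately never defined.
 * With no definition at link time, the linker resolves its address to 0
 * instead of reporting an undefined reference. */
extern int maybe_missing_fn(void) __attribute__((weak));

/* Returns the function's result if it was linked in, or -1 if the
 * weak reference was left unresolved (its address is NULL). */
int call_if_present(void)
{
    if (maybe_missing_fn == NULL) {
        /* Calling maybe_missing_fn() here would jump to address 0,
         * which shows up as a "#0 0x0000000000000000 in ?? ()" frame. */
        return -1;
    }
    return maybe_missing_fn();
}
```

Code that skips the NULL check and calls the function unconditionally would be expected to crash in the same way as frame #0 of the backtrace.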
I suggest linking statically only to your internal libraries, or perhaps also to ordinary non-system libraries like Mesa (you can use the -Wl,-Bstatic -lyour_lib -Wl,-Bdynamic options to temporarily switch to static linkage for the libraries listed in between, or use the -l: trick, as in -l:libYour_lib.a, found by Radek in the same question). But do not link statically to the most basic glibc libraries like libc, libpthread and librt (static linking of glibc has problems when NSS is used: the target system must have exactly the same version of the dynamic glibc for NSS to work).
If you want to package your application for older machines and you need some glibc features, you can also try shipping your own copy of the shared glibc libraries with your application: put them in a subdirectory, add the linker's rpath option to change the library search path, and change the INTERP section from the default ABI loader ld-linux.so.2 to your own copy of ld-linux.so.2 from your glibc version. You will still have problems with kernels that are too old, as newer glibc versions require some modern features (syscalls, structs) of a fairly recent kernel.
Or you can pack your application into some sort of container like Docker, or another isolation solution (or a chroot?), so that you always have your own versions of the libraries.
UPDATE: I just found a report of a similar backtrace, with NULL in place of the nptl mutex implementation: https://bugzilla.redhat.com/show_bug.cgi?id=163083 "Statically linked C++ program using pthreads will segfault" (2005-2007). The test case calls
pthread_mutex_init(&lock, NULL);
and is built with
g++ -g -static foo.cpp -o foo -lpthread
which gives the backtrace
#0 0x00000000 in ?? ()
#1 0x08048232 in main () at foo.cpp:7
The report continues:
This is apparently due to certain pthreads functions not being included in the output executable. This bug may duplicate #115157, and I apologize if so, but hopefully the included test case will be useful.
Additional info:
The suggestion in #115157 to forcibly link in all of libpthread.a is a valid workaround.
https://bugzilla.redhat.com/show_bug.cgi?id=115157 "executables linked statically with /usr/lib/nptl/libpthread.a fail" - 2004-2009 CLOSED WONTFIX
Jakub Jelinek 2004-10-29 05:26:10 EDT
First of all, avoid -static if you can, it only creates problems,
both portability wise and others as well.
If you really need to create statically linked binary with -lpthread
linked in, then just use -Wl,--whole-archive -lpthread -Wl,--no-whole-archive
instead of -pthread. Anything else has really many problems.
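A small test along the lines of the bug report's foo.cpp can show whether the mutex functions actually made it into a static binary. This is a sketch in C rather than the report's original test case; mutex_smoke_test is a name chosen here for illustration.

```c
#include <stddef.h>
#include <pthread.h>

/* Initialises, locks, unlocks and destroys a mutex.
 * Returns 0 on success, or the first non-zero pthread error code.
 * In a broken static link, the first call is where the jump to
 * address 0 would be expected to happen. */
int mutex_smoke_test(void)
{
    pthread_mutex_t lock;
    int rc;
    int rc_destroy;

    rc = pthread_mutex_init(&lock, NULL);
    if (rc != 0) {
        return rc;
    }

    rc = pthread_mutex_lock(&lock);
    if (rc == 0) {
        rc = pthread_mutex_unlock(&lock);
    }

    rc_destroy = pthread_mutex_destroy(&lock);
    return rc != 0 ? rc : rc_destroy;
}
```

Calling this from main() and building it once with -static -pthread and once with the -Wl,--whole-archive -lpthread -Wl,--no-whole-archive form quoted above should show whether the workaround makes a difference on a given toolchain.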

Unable to run Woden Physics Example in Pharo

I am trying to run the Woden Physics Example inside Pharo which involves getting Bullet properly compiled and the smalltalk bindings properly installed in Pharo.
I am using Linux Mint 17 x64.
But NativeBoost seems unable to load the compiled libraries. I have been using the sources provided here:
https://github.com/ronsaldo/bullet-pharo
https://github.com/ronsaldo/swig
I built the modified version of swig as well as the bullet libraries and bindings with the provided build scripts.
I have also double-checked that the bullet libraries are 32-bit.
Opening up the Woden physics example returns this error:
failed to get a symbol address:
PharoNB_new_BTDefaultCollisionConfiguration__SWIG_1
When examining the call stack in the debugger, it turns out that the module handle is 0.
I verified this by executing the same message as
BulletCInterface nbLibraryNameOrHandle
executes:
NativeBoost forCurrentPlatform loadModule: 'BulletPharo'
This message returns 0. I tried to specify the full path to libPharoBullet.so in the workspace, like:
NativeBoost forCurrentPlatform loadModule:
'/home/martin/.local/share/Pharo/bullet-pharo/libBulletPharo.so'
with the same result. I also tried it with a 32-bit system library of mine (liblzma), and NativeBoost was able to load that one, returning a non-zero handle.
So I suspect something went wrong during compilation...
I also ran
readelf -h libPharoBullet.so
and its ABI was "UNIX - GNU", while the ABI of pharo-vm is "UNIX - System V".
Could this be the problem here?
How can I force the ABI to be System V when compiling? I use gcc 4.8.2.
Or what other steps could I take?

armadillo requested size is too large

I am using Armadillo 4.300.0. I am operating on a dense matrix of size 2840260x103, which I load from a .csv file of approximately 3.7GB. I have enabled "ARMA_64BIT_WORD" in my application as well as in config.hpp under the armadillo_bits directory.
#if !defined(ARMA_64BIT_WORD)
#define ARMA_64BIT_WORD
#endif
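For a sense of scale, the arithmetic behind the matrix size can be sketched as follows. This assumes 8-byte double elements (what Armadillo's mat type stores); matrix_elements, matrix_bytes and fits_in_int32 are illustrative helpers, not Armadillo functions, and the sketch does not claim to reproduce Armadillo's internal size check.

```c
#include <stdint.h>

/* Number of elements in a dense rows x cols matrix. */
uint64_t matrix_elements(uint64_t rows, uint64_t cols)
{
    return rows * cols;
}

/* Storage needed for the matrix, in bytes, for a given element size. */
uint64_t matrix_bytes(uint64_t rows, uint64_t cols, uint64_t elem_size)
{
    return rows * cols * elem_size;
}

/* Returns 1 if the value can be held in a signed 32-bit integer, else 0. */
int fits_in_int32(uint64_t value)
{
    return value <= (uint64_t)INT32_MAX;
}
```

For 2840260 x 103 this gives 292,546,780 elements and 2,340,374,240 bytes: the element count still fits in a signed 32-bit integer, but the byte count does not.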
I am compiling with gcc49 and running on Ubuntu 12.04. When I run the application I get the following error. Interestingly, the application occasionally runs too: if I keep trying some 10 times, it sometimes runs.
error: Mat::init(): requested size is too large
terminate called after throwing an instance of 'std::logic_error'
what(): Mat::init(): requested size is too large
Do I need to take care of something else?
Ramki.
This problem is solved with the Intel MKL library when compiling with -DMKL_ILP64 -m64. Typically we focus only on the link flags, but it is important to note that these flags must also be passed during the compile phase on the gcc command line. I am not sure how to enable this for the OpenMPI library. Also, libarmadillo.so must link with mkl_ilp64 instead of mkl_lp64. Follow the instructions below.
Building and installing armadillo :
export CXX=icpc
export CC=icpc
export PATH=$PATH:/home/ramki/intel/bin:
Edit $armadillo_root/cmake_aux/Modules/ARMA_FindMKL.cmake, include the PATHS correctly.
Edit $armadillo_root/cmake_aux/Modules/ARMA_FindMKL.cmake, change mkl_lp64 to mkl_ilp64
Edit $armadillo_root/CMakeLists.txt and (1) Change CMAKE_SHARED_LINKER_FLAGS to include the link line by intel link advisor and (2) Change CMAKE_CXX_FLAGS as given by intel link advisor
Run ./configure and make sure that the MKL library is used for BLAS and LAPACK, that icpc is the compiler, and that the rest looks right.
Run make .
Verify the linked libraries by running ldd libarmadillo.so. Mainly verify whether it is linked with mkl_ilp64 library and mkl blas and lapack libraries.
Now run make install DESTDIR=local path.
This should work.

GCC: Executing Code at "Preinitialization" time

On Linux, when the executable of a C++ program compiled and linked with gcc is loaded, the following happens:
1. exec* syscall
2. LD dynamic libraries loaded
3. C++ static initialization
4. entry point of main
Suppose I have some function with prototype void f(),
Is there some way (via source modification, attributes, compiler/linker options, etc) to link the executable with f such that it will be executed between step 1 and 2 ?
What about between step 2 and 3 ?
(Clearly there is no standard way to do this; I am asking for a platform-specific, compiler-specific way for recent versions of gcc/linux/x86_64/glibc/binutils.)
Yes, you could do this between (1) and (2), or between (2) and (3). Step 2, "ld dynamic libraries loaded", is actually done by calling the dynamic linker, ld.so. Typically this would be /lib64/ld-linux-x86-64.so.2 or similar; it's part of glibc. However, the path is actually specified in the executable, so you can use any path you want.
$ readelf -l `which bash`
⋮
Program Headers:
⋮
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
⋮
This is in addition to doing things like LD_PRELOAD/LD_AUDIT.
For between (2) and (3), it sounds like you just want to change the entrypoint address.
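One compiler-specific mechanism for running code before main() is the constructor attribute of GCC (also accepted by Clang). The sketch below is an assumption about what might fit the question rather than something taken from the answers here: a constructor in the executable runs after the dynamic linker has loaded the libraries, so at best it addresses the window between (2) and (3), not the one between (1) and (2). The names f_ran and f_ran_before_main are illustrative.

```c
/* Flag set by f(); it stays 0 until f() has run. */
static int f_ran = 0;

/* GCC/Clang extension: run f() before main(). Priorities up to 100 are
 * reserved for the implementation, so 101 is the earliest slot available
 * to user code; f() runs ahead of constructors in the same executable
 * that have no explicit priority. */
__attribute__((constructor(101)))
static void f(void)
{
    f_ran = 1;
}

/* Reports whether f() has already run; expected to be 1 inside main(). */
int f_ran_before_main(void)
{
    return f_ran;
}
```

Initializers of shared libraries the executable depends on still run earlier than this, so the ordering guarantee only covers code within the executable itself.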
Basically no. The execve() system call will wipe the environment. There is no way for anything you do in your address space to survive into the new one. You are allowed to send file descriptors (except ones flagged CLOEXEC, of course) into the new process, and you can pass it arguments via the environment.
... which gives you something that might do what you want. You can set LD_PRELOAD to load a shared library "first", before any code is run from your target executable and before any symbols are resolved by the shared linker. It's not clear from your question whether this meets your requirements or not.

gdb: Cannot find new threads: generic error after system update

I am running an OpenEmbedded based Linux on an ARM board, where my application is running. I used to run kernel 2.6.35, gdb 6.8 and gcc 4.3. Lately I've updated the system to kernel 2.6.37, gdb 7.4 (also tried 7.3) and gcc 4.6.
Now my application cannot be debugged anymore (on the ARM board): every time I try to run it in gdb I get the error "gdb: Cannot find new threads: generic error". The application makes use of pthreads and does link against pthreads (readelf lists libpthread.so.0 as a dependency). The suggested solutions I have found so far all recommend linking to pthread, which I am already doing. The other recommendation I found was to use LD_PRELOAD=/lib/libpthread.so.0, which does not make any difference for me.
Debugging the x86 builds of the application works without a problem.
EDIT: To answer the questions posed in the first answer, I am using gdb on the target (ARM), i.e. no cross-gdb. I also have not stripped libpthread.so.0 (/lib/libpthread-2.9.so: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.16, not stripped). glibc remained at version 2.9, and the update involved recompiling the whole linux image
EDIT2: Removing /lib/libthread_db* allows debugging (with consequent warnings, and obviously some features will not work)
EDIT3: Using set debug libthread-db 1 I get:
Starting program: /home/root/app
Trying host libthread_db library: libthread_db.so.1.
Host libthread_db.so.1 resolved to: /lib/libthread_db.so.1.
td_ta_new failed: application not linked with libthread
thread_db_load_search returning 0
Trying host libthread_db library: libthread_db.so.1.
Host libthread_db.so.1 resolved to: /lib/libthread_db.so.1.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
warning: Unable to set global thread event mask: generic error
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 0.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 1.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 2.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 3.
thread_db_load_search returning 1
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 0.
Cannot find new threads: generic error
(gdb) Write failed: Broken pipe
There are two common causes of this error:
You have a mis-match between libpthread.so.0 and libthread_db.so.1
You have stripped libpthread.so.0
Your message isn't entirely clear:
are you using cross GDB to debug the application running on ARM from an x86 host?
have you updated (or rebuilt) glibc in addition to updating the kernel, etc.?
If you stripped libpthread.so.0, then don't do that -- libthread_db needs it to not be stripped.
If you are cross-debugging, make sure to rebuild libthread_db.so.1 on host to match glibc on target.
Update:
not cross-debugging
did not strip libpthread
So, something in your GDB or glibc appears to have been broken. You can try to see what that is by
Putting removed libthread_db back, and
(gdb) set debug libthread-db 1
(gdb) run
Update 2:
warning: Unable to set global thread event mask: generic error
This means that GDB was able to look up the td_ta_set_event function in libthread_db and called it, but the function returned an error. One way this could happen is if GDB was unable to find the __nptl_threads_events variable in libpthread.so.0. What does this command produce:
nm /lib/libpthread.so.0 | grep __nptl_threads_events
If that command produces output, e.g.:
000000000021c294 b __nptl_threads_events
then I am not sure what else is failing. You'll likely have to debug GDB itself to figure out what's happening.
If on the other hand the grep above produces no output, then it's a problem with your toolchain: you'll have to figure out why that variable doesn't appear in your rebuilt libpthread.so.0.