Intel Fortran relocation error, libiomp5.so - fortran

There is a Fortran code which I am trying to compile which works only with Intel Fortran, and with a recent version, due to some Fortran 2008 features.
I am doing all this on a cluster, by using SSH.
I used to load the module
module load compilers/intel/15.1/
to use ifort. However, the 15.1 version is not sufficient for the compilation of this program. Therefore I loaded the only one more recent available on the cluster,
module load compilers/intel/17.0.3/
The program compiles.
However, at runtime I get the following error:
./myprogram: relocation error: ./myprogram: symbol kmp_aligned_malloc, version VERSION not defined in file libiomp5.so with link time reference
./myprogram: relocation error: ./myprogram: symbol kmp_aligned_malloc, version VERSION not defined in file libiomp5.so with link time reference
./myprogram: relocation error: ./myprogram: symbol kmp_aligned_malloc, version VERSION not defined in file libiomp5.so with link time reference
./myprogram: relocation error: ./myprogram: symbol kmp_aligned_malloc, version VERSION not defined in file libiomp5.so with link time reference
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 22573 on
node splinter-login.local exiting improperly. There are two reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
There seems to be clash between libraries. am I right? I printed
echo $LD_LIBRARY_PATH
:/share/apps/intel/l_ics_2015.1.133/composer_xe_2015.1.133/ipp/../compiler/lib/intel64/:/lib:/lib64:/share/apps/intel/l_ics_2015.1.133/composer_xe_2015.1.133/compiler/lib/intel64:/share/apps/intel/l_ics_2015.1.133/composer_xe_2015.1.133/compiler/lib/intel64/:/share/apps/intel/l_ics_2015.1.133/composer_xe_2015.1.133/ipp/../compiler/lib/intel64
so it seems to me that in LD_LIBRARY_PATH there are still the Intel 15 libraries. Should I subsitute them with the 17 ones? Problem is that in /share/apps/intel/ there are no 17 libraries...
UPDATE: I did
module purge
module load compilers/intel/17.0.3/
recompiled and rerun
and sourced the .sh file mentioned in the comments
and I still get the same error

Related

Get reason that LoadLibrary cannot load DLL

On Linux and Mac, when using dlopen() to load a shared library that links to another library, if linking fails because of a missing symbol, you can get the name of the missing symbol with dlerror(). It says something like
dlopen failed: cannot locate symbol "foo"
On Windows, when using LoadLibrary() to load a DLL with a missing symbol, you can only get an error code from GetLastError() which for this type of issue will always be 127. How can I figure out which symbol is missing, or a more verbose error message from LoadLibrary() that explains why the function failed?
I figured out a way using the MSYS2 terminal. Other methods might work with GUI software.
A major caveat is that this can't be done in pure C/C++ and released for end users. It's for developers only, but it's better than nothing.
Install Debugging Tools for Windows by downloading the Windows SDK and unchecking everything except Debugging Tools.
I could be wrong, but it seems that installing this software installs a hook into the Windows kernel to allow LoadLibrary() to write verbose information to stderr.
Open the MSYS2 Mingw64 terminal as an administrator and run
'/c/Program Files (x86)/Windows Kits/10/Debuggers/x64/gflags.exe' -i main.exe +sls
This prints the following to the terminal to confirm that the registry has been changed.
Current Registry Settings for main.exe executable are: 00000002
sls - Show Loader Snaps
Use -sls instead of +sls if you need to undo, since I believe that the change takes place for all programs called main.exe in Windows globally, not just for your file.
Then running main.exe should print debug information to stderr, but since I'm debugging an -mwindows application, it's not working for me.
But for some reason, running the binary with MSYS2's gdb allows this debug information to be printed to stderr.
Install mingw-w64-x86_64-gdb with MSYS2 and run gdb ./main.exe and type run or r.
Search for a section similar to the following.
warning: 1ec8:43a0 # 764081125 - LdrpNameToOrdinal - WARNING: Procedure "foo" could not be located in DLL at base 0x000000006FC40000.
warning: 1ec8:43a0 # 764081125 - LdrpReportError - ERROR: Locating export "foo" for DLL "C:\whatever\plugin.dll" failed with status: 0xc0000139.
warning: 1ec8:43a0 # 764081125 - LdrpGenericExceptionFilter - ERROR: Function LdrpSnapModule raised exception 0xc0000139
Exception record: .exr 00000000050BE5F0
Context record: .cxr 00000000050BE100
warning: 1ec8:43a0 # 764081125 - LdrpProcessWork - ERROR: Unable to load DLL: "C:\whatever\plugin.dll", Parent Module: "(null)", Status: 0xc0000139
warning: 1ec8:43a0 # 764081171 - LdrpLoadDllInternal - RETURN: Status: 0xc0000139
warning: 1ec8:43a0 # 764081171 - LdrLoadDll - RETURN: Status: 0xc0000139
Great! It says Procedure "foo" could not be located in DLL so we have our missing symbol, just like in POSIX/UNIX's dlopen().
While the answer from Remy Lebeau is technically correct, determining the missing symbol from GetLastError() is still possible on a Windows platform. To understand what exactly is missing, understanding the terminology is critical.
Symbol:
When a DLL is compiled, it's functions are referenced by symbols.
These symbols directly relate to the functions name (the symbols are
represented by visible and readable strings), its return type, and
it's parameters. The symbols can actually be read directly through a
text editor although difficult to find in large DLLs.DLL Symbols - C++ Forum
To have a missing symbol implies that a function within cannot be found. If this error occurs prior to using GetProcAddress(), then it's possible that any number of functions cannot be loaded due to missing prerequisites. This means it is possible that a library that you are attempting to load also requires a library that the first cannot load. These levels of dependency may go on for an unknown number of layers, but the only answer that GetLastError() can determine is that there was a missing symbol. One such method is by using Dependency Walker to determine the missing library the first library requires. Once all required libraries are available and can be found by that library (which can be its own can of worms), that library can be loaded via LoadLibrary().

Relocation error when using gdb

I'm experiencing a strange behavior that I need help resolving. We develop an Qt 5.8 application using Qt Creator for our i.MX6 board.
The application runs correctly in debug mode when it is started directly (or via gdb) on the target via a shell using the following argument:
-qmljsdebugger=port:10001,services:DebugMessages,QmlDebugger,V8Debugger,QmlInspector
However, whenever it is launched with remote gdb from host it prints out a relocation error. This behavior started to appear
recently when more signals/properties were added to a model (which in no different to the existing models).
The following is printed in Qt Creator:
Debugging starts
Listening on port 10000
Remote debugging from host 192.168.213.11
Process /opt/app/bin/app created; pid = 4923
/opt/app/bin/app: relocation error: /opt/app/bin/app: symbol _ZTVN10__cxxabiv120__si_class���e_infoE, version CXXABI_1.3 not defined in file libstdc++.so.6 with link time reference
If the following lines (+ one emit) are removed from the code it starts up correctly:
Q_PROPERTY(bool deleted READ getDeleted WRITE setDeleted NOTIFY deletedChanged)
Q_SIGNAL void deletedChanged();
I have also noticed that if additional signals are added I get different error messages, for example if 3 new unused signals are added (it seems that it does not matter to what model the signals are added) I get
/opt/app/bin/app: relocation error: /opt/app/bin/app: symbol ���St9exception, version GLIBCXX_3.4 not defined in file libstdc++.so.6 with link time reference
The project has about 25 models with about 100 properties in total.
Any ideas where to go from here?
EDIT: The libraries and applications on target are prelinked once at fresh startup. I noticed that it works correctly if the application to debug is prelinked before gdb-debugging starts. Not sure I understand why.

Unable to spawn a shell using gdb

I have this C function in a huge code:
void test() {
char *arg[] = {"/bin/sh", 0};
execve("/bin/sh", arg, 0);
}
I am trying to debug this code using gdb
(gdb) call test()
process 1948 is executing new program: /bin/dash
warning: Selected architecture i386:x86-64 is not compatible with reported target architecture i386
Architecture of file not recognized.
An error occurred while in a function called from GDB.
Evaluation of the expression containing the function
(test) will be abandoned.
When the function is done executing, GDB will silently stop.
Hence the shell is not spawning. How to go about it?
gdb isn't allowing you to exec a binary with a different architecture, even though it's compatible on your platform. The same occurs if you try to exec a 32bit executable from a 64bit one. This occurs on the latest version (7.5.1) of gdb as well.
If you can compile your code as 32bit without it causing other problems, it would be a workaround.
As per Hasturkun's comment, there is a patch available here.

Can't create QT postgresql plugin for Symbian device

I'm trying to create postgresql plugin for Symbian device but I can't compile it. I'm working on Windows 7 64bit.
I did everything according to this article: http://doc.qt.io/archives/qt-4.7/sql-driver.html#qpsql
C:\QtSDK\QtSources\4.8.1\src\plugins\sqldrivers\psql>qmake "INCLUDEPATH+=C:\Program Files (x86)\PostgreSQL\8.3\include" "LIBS+=C:\Program Files (x86)\PostgreSQL\8.3\lib\libpq.lib" psql.pro
WARNING: (internal):1: Unescaped backslashes are deprecated.
So, it looked OK. Then...
C:\...drivers\psql>C:\QtSDK\Symbian\tools\sbs\win32\mingw\bin\make debug-gcce
sbs -c arm.v5.udeb.gcce4_4_1
python.exe is not recognized as an internal or external command, operable program or batch file.
make: *** [debug-gcce] Error 9009
I noticed, that sbs_home was set to python directory but it was not in the path, then the make could not find the script raptor_start.py:
C:\...drivers\psql>echo %sbs_home%
C:\QtSDK\Symbian\tools\sbs\win32\python27
C:\...drivers\psql>set path=%path%;%sbs_home%
C:\...drivers\psql>C:\QtSDK\Symbian\tools\sbs\win32\mingw\bin\make debug-gcce
sbs -c arm.v5.udeb.gcce4_4_1
python.exe: can't open file C:\QtSDK\Symbian\tools\sbs\win32\python27\python\raptor_start.py': [Errno 2] No such file or directory
make: *** [debug-gcce] Error 2
C:\...drivers\psql>set sbs_home=C:\QtSDK\Symbian\tools\sbs
so, when I started compiling I got this error:
C:/QtSDK/Symbian/SDKs/Symbian3Qt474/epoc32/include/stdapis/stlportv5/stl/_istream.c:650: warning: suggest parentheses around '&&' within '||' target : epoc32\release\armv5\udeb\qsqlpsql.dll [arm.v5.udeb.gcce4_4_1]
FAILED linkandpostlink for arm.v5.udeb.gcce4_4_1: epoc32\release\armv5\udeb\qsqlpsql.dll
mmp: qsqlpsql_dll.mmp
c:/qtsdk/symbian/tools/gcce4/bin/../lib/gcc/arm-none-symbianelf/4.4.1/../../../../arm-none-symbianelf/bin/ld.exe: warning: C:/QtSDK/Symbian/SDKs/Symbian3Qt474/epoc32/release/armv5/udeb/usrt3_1.lib(ucppinit.o) uses variable-size enums yet the output is to use 32-bit enums; use of enum values across objects may fail
C:/QtSDK/Symbian/SDKs/Symbian3Qt474/epoc32/build/psql/c_8d95259b570e1766/qsqlpsql_dll/armv5/udeb/qsql_psql.o: In function `qMakeError': C:/QtSDK/QtSources/4.8.1/src/sql/drivers/psql/qsql_psql.cpp:175: undefined reference to `PQerrorMessage'
....many undefined references...
C:/QtSDK/QtSources/4.8.1/src/sql/drivers/psql/qsql_psql.cpp:117: undefined reference to `PQfreemem'
collect2: ld returned 1 exit status
mingw32-make[1]: *** [C:/QtSDK/Symbian/SDKs/Symbian3Qt474/epoc32/release/armv5/udeb/qsqlpsql.dll] Error 1
sbs: error: The make-engine exited with errors.
sbs : warnings: 3
sbs : errors: 2
built 'arm.v5.udeb.gcce4_4_1'
Run time 5 seconds
sbs: build log in C:\QtSDK\Symbian\SDKs\Symbian3Qt474\epoc32\build\Makefile.2012-06-26-15-03-12.78-2996.log
make: *** [debug-gcce] Error 1
Has anybody idea, what with it?
You are trying to build an Symbian/ARM application using precompiled binary postgresql client library for Windows. This won't ever work. The instructions you refer to only show how to build for OS X, Unix and Mac targets natively. You're cross compiling for Symbian.
You'd first need to obtain a binary version of the postgresql client library compiled for Symbian. It might exist out there, or you may need to compile it yourself. I don't think that the library supports Symbian as a target, and I couldn't readily find any Symbian ports for download. You may be out of luck for a trivial solution. A port might not be entirely out of the question, though -- perhaps the platform-specific code is localized enough.

gdb: Cannot find new threads: generic error after system update

I am running an OpenEmbedded based Linux on an ARM board, where my application is running. I used to run kernel 2.6.35, gdb 6.8 and gcc 4.3. Lately I've updated the system to kernel 2.6.37, gdb 7.4 (also tried 7.3) and gcc 4.6.
Now, my application can not be debugged anymore (on the ARM board), everytime I try to run it in gdb I get the error "gdb: Cannot find new threads: generic error". The application makes use of pthreads and does link against pthreads (readelf lists libpthread.so.0 as a dependency). The suggested solutions I found so far all recommend linking to pthread which I am already doing. The other recommendation I found was to use LD_PRELOAD=/lib/libpthread.so.0 which does not make any difference for me.
Debugging the x86 builds of the application works without a problem.
EDIT: To answer the questions posed in the first answer, I am using gdb on the target (ARM), i.e. no cross-gdb. I also have not stripped libpthread.so.0 (/lib/libpthread-2.9.so: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.16, not stripped). glibc remained at version 2.9, and the update involved recompiling the whole linux image
EDIT2: Removing /lib/libthread-db* allows debugging (with consequent warnings and obviously some features will not work)
EDIT3: Using set debug libthread-db 1 I get:
Starting program: /home/root/app
Trying host libthread_db library: libthread_db.so.1.
Host libthread_db.so.1 resolved to: /lib/libthread_db.so.1.
td_ta_new failed: application not linked with libthread
thread_db_load_search returning 0
Trying host libthread_db library: libthread_db.so.1.
Host libthread_db.so.1 resolved to: /lib/libthread_db.so.1.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
warning: Unable to set global thread event mask: generic error
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 0.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 1.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 2.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 3.
thread_db_load_search returning 1
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 0.
Cannot find new threads: generic error
(gdb) Write failed: Broken pipe
There are two common causes of this error:
You have a mis-match between libpthread.so.0 and libthread_db.so.1
You have stripped libpthread.so.0
Your message isn't entirely clear:
are you using cross GDB to debug the application running on ARM from an x86 host?
have you updated (or rebuilt) glibc in addition to updating the kernel, etc.
If you stripped libpthread.so.0, then don't do that -- libthread_db needs it to not be stripped.
If you are cross-debugging, make sure to rebuild libthread_db.so.1 on host to match glibc on target.
Update:
not cross-debugging
did not strip libpthread
So, something in your GDB or glibc appears to have been broken. You can try to see what that is by
Putting removed libthread_db back, and
(gdb) set debug libthread-db 1
(gdb) run
Update 2:
warning: Unable to set global thread event mask: generic error
This means that GDB was able to look up td_ta_set_event function in libthread_db, and called it, but the function returned an error. One way this could happen is if GDB was unable to find __nptl_threads_events function in libpthread.so.0. What does this command produce:
nm /lib/libpthread.so.0 | grep __nptl_threads_events
If that command produces output, e.g.:
000000000021c294 b __nptl_threads_events
then I am not sure what else is failing. You'll likely have to debug GDB itself to figure out what's happening.
If on the other hand the grep above produces no output, then it's a problem with your toolchain: you'll have to figure out why that variable doesn't appear in your rebuilt libpthread.so.0.