I am trying to remotely compile some code on a computing cluster, and I am getting the following error:
cd run_test; export OMP_NUM_THREADS=2; mpiexec -n 2 ./BATSRUS.exe > runlog
libibverbs: Warning: couldn't open config directory '/usr/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
libibverbs: Warning: couldn't open config directory '/usr/etc/libibverbs.d'.
libibverbs: Warning: couldn't open config directory '/usr/etc/libibverbs.d'.
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
-------------------------------------------------------------------------
[[57136,1],1]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:
Module: OpenFabrics (openib)
Host: login001
Another transport will be used instead, although this may result in
lower performance.
NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------
[login001:44614:0:44614] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x54b6)
==== backtrace (tid: 44614) ====
=================================
[login001:44613:0:44613] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x54b6)
==== backtrace (tid: 44613) ====
=================================
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 0 on node login001 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Honestly, I have no idea what this means. The code compiles and runs fine on a local machine. Does anyone have any ideas as to how I can solve this issue?
I am having an annoying problem while launching unreal-engine. I installed it from the AUR, and the engine is installed in the directory /opt/unreal-engine. Here are the log messages. I am still a beginner with both Linux and unreal-engine, so please guide me with easy-to-follow steps. Thanks.
LogUnixPlatformFile: Warning: create dir('/opt/unreal-engine/Engine/Saved/Config/Linux/Manifest.ini') failed: errno=13 (Permission denied)
LogUnixPlatformFile: Warning: create dir('/opt/unreal-engine/Engine/Saved/Config/Linux/Manifest.ini') failed: errno=13 (Permission denied)
LogUnixPlatformFile: Warning: create dir('/opt/unreal-engine/Engine/Saved/Config/CrashReportClient/UE4CC-Linux-F9CC0BA4D1E046998DB4DD29DC904FC3/CrashReportClient.ini') failed: errno=13 (Permission denied)
LogUnixPlatformFile: Warning: create dir('/opt/unreal-engine/Engine/Saved/Config/CrashReportClient/UE4CC-Linux-F9CC0BA4D1E046998DB4DD29DC904FC3/CrashReportClient.ini') failed: errno=13 (Permission denied)
[2020.02.05-06.48.13:587][ 0]LogUnixPlatformFile: Warning: open('/opt/unreal-engine/Engine/DerivedDataCache/8729777EB19D4DDAA2910A8040E24FC5.tmp', Flags=0x00080041) failed: errno=13 (Permission denied)
[2020.02.05-06.48.13:587][ 0]LogUnixPlatformFile: Warning: open('/opt/unreal-engine/Engine/DerivedDataCache/8729777EB19D4DDAA2910A8040E24FC5.tmp', Flags=0x00080041) failed: errno=13 (Permission denied)
[2020.02.05-06.48.13:587][ 0]LogDerivedDataCache: Warning: Fail to write to ../../../Engine/DerivedDataCache, derived data cache to this directory will be read only. WriteError: 0 (errno=2 (No such file or directory)) ReadError: 0 (errno=2 (No such file or directory))
[2020.02.05-06.48.13:587][ 0]LogDerivedDataCache: Warning: Local data cache path (../../../Engine/DerivedDataCache) was not usable, will not use it.
[2020.02.05-06.48.13:587][ 0]LogDerivedDataCache: Unable to find inner node Local for hierarchical cache Hierarchy.
[2020.02.05-06.48.13:587][ 0]LogDerivedDataCache: Shared data cache path not found in *engine.ini, will not use an Shared cache.
[2020.02.05-06.48.13:587][ 0]LogDerivedDataCache: Unable to find inner node Shared for hierarchical cache Hierarchy.
[2020.02.05-06.48.13:612][ 0]LogMaterial: Verifying Global Shaders for SF_VULKAN_SM5
Fatal error: [File:/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Engine/Private/ShaderCompiler/ShaderCompiler.cpp] [Line: 1253]
Could not create the shader compiler transfer file '/opt/unreal-engine/Engine/Intermediate/Shaders/tmp/DD11AF6A45C44C8586192EE55DBE07C0/0A2ADC10248884539A8A1054157AD3D00'.
Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=65535 LargeMemoryPoolOffset=131119
Malloc Size=439632 LargeMemoryPoolOffset=570768
Malloc Size=330840 LargeMemoryPoolOffset=901624
[2020.02.05-06.48.37:669][ 0]LogCore: === Critical error: ===
Unhandled Exception: SIGSEGV: invalid attempt to write memory at address 0x0000000000000003
[2020.02.05-06.48.37:669][ 0]LogCore: Fatal error: [File:/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Engine/Private/ShaderCompiler/ShaderCompiler.cpp] [Line: 1253]
Could not create the shader compiler transfer file '/opt/unreal-engine/Engine/Intermediate/Shaders/tmp/DD11AF6A45C44C8586192EE55DBE07C0/0A2ADC10248884539A8A1054157AD3D00'.
0x00007f76dd8d93f0 libUE4Editor-Engine.so!FShaderCompileThreadRunnable::CompilingLoop() [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Engine/Private/ShaderCompiler/ShaderCompiler.cpp:1513]
0x00007f76dd8d45a9 libUE4Editor-Engine.so!FShaderCompileThreadRunnableBase::Run() [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Engine/Private/ShaderCompiler/ShaderCompiler.cpp:1078]
0x00007f76e0202167 libUE4Editor-Core.so!FRunnableThreadPThread::Run() [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Core/Private/HAL/PThreadRunnableThread.cpp:25]
0x00007f76e01c9a00 libUE4Editor-Core.so!FRunnableThreadPThread::_ThreadProc(void*) [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Core/Private/HAL/PThreadRunnableThread.h:177]
0x00007f76e0cc54cf libpthread.so.0!UnknownFunction(0x94ce)
0x00007f76d7a792d3 libc.so.6!clone(+0x42)
0x00007f76e0197af6 libUE4Editor-Core.so!FGenericPlatformMisc::RaiseException(unsigned int) [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Core/Private/GenericPlatform/GenericPlatformMisc.cpp:477]
0x00007f76e03abd97 libUE4Editor-Core.so!FOutputDevice::LogfImpl(char16_t const*, ...) [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Core/Private/Misc/OutputDevice.cpp:71]
0x00007f76dd8d5b15 libUE4Editor-Engine.so!FShaderCompileThreadRunnable::WriteNewTasks() [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Engine/Private/ShaderCompiler/ShaderCompiler.cpp:1253]
0x00007f76dd8d93f0 libUE4Editor-Engine.so!FShaderCompileThreadRunnable::CompilingLoop() [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Engine/Private/ShaderCompiler/ShaderCompiler.cpp:1513]
0x00007f76dd8d45a9 libUE4Editor-Engine.so!FShaderCompileThreadRunnableBase::Run() [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Engine/Private/ShaderCompiler/ShaderCompiler.cpp:1078]
0x00007f76e0202167 libUE4Editor-Core.so!FRunnableThreadPThread::Run() [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Core/Private/HAL/PThreadRunnableThread.cpp:25]
0x00007f76e01c9a00 libUE4Editor-Core.so!FRunnableThreadPThread::_ThreadProc(void*) [/home/utkarsha/Applications/unreal-engine/src/unreal-engine/Engine/Source/Runtime/Core/Private/HAL/PThreadRunnableThread.h:177]
0x00007f76e0cc54cf libpthread.so.0!UnknownFunction(0x94ce)
0x00007f76d7a792d3 libc.so.6!clone(+0x42)
[2020.02.05-06.48.37:669][ 0]LogExit: Executing StaticShutdownAfterError
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
fish: “./UE4Editor” terminated by signal SIGSEGV (Address boundary error)
@Utkarsha Khanal: Simply waiting fixed it for me. It took about 10-20 minutes on "Verifying Global Shaders for SF_VULKAN_SM5", and then it booted up fine and compiled the shaders. Give that a try.
Same problem here; I solved it by changing the folder permissions. Since Unreal should not be run with sudo, you have to change the permissions on the folder first.
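To see which folders need their permissions changed, the paths from the log can be checked before launching the editor. This is only a minimal sketch: the install prefix /opt/unreal-engine comes from the question, the three subdirectories are simply the ones that fail in the log, and the helper name `unwritable_dirs` is made up for illustration.

```python
import os

# Directories the log above shows the editor failing to write to.
# The prefix is the install location from the question; adjust as needed.
ENGINE_DIRS = [
    "/opt/unreal-engine/Engine/Saved",
    "/opt/unreal-engine/Engine/DerivedDataCache",
    "/opt/unreal-engine/Engine/Intermediate",
]


def unwritable_dirs(paths):
    """Return the paths the current user cannot create files in.

    Creating a file needs both write and search (execute) permission on
    the directory; a missing directory is reported as unwritable too.
    """
    return [
        p for p in paths
        if not (os.path.isdir(p) and os.access(p, os.W_OK | os.X_OK))
    ]


if __name__ == "__main__":
    for path in unwritable_dirs(ENGINE_DIRS):
        print("not writable by current user:", path)
```

Any path it prints is one whose ownership or mode would have to change before the editor can run as a normal user.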
I am trying to remotely debug a Hello World program that was cross-compiled
for mipsel, but I have been unsuccessful using gdb/gdbserver.
My target architecture is:
Linux debian-mipsel 2.6.32-5-4kc-malta #1 Tue Sep 24 01:20:35 UTC 2013 mips GNU/Linux
This system is running under QEMU, and it can successfully execute the cross-compiled
file (e.g. ./bin/hello).
Currently, the gdb version is 8.1.1 and gdbserver is 7.8, but I have tried many other
versions of both and still get the same result.
Notes
The process is created and gdb starts listening.
I can verify port is open with nc.
gdbserver output
new_argv[0] = "/bin/hello"
Process /bin/hello created; pid = 1385
>>>> entering linux_wait_1
linux_wait_1: [Process 1385]
my_waitpid (-1, 0x40000001)
my_waitpid (-1, 0x1): status(57f), 1385
LWFE: waitpid(-1, ...) returned 1385, ERRNO-OK
LLW: waitpid 1385 received Trace/breakpoint trap (stopped)
linux_low_filter_event: pc is 0x400190
pc is 0x400190
stop pc is 0x400190
my_waitpid (1386, 0x0)
my_waitpid (1386, 0x0): status(177f), 1386
my_waitpid (1386, 0x0)
gdb output
(gdb) file hello
Reading symbols from hello...done.
(gdb) target remote 10.0.0.2:12345
Remote debugging using 10.0.0.2:12345
Ignoring packet error, continuing...
warning: unrecognized item "timeout" in "qSupported" response
Ignoring packet error, continuing...
Remote replied unexpectedly to 'vMustReplyEmpty': timeout
I am trying out DDD for the first time with some C++ code I have already written and compiled on another machine. When I run DDD with the code, I get this error:
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
terminate called after throwing an instance of 'std::runtime_error'
what(): User configuration file not found
Program received signal SIGABRT, Aborted.
0x00007ffff6f84428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
(gdb)
Not sure what to think, as I have the code built and running on an RPi. Any help would be most appreciated!
When I run DDD with the code, I get this error
That is an error from your program (which throws an exception).
You can find out where that error is coming from using the GDB where command.
If your program doesn't throw this exception when you run it outside of DDD, it's likely that your program looks for "configuration file" in its current directory (bad idea (TM)), and that the directory in which you start it is different from the directory in which DDD starts it.
You can use cd command inside DDD to change the current directory, and this will likely "fix it" for you (but really you should fix your program so that it uses $HOME or some other well defined location for its configuration files).
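The lookup order suggested above can be sketched as follows. The original program is C++, so this Python version only illustrates the idea; the directory name `.myapp`, the search order, and the function name `find_config` are all assumptions, not anything taken from the program in question.

```python
import os
from pathlib import Path


def find_config(filename, app_dir=".myapp", env=None):
    """Return the first existing candidate path for the config file, or None.

    Search order (an assumption for illustration):
    $HOME/<app_dir>/<filename> first, then the current directory.
    """
    env = os.environ if env is None else env
    candidates = []
    home = env.get("HOME")
    if home:
        candidates.append(Path(home) / app_dir / filename)
    # Fallback that depends on where the program was started from.
    candidates.append(Path.cwd() / filename)
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Because the first candidate does not depend on the working directory, the file is found whether the program is started from a shell or from inside DDD.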
I use this (http://cs.baylor.edu/~donahoo/tools/gdb/tutorial.html) guide to learn how GDB works.
After compiling the code and uploading it to my embedded Linux ARM platform, I connect remotely to the gdbserver on my target:
Target:
root@zedboard-zynq7:/Software# gdbserver HOST:1234 broken
Process broken created; pid = 1103
Listening on port 1234
Remote debugging from host 192.168.178.32
Host (Ubuntu 14.04 running in a virtual machine):
Remote debugging using 192.168.178.33:1234
warning: A handler for the OS ABI "GNU/Linux" is not built into this
configuration of GDB. Attempting to continue with the default arm settings.
Cannot access memory at address 0x0
0x43330d40 in ?? ()
(gdb)
I set the breakpoint to line 43 and continue the program until it stops at the breakpoint:
(gdb) b 43
Breakpoint 1 at 0x8b68: file broken.cpp, line 43.
(gdb) continue
Continuing.
Breakpoint 1, main () at broken.cpp:43
43 double seriesValue = ComputeSeriesValue(x, n);
(gdb)
But after a step call on my host I got this error:
Host:
warning: Remote failure reply: E01
Ignoring packet error, continuing...
Target:
ptrace: Input/output error.
input_interrupt, count = 1 c = 36 ('$')
What does it mean and how can I fix it?
Thanks for help.
Host (Ubuntu 14.04 running in a virtual machine):
Remote debugging using 192.168.178.33:1234
warning: A handler for the OS ABI "GNU/Linux" is not built into this
configuration of GDB. Attempting to continue with the default arm settings.
This says that your (host) GDB has not been built with support for the target you want to debug.
What does it mean and how can I fix it?
You need to either get a different build of (host) GDB, or build one yourself with the correct --target setting.
Usually a correct host GDB is included with the cross-gcc that you use to build for your target. So a fix may be as simple as running /path/to/cross-gdb instead of gdb.
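To confirm which architecture the binary targets (and therefore which GDB build is needed), the `e_machine` field of its ELF header can be read directly. A minimal sketch: the field offsets and machine codes come from the ELF specification, while the function name `elf_machine` and the small lookup table are illustrative only.

```python
import struct

# e_machine values from the ELF specification (small subset).
ELF_MACHINES = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture named in an ELF header.

    `header` must hold at least the first 20 bytes of the file.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)
```

For example, `elf_machine(open("broken", "rb").read(20))` should report ARM for the binary in the question. That only describes the binary; whether a given GDB build can handle that target still depends on how that GDB was configured.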
I am running an OpenEmbedded based Linux on an ARM board, where my application is running. I used to run kernel 2.6.35, gdb 6.8 and gcc 4.3. Lately I've updated the system to kernel 2.6.37, gdb 7.4 (also tried 7.3) and gcc 4.6.
Now my application cannot be debugged anymore (on the ARM board); every time I try to run it in gdb, I get the error "gdb: Cannot find new threads: generic error". The application makes use of pthreads and does link against pthreads (readelf lists libpthread.so.0 as a dependency). The solutions I have found so far all recommend linking to pthread, which I am already doing. The other recommendation I found was to use LD_PRELOAD=/lib/libpthread.so.0, which does not make any difference for me.
Debugging the x86 builds of the application works without a problem.
EDIT: To answer the questions posed in the first answer, I am using gdb on the target (ARM), i.e. no cross-gdb. I also have not stripped libpthread.so.0 (/lib/libpthread-2.9.so: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.16, not stripped). glibc remained at version 2.9, and the update involved recompiling the whole linux image
EDIT2: Removing /lib/libthread-db* allows debugging (with consequent warnings and obviously some features will not work)
EDIT3: Using set debug libthread-db 1 I get:
Starting program: /home/root/app
Trying host libthread_db library: libthread_db.so.1.
Host libthread_db.so.1 resolved to: /lib/libthread_db.so.1.
td_ta_new failed: application not linked with libthread
thread_db_load_search returning 0
Trying host libthread_db library: libthread_db.so.1.
Host libthread_db.so.1 resolved to: /lib/libthread_db.so.1.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
warning: Unable to set global thread event mask: generic error
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 0.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 1.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 2.
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 3.
thread_db_load_search returning 1
Warning: find_new_threads_once: find_new_threads_callback: cannot get thread info: generic error
Found 0 new threads in iteration 0.
Cannot find new threads: generic error
(gdb) Write failed: Broken pipe
There are two common causes of this error:
You have a mis-match between libpthread.so.0 and libthread_db.so.1
You have stripped libpthread.so.0
Your message isn't entirely clear:
are you using cross GDB to debug the application running on ARM from an x86 host?
have you updated (or rebuilt) glibc in addition to updating the kernel, etc.?
If you stripped libpthread.so.0, then don't do that -- libthread_db needs it to not be stripped.
If you are cross-debugging, make sure to rebuild libthread_db.so.1 on host to match glibc on target.
Update:
not cross-debugging
did not strip libpthread
So, something in your GDB or glibc appears to have been broken. You can try to see what that is by
Putting removed libthread_db back, and
(gdb) set debug libthread-db 1
(gdb) run
Update 2:
warning: Unable to set global thread event mask: generic error
This means that GDB was able to look up the td_ta_set_event function in libthread_db and called it, but the function returned an error. One way this could happen is if the __nptl_threads_events variable could not be found in libpthread.so.0. What does this command produce:
nm /lib/libpthread.so.0 | grep __nptl_threads_events
If that command produces output, e.g.:
000000000021c294 b __nptl_threads_events
then I am not sure what else is failing. You'll likely have to debug GDB itself to figure out what's happening.
If on the other hand the grep above produces no output, then it's a problem with your toolchain: you'll have to figure out why that variable doesn't appear in your rebuilt libpthread.so.0.
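The nm check above can also be scripted, which helps when comparing several rebuilt images. This is a rough sketch, assuming `nm` is on the PATH and prints defined symbols as "address type name"; the helper names `find_symbol` and `check_library` are made up for illustration.

```python
import subprocess


def find_symbol(nm_output: str, name: str):
    """Return (address, type) for `name` in nm output, or None if absent.

    Only lines of the form "<address> <type> <name>" are considered, so
    undefined symbols (which have no address column) are skipped.
    """
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[2] == name:
            return fields[0], fields[1]
    return None


def check_library(path="/lib/libpthread.so.0", name="__nptl_threads_events"):
    """Run nm on the library and look the symbol up (path as in the answer)."""
    result = subprocess.run(
        ["nm", path], capture_output=True, text=True, check=True
    )
    return find_symbol(result.stdout, name)
```

A return value of `None` from `check_library()` corresponds to the "grep produces no output" case. The match is on the exact symbol name, so an nm build that appends version suffixes to names would need a looser comparison.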