I have the following call to glXGetFBConfigs:
const GLXFBConfig * fbuf_configs = glXGetFBConfigs (
display, screen_id, &fbuf_config_count
);
Inspecting it with valgrind produces 137 loss records rooted in a function dlopen_doit, which eventually leads to a malloc. Below is one such record; all 137 are essentially the same, so dlopen_doit appears to be the culprit.
==12317== 9 bytes in 1 blocks are indirectly lost in loss record 16 of 137
==12317== at 0x483579F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==12317== by 0x6B1FB5B: ??? (in /usr/lib64/libnvidia-glcore.so.435.21)
==12317== by 0x6B19F54: ??? (in /usr/lib64/libnvidia-glcore.so.435.21)
==12317== by 0x6B11934: ??? (in /usr/lib64/libnvidia-glcore.so.435.21)
==12317== by 0x6B2337F: ??? (in /usr/lib64/libnvidia-glcore.so.435.21)
==12317== by 0x58145B5: ??? (in /usr/lib64/opengl/nvidia/lib/libGLX_nvidia.so.435.21)
==12317== by 0x400F29B: call_init.part.0 (in /lib64/ld-2.29.so)
==12317== by 0x400F3D8: _dl_init (in /lib64/ld-2.29.so)
==12317== by 0x40131E2: dl_open_worker (in /lib64/ld-2.29.so)
==12317== by 0x4DC5F90: _dl_catch_exception (in /lib64/libc-2.29.so)
==12317== by 0x4012AD9: _dl_open (in /lib64/ld-2.29.so)
==12317== by 0x4E5529B: dlopen_doit (in /lib64/libdl-2.29.so)
I am unable to find any information online regarding dlopen_doit, and the manual page for glXGetFBConfigs is hardly extensive and does not mention any need to manually free its return value. I was only able to discover that glXGetFBConfigs was the cause by slowly isolating all other function calls, as it appears nowhere in the valgrind trace.
What is the cause of this behaviour, and is there a solution?
I am trying to detect potential issues in a small SDL2 program written in C++17, built with GCC on Linux.
valgrind reports a lot of noisy memory leaks that mention vg_replace_malloc.c, which the official documentation (link) says to ignore:
(Ignore the "vg_replace_malloc.c", that's an implementation detail.)
But later in the analysis there is this block:
==9891== 256 bytes in 4 blocks are definitely lost in loss record 2,243 of 2,414
==9891== at 0x483980B: malloc (vg_replace_malloc.c:309)
==9891== by 0x40156B3: dl_open_worker (in /usr/lib64/ld-2.30.so)
==9891== by 0x4E60407: _dl_catch_exception (in /usr/lib64/libc-2.30.so)
==9891== by 0x40148FD: _dl_open (in /usr/lib64/ld-2.30.so)
==9891== by 0x4EF139B: dlopen_doit (in /usr/lib64/libdl-2.30.so)
==9891== by 0x4E60407: _dl_catch_exception (in /usr/lib64/libc-2.30.so)
==9891== by 0x4E604D2: _dl_catch_error (in /usr/lib64/libc-2.30.so)
==9891== by 0x4EF1B08: _dlerror_run (in /usr/lib64/libdl-2.30.so)
==9891== by 0x4EF1429: dlopen@@GLIBC_2.2.5 (in /usr/lib64/libdl-2.30.so)
==9891== by 0x493CC37: ??? (in /usr/lib64/libSDL2-2.0.so.0.12.0)
==9891== by 0x4941DC5: ??? (in /usr/lib64/libSDL2-2.0.so.0.12.0)
==9891== by 0x494C3CC: ??? (in /usr/lib64/libSDL2-2.0.so.0.12.0)
I am wondering whether this comes from a library dependency, is a false positive, or obscurely points at something in my own code.
Could anyone give me more insight into how to interpret that "definitely lost" snippet?
The problem with that output was that the SDL2 build from the package repository is compiled without debug info.
Recompiling the SDL2 library from source with debug info made the valgrind reports a lot clearer and led me to understand and solve the issues.
When running my C++ test app against my dynamic library, which links against NVIDIA's libGL.so, I get the following errors (see below) reported by Valgrind. I am tempted to suppress them, but I am not sure whether this is my issue or something in libnvidia-glcore.so. Part of the uncertainty stems from not fully understanding Valgrind's output. I have looked into which variables might be uninitialized in my code in the call to glXCreateContextAttribsARB, but I do not see any there. If the output suggests this is my issue, what kinds of things should I be looking for? The two errors I am getting are:
==10156== Conditional jump or move depends on uninitialised value(s)
==10156== at 0x7E4CAF4: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7DEE0CD: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7DEEADC: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F75DA1: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F775D3: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7E279BE: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7E27D21: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F760F5: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F3E353: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7A8C9C0: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x4E535F2: opengl_core::render_system::init() (x11_render_system.cpp:92)
==10156== by 0x4040D8: test_render_system::run() (test_x11_render_system.cpp:10)
==10156== Uninitialised value was created by a heap allocation
==10156== at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==10156== by 0x5116428: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x7EECF2E: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7E479C1: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7DC8C31: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x50BF331: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x50EB72A: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x50EEA87: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x50E47D2: glXCreateContextAttribsARB (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x4E52EF8: opengl_core::render_context::init(opengl_core::render_window&, opengl_core::fb_config&) (x11_render_context.cpp:120)
==10156== by 0x4E534D0: opengl_core::render_system::init() (x11_render_system.cpp:65)
==10156== by 0x4040D8: test_render_system::run() (test_x11_render_system.cpp:10)
==10156==
==10156== Conditional jump or move depends on uninitialised value(s)
==10156== at 0x7E4CAF4: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7DEE0CD: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7DF085F: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F4B78B: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F4CFBC: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7E279BE: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7E27D21: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F4BFE0: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F38ED5: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7B20F52: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7F3E2CB: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7A8C9C0: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== Uninitialised value was created by a heap allocation
==10156== at 0x4C29BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==10156== by 0x5116428: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x7EECF2E: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7E479C1: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x7DC8C31: ??? (in /usr/lib64/nvidia/libnvidia-glcore.so.346.47)
==10156== by 0x50BF331: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x50EB72A: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x50EEA87: ??? (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x50E47D2: glXCreateContextAttribsARB (in /usr/lib64/nvidia/libGL.so.346.47)
==10156== by 0x4E52EF8: opengl_core::render_context::init(opengl_core::render_window&, opengl_core::fb_config&) (x11_render_context.cpp:120)
==10156== by 0x4E534D0: opengl_core::render_system::init() (x11_render_system.cpp:65)
==10156== by 0x4040D8: test_render_system::run() (test_x11_render_system.cpp:10)
==10156==
As per request:
// src/x11_render_system.cpp
91 m_impl->m_context.make_current(m_impl->m_window);
92 glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
93 glClearColor(1.0, 0.0, 0.0, 1.0);
94 glXSwapBuffers(display, window);
95 m_impl->m_context.make_not_current();
Valgrind is quite prone to false positives with critical hardware drivers (such as GPU drivers) because of the way they work. Basically, these drivers access the GPU's memory (and even registers) through user space (virtual memory), via a mapping that is set up by the BIOS (this is POSIX mmap at work). This way, the driver can access the device's registers through arbitrary addresses, like any other variable.
The point is that some device registers are only meant to be read. For example, they may reflect some status of the device, so only the device has a reason to write them (and if the CPU tried to, it would fail). Most of the time the device writes them internally at power-up and whenever its status changes, and the values become visible in user space once the mapping is set up. In essence, these are pure volatile variables, even more volatile than in the usual thread-to-thread sense, which, by the way, Valgrind handles well since it emulates the CPU.
But Valgrind lives in a deterministic world (CPU and RAM), and these GPU registers are completely outside it. When the driver reads them, Valgrind simply thinks it is accessing RAM (because of mmap), which is not true. Thus, at the point where the driver uses the data it read (some device status) to branch, Valgrind reports an error because nothing in its world ever wrote this data.
Let's be honest: proprietary drivers are not open source, so it's hard to know what is really happening, but it is likely something similar. What I can say for sure is that this has been happening with Valgrind and GPU drivers for ages (even with very small programs), mainly during initialization, and these reports are generally regarded as false positives. Thus, you can safely ignore them, or create a suppression file for Valgrind in your project (let's name it valgrind.supp):
{
NVidia-driver
Memcheck:Cond
obj:/usr/lib64/nvidia/libnvidia-glcore.so.346.47
}
Then call Valgrind with the option --suppressions=valgrind.supp and it will no longer report these false positives.
You may see other driver objects involved; just add entries for them (repeat the whole {...} block and change the obj: line to match what Valgrind reports). You may also have to update the entries every time you update your driver, since the version changes, though I believe basic wildcards can avoid this.
Take a look here for more information on this Valgrind feature.
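As a sketch of the wildcard idea mentioned above: Valgrind suppression files accept the wildcard characters * and ? in obj: lines, so an entry can be written so that it does not depend on the install directory or the driver version. The suppression name and the exact pattern below are illustrative assumptions; adjust them to match the object paths Valgrind actually reports on your system.

```
{
   NVidia-driver-any-version
   Memcheck:Cond
   obj:*/libnvidia-glcore.so.*
}
```

It is used the same way as the versioned entry, through --suppressions=valgrind.supp.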
Take the following code:
bool x_init = false;
int x;
void initX(){
x = 4;
x_init = true;
}
bool X_initialized(){
return x_init;
}
//...
if( X_initialized() && x <3){
doSomething(x);
}
In this case it is evident that x is never used uninitialized, but the compiler or valgrind would have to prove that, and what it sees is that "x < 3" reads x with no guarantee that it was initialized. Proving arbitrary properties of code is generally not possible. So if the driver is obfuscated, or simply written without valgrind in mind (driver vendors tend to have millions of tests, so they likely rely on those tests more than on analysis tools), it is quite possible that valgrind cannot tell the value is safe. That is not a failure of valgrind but a mathematical limit or, if you wish, a consequence of the coding style of the third-party code.
However, you should report this to the maintainers of the code you are using (NVIDIA?); it may be an issue that needs to be fixed.
Another possibility is that at some point their code needs "random behaviour" and uses uninitialized values as a source of non-deterministic data. (There are no silver bullets: with coverage tools you soon learn that 100% coverage is not always possible, and analysis tools will sooner or later fail too.)
Yet another possibility is that those "uninitialized" values are just "volatile" variables that are initialized when the driver is loaded (after system bootstrap), so the application cannot see them as initialized. This is probably the most plausible case.
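For code you do control, the flag-plus-value pattern shown above can be restructured so that the "is it initialized?" state cannot drift apart from the value. The following is only a minimal sketch, assuming C++17 is available; the function name shouldDoSomething and the limit parameter are hypothetical and not part of the original snippet.

```cpp
#include <optional>

// Hypothetical rewrite of the x / x_init pair: the value and its
// "has it been set?" state live in a single object.
std::optional<int> x;  // starts empty

void initX() {
    x = 4;  // setting the value also marks it as initialized
}

// Returns true only when x has been set AND is below the given limit.
// The value is read only after has_value() has been checked.
bool shouldDoSomething(int limit) {
    return x.has_value() && *x < limit;
}
```

A caller would write if (shouldDoSomething(3)) { ... } in place of the original condition. This does nothing for reports that originate inside a closed-source driver; it only makes the intent explicit in your own code.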
Can you show the code around x11_render_system.cpp:92?
In my opinion valgrind can also make mistakes; if you cannot find any problem in your own code at the places valgrind reports, just ignore it.
When I run my code I get
Bus error(core dumped)
When I run it with valgrind I get
==26570== Invalid read of size 8
==26570== at 0x67EDEE6: ??? (in /home/carolinaloureiro/Qt/5.4/gcc_64/lib/libQt5SerialPort.so.5.4.0)
==26570== by 0x67F34CB: ??? (in /home/carolinaloureiro/Qt/5.4/gcc_64/lib/libQt5SerialPort.so.5.4.0)
==26570== by 0x4E3D5F4: classA::function1(bool) (in /home/carolinaloureiro/catkin_ws/src/testpackage/lib/libLIB.so.1)
==26570== by 0x4E3DC75: OptoPorts_private::run() (in /home/carolinaloureiro/catkin_ws/src/testpackage/lib/libLIB.so.1)
==26570== by 0x5875383: ??? (in /home/carolinaloureiro/Qt/5.4/gcc_64/lib/libQt5Core.so.5.4.0)
==26570== by 0x55BDE99: start_thread (pthread_create.c:308)
==26570== by 0x65192EC: clone (clone.S:112)
==26570== Address 0x200000001109da98 is not stack'd, malloc'd or (recently) free'd
Could this be because I didn't include a library properly? I've been trying to solve this problem for a while, but I am not sure how to.
Thanks
I'm working on a rather complex piece of software and once in a while it segfaults on exit. I tried to investigate the problem with valgrind, but the output I get does not tell me which of the numerous usages of QString is the problematic one.
I used valgrind with --track-origins=yes, but this also does not help to see which one it is.
==28264== Invalid read of size 4
==28264== at 0x563B66: QBasicAtomicInt::deref() (qatomic_x86_64.h:133)
==28264== by 0x563DC6: QString::~QString() (in build/output/bin/qgis)
==28264== by 0x36F8A395E9: __cxa_finalize (cxa_finalize.c:55)
==28264== by 0x5B94212: ??? (in build/output/lib/libqgis_core.so.2.1.0)
==28264== by 0x36F860FB69: _dl_fini (dl-fini.c:253)
==28264== by 0x36F8A39278: __run_exit_handlers (exit.c:77)
==28264== by 0x36F8A392C4: exit (exit.c:99)
==28264== by 0x36F8A21B4B: (below main) (libc-start.c:308)
==28264== Address 0x135b30b0 is 0 bytes inside a block of size 40 free'd
==28264== at 0x4A074C4: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28264== by 0x36C48C31F7: QString::free(QString::Data*) (qstring.cpp:1235)
==28264== by 0x563DDC: QString::~QString() (in build/output/bin/qgis)
==28264== by 0x36F8A39278: __run_exit_handlers (exit.c:77)
==28264== by 0x36F8A392C4: exit (exit.c:99)
==28264== by 0x36F8A21B4B: (below main) (libc-start.c:308)
How can I find the problematic instance of QString? Or what else can I do to track down problems where "below main" cleans up?
I recently had a very (very!) similar problem with one of my global QStrings, but it only happened in Qt 5 (5.1.1), not in Qt 4 (4.8.5). The way I solved it in the end was to run the application through gdb / ddd and let it crash to determine the offending symbol's name. After figuring this out, I simply made it a member of one of my QObject-derived classes (there was actually no reason for it to be global), and this fixed it.
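When a global string really is needed, one common alternative to the fix described above (a different technique, not what was done there) is to construct the object on first use and never destroy it, so that no destructor runs for it during exit-time cleanup. The sketch below uses std::string in place of QString to stay self-contained; the accessor name appName and its contents are hypothetical, and it assumes the crash comes from a namespace-scope object being destroyed at exit.

```cpp
#include <string>

// Hypothetical accessor replacing a namespace-scope global string.
// The string is created on the first call and intentionally never
// deleted, so exit-time cleanup never runs its destructor.
const std::string& appName() {
    static const std::string* name = new std::string("example");
    return *name;
}
```

Because the pointer is kept in static storage, Valgrind should list the allocation as still reachable at exit, not as lost.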
I have C++ code that uses OpenMP and is linked to Fortran 90 code. Run with one thread, everything is fine. Run with any other number of threads, I get a segmentation fault when exiting the C++ part. The results are correct, with no errors whatsoever; it runs smoothly until it's time to exit. The part of the code related to OpenMP is:
#pragma omp parallel for shared(even_phi,odd_phi,odd_divisor,odd_start_index,odd_iter_index) private(ii,jj,kk,cc,io,pp,f1,f2,f3,f4,f5,f6,ff,tmp_phi) schedule(static)
for (kk=1; kk<nz-1; kk++)
{
cc = (kk-1)*(ny-2);
for (jj=1; jj<ny-1; jj++)
{
io = odd_start_index[cc];
pp = odd_iter_index[cc++];
for (ii=io; ii<maxElem; ii++)
{
f1 = even_phi[pp-odown];
f2 = even_phi[pp-oright];
f3 = even_phi[pp];
tmp_phi = odd_phi[pp];
f4 = even_phi[pp+1];
f5 = even_phi[pp+oleft];
f6 = even_phi[pp+oup];
ff = f1+f2+f3+f4+f5+f6;
odd_phi[pp] = odd_divisor[pp]*ff + c2*tmp_phi;
pp++;
}
}
}
It's a standard numerical solver, which also works perfectly without OpenMP and with OMP_NUM_THREADS=1. When executed with more threads, after an almost completely normal execution, Valgrind says:
==23723== Thread 20:
==23723== Jump to the invalid address stated on the next line
==23723== at 0x2A6EBBB8: ???
==23723== by 0x2A6EA515: ???
==23723== Address 0x2a6ebbb8 is not stack'd, malloc'd or (recently) free'd
==23723==
==23723==
==23723== Process terminating with default action of signal 11 (SIGSEGV)
==23723== Access not within mapped region at address 0x2A6EBBB8
==23723== at 0x2A6EBBB8: ???
==23723== by 0x2A6EA515: ???
==23723== If you believe this happened as a result of a stack
==23723== overflow in your program's main thread (unlikely but
==23723== possible), you can try to increase the size of the
==23723== main thread stack using the --main-stacksize= flag.
==23723== The main thread stack size used in this run was 1048576.
==23723==
==23723== HEAP SUMMARY:
==23723== in use at exit: 632,995,339 bytes in 101 blocks
==23723== total heap usage: 10,071 allocs, 9,970 frees, 1,257,933,743 bytes allocated
==23723==
==23723== Thread 1:
==23723== 6,992 bytes in 23 blocks are possibly lost in loss record 47 of 74
==23723== at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==23723== by 0x35A0E11812: _dl_allocate_tls (dl-tls.c:300)
==23723== by 0x35A1E07068: pthread_create@@GLIBC_2.2.5 (allocatestack.c:571)
==23723== by 0x2A6EA981: ???
==23723== by 0x2A4C666E: ???
==23723== by 0x4C8DB7: solvermodule (in /home/tom/bin/solver)
==23723== by 0x4C6794: MAIN__ (qdiff4v.f90:749)
==23723== by 0x4C8DF9: main (in /home/tom/bin/solver)
==23723==
==23723== 30,276 bytes in 1 blocks are definitely lost in loss record 50 of 74
==23723== at 0x4A0674C: operator new[](unsigned long) (vg_replace_malloc.c:305)
==23723== by 0x2A4C6394: ???
==23723== by 0x4C8DB7: solvermodule (in /home/tom/bin/solver)
==23723== by 0x4C6794: MAIN__ (qdiff4v.f90:749)
==23723== by 0x4C8DF9: main (in /home/tom/bin/solver)
==23723==
==23723== 30,276 bytes in 1 blocks are definitely lost in loss record 51 of 74
==23723== at 0x4A0674C: operator new[](unsigned long) (vg_replace_malloc.c:305)
==23723== by 0x2A4C63BF: ???
==23723== by 0x4C8DB7: solvermodule (in /home/tom/bin/solver)
==23723== by 0x4C6794: MAIN__ (qdiff4v.f90:749)
==23723== by 0x4C8DF9: main (in /home/tom/bin/solver)
==23723==
==23723== 30,276 bytes in 1 blocks are definitely lost in loss record 52 of 74
==23723== at 0x4A0674C: operator new[](unsigned long) (vg_replace_malloc.c:305)
==23723== by 0x2A4C63EA: ???
==23723== by 0x4C8DB7: solvermodule (in /home/tom/bin/solver)
==23723== by 0x4C6794: MAIN__ (qdiff4v.f90:749)
==23723== by 0x4C8DF9: main (in /home/tom/bin/solver)
==23723==
==23723== 30,276 bytes in 1 blocks are definitely lost in loss record 53 of 74
==23723== at 0x4A0674C: operator new[](unsigned long) (vg_replace_malloc.c:305)
==23723== by 0x2A4C6415: ???
==23723== by 0x4C8DB7: solvermodule (in /home/tom/bin/solver)
==23723== by 0x4C6794: MAIN__ (qdiff4v.f90:749)
==23723== by 0x4C8DF9: main (in /home/tom/bin/solver)
==23723==
==23723== 39,232 bytes in 1 blocks are definitely lost in loss record 57 of 74
==23723== at 0x4A0674C: operator new[](unsigned long) (vg_replace_malloc.c:305)
==23723== by 0x2A4C6369: ???
==23723== by 0x4C8DB7: solvermodule (in /home/tom/bin/solver)
==23723== by 0x4C6794: MAIN__ (qdiff4v.f90:749)
==23723== by 0x4C8DF9: main (in /home/tom/bin/solver)
==23723==
==23723== LEAK SUMMARY:
==23723== definitely lost: 160,336 bytes in 5 blocks
==23723== indirectly lost: 0 bytes in 0 blocks
==23723== possibly lost: 6,992 bytes in 23 blocks
==23723== still reachable: 632,828,011 bytes in 73 blocks
==23723== suppressed: 0 bytes in 0 blocks
==23723== Reachable blocks (those to which a pointer was found) are not shown.
==23723== To see them, rerun with: --leak-check=full --show-reachable=yes
==23723==
==23723== For counts of detected and suppressed errors, rerun with: -v
==23723== ERROR SUMMARY: 7 errors from 7 contexts (suppressed: 6 from 6)
gdb says:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff5a04700 (LWP 23837)]
0x00007ffff7024bc2 in ?? ()
Missing separate debuginfos, use: debuginfo-install libgcc-4.4.6-4.el6.x86_64 libgfortran-4.4.6-4.el6.x86_64 libgomp-4.4.6-4.el6.x86_64 libstdc++-4.4.6-4.el6.x86_64
which clearly doesn't help. I've been playing with GOMP_STACKSIZE and the number of threads,
thinking that I may have a stack size problem, but to no avail.
I'm missing something. Maybe something stupid. And cannot find it.
This is a bug in GCC. I found a GCC bug report about problems related to the use of OpenMP and the iso_c_binding module. After that I compiled and ran the code with the Intel compiler with no problem whatsoever.
My code is quite long, and I have no idea how to isolate the problematic part to reproduce the bug and file a report. I will do my best to do that.
I'm using gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), CentOS release 6.3 (Final).
I'll mark this as the answer, and if I find anything more useful later I'll post it here because it may be useful to others.