Running valgrind on NVIDIA Jetson gives no leak source information - c++

tl;dr
valgrind not showing reachable memory leak source
details
The C++ application was built with CMake using the following extra options:
set(CMAKE_CXX_FLAGS_DEBUG "-ggdb3 -O0")
set(CMAKE_C_FLAGS_DEBUG "-ggdb3 -O0")
These flags were passed to the compiler, as confirmed by the output of make VERBOSE=1.
Output from running /usr/bin/valgrind --num-callers=500 --trace-children=yes --leak-check=full --show-reachable=yes -v --track-origins=yes --show-leak-kinds=all ./aplication --application params indicated that relevant symbols were loaded.
To check for memory leaks, valgrind was used in combination with gdb, which allowed checking valgrind reports after arbitrary time intervals. These reports showed a gradual increase in reachable memory, which suggests a leak.
The problem is that valgrind didn't provide any usable insight into what might be causing the leaks. The output is as follows:
==21466== Searching for pointers to 10,678 not-freed blocks
==21466== Checked 71,211,640 bytes
==21466==
==21466== 984 bytes in 42 blocks are still reachable in loss record 1 of 6
==21466== at 0x4845494: operator new(unsigned long, std::nothrow_t const&) (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==21466==
==21466== 2,560 bytes in 8 blocks are possibly lost in loss record 2 of 6
==21466== at 0x4846B0C: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==21466==
==21466== 17,512 bytes in 3 blocks are still reachable in loss record 3 of 6
==21466== at 0x4846D10: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==21466==
==21466== 405,564 bytes in 977 blocks are still reachable in loss record 4 of 6
==21466== at 0x4846B0C: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==21466==
==21466== 468,429 bytes in 3,965 blocks are still reachable in loss record 5 of 6
==21466== at 0x4844BFC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==21466==
==21466== 2,166,764 bytes in 5,683 blocks are still reachable in loss record 6 of 6
==21466== at 0x484522C: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==21466==
==21466== LEAK SUMMARY:
==21466== definitely lost: 0 bytes in 0 blocks
==21466== indirectly lost: 0 bytes in 0 blocks
==21466== possibly lost: 2,560 bytes in 8 blocks
==21466== still reachable: 3,059,253 bytes in 10,670 blocks
==21466== suppressed: 0 bytes in 0 blocks
Is there anything else to try to make valgrind provide more info on leak source?
ENV
hw pack: NVIDIA Jetson
nv_tegra_release: R32 (release), REVISION: 5.1, GCID: 27362550, BOARD: t186ref, EABI: aarch64, DATE: Wed May 19 18:16:00 UTC 2021
linux distro: Ubuntu 18.04.5 LTS
cpu model: AMD Ryzen 5 3600X 6-Core Processor
cpu arch: x86_64
valgrind: 3.13.0
UPDATE
Uninstalling valgrind (v3.13) previously installed via apt and installing valgrind (v3.17) via snap solved the issue.

In case of problems with valgrind, it is always recommended to try with a recent version, either the last release or the git version.
Note that it is quite easy to recompile valgrind from sources, as it has very few dependencies.
In case of specific problems with stack traces, it is always useful to compare the stack traces produced by valgrind and gdb by using valgrind + gdb + vgdb.
Put a breakpoint in gdb at relevant places, and you can then compare the gdb stacktrace produced by the gdb backtrace command and the valgrind stacktrace produced by the monitoring command:
(gdb) monitor v.info scheduler
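For the periodic leak reports described in the question, Memcheck's client requests offer a way to trigger a leak check from inside the program instead of from gdb. The following is a minimal sketch, not the asker's setup: `LeakCounts` and `report_leaks_so_far` are hypothetical names, the two macros come from `valgrind/memcheck.h`, and the stub definitions are an assumption added only so the file still builds where the Valgrind headers are not installed. Outside Valgrind the requests do nothing and all counts stay at zero.

```cpp
#include <cassert>
#include <cstdio>

// Use the real client-request macros when the Valgrind headers are installed.
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif

#ifndef HAVE_MEMCHECK_H
// Assumption: headers not available, so stub the two macros used below.
#  define VALGRIND_DO_ADDED_LEAK_CHECK ((void)0)
#  define VALGRIND_COUNT_LEAKS(leaked, dubious, reachable, suppressed) \
      do { (leaked) = (dubious) = (reachable) = (suppressed) = 0; } while (0)
#endif

// Hypothetical container for the byte counts Memcheck reports.
struct LeakCounts {
    unsigned long leaked;      // definitely lost
    unsigned long dubious;     // possibly lost
    unsigned long reachable;   // still reachable
    unsigned long suppressed;  // matched by a suppression
};

// Hypothetical helper: run an incremental leak check and return the totals.
inline LeakCounts report_leaks_so_far(const char* label) {
    assert(label != nullptr);
    // Reports only loss records that grew since the previous leak check.
    VALGRIND_DO_ADDED_LEAK_CHECK;
    LeakCounts c = {0, 0, 0, 0};
    // Fetch the byte totals found by that leak check.
    VALGRIND_COUNT_LEAKS(c.leaked, c.dubious, c.reachable, c.suppressed);
    std::printf("[%s] definitely lost: %lu, possibly lost: %lu, still reachable: %lu bytes\n",
                label, c.leaked, c.dubious, c.reachable);
    return c;
}
```

Calling `report_leaks_so_far("some label")` at regular points in the main loop and comparing the `reachable` totals between calls is one way to see which phase of the program makes the number grow. Whether the accompanying loss records carry useful stack traces still depends on the Valgrind version, as noted above.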

Related

Valgrind and QEMU - Unable to detect memory leak

I want to test my C++ code for memory leaks with Valgrind (memcheck) x86.
But the software gets cross-compiled and is running on ARM.
In order to do some automated testing I decided to emulate my ARM hardware via QEMU.
And I also decided to use the cpputest unit test ARM binaries to ensure a deterministic behaviour and search for memory leaks within the scope the unit test covers.
All in all, I have an ARM binary which should be emulated via QEMU user mode.
My call looks like this:
./valgrind --smc-check=all qemu-arm-static -L ... arm-ptest-binary
My C++ code looks like this. It leaks 20 bytes, but Valgrind does not find the leak when used with QEMU.
Having added an allocation with no matching deallocation, I expected a memory leak to be reported:
#include <cstdio>

int test_func ()
{
    int *foo;
    foo = new int [5];  // never deleted: this is the intentional 20-byte leak
    printf("test_func called!\n");
    return 1;
}
Valgrind output:
==19300== HEAP SUMMARY:
==19300== in use at exit: 1,103,129 bytes in 2,316 blocks
==19300== total heap usage: 4,259 allocs, 1,943 frees, 1,866,916 bytes allocated
==19300==
==19300== LEAK SUMMARY:
==19300== definitely lost: 0 bytes in 0 blocks
==19300== indirectly lost: 0 bytes in 0 blocks
==19300== possibly lost: 304 bytes in 1 blocks
==19300== still reachable: 1,102,825 bytes in 2,315 blocks
==19300== suppressed: 0 bytes in 0 blocks
[...]
When I run the exact same binary on ARM hardware, valgrind-arm finds the leak.
Does anyone have an idea why Valgrind does not find the memory leak in combination with QEMU user mode?
Thanks in advance
You are running Valgrind on QEMU itself, which will cause valgrind to report memory leaks in QEMU's own code, but valgrind does not have sufficient visibility into what the guest program running under QEMU is doing to be able to report leaks in the guest. In particular, Valgrind works by intercepting calls to malloc, free, operator new, etc -- it will be doing this for the host QEMU process's (x86) allocation and free calls, but has no way to intercept the (arm) calls your guest process makes.
You might look at running an entire guest OS under QEMU's system emulation mode, and then running the Arm Valgrind inside that on your guest program.
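If running a full guest OS is impractical, one rough alternative is to count allocations inside the guest binary itself, which works under QEMU user mode because it does not depend on Valgrind's interception. The following is a minimal sketch under that assumption, not a replacement for Memcheck: `g_live_blocks` and `blocks_leaked_by` are hypothetical names, and only the plain `operator new` and `operator delete` are replaced, relying on the default array forms forwarding to them. `test_func` mirrors the function from the question, except that the pointer is printed so the compiler cannot optimize the allocation away.

```cpp
#include <atomic>
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <new>

// Blocks allocated with new and not yet deleted.
static std::atomic<long> g_live_blocks{0};

// Replacement global allocation functions that keep a running count.
void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    g_live_blocks.fetch_add(1);
    return p;
}

void operator delete(void* p) noexcept {
    if (p) {
        g_live_blocks.fetch_sub(1);
        std::free(p);
    }
}

// Sized form forwards to the unsized one so both paths are counted.
void operator delete(void* p, std::size_t) noexcept {
    operator delete(p);
}

// Same leak as in the question: the array is never deleted.
int test_func() {
    int* foo = new int[5];
    std::printf("test_func called! block at %p\n", static_cast<void*>(foo));
    return 1;
}

// Hypothetical helper: number of blocks still allocated after f() returns.
template <typename F>
long blocks_leaked_by(F f) {
    const long before = g_live_blocks.load();
    f();
    return g_live_blocks.load() - before;
}
```

With this in place, `blocks_leaked_by(test_func)` reports one outstanding block. Unlike Memcheck, the counter gives no stack trace and does not see `malloc` calls made directly, so it only narrows down which test leaks.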

Fixing A bug for Valgrind for false memory leaks

I installed Ubuntu (terminal only) on my Windows computer yesterday, and then installed Valgrind with "sudo apt install valgrind". To test everything, I wrote a C++ hello world program and ran it under Valgrind, which reported "72,704 bytes in 1 blocks". Further research suggested this was a bug in how Valgrind handles the C++ standard library functions, possibly iostream. My question is how to go about fixing this bug. I can't ignore it, because when I work on real programs I need to be able to tell accurately where a leak is coming from. A step-by-step solution in layman's terms would be invaluable.
Here is my code and the output I get:
#include <iostream>
using std::cout; using std::endl;
int main() {
    cout << "Hello World" << endl;
}
==195== Memcheck, a memory error detector
==195== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==195== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==195== Command: ./helloworld
==195==
Hello World
==195==
==195== HEAP SUMMARY:
==195== in use at exit: 72,704 bytes in 1 blocks
==195== total heap usage: 2 allocs, 1 frees, 76,800 bytes allocated
==195==
==195== LEAK SUMMARY:
==195== definitely lost: 0 bytes in 0 blocks
==195== indirectly lost: 0 bytes in 0 blocks
==195== possibly lost: 0 bytes in 0 blocks
==195== still reachable: 72,704 bytes in 1 blocks
==195== suppressed: 0 bytes in 0 blocks
==195== Rerun with --leak-check=full to see details of leaked memory
==195==
==195== For counts of detected and suppressed errors, rerun with: -v
==195== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Update:
It seems I can deal with the false-positive leaks by creating suppression files. However, I am not tech-savvy enough to know how to do that. How would I create suppression files for Valgrind (specifically for my situation)? Please explain it in layman's terms and in as much detail as possible.
Second update:
I have been semi-successful in suppressing the memory leaks, but I would like to know if there is a more viable long-term solution.
I would say that these are most likely genuine issues, but probably too minor for the libc/libstdc++ developers to fix.
You can generate suppressions in the Valgrind output by specifying --gen-suppressions=yes. This will generate output like this:
==28328== 56 bytes in 1 blocks are still reachable in loss record 1 of 7
==28328== at 0x4C290F1: malloc (vg_replace_malloc.c:298)
==28328== by 0x4111D8: xmalloc (xmalloc.c:43)
==28328== by 0x41120B: xmemdup (xmalloc.c:115)
==28328== by 0x40F899: clone_quoting_options (quotearg.c:102)
==28328== by 0x40742A: decode_switches (ls.c:1957)
==28328== by 0x40742A: main (ls.c:1280)
==28328==
{
<insert_a_suppression_name_here>
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
fun:xmalloc
fun:xmemdup
fun:clone_quoting_options
fun:decode_switches
fun:main
}
In the above, the section with the pid (==28328==) is the usual output. The section after it (delimited by braces) is the generated suppression. You can copy this block into a file, for instance my_suppressions, and then tell valgrind to read that file with --suppressions=my_suppressions.
If you plan to maintain your suppressions file for a long time, it is best to put some meaningful (and unique) text in place of 'insert_a_suppression_name_here'. This helps you monitor which suppressions are being used, since running valgrind in verbose mode (-v or --verbose) lists all of the used suppressions. For instance:
--30822-- used_suppression: 2 Example for SO my_suppressions:2 suppressed: 112 bytes in 2 blocks
--30822-- used_suppression: 4 U1004-ARM-_dl_relocate_object /remote/us01home48/pfloyd/tools/vg313/lib/valgrind/default.supp:1431

Confusing output from Valgrind shows indirectly lost memory leaks but no definitely lost or possibly lost

I am running Valgrind on Mac OS X 10.8. Valgrind says on startup:
"==11312== WARNING: Support on MacOS 10.8 is experimental and mostly broken.
==11312== WARNING: Expect incorrect results, assertions and crashes.
==11312== WARNING: In particular, Memcheck on 32-bit programs will fail to
==11312== WARNING: detect any errors associated with heap-allocated data."
Valgrind is giving this leak summary:
"LEAK SUMMARY:
==11312== definitely lost: 0 bytes in 0 blocks
==11312== indirectly lost: 48 bytes in 2 blocks
==11312== possibly lost: 0 bytes in 0 blocks
==11312== still reachable: 45,857 bytes in 270 blocks
==11312== suppressed: 16,805 bytes in 87 blocks"
According to Valgrind's FAQ, http://valgrind.org/docs/manual/faq.html#faq.deflost, "indirectly lost" means your program is leaking memory in a pointer-based structure. (E.g. if the root node of a binary tree is "definitely lost", all the children will be "indirectly lost".) If you fix the "definitely lost" leaks, the "indirectly lost" leaks should go away.
I don't have any definitely lost or even possibly lost leaks to fix. What am I supposed to fix? Could this report be a bug due to the experimental nature of Valgrind on 10.8?
I believe I am compiling this as a 64-bit program, since the compiler is a 64-bit compiler.
I feel weird answering my own question.
Yes, the report by Valgrind on Mac is incorrect. According to Valgrind on Linux, all heap blocks were freed, so no leaks are possible.
I really hope Valgrind fixes the issues on Mac, since I am mainly developing on Mac now.
Valgrind has been updated. If you use Homebrew, run:
brew unlink valgrind
brew install valgrind
And, lo and behold:
==23998== LEAK SUMMARY:
==23998== definitely lost: 0 bytes in 0 blocks
==23998== indirectly lost: 0 bytes in 0 blocks
==23998== possibly lost: 0 bytes in 0 blocks
==23998== still reachable: 76,800 bytes in 2 blocks
==23998== suppressed: 58,420 bytes in 359 blocks

valgrind reports getpwuid() leaks in c++ with Ubuntu

I have the following C++ file, pwd01.cpp:
#include <pwd.h>
#include <unistd.h>  // declares getuid()
#include <iostream>
int main() {
    passwd* pwd = getpwuid(getuid());
}
I compile this with the following command:
g++ pwd01.cpp -Wall -o pwd01
On Ubuntu 12.04.1 LTS / gcc version 4.6.3, valgrind reports a leak (see below). When I compile the same code with the same command on Mac OS 10.6.8 / gcc version 4.2.1, valgrind reports no leaks.
I am aware that I do not need to free passwd* ( should I free pointer returned by getpwuid() in Linux? ); so what am I missing?
valgrind ./pwd01
==10618== Memcheck, a memory error detector
==10618== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==10618== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==10618== Command: ./pwd01
==10618==
==10618==
==10618== HEAP SUMMARY:
==10618== in use at exit: 300 bytes in 11 blocks
==10618== total heap usage: 68 allocs, 57 frees, 10,130 bytes allocated
==10618==
==10618== LEAK SUMMARY:
==10618== definitely lost: 60 bytes in 1 blocks
==10618== indirectly lost: 240 bytes in 10 blocks
==10618== possibly lost: 0 bytes in 0 blocks
==10618== still reachable: 0 bytes in 0 blocks
==10618== suppressed: 0 bytes in 0 blocks
==10618== Rerun with --leak-check=full to see details of leaked memory
==10618==
==10618== For counts of detected and suppressed errors, rerun with: -v
==10618== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
This does not seem to be a "real" leak, i.e., if getpwuid() is called several times, the leak doesn't compound. It probably holds a static pointer to a memory area: if the pointer is NULL (the first time), it allocates those 60 bytes and then never frees them.
The Mac OS X version either uses a truly static area, or its valgrind has better suppressions.
To be sure, run getpwuid() a couple hundred times in a loop and check that it still leaks only 60 bytes (and not 12,000 for 200 calls).
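The loop test suggested above could look roughly like this. It is a small sketch for a POSIX system: `LookupStats` and `repeat_getpwuid` are hypothetical names introduced for illustration, and the iteration count is whatever the caller chooses.

```cpp
#include <cassert>
#include <cstdio>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

struct LookupStats {
    int calls;  // how many times getpwuid() was invoked
    int hits;   // how many of those returned a non-null entry
};

// Hypothetical helper: look up the current user `iterations` times.
// The returned passwd* points to static storage and must not be freed.
LookupStats repeat_getpwuid(int iterations) {
    assert(iterations >= 0);
    LookupStats stats = {0, 0};
    const uid_t uid = getuid();
    for (int i = 0; i < iterations; ++i) {
        const passwd* entry = getpwuid(uid);
        ++stats.calls;
        if (entry != nullptr) {
            ++stats.hits;
        }
    }
    std::printf("getpwuid() called %d times, %d successful\n",
                stats.calls, stats.hits);
    return stats;
}
```

Calling `repeat_getpwuid(1)` in one build of `main` and `repeat_getpwuid(200)` in another, then comparing the "in use at exit" figures from valgrind for the two runs, shows whether the amount grows with the number of calls.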
UPDATE
I have finally tracked the leak to several structures inside nsswitch.c and getXXent.c, of different sizes and persuasions. While the code seems to make many more allocations than really necessary, needing malloc locks, this shouldn't usually be noticeable performance-wise, and I for one surely do not intend to second-guess the maintainers of glibc!
It may not be getpwuid() itself that is causing that (false) positive. It could be any number of other things that the C library initializes at start up time, but then doesn't tear down at process termination (because the process is going away, along with all the mapped memory that belongs to it, some things don't really need to be destructed/unallocated). As another answer said, run some additional tests, especially as you build more code beyond the simple example you provided, and make sure the numbers are stable, and not directly attributable to your own code. There's not much you can do about the library code directly, except submit a bug report (I'm assuming you're not one of the C library developers, anyway).

Valgrind and Memory Leaks

I'm doing a little bit of memory profiling on my software. After running a standard memory leak check with the following valgrind command
valgrind --tool=memcheck --leak-check=full ./path_to_program
I got the following summary:
==12550== LEAK SUMMARY:
==12550== definitely lost: 597,170 bytes in 7 blocks
==12550== indirectly lost: 120 bytes in 10 blocks
==12550== possibly lost: 770,281 bytes in 1,455 blocks
==12550== still reachable: 181,189 bytes in 2,319 blocks
==12550== suppressed: 0 bytes in 0 blocks
==12550== Reachable blocks (those to which a pointer was found) are not shown.
==12550== To see them, rerun with: --leak-check=full --show-reachable=yes
==12550==
==12550== For counts of detected and suppressed errors, rerun with: -v
==12550== ERROR SUMMARY: 325 errors from 325 contexts (suppressed: 176 from 11)
This doesn't look good to me, so my questions are:
Why isn't my program exploding if it has all these leaks?
And also what is the difference between:
definitely lost
indirectly lost
possibly lost
still reachable
and how can I try to fix them?
I suggest visiting the Valgrind FAQ:
With Memcheck's memory leak detector, what's the difference between "definitely lost", "indirectly lost", "possibly lost", "still reachable", and "suppressed"?
The details are in the Memcheck section of the user manual.
In short:
"definitely lost" means your program is leaking memory -- fix those leaks!
"indirectly lost" means your program is leaking memory in a pointer-based structure. (E.g. if the root node of a binary tree is "definitely lost", all the children will be "indirectly lost".) If you fix the "definitely lost" leaks, the "indirectly lost" leaks should go away.
"possibly lost" means your program is leaking memory, unless you're doing unusual things with pointers that could cause them to point into the middle of an allocated block; see the user manual for some possible causes. Use --show-possibly-lost=no if you don't want to see these reports.
"still reachable" means your program is probably ok -- it didn't free some memory it could have. This is quite common and often reasonable. Don't use --show-reachable=yes if you don't want to see these reports.
"suppressed" means that a leak error has been suppressed. There are some suppressions in the default suppression files. You can ignore suppressed errors.
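To make the categories concrete, here is a small sketch that tries to produce one block of each kind. The names `Node`, `g_reachable`, `g_interior` and `leak_each_kind` are made up for illustration. The comments state the category each allocation is meant to fall into when built with -O0. Memcheck's actual classification can differ, for example when a stale copy of a pointer is still sitting in a register or on the stack.

```cpp
#include <cassert>
#include <cstdlib>

struct Node {
    Node* next;
    int value;
};

static Node* g_reachable = nullptr;  // start-of-block pointer kept until exit
static char* g_interior = nullptr;   // pointer into the middle of a block

void leak_each_kind() {
    // Intended "still reachable": a global still points at the block at exit.
    g_reachable = new Node{nullptr, 1};

    // Intended "definitely lost" (head) and "indirectly lost" (its child):
    // the only pointer to the head is overwritten, and the child is
    // reachable only through the lost head.
    Node* head = new Node{new Node{nullptr, 3}, 2};
    head = nullptr;
    (void)head;

    // Intended "possibly lost": only an interior pointer to the block remains.
    char* buf = static_cast<char*>(std::malloc(64));
    assert(buf != nullptr);
    g_interior = buf + 16;
}
```

Calling `leak_each_kind()` from `main` and running the program with `valgrind --leak-check=full --show-reachable=yes` should list the allocations under separate headings in the leak summary, which makes it easier to map the FAQ wording onto real reports.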