Valgrind results of a "segmentation fault" program - c++

My program (./a.out) crashes with a segmentation fault, so I ran it under Valgrind to find the line of code where it fails. I got the following output, but I cannot understand it. To me, the most suspicious line of the output is ==17967== Address 0x20687cf80 is 0 bytes inside a block of size 16 alloc'd. Does this line mean that the address 0x20687cf80 was not properly allocated a memory block? What can I do to resolve this problem?
I am using 64-bit Linux with 64 GB of RAM.
[root@gpu BloomFilterAndHashTable]# valgrind --tool=memcheck --leak-check=full ./a.out /mnt/disk2/experiments/two_stage_bloom_filter/test/10_10.txt /mnt/disk2/experiments/10M_worstcase_trace/w_10_10.trace 24
==17967== Memcheck, a memory error detector
==17967== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==17967== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==17967== Command: ./a.out /mnt/disk2/experiments/two_stage_bloom_filter/test/10_10.txt /mnt/disk2/experiments/10M_worstcase_trace/w_10_10.trace 24
==17967==
9998797 Prefixes loaded! //output of my program
==17967== Warning: set address range perms: large range [0x4201a040, 0x6f423220) (defined)
==17967== Warning: set address range perms: large range [0x9c834040, 0x20687cf40) (undefined)
insertion cost time(us): 173168519 9998797 17.318935 0.057740 //output of my program
==17967== Warning: set address range perms: large range [0x23647d040, 0x25647d040) (defined)
Trace loaded! //output of my program
lookup cost time(us): 5728767367 67108864 85.365286 0.011714 //output of my program
==17967== Mismatched free() / delete / delete []
==17967== at 0x4A055FE: free (vg_replace_malloc.c:366)
==17967== by 0x401B13: hash_table_delete(BloomFilter*, char*) (BloomFilterAndHashTable.cpp:503)
==17967== by 0x402212: main (BloomFilterAndHashTable.cpp:687)
==17967== Address 0x20687cf80 is 0 bytes inside a block of size 16 alloc'd
==17967== at 0x4A05F97: operator new(unsigned long) (vg_replace_malloc.c:261)
==17967== by 0x40146D: hash_table_insert(char*, int, BloomFilter*) (BloomFilterAndHashTable.cpp:293)
==17967== by 0x401DD5: main (BloomFilterAndHashTable.cpp:597)
==17967==
Delete succeeded! //output of my program
deletion cost time(us): 178048113 9998797 17.806953 0.056158 //output of my program
==17967== Warning: set address range perms: large range [0x23647d030, 0x25647d050) (noaccess)
--17967:0:aspacem Valgrind: FATAL: VG_N_SEGMENTS is too low.
--17967:0:aspacem Increase it and rebuild. Exiting now.
[root@gpu BloomFilterAndHashTable]#

The "suspicious" output ==17967== Address 0x20687cf80 is 0 bytes inside a block of size 16 alloc'd means that there is an allocated block of memory, 16 bytes in size, and that 0x20687cf80 is the address of its very first byte (i.e. it is the address of the block itself). So the line only gives you details about the memory block that is involved in the warning.
The warning itself is about a "mismatched free()". The following lines show where the free was called:
==17967== at 0x4A055FE: free (vg_replace_malloc.c:366)
==17967== by 0x401B13: hash_table_delete(BloomFilter*, char*) (BloomFilterAndHashTable.cpp:503)
Meaning, hash_table_delete calls free(). Now, why does valgrind think that this is a mismatch? Because the address of the memory block that gets freed (0x20687cf80) was allocated by operator new, which was called by hash_table_insert:
==17967== at 0x4A05F97: operator new(unsigned long) (vg_replace_malloc.c:261)
==17967== by 0x40146D: hash_table_insert(char*, int, BloomFilter*) (BloomFilterAndHashTable.cpp:293)
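This is suspicious: memory obtained from operator new has to be released with delete, not with free(). Whether it is the source of your segmentation fault is another question, but you should fix it anyway.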
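To illustrate the pairing rule, here is a minimal sketch. The Entry struct and both function names are hypothetical stand-ins, not the code from BloomFilterAndHashTable.cpp; the only point is that a block allocated with new is released with delete.

```cpp
#include <cassert>

// Hypothetical node; 16 bytes on a typical 64-bit Linux build
// (8-byte long plus 8-byte pointer), like the block in the report.
struct Entry {
    long key;
    Entry* next;
};

// Allocates with operator new, so the owner must release with delete.
Entry* make_entry(long key) {
    return new Entry{key, nullptr};
}

// Correct pairing: delete matches new. Calling free(e) here instead would be
// the kind of mismatch that Memcheck reports.
long release_entry(Entry* e) {
    assert(e != nullptr);
    long key = e->key;
    delete e;
    return key;
}
```

If the real code must keep calling free() in hash_table_delete, the other consistent option would be to allocate with malloc() in hash_table_insert; mixing the two families is what triggers the warning.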

Related

__static_initialization_and_destruction_0 seg fault

I was developing a C++ program and everything was working fine. Then, while I was programming, I ran make and ran my program like usual. But during the execution of it all, my computer crashed and shut itself off. I reopened my computer and ran make again but this time it gave me a bunch of errors.
Everything seemed off, like my whole computer is corrupted. But everything was working as intended in my operating system except the stuff related to my program. I've tried making a new c++ project, it works just fine. I've tried deleting the project and re-compiling it from github to no avail.
I managed to compile the program in the end but now it gives me a seg fault. The first thing my program does is to print out Starting... to the screen, but this segfault occurs without ever printing that, so it led me to believe that this error is linker related. (Even when the make command was failing, before I fixed it, it told me there was a linker error)
Here is what valgrind says:
turgut@turgut-N56VZ:~/Desktop/CppProjects/videoo-render$ valgrind bin/Renderer
==7521== Memcheck, a memory error detector
==7521== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7521== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7521== Command: bin/Renderer
==7521==
==7521== Invalid read of size 1
==7521== at 0x484FBD4: strcmp (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==7521== by 0x121377: __static_initialization_and_destruction_0 (OpenGLRenderer.cpp:111)
==7521== by 0x121377: _GLOBAL__sub_I__ZN6OpenGL7Texture5max_zE (OpenGLRenderer.cpp:197)
==7521== by 0x659FEBA: call_init (libc-start.c:145)
==7521== by 0x659FEBA: __libc_start_main@@GLIBC_2.34 (libc-start.c:379)
==7521== by 0x1216A4: (below main) (in /home/turgut/Desktop/CppProjects/videoo-render/bin/Renderer)
==7521== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==7521==
==7521==
==7521== Process terminating with default action of signal 11 (SIGSEGV)
==7521== Access not within mapped region at address 0x0
==7521== at 0x484FBD4: strcmp (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==7521== by 0x121377: __static_initialization_and_destruction_0 (OpenGLRenderer.cpp:111)
==7521== by 0x121377: _GLOBAL__sub_I__ZN6OpenGL7Texture5max_zE (OpenGLRenderer.cpp:197)
==7521== by 0x659FEBA: call_init (libc-start.c:145)
==7521== by 0x659FEBA: __libc_start_main@@GLIBC_2.34 (libc-start.c:379)
==7521== by 0x1216A4: (below main) (in /home/turgut/Desktop/CppProjects/videoo-render/bin/Renderer)
==7521== If you believe this happened as a result of a stack
==7521== overflow in your program's main thread (unlikely but
==7521== possible), you can try to increase the size of the
==7521== main thread stack using the --main-stacksize= flag.
==7521== The main thread stack size used in this run was 8388608.
==7521==
==7521== HEAP SUMMARY:
==7521== in use at exit: 72,741 bytes in 3 blocks
==7521== total heap usage: 3 allocs, 0 frees, 72,741 bytes allocated
==7521==
==7521== LEAK SUMMARY:
==7521== definitely lost: 0 bytes in 0 blocks
==7521== indirectly lost: 0 bytes in 0 blocks
==7521== possibly lost: 0 bytes in 0 blocks
==7521== still reachable: 72,741 bytes in 3 blocks
==7521== suppressed: 0 bytes in 0 blocks
==7521== Rerun with --leak-check=full to see details of leaked memory
==7521==
==7521== For lists of detected and suppressed errors, rerun with: -s
==7521== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
turgut@turgut-N56VZ:~/Desktop/Cpp
OpenGLRenderer.cpp:197 is just the end of the file, and here is what's written at OpenGLRenderer.cpp:111:
static bool __debug = strcmp(getenv("DEBUG"), "true") == 0;
It says that there is an error with strcmp but I've tried using that function on a different project and it worked just fine.
What could be the reason for this? I'm on Ubuntu 22.04, gcc version 11.2.0.
this segfault occurs without ever printing that, so it led me to believe that this error is linker related.
The linker is not involved in your program running, so it can't be "linker related".
There is a dynamic loader (if your program uses shared libraries), so perhaps that's what you meant.
In any case, the crash is happening because OpenGLRenderer.cpp:111 (probably in libGL.so) is calling strcmp() with one of the arguments being NULL (which is not a valid thing to do). This does happen before main.
This line:
static bool __debug = strcmp(getenv("DEBUG"), "true") == 0;
is buggy: it will crash when DEBUG is not set in the environment (getenv("DEBUG") will return NULL in that case).
As a workaround, you can run export DEBUG=off, before running your program and the crash will go away.
It's unclear whether you inserted this line into OpenGLRenderer.cpp yourself or whether it was already present, but it's buggy either way.
P.S. A correct way to initialize __debug could be:
static const char *debug_str = getenv("DEBUG");
static const bool debug = strcmp(debug_str == NULL ? "off" : debug_str, "true") == 0;
P.P.S. Avoid using identifiers prefixed with __ (such as __debug) -- they are reserved.
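The same null check can also be wrapped in a small helper, which keeps the initializer on one line. This is only a sketch: the name env_flag_enabled is hypothetical and not part of the original OpenGLRenderer.cpp.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Returns true only when the environment variable exists and equals "true".
// std::getenv returns a null pointer for an unset variable, so the pointer is
// checked before it is handed to strcmp.
bool env_flag_enabled(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value != nullptr && std::strcmp(value, "true") == 0;
}
```

A file-scope flag could then be written as static const bool debug_enabled = env_flag_enabled("DEBUG");, which is safe to run before main even when DEBUG is unset.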

How can I find what is causing my memory allocation error when Valgrind gives me "the impossible happened" [duplicate]

Valgrind yields the following message block:
1,065,024 bytes in 66,564 blocks are definitely lost in loss record 21 of 27
at 0x4C2B800: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x40CA21: compute(Parameters&, Array<double>&) [clone .constprop.71] (array.hpp:135)
by 0x403E70: main (main.cpp:374)
How to read this message?
main.cpp line 374 reads:
results[index] = compute(parameters, weights);
Is memory leaked exactly at line 374 of main.cpp? Is it leaked in compute() or maybe at assignment/indexing into results?
Is memory leaked exactly at line 374 of main.cpp?
No. It just shows the line number in main where the call was made that ultimately leads to the function and line where the memory was allocated.
Is it leaked in compute() or maybe at assignment/indexing into results?
It says that memory was allocated in compute() but was not deallocated in the program before the program exited. That's what constitutes a memory leak.
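As a rough illustration of that pattern, the sketch below contrasts a function that hands back a raw new[] buffer with one that returns an owning container. Both function names and the computed values are hypothetical; the real compute() and array.hpp are not shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Leak-prone shape: the allocation happens here (this is the frame Memcheck
// would list first), but the caller has to remember to call delete[].
double* compute_raw(std::size_t n) {
    assert(n > 0);
    double* out = new double[n];
    for (std::size_t i = 0; i < n; ++i) out[i] = 0.5 * i;
    return out;
}

// Leak-free shape: std::vector releases its buffer in its destructor, so
// there is nothing for the caller to forget.
std::vector<double> compute_owned(std::size_t n) {
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) out[i] = 0.5 * i;
    return out;
}
```

If every call to compute_raw is assigned somewhere and the buffer is never passed to delete[], each call leaves one lost block behind, which would match a report of many blocks lost from a single allocation site.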

How do I understand Invalid read in Valgrind, where address is bigger than the alloc'd block size

I am new to Valgrind. I got these Valgrind messages:
==932767== Invalid read of size 16
==932767== at 0x3D97D2B9AA: __strcasecmp_l_sse42 (in /lib64/libc-2.12.so)
...
==932767== Address 0x8c3e170 is 9 bytes after a block of size 7 alloc'd
==932767== at 0x6A73B4A: malloc (vg_replace_malloc.c:296)
==932767== by 0x34E821195A: ???
Here I have two questions:
1) The allocated block is 7 bytes, so how can the address 0x8c3e170 be 9 bytes after it? Normally the address being accessed lies inside the allocated block. Under what circumstances does this happen?
2) The invalid read size is 16 bytes. Does it include the 2 extra bytes from "Address 0x8c3e170 is 9 bytes after a block of size 7 alloc'd"?
If it weren't for the ellipsis I would say the Address 0x8c3e170... msg is directly related to the Invalid read of size 16 because it's indented further.
It's possible to get false positives, so don't rule that out. For example, it's possible that strcasecmp is reading more than it needs to as an optimization.
I read the 2nd message as the address being read from starts 9 bytes after the end of a block of size 7.
I have two suggestions, either of which will probably help you track this down:
1) Run your application under valgrind such that you can attach in a separate terminal window with gdb:
~ valgrind --vgdb=yes --vgdb-error=0 your_program
in another window:
~ gdb your_program
(gdb) target remote | vgdb
This option makes it halt as though a breakpoint were set on every problem valgrind finds.
2) Compile with the address and/or undefined-behavior sanitizers (-fsanitize=address,undefined), with either clang or gcc (4.9 or higher). They catch the same sorts of issues, but I find the error messages more informative.
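The address arithmetic behind "9 bytes after a block of size 7" can be worked through in a few lines. This sketch assumes Memcheck measures the gap from the first byte past the end of the block; the helper names are made up for illustration.

```cpp
#include <cstdint>

// Start of a block, given an address reported as
// "`gap` bytes after a block of size `size`".
constexpr std::uint64_t block_start(std::uint64_t addr, std::uint64_t gap,
                                    std::uint64_t size) {
    return addr - gap - size;
}

// True when a read of `len` bytes at `addr` touches any byte of the block
// [start, start + size).
constexpr bool read_overlaps_block(std::uint64_t addr, std::uint64_t len,
                                   std::uint64_t start, std::uint64_t size) {
    return addr < start + size && start < addr + len;
}

// Values taken from the report above.
static_assert(block_start(0x8c3e170, 9, 7) == 0x8c3e160,
              "the 7-byte block starts 16 bytes before the faulting address");
static_assert(!read_overlaps_block(0x8c3e170, 16, 0x8c3e160, 7),
              "the 16-byte read lies entirely past the end of the block");
```

Under that assumption the block starts on a 16-byte boundary and the faulting read is the next 16-byte chunk, which is consistent with a vectorized strcasecmp walking past a 7-byte buffer. One possible cause, which the report alone cannot confirm, is a string stored without room for its terminating NUL.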

Using valgrind to find a memory leak in the mysql c++ client

I'm using valgrind to try and track down a memory leak in the mysql c++ client distributed from mysql.
In both the examples (resultset.cpp) and my own program, there is a single 56 byte block that is not freed. In my own program, I've traced the leak to a call to the mysql client.
Here are the results when I run the test:
valgrind --leak-check=full --show-reachable=yes ./my-executable
==29858== Memcheck, a memory error detector
==29858== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==29858== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==29858== Command: ./my-executable
==29858==
==29858==
==29858== HEAP SUMMARY:
==29858== in use at exit: 56 bytes in 1 blocks
==29858== total heap usage: 693 allocs, 692 frees, 308,667 bytes allocated
==29858==
==29858== 56 bytes in 1 blocks are still reachable in loss record 1 of 1
==29858== at 0x4C284A8: malloc (vg_replace_malloc.c:236)
==29858== by 0x400D334: _dl_map_object_deps (dl-deps.c:506)
==29858== by 0x4013652: dl_open_worker (dl-open.c:291)
==29858== by 0x400E9C5: _dl_catch_error (dl-error.c:178)
==29858== by 0x4012FF9: _dl_open (dl-open.c:583)
==29858== by 0x7077BCF: do_dlopen (dl-libc.c:86)
==29858== by 0x400E9C5: _dl_catch_error (dl-error.c:178)
==29858== by 0x7077D26: __libc_dlopen_mode (dl-libc.c:47)
==29858== by 0x72E5FEB: pthread_cancel_init (unwind-forcedunwind.c:53)
==29858== by 0x72E614B: _Unwind_ForcedUnwind (unwind-forcedunwind.c:126)
==29858== by 0x72E408F: __pthread_unwind (unwind.c:130)
==29858== by 0x72DDEB4: pthread_exit (pthreadP.h:265)
==29858==
==29858== LEAK SUMMARY:
==29858== definitely lost: 0 bytes in 0 blocks
==29858== indirectly lost: 0 bytes in 0 blocks
==29858== possibly lost: 0 bytes in 0 blocks
==29858== still reachable: 56 bytes in 1 blocks
==29858== suppressed: 0 bytes in 0 blocks
==29858==
==29858== For counts of detected and suppressed errors, rerun with: -v
==29858== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 6)
I have a few questions regarding this:
How should I interpret the --show-reachable block?
Is that block useful for me to try and zero in on the error?
If the block is not useful, does valgrind have another mechanism that would help me trace the leak?
If not, is there some other tool (hopefully OSS on linux) to help me narrow this down?
Thanks in advance..
UPDATE: Here is the code that I found on my system for the definition of pthread_exit. I'm not certain that this is the actual source that is being invoked. However, if it is, can anyone explain what might be going wrong?
void
pthread_exit (void *retval)
{
/* specific to PTHREAD_TO_WINTHREAD */
ExitThread ((DWORD) ((size_t) retval)); /* thread becomes signalled so its death can be waited upon */
/*NOTREACHED*/
assert (0); return; /* void fnc; can't return an error code */
}
Reachable just means that the blocks had a valid pointer referencing them in scope when the program exited, which indicates that the program does not explicitly free everything on exit because it relies on the underlying OS to do so. What you should be looking for are lost blocks, where blocks of memory lost all references to them and can no longer be freed.
So, the 56 bytes were probably allocated in main, which did not explicitly free them. What you posted does not show a memory leak. It shows main freeing everything but what main allocated because main assumes that when it dies, all memory will be reclaimed by the kernel.
Specifically, it's pthread (in main) making this assumption (which is a valid assumption on darn near everything found in production written in the last 15+ years). The need to free blocks that still have a valid reference on exit is a bit of a contentious point, but for this specific question all that needs to be mentioned is that the assumption was made.
Edit
It's actually pthread_exit() not cleaning something up on exit, but as explained it probably doesn't need to (or quite possibly can't) once it reaches that point.
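The difference between "still reachable" and "definitely lost" can be sketched as follows. The names are hypothetical, and the 56-byte size only mirrors the report on the assumption that int is 4 bytes.

```cpp
#include <cassert>

// A pointer with static storage duration: whatever it points to at exit is
// still referenced, so Memcheck would classify the block as "still reachable".
static int* g_table = nullptr;

// Allocates once and keeps the only reference in g_table; nothing frees it.
int* init_table() {
    if (g_table == nullptr) {
        g_table = new int[14]();  // 14 * sizeof(int), i.e. 56 bytes if int is 4 bytes
    }
    assert(g_table != nullptr);
    return g_table;
}

// By contrast, overwriting the only pointer leaves no way to free the block,
// which Memcheck would classify as "definitely lost".
void lose_block() {
    int* p = new int[14]();
    p = nullptr;  // the last reference is gone
    (void)p;
}
```

Only the second pattern grows without bound if it is repeated; the first leaves a single block that the operating system reclaims when the process exits.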