What do the numbers in the valgrind's outputs mean? - c++

I have this output from valgrind:
==4836== 10,232 bytes in 1 blocks are still reachable in loss record 1 of 1
==4836== at 0x4C2779D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4836== by 0x401865: thrt() (main.cpp:221)
==4836== by 0x4048B1: main (tester.cpp:35)
what does ==4836== mean?
what does 0x4C2779D mean?

The answer to your first question: that number represents the process ID.
Look at the official source.
From this same source, we can see the answer to your second question:
The code addresses (eg. 0x804838F) are usually unimportant, but occasionally crucial for tracking down weirder bugs.

Related

valgrind - different results with hellgrind vs leak-check

I have some strange behaviour that I do not understand. The code is a bit complex so I would refrain from posting it here and instead describe the behaviour and hope that somebody, knowing how valgrind works, has an idea that I can pursue despite this little information.
Background:
I am developing some additional functionality for an open-source, c/c++ based agent-based modelling platform fork # my github. Compilation is fine. Everything seems to work as it should so far based on my validation with test-programs. Also, valgrind does not report any errors of relevance. But, reproducability (which is crucial) is strange.
Within the framework one defines a model file (initialisation of a simulation run, basically). Based on this file, one should be able to reproduce the exact same output (and platform independent). In a way this works: If I start the simulation environment (GUI version), load the file and run it, it produces the same result each time. Also, using the command-line version, I get the same results each time.
But, if, from a running instance of the simulation environment, I run the same model more than once, then the strange behavior occurs - sometimes...
Compiler options used:
CC=g++
GLOBAL_CC=-march=native -std=gnu++14
SSWITCH_CC=-fnon-call-exceptions -Og -ggdb3 -Wall
The set-up:
I run the compiled file and, internally to the program compiled, a fixed simulation set-up three times. Now, it should produce the exact same results each time, which I check by printing random numbers at different stages.
The strange behaviour:
Option #1:
When I run the program in valgrind using the options:
valgrind --leak-check=full --leak-resolution=high --show-reachable=yes
I do not get the same results internally
Report from Option 1:
Finished processing sim1
==6206==
==6206== HEAP SUMMARY:
==6206== in use at exit: 43 bytes in 1 blocks
==6206== total heap usage: 4,124,309 allocs, 4,124,308 frees, 888,390,511 bytes allocated
==6206==
==6206== 43 bytes in 1 blocks are still reachable in loss record 1 of 1
==6206== at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==6206== by 0x5BE7FB2: getcwd (getcwd.c:84)
==6206== by 0x143391: lsdmain(int, char**) (lsdmain.cpp:203)
==6206== by 0x10C37D: main (main_gnuwin.cpp:29)
==6206==
==6206== LEAK SUMMARY:
==6206== definitely lost: 0 bytes in 0 blocks
==6206== indirectly lost: 0 bytes in 0 blocks
==6206== possibly lost: 0 bytes in 0 blocks
==6206== still reachable: 43 bytes in 1 blocks
==6206== suppressed: 0 bytes in 0 blocks
==6206==
==6206== For counts of detected and suppressed errors, rerun with: -v
==6206== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Option #2
However, when I use the following option instead:
valgrind --tool=helgrind
I do get the same results each time with the command line version. Interestingly, the first results with option #1 are the same as the results with option #2.
I would be happy for any suggestions. And, I am not a trained computer scientist... I am using and mt1937 (reinitialised each time) - but the initial random numbers between the simulations are the same, so I do not think the error resides here. Although later within the run the random numbers change in Option #1 (this is my test, besides the time the simulation needs to find an equilibrium).
Finally, I could find the issue: At two points in the program I sort a temporary vector with pairs of distance values and pointers of objects located on a 2d space:
std::sort( vector.begin(),vector.end() ); // vector of std::pairs<double, pointer>
The solution, obviously, is to only sort by the first item of the pair:
std::sort( vector.begin(),vector.end(), [](auto const &A, auto const &B ){return A.first < B.first; } );
Some remarks on why I did not find this issue directly:
When I implemented this sort, I intended to make it "stable". The pointers of the objects are kind of unique, thus in different subsets the ordering would be the same and also independently of how I add the items to the set.
I did not consider that pointer values are (not precisely, but in effect) random numbers outside of my control.
I did not see this, because somehow the OS (or whatever) always assigns the same pointer values between different calls of the program (I suggest there is a "virtual" space that is always initialized again). Because of this, I did not suggest that pointers were the issue.
Curiously, when I ran the program with Valgrind and --tool=helgrind option, the issue did not persist. One suggestion I got (offline) was that memcheck preinitialises the memory with a given pattern, this would have been an answer if uninitialised variables had been the cause. As it seems, helgrind also controls the memory in different scopes, providing for each of my subsequent simulations a "fresh" virtual memory such that my pointer sorting was stable in the repeated loop.
I hope this helps somebody if he or she runs into the same problems. Thanks for all the suggestions!

Valgrind error with new array [duplicate]

I am getting an invalid read error when the src string ends with \n, the error disappear when i remove \n:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (void)
{
char *txt = strdup ("this is a not socket terminated message\n");
printf ("%d: %s\n", strlen (txt), txt);
free (txt);
return 0;
}
valgrind output:
==18929== HEAP SUMMARY:
==18929== in use at exit: 0 bytes in 0 blocks
==18929== total heap usage: 2 allocs, 2 frees, 84 bytes allocated
==18929==
==18929== All heap blocks were freed -- no leaks are possible
==18929==
==18929== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==18929==
==18929== 1 errors in context 1 of 1:
==18929== Invalid read of size 4
==18929== at 0x804847E: main (in /tmp/test)
==18929== Address 0x4204050 is 40 bytes inside a block of size 41 alloc'd
==18929== at 0x402A17C: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==18929== by 0x8048415: main (in /tmp/test)
==18929==
==18929== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
How to fix this without sacrificing the new line character?
It's not about the newline character, nor the printf format specifier. You've found what is arguably a bug in strlen(), and I can tell you must be using gcc.
Your program code is perfectly fine. The printf format specifier could be a little better, but it won't cause the valgrind error you are seeing. Let's look at that valgrind error:
==18929== Invalid read of size 4
==18929== at 0x804847E: main (in /tmp/test)
==18929== Address 0x4204050 is 40 bytes inside a block of size 41 alloc'd
==18929== at 0x402A17C: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==18929== by 0x8048415: main (in /tmp/test)
"Invalid read of size 4" is the first message we must understand. It means that the processor ran an instruction which would load 4 consecutive bytes from memory. The next line indicates that the address attempted to be read was "Address 0x4204050 is 40 bytes inside a block of size 41 alloc'd."
With this information, we can figure it out. First, if you replace that '\n' with a '$', or any other character, the same error will be produced. Try it.
Secondly, we can see that your string has 40 characters in it. Adding the \0 termination character brings the total bytes used to represent the string to 41.
Because we have the message "Address 0x4204050 is 40 bytes inside a block of size 41 alloc'd," we now know everything about what is going wrong.
strdup() allocated the correct amount of memory, 41 bytes.
strlen() attempted to read 4 bytes, starting at the 40th, which would extend to a non-existent 43rd byte.
valgrind caught the problem
This is a glib() bug. Once upon a time, a project called Tiny C Compiler (TCC) was starting to take off. Coincidentally, glib was completely changed so that the normal string functions, such as strlen() no longer existed. They were replaced with optimized versions which read memory using various methods such as reading four bytes at a time. gcc was changed at the same time to generate calls to the appropriate implementations, depending on the alignment of the input pointer, the hardware compiled for, etc. The TCC project was abandoned when this change to the GNU environment made it so difficult to produce a new C compiler, by taking away the ability to use glib for the standard library.
If you report the bug, glib maintainers probably won't fix it. The reason is that under practical use, this will likely never cause an actual crash. The strlen function is reading bytes 4 at a time because it sees that the addresses are 4-byte aligned. It's always possible to read 4 bytes from a 4-byte-aligned address without segfaulting, given that reading 1 byte from that address would succeed. Therefore, the warning from valgrind doesn't reveal a potential crash, just a mismatch in assumptions about how to program. I consider valgrind technically correct, but I think there is zero chance that glib maintainers will do anything to squelch the warning.
The error message seems to indicate that it's strlen that read past the malloced buffer allocated by strdup. On a 32-bit platform, an optimal strlen implementation could read 4 bytes at a time into a 32-bit register and do some bit-twiddling to see if there's a null byte in there. If near the end of the string, there are less than 4 bytes left, but 4 bytes are still read to perform the null byte check, then I could see this error getting printed. In that case, presumably the strlen implementer would know if it's "safe" to do this on the particular platform, in which case the valgrind error is a false positive.

How do I understand Invalid read in Valgrind, where address is bigger than the alloc'd block size

I am new to Valgrind. Got these Valgrind message:
==932767== Invalid read of size 16
==932767== at 0x3D97D2B9AA: __strcasecmp_l_sse42 (in /lib64/libc-2.12.so)
...
==932767== Address 0x8c3e170 is 9 bytes after a block of size 7 alloc'd
==932767== at 0x6A73B4A: malloc (vg_replace_malloc.c:296)
==932767== by 0x34E821195A: ???
Here I have two questions:
the allocated block is 7 bytes, then how come the address 0x8c3e170 is in 9 bytes? Normally the pointed size is smaller than the allocated size. So under what circumstance we will meet the above issue?
the Invalide read size is 16bytes. Does it include the 2 extra bytes from "Address 0x8c3e170 is 9 bytes after a block of size 7 alloc'd"
If it weren't for the ellipsis I would say the Address 0x8c3e170... msg is directly related to the Invalid read of size 16 because it's indented further.
It's possible to get false positives, so don't rule that out. For example, it's possible that strcasecmp is reading more than it needs to as an optimization.
I read the 2nd message as the address being read from starts 9 bytes after the end of a block of size 7.
I have two suggestions, either of which will probably help you track this down:
1) Run your application under valgrind such that you can attach in a separate terminal window with gdb:
~ valgrind --vgdb=yes --vgdb-error=0 your_program
in another window:
~ gdb your_program
(gdb) target remote | vgdb
This option makes it halt as though a breakpoint were set on every problem valgrind finds
2) Compile with the undefined and/or memory sanitizers either with clang or gcc (4.9 or higher). They catch the same sorts of issues, but I find the error messages more informative.

Valgrind Memory Leak Log

In valgrind, we have leak logs like this
==15788== 480 bytes in 20 blocks are definitely lost in loss record 5,016 of 5,501
==20901== 112 (48 direct, 64 indirect) bytes in 2 blocks are definitely lost in loss record 3,501 of 5,122
==20901== 1,375,296 bytes in 78 blocks are possibly lost in loss record 5,109 of 5,122
==20901== Conditional jump or move depends on uninitialised value(s)
==20901== Use of uninitialised value of size 8
In Valgrind's documentation, i couldn't find exact details. Can please somebody explain
I know definitely lost means - the memory allocated is not at all freed. But can what does it mean by "20 blocks" and what it means of "lost in loss record 5,016 of 5,501". And if it says 480 bytes are lost, does it mean for one run in a loop or total ..?
In the Second line ,"112 (48 direct, 64 indirect) bytes in 2 blocks are definitely lost", what does it means "48 direct, 64 indirect".
And i understand the meaning of"possibly lost", but does that mean valgrind is not sure if it's a leak ..?
And regarding the 4th line, i have no idea at all. I checked the call stack provided along with that 4th line. I don't notice any "jump or move".
For the 5th line, it says the uninitialised is in last line of this code snippet. I don't see any uninitialised value here.
char *data = new char[somebigSize];
memset(data, '\0', somebigSize);
int sizeInt = sizeof(int);
int length = 20; //some value obtained
int position = 10;
char *newPtrVar = new char[sizeInt + 1];
memset(newPtrVar, '\0', sizeInt + 1);
memcpy(newPtrVar, &length, sizeInt);
memcpy(&data[position], newPtrVar, sizeInt);
The valgrind manual covers this in detail. It's quite complicated - see the link for full details, but in essence you can have:
"still reachable" (memory that is pointed to by a pointer).
"directly lost" (memory that is not pointed to be any live pointer)
"indirectly lost" (memory that is pointed to by a pointer that is in memory that is "directly lost)
"possibly lost" (memory that is pointed to, but the pointer does not point to the start of the memory).
The last case could be some random pointer, and it could be something like a memory manager that allocates a redzone before the memory returned to the user.

Regarding Possible Lost in Valgrind

What is wrong if we push the strings into vector like this:
globalstructures->schema.columnnames.push_back("id");
When i am applied valgrind on my code it is showing
possibly lost of 27 bytes in 1 blocks are possibly lost in loss record 7 of 19.
like that in so many places it is showing possibly lost.....because of this the allocations and frees are not matching....which is resulting in some strange error like
malloc.c:No such file or directory
Although I am using calloc for allocation of memory everywhere in my code i am getting warnings like
Syscall param write(buf) points to uninitialised byte(s)
The code causing that error is
datapage *dataPage=(datapage *)calloc(1,PAGE_SIZE);
writePage(dataPage,dataPageNumber);
int writePage(void *buffer,long pagenumber)
{
int fd;
fd=open(path,O_WRONLY, 0644);
if (fd < 0)
return -1;
lseek(fd,pagenumber*PAGE_SIZE,SEEK_SET);
if(write(fd,buffer,PAGE_SIZE)==-1)
return false;
close(fd);
return true;
}
Exact error which i am getting when i am running through gdb is ...
Breakpoint 1, getInfoFromSysColumns (tid=3, numColumns=#0x7fffffffdf24: 1, typesVector=..., constraintsVector=..., lengthsVector=...,
columnNamesVector=..., offsetsVector=...) at dbheader.cpp:1080
Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=0x7ffff78bd720, bytes=8) at malloc.c:3498
3498 malloc.c: No such file or directory.
When i run the same through valgrind it's working fine...
Well,
malloc.c:No such file or directory
can occur while you are debugging using gdb and you use command "s" instead of "n" near malloc which essentially means you are trying to step into malloc, the source of which may not be not available on your Linux machine.
That is perhaps the same reason why it is working fine with valgrind.
Why error is in malloc:
The problem is that you overwrote some memory buffer and corrupted one
of the structures used by the memory manager. (c)
Try to run valgrind with --track-origins=yes and see where that uninitialized access comes from. If you believe that it should be initialized and it is not, maybe the data came from a bad pointer, valgrind will show you where exactly the values were created. Probably those uninitialized values overwrote your buffer, including memory manager special bytes.
Also, review all valgrind warnings before the crash.