Regarding Possible Lost in Valgrind - c++

What is wrong if we push the strings into vector like this:
globalstructures->schema.columnnames.push_back("id");
When I run Valgrind on my code, it reports:
27 bytes in 1 blocks are possibly lost in loss record 7 of 19
It reports "possibly lost" like this in many places, so the allocation and free counts do not match. I also get a strange error like:
malloc.c:No such file or directory
Although I use calloc for every allocation in my code, I get warnings like:
Syscall param write(buf) points to uninitialised byte(s)
The code causing that error is:
datapage *dataPage = (datapage *)calloc(1, PAGE_SIZE);
writePage(dataPage, dataPageNumber);

int writePage(void *buffer, long pagenumber)
{
    int fd;
    fd = open(path, O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    lseek(fd, pagenumber * PAGE_SIZE, SEEK_SET);
    if (write(fd, buffer, PAGE_SIZE) == -1)
        return false;
    close(fd);
    return true;
}
The exact error I get when running under gdb is:
Breakpoint 1, getInfoFromSysColumns (tid=3, numColumns=@0x7fffffffdf24: 1, typesVector=..., constraintsVector=..., lengthsVector=...,
columnNamesVector=..., offsetsVector=...) at dbheader.cpp:1080
Program received signal SIGSEGV, Segmentation fault.
_int_malloc (av=0x7ffff78bd720, bytes=8) at malloc.c:3498
3498 malloc.c: No such file or directory.
When I run the same program under valgrind, it works fine.

Well,
malloc.c:No such file or directory
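can occur while you are debugging with gdb and use the command "s" (step) instead of "n" (next) near malloc, which means you are trying to step into malloc, the source of which may not be available on your Linux machine. gdb prints the same message whenever the program stops inside malloc, as with the SIGSEGV in your trace; it only means that gdb cannot display that source line.
That is perhaps the same reason why it works fine with valgrind.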

Why the error is in malloc:
The problem is that you overwrote some memory buffer and corrupted one
of the structures used by the memory manager. (c)
Try running valgrind with --track-origins=yes to see where the uninitialised access comes from. If you believe the data should be initialised and it is not, it may have come from a bad pointer; valgrind will show you exactly where the values were created. Those uninitialised values probably overwrote your buffer, including the memory manager's bookkeeping bytes.
Also, review all valgrind warnings before the crash.
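Separately from the heap corruption, the writePage function in the question leaks the file descriptor when write fails and ignores short writes. A minimal sketch of a stricter version follows, assuming POSIX pwrite is available. The names write_page and kPageSize and the 4096-byte page size are illustrative and not taken from the original code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

constexpr std::size_t kPageSize = 4096;  // assumed page size (hypothetical)

// Writes exactly kPageSize bytes from buffer at page index page_number.
// Returns true only if the whole page was written and the file closed cleanly.
bool write_page(const char* path, const void* buffer, long page_number) {
    assert(buffer != nullptr && page_number >= 0);
    const int fd = open(path, O_WRONLY);
    if (fd < 0) return false;

    const char* bytes = static_cast<const char*>(buffer);
    const off_t base =
        static_cast<off_t>(page_number) * static_cast<off_t>(kPageSize);
    std::size_t done = 0;
    bool ok = true;
    while (done < kPageSize) {
        const ssize_t n = pwrite(fd, bytes + done, kPageSize - done,
                                 base + static_cast<off_t>(done));
        if (n < 0 && errno == EINTR) continue;  // interrupted: retry
        if (n <= 0) { ok = false; break; }      // real error: give up
        done += static_cast<std::size_t>(n);    // short write: keep going
    }
    if (close(fd) != 0) ok = false;  // closed on every path, so no leak
    return ok;
}
```

Using pwrite avoids the separate lseek call whose result the original code never checked.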

Related

valgrind - different results with helgrind vs leak-check

I have some strange behaviour that I do not understand. The code is a bit complex so I would refrain from posting it here and instead describe the behaviour and hope that somebody, knowing how valgrind works, has an idea that I can pursue despite this little information.
Background:
I am developing some additional functionality for an open-source, C/C++-based agent-based modelling platform (a fork on my GitHub). Compilation is fine. Everything seems to work as it should so far, based on my validation with test programs. Also, valgrind does not report any errors of relevance. But reproducibility (which is crucial) behaves strangely.
Within the framework one defines a model file (initialisation of a simulation run, basically). Based on this file, one should be able to reproduce the exact same output (and platform independent). In a way this works: If I start the simulation environment (GUI version), load the file and run it, it produces the same result each time. Also, using the command-line version, I get the same results each time.
But, if, from a running instance of the simulation environment, I run the same model more than once, then the strange behavior occurs - sometimes...
Compiler options used:
CC=g++
GLOBAL_CC=-march=native -std=gnu++14
SSWITCH_CC=-fnon-call-exceptions -Og -ggdb3 -Wall
The set-up:
I run the compiled file and, internally to the program compiled, a fixed simulation set-up three times. Now, it should produce the exact same results each time, which I check by printing random numbers at different stages.
The strange behaviour:
Option #1:
When I run the program in valgrind using the options:
valgrind --leak-check=full --leak-resolution=high --show-reachable=yes
I do not get the same results internally
Report from Option 1:
Finished processing sim1
==6206==
==6206== HEAP SUMMARY:
==6206== in use at exit: 43 bytes in 1 blocks
==6206== total heap usage: 4,124,309 allocs, 4,124,308 frees, 888,390,511 bytes allocated
==6206==
==6206== 43 bytes in 1 blocks are still reachable in loss record 1 of 1
==6206== at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==6206== by 0x5BE7FB2: getcwd (getcwd.c:84)
==6206== by 0x143391: lsdmain(int, char**) (lsdmain.cpp:203)
==6206== by 0x10C37D: main (main_gnuwin.cpp:29)
==6206==
==6206== LEAK SUMMARY:
==6206== definitely lost: 0 bytes in 0 blocks
==6206== indirectly lost: 0 bytes in 0 blocks
==6206== possibly lost: 0 bytes in 0 blocks
==6206== still reachable: 43 bytes in 1 blocks
==6206== suppressed: 0 bytes in 0 blocks
==6206==
==6206== For counts of detected and suppressed errors, rerun with: -v
==6206== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Option #2
However, when I use the following option instead:
valgrind --tool=helgrind
I do get the same results each time with the command line version. Interestingly, the first results with option #1 are the same as the results with option #2.
I would be happy for any suggestions. And, I am not a trained computer scientist... I am using an mt19937 (reinitialised each time), and the initial random numbers are the same across the simulations, so I do not think the error resides there. Later within the run, however, the random numbers change in Option #1 (this is my test, besides the time the simulation needs to find an equilibrium).
Finally, I found the issue: at two points in the program I sort a temporary vector of pairs, each holding a distance value and a pointer to an object located in a 2D space:
std::sort( vector.begin(), vector.end() ); // vector of std::pair<double, pointer>
The default comparison for std::pair falls back to the second member when the first members are equal, so ties in distance were broken by pointer value. The solution is to sort by the first item of the pair only:
std::sort( vector.begin(), vector.end(), [](auto const &A, auto const &B ){ return A.first < B.first; } );
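To illustrate the fix, here is a small sketch, assuming a hypothetical Agent type and a by_distance helper that are not part of the original code. It uses std::stable_sort so that entries with equal distances keep their insertion order, which makes the result independent of pointer values. Plain std::sort with the same comparator leaves the relative order of ties unspecified.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct Agent { int id; };                 // hypothetical stand-in for the objects
using Entry = std::pair<double, Agent*>;  // (distance, object)

// Orders entries by distance only. Entries with equal distance keep the
// order in which they were inserted; pointer values are never compared.
std::vector<Entry> by_distance(std::vector<Entry> entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const Entry& a, const Entry& b) {
                         return a.first < b.first;
                     });
    return entries;
}
```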
Some remarks on why I did not find this issue directly:
When I implemented this sort, I intended to make it "stable". The object pointers are unique, so I assumed the ordering would be the same in different subsets and independent of how I add the items to the set.
I did not consider that pointer values are (not precisely, but in effect) random numbers outside of my control.
I did not see this because the OS (or whatever) somehow always assigns the same pointer values between different runs of the program (I suppose there is a "virtual" address space that is initialised afresh each time). Because of this, I did not suspect that pointers were the issue.
Curiously, when I ran the program with Valgrind and the --tool=helgrind option, the issue did not appear. One suggestion I got (offline) was that memcheck pre-initialises memory with a given pattern; that would have been an answer if uninitialised variables had been the cause. As it seems, helgrind also manages memory differently, providing each of my subsequent simulations with "fresh" virtual memory, so that my pointer sorting was stable in the repeated loop.
I hope this helps somebody if he or she runs into the same problems. Thanks for all the suggestions!

How to debug segmentation fault?

It works when, in the loop, I set every element to 0 or to entry_count-1.
It works when I set it up so that entry_count is small, and I write it by hand instead of by loop (sorted_order[0] = 0; sorted_order[1] = 1; ... etc).
Please do not tell me what to do to fix my code. I will not be using smart pointers or vectors for very specific reasons. Instead focus on the question:
What sort of conditions can cause this segfault?
Thank you.
---- OLD -----
I am trying to debug code that isn't working on a unix machine. The gist of the code is:
int *sorted_array = (int*)memory;
// I know that this block is large enough
// It is allocated by malloc earlier
for (int i = 0; i < entry_count; ++i){
    sorted_array[i] = i;
}
There appears to be a segfault somewhere in the loop. Switching to debug mode, unfortunately, makes the segfault stop. Using cout debugging I found that it must be in the loop.
Next I wanted to know how far into the loop the segfault happened, so I added:
std::cout << i << '\n';
It printed the entire range it was supposed to loop over, and there was no segfault.
With a little more experimentation I eventually created a string stream before the loop and wrote an empty string into it on each iteration, and again there was no segfault.
I tried some other assorted operations trying to figure out what is going on. I tried setting a variable j = i; and stuff like that, but I haven't found anything that works.
Running valgrind the only information I got on the segfault was that it was a "General Protection Fault" and something about default response to 11. It also mentions that there's a Conditional jump or move depends on uninitialized value(s), but looking at the code I can't figure out how that's possible.
What can this be? I am out of ideas to explore.
This is clearly a symptom of invalid memory use within your program. It is difficult to find from the code snippet alone, as it is most likely a side effect of something bad that happened earlier.
However, as you mention in your question, the problem is reproducible and you can run your program under Valgrind. So you may want to let Valgrind attach a debugger to your program (a.out):
$ valgrind --tool=memcheck --db-attach=yes ./a.out
This way Valgrind attaches a debugger (GDB) to your program when the first memory error is detected, so that you can debug it live. This should be the best way to understand and resolve your problem. (Note that --db-attach was removed in Valgrind 3.11; on newer versions, use the embedded gdbserver with --vgdb-error=1 instead.)
Once you have figured out the first error, fix it, rerun, and see what other errors you get. Repeat these steps until Valgrind reports no errors.
However, you should avoid raw pointers in modern C++ programs and use std::vector and std::unique_ptr instead, as others have suggested.
Valgrind and GDB are very useful.
The one I used most recently was GDB. I like it because it showed me the exact line number where the segmentation fault occurred.
Here are some resources that can guide you on using GDB:
GDB Tutorial 1
GDB Tutorial 2
If you still cannot figure out how to use GDB with these tutorials, there are tons on Google! Just search debugging Segmentation Faults with GDB!
Good luck :)
That is hard. I have used the valgrind tools to debug segfaults, and they usually pointed to the violations.
Most likely you are writing to freed memory, i.e. the block behind sorted_array goes out of scope or gets freed.
Adding more code hides this problem because the data allocations shift around.
After a few days of experimentation, I figured out what was really going on.
For some reason the machine segfaults on unaligned access. That is, the integers I was writing were not being written to memory boundaries that were multiples of four bytes. Before the loop I computed the offset and shifted the array up that much:
int offset = (4 - (uintptr_t)(memory) % 4) % 4;
memory += offset;
After doing this everything behaved as expected again.
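The code above hard-codes an alignment of 4. A slightly more general sketch is shown below; align_up is a hypothetical helper, and it assumes the alignment is a power of two. One plausible explanation for the crash (an assumption, not confirmed in the thread) is that an optimising compiler vectorised the loop with instructions that require the int pointer to be properly aligned, which would also explain why debug builds and added output statements made it disappear.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds p up to the next address that is a multiple of `alignment`.
// `alignment` must be a power of two (for example alignof(int)).
inline unsigned char* align_up(unsigned char* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t aligned = (addr + mask) & ~mask;
    return p + (aligned - addr);  // advance by 0 .. alignment-1 bytes
}
```

With this helper the fix would read something like memory = align_up(memory, alignof(int)), provided the block has enough spare bytes to absorb the shift.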

I am getting the following error during run time. What does it mean and how do i debug it?

*** glibc detected *** ./main: corrupted double-linked list: 0x086c4f30 ***
After this the program does not exit, and I am forced to terminate it using Ctrl+C. I am not using any memory deallocation such as "delete" anywhere in my code either.
On running Valgrind, I get the following message:
Invalid write of size 4
==20358== at 0x8049932: main (main.cpp:123)
==20358== Address 0x432e6f8 is 0 bytes after a block of size 16 alloc'd
==20358== at 0x402C454: operator new[](unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==20358== by 0x8049907: main (main.cpp:120)
And the corresponding piece of code in line 123 is
float **der_global = new float *[NODES];
for (int i = 0; i < no_element; i++)
{
    der_global[i] = new float[no_element];
}
Your original new call gives you space to store NODES pointers; but your for-loop tries to set no_element of them, which doesn't have to be the same number. Your for loop should have i less than NODES, not i less than no_element.
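A minimal sketch of a consistent allocation follows. The helper names and the separate rows and cols parameters are illustrative and not from the original code; which of NODES and no_element is the intended row count is something only the asker can decide. The point is that the same count sizes the pointer array and bounds the loop, so no write lands past the end of the block.

```cpp
#include <cstddef>

// Allocates a rows x cols matrix as an array of row pointers.
float** make_matrix(std::size_t rows, std::size_t cols) {
    float** m = new float*[rows];          // room for exactly `rows` pointers
    for (std::size_t i = 0; i < rows; ++i) // loop bound matches the allocation
        m[i] = new float[cols]();          // each row value-initialised to 0.0f
    return m;
}

// Releases a matrix created by make_matrix with the same row count.
void free_matrix(float** m, std::size_t rows) {
    for (std::size_t i = 0; i < rows; ++i)
        delete[] m[i];
    delete[] m;
}
```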
This error usually shows up when the program frees memory that is no longer valid.
Are you using malloc or any other dynamic allocation?
It would be easier to solve your problem if you could add some of your code.
Try using valgrind:
valgrind --tool=memcheck --leak-check=full --track-origins=yes --show-reachable=yes --log-file=val.log ./<executable> <parameters>
and look at val.log.
You could also use gdb, but for that you will need to compile with the -g flag.

What do the memory operations malloc and free exactly do?

Recently I ran into a memory release problem. First, below is the C code:
#include <stdio.h>
#include <stdlib.h>
int main ()
{
    int *p = (int*) malloc(5 * sizeof(int));
    int i;
    for (i = 0; i < 5; i++)
        p[i] = i;
    p[i] = i;   /* i == 5 here: writes one element past the end of the block */
    for (i = 0; i < 6; i++)
        printf("[%p]:%d\n", p + i, p[i]);
    free(p);
    printf("The memory has been released.\n");
}
Clearly, there is an out-of-range memory access. When I use the VS2008 compiler, it gives the following output and some errors about the memory release:
[00453E80]:0
[00453E84]:1
[00453E88]:2
[00453E8C]:3
[00453E90]:4
[00453E94]:5
However when I use the gcc 4.7.3 compiler of cygwin, I get the following output:
[0x80028258]:0
[0x8002825c]:1
[0x80028260]:2
[0x80028264]:3
[0x80028268]:4
[0x8002826c]:51
The memory has been released.
Apparently the code runs normally, but 5 is not written to the memory.
So there may be some differences between VS2008 and gcc in how they handle this problem.
Could you give me an expert explanation of this? Thanks in advance.
This is normal, as you never allocated the memory at p[5]. The program will just print whatever data happens to be stored in that space.
There's no deterministic "explanation on this". Writing data into the uncharted territory past the allocated memory limit causes undefined behavior. The behavior is unpredictable. That's all there is to it.
It is still strange, though, to see 51 printed there. Typically GCC will also print 5 but fail with a memory corruption message at free. How you managed to make this code print 51 is not exactly clear. I strongly suspect that the code you posted is not the code you ran.
It seems that you have multiple questions, so, let me try to answer them separately:
As pointed out by others above, you write past the end of the array so, once you have done that, you are in "undefined behavior" territory and this means that anything could happen, including printing 5, 6 or 0xdeadbeaf, or blow up your PC.
In the first case (VS2008), free appears to report an error message on standard output. It is not obvious to me what this error message is so it is hard to explain what is going on but you ask later in a comment how VS2008 could know the size of the memory you release. Typically, if you allocate memory and store it in pointer p, a lot of memory allocators (the malloc/free implementation) store at p[-1] the size of the memory allocated. In practice, it is common to also store at address p[p[-1]] a special value (say, 0xdeadbeaf). This "canary" is checked upon free to see if you have written past the end of the array. To summarize, your 5*sizeof(int) array is probably at least 5*sizeof(int) + 2*sizeof(char*) bytes long and the memory allocator used by code compiled with VS2008 has quite a few checks builtin.
In the case of gcc, I find it surprising that you get 51 printed. If you want to investigate exactly why that is, I would recommend getting an asm dump of the generated code and running the program under a debugger to check whether 5 is really written past the end of the array (gcc could well have decided not to generate that code because it is "undefined"). If it is, put a watchpoint on that memory location to see who overwrites it, when, and why.
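The size-header-plus-canary idea described above can be sketched as a thin wrapper around malloc. This is an illustration only: guarded_alloc, guarded_free, the header layout and the canary value are hypothetical, and real allocators (including the Visual Studio debug heap) differ in the details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

constexpr std::uint32_t kCanary = 0xDEADBEEFu;               // arbitrary marker
constexpr std::size_t kHeader = alignof(std::max_align_t);   // keeps user data aligned
static_assert(kHeader >= sizeof(std::size_t), "header must hold the size");

// Layout: [size | padding][n user bytes][canary]
void* guarded_alloc(std::size_t n) {
    assert(n <= SIZE_MAX - kHeader - sizeof kCanary);  // no size overflow
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(kHeader + n + sizeof kCanary));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &n, sizeof n);                            // remember the size
    std::memcpy(raw + kHeader + n, &kCanary, sizeof kCanary);  // guard after the data
    return raw + kHeader;
}

// Frees the block and reports whether the canary was still intact,
// i.e. whether nothing wrote just past the end of the user area.
bool guarded_free(void* p) {
    if (p == nullptr) return true;
    unsigned char* user = static_cast<unsigned char*>(p);
    unsigned char* raw = user - kHeader;
    std::size_t n = 0;
    std::memcpy(&n, raw, sizeof n);
    std::uint32_t tail = 0;
    std::memcpy(&tail, user + n, sizeof tail);
    std::free(raw);
    return tail == kCanary;
}
```

This also shows how free can know the block size without being told: it reads the header stored just before the pointer it is given.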

How to debug further based on Valgrind output

I have C/C++ code which is giving a segfault. It is compiled using gcc/g++ on a RH Linux Enterprise server. I used the Valgrind memory checker on the executable with:
valgrind --tool=memcheck --leak-check=full --show-reachable=yes
I get this as one of Valgrind's output messages:
==7053== Invalid read of size 1
==7053== at 0xDBC96C: func1 (file1:4742)
==7053== by 0xDB8769: func2 (file1.c:3478)
==7053== by 0xDB167E: func3 (file1.c:2032)
==7053== by 0xDB0378: func4 (file1.c:1542)
==7053== by 0xDB97D8: func5 (file1.c:3697)
==7053== by 0xDB17A7: func6 (file1.c:2120)
==7053== by 0xDBD55E: func7 (file2.c:271)
==7053== Address 0x1bcaf2f0 is not stack'd, malloc'd or (recently) free'd
I read that to mean that my code has accessed an invalid memory location it is not allowed to.
My questions:
1. How do I find out which buffer's memory access was invalid, and which of the functions above performed it?
2. How can I use the address 0x1bcaf2f0, which valgrind says is invalid? How can I find the symbol (essentially, the buffer name) at that address? Through a memory map file, or some other way?
3. Are there any other general pointers, valgrind options, or other tools for detecting memory (heap/stack corruption) errors?
Ad 1: In your example, that'd be func1, at file1:4742. The functions listed below it are the stack trace, i.e. the chain of callers. Analyzing that line should lead you to the invalid memory access.
Ad 2: Try splitting it into multiple simpler lines in case it's too complex and not obvious which exact call is causing the warning.
Ad 3: memcheck is the quintessential valgrind tool for detecting errors with heap memory. It won't help for stack corruption though.
If you have Valgrind 3.7.0 or later, you can use the embedded gdbserver to debug your application with gdb while it runs under Valgrind.
See http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver