LLVM memory corruption - c++

I have some C++ memory corruption problems in an LLVM pass and I don't know how to solve them. Here is the relevant code, which runs inside a large loop, once per basic block:
std::vector<Value*> values(cnt);
// Value** values=new Value*[cnt];
// for(gy=0;gy<cnt;gy++){
// values[gy] = new Value;
// }
if(is)
{
LLVMContext& C = is->getContext();
errs()<<"\ni: \n";
for(i=0;i<cnt;i++){
values[i]=ConstantInt::getSigned(Type::getInt64Ty(C),myArray[i]);
errs()<<" "<<myArray[i];
}
//SmallVector<Value*>* bla = new SmallVector<Value*>(200);
std::vector<Value*> bla(200);
//SmallVector<Value*,200> bla;
for(i=0;i<cnt;i++){
bla.push_back(values[i]);
}
is->setMetadata("path",MDNode::get(C,bla));
errs()<<"\nmodified instr "<<*is<<"\n";
if( (is->getMetadata("path")) ){
for(i=0;i<cnt;i++){
if(is->getMetadata("path")->getOperand(i)) {
errs()<<"\nget instr "<<*(is->getMetadata("path")->getOperand(i))<<"\n";
}
}
}
bla.clear();
//for(i=0;i<cnt;i++)
//delete values[i];
}
values.clear();
The error is:
opt: malloc.c:3790: _int_malloc: Assertion `(unsigned long)(size) >= (unsigned long)(nb)' failed.
Nothing from the loop is printed before the failure. It only happens when I reach the 7th basic block, which is similar to the others.
I also tried the alternatives that are commented out with // in the code. Maybe the stack memory somehow becomes corrupted, and I need to allocate the vectors/arrays on the heap (I tried some variants)? I also tried first deleting the elements of the vectors/arrays and then deleting the pointer to the array/vector.
I should mention that when the metadata I had to add was smaller (only 18 operands instead of 72), it worked well.
Do you think it is worth using Valgrind? How can it help me?
Thank you for any suggestions!
...........
As an update, I receive a memory corruption error at some basic blocks. Here are the debug outputs:
opt - opt: malloc.c:3801: _int_malloc: Assertion (unsigned long)(size) >= (unsigned long)(nb) failed.
gdb -
Program received signal SIGABRT, Aborted.
0xb7fdd424 in __kernel_vsyscall ()
valgrind - it executes all the code, and at the problematic loop iteration I get:
==5134== Invalid write of size 4
==5134== at 0x4039280: (anonymous namespace)::Hello::runOnModule(llvm::Module&) (in /home/alex/llvm/Release+Asserts/lib/Hello.so)
==5134== by 0x8E33DE3: llvm::MPPassManager::runOnModule(llvm::Module&) (in /home/alex/llvm/Release+Asserts/bin/opt)
==5134== by 0x8E3726F: llvm::PassManagerImpl::run(llvm::Module&) (in /home/alex/llvm/Release+Asserts/bin/opt)
==5134== by 0x8E37385: llvm::PassManager::run(llvm::Module&) (in /home/alex/llvm/Release+Asserts/bin/opt)
==5134== by 0x41AE4D2: (below main) (libc-start.c:226)
==5134== Address 0x46cfa40 is 0 bytes after a block of size 200 alloc'd
==5134== at 0x402C454: operator new[](unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==5134== by 0x4037AE0: (anonymous namespace)::Hello::runOnModule(llvm::Module&) (in /home/alex/llvm/Release+Asserts/lib/Hello.so)
and repeating.
==5230== HEAP SUMMARY:
==5230== in use at exit: 95,110 bytes in 317 blocks
==5230== total heap usage: 33,395 allocs, 33,078 frees, 4,484,753 bytes allocated
==5230==
==5230== LEAK SUMMARY:
==5230== definitely lost: 7,732 bytes in 31 blocks
==5230== indirectly lost: 85,864 bytes in 275 blocks
==5230== possibly lost: 0 bytes in 0 blocks
==5230== still reachable: 1,514 bytes in 11 blocks
==5230== suppressed: 0 bytes in 0 blocks
==5230== Rerun with --leak-check=full to see details of leaked memory
==5230==
==5230== For counts of detected and suppressed errors, rerun with: -v
==5230== ERROR SUMMARY: 16432 errors from 15 contexts (suppressed: 0 from 0)
I assume the main problems are:
1. I don't allocate memory for values correctly. Maybe I am allocating only the Value* pointers and not the actual values, or maybe I don't free the memory.
2. I cannot use an array instead of a vector, since is->setMetadata("path",MDNode::get(C,values)); won't accept it.
Do you think it is worth using Valgrind? How can it help me? I only want to attach some integers as metadata, one integer per operand.
Thank you for any suggestions!
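One detail in the code above is worth checking regardless of the other findings: std::vector<Value*> bla(200) creates 200 null elements, and each push_back then appends after them, so MDNode::get would receive 200 null pointers followed by the real values. The sketch below reproduces that behaviour with plain int pointers so it runs without LLVM; the helper names are made up for illustration. Separately, the Valgrind line "0 bytes after a block of size 200 alloc'd ... operator new[]" describes a write just past the end of a 200-byte array allocated with new[] inside runOnModule, which may be myArray or a similar buffer that is not shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mirrors the question: the vector is constructed with n null elements,
// then the real values are appended after them.
std::vector<const int*> sized_then_push(const std::vector<int>& src, std::size_t n) {
    std::vector<const int*> out(n);            // n null pointers are already stored
    for (const int& v : src) out.push_back(&v);
    return out;
}

// Alternative: reserve capacity only, so the vector starts out empty.
std::vector<const int*> reserved_then_push(const std::vector<int>& src, std::size_t n) {
    std::vector<const int*> out;
    out.reserve(n);                            // capacity >= n, size stays 0
    for (const int& v : src) out.push_back(&v);
    return out;
}
```

With three source values and n = 200, the first helper returns 203 elements (the first 200 are null) and the second returns exactly 3.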

Related

Setting watchpoints for large data structures in lldb

I am learning lldb and I am curious how you go about setting watchpoints for larger data structures, for example a vector. I know that I can use print, and that works, but when I try to set a watchpoint I get a message saying that watchpoints of size "x" are not supported. Is there a way around this? Thanks for the help!
(lldb) s
Process 36110 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
frame #0: 0x0000000100001600 a.out`main at test.cpp:10
7 vector<int> arr;
8 arr.push_back(1);
9 arr.push_back(2);
-> 10 arr.push_back(3);
11 arr.push_back(4);
12 arr.push_back(5);
13
Target 0: (a.out) stopped.
(lldb) print arr
(std::__1::vector<int, std::__1::allocator<int> >) $2 = size=2 {
[0] = 1
[1] = 2
}
(lldb) w s v arr
error: Watchpoint creation failed (addr=0x7ffeefbff458, size=24, variable expression='arr').
error: watch size of 24 is not supported
If you are on a Mac, the x86_64 architecture allows 4 separate watched regions of at most 8 bytes each. At present, lldb will only use one region per watch request. It could gang multiple watch regions together to handle larger requests which would work for this structure. Feel free to file an enhancement request for this feature with http://bugs.llvm.org. But watchpoints are really limited resources, so you generally have to be very targeted about what you are trying to watch - which is probably why nobody's gotten around to supporting > 8 bytes.
If you want to stop when elements get added to or removed from the vector, it's good enough to watch the end pointer in the vector (i.e. __end_). You can see the actual guts of the vector with the --raw argument to "frame var":
(lldb) fr v --raw arr
(std::__1::vector<int, std::__1::allocator<int> >) arr = {
std::__1::__vector_base<int, std::__1::allocator<int> > = {
__begin_ = 0x0000000100400000
__end_ = 0x000000010040001c
__end_cap_ = {
std::__1::__compressed_pair_elem<int *, 0, false> = {
__value_ = 0x0000000100400038
}
}
}
}
Whenever the vector grows or shrinks, the end marker will get adjusted, so a watchpoint set with:
(lldb) watch set v arr.__end_
Watchpoint created: Watchpoint 1: addr = 0x7ffeefbff1c8 size = 8 state = enabled type = w
declare @ '/tmp/vectors.cpp:6'
watchpoint spec = 'arr.__end_'
new value: 0x000000010030020c
will catch push_back, erase, etc.
If you want to stop when the vector values change, you're going to have to watch individual values; given only 32 bytes to play with you're not going to watch all the data in a vector of meaningful size. And of course when the vector resizes, your watchpoint on the old data will now be pointing to freed memory...
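The last point, that a resize leaves a watchpoint on the old element storage pointing at freed memory, can be seen by counting how often the storage is replaced while elements are appended. This is only a sketch: count_reallocations and its reserve_first switch are hypothetical names, and a change in capacity() is used as the sign that the vector moved to a new buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the vector switches to new storage while n ints
// are appended. A capacity change means the elements were moved to a new
// buffer and the old one was released.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) v.reserve(n);           // one up-front allocation
    std::size_t moves = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {   // storage was replaced
            ++moves;
            last_capacity = v.capacity();
        }
    }
    return moves;
}
```

Without reserve the count is at least one and the exact number depends on the library's growth policy; with reserve(n) up front, no reallocation happens during the n appends, so an address-based watchpoint on an element would stay valid for that long.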

Memory corruption in f90

I have the following code.
PROGRAM CTS
implicit none
!C driver for routine fourn
INTEGER NDAT,NDIM
PARAMETER(NDIM=1,NDAT=1024)
INTEGER i,idum,isign,j,k,l,nn(NDIM)
REAL data1(NDAT),data2(NDAT),ran1 ,x,dx
REAL,DIMENSION(:),ALLOCATABLE::F,F1
allocate(F(NDAT),F1(NDAT))
x=1.
dx = (200.-1.)/real(NDAT)
nn(1)=NDAT
do i=1,NDAT
F1(i) =atan(x-100)
x= x + dx
enddo
x=1.
x=1.
isign=1
call fo(F1,nn,1,isign)
open(1,file="zresult.dat",status="replace")
do i=1,NDAT
write(1,*)x,F1(i)*dx
x= x + dx
enddo
stop
END
!!!!!!!!!!!!!!!!!!!!!!!!!!
SUBROUTINE fo(data,nn,ndim,isign)
INTEGER isign,ndim,nn(ndim)
REAL data(*)
INTEGER i1,i2,i2rev,i3,i3rev,ibit,idim,ifp1,ifp2,ip1,ip2,ip3,k1,&
k2,n,nprev,nrem,ntot
REAL tempi,tempr
DOUBLE PRECISION theta,wi,wpi,wpr,wr,wtemp
ntot=1
do 11 idim=1,ndim
ntot=ntot*nn(idim)
11 continue
nprev=1
do 18 idim=1,ndim
n=nn(idim)
nrem=ntot/(n*nprev)
ip1=2*nprev
ip2=ip1*n
ip3=ip2*nrem
i2rev=1
do 14 i2=1,ip2,ip1
if(i2.lt.i2rev)then
do 13 i1=i2,i2+ip1-2,2
do 12 i3=i1,ip3,ip2
i3rev=i2rev+i3-i2
tempr=data(i3)
tempi=data(i3+1)
data(i3)=data(i3rev)
data(i3+1)=data(i3rev+1)
data(i3rev)=tempr
data(i3rev+1)=tempi
12 continue
13 continue
endif
ibit=ip2/2
1 if ((ibit.ge.ip1).and.(i2rev.gt.ibit)) then
i2rev=i2rev-ibit
ibit=ibit/2
goto 1
endif
i2rev=i2rev+ibit
14 continue
ifp1=ip1
2 if(ifp1.lt.ip2)then
ifp2=2*ifp1
theta=isign*6.28318530717959d0/(ifp2/ip1)
wpr=-2.d0*sin(0.5d0*theta)**2
wpi=sin(theta)
wr=1.d0
wi=0.d0
do 17 i3=1,ifp1,ip1
do 16 i1=i3,i3+ip1-2,2
do 15 i2=i1,ip3,ifp2
k1=i2
k2=k1+ifp1
tempr=sngl(wr)*data(k2)-sngl(wi)*data(k2+1)
tempi=sngl(wr)*data(k2+1)+sngl(wi)*data(k2)
data(k2)=data(k1)-tempr
data(k2+1)=data(k1+1)-tempi
data(k1)=data(k1)+tempr
data(k1+1)=data(k1+1)+tempi
15 continue
16 continue
wtemp=wr
wr=wr*wpr-wi*wpi+wr
wi=wi*wpr+wtemp*wpi+wi
17 continue
ifp1=ifp2
goto 2
endif
nprev=n*nprev
18 continue
return
END
!!!!!!!!!!!
The problem is that if I do not allocate F1 and instead declare it as REAL F1(NDAT), the code runs without any problem, but when I allocate F1 I get the error below.
I have tried everything I could think of to understand what is happening (-fcheck=all, etc.). It looks like memory corruption.
*** Error in `./out': free(): invalid next size (normal): 0x088a7f20 ***
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0xB76BE133
#1 0xB76BE7D0
#2 0xB77C73FF
#3 0xB77C7424
#4 0xB74E4686
#5 0xB74E7AB2
#6 0xB751EFD2
#7 0xB75294C9
#8 0xB752A13C
#9 0xB7777607
#10 0xB776EECF
#11 0xB776EFB9
#12 0xB76BDA93
#13 0xB77D733B
#14 0xB74E9230
#15 0xB74E928C
#16 0xB76C09E7
#17 0x80496D4 in cts at z2.f90:33
Aborted (core dumped)
Could you please help me to find out where the problem is.
Thank you so much
If you move the END of the program to after the subroutine, put CONTAINS before the subroutine to make it internal to the program, and change the assumed-size array
data(*)
to assumed shape array
data(:)
(just using data(NDAT) would also help)
then you can compile your code as
gfortran-7 -Wall -Wno-unused-variable -fcheck=all memcorr.f90
and get a clear message
> ./a.out
At line 63 of file memcorr.f90
Fortran runtime error: Index '1025' of dimension 1 of array 'data' above upper bound of 1024
That means you are accessing your array out of bounds.
Line 63 is:
data(i3)=data(i3rev)
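so i3 or i3rev is too large (larger than NDAT). You must find out why and fix that. A hint from the code itself: with ndim=1 and nn(1)=NDAT the subroutine computes ip1=2 and ip2=ip3=2*NDAT, so its loops index data up to 2*NDAT, while F1 holds only NDAT elements.
The point is: use explicit interfaces, assumed-shape arrays and the other Fortran 90 features that help you find bugs.
The best thing is to put all your subroutines and functions in modules.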

Debugging an assertion with gdb shows weird std::string size

I have a problem with an assertion in a C++ program.
HA_Archive & HA_Archive::operator << (const string & str) {
buffer[wcursor] = HA_TYPE_STRING;
wcursor++;
unsigned size = str.size();
CASSERT((bufferSize > wcursor + size),"buffer exceeds the maximum");
CASSERT is a simple assert, and there is the problem.
The program left a core dump that I have debugged with gdb, and I found something strange.
Program terminated with signal 6, Aborted.
#0 0xb7766424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7766424 in __kernel_vsyscall ()
#1 0xb6cd1cb1 in raise () from /lib/libc.so.6
#2 0xb6cd33e8 in abort () from /lib/libc.so.6
#3 0xb6ccb58c in __assert_fail () from /lib/libc.so.6
#4 0x086c6dbd in HA_Archive::operator<< (this=0xb2610fb8, str=@0xb49e1f08) at HA_Archive.cxx:94
#5 0x0849b4d3 in PortDriver::serialize (this=0xb49e1ed8, ar=@0xb2610fb8) at PortDriver.cxx:624
#6 0x0838ed80 in PortSession::serialize (this=0xb49e1630, ar=@0xb2610fb8) at PortSession/PortSession.h:71
(gdb) frame 4
#4 0x086c6dbd in HA_Archive::operator<< (this=0xb2610fb8, str=@0xb49e1f08) at HA_Archive.cxx:94
94 HA_Archive.cxx: No such file or directory.
in HA_Archive.cxx
(gdb) print str
$1 = (const string &) @0xb49e1f08: {static npos = 4294967295, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0xb322f9b4 "NOT-SET"}}
(gdb) print wcursor
$2 = 180
(gdb) print bufferSize
$3 = 4096
(gdb) print size
$4 = 171791040
Printing str I can see that it contains "NOT-SET", which is OK, but when I print the variable size, which holds str.size(), the value is huge! Obviously this is what makes the assertion fail, because bufferSize is 4096 and wcursor is only 180.
I am very far from being an expert in gdb, so my first question is whether I am doing something wrong with it. Maybe size is not the real value at runtime?
My second question is: if gdb is showing the correct value of size, why do I see the string "NOT-SET" correctly when I print it, while the size is that huge number?
Thanks!
There are a few ways this can happen.
The string could really be that size, but the contents could have a nul character at str[7], which would cause GDB to stop printing it out.
Or maybe something has scribbled on your heap and has overwritten the memory location that stores the string's size, so although the contents are still only 7 bytes long the size member has been overwritten with garbage.
Or str could just be a dangling reference and the memory pointed to by _M_p still contains the string "NOT-SET" but the memory containing the size member has been re-used for something else.
I would try running under valgrind to ensure there are no buffer overruns that might be overwriting the member, or use-after-free errors.
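The first possibility (a NUL character inside the string) is the only one of the three that is well-defined C++, so it can be shown safely. A minimal sketch, with made-up helper names and an arbitrary payload after the NUL:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Builds a std::string whose stored size is larger than what a C-style
// print would show, because a NUL byte sits in the middle of the data.
std::string make_string_with_embedded_nul() {
    const char raw[] = "NOT-SET\0hidden";      // 7 chars, a NUL, then 6 more chars
    return std::string(raw, sizeof(raw) - 1);  // keep all bytes except the final NUL
}

// Length as seen by anything that stops at the first NUL, which is how
// a debugger displays a char* such as _M_p.
std::size_t c_style_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

Here size() reports 14 while the C-style length is 7. A value as large as 171791040 is far more likely to come from one of the other two causes, which is why checking with valgrind is the practical next step.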

Segfault when using time.h

Ok I've been trying just about everything I know to get this program to stop crashing, but I just can't see why. I was able to isolate the problem to code with ctime, and just made a small program to demonstrate what's wrong. This code compiles without a problem.
#include<iostream>
#include<ctime>
int main();
time_t getDay(time_t t);
int diffDay(time_t end,time_t begin);
int main()
{
time_t curTime=time(NULL); //Assign current time
time_t curDay=getDay(curTime); //Assign beginning of day
time_t yesterday=curDay-16*60*60; //Assign a time that's within yesterday
time_t dif=diffDay(curTime,yesterday); //Assign how many days are between yesterday and curTime
std::cout << "Cur Time: " << curTime << '\n'
<< "Cur Day: " << curDay << '\n'
<< "Yes Day: " << dif << '\n' << std::flush;
char a;
std::cin >> a; ///Program crashes after here.
return 0;
}
///Get beginning of day that t is a part of
time_t getDay(time_t t)
{
//Get current time
struct tm* loctim=localtime(&t);
if(loctim==0)
return 0;
//Set loctim to beginning of day
loctim->tm_sec=0;
loctim->tm_min=0;
loctim->tm_hour=0;
//Create a int from the new time
int reval=mktime(loctim);
//Free memory
delete loctim;
return reval;
}
///Calculate how many days are between begin and end
int diffDay(time_t end,time_t begin)
{
time_t eDay=getDay(end); //Get beginning of day end is a part of
time_t bDay=getDay(begin); //Get beginning of day begin is a part of
time_t dif=(eDay-bDay)/(24*60*60); //Get how many days (86400 seconds)
return dif;
}
Here is some text I got from debugging.
Call Stack
#0 77BC3242 ntdll!LdrLoadAlternateResourceModuleEx() (C:\Windows\system32\ntdll.dll:??)
#1 00000000 0x6d067ad3 in ??() (??:??)
#2 00000000 0x00000018 in ??() (??:??)
#3 77BC3080 ntdll!LdrLoadAlternateResourceModuleEx() (C:\Windows\system32\ntdll.dll:??)
#4 00000000 0x00000018 in ??() (??:??)
#5 77C60FCB ntdll!TpCheckTerminateWorker() (C:\Windows\system32\ntdll.dll:??)
#6 00000000 0x007f0000 in ??() (??:??)
#7 00000000 0x50000163 in ??() (??:??)
#8 00000000 0x00000018 in ??() (??:??)
#9 77C1AC4B ntdll!RtlReAllocateHeap() (C:\Windows\system32\ntdll.dll:??)
#10 00000000 0x007f0000 in ??() (??:??)
#11 00000000 0x50000163 in ??() (??:??)
#12 00000000 0x00000018 in ??() (??:??)
#13 77BC3080 ntdll!LdrLoadAlternateResourceModuleEx() (C:\Windows\system32\ntdll.dll:??)
#14 00000000 0x00000018 in ??() (??:??)
#15 769A9D45 msvcrt!malloc() (C:\Windows\syswow64\msvcrt.dll:??)
#16 769AF5D3 strcpy_s() (C:\Windows\syswow64\msvcrt.dll:??)
#17 769B2B18 open_osfhandle() (C:\Windows\syswow64\msvcrt.dll:??)
#18 00000000 0x00000018 in ??() (??:??)
#19 769B3C7D msvcrt!_get_fmode() (C:\Windows\syswow64\msvcrt.dll:??)
#20 769BA6A0 msvcrt!_fsopen() (C:\Windows\syswow64\msvcrt.dll:??)
#21 00000000 0xc3458a06 in ??() (??:??)
#22 00000000 0x00000000 in ??() (??:??)
Also here's another call stack from the same build.
#0 77BE708C ntdll!RtlTraceDatabaseLock() (C:\Windows\system32\ntdll.dll:??)
#1 00000000 0x6ccdaf66 in ??() (??:??)
#2 00000000 0x00000000 in ??() (??:??)
Is it some special build option? I was using -std=c++0x but decided to try the program without it and it still crashed. Thanks for any help, I've been trying to fix this all day.
I think that the problem is here:
struct tm* loctim=localtime(&t);
delete loctim;
localtime returns a pointer to a static buffer. You must not free it. Doing so causes undefined behaviour: some data is left in an inconsistent state and may cause a crash elsewhere in the program, in a place that seems unrelated to the actual problem.
A good way to find such problems is to run the program under valgrind. It gives you very accurate information about what is going wrong:
vlap:~/src $ valgrind ./a.out
==29314== Memcheck, a memory error detector
==29314== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==29314== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==29314== Command: ./a.out
==29314==
==29314== Invalid free() / delete / delete[] / realloc()
==29314== at 0x4C29E6C: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29314== by 0x400D2A: getDay(long) (test.cpp:44)
==29314== by 0x400BEE: main (test.cpp:11)
==29314== Address 0x59f5560 is 0 bytes inside data symbol "_tmbuf"
==29314==
==29314== Invalid free() / delete / delete[] / realloc()
==29314== at 0x4C29E6C: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29314== by 0x400D2A: getDay(long) (test.cpp:44)
==29314== by 0x400D4D: diffDay(long, long) (test.cpp:52)
==29314== by 0x400C13: main (test.cpp:13)
==29314== Address 0x59f5560 is 0 bytes inside data symbol "_tmbuf"
==29314==
==29314== Invalid free() / delete / delete[] / realloc()
==29314== at 0x4C29E6C: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29314== by 0x400D2A: getDay(long) (test.cpp:44)
==29314== by 0x400D5D: diffDay(long, long) (test.cpp:53)
==29314== by 0x400C13: main (test.cpp:13)
==29314== Address 0x59f5560 is 0 bytes inside data symbol "_tmbuf"
==29314==
Cur Time: 1395580379
Cur Day: 1395529200
Yes Day: 1
a
==29314==
==29314== HEAP SUMMARY:
==29314== in use at exit: 0 bytes in 0 blocks
==29314== total heap usage: 12 allocs, 15 frees, 1,846 bytes allocated
==29314==
==29314== All heap blocks were freed -- no leaks are possible
==29314==
==29314== For counts of detected and suppressed errors, rerun with: -v
==29314== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 3 from 3)
You can't use delete, which is a C++ operator, to free the result of localtime(), which doesn't use C++ memory management. In any case, you don't actually need to release the value returned by localtime.
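A possible rewrite of getDay along those lines is sketched below. The name start_of_day, the -1 error value, the local copy of the struct and the tm_isdst = -1 setting are choices made for this sketch, not part of the original program.

```cpp
#include <cassert>
#include <ctime>

// Returns the time_t for local midnight of the day containing t, or -1 if
// the conversion fails. Nothing is freed: the pointer returned by localtime
// refers to storage owned by the C library.
std::time_t start_of_day(std::time_t t) {
    const std::tm* shared = std::localtime(&t);
    if (shared == nullptr) return static_cast<std::time_t>(-1);
    std::tm local = *shared;   // private copy; later localtime calls cannot overwrite it
    local.tm_hour = 0;
    local.tm_min = 0;
    local.tm_sec = 0;
    local.tm_isdst = -1;       // let mktime work out whether DST applies at midnight
    return std::mktime(&local);
}
```

Returning time_t rather than int also avoids truncating the result of mktime on platforms where time_t is wider than int.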
You can use cmd or the terminal to write the time to a file. On cmd: echo %time% > time.txt and on a Linux terminal: date > time.txt
You can run the command with: system(command)
And then you read the file.

memory-allocation breakpoint does not stop execution

I have a piece of JNI code with a mem leak:
Detected memory leaks!
Dumping objects ->
{76} normal block at 0x277522F8, 52 bytes long.
Data: < "u' "u' "u' > F8 22 75 27 F8 22 75 27 F8 22 75 27 CD CD CD CD
Object dump complete.
So I set a breakpoint on the reported memory allocation number (76 in this case).
_crtBreakAlloc = 76;
But the application never stops, as if that allocation were never performed.
I also took two memory snapshots, at the beginning and at the end of the program, and compared them.
(At code beginning):
_CrtMemCheckpoint( &s1 );
(At code end):
_CrtMemCheckpoint( &s2 );
_CrtMemState s3;
_CrtMemDifference( &s3, &s1, &s2);
_CrtMemDumpStatistics( &s3 );
Here the result:
0 bytes in 0 Free Blocks.
0 bytes in 0 Normal Blocks.
0 bytes in 0 CRT Blocks.
0 bytes in 0 Ignore Blocks.
0 bytes in 0 Client Blocks.
Largest number used: 2839 bytes.
Total allocations: 101483 bytes.
It seems that all is OK.
I can't figure out what is happening. Is it a false-positive memory leak? Or is it a memory leak in the JVM? If so, is there a way to detect it?
Added after the solution was found:
I modified the initialization of a static map and the problem has been solved.
In particular, I transformed a private static member from map to map*. The problem is that when you initialize a static, it must be initialized with a constant.
Here is how I changed the declaration of the static member:
static const map<wstring, enumValue>* mapParamNames;
So my initialize() method becomes:
map<wstring, paramNames>* m = new map<wstring, paramNames>();
(*m)[L"detectCaptions"] = detectCaptions;
(*m)[L"insertEmptyParagraphsForBigInterlines"] = insertEmptyParagraphsForBigInterlines;
(*m)[L"fastMode"] = fastMode;
(*m)[L"predefinedTextLanguage"] = predefinedTextLanguage;
(*m)[L"detectFontSize"] = detectFontSize;
(*m)[L"saveCharacterRecognitionVariants"] = saveCharacterRecognitionVariants;
(*m)[L"detectBold"] = detectBold;
(*m)[L"saveWordRecognitionVariants"] = saveWordRecognitionVariants;
KernelParamsSetter::mapParamNames = m;
Finally, I inserted the delete of the map in the class destructor:
delete KernelParamsSetter::mapParamNames;
Hope this can be useful for someone.
One possibility would be that memory allocation 76 occurs during the static initialization of a global variable. In that case, you may be setting _crtBreakAlloc too late to catch the allocation.
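To illustrate why the timing matters, here is a small portable sketch (it does not use the CRT debug heap). ParamTable, g_table and table_ready are hypothetical names; the point is only that a namespace-scope object with a constructor is built, and performs its heap allocations, before any statement in main runs, so a break condition assigned inside main would come too late for those allocations.

```cpp
#include <cassert>
#include <map>
#include <string>

// Set by the constructor below. It is zero-initialized before any
// dynamic initialization takes place.
static bool g_initialized_before_main = false;

// Hypothetical stand-in for a class-level static map.
struct ParamTable {
    std::map<std::wstring, int> names;
    ParamTable() {
        names[L"fastMode"] = 1;            // this insertion allocates from the heap
        g_initialized_before_main = true;
    }
};

// Namespace-scope object: in an executable it is constructed before the
// body of main() starts executing.
static ParamTable g_table;

bool table_ready() { return g_initialized_before_main; }
```

In a DLL loaded through JNI the same kind of initialization would presumably happen when the library is loaded, which is again before the application code gets a chance to set the break condition.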