Hi, when I tried to run my C++ program, I got the following error:
a.out: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
Aborted
When I traced the program using cout statements, I found that the error is triggered by the following line:
BNode* newNode=new BNode();
If I remove this line, I no longer get the error.
Can anyone please help with this?
The shown line of code is ok in general. The heap probably was corrupted before. I would use a memory checker like valgrind to find out where.
Without a memory checking tool you just have to look hard at your code and find the error.
Sometimes a binary search strategy helps. Deliberately deactivate parts of your code and narrow down. Don't be fooled by false positives like the line you posted.
Another alternative is to switch to a programming language with automatic memory management.
The error message means that the integrity of the program heap was violated. The heap was broken. The line you removed... maybe it was the culprit, maybe it was not to blame. Maybe the heap was damaged by some code before that (or even well before that) and the new that you removed simply revealed the problem, not caused it. There's no way to say from what you posted.
So, it is possible that you actually changed nothing by removing that line. The error could still be there, and the program will simply fail in some other place. Buffer overrun, double free or something like that is normally to blame for the invalidated heap. Run your code through some static or dynamic checker to look for these problems (valgrind, coverity etc.)
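To make the "damaged earlier, detected later" point concrete, here is a minimal sketch in C++. All names in it (GuardedBuffer, write_zeros, kGuardBytes, kCanary) are made up for illustration; it is not how glibc or valgrind work internally. The buffer owns a few extra guard bytes after the payload, so an off-by-one write lands in memory the buffer owns and can be detected afterwards instead of silently damaging the real heap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; an illustration, not a real allocator.
namespace {
constexpr std::size_t kGuardBytes = 16;   // padding owned by the buffer
constexpr unsigned char kCanary = 0xAB;   // pattern expected in the padding
}

class GuardedBuffer {
public:
    explicit GuardedBuffer(std::size_t n)
        : size_(n), storage_(n + kGuardBytes, kCanary) {}

    unsigned char* data() { return storage_.data(); }
    std::size_t size() const { return size_; }

    // True while nothing has written past the first size() bytes.
    bool intact() const {
        for (std::size_t i = size_; i < storage_.size(); ++i) {
            if (storage_[i] != kCanary) return false;
        }
        return true;
    }

private:
    std::size_t size_;
    std::vector<unsigned char> storage_;
};

// Simulates a loop with a wrong bound: zeroes `count` bytes from data().
// The write must stay inside the payload plus the guard region.
void write_zeros(GuardedBuffer& buf, std::size_t count) {
    assert(count <= buf.size() + kGuardBytes);
    for (std::size_t i = 0; i < count; ++i) buf.data()[i] = 0;
}
```

The write that goes one byte too far does not crash anything. Only a later call to intact() notices it, much as the later new only revealed the damage in the question above.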
I am writing a C++ application which has started to occasionally have what I believe to be bad allocations even though no bad alloc error is given. In one method, I have:
float * out = new float[len]; // len works out to about 570000
memset(out, 0, sizeof(float) * len);
when this code runs, I get EXC_BAD_ACCESS at address out[4] (always the 4th element). In LLDB I tested the range of the error; turns out, I can never access/write to out[n] for any 4 <= n < 1028.
I assume that there is some sort of allocation problem preventing me from getting clean access to this memory block, but I can't figure out how to find the responsible code. Any ideas where I can start?
I can provide more details if necessary. Thanks!
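One way to take the raw allocation out of the picture while hunting for the real culprit is sketched below. It assumes a std::vector is acceptable in place of the new[]/memset pair; make_zeroed and try_read are hypothetical helper names. It will not fix corruption caused elsewhere, but the bounds-checked at() turns an out-of-range index into an exception rather than a stray memory access.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns len floats, all value-initialized to 0.0f.
std::vector<float> make_zeroed(std::size_t len) {
    return std::vector<float>(len);
}

// Bounds-checked read: reports failure instead of touching memory
// outside the block.
bool try_read(const std::vector<float>& v, std::size_t i, float& out) {
    try {
        out = v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Small usage check with the size mentioned in the question.
void demo() {
    std::vector<float> out = make_zeroed(570000);
    float value = 1.0f;
    assert(try_read(out, 4, value) && value == 0.0f);
    assert(!try_read(out, out.size(), value));
}
```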
A user reported an error to me where the line
read(unit_chk) ((kpt_latt(i,nkp),i=1,3),nkp=1,num_kpts)
failed with the error (similar to Why do I get a C malloc assertion failure?)
malloc.c:2365: sysmalloc: Assertion `(old_top == (((mbinptr)
(((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct
malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >=
(unsigned long)((((__builtin_offsetof (struct malloc_chunk,
fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask)
== 0)' failed.
Abort
As far as I know, the error occurs only for a specific set of inputs. Also, when the read() is changed to the equivalent
((kpt_latt(i,nkp),i=1,3),nkp=1,(num_kpts-1)), &
kpt_latt(1,num_kpts),kpt_latt(2,num_kpts),kpt_latt(3,num_kpts)
the error disappears. Even compiling with a different compiler version (IntelStudio 2013 SP1 composer_xe_2013_sp1.2.144 instead of IntelStudio 2015 composer_xe_2015.6.233) made the error disappear. (This is all from the user's reports -- I have not yet reproduced the error.)
When the program is run through valgrind, it reports
valgrind: m_mallocfree.c:268 (mk_plain_bszB): Assertion 'bszB != 0' failed.
valgrind: This is probably caused by your program erroneously writing past the
end of a heap block and corrupting heap metadata. If you fix any
invalid writes reported by Memcheck, this assertion failure will
probably go away. Please try that before reporting this as a bug.
Before that, there are a couple of messages: Conditional jump or move depends on uninitialised value(s), Use of uninitialised value of size 8, and Invalid read of size 8; and one Invalid write of size 1 on the statement cited above.
The array that is being read into is allocated to the proper size just one line before:
allocate(kpt_latt(3,num_kpts))
read(unit_chk) ((kpt_latt(i,nkp),i=1,3),nkp=1,num_kpts)
EDIT: The user has reported back with a possible solution. The array kpt_latt that is being read was declared with a wrong data type, namely as integer while the data in the file was written as real. This is an error of course; but is it realistic that this caused the failed malloc() assertion?
Fine print: We are talking about a default-kind integer (4 bytes) and a double precision real (8 bytes) here. The resulting bogus values in kpt_latt were not noticed because the program does not actually use them. I still have not reproduced the error myself, so I have to rely on what the user tells me.
My server daemon works fine on most machines; however, on one I am getting:
malloc.c:3074: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1)
- 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) ||
((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct
malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) -
1)))&& ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
gdb backtrace:
#4 0x002a8300 in sYSMALLOc (av=<value optimised out>, bytes=<value optimised out>) at malloc.c:3071
#5 _int_malloc (av=<value optimised out>, bytes=<value optimised out>) at malloc.c:4702
#6 0x002a9898 in *__GI___libc_malloc (bytes=16) at malloc.c:3638
#7 0x0804d575 in xmpp_ctx_new (mem=0x0, log=0x0) at src/ctx.c:383
#8 0x0804916e in main (argc=1, argv=0xbffff834) at ../src/adminbot.c:277
Any ideas what else to try? I am unable to find a bug in my code. It could be a bug in the XMPP library, and I need to determine that.
Thanks.
This is almost certainly due to a heap corruption bug in your code (writing just before or just after an allocated block).
Since you are apparently on Linux, the tool to use here is Valgrind. It should point you straight at the problem, and it should do so even on machines where your daemon "works".
Trying anything other than Valgrind for this kind of problem is likely a waste of time.
The assertion almost certainly indicates some kind of memory corruption prior to a call to malloc. Given that the assertion is tripping in xmpp_ctx_new, which appears to be a very early call in the libstrophe XMPP library, I'd say it's very likely that the bug is in your code (though it may not be if you're allocating several XMPP contexts - not sure if there's any reason to do that).
If you're only allocating one XMPP context, you can isolate the bug to your code by inserting a call to malloc(sizeof(xmpp_ctx_t)) prior to calling xmpp_ctx_new, and you'll see the problem isn't in libstrophe. (Incidentally, I'm pretty sure the problem won't be in this call to xmpp_ctx_new because I google'd the source to the function (mem=0x0 looked likely to cause problems), and saw that it basically reduced to malloc and a few initializers - reading the source is generally a good strategy for looking for bugs in OSS.)
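The probe idea can be sketched roughly as follows. This is C++ rather than the daemon's own code, and FakeCtx, fake_ctx_new, heap_probe and start_up are hypothetical stand-ins (the real call would be xmpp_ctx_new from the backtrace). The assumption is that a throwaway allocation placed right before the library call will trip the same allocator assertion if the heap is already damaged at that point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: performs a throwaway allocation of n bytes.
// On a healthy heap it returns true. If the allocator's metadata is
// already damaged, the allocator may abort inside this call instead.
bool heap_probe(std::size_t n) {
    assert(n > 0);  // the probe writes one byte, so it needs at least one
    void* p = std::malloc(n);
    if (p == nullptr) return false;  // genuine out-of-memory
    // Touch the block through a volatile pointer so the allocation is
    // less likely to be optimized away.
    volatile unsigned char* q = static_cast<unsigned char*>(p);
    q[0] = 0;
    std::free(p);
    return true;
}

// Hypothetical stand-in for the library call under suspicion:
// it only allocates and initializes a small context.
struct FakeCtx { int log_level; };

FakeCtx* fake_ctx_new() {
    FakeCtx* ctx = static_cast<FakeCtx*>(std::malloc(sizeof(FakeCtx)));
    if (ctx != nullptr) ctx->log_level = 0;
    return ctx;
}

bool start_up() {
    // If the program dies here, the damage predates the library call.
    if (!heap_probe(sizeof(FakeCtx))) return false;
    FakeCtx* ctx = fake_ctx_new();
    if (ctx == nullptr) return false;
    std::free(ctx);
    return true;
}
```

If the abort moves from the library call to the probe, the library is off the hook and the search continues in the code that ran before it.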
I'm fixing a bug in a low-level C++ library of the Android multimedia framework. When execution reaches the following code, the system crashes:
if (((*pChar) >= _T('a')) && ((*pChar) <= _T('z'))) {
nFrameTime++;
}
nFrameTime is an int and pChar is a wchar_t*.
But when I change the code to:
if (((*pChar) >= _T('a')) || ((*pChar) <= _T('z'))) {
nFrameTime++;
}
Everything is OK. I do not care whether "&&" or "||" is used; I only want to know why the first version crashes. Can anybody give me some suggestions?
Most likely pChar isn't pointing at valid data. That's the only thing in there that could really cause a crash (compiler bugs excepted).
The real mystery is why the changed version isn't crashing.
As to the answer to my question, it could be that when you change the code it modifies things just enough that the garbage in pChar happens to point to a valid memory location. Another possibility, as Ben Voigt pointed out in the comments, is that the check is being optimized away in the second version, because any value at all of *pChar will cause it to be true.
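The "any value at all" claim is easy to check. The sketch below (is_lower_and, is_lower_or and count_matches are hypothetical names, and L'a' stands in for _T('a') in a Unicode build) counts how many of the first 65536 wchar_t values satisfy each condition. Since the || form is true for every value, a compiler is free to skip reading *pChar altogether, which would hide a bad pointer.

```cpp
#include <cassert>

// Original test: true only for 'a'..'z'.
bool is_lower_and(wchar_t c) { return c >= L'a' && c <= L'z'; }

// Modified test: every value is either >= 'a' or <= 'z'.
bool is_lower_or(wchar_t c) { return c >= L'a' || c <= L'z'; }

// Counts how many of the values 0..0xFFFF satisfy pred.
template <typename Pred>
int count_matches(Pred pred) {
    int n = 0;
    for (long v = 0; v <= 0xFFFF; ++v) {
        if (pred(static_cast<wchar_t>(v))) ++n;
    }
    return n;
}

// Usage check: 26 letters for &&, all 65536 values for ||.
void demo() {
    assert(count_matches(is_lower_and) == 26);
    assert(count_matches(is_lower_or) == 65536);
}
```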
I have a huge MMC snapin written in Visual C++ 9. Every once in a while when I hit F5 in MMC mmc.exe crashes. If I attach a debugger to it I see the following message:
A buffer overrun has occurred in mmc.exe which has corrupted the program's internal state. Press Break to debug the program or Continue to terminate the program.
For more details please see Help topic 'How to debug Buffer Overrun Issues'.
First of all, there's no How to debug Buffer Overrun Issues topic anywhere.
When I inspect the call stack I see that it's likely something with security cookies used to guard against stack-allocated buffer overruns:
MySnapin.dll!__crt_debugger_hook() Unknown
MySnapin.dll!__report_gsfailure() Line 315 + 0x7 bytes C
mssvcr90d.dll!ValidateLocalCookies(void (unsigned int)* CookieCheckFunction=0x1014e2e3, _EH4_SCOPETABLE * ScopeTable=0x10493e48, char * FramePointer=0x0007ebf8) + 0x57 bytes C
msvcr90d.dll!_except_handler4_common(unsigned int * CookiePointer=0x104bdcc8, void (unsigned int)* CookieCheckFunction=0x1014e2e3, _EXCEPTION_RECORD * ExceptionRecord=0x0007e764, _EXCEPTION_REGISTRATION_RECORD * EstablisherFrame=0x0007ebe8, _CONTEXT * ContextRecord=0x0007e780, void * DispatcherContext=0x0007e738) + 0x44 bytes C
MySnapin.dll!_except_handler4(_EXCEPTION_RECORD * ExceptionRecord=0x0007e764, _EXCEPTION_REGISTRATION_RECORD * EstablisherFrame=0x0007ebe8, _CONTEXT * ContextRecord=0x0007e780, void * DispatcherContext=0x0007e738) + 0x24 bytes C
ntdll.dll!7c9032a8()
[Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]
ntdll.dll!7c90327a()
ntdll.dll!7c92aa0f()
ntdll.dll!7c90e48a()
MySnapin.dll!IComponentImpl<CMySnapin>::GetDisplayInfo(_RESULTDATAITEM * pResultDataItem=0x0007edb0) Line 777 + 0x14 bytes C++
// more Win32 libraries functions follow
I have lots of code and no idea where the buffer overrun might occur or why. I found this forum discussion and specifically the advice to replace all wcscpy-like functions with more secure versions like wcscpy_s(). I followed the advice, but that didn't get me any closer to solving the problem.
How do I debug my code and find why and where the buffer overrun occurs with Visual Studio 2008?
Add the /RTCs switch to the compiler options. This enables detection of buffer overruns and underruns at runtime. When an overrun is detected, the program will break right where it happened rather than giving you a postmortem message.
If that does not help, then investigate the wcscpy_s() calls that you mentioned. Verify that the 'number of elements' argument has the correct value. I recently fixed a buffer overrun caused by incorrect usage of wcscpy_s(). Here is an example:
const int offset = 10;
wchar_t buff[MAXSIZE];
wcscpy_s(buff + offset, MAXSIZE, buff2);
Notice that buff + offset has MAXSIZE - offset elements, not MAXSIZE.
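With MSVC, the note above implies the call should read wcscpy_s(buff + offset, MAXSIZE - offset, buff2). Since wcscpy_s is not available everywhere, here is a portable sketch of the same bookkeeping. The helper name copy_at_offset and the constant kMaxSize (standing in for MAXSIZE) are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

constexpr std::size_t kMaxSize = 32;  // stands in for MAXSIZE above

// Hypothetical helper: copies src into dest starting at element `offset`.
// Returns false and writes nothing if src plus its terminator does not
// fit in the kMaxSize - offset elements that remain.
bool copy_at_offset(wchar_t (&dest)[kMaxSize], std::size_t offset,
                    const wchar_t* src) {
    if (src == nullptr || offset >= kMaxSize) return false;
    const std::size_t remaining = kMaxSize - offset;  // not kMaxSize
    const std::size_t needed = std::wcslen(src) + 1;  // include the L'\0'
    if (needed > remaining) return false;
    std::wmemcpy(dest + offset, src, needed);
    return true;
}

// Usage check with the offset from the example above.
void demo() {
    wchar_t buff[kMaxSize] = {};
    assert(copy_at_offset(buff, 10, L"hello"));
    assert(std::wcscmp(buff + 10, L"hello") == 0);
}
```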
I just had this problem a minute ago, and I was able to solve it. I first searched the net to no avail, but then I found this thread.
Anyways, I am running VS2005 and I have a multi-threaded program. I had to 'guess' which thread caused the problem, but luckily I only have a few.
So, what I did was in that thread I ran through the debugger, stepping through the code at a high level function. I noticed that it always occurred at the same place in the function, so now it was a matter of drilling down.
The other thing I would do is step through with the callstack window open making sure that the stack looked okay and just noting when the stack goes haywire.
I finally narrowed down to the line that caused the bug, but it wasn't actually that line. It was the line before it.
So what was the cause for me? Well, in short-speak I tried to memcpy a NULL pointer into a valid area of memory.
I'm surprised the VS2005 can't handle this.
Anyways, hope that helps. Good luck.
I assume you aren't able to reproduce this reliably.
I've successfully used Rational Purify to hunt down a variety of memory problems in the past, but it costs $ and I'm not sure how it would interact with MMC.
Unless there's some sort of built-in memory debugger you may have to try solving this programmatically. Are you able to remove/disable chunks of functionality to see if the problem manifests itself?
If you have "guesses" about where the problem occurs you can try disabling/changing that code as well. Even if you changed the copy functions to _s versions, you still need to be able to reliably handle truncated data.
I got this overrun when I wanted to increment the value that a pointer variable points to and wrote:
*out_BMask++;
instead of
(*out_BMask)++;
where out_BMask was declared as int *out_BMask
If you did something like me then I hope this will help you ;)
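For anyone unsure about the difference: *out_BMask++ parses as *(out_BMask++), so it moves the pointer and leaves the value alone. A small sketch (bump_pointer and bump_value are hypothetical names) that shows both behaviours on a two-element array:

```cpp
#include <cassert>

// *p++ parses as *(p++): the pointer advances and the pointed-to value
// is untouched. Returns how many elements the pointer moved.
int bump_pointer(int*& p) {
    int* before = p;
    (void)*p++;  // value is read and discarded
    return static_cast<int>(p - before);
}

// (*p)++ increments the pointed-to value and the pointer stays put.
int bump_value(int*& p) {
    int* before = p;
    (*p)++;
    return static_cast<int>(p - before);
}

// Usage check on a small array.
void demo() {
    int mask[2] = {5, 7};
    int* p = mask;
    assert(bump_pointer(p) == 1 && mask[0] == 5);
    p = mask;
    assert(bump_value(p) == 0 && mask[0] == 6);
}
```

After the first form, every later write through the pointer lands one element further along than intended, which is how an innocent-looking increment turns into an overrun.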