Threads are blocked in malloc and free, virtual size - C++

I'm running a 64-bit multi-threaded program on Windows Server 2003 (x64). It has run into a state where some of the threads seem to be blocked in malloc or free forever. The stack traces look like this:
ntdll.dll!NtWaitForSingleObject() + 0xa bytes
ntdll.dll!RtlpWaitOnCriticalSection() - 0x1aa bytes
ntdll.dll!RtlEnterCriticalSection() + 0xb040 bytes
ntdll.dll!RtlpDebugPageHeapAllocate() + 0x2f6 bytes
ntdll.dll!RtlDebugAllocateHeap() + 0x40 bytes
ntdll.dll!RtlAllocateHeapSlowly() + 0x5e898 bytes
ntdll.dll!RtlAllocateHeap() - 0x1711a bytes
MyProg.exe!malloc(unsigned __int64 size=0) Line 168 C
MyProg.exe!operator new(unsigned __int64 size=1) Line 59 + 0x5 bytes C++
ntdll.dll!NtWaitForSingleObject()
ntdll.dll!RtlpWaitOnCriticalSection()
ntdll.dll!RtlEnterCriticalSection()
ntdll.dll!RtlpDebugPageHeapFree()
ntdll.dll!RtlDebugFreeHeap()
ntdll.dll!RtlFreeHeapSlowly()
ntdll.dll!RtlFreeHeap()
MyProg.exe!free(void * pBlock=0x000000007e8e4fe0) C
BTW, the parameter values shown as passed to operator new are probably not correct here, maybe due to optimization.
Also, at the same time, I found in Process Explorer that the virtual size of this program is 10 GB, but the private bytes and working set are very small (<2 GB). We do have some threads using VirtualAlloc, but in a way that commits the memory in the call, and those threads are not blocked:
m_pBuf = VirtualAlloc(NULL, m_size, MEM_COMMIT, PAGE_READWRITE);
......
VirtualFree(m_pBuf, 0, MEM_RELEASE);
This looks strange to me: it seems a lot of virtual space is reserved but not committed, and malloc/free are blocked on a lock. I'm wondering whether there is any corruption in the memory/objects, so I plan to turn on gflags with PageHeap to troubleshoot this.
Does anyone have similar experience with this? Could you share it so I can get more hints?
Thanks a lot!

Your program is using PageHeap, which is intended for debugging only and imposes a ton of memory overhead. To see which programs have PageHeap activated, do this at a command line.
% Gflags.exe /p
To disable it for your process, type this (for MyProg.exe):
% Gflags.exe /p /disable MyProg.exe

Pageheap.exe detects most heap-related bugs - try Pageheap
Also, you should look into "the param values passed to the new ..." issue you mentioned - does the corruption also occur in a debug build? Make sure all optimizations are disabled.

If your system is running out of memory, it might be the case that the OS is swapping. That means that for a single allocation, in the worst case, the OS could need to locate the best candidate page for swapping, write it to disk, free the memory, and only then return it. Are you sure the threads are actually deadlocked, or might they just be running very slowly? Could another thread be swapping memory to disk while these two threads wait for their calls to malloc/free to complete?

My preferred approach for debugging leaks in native applications is to use UMDH to get consecutive snapshots of the user-mode heap(s) in the process and then run UMDH again to diff the snapshots. Any consistent pattern of growth across the snapshots is likely a leak.
You get a count and size of memory blocks bucketed by their allocating callstack so it's reasonably straightforward to see where the biggest hogs are.
The user-mode dump heap (UMDH) utility works with the operating system to analyze Windows heap allocations for a specific process.
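For reference, the workflow is roughly this (a sketch - the PID 1234 is a placeholder; UMDH also needs _NT_SYMBOL_PATH set, and the target must run with the "create user mode stack trace database" flag so allocation call stacks are recorded):
% gflags /i MyProg.exe +ust
% umdh -p:1234 -f:snap1.log
(... let the process run for a while ...)
% umdh -p:1234 -f:snap2.log
% umdh snap1.log snap2.log -f:diff.log
The diff log groups allocations by call stack, so the fastest-growing stacks are the leak candidates.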

Related

I converted my compiler from 32-bit to 64-bit, but I still can't use more than 2 GB :( why?

I can create this array:
int Array[490000000];
cout << "Array Byte= " << sizeof(Array) << endl;
That is 1,960,000,000 bytes, which is about 1.96 GB. But I can't create both of these at the same time:
int Array[490000000];
int Array2[490000000];
It gives an error. Why? Sorry for my bad English :)
I also checked my compiler like this:
printf("%d\n", sizeof(char *));
it gives me 8.
C++ programs are not usually compiled to have 2 GB+ of stack space, regardless of whether they are compiled in 32-bit or 64-bit mode. Stack space can be increased as part of the compiler options, but even in the scenario where it is permissible to set the stack size that high, it's still not an idiomatic or recommended solution.
If you need an array of 2 GB, you should use std::vector<int> Array(490'000'000); (strongly recommended) or a manually created array, i.e. int* Array = new int[490'000'000]; (remember that manually allocated memory must be manually deallocated with delete[]), either of which will allocate dynamic memory. You'll still want to compile in 64-bit mode, since 2 GB brushes up against the maximum memory limit of a 32-bit application, but in your scenario it's not strictly necessary, since 1.96 GB is just under that 32-bit limit.
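For instance, a minimal compilable sketch of the vector approach:
#include <iostream>
#include <vector>

int main()
{
    // Heap-allocated, so the few-megabyte stack limit does not apply.
    std::vector<int> Array(490000000);
    std::cout << "Array bytes = " << Array.size() * sizeof(int) << std::endl;
    return 0;
}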
But still I can't use more than 2 GB :( why?
The C++ language does not have semantics to modify (or report) how much automatic memory is available (or at least I have not seen them). The compilers rely on the OS to provide some 'useful' amount. You will have to search (Google, your hardware documents, user's manuals, etc.) to find out how much. This limit is 'machine' dependent, in that some machines do not have as much memory as you may want.
On Ubuntu, for the last few releases, the POSIX function ::pthread_attr_getstacksize(...) reports 8 MB per thread. (I am not sure of the proper terminology, but) what Linux calls the 'stack' is the resource that the C++ compiler uses for automatic memory. For this release of OS and compiler, the limit for automatic variables is thus 8 MB (much smaller than 2 GB).
I suppose that because the next machine might have more memory, the compiler might be given a bigger automatic-memory budget, and I've seen no semantics that limit the size of your array based on the memory size of the machine performing the compile, so there can be no compile-time report that the stack will overflow.
I see POSIX has a function suggesting a way to adjust the size of the stack. I've not tried it.
I have also found Ubuntu commands that can report and adjust the sizes of various memory resources.
From https://www.nics.tennessee.edu/:
The command to modify limits varies by shell. The C shell (csh) and
its derivatives (such as tcsh) use the limit command to modify limits.
The Bourne shell (sh) and its derivatives (such as ksh and bash) use
the ulimit command. The syntax for these commands varies slightly and
is shown below. More detailed information can be found in the man page
for the shell you are using.
One minor experiment ... the command prompt
$ dtb_chimes
launches this work-in-progress app, which uses POSIX and reports an 8 MB stack (automatic-variable) limit.
With the ulimit prefix command
$ ulimit -S -s 131072 ; dtb_chimes
the app now reports 134,217,728
./dtb_chimes_ut
default Stack size: 134,217,728
argc: 1
1 ./dtb_chimes_ut
But I have not confirmed the actual allocation ... and this is still a lot smaller than 1.96 GB ... but maybe you can get there.
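As a sketch of how an app can query that limit itself (this uses getrlimit, which reads the same soft limit that ulimit -s adjusts; dtb_chimes may well do it differently):
#include <iostream>
#include <sys/resource.h>

int main()
{
    // RLIMIT_STACK is the limit that `ulimit -s` changes (here in bytes).
    rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) == 0)
        std::cout << "Stack size limit: " << rl.rlim_cur << " bytes\n";
    return 0;
}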
Note: I strongly recommend std::vector over a big array.
On my Ubuntu desktop, there is 4 GB total DRAM (I have memory-test utilities), and my dynamic memory is limited to about 3.5 GB. Again, the amount of dynamic memory is machine dependent.
64 bits address a lot more memory than I can afford.

Is it possible to protect a region of memory from WinAPI?

Having read this interesting article outlining a technique for debugging heap corruption, I started wondering how I could tweak it for my own needs. The basic idea is to provide a custom malloc() that allocates whole pages of memory and then enables memory-protection bits for those pages, so that the program crashes when they get written to, and the offending write instruction can be caught in the act. The sample code is C under Linux (mprotect() is used to enable the protection), and I'm curious how to apply this to native C++ and Windows. VirtualAlloc() and/or VirtualProtect() look promising, but I'm not sure what a usage scenario would look like.
Fred *p = new Fred[100];
ProtectBuffer(p);
p[10] = Fred(); // like this to crash please
I am aware of the existence of specialized tools for debugging memory corruption in Windows, but I'm still curious if it would be possible to do it "manually" using this approach.
EDIT: Also, is this even a good idea under Windows, or just an entertaining intellectual exercise?
Yes, you can use VirtualAlloc and VirtualProtect to set up sections of memory that are protected from read/write operations.
You would have to re-implement operator new and operator delete (and their [] relatives), such that your memory allocations are controlled by your code.
And bear in mind that it would only work on a per-page basis, and you would be using (at least) three pages' worth of virtual memory per allocation - not a huge problem on a 64-bit system, but it may cause problems if you have many allocations on a 32-bit system.
Roughly what you need to do (you should really query the page size for the running version of Windows - I'm too lazy, so I'll use 4096 and 4095 to represent pagesize and pagesize-1 - and you will also need to do more error checking than this code does!!!):
void *operator new(size_t size)
{
    // Round size up to a whole number of pages, plus 2 extra guard pages.
    size_t bigsize = (size + 2*4096 + 4095) & ~4095;
    // Reserve "bigsize" bytes with no access rights.
    void *addr = VirtualAlloc(NULL, bigsize, MEM_RESERVE, PAGE_NOACCESS);
    // Skip the leading guard page, then commit just the usable region.
    addr = reinterpret_cast<void *>(reinterpret_cast<char *>(addr) + 4096);
    void *new_addr = VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE);
    return new_addr;
}
void operator delete(void *ptr)
{
    // Step back over the leading guard page to recover the reservation base.
    char *tmp = reinterpret_cast<char *>(ptr) - 4096;
    VirtualFree(tmp, 0, MEM_RELEASE);
}
Something along those lines, as I said - I haven't tried compiling this code, as I only have a Windows VM, and I can't be bothered to download a compiler to see if it actually compiles. [I know the principle works, as we did something similar where I worked a few years back.]
This is what guard pages are for (see this MSDN tutorial): they raise a special exception the first time the page is accessed, allowing you to do more than crash on the first invalid page access (and to catch bad reads/writes, as opposed to just NULL pointer dereferences etc.).
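A minimal sketch of the guard-page idea (assumes MSVC for the __try/__except SEH syntax; error checking omitted):
#include <windows.h>
#include <stdio.h>

int main()
{
    SYSTEM_INFO si;
    GetSystemInfo(&si); // query the real page size instead of assuming 4096

    // Commit one read/write page, then arm it as a guard page.
    char *page = static_cast<char *>(
        VirtualAlloc(NULL, si.dwPageSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
    DWORD oldProtect;
    VirtualProtect(page, si.dwPageSize, PAGE_READWRITE | PAGE_GUARD, &oldProtect);

    __try {
        page[0] = 42; // first touch raises STATUS_GUARD_PAGE_VIOLATION
    }
    __except (GetExceptionCode() == EXCEPTION_GUARD_PAGE
                  ? EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH) {
        printf("guard page hit - a debugger would break here\n");
    }

    VirtualFree(page, 0, MEM_RELEASE);
    return 0;
}
Note that the guard status is removed after the first access, so a monitoring tool has to re-arm the page each time it fires.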

PageHeap does not show exact crash location

I am using PageHeap to identify heap corruption. My application has a heap corruption, but it breaks (due to the crash) when it creates an STL object for a string passed to a method, and I cannot see any visible memory issues near the crash location. I enabled full PageHeap for detecting heap corruption and /RTCs for detecting stack corruption.
What should I do to break at the exact location where the heap corruption occurs?
Enabling FULL pageheap can increase the chances of the debugger catching a heap corruption as it's happening:
gflags /p /enable /full <processname>
Also, if you can find out which address is getting overwritten, you can set a breakpoint on memory access in WinDbg. I'm not sure whether the VS debugger has the same feature.
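For the record, a WinDbg break-on-access breakpoint for a four-byte write looks like this (the address is just a placeholder for the one being corrupted):
ba w4 0x0013ff00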
Pageheap does not always detect heap corruption exactly at the moment when it occurs.
Pageheap inserts an invalid page right after each allocation, so whenever you overrun an allocated block you get an AV. But there are other possible cases. One example is writing just before an allocated block, corrupting the heap block header data structure. The heap block header is valid writable memory (most likely in the same page as the allocated block). Consider the following example:
#include <stdlib.h>

int main()
{
    void* block = malloc(100);
    int* intPtr = (int*)block;
    *(intPtr - 1) = 0x12345; // no crash
    free(block);             // crash
    return 0;
}
So writing some garbage just before the allocated block passes just fine. With Pageheap enabled, the example breaks inside the free() call. Here is the call stack:
verifier.dll!_VerifierStopMessage@40() + 0x206 bytes
verifier.dll!_AVrfpDphReportCorruptedBlock@16() + 0x239 bytes
verifier.dll!_AVrfpDphCheckNormalHeapBlock@16() + 0x11a bytes
verifier.dll!_AVrfpDphNormalHeapFree@16() + 0x22 bytes
verifier.dll!_AVrfDebugPageHeapFree@12() + 0xe3 bytes
ntdll.dll!_RtlDebugFreeHeap@12() + 0x2f bytes
ntdll.dll!@RtlpFreeHeap@16() + 0x36919 bytes
ntdll.dll!_RtlFreeHeap@12() + 0x722 bytes
heapripper.exe!free(void * pBlock=0x0603bf98) Line 110 C
> heapripper.exe!main() Line 11 + 0x9 bytes C++
heapripper.exe!__tmainCRTStartup() Line 266 + 0x12 bytes C
kernel32.dll!@BaseThreadInitThunk@12() + 0xe bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
Pageheap enables rigorous heap consistency checks, but the checks do not kick in until some other heap API is called. The check routines are visible on the stack. (Without Pageheap, the application would probably just AV inside the heap implementation when attempting to use an invalid pointer.)
So Pageheap does not give you a 100% guarantee of catching a corruption exactly at the moment it occurs. You need tools like Purify or Valgrind that track every memory access.
Don't get me wrong, I think Pageheap is still very useful. It causes much less performance degradation than the aforementioned Purify and Valgrind, so it allows running much more complex scenarios.

How to debug a buffer overrun in Visual C++ 9?

I have a huge MMC snap-in written in Visual C++ 9. Every once in a while, when I hit F5 in MMC, mmc.exe crashes. If I attach a debugger to it, I see the following message:
A buffer overrun has occurred in mmc.exe which has corrupted the program's internal state. Press Break to debug the program or Continue to terminate the program.
For more details please see Help topic 'How to debug Buffer Overrun Issues'.
First of all, there's no "How to debug Buffer Overrun Issues" Help topic anywhere.
When I inspect the call stack I see that it's likely something with security cookies used to guard against stack-allocated buffer overruns:
MySnapin.dll!__crt_debugger_hook() Unknown
MySnapin.dll!__report_gsfailure() Line 315 + 0x7 bytes C
msvcr90d.dll!ValidateLocalCookies(void (unsigned int)* CookieCheckFunction=0x1014e2e3, _EH4_SCOPETABLE * ScopeTable=0x10493e48, char * FramePointer=0x0007ebf8) + 0x57 bytes C
msvcr90d.dll!_except_handler4_common(unsigned int * CookiePointer=0x104bdcc8, void (unsigned int)* CookieCheckFunction=0x1014e2e3, _EXCEPTION_RECORD * ExceptionRecord=0x0007e764, _EXCEPTION_REGISTRATION_RECORD * EstablisherFrame=0x0007ebe8, _CONTEXT * ContextRecord=0x0007e780, void * DispatcherContext=0x0007e738) + 0x44 bytes C
MySnapin.dll!_except_handler4(_EXCEPTION_RECORD * ExceptionRecord=0x0007e764, _EXCEPTION_REGISTRATION_RECORD * EstablisherFrame=0x0007ebe8, _CONTEXT * ContextRecord=0x0007e780, void * DispatcherContext=0x0007e738) + 0x24 bytes C
ntdll.dll!7c9032a8()
[Frames below may be incorrect and/or missing, no symbols loaded for ntdll.dll]
ntdll.dll!7c90327a()
ntdll.dll!7c92aa0f()
ntdll.dll!7c90e48a()
MySnapin.dll!IComponentImpl<CMySnapin>::GetDisplayInfo(_RESULTDATAITEM * pResultDataItem=0x0007edb0) Line 777 + 0x14 bytes C++
// more Win32 libraries functions follow
I have lots of code and no idea where the buffer overrun might occur or why. I found this forum discussion, and specifically the advice to replace all wcscpy-like functions with more secure versions like wcscpy_s(). I followed the advice, but that didn't get me any closer to solving the problem.
How do I debug my code and find why and where the buffer overrun occurs with Visual Studio 2008?
Add the /RTCs switch to the compiler. This enables detection of buffer overruns and underruns at runtime. When an overrun is detected, the program will break exactly at the place where it happened rather than giving you a postmortem message.
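If you build from the command line, the switch goes next to the usual debug options (a sketch - the source file name is a placeholder; in the IDE it lives under C/C++ -> Code Generation -> Basic Runtime Checks):
cl /Zi /Od /RTCs MySnapin.cpp
Note that /RTCs cannot be combined with optimized builds, so use it in debug configurations.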
If that does not help, then investigate the wcscpy_s() calls you mentioned. Verify that the 'number of elements' argument has the correct value. I recently fixed a buffer overrun caused by incorrect usage of wcscpy_s(). Here is an example:
const int offset = 10;
wchar_t buff[MAXSIZE];
wcscpy_s(buff + offset, MAXSIZE, buff2);
Notice that buff + offset has MAXSIZE - offset elements, not MAXSIZE.
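Here is a minimal sketch of the corrected call (MAXSIZE, buff2, and the function name are stand-ins for whatever the real code uses):
#include <wchar.h>

#define MAXSIZE 260 // hypothetical capacity, for illustration only

void copy_at_offset(wchar_t (&buff)[MAXSIZE], const wchar_t *buff2, size_t offset)
{
    // Only MAXSIZE - offset elements remain past buff + offset,
    // so that is the capacity wcscpy_s must be told about.
    wcscpy_s(buff + offset, MAXSIZE - offset, buff2);
}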
I just had this problem a minute ago, and I was able to solve it. I searched the net first with no luck, but I got to this thread.
Anyway, I am running VS2005 and I have a multi-threaded program. I had to 'guess' which thread caused the problem, but luckily I only have a few.
So, what I did was run that thread through the debugger, stepping through the code in a high-level function. I noticed that the crash always occurred at the same place in the function, so then it was a matter of drilling down.
The other thing I would do is step through with the call-stack window open, making sure that the stack looked okay and noting when it went haywire.
I finally narrowed it down to the line that caused the bug - but it wasn't actually that line. It was the line before it.
So what was the cause for me? Well, in short, I tried to memcpy a NULL pointer into a valid area of memory.
I'm surprised VS2005 couldn't handle this.
Anyway, hope that helps. Good luck.
I assume you aren't able to reproduce this reliably.
I've successfully used Rational Purify to hunt down a variety of memory problems in the past, but it costs $ and I'm not sure how it would interact with MMC.
Unless there's some sort of built-in memory debugger you may have to try solving this programmatically. Are you able to remove/disable chunks of functionality to see if the problem manifests itself?
If you have "guesses" about where the problem occurs you can try disabling/changing that code as well. Even if you changed the copy functions to _s versions, you still need to be able to reliably handle truncated data.
I got this overrun when I wanted to increment the value pointed to by a pointer variable like this:
*out_BMask++;
instead of
(*out_BMask)++;
where out_BMask was declared as int *out_BMask. Since ++ binds tighter than the dereference, the first form increments the pointer itself rather than the value it points to.
If you did something like me, then I hope this helps you ;)

Crash within CString

I am observing a crash within my application; the call stack is shown below:
mfc42u!CString::AllocBeforeWrite+5
mfc42u!CString::operator=+22
No idea why this is occurring, and it does not occur frequently either.
Any suggestions would help. I have the crash dump with me but am not able to progress any further.
The operation I am performing is something like this:
iParseErr += m_RawMessage[wMsgLen-32] != NC_SP;
where m_RawMessage is a 512-length char array, wMsgLen is an unsigned short, and NC_SP is defined as
#define NC_SP 0x20 // Space
EDIT:
Call Stack:
042afe3c 5f8090dd mfc42u!CString::AllocBeforeWrite+0x5 * WARNING: Unable to verify checksum for WP Communications Server.exe
042afe50 0045f0c0 mfc42u!CString::operator=+0x22
042aff10 5f814d6b WP_Communications_Server!CParserN1000::iCheckMessage(void)+0x665 [V:\CSAC\SourceCode\WP Communications Server\HW Parser N1000.cpp # 1279]
042aff80 77c3a3b0 mfc42u!_AfxThreadEntry+0xe6
042affb4 7c80b729 msvcrt!_endthreadex+0xa9
042affec 00000000 kernel32!BaseThreadStart+0x37
Well, this is the complete call stack, and I have posted the code snippet in my original message.
Thanks
I have a suggestion that might be a little frustrating for you:
CString::AllocBeforeWrite indicates to me that the system is trying to allocate some memory.
Could it be that some other memory operation (especially freeing or resizing of memory) corrupted the heap earlier?
A typical problem with C/C++ memory management is that an error when freeing (or resizing) memory (for example, freeing the same chunk of memory twice) will not crash the program immediately but can cause crashes much later - especially when new memory is to be allocated.
Your situation looks quite like that to me.
The bad thing is: it can be very difficult to find the place where the real error occurs - where the heap is corrupted in the first place.
This can also be the reason why your problem only occurs once in a while. It could depend on some complicated situation beforehand.
I'm sure you'll have checked the obvious: wMsgLen >= 32
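To spell that out: wMsgLen is an unsigned short, so in wMsgLen - 32 it is promoted to int, and for wMsgLen < 32 the index is negative - an out-of-bounds access and undefined behavior. A hypothetical guard (the 512 matches the stated array length):
// Only index the array when the offset is actually inside it.
if (wMsgLen >= 32 && wMsgLen <= 512)
    iParseErr += m_RawMessage[wMsgLen - 32] != NC_SP;
else
    ++iParseErr; // treat an out-of-range length as a parse error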