How do I count which function requests what number of bytes? - c++

I have a complex code base in C++. I have run a memory profiler that counts the number of bytes allocated by malloc; this gives me X bytes. Theoretically, my code should request only X-Y bytes (Y varies with the input and ranges from a few KB to a couple of GB, so it is not negligible).
I need to find out which part of my code is asking for the extra bytes. I've tried a few tools, but to no avail: massif, perf, I've even tried gdb breaking on malloc(). I could probably write a wrapper for malloc asking to provide the calling function, but I don't know how to do that.
Does anyone know a way to find how much memory different parts of the program are asking for?

If you use a custom allocation function - a wrapper around malloc - you can use the glibc backtrace functions (http://man7.org/linux/man-pages/man3/backtrace.3.html) to find out which functions call malloc and with what arguments.
That'll tell you the functions which are allocating. From there you can probably sort the biggies into domains by hand.
This question has good info on the wrapping itself: "Create a wrapper function for malloc and free in C".
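As a rough illustration of the wrapper-plus-backtrace idea, here is a minimal sketch. The names traced_malloc, g_bytes_by_caller and total_traced_bytes are made up for this example, it assumes glibc (for backtrace()) and GCC or Clang (for the noinline attribute), and it only sees allocations that go through the wrapper explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <execinfo.h>   // backtrace(); glibc-specific
#include <map>

// Hypothetical bookkeeping: bytes requested, keyed by the caller's return address.
static std::map<void*, std::size_t> g_bytes_by_caller;

// Hypothetical wrapper: call this instead of malloc() in the code under study.
__attribute__((noinline)) void* traced_malloc(std::size_t size) {
    void* frames[2];
    const int n = backtrace(frames, 2);
    // frames[0] lies inside traced_malloc itself, frames[1] in whoever called it.
    void* caller = (n == 2) ? frames[1] : nullptr;
    g_bytes_by_caller[caller] += size;
    return std::malloc(size);
}

// Sum of all bytes requested through the wrapper so far.
std::size_t total_traced_bytes() {
    std::size_t total = 0;
    for (const auto& entry : g_bytes_by_caller) total += entry.second;
    return total;
}
```

The recorded addresses can be turned into function names afterwards, for example with backtrace_symbols() or addr2line, and then grouped by hand as described above.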
Update:
This won't catch new/delete allocations, but overriding them is even easier than malloc! See "How to properly replace global new & delete operators", plus the very important comment on the best answer: "Don't forget the other 3 versions: new[], delete[], nothrow".
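A minimal sketch of what replacing the global operators could look like if all you want is a byte count. The counter name g_new_bytes is invented for this example. The nothrow forms are not replaced here because the default ones are specified to forward to the plain versions, and aligned (C++17) overloads are left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter: total bytes requested through operator new.
static std::atomic<std::size_t> g_new_bytes{0};

void* operator new(std::size_t size) {
    g_new_bytes += size;
    // malloc(0) may return NULL, so always ask for at least one byte.
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}
void* operator new[](std::size_t size) { return ::operator new(size); }

void operator delete(void* p) noexcept { std::free(p); }
void operator delete[](void* p) noexcept { std::free(p); }
// Sized deallocation (C++14) ends up in the same place.
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }
```

To attribute the bytes to call sites, the same backtrace trick as for malloc can be applied inside operator new, as long as the bookkeeping itself does not allocate through new.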

You can make a macro that calls the libc malloc and prints the details of the allocation.
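#define _GNU_SOURCE /* for RTLD_NEXT; must come before any include */
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
/* Define the macro after the standard headers so their declarations stay intact. */
#define malloc( sz ) \
({ \
size_t sz_ = (sz); /* evaluate the argument only once */ \
printf( "Allocating %zu Bytes, File %s:%d\n", sz_, __FILE__, __LINE__ ); \
void *(*libc_malloc)(size_t) = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc"); \
void *mem = libc_malloc(sz_); \
mem; /* value of the GCC-specific statement expression */ \
})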
This should ( touch wood ) get called in lieu of the real malloc in every source file that sees the macro, and print the number of bytes allocated and where the allocation occurred. The ({ ... }) statement expression that yields mem is a GCC extension, though.

Related

Memory consumption by printf( )

Does displaying a simple statement in C (or C++) occupy some memory?
For example,
//in C
printf("\nHello World");
//in C++
cout<<"Hello World" ;
and, will it make a difference if I attach some value of a variable to be displayed in the same statement?
For example,
printf("Value is %d" , var) ;
Code occupies memory. String literals occupy memory. Function calls (usually) use some stack.
Generally speaking I don't think printf should need to perform any dynamic memory allocations in order to work. But although (I believe) it's possible to avoid it I don't think they're forbidden from doing so. The same goes for cout << when outputting the types that have built-in support. If it ends up calling a user-defined overload then that can use whatever memory it likes.
POSIX lists ENOMEM as a possible error for printf but not for snprintf. This suggests that on POSIX systems (which of course is not all C implementations) the output might dynamically allocate memory, but the formatting will not.
Does displaying a simple statement in C(or C++) occupies some memory?
Yes, of course. The string constant has to be stored somewhere, usually in a read-only segment of memory. The printf and cout facilities also take up space.
will it make a difference if i attach some value of a variable to be displayed in the same statement?
Yes. The parameters have to be stored somewhere, usually on the stack, so the memory used by the parameters will be returned after the printf or cout call ends. Also the call itself probably generates a few more instructions in the calling routine to push the parameters on the stack.
Also note that printf and cout are buffered. A normal exit flushes the buffers, but if the program terminates abnormally (a crash, abort() or _exit()) before the buffer has been flushed, the printed string may never show.
Well, yes. The characters in the string require memory, so the string is placed in global (static) memory, not on the process heap. When the function is called, its arguments are passed through the stack, and then the call happens.
push offset string "hello, world"
call dword ptr printf
We push its address, obtained with the offset directive, so it uses sizeof(uintptr_t) bytes of stack memory.
[UPD.]
This code:
int val = 5;
printf("val = %d", val);
Disassembles to:
mov eax, dword ptr[val]
push eax
push offset string "val = %d"
call dword ptr printf
So it uses 2 * sizeof(uintptr_t) bytes of stack memory.
If you mean heap memory, then the answer is, yes (at least, in general).
We produced some middleware to run on a variety of game consoles, with GCC, Visual Studio and Metrowerks compilers. A common constraint in the gaming industry is that you may need to allocate to custom heaps -- so we had to ensure that our middleware library made no heap allocations (other than to a heap allocator explicitly provided to us). To ensure this, we had to:
drop calls to printf, vsprintf, ...
drop the use of C++ streams
both of these sets of calls (in general) use the heap. printf (in general) requires the heap because of the use of format strings.

Is it possible to protect a region of memory from WinAPI?

Having read this interesting article outlining a technique for debugging heap corruption, I started wondering how I could tweak it for my own needs. The basic idea is to provide a custom malloc() that allocates whole pages of memory, then to enable memory protection bits for those pages, so that the program crashes when they get written to and the offending write instruction can be caught in the act. The sample code is C under Linux (mprotect() is used to enable the protection), and I'm curious as to how to apply this to native C++ and Windows. VirtualAlloc() and/or VirtualProtect() look promising, but I'm not sure what a usage scenario would look like.
Fred *p = new Fred[100];
ProtectBuffer(p);
p[10] = Fred(); // like this to crash please
I am aware of the existence of specialized tools for debugging memory corruption in Windows, but I'm still curious if it would be possible to do it "manually" using this approach.
EDIT: Also, is this even a good idea under Windows, or just an entertaining intellectual exercise?
Yes, you can use VirtualAlloc and VirtualProtect to set up sections of memory that are protected from read/write operations.
You would have to re-implement operator new and operator delete (and their [] relatives), such that your memory allocations are controlled by your code.
And bear in mind that it would only be on a per-page basis, and you would be using (at least) three pages worth of virtual memory per allocation - not a huge problem on a 64-bit system, but may cause problems if you have many allocations in a 32-bit system.
Roughly what you need to do (you should actually find the page-size for the build of Windows - I'm too lazy, so I'll use 4096 and 4095 to represent pagesize and pagesize-1 - you also will need to do more error checking than this code does!!!):
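#include <windows.h>
#include <new>
void *operator new(size_t size)
{
if (size == 0) size = 1; // VirtualAlloc rejects a zero-byte commit
// Round size up to whole pages + 2 pages extra (one guard page on each side).
size_t bigsize = (size + 2*4096 + 4095) & ~4095;
// Reserve "bigsize" bytes of address space with no access allowed.
void *addr = VirtualAlloc(NULL, bigsize, MEM_RESERVE, PAGE_NOACCESS);
if (!addr) throw std::bad_alloc();
// Skip the leading guard page and commit only the pages the caller will use.
addr = reinterpret_cast<void *>(reinterpret_cast<char *>(addr) + 4096);
void *new_addr = VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE);
if (!new_addr) throw std::bad_alloc();
return new_addr;
}
void operator delete(void *ptr) noexcept
{
if (!ptr) return;
// Step back over the leading guard page to the base of the reservation.
char *tmp = reinterpret_cast<char *>(ptr) - 4096;
// MEM_RELEASE needs the base address of the reservation and a size of 0.
VirtualFree(reinterpret_cast<void*>(tmp), 0, MEM_RELEASE);
}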
Something along those lines, as I said - I haven't tried compiling this code, as I only have a Windows VM, and I can't be bothered to download a compiler and see if it actually compiles. [I know the principle works, as we did something similar where I worked a few years back].
This is what Guard Pages are for (see this MSDN tutorial). They raise a special exception when the page is accessed for the first time, allowing you to do more than crash on the first invalid page access (and catch bad reads/writes as opposed to NULL pointers etc).
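For comparison, the Linux technique the question refers to (whole pages plus mprotect()) might look roughly like the sketch below. guarded_alloc, guarded_free and the two small helpers are hypothetical names, the code assumes a POSIX system with MAP_ANONYMOUS, and error handling is kept to a minimum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Layout of one allocation: [guard page][usable pages][guard page].
static std::size_t page_size() {
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}
static std::size_t round_up_to_page(std::size_t n) {
    const std::size_t ps = page_size();
    return (n + ps - 1) / ps * ps;
}

void* guarded_alloc(std::size_t size) {
    const std::size_t ps = page_size();
    const std::size_t usable = round_up_to_page(size ? size : 1);
    // Map everything as inaccessible first, then open up only the middle part.
    void* base = mmap(nullptr, usable + 2 * ps, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return nullptr;
    char* user = static_cast<char*>(base) + ps;
    if (mprotect(user, usable, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, usable + 2 * ps);
        return nullptr;
    }
    return user;
}

// The caller passes the same size it gave to guarded_alloc.
void guarded_free(void* ptr, std::size_t size) {
    const std::size_t ps = page_size();
    munmap(static_cast<char*>(ptr) - ps,
           round_up_to_page(size ? size : 1) + 2 * ps);
}
```

With this layout an underrun faults immediately, but an overrun only faults once it crosses the end of the last usable page. Placing the buffer at the end of the usable region instead would catch overruns earlier, at the cost of alignment.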

OSX lacks memalign

I'm working on a project in C and it requires memalign(). Really, posix_memalign() would do as well, but darwin/OSX lacks both of them.
What is a good solution to shoehorn-in memalign? I don't understand the licensing for posix-C code if I were to rip off memalign.c and put it in my project- I don't want any viral-type licensing LGPL-ing my whole project.
Mac OS X appears to be 16-byte mem aligned.
Quote from the website:
I had a hard time finding a definitive
statement on MacOS X memory alignment
so I did my own tests. On 10.4/intel,
both stack and heap memory is 16 byte
aligned. So people porting software
can stop looking for memalign() and
posix_memalign(). It’s not needed.
update: OSX now has posix_memalign()
Late to the party, but newer versions of OSX do have posix_memalign(). You might want this when aligning to page boundaries. For example:
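#include <stdlib.h>
#include <unistd.h> /* sysconf() */
/* handle_error() stands for your own error-reporting routine */
char *buffer;
long pagesize;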
pagesize = sysconf(_SC_PAGE_SIZE);
if (pagesize == -1) handle_error("sysconf");
if (posix_memalign((void **)&buffer, pagesize, 4 * pagesize) != 0) {
handle_error("posix_memalign");
}
One thing to note is that, unlike memalign(), posix_memalign() takes **buffer as an argument and returns an integer error code.
Should be easy enough to do yourself, no? Something like the following (not tested):
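#include <stdint.h> /* uintptr_t */
#include <stdlib.h>
/* align must be a power of two */
void *aligned_malloc( size_t size, size_t align )
{
void *mem = malloc( size + (align-1) + sizeof(void*) );
if (!mem) return NULL;
char *amem = ((char*)mem) + sizeof(void*);
/* round up to the next multiple of align; adds 0 if already aligned */
amem += (align - ((uintptr_t)amem & (align - 1))) & (align - 1);
/* remember the original pointer just below the aligned block */
((void**)amem)[-1] = mem;
return amem;
}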
void aligned_free( void *mem )
{
free( ((void**)mem)[-1] );
}
(thanks Jonathan Leffler)
Edit:
Regarding ripping off another memalign implementation, the problem with that is not licensing. Rather, you'd run into the difficulty that any good memalign implementation will be an integral part of the heap-manager codebase, not simply layered on top of malloc/free. So you'd have serious trouble transplanting it to a different heap-manager, especially when you have no access to its internals.
Why does the software you are porting need memalign() or posix_memalign()? Does it use it for alignments bigger than the 16-byte alignments referenced by austirg?
I see Mike F posted some code - it looks relatively neat, though I think the while loop may be sub-optimal (if the alignment required is 1KB, it could iterate quite a few times).
Doesn't:
amem += align - ((uintptr)amem & (align - 1));
get there in one operation?
Yes Mac OS X does have 16 Byte memory alignment in the ABI.
You should not need to use memalign(). If your alignment requirements are a factor of 16 then I would not implement it and maybe just add an assert.
From the macosx man pages:
The malloc(), calloc(), valloc(),
realloc(), and reallocf() functions
allocate memory. The allocated memory is aligned such that it can be
used for any data type, including AltiVec- and SSE-related types. The free()
function frees allocations that were created via the preceding allocation
functions.
If you need an arbitrarily aligned malloc, check out x264's malloc (common/common.c in the git repository), which has a custom memalign for systems without malloc.h. It's extremely trivial code, to the point where I would not even consider it copyrightable, but you should easily be able to implement your own after seeing it.
Of course, if you only need 16-byte alignment, as stated above, it's in the OS X ABI.
Might be worthwhile suggesting using Doug Lea's malloc in your code.
Thanks for the help, guys... helped in my case (OpenCascade src/Image/Image_PixMap.cxx, OSX10.5.8 PPC)
Combined with the answers above, this might save someone some digging around or instill hope if not particularly familiar with malloc, etc.:
The rather large project I'm building only had one reference to posix_memalign, and it turns out it was the result of a bunch of preprocessor conditions that didn't include OSX but DID include BORLANDC, which confirms what others suggested about it being safe to use malloc in some cases:
#if defined(_MSC_VER)
return (TypePtr )_aligned_malloc (theBytesCount, theAlign);
#elif (defined(__GNUC__) && __GNUC__ >= 4 && __GNUC_MINOR__ >= 1)
return (TypePtr ) _mm_malloc (theBytesCount, theAlign);
#elif defined(__BORLANDC__)
return (TypePtr ) malloc (theBytesCount);
#else
void* aPtr;
if (posix_memalign (&aPtr, theAlign, theBytesCount))
{
aPtr = NULL;
}
return (TypePtr )aPtr;
#endif
So, it could be as simple as just using malloc, as suggested by others.
e.g. here: moving __BORLANDC__ condition above __GNUC__ and adding APPLE:
#elif (defined(__BORLANDC__) || defined(__APPLE__)) //now above `__GNUC__`
NOTE: I did NOT check that BORLANDC uses 16-byte alignment like someone above stated OS X does. Nor did I verify that PPC OS X does. However, this usage suggests that this alignment isn't particularly important. (Here's hoping it works, and that it could be that easy for you searchers, as well!)

Using sprintf without a manually allocated buffer

In the application that I am working on, the logging facility makes use of sprintf to format the text that gets written to file. So, something like:
char buffer[512];
sprintf(buffer, ... );
This sometimes causes problems when the message that gets sent in becomes too big for the manually allocated buffer.
Is there a way to get sprintf behaviour without having to manually allocate memory like this?
EDIT: while sprintf is a C operation, I'm looking for C++ type solutions (if there are any!) for me to get this sort of behaviour...
You can use asprintf(3) (note: non-standard) which allocates the buffer for you so you don't need to pre-allocate it.
No you can't use sprintf() to allocate enough memory. Alternatives include:
use snprintf() to truncate the message - does not fully resolve your problem, but prevents the buffer overflow issue
double (or triple or ...) the buffer - unless you're in a constrained environment
use C++ std::string and ostringstream - but you'll lose the printf format, you'll have to use the << operator
use Boost Format that comes with a printf-like % operator
I don't know of a version that avoids allocation either, but C99 snprintf accepts a NULL pointer as the destination (with a size of 0) and returns the length the output would have had. Not very efficient, but this would give you the complete string (as long as enough memory is available) without risking overflow:
length = snprintf(NULL, 0, ...);
str = malloc(length+1);
snprintf(str, length+1, ...);
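Since the question asks for a C++-flavoured solution, the same two-pass idea can be wrapped so that the result lands in a std::string. This is only a sketch: format_string is a made-up helper name, and it assumes a C++11 library where vsnprintf follows the C99 behaviour of returning the required length.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: printf-style formatting into a std::string.
std::string format_string(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);  // vsnprintf consumes the list, so keep a copy

    // Pass 1: measure. With a NULL buffer and size 0 nothing is written.
    const int length = std::vsnprintf(nullptr, 0, fmt, args);
    va_end(args);
    if (length < 0) {          // encoding or format error
        va_end(args_copy);
        return std::string();
    }

    // Pass 2: write into a buffer of exactly the right size (+1 for '\0').
    std::string result(static_cast<std::size_t>(length) + 1, '\0');
    std::vsnprintf(&result[0], result.size(), fmt, args_copy);
    va_end(args_copy);

    result.resize(static_cast<std::size_t>(length));  // drop the terminator
    return result;
}
```

A logging call would then look like format_string("%s: %d", name, value), with no fixed-size buffer anywhere in the caller.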
"the logging facility makes use of sprintf to format the text that gets written to file"
fprintf() does not impose any size limit. If you can write the text directly to file, do so!
I assume there is some intermediate processing step, however. If you know how much space you need, you can use malloc() to allocate that much space.
One technique at times like these is to allocate a reasonable-size buffer (that will be large enough 99% of the time) and if it's not big enough, break the data into chunks that you process one by one.
With the vanilla version of sprintf, there is no way to prevent the data from overwriting the passed in buffer. This is true regardless of whether the memory was manually allocated or allocated on the stack.
In order to prevent the buffer from being overwritten you'll need to use one of the more secure versions of sprintf like sprintf_s (Windows only)
http://msdn.microsoft.com/en-us/library/ybk95axf.aspx

memset() causing data abort

I'm getting some strange, intermittent data aborts (< 5% of the time) in some of my code when calling memset(). The problem is that it usually doesn't happen unless the code has been running for a couple of days, so it's hard to catch it in the act.
I'm using the following code:
char *msg = (char*)malloc(sizeof(char)*2048);
char *temp = (char*)malloc(sizeof(char)*1024);
memset(msg, 0, 2048);
memset(temp, 0, 1024);
char *tempstr = (char*)malloc(sizeof(char)*128);
sprintf(temp, "%s %s/%s %s%s", EZMPPOST, EZMPTAG, EZMPVER, TYPETXT, EOL);
strcat(msg, temp);
//Add Data
memset(tempstr, '\0', 128);
wcstombs(tempstr, gdevID, wcslen(gdevID));
sprintf(temp, "%s: %s%s", "DeviceID", tempstr, EOL);
strcat(msg, temp);
As you can see, I'm not trying to use memset with a size larger than what's originally allocated with malloc()
Anyone see what might be wrong with this?
malloc can return NULL if no memory is available. You're not checking for that.
There's a couple of things. You're using sprintf which is inherently unsafe; unless you're 100% positive that you're not going to exceed the size of the buffer, you should almost always prefer snprintf. The same applies to strcat; prefer the safer alternative strncat.
Obviously this may not fix anything, but it goes a long way in helping spot what might otherwise be very annoying to spot bugs.
malloc can return NULL if no memory is
available. You're not checking for
that.
Right you are... I didn't think about that as I was monitoring the memory and there was enough free. Is there any way for there to be available memory on the system but for malloc to fail?
Yes, if memory is fragmented. Also, when you say "monitoring memory," there may be something on the system which occasionally consumes a lot of memory and then releases it before you notice. If your call to malloc occurs then, there won't be any memory available. -- Joel
Either way...I will add that check :)
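wcstombs is passed wcslen(gdevID) as its limit rather than the size of the destination buffer, so it can, in theory, overflow the buffer (and it will not write a terminating '\0' if it stops at that limit).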
And why are you using sprintf with what I assume are constants? Just use:
EZMPPOST" " EZMPTAG "/" EZMPVER " " TYPETXT EOL
C and C++ combine adjacent string literals into a single string.
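A small sketch of that concatenation. The macro values below are placeholders chosen for this example only; the real definitions of EZMPPOST and friends are not shown in the question, and kHeader is a made-up name.

```cpp
#include <cassert>
#include <cstring>

// Placeholder values, assumed for illustration only.
#define EZMPPOST "POST"
#define EZMPTAG  "EZMP"
#define EZMPVER  "1.0"
#define TYPETXT  "text"
#define EOL      "\r\n"

// The compiler joins the adjacent literals at translation time,
// so no sprintf call and no temporary buffer are needed.
static const char kHeader[] = EZMPPOST " " EZMPTAG "/" EZMPVER " " TYPETXT EOL;
```

This only works when every piece is a string literal (or a macro expanding to one); a const char* variable cannot be joined this way.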
Have you tried using Valgrind? That is usually the fastest and easiest way to debug these sorts of errors. If you are reading or writing outside the bounds of allocated memory, it will flag it for you.
You're using sprintf which is
inherently unsafe; unless you're 100%
positive that you're not going to
exceed the size of the buffer, you
should almost always prefer snprintf.
The same applies to strcat; prefer the
safer alternative strncat.
Yeah..... I mostly do .NET lately and old habits die hard. I likely pulled that code out of something else that was written before my time...
But I'll try not to use those in the future ;)
You know it might not even be your code... Are there any other programs running that could have a memory leak?
It could be your processor. Some CPUs can't address single bytes, and require you to work in words or chunk sizes, or have instructions that can only be used on word or chunk aligned data.
Usually the compiler is made aware of these and works around them, but sometimes you can malloc a region as bytes, and then try to address it as a structure or wider-than-a-byte field, and the compiler won't catch it, but the processor will throw a data exception later.
It wouldn't happen unless you're using an unusual CPU. ARM9 will do that, for example, but i686 won't. I see it's tagged windows mobile, so maybe you do have this CPU issue.
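If alignment does turn out to be the culprit, one common way to read a wider-than-a-byte value out of a byte buffer is to copy it with memcpy rather than cast the pointer. A minimal sketch, where load_u32 is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value (in host byte order) from a possibly unaligned address.
// memcpy has no alignment requirement, whereas *(uint32_t*)bytes does.
std::uint32_t load_u32(const unsigned char* bytes) {
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

Compilers typically turn the memcpy into a single load on CPUs that allow unaligned access and into byte-wise loads on those that do not.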
Instead of doing malloc followed by memset, you should be using calloc which will clear the newly allocated memory for you. Other than that, do what Joel said.
NB borrowed some comments from other answers and integrated into a whole. The code is all mine...
Check your error codes. E.g. malloc can return NULL if no memory is available. This could be causing your data abort.
sizeof(char) is 1 by definition
Use snprintf not sprintf to avoid buffer overruns
If EZMPPOST etc are constants, then you don't need a format string; you can just combine several string literals as STRING1 " " STRING2 " " STRING3 and strcat the whole lot.
You are using much more memory than you need to.
With one minor change, you don't need to call memset in the first place. Nothing
really requires zero initialisation here.
This code does the same thing, safely, runs faster, and uses less memory.
// sizeof(char) is 1 by definition. This memory does not require zero
// initialisation. If it did, I'd use calloc.
const int max_msg = 2048;
char *msg = (char*)malloc(max_msg);
if(!msg)
{
// Allocation failure
return;
}
// Use snprintf instead of sprintf to avoid buffer overruns
// we write directly to msg, instead of using a temporary buffer and then calling
// strcat. This saves CPU time, saves the temporary buffer, and removes the need
// to zero initialise msg.
snprintf(msg, max_msg, "%s %s/%s %s%s", EZMPPOST, EZMPTAG, EZMPVER, TYPETXT, EOL);
//Add Data
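// Worst case: every wide character becomes MB_CUR_MAX bytes, plus the terminator
size_t len = wcslen(gdevID) * MB_CUR_MAX + 1;
// No need to zero init this
char* temp = (char*)malloc(len);
if(!temp)
{
free(msg);
return;
}
// The buffer is large enough, so wcstombs always writes the terminating '\0'
if(wcstombs(temp, gdevID, len) == (size_t)-1)
{
// gdevID contains a character that cannot be converted
free(temp);
free(msg);
return;
}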
// No need to use a temporary buffer - just append directly to the msg, protecting
// against buffer overruns.
snprintf(msg + strlen(msg),
max_msg - strlen(msg), "%s: %s%s", "DeviceID", temp, EOL);
free(temp);