malloc() vs. HeapAlloc() - c++

What is the difference between malloc() and HeapAlloc()? As far as I understand malloc allocates memory from the heap, just as HeapAlloc, right?
So what is the difference?

Actually, malloc() (and the other C runtime heap functions) are module dependent, which means that if you call malloc() in code from one module (e.g., a DLL), then you should call free() within code of the same module, or you could suffer some pretty bad heap corruption (this has been well documented). Using HeapAlloc() with GetProcessHeap() instead of malloc(), including overloading the new and delete operators to do the same, allows you to pass dynamically allocated objects between modules without having to worry about memory corruption when memory is allocated in code of one module and freed in code of another after the pointer has been passed across the module boundary.
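By way of illustration, here is a minimal sketch of that pattern (not taken from any particular codebase): global new/delete overloads that always draw from the single shared process heap. Every module that exchanges pointers would need to link in these overloads, the array forms new[]/delete[] are omitted for brevity, and <windows.h> and <new> are assumed to be included:

void* operator new(size_t size)
{
    // Always allocate from the one process-wide heap, so a block can be
    // freed by any module that also routes delete through HeapFree().
    void* p = HeapAlloc(GetProcessHeap(), 0, size);
    if (!p) throw std::bad_alloc();
    return p;
}

void operator delete(void* p) noexcept
{
    if (p) HeapFree(GetProcessHeap(), 0, p);
}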

You are right that they both allocate memory from a heap. But there are differences:
malloc() is portable, part of the standard.
HeapAlloc() is not portable, it's a Windows API function.
It's quite possible that, on Windows, malloc would be implemented on top of HeapAlloc. I would expect malloc to be faster than HeapAlloc.
HeapAlloc has more flexibility than malloc. In particular it allows you to specify which heap you wish to allocate from. This caters for multiple heaps per process.
For almost all coding scenarios you would use malloc rather than HeapAlloc. Although since you tagged your question C++, I would expect you to be using new!

With Visual C++, the function malloc() or the operator new eventually calls HeapAlloc(). If you debug the code, you will find that the function _heap_alloc_base() (in the file malloc.c) executes return HeapAlloc(_crtheap, 0, size), where _crtheap is a global heap created with HeapCreate().
The function HeapAlloc() does a good job of minimizing memory overhead, with a minimum of 8 bytes of overhead per allocation. The largest I have seen is 15 bytes per allocation, for allocations ranging from 1 byte to 100,000 bytes. Larger blocks have larger overhead; however, as a percentage of the total allocated it remains less than 2.5% of the payload.
I cannot comment on performance because I have not benchmarked HeapAlloc() against a custom-made routine, but as far as memory overhead goes, HeapAlloc() is amazingly cheap.

malloc is a function in the C standard library (and also in the C++ standard library).
HeapAlloc is a Windows API function.
The latter lets you specify the heap to allocate from, which I imagine can be useful for avoiding serialization of allocation requests in different threads (note the HEAP_NO_SERIALIZE flag).
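For instance, a minimal sketch of a single-threaded private heap (the flags come from the documented HeapCreate/HeapAlloc API; the usage pattern itself is just illustrative):

#include <windows.h>

int main()
{
    // A growable private heap that skips internal locking; with
    // HEAP_NO_SERIALIZE it must only ever be touched by one thread.
    HANDLE heap = HeapCreate(HEAP_NO_SERIALIZE, 0, 0);
    void* p = HeapAlloc(heap, 0, 256);
    // ... use the 256-byte block ...
    HeapFree(heap, 0, p);
    HeapDestroy(heap);  // releases the whole heap in one call
    return 0;
}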

In systems where multiple DLLs may come and go (via LoadLibrary/FreeLibrary), and where memory may be allocated in one DLL but freed in another (see the previous answer), HeapAlloc and related functions seem to be the least common denominator for successful memory sharing.
Thread safe, and presumably highly optimized by PhDs galore, HeapAlloc appears to work in all kinds of situations where our not-so-shareable code using malloc/free would fail.
We are a C++ embedded shop, so we have overloaded operator new/delete across our system to use HeapAlloc(GetProcessHeap()), which can be stubbed (on target) or native (on Windows) for code portability.
So far no problems now that we have bypassed malloc/free, which are indisputably DLL-specific: a new "heap" for each DLL load.

Additionally, you can refer to:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366705(v=vs.85).aspx
It states that you can enable some features of the heap managed by the WinAPI memory allocator, e.g. HeapEnableTerminationOnCorruption.
As I understand it, this adds some basic heap-overflow protection, which may be considered an added value to your application in terms of security (e.g., as an app owner I would prefer to crash my app rather than execute arbitrary code).
It might also be useful in an early phase of development, so you can catch memory issues before going to production.
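For reference, opting in looks roughly like this; calling HeapSetInformation with a NULL handle applies the setting to all heaps of the process:

#include <windows.h>

int main()
{
    // If the heap manager detects corruption, terminate the process
    // immediately instead of risking arbitrary code execution.
    HeapSetInformation(NULL, HeapEnableTerminationOnCorruption, NULL, 0);
    // ... rest of the application ...
    return 0;
}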

malloc is a function exported by the C run-time library (CRT), which is compiler specific. The name of the CRT DLL changes from one Visual Studio version to another.
The HeapAlloc function is exported by kernel32.dll, which lives in the Windows system folder.

This is what MS has to say about it: http://msdn.microsoft.com/en-us/library/windows/desktop/aa366533(v=vs.85).aspx
One thing no one has mentioned thus far: "The malloc function has the disadvantage of being run-time dependent. The new operator has the disadvantage of being compiler dependent and language dependent."
Also, "HeapAlloc can be instructed to raise an exception if memory could not be allocated"
So if you want your program to run with any CRT, or perhaps with no CRT at all, you'd use HeapAlloc. Perhaps the only people who would do such a thing are malware writers. Another use might be if you are writing a very memory-intensive application, with specific memory allocation/usage patterns, where you'd rather write your own heap allocator than use the CRT's.

Related

What Alloc API may call VirtualAlloc/reserve memory internally?

I am debugging a potential memory leak problem in a debug DLL.
The process runs a sub-test which loads/unloads a DLL dynamically; during the test a lot of memory is reserved and committed (1.3 GB). After the test is finished and the DLL unloaded, a massive amount of memory remains reserved (1.2 GB).
The reason I say this reserved memory is allocated by the DLL is that if I use a release DLL (nothing else changed, same test), reserved memory is ~300 MB, so all the additional reserved memory must be allocated in the debug DLL.
It looks like a lot of memory is committed during the test but only decommitted (not released to free status) after the test. So I want to track down who reserves/decommits that much memory. But the source code never calls VirtualAlloc, so my questions are:
Is VirtualAlloc the only way to reserve memory?
If not, what other APIs can do that? If so, what other APIs internally call VirtualAlloc? Quite a few people online say HeapAlloc internally calls VirtualAlloc. How does that work?
[Parts of this are purely implementation detail, and not things that your application should rely on, so take them only for informational purposes, not as official documentation or contract of any kind. That said, there is some value in understanding how things are implemented under the hood, if only for debugging purposes.]
Yes, the VirtualAlloc() function is the workhorse function for memory allocation in Windows. It is a low-level function, one that the operating system makes available to you if you need its features, but also one that the system uses internally. (To be precise, it probably doesn't call VirtualAlloc() directly, but rather an even lower level function that VirtualAlloc() also calls down to, like NtAllocateVirtualMemory(), but that's just semantics and doesn't change the observable behavior.)
Therefore, HeapAlloc() is built on top of VirtualAlloc(), as are GlobalAlloc() and LocalAlloc() (although the latter two became obsolete in 32-bit Windows and should basically never be used by applications—prefer explicitly calling HeapAlloc()).
Of course, HeapAlloc() is not just a simple wrapper around VirtualAlloc(). It adds some logic of its own. VirtualAlloc() always allocates memory in large chunks, defined by the system's allocation granularity, which is hardware-specific (retrievable by calling GetSystemInfo() and reading the value of SYSTEM_INFO.dwAllocationGranularity). HeapAlloc() allows you to allocate smaller chunks of memory at whatever granularity you need, which is much more suitable for typical application programming. Internally, HeapAlloc() handles calling VirtualAlloc() to obtain a large chunk, and then divvying it up as needed. This not only presents a simpler API, but is also more efficient.
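You can inspect both values on your own machine with a small program like this:

#include <windows.h>
#include <stdio.h>

int main()
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    // Typically 4096 and 65536 on x86/x64, but never hard-code these.
    printf("page size:              %lu\n", si.dwPageSize);
    printf("allocation granularity: %lu\n", si.dwAllocationGranularity);
    return 0;
}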
Note that the memory allocation functions provided by the C runtime library (CRT)—namely, C's malloc() and C++'s new operator—are a higher level yet. These are built on top of HeapAlloc() (at least in Microsoft's implementation of the CRT). Internally, they allocate a sizable chunk of memory that basically serves as a "master" block of memory for your application, and then divvy it up into smaller blocks upon request. As you free/delete those individual blocks, they are returned to the pool. Once again, this extra layer provides a simplified interface (and in particular, the ability to write platform-independent code), as well as increased efficiency in the general case.
Memory-mapped files and other functionality provided by various OS APIs are also built upon the virtual memory subsystem, and therefore internally call VirtualAlloc() (or a lower-level equivalent).
So yes, fundamentally, the lowest level memory allocation routine for a normal Windows application is VirtualAlloc(). But that doesn't mean it is the workhorse function that you should generally use for memory allocation. Only call VirtualAlloc() if you actually need its additional features. Otherwise, either use your standard library's memory allocation routines, or if you have some compelling reason to avoid them (like not linking to the CRT or creating your own custom memory pool), call HeapAlloc().
Note also that you must always free/release memory using the corresponding mechanism to the one you used to allocate the memory. Just because all memory allocation functions ultimately call VirtualAlloc() does not mean that you can free that memory by calling VirtualFree(). As discussed above, these other functions implement additional logic on top of VirtualAlloc(), and thus require that you call their own routines to free the memory. Only call VirtualFree() if you allocated the memory yourself via a call to VirtualAlloc(). If the memory was allocated with HeapAlloc(), call HeapFree(). For malloc(), call free(); for new, call delete.
As for the specific scenario described in your question, it is unclear to me why you are worrying about this. It is important to keep in mind the distinction between reserved memory and committed memory. Reserved simply means that this particular block in the address space has been reserved for use by the process. Reserved blocks cannot be used. In order to use a block of memory, it must be committed, which refers to the process of allocating a backing store for the memory, either in the page file or in physical memory. This is also sometimes known as mapping. Reserving and committing can be done as two separate steps, or they can be done at the same time. For example, you might want to reserve a contiguous address space for future use, but you don't actually need it yet, so you don't commit it. Memory that has been reserved but not committed is not actually allocated.
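A short sketch of reserving and committing as two separate steps (the sizes are chosen arbitrarily for illustration):

#include <windows.h>

int main()
{
    // Reserve 1 MB of address space; no storage is allocated yet, and
    // touching these pages would raise an access violation.
    char* base = (char*)VirtualAlloc(NULL, 1 << 20, MEM_RESERVE, PAGE_NOACCESS);

    // Later, commit just the first 64 KB when it is actually needed.
    VirtualAlloc(base, 1 << 16, MEM_COMMIT, PAGE_READWRITE);
    base[0] = 42;  // fine: this page is now committed

    VirtualFree(base, 0, MEM_RELEASE);  // release the whole reservation
    return 0;
}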
In fact, all of this reserved memory may not be a leak at all. A rather common strategy used in debugging is to reserve a specific range of memory addresses, without committing them, to trap attempts to access memory within this range with an "access violation" exception. The fact that your DLL is not making these large reservations when compiled in Release mode suggests that, indeed, this may be a debugging strategy. And it also suggests a better way of determining the source: rather than scanning through your code looking for all of the memory-allocation routines, scan your code looking for the conditional code that depends upon the build configuration. If you're doing something different when DEBUG or _DEBUG is defined, then that is probably where the magic is happening.
Another possible explanation is the CRT's implementation of malloc() or new. When you allocate a small chunk of memory (say, a few KB), the CRT will actually reserve a much larger block but only commit a chunk of the requested size. When you subsequently free/delete that small chunk of memory, it will be decommitted, but the larger block will not be released back to the OS. The reason for this is to allow future calls to malloc/new to re-use that reserved block of memory. If a subsequent request is for a larger block than can be satisfied by the currently reserved address space, it will reserve additional address space. If, in debugging builds, you are repeatedly allocating and freeing increasingly large chunks of memory, what you're seeing may be the result of memory fragmentation. But this is really not a problem, aside from a minor performance hit, which is really not worth worrying about in debugging builds.

How to make a simple tool to detect double free or memory overflow in Linux for a large project?

I have a large embedded project running Linux, with various processes and threads. I can't log all the malloc and new calls, as that would make the box (an embedded set-top box) sluggish. The sluggishness might itself cause a crash, because of mutex timeouts or other things. So I want to make a tool that can help debug memory issues such as memory overflow.
For example, suppose you malloc 4 bytes but write 8. This may corrupt the adjacent chunk of allocated data: the other chunk's header can be tampered with, so free() will fail or crash. How can I make a tool to detect such issues, and also a tool to track down memory leaks? I can't use Valgrind, as it slows down my STB. So I want to develop my own tool that can check for memory header corruption or memory leaks. Based on my choice, it should do either memory corruption checking or memory leak detection, and it should be lightweight.
Firstly there is probably no way to call this "simple".
Secondly if you are using C++ I highly suggest not using malloc/free but rather new/delete. The options for overriding those operators are much more flexible.
C++ does provide a number of tools to improve memory safety:
smart pointers (the performance cost really is worth the safety improvement)
Encapsulating things in classes. For example, if you use std::array::at(i) it will throw an exception if your access is out of bounds.
Lastly, proper use of asserts in your code can go a long way toward catching errors.
My point is merely that you should not depend on your debugging tools to negate the necessity of using good C++ programming methods.
OK, so next you need to override new and delete.
A Google search will provide many ways to do this.
For your problem it probably makes more sense to overload delete/new globally.
Buffer overflow detection
This is the first part of your problem.
What you need to do is allocate additional memory in your overloaded new so that there are guard regions before and after the user's memory, and then return only the centre part.
How big a buffer to use is your choice.
pseudo code:
void* operator new(size_t s)
{
    // Allocate guard zones before and after the payload.
    char* mem = static_cast<char*>(malloc(s + 2 * BUFFER));
    memset(mem, 0x5A, s + 2 * BUFFER);  // fill everything with a known pattern
    return mem + BUFFER;                // hand back only the centre part
}
At some stage in the future you need to check that the BUFFER regions still hold the value 0x5A. You should probably do this in the call to free(), but you can also have your own function to do this, which you call periodically. To speed this process up, use a function like memcmp.
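One possible shape for that check, reusing the BUFFER constant and allocation scheme from the sketch above; guardsIntact is a hypothetical helper, and a real implementation would record each block's requested size in a hidden header rather than take it as a parameter:

#include <cstddef>
#include <cstring>

// Returns true if the guard zones around a user block are still 0x5A.
// p is the pointer returned by the overloaded new; s is the requested size.
bool guardsIntact(const void* p, std::size_t s)
{
    unsigned char expected[BUFFER];
    std::memset(expected, 0x5A, BUFFER);

    const unsigned char* block = static_cast<const unsigned char*>(p);
    return std::memcmp(block - BUFFER, expected, BUFFER) == 0
        && std::memcmp(block + s, expected, BUFFER) == 0;
}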
Memory leak detection
Detecting memory leaks is not trivial.
Firstly, I suggest using stack-based objects whenever possible, to avoid allocating memory on the heap when it isn't needed.
The main question regarding memory leaks is knowing whether a certain memory block should have been deleted or not.
99% of your memory leak problems can probably be solved just by using smart pointers.
However one of the most difficult memory leaks to catch is that of a growing data structure. (say for example a linked list that grows slowly over time)
Firstly, in your overloaded new/malloc functions, keep a list of all memory currently allocated, and also a counter of the total amount of memory allocated.
Method 1: threshold detection:
Essentially, every time your program's memory usage exceeds a threshold amount, you report this and increase the threshold. If your program continues to exceed thresholds as it keeps running, something is wrong.
Method 2: Comparative analysis:
In pseudo code:
Value1 = currentAmountOfMemoryUsed;
runSomeCode();
if (currentAmountOfMemoryUsed != Value1) reportProblem();
Whether this is possible depends a lot on what happens in runSomeCode(), as some code can legitimately "save up" some memory for when it runs again later.
Method 3: Leak detection on program exit:
The premise is that if your code is 100% correctly written, every bit of memory allocated should have been freed by the time your program exits.
This method once again is not always possible, because perhaps your program needs to run indefinitely, or it might segfault because of your errors long before the check can run.
Compiler support
At a lower level, most compilers have some support for hooking into the whole memory management system, but the way to handle this is 100% compiler/platform specific, e.g. Visual Studio C++.
This is why I highly suggest not using malloc/free directly: it is problematic to debug this way, and it breaks the constructor/destructor design patterns of C++.
overriding malloc/free
There is however a more hands-on approach to overriding malloc/free.
That is by defining your own malloc/free functions.
Typically under debugging this will then use macros to include __FILE__ and __LINE__ in the call:
#ifndef NDEBUG
#define myMalloc(s) myMallocImplementation(s, __FILE__, __LINE__)
#else
#define myMalloc(s) malloc(s)
#endif
This allows your malloc implementation to record the source location where each block was allocated (a sketch of such an implementation follows). This approach will, however, not catch malloc/free usage within libraries you are using.
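A sketch of what that implementation might look like; myMallocImplementation and AllocHeader are hypothetical names, and a real version would also link each header into a global list of live blocks for leak reports:

#include <cstdlib>

struct AllocHeader {
    const char* file;    // source file of the allocation
    int         line;    // source line of the allocation
    std::size_t size;    // requested size
};

void* myMallocImplementation(std::size_t s, const char* file, int line)
{
    // Prepend a small header recording where the block came from.
    AllocHeader* h =
        static_cast<AllocHeader*>(std::malloc(sizeof(AllocHeader) + s));
    if (!h) return nullptr;
    h->file = file;
    h->line = line;
    h->size = s;
    return h + 1;  // hand the caller the memory just past the header
}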
This is a bit harder to do with new/delete calls as it would normally require some amount of digging into the call-stack at run-time to find out who called your new() function and that again is fairly compiler specific.
Also see: MSDN blog article
Memory freezing
Given everything above, I'd also like to mention something that is very common in safety-critical code (as used in motor vehicles and/or airplanes etc.).
Outside of initialization, a safety-critical program is usually not allowed to use malloc/free/new/delete. All memory allocation must happen during initialization; once the program is up and running, malloc/free is frozen in some way (sketched below), and any call to malloc/free after that will cause an assert.
This can be quite a heavy limitation to work with in a C++ environment but it does make for very robust code.
Note this does nothing for buffer overflow access or invalid pointer access problems.
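A minimal sketch of the freezing idea, assuming the program overrides global new/delete; the flag and function names are made up for illustration:

#include <cassert>
#include <cstdlib>
#include <new>

static bool g_allocationsFrozen = false;  // set once initialization is done

void freezeAllocations() { g_allocationsFrozen = true; }

void* operator new(std::size_t s)
{
    assert(!g_allocationsFrozen && "allocation after initialization phase");
    void* p = std::malloc(s);
    if (!p) throw std::bad_alloc();
    return p;
}

void operator delete(void* p) noexcept
{
    assert(!g_allocationsFrozen && "deallocation after initialization phase");
    std::free(p);
}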

What are the Windows and Linux native OS/system calls made from malloc()?

I recently saw the following post:
A memory allocator isn't lower level than malloc. (The default allocator typically calls malloc directly or indirectly.)
An allocator just allows you to specify different allocation strategies. For example, you might use an allocator which calls malloc once to retrieve a large pool of memory, and then for subsequent allocation requests, it just returns a small chunk of this pool. Or you may use it as a hook to allow you to perform some additional task every time memory is allocated or freed.
As to your second question, malloc is the lowest you can go without losing portability. malloc is typically implemented using some OS-specific memory allocation function, so that would be lower level still. But that's unrelated to your main question, since C++ allocators are a higher-level abstraction.
from: C++: Memory allocators
My question is- how is malloc implemented in the following Operating systems?
for Windows
for Linux
what are the OS-specific functions which are called/implementations of malloc()?
On Windows, in recent versions of MSVC, malloc (and C++ new, which is implemented using the same fundamentals for the actual memory allocation) calls HeapAlloc(). In other environments, such as g++ under MinGW, the C runtime is an older version which doesn't call HeapAlloc quite as directly, but at the base of it, it still goes to HeapAlloc. To find something different, we have to go back to pre-95 Windows, which had the GlobalAlloc and LocalAlloc sets of functions; but I don't think people use 16-bit compilers these days, at least not for Windows programming.
On Linux, if you are using glibc, it depends on the size of the allocation whether it calls sbrk or mmap: mmap (with MAP_ANONYMOUS in the flags) is used for larger allocations, over a threshold which defaults to 128 KiB in the typical implementation.
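That threshold is tunable via the glibc-specific mallopt interface; a small sketch (the 64 KiB value is arbitrary):

#include <malloc.h>  // glibc-specific: mallopt, M_MMAP_THRESHOLD
#include <stdlib.h>

int main()
{
    // Lower the size above which malloc switches from the sbrk-grown
    // main arena to per-allocation anonymous mmap.
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);

    void* small = malloc(4096);     // served from the main arena
    void* large = malloc(1 << 20);  // served by an anonymous mmap
    free(large);                    // munmapped immediately
    free(small);                    // kept in the arena for reuse
    return 0;
}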
My question is- how is malloc implemented in the following Operating systems?
On Linux there are two famous malloc implementations:
dlmalloc (Doug Lea's malloc)
ptmalloc
On Linux, C libraries like glibc, eglibc, or newlib implement ptmalloc or a variant of it.
what are the OS-specific functions which are called/implementations of malloc()?
On Unix and Linux systems sbrk and mmap system calls are used. See man 2 sbrk and man 2 mmap for more information.
Alright, I am not sure about Linux, but when it comes to Windows...
Memory can be allocated from two categories of places:
1) Heaps (Process Heap, Custom Created Heaps) see -> http://msdn.microsoft.com/en-us/library/windows/desktop/aa366711(v=vs.85).aspx
using functions like HeapAlloc & HeapFree. LocalAlloc and LocalFree can be used as 'shortcuts' to HeapAlloc when you want to allocate in the default process heap.
2) Virtual Memory (usually only process-specific due to access restrictions in global virtual memory for security), using VirtualAlloc, VirtualFree. see -> http://msdn.microsoft.com/en-us/library/windows/desktop/aa366916(v=vs.85).aspx
To my knowledge, malloc uses the heap allocation functions on Windows.
I hope this helps.
malloc() and friends are considered part of the runtime system that comes with a compiler. So each compiler can and does use different OS calls to implement malloc.
As others have said, on Linux the options are sbrk() and mmap().
On Windows the options are HeapAlloc() and VirtualAlloc().
On Windows, malloc implementations will usually call the Win32 heap functions such as HeapCreate, HeapDestroy, HeapAlloc, and HeapFree. Those functions call the NTDLL user-mode heap manager located within ntdll.dll; those functions have RtlxxxHeap names: RtlAllocateHeap, RtlCreateHeap, etc.
In the end, system calls within the NtxxxVirtualMemory group are made: NtAllocateVirtualMemory, NtQueryVirtualMemory, NtFreeVirtualMemory.

How are heaps created in mixed language applications?

We have a front end written in Visual Basic 6.0 that calls several back end DLLs written in mixed C/C++. The problem is that each DLL appears to have its own heap and one of them isn’t big enough. The heap collides with the program stack when we’ve allocated enough memory.
Each DLL is written entirely in C, except for the basic DLL wrapper, which is written in C++. Each DLL has a handful of entry points. Each entry point immediately calls a C routine. We would like to increase the size of the heap in the DLL, but haven’t been able to figure out how to do that. I searched for guidance and found these MSDN articles:
http://msdn.microsoft.com/en-us/library/hh405351(v=VS.85).aspx
These articles are interesting but provide conflicting information. In our problem it appears that each DLL has its own heap. This matches the "Heaps: Pleasures and Pains" article, which says that the C Run-Time (CRT) library creates its own heap on startup. The "Managing Heap Memory" article says that the CRT library allocates out of the default process heap. The "Memory management options in Win32" article says the behavior depends on the version of the CRT library being used.
We've temporarily solved the problem by allocating memory from a private heap. However, in order to improve the structure of this very large, complex program, we want to switch from C with a thin C++ wrapper to real C++ with classes. We're pretty certain that the new and delete operators won't allocate memory from our private heap, and we're wondering how to control the size of the heap C++ uses to allocate objects in each DLL. The application needs to run on all versions of desktop Windows NT, from 2000 through 7.
The Question
Can anyone point us to definitive and correct documentation that explains how to control the size of the heap C++ uses to allocate objects?
Several people have asserted that stack corruption due to heap allocations overwriting the stack is impossible. Here is what we observed. The VB front end uses four DLLs that it loads dynamically. Each DLL is independent of the others and provides a handful of methods called by the front end. All the DLLs communicate via data structures written to files on disk. These data structures are all statically structured: they contain no pointers, just value types and fixed-size arrays of value types. The problem DLL is invoked by a single call where a file name is passed. It is designed to allocate about 20 MB of data structures required to complete its processing. It does a lot of calculation, writes the results to disk, releases the 20 MB of data structures, and returns an error code. The front end then unloads the DLL. While debugging the problem under discussion, we set a breakpoint at the beginning of the data structure allocation code, watched the memory values returned from the calloc calls, and compared them with the current stack pointer. We watched as the allocated blocks approached the stack. After the allocation was complete, the stack began to grow until it overlapped the heap. Eventually the calculations wrote into the heap and corrupted the stack. As the stack unwound, it tried to return to an invalid address and crashed with a segmentation fault.
Each of our DLLs is statically linked to the CRT, so each DLL has its own CRT heap and heap manager. Microsoft says in http://msdn.microsoft.com/en-us/library/ms235460(v=vs.80).aspx:
Each copy of the CRT library has a separate and distinct state. As such, CRT objects such as file handles, environment variables, and locales are only valid for the copy of the CRT where these objects are allocated or set. When a DLL and its users use different copies of the CRT library, you cannot pass these CRT objects across the DLL boundary and expect them to be picked up correctly on the other side.
Also, because each copy of the CRT library has its own heap manager, allocating memory in one CRT library and passing the pointer across a DLL boundary to be freed by a different copy of the CRT library is a potential cause for heap corruption.
We don't pass pointers between DLLs. We aren't experiencing heap corruption, we are experiencing stack corruption.
OK, the question is:
Can anyone point us to definitive and correct documentation that explains how to control the size of the heap C++ uses to allocate objects?
I am going to answer my own question. I got the answer from reading Raymond Chen's blog The Old New Thing, specifically the post "There's also a large object heap for unmanaged code, but it's inside the regular heap". In that article Raymond recommends Advanced Windows Debugging by Mario Hewardt and Daniel Pravat. This book has very specific information on both stack and heap corruption, which is what I wanted to know. As a plus, it provides all sorts of information about how to debug these problems.
Could you please elaborate on this statement of yours:
The heap collides with the program stack when we’ve allocated enough memory.
If we're talking about Windows (or any other mature platform), this should not be happening: the OS makes sure that stacks, heaps, mapped files and other objects never intersect.
Also:
Can anyone point us to definitive and correct documentation that explains how to control the size of the heap C++ uses to allocate objects?
The heap size is not fixed on Windows: it grows as the application uses more and more memory. It will grow until all available virtual memory space for the process is used. It is pretty easy to confirm this: just write a simple test app which keeps allocating memory and counts how much has been allocated. On default 32-bit Windows you'll reach almost 2 GB. Surely, initially the heap doesn't occupy all available space, so it must grow in the process.
Without many details about the "collision" it's hard to tell what's happening in your case. However, looking at the tags to this question prompts me to one possibility. It is possible (and happens quite often, unfortunately) that ownership of allocated memory areas is being passed between modules (DLLs in your case). Here's the scenario:
there are two DLLs: A and B. Both of them created their own heaps
the DLL A allocates an object in its heap and passes the pointer and ownership to B
the DLL B receives the pointer, uses the memory and deallocates the object
If the heaps are different, most heap managers will not check whether the memory region being deallocated actually belongs to them (mostly for performance reasons). So a manager would deallocate something which doesn't belong to it, and by doing so corrupt the other module's heap. This may (and often does) lead to a crash, but not always. Depending on your luck (and the particular heap manager implementation), this operation may change one of the heaps in such a way that the next allocation happens outside of the area where the heap is located.
This often happens when one module is managed code, while the other is native one. Since you have the VB6 tag in the question, I'd check if this is the case.
If the stack grows large enough to hit the heap, an undiagnosed stack overflow may be the problem: invalid data is passed that does not satisfy the exit condition of some recursion (loop detection not working or not existing) in the problem DLL, so that an infinite recursion consumes a ridiculously large amount of stack space. One would expect such a DLL to terminate with a stack overflow exception, but perhaps because of compiler/linker optimizations or large foreign heap sizes, it crashes elsewhere.
Heaps are created by the CRT. That is to say, the malloc heap is created by the CRT, and is unrelated to HeapCreate(). It's not used for large allocations, though, which are handed off to the OS directly.
With multiple DLLs, you might have multiple heaps (newer VC versions are better at sharing, but even VC6 had no problem if you used MSVCRT.DLL - that's shared)
The stack, on the other hand, is managed by the OS. Here you see why multiple heaps don't matter: The OS allocation for the different heaps will never collide with the OS allocation for the stack.
Mind you, the OS may allocate heap space close to the stack. The rule is just no overlap, after all, there's no guaranteed "unused separation zone". If you then have a buffer overflow, it could very well overflow into the stack space.
So, any solutions? Yes: move to VC2010. It has buffer security checks, implemented in quite an efficient way. They're the default even in release mode.

How are malloc and free implemented?

I want to implement my own dynamic memory management system in order to add new features that help to manage memory in C++.
I use Windows (XP) and Linux (Ubuntu).
What is needed to implement functions like 'malloc' and 'free'?
I think that I have to use the lowest-level system calls.
For Windows, I have found the functions: GetProcessHeap, HeapAlloc, HeapCreate, HeapDestroy and HeapFree.
For Linux, I have not found any system calls for heap management. On Linux, malloc and free are system calls, aren't they?
Thanks
Edit:
C++ does not provide a garbage collector, and garbage collection is slow. Some allocations are easy to free, but there are allocations that need a garbage collector.
I want to implement these functions and add new features:
* Whenever free() is called, check whether the pointer belongs to a heap.
* Help with garbage collection. I have to store some information about the allocated block.
* Use multiple heaps (HeapCreate/HeapDestroy on Windows). I can delete an entire heap with its allocated blocks quickly.
On Linux, malloc and free are not system calls. malloc/free obtains memory from the kernel by extending and shrinking (if it can) the data segment using the brk system call, as well as obtaining anonymous memory with mmap, and malloc manages memory within those regions. Some basic information and many great references can be found here
In *nix, malloc() is implemented at the C library level. It uses brk()/sbrk() to grow/shrink the data segment, and mmap/munmap to request/release memory mappings. See this page for a description of the malloc implementation used in glibc and uClibc.
If you are simply wrapping the system calls, then you are probably not gaining anything over the standard malloc: that's all it is doing.
It's more common to malloc (or HeapAlloc(), etc.) a single block of memory at the start of the program and manage allocation within it yourself; this can be more efficient if you know you are going to be creating/discarding a lot of small blocks of memory regularly (a sketch of this follows).
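A minimal sketch of that pattern; BumpPool is a made-up name, error handling is omitted, and individual frees are deliberately unsupported since the whole pool is released at once:

#include <cstddef>
#include <cstdlib>

class BumpPool {
public:
    explicit BumpPool(std::size_t bytes)
        : base_(static_cast<char*>(std::malloc(bytes))),
          next_(base_), end_(base_ + bytes) {}
    ~BumpPool() { std::free(base_); }  // everything goes away together

    void* allocate(std::size_t s)
    {
        s = (s + 7) & ~std::size_t(7);         // keep 8-byte alignment
        if (next_ + s > end_) return nullptr;  // pool exhausted
        char* p = next_;
        next_ += s;
        return p;
    }

private:
    char* base_;
    char* next_;
    char* end_;
};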
brk is the system call used on Linux to implement malloc and free. Try the man page for information.
You've got the Windows stuff down already.
Seeing the other answers here, I would like to note that you are probably reinventing the wheel; there are many good malloc implementations out there already. But programming malloc is a good thought exercise - take a look here for a nice homework assignment (originally CMU code) implementing the same. Their shell gives you a bit more than the Linux OS actually does, though :-).
garbage collector is slow
This is a completely meaningless statement. In many practical situations, programs can get a significant performance boost by using a Garbage Collector, especially in multi-threaded scenarios. In many other situations, Garbage Collectors do incur a performance penalty.
Try http://www.dent.med.uni-muenchen.de/~wmglo/malloc-slides.html for pointers.
This is a brief performance comparison, with pointers to eight different malloc/free implementations. A nice starting point, because a few good reference statistics will help you determine whether you've improved on the available implementations - or not.