Managing Heap Memory - C++

I'm writing special software for Windows that needs to use two simultaneous heaps, for a number of reasons.
I have read this article https://msdn.microsoft.com/en-us/library/ms810603 and it describes how new heaps can be created and used.
There is one heap, created by the runtime at process start, called the default heap, which is the one we all use in C/C++ when we call new and malloc.
The question I have is: is it possible to set a new heap as the default heap, so that all function calls that allocate memory use this heap instead of the original one, and then switch back to the original when I need to?
I know it looks tricky, but I need to deal with this to avoid some heap corruption during hardware interrupts.
Thanks in advance,
Martin
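One portable way to approximate "switching the default heap" is to replace the global operator new/delete and route allocations through a switch you control. This is only a sketch: the switch, the byte counter, and the malloc backing are all illustrative stand-ins; on Windows the switch could hold a HANDLE from HeapCreate, with the operators calling HeapAlloc/HeapFree on it.

```cpp
#include <cstdlib>
#include <new>

// Hypothetical switch: which "heap" should global allocations use?
// On Windows this could instead hold a HANDLE from HeapCreate, with the
// operators below calling HeapAlloc/HeapFree on that handle.
static bool use_second_heap = false;
static std::size_t second_heap_bytes = 0;   // crude accounting for the demo

void* operator new(std::size_t n) {
    // Both "heaps" are malloc-backed here to keep the sketch portable.
    void* p = std::malloc(n);
    if (!p) throw std::bad_alloc{};
    if (use_second_heap) second_heap_bytes += n;
    return p;
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
```

The hard part this sketch glosses over is deallocation: with real multiple heaps, each block must be freed back to the heap it came from (HeapFree must receive the same handle HeapAlloc was given), so each allocation needs some record of its origin. Also note that operator new can run before main, so any switch must be safe with respect to static initialization order.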

Related

When would a program create multiple heaps?

If and when does C++ trigger the allocation of a secondary heap, and are there any reasons someone would want to allocate more than one heap? Do any standard actions in C++, like creating a new namespace, trigger this? And how does memory handle multiple objects with the same name?
According to the Dartmouth_edu article in my comment above, there are quite a few times a program may utilize multiple heaps.
"IBM C and C++ Compilers lets you create and use your own pools of memory, called heaps. "
A good example: if you think an object may corrupt the heap, isolate it in its own heap.
If you allocate a whole heap for a multipart object you can just destroy the heap instead of having to free the memory of every component.
If you want to do fancy stuff like multithreading, you can also speed up memory access by allowing one thread to free memory from its heap while another is using its own separate heap.
Normal user actions do not create new heaps. Explicitly creating a new heap creates a new heap.
Namespaces are resolved at compile time through scoping and name mangling; they don't affect run-time memory layout, and two objects with the same name in different namespaces are simply distinct symbols. #parktomatomi thanks for the help.
C++ does not dictate the use of a heap, much less the use of a secondary heap. That is an implementation detail that is left up to each compiler to determine. As far as the language is concerned, variables can have dynamic storage duration, but the standard does not say how this is achieved.
In practice, all the compilers I know of do use heap memory for dynamic allocations. In theory, each allocation method (new vs. malloc) could have its own heap, but there is little reason to complicate memory management by introducing more heaps than necessary. Plus, you shouldn't mix allocation methods. The benefits of multiple heaps tend to depend on manual fine-tuning that is currently beyond the ken of compilers. (A programmer can implement multiple heaps, but that is not the same as "triggering" multiple heaps.)
Namespaces and object names are an unrelated subject, as those do not exist in an executable (unless retained as notes for a debugger).

Defragmentation of dynamically allocated memory in C++

How does the defragmentation of dynamically allocated memory (allocated using new or malloc) work in C++?
There is no defragmentation in the C++ heap because the application is free to keep pointers to allocated memory. Thus the heap manager cannot move memory around that is already allocated. The only "defragmentation" possible is if you free two adjacent blocks. Then the heap manager will combine the two blocks to a single larger free block that can be used for allocation again.
You might want to look into slab allocators. They won't be a silver bullet, but for specific problems they may relieve the pressure. In a past project of mine we wrote our own allocator; it was quite a complex affair, but it certainly managed to get a grip on the issue.
While I agree with the other answers in general, sometimes there is hope in specific use cases. Such as similar objects that may be tackled with pool allocators:
http://www.boost.org/doc/libs/1_53_0/libs/pool/doc/index.html
Also an interesting read are the allocators that come with boost interprocess: http://www.boost.org/doc/libs/1_53_0/doc/html/interprocess/allocators_containers.html#interprocess.allocators_containers.stl_allocators_adaptive
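To make the pool-allocator idea concrete, here is a minimal fixed-size pool sketch (independent of Boost; the class name, the 64-byte slot size, and the API are all illustrative choices). It carves one contiguous slab into equal slots and hands them out from a free list, so there is no external fragmentation among the slots:

```cpp
#include <cstddef>
#include <vector>

// Minimal fixed-size pool: one contiguous slab, equal slots, O(1) free list.
// All names and the 64-byte slot size are illustrative.
class FixedPool {
    union Slot {
        Slot* next;                                   // link while on free list
        alignas(std::max_align_t) unsigned char data[64];  // payload otherwise
    };
    std::vector<Slot> slab_;
    Slot* free_ = nullptr;
public:
    explicit FixedPool(std::size_t slots) : slab_(slots) {
        for (Slot& s : slab_) { s.next = free_; free_ = &s; }
    }
    void* alloc() {                 // O(1), no system call, no size search
        if (!free_) return nullptr;
        Slot* s = free_;
        free_ = s->next;
        return s;
    }
    void free(void* p) {            // returning a slot is also O(1)
        Slot* s = static_cast<Slot*>(p);
        s->next = free_;
        free_ = s;
    }
};
```

The design only works because every slot has the same size; that is exactly why "similar objects" are the use case where pools shine, as the answer above says. A production pool would also grow by adding slabs rather than returning nullptr when exhausted.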

Is there a way to get the range of memory addresses that are available on the heap?

It seems to me that this is how memory works in C++:
If you use new then you are asking the compiler's implementation to give you some memory (any memory) from the heap.
If you use the placement new syntax, then you are asking to construct an object at a specific memory location that you already know the address of (let's just assume it is also in the heap), which presumably was originally allocated by the new operator at some point. Placement new itself does not allocate anything; it only runs the constructor at that address.
My question is this:
Is there any way to know which memory locations are available to your program a priori (i.e. without allocating memory from the heap via the new operator first)?
Is the memory in the heap contiguous? If so, can you find out where it starts and where it ends?
p.s. Just trying to get as close to the metal as possible as fast as possible...
Not in any portable way. Modern operating systems tend to use paging (aka virtual memory) anyway, so that the amount of memory available is not a question that can be easily answered.
There is no requirement for the memory in the heap to be contiguous, if you need that you are going to have to write your own heap, which isn't so hard to do.
The memory available to your program "a priori" contains the variables you have defined. The compiler has calculated exactly how much the program needs. There is nothing "extra" you can use for something else.
New objects you need to create dynamically are allocated from the free store (aka heap), possibly by using new but more often by using containers from the library like std::vector.
The language standard says nothing about how this works in any detail, just how it can be used.
This is a very difficult question to answer. Modern operating systems have a subsystem called the memory manager. When your program executes the new operator, there are two options:
if there is enough memory available to the program, you get a pointer into your program's heap
if there isn't enough memory, execution passes to the operating system's memory manager, and it decides what to do: give more memory to your program (say, by resizing your heap), or refuse, in which case new throws an exception.
Is there any way to know which memory locations are available to your program a priori (i.e. without allocating memory from the heap via the new operator first)?
I want to emphasize that this depends on the version of the OS and on the environment.
Is the memory in the heap contiguous?
No, it may be non-contiguous.
The contiguity of addresses received from successive calls to new or malloc() isn't defined. The C runtime and operating system are free to return pointers willy-nilly all over the address space from successive news. (And in fact, they're likely to do so, since good allocators draw from different pools depending on the size of the allocation to reduce fragmentation, and those pools will be in different pages.)
However, bytes within a single allocation in new are guaranteed to be contiguous, so if you do
int *foo = new int[1024 * 1024];
you'll get a million contiguous ints.
If you really need a large, contiguous allocation, you'll probably need to use operating-system-specific functions to do so (unless someone has hidden this behind some Boost library I'm unaware of). On Windows, VirtualAlloc(). On POSIX, mmap().
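As a small sketch of the OS-level route mentioned above, here is what an anonymous mmap reservation looks like on POSIX systems (the Windows counterpart is VirtualAlloc; note that MAP_ANONYMOUS is widely supported on Linux/BSD/macOS but was not in older POSIX editions, and the function names here are just illustrative wrappers):

```cpp
#include <sys/mman.h>
#include <cstddef>

// Ask the OS directly for a large region that is contiguous in the
// process's virtual address space. Returns nullptr on failure.
void* reserve_region(std::size_t len) {
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

void release_region(void* p, std::size_t len) {
    munmap(p, len);
}
```

Keep in mind this gives you virtually contiguous pages; whether the backing physical frames are contiguous is invisible to (and irrelevant for) a user-mode program.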

How to implement a memory heap

Wasn't exactly sure how to phrase the title, but the question is:
I've heard of programmers allocating a large section of contiguous memory at the start of a program and then dealing it out as necessary. This is in contrast to simply going to the OS every time memory is needed.
I've heard that this would be faster because it would avoid the cost of asking the OS for contiguous blocks of memory constantly.
I believe the JVM does just this, maintaining its own section of memory and then allocating objects from that.
My question is, how would one actually implement this?
Most C and C++ compilers already provide a heap memory-manager as part of the standard library, so you don't need to do anything at all in order to avoid hitting the OS with every request.
If you want to improve performance, there are a number of improved allocators around that you can simply link with and go. e.g. Hoard, which wheaties mentioned in a now-deleted answer (which actually was quite good -- wheaties, why'd you delete it?).
If you want to write your own heap manager as a learning exercise, here are the basic things it needs to do:
Request a big block of memory from the OS
Keep a linked list of the free blocks
When an allocation request comes in:
search the list for a block that's big enough for the requested size plus some bookkeeping variables stored alongside it.
split off a big enough chunk of the block for the current request, put the rest back in the free list
if no block is big enough, go back to the OS and ask for another big chunk
When a deallocation request comes in:
read the header to find out the size
add the newly freed block onto the free list
optionally, see if the memory immediately following is also listed on the free list, and combine both adjacent blocks into one bigger one (called coalescing the heap)
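The steps above can be sketched as a toy first-fit heap. This is strictly a learning-exercise illustration under simplifying assumptions: malloc stands in for the one-time OS request, alignment rounding is ignored, and only forward coalescing is shown (a real heap would also merge with the preceding block, typically via boundary tags):

```cpp
#include <cstddef>
#include <cstdlib>

// Each block carries a header recording its size and whether it is free;
// the payload follows immediately after the header.
struct Header { std::size_t size; bool free; Header* next; };

class ToyHeap {
    Header* head_;
public:
    explicit ToyHeap(std::size_t bytes) {       // step 1: one big block up front
        head_ = static_cast<Header*>(std::malloc(sizeof(Header) + bytes));
        *head_ = {bytes, true, nullptr};
    }
    ~ToyHeap() { std::free(head_); }

    void* alloc(std::size_t n) {
        for (Header* h = head_; h; h = h->next) {   // first-fit search
            if (!h->free || h->size < n) continue;
            // Split off what we need if the remainder can hold another
            // header plus a little payload.
            if (h->size >= n + sizeof(Header) + 16) {
                auto* rest = reinterpret_cast<Header*>(
                    reinterpret_cast<char*>(h + 1) + n);
                *rest = {h->size - n - sizeof(Header), true, h->next};
                h->next = rest;
                h->size = n;
            }
            h->free = false;
            return h + 1;                           // hand out the payload
        }
        return nullptr;   // a real heap would go back to the OS here
    }

    void free(void* p) {
        Header* h = reinterpret_cast<Header*>(p) - 1;  // read the header
        h->free = true;
        if (h->next && h->next->free) {                // coalesce forward
            h->size += sizeof(Header) + h->next->size;
            h->next = h->next->next;
        }
    }
};
```

Walking through the code against the list above: the constructor is the "big block" request, alloc is the search-and-split, and free is the header read plus coalescing.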
You allocate a chunk of memory at the beginning of the program, large enough to sustain its needs. Then you override new and/or malloc, and delete and/or free, to return memory from/to this buffer.
When implementing this kind of solution, you need to write your own allocator (to source from the chunk), and you may end up using more than one allocator, which is often why you allocate a memory pool in the first place.
The default memory allocator is a good all-around allocator, but it is not the best for all allocation needs. For example, if you know you'll be allocating a lot of objects of a particular size, you may define an allocator that allocates fixed-size buffers, and pre-allocate more than one to gain some efficiency.
Here is the classic allocator, and one of the best for non-multithreaded use:
http://gee.cs.oswego.edu/dl/html/malloc.html
You can learn a lot from reading the explanation of its design. The link to malloc.c in the article is rotted; it can now be found at http://gee.cs.oswego.edu/pub/misc/malloc.c.
With that said, unless your program has really unusual allocation patterns, it's probably a very bad idea to write your own allocator or use a custom one. Especially if you're trying to replace the system malloc, you risk all kinds of bugs and compatibility issues from different libraries (or standard library functions) getting linked to the "wrong version of malloc".
If you find yourself needing specialized allocation for just a few specific tasks, that can be done without replacing malloc. I would recommend looking up GNU obstack and object pools for fixed-sized objects. These cover a majority of the cases where specialized allocation might have real practical usefulness.
Yes, both the stdlib heap and the OS heap / virtual memory are pretty troublesome. OS calls are really slow, and the stdlib is faster but still has some "unnecessary" locks and checks, and adds significant overhead to allocated blocks (i.e. some memory is used for management, in addition to what you allocate).
In many cases it's possible to avoid dynamic allocation completely by using static structures instead. For example, sometimes it's better (safer, etc.) to define a 64k static buffer for a Unicode filename than to define a pointer/std::string and allocate it dynamically.
When the program has to allocate a lot of instances of the same structure, it's much faster to allocate large memory blocks and then just store the instances there (sequentially, or using a linked list of free nodes) - C++ has "placement new" for that.
In many cases, when working with variable-size objects, the set of possible sizes is actually very limited (e.g. something like 4+2*(1..256)), so it's possible to use a few pools like [3] without having to collect garbage, fill the gaps, etc.
It's common for a custom allocator for a specific task to be much faster than the one(s) from the standard library, and even faster than speed-optimized but overly universal implementations.
Modern CPUs/OSes support "large pages", which can significantly improve memory access speed when you explicitly work with large blocks - see http://7-max.com/
IBM developerWorks has a nice article about memory management, with an extensive resources section for further reading: Inside memory management.
Wikipedia has some good information as well: C dynamic memory allocation, Memory management.

When HeapCreate function is used or in what cases do you need a number of heaps?

The Windows API has a set of functions for heap creation and handling: HeapCreate, HeapAlloc, HeapDestroy, etc.
I wonder: what is the use of another heap in a program?
From a fragmentation point of view, you get external fragmentation, since memory is not reused among heaps. So even if low-fragmentation heaps are used, there is still fragmentation.
Memory management of additional heaps seems to be low-level, so they are not easy to use.
In addition, an additional heap can probably be emulated by allocating from the default heap and managing that memory yourself.
So what is the usage? Did you use it?
One use case might be a long-running complex process that does a lot of memory allocation and deallocation. If the user wants to interrupt the process, then an easy way to clean up the memory currently allocated might be to have everything on a private heap and then simply destroy the heap.
I have seen this technique used in an embedded system (which wasn't using Windows, so it didn't use those exact API functions). The custom memory allocator had a feature to "mark" a specific state of the heap and then "rewind" to that point if a process was aborted.
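The mark/rewind technique described above boils down to a bump allocator whose state is a single offset. A minimal sketch (class and method names are invented for the example; a real embedded allocator would add alignment and out-of-memory policy):

```cpp
#include <cstddef>
#include <vector>

// Bump allocator with mark/rewind: record the offset before starting
// cancellable work; on abort, reset the offset and everything allocated
// since the mark is reclaimed at once - no per-block frees needed.
class Arena {
    std::vector<unsigned char> buf_;
    std::size_t top_ = 0;
public:
    explicit Arena(std::size_t bytes) : buf_(bytes) {}
    void* alloc(std::size_t n) {
        if (top_ + n > buf_.size()) return nullptr;
        void* p = buf_.data() + top_;
        top_ += n;                     // bump; no per-block bookkeeping
        return p;
    }
    std::size_t mark() const { return top_; }   // snapshot the heap state
    void rewind(std::size_t m) { top_ = m; }    // "destroy" everything since
};
```

Destroying a whole private Windows heap with HeapDestroy achieves the same effect as rewinding to the mark taken at creation time.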
One reason that is only important in rare situations, but immensely important there: memory allocated by new/malloc isn't executable on modern Windows systems. Hence if you write for example a JIT, you will have to use HeapCreate with HEAP_CREATE_ENABLE_EXECUTE.
Use: very, very, very rarely.
Usage:
I once worked on a project that used heap management as a crude garbage collector (no destructors). There was a section of the code that went off and did some work without regard to memory management (using a separate heap). Then when it was done, we just destroyed that heap to reclaim all the memory.
One use is for fixed-size objects. If you need to do a lot of allocation/deallocation of objects that are all the same size (e.g. small message buffers), a private heap avoids fragmentation issues.
You might also dedicate a heap per thread - for locality of reference or to reduce locking (which is required when a heap is shared across threads).
One use case I see more often than not is in malware.
The malware would have a packed binary somewhere in its .rsrc section, allocate an executable private heap, and then run the code there. It's a very effective technique.
One usage not mentioned here is to avoid heap contention.
You could create a thread-local heap which is not thread-safe, by passing the HEAP_NO_SERIALIZE flag to HeapCreate.
Since only one thread can access the heap, no locks are required and contention is alleviated.
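A portable sketch of the same idea (the type and function names are invented; on Windows you would instead keep one HeapCreate(HEAP_NO_SERIALIZE, ...) handle per thread): give each thread its own arena via thread_local, so allocation takes no lock at all.

```cpp
#include <cstddef>

// Each thread gets a private, unsynchronized bump arena; since no other
// thread can touch it, alloc needs no locking and there is no contention.
struct LocalArena {
    static constexpr std::size_t kSize = 1 << 16;   // illustrative capacity
    unsigned char buf[kSize];
    std::size_t top = 0;
    void* alloc(std::size_t n) {
        if (top + n > kSize) return nullptr;
        void* p = buf + top;
        top += n;
        return p;
    }
};

thread_local LocalArena tl_arena;   // one private heap per thread

void* fast_alloc(std::size_t n) { return tl_arena.alloc(n); }
```

The trade-off mirrors the Windows one: memory handed out by one thread's arena must not be freed by another thread, which is exactly the discipline HEAP_NO_SERIALIZE requires.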