"Private memory" not released after catching bad_alloc despite object being destructed - c++

An object tries to allocate more memory than the allowed virtual address space (2 GB on Win32). The std::bad_alloc is caught and the object is released. Process memory usage drops and the process is supposed to continue; however, any subsequent memory allocation fails with another std::bad_alloc. Checking the memory usage with VMMap showed that the heap memory appears to be released, but it is actually still marked as private, leaving no free space. The only option seems to be to quit and restart. I would understand a fragmentation problem, but why can't the process have the memory back after the release?
The object is a QList of QLists. The application is multithreaded. I could make a small reproducer, but I could reproduce the problem only once; most of the time the reproducer can reuse the memory that was freed.
Is Qt doing something sneaky? Or is Win32 perhaps delaying the release?

As I understand your problem, you are allocating large amounts of memory from the heap, and the allocation fails at some point. Releasing the memory back to the process heap does not necessarily mean that the heap manager actually frees the virtual pages that contain only free blocks of the heap (for performance reasons). So, if you try to allocate virtual memory directly (VirtualAlloc or VirtualAllocEx), the attempt fails, since nearly all memory is consumed by the heap manager, which has no way of knowing about your direct allocation attempt.
So, what can you possibly do about this? You can create your own heap (HeapCreate) and limit its maximum size. That may be quite tricky, since you would need to persuade Qt to use this heap.
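A minimal sketch of the idea (the 256 MB cap is an arbitrary illustration, and persuading Qt to route its allocations through this heap is the hard part left out here):

#include <windows.h>

int main()
{
    // Create a private heap. The second argument is the initial size, the
    // third the maximum size (0 would make it growable); here we cap it at
    // 256 MB. Note that in a fixed-size heap, individual allocations are
    // limited to slightly under 512 KB on 32-bit Windows.
    HANDLE heap = HeapCreate(0, 0, 256 * 1024 * 1024);
    if (!heap) return 1;

    void* p = HeapAlloc(heap, 0, 100 * 1024);  // allocate 100 KB from it
    if (p) HeapFree(heap, 0, p);

    // Destroying the heap returns all of its pages to the OS at once,
    // regardless of what was still allocated from it.
    HeapDestroy(heap);
    return 0;
}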
When allocating large amounts of memory, I recommend using VirtualAlloc rather than the heap functions. If the requested size is >= 512 KB, the heap manager actually uses VirtualAlloc to satisfy your request anyway. However, I don't know whether it actually releases the pages when you free the region, or whether it keeps them to satisfy other heap allocation requests.
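For illustration, a small sketch of such a direct allocation (the size is arbitrary):

#include <windows.h>

int main()
{
    const SIZE_T size = 8 * 1024 * 1024;  // 8 MB, well above the 512 KB threshold

    // Reserve and commit the region in one call, bypassing the heap manager.
    void* p = VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (!p) return 1;

    // ... use the buffer ...

    // MEM_RELEASE gives both the pages and the address range back to the OS.
    VirtualFree(p, 0, MEM_RELEASE);
    return 0;
}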

The answer by Martin Drab put me on the right path. Investigating about the heap allocations I found this old message that clarifies what is going on:
The issue here is that blocks over 512 KB are direct calls to VirtualAlloc, and everything smaller than this is allocated out of the heap segments. The bad news is that the segments are never released (entirely or partially), so once you take up the entire address space with small blocks, you cannot use it for other heaps or for blocks over 512 KB.
The problem is not Qt-related but Windows-related; I could finally reproduce it with a plain std::vector of char arrays. The default heap allocator leaves the address-space segments unaltered even after the corresponding allocations have been explicitly released. The rationale is that the process might ask for buffers of a similar size again, and the heap manager saves time by reusing existing address segments instead of compacting old ones to create new ones.
Please note this has nothing to do with the amount of physical or virtual memory available. It's only the address space that remains segmented, even though those segments are free. This is a serious problem on 32-bit architectures, where the address space is only 2 GB (3 GB with the appropriate boot option).
This is why the memory was marked as "private" even after being released, and was apparently not usable by the same process for average-sized mallocs, even though the committed memory was very low.
To reproduce the problem, just create a huge vector of chunks smaller than 512 KB (they must be allocated with new or malloc). After the memory is filled and then released (no matter whether the limit is reached and an exception caught, or the memory is simply filled with no error), the process won't be able to allocate anything bigger than 512 KB. The memory is free and it's assigned to the same process ("private"), but all the buckets are too small.
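A rough sketch of that reproducer, intended for a 32-bit build (the exact sizes and counts needed to exhaust the address space will vary):

#include <cstddef>
#include <cstdio>
#include <new>
#include <vector>

int main()
{
    std::vector<char*> chunks;
    const std::size_t small = 256 * 1024;   // below the 512 KB threshold

    // Fill the 32-bit address space with small heap blocks (~2 GB worth).
    try {
        for (int i = 0; i < 8000; ++i)
            chunks.push_back(new char[small]);
    } catch (const std::bad_alloc&) {
        std::printf("address space filled after %zu chunks\n", chunks.size());
    }

    // Release everything; committed memory drops...
    for (char* p : chunks) delete[] p;
    chunks.clear();

    // ...but a large allocation may still fail, because the heap keeps
    // the address-space segments it carved up for the small blocks.
    try {
        char* big = new char[1024 * 1024];  // > 512 KB
        std::puts("large allocation succeeded");
        delete[] big;
    } catch (const std::bad_alloc&) {
        std::puts("large allocation still fails");
    }
    return 0;
}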
But there is worse news: there is apparently no way to force a compaction of the heap segments. I tried with this and this but had no luck; there is no exact equivalent of POSIX fork() (see here and here). The only solution is to do something more low-level, like creating a private heap and destroying it after the small allocations (as suggested in the message cited above) or implementing a custom allocator (there might be some commercial solutions out there). Both are quite infeasible for large existing software, where the easiest solution is to close the process and restart it.

Related

What part of the process virtual memory does Windows Task Manager display

My question is a bit naive. I'd like an overview that is as simple as possible, and I couldn't find any resource that made it clear to me. I am a developer and I want to understand what exactly the memory displayed by default in the "memory" column of Windows Task Manager is:
To make things a bit simpler, let's forget about the memory the process shares with other processes, and imagine the shared memory is negligible. Also, I'm focused on the big picture and mainly care about things at the GB level.
As far as I know, the memory reserved by the process, called "virtual memory", is partly stored in main memory (RAM) and partly on disk. The system decides what goes where. The system basically keeps in RAM the parts of the virtual memory that are accessed sufficiently frequently by the process. A process can reserve more virtual memory than the RAM available in the computer.
From a developer point of view, the virtual memory may only be partially allocated by the program through its own memory manager (with malloc() or new X(), for example). I guess the system has no awareness of which part of the virtual memory is allocated, since this is handled by the process in a "private" way and depends on the language, runtime, compiler... Q: Is this correct?
My hypothesis is that the memory displayed by the Task Manager is essentially the part of the virtual memory stored in RAM by the system. Q: Is that correct? And is there a simple way to know the total virtual memory reserved by the process?
Memory on Windows is... extremely complicated, and asking "how much memory does my process use" is effectively a nonsensical question. To answer your questions, let's get a little background first.
Memory on windows is allocated via ptr = VirtualAlloc(..., MEM_RESERVE, ...) and committed later with VirtualAlloc(ptr+n, MEM_COMMIT, ...).
Any reserved memory just uses up address space and so isn't interesting: Windows will let you MEM_RESERVE terabytes of memory just fine. Committing the memory does use up resources, but not in the way you'd think. When you call commit, Windows does a few sums, basically works out (total physical RAM + total swap - current commit), and lets you allocate if there's enough free. BUT the Windows memory manager doesn't actually give you physical RAM until you actually use it.
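A small sketch of that reserve/commit distinction (the sizes are arbitrary):

#include <windows.h>

int main()
{
    // Reserve 1 GB of address space: no RAM or swap is consumed yet.
    char* base = (char*)VirtualAlloc(nullptr, 1u << 30, MEM_RESERVE, PAGE_NOACCESS);
    if (!base) return 1;

    // Commit the first 64 KB: this counts against the system commit limit,
    // but physical pages are handed over only on first touch.
    VirtualAlloc(base, 64 * 1024, MEM_COMMIT, PAGE_READWRITE);

    base[0] = 42;  // first touch: now a physical page is actually assigned

    VirtualFree(base, 0, MEM_RELEASE);  // give the whole region back
    return 0;
}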
Later, however, if Windows is tight on physical RAM, it'll swap some of your RAM out to disk (it may compress it, throw away unused pages, throw away anything directly mapped from a file, and apply other optimisations). This means your total commit and total physical RAM usage for your program may be wildly different. Both numbers are useful, depending on what you're measuring.
There's one last large caveat: memory that is shared. When you load DLLs, the code and the read-only memory [and maybe even the read/write section, but that is copy-on-write] can be shared with other programs. This means that your app requires that memory, but you cannot count it against just your app; after all, it can be shared, and so it doesn't take up as much physical memory as a naive count would suggest.
(If you are writing a game or similar you also need to count GPU memory but I'm no expert here)
All of the above goodness is normally wrapped up by the heap the application uses, and you see none of it: you just ask for memory and use it, and it's about as optimal as it can be.
You can see this by going to the Details tab and looking at the various columns: commit size and working set are really useful. If you just look at the main window in Task Manager, with its single value, I'd hope you understand now that a single value for memory used has to be some kind of compromise, as it's not a question that makes sense.
Now to answer your questions:
Firstly, the OS knows exactly how much memory your app has reserved and how much it has committed. What it doesn't know is whether the heap implementation you (or, more likely, the CRT) are using has kept around some freed memory that it hasn't released back to the operating system. Heaps often do this as an optimisation: asking the OS for memory and freeing it back to the OS is a fairly expensive operation (and can only be done in large chunks known as pages), so most heaps keep some around.
Second question: don't use that value; go to the Details tab and use the values there, as only you know what you actually want to ask.
EDIT:
For your comment: yes, but this depends on the size of the allocation. If you allocate a large block of memory (say >= 1 MB), the heap in the CRT generally defers the allocation directly to the operating system, and so freeing individual blocks will actually free them. For small allocations, the heap in the CRT asks the operating system for pages of memory and then subdivides them to hand out as allocations. So if you then free every other one of those, you'll be left with holes, and the heap cannot give those holes back to the OS, as the OS generally only works in whole pages. Anything you see in Task Manager will therefore show that all the memory is still used. Remember, this memory isn't lost or leaked; it's just effectively pooled and will be used again by allocations of that size. If you care about this memory, you can use the CRT heap statistics family of functions to keep an eye on it, specifically _CrtMemDumpStatistics.
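For illustration, a minimal sketch of those CRT statistics functions; this assumes an MSVC debug build (the calls compile away otherwise):

// MSVC debug builds only: these functions do nothing unless _DEBUG is
// defined and the debug CRT is linked.
#define _CRTDBG_MAP_ALLOC
#include <crtdbg.h>
#include <cstdlib>

int main()
{
    _CrtMemState before, after, diff;
    _CrtMemCheckpoint(&before);          // snapshot the heap

    void* p = std::malloc(100);          // an allocation to observe

    _CrtMemCheckpoint(&after);           // snapshot again
    if (_CrtMemDifference(&diff, &before, &after))
        _CrtMemDumpStatistics(&diff);    // block counts and byte totals per type

    std::free(p);
    return 0;
}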

Win7 C++ application always reserving at least 4k memory per allocation

I'm currently looking into memory consumption issues of a C++ application that I have written (a rendering engine using OpenGL) and have stumbled upon a rather unusual problem:
I'm using my own allocators basically everywhere in the system, which all obtain their memory from a default allocator which is using malloc()/free() for the actual memory.
It turns out that my application is always reserving at least 4096 bytes (the page size on my system) for every allocation through malloc(), even if the size is significantly smaller.
malloc(8) or even malloc(1) both result in an increase of memory of 4096 bytes. I'm tracking the used memory size through GetProcessMemoryInfo() directly before and after the allocation, as well as through the Task Manager (which basically shows the same values). Interestingly, using _msize(ptr) returns the correct size of the allocation.
I can only reproduce this behaviour within my own application; testing it with a new VS2012 C++ project did not yield the same results. This behaviour also seems independent of the currently reserved size of the application: even with more than 10 GB of free RAM it always reserves at least 4 KB per allocation.
I have no deep knowledge of the innards of the Windows operating system (if it is at all related to the OS), so if anyone has an idea what could cause this behaviour I would be grateful!
Check this, it's from 1993 :-)
http://msdn.microsoft.com/en-us/library/ms810603.aspx
This does not mean that the smallest amount of memory that can be allocated in a heap is 4096 bytes; rather, the heap manager commits pages of memory as needed to satisfy specific allocation requests. If, for example, an application allocates 100 bytes via a call to GlobalAlloc, the heap manager allocates a 100-byte chunk of memory within its committed region for this request. If there is not enough committed memory available at the time of the request, the heap manager simply commits another page to make the memory available.
You might be running with "full page heap", a diagnostic mode that helps catch memory access errors in your code more quickly (it is enabled and disabled per executable with the gflags tool).

What are the long term consequences of memory leaks?

Suppose I had a program like this:
int main(void)
{
    const int x = 100;      // some element count, for illustration
    int* arr = new int[x];
    //processing; neglect to call delete[]
    return 0;
}
In a trivial example such as this, I assume there is little actual harm in neglecting to free the memory allocated for arr, since it should be released by the OS when the program is finished running. For any non-trivial program, however, this is considered to be bad practice and will lead to memory leaks.
My question is, what are the consequences of memory leaks in a non-trivial program? I realize that memory leaks are bad practice, but I do not understand why they are bad and what trouble they cause.
A memory leak can diminish the performance of the computer by reducing the amount of available memory. Eventually, in the worst case, too much of the available memory may become allocated and all or part of the system or device stops working correctly, the application fails, or the system slows down unacceptably due to thrashing.
Memory leaks may not be serious or even detectable by normal means. In modern operating systems, normal memory used by an application is released when the application terminates. This means that a memory leak in a program that only runs for a short time may not be noticed and is rarely serious.
Much more serious leaks include those:
where the program runs for an extended time and consumes additional memory over time, such as background tasks on servers, but especially in embedded devices which may be left running for many years
where new memory is allocated frequently for one-time tasks, such as when rendering the frames of a computer game or animated video
where the program can request memory — such as shared memory — that is not released, even when the program terminates
where memory is very limited, such as in an embedded system or portable device
where the leak occurs within the operating system or memory manager
when a system device driver causes the leak
running on an operating system that does not automatically release memory on program termination. Often on such machines, if memory is lost, it can only be reclaimed by a reboot; an example of such a system is AmigaOS.
Check out here for more info.
There is an underlying assumption to your question:
The role of delete and delete[] is solely to release memory.
... and it is erroneous.
For better or worse, delete and delete[] have a dual role:
Run destructors
Free memory (by calling the right overload of operator delete)
With the corrected assumption, we can now ask the corrected question:
What is the risk in not calling delete/delete[] to end the lifetime of dynamically allocated variables?
As mentioned, an obvious risk is leaking memory (and ultimately crashing). However this is the least of your worries. The much bigger risk is undefined behavior, which means that:
the compiler may produce executable code that does not behave as expected: garbage in, garbage out.
in pragmatic terms, the most likely outcome is that the destructors are simply not run...
The latter is extremely worrisome:
Mutexes: Forget to release a lock and you get a deadlock...
File Descriptors: Some platforms (such as FreeBSD I believe) have a notoriously low default limit on the number of file descriptors a process may open; fail to close your file descriptors and you will not be able to open any new file or socket!
Sockets: on top of being a file descriptor, there is a limited range of ports associated with an IP (which with the latest versions of Linux is no longer global, yeah!). The absolute maximum is 65,536 (u16...) but the ephemeral port range is usually much smaller (half of it). If you forget to release connections in a timely fashion, you can easily end up in a situation where, even though you have plenty of bandwidth available, your server stops accepting new connections because there is no ephemeral port available.
...
The problem with the attitude of "well, I have enough memory anyway" is that memory is probably the least of your worries, simply because memory is probably the least scarce resource you manipulate.
Of course you could say: okay, I'll concentrate on the other resource leaks. But leak-detection tools nowadays report them all as memory leaks, so isolating that one leak among hundreds or thousands is like looking for a needle in a haystack...
Note: did I mention that you can still run out of memory? Whether on lower-end machines/systems or in restricted processes/virtual machines, memory can be quite tight for the task at hand.
Note: if you find yourself calling delete, you are doing it wrong. Learn to use the Standard Library: std::unique_ptr and containers such as std::vector. In C++, automatic memory management is easy; the real challenge is avoiding dangling pointers...
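For illustration, a minimal sketch of what that looks like (Widget is a stand-in type):

#include <memory>
#include <vector>

struct Widget { /* ... */ };

int main()
{
    // The destructor runs and the memory is freed automatically when
    // `w` goes out of scope, even if an exception is thrown.
    auto w = std::make_unique<Widget>();

    // A vector owns its elements; no new/delete anywhere.
    std::vector<int> arr(100);

    return 0;  // no delete needed, nothing leaks
}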
Let's say we have this program running:
while (true)
{
    int* arr = new int;
}
The short term problem is that your computer will eventually run out of memory and the program will crash.
Instead, we could have this program that would run forever because there is no memory leak:
while (true)
{
    int* arr = new int;
    delete arr;
}
When a simple program like this crashes, there are no long-term consequences, because the operating system frees the memory after the crash.
But you can imagine more critical systems where a system crash will have catastrophic consequences such as:
while (true)
{
    int* arr = new int;
    generateOxygenForAstronauts();
}
Think about the astronauts and free your memory!
A tool that runs for a short period of time and then exits can often get away with having memory leaks, as your example indicates. But a program that is expected to run without failure for long periods of time must be completely free of memory leaks. As others have mentioned, the whole system will bog down first. Additionally, code that leaks memory often is very bad at handling allocation failures - the result of a failed allocation is usually a crash and loss of data. From the user's perspective, this crash usually happens at exactly the worst possible moment (e.g. during file save, when file buffers get allocated).
Well, it is a strange question, since the immediate answer is straightforward: as you lose memory to leaks, you can and eventually will run out of memory. How big a problem that represents for a specific program depends on how big each leak is, how often the leaks occur, and for how long the program runs. That's all there is to it.
A program that allocates a relatively low amount of memory and/or is not run continuously might not suffer any problems from memory leaks at all. But a program that runs continuously will eventually run out of memory, even if it leaks very slowly.
Now, looking at it more closely, every block of memory has two sides to it: it occupies a region of addresses in the address space of the process, and it occupies a portion of the actual physical storage.
On a platform without virtual memory, both sides work against you. Once the memory block is leaked, you lose the address space and you lose the storage.
On a platform with virtual memory the actual storage is a practically unlimited resource. You can leak as much memory as you want, you will never run out of the actual storage (within practically reasonable limits, of course). A leaked memory block will eventually be pushed out to external storage and forgotten for good, so it will not directly affect the program in any negative way. However, it will still hold its region of address space. And the address space still remains a limited resource, which you can run out of.
One can say, that if we take an imaginary virtual memory platform with address space that is overwhelmingly larger than anything ever consumable by our process (say, 2048-bit platform and a typical text editor), then memory leaks will have no consequence for our program. But in real life memory leaks typically constitute a serious problem.
Nowadays compilers perform some optimisations on your code before generating the binary, so a single new without a matching delete may not do much harm.
But in general, for every "newing" you should "delete" the portion of memory you reserved in your program.
Also be aware that simply deleting doesn't guarantee never running out of memory.
There are various mechanisms on the OS side and the compiler side that influence this.
This link may help you a little
And this one too

Memory stability of a C++ application in Linux

I want to verify the memory stability of a C++ application I wrote and compiled for Linux.
It is a network application that responds to remote client connections at a rate of 10-20 connections per second.
In the long run, memory was rising to 50 MB, even though the app was making calls to delete...
Investigation shows that Linux does not immediately free memory. So here are my questions:
How can I force Linux to free the memory I actually freed? At least I want to do this once, to verify memory stability.
Otherwise, is there any reliable memory indicator that can report the memory my app is actually holding?
What you are seeing is most likely not a memory leak at all. Operating systems and malloc/new heaps both do very complex accounting of memory these days. This is, in general, a very good thing. Chances are any attempt on your part to force the OS to free the memory will only hurt both your application performance and overall system performance.
To illustrate:
The Heap reserves several areas of virtual memory for use. None of it is actually committed (backed by physical memory) until malloc'd.
You allocate memory. The Heap grows accordingly. You see this in task manager.
You allocate more memory on the Heap. It grows more.
You free the memory allocated in step 2. The heap cannot shrink, however, because the memory from step 3 is still allocated, and heaps are unable to compact memory (it would invalidate your pointers).
You malloc/new more stuff. This may get tacked on after the memory allocated in step 3, because it cannot fit in the area left open by freeing step 2, or because it would be inefficient for the heap manager to scour the heap for the block left open by step 2 (this depends on the heap implementation and the chunk size of memory being allocated/freed).
So is the memory from step 2 now dead to the world? Not necessarily. For one thing, it will probably get reused eventually, once it becomes efficient to do so. In cases where it isn't reused, the operating system itself may be able to use the CPU's virtual memory features (the TLB) to "remap" the unused memory right out from under your application and assign it to another application, on the fly. The heap is aware of this and usually manages things in a way that helps improve the OS's ability to remap pages.
These are valuable memory management techniques that have the unmitigated side effect of rendering fine-grained memory-leak detection via Process Explorer mostly useless. If you want to detect small memory leaks in the heap, then you'll need to use runtime heap leak-detection tools. Since you mentioned that you're able to build on Windows as well, I will note that Microsoft's CRT has adequate leak-checking tools built-in. Instructions for use found here:
http://msdn.microsoft.com/en-us/library/974tc9t1(v=vs.100).aspx
There are also open-source replacements for malloc available for use with GCC/Clang toolchains, though I have no direct experience with them. I think on Linux Valgrind is the preferred and more reliable method for leak-detection anyway. (and in my experience easier to use than MSVCRT Debug).
I would suggest using Valgrind with the Memcheck tool, or another profiling tool, to find memory leaks.
From Valgrind's page:
Memcheck
Memcheck detects memory-management problems, and is aimed primarily at C and C++ programs. When a program is run under Memcheck's supervision, all reads and writes of memory are checked, and calls to malloc/new/free/delete are intercepted. As a result, Memcheck can detect if your program:
Accesses memory it shouldn't (areas not yet allocated, areas that have been freed, areas past the end of heap blocks, inaccessible areas of the stack).
Uses uninitialised values in dangerous ways.
Leaks memory.
Does bad frees of heap blocks (double frees, mismatched frees).
Passes overlapping source and destination memory blocks to memcpy() and related functions.
Memcheck reports these errors as soon as they occur, giving the source line number at which they occurred, and also a stack trace of the functions called to reach that line. Memcheck tracks addressability at the byte level, and initialisation of values at the bit level. As a result, it can detect the use of single uninitialised bits, and does not report spurious errors on bitfield operations. Memcheck runs programs about 10-30x slower than normal.
Massif
Massif is a heap profiler. It performs detailed heap profiling by taking regular snapshots of a program's heap. It produces a graph showing heap usage over time, including information about which parts of the program are responsible for the most memory allocations. The graph is supplemented by a text or HTML file that includes more information for determining where the most memory is being allocated. Massif runs programs about 20x slower than normal.
Using Valgrind is as simple as running your application, with its desired switches, under it:
valgrind --tool=memcheck ./myapplication -f foo -b bar
I very much doubt that anything beyond wrapping malloc and free [or new and delete] with another function can actually get you anything other than very rough estimates.
One of the problems is that the memory that is freed can only be released if there is a long contiguous chunk of memory. What typically happens is that there are "little bits" of memory that are used all over the heap, and you can't find a large chunk that can be freed.
It's highly unlikely that you will be able to fix this in any simple way.
And by the way, your application is probably going to need those 50MB later on when you have more load again, so it's just wasted effort to free it.
(If the memory that you are not using is needed for something else, it will get swapped out; pages that aren't touched for a long time are prime candidates. So if the system runs low on memory for other tasks, it will still reuse the RAM in your machine for that space; it's not sitting there wasted. It's just that you can't use 'ps' or some such to figure out how much RAM your program uses!)
As suggested in a comment: you can also write your own memory allocator, using mmap() to create a "chunk" to dole out portions from. If you have a section of code that does a lot of memory allocations that will ALL definitely be freed later, allocate all of those from a separate lump of memory. When it's all been freed, you can put the mmap'd region back into a "free mmap list", and when the list grows sufficiently large, free up some of the mmap allocations (this is an attempt to avoid calling mmap LOTS of times and then munmap again a few milliseconds later). However, if you EVER let one of those allocations "escape" your fenced-in area, your application will probably crash (or worse, not crash, but use memory belonging to some other part of the application, and you get a very strange result somewhere, such as one user seeing network content that was supposed to be for another user!)
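A minimal sketch of that idea, assuming Linux and leaving out the "free mmap list" recycling described above (MmapArena and its members are invented names):

#include <sys/mman.h>
#include <cstddef>

// A bump allocator over one mmap'd chunk: individual frees are no-ops,
// and the whole region goes back to the kernel at once via munmap.
class MmapArena {
public:
    explicit MmapArena(std::size_t size) : size_(size), used_(0) {
        base_ = static_cast<char*>(mmap(nullptr, size, PROT_READ | PROT_WRITE,
                                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    }
    ~MmapArena() { if (base_ != MAP_FAILED) munmap(base_, size_); }

    void* alloc(std::size_t n) {
        n = (n + 15) & ~std::size_t(15);          // keep 16-byte alignment
        if (base_ == MAP_FAILED || used_ + n > size_) return nullptr;
        void* p = base_ + used_;
        used_ += n;
        return p;
    }

private:
    char* base_;
    std::size_t size_, used_;
};

int main()
{
    MmapArena arena(1 << 20);       // one 1 MB lump from the kernel
    void* a = arena.alloc(100);     // cheap allocations out of the lump
    void* b = arena.alloc(4096);
    (void)a; (void)b;
    return 0;                       // ~MmapArena munmaps everything at once
}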
Use valgrind to find memory leaks : valgrind ./your_application
It will list where you allocated memory and did not free it.
I don't think it's a Linux problem, but one in your application. If you monitor the memory usage with "top" you won't get very precise figures. Try using Massif (a Valgrind tool): valgrind --tool=massif ./your_application to find the real memory usage.
As a more general rule to avoid leaks in C++: use smart pointers instead of raw pointers.
Also in many situations, you can use RAII (http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization) instead of allocating memory with "new".
It is not typical for an OS to release memory when you call free or delete. This memory goes back to the heap manager in the runtime library.
If you want to actually release memory, you can use brk(). But that opens up a very large can of memory-management worms: if you call brk() directly, you had better not call malloc(). For C++, you can override new to use brk() directly.
Not an easy task.
The latest dlmalloc() has a concept called an mspace (others call it a region). You can call malloc() and free() against an mspace, or you can delete the mspace to free all the memory allocated from it at once. Deleting an mspace releases memory from the process.
If you create an mspace per connection, allocate all memory for the connection from that mspace, and delete the mspace when the connection closes, you would see no process growth.
If you have a pointer in one mspace pointing to memory in another mspace, and you delete the second mspace, then as the language lawyers say "the results are undefined".
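For illustration, a hedged sketch of that per-connection pattern; it assumes dlmalloc built with MSPACES enabled, and handle_connection is a stand-in:

// Assumes dlmalloc's malloc.c compiled with MSPACES=1; the header name
// may differ in your setup.
#include "malloc.h"

void handle_connection(/* connection args */)
{
    mspace msp = create_mspace(0, 0);   // default capacity, no locking

    void* buf = mspace_malloc(msp, 64 * 1024);
    // ... serve the connection, allocating only via mspace_malloc/mspace_free ...
    mspace_free(msp, buf);

    // Destroying the mspace returns all of its memory to the process at
    // once, including blocks that were never individually freed.
    destroy_mspace(msp);
}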

Dealing with fragmentation in a memory pool?

Suppose I have a memory pool object with a constructor that takes a pointer to a large chunk of memory ptr and a size N. If I do many random allocations and deallocations of various sizes, I can get the memory in such a state that I cannot allocate an M-byte object contiguously in memory, even though there may be a lot free! At the same time, I can't compact the memory, because that would leave dangling pointers in the consumers. How does one resolve fragmentation in this case?
I wanted to add my 2 cents only because no one else pointed out that, from your description, it sounds like you are implementing a standard heap allocator (i.e. what all of us already use every time we call malloc() or operator new).
A heap is exactly such an object: it goes to the virtual memory manager and asks for a large chunk of memory (what you call "a pool"). Then it uses all kinds of algorithms to deal with allocating and freeing chunks of various sizes in the most efficient way. Furthermore, many people have modified and optimized these algorithms over the years. For a long time, Windows came with an option called the low-fragmentation heap (LFH), which you had to enable manually. Starting with Vista, the LFH is used for all heaps by default.
Heaps are not perfect, and they can definitely bog down performance when not used properly. Since OS vendors can't possibly anticipate every scenario in which you will use a heap, their heap managers have to be optimized for "average" use. But if you have a requirement that is similar to the requirements of a regular heap (i.e. many objects, different sizes...), you should consider just using a heap and not reinventing it, because chances are your implementation will be inferior to what the OS already provides for you.
With memory allocation, the only time you can gain performance by not simply using the heap is by giving up some other aspect (allocation overhead, allocation lifetime....) which is not important to your specific application.
For example, in our application we had a requirement for many allocations of less than 1 KB, but these allocations were used only for very short periods of time (milliseconds). To optimize the app, I used the Boost Pool library, but extended it so that my "allocator" actually contained a collection of boost pool objects, each responsible for allocating one specific size from 16 bytes up to 1024 (in steps of 4). This provided almost free (O(1) complexity) allocation/free of these objects, but the catch is that (a) memory usage is always large and never goes down, even if we don't have a single object allocated, and (b) Boost Pool never frees the memory it uses (at least in the mode we are using it in), so we only use this for objects that don't stick around very long.
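A hypothetical sketch of that kind of size-binned wrapper around Boost Pool (BinnedAllocator is an invented name, and it uses steps of 16 rather than 4 to keep the example short):

#include <boost/pool/pool.hpp>
#include <memory>
#include <vector>

// A collection of fixed-size pools, one per size class: 16, 32, ..., 1024.
class BinnedAllocator {
public:
    BinnedAllocator() {
        for (std::size_t sz = 16; sz <= 1024; sz += 16)
            pools_.push_back(std::make_unique<boost::pool<>>(sz));
    }

    void* alloc(std::size_t n) {
        if (n == 0 || n > 1024) return nullptr;   // out of range: fall back elsewhere
        return pools_[(n - 1) / 16]->malloc();    // round up to the next class
    }

    // The caller must pass the same size it requested, so we can find the pool.
    void free(void* p, std::size_t n) {
        pools_[(n - 1) / 16]->free(p);
    }

private:
    std::vector<std::unique_ptr<boost::pool<>>> pools_;  // pools are non-copyable
};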
So which aspect(s) of normal memory allocation are you willing to give up in your app?
Depending on the system, there are a couple of ways to do it.
Try to avoid fragmentation in the first place: if you allocate blocks in powers of 2, you have less of a chance of causing this kind of fragmentation. There are a couple of other ways around it, but if you ever reach this state you effectively just OOM at that point, because there is no delicate way of handling it other than killing the process that asked for memory, blocking until you can allocate memory, or returning NULL as your allocation area.
Another way is to pass pointers to pointers to your data (e.g. int**). Then you can rearrange the memory beneath the program (in a thread-safe way, I hope) and compact the allocations so that you can allocate new blocks and still keep the data from old blocks (once the system gets to this state, though, that becomes a heavy overhead and should seldom be done).
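A hypothetical sketch of that approach, using integer handles instead of pointer-to-pointer (the same extra level of indirection); a real implementation would also need thread safety and handle-slot recycling:

#include <cstddef>
#include <cstring>
#include <vector>

// Callers keep a Handle (an index) and re-fetch the raw pointer via get()
// after any call that may compact; the pool is free to move blocks.
using Handle = std::size_t;
constexpr Handle kBadHandle = ~std::size_t(0);

class CompactingPool {
public:
    CompactingPool(char* mem, std::size_t cap) : mem_(mem), cap_(cap), top_(0) {}

    Handle alloc(std::size_t n) {
        if (top_ + n > cap_) compact();           // try to reclaim holes first
        if (top_ + n > cap_) return kBadHandle;   // genuinely out of space
        blocks_.push_back({top_, n, true});
        top_ += n;
        return blocks_.size() - 1;
    }

    void free(Handle h) { blocks_[h].live = false; }         // just leaves a hole
    char* get(Handle h) { return mem_ + blocks_[h].offset; } // never cache this!

private:
    struct Block { std::size_t offset, size; bool live; };

    // Slide live blocks down over the holes. Handles stay valid because
    // only the recorded offsets change, not the indices.
    void compact() {
        std::size_t dst = 0;
        for (Block& b : blocks_) {
            if (!b.live) continue;
            std::memmove(mem_ + dst, mem_ + b.offset, b.size);
            b.offset = dst;
            dst += b.size;
        }
        top_ = dst;
    }

    char* mem_;
    std::size_t cap_, top_;
    std::vector<Block> blocks_;
};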
There are also ways of "binning" memory so that you have contiguous pages: for instance, dedicate one page only to allocations of 512 bytes and less, another to 1024 bytes and less, etc. This makes it easier to decide which bin to use, and in the worst case you split from the next-highest bin or merge from a lower bin, which reduces the chance of fragmenting across multiple pages.
Implementing object pools for the objects that you frequently allocate will drive fragmentation down considerably without the need to change your memory allocator.
It would be helpful to know more exactly what you are actually trying to do, because there are many ways to deal with this.
But, the first question is: is this actually happening, or is it a theoretical concern?
One thing to keep in mind is you normally have a lot more virtual memory address space available than physical memory, so even when physical memory is fragmented, there is still plenty of contiguous virtual memory. (Of course, the physical memory is discontiguous underneath but your code doesn't see that.)
I think there is sometimes unwarranted fear of memory fragmentation, and as a result people write a custom memory allocator (or worse, they concoct a scheme with handles and moveable memory and compaction). I think these are rarely needed in practice, and it can sometimes improve performance to throw this out and go back to using malloc.
Write the pool to operate as a list of allocations; you can then extend and destroy it as needed. This can reduce fragmentation.
And/or implement allocation transfer (or move) support so you can compact active allocations. The object/holder may need to assist you, since the pool may not necessarily know how to transfer types itself. If the pool is used with a collection type, then compacting/transfers are far easier to accomplish.