Reduce Visual Studio's memory limit - C++

I have a project and I'd like to trigger some memory exceptions to see where they occur, without having to load 2 GB files. How do I do that?

Just run a quick loop allocating blocks of memory until the heap is exhausted:
void* p;
do {
    p = malloc(1024 * 1024);   // grab 1 MB at a time and never free it
} while (p != NULL);
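If you specifically want a C++ exception rather than a NULL return, a minimal sketch (assuming the default, throwing operator new) keeps the blocks alive until new[] throws std::bad_alloc:
#include <new>
#include <vector>

int main()
{
    std::vector<char*> blocks;                         // hold the pointers so the memory stays in use
    try
    {
        for (;;)
            blocks.push_back(new char[1024 * 1024]);   // 1 MB per iteration
    }
    catch (const std::bad_alloc&)
    {
        // heap exhausted: release everything so the rest of the test can continue
        for (char* p : blocks)
            delete[] p;
    }
    return 0;
}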

I assume you are talking about the 2 GB limit of 32-bit Windows. If so, this might do the trick: just pre-allocate some memory up front to generate a base load, e.g.
struct memwaste
{
    char* m_ptr;
    memwaste() : m_ptr(new char[1024*1024*1024]) {} // waste 1 GB
    ~memwaste() { delete[] m_ptr; }
} x;
Add this struct to your code somewhere and it "wastes" some memory (a.k.a. base load). Now you can run your program; eventually it will run into problems allocating memory.
The base load of memwaste has to be adapted to your needs, of course. It depends on where you want to check the memory allocation errors.

Related

Where did the memory go?

class Node
{
    // some member variables.
};

std::cout << "size of the class is " << sizeof(Node) << "\n";
int pm1 = peakmemory();
std::cout << "Peak memory before loop is " << pm1 << "\n";
for (int i = 0; i < nNode; ++i)
{
    Node* p = new Node;   // intentionally never deleted; only peak usage matters here
}
int pm2 = peakmemory();
std::cout << "Peak memory after loop is " << pm2 << "\n";
I thought pm2 - pm1 would approximate nNode * sizeof(Node), but it turns out pm2 - pm1 is much larger than nNode * sizeof(Node). Where did the memory go? I suspect sizeof(Node) does not reflect the real memory usage.
I have tested on both Windows and Linux. My final conclusion is that Node* p = new Node; allocates more memory than sizeof(Node) when Node is a class.
Since you haven't specified what platform you're running on, here are a few possibilities:
Allocation size: Your C++ implementation may be allocating memory in units which are larger than sizeof(Node), e.g. to limit the amount of book-keeping it does.
Alignment: The allocator may have a policy of returning addresses aligned to some minimum power of 2. Again, this may simplify its implementation somewhat.
Overhead: Some allocators, in addition to the memory you are using, have some padding with a fixed pattern to protect against memory corruption; or some meta-data used by the allocator.
That is not to say this actually happens. But it could; it certainly agrees with the language specification (AFAICT).
Pretty much all memory allocators (such as those on Linux/Win32) have an allocation header that precedes the memory allocation and includes information about the size of the allocation. On Linux, for example, you can look at the source for malloc, which gives information about the stored header (and how to compute its size):
https://git.busybox.net/uClibc/tree/libc/stdlib/malloc/malloc.h#n106
As already mentioned in a comment, debug builds may also allocate additional bytes at either side of the allocation to guard against buffer overrun errors.
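A quick way to see this in practice is to ask the allocator how big the block it actually handed out is. The sketch below is only an illustration (the Node struct is hypothetical, and the numbers vary by allocator and build settings); it uses _msize on the Microsoft CRT and malloc_usable_size on glibc:
#include <cstdio>
#include <cstdlib>
#include <malloc.h>   // _msize (MSVC CRT) / malloc_usable_size (glibc)

struct Node { int a; char b; };   // hypothetical small class

int main()
{
    void* p = std::malloc(sizeof(Node));
    if (p == NULL)
        return 1;
    std::printf("sizeof(Node)      = %u\n", (unsigned)sizeof(Node));
#ifdef _MSC_VER
    std::printf("usable block size = %u\n", (unsigned)_msize(p));          // what the CRT really reserved
#else
    std::printf("usable block size = %u\n", (unsigned)malloc_usable_size(p));
#endif
    std::free(p);
    return 0;
}
The usable size is typically rounded up for alignment, and the allocator's own header sits outside it, so the per-object cost is larger still.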

Dynamic memory allocation using new with binary search in C++

I am trying to find the maximum amount of memory that can be allocated using new[]. I have used binary search to make the probing a bit faster and to find the largest block that can finally be allocated:
bool allocated = false;
int* ptr = nullptr;
int low = 0, high = std::numeric_limits<int>::max();
int mid = 0;
while (true)
{
    try
    {
        mid = (low + high) / 2;
        ptr = new int[mid];
        delete[] ptr;
        allocated = true;
    }
    catch (const std::bad_alloc& e)
    {....}
    if (allocated == true)
    {
        low = mid;
    }
    else
    {
        high = low;
        cout << "maximum memory allocated at: " << ptr << endl;
    }
}
I have modified my code and am using new logic to solve this. My problem right now is that it goes into a never-ending loop. Is there a better way to do this?
This code is useless for a couple of reasons.
Depending on your OS, the memory may or may not be allocated until it is actually accessed. That is, new happily returns a new memory address, but it doesn't make the memory available just yet; the memory is actually allocated later, when and if the corresponding address is accessed. Google "lazy allocation". If the out-of-memory condition is detected at use time rather than at allocation time, the allocation itself may never throw an exception.
If you have a machine with more than 2 gigabytes available and your int is 32 bits, the running size will eventually overflow and become negative before memory is exhausted. Then you may get a bad_alloc. Use size_t for all things that are sizes.
Assuming you are doing ++alloc and not ++allocation, it shouldn't matter what address it uses. If you want it to use a different address every time, then don't delete the pointer.
This is a particularly bad test.
For the first part, you have undefined behaviour. That's because you should only ever delete[] the pointer returned to you by new[]. You need to delete[] pvalue, not value.
The second thing is that your approach will be fragmenting your memory, as you're continuously allocating and deallocating contiguous memory blocks. I imagine that your program will understate the maximum block size because of this fragmentation effect. One solution would be to launch instances of your program as new processes from the command line, passing the allocation block size as a parameter, and use a divide-and-conquer bisection approach to attain the maximum size (with some reliability) in log(n) trials.
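For what it's worth, here is a minimal sketch of a bisection that does terminate, using size_t arithmetic and a std::nothrow probe instead of try/catch (note the caveat above: with lazy allocation or overcommit, a "successful" size may not actually be backed by physical memory):
#include <cstddef>
#include <iostream>
#include <limits>
#include <new>

int main()
{
    std::size_t low = 0;                                              // largest size known to succeed
    std::size_t high = std::numeric_limits<std::size_t>::max() / 2;   // assumed to be far beyond what can succeed

    while (high - low > 1024)                       // stop once the bracket is tight enough
    {
        std::size_t mid = low + (high - low) / 2;
        char* p = new (std::nothrow) char[mid];
        if (p != nullptr)
        {
            delete[] p;
            low = mid;                              // mid bytes worked, search higher
        }
        else
        {
            high = mid;                             // mid bytes failed, search lower
        }
    }
    std::cout << "largest single allocation is roughly " << low << " bytes\n";
    return 0;
}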

Failing to malloc a big block of memory after many malloc/free of small blocks

Here is the code.
First I malloc and free a big block of memory, then I malloc many small blocks of memory until it runs out of memory, and then I free ALL of those small blocks.
After that, I try to malloc a big block of memory again.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    static const int K = 1024;
    static const int M = 1024 * K;
    static const int G = 1024 * M;

    static const int BIG_MALLOC_SIZE = 1 * G;
    static const int SMALL_MALLOC_SIZE = 3 * K;
    static const int SMALL_MALLOC_TIMES = 1 * M;

    void **small_malloc = (void **)malloc(SMALL_MALLOC_TIMES * sizeof(void *));

    void *big_malloc = malloc(BIG_MALLOC_SIZE);
    printf("big malloc first time %s\n", (big_malloc == NULL) ? "failed" : "succeeded");
    free(big_malloc);

    for (int i = 0; i != SMALL_MALLOC_TIMES; ++i)
    {
        small_malloc[i] = malloc(SMALL_MALLOC_SIZE);
        if (small_malloc[i] == NULL)
        {
            printf("small malloc failed at %d\n", i);
            break;
        }
    }

    for (int i = 0; i != SMALL_MALLOC_TIMES && small_malloc[i] != NULL; ++i)
    {
        free(small_malloc[i]);
    }

    big_malloc = malloc(BIG_MALLOC_SIZE);
    printf("big malloc second time %s\n", (big_malloc == NULL) ? "failed" : "succeeded");
    free(big_malloc);

    return 0;
}
Here is the result:
big malloc first time succeeded
small malloc failed at 684912
big malloc second time failed
It looks like memory fragmentation.
I know memory fragmentation happens when there are many small empty spaces in memory, but no single empty space big enough for a large malloc.
But I've already freed EVERYTHING I malloc'd, so the memory should be empty.
Why can't I malloc the big block the second time?
I use Visual Studio 2010 on Windows 7, building a 32-bit program.
The answer, sadly, is still fragmentation.
Your initial large allocation ends up tracked by one allocation block; however, when you start allocating large numbers of 3 KB blocks of memory, your heap gets sliced into chunks.
Even when you free the memory, small pieces of the block remain allocated within the process's address space. You can use a tool like Sysinternals VMMap to see these allocations visually.
It looks like 16 MB blocks are used by the allocator, and once these blocks are freed up they never get returned to the free pool (i.e. the blocks remain allocated).
As a result, you don't have enough contiguous memory to allocate the 1 GB block the second time.
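If you want to see this numerically rather than in VMMap, a small Windows-only sketch (assuming a 32-bit process; VirtualQuery is the only API used) can walk the address space and report the largest free region, which is what bounds the biggest malloc you can hope for:
#include <windows.h>
#include <stdio.h>

int main(void)
{
    MEMORY_BASIC_INFORMATION mbi;
    unsigned char *addr = NULL;
    SIZE_T largest = 0;

    /* Walk the user address space region by region and remember the largest free one. */
    while (VirtualQuery(addr, &mbi, sizeof(mbi)) != 0)
    {
        if (mbi.State == MEM_FREE && mbi.RegionSize > largest)
            largest = mbi.RegionSize;
        addr = (unsigned char *)mbi.BaseAddress + mbi.RegionSize;
        if (addr < (unsigned char *)mbi.BaseAddress)   /* wrapped past the top of the address space: stop */
            break;
    }
    printf("largest free region: %lu bytes\n", (unsigned long)largest);
    return 0;
}
Run it before and after the small-malloc storm: the largest free region shrinks well below 1 GB even though all the small blocks were freed.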
Even though I know just a little about this, I found the following thread, Why does malloc not work sometimes?, which covers a similar topic to yours.
It contains the following links:
http://www.eskimo.com/~scs/cclass/int/sx7.html (Pointer Allocation Strategies)
http://www.gidforums.com/t-9340.html (reasons why malloc fails? )
The issue is likely that even if you free every allocation, malloc does not return all the memory to the operating system.
When your program requested the numerous smaller allocations, malloc had to increase the size of the "arena" from which it allocates memory.
There is no guarantee that if you free all the memory, the arena will shrink to the original size. It's possible that the arena is still there, and all the blocks have been put into a free list (perhaps coalesced into larger blocks).
The presence of this lingering arena in your address space may be making it impossible to satisfy the large allocation request.

C++ bad allocation from heap fragmentation?

I have the following problem. I run a loop 3000 times, and on every pass I allocate a byte buffer on the heap:
uint8_t* frameDest;
try {
    frameDest = new uint8_t[_numBytes * sizeof(uint8_t)];
    memcpy(frameDest, frameSource, _numBytes);
} catch (const std::exception& e) {
    printf("%s\n", e.what());
}
The allocated frame serves as the destination for data from some frameSource. _numBytes equals 921600.
Next, in the same loop, frameDest is pushed into a std::vector of uint8_t* pointers (_frames_cache). This vector serves as a frame cache and is cleaned every X frames. With the current setup I clean the vector when more than 20 frames are in the cache. The method that cleans the cache is this:
void FreeCache()
{
    _frameCacheMutex.lock();
    try {
        int cacheSize = _frames_cache.size();
        for (int i = 0; i < cacheSize; ++i) {
            uint8_t* frm = _frames_cache.front();
            _frames_cache.erase(_frames_cache.begin());
            delete[] frm;
        }
    } catch (const std::exception& e) {
        printf("%s\n", e.what());
    }
    _frameCacheMutex.unlock();
}
The issue: a bad_alloc exception is thrown after ~2000+ frames in the first code block. I tested for memory leaks with Dr. Memory and found none. I am also getting no errors or exceptions on allocations/deallocations in other parts of the program. I have 2 instances of such code running in 2 separate threads, which means that during the whole lifetime of this program some 6000 allocations/deallocations are processed, 960000 bytes each. In the whole app there are more heap allocations going on, but not at the frequency of this part. I have read that modern compilers handle heap management in a pretty advanced way, and still, I suspect my issue has to do with memory fragmentation. I use the Visual C++ 2012 compiler (C++11), 32-bit, under a Windows 7 64-bit OS.
My question is: how likely is it that this is a memory management problem, and should I write or use a custom heap allocation manager? If not, what could it be?
It's hard to tell if this is a memory fragmentation problem. But you may want to consider allocating a fixed amount of memory for your frames as a ring buffer at the beginning, so that during runtime you won't run into any fragmentation (and also save the time for memory (de)allocation). Not sure if this works with your use case, of course.
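Since the frame size is fixed (921600 bytes here), a minimal sketch of that idea could look like the following. The class name and interface are made up for illustration; the point is that all storage is allocated once, so the hot loop never touches the heap:
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

class FrameRing
{
public:
    FrameRing(size_t slots, size_t frameBytes)
        : _frameBytes(frameBytes), _slots(slots), _next(0), _storage(slots * frameBytes) {}

    // Copy one frame into the next slot, overwriting the oldest frame when full.
    uint8_t* Push(const uint8_t* src)
    {
        uint8_t* dst = &_storage[_next * _frameBytes];
        std::memcpy(dst, src, _frameBytes);
        _next = (_next + 1) % _slots;
        return dst;
    }

private:
    size_t _frameBytes;
    size_t _slots;
    size_t _next;
    std::vector<uint8_t> _storage;   // one contiguous allocation for every slot
};

// Usage (hypothetical): FrameRing ring(20, 921600); uint8_t* frameDest = ring.Push(frameSource);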

C++/ActiveX replacing realloc with malloc, memcpy, free. Functional and Performance tests

I've been assigned to a project that is a complex legacy system written in C++ and ActiveX ~ 10 years old.
The setup is Microsoft Visual Studio 2008.
Whilst there are no issues with the system right now, as part of a security review of the legacy system, an automated security code-scanning tool has flagged instances of realloc as a Bad Practice issue, due to a security vulnerability.
This is because the realloc function might leave a copy of sensitive information stranded in memory where it cannot be overwritten. The tool recommends replacing realloc with malloc, memcpy and free.
Now, realloc being versatile, it will allocate memory when the source pointer is null. It also frees memory when the size of the buffer is 0. I was able to verify both of these scenarios.
Source: MDSN Library 2001
realloc returns a void pointer to the reallocated (and possibly moved) memory block. The return value is NULL if the size is zero and the buffer argument is not NULL, or if there is not enough available memory to expand the block to the given size. In the first case, the original block is freed. In the second, the original block is unchanged. The return value points to a storage space that is guaranteed to be suitably aligned for storage of any type of object. To get a pointer to a type other than void, use a type cast on the return value.
So, my replacement function that uses malloc, memcpy and free has to cater for these cases.
I have reproduced below the original code snippet (an array implementation) that uses realloc to dynamically resize and shrink its internal buffer.
First the class definition:
template <class ITEM>
class CArray
{
// Data members:
protected:
    ITEM *pList;
    int iAllocUnit;
    int iAlloc;
    int iCount;

public:
    CArray() : iAllocUnit(30), iAlloc(0), iCount(0), pList(NULL)
    {
    }

    virtual ~CArray()
    {
        Clear(); // Invokes SetCount(0), which destructs objects and then calls ReAlloc
    }
The existing ReAlloc method:
void ReAllocOld()
{
    int iOldAlloc = iAlloc;

    // work out new size
    if (iCount == 0)
        iAlloc = 0;
    else
        iAlloc = ((int)((float)iCount / (float)iAllocUnit) + 1) * iAllocUnit;

    // reallocate
    if (iOldAlloc != iAlloc)
    {
        pList = (ITEM *)realloc(pList, sizeof(ITEM) * iAlloc);
    }
}
The following is my implementation that replaces these with malloc, memcpy and free:
void ReAllocNew()
{
    int iOldAlloc = iAlloc;

    // work out new size
    if (iCount == 0)
        iAlloc = 0;
    else
        iAlloc = ((int)((float)iCount / (float)iAllocUnit) + 1) * iAllocUnit;

    // reallocate
    if (iOldAlloc != iAlloc)
    {
        size_t iAllocSize = sizeof(ITEM) * iAlloc;
        if (iAllocSize == 0)
        {
            free(pList); /* Free original buffer and return */
        }
        else
        {
            ITEM *tempList = (ITEM *)malloc(iAllocSize); /* Allocate temporary buffer */
            if (tempList == NULL) /* Memory allocation failed, throw error */
            {
                free(pList);
                ATLTRACE(_T("(CArray: Memory could not be allocated. malloc failed.) "));
                throw CAtlException(E_OUTOFMEMORY);
            }
            if (pList == NULL) /* This is the first request to allocate memory for pList */
            {
                pList = tempList; /* assign newly allocated buffer to pList and return */
            }
            else
            {
                size_t iOldAllocSize = sizeof(ITEM) * iOldAlloc;     /* Allocation size before this request */
                size_t iMemCpySize = min(iOldAllocSize, iAllocSize); /* Number of bytes to copy for this request */
                if (iMemCpySize > 0)
                {
                    /* memcpy only up to the smaller of the two sizes, since this could be a request to shrink or to grow */
                    /* If this is a request to grow, copying iAllocSize would result in an access violation */
                    /* If this is a request to shrink, copying iOldAllocSize would result in an access violation */
                    memcpy(tempList, pList, iMemCpySize); /* memcpy returns tempList, so the return value can be ignored */
                    free(pList);      /* Free old buffer */
                    pList = tempList; /* Assign newly allocated buffer and return */
                }
            }
        }
    }
}
Notes:
Objects are constructed and destructed correctly in both the old and new code.
No memory leaks detected (as reported by Visual Studio's built-in CRT debug heap functions: http://msdn.microsoft.com/en-us/library/e5ewb1h3(v=vs.90).aspx)
I wrote a small test harness (console app) that does the following:
a. Add 500000 instances of a class containing 2 integers and an STL string.
The integers added are a running counter and its string representation, like so:
for (int i = 0; i < cItemsToAdd; i++)
{
    ostringstream str;
    str << "x=" << 1+i << "\ty=" << cItemsToAdd-i << endl;
    TestArray value(1+i, cItemsToAdd-i, str.str());
    array.Append(&value);
}
b. Open a big log file containing 86526 lines of varying lengths, adding its lines to instances of this array: a CArray of CStrings and a CArray of strings.
I ran the test harness with the existing method (baseline) and my modified method. I ran it in both debug and release builds.
The following are the results:
Test-1: Debug build -> Adding class with int,int,string, 100000 instances:
Original implementation: 5 seconds, Modified implementation: 12 seconds
Test-2: Debug build -> Adding class with int,int,string, 500000 instances:
Original implementation: 71 seconds, Modified implementation: 332 seconds
Test-3: Release build -> Adding class with int,int,string, 100000 instances:
Original implementation: 2 seconds, Modified implementation: 7 seconds
Test-4: Release build -> Adding class with int,int,string, 500000 instances:
Original implementation: 54 seconds, Modified implementation: 183 seconds
Reading big log file into CArray of CString objects:
Test-5: Debug build -> Read big log file with 86527 lines CArray of CString
Original implementation: 5 seconds, Modified implementation: 5 seconds
Test-6: Release build -> Read big log file with 86527 lines CArray of CString
Original implementation: 5 seconds, Modified implementation: 5 seconds
Reading big log file into CArray of string objects:
Test-7: Debug build -> Read big log file with 86527 lines CArray of string
Original implementation: 12 seconds, Modified implementation: 16 seconds
Test-8: Release build -> Read big log file with 86527 lines CArray of string
Original implementation: 9 seconds, Modified implementation: 13 seconds
Questions:
As you can see from the above tests, realloc is consistently faster than malloc, memcpy and free. In some instances (Test-2, for example) it is faster by a whopping 367%. Similarly, for Test-4 it is 234%. So what can I do to bring these numbers down to something comparable to the realloc implementation?
Can my version be made more efficient?
Assumptions:
Please note that I cannot use C++ new and delete; I have to use only malloc and free. I also cannot change any of the other methods (as they are existing functionality) and the impact would be huge. So my hands are tied to produce the best implementation of realloc that I possibly can.
I have verified that my modified implementation is functionally correct.
PS: This is my first SO post. I have tried to be as detailed as possible. Suggestions on posting is also appreciated.
First of all, I'd like to point out that you are not actually addressing the vulnerability: the memory released by free is not cleared either, just as with realloc.
Also note that your code does more than the old realloc: it throws an exception when out of memory, which may be futile.
Why is your code slower than realloc? Probably because realloc uses shortcuts under the hood which are not available to you. For example, realloc may allocate more memory than you actually request, or grow the block in place when contiguous memory is free just after its end, so your code ends up doing more memcpys than realloc does.
Case in point: running the following code in CompileOnline gives the result Wow no copy.
#include <iostream>
#include <stdlib.h>
using namespace std;

int main()
{
    void* p = realloc(NULL, 1024);
    void* t = realloc(p, 2048);
    if (p == t)
        cout << "Wow no copy" << endl;
    else
        cout << "Alas, a copy" << endl;
    return 0;
}
What can you do to make your code faster?
You can try to allocate more memory after the currently allocated block, but then freeing the memory becomes more problematic: you need to remember all the pointers you allocated, or find a way to modify the lookup tables used by free so that the correct amount of memory is released in one go.
OR
Use the common strategy of (internally) allocating twice as much memory as you previously allocated, and (optionally) shrinking the memory only when the new size drops below half of the allocated memory.
This gives you some headroom, so that it is not necessary to call malloc/memcpy/free every time the memory grows.
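To make that concrete, here is a small self-contained sketch of the doubling strategy using only malloc/memcpy/free (the names, growth factor and starting capacity are illustrative, not the original CArray):
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Sketch: a raw byte buffer whose capacity grows geometrically, so
// malloc/memcpy/free run only O(log n) times over n appends.
struct GrowBuf
{
    unsigned char* data;
    size_t size;       // bytes in use
    size_t capacity;   // bytes allocated

    GrowBuf() : data(NULL), size(0), capacity(0) {}
    ~GrowBuf() { free(data); }

    bool reserve(size_t wanted)
    {
        if (wanted <= capacity)
            return true;                           // enough headroom already
        size_t newCap = capacity ? capacity : 64;
        while (newCap < wanted)
            newCap *= 2;                           // double until large enough
        unsigned char* p = (unsigned char*)malloc(newCap);
        if (p == NULL)
            return false;
        if (size > 0)
            memcpy(p, data, size);                 // copy only the bytes in use
        free(data);
        data = p;
        capacity = newCap;
        return true;
    }

    bool append(const void* src, size_t n)
    {
        if (!reserve(size + n))
            return false;
        memcpy(data + size, src, n);
        size += n;
        return true;
    }
};

int main()
{
    GrowBuf b;
    for (int i = 0; i < 100000; ++i)
        b.append("x", 1);                          // amortized O(1): only a handful of reallocations happen
    printf("size=%u capacity=%u\n", (unsigned)b.size, (unsigned)b.capacity);
    return 0;
}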
If you look at an implementation of realloc, e.g.
http://www.scs.stanford.edu/histar/src/pkg/uclibc/libc/stdlib/malloc/realloc.c
you see that the difference between your implementation and an existing one is that realloc expands the heap block in place, using low-level calls, instead of creating a whole new block. This probably accounts for some of the speed difference.
I think you also need to consider the implications of memset-ing the memory every time you do a realloc, because then some performance degradation seems inevitable.
I find the argument about realloc leaving data behind in memory somewhat overly paranoid, because the same can be said about a normal malloc/calloc/free. It would mean that you would not only need to find all reallocs/mallocs/callocs, but also any runtime or third-party function that internally uses those functions, to be really sure that nothing is kept in memory. Alternatively, you could create your own heap and substitute it for the regular one to keep it clean.
Conceptually, realloc() is not doing anything too smart: it allocates memory in blocks, exactly as you do in your ReAllocNew.
The only conceptual difference can be in the way how new block size is calculated.
realloc may use something like this:
int new_buffer_size = old_buffer_size * 2;
and this will decrease the number of memory moves compared to what you have there.
In any case, I think the block-size calculation formula is the key factor.