The limited allocation size C++ - c++

I use Visual Studio 2008.
I have dynamically declared the variable big_massive:
unsigned int *big_massive = new unsigned int[1073741824]
But, when I tried to debug this program, I got following error: Invalid allocation size: 4294967295 bytes.
I hope there are any path to avoid such error? Thank you!

That allocation is simply not possible on 32bit x86 systems with sizeof(int)==4 (you are requesting 4GB). A process's total address space is limited to 4GB, and the process itself is usually limited to less than that (2GB or 3GB for 32bit Windows depending on boot.ini settings and the Windows edition, not sure which limit applies for 32bit processes on 64bit Windows, but 4GB is simply not possible).
For the 64bit case, you'd need to have 4GB of virtual memory available to back that allocation for it to succeed.

Amount of virtual memory per process on 32bit Windows system or 64bit Windows system running a 32bit program (WoW64): 2147483648
Amount of memory needed to hold an array of 1073741824 4-byte unsigned integers: 4294967296
Can't possibly fit in the amount of memory available, so it's an invalid allocation.

A 32 bit system cannot access more than 4GB of memory per process. However, allocating 3GB of memory is fine on OS supporting lazy allocation and overcommiting, even if you only use the first 10kB, and your maximum swap+memory is 1GB anyway. But keep in mind that relying on this is stupid in the first place.
Before trying to use that much memory, check if you can't represent your data in a more compact form. If your array has holes, or values are repeated, or you don't use the full 32bit range of your int, or you don't need those values to have a specific order, just do not use an array.
Remember RAM is for temporary data. If your data need to be written on disk, why don't you use disk space in the first place. You might even use memory-mapped files (you select a part of your file, and you can access it like memory). You might also like the (easier or not) alternatives of databases management systems.

Related

Is it true that 32Bit program will be out of memory, if other programs use too much, in 64bit windows?

I am developing a 32 bit application and got out of memory error.
And I noticed that my Visual Studio and a plugin (other apps too) used too much memory which is around 4 or 5 GB.
So I suspected that these program use up all the memory addresses where my program is able to find free memory.
I suppose that 32 bit can only use the first 4 GB, other memory it can not use at all.
I don't know if I am correct with this, other wise I will look for other answers, like I have bug in my code.
Your statement of
I suppose that 32bit can only use the first 4 giga byte, othere momery
it can not use at all.
is definitely incorrect. In a 64-bit OS, all applications can use all of the memory, regardless of what bitness it is, thanks to the translation table for virtual to physical memory being 64-bit.
Some really ancient hardware may not allow DMA to addresses above 4GB, but I really hope most of that is in the junk-yard by now.
If the system as a whole is running low on memory, it will affect all applications more or less equally.
However, a 32-bit application can only, by default, use the lower 2GB of the virtual address range (although these 2GB can be placed anywhere in the physical memory, as described above by means of a 64-bit translation table). You can extend this to nearly 4GB (3GB in a 32-bit OS, and subject to the /3GB boot flag in this case) by using /LARGEADDRESSAWARE in your linking command - this simply tells the OS that your application will "understand" that addresses can be negative, and thus will operate correctly with addresses over 2GB.
Any system can be brought down by a too heavy load.
But in normal use in Windows and any other virtual memory OS, the memory consumption of other programs does not much affect any given program execution.
Getting an out of memory error is unusual, but it can happen if you make a large allocation or if you declare a large local automatic variable. It can also happen if you fail to properly deallocate memory that's no longer used, i.e. if the program is leaking memory. For a 32-bit program on a 64-bit machine it's then not memory itself that's used up, but available address space within the program.

allocate more than 1 GB memory on 32 bit XP

I'v run into an odd problem, my process cannot allocate more than what seems to be slightly below 1 GiB. Windows Task Manager "Mem Usage" column shows values close to 1 GiB when my software gives a bad_alloc exception. Yes, i'v checked that the value passed to memory allocation is sensible. ( no race condition / corruption exists that would make this fail ). Yes, I need all this memory and there is no way around it. ( It's a buffer for images, which cannot be compressed any further )
I'm not trying to allocate the whole 1 GiB memory in one go, there a few allocations around 300 MiB each. Would this cause problems? ( I'll try to see if making more smaller allocations works any better ). Is there some compiler switch or something else that I must set in order to get past 1 GiB? I've seen others complaining about the 2 GiB limit, which would be fine for me.. I just need little bit more :). I'm using VS 2005 with SP1 and i'm running it on a 32 bit XP and it's in C++.
On a 32-bit OS, a process has a 4GB address space in total.
On Windows, half of this is off-limits, so your process has 2GB.
This is 2GB of contiguous memory. But it gets fragmented. Your executable is loaded in at one address, each DLL is loaded at another address, then there's the stack, and heap allocations and so on. So while your process probably has enough free address space, there are no contiguous blocks large enough to fulfill your requests for memory. So making smaller allocations will probably solve it.
If your application is compiled with the LARGEADDRESSAWARE flag, it will be allowed to use as much of the remaining 2GB as Windows can spare. (And the value of that depends on your platform and environment.
for 32-bit code running on a 64-bit OS, you'll get a full 4-GB address space
for 32-bit code running on a 32-bit OS without the /3GB boot switch, the flag means nothing at all
for 32-bit code running on a 32-bit OS with the /3GB boot switch, you'll get 3GB of address space.
So really, setting the flag is always a good idea if your application can handle it (it's basically a capability flag. It tells Windows that we can handle more memory, so if Windows can too, it should just go ahead and give us as large an address space as possible), but you probably can't rely on it having an effect. Unless you're on a 64-bit OS, it's unlikely to buy you much. (The /3GB boot switch is necessary, and it has been known to cause problems with drivers, especially video drivers)
Allocating big chunks of continuous memory is always a problem.
It is very likely to get more memory in smaller chunks
You should redesign your memory structures.
You are right to suspect the larger 300MB allocations. Your process will be able to get close to 2GB (3 if you use the /3GB boot.ini switch and LARGEADDRESSAWARE link flag), but not as a large contiguous block.
Typical solutions for this are to break up the requests into tiles or strips of fixed size (say 256x256x4 bytes) and write an intermediate class to hide this representation detail.
You can quickly verify this by writing a small allocation loop that allocate blocks of different sizes.
You could also check this function from MSDN. 1GB rings a bell from here:
This parameter must be greater than or equal to 13 pages (for example,
53,248 on systems with a 4K page size), and less than the system-wide
maximum (number of available pages minus 512 pages). The default size
is 345 pages (for example, this is 1,413,120 bytes on systems with a
4K page size).
Here they mentioned that default maximum number of pages allowed for a process is 345 pages which is slightly more than 1GB.
When I have a few big allocs like that to do, I use the Windows function VirtualAlloc, to avoid stressing the default allocator.
Another way forward might be to use nedmalloc in your project.

64-bit library limited to 4GB?

I'm using an image manipulation library that throws an exception when I attempt to load images > 4GB in size. It claims to be 64bit, but wouldn't a 64bit library allow loading images larger than that? I think they recompiled their C libraries using a 64 bit memory model/compiler but still used unsigned integers and failed upgrade to use 64 bit types.
Is that a reasonable conclusion?
Edit - As an after-thought can OS memory become so fragemented that allocation of large chunks is no longer possible? (It doesn't work right after a reboot either, but just wondering.) What about under .NET? Can the .NET managed memory become so fragmented that allocation of large chunks fails?
It's a reasonable suggestion, however the exact cause could be a number of things - for example what OS are you running, how much RAM / swap do you have? The application/OS may not over-commit virtual memory so you'll need 4GB (or more) of free RAM to open the image.
Out of interest does it seem to be a definite stop at the 4GB boundary - i.e. does a 3.99GB image succeed, but a 4GB one fail - you say it does which would suggest a definite use of a 32bit size in the libraries data structures.
Update
With regards your second question - not really. Pretty much all modern OS's use virtual memory, so each process gets it's own contiguous address space. A single contiguous region in a processes' address space doesn't need to be backed by contiguous physical RAM, it can be made up of a number of separate physical areas of RAM made to look like they are contiguous; so the OS doesn't need to have a single 4GB chunk of RAM free to give your application a 4GB chunk.
It's possible that an application could fragment it's virtual address space such that there isn't room for a contiguous 4GB region, but considering the size of a 64-bit address space it's probably highly unlikely in your scenario.
Yes, unless perhaps the binary file format itself limits the size of images.

How much memory should you be able to allocate?

Background: I am writing a C++ program working with large amounts of geodata, and wish to load large chunks to process at a single go. I am constrained to working with an app compiled for 32 bit machines. The machine I am testing on is running a 64 bit OS (Windows 7) and has 6 gig of ram. Using MS VS 2008.
I have the following code:
byte* pTempBuffer2[3];
try
{
//size_t nBufSize = nBandBytes*m_nBandCount;
pTempBuffer2[0] = new byte[nBandBytes];
pTempBuffer2[1] = new byte[nBandBytes];
pTempBuffer2[2] = new byte[nBandBytes];
}
catch (std::bad_alloc)
{
// If we didn't get the memory just don't buffer and we will get data one
// piece at a time.
return;
}
I was hoping that I would be able to allocate memory until the app reached the 4 gigabyte limit of 32 bit addressing. However, when nBandBytes is 466,560,000 the new throws std::bad_alloc on the second try. At this stage, the working set (memory) value for the process is 665,232 K So, it I don't seem to be able to get even a gig of memory allocated.
There has been some mention of a 2 gig limit for applications in 32 bit Windows which may be extended to 3 gig with the /3GB switch for win32. This is good advice under that environment, but not relevant to this case.
How much memory should you be able to allocate under the 64 bit OS with a 32 bit application?
As much as the OS wants to give you. By default, Windows lets a 32-bit process have 2GB of address space. And this is split into several chunks. One area is set aside for the stack, others for each executable and dll that is loaded. Whatever is left can be dynamically allocated, but there's no guarantee that it'll be one big contiguous chunk. It might be several smaller chunks of a couple of hundred MB each.
If you compile with the LargeAddressAware flag, 64-bit Windows will let you use the full 4GB address space, which should help a bit, but in general,
you shouldn't assume that the available memory is contiguous. You should be able to work with multiple smaller allocations rather than a few big ones, and
You should compile it as a 64-bit application if you need a lot of memory.
on windows 32 bit, the normal process can take 2 GB at maximum, but with /3GB switch it can reach to 3 GB (for windows 2003).
but in your case I think you are allocating contiguous memory, and so the exception occured.
You can allocate as much memory as your page file will let you - even without the /3GB switch, you can allocate 4GB of memory without much difficulty.
Read this article for a good overview of how to think about physical memory, virtual memory, and address space (all three are different things). In a nutshell, you have exactly as much physical memory as you have RAM, but your app really has no interaction with that physical memory at all - it's just a convenient place to store the data that in your virtual memory. Your virtual memory is limited by the size of your pagefile, and the amount your app can use is limited by how much other apps are using (although you can allocate more, providing you don't actually use it). Your address space in the 32 bit world is 4GB. Of those, 2 GB are allocated to the kernel (or 1GB if you use the /3BG switch). Of the 2GB that are left, some is going to be used up by your stack, some by the program you are currently running, (and all the dlls, etc..). It's going to get fragmented, and you are only going to be able to get so much contiguous space - this is where your allocation is failing. But since that address space is just a convenient way to access the virtual memory you have allocated for you, it's possible to allocate much more memory, and bring chunks of it into your address space a few at a time.
Raymond Chen has an example of how to allocate 4GB of memory and map part of it into a section of your address space.
Under 32-bit Windows, the maximum allocatable is 16TB and 256TB in 64 bit Windows.
And if you're really into how memory management works in Windows, read this article.
During the ElephantsDream project the Blender Foundation with Blender 3D had similar problems (though on Mac). Can't include the link but google: blender3d memory allocation problem and it will be the first item.
The solution involved File Mapping. Haven't tried it myself but you can read up on it here: http://msdn.microsoft.com/en-us/library/aa366556(VS.85).aspx
With nBandBytes at 466,560,000, you are trying to allocate 1.4 GB. A 32-bit app typically only has access to 2 GB of memory (more if you boot with /3GB and the executable is marked as large address space aware). You may be hard pressed to find that many blocks of contiguous address space for your large chunks of memory.
If you want to allocate gigabytes of memory on a 64-bit OS, use a 64-bit process.
You should be able to allocate a total of about 2GB per process. This article (PDF) explains the details. However, you probably won't be able to get a single, contiguous block that is even close to that large.
Even if you allocate in smaller chunks, you couldn't get the memory you need, especially if the surrounding program has unpredictable memory behavior, or if you need to run on different operating systems. In my experience, the heap space on a 32-bit process caps at around 1.2GB.
At this amount of memory, I would recommend manually writing to disk. Wrap your arrays in a class that manages the memory and writes to temporary files when necessary. Hopefully the characteristics of your program are such that you could effectively cache parts of that data without hitting the disk too much.
Sysinternals VMMap is great for investigating virtual address space fragmentation, which is probably limiting how much contiguous memory you can allocate. I recommend setting it to display free space, then sorting by size to find the largest free areas, then sorting by address to see what is separating the largest free areas (probably rebased DLLs, shared memory regions, or other heaps).
Avoiding extremely large contiguous allocations is probably for the best, as others have suggested.
Setting LARGE_ADDRESS_AWARE=YES (as jalf suggested) is good, as long as the libraries that your application depends on are compatible with it. If you do so, you should test your code with the AllocationPreference registry key set to enable top-down virtual address allocation.

Can you allocate a very large single chunk of memory ( > 4GB ) in c or c++?

With very large amounts of ram these days I was wondering, it is possible to allocate a single chunk of memory that is larger than 4GB? Or would I need to allocate a bunch of smaller chunks and handle switching between them?
Why???
I'm working on processing some openstreetmap xml data and these files are huge. I'm currently streaming them in since I can't load them all in one chunk but I just got curious about the upper limits on malloc or new.
Short answer: Not likely
In order for this to work, you absolutely would have to use a 64-bit processor.
Secondly, it would depend on the Operating System support for allocating more than 4G of RAM to a single process.
In theory, it would be possible, but you would have to read the documentation for the memory allocator. You would also be more susceptible to memory fragmentation issues.
There is good information on Windows memory management.
A Primer on physcal and virtual memory layouts
You would need a 64-bit CPU and O/S build and almost certainly enough memory to avoid thrashing your working set. A bit of background:
A 32 bit machine (by and large) has registers that can store one of 2^32 (4,294,967,296) unique values. This means that a 32-bit pointer can address any one of 2^32 unique memory locations, which is where the magic 4GB limit comes from.
Some 32 bit systems such as the SPARCV8 or Xeon have MMU's that pull a trick to allow more physical memory. This allows multiple processes to take up memory totalling more than 4GB in aggregate, but each process is limited to its own 32 bit virtual address space. For a single process looking at a virtual address space, only 2^32 distinct physical locations can be mapped by a 32 bit pointer.
I won't go into the details but This presentation (warning: powerpoint) describes how this works. Some operating systems have facilities (such as those described Here - thanks to FP above) to manipulate the MMU and swap different physical locations into the virtual address space under user level control.
The operating system and memory mapped I/O will take up some of the virtual address space, so not all of that 4GB is necessarily available to the process. As an example, Windows defaults to taking 2GB of this, but can be set to only take 1GB if the /3G switch is invoked on boot. This means that a single process on a 32 bit architecture of this sort can only build a contiguous data structure of somewhat less than 4GB in memory.
This means you would have to explicitly use the PAE facilities on Windows or Equivalent facilities on Linux to manually swap in the overlays. This is not necessarily that hard, but it will take some time to get working.
Alternatively you can get a 64-bit box with lots of memory and these problems more or less go away. A 64 bit architecture with 64 bit pointers can build a contiguous data structure with as many as 2^64 (18,446,744,073,709,551,616) unique addresses, at least in theory. This allows larger contiguous data structures to be built and managed.
The advantage of memory mapped files is that you can open a file much bigger than 4Gb (almost infinite on NTFS!) and have multiple <4Gb memory windows into it.
It's much more efficent than opening a file and reading it into memory,on most operating systems it uses the built-in paging support.
This shouldn't be a problem with a 64-bit OS (and a machine that has that much memory).
If malloc can't cope then the OS will certainly provide APIs that allow you to allocate memory directly. Under Windows you can use the VirtualAlloc API.
it depends on which C compiler you're using, and on what platform (of course) but there's no fundamental reason why you cannot allocate the largest chunk of contiguously available memory - which may be less than you need. And of course you may have to be using a 64-bit system to address than much RAM...
see Malloc for history and details
call HeapMax in alloc.h to get the largest available block size
Have you considered using memory mapped files? Since you are loading in really huge files, it would seem that this might be the best way to go.
It depends on whether the OS will give you virtual address space that allows addressing memory above 4GB and whether the compiler supports allocating it using new/malloc.
For 32-bit Windows you won't be able to get single chunk bigger than 4GB, as the pointer size is 32-bit, thus limiting your virtual address space to 4GB. (You could use Physical Address Extension to get more than 4GB memory; however, I believe you have to map that memory into the virtualaddress space of 4GB yourself)
For 64-bit Windows, the VC++ compiler supports 64-bit pointers with theoretical limit of the virtual address space to 8TB.
I suspect the same applies for Linux/gcc - 32-bit does not allow you, whereas 64-bit allows you.
As Rob pointed out, VirtualAlloc for Windows is a good option for this, as is an anonymouse file mapping. However, specifically with respect to your question, the answer to "if C or C++" can allocate, the answer is NO THIS IS NOT SUPPORTED EVEN ON WIN7 RC 64
In the PE/COFF specification for exe files, the field which specifies the HEAP reserve and HEAP commit, is a 32 bit quantity. This is in-line with the physical size limitations of the current heap implmentation in the windows CRT, which is just short of 4GB. So, there is no way to allocate more than 4GB from C/C++ (technicall the OS support facilities of CreateFileMapping and VirtualAlloc/VirtualAllocNuma etc... are not C or C++).
Also, BE AWARE that there are underlying x86 or amd64 ABI construct's known as the page table's. This WILL in effect do what you are concerened about, allocating smaller chunks for your larger request, even though this is happining in kernel memory, there is an effect on the overall system, these tables are finite.
If you are allocating memory in such grandious purportions, you would be well advised to allocate based on the allocation granularity (which VirtualAlloc enforces) and also to identify optional flags's or methods to enable larger pages.
4kb pages were the initial page size for the 386, subsaquently the pentium added 4MB. Today, the AMD64 (Software Optimization Guide for AMD Family 10h Processors) has a maximum page table entry size of 1GB. This mean's for your case here, let's say you just did 4GB, it would require only 4 unique entries in the kernel's directory to locate\assign and permission your process's memory.
Microsoft has also released this manual that articulates some of the finer points of application memory and it's use for the Vista/2008 platform and newer.
Contents
Introduction. 4
About the Memory Manager 4
Virtual Address Space. 5
Dynamic Allocation of Kernel Virtual
Address Space. 5
Details for x86 Architectures. 6
Details for 64-bit Architectures. 7
Kernel-Mode Stack Jumping in x86
Architectures. 7
Use of Excess Pool Memory. 8
Security: Address Space Layout
Randomization. 9
Effect of ASLR on Image Load
Addresses. 9
Benefits of ASLR.. 11
How to Create Dynamically Based
Images. 11
I/O Bandwidth. 11
Microsoft SuperFetch. 12
Page-File Writes. 12
Coordination of Memory Manager and
Cache Manager 13
Prefetch-Style Clustering. 14
Large File Management 15
Hibernate and Standby. 16
Advanced Video Model 16
NUMA Support 17
Resource Allocation. 17
Default Node and Affinity. 18
Interrupt Affinity. 19
NUMA-Aware System Functions for
Applications. 19
NUMA-Aware System Functions for
Drivers. 19
Paging. 20
Scalability. 20
Efficiency and Parallelism.. 20
Page-Frame Number and PFN Database. 20
Large Pages. 21
Cache-Aligned Pool Allocation. 21
Virtual Machines. 22
Load Balancing. 22
Additional Optimizations. 23
System Integrity. 23
Diagnosis of Hardware Errors. 23
Code Integrity and Driver Signing. 24
Data Preservation during Bug Checks. 24
What You Should Do. 24
For Hardware Manufacturers. 24
For Driver Developers. 24
For Application Developers. 25
For System Administrators. 25
Resources. 25
If size_t is greater than 32 bits on your system, you've cleared the first hurdle. But the C and C++ standards aren't responsible for determining whether any particular call to new or malloc succeeds (except malloc with a 0 size). That depends entirely on the OS and the current state of the heap.
Like everyone else said, getting a 64bit machine is the way to go. But even on a 32bit machine intel machine, you can address bigger than 4gb areas of memory if your OS and your CPU support PAE. Unfortunately, 32bit WinXP does not do this (does 32bit Vista?). Linux lets you do this by default, but you will be limited to 4gb areas, even with mmap() since pointers are still 32bit.
What you should do though, is let the operating system take care of the memory management for you. Get in an environment that can handle that much RAM, then read the XML file(s) into (a) data structure(s), and let it allocate the space for you. Then operate on the data structure in memory, instead of operating on the XML file itself.
Even in 64bit systems though, you're not going to have a lot of control over what portions of your program actually sit in RAM, in Cache, or are paged to disk, at least in most instances, since the OS and the MMU handle this themselves.