I have a C++ app which generates raw bitmap images at runtime and pushes them into a temporary std::vector, allocating the memory dynamically on each write; the vector is then used to write the images to file before the program exits. Currently the bitmap size is 1280x720, and I am getting a "bad allocation" error after frame 650 (+- 3 frames). It is pretty clear to me that the cause is running out of memory: if I use a smaller size (let's say 300x200) I manage to store all 950 frames fine.

It is strange, though, because my machine has 16 GB of RAM, and in Task Manager the RAM display still shows a lot of free (dark green) space, with only about 5 GB in use at most, roughly a third of what is available. I am running Windows 7 64-bit with 16 GB of RAM and an Intel i7 CPU, and I am debugging the program in VS2012. Is it possible the OS restricts dynamic allocation to some arbitrary size? If yes, how can I lift that restriction?
The type of data is bytes (unsigned chars). And yes, as someone mentioned in the answer below, I compile for 32-bit.
1280 x 720 x 3 (bytes/pixel) = 2764800 bytes/image = 2.64 MB/image (I'm supposing 24 bpp images here)
2.64 MB/image x 650 images = 1713.87 MB, really close to the dreaded 2 GB boundary. This makes me think that you are running on a 64-bit OS, but your application is compiled as a 32-bit application without the /LARGEADDRESSAWARE linker flag, so it has only a 2 GB virtual address space available.[1]
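A quick sketch of that arithmetic (the 24 bpp and ~650-frame figures come from the question; this is just a back-of-the-envelope check, not part of the original answer):

```cpp
#include <cstdint>
#include <iostream>

int main() {
    // Assumed frame geometry from the question: 1280x720 at 24 bpp (3 bytes/pixel).
    const std::uint64_t bytesPerFrame = 1280ull * 720ull * 3ull;  // 2,764,800 bytes
    const std::uint64_t frames        = 650;                      // frame where bad_alloc appears
    const std::uint64_t totalBytes    = bytesPerFrame * frames;

    std::cout << "per frame: " << bytesPerFrame / (1024.0 * 1024.0) << " MB\n";
    std::cout << "total:     " << totalBytes    / (1024.0 * 1024.0) << " MB\n";
    // Prints roughly 2.64 MB per frame and about 1714 MB total, just below the
    // 2 GB user address space of a plain 32-bit process.
}
```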
To easily exploit the physical RAM available on your machine, compile your program as a 64-bit application (which makes the virtual address space limitations essentially irrelevant). Other methods are quite a bit more complicated (usually they involve managing "sliding windows" of memory yourself).
Another option is to link your application with the /LARGEADDRESSAWARE flag, but you will only actually get more memory on 32-bit systems booted with the /3GB kernel option or on 64-bit systems (1 GB more and 2 GB more, respectively); also, since the high bit of an address can then be set, you have to be careful with what you do with pointers (subtraction and comparisons can get tricky).
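To illustrate that pointer caveat, here is a small sketch of what goes wrong when addresses with the high bit set are squeezed into signed 32-bit integers; the addresses are made up for the example:

```cpp
#include <cstdint>
#include <iostream>

int main() {
    // Made-up addresses straddling the 2 GB mark, the kind a 32-bit
    // /LARGEADDRESSAWARE process can legitimately be handed.
    const std::uintptr_t low  = 0x7FFF0000u;
    const std::uintptr_t high = 0x80010000u;

    // Squeezing such an address into a signed 32-bit integer makes it "negative",
    // so comparisons and offsets computed that way silently go wrong.
    const std::int32_t highAsSigned = static_cast<std::int32_t>(high);
    std::cout << "high address seen as int32: " << highAsSigned << "\n";  // negative

    // Safe: keep addresses in pointers or uintptr_t and subtract as unsigned.
    std::cout << "distance in bytes: " << (high - low) << "\n";           // 0x20000
}
```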
[1] Actually, 32-bit pointers can address a full 4 GB virtual address space, but the upper half is reserved for the system by default.
Related
I have code which is 32-bit, and I think the compiler is 32-bit too. When I compile my C++ code, the compilation needs more than 2 GB of memory, and as per my understanding no process on a 32-bit system can use more than 2 GB.
Any suggestions on how I can achieve this? I found a lot of posts about it, but they
are not helpful, as they suggest adding swap. I already have 8 GB of RAM, so my problem is not available memory; it is the size of the compiling process, which cannot grow beyond 2 GB.
Even though I have 8 GB of RAM, I have tried adding swap, and that is not working either.
On 32-bit Windows, the maximum amount of addressable memory is 4 GB. By default, this address space is separated into kernel memory and process memory, each 2 GB large. Most programs don't need more than 2 GB of memory, but if yours does, you can enlarge the process portion by specifying the /3GB boot switch, leaving less memory for the kernel.
Read here for more information: https://msdn.microsoft.com/en-us/library/windows/hardware/ff556232(v=vs.85).aspx
Edit: Keep in mind that if you want to make use of this additional memory, you also need to link your program with the /LARGEADDRESSAWARE switch. That sets a flag in the executable's PE header, making Windows aware that your program can handle more than 2 GB of memory.
Since you stated you have 8GB of RAM, I am presuming your OS and CPU are actually 64-bit. So you are asking how to make a 32-bit program access more than 2GB of virtual address space, on a 64-bit OS, i.e. running under WOW64.
In that case, using the /LARGEADDRESSAWARE linker option in Visual Studio will give your app 4GB of virtual address space, under WOW64. You won't see any benefit in 32-bit Windows, unless you force your users to boot their OS with a certain flag.
I believe your app doesn't really need more than 2GB of RAM, but it's impossible to tell without knowing any details.
In any case, the one correct answer is: switch to a 64-bit app, which will get you 8TB of virtual address space. That's 8 terabytes.
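If you want to see which case you are in at runtime, one option is to query the user-mode address range with GetSystemInfo; a minimal sketch (the commented values are typical, not guaranteed):

```cpp
#include <windows.h>
#include <cstdio>

int main() {
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    // Typical values: ~0x7FFEFFFF for a plain 32-bit build, ~0xFFFEFFFF for a
    // 32-bit /LARGEADDRESSAWARE build under WOW64, and far higher for a 64-bit build.
    std::printf("user address range: %p - %p\n",
                si.lpMinimumApplicationAddress,
                si.lpMaximumApplicationAddress);
}
```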
My professor said that normally we can only use about 2 GB out of 4 GB of RAM, because the other 2 GB is used by the OS. However, when running some tests, I see that within the 4 GB virtual memory space of a process I can only allocate a maximum of just under 2 GB using the VirtualAlloc() function. Why is that? (I was expecting it to be more than 3 GB.)
As far as I know, the stack, data, and code segments only use a small amount of memory. One of my friends told me that the other 2 GB is used by the OS, just like the professor said. However, I think the professor meant 2 GB of physical memory, not 2 GB of the virtual memory of this process.
Could anyone explain what happens here? Thanks.
Some information:
Physical memory: 4GB.
Virtual memory: 4GB.
OS: Windows 10.
Your professor is correct - 2 GB of your virtual memory are kernel memory.
This way, when a context switch occurs, those 2 GB of kernel mappings can stay in place and only the user half of the address space needs to change, which helps performance.
You can also see here an explanation by Microsoft, including how to increase the user portion to 3 GB.
By the way, the situation is different in 64-bit machines, where the virtual memory is much larger.
It does not have anything to do with RAM; the "virtual" in VirtualAlloc() tells no lies. Sure, the upper 2 GB is reserved for the OS; the biggest chunks it needs are the file system cache and the video memory aperture. The latter is the bigger reason why the /3GB boot option no longer works. As you found out, you can never get the full 2 GB: your program needs address space as well, and it is always first in line. It got its share when it was loaded by the OS loader; what is left over can be divvied up by VirtualAlloc.
Usually well less than 2 GB, since the address space tends to get fragmented by loaded DLLs. Beware that some may be present even if you did not link their import libraries; anti-malware and cloud-storage utilities may inject them. Any heap allocations made by your program also tend to cause splits.
These concerns are getting pretty dated; all modern machines boot a 64-bit OS. A 32-bit program now runs in an emulation layer, and the upper range is no longer needed by the OS, so you can now get a lot closer to 4 GB by linking with the /LARGEADDRESSAWARE linker option. That option by itself is a pretty good hint as to why splitting up the address space like that was originally considered a good idea; it is also the approach taken in 64-bit OSes.
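If you want to see that fragmentation for yourself, here is a rough sketch that walks the address space with VirtualQuery and reports the largest free region (illustrative only):

```cpp
#include <windows.h>
#include <iostream>

int main() {
    MEMORY_BASIC_INFORMATION mbi = {};
    SIZE_T largestFree = 0;
    const char* p = nullptr;

    // Walk the user address space one region at a time and remember the
    // biggest free (unreserved) region seen.
    while (VirtualQuery(p, &mbi, sizeof(mbi)) == sizeof(mbi)) {
        if (mbi.State == MEM_FREE && mbi.RegionSize > largestFree)
            largestFree = mbi.RegionSize;
        p = static_cast<const char*>(mbi.BaseAddress) + mbi.RegionSize;
    }
    std::cout << "largest free block: " << largestFree / (1024 * 1024) << " MB\n";
}
```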
I get this question all the time from my users; unfortunately I have not found good links about x64 (x86 is a different story).
What is the maximum memory available to an application on 64-bit Windows:
C++ application
.Net application
.Net application using C++ libraries
Application is running on Windows 2008/2012 server
Application is running on Windows 7/8
The total amount would be - in theory - a bit over 18 quintillion (2^64, or 18 billion billion) bytes, or 18 billion gigabytes, assuming addresses are considered to be unsigned. If you limit yourself to a signed 64-bit integer, then you're looking at half of that. Oh, and don't forget to subtract memory that's going to be reserved for hardware, like video RAM, address space for buses, etc.
But even these numbers aren't necessarily the maximum (at least in theory), because there are additional tricks you're able to pull off (like using Physical Address Extension to access more than 4 GB of physical memory on 32-bit systems).
So, essentially as the short answer: 64 bit allows you to address and use all the memory your money can buy.
Unfortunately there are most likely hardware and software limits that are much lower, for example the maximum amount of memory usable by your mainboard (depending on the age of the board, usually 8 or 16 GB, sometimes 32 GB). As for Windows itself, the maximum supported amount varies greatly, based on the architecture and version you're running.
I've run into an odd problem: my process cannot allocate more than what seems to be slightly below 1 GiB. The Windows Task Manager "Mem Usage" column shows values close to 1 GiB when my software throws a bad_alloc exception. Yes, I've checked that the value passed to the memory allocation is sensible (no race condition or corruption exists that would make this fail). And yes, I need all this memory and there is no way around it (it's a buffer for images, which cannot be compressed any further).
I'm not trying to allocate the whole 1 GiB in one go; there are a few allocations of around 300 MiB each. Would this cause problems? (I'll try to see whether making more, smaller allocations works any better.) Is there some compiler switch or something else I must set in order to get past 1 GiB? I've seen others complaining about the 2 GiB limit, which would be fine for me; I just need a little bit more :). I'm using VS 2005 with SP1, running on 32-bit XP, and the code is C++.
On a 32-bit OS, a process has a 4GB address space in total.
On Windows, half of this is off-limits, so your process has 2GB.
That 2GB is a contiguous range of addresses, but it gets fragmented. Your executable is loaded in at one address, each DLL is loaded at another address, then there's the stack, heap allocations and so on. So while your process probably has enough free address space in total, there are no contiguous blocks large enough to fulfill your requests for memory. Making smaller allocations will probably solve it.
If your application is linked with the /LARGEADDRESSAWARE flag, it will be allowed to use as much of the remaining 2GB as Windows can spare (and how much that is depends on your platform and environment):
for 32-bit code running on a 64-bit OS, you'll get a full 4-GB address space
for 32-bit code running on a 32-bit OS without the /3GB boot switch, the flag means nothing at all
for 32-bit code running on a 32-bit OS with the /3GB boot switch, you'll get 3GB of address space.
So really, setting the flag is always a good idea if your application can handle it (it's basically a capability flag: it tells Windows that the application can handle more memory, so if Windows can provide it, it should go ahead and give the process as large an address space as possible), but you probably can't rely on it having an effect. Unless you're on a 64-bit OS, it's unlikely to buy you much (on 32-bit the /3GB boot switch is necessary, and it has been known to cause problems with drivers, especially video drivers).
Allocating big chunks of contiguous memory is always a problem.
You are much more likely to get the memory if you ask for it in smaller chunks.
You should redesign your memory structures.
You are right to suspect the larger 300MB allocations. Your process will be able to get close to 2GB (3GB if you use the /3GB boot.ini switch and the /LARGEADDRESSAWARE link flag), but not as a single large contiguous block.
Typical solutions for this are to break up the requests into tiles or strips of fixed size (say 256x256x4 bytes) and write an intermediate class to hide this representation detail.
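For illustration, a minimal sketch of such an intermediate class; the class name and the 256x256 RGBA tile size are made up for the example:

```cpp
#include <cstddef>
#include <vector>

// Stores a large image as independently allocated fixed-size tiles, so no
// single contiguous block bigger than one tile is ever requested.
class TiledImage {
public:
    TiledImage(std::size_t width, std::size_t height)
        : tilesX_((width  + TileDim - 1) / TileDim),
          tilesY_((height + TileDim - 1) / TileDim),
          tiles_(tilesX_ * tilesY_,
                 std::vector<unsigned char>(TileDim * TileDim * 4)) {}  // 256 KB per tile

    unsigned char& pixel(std::size_t x, std::size_t y, std::size_t channel) {
        std::vector<unsigned char>& tile = tiles_[(y / TileDim) * tilesX_ + (x / TileDim)];
        return tile[((y % TileDim) * TileDim + (x % TileDim)) * 4 + channel];
    }

private:
    static const std::size_t TileDim = 256;  // 256x256 pixels, 4 bytes/pixel
    std::size_t tilesX_, tilesY_;
    std::vector<std::vector<unsigned char>> tiles_;
};
```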
You can quickly verify this by writing a small allocation loop that allocates blocks of different sizes.
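For example, a rough version of that probe loop (using plain operator new here; VirtualAlloc would work just as well):

```cpp
#include <cstddef>
#include <iostream>
#include <new>

int main() {
    // Try progressively smaller blocks; the first size that succeeds is an
    // upper bound on the largest contiguous free region in the address space.
    for (std::size_t mb = 2048; mb > 0; mb /= 2) {
        unsigned char* p = new (std::nothrow) unsigned char[mb * 1024 * 1024];
        if (p) {
            std::cout << "largest block obtained: " << mb << " MB\n";
            delete[] p;
            break;
        }
    }
}
```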
You could also check this function from MSDN. 1GB rings a bell from here:
This parameter must be greater than or equal to 13 pages (for example, 53,248 on systems with a 4K page size), and less than the system-wide maximum (number of available pages minus 512 pages). The default size is 345 pages (for example, this is 1,413,120 bytes on systems with a 4K page size).
Here they mention that the default for this parameter is 345 pages, which with a 4K page size works out to about 1.4 MB (1,413,120 bytes).
When I have a few big allocs like that to do, I use the Windows function VirtualAlloc, to avoid stressing the default allocator.
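A minimal sketch of that approach for one of the ~300 MiB buffers mentioned in the question (the size is assumed):

```cpp
#include <windows.h>
#include <iostream>

int main() {
    const SIZE_T size = 300 * 1024 * 1024;  // one ~300 MiB image buffer (assumed size)

    // Reserve and commit whole pages straight from the OS, bypassing the CRT heap.
    void* buffer = VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    if (!buffer) {
        std::cout << "VirtualAlloc failed, error " << GetLastError() << "\n";
        return 1;
    }

    // ... fill and use the buffer ...

    VirtualFree(buffer, 0, MEM_RELEASE);  // size must be 0 when releasing
}
```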
Another way forward might be to use nedmalloc in your project.
With the very large amounts of RAM available these days, I was wondering: is it possible to allocate a single chunk of memory that is larger than 4GB? Or would I need to allocate a bunch of smaller chunks and handle switching between them?
Why???
I'm working on processing some OpenStreetMap XML data, and these files are huge. I'm currently streaming them in since I can't load them all in one chunk, but I just got curious about the upper limits on malloc or new.
Short answer: Not likely
In order for this to work, you absolutely would have to use a 64-bit processor.
Secondly, it would depend on the Operating System support for allocating more than 4G of RAM to a single process.
In theory, it would be possible, but you would have to read the documentation for the memory allocator. You would also be more susceptible to memory fragmentation issues.
There is good information on Windows memory management.
A primer on physical and virtual memory layouts
You would need a 64-bit CPU and O/S build and almost certainly enough memory to avoid thrashing your working set. A bit of background:
A 32 bit machine (by and large) has registers that can store one of 2^32 (4,294,967,296) unique values. This means that a 32-bit pointer can address any one of 2^32 unique memory locations, which is where the magic 4GB limit comes from.
Some 32-bit systems such as the SPARC V8 or Xeon have MMUs that pull a trick to allow more physical memory. This allows multiple processes to take up memory totalling more than 4GB in aggregate, but each process is limited to its own 32-bit virtual address space. For a single process looking at a virtual address space, only 2^32 distinct physical locations can be mapped by a 32-bit pointer.
I won't go into the details, but this presentation (warning: PowerPoint) describes how it works. Some operating systems have facilities (such as those described here - thanks to FP above) to manipulate the MMU and swap different physical locations into the virtual address space under user-level control.
The operating system and memory-mapped I/O will take up some of the virtual address space, so not all of that 4GB is necessarily available to the process. As an example, Windows defaults to taking 2GB of this, but can be set to take only 1GB if the /3GB switch is used at boot. This means that a single process on a 32-bit architecture of this sort can only build a contiguous data structure of somewhat less than 4GB in memory.
This means you would have to explicitly use the PAE facilities on Windows or equivalent facilities on Linux to manually swap in the overlays. This is not necessarily that hard, but it will take some time to get working.
Alternatively you can get a 64-bit box with lots of memory and these problems more or less go away. A 64 bit architecture with 64 bit pointers can build a contiguous data structure with as many as 2^64 (18,446,744,073,709,551,616) unique addresses, at least in theory. This allows larger contiguous data structures to be built and managed.
The advantage of memory-mapped files is that you can open a file much bigger than 4 GB (almost infinite on NTFS!) and have multiple <4 GB memory windows into it.
It's much more efficient than opening a file and reading it into memory; on most operating systems it uses the built-in paging support.
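A rough sketch of that windowing idea with the Win32 file-mapping APIs; the file name, offset, and window size are placeholders:

```cpp
#include <windows.h>

int main() {
    // Placeholder file name; imagine a multi-gigabyte data file here.
    HANDLE file = CreateFileA("huge_data.xml", GENERIC_READ, FILE_SHARE_READ,
                              nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    // A mapping object covering the whole file (size 0/0 means "use the file size").
    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (!mapping) { CloseHandle(file); return 1; }

    // Map a 256 MB window starting 4 GB into the file. The offset is passed as
    // high/low halves, so the file can be far larger than the process's address space.
    const ULONGLONG offset = 4ull * 1024 * 1024 * 1024;
    void* view = MapViewOfFile(mapping, FILE_MAP_READ,
                               static_cast<DWORD>(offset >> 32),
                               static_cast<DWORD>(offset & 0xFFFFFFFFu),
                               256 * 1024 * 1024);
    if (view) {
        // ... parse this slice, then unmap and map the next window ...
        UnmapViewOfFile(view);
    }
    CloseHandle(mapping);
    CloseHandle(file);
}
```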
This shouldn't be a problem with a 64-bit OS (and a machine that has that much memory).
If malloc can't cope then the OS will certainly provide APIs that allow you to allocate memory directly. Under Windows you can use the VirtualAlloc API.
It depends on which C compiler you're using, and on what platform (of course), but there's no fundamental reason why you cannot allocate the largest chunk of contiguously available memory - which may be less than you need. And of course you may have to be using a 64-bit system to address that much RAM...
see Malloc for history and details
call HeapMax in alloc.h to get the largest available block size
Have you considered using memory mapped files? Since you are loading in really huge files, it would seem that this might be the best way to go.
It depends on whether the OS will give you virtual address space that allows addressing memory above 4GB and whether the compiler supports allocating it using new/malloc.
For 32-bit Windows you won't be able to get a single chunk bigger than 4GB, as the pointer size is 32 bits, which limits your virtual address space to 4GB. (You could use Physical Address Extension to get more than 4GB of physical memory; however, I believe you have to map that memory into the 4GB virtual address space yourself.)
For 64-bit Windows, the VC++ compiler supports 64-bit pointers, with a theoretical virtual address space limit of 8TB.
I suspect the same applies for Linux/gcc - a 32-bit build does not allow it, whereas a 64-bit build does.
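For example, in a 64-bit build a single allocation past 4GB is unremarkable, provided there is enough physical RAM or page file to back it; a small sketch:

```cpp
#include <cstddef>
#include <iostream>
#include <new>

int main() {
    static_assert(sizeof(void*) == 8, "build this as a 64-bit target");

    const std::size_t size = 6ull * 1024 * 1024 * 1024;  // 6 GB in a single chunk
    unsigned char* block = new (std::nothrow) unsigned char[size];

    std::cout << (block ? "got a single 6 GB block\n"
                        : "allocation failed (not enough backing store?)\n");
    delete[] block;
}
```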
As Rob pointed out, VirtualAlloc on Windows is a good option for this, as is an anonymous file mapping. However, specifically with respect to your question of whether C or C++ can allocate it, the answer is NO, THIS IS NOT SUPPORTED, EVEN ON WIN7 RC 64.
In the PE/COFF specification for exe files, the field which specifies the HEAP reserve and HEAP commit is a 32-bit quantity. This is in line with the size limitations of the current heap implementation in the Windows CRT, which is just short of 4GB. So there is no way to allocate more than 4GB from C/C++ (technically, the OS support facilities of CreateFileMapping and VirtualAlloc/VirtualAllocExNuma etc. are not C or C++).
Also, be aware that there are underlying x86/amd64 ABI constructs known as page tables. These will in effect do what you are concerned about, allocating smaller chunks for your larger request; even though this happens in kernel memory, there is an effect on the overall system, as these tables are finite.
If you are allocating memory in such grandiose proportions, you would be well advised to allocate based on the allocation granularity (which VirtualAlloc enforces) and also to look into the optional flags or methods that enable larger pages.
4 KB pages were the initial page size on the 386; subsequently the Pentium added 4 MB pages. Today, AMD64 (see the Software Optimization Guide for AMD Family 10h Processors) has a maximum page table entry size of 1 GB. This means that for your case here, say you allocated just 4 GB, it would require only 4 unique entries in the kernel's page directory to locate, assign, and set permissions on your process's memory.
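As a hedged sketch of requesting large pages with VirtualAlloc (the calling account needs the "Lock pages in memory" privilege, which is omitted here, and the 1 GB figure is just an example):

```cpp
#include <windows.h>
#include <iostream>

int main() {
    // Minimum large-page size reported by the OS (commonly 2 MB on x64).
    const SIZE_T largePage = GetLargePageMinimum();
    if (largePage == 0) {
        std::cout << "large pages not supported\n";
        return 1;
    }

    // The request must be a multiple of the large-page size; 1 GB as an example.
    const SIZE_T size = (static_cast<SIZE_T>(1024) * 1024 * 1024 / largePage) * largePage;
    void* p = VirtualAlloc(nullptr, size,
                           MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES, PAGE_READWRITE);
    if (!p) {
        // Typically ERROR_PRIVILEGE_NOT_HELD unless "Lock pages in memory" is granted.
        std::cout << "large-page allocation failed, error " << GetLastError() << "\n";
        return 1;
    }
    VirtualFree(p, 0, MEM_RELEASE);
}
```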
Microsoft has also released this manual, which articulates some of the finer points of application memory and its use for the Vista/2008 platform and newer.
(The manual's table of contents covers the Memory Manager, virtual address space, ASLR, I/O bandwidth, SuperFetch, large file management, the advanced video model, NUMA support, large pages, system integrity, and guidance for hardware manufacturers, driver developers, application developers, and system administrators.)
If size_t is greater than 32 bits on your system, you've cleared the first hurdle. But the C and C++ standards aren't responsible for determining whether any particular call to new or malloc succeeds (except malloc with a 0 size). That depends entirely on the OS and the current state of the heap.
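That first hurdle is easy to check at compile time; a tiny sketch:

```cpp
#include <cstddef>

// If size_t cannot represent values above 4 GB, a single new/malloc call cannot
// even be asked for a chunk that large, regardless of OS support.
static_assert(sizeof(std::size_t) > 4,
              "size_t is 32 bits here; a single >4 GB allocation is impossible");
```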
Like everyone else said, getting a 64-bit machine is the way to go. But even on a 32-bit Intel machine, you can address more than 4 GB of memory if your OS and your CPU support PAE. Unfortunately, 32-bit WinXP does not do this (does 32-bit Vista?). Linux lets you do this by default, but you will still be limited to 4 GB areas, even with mmap(), since pointers are still 32-bit.
What you should do, though, is let the operating system take care of the memory management for you. Get into an environment that can handle that much RAM, then read the XML file(s) into data structure(s) and let it allocate the space for you. Then operate on the data structure in memory, instead of operating on the XML file itself.
Even on 64-bit systems, though, you're not going to have a lot of control over which portions of your program actually sit in RAM, in cache, or are paged to disk, at least in most instances, since the OS and the MMU handle this themselves.