Malloc fails to allocate memory of 1 GiB on Windows - c++

I am trying to allocate 1 GiB of memory using malloc() on Windows and it fails. I know malloc can fail. What is the best way to allocate 1 GiB of memory?

If you are building a 32-bit (x86) application, you are unlikely to be able to allocate a contiguous 1 GB chunk of memory (and certainly can't allocate 2 GB). As to why this happens, see the venerable presentation "Why Your Windows Game Won't Run In 2,147,352,576 Bytes" (Gamefest 2007) attached to this blog post.
You should build your application as a native x64 application instead.
You could enable /LARGEADDRESSAWARE and stick with a 32-bit application on Windows x64, but that setup has a number of quirks and may limit which 3rd-party support libraries you can use. A better solution is to go x64 native if possible.

Use the /LARGEADDRESSAWARE linker flag to tell Windows that you're not doing funny things with addresses. For a 32-bit process running on 64-bit Windows, this unlocks an extra 2 GB of address space (4 GB in total instead of 2 GB).
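Whatever you do, check the result before using it. A minimal sketch (sizes are just for illustration) that makes the failure visible instead of crashing on a null pointer:

    #include <cstdio>
    #include <cstdlib>

    int main() {
        const size_t oneGiB = 1024u * 1024u * 1024u;  // 1 GiB
        void* block = std::malloc(oneGiB);
        if (block == nullptr) {
            // In a 32-bit process this often fails even with plenty of free RAM:
            // the 2 GB user address space is fragmented by DLLs and the heap.
            std::puts("malloc(1 GiB) failed");
            return 1;
        }
        std::puts("malloc(1 GiB) succeeded");
        std::free(block);
        return 0;
    }

Built with a 64-bit toolchain, the allocation should succeed comfortably; built as 32-bit x86, it will frequently fail, which is the behavior described above.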

Related

maximum memory allocation in C++Builder 6

I am writing an application in C++Builder 6 Enterprise.
The maximum memory the software allows me to reserve is around 870 MB, no more. The physical memory available on the system is 8 GB and the PC is running Windows 7.
Immediately after a memory allocation statement like malloc(870000000) is executed, Task Manager says the memory used by the whole system is 2.5 GB.
My question is: why can't I allocate memory up to the full amount available?
C++Builder 6 was released in 2002 and can only produce 32-bit apps. The ability to produce 64-bit apps was added in C++Builder XE3 in 2012.
A 32-bit app cannot address more than 4 GB, no matter what.
Apps written in C++Builder 6 are not Large Address Aware (and it is not safe to manually mark them as such, as the RTL and memory manager are not LAA-compatible), so the most memory they can hope to access is 2 GB (the other 2 GB is reserved for Windows to use).
When you ask malloc() to allocate ~830 MB (not 870 MB, which would be 912261120 bytes rather than 870000000), you are asking it to allocate one contiguous block of memory, which is likely to fail in a non-trivial app.
Even if the app were Large Address Aware, that would raise the accessible memory to only 3 GB on 32-bit Windows (and only if the /3GB flag is enabled at Windows startup), and 4 GB on 64-bit Windows.
So you will never get a 32-bit app to allocate anywhere close to the full 8 GB. You need a 64-bit app for that.
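To see how much contiguous address space the process really has left, a small probe like this (a sketch, not part of the original answer) can binary-search for the largest single block malloc() will still return:

    #include <cstdio>
    #include <cstdlib>

    // Binary-search the largest single block malloc() can return.
    // In a 32-bit, non-LAA process this typically lands well under 2 GB.
    int main() {
        size_t lo = 0;                    // largest size known to succeed
        size_t hi = 0x7FFFFFFFu;          // upper bound: 2 GB - 1
        const size_t step = 1024 * 1024;  // stop at 1 MiB resolution
        while (hi - lo > step) {
            size_t mid = lo + (hi - lo) / 2;
            void* p = std::malloc(mid);
            if (p) { std::free(p); lo = mid; }
            else   { hi = mid; }
        }
        std::printf("largest contiguous malloc: ~%u MiB\n",
                    (unsigned)(lo / (1024 * 1024)));
        return 0;
    }

Freeing and re-allocating perturbs the heap a little, so treat the number as an estimate; it still shows vividly why one ~830 MB block is a tall order for a 32-bit process.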

Qt Creator - calloc fails with large memory

I have a problem with Qt Creator, or one of its components.
I have a program which needs lots of memory (about 4 GBytes), and I use calloc to allocate it. If I compile the C code with mingw/gcc (without the Qt framework) it works, but if I compile it within Qt Creator (with the C code embedded in the Qt framework using C++), using the mingw/gcc toolchain, calloc returns a null pointer.
I already searched and found the .pro file option QMAKE_LFLAGS += -Wl,--large-address-aware, which worked for some cases (around 3.5 GBytes), but if I go above 4 GBytes it only works with the C code compiled with gcc, not with Qt.
How can I allocate the needed amount of memory using calloc when compiling with Qt Creator?
So your standalone mingw/gcc tool chain builds 64-bit applications for you. The possible size of memory that can be allocated by a 64-bit application is 2^64 bytes, which far exceeds 4 GB. But Qt Creator (if you installed it from the QtSDK and did not reconfigure it manually) uses Qt's tool chain, which builds 32-bit applications. You can theoretically allocate 4 GB of memory in a 32-bit application, but do not forget that all libraries will also be loaded into that memory. In practice, you can allocate about 3 GB of memory, and not in one contiguous chunk.
You have 3 ways to solve your problem:
Reconsider your algorithm. Do not allocate 4 GB of RAM; use smarter data structures, a disk cache, etc. I suspect that if your problem actually required more than 4 GB of memory to solve, you wouldn't be asking this question.
Separate your Qt code from your C program. Then you can still use a 64-bit-target compiler for the C program and a 32-bit-target compiler for the Qt/C++ part. You can communicate with your C program through any interprocess communication mechanism (standard input/output streams are often enough; see the sketch after this list).
Move to 64-bit, i.e. use a 64-bit-target compiler for both the C and C++ code. But it is not as simple as one might think: you'll need to rebuild Qt in 64-bit mode. It is possible with some modules turned off and some code fixups (I've tried it once), but 64-bit Windows is not officially supported.
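For the second option, a rough sketch of the Qt side (the worker name worker64.exe and the one-line request are made up for illustration): the 32-bit Qt GUI launches the 64-bit C worker with QProcess and talks to it over standard input/output:

    #include <QCoreApplication>
    #include <QProcess>
    #include <QByteArray>
    #include <QDebug>

    int main(int argc, char* argv[]) {
        QCoreApplication app(argc, argv);

        QProcess worker;
        worker.start("worker64.exe");           // 64-bit helper, built separately
        if (!worker.waitForStarted())
            return 1;

        worker.write("allocate-and-compute\n"); // made-up request protocol
        worker.closeWriteChannel();             // signal end of input

        if (!worker.waitForFinished(-1))        // block until the worker exits
            return 1;

        QByteArray result = worker.readAllStandardOutput();
        qDebug() << "worker replied:" << result;
        return 0;
    }

The worker itself stays plain C, built with the 64-bit gcc, reading requests from stdin and writing results to stdout.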

Long run time for c++ program in cygwin compared to linux

I have a C++ program which takes a really long time to run under Cygwin compared to a quick turnaround on a Linux machine. I thought it could be a memory issue and tried to print the memory used, and this is what I see:
Linux
virtual memory: 5072 KB, Resident set size (RSS) : 1064 KB
Cygwin
virtual memory: 7672 KB, Resident set size (RSS) : 108928 KB
Can anyone help me understand what causes this difference? Cygwin is running on a laptop with 64-bit Windows and 3 GB of memory. There is some old C code in the program which does malloc. Would converting these calls to standard C++ containers help?
Cygwin provides a POSIX compatibility layer on top of Windows. That is bound to be slower than code built against the native OS CRT.
If your code is Standard C or C++, recompile it with MSVC or MinGW/GCC and then compare it.
On another note, malloc vs. new is a non-issue: heap allocation is expensive either way.
What might be important is that Windows heap allocation is in general more expensive than Linux's implementation. The effect of this difference depends on your code.
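A toy way to see that difference (a sketch; absolute numbers will vary wildly with the allocator and build flags) is to time a burst of small allocations and run the same source under both environments:

    #include <chrono>
    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    int main() {
        const int count = 1000000;            // one million small allocations
        std::vector<void*> blocks;
        blocks.reserve(count);

        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < count; ++i)
            blocks.push_back(std::malloc(64));
        auto t1 = std::chrono::steady_clock::now();

        for (void* p : blocks)
            std::free(p);

        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0);
        std::printf("%d x 64-byte mallocs took %lld ms\n",
                    count, (long long)ms.count());
        return 0;
    }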
As rubenvb says you can't really say without seeing the code - but:
The amount of memory is irrelevant. It may be that either the Cygwin launcher or the OS decides to hand a lot of memory to the Cygwin job because that memory isn't otherwise being used, so future memory allocations by the Cygwin app will be quicker. There is also an issue with how Linux reports memory use: it does optimistic allocation, so if you allocate, say, 1 GB of memory, that memory isn't actually committed to the process until it's used, and the task won't show as using 1 GB.
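A small Linux-side demonstration of that optimistic allocation (a sketch; check the RSS from another shell while it pauses):

    #include <cstdio>
    #include <cstdlib>
    #include <cstring>
    #include <unistd.h>  // getpid (POSIX; this is the Linux half of the comparison)

    int main() {
        const size_t oneGiB = 1024ul * 1024ul * 1024ul;
        char* p = static_cast<char*>(std::malloc(oneGiB));
        if (!p) return 1;

        // Pages are not committed yet, so the RSS stays small.
        std::printf("allocated 1 GiB; check RSS now:  ps -o rss= -p %d\n", (int)getpid());
        std::getchar();

        std::memset(p, 1, oneGiB);  // touching every page commits it
        std::printf("touched 1 GiB; RSS should now be ~1 GiB\n");
        std::getchar();

        std::free(p);
        return 0;
    }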
There are some tasks which are very cheap on a Unix system but very slow on Windows. The most notorious is fork(), which is very common in Unix apps but is a bad idea on Windows.

Address Windowing Extensions

I have a 32-bit application with very large memory requirements.
I noticed that there is something called Address Windowing Extensions (AWE).
However, I haven't found much information on how to use it, or on what disadvantages and problems one might run into while using it.
It shouldn't work on 64-bit versions of Windows (see http://msdn.microsoft.com/en-us/library/aa366778.aspx : Intel's and AMD's specifications of PAE do support the x86-64 architecture, but the software layer of Microsoft's PAE, the API called AWE, is not supported on 64-bit editions of Windows, so 64-bit Windows Vista cannot give more than 4 GiB of RAM to a 32-bit application).
Even on 32-bit Windows there is a "license" limit on the amount of usable memory (the same page shows all the limits).
And clearly it's complex to program :-) It's like using EMS on the old 8086.
Well, the truth is that you can use AWE from a 32-bit application running on a 64-bit Windows OS, and you don't need PAE. For example, MS SQL Server (before the 2012 version) can be configured in this mode.
But unless you have very specific requirements, porting to 64-bit is probably the far better option.
There are several disadvantages:
You need to run under a user account that holds SeLockMemoryPrivilege.
The memory cannot be shared with other processes. It is allocated in physical memory (with AllocateUserPhysicalPages), leaving less memory for the OS and other applications.
You need a virtual address range in order to access such memory, so with the LARGE_ADDRESS_AWARE flag you can have a memory window of at most 4 GiB.
If you want to access more than 4 GiB, you have to map/unmap those physical pages (with MapUserPhysicalPages).
This article from 1999 explains how to use the API.
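For reference, a heavily trimmed sketch of the call sequence those points describe (error handling mostly omitted; the process token must already hold SeLockMemoryPrivilege, which normally means AdjustTokenPrivileges plus the right entry in the local security policy):

    #include <windows.h>
    #include <cstdio>

    int main() {
        SYSTEM_INFO si;
        GetSystemInfo(&si);

        const SIZE_T windowBytes = 64 * 1024 * 1024;       // 64 MiB window
        ULONG_PTR pageCount = windowBytes / si.dwPageSize;
        ULONG_PTR* pfns = new ULONG_PTR[pageCount];

        // 1. Grab physical pages (fails without SeLockMemoryPrivilege).
        if (!AllocateUserPhysicalPages(GetCurrentProcess(), &pageCount, pfns)) {
            std::printf("AllocateUserPhysicalPages failed: %lu\n", GetLastError());
            return 1;
        }

        // 2. Reserve a virtual-address window tagged for AWE.
        void* window = VirtualAlloc(nullptr, windowBytes,
                                    MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);

        // 3. Map the physical pages into the window. To reach more physical
        //    memory than the address space can hold, remap different runs of
        //    pages into this same window as needed.
        if (window && MapUserPhysicalPages(window, pageCount, pfns))
            std::printf("AWE window mapped at %p\n", window);

        // Teardown: unmap, free the physical pages, release the window.
        MapUserPhysicalPages(window, pageCount, nullptr);
        FreeUserPhysicalPages(GetCurrentProcess(), &pageCount, pfns);
        VirtualFree(window, 0, MEM_RELEASE);
        delete[] pfns;
        return 0;
    }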

WIN32 memory issue (differences between debug/release)

I'm currently working on a legacy app (Win32, Visual C++ 2005) that allocates memory using LocalAlloc (in a supplied library I can't change). The app keeps very large state in fixed memory (created at the start with multiple calls to LocalAlloc(LPTR, size)). I notice that in release mode I run out of memory at about 1.8 GB, but in debug mode it happily goes on to over 3.8 GB. I'm running XP64 with the /3GB switch. I need to increase the memory used in the app, and I'm hitting the memory limit in release (debug works OK). Any ideas?
You probably have the Debug configuration linking with /LARGEADDRESSAWARE and the Release configuration linking with /LARGEADDRESSAWARE:NO (or missing altogether).
Check Linker->System->Enable Large Addresses in the project's configuration properties.
Sounds like your Release build is also compiled as x86. If not, then there must be something in your code which treats pointers as signed 32-bit integers, and this code is only active in Release.
How does running out of memory manifest itself?
Also, there is no reason to use the /3GB flag on XP64 when running 64-bit applications: it doesn't change anything in that scenario.
One suggestion: have a look at the base addresses of the DLLs that get loaded into the process space in release and debug mode, and see if there is much difference. It's possible that, in the release case, there are DLLs loaded at addresses such that, while there's enough free space in total to satisfy a LocalAlloc() call, there isn't enough contiguous address space to satisfy it. (For a contrived example, suppose there were DLLs loaded at 0x40000000 (1 GB), 0x80000000 (2 GB), and 0xC0000000 (3 GB). Even if these DLLs were really small, the process couldn't allocate more than 1 GB at a time, as no contiguous block of free address space that big would be left.)
You could also get a variation on this problem if the memory allocations happen in a different order in debug and release.
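To chase that down, a sketch along these lines (MSVC; links against psapi) prints each loaded module's base address and size so the debug and release layouts can be compared; for a non-invasive view, the Sysinternals VMMap tool shows the same information:

    #include <windows.h>
    #include <psapi.h>
    #include <cstdio>

    #pragma comment(lib, "psapi.lib")  // EnumProcessModules & friends

    int main() {
        HMODULE mods[1024];
        DWORD bytesNeeded = 0;
        HANDLE self = GetCurrentProcess();

        if (!EnumProcessModules(self, mods, sizeof(mods), &bytesNeeded))
            return 1;

        const DWORD count = bytesNeeded / sizeof(HMODULE);
        for (DWORD i = 0; i < count; ++i) {
            char name[MAX_PATH] = "";
            MODULEINFO info = {};
            if (GetModuleFileNameExA(self, mods[i], name, MAX_PATH) &&
                GetModuleInformation(self, mods[i], &info, sizeof(info))) {
                // Gaps between base addresses show where the space is cut up.
                std::printf("%p  %7lu KB  %s\n", info.lpBaseOfDll,
                            (unsigned long)(info.SizeOfImage / 1024), name);
            }
        }
        return 0;
    }

Run it (or an equivalent snippet) inside the process in both configurations and compare the layouts.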