My MFC-based application writes multiple files of different types to disk. I have to pre-allocate a fixed amount of disk space for the entire application so that other applications do not eat up my disk resources. Through Google, I figured out how to pre-allocate disk space for a single file, but not for multiple files.
See this answer Reserve disk space before writing a file for efficiency
I don't think it is possible to pre-allocate disk space at the application level (i.e. pre-allocate one chunk of space shared by multiple files). You have to allocate space for each file separately, e.g. with CFile::Seek or SetFilePointer.
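For a single file, a minimal sketch (the file name and size here are placeholders, not from the question) is to seek to the desired end and call SetEndOfFile; the MFC equivalent would be CFile::SetLength:

#include <windows.h>

// Reserve 'bytes' of disk space for one file by extending it up front.
bool PreallocateFile(const wchar_t* path, LONGLONG bytes)
{
    HANDLE h = CreateFileW(path, GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return false;

    LARGE_INTEGER size;
    size.QuadPart = bytes;
    bool ok = SetFilePointerEx(h, size, NULL, FILE_BEGIN)  // move past the current end
              && SetEndOfFile(h);                          // extend the file to that offset
    CloseHandle(h);
    return ok;
}

You would call something like this once per file your application is going to write; it reserves each file's space up front, but there is still no single shared pool.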
There is a big file that I need to sort quickly. I am going to process the file in parts that fit in RAM, to avoid (or at least reduce) use of the page file (next step: merge the parts). How can I use the maximum amount of RAM?
My solution: use WinAPI file memory mapping, but I don't know how to determine the largest part of the file that still fits in RAM (how do I determine the size)?
You can VirtualLock the pages you want to process. It locks the amount you need in physical memory (if there is enough), swapping other pages out to the paging file.
You can use the GlobalMemoryStatusEx function to determine how much memory your application can allocate without severely impacting other applications.
So you could map the file and lock the pages you are going to process.
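A rough sketch of that combination (the fraction of free RAM to use and the working-set adjustment are my assumptions, not part of the answer):

#include <windows.h>

// Map a view of an existing file mapping and pin it in physical memory.
void* MapAndLockChunk(HANDLE hMapping, ULONGLONG offset, SIZE_T* viewSize)
{
    MEMORYSTATUSEX ms = { sizeof(ms) };
    GlobalMemoryStatusEx(&ms);

    // Use only part of the free physical RAM so other applications keep running.
    SIZE_T want = (SIZE_T)(ms.ullAvailPhys / 2);

    // 'offset' must be a multiple of the system allocation granularity.
    void* view = MapViewOfFile(hMapping, FILE_MAP_READ,
                               (DWORD)(offset >> 32), (DWORD)(offset & 0xFFFFFFFF),
                               want);
    if (view == NULL)
        return NULL;

    // VirtualLock is limited by the working-set size, so raise it first.
    SetProcessWorkingSetSize(GetCurrentProcess(), want + (16 << 20), want + (32 << 20));
    if (!VirtualLock(view, want)) {
        UnmapViewOfFile(view);
        return NULL;
    }

    *viewSize = want;
    return view;
}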
I am writing a program that needs to traverse a large 40 GB binary file, but I only have 16 GB of physical RAM. A friend told me that I can use file mapping to alleviate this problem. I understand how to create a file mapping and read through a file mapping handle, and how file mapping maps parts of a file in persistent storage to different chunks of virtual memory for reading.
So if I am understanding this correctly, I can create a buffer of, say, 10 GB and read the first 10 GB of the file into this buffer. But when I have to read past the 10 GB mark in the file, will the OS fetch another block automatically for me, or do I have to do so manually in my code?
The functions you linked to aren't (directly) related to file mapping. They're used for regular file I/O.
To use traditional file I/O with a really large file, you could do as you described. You would open the file, create a buffer, and read a chunk of the file into the buffer. When you need to access a different part of the file, you read a different chunk into the buffer.
To use a file mapping, you use CreateFile, CreateFileMapping, and then MapViewOfFile. You don't (directly) create a buffer and read part of the file into it. Instead, you tell the system that you want to access a range of the file as though it were a range of addresses in your address space. Reads and writes to those addresses are turned into file I/O operations behind the scenes. In this approach you might still have to work in chunks: if the part of the file you need to access is not in the range you currently have mapped, you can create another view (and possibly close the previous one).
But note that I said address space, which is different from RAM. If you're building for 64-bit Windows, you can try to map the entire 40 GB file into your address space. The fact that you have only 16 GB of RAM won't stop you. There may be other problems at that size, but they won't be because of your RAM. If there are, you'll be back to managing the file in chunks as before.
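A minimal sketch of that sequence, assuming a 64-bit build (the file name and the 1 GB fallback chunk size are placeholders):

#include <windows.h>

int main()
{
    HANDLE hFile = CreateFileW(L"huge.bin", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    // A maximum size of 0/0 means "use the file's current size".
    HANDLE hMap = CreateFileMappingW(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    if (hMap == NULL) { CloseHandle(hFile); return 1; }

    // Try to map the entire 40 GB file into the address space at once.
    const unsigned char* whole =
        (const unsigned char*)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);

    if (whole != NULL) {
        // ... read through 'whole' like an ordinary array; the OS pages it in on demand ...
        UnmapViewOfFile(whole);
    } else {
        // Fall back to one view at a time. The offset must be a multiple of the
        // allocation granularity, and the last chunk must not run past end of file.
        ULONGLONG offset = 0;
        SIZE_T chunk = (SIZE_T)1 << 30;   // 1 GB per view
        void* view = MapViewOfFile(hMap, FILE_MAP_READ,
                                   (DWORD)(offset >> 32), (DWORD)(offset & 0xFFFFFFFF),
                                   chunk);
        // ... process the view, UnmapViewOfFile(view), advance 'offset', map the next range ...
        if (view) UnmapViewOfFile(view);
    }

    CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}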
I have a large matrix of values that takes up about 2GB of RAM.
I need to form a copy of this matrix, then the original can be swapped out to disk, to be loaded later. The contents of this matrix are important. Computing it initially is expensive, so you cannot easily throw it away and re-create it. It is faster to drop the matrix to disk, and then re-load it from disk, than it is to re-compute it from scratch.
Is there an easier or better way to designate a section of memory to be temporarily put on disk until next access than what I have, which is:
when the resource (the 2 GB matrix) is not needed:
- open a file
- write the matrix to disk
- free the memory
when the resource is needed:
- open the file
- read in the matrix
- delete the file from disk
I came across file mapping, but I'm not sure it is the right thing to use.
Have a look at Memory Mapped Files.
Memory-mapped files (MMFs) offer a unique memory management feature that allows applications to access files on disk in the same way they access dynamic memory—through pointers.
The operating system will very efficiently swap portions of the original matrix to/from disk.
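A hedged sketch of that idea for the 2 GB matrix (the backing file name and the double element type are assumptions):

#include <windows.h>

// Map a file-backed region big enough for the matrix; use the returned pointer
// like an ordinary array and let the OS page it to/from matrix.bin as needed.
double* MapMatrixFile(SIZE_T bytes, HANDLE* outFile, HANDLE* outMap)
{
    HANDLE hFile = CreateFileW(L"matrix.bin", GENERIC_READ | GENERIC_WRITE, 0,
                               NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return NULL;

    HANDLE hMap = CreateFileMappingW(hFile, NULL, PAGE_READWRITE,
                                     (DWORD)((ULONGLONG)bytes >> 32),
                                     (DWORD)(bytes & 0xFFFFFFFF), NULL);
    if (hMap == NULL) { CloseHandle(hFile); return NULL; }

    double* matrix = (double*)MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, bytes);
    if (matrix == NULL) { CloseHandle(hMap); CloseHandle(hFile); return NULL; }

    *outFile = hFile;
    *outMap = hMap;
    return matrix;
}

With this approach there is no explicit save/load step at all: when the matrix is not being used, its dirty pages are written back to matrix.bin and dropped from RAM under memory pressure, and touching the data later pages it back in.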
Assuming the matrix doesn't need to survive program restarts, compile your application as 64-bit and just leave the matrix in memory. The OS will automatically swap-out the least-used memory pages when under memory pressure.
However, even on mildly modern hardware you'll have much more than 2+2 GB [1] of RAM, and a very good chance everything will stay in RAM anyway.
[1] Original matrix + copy.
My System:
Physical memory: 3 GB
Windows XP Service Pack 3 (32-bit)
Swap file size: 30 GB
Goal: To find the largest possible memory map size I can allocate on my machine.
When I run the following code to allocate a 2 GB memory-mapped file, the call fails.
handle = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE | SEC_COMMIT, 0, INT_MAX, NULL);
I've been very puzzled by this, because I can allocate memory-mapped files up to the system swap file size of 30 GB by repeatedly calling CreateFileMapping with 100 MB at a time.
After restarting the machine and re-running the application that requests 2 GB from CreateFileMapping, it works and returns a valid handle. This leaves me confused about what is going on under the hood in Windows.
So the situation is this: I can create many small memory-mapped files, using up the entire system page file (30 GB), but when asking for a single allocation of 2 GB the call fails. After restarting the machine and running the same application, the call succeeds!
Some notes:
1) The memory-mapped file is not being loaded into the process's virtual address space; there is no view of the file yet.
2) The OS can allocate small 100 MB memory-mapped files up to the 30 GB of the system's page file!
Right now the only conclusion I can come to is that the Windows XP SP3 (32-bit) virtual memory manager cannot successfully reserve the requested 2 GB in the system page file and fails due to fragmentation of that file (it seems to need a contiguous reservation, even though the page file is managed in 4 KB pages). After a restart I assume the fragmentation is lower, allowing the same call to succeed and allocate a 2 GB memory-mapped file.
I've run some experiments. After running the machine for a day, I started a small application that would allocate a 300 MB memory-mapped file and then release it. It would then increase the size by 1 MB and try again. Eventually it stops at 700 MB and reports "insufficient system resources". I then went through and closed each running application; this in turn stopped the error messages, and it finally managed to allocate a memory-mapped file 3.5 GB in size!
So my question is: what is going on here? There must be some type of memory fragmentation happening internally in the virtual memory manager, because allocating 100 MB memory-mapped files will consume up to the 30 GB of the system page file (the commit limit).
Update
The conclusion is: if you're going to create a large memory-mapped file backed by the system page file with INVALID_HANDLE_VALUE, then the system page file (swap file) needs to be resized to the required size and be in a non-fragmented state for large allocations (> 2 GB)! Even then, under heavy I/O load it can still fail. To get around all these problems you can create your own file of the needed size (I used 1 TB) and memory-map that file instead.
Final Update
I ran the same tests on a Windows 7 box, and to my surprise it works every single time (up to the system page file size) without touching anything. So I guess this is just a bug: large memory-mapped allocations fail far more often on Windows XP than on Windows 7.
The problem is file fragmentation. Physical memory (RAM) has nothing to do with anything here. In a virtual memory system, 'memory' is allocated from the file system. Physical memory is just an optimization to speed access to memory.
When you request a memory-mapped file with write access, the system must have a file with contiguous pages free. The system swap file is often fragmented. If your disk drive is nicely defragmented, you should be able to create a large memory-mapped file using a file of your choice (not the system page file).
So if you really have to have a 2GB memory-mapped file, you need to create one on the drive at installation. This shifts the problem of creating a contiguous 2GB file to installation, but once created, you should be ok.
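A sketch of that workaround (the path and size are placeholders): create and pre-size your own backing file once, then map against it instead of passing INVALID_HANDLE_VALUE:

#include <windows.h>

HANDLE CreateBackingFileMapping(const wchar_t* path, ULONGLONG bytes)
{
    HANDLE hFile = CreateFileW(path, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                               OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return NULL;

    // Pre-size the file so the disk space is reserved up front (e.g. at installation).
    LARGE_INTEGER size;
    size.QuadPart = (LONGLONG)bytes;
    SetFilePointerEx(hFile, size, NULL, FILE_BEGIN);
    SetEndOfFile(hFile);

    // Map against our own file rather than the system page file.
    HANDLE hMap = CreateFileMappingW(hFile, NULL, PAGE_READWRITE,
                                     (DWORD)(bytes >> 32), (DWORD)(bytes & 0xFFFFFFFF), NULL);
    CloseHandle(hFile);   // the mapping object keeps its own reference to the file
    return hMap;
}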
So my question is: what is going on here? There must be some type of memory fragmentation happening internally in the virtual memory manager, because allocating 100 MB memory-mapped files will consume up to the 30 GB of the system page file (the commit limit).
Sounds about right. If you don't need large contiguous chunks of memory, don't ask for them if you can get the same amount of memory in smaller chunks.
To find the largest possible memory map size I can allocate on my machine.
Try it with size X.
If that fails, try with size X/2 and repeat.
This gets you a chunk at runtime, maybe not the exact largest possible chunk, but within a factor of 2.
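A small sketch of that probe, using page-file-backed mappings as in the question (the starting size and the 1 MB lower bound are arbitrary):

#include <windows.h>

SIZE_T LargestMappableSize(SIZE_T start)
{
    for (SIZE_T size = start; size >= (1 << 20); size /= 2) {
        HANDLE h = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL,
                                      PAGE_READWRITE | SEC_COMMIT,
                                      (DWORD)((ULONGLONG)size >> 32),
                                      (DWORD)(size & 0xFFFFFFFF), NULL);
        if (h != NULL) {
            CloseHandle(h);     // release the commit charge again
            return size;        // within a factor of 2 of the true maximum
        }
    }
    return 0;
}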
Let's take the position of a Windows developer.
Assume a user performs the following steps:
Create memory mapping.
Populate some memory with sensitive data
Unmap from file
Continue using memory
Windows needs to be able to unload these pages for critical tasks.
Resolution: mapped memory must be suitable for swapping. But that doesn't mean mapped memory will necessarily be swapped.
Platform: Linux; Arch: ARM
Programming language: C/C++
Objective: map a regular (let's say text) file to a pre-known location (a physical address) in RAM and pass that physical address to some other application. The size of the block I map at a time is 128 KB.
The way I am trying to go about it is:
The user-space process issues an ioctl call to ask a device driver to get a chunk of memory (RAM), calculate its physical address, and return it to user space.
The user-space process then needs to map the file to that physical address range.
I am not sure how to go about it. Any help is appreciated.
The issue with calling mmap on the file and then calculating the physical address is that pages are not in memory until someone accesses them, and the physical pages that are allocated might not be contiguous.
The other process that will actually access the file is a third-party vendor application. That application demands that, once we pass it the physical address, the file contents be present in contiguous memory.
How I am doing it right now:
The user process calls mmap on the device.
The device driver does a kmalloc, calculates the starting physical address, and maps the VMA to that physical address.
The user process then does a read on the file and copies the data into the address space obtained from the mmap.
Issue: a copy of the file exists in two locations in RAM, once where the read from disk lands and once where I copy it into the buffer obtained via mmap, plus the corresponding copying overhead.
In an ideal world I would like to load the file directly from disk to a known/predefined location.
"Mapping a file" implies using virtual addresses rather than physical, so that's not going to do what you want.
If you want to put the file contents into a contiguous block of physical memory, just use open() and read() once you have obtained the contiguous buffer.
Perhaps something like madvise() with MADV_SEQUENTIAL advice argument could help?
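A rough sketch of the open()/read() suggestion, assuming 'buf' is the contiguous buffer the driver handed back (already mapped into this process); the path and sizes are placeholders:

#include <fcntl.h>
#include <unistd.h>
#include <stddef.h>

// Read the file sequentially into the pre-obtained contiguous buffer.
ssize_t load_file_into_buffer(const char* path, char* buf, size_t bufsize)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    size_t total = 0;
    while (total < bufsize) {
        ssize_t n = read(fd, buf + total, bufsize - total);
        if (n <= 0) break;        // 0 = end of file, < 0 = error
        total += (size_t)n;
    }
    close(fd);
    return (ssize_t)total;        // bytes now sitting in contiguous memory
}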
Some things to consider:
How large is the file you're going to be mapping?
That might affect your ability to get a contiguous block of RAM, even if you were to take the kernel driver based approach.
For a kernel-driver-based approach, well-behaved drivers typically should not kmalloc() more than about 32 KB when they need a contiguous block of memory. Furthermore, you typically can't kmalloc() more than 2 MB at all (I've tried this :)). Is that going to be suitable for your needs?
If you need a really large chunk of memory, something like the kernel's alloc_bootmem() function could help, but it only works for static "built-in" drivers, not dynamically loadable ones.
Is there any way you can rework your design so that a large contiguous block of mapped memory isn't necessary?