Platofrm - Linux, Arch - ARM
Programming lang - C/C++
Objective - map a regular (let say text) file to a pre-known location (physical address) in ram and pass that physical address to some other application. Size of the block which i map at a time is 128K.
The way I am trying to go about is-
User space process issues the ioctl call to ask a device driver to get a chunk of memory (ram), calculated the physical address and return it to the user space.
User space process needs to maps the file to that physical address space
I am not sure how to go about it. Any help is appreciated. ???
Issue with mmap call on the file and then calculating the physical address is that, pages are not in memory till someone access them and physical memory pages allocated might not be contiguous.
The other process which will actually access the file is from third party vendor application. That application demands once we pass it the physical address, file contents needs to be present in contiguous memory.
How i am doing it right now --
User process call the mmap to device.
Device driver does a kmalloc, calculate the starting physical address and mmap the VMA to that physical address.
Now user process do a read on the file and copies it to the address space obtained during the mmap.
Issue - Copy of the file exist two location in the ram, one when read is done from disk and other when i copy it to the buffer obtained using mmap and corresponding copying overheads.
In a ideal world i would like to load the file directly from the disk to a known/predefined location.
"Mapping a file" implies using virtual addresses rather than physical, so that's not going to do what you want.
If you want to put the file contents into a contiguous block of physical memory, just use open() and read() once you have obtained the contiguous buffer.
Perhaps something like madvise() with MADV_SEQUENTIAL advice argument could help?
Some things to consider:
How large is the file you're going to be mapping?
That might affect your ability to get a contiguous block of RAM, even if you were to take the kernel driver based approach.
For a kernel driver based approach, well behaved drivers typically should not kmalloc(), e.g. to get a contiguous block of memory, more than 32KB. Furthermore, you typically can't kmalloc() more than 2MB (I've tried this :)). Is that going to be suitable for your needs?
If you need a really large chunk of memory something like the kernel's alloc_bootmem() function could help, but it only works for static "built-in" drivers, not dynamically loadable ones.
Is there any way you can rework your design so that a large contiguous block of mapped memory isn't necessary?
Related
POSIX environments provide at least two ways of accessing files. There's the standard system calls open(), read(), write(), and friends, but there's also the option of using mmap() to map the file into virtual memory.
When is it preferable to use one over the other? What're their individual advantages that merit including two interfaces?
mmap is great if you have multiple processes accessing data in a read only fashion from the same file, which is common in the kind of server systems I write. mmap allows all those processes to share the same physical memory pages, saving a lot of memory.
mmap also allows the operating system to optimize paging operations. For example, consider two programs; program A which reads in a 1MB file into a buffer creating with malloc, and program B which mmaps the 1MB file into memory. If the operating system has to swap part of A's memory out, it must write the contents of the buffer to swap before it can reuse the memory. In B's case any unmodified mmap'd pages can be reused immediately because the OS knows how to restore them from the existing file they were mmap'd from. (The OS can detect which pages are unmodified by initially marking writable mmap'd pages as read only and catching seg faults, similar to Copy on Write strategy).
mmap is also useful for inter process communication. You can mmap a file as read / write in the processes that need to communicate and then use synchronization primitives in the mmap'd region (this is what the MAP_HASSEMAPHORE flag is for).
One place mmap can be awkward is if you need to work with very large files on a 32 bit machine. This is because mmap has to find a contiguous block of addresses in your process's address space that is large enough to fit the entire range of the file being mapped. This can become a problem if your address space becomes fragmented, where you might have 2 GB of address space free, but no individual range of it can fit a 1 GB file mapping. In this case you may have to map the file in smaller chunks than you would like to make it fit.
Another potential awkwardness with mmap as a replacement for read / write is that you have to start your mapping on offsets of the page size. If you just want to get some data at offset X you will need to fixup that offset so it's compatible with mmap.
And finally, read / write are the only way you can work with some types of files. mmap can't be used on things like pipes and ttys.
One area where I found mmap() to not be an advantage was when reading small files (under 16K). The overhead of page faulting to read the whole file was very high compared with just doing a single read() system call. This is because the kernel can sometimes satisify a read entirely in your time slice, meaning your code doesn't switch away. With a page fault, it seemed more likely that another program would be scheduled, making the file operation have a higher latency.
mmap has the advantage when you have random access on big files. Another advantage is that you access it with memory operations (memcpy, pointer arithmetic), without bothering with the buffering. Normal I/O can sometimes be quite difficult when using buffers when you have structures bigger than your buffer. The code to handle that is often difficult to get right, mmap is generally easier. This said, there are certain traps when working with mmap.
As people have already mentioned, mmap is quite costly to set up, so it is worth using only for a given size (varying from machine to machine).
For pure sequential accesses to the file, it is also not always the better solution, though an appropriate call to madvise can mitigate the problem.
You have to be careful with alignment restrictions of your architecture(SPARC, itanium), with read/write IO the buffers are often properly aligned and do not trap when dereferencing a casted pointer.
You also have to be careful that you do not access outside of the map. It can easily happen if you use string functions on your map, and your file does not contain a \0 at the end. It will work most of the time when your file size is not a multiple of the page size as the last page is filled with 0 (the mapped area is always in the size of a multiple of your page size).
In addition to other nice answers, a quote from Linux system programming written by Google's expert Robert Love:
Advantages of mmap( )
Manipulating files via mmap( ) has a handful of advantages over the
standard read( ) and write( ) system calls. Among them are:
Reading from and writing to a memory-mapped file avoids the
extraneous copy that occurs when using the read( ) or write( ) system
calls, where the data must be copied to and from a user-space buffer.
Aside from any potential page faults, reading from and writing to a memory-mapped file does not incur any system call or context switch
overhead. It is as simple as accessing memory.
When multiple processes map the same object into memory, the data is shared among all the processes. Read-only and shared writable
mappings are shared in their entirety; private writable mappings have
their not-yet-COW (copy-on-write) pages shared.
Seeking around the mapping involves trivial pointer manipulations. There is no need for the lseek( ) system call.
For these reasons, mmap( ) is a smart choice for many applications.
Disadvantages of mmap( )
There are a few points to keep in mind when using mmap( ):
Memory mappings are always an integer number of pages in size. Thus, the difference between the size of the backing file and an
integer number of pages is "wasted" as slack space. For small files, a
significant percentage of the mapping may be wasted. For example, with
4 KB pages, a 7 byte mapping wastes 4,089 bytes.
The memory mappings must fit into the process' address space. With a 32-bit address space, a very large number of various-sized mappings
can result in fragmentation of the address space, making it hard to
find large free contiguous regions. This problem, of course, is much
less apparent with a 64-bit address space.
There is overhead in creating and maintaining the memory mappings and associated data structures inside the kernel. This overhead is
generally obviated by the elimination of the double copy mentioned in
the previous section, particularly for larger and frequently accessed
files.
For these reasons, the benefits of mmap( ) are most greatly realized
when the mapped file is large (and thus any wasted space is a small
percentage of the total mapping), or when the total size of the mapped
file is evenly divisible by the page size (and thus there is no wasted
space).
Memory mapping has a potential for a huge speed advantage compared to traditional IO. It lets the operating system read the data from the source file as the pages in the memory mapped file are touched. This works by creating faulting pages, which the OS detects and then the OS loads the corresponding data from the file automatically.
This works the same way as the paging mechanism and is usually optimized for high speed I/O by reading data on system page boundaries and sizes (usually 4K) - a size for which most file system caches are optimized to.
An advantage that isn't listed yet is the ability of mmap() to keep a read-only mapping as clean pages. If one allocates a buffer in the process's address space, then uses read() to fill the buffer from a file, the memory pages corresponding to that buffer are now dirty since they have been written to.
Dirty pages can not be dropped from RAM by the kernel. If there is swap space, then they can be paged out to swap. But this is costly and on some systems, such as small embedded devices with only flash memory, there is no swap at all. In that case, the buffer will be stuck in RAM until the process exits, or perhaps gives it back withmadvise().
Non written to mmap() pages are clean. If the kernel needs RAM, it can simply drop them and use the RAM the pages were in. If the process that had the mapping accesses it again, it cause a page fault the kernel re-loads the pages from the file they came from originally. The same way they were populated in the first place.
This doesn't require more than one process using the mapped file to be an advantage.
Cant really find any specifics on this, heres all I know about mmf's in windows:
Creating a memory mapped file in windows adds nothing to the apparent amount of memory a program uses
Creating a view to that file consumes memory equivalent to the view size
This looks rather backwards to me, since for one, I know that the mmf itself actually has memory...somewhere. If I write something in a mmf and destroy the view, the data is still there. Meanwhile, why does the view take any memory at all? Its just a pointer, no?
Then theres the weirdness with whats actually in the ram and whats on the disk. In large mmf's with a distributed looking access pattern, sometimes the speed is there and sometimes its not. I'm guessing some of it gets sometimes stored in the file if one is tied to it or the paging file but really, I have no clue.
Anyways, the problem that drove me to investigate this is that I have a ~2gb file that I want multiple programs to share. I can't create a 2gb view in each of them since I'm just "out of memory" so I have to create/destroy smaller ones. This creates a lot of overhead due to additional offset calculations and the creation of the view itself. Can anybody explain to me why it is like this?
On a demand-paged virtual memory operating system like Windows, the view of an MMF occupies address space. Just numbers to the processor, one for each 4096 bytes. You only start using RAM until you actually use the view. Reading or writing data. At which point you trigger a page fault and force the OS to map the virtual memory page to physical memory. The "demand-paged" part.
You can't get a single chunk of 2 GB of address space in a 32-bit process since there would not be room for anything else. The limit is the largest hole in the address space between other allocations for code and data, usually hovers around ~650 megabytes, give or take. You'll need to target x64. Or build an x86 program that's linked with /LARGEADDRESSAWARE and runs on a 64-bit operating system. A backdoor which is getting to be pretty pointless these days.
The thing in memory mapped file is that it lets you manipulate its data without I/O calls. Because of this behavior, when you access the file, windows loads it to the physical memory, so it can be manipulated in it rather than on the disk. You can read more about this in here: http://blogs.msdn.com/b/khen1234/archive/2006/01/30/519483.aspx
Anyways, the problem that drove me to investigate this is that I have a ~2gb file that I want multiple programs to share. I can't create a 2gb view in each of them since I'm just "out of memory" so I have to create/destroy smaller ones.
The most likely cause is that the programs are 32-bit. 32-bit programs (by default) only have 2GB of address space so you can't map a 2GB file in a single view. If you rebuild them in 64-bit mode, the problem should go away.
Whenever a new / malloc is used, OS create a new(or reuse) heap memory segment, aligned to the page size and return it to the calling process. All these allocations will constitute to the Process's virtual memory. In 32bit computing, any process can scale only upto 4 GB. Higher the heap allocation, the rate of increase of process memory is higher. Though there are lot of memory management / memory pools available, all these utilities end up again in creating a heap and reusing it effeciently.
mmap (Memory mapping) on the other hand, provides the ablity to visualize a file as memory stream, and enables the program to use pointer manipulations directly on file. But here again, mmap is actually allocating the range of addresses in the process space. So if we mmap a 3GB file with size 3GB and take a pmap of the process, you could see the total memory consumed by the process is >= 3GB.
My question is, is it possible to have a file based memory pool [just like mmaping a file], however, does not constitute the process memory space. I visualize something like a memory DB, which is backed by a file, which is so fast for read/write, which supports pointer manipulations [i.e get a pointer to the record and store anything as if we do using new / malloc], which can grow on the disk, without touching the process virtual 4GB limit.
Is it possible ? if so, what are some pointers for me to start working.
I am not asking for a ready made solution / links, but to conceptually understand how it can be achieved.
It is generally possible but very coplicated. You would have to re-map if you wanted to acces different 3Gb segments of your file, which would probably kill the performance in case of scattered access. Pointers would only get much more difficult to work with, as remmpaing changes data but leaves the adresses the same.
I have seen STXXL project that might be interesting to you; or it might not. I have never used it so I cannot give you any other advice about it.
What you are looking for, is in principle, a memory backed file-cache. There are many such things in for example database implementations (where the whole database is way larger than the memory of the machine, and the application developer probably wants to have a bit of memory left for application stuff). This will involve having some sort of indirection - an index, hash or some such to indicate what area of the file you want to access, and using that indirection to determine if the memory is in memory or on disk. You would essentially have to replicate what the virtual memory handling of the OS and the processor does, by having tables that indicate where in physical memory your "virtual heap" is, and if it's not present in physical memory, read it in (and if the cache is full, get rid of some - and if it's been written, write it back again).
However, it's most likely that in today's world, you have a machine capable of 64-bit addressing, and thus, it would be much easier to recompile the application as a 64-bit application, usemmap or similar to access the large memory. In this case, even if RAM isn't sufficient, you can access the memory of the file via the virtual memory system, and it takes care of all the mapping back and forth between disk and RAM (physical memory).
My C++ application occasionally runs out of memory due to large amounts of data being retrieved from a database. It has to run on 32bit WinXP machines.
Is it possible to transparently (for most of the existing code) swap out the data objects to disk and read them into memory only on demand, so I'm not limited to the 2GB that 32bit Windows gives to the process?
I've looked at VirtualAlloc and Address Window Extensions but I'm not sure it's what I want.
I also found this SO question where the questioner creates a file mapping and wants to create objects in there. One answer suggests using placement new which sounds like it would be pretty transparent to the rest of the code.
Will this prevent my application to run out of physical memory? I'm not entirely sure of it because after all there is still the 32bit address space limit. Or is this a different kind of problem that will occur when trying to create a lot of objects?
So long as you are using a 32-bit operating system there is nothing you can do about this. There is no way to have more than 3GB (2GB in the case of Windows) of data in virtual memory, whether or not it's actually swapped out to disk.
Historically databases have always handled this problem by using read, write and seek. So rather than accessing data directly from memory, they use a fake (64-bit) pointer. Data is split into blocks (normally around 4kb), and a number of these blocks are allocated in memory. When they want to access data from a fake pointer address they check if the block is loaded into memory and if it is they access it from there. If it is not then they find an empty slot and copy it in, then return the address. If there are no slots free then a piece of data will be written back out to disk (if it's been modified) and that slot will be reused.
The real beauty of this is that if your system has enough RAM then the operating system will cache much more than 2GB of this data in RAM at any point in time, and when you feel like you are actually reading and writing from disk the operating system will probably just be copying data around in memory. This, of course, requires a 32-bit operating system that support more than 3GB of physical memory, such as Linux or Windows Server with PAE.
SQLite has a nice self-contained implementation of this, which you could probably make use of with little effort.
If you do not wish to do this then your only alternatives are to either use a 64-bit operating system or to work with less data at any given point in time.
My System:
Physical memory: 3gb
Windows XP Service Pack 3 (32bit)
Swap file size: 30gb
Goal: To find the largest possible memory map size I can allocate on my machine.
When I run the following code to allocate 2gb memory map file, the call fails.
handle=CreateFileMapping(INVALID_HANDLE_VALUE,NULL,PAGE_READWRITE|SEC_COMMIT,0,INT_MAX,NULL);
I've been very puzzled by this, because I can allocate a memory map file's up to the system swap file size of 30gb by constantly calling CreateFileMapping with 100mb at a time.
After restarting the machine, and re-running the application that requests 2gb of memory mapped file to CreateFileMapping it works and it returns a valid handle. So this leads me a bit confused what the hell is going on under the hood with windows?
So the situation is this, I can create many small memory mapped files using up all the system page file (30gb), but when asking for a single allocation of 2gb the call fails. When restarting the machine and running the same application the call succeeds!
Some notes:
1) The memory mapped file is not being loaded into the proccess virtual address space, there is no view yet to the file.
2) The OS can allocate small 100mb memory mapped files to 30gb of the systems page file!
Right now the only conclusion I can come to, is that the Windows XP SP3 (32bit) virtual memory manager cannot successfully reserve the requested 2gb in the system page file, and then fails due to the system memory fragmentation (it seems like it needs to reserve a continues allocation of memory, even though the page file is 4kb each). After a restart I assume the system memory fragmentation is less, thus allowing the same call to succeed and allocate a memory mapped file of 2gb in size.
I've run some experiments, after running the machine for a day I started a small application that would allocate a memory maped file of 300mb and then release it. It would then increase the size by 1mb and try again. Finally it stops at 700mb and reports (insufficient system resources). I would then go through and close down each application and this would in turn stop the error messages and it finally continues to allocate a memory mapped file of 3.5gb in size!
So my question is what is going on here? There must be some type of memory fragmentation happening internally with the virtual memory manager, because allocating 100mbs memory mapped files will consume up to the 30gb of the system page file (commit limit).
Update
Conclusion is if you're going to create a large memory mapped file backed by the system page file with INVALID_HANDLE_VALUE, then the system page file (swap file) needs to be resize to the required size and be in a non fragmented state for large allocations > 2gb! Though under heavy IO load it can still fail. To get around all these problems you can create your own file with the needed size (I did 1tb) and memory map to that file instead.
Final Update
I ran the same tests on a Windows 7 box, and to my surprise it works every single time (up to the system page file size) without touching anything. So I guess this is just a bug, that large memory allocations can fail more often on Windows XP than Windows 7.
The problem is file fragmentation. Physical memory (RAM) has nothing to do with anything here. In a virtual memory system, 'memory' is allocated from the file system. Physical memory is just an optimization to speed access to memory.
When you request a memory-mapped file with write access, the system must have a file with contiguous pages free. The system swap file is often fragmented. If your disk drive is nicely defragmented, you should be able to create a large memory-mapped file using a file of your choice (not the system page file).
So if you really have to have a 2GB memory-mapped file, you need to create one on the drive at installation. This shifts the problem of creating a contiguous 2GB file to installation, but once created, you should be ok.
So my question is what is going on here? There must be some type of memory fragementation happening internally with the virtual memory manager, because allocating 100mbs memory mapped files will consume up to the 30gb of the system page file (commit limit).
Sounds about right. If you don't need large contiguous chunks of memory, don't ask for them if you can get the same amount of memory in smaller chunks.
To find the largest possible memory map size I can allocate on my machine.
Try it with size X.
If that fails, try with size X/2 and repeat.
This gets you a chunk at runtime, maybe not the exact largest possible chunk, but within a factor of 2.
Let's takes up Windows developer position.
Assume some user perform following steps:
Create memory mapping.
Populate some memory with sensitive data
Unmap from file
Continue using memory
Windows need to unload these pages for critical tasks.
Resolution - mapped memory should feat for swapping. But it doesn't means that mapped will be swapped.