How to flush memory-mapped files using Boost's `mapped_file_sink` class? - c++

Using the Boost Libraries version 1.62.0 and the mapped_file_sink class from Boost.IOStreams.
I want to flush the written data to disk at will, but there is no mapped_file_sink::flush() member function.
My questions are:
How can I flush the written data when using mapped_file_sink?
If the above can't be done, why not, considering that msync() and FlushViewOfFile() are available for a portable implementation?

If you look at the mapped file support for proposed Boost.AFIO v2 at https://ned14.github.io/boost.afio/classboost_1_1afio_1_1v2__xxx_1_1map__handle.html, you'll notice a lack of ability to flush mapped file views as well.
The reason why is because it's redundant on modern unified page cache kernels when the mapped view is identical in every way to the page cached buffers for that file. msync() is therefore a no-op on such kernels because dirty pages are already queued for writing out to storage as and when the system decides it is appropriate. You can block your process until the system has finished writing out all the dirty pages for that file using good old fsync().
All the above does not apply where (a) your kernel is not a unified page cache design (QNX, NetBSD etc) or (b) your file resides on a networked file system. If you are in an (a) situation, best to simply avoid memory mapped i/o altogether, just do read() and write(), they are such a small percentage of OSs nowadays let them suffer with poor performance. For the (b) situation, you are highly inadvised to be using memory mapped i/o ever with networked file systems. There is an argument for read-only maps of immutable files only, otherwise just don't do it unless you know what you're doing. Fall back to read() and write(), it's safer and less likely to surprise.
Finally, you linked to a secure file deletion program. Those programs don't work reliably any more with recent file systems because of delayed extent allocation or copy on write allocation. In other words, when you rewrite a section of an existing file, it doesn't modify the original data on storage but actually allocates new storage and points the extents list for the file at the new linked list. This allows a consistent file system to be recovered after unexpected data loss easily. To securely delete data on recent file systems you usually need to use special OS APIs, though deleting all the files and then filling the free space with random data may securely delete most of the data in question most of the time. Note copy on write filing systems may not release freed extents back to the free space pool for new allocation for many days or weeks until the next time a garbage collection routine fires or a snapshot is deleted. In this situation, filling free space with randomness will not securely delete the files in question. If all this is a problem, use FAT32 as your filing system, it's very simple and rewriting data on it really does rewrite the same data on storage (though note that some storage media e.g. SSDs are highly likely to also not rewrite data, these also write modifications to new storage and garbage collect freed extents later).

Related

Linux non-persistent backing store for mmap()

First, a little motivating background info: I've got a C++-based server process that runs on an embedded ARM/Linux-based computer. It works pretty well, but as part of its operation it creates a fairly large fixed-size array (e.g. dozens to hundreds of megabytes) of temporary/non-persistent state information, which it currently keeps on the heap, and it accesses and/or updates that data from time to time.
I'm investigating how far I can scale things up, and one problem I'm running into is that eventually (as I stress-test the server by making its configuration larger and larger), this data structure gets big enough to cause out-of-memory problems, and then the OOM killer shows up, and general unhappiness ensues. Note that this embedded configuration of Linux doesn't have swap enabled, and I can't (easily) enable a swap partition.
One idea I have on how to ameliorate the issue is to allocate this large array on the computer's local flash partition, instead of directly in RAM, and then use mmap() to make it appear to the server process like it's still in RAM. That would reduce RAM usage considerably, and my hope is that Linux's filesystem-cache would mask most of the resulting performance cost.
My only real concern is file management -- in particular, I'd like to avoid any chance of filling up the flash drive with "orphan" backing-store files (i.e. old files whose processes don't exist any longer, but the file is still present because its creating process crashed or by some other mistake forgot to delete it on exit). I'd also like to be able to run multiple instances of the server simultaneously on the same computer, without the instances interfering with each other.
My question is, does Linux have any built-it facility for handling this sort of use-case? I'm particularly imagining some way to flag a file (or an mmap() handle or similar) so that when the file that created the process exits-or-crashes, the OS automagically deletes the file (similar to the way Linux already automagically recovers all of the RAM that was allocated by a process, when the process exits-or-crashes).
Or, if Linux doesn't have any built-in auto-temp-file-cleanup feature, is there a "best practice" that people use to ensure that large temporary files don't end up filling up a drive due to unintentionally becoming persistent?
Note that AFAICT simply placing the file in /tmp won't help me, since /tmp is using a RAM-disk and therefore doesn't give me any RAM-usage advantage over simply allocating in-process heap storage.
Yes, and I do this all the time...
open the file, unlink it, use ftruncate or (better) posix_fallocate to make it the right size, then use mmap with MAP_SHARED to map it into your address space. You can then close the descriptor immediately if you want; the memory mapping itself will keep the file around.
For speed, you might find you want to help Linux manage its page cache. You can use posix_madvise with POSIX_MADV_WILLNEED to advise the kernel to page data in and POSIX_MADV_DONTNEED to advise the kernel to release the pages.
You might find that last does not work the way you want, especially for dirty pages. You can use sync_file_range to explicitly control flushing to disk. (Although in that case you will want to keep the file descriptor open.)
All of this is perfectly standard POSIX except for the Linux-specific sync_file_range.
Yes, You create/open the file. Then you remove() the file by its filename.
The file will still be open by your process and you can read/write it just like any opened file, and it will disappear when the process having the file opened exits.
I believe this behavior is mandated by posix, so it will work on any unix like system. Even at a hard reboot, the space will be reclaimed.
I believe this is filesystem-specific, but most Linux filesystems allow deletion of open files. The file will still exist until the last handle to it is closed. I would recommend that you open the file then delete it immediately and it will be automatically cleaned up when your process exits for any reason.
For further details, see this post: What happens to an open file handle on Linux if the pointed file gets moved, delete

Do memory mapped files provide advantage for large buffers?

My program works with large data sets that need to be stored in contiguous memory (several Gigabytes). Allocating memory using std::allocator (i.e. malloc or new) causes system stalls as large portions of virtual memory are reserved and physical memory gets filled up.
Since the program will mostly only work on small portions at a time, my question is if using memory mapped files would provide an advantage (i.e. mmap or the Windows equivalent.) That is creating a large sparse temporary file and mapping it to virtual memory. Or is there another technique that would change the system's pagination strategy such that less pages are loaded into physical memory at a time.
I'm trying to avoid building a streaming mechanism that loads portions of a file at a time and instead rely on the system's vm pagination.
Yes, mmap has the potential to speed things up.
Things to consider:
Remember the VMM will page things in and out in page size blocked (4k on Linux)
If your memory access is well localised over time, this will work well. But if you do random access over your entire file, you will end up with a lot of seeking and thrashing (still). So, consider whether your 'small portions' correspond with localised bits of the file.
For large allocations, malloc and free will use mmap with MAP_ANON anyway. So the difference in memory mapping a file is simply that you are getting the VMM to do the I/O for you.
Consider using madvise with mmap to assist the VMM in paging well.
When you use open and read (plus, as erenon suggests, posix_fadvise), your file is still held in buffers anyway (i.e. it's not immediately written out) unless you also use O_DIRECT. So in both situations, you are relying on the kernel for I/O scheduling.
If the data is already in a file, it would speed up things, especially in the non-sequential case. (In the sequential case, read wins)
If using open and read, consider using posix_fadvise as well.
This really depends on your mmap() implementation. Mapping a file into memory has several advantages that can be exploited by the kernel:
The kernel knows that the contents of the mmap() pages is already present on disk. If it decides to evict these pages, it can omit the write back.
You reduce copying operations: read() operations typically first read the data into kernel memory, then copy it over to user space.
The reduced copies also mean that less memory is used to store data from the file, which means more memory is available for other uses, which can reduce paging as well.
This is also, why it is generally a bad idea to use large caches within an I/O library: Modern kernels already cache everything they ever read from disk, caching a copy in user space means that the amount of data that can be cached is actually reduced.
Of course, you also avoid a lot of headaches that result from buffering data of unknown size in your application. But that is just a convenience for you as a programmer.
However, even though the kernel can exploit these properties, it does not necessarily do so. My experience is that LINUX mmap() is generally fine; on AIX, however, I have witnessed really bad mmap() performance. So, if your goal is performance, it's the old measure-compare-decide stand by.

how to cache 1000s of large C++ objects

Environment:
Windows 8 64 bit, Windows 2008 server 64 bit
Visual Studio (professional) 2012 64 bits
list L; //I have 1000s of large CMyObject in my program that I cache, which is shared by different threads in my windows service program.
For our SaaS middleware product, we cache in memory 1000s of large C++ objects (read only const objects, each about 4MB in size), which runs the system out of memory. Can we associate a disk file (or some other persistent mechanism that is OS managed) to our C++ objects? There is no need for sharing / inter-process communication.
The disk file will suffice if it works for the duration of the process (our windows service program). The read-only const C++ objects are shared by different threads in the same windows service.
I was even considering using object databases (like mongoDB) to store the objects, which will then be loaded / unloaded at each use. Though faster than reading our serialized file (hopefully), it will still spoil the performance.
The purpose is to retain caching of C++ objects for performance reason and avoid having to load / unload the serialized C++ object every time. It would be great if this disk file is OS managed and requires minimal tweaking in our code.
Thanks in advance for your responses.
The only thing which is OS managed in the manner you describe is swap file. You can create a separate application (let it be called "cache helper"), which loads all the objects into memory and waits for requests. Since it does not use it's memory pages, OS will eventually displace the pages to the swap file, recalling it only if/when needed.
Communication with the applciation can be done through named pipes or sockets.
Disadvantages of such approach are that the performance of such cache will be highly volatile, and it may degrade performance of the whole server.
I'd recommend to write your own caching algorithm/application, as you may later need to adjust its properties.
One solution is of course to simply load every object, and let the OS deal with swapping it in from/out to disk as required. (Or dynamically load, but never discard unless the object is absolutely being destroyed). This approach will work well if there are are number of objects that are more frequently used than others. And the loading from swapspace is almost certainly faster than anything you can write. The exception to this is if you do know beforehand what objects are more likely or less likely to be used next, and can "throw out" the right objects in case of low memory.
You can certainly also use a memory mapped file - this will allow you to read from and write to the file as if it was memory (and the OS will cache the content in RAM as memory is available). On WIndows, you will be using CreateFileMapping or OpenFileMapping to create/open the filemapping, and then MapViewOfFile to map the file into memory. When finished, use UnmapViewOfFile to "unmap" the memory, and then CloseHandle to close the FileMapping.
The only worry about a filemapping is that it may not appear at the same address in memory next time around, so you can't have pointers within the filemapping and load the same data as binary next time. It would of course work fine to create a new filemapping each time.
So your thousands of massive objects have constructor, destructor, virtual functions and pointers. This means you can't easily page them out. The OS can do it for you though, so your most practical approach is simply to add more physical memory, possibly an SSD swap volume, and use that 64-bit address space. (I don't know how much is actually addressable on your OS, but presumably enough to fit your ~4G of objects).
Your second option is to find a way to just save some memory. This might be using a specialized allocator to reduce slack, or removing layers of indirection. You haven't given enough information about your data for me to make concrete suggestions on this.
A third option, assuming you can fit your program in memory, is simply to speed up your deserialization. Can you change the format to something you can parse more efficiently? Can you somehow deserialize objects quickly on-demand?
The final option, and the most work, is to manually manage a swapfile. It would be sensible as a first step to split your massive polymorphic classes into two: a polymorphic flyweight (with one instance per concrete subtype), and a flattened aggregate context structure. This aggregate is the part you can swap in and out of your address space safely.
Now you just need a memory-mapped paging mechanism, some kind of cache tracking which pages are currently mapped, possibly a smart pointer replacing your raw pointer with a page+offset which can map data in on-demand, etc. Again, you haven't given enough information on your data structure and access patterns to make more detailed suggestions.

Is memory-mapped memory possible?

I know that is possible to use memory-mapped files i.e. real files on disk that are transparently mapped to memory. As far as I understand (I haven't used these yet) the mapping takes place immediately, the file is partly read on the first memory access while the OS starts "caching" the whole file in the background.
Now: Is it possible to somewhat abuse this concept and memory-map another block of memory? Assuming the OS provides such indirection one could create a kind of compressed_malloc() that returns a mapping from memory to memory. The memory returned to the caller is simple the memory-mapped range that is transparently compressed in memory and also eventually kept in memory. Thus, for large buffers it could be possible that only part of it get decompressed on-the-fly (on access) while the remaining blocks are kept compressed.
Is that concept technically possible at the moment or - if already realized (in software) - what are the things to look at?
Update 1: I am more or less looking for something that is technically achievable without modifying the OS kernel itself or which requires a virtualization platform.
Update 2: I am hoping for something which allows me to implement the compression and related logic in my own user-space code. I would just use the facilities of the operating system to create the memory-mapping.
Very much so. The VM (Virtual Memory) system is designed to handle different kinds of objects that can be mapped. There is in fact a filesystem call cramfs that does something similar in the sense that it keeps compressed data in storage, but enables transparent, uncompressed access.
You would not be modifying the kernel per se, but you will have to work in the kernel space, implementing VM handlers for this new kind of a memory mapped object.
This is possible, eg.
http://pubs.vmware.com/vsphere-4-esx-vcenter/index.jsp?topic=/com.vmware.vsphere.resourcemanagement.doc_41/managing_memory_resources/c_memory_compression.html
It is not correctly implemented in kernel space in Linux, but something like this could be implemented in user space.

Temp file that exists only in RAM?

I'm trying to write an encrpytion using the OTP method. In keeping with the security theories I need the plain text documents to be stored only in memory and never ever written to a physical drive. The tmpnam command appears to be what I need, but from what I can see it saves the file on the disk and not the RAM.
Using C++ is there any (platform independent) method that allows a file to exist only in RAM? I would like to avoid using a RAM disk method if possible.
Thanks
Edit:
Thanks, its more just a learning thing for me, I'm new to encryption and just working through different methods, I don't actually plan on using many of them (esspecially OTP due to doubling the original file size because of the "pad").
If I'm totally honest, I'm a Linux user so ditching Windows wouldn't be too bad, I'm looking into using RAM disks for now as FUSE seems a bit overkill for a "learning" thing.
The simple answer is: no, there is no platform independent way. Even keeping the data only in memory, it will still risk being swapped out to disk by the virtual memory manager.
On Windows, you can use VirtualLock() to force the memory to stay in RAM. You can also use CryptProtectMemory() to prevent other processes from reading it.
On POSIX systems (e.g. BSD, Linux) you can use mlock() to lock memory in RAM.
Not really unless you count in-memory streams (like stringstream).
No especially and specifically for security purposes: any piece of data can be swapped to disk on virtual memory systems.
Generally, if you are concerned about security, you have to use platform-specific methods for controlling access: What good is keeping your data in RAM if everyone can read it?
You might want to look at TrueCrypt's source code. Getting code at the file system level might be your best bet.
OTP is an awful encryption method for arbitrary files, unless you have a massive amount of entropy that you can guarantee never repeats itself (that's why it's called "one-time"!)
If you want to create a file-like object that only exists in memory and you don't care about Windows, I'd look at writing a custom FUSE filesystem (http://fuse.sourceforge.net/); this way you guarantee what will and will not get written to disk, and your files are accessible by all programs.
Using one of std::stringstream or fmemopen will get you file-like access to blocks of memory. If (for security) you want to avoid it being swapped out, use mlock which is probably easiest to use with fmemopen's buffer than std::stringstream. Combining mlock with std::stringstream would probably need to be done via a custom allocator (used as a template parameter).