I have a memory-mapped file on disk in my application. The application creates multiple threads, and all of them map the same file, either at the same time or at different times. In one of the threads, I call UnmapViewOfFile followed by SetFilePointer and SetEndOfFile. The SetEndOfFile call fails with ERROR_USER_MAPPED_FILE. This link suggests it is a known limitation: all views of a mapped file must be unmapped first. How can I check whether a file is still mapped in another thread or process, and how can I unmap it from all threads and processes?
Windows has the FlushFileBuffers() API to flush the buffers of a single file to the hard drive. Linux has the sync() API to flush the file buffers of all files.
However, is there a WinAPI function for flushing all files too, i.e. a sync() analog?
https://learn.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-flushfilebuffers
It is possible to flush an entire volume.
To flush all open files on a volume, call FlushFileBuffers with a handle to the volume. The caller must have administrative privileges. For more information, see Running with Special Privileges.
The same article also describes the recommended approach when data must reliably reach the disk: open the file with the CreateFile function using the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags.
Due to disk caching interactions within the system, the FlushFileBuffers function can be inefficient when used after every write to a disk drive device when many writes are being performed separately. If an application is performing multiple writes to disk and also needs to ensure critical data is written to persistent media, the application should use unbuffered I/O instead of frequently calling FlushFileBuffers. To open a file for unbuffered I/O, call the CreateFile function with the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags. This prevents the file contents from being cached and flushes the metadata to disk with each write. For more information, see CreateFile.
Also check the restrictions that unbuffered I/O places on memory and data alignment (see the File Buffering documentation).
According to the File Management Functions reference, WinAPI has no direct analog of the Linux sync().
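For the volume-level flush described above, a minimal sketch could look like the following. The helper name flush_volume is hypothetical, and the volume path (for example "\\\\.\\C:") is an assumption about your system. The non-Windows branch is included only so the sketch builds elsewhere; there it simply calls sync().

```c
#ifdef _WIN32
#include <windows.h>

/* Flush every open file on one volume, e.g. "\\\\.\\C:".
   The caller needs administrative privileges. Returns 0 on success. */
int flush_volume(const char *volume_path)
{
    HANDLE h = CreateFileA(volume_path, GENERIC_WRITE,
                           FILE_SHARE_READ | FILE_SHARE_WRITE,
                           NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    BOOL ok = FlushFileBuffers(h);   /* volume handle: flushes the whole volume */
    CloseHandle(h);
    return ok ? 0 : -1;
}
#else
#include <unistd.h>

/* POSIX counterpart: sync() covers all filesystems, so the path is ignored. */
int flush_volume(const char *volume_path)
{
    (void)volume_path;
    sync();
    return 0;
}
#endif
```

On Windows this has to be repeated per volume, so it is not a single call that covers all files the way sync() is.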
I've started playing with mmap. I'm trying to create an example workspace that will then be extended to the real case.
This is what I want to achieve:
PROCESS 1:
mmap a file (actually a device, but it's okay to generate an example with a text file)
PROCESS 2: (not forked from process 1; just an independent process)
read the memory mapped by process 1
change some bits
write it to a new file
I've read several examples and documentation pages, but I still haven't found how to achieve this. What I'm missing is:
how can process 2 access the memory mapped by process 1, without knowing anything about the opened file?
how can I put the mmap content in a new file? I suppose I have to ftruncate a new file, mmap this file and memcpy the content of process 1 memory map to process 2 memory map (then msync)
Side info: I have a message queue open between the two processes, so they can exchange some messages if needed (e.g. the memory address/size, ...).
Any hints?
Thanks in advance!
MIX
This answer assumes you are doing this on Linux/Unix.
how can process 2 access the memory mapped by process 1, without knowing anything about the opened file?
Process 1 passes the MAP_SHARED flag to mmap[1].
You can:
A) Share the file descriptor using unix domain sockets[2].
B) Send the name of the file using the queues you mentioned at the end of your message.
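For option A, a minimal sketch of passing a descriptor over an already connected Unix domain socket with SCM_RIGHTS might look like this. The helper names send_fd and recv_fd are hypothetical, and error handling is kept short.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send file descriptor fd over the connected Unix domain socket sock.
   Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    char byte = 'F';                       /* one byte of ordinary data must go along */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {                                /* union gives the buffer cmsghdr alignment */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;          /* ancillary data carries descriptors */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd. Returns the new descriptor or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    int fd;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor refers to the same open file as the sender's, so process 2 can pass it straight to mmap without knowing the file name.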
Process 2 then calls mmap on the same file with the MAP_SHARED flag. Modifications to the mapped memory in process 1 will be visible to process 2. If you need fine control over when the changes from process 1 are shown to process 2, control it with msync[3].
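As a rough sketch of the mapping step, each process could use something like the helper below. The name map_shared_file is hypothetical, and the sketch assumes a regular, non-empty file rather than a device.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the whole file at path read/write with MAP_SHARED.
   On success returns the mapping and stores its length in *len;
   returns NULL on failure. */
void *map_shared_file(const char *path, size_t *len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}
```

After modifying the region, a process can call msync(p, len, MS_SYNC) to force the changes out to the file. On Linux, stores through one MAP_SHARED mapping are normally visible through other mappings of the same file without waiting for that flush.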
how can I put the mmap content in a new file? I suppose I have to ftruncate a new file, mmap this file and memcpy the content of process 1 memory map to process 2 memory map (then msync)
Why not simply write the mapped memory out with write(), as you would any regular buffer?
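A minimal sketch of that idea, assuming the region is already mapped and readable. The helper name dump_region is hypothetical; the loop is there because write() may write fewer bytes than requested.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes starting at src (for example a mapped region)
   to a new file at out_path. Returns 0 on success, -1 on failure. */
int dump_region(const void *src, size_t len, const char *out_path)
{
    int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = src;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted before writing anything: retry */
            close(fd);
            return -1;
        }
        p += n;                /* advance past the bytes already written */
        len -= (size_t)n;
    }
    return close(fd);
}
```

This avoids the ftruncate, second mmap, and memcpy steps, since the kernel copies directly from the mapped pages.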
[1] http://man7.org/linux/man-pages/man2/mmap.2.html
[2] Portable way to pass file descriptor between different processes
[3] http://man7.org/linux/man-pages/man2/msync.2.html
I am trying to use shared memory between a user process and the kernel.
Option one: let the kernel create the section and let the user-mode app open the memory by name, "Global\my_mem". This works only in read-only mode. When I try to open the section with FILE_MAP_WRITE, it fails with access denied (5). I'm not sure how to grant access or modify the DACL.
Option two: pass the handle back via IOCTL. This one is questionable, since the handle to a section opened in the kernel is 0xFFFFFFFF80001234. My understanding is that handles with any of the upper bits set cannot be used in user mode, especially if the app is 32-bit :) Initially I expected the section handle to be somewhat similar to a kernel file handle, so that I would be able to use it.
What would be the correct approach to establish a shared memory channel between kernel and user mode?
For option 1, you can specify the security descriptor assigned to the newly created object via the SecurityDescriptor member of the OBJECT_ATTRIBUTES structure.
For option 2, you would need to create an additional handle as a user handle, which you do by not specifying the OBJ_KERNEL_HANDLE flag in the OBJECT_ATTRIBUTES structure. This will only work if you open the new handle while running in the context of a thread belonging to the user application's process, e.g., while processing an IOCTL received from the user application.
Another option is for the kernel driver to map the section into the user-mode application's address space itself, using ZwMapViewOfSection.
One issue with using a section is that the driver itself can only safely access it from a system thread. If that is a problem, you can share memory directly rather than via a section. If you allocate the memory in kernel mode, you can map it into the user-mode application's address space using MmMapLockedPagesSpecifyCache.
Yet another option is for the driver to access a memory buffer allocated by the user-mode process.
The downside to either of these approaches is that the buffer (or the part of it being shared) must be locked in memory, whereas using a section allows the buffer to be pageable.
Since you mention a 32-bit app, I assume this is between a user process and a device driver. I would go with an IOCTL using METHOD_IN_DIRECT (receives data in the buffer) and METHOD_OUT_DIRECT (writes data into the buffer).
If the memory is shared between multiple user processes and one or more device drivers, using a shared memory object is recommended.
If I want to mmap a 10 GB file and load the whole file into physical memory immediately, how can I do so?
I don't want to use a function like mlock because it needs root privileges.
Is there a system call which can satisfy my demand?
(I have more than enough memory.)
Read the man-page for mmap:
MAP_POPULATE (since Linux 2.5.46)
Populate (prefault) page tables for a mapping. For a file
mapping, this causes read-ahead on the file. Later accesses
to the mapping will not be blocked by page faults.
MAP_POPULATE is supported for private mappings only since
Linux 2.6.23
Issue your request and be prepared for a short wait, depending on disk bandwidth and cache (unless you exceed the process's limits).
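A minimal sketch of such a request, assuming Linux (MAP_POPULATE is Linux-specific) and a read-only shared mapping. The helper name map_populated is hypothetical. Note that prefaulting is best effort: mmap can succeed even if not every page could be populated.

```c
#define _GNU_SOURCE            /* exposes MAP_POPULATE on glibc */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the whole file read-only and ask the kernel to prefault it.
   On success returns the mapping and stores its length in *len;
   returns NULL on failure. */
void *map_populated(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* MAP_POPULATE makes mmap read the file in before it returns. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                   MAP_SHARED | MAP_POPULATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}
```

Unlike mlock, this does not pin the pages, so the kernel may still evict them later under memory pressure.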
Background:
I'm working on the new SparkleDB NoSQL database. The database is ACID and has its own disk space manager (DSM) that handles all access to its database file storage. The DSM allows concurrent I/O operations from multiple threads on the same physical file, i.e. asynchronous I/O or overlapped I/O. We disable disk caching and write pages directly to the disk, as this is required for ACID databases.
My question is:
Is there a performance gain from arranging contiguous disk page writes from many threads before sending the I/O request to the underlying OS I/O subsystem (thus merging the data to be written if it is contiguous), or does the I/O subsystem do this for you? My question applies to UNIX, Linux, and Windows.
Example (all of this happens within a span of 100 ms):
Thread #1: Write 4k to physical file address 4096
Thread #2: Write 4k to physical file address 0
Thread #3: Write 4k to physical file address 8192
Thread #4: Write 4k to physical file address 409600
Thread #5: Write 4k to physical file address 413696
Using this information, the DSM arranges a single 12 KB write operation to physical file address 0, and a single 8 KB write operation to physical file address 409600.
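The merge step in this example can be sketched as follows. This is only an illustration of the bookkeeping, not the DSM's actual code: the struct io_req type and the coalesce function are hypothetical names, and the sketch tracks offsets and lengths only, leaving out the data buffers that a real implementation would also have to gather.

```c
#include <stddef.h>
#include <stdlib.h>

/* One pending write: file offset and length in bytes. */
struct io_req {
    long long offset;
    size_t len;
};

/* qsort comparator: ascending by offset. */
static int by_offset(const void *a, const void *b)
{
    const struct io_req *x = a, *y = b;
    return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort the requests by offset and merge those that are exactly adjacent.
   The merged requests are left at the front of reqs; returns their count. */
size_t coalesce(struct io_req *reqs, size_t n)
{
    if (n == 0)
        return 0;

    qsort(reqs, n, sizeof *reqs, by_offset);

    size_t out = 0;                        /* index of the run being extended */
    for (size_t i = 1; i < n; i++) {
        if (reqs[out].offset + (long long)reqs[out].len == reqs[i].offset)
            reqs[out].len += reqs[i].len;  /* adjacent: extend the current run */
        else
            reqs[++out] = reqs[i];         /* gap: start a new run */
    }
    return out + 1;
}
```

Applied to the five 4 KB writes above, this leaves two requests: 12288 bytes at offset 0 and 8192 bytes at offset 409600.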
Update:
The DSM does all the physical file address positioning itself: by providing an OVERLAPPED structure on Windows, io_prep_pwrite on Linux AIO, and aiocb's aio_offset on POSIX AIO.
The most efficient way to use a hard drive is to keep writing as much data as you can while the platters are still spinning. This means reducing the number of writes and increasing the amount of data per write. If you can do that, then having a disk area of contiguous sectors will help.
For each write, the OS needs to translate the write to your file into logical or physical coordinates on the drive. This may involve reading the directory, searching for your file and locating the mapping of your file within the directory.
After the OS determines the location, it sends data across the interface to the hard drive. Your data may be cached along the way many times until it is placed onto the platters. An efficient write will use the block sizes of the caches and data interfaces.
Now the questions are: 1) How much time does this save? and 2) Is the time saved significant? For example, if all this work saves you 1 second, that second may be lost waiting for a response from the user.
Many programs, operating systems, and drivers postpone writes to a hard drive until non-critical or off-peak periods. For example, while you are waiting for user input, you could be writing to the hard drive. Postponing writes this way may take less effort than optimizing the disk writes and have a more significant impact on your application.
BTW, this has nothing to do with C++.