mmap a 10 GB file and load it into memory - c++

If I want to mmap a 10 GB file and load the whole file into physical memory immediately, how can I do so?
I don't want to use functions like mlock because they need root privileges.
Is there a system call that can do this?
(I have more than enough memory.)

Read the man-page for mmap:
MAP_POPULATE (since Linux 2.5.46)
Populate (prefault) page tables for a mapping. For a file
mapping, this causes read-ahead on the file. Later accesses
to the mapping will not be blocked by page faults.
MAP_POPULATE is supported for private mappings only since
Linux 2.6.23
Issue your request, and be prepared for a short wait (depending on disk bandwidth and cache), unless you exceed the process's limits.
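A minimal sketch of such a call might look like the following (assuming Linux; the file name is a placeholder):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

int main() {
    // Placeholder path; substitute your own 10 GB file.
    const char *path = "bigfile.dat";

    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    // MAP_POPULATE asks the kernel to prefault the whole mapping up front,
    // so later accesses should not block on page faults (Linux-specific).
    void *p = mmap(nullptr, st.st_size, PROT_READ,
                   MAP_PRIVATE | MAP_POPULATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    // ... use the mapping ...

    munmap(p, st.st_size);
    close(fd);
    return 0;
}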

Related

How to unmap memory mapped file from all views?

I have a memory-mapped file on disk in my application. My application creates multiple threads, and all of them memory-map the same file at the same or different times. In one of the threads, I am calling UnmapViewOfFile followed by SetFilePointer and SetEndOfFile. The SetEndOfFile call fails with error ERROR_USER_MAPPED_FILE. This link suggests this is a known problem: all views of a mapped file must be unmapped first. How can I check if a file is still mapped in another thread/process, and how can I unmap it from all threads/processes?
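For reference, the failing sequence described above looks roughly like this sketch (not the asker's actual code; the handle and size parameters are assumptions):

#include <windows.h>

// hFile is assumed to be an open handle to the file being mapped;
// view/hMapping belong to this thread only.
void ShrinkMappedFile(HANDLE hFile, LPVOID view, HANDLE hMapping, LONG newSize) {
    UnmapViewOfFile(view);   // unmaps only this thread's view
    CloseHandle(hMapping);   // other threads/processes may still hold views

    SetFilePointer(hFile, newSize, nullptr, FILE_BEGIN);
    if (!SetEndOfFile(hFile)) {
        // Fails with ERROR_USER_MAPPED_FILE while any view of the file
        // is still mapped in this or another process.
        DWORD err = GetLastError();
        (void)err;
    }
}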

Can I trigger the OS to load a page asynchronously with Memory Mapping Files?

My understanding is that the OS will load a page from the memory-mapped file whenever I request a virtual memory address not yet loaded into memory. The problem is that it can take up to 4-5 milliseconds to seek to the data on the HDD, and I guess my memory access will block until then. So I thought that when I know I'll need some data soon, I could send a request in advance to load that page into memory/cache.
As a naive method I can simply read the memory, and the OS will start to load/cache the page, but that read into the memory-mapped file will make my process block until the data arrives. That again means a big delay for me.
So the question is whether there is a way to just tell the memory manager to prepare that data, without blocking me until the first byte arrives. Then, when I need the data a few milliseconds later, it will be read from the already-loaded page.
If the answer is OS specific then I'd be interested in OSX and Win.
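On POSIX systems (including OS X), one way to give the kernel a non-blocking hint is madvise with MADV_WILLNEED on the mapped range; on recent Windows versions, PrefetchVirtualMemory serves a similar purpose. A rough sketch, assuming a mapping already exists (the function name and parameters are illustrative):

#include <sys/mman.h>
#include <unistd.h>

// base: start of an existing mmap'ed region; offset/len: the range we expect
// to need soon. madvise returns immediately, and the kernel may start
// read-ahead in the background instead of waiting for the first page fault.
void hint_will_need(char *base, size_t offset, size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    // madvise requires a page-aligned start address.
    size_t aligned = offset & ~(size_t)(page - 1);
    madvise(base + aligned, len + (offset - aligned), MADV_WILLNEED);
}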

Get Disk Utilized by each process in c++ windows

I am trying to build a tool similar to Task Manager. I was able to get the CPU and memory usage of each process, but I couldn't figure out the disk statistics. I was able to get the I/O read and write bytes, but they include all file, disk, and network I/O. How can I get only the disk utilization of each process? Otherwise, is it possible to separate the disk statistics from those I/O bytes? If yes, how can I do it?
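For context, per-process I/O byte counts like the ones mentioned above typically come from something such as GetProcessIoCounters, which lumps file, disk, and network I/O together; a sketch (assuming hProcess was opened with sufficient access rights):

#include <windows.h>
#include <cstdio>

// Reads the aggregate I/O counters for a process. Note these counters cover
// all I/O (file, disk, and network), which is exactly the limitation the
// question describes.
bool PrintIoBytes(HANDLE hProcess) {
    IO_COUNTERS io{};
    if (!GetProcessIoCounters(hProcess, &io))
        return false;
    printf("read bytes:  %llu\n", (unsigned long long)io.ReadTransferCount);
    printf("write bytes: %llu\n", (unsigned long long)io.WriteTransferCount);
    return true;
}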

How to determine whether data has been retrieved from disk or from caches?

I have written a program in C/C++ which needs to fetch data from the disk. After some time, the operating system ends up storing some of that data in its caches. Is there some way to figure out, from a C/C++ program, whether the data was retrieved from the caches or from the disk?
A simple solution would be to time the read operation; disk reads are significantly slower. You can read a group of file blocks (4 KB) twice to get an estimate.
The problem is that if you run the program again or copy the file in a shell, the OS will cache it.
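A minimal sketch of the timing approach (the 4 KB block size and the return convention are illustrative):

#include <chrono>
#include <cstdio>

// Time how long it takes to read one 4 KB block at the given offset.
// A read served from the page cache typically completes in microseconds;
// a read that has to hit a spinning disk takes milliseconds. Reading the
// same block twice gives a cached baseline to compare against.
double time_block_read(const char *path, long offset) {
    char buf[4096];
    FILE *f = std::fopen(path, "rb");
    if (!f) return -1.0;
    std::fseek(f, offset, SEEK_SET);

    auto t0 = std::chrono::steady_clock::now();
    std::fread(buf, 1, sizeof buf, f);
    auto t1 = std::chrono::steady_clock::now();

    std::fclose(f);
    return std::chrono::duration<double, std::micro>(t1 - t0).count();
}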

Arranging physical disk sectors before writing to disk

Background:
I'm developing the new SparkleDB NoSQL database. The database is ACID and has its own disk space manager (DSM) for all of its database file storage access. The DSM allows multiple threads to perform concurrent I/O operations on the same physical file, i.e. asynchronous or overlapped I/O. We disable disk caching and write pages directly to disk, as this is required for ACID databases.
My question is:
Is there a performance gain from arranging contiguous disk page writes from many threads before sending the I/O request to the underlying OS I/O subsystem (thus merging the data to be written when the pages are contiguous), or does the I/O subsystem do this for you? My question applies to UNIX, Linux, and Windows.
Example (all happens within a span of 100 ms):
Thread #1: Write 4k to physical file address 4096
Thread #2: Write 4k to physical file address 0
Thread #3: Write 4k to physical file address 8192
Thread #4: Write 4k to physical file address 409600
Thread #5: Write 4k to physical file address 413696
Using this information, the DSM arranges a single 12 KB write operation to physical file address 0, and a single 8 KB write operation to physical file address 409600.
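Purely to illustrate the kind of coalescing described above, here is a sketch that sorts pending page writes by offset and merges adjacent ranges before submission (the WriteRequest type and function name are hypothetical, not part of SparkleDB):

#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical pending write: a page-aligned file offset plus its data.
struct WriteRequest {
    uint64_t offset;
    std::vector<char> data;
};

// Merge requests whose byte ranges are contiguous, so each merged run can be
// submitted as a single overlapped/AIO write.
std::vector<WriteRequest> Coalesce(std::vector<WriteRequest> reqs) {
    std::sort(reqs.begin(), reqs.end(),
              [](const WriteRequest &a, const WriteRequest &b) {
                  return a.offset < b.offset;
              });

    std::vector<WriteRequest> merged;
    for (auto &r : reqs) {
        if (!merged.empty() &&
            merged.back().offset + merged.back().data.size() == r.offset) {
            merged.back().data.insert(merged.back().data.end(),
                                      r.data.begin(), r.data.end());
        } else {
            merged.push_back(std::move(r));
        }
    }
    return merged;
}

With the five writes above, this yields one 12 KB request at offset 0 and one 8 KB request at offset 409600.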
Update:
The DSM does all the physical file address positioning itself: on Windows by providing an OVERLAPPED structure, on Linux AIO via io_prep_pwrite, and on POSIX AIO via aiocb's aio_offset.
The most efficient way to use a hard drive is to keep writing as much data as you can while the platters are still spinning. This means reducing the number of writes and increasing the amount of data per write. If this can happen, then having a disk area of contiguous sectors will help.
For each write, the OS needs to translate the write to your file into logical or physical coordinates on the drive. This may involve reading the directory, searching for your file, and locating the mapping of your file within the directory.
After the OS determines the location, it sends the data across the interface to the hard drive. Your data may be cached along the way many times until it is placed onto the platters. An efficient write will use the block sizes of the caches and data interfaces.
Now the questions are: 1) how much time does this save, and 2) is the time saved significant? For example, if all this work saves you one second, that second gained may be lost waiting for a response from the user.
Many programs, OSes, and drivers postpone writes to the hard drive until non-critical or non-peak periods. For example, while you are waiting for user input, you could be writing to the hard drive. This deferral of writes may take less effort than optimizing the disk writes and have a more significant impact on your application.
BTW, this has nothing to do with C++.