SSD raw I/O benchmarks with random read/write - c++

My laptop has an SSD with a 512-byte physical sector size and a 4,096-byte logical sector size. I'm working on an ACID database system that has to bypass all OS caches, so I write directly from allocated internal memory (RAM) to the SSD. I also extend the files before I run the tests and don't resize them during the tests.
Now here is my problem: according to SSD benchmarks, random read and write throughput should be in the range of 30 MB/s to 90 MB/s. But here is my (rather horrible) telemetry from my numerous performance tests:
1.2 MB/s when reading random 512-byte blocks (physical sector size)
512 KB/s when writing random 512-byte blocks (physical sector size)
8.5 MB/s when reading random 4,096-byte blocks (logical sector size)
4.9 MB/s when writing random 4,096-byte blocks (logical sector size)
In addition to using asynchronous I/O, I also set the FILE_SHARE_READ and FILE_SHARE_WRITE flags to disable all OS buffering; because our database is ACID, I must do this. I also tried FlushFileBuffers(), but that gave me even worse performance. I also wait for each async I/O operation to complete, as is required by some of our code.
Here is my code. Is there a problem with it, or am I stuck with this bad I/O performance?
HANDLE OpenFile(const wchar_t *fileName)
{
    // Set access method
    DWORD desiredAccess = GENERIC_READ | GENERIC_WRITE;
    // Set file flags
    DWORD fileFlags = FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING /*| FILE_FLAG_RANDOM_ACCESS*/;
    // File or device is being opened or created for asynchronous I/O
    fileFlags |= FILE_FLAG_OVERLAPPED;
    // Exclusive use (no share mode)
    DWORD shareMode = 0;
    HANDLE hOutputFile = CreateFile(
        // File name
        fileName,
        // Requested access to the file
        desiredAccess,
        // Share mode. 0 equals exclusive lock by the process
        shareMode,
        // Pointer to a security attribute structure
        NULL,
        // Action to take on file
        CREATE_NEW,
        // File attributes and flags
        fileFlags,
        // Template file
        NULL
    );
    if (hOutputFile == INVALID_HANDLE_VALUE)
    {
        DWORD lastError = GetLastError();
        std::wcerr << L"Unable to create the file '" << fileName << L"'. [CreateFile] error #" << lastError << L"." << std::endl;
    }
    return hOutputFile;
}
DWORD ReadFromFile(HANDLE hFile, void *outData, _UINT64 bytesToRead, _UINT64 location, OVERLAPPED *overlappedPtr,
                   asyncIoCompletionRoutine_t completionRoutine)
{
    DWORD bytesRead = 0;
    if (overlappedPtr)
    {
        // Windows demands that you split the file byte location into high & low 32-bit parts
        overlappedPtr->Offset = (DWORD)_UINT64LO(location);
        overlappedPtr->OffsetHigh = (DWORD)_UINT64HI(location);
        // Should we use a callback function or a manual event?
        if (!completionRoutine && !overlappedPtr->hEvent)
        {
            // No manual event supplied, so create one. The caller must reset and close it themselves
            overlappedPtr->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
            if (!overlappedPtr->hEvent)
            {
                DWORD errNumber = GetLastError();
                std::wcerr << L"Could not create a new event. [CreateEvent] error #" << errNumber << L".";
            }
        }
    }
    BOOL result = completionRoutine ?
        ReadFileEx(hFile, outData, (DWORD)(bytesToRead), overlappedPtr, completionRoutine) :
        ReadFile(hFile, outData, (DWORD)(bytesToRead), &bytesRead, overlappedPtr);
    if (result == FALSE)
    {
        DWORD errorCode = GetLastError();
        if (errorCode != ERROR_IO_PENDING)
        {
            std::wcerr << L"Can't read sectors from file. [ReadFile] error #" << errorCode << L".";
        }
    }
    return bytesRead;
}

Random I/O performance is not measured well in MB/s; it is measured in IOPS. "1.2 MB/s when reading random 512 byte blocks" works out to roughly 2,400 IOPS. Not bad. Double the block size and you'll get about 199% of the MB/s and 99% of the IOPS, because it takes almost the same time to read 512 bytes as it does to read 1,024 bytes (almost no time at all). SSDs are not free of seeking costs, as is sometimes mistakenly assumed.
So the numbers are not actually bad at all.
SSDs benefit from high queue depth. Try issuing multiple IOs at once and keep that number outstanding at all times. The optimal concurrency will be somewhere in the range of 1-32.
Because SSDs have hardware concurrency you can expect a small multiple of the single-threaded performance. My SSD has 4 parallel "banks" for example.
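To illustrate, here is a minimal sketch of keeping several overlapped reads in flight at once; the file name, block size and queue depth are arbitrary assumptions, and a real benchmark would re-issue a new read as soon as one completes so the queue never drains:
#include <windows.h>
#include <vector>
#include <iostream>

// Issue queueDepth reads at once and wait for all of them.
// The file is opened with FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING,
// so every buffer, size and offset must be sector-aligned.
int main()
{
    const DWORD blockSize  = 4096;   // one logical sector
    const DWORD queueDepth = 16;     // number of IOs kept in flight

    HANDLE hFile = CreateFileW(L"test.dat", GENERIC_READ, 0, NULL, OPEN_EXISTING,
                               FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return 1;

    std::vector<OVERLAPPED> ov(queueDepth);
    std::vector<HANDLE>     events(queueDepth);
    std::vector<void *>     buffers(queueDepth);

    for (DWORD i = 0; i < queueDepth; ++i)
    {
        // VirtualAlloc returns page-aligned memory, which satisfies sector alignment.
        buffers[i] = VirtualAlloc(NULL, blockSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        events[i]  = CreateEvent(NULL, TRUE, FALSE, NULL);

        ZeroMemory(&ov[i], sizeof(OVERLAPPED));
        ULONGLONG offset = (ULONGLONG)i * blockSize;   // substitute your own random, aligned offsets
        ov[i].Offset     = (DWORD)(offset & 0xFFFFFFFF);
        ov[i].OffsetHigh = (DWORD)(offset >> 32);
        ov[i].hEvent     = events[i];

        if (!ReadFile(hFile, buffers[i], blockSize, NULL, &ov[i]) &&
            GetLastError() != ERROR_IO_PENDING)
        {
            std::cerr << "ReadFile failed: " << GetLastError() << "\n";
        }
    }

    // Wait until every outstanding IO has completed.
    WaitForMultipleObjects(queueDepth, events.data(), TRUE, INFINITE);

    for (DWORD i = 0; i < queueDepth; ++i)
    {
        DWORD bytes = 0;
        GetOverlappedResult(hFile, &ov[i], &bytes, FALSE);
        VirtualFree(buffers[i], 0, MEM_RELEASE);
        CloseHandle(events[i]);
    }
    CloseHandle(hFile);
    return 0;
}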
Using FILE_FLAG_WRITE_THROUGH | FILE_FLAG_NO_BUFFERING is all that is needed to achieve direct writes to hardware. If these flags do not work, your hardware does not respect them and you can't do anything about it. All server hardware respects these flags, and I have not seen a consumer disk that doesn't.
The sharing flags are not meaningful in this context.
The code is fine, although I don't see why you use async I/O and then wait on an event for completion. That makes no sense. Either use synchronous I/O (which will perform about the same as async I/O) or use async I/O with completion ports and without waiting.

Use hdparm -I /dev/sdx to check your logical and physical block size. Most modern SSDs have a 4,096-byte physical block size but also support 512-byte blocks for backward compatibility with older drives and OS software. This is done via "512-byte emulation", a.k.a. 512e. If your drive is one of the ones that does 512-byte emulation, your 512-byte accesses are actually read-modify-write operations. The SSD will try to turn sequential accesses into 4K block writes.
If you can switch to 4K block writes you will (probably) see much better numbers for IOPS as well as bandwidth, since this makes for much less work on the SSD. Random 512-byte writes also have a big impact on long-term performance due to increased write amplification.
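If you prefer to query the sizes programmatically rather than parse hdparm output, the block-device ioctls report the same values; a small sketch (the device path is an assumption, and you need read permission on the device node):
#include <fcntl.h>
#include <linux/fs.h>     // BLKSSZGET, BLKPBSZGET
#include <sys/ioctl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    const char *dev = "/dev/sda";          // assumption: adjust to your drive
    int fd = open(dev, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    int logical = 0;                        // logical (addressable) sector size
    unsigned int physical = 0;              // physical sector size reported by the drive
    if (ioctl(fd, BLKSSZGET, &logical) < 0 || ioctl(fd, BLKPBSZGET, &physical) < 0) {
        perror("ioctl");
        close(fd);
        return 1;
    }

    std::printf("logical sector: %d bytes, physical sector: %u bytes\n", logical, physical);
    close(fd);
    return 0;
}
A 512e drive will typically report a 512-byte logical sector and a 4,096-byte physical sector here.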

Related

C++ FileApi.h no caching, how do I get the disk activity to 100%

Hello and thanks for reviewing my problem.
I have a system with a PCIe x16 RAID0 controller connected to 4 Intel NVMe drives (2 TB each) through M.2 connectors.
Using the ATTO Disk Benchmark application with the file size set to 8 GB and the block size set to 2 MB, the max read rate is ~7 GB/s, and looking at Task Manager the disk activity percentage peaks at 100% during the process.
My problem:
I developed a simple C++ application using Qt Creator and the MinGW 64-bit compiler, using the FileApi.h header to open a file with system caching disabled (no buffering) and read the same block size (2 MB) from the same file size (8 GB). The result is not even close: the rate is only ~1.2 GB/s and the disk activity during the process is around 23%.
Here is my code:
#include <fileapi.h>
void main()
{
    HANDLE dataFile;
    dataFile = CreateFileA("File.bin", GENERIC_READ, 0, nullptr,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL | FILE_FLAG_NO_BUFFERING, nullptr);
    FlushFileBuffers(dataFile);
    if (dataFile == INVALID_HANDLE_VALUE)
        return;
    // Start reading 3000 times from the file
    int counter = 0;
    while (counter < 3000) {
        char *buffer = new char[pktSize * sizeof(int)];
        unsigned long read;
        ReadFile(dataFile, buffer, 2097152 /* 2 megabytes */, &read, nullptr);
        counter += 1;
        delete[] buffer;
    }
}
I appreciate any help or advice and will be super thankful.
On each iteration you allocate a new buffer in memory. That causes a lot of memory traffic and performance degradation. Allocate it once and reuse it:
char *buffer = new char[pktSize * sizeof(int)];
while (counter < 3000)
{
    unsigned long read;
    ReadFile(dataFile, buffer, 2097152 /* 2 megabytes */, &read, nullptr);
    counter += 1;
}
delete[] buffer;
You should also make sure that the buffer size pktSize*sizeof(int) is at least 2097152 (2 MB).
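Also note that FILE_FLAG_NO_BUFFERING requires the buffer address, the read size and the file offset to be multiples of the volume's sector size, and a buffer from new is not guaranteed to be aligned. A minimal sketch of a reusable, page-aligned buffer (file name and read count taken from the question, everything else an assumption):
#include <windows.h>

int main()
{
    const DWORD readSize = 2 * 1024 * 1024;   // 2 MB, a multiple of the sector size

    HANDLE dataFile = CreateFileA("File.bin", GENERIC_READ, 0, nullptr,
                                  OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL | FILE_FLAG_NO_BUFFERING,
                                  nullptr);
    if (dataFile == INVALID_HANDLE_VALUE)
        return 1;

    // VirtualAlloc returns page-aligned (4 KB) memory, which satisfies the
    // sector-alignment requirement of FILE_FLAG_NO_BUFFERING.
    char *buffer = static_cast<char *>(
        VirtualAlloc(nullptr, readSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE));

    for (int i = 0; i < 3000; ++i)
    {
        DWORD read = 0;
        if (!ReadFile(dataFile, buffer, readSize, &read, nullptr) || read < readSize)
            break;                             // end of file or error
    }

    VirtualFree(buffer, 0, MEM_RELEASE);
    CloseHandle(dataFile);
    return 0;
}
Even then, a single synchronous reader is unlikely to saturate four NVMe drives in RAID0; keeping several reads outstanding at once (a higher queue depth) is what pushes the disk activity toward 100%.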

File read() hangs on binary large file

I'm working on a benchmark program. Upon making the read() system call, the program appears to hang indefinitely. The target file is 1 GB of binary data and I'm attempting to read directly into buffers that can be 1, 10 or 100 MB in size.
I'm using std::vector<char> to implement dynamically-sized buffers and handing off &vec[0] to read(). I'm also calling open() with the O_DIRECT flag to bypass kernel caching.
The essential coding details are captured below:
std::string fpath{"/path/to/file"};
size_t tries{};
int fd{};
while (errno == EINTR && tries < MAX_ATTEMPTS) {
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}
// Throw exception if error opening file
if (fd == -1) {
    ostringstream ss {};
    switch (errno) {
    case EACCES:
        ss << "Error accessing file " << fpath << ": Permission denied";
        break;
    case EINVAL:
        ss << "Invalid file open flags; system may also not support O_DIRECT flag, required for this benchmark";
        break;
    case ENAMETOOLONG:
        ss << "Invalid path name: Too long";
        break;
    case ENOMEM:
        ss << "Kernel error: Out of memory";
    }
    throw invalid_argument {ss.str()};
}
size_t buf_sz{1024*1024};          // 1 MiB buffer
std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
// Result is 0-filled buffer of size buf_sz
auto bytes_read = read(fd, &buffer[0], buf_sz);
Poking through the executable with gdb shows that buffers are allocated correctly, and the file I've tested with checks out in xxd. I'm using g++ 7.3.1 (with C++11 support) to compile my code on a Fedora Server 27 VM.
Why is read() hanging on large binary files?
Edit: Code example updated to more accurately reflect error checking.
There are multiple problems with your code.
This code will never work properly if errno ever has a value equal to EINTR:
while (errno == EINTR && tries < MAX_ATTEMPTS) {
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}
That loop won't stop when the file has been successfully opened; as long as errno stays EINTR it will keep reopening the file over and over and leaking file descriptors.
This would be better:
do
{
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}
while ( ( -1 == fd ) && ( EINTR == errno ) && ( tries < MAX_ATTEMPTS ) );
Second, as noted in the comments, O_DIRECT can impose alignment restrictions on memory. You might need page-aligned memory:
So
size_t buf_sz{1024*1024}; // 1 MiB buffer
std::vector<char> buffer(buf_sz); // Creates vector pre-allocated with buf_sz chars (bytes)
// Result is 0-filled buffer of size buf_sz
auto bytes_read = read(fd, &buffer[0], buf_sz);
becomes
size_t buf_sz{1024*1024}; // 1 MiB buffer
// page-aligned buffer (requires <sys/mman.h>)
char *buffer = static_cast<char *>(mmap(nullptr, buf_sz, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
auto bytes_read = read(fd, buffer, buf_sz);
Note also that the Linux implementation of O_DIRECT can be very dodgy. It's been getting better, but there are still potential pitfalls that aren't well documented at all. Along with the alignment restrictions, if the last chunk of data in the file isn't a full page, for example, you may not be able to read it if the filesystem's implementation of direct I/O doesn't allow you to read anything but full pages (or some other block size). Likewise for write() calls: you may not be able to write just any number of bytes; you might be constrained to something like a 4K page.
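If mmap feels heavy-handed, posix_memalign also returns suitably aligned memory. A sketch of a read loop that respects the alignment and block-multiple constraints described above (the 4,096-byte alignment is an assumption about the filesystem's logical block size; the path is a placeholder):
#define _GNU_SOURCE            // for O_DIRECT on glibc
#include <fcntl.h>
#include <unistd.h>
#include <cstdlib>
#include <cstdio>

int main()
{
    int fd = open("/path/to/file", O_RDONLY | O_DIRECT);
    if (fd == -1) { perror("open"); return 1; }

    const size_t alignment = 4096;            // assumed logical block size
    const size_t buf_sz    = 1024 * 1024;     // 1 MiB, a multiple of the block size

    void *buffer = nullptr;
    if (posix_memalign(&buffer, alignment, buf_sz) != 0) { close(fd); return 1; }

    ssize_t n;
    while ((n = read(fd, buffer, buf_sz)) > 0) {
        // With O_DIRECT, n will normally be a multiple of the block size;
        // the final partial block of the file may not be readable this way
        // on some filesystems.
    }
    if (n == -1) perror("read");

    free(buffer);
    close(fd);
    return 0;
}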
This is also critical:
Most examples of read() hanging appear to be when using pipes or non-standard I/O devices (e.g., serial). Disk I/O, not so much.
Some devices simply do not support direct IO. They should return an error, but again, the O_DIRECT implementation on Linux can be very hit-or-miss.
Pasting your program and running it on my Linux system, it worked and did not hang.
The most likely cause of the failure is that the file is not a regular file-system item, or that the underlying hardware is not working.
Try with a smaller size to confirm, and try on a different machine to help diagnose.
My complete code (with no error checking):
#include <vector>
#include <string>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
int main( int argc, char ** argv )
{
    std::string fpath{"myfile.txt"};
    auto fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    size_t buf_sz{1024*1024};          // 1 MiB buffer
    std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
    // Result is 0-filled buffer of size buf_sz
    auto bytes_read = read(fd, &buffer[0], buf_sz);
}
myfile.txt was created with
dd if=/dev/zero of=myfile.txt bs=1024 count=1024
If the file is not 1 MB in size, it may fail.
If the file is a pipe, it can block until the data is available.
Most examples of read() hanging appear to be when using pipes or non-standard I/O devices (e.g., serial). Disk I/O, not so much.
O_DIRECT flag is useful for filesystems and block devices. With this flag people normally map pages into the user space.
For sockets, pipes and serial devices it is plain useless because the kernel does not cache that data.
Your updated code hangs because fd is initialized to 0, which is STDIN_FILENO; since the loop never actually opens the file, the read() then blocks waiting on stdin.

What is the fastest way to read a file in disk in c++?

I am writing a program to check whether a file is a PE file or not. For that, I need to read only the file headers (which I guess do not occupy more than the first 1,024 bytes of a file).
I tried using the CreateFile() + ReadFile() combination, which turns out to be slow because I am iterating through all the files on the system drive. It takes 15-20 minutes just to iterate through them.
Can you please suggest an alternative approach to open and read the files to make it faster?
Note: Please note that I do NOT need to read the whole file. I just need to read the initial part of the file -- the DOS header, PE header etc., which I guess do not occupy more than the first 512 bytes of the file.
Here is my code:
bool IsPEFile(const String filePath)
{
    HANDLE hFile = CreateFile(filePath.c_str(),
                              GENERIC_READ,
                              FILE_SHARE_READ | FILE_SHARE_WRITE,
                              NULL,
                              OPEN_EXISTING,
                              FILE_ATTRIBUTE_NORMAL,
                              NULL);
    DWORD dwBytesRead = 0;
    const DWORD CHUNK_SIZE = 2048;
    BYTE szBuffer[CHUNK_SIZE] = {0};
    LONGLONG size;
    LARGE_INTEGER li = {0};
    if (hFile != INVALID_HANDLE_VALUE)
    {
        if (GetFileSizeEx(hFile, &li) && li.QuadPart > 0)
        {
            size = li.QuadPart;
            ReadFile(hFile, szBuffer, CHUNK_SIZE, &dwBytesRead, NULL);
            if (dwBytesRead > 0 && (WORDPTR(szBuffer[0]) == ('M' << 8) + 'Z' || WORDPTR(szBuffer[0]) == ('Z' << 8) + 'M'))
            {
                LONGLONG ne_pe_header = DWORDPTR(szBuffer[0x3c]);
                WORD signature = 0;
                if (ne_pe_header <= dwBytesRead - 2)
                {
                    signature = WORDPTR(szBuffer[ne_pe_header]);
                }
                else if (ne_pe_header < size)
                {
                    SetFilePointer(hFile, ne_pe_header, NULL, FILE_BEGIN);
                    ReadFile(hFile, &signature, sizeof(signature), &dwBytesRead, NULL);
                    if (dwBytesRead != sizeof(signature))
                    {
                        return false;
                    }
                }
                if (signature == 0x4550) // PE file
                {
                    return true;
                }
            }
        }
        CloseHandle(hFile);
    }
    return false;
}
Thanks in advance.
I think you're hitting the inherent limitations of mechanical hard disk drives. You didn't mention whether you're using an HDD or a solid-state disk, but I assume an HDD given that your file accesses are slow.
HDDs can read data at about 100 MB/s sequentially, but seek time is a bit over 10 ms. This means that if you seek to a certain location (10 ms), you might as well read a megabyte of data (another 10 ms). It also means that you can access fewer than 100 files per second.
So, in your case it doesn't matter much whether you're reading the first 512 bytes of a file or the first hundred kilobytes of a file.
Hardware is cheap, programmer time is expensive. Your best bet is to purchase a solid-state disk drive if your file accesses are too slow. I predict that eventually all computers will have solid-state disk drives.
Note: if the bottleneck is the HDD, there is nothing you can do about it other than to replace the HDD with better technology. Practically all file access mechanisms are equally slow. The only thing you can do about it is to read only the initial part of a file if the file is really really large such as multiple megabytes. But based on your code example you're already doing that.
For faster file I/O, you need to use the CreateFile and ReadFile APIs of Win32.
If you want to speed things up, you can use file buffering and make the file handle non-blocking by using overlapped I/O or an I/O completion port (IOCP).
See this example for help: https://msdn.microsoft.com/en-us/library/windows/desktop/bb540534%28v=vs.85%29.aspx
And I think that C's FILE and C++'s fstream are not faster than the Win32 APIs.
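As a rough illustration of the overlapped pattern that answer refers to, here is a minimal sketch that reads the first 4 KB of a single file asynchronously and waits for completion; the file name is an assumption, and the real benefit only appears when many such requests are kept in flight across files:
#include <windows.h>
#include <iostream>

// Read the first 4 KB of a file with overlapped I/O and wait for completion.
int main()
{
    HANDLE hFile = CreateFileW(L"sample.exe", GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_FLAG_OVERLAPPED, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return 1;

    BYTE header[4096] = {0};
    OVERLAPPED ov = {0};                                 // offset 0 = start of file
    ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);

    DWORD read = 0;
    if (!ReadFile(hFile, header, sizeof(header), NULL, &ov) &&
        GetLastError() != ERROR_IO_PENDING)
    {
        std::cerr << "ReadFile failed: " << GetLastError() << "\n";
    }
    else
    {
        // bWait = TRUE blocks until this particular request has completed.
        GetOverlappedResult(hFile, &ov, &read, TRUE);
        std::cout << "Read " << read << " bytes of header\n";
    }

    CloseHandle(ov.hEvent);
    CloseHandle(hFile);
    return 0;
}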

DeviceIoControl returning unexpected physical sector size

I use DeviceIoControl to return the size of a physical disk sector. It has always returned 512 bytes, until lately when it started returning 4,096 bytes. Inspecting the resulting STORAGE_ACCESS_ALIGNMENT_DESCRIPTOR, I see that the logical and physical byte sizes have switched places. Should not the logical byte size of a disk sector always be greater than or equal to the physical sector size?
#include <Windows.h>
#include <iostream>
#pragma comment(lib, "Kernel32.lib")
int main()
{
    HANDLE hDevice;
    char cDisk = 'c'; // Get metadata about the C:\ disk
    // Build the logical drive path and get the drive device handle
    std::wstring logicalDrive = L"\\\\.\\";
    wchar_t drive[3];
    drive[0] = cDisk;
    drive[1] = L':';
    drive[2] = L'\0';
    logicalDrive.append(drive);
    hDevice = CreateFile(
        logicalDrive.c_str(),
        0,
        0,
        NULL,
        OPEN_EXISTING,
        0,
        NULL);
    if (hDevice == INVALID_HANDLE_VALUE)
    {
        std::cerr << "Error\n";
        return -1;
    }
    // Now that we have the device handle for the disk, let us get the disk's metadata
    DWORD outsize;
    STORAGE_PROPERTY_QUERY storageQuery;
    memset(&storageQuery, 0, sizeof(STORAGE_PROPERTY_QUERY));
    storageQuery.PropertyId = StorageAccessAlignmentProperty;
    storageQuery.QueryType = PropertyStandardQuery;
    STORAGE_ACCESS_ALIGNMENT_DESCRIPTOR diskAlignment = {0};
    memset(&diskAlignment, 0, sizeof(STORAGE_ACCESS_ALIGNMENT_DESCRIPTOR));
    if (!DeviceIoControl(hDevice,
                         IOCTL_STORAGE_QUERY_PROPERTY,
                         &storageQuery,
                         sizeof(STORAGE_PROPERTY_QUERY),
                         &diskAlignment,
                         sizeof(STORAGE_ACCESS_ALIGNMENT_DESCRIPTOR),
                         &outsize,
                         NULL)
        )
    {
        std::cerr << "Error\n";
        return -1;
    }
    std::cout << "Physical sector size: " << diskAlignment.BytesPerPhysicalSector << std::endl;
    std::cout << "Logical sector size: " << diskAlignment.BytesPerLogicalSector << std::endl;
    return 0;
}
Result from running the above code is:
Physical sector size: 4096
Logical sector size: 512
Running fsutil gives the same unexpected result.
C:\WINDOWS\system32>fsutil fsinfo ntfsinfo c:
NTFS Version : 3.1
LFS Version : 2.0
Number Sectors : 0x000000001741afff
Total Clusters : 0x0000000002e835ff
Free Clusters : 0x0000000000999d28
Total Reserved : 0x0000000000003260
Bytes Per Sector : 512
Bytes Per Physical Sector : 4096
Bytes Per Cluster : 4096
Bytes Per FileRecord Segment : 1024
Clusters Per FileRecord Segment : 0
What am I doing wrong?
Nothing wrong here. From the 'File Buffering' article on MSDN:
Application developers should take note of new types of storage devices being introduced into the market with a physical media sector size of 4,096 bytes. The industry name for these devices is "Advanced Format". As there may be compatibility issues with directly introducing 4,096 bytes as the unit of addressing for the media, a temporary compatibility solution is to introduce devices that emulate a regular 512-byte sector storage device but make available information about the true sector size through standard ATA and SCSI commands. As a result of this emulation, there are in essence two sector sizes that developers will need to understand:
Logical Sector: The unit that is used for logical block addressing for the media. We can also think of it as the smallest unit of write that the storage can accept. This is the "emulation".
Physical Sector: The unit for which read and write operations to the device are completed in a single operation. This is the unit of atomic write, and what unbuffered I/O will need to be aligned to in order to have optimal performance and reliability characteristics.
4,096 bytes is 8 sectors and is called a cluster. When you save a file, it is saved into one or more clusters, and the unused sectors in the last cluster are called slack. Some malware writes itself into those empty sectors to hide in plain sight. The FAT file system uses 16 or 32 sectors for one cluster.

What C++ Write function should I use?

I prefer not to use an XML parser library, so can you suggest a good write function to use for writing data to an XML file? I will make a lot of calls to the write function, so it should be able to keep track of the last write position and should not take too many resources. I have two different approaches below, but I can't keep track of the last write position unless I read the file to the end.
case#1
FILE *pfile = _tfopen(GetFileNameXML(), _T("w"));
if (pfile)
{
    _fputts(TEXT(""), pfile);
}
if (pfile)
{
    fclose(pfile);
    pfile = NULL;
}
case#2
HANDLE hFile = CreateFile(GetFileNameXML(), GENERIC_READ | GENERIC_WRITE,
                          FILE_SHARE_WRITE | FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hFile != INVALID_HANDLE_VALUE)
{
    WriteFile(hFile,,,,,);
}
CloseHandle(hFile);
thanks.
If all you need is to write some text files, use C++'s standard library file facilities. The samples here will be helpful: http://www.cplusplus.com/doc/tutorial/files/
First, what's your aversion to using a standard XML processing library?
Next, if you decide to roll your own, definitely don't go directly at the Win32 APIs - at least not unless you're going to write out the generated XML in large chunks, or you're going to implement your own buffering layer.
It's not going to matter for dealing with tiny files, but you specifically mention good performance and many calls to the write function. WriteFile has a fair amount of overhead, it does a lot of work and involves user->kernel->user mode switches, which are expensive. If you're dealing with "normally sized" XML files you probably won't be able to see much of a difference, but if you're generating monstrously sized dumps it's definitely something to keep in mind.
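Such a buffering layer does not need to be elaborate. Here is a minimal sketch of the idea; the class name and the 64 KB buffer size are my own choices rather than a library API, and error handling is omitted:
#include <windows.h>
#include <cstring>
#include <vector>

// Accumulate small writes in memory and hand them to WriteFile in large chunks,
// so the per-call kernel transition cost is paid once per ~64 KB instead of
// once per tiny XML fragment.
class BufferedWriter
{
public:
    explicit BufferedWriter(HANDLE hFile, size_t bufSize = 64 * 1024)
        : m_file(hFile), m_buf(bufSize), m_used(0) {}

    ~BufferedWriter() { Flush(); }

    void Write(const char *data, size_t len)
    {
        while (len > 0)
        {
            size_t room = m_buf.size() - m_used;
            size_t n = (len < room) ? len : room;
            std::memcpy(&m_buf[m_used], data, n);
            m_used += n;
            data += n;
            len -= n;
            if (m_used == m_buf.size())
                Flush();
        }
    }

    void Flush()
    {
        if (m_used == 0) return;
        DWORD written = 0;
        WriteFile(m_file, &m_buf[0], (DWORD)m_used, &written, NULL);  // error handling omitted
        m_used = 0;
    }

private:
    HANDLE m_file;
    std::vector<char> m_buf;
    size_t m_used;
};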
You mention tracking the last write position - first off, it should be easy... with FILE buffers you have ftell, with raw Win32 API you have SetFilePointerEx - call it with liDistanceToMove=0 and dwMoveMethod=FILE_CURRENT, and you get the current file position after a write. But why do you need this? If you're streaming out an XML file, you should generally keep on streaming until you're done writing - are you closing and re-opening the file? Or are you writing a valid XML file which you want to insert more data into later?
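For reference, a small helper wrapping the SetFilePointerEx call described above:
#include <windows.h>

// Returns the current byte offset of a raw Win32 file handle, or -1 on failure.
long long CurrentFilePosition(HANDLE hFile)
{
    LARGE_INTEGER zero = {0};
    LARGE_INTEGER position = {0};
    if (!SetFilePointerEx(hFile, zero, &position, FILE_CURRENT))
        return -1;
    return position.QuadPart;   // the offset the next WriteFile will start at
}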
As for the overhead of the Win32 file functions, it may or may not be relevant in your case (depending on the size of the files you're dealing with), but with larger files it matters a lot - included below is a micro-benchmark that simply reads a file into memory with ReadFile, letting you specify different buffer sizes from the command line. It's interesting to look at, say, Process Explorer's IO tab while running the tool. Here are some statistics from my measly laptop (Win7 SP1 x64, Core 2 Duo P7350 @ 2.0 GHz, 4 GB RAM, 120 GB Intel 320 SSD).
Take it for what it is, a micro-benchmark. The performance might or might not matter in your particular situation, but I do believe the numbers demonstrate that there's considerable overhead to the Win32 file APIs, and that doing a little buffering of your own helps.
With a fully cached 2GB file:
BlkSz Speed
32 14.4MB/s
64 28.6MB/s
128 56MB/s
256 107MB/s
512 205MB/s
1024 350MB/s
4096 800MB/s
32768 ~2GB/s
With a "so big there will only be cache misses" 4GB file:
BlkSz Speed CPU
32 13MB/s 49%
64 26MB/s 49%
128 52MB/s 49%
256 99MB/s 49%
512 180MB/s 49%
1024 200MB/s 32%
4096 185MB/s 22%
32768 205MB/s 13%
Keep in mind that 49% CPU usage means that one CPU core is pretty much fully pegged - a single thread can't really push the machine much harder. Notice the pathological behavior of the 4kb buffer in the second table - it was reproducible, and I don't have an explanation for it.
Crappy micro-benchmark code goes here:
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <iostream>
#include <string>
#include <assert.h>

unsigned getDuration(FILETIME& timeStart, FILETIME& timeEnd)
{
    // duration is in 100-nanoseconds, we want milliseconds
    // 1 millisecond = 1000 microseconds = 1000000 nanoseconds
    LARGE_INTEGER ts, te, res;
    ts.HighPart = timeStart.dwHighDateTime; ts.LowPart = timeStart.dwLowDateTime;
    te.HighPart = timeEnd.dwHighDateTime; te.LowPart = timeEnd.dwLowDateTime;
    res.QuadPart = ((te.QuadPart - ts.QuadPart) / 10000);
    assert(res.QuadPart < UINT_MAX);
    return res.QuadPart;
}

int main(int argc, char* argv[])
{
    if(argc < 3) {
        puts("Syntax: ReadFile [filename] [blocksize]");
        return 0;
    }
    char *filename = argv[1];
    int blockSize = atoi(argv[2]);
    if(blockSize < 1) {
        puts("Please specify a blocksize larger than 0");
        return 1;
    }
    HANDLE hFile = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, 0);
    if(INVALID_HANDLE_VALUE == hFile) {
        puts("error opening input file");
        return 1;
    }
    std::vector<char> buffer(blockSize);
    LARGE_INTEGER fileSize;
    if(!GetFileSizeEx(hFile, &fileSize)) {
        puts("Failed getting file size.");
        return 1;
    }
    std::cout << "File size " << fileSize.QuadPart << ", that's " << (fileSize.QuadPart / blockSize) <<
        " blocks of " << blockSize << " bytes - reading..." << std::endl;
    FILETIME dummy, kernelStart, userStart;
    GetProcessTimes(GetCurrentProcess(), &dummy, &dummy, &kernelStart, &userStart);
    DWORD ticks = GetTickCount();
    DWORD bytesRead = 0;
    do {
        if(!ReadFile(hFile, &buffer[0], blockSize, &bytesRead, 0)) {
            puts("Error calling ReadFile");
            return 1;
        }
    } while(bytesRead == blockSize);
    ticks = GetTickCount() - ticks;
    FILETIME kernelEnd, userEnd;
    GetProcessTimes(GetCurrentProcess(), &dummy, &dummy, &kernelEnd, &userEnd);
    CloseHandle(hFile);
    std::cout << "Reading with " << blockSize << " sized blocks took " << ticks << "ms, spending " <<
        getDuration(kernelStart, kernelEnd) << "ms in kernel and " <<
        getDuration(userStart, userEnd) << "ms in user mode. Hit enter to continue." << std::endl;
    std::string dummyString;
    std::cin >> dummyString;
    return 0;
}