Faster way to read file than boost::file_mapping? - c++

I'm writing a latency-sensitive app which reads a text file upon initialisation. I have profiled and re-written all my algorithms such that 85% of my execution time is from the lines:
boost::interprocess::file_mapping file(Path, read_only);
boost::interprocess::mapped_region data(file, read_only);
I am writing this on Windows. Is there any faster way to map a file into memory? Portability is not a concern.

You could just use the native Win32 functions, but I don't think you will save a lot, because Boost does not add much overhead:
OFSTRUCT ofStruct;
ofStruct.cBytes = sizeof(OFSTRUCT);
// OpenFile returns an HFILE; HFILE_ERROR signals failure.
HFILE hf = OpenFile(fileName, &ofStruct, OF_READ);
if (hf == HFILE_ERROR) {
    // handle errors
} else {
    HANDLE file = (HANDLE)(INT_PTR)hf;
    // CreateFileMapping returns NULL (not INVALID_HANDLE_VALUE) on failure.
    HANDLE map = CreateFileMapping(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (map == NULL) {
        // handle errors
    } else {
        const char *p = (const char *)MapViewOfFile(map, FILE_MAP_READ, 0, 0, 0);
        if (p) {
            // enjoy using p to read the file contents
            UnmapViewOfFile(p);
        }
        CloseHandle(map);
    }
    CloseHandle(file);
}

I would suggest dropping the idea of file mapping.
File mapping is a complicated construct and adds some overhead. A plain cached read also involves non-trivial interaction with the physical device. You could do unbuffered reads. Probably the next thing to ask is what kind of I/O you actually want: How big is the file? Is it read sequentially? Is it on the network? Do you have a choice of hardware, or is it on the customer's machine?

If the files are small, just open and read them into memory using standard Win32 CreateFile()/ReadFile() APIs.
If you're consuming each file sequentially (or could arrange your code in such a way that you do), you should specify FILE_FLAG_SEQUENTIAL_SCAN. This is a hint for the file/caching subsystem to read-ahead aggressively. For small files, the file might be read into cache before your first call to ReadFile() is issued.
Edit: As requested, Here's a snippet that illustrates reading the contents of a file into a vector of bytes using the Win32 API:
void ReadFileIntoBuffer( const std::wstring& fileName, std::vector< uint8_t >& output )
{
    HANDLE hFile( INVALID_HANDLE_VALUE );
    try
    {
        // Open the file.
        hFile = CreateFileW( fileName.c_str(),
                             GENERIC_READ,
                             FILE_SHARE_READ,
                             NULL,
                             OPEN_EXISTING,
                             FILE_FLAG_SEQUENTIAL_SCAN,
                             NULL );
        if( INVALID_HANDLE_VALUE == hFile )
            throw std::runtime_error( "Failed to open file." );
        // Fetch size
        LARGE_INTEGER fileSize;
        if( !GetFileSizeEx( hFile, &fileSize ) )
            throw std::runtime_error( "GetFileSizeEx() failed." );
        // A single ReadFile() call takes a 32-bit length, so reject larger files.
        if( fileSize.HighPart != 0 )
            throw std::runtime_error( "File is too large for a single read." );
        // Resize output buffer.
        output.resize( fileSize.LowPart );
        // Read the file contents (skip the call for an empty file).
        DWORD bytesRead = 0;
        if( !output.empty() &&
            !ReadFile( hFile, output.data(), fileSize.LowPart, &bytesRead, NULL ) )
            throw std::runtime_error( "ReadFile() failed." );
        // ReadFile() may return fewer bytes than requested; keep only what was read.
        output.resize( bytesRead );
        // Recover resources.
        CloseHandle( hFile );
    }
    catch( const std::exception& e )
    {
        // Dump the error.
        std::cout << e.what() << " GetLastError() = " << GetLastError() << std::endl;
        // Recover resources.
        if( INVALID_HANDLE_VALUE != hFile )
            CloseHandle( hFile );
        throw;
    }
}

Related

How to append in file in Windows in UnBuffered mode using CreateFile

Every time my function is called, it overwrites the file instead of appending to it. Note that I am opening the file in unbuffered mode using the flags below.
FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH
If I am using simple buffered mode it is working fine.
FILE_ATTRIBUTE_NORMAL
I am getting following error in unbuffered mode.
** ERROR ** CreateFile failed: The parameter is incorrect.
Kindly find the code snippets below. This piece of code getting called many times.
HANDLE hFile;
LPCWSTR file_path = convertCharArrayToLPCWSTR(UNBUFFERED_FILE);
hFile = CreateFile(file_path,
FILE_APPEND_DATA,
FILE_SHARE_WRITE,
NULL,
OPEN_ALWAYS,
FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
NULL
);
if (hFile == INVALID_HANDLE_VALUE)
{
std::cout << "Unable to open/create file for writing" << std::endl;
PrintError(TEXT("CreateFile failed"));
}
Data *data = new Data();
DWORD dwBytesToWrite = sizeof(Data);
DWORD dwBytesWritten = 0;
BOOL bErrorFlag = FALSE;
bErrorFlag = WriteFile(
hFile, // open file handle
data, // start of data to write
dwBytesToWrite, // number of bytes to write
&dwBytesWritten, // number of bytes that were written
NULL);
if (bErrorFlag == FALSE)
{
std::cout << "Unable to write to file" << std::endl;
PrintError(TEXT("Unable to write to file"));
}
if (dwBytesToWrite != dwBytesWritten)
{
std::cout << "Error in writing: Whole data not written" << std::endl;
PrintError(TEXT("Error in writing: Whole data not written"));
}
CloseHandle(hFile);
Kindly suggest if any alternative idea is available.
From the NtCreateFile documentation:
FILE_NO_INTERMEDIATE_BUFFERING
The file cannot be cached or buffered in a driver's internal buffers. This flag is incompatible with the DesiredAccess parameter's FILE_APPEND_DATA flag.
So when you call
CreateFile(file_path,
FILE_APPEND_DATA, // !!
FILE_SHARE_WRITE,
NULL,
OPEN_ALWAYS,
FILE_FLAG_NO_BUFFERING /*!!*/| FILE_FLAG_WRITE_THROUGH,
NULL
);
you are combining FILE_FLAG_NO_BUFFERING (which maps to FILE_NO_INTERMEDIATE_BUFFERING) with FILE_APPEND_DATA, so you must get ERROR_INVALID_PARAMETER. You need to remove one of the two flags. I suggest removing FILE_FLAG_NO_BUFFERING, because with it you can write only in integral multiples of the sector size.
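To illustrate the sector-size constraint mentioned above, here is a minimal sketch of how a write length could be rounded up before an unbuffered write. The helper name RoundUpToSector is hypothetical and not part of any Win32 API. The actual sector size should be queried from the volume (for example with GetDiskFreeSpace) rather than assumed, and unbuffered I/O also requires the buffer address and file offset to be sector-aligned.

```cpp
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not a Win32 function):
// rounds byteCount up to the next multiple of sectorSize.
// sectorSize must be non-zero; it does not have to be a power of two.
inline std::uint64_t RoundUpToSector(std::uint64_t byteCount, std::uint64_t sectorSize)
{
    const std::uint64_t remainder = byteCount % sectorSize;
    return remainder == 0 ? byteCount : byteCount + (sectorSize - remainder);
}
```

With a 512-byte sector, a 1-byte payload would need a 512-byte write, which is why a record like sizeof(Data) rarely fits unbuffered mode without padding.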

Windows: Ensure a file is written to the physical disk

I wrote C++ code that writes to a file and attempts to flush it to the physical disk. By the end of the code I want to know for sure that the file has been written to the physical disk, so that I can commit to being in a stable state even if someone unplugs the machine. Nevertheless, when I unplug the machine immediately after all of the following lines have executed, the file is lost, i.e. it has not been written to the physical disk even though I attempted to flush and used FILE_FLAG_WRITE_THROUGH.
HANDLE hFile = CreateFileA(
filePath.c_str(),
GENERIC_WRITE,
0,
NULL,
CREATE_ALWAYS,
FILE_ATTRIBUTE_NORMAL| FILE_FLAG_WRITE_THROUGH ,
NULL);
if (hFile == INVALID_HANDLE_VALUE)
{
throw MyException("CreateFile failed");
}
DWORD bytesWritten = 0;
auto errorFlag = WriteFile(
hFile,
data.data(),
static_cast<DWORD>(data.size()),
&bytesWritten,
NULL);
if (bytesWritten != data.size() || errorFlag != TRUE)
{
CloseHandle(hFile);
throw MyException("WriteFile failed" + std::to_string(GetLastError()));
}
auto ret = FlushFileBuffers(hFile);
if (!ret)
{
CloseHandle(hFile);
throw MyException("FlushFileBuffers failed");
}
CloseHandle(hFile);
// The file isn't written to the disk yet!!!
How will I make sure that the file is already on the disk so I can commit the change?

How to create my own cache for loading files from disk to memory in a C++ framework?

How can I create my own cache for loading files from disk into memory in a C++ framework? I do not want to use the Windows cache, because in some cases it does not give good results.
Is there any C++ plugin I can use directly to import multiple files from disk into memory?
Thanks in advance.
What do you mean by "windows cache does not gives the good results"? What version of Windows are we talking about here?
The Windows file cache is actually quite efficient, but there are things a developer can do to their own data (if they fully control it) which can greatly improve the performance of file I/O. In particular, if you ensure your files are organized into multiples of 4096 bytes (aka 4k), you can make use of "overlapped" I/O which avoids both blocking behavior and the need to do additional copies of memory data.
An example of this is the DirectX Tool Kit and its WaveBankReader class. The xwbtool command-line utility packs a number of audio .wav files into a single .xwb file in which each individual sound file is aligned to a 4096-byte boundary.
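The alignment idea can be sketched as follows. This is not the actual .xwb layout or xwbtool's code; the PackedEntry struct and LayOutAligned function are hypothetical names used only to show how entry offsets could be placed on 4096-byte boundaries when packing several files into one.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one packed item: where it starts and how long it is.
struct PackedEntry
{
    std::uint64_t offset;
    std::uint64_t length;
};

// Assigns each item an offset that is a multiple of `alignment` (non-zero).
// The gap between the end of one item and the start of the next is padding.
std::vector<PackedEntry> LayOutAligned(const std::vector<std::uint64_t>& sizes,
                                       std::uint64_t alignment = 4096)
{
    std::vector<PackedEntry> entries;
    std::uint64_t cursor = 0;
    for (std::uint64_t size : sizes)
    {
        entries.push_back({cursor, size});
        // Move past this item, then round up to the next aligned boundary.
        cursor += size;
        cursor = (cursor + alignment - 1) / alignment * alignment;
    }
    return entries;
}
```

For sizes of 100, 4096 and 5000 bytes, the entries would start at offsets 0, 4096 and 8192, so every read can begin on an aligned boundary.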
At runtime, the xwb reader then sets up the target memory and issues asynchronous reads. Ideally the application sets up a number of other reads, and then at some later time ensures that all async I/O is complete before using the data.
struct handle_closer { void operator()(HANDLE h) { if (h) CloseHandle(h); } };
typedef std::unique_ptr<void, handle_closer> ScopedHandle;
inline HANDLE safe_handle( HANDLE h ) { return (h == INVALID_HANDLE_VALUE) ? 0 : h; }
ScopedHandle m_event;
#if (_WIN32_WINNT >= _WIN32_WINNT_VISTA)
m_event.reset( CreateEventEx( nullptr, nullptr, CREATE_EVENT_MANUAL_RESET, EVENT_MODIFY_STATE | SYNCHRONIZE ) );
#else
m_event.reset( CreateEvent( nullptr, TRUE, FALSE, nullptr ) );
#endif
if ( !m_event )
{
return HRESULT_FROM_WIN32( GetLastError() );
}
#if (_WIN32_WINNT >= _WIN32_WINNT_WIN8)
CREATEFILE2_EXTENDED_PARAMETERS params = { sizeof(CREATEFILE2_EXTENDED_PARAMETERS), 0 };
params.dwFileAttributes = FILE_ATTRIBUTE_NORMAL;
params.dwFileFlags = FILE_FLAG_OVERLAPPED | FILE_FLAG_SEQUENTIAL_SCAN;
ScopedHandle hFile( safe_handle( CreateFile2( szFileName,
GENERIC_READ,
FILE_SHARE_READ,
OPEN_EXISTING,
&params ) ) );
#else
ScopedHandle hFile( safe_handle( CreateFileW( szFileName,
GENERIC_READ,
FILE_SHARE_READ,
nullptr,
OPEN_EXISTING,
FILE_FLAG_OVERLAPPED | FILE_FLAG_SEQUENTIAL_SCAN,
nullptr ) ) );
#endif
if ( !hFile )
{
return HRESULT_FROM_WIN32( GetLastError() );
}
// Read and verify header
OVERLAPPED request;
memset( &request, 0, sizeof(request) );
request.hEvent = m_event.get();
bool wait = false;
if( !ReadFile( hFile.get(), &m_header, sizeof( m_header ), nullptr, &request ) )
{
DWORD error = GetLastError();
if ( error != ERROR_IO_PENDING )
return HRESULT_FROM_WIN32( error );
wait = true;
}
DWORD bytes;
#if (_WIN32_WINNT >= _WIN32_WINNT_WIN8)
BOOL result = GetOverlappedResultEx( hFile.get(), &request, &bytes, INFINITE, FALSE );
#else
if ( wait )
(void)WaitForSingleObject( m_event.get(), INFINITE );
BOOL result = GetOverlappedResult( hFile.get(), &request, &bytes, FALSE );
#endif
if ( !result || ( bytes != sizeof( m_header ) ) )
{
return HRESULT_FROM_WIN32( GetLastError() );
}
// ... code here to verify and parse header cut for readability ...
m_waveData.reset( new (std::nothrow) uint8_t[ waveLen ] );
if ( !m_waveData )
return E_OUTOFMEMORY;
dest = m_waveData.get();
memset( &m_request, 0, sizeof(OVERLAPPED) );
m_request.Offset = m_header.Segments[HEADER::SEGIDX_ENTRYWAVEDATA].dwOffset;
m_request.hEvent = m_event.get();
if ( !ReadFile( hFile.get(), dest, waveLen, nullptr, &m_request ) )
{
DWORD error = GetLastError();
if ( error != ERROR_IO_PENDING )
return HRESULT_FROM_WIN32( error );
}
else
{
m_prepared = true;
memset( &m_request, 0, sizeof(OVERLAPPED) );
}
// ...
// At some later point we need to check to see if the data is ready
// or wait if the data is not yet ready
if ( !m_prepared )
{
WaitForSingleObjectEx( m_request.hEvent, INFINITE, FALSE );
m_prepared = true;
}
This code uses the buffering hint FILE_FLAG_SEQUENTIAL_SCAN to indicate that the file will be read sequentially. You can also use the hint FILE_FLAG_RANDOM_ACCESS if the file really will be accessed randomly instead, but it is more efficient if you can arrange your data for a sequential scan.
The complexity here is that this code builds for Windows Vista, Windows 7, Windows 8.x, Windows 10, Xbox One, Windows Phone 8, Windows 8 Store, and universal Windows apps. Namely, I'm using the improved GetOverlappedResultEx on Windows 8 or later, but have to emulate it on older versions of the OS with WaitForSingleObject and GetOverlappedResult.
Having a few dozen outstanding read-requests of a reasonable size can also help optimize disk seek behavior but it is important not to flood the system with lots of small requests. Generally prefer to make read requests of 32k or greater at a time.
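As a rough illustration of the request-sizing advice, the sketch below splits a file length into a list of offset/length pairs of a fixed chunk size. The ReadRequest struct, the PlanReads function, and the 64 KB default are assumptions made for this example; in real code each entry would be turned into one OVERLAPPED structure (its Offset and OffsetHigh fields) and one ReadFile call.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one asynchronous read to issue.
struct ReadRequest
{
    std::uint64_t offset;
    std::uint32_t length;
};

// Splits [0, fileSize) into consecutive chunks of chunkSize bytes;
// only the last chunk may be shorter. Returns an empty plan if chunkSize is 0.
std::vector<ReadRequest> PlanReads(std::uint64_t fileSize,
                                   std::uint32_t chunkSize = 64 * 1024)
{
    std::vector<ReadRequest> plan;
    if (chunkSize == 0)
        return plan;
    for (std::uint64_t offset = 0; offset < fileSize; offset += chunkSize)
    {
        const std::uint64_t remaining = fileSize - offset;
        const std::uint32_t length =
            remaining < chunkSize ? static_cast<std::uint32_t>(remaining) : chunkSize;
        plan.push_back({offset, length});
    }
    return plan;
}
```

A 100000-byte file planned with 32768-byte chunks yields four requests, the last one covering the remaining 1696 bytes.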
If you want to bypass the file cache for some reason (say you are doing streaming of audio and don't want any extra copies of it in memory anywhere because you know the data will only get used once before you read it again) you can use FILE_FLAG_NO_BUFFERING--be sure you aren't opening another handle to the same file without this flag or it will get buffered anyhow:
#if (_WIN32_WINNT >= _WIN32_WINNT_WIN8)
CREATEFILE2_EXTENDED_PARAMETERS params2 = { sizeof(CREATEFILE2_EXTENDED_PARAMETERS), 0 };
params2.dwFileAttributes = FILE_ATTRIBUTE_NORMAL;
params2.dwFileFlags = FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING;
m_async = CreateFile2( szFileName,
GENERIC_READ,
FILE_SHARE_READ,
OPEN_EXISTING,
&params2 );
#else
m_async = CreateFileW( szFileName,
GENERIC_READ,
FILE_SHARE_READ,
nullptr,
OPEN_EXISTING,
FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING,
nullptr );
#endif
As with all optimizations, be sure to profile both with and without FILE_FLAG_NO_BUFFERING under real world loads to make sure you aren't actually making things slower by using it.

C++ Download File WinInet - 0kb written to file

Can someone tell me what is wrong with my code?
I am trying to download a file from the internet using WinInet. The function connects to the target site just fine, I don't understand why this code isn't working. Can anyone help me out?
Here is my code:
HANDLE hFile = CreateFileW(FilePath, GENERIC_WRITE, NULL, NULL, CREATE_ALWAYS, NULL, NULL);
if (hFile != INVALID_HANDLE_VALUE || GetLastError() == ERROR_ALREADY_EXISTS)
{
CHAR Buffer[2048];
DWORD BytesRead=0, BytesToRead=0;
DWORD BytesWritten=0, BytesToWrite=0;
SetFilePointer(hFile, 0, 0, FILE_BEGIN);
do
{
if (BytesRead)
{
WriteFile(hFile, Buffer, BytesWritten, &BytesToWrite, FALSE);
}
}
while
(InternetReadFile(hRequest, (LPVOID)Buffer, BytesToRead, &BytesRead) != FALSE);
}
CloseHandle(hFile);
}
hRequest is passed to the function, it is the HINTERNET handle from HttpOpenRequestA.
Your code has some logic problems.
You are misusing GetLastError() when calling CreateFileW(). Regardless of whether the file already exists or not, CreateFileW() will not return INVALID_HANDLE_VALUE if it successfully creates/opens the file. That is all you need to check for (call GetLastError() only if CreateFileW() fails and you want to find out why). Also, there is no need to call SetFilePointer() at all, as CREATE_ALWAYS ensures the opened file is empty, truncating the file if it already exists and has data in it.
Your do..while loop should be a while loop instead, so that InternetReadFile() is called first. There is no point in skipping WriteFile() on the first loop iteration. If you use a do..while loop, InternetReadFile() should not be used as the loop condition.
More importantly, you are breaking the loop only if InternetReadFile() fails with an error. You are expecting it to fail when it reaches the end of the response, but it actually returns TRUE and sets BytesRead to 0. This is documented behavior, which you are not handling at all:
InternetReadFile function
InternetReadFile operates much like the base ReadFile function, with a few exceptions. Typically, InternetReadFile retrieves data from an HINTERNET handle as a sequential stream of bytes. The amount of data to be read for each call to InternetReadFile is specified by the dwNumberOfBytesToRead parameter and the data is returned in the lpBuffer parameter. A normal read retrieves the specified dwNumberOfBytesToRead for each call to InternetReadFile until the end of the file is reached. To ensure all data is retrieved, an application must continue to call the InternetReadFile function until the function returns TRUE and the lpdwNumberOfBytesRead parameter equals zero. This is especially important if the requested data is written to the cache, because otherwise the cache will not be properly updated and the file downloaded will not be committed to the cache. Note that caching happens automatically unless the original request to open the data stream set the INTERNET_FLAG_NO_CACHE_WRITE flag.
ReadFile function
When a synchronous read operation reaches the end of a file, ReadFile returns TRUE and sets *lpNumberOfBytesRead to zero.
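When calling WriteFile(), you are passing BytesWritten to the nNumberOfBytesToWrite parameter, but BytesWritten is never set to anything other than 0, so nothing gets written to the file. You need to pass BytesRead instead. Note also that BytesToRead is 0 in your code, so InternetReadFile() is never asked to read any data; pass the size of the buffer.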
With that said, use something more like this:
HANDLE hFile = CreateFileW(FilePath, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
if (hFile == INVALID_HANDLE_VALUE)
{
// handle error as needed...
}
else
{
BYTE Buffer[2048];
DWORD BytesRead, BytesWritten;
do
{
if (!InternetReadFile(hRequest, Buffer, sizeof(Buffer), &BytesRead))
{
// handle error as needed...
break;
}
if (!BytesRead)
break;
if (!WriteFile(hFile, Buffer, BytesRead, &BytesWritten, NULL))
{
// handle error as needed...
break;
}
}
while (true);
CloseHandle(hFile);
}
MSDN even has a full example of how to use InternetReadFile():
HOWTO: Using InternetReadFile To Get File
BOOL GetFile (HINTERNET IN hOpen, // Handle from InternetOpen()
CHAR *szUrl, // Full URL
CHAR *szFileName) // Local file name
{
DWORD dwSize;
CHAR szHead[] = "Accept: */*\r\n\r\n";
VOID * szTemp[25];
HINTERNET hConnect;
FILE * pFile;
if ( !(hConnect = InternetOpenUrl ( hOpen, szUrl, szHead,
lstrlen (szHead), INTERNET_FLAG_DONT_CACHE, 0)))
{
cerr << "Error !" << endl;
return 0;
}
if ( !(pFile = fopen (szFileName, "wb" ) ) )
{
cerr << "Error !" << endl;
return FALSE;
}
do
{
// Keep copying in 50-byte chunks while the file has any data left.
// Note: bigger buffer will greatly improve performance.
if (!InternetReadFile (hConnect, szTemp, 50, &dwSize) )
{
fclose (pFile);
cerr << "Error !" << endl;
return FALSE;
}
if (!dwSize)
break; // Condition of dwSize=0 indicate EOF. Stop.
else
fwrite(szTemp, sizeof (char), dwSize , pFile);
} // do
while (TRUE);
fflush (pFile);
fclose (pFile);
return TRUE;
}

c++ check if file is empty

I got a project in C++ which I need to edit. This is a declaration of variable:
// Attachment
OFSTRUCT ofstruct;
HFILE hFile = OpenFile( mmsHandle->hTemporalFileName , &ofstruct , OF_READ );
DWORD hFileSize = GetFileSize( (HANDLE) hFile , NULL );
LPSTR hFileBuffer = (LPSTR)GlobalAlloc(GPTR, sizeof(CHAR) * hFileSize );
DWORD hFileSizeReaded = 0;
ReadFile( (HANDLE) hFile , hFileBuffer, hFileSize, &hFileSizeReaded, NULL );
CloseHandle( (HANDLE) hFile );
I need to check if the file is attached (I suppose I need to check if hFile has any value), but don't know how. I tried with hFile == NULL but this doesn't do the job.
Thanks,
Ile
Compare hFile with HFILE_ERROR (not with NULL!). Also, you should change OpenFile to CreateFile and call it properly, OpenFile has long been deprecated. In fact MSDN clearly states:
OpenFile Function
Only use this function with 16-bit
versions of Windows. For newer
applications, use the CreateFile
function.
When you make this change, you will get a HANDLE back, which you should compare with INVALID_HANDLE_VALUE.
Update: Correct way to get a file's size:
LARGE_INTEGER fileSize={0};
// You may want to use a security descriptor, tweak file sharing, etc...
// But this is a boiler plate file open
HANDLE hFile=CreateFile(mmsHandle->hTemporalFileName,GENERIC_READ,0,NULL,
OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
if (hFile!=INVALID_HANDLE_VALUE && GetFileSizeEx(hFile,&fileSize) &&
fileSize.QuadPart!=0)
{
// The file has size
}
else
{
// The file is missing or size==0 (or an error occurred getting its size)
}
// Do whatever else and don't forget to close the file handle when done!
if (hFile!=INVALID_HANDLE_VALUE)
CloseHandle(hFile);
Before you open the file, you can try this:
WIN32_FIND_DATA wfd;
HANDLE h = FindFirstFile(filename, &wfd);
if (h != INVALID_HANDLE_VALUE)
{
    // file exists
    if (wfd.nFileSizeHigh != 0 || wfd.nFileSizeLow != 0)
    {
        // file is not empty
    }
    FindClose(h);
}