I have a function that works fine when run under the Visual Studio debugger (in both the Debug and Release configurations), but when the app is run outside the IDE, just as an end user would run it, the program crashes. This happens with both the Debug and Release builds.
I'm aware of the differences that can exist between the Debug and Release configurations (optimizations, debug symbols, etc.) and at least somewhat aware of the differences between running an app inside Visual Studio versus outside of it (debug heap, working directory, etc.). I've looked at several of these things and none seem to address the issue. This is actually my first time posting to SO; normally I can find the solution in existing posts, so I'm truly stumped!
I am able to attach a debugger, and oddly enough I get two different error messages depending on whether I'm running the app on Windows 7 or Windows 8.1. On Windows 7, the error is simply an access violation, and it breaks right on the return statement. On Windows 8.1, it is a heap corruption error, and it breaks on the construction of the std::ifstream. In both cases, all of the local variables are populated correctly, so I know it is not a matter of the function failing to find the file or to read its contents into the data buffer.
Also interestingly, the issue happens only about 20% of the time on Windows 8.1 and 100% of the time on Windows 7, though this may have something to do with the vastly different hardware these OSes are running on.
I'm not sure it makes any difference but the project type is a Win32 Desktop App and it initializes DirectX 11. You'll notice that the file type is interpreted as binary, which is correct as this function is primarily loading compiled shaders.
Here is the static member function LoadFile:
HRESULT MyClass::LoadFile(_In_ const CHAR* filename, _Out_ BYTE** data, _Out_ SIZE_T* length)
{
    CHAR pwd[MAX_PATH];
    GetCurrentDirectoryA(MAX_PATH, pwd);
    std::string fullFilePath = std::string(pwd) + "\\" + filename;
    std::ifstream file(fullFilePath, std::ifstream::binary);
    if (file)
    {
        file.seekg(0, file.end);
        *length = (SIZE_T)file.tellg();
        file.seekg(0, file.beg);
        *data = new BYTE[*length];
        file.read(reinterpret_cast<CHAR*>(*data), *length);
        if (file) return S_OK;
    }
    return E_FAIL;
}
UPDATE:
Interestingly, if I allocate the std::ifstream on the heap and never delete it, the issue goes away. There must be something about the destruction of the ifstream that is causing a problem in my case.
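For reference, the workaround looks like this (a sketch only; it deliberately leaks the stream so that ~ifstream never runs, which is a band-aid rather than a fix):

// Workaround sketch: heap-allocate the stream and never delete it.
// This leaks one std::ifstream per call.
std::ifstream* file = new std::ifstream(fullFilePath, std::ifstream::binary);
if (*file)
{
    // ... same seekg/tellg/read logic as above, through *file ...
}
// intentionally no `delete file;`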
You don't check the return value of GetCurrentDirectoryA - maybe your current directory name is too long or something?
If you are already using Win32 (not portable!), use GetFileSize to get the file size rather than seeking
Better yet, use boost to write portable code
Switch on all warnings in compiler options
Enable iostream exceptions (see the sketch below)
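A minimal sketch of the first and last suggestions applied to the original function (assuming the same surrounding code and parameters):

CHAR pwd[MAX_PATH];
DWORD len = GetCurrentDirectoryA(MAX_PATH, pwd);
if (len == 0 || len >= MAX_PATH) return E_FAIL;   // check the return value

std::string fullFilePath = std::string(pwd) + "\\" + filename;
std::ifstream file;
// Throw on failbit/badbit instead of silently entering an error state.
file.exceptions(std::ifstream::failbit | std::ifstream::badbit);
try
{
    file.open(fullFilePath, std::ifstream::binary);
    // ... seek/tell/read as before ...
}
catch (const std::ios_base::failure&)
{
    return E_FAIL;
}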
Okay, I gave up on trying to use ifstream. Apparently I'm not the only one who has this issue; just search for "ifstream destructor crash".
Since this app is based on DirectX and will only be run on Windows, I went the Windows API route and everything works perfectly.
Working code, in case anyone cares:
HRESULT MyClass::LoadFile(_In_ const CHAR* filename, _Out_ BYTE** data, _Out_ SIZE_T* length)
{
    CHAR pwd[MAX_PATH];
    GetCurrentDirectoryA(MAX_PATH, pwd);
    string fullFilePath = string(pwd) + "\\" + filename;
    WIN32_FIND_DATAA fileData;
    HANDLE find = FindFirstFileA(fullFilePath.c_str(), &fileData);
    if (find == INVALID_HANDLE_VALUE) return E_FAIL;
    FindClose(find); // a find handle is not a file handle; close it here, not via CloseHandle
    HANDLE file = CreateFileA(fullFilePath.c_str(),
                              GENERIC_READ,
                              FILE_SHARE_READ,
                              NULL,
                              OPEN_EXISTING,
                              FILE_ATTRIBUTE_NORMAL,
                              NULL);
    if (file == INVALID_HANDLE_VALUE) return E_FAIL;
    *length = (SIZE_T)fileData.nFileSizeLow; // fine for shaders; files over 4GB would also need nFileSizeHigh
    *data = new BYTE[*length];
    DWORD bytesRead;
    if (ReadFile(file, *data, (DWORD)*length, &bytesRead, NULL) == FALSE || bytesRead != *length)
    {
        delete[] *data;
        *length = 0;
        CloseHandle(file);
        return E_FAIL;
    }
    CloseHandle(file);
    return S_OK;
}
Related
I am porting C++ code from Linux to Windows, currently using Visual Studio 2013.
I need to read a binary file and am using this portion of C++ code:
// Open the stream
std::ifstream is("myfile.bin");
// Determine the file length
is.seekg(0, std::ios_base::end);
std::size_t size=is.tellg();
is.seekg(0, std::ios_base::begin);
// Create a vector to store the data
int* Data = new int[size/sizeof(int)];
// Load the data
is.read((char*) &Data[0], size);
// Close the file
is.close();
On Linux, the size of my binary file is correctly found to be 744 MB. However, on Windows, the size of my binary file is incorrectly found to be over 4 GB. How can I correct this issue?
Change std::ifstream is("myfile.bin"); to std::ifstream is("myfile.bin", std::ios::binary);
With your current open mode, the file is opened in text mode. On Windows, text mode translates \r\n sequences to \n and treats a Ctrl-Z byte (0x1A) as end of file, so stream positions don't correspond to raw byte offsets. Binary data should always be read with std::ios::binary.
I finally had the time to actually run this myself, though I had to fix a couple of things, like std::ios_base::beg instead of begin (which is not a valid seek direction). Also, as mentioned, the array allocation should round up: int* Data = new int[size / sizeof(int) + 1]; // at most one extra int
I found your problem: you're not in the right directory. Check whether you successfully opened the file. If the open fails, tellg returns -1, which converts to a huge unsigned value for size.
Try this to find your current directory on Windows (you'll probably need Windows.h or something that I happened to have included already):
char dirBuf[256];
GetCurrentDirectoryA(256, dirBuf);  // explicit ANSI version to match the char buffer
cout << "Current directory is: " << dirBuf << endl;
See if that's where your file is and move it accordingly. Or specify the ENTIRE path in the constructor to ifstream.
Also, it has nothing to do with ios::binary or not. Works fine both ways, or fails if the file isn't there.
std::size_t size=is.tellg();
The standard doesn't require tellg to return the byte offset from the beginning of the file. In general, this may not be a reliable way to get the size of the file, though it probably does what you expect on Linux and Windows.
The return type of the tellg method is std::basic_istream::pos_type, so you're starting with an implicit conversion to std::size_t, which may or may not be appropriate. In a 32-bit build, for example, it's conceivable that the size of a file could be larger than a std::size_t can represent.
But the root problem is that you're not checking for errors. If you have exceptions disabled, then tellg reports an error by returning pos_type(-1). When you cast that to an unsigned type (which std::size_t is), then you get a very large value. I suspect you failed to open the file, and since you didn't detect that error, the seekg and the tellg failed. You then coerced pos_type(-1) to a std::size_t, which made it look like the file was huge.
You also have the problems others have noted: failing to open the file in binary mode and computing the wrong size for the buffer when the file isn't a multiple of the size of an int.
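For completeness, a minimal sketch of the missing checks on the stream path, assuming exceptions stay disabled:

std::ifstream is("myfile.bin", std::ios::binary);
if (!is) { /* the file did not open; stop here */ }
is.seekg(0, std::ios_base::end);
std::istream::pos_type pos = is.tellg();
if (pos == std::istream::pos_type(-1)) { /* seek/tell failed */ }
std::size_t size = static_cast<std::size_t>(pos);
is.seekg(0, std::ios_base::beg);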
The most reliable way to get the file size is to use the OS's API. On Windows, you can do this instead:
// Open the file. [TODO: Get the file name in wide characters and use
// CreateFileW instead. If the file name contains characters not
// representable by the user's ANSI codepage, then CreateFileA will fail.]
HANDLE hfile = CreateFileA("myfile.bin", GENERIC_READ, FILE_SHARE_READ,
                           nullptr, OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN,
                           nullptr);
if (hfile == INVALID_HANDLE_VALUE) { /* error handling here */ }

// Figure out how big it is.
LARGE_INTEGER li_size;
if (!GetFileSizeEx(hfile, &li_size)) { /* error handling here */ }

// TODO: On a 32-bit build, this won't be able to handle huge files,
// so check that here.
std::size_t size = static_cast<std::size_t>(li_size.QuadPart);

// Create a buffer to store the data, being careful to round up to a
// multiple of sizeof(int). [TODO: Use a std::vector instead.]
int* Data = new int[(size + sizeof(int) - 1) / sizeof(int)];

// Load the data.
const DWORD BytesToRead = static_cast<DWORD>(size);
DWORD BytesRead = 0;
if (!ReadFile(hfile, Data, BytesToRead, &BytesRead, nullptr) ||
    BytesRead < BytesToRead) {
    /* error handling here */
}

// Close the file.
CloseHandle(hfile);
int* Data = new int[size/sizeof(int)];
Why are you doing this? You're dividing the byte count by sizeof(int), which truncates whenever the size isn't a multiple of four. Either round the element count up, or note that int* Data = new int[size] would also avoid the truncation, at the cost of allocating four times the needed memory.
Also, it should be std::ifstream f("filename.bin", std::ios::binary);
I have a function to get the file size of a file. I am running this on WinCE. Here is my current code, which seems particularly slow:
int Directory::GetFileSize(const std::string &filepath)
{
    int filesize = -1;
#ifdef linux
    struct stat fileStats;
    if (stat(filepath.c_str(), &fileStats) != -1)
        filesize = fileStats.st_size;
#else
    std::wstring widePath;
    Unicode::AnsiToUnicode(widePath, filepath);
    HANDLE hFile = CreateFile(widePath.c_str(), 0,
                              FILE_SHARE_READ | FILE_SHARE_WRITE, 0,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
    if (hFile != INVALID_HANDLE_VALUE)  // CreateFile returns INVALID_HANDLE_VALUE, not 0, on failure
    {
        filesize = ::GetFileSize(hFile, NULL);
        CloseHandle(hFile);  // only close a handle that was actually opened
    }
#endif
    return filesize;
}
At least for Windows, I think I'd use something like this:
__int64 Directory::GetFileSize(std::wstring const &path) {
    WIN32_FIND_DATAW data;
    HANDLE h = FindFirstFileW(path.c_str(), &data);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    FindClose(h);
    return data.nFileSizeLow | ((__int64)data.nFileSizeHigh << 32);
}
If the compiler you're using supports it, you might want to use long long instead of __int64. You probably do not want to use int though, as that will only work correctly for files up to 2 gigabytes, and files larger than that are now pretty common (though perhaps not so common on a WinCE device).
I'd expect this to be faster than most other methods though. It doesn't require opening the file itself at all, just finding the file's directory entry (or, in the case of something like NTFS, its master file table entry).
Your solution is already rather fast to query the size of a file.
Under Windows, at least for NTFS and FAT, the file system driver will keep the file size in the cache, so it is rather fast to query it. The most time-consuming work involved is switching from user-mode to kernel-mode, rather than the file system driver's processing.
If you want to make it even faster, you have to use your own cache policy in user mode, e.g. a special hash table, to avoid switching from user mode to kernel mode. But I don't recommend doing that, because you will gain little performance.
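If you did try it anyway, the shape of such a cache might look like this (a hypothetical sketch; it has no invalidation, so it will return stale sizes if files change):

#include <windows.h>
#include <string>
#include <unordered_map>

class FileSizeCache {
public:
    long long Get(const std::wstring& path) {
        auto it = cache_.find(path);
        if (it != cache_.end())
            return it->second;                  // hit: no kernel transition
        long long size = QueryFileSize(path);   // miss: one Win32 query
        cache_[path] = size;
        return size;
    }
private:
    static long long QueryFileSize(const std::wstring& path) {
        WIN32_FILE_ATTRIBUTE_DATA fad;
        if (!GetFileAttributesExW(path.c_str(), GetFileExInfoStandard, &fad))
            return -1;
        return ((long long)fad.nFileSizeHigh << 32) | fad.nFileSizeLow;
    }
    std::unordered_map<std::wstring, long long> cache_;
};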
PS: You'd better avoid the statement Unicode::AnsiToUnicode(widePath, filepath); in your function body. This function is rather time-consuming.
Just an idea (I haven't tested it), but I would expect GetFileAttributesEx to be fastest at the system level. It avoids having to open the file, and logically I would expect it to be faster than FindFirstFile, since it doesn't have to maintain any information for continuing the search.
You could roll your own but I don't see why your approach is slow:
int Get_Size( string path )
{
    // #include <cstdio>  (this uses the C stdio API, not <fstream>)
    FILE *pFile = NULL;
    // get the file stream; bail out if the file could not be opened
    if ( fopen_s( &pFile, path.c_str(), "rb" ) != 0 || pFile == NULL )
        return -1;
    // set the file pointer to end of file
    fseek( pFile, 0, SEEK_END );
    // get the file size (note: ftell returns long, so this caps at 2 GB)
    int Size = (int)ftell( pFile );
    // return the file pointer to begin of file if you want to read it
    // rewind( pFile );
    // close stream and release buffer
    fclose( pFile );
    return Size;
}
I'm trying to get the file size of a large file (12 GB+), and I don't want to open the file to do so, as I assume this would eat a lot of resources. Is there any good API to do it with? I'm in a Windows environment.
You should call GetFileSizeEx which is easier to use than the older GetFileSize. You will need to open the file by calling CreateFile but that's a cheap operation. Your assumption that opening a file is expensive, even a 12GB file, is false.
You could use the following function to get the job done:
__int64 FileSize(const wchar_t* name)
{
    HANDLE hFile = CreateFile(name, GENERIC_READ,
                              FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING,
                              FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return -1; // error condition, could call GetLastError to find out more
    LARGE_INTEGER size;
    if (!GetFileSizeEx(hFile, &size))
    {
        CloseHandle(hFile);
        return -1; // error condition, could call GetLastError to find out more
    }
    CloseHandle(hFile);
    return size.QuadPart;
}
There are other API calls that will return you the file size without forcing you to create a file handle, notably GetFileAttributesEx. However, it's perfectly plausible that this function will just open the file behind the scenes.
__int64 FileSize(const wchar_t* name)
{
    WIN32_FILE_ATTRIBUTE_DATA fad;
    if (!GetFileAttributesEx(name, GetFileExInfoStandard, &fad))
        return -1; // error condition, could call GetLastError to find out more
    LARGE_INTEGER size;
    size.HighPart = fad.nFileSizeHigh;
    size.LowPart = fad.nFileSizeLow;
    return size.QuadPart;
}
If you are compiling with Visual Studio and want to avoid calling Win32 APIs then you can use _wstat64.
Here is a _wstat64 based version of the function:
__int64 FileSize(const wchar_t* name)
{
    __stat64 buf;
    if (_wstat64(name, &buf) != 0)
        return -1; // error, could use errno to find out more
    return buf.st_size;
}
If performance ever became an issue for you then you should time the various options on all the platforms that you target in order to reach a decision. Don't assume that the APIs that don't require you to call CreateFile will be faster. They might be but you won't know until you have timed it.
I've also lived with the fear of the price paid for opening a file and closing it just to get its size, and decided to ask the performance counter and see how expensive the operations really are.
This is the number of cycles it took to execute one file-size query on the same file with the three methods, tested on two files: 150 MB and 1.5 GB. I got +/- 10% fluctuations, so the numbers don't seem to be affected by the actual file size. (Obviously this depends on the CPU, but it gives you a good vantage point.)
190 cycles - CreateFile, GetFileSizeEx, CloseHandle
40 cycles - GetFileAttributesEx
150 cycles - FindFirstFile, FindClose
The gist with the code used is available here.
As we can see from this highly scientific :) test, the slowest is actually the file opener. The second slowest is the file finder, while the winner is the attributes reader. Now, in terms of reliability, CreateFile should be preferred over the other two. But I still don't like the concept of opening a file just to read its size... Unless I'm doing size-critical stuff, I'll go for the attributes.
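The original gist isn't reproduced here, but a measurement of this kind might look roughly like the following (a sketch using __rdtsc; the query function is a stand-in for each of the three variants listed above):

#include <windows.h>
#include <intrin.h>

// Hypothetical harness: `query` is one of the three size-query variants.
unsigned long long MeasureCycles(__int64 (*query)(const wchar_t*),
                                 const wchar_t* path, int iters = 1000)
{
    unsigned long long start = __rdtsc();
    for (int i = 0; i < iters; ++i)
        query(path);
    return (__rdtsc() - start) / iters;   // average cycles per query
}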
PS: When I have time, I'll try reading the sizes of files that are open and being written to. But not right now...
Another option, using the FindFirstFile function:
#include "stdafx.h"
#include <windows.h>
#include <tchar.h>
#include <stdio.h>
int _tmain(int argc, _TCHAR* argv[])
{
WIN32_FIND_DATA FindFileData;
HANDLE hFind;
LPCTSTR lpFileName = L"C:\\Foo\\Bar.ext";
hFind = FindFirstFile(lpFileName , &FindFileData);
if (hFind == INVALID_HANDLE_VALUE)
{
printf ("File not found (%d)\n", GetLastError());
return -1;
}
else
{
ULONGLONG FileSize = FindFileData.nFileSizeHigh;
FileSize <<= sizeof( FindFileData.nFileSizeHigh ) * 8;
FileSize |= FindFileData.nFileSizeLow;
_tprintf (TEXT("file size is %u\n"), FileSize);
FindClose(hFind);
}
return 0;
}
As of C++17, there is std::filesystem::file_size as part of the standard library. (Then the implementor gets to decide how to do it efficiently!)
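For example (a minimal C++17 sketch; the path is a placeholder):

#include <cstdio>
#include <filesystem>
#include <system_error>

int main()
{
    std::error_code ec;
    const auto size = std::filesystem::file_size("C:\\Foo\\Bar.ext", ec);
    if (ec)
        std::printf("error: %s\n", ec.message().c_str());
    else
        std::printf("file size is %llu\n", (unsigned long long)size);
}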
What about the GetFileSize function?
I want to read a file from the hard disk that is ~4-5 GB in size, not all at once but in sequential parts of ~100 MB. I want to make it as simple and fast as possible, but I've found that the standard C++ methods don't work for files bigger than 2 GB.
I use Visual Studio 2008, C++/CLI. Any suggestions? I've tried CreateFile and ReadFile, but they cause me more problems than they solve, or I'm using them wrong for reading a big file in parts.
EDIT: Sample code:
Creating the handle:
hFile = CreateFile(result,
                   GENERIC_READ,
                   FILE_SHARE_READ,
                   NULL,
                   OPEN_EXISTING,
                   FILE_ATTRIBUTE_NORMAL
                   | FILE_FLAG_NO_BUFFERING
                   | FILE_FLAG_OVERLAPPED,
                   0);
Reading:
lpOverlapped = new OVERLAPPED;
lpOverlapped->hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
lpOverlapped->Offset = 10;
lpOverlapped->OffsetHigh = 0;
DWORD howMuchWasRead;
BOOLEAN error = false;
do {
    this->lastError = NO_ERROR;
    BOOL bRet = ReadFile(this->hFile, this->fileBuffer, this->currentBufferSize,
                         &howMuchWasRead, lpOverlapped);
    this->lastError = GetLastError();
    if (this->lastError == ERROR_IO_PENDING) {
        while (!HasOverlappedIoCompleted(this->lpOverlapped)) {}
        error = true;
    } else {
        error = false;
    }
} while (error == true);
This version now returns ERROR_INVALID_PARAMETER 87 (0x57) for a 4 GB .iso file; the buffer size is 100 MB.
You can map parts of the file into the address space of your process using CreateFile, CreateFileMapping and MapViewOfFile.
You can read the file sequentially without any problems.
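A sketch of what that looks like for one ~100 MB window (the file name is a placeholder and error checks are abbreviated; each view offset must be a multiple of the system allocation granularity, which 100 MB happens to be):

HANDLE hFile = CreateFileW(L"big.iso", GENERIC_READ, FILE_SHARE_READ,
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
HANDLE hMap = CreateFileMappingW(hFile, NULL, PAGE_READONLY, 0, 0, NULL);

const SIZE_T chunk = 100 * 1024 * 1024;   // ~100 MB window (a multiple of 64 KB)
ULARGE_INTEGER offset;
offset.QuadPart = 0;                      // advance by `chunk` on each pass

void* view = MapViewOfFile(hMap, FILE_MAP_READ,
                           offset.HighPart, offset.LowPart, chunk);
// ... read sequentially from `view` (the final window may need a smaller size) ...
UnmapViewOfFile(view);

CloseHandle(hMap);
CloseHandle(hFile);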
The limitation is that fseek uses a long parameter for the offset when you want to seek. If you don't reposition in the file, or the offset is always less than 2 GB, there is no problem.
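If you do need to reposition past 2 GB with stdio on MSVC, the 64-bit variants avoid that limitation. A sketch (the file name is a placeholder):

#include <cstdio>

FILE* f = NULL;
if (fopen_s(&f, "big.iso", "rb") == 0 && f != NULL)
{
    _fseeki64(f, 3LL * 1024 * 1024 * 1024, SEEK_SET);  // seek to the 3 GB mark
    __int64 pos = _ftelli64(f);                        // 64-bit tell
    // ... read from here ...
    fclose(f);
}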
ReadFile will handle files larger than 2GB, maybe you can rephrase your question so we can help you figure out the problems you are having with that.
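To rephrase it as code: a plain synchronous read loop (no FILE_FLAG_NO_BUFFERING, no OVERLAPPED) sidesteps the sector-alignment rules that make unbuffered I/O fail with error 87. A sketch, with the function name chosen for illustration:

#include <windows.h>
#include <vector>

bool ReadInChunks(const wchar_t* path)
{
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return false;

    std::vector<BYTE> buffer(100 * 1024 * 1024);   // ~100 MB per read
    DWORD bytesRead = 0;
    do {
        if (!ReadFile(h, &buffer[0], (DWORD)buffer.size(), &bytesRead, NULL)) {
            CloseHandle(h);
            return false;
        }
        // ... process buffer[0 .. bytesRead) ...
    } while (bytesRead != 0);                      // 0 bytes read means end of file

    CloseHandle(h);
    return true;
}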
I'm trying to read some ODBC details from a registry and for that I use RegQueryValueEx. The problem is when I compile the release version it simply cannot read any registry values.
The code is:
CString odbcFuns::getOpenedKeyRegValue(HKEY hKey, CString valName)
{
    CString retStr;
    char *strTmp = (char*)malloc(MAX_DSN_STR_LENGTH * sizeof(char));
    memset(strTmp, 0, MAX_DSN_STR_LENGTH);
    DWORD cbData;
    long rret = RegQueryValueEx(hKey, valName, NULL, NULL, (LPBYTE)strTmp, &cbData);
    if (rret != ERROR_SUCCESS)
    {
        free(strTmp);
        return CString("?");
    }
    strTmp[cbData] = '\0';
    retStr.Format(_T("%s"), strTmp);
    free(strTmp);
    return retStr;
}
I've found a workaround for this: I disabled optimization (/Od). But it seems strange that I needed to do that. Is there some other way? I use Visual Studio 2005. Maybe it's a bug in VS?
Almost forgot: the error code is 2 (as if the key wasn't found).
You need to initialize cbData - set it to be MAX_DSN_STR_LENGTH - 1 before calling RegQueryValueEx().
The problem is likely configuration-dependent because the variable is initialized by the compiler in one configuration and left uninitialized in another.
Also, you'll be much better off using std::vector for the buffer: less code, better exception safety, less error-prone.
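A sketch of the function with both fixes applied (an initialized cbData and a std::vector buffer; needs #include <vector>):

CString odbcFuns::getOpenedKeyRegValue(HKEY hKey, CString valName)
{
    std::vector<char> buf(MAX_DSN_STR_LENGTH, 0);
    DWORD cbData = MAX_DSN_STR_LENGTH - 1;   // in/out: buffer size in bytes
    long rret = RegQueryValueEx(hKey, valName, NULL, NULL,
                                (LPBYTE)&buf[0], &cbData);
    if (rret != ERROR_SUCCESS)
        return CString("?");
    buf[cbData] = '\0';                      // the stored value may lack a terminator
    CString retStr;
    retStr.Format(_T("%hs"), &buf[0]);       // %hs: narrow string in ANSI or Unicode builds
    return retStr;
}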