I am writing a C library that reads a file into memory. It skips the first 54 bytes of the file (header) and then reads the remainder as data. I use fseek to determine the length of the file, and then use fread to read in the file.
The loop runs once and then ends because the EOF is reached (no errors). At the end, bytesRead = 10624, ftell(stream) = 28726, and the buffer contains 28726 values. I expect fread to read 30,000 bytes and the file position to be 30054 when EOF is reached.
C is not my native language so I suspect I've got a dumb beginner mistake somewhere.
Code is as follows:
const size_t headerLen = 54;

FILE * stream;
errno_t ferrno = fopen_s( &stream, filename.c_str(), "r" );
if (ferrno != 0) {
    return -1;
}

fseek( stream, 0L, SEEK_END );
size_t bytesTotal = (size_t)(ftell( stream )) - headerLen; // number of data bytes to read
size_t bytesRead = 0;
BYTE* localBuffer = new BYTE[bytesTotal];

fseek(stream, headerLen, SEEK_SET);
while (!feof(stream) && !ferror(stream)) {
    size_t result = fread(localBuffer + bytesRead, sizeof(BYTE), bytesTotal - bytesRead, stream);
    bytesRead += result;
}
Depending on the reference you use, it's quite apparent that adding a "b" to the mode flag is the answer. Seeking nominations for the bonehead-badge. :-)
This reference talks about it in the second paragraph, second sentence (though not in their table).
MSDN doesn't discuss the binary flag until halfway down the page.
OpenGroup mentions the existence of the "b" flag, but states that it "shall have no effect".
Perhaps it's a binary-mode issue. Try opening the file with "r+b" as the mode.
EDIT: as noted in a comment "rb" is likely a better match to your original intent since "r+b" will open it for read/write and "rb" is read-only.
Also worth noting that simply including binmode.obj in your link command will make binary mode the default for all file opens.
A solution, based on the previous answers:
// open the file in binary mode, as described above
errno_t ferrno = fopen_s( &stream, filename.c_str(), "rb" );

size_t bytesRead = 0;
BYTE* localBuffer = new BYTE[bytesTotal];
fseek(stream, headerLen, SEEK_SET);
while (!feof(stream) && !ferror(stream)) {
    size_t result = fread(localBuffer + bytesRead, sizeof(BYTE), bytesTotal - bytesRead, stream);
    bytesRead += result;
}
Related
I did a sample project to read a file into a buffer.
When I use the tellg() function it gives me a larger value than the number of bytes the
read function actually reads from the file. I think there is a bug.
Here is my code:
EDIT:
void read_file (const char* name, int *size, char*& buffer)
{
    ifstream file;
    file.open(name, ios::in | ios::binary);
    *size = 0;
    if (file.is_open())
    {
        // get length of file
        file.seekg(0, std::ios_base::end);
        int length = *size = file.tellg();
        file.seekg(0, std::ios_base::beg);
        // allocate buffer in size of file
        buffer = new char[length];
        // read
        file.read(buffer, length);
        cout << file.gcount() << endl;
    }
    file.close();
}
main:
void main()
{
    int size = 0;
    char* buffer = NULL;
    read_file("File.txt", &size, buffer);
    for (int i = 0; i < size; i++)
        cout << buffer[i];
    cout << endl;
}
tellg does not report the size of the file, nor the offset
from the beginning in bytes. It reports a token value which can
later be used to seek to the same place, and nothing more.
(It's not even guaranteed that you can convert the type to an
integral type.)
At least according to the language specification: in practice,
on Unix systems, the value returned will be the offset in bytes
from the beginning of the file, and under Windows, it will be
the offset from the beginning of the file for files opened in
binary mode. For Windows (and most non-Unix systems), in text
mode, there is no direct and immediate mapping between what
tellg returns and the number of bytes you must read to get to
that position. Under Windows, all you can really count on is
that the value will be no less than the number of bytes you have
to read (and in most real cases, won't be too much greater,
although it can be up to two times more).
If it is important to know exactly how many bytes you can read,
the only way of reliably doing so is by reading. You should be
able to do this with something like:
#include <limits>
file.ignore( std::numeric_limits<std::streamsize>::max() );
std::streamsize length = file.gcount();
file.clear(); // Since ignore will have set eof.
file.seekg( 0, std::ios_base::beg );
Finally, two other remarks concerning your code:
First, the line:
*buffer = new char[length];
shouldn't compile: you have declared buffer to be a char*,
so *buffer has type char, and is not a pointer. Given what
you seem to be doing, you probably want to declare buffer as
a char**. But a much better solution would be to declare it
as a std::vector<char>& or a std::string&. (That way, you
don't have to return the size as well, and you won't leak memory
if there is an exception.)
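For illustration only, a minimal sketch of the std::vector<char>& variant, combining it with the ignore()/gcount() approach shown above (the function name is just an example, not from the original code):

#include <cstddef>
#include <fstream>
#include <limits>
#include <vector>

// Sketch: reads the whole file into `out`; returns false if it can't be opened.
bool read_file(const char* name, std::vector<char>& out)
{
    std::ifstream file(name, std::ios::in | std::ios::binary);
    if (!file.is_open())
        return false;
    // Find the readable length by actually reading (the ignore()/gcount() trick).
    file.ignore(std::numeric_limits<std::streamsize>::max());
    std::streamsize length = file.gcount();
    file.clear();                        // ignore() will have set eofbit
    file.seekg(0, std::ios_base::beg);
    out.resize(static_cast<std::size_t>(length));
    file.read(out.data(), length);
    out.resize(static_cast<std::size_t>(file.gcount()));
    return true;
}

This way the caller gets the size from the vector itself, and the memory is released automatically even if an exception is thrown.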
Second, the loop condition at the end is wrong. If you really
want to read one character at a time,
while ( file.get( buffer[i] ) ) {
++ i;
}
should do the trick. A better solution would probably be to
read blocks of data:
while ( file.read( buffer + i, N ) || file.gcount() != 0 ) {
i += file.gcount();
}
or even:
file.read( buffer, size );
size = file.gcount();
EDIT: I just noticed a third error: if you fail to open the
file, you don't tell the caller. At the very least, you should
set the size to 0 (but some sort of more precise error
handling is probably better).
In C++17 there is std::filesystem::file_size (available both as a free function and as a method of std::filesystem::directory_entry), which can streamline the whole task.
std::filesystem::file_size - cppreference.com
std::filesystem::directory_entry::file_size - cppreference.com
With those functions there is a chance the file never has to be opened at all, since the size can come from cached directory data (especially with the std::filesystem::directory_entry::file_size method).
They also require only read permission on the directory, not on the file itself, unlike tellg(), which needs the file to be opened.
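A minimal sketch of how that might look (C++17; the function name is just an example, and error handling is left to the filesystem_error exception file_size throws):

#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <vector>

std::vector<char> read_all(const std::filesystem::path& p)
{
    // file_size() needs only directory metadata, not read access to the file.
    const std::uintmax_t n = std::filesystem::file_size(p);
    std::vector<char> buffer(static_cast<std::size_t>(n));
    std::ifstream file(p, std::ios::in | std::ios::binary);
    file.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    // Keep only the bytes actually read.
    buffer.resize(static_cast<std::size_t>(file.gcount()));
    return buffer;
}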
void read_file (int *size, char* name,char* buffer)
*buffer = new char[length];
These lines look like a bug: you allocate a char array but store the result into buffer[0], a single char. Then you read the file into buffer, which is still uninitialized.
You need to pass buffer by pointer:
void read_file (int *size, char* name,char** buffer)
*buffer = new char[length];
Or by reference, which is the C++ way and is less error-prone:
void read_file (int *size, char* name,char*& buffer)
buffer = new char[length];
...
fseek(fptr, 0L, SEEK_END);
filesz = ftell(fptr);
will give the file size, provided the file was opened with fopen.
Using ifstream,
in.seekg(0,ifstream::end);
filesz = in.tellg();
does much the same.
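Pulling those two snippets into one runnable sketch, with the caveat from the answers above that the value is only a byte count when the file is opened in binary mode (the file name is just an example):

#include <cstdio>
#include <fstream>
#include <iostream>

int main()
{
    // stdio variant:
    FILE* fptr = std::fopen("data.bin", "rb");
    if (fptr) {
        std::fseek(fptr, 0L, SEEK_END);
        long filesz = std::ftell(fptr);
        std::fseek(fptr, 0L, SEEK_SET);   // rewind before reading
        std::cout << "stdio size: " << filesz << '\n';
        std::fclose(fptr);
    }

    // iostream variant:
    std::ifstream in("data.bin", std::ios::binary);
    if (in) {
        in.seekg(0, std::ifstream::end);
        std::streamoff size = in.tellg();
        in.seekg(0, std::ifstream::beg);
        std::cout << "iostream size: " << size << '\n';
    }
}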
I'm reading a binary uint32_t from a file that indicates the size of the next binary block; after that I read that block, but the read position "moves" incorrectly.
FILE* file = fopen("file.zip", "r");
long pointerA = ftell(file);
uint32_t streamSize = 0;
fread(reinterpret_cast<char*>(&streamSize), sizeof streamSize,1,file);
long pointerB = ftell(file);
char* zipData = new char[streamSize];
fread(zipData, sizeof(char),streamSize,file);
long pointerC = ftell(file);
fseek( file, pointerA + 4 + streamSize, SEEK_SET );
long pointerD = ftell(file);
qDebug()<<"streamSize"<<streamSize<<"Positions"<<pointerA<<pointerB<<pointerC<<pointerD;
PointerA is the original position, PointerB the position after reading the uint32_t, PointerC the position after reading all that binary data, and
PointerD is just a check of what I suppose should be the right behaviour.
Now let's see the debug:
streamSize 2653 Positions 151 156 4627 2808
Why has the stream read position moved to 4627 instead of 2808?
Thank you in advance for any tip!
Both users #alan-birtles and #remy-lebeau were right: I opened the file as text instead of binary, and that was the issue.
Unfortunately I cannot mark this as solved.
PS. For beginners, this means opening the file with "rb" instead of "r".
You need to open your file in binary mode. When a file is opened in text mode some characters are changed as you read them. For example, on Windows, a "\r\n" in the file is returned as a single '\n'. To open in binary mode add 'b' to your open mode, e.g:
FILE * file = fopen("file.txt", "rb");
Note you need to do the same when writing binary files otherwise the same transformations take place.
std::fstream also needs std::ios_base::binary to be passed to the constructor/open to avoid the same issue.
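For example, a minimal sketch of the iostream equivalent (the file names are just placeholders):

#include <fstream>

int main()
{
    // Both streams opened in binary mode so nothing is translated.
    std::ifstream in("file.zip", std::ios_base::in | std::ios_base::binary);
    std::ofstream out("copy.zip", std::ios_base::out | std::ios_base::binary);
}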
I would like to read data from a file with fread. However, I have run into a problem setting the NULL terminator. I thought the line (fileMem[fileSize] = 0;) should have solved it, but I still get rubbish at the end when I check the value of "fileMem". Would anyone help me find the problem?
I followed other posts about setting the NULL terminator, but it just does not work.
FILE *input = fopen(filePath, "r");
fseek(input, 0, SEEK_END);
auto fileSize = ftell(input);
fseek(input, 0, SEEK_SET);
char* fileMem = new char[fileSize+1];
fileMem[fileSize] = 0;// the NULL terminator problem should have been solved here
clearerr(input);
fread(fileMem, fileSize,1, input);
What is the problem with my code?
fread is reading fewer bytes than fileSize: in text mode the "\r\n" pairs in the file are collapsed to single '\n' characters, so the size ftell reported is larger than what fread can actually deliver. The bytes between the end of the data and your terminator at fileMem[fileSize] are never written, which is why you see garbage.
Also, because you pass fileSize as the record size and ask for one record, fread's return value only tells you whether that whole record was read (0 or 1). Swap the arguments (size 1, count fileSize) so the return value is the number of bytes actually read, and use that value to place the null terminator.
Since the text-mode translation is what causes the mismatch, I also suggest changing the file type to binary ("rb" rather than "r" in the call to fopen).
Assuming you are on Windows, I think the problem is that you are opening the file in text mode and using fread, which is meant for binary data. In text mode, what you read may not be exactly what is in the file. Windows text files have "\r\n" at the end of each line, but in text mode this two-character sequence is converted to a single character, '\n'. So the fileSize you computed will be too large, and your null terminator will end up beyond the data actually read, with uninitialized bytes in between.
To verify this, change your fread to be:
int nr = fread( fileMem, 1, fileSize, input);
Swapping the middle args will make fread return the number of bytes read. If you look at the value of nr, it will be smaller than fileSize because of the line-end translation.
This wouldn't be a problem on a *nix system since there is no translation there in text mode.
When you want to use fread to read the contents of a file, you must open the file in binary mode.
FILE *input = fopen(filePath, "rb");
^^
Otherwise, the size of the file you get by using
fseek(input, 0, SEEK_END);
auto fileSize = ftell(input);
will be greater than the number of characters that can be read by fread.
If you have a CR and a LF, they will count as two characters by the above method but fread will read only one character. Hence, fread will read fewer than fileSize characters. You can also change the fread line to:
// Swap the middle arguments.
// The first argument is supposed to be the size of each object.
// The second argument is supposed to be the number of objects to read.
auto n = fread(fileMem, 1, fileSize, input);
if ( n != fileSize )
{
// Surprise
}
fileMem[n] = '\0';
RESOLVED
I'm trying to make a simple file loader.
I aim to get the text from a shader file (plain text file) into a char* that I will compile later.
I've tried this function:
char* load_shader(char* pURL)
{
    FILE *shaderFile;
    char* pShader;

    // File opening
    fopen_s( &shaderFile, pURL, "r" );
    if ( shaderFile == NULL )
        return "FILE_ER";

    // File size
    fseek (shaderFile , 0 , SEEK_END);
    int lSize = ftell (shaderFile);
    rewind (shaderFile);

    // Allocating size to store the content
    pShader = (char*) malloc (sizeof(char) * lSize);
    if (pShader == NULL)
    {
        fputs ("Memory error", stderr);
        return "MEM_ER";
    }

    // copy the file into the buffer:
    int result = fread (pShader, sizeof(char), lSize, shaderFile);
    if (result != lSize)
    {
        // size of file 106/113
        cout << "size of file " << result << "/" << lSize << endl;
        fputs ("Reading error", stderr);
        return "READ_ER";
    }

    // Terminate
    fclose (shaderFile);
    return 0;
}
But as you can see in the code I have a strange size difference at the end of the process which makes my function crash.
I must say I'm quite a beginner in C so I might have missed some subtleties regarding memory allocation, types, pointers...
How can I solve this size issue?
*EDIT 1:
First, I shouldn't return 0 at the end but pShader; that seemed to be what crashed the program.
Then, I changed the type of result to size_t and added an end character to pShader, adding pShader[result] = '\0'; after its declaration, so I can display it correctly.
Finally, as #JamesKanze suggested, I turned fopen_s into fopen, as the former was not useful in my case.
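Putting those edits together, a rough sketch of what the corrected function might look like (still text mode as in the question; error handling is kept minimal and NULL is returned instead of string literals, which is an assumption on my part):

#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Sketch only: returns a malloc'd, null-terminated buffer, or NULL on error.
char* load_shader(const char* pURL)
{
    FILE* shaderFile = fopen(pURL, "r");
    if (shaderFile == NULL)
        return NULL;

    fseek(shaderFile, 0, SEEK_END);
    long lSize = ftell(shaderFile);
    rewind(shaderFile);

    char* pShader = (char*) malloc(lSize + 1);   // +1 for the terminator
    if (pShader == NULL) {
        fclose(shaderFile);
        return NULL;
    }

    // In text mode fewer than lSize bytes may come back (CRLF -> '\n'),
    // so place the terminator after the count fread actually reports.
    size_t result = fread(pShader, 1, (size_t) lSize, shaderFile);
    pShader[result] = '\0';

    fclose(shaderFile);
    return pShader;
}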
First, for this sort of raw access, you're probably better off
using the system level functions: CreateFile or open,
ReadFile or read and CloseHandle or close, with
GetFileSize or stat to get the size. Using FILE* or
std::filebuf will only introduce an additional level of
buffering and processing, for no gain in your case.
As to what you are seeing: there is no guarantee that an ftell
will return anything exploitable as a numeric value; it could
very well be just a magic cookie. On most current systems, it
is a byte offset into the physical file, but on any non-Unix
system, the offset into the physical file will not map directly
to the logical file you are reading unless you open the file in
binary mode. If you use "rb" to open the file, you'll
probably see the same values. (Theoretically, you could get
extra 0's at the end of the file, but practically, the OS's
where that happened are either extinct, or only used on legacy
mainframes.)
EDIT:
Since the answer stating this has been deleted: you should loop
on the fread until it returns 0 (setting errno to 0 before
each call, and checking it after the return to see whether the
function returned because of an error or because it reached the
end of file). Having said this: if you're on one of the usual
Windows or Unix systems, and the file is local to the machine,
and not too big, fread will read it all in one go. The
difference in size you are seeing (given the numerical values
you posted) is almost certainly due to the fact that the two
byte Windows line endings are being mapped to a single '\n'
character. To avoid this, you must open in binary mode;
alternatively, if you really are dealing with text (and want
this mapping), you can just ignore the extra bytes in your
buffer, setting the '\0' terminator after the last byte
actually read.
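A sketch of the loop this answer describes, using ferror() to distinguish an error from end-of-file (the function name is just an example; buffer and bufSize would come from the ftell-based allocation above):

#include <cerrno>
#include <cstddef>
#include <cstdio>

// Read up to bufSize bytes from an already-opened stream, looping on
// fread until it returns 0, then check why it stopped.
size_t read_all(FILE* stream, char* buffer, size_t bufSize)
{
    size_t total = 0;
    while (total < bufSize) {
        errno = 0;                    // so a later perror() is meaningful
        size_t n = fread(buffer + total, 1, bufSize - total, stream);
        if (n == 0)
            break;                    // end of file or error
        total += n;
    }
    if (ferror(stream))
        perror("fread");              // an error, not end-of-file
    return total;
}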
Just this:
int size = getFileSize(path); //Listed below
ifstream fs(path, ios::in);
ofstream os(path2, ios::out);
//Check - both streams are valid
char buff[CHUNK_SIZE]; //512
while (size > CHUNK_SIZE)
{
    fs >> buff;
    os << buff;
    size -= CHUNK_SIZE;
}
char* dataLast = new char[size];
fs >> dataLast;
os << dataLast;
fs.close();
os.close();

//Found on SO, works fine
int getFileSize(string path)
{
    FILE *pFile = NULL;
    if (fopen_s( &pFile, path.c_str(), "rb" ))
    {
        return 0;
    }
    fseek( pFile, 0, SEEK_END );
    int Size = ftell( pFile );
    fclose( pFile );
    return Size;
}
The file at path2 is corrupted and less than 1 KB (the initial file is 30 KB).
I don't need advice on how to copy a file; I'm curious about what is wrong with this example.
First an important warning: Never (as in really never) use the formatted input operator for char* without setting the width()! You open yourself up to a buffer overrun. This is basically the C++ version of writing gets() which was bad enough to be removed (not just deprecated) from the C standard! If you insist in using formatted input with char* (normally you are much better off using std::string), set the width, e.g.:
#include <iomanip>   // std::setw lives here

char buffer[512];
in >> std::setw(sizeof(buffer)) >> buffer;
OK, with this out of the way: it seems you actually want to change two important things:
You probably don't want to use formatted input, i.e., operator>>(): the formatted input operators start off by skipping whitespace. When reading into a char* they also stop upon reaching whitespace (or, when the width() is non-zero, after reading width() - 1 characters, leaving space to store a terminating zero; note that the set width() is reset to 0 after each of these reads). That is, you probably want to use unformatted input, e.g., in.read(buffer, sizeof(buffer)), which sets in.gcount() to the number of characters actually read, which may be less than the size parameter, e.g., at the end of the stream.
You probably should open the file in std::ios_base::binary mode. Although it doesn't matter on some systems (e.g., POSIX systems), on others reading in text mode merges a line-end sequence, e.g. \r\n on Windows, into the single line-end character \n. Likewise, when writing a \n in text mode, it will be replaced by a line-end sequence on some systems, i.e., you probably also want to open the output stream in binary mode.
The input and output operators, when used with strings (which is what buff is from the library's point of view), read space-delimited words only.
If you want to read chunks, then use std::istream::read, and use std::istream::gcount to get the number of bytes actually read. Then write with std::ostream::write.
And if the data in the file is binary, you should use the binary open mode.
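Putting those points together, a rough sketch of the copy loop using unformatted read()/write() and binary mode (the file names are just placeholders):

#include <cstddef>
#include <fstream>

const std::size_t CHUNK_SIZE = 512;

int main()
{
    // Binary mode on both streams so nothing is translated on the way through.
    std::ifstream fs("source.bin", std::ios::in | std::ios::binary);
    std::ofstream os("copy.bin", std::ios::out | std::ios::binary);

    char buff[CHUNK_SIZE];
    while (fs.read(buff, sizeof(buff)) || fs.gcount() > 0) {
        // gcount() is the number of bytes actually read; on the last
        // chunk it may be less than CHUNK_SIZE.
        os.write(buff, fs.gcount());
    }
}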