C++ ifstream::read() - corrupts ifstream get pointer? - c++

Does anyone here know of a way a C++ ifstream's get pointer might get corrupted after a read() call? I'm seeing some truly bizarre behaviour that I'm at a loss to explain. For example (illustrative code, rather than what I'm actually running):
int main()
{
// datafile.bin is a 2MB binary file...
std::ifstream ifs( "datafile.bin", std::ios::binary );
ifs.exceptions ( std::ifstream::eofbit | std::ifstream::failbit | std::ifstream::badbit );
int data[100];
std::istream::pos_type current_pos = ifs.tellg();
// current_pos = 0, as you'd expect...
ifs.read( reinterpret_cast<char*>(data), 100 * sizeof(int) );
// throws no exception, so no error bits set...
std::streamsize bytes_read = ifs.gcount();
// gives 400, as you'd expect...
current_pos = ifs.tellg();
// current_pos = 0x1e1a or something similarly daft
return 0;
}
My example shows an array read, but it's happened even when reading single values of built-in types; the get pointer before the read is correct, the gcount() call reports the correct number of bytes read, but afterwards the get pointer is completely screwy. This doesn't happen with every read() call - sometimes I get through bunches of them before one stuffs up. What could possibly be monkeying with the get pointer? Am I doing something profoundly stupid?
Any and all help greatly appreciated...
Simon

pos_type isn't an integral type but a class, so I wouldn't try to interpret its internal representation. It is convertible to an integral type (std::streamoff), but if you look at it in the debugger, you'll see the internal representation.
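To check how far the get position really moved, one option is to subtract two pos_type values, which yields a plain std::streamoff, instead of reading the object's fields in the debugger. This is only a minimal sketch: the helper name bytes_advanced is made up for illustration and is not part of the original code.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <istream>

// Hypothetical helper: reads `count` bytes into `dest` and returns how far the
// get position moved as an integer offset, or -1 if tellg() reported failure.
inline std::streamoff bytes_advanced(std::istream& in, char* dest, std::streamsize count)
{
    assert(dest != nullptr && count >= 0);
    const std::istream::pos_type failed(-1);

    const std::istream::pos_type before = in.tellg();
    in.read(dest, count);
    const std::istream::pos_type after = in.tellg();

    if (before == failed || after == failed) {
        return -1;  // stream was (or ended up) in a failed state
    }
    return after - before;  // pos_type - pos_type gives a std::streamoff
}
```

With a file opened in binary mode, the returned offset should match gcount() for a successful read; if it does not, that points at the stream rather than at the debugger display.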

I tried running your code in VS 2008 on a Vista machine but did not get any error. I modified your code slightly to print to the console.
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
// datafile.bin is a 2MB binary file...
std::ifstream ifs( "H_Line.bmp", ios::binary );
ifs.exceptions ( ifstream::eofbit | ifstream::failbit | ifstream::badbit );
int data[100];
std::istream::pos_type current_pos = ifs.tellg();
cout<<current_pos<<endl; // current_pos = 0, as mentioned
ifs.read( reinterpret_cast<char*>(data), 100 * sizeof(int) );
// throws no exception, so no error bits set...
std::streamsize bytes_read = ifs.gcount();
cout<<bytes_read<<endl; // gives 400, as you have mentioned
current_pos = ifs.tellg();
cout<<current_pos<<endl; // FOR ME IT IS GIVING 400
return 0;
}
I tested this on a BMP image file larger than 20 MB.
Could you please tell us which machine and compiler you are using?
Thanks

Related

Weird seek behaviour in C and C++ [duplicate]

I wrote a sample project to read a file into a buffer.
When I use the tellg() function, it gives me a larger value than the
number of bytes the read function actually reads from the file. I think there is a bug.
Here is my code:
EDIT:
void read_file (const char* name, int *size , char*& buffer)
{
ifstream file;
file.open(name,ios::in|ios::binary);
*size = 0;
if (file.is_open())
{
// get length of file
file.seekg(0,std::ios_base::end);
int length = *size = file.tellg();
file.seekg(0,std::ios_base::beg);
// allocate buffer in size of file
buffer = new char[length];
// read
file.read(buffer,length);
cout << file.gcount() << endl;
}
file.close();
}
main:
int main()
{
int size = 0;
char* buffer = NULL;
read_file("File.txt",&size,buffer);
for (int i = 0; i < size; i++)
cout << buffer[i];
cout << endl;
}
tellg does not report the size of the file, nor the offset
from the beginning in bytes. It reports a token value which can
later be used to seek to the same place, and nothing more.
(It's not even guaranteed that you can convert the type to an
integral type.)
At least according to the language specification: in practice,
on Unix systems, the value returned will be the offset in bytes
from the beginning of the file, and under Windows, it will be
the offset from the beginning of the file for files opened in
binary mode. For Windows (and most non-Unix systems), in text
mode, there is no direct and immediate mapping between what
tellg returns and the number of bytes you must read to get to
that position. Under Windows, all you can really count on is
that the value will be no less than the number of bytes you have
to read (and in most real cases, won't be too much greater,
although it can be up to two times more).
If it is important to know exactly how many bytes you can read,
the only way of reliably doing so is by reading. You should be
able to do this with something like:
#include <limits>
file.ignore( std::numeric_limits<std::streamsize>::max() );
std::streamsize length = file.gcount();
file.clear(); // Since ignore will have set eof.
file.seekg( 0, std::ios_base::beg );
Finally, two other remarks concerning your code:
First, the line:
*buffer = new char[length];
shouldn't compile: you have declared buffer to be a char*,
so *buffer has type char, and is not a pointer. Given what
you seem to be doing, you probably want to declare buffer as
a char**. But a much better solution would be to declare it
as a std::vector<char>& or a std::string&. (That way, you
don't have to return the size as well, and you won't leak memory
if there is an exception.)
Second, the loop condition at the end is wrong. If you really
want to read one character at a time,
while ( file.get( buffer[i] ) ) {
++ i;
}
should do the trick. A better solution would probably be to
read blocks of data:
while ( file.read( buffer + i, N ) || file.gcount() != 0 ) {
i += file.gcount();
}
or even:
file.read( buffer, size );
size = file.gcount();
EDIT: I just noticed a third error: if you fail to open the
file, you don't tell the caller. At the very least, you should
set the size to 0 (but some sort of more precise error
handling is probably better).
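The remarks above (use a std::vector<char>&, read in blocks, rely on gcount(), report a failed open) can be combined roughly as follows. This is a sketch under assumptions rather than the answerer's code: the 4096-byte block size and the bool return value are illustrative choices.

```cpp
#include <cassert>
#include <fstream>
#include <vector>

// Reads the whole file into `out`; afterwards the size is simply out.size().
// Returns false if the file cannot be opened or a read error occurs.
inline bool read_file(const char* name, std::vector<char>& out)
{
    assert(name != nullptr);
    out.clear();

    std::ifstream file(name, std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        return false;  // tell the caller instead of failing silently
    }

    char block[4096];  // block size is an arbitrary choice for this sketch
    // read() reports failure on the final short block, so a non-zero
    // gcount() is accepted as well.
    while (file.read(block, sizeof block) || file.gcount() != 0) {
        out.insert(out.end(), block, block + file.gcount());
    }
    return file.eof();  // true if the loop stopped at end-of-file
}
```

Because the vector owns the memory, nothing leaks if an exception is thrown, and the count of bytes actually read never depends on what tellg() returned.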
In C++17 there are the std::filesystem::file_size function and the std::filesystem::directory_entry::file_size member function, which can streamline the whole task.
std::filesystem::file_size - cppreference.com
std::filesystem::directory_entry::file_size - cppreference.com
With these, the file does not have to be opened at all, and the size may come from cached data (especially with std::filesystem::directory_entry::file_size).
They also need only directory read permission, not read permission on the file itself (which opening the file and calling tellg() requires).
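A minimal sketch of the C++17 approach, using the std::error_code overload so that a missing file does not throw. The helper name query_file_size and its bool return convention are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>

// Hypothetical helper: stores the file size in `size` and returns true on
// success; returns false and leaves `size` untouched if the query fails.
inline bool query_file_size(const std::filesystem::path& p, std::uintmax_t& size)
{
    assert(!p.empty());
    std::error_code ec;
    const std::uintmax_t n = std::filesystem::file_size(p, ec);
    if (ec) {
        return false;  // e.g. the file does not exist or is not a regular file
    }
    size = n;
    return true;
}
```

The reported size is what the operating system currently records for the file, so data still sitting in an unflushed output stream buffer is not counted.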
void read_file (int *size, char* name,char* buffer)
*buffer = new char[length];
These lines do look like a bug: you create a char array and try to store its address in buffer[0], which is a single char. Then you read the file into buffer, which is still uninitialized.
You need to pass buffer by pointer:
void read_file (int *size, char* name,char** buffer)
*buffer = new char[length];
Or by reference, which is the C++ way and less error-prone:
void read_file (int *size, char* name,char*& buffer)
buffer = new char[length];
...
fseek(fptr, 0L, SEEK_END);
filesz = ftell(fptr);
will give the file size if the file was opened through fopen.
Using ifstream,
in.seekg(0,ifstream::end);
filesz = in.tellg();
would do the same.

Is readsome() appropriate to read binary data on Windows?

Context: I am trying to read the content of a PNG picture in C++ to send it later to my Android app. To do so, I open the file in binary mode, read its content in chunks of 512 bytes, then send the data to the app. I'm on Windows.
Issue: I use an ifstream instance and the readsome() function as shown below, and it returns 512, which is what I expected since I asked to read 512 bytes. However, it seems that I am far from really having 512 bytes in my buffer, which confuses me. When I debug my program step by step, the number of chars in the buffer seems random, but is never 512 as expected.
Code:
int currentByteRead = 0;
std::ifstream fl(imgPath.toStdString().c_str(), ios_base::binary);
fl.seekg( 0, std::ios::end );
int length = fl.tellg();
char *imgBytes = new char[512];
fl.seekg(0, std::ios::beg);
// Send the img content by blocks of 512 bytes
while(currentByteRead + 512 < length) {
int nbRead = fl.readsome(imgBytes, 512); // nbRead is always set to 512 here
if(fl.fail()) {
qDebug() << "Error when reading file content";
}
sendMessage(...);
currentByteRead += 512;
imgBytes = new char[512];
}
// Send the remaining data
int nbRemainingBytes = length - currentByteRead;
fl.readsome(imgBytes, nbRemainingBytes);
sendMessage(...);
fl.close();
currentByteRead += nbRemainingBytes;
The length I get at the beginning is the correct one, and it seems there is no error. But it is as if not all the data was copied into the buffer during the readsome() call.
Questions: Did I misunderstand something about the readsome() function? Is there something related to Windows causing this behaviour? Is there a more appropriate way to proceed?
I finally found a way to do what I wanted, and as suggested by David Herring I will post my answer here.
My thoughts about the issue: If I use a std::ifstream::pos_type variable instead of an int, the correct number of bytes is read and put in the buffer. This was not the case when using an int, as if the chars were only written to the buffer up to a given (random?) point. I am not sure I understand why this behavior occurred. My guess was that I had issues with '\n' characters, but the randomness of the final content of the buffer is still unclear to me.
Correction: This is the working code I finally arrived at nonetheless. Starting from this, I was able to do what I had in mind.
std::ifstream ifs(imgPath.toStdString().c_str(), std::ios::binary|std::ios::ate);
std::ifstream::pos_type pos = ifs.tellg();
int length = ifs.tellg();
std::vector<char> result(pos);
ifs.seekg(0, std::ios::beg);
ifs.read(result.data(), pos);
ifs.close();
I hope this will help others. Thank you David for your suggestions.
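For the original goal of sending the file in 512-byte blocks, a loop built on read() and gcount() is one alternative to readsome(), which only extracts characters that are immediately available and whose behaviour is implementation-dependent. The sketch below rests on assumptions: for_each_chunk is a made-up name, and the sink callback stands in for the sendMessage(...) call, whose real signature is not shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <vector>

// Hypothetical helper: reads `in` in blocks of at most `chunk_size` bytes and
// hands each block to `sink(const char* data, std::size_t size)`.
// Returns the total number of bytes delivered.
template <typename Sink>
std::size_t for_each_chunk(std::istream& in, std::size_t chunk_size, Sink sink)
{
    assert(chunk_size > 0);
    std::vector<char> buffer(chunk_size);  // one buffer, reused for every block
    std::size_t total = 0;

    // The last block is usually short: read() then reports failure, but
    // gcount() still says how many bytes were stored.
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size())) ||
           in.gcount() > 0) {
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        sink(buffer.data(), got);
        total += got;
    }
    return total;
}
```

Reusing a single buffer also avoids the leak caused by calling new char[512] on every iteration, and the loop needs no up-front length from tellg().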

C++ Pointer array and buffering

I have a problem while reading from a file.
The code below ends with a runtime error after about 100 loop iterations. After tracing, I found that
mybuff is not reinitialized by (mybuff = new char [1024];), because in the debugger I still see the previous message at the end of it.
The same issue occurs when I try to fill sendbuff.
The error, "Access violation reading location", happens at this step ( sprintf(sendbuff,mybuff ))
Any idea how to solve this issue?
char sendbuff[1024];
char * mybuff = new char[];
While(....){
mybuff = new char [1024];
myfile.read(mybuff ,bufsize);
sprintf(sendbuff,mybuff );
ibytessent=0;
tmpCount = strlen(sendbuff);
ibufferlen = strlen(sendbuff);
ibytessent = send(s,sendbuff,ibufferlen,0);
delete [] mybuff ;
}
I think the way you call ifstream::read() is wrong. read() does not add a null character at the end for you, and you need to check the eofbit and failbit.
Quote from the manual:
The number of characters successfully read and stored by this function
can be accessed by calling member gcount.
I also think the run-time error is caused by this behaviour of read(): without a terminating null, sprintf and strlen run past the end of the buffer. Also, I don't think it's necessary to allocate a new 1024-byte buffer on each iteration; why not reuse the buffer?
By the way, I tried to reproduce your problem. I am not sure whether the code below is the same as yours, but I don't get any run-time error.
#include <cstdio>
#include <cstring>
#include <fstream>
using namespace std;
int bufsize = 1024;
int main(){
char sendbuff[1024];
char * mybuff = new char[];
std::ifstream ifs;
ifs.open ("test.txt", std::ifstream::in);
while(1){
mybuff = new char [1024];
ifs.read(mybuff ,bufsize);
sprintf(sendbuff,mybuff );
int ibytessent=0;
int tmpCount = strlen(sendbuff);
int ibufferlen = strlen(sendbuff);
//ibytessent = send(s,sendbuff,ibufferlen,0);
delete [] mybuff ;
}
return 0;
}
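Since read() does not terminate the data and the data may itself contain '\0' or '%' characters, one way to avoid sprintf and strlen altogether is to carry the length reported by gcount() along with the bytes. A minimal sketch, assuming a made-up helper name read_block and returning the block as a std::string so that data and length stay together:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <string>

// Hypothetical helper: returns the next block of at most `max_bytes` bytes.
// The string's size() is exactly the number of bytes read (gcount()), so no
// terminating null character is needed; an empty string means nothing was read.
inline std::string read_block(std::istream& in, std::size_t max_bytes)
{
    assert(max_bytes > 0);
    std::string block(max_bytes, '\0');
    in.read(&block[0], static_cast<std::streamsize>(max_bytes));
    block.resize(static_cast<std::size_t>(in.gcount()));
    return block;
}
```

The loop body could then pass block.data() and block.size() to send() directly, instead of copying through sendbuff and measuring it with strlen.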

Why would a stat() call return an incorrect value of zero (0) for file size?

I'm running a Windows C++ multithreaded app in which one instance/thread of the server class appends to the file. Other threads run client instances, which only load the file on each client's startup.
When I get to within 2 KB of the end of loading the file, I check whether the file has changed in size, so I know to update the total number of bytes to read. Once in a while the file size I get back is erroneously reported as zero (0). I am using the stat call below for this. When zero is returned, as a sanity check I then call getFileSizeWithTellg() to see what it returns, and it returns the expected non-zero value: a value that is the same as or greater than the initial value.
I realize that the cast to unsigned int could be problematic, but the files are never larger than 5 MB.
What could be causing the stat() call to return a zero value when the ..Tellg call doesn't?
Thanks for any insight into this.
//
// snippets from methods in different classes
//
// from client class
ifstream fileSeqIn;
fileSeqIn.open(fName.c_str(), ios::in | ios::binary |ios::ate);
// to get initial size
size = fileSeqIn.tellg();
fileSeqIn.seekg(0, ios::beg);
// later to determine if the file has grown
struct stat filestatus;
unsigned int size;
if (stat(fName, &filestatus ) == 0) {
size = (unsigned int)filestatus.st_size;
}
//
unsigned int getFileSizeWithTellg(char *fname)
{
// get length of file
is.open (fname, ios::binary );
is.seekg (0, ios::end);
length = is.tellg();
is.close();
return(length);
}
//-----------------------------------------------------------------------------
// from server class
ofstream fileSeqOut;
fileSeqOut.open(fName.c_str(), ios::app | ios::out |ios::ate |ios::binary);
One significant difference: stat returns the system's view of the size of the file; tellg returns a value dependent on the internal state of the stream. File-based streams are buffered, and the data may not be passed on to the system until you flush or close the file. Do you get the same difference if you precede the call to stat with a flush of the stream?
If what Larry Osterman said is true, using GetFileInformationByHandle might solve the problem.
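The flush suggestion can be tried with something like the sketch below. Both helper names are made up for illustration. size_via_stat also returns -1 when stat() itself fails, which makes a failed call distinguishable from a genuinely empty file; whether that is what happens in the original program is only a guess.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <sys/stat.h>

// Hypothetical helper: writes `data` and flushes, so the bytes are handed to
// the operating system instead of waiting in the stream's buffer.
inline bool append_flushed(std::ofstream& out, const std::string& data)
{
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    out.flush();
    return out.good();
}

// Hypothetical helper: file size as seen by stat(), or -1 if stat() fails.
inline long long size_via_stat(const char* path)
{
    assert(path != nullptr);
    struct stat st;
    if (stat(path, &st) != 0) {
        return -1;  // the call failed; the size is unknown, not zero
    }
    return static_cast<long long>(st.st_size);
}
```

If the writer flushes after each append and the reader still sees a stale size on Windows, the cause is more likely in how the system reports sizes for open files than in the stream buffering.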

Reading data from binary file

I am trying to read data from binary file, and having issues. I have reduced it down to the most simple case here, and it still won't work. I am new to c++ so I may be doing something silly but, if anyone could advise I would be very grateful.
Code:
int main(int argc,char *argv[]) {
ifstream myfile;
vector<bool> encoded2;
cout << encoded2 << "\n"<< "\n" ;
myfile.open(argv[2], ios::in | ios::binary |ios::ate );
myfile.seekg(0,ios::beg);
myfile.read((char*)&encoded2, 1 );
myfile.close();
cout << encoded2 << "\n"<< "\n" ;
}
Output
00000000
000000000000000000000000000011110000000000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Compression_Program(58221) malloc: * error for object 0x10012d: Non-aligned pointer being freed
* set a breakpoint in malloc_error_break to debug
Thanks in advance.
Do not cast a vector<bool>* to a char*. It does not do anything predictable.
You are reading directly into encoded2: myfile.read((char*)&encoded2, 1 );. This is wrong. You can read a byte and then append it to encoded2 as a bool:
char x;
myfile.read( &x, 1 );
encoded2.push_back( x != 0 );
Two mistakes here:
you assume the address of a vector is the address of the first element
you rely on vector<bool>
Casting a vector into a char * is not really a good thing, because a vector is an object and stores some state along with its elements.
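Here you are probably overwriting the state of the vector, thus the destructor fails.
Maybe you would like to read into the elements of the vector (which, for most element types, are guaranteed to be stored contiguously in memory). But another trap is that vector<bool> is a specialization that may pack its elements into bits, so it has no addressable bool elements.
Therefore you should read into a std::vector<char> that has been resized (reserve alone does not create elements), for example with myfile.read(&bytes[0], bytes.size()) where bytes is such a vector, and convert the bytes to bool values afterwards.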
But probably you want to do something else and we need to know what the purpose is here.
You're overwriting a std::vector, which you shouldn't do. A std::vector object typically holds a pointer to a data array plus bookkeeping for its size and capacity; if you overwrite these with practically random bits, data corruption will occur.
Since you're only reading a single byte, this will suffice:
char c;
myfile.read(&c, 1);
The C++ language does not provide an efficient I/O method for reading bits as bits. You have to read bits in groups. Also, you have to worry about endianness when reading in the bits.
I suggest the old fashioned method of allocating a buffer, reading into the buffer then operating on the buffer.
Allocating a buffer
const unsigned int BUFFER_SIZE = 1024 * 1024; // Let the compiler calculate it.
//...
unsigned char * const buffer = new unsigned char [BUFFER_SIZE]; // The pointer is constant.
Reading in the data
unsigned int bytes_read = 0;
ifstream data_file("myfile.bin", ios::binary); // Open file for input without translations.
data_file.read(reinterpret_cast<char*>(buffer), BUFFER_SIZE); // Read data into the buffer (read() expects a char*).
bytes_read = data_file.gcount(); // Get actual count of bytes read.
Reminders:
delete[] the buffer when you are finished with it.
Close the file when you are finished with it.
myfile.read((char*) &encoded2[0], sizeof(int)* COUNT);
or you can use push_back();
int tmp;
for(int i = 0; i < COUNT; i++) {
myfile.read((char*) &tmp, sizeof(tmp));
encoded2.push_back(tmp);
}
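If the goal really is a vector<bool> of the file's bits, the advice in the answers above (read whole bytes, then convert) could look roughly like this. It is a sketch under assumptions: read_bits is a made-up name, and the most-significant-bit-first order is an arbitrary choice that has to match however the file was written.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <vector>

// Hypothetical helper: reads up to `byte_count` bytes and unpacks them into
// bits, most significant bit of each byte first (an assumed convention).
inline std::vector<bool> read_bits(std::istream& in, std::size_t byte_count)
{
    assert(byte_count > 0);
    std::vector<char> bytes(byte_count);
    in.read(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    bytes.resize(static_cast<std::size_t>(in.gcount()));  // keep only what was read

    std::vector<bool> bits;
    bits.reserve(bytes.size() * 8);
    for (char c : bytes) {
        const unsigned char u = static_cast<unsigned char>(c);
        for (int bit = 7; bit >= 0; --bit) {
            bits.push_back(((u >> bit) & 1u) != 0);
        }
    }
    return bits;
}
```

The stream only ever writes into a plain char array, so the vector<bool> object itself is never overwritten and its destructor stays safe.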