input from a file using get line - c++

I am trying to read from a file whose entries are separated by newline characters. I am using this code:
fstream input("wordfile.dat", ios::in);
char b[10];
while (!input.eof())
{
input.getline(b, 10);
cout << b << endl;
}
If I change the loop statement from while(!input.eof()) to while(input) , the program will output a blank line before the loop ends. But now it won't. The question is, in both statements the while condition must first input a line and by inputting it, it will know if it has reached end of file or if there is still more information. So input.eof() must act just like the other statement and output a blank line. First I thought it was a mistake, but I wondered why it was acting correctly. What is the difference between these two conditions?

Looking at operator bool we see ...
Notice that this function does not return the same as member good [...]
... that if (stream) is not the same as if (stream.good()), but also learn that it ...
Returns whether an error flag is set (either failbit or badbit).
So it's basically the same as not stream.fail() (which is true if either failbit or badbit is set).
This also explains the different behavior between while (stream) and while (not stream.eof()):
When the input file does not end with a newline, stream.getline(buffer, size) encounters the end of file before reaching the delimiting newline character (or the buffer limit) and thus sets the eofbit. Testing the stream with its operator bool is then still true (since neither failbit nor badbit is set); only the next getline call sets the failbit, since it extracts no characters.
But when testing with not stream.eof(), the eofbit alone will end the loop.
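The difference can be reproduced with a minimal sketch that counts how often each loop body runs. Here std::istringstream stands in for the file, and the function names and the 64-byte buffer are assumptions made for illustration, not part of the original code.

```cpp
#include <sstream>
#include <string>

// Counts iterations of `while (!in.eof())` around getline into a char buffer.
// Assumes every line is shorter than the buffer: a longer line would set
// failbit without eofbit, and this loop would then never terminate.
int iterations_with_eof_test(const std::string& text) {
    std::istringstream in(text);
    char buffer[64];
    int iterations = 0;
    while (!in.eof()) {
        in.getline(buffer, sizeof buffer);
        ++iterations;
    }
    return iterations;
}

// Same loop body, but the condition is the stream's operator bool,
// which only looks at failbit and badbit.
int iterations_with_bool_test(const std::string& text) {
    std::istringstream in(text);
    char buffer[64];
    int iterations = 0;
    while (in) {
        in.getline(buffer, sizeof buffer);
        ++iterations;
    }
    return iterations;
}
```

For "ab\ncd" (no trailing newline) the eof() version runs twice, while the operator bool version runs a third time with a failed read, which is the extra blank line. For "ab\ncd\n" both versions run three times.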
When you test the stream itself, as in
if (stream) // ...
you are checking that the stream has neither failed nor gone bad (failbit and badbit are both clear); eofbit alone does not make this test false.
Conversely, when the stream is not at the end of file, it could still have failed or be in a bad state.
See the table here.
When reading (or writing) a stream, test for good unless you have a specific reason not to do so.
As a side note, this happens when you do input like the following, since getline returns a reference to the instance it's called on:
while (stream.getline(buffer, size)) {
// ..
}

Related

How Can I Detect That a Binary File Has Been Completely Consumed?

If I do this:
ofstream output("foo.txt");
output << 13;
output.close();
ifstream input("foo.txt");
int dummy;
input >> dummy;
cout << input.good() << endl;
I'll get the result: "0"
However if I do this:
ofstream output("foo.txt", ios_base::binary);
auto dummy = 13;
output.write(reinterpret_cast<const char*>(&dummy), sizeof(dummy));
output.close();
ifstream input("foo.txt", ios_base::binary);
input.read(reinterpret_cast<char*>(&dummy), sizeof(dummy));
cout << input.good() << endl;
I'll get the result: "1"
This is frustrating to me. Do I have to resort to inspecting the ifstream's buffer to determine whether it has been entirely consumed?
Regarding
How Can I Detect That a Binary File Has Been Completely Consumed?
A slightly inefficient but easy to understand way is to measure the size of the file:
ifstream input("foo.txt", ios_base::binary);
input.seekg(0, ios_base::end); // go to end of the file
auto filesize = input.tellg(); // current position is the size of the file
input.seekg(0, ios_base::beg); // go back to the beginning of the file
Then check current position whenever you want:
if (input.tellg() == filesize)
cout << "The file was consumed";
else
cout << "Some stuff left in the file";
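As a rough sketch of the size-based check above, the following helper reads a given number of bytes and then compares the current position with the measured size. It uses std::istringstream in place of a binary ifstream so that it runs without a file; the name consumed_after_reading and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns true if, after reading `count` bytes, the read position equals
// the total size measured with seekg/tellg.
bool consumed_after_reading(const std::string& bytes, std::size_t count) {
    std::istringstream input(bytes);
    input.seekg(0, std::ios_base::end);            // go to the end
    const std::streampos filesize = input.tellg(); // position == size
    input.seekg(0, std::ios_base::beg);            // back to the beginning

    std::string chunk(count, '\0');
    input.read(&chunk[0], static_cast<std::streamsize>(count));

    // Note: once a read has failed, tellg() returns -1, so an over-long
    // read makes this comparison false even though nothing is left.
    return input.tellg() == filesize;
}
```

With the 4-byte input "abcd", reading 4 bytes reports the data as consumed and reading 2 bytes does not. Asking for 8 bytes sets failbit, so tellg() yields -1 and the check reports false.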
This way has some disadvantages:
Not efficient - goes back and forth in the file
Doesn't work with special files (e.g. pipes)
Doesn't work if the file is changed (e.g. you open your file in read-write mode)
Only works for binary files (seems your case, so OK), not text files
So better just use the regular way people do it, that is, try to read and bail if it fails:
if (input.read(reinterpret_cast<char*>(&dummy), sizeof(dummy)))
cout << "I have read the stuff, will work on it now";
else
cout << "No stuff in file";
Or (in a loop)
while (input.read(reinterpret_cast<char*>(&dummy), sizeof(dummy)))
{
cout << "Working on your stuff now...";
}
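For reference, here is a small self-contained version of that read loop. It is only a sketch: pack is a hypothetical helper that produces the raw bytes, and std::istringstream stands in for the binary ifstream.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: serialises ints to their raw in-memory bytes.
std::string pack(const std::vector<int>& values) {
    std::string bytes;
    for (int v : values)
        bytes.append(reinterpret_cast<const char*>(&v), sizeof v);
    return bytes;
}

// Reads ints until a read fails; the loop condition is the stream state
// after each read, so an incomplete trailing record is never processed.
std::vector<int> read_all_ints(const std::string& bytes) {
    std::istringstream input(bytes);
    std::vector<int> values;
    int value = 0;
    while (input.read(reinterpret_cast<char*>(&value), sizeof value))
        values.push_back(value);
    return values;
}
```

Because the read itself is the loop condition, empty input yields no iterations and a stray trailing byte after the last full int is ignored rather than processed as a record.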
You are doing totally different things.
The operator>> is greedy and will read as much as possible into dummy. It so happens that while doing so, it runs into the end of file. That sets the input.eof(), and the stream is no longer good(). As it did find some digits before the end, the operation is still successful.
In the second read, you ask for a specific number of bytes (4 most likely) and the read is successful. So the stream is still good().
The stream interface doesn't predict the outcome of any future I/O, because in the general case it cannot know. If you use cin instead of input there might now be more to read, if the user continued typing.
Specifically, the eof() state doesn't appear until someone tries to read past end-of-file.
For text streams, since you wrote only the integer value and not even a space or an end of line, at read time the library must try to read one character past the 1 and the 3, and it hits the end of file. So good() is false and eof() is true.
For binary streams, you wrote 4 bytes (sizeof(int), assuming 32-bit ints) and you read 4 bytes. Fine. No problem has occurred yet, so good() is true and eof() is false. Only the next read will hit the end of file.
But beware. In the text example, if you open the text file in an editor and simply save it without changing anything, chances are that the editor automatically adds an end of line. In that case, the read will stop at the end of line and, as in the binary case, good() will be true and eof() false. The same happens if you write with output << 13 << std::endl;
All of this means that you must never assume that a read did not return the last element of a file just because good() is true and eof() is false: the end of file may be hit only on the next read, even if nothing is returned then.
TL;DR: the only foolproof way to know that there is nothing left in a file is when you are no longer able to read something from it.
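The text-stream behaviour described above can be checked with a short sketch. std::istringstream replaces the file here, and both function names are made up for the example.

```cpp
#include <sstream>
#include <string>

// Reports good() after extracting one int from the given text.
bool good_after_int(const std::string& text) {
    std::istringstream input(text);
    int dummy = 0;
    input >> dummy;
    return input.good();
}

// Reports whether the extraction itself succeeded (stream converts to true).
bool extraction_succeeded(const std::string& text) {
    std::istringstream input(text);
    int dummy = 0;
    return static_cast<bool>(input >> dummy);
}
```

For "13" the extraction succeeds but good() is false, because the parser ran into the end of the data. For "13\n" the extraction succeeds and good() is still true, because parsing stopped at the newline.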
You do not need to resort to inspecting the buffer. You can determine whether the whole file has been consumed with:
cout << (input.peek() != char_traits<char>::eof()) << endl;
This uses peek, which:
Reads the next character from the input stream without extracting it
good in the case of the example is:
Returning false after the last extraction operation, which occurs because the int extraction operator has to read until it finds a character that is not a digit. In this case that's the EOF character, and when that character is read even as a delimiter the stream's eofbit is set, causing good to fail
Returning true after calling read, because read extracts exactly sizeof(int)-bytes so even if the EOF character is the next character it is not read, leaving the stream's eofbit unset and good passing
peek can be used after either of these and will correctly return char_traits<char>::eof() in both cases. Effectively this is inspecting the buffer for you, but with one vital distinction for binary files: If you were to inspect a binary file yourself you'd find that it may contain the EOF character. (On most systems that's defined as 0xFF, 4 of which are in the binary representation of -1.) If you are inspecting the buffer's next char you won't know whether that's actually the end of the file or not.
peek doesn't just return a char though, it returns an int_type. If peek returns 0x000000FF then you're looking at an EOF character, but not the end of file. If peek returns char_traits<char>::eof() (typically 0xFFFFFFFF) then you're looking at the end of the file.
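A possible way to wrap the peek test is sketched below, again using std::istringstream instead of a real file. The names at_end and at_end_after_reading are assumptions for this example only.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// True if the next peek() reports end of stream. peek() returns int_type,
// so a data byte 0xFF (255) is distinguishable from traits::eof().
// Side effect: reaching the end this way sets eofbit on the stream.
bool at_end(std::istream& input) {
    return input.peek() == std::char_traits<char>::eof();
}

// Reads `count` bytes from `bytes`, then asks whether anything is left.
bool at_end_after_reading(const std::string& bytes, std::size_t count) {
    std::istringstream input(bytes);
    std::string chunk(count, '\0');
    input.read(&chunk[0], static_cast<std::streamsize>(count));
    return at_end(input);
}
```

Reading all 4 bytes of "abcd" leaves the stream at its end, reading 2 does not, and a pending 0xFF byte is correctly treated as data rather than as the end of the stream.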

Difference between while(!file.eof()) and while(file >> variable)

First things first - I've got a text file in which there are binary numbers, one number for each row. I'm trying to read them and sum them up in a C++ program. I've written a function which transforms them to decimal and adds them after that and I know for sure that function's ok. And here's my problem - for these two different ways of reading a text file, I get different results (and only one of these results is right) [my function is decimal()]:
ifstream file;
file.open("sample.txt");
int sum = 0;
string BinaryNumber;
while (!file.eof()){
file >> BinaryNumber;
sum+=decimal(BinaryNumber);
}
and that way my sum is too large, but by a small quantity.
ifstream file;
file.open("sample.txt");
int sum = 0;
string BinaryNumber;
while (file >> BinaryNumber){
sum+=decimal(BinaryNumber);
}
and this way gives me the right sum. After some testing I came to the conclusion that the while loop with eof() makes one more iteration than the other while loop. So my question is: what is the difference between those two ways of reading from a text file? Why does the first while loop give me the wrong result, and what is this extra iteration doing?
The difference is that >> reads the data first, and then tells you whether it has been a success or not, while file.eof() does the check prior to the reading. That is why you get an extra read with the file.eof() approach, and that read is invalid.
You can modify the file.eof() code to make it work by moving the check to a place after the read, like this:
// This code has a problem, too!
while (true) { // We do not know if it's EOF until we try to read
file >> BinaryNumber; // Try reading first
if (file.eof()) { // Now it's OK to check for EOF
break; // We're at the end of file - exit the loop
}
sum+=decimal(BinaryNumber);
}
However, this code would break if there is no delimiter following the last data entry. So your second approach (i.e. checking the result of >>) is the correct one.
When using file.eof() to test the input, the last input probably fails and the value stays unchanged and is, thus, processed twice: when reading a string, the stream first skips leading whitespace and then reads characters until it finds a space. Assuming the last value is followed by a newline, the stream hasn't touched EOF, yet, i.e., file.eof() isn't true but reading a string fails because there are no non-whitespace characters.
When using file >> value the operation is executed and checked for success: always use this approach! The use of eof() is only to determine whether the failure to read was due to EOF being hit or something else.
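To see the double processing in isolation, here is a sketch of both loops over an in-memory stream. The decimal function below is only a stand-in for the asker's own converter (it uses std::stoi with base 2), and the other names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Stand-in for the question's decimal(): parses a string of binary digits.
int decimal(const std::string& binary) {
    return std::stoi(binary, nullptr, 2);
}

// Checks eof() before reading. When the last value is followed by a newline,
// the final >> fails, `number` keeps its old value and is added again.
// (Not safe for empty input: the first read fails and decimal("") throws.)
int sum_with_eof_test(const std::string& text) {
    std::istringstream file(text);
    std::string number;
    int sum = 0;
    while (!file.eof()) {
        file >> number;
        sum += decimal(number);
    }
    return sum;
}

// Uses the extraction itself as the condition: the body runs only on success.
int sum_with_extraction_test(const std::string& text) {
    std::istringstream file(text);
    std::string number;
    int sum = 0;
    while (file >> number)
        sum += decimal(number);
    return sum;
}
```

For "101\n11\n" the extraction-tested loop gives 8 (5 + 3), while the eof()-tested loop gives 11 because the last value is counted twice. Without the trailing newline both give 8, which is why the bug can look intermittent.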

Use of eof() in files in C++

Can anybody explain how this while condition works while accessing files in C++. In the while condition, employeeLine is a string.
while ( !inFile.getline( employeeLine, MAX_LINE, '\n' ).eof( ) )
{
//some code here
}
If the file contains data like the following, will the code process the last line, even though it has no trailing newline character?
Tomb33bb9.75<\n>
bbMarybb26bb10.15
(eof)
First off, inFile.getline(...) returns the stream, i.e., a reference to inFile. When the stream reaches the end of file, the flag std::ios_base::eofbit gets set and inFile.eof() returns true. While there is any input, this flag won't be set. If the last line is incomplete, i.e., is lacking a newline character, it won't be processed!
Note, however, that the end of file may not necessarily be reached: if something goes wrong, std::ios_base::failbit is set and the stream won't respond to any further attempts to read something: you'd have an infinite loop. std::istream::getline() does set std::ios_base::failbit when the line is too long to fit into the buffer (i.e., there are more than MAX_LINE - 1 characters). Another potential situation where the stream may go into failure mode without setting std::ios_base::eofbit is when an exception is thrown from the used stream buffer.
In general a better approach is to rely on the conversion to bool for a stream, i.e., to use
while (inFile.getline(employeeLine, MAX_LINE)) {
// ...
}
There is no need to pass '\n' as last parameter as it is the default. There is also no harm.
Note, that the above code won't deal with lines longer than MAX_LINE. That may be intentional, e.g., to avoid a denial of service attack based on infinitely large lines. Typically, it is preferable to use std::string, however:
for (std::string line; std::getline(inFile, line); ) {
// ...
}
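The two loop styles can be compared side by side with a sketch like the one below. std::istringstream stands in for inFile, the function names are invented for the example, and MAX_LINE is given an arbitrary value of 128.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collects lines using std::getline into a std::string.
std::vector<std::string> lines_with_string_getline(const std::string& text) {
    std::istringstream inFile(text);
    std::vector<std::string> lines;
    for (std::string line; std::getline(inFile, line); )
        lines.push_back(line);
    return lines;
}

// Collects lines using the question's `.eof()` condition.
// Assumes every line is shorter than MAX_LINE; a longer line sets failbit
// without eofbit, and this loop would then never end.
std::vector<std::string> lines_with_eof_condition(const std::string& text) {
    const int MAX_LINE = 128;
    std::istringstream inFile(text);
    std::vector<std::string> lines;
    char employeeLine[MAX_LINE];
    while (!inFile.getline(employeeLine, MAX_LINE, '\n').eof())
        lines.push_back(employeeLine);
    return lines;
}
```

With two lines and no trailing newline, the std::getline loop yields both lines, while the .eof() condition yields only the first. When the input ends with a newline, both yield two lines.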
If "inFile.getline( employeeLine, MAX_LINE, '\n' ).eof( )" returns true, it means that the read reached the end of "inFile". So "!inFile.getline( employeeLine, MAX_LINE, '\n' ).eof( )" means that the end of "inFile" has not been reached yet.
See details in MSDN

What is the "right" way to read a file with C++ fstreams?

I am using the standard C++ fstreams library and I am wondering what is the right way to use it. By experience I sort of figured out a small usage protocol, but I am not really sure about it. For the sake of simplicity let's assume that I just want to read a file, e.g., to filter its content and put it on another file. My routine is roughly as follows:
1. I declare a local istream i("filename") variable to open the file;
2. I check either i.good() or i.is_open() to handle the case where something went bad when opening, e.g., because the file does not exist; after, I assume that the file exists and that i is ok;
3. I call i.peek() and then again i.good() or i.eof() to rule out the case where the file is empty; after, I assume that I have actually something to read;
4. I use >> or whatever to read the file's content, and eof() to check that I am over;
5. I do not explicitly close the file - I rely on RAII and keep my methods as short and coherent as I can.
Is it a sane (correct, minimal) routine? In the negative case, how would you fix it? Please note that I am not considering races - synchronization is a different affair.
I would eliminate the peek/good/eof (your third step). Simply attempt to read your data, and check whether the attempted read succeeded or failed. Likewise, in the fourth step, just check whether your attempted read succeeded or not.
Typical code would be something like:
std::ifstream i("whatever");
if (!i)
error("opening file");
while (i >> your_data)
process(your_data);
if (!i.eof())
// reading failed before end of file
It's simpler than you have described. The first two steps are fine (but the second is not necessary if you follow the rest of my advice). Then you should attempt extraction, but use the extraction as the condition of a loop or if statement. If, for example, the file is formatted as a series of lines (or other delimited sequences) all of the same format, you could do:
std::string line;
while (std::getline(i, line)) {
// Parse line
}
The body of the loop will only execute if the line extraction works. Of course, you will need to check the validity of the line inside the loop.
If you have a certain series of extractions or other operations to do on the stream, you can place them in an if condition like so:
if (i >> some_string &&
i.get() == '-' &&
i >> some_int) {
// Use some_string and some_int
}
If the first extraction fails, i.get() will not execute, due to short-circuit evaluation of &&. The body of the if statement will only execute if both extractions succeed and the character between them is '-'. If you have two extractions together, you can of course chain them:
if (i >> some_string >> some_int) {
// Use some_string and some_int
}
The second extraction in the chain will not occur if the first one fails. A failed extraction puts the stream in a state in which all following extractions also fail automatically.
For this reason, it's also fine to place the stream operations outside of the if condition and then check the state of the stream:
i >> some_string >> some_int;
if (i) {
// Use some_string and some_int
}
With both of these methods, you don't have to check for certain problems with the stream. Checking the stream for eof() doesn't necessarily mean that the next read will fail. A common case is when people use the following incorrect extraction loop:
// DO NOT DO THIS
while (!i.eof()) {
std::getline(i, line);
// Do something with line
}
Most text files end with an extra new line at the end that text editors hide from you. When you're reading lines from the text file, for the last iteration you haven't yet hit the end of file because there's still a \n to read. So the loop continues, attempts to extract the next line which doesn't exist and screws up. People often observe this as "reading the last line of the file twice".
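Putting the recommended pattern together, a minimal sketch might look like this: extract in the loop condition, then use eof() only afterwards to tell a clean end of input from a parse failure. std::istringstream stands in for the file, and sum_ints is a hypothetical name.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Sums whitespace-separated ints. The bool is true when the loop stopped
// because the input was exhausted, false when a read failed earlier.
std::pair<int, bool> sum_ints(const std::string& text) {
    std::istringstream i(text);
    int sum = 0;
    for (int value; i >> value; )
        sum += value;            // runs only after a successful extraction
    return std::make_pair(sum, i.eof());
}
```

For "1 2 3" and "1 2 3\n" this gives a sum of 6 with the flag set, so the trailing newline causes no extra iteration. For "1 2 x 4" it stops at the bad token with a sum of 3 and the flag clear.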

C++: .eof on an empty file

Let's look at this program:
ifstream filein("hey.txt");
if(filein.eof()){
cout<<"END"<<endl;
}
Here "hey.txt" is empty, so I expected the if condition to be true, but it isn't.
Why isn't eof() returning true although the file is empty?
If I add the following before the if, eof() returns true, although arr is still empty and the file is still empty, so both are unchanged:
char arr[100];
filein.getline(arr,99);
The eof() function returns true only after the program attempts to read past the end of the file.
You can use std::ifstream::peek() to check for the "logical end-of-file".
eof() tests whether the "end of file" flag is set on the C++ stream object. This flag is set when a read operation encounters the end of the input from the underlying device (file, standard input, pipe, etc.). Before you attempt a read on an empty file the flag is not set. You have to perform an operation that will try to read something before the flag will be set on the stream object.
The flag std::ios_base::eofbit is set when reaching the end of a stream while trying to read. Until an attempt is made to read past the end of a stream this flag won't be set.
In general, use of std::ios_base::eofbit is rather limited. The only reasonable use is for suppressing an error after a failed read: It is normally not an error if a read failed due to reaching end of file. Trying to use this flag for anything else won't work.
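The empty-file case can be tried without touching the disk. The sketch below uses an empty std::istringstream as a stand-in for the empty "hey.txt", and the three function names are made up for illustration.

```cpp
#include <sstream>
#include <string>

// eof() straight after construction, before any read attempt.
bool eof_before_any_read(const std::string& text) {
    std::istringstream filein(text);
    return filein.eof();
}

// eof() after one getline attempt, as in the question's second experiment.
bool eof_after_getline(const std::string& text) {
    std::istringstream filein(text);
    char arr[100];
    filein.getline(arr, sizeof arr);
    return filein.eof();
}

// Emptiness test via peek(), which counts as a read attempt.
bool empty_by_peek(const std::string& text) {
    std::istringstream filein(text);
    return filein.peek() == std::char_traits<char>::eof();
}
```

On empty input, eof() is false before any read and true after the getline attempt, and peek() reports the end immediately. On non-empty input such as "x", peek() returns the next character instead.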