How does ifstream's eof() work? - c++

#include <iostream>
#include <fstream>
int main() {
std::fstream inf( "ex.txt", std::ios::in );
while( !inf.eof() ) {
std::cout << inf.get() << "\n";
}
inf.close();
inf.clear();
inf.open( "ex.txt", std::ios::in );
char c;
while( inf >> c ) {
std::cout << c << "\n";
}
return 0;
}
I'm really confused about eof() function. Suppose that my ex.txt's content was:
abc
The loop that tests eof() always reads one extra character and prints -1, but the inf >> c loop gives the correct output, 'abc'. Can anyone explain this?

-1 is get's way of saying you've reached the end of the file. Compare the result against std::char_traits<char>::eof() (or std::istream::traits_type::eof()) rather than -1, which is a magic number. (If that seems too verbose, you can instead call inf.eof() after the read.)
The EOF flag is only set once a read tries to read past the end of the file. If I have a 3 byte file, and I only read 3 bytes, EOF is false, because I've not tried to read past the end of the file yet. While this seems confusing for files, which typically know their size, EOF is not known until a read is attempted on some devices, such as pipes and network sockets.
The second example works because inf >> foo always returns inf, with the side effect of attempting to read something and store it in foo. In an if or while condition, inf evaluates to true as long as no operation has failed (neither failbit nor badbit is set). Thus, when a read fails, inf evaluates to false and your loop properly terminates. However, take this common error:
while(!inf.eof()) // EOF is false here
{
inf >> x; // read fails, EOF becomes true, x is not set
// use x // we use x, despite our read failing.
}
However, this:
while(inf >> x) // Attempt read into x, return false if it fails
{
// will only be entered if read succeeded.
}
Which is what we want.
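As a rough sketch of the difference between the two loops, here they are side by side. It reads from any std::istream so it can be tried with a std::istringstream instead of a file; the function names read_with_eof_loop and read_in_condition are made up for illustration.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Buggy pattern: eof() is tested before the read that would set it.
std::vector<int> read_with_eof_loop(std::istream& in) {
    std::vector<int> values;
    int x = 0;
    while (!in.eof()) {
        in >> x;              // may fail, in which case x is not a fresh value
        values.push_back(x);  // stored regardless of whether the read worked
    }
    return values;
}

// Recommended pattern: the read itself is the loop condition.
std::vector<int> read_in_condition(std::istream& in) {
    std::vector<int> values;
    int x = 0;
    while (in >> x) {         // body runs only after a successful read
        values.push_back(x);
    }
    return values;
}
```

With the input "1 2 3\n" (a trailing newline, as most text files have), the first function returns four values because the failed fourth read is still stored, while the second returns exactly three.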

The EOF flag is only set after a read operation attempts to read past the end of the file. get() is returning the symbolic constant traits::eof() (which just happens to equal -1) because it reached the end of the file and could not read any more data, and only at that point will eof() be true. If you want to check for this condition, you can do something like the following:
int ch;
while ((ch = inf.get()) != EOF) {
std::cout << static_cast<char>(ch) << "\n";
}

iostream doesn't know it's at the end of the file until it tries to read that first character past the end of the file.
The sample code at cplusplus.com says to do it like this: (But you shouldn't actually do it this way)
while (is.good()) // loop while extraction from file is possible
{
c = is.get(); // get character from file
if (is.good())
cout << c;
}
A better idiom is to move the read into the loop condition, like so:
(You can do this with all istream read operations that return *this, including the >> operator)
char c;
while(is.get(c))
cout << c;

eof() checks the eofbit in the stream state.
On each read operation, if the position is at the end of the stream and more data has to be read, eofbit is set. The read that consumes the last character therefore does not set it; only the next read, which finds nothing, does. That is why you see one extra (failed) read before eof() returns true.
The correct way is to check whether EOF was reached (or, more generally, whether the read succeeded) after the read operation. This is what your second version does: it performs a read and then uses the stream reference that >> returns as a boolean value, which amounts to checking !fail().
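To watch the flags change, here is a small sketch that records the stream state after every get(). ReadStep and trace_gets are hypothetical names used only for this illustration.

```cpp
#include <istream>
#include <sstream>
#include <vector>

struct ReadStep {
    int value;  // what get() returned
    bool eof;   // eof() immediately after the call
    bool fail;  // fail() immediately after the call
};

// Calls get() `count` times and records the stream state after each call.
std::vector<ReadStep> trace_gets(std::istream& in, int count) {
    std::vector<ReadStep> steps;
    for (int i = 0; i < count; ++i) {
        int value = in.get();
        steps.push_back({value, in.eof(), in.fail()});
    }
    return steps;
}
```

For a stream holding "abc" and count = 4, the first three steps report eof == false, including the one that returned 'c'. Only the fourth call, which has nothing left to read, reports eof == true and fail == true.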

Related

Does this STL operator>> function make magic happen?

I have a weird problem when I test C++ STL features.
If I uncomment the line if(eee), my while loop never exits.
I'm using vs2015 under 64-bit Windows.
int i = 0;
istream& mystream = data.getline(mycharstr,128);
size_t mycount = data.gcount();
string str(mycharstr,mycharstr+mycount);
istringstream myinput(str);
WORD myfunclist[9] = {0};
for_each(myfunclist,myfunclist+9, [](WORD& i){ i = UINT_MAX;});
CALLEESET callee_set;
callee_set.clear();
bool failbit = myinput.fail();
bool eof = myinput.eof();
while (!failbit && !eof)
{
int eee = myinput.peek();
if (EOF == eee) break;
//if (eee) // if i uncomment this line ,the failbit and eof will always be false,so the loop will never exit.
{
myinput >> myfunclist[i++];
}
//else break;
failbit = myinput.fail();
eof = myinput.eof();
cout << myinput.rdstate() << endl;
}
I think that
int eee = myinput.peek();
at some point returns zero.
Then due to
if (eee)
you stop reading from the stream and never reach EOF.
Try to do
if (eee >= 0)
instead
As an alternative you could do:
if (eee < 0)
{
break;
}
// No need for further check of eee - just do the read
myinput >> myfunclist[i++];
The root cause of your problem is a misunderstanding of how streams set their flags: the states reported by fail() and eof() are only set once a read operation fails or tries to read past the last byte.
In other words, with C++ streams you may well have read the last byte of your input and be at the end of the file, yet eof() will stay false until you try to read more. You will find many questions and answers on StackOverflow about why you should not loop on eof in a C++ stream.
Consequences:
You will always enter into the loop, even if there is no character to read in myinput.
You therefore have to check for the special case of peek() returning EOF.
If you're still in the loop after the peek, then there are still characters to read. Keep in mind that peek() does not consume the character. If you do not then read it, you stay at the same position in the stream. So if for any reason you do not reach myinput >> myfunclist[i++];, you're stuck in an endless loop, peeking at the same character over and over again. This is the 0 case that is well described in 4386427's answer: the character is still there and you do not progress in the stream.
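If you do want to keep the peek, one way to make sure the loop always progresses is to consume something on every path. The following is only a sketch under that assumption; read_ints_with_peek is a hypothetical helper, not code from the question.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated integers and skips any character that cannot
// start a number. Each iteration either extracts a value or discards one
// character, so the loop cannot get stuck on the same position.
std::vector<int> read_ints_with_peek(std::istream& in) {
    std::vector<int> values;
    for (;;) {
        int next = in.peek();  // look at the next character without consuming it
        if (next == std::istream::traits_type::eof()) break;
        int value = 0;
        if (in >> value) {
            values.push_back(value);
        } else {
            in.clear();   // reset failbit so the stream is usable again
            in.ignore();  // drop one character and move on
        }
    }
    return values;
}
```

An embedded '\0' (the case where peek() returns 0) is simply discarded here instead of blocking the loop.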
Other comments:
Since your input can be 128 bytes long and you read integers in text form, evil input could contain up to 64 different words, causing your loop to go out of bounds of the 9-element array and, for example, corrupt memory.
It is not clear why you try to peek at all.
I'd suggest to forget about the flags, use the usual stream reading idiom and simplify the code to:
...
callee_set.clear(); // until there, no change
while (i<9 && myinput >> myfunclist[i++])
{
cout << myinput.rdstate() << endl; // if you really want to know ;-)
}

Why is the c++ input file stream checked twice here?

Here is a snippet from a c++ tutorial:
// istream::get example
#include <iostream> // std::cin, std::cout
#include <fstream> // std::ifstream
int main () {
char str[256];
std::cout << "Enter the name of an existing text file: ";
std::cin.get (str,256); // get c-string
std::ifstream is(str); // open file
while (is.good()) // loop while extraction from file is possible
{
char c = is.get(); // get character from file
if (is.good())
std::cout << c;
}
is.close(); // close file
return 0;
}
Notice is.good() appeared twice, first with while, then with if.
Link to the example: http://www.cplusplus.com/reference/istream/istream/get/
Why is the c++ input file stream checked twice here?
The fact of the matter is that it is unnecessarily checked twice. If the second inner if (is.good()) passes, then the outer while (is.good()) will always pass as well. The author of the code needed some way of looping, and he incorrectly assumed that a while (is.good()) is an appropriate condition because it will stop the loop when the stream fails to extract. But this is only half-true. while (is.good()) is never the correct way to perform the extraction.
You have to perform the input first and then check whether it succeeded. Otherwise it is possible to perform a failed extraction, use the result of that extraction, and get unwanted behavior from your program. The correct way to do it is to use the extraction itself as the condition. The input function returns a reference to the stream, which then converts to a boolean: true if the read succeeded, false otherwise:
while (is.get(c))
{
std::cout << c;
}
Note that the variable c has to be declared outside the loop here, so it remains in scope afterwards. To limit its scope, you can enclose the while() loop in a block or use a for() loop instead:
for (char c; is.get(c); )
{
std::cout << c;
}
But it seems that this code is attempting to write all the content from the file to standard output. Reading a character one-by-one is the way shown here, but you can also use stream iterators or the buffer overload of std::ostream::operator<<() as well.
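A minimal sketch of those two alternatives follows. I am assuming "stream iterators" means std::istreambuf_iterator here, since it does not skip whitespace; the function names are hypothetical.

```cpp
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>

// Copies everything left in `in` to `out` using the streambuf overload
// of operator<<. Note that this overload sets failbit on `out` if no
// characters at all could be copied (for example, an empty input).
void copy_with_rdbuf(std::istream& in, std::ostream& out) {
    out << in.rdbuf();
}

// Reads everything left in `in` into a string using streambuf iterators.
std::string slurp_with_iterators(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For the tutorial program, copy_with_rdbuf(is, std::cout) would replace the whole character loop.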
There are two more problems I see in this code. Namely:
std::string is the preferred construct for manipulating dynamically-sized strings, not C-style strings which require the use of archaic input methods such as .get(), .getline(), etc, and their respective overloads.
Manually closing a file is usually unneeded. The stream will close itself at the end of the scope in which it was created. You probably only want to close the file yourself to check if it succeeds or to reopen the stream with a different file or openmode.
The first check, in while (is.good()), tests whether the stream has already reached EOF (End Of File) or failed. If it has, the loop is not entered again. Note that entering the loop does not guarantee a character is left for char c = is.get();, because the EOF flag is only set once a read actually hits the end.
The second if() prevents printing the value from that last, failed read: after char c = is.get();, the stream may have reached EOF, and in that case c does not hold a real character and is not printed.
For example, let's say you have this file:
"Just an example!"
Now, if you had just:
while (is.good()) // loop while extraction from file is possible
{
char c = is.get(); // get character from file
std::cout << c;
}
the output would be: "Just an example!" followed by one extra character. That extra character is the EOF value returned by the final get(), converted to char (how it is displayed depends on the terminal).
But with:
while (is.good()) // loop while extraction from file is possible
{
char c = is.get(); // get character from file
if (is.good())
std::cout << c;
}
the output would be: "Just an example!", which is what you would expect it to be.

C++ istream operator>> bad-data handling

Every time I ask a question here on SO, it turns out to be some very dumb mistake (check my history if you don't believe me), so bear with me if you can here.
It feels like my question should be very popular, but I couldn't find anything about it and I've run out of ideas to try.
Anyway, without further ado:
I'm trying to overload the input operator>>. It's supposed to read one integer at a time from a file, skipping invalid data such as chars, floats, etc.
Naturally, I'm checking if(in >> inNum) to both get() the next token and check for successful get().
If successful, not much to say there.
If it fails, however, I assume that one of two things happened:
It stumbled upon a non-integer
It reached the eof
Here's how I tried to deal with it:
istream& operator>> (istream& in, SortSetArray& setB) {
bool eof = false;
int inNum = -1;
while(!eof) {
if(in >> inNum) {
cout << "DEBUG SUCCESS: inNum = " << inNum << endl;
setB.insert(inNum);
}
else {
// check eof, using peek()
// 1. clear all flags since peek() returns eof regardless of what
// flag is raised, even if it's not `eof`
in.clear();
cout << "DEBUG FAIL: inNum = " << inNum << endl;
// 2. then check eof with peek()
eof = (in.peek() == std::char_traits<char>::eof());
}
}
return in;
}
The file contains [1 2 3 4 a 5 6 7], and the program naturally goes into infinite loop.
Okay, easy guess, peek() doesn't consume the char 'a', and maybe in >> inNum also failed to consume it somehow. No biggie, I'll just try something that does.
And that's pretty much where I've been for the last 2 hours. I tried istream::ignore(), istream::get(), ios::rdstate to check eof, double and string instead of char in the file, just in case char is read numerically.
Nothing works and I'm desperate.
Weirdly enough, the approach above worked for a previous program where I had to read a triplet of data entries on a line of the format: string int int
The only difference is I used an ifstream object for that one, and an istream object for this one.
Bonus Question: inNum has the value of 0 when the hiccup occurs. I'm guessing it's something that istream::operator>> does?
Implementation description
try to read an int
if successful;
insert the read value to setB
next iteration
else;
clear error flags
check so that we haven't reached the end of the file
still more data? next iteration.
The above is the logic description of your function, but there's something missing...
When we try to read a value but fail, std::istream handles this by setting the appropriate error flags, but it will not discard any data.
The problem with your implementation is that after trying to read invalid data, you will just try to read the same invalid data again, over and over, indefinitely.
Solution
After clearing the error flags you can use std::istream::ignore to discard any data from the stream.
The function's 1st argument is the maximum number of chars to ignore, and the 2nd is the delimiter: "if you hit this char, don't ignore any more".
Let's ignore the maximum amount of characters, or until we hit ' ' (space):
#include <limits> // std::numeric_limits
in.ignore (std::numeric_limits<std::streamsize>::max(), ' ');
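Put together, the loop could look roughly like this. This is a sketch rather than the asker's operator>>: it collects into a std::vector<int> instead of SortSetArray, and read_ints_skipping_bad_tokens is a made-up name.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <vector>

// Reads every integer it can find, discarding tokens that are not integers.
std::vector<int> read_ints_skipping_bad_tokens(std::istream& in) {
    std::vector<int> values;
    int n = 0;
    for (;;) {
        if (in >> n) {
            values.push_back(n);
        } else if (in.eof() || in.bad()) {
            break;       // nothing left to read, or an unrecoverable error
        } else {
            in.clear();  // reset failbit so further reads are possible
            // Discard the offending token, up to and including the next space.
            in.ignore(std::numeric_limits<std::streamsize>::max(), ' ');
        }
    }
    return values;
}
```

With the file contents from the question, 1 2 3 4 a 5 6 7, this yields the seven integers and skips the a. Two limitations of this simple version: a token such as 3.5 still yields 3 before the rest is skipped, and because the delimiter is a space, a bad token followed directly by a newline also discards the first token on the next line.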

Find the end of stream for cin & ifstream?

I'm running myself through a C++ text book that I have as a refresher to C++ programming. One of the practice problems (without going into too much detail) wants me to define a function that can be passed ifstream or cin (e.g. istream) as an argument. From there, I have to read through the stream. Trouble is, I can't figure out a way to have this one function use cin and ifstream to effectively find the end of the stream. Namely,
while(input_stream.peek() != EOF)
isn't going to work for cin. I could rework the function to look for a certain phrase (like "#End of Stream#" or something), but I think this is a bad idea if the file stream I pass has this exact phrase.
I have thought to use function overloading, but so far the book has mentioned when it wants me to do this. I'm probably putting too much effort into this one practice problem, but I enjoy the creative process and am curious if there's such a way to do this without overloading.
eof() does work for cin. You are doing something wrong; please post your code. One common stumbling block is that the eof flag only gets set after you try to read past the end of the stream.
Here is a demonstration:
#include <iostream>
#include <string>
int main( int, char*[] )
{
std::string s;
for ( unsigned n = 0; n < 5; ++n )
{
bool before = std::cin.eof();
std::cin >> s;
bool after = std::cin.eof();
std::cout << int(before) << " " << int(after) << " " << s << std::endl;
}
return 0;
}
and its output:
D:>t
aaaaa
0 0 aaaaa
bbbbb
0 0 bbbbb
^Z
0 1 bbbbb
1 1 bbbbb
1 1 bbbbb
(EOF can be generated with Ctrl-Z on Windows and Ctrl-D on many other OSes)
Why won't std::cin.eof() work? cin will signal EOF when stdin closes, which will happen when the user signals it with Ctrl+d (*nix) or Ctrl+z (Windows), or (in the case of a piped input stream) when the piped file ends.
If you use a stream in a boolean context, it converts to a value equivalent to true if no read has failed, and to false once an attempt to read past the EOF has failed (note: it is also false if there was a previous error reading from the stream).
Since most IO operations on streams return the stream (so they can be chained), you can do your read operation and use the result in the test.
So a program to read a series of numbers from a stream looks like this:
#include <iostream>
int main()
{
int x;
// Here we try and read a number from the stream.
// If this fails (because of EOF or other error) an internal flag is set.
// The stream is returned as the result of operator>>
// So the stream is then being used in the boolean context of the while()
// So it will be converted to true if operator>> worked correctly.
// or false if operator>> failed because of EOF
while(std::cin >> x)
{
// The loop is only entered if operator>> worked correctly.
std::cout << "Value: " << x << "\n";
}
// Exit when EOF (or other error).
}
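The same idiom answers the original question about one function for both cin and ifstream: take a std::istream& and let the read be the loop condition. A minimal sketch, with read_lines as a hypothetical name:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Accepts any input stream: std::cin, a std::ifstream, a std::istringstream.
// No overloading and no sentinel phrase is needed; the loop ends when
// getline can no longer read a line.
std::vector<std::string> read_lines(std::istream& input_stream) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(input_stream, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Calling read_lines(std::cin) returns once the user signals end of input (Ctrl-D or Ctrl-Z as described above), and passing an open std::ifstream returns at the end of the file.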

C++ file handling (structures)

Following code, when compiled and run with g++,
prints '1' twice, whereas I expect '1' to be printed
only once, since I am dumping a single structure to
the file, but while reading back it seems to be
reading two structures. Why?
#include <iostream.h>
#include <fstream.h>
int main(){
struct student
{
int rollNo;
};
struct student stud1;
stud1.rollNo = 1;
ofstream fout;
fout.open("stu1.dat");
fout.write((char*)&stud1,sizeof(stud1));
fout.close();
ifstream filin("stu1.dat");
struct student tmpStu;
while(!filin.eof())
{
filin.read((char*)&tmpStu,sizeof(tmpStu));
cout << tmpStu.rollNo << endl;
}
filin.close();
}
eof only gets set after a read fails, so the read runs twice, and the second time, it doesn't modify the buffer.
Try this:
while(filin.read((char*)&tmpStu,sizeof(tmpStu)))
{
cout << tmpStu.rollNo << endl;
}
Or
while(!filin.read((char*)&tmpStu,sizeof(tmpStu)).eof())
{
cout << tmpStu.rollNo << endl;
}
read() returns a reference to filin, which evaluates to true if the read succeeded. When read() fails to read any more data, the reference evaluates to false, which ends the loop.
Your while loop is executing twice because the EOF condition is not true until the first attempt to read beyond the end of the file. So the cout is executed twice.
This prints 1 twice because of the exact way eof and read work. If you are at the very end of a file, read will fail, then calls to eof after that return true. If you have not attempted to read past the end of the file, eof will return false because the stream is not in the EOF state, even though there is no more data left to read.
To summarize, your calls look like this:
eof - false (at beginning of file)
read (at beginning of file)
eof - false (now at end of file, but EOF not set)
read (at end of file. fails and sets EOF state internally)
eof - true (EOF state set)
A better strategy would be to check eof right after the read call.
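Checking right after the read could look like the sketch below. It uses standard headers and takes generic streams so it can be tried with a std::stringstream instead of stu1.dat; Student, write_student and read_students are illustrative names, not the asker's code.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

struct Student {
    int rollNo;
};

// Writes the raw bytes of one record.
void write_student(std::ostream& out, const Student& s) {
    out.write(reinterpret_cast<const char*>(&s), sizeof s);
}

// Reads records until a read fails, testing the stream right after
// each read() and before the record is used.
std::vector<Student> read_students(std::istream& in) {
    std::vector<Student> students;
    for (;;) {
        Student tmp{};
        in.read(reinterpret_cast<char*>(&tmp), sizeof tmp);
        if (!in) break;  // short or failed read: tmp is not a full record
        students.push_back(tmp);
    }
    return students;
}
```

Writing one record and reading it back this way gives exactly one Student, because the second read() fails and its result is never used.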
I believe it is because you are checking for filin.eof() and that won't be true until the second time you read.
The documentation for read() notes that eofbit is set when "...The end of the source of characters is reached before n characters have been read ...". In your case you won't hit EOF until the second read.
Cool.
Another way (courtesy of experts-exchange, I asked the same question there :-))
while(filin.peek() != EOF)
{
filin.read((char*)&tmpStu,sizeof(tmpStu));
cout << tmpStu.rollNo << endl;
}