eof() bad practice? [duplicate] - c++

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Why is iostream::eof inside a loop condition considered wrong?
So I've been using the eof() function in a lot of my programs that require file input, and my professor said that it is fine to use but a few people on SO have said that I shouldn't use it without really specifying the reason. So I was wondering, is there a good reason?

You can use eof to test for the exact condition it reports - whether you have attempted to read past end of file. You cannot use it to test whether there's more input to read, or whether reading succeeded, which are more common tests.
Wrong:
while (!cin.eof()) {
cin >> foo;
}
Correct:
if (!(cin >> foo)) {
if (cin.eof()) {
cout << "read failed due to EOF\n";
} else {
cout << "read failed due to something other than EOF\n";
}
}

You shouldn't use it because if input fails for another reason, you can be screwed.
while(!file.eof()) {
int i;
file >> i;
doSomething(i);
}
What happens in the above loop if the contents of file are "WHAARRRRRRGARBL"? It loops infinitely, because it's not the end of the file, but you still can't extract an int from it.

How are you using it? What .eof() will tell you is that the stream has already hit the end of file, that it has tried to read data that isn't there.
This isn't usually what you want. Usually, you want to know that you are hitting the end of file, or that you are about to hit the end of file, and not that you have already hit it. In a loop like:
while(!f.eof())
{
f >> /* something */;
/* do stuff */
}
you are going to attempt to read input in and fail, and then you are going to execute everything in /* do stuff */, and then the loop condition fails and the loop stops.
Another problem with that loop is that there are other ways input can fail. If there's an error condition on the file that isn't EOF, .fail() will return true but .eof() won't.

In case the above answers are confusing:
What people think it does is wrong:
eof() does not look to see if the last portion you read included the last byte of the file.
It does not look to see if the next byte is after the end of the file.
What it actually does:
eof() reports whether the last read included bytes past the end of the file.
If eof() is true, you've already made a mistake. This explains the professor's statement. Use it for what it means: an error has occurred.

Related

Explain how the for loop is terminating?

I came across some code which I didn't understand. It was on a coding website. The code was like this:
#include<iostream>
using namespace std;
char s[4];
int x;
int main()
{
for(cin>>s;cin>>s;x+=44-s[1]);
cout<<x;
}
My question is how the for loop terminates. Since the code was on a coding website, the answers are, as far as I know, checked using file input. But when I run it in an IDE the loop does not terminate; instead it keeps taking input from the user. So what's the explanation for this?
Sample Input
3
x++
x--
--x
Output
-1
EDIT
This is the problem link - Bit++
This is the solution link - In status filter set language to MS C++ Author name - wafizaini (Solution id - 27116030)
The loop terminates because istream has operator bool() (prior to C++11 it was operator void*), which returns false once an extraction has failed, for example because no additional input is available. Basically, the reason the loop stops is the same as why a more common while loop terminates:
while (cin >> s) {
...
}
The reason this does not terminate when you run with an IDE is that you need to supply an end-of-stream mark, which is delivered in a system-dependent way. On UNIX and other systems derived from it you press Ctrl+d, while on Windows you press Ctrl+z.
Note: Your program is at risk of a buffer overrun if an end-user enters more than three characters (the fourth element of s is needed for the null terminator of the string). Also note that the initial input cin>>s is thrown away, because the loop condition reads into s again before the increment expression ever uses it.
That's perfectly valid, although a bit difficult to read, C++11 code.
std::istream::operator>>()
returns a reference to the input stream itself, and
std::istream::operator bool()
in turn evaluates the stream to a boolean value, returning false whenever failbit or badbit is set.
When reading from a file, that loop will eventually try to read past the end of file, causing eofbit and failbit to be set and thus stopping the loop.
However, when running that code on a shell, you need to manually input the EOF control code on the stream, otherwise the for loop won't stop. This can be done by pressing Ctrl+D on Unix shells, for example.
A more common loop condition is while (cin >> s).
The convention is that operator>> returns a reference to the stream (cin). The stream classes then have a conversion operator that will return false in if (cin) or while (cin) after some input has failed.
This would work in the middle part of a for-loop as well, but is a bit unusual.

Why is (foobar>>x) preferred over (! foobar.eof() ) [duplicate]

My teacher said we shouldn't use EOF to read text file or binary file information; instead we should use (afile>>x). He didn't explain why. Can someone explain it to me? Can someone also explain the differences between these two methods of reading?
//Assuming declaration
//ifstream foobar
while ( ! foobar.eof() )
{
foobar>>x; // This is discouraged by my teacher
}
while (foobar>>x)
{
//This is encouraged by my teacher
}
Because the file is not at the end before you try to read from it.
operator>> returns a reference to the stream in the state it is after the read has been attempted and either succeeded or failed, and the stream evaluates to true if it succeeded or false if it failed. Testing for eof() first means that the file can have no useful data in it but not be at EOF yet, then when you read from it, it's at EOF and the read fails.
Another important detail is that operator>> for streams skips all leading whitespace, not trailing whitespace. This is why a file can be short of EOF before a read and at EOF after it, even though the read found no data.
Additionally, while (foobar>>x) also stops correctly when the next data in the file cannot be read into an integer (for example, the next data is x), not just when the file is at EOF, which is very important.
Example:
Consider the code:
int x, y;
f >> x;
if (!f.eof())
f >> y;
Assuming f is a file that contains the data 123␣ (the ␣ means space), the first read will succeed, but afterwards the file has no more integers in it and it is not at EOF. The second read will fail and the file will be at EOF, but you don't know because you tested for EOF before you tried reading. Then your code goes on to cause undefined behaviour because y is uninitialised.

Is the inconsistency of C++'s istream::eof() a bug in the spec or a bug in the implementation?

The following program demonstrates an inconsistency in the way that std::istream (specifically in my test code, std::istringstream) sets eof().
#include <sstream>
#include <cassert>
int main(int argc, const char * argv[])
{
// EXHIBIT A:
{
// An empty stream doesn't recognize that it's empty...
std::istringstream stream( "" );
assert( !stream.eof() ); // (Not yet EOF. Maybe should be.)
// ...until I read from it:
const int c = stream.get();
assert( c < 0 ); // (We received garbage.)
assert( stream.eof() ); // (Now we're EOF.)
}
// THE MORAL: EOF only happens when actually attempting to read PAST the end of the stream.
// EXHIBIT B:
{
// A stream that still has data beyond the current read position...
std::istringstream stream( "c" );
assert( !stream.eof() ); // (Clearly not yet EOF.)
// ... clearly isn't eof(). But when I read the last character...
const int c = stream.get();
assert( c == 'c' ); // (We received something legit.)
assert( !stream.eof() ); // (But we're already EOF?! THIS ASSERT FAILS.)
}
// THE MORAL: EOF happens when reading the character BEFORE the end of the stream.
// Conclusion: MADNESS.
return 0;
}
So, eof() "fires" when you read the character before the actual end-of-file. But if the stream is empty, it only fires when you actually attempt to read a character. Does eof() mean "you just tried to read off the end?" or "If you try to read again, you'll go off the end?" The answer is inconsistent.
Moreover, whether the assert fires or not depends on the compiler. Apple Clang 4.1, for example, fires the assertion (raises eof() when reading the preceding character). GCC 4.7.2, for example, does not.
This inconsistency makes it hard to write sensible loops that read through a stream but handle both empty and non-empty streams well.
OPTION 1:
while( stream && !stream.eof() )
{
const int c = stream.get(); // BUG: Wrong if stream was empty before the loop.
// ...
}
OPTION 2:
while( stream )
{
const int c = stream.get();
if( stream.eof() )
{
// BUG: Wrong when c in fact got the last character of the stream.
break;
}
// ...
}
So, friends, how do I write a loop that parses through a stream, dealing with each character in turn, handles every character, but stops without fuss either when we hit the EOF, or in the case when the stream is empty to begin with, never starts?
And okay, the deeper question: I have the intuition that using peek() could maybe workaround this eof() inconsistency somehow, but...holy crap! Why the inconsistency?
The eof() flag is only useful to determine if you hit end of file after some operation. The primary use is to avoid an error message if reading reasonably failed because there wasn't anything more to read. Trying to control a loop or something using eof() is bound to fail. In all cases you need to check after you tried to read if the read was successful. Before the attempt the stream can't know what you are going to read.
The semantics of eof() are defined thoroughly as "this flag gets set when reading the stream caused the stream buffer to return a failure". This statement isn't quite as easy to find, if I recall correctly, but this is what it comes down to. At some point the standard also says that the stream is allowed to read more than it has to in some situations, which may cause eof() to be set when you don't necessarily expect it. One such example is reading a character: the stream may end up detecting that there is nothing following that character and set eof().
If you want to handle an empty stream, it's trivial: look at something from the stream and proceed only if you know it's not empty:
if (stream.peek() != std::char_traits<char>::eof()) {
do_what_needs_to_be_done_for_a_non_empty_stream();
}
else {
do_something_else();
}
Never, ever check for eof alone.
The eof flag (which is the same as the eofbit bit flag in a value returned by rdstate()) is set when end-of-file is reached during an extract operation. If there were no extract operations, eofbit is never set, which is why your first check returns false.
However, eofbit is no indication as to whether the operation was successful. For that, check failbit|badbit in rdstate(). failbit means "there was a logical error", and badbit means "there was an I/O error". Conveniently, there's a fail() function that reports exactly whether rdstate() & (failbit|badbit) is non-zero. Even more conveniently, there's an operator bool() function that returns !fail(). So you can do things like while(stream.read(buffer, size)){ ....
If the operation has failed, you may check eofbit, badbit and failbit separately to figure out why it has failed.
What compiler / standard c++ library are you using? I tried it with gcc 4.6.3/4.7.2 and clang 3.1, and all of them worked just fine (i.e. the assertion does not fire).
I think you should report this as a bug in your tool-chain, since my reading of the standard accords with your intuition that eof() should not be set as long as get() is able to return a character.
It's not a bug, in the sense that it's the intended behavior. It is
not the intent that you test eof() until after input has
failed. Its main purpose is for use inside extraction functions, where
in early implementations, the fact that std::streambuf::sgetc()
returned EOF didn't mean that it would the next time it was called:
the intent was that anytime sgetc() returned EOF (now
std::char_traits<>::eof()), this would be memorized, and the stream
would make no further calls to the streambuf.
Practically speaking: we really need two eof(): one for internal use,
as above, and another which will reliably state that failure was due to
having reached end of file. As it is, given something like:
std::istringstream s( "1.23e+" );
s >> aDouble;
There's no way of detecting that the error is due to a format error,
rather than the stream not having any more data. In this case, the
internal eof should return true (because we have seen end of file, when
looking ahead, and we want to suppress all further calls to the
streambuf extractor functions), but the external one should be false,
because there was data present (even after skipping initial whitespace).
If you're not implementing an extractor function, of course, you should
never test basic_ios::eof() until you've actually had an input failure.
It was never the intent that this would provide any useful information
(which makes one wonder why they defined basic_ios::good(): the
fact that it returns false if eof() means that it can provide no
reliable information until fail() returns true, at which point, we
know that it will return false, so there's no point in calling it).
And I'm not sure what your problem is. Because the stream cannot know
in advance what your next input will be (e.g. whether it will skip
whitespace or not), it cannot know in advance whether your next input
will fail because of end of file or not. The idiom adopted is clear:
try the input, then test whether it succeeded or not. There is no
other way, because no other alternative can be implemented. Pascal did
it differently, but a file in Pascal was typed—you could only read
one type from it, so it could always read ahead one element under the
hood, and return end of file if this read ahead failed. Not having
previsional end of file is the price we pay for being able to read more
than one type from a file.
The behavior is somewhat subtle. eofbit is set when an attempt is made to read past the end of the file, but that may or may not cause failure of the current extraction operation.
For example:
ifstream blah;
// assume the file got opened
int i, j;
blah >> i;
if (!blah.eof())
blah >> j;
If the file contains 142<EOF>, then the sequence of digits is terminated by end of file, so eofbit is set AND the extraction succeeds. Extraction of j won't be attempted, because the end of file has already been encountered.
If the file contains 142 <EOF>, then the sequence of digits is terminated by whitespace (extraction of i succeeds). eofbit is not set yet, so blah >> j will be executed, and it will reach end of file without finding any digits, so it will fail.
Notice how the innocuous-looking whitespace at the end of file changed the behavior.
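The two cases can be observed directly with a string stream standing in for the file. This is only a sketch; the struct and function names are invented for the example.

```cpp
#include <sstream>
#include <string>

// Hypothetical record of the stream state after extracting one int.
struct FirstRead {
    bool ok;   // did the extraction succeed?
    bool eof;  // was eofbit set by it?
};

FirstRead read_first_int(const std::string& contents) {
    std::istringstream in(contents);
    int i = 0;
    in >> i;
    return FirstRead{ !in.fail(), in.eof() };
}
```

For "142" the result is ok and eof both true; for "142 " (trailing space) the extraction is still ok but eof is false, so a following !eof() test lets the doomed second read go ahead.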

c++ EOF running one too many times?

This is my first time using EOF and/or files, and I am having an issue where my code hangs, which I believe is because my EOF is looping one too many times.
I am inputting from a file, and dynamically creating objects that way, and it hangs once the file is run through.
while( !studentFile.eof() )
{
cout << "38\n";
Student * temp = new Student();
(*temp).input( studentFile );
(*sdb).insert( (*temp) );
}
This chunk of code is the code in question. The cout << "38\n"; prints the line number, and it is the reason I believe the loop runs one too many times.
The file only contains 4 students' worth of data, yet 38 appears 5 times. Once it gets the last bit of data, it doesn't seem to register that the file has ended, and loops in again, but there is no data to input so my code hangs.
How do I fix this? Is my logic correct?
Thank you.
Others have already pointed out the details of the problem you've noticed.
You should realize, however, that there are more problems you haven't noticed yet. One is a fairly obvious memory leak. Every iteration of the loop executes: Student * temp = new Student();, but you never execute a matching delete.
C++ makes memory management much simpler than some other languages (e.g., Java), which require you to new every object you use. In C++, you can just define an object and use it:
Student temp;
temp.input(studentFile);
This simplifies the code and eliminates the memory leak -- your Student object will be automatically destroyed at the end of each iteration, and a (conceptually) new/different one created at the beginning of the next iteration.
Although it's not really a bug per se, even that can still be simplified quite a bit. Since whatever sdb points at apparently has an insert member function, you can use it like a standard container (which it may actually be, though it doesn't really matter either way). To neaten up the code, start by writing an extraction operator for a Student:
std::istream &operator>>(std::istream &is, Student &s) {
s.input(is);
return is;
}
Then you can just copy data from the stream to the collection:
std::copy(std::istream_iterator<Student>(studentFile),
std::istream_iterator<Student>(),
std::inserter(*sdb, sdb->end())); // assumes *sdb offers end() and insert(iterator, value)
Note that this automates correct handling of EOF, so you don't have to deal with problems like you started with at all (even if you wanted to cause them, it wouldn't be easy).
This is because the EOF flag is only set after you try to read and get no data. So it would go
Test for EOF -> No EOF
Try to read one line -> Good, read first line
Test for EOF -> No EOF
Try to read one line -> Good, read second line
Test for EOF -> No EOF
Try to read one line -> Good, read third line
Test for EOF -> No EOF
Try to read one line -> Good, read fourth line
Test for EOF -> No EOF
Try to read one line -> EOF
But by the Try to read one line -> EOF, you're already in the body of the while on the fifth iteration, which is why you're seeing the loop run 5 times. So you need to read before you check for EOF.
You need to check the stream status bits immediately after performing an operation on a stream. You don't show the code, but it would appear that (*temp).input(studentFile) is doing the reading from the stream. Call eof() (or another status check) after doing the read but before processing the data you (attempted to) read.

Stopping endless loop when int = (something that's not an integer)

So this is a problem that I've been having since I started programming (Not that long ago. I still don't know why I started with C++). When I have some integer variables and the user's input defines them, if the user inputs something other than an integer the program freaks out and runs an endless loop of the last command it was given. I don't think sample code is needed but if it is I can make a basic example pretty easily.
If you want to know exactly what your mistake was, then we'd need to see your code, but the usual idiom is like this:
int i;
while (std::cin >> i) {
// do something with the user's input, i
}
if (std::cin.fail()) {
std::cout << "not a number!\n";
}
If failure occurs and you want to get past the invalid input so that the user can try again, first call cin.clear(), then either cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n') to ignore the whole line, or std::string s; std::cin >> s; to ignore a whitespace-separated word.
Beware that because the second case actually constructs the string in memory, the user could input a few gigabytes without a space, and the program will fail. That's usually fine if the input is from a terminal, it's the user's own stupid fault. It might be less fine if the input is from an HTTP request or other untrusted source, so some time in future you might end up worrying about it...
Check this out Guess the number - Infinite loop when bad read
When programming, always, and I mean always, validate your input.
Check whether the input you get is sane.
What I mean by that is: if you get something that is supposed to be an int, check that it is.
Convert it if it is not.
If you get a string, check whether it is in bounds, meaning is it too long, too short, too whatever.
cin would be the term to Google for in your case.