Why is (foobar>>x) preferred over (! foobar.eof() ) [duplicate] - c++

My teacher said we shouldn't use EOF to read in text or binary file information; instead we should use (afile >> x). He didn't explain why. Can someone explain, and also explain the differences between these two methods of reading?
//Assuming declaration
//ifstream foobar
while ( !foobar.eof() )
{
    foobar >> x; // This is discouraged by my teacher
}
while (foobar >> x)
{
    //This is encouraged by my teacher
}

Because the file is not at the end before you try to read from it.
operator>> returns a reference to the stream in the state it is after the read has been attempted and either succeeded or failed, and the stream evaluates to true if it succeeded or false if it failed. Testing for eof() first means that the file can have no useful data in it but not be at EOF yet, then when you read from it, it's at EOF and the read fails.
Another important detail is that operator>> for streams skips leading whitespace but not trailing whitespace. This is why a file can be not yet at EOF before a read and yet be at EOF after that read.
Additionally, the encouraged form, while (foobar >> x), also stops when the next data in the file is something that cannot be read into an integer (for example, when the next token is the letter x), not just when the stream is at EOF, which is very important.
Example:
Consider the code:
int x, y;
f >> x;
if (!f.eof())
    f >> y;
Assuming f is a file that contains the data 123␣ (the ␣ means space), the first read will succeed, but afterwards the file has no more integers in it and it is not at EOF. The second read will fail and the file will be at EOF, but you don't know because you tested for EOF before you tried reading. Then your code goes on to cause undefined behaviour because y is uninitialised.
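To make the contrast concrete, here is a minimal sketch of the encouraged form; the input data is hypothetical and chosen so the loop stops on a format error rather than at EOF:
#include <sstream>
#include <iostream>

int main() {
    std::istringstream f("10 20 x 30");   // hypothetical data; "x" cannot be read as an int
    int x;
    while (f >> x)                        // read first, then test whether the read succeeded
        std::cout << x << "\n";
    // The loop stops at "x" with failbit set (a format error), not at EOF,
    // and x is never used after a failed read.
}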


Are there cases where the last operation succeeds but the istream/ostream goes fail()?

I'm a new learner of C++. In my understanding, fail() indicates whether the last operation on the istream/ostream succeeded or not. For example, let ifs be an ifstream opened to some valid file:
string s;
while (ifs >> s) {
// 1. assume the reading was taken successfully
// and do something according to that
}
// 2. assume the reading was unsuccessful
// and do something according to that
May I ask whether it is possible that, in some corner cases, we arrive at 2. even though the read actually succeeded?
If so, what we do at 2. could be problematic. For example, if we have set ifs.exceptions(ifs.exceptions() | ios_base::badbit);, then at 2. we would (probably) have assumed that ifs is empty.
I came up with this question while self-learning C++ from Stroustrup's book Programming: Principles and Practice Using C++ (2nd ed.). In Chapter 11.7 of this book, the writers design a Punct_stream class that takes an istream as its source, processes data extracted from the source, and then puts the processed data into a data member of type istringstream called buffer, waiting to be read into the program. The writers define the operator bool() method as follows (page 404):
Punct_stream::operator bool()
{
return !(source.fail() || source.bad()) && buffer.good();
}
Firstly, I think !(source.fail() || source.bad()) should be equivalent to just checking source since fail() accounts for both failbit and badbit, and checking source is equivalent to checking !fail().
What is more related to my question is that, by this design, operator bool() first checks whether the source is in an error state, and if so it won't even look at the buffer. But reading data from a Punct_stream into a string succeeds as long as there are characters left in buffer. So if the buffer is not empty and the source has somehow failed, then reading from the Punct_stream into a string will actually succeed while the check on the Punct_stream's condition will be false.
I wonder whether the standard library istream implementations may behave in a similar way, such that in some cases the read actually succeeds but the condition evaluates to false?
All iostream input functions start by creating a sentry object which checks the good state of the stream. If the state is not good (that is, any of badbit, eofbit or failbit is set), then the sentry will set the failbit and return false, which will cause the input function to return immediately without reading anything, without modifying (or even checking) the stream buffer, and without writing anything into the destination of the input function.
So, by definition, if the stream test in the while is false, either the stream was already in a failed or EOF state before the call (and nothing was read into the string s), or the call itself failed to read anything into the string and set the failure state. In either case, nothing was read into the string.
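A minimal sketch (my own illustration, not from the book) of that sentry behaviour, using setstate to stand in for any earlier failure:
#include <iostream>
#include <sstream>
#include <string>

int main() {
    std::istringstream ifs("hello");
    std::string s;
    ifs.setstate(std::ios_base::failbit);   // simulate a stream whose state is already not good
    if (!(ifs >> s)) {
        // The sentry saw the bad state, set failbit, and operator>> returned immediately:
        // the stream buffer was not touched and nothing was written into s.
        std::cout << "nothing was read, s = \"" << s << "\"\n";
    }
}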

Reading in one byte at a time with .get() [duplicate]

So I'm reading an input file that contains:
lololololololol
I need to read it in binary, one byte at a time, for something I'm doing later on. To do this I'm using get() to read each byte and storing it into a char. It seems to be working correctly except for the last char that it reads in. The vector that it is reading into contains:
lololololololol
�
I'm not quite sure what this last value is, but it's totally throwing off my final output. So my question is: is there a reason get() would read in a value or byte from my text document that is not there? Or is it reading in something that I don't know of?
code:
while(istr.good()) {
temp = istr.get();
input.push_back(temp);
}
It's reading the EOF (end-of-file) value: once there is nothing left to read, get() returns std::char_traits<char>::eof() (usually -1), which is not a byte from your file. You need to do the check after reading to avoid that value being inserted into the vector:
while(temp = istr.get(), istr.good()) // comma operator
input.push_back(temp);
Or you might use the second std::istream::get overload (the one taking a char&) and let istr itself be tested in the loop condition:
while(istr.get(temp))
input.push_back(temp);
Or try more advanced approaches. operator>> and std::getline would also work fine for this kind of input.
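Putting the second suggestion together into a complete sketch (the file name here is hypothetical):
#include <fstream>
#include <iostream>
#include <vector>

int main() {
    std::ifstream istr("input.txt", std::ios::binary);   // hypothetical file name
    std::vector<char> input;
    char temp;
    while (istr.get(temp))         // get(char&) fails at end of file, so the loop stops cleanly
        input.push_back(temp);     // only bytes actually read from the file end up here
    std::cout << input.size() << " bytes read\n";
}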

Is the inconsistency of C++'s istream::eof() a bug in the spec or a bug in the implementation?

The following program demonstrates an inconsistency in the way that std::istream (specifically in my test code, std::istringstream) sets eof().
#include <sstream>
#include <cassert>
int main(int argc, const char * argv[])
{
// EXHIBIT A:
{
// An empty stream doesn't recognize that it's empty...
std::istringstream stream( "" );
assert( !stream.eof() ); // (Not yet EOF. Maybe should be.)
// ...until I read from it:
const int c = stream.get();
assert( c < 0 ); // (We received garbage.)
assert( stream.eof() ); // (Now we're EOF.)
}
// THE MORAL: EOF only happens when actually attempting to read PAST the end of the stream.
// EXHIBIT B:
{
// A stream that still has data beyond the current read position...
std::istringstream stream( "c" );
assert( !stream.eof() ); // (Clearly not yet EOF.)
// ... clearly isn't eof(). But when I read the last character...
const int c = stream.get();
assert( c == 'c' ); // (We received something legit.)
assert( !stream.eof() ); // (But we're already EOF?! THIS ASSERT FAILS.)
}
// THE MORAL: EOF happens when reading the character BEFORE the end of the stream.
// Conclusion: MADNESS.
return 0;
}
So, eof() "fires" when you read the character before the actual end-of-file. But if the stream is empty, it only fires when you actually attempt to read a character. Does eof() mean "you just tried to read off the end?" or "If you try to read again, you'll go off the end?" The answer is inconsistent.
Moreover, whether the assert fires or not depends on the compiler. Apple Clang 4.1, for example, fires the assertion (raises eof() when reading the preceding character). GCC 4.7.2, for example, does not.
This inconsistency makes it hard to write sensible loops that read through a stream but handle both empty and non-empty streams well.
OPTION 1:
while( stream && !stream.eof() )
{
const int c = stream.get(); // BUG: Wrong if stream was empty before the loop.
// ...
}
OPTION 2:
while( stream )
{
const int c = stream.get();
if( stream.eof() )
{
// BUG: Wrong when c in fact got the last character of the stream.
break;
}
// ...
}
So, friends, how do I write a loop that parses through a stream, dealing with each character in turn, handles every character, but stops without fuss either when we hit the EOF, or in the case when the stream is empty to begin with, never starts?
And okay, the deeper question: I have the intuition that using peek() could maybe workaround this eof() inconsistency somehow, but...holy crap! Why the inconsistency?
The eof() flag is only useful to determine if you hit end of file after some operation. The primary use is to avoid an error message if reading reasonably failed because there wasn't anything more to read. Trying to control a loop or something using eof() is bound to fail. In all cases you need to check after you tried to read if the read was successful. Before the attempt the stream can't know what you are going to read.
The semantics of eof() essentially come down to "this flag gets set when a read caused the stream buffer to report failure". It isn't quite as easy to find this statement in the standard, if I recall correctly, but that is what it comes down to. At some point the standard also says that the stream is allowed to read more than it has to in some situations, which may cause eof() to be set when you don't necessarily expect it. One such example is reading a character: the stream may end up detecting that there is nothing following that character and set eof().
If you want to handle an empty stream, it's trivial: look at something from the stream and proceed only if you know it's not empty:
if (stream.peek() != std::char_traits<char>::eof()) {
do_what_needs_to_be_done_for_a_non_empty_stream();
}
else {
do_something_else();
}
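Built on that peek() check, a character-by-character loop handles empty and non-empty streams alike; a minimal sketch (the sample input is arbitrary):
#include <iostream>
#include <sstream>

int main() {
    std::istringstream stream("abc");    // behaves the same for an empty ""
    while (stream.peek() != std::char_traits<char>::eof()) {
        const int c = stream.get();      // guaranteed to yield a real character here
        std::cout << static_cast<char>(c);
    }
    std::cout << "\n";
}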
Never, ever check for eof alone.
The eof flag (which is the same as the eofbit bit flag in a value returned by rdstate()) is set when end-of-file is reached during an extract operation. If there were no extract operations, eofbit is never set, which is why your first check returns false.
However, eofbit is no indication as to whether the operation was successful. For that, check failbit|badbit in rdstate(). failbit means "there was a logical error", and badbit means "there was an I/O error". Conveniently, there's a fail() function that returns exactly rdstate() & (failbit|badbit). Even more conveniently, there's an operator bool() that returns !fail(). So you can do things like while (stream.read(buffer, sizeof buffer)) { ....
If the operation has failed, you may check eofbit, badbit and failbit separately to figure out why it has failed.
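A sketch of that kind of diagnosis, assuming an input that is present but malformed:
#include <iostream>
#include <sstream>

int main() {
    std::istringstream in("abc");   // data is present, but it is not an int
    int n;
    if (!(in >> n)) {               // operator bool() is !fail(), i.e. !(failbit | badbit)
        if (in.bad())
            std::cout << "I/O error (badbit)\n";
        else if (in.eof())
            std::cout << "ran out of data (eofbit)\n";
        else
            std::cout << "formatting error (failbit only)\n";   // this line prints for "abc"
    }
}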
What compiler / standard C++ library are you using? I tried it with gcc 4.6.3/4.7.2 and clang 3.1, and all of them worked just fine (i.e. the assertion does not fire).
I think you should report this as a bug in your tool-chain, since my reading of the standard accords with your intuition that eof() should not be set as long as get() is able to return a character.
It's not a bug, in the sense that it's the intended behavior. It is not the intent that you test eof() until after input has failed. Its main purpose is for use inside extraction functions, where, in early implementations, the fact that std::streambuf::sgetc() returned EOF didn't mean that it would do so the next time it was called: the intent was that any time sgetc() returned EOF (now std::char_traits<>::eof()), this would be memorized, and the stream would make no further calls to the streambuf.
Practically speaking: we really need two eof()s: one for internal use, as above, and another which will reliably state that failure was due to having reached end of file. As it is, given something like:
std::istringstream s( "1.23e+" );
s >> aDouble;
There's no way of detecting that the error is due to a format error rather than the stream not having any more data. In this case, the internal eof should return true (because we have seen end of file when looking ahead, and we want to suppress all further calls to the streambuf extractor functions), but the external one should be false, because there was data present (even after skipping initial whitespace).
If you're not implementing an extractor function, of course, you should never test ios_base::eof() until you've actually had an input failure. It was never the intent that this would provide any useful information (which makes one wonder why they defined ios_base::good(): the fact that it returns false if eof() is set means that it can provide no reliable information until fail() returns true, at which point we know that it will return false, so there's no point in calling it).
And I'm not sure what your problem is. Because the stream cannot know in advance what your next input will be (e.g. whether it will skip whitespace or not), it cannot know in advance whether your next input will fail because of end of file or not. The idiom adopted is clear: try the input, then test whether it succeeded or not. There is no other way, because no other alternative can be implemented. Pascal did it differently, but a file in Pascal was typed: you could only read one type from it, so it could always read ahead one element under the hood and return end of file if this read-ahead failed. Not having a predictive end of file is the price we pay for being able to read more than one type from a file.
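A sketch of that "1.23e+" case, showing why eof() alone cannot tell a format error from running out of data:
#include <iostream>
#include <sstream>

int main() {
    std::istringstream s("1.23e+");   // the answer's example: the exponent is cut off
    double aDouble;
    s >> aDouble;
    std::cout << std::boolalpha
              << "fail: " << s.fail()   // true: the text is not a valid double
              << ", eof: " << s.eof()   // typically also true: the parser looked ahead
              << "\n";                  // and hit end of input while doing so
    // eof() being set here does not mean "there was no data to read".
}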
The behavior is somewhat subtle. eofbit is set when an attempt is made to read past the end of the file, but that may or may not cause failure of the current extraction operation.
For example:
ifstream blah;
// assume the file got opened
int i, j;
blah >> i;
if (!blah.eof())
blah >> j;
If the file contains 142<EOF>, then the sequence of digits is terminated by end of file, so eofbit is set AND the extraction succeeds. Extraction of j won't be attempted, because the end of file has already been encountered.
If the file contains 142 <EOF>, then the sequence of digits is terminated by whitespace (extraction of i succeeds). eofbit is not set yet, so blah >> j will be executed, and it will reach end of file without finding any digits, so it will fail.
Notice how the innocuous-looking whitespace at the end of file changed the behavior.
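A sketch of the two cases, using string streams in place of the files:
#include <iostream>
#include <sstream>

int main() {
    std::istringstream a("142");    // digits terminated by end of file
    std::istringstream b("142 ");   // digits terminated by whitespace
    int i = 0, j = 0;
    a >> i;
    b >> j;
    std::cout << std::boolalpha
              << "a: eof=" << a.eof() << " fail=" << a.fail() << "\n"    // eof=true,  fail=false
              << "b: eof=" << b.eof() << " fail=" << b.fail() << "\n";   // eof=false, fail=false
}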

C++ IO file streams: writing from one file to another using operator<< and rdbuf()

I have a question about copying data from one file to another in C++ (fstream) using operator<<. Here is a code snippet that works for me:
#include <fstream>
#include <string>

void writeTo(std::string &fname, std::ofstream &out){
    std::ifstream in;
    in.open(fname.c_str(), std::fstream::binary);
    if(in.good()){
        out << in.rdbuf();
        in.close();
    }else{
        //error
    }
}
I would like to be certain that after writing, the end of the input file in stream in has been reached. However, if I test for in.eof(), it is false, even though inspecting the input and output files confirms that the entire contents have been properly copied over. Any ideas on how I should check in.eof()?
The eofbit is set when trying to read a character and none is available (i.e. you have already consumed everything in the stream). Apparently std::ostream::operator<<() does not attempt to read past the end of the input, so the bit is never set.
You should be able to get around this by attempting to access the next character: add in.peek() before you check in.eof(). I have tested this fix and it works.
The reason none of the status bits are set on the input stream is that you are reading through the streambuf, not the istream; the actual reading takes place in ostream::operator<<, which doesn't have access to the istream.
I'm not sure it matters, however. The input will be read until streambuf::sgetc returns EOF, which would cause the eofbit to be set in the istream if you were reading through the istream. The only thing that might prevent this if you were reading through the istream is if streambuf::sgetc threw an exception, which would cause badbit to be set in the istream; there is no other mechanism provided for an input streambuf to report a read error. So wrap your out << in.rdbuf() in a try ... catch block, and hope that the implementation actually does check for hardware errors. (I haven't checked recently, but a lot of early implementations totally ignored read errors, treating them as a normal end of file.)
And of course, since you're literally reading bytes (despite the <<, I don't see how one could call this formatted input), you don't have to consider the third possible source of errors, a format error (such as "abc" when inputting an int).
Try in.rdbuf()->sgetc() == EOF.
Reference: http://www.cplusplus.com/reference/iostream/streambuf/sgetc/
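A sketch combining the copy with that buffer-level check (the file names are hypothetical):
#include <fstream>
#include <iostream>

int main() {
    std::ifstream in("in.txt", std::ios::binary);    // hypothetical input file
    std::ofstream out("out.txt", std::ios::binary);  // hypothetical output file
    out << in.rdbuf();   // copies through the stream buffer; in's eofbit stays untouched
    // Ask the buffer itself whether anything is left to read:
    const bool exhausted = in.rdbuf()->sgetc() == std::char_traits<char>::eof();
    std::cout << "input exhausted: " << std::boolalpha << exhausted << "\n";
}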

eof() bad practice? [duplicate]

So I've been using the eof() function in a lot of my programs that require file input, and my professor said that it is fine to use, but a few people on SO have said that I shouldn't use it, without really specifying the reason. So I was wondering, is there a good reason?
You can use eof to test for the exact condition it reports - whether you have attempted to read past end of file. You cannot use it to test whether there's more input to read, or whether reading succeeded, which are more common tests.
Wrong:
while (!cin.eof()) {
cin >> foo;
}
Correct:
if (!(cin >> foo)) {
if (cin.eof()) {
cout << "read failed due to EOF\n";
} else {
cout << "read failed due to something other than EOF\n";
}
}
You shouldn't use it because if input fails for another reason, you can be screwed.
while(!file.eof()) {
int i;
file >> i;
doSomething(i);
}
What happens in the above loop if the contents of file are "WHAARRRRRRGARBL"? It loops infinitely, because it's not the end of the file, but you still can't extract an int from it.
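For contrast, a sketch of the same loop written the safe way (doSomething and the file name are stand-ins from the example above):
#include <fstream>
#include <iostream>

void doSomething(int i) { std::cout << i << "\n"; }   // stand-in for the real work

int main() {
    std::ifstream file("data.txt");   // hypothetical file name
    int i;
    while (file >> i)       // stops on EOF *and* when it hits "WHAARRRRRRGARBL"
        doSomething(i);     // i always holds a successfully read value here
}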
How are you using it? What .eof() will tell you is that the stream has already hit the end of file, that it has tried to read data that isn't there.
This isn't usually what you want. Usually, you want to know that you are hitting the end of file, or that you are about to hit the end of file, and not that you have already hit it. In a loop like:
while(!f.eof())
{
f >> /* something */;
/* do stuff */
}
you are going to attempt to read input and fail, then you are going to execute everything in /* do stuff */ anyway, and only then does the loop condition fail and the loop stop.
Another problem with that loop is that there's other ways input can fail. If there's an error condition on the file that isn't EOF, .fail() will return true but .eof() won't.
In case the above answers are confusing:
What people think it does (which is wrong):
eof() does not look to see whether the last portion you read included the last byte of the file.
It does not look to see whether the next byte is past the end of the file.
What it actually does:
eof() reports whether the last read attempted to go past the end of the file.
If eof() is true, you've already made a mistake. This explains the professor's statement. Use it for what it means: an error has already occurred.