Example of Why stream::good is Wrong?

I gave an answer here in which I wanted to check the validity of the stream each time through a loop.
My original code used good and looked similar to this:
ifstream foo("foo.txt");
while (foo.good()){
    string bar;
    getline(foo, bar);
    cout << bar << endl;
}
I was immediately pointed here and told to never test good. Clearly this is something I haven't understood but I want to be doing my file I/O correctly.
I tested my code out with several examples and couldn't make the good-testing code fail.
First (this printed correctly, ending with a new line):
bleck 1
blee 1 2
blah
ends in new line
Second (this printed correctly, ending with the last line):
bleck 1
blee 1 2
blah
this doesn't end in a new line
Third was an empty file (this printed correctly, a single newline.)
Fourth was a missing file (this correctly printed nothing.)
Can someone help me with an example that demonstrates why good-testing shouldn't be done?

They were wrong. The mantra is 'never test .eof()'.
Why is iostream::eof inside a loop condition considered wrong?
Even that mantra is overboard, because both are useful to diagnose the state of the stream after an extraction failed.
So the mantra should be more like
Don't use good() or eof() to detect eof before you try to read any further
Same for fail(), and bad()
Of course stream.good can be usefully employed before using a stream (e.g. in case the stream is a filestream which has not been successfully opened)
However, both are very very very often abused to detect the end of input, and that's not how it works.
A canonical example of why you shouldn't use this method:
std::istringstream stream("a");
char ch;
if (stream >> ch) {
    std::cout << "At eof? " << std::boolalpha << stream.eof() << "\n";
    std::cout << "good? " << std::boolalpha << stream.good() << "\n";
}
Prints
At eof? false
good? true
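To see the other half of that picture, here is a minimal sketch that records eof() once after a successful read and once after a read that fails. The struct EofProbe and the function probe_eof are names made up for this illustration, and a string stream stands in for a real file.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper type: snapshots of the stream state around two reads.
struct EofProbe {
    bool first_read_ok;    // did the first extraction succeed?
    bool eof_after_first;  // eof() right after the successful read
    bool second_read_ok;   // did the second extraction succeed?
    bool eof_after_second; // eof() after the read that found nothing
};

EofProbe probe_eof(const std::string& text) {
    std::istringstream stream(text);
    char ch = '\0';
    EofProbe p{};
    p.first_read_ok = static_cast<bool>(stream >> ch);
    p.eof_after_first = stream.eof();
    p.second_read_ok = static_cast<bool>(stream >> ch);
    p.eof_after_second = stream.eof();
    return p;
}
```

For the input "a", the stream only reports eof() after the second extraction has already failed, which is why testing it before the read tells you nothing about the read.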

This is already covered in other answers, but I'll go over it briefly for completeness. The only functional difference between
while(foo.good()) { // effectively same as while(foo) {
    getline(foo, bar);
    consume(bar); // consume() represents any operation that uses bar
}
and
while(getline(foo, bar)){
    consume(bar);
}
is that the former will do an extra loop iteration when there are no lines in the file, making that case indistinguishable from the case of one empty line. I would argue that this is not typically desired behaviour. But I suppose that's a matter of opinion.
As sehe says, the mantra is overboard. It's a simplification. The real point is that you must not consume() the result of reading the stream before you test for failure, or at least EOF (and any test before the read is irrelevant). That is what people easily end up doing when they test good() in the loop condition.
However, the thing about getline() is that it tests for EOF internally for you, and leaves an empty string even if only EOF is read. Therefore, the former version is roughly similar to the following pseudo-C++:
while(foo.good()) {
    // inside getline
    bar = "";                  // Reset bar to empty
    string sentry;
    if(read_until_newline(foo, sentry)) {
        // The stream's state is tested implicitly inside getline
        // after the value is read. Good
        bar = sentry;          // The read value is used only if it's valid.
    }
    // ...                     // Otherwise, bar is empty.
    consume(bar);
}
I hope that illustrates what I'm trying to say. One could say that there is a "correct" version of the read loop inside getline(). This is why the rule is at least partially satisfied by the use of getline even if the outer loop doesn't conform.
But, for other methods of reading, breaking the rule hurts more. Consider:
while(foo.good()) {
    int bar;
    foo >> bar;
    consume(bar);
}
Not only do you always get the extra iteration, the bar in that iteration is uninitialized!
So, in short, while(foo.good()) is OK in your case because getline(), unlike certain other reading functions, leaves the output in a valid state after hitting EOF, and because you don't care about, or even expect, the extra iteration when the file is empty.
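The int case can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, a string stream replaces the file, and bar is given a sentinel value of -1 so that the stale read is visible rather than undefined behaviour.

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Loop shape from the answer above. `bar` gets a sentinel value so the
// result of the failed read can be observed safely.
std::vector<int> consume_with_good(std::istream& foo) {
    std::vector<int> consumed;
    while (foo.good()) {
        int bar = -1;             // sentinel, an assumption for this demo
        foo >> bar;               // may fail, but the value is used anyway
        consumed.push_back(bar);  // stands in for consume(bar)
    }
    return consumed;
}

// Read-as-condition: the body only runs after a successful extraction.
std::vector<int> consume_with_read(std::istream& foo) {
    std::vector<int> consumed;
    int bar;
    while (foo >> bar) {
        consumed.push_back(bar);
    }
    return consumed;
}
```

With the input "1 2 3\n", the good() version consumes four values, the last one being the untouched sentinel, while the read-as-condition version consumes exactly three.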

Both good() and eof() will give you an extra line in your output. If you have a blank file and run this:
std::ifstream foo1("foo1.txt");
std::string line;
int lineNum = 1;
std::cout << "foo1.txt Controlled With good():\n";
while (foo1.good())
{
    std::getline(foo1, line);
    std::cout << lineNum++ << line << std::endl;
}
foo1.close();
foo1.clear(); // reset eofbit/failbit; open() only does this itself since C++11
foo1.open("foo1.txt");
lineNum = 1;
std::cout << "\n\nfoo1.txt Controlled With getline():\n";
while (std::getline(foo1, line))
{
    std::cout << lineNum++ << line << std::endl;
}
The output you will get is
foo1.txt Controlled With good():
1
foo1.txt Controlled With getline():
This shows that the good() loop isn't working correctly: the loop body runs once even though the blank file contains no line to read. The only way to avoid that is to use the read itself as the loop condition, since a successfully opened stream is always good before its first read.

Using foo.good() just tells you that the previous read operation worked and that the next one might work as well. .good() checks the state of the stream at a given point. It does not, by itself, tell you that the end of the file has been reached. Let's say something happened while the file was being read (network error, OS error, ...): good() will return false. That does not mean the end of the file was reached. good() also returns false once the end of the file has been hit, because the stream cannot read any more.
On the other hand, .eof() reports specifically whether a read has hit the end of the file.
So .good() can return false even though the end of the file was not reached.
Hope this helps you understand why using .good() to check end of file is a bad habit.

Let me clearly say that sehe's answer is the correct one.
But the option proposed by Nathan Oliver, Neil Kirk, and user2079303, which is to use getline as the loop condition rather than good, needs to be addressed for the sake of posterity.
We will compare the loop in the question to the following loop:
string bar;
while (getline(foo, bar)){
    cout << bar << endl;
}
Because getline returns the istream passed as the first argument, because an istream converted to bool yields !fail() (which is false if either failbit or badbit is set), and because a getline that hits the end of the file before extracting any characters sets both the failbit and the eofbit, getline is a valid loop condition.
The behavior does change, however, when using getline as a condition: if a read finds nothing but the end of the file, the loop exits, so the empty line that the good loop would print is never output. This makes no difference in Examples 2 and 4. But Example 1:
bleck 1
blee 1 2
blah
ends in new line
Prints this with the good loop condition, followed by one extra empty line:
bleck 1
blee 1 2
blah
ends in new line
But drops that extra empty line with the getline loop condition:
bleck 1
blee 1 2
blah
ends in new line
Example 3 is an empty file. It prints a single empty line with the good condition and prints nothing with the getline condition.
Neither of these behaviors is wrong. But that last line can make a difference in code. Hopefully this answer will be helpful to you when deciding between the two for coding purposes.
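The comparison can be reproduced without files. The sketch below is only an illustration: the two function names are hypothetical, a string stream replaces foo.txt, and each loop collects the lines it would have printed into a vector so they can be counted.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Lines "printed" by the loop from the question (good() as the condition).
std::vector<std::string> lines_with_good(const std::string& contents) {
    std::istringstream foo(contents);
    std::vector<std::string> printed;
    while (foo.good()) {
        std::string bar;
        std::getline(foo, bar);
        printed.push_back(bar);
    }
    return printed;
}

// Lines "printed" when getline itself is the loop condition.
std::vector<std::string> lines_with_getline(const std::string& contents) {
    std::istringstream foo(contents);
    std::vector<std::string> printed;
    std::string bar;
    while (std::getline(foo, bar)) {
        printed.push_back(bar);
    }
    return printed;
}
```

For contents that end in a newline, the good() version yields one more (empty) line than the getline version; for an empty input it yields one empty line where the getline version yields none; for contents without a trailing newline the two agree.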

Related

stringstream.rdbuf causing cout to fail

I was surprised to see my program suddenly go quiet when I added a cout at some point, so I isolated the responsible code:
std::stringstream data;
data<<"Hello World\n";
std::fstream file{"hello.txt", std::fstream::out};
file<<data.rdbuf();
std::cout<<"now rdbuf..."<<std::endl;
std::cout<<data.rdbuf()<<std::endl;
std::cout<<"rdbuf done."<< std::endl;
The program quietly exits without the final cout. What is going on? If I change the last .rdbuf() to .str() instead then it completes.

During the call to std::cout<<data.rdbuf(), no characters can be read from data's stringbuf because the read position is already at the end of the buffer after the previous output; since nothing is inserted, this sets failbit on std::cout, and until this state is cleared any further output will fail too (i.e. your final line is essentially ignored).
std::cout<<data.str()<<std::endl; will not cause cout to enter a failed state because data.str() returns a copy of the underlying string regardless of where the read position is (for mixed-mode stringstreams anyway).
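A minimal sketch of that behaviour, and of one possible way out of it, follows. It is an illustration only: two std::ostringstream objects stand in for the file and for std::cout, and RdbufResult and insert_rdbuf_twice are hypothetical names. The recovery step (clear() on the destination, seekg(0) on the source) is one option, not the only one.

```cpp
#include <sstream>
#include <string>

// Hypothetical record of what happens to the destination stream.
struct RdbufResult {
    bool failed_on_second_insert; // failbit after inserting an exhausted buffer
    bool ok_after_recovery;       // state after clear() and rewinding the source
    std::string recovered_text;   // what the destination finally received
};

RdbufResult insert_rdbuf_twice() {
    std::stringstream data;
    data << "Hello World\n";

    std::ostringstream first;   // stands in for the file
    first << data.rdbuf();      // drains data's buffer

    std::ostringstream second;  // stands in for std::cout
    second << data.rdbuf();     // nothing left to insert, so failbit is set

    RdbufResult r;
    r.failed_on_second_insert = second.fail();

    second.clear();             // reset the destination's error state
    data.seekg(0);              // move the source's read position to the start
    second << data.rdbuf();     // this time there are characters to insert
    r.ok_after_recovery = second.good();
    r.recovered_text = second.str();
    return r;
}
```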

Function definition to count number of words

Write a function definition that counts the number of words in a line from your text source.
I tried two different codes and got two different results
int countwords()
{
    ifstream file("notes.txt");
    int count=0;
    char B[80];
    file>>B;
    while (!file.eof())
    {
        cout<<B<<endl;
        file>>B;
        count++;
    }
    return count;
}
This gives the desired answer.
The other way around :
int countwords()
{
    ifstream file("notes.txt");
    char B[80];
    int count=0;
    while (!file.eof())
    {
        file>>B;
        cout<<B<<endl;
        count++;
    }
    return count;
}
But this gives an answer which is 1 more than the actual number of words.
Can someone please explain the working of the eof() function and the difference in these two loops?

The second version of your code will always loop one extra time.
Think about this: what happens if file >> B fails? You'll still increment count.
Also, do not loop on eof() because you'll typically loop one too many times. (Why is iostream::eof inside a loop condition considered wrong?)
Instead, do the following:
while(file >> B)
{
    std::cout << B << std::endl;
    ++count;
}
This works because your filestream has an implicit conversion to bool that checks its state and yields false once a read has failed.
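Both loop shapes can be compared side by side. The sketch below is an illustration under assumptions: the function names are hypothetical, std::string replaces char B[80], and a string stream stands in for notes.txt.

```cpp
#include <istream>
#include <sstream>
#include <string>

// The eof()-controlled loop from the second version in the question.
int count_words_eof_loop(std::istream& file) {
    int count = 0;
    std::string word;
    while (!file.eof()) {
        file >> word;  // the last attempt fails...
        ++count;       // ...but is counted anyway
    }
    return count;
}

// Extraction as the loop condition: only successful reads are counted.
int count_words(std::istream& file) {
    int count = 0;
    std::string word;
    while (file >> word) {
        ++count;
    }
    return count;
}
```

For the text "one two three\n", the eof() loop reports 4 and the extraction-controlled loop reports 3.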

The problem is not eof() itself; however, to see how it works, read this.
As for your code, note the file>>B; before the loop in the first version. In the second version, the file>>B; in the last iteration fails but is still counted, so you get one more than the correct answer.

The reason for outputting 1 more than the actual number of words: in the 2nd version, the final file>>B fails because only the end of the file is left, yet the loop body still prints B and increments count before eof() is tested again. After that failed read, B does not hold a new word, so the extra output looks like an empty line or garbage. Unreliable code.
Also I would suggest using an std::string instead of char[80] as the type for your variable B.

Why does ofstream give me an echo ie. writes input twice [duplicate]

When I run the following code, and I write for example "Peter" then the result is that I get "PeterPeter" in the file.
Why?
#include "stdafx.h"
#include "iostream"
#include "iomanip"
#include "cstdlib"
#include "fstream"
#include "string"
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
    ofstream File2;
    File2.open("File2.dat",ios::out);
    string name;
    cout<<"Name?"<<endl;
    while(!cin.eof())
    {
        cin>>name;
        File2<<name;
    }
    return 0;
}
When I change the while loop to
while(cin>>name)
{
    File2<<name;
}
it works. But I don't understand why the first approach does not.
I can't answer my own question (as I don't have enough reputation). Hence I write my Answer here:
Ahhhh!!! Ok Thanks. Now I got it ^^
I have been testing with
while(!cin.eof())
{
    cin>>name;
    File2<<name;
    cout<<j++<<"cin.eof() "<<cin.eof()<<endl;
}
What happens is that when I type Ctrl+Z, execution is still inside the while loop. The variable name stays unchanged and is written to "File2" again by the next line of code.
The following works:
while(!cin.eof())
{
    cin>>name;
    if(!cin.eof()){File2<<name;}
    cout<<j++<<"cin.eof() "<<cin.eof()<<endl;
}

Ho hum, the millionth time this has been asked. This is wrong
while(!cin.eof())
{
    cin>>name;
    File2<<name;
}
eof() doesn't do what you think it does. You think it tells you whether you're at the end of file, right?
What eof() actually does is tell you why the last read you did failed. So it's something you call after you have done a read to see why it failed, not something you do before a read to see if it will fail. The return value of eof() when the last read has not failed is more complex. It depends on what you have been reading and how you've been reading it. You are trying to use eof() in a situation where there has been no failure and so the results can vary.
The short answer is don't do it like that, do it like this
while(cin >> name)
{
    File2<<name;
}
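The "PeterPeter" effect from the question can be reproduced with string streams. This is a sketch under assumptions: the function names are hypothetical, and std::istringstream and std::ostringstream stand in for cin and File2.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Loop from the question: writes `name` even when the read failed.
void copy_names_eof_loop(std::istream& in, std::ostream& out) {
    std::string name;
    while (!in.eof()) {
        in >> name;   // on the last pass this fails and leaves name unchanged
        out << name;  // so the previous value is written a second time
    }
}

// Extraction as the loop condition: one write per successful read.
void copy_names(std::istream& in, std::ostream& out) {
    std::string name;
    while (in >> name) {
        out << name;
    }
}
```

Fed "Peter\n", the first function writes "PeterPeter" and the second writes "Peter".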
BTW, sorry for the flippant tone, I am seriously interested to know why you wrote the code wrong in the first place. We see this mistake all the time, it seems almost every newbie makes the same mistake, so I am interested to understand where this mistake comes from. Did you see that code somewhere else, did someone teach you to write that, did it just seem right to you, etc. etc. If you could explain in your case I'd appreciate it.

The basic problem with using std::cin.eof() in the loop condition is that it tests the stream state before any attempt is made to read from the stream. At this point, the stream has no idea what you will attempt to read, and it can't make any prediction about what will be tried. The fundamental insight is: You always have to verify that reading data was successful after reading it!
A secondary problem is that eof() only tests one of multiple error conditions. Reading a std::string can only go wrong if there is no further data, but for most other data types there are also format failures. For example, reading an int can go wrong because there was a format mismatch. In that case the std::ios_base::failbit will be set and fail() would return true while eof() keeps returning false.
Testing the stream itself is equivalent to testing !fail(), which detects that something is wrong with the stream (fail() also returns true if the stream is bad()). Thus, the canonical approach for reading a file typically has one of the following forms:
while (input) {
    // multiple read operations go here
    if (input) {
        // processing of the read data goes here
    }
}
or
while (/* reading everything goes here */) {
    // processing of the read data goes here
}
Obviously, you can use a for-loop instead of a while-loop. Another interesting approach to reading data uses std::istream_iterator<T> and assumes that there is an input operator for the type T. For example:
for (std::istream_iterator<std::string> it(std::cin), end; it != end; ++it) {
    std::cout << "string='" << *it << "'\n";
}
None of these approaches uses eof() in the main reading loop. However, it is reasonable to use eof() after the loops to detect if the loop stopped because the end of the file was reached or because there was some formatting error.
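A minimal sketch of that post-loop check, assuming int input: StopReason and read_ints are hypothetical names introduced only for this illustration.

```cpp
#include <istream>
#include <sstream>
#include <vector>

enum class StopReason { EndOfInput, FormatError };

// Reads ints until an extraction fails, then asks eof() why the loop stopped.
StopReason read_ints(std::istream& input, std::vector<int>& values) {
    int value;
    while (input >> value) {
        values.push_back(value);
    }
    return input.eof() ? StopReason::EndOfInput : StopReason::FormatError;
}
```

With "1 2 3\n" the loop stops at the end of the input; with "1 2 x 4" it stops at the x with eof() still false. The check is a heuristic: malformed text at the very end of the input (for example a lone "-") can set both failbit and eofbit, so a stricter parser would need more than this.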

Why does in_avail() output zero even if the stream has some char?

#include <iostream>
int main( )
{
    using namespace std;
    cout << cin.rdbuf()->in_avail() << endl;
    cin.putback(1);
    cin.putback(1);
    cout << cin.rdbuf()->in_avail() << endl;
    return 0;
} //compile by g++-4.8.1
I expected this to output 0 and 2, but when I run the code, it outputs 0 and 0. Why?
Also, if I change cin.putback(1); to int a; cin >> a; with input 12 12, it still outputs 0 and 0.

Apparently it's a bug/feature of some compiler implementations
Insert the line
cin.sync_with_stdio(false);
somewhere near the beginning of code, and that should fix it
EDIT: Also remember that in_avail will always return 1 more than the number of chars in the input because it counts the end of input character.
EDIT2: Also as I just checked, putback does not work unless you have attempted to read something from the stream first, hence the "back" in "putback". If you want to insert characters into the cin, this thread will provide the answer:
Injecting string to 'cin'

What must have happened is that your putback didn't find any room in the streambuf get area associated with std::cin (otherwise a read position would have been available and egptr() - gptr() would have been non-zero) and must have gone to an underlying layer thanks to pbackfail.
in_avail() will call showmanyc() and zero (which is the default implementation of this virtual function) is a safe thing to return as it means that a read might block and it might fail but isn't guaranteed to do either. Obviously it is possible for an implementation to provide a more helpful implementation for showmanyc() in this case, but the simple implementation is cheap and conformant.
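For contrast, here is a sketch of in_avail() on a stream whose get area is fully known. It uses std::istringstream rather than std::cin, so it says nothing about how a particular implementation buffers standard input; AvailProbe and probe_in_avail are hypothetical names.

```cpp
#include <ios>
#include <sstream>
#include <streambuf>

// Hypothetical record of in_avail() at three points.
struct AvailProbe {
    std::streamsize initial;        // before any read
    std::streamsize after_get;      // after consuming one character
    std::streamsize after_putback;  // after putting that character back
};

AvailProbe probe_in_avail() {
    std::istringstream in("hello");
    AvailProbe p;
    p.initial = in.rdbuf()->in_avail();    // whole string sits in the get area
    char c = static_cast<char>(in.get());  // read position advances by one
    p.after_get = in.rdbuf()->in_avail();
    in.putback(c);                         // read position steps back by one
    p.after_putback = in.rdbuf()->in_avail();
    return p;
}
```

Here a read position is available throughout, so in_avail() is simply egptr() - gptr() and showmanyc() is never consulted.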

Different EOF behavior with read versus ignore

I was recently just tripped up by a subtle distinction between the behavior of std::istream::read versus std::istream::ignore. Basically, read extracts N bytes from the input stream, and stores them in a buffer. The ignore function extracts N bytes from the input stream, but simply discards them rather than storing them in a buffer. So, my understanding was that read and ignore are basically the same in every way, except for the fact that read saves the extracted bytes whereas ignore just discards them.
But there is another subtle difference between read and ignore which managed to trip me up. If you read to the end of a stream, the EOF condition is not triggered. You have to read past the end of a stream in order for the EOF condition to be triggered. But with ignore it is different: you only need to read to the end of a stream.
Consider:
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
    {
        std::stringstream ss;
        ss << "abcd";
        char buf[1024];
        ss.read(buf, 4);
        std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
    }
    {
        std::stringstream ss;
        ss << "abcd";
        ss.ignore(4);
        std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
    }
}
On GCC 4.4.5, this prints out:
EOF: false
EOF: true
So, why is the behavior different here? This subtle difference managed to confuse me enough to wonder why there is a difference. Is there some compelling reason that EOF is triggered "early" with a call to ignore?

eof() should only return true if you have already attempted to read past the end. In neither case should it be true. This may be a bug in your implementation.
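The read() half of that statement can be sketched as follows. ReadProbe and probe_read are hypothetical names; ignore() is deliberately left out of the sketch because, as the question shows, its result has differed between implementations.

```cpp
#include <sstream>

// Hypothetical record of the stream state around two read() calls.
struct ReadProbe {
    bool eof_after_exact_read;   // after reading exactly the 4 available bytes
    bool eof_after_extra_read;   // after asking for one more byte
    bool fail_after_extra_read;  // failbit after that second request
};

ReadProbe probe_read() {
    std::stringstream ss;
    ss << "abcd";
    char buf[8];
    ReadProbe p;
    ss.read(buf, 4);             // reads up to the end, but not past it
    p.eof_after_exact_read = ss.eof();
    ss.read(buf, 1);             // this attempt goes past the end
    p.eof_after_extra_read = ss.eof();
    p.fail_after_extra_read = ss.fail();
    return p;
}
```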

I'm going to go out on a limb here and answer my own question: it really looks like this is a bug in GCC.
The standard reads in 27.6.1.3 paragraph 23:
[istream::ignore] behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1). After constructing a sentry object, extracts characters and discards them. Characters are extracted until any of the following occurs:
if n != numeric_limits<streamsize>::max() (18.2.1), n characters are extracted
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit), which may throw ios_base::failure (27.4.4.3));
c == delim for the next available input character c (in which case c is extracted). Note: The last condition will never occur if delim == traits::eof()
My (somewhat tentative) interpretation is that GCC is wrong here, because of the bold parts above. Ignore should behave as an unformatted input function, (like read()), which means that end-of-file should only occur on the input sequence if there is an attempt to extract additional bytes after the last byte in the stream has been extracted.
I'll submit a bug report if I find that enough people agree with this answer.

The consensus seemed to be that this was a legitimate bug in gcc. Since I saw no indication a bug report had been filed, I'm doing so now. The report can be viewed at:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51651
