When I run the following code, and I write for example "Peter" then the result is that I get "PeterPeter" in the file.
Why?
#include "stdafx.h"
#include "iostream"
#include "iomanip"
#include "cstdlib"
#include "fstream"
#include "string"
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
ofstream File2;
File2.open("File2.dat",ios::out);
string name;
cout<<"Name?"<<endl;
while(!cin.eof())
{
cin>>name;
File2<<name;
}
return 0;
}
When I change the while loop to
while(cin>>name)
{
File2<<name;
}
it works. But I don't understand why the first approach does not.
I can't answer my own question (as I don't have enough reputation). Hence I write my Answer here:
Ahhhh!!! Ok Thanks. Now I got it ^^
I have been testing with
while(!cin.eof())
{
cin>>name;
File2<<name;
cout<<j++<<"cin.eof() "<<cin.eof()<<endl;
}
What happens is that when I type Ctrl+Z, execution is still inside the while loop. The read fails, so the variable name stays unchanged and is written to "File2" again by the next line of code.
The following is working:
while(!cin.eof())
{
cin>>name;
if(!cin.eof()){File2<<name;}
cout<<j++<<"cin.eof() "<<cin.eof()<<endl;
}
Ho hum, the millionth time this has been asked. This is wrong
while(!cin.eof())
{
cin>>name;
File2<<name;
}
eof() doesn't do what you think it does. You think it tells you whether you're at the end of file, right?
What eof() actually does is tell you why the last read you did failed. So it's something you call after you have done a read to see why it failed, not something you do before a read to see if it will fail. The return value of eof() when the last read has not failed is more complex. It depends on what you have been reading and how you've been reading it. You are trying to use eof() in a situation where there has been no failure and so the results can vary.
The short answer is: don't do it like that, do it like this:
while(cin >> name)
{
File2<<name;
}
BTW, sorry for the flippant tone, I am seriously interested to know why you wrote the code wrong in the first place. We see this mistake all the time, it seems almost every newbie makes the same mistake, so I am interested to understand where this mistake comes from. Did you see that code somewhere else, did someone teach you to write that, did it just seem right to you, etc. etc. If you could explain in your case I'd appreciate it.
The basic problem with using std::cin.eof() in the loop condition is that it tests the stream state before any attempt is made to read from the stream. At this point, the stream has no idea what the program will try to read, and it can't predict whether that attempt will succeed. The fundamental insight is: You always have to verify that reading data was successful after reading it!
A secondary problem is that eof() only tests one of multiple error conditions. Reading a std::string can only go wrong if there is no further data, but for most other data types there are also format failures. For example, reading an int can go wrong because there was a format mismatch. In that case the std::ios_base::failbit will be set and fail() would return true while eof() keeps returning false.
Testing the stream itself is equivalent to testing !fail(), which detects whether something is wrong with the stream (fail() also returns true when the stream is bad()). Thus, the canonical approach for reading a file typically has one of the following forms:
while (input) {
// multiple read operations go here
if (input) {
// processing of the read data goes here
}
}
or
while (/* reading everything goes here */) {
// processing of the read data goes here
}
Obviously, you can use a for-loop instead of a while-loop. Another interesting approach to reading data uses std::istream_iterator<T> and assumes that there is an input operator for the type T. For example:
for (std::istream_iterator<std::string> it(std::cin), end; it != end; ++it) {
std::cout << "string='" << *it << "'\n";
}
None of these approaches uses eof() in the main reading loop. However, it is reasonable to use eof() after the loop to detect whether the loop stopped because the end of the file was reached or because there was some formatting error.
I usually teach my students that the safe way to tackle file input is:
while (true) {
// Try to read
if (/* failure check */) {
break;
}
// Use what you read
}
This has saved me and many others from the classic, and usually wrong, loop:
while (!is.eof()) {
// Try to read
// Use what you read
}
But people really like this form of looping, so it has become common to see this in student code:
while (is.peek()!=EOF) { // <-- I know this is not C++ style, but this is how it is usually written
// Try to read
// Use what you read
}
Now the question is: is there a problem with this code? Are there corner cases in which things don't work exactly as expected? Ok, it's two questions.
EDIT FOR ADDITIONAL DETAILS: during exams you sometimes guarantee the students that the file will be correctly formatted, so they don't need to do all the checks and just need to verify if there's more data. And most of the time we deal with binary formats, which allow you to not worry about whitespace at all (because the data is all meaningful).
While the accepted answer is totally clear and correct, I'd still like someone to try to comment on the joint behavior of peek() and unget().
The unget() stuff came to my mind because I once observed (I believe it was on Windows) that by peeking at the 4096 internal buffer limit (so effectively causing a new buffer to be loaded), ungetting the previous byte (last of the previous buffer) failed. But I can be wrong. So that was my additional doubt: something known I missed, which maybe is well coded in the standard or in some library implementations.
is.peek()!=EOF tells you whether there are still characters left in the input stream, but it doesn't tell you whether your next read will succeed:
while (is.peek()!=EOF) {
int a;
is >> a;
// Still need to test `is` to verify that the read succeeded
}
is >> a could fail for a number of reasons, e.g. the input might not actually be a number.
So there is no point to this if you could instead do
int a;
while (is >> a) { // reads until failure of any kind
// use `a`
}
or, maybe better:
for (int a; is >> a;) { // reads until failure of any kind
// use `a`
}
or your first example, in which case the is.peek()!=EOF in the loop will become redundant.
This is assuming you want the loop to exit on every failure, following your first code example, not only on end-of-file.
I'm learning about how to work with files in C++ and as a beginner I've got some doubts that I would like to clarify :
In my book the author introduces the stream states and writes this simple piece of code to show how to read until we reach end of file or a terminator :
// somewhere make ist throw if it goes bad :
void fill_vector(istream& ist, vector<int>& v, char terminator)
{
ist.exceptions(ist.exceptions() | ios_base::badbit);
for (int i; ist >> i;) v.push_back(i);
if (ist.eof()) return; // fine: we found end of file
// not good() not bad() and not eof(), it must be fail()
ist.clear();
char c;
ist >> c; // read a character, hopefully terminator
if (c != terminator) { // not the terminator, so we must fail
ist.unget(); // maybe my caller can use that character
ist.clear(ios_base::failbit);
}
}
This was a first example, which provides a useful method to read data, but I'm having some issues with the second example where the author says :
Often, we want to check our read as we go along, this is the general strategy assuming that ist is an 'istream':
for (My_type var; ist >> var;) { // read until end of file
// maybe check that var is valid
// do something with var
}
if (ist.fail()) {
ist.clear();
char ch;
// the error function is created into the book :
if (!(ist >> ch && ch == '|')) error("Bad termination of input\n");
}
// carry on : we found end of file or terminator
If we don't want to accept a terminator (that is, to accept only the end of file as the end), we simply delete the test before the call of error().
Here's my doubt : In the first example we basically check for every possible state of the istream to be sure that the reading terminated as we wanted to, and that's ok. But I have problems in understanding the second example :
What does the author mean when he says to remove the test before the call of error?
Is it possible to avoid triggering both eof and fail when reading ? If yes, how ?
I'm really confused and I can't understand the example, because from the tests that I've done the failbit will always be set after eofbit. So what's the sense of checking for failbit if it will always be set? Why is the author doing that?
What would happen to the code if I remove the test before the call of error as the author says ? Wouldn't that be useless as I would only be checking for the bad state of the stream ?
I think I see what you mean. No, it's not really useless, because you would tell someone (I don't know what error actually does), the programmer (exception) or the user (standard output), that the input contained some invalid data, and someone has to act accordingly.
It may be useless, but that depends on what you want the function to do, if for example you want it to just silently ignore the error and use the correct data already processed, it really is useless.
How can I read data from a file until I just reach the end of that file without using any other terminator ?
I can't see what you mean, you are already doing that in both examples:
if (ist.eof()) return; // fine: we found end of file
and
if (ist.fail()) { // note: a failed read at end of file sets failbit too, so test eof() to tell the two cases apart
I gave an answer here in which I wanted to check the validity of the stream each time through a loop.
My original code used good and looked similar to this:
ifstream foo("foo.txt");
while (foo.good()){
string bar;
getline(foo, bar);
cout << bar << endl;
}
I was immediately pointed here and told to never test good. Clearly this is something I haven't understood but I want to be doing my file I/O correctly.
I tested my code out with several examples and couldn't make the good-testing code fail.
First (this printed correctly, ending with a new line):
bleck 1
blee 1 2
blah
ends in new line
Second (this printed correctly, ending with the last line):
bleck 1
blee 1 2
blah
this doesn't end in a new line
Third was an empty file (this printed correctly, a single newline.)
Fourth was a missing file (this correctly printed nothing.)
Can someone help me with an example that demonstrates why good-testing shouldn't be done?
They were wrong. The mantra is 'never test .eof()'.
Why is iostream::eof inside a loop condition considered wrong?
Even that mantra is overboard, because both are useful to diagnose the state of the stream after an extraction failed.
So the mantra should be more like
Don't use good() or eof() to detect eof before you try to read any further
Same for fail(), and bad()
Of course stream.good can be usefully employed before using a stream (e.g. in case the stream is a filestream which has not been successfully opened)
However, both are very very very often abused to detect the end of input, and that's not how it works.
A canonical example of why you shouldn't use this method:
std::istringstream stream("a");
char ch;
if (stream >> ch) {
std::cout << "At eof? " << std::boolalpha << stream.eof() << "\n";
std::cout << "good? " << std::boolalpha << stream.good() << "\n";
}
Prints
At eof? false
good? true
This is already covered in other answers, but I'll go over it briefly for completeness. The only functional difference with
while(foo.good()) { // effectively same as while(foo) {
getline(foo, bar);
consume(bar); // consume() represents any operation that uses bar
}
And
while(getline(foo, bar)){
consume(bar);
}
Is that the former performs one extra iteration when the file is empty (or ends with a newline), making an empty file indistinguishable from a file containing one empty line. I would argue that this is not typically desired behaviour. But I suppose that's a matter of opinion.
As sehe says, the mantra is overboard. It's a simplification. What really is the point is that you must not consume() the result of reading the stream before you test for failure or at least EOF (and any test before the read is irrelevant). Which is what people easily do when they test good() in the loop condition.
However, the thing about getline() is that it tests for EOF internally and leaves the string empty even if only EOF is read. Therefore, the former version is roughly similar to the following pseudo-C++:
while(foo.good()) {
// inside getline
bar = ""; // Reset bar to empty
string sentry;
if(read_until_newline(foo, sentry)) {
// The stream's state is tested implicitly inside getline
// after the value is read. Good
bar = sentry; // The read value is used only if it's valid.
} // Otherwise, bar is empty.
// end of getline
consume(bar);
}
I hope that illustrates what I'm trying to say. One could say that there is a "correct" version of the read loop inside getline(). This is why the rule is at least partially satisfied by the use of getline even if the outer loop doesn't conform.
But, for other methods of reading, breaking the rule hurts more. Consider:
while(foo.good()) {
int bar;
foo >> bar;
consume(bar);
}
Not only do you always get the extra iteration, the bar in that iteration is uninitialized!
So, in short, while(foo.good()) is OK in your case, because getline(), unlike certain other reading functions, leaves the output in a valid state after hitting EOF, and because you don't care about, or even expect, the extra iteration when the file is empty.
Both good() and eof() will give you an extra line in your output. If you have a blank file and run this:
std::ifstream foo1("foo1.txt");
std::string line;
int lineNum = 1;
std::cout << "foo1.txt Controlled With good():\n";
while (foo1.good())
{
std::getline(foo1, line);
std::cout << lineNum++ << line << std::endl;
}
foo1.close();
foo1.open("foo1.txt");
lineNum = 1;
std::cout << "\n\nfoo1.txt Controlled With getline():\n";
while (std::getline(foo1, line))
{
std::cout << line << std::endl;
}
The output you will get is
foo1.txt Controlled With good():
1
foo1.txt Controlled With getline():
This proves that it isn't working correctly since a blank file should never be read. The only way to know that is to use a read condition since the stream will always be good the first time it reads.
Using foo.good() just tells you that the previous read operation worked and that the next one might work as well. .good() checks the state of the stream at a given point. It does not check whether the end of the file has been reached. Let's say something happened while the file was being read (network error, OS error, ...): good() will return false. That does not mean the end of the file was reached. Nevertheless, .good() also returns false once end of file has been hit, because the stream is not able to read anymore.
On the other hand, .eof() checks if the end of file was truly reached.
So, .good() might fail while the end of file was not reached.
Hope this helps you understand why using .good() to check end of file is a bad habit.
Let me clearly say that sehe's answer is the correct one.
But the option proposed by Nathan Oliver, Neil Kirk, and user2079303, to use getline as the loop condition rather than good, needs to be addressed for the sake of posterity.
We will compare the loop in the question to the following loop:
string bar;
while (getline(foo, bar)){
cout << bar << endl;
}
Because getline returns the istream passed as the first argument, and because an istream converted to bool yields !(fail() || bad()), and since a read that hits end of file without extracting anything sets both the failbit and the eofbit, getline is a valid loop condition.
The behavior does change, however, when using getline as the condition: if a read extracts nothing before hitting end of file, the loop exits, so that final empty line is never output. This doesn't occur in Examples 2 and 4. But Example 1:
bleck 1
blee 1 2
blah
ends in new line
Prints this with the good loop condition, followed by one extra empty line:
bleck 1
blee 1 2
blah
ends in new line
But with the getline loop condition the output stops after the last line of text, without the extra empty line:
bleck 1
blee 1 2
blah
ends in new line
Example 3 is an empty file:
Prints a single empty line with the good condition.
Prints nothing with the getline condition.
Neither of these behaviors is wrong. But that last line can make a difference in code. Hopefully this answer will be helpful to you when deciding between the two for coding purposes.
I've got this code, which uses the cin.peek() method. I noticed strange behaviour: when the input to the program looks like qwertyu$[Enter] everything works fine, but when it looks like qwerty[Enter]$ it works only when I type a double dollar sign, qwerty[Enter]$$. On the other hand, when I use cin.get(char) everything works fine.
#include <iostream>
#include <cstdlib>
using namespace std;
int main()
{
char ch;
int count = 0;
while ( cin.peek() != '$' )
{
cin >> ch; //cin.get(ch);
count++;
}
cout << count << " liter(a/y)\n";
system("pause");
return 0;
}
//Input:
// qwerty$<Enter> It's ok
//////////////////////////
//qwerty<Enter>
//$ Doesn't work
/////////////////////////////
//qwerty<Enter>
//$$ works(?)
It's because your program won't get input from the console until the user presses the ENTER key (and then it won't see anything typed on the next line until ENTER is pressed again, and so on). This is normal behavior, there's nothing you can do about it. If you want more control, create a UI.
Honestly I don't think the currently accepted answer is that good.
Hmm, looking at it again, I think that since operator>> is a formatted input function and get() an unformatted one, the formatted version could be waiting for more input than one character to do some formatting magic.
I presume it is way more complicated than get() if you look at what it can do. I think >> will hang until it is absolutely sure it has read a char according to all the flags set, and then will return. Hence it can wait for more input than just one character. For example, you can specify skipws.
It clearly would need to look at more than one character of input to get a char from \t\t\t test.
I think get() is unaffected by such flags and will just extract a character from the input, which is why it is easier for get() to behave in a non-blocking fashion.
The reason why I consider the currently accepted answer wrong is that it states that the program will not get any input until [enter] or some other flush-like thing. In my opinion this is obviously not the case, since the get() version works. Why would it, if it did not get the input?
It probably still can block due to buffering, but I think it far less likely, and it is not the case in your example.
I was looking at this article on Cplusplus.com, http://www.cplusplus.com/reference/iostream/istream/peek/
I'm still not sure what peek() returns if it reaches the end of the file.
In my code, a part of the program is supposed to run as long as this statement is true
(sourcefile.peek() != EOF)
where sourcefile is my ifstream.
However, it never stops looping, even though it has reached the end of the file.
Does EOF not mean "End of File"? Or was I using it wrong?
Consulting the Standard,
Returns: traits::eof() if good() is false. Otherwise, returns rdbuf()->sgetc().
As for sgetc(),
Returns: If the input sequence read position is not available, returns underflow().
And underflow,
If the pending sequence is null then the function returns traits::eof() to indicate failure.
So yep, returns EOF on end of file.
An easier way to tell is that it returns int_type. Since the values of int_type are just those of char_type plus EOF, it would probably return char_type if EOF weren't possible.
As others mentioned, peek doesn't advance the file position. It's generally easiest and best to just loop on while ( input_stream ) and let failure to obtain additional input kill the parsing process.
Things that come to mind (without seeing your code).
EOF could be defined differently than you expect
sourcefile.peek() doesn't advance the file pointer. Are you advancing it manually somehow, or are you perhaps constantly looking at the same character?
EOF is for the older C-style functions. You should use istream::traits_type::eof().
Edit: viewing the comments convinces me that istream::traits_type::eof() is guaranteed to return the same value as EOF, unless by chance EOF has been redefined in the context of your source block. While the advice is still OK, this is not the answer to the question as posted.
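#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace std;
//myifstream_peek1.cpp
int main()
{
// get() returns an int so that EOF can be told apart from real characters
int c;
ifstream readtext2;
readtext2.open("mypeek.txt");
// first pass: print the whole file, stopping when get() reports EOF
while((c = readtext2.get()) != EOF)
{
cout << static_cast<char>(c);
}
readtext2.close();
//
ifstream readtext1;
readtext1.open("mypeek.txt");
// second pass: print up to the first ';', then show the character after it
while((c = readtext1.get()) != EOF)
{
if(c == ';')
{
// peek() looks at the next character without consuming it
int next = readtext1.peek();
if(next != EOF) { cout << static_cast<char>(next); }
exit(1);
}
else { cout << static_cast<char>(c); }
}
cout<<"\n end of ifstream peeking";
readtext1.close();
return 0;
}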
While this technically works, using ifstream::eof() would be preferable
as in
(!sourcefile.eof())