No `while (!my_ifstream.eof()) { getline(my_ifstream, line) }` in C++? - c++

On this website, someone writes:
while (! myfile.eof() )
{
    getline (myfile,line);
    cout << line << endl;
}
This is wrong; read the documentation for the eof() member function
carefully. The correct code is this:
while( getline( myfile, line))
    cout << line << endl;
Why is this?

There are two primary reasons. @Etienne has pointed out one: reading could fail for some reason other than reaching the end of the file, in which case your first version will go into an infinite loop, because eof() never becomes true.
Even with no other failures, however, the first won't work correctly. eof() isn't set until a read has actually tried to go past the end of the file. That means that, when the file ends with a newline (as most text files do), the first loop will execute one extra iteration that you don't really want. In this case, that'll just end up printing an extra blank (empty) line at the end of the output. Depending on what you're working with, that may or may not matter. Depending on what you're using to read the data, it's also fairly common to see the last line repeated in the output.

A stream operation (such as reading) can fail for multiple reasons. eof() tests just one of them. To test them all, simply use the stream's conversion to bool (operator void* before C++11, explicit operator bool since), which is false whenever failbit or badbit is set. That's what's done in the second snippet.
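The difference between the two loops can be checked with a small sketch. It uses std::istringstream in place of a file, and the function names count_with_eof_test and count_with_read_test are made up for this illustration; each returns how many times the loop body ran.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Counts iterations when the loop tests eof() BEFORE reading (the pattern to avoid).
std::size_t count_with_eof_test(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::size_t iterations = 0;
    while (!in.eof()) {
        std::getline(in, line);  // may fail, but the result is never checked
        ++iterations;
    }
    return iterations;
}

// Counts iterations when the read itself is the loop condition.
std::size_t count_with_read_test(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::size_t iterations = 0;
    while (std::getline(in, line)) {  // body runs only after a successful read
        ++iterations;
    }
    return iterations;
}
```

For the two-line input "a\nb\n", the eof()-based loop runs its body three times, while the getline-based loop runs it twice.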

Related

How can a C++ program know before executing a getline(in_file,my_string) that this call will be a read past end of file?

Here is the (over-simplified) code that is reading past end-of-file.
std::ifstream in_file;
in_file.open(argv[1],ios::in);
getline(in_file,line_any);
while (!in_file.eof())
{
    record_count++;
    //:
    //: lots of stuff
    //:
    std::cout << "before getline " << line_any << " >" << in_file.eof() << " >" << in_file.good() << " \n";
    getline(in_file,line_any);
    std::cout << " after getline " << line_any << " >" << in_file.eof() << " >" << in_file.good() << " \n";
}
Just before the getline(), the eof() is false and the good() is true, then the getline raises an exception. I know that I can change this to handle the exception, but why is the eof() false and the good() true?
Here is a clue.
If the last line of the input file ends with a carriage return and line feed, then the exception happens. If the last line does not end with a CRLF, then no exception is raised.
I am using the Embarcadero 7.3 compiler.
Any and all comments are welcome.
Instead of a computer reading a file, let's imagine a human reading a book. Let's make it second-personal, meaning I'll refer to this hypothetical human as "you". :)
So you are reading a book, but the book does not make a lot of sense to you. You read a page, think about it, then repeat. Now try to picture what is going on, step-by-step. You reach the end of a page, the second of the two pages you can see with the book lying flat. You think about it. It doesn't make sense. Is there more to read?
Well, if the text ended in the middle of a page (analogous to the last line not ending with an end-of-line marker), you can see that the text has ended. However, if the text went all the way to the end of the page, how do you know if there is text on the next page until you turn the page? This book happens to have several blank pages at its end, so simply looking at the thickness of what is left is inconclusive. There is no "The End" or other mark indicating the end of the book; the text simply ends once it reaches its conclusion. And you do not understand the meaning well enough to recognize the conclusion.
This is close to what getline does. You read to the end of the page; getline reads to the end-of-line marker. Neither you nor getline has any way to comprehend what you read, so you cannot guess if the text has finished based solely upon what was read. In order to see if there is more to read, you had to turn to the next page. In order to see if there is more to read, getline had to go to the next line. If the book ends at the end of a page, you cannot announce that you are done reading until you try to read something that is not there. If the file ends at the end of a line, getline cannot announce that it is done reading until it tries to read something that is not there.
So, correct. If all of your lines have end-of-line markers, then you will face a situation where there is no indication that you are at the end of the file until you try to read past the end of the file. The flags tell you that before trying to read past the end of the file, the stream is in good condition and the end of the file has not yet been seen. You might be at the end of the file, but that remains to be seen.
Yes, it would have been possible to implement the reading a different way. However, looking ahead is extra overhead that the language designers did not want to force on getline.
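A small sketch of this behaviour, using std::istringstream instead of a file. The helper names eof_set_after_one_line and has_more_input are invented for the illustration. The second helper shows one way to "turn the page" on purpose: peek() looks at the next character without consuming it, at the cost of setting eofbit itself when nothing is left.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one line and reports whether eof() is already set afterwards.
bool eof_set_after_one_line(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);
    return in.eof();
}

// Looks one character ahead without consuming it.
// Returns false when there is nothing left (or the stream is not readable).
bool has_more_input(std::istream& in) {
    return in.peek() != std::char_traits<char>::eof();
}
```

With the input "last line\n", eof() is still false after the line has been read; with "last line" (no end-of-line marker), getline runs into the end of the data and eof() is already true.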

Reading file line by line with C++ always output 2 lines even if file is different

I need to store each line of a text document in a vector. However, whatever text file I try, the output is always 2 lines. The first one is empty and the second one always says "DONE". I'm on Windows 7 x64, using VC++ 2013.
I have been trying to solve this for many hours. I tried many different approaches but the result stays the same. I suspect that "DONE" is the return value from getline(), but I don't understand why my code is not working like it should.
int main() {
    ifstream hFile("test.txt");
    vector<string> lines;
    string line;
    while (std::getline(hFile, line))
        lines.push_back(line);
    cout << lines[1];
    hFile.close();
    getchar();
    return 0;
}
EDIT: It works fine when executing the program from the compilation folder but not in the debug console of VC++...
The program looks mostly correct. The only problem is that your code assumes that there are at least two lines in your file: if there are fewer lines, e.g., just one, or the file couldn't be opened, the statement
cout << lines[1];
results in undefined behavior. Did you mean to print each line of the file rather than just the second line?
From the description of the behavior I would suspect that your file either contains the string DONE or you are actually executing a different program!
Be careful, this proves nothing about the number of lines:
cout << lines[1];
Use lines.size() to count the lines that were read. In fact, for a file with only one line, accessing the second element is undefined behavior.

Meaning of cin.fail() in C++?

while (!correct)
{
    cout << "Please enter an angle value => ";
    cin >> value; //request user to input a value
    if(cin.fail()) // LINE 7
    {
        cin.clear(); // LINE 9
        while(cin.get() != '\n'); // LINE 10
        textcolor(WHITE);
        cout << "Please enter a valid value. "<< endl;
        correct = false;
    }
    else
    {
        cin.ignore(); // LINE 18
        correct =true;
    }
}
Hi, this is part of the code that I have written.
The purpose of this code is to restrict users to entering numbers like 10, 10.00 etc.;
if they enter values like (abc, !$#, etc...) the code asks them to re-enter the value.
To implement this (restricting the user to valid values), I got some tips and guidance from forums.
I think it is my responsibility to learn and understand what this code does, since this is the first time I have used it.
Can someone briefly explain to me what the code in
lines 7, 9, 10, and 18 does? Especially line 10. I have a rough idea about the other lines, but I don't know what line 10 does.
Thanks for your guidance, I appreciate it!
cin.fail() tells you if "something failed" in a previous input operation. I believe there are four recognised states of an input stream: bad, good, eof and fail (but fail and bad can be set at the same time, for example).
cin.clear() resets the state to good.
while(cin.get() != '\n') ; will read until the end of the current input line.
cin.ignore(); with no arguments discards a single character, here the newline left behind after the number. It is only equivalent to while(cin.get() != '\n'); when nothing else follows the number on that line; cin.ignore(numeric_limits<streamsize>::max(), '\n') skips the whole rest of the line.
The whole code should check for end of file too, or it will hang (loop forever with failure) if no correct input is given and the input is "ended" (e.g. CTRL-Z or CTRL-D depending on platform).
// LINE 7: cin.fail() detects whether the value entered fits the value defined in the variable.
// LINE 18: cin leaves the newline character in the stream. Adding cin.ignore() to the next line clears/ignores the newline from the stream.
For line 7 and line 9, read the documentation.
while(cin.get() != '\n'); // LINE 10
This loop reads and discards one character at a time until it has consumed a newline, i.e., it throws away the rest of the current input line.
Line 7: tests whether the entered data could be read as decltype(value). cin.fail() is true if failbit or badbit is set, i.e., if the last read failed.
Line 9: you clear the error state of cin, returning it to the normal (good) state. You cannot read any more data until the stream has recovered from the failed state.
Line 10: you read until the end of the line. Basically you skip the rest of the current line of input.
Line 18: this line executes only if the entered data is correct. It reads and discards one character from stdin (normally the newline that follows the number).
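while(cin.get() != '\n'): '\n' is the newline character that marks the end of a line of input; it is not the null terminator '\0' that ends C-style strings. Let's say the user types 10 and presses Enter: the stream then holds the characters '1', '0' and '\n', each stored as a char, and this loop keeps reading characters until it has consumed that '\n'. Please read along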
http://www.functionx.com/cpp/Lesson16.htm
cin.fail(): cin.fail() detects whether the value entered fits the value defined in the variable.
read: http://www.cplusplus.com/forum/beginner/2957/
cin.ignore(): Extracts characters from the input sequence and discards them
http://www.cplusplus.com/reference/istream/istream/ignore/
I know it's not usual in Stack Overflow to just list links, so I'll give a bit more detail, but this answer really boils down to a bunch of links.
For line 7, just google cin.fail. Here's a good reference, and what it says:
Returns true if either (or both) the failbit or the badbit error state flags is set for the stream.
At least one of these flags is set when some error other than reaching the End-Of-File occurs during an input operation.
failbit is generally set by an operation when the error is related to the internal logic of the operation itself; further operations on the stream may be possible. While badbit is generally set when the error involves the loss of integrity of the stream, which is likely to persist even if a different operation is attempted on the stream. badbit can be checked independently by calling member function bad:
One line translation: it tells you if there was an unexpected error while trying to read the input stream.
You can find similar references for cin.ignore, cin.clear and cin.get. The quick summary:
cin.ignore - ignore one single character present in the stream.
cin.clear - clear any error flags in the stream
cin.get - get one character at a time, until you hit the newline '\n' character.
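To make the flags concrete, here is a small sketch that attempts one int extraction from a string and records the stream state afterwards. StreamFlags and probe_int_read are names invented for this illustration.

```cpp
#include <sstream>
#include <string>

// Snapshot of the state flags after one attempted extraction.
struct StreamFlags {
    bool fail;              // failbit or badbit set
    bool bad;               // badbit set
    bool eof;               // eofbit set
    bool good_after_clear;  // good() once clear() has been called
};

// Tries to read an int from `text` and reports the resulting stream state.
StreamFlags probe_int_read(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;
    StreamFlags flags{in.fail(), in.bad(), in.eof(), false};
    in.clear();  // resets all error flags
    flags.good_after_clear = in.good();
    return flags;
}
```

For the input "hello 42" the extraction fails without reaching the end, so fail is true while bad and eof are false; for "42" the read succeeds but runs into the end of the data, so eof is true and fail is false.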
The standard input stream (cin) can fail due to a number of reasons.
For example, if value is an int, and the user enters a large number like
124812471571258125, cin >> value will fail because the number is too big to fit inside an int.
But:
There is a much simpler way to do what you want. You want the user to enter only valid floating-point values, e.g. 10 or 10.00, but no characters, right? So you can just do this:
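double value;
cout << "Please enter an angle value: " << endl;
// (cin >> value) is true only if the input could be parsed as a double
while (!(cin >> value)) {
    if (cin.eof()) break; // input ended: stop instead of looping forever
    cin.clear(); // reset failbit so cin can be read again
    cin.ignore(numeric_limits<streamsize>::max(), '\n'); // discard the bad line (needs <limits>)
    cout << "Please enter a valid value:" << endl;
}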
This does the same thing that your code does, but much more simply.
If you're interested in what other things can cause cin to fail, look here:
http://www.cplusplus.com/forum/beginner/2957/

C++ unexpected hangtime

I am trying to write a function in my program that loads a huge text file of 216,555 words and puts them as strings into a set. This works properly, but as expected, it will hang for a few microseconds while looping through the file. But there is something funky going on with my loop, and it's not behaving properly. Please take the time to read; I am sure there's a valid reason for this, but I have no idea what to search for.
The code, which is working by the way, is this:
ifstream dictionary;
dictionary.open("Dictionary.txt");
if(dictionary.fail())
{
    cout<<"Could not find Dictionary.txt. Exiting."<<endl;
    exit(0);
}
int i = 0;
int progress = 216555/50;
cout<<"Loading dictionary..."<<endl;
cout<<"< >"<<endl;
cout<<"<";
for(string line; getline(dictionary, line);)
{
    usleep(1); //Explanation below (not the hangtime)
    i++;
    if(i%progress == 0)
        cout<<"=";
    words.insert(line);
}
The for-loop gets every string from the file and inserts it into the set.
This is a console application, and I want the user to see the progress. It's not much of a delay, but I wanted to do it anyway. If you don't understand the code, I'll try to explain.
When the program starts, it first prints out "Loading dictionary...", and then a "<" and a ">" separated by 50 spaces. Then on the next line: "<" followed by a "=" for every 4331 words it loops through (216555/50). The purpose of this is so the user can see the progress. The wanted output half way through the loop would be like this:
Loading dictionary...
< >
<=========================
My problem is:
The correct output is shown, but not at the expected time. It prints out the full progress bar, but only after it has hung and is done with the loop. How is that possible? The correct number of '=' are shown, but they all pop out at the same time, AFTER it hangs for some microseconds. I added the usleep(1) to make the hang time a bit longer, but the same thing happened. The if-statement clearly works, or the '=' would never have shown at all, but it feels like my program is stacking the cout calls until after the entire loop.
The weirdest thing of all is that the last cout<<"<"; before the for-loop starts is also shown at the same time as the rest of its line, after the loop is finished. Why and how?
You never flush the stream, so the output just goes into a buffer.
Change cout<<"="; to cout<<"="<<std::flush;
You need to flush the output stream.
cout << "=" << std::flush;
The program is "stacking the cout-calls". It's called output buffering, and every major operating system does it.
When in interactive mode (as your program is intended to be used), output is buffered by line; that is, it will be forced to the terminal when a newline is seen. You can also have block-buffered (fixed number of bytes between forced outputs; used when piping output) and unbuffered.
C++ provides the std::flush stream modifier to force an output at any point. It can be used like this:
cout << "=" << std::flush;
This will slow down your program a bit; the point of buffering is for efficiency. As you'll only be doing it about 51 times, though, the slowdown should be negligible.

Whitespace at end of file causing EOF check to fail in C++

I am reading in data from a file that has three columns. For example the data will look something like:
3 START RED
4 END RED
To read in the data I am using the following check:
while (iFile.peek() != EOF) {
    // read in column 1
    // read in column 2
    // read in column 3
}
My problem is that the loop usually does an extra loop. I am pretty sure this is because a lot of text editors seem to put a blank line after the last line of actual content.
I did a little bit of Googling and searched on SO and found some similar situations such as Reading from text file until EOF repeats last line however I couldn't quite seem to adapt the solution given to solve my problem. Any suggestions?
EOF is not a prediction but an error state. Hence, you can't use it like you're using it now, to predict whether you can read Column 1, 2 and 3. For that reason, a common pattern in C++ is:
while (input >> obj1 >> obj2) {
    use(obj1, obj2);
}
All operator>>(istream& is, T&) return the inputstream, and when used in boolean context the stream is "true" as long as the last extraction succeeded. It's then safe to use the extracted objects.
Presuming iFile is an istream:
You should break out of the loop on any error, not only on EOF (which can be checked for with iFile.eof(), BTW), because this is an endless loop when a format failure puts the stream into a failed state other than EOF. It is usually necessary to break out of a reading loop in the middle of the loop, after the read has been attempted (successfully or not) and before the data is processed.
To make sure there isn't anything interesting coming anymore, you could, after the loop, reset the stream state and then try to read whitespace only until you reach EOF:
while( !iFile.eof() )
{
    iFile >> std::ws;
    string line;
    std::getline(iFile,line);
    if(!line.empty()) error(...);
}
If any of the reads fail (where you read the column data), just break out of the while loop. Presumably you are then at the end of the file and reading the last 'not correct' line
Maybe you'll consider it a good idea to handle whitespace and other invalid input then. Perhaps some basic validation of columns 1,2,3 would be desirable as well.
Don't worry about the number of times that you loop: just validate your data and handle invalid inputs.
Basically, check that you have three columns to read and, if you don't, decide whether that's because the file is over or because of some other issue.
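One way to do that kind of validation is to read whole lines and parse each one separately, so that a blank line and a line with missing columns can be told apart. This is only a sketch: Row, ParseResult, and parse_rows are hypothetical names, and counting malformed lines is just one possible way of handling them.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One record of the three-column file (field names are hypothetical).
struct Row {
    int id;
    std::string action;
    std::string color;
};

struct ParseResult {
    std::vector<Row> rows;      // lines that had all three columns
    std::size_t malformed = 0;  // non-blank lines that did not
};

ParseResult parse_rows(std::istream& in) {
    ParseResult result;
    std::string line;
    while (std::getline(in, line)) {
        // Skip lines that contain only whitespace (e.g. a trailing blank line).
        if (line.find_first_not_of(" \t\r") == std::string::npos) {
            continue;
        }
        std::istringstream fields(line);
        Row row{};
        if (fields >> row.id >> row.action >> row.color) {
            result.rows.push_back(row);
        } else {
            ++result.malformed;  // fewer than three columns, or a non-numeric first column
        }
    }
    return result;
}
```

The loop ends when getline fails at the end of the input, so the number of trailing blank lines no longer affects how many rows are produced.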