C++: Why are different EOF checks recommended for text vs. numeric data?

My textbook recommends using the member accessor method iStreamVar.eof() when dealing with textual data and while (iStreamVar) when dealing with numeric data.
Can someone please explain why it would matter?
Quote from book:
Using the function eof to determine the end-of-file status works best if the input is text. The earlier method of determining the end-of-file status works best if the input consists of numeric data.
That is the only thing mentioned on the topic. After this, it just explains how the process works.

Which method you use for determining the end of data depends on how you use it. My guess is, both methods which your textbook mentions are used wrong, so they fail in different situations. That's why it recommends using different methods in different situations.
The correct method is not trivial, and it depends on how important error resilience is for you.
If you want to read a space-delimited stream with numbers in it, and you are sure the file contains no errors, the code is simplest:
int value;
while (iStreamVar >> value)
{
...
}
Note that this is neither of the two original options.
If your file contains space-delimited textual data, and you are sure there are no errors, use the same code (but declare the temporary variable as string instead of int).
If you want to detect and recover from errors, use more elaborate code. But I cannot recommend any specific code structure, because it depends on what exactly you want to do in case of errors. Also:
Are text records delimited by space or newline?
What if the input text-file contains an empty line?
Numbers - floating-point or not?
Numbers - what should happen if a stray character such as 'a' appears among the numeric data?
So there is no single correct recipe for doing proper input with error resilience.
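As one possible illustration of a minimal error-aware variant, the sketch below sums whitespace-delimited integers and, once the loop has ended, looks at eof() to tell "input ran out" from "a token could not be parsed". The names ReadResult and sum_ints are hypothetical, chosen only for this example, and what to do about a bad token is deliberately left open.

```cpp
#include <istream>
#include <sstream>

// Hypothetical result type for this sketch.
struct ReadResult {
    long sum = 0;            // sum of all integers read successfully
    bool clean_eof = false;  // true if the failing read hit end of input
};

// Reads integers until the first failed extraction.
ReadResult sum_ints(std::istream& in) {
    ReadResult r;
    int value;
    while (in >> value) {    // the read itself is the loop condition
        r.sum += value;
    }
    // Only now, after a read has failed, is eof() meaningful:
    // set   -> the stream ran out of input,
    // unset -> a token such as "x" could not be converted.
    r.clean_eof = in.eof();
    return r;
}
```

For the input "1 2 3" this reports a sum of 6 and a clean end of input; for "1 2 x 3" it stops at the stray character with a sum of 3 and clean_eof set to false.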

Unless there is something significant in the context that isn't shown in the question, that quote is nonsense.
The way to read from a file and check for success is to read from the file:
// Read an int:
int data;
if (std::cin >> data)
    std::cout << "read succeeded, value is " << data << '\n';
// Or read a whitespace-delimited word:
std::string data;
if (std::cin >> data)
    std::cout << "read succeeded, value is " << data << '\n';
// Or read a whole line:
std::string data;
if (std::getline(std::cin, data))
    std::cout << "read succeeded, value is " << data << '\n';
If an attempted read fails you can call .eof() to find out whether the failure was because the input was at the end of the file. Contrary to what some beginners expect (and what some languages do), if .eof() returns false it does not mean that there is data remaining in the input stream. A successful read may have consumed the last value and left only a trailing newline. In that case .eof() will return false, but the next attempted read will fail, and after that, .eof() will return true.
std::stringstream input("1234\n");
int data;
input >> data; // succeeds, stops at the trailing newline
std::cout << input.eof() << '\n'; // outputs 0, end of file not seen yet
input >> data; // fails, no more input
std::cout << input.eof() << '\n'; // outputs 1, failed because at end of file

Related

How would one generalise `clearerr()` under C++?…

TL;DR
I am aware that if a program listens for EOF (e.g. ^D) as a sign to stop taking input, e.g. by relying on a conditional like while (std::cin) {...}, one needs to call cin.clear() before standard input can be read from again (readers who'd like to know more, see this table).
I recently learned that this is insufficient, and that the underlying C file descriptors, including stdin, need clearerr() to be run to forget EOF states.
Since clearerr() needs a C-style file descriptor, and C++ operates mainly with std::basic_streambufs and the like (e.g. cin), I want to generalise some code (see below) to run clearerr() on any streambuf's associated C-style file-descriptor, even if that may not be stdin.
EDITS (1&2):
I wonder if stdin is the only ever file-descriptor that behaves like this (needing clearerr() to run) ...?
If it isn't, then the following code should end the question of generalisation (idea pointed out by zkoza in their answer)
As zkoza pointed out in their comment below, stdin is the only file-descriptor that would, logically, ever need such treatment (i.e. clearerr()). Checking whether a given C++ stream is actually really attached to *std::cin.rdbuf() is all that is needed:
std::istream &theStream = std::cin; /* or some other stream with its own underlying streambuf */
if (theStream.rdbuf() == std::cin.rdbuf())
    clearerr(stdin);
Background
I'm writing a tool in C++ where I need to get multiple lines of user input, twice.
I know there are multiple ways of getting multiline input (e.g. waiting for double-newlines), but I want to use EOF as the user's signal that they're done — not unlike when you gpg -s or -e.
After much consultation (here, here, and on cppreference.com), I decided to use... (and I quote the third):
[the] idiomatic C++ input loops such as [...]
while(std::getline(stream, string)){...}
Since these rely on std::basic_ios::operator bool to do their job, I ensured that cin.rdstate() was cleared between the first and second user-input instructions (using cin.clear()).
The gist of my code is as follows:
std::istream& getlines (std::basic_istream<char> &theStream,
                        std::vector<std::string> &stack) {
    std::ios::iostate current_mask (theStream.exceptions());
    theStream.exceptions(std::ios::badbit);
    std::string &_temp (*new std::string);
    while (theStream) {
        if (std::getline(theStream, _temp))
            stack.push_back(_temp); // I'd really like the input broken...
                                    // ... into a stack of `\n`-terminated...
                                    // ... strings each time
    }
    // If `eofbit` is set, clear it
    // ... since std::basic_istream::operator bool needs `goodbit`
    if (theStream.eof())
        theStream.clear(theStream.rdstate()
                        & (std::ios::failbit | std::ios::badbit));
        // Here the logical AND with
        // ... (failbit OR badbit) unsets eofbit
    // std::getline sets failbit if nothing was extracted
    if (theStream.fail() && !stack.size()) {
        throw std::ios::failure("No input received!");
    }
    else if (theStream.fail() && stack.size()) {
        theStream.clear(theStream.rdstate() & std::ios::badbit);
        clearerr(stdin); // 👈 the part which I want to generalise
    }
    delete &_temp;
    theStream.exceptions(current_mask);
    return theStream;
}
This does what you need:
#include <cstdio>   // clearerr, stdin
#include <iostream>
int main()
{
    std::cin.sync_with_stdio(true);
    char c = '1', d = '1';
    std::cout << "Enter a char: \n";
    std::cin >> c;
    std::cout << (int)c << "\n";
    std::cout << std::cin.eof() << "\n";
    std::cin.clear();   // reset the C++ stream state
    clearerr(stdin);    // reset the EOF indicator of the C stream
    std::cout << std::cin.eof() << "\n";
    std::cout << "Enter another char: \n";
    std::cin >> d;
    std::cout << (int)d << "\n";
    std::cout << std::cin.eof() << "\n";
}
It works because C++'s std::cin is tied, by default, to C's stdin (so the sync_with_stdio call is actually not needed). You have to modify your code to check whether the stream is std::cin and, if so, perform clearerr(stdin);
EDIT:
Actually, sync_with_stdio ensures only synchronization between the C and C++ interfaces, but internally they work on the same file descriptors and this may be why clearerr(stdin); works whether or not the interfaces are tied by sync_with_stdio
EDIT2: Do these answer your question? Getting a FILE* from a std::fstream
https://www.ginac.de/~kreckel/fileno/ ?

Columned Text grabbing in c++

I have a file that looks like the following
61101
test
3 69.7139 65.3935 22.2632
3 69.7708 65.6131 21.467
2 69.8974 66.0987 20.7391
I want to skip the first two lines and average the last three columns over every row whose first column is not 4.
This is what I am trying at the moment, but it doesn't really seem to be working.
getline(frames_file,tempS);
getline(frames_file,tempS);
while(frames_file.good())
{
    if(typePart != 4)
    {
        frames_file >> typePart >> posPart[0] >> posPart[1] >> posPart[2];
        numLipid++;
        aPos[0] = aPos[0] + posPart[0];
        aPos[1] = aPos[1] + posPart[1];
        aPos[2] = aPos[2] + posPart[2];
    }
}
aPos[0] = aPos[0]/numLipid;
aPos[1] = aPos[1]/numLipid;
aPos[2] = aPos[2]/numLipid;
cout << " " << aPos[0] << " " << aPos[1] << " " << aPos[2];
This did not seem to grab any values.
I see multiple problems here.
getline(frames_file,tempS);
getline(frames_file,tempS);
while(frames_file.good())
Suppose the file only contained two lines. The expected result here is to just read those two lines, and finish.
Unfortunately, good() will still be true here, so the rest of the code will go off the rails. good(), generally speaking, indicates that the input stream's prior operations succeeded. Well, here the prior operations succeeded. You read the two lines successfully.
good() should be checked after attempting the next read operation, to determine if it succeeded, or failed.
if(typePart != 4)
{
frames_file >> typePart >> posPart[0] >> posPart[1] >> posPart[2];
Can you explain to your rubber duck how you expect typePart to hold a meaningful value before you read the line in question? Based on your question, as I understood it, you should be reading every line, and only performing calculations when the first column is not 4. Here, you test typePart before you have even read the line it belongs to.
To summarize:
Use std::getline() to read every line in your file, not just the first two. This sets up the next step:
AFTER each call to std::getline(), check either good() or eof() to determine whether std::getline() failed because you reached the end of the file. If so, stop at this point, otherwise:
Stuff the line you read into a std::istringstream, and use it to actually extract the pieces of the line using operator>>, if that's how you prefer to parse it. Or use any other approach to parsing that you prefer.
Check that typePart is not 4, and perform all of your calculations only AFTER you have extracted the parts of the line, not before.
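The steps above could be sketched roughly as follows. This is only an illustration: the names Average and average_positions are hypothetical, the two header lines are assumed to always be present, and malformed or blank rows are silently skipped, which may or may not be what you want.

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type: per-column averages plus the number of rows used.
struct Average {
    std::array<double, 3> pos{};  // zero-initialised accumulators
    int count = 0;
};

Average average_positions(std::istream& in, int skipType = 4) {
    Average avg;
    std::string line;
    std::getline(in, line);  // skip the first header line
    std::getline(in, line);  // skip the second header line
    // The read is the loop condition, so the body never runs on a failed read.
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int type;
        double x, y, z;
        if (!(fields >> type >> x >> y >> z))
            continue;        // malformed or empty row
        if (type == skipType)
            continue;        // decided AFTER the row was parsed
        avg.pos[0] += x;
        avg.pos[1] += y;
        avg.pos[2] += z;
        ++avg.count;
    }
    if (avg.count > 0)
        for (double& p : avg.pos)
            p /= avg.count;
    return avg;
}
```

Called with an open std::ifstream, it returns the three column averages in pos and the number of contributing rows in count, so a file with no usable rows can be detected instead of dividing by zero.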

Differences between eof and fail

I know there have been hundreds of questions about fail and eof, but none of them answered my question.
In this example, only eof bit will be set:
while (!Ifstream.eof()){
Ifstream >> Buffer;
cout << Buffer << endl;
}
But in this example, eof bit as well as fail bit will be set:
while (!Ifstream.fail()){
Ifstream >> Buffer;
cout << Buffer << endl;
}
What is the reason for this difference? I am only considering the situation in which the stream reaches the end of the file.
The difference is very slight.
The first piece of code tries reading as long as it doesn't hit the EOF condition. Then it breaks. But if for some reason an error occurs (e.g. a failure to convert data through the >> operator), this piece of code will loop indefinitely, since FAIL will be set on error, reading will stop, and EOF will never be hit.
The second piece of code works with a small trick. It reads as long as it can and stops when an error occurs. OK, logical. However, when the last item has been read, IIRC the EOF condition will not be set yet. So the loop will run once more, try to read at the end of the file, and that will raise the FAIL flag and break the loop. But that also means that you will get one extra round of processing (the cout << Buffer << endl line) after the read has already failed.
The right way to do it is to check immediately whether READING succeeded:
while (true){
if(!(Ifstream >> Buffer))
break;
cout << Buffer << endl;
}
Only that will guarantee that the loop stops as soon as a read fails, be it because of EOF, FAIL or some other cause.
As MatsPetersson and πάντα ῥεῖ have suggested, the above snippet may be "squashed" to form:
while (Ifstream >> Buffer)
cout << Buffer << endl;
and that's the usual way to write it. No fail/good/eof checks needed. The value returned from operator>> is the stream itself (like in operator<<) and a stream is testable for "truthiness", so it's just as simple as that. When talking about fail/good/eof differences, I like to write the longer form to emphasize that you shouldn't use fail/good/eof in the while-condition at all, and that you should check the state of the stream only after actually trying to read from it. while(true) shows the point very clearly.
fail is different from eof in that it covers various other error conditions than "file reached its end".
For example, if Buffer is int Buffer then the second will stop on reading ABC, where the first one will loop forever (not making any progress, as ABC is not numeric input).
Of course, the CORRECT thing to do is:
while(Ifstream >> Buffer)
{
cout << Buffer << endl;
}
that will stop both on EOF and invalid input (if applicable), as well as not performing the cout << Buffer << endl; when the fail or eof condition happens.
[Note that the while(!eof()) solution is valid in, for example, Pascal, because in Pascal the input is "pre-read", so that the current read knows if "the next read will result in EOF" before you actually TRY to read it. C and C++ don't mark EOF until you actually READ past the end of the file.]
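To make the "one extra iteration" effect concrete, here is a small sketch that counts how often the loop body runs under each style. It reads from a std::istringstream rather than a file, and the function names are hypothetical.

```cpp
#include <sstream>
#include <string>

// Counts loop-body executions when eof() is tested BEFORE each read.
int count_with_eof_check(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    int iterations = 0;
    while (!in.eof()) {
        in >> word;      // may fail, but the body still runs to the end
        ++iterations;
    }
    return iterations;
}

// Counts loop-body executions when the read itself is the condition.
int count_with_read_check(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    int iterations = 0;
    while (in >> word) { // body runs only after a successful read
        ++iterations;
    }
    return iterations;
}
```

For the two-word input "a b\n" the first function reports three iterations and the second reports two: the trailing newline keeps eofbit unset after "b" is read, so the eof-based loop goes round once more with a read that fails.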
Programmatically, 'EOF on read' and 'failure of read' describe different things.
EOF indicates the End Of File, so the programmer knows when to stop reading the file.
'fail', on the other hand, indicates that an operation did not succeed: it ended in a wrong state, or an error occurred while it was executing.

C++ istream operator>> bad-data handling

Every time I ask a question here on SO, it turns out to be some very dumb mistake (check my history if you don't believe me), so bear with me if you can here.
It feels like my question should be very popular, but I couldn't find anything about it and I've run out of ideas to try.
Anyway, without further ado:
I'm trying to overload the input operator>>. It's supposed to read one integer at a time from a file, skipping invalid data such as chars, floats, etc.
Naturally, I'm checking if(in >> inNum) to both get() the next token and check for successful get().
If successful, not much to say there.
If it fails, however, I assume that one of two things happened:
It stumbled upon a non-integer
It reached the eof
Here's how I tried to deal with it:
istream& operator>> (istream& in, SortSetArray& setB) {
    bool eof = false;
    int inNum = -1;
    while(!eof) {
        if(in >> inNum) {
            cout << "DEBUG SUCCESS: inNum = " << inNum << endl;
            setB.insert(inNum);
        }
        else {
            // check eof, using peek()
            // 1. clear all flags since peek() returns eof regardless of what
            // flag is raised, even if it's not `eof`
            in.clear();
            cout << "DEBUG FAIL: inNum = " << inNum << endl;
            // 2. then check eof with peek()
            eof = (in.peek() == std::char_traits<char>::eof());
        }
    }
    return in;
}
The file contains [1 2 3 4 a 5 6 7], and the program naturally goes into infinite loop.
Okay, easy guess, peek() doesn't consume the char 'a', and maybe in >> inNum also failed to consume it somehow. No biggie, I'll just try something that does.
And that's pretty much where I've been for the last 2 hours. I tried istream::ignore(), istream::get(), ios::rdstate to check eof, double and string instead of char in the file, just in case char is read numerically.
Nothing works and I'm desperate.
Weirdly enough, the approach above worked for a previous program where I had to read a triplet of data entries on a line of the format: string int int
The only difference is I used an ifstream object for that one, and an istream object for this one.
Bonus Question: inNum has the value of 0 when the hiccup occurs. I'm guessing it's something that istream::operator>> does?
Implementation description
try to read an int
if successful;
insert the read value to setB
next iteration
else;
clear error flags
check so that we haven't reached the end of the file
still more data? next iteration.
The above is the logic description of your function, but there's something missing...
When we try to read a value but fail, a std::istream handles this by setting the appropriate error flags, but it will not discard any data.
The problem with your implementation is that after failing to read invalid data, you will just try to read the same invalid data again, over and over, ad infinitum.
Solution
After clearing the error flags you can use std::istream::ignore to discard any data from the stream.
The function's 1st argument is the maximum number of chars to ignore, and the 2nd is the "if you hit this char, don't ignore any more" delimiter.
Let's ignore the maximum amount of characters, or until we hit ' ' (space):
#include <limits> // std::numeric_limits
in.ignore (std::numeric_limits<std::streamsize>::max(), ' ');
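In context, the clear-then-ignore step might be combined with an end-of-input check roughly as in the sketch below. The free function read_ints is a hypothetical stand-in for the question's operator>> (it collects into a std::vector instead of a SortSetArray), and it assumes tokens are separated by single spaces, as in the sample file.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the operator>> in the question:
// collects every integer it can read and skips tokens it cannot parse.
std::vector<int> read_ints(std::istream& in) {
    std::vector<int> values;
    int n;
    while (true) {
        if (in >> n) {
            values.push_back(n);
            continue;
        }
        if (in.eof())
            break;       // the failed read ran out of input: done
        in.clear();      // reset failbit so the stream is usable again
        // Discard the offending token, up to and including the next space.
        in.ignore(std::numeric_limits<std::streamsize>::max(), ' ');
    }
    return values;
}
```

With the question's input "1 2 3 4 a 5 6 7" this yields the seven integers and drops the "a". Note that a token such as "3.5" would contribute 3 before the remainder is discarded, so rejecting floats entirely would need token-by-token validation instead.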

Can't get ios::beg to go back to the beginning of the file

It always seems to be the things that should be no problem that cause problems for me. I don't get it. :/
So I'm trying to make sure that I understand how to manipulate text files. I've got two files, "infile.txt" and "outfile.txt". "infile.txt" has six numbers in it and nothing else. Here is the code I used to manipulate the files.
#include<fstream>
using std::ifstream;
using std::ofstream;
using std::fstream;
using std::endl;
using std::ios;
int main()
{
ifstream inStream;
ofstream outStream;//create streams
inStream.open("infile.txt", ios::in | ios::out);
outStream.open("outfile.txt");//attach files
int first, second, third;
inStream >> first >> second >> third;
outStream << "The sum of the first 3 nums is " << (first+second+third) << endl;
//make two operations on the 6 numbers
inStream >> first >> second >> third;
outStream << "The sum of the second 3 nums is " << (first+second+third) << endl;
inStream.seekg(0); //4 different ways to force the program to go back to the beginning of the file
//2. inStream.seekg(0, ios::beg);
//3. inStream.seekg(0, inStream.beg);
//4. inStream.close(); inStream.open("infile.txt");
//I have tried all four of these lines and only #4 works.
//There has got to be a more natural option than just
//closing and reopening the file. Right?
inStream >> first >> second >> third;
outStream << "And again, the sum of the first 3 nums is " << (first+second+third) << endl;
inStream.close();
outStream.close();
return 0;
}
Maybe I don't understand quite how the stream works, but I've seen a few sources that said that seekg(0) should move the index back to the start of the file. Instead, this is what I get out of it.
The sum of the first 3 nums is 8
The sum of the second 3 nums is 14
And again, the sum of the first 3 nums is 14
It went back, but not nearly in the way I would have hoped. Any idea why this happened? Why did my first three attempts fail?
As Bo Persson states, it may be because your input has
encountered end of file; it shouldn't, because in C++, a text
file is defined as being terminated by a '\n', but practically
speaking, if you're working under Windows, a lot of ways of
generating a file will omit this final '\n'—although it
is formally required, practical considerations will mean that
you'll make sure that it works even if the final '\n' is
missing. And I can't think of any other reason off hand why the
seekg's wouldn't work. inStream.seekg( 0 ) is, of course,
undefined behavior, but in practice, it will work pretty much
everywhere. inStream.seekg( 0, ios::beg ) is guaranteed to
work if inStream.good(), and is, IMHO, preferable to the
first form. (The single argument form of seekg is normally
only used with the results of a tellg as an argument.) And of
course, it only works if the actual input source supports
seeking: it won't work if you're reading from a keyboard or
a pipe (but presumably, "infile.txt" is neither).
In general, you should check the status of inStream after each
read, before using the results. But if the only problem is that
the file doesn't end with '\n', it's probable that the status
will be OK (!fail()) after the final read, even if you've
encountered end of file. In which case, you'll need clear()
anyway.
Note that the above comments are valid for C++03 and earlier.
C++11 has changed the specification of the single argument form
of seekg, and requires it to reset eofbit before anything
else. (Why is this change only for the single argument form of
seekg, and not the two argument form? Oversight?)
The second input reaches end-of-file for the stream. That state sticks until you call inStream.clear() to clear its state (in addition to the seek).
With a C++11 compliant compiler, option 4 should also work as close and reopen will now clear the previous state. Older compilers might not do that.
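A minimal sketch of the clear-then-seek sequence is shown below. It uses a std::istringstream so that it is self-contained; the same two calls apply to the question's ifstream. The helper names rewind_stream and sum_next_three are hypothetical.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: reads three integers and returns their sum.
int sum_next_three(std::istream& in) {
    int a = 0, b = 0, c = 0;
    in >> a >> b >> c;
    return a + b + c;
}

// Hypothetical helper: go back to the start of a seekable stream.
void rewind_stream(std::istream& in) {
    in.clear();                  // forget eofbit/failbit first
    in.seekg(0, std::ios::beg);  // then move the read position
}
```

Reading three numbers twice from a six-number input consumes everything and sets eofbit on the last read; after rewind_stream, the next call to sum_next_three returns the sum of the first three numbers again.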
Try:
inStream.seekg(0, ios_base::beg);