fstream EOF unexpectedly throwing exception - c++

My question is very similar to a previous one. I want to open and read a file. I want exceptions thrown if the file can't be opened, but I don't want exceptions thrown on EOF. fstreams seem to give you independent control over whether exceptions are thrown on EOF, on failures, and on other bad things, but it appears that EOF tends to also get mapped to the bad and/or fail exceptions.
Here's a stripped-down example of what I was trying to do. The function f() is supposed to return true if a file contains a certain word, false if it doesn't, and throw an exception if (say) the file doesn't exist.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
bool f(const char *file)
{
ifstream ifs;
string word;
ifs.exceptions(ifstream::failbit | ifstream::badbit);
ifs.open(file);
while(ifs >> word) {
if(word == "foo") return true;
}
return false;
}
int main(int argc, char **argv)
{
try {
bool r = f(argv[1]);
cout << "f returned " << r << endl;
} catch(...) {
cerr << "exception" << endl;
}
}
But it doesn't work, because basic fstream reading using operator>> is evidently one of the operations for which EOF sets the bad or the fail bit. If the file exists and does not contain "foo" the function does not return false as desired, but rather throws an exception.

The std::ios_base::failbit flag is also set when an extraction is attempted after the file has reached its end; that is what makes the stream's boolean conversion return false and end the loop. You should set up an extra try-catch block in f() and rethrow the exception if it doesn't correspond to the end-of-file condition:
std::ifstream ifs;
std::string word;
ifs.exceptions(std::ifstream::failbit | std::ifstream::badbit);
try {
ifs.open(file);
while (ifs >> word) {
if (word == "foo") return true;
}
}
catch (std::ios_base::failure&) {
if (!ifs.eof())
throw;
}
return false;

If the goal is to throw an exception only in case of a problem when opening the file, why not write:
bool f(const char *file)
{
ifstream ifs;
string word;
ifs.open(file);
if (ifs.fail()) // throw only when needed
throw std::runtime_error("Cannot open file!"); // needs <stdexcept>; std::exception has no standard constructor taking a string
while (ifs >> word) {
if (word == "foo") return true;
}
return false;
}
You could of course set:
ifs.exceptions(ifstream::badbit);
before or after the open, to throw an exception in case something really bad happens during the reading.

basic_ios::operator bool() checks fail(), not !good(). Your loop tries to read one more word after EOF is reached. operator>>(stream&, string&) sets failbit if no characters were extracted. That's why you always exit with an exception.
It's hard to avoid that though. The stream reaches EOF state not when the last character is read, but when an attempt is made to read past the last character. If that happens in the middle of a word, then failbit is not set. If it happens in the beginning (e.g. if the input has trailing whitespace), then failbit is set. You can't really reliably end up in eof() && !fail() state.

Related

getline setting failbit along with eof

I am aware of the origin of this behavior since it has been very well explained in multiple posts here in SO, some notable examples are:
Why is iostream::eof inside a loop condition considered wrong?
Use getline() without setting failbit
std::getline throwing when it hits eof
C++ istream EOF does not guarantee failbit?
And it is also included in the std::getline standard:
3) If no characters were extracted for whatever reason (not even the discarded delimiter), getline sets failbit and returns.
My question is how does one deal with this behavior, where you want your stream to catch a failbit exception for all cases except the one caused by reaching the eof, of a file with an empty last line. Is there something obvious that I am missing?
A MWE:
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
void f(const std::string & file_name, char comment) {
std::ifstream file(file_name);
file.exceptions(file.failbit);
try {
std::string line;
while (std::getline(file, line).good()) {
// empty getline sets failbit throwing an exception
if ((line[0] != comment) && (line.size() != 0)) {
std::stringstream ss(line);
// do stuff
}
}
}
catch (const std::ios_base::failure& e) {
std::cerr << "Caught an ios_base::failure.\n"
<< "Explanatory string: " << e.what() << '\n'
<< "Error code: " << e.code() << '\n';
}
}
int main() {
f("example.txt", '#');
}
where example.txt is a tab-delimited file, with its last line being only the \n char:
# This is a text file meant for testing
0 9
1 8
2 7
EDIT:
while(std::getline(file, line).good()){...} replicates the problem.
Another way to avoid setting failbit is simply to refactor your if tests to detect an empty line. Since that is your final line in this case, you can simply return to avoid throwing the error, e.g.:
std::ifstream file (file_name);
file.exceptions (file.failbit);
try {
std::string line;
while (std::getline(file, line)) {
// detect empty line and return
if (line.size() == 0)
return;
if (line[0] != comment) {
std::stringstream ss(line);
// do stuff
}
}
}
...
Your other alternative is to check whether eofbit is set in the catch block. If eofbit is set, the read completed successfully. E.g.
catch (const std::ios_base::failure& e) {
if (!file.eof())
std::cerr << "Caught an ios_base::failure.\n"
<< "Explanatory string: " << e.what() << '\n'
<< "Error code: " /* << e.code() */ << '\n';
}
Edit: I misunderstood the OP, refer to David's answer above. This answer is for checking whether or not the file has a terminating newline.
At the end of your while (getline) loop, check for file.eof().
Suppose you just did std::getline() for the last line in the file.
If there is a \n after it, then std::getline() has read the delimiter and did not set eofbit. (In this case, the very next std::getline() will set eofbit.)
Whereas if there is no \n after it, then std::getline() has read EOF and did set eofbit.
In both cases, the very next std::getline() will trigger failbit and enter your exception handler.
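PS: the line if ((line[0] != comment) && (line.size() != 0)) { reads line[0] before checking whether line is empty. Since C++11 that read is well defined (it yields '\0'), but before C++11 it was UB for a non-const string, so the order of the conditions should be reversed.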

C++ ios::exceptions throws exception when everything is fine

Here is the code from cplusplus.com/reference
#include <iostream> // std::cerr
#include <fstream> // std::ifstream
int main () {
std::ifstream file;
file.exceptions ( std::ifstream::failbit | std::ifstream::badbit );
try {
file.open ("test.txt");
while (!file.eof()) file.get();
file.close();
}
catch (std::ifstream::failure e) {
std::cerr << "Exception opening/reading/closing file\n";
}
return 0;
}
My code is very similar
int main()
{
std::vector<int> numbers;
std::vector<std::ifstream *> ifs;
std::array<std::string, 3> files = {
"f1.txt", "f2.txt", "f3.txt"
};
for (int i = 0; i < files.size(); ++i) {
std::ifstream *_ifs = new std::ifstream;
ifs.push_back(_ifs);
ifs[i]->exceptions( std::ifstream::failbit | std::ifstream::badbit );
}
try {
int n;
std::string line;
for (int i = 0; i < files.size(); ++i) {
ifs[i]->open(files[i]);
while (!ifs[i]->eof()) {
std::getline(*ifs[i], line);
std::istringstream iss(line);
while (iss >> n) {
if (n % 3 == 0 && n % 5 == 0 && n % 7 == 0)
numbers.push_back(n);
}
}
ifs[i]->close();
}
} catch (std::ifstream::failure e) {
std::cerr << "Error reading from files: " << e.what() << std::endl;
return 1;
}
for (int i = 0; i < ifs.size(); ++i)
delete ifs[i];
ifs.clear();
std::cout << "Files have been read\n";
// Do something with numbers
// ...
}
The issue is that nothing is read. An exception is thrown almost immediately. If I remove failbit from exceptions everything works fine, but then no exceptions are thrown when the files are missing on Windows. On Ubuntu, without failbit, exceptions are thrown when the files are missing, and everything is read correctly. But with failbit, on Ubuntu as well, an exception is thrown at the beginning of reading and nothing is read. I tried to google it and found the example from cplusplus.com, and a Stack Overflow question where the answer was not to check for eof but to read this way instead: while(getline(ifs, line)) { /* do something with line */ }. I tried this, and it made no difference. Previously I handled these kinds of tasks by throwing user-defined classes. This time I decided to try the standard library for that, and it seems I am missing something.
The problem is that std::ios_base::failbit gets set when the end of the file is reached: the lines are read OK. However, once there are no further lines std::ios_base::failbit will get set: that is how the end condition is detected. As a result, only the first file is being read.
If you'd had output inside the loop reading the file you'd see that the lines are actually read. Since you filter the values read I'd guess you don't see any numbers read because none of the numbers provided matches the condition.
The check for eof() doesn't help, of course, as reading the last line will stop reading with the newline character right before reaching the end of file but it won't set std::ios_base::eofbit: the bit is only set when EOF is actually touched but that only happens with the next character read.
Since you should always check whether something was read after attempting to read, the condition while (!ifs[i]->eof()) is ill-advised (and it is a good example of why you should not use cplusplus.com but rather cppreference.com). Instead you should use
while (std::getline(*ifs[i], line))
You might get better results reading each of the files in their own try/catch blocks. Personally, I don't think exceptions and I/O fit well together and I have never had any production code setting an exception mask. I'd recommend staying clear of setting the exception mask for streams.

checking of the stream result?

I have two questions on the following simple code:
/*
test.cpp
© BS.
Example of creation of a class of a flow for which it is possible to assign the additional
characters interpreted as separators.
11/06/2013, Section 11.7
*/
//--------------------------------------------------------------------------------------------------
#include <iostream>
#include <sstream>
#include <algorithm>
#include <exception>
#include <string>
#include <vector>
using namespace std;
//--------------------------------------------------------------------------------------------------
class Punct_stream
// it is similar to istream flow, but the user can independently set separators.
{
private:
istream& source; // source of chars
istringstream buffer; // buffer for formatting
string white; // whitespaces
bool sensitive; // case sensitive
public:
Punct_stream(istream& is): source(is), sensitive(true) {}
void whitespace(const string& s) { white = s; }
void add_white(char c) { white += c; }
void case_sensitive(bool b) { sensitive = b; }
bool is_case_sensitive() { return sensitive; }
bool is_whitespace(char c);
Punct_stream& operator >> (string& s);
operator bool();
};
//--------------------------------------------------------------------------------------------------
bool Punct_stream::is_whitespace(char c)
// is the c a whitespace?
{
for(int i = 0; i < white.size(); ++i) if(c == white[i]) return true;
return false;
}
//--------------------------------------------------------------------------------------------------
Punct_stream::operator bool()
// check the input result
{
return !(source.fail() || source.bad()) && source.good();
}
//--------------------------------------------------------------------------------------------------
Punct_stream& Punct_stream::operator >> (string& s){
while(!(buffer >> s)){ // try to read the data from the buffer
if(buffer.bad() || !source.good()) return *this;
buffer.clear();
string line;
getline(source,line); // read the line from the source
// if necessary we replace characters
for(int i = 0; i < line.size(); ++i)
if(is_whitespace(line[i])) line[i] = ' ';
else if(!sensitive) line[i] = tolower(line[i]);
buffer.str(line); // write a string line to the stream
}
return *this;
}
//==================================================================================================
int main()
// entry point
try{
Punct_stream ps(cin);
ps.whitespace(";:,.?!()\"{}<>/&$##%^*|~");
ps.case_sensitive(false);
cout << "Enter the words, please: ";
vector<string> vs;
string word;
while(ps >> word) vs.push_back(word); // enter words
sort(vs.begin(), vs.end());
// we delete counterparts
for(int i = 0; i < vs.size(); ++i) if(i == 0 || vs[i] != vs[i-1]) cout << vs[i] << endl;
}
catch(exception& e){
cerr << e.what() << endl;
return 1;
}
catch(...){
cerr << "Unknown exception." << endl;
return 2;
}
The checking in the Punct_stream::operator bool() function is not clear for me:
return !(source.fail() || source.bad()) && source.good();
My questions:
Why did the author check both 'fail' and 'bad'? Why not restrict the check to 'good' only? Doesn't a true 'good' already imply that 'fail' and 'bad' are false?
Besides, code often uses a construction like this: cin >> x; while(cin){//...}
Why didn't the author write it by analogy, like this:
Punct_stream::operator bool()
// check the input result
{
// return !(source.fail() || source.bad()) && source.good();
return source;
}
The alternative option shown by me doesn't work for me (Windows crashed), it would be desirable to understand why, what I missed?
Thank you.
The istream object contains a bunch of bit flags that represent its internal status. Of them, the ones that interest you are:
eofbit End-Of-File reached while performing an extracting operation on an input stream.
failbit The last input operation failed because of an error related to the internal logic of the operation itself.
badbit Error due to the failure of an input/output operation on the stream buffer.
goodbit No error. Represents the absence of all the above (the value zero).
Their status is represented in the following table
flag     meaning                                  good()  eof()  fail()  bad()
goodbit  No errors (zero value iostate)           true    false  false   false
eofbit   End-of-File reached on input operation   false   true   false   false
failbit  Logical error on i/o operation           false   false  true    false
badbit   Read/writing error on i/o operation      false   false  true    true
Respectively, good() is true only when none of the bits is set, eof() checks the eofbit, fail() checks the failbit or the badbit, and bad() checks the badbit.
So depending on what you are doing, each function can be used in different settings.
However, in your case, simply checking good() would be enough, as it is true only when the others are all false. Testing fail() or bad() at the same time is redundant.
source: http://www.cplusplus.com/reference/ios/ios/
EDIT:
Actually I'm not really sure why your alternative wouldn't work, as it works for me. What data did you exactly pass to the program?

Use getline() without setting failbit

Is it possible to use getline() to read a valid file without setting failbit? I would like to use failbit so that an exception is generated if the input file is not readable.
The following code always outputs basic_ios::clear as the last line - even if a valid input is specified.
test.cc:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
ifstream inf;
string line;
inf.exceptions(ifstream::failbit);
try {
inf.open(argv[1]);
while(getline(inf,line))
cout << line << endl;
inf.close();
} catch(ifstream::failure e) {
cout << e.what() << endl;
}
}
input.txt:
the first line
the second line
the last line
results:
$ ./a.out input.txt
the first line
the second line
the last line
basic_ios::clear
You can't. The standard says about getline:
If the function extracts no characters, it calls is.setstate(ios_base::failbit) which may throw ios_base::failure (27.5.5.4).
If your file ends with an empty line, i.e. last character is '\n', then the last call to getline reads no characters and fails. Indeed, how did you want the loop to terminate if it would not set failbit? The condition of the while would always be true and it would run forever.
I think that you misunderstand what failbit means. It does not mean that the file cannot be read. It is rather used as a flag for whether the last operation succeeded. To indicate a low-level failure the badbit is used, but it has little use for standard file streams. failbit and eofbit usually should not be interpreted as exceptional situations. badbit on the other hand should, and I would argue that fstream::open should have set badbit instead of failbit.
Anyway, the above code should be written as:
try {
ifstream inf(argv[1]);
if(!inf) throw SomeError("Cannot open file", argv[1]);
string line;
while(getline(inf,line))
cout << line << endl;
inf.close();
} catch(const std::exception& e) {
cout << e.what() << endl;
}

C++: Check istream has non-space, non-tab, non-newline characters left without extracting chars

I am reading a std::istream and I need to verify without extracting characters that:
The stream is not "empty", i.e. that trying to read a char will not result in a fail state (solved by using the peek() member function and checking the fail state, then setting it back to the original state)
That among the characters left there is at least one which is not a space, a tab or a newline char.
The reason for this is that I am reading text files containing, say, one int per line, and sometimes there may be extra spaces or newlines at the end of the file, which causes issues when I try to get the data back from the file into a vector of int.
A peek(int n) would probably do what I need but I am stuck with its implementation.
I know I could just read istream like:
while (myIstream >> myInt) {…} // will fail when I am at the end
but the same check would fail for a number of different conditions (say I have something which is not an int on some line) and being able to differentiate between the two reading errors (unexpected thing, nothing left) would help me to write more robust code, as I could write:
while (something_left(myIstream)) {
myIstream >> myInt;
if (myIstream.fail()) {…} // horrible things happened
}
Thank you!
There is a function called ws which eats whitespace. Perhaps you could call that after each read. If that hits eof, then you know you've got a normal termination. If it doesn't and the next read doesn't produce a valid int, then you know you've got garbage in your file. Maybe something like:
#include <fstream>
#include <iostream>
int main()
{
std::ifstream infile("test.dat");
while (infile)
{
int i;
infile >> i;
if (!infile.fail())
std::cout << i << '\n';
else
std::cout << "garbage\n";
ws(infile);
}
}
This is what I did to skip whitespace and detect EOF before the actual input:
char c;
if (!(cin >> c)) //skip whitespace
return false; // EOF or other error
cin.unget();
This is independent of what data you are going to read.
This code relies on the skipws flag being set by default for standard streams, but it can also be set manually: cin >> skipws >> c;
And simple
for(;;){
if(!(myIstream >> myInt)){
if(myIstream.eof()) {
//end of file
}else{
//not an integer
}
}
// Do something with myInt
}
does not work? Why do you need to know if there are numbers left?
Edit Changed to Ben's proposition.
The usual way to handle this situation is not to avoid reading from the stream, but to put back characters, which have been read, if needed:
void clean_input(std::istream& in); // defined below
int get_int(std::istream& in)
{
int n = 0;
while(true) {
if (in >> n)
return n;
clean_input(in);
}
}
void clean_input(std::istream& in)
{
if (in.fail()) {
in.clear();
// throw away (skip) pending characters in input
// which are non-digits
char ch;
while (in >> ch) {
if (isdigit(ch)) {
// stuff digit back into the stream
in.unget();
return;
}
}
}
error("No input"); // eof or bad
}