C++: is it safe to unget on fstreams?

Suppose input.txt is a 1-byte text file:
std::ifstream fin("input.txt", std::ios::in);
fin.get(); // 1st byte extracted
fin.get(); // try to extract 2nd byte
std::cout << fin.eof(); // eof is triggered
fin.unget(); // return back
std::cout << fin.eof(); // eof is now reset
fin.get(); // try to extract 2nd byte, eof assumed
std::cout << fin.eof(); // no eof is triggered
It seems that unget() breaks the triggering of the eof flag, and it also breaks the file position. Am I doing something wrong?

After unget(), eof() is false, but good() is false as well: failbit is still set from the failed read, so the stream ignores further operations until the error state is cleared.
I cannot recall what unget is supposed to do after EOF, but when I call clear() to allow a retry, unget goes right back into failure.
http://ideone.com/JkDrwG
It's usually better to use your own buffer. Putback is a hack.

Related

std::getline is reading line where specified delimiter is not present?

I want to print to the console each object in the array contained in the following string (stored in a file):
{ beforechars [{Object1},{Object2},{Object3}] afterchars }
I'm doing it as follows:
std::ifstream is("content.txt");
std::string content;
std::getline(is, content, '[');
while (std::getline(is, content, '{')) {
    std::getline(is, content, '}');
    std::cout << content << std::endl;
}
is.close();
But I am getting this output:
Object1
Object2
Object3
] afterchars }
My understanding is that after the Object3 iteration, the ifstream should have "}] afterchars }" left, and the while's guard shouldn't be true because there isn't any '{' char... Am I right? Where is the mistake?
The while condition doesn't work as you expect: getline() reads successfully until it reaches a '{' or, if there is none, until the end of the file.
So what happens here?
When you've displayed Object3, your position in the stream is just after the closing '}'.
The getline() in the while condition reads all the rest of the file into content, as it encounters no '{'. Since it could read something successfully, the condition evaluates to true.
The getline() within the while block then fails to read anything, so content remains unchanged. The stream is now in a fail state, and no subsequent operation will succeed until you clear that state. But nothing visible happens in your code at this point.
After displaying this last result, the next loop condition fails.
Simple workaround:
A very easy workaround is to record the current position in the stream before looking for '{' and, if none is found, to go back to that position. Note: parsing a file this way is not great for performance, but it's fine for small files.
std::getline(is, content, '[');
auto pos = is.tellg(); // remember the current position
while (std::getline(is, content, '{') && !is.eof()) {
    std::getline(is, content, '}');
    pos = is.tellg(); // update the position before iterating again
    std::cout << content << std::endl;
}
is.clear(); // reset eofbit (seekg does this itself since C++11)
is.seekg(pos); // go back to the last recorded position
The trick here is that if '{' is not found, after the getline() the stream is not yet in fail state, but eof() is already true. We can then end the loop and go back to the last recorded position.
std::getline reads characters until it finds the delimiter (which it consumes) or until the end of the stream. It sets failbit on the stream only if no characters were consumed (for example, when called on an empty or already invalid stream).
So your loop will terminate only when the stream is exhausted.
The stream interface only lets you look at the next character; there is no way to scan ahead and read only if a specific character is present.
If you need random access to the characters, read the input into a string and then parse it (with regular expressions or something else).

Why is the c++ input file stream checked twice here?

Here is a snippet from a c++ tutorial:
// istream::get example
#include <iostream> // std::cin, std::cout
#include <fstream> // std::ifstream
int main () {
    char str[256];
    std::cout << "Enter the name of an existing text file: ";
    std::cin.get (str,256); // get c-string
    std::ifstream is(str); // open file
    while (is.good()) // loop while extraction from file is possible
    {
        char c = is.get(); // get character from file
        if (is.good())
            std::cout << c;
    }
    is.close(); // close file
    return 0;
}
Notice that is.good() appears twice, first with while, then with if.
Link to the example: http://www.cplusplus.com/reference/istream/istream/get/
Why is the c++ input file stream checked twice here?
The fact of the matter is that it is unnecessarily checked twice. If the second inner if (is.good()) passes, then the outer while (is.good()) will always pass as well. The author of the code needed some way of looping, and he incorrectly assumed that a while (is.good()) is an appropriate condition because it will stop the loop when the stream fails to extract. But this is only half-true. while (is.good()) is never the correct way to perform the extraction.
You have to perform the input first and then check whether it succeeded. Otherwise you may perform a failed extraction, use its result, and get unwanted behavior from your program. The correct way is to use the extraction itself as the condition. is.get(c) returns a reference to the stream, which is then converted to a boolean: true if the read succeeded, false otherwise:
char c;
while (is.get(c))
{
    std::cout << c;
}
Written this way, c has to be declared before the loop and stays in scope after it. To limit its scope, you can enclose the while() loop in a block or use a for() loop instead:
for (char c; is.get(c); )
{
    std::cout << c;
}
But it seems that this code is attempting to write all the content from the file to standard output. Reading a character one-by-one is the way shown here, but you can also use stream iterators or the buffer overload of std::ostream::operator<<() as well.
There are two more problems I see in this code. Namely:
std::string is the preferred construct for manipulating dynamically-sized strings, not C-style strings which require the use of archaic input methods such as .get(), .getline(), etc, and their respective overloads.
Manually closing a file is usually unneeded. The stream will close itself at the end of the scope in which it was created. You probably only want to close the file yourself to check if it succeeds or to reopen the stream with a different file or openmode.
The first check, in while (is.good()), tests whether the stream is still in a good state, i.e. no earlier read has failed or run into EOF (End Of File). It does not guarantee that another character is available for char c = is.get();, because the EOF flag is only set once a read actually tries to go past the end.
What the second if() does is prevent printing the result of that failed read: when is.get() hits EOF it returns the end-of-file value rather than a real character, and in that case nothing is printed.
For example, let's say you have this file:
"Just an example!"
Now, if you had just:
while (is.good()) // loop while extraction from file is possible
{
    char c = is.get(); // get character from file
    std::cout << c;
}
the output would be "Just an example!" followed by one stray character. That extra character does not come from the file: it is the end-of-file value returned by the final, failed get(), converted to char.
But with:
while (is.good()) // loop while extraction from file is possible
{
    char c = is.get(); // get character from file
    if (is.good())
        std::cout << c;
}
the output would be: "Just an example!", which is what you would expect it to be.

C++ istream operator>> bad-data handling

Every time I ask a question here on SO, it turns out to be some very dumb mistake (check my history if you don't believe me), so bear with me if you can here.
It feels like my question should be very popular, but I couldn't find anything about it and I've run out of ideas to try.
Anyway, without further ado:
I'm trying to overload the input operator>>. It's supposed to read one integer at a time from a file, skipping invalid data such as chars, floats, etc.
Naturally, I'm checking if(in >> inNum) to both get() the next token and check for successful get().
If successful, not much to say there.
If it fails, however, I assume that one of two things happened:
It stumbled upon a non-integer
It reached the eof
Here's how I tried to deal with it:
istream& operator>> (istream& in, SortSetArray& setB) {
    bool eof = false;
    int inNum = -1;
    while(!eof) {
        if(in >> inNum) {
            cout << "DEBUG SUCCESS: inNum = " << inNum << endl;
            setB.insert(inNum);
        }
        else {
            // check eof, using peek()
            // 1. clear all flags since peek() returns eof regardless of what
            // flag is raised, even if it's not `eof`
            in.clear();
            cout << "DEBUG FAIL: inNum = " << inNum << endl;
            // 2. then check eof with peek()
            eof = (in.peek() == std::char_traits<char>::eof());
        }
    }
    return in;
}
The file contains [1 2 3 4 a 5 6 7], and the program naturally goes into infinite loop.
Okay, easy guess, peek() doesn't consume the char 'a', and maybe in >> inNum also failed to consume it somehow. No biggie, I'll just try something that does.
And that's pretty much where I've been for the last 2 hours. I tried istream::ignore(), istream::get(), ios::rdstate to check eof, double and string instead of char in the file, just in case char is read numerically.
Nothing works and I'm desperate.
Weirdly enough, the approach above worked for a previous program where I had to read a triplet of data entries on a line of the format: string int int
The only difference is I used an ifstream object for that one, and an istream object for this one.
Bonus Question: inNum has the value of 0 when the hiccup occurs. I'm guessing it's something that istream::operator>> does?
Implementation description
try to read an int
if successful:
    insert the read value to setB
    next iteration
else:
    clear error flags
    check so that we haven't reached the end of the file
    still more data? next iteration.
The above is the logic description of your function, but there's something missing...
When we try to read a value but fail, std::istream handles this by setting the appropriate error flags, but it does not discard any data.
The problem with your implementation is that after trying to read invalid data, you just try to read the same invalid data again, over and over, forever.
Solution
After clearing the error flags you can use std::istream::ignore to discard any data from the stream.
The function's 1st argument is the maximum number of characters to ignore, and the 2nd is the delimiter: "if you hit this char, don't ignore any more" (the delimiter itself is also discarded).
Let's ignore the maximum amount of characters, or until we hit ' ' (space):
#include <limits> // std::numeric_limits
in.ignore (std::numeric_limits<std::streamsize>::max(), ' ');

How to check if there isn't data in file to read

std::fstream fin("emptyFile", std::fstream::in);
std::cout << fin.eof() << std::endl;
This prints 0, so I can't use eof() to check whether the file is empty. I also need a way, after reading some data, to check whether there is any more data left.
There are two ways to check if you "can read something" from a file:
Try to read it, and if it fails, it wasn't OK... (e.g. fin >> var;)
Check the size of the file, using fin.seekg(0, ios_base::end); followed by std::streamoff len = fin.tellg(); (and then move back to the beginning with fin.seekg(0, ios_base::beg);)
However, if you are trying to read an integer from a text-file, the second method may not work - the file could be 2MB long, and still not contain a single integer value, because it's all spaces and newlines, etc.
Note that fin.eof() tells you if there has been an attempt to read BEYOND the end of the file.
eof() gives you the wrong result because eofbit is not set yet. If you read something you will pass the end of the file and eofbit will be set.
Avoid eof() and use the following:
std::streampos current = fin.tellg(); // remember the current position
fin.seekg(0, std::ios::end);
bool empty = (fin.tellg() == std::streampos(0)); // true if the file is empty
fin.seekg(current); // restore the stream position

How does ifstream's eof() work?

#include <iostream>
#include <fstream>
int main() {
    std::fstream inf( "ex.txt", std::ios::in );
    while( !inf.eof() ) {
        std::cout << inf.get() << "\n";
    }
    inf.close();
    inf.clear();
    inf.open( "ex.txt", std::ios::in );
    char c;
    while( inf >> c ) {
        std::cout << c << "\n";
    }
    return 0;
}
I'm really confused about eof() function. Suppose that my ex.txt's content was:
abc
It always reads an extra character and shows -1 when reading using eof(). But the inf >> c gave the correct output which was 'abc'? Can anyone help me explain this?
-1 is get()'s way of saying you've reached the end of the file. Compare the result with std::char_traits<char>::eof() (or std::istream::traits_type::eof()) rather than -1, which is a magic number. (That spelling is a bit verbose, though; you can always just call the stream's eof() member after the read instead.)
The EOF flag is only set once a read tries to read past the end of the file. If I have a 3 byte file, and I only read 3 bytes, EOF is false, because I've not tried to read past the end of the file yet. While this seems confusing for files, which typically know their size, EOF is not known until a read is attempted on some devices, such as pipes and network sockets.
The second example works because inf >> foo always returns inf, with the side effect of attempting to read something and store it in foo. In an if or while, inf evaluates to true as long as no operation has failed, that is, while neither failbit nor badbit is set. Thus, when a read fails, inf evaluates to false, and your loop properly ends. However, take this common error:
while(!inf.eof()) // EOF is false here
{
    inf >> x; // read fails, EOF becomes true, x is not set
    // use x // we use x, despite our read failing.
}
However, this:
while(inf >> x) // Attempt read into x, return false if it fails
{
    // will only be entered if read succeeded.
}
Which is what we want.
The EOF flag is only set after a read operation attempts to read past the end of the file. get() is returning the symbolic constant traits::eof() (which just happens to equal -1) because it reached the end of the file and could not read any more data, and only at that point will eof() be true. If you want to check for this condition, you can do something like the following:
int ch;
while ((ch = inf.get()) != EOF) {
    std::cout << static_cast<char>(ch) << "\n";
}
iostream doesn't know it's at the end of the file until it tries to read that first character past the end of the file.
The sample code at cplusplus.com says to do it like this: (But you shouldn't actually do it this way)
while (is.good()) // loop while extraction from file is possible
{
    c = is.get(); // get character from file
    if (is.good())
        cout << c;
}
A better idiom is to move the read into the loop condition, like so:
(You can do this with all istream read operations that return *this, including the >> operator)
char c;
while(is.get(c))
    cout << c;
eof() checks the eofbit in the stream state.
On each read operation, if the position is at the end of the stream and more data is requested, eofbit is set. So eofbit is still clear right after the last character has been read, and a loop that tests eof() first runs one extra time before it sees eofbit set.
The correct way is to check whether the end was reached (or rather, whether the read succeeded) after the read operation. This is what your second version does: it performs the read and then uses the resulting stream reference (which >> returns) as a boolean value, which amounts to checking !fail().