Can't get ios::beg to go back to the beginning of the file - c++

It always seems to be the things that should be no problem that cause problems for me. I don't get it. :/
So I'm trying to make sure that I understand how to manipulate text files. I've got two files, "infile.txt" and "outfile.txt". "infile.txt" has six numbers in it and nothing else. Here is the code I used to manipulate the files.
#include<fstream>
using std::ifstream;
using std::ofstream;
using std::fstream;
using std::endl;
using std::ios;
int main()
{
    ifstream inStream;
    ofstream outStream;//create streams
    inStream.open("infile.txt", ios::in | ios::out);
    outStream.open("outfile.txt");//attach files
    int first, second, third;
    inStream >> first >> second >> third;
    outStream << "The sum of the first 3 nums is " << (first+second+third) << endl;
    //make two operations on the 6 numbers
    inStream >> first >> second >> third;
    outStream << "The sum of the second 3 nums is " << (first+second+third) << endl;
    inStream.seekg(0); //4 different ways to force the program to go back to the beginning of the file
    //2. inStream.seekg(0, ios::beg);
    //3. inStream.seekg(0, inStream.beg);
    //4. inStream.close(); inStream.open("infile.txt");
    //I have tried all four of these lines and only #4 works.
    //There has got to be a more natural option than just
    //closing and reopening the file. Right?
    inStream >> first >> second >> third;
    outStream << "And again, the sum of the first 3 nums is " << (first+second+third) << endl;
    inStream.close();
    outStream.close();
    return 0;
}
Maybe I don't quite understand how the stream works, but I've seen a few sources saying that seekg(0) should move the index back to the start of the file. Instead, this is what I get out of it.
The sum of the first 3 nums is 8
The sum of the second 3 nums is 14
And again, the sum of the first 3 nums is 14
It went back, but not nearly in the way I would have hoped. Any idea why this happened? Why did my first three attempts fail?

As Bo Persson states, it may be because your input has encountered end of file. It shouldn't, because in C++ a text file is defined as being terminated by a '\n'; but practically speaking, if you're working under Windows, a lot of ways of generating a file will omit this final '\n', and although it is formally required, in practice you'll want your code to work even if the final '\n' is missing. I can't think of any other reason offhand why the seekg calls wouldn't work. inStream.seekg( 0 ) is, of course, undefined behavior, but in practice it will work pretty much everywhere. inStream.seekg( 0, ios::beg ) is guaranteed to work if inStream.good(), and is, IMHO, preferable to the first form. (The single argument form of seekg is normally only used with the result of a tellg as its argument.) And of course, it only works if the actual input source supports seeking: it won't work if you're reading from a keyboard or a pipe (but presumably, "infile.txt" is neither).
In general, you should check the status of inStream after each read, before using the results. But if the only problem is that the file doesn't end with '\n', it's probable that the status will be OK (!fail()) after the final read, even if you've encountered end of file. In which case, you'll need clear() anyway.
Note that the above comments are valid for C++03 and earlier. C++11 has changed the specification of the single argument form of seekg, and requires it to reset eofbit before anything else. (Why is this change only for the single argument form of seekg, and not the two argument form? Oversight?)

The second input reaches end-of-file for the stream. That state sticks until you call inStream.clear() to clear its state (in addition to the seek).
With a C++11-compliant compiler, option 4 should also work, as closing and reopening the file will now clear the previous state. Older compilers might not do that.
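Putting these answers together, the rewind in the question's program would look something like this (a minimal sketch reusing the stream and variable names from the question):
inStream.clear();                     // drop the eofbit/failbit left over from the last read
inStream.seekg(0, ios::beg);          // now the seek succeeds
inStream >> first >> second >> third; // reads the first three numbers again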

Try:
inStream.seekg(0, ios_base::beg);


Read a file line by line in C++

I wrote the following C++ program to read a text file and print out its contents line by line. I pass the name of the text file as the only command-line argument.
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
    char buf[255] = {};
    if (argc != 2)
    {
        cout << "Invalid number of files." << endl;
        return 1;
    }
    ifstream f(argv[1], ios::in | ios::binary);
    if (!f)
    {
        cout << "Error: Cannot open file." << endl;
        return 1;
    }
    while (!f.eof())
    {
        f.get(buf,255);
        cout << buf << endl;
    }
    f.close();
    return 0;
}
However, when I ran this code in Visual Studio, the Debug Console was completely blank. What's wrong with my code?
Apart from the errors mentioned in the comments, the program has a logical error because istream& istream::get(char* s, streamsize n) does not do what you (or I, until I debugged it) thought it did. Yes, it reads to the next newline; but it leaves the newline in the input!
The next time you call get(), it will see the newline immediately and return with an empty line in the buffer, for ever and ever.
The best way to fix this is to use the appropriate function, namely istream::getline(), which extracts, but does not store, the newline.
The EOF issue
is worth mentioning. The canonical way to read lines (if you want to write to a character buffer) is
while (f.getline(buf, sizeof buf))
{
    cout << buf << "\n";
}
getline() returns a reference to the stream which in turn has a conversion function to bool, which makes it usable in a boolean expression like this. The conversion is true if input could be obtained. Interestingly, it may have encountered the end of file, and f.eof() would be true; but that alone does not make the stream convert to false. As long as it could extract at least one character it will convert to true, indicating that the last input operation made input available, and the loop will work as expected.
The next read after encountering EOF would then fail because no data could be extracted: After all, the read position is still at EOF. That is considered a read failure. The condition is wrong and the loop is exited, which was exactly the intent.
The buffer size issue
is worth mentioning, as well. The standard draft says in 30.7.4.3:
Characters are extracted and stored until one of the following occurs:
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
traits::eq(c, delim) for the next available input character c (in which case the input character is extracted but not stored);
n is less than one or n - 1 characters are stored (in which case the function calls setstate(failbit)).
The conditions are tested in that order, which means that if n-1 characters have been stored and the next character is a newline (the default delimiter), the input was successful (and the newline is extracted as well).
This means that if your file contains a single line 123 you can read that successfully with f.getline(buf, 4), but not a line 1234 (both may or may not be followed by a newline).
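A small self-contained sketch of that boundary case (my own illustration, not code from the question):
#include <iostream>
#include <sstream>

int main()
{
    char buf[4];
    std::istringstream ok("123\n"), tooLong("1234\n");

    if (ok.getline(buf, 4))        // stores "123" (n-1 chars), then extracts the newline: success
        std::cout << "read \"" << buf << "\"\n";

    if (!tooLong.getline(buf, 4))  // stores "123", but '4' is still unread: failbit is set
        std::cout << "failbit set, buffer holds \"" << buf << "\"\n";
    return 0;
}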
The line ending issue
Another complication here is that on Windows a file created with a typical editor will have a hidden carriage return before the newline, i.e. a line actually looks like "123\r\n" ("\r" and "\n" each being a single character with the values 13 and 10, respectively). Because you opened the file with the binary flag the program will see the carriage return; all lines will contain that "invisible" character, and the number of visible characters fitting in the buffer will be one shorter than one would assume.
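If you do keep the binary flag, one way to cope is to strip a trailing carriage return after each read. A minimal sketch (the helper name chomp_cr is just an illustration, not part of the question's code):
#include <string>

// Drop the '\r' that Windows line endings leave behind when a file
// opened with ios::binary is read line by line.
void chomp_cr(std::string& line)
{
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
}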
The console issue ;-)
Oh, and your Console was not entirely empty; it's just that modern computers are too fast and the first line which was probably printed (it was in my case) scrolled away faster than anybody could switch windows. When I looked closely there was a cursor in the bottom left corner where the program was busy printing line after line of nothing ;-).
The conclusion
Debug your programs. It's very easy with VS.
Use getline(istream, string); a sketch of that version follows below.
Use the return value of input functions (typically the stream) as a boolean in a while loop: "As long as you can extract any input, use that input."
Beware of line ending issues.
Consider C I/O (printf, scanf) for anything non-trivial (I didn't discuss this in my answer but I think that's what many people do).
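For reference, a minimal sketch of the getline(istream, string) version recommended above (my own rewrite of the question's program, not code from the original answer):
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char* argv[])
{
    if (argc != 2)
    {
        std::cout << "Usage: " << argv[0] << " <file>" << std::endl;
        return 1;
    }
    std::ifstream f(argv[1]);      // text mode; no need for ios::binary here
    if (!f)
    {
        std::cout << "Error: Cannot open file." << std::endl;
        return 1;
    }
    std::string line;
    while (std::getline(f, line))  // the stream itself is the loop condition
        std::cout << line << "\n";
    return 0;
}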

ifstream does not read first line

I am reusing some ifstream code that I wrote about a year ago, but now it does not work correctly. I have the following file (so, just a line of integers):
2 4 2 3
I read it while constructing a graph from this file:
graph g = graph("file.txt");
where graph constructor starts with:
#include <iostream>
#include <fstream>
#include <sstream>
using namespace std;
graph::graph(const char *file_name) {
    ifstream infile(file_name);
    string line;
    getline(infile, line);
    cout << line << endl; // first output
    istringstream iss;
    iss.str(line);
    iss >> R >> C >> P >> K;
    iss.clear();
    cout << R << " " << C << " " << P << " " << K; // second output
}
The second output (marked in code), instead of giving me 2 4 2 3, returns random(?) values -1003857504 32689 0 0. If I add the first output to check the contents of line after getline, it is just an empty string "".
All the files (main.cpp where a graph is instantiated, 'graph.cpp' where the graph is implemented and 'file.txt') are located in the same folder.
As I mentioned, this is my old code that worked before, so probably I do not see some obvious mistake which broke it. Thanks for any help.
These two locations:
where your program's original source code is located
where your program's input data is located
are completely unrelated.
Since "file.txt" is a relative path, your program looks for input data in the current working directory during execution. Sometimes that is the same as where the executable is. Sometimes it is not. (Only you can tell what it is, since it depends on how you execute your program.) There is never a connection to the location of the original source file, except possibly by chance.
When the two do not match, you get this problem, because you perform no I/O error checking in your program.
If you checked whether infile is open, I bet you'll find that it is not.
This is particularly evident since the program stopped working after a period of time without any changes to its logic; chances are, the only thing that could have changed is the location of various elements of your solution.
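A minimal sketch of the kind of checking being suggested, reusing the constructor from the question (the error messages are my own):
graph::graph(const char *file_name) {
    ifstream infile(file_name);
    if (!infile.is_open()) {            // check the open before reading anything
        cerr << "Cannot open " << file_name
             << " from the current working directory" << endl;
        return;                         // or throw, depending on your design
    }
    string line;
    if (!getline(infile, line)) {       // check each I/O operation as well
        cerr << "Could not read the first line" << endl;
        return;
    }
    istringstream iss(line);
    if (!(iss >> R >> C >> P >> K))
        cerr << "The first line did not contain four integers" << endl;
}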

How exactly does the extract>> operator work in C++

I am a computer science student, and so do not have much experience with the C++ language (this is my first semester using it) or with coding, for that matter.
I was given an assignment to read integers from a text file in the simple form of:
19 3 -2 9 14 4
5 -9 -10 3
.
.
.
This sent me off on a journey to understand I/O operators better, since I am required to do certain things with this stream (duh.)
I was looking everywhere and could not find a simple explanation of how the extract>> operator works internally. Let me clarify my question:
I know that the extract>> operator extracts one contiguous element until it hits a space, tab, or newline. What I'm trying to figure out is where the pointer(?) or read-location(?) will be AFTER it extracts an element. Will it be on the last char of the element just removed, or was that removed and therefore gone? Will it be on the space/tab/'\n' character itself? Perhaps the beginning of the next element to extract?
I hope I was clear enough. I lack the appropriate jargon to describe my problem more clearly.
Here is why I need to know this: (in case anyone is wondering...)
One of the requirements is to sum all integers in each line separately.
I have created a loop to extract all integers one-by-one until it reaches the end of the file. However, I soon learned that the extract>> operator ignores space/tab/newline. What I want to try is to extract>> an element, and then use inputFile.get() to get the space/tab/newline. Then, if it's a newline, do what I gotta do.
This will only work if the stream pointer is in a good position to extract the space/tab/newline after the last extraction>>.
In my previous question, I tried to solve it using getline() and a stringstream.
SOLUTION:
For the sake of answering my specific question, of how operator>> works, I had to accept Ben Voigt's answer as the best one.
I have used the other solutions suggested here (using a stringstream for each line) and they did work! (You can see it in my previous question's link.) However, I implemented another solution using Ben's answer and it also worked:
.
.
.
if(readFile.is_open()) {
    while (readFile >> newInput) {
        char isNewLine = readFile.get(); //get() the next char after extraction
        if(isNewLine == '\n')            //This is just a test!
            cout << isNewLine;           //If it's a newline, feed a newline.
        else
            cout << "X" << isNewLine;    //Else, show X & feed a space or tab
        lineSum += newInput;
        allSum += newInput;
        intCounter++;
        minInt = min(minInt, newInput);
        maxInt = max(maxInt, newInput);
        if(isNewLine == '\n') {
            lineCounter++;
            statFile << "The sum of line " << lineCounter
                     << " is: " << lineSum << endl;
            lineSum = 0;
        }
    }
.
.
.
Regardless of my numerical values, the form is correct! Both spaces and '\n's were caught:
Thank you Ben Voigt :)
Nonetheless, this solution is very format dependent and very fragile. If any of the lines has anything else before '\n' (like a space or tab), the code will miss the newline char. Therefore, the other solution, using getline() and stringstreams, is much more reliable.
After extraction, the stream pointer will be placed on the whitespace that caused extraction to terminate (or other illegal character, in which case the failbit will also be set).
This doesn't really matter though, since you aren't responsible for skipping over that whitespace. The next extraction will ignore whitespaces until it finds valid data.
In summary:
leading whitespace is ignored
trailing whitespace is left in the stream
There's also the noskipws modifier which can be used to change the default behavior.
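A small illustration of both points (my own sketch, not part of the answer):
#include <iostream>
#include <sstream>

int main()
{
    std::istringstream in("  42 7");
    int n;

    in >> n;  // leading whitespace is skipped, 42 is read
    std::cout << n << ", next char: '" << char(in.peek()) << "'\n"; // the trailing ' ' is still there

    char c;
    in >> std::noskipws >> c;  // with noskipws, the space itself is extracted
    std::cout << "extracted: '" << c << "'\n";
    return 0;
}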
The operator>> leaves the current position in the file one character beyond the last character extracted (which may be at end of file). Which doesn't necessarily help with your problem; there can be spaces or tabs after the last value in a line. You could skip forward reading each character and checking whether it is a white space other than '\n', but a far more idiomatic way of reading line oriented input is to use std::getline to read the line, then initialize an std::istringstream to extract the integers from the line:
std::string line;
while ( std::getline( source, line ) ) {
    std::istringstream values( line );
    // ...
}
This also ensures that in case of a format error in the line, the error state of the main input is unaffected, and you can continue with the next line.
According to cppreference.com the standard operator>> delegates the work to std::num_get::get. This takes an input iterator. One of the properties of an input iterator is that you can dereference it multiple times without advancing it. Thus when a non-numeric character is detected, the iterator will be left pointing to that character.
In general, the behavior of an istream is not set in stone. There exist multiple flags to change how any istream behaves, which you can read about here. In general, you should not really care where the internal pointer is; that's why you are using a stream in the first place. Otherwise you'd just dump the whole file into a string or equivalent and manually inspect it.
Anyway, going back to your problem, a possible approach is to use the getline method provided by istream to extract a string. From the string, you can either manually read it, or convert it into a stringstream and extract tokens from there.
Example:
std::ifstream ifs("myFile");
std::string str;
while ( std::getline(ifs, str) ) {
    std::stringstream ss( str );
    double sum = 0.0, value;
    while ( ss >> value ) sum += value;
    // Process sum
}

C++ istream operator>> bad-data handling

Every time I ask a question here on SO, it turns out to be some very dumb mistake (check my history if you don't believe me), so bear with me if you can here.
It feels like my question should be very popular, but I couldn't find anything about it and I've run out of ideas to try.
Anyway, without further ado:
I'm trying to overload the input operator>>. It's supposed to read one integer at a time from a file, skipping invalid data such as chars, floats, etc.
Naturally, I'm checking if(in >> inNum) to both get() the next token and check for successful get().
If successful, not much to say there.
If it fails, however, I assume that one of two things happened:
It stumbled upon a non-integer
It reached the eof
Here's how I tried to deal with it:
istream& operator>> (istream& in, SortSetArray& setB) {
    bool eof = false;
    int inNum = -1;
    while(!eof) {
        if(in >> inNum) {
            cout << "DEBUG SUCCESS: inNum = " << inNum << endl;
            setB.insert(inNum);
        }
        else {
            // check eof, using peek()
            // 1. clear all flags since peek() returns eof regardless of what
            //    flag is raised, even if it's not `eof`
            in.clear();
            cout << "DEBUG FAIL: inNum = " << inNum << endl;
            // 2. then check eof with peek()
            eof = (in.peek() == std::char_traits<char>::eof());
        }
    }
    return in;
}
The file contains [1 2 3 4 a 5 6 7], and the program naturally goes into an infinite loop.
Okay, easy guess, peek() doesn't consume the char 'a', and maybe in >> inNum also failed to consume it somehow. No biggie, I'll just try something that does.
And that's pretty much where I've been for the last 2 hours. I tried istream::ignore(), istream::get(), ios::rdstate to check eof, double and string instead of char in the file, just in case char is read numerically.
Nothing works and I'm desperate.
Weirdly enough, the approach above worked for a previous program where I had to read a triplet of data entries on a line of the format: string int int
The only difference is I used an ifstream object for that one, and an istream object for this one.
Bonus Question: inNum has the value of 0 when the hiccup occurs. I'm guessing it's something that istream::operator>> does?
Implementation description
try to read an int
if successful:
insert the read value into setB
next iteration
else:
clear error flags
check that we haven't reached the end of the file
still more data? next iteration.
The above is the logic description of your function, but there's something missing...
In case we try to read a value but fail, std::istream handles this by setting the appropriate error flags, but it will not discard any data.
The problem with your implementation is that upon trying to read invalid data, you will just try to read the same invalid data again... over, and over, and over, infinitely.
Solution
After clearing the error flags you can use std::istream::ignore to discard any data from the stream.
The function's 1st argument is the maximum number of characters to ignore, and the 2nd is the "if you hit this char, don't ignore any more" delimiter.
Let's ignore the maximum number of characters, or until we hit ' ' (space):
#include <limits> // std::numeric_limits
in.ignore (std::numeric_limits<std::streamsize>::max(), ' ');
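Dropped back into the question's operator>>, the whole thing might look like this (a sketch that assumes the values are separated by spaces, as in the sample file, and that <limits> is included as shown above):
istream& operator>> (istream& in, SortSetArray& setB) {
    int inNum = -1;
    while (true) {
        if (in >> inNum) {
            setB.insert(inNum);
        }
        else if (in.eof()) {
            break;        // genuine end of file: we are done
        }
        else {
            in.clear();   // drop the failbit caused by the bad token
            in.ignore(std::numeric_limits<std::streamsize>::max(), ' ');
        }                 // skip past the offending characters
    }
    return in;
}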

istream::unget() in C++ doesn't work as I thought

unget isn't working the way I thought it would... Let me explain myself. As I understand it, unget takes the last character extracted from the stream and puts it back in the stream (ready to be extracted again). Internally, it decrements the pointer in the stream buffer (creating the sentry and all that stuff).
But when I use two unget() calls one after the other, its behaviour gets deeply strange. If I write something like hello<bye, use < as the delimiter for getline, and then call unget twice, the next read gives me hello again, and not o<bye. This is my code:
#include <iostream>
#define MAX_CHARS 256
using namespace std;
int main(){
    char cadena[MAX_CHARS];
    cout << "Write something: ";
    cin.getline(cadena, MAX_CHARS, '<');
    cout << endl << "Your first word delimited by < is: " << cadena << endl;
    cin.unget(); //Delimiter (removed by getline) is put back in the stream
    cin.unget(); //!?
    cin >> cadena;
    cout << "Your phrase with 2 ungets done..." << cadena;
    return 0;
}
Try it with bye<hello: cadena gets bye and not e<hello. I thought that unget puts back the last character each time it's called; what the f*** is happening?
The problem you are observing isn't surprising at all. First off, note that ungetting characters may or may not be supported by the underlying stream buffer. Typically, at least one character of putback is supported. Whether this is actually true and if any more characters are supported is entirely up to the stream buffer.
What happens in your test program is simply that the second unget() fails, the stream goes into failure state (i.e., std::ios_base::failbit is set) and another attempt to read something just fails. The failed read leaves the original buffer unchanged and, since it isn't tested (as it should be), it looks as if the same string was read twice.
The fundamental reason std::cin is likely to support only one character of putback is that it is synchronized with stdin by default. As a result, std::cin doesn't do any buffering (causing it to be rather slow as well, for that matter). There is a fair chance that you can get better results by not synchronizing with stdin:
std::ios_base::sync_with_stdio(false);
This will improve performance and the likelihood that putting back more characters will succeed. There is still no guarantee that you can put multiple characters (or even just one character) back. If you really need to put characters back, you should consider using a filtering stream buffer which supports as much putback as you need. In general, tokenizing input doesn't require any characters of putback, which is the basic reason that there is only mediocre support: since putback support is bad, you are best off using proper tokenizing, which reduces the need to improve putback. Somewhat of a circular argument. Since you can always create your own stream buffer it isn't really harmful, though.
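To see that it is indeed the second unget() that fails, you can test the stream after each call; a sketch of what to drop into the question's program in place of the two bare calls:
cin.unget();                 // put back the '<' that getline removed
if (!cin) cout << "first unget failed" << endl;
cin.unget();                 // try to put back the 'o' as well
if (!cin) {
    cout << "second unget failed" << endl;
    cin.clear();             // without this, the following cin >> cadena is a no-op
}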
The actual reason for this behaviour is related to the stream's failbit, as explained in the previous answer. I can provide a workaround that may help you achieve the results you want.
#include <iostream>
#include <boost/iostreams/filtering_stream.hpp>
// compile using g++ -std=c++11 -lboost_iostreams
#define MAX_CHARS 256
using namespace std;
int main(){
    boost::iostreams::filtering_istream cinn(std::cin,0,1);
    char cadena[MAX_CHARS];
    cout << "Write something: ";
    cinn.getline(cadena, MAX_CHARS, '<');
    cout << endl << "Your first word delimited by < is: " << cadena << endl;
    cinn.unget(); //Delimiter (removed by getline) is put back in the stream
    cinn.unget(); //!?
    cinn >> cadena;
    cout << "Your phrase with 2 ungets done..." << cadena;
    return 0;
}