C++ getline and gcount - c++

Suppose I have an std::istream pointing to the following contents (the line break is a '\n' character):
12345678
9
and run the following code:
std::istream & is = ...
char buff[9];
is.getline(buff, 9);
int n = is.gcount();
Now n == 8 and strcmp(buff, "12345678") == 0. The question is, how do I know that I read the entire line instead of some of the line?
If the stream instead points to the following contents:
123456789
0
and the same code is executed, I am still on the same line. How do I differentiate between these two cases?

Use std::string and the free std::getline function:
#include <istream>
#include <string>
// ...
std::istream & is = ...;
std::string line;
while (std::getline(is, line))
{
// process line
}

The key to your answer is in your question
is.getline(buff, 9);
int n = is.gcount();
Now n == 8
According to the reference for getline, it will extract up to n - 1 characters -- in your case, up to 8. This is slightly misleading, because if the nth character is the delimiter, it will also be extracted (but not copied to the buffer). More importantly, if you have NOT reached the delimiter before reaching n characters, this piece is relevant:
If the function stops reading because this size (n) is reached,
the failbit internal flag is set
So, in short, if the fail bit is set, you're still on the same line (and you'll have to clear the state in order to continue processing the istream). Sometimes, eof will also set the fail bit. So you probably want to check if the state is failbit and only failbit:
if ( is.rdstate() == std::ios::failbit ) {
std::cout << "Filled the buffer, but did NOT finish the line\n";
is.clear();
}

Read the next character. If it is a newline, you have read all the data on the line.
BTW, popular usage is to read the entire line, up to the newline, into an std::string, then process the string.

Related

Unable to parse final word in text file using peek()

I'm attempting to write a lexer and parser but I'm having trouble getting the final variable in a text file due to in_file.tellg() equaling -1. My program only works if I add a space character after the variable, otherwise I get a compiler error. I want to mention that I'm able to get every other variable in the text file but the last one. I believe the cause of the problem is in_file.peek()!=EOF setting in_file.tellg() to -1.
My program is something like this:
ifstream in(file_name);
char c;
in >> noskipws;
while(in >> c ){
if(is_letter_part_of_variable(c)) {
int start_pos = in.tellg(),
end_pos,
length;
while(is_letter_part_of_variable(c) && in.peek()!=EOF ) {
in>>c;
}
end_pos = in.tellg(); // This becomes -1 for some reason
length = end_pos - start_pos; // Should be 7
// Reset file pointer to original position to chomp word.
in.clear();
in.seekg(start_pos-1, in.beg);
// The word 'message' should go in here.
char *identifier = new char[length];
in.read(identifier, length);
identifier[length] = '\0';
}
}
example.text
message = "Hello, World"
print message
I tried removing peek()!= EOF which gives me an eternal loop. I tried !in_file.eof() and that also makes tellg() equal to -1. What can I do to fix/enhance this code?
I believe the cause of the problem is in_file.peek()!=EOF setting in_file.tellg() to -1.
Close. peek attempts to read a character and returns EOF if it hits the end of the stream, which sets the stream's eofbit. Once the stream is no longer in a good state, tellg fails and returns -1.
Simple Solution
Clear the stream's error state (in.clear()) before calling tellg.
Better solution
Use std::string.
std::string identifier;
while(in>>c && is_letter_part_of_variable(c)) {
identifier += c;
}
All of the messing around with peek, seekg, tellg and the dreaded new vanish.

How to use input stream overloading to insert item to map member in class?

I have C++ class Question to hold data from a file questions.txt of multiple choice questions and answers:
Update:
I have updated the operator>> overload. It now has one remaining problem:
it only inserts the first of the 2 multiple-choice questions (it reads only the first Question).
Data in file questions.txt:
A programming language is used in this Course? 3
1. C
2. Pascal
3. C++
4. Assembly
What compiler can you use to compile the programs in this Course? 4
1. Dev-C++
2. Borland C++Builder
3. Microsoft Visual C++
4. All of the above
I'm trying to insert the multiple answers into a map. I just want to ask how to overload operator>> to iterate over multiple answers to insert them into a map:
#include <string>
#include <iostream>
#include <fstream>
#include <sstream>
#include <iterator>
#include <map>
#include <vector>
using namespace std;
class Question
{
string question;
int correctIndex;
map<int,string> answers;
friend std::istream &operator>>(std::istream &is, Question &q) {
getline(is, q.question, '?'); // stops at '?'
is>> q.correctIndex;
string line;
while (getline(is, line) && line.empty()) // skip leading blank lines
;
while (getline(is,line) && !line.empty()) // read until blank line
{
int id;
string ans;
char pt;
stringstream sst(line); // parse the line;
sst>>id>>pt; // take number and the following point
if (!sst || id==0 || pt!='.')
cout << "parsing error on: "<<line<<endl;
else {
getline (sst, ans);
q.answers[id] = ans;
}
}
return is;
}
};
int main()
{
ifstream readFile("questions.txt");//file stream
vector<Question> questions((istream_iterator<Question>(readFile)), istream_iterator<Question>());
}
There are two issues with your code: skipping the first answer and reading through the end of the file.
In this pair of loops:
while (getline(is, line) && line.empty()) // skip leading blank lines
;
while (getline(is,line) && !line.empty()) // read until blank line
{
the first non-empty line will terminate the first loop, but then you immediately call getline() again without having processed that line. This skips the first answer choice. You'll want to make sure that you don't call getline() again before processing it. Something like...
// skip leading blank lines
while (getline(is, line) && line.empty()) {
;
}
for (; is && !line.empty(); getline(is, line)) {
// ...
}
But the second and bigger problem is if you read through the end of the file (as your code does right now) the last operator>> will cause the istream to eof(), which will disregard the last Question that you have streamed. This is tricky since you have a variable-length input stream - we don't know when we've run out of input until we've actually run out of input.
Thankfully, we can do everything quite a bit simpler. First, instead of reading off the end of the input to trigger the error, we'll use the first read to cause us to stop:
friend std::istream &operator>>(std::istream &is, Question &q) {
if (!getline(is, q.question, '?')) { // stops at '?'
return is;
}
This way, if we hit EOF early, we stop early. For the rest, we can simplify the reading greatly by using skipws. Instead of manually looping through the empty lines (which is hard to do right, as per your initial bug), we can let operator>> do this for us by just skipping ahead.
When we run out of things to read, we just back out of the error flags - since we don't want fail() (if we try to read the next index and it's actually the next question) or eof() (we're done) triggered.
Altogether:
friend std::istream &operator>>(std::istream &is, Question &q) {
if (!getline(is, q.question, '?')) { // stops at '?'
return is;
}
is >> q.correctIndex;
int id;
char pt;
string ans;
is >> skipws;
while (is >> id >> pt && getline(is, ans)) {
q.answers[id] = ans;
}
// keep the bad bit, clear the rest
is.clear(is.rdstate() & ios::badbit);
return is;
}
Now that's also a little incomplete. Perhaps you want to indicate error if you don't read into answers anything that matched correctIndex? In that case, you would set the ios::failbit too.
First improvement
When the operator>> is used for a string, it stops at the first blank separator. So for reading correctly the question you should consider:
friend std::istream &operator>>(std::istream &is, Question &q) {
getline(is, q.question, '?'); // stops at '?'
return is>> q.correctIndex;
... // to do: for the answers (see below)
}
You could consider a similar approach, for reading each question, starting with its id. Unfortunately, using operator>> on int will not allow us to detect the last answer: the reading attempt would fail with the start of a non-numeric text for the next question.
The problem with the format
The format that you use has some ambiguities:
Are blank lines mandatory, marking the beginning and end of the answers? In this case the last question is invalid (the end-of-answers marker is missing).
Or are the blank lines optional, to be ignored? In this case, the first char determines whether it's the start of a new question (non-numeric) or a new answer (numeric).
Or is it always expected that there are exactly 4 answers per question?
Alternative 1: a blank line marks end of question
The idea is to read line by line and parsing each line separately:
...
string line;
while (getline(is, line) && line.empty()) // skip leading blank lines
;
do // read until blank line
{
int id;
string ans;
char pt;
stringstream sst(line); // parse the line;
sst>>id>>pt; // take number and the following point
if (!sst || id==0 || pt!='.')
cout << "parsing error on: "<<line<<endl;
else {
getline (sst, ans);
q.answers[id] = ans;
}
} while (getline(is, line) && !line.empty()); // read the next line exactly once per iteration
Attention: with this approach, a missing blank line after the last question's answers causes the reading of that question to fail. Ideally, you'd issue an error message to clarify (e.g. unexpected end of file). Adding an empty line (terminated by a newline) at the end of the input file fixes it.
Alternative 2: test first char of line to see if it's still next answer
The other alternative peeks the first character to read in order to check if it is an answer (starts with a digit), an empty line (to be skipped) and if not, it exits the loop.
...
string line, ans;
int c, id;
char pt;
while ((c = is.peek())!=EOF && (isdigit(c) || c=='\n')) { // test first char without reading it
getline(is, line);
if (!line.empty()) {
stringstream sst(line);
... // parse the line as above
}
}
}
With this option, the requirement is that each answer ends with a newline (i.e. a trailing '\n'). An unfinished line interrupted by EOF will cause the last question to be ignored as failed.

How exactly does the extract>> operator work in C++

I am a computer science student, and so do not have much experience with the C++ language (considering it is my first semester using this language), or coding for that matter.
I was given an assignment to read integers from a text file in the simple form of:
19 3 -2 9 14 4
5 -9 -10 3
.
.
.
This sent me off on a journey to understand I/O operators better, since I am required to do certain things with this stream (duh).
I was looking everywhere and could not find a simple explanation of how the extract>> operator works internally. Let me clarify my question:
I know that the extractor>> operator extracts one continuous element until it hits a space, tab, or newline. What I am trying to figure out is where the pointer(?) or read-location(?) is AFTER it extracts an element. Will it be on the last char of the element just removed, or was that removed and therefore gone? Will it be on the space/tab/'\n' character itself? Perhaps at the beginning of the next element to extract?
I hope I was clear enough. I lack all the appropriate jargon to describe my problem clearer.
Here is why I need to know this: (in case anyone is wondering...)
One of the requirements is to sum all integers in each line separately.
I have created a loop to extract all integers one-by-one until it reaches the end of the file. However, I soon learned that the extract>> operator ignores space/tab/newline. What I want to try is to extract>> an element, and then use inputFile.get() to get the space/tab/newline. Then, if it's a newline, do what I gotta do.
This will only work if the stream pointer will be in a good position to extract the space/tab/newline after the last extraction>>.
In my previous question, I tried to solve it using getline() and a stringstream.
SOLUTION:
For the sake of answering my specific question, of how operator>> works, I had to accept Ben Voigt's answer as the best one.
I have used the other solutions suggested here (using a stringstream for each line) and they did work! (You can see it in my previous question's link.) However, I implemented another solution using Ben's answer and it also worked:
.
.
.
if(readFile.is_open()) {
while (readFile >> newInput) {
char isNewLine = readFile.get(); //get() the next char after extraction
if(isNewLine == '\n') //This is just a test!
cout << isNewLine; //If it's a newline, feed a newline.
else
cout << "X" << isNewLine; //Else, show X & feed a space or tab
lineSum += newInput;
allSum += newInput;
intCounter++;
minInt = min(minInt, newInput);
maxInt = max(maxInt, newInput);
if(isNewLine == '\n') {
lineCounter++;
statFile << "The sum of line " << lineCounter
<< " is: " << lineSum << endl;
lineSum = 0;
}
}
.
.
.
Regardless of my numerical values, the form is correct! Both spaces and '\n's were caught.
Thank you Ben Voigt :)
Nonetheless, this solution is very format dependent and very fragile. If any of the lines has anything else before '\n' (like a space or tab), the code will miss the newline char. Therefore, the other solution, using getline() and stringstreams, is much more reliable.
After extraction, the stream pointer will be placed on the whitespace that caused extraction to terminate (or on whatever other character could not be part of the value; failbit is set if no value could be parsed at all).
This doesn't really matter though, since you aren't responsible for skipping over that whitespace. The next extraction will ignore whitespaces until it finds valid data.
In summary:
leading whitespace is ignored
trailing whitespace is left in the stream
There's also the noskipws modifier which can be used to change the default behavior.
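A quick way to see where the read position ends up is to peek at the stream right after one extraction. The sketch below uses a hypothetical helper, `char_after_first_int`, and an in-memory std::istringstream instead of a file.

```cpp
#include <sstream>
#include <string>

// Extracts one int from text, then reports the character sitting at the
// read position without consuming it. Returns the traits' eof value
// (EOF for char streams) if nothing follows the number.
int char_after_first_int(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;        // skips leading whitespace, stops after the digits
    return in.peek();   // the terminating character is still in the stream
}
```

For an input such as "19 3" the helper should report the space, and for "14\n5" the newline, which is why a following get() can be used to detect the end of a line as long as nothing else trails the number.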
The operator>> leaves the current position in the file one
character beyond the last character extracted (which may be at
end of file). Which doesn't necessarily help with your problem;
there can be spaces or tabs after the last value in a line. You
could skip forward reading each character and checking whether
it is a white space other than '\n', but a far more idiomatic
way of reading line oriented input is to use std::getline to
read the line, then initialize an std::istringstream to
extract the integers from the line:
std::string line;
while ( std::getline( source, line ) ) {
std::istringstream values( line );
// ...
}
This also ensures that in case of a format error in the line,
the error state of the main input is unaffected, and you can
continue with the next line.
According to cppreference.com the standard operator>> delegates the work to std::num_get::get. This takes an input iterator. One of the properties of an input iterator is that you can dereference it multiple times without advancing it. Thus when a non-numeric character is detected, the iterator will be left pointing to that character.
In general, the behavior of an istream is not set in stone. There are several flags that change how an istream behaves. In general, you should not really care where the internal pointer is; that's why you are using a stream in the first place. Otherwise you'd just dump the whole file into a string or equivalent and manually inspect it.
Anyway, going back to your problem, a possible approach is to use std::getline to extract each line into a string. You can then either inspect the string manually, or wrap it in a stringstream and extract tokens from there.
Example:
std::ifstream ifs("myFile");
std::string str;
while ( std::getline(ifs, str) ) {
std::stringstream ss( str );
double sum = 0.0, value;
while ( ss >> value ) sum += value;
// Process sum
}

Error reading and printing a text file with C++

I have a bug in my code (the code is at the end of the question). The purpose of my C++ executable is to read a file that contains numbers, copy them into a std::vector and then just print the contents to stdout. Where is the problem? (atoi?)
I have a simple text file that contains the following numbers (each line has one number)
mini01:algorithms ios$ cat numbers.txt
1
2
3
4
5
When I execute the program I receive one more line:
mini01:algorithms ios$ ./a.out
1
2
3
4
5
0
Why do I get a 6th line in stdout?
#include <cstdlib>
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
using namespace std;
void algorithm(std::vector<int>& v) {
for(int i=0; i < v.size(); i++) {
cout << v[i] << endl;
}
}
int main(int argc, char **argv) {
string line;
std::vector<int> vector1;
ifstream myfile("numbers.txt");
if ( myfile.is_open()) {
while( myfile.good() )
{
getline(myfile, line);
vector1.push_back(atoi(line.c_str()));
}
myfile.close();
}
else {
cout << "Unable to open file" << endl;
}
algorithm(vector1);
return 0;
}
You should not use while (myfile.good()), as it will loop one time too many.
Instead use
while (getline(...))
The reason you can't use the flags as the loop condition is that they don't get set until after an input/output operation notices the problem (error or end-of-file).
Don't use good() as the condition of your extraction loop. It does not accurately indicate whether the next read will succeed or not. Move your call to getline into the condition:
while(getline(myfile, line))
{
vector1.push_back(atoi(line.c_str()));
}
The reason it is failing in this particular case is because text files typically have an \n at the end of the file (that is not shown by text editors). When the last line is read, this \n is extracted from the stream. Yes, that may be the very last character in the file, but getline doesn't care to look any further than the \n it has extracted. It's done. It does not set the EOF flag or do anything else to cause good() to return false.
So at the next iteration, good() is still true, the loop continues and getline attempts to extract from the file. However, now there's nothing left to extract and you just get line set to an empty string. This then gets converted to an int and pushed into the vector1, giving you the extra value.
In fact, the only robust way to check if there is a problem with extraction is to check the stream's status bits after extracting. The easiest way to do this is to make the extraction itself the condition.
You read one line too many, since the while condition only becomes false AFTER you have had a "bad read".
Welcome to the wonderful world of C++. Before we get to the bug, I would advise you to drop the std:: qualification when declaring a vector, as you already have
using namespace std;
A second piece of advice would be to use the pre-increment operator ++i instead of i++ wherever feasible.
Coming to your problem in itself, the issue is an empty new line being read at the end of file. A simple way to avoid this would be to check the length of line before using it.
getline(myfile, line);
if (line.size()) {
vector1.push_back(atoi(line.c_str()));
}
This would enable your program to read a file interspersed with empty lines. To be further foolproof, you can check the line for any non-numeric characters before using atoi on it. However, the best solution, as mentioned, is to move the getline call into the loop condition.

getline seems to not be working correctly

Please tell me what I am doing wrong here. What I want to do is this:
I have a txt file with four numbers, each of which has 15 digits:
std::ifstream file("numbers.txt",std::ios::binary);
I'm trying to read those numbers into my array:
char num[4][15];
And what I think I'm doing is: for as long as the end of the file isn't reached, write every line (max 15 chars, ending at '\n') into num[lines]. But this somehow doesn't work. Firstly, it reads only the first number correctly; the rest are just "" (empty strings). Secondly, file.eof() doesn't seem to work correctly either. With the txt file shown below this code, lines reached 156. What's going on?
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(num[lines],15,'\n');
}
So the whole "routine" looks like this:
int main()
{
std::ifstream file("numbers.txt",std::ios::binary);
char numbers[4][15];
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(numbers[lines],15,'\n');// sizeof(numbers[0])
}
}
This is contents of my txt file:
111111111111111
222222222222222
333333333333333
444444444444444
P.S.
I'm using VS2010 sp1
Do not use the eof() function! The canonical way to read lines is:
while( getline( cin, line ) ) {
// do something with line
}
file.getline() extracts 14 characters, filling in num[0][0] .. num[0][13]. Then it stores a '\0' in num[0][14] and sets the failbit on file because that's what it does when the buffer is full but terminating character not reached.
Further attempts to call file.getline() do nothing because failbit is set.
Tests for !file.eof() return true because the eofbit is not set.
Edit: to give a working example, best is to use strings, of course, but to fill in your char array, you could do this:
#include <iostream>
#include <fstream>
int main()
{
std::ifstream file("numbers.txt"); // not binary!
char numbers[4][16]={}; // 16 to fit 15 chars and the '\0'
for (unsigned lines = 0;
lines < 4 && file.getline(numbers[lines], 16);
++lines)
{
std::cout << "numbers[" << lines << "] = " << numbers[lines] << '\n';
}
}
tested on Visual Studio 2010 SP1
According to the ifstream documentation, reading stops either after n-1 characters have been read or when the delimiter is found, so the first read takes only 14 bytes.
It reads bytes: '1' (the character) is 0x31, so your buffer will be filled with 0x31 rather than the number 1 you seem to expect, and the last character will be 0 (the end of the C string).
Side note, your code doesn't check that lines doesn't go beyond your array.
Using getline supposes you're expecting text and you open the file in binary mode : seems wrong to me.
It looks like the '\n' at the end of the first line is not being consumed and remains in the buffer, so the next getline() reads it.
Try adding a file.get() after each getline().
If one file.get() does not work, try two, because on Windows lines end with '\r\n'.
Change it to the following:
#include <fstream>
int main()
{
//no need to use std::ios_base::binary since it's ASCII data
std::ifstream file("numbers.txt");
//allocate one more position in array for the NULL terminator
char numbers[4][16];
//you only have 4 lines, so don't test for EOF since that will cause an extra read
//which will then cause an extra loop iteration, causing undefined behavior
for (unsigned lines = 0; lines < 4; ++lines)
{
//copy into your buffer that also includes space for a terminating null
//placing in if-statement checks for the failbit of ifstream
if (!file.getline(numbers[lines], 16,'\n'))
{
//make sure to place a terminating NULL in empty string
//since the read failed
numbers[lines][0] = '\0';
}
}
}