C++: Getline stops reading at first whitespace

Basically my issue is that I'm trying to read in data from a .txt file that's full of numbers and comments and store each line into a string vector, but my getline function stops reading at the first whitespace character so a comment like (* comment *) gets broken up into
str[0] = "(*";
str[1] = "comment";
str[2] = "*)";
This is what my codeblock for the getline function looks like:
int main() {
    string line;
    string fileName;
    cout << "Enter the name of the file to be read: ";
    cin >> fileName;
    ifstream inFile{fileName};
    istream_iterator<string> infile_begin {inFile};
    istream_iterator<string> eof{};
    vector<string> data {infile_begin, eof};
    while (getline(inFile, line))
    {
        data.push_back(line);
    }
And this is what the .txt file looks like:
101481
10974
1013
(* comment *) 0
28292
35040
35372
0000
7155
7284
96110
26175
I can't figure out why it's not reading the whole line.

This is for the very simple reason that your code is not using std::getline to read the input file.
If you look at your code very carefully, you will see that before you even get to that point, your code constructs an istream_iterator<string> on the file, and by passing it, and the ending istream_iterator<string> value to the vector's constructor, this effectively swallows the entire file, one whitespace-delimited word at a time, into the vector.
And by the time things get around to the getline loop, the entire file has already been read, and the loop does absolutely nothing. Your getline isn't really doing anything, with the current state of affairs.
Get rid of that stuff that involves istream_iterators, completely, and simply let getline do the job it was intended for.

Related

How do I deal with a carriage return line feed when trying to read in file

So I am working on a file that I need to read in, which contains both commas separating words and a carriage return line feed at the end of each line, and I can't figure out a way to handle it. I am trying to read in each word before the comma and put it into a vector until it hits the carriage return line feed, but I am having problems.
Here is my text file (as seen on notepad++ so you can see the symbols. on the actual text, the things inside [] don't appear)
microwave,lamp,guitar,couch,bed,dog,cat[cr][lf]
P1:microwave,couch,bed,dog,chair,bookcase,fish[cr][lf]
I have tried multiple solutions, but nothing seems to work. Below is what I have tried so far, but it obviously isn't working. I have seen some users suggest using substring to somehow read up to the comma and read in the words, but I am not sure how to do that. I couldn't find a good tutorial or example of one. In my head, I have the algorithm (or at least the steps for how to go about it), but I am not sure how to implement it.
Import file (istream)
Read until comma, take string and place it in vector1 (getline, input, ,), vector.push_back(input)
Repeat previous step until you reach \cr\lf stop reading. (getline(input, '/r'))
move on to the next line
Read until comma, take string and place it in vector2
Repeat
Read the line until /cr/lf
Here is the code I put in practice using part of the above steps i made.
string input;
vector<string> v1;
vector<string> v2;
ifstream infile;
infile.open("example.txt");
while(getline(infile, input)) //read until end of line
{
    while(getline(infile, input, '\r')) //read until it reaches a carriage return
    {
        while(getline(infile, input, ',')) // read until it reaches a comma
        {
            v1.push_back(input); //take the word and put in vector.
        }
    }
}
infile.close();
Any help would be appreciated.
Edit: I forgot to mention. When I used this code, it seemed to not import anything into the vectors. I am sure all the words got lost somewhere in the getline functions, but I don't know how to just read up to comma and carriage return line feed without using it.
You should use getline() to get a whole line first. It should handle carriage returns for you. Then, put the result into a stringstream and use getline() on it to separate the line at the commas.
My code that reads input into a vector of vectors:
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
    std::ifstream fin("input.txt");
    std::vector<std::vector<std::string>> result;
    for(std::string line; std::getline(fin, line);)
    {
        result.emplace_back();
        std::stringstream ss(line);
        for(std::string word; std::getline(ss, word, ',');)
        {
            result.back().push_back(word);
        }
    }
    for(const auto &i : result)
    {
        for(const auto &j : i)
        {
            std::cout << j << ' ';
        }
        std::cout << '\n';
    }
}
You can modify it to read into two vectors by removing the outer loop and using a separate loop for each of the two vectors/lines.
In your code, you first have a loop that reads line by line until the end of the file. After you read a line, you have a loop that reads until a '\r', which as far as I know does not occur in a normal text file. Even if there are '\r's in the file, you would be overwriting what you just read in from the outer loop. Same thing with the loop inside that.
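One case worth checking: if a file with [cr][lf] line endings is read on a platform that does not translate line endings (or the stream is opened in binary mode), std::getline stops at the '\n' and leaves the '\r' at the end of each line. A minimal sketch of handling that, using a hypothetical helper named split_csv_line:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line at commas after dropping a trailing '\r'.
std::vector<std::string> split_csv_line(std::string line) {
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();  // left over when CRLF is read without newline translation
    }
    std::vector<std::string> words;
    std::istringstream ss(line);
    for (std::string word; std::getline(ss, word, ',');) {
        words.push_back(word);
    }
    return words;
}
```

Each line obtained from std::getline(fin, line) can be passed through this helper; lines without a trailing '\r' are split unchanged.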
Were you taught that while(getline(fin, str)) reads from a file without knowing how it works?

Reading / Writing Files for a calculator: atof error

I currently have a text file that is as follows:
12 6 4 9
It is a very simple text file since I want to just get one line working and then maybe expand to multiple lines later. Extra aside: this is for a RPN calculator I am working on.
I want to go through this text file character by character. The way I currently have it implemented is with a simple while loop:
string line;
while (!infile.eof()){
    getline(infile, line);
    if (isdigit(line[0])){
        rpn_stack.push_back(atof(line.c_str()));
    }
}
rpn_stack is a vector since I will not be using the built in stack libraries in C++.
The problem I am currently having is that the output is just outputting "12". Why is this?
Is there a way that I can traverse through the file character by character instead of reading as a line? Is it breaking because it finds a white space (would that be considered the EOF)?
EDIT:
The code has been rewritten to be as the following:
string line;
while (!infile.eof()){
    getline(infile, line);
    for (int i = 0; i < line.size(); i++){
        if (isdigit(line[i])){
            rpn_stack.push_back(atof(line.c_str()));
        }
    }
}
The output is 12, printed 5 different times, which is obviously wrong. Not only are there 4 items in the txt document, but only one of them is a 12. Can someone give some insight?
This will read as many doubles from infile as possible (i.e. until the end of file or until it comes across a token that isn't a double), separated by whitespace.
for (double d; infile >> d;)
    rpn_stack.push_back(d);
If you need to parse line by line, as @ooga says, you will need a two-stage reader that looks something like this:
for (std::string line; getline(infile, line);) {
    std::istringstream stream{line};
    for (double d; stream >> d;)
        rpn_stack.push_back(d);
}
Bonus hint: don't use .eof()

Missing line of data using Getline with Ifstream

OK, so this is killing me at the moment because it's such a simple part of my program that just doesn't want to work. I'm reading data from a text file to use in a GA.
The first getline() works perfectly, but the second one doesn't want to write any data into my string. When I cout the string it doesn't show anything.
Here is the code:
ifstream inFile;
inFile.open(fname.c_str());
char pop[20], mut[20];
inFile.getline(pop,20);
cout << pop;
inFile.getline(mut,20);
cout << mut; //this outputs nothing
Thanks for any help in advance.
A sample from my file (there is no blank line between them; the mutation line comes straight after the population line):
Population size: 30
Mutation: 20
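Your file's first line is 19 characters plus the line ending, and pop[20] can only contain 19 characters (because the last element is reserved for the null terminator '\0'). The line only just fits, so a single extra character, such as a trailing space or the '\r' of a Windows line ending, is enough to overflow the buffer.
When istream::getline stops because it has stored 20-1 characters without reaching the delimiter, it sets failbit and doesn't discard the new line delimiter (because it was never read). Until you call inFile.clear(), every later read on the stream fails, so the next getline extracts nothing.
That's why you get nothing in the second string.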
Your problem is that the length of your input line exceeds the length of the buffer which must hold it.
The solution is to not use character arrays. This is C++, use std::string!
std::ifstream inFile;
inFile.open(fname.c_str());
std::string pop;
std::getline(inFile, pop);
std::cout << pop << "\n";
std::string mut;
std::getline(inFile, mut);
std::cout << mut << "\n";
I think you need to find out what the problem is. Add error checking code to your getline calls, refactor the (simple) code into a (simple) function, with a (simple) unittest. Possibly, your second line is longer than the assumed 20 characters (null-term included!).
For an idea of what I mean, take a look at this snippet.
Try something like
while (getline(in,line,'\n')){
    //do something with line
}
or try something like
string text;
string temp;
ifstream file;
file.open ("test_text.txt");
while (!file.eof())
{
    getline (file, temp);
    text.append (temp); // Added this line
}

Tokenization of a text file with frequency and line occurrence. Using C++

Once again I ask for help. I haven't coded anything for some time!
Now I have a text file filled with random gibberish. I already have a basic idea on how I will count the number of occurrences per word.
What really stumps me is how I will determine what line the word is in. Gut instinct tells me to look for the newline character at the end of each line. However, I have to do this while going through the text file the first time, right? Since if I do it afterwards it will do no good.
I already am getting the words via the following code:
vector<string> words;
string currentWord;
while(!inputFile.eof())
{
    inputFile >> currentWord;
    words.push_back(currentWord);
}
This is for a text file with no set structure. Using the above code gives me a nice little(big) vector of words, but it doesn't give me the line they occur in.
Would I have to get the entire line, then process it into words to make this possible?
Use a std::map<std::string, int> to count the word occurrences -- the int is the number of times it exists.
If you need line by line input, use std::getline(std::istream&, std::string&), like this:
std::vector<std::string> lines;
std::ifstream file(...); // Fill in accordingly.
std::string currentLine;
while(std::getline(file, currentLine))
    lines.push_back(currentLine);
You can split a line apart by putting it into a std::istringstream first and then using operator>>. (Alternatively, you could cobble up some sort of splitter using std::find and other algorithmic primitives.)
EDIT: This is the same thing as in @dash-tom-bang's answer, but modified to be correct with respect to error handling:
vector< pair<string, int> > words; // each word paired with its line number
int currentLine = 1; // or 0, however you wish to count...
string line;
while (getline(inputFile, line))
{
    istringstream inputString(line);
    string word;
    while (inputString >> word)
        words.push_back(make_pair(word, currentLine));
    ++currentLine; // advance once per line read
}
Short and sweet.
vector< map< string, size_t > > line_word_counts;
string line, word;
while ( getline( cin, line ) ) {
    line_word_counts.push_back( map< string, size_t >() ); // empty map for this line
    map< string, size_t > &word_counts = line_word_counts.back();
    istringstream line_is( line );
    while ( line_is >> word ) ++ word_counts[ word ];
}
cout << "'Hello' appears on line 5 " << line_word_counts[5-1]["Hello"]
     << " times\n";
You're going to have to abandon reading into strings, because operator >>(istream&, string&) discards white space and the contents of the white space (== '\n' or != '\n', that is the question...) is what will give you line numbers.
This is where OOP can save the day. You need to write a class to act as a "front end" for reading from the file. Its job will be to buffer data from the file, and return words one at a time to the caller.
Internally, the class needs to read data from the file a block (say, 4096 bytes) at a time. Then a string GetWord() (yes, returning by value here is good) method will:
First, read any white space characters, taking care to increment the object's lineNumber member every time it hits a \n.
Then read non-whitespace characters, putting them into the string object you'll be returning.
If it runs out of stuff to read, read the next block and continue.
If you hit the end of file, the string you have is the whole word (which may be empty) and should be returned.
If the function returns an empty string, that tells the caller that the end of file has been reached. (Files usually end with whitespace characters, so reading whitespace characters cannot imply that there will be a word later on.)
Then you can call this method at the same place in your code as your cin >> line and the rest of the code doesn't need to know the details of your block buffering.
An alternative approach is to read things a line at a time. The member read functions that would work for you, such as istream::getline, require you to create a fixed-size buffer to read into beforehand, and if the line is longer than that buffer, you have to deal with it somehow, which could get more complicated than the class I described. The free function std::getline avoids this by reading into a std::string that grows as needed.

C++: Why does space always terminate a string when read?

Using type std::string to accept a sentence, for practice (I haven't worked with strings in C++ much) I'm checking if a character is a vowel or not. I got this:
for(i = 0; i <= analyse.length(); i++) {
if(analyse[i] == 'a' || analyse[i] == 'e' [..etc..]) {
...vowels++;
} else { ...
...consonants++;
}
This works fine if the string is all one word, but the second I add a space (IE: aeio aatest) it will only count the first block and count the space as a consonant, and quit reading the sentence (exiting the for loop or something).
Does a space count as no character == null? Or some oddity with std::string?, It would be helpful to know why that is happening!
EDIT:
I'm simply accepting the string through std::cin, such as:
std::string analyse = "";
std::cin >> analyse;
I'd guess you're reading your string with something like your_stream >> your_string;. Operator >> for strings is defined to work (about) the same as scanf's %s conversion, which reads up until it encounters whitespace -- therefore, operator>> does the same.
You can read an entire line of input instead with std::getline. You might also want to look at an answer I posted to a previous question (provides some alternatives to std::getline).
I can't tell from the code that you have pasted, but I'm going to go out on a limb and guess that you're reading into the string using the stream extraction operator (stream >> string).
The stream extraction operator stops when it encounters whitespace.
If this isn't what's going on, can you show us how you're populating your string, and what its contents are?
If I'm right, then you're going to want a different method of reading content into the string. std::getline() is probably the easiest method of reading from a file. It stops at newlines instead of at whitespace.
Edit based on edited question:
use this (doublecheck the syntax. I'm not in front of my compiler.):
std::getline(std::cin, analyse);
This ought to stop reading when you press "enter".
If you want to read in an entire line (including the blanks) then you should read using getline. Schematically it looks like this:
#include <string>
istream& std::getline( istream& is, string& s );
To read the whole line you do something like this:
string s;
getline( cin, s );
cout << "You entered " << s << endl;
PS: the word is "consonant", not "consenent".
The >> operator on an istream separates strings on whitespace. If you want to get a whole line, you can use getline(cin, destination_string).