Trouble with parsing text file based on commas (C++) - c++

I am working on a program that is supposed to read a text file (e.g. dog, buddy,,125,,,cat,,,etc...) line by line and parse it based on commas. This is what I have so far, but when I run it, nothing happens. I am not entirely sure what I'm doing wrong, and I am fairly new to the higher-level concepts.
#include <iostream>
#include <fstream>
#include <string>
#include <iomanip>
#include <cstdlib>
#include <sstream>
#include <vector>
using namespace std;
int main()
{
std::ifstream file_("file.txt"); //open file
std::string line_; //declare line_ as a string
std::stringstream ss(line_); //using line as stringstream
vector<string> result; //declaring vector result
while (file_.is_open() && ss.good())
{ //while the file is open and stringstream is good
std::string substr; //declares substr as a string
getline( ss, substr, ',' ); //getting the stringstream line_ and substr and parsing
result.push_back(substr);
}
return 0;
}

Did you forget to add a line like std::getline(file_, line_);? file_ is never read from, and line_ was still empty when it was copied into ss right after being declared.
Checking file_.is_open() in the loop condition does not help: its result cannot change inside the loop, because the file stays open (or stays unopened, if opening failed) until you close it.
Using good() as a loop condition is not a good idea either. eofbit is only set once an attempt is made to read past the end of the input (it is not set if a read stops exactly at the final delimiter), so if the input ends with a comma the loop runs one extra time and pushes an empty string. Instead, check the stream state after the extraction and before you use its result. A simple way is to use the getline() call itself as the loop condition: the function returns the stream, which converts to bool as !ss.fail(). That way, the loop body does not execute when the end of the input is reached without extracting any characters.
By the way, comments like //declaring vector result are pretty much useless, since they give no information that you can't easily see from the code.
My code:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sstream>
int main()
{
std::ifstream file("input.txt");
std::string line, word;
std::vector<std::vector<std::string>> result; //result[i][j] = the jth word in the input of the ith line
while(std::getline(file, line))
{
std::stringstream ss(line);
result.emplace_back();
while(std::getline(ss, word, ','))
{
result.back().push_back(word);
}
}
//printing results
for(auto &i : result)
{
for(auto &j : i)
{
std::cout << j << ' ';
}
std::cout << '\n';
}
}

Related

Finding certain characters in a line of string

I want to be able to find the strings that contain a certain character in a file that contains one string per line.
#include <iostream>
#include <fstream>
#include <string>
int main(){
string line;
ifstream infile;
infile.open("words.txt");
while(getline(infile, line,' ')){
if(line.find('z')){
cout << line;
}
}
}
That's my attempt at finding all the strings that contain the character z.
The text file contains random strings such as
fhwaofhz
cbnooeht
rhowhrj
perwqreh
dsladsap
zpuaszu
so with my implementation, it should only print out the strings with the character z in it. However, it seems to be reprinting out all the contents from the text file again.
Problem:
In your file the strings aren't separated by a space (' '), which is the delimiter you pass to getline; they are separated by an end-of-line character ('\n'), which is a different character. As a consequence, the first getline call puts everything into line. line then contains all the text in the file, including the z's, so all the content is printed. Finally, the code exits the while block after running once because the next getline reaches the end of the file and fails.
If you run this code
#include <iostream>
#include <fstream>
#include <string>
int main(){
std::string line;
std::ifstream infile;
infile.open("words.txt");
while(getline(infile, line,' ')){
std::cout << "Hi";
if(line.find('z')){
std::cout << line;
}
}
}
"Hi" will be only printed once. That is because the while block is only executed once.
Additionally, note that line.find('z') does not return 0 when no match is found; it returns npos. You can see this by running the following code:
#include <iostream>
#include <fstream>
#include <string>
int main(){
std::string line;
std::ifstream infile;
infile.open("words.txt");
while(getline(infile,line)){
std::cout << line.find('z');
if(line.find('z')){
std::cout << line << "\n";
}
}
}
Solution:
Use getline(infile,line), which is more suitable for this case, and replace if(line.find('z')) with if(line.find('z') != line.npos).
while(getline(infile,line)){
if(line.find('z') != line.npos){
std::cout << line << "\n";
}
}
If you need to put more than one string per line you can use the operator >> of ifstream.
Additional information:
Note that the code you posted won't compile because string, cout and ifstream are in the namespace std. It was probably part of a longer file that contained using namespace std;. If that is the case, be aware that this is widely considered bad practice.
Full code:
#include <iostream>
#include <fstream>
#include <string>
int main(){
std::string line;
std::ifstream infile;
infile.open("words.txt");
while(getline(infile,line)){
if(line.find('z') != line.npos){
std::cout << line << "\n";
}
}
}
getline extracts characters from the source and stores them into the variable line until the delimiter character is found. Your delimiter character is a space (' '), which isn't present in the file, so line will contain the whole file.
Try getline(infile, line, '\n') or simply getline(infile, line) instead.
The method find returns the index of the found character, where 0 is a perfectly valid index. If the character is not found, it returns npos. This is a special value which indicates "not found", and it is nonzero so that 0 can refer to a valid index. So the correct check is:
if (line.find('z') != string::npos)
{
// found
}

C++ Parsing a CSV file into vector of vectors: Losing string 1st character

I am reading a CSV file into vector of string vectors. I have written code below.
#include<iostream>
#include<fstream>
#include<string>
#include <vector>
#include <fstream>
#include <cmath>
#include <sstream>
using namespace std;
int main()
{
ifstream mesh;
mesh.open("mesh_reference.csv");
vector<vector<string> > point_coordinates;
string line, word;
while (getline(mesh,line))
{
stringstream ss(line);
vector<string> row;
while (getline(ss, word, ','))
{
row.push_back(word);
}
point_coordinates.push_back(row);
}
for(int i=0; i<point_coordinates.size(); i++)
{
for(int j=0; j<3; j++)
cout<<point_coordinates[i][j]<<" ";
cout<<endl;
}
return 0;
}
When I print out the vector of vectors, I see that I am losing the first character of the element at position 0 of each row. Basically, point_coordinates[0][0] is displaying 0.0001 while the string is supposed to be -0.0001. I am not able to understand the reason for this. Kindly help.
A typical output line is
.0131 -0.019430324 0.051801
Whereas the CSV data is
0.0131,-0.019430324,0.051801
SAMPLE CSV DATA FROM FILE
NODES__X,NODES__Y,NODES__Z
0.0131,-0.019430324,0.051801
0.0131,-0.019430324,0.06699588
0.0131,-0.018630324,0.06699588
0.0131,-0.018630324,0.051801
0.0131,-0.017630324,0.050801
0.0131,-0.017630324,0.050001
0.0149,-0.017630324,0.050001
0.0149,-0.019430324,0.051801
Although the problem is already solved, I would like to show you a solution that uses some modern C++ algorithms and eliminates minor issues.
Do not use using namespace std;.
No need for a separate file.open. The std::ifstream constructor opens the file for you, and the destructor closes it.
Check whether the file could be opened. std::ifstream overloads operator!, so you can do a boolean check.
Do not use int in for loops where you compare against .size(). Use size_t instead.
Always initialize all variables, even if there is an assignment in the next line.
For tokenizing you can use std::sregex_token_iterator. It has been designed for exactly this purpose.
In modern C++ you are encouraged to use algorithms.
Please see an improved version of your code below:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <algorithm>
#include <regex>
const std::regex comma(",");
int main()
{
// Open source file.
std::ifstream mesh("r:\\mesh_reference.csv");
// Here we will store the result
std::vector<std::vector<std::string>> point_coordinates;
// We want to read all lines of the file
std::string line{};
while (mesh && getline(mesh, line)) {
// Tokenize the line and store result in vector. Use range constructor of std::vector
std::vector<std::string> row{ std::sregex_token_iterator(line.begin(),line.end(),comma,-1), std::sregex_token_iterator() };
point_coordinates.push_back(row);
}
// Print result. Go through all lines and then copy line elements to std::cout
std::for_each(point_coordinates.begin(), point_coordinates.end(), [](std::vector<std::string> & vs) {
std::copy(vs.begin(), vs.end(), std::ostream_iterator<std::string>(std::cout, " ")); std::cout << "\n"; });
return 0;
}
Please consider whether you may want to use such an approach in the future.

How can I read from a file and sort them by category

I'm trying to read a bunch of words from a file and sort them by what kind of word they are (nouns, adjectives, verbs, etc.). For example:
-Nouns;
zyrian
zymurgy
zymosis
zymometer
zymolysis
-Verbs_participle;
zoom in
zoom along
zoom
zonk out
zone
I'm using getline to read until the delimiter ';' but how can I know when it read in a type and when it read in a word?
The function below stops right after "-Nouns;"
int main()
{
map<string,string> data_base;
ifstream source ;
source.open("partitioned_data.txt");
char type [MAX];
char word [MAX];
if(source) //check to make sure we have opened the file
{
source.getline(type,MAX,';');
while( source && !source.eof())//make sure we're not at the end of file
{
source.getline(word,MAX);
cout<<type<<endl;
cout<<word<<endl;
source.getline(type,MAX,';');//read the next line
}
}
source.close();
source.clear();
return 0;
}
I am not fully sure about the format of your input file. But you seem to have a file with lines, and in that, items separated by a semicolon.
Reading this should be done differently.
Please see the following example:
#include <iostream>
#include <string>
#include <sstream>
#include <fstream>
std::istringstream source{R"(noun;tree
noun;house
verb;build
verb;plant
)"};
int main()
{
std::string type{};
std::string word{};
//std::ifstream source{"partitioned_data.txt"};
if(source) //check that the stream is usable
{
std::string line{};
while(getline(source,line)) //the loop ends when no further line can be read
{
size_t pos = line.find(';');
if (pos != std::string::npos) {
type = line.substr(0,pos);
word = line.substr(pos+1);
}
std::cout << type << " --> " << word << '\n';
}
}
return 0;
}
There is no need for open and close statements. The constructor and destructor of std::ifstream will do that for us.
Do not check eof in the while statement.
Avoid C-style arrays like char type [MAX]; use std::string instead.
Read a line in the while condition so that the validity of the read is checked there. Then work on the line that was read.
Search for the ';' in the string and, if found, take out the substrings.
If I knew the format of the input file, I could write an even better example for you.
Since I do not have files on SO, I use a std::istringstream instead. But there is no difference compared to a file. Simply delete the std::istringstream and uncomment the ifstream definition in the source code.

How to read the whole lines from a file (with spaces)?

I am using STL. I need to read lines from a text file. How to read lines till the first \n but not till the first ' ' (space)?
For example, my text file contains:
Hello world
Hey there
If I write like this:
ifstream file("FileWithGreetings.txt");
string str("");
file >> str;
then str will contain only "Hello" but I need "Hello world" (till the first \n).
I thought I could use the method getline() but it demands to specify the number of symbols to be read. In my case, I do not know how many symbols I should read.
You can use getline:
#include <string>
#include <iostream>
int main() {
std::string line;
if (getline(std::cin,line)) {
// line is the whole line
}
}
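Using the getline function is one option.
Another is to use getc to read each character in a do-while loop. If the file consists of numbers, this would be a better way to read it. The fragment below assumes that in is an open FILE*, c is an int, and list is a container of int such as std::vector<int>, all declared elsewhere.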
do {
int item=0, pos=0;
c = getc(in);
while((c >= '0') && (c <= '9')) {
item *=10;
item += int(c)-int('0');
c = getc(in);
pos++;
}
if(pos) list.push_back(item);
}while(c != '\n' && !feof(in));
Try modifying this method if your file consists of strings.
Thanks to all of the people who answered me. I made new code for my program, which works:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(int argc, char** argv)
{
ifstream ifile(argv[1]);
// ...
while (!ifile.eof())
{
string line("");
if (getline(ifile, line))
{
// the line is a whole line
}
// ...
}
ifile.close();
return 0;
}
I suggest:
#include <fstream>
#include <string>
int main()
{
std::ifstream reader("FileWithGreetings.txt", std::ios_base::in); // file name taken from the question
if(reader){ // confirm stream is in a good state
std::string line;
while(std::getline(reader, line)){ // reads up to the next '\n' and discards it
// Then process line as described below
}
}
}
Any variable name will do for the std::string. Note that istream::read fills a char buffer of a fixed length rather than a std::string, so std::getline, as above, is the better fit when the line length is unknown.
To process the line, just use the iterators returned by begin() and end() of the std::string, and walk through it character by character until you reach the ' ' or other character you are looking for.

Reading line from text file and putting the strings into a vector?

I am trying to read each line of a text file, in which each line contains one word, and put those words into a vector. How would I go about doing that?
This is my new code: I think there is still something wrong with it.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int main()
{
std::string line;
vector<string> DataArray;
vector<string> QueryArray;
ifstream myfile("OHenry.txt");
ifstream qfile("queries.txt");
if(!myfile) //Always test the file open.
{
cout<<"Error opening output file"<<endl;
system("pause");
return -1;
}
while (std::getline(qfile, line))
{
QueryArray.push_back(line);
}
if(!qfile) //Always test the file open.
{
cout<<"Error opening output file"<<endl;
system("pause");
return -1;
}
while (std::getline(qfile, line))
{
QueryArray.push_back(line);
}
cout<<QueryArray[0]<<endl;
cout<<DataArray[0]<<endl;
}
Simplest form:
std::string line;
std::vector<std::string> myLines;
while (std::getline(myfile, line))
{
myLines.push_back(line);
}
No need for crazy c thingies :)
Edit:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib> // for system
int main()
{
std::string line;
std::vector<std::string> DataArray;
std::vector<std::string> QueryArray;
std::ifstream myfile("OHenry.txt");
std::ifstream qfile("queries.txt");
if(!myfile) //Always test the file open.
{
std::cout<<"Error opening output file"<< std::endl;
system("pause");
return -1;
}
while (std::getline(myfile, line))
{
DataArray.push_back(line);
}
if(!qfile) //Always test the file open.
{
std::cout<<"Error opening output file"<<std::endl;
system("pause");
return -1;
}
while (std::getline(qfile, line))
{
QueryArray.push_back(line);
}
std::cout<<QueryArray[20]<<std::endl;
std::cout<<DataArray[12]<<std::endl;
return 0;
}
Avoid using namespace std;. It is legal C++, but it drags every name from std into your scope and invites name clashes. Now compare what I wrote with what you wrote and try to find the differences. If you still have questions, come back.
@FailedDev did, indeed, list the simplest form. As an alternative, here is how I often code that loop:
std::vector<std::string> myLines;
std::copy(std::istream_iterator<std::string>(myfile),
std::istream_iterator<std::string>(),
std::back_inserter(myLines));
The entire program might look like this:
// Avoid "using namespace std;" at all costs. Prefer typing out "std::"
// in front of each identifier, but "using std::NAME" isn't (very) dangerous.
#include <iostream>
using std::cout;
using std::cin;
#include <fstream>
using std::ifstream;
#include <string>
using std::string;
#include <vector>
using std::vector;
#include <iterator>
using std::istream_iterator;
#include <algorithm>
using std::copy;
int main()
{
// Store the words from the two files into these two vectors
vector<string> DataArray;
vector<string> QueryArray;
// Create two input streams, opening the named files in the process.
// You only need to check for failure if you want to distinguish
// between "no file" and "empty file". In this example, the two
// situations are equivalent.
ifstream myfile("OHenry.txt");
ifstream qfile("queries.txt");
// std::copy(InputIt first, InputIt last, OutputIt out) copies all
// of the data in the range [first, last) to the output iterator "out"
// istream_iterator() is an input iterator that reads items from the
// named file stream
// back_inserter() returns an iterator that performs "push_back"
// on the named vector.
copy(istream_iterator<string>(myfile),
istream_iterator<string>(),
back_inserter(DataArray));
copy(istream_iterator<string>(qfile),
istream_iterator<string>(),
back_inserter(QueryArray));
try {
// use ".at()" and catch the resulting exception if there is any
// chance that the index is bogus. Since we are reading external files,
// there is every chance that the index is bogus.
cout<<QueryArray.at(20)<<"\n";
cout<<DataArray.at(12)<<"\n";
} catch(...) {
// deal with error here. Maybe:
// the input file doesn't exist
// the ifstream creation failed for some other reason
// the string reads didn't work
cout << "Data Unavailable\n";
}
}
Simplest version:
std::vector<std::string> lines;
for (std::string line; std::getline( ifs, line ); /**/ )
lines.push_back( line );
I'm omitting the includes and other gunk. My version is almost the same as FailedDev's but by using a 'for' loop I put the declaration of 'line' in the loop. This is not just a trick to reduce the line count. Doing this reduces the scope of line -- it disappears after the for loop. All variables should have the smallest scope possible, so therefore this is better. For loops are awesome.
A short version for C++11 and above. The vector is constructed directly from the file contents:
ifstream qfile("queries.txt");
vector<string> lines {
istream_iterator<string>(qfile),
istream_iterator<string>()
};
Note that this code will only work if the input file is in the format described by the OP, i.e. "each line contains one word", or if you set a special locale via qfile.imbue(), as mheyman kindly pointed out.