How do I obtain the column of my CSV File? - c++

I'm trying to obtain the last column of my CSV file. I tried using getline and stringstream, but the result is not just the last column:
stringstream lineStream(line);
string bit;
while (getline(inputFile, line))
{
stringstream lineStream(line);
bit = "";
getline(lineStream, bit, ',');
getline(lineStream, bit, '\n');
getline(inputFile, line);
stringVector.push_back(bit);
}
My CSV file:
5.1,3.5,1.4,0.2,no
4.9,3.0,1.4,0.2,yes
4.7,3.2,1.3,0.2,no
4.6,3.1,1.5,0.2,yes
5.0,3.6,1.4,0.2,no
5.4,3.9,1.7,0.4,yes

Probably the simplest approach is to use std::string::rfind as follows:
while (std::getline(inputFile, line))
{
// Find position after last comma, extract the string following it and
// add to the vector. If no comma found and non-empty line, treat as
// special case and add that too.
std::string::size_type pos = line.rfind(',');
if (pos != std::string::npos)
stringVector.push_back(line.substr(pos + 1));
else if (!line.empty())
stringVector.push_back(line);
}

Related

StringStream input with comma delimited string - know columns apriori

I have a CSV file that I'd like to tokenize line by line with stringstream. The key is that I know a priori what the columns look like. For example, say I know the file looks like the following:
StrHeader,IntHeader
abc,123
xyz,456
I know ahead of time it is a string column, followed by an int column.
Common approach is to read the file line by line
std::string line;
stringstream lineStream;
while (getline(infile, line)) // read line by line
{
cout << "line " << line << endl;
lineStream << line;
string token;
while(getline(lineStream, token, ',')) // push into vector? this is not ideal
{
}
}
I know I can use two loops, with the inner loop tokenizing the string on commas. Lots of sample code on Stack Overflow stores the result in a vector<string>.
I don't want to create a new vector for every line. Since I know a priori which columns the file has, can I somehow read directly into a string and an int variable? Like this:
std::string line;
stringstream lineStream;
while (getline(infile, line)) // read line by line
{
cout << "line " << line << endl;
lineStream << line; // DOESNT WORK - tell lineStream we have comma delimited string
string strValue;
int intValue;
lineStream >> strValue >> intValue; // SO MUCH CLEANER
// call foo(strValue, intValue);
}
The problem above is this line
lineStream << line; // DOESNT WORK - tell lineStream we have comma delimited string
From what I can tell, the above code works if the input line is space delimited, not comma delimited.
I have no control over the input, so simply replacing the commas with spaces in the original string is not an ideal solution, since I don't know whether the input already contains spaces.
Any ideas? thanks
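You could read only up to the delimiter with std::getline() and then put the rest of the line into a string stream for conversion.
std::string strValue, line;
std::istringstream strstr;
// Test the reads themselves rather than eof(), which only becomes true after a read has already failed.
while (std::getline(infile, strValue, ',') && std::getline(infile, line)) {
strstr.clear(); // reset error flags left over from the previous row
strstr.str(line);
int intValue;
if (strstr >> intValue) { // false for the header row, whose second field is not a number
foo(strValue, intValue);
}
}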
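The same idea can also be applied per line, which keeps the outer line-by-line loop from the question. This is only a sketch: parseRow is a hypothetical helper, and it assumes exactly two columns (a string followed by an int) with no quoted or escaped commas.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: splits a line such as "abc,123" into a string and an int.
// Returns false if the line is empty or the second field is not an integer.
bool parseRow(const std::string& line, std::string& strValue, int& intValue)
{
    std::istringstream lineStream(line);
    // Read everything up to the first comma; spaces inside the field are kept.
    if (!std::getline(lineStream, strValue, ','))
        return false;
    // Formatted extraction skips leading whitespace and converts the number.
    return static_cast<bool>(lineStream >> intValue);
}
```

Inside while (getline(infile, line)) one could then write if (parseRow(line, strValue, intValue)) foo(strValue, intValue); which also skips the header row, because "IntHeader" does not parse as an int.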

c++ read in multiple lines, varied delimiters

The input file is structured like:
First Last,33,Male,city
score1,15/30
score2, 20/20
First Last,43,Female,city
score1,20/20
score2,18/20
with an unknown number of records, each separated by a blank line. Each record becomes an object to be stored in a dynamic array object (which is confusing in its own right).
I can place the first line into variables, but the remaining lines are lost in the ether. Although it produces the proper number of records (in this case, it would be 2), each record is filled with the first line's data and no scores.
I just have no ideas about how to read it in properly, and still don't know how to deal with that blank line in between each record. This was for a previous homework for which I cannot get a straight answer out of anybody, and would love to know, since reading in from files seems to be all the rage...
Here is what I have:
std::ifstream read_data(data_file);
std::string line;
while(std::getline(read_data, line))
{
std::stringstream ss(line);
char detectNewline;
getline(ss, tempName, ',');
getline(ss, tempAgeString, ',');
tempAge = std::atoi(tempAgeString.c_str());
getline(ss, tempGender, ',');
getline(ss, tempCity, '\n');
for(int i=0; i < 2; i++) // I am not married to this idea, seems efficient
{
getline(ss, tempScore, ',');
getline(ss, pointsEarnedHolder, '/');
tempPointsEarned += std::atof(pointsEarnedHolder.c_str());
getline(ss, totalPointsHolder, '\n');
tempTotalPoints += std::atof(totalPointsHolder.c_str());
}
// variable manipulation
ClassName object(proper vars);
previouslyDeclaredDynamicArrayObject(object);
detectNewline = read_data.peek();
if(detectNewline == '\n')
{
std::cin.ignore();
}
} //while
Thank you for any insight!
I will touch on a way to read the information efficiently.
First you can getline the first line and parse the information like you have. Then you will parse the information provided in the scores until getline hits a blank line. Then once this happens you will add the object into the array, and get the starting information for the next object and then repeat the process.
The code would look similar to this (pretty pseudo-y):
while (std::getline(read_data, line)) {
if (line.empty()) continue; // skip the blank line between records
// First line of a record: name,age,gender,city
std::stringstream ss(line);
getline(ss, tempName, ',');
getline(ss, tempAgeString, ',');
tempAge = std::atoi(tempAgeString.c_str());
getline(ss, tempGender, ',');
getline(ss, tempCity);
// Score lines follow until a blank line or the end of the file.
// std::getline strips the newline, so a blank line arrives as an empty string, not "\n".
tempPointsEarned = 0;
tempTotalPoints = 0;
while (std::getline(read_data, line) && !line.empty()) {
std::stringstream scoreStream(line); // parse the score line, not the first line
getline(scoreStream, tempScore, ',');
getline(scoreStream, pointsEarnedHolder, '/');
tempPointsEarned += std::atof(pointsEarnedHolder.c_str());
getline(scoreStream, totalPointsHolder);
tempTotalPoints += std::atof(totalPointsHolder.c_str());
}
// variable manipulation
ClassName object(proper vars);
previouslyDeclaredDynamicArrayObject(object);
} //while
This is assuming that you are properly extracting the information from the lines.
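To make the control flow concrete, here is a compilable sketch of the same loop. Record and readRecords are hypothetical stand-ins for the asker's class and dynamic array, and the sketch assumes records are separated by exactly the blank lines described in the question.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Record {            // hypothetical stand-in for the question's class
    std::string name;
    int age = 0;
    std::string gender;
    std::string city;
    double pointsEarned = 0;
    double totalPoints = 0;
};

std::vector<Record> readRecords(std::istream& in)
{
    std::vector<Record> records;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;           // blank separator between records

        // First line of a record: name,age,gender,city
        Record r;
        std::istringstream header(line);
        std::string ageText;
        std::getline(header, r.name, ',');
        std::getline(header, ageText, ',');
        r.age = std::atoi(ageText.c_str());
        std::getline(header, r.gender, ',');
        std::getline(header, r.city);

        // Score lines ("score1,15/30") run until a blank line or end of input.
        while (std::getline(in, line) && !line.empty()) {
            std::istringstream score(line);
            std::string label, earned, total;
            std::getline(score, label, ',');
            std::getline(score, earned, '/');
            std::getline(score, total);
            r.pointsEarned += std::atof(earned.c_str());
            r.totalPoints += std::atof(total.c_str());
        }
        records.push_back(r);
    }
    return records;
}
```

Using std::vector here is a substitution for the hand-written dynamic array in the assignment; the reading logic does not depend on it.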
An easier way to delimit upon those characters is to classify them as whitespace through the std::ctype<> facet of the stream's locale. Then you can simply use the extractor operator>>() instead of parsing through the unformatted functions.
Here's an example of how to set the facet:
struct ctype : std::ctype<char>
{
static mask* make_table()
{
const mask* table = classic_table();
static std::vector<mask> v(table, table + table_size);
v[' '] |= space;
v[','] |= space;
v['/'] |= space;
return &v[0];
}
ctype() : std::ctype<char>(make_table()) { }
};
You can then make a convenience function to install it into the stream's locale:
std::istream& custom_ctype(std::istream& is)
{
is.imbue(std::locale(is.getloc(), new ctype));
return is; // return the stream itself so the function works as a manipulator
}
// ...
file >> custom_ctype;
After that it becomes simple to extract characters from the file into variables. Just think of the characters that you want to ignore as if they were the space character or the newline character, because that's exactly what we've done here.

Class stringstream. Can't understand how lineStream works, its parameters

I have the following code, and I know what it does overall, but I don't fully understand how it works. Specifically, I don't understand these three lines:
std::stringstream lineStream(line);
std::string cell;
std::getline(lineStream, cell, ';');
Especially the lineStream one.
I found them through Google, but without a sufficient explanation. Could you please explain their behavior or share a good link? Thanks in advance, have a nice day :)
container *begin = new container;
begin->beginBox = new box;
container *last = NULL;
std::ifstream data(filename);
std::string line;
std::getline(data, line);
for (container *i = begin; !data.eof() && std::getline(data, line);)
{
std::stringstream lineStream(line);
std::string cell;
std::getline(lineStream, cell, ';');
i->ID = atoi(cell.c_str());
for (box *j = i->beginBox; std::getline(lineStream, cell, ';'); j->next = new box, j = j->next)
{
j->apples = atoi(cell.c_str());
i->lastBox = j;
}
i->lastBox->next = NULL;
i->nextCont = new container(), last = i, i = i->nextCont, i->beginBox = new box;
}
setAutoIncrement(begin->ID + 1);
last->nextCont = NULL;
return begin;
std::stringstream lineStream(line);
This declares a variable called lineStream of type std::stringstream and passes the line string to its constructor. The std::stringstream type wraps a string with a stream interface. It means you can treat it like cout and cin, using << and >> to insert things into and extract things from the string. Here, lineStream is being created so you can later extract its contents using std::getline.
std::string cell;
This just declares an empty std::string called cell.
std::getline(lineStream, cell, ';');
The function std::getline takes the stream that it will extract a line from as its first argument. The second argument is a std::string that it will extract the line into. Without a third argument, a "line" ends at the next newline character. By passing a third argument, this code makes a line end at ; instead. So this call to std::getline extracts everything from the stream up to the next ; character and puts that content into cell. The ; character itself is then discarded.
This is all very similar to the above code:
std::ifstream data(filename);
std::string line;
std::getline(data, line);
Here, the stream is a file stream instead of a string stream, and std::getline will extract everything up to a newline character because no third argument is given.
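A small sketch may make the behavior easier to see in isolation. It assumes a line such as "7;3;12"; the function name splitCells is hypothetical. Each call to std::getline returns the text up to the next ';', and the loop ends once the string stream is exhausted.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a line such as "7;3;12" into its delimiter-separated cells.
std::vector<std::string> splitCells(const std::string& line, char delimiter = ';')
{
    std::vector<std::string> cells;
    std::stringstream lineStream(line);   // wrap the string in a stream interface
    std::string cell;
    // Each call extracts up to (and discards) the next delimiter.
    while (std::getline(lineStream, cell, delimiter))
        cells.push_back(cell);
    return cells;
}
```

For "7;3;12" this produces the three cells "7", "3" and "12"; the code in the question does the same thing but converts the first cell to an ID and the rest to apple counts.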

How to read a file word by word and find the position of each word?

I'm trying to read a file word by word and do some processing on each word. Later I want to know the position of each word, meaning its line number and its character position within that line. If the character position is not available, I at least need to know, while reading the file, when I move to the next line. This is the sample code I have now:
string tmp;
while(fin>>tmp){
mylist.push_back(tmp);
}
I need to know when fin is going to next line?!
"I need to know when fin is going to next line"
This is not possible with the stream's operator>> alone, because it skips newlines like any other whitespace. You can read the input line by line and process each line separately using a temporary istringstream object:
std::string line, word;
while (std::getline(fin, line)) {
// skip empty lines:
if (line.empty()) continue;
std::istringstream lineStream(line);
for (int wordPos = 0; lineStream >> word; wordPos++) {
...
mylist.push_back(word);
}
}
Just don't forget to #include <sstream>.
One simple way to solve this problem would be using std::getline, run your own counter, and split line's content into words using an additional string stream, like this:
string line;
int line_number = 0;
for (;;) {
if (!getline(fin, line)) {
break;
}
istringstream iss(line);
string tmp;
while (iss >> tmp) {
mylist.push_back(tmp);
}
line_number++;
}

Reading a file where middle name is optional

I'm trying to read in a file formatted as
firstName middleName(optional) lastName petName\n
With the middle name being there on half the entries, I'm unsure as to the best way to read these in and get them into the correct variable names. Any help would be greatly appreciated.
You could do something like this:
std::string line, word;
while (std::getline(myFile, line)) {
if (line.empty()) continue;
// read words from line:
std::istringstream is(line);
std::vector<std::string> words;
words.reserve(4);
for (int i = 0; i < 4 && is >> word; i++) // extract into word, not into the vector
words.push_back(word);
if (words.size() == 4) {
// middle name was present ...
} else {
// it was not ...
}
}
If only middleName is optional, you can split the line and keep the words in a std::vector<std::string>. If the size of the vector is 4, the middleName is present; if it is 3, it is not.
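Put together, the size check could look like the sketch below. Entry and parseEntry are hypothetical names, and the sketch assumes that none of the names contains a space, so a line always has exactly three or four words.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Entry {             // hypothetical record type
    std::string firstName, middleName, lastName, petName;
};

// Returns false unless the line has exactly 3 or 4 whitespace-separated words.
bool parseEntry(const std::string& line, Entry& out)
{
    std::istringstream is(line);
    std::vector<std::string> words;
    std::string word;
    while (is >> word)
        words.push_back(word);

    if (words.size() == 4) {
        // first middle last pet
        out = Entry{words[0], words[1], words[2], words[3]};
    } else if (words.size() == 3) {
        // first last pet: leave middleName empty
        out = Entry{words[0], "", words[1], words[2]};
    } else {
        return false;
    }
    return true;
}
```

Rejecting lines with any other word count makes malformed input visible instead of silently shifting fields.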