Reading a file where middle name is optional - c++

I'm trying to read in a file formatted as
firstName middleName(optional) lastName petName\n
With the middle name being there on half the entries, I'm unsure as to the best way to read these in and get them into the correct variable names. Any help would be greatly appreciated.

You could do something like this:
std::string line, word;
while (std::getline(myFile, line)) {
    if (line.empty()) continue;
    // read up to four words from the line:
    std::istringstream is(line);
    std::vector<std::string> words;
    words.reserve(4);
    while (words.size() < 4 && is >> word)
        words.push_back(word);
    if (words.size() == 4) {
        // middle name was present ...
    } else {
        // it was not ...
    }
}

If only middleName is optional, you can split each line into words and store them in a std::vector<std::string>. If the vector ends up with 4 elements, the middle name is present; with 3 elements, it is not.
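As a rough sketch of how the word count could be mapped onto named fields: the Person struct and the parsePerson function below are hypothetical names introduced only for illustration, and the code assumes that names never contain spaces.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type; the field names mirror the format in the question.
struct Person {
    std::string firstName;
    std::string middleName;  // stays empty when the line has no middle name
    std::string lastName;
    std::string petName;
};

// Splits one line on whitespace and assigns the words according to their count.
Person parsePerson(const std::string& line) {
    std::istringstream is(line);
    std::vector<std::string> words;
    for (std::string word; is >> word;) {
        words.push_back(word);
    }
    if (words.size() == 4) {
        return Person{words[0], words[1], words[2], words[3]};
    }
    if (words.size() == 3) {
        return Person{words[0], "", words[1], words[2]};
    }
    throw std::runtime_error("expected 3 or 4 words: " + line);
}
```

Calling parsePerson once per line read with std::getline would then give one Person per entry; whether a malformed line should throw or simply be skipped is a design choice left open here.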

Related

Reading integers and strings from a text file and storing in parallel arrays

I have a text file that stores the index, student name and student ID and I am trying to read them into an array of integers index, arrays of strings studentName and studentID. I'm having problems storing the student's names because they could be more than a single word. I could separate the items in the text file by commas and use getline but that would mean the index array will have to be a string type. Is there a workaround for this without changing the original text file?
Original file:
1 James Smith E2831
2 Mohammad bin Rahman M3814
3 MJ J4790
const int SIZE = 3;
int index[SIZE];
string studentName[SIZE], studentID[SIZE];
fstream infile("students.txt");
if(infile.is_open()){
int i = 0;
while(i < 3){
infile >> index[i] >> studentName[i] >> studentID[i];
i++;
}
}
Changed file:
1,James Smith,E2831
2,Mohammad bin Rahman,M3814
3,MJ,J4790
const int SIZE = 3;
string index[SIZE];
string studentName[SIZE], studentID[SIZE];
fstream infile("students.txt");
if(infile.is_open()){
int i = 0;
while(i < 3){
getline(infile, index[i],','); //index array is string
getline(infile, studentName[i],',');
getline(infile, studentID[i],'\n');
i++;
}
}
It's an error to read one line into one student property with the given input format. You need to read one line and then split the information in this line into the 3 properties.
std::stoi can be used to convert the first part of the line to an int. Furthermore, the data is simpler to handle if you create a custom type that stores all 3 properties of a student instead of keeping the information in 3 arrays.
Note that the following code requires the addition of logic to skip whitespace directly after (or perhaps even before) the ',' chars. Currently it simply includes the whitespace in the name/id. I'll leave that task to you.
struct Student
{
int m_index;
std::string m_name;
std::string m_id;
};
std::vector<Student> readStudents(std::istream& input)
{
std::vector<Student> result;
std::string line;
while (std::getline(input, line))
{
size_t endIndex = 0;
auto index = std::stoi(line, &endIndex);
if (line[endIndex] != ',')
{
throw std::runtime_error("invalid input formatting");
}
++endIndex;
auto endName = line.find(',', endIndex);
if (endName == std::string::npos)
{
throw std::runtime_error("invalid input formatting");
}
result.push_back(Student{ index, line.substr(endIndex, endName - endIndex), line.substr(endName + 1) });
}
return result;
}
int main() {
std::istringstream ss(
"1,James Smith,E2831\n"
"2, Mohammad bin Rahman, M3814\n"
"3, MJ, J4790\n");
auto students = readStudents(ss);
for (auto& student : students)
{
std::cout << "Index=" << student.m_index << "; Name=" << student.m_name << ";Id=" << student.m_id << '\n';
}
}
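The whitespace handling that the answer leaves open could be sketched with a small helper like the one below. The name trim is an assumption made for this illustration, and only spaces and tabs are treated as whitespace.

```cpp
#include <string>

// Returns a copy of s without leading and trailing spaces or tabs.
std::string trim(const std::string& s) {
    const char* const whitespace = " \t";
    const std::string::size_type first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return "";  // empty or whitespace-only field
    }
    const std::string::size_type last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

Wrapping both substr results in trim(...) before constructing the Student would remove the stray spaces from the name and ID. std::stoi already skips leading whitespace on its own, but a space between the index and the first comma would still be rejected by the format check.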
There are so many possible solutions that it is hard to say which one should be used. It depends a little on your personal style and how well you know the language.
In nearly all solutions you would (for safety reasons) read a complete line with std::getline and then either split the line manually or use a std::istringstream for further extraction.
CSV input would be preferable, because it makes clearer what belongs together.
So, what are the main possibilities? First, for the space-separated names:
You could use a std::regex and then search or match with, for example, "(\d+) ([\w ]+) (\w+)"
You could create substrings by searching for the first space from the right side of the string; what follows it is the "studentID", and getting the rest is then simple
You could use a while loop to read all parts of the string and put them in a std::vector. The first element in the std::vector is then the index, the last is the ID, and the rest make up the name.
You could parse the string with formatted input functions. First read the index as a number, then the rest as a string, then split off the last part to get the ID.
And many more
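To make the substring idea from the list above concrete, here is a minimal sketch. StudentRecord and parseSpaceSeparated are hypothetical names, and the code assumes that fields are separated by single spaces, that the line has no trailing whitespace, and that the ID itself contains no space.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical record type for one line such as "2 Mohammad bin Rahman M3814".
struct StudentRecord {
    int index;
    std::string name;
    std::string id;
};

// The index ends at the first space, the ID starts after the last space,
// and everything in between is the (possibly multi-word) name.
StudentRecord parseSpaceSeparated(const std::string& line) {
    const std::string::size_type firstSpace = line.find(' ');
    const std::string::size_type lastSpace = line.rfind(' ');
    if (firstSpace == std::string::npos || firstSpace == lastSpace) {
        throw std::runtime_error("expected: index name id");
    }
    return StudentRecord{
        std::stoi(line.substr(0, firstSpace)),
        line.substr(firstSpace + 1, lastSpace - firstSpace - 1),
        line.substr(lastSpace + 1)};
}
```

This keeps the index as an int without changing the original text file, which is what the question asks for.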
For CSV, you could:
Use the std::sregex_token_iterator, looking for the comma as separator
Also use a std::vector and later pick the needed values
Use a mixture of formatted and unformatted input
Example, where Comma is assumed to be a char variable that absorbs the separator:
std::getline(std::getline(infile >> index >> Comma >> std::ws, name, ','), id);
It is up to you what you would like to implement.
Write your preference in the comments, then I will add some code.

How do I obtain the column of my CSV File?

I'm trying to obtain the last column of my CSV file. I tried using getline and a stringstream, but the result is not just the last column:
stringstream lineStream(line);
string bit;
while (getline(inputFile, line))
{
stringstream lineStream(line);
bit = "";
getline(lineStream, bit, ',');
getline(lineStream, bit, '\n');
getline(inputFile, line);
stringVector.push_back(bit);
}
My CSV file:
5.1,3.5,1.4,0.2,no
4.9,3.0,1.4,0.2,yes
4.7,3.2,1.3,0.2,no
4.6,3.1,1.5,0.2,yes
5.0,3.6,1.4,0.2,no
5.4,3.9,1.7,0.4,yes
Probably the simplest approach is to use std::string::rfind as follows:
while (std::getline(inputFile, line))
{
// Find position after last comma, extract the string following it and
// add to the vector. If no comma found and non-empty line, treat as
// special case and add that too.
std::string::size_type pos = line.rfind(',');
if (pos != std::string::npos)
stringVector.push_back(line.substr(pos + 1));
else if (!line.empty())
stringVector.push_back(line);
}

How can I split a getline() into array in c++

I have an input getline:
man,meal,moon;fat,food,feel;cat,coat,cook;love,leg,lunch
I want to split this at every ; so that everything before each ; is stored as one element of an array.
For example:
array[0]=man,meal,moon
array[1]=fat,food,feel
And so on...
How can I do it? I tried many times but I failed!😒
Can anyone help?
Thanks in advance.
You can use std::stringstream and std::getline.
I also suggest that you use std::vector as it's resizeable.
In the example below, we read the input line into a std::string and then create a std::stringstream to hold that data. std::getline with ; as the delimiter then extracts the text between semicolons into the variable word, and each "word" is pushed back into a vector:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    string line;
    string word;
    getline(cin, line);
    stringstream ss(line);
    vector<string> vec;
    while (getline(ss, word, ';')) {
        vec.emplace_back(word);
    }
    for (const auto& i : vec) // Use regular for loop if you can't use c++11/14
        cout << i << '\n';
}
Alternatively, if you can't use std::vector:
string arr[256];
int count = 0;
while (getline(ss, word, ';') && count < 256) {
arr[count++] = word;
}
Outputs:
man,meal,moon
fat,food,feel
cat,coat,cook
love,leg,lunch
I'd rather not hand you finished code, since you seem to be new to C++ and will learn more by working it out yourself, but here is a hint: use substrings and store them in a vector of strings.
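For readers who want to see where that hint leads, a minimal sketch using std::string::find and std::string::substr follows. The function name splitOn is an assumption made for this example.

```cpp
#include <string>
#include <vector>

// Splits text at every occurrence of delimiter and returns the pieces in order.
std::vector<std::string> splitOn(const std::string& text, char delimiter) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    std::string::size_type end = text.find(delimiter);
    while (end != std::string::npos) {
        parts.push_back(text.substr(start, end - start));
        start = end + 1;
        end = text.find(delimiter, start);
    }
    parts.push_back(text.substr(start));  // remainder after the last delimiter
    return parts;
}
```

Unlike the std::getline loop, this version returns one empty string for empty input and keeps a trailing empty piece when the text ends with the delimiter, so the caller may want to filter those out.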

How to read a file word by word and find the position of each word?

I'm trying to read a file word by word and do some processing on each word. Later I will want to know the position of each word, meaning its line number and its character position within that line. If the character position is not available, I at least need to know, while reading, when I move on to the next line. This is the sample code I have now:
string tmp;
while(fin>>tmp){
mylist.push_back(tmp);
}
I need to know when fin is going to next line?!
"I need to know when fin is going to next line"
This is not possible with the stream's operator >>. You can read the input line by line and process each line separately using a temporary istringstream object:
std::string line, word;
while (std::getline(fin, line)) {
// skip empty lines:
if (line.empty()) continue;
std::istringstream lineStream(line);
for (int wordPos = 0; lineStream >> word; wordPos++) {
...
mylist.push_back(word);
}
}
Just don't forget to #include <sstream>.
One simple way to solve this problem would be using std::getline, run your own counter, and split line's content into words using an additional string stream, like this:
string line;
int line_number = 0;
for (;;) {
if (!getline(fin, line)) {
break;
}
istringstream iss(line);
string tmp;
while (iss >> tmp) {
mylist.push_back(tmp);
}
line_number++;
}

Fast, Simple CSV Parsing in C++

I am trying to parse a simple CSV file, with data in a format such as:
20.5,20.5,20.5,0.794145,4.05286,0.792519,1
20.5,30.5,20.5,0.753669,3.91888,0.749897,1
20.5,40.5,20.5,0.701055,3.80348,0.695326,1
So, a very simple, fixed-format file. I am storing each column of this data in an STL vector. I've tried to stay with the C++ way of using the standard library, and my implementation within a loop looks something like:
string field;
getline(file,line);
stringstream ssline(line);
getline( ssline, field, ',' );
stringstream fs1(field);
fs1 >> cent_x.at(n);
getline( ssline, field, ',' );
stringstream fs2(field);
fs2 >> cent_y.at(n);
getline( ssline, field, ',' );
stringstream fs3(field);
fs3 >> cent_z.at(n);
getline( ssline, field, ',' );
stringstream fs4(field);
fs4 >> u.at(n);
getline( ssline, field, ',' );
stringstream fs5(field);
fs5 >> v.at(n);
getline( ssline, field, ',' );
stringstream fs6(field);
fs6 >> w.at(n);
The problem is, this is extremely slow (there are over 1 million rows per data file), and seems to me to be a bit inelegant. Is there a faster approach using the standard library, or should I just use stdio functions? It seems to me this entire code block would reduce to a single fscanf call.
Thanks in advance!
Using 7 string streams when you can do it with just one sure doesn't help wrt. performance.
Try this instead:
string line;
getline(file, line);
istringstream ss(line); // note we use istringstream, we don't need the o part of stringstream
char c1, c2, c3, c4, c5; // to eat the commas
ss >> cent_x.at(n) >> c1 >>
cent_y.at(n) >> c2 >>
cent_z.at(n) >> c3 >>
u.at(n) >> c4 >>
v.at(n) >> c5 >>
w.at(n);
If you know the number of lines in the file, you can resize the vectors prior to reading and then use operator[] instead of at(). This way you avoid bounds checking and thus gain a little performance.
I believe the major bottleneck (put aside the getline()-based non-buffered I/O) is the string parsing. Since you have the "," symbol as a delimiter, you may perform a linear scan over the string and replace all "," by "\0" (the end-of-string marker, zero-terminator).
Something like this:
// tmp array for the parsed values of one line
// (MAX_PARTS is assumed to be a constant defined elsewhere)
double parts[MAX_PARTS];
while(getline(file, line))
{
    if(line.empty()) { continue; }
    size_t len = line.length();
    const char* last_start = &line[0];
    int num_parts = 0;
    for(size_t j = 0; j < len && num_parts < MAX_PARTS; j++)
    {
        if(line[j] == ',')
        {
            line[j] = '\0'; // terminate the current field in place
            parts[num_parts++] = atof(last_start);
            last_start = &line[j + 1];
        }
    }
    // the last field has no trailing comma, so parse it separately
    if(num_parts < MAX_PARTS) { parts[num_parts++] = atof(last_start); }
    /// do whatever you need with the parts[] array
}
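A variation on the same linear scan, shown here only as a sketch: std::strtod reports where each number ends, so the commas do not have to be overwritten and the line can stay const. Note that this swaps atof for strtod. The function name parseDoubles and its signature are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Parses up to maxParts comma-separated numbers from line into parts.
// Returns the number of values stored.
std::size_t parseDoubles(const std::string& line, double* parts, std::size_t maxParts) {
    const char* cursor = line.c_str();
    std::size_t count = 0;
    while (count < maxParts && *cursor != '\0') {
        char* end = nullptr;
        const double value = std::strtod(cursor, &end);
        if (end == cursor) {
            break;  // no number at this position: stop at the malformed field
        }
        parts[count++] = value;
        cursor = end;
        if (*cursor == ',') {
            ++cursor;  // step over the separator
        }
    }
    return count;
}
```

Whether this is actually faster than the stream-based versions depends on the standard library and the data, so it is worth measuring on a real file before committing to it.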
I don't know if this will be quicker than the accepted answer, but I might as well post it anyway in case you wish to try it.
You can load the entire contents of the file with a single read call once you know the size of the file, which you can find with fseek and ftell. This will be much faster than multiple read calls.
You could then do something like this to parse your string:
//Delimited string to vector
vector<string> dstov(const string& str, const string& delimiter)
{
    //Vector to populate
    vector<string> ret;
    //Current position in str
    size_t pos = 0;
    //Position of the next delimiter at or after pos
    size_t found;
    //While the string from point pos contains the delimiter
    while((found = str.find(delimiter, pos)) != string::npos)
    {
        //Insert the substring from pos to the start of the found delimiter to the vector
        ret.push_back(str.substr(pos, found - pos));
        //Move the pos past this found section and the found delimiter so the search can continue
        pos = found + delimiter.size();
    }
    //Push back the final element in str when str contains no more delimiters
    ret.push_back(str.substr(pos));
    return ret;
}
string rawfiledata;
//This call will parse the raw data into a vector containing lines of
//20.5,30.5,20.5,0.753669,3.91888,0.749897,1 by treating the newline
//as the delimiter
vector<string> lines = dstov(rawfiledata, "\n");
//You can then iterate over the lines and parse them into variables and do whatever you need with them.
for(size_t itr = 0; itr < lines.size(); ++itr)
{
    vector<string> line_variables = dstov(lines[itr], ",");
}
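The single-read idea mentioned at the start of this answer might look as follows with iostreams, using seekg and tellg in place of fseek. This is a sketch under the assumption that the whole file fits in memory; readWholeFile is a hypothetical helper name.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Reads the complete file at path into a string with a single read call.
std::string readWholeFile(const std::string& path) {
    // Open in binary mode, positioned at the end, so tellg() yields the size in bytes.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    const std::streamsize size = in.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }
    std::string data(static_cast<std::string::size_type>(size), '\0');
    in.seekg(0, std::ios::beg);
    if (size > 0 && !in.read(&data[0], size)) {
        throw std::runtime_error("cannot read " + path);
    }
    return data;
}
```

The returned string could then play the role of rawfiledata in the splitting code above. Because the file is read in binary mode, Windows-style line endings would leave a trailing '\r' on each line that the parsing step has to tolerate.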
std::ifstream file{ InputFilename };
std::vector<std::string> line_elements;
for (std::string line; std::getline(file, line);)
{
line_elements.clear();
std::istringstream ss(line);
for (std::string value; std::getline(ss, value, ',');)
{
line_elements.push_back(std::move(value));
}
// Do something with the line_elements.
}