Reading a text file and storing data into multiple arrays C++ - c++

I am trying to read a database file (as txt), skipping empty lines and the column header line, and store each record as an array. I would like to take a stop_id and find the matching stop_name.
For example, if I ask for stop 17, the program should return "Jackson & Kolmar".
The file format is as follows:
17,17,"Jackson & Kolmar","Jackson & Kolmar, Eastbound, Southeast Corner",41.87685748,-87.73934698,0,,1
18,18,"Jackson & Kilbourn","Jackson & Kilbourn, Eastbound, Southeast Corner",41.87688572,-87.73761421,0,,1
19,19,"Jackson & Kostner","Jackson & Kostner, Eastbound, Southeast Corner",41.87691497,-87.73515882,0,,1
So far I am able to get the stop_id values, but now I want to get the stop name values, and I am fairly new to C++ string manipulation.
mycode.cpp
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
using namespace std;
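int main()
{
    string filename = "test.txt";
    string line;     // current line of the file (was undeclared)
    string data;     // first field of the current line
    int count = 1;   // line counter for the output (was undeclared)
    ifstream infile(filename.c_str());
    while (getline(infile, line))   // loop on the read itself instead of !infile.eof()
    {
        string::size_type comma = line.find(",");
        data = line.substr(0, comma);
        cout << "Line " << count << " "<< "is "<< data << endl;
        count++;
    }
    infile.close();
    string sent = "i,am,the,champion";
    return 0;
}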

You can call string::find three times to locate the third comma, keeping the positions of the last two commas found, and then pass those positions to string::substr to extract the text between them (the result still includes the surrounding double quotes):
std::string line ("17,17,\"Jackson & Kolmar\",\"Jackson & Kolmar, Eastbound, Southeast Corner\",41.87685748,-87.73934698,0,,1");
std::size_t found=0, foundBack;
int i;
for(i=0;i<3 && found!=std::string::npos;i++){
foundBack = found;
found=line.find(",",found+1);
}
std::cout << line.substr(foundBack+1,found-foundBack-1) << std::endl;

You can read each whole line of the file into a string and then use a stringstream to extract each piece, one at a time, up to but excluding the commas. Then you can fill up your arrays. I am assuming that you want each line in its own array and an unlimited number of arrays. The best way to do that is a vector of vectors. Note that splitting on every comma also splits quoted fields that contain commas, such as the stop description in your sample.
std::string Line;
std::vector<std::vector<std::string>> Data;   // one inner vector per line
while (std::getline(infile, Line))
{
    std::stringstream ss;
    ss << Line;
    Data.push_back(std::vector<std::string>());   // start a new, empty row
    std::string Temp;
    while (std::getline(ss, Temp, ','))
    {
        Data[Data.size() - 1].push_back(Temp);
    }
}
This way you will have a vector full of vectors, each containing the strings for all the data on that line. To access the strings as numbers, you can use std::stoi(std::string), which converts a string to an integer.

Related

How to read and process text from a file in this specific way?

I have a question regarding some code to process some names or numbers from a file I'm reading. So the text in the file looks like this:
Imp;1;down;67
Comp;3;up;4
Imp;42;right;87
As you can see , there are 3 lines with words and numbers delimited by the character ';' . I want to read each line at a time, and split the entire string in one line into the words and numbers , and then process the information (will be used to create a new object with the data). Then move on to the next line, and so on, until EOF.
So, i want to read the first line of text, split it into an array of strings formed out of the words and numbers in the line , then create an object of a class out of them. For example for the first line , create an object of the class Imp like this Imp objImp(Imp, 1, down, 67) .
In Java I did the same thing using information = line.split(";") (where line was a line of text) and then used information[0], information[1] to access the members of the string array and create the object. I'm trying to do the same here.
Don't use a char array for the buffer, and don't use std::istream::eof to control the loop. That said, let's solve the problem.
std::getline is similar to std::istream::getline, except that it uses std::string instead of char arrays.
In both, the delim parameter is the character at which reading stops; the delimiter is extracted and discarded, not stored. It does not mean that the whole line is automatically split for you at every ;.
Thus, you'll have to do this:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
...
std::ifstream myFile("D:\\stuff.txt"); // one statement
if (myFile.is_open()) {
std::string line;
while (std::getline(myFile, line)) { // line by line reading
std::istringstream line_stream(line);
std::string output;
while (std::getline(line_stream, output, ';')) // line parsing
std::cout << output << std::endl;
}
}
We construct a std::istringstream from line, so we can parse it again with std::getline.
One other (slightly different) alternative:
/*
* Sample output:
* line:Imp;1;down;67
* "Imp", "1", "down", "67"
* line:Comp;3;up;4
* "Comp", "3", "up", "4"
* line:Imp;42;right;87
* "Imp", "42", "right", "87"
*/
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
// Split s at every delim and store the pieces in fields.
void split(const std::string &s, char delim, std::vector<string> &fields)
{
    fields.clear();
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        fields.push_back(item);
    }
}
// Print the fields as a quoted, comma-separated list (safe for an empty vector).
void print (const std::vector<string> &fields)
{
    cout << " ";
    for (size_t i = 0; i < fields.size(); i++) {
        if (i > 0)
            cout << ", ";
        cout << "\"" << fields[i] << "\"";
    }
    cout << endl;
}
int main ()
{
    std::ifstream fp("tmp.txt");
    std::string line;
    // Loop on the read itself: looping on !fp.eof() processes the last line twice.
    while (fp >> line) {
        cout << "line:" << line << endl;
        std::vector<std::string> fields;
        split(line, ';', fields);
        print(fields);
    }
    fp.close();
    return 0;
}

Reading two columns in CSV file in c++

I have a CSV file in the form of two columns: name, age
To read and store the info, I did this
struct person
{
string name;
int age;
};
person record[10];
ifstream read("....file.csv");
However, when I did
read >> record[0].name;
read.get();
read >> record[0].age;
read>>name gave me the whole line instead of just the name. How could I possibly avoid this problem so that I can read the integer into age?
Thank you!
You can first read the whole line with std::getline, then parse it via a std::istringstream (you must #include <sstream>), like this:
std::string line;
int count = 0; // number of records stored so far
while (count < 10 && std::getline(read, line)) // read whole line into line
{
std::istringstream iss(line); // string stream
std::getline(iss, record[count].name, ','); // read first part up to comma, ignore the comma
iss >> record[count].age; // read the second part
++count; // move on to the next slot instead of overwriting record[0]
}
Below is a fully working general example that tokenizes a CSV file:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
// in your case you'll have a file
// std::ifstream ifile("input.txt");
std::stringstream ifile("User1, 21, 70\nUser2, 25,68");
std::string line; // we read the full line here
while (std::getline(ifile, line)) // read the current line
{
std::istringstream iss{line}; // construct a string stream from line
// read the tokens from current line separated by comma
std::vector<std::string> tokens; // here we store the tokens
std::string token; // current token
while (std::getline(iss, token, ','))
{
tokens.push_back(token); // add the token to the vector
}
// we can now process the tokens
// first display them
std::cout << "Tokenized line: ";
for (const auto& elem : tokens)
std::cout << "[" << elem << "]";
std::cout << std::endl;
// map the tokens into our variables, this applies to your scenario
std::string name = tokens[0]; // first is a string, no need for further processing
int age = std::stoi(tokens[1]); // second is an int, convert it
int height = std::stoi(tokens[2]); // same for third
std::cout << "Processed tokens: " << std::endl;
std::cout << "\t Name: " << name << std::endl;
std::cout << "\t Age: " << age << std::endl;
std::cout << "\t Height: " << height << std::endl;
}
}
read>>name gave me the whole line instead of just the name. How could I possibly avoid this problem so that I can read the integer into age?
read >> name will read everything into name until a white space is encountered.
If you have a comma separated line without white spaces, it makes sense that the entire line is read into name.
You can use std::getline to read the entire line to one string. Then use various methods of tokenizing a std::string.
Sample SO posts that address tokenizing a std::string:
How do I tokenize a string in C++?
c++ tokenize std string
Splitting a C++ std::string using tokens, e.g. ";"
You could maybe use stringstreams for that, but to be honest I wouldn't trust them.
If I were you, I would write a small function that reads the whole line into a string and then searches it for the separator character. Everything before the separator is the first column and everything after it is the second. With the string operations provided by C++ you can move these parts into your variables (converting them to the correct type if needed).
I wrote a small C++ Library for CSV parsing, maybe a look at it helps you. You can find it on GitHub.
EDIT:
In this Gist you can find the parsing function

Reading data in from a .csv into usable format using C++

I would like to be able to read the data that I have into C++ and then start to do things to manipulate it. I am quite new but have a tiny bit of basic knowledge. The most obvious way of doing this that strikes me (and maybe this comes from using excel previously) would be to read the data into a 2d array. This is the code that I have so far.
#include <iostream>
#include <fstream>
#include <algorithm>
#include <string>
#include <sstream>
using namespace std;
string C_J;
int main()
{
float data[1000000][10];
ifstream C_J_input;
C_J_input.open("/Users/RT/B/CJ.csv");
if (!C_J_input) return -1;
for(int row = 0; row <1000000; row++)
{
string line;
getline(C_J_input, C_J, '?');
if ( !C_J_input.good() )
break;
stringstream iss(line);
for(int col = 0; col < 10; col++)
{
string val;
getline(iss, val, ',');
if (!iss.good() )
break;
stringstream converter(val);
converter >> data[row][col];
}
}
cout << data;
return 0;
}
Once I have the data read in, I would like to go through it line by line and analyse it, looking for certain things; however, that is probably a topic for another thread.
Just let me know if this is a bad question in any way and I will try to add anything more that might make it better.
Thanks!
At the asker's request, this is how you would load the file into a string, split it into lines, and then split each line into elements:
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <sstream>
#include <iterator> //for std::istreambuf_iterator
//This takes a string and splits it with a delimiter and returns a vector of strings
std::vector<std::string> &SplitString(const std::string &s, char delim, std::vector<std::string> &elems)
{
std::stringstream ss(s);
std::string item;
while (std::getline(ss, item, delim))
{
elems.push_back(item);
}
return elems;
}
int main(int argc, char* argv[])
{
//load the file with ifstream
std::ifstream t("test.csv");
if (!t)
{
std::cout << "Unknown File" << std::endl;
return 1;
}
//this is just a block of code designed to load the whole file into one string
std::string str;
//this sets the read position to the end
t.seekg(0, std::ios::end);
str.reserve(t.tellg());//this reserves enough memory in the string to hold everything up to the read position of the file (which is the end)
t.seekg(0, std::ios::beg);//this sets the read position back to the beginning to start reading it
//this takes the everything in the stream (the file data) and loads it into the string.
//istreambuf_iterator is used to loop through the contents of the stream (t), and in this case go up to the end.
str.assign((std::istreambuf_iterator<char>(t)),
std::istreambuf_iterator<char>());
//if (sizeof(rawData) != *rawSize)
// return false;
//if the file has size (is not empty) then analyze
if (str.length() > 0)
{
//the file is loaded
//split by delimiter (which is the newline character)
std::vector<std::string> lines;//this holds a string for each line in the file
SplitString(str, '\n', lines);
//each element in the vector holds a vector of elements (strings between commas)
std::vector<std::vector<std::string> > LineElements;
//for each line
for (auto it : lines)
{
//this is a vector of elements in this line
std::vector<std::string> elementsInLine;
//split with the comma, this would separate "one,two,three" into {"one","two","three"}
SplitString(it, ',', elementsInLine);
//take the elements in this line, and add it to the line-element vector
LineElements.push_back(elementsInLine);
}
//this displays each element in an organized fashion
//for each line
for (const auto& it : LineElements)
{
//for each element IN that line, by index
for (std::size_t i = 0; i < it.size(); ++i)
{
//insert a comma before every element except the first
//(comparing values with it.back() misprints lines where an earlier element equals the last one)
if (i > 0)
std::cout << ',';
std::cout << it[i];
}
//the end of the line
std::cout << '\n';
}
}
else
{
std::cout << "File Is empty" << std::endl;
return 1;
}
system("PAUSE");
return 0;
}
On second glance, I noticed a few obvious issues that will slow your progress greatly, so I'll list them here:
1) you are using two disconnected variables for reading the lines:
C_J - which receives data from getline function
line - which is used as the source of stringstream
I'm pretty sure that the C_J is completely unnecessary. I think you wanted to simply do
getline(C_J_input, line, ...) // so that the textline read will fly to the LINE var
// ...and later
stringstream iss(line); // no change
or, alternatively:
getline(C_J_input, C_J, ...) // no change
// ...and later
stringstream iss(C_J); // so that ISS will read the textline we've just read
Otherwise, the stringstream will never see what getline has read from the file: getline writes the data to one place (C_J) while the stringstream looks at another (line).
2) Another small point is that you are passing '?' to getline() as the line separator. CSV files usually use a newline character to separate the data lines. Of course, your input file may really use '?'; I don't know. But if you want to split on newlines, omit the parameter entirely: getline then stops at '\n' by default, and this will probably be just fine.
3) Your array of floats is, um, huge: 1000000 x 10 floats is about 40 MB as a local variable, which is likely more than the default stack size. Consider using a list instead. It will grow as you read rows. You can even nest them, so list<list<float>> is also very usable. I'd probably use list<vector<float>>, since the number of columns is constant. Using a preallocated huge array is not a good idea: there will always be a file with one line too many, and ka-boom.
4) Your code contains a just-as-huge loop that iterates a constant number of times. A loop in itself is OK, but the line count will vary. You don't actually need to count the lines, especially if you use list<> to store the values. Just as you've checked whether the file opened properly with if(!C_J_input), you can also check whether you have reached end-of-file (the flag is set only after a read has hit the end):
if(C_J_input.eof())
; // will fire ONLY if you are at the end of the file.
Well, that's a start. Good luck!

Reading from a CSV/text file with quotes in C++

I have a working function that reads lines from a text file (CSV), but I need to modify it to be able to read double quotes (I need to have these double quotes because some of my string values contain commas, so I am using double-quotes to denote the fact that the read function should ignore commas between the double-quotes). Is there a relatively simple way to modify the function below to accommodate the fact that some of the fields will be enclosed in double quotes?
A few other notes:
I could have all of the fields enclosed in double-quotes fairly easily if that helps (rather than just the ones that are strings, as is currently the case)
I could also change the delimiter fairly easily from a comma to some other character (like a pipe), but was hoping to stick with CSV if it's easy to do so
Here is my current function:
void ReadLoanData(vector<ModelLoanData>& mLoan, int dealnum) {
// Variable declarations
fstream InputFile;
string CurFileName;
ostringstream s1;
string CurLineContents;
int LineCounter;
char * cstr;
vector<string> currow;
const char * delim = ",";
s1 << "ModelLoanData" << dealnum << ".csv";
CurFileName = s1.str();
InputFile.open(CurFileName, ios::in);
if (InputFile.is_open()) {
LineCounter = 1;
while (InputFile.good()) {
// Grab the line
while (getline (InputFile, CurLineContents)) {
// Create a c-style string so we can tokenize
cstr = new char [CurLineContents.length()+1];
strcpy (cstr, CurLineContents.c_str());
// Need to resolve the "blank" token issue (strtok vs. strsep)
currow = split(cstr,delim);
// Assign the values to our model loan data object
mLoan[LineCounter] = AssignLoanData(currow);
delete[] cstr;
++LineCounter;
}
}
// Close the input file
InputFile.close();
}
else
cout << "Error: File Did Not Open" << endl;
}
The following works with the given input: a,b,c,"a,b,c","a,b",d,e,f
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
std::string line;
while(std::getline(cin, line, '"')) {
std::stringstream ss(line);
while(std::getline(ss, line, ',')) {
cout << line << endl;
}
if(std::getline(cin, line, '"')) {
cout << line;
}
}
}

Reading a file into an array

I would like to read a text file and input its contents into an array. Then I would like to show the contents of the array in the command line.
My idea is to open the file using:
inFile.open("pigData.txt")
And then to get the contents of the file using:
inFile >> myarray [size]
And then show the contents using a for loop.
My problem is that the file I am trying to read contain words and I don't know how to get a whole word as an element in the array. Also, let's say that the words are divided by spaces, thus:
hello goodbye
Could be found on the file. I would like to read the whole line "hello goodbye" into an element of a parallel array. How can I do that?
Should be pretty straightforward.
std::vector<std::string> file_contents;
std::string line;
while ( std::getline(inFile,line) )
file_contents.push_back(line);
std::vector<std::string>::iterator it = file_contents.begin();
for(; it!=file_contents.end() ; ++it)
std::cout << *it << "\n";
Edit:
Your comment about having "hello goodbye" as element zero and element one is slightly confusing to me. The above code snip will read each line of the file and store that as an individual entry in the array 'file_contents'. If you want to read it and split it on spaces that is slightly different.
For context, you could have provided a link to your previous question, about storing two lists of words in different languages. There I provided an example of reading the contents of a text file into an array:
const int MaxWords = 100;
std::string piglatin[MaxWords];
int numWords = 0;
std::ifstream input("piglatin.txt");
std::string line;
while (std::getline(input, line) && numWords < MaxWords) {
piglatin[numWords] = line;
++numWords;
}
if (numWords == MaxWords) {
std::cerr << "Too many words" << std::endl;
}
You can't have one parallel array. For something to be parallel, there must be at least two. For parallel arrays of words, you could use declarations like this:
std::string piglatin[MaxWords];
std::string english[MaxWords];
Then you have two options for filling the arrays from the file:
Read an entire line, and then split the line into two words based on where the first space is:
while (std::getline(input, line) && numWords < MaxWords) {
std::string::size_type space = line.find(' ');
if (space == std::string::npos)
std::cerr << "Only one word" << std::endl;
piglatin[numWords] = line.substr(0, space);
english[numWords] = line.substr(space + 1);
++numWords;
}
Read one word at a time, and assume that each line has exactly two words on it. The >> operator will read a word at a time automatically. (If each line doesn't have exactly two words, then you'll have problems. Try it out to see how things go wrong. Really. Getting experience with a bug when you know what the cause is will help you in the future when you don't know what the cause is.)
while (numWords < MaxWords &&
input >> piglatin[numWords] >> english[numWords]) {
++numWords; // count the pair only if both reads succeeded
}
Now, if you really want one array with two elements per entry, then you need to define another data structure, because an array can only have one "thing" in each element. Define something that can hold two strings at once:
struct word_pair {
std::string piglatin;
std::string english;
};
Then you'll have just one array:
word_pair words[MaxWords];
You can fill it like this:
while (std::getline(input, line) && numWords < MaxWords) {
std::string::size_type space = line.find(' ');
if (space == std::string::npos)
std::cerr << "Only one word" << std::endl;
words[numWords].piglatin = line.substr(0, space);
words[numWords].english = line.substr(space + 1);
++numWords;
}
Notice how the code indexes into the words array to find the next word_pair object, and then it uses the . operator to get to the piglatin or english field as necessary.
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
using namespace std;
int main()
{
// This will store each word (separated by a space)
vector<string> words;
// Temporary variable
string buff;
// Reads the data
fstream inFile("words.txt");
while(inFile>>buff) // loop on the read itself; looping on !inFile.eof() stores the last word twice
{
words.push_back(buff);
}
inFile.close();
// Display
for(size_t i=0;i<words.size();++i) cout<<words[i]<<" ";
return 0;
}
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int main ()
{
vector<string> fileLines;
string line;
ifstream inFile("pigData.txt");
if ( inFile.is_open() ) {
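while ( getline(inFile, line) ) { // loop on the read itself so no extra empty line is stored at end of file
fileLines.push_back(line);
}
inFile.close();
} else {
cerr << "Error opening file" << endl;
return 1; // returning from main avoids needing <cstdlib> for exit()
}
for (size_t i=0; i<fileLines.size(); ++i) {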
cout << fileLines[i] << "\n";
}
cout << endl;
return 0;
}