Read columns from a comma delimited data file c++

I have been trying to read the following data table and create one object for the hubs (rows) and another for the continents (columns). Since I am not an experienced C++ user, I have been running into difficulties. The data is shown below. The number after the hub name and the dash is the order quantity for that hub. The numbers under each continent are the corresponding costs and tariffs between a hub and that continent. I would like to be able to write, for instance, the following and get 73 as the result:
cout << hub(1)->cont(USA)->transport() << endl;
,USA,EUROPE,ASIA
HUB1-12000,,,
Transportation Cost,73,129,141
Tariffs,5,5,1
ShippingType,a,b,c
OtherFees,0.6,0.3,0.8
HUB2-11000,,,
Transportation Cost,57,101,57
Tariffs,7,7,5
ShippingType,b,b,d
OtherFees,0.7,0.3,0.6
Really appreciate your help. Here is what I have tried so far:
void Hub()
{
string file = "/hubs.csv";
// 1-First read the first line and save the continent name
string str, field;
getline( fin, str );
vector<string> contList;
stringstream linestr( str );
while ( linestr.good() )
{
getline( linestr, field, ',' );
string contname;
contList.push_back(contname);
}
// 2-Then read the rest
getline( fin, str );
while ( !fin.eof() ) // Read the whole file
{
stringstream linestr( str );
string contname, order;
if ( qstr[0] == 'HUB1' || qstr[0] == 'HUB2')
{
// Read the name of the hub
getline( linestr, hubname, ',' ); // Read the hub name
getline( linestr, order, ',' ); // Read the order quantity
int quantity;
istringstream orderstream( order);
orderstream >> quantity;
// Find the hub and add the order to the hub
Hub* hub = glob->FindHubName( hubname ); // this returns a pointer
if ( glob->FindHubName( hubname ) == nullptr )
{
hubNotFound.push_back( hubname );
getline( fin, qstr );
continue;
}
hub->TotalOrder( quantity );
}
else if ( qstr[0] != 'HUB1' || qstr[0] != 'HUB2')
{
// Read costs and tariffs
cout << hub(1)->cont(ASIA)->transport()
}
getline( fin, qstr );
}
fin.close();
}

Something like this:
#include <iostream>
#include <fstream>
#include <string>
#include <boost/tokenizer.hpp>
int main() {
    using namespace std;
    using namespace boost;
    // escaped_list_separator splits on ',' only and keeps empty fields.
    // The default tokenizer<> also splits on spaces and punctuation, which
    // would break "Transportation Cost" and "0.6" apart.
    typedef tokenizer<escaped_list_separator<char> > csv_tokenizer;
    ifstream file("test.csv");
    if (!file.is_open()) {
        cerr << "Unable to open file" << endl;
        return 1;
    }
    string line;
    // Tokenize line by line so fields from adjacent rows cannot run together.
    while (getline(file, line)) {
        csv_tokenizer tok(line);
        for (csv_tokenizer::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
            cout << *beg << '\t';
        }
        cout << endl;
    }
    return 0;
}
Makefile
all: t
t: csv.cpp
g++ -I /usr/include/boost csv.cpp -o t

It looks like each line must be parsed with different logic, so read the first column first and use it to choose the appropriate logic. Below is some pseudocode for that:
std::fstream fs("test.txt");
std::string line;
//
// Read line by line
while (std::getline(fs, line)) {
std::istringstream str(line);
std::string rec_type;
// Read the record type (your first two lines do not appear to have one)
if ( !std::getline(str, rec_type, ',') )
continue;
// Decide type of record, and parse it accordingly
if ( rec_type == "Transportation Cost") {
std::string val;
// Read comma delimited values
if ( !std::getline(str, val, ',') )
continue;
int ival1 = std::stoi(val);
if ( !std::getline(str, val, ',') )
continue;
int ival2 = std::stoi(val);
// ...
}
if ( rec_type == "Tariffs") {
std::string val;
if ( !std::getline(str, val, ',') )
continue;
int ival = std::stoi(val);
// ...
}
}
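To connect the per-record parsing above to the lookup the question asks for, here is a minimal sketch that stores the values by hub and then by continent. The names `Costs`, `HubTable`, `split_csv` and `parse_hubs` are made up for illustration (the question's own Hub and continent classes are not shown), and the order quantity after the dash is simply discarded. It assumes every numeric field is well formed, since `std::stod` throws otherwise.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: one cell group for a (hub, continent) pair.
struct Costs {
    double transport = 0;
    double tariff = 0;
    std::string shipping_type;
    double other_fees = 0;
};

// hub name -> continent name -> costs
using HubTable = std::map<std::string, std::map<std::string, Costs>>;

// Split one line on commas (a trailing empty field is dropped by getline).
std::vector<std::string> split_csv(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ',')) fields.push_back(field);
    return fields;
}

HubTable parse_hubs(std::istream& in) {
    HubTable table;
    std::string line;
    if (!std::getline(in, line)) return table;

    // Header row: the first field is empty, the rest are continent names.
    std::vector<std::string> continents = split_csv(line);
    if (!continents.empty()) continents.erase(continents.begin());

    std::string current_hub;
    while (std::getline(in, line)) {
        const std::vector<std::string> f = split_csv(line);
        if (f.empty()) continue;
        if (f[0].compare(0, 3, "HUB") == 0) {
            // "HUB1-12000": keep the part before the dash as the hub name.
            current_hub = f[0].substr(0, f[0].find('-'));
            continue;
        }
        if (current_hub.empty()) continue;  // data row before any hub row
        for (std::size_t i = 1; i < f.size() && i <= continents.size(); ++i) {
            Costs& c = table[current_hub][continents[i - 1]];
            if (f[0] == "Transportation Cost") c.transport = std::stod(f[i]);
            else if (f[0] == "Tariffs") c.tariff = std::stod(f[i]);
            else if (f[0] == "ShippingType") c.shipping_type = f[i];
            else if (f[0] == "OtherFees") c.other_fees = std::stod(f[i]);
        }
    }
    return table;
}
```

With the sample data from the question, `parse_hubs(file)["HUB1"]["USA"].transport` evaluates to 73.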

One method is to consider each line as a separate record and object.
Let the objects read their data.
For example:
class Tariff
{
int values[3];
public:
friend std::istream& operator>>(std::istream& input, Tariff& t);
};
std::istream& operator>>(std::istream& input, Tariff& t)
{
    // Read and ignore the label "Tariffs"
    std::string name;
    std::getline(input, name, ','); // Read until ',' delimiter.
    // Note: reading an int stops at the ',' without consuming it,
    // so each separator must be read explicitly.
    char comma;
    input >> t.values[0] >> comma >> t.values[1] >> comma >> t.values[2];
    return input;
}
Another method is to read the label first, then delegate to a function that reads in the row.
std::string row_text;
std::getline(text_file, row_text); // Read in first line and ignore.
while (std::getline(text_file, row_text))
{
std::istringstream text_stream(row_text);
std::string label;
std::getline(text_stream, label, ','); // Parse the label.
// Delegate based on label.
// Note: can't use switch for strings.
if (label == "Tariffs")
{
Input_Tariff_Data(text_stream);
}
else if (label == "ShippingType")
{
Input_Shipping_Type_Data(text_stream);
}
//...
} // End-while
The if-else ladder can be replaced by a lookup table that uses function pointers. Sometimes the table is easier to read.
typedef void (*P_Input_Processor)(std::istringstream& text_stream);
struct Table_Entry
{
char const * label;
P_Input_Processor input_processor;
};
//...
const Table_Entry delegation_table[] =
{
{"Tariffs", Input_Tariff_Data},
{"ShippingType", Input_Shipping_Type_Data},
};
const unsigned int entry_quantity =
sizeof(delegation_table) / sizeof(delegation_table[0]);
// ...
std::string row_text;
std::getline(input_file, row_text); // Read and ignore first line.
while (std::getline(input_file, row_text))
{
// Create a stream for parsing.
std::istringstream text_stream(row_text);
// Extract label text
std::string label;
std::getline(text_stream, label, ',');
// Lookup label in table and execute associated function.
for (unsigned int index = 0; index < entry_quantity; ++index)
{
if (label == delegation_table[index].label)
{
// Execute the associated input function
// by calling through the function pointer.
delegation_table[index].input_processor(text_stream);
break;
}
}
}
An alternative to the lookup table is to use:
std::map<std::string, P_Input_Processor>
or
std::map<std::string, void (*)(std::istringstream&)>
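The std::map alternative mentioned above might look like the sketch below. The two handlers are hypothetical stand-ins that only collect the fields into global vectors so the effect is visible; `dispatch_row` is also a made-up name. The initializer list for the map requires C++11.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

typedef void (*P_Input_Processor)(std::istringstream& text_stream);

// Hypothetical storage filled by the handlers below.
std::vector<int> g_tariffs;
std::vector<std::string> g_shipping_types;

// Reads the remaining comma-separated integers of a "Tariffs" row.
void Input_Tariff_Data(std::istringstream& text_stream) {
    std::string field;
    while (std::getline(text_stream, field, ','))
        g_tariffs.push_back(std::stoi(field));
}

// Reads the remaining comma-separated codes of a "ShippingType" row.
void Input_Shipping_Type_Data(std::istringstream& text_stream) {
    std::string field;
    while (std::getline(text_stream, field, ','))
        g_shipping_types.push_back(field);
}

// Looks up the row's label and runs the matching handler.
// Returns false when no handler is registered for the label.
bool dispatch_row(const std::string& row_text) {
    static const std::map<std::string, P_Input_Processor> handlers = {
        {"Tariffs", Input_Tariff_Data},
        {"ShippingType", Input_Shipping_Type_Data},
    };
    std::istringstream text_stream(row_text);
    std::string label;
    std::getline(text_stream, label, ',');
    std::map<std::string, P_Input_Processor>::const_iterator it =
        handlers.find(label);
    if (it == handlers.end()) return false;
    it->second(text_stream);  // the stream is positioned just after the label
    return true;
}
```

Compared with the array version, the map replaces the linear search with `find`, and rows with an unknown label (such as the HUB rows) are reported through the return value instead of being silently skipped.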

Related

Reading a file with getline function, but first column appears empty

I'm trying to read a file and write the data I read into a structure. With the getline function, I read the whole line and then divide it into columns. Each field goes into a member of the structure, and each line becomes a new instance of the structure. The problem is that my first column is empty.
The following code works partially: I read the whole file and all the other columns work perfectly, but the first one is filled with nothing.
This is my structure and the function that fills it:
struct employe {
int num_employe;
string nom;
string prenom;
string date_naissance;
string ville_resi;
int code_postal;
};
employe saisir_1_employe(vector<string> row) {
employe e;
e.num_employe = stoi(row[0]);
e.nom = row[1];
e.prenom = row[2];
e.date_naissance = row[3];
e.ville_resi = row[4];
e.code_postal = stoi(row[5]);
return (e);
}
I extract the data from the file like this :
if (myfile.is_open()) {
while (myfile >> temp) {
row.clear();
// read an entire row and
// store it in a string variable 'line'
getline(myfile, line, '\n');
istringstream s(line);
// read every column data of a row and
// store it in a string variable, 'word'
while (getline(s, word, '\t')) {
// add all the column data
// of a row to a vector
row.push_back(word);
}
e = saisir_1_employe(row);
afficher_1_employe(e);
}
}
The file I extract the data from looks like this: https://pastebin.com/Nfhu2tEp
When I display the second column (cout << row[1]) I get the names perfectly. But when I do cout << row[0] I get an empty column, when it is supposed to be a string that I then convert to an int with e.num_employe = stoi(row[0]). The column is there and has the right number of lines, but it is empty.
I think you should loop like this
while(std::getline(myfile, line, '\n'))
instead of
while (myfile >> temp)
which is cutting away the first word in every line ...
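A small sketch can reproduce the symptom and the fix side by side. The helper names (`split_tabs`, `read_rows_buggy`, `read_rows`) are invented here, and any sample input used with it is made up, since the linked file is not shown. It assumes fields are separated by single tab characters, as in the question's inner loop.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::string> Row;

// Split one line into tab-separated fields.
Row split_tabs(const std::string& line) {
    Row row;
    std::istringstream s(line);
    std::string word;
    while (std::getline(s, word, '\t')) row.push_back(word);
    return row;
}

// Loop shape from the question, kept only to reproduce the symptom.
std::vector<Row> read_rows_buggy(std::istream& in) {
    std::vector<Row> rows;
    std::string temp, line;
    while (in >> temp) {         // extracts the first column into temp
        std::getline(in, line);  // line now begins with the tab that followed it
        rows.push_back(split_tabs(line));
    }
    return rows;
}

// Suggested loop: getline is the only call that consumes input.
std::vector<Row> read_rows(std::istream& in) {
    std::vector<Row> rows;
    std::string line;
    while (std::getline(in, line)) rows.push_back(split_tabs(line));
    return rows;
}
```

In the buggy version the first field of every row is an empty string, because the line handed to the splitter starts with a tab, while the second field still holds the name. That matches what the question describes.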
Use getline() by itself to get each line
There is no need to use this line:
while(myfile >> temp)
This extracts the first word of every line into temp, so the getline() call that follows only sees the rest of the line.
Instead, loop on each line, by calling getline() on the filestream directly:
while (getline(myfile, line, '\n'))
I like your use of stringstream to parse the words. I probably would have used a stringstream in the saisir function as well to do the parsing calls (e.g. instead of stoi()).
#include <string>
#include <sstream>
#include <iostream>
#include <vector>
#include <fstream>
using namespace std;
struct employe {
int num_employe;
string nom;
string prenom;
string date_naissance;
string ville_resi;
int code_postal;
};
employe saisir_1_employe(vector<string> row )
{
employe e;
e.num_employe = stoi(row[0]);
e.nom = row[1];
e.prenom = row[2];
e.date_naissance = row[3];
e.ville_resi = row[4];
e.code_postal = stoi(row[5]);
return (e);
}
int main()
{
fstream myfile{"myfile.txt"};
std::string line;
std::string word;
std::vector<employe> employees;
if (myfile.is_open()) {
// while (myfile >> temp) {
//
// row.clear();
// read an entire row and
// store it in a string variable 'line'
while (getline(myfile, line, '\n')) {
std::vector<std::string> row;
std::istringstream sline(line);
// read every column data of a row and
// store it in a string variable, 'word'
while (getline(sline, word, '\t')) {
// add all the column data
// of a row to a vector
row.push_back(word);
}
employe e;
e = saisir_1_employe(row);
employees.push_back(e);
// afficher_1_employe(e);
}
int index = 1;
// dump number and nom
for(const auto& emp : employees) {
std::cout << index++ << ". " << emp.num_employe
<< " " << emp.nom << std::endl;
}
}
}

Alternative for deleted copy constructor of ifstream

I want to extract data from a csv file but I have to get the number of rows and columns of the table first.
What I have so far is the following:
std::ifstream myfile(filename); // filename is a string and passed in by the constructor
if (myfile.is_open())
{
// First step: Get number of rows and columns of the matrix to initialize it.
// We have to close and re-open the file each time we want to work with it.
int rows = getRows(myfile);
std::ifstream myfile1(filename);
int columns = getColumns(myfile1);
if (rows == columns) // Matrix has to be quadratic.
{
std::ifstream myfile2(filename);
abwicklungsdreieck.set_Matrix(QuantLib::Matrix(rows, columns, 0)); // abwicklungsdreieck is initialised before
//...
}
else
{
std::cout << "\nNumber of rows has to equal number of columns.";
}
}
// [...]
int getRows(std::ifstream &myfile)
{
std::string line;
int rows = 0;
while (std::getline(myfile, line)) // While-loop simply counts rows.
{
rows++;
}
myfile.close();
return rows - 1;
}
int getColumns(std::ifstream &myfile)
{
std::string line;
char delimiter = ';';
size_t pos = 0;
int columns = 0;
while (std::getline(myfile, line) && columns == 0) // Consider first line in the .csv file.
{
line = line + ";";
while ((pos = line.find(delimiter)) != std::string::npos) // Counts columns.
{
line.erase(0, pos + 1);
columns++;
}
}
myfile.close();
return columns - 1;
}
This code works. However, I have to open the file three times, which I do not like. Is there a way to avoid this?
I was thinking about working with temporary files in getRows() and getColumns(), but I recently learned that streams cannot be copied.
So, is there another way to do that? Or can I, for example, avoid the getline() and line.erase() calls?
You can read the file line by line, convert each line to a stream, then read the columns on the stream:
std::ifstream myfile(filename);
if(!myfile) return 0;
std::string line;
while(std::getline(myfile, line))
{
std::stringstream ss(line);
std::string column;
while(std::getline(ss, column, ';'))
{
std::cout << column;
}
std::cout << "\n";
}
getline(myfile, line) will copy each row in to line.
Convert line to ss stream.
getline(ss, column, ';') will break the line in to columns.
Use std::stoi to convert the string in to integer.
If your matrix is based on std::vector, you can grow the vector one row at a time, so you don't need to know the size in advance.
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
void readfile(const std::string &filename)
{
std::vector<std::vector<int>> matrix;
std::ifstream myfile(filename);
if(!myfile) return;
std::string buf;
while(std::getline(myfile, buf))
{
int maxrow = matrix.size();
std::stringstream ss(buf);
matrix.resize(maxrow + 1);
std::cout << "break in to columns:" << buf << "\n";
while(std::getline(ss, buf, ';'))
{
try {
int num = std::stoi(buf);
matrix[maxrow].push_back(num);
}
catch(...) { }
}
}
for(auto &row : matrix) {
for(auto col : row)
std::cout << col << "|";
std::cout << "\n";
}
}
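If the row and column counts are still wanted (for the square-matrix check in the question), they can be derived after a single pass instead of reopening the file. The sketch below is one way to do that under stated assumptions: `Table`, `read_table` and `is_square` are hypothetical names, cells are kept as strings, and unlike the question's getRows() and getColumns() nothing is subtracted for a header row or a label column.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

typedef std::vector<std::vector<std::string>> Table;

// Read the whole stream once, splitting each line on the delimiter.
Table read_table(std::istream& in, char delimiter = ';') {
    Table table;
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> row;
        std::istringstream ss(line);
        std::string cell;
        while (std::getline(ss, cell, delimiter)) row.push_back(cell);
        table.push_back(row);
    }
    return table;
}

// True when the table is non-empty and every row has as many cells
// as there are rows.
bool is_square(const Table& table) {
    for (std::size_t r = 0; r < table.size(); ++r)
        if (table[r].size() != table.size()) return false;
    return !table.empty();
}
```

After `Table t = read_table(myfile);`, the row count is `t.size()` and the column count of the first row is `t[0].size()`. If re-reading really is required, a single std::ifstream can also be rewound with `clear()` followed by `seekg(0)`, provided the helper functions do not close it.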

CSV Parsing with c++

I'm trying to write a program that will parse a CSV database with stock information. Currently, the code searches for a keyword and prints the whole matching row, but I would like it to print each value next to its header in a neatly formatted way.
If I searched for Google, it would return:
Symbol GOOG
NAME Google Inc
High Today $568.77
What the CSV looks like:
Symbol,Name,Price,High Today,Low Today,52 Week Low
GOOG,Google Inc.,$568.77 ,$570.25 ,$560.35
AAPL,Apple Inc.,$93.28 ,$63.89 ,$99.44.
Code:
string NameSearch::getInput()
{
cout << "Enter the name of the company you would like to search for: ";
getline(cin, input);
return input;
}
void NameSearch::NameAlgorithm()
{
string line;
ifstream fs("Stock Database.csv");
while (!fs.eof())
{
getline(fs, line);
string companyname = "";
string a;
int column = 1;
int commacount = 0;
int ChrCount = 0;
while (line != "\0")
{
a = line[ChrCount];
ChrCount++;
if (a == ",")
{
commacount++;
}
else if (commacount == column)
{
companyname.append(a);
}
else if (commacount > column)
{
break;
}
if (companyname == input)
{
cout << endl << line;
}
}
}
}
First a comma should be parsed as whitespace. You can do this by changing the internal std::ctype<charT> facet in the stream's locale:
struct csv_classification : std::ctype<char> {
csv_classification() : ctype(make_table()) { }
private:
static mask* make_table() {
const mask* classic = classic_table();
static std::vector<mask> v(classic, classic + table_size);
v[','] |= space;
v[' '] &= ~space;
return &v[0];
}
};
Then set the locale using:
ifs.imbue(std::locale(ifs.getloc(), new csv_classification));
Next make a manipulator that checks to see if you're at the end of the line. If you are it sets the std::ios_base::failbit flag in the stream state. Also use internal storage to tell if the record belongs as a key or value in the map. Borrowing a bit from Dietmar...
static int row_end = std::ios_base::xalloc();
std::istream& record(std::istream& is) {
while (std::isspace(is.peek())) {
int c(is.peek());
is.ignore();
if (c == '\n') {
is.iword(row_end) = !is.iword(row_end);
is.setstate(std::ios_base::failbit);
}
}
return is;
}
Then you can do:
std::vector<std::string> keys, values;
for (std::string item;;) {
if (ifs >> record >> item)
keys.push_back(item);
else if (ifs.eof())
break;
else if (ifs.iword(row_end)) {
ifs.clear();
while (ifs >> record >> item)
values.push_back(item);
}
else
break;
}
Now we need to apply both the keys and values and print them out. We can create a new algorithm for that:
template<class Iter1, class Iter2, class Function>
void for_each_binary_range(Iter1 first1, Iter1 last1,
Iter2 first2, Iter2 last2, Function f)
{
assert(std::distance(first1, last1) <= std::distance(first2, last2));
while (first1 != last1) {
f(*first1++, *first2++);
}
}
Finally we do:
for_each_binary_range(std::begin(keys), std::end(keys),
std::begin(values), std::end(values),
[&] (std::string const& key, std::string const& value)
{
std::cout << key << ": " << value << std::endl;
});
Here is a solution (the closest to what you requested):
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
using namespace std;
typedef std::vector<std::string> record;
std::istream&
operator>>(
std::istream& is,
record& r)
{
r.clear();
string line;
getline(is, line);
istringstream iss(line);
string field;
while (getline(iss, field, ',' ))
r.push_back(field);
return is;
}
int
main()
{
ifstream file("Stock Database.csv");
record headers, r;
if (file.good())
file >> headers;
while (file >> r)
{
for (size_t i = 0; i < r.size() && i < headers.size(); i++)
cout << headers[i] << ":\t" << r[i] << endl;
cout << "------------------------------" << endl;
}
return 0;
}
// EOF
The content of the "Stock Database.csv" file is:
Symbol,Name,Price,High Today,Low Today,52 Week Low
GOOG,Google Inc.,$568.77 ,$570.25 ,$560.35
AAPL,Apple Inc.,$93.28 ,$63.89 ,$99.44
Do whatever you want with each record. The first read from the file is supposed to give you the headers. Every subsequent read fills the record with the CSV values of the next row.
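The question also asks for only the matching company to be printed. A minimal sketch of that step follows, assuming the company name lives in the second column and that a case-sensitive substring match is acceptable; `split_fields` and `print_match` are illustrative names, not part of the answer above.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Split one CSV line on commas (no support for quoted fields).
std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream iss(line);
    std::string field;
    while (std::getline(iss, field, ',')) fields.push_back(field);
    return fields;
}

// Prints "header: value" lines for the first row whose name column
// contains `query`. Returns false when no row matches.
bool print_match(std::istream& csv, const std::string& query,
                 std::ostream& out, std::size_t name_column = 1) {
    std::string line;
    if (!std::getline(csv, line)) return false;
    const std::vector<std::string> headers = split_fields(line);
    while (std::getline(csv, line)) {
        const std::vector<std::string> row = split_fields(line);
        if (row.size() <= name_column) continue;
        if (row[name_column].find(query) == std::string::npos) continue;
        // Pair each value with its header, stopping at the shorter of the two.
        for (std::size_t i = 0; i < row.size() && i < headers.size(); ++i)
            out << headers[i] << ": " << row[i] << '\n';
        return true;
    }
    return false;
}
```

A call such as `print_match(file, "Google", std::cout)` would then print one header and value pair per line for the Google row only.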

Reading in a file with delimiter and blank lines for hashing program

How do I read in lines from a file and assign specific segments of that line to the information in structs? And how can I stop at a blank line, then continue again until end of file is reached?
Background: I am building a program that will take an input file, read in information, and use double hashing for that information to be put in the correct index of the hashtable.
Suppose I have the struct:
struct Data
{
string city;
string state;
string zipCode;
};
But the lines in the file are in the following format:
20
85086,Phoenix,Arizona
56065,Minneapolis,Minnesota
85281
56065
I cannot seem to figure this out. I am having a really hard time reading in the file. The first line is basically the size of the hash table to be constructed. The next blank line should be ignored. Then the next two lines are information that should go into the struct and be hashed into the hash table. Then another blank line should be ignored. And finally, the last two lines are input that need to be matched to see if they exist in the hash table or not. So in this case, 85281 is not found. While 56065 is found.
This is what I have and it doesn't seem to be doing what I want it to do:
int main(int argc, char *argv[])
{
string str;
//first line of file is size of hashtable
getline(cin, str);
stringstream ss(str);
int hashSize;
ss >> hashSize;
//construct hash table
Location *hashTable = new Location[hashSize];
//skip next line
getline(cin, str);
string blank = " ";
while(getline(cin, str))
{
{
//next lines are data
Location locate;
string line;
getline(cin, line);
istringstream is(line);
getline(is, locate.zipCode, ',');
getline(is, locate.city, ',');
getline(is, locate.state, ',');
insertElementIntoHash(hashTable, locate, hashSize);
}
}
dispHashTable(hashTable, hashSize);
//read third set of lines that check if the zipCodes are in the hashtable or not
while(getline(cin, str))
{
//stop reading at a blank line or in this case, end of file
stringstream is(str);
string searchZipCode;
is >> searchZipCode;
searchElementInHash(hashTable, hashSize, searchZipCode);
}
//delete hash table after use
delete []hashTable;
return 0;
}
You might read the input this way:
#include <iostream>
#include <sstream>
#include <vector>
struct Location
{
std::string city;
std::string state;
std::string zipCode;
};
int main(int argc, char *argv[]) {
std::istringstream input(
"2\n"
"\n"
"85086,Phoenix,Arizona\n"
"56065,Minneapolis,Minnesota\n"
"\n"
"85281\n"
"56065\n"
);
// Make the size unsigned, to avoid signed/unsigned compare warnings.
unsigned hashSize;
std::string line;
getline(input, line);
std::istringstream hash_line(line);
// Ignore white space.
if( ! (hash_line >> hashSize >> std::ws && hash_line.eof())) {
std::cerr << "Error: Invalid file format [1].\n" << line << '\n';
return -1;
}
else {
getline(input, line);
std::istringstream first_blank_line(line);
// Ignore white space.
first_blank_line >> std::ws;
if( ! first_blank_line.eof()) {
// Missing blank line.
std::cerr << "Error: Invalid file format [2].\n" << line << '\n';
return -2;
}
else {
// Have a local variable (No need to allocate it)
// (Is it a hash table !???)
std::vector<Location> hashTable;
hashTable.reserve(hashSize);
while(hashTable.size() < hashSize && getline(input, line)) {
std::istringstream data_line(line);
Location locate;
getline(data_line, locate.zipCode, ',');
getline(data_line, locate.city, ',');
getline(data_line, locate.state); // Note: No comma here.
if(data_line && data_line.eof()) {
// Note: The fields may have leading and/or trailing white space.
std::cout
<< "Insert the location into the hash table.\n"
<< locate.zipCode << '\n'
<< locate.city << '\n'
<< locate.state << '\n';
hashTable.push_back(locate);
}
else {
std::cerr << "Error: Invalid file format [3].\n" << line << '\n';
return -3;
}
}
if(hashTable.size() != hashSize) {
std::cerr << "Error: Invalid file format [4].\n";
return -4;
}
else {
getline(input, line);
std::istringstream second_blank_line(line);
// Ignore white space.
second_blank_line >> std::ws;
if( ! second_blank_line.eof()) {
// Missing blank line.
std::cerr << "Error: Invalid file format [5].\n";
return -5;
}
else {
std::string searchZipCode;
while(input >> searchZipCode) {
// Search element in the hash table
}
}
}
}
}
return 0;
}
Following modification should work:
//skip next line
getline(cin, str);
string blank = " ";
string line;
while(getline(cin, line) && (line != ""))
{
{
//next lines are data
Location locate;
istringstream is(line);
getline(is, locate.zipCode, ',');
getline(is, locate.city, ',');
getline(is, locate.state, ',');
insertElementIntoHash(hashTable, locate, hashSize);
}
}
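The same idea (stop at an empty line, then carry on with the next section) can be factored into a helper that is called once per section. This is only a sketch: `ParsedInput`, `read_block` and `parse_input` are hypothetical names, the hash-table insertion and search are left out, and a line counts as blank only if it is completely empty (a line holding spaces or a stray carriage return would not).

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Location {
    std::string city;
    std::string state;
    std::string zipCode;
};

// Hypothetical holder for the three sections of the input.
struct ParsedInput {
    unsigned tableSize = 0;
    std::vector<Location> locations;
    std::vector<std::string> queries;
};

// Reads lines until a blank line or end of input.
// The blank line itself is consumed and not returned.
std::vector<std::string> read_block(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line) && !line.empty()) lines.push_back(line);
    return lines;
}

ParsedInput parse_input(std::istream& in) {
    ParsedInput result;

    // Section 1: the table size.
    const std::vector<std::string> header = read_block(in);
    if (!header.empty()) {
        std::istringstream size_stream(header[0]);
        size_stream >> result.tableSize;
    }

    // Section 2: "zip,city,state" records.
    for (const std::string& line : read_block(in)) {
        std::istringstream fields(line);
        Location loc;
        std::getline(fields, loc.zipCode, ',');
        std::getline(fields, loc.city, ',');
        std::getline(fields, loc.state);
        result.locations.push_back(loc);
    }

    // Section 3: zip codes to look up, up to end of input.
    result.queries = read_block(in);
    return result;
}
```

Calling `parse_input(std::cin)` on the sample file would give a table size of 20, two locations and two zip codes to search for.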

How to do parsing istringstream C++?

I need to parse and print some data from a stream, an istringstream set up in main().
example:
void Add ( istream & is )
{
string name;
string surname;
int data;
while ( //something )
{
// Here I need parse stream
cout << name;
cout << surname;
cout << data;
cout << endl;
}
}
int main ( void )
{
is . clear ();
is . str ( "John;Malkovich,10\nAnastacia;Volivach,30\nJohn;Brown,60\nJames;Bond,30\n" );
a . Add ( is );
return 0;
}
How do I parse this line
is.str ("John;Malkovich,10\nAnastacia;Volivach,30\nJohn;Brown,60\nJames;Bond,30\n");
into name;surname,data?
This is somewhat fragile, but if you know your format is exactly what you posted, there's nothing wrong with it:
while(getline(is, name, ';') && getline(is, surname, ',') && is >> data)
{
is.ignore(); // ignore the new line
/* ... */
}
If you know the delimiters will always be ; and ,, it should be fairly easy:
string record;
getline(is, record); // read one line from is
// find ; for first name
size_t semi = record.find(';');
if (semi == string::npos) {
// not found - handle error somehow
}
name = record.substr(0, semi);
// find , for last name
size_t comma = record.find(',', semi);
if (comma == string::npos) {
// not found - handle error somehow
}
surname = record.substr(semi + 1, comma - (semi + 1));
// convert number to int
istringstream convertor(record.substr(comma + 1));
convertor >> data;
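Both approaches above can be combined into a line-by-line version that skips malformed records instead of stopping at the first one. The sketch below assumes the `name;surname,data` layout from the question; `Person`, `parse_person` and `parse_people` are names made up for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Person {
    std::string name;
    std::string surname;
    int data = 0;
};

// Parses one "name;surname,data" record.
// Returns false if a delimiter or the number is missing.
bool parse_person(const std::string& record, Person& out) {
    std::istringstream fields(record);
    if (!std::getline(fields, out.name, ';')) return false;
    if (!std::getline(fields, out.surname, ',')) return false;
    return static_cast<bool>(fields >> out.data);
}

// Reads the stream one line at a time and keeps the records that parse.
std::vector<Person> parse_people(std::istream& is) {
    std::vector<Person> people;
    std::string record;
    while (std::getline(is, record)) {
        Person p;
        if (parse_person(record, p)) people.push_back(p);
    }
    return people;
}
```

Reading a whole line first means a bad record cannot leave the outer stream in a failed state, which is the main fragility of chaining the getline calls directly on `is`.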