Hi all, I'm just starting to learn CSV file handling in C++. Currently this code works: it can print out the 'math' column.
But that only works because I assign every column to a variable using getline(ss, #any column variable#, ','),
and then print the column I want. If I use this on a big list, say a CSV file with about 100 columns, how can I simplify it? Is there a way to get only a specific column without assigning/parsing each column into its own variable? Say, out of 100 columns, I only want column 47, whatever its name is. Or could I get the column by its name?
Thanks.
Is there a way to get only a specific column without assigning/parsing each column into its own variable?
With the CSV format it isn't really practical to avoid reading every column, because fields have variable width and you can only find column N by scanning past the ones before it. What you want to do is discard the columns you don't need, much like you are already doing.
To make it work with an unknown number of columns, you can read each row into a std::vector, which is a dynamically sized array and so well suited to cases like this.
std::vector<std::string> read_csv_line(const std::string &line)
{
std::vector<std::string> ret;
std::string val;
std::stringstream ss(line);
while (std::getline(ss, val, ','))
ret.push_back(std::move(val));
return ret;
}
...
std::getline(is, line);
auto row = read_csv_line(line);
if (row.size() > 10) // Check each row is expected size!
std::cout << row[0] << ", " << row[10] << std::endl;
else std::cerr << "Row too short" << std::endl;
You can then access the specific columns you want.
Or could I get the column by its name?
Assuming your CSV file has a header line, you can read that into say a std::unordered_map<std::string, size_t> where the value is the column index. Alternatively something like a std::vector with std::find.
Note that handling of quoted values, and some other possible CSV features can't be done with a single std::getline.
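The header lookup described above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper names split_csv_line and index_header are made up for this example, the first line of the file is taken to be the header, and fields are split on plain commas with no support for quoted values.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: split one CSV line on commas (no quoted-field support).
std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::stringstream ss(line);
    while (std::getline(ss, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Hypothetical helper: map each header name to its zero-based column index.
std::unordered_map<std::string, std::size_t> index_header(const std::string& header_line) {
    std::unordered_map<std::string, std::size_t> index;
    const std::vector<std::string> names = split_csv_line(header_line);
    for (std::size_t i = 0; i < names.size(); ++i) {
        index.emplace(names[i], i);  // emplace keeps the first entry if a name repeats
    }
    return index;
}
```

To use it, look the name up with find() (or at(), which throws if the name is missing), then check that the row is long enough before indexing into it, as in the row.size() check earlier in this answer.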
Here's a quick [working] example.
The 1st part reads in the table.
The 2nd part (after fin.close()) lets you choose what you want to print out (or whatever you choose to do with it).
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <vector>
#include <algorithm> //std::find
using namespace std;
int main(int argc, char** argv)
{
ifstream fin("filename");
string line;
int rowCount=0;
int rowIdx=0; //keep track of inserted rows
//count the total nb of lines in your file
while(getline(fin,line)){
rowCount++;
}
//this will be your table. A row is represented by data[row_number].
//If you want to access the name of the column #47, you would
//cout << data[0][46]. 0 being the first row(assuming headers)
//and 46 is the 47 column.
//But first you have to input the data. See below.
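vector<vector<string>> data(rowCount); //one (initially empty) row per line; a variable-length array is not standard C++
fin.clear(); //clear eofbit/failbit (ie: continue using fin.)
fin.seekg(0, ios::beg); //rewind stream to start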
while(getline(fin,line)) //for every line in input file
{
stringstream ss(line); //copy line to stringstream
string value;
while(getline(ss,value,',')){ //for every value in that stream (ie: every cell on that row)
data[rowIdx].push_back(value);//add that value at the end of the current row in our table
}
rowIdx++; //increment row number before reading in next line
}
fin.close();
//Now you can choose to access the data however you like.
//If you want to printout only column 47...
size_t colNum=46; //zero-based index of the column you want to printout (46 is column 47)
for(int row=0; row<rowCount; row++)
{
if(colNum < data[row].size()) //skip rows that are too short
cout << data[row][colNum] << "\t"; //print every value in column 47 only
}
cout << endl;
return 0;
}
EDIT: Adding this for a more complete answer.
To search a column by name, replace the last for loop with this snippet
//if you want to look up a column by name, instead of by column number...
//Use find on that row to get its column number.
//Then you can printout just that column.
string colName = "computer science";
//1.Find the index of column name "computer science" on the first row, using iterator
vector<string>::iterator it = find(data[0].begin(), data[0].end(),colName);
//if "it == data[0].end()", it means that that column name was not found
if(it == data[0].end())
{
cerr << "Column not found: " << colName << endl;
return 1;
}
//calculate its index (ie: column number integer)
size_t colNum = std::distance(data[0].begin(), it);
//2. Print the column with the header "computer science"
for(int row=0; row<rowCount; row++)
{
if(colNum < data[row].size()) //skip rows that are too short
cout << data[row][colNum] << "\t"; //print every value in that column only
}
cout << endl;
return 0;
}
I have an Excel file whose first column holds IDs, e.g.:
ID
12
32
45
12
..
There are other columns as well, but I only want to read the data in the first column, i.e. ID.
Here is my code, which throws an exception. I don't know why.
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <vector>
#include <algorithm>
#include<string>
#include<cstdlib>
//std::find
#include<cstring>
using namespace std;
int main()
{
ifstream fin("1.csv");
string line;
int rowCount = 0;
int rowIdx = 0; //keep track of inserted rows
//count the total nb of lines in your file
while (getline(fin, line)) {
rowCount++;
}
//this will be your table. A row is represented by data[row_number].
//If you want to access the name of the column #47, you would
//cout << data[0][46]. 0 being the first row(assuming headers)
//and 46 is the 47 column.
//But first you have to input the data. See below.
std::vector<std::vector<std::string>> data;
fin.clear(); //remove failbit (ie: continue using fin.)
fin.seekg(fin.beg); //rewind stream to start
while (getline(fin, line)) //for every line in input file
{
stringstream ss(line); //copy line to stringstream
string value;
while (getline(ss, value, ',')) { //for every value in that stream (ie: every cell on that row)
data[rowIdx].push_back(value);//add that value at the end of the current row in our table
}
rowIdx++; //increment row number before reading in next line
}
fin.close();
//Now you can choose to access the data however you like.
//If you want to printout only column 47...
int colNum;
string colName = "ID";
//1.Find the index of column name "computer science" on the first row, using iterator
//note: if "it == data[0].end()", it means that that column name was not found
vector<string>::iterator it = find(data[0].begin(), data[0].end(), colName);
//calulate its index (ie: column number integer)
colNum = std::distance(data[0].begin(), it);
//2. Print the column with the header "computer science"
for (int row = 0; row < rowCount; row++)
{
cout << data[row][colNum] << "\t"; //print every value in column 47 only
}
cout << endl;
return 0;
}
Kindly help me fix the issue. I want to display only the first column, which contains the IDs.
Here's a simplified version of the above code. It gets rid of the 2D vector, and only reads the first column.
std::vector<std::string> data;
ifstream fin("1.csv");
string line;
while (getline(fin, line)) //for every line in input file
{
stringstream ss(line); //copy line to stringstream
string value;
if (getline(ss, value, ',')) {
data.push_back(value);
}
}
EDIT
How to display the data
for (size_t i = 0; i < data.size(); ++i)
cout << data[i] << '\n';
The problem:
std::vector<std::vector<std::string>> data;
Allocates a vector of vectors. Both dimensions are currently set to 0.
data[rowIdx].push_back(value);
pushes into and potentially resizes the inner vector, but the outer vector remains size 0. No value of rowIdx is valid. The naive solution is to use rowCount to size the outer vector, but it turns out that's a waste. You can assemble whole rows and then push_back the row into data, but even this is a waste since only one column is needed.
One of the beauties of vector is that you don't have to know the number of rows. You make a row vector, push the columns into it, then push the row into data. When you hit the end of the file, you're done reading. There is no need to read the file twice, but you probably trade off a bit of wasted storage on data's last self-resize.
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <vector>
// reduced includes to minimum needed for this example
// removed using namespace std; to reduce odds of a naming collision
int main() {
std::ifstream fin("1.csv"); // Relative path, so watch out for problems
// with working directory
std::string line;
std::vector<std::string> data; // only need one column? Only need one
// dimension
if (fin.is_open())
{
while (std::getline(fin, line)) //for every line in input file
{
std::stringstream ss(line);
std::string value;
if (std::getline(ss, value, ','))
{
// only take the first column If you need, say, the third
// column read and discard the first two columns
// and store the third
data.push_back(value); // vector sizes itself with push_back,
// so there is no need to count the rows
}
}
}
else
{
std::cerr << "Cannot open file\n";
return -1;
}
// use the column of data here
}
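The comment in the loop above mentions reading and discarding earlier columns to reach a later one. A minimal sketch of that idea might look like this; the function name read_column and its zero-based wanted parameter are assumptions made for this example, and rows with too few fields are simply skipped.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect the zero-based column `wanted` from every line of `in`.
// Rows that do not have enough fields are skipped.
std::vector<std::string> read_column(std::istream& in, std::size_t wanted) {
    std::vector<std::string> column;
    std::string line;
    while (std::getline(in, line)) {
        std::stringstream ss(line);
        std::string value;
        bool found = true;
        // Read wanted + 1 fields; each read overwrites the previous one,
        // so the earlier columns are discarded as we go.
        for (std::size_t i = 0; i <= wanted; ++i) {
            if (!std::getline(ss, value, ',')) {
                found = false;  // the row ended before reaching the wanted column
                break;
            }
        }
        if (found) {
            column.push_back(value);
        }
    }
    return column;
}
```

Taking a std::istream& rather than a file name means the same function works with the std::ifstream from the example above or with a std::istringstream when experimenting. Like the rest of this thread, it does not handle quoted fields.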
I have a problem with binary_search: it works, but only if the whole string is given as the search key.
I want to search by just a keyword, and get "found" if the search key is part of one of the strings in the sorted vector of strings.
case 5: // case for searching by phone number
cout << "Enter phone number to search for: " << endl;
getline(cin >> ws, key);
vector<string> mylines_sorted;
for (int i = 0; i < mylines.size(); i++) {
mylines_sorted.push_back(mylines[i]); // vector of strings is transferred to new vector of strings
}
sort(mylines_sorted.begin(), mylines_sorted.end());
for (int i = 0; i < mylines.size(); i++) {
cout << mylines_sorted[i] << endl; // just a check if data is sorted
}
bool result = binary_search(mylines_sorted.begin(), mylines_sorted.end(), key);
cout << result << endl; // another check
if (result == false) {
cout << "Search found nothing...!" << endl;
}
else {
cout << "Search: " << key << " exists in the data file!" << endl;
}
break;
}
return 0;
string line;
vector<string> mylines;
while (getline(database, line)) {
mylines.push_back(line);
}
I don't know if this part is relevant (I don't think so), but this is how I transfer the data from the data file into a vector of strings, and this is the struct I use:
struct Data {
char navn[80];
char addresse[80];
int alder;
unsigned int tlf;
};
There's a very simple way to get "words" from a string: Put the string into an std::istringstream and use std::istream_iterator<std::string> to get words out of it.
Combine this with the vector's insert function to add the strings to the vector you sort and search.
For example something like this:
// For each line...
for (auto const& line : mylines)
{
// Put the line into an input string stream
std::istringstream iss(line);
// Read from the string stream, adding words to the sorted_mylines vector
sorted_mylines.insert(end(sorted_mylines),
std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>());
}
After the above, sorted_mylines will contain all the words from all the lines in mylines.
You can now sort it and search for individual words. Or just skip the sorting and do a linear search.
Considering your edit, and the structure you use, I suggest you first read the file, parse each line into the corresponding structure, and create a vector of that instead of working with lines.
Then you could easily search for either name (which I recommend you split into separate first and last name) or address (which I recommend you also split up into its distinct parts, like street name, house number, postal code, etc.).
If you split it up, it becomes much easier to search for specific parts. If you want a more generic search, do a linear loop over all entries and look in all relevant structure members.
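As a rough sketch of the linear search mentioned above, here is one way to check whether a key appears anywhere inside any line. std::binary_search compares whole elements, so a substring match needs a scan like this instead. The function name contains_substring is made up for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: true if `key` occurs anywhere inside at least one line.
bool contains_substring(const std::vector<std::string>& lines, const std::string& key) {
    return std::any_of(lines.begin(), lines.end(),
                       [&key](const std::string& line) {
                           // find returns npos when key is not part of line
                           return line.find(key) != std::string::npos;
                       });
}
```

The vector does not need to be sorted for this. Note that an empty key matches every line, so you may want to reject empty input before searching.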
Struggling with current file input/output. I know to use cctype to distinguish between the different groups but I have no idea how to get the total of each. Any help would be super appreciated
Here's the assignment.
Write a program to gather the number of times particular groups of characters appear in a file.
Allow the user to specify which file to process.
Count how often the following types of characters occur:
Alphabetic characters
Numeric characters
Control characters
Upper case characters
Lower case characters
Punctuation characters
Total number of characters in the file
Have the program output the gathered results to a file name “stats.txt”.
Display the information (left justified) in the order shown above.
For each group of character types add a column that shows how often these characters appear.
Add an additional column that shows what percentage this group comprises of the total
number of characters in the file. Display this number with one decimal place. (Note: The
percentage is not required on the line that displays the total number of characters processed.)
Please ensure all numerical columns are right justified, display appropriate precision and form a nice vertical table.
Here's my code so far:
#include <iostream>
#include <cctype>
#include <fstream>
#include <iomanip>
#include <string>
//included libraries//
using namespace std;
//function definitions//
int main() {
int count;
int character;
string filename;
int alpha = 0, num = 0, con = 0, UC = 0, LC = 0, pun = 0, total = 0;
cout << "Enter file name and extension to process.\n";
cin >> filename;
ofstream fileout;
ifstream filein;
//input file//
filein.open(filename);
if (filein.fail()) {
cerr << "Failed to open the file: " << filename << endl;
}
//open file//
fileout.open("stats.txt");
if (fileout.fail()) {
cout << "Error, unable to open output file\n";
system("pause");
exit(1);
}
fileout.setf(ios::fixed);
fileout.setf(ios::showpoint);
fileout.precision(3);
filein.close();
fileout.close();
system("pause");
return 0;
}
I have no idea how to get the total of each
For this I suggest making a struct for a classification function containing:
The name, like Alphabetic.
A pointer to a classification function, like std::isalpha.
The total count.
The struct could look like this:
struct classifier {
std::string_view heading; // e.g. "Alphabetic"
int (*class_func)(int); // e.g. std::isalpha
std::uintmax_t count = 0; // start out with zero
};
You could then create an array of such structs:
std::array<classifier, 6> classifiers{{
{"Alphabetic", std::isalpha},
{"Numeric", std::isdigit},
{"Control", std::iscntrl},
{"Upper case", std::isupper},
{"Lower case", std::islower},
{"Punctuation", std::ispunct},
}};
Now, for each character you read from the file, you loop through the classifiers and check the character against the classification function and add to the count in the classifier if it was a match:
char ch = .. read from file ...;
++total_chars;
for(auto& [_, func, count] : classifiers) {
count += func(static_cast<unsigned char>(ch)) != 0;
}
When the whole file has been read, the sum of each should be in each classifier:
for(auto& [heading, _, count] : classifiers) {
std::cout << heading << " characters " << '\t' << count << '\n';
}
I am trying to read a file by column. The goal is to show all the values of one column of my file.
I am trying to do it with vectors.
void search(){
const int COLUMNS = 4;
vector< vector <int> > data;
string filename = "bla.txt";
ifstream ifile(filename.c_str());
if (ifile.is_open()) {
int num;
vector <int> numbers_in_line;
while (ifile >> num) {
numbers_in_line.push_back(num);
if (numbers_in_line.size() == COLUMNS) {
data.push_back(numbers_in_line);
numbers_in_line.clear();
}
}
}
else {
cerr << "There was an error opening the input file!\n";
exit(1);
}
//now get the column from the 2d vector:
vector <int> column;
int col = 2;//example: the 2nd column
for (int i = 0; i < data.size(); ++i) {
column.push_back(data[i][col - 1]);
cout << column[i] << endl;
}
ifile.close();
}
my file looks like:
John 1990 1.90 1
Peter 1980 1.88 0
...
This code compiles, but no values are shown in the console. When I debug, breakpoints on the last lines are never hit, so I guess those lines do nothing?
while (ifile >> num) {
The loop is never entered because num is an int and the first element of the input line is John, which cannot be interpreted as an int, so ifile is set to an error state and the loop condition is immediately false.
The clean fix is to first read the entire line with std::getline and then tokenise the resulting std::string, for example with an std::istringstream.
Individual std::string tokens resulting from that tokenisation can then be converted to appropriate types with functions like std::stoi.
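A minimal sketch of that approach, for lines shaped like "John 1990 1.90 1", might look like this. The struct person and its field names are guesses made for this example (the question does not say what the columns mean), and tokenize and parse_person are hypothetical helper names.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type; the field names are guesses at what the columns mean.
struct person {
    std::string name;
    int year;
    double height;
    int flag;
};

// Split a line into whitespace-separated tokens.
std::vector<std::string> tokenize(const std::string& line) {
    std::istringstream iss(line);
    std::vector<std::string> tokens;
    std::string token;
    while (iss >> token) {
        tokens.push_back(token);
    }
    return tokens;
}

// Convert one line into a person.
// Throws std::invalid_argument if the column count is wrong or a number cannot be parsed
// (std::stoi and std::stod can also throw std::out_of_range).
person parse_person(const std::string& line) {
    const std::vector<std::string> t = tokenize(line);
    if (t.size() != 4) {
        throw std::invalid_argument("expected 4 columns");
    }
    return person{t[0], std::stoi(t[1]), std::stod(t[2]), std::stoi(t[3])};
}
```

Call parse_person on each line obtained with std::getline and push the results into a std::vector<person>; printing one column is then a loop over a single member.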
Do step by step and make sure each step is correct.
Read each line, then print out to make sure you are doing it correctly.
You need to split each line. After this step, you will have John, 1990, etc. as strings.
Now convert the 2nd through 4th columns into numbers (the third one, 1.90, is not an integer).
There are good solutions for each step that you can easily find.
I have a large file (50x11k) containing a grid of numbers. All I am trying to do is place the values into a vector so that I can access the values of different lines at the same time. I get a seg fault every time (I cannot even do a cout before the while loop). Does anyone see the issue?
If there is an easier way to do this, please let me know. It's a large file and I need to be able to compare the values of one row with another, so a simple getline does not work. Is there a way to jump around a file and not "grab" the lines, but just "examine" them, so that I can later go back and examine the same line by giving its number? Like looking at the file as a big array? I want to look at the 3rd line and the 5th character in that line at the same time as I look at the 56th line and its 9th character, something like that.
#include <stdlib.h>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
//int g_Max = 0;
int main() {
vector<vector<string> > grid;
ifstream in("grid.txt");
int row = 0;
int column = 0;
string c;
if (!in) {
cout << "NO!";
}
while (!in.eof()) {
c = in.get();
if ( c.compare("\n") == 0) {
row++;
column = 0;
}
else {
c = grid[column][row];
cout << grid[column][row];
column++;
}
}
return 0;
}
vector<vector<string> > grid;
This declares an empty vector, with no elements.
c = grid[column][row];
This accesses elements of the vector, but there are no elements.
If you change it to use vector::at() instead of vector::operator[] like so:
c = grid.at(column).at(row);
then you'll get exceptions telling you you're accessing out of range.
You need to populate the vector with elements before you can access them. One way is to declare it with the right number of elements up front:
vector<vector<string> > grid(11000, std::vector<string>(50));
You probably also want to fix your I/O loop: testing !in.eof() is usually wrong. Why not read a line at a time and split the line up, instead of reading single characters?
while (getline(in, c))
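As a rough sketch of the line-at-a-time approach, the grid could be built by growing the vectors with push_back instead of indexing into empty ones. This assumes the values in grid.txt are separated by whitespace, which the question does not actually state, and read_grid is a name made up for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: read a whitespace-separated grid, one row per line.
std::vector<std::vector<std::string>> read_grid(std::istream& in) {
    std::vector<std::vector<std::string>> grid;
    std::string line;
    while (std::getline(in, line)) {      // the loop ends cleanly at end of file
        std::istringstream iss(line);
        std::vector<std::string> row;
        std::string cell;
        while (iss >> cell) {
            row.push_back(cell);          // the row grows as cells are read
        }
        grid.push_back(std::move(row));   // the grid grows as rows are read
    }
    return grid;
}
```

Note that the result is indexed as grid[row][column]. Using grid.at(r).at(c) while debugging gives an exception instead of a crash when an index is out of range.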
If all you need is to access all the lines at once, why not declare it as std::vector<std::string>, with each line stored as one string?
std::string s;
std::vector<std::string> lines;
while( std::getline(in, s) ) lines.push_back( s );
std::cout << "File contains " << lines.size() << " lines" << std::endl;
std::cout << "Char at [1][2] is " << lines[1][2] << std::endl; // assume [1][2] is valid!