I'm trying to read in a CSV file that contains rows of 3 people/patients, where col 1 is userid, col 2 is fname, col 3 is lname, col 4 is insurance, and col 5 is version that looks something like below.
Edit: Apologies, I simply copy/pasted my CSV spreadsheet in here, so it didn't show the commas before. Wouldn't it look something more like below? John below also pointed out that there are no commas after the version, and this seemed to fix the issue! Thanks so much John! ( trying to figure out how I can accept your answer :) )
nm92,Nate,Matthews,Aetna,1
sc91,Steve,Combs,Cigna,2
ml94,Morgan,Lands,BCBS,3
I'm trying to use getline() inside of a loop to read everything in, and it works fine for the first iteration, but getline() seems to be causing it to skip a value on the next iterations. Any idea how I can solve this?
I'm also not sure why the output looks like below, because I'm not seeing where the lines w/ "sc91" and "ml94" are being printed in the code. This is what the output of the current code looks like.
userid is: nm92
fname is: Nate
lname is: Matthews
insurance is: Aetna
version is: 1
sc91
userid is: Steve
fname is: Combs
lname is: Cigna
insurance is: 2
ml94
version is: Morgan
userid is: Lands
fname is: BCBS
lname is: 3
insurance is:
version is:
I've done a ton of research on differences between getline() and the >> stream operator, but most of the getline() materials seem to revolve around getting input from cin rather than reading from a file like here, so I'm thinking there's something going on w/ getline() and how it's reading the file that I'm not understanding. Unfortunately when I tried >> operator, that forces me to use the strtok() function, and I was struggling a lot with c strings and assigning them to an array of C++ strings.
#include <iostream>
#include <string> // for strings
#include <cstring> // for strtok()
#include <fstream> // for file streams
using namespace std;
struct enrollee
{
string userid = "";
string fname = "";
string lname = "";
string insurance = "";
string version = "";
};
int main()
{
const int ENROLL_SIZE = 1000; // used const instead of #define since the performance diff is negligible,
const int numCols = 5; // while const allows for greater utility/debugging bc it is known to the compiler ,
// while #define is a preprocessor directive
ifstream inputFile; // create input file stream for reading only
struct enrollee enrollArray[ENROLL_SIZE]; // array of structs to store each enrollee and their respective data
int arrayPos = 0;
// open the input file to read
inputFile.open("input.csv");
// read the file until we reach the end
while(!inputFile.eof())
{
//string inputBuffer; // buffer to store input, which will hold an entire excel row w/ cells delimited by commas
// must be a c string since strtok() only takes c string as input
string tokensArray[numCols];
string userid = "";
string fname = "";
string lname = "";
string insurance = "";
string sversion = "";
//int version = -1;
//getline(inputFile,inputBuffer,',');
//cout << inputBuffer << endl;
getline(inputFile,userid,',');
getline(inputFile,fname,',');
getline(inputFile,lname,',');
getline(inputFile,insurance,',');
getline(inputFile,sversion,',');
enrollArray[0].userid = userid;
enrollArray[0].fname = fname;
enrollArray[0].lname = lname;
enrollArray[0].insurance = insurance;
enrollArray[0].version = sversion;
cout << "userid is: " << enrollArray[0].userid << endl;
cout << "fname is: " << enrollArray[0].fname << endl;
cout << "lname is: " << enrollArray[0].lname << endl;
cout << "insurance is: " << enrollArray[0].insurance << endl;
cout << "version is: " << enrollArray[0].version << endl;
}
}
Your problem is that there is no comma after the final data item in each line, so
getline(inputFile,sversion,',');
is incorrect because it reads up to the next comma, which is actually on the following line, after the user id of the next patient. This explains the output you see, where the user id of the next patient gets printed together with the version.
To fix this simply replace the code above with
getline(inputFile,sversion);
which will read to the end of line as required.
Regarding your function: if you look at the structure of the source file, you will see that each line contains 5 strings, separated by ",". So it is a typical CSV file.
A call to std::getline without a delimiter will read a complete line with all 5 strings. In your code you call std::getline once for each single string, each time reading up to a comma. A comma is not present after the last string, so that will not work. You should use getline to read the complete line instead.
You need to read the whole line and then tokenize it.
I will show you an example of how to do that with std::sregex_token_iterator. That is very simple. Additionally, we will overload the inserter and extractor operators. With that, you can easily read and write "Enrollee" data, like Enrollee e{}; std::cout << e;
Additionally I use C++ algorithms. That makes life very easy. Input and Output are a one-liner in main.
Please see:
#include <iostream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <regex>
#include <string>
struct Enrollee
{
// Data
std::string userid{};
std::string fname{};
std::string lname{};
std::string insurance{};
std::string version{};
// Overload Extractor Operator to read data from somewhere
friend std::istream& operator >> (std::istream &is, Enrollee& e) {
std::vector<std::string> wordsInLine{}; // Here we will store all words that we read in one line
std::string wholeLine; // Temporary storage for the complete line that we will get by getline
std::regex separator("[ \\;\\,]"); // Separators for a CSV file: space, semicolon or comma
std::getline(is, wholeLine); // Read one complete line and split it into parts
std::copy(std::sregex_token_iterator(wholeLine.begin(), wholeLine.end(), separator, -1), std::sregex_token_iterator(), std::back_inserter(wordsInLine));
// If we have read all expected strings, then store them in our struct
if (wordsInLine.size() == 5) {
e.userid = wordsInLine[0];
e.fname = wordsInLine[1];
e.lname = wordsInLine[2];
e.insurance = wordsInLine[3];
e.version = wordsInLine[4];
}
return is;
}
// Overload Inserter operator. Insert data into output stream
friend std::ostream& operator << (std::ostream& os, const Enrollee& e) {
return os << "userid is: " << e.userid << "\nfname is: " << e.fname << "\nlname is: " << e.lname << "\ninsurance is: " << e.insurance << "\nversion is: " << e.version << '\n';
}
};
int main()
{
// Here we will store all Enrollee data in a dynamically growing vector
std::vector<Enrollee> enrollmentData{};
// Define inputFileStream and open the csv
std::ifstream inputFileStream("r:\\input.csv");
// If we could open the file
if (inputFileStream) {
// Then read all csv data
std::copy(std::istream_iterator<Enrollee>(inputFileStream), std::istream_iterator<Enrollee>(), std::back_inserter(enrollmentData));
// For Debug Purposes: Print all data to cout
std::copy(enrollmentData.begin(), enrollmentData.end(), std::ostream_iterator<Enrollee>(std::cout, "\n"));
}
else {
std::cerr << "Could not open file 'input.csv'\n";
}
}
This will read the input file "input.csv" containing
nm92,Nate,Matthews,Aetna,1
sc91,Steve,Combs,Cigna,2
ml94,Morgan,Lands,BCBS,3
And show as output:
userid is: nm92
fname is: Nate
lname is: Matthews
insurance is: Aetna
version is: 1
userid is: sc91
fname is: Steve
lname is: Combs
insurance is: Cigna
version is: 2
userid is: ml94
fname is: Morgan
lname is: Lands
insurance is: BCBS
version is: 3
This is only an idea, but it could help you. It's a piece of code from a project I am working on:
std::vector<std::string> ARDatabase::split(const std::string& line, char delimiter)
{
std::vector<std::string> tokens;
std::string token;
std::istringstream tokenStream(line);
while (std::getline(tokenStream, token, delimiter))
{
tokens.push_back(token);
}
return tokens;
}
void ARDatabase::read_csv_map(std::string root_csv_map)
{
qDebug() << "Starting to read the people database...";
std::ifstream file(root_csv_map);
std::string str;
while (std::getline(file, str))
{
std::vector<std::string> tokens = split(str, ' ');
std::vector<std::string> splitnames = split(tokens.at(1), '_');
std::string name_w_spaces;
for(auto i: splitnames) name_w_spaces = name_w_spaces + i + " ";
people_names.insert(std::make_pair(stoi(tokens.at(0)), name_w_spaces));
people_images.insert(std::make_pair(stoi(tokens.at(0)), std::string("database/images/" + tokens.at(2))));
}
}
Instead of std::vector, you might want to use another container that is more suitable for your case. The last example is written for the input format of my case, but you can easily adapt it to your code.
I am trying to remove the leading and trailing space from the lines of my file, but it does not work.
These are the txt file contents:
392402 wench
I have tried printing out my code, and this is what is displayed.
first: 392402 wench second:
I want it to display this instead
first: 392402 second: wench
this is my code
void readFile(const string &fileName) {
int limit;
ifstream ifs(fileName);
string::size_type position;
key_type item;
mapped_type count;
string line;
if (ifs.is_open()) {
ifs >> limit;
for (int i = 0; i < limit; i++) {
getline(ifs, line);
position = line.find(" ", 0);
auto c = line.substr(position + 1);
item = line.substr(0, position);
cout << "first: " << c << " second: " << item << endl;
value_type value(item, count);
values.push_back(value);
}
} else {
cout << "Can't open file.";
}
What am I doing wrong? Thank you
The two biggest mistakes you're making are (a) not checking your values for expected output as you go, and (b) not running your code in a debugger to see what is really happening. If you had, the values of position, c, and item would have been blatantly wrong, and you could then surmise where to go from there.
Setting aside the highly likely possibility that the loop is broken from the first iteration, because you never consumed the remainder of the line that limit was read from, let's look at the actual data and what you're asking of it with your code.
We read this entire line (note the leading space):
 392402 wench
You then ask "find the first single-space string in this line" via this code:
position = line.find(" ", 0);
Well, that would be here:
 392402 wench
^here
So position is zero (0). You then ask for the sub-string, starting at that position + 1, through the end of the string with this code:
auto c = line.substr(position + 1);
Therefore c now contains (leading space removed via the +1):
392402 wench
Now we build item, which is done with this line:
item = line.substr(0, position);
Remember, position is zero, so you're asking for the string, starting at location 0, length 0. As you can imagine, that isn't going to amount to anything. So now item is an empty string.
Finally, the output statement:
cout << "first: " << c << " second: " << item << endl;
will produce:
first: 392402 wench second:
I.e. exactly what you're seeing. And that's it. Clearly this is wrong.
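If you want to stay with find and substr, one way out is to strip the surrounding whitespace first and only then look for the separating space. The following is a minimal sketch under that assumption. The helper names trim and splitAtFirstSpace are made up for this illustration and are not part of your code.

```cpp
#include <string>

// Returns a copy of s without leading and trailing whitespace
// (space, tab, carriage return, line feed).
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";   // only whitespace
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Trims the line, then splits it at its first space into (before, after).
// Returns false if the trimmed line contains no space at all.
bool splitAtFirstSpace(const std::string& line, std::string& before, std::string& after) {
    const std::string t = trim(line);
    const auto pos = t.find(' ');
    if (pos == std::string::npos) return false;  // no separator found
    before = t.substr(0, pos);
    after = trim(t.substr(pos + 1));
    return true;
}
```

Because the line is trimmed before the search, the first space found is the one between the two values and not the leading one.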
Alternative
Use better error checking, value checking, and a string stream for per-line extraction. The following code doesn't give two cents about your type aliases (mainly because you didn't include them anyway and I'd rather not loft any guesses as to their origin).
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <limits>
// Expects a file with the following format:
// count
// name1 value1
// name2 value2
// ...
void readFile(const std::string &fileName)
{
std::ifstream ifs(fileName);
if (ifs.is_open())
{
int limit;
if (ifs >> limit && limit > 0)
{
// consume through end of line.
ifs.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
// repeat until `limit` iterations or stream error/eof
std::string line;
for (int i = 0; i < limit && std::getline(ifs, line); i++)
{
std::istringstream iss(line);
// extract line values. Note these *can* be formatted
// extraction for things besides just strings
std::string first, second;
if (iss >> first >> second)
{
std::cout << "first: " << first << " second: " << second << '\n';
// TODO: whatever you want to do with first/second
}
}
}
ifs.close();
}
else
{
std::cerr << "Can't open file: " << fileName << '\n';
}
}
Note: The above code will NOT work for remaining-line-content as the expected second value. E.g. It will not process something like this as you may first expect:
10000 this is a multi-word description
will produce this:
first: 10000 second: this
which is considerably different than what you may be expecting:
first: 10000 second: this is a multi-word description
There was no suggestion in the original post that such support was mandatory, though it wouldn't be terribly difficult to add. If it is a requirement, I leave that task to you.
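For readers who do need the remaining-line behavior, here is one possible sketch. It assumes the first value never contains whitespace and that everything after it belongs to the second value. The function name parseKeyAndRest is hypothetical.

```cpp
#include <sstream>
#include <string>

// Parses "key rest of line" into the key and the remaining text.
// Returns false if the line has no key or nothing after the key.
bool parseKeyAndRest(const std::string& line, std::string& first, std::string& second) {
    std::istringstream iss(line);
    if (!(iss >> first)) return false;
    // std::ws skips the whitespace between the key and the remainder,
    // then getline takes everything up to the end of the line.
    return static_cast<bool>(std::getline(iss >> std::ws, second)) && !second.empty();
}
```

Inside the loop above, this would replace the iss >> first >> second extraction.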
I want to parse a file with the following content:
2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13
Here, 2 is the number of lines, 300 is the weight.
Each line has a string and a number(price). Same with 3 and 400.
I need to store 130, 456 in an array.
Currently, I am reading the file and each line is processed as std::string. I need help to progress further.
Code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
//void processString(string line);
void process2(string line);
int main(int argc, char ** argv) {
cout << "You have entered " << argc <<
" arguments:" << "\n";
for (int i = 1; i < argc; ++i)
cout << argv[i] << "\n";
//2, 4 are the file names
//Reading file - market price file
string line;
ifstream myfile(argv[2]);
if (myfile.is_open()) {
while (getline(myfile, line)) {
// cout << line << '\n';
}
myfile.close();
} else cout << "Unable to open market price file";
//Reading file - price list file
string line_;
ifstream myfile2(argv[4]);
int c = 1;
if (myfile2.is_open()) {
while (getline(myfile2, line_)) {
// processString(line_);
process2(line_);
}
myfile2.close();
} else cout << "Unable to open price lists file";
//processString(line_);
return 0;
}
void process2(string line) {
string word = "";
for (auto x: line) {
if (x == ' ') {
word += " ";
} else {
word = word + x;
}
}
cout << word << endl;
}
Is there a split function like in Java, so I can split and store everything as tokens?
You have 2 questions in your post:
How do I parse this file in cpp?
Is there a split function like in Java, so I can split and store everything as tokens?
I will answer both questions and show a demo example.
Let's start with splitting a string into tokens. There are several possibilities. We start with the easy ones.
Since the tokens in your string are delimited by whitespace, we can take advantage of the functionality of the extractor operator (>>). This will read data from an input stream up to the next whitespace and then convert the data read into the specified variable. As you know, this operation can be chained.
Then for the example string
const std::string line{ "Token1 Token2 Token3 Token4" };
you can simply put that into a std::istringstream and then extract the variables from the stream:
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
The disadvantage is that you need to write a lot of stuff and you have to know the number of elements in the string.
We can overcome this problem by using a vector as the target data store and filling it with its range constructor. The vector's range constructor takes a begin and an end iterator and copies the data into it.
As iterator we use std::istream_iterator. This will, in simple terms, call the extractor operator (>>) until all data is consumed, however many items there are.
This will then look like the below:
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
This may look complicated, but is not. We define a variable "token" of type std::vector. We use its range constructor.
And, we can define the std::vector without template argument. The compiler can deduce the argument from the given function parameters. This feature is called CTAD ("class template argument deduction", C++17 required).
Additionally, you can see that I do not use the "end()"-iterator explicitly.
This iterator will be constructed from the empty brace-enclosed default initializer with the correct type, because it will be deduced to be the same as the type of the first argument, as the std::vector constructor requires.
There is an additional solution. It is the most powerful solution and hence maybe a little bit too complicated in the beginning.
With that, you can avoid the usage of std::istringstream and directly convert the string into tokens using std::sregex_token_iterator. It is very simple to use, and the result is a one-liner for splitting the original string (re is a std::regex that describes the delimiter):
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
So, modern C++ has built-in functionality which is designed exactly for the purpose of tokenizing strings. It is called std::sregex_token_iterator. What is this thing?
As its name says, it is an iterator. It will iterate over a string (hence the 's' in its name) and return the split-up tokens. The tokens will be matched against a regular expression. Alternatively, the delimiter will be matched and the rest will be treated as tokens and returned. This is controlled via the last parameter of its constructor.
Let's have a look at this constructor:
token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
The first parameter is where it should start in the source string, and the second parameter is the end position up to which the iterator should work. The third parameter is the regex itself. The last parameter selects what is returned:
0, if you want each complete match of the regex (a value n > 0 returns the n-th capture group of each match)
-1, if you want everything that does not match the regex, i.e. the text between the matches
Please read about regexes on the net. There are tons of pages available.
Please see a demo for all 3 solutions here:
#include <iostream>
#include <string>
#include <vector>
#include <regex>
#include <sstream>
#include <iterator>
#include <algorithm>
/// Split string into tokens
int main() {
// White space separated tokens in a string
const std::string line{ "Token1 Token2 Token3 Token4" };
// Solution 1: Use extractor operator ----------------------------------
// Here, we will store the result
std::string subString1{}, subString2{}, subString3{}, subString4{};
// Put the line into an istringstream for easier extraction
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
// Show result
std::cout << "\nSolution 1: Use extractor operator\n- Data: -\n" << subString1 << "\n"
<< subString2 << "\n" << subString3 << "\n" << subString4 << "\n";
// Solution 2: Use istream_iterator ----------------------------------
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
// Show result
std::cout << "\nSolution 2: Use istream_iterator\n- Data: -\n";
std::copy(token.begin(), token.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
// Solution 3: Use std::sregex_token_iterator ----------------------------------
const std::regex re(" ");
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
// Show result
std::cout << "\nSolution 3: Use sregex_token_iterator\n- Data: -\n";
std::copy(token2.begin(), token2.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
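To make the effect of the last constructor argument more tangible, here is a small sketch that collects whatever the iterator yields for a given submatch index. The helper name tokens is made up for this illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects what std::sregex_token_iterator yields for the given submatch index:
//  -1 -> the text between the matches (i.e. splitting at the regex)
//   0 -> each complete match
//   n -> the n-th capture group of each match
std::vector<std::string> tokens(const std::string& text, const std::regex& re, int submatch) {
    return std::vector<std::string>(
        std::sregex_token_iterator(text.begin(), text.end(), re, submatch),
        std::sregex_token_iterator());
}
```

For example, with the text "a,b,c" and the regex ",", index -1 yields a, b and c, while index 0 yields the two commas. With the text "k1=v1 k2=v2" and the regex "(\\w+)=(\\w+)", index 1 yields k1 and k2.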
So, now the answer on how you could read your text file.
It is essential to create the correct data structures. Then, overload the inserter and extractor operators and put the above functionality in them.
Please see the below demo example. Of course there are many other possible solutions:
#include <string>
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
struct ItemAndPrice {
// Data
std::string item{};
unsigned int price{};
// Extractor
friend std::istream& operator >> (std::istream& is, ItemAndPrice& iap) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
// Read the item and price from that line and check, if that worked
if (std::istringstream iss(line); !(iss >> iap.item >> iap.price))
// There was an error, while reading item and price. Set failbit of input stream
is.setstate(std::ios::failbit);
}
return is;
}
// Inserter
friend std::ostream& operator << (std::ostream& os, const ItemAndPrice& iap) {
// Simple output of our internal data
return os << iap.item << " " << iap.price;
}
};
struct MarketPrice {
// Data
std::vector<ItemAndPrice> marketPriceData{};
size_t numberOfElements() const { return marketPriceData.size(); }
unsigned int weight{};
// Extractor
friend std::istream& operator >> (std::istream& is, MarketPrice& mp) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
size_t numberOfEntries{};
// Read the number of following entries and the weight from that line and check, if that worked
if (std::istringstream iss(line); (iss >> numberOfEntries >> mp.weight)) {
mp.marketPriceData.clear();
// Now copy the numberOfEntries next lines into our vector
std::copy_n(std::istream_iterator<ItemAndPrice>(is), numberOfEntries, std::back_inserter(mp.marketPriceData));
}
else {
// There was an error, while reading number of following entries and the weight. Set failbit of input stream
is.setstate(std::ios::failbit);
}
}
return is;
}
// Inserter
friend std::ostream& operator << (std::ostream& os, const MarketPrice& mp) {
// Simple output of our internal data
os << "\nNumber of Elements: " << mp.numberOfElements() << " Weight: " << mp.weight << "\n";
// Now copy all market price data to output stream
if (os) std::copy(mp.marketPriceData.begin(), mp.marketPriceData.end(), std::ostream_iterator<ItemAndPrice>(os, "\n"));
return os;
}
};
// For this example I do not use argv and argc and file streams.
// This is because I do not have files available on Stack Overflow.
// So, I put the file data in an istringstream. For the example below,
// there is no difference between a file stream and a string stream
std::istringstream sourceFile{R"(2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13)"};
int main() {
// Here we will store all the resulting data
// So, read the complete source file, parse the data and store result in vector
std::vector mp(std::istream_iterator<MarketPrice>(sourceFile), {});
// Now, all data are in mp. You may work with that now
// Show result on display
std::copy(mp.begin(), mp.end(), std::ostream_iterator<MarketPrice>(std::cout, "\n"));
return 0;
}
Let's say I want to input the hours, minutes and seconds from the first line of a file and store them in 3 different variables, hrs, mins and sec respectively.
I can't figure out an easy way to skip reading the colon character (":").
Input file example:
12:49:00
Store:
hrs = 12
mins = 49
sec = 00
You can use std::regex to match, range-check and validate your input all at once.
#include <iostream>
#include <regex>
#include <string>
int main()
{
const std::regex time_regex("(\\d|[01]\\d|2[0-3]):([0-5]\\d):([0-5]\\d)");
std::smatch time_match;
std::string line;
while (std::getline(std::cin, line))
{
if (std::regex_match(line, time_match, time_regex))
{
int hours = std::stoi(time_match[1]);
int minutes = std::stoi(time_match[2]);
int seconds = std::stoi(time_match[3]);
std::cout << "h=" << hours << " m=" << minutes << " s=" << seconds << std::endl;
}
else
{
std::cout << "Invalid time: " << line << std::endl;
}
}
return 0;
}
Breaking down the regular expression (\\d|[01]\\d|2[0-3]):([0-5]\\d):([0-5]\\d):
\d|[01]\d|2[0-3] matches the hour (24-hour time), which is one of:
\d : 0-9
[01]\d : 00-19
2[0-3] : 20-23
[0-5]\d matches the minutes: two digits 00-59
[0-5]\d matches the seconds: two digits 00-59, as above.
An alternative not using a temporary character for skipping the colon:
#include <iostream>
int main()
{
int h,m,s;
std::cin >> h;
std::cin.ignore(1) >> m;
std::cin.ignore(1) >> s;
std::cout << h << ':' << m << ':' << s << std::endl;
return 0;
}
This seems to work:
int h, m, s;
char c;
cin >> h >> c >> m >> c >> s;
You just skip the : symbol this way. I don't know whether it's a good solution.
With cin.ignore:
cin >> h;
cin.ignore(1);
cin >> m;
cin.ignore(1);
cin >> s;
There are already several good answers and one that has already been accepted; however, I would like to propose my solution not only as a valid answer to your problem but also with regard to good design practice. IMHO, when it involves reading information from a file and storing its contents in variables or data structures, I prefer to do it in a specific way. I like to separate the functionality and responsibility of specific operations into their own functions:
1: I first like to have a function to open a file, read the contents and to store the information into either a string, a stream or some large buffer. Once the appropriate amount of information is read from the file, then the function will close the file handle as we are done with it and then return back the results. There are several ways to do this yet they are all similar.
a: Read a single line from the file and return back a string or a stream.
b: Read in all information from the file line by line and store each line into its own string or stream and return back a vector of those strings or streams.
c: Read in all of the contents of the file into a single string, stream or large buffer and return that back.
2: After I have the contents of that file then I will typically call a function that will parse that data and these functions will vary depending on the type of content that needs to be parsed based on the data structures that will be used. Also, these parsing functions will call a function that will split the string into a vector of strings called tokens. After the split string function is called then the parsing of data will use the string manipulators-converters to convert a string to the required built in types that are needed for the current data structure that is in use and store them into the data structure that is passed in by reference.
3: There are two variations of my splitString function.
a: One takes a single character as a delimiter.
b: The other will take a string as its delimiter.
c: Both functions will return a vector of strings, based on the delimiter used.
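The string-delimiter variant (b) is not shown in this answer. A minimal sketch of how it could look, assuming it is written as an overload of splitString and uses std::string::find to locate the delimiter:

```cpp
#include <string>
#include <vector>

// Splits s at every occurrence of the delimiter string.
// An empty delimiter returns the whole input as a single token.
std::vector<std::string> splitString(const std::string& s, const std::string& delimiter) {
    std::vector<std::string> tokens;
    if (delimiter.empty()) {            // nothing to split on
        tokens.push_back(s);
        return tokens;
    }
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = s.find(delimiter, start)) != std::string::npos) {
        tokens.push_back(s.substr(start, pos - start));
        start = pos + delimiter.size();
    }
    tokens.push_back(s.substr(start));  // text after the last delimiter
    return tokens;
}
```

Note one design difference: this sketch returns an empty last token when the input ends with the delimiter, whereas a getline-based split drops it.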
Here is an example of my code using this text file for input.
time.txt
4:32:52
main.cpp
#include <vector>
#include <string>
#include <sstream>
#include <fstream>
#include <iostream>
#include <stdexcept> // std::runtime_error
#include <cstdlib> // EXIT_SUCCESS, EXIT_FAILURE
struct Time {
int hours;
int minutes;
int seconds;
};
std::vector<std::string> splitString( const std::string& s, char delimiter ) {
std::vector<std::string> tokens;
std::string token;
std::istringstream tokenStream( s );
while( std::getline( tokenStream, token, delimiter ) ) {
tokens.push_back( token );
}
return tokens;
}
std::string getLineFromFile( const char* filename ) {
std::ifstream file( filename );
if( !file ) {
std::stringstream stream;
stream << "failed to open file " << filename << '\n';
throw std::runtime_error( stream.str() );
}
std::string line;
std::getline( file, line );
file.close();
return line;
}
void parseLine( const std::string& fileContents, Time& time ) {
std::vector<std::string> output = splitString( fileContents, ':' );
// This is where you would want to do your sanity check to make sure
// that the contents from the file are valid inputs before converting
// them to the appropriate types and storing them into your data structure.
time.hours = std::stoi( output.at( 0 ) );
time.minutes = std::stoi( output.at( 1 ) );
time.seconds = std::stoi( output.at( 2 ) );
}
int main() {
try {
Time t;
std::string line = getLineFromFile( "time.txt" );
parseLine( line, t );
std::cout << "Hours: " << t.hours << '\n'
<< "Minutes: " << t.minutes << '\n'
<< "Seconds: " << t.seconds << "\n\n";
} catch( std::runtime_error& e ) {
std::cerr << e.what() << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
Output:
Hours: 4
Minutes: 32
Seconds: 52
Now, as you can see, in this particular situation the functions used here are designed only to read a single line from the file, namely the very first line. I have other functions in my library, not shown here, that will read each line of a file until there are no more lines to read, or read the whole file into a single buffer. I have another version of split string that will take a string as its delimiter instead of a single character. Finally, each parsing function will end up being unique, because it relies on the data structure that you are trying to use.
This allows the code to be readable, as each function does what it is supposed to do and nothing more. I prefer this design over trying to get information from a file and parsing it while the file is open. Too many things can go wrong while the file is open, and if the data is read wrongly or is corrupted, but only to the point where the compiler doesn't complain about it, then your variables or data structures may contain invalid information without you being aware of it. At least in this way you can open the file, get what you need from the file and store it into a string or a vector of strings, close the file when done reading and return back the contents. Then it becomes the parsing function's responsibility to test the data after it has been tokenized. Now, in the parsing function that I showed above I did not do any sanity check, to keep things simple, but that is where you would test your data to see if the information is valid before returning back your populated data structure.
If you are interested in another version of this where there are multiple lines being read in from the file, just comment a request and I will append it to this answer.
I am trying to read a database file (as txt) where I want to skip empty lines and skip the column header line within the file and store each record as an array. I would like to take stop_id and find the stop_name appropriately. i.e.
If I say give me stop 17, the program will get "Jackson & Kolmar".
The file format is as follows:
17,17,"Jackson & Kolmar","Jackson & Kolmar, Eastbound, Southeast Corner",41.87685748,-87.73934698,0,,1
18,18,"Jackson & Kilbourn","Jackson & Kilbourn, Eastbound, Southeast Corner",41.87688572,-87.73761421,0,,1
19,19,"Jackson & Kostner","Jackson & Kostner, Eastbound, Southeast Corner",41.87691497,-87.73515882,0,,1
So far I am able to get the stop_id values, but now I want to get the stop name values and I am fairly new to C++ string manipulation.
mycode.cpp
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
using namespace std;
int main()
{
string filename;
filename = "test.txt";
string line; // current line of the file
string data;
int count = 0; // number of lines read so far
ifstream infile(filename.c_str());
while(!infile.eof())
{
getline(infile,line);
int comma = line.find(",");
data = line.substr(0,comma);
cout << "Line " << count << " "<< "is "<< data << endl;
count++;
}
infile.close();
string sent = "i,am,the,champion";
return 0;
}
You can use string::find 3 times to search for the third occurrence of the comma, and you must store the positions of the last 2 occurrences found in line, then use them as input data with string::substr and get the searched text:
std::string line ("17,17,\"Jackson & Kolmar\",\"Jackson & Kolmar, Eastbound, Southeast Corner\",41.87685748,-87.73934698,0,,1");
std::size_t found=0, foundBack;
int i;
for(i=0;i<3 && found!=std::string::npos;i++){
foundBack = found;
found=line.find(",",found+1);
}
std::cout << line.substr(foundBack+1,found-foundBack-1) << std::endl;
You can read the whole line of the file into a string and then use a stringstream to give you each piece one at a time, up to and excluding the commas. Then you can fill up your arrays. I am assuming that you wanted each line in its own array and that you wanted an unlimited number of arrays. The best way to do that is to have a vector of vectors.
std::string Line;
std::vector<std::vector<std::string>> Data;
while (std::getline(infile, Line))
{
std::stringstream ss;
ss << Line;
Data.push_back(std::vector<std::string>());
std::string Temp;
while (std::getline(ss, Temp, ','))
{
Data[Data.size() - 1].push_back(Temp);
}
}
This way you will have a vector full of vectors, each of which contains the strings of all the data in that line. To access the strings as numbers, you can use std::stoi(std::string), which converts a string to an integer.
I have a CSV file in the form of two columns: name, age
To read and store the info, I did this
struct person
{
string name;
int age;
};
person record[10];
ifstream read("....file.csv");
However, when I did
read >> record[0].name;
read.get();
read >> record[0].age;
read>>name gave me the whole line instead of just the name. How could I possibly avoid this problem so that I can read the integer into age?
Thank you!
You can first read the whole line with std::getline, then parse it via a std::istringstream (must #include <sstream>), like
std::string line;
while (std::getline(read, line)) // read whole line into line
{
std::istringstream iss(line); // string stream
std::getline(iss, record[0].name, ','); // read first part up to comma, ignore the comma
iss >> record[0].age; // read the second part
}
Below is a fully working general example that tokenizes a CSV file:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
// in your case you'll have a file
// std::ifstream ifile("input.txt");
std::stringstream ifile("User1, 21, 70\nUser2, 25,68");
std::string line; // we read the full line here
while (std::getline(ifile, line)) // read the current line
{
std::istringstream iss{line}; // construct a string stream from line
// read the tokens from current line separated by comma
std::vector<std::string> tokens; // here we store the tokens
std::string token; // current token
while (std::getline(iss, token, ','))
{
tokens.push_back(token); // add the token to the vector
}
// we can now process the tokens
// first display them
std::cout << "Tokenized line: ";
for (const auto& elem : tokens)
std::cout << "[" << elem << "]";
std::cout << std::endl;
// map the tokens into our variables, this applies to your scenario
std::string name = tokens[0]; // first is a string, no need for further processing
int age = std::stoi(tokens[1]); // second is an int, convert it
int height = std::stoi(tokens[2]); // same for third
std::cout << "Processed tokens: " << std::endl;
std::cout << "\t Name: " << name << std::endl;
std::cout << "\t Age: " << age << std::endl;
std::cout << "\t Height: " << height << std::endl;
}
}
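Applied to the name, age records from the question, the same idea could be sketched as follows. It stores every record instead of overwriting record[0], uses a std::vector instead of a fixed array of 10, and skips lines that do not parse. Those three choices are assumptions, as are the names Person and readPeople.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Person {
    std::string name;
    int age{};
};

// Reads "name,age" lines from the stream; lines that do not parse are skipped.
std::vector<Person> readPeople(std::istream& in) {
    std::vector<Person> people;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        Person p;
        // name: everything up to the comma, age: the integer after it
        if (std::getline(iss, p.name, ',') && (iss >> p.age)) {
            people.push_back(p);
        }
    }
    return people;
}
```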
read>>name gave me the whole line instead of just the name. How could I possibly avoid this problem so that I can read the integer into age?
read >> name will read everything into name until a white space is encountered.
If you have a comma separated line without white spaces, it makes sense that the entire line is read into name.
You can use std::getline to read the entire line to one string. Then use various methods of tokenizing a std::string.
Sample SO posts that address tokenizing a std::string:
How do I tokenize a string in C++?
c++ tokenize std string
Splitting a C++ std::string using tokens, e.g. ";"
You could maybe use stringstreams for that, but I wouldn't trust them, if I'm honest.
If I were you, I would write a small function that reads the whole line into a string and then searches for the separator character in the string. Everything in front of it is the first column and everything behind it is the second one. With the string operations provided by C++ you can move these parts into your variables (and convert them to the correct type if you need to).
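A minimal sketch of such a function, assuming exactly two columns and an integer in the second one. The name parseNameAge and the default separator are made up for this illustration and are not taken from the library mentioned below.

```cpp
#include <exception>
#include <string>

// Splits "name,age" at the first separator. Returns false if the separator
// is missing or the part behind it cannot be converted to an int.
bool parseNameAge(const std::string& line, std::string& name, int& age, char sep = ',') {
    const auto pos = line.find(sep);
    if (pos == std::string::npos) return false;   // no separator in this line
    name = line.substr(0, pos);                   // everything in front of it
    try {
        age = std::stoi(line.substr(pos + 1));    // everything behind it
    } catch (const std::exception&) {             // invalid_argument or out_of_range
        return false;
    }
    return true;
}
```

The line itself would be read with std::getline before calling this function.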
I wrote a small C++ Library for CSV parsing, maybe a look at it helps you. You can find it on GitHub.
EDIT:
In this Gist you can find the parsing function