c++, using get and >> for ifstream - c++

I have a text file that I am inputting data in from, but I can't seem to get it right.
Here are two lines from the text file as an example (these aren't real people don't worry):
Michael Davidson 153 Summer Avenue Evanston CO 80303
Ingrid Johnson 2075 Woodland Road Aurora IL 60507
Here is the code I have to load the text file and put the data into a struct. I am still new to C++ (obviously) and I'm having a hard time using get and >> together. The code I have below works fine until I get to the "state", and then something goes wrong. Thanks for the help!
//constants
const int FIRST_NAME_LEN = 11;
const int LAST_NAME_LEN = 13;
const int ADDRESS = 25;
const int CITY_NAME_LEN = 16;
const int STATE_LEN = 3;
//define struct data types
struct CustomerType {
char firstName[FIRST_NAME_LEN];
char lastName[LAST_NAME_LEN];
char streetAddress[ADDRESS];
char city[CITY_NAME_LEN];
char state[STATE_LEN];
int zipCode;
};
//prototype function
ifstream& getInfo(CustomerType& CT_Struct, ifstream& infile);
int main() {
//declare struct objects
CustomerType CT_Struct;
ifstream infile("PGM951_customers.txt");
if(!infile) {
cerr << "Could not open the input file." << endl;
exit(1); //terminates the program
}
//call the function
getInfo(CT_Struct, infile);
return 0;
}
ifstream& getInfo(CustomerType& CT_Struct, ifstream& infile) {
while(infile) {
infile.get(CT_Struct.firstName, sizeof(CT_Struct.firstName));
infile.get(CT_Struct.lastName, sizeof(CT_Struct.lastName));
infile.get(CT_Struct.streetAddress, sizeof(CT_Struct.streetAddress));
infile.get(CT_Struct.city, sizeof(CT_Struct.city));
infile.get(CT_Struct.state, sizeof(CT_Struct.state));
infile >> ws;
infile >> CT_Struct.zipCode;
cout << CT_Struct.firstName << " | " << CT_Struct.lastName << " | " << CT_Struct.streetAddress
<< " | " << CT_Struct.city << " | " << CT_Struct.state << " | " << CT_Struct.zipCode << endl;
}
return infile;
}
=== edit ===========
Reading in the state at 8 char was just me messing around and then I forgot to change it back...sorry.

The problem is istream::get() breaks for streetAddress which has spaces in it.
One way is to tokenize the input line first into say, a vector of strings and then depending on the number of tokens convert these to appropriate fields of your CustomerType:
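vector<string> tokenize(const string& line, char delim=' ') {
vector<string> tokens;
size_t spos = 0, epos = string::npos;
// search from spos each time, otherwise the same delimiter is found forever
while ((epos = line.find(delim, spos)) != string::npos) {
tokens.push_back(line.substr(spos, epos - spos));
spos = epos + 1; // continue after the delimiter
}
tokens.push_back(line.substr(spos)); // the last token has no trailing delimiter
return tokens;
}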
I'd rather write a stream extraction operator for CustomerType:
struct CustomerType {
friend istream& operator>>(istream& i, CustomerType& c);
string firstName, lastName, ...;
// ...
};
istream& operator>>(istream& i, CustomerType& c) {
i >> c.firstName >> c.lastName;
string s1, s2, s3;
i >> s1 >> s2 >> s3;
c.streetAddress = s1 + " " + s2 + " " + s3; // put back the spaces that >> skipped
i >> c.city >> c.state >> c.zipCode;
return i;
}

You're getting 8 characters for State, which includes all your zipcode, and is larger than your field.
It'd also be tempting to use the skipws manipulator:
infile >> skipws >> CT_Struct.firstName
>> CT_Struct.lastName
>> ... ;
(Update: that's what I get for doing that from memory. This is more closely approximating correct.)

If I were you I would start again from scratch. I would:
use std::strings instead of character arrays for your data
read a line at a time from the file using std::getline
parse the line up using a stringstream
avoid mixing formatted and unformatted input

My approach to this would be the following:
1) Read each line into a null terminated buffer.
2) Use a split() function that you're going to have to write. This function should take a string as its input and return a list. It should also take a separator. The separator in this case is ' '.
3) Iterate over the list carefully. (Are there never middle names? What about 1-word or 3-word street names?) Since many of these columns really vary in their number of words, and you have no separator other than whitespace, this may prove a fairly tough task. If you NEVER have middle names, you could assume the first two columns are first and last name. You know for sure what the last two are. Everything between them could be assigned to a single address field.
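The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: every record has exactly one first name and one last name, and the state and zip code are always the last two tokens. The names `Customer`, `split`, and `parseCustomer` are made up for the example, and `split` here breaks on any whitespace rather than taking a separator argument. Because nothing separates the street from the city, both end up in one `addressAndCity` field.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type. Street and city share one field because
// whitespace alone cannot tell where one ends and the other begins.
struct Customer {
    std::string firstName;
    std::string lastName;
    std::string addressAndCity;
    std::string state;
    int zipCode = 0;
};

// Split a line on whitespace; runs of spaces never produce empty tokens.
std::vector<std::string> split(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token) tokens.push_back(token);
    return tokens;
}

// Returns false if the line has too few tokens or the zip is not a number.
bool parseCustomer(const std::string& line, Customer& c) {
    const std::vector<std::string> t = split(line);
    if (t.size() < 5) return false;  // need first, last, address, state, zip
    c.firstName = t[0];
    c.lastName = t[1];
    c.state = t[t.size() - 2];
    std::istringstream zip(t.back());
    if (!(zip >> c.zipCode)) return false;
    // Everything between the names and the state goes into one field.
    c.addressAndCity.clear();
    for (std::size_t i = 2; i + 2 < t.size(); ++i) {
        if (!c.addressAndCity.empty()) c.addressAndCity += ' ';
        c.addressAndCity += t[i];
    }
    return true;
}
```

Calling `parseCustomer` once per line returned by std::getline would fill one `Customer` per record; a line with a middle name would still parse, but the fields would be shifted, which is the risk described above.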

Related

C++ Using getline() inside loop to read in CSV file

I'm trying to read in a CSV file that contains rows of 3 people/patients, where col 1 is userid, col 2 is fname, col 3 is lname, col 4 is insurance, and col 5 is version that looks something like below.
Edit: Apologies, I simply copy/pasted my CSV spreadsheet in here, so it didn't show the commas before. Wouldn't it look something more like below? John below also pointed out that there are no commas after the version, and this seemed to fix the issue! Thanks so much John! ( trying to figure out how I can accept your answer :) )
nm92,Nate,Matthews,Aetna,1
sc91,Steve,Combs,Cigna,2
ml94,Morgan,Lands,BCBS,3
I'm trying to use getline() inside of a loop to read everything in, and it works fine for the first iteration, but getline() seems to be causing it to skip a value on the next iterations. Any idea how I can solve this?
I'm also not sure why the output looks like below, because I'm not seeing where the lines w/ "sc91" and "ml94" are being printed in the code. This is what the output of the current code looks like.
userid is: nm92
fname is: Nate
lname is: Matthews
insurance is: Aetna
version is: 1
sc91
userid is: Steve
fname is: Combs
lname is: Cigna
insurance is: 2
ml94
version is: Morgan
userid is: Lands
fname is: BCBS
lname is: 3
insurance is:
version is:
I've done a ton of research on differences between getline() and the >> stream operator, but most of the getline() materials seem to revolve around getting input from cin rather than reading from a file like here, so I'm thinking there's something going on w/ getline() and how it's reading the file that I'm not understanding. Unfortunately when I tried >> operator, that forces me to use the strtok() function, and I was struggling a lot with c strings and assigning them to an array of C++ strings.
#include <iostream>
#include <string> // for strings
#include <cstring> // for strtok()
#include <fstream> // for file streams
using namespace std;
struct enrollee
{
string userid = "";
string fname = "";
string lname = "";
string insurance = "";
string version = "";
};
int main()
{
const int ENROLL_SIZE = 1000; // used const instead of #define since the performance diff is negligible,
const int numCols = 5; // while const allows for greater utility/debugging bc it is known to the compiler ,
// while #define is a preprocessor directive
ifstream inputFile; // create input file stream for reading only
struct enrollee enrollArray[ENROLL_SIZE]; // array of structs to store each enrollee and their respective data
int arrayPos = 0;
// open the input file to read
inputFile.open("input.csv");
// read the file until we reach the end
while(!inputFile.eof())
{
//string inputBuffer; // buffer to store input, which will hold an entire excel row w/ cells delimited by commas
// must be a c string since strtok() only takes c string as input
string tokensArray[numCols];
string userid = "";
string fname = "";
string lname = "";
string insurance = "";
string sversion = "";
//int version = -1;
//getline(inputFile,inputBuffer,',');
//cout << inputBuffer << endl;
getline(inputFile,userid,',');
getline(inputFile,fname,',');
getline(inputFile,lname,',');
getline(inputFile,insurance,',');
getline(inputFile,sversion,',');
enrollArray[0].userid = userid;
enrollArray[0].fname = fname;
enrollArray[0].lname = lname;
enrollArray[0].insurance = insurance;
enrollArray[0].version = sversion;
cout << "userid is: " << enrollArray[0].userid << endl;
cout << "fname is: " << enrollArray[0].fname << endl;
cout << "lname is: " << enrollArray[0].lname << endl;
cout << "insurance is: " << enrollArray[0].insurance << endl;
cout << "version is: " << enrollArray[0].version << endl;
}
}
Your problem is that there is no comma after the final data item in each line, so
getline(inputFile,sversion,',');
is incorrect because it reads to the next comma, which is actually on the next line after the user id of the next patient. This explains the output you see, where the user id of the next patient gets output with the version.
To fix this simply replace the code above with
getline(inputFile,sversion);
which will read to the end of line as required.
Regarding your function: if you look at the structure of the source file, you will see that each line contains 5 strings separated by ",". So it is a typical CSV file.
A call to std::getline without a delimiter reads a complete line with all 5 strings. In your code you call std::getline once per string, each time reading up to a comma. There is no comma after the last string, so that will not work. You should use getline to read a complete line instead.
You need to read the whole line and then tokenize it.
I will show you an example of how to do that with std::sregex_token_iterator. That is very simple. Additionally, we will overload the inserter and extractor operators. With that, you can easily read and write "enrollee" data, like Enrollee e{}; std::cout << e;
Additionally I use C++ algorithms. That makes life very easy. Input and output are one-liners in main.
Please see:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
#include <regex>
struct Enrollee
{
// Data
std::string userid{};
std::string fname{};
std::string lname{};
std::string insurance{};
std::string version{};
// Overload Extractor Operator to read data from somewhere
friend std::istream& operator >> (std::istream &is, Enrollee& e) {
std::vector<std::string> wordsInLine{}; // Here we will store all words that we read in one line
std::string wholeLine; // Temporary storage for the complete line that we will get by getline
std::regex separator("[ \\;\\,]"); // Separator for a CSV file
std::getline(is, wholeLine); // Read one complete line and split it into parts
std::copy(std::sregex_token_iterator(wholeLine.begin(), wholeLine.end(), separator, -1), std::sregex_token_iterator(), std::back_inserter(wordsInLine));
// If we have read all expected strings, then store them in our struct
if (wordsInLine.size() == 5) {
e.userid = wordsInLine[0];
e.fname = wordsInLine[1];
e.lname = wordsInLine[2];
e.insurance = wordsInLine[3];
e.version = wordsInLine[4];
}
return is;
}
// Overload Inserter operator. Insert data into output stream
friend std::ostream& operator << (std::ostream& os, const Enrollee& e) {
return os << "userid is: " << e.userid << "\nfname is: " << e.fname << "\nlname is: " << e.lname << "\ninsurance is: " << e.insurance << "\nversion is: " << e.version << '\n';
}
};
int main()
{
// Here we will store all Enrollee data in a dynamically growing vector
std::vector<Enrollee> enrollmentData{};
// Define inputFileStream and open the csv
std::ifstream inputFileStream("r:\\input.csv");
// If we could open the file
if (inputFileStream) {
// Then read all csv data
std::copy(std::istream_iterator<Enrollee>(inputFileStream), std::istream_iterator<Enrollee>(), std::back_inserter(enrollmentData));
// For Debug Purposes: Print all data to cout
std::copy(enrollmentData.begin(), enrollmentData.end(), std::ostream_iterator<Enrollee>(std::cout, "\n"));
}
else {
std::cerr << "Could not open file 'input.csv'\n";
}
}
This will read the input file "input.csv" containing
nm92,Nate,Matthews,Aetna,1
sc91,Steve,Combs,Cigna,2
ml94,Morgan,Lands,BCBS,3
And show as output:
userid is: nm92
fname is: Nate
lname is: Matthews
insurance is: Aetna
version is: 1
userid is: sc91
fname is: Steve
lname is: Combs
insurance is: Cigna
version is: 2
userid is: ml94
fname is: Morgan
lname is: Lands
insurance is: BCBS
version is: 3
That is only an idea, but it could help you. It's a piece of code of one project I am working on:
std::vector<std::string> ARDatabase::split(const std::string& line, char delimiter)
{
std::vector<std::string> tokens;
std::string token;
std::istringstream tokenStream(line);
while (std::getline(tokenStream, token, delimiter))
{
tokens.push_back(token);
}
return tokens;
}
void ARDatabase::read_csv_map(std::string root_csv_map)
{
qDebug() << "Starting to read the people database...";
std::ifstream file(root_csv_map);
std::string str;
while (std::getline(file, str))
{
std::vector<std::string> tokens = split(str, ' ');
std::vector<std::string> splitnames = split(tokens.at(1), '_');
std::string name_w_spaces;
for(auto i: splitnames) name_w_spaces = name_w_spaces + i + " ";
people_names.insert(std::make_pair(stoi(tokens.at(0)), name_w_spaces));
people_images.insert(std::make_pair(stoi(tokens.at(0)), std::string("database/images/" + tokens.at(2))));
}
}
Instead of std::vector, you might want to use another container more suitable for your case. The last example is written for the input format of my case; you can easily adapt it to your code.

Read in from file into structure

I want to read in from txt file into structure using fstream.
I save the data to the file in the way shown below:
To read the data I tried some cheeky stuff with getline or tabsin >>.
struct tab{
int type,use;
string name, brand;
};
tab tabs[500];
ofstream tabsout;
tabsout.open("tab.txt", ios::out);
for (int i = 0; i < 500; i++){
if (tabs[i].use==1){
tabsout << tabs[i].type << " " << tabs[i].name << " " << tabs[i].brand << "\n";
}
}
tabsout.close();
//input part that fails me :(
int i=0;
ifstream tabsin;
tabsin.open("tab.txt", ios::in);
if (tabsin.is_open()){
while(tabsin.eof() == false)
{
tabsin >> tabs[i].type>>tabs[i].name>>tabs[i].brand;
i++
}
tabsin.close();
You usually want to overload operator>> and operator<< for the class/struct, and put the reading/writing code there:
struct tab{
int type,use;
string name, brand;
friend std::istream &operator>>(std::istream &is, tab &t) {
return is >> t.type >> t.name >> t.brand;
}
friend std::ostream &operator<<(std::ostream &os, tab const &t) {
return os << t.type << " " << t.name << " " << t.brand;
}
};
Then you can read in a file of objects like:
std::ifstream tabsin("tab.txt");
std::vector<tab> tabs{std::istream_iterator<tab>(tabsin),
std::istream_iterator<tab>()};
....and write out the objects like:
for (auto const &t : tabs)
tabsout << t << "\n";
Note that (like any sane C++ programmer) I've used a vector instead of an array, to (among other things) allow storing an arbitrary number of items, and automatically track how many are actually being stored.
For starters, do not use .eof() to control your loop: it doesn't work. Instead, use the stream's state after reading:
int type;
std::string name, brand;
while (in >> type >> name >> brand) {
tabs.push_back(tab(type, name, brand)); // assumes tab has a matching constructor and tabs is a std::vector<tab>
}
If your name or brand contains spaces, the above won't work and you will need to write a format where you can know when to stop and read correspondingly, e.g., using std::getline().
You might also consider wrapping the logic to read or write an object by suitable operators.
istream& getline (istream& is, string& str, char delim);
Take a look at the third parameter, you can use std::getline to parse your line. But that is definitely not the best way to serialize objects. Instead of using a text file, you should use a byte stream.

How to extract specific data from text file containing whitespace and newlines?

I would like to extract and analyze data from a large text file. The data contains floats, integers and words.
The way I thought of doing this is to extract a complete line (up to newline) using std::getline(). Then extract individual data from the line extracted before (extract until whitespace, then repeat).
So far I have this:
int main( )
{
std::ifstream myfile;
myfile.open( "example.txt", std::ios::in );
if( !(myfile.is_open()) )
{ std::cout << "Error Opening File";
std::exit(0); }
std::string firstline;
while( myfile.good() )
{
std::getline( myfile, firstline);
std::cout<< "\n" << firstline <<"\n";
}
myfile.close();
return 0;
}
I have several problems:
1) How do I extract up to a whitespace?
2) What would be the best method of storing the data? There are about 7-9 data types, and the data file is large.
EDIT: An example of the file would be:
Result Time Current Path Requirements
PASS 04:31:05 14.3 Super_Duper_capacitor_413 -39.23
FAIL 04:31:45 13.2 Super_Duper_capacitor_413 -45.23
...
Ultimately I would like to analyze the data, but so far I'm more concerned about proper input/reading.
You can use std::stringstream to parse the data and let it worry about skipping the whitespace. Since each element in the input line appears to require additional processing, just parse them into local variables and, after all post-processing is done, store the final results into a data structure.
#include <sstream>
#include <iomanip>
std::stringstream templine(firstline);
std::string passfail;
float floatvalue1;
std::string timestr;
std::string namestr;
float floatvalue2;
// split to two lines for readability
templine >> std::skipws; // no need to worry about whitespaces
templine >> passfail >> timestr >> floatvalue1 >> namestr >> floatvalue2;
If you do not need or want to validate that the data is in the correct format you can parse the lines directly into a data structure.
struct LineData
{
std::string passfail;
float floatvalue1;
int hour;
int minute;
int seconds;
std::string namestr;
float floatvalue2;
};
LineData a;
char sep;
// parse the pass/fail
templine >> a.passfail;
// parse time value
templine >> a.hour >> sep >> a.minute >> sep >> a.seconds;
// parse the rest of the data
templine >> a.floatvalue1 >> a.namestr >> a.floatvalue2;
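A slightly fuller sketch of parsing a line directly into a structure, with a minimal check that every field was actually read. The member names `current`, `path`, and `requirements` are guesses based on the header row in the question, and `parseLine` is a hypothetical helper, not part of any library.

```cpp
#include <sstream>
#include <string>

struct LineData {
    std::string passfail;
    int hour = 0, minute = 0, seconds = 0;
    float current = 0.0f;
    std::string path;
    float requirements = 0.0f;
};

// Parses a line such as
//   PASS 04:31:05 14.3 Super_Duper_capacitor_413 -39.23
// Returns false if any field is missing or malformed.
bool parseLine(const std::string& line, LineData& d) {
    std::istringstream in(line);
    char sep1 = 0, sep2 = 0;
    if (!(in >> d.passfail >> d.hour >> sep1 >> d.minute >> sep2 >> d.seconds
             >> d.current >> d.path >> d.requirements)) {
        return false;  // some extraction failed, e.g. on the header row
    }
    return sep1 == ':' && sep2 == ':';  // the time must look like hh:mm:ss
}
```

Calling `parseLine` on each line returned by std::getline and pushing the successful results into a std::vector<LineData> skips the header row automatically, because "Time" cannot be read as an int.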
For the first question, you can do this:
while( std::getline( myfile, firstline) ) // stop as soon as a read fails
{
std::cout<< "\n" << firstline <<"\n";
std::stringstream ss(firstline);
std::string word;
while (std::getline(ss,word,' '))
{
std::cout << "Word: " << word << std::endl;
}
}
As for the second question, can you give us more precision about the data types and what is it you want to do with the data once stored?

C++ file reading

I have a file that has a number in which is the number of names that follow. For example:
4
bob
jim
bar
ted
I'm trying to write a program to read these names.
void process_file(ifstream& in, ofstream& out)
{
string i,o;
int tmp1,sp;
char tmp2;
prompt_user(i,o);
in.open (i.c_str());
if (in.fail())
{
cout << "Error opening " << i << endl;
exit(1);
}
out.open(o.c_str());
in >> tmp1;
sp=tmp1;
do
{
in.get(tmp2);
} while (tmp2 != '\n');
in.close();
out.close();
cout<< sp;
}
So far I am able to read the first line and assign int to sp
I need sp to be a counter for how many names. How do I get this to read the names.
The only problem I have left is how to get the names while ignoring the first number.
Until then i cannot implement my loop.
while (in >> tmp1)
sp=tmp1;
This successfully reads the first int from the file and then tries to continue. Since the second line is not an int, extraction fails, so it stops looping. So far so good.
However, the stream is now in fail state, and all subsequent extractions will fail unless you clear the error flags.
Say in.clear() right after the first while loop.
I don't really see why you wrote a loop to extract a single integer, though. You could just write
if (!(in >> sp)) { /* error, no int */ }
To read the names, read in strings. A loop is fine this time:
std::vector<std::string> names;
std::string temp;
while (in >> temp) names.push_back(temp);
You might want to add a counter somewhere to make sure that the number of names matches the number you've read from the file.
int lines = 0;
string line;
inputfile.open("names.txt");
inputfile >> lines >> ws; // read the count, then skip the newline after it
for(int i=0; i< lines; ++i){
if (std::getline(inputfile, line)){
cout << line << std::endl;
}
}
First of all, assuming that the first loop:
while (in >> tmp1)
sp=tmp1;
Is meant to read the number in the beginning, this code should do:
in >> tmp1;
According to the manual, operator>> returns:
The istream object (*this).
The extracted value or sequence is not returned, but directly stored
in the variable passed as argument.
So a while condition tests the stream, not the value, and keeps extracting until a read fails. To read a single number, rather use:
in >> tmp1;
if( tmp1 < 1){
exit(5);
}
Second, NEVER rely on assumption that the file is correctly formatted:
do {
in.get(tmp2);
cout << tmp2 << endl;
} while ( (tmp2 != '\n') && !in.eof());
Although whole algorithm seems a bit clumsy to me, this should prevent infinite loop.
Here's a simple example of how to read a specified number of words from a text file in the way you want.
#include <string>
#include <iostream>
#include <fstream>
void process_file() {
// Get file name.
std::string fileName;
std::cin >> fileName;
// Open file for read access.
std::ifstream input(fileName);
// Check if file exists.
if (!input) {
std::cerr << "Could not open " << fileName << '\n';
return; // a void function cannot return EXIT_FAILURE
}
// Get number of names.
int count = 0;
input >> count;
// Get names and print to cout.
std::string token;
for (int i = 0; i < count; ++i) {
input >> token;
std::cout << token << '\n';
}
}

input from txt file to arrays

I have a text file with a line like:
James Dean 10 Automotive 27010.43
and I need to read that file and put each of the 4 above into arrays.
char nameArray[MAX][NAME_MAX];
int yearArray[MAX];
char departmentArray[MAX][DEP_MAX];
double payArray[MAX];
while(i < MAX && infile) {
infile.getline(nameArray[i], 20);
infile >> yearArray[i];
infile.getline(departmentArray[i], 15);
infile >> payArray[i];
cout << nameArray[i] << " " << yearArray[i] << " " << departmentArray[i] << " " << fixed << setprecision(2) << payArray[i] << endl;
i++;
}
The code is cut down just to give you an idea of what I am trying to do, but when I run this, I get something like:
James Dean -858993460 -92559631349317830000000000000000000000000000
000000000000000000.00
Thanks for the help.
==== Edit ==========================================
I changed from getline to get, thanks for that. I have to use get and not >> because some of the lines I am reading in are more than just "James Dean", they are up to 20 char long...ex: "William K. Woodward" is another one.
So, if I just use get, then it reads the first line in fine, but then I get the same messed up text for the second line.
Here is the code:
infile.get(nameArray[i], 20);
infile >> yearArray[i];
infile.get(departmentArray[i], 15);
infile >> payArray[i];
The getline function takes an input stream and a string to write to. So, two getline calls read in two lines. Your input mechanism is broken. Either use getline or the stream extraction operator (i.e. >>), but not both.
If you plan to use getline, you need to parse the string (which is effectively one line of input) into tokens, and then store them in appropriately typed arrays. In the sample line the third and fifth tokens are numbers, hence you will need to convert these from string to int or double.
The operator >> approach:
string name, surname, department;
int year;
double pay;
// the extraction itself is the loop condition, so a failed read ends the loop
while (infile >> name >> surname >> year >> department >> pay) {
namearray[ i ] = name + " " + surname;
// ...
payarray[ i ] = pay;
++i;
}
The getline approach:
string line;
while (getline(infile, line)) {
parse(line, tokens);
namearray[ i ] = tokens[ 0 ] + " " + tokens[ 1 ];
// ...
payarray[ i ] = strToDouble(tokens[ 4 ]);
++i;
}
// parse definition
void parse(string line, vector<string>& token) {
// roll your own
}
double strToDouble(string s) {
// ...
}
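One way the two helpers left open above could be filled in is sketched below. It assumes whitespace-separated tokens and a two-word name, so a line such as "William K. Woodward ..." would shift the token indexes and needs extra handling.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a line into whitespace-separated tokens.
void parse(const std::string& line, std::vector<std::string>& tokens) {
    tokens.clear();
    std::istringstream in(line);
    std::string token;
    while (in >> token) tokens.push_back(token);
}

// Convert a token to double. std::stod throws std::invalid_argument
// if the token does not start with a number.
double strToDouble(const std::string& s) {
    return std::stod(s);
}
```

For the sample line, `parse` yields five tokens, and `strToDouble(tokens[4])` gives the pay.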
I don't see where you define infile, but I will assume that it is an ifstream. In that case you can use it the same way you use cin to get input.
Why do you call getline()?
That function stops only at a '\n' character, at end of file, or when the buffer is full, and in the last case it sets failbit so every later read fails. Either way the int is not read from where you expect, and you end up printing random data.
Correct me if i'm wrong, but are there 20 or 19 characters in that first string (James Dean) before the number (10) ?