Remove entire rows with missing values in C++

I am reading data with several variables using the code below. Currently, when the program encounters a missing value (represented in the data by the string "NA"), it changes it to zero. Instead, I would like to remove the entire row whenever the program encounters "NA". I have looked for the same question, but the answers are all for R, not C++. Any advice would be appreciated. Thanks.
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
using namespace std;
struct Data {
vector<double> cow_id;
vector<double> age_obs;
vector<double> dim_obs;
vector<double> my_obs;
vector<double> mcf_obs;
vector<double> mcp_obs;
vector<double> mcl_obs;
vector<double> bw_obs;
vector<double> bcs_obs;
double get_number (string value)
{
if (value == "NA")
{return 0.0;}
else
{
istringstream iss (value);
double val;
iss>>val;
return val;
}
}
void read_input (const string filepath)
{
ifstream data_in (filepath.c_str());
if (!data_in)
{cout<<"Failed to open"<<endl;}
else
{
// Read tokens as strings.
string id, age, dim, my, mcf, mcp, mcl, bw, bcs;
string dummy_line;
getline(data_in, dummy_line);
string line;
while (data_in >> id >> age >> dim >> my >> mcf >> mcp >> mcl >> bw >> bcs)
{
// Get the number from the string and add to the vectors.
cow_id.push_back(get_number(id));
age_obs.push_back(get_number(age));
dim_obs.push_back(get_number(dim));
my_obs.push_back(get_number(my));
mcf_obs.push_back(get_number(mcf));
mcp_obs.push_back(get_number(mcp));
mcl_obs.push_back(get_number(mcl));
bw_obs.push_back(get_number(bw));
bcs_obs.push_back(get_number(bcs));
}
data_in.close();
}
size_t size=age_obs.size();
for (size_t i=0; i<size; i++)
{
cout<<cow_id[i]<<'\t'<<age_obs[i]<<'\t'<<dim_obs[i]<<'\t'<<my_obs[i] <<'\t'<<mcf_obs[i]<<'\t'<<mcp_obs[i]<<'\t'<<mcl_obs[i]<<'\t'<<bw_obs[i] <<'\t'<<bcs_obs[i]<<endl;
}
}
};
int main()
{
Data input;
input.read_input("C:\\Data\\C++\\learncpp\\data.txt");
}

Let's talk tables here.
Tables are containers of records (rows). The data you are capturing from your input file is already organized into records. So the obvious model is to use a structure that matches your file's data records.
struct Record
{
unsigned int cow_id;
unsigned int age_obs;
unsigned int dim_obs;
// ...
};
Your table could be represented as:
std::vector<Record> my_table;
So to remove a record from the table, you can use the std::vector::erase() method. Easy. Also, you can use the std::find() function to search the table.
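As a rough sketch of the erase approach: the block below assumes the fields are stored as double and that a missing value was recorded as NaN while reading. Both are my assumptions for illustration (the Record above uses unsigned int), and has_missing and drop_incomplete_rows are hypothetical helper names.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical record layout: doubles, so a missing value can be stored as NaN.
struct Record
{
    double cow_id;
    double age_obs;
    double dim_obs;
};

// True when any field of the record is missing (NaN).
bool has_missing(const Record& r)
{
    return std::isnan(r.cow_id) || std::isnan(r.age_obs) || std::isnan(r.dim_obs);
}

// Erase-remove idiom: remove_if moves the rows to keep to the front,
// then erase() drops the leftover tail in a single call.
void drop_incomplete_rows(std::vector<Record>& table)
{
    table.erase(std::remove_if(table.begin(), table.end(), has_missing),
                table.end());
}
```

Calling drop_incomplete_rows(my_table) once after loading is usually cheaper than erasing rows one at a time, because each single erase() shifts every later element.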
Let's relieve some reader's headaches with your present code by introducing a concept of the record loading its members from the file.
Reading a record from a file is best performed by overloading the stream extraction operator>>:
struct Record
{
//...
friend std::istream& operator>>(std::istream& input, Record& r);
};
std::istream&
operator>>(std::istream& input, Record& r)
{
std::string record_text;
std::getline(input, record_text);
// Extract a field from the record text and check for NA,
// Assign fields of r to those values:
r.cow_id = value;
// Etc.
return input;
}
With the overloaded operator, your input looks like:
Record r;
while (input_file >> r)
{
my_table.push_back(r);
}
Elegant and simple (reducing injection of defects).
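To skip rows with "NA" while reading, one option is sketched below. It uses plain functions rather than operator>>, because an operator>> that fails on an "NA" row would also end a while (input_file >> r) loop at that row. The three-column Record and the names parse_field, parse_record and load_table are assumptions for illustration; the real file has nine columns.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Shortened, hypothetical record: only three of the nine columns.
struct Record
{
    double cow_id;
    double age_obs;
    double dim_obs;
};

// Parse one field; returns false for "NA", for a missing token,
// or for a token that does not start with a number.
bool parse_field(std::istream& fields, double& out)
{
    std::string token;
    if (!(fields >> token) || token == "NA")
        return false;
    std::istringstream number(token);
    return static_cast<bool>(number >> out);
}

// Parse a whole line into r; returns false if any field is missing.
bool parse_record(const std::string& line, Record& r)
{
    std::istringstream fields(line);
    return parse_field(fields, r.cow_id)
        && parse_field(fields, r.age_obs)
        && parse_field(fields, r.dim_obs);
}

// Read every complete row; rows containing "NA" are skipped entirely.
std::vector<Record> load_table(std::istream& input)
{
    std::vector<Record> table;
    std::string line;
    std::getline(input, line);          // discard the header line
    while (std::getline(input, line))
    {
        Record r;
        if (parse_record(line, r))
            table.push_back(r);
    }
    return table;
}
```

Because each row is read as a whole line first, a bad row never leaves the file stream in a failed state, so reading simply continues with the next line.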

Related

How to use getline() with multiple variables in the same line?

I'm trying to read each line from a file and store the data in each line. Say the line is "x y z". What arguments should the getline function use in order to read and store x, y and z individually?
void readData(Gene *data, int num)
{
int codeNum;
int i = 0;
int k = num;
ifstream inputFile;
inputFile.open("example.data");
inputFile >> codeNum;
while(i < k){
getline(inputFile, data[i].geneCode, data[i].MutCode[0],
data[i].MutCost[0], data[i].MutCode[1],
data[i].MutCost[1]);
i++;
}
This is what I have. Note that all the variables I'm trying to read are strings, and that k is the total number of lines. When I try to compile, I get an error saying "no matching function to call to getline()" and something about "candidate function template not viable". Any idea what I'm doing wrong?
I highly recommend you use a vector of structures (or classes) rather than multiple, parallel arrays.
struct Mutation_Code_Cost
{
Mutation_Code_Type MutCode;
Mutation_Cost_Type MutCost;
};
struct Gene
{
Gene_Code_Type geneCode;
Mutation_Code_Cost mutation_info[2];
};
You can then overload operator>> to read in the structures from a text stream:
struct Mutation_Code_Cost
{
Mutation_Code_Type MutCode;
Mutation_Cost_Type MutCost;
friend std::istream& operator>>(std::istream& input, Mutation_Code_Cost& mcc);
};
std::istream& operator>>(std::istream& input, Mutation_Code_Cost& mcc)
{
input >> mcc.MutCode;
input >> mcc.MutCost;
return input;
}
struct Gene
{
Gene_Code_Type geneCode;
Mutation_Code_Cost mutation_info[2];
friend std::istream& operator>>(std::istream& input, Gene& g);
};
std::istream& operator>>(std::istream& input, Gene& g)
{
input >> g.geneCode;
input >> g.mutation_info[0];
input >> g.mutation_info[1];
return input;
}
You can then read from the file like so:
std::vector<Gene> database;
Gene g;
std::string record;
while (std::getline(input_file, record))
{
std::istringstream record_stream(record);
if (record_stream >> g)
{
database.push_back(g);
}
}
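Put together, the pieces above might look like the following sketch. It assumes every field is a std::string (the question says they are all strings), since Gene_Code_Type and the other type names are not defined in the answer; read_genes is a hypothetical wrapper name. The members are public here, so the friend declarations are not needed.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumption: every field is a plain string, as the question states.
struct Mutation_Code_Cost
{
    std::string MutCode;
    std::string MutCost;
};

std::istream& operator>>(std::istream& input, Mutation_Code_Cost& mcc)
{
    return input >> mcc.MutCode >> mcc.MutCost;
}

struct Gene
{
    std::string geneCode;
    Mutation_Code_Cost mutation_info[2];
};

std::istream& operator>>(std::istream& input, Gene& g)
{
    return input >> g.geneCode >> g.mutation_info[0] >> g.mutation_info[1];
}

// Read one Gene per line; lines with too few fields are skipped.
std::vector<Gene> read_genes(std::istream& input_file)
{
    std::vector<Gene> database;
    std::string record;
    while (std::getline(input_file, record))
    {
        std::istringstream record_stream(record);
        Gene g;
        if (record_stream >> g)
            database.push_back(g);
    }
    return database;
}
```

Parsing each line through its own istringstream keeps a short or malformed line from consuming fields that belong to the next line.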

Find number of items in .txt file

I'm looking for a way to find the number of items in a .txt file.
The file structure is as follows:
students.txt pricem 1441912123
house.pdf jatkins 1442000124
users.txt kevin_tomlinson 1442001032
accounts.mdb kevin_tomlinson 1442210121
vacation.jpg smitty83 1442300125
calendar.cpp burtons 1442588012
The result should be 18 in this example since there are 18 separate "words" in this file.
I need that value so I can iterate through the items and assign them to an array of structures (maybe there's a way to accomplish both of these steps together?):
// my structure
struct AccessRecord
{
string filename;
string username;
long timestamp;
};
// new instance of AccessRecord
// max possible records: 500
AccessRecord logRecords[500];
// while file has content
while (!fin.eof())
{
// loop through file until end
// max possible records: 500
for (int i = 0; i < 500; i++) // need to figure out how to iterate
{
fin >> logRecords[i].filename
>> logRecords[i].username
>> logRecords[i].timestamp;
}
}
Which will then be written to the screen.
So the question is, how do I find the count? Or is there a better way?
You know that each line contains a string, a string and a long, so you can iterate with:
std::vector<AccessRecord> logs;
std::string fname, uname;
long tstamp;
while(fin >> fname >> uname >> tstamp) {
logs.push_back(AccessRecord(fname, uname, tstamp));
//To avoid copies, use: (thanks @Rakete1111!)
//logs.emplace_back(std::move(fname), std::move(uname), tstamp);
}
This is assuming you've created a constructor for your struct like:
AccessRecord(std::string f, std::string u, long t)
: filename(f), username(u), timestamp(t) { }
Notice that I'm using an std::vector here instead of an array so that we don't even have to worry about the number of items, since the vector will resize itself dynamically!
You should overload operator>> for your structure:
struct AccessRecord
{
string filename;
string username;
long timestamp;
friend std::istream& operator>>(std::istream& input, AccessRecord& ar);
};
std::istream& operator>>(std::istream& input, AccessRecord& ar)
{
input >> ar.filename;
input >> ar.username;
input >> ar.timestamp;
return input;
}
This allows you to simplify your input function:
AccessRecord ar;
std::vector<AccessRecord> logs;
//...
while (fin >> ar)
{
logs.push_back(ar);
}
Usually, if you are accessing an object's data members directly from outside the class or structure, something is wrong. Search the internet for "data hiding", "C++ encapsulation" and "C++ loose coupling".
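As for the count itself, here is a minimal sketch, assuming every record has exactly three whitespace-separated fields as in the sample file. The helpers read_log and word_count are hypothetical names, not part of the original code.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct AccessRecord
{
    std::string filename;
    std::string username;
    long timestamp;
};

std::istream& operator>>(std::istream& input, AccessRecord& ar)
{
    return input >> ar.filename >> ar.username >> ar.timestamp;
}

// Read records until the stream ends or a record fails to parse.
std::vector<AccessRecord> read_log(std::istream& input)
{
    std::vector<AccessRecord> logs;
    AccessRecord ar;
    while (input >> ar)
        logs.push_back(ar);
    return logs;
}

// Each record holds three fields, so the "word" count is three per record.
std::size_t word_count(const std::vector<AccessRecord>& logs)
{
    return logs.size() * 3;
}
```

In practice logs.size() is the number you iterate over, so the separate word count is rarely needed once the records are in a vector.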

How to read pieces of string into a class array C++

I have an array of dvd from a Video class I created
Video dvd[10];
each video has the property,
class Video {
string _title;
string _genre;
int _available;
int _holds;
public:
Video(string title, string genre, int available, int holds);
Video();
void print();
void read(istream & is, Video dvd);
int holds();
void restock(int num);
string getTitle();
~Video();
};
I'm trying to fill up this array with data from my text file where each info such as the title and genre is separated by a comma
Legend of the seeker, Fantasy/Adventure, 3, 2
Mindy Project, Comedy, 10, 3
Orange is the new black, Drama/Comedy, 10, 9
I've tried using getline(in, line, ',') but my brain halts when its time to insert each line into the dvd array.
I also created a read method to read each word separated by a whitespace but I figured thats not what I really want.
I also tried to read a line with getline, store the line in a string and split it from there but I get confused along the line.
I can get the strings I need from each line; my confusion is in how to insert them into my class array in the while loop, especially when I can only read one word at a time.
I need help on what approach I should follow to tackle this problem.
My code:
#include <iostream>
#include <fstream>
#include <cassert>
#include <vector>
#define MAX 10
using namespace std;
class Video {
string _title;
string _genre;
int _available;
int _holds;
public:
Video(string title, string genre, int available, int holds);
Video();
void print();
void read(istream & is, Video dvd);
int holds();
void restock(int num);
string getTitle();
~Video();
};
Video::Video(string title, string genre, int available, int holds){
_title = title;
_genre = genre;
_available = available;
_holds = holds;
}
void Video::read (istream & is, Video dvd)
{
is >> _title >> _genre >> _available>>_holds;
dvd = Video(_title,_genre,_available,_holds);
}
int Video::holds(){
return _holds;
}
void Video::restock(int num){
_available += 5;
}
string Video::getTitle(){
return _title;
}
Video::Video(){
}
void Video::print(){
cout<<"Video title: " <<_title<<"\n"<<
"Genre: "<<_genre<<"\n"<<
"Available: " <<_available<<"\n"<<
"Holds: " <<_holds<<endl;
}
Video::~Video(){
cout<<"DESTRUCTOR ACTIVATED"<<endl;
}
int main(int params, char **argv){
string line;
int index = 0;
vector<string> tokens;
//Video dvd = Video("23 Jump Street", "comedy", 10, 3);
//dvd.print();
Video dvd[MAX];
dvd[0].holds();
ifstream in("input.txt");
/*while (getline(in, line, ',')) {
tokens.push_back(line);
}
for (int i = 0; i < 40; ++i)
{
cout<<tokens[i]<<endl;
}*/
if(!in.fail()){
while (getline(in, line)) {
dvd[index].read(in, dvd[index]);
/*cout<<line<<endl;
token = line;
while (getline(line, token, ',')){
}
cout<<"LINE CUT#####"<<endl;
cout<<line<<endl;
cout<<"TOKEN CUT#####"<<endl;*/
//dvd[index] =
index++;
}
}else{
cout<<"Invalid file"<<endl;
}
for (int i = 0; i < MAX; ++i)
{
dvd[i].print();
}
}
First, I would change the Video::read function into an overload of operator >>. This will allow the Video class to be used as simply as any other type when an input stream is being used.
Also, the way you implemented read as a non-static member function returning a void is not intuitive and very clunky to use. How would you write the loop, and at the same time detect that you've reached the end of file (imagine if there are only 3 items to read -- how would you know to not try to read a fourth item)? The better, intuitive, and frankly, de-facto way to do this in C++ is to overload the >> operator.
(At the end, I show how to write a read function that uses the overloaded >>)
class Video
{
//...
public:
friend std::istream& operator >> (std::istream& is, Video& vid);
//..
};
I won't go over why this should be a friend function, as that is easy to look up by searching for how to overload >>.
So we need to implement this function. Here is an implementation that reads in a single line, and copies the information to the passed-in vid:
std::istream& operator >> (std::istream& is, Video& vid)
{
std::string line;
std::string theTitle, theGenre, theAvail, theHolds;
// First, we read the entire line
if (std::getline(is, line))
{
// Now we copy the line into a string stream and break
// down the individual items
std::istringstream iss(line);
// the items are, in order: title, genre, available, and holds
std::getline(iss, theTitle, ',');
std::getline(iss, theGenre, ',');
std::getline(iss, theAvail, ',');
std::getline(iss, theHolds, ',');
// now we can create a Video and copy it to vid
vid = Video(theTitle, theGenre,
std::stoi(theAvail), // need to change to integer
std::stoi(theHolds)); // same here
}
return is; // return the input stream
}
Note how vid is a reference parameter, not passed by value. Your read function, if you were to keep it, would need to make the same change.
What we did above is that we read the entire line in first using the "outer" call to std::getline. Once we have the line as a string, we break down that string by using an std::istringstream and delimiting each item on the comma using an "inner" set of getline calls that works on the istringstream. Then we simply create a temporary Video from the information we retrieved from the istringstream and copy it to vid.
Here is a main function that now reads into a maximum of 10 items:
int main()
{
Video dvd[10];
int i = 0;
while (i < 10 && std::cin >> dvd[i])
{
dvd[i].print();
++i;
}
}
So if you look at the loop, all we did is 1) make sure we don't go over 10 items, and 2) just use cin >> dvd[i], which looks just like your everyday usage of >> when inputting an item. This is the magic of the overloaded >> for Video.
If you plan to keep the read function, it would be easier to change its return type to bool, returning true if the item was read and false otherwise, and have it simply call operator >>.
Here is an example:
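bool Video::read(std::istream & is, Video& dvd)
{
// Report success only if the extraction itself succeeded;
// checking is.good() beforehand would still return true
// when the read fails at the end of the file.
return static_cast<bool>(is >> dvd);
}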
And here is the main function:
int main()
{
Video dvd[10];
int i = 0;
while (i < 10 && dvd[i].read(std::cin, dvd[i]))
{
dvd[i].print();
++i;
}
}
However, I still say that the making of Video::read a non-static member makes the code in main clunky.
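One caveat with the operator >> shown earlier: std::stoi throws an exception on non-numeric text, and the text after each comma keeps its leading space. The sketch below shows one way to handle both. VideoRow is a simplified, hypothetical stand-in for the Video class (public members, no constructor), and trim_left is a helper I introduce for illustration.

```cpp
#include <exception>
#include <istream>
#include <sstream>
#include <string>

// Simplified stand-in for the Video class, with public members for brevity.
struct VideoRow
{
    std::string title;
    std::string genre;
    int available = 0;
    int holds = 0;
};

// Strip the leading spaces left behind after each comma.
std::string trim_left(const std::string& s)
{
    const std::string::size_type first = s.find_first_not_of(' ');
    return first == std::string::npos ? std::string() : s.substr(first);
}

std::istream& operator>>(std::istream& is, VideoRow& row)
{
    std::string line;
    if (!std::getline(is, line))
        return is;                       // no more input: stream is already failed

    std::istringstream iss(line);
    std::string title, genre, avail, holds;
    std::getline(iss, title, ',');
    std::getline(iss, genre, ',');
    std::getline(iss, avail, ',');
    std::getline(iss, holds, ',');
    try
    {
        VideoRow parsed;
        parsed.title = trim_left(title);
        parsed.genre = trim_left(genre);
        parsed.available = std::stoi(avail);
        parsed.holds = std::stoi(holds);
        row = parsed;                    // only overwrite row on success
    }
    catch (const std::exception&)
    {
        // std::stoi throws on non-numeric text; report it as a failed read.
        is.setstate(std::ios_base::failbit);
    }
    return is;
}
```

Setting failbit makes a malformed line end the usual while (in >> row) loop, which mirrors how the built-in extractors behave on bad input.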

Reading an Input File And Store The Data Into an Array (beginner)!

The Input file:
1 4 red
2 0 blue
3 1 white
4 2 green
5 2 black
What I want to do is take every row and store it in a 2D array.
for example:
array[0][0] = 1
array[0][1] = 4
array[0][2] = red
array[1][0] = 2
array[1][1] = 0
array[1][2] = blue
etc..
The code I am working on:
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
using namespace std;
int convert_str_to_int(const string& str) {
int val;
stringstream ss;
ss << str;
ss >> val;
return val;
}
string getid(string str){
istringstream iss(str);
string pid;
iss >> pid;
return pid;
}
string getnumberofcolors(string str){
istringstream iss(str);
string pid,c;
iss >> pid>>c;
return c;
}
int main() {
string lineinfile ;
vector<string> lines;
ifstream infile("myinputfile.txt");
if ( infile ) {
while ( getline( infile , lineinfile ) ) {
lines.push_back(lineinfile);
}
}
//first line - number of items
int numofitems = convert_str_to_int(lines[0]);
//lopps items info
string ar[numofitems ][3];
int i = 1;
while(i<=numofitems ){
ar[i][0] = getid(lines[i]);
i++;
}
while(i<=numofitems ){
ar[i][1] = getarrivel(lines[i]);
i++;
}
infile.close( ) ;
return 0 ;
}
When I add the second while loop, my program stops working for some reason.
Is there another way to do this, or a fix for my program?
Rather than patching the code, here is a simpler way to do it:
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int main() {
ifstream infile("myinputfile.txt"); // Streams skip spaces and line breaks
//first line - number of items
size_t numofitems;
infile >> numofitems;
//loop over the items' info
vector<pair<int, pair<int, string>>> ar(numofitems); // Or use std::tuple
for(size_t i = 0; i < numofitems; ++i){
infile >> ar[i].first >> ar[i].second.first >> ar[i].second.second;
}
// infile.close( ) ; // Not needed -- closed automatically
return 0 ;
}
You are probably solving some kind of simple algorithmic task. Take a look at std::pair and std::tuple, which are useful not only as container for two elements, but because of their natural comparison operators.
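To illustrate the comparison operators mentioned above, here is a small sketch using std::tuple. The meaning of the three columns is guessed from the question's helper names, and read_sorted is a hypothetical function; the sort is there only to show that tuples compare field by field.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <tuple>
#include <vector>

// One row of the input: id, number of colors, color name.
// (The field meanings are guessed from the question's helper names.)
using Item = std::tuple<int, int, std::string>;

// Read a count followed by that many rows, then sort the rows.
std::vector<Item> read_sorted(std::istream& in)
{
    std::size_t count = 0;
    in >> count;
    std::vector<Item> items;
    for (std::size_t i = 0; i < count; ++i)
    {
        int id = 0, n = 0;
        std::string color;
        if (!(in >> id >> n >> color))
            break;                      // stop at the first malformed row
        items.emplace_back(id, n, color);
    }
    // Tuples compare lexicographically: by id, then n, then color.
    std::sort(items.begin(), items.end());
    return items;
}
```

No comparison function has to be written; std::sort uses the tuple's built-in operator<.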
The answer given is indeed a much better solution than yours. I figured I should point out some of your design flaws and give some tips to improve it.
You redefined a function that already exists in the standard library: std::stoi() converts a string to an integer. Remember, if a function already exists, it's OK to reuse it; don't think you have to reinvent what has already been invented. If you're not sure, search your favorite C++ reference guide.
The solution stores the data "as is" while you store it as a full string. This doesn't really make sense. You know what the data is beforehand, use that to your advantage. Plus, when you store a line of data like that it must be parsed, converted, and then constructed before it can be used in any way, whereas in the solution the data is constructed once and only once.
Because the format of the data is known beforehand, an even better way to load the information is to define a structure along with input/output operators. This would look something like this:
struct MyData
{
int num1;
int num2;
std::string color;
friend std::ostream& operator << (std::ostream& os, const MyData& d);
friend std::istream& operator >> (std::istream& is, MyData& d);
};
Then you could simply do something like this:
...
MyData tmp;
infile >> tmp;
vData.push_back(tmp);
...
There is no question of intent: we are obviously reading a data type from a stream and storing it in a container. If anything, it's clearer what you are doing than in either your original solution or the provided one.
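The two operators are only declared above; a minimal sketch of how they might be defined follows. The space-separated output layout and the read_all helper are my assumptions. Because the members are public, the operators do not need to be friends here.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct MyData
{
    int num1;
    int num2;
    std::string color;
};

// Input: the target is a non-const reference because it is modified.
std::istream& operator>>(std::istream& is, MyData& d)
{
    return is >> d.num1 >> d.num2 >> d.color;
}

// Output: writes the fields back in the same space-separated layout.
std::ostream& operator<<(std::ostream& os, const MyData& d)
{
    return os << d.num1 << ' ' << d.num2 << ' ' << d.color;
}

// Read rows until the stream is exhausted or a row fails to parse.
std::vector<MyData> read_all(std::istream& is)
{
    std::vector<MyData> vData;
    MyData tmp;
    while (is >> tmp)
        vData.push_back(tmp);
    return vData;
}
```

With both operators in place, a row written with << can be read back with >>, which makes round-trip testing straightforward.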

Reading from file into a vector

I'm using C++ and I'm reading from a file lines like this:
D x1 x2 x3 y1
My code has:
struct gate {
char name;
vector <string> inputs;
string output;
};
In the main function:
vector <gate> eco;
int c=0;
int n=0;
int x = line.length();
while(netlist[c][0])
{
eco.push_back(gate());
eco[n].name = netlist[c][0];
eco[n].output[0] = netlist[c][x-2];
eco[n].output[1] = netlist[c][x-1];
}
where netlist is a 2D array I have copied the file into.
I need help to loop over the inputs and save them in the vector eco.
I don’t fully understand the sense of the 2D array but I suspect it’s redundant. You should use this code:
ifstream somefile(path);
vector<gate> eco;
gate g;
while (somefile >> g)
eco.push_back(g);
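// or, simpler, requiring #include <iterator> (this replaces the loop above;
// the extra parentheses keep the line from being parsed as a function declaration)
vector<gate> eco((std::istream_iterator<gate>(somefile)),
std::istream_iterator<gate>());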
And overload operator >> appropriately for your type gate:
std::istream& operator >>(std::istream& in, gate& value) {
// Error checking … return as soon as a failure is encountered.
if (not (in >> value.name))
return in;
value.inputs.resize(3);
return in >> value.inputs[0] >>
value.inputs[1] >>
value.inputs[2] >>
value.output;
}
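The operator above fixes the number of inputs at three. If the number of inputs varies from line to line, one possible variant (a sketch, assuming one gate per line with the output always last) reads the whole line and treats the final token as the output:

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct gate
{
    char name;
    std::vector<std::string> inputs;
    std::string output;
};

// Reads one line such as "D x1 x2 x3 y1": the first token is the gate name,
// the last token is the output, and everything in between is an input.
std::istream& operator>>(std::istream& in, gate& value)
{
    std::string line;
    if (!std::getline(in, line))
        return in;                       // no more lines

    std::istringstream tokens(line);
    gate parsed;
    std::vector<std::string> words;
    std::string word;
    if (tokens >> parsed.name)
    {
        while (tokens >> word)
            words.push_back(word);
    }
    if (words.empty())
    {
        // Blank line, or a name with no output: report a failed read.
        in.setstate(std::ios_base::failbit);
        return in;
    }
    parsed.output = words.back();        // last token is the output
    words.pop_back();
    parsed.inputs = words;               // the rest are inputs
    value = parsed;
    return in;
}
```

Note that a blank line sets failbit in this sketch, so a while (somefile >> g) loop stops there; skipping blank lines instead would need an extra loop around the getline call.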