Best way to search for vector values in a file - c++

For a project, I'm trying to compare a randomly generated vector against CSV files full of int data. My vector holds 6 random numbers; I need to read my CSV files and check whether the numbers in my vector exist in all of the files.
What is the best way to access my vector one element at a time and compare it to all files? My code below worked fine when I originally used an array to store the random numbers, but after changing over to a vector it doesn't seem to work.
For context, I'm very new to C++.
int csv_reader() {
string line;
vector<int> numbers;
int filecontents = 0;
num_gen(numbers);
std::string path_to_dir = "example/directory/to/csv's";
for( const auto & entry : std::filesystem::directory_iterator( path_to_dir )) {
if (entry.path().extension().string() != ".csv") continue;
ifstream file(entry.path());
if(file.is_open())
{
while(getline(file, line))
{
file >> filecontents;
for(int i = 0; i > numbers.size(); i++)
{
if(filecontents == numbers.at(i))
cout << "success";
else
cout << "doesnt exist";
}
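One thing that stands out in the posted loop: `for(int i = 0; i > numbers.size(); i++)` starts with `i` at 0, so the condition `i > numbers.size()` is false immediately and the body never runs; it presumably should be `i <`. Also, calling `getline` and then `file >> filecontents` consumes two pieces of input per iteration. A minimal sketch of one alternative, assuming each file holds comma-separated integers: collect all values of a file into a set, then test every number against it. The names `read_csv_ints` and `contains_all` are hypothetical helpers, not part of the original code.

```cpp
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Collect every integer found in a comma-separated stream.
std::set<int> read_csv_ints(std::istream& in) {
    std::set<int> values;
    std::string line, cell;
    while (std::getline(in, line)) {
        std::istringstream row(line);
        while (std::getline(row, cell, ',')) {
            try {
                values.insert(std::stoi(cell));
            } catch (const std::exception&) {
                // cell is not an integer (or out of range): skip it
            }
        }
    }
    return values;
}

// True only if every entry of numbers occurs in file_values.
bool contains_all(const std::set<int>& file_values,
                  const std::vector<int>& numbers) {
    for (std::size_t i = 0; i < numbers.size(); ++i) {
        if (file_values.count(numbers[i]) == 0)
            return false;
    }
    return true;
}
```

Inside the directory loop you could pass the opened `ifstream` to `read_csv_ints` and then call `contains_all` with your `numbers` vector, once per file.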

Related

Output issue with .CSV file

Whenever I attempt to output a line, it outputs the data from the file vertically instead of outputting the full line horizontally. My main goal is to output each line individually and remove commas and repeat till no more lines are in the CSV file.
An example when I run the code:
cout << data[1] << "\t";
Output:
Huggenkizz Pinzz White Dwarf Dildock Operknockity DeVille
What I'm trying to get is:
Huggenkizz Amanda 3/18/1997 Sales Associate 2 A A F
My CSV File:
ID,Last Name,First Name,DOB,DtHire,Title,Level,Region,Status,Gender
1,Huggenkizz,Amanda,3/18/1997,,Sales Associate,2,A,A,F
2,Pinzz,Bobby,5/12/1986,,Sales Associate,3,B,A,F
3,White,Snow,12/23/1995,,Sales Associate,2,C,A,F
4,Dwarf,Grumpy,9/8/1977,,Sales Associate,2,C,A,M
5,Dildock,Dopey,4/1/1992,,Sales Associate,1,B,A,M
6,Operknockity,Michael,10/2/1989,,Sales Associate,1,A,S,M
9,DeVille,Cruella,8/23/1960,,Sales Manager,,,A,F
My Code:
vector<string> SplitString(string s, string delimiter)
{
string section;
size_t pos = 0;
vector<string> annualSalesReport;
while ((pos = s.find(delimiter)) != string::npos) //finds string till, if not returns String::npos
{
section = (s.substr(0, pos)); // returns the substring section
annualSalesReport.push_back(section); // places comma split section into the next array
s.erase(0, pos + delimiter.length()); // removes the previous string up to the current pos
}
annualSalesReport.push_back((s));
return annualSalesReport;
}
int main()
{
vector<string> data;
string readLine;
ifstream myIFS;
myIFS.open("SalesAssociateAnnualReport.csv");
int lineCounter = 0;
while (getline(myIFS, readLine))
{
lineCounter++;
if (lineCounter > 1)
{
data = SplitString(readLine, ",");
if (data.size() > 1) //removes top line
{
cout << data[1]<< "\t";
}
}
}
myIFS.close();
return 0;
}
Please change your main function as follows
int main()
{
vector<vector<string>> data;
string readLine;
ifstream myIFS;
myIFS.open("SalesAssociateAnnualReport.csv");
int lineCounter = 0;
while (getline(myIFS, readLine))
{
lineCounter++;
if (lineCounter > 1)
{
vector<string> dataLine = SplitString(readLine, ",");
data.push_back(dataLine);
}
}
myIFS.close();
// output the first data line of csv file without delimiter and without first column
if (!data.empty()) // guard: the file may be missing or hold only the header line
{
for (size_t i = 1; i < data[0].size(); i++)
{
cout << data[0][i] << '\t';
}
}
return 0;
}
to get your desired output of
Huggenkizz Amanda 3/18/1997 Sales Associate 2 A AF
without having to change your SplitString function.
Please be aware that the first index of a C++ array or vector is always 0, not 1.
I've separated the CSV file input processing and the output generation, just to follow the simple programming model IPO:
Input -> Process -> Output
Therefore I've introduced the matrix of strings vector<vector<string>> to store the whole desired CSV file data.
As mentioned in the comments, the SplitString function may be refactored and it should also be fixed to split the last two columns properly.
Hope it helps?
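As for refactoring SplitString, here is a minimal sketch of one possible shape, assuming a single-character delimiter is enough for this file. It takes the line by const reference and walks it with `find` instead of erasing from the front on every step. The name `split_fields` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split s on delim, keeping empty fields (such as the blank DtHire column).
std::vector<std::string> split_fields(const std::string& s, char delim) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = s.find(delim, start);
        if (pos == std::string::npos) {
            fields.push_back(s.substr(start)); // last field, may be empty
            break;
        }
        fields.push_back(s.substr(start, pos - start));
        start = pos + 1; // continue after the delimiter
    }
    return fields;
}
```

Called on a data line of the CSV above, this yields one entry per column, including an empty string for the missing hire date.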

I have a text file with each line containing an integer. I want to open the text file and count the number of integers in the file

void DataHousing::FileOpen() {
int count = 0;
// attempt to open the file with read permission
ifstream inputHandle("NumFile500.txt", ios::in);
if (inputHandle.is_open() == true) {
while (!inputHandle.eof()) {
count++;
}
inputHandle.close();
}
else {
cout << "error";
}
cout << count;
}
This is getting stuck in the while loop. But shouldn't the while loop end when it gets to the end of file? Also, I'm not even sure yet if it is counting correctly.
A fairly easy way to do this would be to use std::cin instead. Assuming that you want to count the number of integers in a file you can just use a while loop like so:
int readInt;
int count = 0;
while(std::cin >> readInt){
count++;
}
Then you redirect the file into your executable's standard input like so:
exec < filename
If you prefer to stay with the approach you already have, you can replace your while loop condition with !inputHandle.eof() && std::getline(inputHandle, someStringHere). Then check whether someStringHere is an int and increment your count if it is, like so:
int count = 0;
std::string s;
ifstream inputHandle("NumFile500.txt", ios::in);
if (inputHandle.is_open() == true) {
    while (!inputHandle.eof() && std::getline(inputHandle, s)) {
        // count the line only if std::stoi consumes all of it
        try {
            std::size_t used = 0;
            std::stoi(s, &used);
            if (used == s.size())
                count++;
        }
        catch (const std::exception&) {
            // not an integer (or out of range for int): skip the line
        }
    }
    inputHandle.close();
}

How to get line in a file partially by C++

I want to read the data in an input file in parts. For example, if the input file is 1GB, I want to read only 100MB at a time and store it in a vector. How can I continue reading from the next line after the first loop? As you can see in my code below, after the first iteration of i, the vector v may hold 1000 lines from the input file. I'm not sure whether, on the next iteration of i, while(std::getline(infile, line)) will continue reading from line 1001 of the input file. If not, how can I modify my code to read lines from the input in several groups (1~1000), (1001~2000), (2001~3000)... and store each group in vector v?
#define FILESIZE 1000000000 // size of the file on disk
#define TOTAL_MEM 100000 // max items the memory buffer can hold
void ExternalSort(std::string infilepath, std::string outfilepath)
{
std::vector<std::string> v;
int runs_count;
std::ifstream infile;
if(!infile.is_open())
{
std::cout << "Unable to open file\n";
}
infile.open(infilepath, std::ifstream::in);
if(FILESIZE % TOTAL_MEM > 0)
runs_count = FILESIZE/TOTAL_MEM + 1;
else
runs_count = FILESIZE/TOTAL_MEM;
// Iterate through the elements in the file
for(i = 0; i < runs_count; i++)
{
// Step 1: Read M-element chunk at a time from the file
for (j = 0; j < (TOTAL_MEM < FILESIZE ? TOTAL_MEM : FILESIZE); j++)
{
while(std::getline(infile, line))
{
// If line is empty, ignore it
if(line.empty())
continue;
new_line = line + "\n";
// Line contains string of length > 0 then save it in vector
if(new_line.size() > 0)
v.push_back(new_line);
}
}
// Step 2: Sort M elements
sort(v.begin(), v.end()); //sort(v.begin(), v.end(), compare);
// Step 3: Create temporary files and write sorted data into those files.
std::ofstream tf;
tf.open(tfile + ToString(i) + ".txt", std::ofstream::out | std::ofstream::app);
std::ostream_iterator<std::string> output_iterator(tf, "\n");
std::copy(v.begin(), v.end(), output_iterator);
v.clear();
//for(std::vector<std::string>::iterator it = v.begin(); it != v.end(); ++it)
// tf << *it << "\n";
tf.close();
}
infile.close();
I didn’t have the patience to check the whole code. It was easier to write a splitter from scratch. Here are some observations, anyhow:
std::ifstream infile;
if (!infile.is_open())
{
std::cout << "Unable to open file\n";
}
infile.open(infilepath, std::ifstream::in);
You will always get the message since you check before opening the file. One correct way to open a file is:
std::ifstream infile(infilepath);
if (!infile)
throw "could not open the input file";
if (infile.peek() == std::ifstream::traits_type::eof())
This will be true, for instance, even for nonexistent files. The algorithm should work for empty files, too.
if(FILESIZE % TOTAL_MEM > 0)
runs_count = FILESIZE/TOTAL_MEM + 1;
else
runs_count = FILESIZE/TOTAL_MEM;
Why do you need the number of resulting files before you generate them? You cannot calculate it correctly in advance, since it depends on how long the lines are (you cannot read half a line just to fit it into TOTAL_MEM). You should read at most TOTAL_MEM bytes from the input file (but at least one line), sort and save them, and then continue from where you left off (see the loop in execute, below).
How can I continue reading the next line after the first loop?
If you do not close the input stream, the next read will continue from exactly where you left off.
A solution:
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <vector>
#include <iterator>
std::vector<std::string> split_file(const char* fn, std::size_t mem); // see the implementation below
int main()
{
const std::size_t max_mem = 8;
auto r = split_file("input.txt", max_mem);
std::cout << "generated files:" << std::endl;
for (const auto& fn : r)
std::cout << fn << std::endl;
}
class split_file_t
{
public:
split_file_t(std::istream& is, std::size_t mem) :is_{ is }, mem_{ mem }
{
// nop
}
std::vector<std::string> execute()
{
while (make_file())
;
return std::move(ofiles_);
}
protected:
std::istream& is_;
std::size_t mem_;
std::vector<std::string> ofiles_;
static std::string make_temp_file()
{
std::string fn(512, 0);
tmpnam_s(&fn.front(), fn.size()); // this might be system dependent
fn.resize(fn.find('\0')); // drop the unused zero padding after the generated name
std::ofstream os(fn);
os.close();
return fn;
}
bool make_file()
{
using namespace std;
// read lines
vector<string> lines;
{
streamsize max_gpos = is_.tellg() + streamsize(mem_);
string line;
while (is_.tellg() < max_gpos && getline(is_, line))
lines.push_back(line);
}
//
if (lines.empty())
return false;
// sort lines
sort(lines.begin(), lines.end());
// save lines
{
string ofile = make_temp_file();
ofstream os{ ofile };
if (!os)
throw "could not open output file";
copy(lines.begin(), lines.end(), ostream_iterator<string>(os, "\n"));
ofiles_.push_back(ofile);
}
//
return bool(is_);
}
};
std::vector<std::string> split_file(const char* fn, std::size_t mem)
{
using namespace std;
ifstream is{ fn };
if (!is)
return vector<string>();
return split_file_t{ is, mem }.execute();
}

c++ csv data importing problems [duplicate]

I am a college student, and as part of my final project for a C++ class we are assigned to read a CSV file that has distance and force data and then find torque from it. The problem I am running into is how to actually get the data out of the CSV file into a workable format. Currently I have been trying to get it into a matrix; to do this, though, I will first need to determine the size of the CSV file, as the program is supposed to take a file of any size. This is the format of the data:
Case 1,,,,,
x_position (m),y_position (m),z_position (m),F_x (N),F_y (N),F_z (N)
16.00,5.00,8.00,394.00,-18.00,396.00
22.00,26.00,14.00,-324.00,-420.00,429.00
28.00,25.00,21.00,73.00,-396.00,-401.00
6.00,9.00,12.00,-367.00,-137.00,-143.00
Also obviously the different data pieces (distance and forces) need to be put into different vectors or matrices.
This is what I have so far to try to find the number of lines in the file.
ifstream myfile("force_measurements.csv");
if(myfile.is_open()){
string line;
double num=0;
getline(myfile, line);
if(line==""){
cout<<num<<endl;
myfile.close();
}
else{
num++;
}
}
Once this works, how would you go about putting the data into a matrix? Or would a different format be easier? Vectors were my other thought.
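#include <array>
#include <cmath>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
class vec3D
{
    double m_x = 0, m_y = 0, m_z = 0;
public:
    void Set(size_t indx, double val)
    {
        switch(indx)
        {
        case 0:
            m_x = val;
            break;
        case 1:
            m_y = val;
            break;
        case 2:
            m_z = val;
            break;
        default:
            std::cout << "index out of range" << std::endl;
        }
    }
    double magnitude() const { return std::sqrt(m_x*m_x + m_y*m_y + m_z*m_z); }
};
int main()
{
    std::ifstream my_data_file("your_file_here.csv");
    std::vector<std::array<vec3D, 2>> Pos_Forz;
    if (!my_data_file.is_open())
    {
        std::cout << "failure to open the file" << std::endl;
        return -1;
    }
    std::string temp_str;
    // Assumption: the file starts with the two header lines shown in the
    // question ("Case 1,,,,," and the column names); skip them.
    getline(my_data_file, temp_str);
    getline(my_data_file, temp_str);
    double arr[6];
    std::array<vec3D, 2> temp_3Dvec;
    // Read one row per iteration; testing getline itself avoids the extra
    // pass that while(!eof()) makes after the last row.
    while (getline(my_data_file, temp_str))
    {
        if (temp_str.empty())
            continue;
        // Each row holds 6 comma-separated values: x, y, z, F_x, F_y, F_z.
        std::istringstream row(temp_str);
        std::string cell;
        for (int i = 0; i < 6; i++)
        {
            getline(row, cell, ',');
            arr[i] = std::stod(cell); // throws std::invalid_argument if the cell is not numeric
        }
        for (int i = 0; i < 2; i++)
        {
            for (int j = 0; j < 3; j++)
            {
                temp_3Dvec[i].Set(j, arr[i*3 + j]);
            }
        }
        Pos_Forz.push_back(temp_3Dvec);
    }
    for (const auto& e : Pos_Forz)
    {
        // Product of the two magnitudes: this equals the torque magnitude only
        // when position and force are perpendicular; in general torque is r x F.
        std::cout << " torque = " << (e[0].magnitude() * e[1].magnitude()) << std::endl;
    }
    return 0;
}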
Fix what needs to be fixed and take inspiration from this code. Also, please stop asking this kind of question here; you need to read the how-to on posting first.
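The code above prints the product of the position and force magnitudes. If what the assignment wants is the usual torque vector, that is the cross product r x F. A minimal sketch, assuming the torque is taken about the origin (the question does not say which point); `Vec3` and `cross` are hypothetical names.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

// Cross product r x f, i.e. the torque of force f applied at position r
// about the origin.
Vec3 cross(const Vec3& r, const Vec3& f) {
    Vec3 t = {{ r[1] * f[2] - r[2] * f[1],
                r[2] * f[0] - r[0] * f[2],
                r[0] * f[1] - r[1] * f[0] }};
    return t;
}
```

For the first data row, r = (16, 5, 8) and F = (394, -18, 396), this gives (2124, -3184, -2258) in N·m.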
You have multiple questions. Here's something I used for loading items out of a csv. This gets you started, and you can figure out whether further organization of the data is useful. I personally have classes that take a tokenized line and create an instance of themselves with them.
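#include <fstream>
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>
using namespace std;
typedef vector<string> csvLine;
void aFunction()
{
string line;
vector<csvLine> usedCollection;
ifstream csvFile("myfile.csv", ios::in);
if (!csvFile)
return; // nothing to load; the function returns void, so no error code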
while (getline(csvFile, line))
{
vector<string> tokenizedLine;
boost::tokenizer<boost::escaped_list_separator<char> > tk(
line, boost::escaped_list_separator<char>('\\', ',', '\"'));
for (const string& str : tk)
{
tokenizedLine.push_back(str);
}
usedCollection.push_back(tokenizedLine);
}
csvFile.close();
}
And so usedCollection is a collection of csvLines, where each of those is a collection of strings broken up by the separator character.

Txt to 2 different arrays c++

I have a txt file with a lot of things in it.
The lines have this pattern: 6 spaces then 1 int, 1 space, then a string.
Also, the 1st line has the amount of lines that the txt has.
I want to put the integers in an array of ints and the string on an array of strings.
I can read it and put it into an array, but only if I treat the ints as chars and put everything into one array of strings. When I try to separate things, I have no idea how to do it. Any ideas?
The code I used for putting everything in an array was this:
int size()
{
ifstream sizeX;
int x;
sizeX.open("cities.txt");
sizeX>>x;
return x;
};
int main(void)
{
int size = size();
string words[size];
ifstream file("cities.txt");
file.ignore(100000,'\n');
if(file.is_open())
{
for(int i=0; i<size; i++)
{
getline(file,words[i]);
}
}
}
Just to start I'm going to provide some tips about your code:
int size = size();
Why do you need to open the file, read the first line and then close it? That process can be done opening the file just once.
The declaration string words[size]; is not legal standard C++. Variable-length arrays are a C99 feature that was never adopted into the C++ standard, although some compilers accept them as an extension. I suggest replacing it with std::vector, which is the idiomatic C++ choice.
Here is a function that does what you need.
#include <cassert>
#include <fstream>
#include <string>
#include <vector>
int parse_file(const std::string& filename,
std::vector<std::string>* out_strings,
std::vector<int>* out_integers) {
assert(out_strings != nullptr);
assert(out_integers != nullptr);
std::ifstream file;
file.open(filename, std::ios_base::in);
if (file.fail()) {
// handle the error
return -1;
}
// Local variables
int num_rows;
std::string line;
// parse the first line
std::getline(file, line);
if (line.size() == 0) {
// file empty, handle the error
return -1;
}
num_rows = std::stoi(line);
// reserve memory
out_strings->clear();
out_strings->reserve(num_rows);
out_integers->clear();
out_integers->reserve(num_rows);
for (int row = 0; row < num_rows; ++row) {
// read the line
std::getline(file, line);
if (line.size() == 0) {
// unexpected end of line, handle it
return -1;
}
// get the integer
out_integers->push_back(
std::stoi(line.substr(6, line.find(' ', 6) - 6)));
// get the string
out_strings->push_back(
line.substr(line.find(' ', 6) + 1, std::string::npos));
}
file.close();
return 0;
}
You can definitely improve it, but I think it's a good starting point.
One last suggestion to improve the robustness of your code: you can match each line against a regular expression. That way you can be sure each line is formatted exactly the way you need.
For example:
std::regex line_pattern("\\s{6}[0-9]+\\s[^\\n]+");
if (!std::regex_match(line, line_pattern)) {
// oops... the line is not formatted the way you need
// this is an error
}
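The same pattern can also do the splitting if capture groups are added. A minimal sketch, assuming the line layout described in the question (6 whitespace characters, an integer, one whitespace character, then the text); the name `parse_line` is hypothetical, and `std::stoi` can still throw if the digit run does not fit in an int.

```cpp
#include <regex>
#include <string>

// Validate one line and extract its integer and its text in one step.
// Returns false (leaving the outputs untouched) if the line does not match.
bool parse_line(const std::string& line, int& number, std::string& name) {
    static const std::regex line_pattern("\\s{6}([0-9]+)\\s(.+)");
    std::smatch m;
    if (!std::regex_match(line, m, line_pattern))
        return false;
    number = std::stoi(m[1].str()); // first group: the digits
    name = m[2].str();              // second group: rest of the line
    return true;
}
```

Inside the row loop of parse_file, a call like this could replace the two `substr` expressions and report a malformed line through its return value.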