Data parsing from text file - c++

I have run into a problem parsing values from a text file. I need to add up the values of each specific event across all days and find the average. For example, (290+370+346+325+325)/5 and (5+5+5+12)/4, based on the data in the text file.
A sample is listed below.
For each line --> First event:Second event:Third event...:Total number of events:
Every new line is considered a new day.
3:290:61:148:2:5:
2:370:50:173:4:5:
5:346:87:131:4:
3:325:60:145:5:5:
3:325:60:145:5:12:13:7:
I have tried to do it myself, but I have only managed to store each column in a string array. Sample code below. I would appreciate any help, thanks!
void IDS::parseBase() {
string temp = "";
int counting = 0;
int maxEvent = 0;
int noOfLines = 0;
vector<string> baseVector;
ifstream readBaseFile("Base-Data.txt");
ifstream readBaseFileAgain("Base-Data.txt");
while (getline(readBaseFile, temp)) {
baseVector.push_back(temp);
}
readBaseFile.close();
//Find the no. of lines
noOfLines = baseVector.size();
//Find the no. of events
for (int i=0; i<baseVector.size(); i++)
{
counting = count(baseVector[i].begin(), baseVector[i].end(), ':') - 1;
if (maxEvent < counting)
{
maxEvent = counting;
}
}
//Store individual events into array
string a[maxEvent];
while (getline(readBaseFileAgain, temp)) {
stringstream streamTemp(temp);
for (int i=0; i<maxEvent; i++)
{
getline(streamTemp, temp, ':');
a[i] += temp + "\n";
}
}
}

I suggest reading the fields directly as numbers:
int a[maxEvent];
int i = 0;
char c; // to hold the colon
while (i < maxEvent && streamTemp >> a[i] >> c) ++i;

String values become 0 after using getline from .csv into an array C++

I'm reading a csv file into C++ to build multiple parallel arrays and print reports from them. I can cout the strings with no problem, but when I try to create indices, test conditions, or convert to int, the strings seem to have no value. There are no characters when I try to atoi, so that is not the issue. I've looked everywhere and can't find any similar questions. The logic of my code is also not the issue, because it works when I populate tempArray2 manually. Also, I'm a first-semester student, so there's going to be some kludge in my code. Does getline eliminate the value of the string?
(Because this was flagged as answered: This is not a duplicate question. I am not trying to parse the csv into vectors; I'm using parallel arrays, and this worked for generating my first report because those values were not strings. Using a parser someone else programmed would cause me to fail this assignment and doesn't answer my question of why the values become 0.)
Here's a snippet of the .csv; the lines are not double spaced on the notepad.
"LOCATION","INDICATOR","SUBJECT","MEASURE","FREQUENCY","TIME","Value","Flag Codes"
"GBR","FERTILITY","TOT","CHD_WOMAN","A","2019",1.63,
"USA","FERTILITY","TOT","CHD_WOMAN","A","1970",2.48,
"USA","FERTILITY","TOT","CHD_WOMAN","A","1971",2.27,
Here's the readFile function. You can see the comment where I tried to atoi the year column (replacing yr[x] with temp like in column 7, which works) and it returns 0 value, even though that column clearly has only ints in the string.
bool ReadFile(string loc[], string yr[], float rates[])
{
ifstream input{FILENAME.c_str()};
if (!input)
{
return false;
}
//init trash and temp vars
string trash, temp;
//trash header line
for (int i{ 0 }; i < 1; ++i)
{
getline(input, trash);
}
for (int x{ 0 }; x < SIZE; ++x)
{
getline(input, loc[x], ',');//read column 1 country
getline(input, trash, ','); // column 2 trash
getline(input, trash, ','); // column 3 trash
getline(input, trash, ','); // column 4 trash
getline(input, trash, ','); // column 5 trash
getline(input, yr[x], ',');// read column 6 year
//yr[x] = atoi(temp.c_str());// would always return value 0 even though no char interfering
getline(input, temp, ',');//read column 7 rates
rates[x] = atof(temp.c_str());
cout << yr[x];
cout << loc[x];
cout << rates[x];
}
return true;
}
And the Analyze function, where the issue also occurs in the tempArray2 block. The comparison with "USA" is never true; if I change it to !=, it matches every index, including the ones that clearly contain "USA". The first tempArray works as expected.
void Analyze(string loc[], string yr[], float rates[], FertilityResults& result)
{
// array for highest and lowest fertrates
float tempArray[SIZE];
for (int i{ 0 }; i < SIZE; ++i)
{
tempArray[i] = rates[i];
}
result.highestRate = 0;
result.lowestRate = 20;
for (int i{ 1 }; i < SIZE; ++i)
{
if (tempArray[i] >= result.highestRate)
{
result.highestRate = tempArray[i];
result.highestRateIndex = i;
}
else if (tempArray[i] > 0 && tempArray[i] < result.lowestRate)
{
result.lowestRate = tempArray[i];
result.lowestRateIndex = i;
}
}
//2nd array to retrieve USA data
string tempArray2[SIZE];
for (int i{ 0 }; i < SIZE; ++i)
{
tempArray2[i] = loc[i];
//cout << tempArray2[i];
if (tempArray2[i] == "USA")
{
cout << "hi";
result.usaIndex = i;
cout <<result.usaIndex<<endl;
}
}
}
Please let me know if you need anything else or if it runs on your terminal as is somehow.
Thanks, this is my final project and the first time I've had to ask for help.
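One possible explanation, judging only from the sample rows above: the text columns are wrapped in double quotes, so the year field is read as "2019" with the quote characters included, and atoi returns 0 for a string that does not start with a number. In addition, because every getline uses ',' as its delimiter, the newline that ends a row is never consumed and ends up at the front of the next loc[x], so it can never equal "USA". A minimal sketch of a cleanup step, where `stripField` is a hypothetical helper and not part of the original code:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the original code): removes surrounding
// whitespace (including a leading newline) and double quotes from a CSV field.
std::string stripField(const std::string& field) {
    const std::string junk = " \t\r\n\"";
    const std::string::size_type first = field.find_first_not_of(junk);
    if (first == std::string::npos) return "";  // nothing but junk
    const std::string::size_type last = field.find_last_not_of(junk);
    return field.substr(first, last - first + 1);
}
```

With that in place, something like `loc[x] = stripField(loc[x]);` and `atoi(stripField(temp).c_str())` would compare and convert the bare text. This assumes no field contains an embedded comma or quote.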

Reading in a CSV file for integers

I have a test file set up here trying to read in a CSV file
ifstream file;
file.open("New Microsoft Excel Worksheet.csv");
string temp;
string arr[15];
int size = 0;
int index = 0;
while (getline(file, temp, ','))
{
if (!temp.empty())
{
arr[index] = temp;
std::cout << arr[index];
size++;
index++;
}
}
Output
34568
29774
18421
It successfully captures each value, and even lines them up in rows (I'm guessing it's also capturing a \n?).
However, I need them to be integers. I would do this in the same loop with stoi(), but I need the size of the array to be dynamic (I don't want to use vectors here because this fits into another part of the code that needs an array).
Here is how I turn them into integers and put them in a new array
int *intArr = new int[size];
for (index = 0; index < size; index++)
{
intArr[index] = stoi(arr[index]);
std::cout << intArr[index];
}
and here is the output for this
3456897748421
It seems to miss the first number of each new row in the csv.
If I lay it out properly, here's what's going on:
34568
9774
8421
I'm guessing this has something to do with CSV files exported from Excel having a \n at the end of each row.
How do I fix this? I need all the values to be integers, thanks!
I figured it out; you need two while loops like this:
string data;
while (getline(file, temp))
{
istringstream ss(temp);
while (getline(ss, data, ','))
{
arr[index] = data;
std::cout << arr[index];
size++;
index++;
}
}
I'm not exactly sure why you need two, because this seems kind of redundant, but getting the line THEN putting it in a string stream seems to get rid of that new line
Thanks to #brc-dd for the help!
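As for why two loops are needed: getline with ',' as the delimiter treats '\n' as an ordinary character, so the last cell of one row and the first cell of the next are read as a single token such as "8\n2". stoi stops at the first non-digit and returns 8, which is why the first number of each new row disappears. Here is a minimal sketch that converts to int in the same pass; `readCsvInts` is a hypothetical name, and it assumes every non-empty cell holds an integer.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every comma-separated integer from a CSV stream.
std::vector<int> readCsvInts(std::istream& in) {
    std::vector<int> values;
    std::string line;
    while (std::getline(in, line)) {            // outer loop consumes the '\n'
        std::istringstream row(line);
        std::string cell;
        while (std::getline(row, cell, ',')) {  // inner loop splits on ','
            if (cell.empty() || cell == "\r") continue;  // skip blanks and a stray CR
            values.push_back(std::stoi(cell));  // throws if the cell is not a number
        }
    }
    return values;
}
```

If the other part of the code needs a plain array, `values.data()` and `values.size()` give a pointer to contiguous ints and their count, so the vector can still do the growing.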

Split text file into multiple files c++

I'm trying to split a txt file into several new files. This is what I've done so far:
long c = 0;
string s;
vector<string> v;
I need to count how many lines my txt file has (this works):
while(getline(inputFile, s, '\n')){
v.push_back(s);
c++;
}
long lineNumber = c;
long max = 100;
long nFiles;
Checking how many new files will be created:
if((lineNumber % max) ==0)
nFiles = lineNumber/max;
else
nFiles = lineNumber/max + 1;
Creating the new file names:
long currentLine = 0;
for(long i = 1; i <= nFiles; i++){
stringstream sstream;
string a_i;
sstream <<i;
sstream >> a_i;
string outputfiles = "name_" + a_i + ".txt";
ofstream fout(outputfiles.c_str());
for (int j = currentLine; j<max; j++){
fout << v[j]<<endl;
}
fout.close();
currentLine = max;
}
inputFile.close();
It creates files but then suddenly stops working. Does anyone know why?
This is a prime example of a time where using a debugger could help you out.
You loop here:
for (int j = currentLine; j<max; j++){
fout << line[j]<<endl;
}
fout.close();
currentLine = max;
max = max + nMax;
max can be bigger than the size of line, and accessing line[j] past the end is undefined behavior that typically shows up as a segmentation fault. The inner loop should check that it does not go past the end of line, whose length you can get with line.size(). Even after you fix this, the program logic isn't quite right: line doesn't grow, yet each iteration of the outer loop moves the accesses another max indexes along. This will always fail in the last file you write unless you stop the loop at the end of line.
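One way to keep the loop inside the vector is to compute each file's [begin, end) range up front and clamp the end to the number of lines. This is only a sketch under assumptions: `chunkRanges` and `writeChunks` are hypothetical names, and the output file naming is illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns [begin, end) index pairs, one per output file.
std::vector<std::pair<std::size_t, std::size_t>>
chunkRanges(std::size_t lineCount, std::size_t maxPerFile) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    if (maxPerFile == 0) return ranges;  // avoid an endless loop
    for (std::size_t begin = 0; begin < lineCount; begin += maxPerFile) {
        // Clamp the end so the last chunk never runs past the data.
        ranges.emplace_back(begin, std::min(begin + maxPerFile, lineCount));
    }
    return ranges;
}

// Writes each chunk to <prefix>_1.txt, <prefix>_2.txt, ... (names are illustrative).
void writeChunks(const std::vector<std::string>& lines,
                 std::size_t maxPerFile, const std::string& prefix) {
    const auto ranges = chunkRanges(lines.size(), maxPerFile);
    for (std::size_t k = 0; k < ranges.size(); ++k) {
        std::ofstream fout(prefix + "_" + std::to_string(k + 1) + ".txt");
        for (std::size_t j = ranges[k].first; j < ranges[k].second; ++j)
            fout << lines[j] << '\n';
    }
}
```

With 250 lines and a limit of 100, the ranges are 0-100, 100-200 and 200-250, so the third file gets the remaining 50 lines and nothing is read past the end.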

User input to matrix in C++

I'm having trouble reading input from the user and converting it into a matrix for calculation. For example, with the input = {1 2 3 / 4 5 6}, the program should read in the matrix in the form of
1 2 3
4 5 6
which has 3 cols and 2 rows. What I have so far, which does not seem to work:
input.replace(input.begin(), input.end(), '/', ' ');
stringstream ss(input);
string token;
while (getline(ss, token, ' '))
{
for (int i = 0; i < row; i++)
{
for (int j = 0; j < col; j++)
{
int tok = atoi(token.c_str());
(*matrix).setElement(i, j, tok);
}
}
}
So what I'm trying to do is break the input into tokens and store them in the matrix using the setElement function, which takes the row, the column, and the value the user wants to store. What's wrong with this code is that the variable tok doesn't seem to change and stays stuck at 0. Assume that row and col are known.
Thanks so much for any help.
Although there are many simple ways to solve this specific problem (and the other answers have various good suggestions), let me try to give a more general view of the problem of "formatted input".
There are essentially three kinds of problems here:
at the low level, you have to do a string-to-number conversion
at a higher level, you have to parse a composite format (understanding row and line separation)
finally, you also have to determine the size of the compound (how many rows and cols?)
These three things are not fully independent, and the last is needed to know how to store the elements (how do you size the matrix?).
Finally, there is a fourth problem that is spread across the other three: what to do if the input is "wrong".
These kinds of problems are typically approached in two opposite ways:
1. Read the data as they come, check that the format is matched, and dynamically grow the data structure that has to contain them, or...
2. Read all the data at once as they are (in textual form), then analyze the text to figure out how many elements it has, then isolate the "chunks" and do the conversions.
Point 2 requires good string manipulation, but also requires knowing how long the input is (what happens if one of the separating spaces is a newline? The idea that everything is obtained with a single getline fails in those cases).
Point 1 requires a Matrix class that can grow as you read, or a temporary dynamic structure (like a std container) in which you can place what you read before sending it to the appropriate place.
Since I don't know how your matrix works, let me keep a temporary vector and counters to store lines.
#include <vector>
#include <iostream>
#include <cassert>
class readmatrix
{
std::vector<int> data; //storage
size_t rows, cols; //the counted rows and columns
size_t col; //the counting cols in a current row
Matrix& mtx; //refer to the matrix that has to be read
public:
// just keep the reference to the destination
readmatrix(Matrix& m) :data(), rows(), cols(), col(), mtx(m)
{}
// make this class a istream-->istream functor and let it be usable as a stream
// manipulator: input >> readmatrix(yourmatrix)
std::istream& operator()(std::istream& s)
{
if(s) //if we can read
{
char c=0;
s >> c; //trim spaces and get a char.
if(c!='{') //not an open brace
{ s.setstate(s.failbit); return s; } //report the format failure
while(s) //loop on rows (will break when the '}' will be found)
{
col=0;
while(s) //loop on cols (will break when the '/' or '}' will be found)
{
c=0; s >> c;
if(c=='/' || c=='}') //row finished?
{
if(!cols) cols=col; //got first row length
else if(cols != col) //it appears rows have different length
{ s.setstate(s.failbit); return s; } //report the format failure
if(c!='/') s.unget(); //push the char back for later
break; //row finished
}
s.unget(); //push the "not /" char back
int x; s >> x; //get an integer
if(!s) return s; //failed to read an integer!
++col; data.push_back(x); //save the read data
}
++rows; //got an entire row
c=0; s >> c;
if(c == '}') break; //finished the rows
else s.unget(); //push back the char: next row begin
}
}
//now, if read was successful,
// we can dispatch the data into the final destination
if(s)
{
mtx.setsize(rows,cols); // I assume you can set the matrix size this way
auto it = data.begin(); //will scan the inner vector
for(size_t r=0; r<rows; ++r) for(size_t c=0; c<cols; ++c, ++it)
mtx(r,c) = *it; //place the data
assert(it == data.end()); //this must be true if counting have gone right
}
return s;
}
};
// A functor is not a standard stream manipulator, so this overload is what
// makes "input >> readmatrix(m)" compile.
inline std::istream& operator>>(std::istream& s, readmatrix r)
{ return r(s); }
Now you can read the matrix as
input >> readmatrix(matrix);
You will notice at this point that there are certain recurring patterns in the code: this is typical of one-pass parsers, and those patterns can be grouped to form sub-parsers. If you do it generically you will, in fact, rewrite boost::spirit.
Of course, some adaptation can be done depending on how your matrix works (does it have fixed sizes?), or on what to do if row sizes don't match (partial column filling?).
You can even add a formatted input operator like
std::istream& operator>>(std::istream& s, Matrix& m)
{ return readmatrix(m)(s); }
so that you can just do
input >> matrix;
You are trying to operate on every cell of the matrix for each token read from the input!
You have to use one token for each cell, not write each token to all of them.
Splitting a string into tokens can be done with the following function.
Please don't be shocked that the following code isn't runnable; this is due to the missing matrix class.
Try the following:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
// Split str on delimiter; a string without any delimiter yields one token.
void split(const string& str, char delimiter, vector<string>& result) {
    string::size_type start = 0;
    string::size_type delimOcc = str.find(delimiter);
    while (delimOcc != string::npos) {
        result.push_back(str.substr(start, delimOcc - start));
        start = delimOcc + 1;
        delimOcc = str.find(delimiter, start);
    }
    result.push_back(str.substr(start)); // last (or only) token
}
int main()
{
    std::string input = "1 2 3 / 4 5 6";
    vector<string> rows;
    split(input, '/', rows);
    for (size_t i = 0; i < rows.size(); i++) {
        vector<string> cols;
        split(rows[i], ' ', cols);
        int col = 0; // column index counts only non-empty tokens
        for (size_t j = 0; j < cols.size(); j++) {
            if (cols[j].empty()) continue; // produced by leading or repeated spaces
            int tok = stoi(cols[j]);
            (*matrix).setElement(i, col, tok); // matrix is the asker's object, not defined here
            cout << tok << " - " << i << " - " << col << endl;
            col++;
        }
    }
    return 0;
}
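If you know the size of the matrix beforehand, you don't actually need getline; you can read int by int. Note that the member call input.replace(begin, end, '/', ' ') does not substitute characters (it overwrites the whole range with repeated copies of one character), so use std::replace from <algorithm> instead. If the input really contains the braces, remove them first. (untested code)
std::replace(input.begin(), input.end(), '/', '\n');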
stringstream ss(input);
for (int i = 0; i < row; i++)
{
for (int j = 0; j < col; j++)
{
int tok;
ss >> tok;
(*matrix).setElement(i, j, tok);
}
}
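For reference, here is a self-contained sketch that does not need the Matrix class and works out the row count on its own. As far as I can tell, the member call `input.replace(input.begin(), input.end(), '/', ' ')` from the question picks the (first, last, count, ch) overload and turns the whole string into 47 spaces ('/' is 47 in ASCII), which would explain why tok is always 0. `std::replace` from `<algorithm>` is the call that substitutes characters. `parseMatrix` is a hypothetical name, and a vector of rows stands in for the asker's matrix.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses text such as "{1 2 3 / 4 5 6}" into rows of ints.
// Returns an empty result if any token is not an integer.
std::vector<std::vector<int>> parseMatrix(std::string input) {
    // Turn the braces into spaces so only numbers and '/' remain.
    std::replace(input.begin(), input.end(), '{', ' ');
    std::replace(input.begin(), input.end(), '}', ' ');
    std::vector<std::vector<int>> rows;
    std::istringstream all(input);
    std::string rowText;
    while (std::getline(all, rowText, '/')) {  // one chunk per matrix row
        std::istringstream rowStream(rowText);
        std::vector<int> row;
        int value;
        while (rowStream >> value)             // operator>> skips whitespace
            row.push_back(value);
        if (!rowStream.eof()) return {};       // stopped on a non-numeric token
        rows.push_back(row);
    }
    return rows;
}
```

Each parsed value can then be passed to setElement(i, j, value) using the row and column indexes of the result. The sketch does not check that all rows have the same length.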

Bug in selection sort loop

I need to make a program that accepts an input file of numbers (integer.txt) with one number per line, reads them into a vector, then uses a selection sort algorithm to sort the numbers in descending order and writes them to the output file (sorted.txt). I'm quite sure something is wrong in my selectionSort() function that keeps the loop from getting the right values, because when I test it with cout I get wildly wrong output. I'm sure it's a beginner's goof.
vector<string> getNumbers()
{
vector<string> numberList;
ifstream inputFile ("integer.txt");
string pushToVector;
while (inputFile >> pushToVector)
{
numberList.push_back(pushToVector);
}
return numberList;
}
vector<string> selectionSort()
{
vector<string> showNumbers = getNumbers();
int vectorMax = showNumbers.size();
int vectorRange = (showNumbers.size() - 1);
int i, j, iMin;
for (j = 0; j < vectorMax; j++)
{
iMin = j;
for( i = j; i < vectorMax; i++)
{
if(showNumbers[i] < showNumbers[iMin])
{
iMin = i;
}
}
if (iMin != j)
{
showNumbers[j] = showNumbers [iMin];
}
}
return showNumbers;
}
void vectorToFile()
{
vector<string> sortedVector = selectionSort();
int vectorSize = sortedVector.size();
ofstream writeTo;
writeTo.open("sorted.txt");
int i = 0;
while (writeTo.is_open())
{
while (i < vectorSize)
{
writeTo << sortedVector[i] << endl;
i += 1;
}
writeTo.close();
}
return;
}
int main()
{
vectorToFile();
}
vectorRange is defined but never used.
In your selectionSort(), the only command that changes the vector is:
showNumbers[j] = showNumbers [iMin];
Every time control reaches that line, you overwrite an element of the vector.
You must learn to swap two values, before you even think about sorting a vector.
Also, your functions are over-coupled. If all you want to fix is selectionSort, then you should be able to post that plus a main that calls it with some test data and displays the result, but no, your functions all call each other. Learn to decouple.
Also your variable names are awful.
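To make the swap point concrete, here is a minimal, decoupled sketch of a descending selection sort that exchanges the two elements with std::swap instead of overwriting one of them. It sorts ints rather than strings, which is an assumption on my part: comparing the numbers as strings orders them lexicographically, so "10" would come before "9". The function name is made up for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Selection sort in descending order. Each pass finds the largest remaining
// value and swaps it into position j, so no element is ever lost.
void selectionSortDescending(std::vector<int>& values) {
    for (std::size_t j = 0; j + 1 < values.size(); ++j) {
        std::size_t largest = j;
        for (std::size_t i = j + 1; i < values.size(); ++i) {
            if (values[i] > values[largest]) largest = i;
        }
        if (largest != j) std::swap(values[j], values[largest]);
    }
}
```

Because the function only takes a vector, it can be tested from main with a few hard-coded values before any file reading or writing is involved.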