Output issue with .CSV file - c++

Whenever I attempt to output a line, I get one column of the file (the same field from every row) instead of the full line printed horizontally. My goal is to output each line individually with the commas removed, repeating until there are no more lines in the CSV file.
An example when I run the code:
cout << data[1] << "\t";
Output:
Huggenkizz Pinzz White Dwarf Dildock Operknockity DeVille
What I'm trying to get is:
Huggenkizz Amanda 3/18/1997 Sales Associate 2 A A F
My CSV File:
ID,Last Name,First Name,DOB,DtHire,Title,Level,Region,Status,Gender
1,Huggenkizz,Amanda,3/18/1997,,Sales Associate,2,A,A,F
2,Pinzz,Bobby,5/12/1986,,Sales Associate,3,B,A,F
3,White,Snow,12/23/1995,,Sales Associate,2,C,A,F
4,Dwarf,Grumpy,9/8/1977,,Sales Associate,2,C,A,M
5,Dildock,Dopey,4/1/1992,,Sales Associate,1,B,A,M
6,Operknockity,Michael,10/2/1989,,Sales Associate,1,A,S,M
9,DeVille,Cruella,8/23/1960,,Sales Manager,,,A,F
My Code:
vector<string> SplitString(string s, string delimiter)
{
string section;
size_t pos = 0;
vector<string> annualSalesReport;
while ((pos = s.find(delimiter)) != string::npos) //finds string till, if not returns String::npos
{
section = (s.substr(0, pos)); // returns the substring section
annualSalesReport.push_back(section); // places comma split section into the next array
s.erase(0, pos + delimiter.length()); // removes the previous string up to the current pos
}
annualSalesReport.push_back((s));
return annualSalesReport;
}
int main()
{
vector<string> data;
string readLine;
ifstream myIFS;
myIFS.open("SalesAssociateAnnualReport.csv");
int lineCounter = 0;
while (getline(myIFS, readLine))
{
lineCounter++;
if (lineCounter > 1)
{
data = SplitString(readLine, ",");
if (data.size() > 1) //removes top line
{
cout << data[1]<< "\t";
}
}
}
myIFS.close();
return 0;
}

Please change your main function as follows:
int main()
{
vector<vector<string>> data;
string readLine;
ifstream myIFS;
myIFS.open("SalesAssociateAnnualReport.csv");
int lineCounter = 0;
while (getline(myIFS, readLine))
{
lineCounter++;
if (lineCounter > 1)
{
vector<string> dataLine = SplitString(readLine, ",");
data.push_back(dataLine);
}
}
myIFS.close();
// output the first data line of csv file without delimiter and without first column
for (size_t i = 1; i < data[0].size(); i++)
{
cout << data[0][i] << '\t';
}
return 0;
}
to get your desired output of
Huggenkizz Amanda 3/18/1997 Sales Associate 2 A AF
without having to change your SplitString function.
Please be aware that array indexes in C++ always start at 0, not 1.
I've separated the CSV input processing from the output generation to follow the simple IPO programming model:
Input -> Process -> Output
That is why I introduced the matrix of strings vector<vector<string>>, which stores all of the desired CSV data.
As mentioned in the comments, the SplitString function could be refactored, and it should also be fixed to split the last two columns properly.
Hope it helps.
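The answer above prints only the first data row, while the question asks for every row. A minimal sketch of that loop follows, assuming the file has a single header line and no quoted fields; splitCsvLine and printRows are hypothetical names, not part of the original code, and the split here uses std::getline with a ',' delimiter rather than the asker's SplitString.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Split one CSV line on ',' (no quoted-field handling; an assumption that
// matches the sample file). Empty fields such as DtHire are kept.
std::vector<std::string> splitCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, ','))
        fields.push_back(field);
    if (!line.empty() && line.back() == ',')
        fields.push_back("");  // getline drops a trailing empty field
    return fields;
}

// Read CSV text from `in`, skip the header row, and write every data row to
// `out` with the first column (ID) dropped and fields separated by tabs.
void printRows(std::istream& in, std::ostream& out)
{
    std::string line;
    std::getline(in, line);  // discard the header line
    while (std::getline(in, line))
    {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();  // tolerate Windows line endings
        const std::vector<std::string> fields = splitCsvLine(line);
        for (std::size_t i = 1; i < fields.size(); ++i)
            out << fields[i] << (i + 1 < fields.size() ? '\t' : '\n');
    }
}
```

With the asker's stream this would be called as printRows(myIFS, std::cout) after opening the file. The empty DtHire column shows up as two consecutive tabs in each row.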

Related

reading a large txt file (2GB) passing it to a string, takes too long

I have a big text file (2GB) that contains a couple of books. I want to create a char** that contains each word of the whole text file. First I read all of the file's data into one HUGE string, and then I build the char** variable from it.
The problem is that it takes TOO long (hours) for the getline() loop to end. I ran it for 30 minutes and the program had read 500,000 lines. The whole file is 43,000,000 lines.
int main (){
ifstream book;
string sbook,str;
book.open("gutenberg.txt"); // the huge file
cout<<"Reading the file ....."<<endl;
while(!book.eof()){
getline(book,sbook);//passing the line as a string to sbook
if(str.empty()){
str= sbook;
}
else
str= str + " " + sbook;//apend sbook to another string until the file closes
}//I never managed to get out of this loop
cout<<"Done reading the file."<<endl;
cout<<"Removal....."<<endl;
removal(str);//removes all puncuations and makes each upperccase letter to a lowercase
cout<<"done removal"<<endl;
cout<<"Removing doublewhitespaces...."<<endl;
int whitespaces=removedoublewhitespace(str);//removes excess whitespaces leaving only one whitespace within each word
//and returns the number of all the whitespaces
cout<<"doublewhitespaces removed."<<endl;
cout<<"initiating leksis....."<<endl;
char **leksis=new char*[whitespaces+1];//whitespase+1 is how many words are left in the file
for(int i=0;i<whitespaces+1;i++){
leksis[i]= new char[30];
}
cout<<"done initiating leksis."<<endl;
int y=0,j=0;
cout<<"constructing leksis,finding plithos...."<<endl;
for(int i=0;i<str.length();i++){
if(isspace(str[i])){;
y++;
j=0;
leksis[y][j]=' ';
j++;
}
else{
leksis[y][j]=str[i];
j++;
}
}
cout<<"Done constructing leksis,finding plithos...."<<endl;
removal() function
void removal(string &s) {
for (int i = 0, len = s.size(); i < len; i++)
{
char c=s[i];
if(isupper(s[i])){
s[i]=tolower(s[i]);
}
int flag=ispunct(s[i]);
if (flag){
s.erase(i--, 1);
len = s.size();
}
}
}
removedoublewhitespace() function :
int removedoublewhitespace(string &str){
int wcnt=0;
for(int i=str.size()-1; i >= 0; i-- )
{
if(str[i]==' '&&str[i]==str[i-1]) //added equal sign
{
str.erase( str.begin() + i );
}
}
for(int i=0;i<str.size();i++){
if(isspace(str[i])){
wcnt++;
}
}
return wcnt;
}
This loop
while(!book.eof()){
getline(book,sbook);//passing the line as a string to sbook
if(str.empty()){
str= sbook;
}
else
str= str + " " + sbook;
is hugely inefficient. Each str = str + " " + sbook first builds a temporary copy of everything accumulated so far, so the total work grows roughly with the square of the file size. If you must have the whole file in memory at once, put it in a linked list of strings, one for each line. Or use a vector of strings; that is also a huge chunk of memory, but it will be allocated more efficiently.

Best way to search for vector values in a file

For a project, I'm trying to compare my randomly generated vector against a set of given CSV files full of int data. My vector holds 6 random numbers; I need to read my CSV files and check whether the numbers in my vector exist in all of the files.
What is the best way to go through my vector one element at a time and compare it against all the files? My code below worked fine when I originally used an array to store the random numbers, but after changing over to a vector it doesn't seem to work.
For context, I'm very new to C++.
int csv_reader() {
string line;
vector<int> numbers;
int filecontents = 0;
num_gen(numbers);
std::string path_to_dir = "example/directory/to/csv's";
for( const auto & entry : std::filesystem::directory_iterator( path_to_dir )) {
if (entry.path().extension().string() != ".csv") continue;
ifstream file(entry.path());
if(file.is_open())
{
while(getline(file, line))
{
file >> filecontents;
for(int i = 0; i > numbers.size(); i++)
{
if(filecontents == numbers.at(i))
cout << "success";
else
cout << "doesnt exist";
}
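The listing above is cut off in the source. Two things are visible in what remains: the condition i > numbers.size() is false on the very first check, so the inner loop body never runs, and mixing getline(file, line) with file >> filecontents means most of each line is read and then ignored. A minimal sketch of one alternative follows, assuming the files hold comma-separated integers; readCsvInts and containsAll are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Collect every integer found in comma-separated text read from `in`.
// Fields that do not begin with an integer are skipped.
std::set<int> readCsvInts(std::istream& in)
{
    std::set<int> values;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ','))
        {
            std::istringstream cell(field);
            int value = 0;
            if (cell >> value)
                values.insert(value);
        }
    }
    return values;
}

// True when every element of `numbers` occurs in `fileValues`.
bool containsAll(const std::set<int>& fileValues, const std::vector<int>& numbers)
{
    return std::all_of(numbers.begin(), numbers.end(),
                       [&](int n) { return fileValues.count(n) != 0; });
}
```

Inside the directory loop this would be used as containsAll(readCsvInts(file), numbers) for each opened file. Reading each file into a set once keeps the lookup for each of the 6 numbers cheap.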

Removing a Line from a File without making a new file

I'm trying to remove a single line from a file without creating a new file. For example in the file before it is modified it would be:
This
is
a
file
and after it would be:
This
a
file
However, with the way I'm currently trying to do it what happens is
This
a
file
I know I could do it by writing only the contents that I want into another file and then renaming that file and deleting the old one but I wanted to know if there is another way besides that.
I've tried using
if (string::npos != line.find(SPSID))
{
iPos = (pos - line.size() - 2);
stream.seekg(iPos);
for (int i = (pos - line.size() - 2); i < pos; i++)
{
//Sets input position to the beginning of the current line and replaces it with NULL
stream.put(0);
}
stream.seekp(iPos);
pos = stream.tellp();
}
as well as replacing stream.put(0); with stream.write(nullLine, iPos);
but neither have worked.
int Delete(string fileName, string SPSID)
{
//Variables
string line;
char input[MAX_CHAR];
fstream stream;
streamoff pos = 0;
streamoff iPos = 0;
//Opening and confirming opened
stream.open(fileName);
if (!stream.is_open())
{
cout << "File Did not open.\n" << endl;
return -1;
}
//Loops until the end of the file
do
{
//Gets one line from the file and converts it to c++ string
stream.getline(input, MAX_CHAR, '\n');
line.assign(input);
//Finds the current output position (which is the start of the next line)
pos = stream.tellp();
//Finds and checks if the SPSID is in the string. If it is then print to screen otherwise do nothing
if (string::npos != line.find(SPSID))
{
iPos = (pos - line.size() - 2);
stream.seekg(iPos);
for (int i = (pos - line.size() - 2); i < pos; i++)
{
//Sets input position to the begining of the current line and replaces it with ""
stream.put(0);
}
stream.seekp(iPos);
pos = stream.tellp();
}
} while (stream.eof() == false); //Checks that the end of the file has not been reached
stream << "Test" << endl;
//Resets the input and output positions to the begining of the stream
stream.seekg(0, stream.beg);
stream.seekp(0, stream.beg);
//Closing and Confirming closed
stream.close();
if (stream.is_open())
{
cout << "File did not close.\n" << endl;
return -2;
}
return 0;
}
I'm probably gonna have to make a new file and rename it but figured it was still worth asking if this is possible. :/
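For reference, a file is a flat sequence of bytes, so writing put(0) overwrites characters with NUL bytes rather than removing them; the bytes after the line would still have to be moved up. The temporary-file route mentioned in the question can be sketched as below, assuming C++17 for std::filesystem. The names copyWithoutMatches and deleteLines and the ".tmp" suffix are hypothetical choices for illustration.

```cpp
#include <filesystem>
#include <fstream>
#include <istream>
#include <ostream>
#include <string>

// Copy every line of `in` to `out` except lines containing `needle`.
// Returns the number of lines dropped.
int copyWithoutMatches(std::istream& in, std::ostream& out, const std::string& needle)
{
    int dropped = 0;
    std::string line;
    while (std::getline(in, line))
    {
        if (line.find(needle) != std::string::npos)
            ++dropped;
        else
            out << line << '\n';
    }
    return dropped;
}

// Rewrite `fileName` without the matching lines by writing a temporary file
// (fileName + ".tmp", an assumed naming scheme) and renaming it over the
// original. Returns the number of lines dropped, or -1 if a file failed to open.
int deleteLines(const std::string& fileName, const std::string& needle)
{
    const std::string tempName = fileName + ".tmp";
    int dropped = 0;
    {
        std::ifstream in(fileName);
        if (!in)
            return -1;
        std::ofstream out(tempName);
        if (!out)
            return -1;
        dropped = copyWithoutMatches(in, out, needle);
    }  // both streams are closed here, before the rename
    std::filesystem::rename(tempName, fileName);  // throws on failure
    return dropped;
}
```

The copy step takes streams so it can be exercised without touching the disk; deleteLines(fileName, SPSID) would take the place of the Delete function in the question.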

Shifting an array of structs C++

I am quite new to C++ programming and data structures and really need some help. I am working on an assignment where I have a text file with 100 lines, and on each line there is an item, a status (for sale or wanted), and a price. I need to go through the text file and add lines to an array of structs, and as I add lines I need to compare the new information with the previously submitted information. If there is a line that is wanted and has a price higher than a previously input item that is for sale, then the item should be removed from the array and the array of structs shifted.
The place where I am having trouble is in actually shifting all the structs once a line that satisfies the condition is found.
My issue is that when I try to shift the array of structs using the second for loop, nothing happens: I just get null structs and nothing seems to move.
If you can offer any help, it would be greatly appreciated.
Below is my current code; the text file is linked after it.
#include<iostream>
#include<fstream>
#include <string>
#include <algorithm>
#include <sstream>
using namespace std;
struct items
{
string type;
int status;
int price;
} itemArray [100];
int main(int argc, char *argv[]) {
int x = -1;
//int chickenCount = 0;
int counter = 0;
int itemsSold = 0;
int itemsRemoved = 0;
int itemsForSale = 0;
int itemsWanted = 0;
string itemType;
int itemStatus = 0;
int itemPrice = 0;
int match = 0;
ifstream myReadFile( "messageBoard.txt" ) ;
std::string line;
//char output[100];
if (myReadFile.is_open()) {
while (!myReadFile.eof()) {
getline(myReadFile,line); // Saves the line in STRING.
line.erase(std::remove(line.begin(), line.end(), ' '), line.end());
//cout<<line<<endl; // Prints our STRING.
x++;
std::string input = line;
std::istringstream ss(input);
std::string token;
while(std::getline(ss, token, ',')) {
counter++;
//std::cout << token << '\n';
if (counter>3){
counter =1;
}
//cout << x << endl;
if (counter == 1){
itemType = token;
//cout<< itemType<<endl;
}
if (counter == 2){
if (token == "forsale"){
itemStatus = 1;
//itemsForSale++;
}
if (token == "wanted"){
itemStatus = 0;
//itemsWanted++;
}
//cout<< itemStatus<<endl;
}
if (counter == 3){
itemPrice = atoi(token.c_str());
//cout<< itemPrice<<endl;
}
//cout<<"yo"<<endl;
}
if (x >= 0){
for (int i = 0; i<100;i++){
if (itemArray[i].type == itemType){
//cout<<itemType<<endl;
if(itemArray[i].status != itemStatus){
if (itemArray[i].status == 1){
if(itemPrice>=itemArray[i].price){
itemsSold++;
match =1;
//itemArray[i].type = "sold";
for (int j=i; j<100-1;j++){
//cout<<j<<endl;
itemArray[j].type = itemArray[j+1].type;
itemArray[j].status = itemArray[j+1].status;
itemArray[j].price = itemArray[j+1].price;
}
i =i-1;
break;
}
}
if (itemArray[i].status == 0){
if(itemArray[i].price>=itemPrice){
itemsSold++;
match = 1;
//itemArray[i].type = "sold";
for (int j=i; j<100-1;j++){
//cout<<j<<endl;
itemArray[j].type = itemArray[j+1].type;
itemArray[j].status = itemArray[j+1].status;
itemArray[j].price = itemArray[j+1].price;
}
i=i-1;
break;
}
}
}
}
}
}
if (counter == 3 && match == 0){
itemArray[(x)].type = itemType;
itemArray[(x)].status = itemStatus;
itemArray[(x)].price = itemPrice;
}
match = 0;
// cout << itemArray[x].type << " " << itemArray[x].status<<" "<<itemArray[x].price<<endl;
}
for(int i=0;i<100;i++){
cout<<itemArray[i].type<< " "<<itemArray[i].status<<" "<<itemArray[i].price<<endl;
}
//cout<<itemArray[1].price<<endl;
cout << itemsSold<<endl;
}
myReadFile.close();
return 0;
}
text file: https://drive.google.com/file/d/0B8O3izVcHJBzem0wMzA3VHoxNk0/view?usp=sharing
Thanks for the help
I see several issues in the code, but without being able to test it, I think the main problem is that you always insert new elements at position 'x', which corresponds to the line currently being read from the file, without taking into account any shifting of elements that has been done. You should insert the new element at the first empty slot (or just overwrite the old element instead of shifting everything).
Another issue is that you do not initialize the status and price in your array.
The best way would be to rewrite the code by using more standard C++ features, for example:
replace the items structure by a class with a constructor defining default values
use object copy (there is no need to copy a struct element by element)
use standard C++ containers like a list (see http://www.cplusplus.com/reference/list/list/) which has insert and erase methods
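The three suggestions above can be sketched together as follows. This is an illustration under assumptions, not the assignment's required design: it uses std::vector rather than std::list, the names Item and postItem are hypothetical, and the matching rule is the one in the question's code (same type, opposite status, buyer's price at least the seller's).

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Item
{
    std::string type;
    bool forSale = false;  // true: for sale, false: wanted
    int price = 0;         // default values replace the uninitialised array
};

// Try to match `incoming` against the stored items. On a match the stored
// item is erased (the vector closes the gap itself) and true is returned;
// otherwise `incoming` is appended at the end and false is returned.
bool postItem(std::vector<Item>& board, const Item& incoming)
{
    for (std::size_t i = 0; i < board.size(); ++i)
    {
        const Item& stored = board[i];
        if (stored.type != incoming.type || stored.forSale == incoming.forSale)
            continue;  // different item, or both selling / both wanting
        const Item& seller = stored.forSale ? stored : incoming;
        const Item& buyer = stored.forSale ? incoming : stored;
        if (buyer.price >= seller.price)
        {
            board.erase(board.begin() + static_cast<std::ptrdiff_t>(i));
            return true;
        }
    }
    board.push_back(incoming);  // always lands in the first free slot
    return false;
}
```

Because push_back appends after the last live element, the index no longer depends on how many lines have been read, which is the main problem described above. Counting the calls that return true gives itemsSold.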

C++ fstream outputs wrong data

Context first:
My program does some parallel calculations which are logged in a file. Threads are grouped by blocks (I'm using CUDA). The log file is formatted this way:
#begin run
({blockIdx,threadIdx}) {thread_info}
({blockIdx,threadIdx}) {thread_info}
...
#end run
I've written a function that should read the log file and sort each run's messages by thread.
//------------------------------------------------------------------------------
// Comparison struct for log file sorting
//------------------------------------------------------------------------------
typedef struct
{
bool operator()(const string &rString1 , const string &rString2)
{
int closeParenthesisLocalition1 = rString1.find_first_of(')');
int closeParenthesisLocalition2 = rString2.find_first_of(')');
int compResult = rString1.compare(0 , closeParenthesisLocalition1 + 2 , rString2 , 0 , closeParenthesisLocalition2 + 2);
return (compResult < 0);
}
} comp;
//------------------------------------------------------------------------------------
// Sort the log file. Lines with same prefix (blockIdx,ThreadIdx) will be grouped in file per run.
//------------------------------------------------------------------------------------
void CudaUnitTest::sortFile()
{
comp comparison;
deque<string> threadsPrintfs;
ifstream inputFile(m_strInputFile);
assert(inputFile.is_open());
//Read whole input file and close it. Saves disk accesses.
string strContent((std::istreambuf_iterator<char>(inputFile)), std::istreambuf_iterator<char>());
inputFile.close();
ofstream outputFile(m_strOutputFile);
assert(outputFile.is_open());
string strLine;
int iBeginRunIdx = -10; //-10 so that the first find() in the while loop starts at index 0
int iBeginRunNewLineOffset = 10; //index offset from the start of "#begin run\n" to its newline character
int iEndRunIdx;
int iLastNewLineIdx;
int iNewLineIdx;
while((iBeginRunIdx = strContent.find("#begin run\n" , iBeginRunIdx + iBeginRunNewLineOffset)) != string::npos)
{
iEndRunIdx = strContent.find("#end run\n" , iBeginRunIdx + iBeginRunNewLineOffset);
assert(iEndRunIdx != string::npos);
iLastNewLineIdx = iBeginRunIdx + iBeginRunNewLineOffset;
while((iNewLineIdx = strContent.find("\n" , iLastNewLineIdx + 1)) < iEndRunIdx)
{
strLine = strContent.substr(iLastNewLineIdx + 1 , iNewLineIdx);
if(verifyPrefix(strLine))
threadsPrintfs.push_back(strLine);
iLastNewLineIdx = iNewLineIdx;
}
//sort last run info
sort(threadsPrintfs.begin() , threadsPrintfs.end() , comparison);
threadsPrintfs.push_front("#begin run\n");
threadsPrintfs.push_back("#end run\n");
//output it
for(deque<string>::iterator it = threadsPrintfs.begin() ; it != threadsPrintfs.end() ; ++it)
{
assert(outputFile.good());
outputFile.write(it->c_str() , it->size());
}
outputFile.flush();
threadsPrintfs.clear();
}
outputFile.close();
}
The problem is that the resulting file has a lot of trash data. For example, a 6KB input log file generated a 192KB output log! It appears the output file contains many repetitions of the input file. When debugging the code, though, the deque showed the right values before and after the sort. I think there is something wrong with the ofstream write itself.
Edit: The function isn't running in parallel.
Just to show the final code. Note the change to the substr call: its second argument is a length, not an end index, so it now receives iNewLineIdx - iLastNewLineIdx.
//------------------------------------------------------------------------------------
// Sort the log file. Lines with same prefix (blockIdx,ThreadIdx) will be grouped in file per run.
//------------------------------------------------------------------------------------
void CudaUnitTest::sortFile()
{
comp comparison;
deque<string> threadsPrintfs;
ifstream inputFile(m_strInputFile);
assert(inputFile.is_open());
//Read whole input file and close it. Saves disk accesses.
string strContent((std::istreambuf_iterator<char>(inputFile)), std::istreambuf_iterator<char>());
inputFile.close();
ofstream outputFile(m_strOutputFile);
assert(outputFile.is_open());
string strLine;
int iBeginRunIdx = -10; //-10 so that the first find() in the while loop starts at index 0
int iBeginRunNewLineOffset = 10; //index offset from the start of "#begin run\n" to its newline character
int iEndRunIdx;
int iLastNewLineIdx;
int iNewLineIdx;
while((iBeginRunIdx = strContent.find("#begin run\n" , iBeginRunIdx + iBeginRunNewLineOffset)) != string::npos)
{
iEndRunIdx = strContent.find("#end run\n" , iBeginRunIdx + iBeginRunNewLineOffset);
assert(iEndRunIdx != string::npos);
iLastNewLineIdx = iBeginRunIdx + iBeginRunNewLineOffset;
while((iNewLineIdx = strContent.find("\n" , iLastNewLineIdx + 1)) < iEndRunIdx)
{
strLine = strContent.substr(iLastNewLineIdx + 1 , iNewLineIdx - iLastNewLineIdx);
if(verifyPrefix(strLine))
threadsPrintfs.push_back(strLine);
iLastNewLineIdx = iNewLineIdx;
}
//sort last run info
sort(threadsPrintfs.begin() , threadsPrintfs.end() , comparison);
threadsPrintfs.push_front("#begin run\n");
threadsPrintfs.push_back("#end run\n");
//output it
for(deque<string>::iterator it = threadsPrintfs.begin() ; it != threadsPrintfs.end() ; ++it)
{
assert(outputFile.good());
outputFile.write(it->c_str() , it->size());
}
threadsPrintfs.clear();
}
outputFile.close();
}
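The substr change in the final code comes down to std::string::substr(pos, count) taking a character count as its second argument. Passing an end index instead makes each extracted "line" run on into the lines after it, which would explain the repeated data in the output. A small sketch, where slice is a hypothetical helper name:

```cpp
#include <cstddef>
#include <string>

// Return the text in the half-open range [begin, end) of `text`.
// substr's second argument is a COUNT of characters, so the end index
// has to be converted with end - begin.
std::string slice(const std::string& text, std::size_t begin, std::size_t end)
{
    return text.substr(begin, end - begin);
}
```

For a log of "#begin run\nline A\nline B\n#end run\n", slice(log, 18, 24) yields "line B", whereas log.substr(18, 24) runs to the end of the string and returns "line B\n#end run\n".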