I am trying to overload operator>> for a custom PriorityQueue class I've been writing; the code is below:
/**
* \brief Overloaded stream extraction operator.
*
* Bitshift operator>>, i.e. extraction operator. Used to write data from an input stream
* into a targeted priority queue instance. The data is written into the queue in the format,
*
* \verbatim
[item1] + "\t" + [priority1] + "\n"
[item2] + "\t" + [priority2] + "\n"
...
* \endverbatim
*
* \todo Implement functionality for any generic Type and PriorityType.
* \warning Only works for primitives as template types currently!
* \param inStream Reference to input stream
* \param targetQueue Instance of priority queue to manipulate with extraction stream
* \return Reference to input stream containing target queue data
*/
template<typename Type, typename PriorityType> std::istream& operator>>(std::istream& inStream, PriorityQueue<Type, PriorityType>& targetQueue) {
// vector container for input storage
std::vector< std::pair<Type, PriorityType> > pairVec;
// cache to store line input from stream
std::string input;
std::getline(inStream, input);
if (typeid(inStream) == typeid(std::ifstream)) {
inStream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}
// loop until empty line
while (!input.empty()) {
unsigned int first = 0;
// loop over input cache
for (unsigned int i = 0; i < input.size(); ++i) {
// if char at index i of cache is a tab, break from loop
if (input.at(i) == '\t')
break;
++first;
}
std::string data_str = input.substr(0, first);
// convert from std::string to reqd Type
Type data = atoi(data_str.c_str());
std::string priority_str = input.substr(first);
// convert from std::string to reqd PriorityType
PriorityType priority = atof(priority_str.c_str());
pairVec.push_back(std::make_pair(data, priority));
// get line from input stream and store in input string
std::getline(inStream, input);
}
// enqueue pairVec container into targetQueue
//targetQueue.enqueueWithPriority(pairVec);
return inStream;
}
This currently works for stdin (std::cin) input; however, it doesn't work for fstream input - the very first getline always reads an empty line from the input, so the while loop is never entered, and I can't seem to skip it (I tried with inStream.ignore() as you can see above, but this doesn't work).
Edit:
Currently I just want to get it working for file input, ignoring the fact that it only works for an int data type and a double priority type right now - these aren't relevant (and neither is the actual manipulation of the targetQueue object itself).
For the moment I'm just concerned with resolving the blank-line issue when trying to stream through file-input.
Example file to pass:
3 5.6
2 6.3
1 56.7
12 45.1
where the numbers on each line are \t separated.
Example testing:
#include "PriorityQueue.h"
#include <sstream>
#include <iostream>
#include <fstream>
int main(void) {
// create pq of MAX binary heap type
PriorityQueue<int, double> pq(MAX);
std::ifstream file("test.txt");
file >> pq;
std::cout << pq;
}
where "test.txt" is the in the format of the example file above.
Edit: Simpler Example
Code:
#include <iostream>
#include <fstream>
#include <vector>
class Example {
public:
Example() {}
size_t getSize() const { return vec.size(); }
friend std::istream& operator>>(std::istream& is, Example& example);
private:
std::vector< std::pair<int, double> > vec;
};
std::istream& operator>>(std::istream& is, Example& example) {
int x;
double y;
while (is >> x >> y) {
std::cout << "in-loop" << std::endl;
example.vec.push_back(std::make_pair(x, y));
}
return is;
}
int main(void) {
Example example;
std::ifstream file("test.txt");
file >> example;
file.close();
std::cout << example.getSize() << std::endl;
return 0;
}
The operator is already overloaded -- and should be overloaded -- for many types. Let those functions do their work:
template<typename Type, typename PriorityType>
std::istream& operator>>(std::istream& inStream, PriorityQueue<Type, PriorityType>& targetQueue)
{
std::vector< std::pair<Type, PriorityType> > pairVec;
Type data;
PriorityType priority;
while(inStream >> data >> priority)
pairVec.push_back(std::make_pair(data, priority));
targetQueue.enqueueWithPriority(pairVec);
return inStream;
}
Related
I have a .csv file which has ~3GB of data. I want to read all that data and process it. The following program reads the data from a file and stores it into a std::vector<std::vector<std::string>>. However, the program runs for too long and the application (vscode) freezes and needs to be restarted. What have I done wrong?
#include <algorithm>
#include <iostream>
#include <fstream>
#include "sheet.hpp"
extern std::vector<std::string> split(const std::string& str, const std::string& delim);
int main() {
Sheet sheet;
std::ifstream inputFile;
inputFile.open("C:/Users/1032359/cpp-projects/Straggler Job Analyzer/src/part-00001-of-00500.csv");
std::string line;
while(inputFile >> line) {
sheet.addRow(split(line, ","));
}
return 0;
}
// split and Sheet's member functions have been tested thoroughly and work fine. split has a complexity of N^2 though...
EDIT1: The file read has been fixed as per the suggestions in the comments.
The Split function:
std::vector<std::string> split(const std::string& str, const std::string& delim) {
std::vector<std::string> vec_of_tokens;
std::string token;
for (auto character : str) {
if (std::find(delim.begin(), delim.end(), character) != delim.end()) {
vec_of_tokens.push_back(token);
token = "";
continue;
}
token += character;
}
vec_of_tokens.push_back(token);
return vec_of_tokens;
}
EDIT2:
dummy csv row:
5612000000,5700000000,4665712499,798,3349189123,0.02698,0.06714,0.07715,0.004219,0.004868,0.06726,7.915e-05,0.0003681,0.27,0.00293,3.285,0.008261,0,0,0.01608
limits:
field1: starting timestamp (nanosecs)
field2: ending timestamp (nanosecs)
field3: job id (<= 1,000,000)
field4: task id (<= 10,000)
field5: machine id (<= 30,000,000)
field6: CPU time (sorry, no clue)
field7-20: no idea, unused for the current stage, but needed for later stages.
EDIT3: Required Output
remember the .thenby function in Excel?
the sorting order here is sort first on 5th column (1-based indexing), then on 3rd column and lastly on 4th column; all ascending.
I would start by defining a class to carry the information about one record and add overloads for operator>> and operator<< to help reading/writing records from/to streams. I'd probably add a helper to deal with the comma delimiter too.
First, the set of headers I've used:
#include <algorithm> // sort
#include <array> // array
#include <cstdint> // integer types
#include <filesystem> // filesystem
#include <fstream> // ifstream
#include <iostream> // cout
#include <iterator> // istream_iterator
#include <tuple> // tie
#include <vector> // vector
A simple delimiter helper could look like below. It discards (ignore()) the delimiter if it's in the stream or sets the failbit on the stream if the delimiter is not there.
template <char Char> struct delimiter {};
template <char Char> // read a delimiter
std::istream& operator>>(std::istream& is, const delimiter<Char>) {
if (is.peek() == Char) is.ignore();
else is.setstate(std::ios::failbit);
return is;
}
template <char Char> // write a delimiter
std::ostream& operator<<(std::ostream& os, const delimiter<Char>) {
return os.put(Char);
}
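Before wiring the helper into a record reader, it may help to see it in isolation. A minimal sanity check might look like this (my own sketch; it additionally assumes <sstream> is included):
std::istringstream good("12,34"), bad("12;34");
int a = 0, b = 0;
delimiter<','> comma;
good >> a >> comma >> b; // reads 12, consumes ',', reads 34
std::cout << a << ' ' << b << '\n'; // prints: 12 34
bad >> a >> comma >> b; // reads 12, then ';' != ',' sets the failbit
std::cout << std::boolalpha << bad.fail() << '\n'; // prints: true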
The actual record class can, with the information you've supplied, look like this:
struct record {
uint64_t start; // ns
uint64_t end; // ns
uint32_t job_id; // [0,1000000]
uint16_t task_id; // [0,10000]
uint32_t machine_id; // [0,30000000]
double cpu_time;
std::array<double, 20 - 6> unknown;
};
Reading such a record from a stream can then be done like this, using the delimiter class template (instantiated to use a comma and newline as delimiters):
std::istream& operator>>(std::istream& is, record& r) {
delimiter<','> del;
delimiter<'\n'> nl;
// first read the named fields
if (is >> r.start >> del >> r.end >> del >> r.job_id >> del >>
r.task_id >> del >> r.machine_id >> del >> r.cpu_time)
{
// then read the unnamed fields:
for (auto& unk : r.unknown) is >> del >> unk;
}
return is >> nl;
}
Writing a record is similarly done by:
std::ostream& operator<<(std::ostream& os, const record& r) {
delimiter<','> del;
delimiter<'\n'> nl;
os <<
r.start << del <<
r.end << del <<
r.job_id << del <<
r.task_id << del <<
r.machine_id << del <<
r.cpu_time;
for(auto&& unk : r.unknown) os << del << unk;
return os << nl;
}
Reading the whole file into memory, sorting it and then printing the result:
int main() {
std::filesystem::path filename = "C:/Users/1032359/cpp-projects/"
"Straggler Job Analyzer/src/part-00001-of-00500.csv";
std::vector<record> records;
// Reserve space for "3GB" / 158 (the length of a record + some extra bytes)
// records. Increase the 160 below if your records are actually longer on average:
records.reserve(std::filesystem::file_size(filename) / 160);
// open the file
std::ifstream inputFile(filename);
// copy everything from the file into `records`
std::copy(std::istream_iterator<record>(inputFile),
std::istream_iterator<record>{},
std::back_inserter(records));
// sort on columns 5-3-4 (ascending)
auto sorter = [](const record& lhs, const record& rhs) {
return std::tie(lhs.machine_id, lhs.job_id, lhs.task_id) <
std::tie(rhs.machine_id, rhs.job_id, rhs.task_id);
};
std::sort(records.begin(), records.end(), sorter);
// display the result
for(auto& r : records) std::cout << r;
}
The above process takes ~2 minutes on my old computer with spinning disks. If this is too slow, I'd measure the time of the long running parts (a timing sketch follows the struct below):
reserve
copy
sort
Then, you can probably use that information to try to figure out where you need to improve it. For example, if sorting is a bit slow, it could help to use a std::vector<double> instead of a std::array<double, 20-6> to store the unnamed fields:
struct record {
record() : unknown(20-6) {}
uint64_t start; // ns
uint64_t end; // ns
uint32_t job_id; // [0,1000000]
uint16_t task_id; // [0,10000]
uint32_t machine_id; // [0,30000000]
double cpu_time;
std::vector<double> unknown;
};
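For the measuring step suggested above, a minimal std::chrono sketch could look like this (my own sketch, reusing the names from the main() above; the file opening between reserve and copy is elided, and <chrono> would need to be added to the header list):
auto t0 = std::chrono::steady_clock::now();
records.reserve(std::filesystem::file_size(filename) / 160);
auto t1 = std::chrono::steady_clock::now();
std::copy(std::istream_iterator<record>(inputFile),
std::istream_iterator<record>{},
std::back_inserter(records));
auto t2 = std::chrono::steady_clock::now();
std::sort(records.begin(), records.end(), sorter);
auto t3 = std::chrono::steady_clock::now();
// report each phase in milliseconds
using ms = std::chrono::milliseconds;
std::cout << "reserve: " << std::chrono::duration_cast<ms>(t1 - t0).count() << " ms\n"
<< "copy: " << std::chrono::duration_cast<ms>(t2 - t1).count() << " ms\n"
<< "sort: " << std::chrono::duration_cast<ms>(t3 - t2).count() << " ms\n";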
I would suggest a slightly different approach:
Do NOT parse the entire row, only extract fields that are used for sorting
Note that your stated ranges require a small number of bits, which together fit in one 64-bit value:
30,000,000 - 25 bits
10,000 - 14 bits
1,000,000 - 20 bits
Save the "raw" source line in your vector, so that you can write it out as needed.
Here is what I got:
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <chrono>
#include <algorithm>
#include <cstdint> // uint64_t
#include <cstdlib> // strtoull
struct Record {
uint64_t key;
std::string str;
Record(uint64_t key, std::string&& str)
: key(key)
, str(std::move(str))
{}
};
int main()
{
auto t1 = std::chrono::high_resolution_clock::now();
std::ifstream src("data.csv");
std::vector<Record> v;
std::string str;
uint64_t key(0);
while (src >> str)
{
size_t pos = str.find(',') + 1;
pos = str.find(',', pos) + 1;
char* p(nullptr);
uint64_t f3 = strtoull(&str[pos], &p, 10);
uint64_t f4 = strtoull(++p, &p, 10);
uint64_t f5 = strtoull(++p, &p, 10);
key = f5 << 34;
key |= f3 << 14;
key |= f4;
v.emplace_back(key, std::move(str));
}
std::sort(v.begin(), v.end(), [](const Record& a, const Record& b) {
return a.key < b.key;
});
auto t2 = std::chrono::high_resolution_clock::now();
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << std::endl;
std::ofstream out("out.csv");
for (const auto& r : v) {
out.write(r.str.c_str(), r.str.length());
out.write("\n", 1);
}
auto t3 = std::chrono::high_resolution_clock::now();
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t3 - t2).count() << std::endl;
}
Of course, you can reserve space in your vector upfront to avoid reallocation.
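For instance, a rough sketch (the 160-byte average line length is a guess, mirroring the estimate in the earlier answer, and std::filesystem requires C++17):
// before the read loop:
v.reserve(std::filesystem::file_size("data.csv") / 160);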
I've generated a file with 18,000,000 records. My timing shows ~30 seconds for reading / sorting the file, and ~200 seconds to write the output.
UPDATE:
Replaced streaming with out.write(), reduced writing time from 200 seconds to 17!
As an alternate way to solve this problem, I would suggest not reading all the data into memory, but using the minimum amount of RAM needed to sort the huge CSV file: a std::vector of line offsets.
The important thing is to understand the concept, not the precise implementation.
As the implementation only needs 8 bytes per line (in 64-bit mode), sorting the 3 GB data file only requires roughly 150 MB of RAM. The drawback is that the numbers have to be parsed several times for the same line, roughly log2(17e6) = 24 times during the sort. However, I think this overhead is partially compensated by the smaller memory footprint and by not having to parse all the numbers of each row.
#include <Windows.h>
#include <cstdint>
#include <cstdlib> // atoll
#include <cstring> // strchr
#include <vector>
#include <algorithm>
#include <array>
#include <fstream>
std::array<uint64_t, 5> readFirst5Numbers(const char* line)
{
std::array<uint64_t, 5> nbr;
for (int i = 0; i < 5; i++)
{
nbr[i] = atoll(line);
line = strchr(line, ',') + 1;
}
return nbr;
}
int main()
{
// 1. Map the input file in memory
const char* inputPath = "C:/Users/1032359/cpp-projects/Straggler Job Analyzer/src/part-00001-of-00500.csv";
HANDLE fileHandle = CreateFileA(inputPath, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
DWORD highsize;
DWORD lowsize = GetFileSize(fileHandle, &highsize);
HANDLE mappingHandle = CreateFileMapping(fileHandle, NULL, PAGE_READONLY, highsize, lowsize, NULL);
size_t fileSize = (size_t)lowsize | (size_t)highsize << 32;
const char* memoryAddr = (const char*)MapViewOfFile(mappingHandle, FILE_MAP_READ, 0, 0, fileSize);
// 2. Find the offset of the start of lines
std::vector<size_t> linesOffset;
linesOffset.push_back(0);
for (size_t i = 0; i < fileSize; i++)
if (memoryAddr[i] == '\n')
linesOffset.push_back(i + 1);
linesOffset.pop_back();
// 3. sort the offset according to some logic
std::sort(linesOffset.begin(), linesOffset.end(), [memoryAddr](const size_t& offset1, const size_t& offset2) {
std::array<uint64_t, 5> nbr1 = readFirst5Numbers(memoryAddr + offset1);
std::array<uint64_t, 5> nbr2 = readFirst5Numbers(memoryAddr + offset2);
if (nbr1[4] != nbr2[4])
return nbr1[4] < nbr2[4];
if (nbr1[2] != nbr2[2])
return nbr1[2] < nbr2[2];
return nbr1[3] < nbr2[3]; // tie-break on the 4th column (index 3), not the 5th again
});
// 4. output sorted array
const char* outputPath = "C:/Users/1032359/cpp-projects/Straggler Job Analyzer/output/part-00001-of-00500.csv";
std::ofstream outputFile;
outputFile.open(outputPath);
for (size_t offset : linesOffset)
{
const char* line = memoryAddr + offset;
size_t len = strchr(line, '\n') + 1 - line;
outputFile.write(line, len);
}
}
I am reasonably new to programming in C++ and I'm having some trouble reading data from a text file into an array of structures. I have looked around similar posts to try and find a solution; however, I have been unable to make any of it work for me and wanted to ask for some help. Below is an example of my data set (P.S. I will be using multiple data sets of varying sizes):
00010 0
00011 1
00100 0
00101 1
00110 1
00111 0
01000 0
01001 1
Below is my code:
const int variables = 5;
typedef struct {
int variables[variables];
int classification;
} myData;
//Get the number of rows in the file
int readData(string dataset)
{
int numLines = 0;
string line;
ifstream dataFile(dataset);
while (getline(dataFile, line))
{
++numLines;
}
return numLines;
}
//Store data set into array of data structure
int storeData(string dataset)
{
int numLines = readData(dataset);
myData *dataArray = new myData[numLines];
...
return 0;
}
int main()
{
storeData("dataset.txt");
What I am trying to achieve is to store the first 5 integers of each row of the text file into the 'variables' array in the 'myData' structure and then store the last integer separated by white space into the 'classification' variable and then store that structure into the array 'dataArray' and then move onto the next row.
For example, the first structure in the array will have the variables [00010] and the classification will be 0. The second will have the variables [00011] and the classification will be 1, and so on.
I would really appreciate some help with this, cheers!
Provide stream extraction and stream insertion operators for your type:
#include <cstddef> // std::size_t
#include <cstdlib> // EXIT_FAILURE
#include <cctype> // std::isspace(), std::isdigit()
#include <vector> // std::vector<>
#include <iterator> // std::istream_iterator<>, std::ostream_iterator<>
#include <fstream> // std::ifstream
#include <iostream> // std::cout, std::cerr, std::cin
#include <algorithm> // std::copy()
constexpr std::size_t size{ 5 };
struct Data {
int variables[size];
int classification;
};
// stream extraction operator
std::istream& operator>>(std::istream &is, Data &data)
{
Data temp; // don't write directly to data since extraction might fail
// at any point which would leave data in an undefined state.
int ch; // signed integer because std::istream::peek() and ...get() return
// EOF when they encounter the end of the file which is usually -1.
// don't feed std::isspace
// signed values
while ((ch = is.peek()) != EOF && std::isspace(static_cast<unsigned>(ch)))
is.get(); // read and discard whitespace
// as long as
// +- we didn't read all variables
// | +-- the input stream is in good state
// | | +-- and the character read is not EOF
// | | |
for (std::size_t i{}; i < size && is && (ch = is.get()) != EOF; ++i)
if (std::isdigit(static_cast<unsigned>(ch)))
temp.variables[i] = ch - '0'; // if it is a digit, assign it to our temp
else is.setstate(std::ios_base::failbit); // else set the stream to a
// failed state which will
// cause the loop to end (is)
if (!(is >> temp.classification)) // if extraction of the integer following the
return is; // variables fails, exit.
data = temp; // everything fine, assign temp to data
return is;
}
// stream insertion operator
std::ostream& operator<<(std::ostream &os, Data const &data)
{
std::copy(std::begin(data.variables), std::end(data.variables),
std::ostream_iterator<int>{ os });
os << ' ' << data.classification;
return os;
}
int main()
{
char const *filename{ "test.txt" };
std::ifstream is{ filename };
if (!is.is_open()) {
std::cerr << "Failed to open \"" << filename << "\" for reading :(\n\n";
return EXIT_FAILURE;
}
// read from ifstream
std::vector<Data> my_data{ std::istream_iterator<Data>{ is },
std::istream_iterator<Data>{} };
// print to ostream
std::copy(my_data.begin(), my_data.end(),
std::ostream_iterator<Data>{ std::cout, "\n" });
}
Uncommented it looks less scary:
std::istream& operator>>(std::istream &is, Data &data)
{
Data temp;
int ch;
while ((ch = is.peek()) != EOF && std::isspace(static_cast<unsigned>(ch)))
is.get();
for (std::size_t i{}; i < size && is && (ch = is.get()) != EOF; ++i)
if (std::isdigit(static_cast<unsigned>(ch)))
temp.variables[i] = ch - '0';
else is.setstate(std::ios_base::failbit);
if (!(is >> temp.classification))
return is;
data = temp;
return is;
}
std::ostream& operator<<(std::ostream &os, Data const &data)
{
std::copy(std::begin(data.variables), std::end(data.variables),
std::ostream_iterator<int>{ os });
os << ' ' << data.classification;
return os;
}
It looks like you are trying to keep binary values as an integer index. If that is the case, they will be converted to plain integers internally, and you may need an int-to-binary conversion again.
If you want to preserve the data as it is in the text file, then you need to choose a char/string type for the index value. For the classification, it seems the value will be either 0 or 1, so you can choose bool as the data type.
#include <iostream>
#include <map>
using namespace std;
std::map<string, bool> myData;
int main()
{
// THIS IS A SAMPLE INSERT. INTRODUCE A LOOP FOR THE INSERTS (see the sketch below).
/*00010 0
00011 1
00100 0
00101 1
00110 1*/
myData.insert(std::pair<string, bool>("00010", 0));
myData.insert(std::pair<string, bool>("00011", 1));
myData.insert(std::pair<string, bool>("00100", 0));
myData.insert(std::pair<string, bool>("00101", 1));
myData.insert(std::pair<string, bool>("00110", 1));
// Display contents
std::cout << "My Data:\n";
std::map<string, bool>::iterator it;
for (it=myData.begin(); it!=myData.end(); ++it)
std::cout << it->first << " => " << it->second << '\n';
return 0;
}
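The loop the comment asks for could be as simple as the sketch below (the file name "dataset.txt" is an assumption; the two-column format is taken from the question, and <fstream> would need to be included):
std::ifstream dataFile("dataset.txt");
std::string pattern;
bool classification;
// operator>> splits on whitespace and reads 0/1 into the bool directly
while (dataFile >> pattern >> classification)
myData.insert(std::pair<string, bool>(pattern, classification));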
The Input file:
1 4 red
2 0 blue
3 1 white
4 2 green
5 2 black
What I want to do is take every row and store it into a 2D array.
for example:
array[0][0] = 1
array[0][1] = 4
array[0][2] = red
array[1][0] = 2
array[1][1] = 0
array[1][2] = blue
etc..
The code I am working on:
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
using namespace std;
int convert_str_to_int(const string& str) {
int val;
stringstream ss;
ss << str;
ss >> val;
return val;
}
string getid(string str){
istringstream iss(str);
string pid;
iss >> pid;
return pid;
}
string getnumberofcolors(string str){
istringstream iss(str);
string pid,c;
iss >> pid>>c;
return c;
}
int main() {
string lineinfile ;
vector<string> lines;
ifstream infile("myinputfile.txt");
if ( infile ) {
while ( getline( infile , lineinfile ) ) {
lines.push_back(lineinfile);
}
}
//first line - number of items
int numofitems = convert_str_to_int(lines[0]);
//lopps items info
string ar[numofitems ][3];
int i = 1;
while(i<=numofitems ){
ar[i][0] = getid(lines[i]);
i++;
}
while(i<=numofitems ){
ar[i][1] = getarrivel(lines[i]);
i++;
}
infile.close( ) ;
return 0 ;
}
When I add the second while loop my program stops working for some reason!
Is there any other way to do this, or a solution to fix my program?
It's easier to just show you a much better way to do it:
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int main() {
ifstream infile("myinputfile.txt"); // Streams skip spaces and line breaks
//first line - number of items
size_t numofitems;
infile >> numofitems;
//lopps items info
vector<pair<int, pair<int, string>>> ar(numofitems); // Or use std::tuple
for(size_t i = 0; i < numofitems; ++i){
infile >> ar[i].first >> ar[i].second.first >> ar[i].second.second;
}
// infile.close( ) ; // Not needed -- closed automatically
return 0 ;
}
You are probably solving some kind of simple algorithmic task. Take a look at std::pair and std::tuple, which are useful not only as container for two elements, but because of their natural comparison operators.
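For example, their lexicographic comparison gives you multi-key ordering for free; a small illustrative sketch:
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>
std::vector<std::tuple<int, int, std::string>> items;
items.push_back(std::make_tuple(2, 1, "b"));
items.push_back(std::make_tuple(1, 5, "a"));
items.push_back(std::make_tuple(1, 2, "c"));
// tuples compare element by element, left to right
std::sort(items.begin(), items.end());
// resulting order: (1,2,"c"), (1,5,"a"), (2,1,"b")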
The answer given is indeed a much better solution than yours. I figured I should point out some of your design flaws and give some tips to improve it.
You redefined a function that already exists in the standard, which is
std::stoi() to convert a string to an integer. Remember, if a function
exists already, it's OK to reuse it, don't think you have to reinvent what's
already been invented. If you're not sure, search your favorite C++ reference guide.
The solution stores the data "as is", while you store each row as a full string. This doesn't really make sense: you know what the data is beforehand, so use that to your advantage. Plus, when you store a line of data like that it must be parsed, converted, and then constructed before it can be used in any way, whereas in the solution the data is constructed once and only once.
Because the format of the data is known beforehand an even better way to load the information is by defining a structure, along with input/output operators. This would look something like this:
struct MyData
{
int num1;
int num2;
std::string color;
friend std::ostream& operator << (std::ostream& os, const MyData& d);
friend std::istream& operator >> (std::istream& is, MyData& d);
};
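The operator bodies themselves could be as simple as this sketch (assuming the three whitespace-separated fields shown in the input file):
std::istream& operator >> (std::istream& is, MyData& d)
{
return is >> d.num1 >> d.num2 >> d.color;
}
std::ostream& operator << (std::ostream& os, const MyData& d)
{
return os << d.num1 << ' ' << d.num2 << ' ' << d.color;
}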
Then you could simply do something like this:
...
MyData tmp;
infile >> tmp;
vData.push_back(tmp);
...
There is no question of intent: we are obviously reading a data type from a stream and storing it in a container. If anything, it's clearer as to what you are doing than either your original solution or the provided one.
I'm using C++ and I'm reading from a file lines like this:
D x1 x2 x3 y1
My code has:
struct gate {
char name;
vector <string> inputs;
string output;
};
In the main function:
vector <gate> eco;
int c=0;
int n=0;
int x = line.length();
while(netlist[c][0])
{
eco.push_back(gate());
eco[n].name = netlist[c][0];
eco[n].output[0] = netlist[c][x-2];
eco[n].output[1] = netlist[c][x-1];
}
where netlist is a 2D array I have copied the file into.
I need help to loop over the inputs and save them in the vector eco.
I don’t fully understand the sense of the 2D array but I suspect it’s redundant. You should use this code:
ifstream somefile(path);
vector<gate> eco;
gate g;
while (somefile >> g)
eco.push_back(g);
// or, simpler, requiring #include <iterator>
vector<gate> eco((std::istream_iterator<gate>(somefile)),
std::istream_iterator<gate>()); // extra parentheses avoid the "most vexing parse"
And overload operator >> appropriately for your type gate:
std::istream& operator >>(std::istream& in, gate& value) {
// Error checking … return as soon as a failure is encountered.
if (not (in >> value.name))
return in;
value.inputs.resize(3);
return in >> value.inputs[0] >>
value.inputs[1] >>
value.inputs[2] >>
value.output;
}
I've already asked 2 questions kind of related to this project, and I've reached this conclusion: writing the size of the struct to the file and then reading it back is the best way to do this.
I'm creating a program for a homework assignment that will allow me to maintain inventory. I need to read / write multiple structs of the same type to a file.
The problem is... this is really complicated and i'm having trouble wrapping my head around the whole process. I've seen a bunch of examples and i'm trying to put it all together. I'm getting compile errors... and I have zero clue on how to fix them. If you could help me on this I would be so appreciative... thank you. I'm so lost right now...
**** HOPEFULLY THE LAST EDIT #3 *************
My Code:
// Project 5.cpp : main project file.
#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
using namespace System;
using namespace std;
#pragma hdrstop
int checkCommand (string line);
template<typename Template>
void readFromFile(Template&);
template<typename Template>
void writeToFile(Template&);
template<typename T>
void writeVector(ofstream &out, const vector<T> &vec);
template<typename Template>
void readVector(ifstream& in, vector<Template>& vec);
struct InventoryItem {
string Item;
string Description;
int Quantity;
int wholesaleCost;
int retailCost;
int dateAdded;
} ;
int main(void)
{
cout << "Welcome to the Inventory Manager extreme! [Version 1.0]" << endl;
vector<InventoryItem> structList;
ofstream out("data.dat");
writeVector( out, structList );
while (1)
{
string line = "";
cout << endl;
cout << "Commands: " << endl;
cout << "1: Add a new record " << endl;
cout << "2: Display a record " << endl;
cout << "3: Edit a current record " << endl;
cout << "4: Exit the program " << endl;
cout << endl;
cout << "Enter a command 1-4: ";
getline(cin , line);
int rValue = checkCommand(line);
if (rValue == 1)
{
cout << "You've entered a invalid command! Try Again." << endl;
} else if (rValue == 2){
cout << "Error calling command!" << endl;
} else if (!rValue) {
break;
}
}
system("pause");
return 0;
}
int checkCommand (string line)
{
int intReturn = atoi(line.c_str());
int status = 3;
switch (intReturn)
{
case 1:
break;
case 2:
break;
case 3:
break;
case 4:
status = 0;
break;
default:
status = 1;
break;
}
return status;
}
template <typename Template>
void readFromFile(Template& t)
{
ifstream in("data.dat");
readVector(in, t); // Need to figure out how to pass the vector structList via a Template
in.close();
}
template <typename Template>
void writeToFile(Template& t)
{
ofstream out("data.dat");
writeVector(out, t); // Need to figure out how to pass the vector structList via a Template
out.close();
}
template<typename T>
void writeVector(ofstream &out, const vector<T> &vec)
{
out << vec.size();
for(vector<T>::const_iterator i = vec.begin(); i != vec.end(); ++i)
{
out << *i; // SUPER long compile error
}
}
template<typename T>
vector<T> readVector(ifstream &in)
{
size_t size;
in >> size;
vector<T> vec;
vec.reserve(size);
for(int i = 0; i < size; ++i)
{
T tmp;
in >> tmp;
vec.push_back(tmp);
}
return vec;
}
My Compile Errors:
1>.\Project 5.cpp(128) : error C2679: binary '<<' : no operator found which takes a right-hand operand of type 'const InventoryItem' (or there is no acceptable conversion)
1> C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\ostream(653): could be 'std::basic_ostream<_Elem,_Traits> &std::operator <<<char,std::char_traits<char>>(std::basic_ostream<_Elem,_Traits> &,const char *)'
1> with
That is the only error I'm getting now. I see your code is SO much better. My new compiler error is SUPER long. I've shown where the error points to. Could you help me just one last time?
Your read and write functions are buggy. In particular, you should be doing something like this instead:
template<typename T>
void write(ofstream &out, const T &t)
{
out << t;
}
OLD: bind1st requires you to include functional for it to work:
#include <functional>
Instead of dealing with all these functions and such, though, it'd be better to rely on iterators:
template<typename T>
void writeVector(ofstream &out, const vector<T> &vec)
{
out << vec.size();
for(typename vector<T>::const_iterator i = vec.begin(); i != vec.end(); ++i)
{
out << *i;
}
}
template<typename T>
vector<T> readVector(ifstream &in)
{
size_t size;
in >> size;
vector<T> vec;
vec.reserve(size);
for(int i = 0; i < size; ++i)
{
T tmp;
in >> tmp;
vec.push_back(tmp);
}
return vec;
}
You'd want functions to read and write your InventoryItem as well, probably:
ostream &operator<<(ostream &out, const InventoryItem &i)
{
out << i.Item << i.Description; // FIXME Read/write strings properly.
out << i.Quantity;
out << i.wholesaleCost << i.retailCost;
out << i.dateAdded;
return out;
}
istream &operator>>(istream &in, InventoryItem &i)
{
// Keep in same order as operator<<(ostream &, const InventoryItem &)!
in >> i.Item >> i.Description; // FIXME Read/write strings properly.
in >> i.Quantity;
in >> i.wholesaleCost >> i.retailCost;
in >> i.dateAdded;
return in;
}
NOTE: This is not an answer to the compilation errors you are getting, but rather a broader view of the persistence problem you are handling.
Serialization and deserialization are not the simplest problems you can work on. My advice would be to invest in learning libraries (boost::serialization) and to use them. They have already worked out many of the problems you will face at one time or another. Plus, they already support different output formats (binary, XML, JSON...).
The first thing you must decide - that is, if you decide to go ahead and implement your own - is what the file format will be and whether it suits all your needs. Will it always be used in the same environment? Will the platform change (32/64 bits)? You can decide to make it binary, as that is the simplest, or make it readable for a human being. If you decide on XML, JSON or any other more complex format, just forget it and use a library.
The simplest solution is working with a binary file, and it is also the solution that will give you the smallest file. On the other hand, it is quite sensitive to architecture changes (say you migrate from a 32 to a 64 bit architecture/OS).
After deciding the format you will need to work on the extra information that is not part of your objects now but needs to be inserted into the file for later retrieval. Then start working (and testing) from the smallest parts to more complex elements.
Another piece of advice would be to start working with the simplest, most defined part and build from there. Avoid templates as much as possible to begin with, and once you have things clear and working for a given data type, work on how to generalize them for any other type.
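To give a feel for how much work a library saves, a boost::serialization version of the whole task is roughly the sketch below (assuming Boost is available; the archive types and headers are the library's, the rest mirrors the structures in this question):
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/vector.hpp>
#include <fstream>
#include <string>
#include <vector>
struct InventoryItem {
std::string Item, Description;
int Quantity, wholesaleCost, retailCost, dateAdded;
template <class Archive>
void serialize(Archive & ar, const unsigned int /*version*/)
{
// one line per member; the same function both saves and loads
ar & Item & Description & Quantity & wholesaleCost & retailCost & dateAdded;
}
};
int main()
{
std::vector<InventoryItem> items;
{ // save
std::ofstream ofs("data.txt");
boost::archive::text_oarchive oa(ofs);
oa << items;
}
{ // load
std::ifstream ifs("data.txt");
boost::archive::text_iarchive ia(ifs);
ia >> items;
}
}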
Disclaimer: I have written the code directly on the browser, so there could be some errors, typos or just about anything :)
Text
The first simple approach is just writing a textual representation of the data. The advantage is that it is portable and shorter in code (if not simpler) than the binary approach. The resulting files will be bigger, but human-readable.
At this point you need to know how reading text works with iostreams. Particularly, whenever you try to read a string the system will read characters until it reaches a separator. This means that the following code:
std::string str;
std::cin >> str;
will only read up to the first space, tab or end of line. When reading numbers (ints as an example) the system will read all valid digits up to the first non-valid digit. That is:
int i;
std::cin >> i;
with input 12345a will consume all characters up to 'a'. You need to know this because that will influence the way you persist data for later retrieval.
// input: "This is a long Description"
std::string str;
std::cin >> str; // Will read 'This' but ignore the rest
int a = 1;
int b = 2;
std::cout << a << b; // will produce '12'
// input: 12
int read;
std::cin >> read; // will read 12, not 1
So you pretty much need separators to insert in the output and to parse the input. For sample purposes I will select the '|' character. It must be a character that does not appear in the text fields.
It will also be a good idea to not only separate elements but also add some extra info (the size of the vector). For the elements in the vector you can decide to use a different separator. If you want to be able to read the file manually, you can use '\n' so that each item is on its own line.
namespace textual {
std::ostream & operator<<( std::ostream& o, InventoryItem const & data )
{
return o << data.Item << "|" << data.Description << "|" << data.Quantity
<< "|" << data. ...;
}
std::ostream & operator<<( std::ostream & o, std::vector<InventoryItem> const & v )
{
o << v.size() << std::endl;
for ( std::size_t i = 0; i < v.size(); ++i ) {
o << v[i] << std::endl; // will call the above defined operator<<
}
return o;
}
}
For reading, you will need to split the input by '\n' to get each element and then with '|' to parse the InventoryItem:
namespace textual {
template <typename T>
void parse( std::string const & str, T & data )
{
std::istringstream st( str ); // Create a stream with the string
st >> data; // use the operator>>( std::istream &, T & ) defined for T
}
std::istream & operator>>( std::istream & i, InventoryItem & data )
{
getline( i, data.Item, '|' );
getline( i, data.Description, '|' );
std::string tmp;
getline( i, tmp, '|' ); // Quantity in text
parse( tmp, data.Quantity );
getline( i, tmp, '|' ); // wholesaleCost in text
parse( tmp, data.wholesaleCost );
// ...
return i;
}
std::istream & operator>>( std::istream & is, std::vector<InventoryItem> & data )
{
// note: the stream is named 'is' here so the loop counter 'i' doesn't shadow it
int size;
std::string tmp;
getline( is, tmp ); // size line; without the last parameter getline splits by lines
parse( tmp, size ); // obtain the size from the string
for ( int i = 0; i < size; ++i )
{
InventoryItem item;
getline( is, tmp ); // read an inventory line
parse( tmp, item );
data.push_back( item );
}
return is;
}
}
In the vector reading function I have used getline + parse to read the integer. That is to guarantee that the next getline() will actually read the first InventoryItem and not the trailing '\n' after the size.
The most important piece of code there is the 'parse' template that is able to convert from a string to any type that has the insertion operator defined. It can be used to read primitive types, library types (string, for example), and user types that have the operator defined. We use it to simplify the rest of the code quite a bit.
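As a quick illustration of that flexibility (made-up values, assuming we are inside - or using - the textual namespace):
int quantity = 0;
parse( "42", quantity ); // primitive type: quantity == 42
std::string word;
parse( "hammer", word ); // library type: word == "hammer"
// and once operator>> is complete, parse( line, item ) works for InventoryItem too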
Binary
For a binary format (ignoring architecture; this will be a pain in the ass if you migrate), the simplest way I can think of is writing the number of elements in the vector as a size_t (whatever the size is in your implementation), followed by all the elements. Each element will print out the binary representation of each of its members. For basic types such as int, it will just output the binary format of the int. For strings, we will resort to writing a size_t number with the number of characters in the string, followed by the contents of the string.
namespace binary
{
void write( std::ofstream & o, std::string const & str )
{
int size = str.size();
o.write( reinterpret_cast<const char*>( &size ), sizeof(int) ); // write the size
o.write( str.c_str(), size ); // write the contents
}
template <typename T>
void write_pod( std::ofstream & o, T data ) // will work only with POD data and not arrays
{
o.write( reinterpret_cast<const char*>( &data ), sizeof( data ) );
}
void write( std::ofstream & o, InventoryItem const & data )
{
write( o, data.Item );
write( o, data.Description );
write_pod( o, data.Quantity );
write_pod( o, data. ...
}
void write( std::ofstream & o, std::vector<InventoryItem> const & v )
{
int size = v.size();
o.write( reinterpret_cast<const char*>( &size ), sizeof( size ) ); // could use the template: write_pod( o, size )
for ( int i = 0; i < v.size(); ++i ) {
write( o, v[ i ] );
}
}
}
I have selected a different name for the template that writes basic types than the functions that write strings or InventoryItems. The reason is that we don't want to later on by mistake use the template to write a complex type (i.e. UserInfo containing strings) that will store an erroneous representation in disk.
Retrieval from disk should be fairly similar:
namespace binary {
template <typename T>
void read_pod( std::istream & i, T& data)
{
i.read( reinterpret_cast<char*>( &data ), sizeof(data) );
}
void read( std::istream & i, std::string & str )
{
int size;
read_pod( i, size );
char* buffer = new char[size+1]; // create a temporary buffer and read into it
i.read( buffer, size );
buffer[size] = 0;
str = buffer;
delete [] buffer;
}
void read( std::istream & i, InventoryItem & data )
{
read( i, data.Item );
read( i, data.Description );
read_pod( i, data.Quantity ); // ints go through read_pod, not the string overload
read_pod( i, ...
}
void read( std::istream & i, std::vector< InventoryItem > & v )
{
v.clear(); // clear the vector in case it is not empty
int size;
read_pod( i, size );
for ( int n = 0; n < size; ++n ) // 'n', not 'i', so the stream parameter isn't shadowed
{
InventoryItem item;
read( i, item );
v.push_back( item );
}
}
}
For using this approach, the std::istream and std::ostream must be opened in binary mode.
int main()
{
std::ifstream persisted( "file.bin", ios:in|ios::binary );
std::vector<InventoryItem> v;
binary::read( persisted, v );
// work on data
std::ofstream persist( "output.bin", ios::out|ios::binary );
binary::write( persist, v );
}
All error checking is left as an exercise for the reader :)
If you have any question on any part of the code, just ask.
EDIT: Trying to clear up FUD:
bind1st is part of STL's functional header. STL existed before boost showed up. It is deprecated in C++0x in favor of the more generic version i.e. bind (aka boost::bind). See Annex D.8 Binders for more information.
Now the real problem (multiple edits may make this look silly, but I'll keep this for posterity's sake):
write<long>(out, structList.size());
This is the offending line. This expects a long as the second parameter, whereas the vector's size() is of type size_t or unsigned int under the hood.
Update: there was a typo - use size_t and not size_T:
write<size_t>(out, structList.size());
Next part:
for_each(structList.begin(), structList.end(), bind1st(write<InventoryItem>, out));
This should be structList or some other type. Also, include functional to be able to use bind1st. Add at the top:
#include <functional>
The template bind1st takes a functor. Passing around ordinary function pointers is not possible without some other hacks. You can use boost::bind as an alternative. Or:
for(vector<InventoryItem>::const_iterator i = structList.begin(), f = structList.end();
i != f; ++i)
write<InventoryItem>(out, *i);
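For what it's worth, on a compiler with C++0x support a lambda does the same job without bind1st (a sketch using the write template from above):
for_each(structList.begin(), structList.end(),
[&out](const InventoryItem &item) { write<InventoryItem>(out, item); });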
Now for other nitpicks:
What is:
#include <String>
...
using namespace System;
Are you sure of what you are using here? If you want STL strings you need to include:
#include <string>
void main(void)
is not a standard signature. Use any one of:
int main(void)
or
int main(int argc, char *argv[]);
I/O is usually much easier with the predefined insertion/extraction operators. You can (and really should) use:
istream is(...);
is >> data;
and similarly
ostream os(...);
os << data;
Note also your readFromFile and writeToFile functions need to be fixed to use vector<InventoryItem> instead of plain vector.
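Putting that last note together, the two helpers could end up looking like this (a sketch built on the writeVector/readVector shown earlier):
void readFromFile(vector<InventoryItem> &items)
{
ifstream in("data.dat"); // closed automatically at end of scope
items = readVector<InventoryItem>(in);
}
void writeToFile(const vector<InventoryItem> &items)
{
ofstream out("data.dat"); // closed automatically at end of scope
writeVector(out, items);
}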