Tokenize a std::string to a struct - c++

Let's say I have the following string that I want to tokenize as per the delimiter '>':
std::string veg = "orange>kiwi>apple>potato";
I want every item in the string to be placed in a structure that has the following format:
struct pack_item
{
std::string it1;
std::string it2;
std::string it3;
std::string it4;
};
I know how to do it this way:
pack_item pitem;
std::stringstream veg_ss(veg);
std::string veg_item;
std::getline(veg_ss, veg_item, '>')
pitem.it1 = veg_item;
std::getline(veg_ss, veg_item, '>')
pitem.it2 = veg_item;
std::getline(veg_ss, veg_item, '>')
pitem.it3 = veg_item;
std::getline(veg_ss, veg_item, '>')
pitem.it4 = veg_item;
Is there a better and one-liner kind of way to do it?

Something like this:
#include <string>
#include <vector>
#include <sstream>
#include <iostream>
std::string veg = "orange>kiwi>apple>potato";
typedef std::vector<std::string> it_vec;
int main(int argc, char* argv[]) {
it_vec vec;
std::stringstream veg_ss(veg);
std::string veg_item;
while (std::getline(veg_ss, veg_item, '>')) {
vec.push_back(veg_item);
}
for (const std::string& vec_item : vec) {
std::cout << vec_item << std::endl;
}
}

You don't need an intermediate variable.
pack_item pitem;
std::stringstream veg_ss(veg);
std::getline(veg_ss, pitem.it1, '>');
std::getline(veg_ss, pitem.it2, '>');
std::getline(veg_ss, pitem.it3, '>');
std::getline(veg_ss, pitem.it4, '>');
You might want to make that a function, e.g. operator >> (with a similar operator <<)
std::istream& operator >>(std::istream& is, pack_item & pitem) {
std::getline(is, pitem.it1, '>');
std::getline(is, pitem.it2, '>');
std::getline(is, pitem.it3, '>');
std::getline(is, pitem.it4, '>');
return is;
}
std::ostream& operator <<(std::ostream& os, pack_item & pitem) {
return os << pitem.it1 << '>'
<< pitem.it2 << '>'
<< pitem.it3 << '>'
<< pitem.it4 << '>';
}
int main() {
std::stringstream veg_ss("orange>kiwi>apple>potato>");
pack_item pitem;
veg_ss >> pitem;
}
Is there a better and one-liner kind of way to do it?
You can make a type who's >> reads in a string up to a delimiter, and read all four elements in one statement. Is that really "better"?
template <bool is_const>
struct delimited_string;
template<>
struct delimited_string<true> {
const std::string & string;
char delim;
};
template<>
struct delimited_string<false> {
std::string & string;
char delim;
};
delimited_string(const std::string &, char) -> delimited_string<true>;
delimited_string(std::string &, char) -> delimited_string<false>;
std::istream& operator >>(std::istream& is, delimited_string<false> s) {
return std::getline(is, s.string, s.delim);
}
template <bool is_const>
std::ostream& operator <<(std::ostream& os, delimited_string<is_const> s) {
return os << s.string << s.delim;
}
std::istream& operator >>(std::istream& is, pack_item & pitem) {
return is >> delimited_string { pitem.it1, '>' }
>> delimited_string { pitem.it2, '>' }
>> delimited_string { pitem.it3, '>' }
>> delimited_string { pitem.it4, '>' };
}
std::ostream& operator <<(std::ostream& os, const pack_item & pitem) {
return os << delimited_string { pitem.it1, '>' }
<< delimited_string { pitem.it2, '>' }
<< delimited_string { pitem.it3, '>' }
<< delimited_string { pitem.it4, '>' };
}

As suggested in the comments, you could use a for loop as such:
pack_item a;
std::array<std::reference_wrapper<std::string>, 4> arr{a.it1, a.it2, a.it3, a.it4};
constexpr std::string_view veg = "orange>kiwi>apple>potato";
std::istringstream ss(veg.data());
std::string str;
for(std::size_t idx = 0; std::getline(ss, str, '>'); ++idx){
arr[idx].get() = std::move(str);
}
If you meant "one-liner" in its true sense, then you could be nasty and use:
std::getline(std::getline(std::getline(std::getline(ss, a.it1, '>'), a.it2, '>'), a.it3, '>'), a.it4, '>');

Indeed:
#include <iostream>
#include <sstream>
#include <string>
struct pack_item
{
std::string it1;
std::string it2;
std::string it3;
std::string it4;
};
pack_item pack( const std::string & s )
{
pack_item p;
getline(getline(getline(getline(std::istringstream(s), p.it1,'>'), p.it2,'>'), p.it3,'>'), p.it4);
return p;
}
int main()
{
auto pitem = pack( "orange>kiwi>apple>potato" );
std::cout << pitem.it4 << "<" << pitem.it3 << "<" << pitem.it2 << "<" << pitem.it1 << "\n";
}
BTW, there is nothing wrong with multiple lines of code. The quest for the one-liner is often a distraction to doing things the Right Way™.

What I would do is to create a constructor with std::string_view as argument (the second, which is predefined, would be the separator), and use the find function.
The reason of using std::string_view is posted here: How exactly is std::string_view faster than const std::string&?
struct pack_item
{
std::string it1;
std::string it2;
std::string it3;
std::string it4;
pack_item():it1(){}
pack_item(std::string_view in, char sep = '>'){
auto ptr = in.begin();
auto l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it1 = std::string(l_ptr, ptr++);
l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it2 = std::string(l_ptr, ptr++);
l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it3 = std::string(l_ptr, ptr++);
l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it4 = std::string(l_ptr, ptr++);
}
};
You can see here that this can be easily converted into a loop if you want and stop it by checking:
if(ptr == in.end()) break;

Related

C++ Find and save duplicates in vector

I have a custom vector of my user defined type vector
First vector gets filled with elements through stdin, then i sort it and try to find duplicates in it and save them
i've managed to find all unique elements, but i need to find and get a vector of duplicates
I need a hint or a simple solution for this problem
here's my code below:
Agressor.h
#ifndef Agressor_h
#define Agressor_h
#include <string>
#include <vector>
using namespace std;
class Agressor{
public:
/*const char**/ string traderIdentifier;
/*const char**/ string side;
int quantity;
int price;
vector<Agressor> v;
void display(){
cout << traderIdentifier << " " << side << " " << quantity << " " << price << endl;
}
explicit Agressor(){
}
~Agressor(){
}
friend ostream &operator<<(ostream& stream, const Agressor& item);
const friend bool operator > (const Agressor &a1, const Agressor &a2);
// const friend bool operator == (const Agressor &a1, const Agressor &a2);
/* vector<Agressor>& operator[](int i ){
return v[i];
}*/
};
ostream &operator<<(ostream& stream, const Agressor& item) {
string side = "";
if(item.side == "B"){
side = '+';
}else{
if(item.side == "S"){
side = "-";
}
}
stream << item.traderIdentifier << side << item.quantity << "#" << item.price << "\n";
return stream;
}
const bool operator == (const Agressor &a1, const Agressor &a2){
bool isEqual = false;
if((a1.price*a1.quantity == a2.price*a2.quantity) && (a1.traderIdentifier == a2.traderIdentifier) && (a1.side == a2.side)){
isEqual = true;
}
return(isEqual);
}
const bool operator > (const Agressor &a1, const Agressor &a2){
bool isGreater = false;
if(a1.price*a1.quantity > a2.price*a2.quantity){
isGreater = true;
}
return(isGreater);
}
#endif /* Agressor_h */
main.cpp
#include <iostream>
#include "Agressor.h"
#include <sstream>
using namespace std;
vector<string> &split(const string &s, char delim, vector<string> &elems)
{
stringstream ss(s);
string item;
while (getline(ss, item, delim))
{
elems.push_back(item);
}
return elems;
}
vector<string> split(const string &s, char delim)
{
vector<string> elems;
split(s, delim, elems);
return elems;
}
bool equal_comp(const Agressor& a1, const Agressor& a2){
if((a1.price*a1.quantity == a2.price*a2.quantity) && (a1.traderIdentifier == a2.traderIdentifier) && (a1.side == a2.side)){
return true;
}
return false;
}
int main(int argc, const char * argv[]) {
Agressor agr;
while (true) {
std::string sText;
cout << "enter query:" << endl;
std::getline(std::cin, sText);
if(sText == "q"){
cout << "Program terminated by user" << endl;
break;
}else{
std::vector<std::string> sWords = split(sText, ' ');
agr.traderIdentifier = sWords[0];
agr.side = sWords[1];
agr.quantity = stoi(sWords[2]);
agr.price = stoi(sWords[3]);
agr.v.push_back(agr);
vector<Agressor>::iterator it;
sort(agr.v.begin(), agr.v.end(), greater<Agressor>());
//unique(agr.v.begin(), agr.v.end(), equal_comp);
for (vector<Agressor>::const_iterator i = agr.v.begin(); i != agr.v.end(); ++i)
cout << *i << ' ';
}
}
cout << "here we go..." << endl;
vector<Agressor>::iterator it;
sort(agr.v.begin(), agr.v.end(), greater<Agressor>());
//it = unique(agr.v.begin(),agr.v.end(), equal_comp);
//agr.v.resize( distance(agr.v.begin(),it) );
agr.v.erase(unique(agr.v.begin(),agr.v.end(), equal_comp), agr.v.end());
copy(agr.v.begin(), agr.v.end(), ostream_iterator<Agressor>(cout, "\n"));
return 0;
}
You might use something like:
template <typename T>
std::vector<T> get_duplicates(const std::vector<T>& v)
{
// expect sorted vector
auto it = v.begin();
auto end = v.end();
std::vector<T> res;
while (it != end) {
it = std::adjacent_find(it, end);
if (it != end) {
++it;
res.push_back(*it);
}
}
return res;
}
std::unique overwrites duplicate values with later non-duplicate values. You can implement a similar algorithm that moves the values to somewhere.
template<class ForwardIt, class OutputIt, class BinaryPredicate>
ForwardIt unique_retain(ForwardIt first, ForwardIt last, OutputIt d_first, BinaryPredicate p)
{
if (first == last)
return last;
ForwardIt result = first;
while (++first != last) {
if (!p(*result, *first) && ++result != first) {
*d_first++ = std::move(*result);
*result = std::move(*first);
}
}
return ++result;
}
(adapted from this possible implementation of std::unique)
You would then use it like
vector<Agressor> dups;
sort(agr.v.begin(), agr.v.end(), greater<Agressor>());
auto it = unique_retain(agr.v.begin(),agr.v.end(), std::back_inserter(dups), equal_comp);
agr.v.erase(it, agr.v.end());

how to use istream_iterators to split an equation?

I'm trying to split a string like ( 1 + 2 ) into a vector and when using an istream_iterators<string> it doesn't split the parentheses so I get vector outputs like
(1 , + , 2) when I want ( , 1, + , 2 ,)
Is it possible to use istream_iterators to achieve this?
string eq = "(1 + 2)";
istringstream ss(eq);
istream_iterator<string> begin(ss);
istream_iterator<string> end;
vector<string> vec(begin, end);
You can do this by creating a custom type Token and using it with
istream_iterator. Bonus feature: this code will parse multiple digits, multiple operators, and nested expressions. So enjoy. :)
#include <iterator>
#include <string>
#include <sstream>
#include <vector>
#include <iostream>
#include <cctype>
using namespace std;
class Token {
private:
string val;
public:
Token() : val("") {}
Token(string& v) : val(v) {}
friend istream& operator>>(istream &in, Token& tok);
friend ostream& operator<<(ostream &out, Token& tok);
};
istream& operator>>(istream &in, Token& tok) {
char c;
string v;
if (in >> c) {
if (isdigit(c)) {
v.push_back(c);
while (in >> c && isdigit(c)) {
v.push_back(c);
}
in.putback(c);
} else if (c == ' ') {
while (in >> c && c == ' ') ;
in.putback(c);
} else {
v.push_back(c);
}
}
tok = v;
return in;
}
ostream& operator<<(ostream &out, Token& tok) {
out << tok.val;
return out;
}
int main() {
string eq = "(1 + 2)";
//eq = "(100 + 200)"; // multiple digits
//eq = "(100 + 200 * 300)"; // extra operator
//eq = "(100 + (200 * 300))"; // nested parens
istringstream ss(eq);
istream_iterator<Token> begin(ss);
istream_iterator<Token> end;
vector<Token> vec(begin, end);
for (auto& x : vec) {
cout << "[" << x << "] ";
}
cout << endl;
}
I don't think you can do it using istream_iterator. Instead, simply do it by hand:
vector<string> vec;
vec.reserve(eq.size() / 4); // rough guess
bool in_number = false;
for (char ch : eq) {
if (isspace(ch)) {
in_number = false;
} else if (isdigit(ch)) {
if (in_number) {
vec.back().push_back(ch);
} else {
vec.emplace_back(1, ch);
in_number = true;
}
} else {
vec.emplace_back(1, ch);
in_number = false;
}
}

Is there a way to stop std::cin when it encounters a `\n`

I've written code that works pretty well except for one thing. The task that I'm making this code for inputs data into the program as a whitespace seperated string of doubles. And their precicion may be larger than 10^-25. So I made my own class that can handle that.
The problem is, when I was writing the code, I was testing it by entering two values into the console by hand each time pressing enter so that my program can understand where one double ends and another starts (it was looking for a '\n' basically).
Now I really need to adapt this code to make i work with my task's input (whitespace seperated list of doubles like 2.521 32.12334656 23.21 .....). But I'm having a problem with getline in my overloaded >> operator. It simply eats the '\n' character and starts looking for more input. The only way I can get it to work is by manually enetering values and manually entering an additional whitespace after the last value and only then hit enter.
I'm asking for your help.
Here's the full code:
#include <iostream>
#include <string>
#include <algorithm>
class BigNumber {
private:
std::string fullPart;
std::string floatPart;
public:
BigNumber() : fullPart("0"), floatPart("0") {}
friend std::ostream & operator << (std::ostream & os, const BigNumber & bn);
friend std::istream & operator >> (std::istream & os, BigNumber & bn);
void operator+=(BigNumber & bn);
};
int main()
{
BigNumber bn, bntemp;
while (std::cin >> bntemp)
{
bn += bntemp;
if (std::cin.peek() == '\n')
break;
}
std::cout << bn << std::endl;
return 0;
}
void addFullPart(const std::string & add, std::string & add_to)
{
auto addConv = std::stold(add);
auto addToConv = std::stold(add_to);
auto newFull = std::to_string(addConv + addToConv);
add_to = std::string(newFull.begin(), std::find(newFull.begin(), newFull.end(), '.'));
}
bool carryReminder(std::string & add_to, int32_t indx_from)
{
for (auto curr = indx_from; curr >= 0; --curr)
{
if (add_to[curr] != '9')
{
++(add_to[curr]);
return true;
}
else
add_to[curr] = '0';
}
return false;
}
std::pair<std::string, int32_t> addFloatPart(std::string & add, std::string & add_to)
{
std::string resultFloat;
int32_t reminderReturn{};
// don't forget to reverse str
if (add.size() != add_to.size())
{
// add remaining 0's
if (add.size() < add_to.size())
{
while (add.size() != add_to.size())
{
auto tempBigger = add_to.back();
add_to.pop_back();
resultFloat.push_back(tempBigger);
}
}
else
{
while (add.size() != add_to.size())
{
auto tempBigger = add.back();
add.pop_back();
resultFloat.push_back(tempBigger);
}
}
}
// now they are equal and have a form of 120(3921) 595
for (int32_t i = add_to.size() - 1; i >= 0; --i)
{
int32_t add_toDigit = add_to[i] - '0';
int32_t addDigit = add[i] - '0';
if (add_toDigit + addDigit >= 10)
{
resultFloat.append(std::to_string((add_toDigit + addDigit) - 10));
// we have a remainder
if (i == 0 || !carryReminder(add_to, i - 1))
reminderReturn = 1;
}
else
{
resultFloat.append(std::to_string(add_toDigit + addDigit));
}
}
std::reverse(resultFloat.begin(), resultFloat.end());
return std::make_pair(resultFloat, reminderReturn);
}
std::ostream & operator<<(std::ostream & os, const BigNumber & bn)
{
os << bn.fullPart << "." << bn.floatPart;
return os;
}
std::istream & operator>>(std::istream & is, BigNumber & bn)
{
std::string temp;
std::getline(is, temp, ' ');
auto fullPartTemp = std::string(temp.begin(), std::find(temp.begin(), temp.end(), '.'));
auto floatPartTemp = std::string(std::find(temp.begin(), temp.end(), '.') + 1, temp.end());
bn.floatPart = floatPartTemp;
bn.fullPart = fullPartTemp;
return is;
}
void BigNumber::operator+=(BigNumber & bn)
{
auto pair = addFloatPart(bn.floatPart, floatPart);
floatPart = pair.first;
if (pair.second > 0)
addFullPart(std::to_string(std::stoi(bn.fullPart) + 1), fullPart);
else
addFullPart(bn.fullPart, fullPart);
}
I suggest that you first use getline to read a line. Then you can make an istringstream and use your >> on that. Specifically, you could add #include <sstream> and change the main function to the following:
int main()
{
BigNumber bn, bntemp;
std::string temp;
std::getline(std::cin, temp);
std::istringstream ln(temp);
while (ln.good()) {
ln >> bntemp;
bn += bntemp;
}
std::cout << bn << std::endl;
return 0;
}
Two changes are needed. In main
Discarded the peek approach. Too brittle.
int main()
{
BigNumber bn, bntemp;
std::string line;
std::getline(std::cin, line);
std::stringstream stream(line);
while (stream >> bntemp)
{
bn += bntemp;
}
std::cout << bn << std::endl;
return 0;
}
And in operator>>
std::istream & operator >> (std::istream & is, BigNumber & bn)
{
std::string temp;
// also do NOTHING if the read fails!
if (std::getline(is, temp, ' '))
{
// recommend some isdigit testing in here to make sure you're not
// being fed garbage. Set fail flag in stream and bail out.
auto floatPartTemp = std::string(temp.begin(), std::find(temp.begin(), temp.end(), '.'));
// if there is no . you are in for a world of hurt here
auto floatPartTemp = std::string(std::find(temp.begin(), temp.end(), '.') + 1, temp.end());
bn.floatPart = ;
bn.fullPart = fullPartTemp;
}
return is;
}
So it should probably look more like
std::istream & operator >> (std::istream & is, BigNumber & bn)
{
std::string temp;
if (std::getline(is, temp, ' '))
{
if (std::all_of(temp.cbegin(), temp.cend(), [](char ch) { return isdigit(ch) || ch == '.'; }))
{
auto dotpos = std::find(temp.begin(), temp.end(), '.');
bn.fullPart = std::string(temp.begin(), dotpos);
std::string floatPartTemp;
if (dotpos != temp.end())
{
floatPartTemp = std::string(dotpos + 1, temp.end());
}
bn.floatPart = floatPartTemp;
}
else
{
is.setstate(std::ios::failbit);
}
}
return is;
}
Perhaps you could then use
std::string temp;
is >> temp;
instead of std::getline().
If I remember well that breaks on spaces and keeps the newline in the buffer.

How to read a csv file data into an array?

I have a CSV file formatted as below:
1,50,a,46,50,b
2, 20,s,56,30,f
3,35,b,5,67,s
...
How can I turn that to a 2D array so that I could do some calculations?
void Core::parseCSV(){
std::ifstream data("test.csv");
std::string line;
while(std::getline(data,line))
{
std::stringstream lineStream(line);
std::string cell;
while(std::getline(lineStream,cell,','))
{
//not sure how to create the 2d array here
}
}
};
Try this,
void Core::parseCSV()
{
std::ifstream data("test.csv");
std::string line;
std::vector<std::vector<std::string> > parsedCsv;
while(std::getline(data,line))
{
std::stringstream lineStream(line);
std::string cell;
std::vector<std::string> parsedRow;
while(std::getline(lineStream,cell,','))
{
parsedRow.push_back(cell);
}
parsedCsv.push_back(parsedRow);
}
};
Following code is working. hope it can help you. It has a CSV file istream_iterator-like class. It is a template so that it can read strings, ints, doubles, etc. Its constructors accept a char delimiter, so that it may be used for more than strictly comma-delimited files. It also has a specialization for strings so that white space characters could be retained.
#include <iostream>
#include <sstream>
#include <fstream>
#include <iterator>
using namespace std;
template <class T>
class csv_istream_iterator: public iterator<input_iterator_tag, T>
{
istream * _input;
char _delim;
string _value;
public:
csv_istream_iterator( char delim = ',' ): _input( 0 ), _delim( delim ) {}
csv_istream_iterator( istream & in, char delim = ',' ): _input( &in ), _delim( delim ) { ++*this; }
const T operator *() const {
istringstream ss( _value );
T value;
ss >> value;
return value;
}
istream & operator ++() {
if( !( getline( *_input, _value, _delim ) ) )
{
_input = 0;
}
return *_input;
}
bool operator !=( const csv_istream_iterator & rhs ) const {
return _input != rhs._input;
}
};
template <>
const string csv_istream_iterator<string>::operator *() const {
return _value;
}
int main( int argc, char * args[] )
{
{ // test for integers
ifstream fin( "data.csv" );
if( fin )
{
copy( csv_istream_iterator<int>( fin ),
csv_istream_iterator<int>(),
ostream_iterator<int>( cout, " " ) );
fin.close();
}
}
cout << endl << "----" << endl;
{ // test for strings
ifstream fin( "data.csv" );
if( fin )
{
copy( csv_istream_iterator<string>( fin ),
csv_istream_iterator<string>(),
ostream_iterator<string>( cout, "|" ) );
fin.close();
}
}
return 0;
}
I would go for something like this (untested, incomplete) and eventually refine operator >>, if you have strings instead of chars, or floats instead of ints.
struct data_t
{
int a ;
int b ;
char c ;
int d ;
int e ;
char f ;
} ;
std::istream &operator>>(std::istream &ist, data_t &data)
{
char comma ;
ist >> data.a >> comma
>> data.b >> comma
>> data.c >> comma
>> data.d >> comma
>> data.e >> comma
>> data.f
;
return ist ;
}
void Core::parseCSV(){
std::ifstream data("test.csv");
std::string line;
std::vector<data_t> datavect ;
while(std::getline(data,line))
{
data_t data ;
std::stringstream lineStream(line);
lineStream >> data ;
datavect.push_back(data) ;
}
};

Iterating over mmaped gzip file with boost

I am trying to learn boost and some template programming in C++ but I am really having such an hard time to implement a simple class for iterating over Gzip files using mapped_file_source. I essentially have an edge list in TSV format such that each line in the gzip file is of the format: <src:int><tab><dst:int>. What I want is to implement a gz_file class that exposes a begin and end iterator over which I can get an edge (std::pair<int,int>) each time I query the iterator.
The problem is the copy constructor which is broken since I cannot known where I am positioned in the gzip file.
Here is the code I have so far:
class gz_graph {
public:
gz_graph(const char * filename)
{
m_file.open(filename);
if (!m_file.is_open()) {
throw std::runtime_error("Error opening file");
}
m_data = m_file.data();
m_data_size = m_file.size() / sizeof(m_data[0]);
auto ret = posix_madvise((void*)m_data, m_data_size, POSIX_MADV_SEQUENTIAL);
}
class iterator;
iterator begin() const
{
return iterator(this, false);
}
iterator end() const
{
return iterator(this, true);
}
class iterator : public std::iterator<std::forward_iterator_tag, Edge> {
public:
iterator(gz_graph const * ref, bool consumed)
: m_ref(ref),
m_cur_edge(-1, -1),
m_consumed(consumed)
{
if (!consumed) {
initialize();
advance();
}
}
iterator(const iterator& x)
: m_ref(x.m_ref),
m_cur_edge(x.m_cur_edge)
{
if (!x.m_consumed) {
initialize();
advance();
}
std::cout << "Copy constructor" << std::endl;
}
value_type const& operator*() const
{
return m_cur_edge;
}
value_type const* operator->() const
{
return &m_cur_edge;
}
iterator& operator++()
{
advance();
return *this;
}
bool operator==(iterator const& other) const
{
assert(m_ref == other.m_ref);
return m_cur_edge == other.m_cur_edge;
}
bool operator!=(iterator const& other) const
{
return !(*this == other);
}
private:
void initialize()
{
boost::iostreams::array_source source(m_ref->m_data, m_ref->m_data_size);
m_in.push(boost::iostreams::gzip_decompressor());
m_in.push(source);
}
void advance()
{
std::string line_str;
if (!getline(m_in, line_str)) {
m_consumed = true;
m_cur_edge = Edge(-1, -1);
return;
}
std::vector<std::string> strs;
boost::split(strs, line_str, boost::is_any_of("\t"));
if (strs.size() != 2)
throw std::runtime_error("Required 2 fields per line");
int src = boost::lexical_cast<int>(strs.at(0));
int dst = boost::lexical_cast<int>(strs.at(1));
m_cur_edge = Edge(src, dst);
// std::cout << "Read line " << line_str << std::endl;
}
gz_graph const * m_ref;
Edge m_cur_edge;
boost::iostreams::filtering_istream m_in;
bool m_consumed;
};
private:
boost::iostreams::mapped_file_source m_file;
char const* m_data;
size_t m_data_size;
};
I would just use a std::istream_iterator here. I'm not sure how exactly to interpret your "input pseudo-code", so let me humor you and do the "complicated" parsing:
struct Edge : std::pair<int, int> { };
std::istream& operator>>(std::istream& is, Edge& edge)
{
using namespace boost::spirit::qi;
return is >> match("src:" > int_ > '\t' > "dst:" > int_ >> eol, edge.first, edge.second);
}
I expect you would be happy to have it much simpler, but simpler is easier, right?
Now the main program looks like
for (
std::istream_iterator<Edge> it(fs >> std::noskipws), end;
it != end;
++it)
{
std::cout << it->first << " to " << it->second << "\n";
}
Where fs is the filtering_istream that has the gzip_decompressor. See it Live On Coliru
Full Code
#include <boost/iostreams/device/mapped_file.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_match.hpp>
#include <iterator>
struct Edge : std::pair<int, int> {
};
std::istream& operator>>(std::istream& is, Edge& edge)
{
using namespace boost::spirit::qi;
return is >> match("src:" > int_ > '\t' > "dst:" > int_ >> eol, edge.first, edge.second);
}
namespace io = boost::iostreams;
int main()
{
io::mapped_file_source csv("csv.txt.gz");
io::stream<io::mapped_file_source> textstream(csv);
io::filtering_istream fs;
fs.push(io::gzip_decompressor{});
fs.push(textstream);
for (
std::istream_iterator<Edge> it(fs >> std::noskipws), last;
it != last;
++it)
{
std::cout << it->first << " to " << it->second << "\n";
}
}