How to iterate through a variadic tuple in a variadic template function? - c++

I was writing a CSV parser and thought it would be a good opportunity to practice some advanced C++. In particular, there's a useful function that splits a line of a CSV file given a delimiter. Although it's a straightforward function to write, I now want it to return a tuple with a varying number of elements and types. For example:
int main() {
auto [a, b, c] = extract<int, std::string, float>("42;hello;3.1415", ';');
std::cout << a << ' ' << b << ' ' << c << std::endl;
}
This should print:
42 hello 3.1415
So I thought of a variadic template function :
template <typename... T>
std::tuple<T...> extract(const std::string&& str, const char&& delimiter) {
std::tuple<T...> splited_line;
/* ... */
return splited_line;
}
But I can't access the tuple's elements inside that function with an index held in a variable, like so:
std::get<i>(splited_line) // doesn't work
That wasn't a big surprise, as I'm quite new to this language. I'm now wondering how to write this small function in an elegant way.
Thanks for any help.

You might do something like this (I'll let you implement the "parsing" part):
// Parsing parts
std::vector<std::string> split(const std::string& s, char delimiter);
template <typename T>
T ConvertTo(const std::string& s);
// Variadic part
template <typename... Ts, std::size_t ... Is>
std::tuple<Ts...> extract_impl(std::index_sequence<Is...>,
const std::vector<std::string>& v)
{
return { ConvertTo<Ts>(v[Is])... };
}
template <typename... Ts>
std::tuple<Ts...> extract(const std::string& s, char delimiter) {
const auto strings = split(s, delimiter);
if (strings.size() != sizeof...(Ts)) {
// Error handling
// ...
}
return extract_impl<Ts...>(std::index_sequence_for<Ts...>(), strings);
}
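For completeness, here is a minimal sketch of how the two "parsing" declarations above could be filled in. The bodies of split and ConvertTo are my own assumptions (a plain find loop and operator>> through a string stream), not part of the answer, and throwing std::invalid_argument is just one possible choice for the error handling left open above:

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Assumed implementation: split s on delimiter; empty input yields one empty field.
std::vector<std::string> split(const std::string& s, char delimiter) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t end = s.find(delimiter, start);
        if (end == std::string::npos) {
            fields.push_back(s.substr(start));
            break;
        }
        fields.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}

// Assumed implementation: convert through operator>>, throw if the field does not parse.
template <typename T>
T ConvertTo(const std::string& s) {
    std::istringstream in(s);
    T value{};
    if (!(in >> value)) {
        throw std::invalid_argument("cannot convert field: " + s);
    }
    return value;
}

// Strings are returned unchanged so that embedded spaces survive.
template <>
inline std::string ConvertTo<std::string>(const std::string& s) {
    return s;
}

// Variadic part, as in the answer: one ConvertTo call per tuple element.
template <typename... Ts, std::size_t... Is>
std::tuple<Ts...> extract_impl(std::index_sequence<Is...>,
                               const std::vector<std::string>& v) {
    return std::tuple<Ts...>{ ConvertTo<Ts>(v[Is])... };
}

template <typename... Ts>
std::tuple<Ts...> extract(const std::string& s, char delimiter) {
    const auto strings = split(s, delimiter);
    if (strings.size() != sizeof...(Ts)) {
        throw std::invalid_argument("unexpected number of fields");
    }
    return extract_impl<Ts...>(std::index_sequence_for<Ts...>(), strings);
}
```

With this in place, auto [a, b, c] = extract<int, std::string, float>("42;hello;3.1415", ';'); should behave like the example in the question.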

template<class F>
auto foreach_argument( F&& f ) {
return [f = std::forward<F>(f)](auto&&...elems) {
( (void)f(elems), ... );
};
}
template <class... Ts>
std::tuple<Ts...> extract(const std::string& str, const char delimiter) {
std::tuple<Ts...> splited_line;
std::size_t i = 0;
std::size_t index = 0;
auto operation = [&](auto&& elem){
if (index == std::string::npos)
return;
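auto next = str.find( delimiter, index );
// substr takes a length, not an end position
std::string element = str.substr( index, next - index );
// step past the delimiter; stay at npos once the input is exhausted
index = (next == std::string::npos) ? next : next + 1;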
// parse the string "element" into the argument "elem"
++i;
};
std::apply(foreach_argument(operation), splited_line);
return splited_line;
}
This results in default-constructed Ts first; if an element isn't found, it remains default-constructed.
A version that returns
std::optional<std::tuple<Ts...>>
or throws when the input doesn't match would keep a
std::tuple<std::optional<Ts>...>
within the function, and the lambda passed to apply would .emplace each element when it is found. Then ensure that all elements are valid before returning; otherwise throw or return the empty optional.
That is, to turn a std::tuple<std::optional<Ts>...> into a std::tuple<Ts...>, something like:
return std::apply( [](auto&&... elems){ return std::make_tuple( *elems... ); }, splited_line );
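To make the optional-based variant concrete, here is a rough sketch under a few assumptions of mine: the names try_extract and parse_field are hypothetical, fields are parsed with operator>>, and extra trailing fields are silently ignored rather than rejected.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical helper: parse one field with operator>>; empty optional on failure.
template <typename T>
std::optional<T> parse_field(const std::string& field) {
    std::istringstream in(field);
    T value{};
    if (in >> value) return value;
    return std::nullopt;
}

// Hypothetical name: returns an empty optional unless every element was parsed.
template <typename... Ts>
std::optional<std::tuple<Ts...>> try_extract(const std::string& str, char delimiter) {
    std::tuple<std::optional<Ts>...> parsed;
    std::size_t index = 0;
    auto fill = [&](auto& slot) {
        if (index == std::string::npos) return;  // ran out of fields
        const std::size_t next = str.find(delimiter, index);
        const std::string field = str.substr(index, next - index);
        index = (next == std::string::npos) ? next : next + 1;
        using Value = typename std::decay_t<decltype(slot)>::value_type;
        slot = parse_field<Value>(field);
    };
    // The comma fold visits the slots left to right.
    std::apply([&](auto&... slots) { (fill(slots), ...); }, parsed);

    // Unwrap only if every slot holds a value.
    return std::apply(
        [](auto&... slots) -> std::optional<std::tuple<Ts...>> {
            if (!(slots.has_value() && ...)) return std::nullopt;
            return std::make_tuple(std::move(*slots)...);
        },
        parsed);
}
```

A caller would then check the result before using it, e.g. if (auto row = try_extract<int, std::string>("42;hello", ';')) { ... }.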

Okay, thanks to the help of the community, I got my problem solved. Maybe it'll help someone understand variadic template functions, so I'm going to share working code (based on Adam Nevraumont's code):
#include <iostream>
#include <string>
#include <tuple>
#include <string_view>
#include <sstream>
template <typename... Ts>
std::tuple<Ts...> extract(std::string_view str, char delimiter = ';') {
size_t idx = 0;
auto pop = [&](auto&& elem) {
auto next = str.find(delimiter, idx);
std::stringstream ss;
ss << str.substr(idx, next - idx);
ss >> elem;
idx = next + 1;
};
std::tuple<Ts...> splited;
std::apply([&](auto&&...elems) { (pop(elems), ...); }, splited);
return splited;
}
int main() {
std::string dataline = "-42;hello;3.1415;c";
auto [i, s, f, c] = extract<int, std::string, float, char>(dataline);
std::cout << i << " " << s << " " << f << " " << c << std::endl;
}
As you can see, I convert each string into the target type with a stringstream. If you want more control over how each type in the tuple is parsed, you can write a separate conversion function template and specialize it (based on Jarod42's code):
#include <iostream>
#include <string>
#include <tuple>
#include <string_view>
template <typename T> T convert_to(const std::string_view& s) { return T(); } // default constructor
template <> std::string convert_to(const std::string_view& s) { return std::string(s); }
template <> float convert_to(const std::string_view& s) { return std::stof(std::string(s)); }
template <> int convert_to(const std::string_view& s) { return std::stoi(std::string(s)); }
template <typename... Ts, size_t... Is>
std::tuple<Ts...> extract_impl(std::index_sequence<Is...>,
std::string_view splited[sizeof...(Ts)]) {
return { convert_to<Ts>(splited[Is])... };
}
template <typename... Ts>
std::tuple<Ts...> extract(std::string_view str, char delimiter = ';') {
std::string_view splited[sizeof...(Ts)];
for (size_t i = 0, idx = 0; i < sizeof...(Ts); ++i) {
auto next = str.find(delimiter, idx);
splited[i] = str.substr(idx, next - idx);
idx = next + 1;
}
return extract_impl<Ts...>(std::index_sequence_for<Ts...>(), splited);
}
int main() {
auto [a, b, c] = extract<int, std::string, float>("-42;hello;3.1415");
std::cout << a << ' ' << b << ' ' << c;
}

Related

Compiling a function with templates fails in variadic template function

I have come across a compiler error involving variadic templates. The following code is a strongly simplified version which reproduces the error in my original code:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
typedef std::vector<std::string> stringvec;
// dummy function: return string version of first vector element
template<typename T>
std::string vecDummy(const std::string sFormat, const T t) {
std::stringstream ss("");
if (t.size() > 0) {
ss << t[0];
}
return ss.str();
}
// recursion termination
std::string recursiveY(stringvec &vFlags, uint i) {
return "";
}
// walk through arguments
template<typename T, typename... Args>
std::string recursiveY(stringvec &vFlags, uint i, T value, Args... args) {
std::string sRes = "";
if (vFlags[i] == "%v") {
sRes += vecDummy(vFlags[i], value);
}
sRes += " "+recursiveY(vFlags, i+1, args...);
return sRes;
}
int main(void) {
stringvec vPasta = {"spagis", "nudle", "penne", "tortellini"};
stringvec vFormats = {"%v", "%s"};
std::string st = "";
st += recursiveY(vFormats, 0, vPasta, "test12");
std::cout << ">>" << st << "<<" << std::endl;
return 0;
}
This simple code should walk through the arguments passed to recursiveY() and, if the current format string is "%v" it would pass the corresponding argument to vecDummy() which would return a string version of the vector's first element (if there is one).
The error message from the compiler is
sptest2.cpp: In instantiation of ‘std::string vecDummy(std::string, T) [with T = const char*; std::string = std::__cxx11::basic_string<char>]’:
sptest2.cpp:30:25: required from ‘std::string recursiveY(stringvec&, uint, T, Args ...) [with T = const char*; Args = {}; std::string = std::__cxx11::basic_string<char>; stringvec = std::vector<std::__cxx11::basic_string<char> >; uint = unsigned int]’
sptest2.cpp:32:27: required from ‘std::string recursiveY(stringvec&, uint, T, Args ...) [with T = std::vector<std::__cxx11::basic_string<char> >; Args = {const char*}; std::string = std::__cxx11::basic_string<char>; stringvec = std::vector<std::__cxx11::basic_string<char> >; uint = unsigned int]’
sptest2.cpp:43:21: required from here
sptest2.cpp:12:11: error: request for member ‘size’ in ‘t’, which is of non-class type ‘const char* const’
12 | if (t.size() > 0) {
| ~~^~~~
It seems as if the compiler instantiates vecDummy() for every type I pass to recursiveY() in main, but vecDummy() is designed to work only with vectors of some kind (and not with const char*, for example).
Is there a way to modify this code so that it works as intended?
Is there perhaps a way of assuring the compiler that I will only pass vectors to vecDummy() (even at the risk of a runtime error or unexpected behaviour, similar to passing an integer to printf() when it expects a string)?
You can add an overload of vecDummy that handles the std::vector case and 'dumb down' the more general one to (say) just return an empty string:
// dummy function: return string version of first vectorelement (catch-all)
template<typename T>
std::string vecDummy(const std::string, const T) {
return "";
}
// dummy function: return string version of first vectorelement (vectors only)
template<typename T>
std::string vecDummy(const std::string, const std::vector <T> &t) {
std::stringstream ss("");
if (t.size() > 0) {
ss << t[0];
}
return ss.str();
}
I think if constexpr can help here. I'm posting two solutions. (And a third, more complete one, at the end after an edit I did some time after posting)
First solution (only the function vecDummy changes). Here, vecDummy receives parameters of all types, doing its intended work only for vectors and doing nothing when the parameter is not a vector.
template<typename T>
std::string vecDummy(
[[maybe_unused]] const std::string sFormat,
[[maybe_unused]] const T t
)
noexcept
{
std::stringstream ss("");
if constexpr (std::is_same_v<T, stringvec>) {
if (t.size() > 0) {
ss << t[0];
}
}
else {
// nothing, since this is intended to work only for vectors
}
return ss.str();
}
Another solution is to move if constexpr inside recursiveY (and leave vecDummy unchanged). In this solution, vecDummy is only called for parameters of vector-type and when the format is "%v".
template<typename T, typename... Args>
std::string recursiveY(
const stringvec& vFormats,
std::size_t i,
T value, Args... args
)
noexcept
{
std::cout << "(1) i= " << i << '\n';
std::string res = "";
if (vFormats[i] == "%v") {
if constexpr (std::is_same_v<T, stringvec>) {
res += vecDummy(vFormats[i], value);
}
}
else {
if constexpr (not std::is_same_v<T, stringvec>) {
res += std::string(value);
}
}
res += " " + recursiveY(vFormats, i+1, args...);
return res;
}
EDIT
Following a question asked by OP in a comment, I've updated the solution to a more complete one in which std::vector<int> is allowed. Also, I've added the handling of the case "%s", which was lacking in the original post.
This update uses the construct is_vector that I found in this post.
The result of this code is >>spagis 1 1234 6789 test12 <<.
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
typedef std::vector<std::string> stringvec;
/* ------------------------------------------------ */
// copied from https://stackoverflow.com/a/12043020/12075306
template <typename T, typename _ = void>
struct is_vector {
static const bool value = false;
};
template <typename T>
struct is_vector< T,
typename std::enable_if<
std::is_same<T,
std::vector< typename T::value_type,
typename T::allocator_type >
>::value
>::type
>
{
static const bool value = true;
};
/* ------------------------------------------------ */
// dummy function: return string version of first vector element
template<typename T>
std::string vecDummy(
[[maybe_unused]] const std::string sFormat,
[[maybe_unused]] const std::vector<T>& t
)
noexcept
{
std::stringstream ss("");
if (t.size() > 0) { ss << t[0]; }
return ss.str();
}
// recursion termination
std::string recursiveY
([[maybe_unused]] const stringvec& vFlags, [[maybe_unused]] std::size_t i) noexcept
{ return ""; }
template<typename T, typename... Args>
std::string recursiveY(
const stringvec& vFormats,
std::size_t i,
T value, Args... args
)
noexcept
{
std::cout << "(1) i= " << i << '\n';
std::string res = "";
if (vFormats[i] == "%v") {
if constexpr (is_vector<T>::value) {
res += vecDummy(vFormats[i], value);
}
}
else {
if constexpr (not is_vector<T>::value) {
res += std::string(value);
}
}
res += " " + recursiveY(vFormats, i+1, args...);
return res;
}
int main(void) {
stringvec vPasta = {"spagis", "nudle", "penne", "tortellini"};
std::vector<int> vMoney1 = {1, 2, 3, 4};
std::vector<int> vMoney2 = {1234, 2, 3, 4};
std::vector<int> vMoney3 = {6789, 2, 3, 4};
stringvec vFormats = {"%v", "%v", "%v", "%v", "%s"};
std::string st = "";
st += recursiveY(vFormats, 0, vPasta, vMoney1, vMoney2, vMoney3, "test12");
std::cout << ">>" << st << "<<" << std::endl;
return 0;
}
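As a side note, the is_vector trait copied above can also be expressed with a plain partial specialization, which some readers may find easier to follow. This is only a sketch of that alternative; is_vector_v and first_element_or_empty are names I made up for illustration:

```cpp
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Primary template: not a vector.
template <typename T>
struct is_vector : std::false_type {};

// Partial specialization: matches std::vector with any element and allocator type.
template <typename T, typename Alloc>
struct is_vector<std::vector<T, Alloc>> : std::true_type {};

// Hypothetical convenience variable that ignores references and cv-qualifiers.
template <typename T>
inline constexpr bool is_vector_v =
    is_vector<std::remove_cv_t<std::remove_reference_t<T>>>::value;

// Hypothetical helper: the vector branch is only instantiated for vector types.
template <typename T>
std::string first_element_or_empty(const T& value) {
    if constexpr (is_vector_v<T>) {
        std::ostringstream out;
        if (!value.empty()) out << value.front();
        return out.str();
    } else {
        return "";
    }
}
```

Because the discarded if constexpr branch is never instantiated, calling first_element_or_empty("test12") compiles even though a string literal has no empty() or front().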

Getting field names with boost::pfr

Hi, I'm using boost::pfr for basic reflection. It works fine, but it only prints or deals with the field values: boost::pfr::io prints each member of the struct, but how can I print name-value pairs? for_each_field has the same issue: the functor only receives values, not names. How can I get the field names?
struct S {
int n;
std::string name;
};
S o{1, "foo"};
std::cout << boost::pfr::io(o);
// Outputs: {1, "foo"}, how can I get n = 1, name = "foo"?
If you think adapting a struct is not too intrusive (it doesn't change your existing definitions, and you don't even need to have it in a public header):
BOOST_FUSION_ADAPT_STRUCT(S, n, name)
Then you can concoct a general operator<< for sequences:
namespace BF = boost::fusion;
template <typename T,
typename Enable = std::enable_if_t<
// BF::traits::is_sequence<T>::type::value>
std::is_same_v<BF::struct_tag, typename BF::traits::tag_of<T>::type>>>
std::ostream& operator<<(std::ostream& os, T const& v)
{
bool first = true;
auto visitor = [&]<size_t I>() {
os << (std::exchange(first, false) ? "" : ", ")
<< BF::extension::struct_member_name<T, I>::call()
<< " = " << BF::at_c<I>(v);
};
// visit members
[&]<size_t... II>(std::index_sequence<II...>)
{
return (visitor.template operator()<II>(), ...);
}
(std::make_index_sequence<BF::result_of::size<T>::type::value>{});
return os;
}
(Prior to C++20 this would require some explicit template types instead of the lambdas, perhaps making it more readable. I guess I'm lazy...)
This prints:
n = 1, name = foo
Bonus: Correctly quoting string-like types
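#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/for_each.hpp>
#include <boost/fusion/include/at_c.hpp>
#include <iostream>
#include <iomanip>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility> // std::exchange, std::index_sequence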
namespace MyLib {
struct S {
int n;
std::string name;
};
namespace BF = boost::fusion;
static auto inline pretty(std::string_view sv) { return std::quoted(sv); }
template <typename T,
typename Enable = std::enable_if_t<
not std::is_constructible_v<std::string_view, T const&>>>
static inline T const& pretty(T const& v)
{
return v;
}
template <typename T,
typename Enable = std::enable_if_t<
// BF::traits::is_sequence<T>::type::value>
std::is_same_v<BF::struct_tag, typename BF::traits::tag_of<T>::type>>>
std::ostream& operator<<(std::ostream& os, T const& v)
{
bool first = true;
auto visitor = [&]<size_t I>() {
os << (std::exchange(first, false) ? "" : ", ")
<< BF::extension::struct_member_name<T, I>::call()
<< " = " << pretty(BF::at_c<I>(v));
};
// visit members
[&]<size_t... II>(std::index_sequence<II...>)
{
return (visitor.template operator()<II>(), ...);
}
(std::make_index_sequence<BF::result_of::size<T>::type::value>{});
return os;
}
} // namespace MyLib
BOOST_FUSION_ADAPT_STRUCT(MyLib::S, n, name)
int main()
{
MyLib::S o{1, "foo"};
std::cout << o << "\n";
}
Outputs:
n = 1, name = "foo"
The library cannot offer any such functionality because it is currently impossible to obtain the name of a member of a class as value of an object.
If you want to output field names, you need to declare string objects mapped to the members and implement an operator<< that uses these strings manually.
To do this a more sophisticated reflection library would probably offer macros to use in the definition of the members. Macros can expand their argument(s) into a declaration using the provided name as identifier while also producing code using the name as string literal (via the # macro replacement operator).
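As an illustration of that macro idea, here is a small "X-macro" sketch that does not use Boost at all. The macro names S_FIELDS, DECLARE_FIELD and PRINT_FIELD and the function describe are hypothetical; the point is only that one field list can expand both into member declarations and into name strings via the # operator:

```cpp
#include <sstream>
#include <string>

// Hypothetical field list for struct S: X(type, name) once per member.
#define S_FIELDS(X) \
    X(int, n)       \
    X(std::string, name)

// Expands one entry into a member declaration.
#define DECLARE_FIELD(type, name) type name;

// Expands one entry into "name = value" output; #name yields the string literal.
#define PRINT_FIELD(type, name) \
    out << (first ? "" : ", ") << #name << " = " << obj.name; first = false;

struct S {
    S_FIELDS(DECLARE_FIELD)
};

// Builds "n = 1, name = foo" style text from the same field list.
inline std::string describe(const S& obj) {
    std::ostringstream out;
    bool first = true;
    S_FIELDS(PRINT_FIELD)
    return out.str();
}
```

The struct stays an aggregate, so S o{1, "foo"}; still works, and describe(o) walks the members in declaration order. The obvious cost is that the member list now lives inside a macro.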
It's stupid but hey, with a stringifying macro per field it could be enough for you.
C++14, no additional library
#include <boost/pfr.hpp>
struct S
{
int n;
std::string name;
static char const* const s_memNames[2];
};
char const* const S::s_memNames[2] = {"n", "name"};
// utility
template< size_t I, typename TR >
char const* MemberName()
{
using T = std::remove_reference_t<TR>;
if (I < sizeof(T::s_memNames) / sizeof(T::s_memNames[0])) // std::size would require C++17
return T::s_memNames[I];
return nullptr;
}
// test:
#include <iostream>
using std::cout;
template< size_t I, typename T >
void StreamAt(T&& inst)
{
char const* n = MemberName<I,T>();
auto& v = boost::pfr::get<I>(inst);
cout << "(" << n << " = " << v << ")";
}
int main()
{
S s{2, "boo"};
boost::pfr::for_each_field(s, [&](const auto&, auto I)
{
StreamAt<decltype(I)::value>(s);
cout << "\n";
});
}
output:
(n = 2)
(name = boo)
(previous version of the suggestion, this one has more fluff so less interesting)
#include <boost/pfr.hpp>
// library additions:
static char const* g_names[100];
template< size_t V >
struct Id : std::integral_constant<size_t, V > {};
template< size_t I, typename T >
using TypeAt = boost::pfr::tuple_element_t<I, T>;
template<std::size_t Pos, class Struct>
constexpr int Ni() // name index
{
return std::tuple_element_t<Pos, typename std::remove_reference_t<Struct>::NamesAt >::value;
}
struct StaticCaller
{
template< typename Functor >
StaticCaller(Functor f) { f();}
};
///
/// YOUR CODE HERE
struct S
{
using NamesAt = std::tuple<Id<__COUNTER__>, Id<__COUNTER__>>; // add this
int n;
std::string name;
static void Init() // add this
{
g_names[Ni<0,S>()] = "n";
g_names[Ni<1,S>()] = "name";
}
};
StaticCaller g_sc__LINE__(S::Init); // add this
// utilities
template< size_t I, typename T >
auto GetValueName(T&& inst)
{
return std::make_pair(boost::pfr::get<I>(inst), g_names[Ni<I,T>()]);
}
// test:
#include <iostream>
using std::cout;
template< size_t I, typename T >
void StreamAt(T&& inst)
{
auto const& [v,n] = GetValueName<I>(inst);
cout << "(" << v << ", " << n << ")";
}
int main()
{
S s{2, "boo"};
boost::pfr::for_each_field(s, [&](const auto&, auto I)
{
StreamAt<decltype(I)::value>(s);
cout << "\n";
});
}
output
(2, n)
(boo, name)

Stringify function for generic container in C++

How can I create a generic stringify function that works for any container, including nested containers such as map<string,vector<list<int>>>?
This is my attempt, but it doesn't work.
template<class T>
string stringify(const T& c) {
int length = c.size();
string str = "[";
int i=0;
for (; i <= length-2; i++) {
str += stringfy(c[i]) + ", ";
}
str += c[i] + "]";
return str;
}
It is doable but mostly pointless as you would typically know what kind of data you have to process.
I managed to come up with something like this. It will work for every type that is iterable with for-each loop or a tuple, or has overloaded operator<<. You can do it without C++20 features but it will be a total SFINAE mess.
#include <iostream>
#include <string>
#include <type_traits>
#include <map>
#include <list>
#include <vector>
#include <tuple>
#include <sstream>
using namespace std;
template <typename T>
concept InterableRange = requires (T a) {
std::begin(a);
std::end(a);
};
template <typename T>
concept TupleLikeType = requires (T a) {
std::tuple_size<T>();
};
template<TupleLikeType T>
string stringify(const T& c);
template<class T>
string stringify(const T& c);
template<InterableRange T>
string stringify(const T& c) {
string str = "[ ";
auto size = std::size(c);
std::size_t i = 0;
for (const auto& elem : c) {
str += stringify(elem);
if(i++ < size - 1)
str += ", ";
}
str += " ]";
return str;
}
template<TupleLikeType T>
string stringify(const T& c) {
string str = "[ ";
auto size = std::tuple_size<T>();
std::size_t i = 0;
std::stringstream input;
auto insert = [&input, size, &i](const auto& data) {
input << stringify(data);
if(i++ < size - 1)
{
input.put(',');
input.put(' ');
}
};
std::apply([&insert](const auto&... args){
(insert(args), ...);
}, c);
str += input.str();
str += " ]";
return str;
}
template<class T>
string stringify(const T& c) {
std::stringstream input;
input << c;
return input.str();
}
int main() {
map<string,vector<list<int>>> m {
{ "1", {{1,2}, {3, 4}}},
{ "2", {{10,20}, {30, 40}}}
};
cout << stringify(m);
}
It will print
[ [ [ 1 ], [ [ 1, 2 ], [ 3, 4 ] ] ], [ [ 2 ], [ [ 10, 20 ], [ 30, 40 ] ] ] ]
I did a C++17 solution, the SFINAE is still bearable:
#include <iostream>
#include <set>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>
#include <sstream>
// Forward declarations so we can have arbitrary interactions
template<class Container, class Iter = decltype(cbegin(std::declval<Container>()))> // SFINAE to get only containers
std::string stringify(const Container&c);
template<class T1, class T2>
std::string stringify(const std::pair<T1, T2> &p); // Can we get this into the tuple case?
template<class ...Ts>
std::string stringify(std::tuple<Ts...> &t);
template<class T, class = decltype(std::declval<std::stringstream>() << std::declval<T>())>
std::string stringify(T t) {
std::stringstream s;
s << t;
return s.str();
}
template<class ...Ts>
std::string stringify(std::tuple<Ts...> &t) {
const auto string_comma = [] (const auto & arg) { return stringify(arg) + ", "; };
// This prints a , too much but I am too lazy to fix that
return '(' + std::apply([&] (const auto& ...args) { return (string_comma(args) + ...); }, t) + ')';
}
template<class T1, class T2>
std::string stringify(const std::pair<T1, T2> &p) {
return '(' + stringify(p.first) + ", " + stringify(p.second) + ')';
}
template<class Iter>
std::string stringify(Iter begin, Iter end) {
std::string ret{'['};
for(; begin != end;) {
ret += stringify(*begin);
if(++begin != end) {
ret += ", ";
}
}
ret+=']';
return ret;
}
template<class Container, class Iter>
std::string stringify(const Container&c) {
return stringify(cbegin(c), cend(c));
}
int main () {
std::set<std::vector<std::map<int, char>>> v {{{{1, 'A'}, {2, 'E'}}, {{2, 'B'}}}, {{{2, 'C'}}, {{3, 'D'}}}};
std::tuple tup {1.0, "HELLO WORLD", std::pair{67, 42}};
std::cout << stringify(begin(v), end(v)) << '\n';
std::cout << stringify(tup) << '\n';
}
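Regarding the trailing comma that the tuple overload above leaves in its output: one possible way around it is to emit the separator before every element except the first. The sketch below is a simplified, standalone variant (the name stringify_tuple is mine, and it streams elements with operator<< directly instead of calling stringify recursively):

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>

// Hypothetical helper: join tuple elements as "(a, b, c)" with no trailing comma.
template <class... Ts>
std::string stringify_tuple(const std::tuple<Ts...>& t) {
    std::ostringstream out;
    out << '(';
    std::apply(
        [&out](const auto&... args) {
            std::size_t i = 0;
            // Comma fold: the separator is written before every element but the first.
            ((out << (i++ == 0 ? "" : ", ") << args), ...);
        },
        t);
    out << ')';
    return out.str();
}
```

In the answer's code the same trick would apply inside the std::apply lambda, with stringify(args) in place of the direct stream insertion.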

Spirit X3: parser with internal state

I want to efficiently parse large CSV-like files, whose order of columns I get at runtime. With Spirit Qi, I would parse each field with a lazy auxiliary parser that would select at runtime which column-specific parser to apply to each column. But X3 doesn't seem to have lazy (even though it's listed in the documentation). After reading recommendations here on SO, I've decided to write a custom parser.
It ended up being pretty nice, but now I've noticed I don't really need the pos variable to be exposed anywhere outside the custom parser itself. I've tried putting it into the custom parser itself and started getting compiler errors stating that the column_value_parser object is read-only. Can I somehow put pos into the parser structure?
Simplified code that gets the compile-time error, with commented out parts of my working version:
#include <iostream>
#include <variant>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/support.hpp>
namespace helpers {
// https://bitbashing.io/std-visit.html
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
}
auto const unquoted_text_field = *(boost::spirit::x3::char_ - ',' - boost::spirit::x3::eol);
struct text { };
struct integer { };
struct real { };
struct skip { };
typedef std::variant<text, integer, real, skip> column_variant;
struct column_value_parser : boost::spirit::x3::parser<column_value_parser> {
typedef boost::spirit::unused_type attribute_type;
std::vector<column_variant>& columns;
// size_t& pos;
size_t pos;
// column_value_parser(std::vector<column_variant>& columns, size_t& pos)
column_value_parser(std::vector<column_variant>& columns)
: columns(columns)
// , pos(pos)
, pos(0)
{ }
template<typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr) const {
auto const saved_f = f;
bool successful = false;
visit(
helpers::overloaded {
[&](skip const&) {
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::omit[unquoted_text_field]);
},
[&](text& c) {
std::string value;
successful = boost::spirit::x3::parse(f, l, unquoted_text_field, value);
if(successful) {
std::cout << "Text: " << value << '\n';
}
},
[&](integer& c) {
int value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::int_, value);
if(successful) {
std::cout << "Integer: " << value << '\n';
}
},
[&](real& c) {
double value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::double_, value);
if(successful) {
std::cout << "Real: " << value << '\n';
}
}
},
columns[pos]);
if(successful) {
pos = (pos + 1) % columns.size();
return true;
} else {
f = saved_f;
return false;
}
}
};
int main(int argc, char *argv[])
{
std::string input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
// Comes from external source.
std::vector<column_variant> columns = {text{}, integer{}, real{}, skip{}};
size_t pos = 0;
boost::spirit::x3::parse(
input.begin(), input.end(),
// (column_value_parser(columns, pos) % ',') % boost::spirit::x3::eol);
(column_value_parser(columns) % ',') % boost::spirit::x3::eol);
}
XY: My goal is to parse ~500 GB of pseudo-CSV files in a reasonable time on a machine with little RAM, convert into a list of (roughly) [row-number, column-name, value], then put into storage. The format is actually a little more complex than CSV: database dumps formatted in a… human-friendly way, with column values being actually several small sublanguages (e.g. dates or, uh, something similar to whole apache log lines stuffed into a single field), and I'm often extracting only one specific part of each column. Different files may have different columns and in different order, which I can only learn by parsing yet another set of files containing original queries. Thankfully, Spirit makes it a breeze…
Three answers:
1. The easiest fix is to make pos a mutable member
2. The X3 hardcore answer is x3::with<>
3. Functional composition
1. Making pos mutable
#include <iostream>
#include <variant>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/support.hpp>
namespace helpers {
// https://bitbashing.io/std-visit.html
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
}
auto const unquoted_text_field = *(boost::spirit::x3::char_ - ',' - boost::spirit::x3::eol);
struct text { };
struct integer { };
struct real { };
struct skip { };
typedef std::variant<text, integer, real, skip> column_variant;
struct column_value_parser : boost::spirit::x3::parser<column_value_parser> {
typedef boost::spirit::unused_type attribute_type;
std::vector<column_variant>& columns;
size_t mutable pos = 0;
struct pos_tag;
column_value_parser(std::vector<column_variant>& columns)
: columns(columns)
{ }
template<typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& /*ctx*/, Other const& /*other*/, Attr& /*attr*/) const {
auto const saved_f = f;
bool successful = false;
visit(
helpers::overloaded {
[&](skip const&) {
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::omit[unquoted_text_field]);
},
[&](text&) {
std::string value;
successful = boost::spirit::x3::parse(f, l, unquoted_text_field, value);
if(successful) {
std::cout << "Text: " << value << '\n';
}
},
[&](integer&) {
int value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::int_, value);
if(successful) {
std::cout << "Integer: " << value << '\n';
}
},
[&](real&) {
double value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::double_, value);
if(successful) {
std::cout << "Real: " << value << '\n';
}
}
},
columns[pos]);
if(successful) {
pos = (pos + 1) % columns.size();
return true;
} else {
f = saved_f;
return false;
}
}
};
int main() {
std::string input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
std::vector<column_variant> columns = {text{}, integer{}, real{}, skip{}};
boost::spirit::x3::parse(
input.begin(), input.end(),
(column_value_parser(columns) % ',') % boost::spirit::x3::eol);
}
2. x3::with<>
This is similar but with better (re)entrancy and encapsulation:
#include <iostream>
#include <variant>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/support.hpp>
namespace helpers {
// https://bitbashing.io/std-visit.html
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
}
auto const unquoted_text_field = *(boost::spirit::x3::char_ - ',' - boost::spirit::x3::eol);
struct text { };
struct integer { };
struct real { };
struct skip { };
typedef std::variant<text, integer, real, skip> column_variant;
struct column_value_parser : boost::spirit::x3::parser<column_value_parser> {
typedef boost::spirit::unused_type attribute_type;
std::vector<column_variant>& columns;
column_value_parser(std::vector<column_variant>& columns)
: columns(columns)
{ }
template<typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx const& ctx, Other const& /*other*/, Attr& /*attr*/) const {
auto const saved_f = f;
bool successful = false;
size_t& pos = boost::spirit::x3::get<pos_tag>(ctx).value;
visit(
helpers::overloaded {
[&](skip const&) {
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::omit[unquoted_text_field]);
},
[&](text&) {
std::string value;
successful = boost::spirit::x3::parse(f, l, unquoted_text_field, value);
if(successful) {
std::cout << "Text: " << value << '\n';
}
},
[&](integer&) {
int value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::int_, value);
if(successful) {
std::cout << "Integer: " << value << '\n';
}
},
[&](real&) {
double value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::double_, value);
if(successful) {
std::cout << "Real: " << value << '\n';
}
}
},
columns[pos]);
if(successful) {
pos = (pos + 1) % columns.size();
return true;
} else {
f = saved_f;
return false;
}
}
template <typename T>
struct Mutable { T mutable value; };
struct pos_tag;
auto invoke() const {
return boost::spirit::x3::with<pos_tag>(Mutable<size_t>{}) [ *this ];
}
};
int main() {
std::string input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
std::vector<column_variant> columns = {text{}, integer{}, real{}, skip{}};
column_value_parser p(columns);
boost::spirit::x3::parse(
input.begin(), input.end(),
(p.invoke() % ',') % boost::spirit::x3::eol);
}
3. Functional Composition
Because it's so much easier in X3, my favourite is to just generate the parser on demand.
Without requirements, this is the simplest I'd propose:
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;
namespace CSV {
struct text { };
struct integer { };
struct real { };
struct skip { };
auto const unquoted_text_field = *~x3::char_(",\n");
static inline auto as_parser(skip) { return x3::omit[unquoted_text_field]; }
static inline auto as_parser(text) { return unquoted_text_field; }
static inline auto as_parser(integer) { return x3::int_; }
static inline auto as_parser(real) { return x3::double_; }
template <typename... Spec>
static inline auto line_parser(Spec... spec) {
auto delim = ',' | &(x3::eoi | x3::eol);
return ((as_parser(spec) >> delim) >> ... >> x3::eps);
}
template <typename... Spec> static inline auto csv_parser(Spec... spec) {
return line_parser(spec...) % x3::eol;
}
}
#include <iostream>
#include <iomanip>
using namespace CSV;
int main() {
std::string const input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
auto f = begin(input), l = end(input);
auto p = csv_parser(text{}, integer{}, real{}, skip{});
if (parse(f, l, p)) {
std::cout << "Parsed\n";
} else {
std::cout << "Failed\n";
}
if (f!=l) {
std::cout << "Remaining: " << std::quoted(std::string(f,l)) << "\n";
}
}
A version with debug information enabled:
<line>
<try>Hello,1,13.7,XXX\nWor</try>
<CSV::text>
<try>Hello,1,13.7,XXX\nWor</try>
<success>,1,13.7,XXX\nWorld,2,</success>
</CSV::text>
<CSV::integer>
<try>1,13.7,XXX\nWorld,2,1</try>
<success>,13.7,XXX\nWorld,2,1e</success>
</CSV::integer>
<CSV::real>
<try>13.7,XXX\nWorld,2,1e3</try>
<success>,XXX\nWorld,2,1e3,YYY</success>
</CSV::real>
<CSV::skip>
<try>XXX\nWorld,2,1e3,YYY</try>
<success>\nWorld,2,1e3,YYY</success>
</CSV::skip>
<success>\nWorld,2,1e3,YYY</success>
</line>
<line>
<try>World,2,1e3,YYY</try>
<CSV::text>
<try>World,2,1e3,YYY</try>
<success>,2,1e3,YYY</success>
</CSV::text>
<CSV::integer>
<try>2,1e3,YYY</try>
<success>,1e3,YYY</success>
</CSV::integer>
<CSV::real>
<try>1e3,YYY</try>
<success>,YYY</success>
</CSV::real>
<CSV::skip>
<try>YYY</try>
<success></success>
</CSV::skip>
<success></success>
</line>
Parsed
Notes, Caveats:
With anything mutable, beware of side-effects. E.g. if you have a | b and a includes column_value_parser, the side-effect of incrementing pos will not be rolled back when a fails and b is matched instead.
In short, this makes your parse function impure.

std::string is passing the std::is_fundamental check when it should not - template metaprogramming

I'm having a problem with an assignment of mine. The question for the assignment is as follows:
Write a function template named Interpolate that will make the below work. Each argument will be output when its corresponding % is encountered in the format string. All output should be ultimately done with the appropriate overloaded << operator. A \% sequence should output a percent sign.
SomeArbitraryClass obj;
int i = 1234;
double x = 3.14;
std::string str("foo");
std::cout << Interpolate(R"(i=%, x1=%, x2=%\%, str1=%, str2=%, obj=%)", i, x, 1.001, str, "hello", obj) << std::endl;
If there is a mismatch between the number of percent signs and the number of arguments to output, throw an exception of type cs540::WrongNumberOfArgs.
Now, I've started to write the code to make it work. However, I'm running into a problem using non-PODs. Here is what I have written so far:
#include <iostream>
#include <sstream>
#include <string>
#include <type_traits>
std::string Interpolate(std::string raw_string) {
std::size_t found = raw_string.find_first_of("%");
if(found != std::string::npos && raw_string[found-1] != '\\') {
std::cout << "Throw cs540::ArgsMismatchException" << std::endl;
}
return raw_string;
}
template <typename T, typename ...Args>
std::string Interpolate(std::string raw_string, T arg_head, Args... arg_tail) {
std::size_t found = raw_string.find_first_of("%");
while(found != 0 && raw_string[found-1] == '\\') {
found = raw_string.find_first_of("%", found + 1);
}
if(found == std::string::npos) {
std::cout << "Throw cs540::ArgsMismatchException." << std::endl;
}
// Checking the typeid of the arg_head, and converting it to a string, and concatenating the strings together.
else {
if(std::is_arithmetic<T>::value) {
raw_string = raw_string.substr(0, found) + std::to_string(arg_head) + raw_string.substr(found + 1, raw_string.size());
}
}
return Interpolate(raw_string, arg_tail...);
}
int main(void) {
int i = 24332;
float x = 432.321;
std::string str1("foo");
//Works
std::cout << Interpolate(R"(goo % goo % goo)", i, x) << std::endl;
// Does not work, even though I'm not actually doing anything with the string argument
std::cout << Interpolate(R"(goo %)", str1) << std::endl;
}
Semantically, this is a run-time check. That means the code inside the {} is compiled even if the condition is always false. With T = std::string, std::to_string(arg_head) has no matching overload, so compilation fails:
if(std::is_arithmetic<T>::value) {
raw_string = raw_string.substr(0, found) + std::to_string(arg_head) + raw_string.substr(found + 1, raw_string.size());
}
To fix this, you can dispatch on the trait with overloads (note that found has to be passed in, since it is a local of the caller):
template<typename T>
void do_arithmetic( std::string& raw_string, std::size_t found, T&& t, std::true_type /* is_arithmetic */ ) {
raw_string = raw_string.substr(0, found) + std::to_string(std::forward<T>(t)) + raw_string.substr(found + 1, raw_string.size());
}
template<typename T>
void do_arithmetic( std::string&, std::size_t, T&&, std::false_type /* is_arithmetic */ ) {
// do nothing
}
then put in your code:
do_arithmetic( raw_string, found, arg_head, std::is_arithmetic<T>() );
which does a compile-time branch. std::is_arithmetic<T> derives from either std::true_type or std::false_type, depending on whether T is arithmetic, so overload resolution picks a different do_arithmetic in each case and the std::to_string call is only instantiated for arithmetic types.
In C++14 (known as C++1y before it was finalized) you can do this inline with generic lambdas.
template<typename F, typename...Args>
void do_if(std::true_type, F&& f, Args&&... args){
std::forward<F>(f)( std::forward<Args>(args)... );
}
template<typename...Args>
void do_if(std::false_type, Args&&...){
}
template<bool b,typename...Args>
void do_if_not(std::integral_constant<bool,b>, Args&&... args){
do_if( std::integral_constant<bool,!b>{}, std::forward<Args>(args)... );
}
template<typename C, typename F_true, typename F_false, typename...Args>
void branch( C c, F_true&&f1, F_false&& f0, Args&&... args ){
do_if(c, std::forward<F_true>(f1), std::forward<Args>(args)... );
do_if_not(c, std::forward<F_false>(f0), std::forward<Args>(args)... );
}
which is boilerplate. We can then do in our function:
do_if(std::is_arithmetic<T>{},
[&](auto&& arg_head){
raw_string = raw_string.substr(0, found) + std::to_string(arg_head) + raw_string.substr(found + 1, raw_string.size());
},
arg_head
);
or, if you want both branches:
branch(std::is_arithmetic<T>{},
[&](auto&& x){
raw_string = std::to_string(x); // blah blah
}, [&](auto&&) {
// else case
},
arg_head
);
and the first lambda only gets instantiated with x = arg_head if is_arithmetic is true.
Needs polish, but sort of neat.