Hi, I'm using boost::pfr for basic reflection. It works fine, but it only prints or otherwise deals with the field values: boost::pfr::io prints each member of the struct, and the functor passed to for_each_field only receives the values, not the names. How can I print the fields as name/value pairs? In other words, how can I get the field names?
struct S {
int n;
std::string name;
};
S o{1, "foo"};
std::cout << boost::pfr::io(o);
// Outputs: {1, "foo"}, how can I get n = 1, name = "foo"?
If you think adapting a struct is not too intrusive (it doesn't change your existing definitions, and you don't even need to have it in a public header):
BOOST_FUSION_ADAPT_STRUCT(S, n, name)
Then you can concoct a general operator<< for sequences:
namespace BF = boost::fusion;
template <typename T,
typename Enable = std::enable_if_t<
// BF::traits::is_sequence<T>::type::value>
std::is_same_v<BF::struct_tag, typename BF::traits::tag_of<T>::type>>>
std::ostream& operator<<(std::ostream& os, T const& v)
{
bool first = true;
auto visitor = [&]<size_t I>() {
os << (std::exchange(first, false) ? "" : ", ")
<< BF::extension::struct_member_name<T, I>::call()
<< " = " << BF::at_c<I>(v);
};
// visit members
[&]<size_t... II>(std::index_sequence<II...>)
{
return (visitor.template operator()<II>(), ...);
}
(std::make_index_sequence<BF::result_of::size<T>::type::value>{});
return os;
}
(Prior to C++20 this would require explicit helper templates instead of the templated lambdas, perhaps making it more readable. I guess I'm lazy...)
Here's a live demo: Live On Compiler Explorer
n = 1, name = foo
Bonus: Correctly quoting string-like types
Live On Compiler Explorer
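#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/for_each.hpp>
#include <boost/fusion/include/at_c.hpp>
#include <iostream>
#include <iomanip>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility> // std::exchange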
namespace MyLib {
struct S {
int n;
std::string name;
};
namespace BF = boost::fusion;
static auto inline pretty(std::string_view sv) { return std::quoted(sv); }
template <typename T,
typename Enable = std::enable_if_t<
not std::is_constructible_v<std::string_view, T const&>>>
static inline T const& pretty(T const& v)
{
return v;
}
template <typename T,
typename Enable = std::enable_if_t<
// BF::traits::is_sequence<T>::type::value>
std::is_same_v<BF::struct_tag, typename BF::traits::tag_of<T>::type>>>
std::ostream& operator<<(std::ostream& os, T const& v)
{
bool first = true;
auto visitor = [&]<size_t I>() {
os << (std::exchange(first, false) ? "" : ", ")
<< BF::extension::struct_member_name<T, I>::call()
<< " = " << pretty(BF::at_c<I>(v));
};
// visit members
[&]<size_t... II>(std::index_sequence<II...>)
{
return (visitor.template operator()<II>(), ...);
}
(std::make_index_sequence<BF::result_of::size<T>::type::value>{});
return os;
}
} // namespace MyLib
BOOST_FUSION_ADAPT_STRUCT(MyLib::S, n, name)
int main()
{
MyLib::S o{1, "foo"};
std::cout << o << "\n";
}
Outputs:
n = 1, name = "foo"
The library cannot offer any such functionality because it is currently impossible to obtain the name of a class member as a value at run time.
If you want to output field names, you need to declare string objects mapped to the members and implement an operator<< that uses these strings manually.
To do this a more sophisticated reflection library would probably offer macros to use in the definition of the members. Macros can expand their argument(s) into a declaration using the provided name as identifier while also producing code using the name as string literal (via the # macro replacement operator).
It's stupid but hey, with a stringifying macro per field it could be enough for you.
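C++17 (std::size needs it; the rest is C++14), no additional library
#include <boost/pfr.hpp>
#include <cstddef>
#include <iterator> // std::size
#include <string>
#include <type_traits>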
struct S
{
int n;
std::string name;
static char const* const s_memNames[2];
};
char const* const S::s_memNames[2] = {"n", "name"};
// utility
template< size_t I, typename TR >
char const* MemberName()
{
using T = std::remove_reference_t<TR>;
if (I < std::size(T::s_memNames))
return T::s_memNames[I];
return nullptr;
}
// test:
#include <iostream>
using std::cout;
template< size_t I, typename T >
void StreamAt(T&& inst)
{
char const* n = MemberName<I,T>();
auto& v = boost::pfr::get<I>(inst);
cout << "(" << n << " = " << v << ")";
}
int main()
{
S s{2, "boo"};
boost::pfr::for_each_field(s, [&](const auto&, auto I)
{
StreamAt<decltype(I)::value>(s);
cout << "\n";
});
}
output:
(n = 2)
(name = boo)
(previous version of the suggestion, this one has more fluff so less interesting)
#include <boost/pfr.hpp>
// library additions:
static char const* g_names[100];
template< size_t V >
struct Id : std::integral_constant<size_t, V > {};
template< size_t I, typename T >
using TypeAt = boost::pfr::tuple_element_t<I, T>;
template<std::size_t Pos, class Struct>
constexpr int Ni() // name index
{
return std::tuple_element_t<Pos, typename std::remove_reference_t<Struct>::NamesAt >::value;
}
struct StaticCaller
{
template< typename Functor >
StaticCaller(Functor f) { f();}
};
///
/// YOUR CODE HERE
struct S
{
using NamesAt = std::tuple<Id<__COUNTER__>, Id<__COUNTER__>>; // add this
int n;
std::string name;
static void Init() // add this
{
g_names[Ni<0,S>()] = "n";
g_names[Ni<1,S>()] = "name";
}
};
StaticCaller g_sc__LINE__(S::Init); // add this
// utilities
template< size_t I, typename T >
auto GetValueName(T&& inst)
{
return std::make_pair(boost::pfr::get<I>(inst), g_names[Ni<I,T>()]);
}
// test:
#include <iostream>
using std::cout;
template< size_t I, typename T >
void StreamAt(T&& inst)
{
auto const& [v,n] = GetValueName<I>(inst);
cout << "(" << v << ", " << n << ")";
}
int main()
{
S s{2, "boo"};
boost::pfr::for_each_field(s, [&](const auto&, auto I)
{
StreamAt<decltype(I)::value>(s);
cout << "\n";
});
}
output
(2, n)
(boo, name)
Following up on the question "Heterogenous vectors of pointers. How to call functions",
I would like to know how to identify null pointers inside a vector of boost::variant.
Example code:
#include <boost/variant.hpp>
#include <iostream>
#include <vector>
template< typename T>
class A
{
public:
A(){}
~A(){}
void write();
private:
T data;
};
template< typename T>
void A<T>::write()
{
std::cout << data << std::endl;
}
class myVisitor
: public boost::static_visitor<>
{
public:
template< typename T>
void operator() (A<T>* a) const
{
a->write();
}
};
int main()
{
A<int> one;
A<double> two;
typedef boost::variant<A<int>*, A<double>* > registry;
std::vector<registry> v;
v.push_back(&one);
v.push_back(&two);
A<int>* tst = new A<int>;
for(auto x: v)
{
boost::apply_visitor(myVisitor(), x);
try {delete tst; tst = nullptr;}
catch (...){}
}
}
Since I am deleting the pointer, I would hope that the last iteration gives me an error or something. How can I check whether an entry in the vector is pointing to nullptr?
Note: this partly ignores the X/Y of this question, based on the tandem question (Heterogenous vectors of pointers. How to call functions)
What you seem to be after is polymorphic collections, but not with a virtual type hierarchy.
This is known as type erasure, and Boost Type Erasure is conveniently wrapped for exactly this use case with Boost PolyCollection.
The type erased variation would probably look like any_collection:
Live On Coliru
#include <boost/variant.hpp>
#include <cmath>
#include <iostream>
#include <vector>
#include <boost/poly_collection/any_collection.hpp>
#include <boost/type_erasure/member.hpp>
namespace pc = boost::poly_collection;
BOOST_TYPE_ERASURE_MEMBER(has_write, write)
using writable = has_write<void()>;
template <typename T> class A {
public:
A(T value = 0) : data(value) {}
// A() = default; // rule of zero
//~A() = default;
void write() const { std::cout << data << std::endl; }
private:
T data/* = 0*/;
};
int main()
{
pc::any_collection<writable> registry;
A<int> one(314);
A<double> two(M_PI);
registry.insert(one);
registry.insert(two);
for (auto& w : registry) {
w.write();
}
}
Prints
3.14159
314
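Note that insertion order is preserved only within each type: iteration goes type-by-type (segment by segment), which is why the A<double> prints before the A<int> here. That contiguous per-type storage is also what makes PolyCollection much more efficient than "regular" containers that store pointers or do not optimize allocation sizes.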
BONUS: Natural printing operator<<
Using classical dynamic polymorphism, this would not work without adding virtual methods, but with Boost TypeErasure ostreamable is a ready-made concept:
Live On Coliru
#include <boost/variant.hpp>
#include <cmath>
#include <iostream>
#include <vector>
#include <boost/poly_collection/any_collection.hpp>
#include <boost/type_erasure/operators.hpp>
namespace pc = boost::poly_collection;
using writable = boost::type_erasure::ostreamable<>;
template <typename T> class A {
public:
A(T value = 0) : data(value) {}
// A() = default; // rule of zero
//~A() = default;
private:
friend std::ostream& operator<<(std::ostream& os, A const& a) {
return os << a.data;
}
T data/* = 0*/;
};
int main()
{
pc::any_collection<writable> registry;
A<int> one(314);
A<double> two(M_PI);
registry.insert(one);
registry.insert(two);
for (auto& w : registry) {
std::cout << w << "\n";
}
}
Printing the same as before.
UPDATE
To the comment:
I want to create n A<someType> variables (these are big objects). All of these variables have a write function to write something to a file.
My idea is to collect all the pointers of these variables and at the end loop through the vector to call each write function. Now, it might happen that I want to allocate memory and delete a A<someType> variable. If this happens it should not execute the write function.
This sounds like one of the rare occasions where shared_ptr makes sense, because it allows you to observe the object's lifetime using weak_ptr.
Object Graph Imagined...
Let's invent a node type that can participate in a pretty large object graph, such that you would keep an "index" of pointers to some of its nodes. For this demonstration, I'll make it a tree-structured graph, and we're going to keep References to the leaf nodes:
using Object = std::shared_ptr<struct INode>;
using Reference = std::weak_ptr<struct INode>;
Now, let's add identification to the Node base so we have an arbitrary way to identify nodes to delete (e.g. all nodes with odd ids). In addition, any node can have child nodes, so let's put that in the base node as well:
static size_t s_idgen = 0; // source of unique node ids
struct INode {
virtual void write(std::ostream& os) const = 0;
std::vector<Object> children;
size_t id() const { return _id; }
private:
size_t _id = s_idgen++;
};
Now we need some concrete derived node types:
template <typename> struct Node : INode {
void write(std::ostream& os) const override;
};
using Root = Node<struct root_tag>;
using Banana = Node<struct banana_tag>;
using Pear = Node<struct pear_tag>;
using Bicycle = Node<struct bicycle_tag>;
// etc
Yeah. Imagination is not my strong suit ¯\_(ツ)_/¯
Generate Random Data
// generating demo data
#include <random>
#include <functional>
#include <array>
static std::mt19937 s_prng{std::random_device{}()};
static std::uniform_int_distribution<size_t> s_num_children(0, 3);
Object generate_object_graph(Object node, unsigned max_depth = 10) {
std::array<std::function<Object()>, 3> factories = {
[] { return std::make_shared<Banana>(); },
[] { return std::make_shared<Pear>(); },
[] { return std::make_shared<Bicycle>(); },
};
for(auto n = s_num_children(s_prng); max_depth && n--;) {
auto pick = factories.at(s_prng() % factories.size());
node->children.push_back(generate_object_graph(pick(), max_depth - 1));
}
return node;
}
Nothing fancy. Just a randomly generated tree with a max_depth and random distribution of node types.
write to Pretty-Print
Let's add some logic to display any object graph with indentation:
// for demo output
#include <boost/core/demangle.hpp>
template <typename Tag> void Node<Tag>::write(std::ostream& os) const {
os << boost::core::demangle(typeid(Tag*).name()) << "(id:" << id() << ") {";
if (not children.empty()) {
for (auto& ch : children) {
ch->write(os << linebreak << "- " << indent);
os << unindent;
}
os << linebreak;
}
os << "}";
}
To keep track of the indentation level I'll define these indent/unindent manipulators, which modify some custom state inside the stream object:
static auto s_indent = std::ios::xalloc();
std::ostream& indent(std::ostream& os) { return os.iword(s_indent) += 3, os; }
std::ostream& unindent(std::ostream& os) { return os.iword(s_indent) -= 3, os; }
std::ostream& linebreak(std::ostream& os) {
return os << "\n" << std::setw(os.iword(s_indent)) << "";
}
That should do.
Getting Leaf Nodes
Leaf nodes are the nodes without any children.
This is a depth-first tree visitor taking any output iterator:
template <typename Out>
Out get_leaf_nodes(Object const& tree, Out out) {
if (tree) {
if (tree->children.empty()) {
*out++ = tree; // that's a leaf node!
} else {
for (auto& ch : tree->children) {
out = get_leaf_nodes(ch, out); // keep the advanced iterator
}
}
}
return out;
}
Removing some nodes:
Yet another depth-first visitor:
template <typename Pred>
size_t remove_nodes_if(Object tree, Pred predicate)
{
size_t n = 0;
if (!tree)
return n;
auto& c = tree->children;
// depth first
for (auto& child : c)
n += remove_nodes_if(child, predicate);
auto e = std::remove_if(begin(c), end(c), predicate);
n += std::distance(e, end(c));
c.erase(e, end(c));
return n;
}
DEMO TIME
Tying it all together, we can print a randomly generated graph:
int main()
{
auto root = generate_object_graph(std::make_shared<Root>());
root->write(std::cout);
This puts all its leaf node References in a container:
std::list<Reference> leafs;
get_leaf_nodes(root, back_inserter(leafs));
Which we can print using their write() methods:
std::cout << "\nLeafs: " << leafs.size();
for (Reference& ref : leafs)
if (Object alive = ref.lock())
alive->write(std::cout << " ");
Of course all the leafs are still alive. But we can change that! We will remove one in 5 nodes by id:
auto _2mod5 = [](Object const& node) { return (2 == node->id() % 5); };
std::cout << "\nRemoved " << remove_nodes_if(root, _2mod5) << " 2mod5 nodes from graph\n";
std::cout << "\n(Stale?) Leafs: " << leafs.size();
The reported number of leaf nodes would still seem the same. That's... not what you wanted. Here's where your question comes in: how do we detect the nodes that were deleted?
leafs.remove_if(std::mem_fn(&Reference::expired));
std::cout << "\nLive leafs: " << leafs.size();
Now the count will accurately reflect the number of leaf nodes remaining.
Live On Coliru
#include <algorithm> // std::remove_if
#include <iterator>  // std::distance, std::back_inserter
#include <memory>
#include <vector>
#include <ostream>
using Object = std::shared_ptr<struct INode>;
using Reference = std::weak_ptr<struct INode>;
static size_t s_idgen = 0;
struct INode {
virtual void write(std::ostream& os) const = 0;
std::vector<Object> children;
size_t id() const { return _id; }
private:
size_t _id = s_idgen++;
};
template <typename> struct Node : INode {
void write(std::ostream& os) const override;
};
using Root = Node<struct root_tag>;
using Banana = Node<struct banana_tag>;
using Pear = Node<struct pear_tag>;
using Bicycle = Node<struct bicycle_tag>;
// etc
// for demo output
#include <boost/core/demangle.hpp>
#include <iostream>
#include <iomanip>
static auto s_indent = std::ios::xalloc();
std::ostream& indent(std::ostream& os) { return os.iword(s_indent) += 3, os; }
std::ostream& unindent(std::ostream& os) { return os.iword(s_indent) -= 3, os; }
std::ostream& linebreak(std::ostream& os) {
return os << "\n" << std::setw(os.iword(s_indent)) << "";
}
template <typename Tag> void Node<Tag>::write(std::ostream& os) const {
os << boost::core::demangle(typeid(Tag*).name()) << "(id:" << id() << ") {";
if (not children.empty()) {
for (auto& ch : children) {
ch->write(os << linebreak << "- " << indent);
os << unindent;
}
os << linebreak;
}
os << "}";
}
// generating demo data
#include <random>
#include <functional>
#include <array>
static std::mt19937 s_prng{std::random_device{}()};
static std::uniform_int_distribution<size_t> s_num_children(0, 3);
Object generate_object_graph(Object node, unsigned max_depth = 10) {
std::array<std::function<Object()>, 3> factories = {
[] { return std::make_shared<Banana>(); },
[] { return std::make_shared<Pear>(); },
[] { return std::make_shared<Bicycle>(); },
};
for(auto n = s_num_children(s_prng); max_depth && n--;) {
auto pick = factories.at(s_prng() % factories.size());
node->children.push_back(generate_object_graph(pick(), max_depth - 1));
}
return node;
}
template <typename Out>
Out get_leaf_nodes(Object const& tree, Out out) {
if (tree) {
if (tree->children.empty()) {
*out++ = tree;
} else {
for (auto& ch : tree->children) {
out = get_leaf_nodes(ch, out); // keep the advanced iterator
}
}
}
return out;
}
template <typename Pred>
size_t remove_nodes_if(Object tree, Pred predicate)
{
size_t n = 0;
if (!tree)
return n;
auto& c = tree->children;
// depth first
for (auto& child : c)
n += remove_nodes_if(child, predicate);
auto e = std::remove_if(begin(c), end(c), predicate);
n += std::distance(e, end(c));
c.erase(e, end(c));
return n;
}
#include <list>
int main()
{
auto root = generate_object_graph(std::make_shared<Root>());
root->write(std::cout);
std::list<Reference> leafs;
get_leaf_nodes(root, back_inserter(leafs));
std::cout << "\n------------"
<< "\nLeafs: " << leafs.size();
for (Reference& ref : leafs)
if (Object alive = ref.lock())
alive->write(std::cout << " ");
auto _2mod5 = [](Object const& node) { return (2 == node->id() % 5); };
std::cout << "\nRemoved " << remove_nodes_if(root, _2mod5) << " 2mod5 nodes from graph\n";
std::cout << "\n(Stale?) Leafs: " << leafs.size();
// some of them are not alive, see which are gone ("detecting the null pointers")
leafs.remove_if(std::mem_fn(&Reference::expired));
std::cout << "\nLive leafs: " << leafs.size();
}
Prints e.g.
root_tag*(id:0) {
- bicycle_tag*(id:1) {}
- bicycle_tag*(id:2) {
- pear_tag*(id:3) {}
}
- bicycle_tag*(id:4) {
- bicycle_tag*(id:5) {}
- bicycle_tag*(id:6) {}
}
}
------------
Leafs: 4 bicycle_tag*(id:1) {} pear_tag*(id:3) {} bicycle_tag*(id:5) {} bicycle_tag*(id:6) {}
Removed 1 2mod5 nodes from graph
(Stale?) Leafs: 4
Live leafs: 3
Or see the COLIRU link for a much larger sample.
I have some var = std::variant<std::monostate, a, b, c>, where a, b, c are some types.
How, at runtime, do I check what type var contains?
In the official documentation I found that if var contains a different type and I write std::get<b>(var), I get an exception. So I thought about this solution:
try {
std::get<a>(var);
// Do something
} catch(const std::bad_variant_access&) {
try {
std::get<b>(var);
// Do something else
} catch(const std::bad_variant_access&) {
try {
std::get<c>(var);
// Another else
} catch (const std::bad_variant_access&) {
// std::monostate
}
}
}
But it's so complicated and ugly! Is there a simpler way to check what type std::variant contains?
std::visit is the way to go.
There is even the overloaded helper, which lets you write the visitor inline:
// helper type for the visitor
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
// explicit deduction guide (not needed as of C++20)
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
and so:
std::visit(overloaded{
[](std::monostate&){/*..*/},
[](a&){/*..*/},
[](b&){/*..*/},
[](c&){/*..*/}
}, var);
To use chained if-branches instead, you might use std::get_if, which takes a pointer to the variant and returns nullptr when the requested type is not the active one:
if (auto* v = std::get_if<a>(&var)) {
// ...
} else if (auto* v = std::get_if<b>(&var)) {
// ...
} else if (auto* v = std::get_if<c>(&var)) {
// ...
} else { // std::monostate
// ...
}
The simplest way is to switch on the current std::variant::index(). This approach requires your types (std::monostate, A, B, C) to always stay in the same order.
// I omitted C to keep the example simpler, the principle is the same
using my_variant = std::variant<std::monostate, A, B>;
void foo(my_variant &v) {
switch (v.index()) {
case 0: break; // do nothing because the type is std::monostate
case 1: {
doSomethingWith(std::get<A>(v));
break;
}
case 2: {
doSomethingElseWith(std::get<B>(v));
break;
}
}
}
If your callable works with any type, you can also use std::visit:
void bar(my_variant &v) {
std::visit([](auto &&arg) -> void {
// Here, arg is std::monostate, A or B
// This lambda needs to compile with all three options.
// The lambda returns void because we don't modify the variant, so
// we could also use const& arg.
}, v);
}
If you don't want std::visit to accept std::monostate, then just check if the index is 0. Once again, this relies on std::monostate being the first type of the variant, so it is good practice to always make it the first.
You can also detect the type using if-constexpr inside the callable. With this approach, the arguments don't have to be in the same order anymore:
void bar(my_variant &v) {
v = std::visit([](auto &&arg) -> my_variant { // assign the result back to modify v
using T = std::decay_t<decltype(arg)>;
if constexpr (std::is_same_v<std::monostate, T>) {
return arg; // arg is std::monostate here
}
else if constexpr (std::is_same_v<A, T>) {
return arg + arg; // arg is A here
}
else if constexpr (std::is_same_v<B, T>) {
return arg * arg; // arg is B here
}
}, v);
}
Note that the first lambda returns void because it just processes the current value of the variant. If you want to modify the variant, your lambda needs to return my_variant again.
You could use an overloaded visitor inside std::visit to handle A or B separately. See std::visit for more examples.
You can use standard std::visit
Usage example:
#include <variant>
#include <iostream>
#include <type_traits>
struct a {};
struct b {};
struct c {};
int main()
{
std::variant<a, b, c> var = a{};
std::visit([](auto&& arg) {
using T = std::decay_t<decltype(arg)>;
if constexpr (std::is_same_v<T, a>)
std::cout << "is an a" << '\n';
else if constexpr (std::is_same_v<T, b>)
std::cout << "is a b" << '\n';
else if constexpr (std::is_same_v<T, c>)
std::cout << "is a c" << '\n';
else
std::cout << "is not in variant type list" << '\n';
}, var);
}
Well, with some macro magic, you can do something like:
#include <variant>
#include <type_traits>
#include <iostream>
#define __X_CONCAT_1(x,y) x ## y
#define __X_CONCAT(x,y) __X_CONCAT_1(x,y)
template <typename T>
struct __helper { };
// extract the type from a declaration
// we use function-type magic to get that: typename __helper<void ( (declaration) )>::type
// declaration is "int &x" for example, this class template extracts "int"
template <typename T>
struct __helper<void (T)> {
using type = std::remove_reference_t<T>;
};
#define variant_if(variant, declaration) \
if (bool __X_CONCAT(variant_if_bool_, __LINE__) = true; auto * __X_CONCAT(variant_if_ptr_, __LINE__) = std::get_if<typename __helper<void ( (declaration) )>::type>(&(variant))) \
for (declaration = * __X_CONCAT(variant_if_ptr_, __LINE__); __X_CONCAT(variant_if_bool_, __LINE__); __X_CONCAT(variant_if_bool_, __LINE__) = false)
#define variant_switch(variant) if (auto &__variant_switch_v = (variant); true)
#define variant_case(x) variant_if(__variant_switch_v, x)
int main() {
std::variant<int, long> v = 12;
std::variant<int, long> w = 32l;
std::cout << "variant_if test" << std::endl;
variant_if(v, int &x) {
std::cout << "int = " << x << std::endl;
}
else variant_if(v, long &x) {
std::cout << "long = " << x << std::endl;
}
std::cout << "variant_switch test" << std::endl;
variant_switch(v) {
variant_case(int &x) {
std::cout << "int = " << x << std::endl;
variant_switch (w) {
variant_case(int &x) {
std::cout << "int = " << x << std::endl;
}
variant_case(long &x) {
std::cout << "long = " << x << std::endl;
}
}
};
variant_case(long &x) {
std::cout << "long = " << x << std::endl;
variant_switch (w) {
variant_case(int &x) {
std::cout << "int = " << x << std::endl;
}
variant_case(long &x) {
std::cout << "long = " << x << std::endl;
}
}
};
}
return 0;
}
I tested this approach with GCC and Clang, no guarantees for MSVC.
I want to efficiently parse large CSV-like files, whose order of columns I get at runtime. With Spirit Qi, I would parse each field with a lazy auxiliary parser that would select at runtime which column-specific parser to apply to each column. But X3 doesn't seem to have lazy (despite that it's listed in documentation). After reading recommendations here on SO, I've decided to write a custom parser.
It ended up being pretty nice, but now I've noticed I don't really need the pos variable to be exposed anywhere outside the custom parser. I've tried moving it into the custom parser itself and started getting compiler errors stating that the column_value_parser object is read-only. Can I somehow put pos into the parser structure?
Simplified code that gets the compile-time error, with commented out parts of my working version:
#include <iostream>
#include <variant>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/support.hpp>
namespace helpers {
// https://bitbashing.io/std-visit.html
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
}
auto const unquoted_text_field = *(boost::spirit::x3::char_ - ',' - boost::spirit::x3::eol);
struct text { };
struct integer { };
struct real { };
struct skip { };
typedef std::variant<text, integer, real, skip> column_variant;
struct column_value_parser : boost::spirit::x3::parser<column_value_parser> {
typedef boost::spirit::unused_type attribute_type;
std::vector<column_variant>& columns;
// size_t& pos;
size_t pos;
// column_value_parser(std::vector<column_variant>& columns, size_t& pos)
column_value_parser(std::vector<column_variant>& columns)
: columns(columns)
// , pos(pos)
, pos(0)
{ }
template<typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& ctx, Other const& other, Attr& attr) const {
auto const saved_f = f;
bool successful = false;
visit(
helpers::overloaded {
[&](skip const&) {
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::omit[unquoted_text_field]);
},
[&](text& c) {
std::string value;
successful = boost::spirit::x3::parse(f, l, unquoted_text_field, value);
if(successful) {
std::cout << "Text: " << value << '\n';
}
},
[&](integer& c) {
int value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::int_, value);
if(successful) {
std::cout << "Integer: " << value << '\n';
}
},
[&](real& c) {
double value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::double_, value);
if(successful) {
std::cout << "Real: " << value << '\n';
}
}
},
columns[pos]);
if(successful) {
pos = (pos + 1) % columns.size();
return true;
} else {
f = saved_f;
return false;
}
}
};
int main(int argc, char *argv[])
{
std::string input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
// Comes from external source.
std::vector<column_variant> columns = {text{}, integer{}, real{}, skip{}};
size_t pos = 0;
boost::spirit::x3::parse(
input.begin(), input.end(),
// (column_value_parser(columns, pos) % ',') % boost::spirit::x3::eol);
(column_value_parser(columns) % ',') % boost::spirit::x3::eol);
}
XY: My goal is to parse ~500 GB of pseudo-CSV files in a reasonable time on a machine with little RAM, convert them into a list of (roughly) [row-number, column-name, value], then put that into storage. The format is actually a little more complex than CSV: database dumps formatted in… a human-friendly way, with column values actually being several small sublanguages (e.g. dates or, uh, something similar to whole apache log lines stuffed into a single field), and I'm often extracting only one specific part of each column. Different files may have different columns in a different order, which I can only learn by parsing yet another set of files containing the original queries. Thankfully, Spirit makes it a breeze…
Three answers:
1. The easiest fix is to make pos a mutable member
2. The X3 hardcore answer is x3::with<>
3. Functional composition
1. Making pos mutable
Live On Wandbox
#include <iostream>
#include <variant>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/support.hpp>
namespace helpers {
// https://bitbashing.io/std-visit.html
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
}
auto const unquoted_text_field = *(boost::spirit::x3::char_ - ',' - boost::spirit::x3::eol);
struct text { };
struct integer { };
struct real { };
struct skip { };
typedef std::variant<text, integer, real, skip> column_variant;
struct column_value_parser : boost::spirit::x3::parser<column_value_parser> {
typedef boost::spirit::unused_type attribute_type;
std::vector<column_variant>& columns;
size_t mutable pos = 0;
struct pos_tag;
column_value_parser(std::vector<column_variant>& columns)
: columns(columns)
{ }
template<typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx& /*ctx*/, Other const& /*other*/, Attr& /*attr*/) const {
auto const saved_f = f;
bool successful = false;
visit(
helpers::overloaded {
[&](skip const&) {
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::omit[unquoted_text_field]);
},
[&](text&) {
std::string value;
successful = boost::spirit::x3::parse(f, l, unquoted_text_field, value);
if(successful) {
std::cout << "Text: " << value << '\n';
}
},
[&](integer&) {
int value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::int_, value);
if(successful) {
std::cout << "Integer: " << value << '\n';
}
},
[&](real&) {
double value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::double_, value);
if(successful) {
std::cout << "Real: " << value << '\n';
}
}
},
columns[pos]);
if(successful) {
pos = (pos + 1) % columns.size();
return true;
} else {
f = saved_f;
return false;
}
}
};
int main() {
std::string input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
std::vector<column_variant> columns = {text{}, integer{}, real{}, skip{}};
boost::spirit::x3::parse(
input.begin(), input.end(),
(column_value_parser(columns) % ',') % boost::spirit::x3::eol);
}
2. x3::with<>
This is similar but with better (re)entrancy and encapsulation:
Live On Wandbox
#include <iostream>
#include <variant>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/support.hpp>
namespace helpers {
// https://bitbashing.io/std-visit.html
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;
}
auto const unquoted_text_field = *(boost::spirit::x3::char_ - ',' - boost::spirit::x3::eol);
struct text { };
struct integer { };
struct real { };
struct skip { };
typedef std::variant<text, integer, real, skip> column_variant;
struct column_value_parser : boost::spirit::x3::parser<column_value_parser> {
typedef boost::spirit::unused_type attribute_type;
std::vector<column_variant>& columns;
column_value_parser(std::vector<column_variant>& columns)
: columns(columns)
{ }
template<typename It, typename Ctx, typename Other, typename Attr>
bool parse(It& f, It l, Ctx const& ctx, Other const& /*other*/, Attr& /*attr*/) const {
auto const saved_f = f;
bool successful = false;
size_t& pos = boost::spirit::x3::get<pos_tag>(ctx).value;
visit(
helpers::overloaded {
[&](skip const&) {
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::omit[unquoted_text_field]);
},
[&](text&) {
std::string value;
successful = boost::spirit::x3::parse(f, l, unquoted_text_field, value);
if(successful) {
std::cout << "Text: " << value << '\n';
}
},
[&](integer&) {
int value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::int_, value);
if(successful) {
std::cout << "Integer: " << value << '\n';
}
},
[&](real&) {
double value;
successful = boost::spirit::x3::parse(f, l, boost::spirit::x3::double_, value);
if(successful) {
std::cout << "Real: " << value << '\n';
}
}
},
columns[pos]);
if(successful) {
pos = (pos + 1) % columns.size();
return true;
} else {
f = saved_f;
return false;
}
}
template <typename T>
struct Mutable { T mutable value; };
struct pos_tag;
auto invoke() const {
return boost::spirit::x3::with<pos_tag>(Mutable<size_t>{}) [ *this ];
}
};
int main() {
std::string input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
std::vector<column_variant> columns = {text{}, integer{}, real{}, skip{}};
column_value_parser p(columns);
boost::spirit::x3::parse(
input.begin(), input.end(),
(p.invoke() % ',') % boost::spirit::x3::eol);
}
3. Functional Composition
Because it's so much easier in X3, my favourite is to just generate the parser on demand.
Absent any further requirements, this is the simplest I'd propose:
Live On Wandbox
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;
namespace CSV {
struct text { };
struct integer { };
struct real { };
struct skip { };
auto const unquoted_text_field = *~x3::char_(",\n");
static inline auto as_parser(skip) { return x3::omit[unquoted_text_field]; }
static inline auto as_parser(text) { return unquoted_text_field; }
static inline auto as_parser(integer) { return x3::int_; }
static inline auto as_parser(real) { return x3::double_; }
template <typename... Spec>
static inline auto line_parser(Spec... spec) {
auto delim = ',' | &(x3::eoi | x3::eol);
return ((as_parser(spec) >> delim) >> ... >> x3::eps);
}
template <typename... Spec> static inline auto csv_parser(Spec... spec) {
return line_parser(spec...) % x3::eol;
}
}
#include <iostream>
#include <iomanip>
using namespace CSV;
int main() {
std::string const input = "Hello,1,13.7,XXX\nWorld,2,1e3,YYY";
auto f = begin(input), l = end(input);
auto p = csv_parser(text{}, integer{}, real{}, skip{});
if (parse(f, l, p)) {
std::cout << "Parsed\n";
} else {
std::cout << "Failed\n";
}
if (f!=l) {
std::cout << "Remaining: " << std::quoted(std::string(f,l)) << "\n";
}
}
A version with debug information enabled:
Live On Wandbox
<line>
<try>Hello,1,13.7,XXX\nWor</try>
<CSV::text>
<try>Hello,1,13.7,XXX\nWor</try>
<success>,1,13.7,XXX\nWorld,2,</success>
</CSV::text>
<CSV::integer>
<try>1,13.7,XXX\nWorld,2,1</try>
<success>,13.7,XXX\nWorld,2,1e</success>
</CSV::integer>
<CSV::real>
<try>13.7,XXX\nWorld,2,1e3</try>
<success>,XXX\nWorld,2,1e3,YYY</success>
</CSV::real>
<CSV::skip>
<try>XXX\nWorld,2,1e3,YYY</try>
<success>\nWorld,2,1e3,YYY</success>
</CSV::skip>
<success>\nWorld,2,1e3,YYY</success>
</line>
<line>
<try>World,2,1e3,YYY</try>
<CSV::text>
<try>World,2,1e3,YYY</try>
<success>,2,1e3,YYY</success>
</CSV::text>
<CSV::integer>
<try>2,1e3,YYY</try>
<success>,1e3,YYY</success>
</CSV::integer>
<CSV::real>
<try>1e3,YYY</try>
<success>,YYY</success>
</CSV::real>
<CSV::skip>
<try>YYY</try>
<success></success>
</CSV::skip>
<success></success>
</line>
Parsed
Notes, Caveats:
With anything mutable, beware of side-effects. E.g. if you have a | b and a includes column_value_parser, the side-effect of incrementing pos will not be rolled back when a fails and b is matched instead.
In short, this makes your parse function impure.
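To make the caveat concrete, here is a minimal sketch in plain C++ with no Spirit involved. The names counting_letter, lit and parse_alternative are hypothetical stand-ins, not X3 API. They only mimic how an alternative rewinds the input while knowing nothing about any extra state:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a stateful parser such as column_value_parser:
// it consumes one lowercase ASCII letter and bumps a shared counter.
struct counting_letter {
    std::size_t& pos;
    bool parse(std::string_view& in) const {
        if (in.empty() || in.front() < 'a' || in.front() > 'z') return false;
        in.remove_prefix(1);
        ++pos; // side effect that is not tied to the input position
        return true;
    }
};

// Consumes exactly one given character.
struct lit {
    char c;
    bool parse(std::string_view& in) const {
        if (in.empty() || in.front() != c) return false;
        in.remove_prefix(1);
        return true;
    }
};

// Models `a | b` with a = counting_letter >> '!' and b = lit('h').
// The input is rewound between the branches, as an alternative would do.
inline bool parse_alternative(std::string_view& in, std::size_t& pos, bool restore_state) {
    auto const saved_in  = in;
    auto const saved_pos = pos;
    if (counting_letter{pos}.parse(in) && lit{'!'}.parse(in)) return true;
    in = saved_in;                      // the input is rolled back...
    if (restore_state) pos = saved_pos; // ...the counter only if done by hand
    return lit{'h'}.parse(in);
}
```

With the input "hi" and restore_state set to false, the first branch consumes 'h', bumps pos, and then fails on '!'. The second branch matches, yet pos stays at 1. Passing restore_state as true puts the counter back by hand, which is the kind of bookkeeping a stateful parser would have to do for itself.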
I'm having fun with boost::hana. I want to check whether a type has a specific nested type that acts as a tag, so I borrowed from the hana::when_valid example and defined a class template is_S along with its SFINAE-enabled specialization:
#include <iostream>
#include <type_traits>
#include <boost/hana/core/when.hpp>
namespace hana = boost::hana;
#define V(x) std::cout << x << std::endl
struct S_tag { };
struct S {
using tag = S_tag;
};
struct T {
using tag = int;
};
template< typename T, typename = hana::when< true > >
struct is_S {
static constexpr bool value = false;
};
template< typename T >
struct is_S< T, hana::when_valid< typename T::tag > > {
static constexpr bool value = std::is_same<
typename T::tag, S_tag >::value;
};
int main () {
std::cout << "is_S ( S { }) = "; V ((is_S< S >::value));
std::cout << "is_S ( T { }) = "; V ((is_S< T >::value));
std::cout << "is_S (float { }) = "; V ((is_S< float >::value));
return 0;
}
This prints:
$ clang++ -std=c++1z sfinae.cpp && ./a.out | c++filt
is_S ( S { }) = 1
is_S ( T { }) = 0
is_S (float { }) = 0
Is there a simpler, shorter, or more succinct way of writing the same check, in keeping with Hana's philosophy of value-level type computation?
Here's what I might write:
#include <boost/hana.hpp>
#include <iostream>
namespace hana = boost::hana;
struct S_tag { };
struct S { using tag = S_tag; };
struct T { using tag = int; };
auto tag_of = [](auto t) -> hana::type<typename decltype(t)::type::tag> {
return {};
};
auto is_S = [](auto t) {
return hana::sfinae(tag_of)(t) == hana::just(hana::type<S_tag>{});
};
int main() {
std::cout << "is_S ( S { }) = " << is_S(hana::type<S>{})() << std::endl;
std::cout << "is_S ( T { }) = " << is_S(hana::type<T>{})() << std::endl;
std::cout << "is_S (float { }) = " << is_S(hana::type<float>{})() << std::endl;
}
I would be tempted by:
template<class...T>
constexpr std::integral_constant<bool,false> is_S(T const&...){ return {}; }
template<class T>
constexpr
std::integral_constant<bool,std::is_same<typename T::tag,S_tag>{}>
is_S(T const&){ return {}; }
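For completeness, here is the same idea restated as a self-contained sketch so it can be checked in isolation. The struct definitions are copied from the question. The template parameter is renamed to U only to avoid confusion with struct T, and ::value is spelled out in place of the brace-initialised trait. It assumes the non-variadic overload wins partial ordering whenever typename U::tag is well-formed, and drops out via SFINAE otherwise:

```cpp
#include <type_traits>

struct S_tag { };
struct S { using tag = S_tag; };
struct T { using tag = int; };

// Fallback: viable for any arguments, always yields false.
template <class... U>
constexpr std::integral_constant<bool, false> is_S(U const&...) { return {}; }

// Preferred when U::tag exists; removed by SFINAE on the return type otherwise.
template <class U>
constexpr std::integral_constant<bool, std::is_same<typename U::tag, S_tag>::value>
is_S(U const&) { return {}; }

// The result is an integral_constant, so it is usable at compile time.
static_assert(is_S(S{}), "S carries S_tag");
static_assert(!is_S(T{}), "T has a nested tag, but a different one");
static_assert(!is_S(1.0f), "float has no nested tag at all");
```

Because the answer is encoded in the return type, decltype(is_S(x)) can also be used for tag dispatch without evaluating anything at run time.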