The examples in the Boost.Spirit documentation seem to fall into two cases:
1/ Define a parser in a function: semantic actions can access local variables and data because they are local lambdas, like push_back here: https://www.boost.org/doc/libs/master/libs/spirit/doc/x3/html/spirit_x3/tutorials/number_list___stuffing_numbers_into_a_std__vector.html
2/ Define a parser in a namespace, like here: https://www.boost.org/doc/libs/1_69_0/libs/spirit/doc/x3/html/spirit_x3/tutorials/minimal.html
which seems to be necessary to be able to invoke BOOST_SPIRIT_DEFINE.
My question is: how to combine both (properly, without globals)? My dream API would be to pass some argument to phrase_parse and then do some x3::_arg(ctx), but I couldn't find anything like this.
Here is for instance my parser: for now the actions are writing to std::cerr. What if I wanted to write to a custom std::ostream& instead, that would be passed to the parse function?
using namespace boost::spirit;
using namespace boost::spirit::x3;
rule<struct id_action> action = "action";
rule<struct id_array> array = "array";
rule<struct id_empty_array> empty_array = "empty_array";
rule<struct id_atom> atom = "atom";
rule<struct id_sequence> sequence = "sequence";
rule<struct id_root> root = "root";
auto access_index_array = [] (const auto& ctx) { std::cerr << "access_array: " << x3::_attr(ctx) << "\n" ;};
auto access_empty_array = [] (const auto& ctx) { std::cerr << "access_empty_array\n" ;};
auto access_named_member = [] (const auto& ctx) { std::cerr << "access_named_member: " << x3::_attr(ctx) << "\n" ;};
auto start_action = [] (const auto& ctx) { std::cerr << "start action\n" ;};
auto finish_action = [] (const auto& ctx) { std::cerr << "finish action\n" ;};
auto create_array = [] (const auto& ctx) { std::cerr << "create_array\n" ;};
const auto action_def = +(lit('.')[start_action]
>> -((+alnum)[access_named_member])
>> *(('[' >> x3::int_ >> ']')[access_index_array] | lit("[]")[access_empty_array]));
const auto sequence_def = (action[finish_action] % '|');
const auto array_def = ('[' >> sequence >> ']')[create_array];
const auto root_def = array | action;
BOOST_SPIRIT_DEFINE(action)
BOOST_SPIRIT_DEFINE(array)
BOOST_SPIRIT_DEFINE(sequence)
BOOST_SPIRIT_DEFINE(root)
bool parse(std::string_view str)
{
using ascii::space;
auto first = str.begin();
auto last = str.end();
bool r = phrase_parse(
first, last,
parser::array_def | parser::sequence_def,
ascii::space
);
if (first != last)
return false;
return r;
}
About the approaches:
1/ Yes, this is viable for small, contained parsers, typically used in a single TU and exposed via a non-generic interface.
2/ This is the approach for (much) larger grammars that you might wish to spread across TUs and/or that are instantiated generically across several TUs.
Note that you do NOT need BOOST_SPIRIT_DEFINE unless you:
- have recursive rules, or
- want to split declaration from definition. [This becomes pretty complicated, and I recommend against using that for X3.]
The Question
My question is: how to combine both (properly, without globals)?
You can't combine something with namespace-level declarations if one of the requirements is "without globals".
My dream API would be to pass some argument to phrase_parse and then do some x3::_arg(ctx) but I couldn't find anything like this.
I don't know what you think x3::_arg(ctx) would do, in that particular dream :)
Here is for instance my parser: for now the actions are writing to std::cerr. What if I wanted to write to a custom std::ostream& instead, that would be passed to the parse function?
Now that's a concrete question. I'd say: use the context.
You could make it so that x3::get<ostream>(ctx) returns the stream:
struct ostream{};
auto access_index_array = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_array: " << x3::_attr(ctx) << "\n" ;};
auto access_empty_array = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_empty_array\n" ;};
auto access_named_member = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_named_member: " << x3::_attr(ctx) << "\n" ;};
auto start_action = [] (const auto& ctx) { x3::get<ostream>(ctx) << "start action\n" ;};
auto finish_action = [] (const auto& ctx) { x3::get<ostream>(ctx) << "finish action\n" ;};
auto create_array = [] (const auto& ctx) { x3::get<ostream>(ctx) << "create_array\n";};
Now you need to put the tagged param in the context during parsing:
bool r = phrase_parse(
f, l,
x3::with<parser::ostream>(std::cerr)[parser::array_def | parser::sequence_def],
x3::space);
Live Demo: http://coliru.stacked-crooked.com/a/a26c8eb0af6370b9
Prints
start action
access_named_member: a
finish action
start action
access_named_member: b
start action
start action
access_array: 2
start action
access_named_member: foo
start action
access_empty_array
finish action
start action
access_named_member: c
finish action
create_array
true
Intermixed with the standard X3 debug output:
<sequence>
<try>.a|.b..[2].foo.[]|.c</try>
<action>
<try>.a|.b..[2].foo.[]|.c</try>
<success>|.b..[2].foo.[]|.c]</success>
</action>
<action>
<try>.b..[2].foo.[]|.c]</try>
<success>|.c]</success>
</action>
<action>
<try>.c]</try>
<success>]</success>
</action>
<success>]</success>
</sequence>
But Wait #1 - Event Handlers
It looks like you're parsing something similar to JSON Pointer or jq syntax. If you want to provide a callback interface (SAX-style events), why not bind the callback interface in the context instead of the individual actions:
struct handlers {
using N = x3::unused_type;
virtual void index(int) {}
virtual void index(N) {}
virtual void property(std::string) {}
virtual void start(N) {}
virtual void finish(N) {}
virtual void create_array(N) {}
};
#define EVENT(e) ([](auto& ctx) { x3::get<handlers>(ctx).e(x3::_attr(ctx)); })
const auto action_def =
+(x3::lit('.')[EVENT(start)] >> -((+x3::alnum)[EVENT(property)]) >>
*(('[' >> x3::int_ >> ']')[EVENT(index)] | x3::lit("[]")[EVENT(index)]));
const auto sequence_def = action[EVENT(finish)] % '|';
const auto array_def = ('[' >> sequence >> ']')[EVENT(create_array)];
const auto root_def = array | action;
Now you can implement all handlers neatly in one interface:
struct default_handlers : parser::handlers {
std::ostream& os;
default_handlers(std::ostream& os) : os(os) {}
void index(int i) override { os << "access_array: " << i << "\n"; };
void index(N) override { os << "access_empty_array\n" ; };
void property(std::string n) override { os << "access_named_member: " << n << "\n" ; };
void start(N) override { os << "start action\n" ; };
void finish(N) override { os << "finish action\n" ; };
void create_array(N) override { os << "create_array\n"; };
};
auto f = str.begin(), l = str.end();
bool r = phrase_parse(f, l,
x3::with<parser::handlers>(default_handlers{std::cout}) //
[parser::array_def | parser::sequence_def],
x3::space);
See it Live On Coliru once again:
start action
access_named_member: a
finish action
start action
access_named_member: b
start action
start action
access_array: 2
start action
access_named_member: foo
start action
access_empty_array
finish action
start action
access_named_member: c
finish action
create_array
true
But Wait #2 - No Actions
The natural way to expose attributes would be to build an AST. See also Boost Spirit: "Semantic actions are evil"?
Without further ado:
namespace AST {
using Id = std::string;
using Index = int;
struct Member {
std::optional<Id> name;
};
struct Indexer {
std::optional<int> index;
};
struct Action {
Member member;
std::vector<Indexer> indexers;
};
using Actions = std::vector<Action>;
using Sequence = std::vector<Actions>;
struct ArrayCtor {
Sequence actions;
};
using Root = boost::variant<ArrayCtor, Actions>;
}
Of course, I'm making some assumptions. The rules can be much simplified:
namespace parser {
template <typename> struct Tag {};
#define AS(T, p) (x3::rule<Tag<AST::T>, AST::T>{#T} = p)
auto id = AS(Id, +x3::alnum);
auto member = AS(Member, x3::lit('.') >> -id);
auto indexer = AS(Indexer,'[' >> -x3::int_ >> ']');
auto action = AS(Action, member >> *indexer);
auto actions = AS(Actions, +action);
auto sequence = AS(Sequence, actions % '|');
auto array = AS(ArrayCtor, '[' >> -sequence >> ']'); // covers empty array
auto root = AS(Root, array | actions);
} // namespace parser
And the parsing function returns the AST:
AST::Root parse(std::string_view str) {
auto f = str.begin(), l = str.end();
AST::Root parsed;
phrase_parse(f, l, x3::expect[parser::root >> x3::eoi], x3::space, parsed);
return parsed;
}
(Note that it now throws x3::expectation_failure if the input is invalid or not completely parsed)
int main() {
std::cout << parse("[.a|.b..[2].foo.[]|.c]");
}
Now prints:
[.a|.b./*none*/./*none*/[2].foo./*none*/[/*none*/]|.c]
See it Live On Coliru
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <optional>
#include <ostream>
#include <string>
#include <string_view>
#include <utility> // std::exchange
#include <vector>
namespace x3 = boost::spirit::x3;
namespace AST {
using Id = std::string;
using Index = int;
struct Member {
std::optional<Id> name;
};
struct Indexer {
std::optional<int> index;
};
struct Action {
Member member;
std::vector<Indexer> indexers;
};
using Actions = std::vector<Action>;
using Sequence = std::vector<Actions>;
struct ArrayCtor {
Sequence actions;
};
using Root = boost::variant<ArrayCtor, Actions>;
}
BOOST_FUSION_ADAPT_STRUCT(AST::Member, name)
BOOST_FUSION_ADAPT_STRUCT(AST::Indexer, index)
BOOST_FUSION_ADAPT_STRUCT(AST::Action, member, indexers)
BOOST_FUSION_ADAPT_STRUCT(AST::ArrayCtor, actions)
namespace parser {
template <typename> struct Tag {};
#define AS(T, p) (x3::rule<Tag<AST::T>, AST::T>{#T} = p)
auto id = AS(Id, +x3::alnum);
auto member = AS(Member, x3::lit('.') >> -id);
auto indexer = AS(Indexer,'[' >> -x3::int_ >> ']');
auto action = AS(Action, member >> *indexer);
auto actions = AS(Actions, +action);
auto sequence = AS(Sequence, actions % '|');
auto array = AS(ArrayCtor, '[' >> -sequence >> ']'); // covers empty array
auto root = AS(Root, array | actions);
} // namespace parser
AST::Root parse(std::string_view str) {
auto f = str.begin(), l = str.end();
AST::Root parsed;
phrase_parse(f, l, x3::expect[parser::root >> x3::eoi], x3::space, parsed);
return parsed;
}
// for debug output
#include <iostream>
#include <iomanip>
namespace AST {
static std::ostream& operator<<(std::ostream& os, Member const& m) {
return os << "." << m.name.value_or("/*none*/");
}
static std::ostream& operator<<(std::ostream& os, Indexer const& i) {
if (i.index)
return os << "[" << *i.index << "]";
else
return os << "[/*none*/]";
}
static std::ostream& operator<<(std::ostream& os, Action const& a) {
os << a.member;
for (auto& i : a.indexers)
os << i;
return os;
}
static std::ostream& operator<<(std::ostream& os, Actions const& aa) {
for (auto& a : aa)
os << a;
return os;
}
static std::ostream& operator<<(std::ostream& os, Sequence const& s) {
bool first = true;
for (auto& a : s)
os << (std::exchange(first, false) ? "" : "|") << a;
return os;
}
static std::ostream& operator<<(std::ostream& os, ArrayCtor const& ac) {
return os << "[" << ac.actions << "]";
}
}
int main() {
std::cout << parse("[.a|.b..[2].foo.[]|.c]");
}
I have a parser in which I want to capture certain types of whitespace as enum values and preserve the spaces for the "text" values.
My whitespace parser is pretty basic (Note: I've only added the pipe character here for test/dev purposes):
struct whitespace_p : x3::symbols<Whitespace>
{
whitespace_p()
{
add
("\n", Whitespace::NEWLINE)
("\t", Whitespace::TAB)
("|", Whitespace::PIPE)
;
}
} whitespace;
And I want to capture everything either into my enum or into std::strings:
struct Element : x3::variant<Whitespace, std::string>
{
using base_type::base_type;
using base_type::operator=;
};
And to parse my input I use something like this:
const auto contentParser
= x3::rule<class ContentParserID, Element, true> { "contentParser" }
= x3::no_skip[+(x3::char_ - (whitespace))]
| whitespace
;
using Elements = std::vector<Element>;
const auto elementsParser
= x3::rule<class ContentParserID, Elements, true> { "elementsParser" }
= contentParser >> *(contentParser);
The problem though is that the parser stops at the first tab or newline it hits.
Code: http://coliru.stacked-crooked.com/a/d2cda4ce721279a4
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
#include <iostream>
namespace x3 = boost::spirit::x3;
enum Whitespace
{
NEWLINE,
TAB,
PIPE
};
struct whitespace_p : x3::symbols<Whitespace>
{
whitespace_p()
{
add
("\n", Whitespace::NEWLINE)
("\t", Whitespace::TAB)
("|", Whitespace::PIPE)
;
}
} whitespace;
struct Element : x3::variant<Whitespace, std::string>
{
using base_type::base_type;
using base_type::operator=;
};
const auto contentParser
= x3::rule<class ContentParserID, Element, true> { "contentParser" }
= x3::no_skip[+(x3::char_ - (whitespace))]
| whitespace
;
using Elements = std::vector<Element>;
const auto elementsParser
= x3::rule<class ContentParserID, Elements, true> { "elementsParser" }
= contentParser >> *(contentParser);
struct print_visitor
: public boost::static_visitor<std::string>
{
std::string operator()(const Whitespace& ws) const
{
if (ws == Whitespace::NEWLINE)
{
return "newline";
}
else if (ws == Whitespace::PIPE)
{
return "pipe";
}
else
{
return "tab";
}
}
std::string operator()(const std::string& str) const
{
return str;
}
};
int main()
{
const std::string text = "Hello \n World";
std::string::const_iterator start = std::begin(text);
const std::string::const_iterator stop = std::end(text);
Elements elements{};
bool result =
phrase_parse(start, stop, elementsParser, x3::ascii::space, elements);
if (!result)
{
std::cout << "failed to parse!\n";
}
else if (start != stop)
{
std::cout << "unparsed: " << std::string{start, stop} << '\n';
}
else
{
for (const auto& e : elements)
{
std::cout << "element: [" << boost::apply_visitor(print_visitor{}, e) << "]\n";
}
}
}
If I parse the text Hello | World then I get the results I'm expecting. But if I instead use Hello \n World the whitespace after the \n is swallowed and the World is never parsed. Ideally I'd like to see this output:
element: [Hello ]
element: [newline]
element: [ World]
How can I accomplish this? Thank you!
My go-to reference on skipper issues: Boost spirit skipper issues
In this case you made it work with no_skip[]. That's correct.
no_skip is like lexeme except that it doesn't pre-skip. From the source (boost/spirit/home/x3/directive/no_skip.hpp):
// same as lexeme[], but does not pre-skip
Alternative Take
In your case I would flip the logic: just adjust the skipper itself.
Also, don't supply the skipper with phrase_parse, because your grammar is highly sensitive to the correct value of the skipper.
Your whole grammar could be:
const auto p = x3::skip(x3::space - whitespace) [
*(+x3::graph | whitespace)
];
Here's a Live Demo On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
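#include <algorithm> // std::copy
#include <iomanip>
#include <iostream>
#include <iterator> // std::ostream_iterator
#include <string>
#include <vector>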
namespace x3 = boost::spirit::x3;
enum Whitespace { NEWLINE, TAB, PIPE };
struct whitespace_p : x3::symbols<Whitespace> {
whitespace_p() {
add
("\n", Whitespace::NEWLINE)
("\t", Whitespace::TAB)
("|", Whitespace::PIPE)
;
}
} static const whitespace;
struct Element : x3::variant<Whitespace, std::string> {
using base_type::base_type;
using base_type::operator=;
};
using Elements = std::vector<Element>;
static inline std::ostream& operator<<(std::ostream& os, Element const& el) {
struct print_visitor {
std::ostream& os;
auto& operator()(Whitespace ws) const {
switch(ws) {
case Whitespace::NEWLINE: return os << "[newline]";
case Whitespace::PIPE: return os << "[pipe]";
case Whitespace::TAB: return os << "[tab]";
}
return os << "?";
}
auto& operator()(const std::string& str) const { return os << std::quoted(str); }
} vis{os};
return boost::apply_visitor(vis, el);
}
int main() {
std::string const text = "\tHello \n World";
auto start = begin(text), stop = end(text);
const auto p = x3::skip(x3::space - whitespace) [
*(+x3::graph | whitespace)
];
Elements elements;
if (!parse(start, stop, p, elements)) {
std::cout << "failed to parse!\n";
} else {
std::copy(begin(elements), end(elements), std::ostream_iterator<Element>(std::cout, "\n"));
}
if (start != stop) {
std::cout << "unparsed: " << std::quoted(std::string(start, stop)) << '\n';
}
}
Prints
[tab]
"Hello"
[newline]
"World"
Even Simpler?
It doesn't seem like you'd need any skipper here at all. Why not:
const auto p = *(+~x3::char_("\n\t|") | whitespace);
While we're at it, there's no need for symbols to map enums:
struct Element : x3::variant<char, std::string> {
// ...
};
using Elements = std::vector<Element>;
And then
const auto p
= x3::rule<struct ID, Element> {}
= +~x3::char_("\n\t|") | x3::char_;
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
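#include <algorithm> // std::copy
#include <iomanip>
#include <iostream>
#include <iterator> // std::ostream_iterator
#include <string>
#include <vector>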
namespace x3 = boost::spirit::x3;
struct Element : x3::variant<char, std::string> {
using variant = x3::variant<char, std::string>;
using variant::variant;
using variant::operator=;
friend std::ostream& operator<<(std::ostream& os, Element const& el) {
struct print_visitor {
std::ostream& os;
auto& operator()(char ws) const {
switch(ws) {
case '\n': return os << "[newline]";
case '\t': return os << "[tab]";
case '|': return os << "[pipe]";
}
return os << "?";
}
auto& operator()(const std::string& str) const { return os << std::quoted(str); }
} vis{os};
return boost::apply_visitor(vis, el);
}
};
using Elements = std::vector<Element>;
int main() {
std::string const text = "\tHello \n World";
auto start = begin(text);
auto const stop = end(text);
Elements elements;
const auto p
= x3::rule<struct ID, Element> {}
= +~x3::char_("\n\t|") | x3::char_;
if (!parse(start, stop, *p, elements)) {
std::cout << "failed to parse!\n";
} else {
std::copy(begin(elements), end(elements), std::ostream_iterator<Element>(std::cout, "\n"));
}
if (start != stop) {
std::cout << "unparsed: " << std::quoted(std::string(start, stop)) << '\n';
}
}
Prints
[tab]
"Hello "
[newline]
" World"
The problem is that you are using phrase_parse instead of parse at line 76.
Try something like
bool result =
parse(start, stop, elementsParser, elements);
Your phrase_parse call was instructed to skip spaces, which is not what you want here.
See the first answer to How to use boost::spirit to parse a sequence of words into a vector?
I've learned a lot in the last couple of weeks about this stuff, but not enough. The code below compiles and runs, but the code in TEST_ADAPT is incomplete; I'm not sure how to make the connection.
The object is to parse into a plain-Jane container that has no variant dependency. I have figured out that I can get a tuple of references to my storage, which Spirit likes well enough (see TEST_REF). The Kleene operator is looking for a single sequential container, but with a set of alternatives that doesn't look to be possible. So I guess I need to hand it something that is a proxy for that container but has the gear work to locate the destination references in a tuple.
I think it will be a great exercise for me to write this ContainerAdaptor even if it is the wrong way to approach this. So I'm wondering if I'm in right field or on the right track.
The best I know I can complete is to use the TEST_VECT method and make a pass over the vector to copy the data into my ALL container. But that's just not right.
Update:
I have made Target::All Fusion-adapted and made ContainerAdaptor partially functional, enough so that the Kleene operator accepts it. I should be able to connect to the Target::All object, maybe...
#include <iostream>
#include <boost/fusion/include/as_vector.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/fusion/adapted/std_tuple.hpp>
#include <boost/fusion/include/boost_tuple.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
//parse kleene operator to a set of alternatives, adaptor? with spirit x3
#define TEST_VECT
#define TEST_REF
#define TEST_ADAPT
// l.......................................................................
namespace Target {
struct Int
{
int int_val;
};
using IntVect = std::vector<Int>;
struct Word
{
std::string word_val;
};
using WordVect = std::vector<Word>;
struct All
{
IntVect int_vect;
WordVect word_vect;
};
}
BOOST_FUSION_ADAPT_STRUCT(Target::Int, int_val)
BOOST_FUSION_ADAPT_STRUCT(Target::Word, word_val)
BOOST_FUSION_ADAPT_STRUCT(Target::All, int_vect, word_vect)
std::ostream& operator << (std::ostream& o, const Target::Int& in) { o << in.int_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::Word& in) { o << in.word_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::IntVect& in) { for( auto& i : in ) o << i << " "; return o; }
std::ostream& operator << (std::ostream& o, const Target::WordVect& in) { for (auto& i : in) o << i << " "; return o; }
#define DEF_RULE( RuleName, Attr ) static auto const RuleName = rule<struct Attr##_def, Attr>( #RuleName )
namespace Target {
using namespace boost::spirit::x3;
auto const bare_word = lexeme[+char_("a-z")];
DEF_RULE(int_rule, Int) = int_;
DEF_RULE(word_rule, Word) = bare_word;
auto const int_vect_rule= "int" >> *int_rule;
auto const word_vect_rule= "word" >> *(word_rule - "int");
//another test
DEF_RULE(f_int_vect_rule, IntVect) = int_vect_rule;
DEF_RULE(f_word_vect_rule, IntVect) = word_vect_rule;
}//namespace Target
namespace Target {
struct Printer {
Printer(std::ostream& out) : out(out) {};
using result_type = void;
void operator()(const IntVect& expression) {
out << "IntVect: ";
for (auto& t : expression)
out << t << " ";
out << std::endl;
}
void operator()(const WordVect& expression) {
out << "Word: ";
for (auto& t : expression)
out << t << " ";
out << std::endl;
}
private:
std::ostream& out;
};
}//namespace Target
template<class Arg>
class ContainerAdaptor
{
public:
ContainerAdaptor(Arg& arg) :arg(arg) { }
typedef boost::spirit::x3::variant<Target::IntVect,Target::WordVect> value_type;
typedef size_t size_type;
struct Vis : public boost::static_visitor<>
{
void operator()(const Target::IntVect & i) const
{
std::cout << i << std::endl;
}
void operator()(const Target::WordVect & i) const
{
std::cout << i << std::endl;
}
};
void insert(value_type* e, const value_type& v) {
std::cout << "haha! ";
boost::apply_visitor(Vis(), v);
}
value_type* end() { return nullptr; }
value_type* begin() { return nullptr; }
size_t size;
private:
Arg & arg;
};
int main()
{
using namespace Target;
std::string thestr("word test more int 1 2 3 4 word this and that int 5 4 int 99 22");
std::string::iterator end = thestr.end();
#if defined(TEST_ADAPT)
{
std::cout << "\nTEST_ADAPT\n";
std::string::iterator begin = thestr.begin();
All all;
auto fwd = std::forward_as_tuple(all.word_vect, all.int_vect);
ContainerAdaptor<All>attr( all );
phrase_parse(begin, end, *( int_vect_rule | word_vect_rule), space, attr);
Printer printer(std::cout);
}
#endif
#if defined(TEST_VECT)
{
std::cout << "TEST_VECT\n";
std::string::iterator begin = thestr.begin();
using Vars = variant<Target::IntVect, Target::WordVect>;
std::vector< Vars > a_vect;
bool r = phrase_parse(begin, end, *( int_vect_rule | word_vect_rule), space, a_vect);
Printer printer(std::cout);
for (auto& i : a_vect)
i.apply_visitor(printer);
}
#endif
#if defined(TEST_REF)
{
std::cout << "\nTEST_REF\n";
std::string::iterator begin = thestr.begin();
All all;
auto fwd = std::forward_as_tuple(all.word_vect,all.int_vect);
phrase_parse(begin, end, word_vect_rule >> int_vect_rule, space, fwd);
Printer printer(std::cout);
std::_For_each_tuple_element(fwd, printer);
}
#endif
return 0;
}
The ContainerAdaptor Hack
Sufficiently simplified, it works:
Live On Coliru
#include <iostream>
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
namespace Target {
struct Int { int int_val; };
struct Word { std::string word_val; };
using IntVect = std::vector<Int>;
using WordVect = std::vector<Word>;
struct All {
IntVect int_vect;
WordVect word_vect;
};
}
BOOST_FUSION_ADAPT_STRUCT(Target::Int, int_val)
BOOST_FUSION_ADAPT_STRUCT(Target::Word, word_val)
std::ostream& operator << (std::ostream& o, const Target::Int& in) { o << in.int_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::Word& in) { o << in.word_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::IntVect& in) { for( auto& i : in ) o << i << " "; return o; }
std::ostream& operator << (std::ostream& o, const Target::WordVect& in) { for (auto& i : in) o << i << " "; return o; }
namespace Target {
using namespace boost::spirit::x3;
static auto const int_rule = rule<struct Int_def, Int>("int_rule") = int_;
static auto const word_rule = rule<struct Word_def, Word>("word_rule") = lexeme[+char_("a-z")];
static auto const int_vect_rule = "int" >> *int_rule;
static auto const word_vect_rule = "word" >> *(word_rule - "int");
}
template<class Arg> struct ContainerAdaptor
{
typedef boost::spirit::x3::variant<Target::IntVect,Target::WordVect> value_type;
void insert(value_type* /*e*/, const value_type& v) {
std::cout << "haha! ";
struct Vis {
//using result_type = void;
void operator()(const Target::IntVect & i) const { std::cout << i << std::endl; }
void operator()(const Target::WordVect & i) const { std::cout << i << std::endl; }
};
boost::apply_visitor(Vis(), v);
}
value_type* end() { return nullptr; }
value_type* begin() { return nullptr; }
Arg & arg;
};
int main() {
using namespace Target;
std::string const thestr("word test more int 1 2 3 4 word this and that int 5 4 int 99 22");
All all;
ContainerAdaptor<All> attr { all };
if (phrase_parse(begin(thestr), end(thestr), *( int_vect_rule | word_vect_rule), space, attr)) {
std::cout << "Parsed: \n";
std::cout << all.int_vect << "\n";
std::cout << all.word_vect << "\n";
}
}
Prints:
haha! test more
haha! 1 2 3 4
haha! this and that
haha! 5 4
haha! 99 22
Parsed:
Why Not Semantic Actions?
Live On Coliru
#include <iostream>
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
namespace Target {
struct Int { int int_val; };
struct Word { std::string word_val; };
using IntVect = std::vector<Int>;
using WordVect = std::vector<Word>;
struct All {
IntVect int_vect;
WordVect word_vect;
};
}
BOOST_FUSION_ADAPT_STRUCT(Target::Int, int_val)
BOOST_FUSION_ADAPT_STRUCT(Target::Word, word_val)
std::ostream& operator << (std::ostream& o, const Target::Int& in) { o << in.int_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::Word& in) { o << in.word_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::IntVect& in) { for( auto& i : in ) o << i << " "; return o; }
std::ostream& operator << (std::ostream& o, const Target::WordVect& in) { for (auto& i : in) o << i << " "; return o; }
namespace x3 = boost::spirit::x3;
namespace Target {
using namespace x3;
static auto const int_rule = rule<struct Int_def, Int>("int_rule") = int_;
static auto const word_rule = rule<struct Word_def, Word>("word_rule") = lexeme[+char_("a-z")];
static auto const int_vect_rule = "int" >> *int_rule;
static auto const word_vect_rule = "word" >> *(word_rule - "int");
}
int main() {
std::string const thestr("word test more int 1 2 3 4 word this and that int 5 4 int 99 22");
Target::All all;
struct {
Target::All& _r;
void operator()(Target::IntVect const&v) const { _r.int_vect.insert(_r.int_vect.end(), v.begin(), v.end()); }
void operator()(Target::WordVect const&v) const { _r.word_vect.insert(_r.word_vect.end(), v.begin(), v.end()); }
} push_back { all };
auto unary = [](auto f) { return [f](auto& ctx) { return f(x3::_attr(ctx)); }; };
auto action = unary(push_back);
if (phrase_parse(begin(thestr), end(thestr), *(Target::int_vect_rule[action] | Target::word_vect_rule[action]), x3::space)) {
std::cout << "Parsed: \n";
std::cout << all.int_vect << "\n";
std::cout << all.word_vect << "\n";
}
}
Prints
Parsed:
1 2 3 4 5 4 99 22
test more this and that
Using Traits
Re-introducing the variant "value_type":
Live On Coliru
#include <iostream>
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
namespace Target {
struct Int { int int_val; };
struct Word { std::string word_val; };
using IntVect = std::vector<Int>;
using WordVect = std::vector<Word>;
struct All {
IntVect int_vect;
WordVect word_vect;
};
}
BOOST_FUSION_ADAPT_STRUCT(Target::Int, int_val)
BOOST_FUSION_ADAPT_STRUCT(Target::Word, word_val)
std::ostream& operator << (std::ostream& o, const Target::Int& in) { o << in.int_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::Word& in) { o << in.word_val; return o; }
std::ostream& operator << (std::ostream& o, const Target::IntVect& in) { for( auto& i : in ) o << i << " "; return o; }
std::ostream& operator << (std::ostream& o, const Target::WordVect& in) { for (auto& i : in) o << i << " "; return o; }
namespace boost { namespace spirit { namespace x3 { namespace traits {
template<>
struct container_value<Target::All> {
using type = boost::variant<Target::IntVect, Target::WordVect>;
};
template<>
struct push_back_container<Target::All> {
template <typename V>
static bool call(Target::All& c, V&& v) {
struct {
Target::All& _r;
void operator()(Target::IntVect const&v) const { _r.int_vect.insert(_r.int_vect.end(), v.begin(), v.end()); }
void operator()(Target::WordVect const&v) const { _r.word_vect.insert(_r.word_vect.end(), v.begin(), v.end()); }
} vis {c};
boost::apply_visitor(vis, v);
return true;
}
};
} } } }
namespace x3 = boost::spirit::x3;
namespace Target {
using namespace x3;
static auto const int_rule = rule<struct Int_def, Int>("int_rule") = int_;
static auto const word_rule = rule<struct Word_def, Word>("word_rule") = lexeme[+char_("a-z")];
static auto const int_vect_rule = "int" >> *int_rule;
static auto const word_vect_rule = "word" >> *(word_rule - "int");
}
int main() {
std::string const thestr("word test more int 1 2 3 4 word this and that int 5 4 int 99 22");
Target::All all;
if (phrase_parse(begin(thestr), end(thestr), *(Target::int_vect_rule | Target::word_vect_rule), x3::space, all)) {
std::cout << "Parsed: \n";
std::cout << all.int_vect << "\n";
std::cout << all.word_vect << "\n";
}
}
Prints
Parsed:
1 2 3 4 5 4 99 22
test more this and that
Given the following x3 grammar that parses correctly, I want to add validation of parameters, qualifiers, and properties. This would seem to indicate some method of dynamically switching which symbol table is being used within the various rules. What is the best way to implement this? It would seem to be some mixture of semantic actions and attributes, but it is not clear to me how.
#include <string>
#include <vector>
#include <iostream>
#include <iomanip>
#include <map>
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/variant.hpp>
#include <boost/fusion/adapted/struct.hpp>
namespace x3 = boost::spirit::x3;
namespace scl
{
//
// This will take a symbol value and return the string associated with that value. From an example by sehe
// TODO: There is probably a better C++14/17 way to do this with the symbol.for_each operator and a lambda,
// but I haven't figured it out yet
//
template <typename T>
struct SYMBOL_LOOKUP
{
SYMBOL_LOOKUP (T Symbol, std::string& String) : _sought (Symbol), _found (String)
{
}
void operator () (std::basic_string <char> s, T ct)
{
if (_sought == ct)
{
_found = s;
}
}
std::string found () const { return _found; }
private:
T _sought;
std::string& _found;
};
//
// This section describes the valid verbs, the parameters that are valid for each verb, and
// the qualifiers that are valid for each verb or parameter of a verb.
// TODO: There is probably some complicated C++11/14/17 expression template for generating all
// of this as a set of linked tables, where each verb points to a parameter table, which points
// to a qualifier table, but that is currently beyond my ability to implement, so each structure
// is implemented discretely
//
//
// Legal verbs
//
enum class VERBS
{
load, //
set, //
show, //
};
struct VALID_VERBS : x3::symbols <VERBS>
{
VALID_VERBS ()
{
add
("load", VERBS::load) //
("set", VERBS::set) //
("show", VERBS::show) //
;
}
} const valid_verbs;
//
// LOAD parameter 1
//
enum class LOAD_PARAMETER1
{
dll, // LOAD DLL <file-spec>
pdb, // LOAD PDB <file-spec>
};
struct VALID_LOAD_PARAMETER1 : x3::symbols <LOAD_PARAMETER1>
{
VALID_LOAD_PARAMETER1 ()
{
add
("dll", LOAD_PARAMETER1::dll) //
("pdb", LOAD_PARAMETER1::pdb) //
;
}
} const valid_load_parameter1;
//
// SET parameter 1
//
enum class SET_PARAMETER1
{
debug, // SET DEBUG {/ON | /OFF}
trace, // SET TRACE {/ON | /OFF}
};
struct VALID_SET_PARAMETER1 : x3::symbols <SET_PARAMETER1>
{
VALID_SET_PARAMETER1 ()
{
add
("debug", SET_PARAMETER1::debug) //
("trace", SET_PARAMETER1::trace) //
;
}
} const valid_set_parameter1;
//
// SET qualifiers
//
enum class SET_QUALIFIERS
{
off, //
on //
};
struct VALID_SET_QUALIFIERS : x3::symbols <SET_QUALIFIERS>
{
VALID_SET_QUALIFIERS ()
{
add
("off", SET_QUALIFIERS::off) //
("on", SET_QUALIFIERS::on) //
;
}
} const valid_set_qualifiers;
//
// SHOW parameter 1
//
enum class SHOW_PARAMETER1
{
debug, // SHOW DEBUG
module, // SHOW MODULE <wildcard-expression> [/SYMBOLS]
symbols, // SHOW SYMBOLS *{/ALL /FULL /OUT=<file-spec> /TYPE=(+{all,exports,imports})} [wild-card-expression]
trace, // SHOW TRACE
};
struct VALID_SHOW_PARAMETER1 : x3::symbols <SHOW_PARAMETER1>
{
VALID_SHOW_PARAMETER1 ()
{
add
("debug", SHOW_PARAMETER1::debug) //
("module", SHOW_PARAMETER1::module) //
("symbols", SHOW_PARAMETER1::symbols) //
("trace", SHOW_PARAMETER1::trace) //
;
}
} const valid_show_parameter1;
//
// SHOW qualifiers
//
enum class SHOW_QUALIFIERS
{
all, // Display all objects of the specified type
full, // Display all information about the specified object(s)
out, // Write output to the specified file (/out=<file spec>)
type, // List of properties to display
};
struct VALID_SHOW_QUALIFIERS : x3::symbols <SHOW_QUALIFIERS>
{
VALID_SHOW_QUALIFIERS ()
{
add
("all", SHOW_QUALIFIERS::all) //
("full", SHOW_QUALIFIERS::full) //
("out", SHOW_QUALIFIERS::out) //
("type", SHOW_QUALIFIERS::type) // Valid properties in VALID_SHOW_TYPE_PROPERTIES
;
}
} const valid_show_qualifiers;
//
// SHOW /TYPE=(property_list)
//
enum class SHOW_TYPE_PROPERTIES
{
all, //
exports, //
imports, //
};
struct VALID_SHOW_TYPE_PROPERTIES : x3::symbols <SHOW_TYPE_PROPERTIES>
{
VALID_SHOW_TYPE_PROPERTIES ()
{
add
("all", SHOW_TYPE_PROPERTIES::all) //
("exports", SHOW_TYPE_PROPERTIES::exports) //
("imports", SHOW_TYPE_PROPERTIES::imports) //
;
}
} const valid_show_type_properties;
//
// Convert a verb value to its string representation
//
std::string to_string (const VERBS Verb)
{
std::string result;
SYMBOL_LOOKUP <VERBS> lookup (Verb, result);
//
// Loop through all the entries in the symbol table looking for the specified value
// Is there a better way to use this for_each with a lambda?
//
valid_verbs.for_each (lookup);
return result;
} // End to_string
} // End namespace scl
namespace scl_ast
{
struct KEYWORD : std::string
{
using std::string::string;
using std::string::operator=;
};
struct NIL
{
};
using VALUE = boost::variant <NIL, std::string, int, double, KEYWORD>;
struct PROPERTY
{
KEYWORD name;
VALUE value;
};
struct QUALIFIER
{
enum KIND
{
positive,
negative
} kind;
std::string identifier;
std::vector <PROPERTY> properties;
};
struct PARAMETER
{
KEYWORD keyword;
std::vector <QUALIFIER> qualifiers;
};
struct COMMAND
{
scl::VERBS verb;
std::vector <QUALIFIER> qualifiers;
std::vector <PARAMETER> parameters;
};
//
// Overloads for printing the AST to the console
//
#pragma region debug
static inline std::ostream& operator<< (std::ostream& os, VALUE const& v)
{
struct
{
std::ostream& _os;
void operator() (std::string const& s) const { _os << std::quoted (s); }
void operator() (int i) const { _os << i; }
void operator() (double d) const { _os << d; }
void operator() (KEYWORD const& kwv) const { _os << kwv; }
void operator() (NIL) const { }
} vis { os };
boost::apply_visitor (vis, v);
return os;
}
static inline std::ostream& operator<< (std::ostream& os, PROPERTY const& prop)
{
os << prop.name;
if (prop.value.which ())
{
os << "=" << prop.value;
}
return os;
}
static inline std::ostream& operator<< (std::ostream& os, QUALIFIER const& q)
{
os << "/" << (q.kind == QUALIFIER::negative ? "no" : "") << q.identifier;
if (!q.properties.empty ())
{
os << "=(";
}
for (auto const& prop : q.properties)
{
os << prop << " ";
}
if (!q.properties.empty ())
{
os << ")";
}
return os;
}
static inline std::ostream& operator<< (std::ostream& os, std::vector <QUALIFIER> const& qualifiers)
{
for (auto const& qualifier : qualifiers)
{
os << " " << qualifier;
}
return os;
}
static inline std::ostream& operator<< (std::ostream& os, PARAMETER const& p)
{
return os << p.keyword << " " << p.qualifiers;
}
static inline std::ostream& operator<< (std::ostream& os, COMMAND const& cmd)
{
os << scl::to_string (cmd.verb) << cmd.qualifiers;
for (auto& param : cmd.parameters)
{
os << " " << param;
}
return os;
}
#pragma endregion debug
}; // End namespace scl_ast
BOOST_FUSION_ADAPT_STRUCT (scl_ast::PROPERTY, name, value);
BOOST_FUSION_ADAPT_STRUCT (scl_ast::QUALIFIER, kind, identifier, properties);
BOOST_FUSION_ADAPT_STRUCT (scl_ast::PARAMETER, keyword, qualifiers);
BOOST_FUSION_ADAPT_STRUCT (scl_ast::COMMAND, verb, qualifiers, parameters);
//
// Grammar for simple command language
//
namespace scl
{
using namespace x3;
auto const param = rule <struct _keyword, scl_ast::KEYWORD> { "param" }
= lexeme [+char_ ("a-zA-Z0-9$_.\\*?+-")];
auto const identifier
= lexeme [+char_ ("a-zA-Z0-9_")];
auto const quoted_string
= lexeme ['"' >> *('\\' > char_ | ~char_ ('"')) >> '"'];
auto const property_value
= quoted_string
| real_parser <double, x3::strict_real_policies <double>> {}
| int_
| param;
auto const property = rule <struct _property, scl_ast::PROPERTY> { "property" }
= identifier >> -('=' >> property_value);
auto const property_list = rule <struct _property_list, std::vector <scl_ast::PROPERTY>> { "property_list" }
= '(' >> property % ',' >> ')';
auto const qual
= attr (scl_ast::QUALIFIER::positive) >> lexeme ['/' >> identifier] >> -( '=' >> (property_list | repeat (1) [property]));
auto const neg_qual
= attr (scl_ast::QUALIFIER::negative) >> lexeme [no_case ["/no"] >> identifier] >> repeat (0) [property]; // Negated qualifiers never have properties (repeat(0) keeps the compiler happy)
auto const qualifier
= neg_qual | qual;
auto const verb
= no_case [valid_verbs]; // Uses static list of allowed verbs
auto const parameter = rule <struct _parameter, scl_ast::PARAMETER> { "parameter" }
= param >> *qualifier;
auto const command = rule <struct _command, scl_ast::COMMAND> { "command" }
= skip (blank) [verb >> *qualifier >> *parameter];
}; // End namespace scl
int
main ()
{
std::vector <std::string> input =
{
"load dll test.dll",
"LOAD pdb test.pdb",
"set debug /on",
"show debug",
"SHOW module test.dll/symbols",
"show symbols/type=export test*",
"show symbols test.dll/type=(import,export)",
"show symbols s*/out=s.txt",
"show symbols /all /full",
};
for (auto const& str : input)
{
scl_ast::COMMAND cmd;
auto b = str.begin ();
auto e = str.end ();
bool ok = parse (b, e, scl::command, cmd);
std::cout << (ok ? "OK" : "FAIL") << '\t' << std::quoted (str) << std::endl;
if (ok)
{
std::cout << " -- Full AST: " << cmd << std::endl;
std::cout << " -- Verb + Qualifiers: " << scl::to_string (cmd.verb) << cmd.qualifiers << std::endl;
for (auto const& param : cmd.parameters)
{
std::cout << " -- Parameter + Qualifiers: " << param << std::endl;
}
if (b != e)
{
std::cout << "*** Remaining unparsed: " << std::quoted (std::string (b, e)) << std::endl;
}
}
std::cout << std::endl;
} // End for
return 0;
} // End main
So, I spent quite some time thinking about this.
I admit most of those thoughts never made it past the brainstorming stage. However, I made a proof of concept from scratch, starting from just the bare minimum:
/* Synopsis:
*
* LOAD DLL <file-spec>
* LOAD PDB <file-spec>
* SET DEBUG {/ON | /OFF}
* SET TRACE {/ON | /OFF}
*
* SHOW DEBUG
* SHOW MODULE <wildcard-expression> [/SYMBOLS]
* SHOW SYMBOLS { [/ALL] [/FULL] [/OUT=<file-spec>] [/TYPE=(+{all,exports,imports})] [wild-card-expression] }...
* SHOW TRACE
*/
Utilities
Since we have several domains that can have sets of options that are to be treated as (case-insensitive) keyword identifiers, I thought of creating a facility for those:
Note: for brevity this keeps all values as int for now. In that, it falls short of "Better Enum". But given a few macros you should be able to make Options::type (and Enum<TagType>) resolve to a proper enum type.
namespace util {
template <typename Tag> struct FlavouredString : std::string {
using std::string::string;
using std::string::operator=;
};
template <typename Tag> struct Options {
using type = int; // TODO typed enums? Requires macro tedium
std::vector<char const*> _options;
Options(std::initializer_list<char const*> options) : _options(options) {}
Options(std::vector<char const*> options) : _options(options) {}
std::string to_string(type id) const { return _options.at(id); }
type to_id(std::string const& name) const { return find(_options.begin(), _options.end(), name) - _options.begin(); }
};
template <typename Tag> using Enum = typename Options<Tag>::type;
template <typename Tag> struct IcOptions : Options<Tag> { using Options<Tag>::Options; };
}
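As a quick illustration of how the lookup functions behave, here is a minimal sketch that repeats a trimmed copy of Options so it compiles on its own. The names OnOffTag, on_off_demo and is_known are hypothetical helpers for this example and are not part of the answer's code. It shows that to_id is case-sensitive and returns the number of options when the name is not found.

```cpp
#include <algorithm>
#include <initializer_list>
#include <string>
#include <vector>

// Trimmed copy of util::Options from the answer, so the sketch compiles alone.
template <typename Tag> struct Options {
    using type = int;
    std::vector<char const*> _options;
    Options(std::initializer_list<char const*> options) : _options(options) {}
    std::string to_string(type id) const { return _options.at(id); }
    type to_id(std::string const& name) const {
        return static_cast<type>(
            std::find(_options.begin(), _options.end(), name) - _options.begin());
    }
};

// Hypothetical helper (not in the answer): true when `name` is a known option.
// to_id() yields _options.size() for unknown names, so compare against that.
template <typename Tag>
bool is_known(Options<Tag> const& o, std::string const& name) {
    return o.to_id(name) != static_cast<int>(o._options.size());
}

struct OnOffTag; // hypothetical tag type for this sketch

// Hypothetical instance with the same entries as the on/off table.
inline Options<OnOffTag> const& on_off_demo() {
    static Options<OnOffTag> const instance{"OFF", "ON"};
    return instance;
}
```

Because to_string goes through std::vector::at, an out-of-range id throws std::out_of_range rather than reading past the table.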
To support our AST types, we will create instances of these utilities like:
IcOptions<struct DllPdb> static const dll_pdb { "DLL", "PDB" };
IcOptions<struct Setting> static const setting { "DEBUG", "TRACE" };
IcOptions<struct OnOff> static const on_off { "OFF", "ON" };
IcOptions<struct SymType> static const sym_type{ "all", "imports", "exports" };
using Wildcard = FlavouredString<struct _Wild>;
using Filespec = FlavouredString<struct _Filespec>;
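The point of the tag parameter is that each alias becomes its own type even though both are strings underneath. A minimal sketch of that effect, using hypothetical tag names WildTag and FilespecTag and a hypothetical describe overload set that is not part of the answer:

```cpp
#include <string>
#include <type_traits>

template <typename Tag> struct FlavouredString : std::string {
    using std::string::string;
    using std::string::operator=;
};

struct WildTag;     // hypothetical tag names for this sketch
struct FilespecTag;
using Wildcard = FlavouredString<WildTag>;
using Filespec = FlavouredString<FilespecTag>;

// Different tags produce different types, although both derive from std::string.
static_assert(!std::is_same<Wildcard, Filespec>::value, "tags make distinct types");

// Overloads can therefore dispatch on the flavour of the string.
inline std::string describe(Wildcard const& w) { return "wildcard:" + w; }
inline std::string describe(Filespec const& f) { return "filespec:" + f; }
```

This is what later lets a rule with a Wildcard attribute and a rule with a Filespec attribute coexist without being confused for one another.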
AST Types
This goes a completely different route from before: instead of defining a general-purpose AST with arbitrary numbers of arbitrary-type arguments and values, I've opted to define the commands strongly-typed:
namespace ast {
struct LoadCommand {
Enum<DllPdb> kind = {};
Filespec filespec;
};
struct SetCommand {
Enum<Setting> setting = {};
Enum<OnOff> value = {};
};
struct ShowSettingCommand {
Enum<Setting> setting;
};
struct ShowModuleCommand {
Wildcard wildcard;
bool symbols = false;
};
using SymbolTypes = std::vector<Enum<SymType> >;
struct ShowSymbolsCommand {
bool all = false;
bool full = false;
Filespec out;
SymbolTypes types;
Wildcard wildcard;
};
using Command = boost::variant<
LoadCommand,
SetCommand,
ShowSettingCommand,
ShowModuleCommand,
ShowSymbolsCommand
>;
}
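One practical benefit of a variant of strongly typed commands is that consumers handle each command with its own overload. The following is only a sketch under stated assumptions: it uses std::variant and std::visit (C++17) to stay dependency-free, whereas the answer uses boost::variant, and the structs are simplified stand-ins with plain int fields. The names Describe and summarize are hypothetical.

```cpp
#include <string>
#include <variant>

// Simplified stand-ins for two of the AST structs (plain int/string fields).
struct LoadCommand { int kind = 0; std::string filespec; };
struct SetCommand  { int setting = 0; int value = 0; };

// std::variant is an assumption made to keep the sketch self-contained;
// the answer itself uses boost::variant.
using Command = std::variant<LoadCommand, SetCommand>;

// One overload per command type; a missing overload is a compile error.
struct Describe {
    std::string operator()(LoadCommand const& c) const {
        return std::string("load ") + (c.kind == 0 ? "DLL " : "PDB ") + c.filespec;
    }
    std::string operator()(SetCommand const& c) const {
        return std::string("set ") + (c.setting == 0 ? "DEBUG" : "TRACE") +
               (c.value == 0 ? " /OFF" : " /ON");
    }
};

inline std::string summarize(Command const& cmd) { return std::visit(Describe{}, cmd); }
```

With the generic AST from the question, the same code would have to inspect keyword strings and qualifier lists at run time instead.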
Adaptation is like before:
BOOST_FUSION_ADAPT_STRUCT(scl::ast::LoadCommand, kind, filespec)
BOOST_FUSION_ADAPT_STRUCT(scl::ast::SetCommand, setting, value)
BOOST_FUSION_ADAPT_STRUCT(scl::ast::ShowSettingCommand, setting)
BOOST_FUSION_ADAPT_STRUCT(scl::ast::ShowModuleCommand, wildcard, symbols)
Note that ShowSymbolsCommand is not adapted, because its rule doesn't follow the struct layout.
Parser Utilities
Let's support our core concepts with some composable parser factories:
// (case insensitive) keyword handling
static auto kw = [](auto p) { return x3::lexeme[p >> !(x3::graph - x3::char_("/=,()"))]; };
static auto ikw = [](auto p) { return x3::no_case [kw(p)]; };
static auto qualifier = [](auto p) { return x3::lexeme['/' >> ikw(p)]; };
I could explain these, but the usage below will make them clearer. So, without further ado, here is the trick that allows us to use any Options or IcOptions instance directly in a parser expression: X3's as_parser picks up an as_spirit_parser overload through argument-dependent lookup.
// Options and IcOptions
namespace util {
template <typename Tag>
auto as_spirit_parser(Options<Tag> const& o, bool to_lower = false) {
x3::symbols<typename Options<Tag>::type> p;
int n = 0;
for (std::string el : o._options) {
if (to_lower) boost::to_lower(el);
p.add(el, n++);
}
return kw(p);
}
template <typename Tag>
auto as_spirit_parser(IcOptions<Tag> const& o) {
return x3::no_case [ as_spirit_parser(o, true) ];
}
}
Nothing unexpected there, I suppose, but it does allow for elegant rule definitions:
The Rule Definitions
DEF_RULE(Filespec) = quoted_string | bare_string;
DEF_RULE(Wildcard) = lexeme[+char_("a-zA-Z0-9$_.\\*?+-")];
DEF_RULE(LoadCommand)
= ikw("load") >> ast::dll_pdb >> Filespec;
DEF_RULE(SetCommand)
= ikw("set") >> ast::setting >> qualifier(ast::on_off);
DEF_RULE(ShowSettingCommand)
= ikw("show") >> ast::setting;
DEF_RULE(ShowModuleCommand)
= ikw("show") >> ikw("module") >> Wildcard >> matches[qualifier("symbols")];
// ... ShowSymbolsQualifiers (see below) ...
DEF_RULE(ShowSymbolsCommand)
= ikw("show") >> ikw("symbols") >> *ShowSymbolsQualifiers;
DEF_RULE(Command)
= skip(blank)[ LoadCommand | SetCommand | ShowSettingCommand | ShowModuleCommand | ShowSymbolsCommand ];
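The ikw(...) helper used in these rules only accepts a whole keyword: the negative lookahead rejects a match that is immediately followed by another graphic character, unless that character is one of "/=,()". As a rough illustration, here is a hand-written stand-in in plain C++ (no Spirit involved). The function name match_ikw is hypothetical, and the sketch only mirrors the boundary logic, not the attribute handling.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hand-written stand-in for ikw(word): case-insensitive match of `word` at `pos`,
// followed by a boundary (end of input, a non-graphic character such as a space,
// or one of "/=,()"). Returns the position just past the keyword, or
// std::string::npos when there is no match.
inline std::size_t match_ikw(std::string const& input, std::size_t pos,
                             std::string const& word) {
    if (pos > input.size() || input.size() - pos < word.size())
        return std::string::npos;
    for (std::size_t i = 0; i < word.size(); ++i) {
        auto a = static_cast<unsigned char>(input[pos + i]);
        auto b = static_cast<unsigned char>(word[i]);
        if (std::tolower(a) != std::tolower(b)) return std::string::npos;
    }
    std::size_t const next = pos + word.size();
    if (next == input.size()) return next; // end of input counts as a boundary
    auto c = static_cast<unsigned char>(input[next]);
    bool const delimiter =
        std::string("/=,()").find(static_cast<char>(c)) != std::string::npos;
    if (std::isgraph(c) && !delimiter) return std::string::npos; // e.g. "loader"
    return next;
}
```

This is why "show symbols/type=exports" still parses without a space before the slash, while an input such as "loader" is not mistaken for the verb "load".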
The Heavier Lifting
You'll note I skipped ShowSymbolsQualifiers. That's because that's the only rule that cannot benefit from automatic attribute propagation, so I've resorted to using semantic actions:
Note the IIFE idiom allows for "very local" helper definitions
DEF_RULE(SymbolTypes) = [] {
auto type = as_parser(ast::sym_type);
return '(' >> (type % ',') >> ')' | repeat(1) [ type ];
}(); // IIFE pattern
RULE(ShowSymbolsQualifiers, ShowSymbolsCommand)
= [] {
auto set = [](auto member, auto p) {
auto propagate = [member](auto& ctx) {
traits::move_to(_attr(ctx), _val(ctx).*(member));
};
return as_parser(p)[propagate];
};
using T = ast::ShowSymbolsCommand;
return qualifier("all") >> set(&T::all, attr(true))
| qualifier("full") >> set(&T::full, attr(true))
| qualifier("out") >> set(&T::out, '=' >> Filespec)
| qualifier("type") >> set(&T::types, '=' >> SymbolTypes)
| set(&T::wildcard, Wildcard);
}(); // IIFE pattern
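The two ideas combined above, an immediately invoked lambda and a helper that takes a pointer to member, can be tried out without Spirit. The following sketch assumes a simplified struct ShowSymbols and hypothetical names Setter, setters and apply_all. It stores plain std::function actions instead of parsers with semantic actions, so it only illustrates the idiom.

```cpp
#include <functional>
#include <string>
#include <vector>

struct ShowSymbols { // simplified stand-in for ast::ShowSymbolsCommand
    bool all = false;
    bool full = false;
    std::string out;
};

using Setter = std::function<void(ShowSymbols&)>;

// IIFE: `set` is visible only inside the lambda, yet the result is a plain constant.
static auto const setters = [] {
    // Generic helper: bind a pointer to member and a value into one action.
    auto set = [](auto member, auto value) -> Setter {
        return [member, value](ShowSymbols& target) { target.*member = value; };
    };
    using T = ShowSymbols;
    return std::vector<Setter>{
        set(&T::all, true),
        set(&T::full, true),
        set(&T::out, std::string("s.txt")),
    };
}();

// Apply every stored action to a fresh command object.
inline ShowSymbols apply_all() {
    ShowSymbols cmd;
    for (auto const& s : setters) s(cmd);
    return cmd;
}
```

In the answer, the role of `value` is played by the attribute of the sub-parser and the role of `target` by the rule's own attribute.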
FULL DEMO
Live On Coliru
//#define BOOST_SPIRIT_X3_DEBUG
#include <algorithm> // std::find
#include <iomanip>
#include <iostream>
#include <string>
#include <utility> // std::exchange
#include <vector>
#include <boost/algorithm/string/case_conv.hpp> // to_lower
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/home/x3.hpp>
/* Synopsis:
*
* LOAD DLL <file-spec>
* LOAD PDB <file-spec>
* SET DEBUG {/ON | /OFF}
* SET TRACE {/ON | /OFF}
*
* SHOW DEBUG
* SHOW MODULE <wildcard-expression> [/SYMBOLS]
* SHOW SYMBOLS { [/ALL] [/FULL] [/OUT=<file-spec>] [/TYPE=(+{all,exports,imports})] [wild-card-expression] }...
* SHOW TRACE
*/
namespace scl {
namespace util {
template <typename Tag> struct FlavouredString : std::string {
using std::string::string;
using std::string::operator=;
};
template <typename Tag> struct Options {
using type = int; // TODO typed enums? Requires macro tedium
std::vector<char const*> _options;
Options(std::initializer_list<char const*> options) : _options(options) {}
Options(std::vector<char const*> options) : _options(options) {}
std::string to_string(type id) const { return _options.at(id); }
type to_id(std::string const& name) const { return find(_options.begin(), _options.end(), name) - _options.begin(); }
};
template <typename Tag> using Enum = typename Options<Tag>::type;
template <typename Tag> struct IcOptions : Options<Tag> { using Options<Tag>::Options; };
}
namespace ast {
using namespace util;
IcOptions<struct DllPdb> static const dll_pdb { "DLL", "PDB" };
IcOptions<struct Setting> static const setting { "DEBUG", "TRACE" };
IcOptions<struct OnOff> static const on_off { "OFF", "ON" };
IcOptions<struct SymType> static const sym_type{ "all", "imports", "exports" };
using Wildcard = FlavouredString<struct _Wild>;
using Filespec = FlavouredString<struct _Filespec>;
struct LoadCommand {
Enum<DllPdb> kind = {};
Filespec filespec;
};
struct SetCommand {
Enum<Setting> setting = {};
Enum<OnOff> value = {};
};
struct ShowSettingCommand {
Enum<Setting> setting;
};
struct ShowModuleCommand {
Wildcard wildcard;
bool symbols = false;
};
using SymbolTypes = std::vector<Enum<SymType> >;
struct ShowSymbolsCommand {
bool all = false;
bool full = false;
Filespec out;
SymbolTypes types;
Wildcard wildcard;
};
using Command = boost::variant<
LoadCommand,
SetCommand,
ShowSettingCommand,
ShowModuleCommand,
ShowSymbolsCommand
>;
}
}
#ifndef NDEBUG // for debug printing
namespace scl { namespace ast {
static inline std::ostream &operator<<(std::ostream &os, Wildcard const &w) { return os << std::quoted(w); }
static inline std::ostream &operator<<(std::ostream &os, Filespec const &s) { return os << std::quoted(s); }
static inline std::ostream &operator<<(std::ostream &os, LoadCommand const &cmd) {
return os << "LOAD " << dll_pdb.to_string(cmd.kind) << " " << cmd.filespec ;
}
static inline std::ostream &operator<<(std::ostream &os, SetCommand const &cmd) {
return os << "SET " << setting.to_string(cmd.setting) << " /" << on_off.to_string(cmd.value);
}
static inline std::ostream &operator<<(std::ostream &os, ShowSettingCommand const &cmd) {
return os << "SHOW " << setting.to_string(cmd.setting);
}
static inline std::ostream &operator<<(std::ostream &os, ShowModuleCommand const &cmd) {
return os << "SHOW MODULE " << cmd.wildcard << (cmd.symbols?" /SYMBOLS":"");
}
static inline std::ostream &operator<<(std::ostream &os, ShowSymbolsCommand const &cmd) {
os << "SHOW SYMBOLS";
if (cmd.all) os << " /ALL";
if (cmd.full) os << " /FULL";
if (cmd.out.size()) os << " /OUT=" << cmd.out;
if (cmd.types.size()) {
os << " /TYPE=(";
bool first = true;
for (auto type : cmd.types)
os << (std::exchange(first, false)?"":",") << sym_type.to_string(type);
os << ")";
}
return os << " " << cmd.wildcard;
}
} }
#endif
BOOST_FUSION_ADAPT_STRUCT(scl::ast::LoadCommand, kind, filespec)
BOOST_FUSION_ADAPT_STRUCT(scl::ast::SetCommand, setting, value)
BOOST_FUSION_ADAPT_STRUCT(scl::ast::ShowSettingCommand, setting)
BOOST_FUSION_ADAPT_STRUCT(scl::ast::ShowModuleCommand, wildcard, symbols)
// Grammar for simple command language
namespace scl {
namespace x3 = boost::spirit::x3;
// (case insensitive) keyword handling
static auto kw = [](auto p) { return x3::lexeme[p >> !(x3::graph - x3::char_("/=,()"))]; };
static auto ikw = [](auto p) { return x3::no_case [kw(p)]; };
static auto qualifier = [](auto p) { return x3::lexeme['/' >> ikw(p)]; };
// Options and IcOptions
namespace util {
template <typename Tag>
auto as_spirit_parser(Options<Tag> const& o, bool to_lower = false) {
x3::symbols<typename Options<Tag>::type> p;
int n = 0;
for (std::string el : o._options) {
if (to_lower) boost::to_lower(el);
p.add(el, n++);
}
return kw(p);
}
template <typename Tag>
auto as_spirit_parser(IcOptions<Tag> const& o) {
return x3::no_case [ as_spirit_parser(o, true) ];
}
}
// shorthand rule declarations
#define RULE(name, Attr) static auto const name = x3::rule<struct _##Attr, ast::Attr>{#Attr}
#define DEF_RULE(Attr) RULE(Attr, Attr)
using namespace x3;
auto const bare_string
= lexeme[+char_("a-zA-Z0-9$_.\\*?+-")]; // bare string taken from old "param" rule
auto const quoted_string
= lexeme['"' >> *(('\\' > char_) | ~char_('"')) >> '"'];
DEF_RULE(Filespec) = quoted_string | bare_string;
DEF_RULE(Wildcard) = lexeme[+char_("a-zA-Z0-9$_.\\*?+-")];
DEF_RULE(LoadCommand)
= ikw("load") >> ast::dll_pdb >> Filespec;
DEF_RULE(SetCommand)
= ikw("set") >> ast::setting >> qualifier(ast::on_off);
DEF_RULE(ShowSettingCommand)
= ikw("show") >> ast::setting;
DEF_RULE(ShowModuleCommand)
= ikw("show") >> ikw("module") >> Wildcard >> matches[qualifier("symbols")];
// Note the IIFE idiom allows for "very local" helper definitions
DEF_RULE(SymbolTypes) = [] {
auto type = as_parser(ast::sym_type);
return '(' >> (type % ',') >> ')' | repeat(1) [ type ];
}(); // IIFE idiom
RULE(ShowSymbolsQualifiers, ShowSymbolsCommand)
= [] {
auto set = [](auto member, auto p) {
auto propagate = [member](auto& ctx) {
traits::move_to(_attr(ctx), _val(ctx).*(member));
};
return as_parser(p)[propagate];
};
using T = ast::ShowSymbolsCommand;
return qualifier("all") >> set(&T::all, attr(true))
| qualifier("full") >> set(&T::full, attr(true))
| qualifier("out") >> set(&T::out, '=' >> Filespec)
| qualifier("type") >> set(&T::types, '=' >> SymbolTypes)
| set(&T::wildcard, Wildcard);
}(); // IIFE idiom
DEF_RULE(ShowSymbolsCommand)
= ikw("show") >> ikw("symbols") >> *ShowSymbolsQualifiers;
DEF_RULE(Command)
= skip(blank)[ LoadCommand | SetCommand | ShowSettingCommand | ShowModuleCommand | ShowSymbolsCommand ];
#undef DEF_RULE
#undef RULE
} // End namespace scl
int main() {
for (std::string const str : {
"load dll test.dll",
"LOAD pdb \"test special.pdb\"",
"LOAD pDb test.pdb",
"set debug /on",
"show debug",
"SHOW module test.dll/symbols",
"SHOW MODULE TEST.DLL /SYMBOLS",
"SHOW module test.dll / symbols",
"SHOW module test.dll",
"show symbols/type=exports test*",
"show symbols/type=(exports,imports) test*",
"show symbols test.dll/type=(imports,exports)",
"show symbols test.dll/tyPE=(imports,exports)",
"show symbols s*/out=s.txt",
"show symbols /all /full",
}) {
std::cout << " ======== " << std::quoted(str) << std::endl;
auto b = str.begin(), e = str.end();
scl::ast::Command cmd;
if (parse(b, e, scl::Command, cmd))
std::cout << " - Parsed: " << cmd << std::endl;
if (b != e)
std::cout << " - Remaining unparsed: " << std::quoted(std::string(b, e)) << std::endl;
}
}
Prints
======== "load dll test.dll"
- Parsed: LOAD DLL "test.dll"
======== "LOAD pdb \"test special.pdb\""
- Parsed: LOAD PDB "test special.pdb"
======== "LOAD pDb test.pdb"
- Parsed: LOAD PDB "test.pdb"
======== "set debug /on"
- Parsed: SET DEBUG /ON
======== "show debug"
- Parsed: SHOW DEBUG
======== "SHOW module test.dll/symbols"
- Parsed: SHOW MODULE "test.dll" /SYMBOLS
======== "SHOW MODULE TEST.DLL /SYMBOLS"
- Parsed: SHOW MODULE "TEST.DLL" /SYMBOLS
======== "SHOW module test.dll / symbols"
- Parsed: SHOW MODULE "test.dll"
- Remaining unparsed: "/ symbols"
======== "SHOW module test.dll"
- Parsed: SHOW MODULE "test.dll"
======== "show symbols/type=exports test*"
- Parsed: SHOW SYMBOLS /TYPE=(exports) "test*"
======== "show symbols/type=(exports,imports) test*"
- Parsed: SHOW SYMBOLS /TYPE=(exports,imports) "test*"
======== "show symbols test.dll/type=(imports,exports)"
- Parsed: SHOW SYMBOLS /TYPE=(imports,exports) "test.dll"
======== "show symbols test.dll/tyPE=(imports,exports)"
- Parsed: SHOW SYMBOLS /TYPE=(imports,exports) "test.dll"
======== "show symbols s*/out=s.txt"
- Parsed: SHOW SYMBOLS /OUT="s.txt" "s*"
======== "show symbols /all /full"
- Parsed: SHOW SYMBOLS /ALL /FULL ""
I got a working parser for reading position descriptions for a board game (international draughts, official grammar):
#include <boost/spirit/home/x3.hpp>
#include <iostream>
namespace x3 = boost::spirit::x3;
auto const colon = x3::lit(':');
auto const comma = x3::lit(',');
auto const dash = x3::lit('-');
auto const dot = x3::lit('.');
auto const king = x3::char_('K');
auto const color = x3::char_("BW");
auto const num_sq = x3::int_;
auto const num_pc = -king >> num_sq; // Kxx means king on square xx, xx a man on that square
auto const num_rng = num_pc >> dash >> num_sq; // xx-yy means range of squares xx through yy (inclusive)
auto const num_seq = (num_rng | num_pc) % comma; // <--- attribute should be std::vector<boost::variant>
auto const ccn = colon >> color >> -num_seq;
auto const num_not = x3::repeat(2)[ccn]; // need to specify both white and black pieces
auto const fen = color >> num_not >> -dot;
Live On Coliru
Now I want to extract the values from the synthesized attributes, so I did the boilerplate dance around Boost.Fusion etc.,
namespace ast {
struct num_pc { boost::optional<char> k; int sq; };
struct num_rng { boost::optional<char> k; int first, last; };
using rng_or_pc = boost::variant<num_rng, num_pc>;
struct num_seq { std::vector<rng_or_pc> sqrs; };
struct ccn { char c; boost::optional<num_seq> seq; };
struct num_not { std::vector<ccn> n; };
struct fen { char c; num_not n; };
} // namespace ast
BOOST_FUSION_ADAPT_STRUCT(ast::num_pc, (boost::optional<char>, k), (int, sq))
BOOST_FUSION_ADAPT_STRUCT(ast::num_rng, (boost::optional<char>, k), (int, first), (int, last))
BOOST_FUSION_ADAPT_STRUCT(ast::num_seq, (std::vector<ast::rng_or_pc>, sqrs))
BOOST_FUSION_ADAPT_STRUCT(ast::ccn, (char, c), (boost::optional<ast::num_seq>, seq))
BOOST_FUSION_ADAPT_STRUCT(ast::num_not, (std::vector<ast::ccn>, n))
BOOST_FUSION_ADAPT_STRUCT(ast::fen, (char, c), (ast::num_not, n))
x3::rule<class num_pc_class, ast::num_pc > num_pc = "num_pc";
x3::rule<class num_rng_class, ast::num_rng> num_rng = "num_rng";
x3::rule<class num_seq_class, ast::num_seq> num_seq = "num_seq";
x3::rule<class ccn_class, ast::ccn > ccn = "ccn";
x3::rule<class num_not_class, ast::num_not> num_not = "num_not";
x3::rule<class fen_class, ast::fen > fen = "fen";
auto const colon = x3::lit(':');
auto const comma = x3::lit(',');
auto const dash = x3::lit('-');
auto const dot = x3::lit('.');
auto const king = x3::char_('K');
auto const color = x3::char_("BW");
auto const num_sq = x3::int_;
auto const num_pc_def = -king >> num_sq;
auto const num_rng_def = num_pc >> dash >> num_sq;
auto const num_seq_def = (num_rng | num_pc) % comma;
auto const ccn_def = colon >> color >> -num_seq;
auto const num_not_def = x3::repeat(2)[ccn];
auto const fen_def = color >> num_not >> -dot;
BOOST_SPIRIT_DEFINE(num_pc, num_rng, num_seq, ccn, num_not, fen)
Live On Coliru
However, I then get an error saying that
error: static_assert failed "Attribute does not have the expected size."
and a couple of pages down:
main.cpp:16:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'std::vector<boost::variant<ast::num_rng, ast::num_pc>, std::allocator<boost::variant<ast::num_rng, ast::num_pc> > >' to 'ast::num_seq' for 1st argument
struct num_seq { std::vector<rng_or_pc> sqrs; };
       ^
main.cpp:16:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'std::vector<boost::variant<ast::num_rng, ast::num_pc>, std::allocator<boost::variant<ast::num_rng, ast::num_pc> > >' to 'const ast::num_seq' for 1st argument
struct num_seq { std::vector<rng_or_pc> sqrs; };
Question: where does this error come from, and how can I resolve it? Apparently the synthesized attribute of my num_seq rule is not the std::vector<boost::variant> I expected.
I've spent some time trying to understand the grammar.
I strongly suggest readable identifiers. It's very hard to understand what's going on, while I have the strong impression it's actually a really simple grammar.
I suggest the simplified version shown below.
Because your grammar doesn't use recursion, there's no real need for the rule and tagged parser types.
Also use a namespace for the parser artefacts.
Consider encapsulating the use of a skipper instead of letting the caller decide (x3::skip[]).
Add a few helpers to be able to print the AST for verification:
template <typename T> std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
os << "{"; for (auto& el : v) os << el << " "; return os << "}";
}
std::ostream& operator<<(std::ostream& os, num_pc const& p) { if (p.k) os << p.k; return os << p.sq; }
std::ostream& operator<<(std::ostream& os, num_rng const& r) { return os << r.pc << "-" << r.last; }
std::ostream& operator<<(std::ostream& os, ccn const& o) { return os << o.c << " " << o.seq; }
std::ostream& operator<<(std::ostream& os, num_not const& nn) { return os << nn.n; }
I'd avoid wrapping the other vector unnecessarily too:
using num_not = std::vector<ccn>;
Use the modern ADAPT macros (as you're using C++14 by definition):
BOOST_FUSION_ADAPT_STRUCT(ast::num_pc, k, sq)
BOOST_FUSION_ADAPT_STRUCT(ast::num_rng, pc, last)
BOOST_FUSION_ADAPT_STRUCT(ast::ccn, c, seq)
BOOST_FUSION_ADAPT_STRUCT(ast::fen, c, n)
Live Demo
Live On Coliru
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/as_vector.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/optional/optional_io.hpp>
#include <boost/optional.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/variant.hpp>
#include <iostream>
#include <vector>
namespace ast {
struct num_pc {
boost::optional<char> k;
int sq;
};
struct num_rng {
num_pc pc;
int last;
};
using rng_or_pc = boost::variant<num_rng, num_pc>;
using num_seq = std::vector<rng_or_pc>;
struct ccn {
char c;
boost::optional<num_seq> seq;
};
using num_not = std::vector<ccn>;
struct fen {
char c;
num_not n;
};
template <typename T> std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
os << "{"; for (auto& el : v) os << el << " "; return os << "}";
}
std::ostream& operator<<(std::ostream& os, num_pc const& p) { if (p.k) os << p.k; return os << p.sq; }
std::ostream& operator<<(std::ostream& os, num_rng const& r) { return os << r.pc << "-" << r.last; }
std::ostream& operator<<(std::ostream& os, ccn const& o) { return os << o.c << " " << o.seq; }
}
BOOST_FUSION_ADAPT_STRUCT(ast::num_pc, k, sq)
BOOST_FUSION_ADAPT_STRUCT(ast::num_rng, pc, last)
BOOST_FUSION_ADAPT_STRUCT(ast::ccn, c, seq)
BOOST_FUSION_ADAPT_STRUCT(ast::fen, c, n)
namespace FEN {
namespace x3 = boost::spirit::x3;
namespace grammar
{
using namespace x3;
template<typename T>
auto as = [](auto p) { return rule<struct _, T>{} = as_parser(p); };
uint_type const number {};
auto const color = char_("BW");
auto const num_pc = as<ast::num_pc> ( -char_('K') >> number );
auto const num_rng = as<ast::num_rng> ( num_pc >> '-' >> number );
auto const num_seq = as<ast::num_seq> ( (num_rng | num_pc) % ',' );
auto const ccn = as<ast::ccn> ( ':' >> color >> -num_seq );
auto const num_not = as<ast::num_not> ( repeat(2)[ccn] );
auto const fen = as<ast::fen> ( color >> num_not >> -lit('.') );
}
using grammar::fen;
}
int main() {
for (std::string const t : {
"B:W18,24,27,28,K10,K15:B12,16,20,K22,K25,K29",
"B:W18,19,21,23,24,26,29,30,31,32:B1,2,3,4,6,7,9,10,11,12",
"W:B1-20:W31-50", // initial position
"W:B:W", // empty board
"W:B1:W", // only black pieces
"W:B:W50" // only white pieces
}) {
auto b = t.begin(), e = t.end();
ast::fen data;
bool ok = phrase_parse(b, e, FEN::fen, FEN::x3::space, data);
std::cout << t << "\n";
if (ok) {
std::cout << "Parsed: " << boost::fusion::as_vector(data) << "\n";
} else {
std::cout << "Parse failed:\n";
std::cout << "\t on input: " << t << "\n";
}
if (b != e)
std::cout << "\t Remaining unparsed: '" << std::string(b, e) << "'\n";
}
}
Prints:
B:W18,24,27,28,K10,K15:B12,16,20,K22,K25,K29
Parsed: (B {W {18 24 27 28 K10 K15 } B {12 16 20 K22 K25 K29 } })
B:W18,19,21,23,24,26,29,30,31,32:B1,2,3,4,6,7,9,10,11,12
Parsed: (B {W {18 19 21 23 24 26 29 30 31 32 } B {1 2 3 4 6 7 9 10 11 12 } })
W:B1-20:W31-50
Parsed: (W {B {1-20 } W {31-50 } })
W:B:W
Parsed: (W {B -- W -- })
W:B1:W
Parsed: (W {B {1 } W -- })
W:B:W50
Parsed: (W {B -- W {50 } })
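Once a position has been parsed, the AST can be processed with ordinary code. The sketch below expands the "xx-yy" range notation into individual squares. It rests on several assumptions: the structs are dependency-free mirrors of the AST that use std::optional and std::variant instead of the boost types, Piece and expand are hypothetical names, and a K prefix on a range is taken to mark every square in that range as a king, which the question does not specify.

```cpp
#include <optional>
#include <variant>
#include <vector>

// Dependency-free mirrors of the AST above (std:: types instead of boost:: ones).
struct num_pc  { std::optional<char> k; int sq; };
struct num_rng { num_pc pc; int last; };
using rng_or_pc = std::variant<num_rng, num_pc>;
using num_seq = std::vector<rng_or_pc>;

struct Piece { int square; bool king; }; // hypothetical flattened representation

// Expand inclusive "xx-yy" ranges and single squares into one flat list.
// Assumption: a K prefix on a range applies to every square in the range.
inline std::vector<Piece> expand(num_seq const& seq) {
    std::vector<Piece> pieces;
    for (auto const& item : seq) {
        if (auto const* r = std::get_if<num_rng>(&item)) {
            for (int sq = r->pc.sq; sq <= r->last; ++sq)
                pieces.push_back({sq, r->pc.k.has_value()});
        } else {
            auto const& p = std::get<num_pc>(item);
            pieces.push_back({p.sq, p.k.has_value()});
        }
    }
    return pieces;
}
```

With the boost-based AST from the demo, the same loop could be written with boost::apply_visitor or boost::get instead of std::get_if.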