The examples in the Boost.Spirit documentation seem to fall in two cases:
1/ Define a parser in a function: semantic actions can access local variables and data as they are local lambdas. Like push_back here: https://www.boost.org/doc/libs/master/libs/spirit/doc/x3/html/spirit_x3/tutorials/number_list___stuffing_numbers_into_a_std__vector.html
2/ Define a parser in a namespace, like here: https://www.boost.org/doc/libs/1_69_0/libs/spirit/doc/x3/html/spirit_x3/tutorials/minimal.html
which seems to be necessary to be able to invoke BOOST_SPIRIT_DEFINE.
My question is: how to combine both (properly, without globals) ? My dream API would be to pass some argument to phrase_parse and then do some x3::_arg(ctx) but I couldn't find anything like this.
Here is for instance my parser: for now the actions are writing to std::cerr. What if I wanted to write to a custom std::ostream& instead, that would be passed to the parse function?
using namespace boost::spirit;
using namespace boost::spirit::x3;
rule<struct id_action> action = "action";
rule<struct id_array> array = "array";
rule<struct id_empty_array> empty_array = "empty_array";
rule<struct id_atom> atom = "atom";
rule<struct id_sequence> sequence = "sequence";
rule<struct id_root> root = "root";
auto access_index_array = [] (const auto& ctx) { std::cerr << "access_array: " << x3::_attr(ctx) << "\n" ;};
auto access_empty_array = [] (const auto& ctx) { std::cerr << "access_empty_array\n" ;};
auto access_named_member = [] (const auto& ctx) { std::cerr << "access_named_member: " << x3::_attr(ctx) << "\n" ;};
auto start_action = [] (const auto& ctx) { std::cerr << "start action\n" ;};
auto finish_action = [] (const auto& ctx) { std::cerr << "finish action\n" ;};
auto create_array = [] (const auto& ctx) { std::cerr << "create_array\n" ;};
const auto action_def = +(lit('.')[start_action]
>> -((+alnum)[access_named_member])
>> *(('[' >> x3::int_ >> ']')[access_index_array] | lit("[]")[access_empty_array]));
const auto sequence_def = (action[finish_action] % '|');
const auto array_def = ('[' >> sequence >> ']')[create_array];
const auto root_def = array | action;
BOOST_SPIRIT_DEFINE(action)
BOOST_SPIRIT_DEFINE(array)
BOOST_SPIRIT_DEFINE(sequence)
BOOST_SPIRIT_DEFINE(root)
bool parse(std::string_view str)
{
using ascii::space;
auto first = str.begin();
auto last = str.end();
bool r = phrase_parse(
first, last,
parser::array_def | parser::sequence_def,
ascii::space
);
if (first != last)
return false;
return r;
}
About the approaches:
1/ Yes, this is viable for small, contained parsers. Typically only used in a single TU, and exposed via non-generic interface.
2/ This is the approach for (much) larger grammars, that you might wish to spread across TUs, and/or are instantiated across several TU's generically.
Note that you do NOT need BOOST_SPIRIT_DEFINE unless you
have recursive rules
want to split declaration from definition. [This becomes pretty complicated, and I recommend against using that for X3.]
The Question
My question is: how to combine both (properly, without globals) ?
You can't combine something with namespace level declarations, if one of the requiremenents is "without globals".
My dream API would be to pass some argument to phrase_parse and then do some x3::_arg(ctx) but I couldn't find anything like this.
I don't know what you think x3::_arg(ctx) would do, in that particular dream :)
Here is for instance my parser: for now the actions are writing to std::cerr. What if I wanted to write to a custom std::ostream& instead, that would be passed to the parse function?
Now that's a concrete question. I'd say: use the context.
You could make it so that you can use x3::get<ostream>(ctx) returns the stream:
struct ostream{};
auto access_index_array = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_array: " << x3::_attr(ctx) << "\n" ;};
auto access_empty_array = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_empty_array\n" ;};
auto access_named_member = [] (const auto& ctx) { x3::get<ostream>(ctx) << "access_named_member: " << x3::_attr(ctx) << "\n" ;};
auto start_action = [] (const auto& ctx) { x3::get<ostream>(ctx) << "start action\n" ;};
auto finish_action = [] (const auto& ctx) { x3::get<ostream>(ctx) << "finish action\n" ;};
auto create_array = [] (const auto& ctx) { x3::get<ostream>(ctx) << "create_array\n";};
Now you need to put the tagged param in the context during parsing:
bool r = phrase_parse(
f, l,
x3::with<parser::ostream>(std::cerr)[parser::array_def | parser::sequence_def],
x3::space);
Live Demo: http://coliru.stacked-crooked.com/a/a26c8eb0af6370b9
Prints
start action
access_named_member: a
finish action
start action
access_named_member: b
start action
start action
access_array: 2
start action
access_named_member: foo
start action
access_empty_array
finish action
start action
access_named_member: c
finish action
create_array
true
Intermixed with the standard X3 debug output:
<sequence>
<try>.a|.b..[2].foo.[]|.c</try>
<action>
<try>.a|.b..[2].foo.[]|.c</try>
<success>|.b..[2].foo.[]|.c]</success>
</action>
<action>
<try>.b..[2].foo.[]|.c]</try>
<success>|.c]</success>
</action>
<action>
<try>.c]</try>
<success>]</success>
</action>
<success>]</success>
</sequence>
But Wait #1 - Event Handlers
It looks like you're parsing something similar to JSON Pointer or jq syntax. In the case that you wanted to provide a callback-interface (SAX-events), why not bind the callback interface instead of the actions:
struct handlers {
using N = x3::unused_type;
virtual void index(int) {}
virtual void index(N) {}
virtual void property(std::string) {}
virtual void start(N) {}
virtual void finish(N) {}
virtual void create_array(N) {}
};
#define EVENT(e) ([](auto& ctx) { x3::get<handlers>(ctx).e(x3::_attr(ctx)); })
const auto action_def =
+(x3::lit('.')[EVENT(start)] >> -((+x3::alnum)[EVENT(property)]) >>
*(('[' >> x3::int_ >> ']')[EVENT(index)] | x3::lit("[]")[EVENT(index)]));
const auto sequence_def = action[EVENT(finish)] % '|';
const auto array_def = ('[' >> sequence >> ']')[EVENT(create_array)];
const auto root_def = array | action;
Now you can implement all handlers neatly in one interface:
struct default_handlers : parser::handlers {
std::ostream& os;
default_handlers(std::ostream& os) : os(os) {}
void index(int i) override { os << "access_array: " << i << "\n"; };
void index(N) override { os << "access_empty_array\n" ; };
void property(std::string n) override { os << "access_named_member: " << n << "\n" ; };
void start(N) override { os << "start action\n" ; };
void finish(N) override { os << "finish action\n" ; };
void create_array(N) override { os << "create_array\n"; };
};
auto f = str.begin(), l = str.end();
bool r = phrase_parse(f, l,
x3::with<parser::handlers>(default_handlers{std::cout}) //
[parser::array_def | parser::sequence_def],
x3::space);
See it Live On Coliru once again:
start action
access_named_member: a
finish action
start action
access_named_member: b
start action
start action
access_array: 2
start action
access_named_member: foo
start action
access_empty_array
finish action
start action
access_named_member: c
finish action
create_array
true
But Wait #2 - No Actions
The natural way to expose attributes would be to build an AST. See also Boost Spirit: "Semantic actions are evil"?
Without further ado:
namespace AST {
using Id = std::string;
using Index = int;
struct Member {
std::optional<Id> name;
};
struct Indexer {
std::optional<int> index;
};
struct Action {
Member member;
std::vector<Indexer> indexers;
};
using Actions = std::vector<Action>;
using Sequence = std::vector<Actions>;
struct ArrayCtor {
Sequence actions;
};
using Root = boost::variant<ArrayCtor, Actions>;
}
Of course, I'm making some assumptions. The rules can be much simplified:
namespace parser {
template <typename> struct Tag {};
#define AS(T, p) (x3::rule<Tag<AST::T>, AST::T>{#T} = p)
auto id = AS(Id, +x3::alnum);
auto member = AS(Member, x3::lit('.') >> -id);
auto indexer = AS(Indexer,'[' >> -x3::int_ >> ']');
auto action = AS(Action, member >> *indexer);
auto actions = AS(Actions, +action);
auto sequence = AS(Sequence, actions % '|');
auto array = AS(ArrayCtor, '[' >> -sequence >> ']'); // covers empty array
auto root = AS(Root, array | actions);
} // namespace parser
And the parsing function returns the AST:
AST::Root parse(std::string_view str) {
auto f = str.begin(), l = str.end();
AST::Root parsed;
phrase_parse(f, l, x3::expect[parser::root >> x3::eoi], x3::space, parsed);
return parsed;
}
(Note that it now throws x3::expection_failure if the input is invalid or not completely parsed)
int main() {
std::cout << parse("[.a|.b..[2].foo.[]|.c]");
}
Now prints:
[.a|.b./*none*/./*none*/[2].foo./*none*/[/*none*/]|.c]
See it Live On Coliru
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <ostream>
#include <optional>
namespace x3 = boost::spirit::x3;
namespace AST {
using Id = std::string;
using Index = int;
struct Member {
std::optional<Id> name;
};
struct Indexer {
std::optional<int> index;
};
struct Action {
Member member;
std::vector<Indexer> indexers;
};
using Actions = std::vector<Action>;
using Sequence = std::vector<Actions>;
struct ArrayCtor {
Sequence actions;
};
using Root = boost::variant<ArrayCtor, Actions>;
}
BOOST_FUSION_ADAPT_STRUCT(AST::Member, name)
BOOST_FUSION_ADAPT_STRUCT(AST::Indexer, index)
BOOST_FUSION_ADAPT_STRUCT(AST::Action, member, indexers)
BOOST_FUSION_ADAPT_STRUCT(AST::ArrayCtor, actions)
namespace parser {
template <typename> struct Tag {};
#define AS(T, p) (x3::rule<Tag<AST::T>, AST::T>{#T} = p)
auto id = AS(Id, +x3::alnum);
auto member = AS(Member, x3::lit('.') >> -id);
auto indexer = AS(Indexer,'[' >> -x3::int_ >> ']');
auto action = AS(Action, member >> *indexer);
auto actions = AS(Actions, +action);
auto sequence = AS(Sequence, actions % '|');
auto array = AS(ArrayCtor, '[' >> -sequence >> ']'); // covers empty array
auto root = AS(Root, array | actions);
} // namespace parser
AST::Root parse(std::string_view str) {
auto f = str.begin(), l = str.end();
AST::Root parsed;
phrase_parse(f, l, x3::expect[parser::root >> x3::eoi], x3::space, parsed);
return parsed;
}
// for debug output
#include <iostream>
#include <iomanip>
namespace AST {
static std::ostream& operator<<(std::ostream& os, Member const& m) {
return os << "." << m.name.value_or("/*none*/");
}
static std::ostream& operator<<(std::ostream& os, Indexer const& i) {
if (i.index)
return os << "[" << *i.index << "]";
else
return os << "[/*none*/]";
}
static std::ostream& operator<<(std::ostream& os, Action const& a) {
os << a.member;
for (auto& i : a.indexers)
os << i;
return os;
}
static std::ostream& operator<<(std::ostream& os, Actions const& aa) {
for (auto& a : aa)
os << a;
return os;
}
static std::ostream& operator<<(std::ostream& os, Sequence const& s) {
bool first = true;
for (auto& a : s)
os << (std::exchange(first, false) ? "" : "|") << a;
return os;
}
static std::ostream& operator<<(std::ostream& os, ArrayCtor const& ac) {
return os << "[" << ac.actions << "]";
}
}
int main() {
std::cout << parse("[.a|.b..[2].foo.[]|.c]");
}
Related
My ultimate goal is to write a hlsl shading language parser. My first experience with parsing has been by following bob nystrom's "crafting interpreters".
The issue I am currently facing is that I am trying to parse a 'chained member access' sequence (or multiple 'dot operators)....
first.Second.third
Obviously I could parse that into a list % sequence as a vector of strings, but I am trying to stick to the ast shown in the crafting interpreters book by having nested 'Get' nodes.
I am trying to parse this nested Get sequence so that I can eventually put that into a Set ast node. But I thought it would be best to at least get the 'Get' part first. before building on top of that.
https://craftinginterpreters.com/classes.html#set-expressions
Here's my minimal compiling program that tries to do that....
#include "boost/variant.hpp"
#include <boost/config/warning_disable.hpp>
#include <boost/fusion/adapted/std_tuple.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/fusion/include/std_tuple.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/position_tagged.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
#include <boost/spirit/home/x3/support/utility/annotate_on_success.hpp>
#include <boost/spirit/home/x3/support/utility/error_reporting.hpp>
#include <iostream>
#include <string>
#include <tuple>
#include <variant>
namespace hlsl {
namespace ast {
struct Get;
struct ExprVoidType {};
struct Variable {
Variable(std::string name) : name(std::move(name)) {
}
Variable() = default;
std::string name;
};
using Expr =
boost::spirit::x3::variant<ExprVoidType,
boost::spirit::x3::forward_ast<Get>, Variable>;
struct Get {
Get(Expr& object, std::string name) : object_{object}, name_{name} {
}
Get() = default;
Expr object_;
std::string name_;
};
} // namespace ast
} // namespace hlsl
struct visitor {
using result_type = void;
void operator()(const std::string name) {
std::cout << name << "\n";
}
void operator()(const hlsl::ast::Get& get) {
std::cout << "get expr\n";
get.object_.apply_visitor(*this);
std::cout << get.name_ << "\n";
}
void operator()(const hlsl::ast::Variable& var) {
std::cout << var.name << "\n";
};
void operator()(const hlsl::ast::ExprVoidType& var){};
};
BOOST_FUSION_ADAPT_STRUCT(hlsl::ast::Variable, name)
BOOST_FUSION_ADAPT_STRUCT(hlsl::ast::Get, object_, name_)
namespace x3 = boost::spirit::x3;
namespace ascii = boost::spirit::x3::ascii;
using ascii::char_;
using ascii::space;
using x3::alnum;
using x3::alpha;
using x3::double_;
using x3::int_;
using x3::lexeme;
using x3::lit;
struct error_handler {
template <typename Iterator, typename Exception, typename Context>
x3::error_handler_result on_error(Iterator& first, Iterator const& last,
Exception const& x,
Context const& context) {
auto& error_handler = x3::get<x3::error_handler_tag>(context).get();
std::string message = "Error! Expecting: " + x.which() + " here:";
error_handler(x.where(), message);
return x3::error_handler_result::fail;
}
};
/////////////////////////////////////////
// RULES
///////////////////////////////////////////
x3::rule<class identifier_class, std::string> const identifier = "identifier";
auto const identifier_def = +alnum;
BOOST_SPIRIT_DEFINE(identifier);
x3::rule<class expression_class, hlsl::ast::Expr> const expression =
"expression";
x3::rule<class variable_class, hlsl::ast::Variable> const variable = "variable";
x3::rule<class get_class, hlsl::ast::Get> const get = "get";
auto const variable_def = identifier;
BOOST_SPIRIT_DEFINE(variable);
auto const expression_def = get | variable;
BOOST_SPIRIT_DEFINE(expression);
///////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////
// get
auto const get_def = (variable | expression) >> '.' >> identifier;
BOOST_SPIRIT_DEFINE(get);
/////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////
struct program_class;
x3::rule<program_class, hlsl::ast::Expr> const program = "program";
auto const program_def = get;
BOOST_SPIRIT_DEFINE(program);
struct program_class : error_handler {};
// struct program_class;
/////////////////////////////////////////////////////////
// } // namespace parser
// } // namespace client
////////////////////////////////////////////////////////////////////////////
// Main program
////////////////////////////////////////////////////////////////////////////
int main() {
using boost::spirit::x3::error_handler_tag;
using boost::spirit::x3::with;
using iterator_type = std::string::const_iterator;
using error_handler_type = boost::spirit::x3::error_handler<iterator_type>;
// input string
std::string input = "first.Second.third";
hlsl::ast::Expr fs;
auto iter = input.begin();
auto const end = input.end();
// Our error handler
error_handler_type error_handler(iter, end, std::cerr);
auto const parser =
// we pass our error handler to the parser so we can access
// it later in our on_error and on_sucess handlers
with<error_handler_tag>(std::ref(error_handler))[program];
bool r;
r = phrase_parse(iter, end, parser, space, fs);
visitor v;
if (r) {
std::cout << "Parse Suceeded\n\n";
fs.apply_visitor(v);
} else {
std::cout << "Sorry :(\n\n";
std::cout << *iter;
}
std::cout << "Bye... :-) \n\n";
return 0;
}
What I want is something like this
Get {
object_: Get {
object_: Variable {
name : "first"
},
name_: second
},
name_: third
}
Is this kind of thing even possible using x3 and the way it constructs parsers from grammar?
Sure. Your grammar parses left-to right, and that's also how you want to build your ast (outside-in, not inside out).
I'd rephrase the whole thing:
expression = variable >> *('.' >> identifier);
Now you'll have to massage the attribute propagation as each . member access wraps the previous expression in another Get{expression, name} instance:
x3::rule<struct identifier_, std::string> const identifier{"identifier"};
x3::rule<struct variable_, ast::Variable> const variable{"variable"};
x3::rule<struct expression_, ast::Expr> const expression{"expression"};
x3::rule<struct program_, ast::Expr> const program{"program"};
auto identifier_def = x3::lexeme[x3::alpha >> *x3::alnum];
auto variable_def = identifier;
Now let's use two semantic actions to propagate the expression parts:
auto as_expr = [](auto& ctx) { _val(ctx) = ast::Expr(std::move(_attr(ctx))); };
auto as_get = [](auto& ctx) {
_val(ctx) = ast::Get{std::move(_val(ctx)), _attr(ctx)};
};
auto expression_def = variable[as_expr] >> *('.' >> identifier[as_get]);
Let's also bake the skipper into the grammar while we're at it:
auto program_def = x3::skip(x3::space)[expression];
Live Demo
With a lot of simplifications, e.g. for the AST & visitor:
Live On Coliru
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
#include <boost/spirit/home/x3/support/utility/error_reporting.hpp>
#include <iomanip>
#include <iostream>
namespace x3 = boost::spirit::x3;
namespace hlsl {
namespace ast {
struct Void {};
struct Get;
struct Variable {
std::string name;
};
using Expr = x3::variant<Void, x3::forward_ast<Get>, Variable>;
struct Get {
Expr object_;
std::string property_;
};
} // namespace ast
struct printer {
std::ostream& _os;
using result_type = void;
void operator()(hlsl::ast::Get const& get) const {
_os << "get { object_:";
get.object_.apply_visitor(*this);
_os << ", property_:" << quoted(get.property_) << " }";
}
void operator()(hlsl::ast::Variable const& var) const {
_os << "var{" << quoted(var.name) << "}";
};
void operator()(hlsl::ast::Void const&) const { _os << "void{}"; };
};
} // namespace hlsl
BOOST_FUSION_ADAPT_STRUCT(hlsl::ast::Variable, name)
BOOST_FUSION_ADAPT_STRUCT(hlsl::ast::Get, object_, property_)
namespace hlsl::parser {
struct eh_tag;
struct error_handler {
template <typename It, typename Exc, typename Ctx>
auto on_error(It&, It, Exc const& x, Ctx const& context) const {
x3::get<eh_tag>(context)( //
x.where(), "Error! Expecting: " + x.which() + " here:");
return x3::error_handler_result::fail;
}
};
struct program_ : error_handler {};
x3::rule<struct identifier_, std::string> const identifier{"identifier"};
x3::rule<struct variable_, ast::Variable> const variable{"variable"};
x3::rule<struct expression_, ast::Expr> const expression{"expression"};
x3::rule<struct program_, ast::Expr> const program{"program"};
auto as_expr = [](auto& ctx) { _val(ctx) = ast::Expr(std::move(_attr(ctx))); };
auto as_get = [](auto& ctx) {
_val(ctx) = ast::Get{std::move(_val(ctx)), _attr(ctx)};
};
auto identifier_def = x3::lexeme[x3::alpha >> *x3::alnum];
auto variable_def = identifier;
auto expression_def = variable[as_expr] >> *('.' >> identifier)[as_get];
auto program_def = x3::skip(x3::space)[expression];
BOOST_SPIRIT_DEFINE(variable, expression, identifier, program);
} // namespace hlsl::parser
int main() {
using namespace hlsl;
for (std::string const input :
{
"first",
"first.second",
"first.Second.third",
}) //
{
std::cout << "===== " << quoted(input) << "\n";
auto f = input.begin(), l = input.end();
// Our error handler
auto const p = x3::with<parser::eh_tag>(
x3::error_handler{f, l, std::cerr})[hlsl::parser::program];
if (hlsl::ast::Expr fs; parse(f, l, p, fs)) {
fs.apply_visitor(hlsl::printer{std::cout << "Parsed: "});
std::cout << "\n";
} else {
std::cout << "Parse failed at " << quoted(std::string_view(f, l)) << "\n";
}
}
}
Prints
===== "first"
Parsed: var{"first"}
===== "first.second"
Parsed: get { object_:var{"first"}, property_:"second" }
===== "first.Second.third"
Parsed: get { object_:get { object_:var{"first"}, property_:"Second" }, property_:"third" }
More Simplifications
In the current scenario none of the rules are recursive, so don't need the _DEFINE magic. Assuming you need recursion in the expression later, you could at least remove some redundancy:
namespace hlsl::parser {
x3::rule<struct expression_, ast::Expr> const expression{"expression"};
auto as_expr = [](auto& ctx) { _val(ctx) = ast::Expr(std::move(_attr(ctx))); };
auto as_get = [](auto& ctx) { _val(ctx) = ast::Get{std::move(_val(ctx)), _attr(ctx)}; };
auto identifier
= x3::rule<void, std::string>{"identifier"}
= x3::lexeme[x3::alpha >> *x3::alnum];
auto variable = x3::rule<void, ast::Variable>{"variable"} = identifier;
auto expression_def = variable[as_expr] >> *('.' >> identifier)[as_get];
auto program = x3::skip(x3::space)[expression];
BOOST_SPIRIT_DEFINE(expression)
} // namespace hlsl::parser
Note also that the lexeme is important to suppress skipping (Boost spirit skipper issues)
See it Live On Coliru as well.
Oh and for bonus, a version without x3::variant or visitation:
Live On Coliru
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
namespace x3 = boost::spirit::x3;
namespace hlsl::ast {
struct Void {};
struct Get;
struct Variable {
std::string name;
};
using Expr = boost::variant<Void, boost::recursive_wrapper<Get>, Variable>;
struct Get {
Expr object_;
std::string property_;
};
static inline std::ostream& operator<<(std::ostream& os, Void) {
return os << "void()";
}
static inline std::ostream& operator<<(std::ostream& os, Variable const& v) {
return os << "var{" << std::quoted(v.name) << "}";
}
static inline std::ostream& operator<<(std::ostream& os, Get const& g) {
return os << "get{ object_:" << g.object_ << ", property_:" << quoted(g.property_)
<< " }";
}
} // namespace hlsl::ast
BOOST_FUSION_ADAPT_STRUCT(hlsl::ast::Variable, name)
BOOST_FUSION_ADAPT_STRUCT(hlsl::ast::Get, object_, property_)
namespace hlsl::parser {
x3::rule<struct expression_, ast::Expr> const expression{"expression"};
auto as_expr = [](auto& ctx) { _val(ctx) = ast::Expr(std::move(_attr(ctx))); };
auto as_get = [](auto& ctx) { _val(ctx) = ast::Get{std::move(_val(ctx)), _attr(ctx)}; };
auto identifier
= x3::rule<void, std::string>{"identifier"}
= x3::lexeme[x3::alpha >> *x3::alnum];
auto variable = x3::rule<void, ast::Variable>{"variable"} = identifier;
auto expression_def = variable[as_expr] >> *('.' >> identifier)[as_get];
auto program = x3::skip(x3::space)[expression];
BOOST_SPIRIT_DEFINE(expression)
} // namespace hlsl::parser
int main() {
using namespace hlsl;
for (std::string const input :
{
"first",
"first.second",
"first.Second.third",
}) //
{
std::cout << "===== " << quoted(input) << "\n";
auto f = input.begin(), l = input.end();
if (ast::Expr fs; parse(f, l, parser::program, fs)) {
std::cout << "Parsed: " << fs << "\n";
} else {
std::cout << "Parse failed at " << quoted(std::string_view(f, l)) << "\n";
}
}
}
Prints just the same:
===== "first"
Parsed: var{"first"}
===== "first.second"
Parsed: get{ object_:var{"first"}, property_:"second" }
===== "first.Second.third"
Parsed: get{ object_:get{ object_:var{"first"}, property_:"Second" }, property_:"third" }
That's >100 lines of code removed. With no functionality sacrificed.
What is the algorithm for developing a string parser to create a geometry? The geometry is generated in 2 steps: at the first step, we create primitives; at the second, we combine primitives into objects.
The syntax is presented in the string below.
string str="[GEOMETRY]
PRIMITIVE1=SPHERE(RADIUS=5.5);
PRIMITIVE2=BOX(A=-5.2, B=7.3);
//...
OBJECT1=PRIMITIVE2*(-PRIMITIVE1);
//..."
class PRIMITIVE{
int number;
public:
Primitive& operator+ (Primitive& primitive) {}; //overloading arithmetic operations
Primitive& operator* (Primitive& primitive) {};
Primitive& operator- (Primitive& primitive) {};
virtual bool check_in_point_inside_primitive = 0;
};
class SPHERE:public PRIMITIVE{
double m_radius;
public:
SPHERE(double radius): m_radius(radius) {}; //In which part of the parser to create objects?
bool check_in_point_inside_sphere(Point& point){};
};
class BOX:public PRIMITIVE{
double m_A;
double m_B;
public:
BOX(double A, double B): m_A(A), m_B(B) {};
bool check_in_point_inside_box(Point& point){};
};
class OBJECT{
int number;
PRIMITIVE& primitive;
public:
OBJECT(){};
bool check_in_point_inside_object(Primitive& PRIMITIVE1, Primitive& PRIMITIVE2, Point& point){
//>How to construct a function from an expression 'PRIMITIVE2*(-PRIMITIVE1)' when parsing?
}
};
How to analyze the string PRIMITIVE1=SPHERE(RADIUS=5.5) and pass a parameter to the constructor of SPHERE()? How to identify this object with the name PRIMITIVE 1 to call to it in OBJECT? Is it possible to create a pair<PRIMITIVE1,SPHERE(5.5)> and store all primitives in map?
How to parse the string of the OBJECT1 and to construct a function from an expression PRIMITIVE2*(-PRIMITIVE1) inside an OBJECT1? This expression will be required multiple times when determining the position of each point relative to the object.
How to use boost::spirit for this task? Tokenize a string using boost::spirit::lex, and then develop rules using boost::spirit::qi?
As a finger exercise, and despite the serious problems I see with the chosen virtual type hierarchy, let's try to make a value-oriented container of Primitives that can be indexed by their id (ById):
Live On Coliru
#include <boost/intrusive/set.hpp>
#include <boost/poly_collection/base_collection.hpp>
#include <iostream>
namespace bi = boost::intrusive;
struct Point {
};
using IndexHook = bi::set_member_hook<bi::link_mode<bi::auto_unlink>>;
class Primitive {
int _id;
public:
struct ById {
bool operator()(auto const&... oper) const { return std::less<>{}(access(oper)...); }
private:
static int access(int id) { return id; }
static int access(Primitive const& p) { return p._id; }
};
IndexHook _index;
Primitive(int id) : _id(id) {}
virtual ~Primitive() = default;
int id() const { return _id; }
Primitive& operator+= (Primitive const& primitive) { return *this; } //overloading arithmetic operations
Primitive& operator*= (Primitive const& primitive) { return *this; }
Primitive& operator-= (Primitive const& primitive) { return *this; }
virtual bool check_in_point_inside(Point const&) const = 0;
};
using Index =
bi::set<Primitive, bi::constant_time_size<false>,
bi::compare<Primitive::ById>,
bi::member_hook<Primitive, IndexHook, &Primitive::_index>>;
class Sphere : public Primitive {
double _radius;
public:
Sphere(int id, double radius)
: Primitive(id)
, _radius(radius) {} // In which part of the parser to create objects?
bool check_in_point_inside(Point const& point) const override { return false; }
};
class Box : public Primitive {
double _A;
double _B;
public:
Box(int id, double A, double B) : Primitive(id), _A(A), _B(B) {}
bool check_in_point_inside(Point const& point) const override { return false; }
};
class Object{
int _id;
Primitive& _primitive;
public:
Object(int id, Primitive& p) : _id(id), _primitive(p) {}
bool check_in_point_inside_object(Primitive const& p1, Primitive const& p2,
Point const& point) const
{
//>How to construct a function from an expression
//'PRIMITIVE2*(-PRIMITIVE1)' when parsing?
return false;
}
};
using Primitives = boost::poly_collection::base_collection<Primitive>;
int main() {
Primitives test;
test.insert(Sphere{2, 4.0});
test.insert(Sphere{4, 4.0});
test.insert(Box{2, 5, 6});
test.insert(Sphere{1, 4.0});
test.insert(Box{3, 5, 6});
Index idx;
for (auto& p : test)
if (not idx.insert(p).second)
std::cout << "Duplicate id " << p.id() << " not indexed\n";
for (auto& p : idx)
std::cout << typeid(p).name() << " " << p.id() << "\n";
std::cout << "---\n";
for (auto& p : test)
std::cout << typeid(p).name() << " " << p.id() << "\n";
}
Prints
Duplicate id 2 not indexed
6Sphere 1
3Box 2
3Box 3
6Sphere 4
---
3Box 2
3Box 3
6Sphere 2
6Sphere 4
6Sphere 1
So far so good. This is an important building block to prevent all manner of pain when dealing with virtual types in Spirit grammars¹
PS: I've since dropped the idea of intrusive_set. It doesn't work because the base_container moves items around on reallocation, and that unlinks the items from their intrusive set.
Instead, see below for an approach that doesn't try to resolve ids during the parse.
Parsing primitives
We get the ID from the PRIMITIVE1. We could store it somewhere before naturally parsing the primitives themselves, then set the id on it on commit.
Let's start with defining a State object for the parser:
struct State {
Ast::Id next_id;
Ast::Primitives primitives;
Ast::Objects objects;
template <typename... T> void commit(boost::variant<T...>& val) {
boost::apply_visitor([this](auto& obj) { commit(obj); }, val);
}
template <typename T> void commit(T& primitiveOrExpr) {
auto id = std::exchange(next_id, 0);
if constexpr (std::is_base_of_v<Ast::Primitive, T>) {
primitiveOrExpr.id = id;
primitives.insert(std::move(primitiveOrExpr));
} else {
objects.push_back(Ast::Object{id, std::move(primitiveOrExpr)});
}
}
};
As you can see, we just have a place to store the primitives, objects. And then there is the temporary storage for our next_id while we're still parsing the next entity.
The commit function helps sorting the products of the parser rules. As it happens, they can be variant, which is why we have the apply_visitor dispatch for commit on a variant.
Again, as the footnote¹ explains, Spirit's natural attribute synthesis favors static polymorphism.
The semantic actions we need are now:
static inline auto& state(auto& ctx) { return get<State>(ctx); }
auto draft = [](auto& ctx) { state(ctx).next_id = _attr(ctx); };
auto commit = [](auto& ctx) { state(ctx).commit(_attr(ctx)); };
Now let's jump ahead to the primitives:
auto sphere = as<Ast::Sphere>(eps >> "sphere" >>'(' >> param("radius") >> ')');
auto box = as<Ast::Box>(eps >> "box" >> '(' >> param('a') >> ',' >> param('b') >> ')');
auto primitive =
("primitive" >> uint_[draft] >> '=' >> (sphere | box)[commit]) > ';';
That's still cheating a little, as I've used the param helper to reduce typing:
auto number = as<Ast::Number>(double_, "number");
auto param(auto name, auto p) { return eps >> omit[name] >> '=' >> p; }
auto param(auto name) { return param(name, number); }
As you can see I've already assumed most parameters will have numerical nature.
What Are Objects Really?
Looking at it for a while, I concluded that really an Object is defined as an id number (OBJECT1, OBJECT2...) which is tied to an expression. The expression can reference primitives and have some unary and binary operators.
Let's sketch an AST for that:
using Number = double;
struct RefPrimitive { Id id; };
struct Binary;
struct Unary;
using Expr = boost::variant< //
Number, //
RefPrimitive, //
boost::recursive_wrapper<Unary>, //
boost::recursive_wrapper<Binary> //
>;
struct Unary { char op; Expr oper; };
struct Binary { Expr lhs; char op; Expr rhs; };
struct Object { Id id; Expr expr; };
Now To Parse Into That Expression AST
It's really 1:1 rules for each Ast node type. E.g.:
auto ref_prim = as<Ast::RefPrimitive>(lexeme["primitive" >> uint_]);
Now many of the expression rules can recurse, so we need declared rules with definitions via BOOST_SPIRIT_DEFINE:
// object expression grammar
rule<struct simple_tag, Ast::Expr> simple{"simple"};
rule<struct unary_tag, Ast::Unary> unary{"unary"};
rule<struct expr_tag, Ast::Expr> expr{"expr"};
rule<struct term_tag, Ast::Expr> term{"term"};
rule<struct factor_tag, Ast::Expr> factor{"factor"};
As you can tell, some of these are not 1:1 with the Ast nodes, mainly because of the recursion and the difference in operator precedence (term vs factor vs. simple). It's easier to see with the rule definition:
auto unary_def = char_("-+") >> simple;
auto simple_def = ref_prim | unary | '(' >> expr >> ")";
auto factor_def = simple;
auto term_def = factor[assign] >> *(char_("*/") >> term)[make_binary];
auto expr_def = term[assign] >> *(char_("-+") >> expr)[make_binary];
Because none of the rules actually expose a Binary, automatic attribute propagation is not convenient there². Instead, we use assign and make_binary semantic actions:
auto assign = [](auto& ctx) { _val(ctx) = _attr(ctx); };
auto make_binary = [](auto& ctx) {
using boost::fusion::at_c;
auto& attr = _attr(ctx);
auto op = at_c<0>(attr);
auto& rhs = at_c<1>(attr);
_val(ctx) = Ast::Binary { _val(ctx), op, rhs };
};
Finally, let's tie the defintions to the declared rules (using their tag types):
BOOST_SPIRIT_DEFINE(simple, unary, expr, term, factor)
All we need is a similar line to primitive:
auto object =
("object" >> uint_[draft] >> '=' >> (expr)[commit]) > ';';
And we can finish up by defining each line as a primitive|object:
auto line = primitive | object;
auto file = no_case[skip(ws_comment)[*eol >> "[geometry]" >> (-line % eol) >> eoi]];
At the top level we expect the [GEOMETRY] header, specify that we want to be case insensitive and ... that ws_comment is to be skipped³:
auto ws_comment = +(blank | lexeme["//" >> *(char_ - eol) >> eol]);
This allows us to ignore the // comments as well.
Live Demo Time
Live On Compiler Explorer
//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/poly_collection/base_collection.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <list>
#include <map>
namespace x3 = boost::spirit::x3;
namespace Ast {
using Id = uint32_t;
struct Point { }; // ?? where does this belong?
struct Primitive {
Id id;
virtual ~Primitive() = default;
};
struct Sphere : Primitive { double radius; };
struct Box : Primitive { double a, b; };
using Number = double;
struct RefPrimitive { Id id; };
struct Binary;
struct Unary;
using Expr = boost::variant< //
Number, //
RefPrimitive, //
boost::recursive_wrapper<Unary>, //
boost::recursive_wrapper<Binary> //
>;
struct Unary { char op; Expr oper; };
struct Binary { Expr lhs; char op; Expr rhs; };
struct Object { Id id; Expr expr; };
using Primitives = boost::poly_collection::base_collection<Primitive>;
using Objects = std::list<Object>;
using Index = std::map<Id, std::reference_wrapper<Primitive const>>;
std::ostream& operator<<(std::ostream& os, Primitive const& p) {
return os << boost::core::demangle(typeid(p).name()) << " "
<< "(id: " << p.id << ")";
}
std::ostream& operator<<(std::ostream& os, Object const& o) {
return os << "object(id:" << o.id << ", expr:" << o.expr << ")";
}
std::ostream& operator<<(std::ostream& os, RefPrimitive ref) {
return os << "reference(prim:" << ref.id << ")";
}
std::ostream& operator<<(std::ostream& os, Binary const& b) {
return os << '(' << b.lhs << b.op << b.rhs << ')';
}
std::ostream& operator<<(std::ostream& os, Unary const& u) {
return os << '(' << u.op << u.oper << ')';
}
} // namespace Ast
BOOST_FUSION_ADAPT_STRUCT(Ast::Primitive, id)
BOOST_FUSION_ADAPT_STRUCT(Ast::Sphere, radius)
BOOST_FUSION_ADAPT_STRUCT(Ast::Box, a, b)
BOOST_FUSION_ADAPT_STRUCT(Ast::Object, id)
BOOST_FUSION_ADAPT_STRUCT(Ast::RefPrimitive, id)
BOOST_FUSION_ADAPT_STRUCT(Ast::Unary, op, oper)
namespace Parser {
using namespace x3;
struct State {
Ast::Id next_id;
Ast::Primitives primitives;
Ast::Objects objects;
template <typename... T> void commit(boost::variant<T...>& val) {
boost::apply_visitor([this](auto& obj) { commit(obj); }, val);
}
template <typename T> void commit(T& val) {
auto id = std::exchange(next_id, 0);
if constexpr (std::is_base_of_v<Ast::Primitive, T>) {
val.id = id;
primitives.insert(std::move(val));
} else {
objects.push_back(Ast::Object{id, std::move(val)});
}
}
};
static inline auto& state(auto& ctx) { return get<State>(ctx); }
auto draft = [](auto& ctx) { state(ctx).next_id = _attr(ctx); };
auto commit = [](auto& ctx) { state(ctx).commit(_attr(ctx)); };
template <typename T>
auto as = [](auto p, char const* name = "as") {
return rule<struct _, T>{name} = p;
};
auto ws_comment = +(blank | lexeme["//" >> *(char_ - eol) >> (eol | eoi)]);
auto number = as<Ast::Number>(double_, "number");
auto param(auto name, auto p) { return eps >> omit[name] >> '=' >> p; }
auto param(auto name) { return param(name, number); }
auto sphere = as<Ast::Sphere>(eps >> "sphere" >>'(' >> param("radius") >> ')');
auto box = as<Ast::Box>(eps >> "box" >> '(' >> param('a') >> ',' >> param('b') >> ')');
auto primitive =
("primitive" >> uint_[draft] >> '=' >> (sphere | box)[commit]) > ';';
auto ref_prim = as<Ast::RefPrimitive>(lexeme["primitive" >> uint_], "ref_prim");
// object expression grammar
rule<struct simple_tag, Ast::Expr> simple{"simple"};
rule<struct unary_tag, Ast::Unary> unary{"unary"};
rule<struct expr_tag, Ast::Expr> expr{"expr"};
rule<struct term_tag, Ast::Expr> term{"term"};
rule<struct factor_tag, Ast::Expr> factor{"factor"};
auto assign = [](auto& ctx) { _val(ctx) = _attr(ctx); };
auto make_binary = [](auto& ctx) {
using boost::fusion::at_c;
auto& attr = _attr(ctx);
auto op = at_c<0>(attr);
auto& rhs = at_c<1>(attr);
_val(ctx) = Ast::Binary { _val(ctx), op, rhs };
};
auto unary_def = char_("-+") >> simple;
auto simple_def = ref_prim | unary | '(' >> expr >> ")";
auto factor_def = simple;
auto term_def = factor[assign] >> *(char_("*/") >> term)[make_binary];
auto expr_def = term[assign] >> *(char_("-+") >> expr)[make_binary];
BOOST_SPIRIT_DEFINE(simple, unary, expr, term, factor)
auto object =
("object" >> uint_[draft] >> '=' >> (expr)[commit]) > ';';
auto line = primitive | object;
auto file = no_case[skip(ws_comment)[*eol >> "[geometry]" >> (-line % eol) >> eoi]];
} // namespace Parser
int main() {
for (std::string const input :
{
R"(
[geometry]
primitive1=sphere(radius=5.5);
primitive2=box(a=-5.2, b=7.3);
//...
object1=primitive2*(-primitive1);
//...)",
R"(
[GEOMETRY]
PRIMITIVE1=SPHERE(RADIUS=5.5);
PRIMITIVE2=BOX(A=-5.2, B=7.3);
//...
OBJECT1=PRIMITIVE2*(-PRIMITIVE1);
//...)",
}) //
{
Parser::State state;
bool ok = parse(begin(input), end(input),
x3::with<Parser::State>(state)[Parser::file]);
std::cout << "Parse success? " << std::boolalpha << ok << "\n";
Ast::Index index;
for (auto& p : state.primitives)
if (auto[it,ok] = index.emplace(p.id, p); not ok) {
std::cout << "Duplicate id " << p
<< " (conflicts with existing " << it->second.get()
<< ")\n";
}
std::cout << "Primitives by ID:\n";
for (auto& [id, prim] : index)
std::cout << " - " << prim << "\n";
std::cout << "Objects in definition order:\n";
for (auto& obj: state.objects)
std::cout << " - " << obj << "\n";
}
}
Prints
Parse success? true
Primitives by ID:
- Ast::Sphere (id: 1)
- Ast::Box (id: 2)
Objects in definition order:
- object(id:1, expr:(reference(prim:2)*(-reference(prim:1))))
Parse success? true
Primitives by ID:
- Ast::Sphere (id: 1)
- Ast::Box (id: 2)
Objects in definition order:
- object(id:1, expr:(reference(prim:2)*(-reference(prim:1))))
¹ How can I use polymorphic attributes with boost::spirit::qi parsers?
² and insisting on that leads to classical in-efficiency with rules that cause a lot of backtracking
³ outside of lexemes
The boost::spirit::x3 error handling utilities allow for the user to choose what is shown to the user when an expectation failure occurs. This, however, does not seem to be the case for the line number portion of the message, which is exactly what I'd like to modify. So instead of it printing out In line 1: etc. I would like to print some other message in it's place with the same line number info. Anyone know how I could do that, or if it is even modifiable in the first place?
EDIT:
Here's the code straight from https://www.boost.org/doc/libs/1_68_0/libs/spirit/doc/x3/html/spirit_x3/tutorials/error_handling.html:
struct error_handler
{
template <typename Iterator, typename Exception, typename Context>
x3::error_handler_result on_error(
Iterator& first, Iterator const& last
, Exception const& x, Context const& context)
{
auto& error_handler = x3::get<x3::error_handler_tag>(context).get();
std::string message = "Error! Expecting: " + x.which() + " here:";
error_handler(x.where(), message);
return x3::error_handler_result::fail;
}
};
In addition to the on_error function printing out the message, it prints "In line x: ", where x is the line number. I really can't have that, it does not fit in with my project in the slightest.
Wow. First of all, I did not know all details about that example and x3::error_handler<>.
For a good break-down of how to provide error handling/diagnostic messages in X3 from basic principles, see this walk-through: Spirit X3, Is this error handling approach useful?
Traditionally (as in Qi) we would do the position tracking using an iterator adaptor:
Get current line in boost spirit grammar or Cross-platform way to get line number of an INI file where given option was found
or even the classic version of this How to pass the iterator to a function in spirit qi
At first glance it looks like the position_cache can be used separately (see eg. Boost Spirit x3 not compiling).
However, it turns out that - sadly - x3::annotate_on_success conflated the annotation task with error-handling, by assuming that position cache will always live inside the error handler. This at once means:
the error handler is more complicated than strictly required
this compounds with the fact that x3::error_handler<> is not well-suited for inheritance (due to private members and tricky to unambiguously overload operator() while keeping some overloads)
x3::annotate_on_success is simply not available to you unless you at least have a no-op error-handler like (Live On Coliru)
template <typename It> struct dummy_handler_for_annotate_on_success {
x3::position_cache<std::vector<It> > pos_cache;
dummy_handler_for_annotate_on_success(It f, It l) : pos_cache(f,l) {}
template <typename T> void tag(T& ast, It first, It last) {
return pos_cache.annotate(ast, first, last);
}
};
and have that present in the context under the x3::error_handler_tag for annotate_on_success to work.
On the positive, this does have the benefit of not requiring two separate context injections, like:
auto const parser
= x3::with<x3::position_cache_tag>(std::ref(pos_cache)) [
x3::with<x3::error_handler_tag>(error_handler)
[ parser::employees ]
]
;
So, here's my take on providing a custom error-handler implementation. I simplified it a bit from the built-in version¹.
One simplification is also an optimization, resting on the assumption that the iterator type is bidirectional. If not, I think you'd be better off using spirit::line_pos_iterator<> as linked above.
template <typename It> class diagnostics_handler {
x3::position_cache<std::vector<It> > _pos_cache;
std::ostream& _os;
public:
diagnostics_handler(It f, It l, std::ostream& os) : _pos_cache(f, l), _os(os) {}
void operator()(x3::position_tagged const& ast, std::string const& error_message) const {
auto where = _pos_cache.position_of(ast);
operator()(where.begin(), where.end(), error_message);
}
void operator()(It err_first, std::string const& error_message) const {
operator()(err_first, boost::none, error_message);
}
void operator()(It err_first, boost::optional<It> err_last, std::string const& error_message) const {
auto first = _pos_cache.first(),
last = _pos_cache.last();
while (err_first != last && std::isspace(*err_first))
++err_first;
_os << "L:"<< line_number(err_first) << " "
<< error_message << std::endl;
It cursor = get_line_start(first, err_first);
print_line(cursor, last);
auto score = [&](It& it, char fill) -> auto& {
auto f = _os.fill();
auto n = std::distance(cursor, it);
cursor = it;
return _os << std::setfill(fill) << std::setw(n) << "" << std::setfill(f);
};
if (err_last.has_value()) {
score(err_first, ' ');
score(*err_last, '~') << " <<-- Here" << std::endl;
} else {
score(err_first, '_') << "^_" << std::endl;
}
}
template <typename AST> void tag(AST& ast, It first, It last) {
return _pos_cache.annotate(ast, first, last);
}
auto const& get_position_cache() const { return _pos_cache; }
private:
static constexpr std::array crlf { '\r', '\n' };
auto get_line_start(It first, It pos) const {
return std::find_first_of( // assumed bidir iterators
std::make_reverse_iterator(pos), std::make_reverse_iterator(first),
crlf.begin(), crlf.end()
).base();
}
auto line_number(It i) const {
return 1 + std::count(_pos_cache.first(), i, '\n');
}
void print_line(It f, It l) const {
std::basic_string s(f, std::find_first_of(f, l, crlf.begin(), crlf.end()));
_os << boost::locale::conv::utf_to_utf<char>(s) << std::endl;
}
};
Which you can then demo like Live On Coliru
custom::diagnostics_handler<It> diags(iter, end, std::clog);
auto const parser
= x3::with<x3::error_handler_tag>(std::ref(diags))
[ parser::employees ]
;
std::vector<ast::employee> ast;
if (phrase_parse(iter, end, parser >> x3::eoi, x3::space, ast)) {
std::cout << "Parsing succeeded\n";
for (auto const& emp : ast) {
std::cout << "got: " << emp << std::endl;
diags(emp.who.last_name, "note: that's a nice last name");
diags(emp.who, "warning: the whole person could be nice?");
}
} ...
Which prints:
With custom diagnostics only:
Parsing succeeded
got: (23 (Amanda Stefanski) 1000.99)
L:1 note: that's a nice last name
{ 23, "Amanda", "Stefanski", 1000.99 },
~~~~~~~~~~~ <<-- Here
L:1 warning: the whole person could be nice?
{ 23, "Amanda", "Stefanski", 1000.99 },
~~~~~~~~~~~~~~~~~~~~~ <<-- Here
got: (35 (Angie Chilcote) 2000.99)
L:2 note: that's a nice last name
{ 35, "Angie", "Chilcote", 2000.99 }
~~~~~~~~~~ <<-- Here
L:2 warning: the whole person could be nice?
{ 35, "Angie", "Chilcote", 2000.99 }
~~~~~~~~~~~~~~~~~~~ <<-- Here
----- Now with parse error:
L:3 error: expecting: person
'Amanda', "Stefanski", 1000.99 },
_^_
Parsing failed
Simplifying Down
By breaking the false coupling between annotate_on_success and x3::error_handler_tag context, you could slim it down, a lot:
template <typename It> struct diagnostics_handler {
It _first, _last;
std::ostream& _os;
void operator()(It err_first, std::string const& error_message) const {
size_t line_no = 1;
auto bol = _first;
for (auto it = bol; it != err_first; ++it)
if (*it == '\n') {
bol = it+1;
line_no += 1;
}
_os << "L:" << line_no
<< ":" << std::distance(bol, err_first)
<< " " << error_message << "\n";
}
};
See it Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/position_tagged.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <iostream>
#include <iomanip>
#include <string>
namespace x3 = boost::spirit::x3;
namespace ast {
struct name : std::string, x3::position_tagged {
using std::string::string;
using std::string::operator=;
};
struct person : x3::position_tagged { ast::name first_name, last_name; };
struct employee : x3::position_tagged { int age; person who; double salary; };
using boost::fusion::operator<<;
}
BOOST_FUSION_ADAPT_STRUCT(ast::person, first_name, last_name)
BOOST_FUSION_ADAPT_STRUCT(ast::employee, age, who, salary)
namespace custom {
struct diagnostics_handler_tag;
template <typename It> struct diagnostics_handler {
It _first, _last;
std::ostream& _os;
void operator()(It err_first, std::string const& error_message) const {
size_t line_no = 1;
auto bol = _first;
for (auto it = bol; it != err_first; ++it)
if (*it == '\n') {
bol = it+1;
line_no += 1;
}
_os << "L:"<< line_no
<< ":" << std::distance(bol, err_first)
<< " " << error_message << "\n";
}
};
} // namespace custom
namespace parser {
namespace x3 = boost::spirit::x3;
namespace ascii = boost::spirit::x3::ascii;
struct error_handler {
template <typename It, typename E, typename Ctx>
x3::error_handler_result on_error(It&, It const&, E const& x, Ctx const& ctx) {
auto& handler = x3::get<custom::diagnostics_handler_tag>(ctx);
handler(x.where(), "error: expecting: " + x.which());
return x3::error_handler_result::fail;
}
};
struct annotate_position {
template <typename T, typename Iterator, typename Context>
inline void on_success(const Iterator &first, const Iterator &last, T &ast, const Context &context)
{
auto &position_cache = x3::get<annotate_position>(context).get();
position_cache.annotate(ast, first, last);
}
};
struct quoted_string_class : annotate_position {};
struct person_class : annotate_position {};
struct employee_class : error_handler, annotate_position {};
x3::rule<quoted_string_class, ast::name> const name = "name";
x3::rule<person_class, ast::person> const person = "person";
x3::rule<employee_class, ast::employee> const employee = "employee";
auto const name_def
= x3::lexeme['"' >> +(x3::char_ - '"') >> '"']
;
auto const person_def
= name > ',' > name
;
auto const employee_def
= '{' > x3::int_ > ',' > person > ',' > x3::double_ > '}'
;
BOOST_SPIRIT_DEFINE(name, person, employee)
auto const employees = employee >> *(',' >> employee);
}
void parse(std::string const& input) {
using It = std::string::const_iterator;
It iter = input.begin(), end = input.end();
x3::position_cache<std::vector<It> > pos_cache(iter, end);
custom::diagnostics_handler<It> diags { iter, end, std::clog };
auto const parser =
x3::with<parser::annotate_position>(std::ref(pos_cache)) [
x3::with<custom::diagnostics_handler_tag>(diags) [
parser::employees
]
];
std::vector<ast::employee> ast;
if (phrase_parse(iter, end, parser >> x3::eoi, x3::space, ast)) {
std::cout << "Parsing succeeded\n";
for (auto const& emp : ast) {
std::cout << "got: " << emp << std::endl;
diags(pos_cache.position_of(emp.who.last_name).begin(), "note: that's a nice last name");
diags(pos_cache.position_of(emp.who).begin(), "warning: the whole person could be nice?");
}
} else {
std::cout << "Parsing failed\n";
ast.clear();
}
}
static std::string const
good_input = R"({ 23, "Amanda", "Stefanski", 1000.99 },
{ 35, "Angie", "Chilcote", 2000.99 }
)",
bad_input = R"(
{ 23,
'Amanda', "Stefanski", 1000.99 },
)";
int main() {
std::cout << "With custom diagnostics only:" << std::endl;
parse(good_input);
std::cout << "\n\n ----- Now with parse error:" << std::endl;
parse(bad_input);
}
Prints:
With custom diagnostics only:
Parsing succeeded
got: (23 (Amanda Stefanski) 1000.99)
L:1:16 note: that's a nice last name
L:1:6 warning: the whole person could be nice?
got: (35 (Angie Chilcote) 2000.99)
L:2:23 note: that's a nice last name
L:2:14 warning: the whole person could be nice?
----- Now with parse error:
L:2:13 error: expecting: person
Parsing failed
¹ also fixed a bug that causes diagnostics to display wrongly on the first line(?) with x3::error_handler<> implementation
In the grammar below, when I add the alternative (| property) to the start rule, I get this error
'boost::spirit::x3::traits::detail::move_to': none of the 3 overloads
could convert all the argument types
e:\data\boost\boost_1_65_1\boost\spirit\home\x3\support\traits\move_to.hpp 180
I suspect that the problem is that the property attribute is a struct, and property_list is a vector (shouldn't x3 create a vector of one entry?). What is the recommended way to design the AST structures to support alternatives? A Boost variant?
#include <string>
#include <vector>
#include <iostream>
#include <iomanip>
#include <map>
#pragma warning(push)
#pragma warning(disable : 4348)
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/variant.hpp>
#include <boost/fusion/adapted/struct.hpp>
#pragma warning(pop)
namespace x3 = boost::spirit::x3;
namespace scl_ast
{
struct KEYWORD : std::string
{
using std::string::string;
using std::string::operator=;
};
struct NIL
{
};
using VALUE = boost::variant <NIL, std::string, int, double, KEYWORD>;
struct PROPERTY
{
KEYWORD name;
VALUE value;
};
static inline std::ostream& operator<< (std::ostream& os, VALUE const& v)
{
struct
{
std::ostream& _os;
void operator () (std::string const& s) const { _os << std::quoted (s); }
void operator () (int i) const { _os << i; }
void operator () (double d) const { _os << d; }
void operator () (KEYWORD const& k) const { _os << k; }
void operator () (NIL) const { }
} vis { os };
boost::apply_visitor (vis, v);
return os;
}
static inline std::ostream& operator<< (std::ostream& os, PROPERTY const& prop)
{
os << prop.name;
if (prop.value.which ())
{
os << "=" << prop.value;
}
return os;
}
static inline std::ostream& operator<< (std::ostream& os, std::vector <PROPERTY> const& props)
{
for (auto const& prop : props)
{
os << prop << " ";
}
return os;
}
}; // End namespace scl_ast
BOOST_FUSION_ADAPT_STRUCT (scl_ast::PROPERTY, name, value)
//
// Keyword-value grammar for simple command language
//
namespace scl
{
using namespace x3;
auto const keyword = rule <struct _keyword, std::string> { "keyword" }
= lexeme [+char_ ("a-zA-Z0-9$_")];
auto const quoted_string
= lexeme ['"' >> *('\\' > char_ | ~char_ ('"')) >> '"'];
auto const value
= quoted_string
| x3::real_parser<double, x3::strict_real_policies<double>>{}
| x3::int_
| keyword;
auto const property = rule <struct _property, scl_ast::PROPERTY> { "property" }
= keyword >> -(("=" >> value));
auto const property_list = rule <struct _property_list, std::vector <scl_ast::PROPERTY>> { "property_list" }
= lit ('(') >> property % ',' >> lit (')');
auto const start = skip (blank) [property_list | property];
}; // End namespace scl
int
main ()
{
std::vector <std::string> input =
{
"(abc=1.,def=.5,ghi=2.0)",
"(ghi = 1, jkl = 3)",
"(abc,def=1,ghi=2.4,jkl=\"mno 123\", pqr = stu)",
"(abc = test, def, ghi=2)",
"abc=1",
"def = 2.7",
"ghi"
};
for (auto const& str : input)
{
std::vector <scl_ast::PROPERTY> result;
auto b = str.begin (), e = str.end ();
bool ok = x3::parse (b, e, scl::start, result);
std::cout << (ok ? "OK" : "FAIL") << '\t' << std::quoted (str) << std::endl;
if (ok)
{
std::cout << " -- Parsed: " << result << std::endl;
if (b != e)
{
std::cout << " -- Unparsed: " << std::quoted (std::string (b, e)) << std::endl;
}
}
std::cout << std::endl;
} // End for
return 0;
} // End main
I suspect that the problem is that the property attribute is a struct, and property_list is a vector (shouldn't x3 create a vector of one entry?)
Yes, yes, and yes, depending.
I see roughly 3 approaches to tackle this. Let's start with the simples:
Force Container-like synthesis: Live On Coliru
= skip(blank)[property_list | repeat(1)[property] ];
Coercing the attribute type. Turns out I was wrong here: It used to work for Qi, but apparently that has been dropped. Here it is, not-working and all:
auto coerce = [](auto p) { return rule<struct _, std::vector<scl_ast::PROPERTY> > {} = p; };
auto const start
= skip(blank)[property_list | coerce(property)];
Third is actually moot because the same problem. So I guess I owe you a contrived workaround, using semantic actions: Live On Coliru
auto push_back = [](auto& ctx) {
_val(ctx).push_back(_attr(ctx));
};
auto const start
= rule<struct _start, std::vector<scl_ast::PROPERTY>, true>{ "start" }
= skip(blank)[property_list | omit[property[push_back]]];
I got a working parser for reading position descriptions for a board game (international draughts, official grammar):
#include <boost/spirit/home/x3.hpp>
#include <iostream>
namespace x3 = boost::spirit::x3;
auto const colon = x3::lit(':');
auto const comma = x3::lit(',');
auto const dash = x3::lit('-');
auto const dot = x3::lit('.');
auto const king = x3::char_('K');
auto const color = x3::char_("BW");
auto const num_sq = x3::int_;
auto const num_pc = -king >> num_sq; // Kxx means king on square xx, xx a man on that square
auto const num_rng = num_pc >> dash >> num_sq; // xx-yy means range of squares xx through yy (inclusive)
auto const num_seq = (num_rng | num_pc) % comma; // <--- attribute should be std::vector<boost::variant>
auto const ccn = colon >> color >> -num_seq;
auto const num_not = x3::repeat(2)[ccn]; // need to specify both white and black pieces
auto const fen = color >> num_not >> -dot;
Live On Coliru
Now I want to extract the values from the synthesized attributes, so I did the boilerplate dance around Boost.Fusion etc.,
namespace ast {
struct num_pc { boost::optional<char> k; int sq; };
struct num_rng { boost::optional<char> k; int first, last; };
using rng_or_pc = boost::variant<num_rng, num_pc>;
struct num_seq { std::vector<rng_or_pc> sqrs; };
struct ccn { char c; boost::optional<num_seq> seq; };
struct num_not { std::vector<ccn> n; };
struct fen { char c; num_not n; };
} // namespace ast
BOOST_FUSION_ADAPT_STRUCT(ast::num_pc, (boost::optional<char>, k), (int, sq))
BOOST_FUSION_ADAPT_STRUCT(ast::num_rng, (boost::optional<char>, k), (int, first), (int, last))
BOOST_FUSION_ADAPT_STRUCT(ast::num_seq, (std::vector<ast::rng_or_pc>, sqrs))
BOOST_FUSION_ADAPT_STRUCT(ast::ccn, (char, c), (boost::optional<ast::num_seq>, seq))
BOOST_FUSION_ADAPT_STRUCT(ast::num_not, (std::vector<ast::ccn>, n))
BOOST_FUSION_ADAPT_STRUCT(ast::fen, (char, c), (ast::num_not, n))
x3::rule<class num_pc_class, ast::num_pc > num_pc = "num_pc";
x3::rule<class num_rng_class, ast::num_rng> num_rng = "num_rng";
x3::rule<class num_seq_class, ast::num_seq> num_seq = "num_seq";
x3::rule<class ccn_class, ast::ccn > ccn = "ccn";
x3::rule<class num_not_class, ast::num_not> num_not = "num_not";
x3::rule<class fen_class, ast::fen > fen = "fen";
auto const colon = x3::lit(':');
auto const comma = x3::lit(',');
auto const dash = x3::lit('-');
auto const dot = x3::lit('.');
auto const king = x3::char_('K');
auto const color = x3::char_("BW");
auto const num_sq = x3::int_;
auto const num_pc_def = -king >> num_sq;
auto const num_rng_def = num_pc >> dash >> num_sq;
auto const num_seq_def = (num_rng | num_pc) % comma;
auto const ccn_def = colon >> color >> -num_seq;
auto const num_not_def = x3::repeat(2)[ccn];
auto const fen_def = color >> num_not >> -dot;
BOOST_SPIRIT_DEFINE(num_pc, num_rng, num_seq, ccn, num_not, fen)
Live On Coliru
However, I then get an error saying that
error: static_assert failed "Attribute does not have the expected
size."
and a couple of pages down:
^ main.cpp:16:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from
'std::vector<boost::variant<ast::num_rng, ast::num_pc>,
std::allocator<boost::variant<ast::num_rng, ast::num_pc> > >' to
'ast::num_seq' for 1st argument struct num_seq {
std::vector<rng_or_pc> sqrs; };
^ main.cpp:16:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from
'std::vector<boost::variant<ast::num_rng, ast::num_pc>,
std::allocator<boost::variant<ast::num_rng, ast::num_pc> > >' to
'const ast::num_seq' for 1st argument struct num_seq {
std::vector<rng_or_pc> sqrs; };
Question: where is this error coming from, and how to resolve it? Apparently the synthesized attribute of my num_seq rule is not equal to std::vector<boost::variant>>. How can I correct this?
I've spent some time trying to understand the grammar.
I strongly suggest readable identifiers. It's very hard to understand what's going on, while I have the strong impression it's actually a really simple grammar
I suggest a simplification version shown below.
Because your grammar doesn't use recursion there's no real need for the rule and tagged parser types.
Also use a namespace for the parser artefacts.
Consider encapsulation the use of a skipper instead of letting the caller decide (x3::skip[])
Add a few helpers to be able to print the AST for verification:
template <typename T> std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
os << "{"; for (auto& el : v) os << el << " "; return os << "}";
}
std::ostream& operator<<(std::ostream& os, num_pc const& p) { if (p.k) os << p.k; return os << p.sq; }
std::ostream& operator<<(std::ostream& os, num_rng const& r) { return os << r.pc << "-" << r.last; }
std::ostream& operator<<(std::ostream& os, ccn const& o) { return os << o.c << " " << o.seq; }
std::ostream& operator<<(std::ostream& os, num_not const& nn) { return os << nn.n; }
I'd avoid wrapping the other vector unnecessarily too:
using num_not = std::vector<ccn>;
Use the modern ADAPT macros (as you're using C++14 by definition):
BOOST_FUSION_ADAPT_STRUCT(ast::num_pc, k, sq)
BOOST_FUSION_ADAPT_STRUCT(ast::num_rng, pc, last)
BOOST_FUSION_ADAPT_STRUCT(ast::ccn, c, seq)
BOOST_FUSION_ADAPT_STRUCT(ast::fen, c, n)
-
Live Demo
Live On Coliru
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/as_vector.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/optional/optional_io.hpp>
#include <boost/optional.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/variant.hpp>
#include <iostream>
#include <vector>
namespace ast {
struct num_pc {
boost::optional<char> k;
int sq;
};
struct num_rng {
num_pc pc;
int last;
};
using rng_or_pc = boost::variant<num_rng, num_pc>;
using num_seq = std::vector<rng_or_pc>;
struct ccn {
char c;
boost::optional<num_seq> seq;
};
using num_not = std::vector<ccn>;
struct fen {
char c;
num_not n;
};
template <typename T> std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
os << "{"; for (auto& el : v) os << el << " "; return os << "}";
}
std::ostream& operator<<(std::ostream& os, num_pc const& p) { if (p.k) os << p.k; return os << p.sq; }
std::ostream& operator<<(std::ostream& os, num_rng const& r) { return os << r.pc << "-" << r.last; }
std::ostream& operator<<(std::ostream& os, ccn const& o) { return os << o.c << " " << o.seq; }
}
BOOST_FUSION_ADAPT_STRUCT(ast::num_pc, k, sq)
BOOST_FUSION_ADAPT_STRUCT(ast::num_rng, pc, last)
BOOST_FUSION_ADAPT_STRUCT(ast::ccn, c, seq)
BOOST_FUSION_ADAPT_STRUCT(ast::fen, c, n)
namespace FEN {
namespace x3 = boost::spirit::x3;
namespace grammar
{
using namespace x3;
template<typename T>
auto as = [](auto p) { return rule<struct _, T>{} = as_parser(p); };
uint_type const number {};
auto const color = char_("BW");
auto const num_pc = as<ast::num_pc> ( -char_('K') >> number );
auto const num_rng = as<ast::num_rng> ( num_pc >> '-' >> number );
auto const num_seq = as<ast::num_seq> ( (num_rng | num_pc) % ',' );
auto const ccn = as<ast::ccn> ( ':' >> color >> -num_seq );
auto const num_not = as<ast::num_not> ( repeat(2)[ccn] );
auto const fen = as<ast::fen> ( color >> num_not >> -lit('.') );
}
using grammar::fen;
}
int main() {
for (std::string const t : {
"B:W18,24,27,28,K10,K15:B12,16,20,K22,K25,K29",
"B:W18,19,21,23,24,26,29,30,31,32:B1,2,3,4,6,7,9,10,11,12",
"W:B1-20:W31-50", // initial position
"W:B:W", // empty board
"W:B1:W", // only black pieces
"W:B:W50" // only white pieces
}) {
auto b = t.begin(), e = t.end();
ast::fen data;
bool ok = phrase_parse(b, e, FEN::fen, FEN::x3::space, data);
std::cout << t << "\n";
if (ok) {
std::cout << "Parsed: " << boost::fusion::as_vector(data) << "\n";
} else {
std::cout << "Parse failed:\n";
std::cout << "\t on input: " << t << "\n";
}
if (b != e)
std::cout << "\t Remaining unparsed: '" << std::string(b, e) << '\n';
}
}
Prints:
B:W18,24,27,28,K10,K15:B12,16,20,K22,K25,K29
Parsed: (B {W {18 24 27 28 K10 K15 } B {12 16 20 K22 K25 K29 } })
B:W18,19,21,23,24,26,29,30,31,32:B1,2,3,4,6,7,9,10,11,12
Parsed: (B {W {18 19 21 23 24 26 29 30 31 32 } B {1 2 3 4 6 7 9 10 11 12 } })
W:B1-20:W31-50
Parsed: (W {B {1-20 } W {31-50 } })
W:B:W
Parsed: (W {B -- W -- })
W:B1:W
Parsed: (W {B {1 } W -- })
W:B:W50
Parsed: (W {B -- W {50 } })