Boost Spirit X3 cannot compile repeat directive with variable factor

I am trying to use the Boost Spirit X3 repeat directive with a variable repetition factor. The basic idea is a header + payload, where the header specifies the size of the payload. For example, "3 1 2 3" is interpreted as header = 3, data = {1, 2, 3} (3 integers).
I could only find examples in the Spirit Qi documentation, which wraps the variable factor in a Boost Phoenix reference: http://www.boost.org/doc/libs/1_50_0/libs/spirit/doc/html/spirit/qi/reference/directive/repeat.html
std::string str;
int n;
test_parser_attr("\x0bHello World",
char_[phx::ref(n) = _1] >> repeat(phx::ref(n))[char_], str);
std::cout << n << ',' << str << std::endl; // will print "11,Hello World"
I wrote the following simple example for spirit x3 without luck:
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <string>
#include <iostream>

namespace x3 = boost::spirit::x3;
using x3::uint_;
using x3::int_;
using x3::phrase_parse;
using x3::repeat;
using x3::space;
using std::string;
using std::cout;
using std::endl;

int main( int argc, char **argv )
{
    string data("3 1 2 3");
    string::iterator begin = data.begin();
    string::iterator end = data.end();
    unsigned int n = 0;
    auto f = [&n]( auto &ctx ) { n = x3::_attr(ctx); };
    bool r = phrase_parse( begin, end, uint_[f] >> repeat(boost::phoenix::ref(n))[int_], space );
    if ( r && begin == end )
        cout << "Parse success!" << endl;
    else
        cout << "Parse failed, remaining: " << string(begin,end) << endl;
    return 0;
}
Compiling the code above with boost 1.59.0 and clang++ (flags: -std=c++14) gives the following:
boost_1_59_0/boost/spirit/home/x3/directive/repeat.hpp:72:47: error: no matching constructor for
initialization of 'proto_child0' (aka 'boost::reference_wrapper<unsigned int>')
typename RepeatCountLimit::type i{};
If I hardcode repeat(3) instead of repeat(boost::phoenix::ref(n)) it works properly, but that is not an option because the repetition factor must be variable.
Compiling with repeat(n) succeeds, but parsing then fails with the following output:
"Parse failed, remaining: 1 2 3"
Looking at boost/spirit/home/x3/directive/repeat.hpp:72, the code default-constructs a variable i of type RepeatCountLimit::type and only assigns to it later, in the for loop that iterates from min to max. Since that type is a reference wrapper here, it has to be initialized at construction, so compilation fails. The equivalent code in the previous library version, boost/spirit/home/qi/directive/repeat.hpp:162, initializes it directly:
typename LoopIter::type i = iter.start();
I am not sure what I am doing wrong here, or if x3 currently does not support variable repetition factors. I would appreciate some help solving this issue. Thank you.
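The compile error described above can be reproduced without Spirit. The following is only a sketch: it uses std::reference_wrapper as a stand-in for the boost::reference_wrapper named in the error message, and the helper names are made up for illustration.

```cpp
#include <functional>
#include <type_traits>

// Hypothetical helper: asks whether a counter type supports the
// "create empty, assign later" pattern that the error points at.
template <typename Counter>
constexpr bool can_default_construct() {
    return std::is_default_constructible<Counter>::value;
}

// A plain unsigned counter can be default-constructed and assigned in a loop.
static_assert(can_default_construct<unsigned>(), "plain counters work");

// A reference wrapper has no default constructor, so "T i{};" cannot compile.
static_assert(!can_default_construct<std::reference_wrapper<unsigned>>(),
              "reference wrappers must be bound at construction");

// Initializing the wrapper directly from a value, as the Qi line does, is fine.
inline unsigned read_through_wrapper(unsigned& n) {
    auto i = std::ref(n);   // bound to n at construction
    return i.get();
}
```

Both static_asserts hold, which matches the observation that the Qi style (initialize directly) compiles while the X3 style (default-construct, then assign) does not for this type.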

From what I gather from reading the source and the mailing list, Phoenix is not integrated into X3 at all, the reason being that C++14 makes most of it obsolete.
I agree that this leaves a few spots where Qi used to have elegant solutions, e.g. eps(DEFERRED_CONDITION), lazy(*RULE_PTR) (the Nabialek trick), and indeed, this case.
Spirit X3 is still in development, so we might see this added¹.
For now, Spirit X3 has one generalized facility for stateful context. This essentially replaces locals<>, in some cases inherited arguments, and can be /made to/ validate the number of elements in this particular case as well:
x3::with²
Here's how you could use it:
with<_n>(std::ref(n))
    [ omit[uint_[number] ] >>
      *(eps [more] >> int_) >> eps [done] ]
Here, _n is a tag type that identifies the context element for retrieval with get<_n>(ctx).
Note: currently we have to use a reference wrapper to an lvalue n, because with<_n>(0u) would result in a constant element inside the context. I suppose this, too, is a QoI (quality of implementation) issue that may be lifted as X3 matures.
Now, for the semantic actions:
unsigned n;
struct _n{};
auto number = [](auto &ctx) { get<_n>(ctx).get() = _attr(ctx); };
This stores the parsed unsigned number into the context. (In fact, due to the ref(n) binding it's not actually part of the context for now, as mentioned)
auto more = [](auto &ctx) { _pass(ctx) = get<_n>(ctx) > _val(ctx).size(); };
Here we check that we're not yet "full", i.e. more integers are allowed.
auto done = [](auto &ctx) { _pass(ctx) = get<_n>(ctx) == _val(ctx).size(); };
Here we check that we're "full" - i.e. no more integers are allowed.
Putting it all together:
Live On Coliru
#include <algorithm>
#include <functional>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <boost/spirit/home/x3.hpp>

int main() {
    for (std::string const input : {
            "3 1 2 3", // correct
            "4 1 2 3", // too few
            "2 1 2 3", // too many
            //
            " 3 1 2 3 ",
        })
    {
        std::cout << "\nParsing " << std::left << std::setw(20) << ("'" + input + "':");

        std::vector<int> v;

        bool ok;
        {
            using namespace boost::spirit::x3;

            unsigned n;
            struct _n{};

            // store the parsed count, then accept ints only while fewer than n were read
            auto number = [](auto &ctx) { get<_n>(ctx).get() = _attr(ctx); };
            auto more   = [](auto &ctx) { _pass(ctx) = get<_n>(ctx) >  _val(ctx).size(); };
            auto done   = [](auto &ctx) { _pass(ctx) = get<_n>(ctx) == _val(ctx).size(); };

            auto r = rule<struct _r, std::vector<int> > {}
                  %= with<_n>(std::ref(n))
                        [ omit[uint_[number] ] >> *(eps [more] >> int_) >> eps [done] ];

            ok = phrase_parse(input.begin(), input.end(), r >> eoi, space, v);
        }

        if (ok) {
            std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout << v.size() << " elements: ", " "));
        } else {
            std::cout << "Parse failed";
        }
    }
}
Which prints:
Parsing '3 1 2 3': 3 elements: 1 2 3
Parsing '4 1 2 3': Parse failed
Parsing '2 1 2 3': Parse failed
Parsing ' 3 1 2 3 ': 3 elements: 1 2 3
¹ lend your support/voice at the [spirit-general] mailing list :)
² can't find a suitable documentation link, but it's used in some of the samples

Related

Boost Spirit X3: skip parser that would do nothing

I'm familiarizing myself with Boost Spirit X3. My question is how to express that you don't want to use a skip parser at all.
Consider a simple example of parsing a comma-separated sequence of integers:
#include <iostream>
#include <string>
#include <vector>
#include <boost/spirit/home/x3.hpp>

int main()
{
    using namespace boost::spirit::x3;
    const std::string input{"2,4,5"};
    const auto parser = int_ % ',';
    std::vector<int> numbers;
    auto start = input.cbegin();
    auto r = phrase_parse(start, input.end(), parser, space, numbers);
    if(r && start == input.cend())
    {
        // success
        for(const auto &item: numbers)
            std::cout << item << std::endl;
        return 0;
    }
    std::cerr << "Input was not parsed successfully" << std::endl;
    return 1;
}
This works totally fine. However, I would like to forbid spaces in between (i.e. "2, 4,5" should be rejected).
I tried using eps as the skip parser in phrase_parse but, as you can guess, the program ended up in an infinite loop because eps matches the empty string.
The solution I found is the no_skip directive (https://www.boost.org/doc/libs/1_75_0/libs/spirit/doc/html/spirit/qi/reference/directive/no_skip.html). The parser now becomes:
const auto parser = no_skip[int_ % ','];
This works fine, but I don't find it an elegant solution (especially passing the "space" parser to phrase_parse when I want no whitespace skipping). Are there no skip parsers that would simply do nothing? Am I missing something?
Thanks for your time. Looking forward to any replies.

You can use either no_skip[] or lexeme[]. They're almost identical, except that lexeme[] performs a pre-skip before its subject and no_skip[] does not (see "Boost Spirit lexeme vs no_skip").
"Are there no skip parsers that would simply do nothing? Am I missing something?"
A wild guess, but you might be missing the parse API that doesn't accept a skipper in the first place.
Live On Coliru
#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;

int main() {
    std::string const input{ "2,4,5" };
    auto f = begin(input), l = end(input);
    const auto parser = x3::int_ % ',';
    std::vector<int> numbers;
    auto r = parse(f, l, parser, numbers); // no skipper involved at all
    if (r) {
        // success
        for (const auto& item : numbers)
            std::cout << item << std::endl;
    } else {
        std::cerr << "Input was not parsed successfully" << std::endl;
        return 1;
    }
    if (f!=l) {
        std::cout << "Remaining input " << std::quoted(std::string(f,l)) << "\n";
        return 2;
    }
}
Prints
2
4
5

How can I keep certain semantic actions out of the AST in boost::spirit::qi

I have a huge number of files that I am trying to parse using boost::spirit::qi. Parsing is not a problem, but some of the files contain noise that I want to skip. A simple parser built without boost::spirit::qi confirms that I can avoid the noise by skipping anything that doesn't match a rule at the beginning of a line. So I'm looking for a way to write a line-based parser that skips lines that match no rule.
The example below allows the grammar to skip lines if they don't match at all, but the 'junk' rule still inserts an empty instance of V(), which is unwanted behaviour.
The use of \r instead of \n in the example is intentional as I have encountered both \n, \r and \r\n in the files.
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phx = boost::phoenix;
using V = std::tuple<std::string, double, double, double>;
namespace client {
template <typename Iterator>
struct VGrammar : qi::grammar<Iterator, std::vector<V>(), ascii::space_type> {
VGrammar() : VGrammar::base_type(start) {
using namespace qi;
v %= string("v") > double_ > double_ > double_;
junk = +(char_ - eol);
start %= +(v | junk);
v.name("v");
junk.name("junk");
start.name("start");
using phx::val;
using phx::construct;
on_error<fail>(
start,
std::cout
<< val("Error! Expecting \n\n'")
<< qi::_4
<< val("'\n\n here: \n\n'")
<< construct<std::string>(qi::_3, qi::_2)
<< val("'")
<< std::endl
);
//debug(v);
//debug(junk);
//debug(start);
}
qi::rule<Iterator> junk;
//qi::rule<Iterator, qi::unused_type()> junk; // Doesn't work either
//qi::rule<Iterator, qi::unused_type(), qi::unused_type()> junk; // Doesn't work either
qi::rule<Iterator, V(), ascii::space_type> v;
qi::rule<Iterator, std::vector<V>(), ascii::space_type> start;
};
} // namespace client
int main(int argc, char* argv[]) {
using iterator_type = std::string::const_iterator;
std::string input = "";
input += "v 1 2 3\r"; // keep v 1 2 3
input += "o a b c\r"; // parse as junk
input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
input += " v 10 11 12\r\r"; // parse as junk
iterator_type iter = input.begin();
const iterator_type end = input.end();
std::vector<V> parsed_output;
client::VGrammar<iterator_type> v_grammar;
std::cout << "run" << std::endl;
bool r = phrase_parse(iter, end, v_grammar, ascii::space, parsed_output);
std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;
if (r && (iter == end)) {
BOOST_FOREACH(V const& v_row, parsed_output) {
std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
}
}
return EXIT_SUCCESS;
}
Here's the output from the example:
run
done ... r: true, iter==end: true
v, 1, 2, 3
, 0, 0, 0
v, 4, 5, 6
v, 7, 8, 9
v, 10, 11, 12
And here is what I actually want the parser to return.
run
done ... r: true, iter==end: true
v, 1, 2, 3
v, 4, 5, 6
My main problem right now is to keep the 'junk' rule from adding an empty V() object. How do I accomplish this? Or am I overthinking the problem?
I have tried adding lit(junk) to the start rule, since lit() doesn't return anything, but this will not compile. It fails with: "static assertion failed: error_invalid_expression".
I have also tried to set the semantic action on the junk rule to qi::unused_type() but the rule still creates an empty V() in that case.
I am aware of the following questions, but they don't address this particular issue. I have tried out the comment skipper earlier, but it looks like I'll have to reimplement all the parse rules in the skipper in order to identify noise. My example is inspired by the solution in the last link:
How to skip line/block/nested-block comments in Boost.Spirit?
How to parse entries followed by semicolon or newline (boost::spirit)?
Version info:
Linux debian 4.9.0-7-amd64 #1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13) x86_64 GNU/Linux
g++ (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
#define BOOST_VERSION 106200
and:
Linux raspberrypi 4.14.24-v7+ #1097 SMP Mon Mar 5 16:42:05 GMT 2018 armv7l GNU/Linux
g++ (Raspbian 4.9.2-10+deb8u1) 4.9.2
#define BOOST_VERSION 106200
For those who wonder: yes I'm trying to parse files similar to Wavefront OBJ files and I'm aware that there is already a bunch of parsers available. However, the data I'm parsing is part of a larger data structure which also requires parsing, so it does make sense to build a new parser.
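For reference, the intended behaviour can be written down as a small hand-written filter. This is only a guess at what such a non-Spirit baseline might look like, not the asker's actual code; keep_v_lines is a hypothetical name. It splits on \r, \n or \r\n, keeps a line only if "v" is in column 0, and reads the first three numbers after it.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>
#include <vector>

using V = std::tuple<std::string, double, double, double>;

// Hypothetical baseline: keeps only lines of the form "v <x> <y> <z> ...".
inline std::vector<V> keep_v_lines(std::string const& input) {
    std::vector<V> kept;
    std::size_t pos = 0;
    while (pos < input.size()) {
        // A line ends at the next '\r' or '\n', or at the end of the input.
        std::size_t stop = input.find_first_of("\r\n", pos);
        if (stop == std::string::npos) stop = input.size();
        std::string const line = input.substr(pos, stop - pos);

        // Step over the terminator, treating "\r\n" as a single line break.
        pos = stop;
        if (pos < input.size() && input[pos] == '\r') ++pos;
        if (pos < input.size() && input[pos] == '\n') ++pos;

        // Leading whitespace or any other first character means junk.
        if (line.empty() || line[0] != 'v') continue;

        std::istringstream fields(line);
        std::string tag;
        double x = 0, y = 0, z = 0;
        if ((fields >> tag >> x >> y >> z) && tag == "v")
            kept.emplace_back(tag, x, y, z);  // the rest of the line is ignored
    }
    return kept;
}
```

For the sample input from the question this should keep exactly two rows, (v, 1, 2, 3) and (v, 4, 5, 6), which is the output the question asks the Qi grammar to produce.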

What you want to achieve is called error recovery.
Unfortunately, Spirit does not have a nice way of doing it (some internal design decisions also make it hard to implement externally). However, in your case it is simple to achieve by rewriting the grammar.
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phx = boost::phoenix;
using V = std::tuple<std::string, double, double, double>;
namespace client {
template <typename Iterator>
struct VGrammar : qi::grammar<Iterator, std::vector<V>()> {
VGrammar() : VGrammar::base_type(start) {
using namespace qi;
v = skip(blank)[no_skip[string("v")] > double_ > double_ > double_];
junk = +(char_ - eol);
start = (v || -junk) % eol;
v.name("v");
junk.name("junk");
start.name("start");
using phx::val;
using phx::construct;
on_error<fail>(
start,
std::cout
<< val("Error! Expecting \n\n'")
<< qi::_4
<< val("'\n\n here: \n\n'")
<< construct<std::string>(qi::_3, qi::_2)
<< val("'")
<< std::endl
);
//debug(v);
//debug(junk);
//debug(start);
}
qi::rule<Iterator> junk;
//qi::rule<Iterator, qi::unused_type()> junk; // Doesn't work either
//qi::rule<Iterator, qi::unused_type(), qi::unused_type()> junk; // Doesn't work either
qi::rule<Iterator, V()> v;
qi::rule<Iterator, std::vector<V>()> start;
};
} // namespace client
int main(int argc, char* argv[]) {
using iterator_type = std::string::const_iterator;
std::string input = "";
input += "v 1 2 3\r"; // keep v 1 2 3
input += "o a b c\r"; // parse as junk
input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
input += " v 10 11 12\r\r"; // parse as junk
iterator_type iter = input.begin();
const iterator_type end = input.end();
std::vector<V> parsed_output;
client::VGrammar<iterator_type> v_grammar;
std::cout << "run" << std::endl;
bool r = parse(iter, end, v_grammar, parsed_output);
std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;
if (r && (iter == end)) {
BOOST_FOREACH(V const& v_row, parsed_output) {
std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
}
}
return EXIT_SUCCESS;
}

I have tried adding lit(junk) to the start rule, since lit() doesn't return anything, but this will not compile. It fails with: "static assertion failed: error_invalid_expression".
What you're looking for would be omit[junk], but it should make no difference because it will still make the synthesized attribute optional<>.
Fixing Things
First of all, you need newlines to be significant, which means you cannot skip space, because it eats newlines. What's worse, you need leading whitespace to be significant as well (e.g. to junk that last line), so you cannot even use qi::blank as the skipper. (See "Boost spirit skipper issues".)
So that you can still have whitespace inside the v rule, give it a local skipper (one that doesn't eat newlines):
v %= &lit("v") >> skip(blank) [ string("v") > double_ > double_ > double_ ];
It engages the skipper only after establishing that there was no unexpected leading whitespace.
Note that the string("v") is a bit redundant this way, but that brings us to the second motive:
Second of all, I'm with you in avoiding semantic actions. However, this means you have to make your rules reflect your data structures.
In this particular instance, it means you should probably turn the line skipping a bit inside-out. What if you express the grammar as a straight repeat of v, interspersed with /whatever/, instead of just /newline/? I'd write that like:
junk = *(char_ - eol);
other = !v >> junk;
start = *(v >> junk >> eol % other);
Note that:
- the delimiter expression now uses operator% (the list operator) itself: (eol % other). What this cleverly accomplishes is that it keeps eating newlines as long as they are only delimited by "other" lines (anything !v at this point);
- other is more constrained than junk, because junk may eat v, whereas other makes sure that never happens;
- therefore v >> junk allows the third line of your sample to be processed correctly (the line that has v 4 5 6 v 7 8 9\r).
Now it all works: Live On Coliru:
run
done ... r: true, iter==end: true
v, 1, 2, 3
v, 4, 5, 6
Perfecting It
You might be aware of the fact that this does not handle the case when the first line(s) are not v lines. Let's add that case to the sample and make sure it works as well:
Live On Coliru:
//#define BOOST_SPIRIT_DEBUG
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
using V = std::tuple<std::string, double, double, double>;
namespace client {
template <typename Iterator>
struct VGrammar : qi::grammar<Iterator, std::vector<V>()> {
VGrammar() : VGrammar::base_type(start) {
using namespace qi;
v %= &lit("v") >> skip(blank) [ string("v") > double_ > double_ > double_ ];
junk = *(char_ - eol);
other = !v >> junk;
start =
other >> eol % other >>
*(v >> junk >> eol % other);
BOOST_SPIRIT_DEBUG_NODES((v)(junk)(start))
on_error<fail>(
start,
std::cout
<< phx::val("Error! Expecting \n\n'") << qi::_4
<< "'\n\n here: \n\n'" << phx::construct<std::string>(qi::_3, qi::_2)
<< "'\n"
);
}
private:
qi::rule<Iterator> other, junk;
qi::rule<Iterator, V()> v;
qi::rule<Iterator, std::vector<V>()> start;
};
} // namespace client
int main() {
using iterator_type = std::string::const_iterator;
std::string input = "";
input += "o a b c\r"; // parse as junk
input += "v 1 2 3\r"; // keep v 1 2 3
input += "o a b c\r"; // parse as junk
input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
input += " v 10 11 12\r\r"; // parse as junk
iterator_type iter = input.begin();
const iterator_type end = input.end();
std::vector<V> parsed_output;
client::VGrammar<iterator_type> v_grammar;
std::cout << "run" << std::endl;
bool r = parse(iter, end, v_grammar, parsed_output);
std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;
if (iter != end)
std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
if (r) {
BOOST_FOREACH(V const& v_row, parsed_output) {
std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
}
}
return EXIT_SUCCESS;
}

Boost.Spirit X3 Alternative Operator

I have the following code:
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>

struct printer {
    template <typename int_type>
    void operator()(std::vector<int_type> &vec) const {
        std::cout << "vec(" << sizeof(int_type) << "): { ";
        for( auto const &elem : vec ){
            std::cout << elem << ", ";
        }
        std::cout << "}\n";
    }
};

template <typename Iterator>
void parse_int_list(Iterator first, Iterator last) {
    namespace x3 = boost::spirit::x3;
    x3::variant<std::vector<uint32_t>, std::vector<uint64_t>> vecs;
    x3::parse( first, last,
        (x3::uint32 % '|') | (x3::uint64 % '|'), vecs );
    boost::apply_visitor(printer{}, vecs);
}
I expected this to first try parsing input into a 32 bit uint vector, then if that failed into a 64 bit uint vector. This works great if the first integer in the list matches a type that is large enough for anything else in the list. I.e.,
std::string ints32 = "1|2|3";
parse_int_list(begin(ints32), end(ints32));
// prints vec(4): { 1, 2, 3, }
std::string ints64 = "10000000000|20000000000|30000000000";
parse_int_list(begin(ints64), end(ints64));
// prints vec(8): { 10000000000, 20000000000, 30000000000, }
However it does not work when the first number is a 32 bit and a later number is a 64 bit.
std::string ints_mixed = "1|20000000000|30000000000";
parse_int_list(begin(ints_mixed), end(ints_mixed));
// prints vec(4): { 1, }
The return value of x3::parse indicates a parse failure. But according to my reading of the documentation, it should try the second alternative if it can't parse the input with the first.
Any pointers on what I'm reading incorrectly, and how the alternative parser actually works?
Edit: After seeing the responses, I realized that x3::parse was actually returning success. I was checking that it had parsed the entire stream (first == last) to determine success, as demonstrated in the documentation. However, this hid the fact that, because the Kleene star is greedy and nothing anchors the parser to the end of the stream, it had successfully parsed a portion of the input. Thanks all.

The issue here is that "1" on its own is valid input for the (x3::uint32 % '|') parser, so the first branch of the alternative succeeds after consuming only the 1.
The cleanest way for you to fix this would be to have a list of alternatives instead of an alternative of lists.
i.e.:
(x3::uint32 | x3::uint64) % '|'
However, that would mean you would have to parse into a different structure:
vector<x3::variant<uint32_t,uint64_t>> vecs;
Edit:
Alternatively, if you do not intend to use this parser as a sub-parser, you can force an end-of-input in each branch.
(x3::uint32 % '|' >> x3::eoi) | (x3::uint64 % '|' >> x3::eoi)
This would force the first branch to fail if it does not reach the end of the stream, dropping into the alternative.

As Frank commented, the issue is that the Kleene list operator is greedy: it accepts as many elements as will match and considers that a "match".
If you want it to reject input when "some elements have not been parsed", make it so:
parse(first, last, x3::uint32 % '|' >> x3::eoi | x3::uint64 % '|' >> x3::eoi, vecs);
Demo
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/variant.hpp>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct printer {
    template <typename int_type> void operator()(std::vector<int_type> &vec) const {
        std::cout << "vec(" << sizeof(int_type) << "): { ";
        for (auto const &elem : vec) {
            std::cout << elem << ", ";
        }
        std::cout << "}\n";
    }
};

template <typename Iterator> void parse_int_list(Iterator first, Iterator last) {
    namespace x3 = boost::spirit::x3;
    boost::variant<std::vector<uint32_t>, std::vector<uint64_t> > vecs;
    // each branch must reach end-of-input, otherwise the next alternative is tried
    parse(first, last, x3::uint32 % '|' >> x3::eoi | x3::uint64 % '|' >> x3::eoi, vecs);
    apply_visitor(printer{}, vecs);
}

int main() {
    for (std::string const input : {
            "1|2|3",
            "4294967295",
            "4294967296",
            "4294967295|4294967296",
        }) {
        parse_int_list(input.begin(), input.end());
    }
}
Prints
vec(4): { 1, 2, 3, }
vec(4): { 4294967295, }
vec(8): { 4294967296, }
vec(8): { 4294967295, 4294967296, }

Boost Spirit X3 AST not working with semantic actions when using separate rule definition and instantiation

I am trying to use Boost Spirit X3 with semantic actions while parsing the structure to an AST. If I use a rule without separate definition and instantiation it works just fine, for example:
#include <vector>
#include <string>
#include <iostream>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/x3.hpp>
namespace ast
{
struct ast_struct
{
int number;
std::vector<int> numbers;
};
}
BOOST_FUSION_ADAPT_STRUCT(
ast::ast_struct,
(int, number)
(std::vector<int>, numbers)
)
namespace x3 = boost::spirit::x3;
using namespace std;
void parse( const std::string &data )
{
string::const_iterator begin = data.begin();
string::const_iterator end = data.end();
unsigned n(0);
auto f = [&n]( auto &ctx )
{
n = x3::_attr(ctx);
};
ast::ast_struct ast;
bool r = x3::parse( begin, end,
x3::int_[f] >> +( x3::omit[+x3::blank] >> x3::int_ ), ast );
if ( r && begin == end )
{
cout << "n: " << n << ", ";
std::copy(ast.numbers.begin(), ast.numbers.end(),
std::ostream_iterator<int>(std::cout << ast.numbers.size() << " elements: ", " "));
cout << endl;
}
else
cout << "Parse failed" << endl;
}
int main()
{
parse( "3 1 2 3" );
parse( "4 1 2 3 4" );
return 0;
}
Running the code above (compiled with flags -std=c++14) outputs the expected result:
n: 3, 3 elements: 1 2 3
n: 4, 4 elements: 1 2 3 4
Now I am trying to have my Spirit X3 parser organized more or less the same way as the calc 9 example from Boost Spirit X3, but it does not work:
ast.hxx: defines the abstract syntax tree.
grammar.hxx: user interface exposing the parser methods.
grammar.cxx: instantiates the rules.
grammar_def.hxx: parser grammar definition.
config.hxx: parser configuration.
main.cxx: parser usage example.
ast.hxx:
#ifndef AST_HXX
#define AST_HXX
#include <vector>
#include <boost/fusion/include/adapt_struct.hpp>
namespace ast
{
struct ast_struct
{
int number;
std::vector<int> numbers;
};
}
BOOST_FUSION_ADAPT_STRUCT(
ast::ast_struct,
(int, number)
(std::vector<int>, numbers)
)
#endif
grammar.hxx:
#ifndef GRAMMAR_HXX
#define GRAMMAR_HXX
#include "ast.hxx"
#include <boost/spirit/home/x3.hpp>
namespace parser
{
namespace x3 = boost::spirit::x3;
using my_rule_type = x3::rule<class my_rule_class, ast::ast_struct>;
BOOST_SPIRIT_DECLARE( my_rule_type );
const my_rule_type &get_my_rule();
}
#endif
grammar.cxx:
#include "grammar_def.hxx"
#include "config.hxx"
namespace parser
{
BOOST_SPIRIT_INSTANTIATE( my_rule_type, iterator_type, context_type )
}
grammar_def.hxx:
#ifndef GRAMMAR_DEF_HXX
#define GRAMMAR_DEF_HXX
#include <iostream>
#include <boost/spirit/home/x3.hpp>
#include "grammar.hxx"
#include "ast.hxx"
namespace parser
{
namespace x3 = boost::spirit::x3;
const my_rule_type my_rule( "my_rule" );
unsigned n;
auto f = []( auto &ctx )
{
n = x3::_attr(ctx);
};
auto my_rule_def = x3::int_[f] >> +( x3::omit[+x3::blank] >> x3::int_ );
BOOST_SPIRIT_DEFINE( my_rule )
const my_rule_type &get_my_rule()
{
return my_rule;
}
}
#endif
config.hxx:
#ifndef CONFIG_HXX
#define CONFIG_HXX
#include <string>
#include <boost/spirit/home/x3.hpp>
namespace parser
{
namespace x3 = boost::spirit::x3;
using iterator_type = std::string::const_iterator;
using context_type = x3::unused_type;
}
#endif
main.cxx:
#include "ast.hxx"
#include "grammar.hxx"
#include "config.hxx"
#include <iostream>
#include <boost/spirit/home/x3.hpp>
#include <string>
namespace x3 = boost::spirit::x3;
using namespace std;
void parse( const std::string &data )
{
parser::iterator_type begin = data.begin();
parser::iterator_type end = data.end();
ast::ast_struct ast;
cout << "Parsing [" << string(begin,end) << "]" << endl;
bool r = x3::parse( begin, end, parser::get_my_rule(), ast );
if ( r && begin == end )
{
std::copy(ast.numbers.begin(), ast.numbers.end(),
std::ostream_iterator<int>(std::cout << ast.numbers.size() << " elements: ", " "));
cout << endl;
}
else
cout << "Parse failed" << endl;
}
int main()
{
parse( "3 1 2 3" );
parse( "4 1 2 3 4" );
return 0;
}
Compiling main.cxx and grammar.cxx (flags: -std=c++14) and running the code above prints:
Parsing [3 1 2 3]
0 elements:
Parsing [4 1 2 3 4]
0 elements:
I apologize for the long source code, I tried to make it as small as possible.
Please note that I do need the global unsigned n variable: it will be used with a custom repeat directive (see question here and one of the solutions here). To keep this question focused I removed the repeat part, so although the semantic action could be removed from this example, doing so is not an option for me.
I would appreciate some help getting to the bottom of this issue; it is not clear to me why the code above does not work. Thank you in advance.

I must admit actually reconstructing your sample was a bit too much work for me (call me lazy...).
However, I know the answer and a trick to make your life simpler.
The Answer
Semantic actions on a rule definition inhibit automatic attribute propagation. From the Qi docs (the same goes for X3, but I always lose the link to the docs):
r = p; (Rule definition)
    This is equivalent to r %= p (see below) if there are no semantic actions attached anywhere in p.
r %= p; (Auto-rule definition)
    The attribute of p should be compatible with the synthesized attribute of r. When p is successful, its attribute is automatically propagated to r's synthesized attribute.
The Trick
You can inject state (your n reference, in this case) using the x3::with<> directive. That way you don't have the namespace global (n) and can make the parser reentrant, threadsafe etc.
Here's my "simplest" take on things, in a single file:
namespace parsing {
x3::rule<struct parser, ast::ast_struct> parser {"parser"};
struct state_tag { };
auto record_number = [](auto &ctx) {
unsigned& n = x3::get<state_tag>(ctx);
n = x3::_attr(ctx);
};
auto parser_def = x3::rule<struct parser_def, ast::ast_struct> {}
%= x3::int_[record_number] >> +(x3::omit[+x3::blank] >> x3::int_);
BOOST_SPIRIT_DEFINE(parser)
}
Tip: run the demo with = instead of the %= to see the difference in behaviour!
Note that get<state_tag>(ctx) returns a reference_wrapper<unsigned> just because we use the parser as follows:
void parse(const std::string &data) {
using namespace std;
ast::ast_struct ast;
unsigned n;
auto parser = x3::with<parsing::state_tag>(ref(n)) [parsing::parser] >> x3::eoi;
if (x3::parse(data.begin(), data.end(), parser, ast)) {
cout << "n: " << n << ", ";
copy(ast.numbers.begin(), ast.numbers.end(), ostream_iterator<int>(cout << ast.numbers.size() << " elements: ", " "));
cout << "\n";
} else
cout << "Parse failed\n";
}
Live Demo
Live On Coliru
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
namespace ast {
struct ast_struct {
int number;
std::vector<int> numbers;
};
}
BOOST_FUSION_ADAPT_STRUCT(ast::ast_struct, number, numbers)
namespace x3 = boost::spirit::x3;
namespace parsing {
x3::rule<struct parser, ast::ast_struct> parser {"parser"};
struct state_tag { };
auto record_number = [](auto &ctx) {
unsigned& n = x3::get<state_tag>(ctx); // note: returns reference_wrapper<T>
n = x3::_attr(ctx);
};
auto parser_def = x3::rule<struct parser_def, ast::ast_struct> {}
%= x3::int_[record_number] >> +(x3::omit[+x3::blank] >> x3::int_);
BOOST_SPIRIT_DEFINE(parser)
}
void parse(const std::string &data) {
using namespace std;
ast::ast_struct ast;
unsigned n = 0;
auto parser = x3::with<parsing::state_tag>(ref(n)) [parsing::parser] >> x3::eoi;
if (x3::parse(data.begin(), data.end(), parser, ast)) {
cout << "n: " << n << ", ";
copy(ast.numbers.begin(), ast.numbers.end(), ostream_iterator<int>(cout << ast.numbers.size() << " elements: ", " "));
cout << "\n";
} else
cout << "Parse failed\n";
}
int main() {
parse("3 1 2 3");
parse("4 1 2 3 4");
}
Prints
n: 3, 3 elements: 1 2 3
n: 4, 4 elements: 1 2 3 4

Boost::Spirit result of phrase_parse

Hello everybody, I am new to Boost and Boost.Spirit, so I apologize for the beginner question.
When I use the qi::phrase_parse function, it returns only a bool that indicates whether parsing was successful, but I don't know where I can find the result of the parse (some sort of syntax tree, etc.).
If I define the macro BOOST_SPIRIT_DEBUG (#define BOOST_SPIRIT_DEBUG), an XML representation of the tree is printed on standard output, so these nodes have to be stored somewhere. Can you help me please?

You can 'bind' attribute references. qi::parse, qi::phrase_parse (and related) accept variadic arguments that will be used to receive the exposed attributes.
A simplistic example is: (EDIT included a utree example too)
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_utree.hpp>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

namespace qi = boost::spirit::qi;

int main()
{
    using namespace qi;

    std::string input("1 2 3 4 5");
    std::string::const_iterator F(input.begin()), f(F), l(input.end());

    std::vector<int> ints;
    if (qi::phrase_parse(f = F, l, *qi::int_, qi::space, ints))
        std::cout << ints.size() << " ints parsed\n";

    int i;
    std::string s;
    // it is variadic:
    if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> +qi::char_, i, s))
        std::cout << "i: " << i << ", s: " << s << '\n';

    std::pair<int, std::string> data;
    // any compatible sequence can be used:
    if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> +qi::char_, data))
        std::cout << "first: " << data.first << ", second: " << data.second << '\n';

    // using utree:
    boost::spirit::utree tree;
    if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> qi::as_string [ +qi::char_ ], tree))
        std::cout << "tree: " << tree << '\n';
}
Outputs:
5 ints parsed
i: 3, s: 4 5
first: 3, second: 4 5
tree: ( 3 " 4 5" )
A few samples of parsers with 'AST' like data structures:
Boolean expression (grammar) parser in c++
Boost::Spirit Expression Parser
If you want to have a very generic AST structure, look at utree: http://www.boost.org/doc/libs/1_50_0/libs/spirit/doc/html/spirit/support/utree.html
