parsing a single value into an ast node with a container - c++

My problem is the following. I have an AST node defined like this:
struct foo_node{
std::vector<std::string> value;
};
and I have a parser like this for parsing into the struct, which works fine:
typedef x3::rule<struct foo_node_class, foo_node> foo_node_type;
const foo_node_type foo_node = "foo_node";
auto const foo_node_def = "(" >> +x3::string("bar") >> ")";
Now I want the parser to also accept "bar" without brackets, but only if it's a single bar. I tried this:
auto const foo_node_def = x3::string("bar")
| "(" > +x3::string("bar") > ")";
but this gives me a compile-time error, since x3::string("bar") exposes a std::string attribute and not a std::vector<std::string>.
My question is: how can I make the x3::string("bar") parser (and any other parser that exposes a string) parse into a vector?

The way to parse a single element and expose it as a single-element container attribute is x3::repeat(1) [ p ]:
Live On Coliru
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
namespace x3 = boost::spirit::x3;
struct foo_node {
std::vector<std::string> value;
};
BOOST_FUSION_ADAPT_STRUCT(foo_node, value)
namespace rules {
auto const bar
= x3::string("bar");
auto const foo_node
= '(' >> +bar >> ')'
| x3::repeat(1) [ bar ] // exactly one unbracketed bar, exposed as a one-element container
;
}
int main() {
for (std::string const input : {
"bar",
"(bar)",
"(barbar)",
})
{
auto f = input.begin(), l = input.end();
foo_node data;
bool ok = x3::parse(f, l, rules::foo_node, data);
if (ok) {
std::cout << "Parse success: " << data.value.size() << " elements\n";
} else {
std::cout << "Parse failed\n";
}
if (f != l)
std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
}
Prints
Parse success: 1 elements
Parse success: 1 elements
Parse success: 2 elements

Related

Boost X3: Can a variant member be avoided in disjunctions?

I'd like to parse string | (string, int) and store it in a structure that defaults the int component to some value. The attribute of such a construction in X3 is a variant<string, tuple<string, int>>. I was thinking I could have a struct that takes either a string or a (string, int) to automagically be populated:
struct bar
{
bar (std::string x = "", int y = 0) : baz1 {x}, baz2 {y} {}
std::string baz1;
int baz2;
};
BOOST_FUSION_ADAPT_STRUCT (disj::ast::bar, baz1, baz2)
and then simply have:
const x3::rule<class bar, ast::bar> bar = "bar";
using x3::int_;
using x3::ascii::alnum;
auto const bar_def = (+(alnum) | ('(' >> +(alnum) >> ',' >> int_ >> ')')) >> ';';
BOOST_SPIRIT_DEFINE(bar);
However this does not work:
/usr/include/boost/spirit/home/x3/core/detail/parse_into_container.hpp:139:59: error: static assertion failed: Expecting a single element fusion sequence
139 | static_assert(traits::has_size<Attribute, 1>::value,
Setting baz2 to an optional does not help. One way to solve this is to have a variant field or inherit from that type:
struct string_int {
std::string s;
int i;
};
struct foo {
boost::variant<std::string, string_int> var;
};
BOOST_FUSION_ADAPT_STRUCT (disj::ast::string_int, s, i)
BOOST_FUSION_ADAPT_STRUCT (disj::ast::foo, var)
(For some reason, I have to use boost::variant instead of x3::variant for operator<< to work; also, using std::pair or tuple for string_int does not work, but boost::fusion::deque does.) One can then equip foo somehow to get the string and integer.
Question: What is the proper, clean way to do this in X3? Is there a more natural way than this second option and equipping foo with accessors?
Sadly the wording in the x3 section is exceedingly sparse and allows it (contrast the Qi section). A quick test confirms it:
Live On Coliru
#include <boost/core/demangle.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>
#include <typeinfo>
namespace x3 = boost::spirit::x3;
template <typename Expr>
std::string inspect(Expr const& expr) {
using A = typename x3::traits::attribute_of<Expr, x3::unused_type>::type;
return boost::core::demangle(typeid(A).name());
}
int main()
{
std::cout << inspect(x3::double_ | x3::int_) << "\n"; // variant expected
std::cout << inspect(x3::int_ | "bla" >> x3::int_) << "\n"; // variant "understandable"
std::cout << inspect(x3::int_ | x3::int_) << "\n"; // variant, surprisingly
}
Prints
boost::variant<double, int>
boost::variant<int, int>
boost::variant<int, int>
All Hope Is Not Lost
In your specific case you could trick the system:
auto const bar_def = //
(+x3::alnum >> x3::attr(-1) //
| '(' >> +x3::alnum >> ',' >> x3::int_ >> ')' //
) >> ';';
Note how we "inject" an int value for the first branch. That satisfies the attribute propagation gods:
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/adapted/struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <iomanip>
namespace x3 = boost::spirit::x3;
namespace disj::ast {
struct bar {
std::string x;
int y;
};
using boost::fusion::operator<<;
} // namespace disj::ast
BOOST_FUSION_ADAPT_STRUCT(disj::ast::bar, x, y)
namespace disj::parser {
const x3::rule<class bar, ast::bar> bar = "bar";
auto const bar_def = //
(+x3::alnum >> x3::attr(-1) //
| '(' >> +x3::alnum >> ',' >> x3::int_ >> ')' //
) >> ';';
BOOST_SPIRIT_DEFINE(bar)
}
namespace disj {
void run_tests() {
for (std::string const input : {
"",
";",
"bla;",
"bla, 42;",
"(bla, 42);",
}) {
ast::bar val;
auto f = begin(input), l = end(input);
std::cout << "\n" << quoted(input) << " -> ";
if (phrase_parse(f, l, parser::bar, x3::space, val)) {
std::cout << "Parsed: " << val << "\n";
} else {
std::cout << "Failed\n";
}
if (f!=l) {
std::cout << " -- Remaining " << quoted(std::string_view(f, l)) << "\n";
}
}
}
}
int main()
{
disj::run_tests();
}
Prints
"" -> Failed
";" -> Failed
-- Remaining ";"
"bla;" -> Parsed: (bla -1)
"bla, 42;" -> Failed
-- Remaining "bla, 42;"
"(bla, 42);" -> Parsed: (bla 42)
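If you would rather keep the variant attribute from the question instead of injecting a default with x3::attr, the accessor can be a small visitor. What follows is only a minimal sketch under assumptions: it uses std::variant as a stand-in for boost::variant (with boost::variant you would go through boost::apply_visitor instead), and the names name_or_pair and to_bar are hypothetical, not something X3 provides.

```cpp
#include <cassert>
#include <string>
#include <variant>

// The two shapes the alternative parser can produce.
struct string_int {
    std::string s;
    int i;
};
using name_or_pair = std::variant<std::string, string_int>; // hypothetical alias

// The flat structure we actually want to work with.
struct bar {
    std::string baz1;
    int baz2;
};

// Hypothetical helper: collapse either alternative into a bar,
// filling in default_value when only the string was parsed.
bar to_bar(name_or_pair const& v, int default_value = 0) {
    struct visitor {
        int default_value;
        bar operator()(std::string const& s) const { return {s, default_value}; }
        bar operator()(string_int const& p) const { return {p.s, p.i}; }
    };
    return std::visit(visitor{default_value}, v);
}
```

Calling to_bar on a variant holding just "bla" with a default of -1 gives the same (bla -1) pair as the x3::attr(-1) trick, but the conversion happens after parsing rather than inside the grammar.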
¹ just today

boost spirit x3 match an end of lexeme? [duplicate]

How does one prevent X3 symbol parsers from matching partial tokens? In the example below, I want to match "foo", but not "foobar". I tried throwing the symbol parser in a lexeme directive as one would for an identifier, but then nothing matches.
Thanks for any insights!
#include <string>
#include <iostream>
#include <iomanip>
#include <boost/spirit/home/x3.hpp>
int main() {
boost::spirit::x3::symbols<int> sym;
sym.add("foo", 1);
for (std::string const input : {
"foo",
"foobar",
"barfoo"
})
{
using namespace boost::spirit::x3;
std::cout << "\nParsing " << std::left << std::setw(20) << ("'" + input + "':");
int v;
auto iter = input.begin();
auto end = input.end();
bool ok;
{
// what's the right rule?
// this matches nothing
// auto r = lexeme[sym - alnum];
// this matches prefix strings
auto r = sym;
ok = phrase_parse(iter, end, r, space, v);
}
if (ok) {
std::cout << v << " Remaining: " << std::string(iter, end);
} else {
std::cout << "Parse failed";
}
}
}
Qi used to have distinct in their repository.
X3 doesn't.
The thing that solves it for the case you showed is a simple lookahead assertion:
auto r = lexeme [ sym >> !alnum ];
You could make a distinct helper easily too, e.g.:
auto kw = [](auto p) { return lexeme [ p >> !(alnum | '_') ]; };
Now you can just parse kw(sym).
Live On Coliru
#include <iostream>
#include <boost/spirit/home/x3.hpp>
int main() {
boost::spirit::x3::symbols<int> sym;
sym.add("foo", 1);
for (std::string const input : { "foo", "foobar", "barfoo" }) {
std::cout << "\nParsing '" << input << "': ";
auto iter = input.begin();
auto const end = input.end();
int v = -1;
bool ok;
{
using namespace boost::spirit::x3;
auto kw = [](auto p) { return lexeme [ p >> !(alnum | '_') ]; };
ok = phrase_parse(iter, end, kw(sym), space, v);
}
if (ok) {
std::cout << v << " Remaining: '" << std::string(iter, end) << "'\n";
} else {
std::cout << "Parse failed";
}
}
}
Prints
Parsing 'foo': 1 Remaining: ''
Parsing 'foobar': Parse failed
Parsing 'barfoo': Parse failed
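To make it clearer what the lookahead in p >> !(alnum | '_') checks, here is the same boundary test written by hand, without Spirit. This is only an illustrative sketch: match_keyword is a hypothetical helper, it does no whitespace skipping, and it uses std::isalnum in place of Spirit's character classes.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not part of Spirit): tries to match `kw` at the start
// of `input`. It succeeds only if the keyword is not directly followed by an
// identifier character, which is the condition `!(alnum | '_')` asserts.
// Returns the number of characters consumed, or std::nullopt on failure.
std::optional<std::size_t> match_keyword(std::string_view input, std::string_view kw) {
    if (input.substr(0, kw.size()) != kw)
        return std::nullopt; // the keyword itself is not there ("barfoo")

    if (input.size() > kw.size()) {
        unsigned char next = static_cast<unsigned char>(input[kw.size()]);
        if (std::isalnum(next) || next == '_')
            return std::nullopt; // partial token such as "foobar": reject
    }
    return kw.size(); // the lookahead itself consumes nothing
}
```

With this, match_keyword("foo", "foo") and match_keyword("foo bar", "foo") both report 3 consumed characters, while "foobar" and "foo_1" are rejected.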

What's the appropriate way to indicate a Qi transform attribute fail?

What's the proper way to indicate a parse fail in a boost::spirit::traits::transform_attribute? Can I throw any old exception, or is there a specific thing it wants me to do?
namespace boost
{
namespace spirit
{
namespace traits
{
template <>
struct transform_attribute<TwoWords, std::vector<char>, qi::domain>
{
typedef std::vector<char> type;
static type pre(TwoWords&) { return{}; }
static void post(TwoWords& val, type const& attr) {
std::string stringed(attr.begin(), attr.end());
//https://stackoverflow.com/questions/236129/the-most-elegant-way-to-iterate-the-words-of-a-string
std::vector<std::string> strs;
boost::split(strs, stringed, boost::is_any_of(","));
if(strs.size()!=2)
{
//What do I do here?
}
val = TwoWords(strs[0],strs[1]);
}
static void fail(TwoWords&) { }
};
}
}
}
Yes, raising an exception seems the only out-of-band way.
You could use qi::on_error to trap and respond to it.
However, it's a bit unclear what you need this for. It seems a bit upside down to use split inside a parser. Splitting is basically a poor version of parsing.
Why not have a rule for the sub-parsing?
1. Simple Throw...
Live On Coliru
#include <boost/algorithm/string.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
namespace qi = boost::spirit::qi;
struct Invalid {};
struct TwoWords {
std::string one, two;
};
namespace boost { namespace spirit { namespace traits {
template <> struct transform_attribute<TwoWords, std::vector<char>, qi::domain> {
typedef std::vector<char> type;
static type pre(TwoWords &) { return {}; }
static void post(TwoWords &val, type const &attr) {
std::string stringed(attr.begin(), attr.end());
std::vector<std::string> strs;
boost::split(strs, stringed, boost::is_any_of(","));
if (strs.size() != 2) {
throw Invalid{};
}
val = TwoWords{ strs.at(0), strs.at(1) };
}
static void fail(TwoWords &) {}
};
} } }
template <typename It>
struct Demo1 : qi::grammar<It, TwoWords()> {
Demo1() : Demo1::base_type(start) {
start = qi::attr_cast<TwoWords>(+qi::char_);
}
private:
qi::rule<It, TwoWords()> start;
};
int main() {
Demo1<std::string::const_iterator> parser;
for (std::string const input : { ",", "a,b", "a,b,c" }) {
std::cout << "Parsing " << std::quoted(input) << " -> ";
TwoWords tw;
try {
if (parse(input.begin(), input.end(), parser, tw)) {
std::cout << std::quoted(tw.one) << ", " << std::quoted(tw.two) << "\n";
} else {
std::cout << "Failed\n";
}
} catch(Invalid) {
std::cout << "Input invalid\n";
}
}
}
Prints
Parsing "," -> "", ""
Parsing "a,b" -> "a", "b"
Parsing "a,b,c" -> Input invalid
2. Handling Errors Inside The Parser
This feels a bit hacky because it requires you to throw an expectation_failure.
This is not optimal since it assumes you know the iterator type the parser is going to be instantiated with.
on_error was designed for use with expectation points.
Live On Coliru
#include <boost/algorithm/string.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
namespace qi = boost::spirit::qi;
struct Invalid {};
struct TwoWords {
std::string one, two;
};
namespace boost { namespace spirit { namespace traits {
template <> struct transform_attribute<TwoWords, std::vector<char>, qi::domain> {
typedef std::vector<char> type;
static type pre(TwoWords &) { return {}; }
static void post(TwoWords &val, type const &attr) {
std::string stringed(attr.begin(), attr.end());
std::vector<std::string> strs;
boost::split(strs, stringed, boost::is_any_of(","));
if (strs.size() != 2) {
throw qi::expectation_failure<std::string::const_iterator>({}, {}, info("test"));
}
val = TwoWords{ strs.at(0), strs.at(1) };
}
static void fail(TwoWords &) {}
};
} } }
template <typename It>
struct Demo2 : qi::grammar<It, TwoWords()> {
Demo2() : Demo2::base_type(start) {
start = qi::attr_cast<TwoWords>(+qi::char_);
qi::on_error(start, [](auto&&...){});
// more verbose spelling:
// qi::on_error<qi::error_handler_result::fail>(start, [](auto&&...){ /*no-op*/ });
}
private:
qi::rule<It, TwoWords()> start;
};
int main() {
Demo2<std::string::const_iterator> parser;
for (std::string const input : { ",", "a,b", "a,b,c" }) {
std::cout << "Parsing " << std::quoted(input) << " -> ";
TwoWords tw;
try {
if (parse(input.begin(), input.end(), parser, tw)) {
std::cout << std::quoted(tw.one) << ", " << std::quoted(tw.two) << "\n";
} else {
std::cout << "Failed\n";
}
} catch(Invalid) {
std::cout << "Input invalid\n";
}
}
}
Prints
Parsing "," -> "", ""
Parsing "a,b" -> "a", "b"
Parsing "a,b,c" -> Failed
3. Finally: Sub-rules Rule!
Let's assume a slightly more interesting grammar in which you have a ; separated list of TwoWords:
"foo,bar;a,b"
We parse into a vector of TwoWords:
using Word = std::string;
struct TwoWords { std::string one, two; };
using TwoWordses = std::vector<TwoWords>;
Instead of using traits to "coerce" attributes, we just adapt the struct and rely on automatic attribute propagation:
BOOST_FUSION_ADAPT_STRUCT(TwoWords, one, two)
The parser mimics the data-types:
template <typename It>
struct Demo3 : qi::grammar<It, TwoWordses()> {
Demo3() : Demo3::base_type(start) {
using namespace qi;
word = *(graph - ',' - ';');
twowords = word >> ',' >> word;
start = twowords % ';';
}
private:
qi::rule<It, Word()> word;
qi::rule<It, TwoWords()> twowords;
qi::rule<It, TwoWordses()> start;
};
And the full test is Live On Coliru
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
namespace qi = boost::spirit::qi;
using Word = std::string;
struct TwoWords { std::string one, two; };
using TwoWordses = std::vector<TwoWords>;
BOOST_FUSION_ADAPT_STRUCT(TwoWords, one, two)
template <typename It>
struct Demo3 : qi::grammar<It, TwoWordses()> {
Demo3() : Demo3::base_type(start) {
using namespace qi;
word = *(graph - ',' - ';');
twowords = word >> ',' >> word;
start = twowords % ';';
}
private:
qi::rule<It, Word()> word;
qi::rule<It, TwoWords()> twowords;
qi::rule<It, TwoWordses()> start;
};
int main() {
using It = std::string::const_iterator;
Demo3<It> parser;
for (std::string const input : {
",",
"foo,bar",
"foo,bar;qux,bax",
"foo,bar;qux,bax;err,;,ful",
// failing cases or cases with trailing input:
"",
"foo,bar;",
"foo,bar,qux",
})
{
std::cout << "Parsing " << std::quoted(input) << " ->\n";
TwoWordses tws;
It f = input.begin(), l = input.end();
if (parse(f, l, parser, tws)) {
for(auto& tw : tws) {
std::cout << " - " << std::quoted(tw.one) << ", " << std::quoted(tw.two) << "\n";
}
} else {
std::cout << "Failed\n";
}
if (f != l) {
std::cout << "Remaining unparsed input: " << std::quoted(std::string(f,l)) << "\n";
}
}
}
Prints
Parsing "," ->
- "", ""
Parsing "foo,bar" ->
- "foo", "bar"
Parsing "foo,bar;qux,bax" ->
- "foo", "bar"
- "qux", "bax"
Parsing "foo,bar;qux,bax;err,;,ful" ->
- "foo", "bar"
- "qux", "bax"
- "err", ""
- "", "ful"
Parsing "" ->
Failed
Parsing "foo,bar;" ->
- "foo", "bar"
Remaining unparsed input: ";"
Parsing "foo,bar,qux" ->
- "foo", "bar"
Remaining unparsed input: ",qux"

How to write a boost::spirit::qi parser to parse an integer range from 0 to std::numeric_limits<int>::max()?

I tried to use qi::uint_parser<int>(), but it behaves the same as qi::uint_: both match integers in the range 0 to std::numeric_limits<unsigned int>::max().
Is qi::uint_parser<int>() designed to work like this? Which parser should I use to match an integer in the range 0 to std::numeric_limits<int>::max()? Thanks.
Simplest demo, attaching a semantic action to do the range check:
uint_ [ _pass = (_1>=0 && _1<=std::numeric_limits<int>::max()) ];
Live On Coliru
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <limits>
#include <string>
#include <boost/spirit/include/phoenix.hpp>
template <typename It>
struct MyInt : boost::spirit::qi::grammar<It, int()> {
MyInt() : MyInt::base_type(start) {
using namespace boost::spirit::qi;
start %= uint_ [ _pass = (_1>=0 && _1<=std::numeric_limits<int>::max()) ];
}
private:
boost::spirit::qi::rule<It, int()> start;
};
template <typename Int>
void test(Int value, char const* logical) {
MyInt<std::string::const_iterator> p;
std::string const input = std::to_string(value);
std::cout << " ---------------- Testing '" << input << "' (" << logical << ")\n";
auto f = input.begin(), l = input.end();
int parsed;
if (parse(f, l, p, parsed)) {
std::cout << "Parse success: " << parsed << "\n";
} else {
std::cout << "Parse failed\n";
}
if (f!=l) {
std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
}
int main() {
unsigned maxint = std::numeric_limits<int>::max();
MyInt<std::string::const_iterator> p;
test(maxint , "maxint");
test(maxint-1, "maxint-1");
test(maxint+1, "maxint+1");
test(0 , "0");
test(-1 , "-1");
}
Prints
---------------- Testing '2147483647' (maxint)
Parse success: 2147483647
---------------- Testing '2147483646' (maxint-1)
Parse success: 2147483646
---------------- Testing '2147483648' (maxint+1)
Parse failed
Remaining unparsed: '2147483648'
---------------- Testing '0' (0)
Parse success: 0
---------------- Testing '-1' (-1)
Parse failed
Remaining unparsed: '-1'
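For comparison, the same idea (parse as unsigned, then reject anything above the int maximum) can be sketched without Spirit using std::from_chars. This is not what Qi does internally, only a hand-written equivalent of the range check; parse_nonnegative_int is a hypothetical name, and unlike the demo above it requires the whole input to be consumed.

```cpp
#include <cassert>
#include <charconv>
#include <limits>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses a decimal integer in [0, INT_MAX].
// Returns std::nullopt for signs, overflow, trailing garbage or empty input.
std::optional<int> parse_nonnegative_int(std::string_view text) {
    unsigned value = 0;
    char const* first = text.data();
    char const* last  = text.data() + text.size();

    // Step 1: parse as unsigned (a leading '-' is not accepted for unsigned types).
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;

    // Step 2: the range check, playing the role of the _pass assignment.
    if (value > static_cast<unsigned>(std::numeric_limits<int>::max()))
        return std::nullopt;

    return static_cast<int>(value);
}
```

On a platform with 32-bit int, "2147483647" and "0" are accepted, while "2147483648" and "-1" are rejected, matching the results printed above.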

How can I add conditional expectation points in spirit X3

I am currently adding expectation points to my grammar in X3.
Now I came across a rule that looks like this:
auto const id_string = +x3::char_("A-Za-z0-9_");
auto const nested_identifier_def =
x3::lexeme[
*(id_string >> "::")
>> *(id_string >> ".")
>> id_string
];
I am wondering how I can add conditional expectation points to this rule,
such as "if there is a "::", an id_string must follow" or "if there is a ".", an id_string must follow",
and so on.
How can I achieve this behaviour for such a rule?
I'd write it exactly the way you intend it:
auto const identifier
= lexeme [+char_("A-Za-z0-9_")];
auto const qualified_id
= identifier >> *("::" > identifier);
auto const simple_expression // only member expressions supported now
= qualified_id >> *('.' > identifier);
With a corresponding AST:
namespace AST {
using identifier = std::string;
struct qualified_id : std::vector<identifier> { using std::vector<identifier>::vector; };
struct simple_expression {
qualified_id lhs;
std::vector<identifier> rhs;
};
}
LIVE DEMO
Live On Coliru
#include <iostream>
#include <string>
#include <vector>
namespace AST {
using identifier = std::string;
struct qualified_id : std::vector<identifier> { using std::vector<identifier>::vector; };
struct simple_expression {
qualified_id lhs;
std::vector<identifier> rhs;
};
}
#include <boost/fusion/adapted.hpp>
BOOST_FUSION_ADAPT_STRUCT(AST::simple_expression, lhs, rhs)
#include <boost/spirit/home/x3.hpp>
namespace Parser {
using namespace boost::spirit::x3;
auto const identifier
= rule<struct identifier_, AST::identifier> {}
= lexeme [+char_("A-Za-z0-9_")];
auto const qualified_id
= rule<struct qualified_id_, AST::qualified_id> {}
= identifier >> *("::" > identifier);
auto const simple_expression // only member expressions supported now
= rule<struct simple_expression_, AST::simple_expression> {}
= qualified_id >> *('.' > identifier);
}
int main() {
using It = std::string::const_iterator;
for (std::string const input : { "foo", "foo::bar", "foo.member", "foo::bar.member.subobject" }) {
It f = input.begin(), l = input.end();
AST::simple_expression data;
bool ok = phrase_parse(f, l, Parser::simple_expression, boost::spirit::x3::space, data);
if (ok) {
std::cout << "Parse success: ";
for (auto& el : data.lhs) std::cout << "::" << el;
for (auto& el : data.rhs) std::cout << "." << el;
std::cout << "\n";
}
else {
std::cout << "Parse failure ('" << input << "')\n";
}
if (f != l)
std::cout << "Remaining unparsed input: '" << std::string(f, l) << "'\n";
}
}
Prints
Parse success: ::foo
Parse success: ::foo::bar
Parse success: ::foo.member
Parse success: ::foo::bar.member.subobject
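What makes the expectation "conditional" is its position: in identifier >> *("::" > identifier) the star may match zero times, but once "::" has been consumed, a missing identifier is a hard error rather than a backtrack. Here is a hand-written sketch of that control flow, without Spirit and without a skipper. The function names are hypothetical, and std::runtime_error merely stands in for the x3::expectation_failure that X3 would throw.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Identifier characters, mirroring char_("A-Za-z0-9_").
bool is_id_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Reads one identifier starting at pos; returns an empty string if there is none.
std::string read_identifier(std::string_view in, std::size_t& pos) {
    std::size_t start = pos;
    while (pos < in.size() && is_id_char(in[pos])) ++pos;
    return std::string(in.substr(start, pos - start));
}

// Hand-written analogue of: identifier >> *("::" > identifier)
std::vector<std::string> parse_qualified_id(std::string_view in, std::size_t& pos) {
    std::vector<std::string> parts;
    std::string first = read_identifier(in, pos);
    if (first.empty())
        return parts; // ordinary failure: nothing consumed, the caller may try something else
    parts.push_back(std::move(first));

    while (in.substr(pos, 2) == "::") { // zero or more repetitions
        pos += 2;
        std::string next = read_identifier(in, pos);
        if (next.empty()) // expectation point: "::" was seen, so an identifier must follow
            throw std::runtime_error("expected identifier after '::'");
        parts.push_back(std::move(next));
    }
    return parts;
}
```

Parsing "foo::bar.member" from position 0 yields the parts foo and bar and stops at the '.', whereas "foo::" throws instead of quietly succeeding with just foo.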