How can i implement const in Boost Spirit? - c++

I am interested in Boost Spirit nowadays and trying to build something. Can we implement something like a const in C++ using Spirit? For instance, user will define an item like;
constant var PROG_LANG="Java";
"constant var" seems weird, I accept but you got the idea. I searched the internet but can't found anything about it.

What the BigBoss said :)
Only I'd do without the semantic actions - making it far less... verbose (See also Boost Spirit: "Semantic actions are evil"?):
vdef =
("constant" >> attr(true) | attr(false)) >>
"var" >> identifier >> '=' >> identifier_value >> ';' ;
That's all. This uses qi::attr to account for the default (missing constant keyword).
Here's a full demo with output:
http://liveworkspace.org/code/c9e4bef100d2249eb4d4b88205f85c4b
Output:
parse success: 'var myvariable = "has some value";'
data: false;myvariable;has some value;
parse success: 'constant var myvariable = "has some value";'
data: true;myvariable;has some value;
Code:
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx = boost::phoenix;
struct var_definition {
bool is_constant;
std::string name;
std::string value;
var_definition() : is_constant( false ) {}
};
BOOST_FUSION_ADAPT_STRUCT(var_definition, (bool, is_constant)(std::string, name)(std::string, value))
void doParse(const std::string& input)
{
typedef std::string::const_iterator It;
qi::rule<It, std::string()> identifier, identifier_value;
qi::rule<It, var_definition(), qi::space_type> vdef;
{
using namespace qi;
identifier_value = '"' >> lexeme [ +~char_('"') ] > '"';
identifier = lexeme [ +graph ];
vdef =
("constant" >> attr(true) | attr(false)) >>
"var" >> identifier >> '=' >> identifier_value >> ';' ;
}
var_definition data;
It f(std::begin(input)), l(std::end(input));
bool ok = qi::phrase_parse(f,l,vdef,qi::space,data);
if (ok)
{
std::cout << "parse success: '" << input << "'\n";
std::cout << "data: " << karma::format_delimited(karma::auto_, ';', data) << "\n";
}
}
int main()
{
doParse("var myvariable = \"has some value\";");
doParse("constant var myvariable = \"has some value\";");
}

I don't get your question correctly, spirit is a parser and it has nothing to do with the meaning of constant it can only parse it, but if you mean parse an optional variable like constant then it can be something line:
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
typedef std::string::const_iterator it;
struct var_definition {
bool is_constant;
std::string name;
std::string value;
var_definition() : is_constant( false ) {}
};
qi::rule<it, std::string()> identifier;
qi::rule<it, std::string()> identifier_value;
qi::rule<it, var_definition(), boost::spirit::ascii::space_type> vdef;
void mark_var_as_constant(var_definition& vd) {vd.is_constant=true;}
void set_var_name(var_definition& vd, std::string const& val) {vd.name=val;}
void set_var_value(var_definition& vd, std::string const& val) {vd.value=val;}
vdef %=
-qi::lit("constant")[phx::bind(mark_var_as_constant, qi::_val)] >>
qi::lit("var") >>
identifier[phx::bind(set_var_name, qi::_val, qi::_1)] >>
qi::char_('=') >>
identifier_value[phx::bind(set_var_value, qi::_val, qi::_1)] >>
qi::char_(';');
Of course there are other ways, for example:
(qi::lit("constant")[phx::bind(mark_var_as_constant, qi::_val)] | qi::eps)
And the easiest is:
qi::hold[ qi::lit("constant")[phx::bind(mark_var_as_constant, qi::_val)] ]

Related

Boost Spirit template specialization failure

Below is a very compact version of a grammar I'm trying to write using boost::spirit::qi.
Environment: VS2013, x86, Boost1.64
When #including the header file, the compiler complains about the line
rBlock = "{" >> +(rInvocation) >> "}";
with a very long log (I've only copied the beginning and the end):
more than one partial specialization matches the template argument list
...
...
see reference to function template instantiation
'boost::spirit::qi::rule
&boost::spirit::qi::rule::operator =>(const Expr &)' being compiled
Where is my mistake?
The header file:
//mygrammar.h
#pragma once
#include <boost/spirit/include/qi.hpp>
namespace myNS
{
typedef std::string Identifier;
typedef ::boost::spirit::qi::rule <const char*, Identifier()> myIdentifierRule;
typedef ::boost::variant<char, int> Expression;
typedef ::boost::spirit::qi::rule <const char*, Expression()> myExpressionRule;
struct IdntifierEqArgument
{
Identifier ident;
Expression arg;
};
typedef ::boost::variant < IdntifierEqArgument, Expression > Argument;
typedef ::boost::spirit::qi::rule <const char*, Argument()> myArgumentRule;
typedef ::std::vector<Argument> ArgumentList;
typedef ::boost::spirit::qi::rule <const char*, myNS::ArgumentList()> myArgumentListRule;
struct Invocation
{
Identifier identifier;
::boost::optional<ArgumentList> args;
};
typedef ::boost::spirit::qi::rule <const char*, Invocation()> myInvocationRule;
typedef ::std::vector<Invocation> Block;
typedef ::boost::spirit::qi::rule <const char*, myNS::Block()> myBlockRule;
}
BOOST_FUSION_ADAPT_STRUCT(
myNS::IdntifierEqArgument,
(auto, ident)
(auto, arg)
);
BOOST_FUSION_ADAPT_STRUCT(
myNS::Invocation,
(auto, identifier)
(auto, args)
);
namespace myNS
{
struct myRules
{
myIdentifierRule rIdentifier;
myExpressionRule rExpression;
myArgumentRule rArgument;
myArgumentListRule rArgumentList;
myInvocationRule rInvocation;
myBlockRule rBlock;
myRules()
{
using namespace ::boost::spirit;
using namespace ::boost::spirit::qi;
rIdentifier = as_string[((qi::alpha | '_') >> *(qi::alnum | '_'))];
rExpression = char_ | int_;
rArgument = (rIdentifier >> "=" >> rExpression) | rExpression;
rArgumentList = rArgument >> *("," >> rArgument);
rInvocation = rIdentifier >> "(" >> -rArgumentList >> ")";
rBlock = "{" >> +(rInvocation) >> "}";
}
};
}
I'm not exactly sure where the issue is triggered, but it clearly is a symptom of too many ambiguities in the attribute forwarding rules.
Conceptually this could be triggered by your attribute types having similar/compatible layouts. In language theory, you're looking at a mismatch between C++'s nominative type system versus the approximation of structural typing in the attribute propagation system. But enough theorism :)
I don't think attr_cast<> will save you here as it probably uses the same mechanics and heuristics under the hood.
It drew my attention that making the ArgumentList optional is ... not very useful (as an empty list already accurately reflects absense of arguments).
So I tried simplifying the rules:
rArgumentList = -(rArgument % ',');
rInvocation = rIdentifier >> '(' >> rArgumentList >> ')';
And the declared attribute type can be simply ArgumentList instead of boost::optional::ArgumentList.
This turns out to remove the ambiguity when propagating into the vector<Invocation>, so ... you're saved.
If this feels "accidental" to you, you should! What would I do if this hadn't removed the ambiguity "by chance"? I'd have created a semantic action to propagate the Invocation by simpler mechanics. There's a good chance that fusion::push_back(_val, _1) or similar would have worked.
See also Boost Spirit: "Semantic actions are evil"?
Review And Demo
In the cleaned up review here I present a few fixes/improvements and a test run that dumps the parsed AST.
Separate AST from parser (you don't want use qi in the AST types. You specifically do not want using namespace directives in the face of generic template libraries)
Do not use auto in the adapt macros. That's not a feature. Instead, since you can ostensibly use C++11, use the C++11 (decltype) based macros
BOOST_FUSION_ADAPT_STRUCT(myAST::IdntifierEqArgument, ident,arg);
BOOST_FUSION_ADAPT_STRUCT(myAST::Invocation, identifier,args);
AST is leading (also, prefer c++11 for clarity):
namespace myAST {
using Identifier = std::string;
using Expression = boost::variant<char, int>;
struct IdntifierEqArgument {
Identifier ident;
Expression arg;
};
using Argument = boost::variant<IdntifierEqArgument, Expression>;
using ArgumentList = std::vector<Argument>;
struct Invocation {
Identifier identifier;
ArgumentList args;
};
using Block = std::vector<Invocation>;
}
It's nice to have the definitions separate
Regarding the parser,
I'd prefer the qi::grammar convention. Also,
You didn't declare any of the rules with a skipper. I "guessed" from context that whitespace is insignificant outside of the rules for Expression and Identifier.
Expression ate every char_, so also would eat ')' or even '3'. I noticed this only when testing and after debugging with:
//#define BOOST_SPIRIT_DEBUG
BOOST_SPIRIT_DEBUG_NODES((start)(rBlock)(rInvocation)(rIdentifier)(rArgumentList)(rArgument)(rExpression))
I highly recommend using these facilities
All in all the parser comes down to
namespace myNS {
namespace qi = boost::spirit::qi;
template <typename Iterator = char const*>
struct myRules : qi::grammar<Iterator, myAST::Block()> {
myRules() : myRules::base_type(start) {
rIdentifier = qi::raw [(qi::alpha | '_') >> *(qi::alnum | '_')];
rExpression = qi::alpha | qi::int_;
rArgument = (rIdentifier >> '=' >> rExpression) | rExpression;
rArgumentList = -(rArgument % ',');
rInvocation = rIdentifier >> '(' >> rArgumentList >> ')';
rBlock = '{' >> +rInvocation >> '}';
start = qi::skip(qi::space) [ rBlock ];
BOOST_SPIRIT_DEBUG_NODES((start)(rBlock)(rInvocation)(rIdentifier)(rArgumentList)(rArgument)(rExpression))
}
private:
qi::rule<Iterator, myAST::Block()> start;
using Skipper = qi::space_type;
qi::rule<Iterator, myAST::Argument(), Skipper> rArgument;
qi::rule<Iterator, myAST::ArgumentList(), Skipper> rArgumentList;
qi::rule<Iterator, myAST::Invocation(), Skipper> rInvocation;
qi::rule<Iterator, myAST::Block(), Skipper> rBlock;
// implicit lexemes
qi::rule<Iterator, myAST::Identifier()> rIdentifier;
qi::rule<Iterator, myAST::Expression()> rExpression;
};
}
Adding a test driver
int main() {
std::string const input = R"(
{
foo()
bar(a, b, 42)
qux(someThing_awful01 = 9)
}
)";
auto f = input.data(), l = f + input.size();
myAST::Block block;
bool ok = parse(f, l, myNS::myRules<>{}, block);
if (ok) {
std::cout << "Parse success\n";
for (auto& invocation : block) {
std::cout << invocation.identifier << "(";
for (auto& arg : invocation.args) std::cout << arg << ",";
std::cout << ")\n";
}
}
else
std::cout << "Parse failed\n";
if (f!=l)
std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
}
Complete Demo
See it Live On Coliru
//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
namespace myAST {
using Identifier = std::string;
using Expression = boost::variant<char, int>;
struct IdntifierEqArgument {
Identifier ident;
Expression arg;
};
using Argument = boost::variant<IdntifierEqArgument, Expression>;
using ArgumentList = std::vector<Argument>;
struct Invocation {
Identifier identifier;
ArgumentList args;
};
using Block = std::vector<Invocation>;
// for debug printing
static inline std::ostream& operator<<(std::ostream& os, myAST::IdntifierEqArgument const& named) {
return os << named.ident << "=" << named.arg;
}
}
BOOST_FUSION_ADAPT_STRUCT(myAST::IdntifierEqArgument, ident,arg);
BOOST_FUSION_ADAPT_STRUCT(myAST::Invocation, identifier,args);
namespace myNS {
namespace qi = boost::spirit::qi;
template <typename Iterator = char const*>
struct myRules : qi::grammar<Iterator, myAST::Block()> {
myRules() : myRules::base_type(start) {
rIdentifier = qi::raw [(qi::alpha | '_') >> *(qi::alnum | '_')];
rExpression = qi::alpha | qi::int_;
rArgument = (rIdentifier >> '=' >> rExpression) | rExpression;
rArgumentList = -(rArgument % ',');
rInvocation = rIdentifier >> '(' >> rArgumentList >> ')';
rBlock = '{' >> +rInvocation >> '}';
start = qi::skip(qi::space) [ rBlock ];
BOOST_SPIRIT_DEBUG_NODES((start)(rBlock)(rInvocation)(rIdentifier)(rArgumentList)(rArgument)(rExpression))
}
private:
qi::rule<Iterator, myAST::Block()> start;
using Skipper = qi::space_type;
qi::rule<Iterator, myAST::Argument(), Skipper> rArgument;
qi::rule<Iterator, myAST::ArgumentList(), Skipper> rArgumentList;
qi::rule<Iterator, myAST::Invocation(), Skipper> rInvocation;
qi::rule<Iterator, myAST::Block(), Skipper> rBlock;
// implicit lexemes
qi::rule<Iterator, myAST::Identifier()> rIdentifier;
qi::rule<Iterator, myAST::Expression()> rExpression;
};
}
int main() {
std::string const input = R"(
{
foo()
bar(a, b, 42)
qux(someThing_awful01 = 9)
}
)";
auto f = input.data(), l = f + input.size();
myAST::Block block;
bool ok = parse(f, l, myNS::myRules<>{}, block);
if (ok) {
std::cout << "Parse success\n";
for (auto& invocation : block) {
std::cout << invocation.identifier << "(";
for (auto& arg : invocation.args) std::cout << arg << ",";
std::cout << ")\n";
}
}
else
std::cout << "Parse failed\n";
if (f!=l)
std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
}
Prints output
Parse success
foo()
bar(a,b,42,)
qux(someThing_awful01=9,)
Remaining unparsed input: '
'

boost::spirit parsing into a fusion adapted structure optional but exclusive

If there's a structure:
struct record
{
std::string type;
std::string delimiter;
uint32_t length;
std::string name;
record()
{
type = "";
delimiter = "";
length = 0;
name = "";
}
};
Which is adapted using boost::fusion, and the below grammar:
struct record_parser : qi::grammar<Iterator, record(), ascii::space_type>
{
record_parser() : record_parser::base_type(start)
{
using qi::lit;
using qi::uint_;
using qi::lexeme;
using ascii::char_;
using ascii::blank;
using ascii::string;
using qi::attr;
using qi::eps;
type %= lexeme[+(char_ - (blank|char('(')))];
delimiter_double_quote %= char('(') >> lexeme[char('"') >> +(char_ - char('"')) >> char('"') ] >> char(')');
delimiter_single_quote %= char('(') >> lexeme[char('\'') >> +(char_ - char('\'')) >> char('\'')] >> char(')');
delimiter %= (delimiter_double_quote | delimiter_single_quote);
name %= lexeme[+(char_ - (blank|char(';')))] >> char(';');
length %= (char('(') >> uint_ >> char(')'));
start %=
eps >
lit("record")
>> char('{')
>> type
>> (delimiter | attr("")) >> (length | attr(0))
>> name
>> char('}')
;
}
qi::rule<Iterator, std::string(), ascii::space_type> type;
qi::rule<Iterator, std::string(), ascii::space_type> delimiter_double_quote;
qi::rule<Iterator, std::string(), ascii::space_type> delimiter_single_quote;
qi::rule<Iterator, std::string(), ascii::space_type> delimiter;
qi::rule<Iterator, uint32_t(), ascii::space_type> length;
qi::rule<Iterator, std::string(), ascii::space_type> name;
qi::rule<Iterator, record(), ascii::space_type> start;
};
I am looking to parse 'delimiter' and 'length' as optional. However, one of them has to be present, and if one is present, the other one should not exist.
For Example:
record { string(5) Alex; }
record { string("|") Alex; }
But Not:
record { string(5)("|") Alex; }
record { string Alex; }
I have attempted to do it this way, but compilation fails:
start %=
eps >
lit("record")
>> char('{')
>> type
>> ((delimiter >> attr(0)) | (attr("") >> length))
>> name
>> char('}')
;
Thank you for your help in advance. Below is the full source code:
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_object.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <string>
namespace client
{
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phoenix = boost::phoenix;
struct record
{
std::string type;
std::string delimiter;
uint32_t length;
std::string name;
record()
{
type = "";
delimiter = "";
length = 0;
name = "";
}
};
}
BOOST_FUSION_ADAPT_STRUCT(
client::record,
(std::string, type)
(std::string, delimiter)
(uint32_t, length)
(std::string, name)
)
namespace client
{
template <typename Iterator>
struct record_parser : qi::grammar<Iterator, record(), ascii::space_type>
{
record_parser() : record_parser::base_type(start)
{
using qi::lit;
using qi::uint_;
using qi::lexeme;
using ascii::char_;
using ascii::blank;
using ascii::string;
using qi::attr;
using qi::eps;
type %= lexeme[+(char_ - (blank|char('(')))];
delimiter_double_quote %= char('(') >> lexeme[char('"') >> +(char_ - char('"')) >> char('"') ] >> char(')');
delimiter_single_quote %= char('(') >> lexeme[char('\'') >> +(char_ - char('\'')) >> char('\'')] >> char(')');
delimiter %= (delimiter_double_quote | delimiter_single_quote);
name %= lexeme[+(char_ - (blank|char(';')))] >> char(';');
length %= (char('(') >> uint_ >> char(')'));
start %=
eps >
lit("record")
>> char('{')
>> type
>> (delimiter | attr("")) >> (length | attr(0))
>> name
>> char('}')
;
}
qi::rule<Iterator, std::string(), ascii::space_type> type;
qi::rule<Iterator, std::string(), ascii::space_type> delimiter_double_quote;
qi::rule<Iterator, std::string(), ascii::space_type> delimiter_single_quote;
qi::rule<Iterator, std::string(), ascii::space_type> delimiter;
qi::rule<Iterator, uint32_t(), ascii::space_type> length;
qi::rule<Iterator, std::string(), ascii::space_type> name;
qi::rule<Iterator, record(), ascii::space_type> start;
};
}
////////////////////////////////////////////////////////////////////////////
// Main program
////////////////////////////////////////////////////////////////////////////
int main()
{
std::string storage = "record { string(5) Alex; }";
using boost::spirit::ascii::space;
typedef std::string::const_iterator iterator_type;
typedef client::record_parser<iterator_type> record_parser;
record_parser g; // Our grammar
client::record rec;
std::string::const_iterator iter = storage.begin();
std::string::const_iterator end = storage.end();
bool r = phrase_parse(iter, end, g, space, rec);
if (r && iter == end)
{
std::cout << boost::fusion::tuple_open('[');
std::cout << boost::fusion::tuple_close(']');
std::cout << boost::fusion::tuple_delimiter(", ");
std::cout << "-------------------------\n";
std::cout << "Parsing succeeded\n";
std::cout << "got: " << boost::fusion::as_vector(rec) << std::endl;
std::cout << "\n-------------------------\n";
}
else
{
std::string::const_iterator some = iter+30;
std::string context(iter, (some>end)?end:some);
std::cout << "-------------------------\n";
std::cout << "Parsing failed\n";
std::cout << "stopped at -->" << context << "...\n";
std::cout << "-------------------------\n";
}
return 0;
}
You can just write out the combinations:
>> (
delimiter >> attr(0)
| attr("") >> length
| attr("") >> attr(0)
)
The best way to make it work with automatic attribute propagation is to use an AST structure that is similar:
namespace client {
struct record {
std::string type;
struct param_t {
std::string delimiter;
uint32_t length = 0;
} param;
std::string name;
};
}
BOOST_FUSION_ADAPT_STRUCT(client::record::param_t, delimiter, length)
BOOST_FUSION_ADAPT_STRUCT(client::record, type, param, name)
Full Demo Live On Coliru
Note how much simpler the grammar has been made (all those char(' ') things are unnecessary; use lexemes only if you declare a skipper; use ~char_ instead of character set subtraction; use graph instead of char_ - space etc.).
type = +(graph - '(');
delimiter_double_quote = '"' >> +~char_('"') >> '"' ;
delimiter_single_quote = "'" >> +~char_("'") >> "'" ;
delimiter = '(' >> (delimiter_double_quote | delimiter_single_quote) >> ')';
name = +(graph - ';');
length = '(' >> uint_ >> ')';
start = eps > lit("record") >> '{'
>> type
>> (
delimiter >> attr(0)
| attr("") >> length
| attr("") >> attr(0)
)
>> name >> ';' >> '}'
;
Full code:
Live On Coliru
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <string>
namespace qi = boost::spirit::qi;
namespace client {
struct record {
std::string type;
struct param_t {
std::string delimiter;
uint32_t length = 0;
} param;
std::string name;
};
}
BOOST_FUSION_ADAPT_STRUCT(client::record::param_t, delimiter, length)
BOOST_FUSION_ADAPT_STRUCT(client::record, type, param, name)
namespace client {
std::ostream& operator<<(std::ostream& os, record::param_t const& v) { return os << boost::fusion::as_vector(v); }
std::ostream& operator<<(std::ostream& os, record const& v) { return os << boost::fusion::as_vector(v); }
}
namespace client
{
template <typename Iterator, typename Skipper = qi::ascii::space_type>
struct record_parser : qi::grammar<Iterator, record(), Skipper>
{
record_parser() : record_parser::base_type(start)
{
using namespace qi;
type = +(graph - '(');
delimiter_double_quote = '"' >> +~char_('"') >> '"' ;
delimiter_single_quote = "'" >> +~char_("'") >> "'" ;
delimiter = '(' >> (delimiter_double_quote | delimiter_single_quote) >> ')';
name = +(graph - ';');
length = '(' >> uint_ >> ')';
start = eps > lit("record") >> '{'
>> type
>> (
delimiter >> attr(0)
| attr("") >> length
| attr("") >> attr(0)
)
>> name >> ';' >> '}'
;
}
private:
qi::rule<Iterator, record(), Skipper> start;
qi::rule<Iterator, uint32_t(), Skipper> length;
qi::rule<Iterator, std::string(), Skipper> delimiter;
// lexemes
qi::rule<Iterator, std::string()> type, delimiter_double_quote, delimiter_single_quote, name;
};
}
int main()
{
for (std::string const storage : {
"record { string(5) Alex; }",
"record { string(\"|\") Alex; }",
})
{
typedef std::string::const_iterator iterator_type;
typedef client::record_parser<iterator_type> record_parser;
record_parser g; // Our grammar
client::record rec;
auto iter = storage.begin(), end = storage.end();
bool r = phrase_parse(iter, end, g, qi::ascii::space, rec);
if (r) {
std::cout << "Parsing succeeded: " << rec << std::endl;
} else {
std::cout << "Parsing failed\n";
}
if (iter != end) {
std::cout << "Remaining: '" << std::string(iter, end) << "'...\n";
}
}
}
Prints:
Parsing succeeded: (string ( 5) Alex)
Parsing succeeded: (string (| 0) Alex)
Because it's 2016, adding a X3 example too. Once again, taking the variant approach, which I find to be typical in Spirit code.
namespace AST {
struct record {
std::string type;
boost::variant<std::string, uint32_t> param;
std::string name;
};
}
BOOST_FUSION_ADAPT_STRUCT(AST::record, type, param, name)
namespace parser {
using namespace x3;
auto quoted = [](char q) { return q >> +~char_(q) >> q; };
static auto const type = +(graph - '(');
static auto const delimiter = '(' >> (quoted('"') | quoted('\'')) >> ')';
static auto const name = +(graph - ';');
static auto const length = '(' >> uint_ >> ')';
static auto const start = lit("record") >> '{' >> type >> (delimiter | length) >> name >> ';' >> '}';
}
That's all. The calling code is virtually unchanged:
int main()
{
for (std::string const storage : {
"record { string(5) Alex; }",
"record { string(\"|\") Alex; }",
"record { string Alex; }",
})
{
typedef std::string::const_iterator iterator_type;
AST::record rec;
auto iter = storage.begin(), end = storage.end();
bool r = phrase_parse(iter, end, parser::start, x3::ascii::space, rec);
if (r) {
std::cout << "Parsing succeeded: " << boost::fusion::as_vector(rec) << std::endl;
} else {
std::cout << "Parsing failed\n";
}
if (iter != end) {
std::cout << "Remaining: '" << std::string(iter, end) << "'\n";
}
}
}
Everything compiles a lot quicker and I'd not be surprised if the resultant code was at least twice as fast at runtime too.
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
namespace x3 = boost::spirit::x3;
namespace AST {
struct record {
std::string type;
boost::variant<std::string, uint32_t> param;
std::string name;
};
}
BOOST_FUSION_ADAPT_STRUCT(AST::record, type, param, name)
namespace parser {
using namespace x3;
auto quoted = [](char q) { return q >> +~char_(q) >> q; };
static auto const type = +(graph - '(');
static auto const delimiter = '(' >> (quoted('"') | quoted('\'')) >> ')';
static auto const name = +(graph - ';');
static auto const length = '(' >> uint_ >> ')';
static auto const start = lit("record") >> '{' >> type >> (delimiter | length) >> name >> ';' >> '}';
}
#include <iostream>
#include <boost/fusion/include/io.hpp>
#include <boost/fusion/include/as_vector.hpp>
#include <boost/optional/optional_io.hpp>
int main()
{
for (std::string const storage : {
"record { string(5) Alex; }",
"record { string(\"|\") Alex; }",
"record { string Alex; }",
})
{
typedef std::string::const_iterator iterator_type;
AST::record rec;
auto iter = storage.begin(), end = storage.end();
bool r = phrase_parse(iter, end, parser::start, x3::ascii::space, rec);
if (r) {
std::cout << "Parsing succeeded: " << boost::fusion::as_vector(rec) << std::endl;
} else {
std::cout << "Parsing failed\n";
}
if (iter != end) {
std::cout << "Remaining: '" << std::string(iter, end) << "'\n";
}
}
}
Prints
Parsing succeeded: (string 5 Alex)
Parsing succeeded: (string | Alex)
Parsing failed
Remaining: 'record { string Alex; }'
sehe's first answer is perfect (or would be if he corrected what he realized in the comments), but I just wanted to add an explanation of the problem and a possible alternative. The code below is based on that excellent answer.
You have a couple of problems with the attributes of your start rule. The attribute you want to get is record which is basically tuple<string,string,uint32_t,string>. Let's see the attributes of several parsers:
Something similar (but simpler) to your original rule:
Attribute of: "lit("record") >> char_('{') >> type >> delimiter >> length >> name >> char_('}')"
tuple<char,string,string,uint32_t,string,char>
As you can see you have two extra char caused b your use of char_(has an attribute of char) instead of lit(has no attribute). omit[char_] could also work, but would be a little silly.
Let's change char_ to lit:
Attribute of: "lit("record") >> lit('{') >> type >> delimiter >> length >> name >> lit('}')"
tuple<string,string,uint32_t,string>
Which is what we want.
Your original rule with lit:
Attribute of: "lit("record") >> lit('{') >> type >> (delimiter | attr("")) >> (length | attr(0)) >> name >> lit('}')"
tuple<string,variant<string,char const (&)[1]>,variant<uint32_t,int>,string>
Since the branches of | aren't identical, you get variants instead of the attribute you want. (In this simple case everything works as if there were no variants though)
Let's remove the variants (since they cause errors in more complex scenarios):
Attribute of: "lit("record") >> lit('{') >> type >> (delimiter | attr(string())) >> (length | attr(uint32_t())) >> name >> lit('}')"
tuple<string,string,uint32_t,string>
This works in the cases you want but also when both are missing.
sehe's approach:
Attribute of: "lit("record") >> lit('{') >> type >> ((delimiter >> attr(uint32_t())) | (attr(string()) >> length)) >> name >> lit('}')"
tuple<string,tuple<string,uint32_t>,string>
Looking at this synthesized attribute you can see the need to create the param_t helper struct to make your record attribute match.
See on Coliru a way to "calculate" the previous attributes.
The possible alternative is a custom directive using boost::fusion::flatten_view. Keep in mind that this directive has very little testing so I would recommend the approach shown by sehe, but it seems to work (at least in this case).
The example in this question with this directive on Wandbox
Several other examples where this directive can be useful
flatten_directive.hpp
#pragma once
#include <boost/spirit/home/qi/meta_compiler.hpp>
#include <boost/spirit/home/qi/skip_over.hpp>
#include <boost/spirit/home/qi/parser.hpp>
#include <boost/spirit/home/support/unused.hpp>
#include <boost/spirit/home/support/common_terminals.hpp>
#include <boost/spirit/home/qi/detail/attributes.hpp>
#include <boost/spirit/home/support/info.hpp>
#include <boost/spirit/home/support/handles_container.hpp>
#include <boost/fusion/include/flatten_view.hpp>
#include <boost/fusion/include/for_each.hpp>
#include <boost/fusion/include/zip_view.hpp>
namespace custom
{
BOOST_SPIRIT_TERMINAL(flatten);
}
namespace boost {
namespace spirit
{
///////////////////////////////////////////////////////////////////////////
// Enablers
///////////////////////////////////////////////////////////////////////////
template <>
struct use_directive<qi::domain, custom::tag::flatten> // enables flatten
: mpl::true_ {};
}
}
namespace custom
{
template <typename Subject>
struct flatten_directive : boost::spirit::qi::unary_parser<flatten_directive<Subject> >
{
typedef Subject subject_type;
flatten_directive(Subject const& subject)
: subject(subject) {}
template <typename Context, typename Iterator>
struct attribute
{
typedef boost::fusion::flatten_view<typename
boost::spirit::traits::attribute_of<subject_type, Context, Iterator>::type>
type;//the attribute of the directive is a flatten_view of whatever is the attribute of the subject
};
template <typename Iterator, typename Context
, typename Skipper, typename Attribute>
bool parse(Iterator& first, Iterator const& last
, Context& context, Skipper const& skipper
, Attribute& attr) const
{
Iterator temp = first;
boost::spirit::qi::skip_over(first, last, skipper);
typename boost::spirit::traits::attribute_of<subject_type, Context, Iterator>::type original_attr;
if (subject.parse(first, last, context, skipper, original_attr))//parse normally
{
typename attribute<Context, Iterator>::type flattened_attr(original_attr);//flatten the attribute
typedef boost::fusion::vector<Attribute&,typename attribute<Context,Iterator>::type&> sequences;
boost::fusion::for_each(//assign to each element of Attribute the corresponding element of the flattened sequence
boost::fusion::zip_view<sequences>(
sequences(attr,flattened_attr)
)
,
[](const auto& pair)//substitute with a functor with templated operator() to support c++98/03
{
boost::spirit::traits::assign_to(boost::fusion::at_c<1>(pair),boost::fusion::at_c<0>(pair));
}
);
return true;
}
first = temp;
return false;
}
template <typename Context>
boost::spirit::info what(Context& context) const
{
return info("flatten", subject.what(context));
}
Subject subject;
};
}//custom
///////////////////////////////////////////////////////////////////////////
// Parser generators: make_xxx function (objects)
///////////////////////////////////////////////////////////////////////////
namespace boost {
namespace spirit {
namespace qi
{
template <typename Subject, typename Modifiers>
struct make_directive<custom::tag::flatten, Subject, Modifiers>
{
typedef custom::flatten_directive<Subject> result_type;
result_type operator()(unused_type, Subject const& subject, unused_type) const
{
return result_type(subject);
}
};
}
}
}
namespace boost {
namespace spirit {
namespace traits
{
///////////////////////////////////////////////////////////////////////////
template <typename Subject>
struct has_semantic_action<custom::flatten_directive<Subject> >
: unary_has_semantic_action<Subject> {};
///////////////////////////////////////////////////////////////////////////
template <typename Subject, typename Attribute, typename Context
, typename Iterator>
struct handles_container<custom::flatten_directive<Subject>, Attribute
, Context, Iterator>
: unary_handles_container<Subject, Attribute, Context, Iterator> {};
}
}
}
main.cpp
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/fusion/include/flatten_view.hpp>
#include <boost/fusion/include/copy.hpp>
#include "flatten_directive.hpp"
#include <string>
namespace qi = boost::spirit::qi;
namespace client {
struct record {
std::string type;
std::string delimiter;
uint32_t length = 0;
std::string name;
};
}
BOOST_FUSION_ADAPT_STRUCT(client::record, type, delimiter, length, name)
namespace client {
std::ostream& operator<<(std::ostream& os, record const& v) { return os << boost::fusion::tuple_open('[') << boost::fusion::tuple_close(']') << boost::fusion::tuple_delimiter(", ") << boost::fusion::as_vector(v); }
}
namespace client
{
template <typename Iterator, typename Skipper = qi::ascii::space_type>
struct record_parser : qi::grammar<Iterator, record(), Skipper>
{
record_parser() : record_parser::base_type(start)
{
using namespace qi;
type = +(graph - '(');
delimiter_double_quote = '"' >> +~char_('"') >> '"';
delimiter_single_quote = "'" >> +~char_("'") >> "'";
delimiter = '(' >> (delimiter_double_quote | delimiter_single_quote) >> ')';
name = +(graph - ';');
length = '(' >> uint_ >> ')';
start =
custom::flatten[
lit("record")
>> '{'
>> type
>> (
delimiter >> attr(uint32_t())//the attributes of both branches must be exactly identical
| attr(std::string("")) >> length//const char[1]!=std::string int!=uint32_t
)
>> name
>> ';'
>> '}'
]
;
}
private:
qi::rule<Iterator, record(), Skipper> start;
qi::rule<Iterator, uint32_t(), Skipper> length;
qi::rule<Iterator, std::string(), Skipper> delimiter;
// lexemes
qi::rule<Iterator, std::string()> type, delimiter_double_quote, delimiter_single_quote, name;
};
}
int main()
{
for (std::string const storage : {
"record { string(5) Alex; }",
"record { string(\"|\") Alex; }",
"record { string Alex; }",
"record { string (\"|\")(5) Alex; }"
})
{
typedef std::string::const_iterator iterator_type;
typedef client::record_parser<iterator_type> record_parser;
record_parser g; // Our grammar
client::record rec;
auto iter = storage.begin(), end = storage.end();
bool r = phrase_parse(iter, end, g, qi::ascii::space, rec);
if (r) {
std::cout << "Parsing succeeded: " << rec << std::endl;
}
else {
std::cout << "Parsing failed\n";
}
if (iter != end) {
std::cout << "Remaining: '" << std::string(iter, end) << "'...\n";
}
}
}
Here's the more typical approach that parsed variant<std::string, uint32_t> so the AST reflects that only one can be present:
With Nil-Param
With the same misunderstanding as in my first answer, allowing both params to be optional:
Live On Coliru
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/optional/optional_io.hpp>
#include <string>
namespace qi = boost::spirit::qi;
namespace client {
struct nil { friend std::ostream& operator<<(std::ostream& os, nil) { return os << "(nil)"; } };
struct record {
std::string type;
boost::variant<nil, std::string, uint32_t> param;
std::string name;
};
}
BOOST_FUSION_ADAPT_STRUCT(client::record, type, param, name)
namespace client
{
template <typename Iterator, typename Skipper = qi::ascii::space_type>
struct record_parser : qi::grammar<Iterator, record(), Skipper>
{
record_parser() : record_parser::base_type(start)
{
using namespace qi;
type = +(graph - '(');
delimiter_double_quote = '"' >> +~char_('"') >> '"' ;
delimiter_single_quote = "'" >> +~char_("'") >> "'" ;
delimiter = '(' >> (delimiter_double_quote | delimiter_single_quote) >> ')';
name = +(graph - ';');
length = '(' >> uint_ >> ')';
start = eps > lit("record") >> '{'
>> type
>> (delimiter | length | attr(nil{}))
>> name >> ';' >> '}'
;
}
private:
qi::rule<Iterator, record(), Skipper> start;
qi::rule<Iterator, uint32_t(), Skipper> length;
qi::rule<Iterator, std::string(), Skipper> delimiter;
// lexemes
qi::rule<Iterator, std::string()> type, delimiter_double_quote, delimiter_single_quote, name;
};
}
int main()
{
for (std::string const storage : {
"record { string(5) Alex; }",
"record { string(\"|\") Alex; }",
"record { string Alex; }",
})
{
typedef std::string::const_iterator iterator_type;
typedef client::record_parser<iterator_type> record_parser;
record_parser g; // Our grammar
client::record rec;
auto iter = storage.begin(), end = storage.end();
bool r = phrase_parse(iter, end, g, qi::ascii::space, rec);
if (r) {
std::cout << "Parsing succeeded: " << boost::fusion::as_vector(rec) << std::endl;
} else {
std::cout << "Parsing failed\n";
}
if (iter != end) {
std::cout << "Remaining: '" << std::string(iter, end) << "'...\n";
}
}
}
Prints
Parsing succeeded: (string 5 Alex)
Parsing succeeded: (string | Alex)
Parsing succeeded: (string (nil) Alex)
Without Nil-Param
Requiring exactly one:
Live On Coliru
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/optional/optional_io.hpp>
#include <string>
namespace qi = boost::spirit::qi;
namespace client {
struct record {
std::string type;
boost::variant<std::string, uint32_t> param;
std::string name;
};
}
BOOST_FUSION_ADAPT_STRUCT(client::record, type, param, name)
namespace client
{
template <typename Iterator, typename Skipper = qi::ascii::space_type>
struct record_parser : qi::grammar<Iterator, record(), Skipper>
{
record_parser() : record_parser::base_type(start)
{
using namespace qi;
type = +(graph - '(');
delimiter_double_quote = '"' >> +~char_('"') >> '"' ;
delimiter_single_quote = "'" >> +~char_("'") >> "'" ;
delimiter = '(' >> (delimiter_double_quote | delimiter_single_quote) >> ')';
name = +(graph - ';');
length = '(' >> uint_ >> ')';
start = eps > lit("record") >> '{'
>> type
>> (delimiter | length)
>> name >> ';' >> '}'
;
}
private:
qi::rule<Iterator, record(), Skipper> start;
qi::rule<Iterator, uint32_t(), Skipper> length;
qi::rule<Iterator, std::string(), Skipper> delimiter;
// lexemes
qi::rule<Iterator, std::string()> type, delimiter_double_quote, delimiter_single_quote, name;
};
}
int main()
{
for (std::string const storage : {
"record { string(5) Alex; }",
"record { string(\"|\") Alex; }",
"record { string Alex; }",
})
{
typedef std::string::const_iterator iterator_type;
typedef client::record_parser<iterator_type> record_parser;
record_parser g; // Our grammar
client::record rec;
auto iter = storage.begin(), end = storage.end();
bool r = phrase_parse(iter, end, g, qi::ascii::space, rec);
if (r) {
std::cout << "Parsing succeeded: " << boost::fusion::as_vector(rec) << std::endl;
} else {
std::cout << "Parsing failed\n";
}
if (iter != end) {
std::cout << "Remaining: '" << std::string(iter, end) << "'...\n";
}
}
}
Prints
Parsing succeeded: (string 5 Alex)
Parsing succeeded: (string | Alex)
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::spirit::qi::expectation_failure<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > >'
what(): boost::spirit::qi::expectation_failure

Boost spirit parsing objective-C like language

I'm trying to use boost's spirit to reimplement the logos parsing perl script from iPhone jailbreaking development.
An example of input is:
%hook SBLockScreenView
-(void)setCustomSlideToUnlockText:(id)arg1
{
arg1 = #"Changed the slider";
%orig(arg1);
}
%end
I so far have:
namespace logos
{
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
struct class_hook
{
std::string class_name;
std::string method_signature;
std::string method_body;
};
template <typename Iterator>
struct class_hook_parser : qi::grammar<Iterator, class_hook(), ascii::space_type>
{
class_hook_parser() : class_hook_parser::base_type(start)
{
using qi::int_;
using qi::lit;
using qi::on_error;
using qi::fail;
using qi::double_;
using qi::lexeme;
using ascii::char_;
hooked_class %= lexeme[+(char_("a-zA-Z") - '-')];
method_sig %= lexeme[+(char_) - '{'];
method_body %= lexeme[+(char_ - '}')];
start %=
lit("%hook")
>> hooked_class
>> method_sig
>> method_body
>> lit("%end")
;
on_error<fail>
(
start,
boost::phoenix::ref(std::cout) << "Something errored!" << std::endl);
}
qi::rule<Iterator, std::string(), ascii::space_type> hooked_class;
qi::rule<Iterator, std::string(), ascii::space_type> method_sig;
qi::rule<Iterator, std::string(), ascii::space_type> method_body;
qi::rule<Iterator, class_hook(), ascii::space_type> start;
};
}
BOOST_FUSION_ADAPT_STRUCT(logos::class_hook,
(std::string, class_name)
(std::string, method_signature)
(std::string, method_body))
typedef std::string::const_iterator iterator_type;
typedef logos::class_hook_parser<iterator_type> class_hook_parser;
using boost::spirit::ascii::space;
std::string::const_iterator
iter = std::begin(tweak_source_code),
end = std::end(tweak_source_code);
class_hook_parser g;
logos::class_hook emp;
bool r = phrase_parse(iter, end, g, space, emp);
if (r) {
std::cout << "Got: " << boost::fusion::as_vector(emp) << std::endl;
}
else std::cout << "Something isn't working" << std::endl;
But this oddly only prints out the Something isn't working message, not the on_fail callback. Where is my mistake in the parsing and how can I get actually working and informative parse error messages?
Did you mean
+(char_ - '{')
instead of
+(char_) - '{'
And likely, you'd require the body to begin with that { that was rejected as part of the signature. Here's my fixed version:
hooked_class = +char_("a-zA-Z");
method_sig = +(char_ - '{');
method_body = '{' >> +(char_ - '}') >> '}';
Notes:
Dropping the skipper allows you to drop the lexeme[] directive too.
Rejecting - from the "a-zA-Z" set is useless (it's not in it...).
method_sig now includes all whitespace (including the trailing newline)
Use BOOST_SPIRIT_DEBUG to get insight in why your grammar works in mysterious ways
See also: Boost spirit skipper issues
Live On Coliru
//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace logos
{
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
struct class_hook
{
std::string class_name;
std::string method_signature;
std::string method_body;
};
template <typename Iterator>
struct class_hook_parser : qi::grammar<Iterator, class_hook(), ascii::space_type>
{
class_hook_parser() : class_hook_parser::base_type(start)
{
using qi::int_;
using qi::lit;
using qi::on_error;
using qi::fail;
using qi::double_;
using qi::lexeme;
using ascii::char_;
hooked_class = +char_("a-zA-Z");
method_sig = +(char_ - '{');
method_body = '{' >> +(char_ - '}') >> '}';
start = "%hook"
>> hooked_class
>> method_sig
>> method_body
>> "%end"
;
on_error<fail> (start,
boost::phoenix::ref(std::cout) << "Something errored\n"
);
BOOST_SPIRIT_DEBUG_NODES((hooked_class)(method_sig)(method_body)(start))
}
private:
qi::rule<Iterator, std::string()> hooked_class, method_sig, method_body;
qi::rule<Iterator, class_hook(), ascii::space_type> start;
};
}
BOOST_FUSION_ADAPT_STRUCT(logos::class_hook, class_name, method_signature, method_body)
int main() {
typedef std::string::const_iterator iterator_type;
typedef logos::class_hook_parser<iterator_type> class_hook_parser;
std::string const tweak_source_code = R"(
%hook SBLockScreenView
-(void)setCustomSlideToUnlockText:(id)arg1
{
arg1 = #"Changed the slider";
%orig(arg1);
}
%end
)";
using boost::spirit::ascii::space;
iterator_type iter = std::begin(tweak_source_code), end = std::end(tweak_source_code);
class_hook_parser g;
logos::class_hook emp;
bool r = phrase_parse(iter, end, g, space, emp);
if (r) {
std::cout << "Got: " << boost::fusion::as_vector(emp) << "\n";
} else {
std::cout << "Something isn't working\n";
}
}
Prints
Got: (SBLockScreenView -(void)setCustomSlideToUnlockText:(id)arg1
arg1 = #"Changed the slider";
%orig(arg1);
)

boost spirit qi assign value from subrule

I am trying to parse 2 different type of strings and assign values into structures. For performance I am trying to use boost spirit subrules.
strings can be of the following types
Animal Type | Animal Attributes
Ex
DOG | Name=tim | Barks=Yes | Has a Tail=N | Address=3 infinite loop
BIRD| Name=poc | Tweets=Yes| Address=10 stack overflow street
The values are stored in an array of Dog and Bird structures below
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_subrule.hpp>
#include <boost/spirit/include/qi_symbols.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <string>
#include <iostream>
using std::cout;
using std::endl;
using std::cerr;
struct Dog
{
std::string Name;
bool Barks;
bool HasATail;
std::string Address;
};
struct Bird
{
std::string Name;
bool Tweets;
std::string Address;
};
namespace qi = boost::spirit::qi;
namespace repo = boost::spirit::repository;
namespace ascii = boost::spirit::ascii;
namespace phx = boost::phoenix;
template <typename Iterator>
struct ZooGrammar : public qi::grammar<Iterator, ascii::space_type>
{
ZooGrammar() : ZooGrammar::base_type(start_)
{
using qi::char_;
using qi::lit_;
using qi::_1;
using boost::phoenix::ref;
boost::spirit::qi::symbols<char, bool> yesno_;
yesno_.add("Y", true)("N", false);
start_ = (
dog_ | bird_,
dog_ = "DOG" >> lit_[ref(d.Name) = _1]>> '|'
>>"Barks=">>yesno_[ref(d.Barks) = _1] >>'|'
>>"Has a Tail=">>yesno_[ref(d.HasATail) = _1] >> '|'
>>lit_[ref(d.Address) = _1]
,
bird_ = "BIRD" >> lit_[ref(b.Name) = _1]>> '|'
>>"Tweets=">>yesno_[ref(b.Tweets) = _1] >>'|'
>>lit_[ref(b.Address) = _1]
);
}
qi::rule<Iterator, ascii::space_type> start_;
repo::qi::subrule<0> dog_;
repo::qi::subrule<1> bird_;
Bird b;
Dog d;
};
int main()
{
std::string test1="DOG | Name=tim | Barks=Yes | Has a Tail=N | Address=3 infinite loop";
std::string test2="BIRD| Name=poc | Tweets=Yes| Address=10 stack overflow street";
using boost::spirit::ascii::space;
typedef std::string::const_iterator iterator_type;
typedef ZooGrammar<iterator_type> grammar;
iterator_type start = test1.begin();
iterator_type end = test1.end();
ZooGrammar g;
if(boost::spirit::qi::phrase_parse(start, end, g, space))
{
cout<<"matched"<<endl;
}
}
The code above crashes the compiler GCC 4.8 and 4.9. I don't know where I am making the mistake.
Please test run the code above in Coliru link
Many thanks in advance !
Subrules are a bit antiquated. To be honest, I didn't even know there was still such a thing in Spirit V2.
I suggest using regular Spirit V2 attribute propagation, which makes things a bit more readable at once:
dog_ = qi::lit("DOG") >> '|' >> "Name=" >> lit_ >> '|'
>> "Barks=" >> yesno_ >> '|'
>> "Has a Tail=" >> yesno_ >> '|'
>> "Address=" >> lit_
;
bird_ = qi::lit("BIRD") >> '|' >> "Name=" >> lit_ >> '|'
>> "Tweets=" >> yesno_ >> '|'
>> "Address=" >> lit_
;
start_ = dog_ | bird_;
I've imagined a lit_ rule (as qi::lit_ doesn't ring any bells):
lit_ = qi::lexeme [ *~qi::char_('|') ];
Of course, you need to adapt the attribute types as far as they don't have builtin support (as with boost::variant<Dog, Bird>, std::string and bool which are all handled without any additional code):
BOOST_FUSION_ADAPT_STRUCT(Dog,
(std::string, Name)(bool, Barks)(bool, HasATail)(std::string, Address))
BOOST_FUSION_ADAPT_STRUCT(Bird,
(std::string, Name)(bool, Tweets)(std::string, Address))
Now with the program extended to print some debug information, output is: Live On Coliru
Matched: [DOG|Name=tim |Barks=Yes|Has a Tail=No|Address=3 infinite loop]
Matched: [BIRD|Name=poc |Tweets=Yes|Address=10 stack overflow street]
Full Sample Code
//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_symbols.hpp>
static const char* YesNo(bool b) { return b?"Yes":"No"; }
struct Dog {
std::string Name;
bool Barks;
bool HasATail;
std::string Address;
friend std::ostream& operator <<(std::ostream& os, Dog const& o) {
return os << "[DOG|Name=" << o.Name << "|Barks=" << YesNo(o.Barks) << "|Has a Tail=" << YesNo(o.HasATail) << "|Address=" << o.Address << "]";
}
};
struct Bird {
std::string Name;
bool Tweets;
std::string Address;
friend std::ostream& operator <<(std::ostream& os, Bird const& o) {
return os << "[BIRD|Name=" << o.Name << "|Tweets=" << YesNo(o.Tweets) << "|Address=" << o.Address << "]";
}
};
typedef boost::variant<Dog, Bird> ZooAnimal;
BOOST_FUSION_ADAPT_STRUCT(Dog, (std::string, Name)(bool, Barks)(bool, HasATail)(std::string, Address))
BOOST_FUSION_ADAPT_STRUCT(Bird, (std::string, Name)(bool, Tweets)(std::string, Address))
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
template <typename Iterator>
struct ZooGrammar : public qi::grammar<Iterator, ZooAnimal(), ascii::space_type>
{
ZooGrammar() : ZooGrammar::base_type(start_)
{
using qi::_1;
yesno_.add("Yes", true)("Y", true)("No", false)("N", false);
dog_ = qi::lit("DOG") >> '|' >> "Name=" >> lit_ >> '|'
>> "Barks=" >> yesno_ >> '|'
>> "Has a Tail=" >> yesno_ >> '|'
>> "Address=" >> lit_
;
bird_ = qi::lit("BIRD") >> '|' >> "Name=" >> lit_ >> '|'
>> "Tweets=" >> yesno_ >> '|'
>> "Address=" >> lit_
;
start_ = dog_ | bird_;
lit_ = qi::lexeme [ *~qi::char_('|') ];
BOOST_SPIRIT_DEBUG_NODES((dog_)(bird_)(start_)(lit_))
}
private:
qi::rule<Iterator, ZooAnimal(), ascii::space_type> start_;
qi::rule<Iterator, std::string(), ascii::space_type> lit_;
qi::rule<Iterator, Dog(), ascii::space_type> dog_;
qi::rule<Iterator, Bird(), ascii::space_type> bird_;
qi::symbols<char, bool> yesno_;
};
int main()
{
typedef std::string::const_iterator iterator_type;
typedef ZooGrammar<iterator_type> grammar;
for (std::string const input : {
"DOG | Name=tim | Barks=Yes | Has a Tail=N | Address=3 infinite loop",
"BIRD| Name=poc | Tweets=Yes| Address=10 stack overflow street"
})
{
iterator_type start = input.begin();
iterator_type end = input.end();
grammar g;
ZooAnimal animal;
if(qi::phrase_parse(start, end, g, ascii::space, animal))
std::cout << "Matched: " << animal << "\n";
else
std::cout << "Parse failed\n";
if (start != end)
std::cout << "Remaining input: '" << std::string(start, end) << "'\n";
}
}

Parsing recursive structure on boost::spirit

I won to parse structure like "text { < > }". Spirit documentation contents similar AST example.
For parsing string like this
<tag1>text1<tag2>text2</tag1></tag2>
this code work:
templ = (tree | text) [_val = _1];
start_tag = '<'
>> !lit('/')
>> lexeme[+(char_- '>') [_val += _1]]
>>'>';
end_tag = "</"
>> string(_r1)
>> '>';
tree = start_tag [at_c<1>(_val) = _1]
>> *templ [push_back(at_c<0>(_val), _1) ]
>> end_tag(at_c<1>(_val) )
;
For parsing string like this
<tag<tag>some_text>
This code not work:
templ = (tree | text) [_val = _1];
tree = '<'
>> *templ [push_back(at_c<0>(_val), _1) ]
>> '>'
;
templ is parsing structure with recursive_wrapper inside:
namespace client {
struct tmp;
typedef boost::variant <
boost::recursive_wrapper<tmp>,
std::string
> tmp_node;
struct tmp {
std::vector<tmp_node> content;
std::string text;
};
}
BOOST_FUSION_ADAPT_STRUCT(
tmp_view::tmp,
(std::vector<tmp_view::tmp_node>, content)
(std::string,text)
)
Who may explain why it happened? Maybe who knows similar parsers wrote on boost::spirit?
Just guessing you didn't actually want to parse XML at all, but rather some kind of mixed-content markup language for hierarchical text, I'd do
simple = +~qi::char_("><");
nested = '<' >> *soup >> '>';
soup = nested|simple;
With the AST/rules defined as
typedef boost::make_recursive_variant<
boost::variant<std::string, std::vector<boost::recursive_variant_> >
>::type tag_soup;
qi::rule<It, std::string()> simple;
qi::rule<It, std::vector<tag_soup>()> nested;
qi::rule<It, tag_soup()> soup;
See it Live On Coliru:
//// #define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/variant/recursive_variant.hpp>
#include <iostream>
#include <fstream>
namespace client
{
typedef boost::make_recursive_variant<
boost::variant<std::string, std::vector<boost::recursive_variant_> >
>::type tag_soup;
namespace qi = boost::spirit::qi;
template <typename It>
struct parser : qi::grammar<It, tag_soup()>
{
parser() : parser::base_type(soup)
{
simple = +~qi::char_("><");
nested = '<' >> *soup >> '>';
soup = nested|simple;
BOOST_SPIRIT_DEBUG_NODES((simple)(nested)(soup))
}
private:
qi::rule<It, std::string()> simple;
qi::rule<It, std::vector<tag_soup>()> nested;
qi::rule<It, tag_soup()> soup;
};
}
namespace boost { // leverage ADL on variant<>
static std::ostream& operator<<(std::ostream& os, std::vector<client::tag_soup> const& soup)
{
os << "<";
std::copy(soup.begin(), soup.end(), std::ostream_iterator<client::tag_soup>(os));
return os << ">";
}
}
int main(int argc, char **argv)
{
if (argc < 2) {
std::cerr << "Error: No input file provided.\n";
return 1;
}
std::ifstream in(argv[1]);
std::string const storage(std::istreambuf_iterator<char>(in), {}); // We will read the contents here.
if (!(in || in.eof())) {
std::cerr << "Error: Could not read from input file\n";
return 1;
}
static const client::parser<std::string::const_iterator> p;
client::tag_soup ast; // Our tree
bool ok = parse(storage.begin(), storage.end(), p, ast);
if (ok) std::cout << "Parsing succeeded\nData: " << ast << "\n";
else std::cout << "Parsing failed\n";
return ok? 0 : 1;
}
If you define BOOST_SPIRIT_DEBUG you'll get verbose output of the parsing process.
For the input
<some text with nested <tags <etc...> >more text>
prints
Parsing succeeded
Data: <some text with nested <tags <etc...> >more text>
Note that the output is printed from the variant, not the original text.