I need comma-delimited output from a struct with optionals. For example, if I have this struct:
struct MyStruct
{
boost::optional<std::string> one;
boost::optional<int> two;
boost::optional<float> three;
};
An output like: { "string", 1, 3.0 } or { "string" } or { 1, 3.0 } and so on.
Now, I have code like this:
struct MyStruct
{
boost::optional<std::string> one;
boost::optional<int> two;
boost::optional<float> three;
};
BOOST_FUSION_ADAPT_STRUCT
(MyStruct,
one,
two,
three)
template<typename Iterator>
struct MyKarmaGrammar : boost::spirit::karma::grammar<Iterator, MyStruct()>
{
MyKarmaGrammar() : MyKarmaGrammar::base_type(request_)
{
using namespace std::literals::string_literals;
namespace karma = boost::spirit::karma;
using karma::int_;
using karma::double_;
using karma::string;
using karma::lit;
using karma::_r1;
key_ = '"' << string(_r1) << '"';
str_prop_ = key_(_r1) << ':'
<< string
;
int_prop_ = key_(_r1) << ':'
<< int_
;
dbl_prop_ = key_(_r1) << ':'
<< double_
;
//REQUEST
request_ = '{'
<< -str_prop_("one"s) <<
-int_prop_("two"s) <<
-dbl_prop_("three"s)
<< '}'
;
}
private:
//GENERAL RULES
boost::spirit::karma::rule<Iterator, void(std::string)> key_;
boost::spirit::karma::rule<Iterator, double(std::string)> dbl_prop_;
boost::spirit::karma::rule<Iterator, int(std::string)> int_prop_;
boost::spirit::karma::rule<Iterator, std::string(std::string)> str_prop_;
//REQUEST
boost::spirit::karma::rule<Iterator, MyStruct()> request_;
};
int main()
{
using namespace std::literals::string_literals;
MyStruct request = {std::string("one"), 2, 3.1};
std::string generated;
std::back_insert_iterator<std::string> sink(generated);
MyKarmaGrammar<std::back_insert_iterator<std::string>> serializer;
boost::spirit::karma::generate(sink, serializer, request);
std::cout << generated << std::endl;
}
This works, but I need the output comma-delimited. I tried a grammar like:
request_ = '{'
<< (str_prop_("one"s) |
int_prop_("two"s) |
dbl_prop_("three"s)) % ','
<< '}'
;
But I receive this compile error:
/usr/include/boost/spirit/home/support/container.hpp:194:52: error: no type named ‘const_iterator’ in ‘struct MyStruct’
typedef typename Container::const_iterator type;
thanks!
Your struct is not a container, so the list operator (%) will not work. The documentation states it expects the attribute to be a container type.
So, just like in the Qi counterpart I showed you, create a conditional delim production:
delim = (&qi::lit('}')) | ',';
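For context, a minimal, self-contained sketch of how such a conditional delimiter behaves on the parsing side (rule names and inputs are invented for the demo):
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;

int main() {
    using It = std::string::const_iterator;

    // either a comma, or "nothing, provided the closing brace follows"
    qi::rule<It> delim;
    qi::rule<It, std::vector<int>(), qi::space_type> list;

    delim = &qi::lit('}') | ',';
    list  = '{' >> *(qi::int_ >> delim) >> '}';

    for (std::string const input : { "{1, 2, 3}", "{}", "{1 2}" }) {
        It f = input.begin(), l = input.end();
        std::vector<int> out;
        bool ok = qi::phrase_parse(f, l, list, qi::space, out);
        if (ok && f == l)
            std::cout << input << " -> parsed " << out.size() << " ints\n";
        else
            std::cout << input << " -> failed\n";
    }
}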
You'd need something similar here. However, everything about it is reversed. Instead of "detecting" the end of the input sequence from the presence of a }, we need to track the absence of a preceding field, i.e. "not having output a field since the opening brace yet".
That's a bit trickier since the required state cannot come from the same source as the input. We'll use a parser-member for simplicity here¹:
private:
bool _is_first_field;
Now, when we generate the opening brace, we want to initialize that to true:
auto _f = px::ref(_is_first_field); // short-hand
request_ %= lit('{') [ _f = true ]
Note: Use of %= instead of = tells Spirit that we want automatic attribute propagation to happen, in spite of the presence of a Semantic Action ([ _f = true ]).
Now, we need to generate the delimiter:
delim = eps(_f) | ", ";
Simple. Usage is also simple, except we'll want to conditionally reset the _f:
auto reset = boost::proto::deep_copy(eps [ _f = false ]);
str_prop_ %= (delim << key_(_r1) << string << reset) | "";
int_prop_ %= (delim << key_(_r1) << int_ << reset) | "";
dbl_prop_ %= (delim << key_(_r1) << double_ << reset) | "";
A very subtle point here is that I changed the declared rule attribute types from T to optional<T>. This allows Karma to do its magic: the value generator fails if the value is empty (boost::none), skipping the reset as well!
ka::rule<Iterator, boost::optional<double>(std::string)> dbl_prop_;
ka::rule<Iterator, boost::optional<int>(std::string)> int_prop_;
ka::rule<Iterator, boost::optional<std::string>(std::string)> str_prop_;
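As a minimal, self-contained illustration of that behavior (relying on Karma's standard handling of optional attributes, as described above): a bare int_ fed a boost::optional<int> succeeds for an engaged value and fails for boost::none.
#include <boost/optional.hpp>
#include <boost/spirit/include/karma.hpp>
#include <iostream>
#include <iterator>
#include <string>

namespace ka = boost::spirit::karma;

int main() {
    for (boost::optional<int> v : { boost::optional<int>(42), boost::optional<int>() }) {
        std::string out;
        std::back_insert_iterator<std::string> sink(out);
        bool ok = ka::generate(sink, ka::int_, v); // fails when v is boost::none
        std::cout << std::boolalpha << ok << " '" << out << "'\n";
    }
}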
Now, let's put together some testcases:
Test Cases
Live On Coliru
#include "iostream"
#include <boost/optional/optional_io.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <string>
struct MyStruct {
boost::optional<std::string> one;
boost::optional<int> two;
boost::optional<double> three;
};
BOOST_FUSION_ADAPT_STRUCT(MyStruct, one, two, three)
namespace ka = boost::spirit::karma;
namespace px = boost::phoenix;
template<typename Iterator>
struct MyKarmaGrammar : ka::grammar<Iterator, MyStruct()> {
MyKarmaGrammar() : MyKarmaGrammar::base_type(request_) {
using namespace std::literals::string_literals;
using ka::int_;
using ka::double_;
using ka::string;
using ka::lit;
using ka::eps;
using ka::_r1;
auto _f = px::ref(_is_first_field);
auto reset = boost::proto::deep_copy(eps [ _f = false ]);
key_ = '"' << string(_r1) << "\":";
delim = eps(_f) | ", ";
str_prop_ %= (delim << key_(_r1) << string << reset) | "";
int_prop_ %= (delim << key_(_r1) << int_ << reset) | "";
dbl_prop_ %= (delim << key_(_r1) << double_ << reset) | "";
//REQUEST
request_ %= lit('{') [ _f = true ]
<< str_prop_("one"s) <<
int_prop_("two"s) <<
dbl_prop_("three"s)
<< '}';
}
private:
bool _is_first_field = true;
//GENERAL RULES
ka::rule<Iterator, void(std::string)> key_;
ka::rule<Iterator, boost::optional<double>(std::string)> dbl_prop_;
ka::rule<Iterator, boost::optional<int>(std::string)> int_prop_;
ka::rule<Iterator, boost::optional<std::string>(std::string)> str_prop_;
ka::rule<Iterator> delim;
//REQUEST
ka::rule<Iterator, MyStruct()> request_;
};
template <typename T> std::array<boost::optional<T>, 2> option(T const& v) {
return { { v, boost::none } };
}
int main() {
using namespace std::literals::string_literals;
for (auto a : option("one"s))
for (auto b : option(2))
for (auto c : option(3.1))
for (auto request : { MyStruct { a, b, c } }) {
std::string generated;
std::back_insert_iterator<std::string> sink(generated);
MyKarmaGrammar<std::back_insert_iterator<std::string>> serializer;
ka::generate(sink, serializer, request);
std::cout << boost::fusion::as_vector(request) << ":\t" << generated << "\n";
}
}
Printing:
( one 2 3.1): {"one":one, "two":2, "three":3.1}
( one 2 --): {"one":one, "two":2}
( one -- 3.1): {"one":one, "three":3.1}
( one -- --): {"one":one}
(-- 2 3.1): {"two":2, "three":3.1}
(-- 2 --): {"two":2}
(-- -- 3.1): {"three":3.1}
(-- -- --): {}
¹ Note this limits re-entrant use of the parser, as well as making it non-const, etc. karma::locals are the true answer to that, at the cost of a little more complexity.
Related
I'd like to parse string | (string, int) and store it in a structure that defaults the int component to some value. The attribute of such a construction in X3 is a variant<string, tuple<string, int>>. I was thinking I could have a struct that takes either a string or a (string, int) to automagically be populated:
struct bar
{
bar (std::string x = "", int y = 0) : baz1 {x}, baz2 {y} {}
std::string baz1;
int baz2;
};
BOOST_FUSION_ADAPT_STRUCT (disj::ast::bar, baz1, baz2)
and then simply have:
const x3::rule<class bar, ast::bar> bar = "bar";
using x3::int_;
using x3::ascii::alnum;
auto const bar_def = (+(alnum) | ('(' >> +(alnum) >> ',' >> int_ >> ')')) >> ';';
BOOST_SPIRIT_DEFINE(bar);
However this does not work:
/usr/include/boost/spirit/home/x3/core/detail/parse_into_container.hpp:139:59: error: static assertion failed: Expecting a single element fusion sequence
139 | static_assert(traits::has_size<Attribute, 1>::value,
Setting baz2 to an optional does not help. One way to solve this is to have a variant field or inherit from that type:
struct string_int {
std::string s;
int i;
};
struct foo {
boost::variant<std::string, string_int> var;
};
BOOST_FUSION_ADAPT_STRUCT (disj::ast::string_int, s, i)
BOOST_FUSION_ADAPT_STRUCT (disj::ast::foo, var)
(For some reason, I have to use boost::variant instead of x3::variant for operator<< to work; also, using std::pair or tuple for string_int does not work, but boost::fusion::deque does.) One can then equip foo somehow to get the string and integer.
Question: What is the proper, clean way to do this in X3? Is there a more natural way than this second option and equipping foo with accessors?
Live On Coliru
Sadly, the wording in the X3 documentation section is exceedingly sparse and allows this (contrast the Qi section). A quick test¹ confirms it:
Live On Coliru
#include <boost/core/demangle.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <typeinfo>
namespace x3 = boost::spirit::x3;
template <typename Expr>
std::string inspect(Expr const& expr) {
using A = typename x3::traits::attribute_of<Expr, x3::unused_type>::type;
return boost::core::demangle(typeid(A).name());
}
int main()
{
std::cout << inspect(x3::double_ | x3::int_) << "\n"; // variant expected
std::cout << inspect(x3::int_ | "bla" >> x3::int_) << "\n"; // variant "understandable"
std::cout << inspect(x3::int_ | x3::int_) << "\n"; // variant surprising
}
Prints
boost::variant<double, int>
boost::variant<int, int>
boost::variant<int, int>
All Hope Is Not Lost
In your specific case you could trick the system:
auto const bar_def = //
(+x3::alnum >> x3::attr(-1) //
| '(' >> +x3::alnum >> ',' >> x3::int_ >> ')' //
) >> ';';
Note how we "inject" an int value for the first branch. That satisfies the attribute propagation gods:
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/adapted/struct.hpp>
#include <boost/fusion/include/io.hpp>
#include <iomanip>
#include <iostream>
#include <string_view>
namespace x3 = boost::spirit::x3;
namespace disj::ast {
struct bar {
std::string x;
int y;
};
using boost::fusion::operator<<;
} // namespace disj::ast
BOOST_FUSION_ADAPT_STRUCT(disj::ast::bar, x, y)
namespace disj::parser {
const x3::rule<class bar, ast::bar> bar = "bar";
auto const bar_def = //
(+x3::alnum >> x3::attr(-1) //
| '(' >> +x3::alnum >> ',' >> x3::int_ >> ')' //
) >> ';';
BOOST_SPIRIT_DEFINE(bar)
}
namespace disj {
void run_tests() {
for (std::string const input : {
"",
";",
"bla;",
"bla, 42;",
"(bla, 42);",
}) {
ast::bar val;
auto f = begin(input), l = end(input);
std::cout << "\n" << quoted(input) << " -> ";
if (phrase_parse(f, l, parser::bar, x3::space, val)) {
std::cout << "Parsed: " << val << "\n";
} else {
std::cout << "Failed\n";
}
if (f!=l) {
std::cout << " -- Remaining " << quoted(std::string_view(f, l)) << "\n";
}
}
}
}
int main()
{
disj::run_tests();
}
Prints
"" -> Failed
";" -> Failed
-- Remaining ";"
"bla;" -> Parsed: (bla -1)
"bla, 42;" -> Failed
-- Remaining "bla, 42;"
"(bla, 42);" -> Parsed: (bla 42)
¹ just today
Think about a preprocessor which will read the raw text (no significant white space or tokens).
There are 3 rules.
resolve_para_entry should resolve an argument inside a call. The top-level text is returned as a string.
resolve_para should resolve the whole parameter list and put all the top-level parameters into a string list.
resolve is the entry point.
Along the way I track the iterator positions and get the text portions.
Samples:
sometext(para) → expect para in the string list
sometext(para1,para2) → expect para1 and para2 in string list
sometext(call(a)) → expect call(a) in the string list
sometext(call(a,b)) ← here it fails; it seems that the "!lit(',')" won't let the parser step outside.
Rules:
resolve_para_entry = +(
(iter_pos >> lit('(') >> (resolve_para_entry | eps) >> lit(')') >> iter_pos) [_val= phoenix::bind(&appendString, _val, _1,_3)]
| (!lit(',') >> !lit(')') >> !lit('(') >> (wide::char_ | wide::space)) [_val = phoenix::bind(&appendChar, _val, _1)]
);
resolve_para = (lit('(') >> lit(')'))[_val = std::vector<std::wstring>()] // empty para -> old style
| (lit('(') >> resolve_para_entry >> *(lit(',') >> resolve_para_entry) > lit(')'))[_val = phoenix::bind(&appendStringList, _val, _1, _2)]
| eps;
;
resolve = (iter_pos >> name_valid >> iter_pos >> resolve_para >> iter_pos);
In the end it doesn't seem very elegant. Maybe there is a better way to parse such stuff without a skipper.
Indeed this should be a lot simpler.
First off, I fail to see why the absence of a skipper is at all relevant.
Second, exposing the raw input is best done using qi::raw[] instead of dancing with iter_pos and clumsy semantic actions¹.
Among the other observations I see:
negating a charset is done with ~, so e.g. ~char_(",()")
(p|eps) would be better spelled -p
(lit('(') >> lit(')')) could be just "()" (after all, there's no skipper, right)
p >> *(',' >> p) is equivalent to p % ','
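A quick, self-contained check of that last equivalence (the integer list is just for the demo):
#include <boost/spirit/include/qi.hpp>
#include <cassert>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;

int main() {
    std::string const input = "1,2,3";
    std::vector<int> a, b;

    auto f = input.begin();
    qi::parse(f, input.end(), qi::int_ % ',', a);                  // list-operator form

    f = input.begin();
    qi::parse(f, input.end(), qi::int_ >> *(',' >> qi::int_), b);  // expanded form

    assert(a == b && a.size() == 3);  // same attribute either way
}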
With the above, resolve_para simplifies to this:
resolve_para = '(' >> -(resolve_para_entry % ',') >> ')';
resolve_para_entry seems weird to me. It appears that any nested parentheses are simply swallowed. Why not actually parse a recursive grammar so you detect syntax errors?
Here's my take on it:
Define An AST
I prefer to make this the first step because it helps me think about the parser productions:
namespace Ast {
using ArgList = std::list<std::string>;
struct Resolve {
std::string name;
ArgList arglist;
};
using Resolves = std::vector<Resolve>;
}
Creating The Grammar Rules
qi::rule<It, Ast::Resolves()> start;
qi::rule<It, Ast::Resolve()> resolve;
qi::rule<It, Ast::ArgList()> arglist;
qi::rule<It, std::string()> arg, identifier;
And their definitions:
identifier = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
arg = raw [ +('(' >> -arg >> ')' | +~char_(",)(")) ];
arglist = '(' >> -(arg % ',') >> ')';
resolve = identifier >> arglist;
start = *qr::seek[hold[resolve]];
Notes:
No more semantic actions
No more eps
No more iter_pos
I've opted to make arglist non-optional. If you really wanted that, change it back:
resolve = identifier >> -arglist;
But in our sample it will generate a lot of noisy output.
Of course your entry point (start) will be different. I just did the simplest thing that could possibly work, using another handy parser directive from the Spirit Repository (like iter_pos that you were already using): seek[]
The hold is there for this reason: boost::spirit::qi duplicate parsing on the output - You might not need it in your actual parser.
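As an aside, a tiny, self-contained illustration of that seek[] directive (the "magic=" input is made up for the demo):
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;

int main() {
    std::string const input = "noise noise magic=42 more noise";
    auto f = input.begin(), l = input.end();

    int value = 0;
    // seek[] scans ahead until the inner parser matches, then exposes its attribute
    bool ok = qi::parse(f, l, qr::seek[qi::lit("magic=") >> qi::int_], value);

    if (ok) std::cout << "found: " << value << "\n";  // prints "found: 42"
}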
Live On Coliru
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
namespace Ast {
using ArgList = std::list<std::string>;
struct Resolve {
std::string name;
ArgList arglist;
};
using Resolves = std::vector<Resolve>;
}
BOOST_FUSION_ADAPT_STRUCT(Ast::Resolve, name, arglist)
namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;
template <typename It>
struct Parser : qi::grammar<It, Ast::Resolves()>
{
Parser() : Parser::base_type(start) {
using namespace qi;
identifier = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
arg = raw [ +('(' >> -arg >> ')' | +~char_(",)(")) ];
arglist = '(' >> -(arg % ',') >> ')';
resolve = identifier >> arglist;
start = *qr::seek[hold[resolve]];
}
private:
qi::rule<It, Ast::Resolves()> start;
qi::rule<It, Ast::Resolve()> resolve;
qi::rule<It, Ast::ArgList()> arglist;
qi::rule<It, std::string()> arg, identifier;
};
#include <iostream>
int main() {
using It = std::string::const_iterator;
std::string const samples = R"--(
Samples:
sometext(para) → expect para in the string list
sometext(para1,para2) → expect para1 and para2 in string list
sometext(call(a)) → expect call(a) in the string list
sometext(call(a,b)) ← here it fails; it seams that the "!lit(',')" wont make the parser step outside
)--";
It f = samples.begin(), l = samples.end();
Ast::Resolves data;
if (parse(f, l, Parser<It>{}, data)) {
std::cout << "Parsed " << data.size() << " resolves\n";
} else {
std::cout << "Parsing failed\n";
}
for (auto& resolve: data) {
std::cout << " - " << resolve.name << "\n (\n";
for (auto& arg : resolve.arglist) {
std::cout << " " << arg << "\n";
}
std::cout << " )\n";
}
}
Prints
Parsed 6 resolves
- sometext
(
para
)
- sometext
(
para1
para2
)
- sometext
(
call(a)
)
- call
(
a
)
- call
(
a
b
)
- lit
(
'
'
)
More Ideas
That last output shows you a problem with your current grammar: lit(',') should obviously not be seen as a call with two parameters.
I recently did an answer on extracting (nested) function calls with parameters which does things more neatly:
Boost spirit parse rule is not applied
or this one boost spirit reporting semantic error
BONUS
Bonus version that uses string_view and also shows exact line/column information of all extracted words.
Note that it still doesn't require any Phoenix or semantic actions. Instead, it simply defines the necessary trait to assign to boost::string_view from an iterator range.
Live On Coliru
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
namespace Ast {
using Source = boost::string_view;
using ArgList = std::list<Source>;
struct Resolve {
Source name;
ArgList arglist;
};
using Resolves = std::vector<Resolve>;
}
BOOST_FUSION_ADAPT_STRUCT(Ast::Resolve, name, arglist)
namespace boost { namespace spirit { namespace traits {
template <typename It>
struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
static void call(It f, It l, boost::string_view& attr) {
attr = boost::string_view { f.base(), size_t(std::distance(f.base(),l.base())) };
}
};
} } }
namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;
template <typename It>
struct Parser : qi::grammar<It, Ast::Resolves()>
{
Parser() : Parser::base_type(start) {
using namespace qi;
identifier = raw [ char_("a-zA-Z_") >> *char_("a-zA-Z0-9_") ];
arg = raw [ +('(' >> -arg >> ')' | +~char_(",)(")) ];
arglist = '(' >> -(arg % ',') >> ')';
resolve = identifier >> arglist;
start = *qr::seek[hold[resolve]];
}
private:
qi::rule<It, Ast::Resolves()> start;
qi::rule<It, Ast::Resolve()> resolve;
qi::rule<It, Ast::ArgList()> arglist;
qi::rule<It, Ast::Source()> arg, identifier;
};
#include <iostream>
struct Annotator {
using Ref = boost::string_view;
struct Manip {
Ref fragment, context;
friend std::ostream& operator<<(std::ostream& os, Manip const& m) {
return os << "[" << m.fragment << " at line:" << m.line() << " col:" << m.column() << "]";
}
size_t line() const {
return 1 + std::count(context.begin(), fragment.begin(), '\n');
}
size_t column() const {
return 1 + (fragment.begin() - start_of_line().begin());
}
Ref start_of_line() const {
return context.substr(context.substr(0, fragment.begin()-context.begin()).find_last_of('\n') + 1);
}
};
Ref context;
Manip operator()(Ref what) const { return {what, context}; }
};
int main() {
using It = std::string::const_iterator;
std::string const samples = R"--(Samples:
sometext(para) → expect para in the string list
sometext(para1,para2) → expect para1 and para2 in string list
sometext(call(a)) → expect call(a) in the string list
sometext(call(a,b)) ← here it fails; it seams that the "!lit(',')" wont make the parser step outside
)--";
It f = samples.begin(), l = samples.end();
Ast::Resolves data;
if (parse(f, l, Parser<It>{}, data)) {
std::cout << "Parsed " << data.size() << " resolves\n";
} else {
std::cout << "Parsing failed\n";
}
Annotator annotate{samples};
for (auto& resolve: data) {
std::cout << " - " << annotate(resolve.name) << "\n (\n";
for (auto& arg : resolve.arglist) {
std::cout << " " << annotate(arg) << "\n";
}
std::cout << " )\n";
}
}
Prints
Parsed 6 resolves
- [sometext at line:3 col:1]
(
[para at line:3 col:10]
)
- [sometext at line:4 col:1]
(
[para1 at line:4 col:10]
[para2 at line:4 col:16]
)
- [sometext at line:5 col:1]
(
[call(a) at line:5 col:10]
)
- [call at line:5 col:34]
(
[a at line:5 col:39]
)
- [call at line:6 col:10]
(
[a at line:6 col:15]
[b at line:6 col:17]
)
- [lit at line:6 col:62]
(
[' at line:6 col:66]
[' at line:6 col:68]
)
¹ Boost Spirit: "Semantic actions are evil"?
I'm currently working on a tokenizer for a class using boost regex. I'm not too familiar with boost, so I may be way off base on what I have so far but anyway, here is what I'm using:
regex re("[\\s*,()=;<>\+-]{1,2}");
sregex_token_iterator i(text.begin(), text.end(), re, -1);
sregex_token_iterator j;
sregex_token_iterator begin(text.begin(), text.end(), re), end;
unsigned count = 0;
while(i != j)
{
if(*i != ' ' && *i != '\n')
{
count++;
cout << "From i - " << count << " " << *i << endl;
}
i++;
if(*begin != ' ' && *begin != '\n')
{
count++;
cout << "Form j - " << count << " " << *begin << endl;
}
begin++;
}
cout << "There were " << count << " tokens found." << endl;
So, basically, I'm using the spaces and the symbols as delimiters, but I'm still outputting both (since I still want the symbols to be tokens). Like I said, I'm not extremely familiar with boost, so I'm not positive if I'm taking the right approach.
My end goal is to split a file that has a simple c++ block of code and tokenize it, here's the example file I am using:
#define MAX 5
int main(int argc)
{
for(int i = 0; i < MAX; i ++)
{
cout << "i is equal to " << i << endl;
}
return 0;
}
I'm having trouble with the fact that it is counting newlines and blank spaces as tokens, and I really need them to be thrown away. Also, I'm having a hard time with the "++" token; I can't seem to figure out the right expression for it to count "++".
Any help would be greatly appreciated!
Thanks!
Tim
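A rough, self-contained sketch of staying with boost::regex but matching the tokens themselves instead of splitting on delimiters, so whitespace never shows up and multi-character operators such as ++ are listed before the single-character ones (string literals and comments are not handled, and the token set is only a guess at what the class needs):
#include <boost/regex.hpp>
#include <iostream>
#include <string>

int main() {
    std::string const text = "for(int i = 0; i < MAX; i ++)";

    // alternation order matters: multi-character operators come first
    boost::regex re(R"(\+\+|--|<<|>>|<=|>=|==|!=|[A-Za-z_]\w*|\d+|[,()=;<>+\-*/{}#])");

    boost::sregex_token_iterator it(text.begin(), text.end(), re), end;
    unsigned count = 0;
    for (; it != end; ++it) {
        ++count;
        std::cout << count << " " << *it << "\n";
    }
    std::cout << "There were " << count << " tokens found." << std::endl;
}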
First off,
Boost has Boost Wave which has (several, I think) ready-made tokenizers for C++ source
Boost has Spirit Lex which is a lexer that can tokenize based on regex patterns and some state support. It allows both dynamic lexer tables and statically generated lexer tables
In case you're interested in using Lex I ran a quick & dirty finger exercise for myself: it tokenizes itself Live On Coliru.
Notes:
A Lex tokenizer plays nicely with Boost Spirit Qi for parsing (though in all honesty, I prefer doing Spirit grammars directly on the source iterators); a minimal sketch of that integration follows after the full listing below.
It exposes an iterator interface, although my example leverages the callback interface to display the tokens:
int main()
{
typedef boost::spirit::istream_iterator It;
typedef lex::lexertl::token<It, boost::mpl::vector<int, double>, boost::mpl::true_ > token_type;
tokens<lex::lexertl::actor_lexer<token_type> > lexer;
std::ifstream ifs("main.cpp");
ifs >> std::noskipws;
It first(ifs), last;
bool ok = lex::tokenize(first, last, lexer, process_token());
std::cout << "\nTokenization " << (ok?"succeeded":"failed") << "; remaining input: '" << std::string(first,last) << "'\n";
}
Which is tokenized in the output as (trimming the preceding output):
[int][main][(][)][{][typedef][boost][::][spirit][::][istream_iterator][It][;][typedef][lex][::][lexertl][::][token][<][It][,][boost][::][mpl][::][vector][<][int][,][double][>][,][boost][::][mpl][::][true_][>][token_type][;][tokens][<][lex][::][lexertl][::][actor_lexer][<][token_type][>][>][lexer][;][std][::][ifstream][ifs][(]["main.cpp"][)][;][ifs][>>][std][::][noskipws][;][It][first][(][ifs][)][,][last][;][bool][ok][=][lex][::][tokenize][(][first][,][last][,][lexer][,][process_token][(][)][)][;][std][::][cout][<<]["\nTokenization "][<<][(][ok][?]["succeeded"][:]["failed"][)][<<]["; remaining input: '"][<<][std][::][string][(][first][,][last][)][<<]["'\n"][;][}]
Tokenization succeeded; remaining input: ''
You'd actually want a different lexer state for parsing the preprocessor directives (line-ends become meaningful and several other expressions/keywords are valid). In real life, there's often a separate preprocessor step doing its own lexing here. (The fallout of this can be seen when lexing the include file specifications, e.g.)
ordering of tokens in the lexer is critical for the result
in this sample, you'd always match the & token as a binop_. You'd probably want to match an ampersand_ token instead and decide at parse time whether it's a binary operator (bitwise-and), a unary operator (address-of), a reference type-qualifier, etc. C++ is really interesting to parse :|
Comments are supported!
digraphs/trigraphs are not supported :)
pragmas, line/file directives etc. are unsupported
All in all, this should be pretty usable if you wanted to make, say, a simple syntax highlighter or formatter. Anything beyond that should require some more parsing/semantic analysis.
Full Listing:
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <fstream>
#include <iostream>
#include <sstream>
#include <boost/lexical_cast.hpp>
namespace lex = boost::spirit::lex;
template <typename Lexer>
struct tokens : lex::lexer<Lexer>
{
tokens()
{
pound_ = "#";
define_ = "define";
if_ = "if";
else_ = "else";
endif_ = "endif";
ifdef_ = "ifdef";
ifndef_ = "ifndef";
defined_ = "defined";
keyword_ = "for|break|continue|while|do|switch|case|default|if|else|return|goto|throw|catch"
"static|volatile|auto|void|int|char|signed|unsigned|long|double|float|"
"delete|new|virtual|override|final|"
"typename|template|using|namespace|extern|\"C\"|"
"friend|public|private|protected|"
"class|struct|enum|"
"register|thread_local|noexcept|constexpr";
scope_ = "::";
dot_ = '.';
arrow_ = "->";
star_ = '*';
popen_ = '(';
pclose_ = ')';
bopen_ = '{';
bclose_ = '}';
iopen_ = '[';
iclose_ = ']';
colon_ = ':';
semic_ = ';';
comma_ = ',';
tern_q_ = '?';
relop_ = "==|!=|<=|>=|<|>";
assign_ = '=';
incr_ = "\\+\\+";
decr_ = "--";
binop_ = "[-+/%&|^]|>>|<<";
unop_ = "[-+~!]";
real_ = "[-+]?[0-9]+(e[-+]?[0-9]+)?f?";
int_ = "[-+]?[0-9]+";
identifier_ = "[a-zA-Z_][a-zA-Z0-9_]*";
ws_ = "[ \\t\\r\\n]";
line_comment_ = "\\/\\/.*?[\\r\\n]";
block_comment_ = "\\/\\*.*?\\*\\/";
this->self.add_pattern
("SCHAR", "\\\\(x[0-9a-fA-F][0-9a-fA-F]|[\\\\\"'0tbrn])|[^\"\\\\'\\r\\n]")
;
string_lit = "\\\"('|{SCHAR})*?\\\"";
char_lit = "'(\\\"|{SCHAR})'";
this->self +=
pound_ | define_ | if_ | else_ | endif_ | ifdef_ | ifndef_ | defined_
| keyword_ | scope_ | dot_ | arrow_ | star_ | popen_ | pclose_ | bopen_ | bclose_ | iopen_ | iclose_ | colon_ | semic_ | comma_ | tern_q_
| relop_ | assign_ | incr_ | decr_ | binop_ | unop_
| int_ | real_ | identifier_ | string_lit | char_lit
// ignore whitespace and comments
| ws_ [ lex::_pass = lex::pass_flags::pass_ignore ]
| line_comment_ [ lex::_pass = lex::pass_flags::pass_ignore ]
| block_comment_[ lex::_pass = lex::pass_flags::pass_ignore ]
;
}
private:
lex::token_def<> pound_, define_, if_, else_, endif_, ifdef_, ifndef_, defined_;
lex::token_def<> keyword_, scope_, dot_, arrow_, star_, popen_, pclose_, bopen_, bclose_, iopen_, iclose_, colon_, semic_, comma_, tern_q_;
lex::token_def<> relop_, assign_, incr_, decr_, binop_, unop_;
lex::token_def<int> int_;
lex::token_def<double> real_;
lex::token_def<> identifier_, string_lit, char_lit;
lex::token_def<lex::omit> ws_, line_comment_, block_comment_;
};
struct token_value : boost::static_visitor<std::string>
{
template <typename... T> // the token value can be a variant over any of the exposed attribute types
std::string operator()(boost::variant<T...> const& v) const {
return boost::apply_visitor(*this, v);
}
template <typename T> // the default value is a pair of iterators into the source sequence
std::string operator()(boost::iterator_range<T> const& v) const {
return { v.begin(), v.end() };
}
template <typename T>
std::string operator()(T const& v) const {
// not taken unless used in Spirit Qi rules, I guess
return std::string("attr<") + typeid(v).name() + ">(" + boost::lexical_cast<std::string>(v) + ")";
}
};
struct process_token
{
template <typename T>
bool operator()(T const& token) const {
std::cout << '[' /*<< token.id() << ":" */<< print(token.value()) << "]";
return true;
}
token_value print;
};
#if 0
std::string read(std::string fname)
{
std::ifstream ifs(fname);
std::ostringstream oss;
oss << ifs.rdbuf();
return oss.str();
}
#endif
int main()
{
typedef boost::spirit::istream_iterator It;
typedef lex::lexertl::token<It, boost::mpl::vector<int, double>, boost::mpl::true_ > token_type;
tokens<lex::lexertl::actor_lexer<token_type> > lexer;
std::ifstream ifs("main.cpp");
ifs >> std::noskipws;
It first(ifs), last;
bool ok = lex::tokenize(first, last, lexer, process_token());
std::cout << "\nTokenization " << (ok?"succeeded":"failed") << "; remaining input: '" << std::string(first,last) << "'\n";
}
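As promised in the notes above, a minimal, self-contained sketch of feeding Lex tokens into a Qi grammar via lex::tokenize_and_parse; the token and rule names are invented for the demo and unrelated to the listing:
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace lex = boost::spirit::lex;
namespace qi  = boost::spirit::qi;

template <typename Lexer>
struct MiniTokens : lex::lexer<Lexer> {
    MiniTokens() {
        identifier = "[a-zA-Z_][a-zA-Z0-9_]*";
        number     = "[0-9]+";
        ws         = "[ \\t\\n]+";
        this->self = identifier | number | ws; // register the token definitions
    }
    lex::token_def<std::string> identifier;
    lex::token_def<int>         number;
    lex::token_def<>            ws;
};

template <typename Iterator>
struct MiniGrammar : qi::grammar<Iterator> {
    template <typename TokenDef>
    MiniGrammar(TokenDef const& tok) : MiniGrammar::base_type(start) {
        // token_defs are used directly as parsers inside the Qi grammar
        start = *(tok.identifier | tok.number | tok.ws);
    }
    qi::rule<Iterator> start;
};

int main() {
    typedef lex::lexertl::token<char const*,
            boost::mpl::vector<std::string, int> > token_type;
    typedef lex::lexertl::lexer<token_type> lexer_type;

    MiniTokens<lexer_type> lexer;
    MiniGrammar<MiniTokens<lexer_type>::iterator_type> grammar(lexer);

    std::string const input = "foo 42 bar";
    char const* first = input.c_str();
    char const* last  = first + input.size();

    bool ok = lex::tokenize_and_parse(first, last, lexer, grammar);
    std::cout << (ok ? "parse ok" : "parse failed") << "\n";
}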
I'm a beginner in Boost Spirit and I want to define a grammar that parses the TTCN language.
(http://www.trex.informatik.uni-goettingen.de/trac/wiki/ttcn-3_4.5.1)
I'm trying to define rules for 'primitive' parsers like Alpha and AlphaNum that are faithful 1:1 to the original grammar, but obviously I'm doing something wrong because a grammar defined this way does not work.
When I use the built-in primitive parsers in place of TTCN's, it starts to work.
Can someone tell me why the 'manually' defined rules do not work as expected?
How can I fix it? I would like to stick close to the original grammar.
Is it a beginner's bug or something different?
#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/classic_symbols.hpp>
#include <boost/spirit/include/classic_tree_to_xml.hpp>
#include <boost/spirit/include/classic_position_iterator.hpp>
#include <boost/spirit/include/classic_core.hpp>
#include <boost/spirit/include/classic_parse_tree.hpp>
#include <boost/spirit/include/classic_ast.hpp>
#include <iostream>
#include <string>
#include <boost/spirit/home/classic/debug.hpp>
using namespace boost::spirit::classic;
using namespace std;
using namespace BOOST_SPIRIT_CLASSIC_NS;
typedef node_iter_data_factory<int> factory_t;
typedef position_iterator<std::string::iterator> pos_iterator_t;
typedef tree_match<pos_iterator_t, factory_t> parse_tree_match_t;
typedef parse_tree_match_t::const_tree_iterator iter_t;
struct ParseGrammar: public grammar<ParseGrammar>
{
template<typename ScannerT>
struct definition
{
definition(ParseGrammar const &)
{
KeywordImport = str_p("import");
KeywordAll = str_p("all");
SemiColon = ch_p(';');
Underscore = ch_p('_');
NonZeroNum = range_p('1','9');
Num = ch_p('0') | NonZeroNum;
UpperAlpha = range_p('A', 'Z');
LowerAlpha = range_p('a', 'z');
Alpha = UpperAlpha | LowerAlpha;
AlphaNum = Alpha | Num;
//this does not!
Identifier = lexeme_d[Alpha >> *(AlphaNum | Underscore)];
// Uncomment below line to make rule work
// Identifier = lexeme_d[alpha_p >> *(alnum_p | Underscore)];
Module = KeywordImport >> Identifier >> KeywordAll >> SemiColon;
BOOST_SPIRIT_DEBUG_NODE(Module);
BOOST_SPIRIT_DEBUG_NODE(KeywordImport);
BOOST_SPIRIT_DEBUG_NODE(KeywordAll);
BOOST_SPIRIT_DEBUG_NODE(Identifier);
BOOST_SPIRIT_DEBUG_NODE(SemiColon);
}
rule<ScannerT> KeywordImport,KeywordAll,Module,Identifier,SemiColon;
rule<ScannerT> Alpha,UpperAlpha,LowerAlpha,Underscore,Num,AlphaNum;
rule<ScannerT> NonZeroNum;
rule<ScannerT> const&
start() const { return Module; }
};
};
int main()
{
ParseGrammar resolver; // Our parser
BOOST_SPIRIT_DEBUG_NODE(resolver);
string content = "import foobar all;";
pos_iterator_t pos_begin(content.begin(), content.end());
pos_iterator_t pos_end;
tree_parse_info<pos_iterator_t, factory_t> info;
info = ast_parse<factory_t>(pos_begin, pos_end, resolver, space_p);
std::cout << "\ninfo.length : " << info.length << std::endl;
std::cout << "info.full : " << info.full << std::endl;
if(info.full)
{
std::cout << "OK: Parsing succeeded\n\n";
}
else
{
int line = info.stop.get_position().line;
int column = info.stop.get_position().column;
std::cout << "-------------------------\n";
std::cout << "ERROR: Parsing failed\n";
std::cout << "stopped at: " << line << ":" << column << "\n";
std::cout << "-------------------------\n";
}
return 0;
}
I don't do Spirit Classic (which has been deprecated for some years now).
I can only assume you've mixed something up with skippers. Here's the thing translated into Spirit V2:
#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;
template <typename Iterator = pos_iterator_t, typename Skipper = qi::space_type>
struct ParseGrammar: public qi::grammar<Iterator, Skipper>
{
ParseGrammar() : ParseGrammar::base_type(Module)
{
using namespace qi;
KeywordImport = lit("import");
KeywordAll = lit("all");
SemiColon = lit(';');
#if 1
// this rule obviously works
Identifier = lexeme [alpha >> *(alnum | '_')];
#else
// this does too, but less efficiently
Underscore = lit('_');
NonZeroNum = char_('1','9');
Num = char_('0') | NonZeroNum;
UpperAlpha = char_('A', 'Z');
LowerAlpha = char_('a', 'z');
Alpha = UpperAlpha | LowerAlpha;
AlphaNum = Alpha | Num;
Identifier = lexeme [Alpha >> *(AlphaNum | Underscore)];
#endif
Module = KeywordImport >> Identifier >> KeywordAll >> SemiColon;
BOOST_SPIRIT_DEBUG_NODES((Module)(KeywordImport)(KeywordAll)(Identifier)(SemiColon))
}
qi::rule<Iterator, Skipper> Module;
qi::rule<Iterator> KeywordImport,KeywordAll,Identifier,SemiColon;
qi::rule<Iterator> Alpha,UpperAlpha,LowerAlpha,Underscore,Num,AlphaNum;
qi::rule<Iterator> NonZeroNum;
};
int main()
{
std::string const content = "import \r\n\r\nfoobar\r\n\r\n all; bogus";
pos_iterator_t first(content.begin()), iter=first, last(content.end());
ParseGrammar<pos_iterator_t> resolver; // Our parser
bool ok = phrase_parse(iter, last, resolver, qi::space);
std::cout << std::boolalpha;
std::cout << "\nok : " << ok << std::endl;
std::cout << "full : " << (iter == last) << std::endl;
if(ok && iter==last)
{
std::cout << "OK: Parsing fully succeeded\n\n";
}
else
{
int line = get_line(iter);
int column = get_column(first, iter);
std::cout << "-------------------------\n";
std::cout << "ERROR: Parsing failed or not complete\n";
std::cout << "stopped at: " << line << ":" << column << "\n";
std::cout << "remaining: '" << std::string(iter, last) << "'\n";
std::cout << "-------------------------\n";
}
return 0;
}
I've added a little "bogus" at the end of input, so the output becomes a nicer demonstration:
<Module>
<try>import \r\n\r\nfoobar\r\n\r</try>
<KeywordImport>
<try>import \r\n\r\nfoobar\r\n\r</try>
<success> \r\n\r\nfoobar\r\n\r\n all;</success>
<attributes>[]</attributes>
</KeywordImport>
<Identifier>
<try>foobar\r\n\r\n all; bogu</try>
<success>\r\n\r\n all; bogus</success>
<attributes>[]</attributes>
</Identifier>
<KeywordAll>
<try>all; bogus</try>
<success>; bogus</success>
<attributes>[]</attributes>
</KeywordAll>
<SemiColon>
<try>; bogus</try>
<success> bogus</success>
<attributes>[]</attributes>
</SemiColon>
<success> bogus</success>
<attributes>[]</attributes>
</Module>
ok : true
full : false
-------------------------
ERROR: Parsing failed or not complete
stopped at: 3:8
remaining: 'bogus'
-------------------------
That all said, this is what I'd probably reduce it to:
template <typename Iterator, typename Skipper = qi::space_type>
struct ParseGrammar: public qi::grammar<Iterator, Skipper>
{
ParseGrammar() : ParseGrammar::base_type(Module)
{
using namespace qi;
Identifier = alpha >> *(alnum | '_');
Module = "import" >> Identifier >> "all" >> ';';
BOOST_SPIRIT_DEBUG_NODES((Module)(Identifier))
}
qi::rule<Iterator, Skipper> Module;
qi::rule<Iterator> Identifier;
};
As you can see, the Identifier rule is implicitly a lexeme because it isn't declared to use a skipper.
See it Live on Coliru
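To make the implicit-lexeme point concrete, a small self-contained comparison (rule names invented for the demo); the same expression only gets past the embedded whitespace when the rule is declared with a skipper:
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

int main() {
    using It = std::string::const_iterator;
    std::string const input = "foo bar";

    qi::rule<It> no_skipper;                    // no skipper declared: implicit lexeme
    qi::rule<It, qi::space_type> with_skipper;  // declared with a skipper

    no_skipper   = qi::alpha >> *(qi::alnum | qi::char_('_'));
    with_skipper = qi::alpha >> *(qi::alnum | qi::char_('_'));

    It f = input.begin(), l = input.end();
    qi::parse(f, l, no_skipper);                     // stops at the space
    std::cout << "lexeme rule left:  '" << std::string(f, l) << "'\n";

    f = input.begin();
    qi::phrase_parse(f, l, with_skipper, qi::space); // skips it and carries on
    std::cout << "skipper rule left: '" << std::string(f, l) << "'\n";
}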
I want to use Boost Spirit to parse an expression like
function1(arg1, arg2, function2(arg1, arg2, arg3),
function3(arg1,arg2))
and call the corresponding C++ functions. What should the grammar be to parse the above expression and call the corresponding C++ function via phoenix::bind()?
I have 2 types of functions to call
1) String functions:
wstring GetSubString(wstring stringToCut, int position, int length);
wstring GetStringToken(wstring stringToTokenize, wstring seperators,
int tokenNumber );
2) Functions that return an integer:
int GetCount();
int GetId(wstring srcId, wstring srcType);
Second Answer (more pragmatic)
Here's a second take, for comparison:
Just in case you really didn't want to parse into an abstract syntax tree representation, but rather evaluate the functions on-the-fly during parsing, you can simplify the grammar.
It comes in at 92 lines, as opposed to 209 lines for the first answer. Which approach is more suitable really depends on what you're implementing.
This shorter approach has some downsides:
less flexible (not reusable)
less robust (if functions have side effects, they will happen even if parsing fails halfway)
less extensible (the supported functions are hardwired into the grammar1)
Full code:
//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/phoenix/function.hpp>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
typedef boost::variant<int, std::string> value;
//////////////////////////////////////////////////
// Demo functions:
value AnswerToLTUAE() {
return 42;
}
value ReverseString(value const& input) {
auto& as_string = boost::get<std::string>(input);
return std::string(as_string.rbegin(), as_string.rend());
}
value Concatenate(value const& a, value const& b) {
std::ostringstream oss;
oss << a << b;
return oss.str();
}
BOOST_PHOENIX_ADAPT_FUNCTION_NULLARY(value, AnswerToLTUAE_, AnswerToLTUAE)
BOOST_PHOENIX_ADAPT_FUNCTION(value, ReverseString_, ReverseString, 1)
BOOST_PHOENIX_ADAPT_FUNCTION(value, Concatenate_, Concatenate, 2)
//////////////////////////////////////////////////
// Parser grammar
template <typename It, typename Skipper = qi::space_type>
struct parser : qi::grammar<It, value(), Skipper>
{
parser() : parser::base_type(expr_)
{
using namespace qi;
function_call_ =
(lit("AnswerToLTUAE") > '(' > ')')
[ _val = AnswerToLTUAE_() ]
| (lit("ReverseString") > '(' > expr_ > ')')
[ _val = ReverseString_(_1) ]
| (lit("Concatenate") > '(' > expr_ > ',' > expr_ > ')')
[ _val = Concatenate_(_1, _2) ]
;
string_ = as_string [
lexeme [ "'" >> *~char_("'") >> "'" ]
];
value_ = int_ | string_;
expr_ = function_call_ | value_;
on_error<fail> ( expr_, std::cout
<< phx::val("Error! Expecting ") << _4 << phx::val(" here: \"")
<< phx::construct<std::string>(_3, _2) << phx::val("\"\n"));
BOOST_SPIRIT_DEBUG_NODES((expr_)(function_call_)(value_)(string_))
}
private:
qi::rule<It, value(), Skipper> value_, function_call_, expr_, string_;
};
int main()
{
for (const std::string input: std::vector<std::string> {
"-99",
"'string'",
"AnswerToLTUAE()",
"ReverseString('string')",
"Concatenate('string', 987)",
"Concatenate('The Answer Is ', AnswerToLTUAE())",
})
{
auto f(std::begin(input)), l(std::end(input));
const static parser<decltype(f)> p;
value direct_eval;
bool ok = qi::phrase_parse(f,l,p,qi::space,direct_eval);
if (!ok)
std::cout << "invalid input\n";
else
{
std::cout << "input:\t" << input << "\n";
std::cout << "eval:\t" << direct_eval << "\n\n";
}
if (f!=l) std::cout << "unparsed: '" << std::string(f,l) << "'\n";
}
}
Note how, instead of using BOOST_PHOENIX_ADAPT_FUNCTION*, we could have used boost::phoenix::bind directly.
The output is still the same:
input: -99
eval: -99
input: 'string'
eval: string
input: AnswerToLTUAE()
eval: 42
input: ReverseString('string')
eval: gnirts
input: Concatenate('string', 987)
eval: string987
input: Concatenate('The Answer Is ', AnswerToLTUAE())
eval: The Answer Is 42
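To illustrate that boost::phoenix::bind remark in isolation, a tiny self-contained sketch with a made-up function (Twice) that is unrelated to the grammar above:
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>

namespace qi  = boost::spirit::qi;
namespace phx = boost::phoenix;

static int Twice(int x) { return 2 * x; }

int main() {
    std::string const input = "Twice(21)";
    int result = 0;

    auto f = input.begin(), l = input.end();
    bool ok = qi::phrase_parse(f, l,
        (qi::lit("Twice") >> '(' >> qi::int_ >> ')')
            [ phx::ref(result) = phx::bind(&Twice, qi::_1) ],
        qi::space);

    if (ok) std::cout << "result: " << result << "\n"; // prints "result: 42"
}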
1 This last downside is easily remedied by using the 'Nabialek Trick'.
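For completeness, a minimal, self-contained sketch of that trick (the function names and argument rules are invented for the demo): a symbols<> table maps each keyword to a pointer to the rule that parses its argument list, and lazy() invokes the looked-up rule, so adding a function means adding a rule and a table entry rather than touching the grammar.
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>

namespace qi = boost::spirit::qi;

int main() {
    using It   = std::string::const_iterator;
    using Rule = qi::rule<It, qi::space_type>;

    Rule no_args, one_arg, two_args;
    no_args  = qi::lit('(') >> ')';
    one_arg  = '(' >> qi::int_ >> ')';
    two_args = '(' >> qi::int_ >> ',' >> qi::int_ >> ')';

    // keyword -> pointer to the rule that parses its argument list
    qi::symbols<char, Rule*> functions;
    functions.add("AnswerToLTUAE", &no_args)("Negate", &one_arg)("Add", &two_args);

    // remember the looked-up rule in a local, then invoke it lazily
    qi::rule<It, qi::space_type, qi::locals<Rule*>> call;
    call = functions[qi::_a = qi::_1] >> qi::lazy(*qi::_a);

    for (std::string const input : { "AnswerToLTUAE()", "Negate(42)", "Add(1, 2)", "Add()" }) {
        It f = input.begin(), l = input.end();
        bool ok = qi::phrase_parse(f, l, call, qi::space);
        std::cout << input << " -> " << (ok && f == l ? "ok" : "failed") << "\n";
    }
}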
First Answer (complete)
I've gone and implemented a simple recursive expression grammar for functions taking up to three parameters:
for (const std::string input: std::vector<std::string> {
"-99",
"'string'",
"AnswerToLTUAE()",
"ReverseString('string')",
"Concatenate('string', 987)",
"Concatenate('The Answer Is ', AnswerToLTUAE())",
})
{
auto f(std::begin(input)), l(std::end(input));
const static parser<decltype(f)> p;
expr parsed_script;
bool ok = qi::phrase_parse(f,l,p,qi::space,parsed_script);
if (!ok)
std::cout << "invalid input\n";
else
{
const static generator<boost::spirit::ostream_iterator> g;
std::cout << "input:\t" << input << "\n";
std::cout << "tree:\t" << karma::format(g, parsed_script) << "\n";
std::cout << "eval:\t" << evaluate(parsed_script) << "\n";
}
if (f!=l) std::cout << "unparsed: '" << std::string(f,l) << "'\n";
}
Which prints:
input: -99
tree: -99
eval: -99
input: 'string'
tree: 'string'
eval: string
input: AnswerToLTUAE()
tree: nullary_function_call()
eval: 42
input: ReverseString('string')
tree: unary_function_call('string')
eval: gnirts
input: Concatenate('string', 987)
tree: binary_function_call('string',987)
eval: string987
input: Concatenate('The Answer Is ', AnswerToLTUAE())
tree: binary_function_call('The Answer Is ',nullary_function_call())
eval: The Answer Is 42
Some notes:
I separated parsing from execution (which is always a good idea IMO)
I implemented function evaluation for zero, one or two parameters (this should be easy to extend)
Values are assumed to be integers or strings (should be easy to extend)
I added a karma generator to display the parsed expression (with a TODO marked in the comment)
I hope this helps:
//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/variant/recursive_wrapper.hpp>
#include <functional>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx = boost::phoenix;
typedef boost::variant<int, std::string> value;
typedef boost::variant<value, boost::recursive_wrapper<struct function_call> > expr;
typedef std::function<value() > nullary_function_impl;
typedef std::function<value(value const&) > unary_function_impl;
typedef std::function<value(value const&, value const&)> binary_function_impl;
typedef boost::variant<nullary_function_impl, unary_function_impl, binary_function_impl> function_impl;
typedef qi::symbols<char, function_impl> function_table;
struct function_call
{
typedef std::vector<expr> arguments_t;
function_call() = default;
function_call(function_impl f, arguments_t const& arguments)
: f(f), arguments(arguments) { }
function_impl f;
arguments_t arguments;
};
BOOST_FUSION_ADAPT_STRUCT(function_call, (function_impl, f)(function_call::arguments_t, arguments))
#ifdef BOOST_SPIRIT_DEBUG
namespace std
{
static inline std::ostream& operator<<(std::ostream& os, nullary_function_impl const& f) { return os << "<nullary_function_impl>"; }
static inline std::ostream& operator<<(std::ostream& os, unary_function_impl const& f) { return os << "<unary_function_impl>"; }
static inline std::ostream& operator<<(std::ostream& os, binary_function_impl const& f) { return os << "<binary_function_impl>"; }
}
static inline std::ostream& operator<<(std::ostream& os, function_call const& call) { return os << call.f << "(" << call.arguments.size() << ")"; }
#endif
//////////////////////////////////////////////////
// Evaluation
value evaluate(const expr& e);
struct eval : boost::static_visitor<value>
{
eval() {}
value operator()(const value& v) const
{
return v;
}
value operator()(const function_call& call) const
{
return boost::apply_visitor(invoke(call.arguments), call.f);
}
private:
struct invoke : boost::static_visitor<value>
{
function_call::arguments_t const& _args;
invoke(function_call::arguments_t const& args) : _args(args) {}
value operator()(nullary_function_impl const& f) const {
return f();
}
value operator()(unary_function_impl const& f) const {
auto a = evaluate(_args.at(0));
return f(a);
}
value operator()(binary_function_impl const& f) const {
auto a = evaluate(_args.at(0));
auto b = evaluate(_args.at(1));
return f(a, b);
}
};
};
value evaluate(const expr& e)
{
return boost::apply_visitor(eval(), e);
}
//////////////////////////////////////////////////
// Demo functions:
value AnswerToLTUAE() {
return 42;
}
value ReverseString(value const& input) {
auto& as_string = boost::get<std::string>(input);
return std::string(as_string.rbegin(), as_string.rend());
}
value Concatenate(value const& a, value const& b) {
std::ostringstream oss;
oss << a << b;
return oss.str();
}
//////////////////////////////////////////////////
// Parser grammar
template <typename It, typename Skipper = qi::space_type>
struct parser : qi::grammar<It, expr(), Skipper>
{
parser() : parser::base_type(expr_)
{
using namespace qi;
n_ary_ops.add
("AnswerToLTUAE", nullary_function_impl{ &::AnswerToLTUAE })
("ReverseString", unary_function_impl { &::ReverseString })
("Concatenate" , binary_function_impl { &::Concatenate });
function_call_ = n_ary_ops > '(' > expr_list > ')';
string_ = qi::lexeme [ "'" >> *~qi::char_("'") >> "'" ];
value_ = qi::int_ | string_;
expr_list = -expr_ % ',';
expr_ = function_call_ | value_;
on_error<fail> ( expr_, std::cout
<< phx::val("Error! Expecting ") << _4 << phx::val(" here: \"")
<< phx::construct<std::string>(_3, _2) << phx::val("\"\n"));
BOOST_SPIRIT_DEBUG_NODES((expr_)(expr_list)(function_call_)(value_)(string_))
}
private:
function_table n_ary_ops;
template <typename Attr> using Rule = qi::rule<It, Attr(), Skipper>;
Rule<std::string> string_;
Rule<value> value_;
Rule<function_call> function_call_;
Rule<std::vector<expr>> expr_list;
Rule<expr> expr_;
};
//////////////////////////////////////////////////
// Output generator
template <typename It>
struct generator : karma::grammar<It, expr()>
{
generator() : generator::base_type(expr_)
{
using namespace karma;
nullary_ = eps << "nullary_function_call"; // TODO reverse lookup :)
unary_ = eps << "unary_function_call";
binary_ = eps << "binary_function_call";
function_ = nullary_ | unary_ | binary_;
function_call_ = function_ << expr_list;
expr_list = '(' << -(expr_ % ',') << ')';
value_ = karma::int_ | ("'" << karma::string << "'");
expr_ = function_call_ | value_;
}
private:
template <typename Attr> using Rule = karma::rule<It, Attr()>;
Rule<nullary_function_impl> nullary_;
Rule<unary_function_impl> unary_;
Rule<binary_function_impl> binary_;
Rule<function_impl> function_;
Rule<function_call> function_call_;
Rule<value> value_;
Rule<std::vector<expr>> expr_list;
Rule<expr> expr_;
};
int main()
{
for (const std::string input: std::vector<std::string> {
"-99",
"'string'",
"AnswerToLTUAE()",
"ReverseString('string')",
"Concatenate('string', 987)",
"Concatenate('The Answer Is ', AnswerToLTUAE())",
})
{
auto f(std::begin(input)), l(std::end(input));
const static parser<decltype(f)> p;
expr parsed_script;
bool ok = qi::phrase_parse(f,l,p,qi::space,parsed_script);
if (!ok)
std::cout << "invalid input\n";
else
{
const static generator<boost::spirit::ostream_iterator> g;
std::cout << "input:\t" << input << "\n";
std::cout << "tree:\t" << karma::format(g, parsed_script) << "\n";
std::cout << "eval:\t" << evaluate(parsed_script) << "\n\n";
}
if (f!=l) std::cout << "unparsed: '" << std::string(f,l) << "'\n";
}
}