How to pass the iterator to a function in spirit qi - c++

template <typename Iterator>
struct parse_grammar
: qi::grammar<Iterator, std::string()>
{
parse_grammar()
: parse_grammar::base_type(start_p, "start_p"){
a_p = ',' > qi::double_;
b_p = *a_p;
start_p = qi::double_ > b_p >> qi::eoi;
}
qi::rule<Iterator, std::string()> a_p;
qi::rule<Iterator, std::string()> b_p;
qi::rule<Iterator, std::string()> start_p;
};
// implementation
std::vector<double> parse(std::istream& input, const std::string& filename)
{
// iterate over stream input
typedef std::istreambuf_iterator<char> base_iterator_type;
base_iterator_type in_begin(input);
// convert input iterator to forward iterator, usable by spirit parser
typedef boost::spirit::multi_pass<base_iterator_type> forward_iterator_type;
forward_iterator_type fwd_begin = boost::spirit::make_default_multi_pass(in_begin);
forward_iterator_type fwd_end;
// prepare output
std::vector<double> output;
// wrap forward iterator with position iterator, to record the position
typedef classic::position_iterator2<forward_iterator_type> pos_iterator_type;
pos_iterator_type position_begin(fwd_begin, fwd_end, filename);
pos_iterator_type position_end;
parse_grammar<pos_iterator_type> gram;
// parse
try
{
qi::phrase_parse(
position_begin, position_end, // iterators over input
gram, // recognize list of doubles
ascii::space); // comment skipper
}
catch(const qi::expectation_failure<pos_iterator_type>& e)
{
const classic::file_position_base<std::string>& pos = e.first.get_position();
std::stringstream msg;
msg <<
"parse error at file " << pos.file <<
" line " << pos.line << " column " << pos.column << std::endl <<
"'" << e.first.get_currentline() << "'" << std::endl <<
" " << "^- here";
throw std::runtime_error(msg.str());
}
// return result
return output;
}
I have the sample code above (taken from an example on the Boost.Spirit website).
In the rule a_p of the grammar, I want to use a semantic action that calls a method and passes the iterators to it, something like this:
a_p = ',' > qi::double_[boost::bind(&parse_grammar::doStuff, this,
boost::ref(position_begin), boost::ref(position_end))];
where the method doStuff has this signature:
void doStuff(pos_iterator_type const& first, pos_iterator_type const& last);
Any ideas how to do this?
I do not mind how it is done (using boost::phoenix or something else; I am not sure how), as long as the iterators are passed to the method with their current state.

I'm not completely sure why you think you 'need' what you describe. I'm afraid the solution to your actual task might be very simple:
start_p = qi::double_ % ',' > qi::eoi;
However, since the actual question is quite interesting, and the use of position iterators in combination with istreambuf_iterator (rather than just the usual (slower) boost::spirit::istream_iterator) has its merits, I'll show you how to do it with the semantic action as well.
For a simple (but rather complete) test main of
int main()
{
std::istringstream iss(
"1, -3.4 ,3.1415926\n"
",+inF,-NaN ,\n"
"2,-.4,4.14e7\n");
data_t parsed = parse(iss, "<inline-test>");
std::cout << "Done, parsed " << parsed.size() << " values ("
<< "min: " << *std::min_element(parsed.begin(), parsed.end()) << ", "
<< "max: " << *std::max_element(parsed.begin(), parsed.end()) << ")\n";
}
The output with the semantic action now becomes:
debug ('start_p') at <inline-test>:1:[1..2] '1' = 1
debug ('start_p') at <inline-test>:1:[4..8] '-3.4' = -3.4
debug ('start_p') at <inline-test>:1:[10..19] '3.1415926' = 3.14159
debug ('start_p') at <inline-test>:2:[2..6] '+inF' = inf
debug ('start_p') at <inline-test>:2:[7..11] '-NaN' = -nan
debug ('start_p') at <inline-test>:3:[1..2] '2' = 2
debug ('start_p') at <inline-test>:3:[3..6] '-.4' = -0.4
debug ('start_p') at <inline-test>:3:[7..13] '4.14e7' = 4.14e+07
Done, parsed 8 values (min: -3.4, max: inf)
See it live at http://liveworkspace.org/code/8a874ef3...
Note how it
demonstrates access to the name of the actual parser instance ('start_p')
demonstrates access to the full source iterator range
shows how to do specialized processing inside the semantic action
I still suggest using qi::double_ to parse the raw input, because it is the only thing I know that easily handles all cases (see test data and this other question: Is it possible to read infinity or NaN values using input streams?)
demonstrates parsing the actual data into the vector efficiently by displaying statistics of the parsed values
Full Code
Here is the full code for future reference:
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/support_multi_pass.hpp>
#include <boost/spirit/include/classic_position_iterator.hpp>
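#include <boost/phoenix/function/adapt_function.hpp>
#include <algorithm>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>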
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
namespace classic = boost::spirit::classic;
namespace ascii = boost::spirit::ascii;
typedef std::vector<double> data_t;
///////// USING A FREE FUNCTION
//
template <typename Grammar, typename Range>
double doStuff_(Grammar &grammar, Range pos_range)
{
// for efficiency, cache adhoc grammar:
static const qi::rule <typename Range::iterator, double()> r_double = qi::double_;
static const qi::grammar<typename Range::iterator, double()> g_double(r_double); // caching just the rule may be enough, actually
double value = 0;
qi::parse(pos_range.begin(), pos_range.end(), g_double, value);
std::cout << "debug ('" << grammar.name() << "') at "
<< pos_range.begin().get_position().file << ":"
<< pos_range.begin().get_position().line << ":["
<< pos_range.begin().get_position().column << ".."
<< pos_range.end ().get_position().column << "]\t"
<< "'" << std::string(pos_range.begin(),pos_range.end()) << "'\t = "
<< value
<< '\n';
return value;
}
BOOST_PHOENIX_ADAPT_FUNCTION(double, doStuff, doStuff_, 2)
template <typename Iterator, typename Skipper>
struct parse_grammar : qi::grammar<Iterator, data_t(), Skipper>
{
parse_grammar()
: parse_grammar::base_type(start_p, "start_p")
{
using qi::raw;
using qi::double_;
using qi::_1;
using qi::_val;
using qi::eoi;
using phx::push_back;
value_p = raw [ double_ ] [ _val = doStuff(phx::ref(*this), _1) ];
start_p = value_p % ',' > eoi;
// // To use without the semantic action (more efficient):
// start_p = double_ % ',' >> eoi;
}
qi::rule<Iterator, data_t::value_type(), Skipper> value_p;
qi::rule<Iterator, data_t(), Skipper> start_p;
};
// implementation
data_t parse(std::istream& input, const std::string& filename)
{
// iterate over stream input
typedef std::istreambuf_iterator<char> base_iterator_type;
base_iterator_type in_begin(input);
// convert input iterator to forward iterator, usable by spirit parser
typedef boost::spirit::multi_pass<base_iterator_type> forward_iterator_type;
forward_iterator_type fwd_begin = boost::spirit::make_default_multi_pass(in_begin);
forward_iterator_type fwd_end;
// wrap forward iterator with position iterator, to record the position
typedef classic::position_iterator2<forward_iterator_type> pos_iterator_type;
pos_iterator_type position_begin(fwd_begin, fwd_end, filename);
pos_iterator_type position_end;
parse_grammar<pos_iterator_type, ascii::space_type> gram;
data_t output;
// parse
try
{
if (!qi::phrase_parse(
position_begin, position_end, // iterators over input
gram, // recognize list of doubles
ascii::space, // comment skipper
output) // <-- attribute reference
)
{
std::cerr << "Parse failed at "
<< position_begin.get_position().file << ":"
<< position_begin.get_position().line << ":"
<< position_begin.get_position().column << "\n";
}
}
catch(const qi::expectation_failure<pos_iterator_type>& e)
{
const classic::file_position_base<std::string>& pos = e.first.get_position();
std::stringstream msg;
msg << "parse error at file " << pos.file
<< " line " << pos.line
<< " column " << pos.column
<< "\n\t'" << e.first.get_currentline()
<< "'\n\t " << std::string(pos.column, ' ') << "^-- here";
throw std::runtime_error(msg.str());
}
return output;
}
int main()
{
std::istringstream iss(
"1, -3.4 ,3.1415926\n"
",+inF,-NaN ,\n"
"2,-.4,4.14e7\n");
data_t parsed = parse(iss, "<inline-test>");
std::cout << "Done, parsed " << parsed.size() << " values ("
<< "min: " << *std::min_element(parsed.begin(), parsed.end()) << ", "
<< "max: " << *std::max_element(parsed.begin(), parsed.end()) << ")\n";
}

Related

boost spirit assign_to_* customization points

I'm trying to understand how boost spirit assign_to_* customization points work.
Here is an example I am using:
I have this parser in a rule in a grammar:
int_ >> lit(':') >> char_
And I want the result to be put in this struct:
struct IntAndChar{
int i;
char c;
};
(This is just an example to use the customization point, so I won't use BOOST_FUSION_ADAPT_STRUCT or semantic actions.)
I thought I could just define a specialization of assign_to_attribute_from_value but I only get the int this way and the second element is dropped.
Can someone give me a hint to understand how it works?
You don't want to assign to the attribute¹. Instead you wish to transform boost::fusion::vector2<int, char> into IntAndChar.
Therefore, let's start off telling spirit our type is not container-like:
template<>
struct is_container<IntAndChar, void> : mpl::false_ { };
Next, tell it how it can transform between raw and cooked forms of our attributes:
template<>
struct transform_attribute<IntAndChar, fusion::vector2<int, char>, qi::domain, void> {
using Transformed = fusion::vector2<int, char>;
using Exposed = IntAndChar;
using type = Transformed;
static Transformed pre(Exposed&) { return Transformed(); }
static void post(Exposed& val, Transformed const& attr) {
val.i = fusion::at_c<0>(attr);
val.c = fusion::at_c<1>(attr);
}
static void fail(Exposed&) {}
};
That's it! There is one catch though. It won't work unless you trigger a transformation. The docs say:
It is invoked by Qi rule, semantic action and attr_cast, [...]
1. Using qi::rule (not very helpful)
So here's a solution using rule:
int main() {
using It = std::string::const_iterator;
qi::rule<It, boost::fusion::vector2<int, char>(), qi::space_type> rule = qi::int_ >> ':' >> qi::char_;
//qi::rule<It, IntAndChar(), qi::space_type> rule = qi::attr_cast(qi::int_ >> ':' >> qi::char_);
for (std::string const input : { "123:a", "-4 : \r\nq" }) {
It f = input.begin(), l = input.end();
IntAndChar data;
bool ok = qi::phrase_parse(f, l, rule, qi::space, data);
if (ok) std::cout << "Parse success: " << data.i << ", " << data.c << "\n";
else std::cout << "Parse failure ('" << input << "')\n";
if (f != l) std::cout << "Remaining unparsed input: '" << std::string(f, l) << "'\n";
}
}
Prints:
Parse success: 123, a
Parse success: -4, q
Of course this approach requires you to spell out boost::fusion::vector2<int, char> which is tedious and error-prone.
2. Using qi::attr_cast
You can use qi::attr_cast to trigger the transform:
qi::rule<It, IntAndChar(), qi::space_type> rule = qi::attr_cast<IntAndChar, boost::fusion::vector2<int, char> >(qi::int_ >> ':' >> qi::char_);
// using deduction:
qi::rule<It, IntAndChar(), qi::space_type> rule = qi::attr_cast<IntAndChar>(qi::int_ >> ':' >> qi::char_);
// using even more deduction:
qi::rule<It, IntAndChar(), qi::space_type> rule = qi::attr_cast(qi::int_ >> ':' >> qi::char_);
CAVEAT That should work. However, due to very subtle behaviour (bugs?) you need to deep-copy the Proto expression tree there, in order for it to work without Undefined Behaviour:
qi::rule<It, IntAndChar(), qi::space_type> rule = qi::attr_cast(qi::copy(qi::int_ >> ':' >> qi::char_));
Bringing it all together, we can even do without the qi::rule:
int main() {
using It = std::string::const_iterator;
for (std::string const input : { "123:a", "-4 : \r\nq" }) {
It f = input.begin(), l = input.end();
IntAndChar data;
bool ok = qi::phrase_parse(f, l, qi::attr_cast(qi::copy(qi::int_ >> ':' >> qi::char_)), qi::space, data);
if (ok) std::cout << "Parse success: " << data.i << ", " << data.c << "\n";
else std::cout << "Parse failure ('" << input << "')\n";
if (f != l) std::cout << "Remaining unparsed input: '" << std::string(f, l) << "'\n";
}
}
Prints
Parse success: 123, a
Parse success: -4, q
¹ (unless you want to treat IntAndChar as a container, which is a different story)

Boost Spirit Qi: Skipper parser does not skip under certain conditions

I am currently implementing a parser which succeeds on the "strongest" match for spirit::qi. There are meaningful applications for such a thing, for example matching references that are either simple refs (e.g. "willy") or namespace-qualified refs (e.g. "willy::anton"). That's not my actual real-world case, but it is almost self-explanatory, I guess. At least it helped me to track down the issue.
I found a solution for that. It works perfectly when the skipper parser is not involved (i.e. there is nothing to skip). It does not work as expected if there are areas which need skipping.
I believe I tracked down the problem. It seems that under certain conditions spaces are not actually skipped although they should be.
Below you will find a self-contained working example. It loops over some rules and some input to provide enough information. If you run it with BOOST_SPIRIT_DEBUG enabled, you get, in particular, this output:
<qualifier>
<try> :: anton</try>
<fail/>
</qualifier>
I think, this one should not have failed. Am I right guessing so? Does anyone know a way to get around that? Or is it just my poor understanding of qi semantics? Thank you very much for your time. :)
My environment: MSVC 2015 latest, target win32 console
#define BOOST_SPIRIT_DEBUG
#include <io.h>
#include<map>
#include <boost/spirit/include/qi.hpp>
typedef std::string::const_iterator iterator_type;
namespace qi = boost::spirit::qi;
using map_type = std::map<std::string, qi::rule<iterator_type, std::string()>&>;
namespace maxence { namespace parser {
template <typename Iterator>
struct ident : qi::grammar<Iterator, std::string()>
{
ident();
qi::rule<Iterator, std::string()>
id, id_raw;
qi::rule<Iterator, std::string()>
not_used,
qualifier,
qualified_id, simple_id,
id_reference, id_reference_final;
map_type rules = {
{ "E1", id },
{ "E2", id_raw}
};
};
template <typename Iterator>
// we actually don't need the start rule (see below)
ident<Iterator>::ident() : ident::base_type(not_used)
{
id_reference = (!simple_id >> qualified_id) | (!qualified_id >> simple_id);
id_reference_final = id_reference;
///////////////////////////////////////////////////
// standard simple id (not followed by
// delimiter "::")
simple_id = (qi::alpha | '_') >> *(qi::alnum | '_') >> !qi::lit("::");
///////////////////////////////////////////////////
// this is qualifier <- "::" simple_id
// I repeat the simple_id pattern here to make sure
// this demo has no "early match" issues
qualifier = qi::string("::") > (qi::alpha | '_') >> *(qi::alnum | '_');
///////////////////////////////////////////////////
// this is: qualified_id <- simple_id qualifier*
qualified_id = (qi::alpha | '_') >> *(qi::alnum | '_') >> +(qualifier) >> !qi::lit("::");
id = id_reference_final;
id_raw = qi::raw[id_reference_final];
BOOST_SPIRIT_DEBUG_NODES(
(id)
(id_raw)
(qualifier)
(qualified_id)
(simple_id)
(id_reference)
(id_reference_final)
)
}
}}
int main()
{
maxence::parser::ident<iterator_type> ident;
using ss_map_type = std::map<std::string, std::string>;
ss_map_type parser_input =
{
{ "Simple id (behaves ok)", "willy" },
{ "Qualified id (behaves ok)", "willy::anton" },
{ "Skipper involved (unexpected)", "willy :: anton" }
};
for (ss_map_type::const_iterator input = parser_input.begin(); input != parser_input.end(); input++) {
for (map_type::const_iterator example = ident.rules.begin(); example != ident.rules.end(); example++) {
std::string to_parse = input->second;
std::string result;
std::string parser_name = (example->second).name();
std::cout << "--------------------------------------------" << std::endl;
std::cout << "Description: " << input->first << std::endl;
std::cout << "Parser [" << parser_name << "] parsing [" << to_parse << "]" << std::endl;
auto b(to_parse.begin()), e(to_parse.end());
// --- test for parser success
bool success = qi::phrase_parse(b, e, (example)->second, qi::space, result);
if (success) std::cout << "Parser succeeded. Result: " << result << std::endl;
else std::cout << " Parser failed. " << std::endl;
//--- test for EOI
if (b == e) {
std::cout << "EOI reached.";
if (success) std::cout << " The sun is shining brightly. :)";
} else {
std::cout << "Failure: EOI not reached. Remaining: [";
while (b != e) std::cout << *b++;
std::cout << "]";
}
std::cout << std::endl << "--------------------------------------------" << std::endl;
}
}
return 0;
}

boost::spirit access position iterator from semantic actions

Let's say I have code like this (line numbers for reference):
1:
2:function FuncName_1 {
3: var Var_1 = 3;
4: var Var_2 = 4;
5: ...
I want to write a grammar that parses such text and puts the information for all identifiers (function and variable names) into a tree (utree?).
Each node should preserve: line_num, column_num and symbol value. example:
root: FuncName_1 (line:2,col:10)
children[0]: Var_1 (line:3, col:8)
children[1]: Var_2 (line:4, col:9)
I want to put it into the tree because I plan to traverse through that tree and for each node I must know the 'context': (all parent nodes of current nodes).
E.g, while processing node with Var_1, I must know that this is a name for local variable for function FuncName_1 (that is currently being processed as node, but one level earlier)
I cannot figure out a few things:
Can this be done in Spirit with semantic actions and utree? Or should I use variant<> trees?
How do I pass those three pieces of information (column, line, symbol_name) to the node at the same time? I know I must use pos_iterator as the iterator type for the grammar, but how do I access that information in a semantic action?
I'm a newbie in Boost, so I have read the Spirit documentation over and over and tried to google my problems, but I somehow cannot put all the pieces together to find the solution. It seems that no one has had a use case like mine before (or I'm just not able to find it).
It looks like the only solutions using a position iterator are the ones for parse error handling, but that is not the case I'm interested in.
The code that only parses the text I was talking about is below, but I don't know how to move forward with it.
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
namespace qi = boost::spirit::qi;
typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;
template<typename Iterator=pos_iterator_t, typename Skipper=qi::space_type>
struct ParseGrammar: public qi::grammar<Iterator, Skipper>
{
ParseGrammar():ParseGrammar::base_type(SourceCode)
{
using namespace qi;
KeywordFunction = lit("function");
KeywordVar = lit("var");
SemiColon = lit(';');
Identifier = lexeme [alpha >> *(alnum | '_')];
VarAssignemnt = KeywordVar >> Identifier >> char_('=') >> int_ >> SemiColon;
SourceCode = KeywordFunction >> Identifier >> '{' >> *VarAssignemnt >> '}';
}
qi::rule<Iterator, Skipper> SourceCode;
qi::rule<Iterator > KeywordFunction;
qi::rule<Iterator, Skipper> VarAssignemnt;
qi::rule<Iterator> KeywordVar;
qi::rule<Iterator> SemiColon;
qi::rule<Iterator > Identifier;
};
int main()
{
std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var Var_2 = 4; }";
pos_iterator_t first(content.begin()), iter = first, last(content.end());
ParseGrammar<pos_iterator_t> resolver; // Our parser
bool ok = phrase_parse(iter,
last,
resolver,
qi::space);
std::cout << std::boolalpha;
std::cout << "\nok : " << ok << std::endl;
std::cout << "full : " << (iter == last) << std::endl;
if(ok && iter == last)
{
std::cout << "OK: Parsing fully succeeded\n\n";
}
else
{
int line = get_line(iter);
int column = get_column(first, iter);
std::cout << "-------------------------\n";
std::cout << "ERROR: Parsing failed or not complete\n";
std::cout << "stopped at: " << line << ":" << column << "\n";
std::cout << "remaining: '" << std::string(iter, last) << "'\n";
std::cout << "-------------------------\n";
}
return 0;
}
This has been a fun exercise, where I finally put together a working demo of on_success[1] to annotate AST nodes.
Let's assume we want an AST like:
namespace ast
{
struct LocationInfo {
unsigned line, column, length;
};
struct Identifier : LocationInfo {
std::string name;
};
struct VarAssignment : LocationInfo {
Identifier id;
int value;
};
struct SourceCode : LocationInfo {
Identifier function;
std::vector<VarAssignment> assignments;
};
}
I know, 'location information' is probably overkill for the SourceCode node, but you know... Anyways, to make it easy to assign attributes to these nodes without requiring semantic actions or lots of specifically crafted constructors:
#include <boost/fusion/adapted/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(ast::Identifier, (std::string, name))
BOOST_FUSION_ADAPT_STRUCT(ast::VarAssignment, (ast::Identifier, id)(int, value))
BOOST_FUSION_ADAPT_STRUCT(ast::SourceCode, (ast::Identifier, function)(std::vector<ast::VarAssignment>, assignments))
There. Now we can declare the rules to expose these attributes:
qi::rule<Iterator, ast::SourceCode(), Skipper> SourceCode;
qi::rule<Iterator, ast::VarAssignment(), Skipper> VarAssignment;
qi::rule<Iterator, ast::Identifier()> Identifier;
// no skipper, no attributes:
qi::rule<Iterator> KeywordFunction, KeywordVar, SemiColon;
We don't (essentially) modify the grammar, at all: attribute propagation is "just automatic"[2] :
KeywordFunction = lit("function");
KeywordVar = lit("var");
SemiColon = lit(';');
Identifier = as_string [ alpha >> *(alnum | char_("_")) ];
VarAssignment = KeywordVar >> Identifier >> '=' >> int_ >> SemiColon;
SourceCode = KeywordFunction >> Identifier >> '{' >> *VarAssignment >> '}';
The magic
How do we get the source location information attached to our nodes?
auto set_location_info = annotate(_val, _1, _3);
on_success(Identifier, set_location_info);
on_success(VarAssignment, set_location_info);
on_success(SourceCode, set_location_info);
Now, annotate is just a lazy version of a callable that is defined as:
template<typename It>
struct annotation_f {
typedef void result_type;
annotation_f(It first) : first(first) {}
It const first;
template<typename Val, typename First, typename Last>
void operator()(Val& v, First f, Last l) const {
do_annotate(v, f, l, first);
}
private:
void static do_annotate(ast::LocationInfo& li, It f, It l, It first) {
using std::distance;
li.line = get_line(f);
li.column = get_column(first, f);
li.length = distance(f, l);
}
static void do_annotate(...) { }
};
Due to the way in which get_column works, the functor is stateful (as it remembers the start iterator)[3]. As you can see, do_annotate just accepts anything that derives from LocationInfo.
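Both points can be shown without Spirit at all. The sketch below is a standard-library-only approximation: line_of and column_of are simplified stand-ins for get_line and get_column (not Boost's implementation), and annotator is a hypothetical reduction of annotation_f. It shows why the start iterator has to be remembered (the column is found by walking backwards, and the walk needs a lower bound) and how an ellipsis overload turns annotation into a no-op for types that do not derive from LocationInfo.

```cpp
#include <cstddef>
#include <string>

struct LocationInfo { std::size_t line, column; };
struct Node  : LocationInfo { int value; };  // gets annotated
struct Plain { int value; };                 // silently ignored

// 1-based line: count the newlines between the start and the position.
template <typename It>
std::size_t line_of(It start, It pos)
{
    std::size_t line = 1;
    for (; start != pos; ++start)
        if (*start == '\n') ++line;
    return line;
}

// 1-based column: walk back until a newline or the start of input.
// Without `start` there would be no safe place to stop on the first line.
template <typename It>
std::size_t column_of(It start, It pos)
{
    std::size_t column = 1;
    while (pos != start) {
        --pos;
        if (*pos == '\n') break;
        ++column;
    }
    return column;
}

template <typename It>
struct annotator {
    explicit annotator(It start) : start(start) {}
    It start;  // the state: lower bound for the column search

    // Returns true if the node was annotated, false if it was ignored.
    template <typename T>
    bool operator()(T& node, It pos) const
    {
        LocationInfo* li = as_location(&node);
        if (!li) return false;
        li->line   = line_of(start, pos);
        li->column = column_of(start, pos);
        return true;
    }

private:
    // Chosen for anything derived from LocationInfo ...
    static LocationInfo* as_location(LocationInfo* p) { return p; }
    // ... and this catch-all is chosen for everything else.
    static LocationInfo* as_location(...) { return nullptr; }
};
```

Only a pointer is passed through the ellipsis here, which sidesteps the question of passing class types through a variadic parameter.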
Now, the proof of the pudding:
std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var Var_2 = 4; }";
pos_iterator_t first(content.begin()), iter = first, last(content.end());
ParseGrammar<pos_iterator_t> resolver(first); // Our parser
ast::SourceCode program;
bool ok = phrase_parse(iter,
last,
resolver,
qi::space,
program);
std::cout << std::boolalpha;
std::cout << "ok : " << ok << std::endl;
std::cout << "full: " << (iter == last) << std::endl;
if(ok && iter == last)
{
std::cout << "OK: Parsing fully succeeded\n\n";
std::cout << "Function name: " << program.function.name << " (see L" << program.printLoc() << ")\n";
for (auto const& va : program.assignments)
std::cout << "variable " << va.id.name << " assigned value " << va.value << " at L" << va.printLoc() << "\n";
}
else
{
int line = get_line(iter);
int column = get_column(first, iter);
std::cout << "-------------------------\n";
std::cout << "ERROR: Parsing failed or not complete\n";
std::cout << "stopped at: " << line << ":" << column << "\n";
std::cout << "remaining: '" << std::string(iter, last) << "'\n";
std::cout << "-------------------------\n";
}
This prints:
ok : true
full: true
OK: Parsing fully succeeded
Function name: FuncName_1 (see L1:1:56)
variable Var_1 assigned value 3 at L2:3:14
variable Var_2 assigned value 4 at L3:3:15
Full Demo Program
Also showing:
error handling, e.g.:
Error: expecting "=" in line 3:
var Var_2 - 4; }
^---- here
ok : false
full: false
-------------------------
ERROR: Parsing failed or not complete
stopped at: 1:1
remaining: 'function FuncName_1 {
var Var_1 = 3;
var Var_2 - 4; }'
-------------------------
BOOST_SPIRIT_DEBUG macros
A bit of a hacky way to conveniently stream the LocationInfo part of any AST node, sorry :)
//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
namespace phx= boost::phoenix;
typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;
namespace ast
{
namespace manip { struct LocationInfoPrinter; }
struct LocationInfo {
unsigned line, column, length;
manip::LocationInfoPrinter printLoc() const;
};
struct Identifier : LocationInfo {
std::string name;
};
struct VarAssignment : LocationInfo {
Identifier id;
int value;
};
struct SourceCode : LocationInfo {
Identifier function;
std::vector<VarAssignment> assignments;
};
///////////////////////////////////////////////////////////////////////////
// Completely unnecessary tweak to get a "poor man's" io manipulator going
// so we can do `std::cout << x.printLoc()` on types of `x` deriving from
// LocationInfo
namespace manip {
struct LocationInfoPrinter {
LocationInfoPrinter(LocationInfo const& ref) : ref(ref) {}
LocationInfo const& ref;
friend std::ostream& operator<<(std::ostream& os, LocationInfoPrinter const& lip) {
return os << lip.ref.line << ':' << lip.ref.column << ':' << lip.ref.length;
}
};
}
manip::LocationInfoPrinter LocationInfo::printLoc() const { return { *this }; }
// feel free to disregard this hack
///////////////////////////////////////////////////////////////////////////
}
BOOST_FUSION_ADAPT_STRUCT(ast::Identifier, (std::string, name))
BOOST_FUSION_ADAPT_STRUCT(ast::VarAssignment, (ast::Identifier, id)(int, value))
BOOST_FUSION_ADAPT_STRUCT(ast::SourceCode, (ast::Identifier, function)(std::vector<ast::VarAssignment>, assignments))
struct error_handler_f {
typedef qi::error_handler_result result_type;
template<typename T1, typename T2, typename T3, typename T4>
qi::error_handler_result operator()(T1 b, T2 e, T3 where, T4 const& what) const {
std::cerr << "Error: expecting " << what << " in line " << get_line(where) << ": \n"
<< std::string(b,e) << "\n"
<< std::setw(std::distance(b, where)) << '^' << "---- here\n";
return qi::fail;
}
};
template<typename It>
struct annotation_f {
typedef void result_type;
annotation_f(It first) : first(first) {}
It const first;
template<typename Val, typename First, typename Last>
void operator()(Val& v, First f, Last l) const {
do_annotate(v, f, l, first);
}
private:
void static do_annotate(ast::LocationInfo& li, It f, It l, It first) {
using std::distance;
li.line = get_line(f);
li.column = get_column(first, f);
li.length = distance(f, l);
}
static void do_annotate(...) {}
};
template<typename Iterator=pos_iterator_t, typename Skipper=qi::space_type>
struct ParseGrammar: public qi::grammar<Iterator, ast::SourceCode(), Skipper>
{
ParseGrammar(Iterator first) :
ParseGrammar::base_type(SourceCode),
annotate(first)
{
using namespace qi;
KeywordFunction = lit("function");
KeywordVar = lit("var");
SemiColon = lit(';');
Identifier = as_string [ alpha >> *(alnum | char_("_")) ];
VarAssignment = KeywordVar > Identifier > '=' > int_ > SemiColon; // note: expectation points
SourceCode = KeywordFunction >> Identifier >> '{' >> *VarAssignment >> '}';
on_error<fail>(VarAssignment, handler(_1, _2, _3, _4));
on_error<fail>(SourceCode, handler(_1, _2, _3, _4));
auto set_location_info = annotate(_val, _1, _3);
on_success(Identifier, set_location_info);
on_success(VarAssignment, set_location_info);
on_success(SourceCode, set_location_info);
BOOST_SPIRIT_DEBUG_NODES((KeywordFunction)(KeywordVar)(SemiColon)(Identifier)(VarAssignment)(SourceCode))
}
phx::function<error_handler_f> handler;
phx::function<annotation_f<Iterator>> annotate;
qi::rule<Iterator, ast::SourceCode(), Skipper> SourceCode;
qi::rule<Iterator, ast::VarAssignment(), Skipper> VarAssignment;
qi::rule<Iterator, ast::Identifier()> Identifier;
// no skipper, no attributes:
qi::rule<Iterator> KeywordFunction, KeywordVar, SemiColon;
};
int main()
{
std::string const content = "function FuncName_1 {\n var Var_1 = 3;\n var Var_2 - 4; }";
pos_iterator_t first(content.begin()), iter = first, last(content.end());
ParseGrammar<pos_iterator_t> resolver(first); // Our parser
ast::SourceCode program;
bool ok = phrase_parse(iter,
last,
resolver,
qi::space,
program);
std::cout << std::boolalpha;
std::cout << "ok : " << ok << std::endl;
std::cout << "full: " << (iter == last) << std::endl;
if(ok && iter == last)
{
std::cout << "OK: Parsing fully succeeded\n\n";
std::cout << "Function name: " << program.function.name << " (see L" << program.printLoc() << ")\n";
for (auto const& va : program.assignments)
std::cout << "variable " << va.id.name << " assigned value " << va.value << " at L" << va.printLoc() << "\n";
}
else
{
int line = get_line(iter);
int column = get_column(first, iter);
std::cout << "-------------------------\n";
std::cout << "ERROR: Parsing failed or not complete\n";
std::cout << "stopped at: " << line << ":" << column << "\n";
std::cout << "remaining: '" << std::string(iter, last) << "'\n";
std::cout << "-------------------------\n";
}
return 0;
}
[1] sadly un(der)documented, except for the conjure sample(s)
[2] well, I used as_string to get proper assignment to Identifier without too much work
[3] There could be smarter ways about this in terms of performance, but for now, let's keep it simple

Boost::spirit (classic) primitives vs custom parsers

I'm a beginner in Boost::spirit and I want to define a grammar that parses the TTCN language.
(http://www.trex.informatik.uni-goettingen.de/trac/wiki/ttcn-3_4.5.1)
I'm trying to define some rules for 'primitive' parsers like Alpha and AlphaNum, to stay faithful 1 to 1 to the original grammar, but obviously I am doing something wrong, because the grammar defined this way does not work.
But when I use the primitive parsers in place of TTCN's, it starts to work.
Can someone tell me why the 'manually' defined rules do not work as expected?
How can I fix it? I would like to stick close to the original grammar.
Is it a beginner's code bug or something different?
#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/classic_symbols.hpp>
#include <boost/spirit/include/classic_tree_to_xml.hpp>
#include <boost/spirit/include/classic_position_iterator.hpp>
#include <boost/spirit/include/classic_core.hpp>
#include <boost/spirit/include/classic_parse_tree.hpp>
#include <boost/spirit/include/classic_ast.hpp>
#include <iostream>
#include <string>
#include <boost/spirit/home/classic/debug.hpp>
using namespace boost::spirit::classic;
using namespace std;
using namespace BOOST_SPIRIT_CLASSIC_NS;
typedef node_iter_data_factory<int> factory_t;
typedef position_iterator<std::string::iterator> pos_iterator_t;
typedef tree_match<pos_iterator_t, factory_t> parse_tree_match_t;
typedef parse_tree_match_t::const_tree_iterator iter_t;
struct ParseGrammar: public grammar<ParseGrammar>
{
template<typename ScannerT>
struct definition
{
definition(ParseGrammar const &)
{
KeywordImport = str_p("import");
KeywordAll = str_p("all");
SemiColon = ch_p(';');
Underscore = ch_p('_');
NonZeroNum = range_p('1','9');
Num = ch_p('0') | NonZeroNum;
UpperAlpha = range_p('A', 'Z');
LowerAlpha = range_p('a', 'z');
Alpha = UpperAlpha | LowerAlpha;
AlphaNum = Alpha | Num;
//this does not!
Identifier = lexeme_d[Alpha >> *(AlphaNum | Underscore)];
// Uncomment below line to make rule work
// Identifier = lexeme_d[alpha_p >> *(alnum_p | Underscore)];
Module = KeywordImport >> Identifier >> KeywordAll >> SemiColon;
BOOST_SPIRIT_DEBUG_NODE(Module);
BOOST_SPIRIT_DEBUG_NODE(KeywordImport);
BOOST_SPIRIT_DEBUG_NODE(KeywordAll);
BOOST_SPIRIT_DEBUG_NODE(Identifier);
BOOST_SPIRIT_DEBUG_NODE(SemiColon);
}
rule<ScannerT> KeywordImport,KeywordAll,Module,Identifier,SemiColon;
rule<ScannerT> Alpha,UpperAlpha,LowerAlpha,Underscore,Num,AlphaNum;
rule<ScannerT> NonZeroNum;
rule<ScannerT> const&
start() const { return Module; }
};
};
int main()
{
ParseGrammar resolver; // Our parser
BOOST_SPIRIT_DEBUG_NODE(resolver);
string content = "import foobar all;";
pos_iterator_t pos_begin(content.begin(), content.end());
pos_iterator_t pos_end;
tree_parse_info<pos_iterator_t, factory_t> info;
info = ast_parse<factory_t>(pos_begin, pos_end, resolver, space_p);
std::cout << "\ninfo.length : " << info.length << std::endl;
std::cout << "info.full : " << info.full << std::endl;
if(info.full)
{
std::cout << "OK: Parsing succeeded\n\n";
}
else
{
int line = info.stop.get_position().line;
int column = info.stop.get_position().column;
std::cout << "-------------------------\n";
std::cout << "ERROR: Parsing failed\n";
std::cout << "stopped at: " << line << ":" << column << "\n";
std::cout << "-------------------------\n";
}
return 0;
}
I don't do Spirit Classic (which has been deprecated for some years now).
I can only assume you've mixed something up with skippers. Here's the thing translated into Spirit V2:
#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
typedef boost::spirit::line_pos_iterator<std::string::const_iterator> pos_iterator_t;
template <typename Iterator = pos_iterator_t, typename Skipper = qi::space_type>
struct ParseGrammar: public qi::grammar<Iterator, Skipper>
{
ParseGrammar() : ParseGrammar::base_type(Module)
{
using namespace qi;
KeywordImport = lit("import");
KeywordAll = lit("all");
SemiColon = lit(';');
#if 1
// this rule obviously works
Identifier = lexeme [alpha >> *(alnum | '_')];
#else
// this does too, but less efficiently
Underscore = lit('_');
NonZeroNum = char_('1','9');
Num = char_('0') | NonZeroNum;
UpperAlpha = char_('A', 'Z');
LowerAlpha = char_('a', 'z');
Alpha = UpperAlpha | LowerAlpha;
AlphaNum = Alpha | Num;
Identifier = lexeme [Alpha >> *(AlphaNum | Underscore)];
#endif
Module = KeywordImport >> Identifier >> KeywordAll >> SemiColon;
BOOST_SPIRIT_DEBUG_NODES((Module)(KeywordImport)(KeywordAll)(Identifier)(SemiColon))
}
qi::rule<Iterator, Skipper> Module;
qi::rule<Iterator> KeywordImport,KeywordAll,Identifier,SemiColon;
qi::rule<Iterator> Alpha,UpperAlpha,LowerAlpha,Underscore,Num,AlphaNum;
qi::rule<Iterator> NonZeroNum;
};
int main()
{
std::string const content = "import \r\n\r\nfoobar\r\n\r\n all; bogus";
pos_iterator_t first(content.begin()), iter=first, last(content.end());
ParseGrammar<pos_iterator_t> resolver; // Our parser
bool ok = phrase_parse(iter, last, resolver, qi::space);
std::cout << std::boolalpha;
std::cout << "\nok : " << ok << std::endl;
std::cout << "full : " << (iter == last) << std::endl;
if(ok && iter==last)
{
std::cout << "OK: Parsing fully succeeded\n\n";
}
else
{
int line = get_line(iter);
int column = get_column(first, iter);
std::cout << "-------------------------\n";
std::cout << "ERROR: Parsing failed or not complete\n";
std::cout << "stopped at: " << line << ":" << column << "\n";
std::cout << "remaining: '" << std::string(iter, last) << "'\n";
std::cout << "-------------------------\n";
}
return 0;
}
I've added a little "bogus" at the end of input, so the output becomes a nicer demonstration:
<Module>
<try>import \r\n\r\nfoobar\r\n\r</try>
<KeywordImport>
<try>import \r\n\r\nfoobar\r\n\r</try>
<success> \r\n\r\nfoobar\r\n\r\n all;</success>
<attributes>[]</attributes>
</KeywordImport>
<Identifier>
<try>foobar\r\n\r\n all; bogu</try>
<success>\r\n\r\n all; bogus</success>
<attributes>[]</attributes>
</Identifier>
<KeywordAll>
<try>all; bogus</try>
<success>; bogus</success>
<attributes>[]</attributes>
</KeywordAll>
<SemiColon>
<try>; bogus</try>
<success> bogus</success>
<attributes>[]</attributes>
</SemiColon>
<success> bogus</success>
<attributes>[]</attributes>
</Module>
ok : true
full : false
-------------------------
ERROR: Parsing failed or not complete
stopped at: 3:8
remaining: 'bogus'
-------------------------
All that said, this is what I'd probably reduce it to:
template <typename Iterator, typename Skipper = qi::space_type>
struct ParseGrammar: public qi::grammar<Iterator, Skipper>
{
ParseGrammar() : ParseGrammar::base_type(Module)
{
using namespace qi;
Identifier = alpha >> *(alnum | '_');
Module = "import" >> Identifier >> "all" >> ';';
BOOST_SPIRIT_DEBUG_NODES((Module)(Identifier))
}
qi::rule<Iterator, Skipper> Module;
qi::rule<Iterator> Identifier;
};
As you can see, the Identifier rule is implicitly a lexeme because it isn't declared to use a skipper.
See it Live on Coliru

boost::spirit::qi parser using multiple grammars and phoenix::construct

I'm having trouble writing a Qi grammar which utilizes another Qi grammar. A similar question was asked here, but I'm also trying to use phoenix::construct and having compilation difficulties.
Here's a simplified version of what I'm trying to do. I realize that this example could probably be done easily using BOOST_FUSION_ADAPT_STRUCT, but my actual code deals with more complex object types so I'm hoping there's a way to accomplish this using semantic actions.
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_container.hpp>
#include <boost/spirit/include/phoenix_statement.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <cstdlib>
#include <complex>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
// grammar for real numbers
template <typename Iterator>
struct Real : qi::grammar<Iterator, long double()>
{
qi::rule<Iterator, long double()> r;
Real() : Real::base_type(r)
{
r %= qi::long_double;
}
};
// grammar for complex numbers of the form a+bi
template <typename Iterator>
struct Complex : qi::grammar<Iterator, std::complex<long double>()>
{
qi::rule<Iterator, std::complex<long double>()> r;
Real<Iterator> real;
Complex() : Complex::base_type(r)
{
r = real [qi::_a = qi::_1] >> (qi::lit("+") | qi::lit("-"))
>> real [qi::_b = qi::_1] >> -qi::lit("*") >> qi::lit("i")
[
qi::_val = phx::construct<std::complex<long double> >(qi::_a, qi::_b)
];
}
};
int main()
{
// test real parsing
std::cout << "Parsing '3'" << std::endl;
std::string toParse = "3";
Real<std::string::iterator> real_parser;
long double real_val;
std::string::iterator beginIt = toParse.begin();
std::string::iterator endIt = toParse.end();
bool r = qi::parse(beginIt, endIt, real_parser, real_val);
if(r && beginIt == endIt)
std::cout << "Successful parse: " << real_val << std::endl;
else
std::cout << "Could not parse" << std::endl;
// test complex parsing
std::cout << "Parsing '3+4i'" << std::endl;
toParse = "3+4i";
Complex<std::string::iterator> complex_parser;
std::complex<long double> complex_val;
beginIt = toParse.begin();
endIt = toParse.end();
r = qi::parse(beginIt, endIt, complex_parser, complex_val);
if(r && beginIt == endIt)
std::cout << "Successful parse: " << complex_val << std::endl;
else
std::cout << "Could not parse" << std::endl;
}
I'm able to parse a Complex using the phrase_parse approach demonstrated in Spirit's documentation, but I'd like to be able to easily integrate the Complex grammar into other parsers (an expression parser, for instance). Is there something I'm missing that would allow me to parse Real and Complex objects as distinct entities while still being able to effectively use them in other rules/grammars?
qi::_a and qi::_b represent the first and second local variables of a rule. These variables are only available if you add qi::locals<long double, long double> as a template parameter in the declaration of rule r (and, in this case, also to qi::grammar, since the start rule passed to the grammar's constructor needs to be compatible with the grammar, i.e. have the same template parameters).
Below you can see another alternative without the need for the local variables:
// grammar for complex numbers of the form a+bi
template <typename Iterator>
struct Complex : qi::grammar<Iterator, std::complex<long double>()>
{
qi::rule<Iterator, std::complex<long double>()> r;
Real<Iterator> real;
Complex() : Complex::base_type(r)
{
r = (
real >> (qi::lit("+") | qi::lit("-"))
>> real >> -qi::lit("*") >> qi::lit("i")
)
[
qi::_val = phx::construct<std::complex<long double> >(qi::_1, qi::_2)
];
}
};
In this case the semantic action is attached to the whole parser sequence and we can get the attributes we need with the _N placeholders. Here, qi::_1 refers to the attribute matched by the first Real parser, and qi::_2 to the second one.
With either alternative, the grammars can then be used like any other parser:
//using complex_parser, real_parser, complex_val and real_val declared in your code
std::cout << "Parsing 'variable=3+4i-2'" << std::endl;
toParse = "variable=3+4i-2";
beginIt = toParse.begin();
endIt = toParse.end();
std::string identifier;
r = qi::parse(beginIt, endIt, *qi::char_("a-z") >> '=' >> complex_parser >> '-' >> real_parser, identifier, complex_val, real_val);
if(r && beginIt == endIt)
std::cout << "Successful parse: " << identifier << complex_val.real() << " " << complex_val.imag() << " " << real_val << std::endl;
else
std::cout << "Could not parse" << std::endl;