I have a grammar that works perfectly fine and contains the following lines.
element = container | list | pair;
container = name >> '(' >> -(arg % ',') >> ')' >> '{' >> +element > '}';
// trying to put an expectation operator here --------^
list = name >> '(' > (value % ',') > ')' > ';';
pair = name >> ':' > value > ';';
To have meaningful error messages, I want to make sure that container does not backtrack as soon as it hits '{'. But for some reason, if I replace the sequence operator with an expectation operator right after the '{', I get a huge compiler error. Any ideas what the problem might be?
element is a boost::variant; container, list and pair are own structs with BOOST_FUSION_ADAPT_STRUCT applied. Please have a look here for the full source code: https://github.com/fklemme/liberty_tool/blob/master/src/liberty_grammar.hpp#L24
Yes. Because the precedences of operator>> and operator> aren't equal, the resulting synthesized attribute type is different.
In fact, it is no longer automatically compatible with the intended exposed attribute type.
In this case the problem can be quickly neutralized with some disambiguating parentheses around the sub-expression:
container = name >> '(' >> -(arg % ',') >> ')' >> ('{' > +element > '}');
Related
my parser is nearly working :)
(still amazed by Spirit feature set (and compiletimes) and the very welcoming community here on stack overflow)
small sample for online try:
http://coliru.stacked-crooked.com/a/1c1bf88909dce7e3
so i've learned to use more lexeme-rules and try to prevent no_skip -
my rules are smaller and better to read as a result but now i stuck with
combining lexeme-rules and skipping-rules what seems to be not possible (compiletime error with warning about not castable to Skipper)
my problem is the comma seperated list in subscriptions
which does not skip spaces around expressions
parses:
"a.b[a,b]"
fails:
"a.b[ a , b ]"
these are my rules:
qi::rule<std::string::const_iterator, std::string()> identifier_chain;
qi::rule<std::string::const_iterator, std::string()>
expression_list = identifier_chain >> *(qi::char_(',') >> identifier_chain);
qi::rule < std::string::const_iterator, std::string() >
subscription = qi::char_('[') >> expression_list >> qi::char_(']');
qi::rule<std::string::const_iterator, std::string()>
identifier = qi::ascii::alpha >> *(qi::ascii::alnum | '_');
identifier_chain = identifier >> *(('.' >> identifier) | subscription);
as you can see all rules are "lexeme" and i think the subscription rule should be a ascii::space_type skipper but that does not compile
should i add space eaters in the front and back of identifier_chains in the expression_list?
feels like writing an regex :(
expression_list = *qi::blank >> identifier_chain >> *(*qi::blank >> qi::char_(',') >> *qi::blank >> identifier_chain >> *qi::blank);
it works but i've read that this will get me to an much bigger parser in the end (handling all the space skipping by myself)
thx for any advice
btw: any idea why i can't compile if surrounding the '.' in the indentifier_chain with qi::char_('.')
identifier_chain = identifier >> *(('.' >> identifier) | subscription);
UPDATE:
i've updated my expression list as suggested by sehe
qi::rule<std::string::const_iterator, spirit::ascii::blank_type, std::string()>
expression_list = identifier_chain >> *(qi::char_(',') >> identifier_chain);
qi::rule < std::string::const_iterator, std::string() >
subscription = qi::char_('[') >> qi::skip(qi::blank)[expression_list] >> qi::char_(']');
but still get compile error due to non castable Skipper: http://coliru.stacked-crooked.com/a/adcf665742b055dd
i also tried changed the identifer_chain to
identifier_chain = identifier >> *(('.' >> identifier) | qi::skip(qi::blank)[subscription]);
but i still can't compile the example
The answer I linked to earlier describes all the combinations (if I remember correctly): Boost spirit skipper issues
In short:
any rule that declares a skipper (so rule<It, Skipper[, Attr()]> or rule<It, Attr(), Skipper>) MUST be invoked with a compatible skipper (an expression that can be assigned to the type of Skipper).
any rule that does NOT declare a skipper (so of the form rule<It[, Attr()]>) will implicitly behave like a lexeme, meaning no input characters are skipped.
That's it. The slightly subtler ramifications are that given two rules:
rule<It, blank_type> a;
rule<It> b; // b is implicitly lexeme
You can invoke b from a:
a = "test" >> b;
But when you wish to invoke a from b you will find that you have to provide the skipper:
b = "oops" >> a; // DOES NOT COMPILE
b = "okay" >> qi::skip(qi::blank) [ a ];
That's almost all there is to it. There are a few more directives around skippers and lexemes in Qi, see again the answer linked above.
Side Question:
should i add space eaters in the front and back of identifier_chains in the expression_list?
If you look closely at the answer example here Parse a '.' chained identifier list, with qi::lexeme and prevent space skipping, you can see that it already does pre- and post skipping correctly, because I used phrase_parse:
" a.b " OK: ( "a" "b" )
----
"a . b" Failed
Remaining unparsed: "a . b"
----
You COULD also wrap the whole thing in an "outer" rule:
rule<std::string::const_iterator> main_rule =
qi::skip(qi::blank) [ identifier_chain ];
That's just the same but allows users to call parse without specifying the skipper.
// 1
Mexpression = Mterm >> *(
'+' >> Mterm [qi::_val = phoenix::new_<BinaryNode>(_1, '+', _2)]
| '-' >> Mterm [qi::_val = phoenix::new_<BinaryNode>(_1, '-', _2)]
);
Mterm = Mfactor >> *(
'*' >> Mfactor [qi::_val = phoenix::new_<BinaryNode>(_1, '*', _2)]
| '/' >> Mfactor [qi::_val = phoenix::new_<BinaryNode>(_1, '/', _2)]
);
Mfactor = Unpack
| '+' >> Mfactor [qi::_val = phoenix::new_<UnaryNode>('+', _1)]
| '-' >> Mfactor [qi::_val = phoenix::new_<UnaryNode>('-', _1)]
| '(' >> Mexpression >> ')';`
`Error 2 error C2664: 'BinaryNode::BinaryNode(const BinaryNode &)' : cannot convert argument 3 from 'boost::mpl::void_' to 'anExpression *' c:\boost_1_55_0\boost\spirit\home\phoenix\object\detail\new_eval.hpp 41 1 ConsoleApplication1
Error 1 error C2338: index_is_out_of_bounds c:\boost_1_55_0\boost\spirit\home\support\argument.hpp 103 1 ConsoleApplication1 `
And
c:\boost_1_55_0\boost\spirit\home\support\argument.hpp(166) : see reference to class template instantiation 'boost::spirit::result_of::get_arg<boost::fusion::vector1<Attribute &>,1>' being compiled with
[
Attribute=anExpression *
]
I'm coding a translator for a model language (there are several ebnf with main compositions given as a task.) and stuck somewhere at arithmetical operations.
(see 1 in paste)
here's a model to parse math' exprs,
unpack is somenode, something that can be executed, converted to anExpression *, and given as arg to BinaryNode
there are following rules.
qi::rule<Iterator, anExpression *()> Unpack;
qi::rule<Iterator, anExpression *()> Mexpression, Mterm, Mfactor;
anExpression is an abstract class (Binary and Unary are public anExpression)
while compiling the whole program I have following errors:
fig2
I think that error 2 is the most important thing to fix first.
something like this in build log
fig3
okay, I think that the mistake is in my way of semantic actions. I think there's not Mterm (or Mfactor) in _2 placeholder. there's something I'm doing wrong with this way of using semantics actions and alternative parser ( '|' )
I'll be glad to hear any ideas from you guys =)
Mexpression = Mterm >> *(
'+' >> Mterm [qi::_val = phoenix::new_<BinaryNode>(_1, '+', _2)]
| '-' >> Mterm [qi::_val = phoenix::new_<BinaryNode>(_1, '-', _2)]
);
Can't work indeed. You'd need to temporarily store the result attribute of the lhs Mterm. Luckily you should be able to use the result of the rule itself to do this:
Mexpression = Mterm [qi::_val = qi::_1] >> *(
'+' >> Mterm [qi::_val = phoenix::new_<BinaryNode>(qi::_val, '+', qi::_1)]
| '-' >> Mterm [qi::_val = phoenix::new_<BinaryNode>(qi::_val, '-', qi::_1)]
);
However you may have to accomodate this in the constructor for your BinaryNode type.
This said:
I tend avoid semantic actions. As long as you're imperatively spelling each and every step out in the semantic action, what is the real benefit of using a parser generator like Spirit? It's no longer for Rapid Application Development
especially semantic actions doing dynamic allocations; it's very eay to create leaks in the presence of parser backtracking
That said, this kind of expression combination is about the only point where I think semantic actions can still be considered idiomatic in Spirit. The dynamic allocations most certainly are not.
I've got an embarrassingly simple problem that I can't seem to wrap my head around. I'm reading the boost documentation on how to parse into structs. The sample code provided for that chapter is straightforward - or so I thought. I would like to make a super simple change.
I want to split the start-rule:
start %=
lit("employee")
>> '{'
>> int_ >> ','
>> quoted_string >> ','
>> quoted_string >> ','
>> double_
>> '}'
;
...into two (or later more) rules, like this:
params %=
>> int_ >> ','
>> quoted_string >> ','
>> quoted_string >> ','
>> double_;
start %=
lit("employee")
>> '{'
>> params
>> '}'
;
No matter what I've tried I couldn't get it to parse values correctly into the employee struct. Even when I got a running program that recognized the input, the attributes didn't get written to the struct. It seems parsing only works correctly if everything is specified in the "top-level" rule. Surely, I'm mistaken?! I'll definitely need a more structured approach for the parser I actually need to implement.
Also I'm unclear what the correct type of the params rule should be. I'm thinking qi::rule<Iterator, fusion::vector<int, std::string, std::string, double>, ascii::space_type>, but my compiler didn't seem to like that very much...
I should mention that I'm working with Boost v1.46.1
In this situation, you could really just make params expose an employee attribute directly:
Live On Coliru
qi::rule<Iterator, employee(), ascii::space_type> params;
I have an input string I'm trying to parse. It might look like either of the two:
sys(error1, 2.3%)
sys(error2 , 2.4%)
sys(this error , 3%)
Note the space sometimes before the comma. In my grammer (boost spirit library) I'd like to capture "error1", "error2", and "this error" respectively.
Here is the original grammar I had to capture this - which absorbed the space at the end of the name:
name_string %= lexeme[+(char_ - ',' - '"')];
name_string.name("Systematic Error Name");
start = (lit("sys")|lit("usys")) > '('
> name_string[boost::phoenix::bind(&ErrorValue::SetName, _val, _1)] > ','
> errParser[boost::phoenix::bind(&ErrorValue::CopyErrorAndRelative, _val, _1)]
> ')';
My attempt to fix this was first:
name_string %= lexeme[*(char_ - ',' - '"') > (char_ - ',' - '"' - ' ')];
however that completely failed. Looks like it failes to parse anything with a space in the middle.
I'm fairly new with Spirit - so perhaps I'm missing something simple. Looks like lexeme turns off skipping on the leading edge - I need something that does it on the leading and trailing edge.
Thanks in advance for any help!
Thanks to psur below, I was able to put together an answer. It isn't perfect (see below), but I thought I would update the post for everyone to see it in context and nicely formatted:
qi::rule<Iterator, std::string(), ascii::space_type> name_word;
qi::rule<Iterator, std::string(), ascii::space_type> name_string;
ErrorValueParser<Iterator> errParser;
name_word %= +(qi::char_("_a-zA-Z0-9+"));
//name_string %= lexeme[name_word >> *(qi::hold[+(qi::char_(' ')) >> name_word])];
name_string %= lexeme[+(qi::char_("-_a-zA-Z0-9+")) >> *(qi::hold[+(qi::char_(' ')) >> +(qi::char_("-_a-zA-Z0-9+"))])];
start = (
lit("sys")[bind(&ErrorValue::MakeCorrelated, _val)]
|lit("usys")[bind(&ErrorValue::MakeUncorrelated, _val)]
)
>> '('
>> name_string[bind(&ErrorValue::SetName, _val, _1)] >> *qi::lit(' ')
>> ','
>> errParser[bind(&ErrorValue::CopyErrorAndRelative, _val, _1)]
>> ')';
This works! They key to this is the name_string, and in it the qi::hold, a operator I was not familiar with before this. It is almost like a sub-rule: everything inside qi::hold[...] must successfully parse for it to go. So, above, it will only allow a space after a word if there is another word following. The result is that if a sequence of words end in a space(s), those last spaces will not be parsed! They can be absorbed by the *qi::lit(' ') that follows (see the start rule).
There are two things I'd like to figure out how to improve here:
It would be nice to put the actual string parsing into name_word. The problem is the declaration of name_word - it fails when it is put in the appropriate spot in the definition of name_string.
It would be even better if name_string could include the parsing of the trailing spaces, though its return value did not. I think I know how to do that...
When/if I figure these out I will update this post. Thanks for the help!
Below rules should work for you:
name_word %= +(qi::char_("_a-zA-Z0-9"));
start %= qi::lit("sys(")
>> qi::lexeme[ name_word >> *(qi::hold[ +(qi::char_(' ')) >> name_word ]) ]
>> *qi::lit(' ')
>> qi::lit(',')
// ...
name_word parse only one word in name; I assumed that it contains only letter, digits and underscore.
In start rule qi::hold is important. It will parse space only if next is name_word. In other case parser will rollback and move to *qi::lit(' ') and then to comma.
I tried to modify the mini_c example of boost::spirit to match to my existing vocabulary.
I therefore added a operator "NOT that should behave equal as "!":
unary_expr =
primary_expr
| ("NOT" > primary_expr [op(op_not)]) // This does not work
| ('!' > primary_expr [op(op_not)])
| ('-' > primary_expr [op(op_neg)])
| ('+' > primary_expr)
;
I can compile the modified source code, but when i try to execute it it fails to parse. How can i solve this?
EDIT:
As my want to access external variables, i had made another modification in order to build a list of these variables when compiling:
identifier %=
raw[lexeme[alpha >> *(alnum | 'ยง' | '_' | '.' | '-' )]]
;
variable =
identifier [add_var(_1)]
;
Where add_var and identifier are defined as
rule<Iterator, std::string(), white_space> identifier;
function<var_adder> add_var;
If i don't use this modification, "NOT" can be used. With the modification, using "NOT" generates a parsing error.
EDIT 2:
The following conditional expressions do work though:
logical_expr =
relational_expr
>> *( ("AND" > relational_expr [op(op_and)])
| ("OR" > relational_expr [op(op_or)])
)
;
With your change the small test:
int main()
{
return NOT 1;
}
parses successfully and returns 0. So it is not obvious to me what doesn't work for you. Could you provide a failing input example as well, please?