spirit qi grammar issues

First of all, sorry for the inaccurate title; I don't actually know what's causing the compilation error (I'm new to Spirit/Phoenix/tuple). To keep the question readable, I exported the entire grammar to pastebin:
http://pastebin.com/RsGM8E4r
The code is compiled in Visual Studio 2010 with:
Iterator = std::string::const_iterator
Here is the other information you need to understand the grammar; my question is at the bottom:
namespace parser { namespace container1 {
    template < typename _C >
    class atom : public element < _C >
    {
    private:
        typedef typename std::basic_string < _C > _string;
    public:
        explicit atom ( const boost::variant < bool, long, double, _string > & value )
            : _value ( value )
        {
            _element_type = TY_ATOM;
        }
        explicit atom ()
        {
        }
        template < typename T >
        const T as () const
        {
            return boost::apply_visitor ( atom_visitor < _C, T > (), _value );
        }
    private:
        boost::variant < bool, long, double, _string > _value;
    };
    template < typename _C >
    struct item
    {
        typedef typename element < _C > type;
        typedef typename boost::shared_ptr < type > ptr;
    };
}}
group and list also derive from element.
What I don't understand is this: in the grammar, the rule definition for atom is:
atom =
( qi::double_ | qi::long_ | qi::bool_ | string ) [ qi::_val = phoenix::construct < _item_ptr > ( phoenix::new_ < _atom > ( qi::_1 ) ) ]
;
This gives a very long list of compiler errors that I can't really comprehend; I exported them to pastebin as well: http://pastebin.com/k4HseJ01
However, if I change the rule to
atom =
( qi::double_ | qi::long_ | qi::bool_ | string ) [ qi::_val = phoenix::construct < _item_ptr > ( phoenix::new_ < _atom > () ) ]
;
it compiles successfully, but I need to get the parsed data from that rule :P
Thank you very much in advance for any help; I've been stuck on this for days.

Related

Boost spirit core dump on parsing bracketed expression

I have a simplified grammar that should parse a sequence of terminal literals: id, '<', '>' and ":action".
I need to allow parentheses '(' ')' that do nothing but improve readability. (The full example is here: http://coliru.stacked-crooked.com/a/dca93f5c8f37a889 )
A snippet of my grammar:
start = expression % eol;
expression = (simple_def >> -expression)
| (qi::lit('(') > expression > ')');
simple_def = qi::lit('<') [qi::_val = Command::left]
| qi::lit('>') [qi::_val = Command::right]
| key [qi::_val = Command::id]
| qi::lit(":action") [qi::_val = Command::action]
;
key = +qi::char_("a-zA-Z_0-9");
When I try to parse const std::string s = "(a1 > :action)"; everything works like a charm.
But when I add a little more complexity with parentheses, "(a1 (>) :action)", I get a core dump. For information: the core dump happens on Coliru, while the MSVC-compiled example just reports a failed parse.
So my questions are: (1) what's wrong with the parentheses, and (2) how exactly can parentheses be introduced into the expression?
P.S. This is a simplified grammar; the real case is more complicated, but this is a minimal reproducible example.

You should just handle the expectation failure:
terminate called after throwing an instance of 'boost::wrapexcept<boost::spir
it::qi::expectation_failure<__gnu_cxx::__normal_iterator<char const*, std::__
cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >
>'
what(): boost::spirit::qi::expectation_failure
Aborted (core dumped)
If you handle the expectation failure, the program will not have to terminate.
Fixing The Grammar
Your 'nested expression' rule only accepts a single expression. I think that
expression = (simple_def >> -expression)
is intended to match "1 or more simple_def". However, the alternative branch:
| ('(' > expression > ')');
doesn't accept the same: it just stops after parsing ')'. This means that your input is simply invalid according to the grammar.
I suggest a simplification by expressing intent. You were on the right path with semantic typedefs. Let's avoid the "weasely" Line Of Lines (what even is that?):
using Id = std::string;
using Line = std::vector<Command>;
using Script = std::vector<Line>;
And use these typedefs consistently. Now, we can express the grammar as we "think" about it:
start = skip(blank)[script];
script = line % eol;
line = +simple;
simple = group | command;
group = '(' > line > ')';
See, by simplifying our mental model and sticking to it, we avoided the entire problem you had a hard time spotting.
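To see why the revised rules accept nested groups, it can help to spell them out by hand. The following is a minimal sketch, not Spirit code: a hand-written recursive-descent parser for a single line, using only the standard library. The names LineParser and parse_line are made up for this illustration, and std::runtime_error merely stands in for Spirit's expectation_failure.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class Command { id, left, right, action };

// Hand-written parser for one line of:
//   line = +simple;  simple = group | command;  group = '(' > line > ')'
// Blanks (space, tab) between tokens are skipped.
class LineParser {
public:
    explicit LineParser(const std::string& text) : text_(text) {}

    // Parses the whole input as a single line; throws std::runtime_error on bad input.
    std::vector<Command> parse() {
        std::vector<Command> out;
        line(out);
        skip_blanks();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected trailing input");
        return out;
    }

private:
    void skip_blanks() {
        while (pos_ < text_.size() && (text_[pos_] == ' ' || text_[pos_] == '\t')) ++pos_;
    }

    // line = +simple : at least one element, then as many more as match.
    void line(std::vector<Command>& out) {
        if (!simple(out)) throw std::runtime_error("expected a command or a group");
        while (simple(out)) {}
    }

    // simple = group | command : returns false (no token consumed) if neither matches.
    bool simple(std::vector<Command>& out) {
        skip_blanks();
        if (pos_ >= text_.size()) return false;
        const char c = text_[pos_];
        if (c == '(') {                  // group = '(' > line > ')'
            ++pos_;
            line(out);                   // a group holds a full line, not one element
            skip_blanks();
            if (pos_ >= text_.size() || text_[pos_] != ')')
                throw std::runtime_error("expected ')'");
            ++pos_;
            return true;
        }
        if (c == '<') { ++pos_; out.push_back(Command::left); return true; }
        if (c == '>') { ++pos_; out.push_back(Command::right); return true; }
        if (text_.compare(pos_, 7, ":action") == 0) {
            pos_ += 7;
            out.push_back(Command::action);
            return true;
        }
        // key = +[a-zA-Z_0-9]
        std::size_t end = pos_;
        while (end < text_.size() &&
               (std::isalnum(static_cast<unsigned char>(text_[end])) || text_[end] == '_'))
            ++end;
        if (end == pos_) return false;   // e.g. ')' or an unknown character
        pos_ = end;
        out.push_back(Command::id);
        return true;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Convenience wrapper around LineParser.
std::vector<Command> parse_line(const std::string& text) {
    return LineParser(text).parse();
}
```

With this sketch, parse_line("(a1 (>) :action)") yields the same three commands as parse_line("a1 > :action"), because a group recurses into line rather than into a single element.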
Here's a quick demo that includes error handling, optional debug output and both test cases, and that encapsulates the skipper, since it is part of the grammar: Live On Compiler Explorer
#include <fmt/ranges.h>
#include <fmt/ostream.h>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
enum class Command { id, left, right, action };
static inline std::ostream& operator<<(std::ostream& os, Command cmd) {
switch (cmd) {
case Command::id: return os << "[ID]";
case Command::left: return os << "[LEFT]";
case Command::right: return os << "[RIGHT]";
case Command::action: return os << "[ACTION]";
}
return os << "[???]";
}
using Id = std::string;
using Line = std::vector<Command>;
using Script = std::vector<Line>;
template <typename It>
struct ExprGrammar : qi::grammar<It, Script()> {
ExprGrammar() : ExprGrammar::base_type(start) {
using namespace qi;
start = skip(blank)[script];
script = line % eol;
line = +simple;
simple = group | command;
group = '(' > line > ')';
command =
lit('<') [ _val = Command::left ] |
lit('>') [ _val = Command::right ] |
key [ _val = Command::id ] |
lit(":action") [ _val = Command::action ] ;
key = +char_("a-zA-Z_0-9");
BOOST_SPIRIT_DEBUG_NODES((command)(line)(simple)(group)(script)(key));
}
private:
qi::rule<It, Script()> start;
qi::rule<It, Line(), qi::blank_type> line, simple, group;
qi::rule<It, Script(), qi::blank_type> script;
qi::rule<It, Command(), qi::blank_type> command;
// lexemes
qi::rule<It, Id()> key;
};
int main() {
using It = std::string::const_iterator;
ExprGrammar<It> const p;
for (const std::string s : {
"a1 > :action\na1 (>) :action",
"(a1 > :action)\n(a1 (>) :action)",
"a1 (> :action)",
}) {
It f(begin(s)), l(end(s));
try {
Script parsed;
bool ok = qi::parse(f, l, p, parsed);
if (ok) {
fmt::print("Parsed {}\n", parsed);
} else {
fmt::print("Parse failed\n");
}
if (f != l) {
fmt::print("Remaining unparsed: '{}'\n", std::string(f, l));
}
} catch (qi::expectation_failure<It> const& ef) {
fmt::print("{}\n", ef.what()); // TODO add more details :)
}
}
}
Prints
Parsed {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
Parsed {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
Parsed {{[ID], [RIGHT], [ACTION]}}
BONUS
However, I think this can all be greatly simplified using qi::symbols for the commands. In fact it looks like you're only tokenizing (you confirm this when you say that the parentheses are not important).
line = +simple;
simple = group | command | (omit[key] >> attr(Command::id));
group = '(' > line > ')';
key = +char_("a-zA-Z_0-9");
Now you don't need Phoenix at all: Live On Compiler Explorer, printing
ok? true {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
ok? true {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
ok? true {{[ID], [RIGHT], [ACTION]}}
Even Simpler?
Since I observe that you're basically tokenizing line-wise, why not simply skip the parentheses, and simplify all the way down to:
script = line % eol;
line = *(command | omit[key] >> attr(Command::id));
That's all. See it Live On Compiler Explorer again:
#include <array>
#include <boost/spirit/include/qi.hpp>
#include <fmt/ostream.h>
#include <fmt/ranges.h>
namespace qi = boost::spirit::qi;
enum class Command { id, left, right, action };
using Id = std::string;
using Line = std::vector<Command>;
using Script = std::vector<Line>;
static inline std::ostream& operator<<(std::ostream& os, Command cmd) {
return os << (std::array{"ID", "LEFT", "RIGHT", "ACTION"}.at(int(cmd)));
}
template <typename It>
struct ExprGrammar : qi::grammar<It, Script()> {
ExprGrammar() : ExprGrammar::base_type(start) {
using namespace qi;
start = skip(skipper.alias())[line % eol];
line = *(command | omit[key] >> attr(Command::id));
key = +char_("a-zA-Z_0-9");
BOOST_SPIRIT_DEBUG_NODES((line)(key));
}
private:
using Skipper = qi::rule<It>;
qi::rule<It, Script()> start;
qi::rule<It, Line(), Skipper> line;
Skipper skipper = qi::char_(" \t\b\f()");
qi::rule<It /*, Id()*/> key; // omit attribute for efficiency
struct cmdsym : qi::symbols<char, Command> {
cmdsym() { this->add("<", Command::left)
(">", Command::right)
(":action", Command::action);
}
} command;
};
int main() {
using It = std::string::const_iterator;
ExprGrammar<It> const p;
for (const std::string s : {
"a1 > :action\na1 (>) :action",
"(a1 > :action)\n(a1 (>) :action)",
"a1 (> :action)",
})
try {
It f(begin(s)), l(end(s));
Script parsed;
bool ok = qi::parse(f, l, p, parsed);
fmt::print("ok? {} {}\n", ok, parsed);
if (f != l)
fmt::print(" -- Remaining '{}'\n", std::string(f, l));
} catch (qi::expectation_failure<It> const& ef) {
fmt::print("{}\n", ef.what()); // TODO add more details :)
}
}
Prints
ok? true {{ID, RIGHT, ACTION}, {ID, RIGHT, ACTION}}
ok? true {{ID, RIGHT, ACTION}, {ID, RIGHT, ACTION}}
ok? true {{ID, RIGHT, ACTION}}
Note that I very subtly changed +() to *() so it would accept empty lines as well. This may or may not be what you want.

Intersection between various values from boost::bimap

I am trying to use boost::bimap for one of my requirements. Below is sample code:
typedef bimap<
    multiset_of< string >,
    multiset_of< string >,
    set_of_relation<>
> bm_type;
bm_type bm;
assign::insert( bm )
    ( "John" , string("lazarus" ) )
    ( "Peter", string("vinicius") )
    ( "Peter", string("test") )
    ( "Simon", string("vinicius") )
    ( "John", string("viniciusa") )
    ( "John", string("vinicius") );
I would like to find the values that John and Peter have in common, in other words the intersection of the values for John and Peter. In this case it would be ("vinicius"). Can someone shed some light on this?

Here's what I came up with initially:
template <typename Value = std::string, typename Bimap, typename Key>
std::set<Value> projection(Bimap const& bm, Key const& key)
{
    std::set<Value> p;
    auto range = bm.left.equal_range(key);
    auto values = boost::make_iterator_range(range.first, range.second);
    for (auto& relation : values)
        p.insert(relation.template get<boost::bimaps::member_at::right>());
    return p;
}
auto john = projection(bm, "John");
auto peter = projection(bm, "Peter");
std::multiset<std::string> intersection;
std::set_intersection(
    john.begin(), john.end(),
    peter.begin(), peter.end(),
    inserter(intersection, intersection.end())
);
I think it can be more efficient. So I tried replacing the projection on the fly using Boost Range's adaptors:
struct GetRightMember
{
    template <typename> struct result { typedef std::string type; };
    template <typename T>
    std::string operator()(T const& v) const {
        return v.template get<boost::bimaps::member_at::right>();
    }
};
const GetRightMember getright;
std::cout << "Intersection: ";
// WARNING: broken: ranges not sorted
boost::set_intersection(
    bm.left.equal_range("John") | transformed(getright),
    bm.left.equal_range("Peter") | transformed(getright),
    std::ostream_iterator<std::string>(std::cout, " "));
Sadly it doesn't work - presumably because the transformed ranges aren't sorted.
So I'd stick with the more verbose version (or reconsider my data structure choices). See it Live On Coliru
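For comparison, here is a minimal standard-library sketch of the verbose approach, assuming a std::multimap as a stand-in for the bimap's left view (Relations, values_of, common_values and sample_relations are names invented for this illustration). Copying each key's values into a std::set sorts them, which is what std::set_intersection needs.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the bimap's left view: key -> values, duplicates allowed.
using Relations = std::multimap<std::string, std::string>;

// Collects every value stored under `key` into a std::set, which keeps them sorted.
std::set<std::string> values_of(const Relations& rel, const std::string& key) {
    std::set<std::string> out;
    auto range = rel.equal_range(key);
    for (auto it = range.first; it != range.second; ++it)
        out.insert(it->second);
    return out;
}

// Intersects the value sets of two keys; both inputs are sorted because they are sets.
std::vector<std::string> common_values(const Relations& rel,
                                       const std::string& a,
                                       const std::string& b) {
    const std::set<std::string> lhs = values_of(rel, a);
    const std::set<std::string> rhs = values_of(rel, b);
    std::vector<std::string> out;
    std::set_intersection(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                          std::back_inserter(out));
    return out;
}

// Builds the sample data from the question.
Relations sample_relations() {
    return {
        {"John", "lazarus"},   {"Peter", "vinicius"}, {"Peter", "test"},
        {"Simon", "vinicius"}, {"John", "viniciusa"}, {"John", "vinicius"},
    };
}
```

Calling common_values(sample_relations(), "John", "Peter") gives a vector holding the single string "vinicius". The extra copies are the price of getting sorted input; whether that matters depends on how many values each key has.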

Compiler issue with Clang 3.2 or am I doing it wrong? (variadic templates & default parameters)

I've gone a bit deep with template programming in my latest library rewrite, so I made a template that assembles bitmasks, which I want to use as the default value of a method parameter:
test_scanner( unsigned int avec = CtrlVec<'\n','\r'>::mask ) ;
This works with GCC 4.8, but I get a series of errors from clang 3.2:
error: expected parameter declarator
error: expected ')'
error: expected '>'
Putting parentheses around the CtrlVec expression fixes the problem. What I'm curious about is whether this works in later versions of Clang, whether I should report it as a possible compiler bug, or whether I'm doing something wrong.
The entire test case is here:
namespace test
{
    // template magic for building a bit mask from control characters
    template <char...> struct CtrlVec ;
    template <char c> struct CtrlVec<c>
    {
        static_assert( ( c < 0x20 ), "Only for control characters" ) ;
        enum { mask = ( 1 << c ) } ;
    } ;
    template <char c, char... cs> struct CtrlVec<c, cs...>
    {
        static_assert( ( c < 0x20 ), "Only for control characters" ) ;
        enum { mask = ( 1 << c ) | CtrlVec<cs...>::mask } ;
    } ;
    static_assert( CtrlVec<'\0', '\001', '\002'>::mask == 0x7, "") ;
    ///
    class test_scanner
    {
    public:
        // this version works fine in GCC, but gives an error in Clang 3.2
        test_scanner( unsigned int avec = CtrlVec<'\n','\r'>::mask ) ;
        // adding the () makes it work
        test_scanner( int good, unsigned int avec = ( CtrlVec<'\n','\r'>::mask ) ) ;
    } ;
} ;
int main() {}
or you can pull it from github:
https://github.com/woolstar/test/blob/master/misc/testinitargs.cpp
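One more workaround to try, sketched under the assumption that the comma inside the template argument list of the default argument is what Clang 3.2 trips over (the fact that the parentheses help points that way): name the specialization with a typedef first, so the default argument contains no comma. NewlineVec, the sketch namespace and the mask() accessor are hypothetical names added for illustration, and this has not been verified against Clang 3.2 itself.

```cpp
#include <cassert>

namespace sketch {

// Same bit-mask templates as in the test case above.
template <char...> struct CtrlVec;

template <char c> struct CtrlVec<c> {
    static_assert(c < 0x20, "Only for control characters");
    enum { mask = (1 << c) };
};

template <char c, char... cs> struct CtrlVec<c, cs...> {
    static_assert(c < 0x20, "Only for control characters");
    enum { mask = (1 << c) | CtrlVec<cs...>::mask };
};

// Naming the specialization first keeps the comma out of the default argument.
typedef CtrlVec<'\n', '\r'> NewlineVec;

class test_scanner {
public:
    explicit test_scanner(unsigned int avec = NewlineVec::mask) : avec_(avec) {}

    // Accessor added only so the sketch has something observable.
    unsigned int mask() const { return avec_; }

private:
    unsigned int avec_;
};

} // namespace sketch
```

A default-constructed sketch::test_scanner then carries the bits for '\n' and '\r', the same value the parenthesized default argument produces.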

generate unique key for std::map

I have a map with a string as the key that stores lambdas.
So far I've tried
std::map <int, auto> callbackMap
and I want to put a lambda under a number that isn't already taken. Is this possible? I keep getting errors saying functions can't have auto as constructors.

That's because auto is just a compile-time feature: the compiler deduces one well-defined type from the initializer. You may be confusing it with a "variant" type; it doesn't work that way.
auto X = 3;
That doesn't mean X is a "variant". It's as if you had written:
int X = 3;
So X has one well-defined type.
You CAN store functions (including lambdas, which are function objects) in your map, no problem, as long as the std::function<...> signature is fully specified. Example:
std::map< int, std::function< int( int ) > > callbackMap;
callbackMap[ 0 ] = std::function< int( int ) >( [] ( int a ) { return a + 1; } );
callbackMap[ 1 ] = std::function< int( int ) >( [] ( int a ) { return a - 1; } );
callbackMap[ 2 ] = std::function< int( int ) >( [] ( int a ) { return a * 2; } );
callbackMap[ 3 ] = std::function< int( int ) >( [] ( int a ) { return a / 2; } );
Notice that you still need to know the signature of your functions (int( int ) in my example, but of course you can define it any way you want).
If you decide to store "pointers to functions" you will have the same problem. You have to know the signature! Nothing different.
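As for generating the key itself, one possible approach is a small wrapper that hands out the next unused integer each time a callback is stored. This is only a sketch: CallbackRegistry, add and call are hypothetical names, and it assumes every callback has the signature int(int) as in the example above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical registry: hands out a fresh integer key for each stored callback.
class CallbackRegistry {
public:
    using Callback = std::function<int(int)>;

    // Stores `cb` under a key no earlier callback has used and returns that key.
    int add(Callback cb) {
        const int key = next_key_++;
        callbacks_.emplace(key, std::move(cb));
        return key;
    }

    // Invokes the callback registered under `key`; throws std::out_of_range if absent.
    int call(int key, int arg) const { return callbacks_.at(key)(arg); }

private:
    int next_key_ = 0;
    std::map<int, Callback> callbacks_;
};
```

Each lambda has its own distinct closure type, but all of them convert to the one std::function<int(int)> type the map stores. A counter is the simplest way to avoid key collisions; it never reuses keys, which may or may not suit the use case.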

Need a way to prefix boost::spirit::qi parser with another one

I have a lot of rules that look like this:
cmd_BC = (dlm > timestamp > dlm > cid > dlm > double_)
[
_val = lazy_shared<dc::BoardControl>(_1, _2, _3)
];
I want to make it more readable, like:
cmd_BC = param(timestamp) > param(cid) > param(double_)
or even
cmd_BC = params(timestamp, cid, double_)
As sehe pointed out, it boils down to having some means to automatically expect the delimiters. What are the options here? Myself, I see three possibilities, all flawed:
1. Use a macro. This wouldn't allow for the shorter variadic form.
2. Write a custom prefix directive. I don't have much experience with Spirit's internals, but if it's actually not that hard, I will try.
3. Write a wrapper function. I tried the following code, with no luck:
template <typename T>
auto param(const T & parser) -> decltype(qi::lit(dlm) > parser)
{
    return qi::lit(dlm) > parser;
}
but it doesn't compile, failing at
// report invalid argument not found (N is out of bounds)
BOOST_SPIRIT_ASSERT_MSG(
(N < sequence_size::value),
index_is_out_of_bounds, ());
I also tried to return (...).alias(), but that didn't compile either.

This solution is not very "spirit-y" and unfortunately "requires C++11" (I'm not sure how to get the required result type in c++03) but it seems to work. Inspired by the example here.
PS: Oh I didn't see your edit. You have almost the same example.
Update: Added another test using a semantic action with _1,_2 and _3
Update 2: Changed the signature of operator() and operator[] following vines's advice in the comments.
Update 3: Added a variadic operator() that constructs a combined parser, and removed operator[] after finding a better solution with boost::proto. Changed slightly the examples.
#include <iostream>
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/proto/proto.hpp>
namespace qi = boost::spirit::qi;
namespace proto = boost::proto;
namespace phx = boost::phoenix;
using namespace boost::proto;
//This is a proto grammar/transform that creates the prefixed parser. The parser created depends on the parser passed (if it's a kleene or not)
// in _make_xxx "_" corresponds to the supplied parser and "_state" to the delimiter
struct CreatePrefixedParser: //you can use _make_greater instead of _make_shift_right if you want to use "expectation"
or_ <
when < dereference<_>, //If it's a kleene parser...
_make_shift_right ( //create the parser -> dlm >> *(parser -dlm)
_state,
_make_dereference (
_make_minus ( _child_c<0> ( _ ),
_state ) ) ) > ,
when < unary_plus<_>, //If it's a +parser
_make_shift_right ( //create the parser -> dlm >> +(parser -dlm)
_state,
_make_unary_plus (
_make_minus ( _child_c<0> ( _ ),
_state ) ) ) > ,
otherwise < //if it's any other parser
_make_shift_right ( //create the parser -> dlm >> (parser -dlm)
_state,
_make_minus ( _,
_state ) ) >
> {};
//-------------------------------------------------------------
//this combines the parsers this way: parser1, parser2, parser3, parser4 -> parser1>>(parser2 >>(parser3 >> parser4)))
//you can use make_expr<tag::greater> if you want to use "expectation"
//I have absolutely no idea when "deep_copy" is required but it seems to work this way
template<typename Delim, typename First, typename ... Rest>
struct myparser
{
static auto combine ( Delim dlm_, const First& first, const Rest&...rest ) ->
decltype ( make_expr<tag::shift_right> ( CreatePrefixedParser() ( deep_copy ( first ), dlm_ ), myparser<Delim, Rest...>::combine ( dlm_, rest... ) ) )
{
return make_expr<tag::shift_right> ( CreatePrefixedParser() ( deep_copy ( first ), dlm_ ), myparser<Delim, Rest...>::combine ( dlm_, rest... ) );
}
};
template<typename Delim, typename Last>
struct myparser<Delim, Last>
{
static auto combine ( Delim dlm_, const Last& last ) -> decltype ( CreatePrefixedParser() ( deep_copy ( last ), dlm_ ) )
{
return CreatePrefixedParser() ( deep_copy ( last ), dlm_ );
}
};
//-----------------------------------------------------------------
template <typename T>
struct prefixer
{
T dlm_;
prefixer ( T dlm ) : dlm_ ( dlm ) {}
template <typename ... Args>
auto operator() ( const Args&... args ) ->
decltype ( deep_copy ( myparser<T, Args...>::combine ( dlm_, args... ) ) )
{
return deep_copy ( myparser<T, Args...>::combine ( dlm_, args... ) );
}
};
template <typename T>
prefixer<T> make_prefixer ( T dlm )
{
return prefixer<T> ( dlm );
}
int main()
{
std::string test = "lameducklamedog";
std::string::const_iterator f ( test.begin() ), l ( test.end() );
auto param = make_prefixer ( qi::lit ( "lame" ) );
qi::rule<std::string::const_iterator> dog = qi::lit ( "do" ) > qi::char_ ( 'g' );
//qi::rule<std::string::const_iterator> duck = qi::lit ( "duck" ) | qi::int_;
qi::rule<std::string::const_iterator,std::string()> quackdog = (param (*qi::alpha) >> param( dog ));
std::string what;
if ( qi::parse ( f, l, quackdog, what ) && f == l )
std::cout << "the duck and the dog are lame, specially the " << what << std::endl;
else
std::cerr << "Uhoh\n" << std::string(f,l) << std::endl;
test = "*-*2.34*-*10*-*0.16*-*12.5";
std::string::const_iterator f2 ( test.begin() ), l2 ( test.end() );
auto param2 = make_prefixer ( qi::lit ( "*-*" ) );
double d;
qi::rule<std::string::const_iterator> myrule = ( param2 ( qi::double_, qi::int_, qi::double_ , qi::double_) ) [phx::ref ( d ) = qi::_1 + qi::_2 + qi::_3 + qi::_4];
if ( qi::parse ( f2, l2, myrule ) && f2 == l2 )
std::cout << "the sum of the numbers is " << d << std::endl;
else
std::cerr << "Uhoh\n";
}

If I understand correctly, you are looking for a way to automatically expect or ignore the delimiter expression (dlm)?
Skippers
This is the classic terrain for Skippers in Spirit. This is especially useful if the delimiter is variable, e.g. whitespace (varying amounts of whitespace acceptable):
bool ok = qi::phrase_parse(
first, last, // input iterators
timestamp > cid > double_, // just specify the expected params
qi::space); // delimiter, e.g. any amount of whitespace
Note the use of phrase_parse to enable grammars with skippers.
Delimited with % parser directive
You could explicitly go and delimit the grammar:
dlm = qi::lit(','); // as an example, delimit by single comma
rule = timestamp > dlm > cid > dlm > double_;
This is tedious. Something that might work nicely for you (depending on the amount of input validation that should be performed):
dlm = qi::lit(','); // as an example, delimit by single comma
rule = (timestamp | cid | double_) % dlm;
(This would result in a vector of variant<timestamp_t, cid_t, double>)
Roll your own
You could roll your own parser directive, similar to karma::delimit, but for input.
The idea is outlined in this documentation article by Hartmut Kaiser:
Creating Your Own Generator Component for Spirit.Karma
If you're interested, I could see whether I can make this work as an example (I've not used this before). To be honest, I'm surprised something like this doesn't exist yet, and I think it would be a prime candidate for the Spirit Repository.

Here's the solution I'm finally satisfied with. It's based on these three answers:
https://stackoverflow.com/a/12679189/396583
https://stackoverflow.com/a/6069950/396583
https://stackoverflow.com/a/3744742/396583
I've decided not to make it accept arbitrary binary functors though, since I doubt it would have any real-world purpose in context of parsing. So,
#include <utility> // std::declval, std::forward
#include <boost/proto/deep_copy.hpp>
template <typename D>
struct prefixer
{
template<typename... T>
struct TypeOfPrefixedExpr;
template<typename T>
struct TypeOfPrefixedExpr<T>
{
typedef typename boost::proto::result_of::deep_copy
< decltype ( std::declval<D>() > std::declval<T>() ) >::type type;
};
template<typename T, typename... P>
struct TypeOfPrefixedExpr<T, P...>
{
typedef typename boost::proto::result_of::deep_copy
< decltype ( std::declval<D>() > std::declval<T>()
> std::declval<typename TypeOfPrefixedExpr<P...>::type>() ) >::type type;
};
D dlm_;
prefixer ( D && dlm ) : dlm_ ( dlm ) {}
template <typename U>
typename TypeOfPrefixedExpr<U>::type operator() (U && parser )
{
return boost::proto::deep_copy ( dlm_ > parser );
}
template <typename U, typename ... Tail>
typename TypeOfPrefixedExpr<U, Tail...>::type
operator() (U && parser, Tail && ... tail )
{
return boost::proto::deep_copy ( dlm_ > parser > (*this)(tail ...) );
}
};
template <typename D>
prefixer<D> make_prefixer ( D && dlm )
{
return prefixer<D> ( std::forward<D>(dlm) );
}
And it's used like this:
auto params = make_prefixer(qi::lit(dlm));
cmd_ID = params(string) [ _val = lazy_shared<dc::Auth> (_1) ];
cmd_NAV = params(timestamp, double_, double_, double_, double_, double_)
[
_val = lazy_shared<dc::Navigation>( _1, _2, _3, _4, _5, _6 )
];
cmd_BC = params(timestamp, cid, double_)
[
_val = lazy_shared<dc::BoardControl>(_1, _2, _3)
];
