Error compiling a custom container for boost spirit

Error compiling a custom container for boost spirit - c++

I want to parse something like the following:
1;2
=1200
3;4
5;6
lines can appear in any order. Lines starting with the = sign can be more than one and only the last one matters; lines containing a ; represent a pair of values that I want to store in a map. Reading the answer to this question I came up with some code that should be good enough (sorry but I'm still a noob with Spirit) and should do what I'm trying to achieve. Here's the code:
#define BOOST_SPIRIT_USE_PHOENIX_V3
#define DATAPAIR_PAIR
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/mpl/bool.hpp>
#include <map>
#if !defined(DATAPAIR_PAIR)
#include <vector>
#endif
static const char g_data[] = "1;2\n=1200\n3;4\n5;6\n";
typedef std::string DataTypeFirst;
#if defined(DATAPAIR_PAIR)
typedef std::string DataTypeSecond;
typedef std::pair<DataTypeFirst, DataTypeSecond> DataPair;
typedef std::map<DataTypeFirst, DataTypeSecond> DataMap;
#else
typedef std::vector<DataTypeFirst> DataPair;
typedef std::map<DataTypeFirst, DataTypeFirst> DataMap;
#endif
struct MyContainer {
DataMap data;
double number;
};
namespace boost { namespace spirit { namespace traits {
template<> struct is_container<MyContainer> : boost::mpl::true_ {};
template<>
struct container_value<MyContainer> {
typedef boost::variant<double, DataPair> type;
};
template <>
struct push_back_container<MyContainer, double> {
static bool call ( MyContainer& parContainer, double parValue ) {
parContainer.number = parValue;
return true;
}
};
template <>
struct push_back_container<MyContainer, DataPair> {
static bool call ( MyContainer& parContainer, const DataPair& parValue ) {
#if defined(DATAPAIR_PAIR)
parContainer.data[parValue.first] = parValue.second;
#else
parContainer.data[parValue[0]] = parValue[1];
#endif
return true;
}
};
} } }
template <typename Iterator>
struct TestGrammar : boost::spirit::qi::grammar<Iterator, MyContainer()> {
TestGrammar ( void );
boost::spirit::qi::rule<Iterator, MyContainer()> start;
boost::spirit::qi::rule<Iterator, DataPair()> data;
boost::spirit::qi::rule<Iterator, double()> num;
};
template <typename Iterator>
TestGrammar<Iterator>::TestGrammar() :
TestGrammar::base_type(start)
{
using boost::spirit::qi::alnum;
using boost::spirit::qi::lit;
using boost::spirit::ascii::char_;;
using boost::spirit::qi::double_;
using boost::spirit::qi::eol;
using boost::spirit::qi::eoi;
start %= *((num | data) >> (eol | eoi));
data = +alnum >> lit(";") >> +alnum;
num = '=' >> double_;
}
int main() {
std::cout << "Parsing data:\n" << g_data << "\n";
TestGrammar<const char*> gramm;
MyContainer result;
boost::spirit::qi::parse(static_cast<const char*>(g_data),
g_data + sizeof(g_data) / sizeof(g_data[0]) - 1,
gramm,
result
);
std::cout << "Parsed data:\n";
std::cout << "Result: " << result.number << "\n";
for (const auto& p : result.data) {
std::cout << p.first << " = " << p.second << '\n';
}
return 0;
}
I'm developing this on Gentoo Linux, using dev-libs/boost-1.55.0-r2:0/1.55.0 and gcc (Gentoo 4.8.3 p1.1, pie-0.5.9) 4.8.3. Compiling the above code I get an error like
/usr/include/boost/spirit/home/support/container.hpp:278:13: error: ‘struct MyContainer’ has no member named ‘insert’
as a workaround, I came up with the alternative code you get by commenting the "#define DATAPAIR_PAIR" line. In that case the code compiles and works, but what I really want is a pair where I can for example mix std::string and int values. Why using std::pair as the attribute for my data rule causes the compiler to miss the correct specialization of push_back_container? Is it possible to fix the code and have it working, either using std::pair or anything equivalent?

I'd simplify this by /just/ not treating things like a container and not-a-container at the same time. So for this particular situation I might deviate from my usual mantra (avoid semantic actions) and use them¹:
Live On Coliru
template <typename It, typename Skipper = qi::blank_type>
struct grammar : qi::grammar<It, MyContainer(), Skipper> {
grammar() : grammar::base_type(start) {
update_number = '=' > qi::double_ [ qi::_r1 = qi::_1 ];
map_entry = qi::int_ > ';' > qi::int_;
auto number = phx::bind(&MyContainer::number, qi::_val);
auto data = phx::bind(&MyContainer::data, qi::_val);
start = *(
( update_number(number)
| map_entry [ phx::insert(data, phx::end(data), qi::_1) ]
)
>> qi::eol);
}
private:
qi::rule<It, void(double&), Skipper> update_number;
qi::rule<It, MyContainer::Pair(), Skipper> map_entry;
qi::rule<It, MyContainer(), Skipper> start;
};
If you can afford a (0;0) entry in your map, you can even dispense with the grammar:
Live On Coliru
std::map<int, int> data;
double number;
bool ok = qi::phrase_parse(f, l,
*(
(qi::omit['=' > qi::double_ [phx::ref(number)=qi::_1]]
| (qi::int_ > ';' > qi::int_)
) >> qi::eol)
, qi::blank, data);
I can try to make your "advanced spirit" approach work too, but it might take a while :)
¹ I use auto for readability, but of course you don't need to use that; just repeat the subexpressions inline or use BOOST_AUTO. Note that this is not generically good advice for stateful parser expressions (see BOOST_SPIRIT_AUTO)

Related

cannot get boost::spirit parser&lexer working for token types other than std::string or int or double

This does not compile (code below).
There was another question here with the same error. But I don't understand the answer. I already tried inserting qi::eps in places -- but without success.
I also tried already adding meta functions (boost::spirit::raits::is_container) for the types used -- but this also does not help.
I also tried using the same variant containing all types I need to use everywhere. Same problem.
Has anybody gotten this working for a lexer returning something else than double or int or string? And for the parser also returning non-trivial objects?
I've tried implementing semantic functions everywhere returning default objects. But this also does not help.
Here comes the code:
// spirit_error.cpp : Defines the entry point for the console application.
//
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/phoenix/object.hpp>
#include <boost/spirit/include/qi_char_class.hpp>
#include <boost/spirit/include/phoenix_bind.hpp>
#include <boost/mpl/index_of.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/intrusive_ptr.hpp>
#include <boost/smart_ptr/intrusive_ref_counter.hpp>
namespace lex = boost::spirit::lex;
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace frank
{
class ref_counter:public boost::intrusive_ref_counter<ref_counter>
{ public:
virtual ~ref_counter(void)
{
}
};
class symbol:public ref_counter
{ public:
typedef boost::intrusive_ptr<const symbol> symbolPtr;
typedef std::vector<symbolPtr> symbolVector;
struct push_scope
{ push_scope()
{
}
~push_scope(void)
{
}
};
};
class nature:public symbol
{ public:
enum enumAttribute
{ eAbstol,
eAccess,
eDDT,
eIDT,
eUnits
};
struct empty
{ bool operator<(const empty&) const
{ return false;
}
friend std::ostream &operator<<(std::ostream &_r, const empty&)
{ return _r;
}
};
typedef boost::variant<empty, std::string> attributeValue;
};
class discipline:public symbol
{ public:
enum enumDomain
{ eDiscrete,
eContinuous
};
};
class type:public ref_counter
{ public:
typedef boost::intrusive_ptr<type> typePtr;
};
struct myIterator:std::iterator<std::random_access_iterator_tag, char, std::ptrdiff_t, const char*, const char&>
{ std::string *m_p;
std::size_t m_iPos;
myIterator(void)
:m_p(nullptr),
m_iPos(~std::size_t(0))
{
}
myIterator(std::string &_r, const bool _bEnd = false)
:m_p(&_r),
m_iPos(_bEnd ? ~std::size_t(0) : 0)
{
}
myIterator(const myIterator &_r)
:m_p(_r.m_p),
m_iPos(_r.m_iPos)
{
}
myIterator &operator=(const myIterator &_r)
{ if (this != &_r)
{ m_p = _r.m_p;
m_iPos = _r.m_iPos;
}
return *this;
}
const char &operator*(void) const
{ return m_p->at(m_iPos);
}
bool operator==(const myIterator &_r) const
{ return m_p == _r.m_p && m_iPos == _r.m_iPos;
}
bool operator!=(const myIterator &_r) const
{ return m_p != _r.m_p || m_iPos != _r.m_iPos;
}
myIterator &operator++(void)
{ ++m_iPos;
if (m_iPos == m_p->size())
m_iPos = ~std::size_t(0);
return *this;
}
myIterator operator++(int)
{ const myIterator s(*this);
operator++();
return s;
}
myIterator &operator--(void)
{ --m_iPos;
return *this;
}
myIterator operator--(int)
{ const myIterator s(*this);
operator--();
return s;
}
bool operator<(const myIterator &_r) const
{ if (m_p == _r.m_p)
return m_iPos < _r.m_iPos;
else
return m_p < _r.m_p;
}
std::ptrdiff_t operator-(const myIterator &_r) const
{ return m_iPos - _r.m_iPos;
}
};
struct onInclude
{ auto operator()(myIterator &_rStart, myIterator &_rEnd) const
{ // erase what has been matched (the include statement)
_rStart.m_p->erase(_rStart.m_iPos, _rEnd.m_iPos - _rStart.m_iPos);
// and insert the contents of the file
_rStart.m_p->insert(_rStart.m_iPos, "abcd");
_rEnd = _rStart;
return lex::pass_flags::pass_ignore;
}
};
template<typename LEXER>
class lexer:public lex::lexer<LEXER>
{ public:
lex::token_def<type::typePtr> m_sKW_real, m_sKW_integer, m_sKW_string;
lex::token_def<lex::omit> m_sLineComment, m_sCComment;
lex::token_def<lex::omit> m_sWS;
lex::token_def<lex::omit> m_sSemicolon, m_sEqual, m_sColon, m_sInclude, m_sCharOP, m_sCharCP,
m_sComma;
lex::token_def<std::string> m_sIdentifier, m_sString;
lex::token_def<double> m_sReal;
lex::token_def<int> m_sInteger;
lex::token_def<lex::omit> m_sKW_units, m_sKW_access, m_sKW_idt_nature, m_sKW_ddt_nature, m_sKW_abstol,
m_sKW_nature, m_sKW_endnature, m_sKW_continuous, m_sKW_discrete,
m_sKW_potential, m_sKW_flow, m_sKW_domain, m_sKW_discipline, m_sKW_enddiscipline, m_sKW_module,
m_sKW_endmodule, m_sKW_parameter;
//typedef const type *typePtr;
template<typename T>
struct extractValue
{ T operator()(const myIterator &_rStart, const myIterator &_rEnd) const
{ return boost::lexical_cast<T>(std::string(_rStart, _rEnd));
}
};
struct extractString
{ std::string operator()(const myIterator &_rStart, const myIterator &_rEnd) const
{ const auto s = std::string(_rStart, _rEnd);
return s.substr(1, s.size() - 2);
}
};
lexer(void)
:m_sWS("[ \\t\\n\\r]+"),
m_sKW_parameter("\"parameter\""),
m_sKW_real("\"real\""),
m_sKW_integer("\"integer\""),
m_sKW_string("\"string\""),
m_sLineComment("\\/\\/[^\\n]*"),
m_sCComment("\\/\\*"
"("
"[^*]"
"|" "[\\n]"
"|" "([*][^/])"
")*"
"\\*\\/"),
m_sSemicolon("\";\""),
m_sEqual("\"=\""),
m_sColon("\":\""),
m_sCharOP("\"(\""),
m_sCharCP("\")\""),
m_sComma("\",\""),
m_sIdentifier("[a-zA-Z_]+[a-zA-Z0-9_]*"),
m_sString("[\\\"]"
//"("
// "(\\[\"])"
// "|"
//"[^\"]"
//")*"
"[^\\\"]*"
"[\\\"]"),
m_sKW_units("\"units\""),
m_sKW_access("\"access\""),
m_sKW_idt_nature("\"idt_nature\""),
m_sKW_ddt_nature("\"ddt_nature\""),
m_sKW_abstol("\"abstol\""),
m_sKW_nature("\"nature\""),
m_sKW_endnature("\"endnature\""),
m_sKW_continuous("\"continuous\""),
m_sKW_discrete("\"discrete\""),
m_sKW_domain("\"domain\""),
m_sKW_discipline("\"discipline\""),
m_sKW_enddiscipline("\"enddiscipline\""),
m_sKW_potential("\"potential\""),
m_sKW_flow("\"flow\""),
//realnumber ({uint}{exponent})|((({uint}\.{uint})|(\.{uint})){exponent}?)
//exponent [Ee][+-]?{uint}
//uint [0-9][_0-9]*
m_sReal("({uint}{exponent})"
"|"
"("
"(({uint}[\\.]{uint})|([\\.]{uint})){exponent}?"
")"
),
m_sInteger("{uint}"),
m_sInclude("\"`include\""),
m_sKW_module("\"module\""),
m_sKW_endmodule("\"endmodule\"")
{ this->self.add_pattern
("uint", "[0-9]+")
("exponent", "[eE][\\+\\-]?{uint}");
this->self = m_sSemicolon
| m_sEqual
| m_sColon
| m_sCharOP
| m_sCharCP
| m_sComma
| m_sString[lex::_val = boost::phoenix::bind(extractString(), lex::_start, lex::_end)]
| m_sKW_real//[lex::_val = boost::phoenix::bind(&type::getReal)]
| m_sKW_integer//[lex::_val = boost::phoenix::bind(&type::getInteger)]
| m_sKW_string//[lex::_val = boost::phoenix::bind(&type::getString)]
| m_sKW_parameter
| m_sKW_units
| m_sKW_access
| m_sKW_idt_nature
| m_sKW_ddt_nature
| m_sKW_abstol
| m_sKW_nature
| m_sKW_endnature
| m_sKW_continuous
| m_sKW_discrete
| m_sKW_domain
| m_sKW_discipline
| m_sKW_enddiscipline
| m_sReal[lex::_val = boost::phoenix::bind(extractValue<double>(), lex::_start, lex::_end)]
| m_sInteger[lex::_val = boost::phoenix::bind(extractValue<int>(), lex::_start, lex::_end)]
| m_sKW_potential
| m_sKW_flow
| m_sKW_module
| m_sKW_endmodule
| m_sIdentifier
| m_sInclude [ lex::_state = "INCLUDE" ]
;
this->self("INCLUDE") += m_sString [
lex::_state = "INITIAL", lex::_pass = boost::phoenix::bind(onInclude(), lex::_start, lex::_end)
];
this->self("WS") = m_sWS
| m_sLineComment
| m_sCComment
;
}
};
template<typename Iterator, typename Lexer>
class natureParser:public qi::grammar<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> >
{ qi::rule<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> > m_sStart;
qi::rule<Iterator, std::pair<nature::enumAttribute, nature::attributeValue>(void), qi::in_state_skipper<Lexer> > m_sProperty;
qi::rule<Iterator, std::string(), qi::in_state_skipper<Lexer> > m_sName;
public:
template<typename Tokens>
natureParser(const Tokens &_rTokens)
:natureParser::base_type(m_sStart)
{ m_sProperty = (_rTokens.m_sKW_units
>> _rTokens.m_sEqual
>> _rTokens.m_sString
>> _rTokens.m_sSemicolon
)
| (_rTokens.m_sKW_access
>> _rTokens.m_sEqual
>> _rTokens.m_sIdentifier
>> _rTokens.m_sSemicolon
)
| (_rTokens.m_sKW_idt_nature
>> _rTokens.m_sEqual
>> _rTokens.m_sIdentifier
>> _rTokens.m_sSemicolon
)
| (_rTokens.m_sKW_ddt_nature
>> _rTokens.m_sEqual
>> _rTokens.m_sIdentifier
>> _rTokens.m_sSemicolon
)
| (_rTokens.m_sKW_abstol
>> _rTokens.m_sEqual
>> _rTokens.m_sReal
>> _rTokens.m_sSemicolon
)
;
m_sName = (_rTokens.m_sColon >> _rTokens.m_sIdentifier);
m_sStart = (_rTokens.m_sKW_nature
>> _rTokens.m_sIdentifier
>> -m_sName
>> _rTokens.m_sSemicolon
>> *(m_sProperty)
>> _rTokens.m_sKW_endnature
);
m_sStart.name("start");
m_sProperty.name("property");
}
};
/*
// Conservative discipline
discipline electrical;
potential Voltage;
flow Current;
enddiscipline
*/
// a parser for a discipline declaration
template<typename Iterator, typename Lexer>
class disciplineParser:public qi::grammar<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> >
{ qi::rule<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> > m_sStart;
typedef std::pair<bool, boost::intrusive_ptr<const nature> > CPotentialAndNature;
struct empty
{ bool operator<(const empty&) const
{ return false;
}
friend std::ostream &operator<<(std::ostream &_r, const empty&)
{ return _r;
}
};
typedef boost::variant<empty, CPotentialAndNature, discipline::enumDomain> property;
qi::rule<Iterator, discipline::enumDomain(), qi::in_state_skipper<Lexer> > m_sDomain;
qi::rule<Iterator, property(void), qi::in_state_skipper<Lexer> > m_sProperty;
public:
template<typename Tokens>
disciplineParser(const Tokens &_rTokens)
:disciplineParser::base_type(m_sStart)
{ m_sDomain = _rTokens.m_sKW_continuous
| _rTokens.m_sKW_discrete
;
m_sProperty = (_rTokens.m_sKW_potential >> _rTokens.m_sIdentifier >> _rTokens.m_sSemicolon)
| (_rTokens.m_sKW_flow >> _rTokens.m_sIdentifier >> _rTokens.m_sSemicolon)
| (_rTokens.m_sKW_domain >> m_sDomain >> _rTokens.m_sSemicolon)
;
m_sStart = (_rTokens.m_sKW_discipline
>> _rTokens.m_sIdentifier
>> _rTokens.m_sSemicolon
>> *m_sProperty
>> _rTokens.m_sKW_enddiscipline
);
}
};
template<typename Iterator, typename Lexer>
class moduleParser:public qi::grammar<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> >
{ public:
qi::rule<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> > m_sStart;
qi::rule<Iterator, symbol::symbolVector(void), qi::in_state_skipper<Lexer> > m_sModulePortList;
qi::rule<Iterator, symbol::symbolVector(void), qi::in_state_skipper<Lexer> > m_sPortList;
qi::rule<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> > m_sPort;
qi::rule<Iterator, std::shared_ptr<symbol::push_scope>(void), qi::in_state_skipper<Lexer> > m_sModule;
typedef boost::intrusive_ptr<const ref_counter> intrusivePtr;
typedef std::vector<intrusivePtr> vectorOfPtr;
qi::rule<Iterator, vectorOfPtr(void), qi::in_state_skipper<Lexer> > m_sModuleItemList;
qi::rule<Iterator, intrusivePtr(void), qi::in_state_skipper<Lexer> > m_sParameter;
qi::rule<Iterator, intrusivePtr(void), qi::in_state_skipper<Lexer> > m_sModuleItem;
qi::rule<Iterator, type::typePtr(void), qi::in_state_skipper<Lexer> > m_sType;
template<typename Tokens>
moduleParser(const Tokens &_rTokens)
:moduleParser::base_type(m_sStart)
{ m_sPort = _rTokens.m_sIdentifier;
m_sPortList %= m_sPort % _rTokens.m_sComma;
m_sModulePortList %= _rTokens.m_sCharOP >> m_sPortList >> _rTokens.m_sCharCP;
m_sModule = _rTokens.m_sKW_module;
m_sType = _rTokens.m_sKW_real | _rTokens.m_sKW_integer | _rTokens.m_sKW_string;
m_sParameter = _rTokens.m_sKW_parameter
>> m_sType
>> _rTokens.m_sIdentifier
;
m_sModuleItem = m_sParameter;
m_sModuleItemList %= *m_sModuleItem;
m_sStart = (m_sModule
>> _rTokens.m_sIdentifier
>> m_sModulePortList
>> m_sModuleItemList
>> _rTokens.m_sKW_endmodule);
}
};
template<typename Iterator, typename Lexer>
class fileParser:public qi::grammar<Iterator, symbol::symbolVector(void), qi::in_state_skipper<Lexer> >
{ public:
disciplineParser<Iterator, Lexer> m_sDiscipline;
natureParser<Iterator, Lexer> m_sNature;
moduleParser<Iterator, Lexer> m_sModule;
qi::rule<Iterator, symbol::symbolVector(void), qi::in_state_skipper<Lexer> > m_sStart;
qi::rule<Iterator, symbol::symbolPtr(void), qi::in_state_skipper<Lexer> > m_sItem;
//public:
template<typename Tokens>
fileParser(const Tokens &_rTokens)
:fileParser::base_type(m_sStart),
m_sNature(_rTokens),
m_sDiscipline(_rTokens),
m_sModule(_rTokens)
{ m_sItem = m_sDiscipline | m_sNature | m_sModule;
m_sStart = *m_sItem;
}
};
}
int main()
{ std::string sInput = "\
nature Current;\n\
units = \"A\";\n\
access = I;\n\
idt_nature = Charge;\n\
abstol = 1e-12;\n\
endnature\n\
\n\
// Charge in coulombs\n\
nature Charge;\n\
units = \"coul\";\n\
access = Q;\n\
ddt_nature = Current;\n\
abstol = 1e-14;\n\
endnature\n\
\n\
// Potential in volts\n\
nature Voltage;\n\
units = \"V\";\n\
access = V;\n\
idt_nature = Flux;\n\
abstol = 1e-6;\n\
endnature\n\
\n\
discipline electrical;\n\
potential Voltage;\n\
flow Current;\n\
enddiscipline\n\
";
typedef lex::lexertl::token<frank::myIterator, boost::mpl::vector<frank::type::typePtr, std::string, double, int> > token_type;
typedef lex::lexertl::actor_lexer<token_type> lexer_type;
typedef frank::lexer<lexer_type>::iterator_type iterator_type;
typedef frank::fileParser<iterator_type, frank::lexer<lexer_type>::lexer_def> grammar_type;
frank::lexer<lexer_type> sLexer;
grammar_type sParser(sLexer);
frank::symbol::push_scope sPush;
auto pStringBegin = frank::myIterator(sInput);
auto pBegin(sLexer.begin(pStringBegin, frank::myIterator(sInput, true)));
const auto b = qi::phrase_parse(pBegin, sLexer.end(), sParser, qi::in_state("WS")[sLexer.self]);
}

Has anybody gotten this working for a lexer returning something else than double or int or string?
Sure. Simple examples might be found on this site
And for the parser also returning non-trivial objects?
Here's your real problem. Spirit is nice for a subset of parsers that are expressed easily in a eDSL, and has the huge benefit of "magically" mapping to a selection of attributes.
Some of the realities are:
attributes are expected to have value-semantic; using polymorphic attributes is hard (How can I use polymorphic attributes with boost::spirit::qi parsers?, e.g.)
using Lex makes most of the sweet-spot disappear since all "highlevel" parsers (like real_parser, [u]int_parser) are out the window. The Spirit devs are on record they prefer not to use Lex. Moreover, Spirit X3 doesn't have Lex support anymore.
Background Information:
I'd very much consider parsing the source as-is, into direct value-typed AST nodes. I know, this is probably what you consider "trivial objects", but don't be deceived by apparent simplicity: recursive variant trees have some expressive power.
Examples
Here's a trivial AST to represent JSON in <20 LoC: Boost Karma generator for composition of classes¹
Here we represent the Graphviz source format with full fidelity: How to use boost spirit list operator with mandatory minimum amount of elements?
I've since created the code to transform that AST into a domain representation with fully correct ownership, cascading lexically scoped node/edge attributes and cross references. I have just recovered that work and put it up on github if you're interested, mainly because the task is pretty similar in many respects, like the overriding/inheriting of properties and resolving identifiers within scopes: https://github.com/sehe/spirit-graphviz/blob/master/spirit-graphviz.cpp#L660
Suggestions, Ideas
In your case I'd take similar approach to retain simplicity. The code shown doesn't (yet) cover the trickiest ingredients (like nature attribute overrides within a discipline).
Once you start implementing use-cases like resolving compatible disciplines and the absolute tolerances at a given node, you want a domain model with full fidelity. Preferrably, there would be no loss of source information, and immutable AST information².
As a middle ground, you could probably avoid building an entire source-AST in memory only to transform it in one big go, at the top-level you could have:
file = qi::skip(skipper) [
*(m_sDiscipline | m_sNature | m_sModule) [process_ast(_1)]
];
Where process_ast would apply the "trivial" AST representation into the domain types, one at a time. That way you keep only small bits of temporary AST representation around.
The domain representation can be arbitrarily sophisticated to support all your logic and use-cases.
Let's "Show, Don't Tell"
Baking the simplest AST that comes to mind matching the grammar³:
namespace frank { namespace ast {
struct nature {
struct empty{};
std::string name;
std::string inherits;
enum class Attribute { units, access, idt, ddt, abstol };
using Value = boost::variant<int, double, std::string>;
std::map<Attribute, Value> attributes;
};
struct discipline {
enum enumDomain { eUnspecified, eDiscrete, eContinuous };
struct properties_t {
enumDomain domain = eUnspecified;
boost::optional<std::string> flow, potential;
};
std::string name;
properties_t properties;
};
// TODO
using module = qi::unused_type;
using file = std::vector<boost::variant<nature, discipline, module> >;
enum class type { real, integer, string };
} }
This is trivial and maps 1:1 onto the grammar productions, which means we have very little impedance.
Tokens? We Don't Need Lex For That
You can have common token parsers without requiring the complexities of Lex
Yes, Lex (especially statically generated) can potentially improve performance, but
if you need that, I wager Spirit Qi is not your best option anyways
premature optimization...
What I did:
struct tokens {
// implicit lexemes
qi::rule<It, std::string()> string, identifier;
qi::rule<It, double()> real;
qi::rule<It, int()> integer;
qi::rule<It, ast::nature::Value()> value;
qi::rule<It, ast::nature::Attribute()> attribute;
qi::rule<It, ast::discipline::enumDomain()> domain;
struct attribute_sym_t : qi::symbols<char, ast::nature::Attribute> {
attribute_sym_t() {
this->add
("units", ast::nature::Attribute::units)
("access", ast::nature::Attribute::access)
("idt_nature", ast::nature::Attribute::idt)
("ddt_nature", ast::nature::Attribute::ddt)
("abstol", ast::nature::Attribute::abstol);
}
} attribute_sym;
struct domain_sym_t : qi::symbols<char, ast::discipline::enumDomain> {
domain_sym_t() {
this->add
("discrete", ast::discipline::eDiscrete)
("continuous", ast::discipline::eContinuous);
}
} domain_sym;
tokens() {
using namespace qi;
auto kw = qr::distinct(copy(char_("a-zA-Z0-9_")));
string = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
identifier = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
real = double_;
integer = int_;
attribute = kw[attribute_sym];
domain = kw[domain_sym];
value = string | identifier | real | integer;
BOOST_SPIRIT_DEBUG_NODES((string)(identifier)(real)(integer)(value)(domain)(attribute))
}
};
Liberating, isn't it? Note how
all attributes are automatically propagated
strings handle escapes (this bit was commented out in your Lex approach). We don't even need semantic actions to (badly) pry out the unquoted/unescaped value
we used distinct to ensure keyword parsing matches only full identifiers. (See How to parse reserved words correctly in boost spirit).
This is actually where you notice the lack of separate lexer.
On the flipside, this makes context-sensitive keywords a breeze (lex can easily prioritizes keywords over identifiers that occur in places where keywords cannot occur.⁴)
What About Skipping Space/Comments?
We could have added a token, but for reasons of convention I made it a parser:
struct skipParser : qi::grammar<It> {
skipParser() : skipParser::base_type(spaceOrComment) {
using namespace qi;
spaceOrComment = space
| ("//" >> *(char_ - eol) >> (eoi|eol))
| ("/*" >> *(char_ - "*/") >> "*/");
BOOST_SPIRIT_DEBUG_NODES((spaceOrComment))
}
private:
qi::rule<It> spaceOrComment;
};
natureParser
We inherit our AST parsers from tokens:
struct natureParser : tokens, qi::grammar<It, ast::nature(), skipParser> {
And from there it is plain sailing:
property = attribute >> '=' >> value >> ';';
nature
= kw["nature"] >> identifier >> -(':' >> identifier) >> ';'
>> *property
>> kw["endnature"];
disciplineParser
discipline = kw["discipline"] >> identifier >> ';'
>> properties
>> kw["enddiscipline"]
;
properties
= kw["domain"] >> domain >> ';'
^ kw["flow"] >> identifier >> ';'
^ kw["potential"] >> identifier >> ';'
;
This shows a competing approach that uses the permutation operator (^) to parse optional alternatives in any order into a fixed frank::ast::discipline properties struct. Of course, you might elect to have a more generic representation here, like we had with ast::nature.
Module AST is left as an exercise for the reader, though the parser rules are implemented below.
Top Level, Encapsulating The Skipper
I hate having to specify the skipper from the calling code (it's more complex than required, and changing the skipper changes the grammar). So, I encapsulate it in the top-level parser:
struct fileParser : qi::grammar<It, ast::file()> {
fileParser() : fileParser::base_type(file) {
file = qi::skip(qi::copy(m_sSkip)) [
*(m_sDiscipline | m_sNature | m_sModule)
];
BOOST_SPIRIT_DEBUG_NODES((file))
}
private:
disciplineParser m_sDiscipline;
natureParser m_sNature;
moduleParser m_sModule;
skipParser m_sSkip;
qi::rule<It, ast::file()> file;
};
Demo Time
This demo adds operator<< for the enums, and a variant visitor to print some AST details for debug/demonstrational purposes (print_em).
Then we have a test driver:
int main() {
using iterator_type = std::string::const_iterator;
iterator_type iter = sInput.begin(), last = sInput.end();
frank::Parsers<iterator_type>::fileParser parser;
print_em print;
frank::ast::file file;
bool ok = qi::parse(iter, last, parser, file);
if (ok) {
for (auto& symbol : file)
print(symbol);
}
else {
std::cout << "Parse failed\n";
}
if (iter != last) {
std::cout << "Remaining unparsed: '" << std::string(iter,last) << "'\n";
}
}
With the sample input from your question we get the following output:
Live On Coliru
-- Nature
name: Current
inherits:
attribute: units = A
attribute: access = I
attribute: idt = Charge
attribute: abstol = 1e-12
-- Nature
name: Charge
inherits:
attribute: units = coul
attribute: access = Q
attribute: ddt = Current
attribute: abstol = 1e-14
-- Nature
name: Voltage
inherits:
attribute: units = V
attribute: access = V
attribute: idt = Flux
attribute: abstol = 1e-06
-- Discipline
name: electrical
domain: (unspecified)
flow: Current
potential: Voltage
Remaining unparsed: '
'
With BOOST_SPIRIT_DEBUG defined, you get rich debug information: Live On Coliru
Full Listing
Live On Coliru
//#define BOOST_SPIRIT_DEBUG
#include <map>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapted.hpp>
#include <boost/spirit/repository/include/qi_distinct.hpp>
namespace qi = boost::spirit::qi;
namespace frank { namespace ast {
struct nature {
struct empty{};
std::string name;
std::string inherits;
enum class Attribute { units, access, idt, ddt, abstol };
using Value = boost::variant<int, double, std::string>;
std::map<Attribute, Value> attributes;
};
struct discipline {
enum enumDomain { eUnspecified, eDiscrete, eContinuous };
struct properties_t {
enumDomain domain = eUnspecified;
boost::optional<std::string> flow, potential;
};
std::string name;
properties_t properties;
};
// TODO
using module = qi::unused_type;
using file = std::vector<boost::variant<nature, discipline, module> >;
enum class type { real, integer, string };
} }
BOOST_FUSION_ADAPT_STRUCT(frank::ast::nature, name, inherits, attributes)
BOOST_FUSION_ADAPT_STRUCT(frank::ast::discipline, name, properties)
BOOST_FUSION_ADAPT_STRUCT(frank::ast::discipline::properties_t, domain, flow, potential)
namespace frank {
namespace qr = boost::spirit::repository::qi;
template <typename It> struct Parsers {
struct tokens {
// implicit lexemes
qi::rule<It, std::string()> string, identifier;
qi::rule<It, double()> real;
qi::rule<It, int()> integer;
qi::rule<It, ast::nature::Value()> value;
qi::rule<It, ast::nature::Attribute()> attribute;
qi::rule<It, ast::discipline::enumDomain()> domain;
struct attribute_sym_t : qi::symbols<char, ast::nature::Attribute> {
attribute_sym_t() {
this->add
("units", ast::nature::Attribute::units)
("access", ast::nature::Attribute::access)
("idt_nature", ast::nature::Attribute::idt)
("ddt_nature", ast::nature::Attribute::ddt)
("abstol", ast::nature::Attribute::abstol);
}
} attribute_sym;
struct domain_sym_t : qi::symbols<char, ast::discipline::enumDomain> {
domain_sym_t() {
this->add
("discrete", ast::discipline::eDiscrete)
("continuous", ast::discipline::eContinuous);
}
} domain_sym;
tokens() {
using namespace qi;
auto kw = qr::distinct(copy(char_("a-zA-Z0-9_")));
string = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
identifier = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
real = double_;
integer = int_;
attribute = kw[attribute_sym];
domain = kw[domain_sym];
value = string | identifier | real | integer;
BOOST_SPIRIT_DEBUG_NODES((string)(identifier)(real)(integer)(value)(domain)(attribute))
}
};
struct skipParser : qi::grammar<It> {
skipParser() : skipParser::base_type(spaceOrComment) {
using namespace qi;
spaceOrComment = space
| ("//" >> *(char_ - eol) >> (eoi|eol))
| ("/*" >> *(char_ - "*/") >> "*/");
BOOST_SPIRIT_DEBUG_NODES((spaceOrComment))
}
private:
qi::rule<It> spaceOrComment;
};
struct natureParser : tokens, qi::grammar<It, ast::nature(), skipParser> {
natureParser() : natureParser::base_type(nature) {
using namespace qi;
auto kw = qr::distinct(copy(char_("a-zA-Z0-9_")));
property = attribute >> '=' >> value >> ';';
nature
= kw["nature"] >> identifier >> -(':' >> identifier) >> ';'
>> *property
>> kw["endnature"];
BOOST_SPIRIT_DEBUG_NODES((nature)(property))
}
private:
using Attribute = std::pair<ast::nature::Attribute, ast::nature::Value>;
qi::rule<It, ast::nature(), skipParser> nature;
qi::rule<It, Attribute(), skipParser> property;
using tokens::attribute;
using tokens::value;
using tokens::identifier;
};
struct disciplineParser : tokens, qi::grammar<It, ast::discipline(), skipParser> {
disciplineParser() : disciplineParser::base_type(discipline) {
auto kw = qr::distinct(qi::copy(qi::char_("a-zA-Z0-9_")));
discipline = kw["discipline"] >> identifier >> ';'
>> properties
>> kw["enddiscipline"]
;
properties
= kw["domain"] >> domain >> ';'
^ kw["flow"] >> identifier >> ';'
^ kw["potential"] >> identifier >> ';'
;
BOOST_SPIRIT_DEBUG_NODES((discipline)(properties))
}
private:
qi::rule<It, ast::discipline(), skipParser> discipline;
qi::rule<It, ast::discipline::properties_t(), skipParser> properties;
using tokens::domain;
using tokens::identifier;
};
struct moduleParser : tokens, qi::grammar<It, ast::module(), skipParser> {
moduleParser() : moduleParser::base_type(module) {
auto kw = qr::distinct(qi::copy(qi::char_("a-zA-Z0-9_")));
m_sPort = identifier;
m_sPortList = m_sPort % ',';
m_sModulePortList = '(' >> m_sPortList >> ')';
m_sModule = kw["module"];
m_sType = kw["real"] | kw["integer"] | kw["string"];
m_sParameter = kw["parameter"] >> m_sType >> identifier;
m_sModuleItem = m_sParameter;
m_sModuleItemList = *m_sModuleItem;
module =
(m_sModule >> identifier >> m_sModulePortList >> m_sModuleItemList >> kw["endmodule"]);
}
private:
qi::rule<It, ast::module(), skipParser> module;
qi::rule<It, skipParser> m_sModulePortList;
qi::rule<It, skipParser> m_sPortList;
qi::rule<It, skipParser> m_sPort;
qi::rule<It, skipParser> m_sModule;
qi::rule<It, skipParser> m_sModuleItemList;
qi::rule<It, skipParser> m_sParameter;
qi::rule<It, skipParser> m_sModuleItem;
qi::rule<It, skipParser> m_sType;
using tokens::identifier;
};
struct fileParser : qi::grammar<It, ast::file()> {
fileParser() : fileParser::base_type(file) {
file = qi::skip(qi::copy(m_sSkip)) [
*(m_sDiscipline | m_sNature | m_sModule)
];
BOOST_SPIRIT_DEBUG_NODES((file))
}
private:
disciplineParser m_sDiscipline;
natureParser m_sNature;
moduleParser m_sModule;
skipParser m_sSkip;
qi::rule<It, ast::file()> file;
};
};
}
extern std::string const sInput;
// just for demo
#include <boost/optional/optional_io.hpp>
namespace frank { namespace ast {
//static inline std::ostream &operator<<(std::ostream &os, const nature::empty &) { return os; }
static inline std::ostream &operator<<(std::ostream &os, nature::Attribute a) {
switch(a) {
case nature::Attribute::units: return os << "units";
case nature::Attribute::access: return os << "access";
case nature::Attribute::idt: return os << "idt";
case nature::Attribute::ddt: return os << "ddt";
case nature::Attribute::abstol: return os << "abstol";
};
return os << "?";
}
static inline std::ostream &operator<<(std::ostream &os, discipline::enumDomain d) {
switch(d) {
case discipline::eDiscrete: return os << "discrete";
case discipline::eContinuous: return os << "continuous";
case discipline::eUnspecified: return os << "(unspecified)";
};
return os << "?";
}
} }
struct print_em {
using result_type = void;
template <typename V>
void operator()(V const& variant) const {
boost::apply_visitor(*this, variant);
}
void operator()(frank::ast::nature const& nature) const {
std::cout << "-- Nature\n";
std::cout << "name: " << nature.name << "\n";
std::cout << "inherits: " << nature.inherits << "\n";
for (auto& a : nature.attributes) {
std::cout << "attribute: " << a.first << " = " << a.second << "\n";
}
}
void operator()(frank::ast::discipline const& discipline) const {
std::cout << "-- Discipline\n";
std::cout << "name: " << discipline.name << "\n";
std::cout << "domain: " << discipline.properties.domain << "\n";
std::cout << "flow: " << discipline.properties.flow << "\n";
std::cout << "potential: " << discipline.properties.potential << "\n";
}
void operator()(frank::ast::module const&) const {
std::cout << "-- Module (TODO)\n";
}
};
int main() {
using iterator_type = std::string::const_iterator;
iterator_type iter = sInput.begin(), last = sInput.end();
frank::Parsers<iterator_type>::fileParser parser;
print_em print;
frank::ast::file file;
bool ok = parse(iter, last, parser, file);
if (ok) {
for (auto& symbol : file)
print(symbol);
}
else {
std::cout << "Parse failed\n";
}
if (iter != last) {
std::cout << "Remaining unparsed: '" << std::string(iter,last) << "'\n";
}
}
std::string const sInput = R"(
nature Current;
units = "A";
access = I;
idt_nature = Charge;
abstol = 1e-12;
endnature
// Charge in coulombs
nature Charge;
units = "coul";
access = Q;
ddt_nature = Current;
abstol = 1e-14;
endnature
// Potential in volts
nature Voltage;
units = "V";
access = V;
idt_nature = Flux;
abstol = 1e-6;
endnature
discipline electrical;
potential Voltage;
flow Current;
enddiscipline
)";
¹ incidentally, the other answer there demonstrates the "impedance mismatch" with polymorphic attributes and Spirit - this time on the Karma side of it
² (to prevent subtle bugs that depend on evaluation order or things like that, e.g.)
³ (gleaning some from here but not importing too much complexity that wasn't reflected in your Lex approach)
⁴ (In fact, this is where you'd need state-switching inside the grammar, an area notoriously underdeveloped and practically unusable in Spirit Lex: e.g. when it works how to avoid defining token which matchs everything in boost::spirit::lex or when it goes badly: Boost.Spirit SQL grammar/lexer failure)

One solution would be to use a std::string everywhere and define a boost::variant with everything needed but not use it anywhere in the parser or lexer directly but only serialize & deserialize it into/from the string.
Is this what the originators of boost::spirit intended?

Using lexer token attributes in grammar rules with Lex and Qi from Boost.Spirit

Let's consider following code:
#include <boost/phoenix.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/qi.hpp>
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
namespace lex = boost::spirit::lex;
namespace qi = boost::spirit::qi;
namespace phoenix = boost::phoenix;
struct operation
{
enum type
{
add,
sub,
mul,
div
};
};
template<typename Lexer>
class expression_lexer
: public lex::lexer<Lexer>
{
public:
typedef lex::token_def<operation::type> operator_token_type;
typedef lex::token_def<double> value_token_type;
typedef lex::token_def<std::string> variable_token_type;
typedef lex::token_def<lex::omit> parenthesis_token_type;
typedef std::pair<parenthesis_token_type, parenthesis_token_type> parenthesis_token_pair_type;
typedef lex::token_def<lex::omit> whitespace_token_type;
expression_lexer()
: operator_add('+'),
operator_sub('-'),
operator_mul("[x*]"),
operator_div("[:/]"),
value("\\d+(\\.\\d+)?"),
variable("%(\\w+)"),
parenthesis({
std::make_pair(parenthesis_token_type('('), parenthesis_token_type(')')),
std::make_pair(parenthesis_token_type('['), parenthesis_token_type(']'))
}),
whitespace("[ \\t]+")
{
this->self
+= operator_add [lex::_val = operation::add]
| operator_sub [lex::_val = operation::sub]
| operator_mul [lex::_val = operation::mul]
| operator_div [lex::_val = operation::div]
| value
| variable [lex::_val = phoenix::construct<std::string>(lex::_start + 1, lex::_end)]
| whitespace [lex::_pass = lex::pass_flags::pass_ignore]
;
std::for_each(parenthesis.cbegin(), parenthesis.cend(),
[&](parenthesis_token_pair_type const& token_pair)
{
this->self += token_pair.first | token_pair.second;
}
);
}
operator_token_type operator_add;
operator_token_type operator_sub;
operator_token_type operator_mul;
operator_token_type operator_div;
value_token_type value;
variable_token_type variable;
std::vector<parenthesis_token_pair_type> parenthesis;
whitespace_token_type whitespace;
};
template<typename Iterator>
class expression_grammar
: public qi::grammar<Iterator>
{
public:
template<typename Tokens>
explicit expression_grammar(Tokens const& tokens)
: expression_grammar::base_type(start)
{
start %= expression >> qi::eoi;
expression %= sum_operand >> -(sum_operator >> expression);
sum_operator %= tokens.operator_add | tokens.operator_sub;
sum_operand %= fac_operand >> -(fac_operator >> sum_operand);
fac_operator %= tokens.operator_mul | tokens.operator_div;
if(!tokens.parenthesis.empty())
fac_operand %= parenthesised | terminal;
else
fac_operand %= terminal;
terminal %= tokens.value | tokens.variable;
if(!tokens.parenthesis.empty())
{
parenthesised %= tokens.parenthesis.front().first >> expression >> tokens.parenthesis.front().second;
std::for_each(tokens.parenthesis.cbegin() + 1, tokens.parenthesis.cend(),
[&](typename Tokens::parenthesis_token_pair_type const& token_pair)
{
parenthesised %= parenthesised.copy() | (token_pair.first >> expression >> token_pair.second);
}
);
}
}
private:
qi::rule<Iterator> start;
qi::rule<Iterator> expression;
qi::rule<Iterator> sum_operand;
qi::rule<Iterator> sum_operator;
qi::rule<Iterator> fac_operand;
qi::rule<Iterator> fac_operator;
qi::rule<Iterator> terminal;
qi::rule<Iterator> parenthesised;
};
int main()
{
typedef lex::lexertl::token<std::string::const_iterator, boost::mpl::vector<operation::type, double, std::string>> token_type;
typedef expression_lexer<lex::lexertl::actor_lexer<token_type>> expression_lexer_type;
typedef expression_lexer_type::iterator_type expression_lexer_iterator_type;
typedef expression_grammar<expression_lexer_iterator_type> expression_grammar_type;
expression_lexer_type lexer;
expression_grammar_type grammar(lexer);
while(std::cin)
{
std::string line;
std::getline(std::cin, line);
std::string::const_iterator first = line.begin();
std::string::const_iterator const last = line.end();
bool const result = lex::tokenize_and_parse(first, last, lexer, grammar);
if(!result)
std::cout << "Parsing failed! Reminder: >" << std::string(first, last) << "<" << std::endl;
else
{
if(first != last)
std::cout << "Parsing succeeded! Reminder: >" << std::string(first, last) << "<" << std::endl;
else
std::cout << "Parsing succeeded!" << std::endl;
}
}
}
It is a simple parser for arithmetic expressions with values and variables. It is build using expression_lexer for extracting tokens, and then with expression_grammar to parse the tokens.
Use of lexer for such a small case might seem an overkill and probably is one. But that is the cost of simplified example. Also note that use of lexer allows to easily define tokens with regular expression while that allows to easily define them by external code (and user provided configuration in particular). With the example provided it would be no issue at all to read definition of tokens from an external config file and for example allow user to change variables from %name to $name.
The code seems to be working fine (checked on Visual Studio 2013 with Boost 1.61).
The expression_lexer has attributes attached to tokens. I guess they work since they compile. But I don't really know how to check.
Ultimately I would like the grammar to build me an std::vector with reversed polish notation of the expression. (Where every element would be a boost::variant over either operator::type or double or std::string.)
The problem is however that I failed to use token attributes in my expression_grammar. For example if you try to change sum_operator following way:
qi::rule<Iterator, operation::type ()> sum_operator;
you will get compilation error. I expected this to work since operation::type is the attribute for both operator_add and operator_sub and so also for their alternative. And still it doesn't compile. Judging from the error in assign_to_attribute_from_iterators it seems that parser tries to build the attribute value directly from input stream range. Which means it ignores the [lex::_val = operation::add] I specified in my lexer.
Changing that to
qi::rule<Iterator, operation::type (operation::type)> sum_operator;
didn't help either.
Also I tried changing definition to
sum_operator %= (tokens.operator_add | tokens.operator_sub) [qi::_val = qi::_1];
didn't help either.
How to work around that? I know I could use symbols from Qi. But I want to have the lexer to make it easy to configure regexes for the tokens. I could also extend the assign_to_attribute_from_iterators as described in the documentation but this kind of double the work. I guess I could also skip the attributes on lexer and just have them on grammar. But this again doesn't work well with flexibility on variable token (in my actual case there is slightly more logic there so that it is configurable also which part of the token forms actual name of the variable - while here it is fixed to just skip the first character). Anything else?
Also a side question - maybe anyone knows. Is there a way to get to capture groups of the regular expression of the token from tokens action? So that instead of having
variable [lex::_val = phoenix::construct<std::string>(lex::_start + 1, lex::_end)]
instead I would be able to make a string from the capture group and so easily handle formats like $var$.
Edited! I have improved whitespace skipping along conclusions from Whitespace skipper when using Boost.Spirit Qi and Lex. It is a simplification that does not affect questions asked here.

Ok, here's my take on the RPN 'requirement'. I heavily favor natural (automatic) attribute propagation over semantic actions (see Boost Spirit: "Semantic actions are evil"?)
I consider the other options (uglifying) optimizations. You might do them if you're happy with the overall design and don't mind making it harder to maintain :)
Live On Coliru
Beyond the sample from my comment that you've already studied, I added that RPN transformation step:
namespace RPN {
using cell = boost::variant<AST::operation, AST::value, AST::variable>;
using rpn_stack = std::vector<cell>;
struct transform : boost::static_visitor<> {
void operator()(rpn_stack& stack, AST::expression const& e) const {
boost::apply_visitor(boost::bind(*this, boost::ref(stack), ::_1), e);
}
void operator()(rpn_stack& stack, AST::bin_expr const& e) const {
(*this)(stack, e.lhs);
(*this)(stack, e.rhs);
stack.push_back(e.op);
}
void operator()(rpn_stack& stack, AST::value const& v) const { stack.push_back(v); }
void operator()(rpn_stack& stack, AST::variable const& v) const { stack.push_back(v); }
};
}
That's all! Use it like so, e.g.:
RPN::transform compiler;
RPN::rpn_stack program;
compiler(program, expr);
for (auto& instr : program) {
std::cout << instr << " ";
}
Which makes the output:
Parsing success: (3 + (8 * 9))
3 8 9 * +
Full Listing
Live On Coliru
//#define BOOST_SPIRIT_DEBUG
#include <boost/phoenix.hpp>
#include <boost/bind.hpp>
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/qi.hpp>
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
namespace lex = boost::spirit::lex;
namespace qi = boost::spirit::qi;
namespace phoenix = boost::phoenix;
struct operation
{
enum type
{
add,
sub,
mul,
div
};
friend std::ostream& operator<<(std::ostream& os, type op) {
switch (op) {
case type::add: return os << "+";
case type::sub: return os << "-";
case type::mul: return os << "*";
case type::div: return os << "/";
}
return os << "<" << static_cast<int>(op) << ">";
}
};
template<typename Lexer>
class expression_lexer
: public lex::lexer<Lexer>
{
public:
//typedef lex::token_def<operation::type> operator_token_type;
typedef lex::token_def<lex::omit> operator_token_type;
typedef lex::token_def<double> value_token_type;
typedef lex::token_def<std::string> variable_token_type;
typedef lex::token_def<lex::omit> parenthesis_token_type;
typedef std::pair<parenthesis_token_type, parenthesis_token_type> parenthesis_token_pair_type;
typedef lex::token_def<lex::omit> whitespace_token_type;
expression_lexer()
: operator_add('+'),
operator_sub('-'),
operator_mul("[x*]"),
operator_div("[:/]"),
value("\\d+(\\.\\d+)?"),
variable("%(\\w+)"),
parenthesis({
std::make_pair(parenthesis_token_type('('), parenthesis_token_type(')')),
std::make_pair(parenthesis_token_type('['), parenthesis_token_type(']'))
}),
whitespace("[ \\t]+")
{
this->self
+= operator_add [lex::_val = operation::add]
| operator_sub [lex::_val = operation::sub]
| operator_mul [lex::_val = operation::mul]
| operator_div [lex::_val = operation::div]
| value
| variable [lex::_val = phoenix::construct<std::string>(lex::_start + 1, lex::_end)]
| whitespace [lex::_pass = lex::pass_flags::pass_ignore]
;
std::for_each(parenthesis.cbegin(), parenthesis.cend(),
[&](parenthesis_token_pair_type const& token_pair)
{
this->self += token_pair.first | token_pair.second;
}
);
}
operator_token_type operator_add;
operator_token_type operator_sub;
operator_token_type operator_mul;
operator_token_type operator_div;
value_token_type value;
variable_token_type variable;
std::vector<parenthesis_token_pair_type> parenthesis;
whitespace_token_type whitespace;
};
namespace AST {
using operation = operation::type;
using value = double;
using variable = std::string;
struct bin_expr;
using expression = boost::variant<value, variable, boost::recursive_wrapper<bin_expr> >;
struct bin_expr {
expression lhs, rhs;
operation op;
friend std::ostream& operator<<(std::ostream& os, bin_expr const& be) {
return os << "(" << be.lhs << " " << be.op << " " << be.rhs << ")";
}
};
}
BOOST_FUSION_ADAPT_STRUCT(AST::bin_expr, lhs, op, rhs)
template<typename Iterator>
class expression_grammar : public qi::grammar<Iterator, AST::expression()>
{
public:
template<typename Tokens>
explicit expression_grammar(Tokens const& tokens)
: expression_grammar::base_type(start)
{
start = expression >> qi::eoi;
bin_sum_expr = sum_operand >> sum_operator >> expression;
bin_fac_expr = fac_operand >> fac_operator >> sum_operand;
expression = bin_sum_expr | sum_operand;
sum_operand = bin_fac_expr | fac_operand;
sum_operator = tokens.operator_add >> qi::attr(AST::operation::add) | tokens.operator_sub >> qi::attr(AST::operation::sub);
fac_operator = tokens.operator_mul >> qi::attr(AST::operation::mul) | tokens.operator_div >> qi::attr(AST::operation::div);
if(tokens.parenthesis.empty()) {
fac_operand = terminal;
}
else {
fac_operand = parenthesised | terminal;
parenthesised = tokens.parenthesis.front().first >> expression >> tokens.parenthesis.front().second;
std::for_each(tokens.parenthesis.cbegin() + 1, tokens.parenthesis.cend(),
[&](typename Tokens::parenthesis_token_pair_type const& token_pair)
{
parenthesised = parenthesised.copy() | (token_pair.first >> expression >> token_pair.second);
});
}
terminal = tokens.value | tokens.variable;
BOOST_SPIRIT_DEBUG_NODES(
(start) (expression) (bin_sum_expr) (bin_fac_expr)
(fac_operand) (terminal) (parenthesised) (sum_operand)
(sum_operator) (fac_operator)
);
}
private:
qi::rule<Iterator, AST::expression()> start;
qi::rule<Iterator, AST::expression()> expression;
qi::rule<Iterator, AST::expression()> sum_operand;
qi::rule<Iterator, AST::expression()> fac_operand;
qi::rule<Iterator, AST::expression()> terminal;
qi::rule<Iterator, AST::expression()> parenthesised;
qi::rule<Iterator, int()> sum_operator;
qi::rule<Iterator, int()> fac_operator;
// extra rules to help with AST creation
qi::rule<Iterator, AST::bin_expr()> bin_sum_expr;
qi::rule<Iterator, AST::bin_expr()> bin_fac_expr;
};
namespace RPN {
using cell = boost::variant<AST::operation, AST::value, AST::variable>;
using rpn_stack = std::vector<cell>;
struct transform : boost::static_visitor<> {
void operator()(rpn_stack& stack, AST::expression const& e) const {
boost::apply_visitor(boost::bind(*this, boost::ref(stack), ::_1), e);
}
void operator()(rpn_stack& stack, AST::bin_expr const& e) const {
(*this)(stack, e.lhs);
(*this)(stack, e.rhs);
stack.push_back(e.op);
}
void operator()(rpn_stack& stack, AST::value const& v) const { stack.push_back(v); }
void operator()(rpn_stack& stack, AST::variable const& v) const { stack.push_back(v); }
};
}
int main()
{
typedef lex::lexertl::token<std::string::const_iterator, boost::mpl::vector<operation::type, double, std::string>> token_type;
typedef expression_lexer<lex::lexertl::actor_lexer<token_type>> expression_lexer_type;
typedef expression_lexer_type::iterator_type expression_lexer_iterator_type;
typedef expression_grammar<expression_lexer_iterator_type> expression_grammar_type;
expression_lexer_type lexer;
expression_grammar_type grammar(lexer);
RPN::transform compiler;
std::string line;
while(std::getline(std::cin, line) && !line.empty())
{
std::string::const_iterator first = line.begin();
std::string::const_iterator const last = line.end();
AST::expression expr;
bool const result = lex::tokenize_and_parse(first, last, lexer, grammar, expr);
if(!result)
std::cout << "Parsing failed!\n";
else
{
std::cout << "Parsing success: " << expr << "\n";
RPN::rpn_stack program;
compiler(program, expr);
for (auto& instr : program) {
std::cout << instr << " ";
}
}
if(first != last)
std::cout << "Remainder: >" << std::string(first, last) << "<\n";
}
}

boost spirit on_error not triggered

^ No it is not. This was part of the problem, but if review the code as is right now, it already does what the pointed out question/answer shows ... and the errors are still not triggered.
I have this boost spirit parser for string literal. It works. Now I would like to start handle errors when it fail. I copied the on_error handle 1-1 from the mini xml example and it compiles, but it is never triggered (no errors are outputted).
This is the parser:
#define BOOST_SPIRIT_USE_PHOENIX_V3
#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/home/support/iterators/line_pos_iterator.hpp>
#include <boost/spirit/include/phoenix_fusion.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
namespace qi = boost::spirit::qi;
struct my_handler_f
{
template <typename...> struct result { typedef void type; };
template <typename... T>
void operator()(T&&...) const {
std::cout << "\nmy_handler_f() invoked with " << sizeof...(T) << " arguments\n";
}
};
struct append_utf8_f
{
template <typename, typename>
struct result { typedef void type; };
template <typename INT>
void operator()(INT in, std::string& to) const
{
auto out = std::back_inserter(to);
boost::utf8_output_iterator<decltype(out)> convert(out);
*convert++ = in;
}
};
struct get_line_f
{
template <typename> struct result { typedef size_t type; };
template <typename It> size_t operator()(It const& pos_iter) const
{
return get_line(pos_iter);
}
};
struct RangePosition { size_t beginLine, endLine; };
struct String : public RangePosition
{
String()
: RangePosition()
, value()
, source()
{
}
std::string value;
std::string source;
};
BOOST_FUSION_ADAPT_STRUCT(String,
(std::string, value)
(std::string, source)
(size_t, beginLine)
(size_t, endLine)
)
template <typename Iterator>
struct source_string : qi::grammar<Iterator, String(), qi::space_type>
{
struct escape_symbols : qi::symbols<char, char>
{
escape_symbols()
{
add
("\'" , '\'')
("\"" , '\"')
("\?" , '\?')
("\\" , '\\')
("0" , '\0')
("a" , '\a')
("b" , '\b')
("f" , '\f')
("n" , '\n')
("r" , '\r')
("t" , '\t')
("v" , '\v')
;
}
} escape_symbol;
source_string() : source_string::base_type(start)
{
using qi::raw;
using qi::_val;
using qi::_1;
using qi::_2;
using qi::_3;
using qi::_4;
using qi::space;
using qi::omit;
using qi::no_case;
using qi::print;
using qi::eps;
using qi::on_error;
using qi::fail;
using qi::lit;
namespace phx = boost::phoenix;
using phx::at_c;
using phx::begin;
using phx::end;
using phx::construct;
using phx::ref;
using phx::val;
escape %= escape_symbol;
character %= (no_case["\\x"] > hex12)
| ("\\" > (oct123 | escape))
| (print - (lit('"') | '\\'));
unicode = ("\\u" > hex4[append_utf8(_1, _val)])
| ("\\U" > hex8[append_utf8(_1, _val)]);
string_section %= '"' > *(unicode | character) > '"';
string %= string_section % omit[*space];
main = raw [
string[at_c<0>(_val) = _1]
]
[
at_c<1>(_val) = construct<std::string>(begin(_1), end(_1)),
at_c<2>(_val) = get_line_(begin(_1)),
at_c<3>(_val) = get_line_(end(_1))
];
start %= eps > main;
on_error<fail>(start, my_handler);
}
boost::phoenix::function<my_handler_f> my_handler;
qi::rule<Iterator, std::string()> escape;
qi::uint_parser<char, 16, 1, 2> hex12;
qi::uint_parser<char, 8, 1, 3> oct123;
qi::rule<Iterator, std::string()> character;
qi::uint_parser<uint16_t, 16, 4, 4> hex4;
qi::uint_parser<uint32_t, 16, 8, 8> hex8;
boost::phoenix::function<append_utf8_f> append_utf8;
qi::rule<Iterator, std::string()> unicode;
qi::rule<Iterator, std::string()> string_section;
qi::rule<Iterator, std::string()> string;
boost::phoenix::function<get_line_f> get_line_;
qi::rule<Iterator, String(), qi::space_type> main;
qi::rule<Iterator, String(), qi::space_type> start;
};
and this is the test code
int main()
{
std::string str[] =
{
"\"\\u1234\\U0002345\"",
//"\"te\"\"st\"",
//"\"te\" \"st\"",
//"\"te\" \n \"st\"",
//"\"\"",
//"\"\\\"\"",
//"\"test\"",
//"\"test\" something",
//"\"\\\'\\\"\\\?\\\\\\a\\b\\f\\n\\r\\t\\v\"",
//"\"\\x61cd\\X3012\\x7z\"",
//"\"\\141cd\\06012\\78\\778\"",
"\"te",
//"\"te\nst\"",
//"\"test\\\"",
//"\"te\\st\"",
//
};
typedef boost::spirit::line_pos_iterator<std::string::const_iterator> Iterator;
for (size_t i = 0; i < sizeof(str) / sizeof(str[0]); ++i)
{
source_string<Iterator> g;
Iterator iter(str[i].begin());
Iterator end(str[i].end());
String string;
bool r = phrase_parse(iter, end, g, qi::space, string);
if (r)
std::cout << string.beginLine << "-" << string.endLine << ": " << string.value << " === " << string.source << "\n";
else
std::cout << "Parsing failed\n";
}
}

boost spirit parse c and cpp style comments

I am working on parser for which the comments are meaningful. And before you say it ... the reason is if statement A has a comment in the source in the target statement resulting out of the description of A should be commented with the comment of A. Or comments propagation. Anyways, for this purpose I have to parse and collect the comments not just to skip them.
This is what i have so far:
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/home/support/iterators/line_pos_iterator.hpp>
#include <boost/spirit/repository/include/qi_confix.hpp>
#include <boost/spirit/include/phoenix_fusion.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
using namespace boost::spirit;
#include <boost/fusion/include/adapt_struct.hpp>
////////////////////////////////
// extra facilities
struct get_line_f
{
template <typename> struct result { typedef size_t type; };
template <typename It> size_t operator()(It const& pos_iter) const
{
return get_line(pos_iter);
}
};
struct Position
{
Position()
: beginLine(-1)
, endLine(-1)
{
}
size_t beginLine;
size_t endLine;
};
struct Comment : public Position
{
Comment()
: Position()
, text()
{
}
std::vector<std::string> text;
std::string source;
};
BOOST_FUSION_ADAPT_STRUCT(Comment,
(std::vector<std::string>, text)
(std::string, source)
(size_t, beginLine)
(size_t, endLine)
)
//
////////////////////////////////
template <typename Iterator>
struct comment : qi::grammar<Iterator, Comment(), qi::space_type>
{
comment() : comment::base_type(start)
{
c_comment %= repository::confix("/*", "*/")[*(char_ - "*/")];
cpp_comment %= repository::confix("//", eol | eoi)[*(char_ - eol)];
comments %= *(c_comment | cpp_comment);
start = raw[ comments[at_c<0>(_val) = _1] ]
[
at_c<1>(_val) = construct<std::string>(begin(_1), end(_1)),
at_c<2>(_val) = get_line_(begin(_1)),
at_c<3>(_val) = get_line_(end(_1))
]
;
}
boost::phoenix::function<get_line_f> get_line_;
qi::rule<Iterator, Comment(), qi::space_type> start;
qi::rule<Iterator, std::string()> c_comment;
qi::rule<Iterator, std::string()> cpp_comment;
qi::rule<Iterator, std::vector<std::string>()> comments;
};
This is the testing code:
int main()
{
std::string str[] =
{
"/*1234*/",
"\n\n/*1234*/",
"// foo bar\n",
"// foo bar",
"\n\n // foo bar\n",
"/*1234\n5678*/",
};
typedef line_pos_iterator<std::string::const_iterator> Iterator;
for (size_t i = 0; i < sizeof(str) / sizeof(str[0]); ++i)
{
comment<Iterator> g;
Iterator iter(str[i].begin());
Iterator end(str[i].end());
Comment comment;
bool r = phrase_parse(iter, end, g, qi::space, comment);
if (r && iter == end)
std::cout << comment.beginLine << comment.source << "\n";
else
std::cout << "Parsing failed\n";
}
}
Edit:
This code now is compiling and working ... thanks for the help.

How do I capture the original input into the synthesized output from a spirit grammar?

I'm working on a boost::spirit::qi::grammar and would like to copy a portion of the original text into the synthesized output structure of the grammar (more specifically, the portion that matched one of the components of the rule). The grammar would ultimately be used as a sub-grammar for a more complicated grammar, so I don't really have access to the original input.
I'm guessing that this can be done through semantic actions or the grammar context, but I can't find an example that does this without access to the original parse().
Here's what I have so far:
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
namespace qi = boost::spirit::qi;
struct A
{
std::string header;
std::vector<int> ints;
std::string inttext;
};
BOOST_FUSION_ADAPT_STRUCT(
A,
(std::string, header)
(std::vector<int>, ints)
//(std::string, inttext)
)
template <typename Iterator>
struct parser : qi::grammar< Iterator, A() >
{
parser() : parser::base_type(start)
{
header %= qi::lexeme[ +qi::alpha ];
ints %= qi::lexeme[ qi::int_ % qi::char_(",_") ]; // <---- capture the original text that matches this into inttext
start %= header >> ' ' >> ints;
}
qi::rule<Iterator, std::string()> header;
qi::rule<Iterator, std::vector<int>() > ints;
qi::rule<Iterator, A()> start;
};
int main()
{
A output;
std::string input("out 1,2_3");
auto iter = input.begin();
parser<decltype(iter)> p;
bool r = qi::parse(iter, input.end(), p, output);
if( !r || iter != input.end() )
{
std::cout << "did not parse";
}
else
{
// would like output.inttext to be "1,2_3"
std::cout << output.header << ": " << output.inttext << " -> [ ";
for( auto & i: output.ints )
std::cout << i << ' ';
std::cout << ']' << std::endl;
}
}

Something similar to what you asked without using semantic actions:
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/repository/include/qi_iter_pos.hpp>
namespace qi = boost::spirit::qi;
using boost::spirit::repository::qi::iter_pos;
struct ints_type
{
std::vector<int> data;
std::string::const_iterator begin;
std::string::const_iterator end;
};
struct A
{
std::string header;
ints_type ints;
};
BOOST_FUSION_ADAPT_STRUCT(
ints_type,
(std::string::const_iterator, begin)
(std::vector<int>, data)
(std::string::const_iterator, end)
)
BOOST_FUSION_ADAPT_STRUCT(
A,
(std::string, header)
(ints_type, ints)
)
template <typename Iterator>
struct parser : qi::grammar< Iterator, A() >
{
parser() : parser::base_type(start)
{
header %= qi::lexeme[ +qi::alpha ];
ints %= qi::lexeme[ iter_pos >> qi::int_ % qi::char_(",_") >> iter_pos ]; // <---- capture the original text that matches this into inttext
start %= header >> ' ' >> ints;
}
qi::rule<Iterator, std::string()> header;
qi::rule<Iterator, ints_type() > ints;
qi::rule<Iterator, A()> start;
};
int main()
{
A output;
std::string input("out 1,2_3");
auto iter = input.begin();
parser<decltype(iter)> p;
bool r = qi::parse(iter, input.end(), p, output);
if( !r || iter != input.end() )
{
std::cout << "did not parse";
}
else
{
// would like output.inttext to be "1,2_3"
std::cout << output.header << ": " << std::string(output.ints.begin,output.ints.end) << " -> [ ";
for( auto & i: output.ints.data )
std::cout << i << ' ';
std::cout << ']' << std::endl;
}
}
Using semantic actions:
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/repository/include/qi_iter_pos.hpp>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
using boost::spirit::repository::qi::iter_pos;
struct ints_type
{
std::vector<int> data;
std::string inttext;
};
struct A
{
std::string header;
ints_type ints;
};
BOOST_FUSION_ADAPT_STRUCT(
ints_type,
(std::vector<int>, data)
(std::string, inttext)
)
BOOST_FUSION_ADAPT_STRUCT(
A,
(std::string, header)
(ints_type, ints)
)
template <typename Iterator>
struct parser : qi::grammar< Iterator, A() >
{
parser() : parser::base_type(start)
{
header %= qi::lexeme[ +qi::alpha ];
ints = qi::lexeme[
(iter_pos >> qi::int_ % qi::char_(",_") >> iter_pos)
[phx::at_c<0>(qi::_val)=qi::_2,
phx::at_c<1>(qi::_val)=phx::construct<std::string>(qi::_1,qi::_3)]
];
start %= header >> ' ' >> ints;
}
qi::rule<Iterator, std::string()> header;
qi::rule<Iterator, ints_type() > ints;
qi::rule<Iterator, A()> start;
};
int main()
{
A output;
std::string input("out 1,2_3");
auto iter = input.begin();
parser<decltype(iter)> p;
bool r = qi::parse(iter, input.end(), p, output);
if( !r || iter != input.end() )
{
std::cout << "did not parse";
}
else
{
// would like output.inttext to be "1,2_3"
std::cout << output.header << ": " << output.ints.inttext << " -> [ ";
for( auto & i: output.ints.data )
std::cout << i << ' ';
std::cout << ']' << std::endl;
}
}

Another alternative using a custom directive dont_eat that returns the subject attribute but does not consume any input. This is possibly slower since the rule ints is parsed twice, but I believe that the syntax is nicer (and it's a good excuse to try creating your own directive)(It's a slightly modified version of "boost/spirit/home/qi/directive/lexeme.hpp").
dont_eat.hpp
#if !defined(DONT_EAT_HPP)
#define DONT_EAT_HPP
#if defined(_MSC_VER)
#pragma once
#endif
#include <boost/spirit/home/qi/meta_compiler.hpp>
#include <boost/spirit/home/qi/skip_over.hpp>
#include <boost/spirit/home/qi/parser.hpp>
#include <boost/spirit/home/support/unused.hpp>
#include <boost/spirit/home/support/common_terminals.hpp>
#include <boost/spirit/home/qi/detail/attributes.hpp>
#include <boost/spirit/home/support/info.hpp>
#include <boost/spirit/home/support/handles_container.hpp>
namespace custom
{
BOOST_SPIRIT_TERMINAL(dont_eat);
}
namespace boost { namespace spirit
{
///////////////////////////////////////////////////////////////////////////
// Enablers
///////////////////////////////////////////////////////////////////////////
template <>
struct use_directive<qi::domain, custom::tag::dont_eat> // enables dont_eat
: mpl::true_ {};
}}
namespace custom
{
template <typename Subject>
struct dont_eat_directive : boost::spirit::qi::unary_parser<dont_eat_directive<Subject> >
{
typedef Subject subject_type;
dont_eat_directive(Subject const& subject)
: subject(subject) {}
template <typename Context, typename Iterator>
struct attribute
{
typedef typename
boost::spirit::traits::attribute_of<subject_type, Context, Iterator>::type
type;
};
template <typename Iterator, typename Context
, typename Skipper, typename Attribute>
bool parse(Iterator& first, Iterator const& last
, Context& context, Skipper const& skipper
, Attribute& attr) const
{
Iterator temp = first;
boost::spirit::qi::skip_over(temp, last, skipper);
return subject.parse(temp, last, context, skipper, attr);
}
template <typename Context>
boost::spirit::info what(Context& context) const
{
return info("dont_eat", subject.what(context));
}
Subject subject;
};
}//custom
///////////////////////////////////////////////////////////////////////////
// Parser generators: make_xxx function (objects)
///////////////////////////////////////////////////////////////////////////
namespace boost { namespace spirit { namespace qi
{
template <typename Subject, typename Modifiers>
struct make_directive<custom::tag::dont_eat, Subject, Modifiers>
{
typedef custom::dont_eat_directive<Subject> result_type;
result_type operator()(unused_type, Subject const& subject, unused_type) const
{
return result_type(subject);
}
};
}}}
namespace boost { namespace spirit { namespace traits
{
///////////////////////////////////////////////////////////////////////////
template <typename Subject>
struct has_semantic_action<custom::dont_eat_directive<Subject> >
: unary_has_semantic_action<Subject> {};
///////////////////////////////////////////////////////////////////////////
template <typename Subject, typename Attribute, typename Context
, typename Iterator>
struct handles_container<custom::dont_eat_directive<Subject>, Attribute
, Context, Iterator>
: unary_handles_container<Subject, Attribute, Context, Iterator> {};
}}}
#endif
main.cpp
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include "dont_eat.hpp"
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
struct ints_type
{
std::vector<int> data;
std::string inttext;
};
struct A
{
std::string header;
ints_type ints;
};
BOOST_FUSION_ADAPT_STRUCT(
ints_type,
(std::vector<int>, data)
(std::string, inttext)
)
BOOST_FUSION_ADAPT_STRUCT(
A,
(std::string, header)
(ints_type, ints)
)
template <typename Iterator>
struct parser : qi::grammar< Iterator, A() >
{
parser() : parser::base_type(start)
{
header %= qi::lexeme[ +qi::alpha ];
ints = qi::lexeme[qi::int_ % qi::char_(",_")];
ints_string = custom::dont_eat[ints] >> qi::as_string[qi::raw[ints]];
start %= header >> ' ' >> ints_string;
}
qi::rule<Iterator, std::string()> header;
qi::rule<Iterator, std::vector<int>() > ints;
qi::rule<Iterator, ints_type() > ints_string;
qi::rule<Iterator, A()> start;
};
int main()
{
A output;
std::string input("out 1,2_3");
auto iter = input.begin();
parser<decltype(iter)> p;
bool r = qi::parse(iter, input.end(), p, output);
if( !r || iter != input.end() )
{
std::cout << "did not parse";
}
else
{
// would like output.inttext to be "1,2_3"
std::cout << output.header << ": " << output.ints.inttext << " -> [ ";
for( auto & i: output.ints.data )
std::cout << i << ' ';
std::cout << ']' << std::endl;
}
}

This directive returns a fusion::vector2<> with the subject's attribute as its first member and the string corresponding to the synthesized attribute as its second. I think this is the easiest method to reuse as long as you adapt your structs adequately. I'm not sure that this fusion::vector2<> is the best way to handle the attributes but in the limited testing I've done it has worked fine. With this directive the ints_string rule would simply be:
ints_string=custom::annotate[ints];
//or ints_string=custom::annotate[qi::lexeme[qi::int_ % qi::char_(",_")]];
Example on LWS.
annotate.hpp
#if !defined(ANNOTATE_HPP)
#define ANNOTATE_HPP
#if defined(_MSC_VER)
#pragma once
#endif
#include <boost/spirit/home/qi/meta_compiler.hpp>
#include <boost/spirit/home/qi/skip_over.hpp>
#include <boost/spirit/home/qi/parser.hpp>
#include <boost/spirit/home/support/unused.hpp>
#include <boost/spirit/home/support/common_terminals.hpp>
#include <boost/spirit/home/qi/detail/attributes.hpp>
#include <boost/spirit/home/support/info.hpp>
#include <boost/spirit/home/support/handles_container.hpp>
namespace custom
{
BOOST_SPIRIT_TERMINAL(annotate);
}
namespace boost { namespace spirit
{
///////////////////////////////////////////////////////////////////////////
// Enablers
///////////////////////////////////////////////////////////////////////////
template <>
struct use_directive<qi::domain, custom::tag::annotate> // enables annotate
: mpl::true_ {};
}}
namespace custom
{
template <typename Subject>
struct annotate_directive : boost::spirit::qi::unary_parser<annotate_directive<Subject> >
{
typedef Subject subject_type;
annotate_directive(Subject const& subject)
: subject(subject) {}
template <typename Context, typename Iterator>
struct attribute
{
typedef
boost::fusion::vector2<
typename boost::spirit::traits::attribute_of<subject_type, Context, Iterator>::type
,std::string
>
type;
};
template <typename Iterator, typename Context
, typename Skipper, typename Attribute>
bool parse(Iterator& first, Iterator const& last
, Context& context, Skipper const& skipper
, Attribute& attr) const
{
boost::spirit::qi::skip_over(first, last, skipper);
Iterator save = first;
typename boost::spirit::traits::attribute_of<subject_type, Context, Iterator>::type attr_;
if(subject.parse(first, last, context, skipper, attr_))
{
boost::spirit::traits::assign_to(attr_,boost::fusion::at_c<0>(attr));
boost::spirit::traits::assign_to(std::string(save,first),boost::fusion::at_c<1>(attr));
return true;
}
first = save;
return false;
}
template <typename Context>
boost::spirit::info what(Context& context) const
{
return info("annotate", subject.what(context));
}
Subject subject;
};
}//custom
///////////////////////////////////////////////////////////////////////////
// Parser generators: make_xxx function (objects)
///////////////////////////////////////////////////////////////////////////
namespace boost { namespace spirit { namespace qi
{
template <typename Subject, typename Modifiers>
struct make_directive<custom::tag::annotate, Subject, Modifiers>
{
typedef custom::annotate_directive<Subject> result_type;
result_type operator()(unused_type, Subject const& subject, unused_type) const
{
return result_type(subject);
}
};
}}}
namespace boost { namespace spirit { namespace traits
{
///////////////////////////////////////////////////////////////////////////
template <typename Subject>
struct has_semantic_action<custom::annotate_directive<Subject> >
: unary_has_semantic_action<Subject> {};
///////////////////////////////////////////////////////////////////////////
template <typename Subject, typename Attribute, typename Context
, typename Iterator>
struct handles_container<custom::annotate_directive<Subject>, Attribute
, Context, Iterator>
: unary_handles_container<Subject, Attribute, Context, Iterator> {};
}}}
#endif
main.cpp
#include <iostream>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include "annotate.hpp"
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
struct ints_type
{
std::vector<int> data;
std::string inttext;
};
struct A
{
std::string header;
ints_type ints;
};
BOOST_FUSION_ADAPT_STRUCT(
ints_type,
(std::vector<int>, data)
(std::string, inttext)
)
BOOST_FUSION_ADAPT_STRUCT(
A,
(std::string, header)
(ints_type, ints)
)
template <typename Iterator>
struct parser : qi::grammar< Iterator, A() >
{
parser() : parser::base_type(start)
{
header %= qi::lexeme[ +qi::alpha ];
ints = qi::lexeme[qi::int_ % qi::char_(",_")];
ints_string = custom::annotate[ints];
start %= header >> ' ' >> ints_string;
}
qi::rule<Iterator, std::string()> header;
qi::rule<Iterator, std::vector<int>() > ints;
qi::rule<Iterator, ints_type() > ints_string;
qi::rule<Iterator, A()> start;
};
int main()
{
A output;
std::string input("out 1,2_3");
auto iter = input.begin();
parser<decltype(iter)> p;
std::string annotation;
bool r = qi::parse(iter, input.end(), custom::annotate[p], output, annotation);
if( !r || iter != input.end() )
{
std::cout << "did not parse";
}
else
{
// would like output.inttext to be "1,2_3"
std::cout << "annotation: " << annotation << std::endl;
std::cout << output.header << ": " << output.ints.inttext << " -> [ ";
for( auto & i: output.ints.data )
std::cout << i << ' ';
std::cout << ']' << std::endl;
}
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Error compiling a custom container for boost spirit - c++

Related

cannot get boost::spirit parser&lexer working for token types other than std::string or int or double

Using lexer token attributes in grammar rules with Lex and Qi from Boost.Spirit

boost spirit on_error not triggered

boost spirit parse c and cpp style comments

How do I capture the original input into the synthesized output from a spirit grammar?

Categories

Resources