I have the following code:
std::string test("1.1");
std::pair<int, int> d;
bool r = qi::phrase_parse(
test.begin(),
test.end(),
qi::int_ >> '.' >> qi::int_,
space,
d
);
So I'm trying to parse the string test and place the result in the std::pair d. However, it is not working; I suspect it has to do with the Compound Attribute Rules.
Any hints on how to get this working?
The compiler error is the following:
error: no matching function for call to 'std::pair::pair(const int&)'
It should work. What people forget very often is to add a
#include <boost/fusion/include/std_pair.hpp>
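to their list of includes. This is necessary to make std::pair a full-blown Fusion citizen, so that Spirit can treat it as a two-element sequence and assign each parsed int to its own member.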
Related
I'm trying to make a JSON parser but my object rule doesn't compile...
Code (complete code here):
// AST
using Object = std::map<std::string, struct Value>; // (Value is a variant which can contain a float, a string, an Object or an Array)
// Grammar def
using ObjectType = x3::rule<struct ObjectClass, Object>;
const ObjectType obj{"object"};
const auto obj_def = '{' > ((quotedString > ':' > val) % ',') > '}';
Error (complete error here):
/usr/include/boost/spirit/home/x3/support/traits/container_traits.hpp:77:56: error: no type named 'value_type' in 'std::pair<std::basic_string<char>, Json::Value>'
: detail::remove_value_const<typename Container::value_type>
~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
The type std::pair<std::basic_string<char>, Json::Value> is correct, but the attribute should be a container of such pairs (std::vector<std::pair<std::basic_string<char>, Json::Value>>, i.e. effectively std::map<std::basic_string<char>, Json::Value>).
What is the problem?
Your diagnosis is off the mark. You can just eliminate rules and defs until you find the culprit. The obj_def is the culprit, which you can confirm by commenting it out:
const auto obj_def = x3::eps; // '{' > ((quotedString > ':' > val) % ',') > '}';
In your grammardef.hpp you need to include
#include <boost/fusion/adapted.hpp>
so that Fusion knows how to deal with std::pair<std::string, Json::Value>.
This has been a FAQ entry since the early days of Spirit V2 (http://boost-spirit.com/home/articles/qi-example/parsing-a-list-of-key-value-pairs-using-spirit-qi/).
Also, bear in mind that some implementations will expect properties to be ordered (this is not actually specified) and you might want to check against duplicate keys (especially after normalizing unicode escapes).
I want to create a parser that will match exactly two alphanumeric words from a string, such as:
message1 message2
and then save that into two variables of type std::string.
I've read this previous answer which seems to work for an endless amount of repetitions, which uses the following parser:
+qi::alnum % +qi::space
However when I try to do this:
bool const result = qi::phrase_parse(
input.begin(), input.end(),
+qi::alnum >> +qi::alnum,
+qi::space,
words
);
the words vector contains every single letter in a different string:
't'
'h'
'i'
's'
'i'
's'
This is extremely counter-intuitive, and I'm not sure as to why it's happening. Could someone please explain that?
Also, can I have two predefined strings to be populated instead of a std::vector?
Final note: I would like to avoid the using statement, as I would like to have every namespace clearly defined to help me understand how Spirit works.
Yes, but the skipper ignores the whitespace before you can act on it.
Use lexeme to control the skipper:
bool const result = qi::phrase_parse(
input.begin(), input.end(),
qi::lexeme [+qi::alnum] >> qi::lexeme [+qi::alnum],
qi::space,
words
);
Note the skipper should be qi::space instead of +qi::space.
See also Boost spirit skipper issues
I am using Spirit Qi as my parser, to parse mathematical expressions into an expression tree. I keep track of such things as the types of the symbols which are encountered as I parse, and which must be declared in the text I am parsing. Namely, I am parsing Bertini input files, a simple-ish example of which is here, a complicated example is here, and for completeness purposes, as below:
%input: our first input file
variable_group x,y;
function f,g;
f = x^2 - 1;
g = y^2 - 4;
END;
The grammar I have been working on will ideally
find declaration statements, and then parse the following comma-separated list of symbols of the type being declared, and store the resulting vector of symbols in the class object being parsed into; e.g. variable_group x, y;
find a previously declared symbol, which is followed by an equals sign, and is the definition of that symbol as an evaluatable mathematical object; e.g. f = x^2 - 1; This part I mostly have under control.
find a not-previously declared symbol followed by =, and parse it as a subfunction. I think I can handle this, too.
The problem I have been struggling to solve seems like it is so trivial, yet after hours of searching, I still haven't gotten there. I've read dozens of Boost Spirit mailing list posts, SO posts, the manual, and the headers for Spirit themselves, yet still don't quite grok a few critical things about Spirit Qi parsing.
Here is the problematic basic grammar definition, which would go in system_parser.hpp:
#define BOOST_SPIRIT_USE_PHOENIX_V3 1
#include <boost/spirit/include/qi_core.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
template<typename Iterator>
struct SystemParser : qi::grammar<Iterator, std::vector<std::string>(), boost::spirit::ascii::space_type>
{
SystemParser() : SystemParser::base_type(variable_group_)
{
namespace phx = boost::phoenix;
using qi::_1;
using qi::_val;
using qi::eps;
using qi::lit;
qi::symbols<char,int> encountered_variables;
qi::symbols<char,int> declarative_symbols;
declarative_symbols.add("variable_group",0);
// wraps the vector between its appropriate declaration and line termination.
BOOST_SPIRIT_DEBUG_NODE(variable_group_);
debug(variable_group_);
variable_group_.name("variable_group_");
variable_group_ %= lit("variable_group") >> genericvargp_ >> lit(';');
// creates a vector of strings
BOOST_SPIRIT_DEBUG_NODE(genericvargp_);
debug(genericvargp_);
genericvargp_.name("genericvargp_");
genericvargp_ %= new_variable_ % ',';
// will in the future make a shared pointer to an object using the string
BOOST_SPIRIT_DEBUG_NODE(new_variable_);
debug(new_variable_);
new_variable_.name("new_variable_");
new_variable_ %= unencountered_symbol_;
// this rule gets a string.
BOOST_SPIRIT_DEBUG_NODE(unencountered_symbol_);
debug(unencountered_symbol_);
unencountered_symbol_.name("unencountered_symbol");
unencountered_symbol_ %= valid_variable_name_ - ( encountered_variables | declarative_symbols);
// get a string which fits the naming rules.
BOOST_SPIRIT_DEBUG_NODE(valid_variable_name_);
valid_variable_name_.name("valid_variable_name_");
valid_variable_name_ %= +qi::alpha >> *(qi::alnum | qi::char_('_') | qi::char_('[') | qi::char_(']') );
}
// rule declarations. these are member variables for the parser.
qi::rule<Iterator, std::vector<std::string>(), ascii::space_type > variable_group_;
qi::rule<Iterator, std::vector<std::string>(), ascii::space_type > genericvargp_;
qi::rule<Iterator, std::string(), ascii::space_type> new_variable_;
qi::rule<Iterator, std::string(), ascii::space_type > unencountered_symbol_;// , ascii::space_type
// the rule which determines valid variable names
qi::rule<Iterator, std::string()> valid_variable_name_;
};
and some code which uses it:
#include "system_parsing.hpp"
int main(int argc, char** argv)
{
std::vector<std::string> V;
std::string str = "variable_group x, y, z;";
std::string::const_iterator iter = str.begin();
std::string::const_iterator end = str.end();
SystemParser<std::string::const_iterator> S;
bool s = phrase_parse(iter, end, S, boost::spirit::ascii::space, V);
std::cout << "the unparsed string:\n" << std::string(iter,end);
return 0;
}
It compiles under Clang 4.9.x on OSX just fine. When I run it, I get:
Assertion failed: (px != 0), function operator->, file /usr/local/include/boost/smart_ptr/shared_ptr.hpp, line 648.
Alternately, if I use expectation operator > rather than >> in the definition of the variable_group_ rule, I get our dear old friend Segmentation fault: 11.
In my learning process, I've come across such excellent posts as how to tell the type spirit is trying to generate, attribute propagation, how to interact with symbols, an example of infinite left recursion which led to a segfault, information on parsing into classes, not structs which has a link to using Customization points (yet the links contain no examples), the Nabialek trick which couples keywords to actions, and, perhaps most relevant for what I am trying to do, dynamic difference parsing, which is certainly something I need since the set of symbols grows, and I disallow usage of them as another type later, as the set of already-encountered symbols starts empty, and grows -- that is, the rules for parsing are dynamic.
So here's where I am at. My current problem is the assert/segfault generated by this particular example. However, I am unclear on some things, and need guiding advice, which I just haven't put together from any of the sources I have consulted, and the request for which hopefully makes this SO question disjoint from others previously asked:
When is it appropriate to use lexeme? I just don't know when to use lexeme, and when not to.
What are some guidelines for when to use > rather than >>?
I've seen many Fusion adapt examples where there is a struct to be parsed into, and a set of rules to do so. My input files will possibly have multiple occurrences of declarations of function, variables, etc, which all need to go the same place, so I need to be able to add to fields of the terminal class object into which I am parsing, in any order, multiple times. I think I would like to use getter/setters for the class object, so that parsing is not the only pathway to object construction. Is this a problem?
Any kind advice for this beginner is most welcome.
You reference the symbols variables. But they are locals so they don't exist once the constructor returns. This invokes Undefined Behaviour. Anything can happen.
Make the symbol tables members of the class.
I also simplified the dance around:
the skippers (see Boost spirit skipper issues). That link also answers your "When is it appropriate to use lexeme[]?" question. In your sample you lacked the lexeme[] around encountered_variables|declarative_symbols, for example.
the debug macros
the operator%=, and some generally unused stuff
the symbols<> declarations: guessing you didn't need the mapped type (because the int wasn't consumed), I simplified the initialization there
Demo
Live On Coliru
#define BOOST_SPIRIT_USE_PHOENIX_V3 1
#define BOOST_SPIRIT_DEBUG 1
#include <boost/spirit/include/qi_core.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
template <typename Iterator, typename Skipper = ascii::space_type>
struct SystemParser : qi::grammar<Iterator, std::vector<std::string>(), Skipper> {
SystemParser() : SystemParser::base_type(variable_group_)
{
declarative_symbols += "variable_group";
variable_group_ = "variable_group" >> genericvargp_ >> ';';
genericvargp_ = new_variable_ % ',';
valid_variable_name_ = qi::alpha >> *(qi::alnum | qi::char_("_[]"));
unencountered_symbol_ = valid_variable_name_ - (encountered_variables|declarative_symbols);
new_variable_ = unencountered_symbol_;
BOOST_SPIRIT_DEBUG_NODES((variable_group_) (valid_variable_name_) (unencountered_symbol_) (new_variable_) (genericvargp_))
}
private:
qi::symbols<char, qi::unused_type> encountered_variables, declarative_symbols;
// rule declarations. these are member variables for the parser.
qi::rule<Iterator, std::vector<std::string>(), Skipper> variable_group_;
qi::rule<Iterator, std::vector<std::string>(), Skipper> genericvargp_;
qi::rule<Iterator, std::string()> new_variable_;
qi::rule<Iterator, std::string()> unencountered_symbol_; // , Skipper
// the rule which determines valid variable names
qi::rule<Iterator, std::string()> valid_variable_name_;
};
//#include "system_parsing.hpp"
int main() {
using It = std::string::const_iterator;
std::string const str = "variable_group x, y, z;";
SystemParser<It> S;
It iter = str.begin(), end = str.end();
std::vector<std::string> V;
bool s = phrase_parse(iter, end, S, boost::spirit::ascii::space, V);
if (s)
{
std::cout << "Parse succeeded: " << V.size() << "\n";
for (auto& s : V)
std::cout << " - '" << s << "'\n";
}
else
std::cout << "Parse failed\n";
if (iter!=end)
std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
}
Prints
Parse succeeded: 3
- 'x'
- 'y'
- 'z'
I'm trying to define some Boost::spirit::qi parsers for multiple subsets of a language with minimal code duplication. To do this, I created a few basic rule building functions. The original parser works fine, but once I started to use the composing functions, my parsers no longer seem to work.
The general language is of the form:
A B: C
There are subsets of the language where A, B, or C must be specific types, such as A is an int while B and C are floats. Here is the parser I used for that sub language:
using entry = boost::tuple<int, float, float>;
template <typename Iterator>
struct sublang : grammar<Iterator, entry(), ascii::space_type>
{
sublang() : sublang::base_type(start)
{
start = int_ >> float_ >> ':' >> float_;
}
rule<Iterator, entry(), ascii::space_type> start;
};
But since there are many subsets, I tried to create a function to build my parser rules:
template<typename AttrName, typename Value>
auto attribute(AttrName attrName, Value value)
{
return attrName >> ':' >> value;
}
So that I could build parsers for each subset more easily without duplicate information:
// in sublang
start = int_ >> attribute(float_, float_);
This fails however and I'm not sure why. In my clang testing, parsing just fails. In g++, it seems the program crashes.
Here's the full example code: http://coliru.stacked-crooked.com/a/8636f19b2e9bff8d
What is wrong with the current code and what would be the correct approach for this problem? I would like to avoid specifying the grammar of attributes and other elements in each sublanguage parser.
Quite simply: using auto with Spirit (or any EDSL based on Boost Proto and Boost Phoenix) is most likely Undefined Behaviour¹
Now, you can usually fix this using
BOOST_SPIRIT_AUTO
boost::proto::deep_copy
the new facility that's coming in the most recent version of Boost (TODO add link)
In this case,
template<typename AttrName, typename Value>
auto attribute(AttrName attrName, Value value) {
return boost::proto::deep_copy(attrName >> ':' >> value);
}
fixes it: Live On Coliru
Alternatively
you could use qi::lazy[] with inherited attributes.
I do very similar things in the prop_key rule in Reading JSON file with C++ and BOOST.
you could have a look at the Keyword List Operator from the Spirit Repository. It's designed to allow easier construction of grammars like:
no_constraint_person_rule %=
kwd("name")['=' > parse_string ]
/ kwd("age") ['=' > int_]
/ kwd("size") ['=' > double_ > 'm']
;
This you could potentially combine with the Nabialek Trick. I'd search the answers on SO for examples. (One is Grammar balancing issue)
¹ Except for entirely stateless actors (Eric Niebler on this) and expression placeholders. See e.g.
Assigning parsers to auto variables
undefined behaviour somewhere in boost::spirit::qi::phrase_parse
C++ Boost qi recursive rule construction
boost spirit V2 qi bug associated with optimization level
Some examples
Define parsers parameterized with sub-parsers in Boost Spirit
Generating Spirit parser expressions from a variadic list of alternative parser expressions
I have a simple grammar consisting of mixed variables ($(name)) and variable-value pairs ($(name:value)). I have a hand-coded recursive parser, but am interested in using it as an exercise to learn Spirit, which I'll need for more complex grammars eventually(/soon).
Anyway, the set of possible forms I'm working with (simplified from the full grammar) is:
$(variable) // Uses simple look-up, recursion and inline replace
$(name:value) // Inserts a new variable into the local lookup table
My current rules look something like:
typedef std::map<std::string, std::string> dictionary;
template <typename Iterator>
bool parse_vars(Iterator first, Iterator last, dictionary & vars, std::string & output)
{
using qi::phrase_parse;
using qi::_1;
using ascii::char_;
using ascii::string;
using ascii::space;
using phoenix::insert;
dictionary statevars;
typedef qi::rule<Iterator, std::string()> string_rule;
typedef qi::rule<Iterator, std::pair<std::string, std::string>()> pair_rule;
string_rule state = string >> ':' >> string; // Error 3
pair_rule variable =
(
char_('$') >> '(' >>
(
state[insert(phoenix::ref(statevars), _1)] |
string[output += vars[_1]] // Error 1, will eventually need to recurse
) >> ')'
); // Error 2
bool result = phrase_parse
(
first, last,
(
variable % ','
),
space
);
return result;
}
If it wasn't obvious, I have no idea how Spirit works and the docs have everything but actual explanations, so this is about an hour of throwing examples together.
The part I particularly question is the leading char_('$') in the variable rule, but removing it causes a shift operator error (the compiler interprets '$' >> '(' as a right-shift).
When compiling, I get errors related to the state rule, particularly creating the pair, and the lookup:
1. error C2679: binary '[' : no operator found which takes a right-hand operand of type 'const boost::spirit::_1_type' (or there is no acceptable conversion)
2. error C2512: 'boost::spirit::qi::rule::rule' : no appropriate default constructor available
Changing the lookup (vars[_1]) to a simple += gives:
3. error C2665: 'boost::spirit::char_class::classify::is' : none of the 15 overloads could convert all the argument types
Error 1 seems to relate to the type (attribute?) of the _1 placeholder, but that should be a string, and is when used for printing or concatenation to the output string. 2 appears to be noise caused by 1.
Error 3, digging down the stack of template errors, seems to relate to not being able to turn the state rule into a pair, which seems odd as it almost exactly matches one of the rules from this example.
How can I modify the variable rule to properly handle both input forms?
A few things to note:
To adapt std::pair (so you can use it with maps) you should include (at least)
#include <boost/fusion/adapted/std_pair.hpp>
It looks like you are trying to create a symbol table. You could use qi::symbols for that.
Avoid mixing output generation with parsing; it complicates matters unduly.
I haven't 'fixed' all of the above (due to lack of context), but I'd be happy to help out with any other questions arising from those.
Here is a fixed version of the code that stays pretty close to the OP. Edit: I have tested it now; output below:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <iostream>
#include <map>
#include <string>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
typedef std::map<std::string, std::string> dictionary;
template <typename Iterator, typename Skipper = qi::space_type>
struct parser : qi::grammar<Iterator, Skipper>
{
parser(dictionary& statevars, std::string& output) : parser::base_type(start)
{
using namespace qi;
using phx::insert;
with_initializer = +~char_(":)") >> ':' >> *~char_(")");
simple = +~char_(")");
variable =
"$(" >> (
with_initializer [ insert(phx::ref(statevars), qi::_1) ]
| simple [ phx::ref(output) += phx::ref(statevars)[_1] ]
) >> ')';
start = variable % ',';
BOOST_SPIRIT_DEBUG_NODE(start);
BOOST_SPIRIT_DEBUG_NODE(variable);
BOOST_SPIRIT_DEBUG_NODE(simple);
BOOST_SPIRIT_DEBUG_NODE(with_initializer);
}
private:
qi::rule<Iterator, std::pair<std::string, std::string>(), Skipper> with_initializer;
qi::rule<Iterator, std::string(), Skipper> simple;
qi::rule<Iterator, Skipper> variable;
qi::rule<Iterator, Skipper> start;
};
template <typename Iterator>
bool parse_vars(Iterator &first, Iterator last, dictionary & vars, std::string & output)
{
parser<Iterator> p(vars, output);
return qi::phrase_parse(first, last, p, qi::space);
}
int main()
{
const std::string input = "$(name:default),$(var),$(name)";
std::string::const_iterator f(input.begin());
std::string::const_iterator l(input.end());
std::string output;
dictionary table;
if (!parse_vars(f,l,table,output))
std::cerr << "oops\n";
if (f!=l)
std::cerr << "Unparsed: '" << std::string(f,l) << "'\n";
std::cout << "Output: '" << output << "'\n";
}
Output:
Output: 'default'
You have to have char_('$'), otherwise both operands of >> are plain char. At least one operand must be a Spirit type for the overloaded operator >> to be selected.
You may also need to use _1 from Phoenix.
Also take a look at:
http://boost-spirit.com/home/articles/qi-example/parsing-a-list-of-key-value-pairs-using-spirit-qi/