The string content looks like this:
20 10 5 3...
It is a list of pairs of ints. How can I use Spirit to parse it into std::vector<std::pair<int, int>>?
std::string line;
std::vector<std::pair<int, int>> v;
boost::spirit::qi::phrase_parse(
line.cbegin(),
line.cend(),
(
???
),
boost::spirit::qi::space
);
You could do a simple parser expression like *(int_ >> int_) (see the tutorial and these documentation pages).
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/std_pair.hpp>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
namespace qi = boost::spirit::qi;
int main() {
std::string line = "20 10 5 3";
std::vector<std::pair<int, int>> v;
qi::phrase_parse(line.cbegin(), line.cend(), *(qi::int_ >> qi::int_), qi::space, v);
for (auto& p : v) {
std::cout << "(" << p.first << ", " << p.second << ")\n";
}
}
Prints
(20, 10)
(5, 3)
Pro Tip 1: Validity
If you want to make sure there's no unwanted/unexpected input, check for remaining data:
check the iterators after parsing
auto f = line.cbegin(), l = line.cend();
qi::phrase_parse(f, l, *(qi::int_ >> qi::int_), qi::space, v);
if (f!=l)
std::cout << "Unparsed input '" << std::string(f,l) << "'\n";
or simply require qi::eoi as part of the parser expression and check the return value:
bool ok = qi::phrase_parse(line.cbegin(), line.cend(), *(qi::int_ >> qi::int_) >> qi::eoi, qi::space, v);
Pro Tip 2: "Look ma, no hands"
Since the grammar is trivially the simplest thing that could parse into this datastructure, you can let Spirit do all the guesswork:
qi::phrase_parse(line.begin(), line.end(), qi::auto_, qi::space, v);
That is a grammar consisting of nothing but a single qi::auto_. Output is still:
(20, 10)
(5, 3)
Related
I have a huge number of files I am trying to parse using boost::spirit::qi. Parsing is not a problem, but some of the files contain noise that I want to skip. Building a simple parser (not using boost::spirit::qi) verifies that I can avoid the noise by skipping anything that doesn't match rules at the beginning of a line. So, I'm looking for a way to write a line-based parser that skips lines when they don't match any rule.
The example below allows the grammar to skip lines if they don't match at all, but the 'junk' rule still inserts an empty instance of V(), which is unwanted behaviour.
The use of \r instead of \n in the example is intentional, as I have encountered all of \n, \r and \r\n in the files.
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phx = boost::phoenix;
using V = std::tuple<std::string, double, double, double>;
namespace client {
template <typename Iterator>
struct VGrammar : qi::grammar<Iterator, std::vector<V>(), ascii::space_type> {
VGrammar() : VGrammar::base_type(start) {
using namespace qi;
v %= string("v") > double_ > double_ > double_;
junk = +(char_ - eol);
start %= +(v | junk);
v.name("v");
junk.name("junk");
start.name("start");
using phx::val;
using phx::construct;
on_error<fail>(
start,
std::cout
<< val("Error! Expecting \n\n'")
<< qi::_4
<< val("'\n\n here: \n\n'")
<< construct<std::string>(qi::_3, qi::_2)
<< val("'")
<< std::endl
);
//debug(v);
//debug(junk);
//debug(start);
}
qi::rule<Iterator> junk;
//qi::rule<Iterator, qi::unused_type()> junk; // Doesn't work either
//qi::rule<Iterator, qi::unused_type(), qi::unused_type()> junk; // Doesn't work either
qi::rule<Iterator, V(), ascii::space_type> v;
qi::rule<Iterator, std::vector<V>(), ascii::space_type> start;
};
} // namespace client
int main(int argc, char* argv[]) {
using iterator_type = std::string::const_iterator;
std::string input = "";
input += "v 1 2 3\r"; // keep v 1 2 3
input += "o a b c\r"; // parse as junk
input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
input += " v 10 11 12\r\r"; // parse as junk
iterator_type iter = input.begin();
const iterator_type end = input.end();
std::vector<V> parsed_output;
client::VGrammar<iterator_type> v_grammar;
std::cout << "run" << std::endl;
bool r = phrase_parse(iter, end, v_grammar, ascii::space, parsed_output);
std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;
if (r && (iter == end)) {
BOOST_FOREACH(V const& v_row, parsed_output) {
std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
}
}
return EXIT_SUCCESS;
}
Here's the output from the example:
run
done ... r: true, iter==end: true
v, 1, 2, 3
, 0, 0, 0
v, 4, 5, 6
v, 7, 8, 9
v, 10, 11, 12
And here is what I actually want the parser to return.
run
done ... r: true, iter==end: true
v, 1, 2, 3
v, 4, 5, 6
My main problem right now is to keep the 'junk' rule from adding an empty V() object. How do I accomplish this? Or am I overthinking the problem?
I have tried adding lit(junk) to the start rule, since lit() doesn't return anything, but this will not compile. It fails with: "static assertion failed: error_invalid_expression".
I have also tried to set the semantic action on the junk rule to qi::unused_type() but the rule still creates an empty V() in that case.
I am aware of the following questions, but they don't address this particular issue. I have tried out the comment skipper earlier, but it looks like I'll have to reimplement all the parse rules in the skipper in order to identify noise. My example is inspired by the solution in the last link:
How to skip line/block/nested-block comments in Boost.Spirit?
How to parse entries followed by semicolon or newline (boost::spirit)?
Version info:
Linux debian 4.9.0-7-amd64 #1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13) x86_64 GNU/Linux
g++ (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
#define BOOST_VERSION 106200
and:
Linux raspberrypi 4.14.24-v7+ #1097 SMP Mon Mar 5 16:42:05 GMT 2018 armv7l GNU/Linux
g++ (Raspbian 4.9.2-10+deb8u1) 4.9.2
#define BOOST_VERSION 106200
For those who wonder: yes I'm trying to parse files similar to Wavefront OBJ files and I'm aware that there is already a bunch of parsers available. However, the data I'm parsing is part of a larger data structure which also requires parsing, so it does make sense to build a new parser.
What you want to achieve is called error recovery.
Unfortunately, Spirit does not have a nice way of doing it (some internal design decisions also make it hard to implement externally). However, in your case it is simple to achieve by rewriting the grammar.
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phx = boost::phoenix;
using V = std::tuple<std::string, double, double, double>;
namespace client {
template <typename Iterator>
struct VGrammar : qi::grammar<Iterator, std::vector<V>()> {
VGrammar() : VGrammar::base_type(start) {
using namespace qi;
v = skip(blank)[no_skip[string("v")] > double_ > double_ > double_];
junk = +(char_ - eol);
start = (v || -junk) % eol;
v.name("v");
junk.name("junk");
start.name("start");
using phx::val;
using phx::construct;
on_error<fail>(
start,
std::cout
<< val("Error! Expecting \n\n'")
<< qi::_4
<< val("'\n\n here: \n\n'")
<< construct<std::string>(qi::_3, qi::_2)
<< val("'")
<< std::endl
);
//debug(v);
//debug(junk);
//debug(start);
}
qi::rule<Iterator> junk;
//qi::rule<Iterator, qi::unused_type()> junk; // Doesn't work either
//qi::rule<Iterator, qi::unused_type(), qi::unused_type()> junk; // Doesn't work either
qi::rule<Iterator, V()> v;
qi::rule<Iterator, std::vector<V>()> start;
};
} // namespace client
int main(int argc, char* argv[]) {
using iterator_type = std::string::const_iterator;
std::string input = "";
input += "v 1 2 3\r"; // keep v 1 2 3
input += "o a b c\r"; // parse as junk
input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
input += " v 10 11 12\r\r"; // parse as junk
iterator_type iter = input.begin();
const iterator_type end = input.end();
std::vector<V> parsed_output;
client::VGrammar<iterator_type> v_grammar;
std::cout << "run" << std::endl;
bool r = parse(iter, end, v_grammar, parsed_output);
std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;
if (r && (iter == end)) {
BOOST_FOREACH(V const& v_row, parsed_output) {
std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
}
}
return EXIT_SUCCESS;
}
You wrote: "I have tried adding lit(junk) to the start rule, since lit() doesn't return anything, but this will not compile. It fails with: 'static assertion failed: error_invalid_expression'."
What you're looking for would be omit[junk], but it should make no difference because it will still make the synthesized attribute optional<>.
Fixing Things
First of all, you need newlines to be significant. Which means you cannot skip space. Because it eats newlines. What's worse, you need leading whitespace to be significant as well (to junk that last line, e.g.). You cannot even use qi::blank for the skipper then. (See Boost spirit skipper issues).
To still allow whitespace inside the v rule, use a local skipper (one that doesn't eat newlines):
v %= &lit("v") >> skip(blank) [ string("v") > double_ > double_ > double_ ];
It engages the skipper only after establishing that there was no unexpected leading whitespace.
Note that the string("v") is a bit redundant this way, but that brings us to the second motive:
Second of all, I'm with you in avoiding semantic actions. However, this means you have to make your rules reflect your data structures.
In this particular instance, it means you should probably turn the line skipping a bit inside-out. What if you express the grammar as a straight repeat of v, interspersed with /whatever/, instead of just /newline/? I'd write that like:
junk = *(char_ - eol);
other = !v >> junk;
start = *(v >> junk >> eol % other);
Note that
the delimiter expression now uses the operator% (list operator) itself: (eol % other). What this cleverly accomplishes is that it keeps eating newlines as long as they are only delimited by "other" lines (anything !v at this point).
other is more constrained than junk, because junk may eat v, whereas other makes sure that never happens
therefore v >> junk allows the third line of your sample to be correctly processed (the line that has v 4 5 6 v 7 8 9\r)
Now it all works:
run
done ... r: true, iter==end: true
v, 1, 2, 3
v, 4, 5, 6
Perfecting It
You might be aware of the fact that this does not handle the case when the first line(s) are not v lines. Let's add that case to the sample and make sure it works as well:
Live On Coliru:
//#define BOOST_SPIRIT_DEBUG
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
using V = std::tuple<std::string, double, double, double>;
namespace client {
template <typename Iterator>
struct VGrammar : qi::grammar<Iterator, std::vector<V>()> {
VGrammar() : VGrammar::base_type(start) {
using namespace qi;
v %= &lit("v") >> skip(blank) [ string("v") > double_ > double_ > double_ ];
junk = *(char_ - eol);
other = !v >> junk;
start =
other >> eol % other >>
*(v >> junk >> eol % other);
BOOST_SPIRIT_DEBUG_NODES((v)(junk)(start))
on_error<fail>(
start,
std::cout
<< phx::val("Error! Expecting \n\n'") << qi::_4
<< "'\n\n here: \n\n'" << phx::construct<std::string>(qi::_3, qi::_2)
<< "'\n"
);
}
private:
qi::rule<Iterator> other, junk;
qi::rule<Iterator, V()> v;
qi::rule<Iterator, std::vector<V>()> start;
};
} // namespace client
int main() {
using iterator_type = std::string::const_iterator;
std::string input = "";
input += "o a b c\r"; // parse as junk
input += "v 1 2 3\r"; // keep v 1 2 3
input += "o a b c\r"; // parse as junk
input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
input += " v 10 11 12\r\r"; // parse as junk
iterator_type iter = input.begin();
const iterator_type end = input.end();
std::vector<V> parsed_output;
client::VGrammar<iterator_type> v_grammar;
std::cout << "run" << std::endl;
bool r = parse(iter, end, v_grammar, parsed_output);
std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;
if (iter != end)
std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
if (r) {
BOOST_FOREACH(V const& v_row, parsed_output) {
std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
}
}
return EXIT_SUCCESS;
}
I want to parse the following data list.
N; data1, data2 .... dataN;
example: "100000; 1, 2, 3, 4, 5, ... 100000;" (A very large list)
Simple parsing:
auto rule = qi::int_ >> qi::lit(';') >> qi::int_ % ',' >> qi::lit(';');
However, in this case I think memory gets reallocated roughly log2(N) times, because of the way std::vector grows.
I think that this can be avoided by the following method.
int size;
std::vector<int> v;
qi::phrase_parse(itr, end, qi::int_ >> qi::lit(';'), qi::space, size);
v.reserve(size); // reserve memory
qi::phrase_parse(itr, end, qi::int_ % ',' >> qi::lit(';'), qi::space, v);
Is there a way to give hints on memory allocation for vectors within a single rule like this? For example, something like qi::repeat(N). Or is there a technique to avoid reallocating vector memory?
Thanks for the help in advance.
Yes. You can reserve in an action.
Better yet, do it in an epsilon argument so that you don't lose automatic attribute propagation.
Proof of concept: Live On Coliru
Updated: Extended the demo. It turns out Phoenix already has functors for reserve and capacity, as well as size.
Note that
now the reserve is a semantic action,
the rule uses %= to enable automatic attribute propagation still,
and then (ironically?) uses qi::omit to prevent the first int_ attribute from being inserted into the container as well
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;
int main() {
using Attr = std::vector<int>;
using It = std::string::const_iterator;
qi::rule<It, Attr(), qi::space_type> rule;
rule %= qi::omit[qi::int_ [ px::reserve(qi::_val, qi::_1) ] >> ';' ]
>> (qi::eps(px::size(qi::_val) < px::capacity(qi::_val)) >> qi::int_) % ','
>> ';'
;
for (std::string const input : {
"42; 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41;", })
{
It f = begin(input), l = end(input);
Attr data;
std::cout << "Capacity before: " << data.capacity() << "\n";
if (phrase_parse(f, l, rule, qi::space, data))
std::cout << "Parsed: " << data.size() << " elements ";
else
std::cout << "Parse failed at '" << std::string(f,l) << "' ";
if (f != l)
std::cout << "Remaining: '" << std::string(f,l) << "'";
std::cout << '\n';
std::cout << "Capacity after: " << data.capacity() << "\n";
}
}
Prints
Capacity before: 0
Parsed: 42 elements
Capacity after: 42
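As a rough illustration of what the reserve buys you, here is a standard-library-only sketch. count_reallocations is a hypothetical helper, not part of the demo above; it counts how often capacity() changes while n elements are appended. How many reallocations happen without reserve depends on the implementation's growth factor, so no specific number is claimed for that case:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n ints and counts how often capacity() changes
// while appending. With reserve_first set, storage is reserved up front, which
// is what the semantic action in the rule does with the leading count.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first)
        v.reserve(n);

    std::size_t reallocations = 0;
    auto capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != capacity) { // storage was reallocated
            ++reallocations;
            capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(100000, true) yields 0, since push_back never has to grow a vector whose capacity already covers n; with false the count is greater than zero.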
My grammar has various entries which start with a generic name.
After I determined the type I would like to use the expectation operator in order to create parsing errors.
rule1=name >> (type1 > something);
rule2=name >> (type2 > something);
I already figured out that I cannot mix the two operators > and >>, which is why I added the parentheses. My guess is that the parentheses cause a tuple to be created.
How do I access the elements of the tuple in the semantic action?
The following is certainly wrong but should clarify what I want to accomplish.
rule1=(name >> (type1 > something))[qi::_val = boost::phoenix::bind(
create,
qi::_1,
std::get<0>(qi::_2),
std::get<1>(qi::_2))];
thanks
Directly addressing the question:
using px::at_c;
rule1 = (name >> (type1 > something)) [_val = px::bind(create, _1, at_c<0>(_2), at_c<1>(_2))];
However, I'd use this little trick with qi::eps to avoid the complexity:
rule2 = (name >> type1 >> (eps > something)) [_val = px::bind(create, _1, _2, _3)];
Finally, look at boost::phoenix::function<>:
px::function<decltype(&create)> create_(create); // or just decltype(create) if it's a function object
rule3 = (name >> type1 >> (eps > something)) [_val = create_(_1, _2, _3)];
That way you can even have readable code!
DEMO
Just to prove that all three have the same behaviour¹
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/at_c.hpp>
#include <cassert>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;
static int create(char n, char t, char s) {
assert(n=='n' && t=='t' && s=='s');
return 42;
}
int main() {
using It = std::string::const_iterator;
// fake rules just for demo
qi::rule<It, char()>
name = qi::char_("n"),
type1 = qi::char_("t"),
something = qi::char_("s");
//using boost::fusion::at_c;
qi::rule<It, int(), qi::space_type> rule1, rule2, rule3;
{
using namespace qi;
using px::at_c;
rule1 = (name >> (type1 > something)) [_val = px::bind(create, _1, at_c<0>(_2), at_c<1>(_2))];
rule2 = (name >> type1 >> (eps > something)) [_val = px::bind(create, _1, _2, _3)];
px::function<decltype(&create)> create_(create); // or just decltype(create) if it's a function object
rule3 = (name >> type1 >> (eps > something)) [_val = create_(_1, _2, _3)];
}
for(auto& parser : { rule1, rule2, rule3 }) {
for(std::string const input : { "n t s", "n t !" }) {
std::cout << "Input: '" << input << "'\n";
auto f = input.begin(), l = input.end();
int data;
try {
bool ok = qi::phrase_parse(f, l, parser, qi::space, data);
if (ok) {
std::cout << "Parsing result: " << data << '\n';
} else {
std::cout << "Parsing failed\n";
}
} catch(qi::expectation_failure<It> const& e) {
std::cout << "Expectation failure: " << e.what() << " at '" << std::string(e.first, e.last) << "'\n";
}
if (f!=l) {
std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
std::cout << "-------------------------------------------\n";
}
}
}
Which prints 3x the same output:
Input: 'n t s'
Parsing result: 42
-------------------------------------------
Input: 'n t !'
Expectation failure: boost::spirit::qi::expectation_failure at '!'
Remaining unparsed: 'n t !'
-------------------------------------------
Input: 'n t s'
Parsing result: 42
-------------------------------------------
Input: 'n t !'
Expectation failure: boost::spirit::qi::expectation_failure at '!'
Remaining unparsed: 'n t !'
-------------------------------------------
Input: 'n t s'
Parsing result: 42
-------------------------------------------
Input: 'n t !'
Expectation failure: boost::spirit::qi::expectation_failure at '!'
Remaining unparsed: 'n t !'
-------------------------------------------
¹ PS: let this serve as an example of how to create an SSCCE code example in your questions
I'm pretty new to boost::spirit. I would like to parse a string of comma-separated objects into a std::vector (similar to the tutorials). The elements could be of different types (known at compile time): integers, like "1,2,3", strings like "Apple, Orange, Banana", etc.
I would like to have a unified interface for all types.
If I parse a single element I can use the auto_ expression.
Is it possible to have a similar interface with vectors?
Can I define a rule that, given a template parameter, can actually parse this vector?
Here is a simple sample code (which does not compile due to the last call to phrase_parse):
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <iostream>
#include <vector>
#include <boost/spirit/include/qi_auto.hpp>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phoenix = boost::phoenix;
using qi::auto_;
using qi::phrase_parse;
using ascii::space;
using phoenix::push_back;
int main()
{
std::string line1 = "3";
std::string line2 = "1, 2, 3";
int v;
std::vector<int> vector;
typedef std::string::iterator stringIterator;
stringIterator first = line1.begin();
stringIterator last = line1.end();
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
bool r1 = qi::phrase_parse( first,
last,
qi::auto_,
ascii::space,
v );
first = line2.begin();
last = line2.end();
//The following call is wrong!
bool r2 = qi::phrase_parse( first,
last,
// Begin grammar
(
qi::auto_[push_back(phoenix::ref(vector), qi::_1)]
>> *(',' >> qi::auto_[push_back(phoenix::ref(vector),qi::_1)])
),
// End grammar
ascii::space,
vector);
return 0;
}
UPDATE
I found a solution, in the case the size of the vector is known before parsing. On the other hand I cannot use the syntax *( ',' >> qi::auto_ ).
#include <boost/spirit/include/qi.hpp>
namespace qi = boost::spirit::qi;
int main()
{
std::string s = "1, 2, 3";
std::vector<int> vector;
//This works
qi::phrase_parse(s.begin(), s.end(), qi::auto_ >> ',' >> qi::auto_ >> ',' >> qi::auto_ , qi::blank, vector);
//This does not compile
qi::phrase_parse(s.begin(), s.end(), qi::auto_ >> *( ',' >> qi::auto_ ) , qi::blank, vector);
for(int i = 0; i < vector.size() ; i++)
std::cout << i << ": " << vector[i] << std::endl;
return 0;
}
Moreover, using auto_ I cannot parse a string. Is it possible to define a template function where the grammar is deduced from the template parameter?
template< typename T >
void MyParse(std::string& line, std::vector<T> vec)
{
qi::phrase_parse( line.begin(),
line.end(),
/*
How do I define a grammar based on T
such as:
double_ >> *( ',' >> double_ ) for T = double
+qi::alnum >> *( ',' >> +qi::alnum ) for T = std::string
*/,
qi::blank,
vec);
}
auto_ has support for container attributes out of the box:
// Note: qi::phrase_match needs <boost/spirit/include/qi_match.hpp> (plus <sstream> for the stream)
std::istringstream iss("1 2 3 4 5; 6 7 8 9;");
iss.unsetf(std::ios::skipws);
std::vector<int> i;
std::vector<double> d;
if (iss >> qi::phrase_match(qi::auto_ >> ";" >> qi::auto_, qi::space, i, d))
{
for (auto e:i) std::cout << "int: " << e << "\n";
for (auto e:d) std::cout << "double: " << e << "\n";
}
Prints
int: 1
int: 2
int: 3
int: 4
int: 5
double: 6
double: 7
double: 8
double: 9
So you could basically write your template function by using ',' as the skipper. I'd prefer the operator% variant though.
Simple Take
template<typename Container>
void MyParse(std::string const& line, Container& container)
{
auto f(line.begin()), l(line.end());
bool ok = qi::phrase_parse(
f, l,
qi::auto_ % ',', qi::blank, container);
if (!ok || (f!=l))
throw "parser error: '" + std::string(f,l) + "'"; // FIXME
}
Variant 2
template<typename Container>
void MyParse(std::string const& line, Container& container)
{
auto f(line.begin()), l(line.end());
bool ok = qi::phrase_parse(
f, l,
qi::auto_, qi::blank | ',', container);
if (!ok || (f!=l))
throw "parser error: '" + std::string(f,l) + "'"; // FIXME
}
Solving the string case (and others):
If the element type is not 'deducible' by Spirit (anything could be parsed into a string), just take an optional parser/grammar that knows how to parse the element type?
template<typename Container, typename ElementParser = qi::auto_type>
void MyParse(std::string const& line, Container& container, ElementParser const& elementParser = ElementParser())
{
auto f(line.begin()), l(line.end());
bool ok = qi::phrase_parse(
f, l,
elementParser % ",", qi::blank, container);
if (!ok || (f!=l))
throw "parser error: '" + std::string(f,l) + "'"; // FIXME
}
Now, it parses strings just fine:
std::vector<int> i;
std::set<std::string> s;
MyParse("1,22,33,44,15", i);
MyParse("1,22,33,44,15", s, *~qi::char_(","));
for(auto e:i) std::cout << "i: " << e << "\n";
for(auto e:s) std::cout << "s: " << e << "\n";
Prints
i: 1
i: 22
i: 33
i: 44
i: 15
s: 1
s: 15
s: 22
s: 33
s: 44
Full Listing
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace qi = boost::spirit::qi;
template<typename Container, typename ElementParser = qi::auto_type>
void MyParse(std::string const& line, Container& container, ElementParser const& elementParser = ElementParser())
{
auto f(line.begin()), l(line.end());
bool ok = qi::phrase_parse(
f, l,
elementParser % ",", qi::blank, container);
if (!ok || (f!=l))
throw "parser error: '" + std::string(f,l) + "'"; // FIXME
}
#include <set>
int main()
{
std::vector<int> i;
std::set<std::string> s;
MyParse("1,22,33,44,15", i);
MyParse("1,22,33,44,15", s, *~qi::char_(","));
for(auto e:i) std::cout << "i: " << e << "\n";
for(auto e:s) std::cout << "s: " << e << "\n";
}
Hello everybody I am new to boost and boost::spirit, so I am sorry for noob question.
When I use the qi::phrase_parse function, it returns only a bool that indicates whether parsing succeeded, but I don't know where I can find the result of parsing, e.g. some sort of syntax tree.
If I use the macro #define BOOST_SPIRIT_DEBUG, an XML representation of the tree is printed on standard output, so these nodes have to be stored somewhere. Can you help me please?
You can 'bind' attribute references. qi::parse, qi::phrase_parse (and related) accept variadic arguments that will be used to receive the exposed attributes.
A simplistic example is: (EDIT included a utree example too)
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_utree.hpp>
#include <iostream>
#include <string>
#include <utility>
#include <vector>
namespace qi = boost::spirit::qi;
int main()
{
using namespace qi;
std::string input("1 2 3 4 5");
std::string::const_iterator F(input.begin()), f(F), l(input.end());
std::vector<int> ints;
if (qi::phrase_parse(f = F, l, *qi::int_, qi::space, ints))
std::cout << ints.size() << " ints parsed\n";
int i;
std::string s;
// it is variadic:
if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> +qi::char_, i, s))
std::cout << "i: " << i << ", s: " << s << '\n';
std::pair<int, std::string> data;
// any compatible sequence can be used:
if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> +qi::char_, data))
std::cout << "first: " << data.first << ", second: " << data.second << '\n';
// using utree:
boost::spirit::utree tree;
if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> qi::as_string [ +qi::char_ ], tree))
std::cout << "tree: " << tree << '\n';
}
Outputs:
5 ints parsed
i: 3, s: 4 5
first: 3, second: 4 5
tree: ( 3 " 4 5" )
A few samples of parsers with 'AST' like data structures:
Boolean expression (grammar) parser in c++
Boost::Spirit Expression Parser
If you want to have a very generic AST structure, look at utree: http://www.boost.org/doc/libs/1_50_0/libs/spirit/doc/html/spirit/support/utree.html