Hello everybody, I am new to Boost and boost::spirit, so I am sorry for the noob question.
When I use the qi::phrase_parse function, it returns only a bool that indicates whether parsing succeeded, but I don't know where to find the result of the parse: some sort of syntax tree, etc.
If I define the macro BOOST_SPIRIT_DEBUG (#define BOOST_SPIRIT_DEBUG), an XML representation of the tree is printed to standard output, so these nodes have to be stored somewhere. Can you help me please?
You can 'bind' attribute references. qi::parse, qi::phrase_parse (and related) accept variadic arguments that will be used to receive the exposed attributes.
A simplistic example is: (EDIT included a utree example too)
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_utree.hpp>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

namespace qi = boost::spirit::qi;

int main()
{
    using namespace qi;

    std::string input("1 2 3 4 5");
    std::string::const_iterator F(input.begin()), f(F), l(input.end());

    std::vector<int> ints;
    if (qi::phrase_parse(f = F, l, *qi::int_, qi::space, ints))
        std::cout << ints.size() << " ints parsed\n";

    int i;
    std::string s;
    // it is variadic:
    if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> +qi::char_, i, s))
        std::cout << "i: " << i << ", s: " << s << '\n';

    std::pair<int, std::string> data;
    // any compatible sequence can be used:
    if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> +qi::char_, data))
        std::cout << "first: " << data.first << ", second: " << data.second << '\n';

    // using utree:
    boost::spirit::utree tree;
    if (qi::parse(f = F, l, "1 2 " >> qi::int_ >> qi::as_string [ +qi::char_ ], tree))
        std::cout << "tree: " << tree << '\n';
}
Outputs:
5 ints parsed
i: 3, s:  4 5
first: 3, second:  4 5
tree: ( 3 " 4 5" )
A few samples of parsers with AST-like data structures:
Boolean expression (grammar) parser in c++
Boost::Spirit Expression Parser
If you want to have a very generic AST structure, look at utree: http://www.boost.org/doc/libs/1_50_0/libs/spirit/doc/html/spirit/support/utree.html
Related
I have a huge number of files I am trying to parse using boost::spirit::qi. Parsing is not a problem, but some of the files contain noise that I want to skip. Building a simple parser (not using boost::spirit::qi) verifies that I can avoid the noise by skipping anything that doesn't match a rule at the beginning of a line. So, I'm looking for a way to write a line-based parser that skips lines that do not match any rule.
The example below allows the grammar to skip lines if they don't match at all, but the 'junk' rule still inserts an empty instance of V(), which is unwanted behaviour.
The use of \r instead of \n in the example is intentional, as I have encountered \n, \r and \r\n in the files.
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>

namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phx = boost::phoenix;

using V = std::tuple<std::string, double, double, double>;

namespace client {
    template <typename Iterator>
    struct VGrammar : qi::grammar<Iterator, std::vector<V>(), ascii::space_type> {
        VGrammar() : VGrammar::base_type(start) {
            using namespace qi;

            v %= string("v") > double_ > double_ > double_;
            junk = +(char_ - eol);
            start %= +(v | junk);

            v.name("v");
            junk.name("junk");
            start.name("start");

            using phx::val;
            using phx::construct;

            on_error<fail>(
                start,
                std::cout
                    << val("Error! Expecting \n\n'")
                    << qi::_4
                    << val("'\n\n here: \n\n'")
                    << construct<std::string>(qi::_3, qi::_2)
                    << val("'")
                    << std::endl
            );

            //debug(v);
            //debug(junk);
            //debug(start);
        }

        qi::rule<Iterator> junk;
        //qi::rule<Iterator, qi::unused_type()> junk; // Doesn't work either
        //qi::rule<Iterator, qi::unused_type(), qi::unused_type()> junk; // Doesn't work either
        qi::rule<Iterator, V(), ascii::space_type> v;
        qi::rule<Iterator, std::vector<V>(), ascii::space_type> start;
    };
} // namespace client

int main(int argc, char* argv[]) {
    using iterator_type = std::string::const_iterator;

    std::string input = "";
    input += "v 1 2 3\r"; // keep v 1 2 3
    input += "o a b c\r"; // parse as junk
    input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
    input += " v 10 11 12\r\r"; // parse as junk

    iterator_type iter = input.begin();
    const iterator_type end = input.end();
    std::vector<V> parsed_output;
    client::VGrammar<iterator_type> v_grammar;

    std::cout << "run" << std::endl;
    bool r = phrase_parse(iter, end, v_grammar, ascii::space, parsed_output);
    std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;

    if (r && (iter == end)) {
        BOOST_FOREACH(V const& v_row, parsed_output) {
            std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
        }
    }

    return EXIT_SUCCESS;
}
Here's the output from the example:
run
done ... r: true, iter==end: true
v, 1, 2, 3
, 0, 0, 0
v, 4, 5, 6
v, 7, 8, 9
v, 10, 11, 12
And here is what I actually want the parser to return.
run
done ... r: true, iter==end: true
v, 1, 2, 3
v, 4, 5, 6
My main problem right now is to keep the 'junk' rule from adding an empty V() object. How do I accomplish this? Or am I overthinking the problem?
I have tried adding lit(junk) to the start rule, since lit() doesn't return anything, but this will not compile. It fails with: "static assertion failed: error_invalid_expression".
I have also tried to set the semantic action on the junk rule to qi::unused_type() but the rule still creates an empty V() in that case.
I am aware of the following questions, but they don't address this particular issue. I have tried out the comment skipper earlier, but it looks like I'll have to reimplement all the parse rules in the skipper in order to identify noise. My example is inspired by the solution in the last link:
How to skip line/block/nested-block comments in Boost.Spirit?
How to parse entries followed by semicolon or newline (boost::spirit)?
Version info:
Linux debian 4.9.0-7-amd64 #1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13) x86_64 GNU/Linux
g++ (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
#define BOOST_VERSION 106200
and:
Linux raspberrypi 4.14.24-v7+ #1097 SMP Mon Mar 5 16:42:05 GMT 2018 armv7l GNU/Linux
g++ (Raspbian 4.9.2-10+deb8u1) 4.9.2
#define BOOST_VERSION 106200
For those who wonder: yes I'm trying to parse files similar to Wavefront OBJ files and I'm aware that there is already a bunch of parsers available. However, the data I'm parsing is part of a larger data structure which also requires parsing, so it does make sense to build a new parser.
What you want to achieve is called error recovery.
Unfortunately, Spirit does not have a nice way of doing it (there are also some internal design decisions which make it hard to implement externally). However, in your case it is simple to achieve by rewriting the grammar.
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>

namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phx = boost::phoenix;

using V = std::tuple<std::string, double, double, double>;

namespace client {
    template <typename Iterator>
    struct VGrammar : qi::grammar<Iterator, std::vector<V>()> {
        VGrammar() : VGrammar::base_type(start) {
            using namespace qi;

            v = skip(blank)[no_skip[string("v")] > double_ > double_ > double_];
            junk = +(char_ - eol);
            start = (v || -junk) % eol;

            v.name("v");
            junk.name("junk");
            start.name("start");

            using phx::val;
            using phx::construct;

            on_error<fail>(
                start,
                std::cout
                    << val("Error! Expecting \n\n'")
                    << qi::_4
                    << val("'\n\n here: \n\n'")
                    << construct<std::string>(qi::_3, qi::_2)
                    << val("'")
                    << std::endl
            );

            //debug(v);
            //debug(junk);
            //debug(start);
        }

        qi::rule<Iterator> junk;
        //qi::rule<Iterator, qi::unused_type()> junk; // Doesn't work either
        //qi::rule<Iterator, qi::unused_type(), qi::unused_type()> junk; // Doesn't work either
        qi::rule<Iterator, V()> v;
        qi::rule<Iterator, std::vector<V>()> start;
    };
} // namespace client

int main(int argc, char* argv[]) {
    using iterator_type = std::string::const_iterator;

    std::string input = "";
    input += "v 1 2 3\r"; // keep v 1 2 3
    input += "o a b c\r"; // parse as junk
    input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
    input += " v 10 11 12\r\r"; // parse as junk

    iterator_type iter = input.begin();
    const iterator_type end = input.end();
    std::vector<V> parsed_output;
    client::VGrammar<iterator_type> v_grammar;

    std::cout << "run" << std::endl;
    bool r = parse(iter, end, v_grammar, parsed_output);
    std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;

    if (r && (iter == end)) {
        BOOST_FOREACH(V const& v_row, parsed_output) {
            std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
        }
    }

    return EXIT_SUCCESS;
}
I have tried adding lit(junk) to the start rule, since lit() doesn't return anything, but this will not compile. It fails with: "static assertion failed: error_invalid_expression".
What you're looking for would be omit[junk], but it should make no difference because it will still make the synthesized attribute optional<>.
Fixing Things
First of all, you need newlines to be significant, which means you cannot skip space, because it eats newlines. What's worse, you need leading whitespace to be significant as well (to junk that last line, e.g.), so you cannot even use qi::blank for the skipper. (See Boost spirit skipper issues).
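The difference between the two character classes can be illustrated with the standard <cctype> predicates, which classify characters along the same lines as qi::space and qi::blank. This is only an analogy under that assumption, not Spirit's implementation, and both helper names are made up for the sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Number of leading characters a "space"-style skipper would consume:
// spaces, tabs, and also line endings such as '\r' and '\n'.
std::size_t skipped_by_space(const std::string& s) {
    std::size_t n = 0;
    while (n < s.size() && std::isspace(static_cast<unsigned char>(s[n])))
        ++n;
    return n;
}

// Number of leading characters a "blank"-style skipper would consume:
// only spaces and tabs, so it stops at the first line ending.
std::size_t skipped_by_blank(const std::string& s) {
    std::size_t n = 0;
    while (n < s.size() && std::isblank(static_cast<unsigned char>(s[n])))
        ++n;
    return n;
}
```

For the input " \r v", the space-style loop runs across the line ending and into the next line, while the blank-style loop stops in front of '\r'. That is why a space skipper makes line boundaries invisible to the grammar.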
To still allow whitespace inside the v rule, use a local skipper (one that doesn't eat newlines):
v %= &lit("v") >> skip(blank) [ string("v") > double_ > double_ > double_ ];
It engages the skipper only after establishing that there was no unexpected leading whitespace.
Note that the string("v") is a bit redundant this way, but that brings us to the second motive:
Second of all, I'm with you in avoiding semantic actions. However, this means you have to make your rules reflect your data structures.
In this particular instance, it means you should probably turn the line skipping a bit inside-out. What if you express the grammar as a straight repeat of v, interspersed with /whatever/, instead of just /newline/? I'd write that like:
junk = *(char_ - eol);
other = !v >> junk;
start = *(v >> junk >> eol % other);
Note that
the delimiter expression now uses the operator% (list operator) itself: (eol % other). What this cleverly accomplishes is that it keeps eating newlines as long as they are only delimited by "other" lines (anything !v at this point).
other is more constrained than junk, because junk may eat v, whereas other makes sure that never happens
therefore v >> junk allows the third line of your sample to be correctly processed (the line that has v 4 5 6 v 7 8 9\r)
Now it all works: Live On Coliru:
run
done ... r: true, iter==end: true
v, 1, 2, 3
v, 4, 5, 6
Perfecting It
You might be aware of the fact that this does not handle the case when the first line(s) are not v lines. Let's add that case to the sample and make sure it works as well:
Live On Coliru:
//#define BOOST_SPIRIT_DEBUG
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>
#include <boost/foreach.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/include/std_tuple.hpp>

namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;

using V = std::tuple<std::string, double, double, double>;

namespace client {
    template <typename Iterator>
    struct VGrammar : qi::grammar<Iterator, std::vector<V>()> {
        VGrammar() : VGrammar::base_type(start) {
            using namespace qi;

            v %= &lit("v") >> skip(blank) [ string("v") > double_ > double_ > double_ ];
            junk = *(char_ - eol);
            other = !v >> junk;

            start =
                other >> eol % other >>
                *(v >> junk >> eol % other);

            BOOST_SPIRIT_DEBUG_NODES((v)(junk)(start))

            on_error<fail>(
                start,
                std::cout
                    << phx::val("Error! Expecting \n\n'") << qi::_4
                    << "'\n\n here: \n\n'" << phx::construct<std::string>(qi::_3, qi::_2)
                    << "'\n"
            );
        }

      private:
        qi::rule<Iterator> other, junk;
        qi::rule<Iterator, V()> v;
        qi::rule<Iterator, std::vector<V>()> start;
    };
} // namespace client

int main() {
    using iterator_type = std::string::const_iterator;

    std::string input = "";
    input += "o a b c\r"; // parse as junk
    input += "v 1 2 3\r"; // keep v 1 2 3
    input += "o a b c\r"; // parse as junk
    input += "v 4 5 6 v 7 8 9\r"; // keep v 4 5 6, but parse v 7 8 9 as junk
    input += " v 10 11 12\r\r"; // parse as junk

    iterator_type iter = input.begin();
    const iterator_type end = input.end();
    std::vector<V> parsed_output;
    client::VGrammar<iterator_type> v_grammar;

    std::cout << "run" << std::endl;
    bool r = parse(iter, end, v_grammar, parsed_output);
    std::cout << "done ... r: " << (r ? "true" : "false") << ", iter==end: " << ((iter == end) ? "true" : "false") << std::endl;

    if (iter != end)
        std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";

    if (r) {
        BOOST_FOREACH(V const& v_row, parsed_output) {
            std::cout << std::get<0>(v_row) << ", " << std::get<1>(v_row) << ", " << std::get<2>(v_row) << ", " << std::get<3>(v_row) << std::endl;
        }
    }

    return EXIT_SUCCESS;
}
I want to parse the following data list.
N; data1, data2 .... dataN;
example: "100000; 1, 2, 3, 4, 5, ... 100000;" (A very large list)
Simple parsing:
auto rule = qi::int_ >> qi::lit(';') >> qi::int_ % ',' >> qi::lit(';');
However, in this case I think that on the order of log2(N) memory reallocations occur, given the way std::vector typically grows its capacity as elements are appended.
I think that this can be avoided as follows:
int size;
std::vector<int> v;
qi::phrase_parse(itr, end, qi::int_ >> qi::lit(';'), qi::space, size);
v.reserve(size); // reserve memory
qi::phrase_parse(itr, end, qi::int_ % ',' >> qi::lit(';'), qi::space, v);
Is there a way to give a hint about memory allocation for the vector within a single rule like this, for example something like qi::repeat(N)? Or is there another technique to avoid reallocating the vector's memory?
Thanks for the help in advance.
Yes. You can reserve in an action.
Better yet, do it in an epsilon argument so that you don't lose automatic attribute propagation.
Proof of concept: Live On Coliru
Updated: Extended the demo. It turns out Phoenix already has functors for reserve and capacity, as well as size.
Note that
now the reserve is a semantic action,
the rule uses %= to enable automatic attribute propagation still,
and then (ironically?) uses qi::omit to prevent the first int_ attribute from being inserted into the container as well
Live On Coliru
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;

int main() {
    using Attr = std::vector<int>;
    using It = std::string::const_iterator;
    qi::rule<It, Attr(), qi::space_type> rule;

    rule %= qi::omit[qi::int_ [ px::reserve(qi::_val, qi::_1) ] >> ';' ]
         >> (qi::eps(px::size(qi::_val) < px::capacity(qi::_val)) >> qi::int_) % ','
         >> ';'
         ;

    for (std::string const input : {
            "42; 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41;", })
    {
        It f = begin(input), l = end(input);
        Attr data;
        std::cout << "Capacity before: " << data.capacity() << "\n";

        if (phrase_parse(f, l, rule, qi::space, data))
            std::cout << "Parsed: " << data.size() << " elements ";
        else
            std::cout << "Parse failed at '" << std::string(f,l) << "' ";

        if (f != l)
            std::cout << "Remaining: '" << std::string(f,l) << "'";

        std::cout << '\n';
        std::cout << "Capacity after: " << data.capacity() << "\n";
    }
}
Prints
Capacity before: 0
Parsed: 42 elements
Capacity after: 42
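The benefit of reserving up front can be checked with the standard library alone. The sketch below uses a hypothetical helper, count_reallocations, that appends n integers and counts how often the capacity changes; the exact count without reserve depends on the implementation's growth policy, so only "zero" versus "more than zero" should be relied on.

```cpp
#include <cstddef>
#include <vector>

// Appends n ints to a fresh vector and counts how many times the capacity
// changed along the way, i.e. how many reallocations took place.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first)
        v.reserve(n);  // one allocation up front

    std::size_t reallocations = 0;
    std::size_t cap = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != cap) {  // capacity grew: storage was reallocated
            ++reallocations;
            cap = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, no reallocation happens during the loop, because reserve(n) guarantees room for at least n elements. Without it, the vector has to grow repeatedly, which is the cost the reserve action in the rule above avoids.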
I am trying to get the current line of the file I am parsing using boost spirit. I created a grammar class and structures to parse my commands into. I would also like to keep track of which line each command was found on and store that in my structures as well. I have wrapped my istream file iterator in a multi_pass iterator and then wrapped that in a boost::spirit::classic::position_iterator2. In the rules of my grammar, how would I get the current position of the iterator, or is this not possible?
Update: It is similar to that problem but I just need to be able to keep a count of all the lines processed. I don't need to do all of the extra buffering that was done in the solution.
Keeping a count of all lines processed is not nearly the same as "getting the current line".
Simple Take
If this is what you need, just check it after the parse:
Live On Wandbox
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <fstream>
#include <iostream>
#include <set>
#include <string>

namespace qi = boost::spirit::qi;

int main() {
    using It = boost::spirit::istream_iterator;
    std::ifstream ifs("main.cpp");

    boost::spirit::line_pos_iterator<It> f(It(ifs >> std::noskipws)), l;

    std::set<std::string> words;
    if (qi::phrase_parse(f, l, *qi::lexeme[+qi::graph], qi::space, words)) {
        std::cout << "Parsed " << words.size() << " words";
        if (!words.empty())
            std::cout << " (from '" << *words.begin() << "' to '" << *words.rbegin() << "')";
        std::cout << "\nLast line processed: " << boost::spirit::get_line(f) << "\n";
    }
}
Prints
Parsed 50 words (from '"' to '}')
Last line processed: 22
Slightly More Complex Take
If you say "no, wait, I really did want to get the current line /while parsing/", the real full monty is here:
boost::spirit access position iterator from semantic actions
Here's the completely trimmed down version using iter_pos:
Live On Wandbox
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <boost/spirit/repository/include/qi_iter_pos.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <map>
#include <string>

namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;

using LineNum = std::size_t;

struct line_number_f {
    template <typename It> LineNum operator()(It it) const { return get_line(it); }
};

static boost::phoenix::function<line_number_f> line_number_;

int main() {
    using Underlying = boost::spirit::istream_iterator;
    using It = boost::spirit::line_pos_iterator<Underlying>;

    qi::rule<It, LineNum()> line_no = qr::iter_pos [ qi::_val = line_number_(qi::_1) ];

    std::ifstream ifs("main.cpp");
    It f(Underlying{ifs >> std::noskipws}), l;

    std::multimap<LineNum, std::string> words;
    if (qi::phrase_parse(f, l, +(line_no >> qi::lexeme[+qi::graph]), qi::space, words)) {
        std::cout << "Parsed " << words.size() << " words.\n";

        if (!words.empty()) {
            auto& first = *words.begin();
            std::cout << "First word: '" << first.second << "' (in line " << first.first << ")\n";
            auto& last = *words.rbegin();
            std::cout << "Last word: '" << last.second << "' (in line " << last.first << ")\n";
        }

        std::cout << "Line 20 contains:\n";
        auto p = words.equal_range(20);
        for (auto it = p.first; it != p.second; ++it)
            std::cout << " - '" << it->second << "'\n";
    }
}
Printing:
Parsed 166 words.
First word: '#include' (in line 1)
Last word: '}' (in line 46)
Line 20 contains:
- 'int'
- 'main()'
- '{'
I want to check a file for all enums (this is just an MCVE, so nothing complicated), and the names of the enums should be stored in a std::vector. I build my parsers like this:
auto const any = x3::rule<class any_id, const x3::unused_type>{"any"}
               = ~x3::space;
auto const identifier = x3::rule<class identifier_id, std::string>{"identifier"}
                      = x3::lexeme[x3::char_("A-Za-z_") >> *x3::char_("A-Za-z_0-9")];
auto const enum_finder = x3::rule<class enum_finder_id, std::vector<std::string>>{"enum_finder"}
                       = *(("enum" >> identifier) | any);
When I try to parse a string with this enum_finder into a std::vector, the std::vector also contains a lot of empty strings.
Why is this parser also parsing empty strings into the vector?
I've assumed you want to parse "enum " out of free form text ignoring whitespaces.
What you really want is for ("enum" >> identifier | any) to synthesize an optional<string>. Sadly, what you get is variant<string, unused_type> or somesuch.
The same happens when you wrap any with x3::omit[any] - it's still the same unused_type.
Plan B: Since you're really just parsing repeated enum-ids separated by "anything", why not use the list operator:
("enum" >> identifier) % any
This works a little. Now some tweaking: let's avoid eating "any" character by character. In fact, we can likely just consume whole whitespace-delimited words (note that +~space is equivalent to +graph):
auto const any = x3::rule<class any_id>{"any"}
               = x3::lexeme [+x3::graph];
Next, to allow for multiple bogus words to be accepted in a row there's the trick to make the list's subject parser optional:
-("enum" >> identifier) % any;
This parses correctly. See a full demo:
DEMO
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace x3 = boost::spirit::x3;

namespace parser {
    using namespace x3;

    auto any = lexeme [+~space];
    auto identifier = lexeme [char_("A-Za-z_") >> *char_("A-Za-z_0-9")];
    auto enum_finder = -("enum" >> identifier) % any;
}

int main() {
    for (std::string input : {
            "",
            " ",
            "bogus",
            "enum one",
            "enum one enum two",
            "enum one bogus bogus more bogus enum two !##!##Yay",
        })
    {
        auto f = input.begin(), l = input.end();
        std::cout << "------------ parsing '" << input << "'\n";

        std::vector<std::string> data;
        if (phrase_parse(f, l, parser::enum_finder, x3::space, data))
        {
            std::cout << "parsed " << data.size() << " elements:\n";
            for (auto& el : data)
                std::cout << "\t" << el << "\n";
        } else {
            std::cout << "Parse failure\n";
        }

        if (f!=l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
}
Prints:
------------ parsing ''
parsed 0 elements:
------------ parsing ' '
parsed 0 elements:
------------ parsing 'bogus'
parsed 0 elements:
------------ parsing 'enum one'
parsed 1 elements:
one
------------ parsing 'enum one enum two'
parsed 1 elements:
one
------------ parsing 'enum one bogus bogus more bogus enum two !##!##Yay'
parsed 2 elements:
one
two
Using boost spirit, I'd like to extract a string that is followed by some data in parentheses. The relevant string is separated by a space from the opening parenthesis. Unfortunately, the string itself may contain spaces. I'm looking for a concise solution that returns the string without a trailing space.
The following code illustrates the problem:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <string>
#include <iostream>

namespace qi = boost::spirit::qi;
using std::string;
using std::cout;
using std::endl;

void
test_input(const string &input)
{
    string::const_iterator b = input.begin();
    string::const_iterator e = input.end();
    string parsed;
    bool const r = qi::parse(b, e,
        *(qi::char_ - qi::char_("(")) >> qi::lit("(Spirit)"),
        parsed
    );
    if(r) {
        cout << "PASSED:" << endl;
    } else {
        cout << "FAILED:" << endl;
    }
    cout << " Parsed: \"" << parsed << "\"" << endl;
    cout << " Rest: \"" << string(b, e) << "\"" << endl;
}

int main()
{
    test_input("Fine (Spirit)");
    test_input("Hello, World (Spirit)");
    return 0;
}
Its output is:
PASSED:
Parsed: "Fine "
Rest: ""
PASSED:
Parsed: "Hello, World "
Rest: ""
With this simple grammar, the extracted string is always followed by a space (which I'd like to eliminate).
The solution should work within Spirit since this is only part of a larger grammar. (Thus, it would probably be clumsy to trim the extracted strings after parsing.)
Thank you in advance.
Like the comment said, in the case of a single space, you can just hard code it. If you need to be more flexible or tolerant:
I'd use a skipper with raw to "cheat" the skipper for your purposes:
bool const r = qi::phrase_parse(b, e,
    qi::raw [ *(qi::char_ - qi::char_("(")) ] >> qi::lit("(Spirit)"),
    qi::space,
    parsed
);
This works, and prints
PASSED:
Parsed: "Fine"
Rest: ""
PASSED:
Parsed: "Hello, World"
Rest: ""
See it Live on Coliru
Full program for reference:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <string>
#include <iostream>

namespace qi = boost::spirit::qi;
using std::string;
using std::cout;
using std::endl;

void
test_input(const string &input)
{
    string::const_iterator b = input.begin();
    string::const_iterator e = input.end();
    string parsed;
    bool const r = qi::phrase_parse(b, e,
        qi::raw [ *(qi::char_ - qi::char_("(")) ] >> qi::lit("(Spirit)"),
        qi::space,
        parsed
    );
    if(r) {
        cout << "PASSED:" << endl;
    } else {
        cout << "FAILED:" << endl;
    }
    cout << " Parsed: \"" << parsed << "\"" << endl;
    cout << " Rest: \"" << string(b, e) << "\"" << endl;
}

int main()
{
    test_input("Fine (Spirit)");
    test_input("Hello, World (Spirit)");
    return 0;
}