Get current line in boost spirit grammar - c++

I am trying to get the current line of the file I am parsing using boost spirit. I created a grammar class and my structures to parse my commands into. I would also like to keep track of which line the command was found on and parse that into my structures as well. I have wrapped my istream file iterator in a multi_pass iterator and then wrapped that in a boost::spirit::classic::position_iterator2. In my rules of my grammar how would I get the current position of the iterator or is this not possible?
Update: It is similar to that problem but I just need to be able to keep a count of all the lines processed. I don't need to do all of the extra buffering that was done in the solution.

Update: It is similar to that problem but I just need to be able to keep a count of all the lines processed. I don't need to do all of the extra buffering that was done in the solution.
Keeping a count of all lines processed is not nearly the same as "getting the current line".
Simple Take
If this is what you need, just check it after the parse:
Live On Wandbox
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <fstream>
#include <set>
namespace qi = boost::spirit::qi;
int main() {
using It = boost::spirit::istream_iterator;
std::ifstream ifs("main.cpp");
boost::spirit::line_pos_iterator<It> f(It(ifs >> std::noskipws)), l;
std::set<std::string> words;
if (qi::phrase_parse(f, l, *qi::lexeme[+qi::graph], qi::space, words)) {
std::cout << "Parsed " << words.size() << " words";
if (!words.empty())
std::cout << " (from '" << *words.begin() << "' to '" << *words.rbegin() << "')";
std::cout << "\nLast line processed: " << boost::spirit::get_line(f) << "\n";
}
}
Prints
Parsed 50 words (from '"' to '}')
Last line processed: 22
Slightly More Complex Take
If you say "no, wait, I really did want to get the current line /while parsing/". The real full monty is here:
boost::spirit access position iterator from semantic actions
Here's the completely trimmed down version using iter_pos:
Live On Wandbox
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <boost/spirit/repository/include/qi_iter_pos.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <fstream>
#include <map>
namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;
using LineNum = size_t;
struct line_number_f {
template <typename It> LineNum operator()(It it) const { return get_line(it); }
};
static boost::phoenix::function<line_number_f> line_number_;
int main() {
using Underlying = boost::spirit::istream_iterator;
using It = boost::spirit::line_pos_iterator<Underlying>;
qi::rule<It, LineNum()> line_no = qr::iter_pos [ qi::_val = line_number_(qi::_1) ];
std::ifstream ifs("main.cpp");
It f(Underlying{ifs >> std::noskipws}), l;
std::multimap<LineNum, std::string> words;
if (qi::phrase_parse(f, l, +(line_no >> qi::lexeme[+qi::graph]), qi::space, words)) {
std::cout << "Parsed " << words.size() << " words.\n";
if (!words.empty()) {
auto& first = *words.begin();
std::cout << "First word: '" << first.second << "' (in line " << first.first << ")\n";
auto& last = *words.rbegin();
std::cout << "Last word: '" << last.second << "' (in line " << last.first << ")\n";
}
std::cout << "Line 20 contains:\n";
auto p = words.equal_range(20);
for (auto it = p.first; it != p.second; ++it)
std::cout << " - '" << it->second << "'\n";
}
}
Printing:
Parsed 166 words.
First word: '#include' (in line 1)
Last word: '}' (in line 46)
Line 20 contains:
- 'int'
- 'main()'
- '{'

Related

How to parse two strings using boost::spirit?

I am still trying to wrap my head around Boost::Spirit.
I want to parse two words into a variable. When I can do that, into a struct.
The single word compiles, the Variable doesn't. Why?
#include <boost/spirit/include/qi.hpp>
#include <boost/tuple/tuple.hpp>
#include <string>
#include <iostream>
using namespace boost::spirit;
/*
class Syntax : public qi::parser{
};
*/
int main()
{
//get user input
std::string input;
std::getline(std::cin, input);
auto it = input.begin();
bool result;
//define grammar for a single word
auto word_grammar = +qi::alnum - qi::space;
std::string singleWord;
result = qi::parse(
it, input.end(),
word_grammar,
singleWord
);
if(!result){
std::cout << "Failed to parse a word" << '\n';
return -1;
}
std::cout << "\"" << singleWord << "\"" << '\n';
//Now parse two words into a variable
std::cout << "Variable:\n";
typedef boost::tuple<std::string, std::string> Variable;
Variable variable;
auto variable_grammar = word_grammar >> word_grammar;
result = qi::parse(
it, input.end(),
variable_grammar,
variable
);
if(!result){
std::cout << "Failed to parse a variable" << '\n';
return -1;
}
std::cout << "\"" << variable.get<0>() << "\" \"" << variable.get<1>() << "\"" << '\n';
//now parse a list of variables
std::cout << "List of Variables:\n";
std::list<Variable> variables;
result = qi::parse(
it, input.end(),
variable_grammar % +qi::space,
variable
);
if(!result){
std::cout << "Failed to parse a list of variables" << '\n';
return -1;
}
for(auto var : variables)
std::cout << "DataType: " << var.get<0>() << ", VariableName: " << var.get<1>() << '\n';
}
In the end I want to parse something like this:
int a
float b
string name
Templates are nice, but when problems occur the error messages are just not human readable (thus no point in posting them here).
I am using the gcc
Sorry to take so long. I've been building a new web server in a hurry and had much to learn.
Here is what it looks like in X3. I think it is easier to deal with than qi. And then, I've used it a lot more. But then qi is much more mature, richer. That said, x3 is meant to be adaptable, hackable. So you can make it do just about anything you want.
So, live on coliru
#include <string>
#include <iostream>
#include <vector>
#include <boost/spirit/home/x3.hpp>
#include <boost/tuple/tuple.hpp>
//as pointed out, for the error 'The parser expects tuple-like attribute type'
#include <boost/fusion/adapted/boost_tuple.hpp>
//our declarations
using Variable = boost::tuple<std::string, std::string>;
using Vector = std::vector<Variable>;
namespace parsers {
using namespace boost::spirit::x3;
auto const word = lexeme[+char_("a-zA-Z")];
//note, using 'space' as the stock skipper
auto const tuple = word >> word;
}
std::ostream& operator << (std::ostream& os, /*const*/ Variable& obj) {
return os << obj.get<0>() << ' ' << obj.get<1>();
}
std::ostream& operator << (std::ostream& os, /*const*/ Vector& obj) {
for (auto& item : obj)
os << item << " : ";
return os;
}
template<typename P, typename A>
bool test_parse(std::string in, P parser, A& attr) {
auto begin(in.begin());
bool r = phrase_parse(begin, in.end(), parser, boost::spirit::x3::space, attr);
std::cout << "result:\n " << attr << std::endl;
return r;
}
int main()
{
//not recomended but this is testing stuff
using namespace boost::spirit::x3;
using namespace parsers;
std::string input("first second third forth");
//parse one word
std::string singleWord;
test_parse(input, word, singleWord);
//parse two words into a variable
Variable variable;
test_parse(input, tuple, variable);
//parse two sets of two words
Vector vector;
test_parse(input, *tuple, vector);
}
You may like this form of testing. You can concentrate on testing parsers without a lot of extra code. It makes it easier down the road to keep your basic parsers in their own namespace. Oh yea, x3 compiles much faster than qi!
The single word compiles, the Variable doesn't. Why?
There are missing two #includes:
#include <boost/fusion/adapted/boost_tuple.hpp>
#include <boost/spirit/include/qi_list.hpp>

boost::spirit::qi::phrase_parser() into std::map error

The below code is to parse a "key=val;.." string into std::map and it fails to compile with the error:
Error C2146 : syntax error: missing '>' before identifier 'value_type'
Error C2039 : 'value_type': is not a member of 'std::pair,std::allocator,std::basic_string,std::allocator>>' c:\git\risk-engine-core_tcp\stage\boost-1.66.0-barclays-1\include\boost\spirit\home\support\container.hpp
It does not like the last parameter, "contents" (std::map), passed as a container.
Boost version is 1.66
namespace qi = boost::spirit::qi;
std::map<std::string,std::string> contents;
std::string::iterator first = str.begin();
std::string::iterator last = str.end();
const bool result = qi::phrase_parse(first,last,
*( *(qi::char_-"=") >> qi::lit("=") >> *(qi::char_-";") >> -qi::lit(";") ),
ascii::space, contents);
Looking at the boost docs and stack overflow, I do not see any issue with the above code.
Did you include
#include <boost/fusion/adapted/std_pair.hpp>
Here's a working example with some improvement suggestions:
Live On Coliru
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/spirit/include/qi.hpp>
#include <map>
#include <iomanip> // std::quoted
namespace qi = boost::spirit::qi;
int main() {
std::string str("key = value");
std::string::const_iterator first = str.begin();
std::string::const_iterator last = str.end();
std::map<std::string, std::string> contents;
bool const result = qi::phrase_parse(first,last,
*( *~qi::char_('=') >> '=' >> *~qi::char_(';') >> -qi::lit(';') ),
qi::ascii::space, contents);
if (result) {
std::cout << "Parsed " << contents.size() << " elements\n";
for (auto& [k,v] : contents) {
std::cout << "\t" << std::quoted(k) << ": " << std::quoted(v) << "\n";
}
} else {
std::cout << "Parse failed\n";
}
if (first != last)
std::cout << "Remaining input unparsed: " << std::quoted(std::string(first, last)) << "\n";
}
Prints
Parsed 1 elements
"key": "value"

Trouble using get_value with Boost's property trees

I have to write an XML parser with Boost. However I have some trouble.
I can access the nodes name without any problem, but for some reason I can't access the attributes inside a tag by using get_value, which should work instantly. Maybe there is a mistake in my code I didn't spot? Take a look:
void ParametersGroup::load(const boost::property_tree::ptree &pt)
{
using boost::property_tree::ptree;
BOOST_FOREACH(const ptree::value_type& v, pt)
{
name = v.second.get_value<std::string>("name");
std::string node_name = v.first;
if (node_name == "<xmlattr>" || node_name == "<xmlcomment>")
continue;
else if (node_name == "ParametersGroup")
sg.load(v.second); // Recursion to go deeper
else if (node_name == "Parameter")
{
// Do stuff
std::cout << "PARAMETER_ELEM" << std::endl;
std::cout << "name: " << name << std::endl;
std::cout << "node_name: " << node_name << std::endl << std::endl;
}
else
{
std::cerr << "FATAL ERROR: XML document contains a non-recognized element: " << node_name << std::endl;
exit(-1);
}
}
}
So basically I ignore and tags, when I'm in a ParametersGroup tag I go deeper, and when I'm in a Parameter tag I recover the datas to do stuff. However, I can't get the "name" properly.
This is the kind of lines I'm scanning in the last else if :
<Parameter name="box">
The std::cout << name displays things like that:
name: ^M
^M
^M
^M
^M
^M
which is obvisouly not what I'm asking for.
What am I doing wrong? Any help would be greatly appreciated.
Since your question isn't particularly selfcontained, here's my selfcontained counter example:
Live On Coliru
#include <sstream>
#include <iostream>
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/xml_parser.hpp>
using namespace boost::property_tree;
int main() {
ptree pt;
std::istringstream iss("<Parameter name=\"box\" />");
xml_parser::read_xml(iss, pt);
for (auto& element : pt)
{
std::cout << "'" << element.first << "'\n";
for (auto& attr : element.second)
{
std::cout << "'" << attr.first << "'\n";
for (auto& which : attr.second)
{
std::cout << "'" << which.first << "': \"" << which.second.get_value<std::string>() << "\"\n";
}
}
}
}
It prints
'Parameter'
'<xmlattr>'
'name': "box"
I hope you can see what you need to do (likely an unexpected level of nodes in the tree?). To get directly to the leaf node:
pt.get_child("Parameter.<xmlattr>.name").get_value<std::string>()

How to extract trimmed text using Boost Spirit?

Using boost spirit, I'd like to extract a string that is followed by some data in parentheses. The relevant string is separated by a space from the opening parenthesis. Unfortunately, the string itself may contain spaces. I'm looking for a concise solution that returns the string without a trailing space.
The following code illustrates the problem:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <string>
#include <iostream>
namespace qi = boost::spirit::qi;
using std::string;
using std::cout;
using std::endl;
void
test_input(const string &input)
{
string::const_iterator b = input.begin();
string::const_iterator e = input.end();
string parsed;
bool const r = qi::parse(b, e,
*(qi::char_ - qi::char_("(")) >> qi::lit("(Spirit)"),
parsed
);
if(r) {
cout << "PASSED:" << endl;
} else {
cout << "FAILED:" << endl;
}
cout << " Parsed: \"" << parsed << "\"" << endl;
cout << " Rest: \"" << string(b, e) << "\"" << endl;
}
int main()
{
test_input("Fine (Spirit)");
test_input("Hello, World (Spirit)");
return 0;
}
Its output is:
PASSED:
Parsed: "Fine "
Rest: ""
PASSED:
Parsed: "Hello, World "
Rest: ""
With this simple grammar, the extracted string is always followed by a space (that I 'd like to eliminate).
The solution should work within Spirit since this is only part of a larger grammar. (Thus, it would probably be clumsy to trim the extracted strings after parsing.)
Thank you in advance.
Like the comment said, in the case of a single space, you can just hard code it. If you need to be more flexible or tolerant:
I'd use a skipper with raw to "cheat" the skipper for your purposes:
bool const r = qi::phrase_parse(b, e,
qi::raw [ *(qi::char_ - qi::char_("(")) ] >> qi::lit("(Spirit)"),
qi::space,
parsed
);
This works, and prints
PASSED:
Parsed: "Fine"
Rest: ""
PASSED:
Parsed: "Hello, World"
Rest: ""
See it Live on Coliru
Full program for reference:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <string>
#include <iostream>
namespace qi = boost::spirit::qi;
using std::string;
using std::cout;
using std::endl;
void
test_input(const string &input)
{
string::const_iterator b = input.begin();
string::const_iterator e = input.end();
string parsed;
bool const r = qi::phrase_parse(b, e,
qi::raw [ *(qi::char_ - qi::char_("(")) ] >> qi::lit("(Spirit)"),
qi::space,
parsed
);
if(r) {
cout << "PASSED:" << endl;
} else {
cout << "FAILED:" << endl;
}
cout << " Parsed: \"" << parsed << "\"" << endl;
cout << " Rest: \"" << string(b, e) << "\"" << endl;
}
int main()
{
test_input("Fine (Spirit)");
test_input("Hello, World (Spirit)");
return 0;
}

How to expand environment variables in .ini files using Boost

I have a INI file like
[Section1]
Value1 = /home/%USER%/Desktop
Value2 = /home/%USER%/%SOME_ENV%/Test
and want to parse it using Boost. I tried using Boost property_tree like
boost::property_tree::ptree pt;
boost::property_tree::ini_parser::read_ini("config.ini", pt);
std::cout << pt.get<std::string>("Section1.Value1") << std::endl;
std::cout << pt.get<std::string>("Section1.Value2") << std::endl;
But it didn't expand the environment variable. Output looks like
/home/%USER%/Desktop
/home/%USER%/%SOME_ENV%/Test
I was expecting something like
/home/Maverick/Desktop
/home/Maverick/Doc/Test
I am not sure if it is even possible with boost property_tree.
I would appreciate any hint to parse this kind of file using boost.
And here's another take on it, using the old crafts:
not requiring Spirit, or indeed Boost
not hardwiring the interface to std::string (instead allowing any combination of input iterators and output iterator)
handling %% "properly" as a single % 1
The essence:
#include <string>
#include <algorithm>
static std::string safe_getenv(std::string const& macro) {
auto var = getenv(macro.c_str());
return var? var : macro;
}
template <typename It, typename Out>
Out expand_env(It f, It l, Out o)
{
bool in_var = false;
std::string accum;
while (f!=l)
{
switch(auto ch = *f++)
{
case '%':
if (in_var || (*f!='%'))
{
in_var = !in_var;
if (in_var)
accum.clear();
else
{
accum = safe_getenv(accum);
o = std::copy(begin(accum), end(accum), o);
}
break;
} else
++f; // %% -> %
default:
if (in_var)
accum += ch;
else
*o++ = ch;
}
}
return o;
}
#include <iterator>
std::string expand_env(std::string const& input)
{
std::string result;
expand_env(begin(input), end(input), std::back_inserter(result));
return result;
}
#include <iostream>
#include <sstream>
#include <list>
int main()
{
// same use case as first answer, show `%%` escape
std::cout << "'" << expand_env("Greeti%%ng is %HOME% world!") << "'\n";
// can be done streaming, to any container
std::istringstream iss("Greeti%%ng is %HOME% world!");
std::list<char> some_target;
std::istreambuf_iterator<char> f(iss), l;
expand_env(f, l, std::back_inserter(some_target));
std::cout << "Streaming results: '" << std::string(begin(some_target), end(some_target)) << "'\n";
// some more edge cases uses to validate the algorithm (note `%%` doesn't
// act as escape if the first ends a 'pending' variable)
std::cout << "'" << expand_env("") << "'\n";
std::cout << "'" << expand_env("%HOME%") << "'\n";
std::cout << "'" << expand_env(" %HOME%") << "'\n";
std::cout << "'" << expand_env("%HOME% ") << "'\n";
std::cout << "'" << expand_env("%HOME%%HOME%") << "'\n";
std::cout << "'" << expand_env(" %HOME%%HOME% ") << "'\n";
std::cout << "'" << expand_env(" %HOME% %HOME% ") << "'\n";
}
Which, on my box, prints:
'Greeti%ng is /home/sehe world!'
Streaming results: 'Greeti%ng is /home/sehe world!'
''
'/home/sehe'
' /home/sehe'
'/home/sehe '
'/home/sehe/home/sehe'
' /home/sehe/home/sehe '
' /home/sehe /home/sehe '
1 Of course, "properly" is subjective. At the very least, I think this
would be useful (how else would you configure a value legitimitely containing %?)
is how cmd.exe does it on Windows
I'm pretty sure that this could be done trivally (see my newer answer) using a handwritten parser, but I'm personally a fan of Spirit:
grammar %= (*~char_("%")) % as_string ["%" >> +~char_("%") >> "%"]
[ _val += phx::bind(safe_getenv, _1) ];
Meaning:
take all non-% chars, if any
then take any word from inside %s and pass it through safe_getenv before appending
Now, safe_getenv is a trivial wrapper:
static std::string safe_getenv(std::string const& macro) {
auto var = getenv(macro.c_str());
return var? var : macro;
}
Here's a complete minimal implementation:
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
static std::string safe_getenv(std::string const& macro) {
auto var = getenv(macro.c_str());
return var? var : macro;
}
std::string expand_env(std::string const& input)
{
using namespace boost::spirit::qi;
using boost::phoenix::bind;
static const rule<std::string::const_iterator, std::string()> compiled =
*(~char_("%")) [ _val+=_1 ]
% as_string ["%" >> +~char_("%") >> "%"] [ _val += bind(safe_getenv, _1) ];
std::string::const_iterator f(input.begin()), l(input.end());
std::string result;
parse(f, l, compiled, result);
return result;
}
int main()
{
std::cout << expand_env("Greeting is %HOME% world!\n");
}
This prints
Greeting is /home/sehe world!
on my box
Notes
this is not optimized (well, not beyond compiling the rule once)
replace_regex_copy would do as nicely and more efficient (?)
see this answer for a slightly more involved 'expansion' engine: Compiling a simple parser with Boost.Spirit
using output iterator instead of std::string for accumulation
allowing nested variables
allowing escapes