boost variant type collision - c++

Follow-Up Question
So, I've been playing with the Boost Mini C Tutorial.
What I have done is added a rule to parse string literals. The purpose is so that I can parse and compile programs like (functionality already built-in):
int ret(int x) {
return x;
}
int main() {
int x = 5;
return ret(x)*2;
}
As well as (want to add this functionality),
string print(string s) {
return s;
}
int main() {
string foo = "bar";
print(foo);
return 0;
}
Whether or not the last two examples compile with, say, gcc is inconsequential.
So, the gist of what I added is the following:
Within the file expression_def.hpp (production rule 'quoted_string' has been added):
quoted_string = '"' >> *('\\' >> char_ | ~char_('"')) >> '"'; // ADDED THIS
primary_expr =
uint_
| quoted_string // ADDED THIS
| function_call
| identifier
| bool_
| '(' > expr > ')'
;
Within ast.hpp, the variant alternative 'std::string' has been added:
typedef boost::variant<
nil
, bool
, unsigned int
, std::string // ADDED THIS
, identifier
, boost::recursive_wrapper<unary>
, boost::recursive_wrapper<function_call>
, boost::recursive_wrapper<expression>
>
operand;
Here is the rule declaration for the addition, as well as the rule it's colliding with:
qi::rule<Iterator, std::string(), skipper<Iterator> > identifier;
qi::rule<Iterator, std::string()> quoted_string; // declaring this without the skipper
// lets us avoid the lexeme[] incantation (thanks #sehe).
The problem now is that the compiler confuses what should be an 'identifier' for a 'quoted_string' - or actually just a std::string.
My guess is, the fact that they both have a std::string signature return type is the cause of the problem, but I don't know a good workaround here. Additionally, the 'identifier' struct has a data member of type std::string that it is initialized with, so really the compiler cannot tell between the two and the variant std::string ends up being the better match.
Now, if I change std::string to char* like so:
typedef boost::variant<
nil
, bool
, unsigned int
, char* // CHANGED, YET AGAIN
, identifier
, boost::recursive_wrapper<unary>
, boost::recursive_wrapper<function_call>
, boost::recursive_wrapper<expression>
>
operand;
it will compile and work with integers, but then I am unable to parse strings (in fact, VS will call abort()). It should be noted that because each variant alternative needs an overload, I have something in my code along the lines of:
bool compiler::operator()(std::string const& x)
{
BOOST_ASSERT(current != 0);
current->op(op_string, x);
return true;
}
and
void function::op(int a, std::string const& b)
{
code.push_back(a);
code.push_back(b.size());
for (uintptr_t ch : b)
{
code.push_back(ch);
}
size_ += 2 + b.size();
}
These both work swimmingly when I need to parse strings (of course sacrificing the ability to handle integers).
Their integer equivalents are (and found in compiler.cpp)
bool compiler::operator()(unsigned int x)
{
BOOST_ASSERT(current != 0);
current->op(op_int, x);
return true;
}
and of course:
void function::op(int a, int b)
{
code.push_back(a);
code.push_back(b);
size_ += 2;
}
If I have to change the variant type from std::string to char*, then I have to update the overloads, and because of C legacies, it gets to look a bit ugly.
I understand this might be a bit daunting and not really appealing to comb through the source, but I assure you it really isn't. This compiler tutorial simply pushes bytecode into a vector, which by design only handles integers. I am trying to modify it to handle strings as well, hence the additions and overloads, as well as the need for uintptr_t. Anyone familiar with the material and/or Boost will likely know exactly what they are looking at (ehem, #sehe, ehem!).
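One possible way around the collision is to give string literals their own wrapper type, so that a bare std::string can only convert to identifier. The sketch below uses std::variant (C++17) as a stand-in for boost::variant and only illustrates the overload-resolution idea; it does not show how Spirit's attribute propagation would fill the wrapper. The names string_literal, kind_visitor and kind are hypothetical, and the identifier struct is a simplified assumption about what the tutorial defines.

```cpp
#include <string>
#include <utility>
#include <variant>

// Assumed, simplified version of the tutorial's identifier:
// implicitly constructible from std::string.
struct identifier {
    identifier(std::string const& n = "") : name(n) {}
    std::string name;
};

// Hypothetical wrapper: the explicit constructor keeps a bare
// std::string from converting to this alternative by accident.
struct string_literal {
    explicit string_literal(std::string v) : value(std::move(v)) {}
    std::string value;
};

// std::variant used here as a stand-in for boost::variant.
using operand = std::variant<unsigned int, string_literal, identifier>;

// Reports which alternative the variant currently holds.
struct kind_visitor {
    std::string operator()(unsigned int) const { return "uint"; }
    std::string operator()(string_literal const&) const { return "string_literal"; }
    std::string operator()(identifier const&) const { return "identifier"; }
};

std::string kind(operand const& o) { return std::visit(kind_visitor{}, o); }
```

With this in place, kind(operand(std::string("foo"))) reports "identifier", while kind(operand(string_literal("bar"))) reports "string_literal", so the two no longer compete for the same alternative.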

Related

Spirit X3, two rules do not compile after being combined into one

I am currently learning how to use x3. As the title states, I have had success creating a grammar with a few simple rules, but upon combining two of these rules into one, the code no longer compiles. Here is the code for the AST portion:
namespace x3 = boost::spirit::x3;
struct Expression;
struct FunctionExpression {
std::string functionName;
std::vector<x3::forward_ast<Expression>> inputs;
};
struct Expression: x3::variant<int, double, bool, FunctionExpression> {
using base_type::base_type;
using base_type::operator=;
};
The rules I have created parse input formatted as {rangeMin, rangeMax}:
rule<struct basic_exp_class, ast::Expression> const
basic_exp = "basic_exp";
rule<struct exp_pair_class, std::vector<ast::Expression>> const
exp_pair = "exp_pair";
rule<struct range_class, ast::FunctionExpression> const
range = "range";
auto const basic_exp_def = double_ | int_ | bool_;
auto const exp_pair_def = basic_exp >> ',' >> basic_exp;
auto const range_def = attr("computeRange") >> '{' >> exp_pair >> '}';
BOOST_SPIRIT_DEFINE(basic_exp, exp_pair, range);
This code compiles fine. However, if I try to inline the exp_pair rule into the range_def rule, like so:
rule<struct basic_exp_class, ast::Expression> const
basic_exp = "basic_exp";
rule<struct range_class, ast::FunctionExpression> const
range = "range";
auto const basic_exp_def = double_ | int_ | bool_;
auto const range_def = attr("computeRange") >> '{' >> (
basic_exp >> ',' >> basic_exp
) >> '}';
BOOST_SPIRIT_DEFINE(basic_exp, range);
The code fails to compile with a very long template error, ending with the line:
spirit/include/boost/spirit/home/x3/operator/detail/sequence.hpp:149:9: error: static assertion failed: Size of the passed attribute is less than expected.
static_assert(
^~~~~~~~~~~~~
The header file also includes this comment above the static_assert:
// If you got an error here, then you are trying to pass
// a fusion sequence with the wrong number of elements
// as that expected by the (sequence) parser.
But I do not see why the code should fail. According to x3's compound attribute rules, the inlined portion in the parentheses should have an attribute of type vector<ast::Expression>, making the overall rule have the type tuple<string, vector<ast::Expression>>, so that it would be compatible with ast::FunctionExpression. The same logic applies to the more verbose three-rule version, the only difference being that I explicitly declared a rule for the inner part and stated that its attribute needed to be of type vector<ast::Expression>.
Spirit X3 is probably treating the attribute of the inlined sequence as two separate ast::Expression values instead of the single std::vector<ast::Expression> that the ast::FunctionExpression struct requires.
To solve this, we can use the as<> helper lambda mentioned in another answer, which typically wraps a parser in an anonymous rule so that the attribute type of the sub-expression is stated explicitly.
The modified range_def then becomes:
auto const range_def = attr("computeRange") >> '{' >> as<std::vector<ast::Expression>>(basic_exp >> ',' >> basic_exp) >> '}';

How do I convert boost::spirit::qi::lexeme's attribute to std::string?

Consider:
struct s {
AttrType f(const std::string &);
};
...and a rule r with an attribute AttrType:
template <typename Signature> using rule_t =
boost::spirit::qi::rule<Iterator,
Signature,
boost::spirit::qi::standard::space_type>;
rule_t<AttrType()> r;
r = lexeme[alnum >> +(alnum | char_('.') | char_('_'))][
_val = boost::phoenix::bind(&s::f, s_inst, _1)
];
When compiling this (with clang), I get this error message:
boost/phoenix/bind/detail/preprocessed/member_function_ptr_10.hpp:28:72: error: no viable conversion from
'boost::fusion::vector2<char, std::__1::vector<char, std::__1::allocator<char> > >' to 'const std::__1::basic_string<char>'
return (BOOST_PROTO_GET_POINTER(class_type, obj)->*fp)(a0);
^~
It's my impression that the problem is the type of the placeholder variable, _1. Is there a concise way to convert lexeme's attribute to std::string for this purpose?
If I interject an additional rule with an attribute type of std::string, it compiles:
rule_t<std::string()> r_str;
r = r_str[boost::phoenix::bind(&s::f, s_inst, _1)];
r_str = lexeme[alnum >> +(alnum | char_('.') | char_('_'))];
...but this seems a bit awkward. Is there a better way?
You can use qi::as_string[] (which will coerce the attribute into a string if a suitable automatic transformation exists).
Alternatively you can use qi::raw[], which exposes the source-iterator range. This will automatically transform into std::string attributes. The good thing here is that the input can be reflected unaltered (e.g. qi::raw[ qi::int_ >> ';' >> qi::double_ ] will work).
In your case you can probably use as_string[]. But you can also fix the argument to take a std::vector<char> const&.
Finally you could use attr_cast<> to achieve exactly the same effect as with the separate qi::rule<> (but without using the separate rule :)) but I don't recommend it for efficiency and because older versions of boost had bugs in this facility.

string input, how to tell if it is int?

I am writing a program that converts a parenthesized expression into a mathematical one, and evaluates it. I've got the calculation bit written already.
I am using a stack for the operands, and a queue for the numbers. Adding operands to the stack isn't an issue, but I need to identify whether the input character is an integer, and if so, add it to the queue. Here's some code:
cout << "Enter the numbers and operands for the expression";
string aString;
do
{
cin >> aString
if (aString = int) // function to convert to read if int, convert to int
{
c_str(...);
atoi(...);
istack.push(int);
}
}
That's where I'm stuck now. I know I'm going to have to use c_str and atoi to convert it to an int. Am I taking the wrong approach?
Use the .fail() method of the stream.
If you need the string too, you can read to a string first, then attempt to convert the string to an integer using a stringstream object and check .fail() on the stringstream to see if the conversion could be done.
cin >> aString;
std::stringstream ss;
ss << aString;
int n;
ss >> n;
if (!ss.fail()) {
// int;
} else {
// not int;
}
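One caveat with the check above: ss >> n also succeeds on a token like "12abc", because extraction simply stops at the first character that is not part of the number. A minimal sketch that rejects trailing characters as well, with to_int as a hypothetical helper name:

```cpp
#include <sstream>
#include <string>

// Returns true only if the whole token is a valid int.
// On failure, 'out' is left unchanged.
bool to_int(const std::string& token, int& out) {
    std::istringstream ss(token);
    int n = 0;
    char leftover = 0;
    if (!(ss >> n)) return false;      // no leading integer, or out of range
    if (ss >> leftover) return false;  // something follows the number, e.g. "12abc"
    out = n;
    return true;
}
```

In the loop from the question this could be used as if (to_int(aString, value)) istack.push(value);, treating everything else as an operator.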
I'll probably get flamed for this by the C++ purists.
However, sometimes the C++ library is just more work than the C library. I offer this
solution to C developers out there. And C++ developers who don't mind using some of the
features of the C library.
The whole check and conversion can be done in 1 line of C using the sscanf function.
int intval;
cin >> aString;
if (sscanf(aString.c_str(), "%d", &intval) == 1) { // 1 means one value was assigned
    istack.push(intval);
}
sscanf returns the number of input arguments matched and assigned. So in this case, it's looking for one standard signed int value. If sscanf returns 1 then it succeeded in assigning the value. If it returns 0 then we don't have an int.
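Note that "%d" alone accepts a token such as "12abc", since sscanf stops at the first character it cannot convert. A sketch, assuming you want to reject such tokens, that uses the standard %n directive to see how many characters were consumed (scan_int is a hypothetical name):

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Returns true only if the entire token was consumed as one int.
bool scan_int(const std::string& token, int& out) {
    int value = 0;
    int consumed = 0;
    // %n stores the number of characters read so far and does not
    // count toward sscanf's return value.
    if (std::sscanf(token.c_str(), "%d%n", &value, &consumed) != 1) return false;
    if (static_cast<std::size_t>(consumed) != token.size()) return false;
    out = value;
    return true;
}
```

Comparing against 1 rather than testing for non-zero also covers the empty-string case, where sscanf returns EOF.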
If you expect an integer, I would use boost::lexical_cast.
std::string some_string = "345";
int val = boost::lexical_cast<int>(some_string);
If it fails to cast to an integer, it will throw. The performance is pretty reasonable, and it keeps your code very clean.
I am unaware of any non-throwing version. You could use something like this, though I usually try to avoid letting exceptions control program flow.
bool cast_nothrow(const std::string &str, int &val) {
try {
val = boost::lexical_cast<int>(str);
return true;
} catch (boost::bad_lexical_cast &) {
return false;
}
}
Edit:
I would not recommend folding integer validation into a structure like the one you described. Good functions do one thing and do it well.
Usually you'd want a more formal grammar parser to handle such things. My honest advice is to embed a scripting language or library in your project. It is non-trivial, so let someone else do the hard work.
If I actually tried to implement what you propose, I would probably do a stack-based solution, keeping each parenthesis level in its own stack frame. The simplest thing would just be to hard-code the simple operators (parentheses, add, sub, etc.) and assume that everything else is a number.
Eventually you'd want everything broken down into some expression type. It might look something like this:
struct Expression {
virtual ~Expression() {}
virtual float value() const = 0;
};
struct Number : public Expression {
virtual float value() const {return val;}
float val;
};
struct AdditionOper : public Expression {
virtual float value() const {return lhs->value() + rhs->value();}
boost::shared_ptr<Expression> lhs;
boost::shared_ptr<Expression> rhs;
};
I'd start by parsing out the parentheses, since they determine the order of your expressions. Then I'd split everything based on the numerical operators and start putting the pieces into expressions. Then you're left with cases like 3 + 4 * 6, which require some care to get the order of operations right.
Good luck.
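To make the order-of-operations point concrete, here is a minimal recursive-descent sketch. It is a different technique from the Expression class hierarchy above: it evaluates directly instead of building a tree, and the grammar and the names Parser and evaluate are assumptions chosen for illustration. Each precedence level gets its own function, which is what makes 3 + 4 * 6 come out as 27.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical grammar, chosen for illustration:
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := number | '(' expr ')'
class Parser {
public:
    explicit Parser(const std::string& text) : s_(text), pos_(0) {}

    double parse() {
        double v = expr();
        skip_spaces();
        if (pos_ != s_.size()) throw std::runtime_error("unexpected trailing input");
        return v;
    }

private:
    void skip_spaces() {
        while (pos_ < s_.size() &&
               std::isspace(static_cast<unsigned char>(s_[pos_]))) ++pos_;
    }

    // Consumes 'c' if it is the next non-space character.
    bool accept(char c) {
        skip_spaces();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // Lowest precedence: addition and subtraction.
    double expr() {
        double v = term();
        for (;;) {
            if (accept('+')) v += term();
            else if (accept('-')) v -= term();
            else return v;
        }
    }

    // Higher precedence: multiplication and division.
    double term() {
        double v = factor();
        for (;;) {
            if (accept('*')) v *= factor();
            else if (accept('/')) v /= factor();
            else return v;
        }
    }

    // Highest precedence: a number or a parenthesized sub-expression.
    double factor() {
        if (accept('(')) {
            double v = expr();
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        std::size_t used = 0;
        double v = 0;
        try {
            v = std::stod(s_.substr(pos_), &used);
        } catch (const std::logic_error&) {  // invalid_argument or out_of_range
            throw std::runtime_error("expected number");
        }
        pos_ += used;
        return v;
    }

    std::string s_;
    std::size_t pos_;
};

double evaluate(const std::string& text) { return Parser(text).parse(); }
```

Calling evaluate("3 + 4 * 6") yields 27, and evaluate("(3 + 4) * 6") yields 42. Swapping the arithmetic for node construction would give the tree-building variant described above.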
You can either run your function that converts a string representation of a number to a double and see if there's an error, or you can look at the contents of the string and see if it matches the pattern of a number and then do the conversion.
You might use boost::lexical_cast<double>() or std::stod() (C++11), where errors are reported with an exception; istringstream extractors, where the error is reported by setting the fail bit; or the C conversion functions such as strtod, which report range errors through the global (thread-local, rather) variable errno and signal a failed conversion through the end pointer.
try {
    istack.push_back(std::stod(aString));
} catch (std::invalid_argument const&) {
    // aString is not a number
} catch (std::out_of_range const&) {
    // aString is a number, but it does not fit in a double
}
or
errno = 0;
char const *s = aString.c_str();
char *end;
double result = strtod(s, &end);
if (end == s || *end != '\0' || errno == ERANGE) {
    // nothing converted, trailing characters, or out of range: not a usable number
} else {
    istack.push_back(result);
}
An implementation of the second option might use a regex to see if the string matches the pattern you use for numbers, and if it does then running your conversion function. Here's an example of a pattern you might expect for floating point values:
std::regex pattern(R"([+-]?(\d*\.\d+|\d+\.?)([eE][+-]?\d+)?)"); // raw string keeps the backslashes intact
if(std::regex_match(aString,pattern)) {
istack.push_back(std::stod(aString));
} else {
// aString is not a number
}
Also, this probably doesn't matter to you, but most any built in method for converting a string to a number will be locale sensitive one way or another. One way to isolate yourself from this is to use a stringstream you create and imbue with the classic locale.
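A minimal sketch of that last suggestion, assuming you want '.' to be the decimal point regardless of the user's environment (parse_double_classic is a hypothetical name):

```cpp
#include <locale>
#include <sstream>
#include <string>

// Parses a double using the classic ("C") locale, independent of
// whatever global locale the program has installed.
bool parse_double_classic(const std::string& token, double& out) {
    std::istringstream ss(token);
    ss.imbue(std::locale::classic());  // '.' is the decimal point here
    double v = 0;
    char leftover = 0;
    if (!(ss >> v)) return false;      // no number at the start
    if (ss >> leftover) return false;  // trailing characters, e.g. "3,5"
    out = v;
    return true;
}
```

The stream is local to the function, so imbuing it does not affect cin or any other stream.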
I guess the C++ (no boost) way would be this :
do
{
std::stringstream ss;
std::string test;
cin >> test;
ss << test;
int num;
if (ss >> num) // function to convert to read if int, convert to int
{
std::cout << "Number : " << num << "\n";
}
}while(true); // don't do this though..
Can you not use ctype.h (http://www.cplusplus.com/reference/clibrary/cctype/)? I have used it before without getting into trouble.
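A short sketch of what that could look like, using std::isdigit from <cctype> on every character. The function name is hypothetical, and this version deliberately accepts only unsigned decimal digits (no sign, no whitespace):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// True if the token is non-empty and consists only of decimal digits.
bool is_unsigned_integer(const std::string& token) {
    return !token.empty() &&
           std::all_of(token.begin(), token.end(),
                       // Taking the character as unsigned char avoids
                       // undefined behaviour for negative char values.
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}
```

A token that passes this check can still overflow int when converted, so the conversion step needs its own range check.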
Especially if you're doing base-10 input, I find the most straightforward thing to do is read the string, then check that it only contains valid characters:
string s;
cin >> s;
if (strspn(s.c_str(), "0123456789") == s.length()) { // strspn is declared in <cstring>
    // int
} else {
    // not int
}

BOOST_FUSION_ADAPT_STRUCT doesn't take the right number of arguments

I am using Boost::Spirit to parse some text into structs. This requires using BOOST_FUSION_ADAPT_STRUCT for parsing text and directly storing into the structure. I know that the macro takes 2 arguments: the structure name as the 1st arg and all the structure members as the 2nd argument. And I am passing just those 2. But I get a compilation error saying,
error: macro "BOOST_FUSION_ADAPT_STRUCT_FILLER_0" passed 3 arguments, but takes just 2
Here is the code snippet. Let me know if you need the entire code.
Thanks.
namespace client
{
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
namespace phoenix = boost::phoenix;
struct Dir_Entry_Pair
{
std::string dir;
std::string value1;
std::pair<std::string, std::string> keyw_value2;
};
}
BOOST_FUSION_ADAPT_STRUCT(
client::Dir_Entry_Pair,
(std::string, dir)
(std::string, value1)
(std::pair< std::string, std::string >, keyw_value2))
This is the rule I am trying to parse,
qi::rule<Iterator, Dir_Entry_Pair()> ppair = dir
>> '/'
>> entry
>> -(keyword >> entry);
Most likely the issue is std::pair<std::string,std::string>.
The problem is that there is a comma in the type, which will play havoc with the macro expansion (when using the last element of your list).
You should try wrapping the type in its own set of parentheses.
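Another common workaround is a typedef, which hides the comma from the preprocessor altogether. The sketch below demonstrates the effect with a hypothetical two-argument macro, DECLARE_MEMBER, standing in for any macro that takes a (type, name) pair; it does not use Boost.Fusion itself, and string_pair is an assumed name.

```cpp
#include <string>
#include <utility>

// Hypothetical macro standing in for any macro that expects (type, name).
#define DECLARE_MEMBER(type, name) type name;

// The typedef contains the comma, so the macro argument no longer does.
typedef std::pair<std::string, std::string> string_pair;

struct Dir_Entry_Pair {
    DECLARE_MEMBER(std::string, dir)
    DECLARE_MEMBER(std::string, value1)
    DECLARE_MEMBER(string_pair, keyw_value2)
    // DECLARE_MEMBER(std::pair<std::string, std::string>, keyw_value2)
    // would not compile: the preprocessor splits at the comma inside
    // the angle brackets and sees three arguments instead of two.
};
```

Applied to the question, the same idea would mean declaring the typedef next to the struct and writing (client::string_pair, keyw_value2) in the BOOST_FUSION_ADAPT_STRUCT list, assuming the typedef lives in namespace client.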

boost spirit semantic action parameters

In this article about boost spirit semantic actions it is mentioned that:
"There are actually 2 more arguments being passed: the parser context and a reference to a boolean ‘hit’ parameter. The parser context is meaningful only if the semantic action is attached somewhere to the right hand side of a rule. We will see more information about this shortly. The boolean value can be set to false inside the semantic action invalidates the match in retrospective, making the parser fail."
All fine, but I've been trying to find an example passing a function object as a semantic action that uses the other parameters (parser context and hit boolean), and I haven't found any. I would love to see an example using regular functions or function objects, as I can barely grok the phoenix voodoo.
This is a really good question (and also a can of worms) because it gets at the interface of qi and phoenix. I haven't seen an example either, so I'll extend the article a little in this direction.
As you say, functions for semantic actions can take up to three parameters:
Matched attribute - covered in the article
Context - contains the qi-phoenix interface
Match flag - manipulate the match state
Match flag
As the article states, the second parameter is not meaningful unless the expression is part of a rule, so let's start with the third. A placeholder for the second parameter is still needed, though; for this, use boost::fusion::unused_type. So a modified function from the article to use the third parameter is:
#include <boost/spirit/include/qi.hpp>
#include <string>
#include <iostream>
void f(int attribute, const boost::fusion::unused_type& it, bool& mFlag){
//output parameters
std::cout << "matched integer: '" << attribute << "'" << std::endl
<< "match flag: " << mFlag << std::endl;
//fiddle with match flag
mFlag = false;
}
namespace qi = boost::spirit::qi;
int main(void){
std::string input("1234 6543");
std::string::const_iterator begin = input.begin(), end = input.end();
bool returnVal = qi::phrase_parse(begin, end, qi::int_[f], qi::space);
std::cout << "return: " << returnVal << std::endl;
return 0;
}
which outputs:
matched integer: '1234'
match flag: 1
return: 0
All this example does is switch the match to a non-match, which is reflected in the parser output. According to hkaiser, in boost 1.44 and up setting the match flag to false will cause the match to fail in the normal way. If alternatives are defined, the parser will backtrack and attempt to match them as one would expect. However, in boost<=1.43 a Spirit bug prevents backtracking, which causes strange behavior. To see this, add phoenix include boost/spirit/include/phoenix.hpp and change the expression to
qi::int_[f] | qi::digit[std::cout << qi::_1 << "\n"]
You'd expect that, when the qi::int_ parser fails, the alternative qi::digit would match the beginning of the input at "1", but the output is:
matched integer: '1234'
match flag: 1
6
return: 1
The 6 is the first digit of the second int in the input, which indicates the alternative is taken using the skipper and without backtracking. Notice also that the match is considered successful, based on the alternative.
Once boost 1.44 is out, the match flag will be useful for applying match criteria that might be otherwise difficult to express in a parser sequence. Note that the match flag can be manipulated in phoenix expressions using the _pass placeholder.
Context parameter
The more interesting parameter is the second one, which contains the qi-phoenix interface, or in qi parlance, the context of the semantic action. To illustrate this, first examine a rule:
rule<Iterator, Attribute(Arg1,Arg2,...), qi::locals<Loc1,Loc2,...>, Skipper>
The context parameter embodies the Attribute, Arg1, ... ArgN, and qi::locals template parameters, wrapped in a boost::spirit::context template type. This attribute differs from the function parameter: the function parameter attribute is the parsed value, while this attribute is the value of the rule itself. A semantic action must map the former to the latter. Here's an example of a possible context type (phoenix expression equivalents indicated):
using namespace boost;
spirit::context< //context template
fusion::cons<
int&, //return int attribute (phoenix: _val)
fusion::cons<
char&, //char argument1 (phoenix: _r1)
fusion::cons<
float&, //float argument2 (phoenix: _r2)
fusion::nil //end of cons list
>
>
>,
fusion::vector2< //locals container
char, //char local (phoenix: _a)
unsigned int //unsigned int local (phoenix: _b)
>
>
Note the return attribute and argument list take the form of a lisp-style list (a cons list). To access these variables within a function, access the attributes or locals members of the context struct template with fusion::at_c<>(). For example, for a context variable con:
//assign return attribute
fusion::at_c<0>(con.attributes) = 1;
//get the second rule argument
float arg2 = fusion::at_c<2>(con.attributes);
//assign the second local (phoenix: _b)
fusion::at_c<1>(con.locals) = 42;
To modify the article example to use the second argument, change the function definition and phrase_parse calls:
...
typedef
boost::spirit::context<
boost::fusion::cons<int&, boost::fusion::nil>,
boost::fusion::vector0<>
> f_context;
void f(int attribute, const f_context& con, bool& mFlag){
std::cout << "matched integer: '" << attribute << "'" << std::endl
<< "match flag: " << mFlag << std::endl;
//assign output attribute from parsed value
boost::fusion::at_c<0>(con.attributes) = attribute;
}
...
int matchedInt;
qi::rule<std::string::const_iterator,int(void),ascii::space_type>
intRule = qi::int_[f];
qi::phrase_parse(begin, end, intRule, ascii::space, matchedInt);
std::cout << "matched: " << matchedInt << std::endl;
....
This is a very simple example that just maps the parsed value to the output attribute value, but extensions should be fairly apparent. Just make the context struct template parameters match the rule output, input, and local types. Note that this type of a direct match between parsed type/value to output type/value can be done automatically using auto rules, with a %= instead of a = when defining the rule:
qi::rule<std::string::const_iterator,int(void),ascii::space_type> intRule;
intRule %= qi::int_;
IMHO, writing a function for each action would be rather tedious, compared to the brief and readable phoenix expression equivalents. I sympathize with the voodoo viewpoint, but once you work with phoenix for a little while, the semantics and syntax aren't terribly difficult.
Edit: Accessing rule context w/ Phoenix
The context variable is only defined when the parser is part of a rule. Think of a parser as being any expression that consumes input, where a rule translates the parser values (qi::_1) into a rule value (qi::_val). The difference is often non-trivial, for example when qi::_val has a class type that needs to be constructed from POD parsed values. Below is a simple example.
Let's say part of our input is a sequence of three CSV integers (x1, x2, x3), and we only care about an arithmetic function of these three integers (f = x0 + (x1+x2)*x3), where x0 is a value obtained elsewhere. One option is to read in the integers and then calculate the function; alternatively, phoenix can do both.
For this example, use one rule with an output attribute (the function value), an input (x0), and a local (to pass information between individual parsers within the rule). Here's the full example.
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <string>
#include <iostream>
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
int main(void){
std::string input("1234, 6543, 42");
std::string::const_iterator begin = input.begin(), end = input.end();
qi::rule<
std::string::const_iterator,
int(int), //output (_val) and input (_r1)
qi::locals<int>, //local int (_a)
ascii::space_type
>
intRule =
qi::int_[qi::_a = qi::_1] //local = x1
>> ","
>> qi::int_[qi::_a += qi::_1] //local = x1 + x2
>> ","
>> qi::int_
[
qi::_val = qi::_a*qi::_1 + qi::_r1 //output = local*x3 + x0
];
int ruleValue, x0 = 10;
qi::phrase_parse(begin, end, intRule(x0), ascii::space, ruleValue);
std::cout << "rule value: " << ruleValue << std::endl;
return 0;
}
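As a cross-check on what the rule should produce, the same arithmetic can be done without Spirit. This is only a plain-C++ sketch for verifying the expected value, not a replacement for the rule; eval_csv is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>

// Parses "x1, x2, x3" and computes (x1 + x2) * x3 + x0.
// Returns false if the input does not have exactly that shape.
bool eval_csv(const std::string& input, int x0, int& result) {
    std::istringstream ss(input);
    int x1 = 0, x2 = 0, x3 = 0;
    char c1 = 0, c2 = 0, extra = 0;
    if (!(ss >> x1 >> c1 >> x2 >> c2 >> x3)) return false;  // too few values
    if (c1 != ',' || c2 != ',') return false;               // wrong separators
    if (ss >> extra) return false;                          // trailing input
    result = (x1 + x2) * x3 + x0;
    return true;
}
```

For the input "1234, 6543, 42" with x0 = 10 this gives (1234 + 6543) * 42 + 10 = 326644, which is the value the Spirit rule above should report as well.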
Alternatively, all the ints could be parsed as a vector, and the function evaluated with a single semantic action (the % below is the list operator and elements of the vector are accessed with phoenix::at):
namespace ph = boost::phoenix;
...
qi::rule<
std::string::const_iterator,
int(int),
ascii::space_type
>
intRule =
(qi::int_ % ",")
[
qi::_val = (ph::at(qi::_1,0) + ph::at(qi::_1,1))
* ph::at(qi::_1,2) + qi::_r1
];
....
For the above, if the input is incorrect (two ints instead of three), bad things could happen at run time, so it would be better to specify the number of parsed values explicitly, so that parsing fails for bad input. The below uses _1, _2, and _3 to reference the first, second, and third match value:
(qi::int_ >> "," >> qi::int_ >> "," >> qi::int_)
[
qi::_val = (qi::_1 + qi::_2) * qi::_3 + qi::_r1
];
This is a contrived example, but should give you the idea. I've found phoenix semantic actions really helpful in constructing complex objects directly from input; this is possible because you can call constructors and member functions within semantic actions.