Evaluate a math expression in Qt - c++

I'm trying to create a Qt application and I need a math expression evaluator to evaluate things like (4+5)*2-9/3.
I added the .hpp file of this library (http://www.partow.net/programming/exprtk/) to my project in Qt Creator and tried to build the following example:
#include <cstdio>
#include <string>
#include "exprtk.hpp"
int main()
{
    typedef exprtk::expression<double> expression_t;
    typedef exprtk::parser<double> parser_t;
    std::string expression_string = "3 + sqrt(5) + pow(3,2) + log(5)";
    expression_t expression;
    parser_t parser;
    if (parser.compile(expression_string, expression))
    {
        double result = expression.value();
        printf("Result: %19.15f\n", result);  // the 'f' conversion specifier was missing
    }
    else
        printf("Error in expression.\n");
    return 0;
}
When I try to build it I get the following error:
debug\main.o:-1: error: too many sections (62303)
What could be the problem?

Using just pure Qt you can do something like this:
QString expression_string("3 + Math.sqrt(5) + Math.pow(3,2) + Math.log(5)");
QScriptEngine expression;
double my_val=expression.evaluate(expression_string).toNumber();
you can do much more, see HERE and HERE

Actually, on my machine (Qt 5.5, Ubuntu 16.04 with g++ 5.3), the code above does not work.
Although the answer is quite old, I am posting my solution in case someone finds it useful.
QScriptEngine uses JavaScript syntax. So to make the above code work, I had to change the syntax to:
QString expression_string("3 + Math.sqrt(5) + Math.pow(3,2) + Math.log(5)");
QScriptEngine expression;
double my_val=expression.evaluate(expression_string).toNumber();

Following the request in the comments, here is how to implement an arithmetic parser using boost::spirit. First, you need to download the Boost tarball. Don't try to clone Spirit alone from GitHub, because it depends on other Boost libraries.
Boost is huge, so if you want just the subset needed for a parser, you can extract it using bcp. From the Boost source directory:
cd tools/build/src/engine
./build.sh
cd ../../../bcp
../build/src/engine/b2
cd ../..
dist/bin/bcp fusion/include hana/functional spirit/home/x3 /some/path
bcp will copy all dependencies. You can keep only the /some/path/boost directory, because all the libraries we need are header-only.
Finally, here is the full code of the parser.
#include <iostream>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/hana/functional/fix.hpp>
#include <boost/hana/functional/overload.hpp>
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/ast/variant.hpp>
using namespace boost::spirit;
namespace hana = boost::hana;
// Define AST. The root is `ast::expr`, which is the first left-hand side
// operand and a list of all operations on the right-hand side. Each operand is
// a recursive `variant` that has `ast::expr` inside.
namespace ast
{
struct nil {};
struct signed_;
struct expr;
struct operand : x3::variant<
nil
, double
, x3::forward_ast<signed_>
, x3::forward_ast<expr>
>
{
using base_type::base_type;
using base_type::operator=;
};
struct signed_
{
char sign;
operand operand_;
};
struct operation
{
char operator_;
operand operand_;
};
struct expr
{
operand first;
std::vector<operation> rest;
};
} // namespace ast
// Give the grammar access to the fields of AST.
BOOST_FUSION_ADAPT_STRUCT(ast::signed_, sign, operand_)
BOOST_FUSION_ADAPT_STRUCT(ast::operation, operator_, operand_)
BOOST_FUSION_ADAPT_STRUCT(ast::expr, first, rest)
// Arithmetic expression grammar definition.
namespace ArithExpr
{
x3::rule<class expression, ast::expr > const expression("expression");
x3::rule<class term, ast::expr > const term("term");
x3::rule<class factor, ast::operand> const factor("factor");
auto const expression_def =
term
>> *(
(x3::char_('+') >> term)
| (x3::char_('-') >> term)
);
auto const term_def =
factor
>> *(
(x3::char_('*') >> factor)
| (x3::char_('/') >> factor)
);
auto const factor_def =
x3::double_
| '(' >> expression >> ')'
| (x3::char_('-') >> factor)
| (x3::char_('+') >> factor);
BOOST_SPIRIT_DEFINE(expression, term, factor);
auto calc = expression;
} // namespace ArithExpr
template <typename Iterator>
double CalcArithExpr(Iterator const &first, Iterator last) {
ast::expr expr;
// Build AST.
if (!x3::phrase_parse(first, last, ArithExpr::calc, x3::ascii::space, expr)) {
throw std::runtime_error("Cannot parse arithmetic expression");
}
// Walk the AST and calculate the result.
// hana::fix allows recursive lambda call
auto astEval = hana::fix([](auto self, auto expr) -> double {
// hana::overload calls a lambda corresponding to the type in the variant
return hana::overload(
[](ast::nil) -> double {
BOOST_ASSERT(0);
return 0;
},
[](double x) -> double { return x; },
[&](ast::signed_ const &x) -> double {
double rhs = boost::apply_visitor(self, x.operand_);
switch (x.sign) {
case '-': return -rhs;
case '+': return +rhs;
}
BOOST_ASSERT(0);
return 0;
},
[&](ast::expr const &x) -> double {
return std::accumulate(
x.rest.begin(), x.rest.end(),
// evaluate recursively left-hand side
boost::apply_visitor(self, x.first),
[&](double lhs, const ast::operation &op) -> double {
// evaluate recursively right-hand side
double rhs = boost::apply_visitor(self, op.operand_);
switch (op.operator_) {
case '+': return lhs + rhs;
case '-': return lhs - rhs;
case '*': return lhs * rhs;
case '/': return lhs / rhs;
}
BOOST_ASSERT(0);
return 0;
}
);
}
)(expr);
});
return astEval(expr);
}
int main(int argc, char *argv[]) {
auto expr = std::string{"-(4.5 + 5e-1) * 2.22 - 9.1 / 3.45"};
std::cout << CalcArithExpr(expr.begin(), expr.end()) << std::endl;
}
It calculates -(4.5 + 5e-1) * 2.22 - 9.1 / 3.45 and outputs -13.7377.
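If pulling in Boost is too heavy for a small project, the same grammar (expression, term, factor) can also be written by hand as a recursive-descent evaluator. The following is only a minimal sketch under my own assumptions, not the Spirit implementation above: the names SimpleArithEval and evalArith are made up for illustration, numbers are read with std::strtod (which also accepts forms such as "inf" or hex literals), and errors are reported with std::runtime_error.

```cpp
#include <cctype>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hand-written evaluator for the grammar used above:
//   expression = term   { ('+' | '-') term }
//   term       = factor { ('*' | '/') factor }
//   factor     = number | '(' expression ')' | ('-' | '+') factor
class SimpleArithEval {  // hypothetical name, not part of Boost
public:
    explicit SimpleArithEval(const std::string &text) : s_(text), pos_(0) {}

    double evaluate() {
        double value = expression();
        skipSpaces();
        if (pos_ != s_.size())
            throw std::runtime_error("Unexpected trailing input");
        return value;
    }

private:
    void skipSpaces() {
        while (pos_ < s_.size() &&
               std::isspace(static_cast<unsigned char>(s_[pos_])))
            ++pos_;
    }

    // Consume `c` if it is the next non-space character.
    bool accept(char c) {
        skipSpaces();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    double expression() {
        double lhs = term();
        for (;;) {
            if (accept('+')) lhs += term();
            else if (accept('-')) lhs -= term();
            else return lhs;
        }
    }

    double term() {
        double lhs = factor();
        for (;;) {
            if (accept('*')) lhs *= factor();
            else if (accept('/')) lhs /= factor();
            else return lhs;
        }
    }

    double factor() {
        if (accept('(')) {
            double value = expression();
            if (!accept(')')) throw std::runtime_error("Expected ')'");
            return value;
        }
        if (accept('-')) return -factor();
        if (accept('+')) return factor();
        // Number: let strtod read the longest valid floating-point literal.
        skipSpaces();
        const char *begin = s_.c_str() + pos_;
        char *end = nullptr;
        double value = std::strtod(begin, &end);
        if (end == begin) throw std::runtime_error("Expected a number");
        pos_ += static_cast<std::size_t>(end - begin);
        return value;
    }

    std::string s_;
    std::size_t pos_;
};

// Convenience wrapper: evaluate a whole expression string.
double evalArith(const std::string &text) {
    return SimpleArithEval(text).evaluate();
}
```

Calling evalArith("(4+5)*2-9/3") gives 15, and the expression from main above evaluates to roughly -13.7377. Unlike the Spirit version, this sketch builds no AST, so it is only suitable when you need the value and nothing else.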
Update
Here are instructions for building bcp and copying the selected headers on Windows, though without any guarantee. On Linux everything just works; on Windows you always have to jump through some hoops, and the direction of the jumps is always unpredictable.
That said, open a PowerShell command line and run:
Import-Module 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\Common7\Tools\Microsoft.VisualStudio.DevShell.dll'
Install-Module VSSetup -Scope CurrentUser
Get-VSSetupInstance
Replace 2019 above with your version of Visual Studio. You have to do this only once for your PowerShell; the remaining steps are needed every time you build bcp. Get-VSSetupInstance prints information about the instances of Visual Studio on your machine. Note the InstanceId of the one you would like to use. Now change to the Boost directory in PowerShell and run:
Enter-VsDevShell InstanceId -DevCmdArguments '-arch=x64' -SkipAutomaticLocation
where InstanceId is the ID you got from Get-VSSetupInstance. Then, from the same command prompt:
cd tools\build\src\engine
& .\build.bat
cd ..\..\..\bcp
..\build\src\engine\b2 address-model=64
cd ..\..
dist\bin\bcp fusion\include hana\functional spirit\home\x3 X:\some\path\boost

Related

Boost spirit core dump on parsing bracketed expression

I have a simplified grammar that should parse a sequence of terminal literals: id, '<', '>' and ":action".
I need to allow brackets '(' ')' that do nothing but improve readability. (The full example is here: http://coliru.stacked-crooked.com/a/dca93f5c8f37a889 )
Snip of my grammar:
start = expression % eol;
expression = (simple_def >> -expression)
| (qi::lit('(') > expression > ')');
simple_def = qi::lit('<') [qi::_val = Command::left]
| qi::lit('>') [qi::_val = Command::right]
| key [qi::_val = Command::id]
| qi::lit(":action") [qi::_val = Command::action]
;
key = +qi::char_("a-zA-Z_0-9");
When I try to parse const std::string s = "(a1 > :action)"; everything works like a charm.
But when I add a little more complexity with nested brackets, "(a1 (>) :action)", I get a core dump. For information: the core dump happens on Coliru, while the MSVC-compiled example just reports a failed parse.
So my questions: (1) what's wrong with the brackets, and (2) how exactly can brackets be introduced into the expression?
P.S. This is a simplified grammar; my real case is more complicated, but this is a minimal reproducible example.
You should just handle the expectation failure:
terminate called after throwing an instance of 'boost::wrapexcept<boost::spirit::qi::expectation_failure<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >'
what(): boost::spirit::qi::expectation_failure
Aborted (core dumped)
If you handle the expectation failure, the program will not have to terminate.
Fixing The Grammar
Your 'nested expression' rule only accepts a single expression. I think that
expression = (simple_def >> -expression)
is intended to match "1 or more simple_def". However, the alternative branch:
| ('(' > expression > ')');
doesn't accept the same: it just stops after parsing ')'. This means that your input is simply invalid according to the grammar.
I suggest a simplification by expressing intent. You were on the right path with semantic typedefs. Let's avoid the "weasely" Line Of Lines (what even is that?):
using Id = std::string;
using Line = std::vector<Command>;
using Script = std::vector<Line>;
And use these typedefs consistently. Now, we can express the grammar as we "think" about it:
start = skip(blank)[script];
script = line % eol;
line = +simple;
simple = group | command;
group = '(' > line > ')';
See, by simplifying our mental model and sticking to it, we avoided the entire problem you had a hard time spotting.
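To make the shape of that grammar concrete without Spirit, here is a minimal hand-written sketch of the same rules in plain C++. The names LineParser and parseLine are made up for this illustration, blanks are limited to space and tab, and std::runtime_error stands in for qi's expectation_failure; it is not how Spirit implements these rules internally.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class Command { id, left, right, action };

// Hand-written sketch of:
//   line   = +simple;
//   simple = group | command;
//   group  = '(' > line > ')';
class LineParser {  // hypothetical helper, not part of Spirit
public:
    explicit LineParser(const std::string &text) : s_(text) {}

    std::vector<Command> parse() {
        std::vector<Command> out;
        if (!line(out)) throw std::runtime_error("expected at least one command");
        skipBlanks();
        if (pos_ != s_.size()) throw std::runtime_error("unexpected trailing input");
        return out;
    }

private:
    void skipBlanks() {
        while (pos_ < s_.size() && (s_[pos_] == ' ' || s_[pos_] == '\t')) ++pos_;
    }

    // line = +simple : succeeds if at least one `simple` matched.
    bool line(std::vector<Command> &out) {
        if (!simple(out)) return false;
        while (simple(out)) {}
        return true;
    }

    // simple = group | command
    bool simple(std::vector<Command> &out) {
        skipBlanks();
        return group(out) || command(out);
    }

    // group = '(' > line > ')' : once '(' is seen, the rest is mandatory.
    bool group(std::vector<Command> &out) {
        if (pos_ >= s_.size() || s_[pos_] != '(') return false;
        ++pos_;
        if (!line(out)) throw std::runtime_error("expected a line after '('");
        skipBlanks();
        if (pos_ >= s_.size() || s_[pos_] != ')')
            throw std::runtime_error("expected ')'");
        ++pos_;
        return true;
    }

    // command = '<' | '>' | ":action" | key
    bool command(std::vector<Command> &out) {
        if (pos_ >= s_.size()) return false;
        if (s_[pos_] == '<') { ++pos_; out.push_back(Command::left); return true; }
        if (s_[pos_] == '>') { ++pos_; out.push_back(Command::right); return true; }
        if (s_.compare(pos_, 7, ":action") == 0) {
            pos_ += 7;
            out.push_back(Command::action);
            return true;
        }
        const std::size_t start = pos_;
        while (pos_ < s_.size() && isKeyChar(s_[pos_])) ++pos_;
        if (pos_ == start) return false;
        out.push_back(Command::id);
        return true;
    }

    static bool isKeyChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
               (c >= '0' && c <= '9') || c == '_';
    }

    std::string s_;
    std::size_t pos_ = 0;
};

// Convenience wrapper: parse one line into a flat list of commands.
std::vector<Command> parseLine(const std::string &text) {
    return LineParser(text).parse();
}
```

With this sketch, both "(a1 > :action)" and "(a1 (>) :action)" flatten to id, right, action, because a group may appear anywhere a command may and can itself contain a whole line.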
Here's a quick demo that includes error handling, optional debug output, both test cases and encapsulating the skipper as it is part of the grammar: Live On Compiler Explorer
#include <fmt/ranges.h>
#include <fmt/ostream.h>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;
namespace phx = boost::phoenix;
enum class Command { id, left, right, action };
static inline std::ostream& operator<<(std::ostream& os, Command cmd) {
switch (cmd) {
case Command::id: return os << "[ID]";
case Command::left: return os << "[LEFT]";
case Command::right: return os << "[RIGHT]";
case Command::action: return os << "[ACTION]";
}
return os << "[???]";
}
using Id = std::string;
using Line = std::vector<Command>;
using Script = std::vector<Line>;
template <typename It>
struct ExprGrammar : qi::grammar<It, Script()> {
ExprGrammar() : ExprGrammar::base_type(start) {
using namespace qi;
start = skip(blank)[script];
script = line % eol;
line = +simple;
simple = group | command;
group = '(' > line > ')';
command =
lit('<') [ _val = Command::left ] |
lit('>') [ _val = Command::right ] |
key [ _val = Command::id ] |
lit(":action") [ _val = Command::action ] ;
key = +char_("a-zA-Z_0-9");
BOOST_SPIRIT_DEBUG_NODES((command)(line)(simple)(group)(script)(key));
}
private:
qi::rule<It, Script()> start;
qi::rule<It, Line(), qi::blank_type> line, simple, group;
qi::rule<It, Script(), qi::blank_type> script;
qi::rule<It, Command(), qi::blank_type> command;
// lexemes
qi::rule<It, Id()> key;
};
int main() {
using It = std::string::const_iterator;
ExprGrammar<It> const p;
for (const std::string s : {
"a1 > :action\na1 (>) :action",
"(a1 > :action)\n(a1 (>) :action)",
"a1 (> :action)",
}) {
It f(begin(s)), l(end(s));
try {
Script parsed;
bool ok = qi::parse(f, l, p, parsed);
if (ok) {
fmt::print("Parsed {}\n", parsed);
} else {
fmt::print("Parse failed\n");
}
if (f != l) {
fmt::print("Remaining unparsed: '{}'\n", std::string(f, l));
}
} catch (qi::expectation_failure<It> const& ef) {
fmt::print("{}\n", ef.what()); // TODO add more details :)
}
}
}
Prints
Parsed {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
Parsed {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
Parsed {{[ID], [RIGHT], [ACTION]}}
BONUS
However, I think this can all be greatly simplified using qi::symbols for the commands. In fact it looks like you're only tokenizing (you confirm this when you say that the parentheses are not important).
line = +simple;
simple = group | command | (omit[key] >> attr(Command::id));
group = '(' > line > ')';
key = +char_("a-zA-Z_0-9");
Now you don't need Phoenix at all: Live On Compiler Explorer, printing
ok? true {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
ok? true {{[ID], [RIGHT], [ACTION]}, {[ID], [RIGHT], [ACTION]}}
ok? true {{[ID], [RIGHT], [ACTION]}}
Even Simpler?
Since I observe that you're basically tokenizing line-wise, why not simply skip the parentheses, and simplify all the way down to:
script = line % eol;
line = *(command | omit[key] >> attr(Command::id));
That's all. See it Live On Compiler Explorer again:
#include <array> // for std::array used in operator<<
#include <boost/spirit/include/qi.hpp>
#include <fmt/ostream.h>
#include <fmt/ranges.h>
namespace qi = boost::spirit::qi;
enum class Command { id, left, right, action };
using Id = std::string;
using Line = std::vector<Command>;
using Script = std::vector<Line>;
static inline std::ostream& operator<<(std::ostream& os, Command cmd) {
return os << (std::array{"ID", "LEFT", "RIGHT", "ACTION"}.at(int(cmd)));
}
template <typename It>
struct ExprGrammar : qi::grammar<It, Script()> {
ExprGrammar() : ExprGrammar::base_type(start) {
using namespace qi;
start = skip(skipper.alias())[line % eol];
line = *(command | omit[key] >> attr(Command::id));
key = +char_("a-zA-Z_0-9");
BOOST_SPIRIT_DEBUG_NODES((line)(key));
}
private:
using Skipper = qi::rule<It>;
qi::rule<It, Script()> start;
qi::rule<It, Line(), Skipper> line;
Skipper skipper = qi::char_(" \t\b\f()");
qi::rule<It /*, Id()*/> key; // omit attribute for efficiency
struct cmdsym : qi::symbols<char, Command> {
cmdsym() { this->add("<", Command::left)
(">", Command::right)
(":action", Command::action);
}
} command;
};
int main() {
using It = std::string::const_iterator;
ExprGrammar<It> const p;
for (const std::string s : {
"a1 > :action\na1 (>) :action",
"(a1 > :action)\n(a1 (>) :action)",
"a1 (> :action)",
})
try {
It f(begin(s)), l(end(s));
Script parsed;
bool ok = qi::parse(f, l, p, parsed);
fmt::print("ok? {} {}\n", ok, parsed);
if (f != l)
fmt::print(" -- Remaining '{}'\n", std::string(f, l));
} catch (qi::expectation_failure<It> const& ef) {
fmt::print("{}\n", ef.what()); // TODO add more details :)
}
}
Prints
ok? true {{ID, RIGHT, ACTION}, {ID, RIGHT, ACTION}}
ok? true {{ID, RIGHT, ACTION}, {ID, RIGHT, ACTION}}
ok? true {{ID, RIGHT, ACTION}}
Note that I very subtly changed +() to *() so it would accept empty lines as well. This may or may not be what you want.

How to rotate a bitvector in cvc4 using c++ API

I am trying to rotate a bitvector in cvc4 using the C++ API, but the API is a little bit confusing when it comes to operator expressions.
Using the following code (extract):
#include <iostream>
#include <cvc4/cvc4.h>
using namespace std;
using namespace CVC4;
int main() {
ExprManager em;
SmtEngine smt(&em);
smt.setLogic("QF_BV");
Type bitvector32 = em.mkBitVectorType(32);
Integer i = Integer(1, 10);
BitVector bv = BitVector(32, i);
Expr expr = em.mkConst(bv);
BitVectorRotateLeft bv_rl = BitVectorRotateLeft(1);
Expr e_bv_rl = em.mkConst(bv_rl);
Expr e_op_rl = em.operatorOf(kind::BITVECTOR_ROTATE_LEFT_OP);
Expr e_op_e = em.mkExpr(e_op_rl, e_bv_rl);
Expr e = em.mkExpr(Kind::BITVECTOR_ROTATE_LEFT, e_op_e, expr);
return 0;
}
Executing this yields:
terminate called after throwing an instance of 'CVC4::IllegalArgumentException'
what(): Illegal argument detected
CVC4::Expr CVC4::ExprManager::mkExpr(CVC4::Expr, CVC4::Expr)
`opExpr' is a bad argument; expected (opExpr.getKind() == kind::BUILTIN || kind::metaKindOf(kind) == kind::metakind::PARAMETERIZED) to hold
This Expr constructor is for parameterized kinds only
Aborted
Does anybody know how to deal with the operator construct of cvc4?
See below for the correct construction of a rotate left expression. In general, whenever you have an expression operator that is itself an expression, you apply it by simply calling mkExpr and passing the operator expression as the first argument.
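#include <iostream>
#include <cvc4/cvc4.h>
using namespace std;
using namespace CVC4;
int main() {
    ExprManager em;
    SmtEngine smt(&em);
    smt.setLogic("QF_BV");
    Type bitvector32 = em.mkBitVectorType(32);
    // 32-bit constant with value 1
    BitVector bv = BitVector(32, 1U);
    Expr expr = em.mkConst(bv);
    // The rotate-left operator (by 1 bit) is itself a constant expression
    BitVectorRotateLeft bv_rl = BitVectorRotateLeft(1);
    Expr e_bv_rl = em.mkConst(bv_rl);
    // Apply the operator expression to its operand
    Expr e = em.mkExpr(e_bv_rl, expr);
    cout << e;
    return 0;
}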
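As a sanity check for what the rotate-left expression should denote once the solver evaluates it, here is a plain C++ sketch of a 32-bit rotate. It does not use CVC4 at all; rotateLeft32 is a made-up helper name, and I am assuming the usual bitvector rotate semantics (bits shifted out on the left re-enter on the right).

```cpp
#include <cstdint>

// Rotate a 32-bit value left by n bit positions (n is reduced modulo 32).
std::uint32_t rotateLeft32(std::uint32_t value, unsigned n) {
    n %= 32u;
    if (n == 0) return value;  // avoid the undefined shift by 32 below
    return (value << n) | (value >> (32u - n));
}
```

Under that assumption, the constant 1 rotated left by 1 is 2, and 0x80000000 rotated left by 1 wraps around to 1.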

boost::spirit::karma alternative selection based on properties of the input

I'm trying to write a boost::spirit::karma generator where some of the output depends on non-trivial properties of the input values.
The actual problem is part of a larger grammar, but this example has the same properties as several of the other troublesome rules and is actually one of the grammar rules that is causing me trouble.
I'll start with a minimal example that is almost what I want, and then work from there.
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/home/phoenix.hpp>
#include <boost/fusion/adapted.hpp>
#include <string>
#include <vector>
template<typename OutputIterator_T>
struct Test_Grammar :
boost::spirit::karma::grammar<OutputIterator_T, std::vector<double>()>
{
Test_Grammar() : Test_Grammar::base_type(start), start(), value()
{
namespace karma = boost::spirit::karma;
start
= *(value % karma::lit(", "))
;
value
= (karma::double_)
;
}
boost::spirit::karma::rule<OutputIterator_T, std::vector<double>()> start;
boost::spirit::karma::rule<OutputIterator_T, double()> value;
};
template <typename OutputIterator_T>
bool generate_output(OutputIterator_T& sink, std::vector<double> const& data)
{
Test_Grammar<OutputIterator_T> grammar;
return (boost::spirit::karma::generate(sink, grammar, data));
}
int main (int, char**)
{
std::string generated;
std::back_insert_iterator<std::string> sink(generated);
std::vector<double> data{1.5, 0.0, -2.5,
std::numeric_limits<float>::quiet_NaN(),
std::numeric_limits<float>::infinity()};
generate_output(sink, data);
std::cout << generated << std::endl;
return 0;
}
The above code defines a grammar that, when fed with the test data, produces the output
1.5, 0.0, -2.5, nan, inf
However, the output that I want is
1.5, 0.0, -2.5, special, special
If I replace the value part of the grammar with
value
= (&karma::double_(std::numeric_limits<double>::quiet_NaN()) <<
karma::lit("special"))
| (&karma::double_(std::numeric_limits<double>::infinity()) <<
karma::lit("special"))
| (karma::double_)
;
I get the desired behavior for infinity. However, I do not get the desired result for NaN since NaN has the property that (NaN != NaN) in comparisons. So I need a way to use the fpclassify macros/functions such as isfinite().
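The NaN comparison problem can be seen in plain C++ without Karma. The sketch below is only an illustration of the predicate, not a Karma generator: comparesEqualToNaN and renderValues are hypothetical helper names, and the number formatting is whatever std::ostringstream does by default, so it will not match Karma's output exactly.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Always false, even when x is NaN: NaN never compares equal to anything.
bool comparesEqualToNaN(double x) {
    return x == std::numeric_limits<double>::quiet_NaN();
}

// Hypothetical helper: render non-finite values (NaN, +/-infinity) as
// "special" by classifying with std::isfinite instead of comparing values.
std::string renderValues(const std::vector<double> &data) {
    std::ostringstream out;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i != 0) out << ", ";
        if (std::isfinite(data[i])) out << data[i];
        else out << "special";
    }
    return out.str();
}
```

This is why an equality-based predicate catches infinity but not NaN, while a classification function such as std::isfinite handles both.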
I should be able to get what I want by replacing the value part of the grammar with
value
= (karma::eps(...) << karma::lit("special"))
| (karma::double_)
;
However, every combination of function calls, function pointers, and bind incantations that I've tried for the ... part has resulted in compiler errors.
Any help would be much appreciated.
UPDATE:
Sehe provided an excellent general solution (which I have accepted). Thank you!
For my particular use case, I was able to further simplify sehe's answer and wanted to document that here for others.
After changing all of the includes from <boost/spirit/home/*> to <boost/spirit/include/*> and defining BOOST_SPIRIT_USE_PHOENIX_V3 before those includes, I added the following line
BOOST_PHOENIX_ADAPT_FUNCTION(bool, isfinite_, std::isfinite, 1)
and changed the value part of the grammar to this
value
%= karma::double_[karma::_pass = isfinite_(karma::_1)]
| karma::lit("special")
;
I'd use the semantic action to dynamically "fail" the double_ generator:
value
%= karma::double_ [ karma::_pass = !(isnan_(karma::_1) || isinf_(karma::_1)) ]
| karma::lit("special")
;
Now, how do we get isnan_ and isinf_ implemented? I prefer to use Phoenix V3 (which will be the default in all coming releases of Boost):
BOOST_PHOENIX_ADAPT_FUNCTION(bool, isnan_, std::isnan, 1)
BOOST_PHOENIX_ADAPT_FUNCTION(bool, isinf_, std::isinf, 1)
That's all. See it Live On Coliru
Notes
use %= to get automatic attribute propagation even though there is a semantic action
include include/*.hpp instead of home/*.hpp
Full Listing:
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/phoenix_function.hpp>
#include <boost/fusion/adapted.hpp>
#include <cmath>
#include <iostream>
#include <iterator>
#include <limits>
#include <string>
#include <vector>
BOOST_PHOENIX_ADAPT_FUNCTION(bool, isnan_, std::isnan, 1)
BOOST_PHOENIX_ADAPT_FUNCTION(bool, isinf_, std::isinf, 1)
template<typename OutputIterator_T>
struct Test_Grammar :
boost::spirit::karma::grammar<OutputIterator_T, std::vector<double>()>
{
Test_Grammar() : Test_Grammar::base_type(start), start(), value()
{
namespace karma = boost::spirit::karma;
namespace phx = boost::phoenix;
start
= *(value % karma::lit(", "))
;
value
%= karma::double_ [ karma::_pass = !(isnan_(karma::_1) || isinf_(karma::_1)) ]
| karma::lit("special")
;
}
boost::spirit::karma::rule<OutputIterator_T, std::vector<double>()> start;
boost::spirit::karma::rule<OutputIterator_T, double()> value;
};
template <typename OutputIterator_T>
bool generate_output(OutputIterator_T& sink, std::vector<double> const& data)
{
Test_Grammar<OutputIterator_T> grammar;
return (boost::spirit::karma::generate(sink, grammar, data));
}
int main (int, char**)
{
std::string generated;
std::back_insert_iterator<std::string> sink(generated);
std::vector<double> data{1.5, 0.0, -2.5,
std::numeric_limits<float>::quiet_NaN(),
std::numeric_limits<float>::infinity()};
generate_output(sink, data);
std::cout << generated << std::endl;
return 0;
}
Output
1.5, 0.0, -2.5, special, special

In boost::spirit::lex, the first parse takes the longest time; subsequent parses are much shorter

I feed a series of texts into my SIP parser. The first one takes the longest time, no matter which one comes first. Is there some initialization work when spirit::lex does the first parse?
template <typename Lexer>
struct sip_token : lex::lexer<Lexer>
{
sip_token()
{
this->self.add_pattern
("KSIP", "sip:")
("KSIPS", "sips:")
("USERINFO", "[0-9a-zA-Z-_.!~*'()]+(:[0-9a-zA-Z-_.!~*'()&=+$,]*)?#")
("DOMAINLBL", "([0-9a-zA-Z]|([0-9a-zA-Z][0-9a-zA-Z-]*[0-9a-zA-Z]))")
("TOPLBL", "[a-zA-Z]|([a-zA-Z][0-9a-zA-Z-]*[0-9a-zA-Z-])")
("INVITE", "INVITE")
("ACK", "ACK")
("OPTIONS", "OPTIONS")
("BYE", "BYE")
("CANCEL", "CANCEL")
("REGISTER", "REGISTER")
("METHOD", "({INVITE}|{ACK}|{OPTIONS}|{BYE}|{CANCEL}|{REGISTER})")
("SIPVERSION", "SIP\\/[0-9]\\.[0-9]")
("PROTOCOAL", "SIP\\/[^/]+\\/UDP")
("IPV4ADDR", "(\\d{1,3}\\.){3}\\d{1,3}")
("HOSTNAME", "[^ \t\r\n]+")
("SIPURL", "{KSIP}{USERINFO}?{HOSTNAME}(:[0-9]+)?")
("SIPSURL", "{KSIPS}{USERINFO}?{HOSTNAME}(:[0-9]+)?")
("SENTBY", "({HOSTNAME}|{IPV4ADDR})(:[0-9]+)?")
("GENPARM", "[^ ;\\n]+=[^ ;\r\\n]+")
("TOKEN", "[0-9a-zA-Z-.!%*_+~`']+")
("NAMEADDR", "({TOKEN} )?<({SIPURL}|{SIPSURL})>")
("STATUSCODE", "\\d{3}")
("REASONPHRASE", "[0-9a-zA-Z-_.!~*'()&=+$,]*")
("CR", "\\r")
("LF", "\\n")
;
this->self.add
("{METHOD} {SIPURL} {SIPVERSION}", T_REQ_LINE)
("{SIPVERSION} {STATUSCODE} {REASONPHRASE}", T_STAT_LINE)
("{CR}?{LF}", T_CRLF)
("Via: {PROTOCOAL} {SENTBY}(;{GENPARM})*", T_VIA)
("To: {NAMEADDR}(;{GENPARM})*", T_TO)
("From: {NAMEADDR}(;{GENPARM})*", T_FROM)
("[0-9a-zA-Z -_.!~*'()&=+$,;/?:#]+", T_OTHER)
;
}
};
grammar:
template <typename Iterator>
struct sip_grammar : qi::grammar<Iterator>
{
template <typename TokenDef>
sip_grammar(TokenDef const& tok)
: sip_grammar::base_type(start)
{
using boost::phoenix::ref;
using boost::phoenix::size;
using boost::spirit::qi::eol;
start = request | response;
response = stat_line >> *(msg_header) >> qi::token(T_CRLF);
request = req_line >> *(msg_header) >> qi::token(T_CRLF);
stat_line = qi::token(T_STAT_LINE) >> qi::token(T_CRLF);
req_line = qi::token(T_REQ_LINE) >> qi::token(T_CRLF);
msg_header = (qi::token(T_VIA) | qi::token(T_TO) | qi::token(T_FROM) | qi::token(T_OTHER))
>> qi::token(T_CRLF);
}
std::size_t c, w, l;
qi::rule<Iterator> start, response, request, stat_line, req_line, msg_header;
};
timing:
gettimeofday(&t1, NULL);
bool r = lex::tokenize_and_parse(first, last, siplexer, g);
gettimeofday(&t2, NULL);
result:
pkt1 time=40945(us)
pkt2 time=140
pkt3 time=60
pkt4 time=74
pkt5 time=58
pkt6 time=51
Clearly, it does :)
Lex will likely generate a DFA (one for each Lexer state, maybe). This is most likely the thing that takes the most time. Use a profiler to be certain :/
Now, you can
make sure the tables are initialized before first use, or
use the Static Lexer Model to prevent the startup cost
This means you'll write an 'extra' main to generate the DFA as C++ code:
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/lex_generate_static_lexertl.hpp>
#include <fstream>
#include "sip_token.hpp"
using namespace boost::spirit;
int main(int argc, char* argv[])
{
// create the lexer object instance needed to invoke the generator
sip_token<lex::lexertl::lexer<> > my_lexer; // the token definition
std::ofstream out(argc < 2 ? "sip_token_static.hpp" : argv[1]);
// invoke the generator, passing the token definition, the output stream
// and the name suffix of the tables and functions to be generated
//
// The suffix "sip" used below results in a type lexertl::static_::lexer_sip
// to be generated, which needs to be passed as a template parameter to the
// lexertl::static_lexer template (see word_count_static.cpp).
return lex::lexertl::generate_static_dfa(my_lexer, out, "sip") ? 0 : -1;
}
An example of the code generated is here (in the word-count example from the tutorial): http://www.boost.org/doc/libs/1_54_0/libs/spirit/example/lex/static_lexer/word_count_static.hpp
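For the first option (initializing before first use), the general idea is to trigger the lazy setup once, outside the timed path, for example by tokenizing a dummy message at startup. The sketch below only models that idea in plain C++: LazyTables is a made-up stand-in and does not reflect Spirit.Lex internals, and elapsedMicros is a hypothetical timing helper built on std::chrono rather than gettimeofday. Whether a warm-up parse removes the whole first-call cost in your program is an assumption to confirm with a profiler.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Stand-in for a lexer whose tables are built lazily on first use.
// (Hypothetical class; it does not model Spirit.Lex internals.)
class LazyTables {
public:
    bool built() const { return !table_.empty(); }

    // Simulates "tokenizing": builds the tables on the first call only.
    std::size_t use(std::size_t index) {
        if (table_.empty()) build();
        return table_[index % table_.size()];
    }

private:
    void build() {
        table_.resize(1u << 16);
        for (std::size_t i = 0; i < table_.size(); ++i) table_[i] = i * i;
    }

    std::vector<std::size_t> table_;
};

// Time one call in microseconds using a monotonic clock.
template <typename F>
long long elapsedMicros(F &&f) {
    auto t1 = std::chrono::steady_clock::now();
    f();
    auto t2 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
}
```

Calling use() once during startup moves the build cost out of the first measured call, so later timings reflect only the steady-state work.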

Parsing a SQL INSERT with Boost Spirit Classic

I'm trying to learn Boost Spirit and as an exercise, I've tried to parse a SQL INSERT statement using Boost Spirit Classic.
This is the string I'm trying to parse:
INSERT INTO example_tab (cola, colb, colc, cold) VALUES (vala, valb, valc, vald);
From this SELECT example I've created this little grammar:
struct microsql_grammar : public grammar<microsql_grammar>
{
template <typename ScannerT>
struct definition
{
definition(microsql_grammar const& self)
{
keywords = "insert", "into", "values";
chlit<> LPAREN('(');
chlit<> RPAREN(')');
chlit<> SEMI(';');
chlit<> COMMA(',');
typedef inhibit_case<strlit<> > token_t;
token_t INSERT = as_lower_d["insert"];
token_t INTO = as_lower_d["into"];
token_t VALUES = as_lower_d["values"];
identifier =
nocase_d
[
lexeme_d
[
(alpha_p >> *(alnum_p | '_'))
]
];
string_literal =
lexeme_d
[
ch_p('\'') >> +( anychar_p - ch_p('\'') )
>> ch_p('\'')
];
program = +(query);
query = insert_into_clause >> SEMI;
insert_into_clause = insert_clause >> into_clause;
insert_clause = INSERT >> INTO >> identifier >> LPAREN >> var_list_clause >> RPAREN;
into_clause = VALUES >> LPAREN >> var_list_clause >> RPAREN;
var_list_clause = list_p( identifier, COMMA );
}
rule<ScannerT> const& start() const { return program; }
symbols<> keywords;
rule<ScannerT> identifier, string_literal, program, query, insert_into_clause, insert_clause,
into_clause, var_list_clause;
};
};
Using a minimal function to test it:
void test_it(const string& example)
{
    microsql_grammar g;
    if (!parse(example.c_str(), g, space_p).full)
    {
        // point a - FAIL
        throw new exception();
    }
    // point b - OK
}
Unfortunately it always reaches point A and throws the exception. Since I'm new to this, I have no idea where my error lies. I have two questions:
What's the proper way to debug parsing errors when using Boost Spirit?
Why does parsing fail in this example?
To get visibility into what is failing to parse, assign the result of parse to a parse_info<>, then log/examine the parse_info<>::stop field, which in this case should be an iterator pointing at the position in your input string where the parser stopped, i.e. just past the last part that matched your grammar.
microsql_grammar g;
parse_info<std::string::const_iterator> result = parse(example.begin(), example.end(), g, space_p);
if (!result.full)
{
    // Print the part of the input that was parsed successfully
    std::string parsed(example.begin(), result.stop);
    std::cout << parsed << std::endl;
    // point a - FAIL
}
// point b - OK
Apologies if this doesn't compile, but it should be a starting point.
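Once you have the stop position, a small helper can make the failure location easier to read. This is a minimal sketch that does not depend on Spirit: describeStop is a hypothetical name, it takes the input and the iterator where parsing ended, and it assumes the input is a single line so that the caret lines up.

```cpp
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: describe where a parser stopped, given the input and
// the iterator at which parsing ended (e.g. the stop field of a parse result).
std::string describeStop(const std::string &input,
                         std::string::const_iterator stop) {
    const std::size_t offset =
        static_cast<std::size_t>(std::distance(input.begin(), stop));
    std::string report;
    report += "consumed " + std::to_string(offset) + " of " +
              std::to_string(input.size()) + " characters\n";
    report += input + "\n";
    // Caret under the first character that was not consumed.
    report += std::string(offset, ' ') + "^\n";
    return report;
}
```

Printing the returned string right at point A shows how far the grammar got, which usually narrows the problem down to one rule.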