Using OCaml to simplify an arithmetic expression

I am using OCaml to work with arithmetic expressions. Each expression is a string such as "1+2*(2-5)". How can I use OCaml to eliminate redundant parentheses?
For example if we get a string like "(2*(1-8))" we should output "2*(1-8)".
Thanks.

OCaml is just a programming language, not a symbolic algebra system. So you would solve this in OCaml just as in any general purpose language.
A full-blown solution would be to parse the expression into a tree, then walk the tree to produce the output. For this you need to analyze your string lexically (for which you can probably use the Str module), and then parse the tokens. You can code your own parser pretty easily, or you could go full force and use a parser generator like ocamlyacc.
For the relatively simple problem of reparenthesizing an arithmetic expression you can use a variant of the "shunting yard" algorithm, which in essence calculates a canonical, unparenthesized (RPN) form of an expression.
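A minimal sketch of the parse-then-print approach follows. It is written in C++ rather than OCaml, but the structure (a tree type, a small recursive parser, a printer that compares precedences) carries over directly to an OCaml variant type. The names Node, Parser, show and simplify are made up for this example, and it assumes non-negative integer literals, the four operators + - * /, and no whitespace. It also treats a*(b/c) and a*b/c as different, so it keeps parentheses on the right of - and / only.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Expression tree: a leaf holds a number's digits, an inner node an operator.
struct Node {
    char op = 0;                      // 0 for a leaf, otherwise one of + - * /
    std::string number;               // digits of a leaf
    std::unique_ptr<Node> lhs, rhs;
};

class Parser {
public:
    explicit Parser(const std::string& text) : s(text) {}

    std::unique_ptr<Node> parse() {
        auto tree = expr();
        if (pos != s.size()) throw std::runtime_error("unexpected character");
        return tree;
    }

private:
    std::string s;
    std::size_t pos = 0;

    bool peek(char a, char b) const { return pos < s.size() && (s[pos] == a || s[pos] == b); }

    static std::unique_ptr<Node> binary(std::unique_ptr<Node> l, char op, std::unique_ptr<Node> r) {
        auto n = std::make_unique<Node>();
        n->op = op;
        n->lhs = std::move(l);
        n->rhs = std::move(r);
        return n;
    }

    std::unique_ptr<Node> expr() {            // term (('+' | '-') term)*
        auto left = term();
        while (peek('+', '-')) {
            char op = s[pos++];
            auto right = term();
            left = binary(std::move(left), op, std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> term() {            // factor (('*' | '/') factor)*
        auto left = factor();
        while (peek('*', '/')) {
            char op = s[pos++];
            auto right = factor();
            left = binary(std::move(left), op, std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> factor() {          // number | '(' expr ')'
        if (pos < s.size() && s[pos] == '(') {
            ++pos;
            auto inner = expr();              // parentheses leave no trace in the tree
            if (pos >= s.size() || s[pos] != ')') throw std::runtime_error("expected ')'");
            ++pos;
            return inner;
        }
        std::size_t start = pos;
        while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) ++pos;
        if (start == pos) throw std::runtime_error("expected a number");
        auto leaf = std::make_unique<Node>();
        leaf->number = s.substr(start, pos - start);
        return leaf;
    }
};

int precedence(const Node& n) {
    if (n.op == 0) return 3;
    return (n.op == '+' || n.op == '-') ? 1 : 2;
}

// Prints the tree, adding parentheses only where precedence requires them.
std::string show(const Node& n) {
    if (n.op == 0) return n.number;
    int p = precedence(n);
    std::string l = show(*n.lhs), r = show(*n.rhs);
    if (precedence(*n.lhs) < p) l = "(" + l + ")";
    bool right_sensitive = (n.op == '-' || n.op == '/');
    int rp = precedence(*n.rhs);
    if (rp < p || (rp == p && right_sensitive)) r = "(" + r + ")";
    return l + n.op + r;
}

std::string simplify(const std::string& text) { return show(*Parser(text).parse()); }
```

With this sketch, simplify("(2*(1-8))") gives "2*(1-8)", while "1-(2-3)" keeps its parentheses because dropping them would change the value.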

Related

Intersection of two regular expressions in Golang using Ragel

The template of the function is as follows:
func GetIntersection(firstRegex string, secondRegex string) string {
...
}
I'm trying to use Ragel to get the intersection of two regular expressions. I'm not sure whether Ragel is the right tool, though. My last resort is implementing the conversions from regex to DFA and from DFA to regex, as well as the intersection of two DFAs, myself, but I would rather avoid that. I would highly appreciate pointers to any reliable libraries that solve the problem.
Ragel has an intersection operator (&). You can produce the intersection of any two regular expressions, so long as they are expressed in the Ragel syntax. There is no freely available automatic Ragel-to-regex translation. This is the kind of thing I've been producing for clients privately. Anyhow, see the manual for more information on intersection.
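For reference, the "last resort" the question mentions, intersecting two DFAs by hand, is usually done with the product construction. The sketch below is not Ragel and not Go. It is a small C++ illustration with a made-up Dfa type, and it covers only the DFA-to-DFA step, not the conversions from or to regex syntax.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A DFA over chars. A missing transition means the word is rejected.
struct Dfa {
    int start = 0;
    std::set<int> accepting;
    std::map<std::pair<int, char>, int> next;   // (state, symbol) -> state

    bool accepts(const std::string& word) const {
        int state = start;
        for (char c : word) {
            auto it = next.find({state, c});
            if (it == next.end()) return false;
            state = it->second;
        }
        return accepting.count(state) > 0;
    }
};

// Product construction: each state of the result stands for a pair of states,
// one from each input automaton. Only reachable pairs are created.
Dfa intersect(const Dfa& a, const Dfa& b) {
    Dfa out;
    std::map<std::pair<int, int>, int> id;      // pair of states -> product state
    std::vector<std::pair<int, int>> work;      // pairs in order of discovery

    auto intern = [&](std::pair<int, int> p) {
        auto it = id.find(p);
        if (it != id.end()) return it->second;
        int n = static_cast<int>(id.size());
        id[p] = n;
        work.push_back(p);
        // A pair accepts only if both components accept.
        if (a.accepting.count(p.first) && b.accepting.count(p.second)) out.accepting.insert(n);
        return n;
    };

    out.start = intern({a.start, b.start});
    for (std::size_t i = 0; i < work.size(); ++i) {
        std::pair<int, int> p = work[i];        // copy: work may grow below
        int from = id[p];
        for (const auto& ta : a.next) {
            if (ta.first.first != p.first) continue;
            char c = ta.first.second;
            auto tb = b.next.find({p.second, c});
            if (tb == b.next.end()) continue;   // b has no move on c
            out.next[{from, c}] = intern({ta.second, tb->second});
        }
    }
    return out;
}
```

The result accepts exactly the words that both inputs accept. Turning it back into a regular expression is a separate step that this sketch does not attempt.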

How to test if a string is a valid C++(ish) expression?

I am writing a program in C++ that needs to be able to test if a string (probably std::string) is a valid C++ expression. Variables can be checked if they have been declared (bool variableDeclared(std::string identifier)) and their type can also be checked (std::string variableType(std::string identifier)). The variableType function returns a string based on how it would be declared in C++ ("bool", "double", "char", etc).
The expression doesn't need to be evaluated but only tested to see if it is valid. The function only needs to support character literals, string literals, number literals, brackets, simple operators (+, -, *, /, ! (logic not), &&, ||, >, <, ==), and variables of type double, std::string (no function calls needed), bool and char. It is also not required to support string concatenation.
The desired result would be a function that is something like bool validExpression(std::string expression). It is also preferable that it allows me to modify the operations (for example I could change "==" to "equal-to").
How would I implement this? Is there a library that could do something like this, a regex statement or is it simply a matter of a long function with lots of if statements?
Formally, your situation is: you have a grammar which describes the language of expressions which you want to validate, and a word for which you want to determine whether it belongs in that language. This is a job for a parser of that language.
You could hand-cook something like a recursive-descent LL(1) parser, or use a tool to generate a parser. A well-known example of such a tool is Bison for generating LALR(1) parsers. Wikipedia has a long parser generator list.
Technical terms are used above mainly to provide entry points for googling.
You would start from defining your language more or less formally. (A language is a set of strings). A good way to define a language is to specify its context-free grammar. Describe additional conditions (like the requirement that variables must be declared, and of the right type) informally in prose.
The next step would be building a parser for your grammar specified at the previous step. There are several tools for building parsers from grammars automatically, from yacc/bison to boost::spirit.
After building and checking the parser, implement the informally-specified rules and plug them into your parser code/data.
Normally the next step, building an evaluator, would probably be the easiest part of writing a simple interpreter, but you say you don't need one.
Describing your language as "just like C++ only with certain bits taken out" could be a preliminary step to the sequence outlined above. It is however not recommended to start out from C++ if you can help it. C++ is an extremely hard language to specify formally, and its parsers tend to be rather hairy, due to its convoluted declaration syntax.
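To make the steps above concrete, here is a hedged sketch of a hand-written recursive-descent validator for the subset listed in the question. The class name Validator, the two-argument validExpression signature and the set of declared names are assumptions for illustration (the set stands in for the asker's variableDeclared). It checks syntax and declarations only; type rules are left out, and literals are checked loosely (a character literal may hold more than one character, and true/false are treated as ordinary names).

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

class Validator {
public:
    Validator(const std::string& text, const std::set<std::string>& declared)
        : s(text), vars(declared) {}

    bool valid() {
        if (!binary(0)) return false;
        skip();
        return pos == s.size();               // nothing may be left over
    }

private:
    std::string s;
    std::set<std::string> vars;
    std::size_t pos = 0;
    // Binary operators by precedence level, loosest first. Editing this table
    // changes the accepted spellings, e.g. "==" could become "equal-to".
    const std::vector<std::vector<std::string>> levels = {
        {"||"}, {"&&"}, {"=="}, {"<", ">"}, {"+", "-"}, {"*", "/"}};

    static bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
    static bool is_word(char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; }

    void skip() { while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos; }

    bool eat(const std::string& tok) {
        skip();
        if (s.compare(pos, tok.size(), tok) != 0) return false;
        pos += tok.size();
        return true;
    }

    // operand (op operand)* for the operators of one level
    bool binary(std::size_t level) {
        if (level == levels.size()) return unary();
        if (!binary(level + 1)) return false;
        for (;;) {
            bool matched = false;
            for (const auto& op : levels[level])
                if (eat(op)) { matched = true; break; }
            if (!matched) return true;
            if (!binary(level + 1)) return false;
        }
    }

    bool unary() { return eat("!") ? unary() : primary(); }

    bool primary() {
        skip();
        if (pos >= s.size()) return false;
        char c = s[pos];
        if (c == '(') { ++pos; return binary(0) && eat(")"); }
        if (is_digit(c)) return number();
        if (c == '"' || c == '\'') return quoted(c);
        if (is_word(c)) return variable();
        return false;
    }

    bool number() {                           // digits with an optional fraction
        while (pos < s.size() && is_digit(s[pos])) ++pos;
        if (pos < s.size() && s[pos] == '.') {
            ++pos;
            while (pos < s.size() && is_digit(s[pos])) ++pos;
        }
        return true;
    }

    bool quoted(char quote) {
        ++pos;                                // opening quote
        while (pos < s.size() && s[pos] != quote) {
            if (s[pos] == '\\') ++pos;        // skip the escaped character
            ++pos;
        }
        if (pos >= s.size()) return false;    // unterminated literal
        ++pos;                                // closing quote
        return true;
    }

    bool variable() {
        std::size_t start = pos;
        while (pos < s.size() && is_word(s[pos])) ++pos;
        return vars.count(s.substr(start, pos - start)) > 0;
    }
};

bool validExpression(const std::string& expression, const std::set<std::string>& declared) {
    return Validator(expression, declared).valid();
}
```

The informally specified rules (for example "both sides of == must have the same type") would be plugged in by having each parsing function return a type instead of a bool.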
You can run the compiler as a sub-process of your application. All you have to do is pass the arguments and parse the response properly.

How to parse mathematical formulae from strings in c++

I want to write a program that takes a string like x^2+1 and understands it.
I want to ask the user to enter a function, and I want to be able to process and understand it. Any ideas?
char s[100] = "x*I+2";
double x = 5;
double I = 2;
double res = calc(s);
I think it could be done with some kind of string analysis, but I think that is too hard for me.
I have another idea: use tcc from the main program to compile, run, and then delete a separate program (or maybe a function) that contains the string s.
I would create a temp file every time, ask tcc to compile it, and run it with exec or something similar.
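/* tmp.cpp: */
#include <math.h>
#include <stdio.h>
int main(void) {
    double x = 5, I = 2;    /* values written by the generating program */
    printf("%f\n", x*I+2);  /* the user's formula, pasted in as text */
    return 0;
}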
The tmp.cpp file will be created dynamically.
Thanks in advance.
I am not sure what you expect. It's too complex to give the code as an answer, but the general idea is not very complex. It's not out of reach to code, even for a normal hobbyist programmer.
You need to define grammar, tokenize string, recognize operators, constants and variables.
Probably put expression into a tree. Make up a method for substituting the variables... and you can evaluate!
You need to have some kind of parser. The easiest way to make math operations parsable is to have them written in RPN. You can, however, write your own parser using parser libraries, like Spirit from Boost, or Yacc.
I have used function parser with success.
According to its website it also supports std::complex, but I have never used that.
As luck would have it, I recently wrote one!
Look for {,include/}lib/MathExpression/Term. It handles complex numbers but you can easily adapt it for plain old floats.
The licence is GPL 2.
The theory in brief, when you have an expression like
X*(X+2)
Your highest level parser can parse expressions of the form A + B + C... In this case A is the whole expression.
You recurse to parse an operator of higher precedence, A * B * C... In this case A is X and B is (X+2)
Keep recursing until you're parsing either basic tokens such as X or hit an opening parenthesis, in which case push some kind of stack to track where you are and recurse into the parentheses with the top-level low-precedence parser.
I recommend you use RAII and throw exceptions when there are parse errors.
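The theory above can be sketched roughly as follows. This is not the code from the library mentioned in the answer; it is a small stand-alone illustration. The names Calculator and calc and the map of variable values are assumptions, it handles only + - * /, parentheses, numbers and named variables (no ^ and no unary minus), and the call stack plays the role of the "stack" that tracks nested parentheses.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

class Calculator {
public:
    Calculator(const std::string& text, const std::map<std::string, double>& variables)
        : s(text), vars(variables) {}

    double run() {
        double v = sum();
        skip();
        if (pos != s.size())
            throw std::runtime_error("unexpected input at position " + std::to_string(pos));
        return v;
    }

private:
    std::string s;
    std::map<std::string, double> vars;
    std::size_t pos = 0;

    static bool is_digit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
    static bool is_word(char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; }

    void skip() { while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos; }

    bool eat(char c) {
        skip();
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }

    // Lowest precedence: A + B - C ...
    double sum() {
        double v = product();
        for (;;) {
            if (eat('+')) v += product();
            else if (eat('-')) v -= product();
            else return v;
        }
    }

    // Higher precedence: A * B / C ...
    double product() {
        double v = atom();
        for (;;) {
            if (eat('*')) v *= atom();
            else if (eat('/')) v /= atom();
            else return v;
        }
    }

    // Basic tokens, or a parenthesized expression parsed by the top-level rule.
    double atom() {
        if (eat('(')) {
            double v = sum();
            if (!eat(')')) throw std::runtime_error("expected ')'");
            return v;
        }
        std::size_t start = pos;              // eat() has already skipped spaces
        if (pos < s.size() && is_digit(s[pos])) {
            while (pos < s.size() && is_digit(s[pos])) ++pos;
            if (pos < s.size() && s[pos] == '.') {
                ++pos;
                while (pos < s.size() && is_digit(s[pos])) ++pos;
            }
            return std::stod(s.substr(start, pos - start));
        }
        while (pos < s.size() && is_word(s[pos])) ++pos;
        auto it = vars.find(s.substr(start, pos - start));
        if (it == vars.end()) throw std::runtime_error("unknown name or missing operand");
        return it->second;
    }
};

double calc(const std::string& text, const std::map<std::string, double>& variables) {
    return Calculator(text, variables).run();
}
```

For the question's example, calc("x*I+2", {{"x", 5.0}, {"I", 2.0}}) evaluates to 12, and malformed input such as "2+" raises std::runtime_error.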
Use a recursive descent parser.
Sample: it's in German, but a small and powerful solution
look here
Here is exactly what you are searching for. Change the function read_varname to detect a variable like 'x' or 'I'.

Regular Expression Vs. String Parsing

At the risk of opening a can of worms and getting negative votes, I find myself needing to ask:
When should I use Regular Expressions and when is it more appropriate to use String Parsing?
And I'm going to need examples and reasoning as to your stance. I'd like you to address things like readability, maintainability, scaling, and probably most of all performance in your answer.
I found another question Here that only had 1 answer that even bothered giving an example. I need more to understand this.
I'm currently playing around in C++, but regular expressions are in almost every higher-level language. I'd also like to know how different languages use and handle regular expressions, but that's more of an afterthought.
Thanks for the help in understanding it!
Edit: I'm still looking for more examples and talk on this but the response so far has been great. :)
It depends on how complex the language you're dealing with is.
Splitting
This is great when it works, but only works when there are no escaping conventions.
It does not work for CSV for example because commas inside quoted strings are not proper split points.
foo,bar,baz
can be split, but
foo,"bar,baz"
cannot.
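A small sketch of the difference, assuming CSV-style double quotes with "" as the escape for a quote inside a quoted field. The function names split_naive and split_quoted are made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Plain splitting: every comma is a field boundary.
std::vector<std::string> split_naive(const std::string& line) {
    std::vector<std::string> fields(1);
    for (char c : line) {
        if (c == ',') fields.emplace_back();
        else fields.back() += c;
    }
    return fields;
}

// Quote-aware splitting: commas between double quotes belong to the field.
std::vector<std::string> split_quoted(const std::string& line) {
    std::vector<std::string> fields(1);
    bool in_quotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        char c = line[i];
        if (in_quotes && c == '"' && i + 1 < line.size() && line[i + 1] == '"') {
            fields.back() += '"';             // doubled quote inside a quoted field
            ++i;
        } else if (c == '"') {
            in_quotes = !in_quotes;           // opening or closing quote
        } else if (c == ',' && !in_quotes) {
            fields.emplace_back();            // a real field boundary
        } else {
            fields.back() += c;
        }
    }
    return fields;
}
```

On foo,"bar,baz" the naive version returns three fields with stray quote characters, while the quote-aware version returns the two fields foo and bar,baz. The extra state (in_quotes) is exactly what a plain split cannot express.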
Regular
Regular expressions are great for simple languages that have a "regular grammar". Perl 5 regular expressions are a little more powerful due to back-references but the general rule of thumb is this:
If you need to match brackets ((...), [...]) or other nesting like HTML tags, then regular expressions by themselves are not sufficient.
You can use regular expressions to break a string into a known number of chunks -- for example, pulling out the month/day/year from a date. They are the wrong tool for parsing complicated arithmetic expressions though.
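As an illustration of the "known number of chunks" case, here is a sketch using std::regex. The Date struct, the function name parse_date and the M/D/YYYY layout are assumptions for the example. The pattern only checks the shape of the string, so something like 13/45/0000 still matches and would need a separate range check.

```cpp
#include <regex>
#include <string>

struct Date {
    int month;
    int day;
    int year;
};

// Returns true and fills `out` when `text` has the form M/D/YYYY.
bool parse_date(const std::string& text, Date& out) {
    // Three capture groups: one or two digits, one or two digits, four digits.
    static const std::regex pattern(R"((\d{1,2})/(\d{1,2})/(\d{4}))");
    std::smatch m;
    if (!std::regex_match(text, m, pattern)) return false;
    out.month = std::stoi(m[1].str());
    out.day = std::stoi(m[2].str());
    out.year = std::stoi(m[3].str());
    return true;
}
```

The number of groups is fixed in advance, which is why this works. An arithmetic expression such as (1+2)/3 has no fixed number of parts, so the same approach does not extend to it.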
Obviously, if you write a regular expression, walk away for a cup of coffee, come back, and can't easily understand what you just wrote, then you should look for a clearer way to express what you're doing. Email addresses are probably at the limit of what one can correctly & readably handle using regular expressions.
Context free
Parser generators and hand-coded pushdown/PEG parsers are great for dealing with more complicated input where you need to handle nesting so you can build a tree or deal with operator precedence or associativity.
Context free parsers often use regular expressions to first break the input into chunks (spaces, identifiers, punctuation, quoted strings) and then use a grammar to turn that stream of chunks into a tree form.
The rule of thumb for CF grammars is
If regular expressions are insufficient but all words in the language have the same meaning regardless of prior declarations then CF works.
Non context free
If words in your language change meaning depending on context, then you need a more complicated solution. These are almost always hand-coded solutions.
For example, in C,
#ifdef X
typedef int foo;
#endif
foo * bar;
If foo is a type, then foo * bar is the declaration of a foo pointer named bar. Otherwise it is a multiplication of a variable named foo by a variable named bar.
It should be regular expressions AND string parsing.
You can use both of them to your advantage! Programmers often try to write a SINGLE regular expression to parse a text and then find it very difficult to maintain. Use both as and when required.
The regex engine is FAST. A simple match takes less than a microsecond. But it's not recommended for parsing HTML.

Expression Evaluation in C++

I'm writing some excel-like C++ console app for homework.
My app should be able to accept formulas for its cells, for example it should evaluate something like this:
Sum(tablename\fieldname[recordnumber], fieldname[recordnumber], ...)
tablename\fieldname[recordnumber] points to a cell in another table,
fieldname[recordnumber] points to a cell in current table
or
Sin(fieldname[recordnumber])
or
anotherfieldname[recordnumber]
or
"10" // (simply a number)
something like that.
functions are Sum, Ave, Sin, Cos, Tan, Cot, Mul, Div, Pow, Log (10), Ln, Mod
It's pathetic, I know, but it's my homework :'(
So does anyone know a trick to evaluate something like this?
Ok, nice homework question by the way.
It really depends on how heavy you want this to be. You can create a full expression parser (which is fun but also time consuming).
In order to do that, you need to describe the full grammar and write a frontend (have a look at lex and yacc, or flex and bison).
But as I see your question you can limit yourself to three subcases:
a simple value
a lookup (possibly into another table)
a function whose inputs are lookups
I think a little OO design can help you out here.
I'm not sure whether you have to deal with real-time refresh and circular dependency checks. If you do, those can be tricky too.
For the parsing, I'd look at Recursive descent parsing. Then have a table that maps all possible function names to function pointers:
struct FunctionTableEntry {
    std::string name;      // function name as written in a formula, e.g. "Sin"
    double (*f)(double);   // pointer to the implementation
};
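A sketch of how such a table might be filled and searched. The table contents, the name kFunctions and the helper find_function are assumptions for illustration. Only the one-argument functions from the question fit this signature; Sum, Ave, Mul and the other multi-argument functions would need a different entry type.

```cpp
#include <cmath>
#include <string>

struct FunctionTableEntry {
    std::string name;
    double (*f)(double);
};

// Captureless lambdas convert to plain function pointers, which avoids taking
// the address of overloaded standard library functions directly.
const FunctionTableEntry kFunctions[] = {
    {"Sin", [](double x) { return std::sin(x); }},
    {"Cos", [](double x) { return std::cos(x); }},
    {"Tan", [](double x) { return std::tan(x); }},
    {"Cot", [](double x) { return 1.0 / std::tan(x); }},
    {"Log", [](double x) { return std::log10(x); }},
    {"Ln",  [](double x) { return std::log(x); }},
};

// Linear search is fine for a handful of entries; returns nullptr if unknown.
const FunctionTableEntry* find_function(const std::string& name) {
    for (const auto& entry : kFunctions)
        if (entry.name == name) return &entry;
    return nullptr;
}
```

The parser would call find_function with the identifier it has just read and, if the result is not null, apply entry->f to the evaluated argument.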
You should write a parser. The parser should take the expression (that is, each line), identify the command, and construct the parse tree. This is the first phase. In the second phase you evaluate the tree by substituting the data for each element of the command.
Previous responders have hit it on the head: you need to parse the cell contents, and interpret them.
StackOverflow already has a whole slew of questions on building compilers and interpreters where you can find pointers to resources. Some of them are:
Learning to write a compiler (#1669 people!)
Learning Resources on Parsers, Interpreters, and Compilers
What are good resources on compilation?
References Needed for Implementing an Interpreter in C/C++
...
and so on.
Aside: I never have the energy to link them all together, or even try to build a comprehensive list.
I guess you cannot use yacc/lex (or the like), so you have to parse "manually":
Iterate over the string and divide it into its parts. What counts as a part depends on your grammar (syntax). That way you can find the function names and the parameters. How difficult this is depends on the complexity of your syntax.
Maybe you should read a bit about lexical analysis.
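A minimal sketch of that "divide it into its parts" step for the formula syntax in the question. The token kinds and the names Kind, Token and tokenize are assumptions for illustration; it recognizes names, numbers and the punctuation ( ) [ ] , \ and rejects anything else.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class Kind { Name, Number, Symbol };

struct Token {
    Kind kind;
    std::string text;
};

std::vector<Token> tokenize(const std::string& s) {
    std::vector<Token> out;
    std::size_t i = 0;
    auto is = [&](int (*pred)(int)) {
        return i < s.size() && pred(static_cast<unsigned char>(s[i])) != 0;
    };
    while (i < s.size()) {
        if (is(std::isspace)) { ++i; continue; }     // spaces separate tokens
        std::size_t start = i;
        if (is(std::isalpha)) {                      // Sum, fieldname, tablename
            while (is(std::isalnum)) ++i;
            out.push_back({Kind::Name, s.substr(start, i - start)});
        } else if (is(std::isdigit)) {               // 10, 3.5
            while (is(std::isdigit) || (i < s.size() && s[i] == '.')) ++i;
            out.push_back({Kind::Number, s.substr(start, i - start)});
        } else if (std::string("()[],\\").find(s[i]) != std::string::npos) {
            out.push_back({Kind::Symbol, std::string(1, s[i])});
            ++i;
        } else {
            throw std::runtime_error("unexpected character at position " + std::to_string(i));
        }
    }
    return out;
}
```

A parser for the three subcases (value, lookup, function call) can then work on this token list instead of on raw characters.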