Infix to Postfix with function support - c++

There are many algorithms for converting infix to postfix all over the web, but how can one be extended to support functions, for example sin(x+y)*z?
I would appreciate some code.

If you are looking for an algorithm that converts infix to postfix with function call support, you can use the pseudocode below (which looks like Python code). I wrote this for my own case but have not yet tested it thoroughly. If you find any bugs, please let me know.
I have also written a Java implementation for the same.
Also, there are a few things to note about this implementation:
This algorithm assumes a stream of tokens in infix. It does not parse an expression string. So each token can be identified as an operand, operator, function call etc.
There are 7 different kinds of tokens:
Operands X, Y etc
Left Parenthesis - (
Right Parenthesis - )
Operators - +, *
Function call starts - sin(
Function call ends - the closing ) in sin( x )
Comma - ,
Function call starts are denoted by the [ character in the algorithm and function call ends are denoted by ]. Please note that function call termination is a different token from Right Parenthesis ), although they may be represented by the same character in the string expression.
Every operator is a binary operator with precedence and associativity as their usual meaning.
Comma , is a special binary operator with precedence of NEGATIVE INFINITY and associativity as LEFT (same as + and *). Comma operator is used to separate the arguments of a function call. So for a function call:
f(a,b,c)
first comma separates a and b
second comma separates a,b and c
So the postfix for the above will be
ab,c,f
You can view the Comma operator as an add-to-list function: it adds the second argument to the list specified by the first argument, or, if both are single values, it creates a list of two values.
Algorithm
infix_to_postfix(infix):
    postfix = []
    infix.add(')')
    stack = []
    stack.push('(')
    for each token in infix:
        if token is operand:
            postfix.add(token)
        else if token is '[':
            stack.push(token)
        else if token is operator:
            # pop every stacked operator that binds tighter than token,
            # or equally tight when token is left-associative
            while stack[top] is not '(' AND stack[top] is not '[' AND
                  ( stack[top]['precedence'] > token['precedence'] OR
                    ( stack[top]['precedence'] == token['precedence'] AND
                      token['associativity'] == 'LEFT' ) ):
                postfix.add(stack.pop())
            stack.push(token)
        else if token is '(':
            stack.push(token)
        else if token is ')':
            while (topToken = stack.pop()) is not '(':
                postfix.add(topToken)
        else if token is ']':
            while True:
                topToken = stack.pop()
                postfix.add(topToken)
                if topToken is '[':
                    break
        else if token is ',':
            while (topToken = stack.peek()) is not '[':
                postfix.add(topToken)
                stack.pop()
            stack.push(token)
    return postfix
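Since the question asked for C++, here is a minimal sketch of the same idea in C++. The Token, Kind and OpInfo types, the operator table, and the helper names are my own assumptions for illustration, not part of the pseudocode; like the pseudocode, it expects the tokens to be classified already (so a function name plus its opening parenthesis arrives as one FuncStart token and the matching close as FuncEnd), and it keeps the comma in the output.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token model; the pseudocode only assumes tokens arrive pre-classified.
enum class Kind { Operand, Operator, LParen, RParen, FuncStart, FuncEnd, Comma };

struct Token {
    Kind kind;
    std::string text;  // e.g. "x", "+", "sin", ","
};

struct OpInfo {
    int precedence;
    bool rightAssoc;
};

// Assumed operator table; an unknown operator makes at() throw std::out_of_range.
const OpInfo& opInfo(const std::string& op) {
    static const std::map<std::string, OpInfo> table = {
        {"+", {1, false}}, {"-", {1, false}},
        {"*", {2, false}}, {"/", {2, false}},
        {"^", {3, true}},
    };
    return table.at(op);
}

std::vector<Token> infixToPostfix(const std::vector<Token>& infix) {
    std::vector<Token> postfix;
    std::vector<Token> stack;
    for (const Token& tok : infix) {
        switch (tok.kind) {
        case Kind::Operand:
            postfix.push_back(tok);
            break;
        case Kind::LParen:
        case Kind::FuncStart:
            stack.push_back(tok);
            break;
        case Kind::Operator:
            // Pop stacked operators that bind tighter, or equally tight
            // when the incoming operator is left-associative.
            while (!stack.empty() && stack.back().kind == Kind::Operator) {
                const OpInfo& top = opInfo(stack.back().text);
                const OpInfo& cur = opInfo(tok.text);
                if (top.precedence < cur.precedence ||
                    (top.precedence == cur.precedence && cur.rightAssoc)) {
                    break;
                }
                postfix.push_back(stack.back());
                stack.pop_back();
            }
            stack.push_back(tok);
            break;
        case Kind::Comma:
            // Flush the finished argument (and any earlier comma) down to the function marker.
            while (!stack.empty() && stack.back().kind != Kind::FuncStart) {
                if (stack.back().kind == Kind::LParen)
                    throw std::runtime_error("comma inside parentheses");
                postfix.push_back(stack.back());
                stack.pop_back();
            }
            if (stack.empty()) throw std::runtime_error("comma outside a function call");
            stack.push_back(tok);
            break;
        case Kind::RParen:
        case Kind::FuncEnd: {
            const Kind opener = tok.kind == Kind::RParen ? Kind::LParen : Kind::FuncStart;
            while (!stack.empty() && stack.back().kind != opener) {
                postfix.push_back(stack.back());
                stack.pop_back();
            }
            if (stack.empty()) throw std::runtime_error("unbalanced expression");
            // The function marker is emitted after its arguments; a plain '(' is dropped.
            if (opener == Kind::FuncStart) postfix.push_back(stack.back());
            stack.pop_back();
            break;
        }
        }
    }
    while (!stack.empty()) {
        if (stack.back().kind != Kind::Operator) throw std::runtime_error("unbalanced expression");
        postfix.push_back(stack.back());
        stack.pop_back();
    }
    return postfix;
}

// Joins token texts with single spaces, for printing or comparing results.
std::string toString(const std::vector<Token>& tokens) {
    std::string out;
    for (const Token& t : tokens) {
        if (!out.empty()) out += ' ';
        out += t.text;
    }
    return out;
}
```

Feeding it the tokens of sin(x+y)*z should give x y + sin z *, and f(a,b,c) should give a b , c , f, matching the comma convention described above.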

That's quite easy: it works with functions too, because the regular operators you use (like +, -, *) are functions as well. Your problem is that what you consider a "function" (like sin) is written in prefix notation, not infix.
To come back to your problem: just convert these prefix functions into postfix beforehand (you should find prefix-to-postfix conversions on the web too; my assumption is that you didn't know the term "prefix").
EDIT: Basically, it is nothing more than converting the arguments first, outputting them in sequence, and appending the name of the function afterwards.

Although #mickeymoon's algorithm seems to work, I still had to make some adjustments (it didn't work for me as written), so another implementation (Java-like) may be helpful to somebody. It is based on https://en.wikipedia.org/wiki/Shunting-yard_algorithm
Stack<Token> stack = new Stack<>();
List<Token> result = new ArrayList<>();
// https://en.wikipedia.org/wiki/Shunting-yard_algorithm
// with a small adjustment for expressions in functions. The Wiki example works only for constants as arguments
for (Token token : tokens) {
    if (isNumber(token) || isIdentifier(token)) {
        result.add(token);
        continue;
    }
    if (isFunction(token)) {
        stack.push(token);
        continue;
    }
    // if OP (open parenthesis) then push onto the stack
    if (isOP(token)) {
        stack.push(token);
        continue;
    }
    // CP (close parenthesis): pop the stack to the result until OP
    if (isCP(token)) {
        Token cur = stack.pop();
        while (!isOP(cur)) {
            if (!isComma(cur)) {
                result.add(cur);
            }
            cur = stack.pop();
        }
        continue;
    }
    if (isBinaryOperation(token)) {
        // checking stack.empty() on every iteration avoids popping an empty stack
        while (!stack.empty()) {
            Token cur = stack.peek();
            boolean shouldPop = !isOP(cur)
                    && (!isBinaryOperation(cur)
                        || hasHigherPriority(cur, token)
                        || (hasEqualPriority(cur, token) && isLeftAssociative(token)));
            if (!shouldPop) {
                break;
            }
            // no need for commas in the resulting list if we know how many parameters the function needs
            if (!isComma(cur)) {
                result.add(cur);
            }
            stack.pop();
        }
        stack.push(token);
        continue;
    }
    if (isComma(token)) {
        // pop the operators of the finished argument; stop at OP or at the previous comma
        while (!stack.empty() && !(isOP(stack.peek()) || isComma(stack.peek()))) {
            result.add(stack.pop());
        }
        stack.push(token);
    }
}
while (!stack.empty()) {
    Token pop = stack.pop();
    if (!isComma(pop)) {
        result.add(pop);
    }
}
return result;
I tested it with various complex expressions, including function composition and complex arguments (which the example from the Wiki algorithm does not handle). A couple of examples (e is just a variable; min, max and rand are functions):
Input: (3.4+2^(5-e))/(1+5/5)
Output: 3.4 2 5 e - ^ + 1 5 / + /
Input: 2+rand(1.4+2, 3+4)
Output: 2 1.4 2 + 3 4 + rand +
Input: max(4+4,min(1*10,2+(3-e)))
Output: 4 4 + 1 10 * 2 3 e - + min max
I also tested it with a complex function with three arguments (where each argument is itself an expression) and it works fine.
Here is the github for my java function that takes the list of tokens and returns the list of tokens in postfix notation. And here is the function that takes the output from first function and calculates the value of the expression

The code you'll have to work out yourself. Using your specific case as an example might help get you started; the postfix form of sin(x + y) * z would be:
x y + sin z *
Note that in this one example some operations operate on two values (+ and *), and others on one (sin).
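To make the one-value versus two-value distinction concrete, here is a rough sketch of a postfix evaluator for exactly this example. The function name evalPostfix, the space-separated input format, and the variable map are assumptions made for illustration; only +, * and sin are handled.

```cpp
#include <cmath>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Evaluates a space-separated postfix string such as "x y + sin z *".
// Variables are looked up in vars (a hypothetical name -> value table).
double evalPostfix(const std::string& postfix, const std::map<std::string, double>& vars) {
    std::vector<double> st;
    auto pop = [&st]() {
        if (st.empty()) throw std::runtime_error("stack underflow");
        double v = st.back();
        st.pop_back();
        return v;
    };
    std::istringstream in(postfix);
    std::string tok;
    while (in >> tok) {
        if (tok == "+" || tok == "*") {
            // Binary operators consume two values; the right operand is on top.
            double rhs = pop();
            double lhs = pop();
            st.push_back(tok == "+" ? lhs + rhs : lhs * rhs);
        } else if (tok == "sin") {
            // A unary function consumes a single value.
            st.push_back(std::sin(pop()));
        } else {
            auto it = vars.find(tok);
            if (it == vars.end()) throw std::runtime_error("unknown token: " + tok);
            st.push_back(it->second);
        }
    }
    if (st.size() != 1) throw std::runtime_error("malformed expression");
    return st.back();
}
```

With x = 1, y = 2 and z = 3, evalPostfix("x y + sin z *", vars) computes sin(3) * 3.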

Binary operators like + can be considered as +(x,y).
Similarly, consider functions such as sin and cos as unary operators. So sin(x+y)*z can be written as x y + sin z *. You need to give these unary functions special treatment.

Related

How can I use a variable in another input statement?

I am asking the user to input an expression which will be evaluated in postfix notation. The beginning of the expression is the variable name where the answer of the evaluated expression will be stored. Ex: A 4 5 * 6 + 2 * 1 – 6 / 4 2 + 3 * * = where A is the variable name and the equal sign means the answer to the expression will be stored in the variable A. The OUT A statement means that the number stored in the variable A will be printed out.
What I need help with is that when I input the second expression, I do not get the right answer. For example, my first expression A 4 5 * 6 + 2 * 1 – 6 / 4 2 + 3 * * = will evaluate to 153 and then when I input my second expression B A 10 * 35.50 + =, it has to evaluate to 1565.5, but it doesn't. It evaluates to 35.5. I cannot figure out why I am getting the wrong answer. Also, I need help with the OUT statement.
else if (isalpha(expr1[i]))
{
stackIt.push(mapVars1[expr1[i]]);
}
Will place the variable, or zero if the variable has not been set, onto the stack.
else if (isalpha(expr1[i]))
{
map<char, double>::iterator found = mapVars1.find(expr1[i]);
if (found != mapVars1.end())
{
stackIt.push(found->second);
}
else
{
// error message and exit loop
}
}
Is probably better.
Other suggestions:
Compilers are pretty sharp these days, but you may get a bit out of char cur = expr1[i]; and then using cur (or suitably descriptive variable name) in place of the remaining expr1[i]s in the loop.
Consider using isdigit instead of expr1[i] >= '0' && expr1[i] <= '9'
Test your code for expressions with multiple spaces in a row or a space after an operator. It looks like you will re-add the last number you parsed.
Test for input like 123a456. You might not like the result.
If spaces after each token in the expression are specified in the expression protocol, placing your input string into a stringstream will allow you to remove a great deal of your parsing code.
stringstream in(expr1);
string token;
while (in >> token)
{
if (token == "+" || token == "-" || ...)
{
// operator code
}
else if (token == "=")
{
// equals code
}
else if (mapVars1.find(token) != mapVars1.end())
{
// push variable
}
else if (token.length() > 0)
{
char * endp;
double val = strtod(token.c_str(), &endp);
if (*endp == '\0')
{
// push val
}
}
}
To use previous symbol names in subsequent expressions add this to the if statements in your parsing loop:
else if (expr1[i] >= 'A' && expr1[i] <= 'Z')
{
stackIt.push(mapVars1[expr1[i]]);
}
Also you need to pass mapVars by reference to accumulate its contents across Eval calls:
void Eval(string expr1, map<char, double> & mapVars1)
For the output (or any) other command I would recommend parsing the command token that's at the front of the string first. Then call different evaluators based on the command string. You are trying to check for OUT right now after you have already tried to evaluate the string as an arithmetic assignment command. You need to make that choice first.
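Putting these suggestions together, a rough sketch might look like the following. It assumes every token is separated by whitespace, uses string variable names instead of the single chars in the question, and the names runLine, evalAssignment and Vars are made up for this example. The command token is inspected first, and the variable map is passed by reference so values persist between lines.

```cpp
#include <cstdlib>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

using Vars = std::map<std::string, double>;

// Evaluates the postfix tokens that follow the target name, up to "=".
double evalAssignment(std::istringstream& in, const Vars& vars) {
    std::vector<double> st;
    std::string tok;
    while (in >> tok) {
        if (tok == "=") {
            if (st.size() != 1) throw std::runtime_error("malformed expression");
            return st.back();
        }
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/") {
            if (st.size() < 2) throw std::runtime_error("missing operand");
            double rhs = st.back(); st.pop_back();
            double lhs = st.back(); st.pop_back();
            if (tok == "+") st.push_back(lhs + rhs);
            else if (tok == "-") st.push_back(lhs - rhs);
            else if (tok == "*") st.push_back(lhs * rhs);
            else st.push_back(lhs / rhs);
        } else if (vars.count(tok) != 0) {
            st.push_back(vars.at(tok));  // previously stored variable
        } else {
            char* end = nullptr;
            double val = std::strtod(tok.c_str(), &end);
            if (end == tok.c_str() || *end != '\0')
                throw std::runtime_error("bad token: " + tok);
            st.push_back(val);
        }
    }
    throw std::runtime_error("missing '='");
}

// Runs one input line. Returns the text to print for OUT, or an empty string otherwise.
std::string runLine(const std::string& line, Vars& vars) {
    std::istringstream in(line);
    std::string first;
    if (!(in >> first)) return "";
    if (first == "OUT") {  // decide on the command before evaluating anything
        std::string name;
        in >> name;
        auto it = vars.find(name);
        if (it == vars.end()) throw std::runtime_error("unknown variable: " + name);
        std::ostringstream out;
        out << it->second;
        return out.str();
    }
    double value = evalAssignment(in, vars);
    vars[first] = value;  // first token is the target variable
    return "";
}
```

Running the two expressions from the question (with an ASCII minus sign) and then OUT B against the same Vars object should leave 153 in A and 1565.5 in B.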

Need Help Understanding Recursive Prefix Evaluator

This is a piece of code I found in my textbook for using recursion to evaluate prefix expressions. I'm having trouble understanding this code and the process in which it goes through.
char *a; int i;
int eval()
{ int x = 0;
while (a[i] == ' ') i++;
if (a[i] == '+')
{ i++; return eval() + eval(); }
if (a[i] == '*')
{ i++; return eval() * eval(); }
while ((a[i] >= '0') && (a[i] <= '9'))
x = 10*x + (a[i++] - '0');
return x;
}
I guess I'm confused primarily with the return statements and how it eventually leads to solving a prefix expression. Thanks in advance!
The best way to understand recursive examples is to work through an example :
char* a = "+11 4"
First off, i is initialized to 0: it is a global without an explicit initializer, so it has static storage duration and is zero-initialized. Because i is global, updates to it affect all calls of eval().
i = 0, a[i] = '+'
there are no leading spaces, so the first while loop condition fails. The first if statement succeeds, i is incremented to 1 and eval() + eval() is executed. We'll evaluate these one at a time, and then come back after we have our results.
i = 1, a[1] = '1'
Again, no leading spaces, so the first while loop fails. The first and second if statements fail. In the last while loop, '1' is between '0' and '9' (based on ASCII value), so x becomes 10 * 0 + a[1] - '0', or 0 + 1 = 1. Important here is that i is incremented after a[i] is read. The next iteration of the while loop adds to x: x = 10 * 1 + a[2] - '0', or 10 + 1 = 11. The loop then stops at the space in a[3], so we exit eval() and return the value of the first operand, 11.
i = 3, a[3] = ' '
This time the first while loop skips the space, leaving i = 4 and a[4] = '4'. After that, as in the previous step, only the last while loop does any work: x = 10 * 0 + a[4] - '0', or 0 + 4 = 4. The loop stops at the end of the string, so we return 4.
At this point the control flow returns back to the original call to eval(), and now we have both values for the operands. We simply perform the addition to get 11 + 4 = 15, then return the result.
Every time eval() is called, it computes the value of the immediate next expression starting at position i, and returns that value.
Within eval:
The first while loop is just to ignore all the spaces.
Then there are 3 cases:
(a) Evaluate expressions starting with a + (i.e. an expression of the form A+B, which is "+ A B" in prefix)
(b) Evaluate expressions starting with a * (i.e. A*B = "* A B")
(c) Evaluate integer values (i.e. Any consecutive sequence of digits)
The while loop at the end takes care of case (c).
The code for case (a) is similar to that for case (b). Think about case (a):
If we encounter a + sign, it means we need to add the next two "things" we find in the sequence. The "things" might be numbers, or may themselves be expressions to be evaluated (such as X+Y or X*Y).
In order to get what these "things" are, the function eval() is called with an updated value of i. Each call to eval() will fetch the value of the immediate next expression, and update position i.
Thus, 2 successive calls to eval() obtain the values of the 2 following expressions.
We then apply the + operator to the 2 values, and return the result.
It will help to work through an example such as "+ * 2 3 * 4 5", which is prefix notation for (2*3)+(4*5).
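If you would rather run that example than trace it by hand, here is a small variant of the textbook function. It is a sketch under my own assumptions: the name evalPrefix is invented, and the position is passed by reference instead of living in a global, but the three cases are the same.

```cpp
#include <cstddef>
#include <string>

// Evaluates the prefix expression that starts at position i and
// leaves i just past the part that was consumed.
int evalPrefix(const std::string& a, std::size_t& i) {
    while (i < a.size() && a[i] == ' ') ++i;  // skip spaces
    if (i < a.size() && (a[i] == '+' || a[i] == '*')) {
        char op = a[i++];
        int lhs = evalPrefix(a, i);  // first "thing" after the operator
        int rhs = evalPrefix(a, i);  // second "thing", read from the updated position
        return op == '+' ? lhs + rhs : lhs * rhs;
    }
    int x = 0;
    while (i < a.size() && a[i] >= '0' && a[i] <= '9')
        x = 10 * x + (a[i++] - '0');  // accumulate consecutive digits
    return x;
}

// Convenience overload that starts at the beginning of the string.
int evalPrefix(const std::string& a) {
    std::size_t i = 0;
    return evalPrefix(a, i);
}
```

Calling evalPrefix("+ * 2 3 * 4 5") evaluates (2*3)+(4*5), i.e. 26.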
So this piece of code can only eat +, *, spaces and numbers. It is supposed to eat one command which can be one of:
- + <op1> <op2>
- * <op1> <op2>
- <number>
It gets a pointer to a string, and a reading position which is incremented as the program goes along that string.
char *a; int i;
int eval()
{ int x = 0;
while (a[i] == ' ') i++; // it eats all spaces
if (a[i] == '+')
/* if the program encounters '+', two operands are expected next.
The reading position i already points just before the place
from which you have to start reading the next operand
(which is what first eval() call will do).
After the first eval() is finished,
the reading position is moved to the begin of the second operand,
which will be read during the second eval() call. */
{ i++; return eval() + eval(); }
if (a[i] == '*') // exactly the same, but for '*' operation.
{ i++; return eval() * eval(); }
while ((a[i] >= '0') && (a[i] <= '9')) // here it eats all digit until something else is encountered.
x = 10*x + (a[i++] - '0'); // every time the new digit is read, it multiplies the previously obtained number by 10 and adds the new digit.
return x;
// base case: returning the number. Note that the reading position already moved past it.
}
The example you are given uses a couple of global variables. They persist outside of the function's scope and must be initialized before calling the function.
i should be initialized to 0 so that you start at the beginning of the string, and the prefix expression is the string in a.
The operator is your prefix, so it should be the first non-blank character. If the string starts with a number (a string of digits) instead, you are done: that number is the result.
example: a = " + 15 450"
eval() finds '+' at i = 1
calls eval()
which finds '1' at i = 3 and then '5'
calculates x = 1 x 10 + 5
returns 15
calls eval()
which finds '4' at i = 6 and then '5' and then '0'
calculates x = (((4 x 10) + 5) x 10) + 0
returns 450
calculates the '+' operator of 15 and 450
returns 465
The returns are either a value found or the result of an operator and the succeeding results found. So recursively, the function successively looks through the input string and performs the operations until either the string ends or an invalid character is found.
Rather than breaking the code up into chunks, I'll try to explain the concept as simply as possible.
The eval function always skips spaces so that it points to either a number character ('0'->'9'), an addition ('+') or a multiply ('*') at the current place in the expression string.
If it encounters a number, it proceeds to continue to eat the number digits, until it reaches a non-number digit returning the total result in integer format.
If it encounters operator ('+' and '*') it requires two integers, so eval calls itself twice to get the next two numbers from the expression string and returns that result as an integer.
One hair in the soup may be evaluation order, cf. https://www.securecoding.cert.org/confluence/display/seccode/EXP10-C.+Do+not+depend+on+the+order+of+evaluation+of+subexpressions+or+the+order+in+which+side+effects+take+place.
It is not specified which eval in "eval() + eval()" is, well, evaluated first. That's ok for commutative operators but will fail for - or /, because eval() as a side effect advances the global position counter so that the (in time) second eval gets the (in space) second expression. But that may well be the (in space) first eval.
I think the fix is easy; assign to a temp and compute with that:
if (a[i] == '-')
{ i++; int tmp = eval(); return tmp - eval(); }
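Applied to the whole function, that might look like the sketch below. It keeps the textbook's global a and i, adds '-' as an assumed extra operator, and stores both operands in named temporaries so the left operand is always read first, whatever order the compiler would pick for eval() - eval().

```cpp
const char* a = "";  // expression text (global, as in the textbook code)
int i = 0;           // current reading position

int eval() {
    int x = 0;
    while (a[i] == ' ') i++;
    char op = a[i];
    if (op == '+' || op == '-' || op == '*') {
        i++;
        int lhs = eval();  // sequenced first: reads the operand that comes first in the text
        int rhs = eval();  // sequenced second: reads the following operand
        if (op == '+') return lhs + rhs;
        if (op == '-') return lhs - rhs;
        return lhs * rhs;
    }
    while ((a[i] >= '0') && (a[i] <= '9'))
        x = 10 * x + (a[i++] - '0');
    return x;
}
```

To use it, point a at the expression and reset i to 0 before each call; for example "- 10 3" then evaluates to 7 rather than possibly -7.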

Which Data Structure used to solve a simple math equation

Suppose you take in an expression like (10+5*15) and have to follow the order of operations.
How would one best solve a problem like this? What kind of data structure is best?
Thanks.
I'd go with Dijkstra's Shunting yard algorithm to create the AST.
Try parsing the expression using recursive descent. This would give you a parse tree respecting order of operations.
The usual data structure for this task is a stack. When you're doing things like compiling, creating an abstract syntax tree is useful, but for simple evaluation it's usually overkill.
Think about it for a second - what is an operator? Pretty much all of the operators (+, -, *, /) are binary operators. Parentheses are depth constructors; you move one level deeper with each parenthesis.
In fact, constructing the tree of data you need to solve this problem is going to be your biggest hurdle.
It's in Java, but this seems to convert from infix to postfix, and then evaluates using a stack-based approach. It puts numbers onto the stack, reaches operators, and then pops the two numbers from the stack to evaluate them with the operator (x + / -).
http://enel.ucalgary.ca/People/Norman/enel315_winter1999/lab_solutions/lab5sol/exF/Calculator.java
The conversion is as follows:
Scan the infix string from left to right.
Initialise an empty stack.
If the scanned character is an operand, add it to the postfix string. If the scanned character is an operator and the stack is empty, push the character onto the stack.
If the scanned character is an operator and the stack is not empty, compare the precedence of the character with the element on top of the stack (topStack). If topStack has higher precedence than the scanned character, pop the stack and add the popped operator to the postfix string; otherwise push the scanned character onto the stack. Repeat this step as long as the stack is not empty and topStack has precedence over the character.
Repeat these steps until all the characters are scanned.
After all characters are scanned, add whatever the stack still holds to the postfix string: if the stack is not empty, add topStack to the postfix string and pop the stack. Repeat this step as long as the stack is not empty.
Return the postfix string.
Evaluate the postfix string.
If you need to simply compute the result of the expression that is available as a string then I'd go with no data structure at all and just functions like:
//
// expression ::= addendum [ { "-" | "+" } addendum ]
// addendum ::= factor [ { "*" | "/" } factor ]
// factor ::= { number | sub-expression | "-" factor }
// sub-expression ::= "(" expression ")"
// number ::= digit [ digit ]
// digit ::= { "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" }
//
int calcExpression(const char *& p);
int calcDigit(const char *& p);
int calcNumber(const char *& p);
int calcFactor(const char *& p);
int calcAddendum(const char *& p);
where each function accepts a const char * by reference, reads from it (incrementing the pointer), and returns the numeric value of the result, throwing an exception in case of problems.
This approach doesn't need any data structure because it uses the C++ call stack for intermediate results. As an example...
int calcDigit(const char *& p)
{
if (*p >= '0' && *p <= '9')
return *p++ - '0';
throw std::runtime_error("Digit expected");
}
int calcNumber(const char *& p)
{
int acc = calcDigit(p);
while (*p >= '0' && *p <= '9')
acc = acc * 10 + calcDigit(p);
return acc;
}
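The remaining functions could be filled in along the following lines. This is only a sketch under a few assumptions of mine: the optional parts of the grammar are treated as repeatable (so 1+2+3 parses), whitespace is not skipped, division is integer division, calcNumber is folded into a single loop, and calc is a hypothetical wrapper that checks the whole string was consumed.

```cpp
#include <stdexcept>

int calcExpression(const char*& p);  // forward declaration, needed for sub-expressions

int calcNumber(const char*& p) {
    if (*p < '0' || *p > '9') throw std::runtime_error("Digit expected");
    int acc = 0;
    while (*p >= '0' && *p <= '9')
        acc = acc * 10 + (*p++ - '0');
    return acc;
}

int calcFactor(const char*& p) {
    if (*p == '-') {  // unary minus
        ++p;
        return -calcFactor(p);
    }
    if (*p == '(') {  // sub-expression
        ++p;
        int value = calcExpression(p);
        if (*p != ')') throw std::runtime_error("')' expected");
        ++p;
        return value;
    }
    return calcNumber(p);
}

int calcAddendum(const char*& p) {
    int acc = calcFactor(p);
    while (*p == '*' || *p == '/') {
        char op = *p++;
        int rhs = calcFactor(p);
        if (op == '*') {
            acc *= rhs;
        } else {
            if (rhs == 0) throw std::runtime_error("Division by zero");
            acc /= rhs;
        }
    }
    return acc;
}

int calcExpression(const char*& p) {
    int acc = calcAddendum(p);
    while (*p == '+' || *p == '-') {
        char op = *p++;
        int rhs = calcAddendum(p);
        acc = (op == '+') ? acc + rhs : acc - rhs;
    }
    return acc;
}

// Hypothetical entry point: evaluates the whole string or throws.
int calc(const char* text) {
    const char* p = text;
    int value = calcExpression(p);
    if (*p != '\0') throw std::runtime_error("Unexpected trailing input");
    return value;
}
```

With this, calc("(10+5*15)") respects the order of operations and yields 85, with the nesting of calls standing in for an explicit stack.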
If you need instead to write a compiler that transforms a string (for example including variables or function calls) into code or bytecode then probably the best solution is to start either using a generic n-way tree or a tree with specific structures for the different AST node types.

C++ Recognizing double digits using strings

Sorry, I realized that I put in all of my code in this question. All of my code equals most of the answer for this particular problem for other students, which was idiotic.
Here's the basic gist of the problem I put:
I needed to recognize single digit numbers in a regular mathematical expression (such as 5 + 6) as well as double digit (such as 56 + 78). The mathematical expressions could also be displayed as 56+78 (no spaces) or 56 +78 and so on.
The actual problem was that I was reading in the expression as 5 6 + 7 8 no matter what the input was.
Thanks and sorry that I pretty much deleted this question, but my goal is not to give answers out for homework problems.
Jesse Smothermon
The problem really consists of two parts: lexing the input (turning the sequence of characters into a sequence of "tokens") and evaluating the expression. If you do these two tasks separately, it should be much easier.
First, read in the input and convert it into a sequence of tokens, where each token is an operator (+, -, etc.) or an operand (42, etc.).
Then, perform the infix-to-postfix conversion on this sequence of tokens. A "Token" type doesn't have to be anything fancy, it can be as simple as:
struct Token {
enum Type { Operand, Operator };
enum OperatorType { Plus, Minus };
Type type_;
OperatorType operatorType_; // only valid if type_ == Operator
int operand_; // only valid if type_ == Operand
};
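For the lexing step, a minimal sketch could look like this. The Token here is a simplified stand-in for the struct above (it stores the operator as a char, which is my own shortcut), and lex is a hypothetical name; the point is that consecutive digits are collected into one operand regardless of spacing.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified token for illustration: the operator is kept as its character.
struct Token {
    enum Type { Operand, Operator };
    Type type_;
    char op_;      // only valid if type_ == Operator
    int operand_;  // only valid if type_ == Operand
};

std::vector<Token> lex(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size()) {
        unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {
            ++i;  // spaces only separate tokens
        } else if (std::isdigit(c)) {
            // Consume the whole run of digits as one number.
            int value = 0;
            while (i < input.size() && std::isdigit(static_cast<unsigned char>(input[i]))) {
                value = value * 10 + (input[i] - '0');
                ++i;
            }
            tokens.push_back(Token{Token::Operand, '\0', value});
        } else if (c == '+' || c == '-' || c == '*' || c == '/' || c == '^') {
            tokens.push_back(Token{Token::Operator, input[i], 0});
            ++i;
        } else {
            throw std::invalid_argument(std::string("unexpected character: ") + input[i]);
        }
    }
    return tokens;
}
```

Both "56+78" and "56 +78" come out as three tokens (56, +, 78), which the infix-to-postfix step can then consume without caring about the original spacing.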
First, it helps to move such ifs like this
userInput[i] != '+' || userInput[i] != '-' || userInput[i] != '*' || userInput[i] != '/' || userInput[i] != '^' || userInput[i] != ' ' && i < userInput.length()
into its own function, just for the clarity.
bool isOperator(char c){
return c == '+' || c == '-' || c == '*' || c == '/' || c == '^';
}
Also, no need to check that it's no operator, just check that the input is a number:
bool isNum(char c){
return '0' <= c && c <= '9';
}
Another thing: with the long chain above, you have the problem that you will also enter the tempNumber += ... block if the input character is anything other than '+'. You would have to combine the checks with &&, or better, use the function above:
if (isNum(userInput[iterator])){
tempNumber += userInput[iterator];
}
This will also rule out any invalid input like b, X and the likes.
Then, for your problem with double digit numbers:
The problem is that you always output a space after inserting the tempNumber. You only need to do that when the digit sequence is finished. To fix that, just modify the end of your long if-else if chain:
// ... operator stuff
} else {
postfixExpression << tempNumber;
// peek if the next character is also a digit, if not insert a space
// also, if the current character is the last in the sequence, there can be no next digit
if (iterator == userInput.length()-1 || !isNum(userInput[iterator+1])){
postfixExpression << ' ';
}
}
This should do the job of giving the correct representation from 56 + 78 --> 56 78 +. Please tell me if there's anything wrong. :)

Effect of using a comma instead of a semi-colon in C and C++

I've noticed on a number of occasions when refactoring various pieces of C and C++ code that a comma is used rather than a semicolon to separate statements. Something like this:
int a = 0, b = 0;
a = 5, b = 5;
Where I would have expected
int a = 0, b = 0;
a = 5; b = 5;
I know that C and C++ allow the use of commas to separate statements (notably in loop headers), but what is the difference, if any, between these two pieces of code? My guess is that the comma was left in as the result of cutting and pasting, but is it a bug, and does it affect execution?
It doesn't make a difference in the code you posted. In general, the comma separates expressions just like a semicolon, however, if you take the whole as an expression, then the comma operator means that the expression evaluates to the last argument.
Here's an example:
b = (3, 5);
Will evaluate 3, then 5 and assign the latter to b. So b = 5. Note that the brackets are important here:
b = 3, 5;
Will evaluate b = 3, then 5 and the result of the whole expression is 5, nevertheless b == 3.
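The same difference can be observed with function calls instead of literals. In this small sketch, next(), calls, parenthesized() and unparenthesized() are made-up names; calls use a counter so that each operand has a visible side effect.

```cpp
int calls = 0;

// Returns 1 on the first call, 2 on the second, and so on.
int next() { return ++calls; }

int parenthesized() {
    calls = 0;
    int b = (next(), next());  // both calls run; b receives the value of the last one
    return b;
}

int unparenthesized() {
    calls = 0;
    int b = 0;
    b = next(), next();  // parsed as (b = next()), next(); the second result is discarded
    return b;
}
```

Here parenthesized() returns 2 and unparenthesized() returns 1, although next() is called twice in both cases.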
The comma operator is especially helpful in for-loops when your iterator code is not a simple i++, but you need to do multiple commands. In that case a semicolon doesn't work well with the for-loop syntax.
The comma is an operator that returns a value, which is always the 2nd (right) argument, while a semicolon just ends statements. That allows the comma operator to be used inside other statements or to concatenate multiple statements to appear as one.
Here the function f(x) gets called and then x > y is evaluated for the if statement.
if( y = f(x), x > y )
An example where it's used just to avoid the need for a block:
if( ... )
x = 2, y = 3;
if( ... ) {
x = 2;
y = 3;
}
The comma operator evaluates all operands from left to right, and the result is the value of the last operand.
It is mostly useful in for-loops if you want to do multiple actions in the "increment" part, e.g (reversing a string)
for (int lower = 0, upper = s.size() - 1; lower < upper; ++lower, --upper)
std::swap(s[lower], s[upper]);
Another example, where it might be an option (finding all occurrences in a string):
#include <string>
#include <iostream>
int main()
{
std::string s("abracadabra");
size_t search_position = 0;
size_t position = 0;
while (position = s.find('a', search_position), position != std::string::npos) {
std::cout << position << '\n';
search_position = position + 1;
}
}
In particular, logical and cannot be used for this condition, since both zero and non-zero can mean that the character was found in the string. With comma, on the other hand, position = s.find() is called each time when the condition is evaluated, but the result of this part of the condition is just ignored.
Naturally there are other ways to write the loop:
while ((position = s.find('a', search_position)) != std::string::npos)
or just
while (true) {
position = s.find('a', search_position);
if (position == std::string::npos)
break;
...
}
As Frank mentioned, how the comma operator is used in your example doesn't cause a bug. The comma operator can be confusing for several reasons:
it's not seen too often because it's only necessary in some special situations
there are several other syntactic uses of the comma that may look like a comma operator - but they aren't (the commas used to separate function parameters/arguments, the commas used to separate variable declarations or initializers)
Since it's confusing and often unnecessary, the comma operator should be avoided except for some very specific situations:
it can be useful to perform multiple operations in one or more of a for statement's controlling expressions
it can be used in preprocessor macros to evaluate more than one expression in a single statement. This is usually done to allow a macro to do more than one thing and still be a single expression, so the macro will 'fit' in places that only allow an expression.
The comma operator is a hackish operator pretty much by definition - it's to hack in 2 things where only one is allowed. It's almost always ugly, but sometimes that's all you've got. And that's the only time you should use it - if you have another option, don't use the comma operator.
Off the top of my head I can't think of too many other reasons to use the operator, since you can get a similar effect by evaluating the expressions in separate statements in most other situations (though I'm sure that someone will comment on another use that I've overlooked).
One usage would be in code golfing:
if (x == 1) y = 2, z = 3;
if (x == 1) { y = 2; z = 3; }
The first line is shorter, but that looks too confusing to use in regular development.