Shunting-yard: missing argument to operator - c++

I'm implementing the shunting-yard algorithm and am having trouble detecting when an operator is missing an argument. The Wikipedia entry handles this topic poorly, and its code also crashes on the example below.
For instance 3 - (5 + ) is incorrect because the + is missing an argument.
Just before the algorithm reaches the ), the operator stack contains - ( + and the operand stack contains 3 5. Then it goes like this:
it pops + from the operator stack
discovers that + is a binary operator
pops two operands, applies the operator, and pushes the result (8) to the operand stack
then it pops the matching ( from the stack, and continues
So how can I detect that the + is missing an argument? Extra kudos if you also update Wikipedia :-)

For expressions that contain only binary operators, the postfix form has an invariant: in any non-empty prefix of the expression, the number of operands is greater than the number of operators, and at the end the difference is exactly one.
So you can check the RPN output for validity at each stage of the shunting yard by maintaining a running count of (number of operands - number of operators). If that count drops below one after any token, or is more than one at the end, you have an error.
It does not pinpoint the error, but at least lets you know there is one.
(Note: I haven't tried proving the above fact, but seems like it will work)
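A minimal sketch of that running count, assuming the RPN output is available as a list of string tokens and that only the four binary operators + - * / occur. The function name isValidBinaryRpn is made up for illustration:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: checks the operand/operator invariant on a postfix
// token sequence that uses only binary operators.
bool isValidBinaryRpn(const std::vector<std::string>& rpn) {
    int balance = 0;  // operands seen minus operators seen so far
    for (const std::string& tok : rpn) {
        const bool isOperator =
            tok == "+" || tok == "-" || tok == "*" || tok == "/";
        balance += isOperator ? -1 : 1;
        if (balance < 1) return false;  // some operator ran out of operands
    }
    return balance == 1;  // more than one means leftover operands; zero means empty input
}
```

For 3 - (5 + ) the shunting yard emits 3 5 + -, and the count only drops to zero at the final -, not at the +. That matches the caveat above: the check tells you there is an error, not where it is.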

You can build a state machine. It can spot the tokens where something is wrong.
When you start reading the expression, expect a prefix operator, an operand or an opening parenthesis.
If you read a prefix operator expect a prefix operator, operand or opening parenthesis.
If you read an operand expect a postfix or an infix operator or a closing parenthesis.
If you read a postfix operator expect an infix operator or a closing parenthesis.
If you read an infix operator expect a prefix operator, operand or opening parenthesis.
If you read an opening parenthesis expect a prefix operator, operand or opening parenthesis.
If you read a closing parenthesis expect a postfix or an infix operator or a closing parenthesis.
You can turn these ifs to a switch easily. :)
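The rules above collapse into two states: either an operand is expected next, or an operator is. Here is a rough sketch as a switch, assuming the input has already been classified into token kinds. The Tok enum and firstSyntaxError are illustrative names, and the sketch also assumes an opening parenthesis is allowed at the very start. Parenthesis balance is not checked here.

```cpp
#include <cstddef>
#include <vector>

enum class Tok { Operand, Prefix, Postfix, Infix, Open, Close };

// Returns the index of the first token that is not allowed at its position,
// tokens.size() if the input ends while an operand is still expected,
// or -1 if the sequence passes.
int firstSyntaxError(const std::vector<Tok>& tokens) {
    bool expectOperand = true;  // state at the start of the expression
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        switch (tokens[i]) {
            case Tok::Prefix:
            case Tok::Open:
                if (!expectOperand) return static_cast<int>(i);
                break;  // an operand is still expected afterwards
            case Tok::Operand:
                if (!expectOperand) return static_cast<int>(i);
                expectOperand = false;
                break;
            case Tok::Postfix:
            case Tok::Close:
                if (expectOperand) return static_cast<int>(i);
                break;  // an operator is still expected afterwards
            case Tok::Infix:
                if (expectOperand) return static_cast<int>(i);
                expectOperand = true;
                break;
        }
    }
    if (expectOperand) return static_cast<int>(tokens.size());
    return -1;
}
```

For 3 - (5 + ) the token kinds are Operand Infix Open Operand Infix Close, and the function reports index 5: the closing parenthesis arrives while an operand is still expected.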

Related

Using ++ as a prefix to a statement of access through class member not causing an error

I am kind of confused right now. I was running the following code:
std::vector<int> test{1, 5, 10};
++test.at(1); // I'm trying to increment that number 5 to six, which works
std::cout << test.at(1) << std::endl; // Prints out 6 to the console
I was expecting a compiler error. From what I had read about operator precedence, . (member access) and the increment operator (++) have the same precedence and are read left to right within a statement, as far as I understood anyway. So I thought the code above would be equivalent to (++test).at(1), which obviously causes a compiler error. Why isn't that the case, even though the associativity is left to right? Why is it read (from what I can tell) as ++(test.at(1))? If they have the same precedence, wouldn't it, just like in maths, apply the ++ first and then the .?
True, postfix increment (a++) and member access (.) have the same precedence.
But you're using prefix increment (++a).
Consult cppreference's precedence chart.
Indeed, test++.at(i) would error for the reasons you give, though as readers of the code we would not be in any way surprised in that case.
Function calls have a higher operator precedence than unary operators like "++". The function call at() gets resolved first, and so the increment operator takes place on what it returns instead of the container.
https://en.cppreference.com/w/cpp/language/operator_precedence
EDIT: As Asteroids With Wings pointed out in their answer, only the prefix version of "++" has lower precedence than the function call. The postfix "++" is at the same level of precedence.
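To make the grouping concrete, here is a small sketch based on the question's own snippet. The function name incrementMiddle is just for illustration, and the commented-out lines are the variants that would not compile:

```cpp
#include <vector>

// Returns the middle element after incrementing it with prefix ++.
int incrementMiddle() {
    std::vector<int> test{1, 5, 10};
    ++test.at(1);       // groups as ++(test.at(1)): . and () bind before prefix ++
    // (++test).at(1);  // does not compile: std::vector has no operator++
    // test++.at(1);    // does not compile either: postfix ++ is applied to test first
    return test.at(1);
}
```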

shunting Yard Algorithm, any changes?

I have implemented the shunting-yard algorithm in C++11 according to what is described on Wikipedia:
This implementation does not implement composite functions, functions with variable number of arguments, and unary operators.
while there are tokens to be read:
    read a token.
    if the token is a number, then:
        push it to the output queue.
    else if the token is a function then:
        push it onto the operator stack
    else if the token is an operator then:
        while ((there is a operator at the top of the operator stack)
              and ((the operator at the top of the operator stack has greater precedence)
                  or (the operator at the top of the operator stack has equal precedence and the token is left associative))
              and (the operator at the top of the operator stack is not a left parenthesis)):
            pop operators from the operator stack onto the output queue.
        push it onto the operator stack.
    else if the token is a left parenthesis (i.e. "("), then:
        push it onto the operator stack.
    else if the token is a right parenthesis (i.e. ")"), then:
        while the operator at the top of the operator stack is not a left parenthesis:
            pop the operator from the operator stack onto the output queue.
        /* If the stack runs out without finding a left parenthesis, then there are mismatched parentheses. */
        if there is a left parenthesis at the top of the operator stack, then:
            pop the operator from the operator stack and discard it
/* After while loop, if operator stack not null, pop everything to output queue */
if there are no more tokens to read then:
    while there are still operator tokens on the stack:
        /* If the operator token on the top of the stack is a parenthesis, then there are mismatched parentheses. */
        pop the operator from the operator stack onto the output queue.
exit.
As you can see, it's mentioned that this algorithm doesn't deal with unary operators. Suppose I have one, !, which binds more strongly than all other operators. What changes should I make to my algorithm, if any?
Some legal examples of using the ! operator:
!1
! 1
! (1)
!( 1 + 2)
Plus one small question: does this algorithm deal with wrong syntax like 1==2 (I suppose that yes, it does)?
Question 1:
In order to make your algorithm work, you should parse the prefix operator ! before the infix operators, simply treating it as if it were an open parenthesis (. You then need to tweak the stack virtual machine to allow this kind of operator.
I suggest moving the if check for the parenthesis before the infix operator (it doesn't change much but it's more readable).
I will also say that if you want to achieve things like operator precedence, postfix operators and mixfix operators all together you should switch to a Pratt parser (which is much easier to work with).
Question 2:
The parser here doesn't deal with operations like 1 == 2; it only parses them. The stack-based virtual machine deals with them, and 1 == 2 is a completely fine comparison since it is supposed to return false. That is, if you plan to have boolean expressions as well as arithmetic expressions.
EDIT 1:
The "tweak" (which partially solves the issue): consider the operator as right associative and make its precedence higher than the other operators.
EDIT 2:
This (as pointed out in the comments by #dure) is just a tweak, since it will cause the parser to parse prefix and postfix operators without distinction and needs further care to avoid bugs.
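For what it's worth, here is a minimal sketch of that tweak, not necessarily what the answer had in mind. It assumes single-digit operands, the binary operators + - * /, a prefix !, parentheses and spaces, and it does no error handling. The names precedence and toRpn and the precedence values are made up for this example. The ! is pushed like an open parenthesis (nothing to its left is popped), but it has the highest precedence, so the next binary operator pops it.

```cpp
#include <cctype>
#include <stack>
#include <string>

// Precedence table assumed for this sketch; '!' binds tighter than the rest.
int precedence(char op) {
    switch (op) {
        case '!': return 3;
        case '*': case '/': return 2;
        case '+': case '-': return 1;
        default:  return 0;  // '(' is never popped by a binary operator
    }
}

// Converts infix input to space-separated RPN.
// Mismatched parentheses and missing operands are not detected.
std::string toRpn(const std::string& infix) {
    std::string out;
    std::stack<char> ops;
    for (char c : infix) {
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        if (std::isdigit(static_cast<unsigned char>(c))) {
            out += c;
            out += ' ';
        } else if (c == '!' || c == '(') {
            ops.push(c);  // prefix operator: nothing on its left to pop
        } else if (c == ')') {
            while (!ops.empty() && ops.top() != '(') {
                out += ops.top();
                out += ' ';
                ops.pop();
            }
            if (!ops.empty()) ops.pop();  // discard the '('
        } else {  // left-associative binary operator
            while (!ops.empty() && precedence(ops.top()) >= precedence(c)) {
                out += ops.top();
                out += ' ';
                ops.pop();
            }
            ops.push(c);
        }
    }
    while (!ops.empty()) {
        out += ops.top();
        out += ' ';
        ops.pop();
    }
    if (!out.empty()) out.pop_back();  // drop the trailing space
    return out;
}
```

With this, !( 1 + 2) becomes 1 2 + ! and !1 + 2 becomes 1 ! 2 +. A postfix ! would need the extra care mentioned in EDIT 2, since this sketch always treats ! as a prefix operator.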

Why operator cannot be in parentheses?

The question comes when I tried to make a macro like this:
#define OP1(a,b,op) (a) op (b)
Then I was wondering why not also put op into parentheses, as it is also a macro parameter.
I then found I cannot even write this:
1 (+) 1;
Otherwise there is an error:
error: expected primary-expression before ')' token
Can anyone tell me where the rule is that says an operator cannot be in parentheses? I really cannot find it. Thank you.
§ 7.6.6 ([expr.add]) defines "additive expressions" as:
additive-expression:
    multiplicative-expression
    additive-expression + multiplicative-expression
    additive-expression - multiplicative-expression
No parens around the operator allowed.
There actually isn't any rule that says an operator should not be in parentheses. But there is a rule that states that, "for a binary operator like +, the value on either side of the operator must be a valid operand like 5 or 5.2".
So to the compiler, the expression (+) means you are adding two parentheses together (left paren, plus, right paren), which is not supported by the language.
Putting macro parameters in parentheses is good practice of course, but in this case there is actually no need to put the operator inside parentheses: there is no way of passing a complicated operator expression, so you can rest assured that the operator itself will always expand as intended.
In programming, as in mathematics, parentheses are used to override operator precedence.
Without parentheses, 2 + 3 * 4 is evaluated as 2 + (3 * 4) because the multiplication (*) has a higher precedence than the addition (+). One can use parentheses to force the addition of 2 and 3 happen before the multiplication (of the result) by 4 by placing them around the addition operator and its operands as (2 + 3) * 4.
Both 3 * 4 and 2 + 3 in the expressions above are valid expressions.
The + in the expression 1 (+) 2 is not a valid expression on its own. Moreover, even if the parentheses did contain a valid sub-expression, the entire expression would still be invalid because it would be just a list of values without operators to connect them into an expression.
Even more, this is also not the way you learned in school to write mathematical expressions.
Back to your #define: to avoid hidden errors and headaches (due to operator precedence), you should always enclose the expanded value of such a macro in parentheses, like this:
#define OP1(a,b,op) ((a) op (b))
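A quick sketch of why the outer parentheses matter. OP1_BARE is a made-up name for the question's original form, and the constants are just there to hold the results:

```cpp
// OP1_BARE mirrors the macro from the question; OP1 is the wrapped form.
#define OP1_BARE(a, b, op) (a) op (b)
#define OP1(a, b, op) ((a) op (b))

// 2 * OP1_BARE(1, 2, +) expands to 2 * (1) + (2), which is 4.
constexpr int bareResult = 2 * OP1_BARE(1, 2, +);
// 2 * OP1(1, 2, +) expands to 2 * ((1) + (2)), which is 6.
constexpr int wrappedResult = 2 * OP1(1, 2, +);

static_assert(bareResult == 4, "the surrounding * grabs the first operand");
static_assert(wrappedResult == 6, "outer parentheses keep the expansion together");
```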

C++ precedence for unary negation vs prefix decrement

I came across the following when learning C++
int a = 5;
-----a;
The second statement doesn't compile. The statement could either be read as --(--(-a)) or -(--(--a)), since both operators are in the same precedence group. In this case though only the second interpretation (when you DO use brackets) makes sense. Therefore I see no ambiguity.
My question therefore is: why is the unary negation not in a higher precedence group than the prefix decrement?
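As a side note that is not in the question itself: the tokenizer always takes the longest possible token, so -----a is split into -- -- - a before precedence is even considered. That yields the first reading, --(--(-a)), which fails because -a is not an lvalue. A small sketch of the second reading spelled out with brackets (the function name is made up):

```cpp
// Returns the value of the bracketed form from the question.
int negateAfterTwoDecrements() {
    int a = 5;
    // -----a;          // tokenized as -- -- - a, i.e. --(--(-a)); does not compile
    return -(--(--a));  // a is decremented twice to 3, then negated
}
```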

How do I set a precedence value for operators such as '*', '/', '+', and '-'?

I either want to set them to bool values or simply integer values so that I can tell my function to multiply/divide these two integers before I add/subtract them to another operand.
Here is my code:
while (!S.empty() && **PRECEDENCE**next <= **PRECEDENCE**S.top())
{
    temp = S.top();
    S.pop();
    postfix.append(temp);
}
Where S is for the stack. So let's say next is the * token and S.top() is '+', so * takes priority over +, so I need to assign a value to * and + so that when they are compared to one another, their values are compared. So the value of * is 1 whereas the value of + is 0.
OMG, there are many methods to do this.
Look up table
You could create a table of records and search the table for the operator, then retrieve the precedence value.
Use switch statement
I don't advise it, but it provides similar functionality: mapping an operator to a precedence.
Use std::map
Same concept as the other two, except using std::map<char, int> for operator character and precedence.
Hard code the precedence
The operators you compare for first will have the highest precedence. The next operators checked will have lower precedence.
Search the Web and StackOverflow
Hint: Other people have done this assignment. Search StackOverflow or the web for:
"c++ postfix operator precedence"
"c++ calculator operator precedence"
"c++ calculator operator evaluation"
"c++ postfix operator evaluation"
Or use other synonyms that I haven't listed.
Research before posting
Be a smart person and cheat by researching and seeing what other people have done before posting questions here.
Use a debugger
Use a debugger or print statements to see where your issues lie.
If you are still stumped, post the issue, your minimal code and the expected behavior.
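As a rough illustration of the std::map option listed above, here is a sketch that plugs a lookup into the loop from the question. The names precedenceOf and pushOperator are made up, the precedence values follow the question (* is 1, + is 0), and pushing next after the loop is an assumption about what the surrounding code does:

```cpp
#include <map>
#include <stack>
#include <string>

// Larger value means the operator binds tighter.
int precedenceOf(char op) {
    static const std::map<char, int> table{
        {'+', 0}, {'-', 0}, {'*', 1}, {'/', 1}};
    const auto it = table.find(op);
    return it == table.end() ? -1 : it->second;  // -1 for non-operators such as '('
}

// Pops operators that bind at least as tightly as next into postfix,
// then pushes next onto the stack.
void pushOperator(char next, std::stack<char>& S, std::string& postfix) {
    while (!S.empty() && precedenceOf(next) <= precedenceOf(S.top())) {
        postfix += S.top();
        S.pop();
    }
    S.push(next);
}
```

A switch or a plain lookup table would work the same way; the map just keeps the operator-to-value pairs in one place.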