C++: Associativity of * (multiply) operator is not left-to-right

While working on a school assignment, we had to do something with operator overloading and templates. All cool. I wrote:
template<class T>
class Multiplication : public Expression<T>
{
private:
    typename std::shared_ptr<Expression<T> > l, r;
public:
    Multiplication(typename std::shared_ptr<Expression<T> > l, typename std::shared_ptr<Expression<T> > r) : l(l), r(r) {}
    virtual ~Multiplication() {}
    T evaluate() const
    {
        std::cout << "*";
        T ml = l->evaluate();
        T mr = r->evaluate();
        return ml * mr;
    }
};
Then a friend asked me why his code produced output in the "wrong" order.
He had something like
T evaluate() const
{
    std::cout << "*";
    return l->evaluate() * r->evaluate();
}
The code of r->evaluate() printed the debug information, before l->evaluate().
I tested it on my machine as well, by just changing these three lines to a one-liner.
So, I thought, well then * should be right-to-left associative. But everywhere on the internet they say it is left-to-right. Are there some extra rules? Maybe something special when using templates? Or is this a bug in VS2012?

When we say the associativity of * is left-to-right, we mean that the expression a*b*c*d will always evaluate as (((a*b)*c)*d). That's it. In your example, you only have one operator*, so there isn't anything to associate.
What you're running into is the order of evaluation of operands. You are calling:
operator*(l->evaluate(), r->evaluate());
Both expressions need to be evaluated before the call to operator*, but it is unspecified (explicitly) by the C++ standard what order they get evaluated in. In your case, r->evaluate() got evaluated first - but that has nothing to do with the associativity of operator*.
Note that even if you had a->evaluate() * b->evaluate() * c->evaluate(), that would get parsed as:
operator*(operator*(a->evaluate(), b->evaluate()), c->evaluate())
based on the rules of operator associativity - but even in that case, there's no rule to prevent c->evaluate() from being called first. It may very well be!

You have a single operator in your expression:
l->evaluate() * r->evaluate()
so associativity is not involved at all here. The catch is that the two operands are evaluated before calling the * operator, and the order in which they are evaluated is unspecified. A compiler is allowed to evaluate them in either order.
In C++11 terms, the call to operator* is sequenced after the operand evaluation, but there is no sequence relation between the two evaluations. From the n4296 draft (post C++14), page 10:
§1.9.15 Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.

Related

Need clarification about logic behind precedence of operators [duplicate]

Does the ANSI standard mandate the logical operators to be short-circuited, in either C or C++?
I'm confused, for I recall the K&R book saying your code shouldn't depend on these operations being short-circuited, for they may not be. Could someone please point out where in the standard it says logical operators are always short-circuited? I'm mostly interested in C++; an answer for C as well would be great.
I also remember reading (can't remember where) that evaluation order isn't strictly defined, so your code shouldn't depend or assume functions within an expression would be executed in a specific order: by the end of a statement all referenced functions will have been called, but the compiler has freedom in selecting the most efficient order.
Does the standard indicate the evaluation order of this expression?
if( functionA() && functionB() && functionC() ) cout<<"Hello world";
Yes, short-circuiting and evaluation order are required for operators || and && in both C and C++ standards.
C++ standard says (there should be an equivalent clause in the C standard):
1.9.18
In the evaluation of the following expressions
a && b
a || b
a ? b : c
a , b
using the built-in meaning of the operators in these expressions, there is a sequence point after the evaluation of the first expression (12).
In C++ there is an extra trap: short-circuiting does NOT apply to types that overload operators || and &&.
Footnote 12: The operators indicated in this paragraph are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation, and the operands form an argument list, without an implied sequence point between them.
It is usually not recommended to overload these operators in C++ unless you have a very specific requirement. You can do it, but it may break expected behaviour in other people's code, especially if these operators are used indirectly via instantiating templates with the type overloading these operators.
Short-circuit evaluation, and the order of evaluation, are mandated by both the C and C++ standards.
If they weren't, code like this would not be a common idiom:
char* pChar = 0;
// some actions which may or may not set pChar to something
if ((pChar != 0) && (*pChar != '\0')) {
// do something useful
}
Section 6.5.13 Logical AND operator of the C99 specification says
(4) Unlike the bitwise binary & operator, the && operator guarantees left-to-right evaluation; there is a sequence point after the evaluation of the first operand. If the first operand compares equal to 0, the second operand is not evaluated.
Similarly, section 6.5.14 Logical OR operator says
(4) Unlike the bitwise | operator, the || operator guarantees left-to-right evaluation; there is a sequence point after the evaluation of the first operand. If the first operand compares unequal to 0, the second operand is not evaluated.
Similar wording can be found in the C++ standards; see section 5.14. As checkers notes in another answer, if you overload && or ||, then both operands must be evaluated, as it becomes a regular function call.
Yes, it mandates that (both evaluation order and short-circuiting). In your example, if all functions return true, the calls happen strictly in the order functionA, then functionB, then functionC. This is commonly used like this:
if(ptr && ptr->value) {
...
}
Same for the comma operator:
// calls a, then b and evaluates to the value returned by b
// which is used to initialize c
int c = (a(), b());
The standard says there is a "sequence point" between the left and right operands of &&, || and the comma operator, and between the first and the second/third operand of ?: (the conditional operator). All side effects of the left side are complete before that point. So, this is safe:
int a = 0;
int b = (a++, a); // b initialized with 1, and a is 1
Note that the comma operator is not to be confused with the syntactical comma used to separate things:
// order of calls to a and b is unspecified!
function(a(), b());
The C++ Standard says in 5.14/1:
The && operator groups left-to-right. The operands are both implicitly converted to type bool (clause 4).
The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right
evaluation: the second operand is not evaluated if the first operand is false.
And in 5.15/1:
The || operator groups left-to-right. The operands are both implicitly converted to bool (clause 4). It returns true if either of its operands is true, and false otherwise. Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if the first operand evaluates to true.
It says for both next to those:
The result is a bool. All side effects of the first expression except for destruction of temporaries (12.2) happen before the second expression is evaluated.
In addition to that, 1.9/18 says
In the evaluation of each of the expressions
a && b
a || b
a ? b : c
a , b
using the built-in meaning of the operators in these expressions (5.14, 5.15, 5.16, 5.18), there is a sequence point after the evaluation of the first expression.
Straight from good old K&R:
C guarantees that && and || are evaluated left to right — we shall soon see cases where this matters.
Be very, very careful.
For fundamental types these are short-circuit operators.
But if you define these operators for your own class or enumeration types, they do not short-circuit. Because of this semantic difference between the two cases, it is recommended that you do not define these operators.
For operator && and operator || on fundamental types the evaluation order is left to right (otherwise short-circuiting would be hard :-) But overloaded operators that you define are basically syntactic sugar for a function call, and thus the order of evaluation of the arguments is unspecified (at least before C++17).
Your question comes down to C++ operator precedence and associativity. Basically, in expressions with multiple operators and no parentheses, the compiler constructs the expression tree by following these rules.
For precedence, when you have something like A op1 B op2 C, you could group things as either (A op1 B) op2 C or A op1 (B op2 C). If op1 has higher precedence than op2, you'll get the first expression. Otherwise, you'll get the second one.
For associativity, when you have something like A op B op C, you could again group things as (A op B) op C or A op (B op C). If op has left associativity, we end up with the first expression. If it has right associativity, we end up with the second one. This also works for different operators at the same precedence level.
In this particular case, && has higher precedence than ||, so the expression will be evaluated as (a != "" && it == seqMap.end()) || isEven.
The order itself is "left-to-right" on the expression-tree form. So we'll first evaluate a != "" && it == seqMap.end(). If it's true the whole expression is true, otherwise we go to isEven. The procedure repeats itself recursively inside the left-subexpression of course.
An interesting tidbit: the concept of precedence has its roots in mathematical notation. The same thing happens in a*b + c, where * has higher precedence than +.
Even more interesting/obscure: for an unparenthesized expression A1 op1 A2 op2 ... opn-1 An, where all operators have the same precedence, the number of binary expression trees we could form is given by the so-called Catalan numbers. For large n, these grow extremely fast.
If you trust Wikipedia:
[&& and ||] are semantically distinct from the bit-wise operators & and | because they will never evaluate the right operand if the result can be determined from the left alone
C (programming language)

Why is '--++a-- ++ +b--' evaluated in this order?

Why does the following print bD aD aB aA aC aU instead of aD aB aA aC bD aU? In other words, why is b-- evaluated before --++a--++?
#include <iostream>
using namespace std;
class A {
char c_;
public:
A(char c) : c_(c) {}
A& operator++() {
cout << c_ << "A ";
return *this;
}
A& operator++(int) {
cout << c_ << "B ";
return *this;
}
A& operator--() {
cout << c_ << "C ";
return *this;
}
A& operator--(int) {
cout << c_ << "D ";
return *this;
}
void operator+(A& b) {
cout << c_ << "U ";
}
};
int main()
{
A a('a'), b('b');
--++a-- ++ +b--; // the culprit
}
From what I gather, here's how the expression is parsed by the compiler:
Preprocessor tokenization: -- ++ a -- ++ + b --;
Operator precedence [1]: (--(++((a--)++))) + (b--);
+ is left-to-right associative, but nonetheless the compiler may choose to evaluate the expression on the right (b--) first.
I'm assuming the compiler chooses to do it this way because it leads to better optimized code (fewer instructions). However, it's worth noting that I get the same result when compiling with /Od (MSVC) and -O0 (GCC). This brings me to my question:
Since I was asked this on a test which should in principle be implementation/compiler-agnostic, is there something in the C++ standard that prescribes the above behavior, or is it truly unspecified? Can someone quote an excerpt from the standard which confirms either? Was it wrong to have such a question on the test?
[1] I realize the compiler doesn't really know about operator precedence or associativity, rather it cares only about the language grammar, but this should get the point across either way.
The expression statement
--++a-- ++ +b--; // the culprit
can be represented the following way
at first like
( --++a-- ++ ) + ( b-- );
then like
( -- ( ++ ( ( a-- ) ++ ) ) ) + ( b-- );
and at last like
a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator + ( b.operator --( 0 ) );
Here is a demonstrative program.
#include <iostream>
using namespace std;
class A {
char c_;
public:
A(char c) : c_(c) {}
A& operator++() {
cout << c_ << "A ";
return *this;
}
A& operator++(int) {
cout << c_ << "B ";
return *this;
}
A& operator--() {
cout << c_ << "C ";
return *this;
}
A& operator--(int) {
cout << c_ << "D ";
return *this;
}
void operator+(A& b) {
cout << c_ << "U ";
}
};
int main()
{
A a('a'), b('b');
--++a-- ++ +b--; // the culprit
std::cout << std::endl;
a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator + ( b.operator --( 0 ) );
return 0;
}
Its output is
bD aD aB aA aC aU
bD aD aB aA aC aU
You can imagine the last expression written in the functional form like a postfix expression of the form
postfix-expression ( expression-list )
where the postfix expression is
a.operator --( 0 ).operator ++( 0 ).operator ++().operator --().operator +
and the expression-list is
b.operator --( 0 )
In the C++ Standard (5.2.2 Function call) it is said that
8 [Note: The evaluations of the postfix expression and of the arguments
are all unsequenced relative to one another. All side effects of
argument evaluations are sequenced before the function is entered (see
1.9). —end note]
So it is unspecified whether the argument or the postfix expression is evaluated first. According to the output shown, the compiler first evaluates the argument and only then the postfix expression.
I would say they were wrong to include such a question.
Except as noted, the following excerpts are all from §[intro.execution] of N4618 (and I don't think any of this stuff has changed in more recent drafts).
Paragraph 16 has the basic definition of sequenced before, indeterminately sequenced, etc.
Paragraph 18 says:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
In this case, you're (indirectly) calling some functions. The rules there are fairly simple as well:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A.
Putting that into bullet points to more directly indicate order:
1. Evaluate the function arguments, and whatever designates the function being called.
2. Evaluate the body of the function itself.
3. Evaluate another (sub-)expression.
No interleaving is allowed unless something starts up a thread to allow something else to execute in parallel.
So, does any of this change because we're invoking the functions via operator overloads rather than directly? Paragraph 19 says "No":
The sequencing constraints on the execution of the called function (as described above) are features of the function calls as evaluated, whatever the syntax of the expression that calls the function might be.
§[expr]/2 also says:
Uses of overloaded operators are transformed into function calls as described in 13.5. Overloaded operators obey the rules for syntax and evaluation order specified in Clause 5, but the requirements of operand type and value category are replaced by the rules for function call.
Individual operators
The only operators you've used that have somewhat unusual requirements with respect to sequencing are post-increment and post-decrement. These say (§[expr.post.incr]/1):
The value computation of the ++ expression is sequenced before the modification of the operand object. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single postfix ++ operator. —end note ]
In the end, however, this is pretty much just what you'd probably expect: if you pass x++ as a parameter to a function, the function receives the previous value of x, but if x is also in scope inside the function, x will have the incremented value by the time the body of the function starts to execute.
The + operator, however, does not specify ordering of the evaluation of its operands.
Summary
Using overloaded operators does not enforce any sequencing on the evaluation of sub-expressions within an expression, beyond the fact that evaluating an individual operator is a function call, and has the ordering requirements of any other function call.
More specifically, in this case, b-- is the operand to a function call, and --++a-- ++ is the expression that designates the function being called (or at least the object on which the function will be called; the + designates the function within that object). As noted, ordering between these two is not specified (nor does operator + specify an order of evaluating its left vs. right operand).
Nothing in the C++ standard says things need to be evaluated in this way. C++ has the concept of sequenced-before, where some operations are guaranteed to happen before others. This is a partial order; that is, some operations are sequenced before others, two operations can't both be sequenced before each other, and if a is sequenced before b, and b is sequenced before c, then a is sequenced before c. However, there are many kinds of operations which have no sequenced-before guarantees. Before C++11, there was instead the concept of a sequence point, which is similar but not quite the same.
Very few operators (only ,, &&, ?:, and ||, I believe) guarantee a sequence point between their arguments (and even then, until C++17, this guarantee doesn’t exist when the operators are overloaded). In particular, the addition does not guarantee any such thing. The compiler is free to evaluate the left-hand side first, to evaluate the right-hand side first, or (I think) even to evaluate them simultaneously.
Sometimes changing optimization options can change the results, or changing compilers. Apparently you aren’t seeing that; there are no guarantees here.
Operator precedence and associativity rules are only used to convert your expression from the original "operators in expression" notation to the equivalent "function call" format. After the conversion you end up with a bunch of nested function calls, which are processed in the usual way. In particular, order of parameter evaluation is unspecified, which means that there's no way to say which operand of the "binary +" call will get evaluated first.
Also, note that in your case binary + is implemented as a member function, which creates certain superficial asymmetry between its arguments: one argument is "regular" argument, another is this. Maybe some compilers "prefer" to evaluate "regular" arguments first, which is what leads to b-- being evaluated first in your tests (you might end up with different ordering from the same compiler if you implement your binary + as a freestanding function). Or maybe it doesn't matter at all.
Clang, for example, begins with evaluating the first operand, leaving b-- for later.
Take into account the precedence and associativity of operators in C++:
a++ a--    Suffix/postfix increment and decrement    Left-to-right
++a --a    Prefix increment and decrement            Right-to-left
a+b a-b    Addition and subtraction                  Left-to-right
Keeping the list in your mind you can easily read the expression even without parentheses:
--++a--+++b--;//will follow with
--++a+++b--;//and so on
--++a+b--;
--++a+b;
--a+b;
a+b;
And don't forget the essential difference between the prefix and postfix operators in terms of the order in which the variable is updated and the expression's value is taken.

Unordered function evaluation for functions returning void

Is there a way in C and C++ to cause functions returning void to be evaluated in unspecified order?
I know that function arguments are evaluated in unspecified order so for functions not returning void this can be used to evaluate those functions in unspecified order:
#include <stdio.h>
int hi(void) {
puts("hi");
return 0;
}
int bye(void) {
puts("bye");
return 0;
}
int moo(void) {
puts("moo");
return 0;
}
void dummy(int a, int b, int c) {}
int main(void) {
dummy(hi(), bye(), moo());
}
Legal C and C++ code compiled by a conforming compiler may print hi, bye, and moo in any order. This is not undefined behavior (nasal demons would not be valid); there is simply more than one, but finitely many, valid outputs, and a compliant compiler need not even be deterministic in which one it produces.
Is there any way to do this without the dummy return values?
Clarification: This is an abstract question about C and C++. A better original phrasing might have been is there any context in which function evaluation order is unspecified for functions returning void? I'm not trying to solve a specific problem.
You can take advantage of the fact that the left-hand side of the comma operator is a discarded-value expression (void expression in C) like this:
int main(void) {
dummy((hi(),0), (bye(),0), (moo(),0));
}
From the draft C++ standard section 5.18 Comma operator:
A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression (Clause 5).
and C11 section 6.5.17 Comma operator:
The left operand of a comma operator is evaluated as a void expression; there is a
sequence point between its evaluation and that of the right operand. Then the right
operand is evaluated; the result has its type and value.
As Matt points out, it is also possible to mix the above method with arithmetic operators to achieve an unspecified order of evaluation:
(hi(),0) + (bye(),0) + (moo(),0) ;
Well there's always the obvious approach of putting pointers to the functions in a container, shuffling it up (or as suggested in a comment sorting it), and calling each item in the container. If you need to have the same behavior each run just make sure your seed is the same each time.

Operator precedence doesn't behave as expected in c++

Consider this code :
#include <iostream>
using namespace std;

int func1()
{
    cout << "Plus" << endl;
    return 1;
}
int func2()
{
    cout << "Multiplication" << endl;
    return 2;
}
int main()
{
    cout << func1() + 4 * func2();
}
According to this page, the * operator has higher precedence than the + operator, so I expect the result to be:
Multiplication
Plus
9
But the result is
Plus
Multiplication
9
What is going on in the compiler's parser? Does the compiler prefer operator associativity?
Is the output the same in all C/C++ compilers?
Operator precedence is not the same thing as order of evaluation.
You have no guarantee about the order of evaluation - the compiler is free to call functions in whatever order it likes within an expression so long as you get the correct result.
(A minor qualification: anything which introduces a sequence point (which includes short circuit operators), will have an effect on order of evaluation, but there are no sequence points within the expression in this particular case.)
The compiler is free to evaluate functions in any order it pleases - the only cases within expressions where the order of evaluation is guaranteed are the sequence points; operators ||, &&, ,, and ? of the ternary conditional operator ? : are sequence points. In each case the left side has all its values and side effects evaluated before the right side is touched.

Evaluation order of overloaded operator |?

5.15 Logical OR operator in the standard says the following:
Unlike |, || guarantees left-to-right evaluation;
Does this mean somewhere I cannot locate in the standard, | is defined to evaluate right-to-left, or that it is implementation-defined? Does this vary when the operator is overloaded? I wrote a quick program to test this and both MSVC++ and GCC seem to evaluate right-to-left.
#include<iostream>
using namespace std;
int foo = 7;
class Bar {
public:
Bar& operator|(Bar& other) {
return *this;
}
Bar& operator++() {
foo += 2;
return *this;
}
Bar& operator--() {
foo *= 2;
return *this;
}
};
int main(int argc, char** argv) {
Bar a;
Bar b;
Bar c = ++a | --b;
cout << foo;
}
This outputs 16.
If ++a and --b are switched it outputs 18.
I've also considered that I may be running into the multiple changes between sequence points rule (and thus undefined behavior), but I'm unsure how/if that applies with two separate instances as operands.
Ignore that operator for now, and just take note of this:
(x + y) * (z + 1)
Here, both operands must be evaluated before the multiplication can take place (otherwise we wouldn't know what to multiply). In C++, the order in which this is done is unspecified: it could be (x + y) first, or (z + 1) first, whatever the compiler feels is better.†
The same is true for the operator |. However, operator || must short-circuit, and in order to do that, it must evaluate strictly left to right. (And if the left evaluation yields true, the evaluation ends without evaluating the right operand.) That's what the sentence means.
†Note that it may have no preference one way or another, and just evaluate in the order it's listed. This is why you get the output you do, though you cannot rely on it at the language level.
As others said, it means that the order of the evaluation of the two sides is unspecified. To answer your other questions -
I've also considered that I may be running into the multiple changes between sequence points rule (and thus undefined behavior)
No, your case does not modify foo in between two adjacent sequence points. Before entering a function and before leaving a function, there always is a sequence point, which means that both modifications of foo happen in between two different pairs of sequence points.
Does this vary when the operator is overloaded?
All of clause 5 only talks about builtin operators. For user defined operator implementations, the rules don't apply. So also for ||, for user defined operators the order is not specified. But notice that it is only for user defined operators; not when both operands are converted to bool and trigger the builtin operator:
struct A {
operator bool() const { return false; }
};
struct B {
operator bool() const { return true; }
};
int main() {
A a;
B b;
a || b;
shared_ptr<myclass> p = ...;
if(p && p->dosomething()) ...;
}
This will always first execute A::operator bool, and then B::operator bool. And it will only call p->dosomething() if p evaluates to true.
Does this mean somewhere I cannot locate in the standard, | is defined to evaluate right-to-left, or that it is implementation-defined?
Pedantically speaking, the order of evaluation of the operands of the | operator is unspecified, so the operands can be evaluated in either order.
However, the order of evaluation of the operands of the logical operators (&& and ||) and of the comma operator is specified: left to right.