Several unary operators in C and C++ - c++

Is it standard-conforming to use expressions like
int i = 1;
+-+-+i;
and how the sign of i variable is determined?

Yes it is. Unary + and - associate right-to-left, so the expression is parsed as
+(-(+(-(+i))));
Which results in 1.
Note that these can be overloaded, so for a user-defined type the answer may differ.

Your operators has no side effect, +i do nothing with int itself and you do not use the temporary generated value but remove + that do nothing and you have -(-i) witch is equal to i itself.(removing + in the code will convert the operator, I mean remove it in computation because it has no effect)

i isn't modified (C: without intervening sequence points|C++: in an unsequenced manner) so it's legal. You're just creating a new temporary with each operator.
The unary + doesn't even do anything, so all you have is two negations which just give 1 for that expression. The variable i itself is never changed.

Related

C++ : How is this statement parsed?

I have been trying to learn the associativity of operators in C++ and I have come across a code segment :
int a = 10;
int C = a++ + ++a + ++a +a;
I have also studied that ++a is right to left associative and a++ is left to right associative. Also + is left to right associative. But I don't understand how to apply this knowledge in this problem.
I am confused that how this statement will be parsed by my compiler?
I am also puzzled that since putting spaces don't matter much why does removing spaces like :
int C = a+++++a+++a+a; //error: lvalue required as increment operand
generate an error?
Please help me understand this concept.
Thanks!
First of all space does matter- It helps compiler to resolve ambiguity.
Whenever there is an expression, compiler parse it from right to left. It looks for all the post increment operators first and then pre increment operators as later has lower precedence than the former. So any modification done by pre-increment operator will be applied to the whole expression and then changes of post-increment will be applied in the next expression.
Explanation
++a first increments the value of a and then returns lvalue referring to a, so if a is used then it will be the incremented value.
In your case there are total two ++a, thus the value of a will be incremented to 12 and thus assigned to a. so all the a in your expression will be holding the value 12 giving you the value of c=48.
a++ first returns an rvalue whose value is a, that is the old value, and then increments a at an unspecified time before the next full expression.
In your case if you use value of a after the expression it will be 13 as in the previous expression there was only one a++.
For eg.
int a = 10;
int C = a++ + ++a + ++a +a; // Here a=12 and the post increment effect will be applied in the next expression
int B = a + a; // Here a=13 the effect of previous post increment.
Regarding Error
With no space in expression, compiler will get confused when it will parse expression and thus dosent have any value to do the assignment.
PS: lvalue is a value that can be the target of an assignment.
In C/C++ the pre-increment (decrement) and the post-increment (decrement) operators require an L-value expression as operand. Providing an R-value or a const qualified variable results in compilation error.
Putting aside the fact that it would result in UB (as no sequence points between these multiple increments of the same variable)
a+++++a+++a+a
is parsed (as parser is greedy) as
((a++)++) + (a++) + a + a
and (a++)++ is illegal when a is a built-in type as int.

Why the Standard C++ Grammar for Assignment Expression Looks so Weird

From the C++ standard, the grammar for an assignment expression is like this:
assignment-expression:
conditional-expression
logical-or-expression assignment-operator assignment-expression
throw-expression
assignment-operator: one of
= *= /= %= += -= >>= <<= &= ^= |=
Notice that the left hand side of the "assignment-operator" is "logical-or-expression" i.e. something like (4 || localvar1) = 5; is a valid assignment expression according to the grammar. This doesn't make sense to me. Why they choose a "logical-or-expression" instead of, say, an identifier or id_expression?
The grammar is a bit complex, but if you continue unrolling with the previous definitions you will see that assignment expressions are very generic and allow for mostly anything. While the snippet from the standard that you quote focuses on logical-or-expression, if you keep unrolling the definition of that you will find that both the left hand side and right hand side of an assignment can be almost any subexpression (although not literally any).
The reason as pointed out before is that assignment can be applied to any lvalue expression of enum or fundamental type or rvalue expression of a class type (where operator= is always a member). Many expressions, in a language that allows for operator overloading and that does not define what the type returned from the operator is, can potentially fulfill the needs of assignment, and the grammar must allow all of those uses.
Different rules in the standard will later limit which of the possible expressions that can be generated from the grammar are actually valid or not.
Your particular statement, (4 || localvar1) = 5; is invalid (unless operator|| is overloaded), because you can't assign 5 to 4 (4 is an r-value). You must have an l-value (something that can be assigned to) on the left, such as a reference returned by a function.
For example, say you have some function int& get_my_int() that returns the reference to an integer. Then, you can do this:
`get_my_int() = 5;`
This will set the integer returned by get_my_int() to 5.
Just like in your first post, this MUST be a reference to an integer (and not a value); otherwise, the above statement wouldn't compile.
There are actually two interesting things about the C++ grammar for assignment statements, neither of which have to do with the validity of:
(4 || localvar1) = 5;
That expression is syntactically valid (up to type-checking) because of the parentheses. Any parenthesized expression of reference type is syntactically correct on the left-hand-side of an assignment operator. (And, as has been pointed out, almost any expression which involves a user type or function can be of reference type, as a result of operator overloading.)
What's more interesting about the grammar is that it establishes the left precedence of assignment operators as being lower than almost all other operators, including logical-or, so that the above expression is semantically equivalent to
4 || localvar1 = 5;
even though many readers would interpret the above as 4 || (localvar1 = 5) (which would have been totally correct assuming that localvar1 is of a type which can be assigned to by an int, even though said assignment will never happen -- unless, of course, || is overloaded in this context).
So what has a lower precedence on the left hand side of an assignment operator? As I said, very little, but one important exception is ?::
// Replace the larger of a and b with c
a > b ? a = c : b = c;
is valid and conveniently parenthesis-less. (Many style guides insist on redundant parentheses here, but I personally rather like the unparenthesized version.) This is not the same as right-hand precedence, so that the following also works without parentheses:
// Replace c with the larger of a and b
c = a > b ? a : b;
The only other operators which bind less tightly on the left of an assignment operator than the assignment operator are the , operator and another assignment operator. (In other words, assignment is right-associative unlike almost all other binary operators.) Neither of these are surprising -- in fact, they are so necessary that it's easy to miss how important it is to design a grammar in this way. Consider the following unremarkable for clause:
for (first = p = vec.begin(), last = vec.end(); p < last; ++p)
Here the , is a comma operator, and it clearly needs to bind less tightly than either of the assignments which surround it. (C and C++ are only exceptional in this syntax by having a comma operator; in most languages, , is not considered an operator.) Also, it would obviously be undesirable for the first assignment expression to be parsed as (first = p) = vec.begin().
The fact that assignment operators associate to the right is unremarkable, but it's worth noting for one historical curiosity. When Bjarne Stroustrup was looking around for operators to overload for I/O streams, he settled on << and >> because, although an assignment operator might have been more natural [1], assignment binds to the right, and a streaming operator must bind to the left (std::cout << a << b must be (std::cout << a) << b). However, since << binds much more tightly than assignment, there are a number of gotchas when using the streaming operators. (The one which most recently caught me is that shift binds more tightly than the bitwise operators.)
[Note 1]: I don't have a citation for this, but I remember reading it many years ago in The C++ Programming Language. As I recall, there was not a consensus about assignment operators being natural, but it seems more natural than overloading shift operators to be something completely different from their normal semantics.

Low level details of C/C++ assignment operator implementation. What does it return?

I m a total newbie to a C++ world (and C too). And don't know all its details. But one thing really bothers me.
It is constructions like :
while (a=b) {...} .As I understand this magic works because assignment operator in C and C++ returns something.
So the questions: what does it return? Is this a documented thing?Does it work the same in C and C++. Low level details about assignment operator and its implementation in both C and C++ (if there is a difference) will be very appreciated!
I hope that this question won't be closed, because I can't find a comprehensive explanation and good material on this theme from the low level point of view all the more so.
For built-in types in C++ evaluating an assignment expression produces an lvalue that is the left hand side of the assignment expression. The assignment is sequenced before the result can be used, so when the result is converted to an rvalue you get the newly assigned value:
int a, b=5;
int &c = (a=b);
assert(&c==&a);
b=10;
assert(10==(a=b));
C is almost but not exactly the same. The result of an assignment expression in C is an rvalue the same as the value newly assigned to the left hand side of the assignment.
int *c = &(a=b); // not legal in C because you can only take the address of lvalues.
Usually if the result of an assignment is used at all it's used as an rvalue (e.g., a=b=c), so this difference between C++ and C largely goes unnoticed.
The assignment operator is defined (in C) as returning the value of the variable that was assigned to - i.e. the value of the expression (a=b) is the value of a after the expression has been evaluated.
It can be defined to be something different (of the same type) for user-defined operator overloads in C++, but I suspect most would consider this to be a very unpleasant use of operator overloading.
You can use this (non-boolean) value in a while (or an if, etc.) because of type conversion - using a value in a conditional context causes it to be implicitly converted to something which makes sense in a conditional context. In C++, this is bool, and you can define your own conversion (for your own type) by overloading operator bool(). In C, anything other than 0 is true.
To understand such expressions, you have to first understand that, positive integers are considered as 'true' and 0 is considered as false.
An assignment evaluates to the left hand side of the operator = as its value. So, while(a=b) { } would mean, while(1 /*true*/) if a after being assigned to b evaluates to non-zero. Else, it is considered as while(0 /*false*/)
Similarly, with the operator (a=b)?1:0 is the value of a after being assigned to b .. if it is non-zero then the value is taken as true and the statement following ? will be executed, or the statement following : is executed.
Assignments usually evaluate to the value at the left hand side of the operator = where as, logical operators(such as ==, && etc) evaluate to 1 or 0.
Note: with C++, it will depend upon if or not, a certain operator is overloaded.. and it will also depend upon the return type of the overloaded operator.
The assignment operators in C and C++ return the value of the variable being assigned to, i.e., their left operand. In your example of a = b, the value of this entire expression is the value that is assigned to a (which is the value of b converted into the type of a).
So you can say that the assignment operator "returns" the value of its left operand.
In C++ it's a little more complicated because you can overload the = operator with an actual user-defined function, and have it return something other than the value (and type) of the left operand.

Why does C++ accept multiple prefixes but not postfixes for a variable

While looking into Can you have a incrementor and a decrementor on the same variable in the same statement in c
I discovered that you can have several prefix increment/decrement operators on a single variable, but only one postfix
ex:
++--++foo; // valid
foo++--++; // invalid
--foo++; // invalid
Why is this?
This is due to the fact that in C++ (but not C), the result of ++x is a lValue, meaning it is assignable, and thus chain-able.
However, the result of x++ is NOT an lValue, instead it is a prValue, meaning it cannot be assigned to, and thus cannot be chained.
In C++ language prefix increment/decrement operators return lvalues, while postfix ones return rvalues. Meanwhile, all modifying operators require lvalue arguments. This means that the result of prefix increment/decrement can be passed on to any other additional operator that requires an lvalue argument (including additional increments/decrements).
For the very same reason in C++ you can write code like this
int i = 0;
int *p = &++i;
which will increment i and make p point to i. Unary & requires lvalue operand, which is why it will work with the result of prefix ++ (but not with postfix one).
Expressions with multiple built-in prefix increments/decrements applied to the same object produce undefined behavior, but they are nevertheless well-formed (i.e. "compilable").
Expressions like ++foo-- are invalid because in C++ postfix operators have higher precedence than prefix ones. Braces can change that. For example, (++foo)-- is a well-formed expression, albeit leading to undefined behavior again.

Is comma operator free from side effect?

For example for such statement:
c += 2, c -= 1
Is it true that c += 2 will be always evaluated first, and c in second expression c-= 1 will always be updated value from expression c += 2?
Yes, it is guaranteed by the standard, as long as that comma is a non-overloaded comma operator. Quoting n3290 ยง5.18:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-
value expression (Clause 5)83. Every value computation and side effect associated with the left expression
is sequenced before every value computation and side effect associated with the right expression. The type
and value of the result are the type and value of the right operand; the result is of the same value category
as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field.
And the corresponding footnote:
83 However, an invocation of an overloaded comma operator is an ordinary function call; hence, the evaluations of its argument
expressions are unsequenced relative to one another (see 1.9).
So this holds only for the non-overloaded comma operator.
The , between arguments to a function are not comma operators. This rule does not apply there either.
For C++03, the situation is similar:
The comma operator groups left-to-right.
expression:
assignment-expression
expression , assignment-expression
A pair of expressions separated by a comma is evaluated left-to-right and the value of the left expression is
discarded. The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conver-
sions are not applied to the left expression. All side effects (1.9) of the left expression, except for the
destruction of temporaries (12.2), are performed before the evaluation of the right expression. The type and
value of the result are the type and value of the right operand; the result is an lvalue if its right operand is.
Restrictions are the same though: does not apply to overloaded comma operators, or function argument lists.
Yes, the comma operator guarantees that the statements are evaluated in left-to-right order, and the returned value is the evaluated rightmost statement.
Be aware, however, that the comma in some contexts is not the comma operator. For example, the above is not guaranteed for function argument lists.
Yes, in C++ the comma operator is a sequence point and those expression will be evaluated in the order they are written. See 5.18 in the current working draft:
[snip] is evaluated left-to-right. [snip]
I feel that your question is lacking some explanation as to what you mean by "side effects". Every statement in C++ is allowed to have a side effect and so is an overloaded comma operator.
Why is the statement you have written not valid in a function call?
It's all about sequence points. In C++ and C it is forbidden to modify a value twice inside between two sequence points. If your example truly uses operator, every self-assignment is inside its own sequence point. If you use it like this foo(c += 2, c -= 2) the order of evaluation is undefined. I'm actually unsure if the second case is undefined behaviour as I do not know if an argument list is one or many sequence points. I ought to ask a question about this.
It should be always evaluated from left to right, as this is the in the definition of the comma operator:
Link
You've got two questions.
The first question: "Is comma operator free from side effect?"
The answer to this is no. The comma operator naturally facilitates writing expressions with side effects, and deliberately writing expressions with side effects is what the operator is commonly used for. E.g., in while (cin >> str, str != "exit") the state of the input stream is changed, which is an intentional side effect.
But maybe you don't mean side-effect in the computer science sense, but in some ad hoc sense.
Your second question: "For example for such statement: c += 2, c -= 1 Is it true that c += 2 will be always evaluated first, and c in second expression c-= 1 will always be updated value from expression c += 2?"
The answer to this is yes in the case of a statement or expression, except when the comma operator is overloaded (very unusual). However, sequences like c += 2, c -= 1 can also occur in argument lists, in which case, what you've got is not an expression, and the comma is not a sequence operator, and the order of evaluation is not defined. In foo(c += 2, c -= 1) the comma is not a comma operator, but in foo((c += 2, c -= 1)) it is, so it may pay to pay attention to the parentheses in function calls.