According to the C++ standard, can I be sure that assignment operators for built-in variables return (the original value)?
Or is this implementation dependent (yet simply have most popular compilers implemented this)?
Yes, it is guaranteed:
5.17 Assignment and compound assignment operators
The assignment operator (=) and the compound assignment operators all group
right-to-left. All require a modifiable lvalue as their left operand
and return an lvalue referring to the left operand.
This applies to built-in types. With user-defined types it can return anything.
It depends on what you mean by "the original value".
For example:
#include <iostream>
int main() {
int i;
std::cout << (i = 1.9) << "\n";
}
prints 1. The assignment expression yields the new value of the LHS (namely 1), not the "original value" of the RHS (1.9).
I'm not sure whether that's what you meant to ask about.
According to the C++ standard, can I be sure that assignment operators for build in variables return (the original value)?
Edit I get it now, I think. Yes: you can be sure that the built-in types return the original value by reference after operator=, *=, /=, -=, +=, ^=, ~=, &= and |=.
"For build in variables" is a bit cryptic to me. However,
X makeX()
{
return X();
} // must have accessible copy constructor
// ...
X x;
x = makeX(); // ... _and_ assignment operator
Or is this implementation dependent
It should not be (edit see standard reference by UncleBens)
Yes. It is guaranteed that all assignments and augmented assignments on predefined types work that way.
Note however that user defined types assignments or augmented assignments can instead for example return void. This is not good practice and shouldn't be done (it raises the surprise effect on users - not a good thing), but it's technically possible so if you are writing for example a template library you should not make that assumption unless it's really important for you.
Yes. Operator semantics will simply not work otherwise.
Related
I was reading Effective C++: 55 Specific Ways to Improve Your Programs and Designs by Scott Meyers and he stated:
Having a function return a constant value is generally inappropriate, but sometimes doing so can reduce the incidence of client errors without giving up safety or efficiency. For example, consider the declaration of the operator* function:
class Rational { ... };
const Rational operator*(const Rational& lhs, const Rational& rhs);
According to Meyers, do this prevents "atrocities" like this, which would be illegal if a, b were primitive types:
Rational a, b, c;
...
(a * b) = c;
This got me confused and while trying to understand why the above assignment was illegal for primitive types but not user-defined types, I came across rvalues and lvalues
I still feel I don't have a strong grasp of what rvalues and lvalues are after looking through some SO questions, but here's my basic understanding: an lvalue references a location in memory and thus can be assigned to (it can be on both sides of = operator as well); an rvalue however, cannot be assigned to because it does not reference a memory location(e.g. temporary values from function returns and literals)
My question is: why is assigning to a product of two numbers/objects legal for user-defined types (even though it does not make sense) but not primitives? Does it have to do with return types? does the overloaded * operator return an assignable value or a temporary value?
[expr.call]/14: A function call is an lvalue if the result type is an lvalue reference type or an rvalue reference to function type, an xvalue if the result type is an rvalue reference to object type, and a prvalue otherwise.
This makes sense, since the result doesn't "have a name". If you returned a reference, the implication would be that it is a reference to some object somewhere that does "have a name" (which is, generally but not always, true).
Then there's this:
[expr.ass]/1: The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand; their result is an lvalue referring to the left operand.
This is saying that an assignment requires an lvalue on the left hand side. So far so good; you've covered this yourself.
How come a non-const function call result works then?
By a special rule!
[over.oper]/8:: [..] Some predefined operators, such as +=, require an operand to be an lvalue when applied to basic types; this is not required by operator functions.
… and = applied to an object of class type invokes an operator function.
I can't readily answer the "why": on the surface of it, it made sense to relax this restriction when dealing with classes, and the original (inherited) restriction on built-ins always seemed a little excessive (in my opinion) but would have had to be kept for compatibility reasons.
But then you have people like Meyers pointing out that it now becomes useful (sort of) to return const values to effectively "undo" this change.
Ultimately I wouldn't try too hard to find a strong rationale either way.
This line compiles when I use C++, but not C:
gmtime(&(*(time_t *)alloca(sizeof(time_t)) = time(NULL))); //make an lvalue with alloca
I'm surprised by this difference. There is not even a warning for C++.
When I specify gcc -x c, the message is:
playground.cpp:25:8: error: lvalue required as unary '&' operand
gmtime(&(*(time_t *)alloca(sizeof(time_t)) = time(NULL)));
^
Isn't the & here just an address-of operator? Why is it different in C and C++?
Although I can use compound literals in C, still is it possible to modify my syntax to make it work in both C & C++?
In C11 6.5.16/3:
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue.
In C++14 5.17/1:
The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand.
(Earlier versions of the language standards in each case specified the same thing).
Since the address-of operator can only operate on an lvalue, the code is correct in C++ but not in C.
Regarding the question "Is it possible to modify my syntax to make it work in both C & C++?". This is not a desirable goal; the two languages are different and you should decide what you're writing. This makes about as much sense as trying to stick to syntax that works in both C and Java.
As suggested by others, you could write:
time_t t = time(NULL);
gmtime(&t);
which has the benefits over your original code of being:
simpler, therefore easier to understand and maintain
does not depend on non-standard alloca function
does not have potential alignment violation
uses no more memory and perhaps uses less
Every programmer should know that:
(De Morgan's Laws)
Under some circumstances, in order to optimize the program, it may happen that compiler modifies (!p && !q) to (!(p || q)).
The two expressions are equivalent, and it makes no difference evaluating the first or the second.
But in C++ it is possible to overload operators, and the overloaded operator may not always respect this property. So transforming the code this way will actually modify the code.
Should the compiler use De Morgan's Laws when !, || and && are overloaded?
Note that:
Builtin operators && and || perform short-circuit evaluation (do not evaluate the second operand if the result is known after evaluating the first), but overloaded operators behave like regular function calls and always evaluate both operands.
...
Because the short-circuiting properties of operator&& and operator|| do not apply to overloads, and because types with boolean semantics are uncommon, only two standard library classes overload these operators ...
Source: http://en.cppreference.com/w/cpp/language/operator_logical
(emphasis mine)
And that:
If there is a user-written candidate with the same name
and parameter types as a built-in candidate operator function, the built-in operator function is hidden and
is not included in the set of candidate functions.
Source: n4431 13.6 Built-in operators [over.built] (emphasis mine)
To summarize: overloaded operators behave like regular, user-written functions.
NO, the compiler will not replace a call of a user-written function with a call of another user-written function.
Doing otherwise would potentially violate the "as if" rule.
I think that you have answered your own question: no, a compiler can not do this. Not only the operators can be overloaded, some can not be even defined. For example, you can have operator && and operator ! defined, and operator || not defined at all.
Note that there are many other laws that the compiler can not follow. For example, it can not change p||q to q||p, as well as x+y to y+x.
(All of the above applies to overloaded operators, as this is what the question asks for.)
No, in that case the transformation would be invalid. The permission to transform !p && !q into !(p || q) is implicit, by the as-if rule. The as-if rule allows any transformation that, roughly speaking, cannot be observed by a correct program. When overloaded operators are used and would detect the transformation, that automatically means the transformation is no longer allowed.
Overloaded operators per se are just syntactic sugar for function calls; the compiler itself is not allowed to make any assumption about the properties that may or may not hold for such calls. Optimizations that exploit properties of some specific operator (say, De Morgan's for boolean operators, commutativity for sums, distributivity for sum/product, transformation of integral division in an appropriate multiplication, ...) can be employed only when the "real operators" are used.
Notice instead that some parts of the standard library may associate some specific semantic meaning to overloaded operators - for example, std::sort by default expects an operator< that complies to a strict weak ordering between the elements - but this is of course listed in the prerequisites of each algorithm/container.
(incidentally, overloading && and || should probably be avoided anyway since they lose their short-circuiting properties when overloaded, so their behavior becomes surprising and thus potentially dangerous)
You are asking whether the compiler can arbitrarily rewrite your program to do something you did not write it to do.
The answer is: of course not!
Where De Morgan's laws apply, they may be applied.
Where they don't, they may not.
It's really that simple.
Not directly.
If p and q are expressions so that p does not have overloaded operators, short circuit evaluation is in effect: expression q is going to be evaluated only if p is false.
If p is of non-primitive type, there is no short circuit evaluation and overloaded function could be anything - even not related to the conventional usage.
Compiler will do its optimizations in its own way. Perhaps it might result de Morgan identities, but not on the level of if condition replacement.
DeMorgan's laws apply to the semantics of those operators. Overloading applies to the syntax of those operators. There is no guarantee that an overloaded operator implements the semantics that are needed for DeMorgan's laws to apply.
But in C++ it is possible to overload operators, and the overloaded operator may not always respect this property.
Overloaded operator is no longer an operator, it is a function call.
class Boolean
{
bool value;
..
Boolean operator||(const Boolean& b)
{
Boolean c;
c.value = this->value || b.value;
return c;
}
Boolean logical_or(const Boolean& b)
{
Boolean c;
c.value = this->value || b.value;
return c;
}
}
So this line of code
Boolean a (true);
Boolean b (false);
Boolean c = a || b;
is equivalent to this
Boolean c = a.logical_or(b);
From the C++ standard, the grammar for an assignment expression is like this:
assignment-expression:
conditional-expression
logical-or-expression assignment-operator assignment-expression
throw-expression
assignment-operator: one of
= *= /= %= += -= >>= <<= &= ^= |=
Notice that the left hand side of the "assignment-operator" is "logical-or-expression" i.e. something like (4 || localvar1) = 5; is a valid assignment expression according to the grammar. This doesn't make sense to me. Why they choose a "logical-or-expression" instead of, say, an identifier or id_expression?
The grammar is a bit complex, but if you continue unrolling with the previous definitions you will see that assignment expressions are very generic and allow for mostly anything. While the snippet from the standard that you quote focuses on logical-or-expression, if you keep unrolling the definition of that you will find that both the left hand side and right hand side of an assignment can be almost any subexpression (although not literally any).
The reason as pointed out before is that assignment can be applied to any lvalue expression of enum or fundamental type or rvalue expression of a class type (where operator= is always a member). Many expressions, in a language that allows for operator overloading and that does not define what the type returned from the operator is, can potentially fulfill the needs of assignment, and the grammar must allow all of those uses.
Different rules in the standard will later limit which of the possible expressions that can be generated from the grammar are actually valid or not.
Your particular statement, (4 || localvar1) = 5; is invalid (unless operator|| is overloaded), because you can't assign 5 to 4 (4 is an r-value). You must have an l-value (something that can be assigned to) on the left, such as a reference returned by a function.
For example, say you have some function int& get_my_int() that returns the reference to an integer. Then, you can do this:
`get_my_int() = 5;`
This will set the integer returned by get_my_int() to 5.
Just like in your first post, this MUST be a reference to an integer (and not a value); otherwise, the above statement wouldn't compile.
There are actually two interesting things about the C++ grammar for assignment statements, neither of which have to do with the validity of:
(4 || localvar1) = 5;
That expression is syntactically valid (up to type-checking) because of the parentheses. Any parenthesized expression of reference type is syntactically correct on the left-hand-side of an assignment operator. (And, as has been pointed out, almost any expression which involves a user type or function can be of reference type, as a result of operator overloading.)
What's more interesting about the grammar is that it establishes the left precedence of assignment operators as being lower than almost all other operators, including logical-or, so that the above expression is semantically equivalent to
4 || localvar1 = 5;
even though many readers would interpret the above as 4 || (localvar1 = 5) (which would have been totally correct assuming that localvar1 is of a type which can be assigned to by an int, even though said assignment will never happen -- unless, of course, || is overloaded in this context).
So what has a lower precedence on the left hand side of an assignment operator? As I said, very little, but one important exception is ?::
// Replace the larger of a and b with c
a > b ? a = c : b = c;
is valid and conveniently parenthesis-less. (Many style guides insist on redundant parentheses here, but I personally rather like the unparenthesized version.) This is not the same as right-hand precedence, so that the following also works without parentheses:
// Replace c with the larger of a and b
c = a > b ? a : b;
The only other operators which bind less tightly on the left of an assignment operator than the assignment operator are the , operator and another assignment operator. (In other words, assignment is right-associative unlike almost all other binary operators.) Neither of these are surprising -- in fact, they are so necessary that it's easy to miss how important it is to design a grammar in this way. Consider the following unremarkable for clause:
for (first = p = vec.begin(), last = vec.end(); p < last; ++p)
Here the , is a comma operator, and it clearly needs to bind less tightly than either of the assignments which surround it. (C and C++ are only exceptional in this syntax by having a comma operator; in most languages, , is not considered an operator.) Also, it would obviously be undesirable for the first assignment expression to be parsed as (first = p) = vec.begin().
The fact that assignment operators associate to the right is unremarkable, but it's worth noting for one historical curiosity. When Bjarne Stroustrup was looking around for operators to overload for I/O streams, he settled on << and >> because, although an assignment operator might have been more natural [1], assignment binds to the right, and a streaming operator must bind to the left (std::cout << a << b must be (std::cout << a) << b). However, since << binds much more tightly than assignment, there are a number of gotchas when using the streaming operators. (The one which most recently caught me is that shift binds more tightly than the bitwise operators.)
[Note 1]: I don't have a citation for this, but I remember reading it many years ago in The C++ Programming Language. As I recall, there was not a consensus about assignment operators being natural, but it seems more natural than overloading shift operators to be something completely different from their normal semantics.
I m a total newbie to a C++ world (and C too). And don't know all its details. But one thing really bothers me.
It is constructions like :
while (a=b) {...} .As I understand this magic works because assignment operator in C and C++ returns something.
So the questions: what does it return? Is this a documented thing?Does it work the same in C and C++. Low level details about assignment operator and its implementation in both C and C++ (if there is a difference) will be very appreciated!
I hope that this question won't be closed, because I can't find a comprehensive explanation and good material on this theme from the low level point of view all the more so.
For built-in types in C++ evaluating an assignment expression produces an lvalue that is the left hand side of the assignment expression. The assignment is sequenced before the result can be used, so when the result is converted to an rvalue you get the newly assigned value:
int a, b=5;
int &c = (a=b);
assert(&c==&a);
b=10;
assert(10==(a=b));
C is almost but not exactly the same. The result of an assignment expression in C is an rvalue the same as the value newly assigned to the left hand side of the assignment.
int *c = &(a=b); // not legal in C because you can only take the address of lvalues.
Usually if the result of an assignment is used at all it's used as an rvalue (e.g., a=b=c), so this difference between C++ and C largely goes unnoticed.
The assignment operator is defined (in C) as returning the value of the variable that was assigned to - i.e. the value of the expression (a=b) is the value of a after the expression has been evaluated.
It can be defined to be something different (of the same type) for user-defined operator overloads in C++, but I suspect most would consider this to be a very unpleasant use of operator overloading.
You can use this (non-boolean) value in a while (or an if, etc.) because of type conversion - using a value in a conditional context causes it to be implicitly converted to something which makes sense in a conditional context. In C++, this is bool, and you can define your own conversion (for your own type) by overloading operator bool(). In C, anything other than 0 is true.
To understand such expressions, you have to first understand that, positive integers are considered as 'true' and 0 is considered as false.
An assignment evaluates to the left hand side of the operator = as its value. So, while(a=b) { } would mean, while(1 /*true*/) if a after being assigned to b evaluates to non-zero. Else, it is considered as while(0 /*false*/)
Similarly, with the operator (a=b)?1:0 is the value of a after being assigned to b .. if it is non-zero then the value is taken as true and the statement following ? will be executed, or the statement following : is executed.
Assignments usually evaluate to the value at the left hand side of the operator = where as, logical operators(such as ==, && etc) evaluate to 1 or 0.
Note: with C++, it will depend upon if or not, a certain operator is overloaded.. and it will also depend upon the return type of the overloaded operator.
The assignment operators in C and C++ return the value of the variable being assigned to, i.e., their left operand. In your example of a = b, the value of this entire expression is the value that is assigned to a (which is the value of b converted into the type of a).
So you can say that the assignment operator "returns" the value of its left operand.
In C++ it's a little more complicated because you can overload the = operator with an actual user-defined function, and have it return something other than the value (and type) of the left operand.