In C++, the pre-increment operator yields an lvalue because the incremented object itself is returned, not a copy.
But in C, it gives rvalue. Why?
C doesn't have references. In C++, ++i returns a reference to i (an lvalue), whereas in C it returns an incremented copy.
C99 6.5.3.1/2
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation. The expression ++E is equivalent to (E+=1).
‘‘value of an expression’’ <=> rvalue
As for the historical reason, I think "references not being part of C" is a plausible one.
C99 says in the footnote (of section §6.3.2.1),
The name ‘‘lvalue’’ comes originally
from the assignment expression E1 =
E2, in which the left operand E1 is
required to be a (modifiable) lvalue.
It is perhaps better considered as
representing an object ‘‘locator
value’’. What is sometimes called
‘‘rvalue’’ is in this International
Standard described as the ‘‘value of
an expression’’.
Hope that explains why ++i in C returns an rvalue.
As for C++, I would say it depends on the object being incremented. If the object's type is some user-defined type, then the operator may always return an lvalue. That means you can always write i++++++++ or ++++++i if the type of i is Index as defined here:
Undefined behavior and sequence points reloaded
Off the top of my head, I can't imagine any useful statements that could result from using a pre-incremented variable as an lvalue. In C++, due to the existence of operator overloading, I can. Do you have a specific example of something that you're prevented from doing in C, due to this restriction?
Why is
int main()
{
    int i = 0;
    ++++i;
}
valid C++ but not valid C?
C and C++ say different things about the result of prefix ++. In C++:
[expr.pre.incr]
The operand of prefix ++ is modified by adding 1. The operand shall be
a modifiable lvalue. The type of the operand shall be an arithmetic
type other than cv bool, or a pointer to a completely-defined object
type. The result is the updated operand; it is an lvalue, and it is a
bit-field if the operand is a bit-field. The expression ++x is
equivalent to x+=1.
So ++ can be applied on the result again, because the result is basically just the object being incremented and is an lvalue. In C however:
6.5.3 Unary operators
The operand of the prefix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.
The value of the operand of the prefix ++ operator is incremented. The
result is the new value of the operand after incrementation.
The result is not an lvalue; it's just the pure value of the incrementation. So you can't apply any operator that requires an lvalue on it, including ++.
If you are ever told that C++ is a superset of C, or that C is a subset of C++, know that this is not the case. There are many differences that make that assertion false.
In C, it's always been that way. Possibly because pre-incremented ++ can be optimised to a single machine code instruction on many CPUs, including ones from the 1970s which was when the ++ concept developed.
In C++ though there's the symmetry with operator overloading to consider. To match C, the canonical pre-increment ++ would need to return const &, unless you had different behaviour for user-defined and built-in types (which would be a smell). Restricting the return to const & is a contrivance. So the return of ++ gets relaxed from the C rules, at the expense of increased compiler complexity in order to exploit any CPU optimisations for built-in types.
I assume you understand why it's fine in C++ so I'm not going to elaborate on that.
For whatever it's worth, here's my test result:
t.c:6:2: error: lvalue required as increment operand
++ ++c;
^
Regarding CppReference:
Non-lvalue object expressions
Colloquially known as rvalues, non-lvalue object expressions are the expressions of object types that do not designate objects, but rather values that have no object identity or storage location. The address of a non-lvalue object expression cannot be taken.
The following expressions are non-lvalue object expressions:
all operators not specified to return lvalues, including
increment and decrement operators (note: pre- forms are lvalues in C++)
And Section 6.5.3.1 from n1570:
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation.
So in C, the result of the prefix increment and prefix decrement operators is not specified to be an lvalue, and therefore cannot be incremented again. In effect, that wording can be read as "required to be an rvalue".
The other answers explain the way that the standards diverge in what they require. This answer provides a motivating example in the area of difference.
In C++, you can have a function like int& foo(int&);, which has no analog in C. It is useful (and not onerous) for C++ to have the option of foo(foo(x));.
Imagine for a moment that operations on basic types were defined somewhere, e.g. int& operator++(int&);. ++++x itself is not a motivating example, but it fits the pattern of foo above.
Recently tried the following program and it compiles, runs fine and produces expected output instead of any runtime error.
#include <iostream>
class demo
{
public:
    static void fun()
    {
        std::cout<<"fun() is called\n";
    }
    static int a;
};
int demo::a=9;
int main()
{
    demo* d=nullptr;
    d->fun();
    std::cout<<d->a;
    return 0;
}
If an uninitialized pointer is used to access class and/or struct members, the behaviour is undefined, but why is it allowed to access static members through a null pointer? Is there any harm in my program?
TL;DR: Your example is well-defined. Merely dereferencing a null pointer is not invoking UB.
There is a lot of debate over this topic, which basically boils down to whether indirection through a null pointer is itself UB.
The only questionable thing that happens in your example is the evaluation of the object expression. In particular, d->a is equivalent to (*d).a according to [expr.ref]/2:
The expression E1->E2 is converted to the equivalent form
(*(E1)).E2; the remainder of 5.2.5 will address only the first
option (dot).
*d is just evaluated:
The postfix expression before the dot or arrow is evaluated;65 the
result of that evaluation, together with the id-expression, determines
the result of the entire postfix expression.
65) If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary
to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.
Let's extract the critical part of the code. Consider the expression statement
*d;
In this statement, *d is a discarded-value expression according to [stmt.expr]. So *d is solely evaluated (see note 1), just as in d->a.
Hence if *d; is valid, or in other words the evaluation of the expression *d, so is your example.
Does indirection through null pointers inherently result in undefined behavior?
There is the open CWG issue #232, created over fifteen years ago, which concerns this exact question. A very important argument is raised. The report starts with
At least a couple of places in the IS state that indirection through a
null pointer produces undefined behavior: 1.9 [intro.execution]
paragraph 4 gives "dereferencing the null pointer" as an example of
undefined behavior, and 8.3.2 [dcl.ref] paragraph 4 (in a note) uses
this supposedly undefined behavior as justification for the
nonexistence of "null references."
Note that the example mentioned was changed to cover modifications of const objects instead, and the note in [dcl.ref] - while still existing - is not normative. The normative passage was removed to avoid commitment.
However, 5.3.1 [expr.unary.op] paragraph 1, which describes the unary
"*" operator, does not say that the behavior is undefined if the
operand is a null pointer, as one might expect. Furthermore, at least
one passage gives dereferencing a null pointer well-defined behavior:
5.2.8 [expr.typeid] paragraph 2 says
If the lvalue expression is obtained by applying the unary * operator
to a pointer and the pointer is a null pointer value (4.10
[conv.ptr]), the typeid expression throws the bad_typeid exception
(18.7.3 [bad.typeid]).
This is inconsistent and should be cleaned up.
The last point is especially important. The quote in [expr.typeid] still exists and appertains to glvalues of polymorphic class type, which is the case in the following example:
#include <iostream>
#include <typeinfo>

int main() try {
    // Polymorphic type
    class A
    {
        virtual ~A(){}
    };
    typeid( *((A*)0) );   // operand is evaluated because A is polymorphic
}
catch (const std::bad_typeid&)
{
    std::cerr << "bad_exception\n";
}
The behavior of this program is well-defined (an exception will be thrown and caught), and the expression *((A*)0) is evaluated because it isn't part of an unevaluated operand. Now if indirection through null pointers induced UB, then the expression written as
*((A*)0);
would be doing just that, inducing UB, which seems nonsensical when compared to the typeid scenario. If the above expression is merely evaluated, as every discarded-value expression is (see note 1), where is the crucial difference that makes the evaluation in the second snippet UB? No existing implementation analyzes the typeid operand, finds the innermost corresponding dereference, and surrounds its operand with a check; that would also cost performance.
A note in that issue then ends the short discussion with:
We agreed that the approach in the standard seems okay: p = 0; *p;
is not inherently an error. An lvalue-to-rvalue conversion would give
it undefined behavior.
I.e. the committee agreed upon this. Although the proposed resolution of this report, which introduced so-called "empty lvalues", was never adopted…
However, “not modifiable” is a compile-time concept, while in fact
this deals with runtime values and thus should produce undefined
behavior instead. Also, there are other contexts in which lvalues can
occur, such as the left operand of . or .*, which should also be
restricted. Additional drafting is required.
…that does not affect the rationale. Then again, it should be noted that this issue even precedes C++03, which makes it less convincing while we approach C++17.
CWG-issue #315 seems to cover your case as well:
Another instance to consider is that of invoking a member function
from a null pointer:
struct A { void f () { } };
int main ()
{
A* ap = 0;
ap->f ();
}
[…]
Rationale (October 2003):
We agreed the example should be allowed. p->f() is rewritten as
(*p).f() according to 5.2.5 [expr.ref]. *p is not an error when
p is null unless the lvalue is converted to an rvalue (4.1
[conv.lval]), which it isn't here.
According to this rationale, indirection through a null pointer per se does not invoke UB without further lvalue-to-rvalue conversions (=accesses to stored value), reference bindings, value computations or the like. (Nota bene: Calling a non-static member function with a null pointer should invoke UB, albeit merely hazily disallowed by [class.mfct.non-static]/2. The rationale is outdated in this respect.)
I.e. a mere evaluation of *d does not suffice to invoke UB. The identity of the object is not required, and neither is its previously stored value. On the other hand, e.g.
*p = 123;
is undefined since there is a value computation of the left operand, [expr.ass]/1:
In all cases, the assignment is sequenced after the value computation
of the right and left operands
Because the left operand is expected to be a glvalue, the identity of the object referred to by that glvalue must be determined as mentioned by the definition of evaluation of an expression in [intro.execution]/12, which is impossible (and thus leads to UB).
1 [expr]/11:
In some contexts, an expression only appears for its side effects.
Such an expression is called a discarded-value expression. The
expression is evaluated and its value is discarded. […]. The lvalue-to-rvalue conversion (4.1) is
applied if and only if the expression is a glvalue of
volatile-qualified type and […]
From the C++ Draft Standard N3337:
9.4 Static members
2 A static member s of class X may be referred to using the qualified-id expression X::s; it is not necessary to use the class member access syntax (5.2.5) to refer to a static member. A static member may be referred
to using the class member access syntax, in which case the object expression is evaluated.
And in the section about object expression...
5.2.5 Class member access
4 If E2 is declared to have type “reference to T,” then E1.E2 is an lvalue; the type of E1.E2 is T. Otherwise,
one of the following rules applies.
— If E2 is a static data member and the type of E2 is T, then E1.E2 is an lvalue; the expression designates the named member of the class. The type of E1.E2 is T.
Based on the last paragraph of the standard, the expressions:
d->fun();
std::cout << d->a;
work because they both designate the named member of the class regardless of the value of d.
runs fine and produces expected output instead of any runtime error.
That's a basic assumption error. What you are doing is undefined behavior, which means that your claim for any kind of "expected output" is faulty.
Addendum: Note that, while there is a CWG defect (#315) report that is closed as "in agreement" of not making the above UB, it relies on the positive closing of another CWG defect (#232) that is still active, and hence none of it is added to the standard.
Let me quote a part of a comment from James McNellis to an answer to a similar Stack Overflow question:
I don't think CWG defect 315 is as "closed" as its presence on the "closed issues" page implies. The rationale says that it should be allowed because "*p is not an error when p is null unless the lvalue is converted to an rvalue." However, that relies on the concept of an "empty lvalue," which is part of the proposed resolution to CWG defect 232, but which has not been adopted.
The expressions d->fun() and d->a both cause evaluation of *d ([expr.ref]/2).
The complete definition of the unary * operator from [expr.unary.op]/1 is:
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.
For the expression d there is no "object or function to which the expression points" . Therefore this paragraph does not define the behaviour of *d.
Hence the code is undefined by omission, since the behaviour of evaluating *d is not defined anywhere in the Standard.
What you are seeing here is what I would consider an ill-conceived and unfortunate design choice in the specification of the C++ language and many other languages that belong to the same general family of programming languages.
These languages allow you to refer to static members of a class using a reference to an instance of the class. The actual value of the instance reference is of course ignored, since no instance is required to access static members.
So, in d->fun(); the compiler uses the d pointer only during compilation to figure out that you are referring to a member of the demo class, and then it ignores it. No code is emitted by the compiler to dereference the pointer, so the fact that it is going to be NULL during runtime does not matter.
So, what you see happening is in perfect accordance to the specification of the language, and in my opinion the specification suffers in this respect, because it allows an illogical thing to happen: to use an instance reference to refer to a static member.
P.S. Most compilers in most languages are actually capable of issuing warnings for that kind of stuff. I do not know about your compiler, but you might want to check, because the fact that you received no warning for doing what you did might mean that you do not have enough warnings enabled.
Consider the following code snippet
int a,i;
a = 5;
(i++) = a;
(++i) = a;
cout<<i<<endl;
Line (++i) = a is compiling properly and giving 5 as output.
But (i++) = a is giving compilation error error: lvalue required as left operand of assignment.
I am not able to find the reason for this difference in behavior. I would be grateful if someone could explain it.
The expression i++ evaluates to the value of i prior to the increment operation. That value is a temporary (which is an rvalue) and you cannot assign to it.
++i works because that expression evaluates to i after it has been incremented, and i can be assigned to (it's an lvalue).
More on lvalues and rvalues on Wikipedia.
According to the C++ standard, prefix ++ is an lvalue (which
is different than C), post-fix no. More generally, C++ takes
the point of view that anything which changes an lvalue
parameter, and has as its value the value of that parameter,
results in an lvalue. So ++ i is an lvalue (since the
resulting value is the new value of i), but i ++ is not
(since the resulting value is not the new value, but the old).
All of this, of course, for the built-in ++ operators. If you
overload, it depends on the signatures of your overloads (but
a correctly designed overloaded ++ will behave like the
built-in ones).
Of course, neither (++ i) = a; nor (i ++) = a; in your
example are legal; both use the value of an uninitialized
variable (i), which is undefined behavior, and both modify i
twice without an intervening sequence point.
test.(c/cpp)
#include <stdio.h>
int main(int argc, char** argv)
{
    int a = 0, b = 0;
    printf("a = %d, b = %d\n", a, b);
    b = (++a)--;
    printf("a = %d, b = %d\n", a, b);
    return 0;
}
If I save the above as a .cpp file, it compiles and outputs this upon execution:
a = 0, b = 0
a = 0, b = 1
However, if I save it as a .c file, I get the following error:
test.c:7:12: error: lvalue required as decrement operator.
Shouldn't the (++a) operation be resolved before the (newValue)-- operation? Does anyone have any insight on this?
In C the result of the prefix and postfix increment/decrement operators is not an lvalue.
In C++ the result of the postfix increment/decrement operator is also not an lvalue but the result of the prefix increment/decrement operator is an lvalue.
Now doing something like (++a)-- in C++ is undefined behavior because you are modifying an object value twice between two sequence points.
EDIT: following up on @bames53's comment: this is undefined behavior in C++98/C++03, but the C++11 changes to the idea of sequence points now make this expression defined.
In C and C++, there are lvalue expressions which may be used on the left-hand side of the = operator and rvalue expressions which may not. C++ allows more things to be lvalues because it supports reference semantics.
++ a = 3; /* makes sense in C++ but not in C. */
The increment and decrement operators are similar to assignment, since they modify their argument.
In C++03, (++a)-- would cause undefined behavior because two operations which are not sequenced with respect to each other are modifying the same variable. (Even though one is "pre" and one is "post", they are unsequenced because there is no ,, &&, ?, or such.)
In C++11, the expression now does what you would expect. C11 does not change any such rules, though; there the expression remains an error because the result of ++a is not an lvalue.
For anybody who might want the precise details of the differences as they're stated in the standards, C99, §6.5.3.1/2 says:
The value of the operand of the prefix ++ operator is incremented. The result is the new
value of the operand after incrementation.
By contrast, C++11, §5.3.2/1 says:
The result is the updated operand; it is an lvalue, and it is a bit-field if
the operand is a bit-field.
[emphasis added, in both cases]
Also note that although (++a)-- gives undefined behavior (at least in C++03) when a is an int, if a is some user-defined type, so you're using your own overloads of ++ and --, the behavior will be defined -- in such a case, you're getting the equivalent of:
a.operator++().operator--(0);
Since each operator results in a function call (which can't overlap) you actually do have sequence points to force defined behavior (note that I'm not recommending its use, only noting that the behavior is actually defined in this case).
§5.2.7 Increment and decrement:
The value of a postfix ++ expression is the value of its operand. [ ... ] The operand shall be a modifiable lvalue.
The error you get in your C compilation helps to suggest that this is only a feature present in C++.
I think everyone here knows that --i is a left value expression while i-- is a right value expression. But I read the assembly code for the two expressions and found that they are compiled to the same assembly code:
mov eax,dword ptr [i]
sub eax,1
mov dword ptr [i],eax
In the C99 language standard, an lvalue is defined as an expression with an object type or an incomplete type other than void.
So I can be sure that --i returns a value whose type is something other than void, while i-- returns a value which is void or maybe a temp variable.
However, when I write an assignment such as i--=5, the compiler gives me an error indicating that i-- is not an lvalue. I do not know why it is not, or why the return value is a temp variable. How does the compiler make such a judgement? Can anybody give me an explanation at the assembly language level? Thanks!
Left value? Right value?
If you are talking about lvalues and rvalues, then the property of being lvalue or rvalue applies to the result of an expression, meaning that you have to consider the results of --i and i--. And in C language both --i and i-- are rvalues. So, your question is based on incorrect premise in the realm of C language. --i is not an lvalue in C. I don't know what point you are trying to make by referring to the C99 standard, since it clearly states that neither is an lvalue. Also, it is not clear what you mean by i-- returning a void. No, the built-in postfix -- never returns void.
The lvalue vs. rvalue distinction in case of --i and i-- exists in C++ only.
Anyway, if you are looking at mere --i; and i--; expression statements, you are not using the results of these expressions. You are discarding them. The only point to use standalone --i and i-- is their side-effects (decrement of i). But since their side-effects are identical, it is completely expected that the generated code is the same.
If you want to see the difference between --i and i-- expressions, you have to use their results. For example
int a = --i;
int b = i--;
will generate different code for each initialization.
This example has nothing to do with lvalueness or rvalueness of their results though. If you want to observe the difference from that side (which only exists in C++, as I said above), you can try this
int *a = &--i;
int *b = &i--;
The first initialization will compile in C++ (since the result is an lvalue) while the second won't compile (since the result is an rvalue and you cannot apply the built-in unary & to an rvalue).
The rationale behind this specification is rather obvious. Since --i evaluates to the new value of i, it is perfectly possible to make this operator return a reference to i itself as its result (and the C++ language, as opposed to C, prefers to return lvalues whenever possible). Meanwhile, i-- is required to return the old value of i. Since by the time we get to analyze the result of i-- the i itself is likely to hold the new value, we cannot return a reference to i. We have to save (or recreate) the old value of i in some auxiliary temporary location and return it as the result of i--. That temporary value is just a value, not an object. It does not need to reside in memory, which is why it cannot be an lvalue.
[Note: I'm answering this from a C++ perspective.]
Assuming i is a built-in type, if you just write --i; or i--; rather than, say, j = --i; or j = i--;, then it's unsurprising that they get compiled to the same assembly code by the compiler: they're doing the same thing, which is decrementing i. The difference only becomes apparent at the assembly level when you do something with the result; otherwise they effectively have the same semantics.
(Note that if we were thinking about overloaded pre- and post-decrement operators for a user-defined type, the code generated would not be the same.)
When you write something like i-- = 5;, the compiler quite rightly complains, because the semantics of post-decrement are essentially to decrement the thing in question but return the old value of it for further use. The thing returned will be a temporary, hence why i-- yields an r-value.
The terms “lvalue” and “rvalue” originate from the assignment expression E1 = E2, in which the left operand E1 is used to identify the object to be modified, and the right operand E2 identifies the value to be used. (See C 1999 6.3.2.1, note 53.)
Thus, an expression which still has some object associated with it can be used to locate that object and to write to it. This is an lvalue. If an expression is not an lvalue, it might be called an rvalue.
For example, if you have i, the name of some object, it is an lvalue, because we can find where i is, and we can assign to it, as in i = 3.
On the other hand, if we have the expression i+1, then we have taken the value of i and added 1, and we now have a value, but it is not associated with a particular object. This new value is not in i. It is just a temporary value and does not have a particular location. (To be sure, the compiler must put it somewhere, unless optimization removes the expression completely. But it might be in registers and never in memory. Even if it is in memory for some reason, the C language does not provide you with a way to find out where.) So i+1 is not an lvalue, because you cannot use it on the left side of an assignment.
--i and i++ are both expressions that result from taking the value of i and performing some arithmetic. (These expressions also change i, but that is a side effect of the operator, not part of the result it returns.) The “left” and “right” of lvalues and rvalues have nothing to do with whether -- or ++ operator is on the left side or the right side of a name; they have to do with the left side or the right side of an assignment. As other answers explain, in C++, when they are on the left side of an lvalue, they return an lvalue. However, this is coincidental; this definition of the operators in C++ came many years after the creation of the term “lvalue”.