dereferencing the null pointer - c++

int* p = 0;
int* q = &*p;
Is this undefined behavior or not? I browsed some related questions, but this specific aspect didn't show up.

The answer to this question is: it depends which language standard you are following :-).
In C90 and C++, this is not valid because you perform indirection on the null pointer (by doing *p), and doing so results in undefined behavior.
However, in C99, this is valid, well-formed, and well-defined. In C99, if the operand of the unary-& was obtained as the result of applying the unary-* or by performing subscripting ([]), then neither the & nor the * or [] is applied. For example:
int* p = 0;
int* q = &*p; // In C99, this is equivalent to int* q = p;
Likewise,
int* p = 0;
int* q = &p[0]; // In C99, this is equivalent to int* q = p + 0;
From C99 §6.5.3.2/3:
If the operand [of the unary & operator] is the result of a unary * operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue.
Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator.
(and its footnote, #84):
Thus, &*E is equivalent to E (even if E is a null pointer)

Yes that would be undefined behavior, but your compiler might optimize the &* out.
The reason it is undefined is that you are attempting to access memory outside your addressable space.

Yes, dereferencing the null pointer is undefined behavior. Integer constant 0 in a pointer context is the null pointer. That's it.
Now, if your second line was int *q = p; that would be a simple pointer assignment. If the compiler removes the &* and reduces the dereference to an assignment, you're OK.

IMHO, As far as the two code lines are concerned, there isn't any access outside the address space. The second statement is simply taking the address of (*p) which would be 'p' again and hence it will store '0'. But the location is never accessed.

Related

Is member access on a null pointer defined in C++?

Is address computation on a null pointer defined behavior in C++? Here's a simple example program.
struct A { int x; };
int main() {
A* p = nullptr;
&(p->x); // is this undefined behavior?
return 0;
}
Thanks.
EDIT Subscripting is covered in this other question.
&(p->x); // is this undefined behavior?
Standard is a bit vague regarding this:
[expr.ref] ... The expression E1->E2 is converted to the equivalent form (*(E1)).E2;
[expr.unary.op] The unary * operator ... the result is an lvalue referring to the object ... to which the expression points.
There is no explicit mention of UB in the section. The quoted rule does appear to conflict with the fact that the null pointer doesn't point to any object. This could be interpreted that yes, behaviour is undefined.
[expr.unary.op] The result of the unary & operator is a pointer to its operand. ... if the operand is an lvalue of type T, the resulting expression is a prvalue of type “pointer to T” whose result is a pointer to the designated object ([intro.memory]).
Again, no designated object exists. Note that at no point is the operand lvalue converted to an rvalue, which would definitely have been UB.
Back in 2000 a CWG issue was opened to clarify whether indirection through null is undefined. The proposed resolution (2004), which would clarify that indirection through null is not UB, appears not to have been added to the standard so far.
However whether it is or isn't UB doesn't matter much since you don't need to do this. At the very least, the resulting pointer will be invalid and thus useless.
If you were planning to convert the pointer to an integer to get the offset of the member, there is no need to do this because you can instead use the offsetof macro from the standard library, which doesn't have UB.
&(p[1]); // undefined?
Here, behaviour is quite clearly undefined:
[expr.sub] ... The expression E1[E2] is identical (by definition) to *((E1)+(E2)), except that in the case of an array operand, the result is an lvalue if that operand is an lvalue and an xvalue otherwise.
[expr.add] When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
If P evaluates to a null pointer value and J evaluates to 0 (does not apply)
Otherwise, if P points to an array element (does not apply)
Otherwise, the behavior is undefined.
&(p[0]); // undefined?
As per previous rules, the first option applies:
If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
And now we are back to the question of whether indirection through this null is UB. See the beginning of the answer.
Still, it doesn't really matter. There is no need to write this, since it is simply an unnecessarily complicated way to write sizeof(int) * i (with i being 1 and 0 respectively).

Is creating a pointer one past the end of a non-array pointer not derived from unary operator & undefined behavior in C++17?

The C++17 standard seems to say that an integer can only be added to a pointer if the pointer is to an array element, or, as a special exception, the pointer is the result of unary operator &:
8.5.6 [expr.add] describing addition to a pointer:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined.
This quote includes a non-normative footnote:
An object that is not an array element is considered to belong to a single-element array for this purpose; see 8.5.2.1
which references 8.5.2.1 [expr.unary.op] discussing the unary & operator:
The result of the unary & operator is a pointer to its operand... For purposes of pointer arithmetic (8.5.6) and comparison (8.5.9, 8.5.10), an object that is not an array element whose address is taken in this way is considered to belong to an array with one element of type T.
The non-normative footnote seems to be slightly misleading, as the section it references describes behavior specific to the result of unary operator &. Nothing appears to permit other pointers (e.g. from non-array new) to be considered single-element arrays.
This seems to suggest:
void f(int a) {
int* z = (new int) + 1; // undefined behavior
int* w = &a + 1; // ok
}
Is this an oversight in the changes made for C++17? Am I missing something? Is there a reason that the "single-element array rule" is only provided specifically for unary operator &?
Note: As specified in the title, this question is specific to C++17. The C standard and prior versions of the C++ standard contained clear normative language that is no longer present. Older, vague questions like this are not relevant.
Yes, this appears to be a bug in the C++17 standard.
int* z = (new int)+1; // undefined behavior.
int* a = new int;
int* b = a+1; // undefined behavior, same reason as `z`
&*a; // seeming noop, but magically makes `*a` into an array of one element!
int* c = a+1; // defined behavior!
This is pretty ridiculous.
8.5.2.1 [expr.unary.op]
[...] an object that is not an array element whose address is taken in this way is considered to belong to an array with one element of type T
Once "blessed" by 8.5.2.1, the object is an array of one element. If you don't bless it by invoking & at least once, it has never been blessed by 8.5.2.1 and is not an array of one element.
It was fixed as a defect in C++20.

What is the difference between *&aPtr and &*aPtr?

I want to know what the difference is between *&aPtr and &*aPtr, i.e. what changes when the order of * and & is swapped?
int a;
int *aPtr;
a = 7;
aPtr = &a;
cout << &*aPtr << *&aPtr << endl;
They have the same value, but *&aPtr is an lvalue that refers to aPtr whereas &*aPtr is a prvalue that has the same value as aPtr.
If the types are primitives (integers, characters, booleans etc.), then they will yield the same value.
A difference may occur if the operators & and * are overloaded for a specific class. In that case, whether there is a difference depends entirely on the implementation.
One other thing: a corner case can occur if T* t actually points to null:
int* i = nullptr;
*&i; //ok, first takes the address of i, then dereference it, yielding a null pointer again
&*i; //wrong, dereference a null pointer, yielding undefined behavior
These unary operators & and * group right to left.
So in this expression
&*aPtr
operator * is applied first and yields an lvalue designating a. Then operator & is applied and yields an rvalue that is a pointer to a.
Its value is the same as the initial value of aPtr. However you may not write for example
&*aPtr = &a;
while you may write
aPtr = &a;
In this expression
*&aPtr
operator & is applied first and yields the address of the variable aPtr itself. Then operator * is applied and yields aPtr again.
The difference between this expression and the above expression is that you may write
*&aPtr = &a;
because expression *&aPtr yields lvalue of aPtr.

Semantics of unary & on numeric literal

What is the unary-& doing here?
int * a = 1990;
int result = &5[a];
If you were to print result you would get the value 2010.
You have to compile it with -fpermissive or it will stop due to errors.
In C, x [y] and y [x] are identical. So &5[a] is the same as &a[5].
&5[a] is the same as &a[5] and the same as a + 5. In your case it's undefined behavior because a points to nowhere.
C11 standard chapter 6.5.6 Additive operators/8 (the same in C++):
If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
"...unary & on numeric literal"?
Postfix operators in C always have higher priority than prefix ones. In case of &5[a], the [] has higher priority than the &. Which means that in &5[a] the unary & is not applied to "numeric literal" as you seem to incorrectly believe. It is applied to the entire 5[a] subexpression. I.e. &5[a] is equivalent to &(5[a]).
As for what 5[a] means - this is a beaten-to-death FAQ. Look it up.
And no, you don't have "to compile it with -fpermissive" (my compiler tells me it doesn't even know what -fpermissive is). You have to figure out that this
int * a = 1990;
is not legal code in either C or C++. If anything, it requires an explicit cast
int * a = (int *) 1990;
not some obscure switch of some specific compiler you happened to be using at the moment. The same applies to another illegal initialization in int result = &5[a].
Finally, even if we overlook the illegal code and the undefined behavior triggered by that 5[a], the behavior of this code will still be highly implementation-dependent. I.e. the answer is no, in the general case you will not get 2010 in result.
You cannot apply the unary & operator to an integer literal, because a literal is not an lvalue.
Due to operator precedence, your code doesn't do that. Since the indexing operator [] binds more tightly than unary &, &5[a] is equivalent to &(5[a]).
Here's a program similar to yours, except that it's valid code, not requiring -fpermissive to compile:
#include <stdio.h>
int main(void) {
int arr[6];
int *ptr1 = arr;
int *ptr2 = &5[ptr1];
printf("%p %p\n", (void *)ptr1, (void *)ptr2); /* %p expects void * */
}
As explained in this question and my answer, the indexing operator is commutative (because it's defined in terms of addition, and addition is commutative), so 5[a] is equivalent to a[5]. So the expression &5[ptr1] computes the address of element 5 of arr.
In your program:
int * a = 1990;
int result = &5[a];
the initialization of a is invalid because a is of type int* and 1990 is of type int, and there is no implicit conversion from int to int*. Likewise, the initialization of result is invalid because &5[a] is of type int*. Apparently -fpermissive causes the compiler to violate the rules of the language and permit these invalid implicit conversions.
At least in the version of gcc I'm using, the -fpermissive option is valid only for C++ and Objective-C, not for C. In C, gcc permits such implicit conversions (with a warning) anyway. I strongly recommend not using this option. (Your question is tagged both C and C++. Keep in mind that C and C++ are two distinct, though closely related, languages. They happen to behave similarly in this case, but it's usually best to pick one language or the other.)

What happens when a casted pointer has an increment operator?

For example:
int x[100];
void *p;
x[0] = 0x12345678;
x[1] = 0xfacecafe;
x[3] = 0xdeadbeef;
p = x;
((int *) p) ++ ;
printf("The value = 0x%08x", *(int*)p);
Compiling the above generates an lvalue required error on the line with the ++ operator.
The cast creates a temporary pointer of type int *. You can't increment a temporary as it doesn't denote a place to store the result.
In C and C++ standardese, (int *)p is an rvalue, which roughly means an expression that can only occur on the right-hand side of an assignment.
p on the other hand is an lvalue, which means it can validly appear on the left-hand side of an assignment. Only lvalues can be incremented.
The expression ((int *) p) treats the pointer stored inside the variable p as a pointer to int. If you want to treat the variable itself as a pointer-to-int variable (and then increment it), use a reference cast:
((int *&) p) ++ ;
Thanks to larsmans for pointing to the right direction.
I took the liberty of digging deeper into this. So for future reference, according to sections 6.5.2.4 and 6.5.4 of the C99 standard (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf):
6.5.2.4 Postfix increment and decrement operators
Constraints
The operand of the postfix increment
or decrement operator shall have
qualified or unqualified real or
pointer type and shall be a modifiable
lvalue....
6.5.4 Cast operators
..
..
[Footnote] 89) A cast
does not yield an lvalue. Thus, a cast
to a qualified type has the same
effect as a cast to the unqualified
version of the type.
Note: This only applies to C. C++ may handle casts differently.
You can get the intended result with
p = (int*)p + 1;
Using the increment operator on a dereferenced pointer to p, which is an lvalue, also works:
(*(int**)&p)++;
However, the latter is not portable, since (void*)p might not have the same representation as (int*)p.
The rvalue expression ((int *) p) creates a temporary of type int*, to which operator ++ cannot be applied.
++ requires an lvalue as its operand.
As @FredOverflow mentions, lvalues and rvalues have very little to do with assignment.
Arrays are lvalues, yet they cannot be assigned to because they are non-modifiable.
std::string("Prasoon") is an rvalue expression, yet it can occur on the left side of the assignment operator because we are allowed to call member functions (operator= in this case) on temporaries.