Are nullptr references undefined behaviour in C++? [duplicate] - c++

The following code experiments with a null pointer and a reference bound through it:
#include <cstdio>
void printRefAddr(int &ref) {
    // %p expects a void*, so cast explicitly
    printf("printAddr %p\n", static_cast<void *>(&ref));
}
int main() {
    int *ip = nullptr;
    int &ir = *ip;
    // 1. get address of nullptr reference
    printf("ip=%p &ir=%p\n", static_cast<void *>(ip), static_cast<void *>(&ir));
    // 2. dereference a nullptr pointer and pass it as reference
    printRefAddr(*ip);
    // 3. pass nullptr reference
    printRefAddr(ir);
    return 0;
}
Question: According to the C++ standard, are the commented statements 1..3 valid code or undefined behavior?
Is the answer the same across different versions of C++ (older ones would of course use the literal 0 instead of the nullptr keyword)?
Bonus question: are there known compilers or optimization options that would actually cause the above code to do something unexpected or crash? For example, is there a flag for any compiler that generates an implicit null check everywhere a reference is initialized, including when a reference argument is passed from *ptr?
An example output for the curious, nothing unexpected:
ip=(nil) &ir=(nil)
printAddr (nil)
printAddr (nil)

// 2. dereference a nullptr pointer and pass it as reference
Dereferencing a null pointer is Undefined Behaviour, and so whether you pass it as a reference or by value, the fact is that you've dereferenced it and therefore invoked UB, meaning from that point on all bets are off.
You've already invoked UB here:
int &ir = *ip; //ip is null, you cannot deref it without invoking UB.

Since ir is just an alias for *ip, it does not cause undefined behavior on its own.
The undefined behavior comes from dereferencing a null pointer, that is, from evaluating *ip. Therefore
int &ir = *ip;
          ^^^
causes UB.
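To make the point above concrete, here is a minimal sketch of a well-defined alternative: test the pointer first and form *ip only on the non-null path. The helper names addressOf and checkedAddress are hypothetical and are not taken from the question.

```cpp
#include <cassert>

// Hypothetical helper: taking a reference documents that the caller
// must supply a real object.
const void *addressOf(const int &ref) {
    return static_cast<const void *>(&ref);
}

// Hypothetical helper: checks the pointer before dereferencing it.
// Returns nullptr for a null argument without ever forming *ip.
const void *checkedAddress(const int *ip) {
    if (ip == nullptr) {
        return nullptr;
    }
    const int &ir = *ip;  // well-defined: ip points to an object here
    assert(&ir == ip);    // a reference is an alias for the pointee
    return addressOf(ir);
}
```

On the bonus question: as far as I know, GCC and Clang both accept -fsanitize=null (part of UndefinedBehaviorSanitizer), which instruments null pointer dereferences and null reference binding and reports them at run time.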

Related

Storing address of local var via passing a double ptr to function does not cause a seg fault but returning address of that var causes seg fault. Why? [duplicate]

I understand the concept of dangling pointers: func2 returns the address of deallocated memory.
In func1, instead of returning the address, we store it through a double pointer. Isn't this also a case of a dangling pointer?
#include <iostream>
void func1(int **x){
    int z = 10;
    *x = &z;
}
int *func2(){
    int z=11;
    return &z;
}
int main() {
    int *a;
    func1(&a);
    std::cout << *a << std::endl;
    int *b = func2();
    std::cout << *b << std::endl;
    return 0;
}
OUTPUT:
10
seg fault
In both cases, you are dereferencing a dangling pointer, which invokes undefined behavior.
This means that you cannot rely on any specific behavior. You may get a segmentation fault or your program may work as intended, or something else might happen. The behavior may be different on different compilers. Even on the same compiler the behavior may be different, depending on the compiler settings, such as the optimization level. Also, simply updating your compiler to a new version may cause the behavior to change.
Asking why you are not getting a segmentation fault when you are invoking undefined behavior is not a meaningful question. It is like driving your car through an intersection when you have a red light and then asking the question why you did not collide with another car. When you break the rules, you cannot rely on anything specific to happen.
You may want to read this similar question:
Can a local variable's memory be accessed outside its scope?
Yes, both a and b are dangling pointers when you dereference them and either dereference therefore makes the program invalid.
But dereferencing a dangling pointer causes undefined behavior. That means that there is no guarantee for any particular behavior whatsoever. There is no guarantee for a segfault. There is no guarantee for a particular output. There is no guarantee that it won't look as if accessing the dangling pointer to the out-of-lifetime variable "worked".
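For comparison, here is a rough sketch of three ways to hand a value back to the caller without leaving a dangling pointer behind. The function names makeValue, fillValue and makeOwned are made up for illustration, and std::make_unique assumes C++14 or later.

```cpp
#include <cassert>
#include <memory>

// Returns the value itself, so nothing refers to the dead local.
int makeValue() {
    int z = 11;
    return z;
}

// Writes into storage owned by the caller, which outlives the call.
void fillValue(int *out) {
    assert(out != nullptr);  // caller must pass valid storage
    *out = 10;
}

// Heap object whose lifetime is tied to the returned owner.
std::unique_ptr<int> makeOwned() {
    return std::make_unique<int>(12);
}
```

In each case the object being read is still alive when the caller looks at it, which is exactly what func1 and func2 fail to guarantee.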

Is it UB on dereferencing pointer or not? [duplicate]

If I don't actually access the dereferenced "object", is dereferencing the null pointer still undefined?
int* p = 0;
int& r = *p; // undefined?
int* q = &*p; // undefined?
A slightly more practical example: can I dereference the null pointer to distinguish between overloads?
void foo(Bar&);
void foo(Baz&);
foo(*(Bar*)0); // undefined?
Okay, the reference examples are definitely undefined behavior according to the standard:
a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the "object" obtained by dereferencing a null pointer, which causes undefined behavior.
Unfortunately, the emphasized part (the closing "which causes undefined behavior") is ambiguous. Is it the binding part that causes undefined behavior, or is the dereferencing part sufficient?
I think the second part of "What every C programmer should know about Undefined Behavior" might help illustrate this issue.
Taking the example of the blog:
void contains_null_check(int *P) {
    int dead = *P;
    if (P == 0)
        return;
    *P = 4;
}
Might be optimized to (RNCE: Redundant Null Check Elimination):
void contains_null_check_after_RNCE(int *P) {
    int dead = *P;
    if (false) // P was dereferenced by this point, so it can't be null
        return;
    *P = 4;
}
Which is in turn optimized into (DCE: Dead Code Elimination):
void contains_null_check_after_RNCE_and_DCE(int *P) {
    //int dead = *P; -- dead store
    //if (false) -- unreachable branch
    //    return;
    *P = 4;
}
As you can see, even though dead is never used, the simple int dead = *P initialization has caused Undefined Behavior to creep into the program.
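A sketch of how the function could be written so that the optimizer has no licence to drop the check: put the null test before any dereference. The names write_if_not_null and write_unchecked are hypothetical and do not come from the blog post.

```cpp
#include <cassert>

// Same intent as contains_null_check, but the test comes before any
// dereference, so it cannot be treated as redundant.
// Returns true when the write happened.
bool write_if_not_null(int *P) {
    if (P == nullptr) {
        return false;
    }
    *P = 4;
    return true;
}

// Variant for callers that promise a non-null pointer: the precondition
// is stated explicitly instead of being implied by a stray dereference.
void write_unchecked(int *P) {
    assert(P != nullptr);
    *P = 4;
}
```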
To distinguish between overloads, I'd suggest using a pointer (which might be null) rather than artificially creating a null reference and exposing yourself to Undefined Behavior.
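Something along these lines is what I have in mind; it is only a sketch, with Bar and Baz as empty placeholder types, and the Which enum and selectOverloads added purely for demonstration. The overloads take pointers, so a typed null pointer can pick one without ever being dereferenced.

```cpp
#include <cassert>

struct Bar {};
struct Baz {};

// Only used here to show which overload was chosen.
enum class Which { BarOverload, BazOverload };

// Pointer parameters: a null argument is legal as long as the
// function does not dereference it.
Which foo(Bar *) { return Which::BarOverload; }
Which foo(Baz *) { return Which::BazOverload; }

// Example use: select each overload with a typed null pointer.
void selectOverloads() {
    assert(foo(static_cast<Bar *>(nullptr)) == Which::BarOverload);
    assert(foo(static_cast<Baz *>(nullptr)) == Which::BazOverload);
}
```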
int& r = *p; // undefined?
I think you have undefined behavior right here, even if you never actually use r (or *p), the dereferenced object. After this step (i.e. dereferencing the null pointer), the program's behaviour is no longer guaranteed by the language; the program may crash immediately, which is one of the possibilities of UB. You seem to think that only reading the value of r for some real purpose invokes UB. I don't think so.
Also, the language specification clearly says that "the effect of dereferencing the null pointer" is undefined behavior. It does not say "the effect of actually using the dereferenced object from a null pointer". Undefined behavior doesn't mean that you will necessarily and immediately get problems, or that the program must crash right after dereferencing the null pointer. It simply means that the program's behavior is not defined after the null pointer is dereferenced. That is, the program may run normally, as expected, from start to end. Or it may crash immediately, or after some time: after a few minutes, hours or days. Anything can happen at any time after dereferencing the null pointer.
Yes, it is undefined behavior, because the spec says that an "lvalue designates an object or function" (at clause 3.10), and for the *-operator it says "the result [of dereferencing] is an lvalue referring to the object or function to which the expression points" (at clause 5.3.1).
That means there is no description for what happens when you dereference a null pointer. It's simply undefined behavior.

How to use const cast with pointers in C++? [duplicate]

What is happening here?
const int a = 0;
const int *pa = &a;
int *p = const_cast<int*>(pa);
*p = 1; // undefined behavior ??
cout << a << *p; // ??
My compiler outputs 0 and 1, but the address of 'a' and the value of 'p' are the same, so I'm confused about how this is possible.
Quote from cppreference:
Even though const_cast may remove constness or volatility from any pointer or reference, using the resulting pointer or reference to write to an object that was declared const or to access an object that was declared volatile invokes undefined behavior.
So yes, modifying constant variables is undefined behavior. The output you see is caused by the fact that you tell the compiler that the value of a will never change, so it can just put a literal 0 instead of the variable a in the cout line.
§7.1.6.1 [dcl.type.cv]/p4:
Except that any class member declared mutable (7.1.1) can be modified,
any attempt to modify a const object during its lifetime (3.8) results
in undefined behavior.
Attempting to write to a const object is undefined behavior. Among other things, this rule lets the compiler place const objects in read-only memory or substitute their values directly into expressions at compile time, which is what happens in your case.
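By contrast, const_cast is fine when the object itself was not declared const and only the pointer used to reach it is const-qualified. A minimal sketch (the function name modifyThroughConstPointer is made up for this example):

```cpp
#include <cassert>

// Well-defined: the object itself is not const, only the access path is.
int modifyThroughConstPointer() {
    int a = 0;                       // not declared const
    const int *pa = &a;
    int *p = const_cast<int *>(pa);  // removes const from the pointer type only
    *p = 1;                          // fine, a is a modifiable object
    assert(a == *p);                 // both names refer to the same int
    return a;
}
```

Here a and *p always agree, because the compiler has not been promised that a never changes.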
