Let's say we have this legacy code from C++98:
bool expensiveCheck();
struct Foo;

bool someFunc()
{
    Foo *ptr = 0;
    if( expensiveCheck() )
        ptr = new Foo;

    // doing something irrelevant here
    ...

    if( ptr ) {
        // using foo
    }
    delete ptr;
    return ptr; // here we have UB (Undefined Behavior) in C++11
}
So basically the pointer here is used to hold dynamically allocated data and to act as a flag at the same time. To me it is readable code, and I believe it is legal C++98 code. Now, according to these questions:
Pointers in c++ after delete
What happens to the pointer itself after delete?
this code has UB in C++11. Is that true?
If yes, another question comes to mind. I heard that the committee puts significant effort into not breaking existing code in new standards. If I am not mistaken, in this case that is not true. What is the reason? Is such code already considered harmful, so nobody cares that it would be broken? Did they not think about the consequences? Is this optimization so important? Something else?
Your example exhibits undefined behavior under C++98, too. From the C++98 standard:
[basic.stc.dynamic.deallocation]/4 If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined.33)
Footnote 33) On some implementations, it causes a system-generated runtime fault.
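If the goal is just to remember whether the allocation happened, the flag can be captured before the delete so that no invalid pointer value is ever read afterwards. A minimal sketch under that assumption (the hadFoo flag and the complete Foo definition are mine, not from the question):

bool expensiveCheck();

struct Foo { /* assumed complete here so that delete is well-defined */ };

bool someFunc()
{
    Foo *ptr = 0;
    if( expensiveCheck() )
        ptr = new Foo;

    // doing something irrelevant here

    const bool hadFoo = (ptr != 0); // capture the flag while ptr is still valid
    if( ptr ) {
        // using foo
    }
    delete ptr;    // deleting a null pointer is a no-op
    return hadFoo; // no invalid pointer value is read after delete
}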
Related
When implementing my own unique_ptr (just for fun), I found that it cannot pass this test file from libstdc++:
struct A;

struct B
{
    std::unique_ptr<A> a;
};

struct A
{
    B* b;
    ~A() { VERIFY(b->a != nullptr); }
};

void test01()
{
    B b;
    b.a.reset(new A);
    b.a->b = &b;
}
gcc passes this test file happily (of course, this file is from libstdc++), while clang fails the VERIFY part.
Question:
Is it implementation dependent or undefined behavior?
I guess this postcondition (b->a != nullptr) is important for gcc, otherwise it would not have a test file for it, but I don't know what is behind it. Is it related to optimization? I know many cases of UB exist to enable better optimizations.
clang (libc++) seems to be non-compliant on this point because the standard says:
[unique.ptr.single.dtor]
~unique_ptr();
Requires: The expression get_deleter()(get()) shall be well-formed, shall have well-defined behavior, and shall not throw exceptions.
[ Note: The use of default_delete requires T to be a complete type. — end note ]
Effects: If get() == nullptr there are no effects.
Otherwise get_deleter()(get()).
So the destructor should be equivalent to get_deleter()(get()), which implies that b->a cannot be nullptr within the destructor of A (which is called inside get_deleter() by the delete expression).
On a side note, both clang (libc++) and gcc (libstdc++) set the pointer to nullptr when destroying a std::unique_ptr, but here is gcc's destructor:
auto& __ptr = _M_t._M_ptr();
if (__ptr != nullptr)
    get_deleter()(__ptr);
__ptr = pointer();
...and here is clang (call to reset()):
pointer __tmp = __ptr_.first();
__ptr_.first() = pointer();
if (__tmp)
    __ptr_.second()(__tmp);
As you can see, gcc first deletes and then assigns nullptr (pointer()), while clang first assigns nullptr (pointer()) and then deletes1.
1 pointer is an alias corresponding to Deleter::pointer, if it exists, or simply T*.
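To see why the ordering matters for that test, here is a hypothetical, stripped-down owner type (not the actual library code) contrasting the two destructor strategies; only the gcc-style order, delete first and clear afterwards, lets code running inside ~T still observe a non-null stored pointer:

// Illustrative sketch only; real unique_ptr implementations differ.
template <class T>
struct toy_owner {
    T* p;

    toy_owner() : p(nullptr) {}
    explicit toy_owner(T* q) : p(q) {}
    T* get() const { return p; }

    ~toy_owner() {
        // gcc-style: run the deleter first, then clear the stored pointer,
        // so anything executed inside ~T still sees get() != nullptr.
        if (p != nullptr)
            delete p;
        p = nullptr;

        // The clang-style alternative would be:
        //   T* tmp = p; p = nullptr; if (tmp) delete tmp;
        // and then get() would already be null while ~T runs.
    }
};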
Both libstdc++ and libc++ are conforming because this is unobservable by a well-defined program. During the destructor's execution, [res.on.objects]/2 prohibits any attempt to observe (or modify, for that matter) the state of the unique_ptr on pain of undefined behavior:
If an object of a standard library type is accessed, and the beginning of the object's lifetime does not happen before the access, or the access does not happen before the end of the object's lifetime, the behavior is undefined unless otherwise specified.
In fact, unique_ptr's destructor is why this paragraph was added in the first place (by LWG2224).
Additionally, after the destruction is complete, the contents of the storage it occupies are indeterminate, by [basic.life]/4:
The properties ascribed to objects and references throughout this document apply for a given object or reference only during its lifetime.
There is no requirement on the final state of the memory occupied by the std::unique_ptr<> after destruction. It wouldn't make sense to set it to null, as the memory is being returned to wherever it was allocated from. GCC probably checks that it's not null to make sure nobody adds unnecessary code to clear it. Under the right circumstances, forcing a clear of the value when it is not needed could cause a performance hit.
This question already has answers here:
Is it good practice to NULL a pointer after deleting it?
I can't understand the end of this code (array = 0;):
#include <iostream>

int main()
{
    std::cout << "Enter a positive integer: ";
    int length;
    std::cin >> length;

    int *array = new int[length];
    std::cout << "I just allocated an array of integers of length " << length << '\n';

    array[0] = 5; // set element 0 to value 5

    delete[] array; // use array delete to deallocate array
    array = 0; // use nullptr instead of 0 in C++11

    return 0;
}
At the end, a dynamically allocated array is deleted (returned to OS) and then assigned a value of 0.
Why is this done? After array has been returned to the OS, there is no need to assign it a value of 0, right?
Code from: http://www.learncpp.com/cpp-tutorial/6-9a-dynamically-allocating-arrays/
After array has been returned to the OS, there is no need to assign it a value of 0, right?
You're right, it is not needed because the memory is freed (deallocated) by operator delete. But think of a case where you might use the pointer in another place in your code (functions, loops, etc.) after you have used delete[] on it.
The array variable still holds the address of the old allocation after the delete[] statement has been executed (a dangling pointer). If you accessed that address, you would get undefined behaviour (UB) because the memory is no longer yours; in most cases your program would crash.
To avoid that you do a null pointer check like:
if (array != nullptr)
{
    /* access array */
    ...
}
which checks the pointer against address 0, which represents an invalid address.
To make that check possible, you set the pointer to nullptr (or NULL if C++11 is not available). The nullptr keyword introduces type safety because it acts like a pointer type, and it should be preferred over the C-like NULL. Pre-C++11, NULL is defined as the integer 0; since C++11 it is an implementation-defined null pointer constant and may also be defined as nullptr.
To define your own nullptr for use with a pre-C++11 compiler, look here: How to define our own nullptr in c++98?
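For reference, the commonly cited pre-C++11 emulation (as in the linked question) looks roughly like this; it is a sketch of the well-known idiom, and it must only be compiled in C++98/03 mode, since nullptr is a keyword from C++11 onwards:

// Sketch of the classic pre-C++11 nullptr emulation (C++98/03 only).
const class {
public:
    template <class T>
    operator T*() const { return 0; }       // convertible to any null object pointer

    template <class C, class T>
    operator T C::*() const { return 0; }   // convertible to any null member pointer

private:
    void operator&() const;                 // taking its address is not allowed
} nullptr = {};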
An interesting fact about delete and delete[] is that both are safe to use on a nullptr. This is stated at point 2 on cppreference.com and in this SO answer.
operator delete, operator delete[]
2)
[...] The behavior of the standard library implementation of this function is undefined unless ptr is a null pointer or is a pointer previously obtained from the standard library implementation of operator new[](size_t) or operator new[](size_t, std::nothrow_t).
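Putting both points together, nulling the pointer right after delete[] makes later checks reliable, and an extra delete[] on the nulled pointer is harmless. A minimal sketch:

#include <iostream>

int main()
{
    int *array = new int[4];
    delete[] array;   // memory is deallocated
    array = nullptr;  // the pointer no longer dangles

    if (array != nullptr)               // reliably reports "nothing to use"
        std::cout << array[0] << '\n';  // never reached

    delete[] array;   // deleting a null pointer is a no-op, so this is safe
    return 0;
}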
We set pointers to NULL (0) to avoid dangling pointers (a pointer that still points to memory which is no longer yours). In the case of local variables it is not that useful if the function doesn't continue after the delete (so it's obvious the pointer won't be reused). In the case of global/member pointers it's good practice to avoid bugs.
Accessing an already-deleted pointer may lead to overwriting/reading random memory (which might be more dangerous than a crash) and causes undefined behavior, while dereferencing a NULL pointer will usually crash immediately.
Since C++11 you should use nullptr because it is defined as a pointer type, while NULL is more of an int type; nullptr improves type safety and resolves ambiguous situations.
As for double-deleting a pointer, it is safe to use delete on a nullptr and nothing happens, but if you delete an already-deleted non-null pointer, it causes undefined behavior and the program will most likely crash.
In C++ you should avoid raw owning pointers, since STL containers (which free their resources themselves, i.e. RAII) or smart pointers cover this usage.
std::vector<int> array{1,2,3,4,5};
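For the array example from the question, a smart pointer gives the same automatic cleanup with no delete[] to forget and no pointer to null out afterwards. A sketch, assuming C++14 for std::make_unique:

#include <iostream>
#include <memory>

int main()
{
    std::cout << "Enter a positive integer: ";
    int length;
    std::cin >> length;

    // Freed automatically when 'array' goes out of scope (RAII).
    auto array = std::make_unique<int[]>(length);
    array[0] = 5;

    return 0;
}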
This is done so that the pointer is set to NULL (although in C++ we prefer nullptr, since NULL and 0 may be different things).
This tactic eliminates the possibility of a dangling pointer: the array may have been deleted, but that doesn't mean the pointer was set to NULL.
If we don't do that, we run the risk of checking whether the pointer is NULL or not (later in our code), seeing that it's not NULL, wrongly believing that the pointer is OK to access, and causing Undefined Behavior.
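To make that failure mode concrete, here is a small sketch of exactly that bug: the later null check still passes because delete[] does not change the pointer's value:

#include <iostream>

int main()
{
    int *array = new int[3];
    array[0] = 5;
    delete[] array;          // memory released, but 'array' still holds the old address

    if (array != nullptr) {              // check passes: the pointer was never reset
        std::cout << array[0] << '\n';   // undefined behavior: reading freed memory
    }

    // Adding 'array = nullptr;' right after delete[] would make the check fail
    // and avoid the undefined behavior.
    return 0;
}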
You assign it a value commonly known as an "invalid address", i.e. NULL, 0, or the nullptr literal, because otherwise there is no way to know whether your pointer points to an invalid address. In other words, when you delete[] your array, your pointer "doesn't know" that it is pointing to a no-longer-usable memory address.
After reading many posts about this, I want to clarify the following point:
A* a = new A();
A* b = a;
delete a;
A* c = a; //illegal - I know it (in c++ 11)
A* d = b; //I suppose it's legal, is it true?
So the question is about using the value of a copy of a deleted pointer.
I've read that in C++11 reading the value of a leads to undefined behaviour, but what about reading the value of b?
Trying to read the value of the pointer (note: this is different to dereferencing it) causes implementation-defined behaviour since C++14, which may include generating a runtime fault. (In C++11 it was undefined behaviour.)
What happens to the pointer itself after delete?
Both:
A* c = a;
A* d = b;
are undefined in C++11 and implementation defined in C++14. This is because a and b are both "invalid pointer values" (as they point to deallocated storage space), and "using an invalid pointer value" is either undefined or implementation defined, depending on the C++ version. ("Using" includes "copying the value of").
The relevant section ([basic.stc.dynamic.deallocation]/4) in C++11 reads (emphasis added):
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined.
with a non-normative note stating:
On some implementations, it causes a system-generated runtime fault.
In C++14 the same section reads:
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
with a non-normative note stating:
Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault
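A practical consequence of the C++11/C++14 wording above: if you need any information about the allocation after it is gone, capture it before the delete and reset every copy of the pointer, instead of reading the now-invalid values. A hedged sketch:

#include <cstdint>

struct A {};

int main()
{
    A* a = new A();
    A* b = a;

    // Capture what you need (here just the numeric address, e.g. for logging)
    // while the pointer values are still valid.
    const std::uintptr_t saved = reinterpret_cast<std::uintptr_t>(a);

    delete a;
    a = nullptr; // prevent later code from reading an invalid pointer value
    b = nullptr; // every copy must be reset; nulling 'a' does not affect 'b'

    (void)saved; // use the saved value for diagnostics instead of the dead pointers
    return 0;
}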
These two lines do not differ in any way as far as legality in C++ is concerned:
A* c = a; //illegal - I know it (in c++ 11)
A* d = b; //I suppose it's legal, is it true?
Your mistake (and it is a pretty common one) is to think that calling delete on a makes it any different from b. Remember that when you call delete on a pointer you pass its value, so the memory that a points to is no longer usable after the delete, but that call does not make a any different from b in your example.
You should not use the pointer after delete. My example below, which accesses a, relies on implementation-defined behaviour.
(Thanks to M.M and Mankarse for pointing this out.)
I feel that it is not the variable a (or b, c, d) that is important here, but the value (i.e. the memory address of a deallocated block), which on some implementations can trigger a runtime fault when used in some 'pointer context'.
This value may be an rvalue/expression, not necessarily the value stored in a variable, so I do not believe the value of a ever changes (I am using the loose term 'pointer context' to distinguish this from using the same value, i.e. the same set of bits, in non-pointer-related expressions, which will not cause a runtime fault).
------------My original post is below.---------------
Well, you are almost there with your experiment. Just add some cout's like here:
class A {};
A* a = new A();
A* b = a;
std::cout << a << std::endl; // <--- added here
delete a;
std::cout << a << std::endl; // <--- added here. Note 'a' can still be used!
A* c = a;
A* d = b;
Calling delete a does not in itself do anything to the variable a. The delete-expression invokes the destructor and then a deallocation function; the library that manages dynamic memory allocation keeps a list of allocated memory blocks and uses the value passed in from the variable a to mark one of the previously allocated blocks as freed.
While what Mankarse cites from the C++ standard is true, about "rendering invalid all pointers referring to any part of the deallocated storage", note that the value of the variable a itself remains untouched (you did not pass it by reference, but by value!).
So to sum up and to try to answer your question:
The variable a still exists in scope after the delete. It still contains the same value, which is the address of the beginning of the memory block that was allocated (and has now been deallocated) for an object of class A. This value can technically still be used, e.g. you can print it as in my example above; however, it is hard to find a more reasonable use for it than printing/logging the past...
What you should not do is try to dereference this value (which you also keep in the variables b, c, and d), as it is no longer a valid memory pointer.
You should never rely on the object still being in the deallocated storage (while it is quite probable that it will remain there for a while, since C++ does not require the freed storage to be cleared): you have no guarantees and no safe way to check this.
In the following code, why is the address held by pointer x changing after the delete? As I understand it, the delete call should free up the allocated memory from the heap, but it shouldn't change the pointer's address.
#include <iostream>
#include <cstdlib>

using namespace std;

int main()
{
    int* x = new int;
    *x = 2;
    cout << x << endl << *x << endl;
    delete x;
    cout << x << endl;
    system("Pause");
    return 0;
}
OUTPUT:
01103ED8
2
00008123
Observations: I'm using Visual Studio 2013 and Windows 8. Reportedly this doesn't behave the same with other compilers. Also, I understand this is bad practice and that I should just reassign the pointer to NULL after its deletion; I'm simply trying to understand what is driving this weird behaviour.
As I understand it, the delete call should free up the allocated memory from the heap, but it shouldn't change the pointer's address.
Well, why not? It's perfectly legal output -- reading a pointer after having deleted it leads to undefined behavior. And that includes the pointer's value changing. (In fact, that doesn't even need UB; a deleted pointer can really point anywhere.)
Having read relevant bits of both C++98 and C++11 [N3485], and all the stuff H2CO3 pointed to:
Neither edition of the standard adequately describes what an "invalid pointer" is, under what circumstances they are created, or what their semantics are. Therefore, it is unclear to me whether or not the OP's code was intended to provoke undefined behavior, but de facto it does (since anything that the standard does not clearly define is, tautologically, undefined). The text is improved in C++11 but is still inadequate.
As a matter of language design, the following program certainly does exhibit unspecified behavior as marked, which is fine. It may, but should not also exhibit undefined behavior as marked; in other words, to the extent that this program exhibits undefined behavior, that is IMNSHO a defect in the standard. Concretely, copying the value of an "invalid" pointer, and performing equality comparisons on such pointers, should not be UB. I specifically reject the argument to the contrary from hypothetical hardware that traps on merely loading a pointer to unmapped memory into a register. (Note: I cannot find text in C++11 corresponding to C11 6.5.2.3 footnote 95, regarding the legitimacy of writing one union member and reading another; this program assumes that the result of this operation is unspecified but not undefined (except insofar as it might involve a trap representation), as it is in C.)
#include <string.h>
#include <stdio.h>

union ptr {
    int *val;
    unsigned char repr[sizeof(int *)];
};

int main(void)
{
    ptr a, b, c, d, e;
    a.val = new int(0);
    b.val = a.val;
    memcpy(c.repr, a.repr, sizeof(int *));
    delete a.val;
    d.val = a.val; // copy may, but should not, provoke UB
    memcpy(e.repr, a.repr, sizeof(int *));

    // accesses to b.val and d.val may, but should not, provoke UB
    // result of comparison is unspecified (may, but should not, be undefined)
    printf("b %c= d\n", b.val == d.val ? '=' : '!');

    // result of comparison is unspecified
    printf("c %c= e\n", memcmp(c.repr, e.repr, sizeof(int *)) ? '!' : '=');
}
This is all of the relevant text from C++98:
[3.7.3.2p4] If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined. [footnote: On some implementations, it causes a system-generated runtime fault.]
The problem is that there is no definition of "using an invalid pointer value", so we get to argue about what qualifies. There is a clue to the committee's intent in the discussion of iterators (a category which is defined to include bare pointers):
[24.1p5] ... Iterators can also have singular values that are not associated with any container. [Example: After the declaration of an uninitialized pointer x (as with int* x; [sic]), x must always be assumed to have a singular value of a pointer.] Results of most expressions are undefined for singular values; the only exception is an assignment of a non-singular value to an iterator that holds a singular value. In this case the singular value is overwritten the same way as any other value. Dereferenceable and past-the-end values are always non-singular.
It seems at least plausible to assume that an "invalid pointer" is also meant to be an example of a "singular iterator", but there is no text to back this up; going in the opposite direction, there is no text confirming the (equally plausible) assumption that an uninitialized pointer value is meant to be an "invalid pointer" as well as a "singular iterator". So the hair-splitters among us might not accept "results of most expressions are undefined" as clarifying what qualifies as use of an invalid pointer.
C++11 has changed the text corresponding to 3.7.3.2p4 somewhat:
[3.7.4.2p4] ... Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior. [footnote: Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.]
(The text elided by the ellipsis is unchanged.) We now have somewhat more clarity as to what is meant by "use of an invalid pointer value", and we can now say that the OP's code's semantics are definitely implementation-defined (but might be implementation-defined to be undefined). There is also a new paragraph in the discussion of iterators:
[24.2.1p10] An invalid iterator is an iterator that may be singular.
which confirms that "invalid pointer" and "singular iterator" are effectively the same thing. The remaining confusion in C++11 is largely about the exact circumstances that produce invalid/singular pointers/iterators; there should be a detailed chart of pointer/iterator lifecycle transitions (like there is for *values). And, as with C++98, the standard is defective to the extent it does not guarantee that copying-from and equality comparison upon such values are valid (not undefined).
I ran into my first compiler that changes the lvalue passed to ::delete, but doesn't zero out the lvalue. That is, the following holds:
Foo * p = new Foo();
Foo * q = p;
assert(p != 0);
assert(p == q);
::delete p;
assert(p != q);
assert(p != 0);
Note that p is not zero after the delete operation, and it has changed from its old value. A coworker told me that this is not unusual in his experience, having worked with some mainframe C++ compilers that would change p to 0xFFFFFFFF, as well as other compilers that would change p to 0.
Where in the C++ Standard does it say that a compiler is allowed to do this?
Searching through StackOverflow, I found this question: Why doesn’t delete set the pointer to NULL? which had an answer that referred to Bjarne Stroustrup's response that includes the statement:
C++ explicitly allows an implementation of delete to zero out an lvalue operand, and I had hoped that implementations would do that, but that idea doesn't seem to have become popular with implementers.
I've read and re-read sections 5.3.5 and 12.5 of the final committee draft of the C++0x standard, but I'm not seeing the "explicit" part. Am I just looking in the wrong sections of the standard? Or is there a chain of logic in those sections that I'm just not connecting properly?
I don't have my copy of the Annotated C++ Reference Manual anymore. Was it in the ARM that a compiler could do this?
[Edit: Correcting the section reference from 3.5.3 to 5.3.5. I'm also adding an interesting paradox as a counterpoint to Henk's assertion that p is undefined after delete.]
There is an interesting paradox if p is initialized to null.
Foo * p = 0;
Foo * q = p;
assert(p == 0);
assert(p == q);
::delete p;
assert(p == q);
assert(p == 0);
In this case though, the behavior is well documented. When delete gets a null pointer, it is supposed to do nothing, so p remains unchanged.
It might not be so explicit. In 5.3.5/7 it says that the delete-expression will call a deallocation function. Then in 3.7.3.2/4 it says that using a pointer that has been deallocated is undefined. Since the value of the pointer cannot be used after the deallocation, whether the pointer keeps its value or the value is changed by the implementation makes no difference.
5.3.5/7
The delete-expression will call a deallocation function (3.7.3.2).
3.7.3.2/4
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined.
The references are from the current standard. In the upcoming standard 5.3.5/7 has been reworded:
C++0x FD 5.3.5/7
If the value of the operand of the delete-expression is not a null pointer value, the delete-expression will call a deallocation function (3.7.4.2). Otherwise, it is unspecified whether the deallocation function will be called. [ Note: The deallocation function is called regardless of whether the destructor for the object or some element of the array throws an exception. — end note ]
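Since implementations generally do not zero the operand, callers who want that behavior can wrap it themselves. The following is a sketch of a common pattern; the name safe_delete is mine, not a standard facility, and note that it only resets the one lvalue passed in, not any other copies of the pointer:

// Delete and null in one step by taking the pointer lvalue by reference.
template <class T>
void safe_delete(T*& p)
{
    delete p;    // deleting a null pointer is a no-op, so repeated calls are safe
    p = nullptr; // this particular lvalue is reliably zeroed afterwards
}

template <class T>
void safe_delete_array(T*& p)
{
    delete[] p;
    p = nullptr;
}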