std::default_delete can be specialized to allow std::unique_ptrs to painlessly manage types which have to be destroyed by calling some custom destroy-function instead of using delete p;.
There are basically two ways to make sure an object is managed by a std::shared_ptr in C++:
Create it managed by a shared pointer from the start, using std::make_shared or std::allocate_shared. This is the preferred way, as it coalesces the two memory blocks needed (payload and reference counts) into one. Note, though, that if only std::weak_ptrs are left, the continued need for the reference counts by necessity pins down the memory for the payload as well.
Assign management to a shared-pointer afterwards, using a constructor or .reset().
The second case is interesting when no custom deleter is provided:
Specifically, the std::shared_ptr is defined to use its own deleter of unspecified type, which calls delete[] p; or delete p; depending on whether the std::shared_ptr is instantiated for an array type or not.
Quote from n4659 (~C++17):
template<class Y> explicit shared_ptr(Y* p);
4 Requires: Y shall be a complete type. The expression delete[] p, when T is an array type, or delete p, when T is not an array type, shall have well-defined behavior, and shall not throw exceptions.
5 Effects: When T is not an array type, constructs a shared_ptr object that owns the pointer p. Otherwise, constructs a shared_ptr that owns p and a deleter of an unspecified type that calls delete[] p. When T is not an array type, enables shared_from_this with p. If an exception is thrown, delete p is called when T is not an array type, delete[] p otherwise.
6 Postconditions: use_count() == 1 && get() == p.
[…]
template<class Y> void reset(Y* p);
3 Effects: Equivalent to shared_ptr(p).swap(*this).
My questions are:
Is there a, preferably good, reason that it is not specified to use std::default_delete instead?
Would any valid (and potentially useful?) code be broken by that change?
Is there already a proposal to do so?
Is there a, preferably good, reason that it is not specified to use std::default_delete instead?
Because it wouldn't do what you want. See, just because you can specialize something doesn't mean you can hijack it. The standard says ([namespace.std]):
A program may add a template specialization for any standard library template to namespace std only if the declaration depends on a user-defined type and the specialization meets the standard library requirements for the original template and is not explicitly prohibited.
The standard library requirement for std::default_delete<T>::operator()(T* ptr)'s behavior is that it "Calls delete on ptr." So your specialization of it must do the same.
As such, there should be no difference between having shared_ptr perform delete ptr; and having shared_ptr invoke default_delete<T>{}(ptr).
This is why unique_ptr takes a deleter type, rather than relying on you to specialize it.
From the comments:
The specialization deletes the object, in the only proper way.
But that's not what the requirement says. It says "Calls delete on ptr." It does not say something more ambiguous like "ends the lifetime of the object pointed to by ptr" or "destroys the object referenced by ptr". It gives explicit code that must happen.
And your specialization has to follow through.
If you remain unconvinced, the paper P0722R1 says this:
Note that the standard requires specializations of default_delete<T> to have the same effect as calling delete p;,
So clearly, the authors agree that specializing default_delete is not a mechanism for adding your own behavior.
So the premise of your question is invalid.
However, let's pretend for a moment that your question were valid, that such a specialization would work. Valid or not, specializing default_delete to customize deleter behavior is not the intended method of doing so. If it were the intent, you wouldn't need a deleter object for unique_ptr at all. At most, you would just need a parameter that tells you what the pointer type is, which would default to T*.
So that's a good reason not to do this.
Related
When you have class template argument deduction available from C++17, why can't you deduce the template arguments of std::unique_ptr? For example, this gives me an error:
std::unique_ptr smp(new D);
That says "Argument list of class template is missing".
Shouldn't the template arguments (at least the pointer type) be deducible?
See this:
any declaration that specifies initialization of a variable and variable template
Let's look at new int and new int[10]. Both of those return an int*. There is no way to tell whether you should get unique_ptr<int> or unique_ptr<int[]>. That alone is enough reason not to provide any sort of deduction guide.
I'm not going to repeat the rationale in @NathanOliver's great answer; I'm just going to mention the how of it, the mechanics, which is what I think you are also after. You are right that if the constructor of unique_ptr looked merely like...
explicit unique_ptr( T* ) noexcept;
... it'd be possible to deduce T. The compiler generated deduction guide would work just fine. And that would be a problem, like Nathan illustrates. But the constructor is specified like this...
explicit unique_ptr( pointer p ) noexcept;
... where the alias pointer is specified as follows:
pointer : std::remove_reference<Deleter>::type::pointer if that
type exists, otherwise T*. Must satisfy NullablePointer.
That specification essentially means that pointer must be an alias to __some_meta_function<T>::type. Everything to the left of ::type is a non-deduced context, which is what prevents the deduction of T from pointer. That's how this sort of deduction guide can be made to fail even if pointer were always T*: simply making it a non-deduced context prevents the viability of any deduction guide produced from that constructor.
So this is a side effect from those olden times at the beginning of C++, when the standard makers decided to have two different delete and delete[] operators for pointers to objects and pointers to arrays of objects.
In these modern times of C++, where we have templates (they weren't there from the beginning), std::array (for fixed-size arrays), initializer lists, and std::vector (for dynamically sized arrays), almost nobody needs the delete[] operator anymore. I have never used it, and I wouldn't be surprised if the vast majority of readers of this question haven't used it either.
Removing int* array = new int[5]; in favour of auto* array = new std::array<int, 5>; would simplify things and would enable safe conversion of pointers to std::unique_ptr and std::shared_ptr. But it would break old code, and so far, the C++ standard maintainers have been very keen on backwards compatibility.
Nobody stops you, though, from writing a small inlined templated wrapper function:
template<typename T>
std::unique_ptr<T> unique_obj_ptr(T* object) {
    static_assert(!std::is_pointer<T>::value, "Cannot use pointers to pointers here");
    return std::unique_ptr<T>(object);
}
Of course, you can also create a similar function shared_obj_ptr() to create std::shared_ptrs, and if you really need them, you can also add unique_arr_ptr() and shared_arr_ptr().
I recently followed a Reddit discussion which led to a nice comparison of std::visit code generation across compilers. I noticed the following: https://godbolt.org/z/D2Q5ED
Both GCC9 and Clang9 (I guess they share the same standard library) do not generate code for checking the valueless state and throwing std::bad_variant_access when all types meet certain conditions. This leads to much better codegen, so I raised an issue with the MSVC STL and was presented with this code:
template <class T>
struct valueless_hack {
struct tag {};
operator T() const { throw tag{}; }
};
template<class First, class... Rest>
void make_valueless(std::variant<First, Rest...>& v) {
try { v.emplace<0>(valueless_hack<First>()); }
catch(typename valueless_hack<First>::tag const&) {}
}
The claim was that this makes any variant valueless, and reading the documentation it should:
First, destroys the currently contained value (if any). Then
direct-initializes the contained value as if constructing a value of
type T_I with the arguments std::forward<Args>(args).... If an
exception is thrown, *this may become valueless_by_exception.
What I don't understand: Why is it stated as "may"? Is it legal to stay in the old state if the whole operation throws? Because this is what GCC does:
// For suitably-small, trivially copyable types we can create temporaries
// on the stack and then memcpy them into place.
template<typename _Tp>
struct _Never_valueless_alt
: __and_<bool_constant<sizeof(_Tp) <= 256>, is_trivially_copyable<_Tp>>
{ };
And later it (conditionally) does something like:
T tmp = forward(args...);
reset();
construct(tmp);
// Or
variant tmp(in_place_index<I>, forward(args...));
*this = move(tmp);
Hence basically it creates a temporary, and if that succeeds copies/moves it into the real place.
IMO this is a violation of "First, destroys the currently contained value" as stated by the documentation. As I read the standard, after a v.emplace(...) the current value in the variant is always destroyed, and the new type is either the requested type or valueless.
I do get that the is_trivially_copyable condition excludes all types that have an observable destructor. So this can also be thought of as: "as if the variant were reinitialized with the old value". But the state of the variant is an observable effect. So does the standard indeed allow emplace to leave the current value unchanged?
Edit in response to a standard quote:
Then initializes the contained value as if direct-non-list-initializing a value of type TI with the arguments std::forward<Args>(args)....
Does T tmp {std::forward<Args>(args)...}; this->value = std::move(tmp); really count as a valid implementation of the above? Is this what is meant by "as if"?
I think the important part of the standard is this:
From https://timsong-cpp.github.io/cppwp/n4659/variant.mod#12
23.7.3.4 Modifiers
(...)
template <size_t I, class... Args>
variant_alternative_t<I, variant<Types...>>& emplace(Args&&... args);
(...) If an exception is thrown during the initialization of the contained value, the variant might not hold a value
It says "might" not "must". I would expect this to be intentional in order to allow implementations like the one used by gcc.
As you mentioned yourself, this is only possible if the destructors of all alternatives are trivial and thus unobservable because destroying the previous value is required.
Followup question:
Then initializes the contained value as if direct-non-list-initializing a value of type TI with the arguments std::forward<Args>(args)....
Does T tmp {std::forward<Args>(args)...}; this->value = std::move(tmp); really count as a valid implementation of the above? Is this what is meant by "as if"?
Yes, because for types that are trivially copyable there is no way to detect the difference, so the implementation behaves as if the value was initialized as described. This would not work if the type was not trivially copyable.
So does the standard indeed allow, that emplace does not change the
current value?
Yes. emplace shall provide the basic guarantee of no leaking (i.e., respecting object lifetime when construction and destruction produce observable side effects), but when possible, it is allowed to provide the strong guarantee (i.e., the original state is kept when an operation fails).
variant is required to behave similarly to a union: the alternatives are allocated in one region of suitably aligned storage. It is not allowed to allocate dynamic memory. Therefore, a type-changing emplace has no way to keep the original object without calling an additional move constructor; it has to destroy it and construct the new object in its place. If this construction fails, the variant has to go to the exceptional valueless state. This prevents weird things like destroying a nonexistent object.
However, for small trivially copyable types, it is possible to provide the strong guarantee without too much overhead (even a performance boost for avoiding a check, in this case). Therefore, the implementation does it. This is standard-conforming: the implementation still provides the basic guarantee as required by the standard, just in a more user-friendly way.
Edit in response to a standard quote:
Then initializes the contained value as if
direct-non-list-initializing a value of type TI with the arguments
std::forward<Args>(args)....
Does T tmp {std::forward<Args>(args)...}; this->value =
std::move(tmp); really count as a valid implementation of the above?
Is this what is meant by "as if"?
Yes, if the move assignment produces no observable effect, which is the case for trivially copyable types.
It appears that in C++20, we're getting some additional utility functions for smart pointers, including:
template<class T> unique_ptr<T> make_unique_for_overwrite();
template<class T> unique_ptr<T> make_unique_for_overwrite(size_t n);
and the same for std::make_shared with std::shared_ptr. Why aren't the existing functions:
template<class T, class... Args> unique_ptr<T> make_unique(Args&&... args); // with empty Args
template<class T> unique_ptr<T> make_unique(size_t n);
enough? Don't the existing ones use the default constructor for the object?
Note: In earlier proposals of these functions, the name was make_unique_default_init().
These new functions are different:
Original make_XYZ: Always initializes the pointed-to value ("explicit initialization", see § class.expl.init in the standard).
New make_XYZ_for_overwrite: Performs "default initialization" of the pointed-to value (see § dcl.init, paragraph 7 in the standard); on typical machines, this means effectively no initialization for non-class, non-array types. (Yes, the term is a bit confusing; please read the paragraph at the link.)
This is a feature of plain vanilla pointers which was not available with the smart pointer utility functions: With regular pointers you can just allocate without actually initializing the pointed-to value:
new int
For unique/shared pointers you could only achieve this by wrapping an existing pointer, as in:
std::unique_ptr<int[]>(new int[n])
now we have a wrapper function for that.
Note: See the relevant ISO C++ WG21 proposal as well as this SO answer
allocate_shared, make_shared, and make_unique all initialize the underlying object by performing something equivalent to new T(args...). In the zero-argument case, that reduces to new T(), which is to say it performs value initialization. Value initialization in many cases (including scalar types like int and char, arrays of them, and aggregates of them) performs zero initialization, which is to say that actual work is done to zero out a bunch of data.
Maybe you want that and that is important to your application, maybe you don't. From P1020R1, the paper that introduced the functions originally named make_unique_default_init, make_shared_default_init, and allocate_shared_default_init (these were renamed from meow_default_init to meow_for_overwrite during the national ballot commenting process for C++20):
It is not uncommon for arrays of built-in types such as unsigned char or double to be immediately initialized by the user in their entirety after allocation. In these cases, the value initialization performed by allocate_shared, make_shared, and make_unique is redundant and hurts performance, and a way to choose default initialization is needed.
That is, if you were writing code like:
auto buffer = std::make_unique<char[]>(100);
read_data_into(buffer.get());
The value initialization performed by make_unique, which would zero out those 100 bytes, is completely unnecessary since you're immediately overwriting it anyway.
The new meow_for_overwrite functions instead perform default initialization since the memory used will be immediately overwritten anyway (hence the name) - which is to say the equivalent of doing new T (without any parentheses or braces). Default initialization in those cases I mentioned earlier (like int and char, arrays of them, and aggregates of them) performs no initialization, which saves time.
For class types that have a user-provided default constructor, there is no difference between value initialization and default initialization: both would just invoke the default constructor. But for many other types, there can be a large difference.
Does the standard define precisely what I can do with an object once it has been moved from? I used to think that all you can do with a moved-from object is destroy it, but that would not be sufficient.
For example, take the function template swap as defined in the standard library:
template <typename T>
void swap(T& a, T& b)
{
T c = std::move(a); // line 1
a = std::move(b); // line 2: assignment to moved-from object!
b = std::move(c); // line 3: assignment to moved-from object!
}
Obviously, it must be possible to assign to moved-from objects, otherwise lines 2 and 3 would fail. So what else can I do with moved-from objects? Where exactly can I find these details in the standard?
(By the way, why is it T c = std::move(a); instead of T c(std::move(a)); in line 1?)
17.6.5.15 [lib.types.movedfrom]
Objects of types defined in the C++ standard library may be moved from
(12.8). Move operations may be explicitly specified or implicitly
generated. Unless otherwise specified, such moved-from objects shall
be placed in a valid but unspecified state.
When an object is in an unspecified state, you can perform any operation on the object which has no preconditions. If there is an operation with preconditions you wish to perform, you can not directly perform that operation because you do not know if the unspecified-state of the object satisfies the preconditions.
Examples of operations that generally do not have preconditions:
destruction
assignment
const observers such as get, empty, size
Examples of operations that generally do have preconditions:
dereference
pop_back
This answer now appears in video format here: http://www.youtube.com/watch?v=vLinb2fgkHk&t=47m10s
Moved-from objects exist in a valid but unspecified state. That suggests that whilst the object might not be capable of doing much anymore, all of its member functions should still exhibit defined behaviour, including operator=, and all its members should be in a defined state; it also still requires destruction. The Standard gives no specific definitions because it would be unique to each UDT, but you might be able to find specifications for Standard types. Some, like containers, are relatively obvious: they just move their contents around, and an empty container is a well-defined valid state. Primitives don't modify the moved-from object.
Side note: I believe it's T c = std::move(a) so that if the move constructor (or copy constructor if no move is provided) is explicit, the function will fail to compile.
In the C++ standard draft (N3485), it states the following:
20.7.1.2.4 unique_ptr observers [unique.ptr.single.observers]
typename add_lvalue_reference<T>::type operator*() const;
1 Requires: get() != nullptr.
2 Returns: *get().
pointer operator->() const noexcept;
3 Requires: get() != nullptr.
4 Returns: get().
5 Note: use typically requires that T be a complete type.
You can see that operator* (dereference) is not specified as noexcept, probably because it can cause a segfault, but then operator-> on the same object is specified as noexcept. The requirements for both are the same, however there is a difference in exception specification.
I have noticed they have different return types, one returns a pointer and the other a reference. Is that saying that operator-> doesn't actually dereference anything?
The fact of the matter is that using operator-> on a pointer of any kind that is NULL will segfault in practice (it is UB). Why, then, is one of these specified as noexcept and the other not?
I'm sure I've overlooked something.
EDIT:
Looking at std::shared_ptr we have this:
20.7.2.2.5 shared_ptr observers [util.smartptr.shared.obs]
T& operator*() const noexcept;
T* operator->() const noexcept;
It's not the same? Does that have anything to do with the different ownership semantics?
A segfault is outside of C++'s exception system. If you dereference a null pointer, you don't get any kind of exception thrown (well, at least if you comply with the Requires: clause; see below for details).
For operator->, it's typically implemented as simply return m_ptr; (or return get(); for unique_ptr). As you can see, the operator itself can't throw - it just returns the pointer. No dereferencing, no nothing. The language has some special rules for p->identifier:
§13.5.6 [over.ref] p1
An expression x->m is interpreted as (x.operator->())->m for a class object x of type T if T::operator->() exists and if the operator is selected as the best match function by the overload resolution mechanism (13.3).
The above applies recursively and in the end must yield a pointer, for which the built-in operator-> is used. This allows users of smart pointers and iterators to simply do smart->fun() without worrying about anything.
A note on the Requires: parts of the specification: these denote preconditions. If you don't meet them, you're invoking UB.
Why then, is one of these specified as noexcept and the other not?
To be honest, I'm not sure. It would seem that dereferencing a pointer should always be noexcept, however, unique_ptr allows you to completely change what the internal pointer type is (through the deleter). Now, as the user, you can define entirely different semantics for operator* on your pointer type. Maybe it computes things on the fly? All that fun stuff, which may throw.
Looking at std::shared_ptr we have this:
This is easy to explain - shared_ptr doesn't support the above-mentioned customization to the pointer type, which means the built-in semantics always apply - and *p where p is T* simply doesn't throw.
For what it's worth, here's a little of the history, and how things got the way they are now.
Before N3025, operator * wasn't specified with noexcept, but its description did contain a Throws: nothing. This requirement was removed in N3025:
Change [unique.ptr.single.observers] as indicated (834) [For details see the Remarks section]:
typename add_lvalue_reference<T>::type operator*() const;
1 - Requires: get() != nullptr.
2 - Returns: *get().
3 - Throws: nothing. [this element was struck by the paper]
Here's the content of the "Remarks" section noted above:
During reviews of this paper it became controversial how to properly specify the operational semantics of operator*, operator[], and the heterogenous comparison functions. [structure.specifications]/3 doesn't clearly say whether a Returns element (in the absence of the new Equivalent to formula) specifies effects. Further-on it's unclear whether this would allow for such a return expression to exit via an exception, if additionally a Throws:-Nothing element is provided (would the implementor be required to catch those?). To resolve this conflict, any existing Throws element was removed for these operations, which is at least consistent with [unique.ptr.special] and other parts of the standard. The result of this is that we give now implicit support for potentially throwing comparison functions, but not for homogeneous == and !=, which might be a bit surprising.
The same paper also contains a recommendation for editing the definition of operator ->, but it reads as follows:
pointer operator->() const;
4 - Requires: get() != nullptr.
5 - Returns: get().
6 - Throws: nothing.
7 - Note: use typically requires that T be a complete type.
As far as the question itself goes: it comes down to a basic difference between the operator itself, and the expression in which the operator is used.
When you use operator*, the operator dereferences the pointer, which can throw.
When you use operator->, the operator itself just returns a pointer (which isn't allowed to throw). That pointer is then dereferenced in the expression that contained the ->. Any exception from dereferencing the pointer happens in the surrounding expression rather than in the operator itself.
Frankly, this just looks like a defect to me. Conceptually, a->b should always be equivalent to (*a).b, and this applies even if a is a smart pointer. But if *a isn't noexcept, then (*a).b isn't, and therefore a->b shouldn't be.
Regarding:
Is that saying that operator-> doesn't actually dereference anything?
No, the standard evaluation of -> for a type overloading operator-> is:
a->b; // (a.operator->())->b
I.e., the evaluation is defined recursively: when the source code contains a ->, operator-> is applied, yielding another expression with a -> that can itself invoke an operator->...
Regarding the overall question: if the pointer is null, the behavior is undefined, and the lack of noexcept allows an implementation to throw. If the signature were noexcept, the implementation could not throw (a throw would result in a call to std::terminate).