Why do I need to move `std::unique_ptr` - c++

Given the following code:
#include <iostream>
#include <memory>
struct A {};
struct B : public A {};
std::pair<bool, std::unique_ptr<B>> GetBoolAndB() {
return { true, std::make_unique<B>() };
}
std::unique_ptr<A> GetA1() {
auto[a, b] = GetBoolAndB();
return b;
}
std::unique_ptr<A> GetA2() {
auto [a, b] = GetBoolAndB();
return std::move(b);
}
GetA1 does not compile, with this error:
C2440: 'return': cannot convert from 'std::unique_ptr<B,std::default_delete<_Ty>>' to 'std::unique_ptr<A,std::default_delete<_Ty>>'
while GetA2 does compile without errors.
I don't understand why I need to call std::move to make the function work.
Edit
Just to clarify, as pointed out in comments by DanielLangr, my doubt was about the fact that
std::unique_ptr<A> GetA3() {
std::unique_ptr<B> b2;
return b2;
}
compiles and transfers ownership without the need for std::move.
Now I understand that in GetA1 and GetA2, with structured bindings, b names a part of a larger object rather than a local variable, so it must be passed through std::move to be treated as an rvalue.

I don't understand why I need to call std::move to make the function work.
Because the corresponding constructor of std::unique_ptr has a parameter of rvalue reference type:
template< class U, class E >
unique_ptr( unique_ptr<U, E>&& u ) noexcept;
See documentation for details: https://en.cppreference.com/w/cpp/memory/unique_ptr/unique_ptr
Since rvalue references cannot bind to lvalues, you cannot use b (which is an lvalue) as an argument of this constructor.
If you wonder why b is treated as lvalue in the return statement, see, for example: Why Structured Bindings disable both RVO and move on return statement? In short, b is not a variable with automatic storage duration, but a reference to a pair element instead.
The error message basically just says that the compiler could not find any viable converting constructor, therefore, it "cannot convert...".
By wrapping b in a std::move call, you create an expression that refers to the very same object as b but whose value category is rvalue, so it can bind to that constructor parameter.
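The difference between a plain local variable and a structured binding can be sketched as follows. This is a minimal illustration assuming C++17; the names MakePair, FromLocal and FromBinding are made up for this example, and A is given a virtual destructor (not present in the question) so that destroying a B through a unique_ptr<A> is well defined.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct A { virtual ~A() = default; };  // virtual destructor added for safe deletion via A*
struct B : A {};

// Hypothetical helper mirroring GetBoolAndB from the question.
std::pair<bool, std::unique_ptr<B>> MakePair() {
    return {true, std::make_unique<B>()};
}

// b is a local variable with automatic storage duration, so `return b;`
// first treats it as an rvalue and the converting move constructor is used.
std::unique_ptr<A> FromLocal() {
    std::unique_ptr<B> b = std::make_unique<B>();
    return b;
}

// Here b only names the second member of a hidden pair object, so it is
// not eligible for the implicit move and std::move has to be spelled out.
std::unique_ptr<A> FromBinding() {
    auto [ok, b] = MakePair();
    assert(ok && b != nullptr);
    return std::move(b);
}
```

Both functions hand back a non-null pointer; only the second one needs the explicit cast.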

Because there should only be one valid unique_ptr at any one time.
That is why it is called unique_ptr.
unique_ptr is un-copyable, you must move it.
Otherwise you would end up with a copy of the pointer which would defeat the point of it being unique!
See: Rules for Smart Pointers
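The "no copies, only moves" rule can be checked directly. The following is a small sketch, assuming C++17 for the `_v` type traits; the function name source_is_empty_after_move is invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// unique_ptr can be moved but not copied.
static_assert(!std::is_copy_constructible_v<std::unique_ptr<int>>);
static_assert(std::is_move_constructible_v<std::unique_ptr<int>>);

// After a move, the source no longer owns anything: exactly one owner remains.
bool source_is_empty_after_move() {
    auto p = std::make_unique<int>(7);
    std::unique_ptr<int> q = std::move(p);
    assert(q != nullptr);
    return p == nullptr && *q == 7;
}
```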

Related

What can be done to prevent misleading assignment to returned value?

After many years of using C++ I realized a quirk in the syntax when using custom classes.
Despite being the correct language behavior it allows to create very misleading interfaces.
Example here:
class complex_arg {
double r_ = 1.0; // initialized so that value() does not read indeterminate members
double phi_ = 0.0;
public:
std::complex<double> value() const {return r_*exp(phi_*std::complex<double>{0, 1});}
};
int main() {
complex_arg ca;
ca.value() = std::complex<double>(1000., 0.); // accepted by the compiler !?
assert( ca.value() != std::complex<double>(1000., 0.) ); // what !?
}
https://godbolt.org/z/Y5Pcjsc8d
What can be done to the class definition to prevent this behavior?
(Or at least warn the user of the class that the 3rd line is not really doing any assignment.)
I see only one way out, but it requires modifying the class and it doesn't scale well (to large classes that can be moved).
const std::complex<double> value() const;
I also tried [[nodiscard]] value() but it didn't help.
As a last resort, maybe something can be done to the returned type std::complex<double> ? (that is, assuming one is in control of that class)
Note that I understand that sometimes one might need to do an (optimized) assignment to a newly obtained value and pass it to yet another function, f( ca.value() = bla ). I am not questioning this usage per se (although it is quite confusing as well); my problem is mostly with ca.value() = bla; as a standalone statement that doesn't do what it looks like it does.
Ordinarily we can call a member function on an object regardless of whether that object's value category is an lvalue or rvalue.
What can be done to the class definition to prevent this behavior?
Prior to modern C++ there was no way to prevent this usage. But since C++11 we can ref-qualify a member function to do what you ask, as shown below.
From member functions:
During overload resolution, non-static member function with a cv-qualifier sequence of class X is treated as follows:
no ref-qualifier: the implicit object parameter has type lvalue reference to cv-qualified X and is additionally allowed to bind rvalue implied object argument
lvalue ref-qualifier: the implicit object parameter has type lvalue reference to cv-qualified X
rvalue ref-qualifier: the implicit object parameter has type rvalue reference to cv-qualified X
This allows us to do what you ask for a custom managed class. In particular, we can & qualify the copy assignment operator.
struct C
{
C(int)
{
std::cout<<"converting ctor called"<<std::endl;
}
C(){
std::cout<<"default ctor called"<<std::endl;
}
C(const C&)
{
std::cout<<"copy ctor called"<<std::endl;
}
//-------------------------v------>added this ref-qualifier
C& operator=(const C&) &
{
std::cout<<"copy assignment operator called"<<std::endl;
return *this;
}
};
C func()
{
C temp;
return temp;
}
int main()
{
//---------v---------> won't work because assignment operator is & qualified
func() = 4;
}
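The effect of the & qualifier can also be verified at compile time without writing a line that fails to build. The sketch below assumes C++17; the types Plain and LvalueOnly and the function assign_through_lvalue are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <type_traits>

struct Plain { int v = 0; };  // implicit assignment operator, no ref-qualifier

struct LvalueOnly {
    int v = 0;
    LvalueOnly() = default;
    LvalueOnly(const LvalueOnly&) = default;
    // The trailing & restricts assignment to lvalue targets.
    LvalueOnly& operator=(const LvalueOnly&) & = default;
};

// Plain{} = Plain{} compiles, which is the misleading case.
static_assert(std::is_assignable_v<Plain, Plain>);
// Assigning to an rvalue LvalueOnly is rejected...
static_assert(!std::is_assignable_v<LvalueOnly, LvalueOnly>);
// ...while assigning to an lvalue still works.
static_assert(std::is_assignable_v<LvalueOnly&, LvalueOnly>);

int assign_through_lvalue() {
    LvalueOnly target;
    LvalueOnly source;
    source.v = 5;
    target = source;  // fine: target is an lvalue
    assert(target.v == 5);
    return target.v;
}
```

Note that this only helps for classes you control; the return type std::complex<double> from the question cannot be changed this way.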

return std::move a class with a unique_ptr member

Why can't I return a class containing a std::unique_ptr, using std::move semantics (I thought), as in the example below? I thought that the return would invoke the move ctor of class A, which would std::move the std::unique_ptr. (I'm using gcc 11.2, C++20)
Example:
#include <memory>
class A {
public:
explicit A(std::unique_ptr<int> m): m_(std::move(m)) {}
private:
std::unique_ptr<int> m_;
};
A makit(int num) {
auto m = std::make_unique<int>(num);
return std::move(A(m)); // error: use of deleted function 'std::unique_ptr<_Tp, _Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = int; _Dp = std::default_delete<int>]'
}
int main() {
auto a = makit(42);
return 0;
}
I believe the solution is to return a std::unique_ptr, but before I give in I wonder why the move approach doesn't work.
I thought that the return would invoke the move ctor of class A, which would std::move the std::unique_ptr.
All true, but the move constructor only moves the members of A. It cannot move unrelated satellite unique pointers for you.
In the expression A(m), you use m as an lvalue. That will try to copy m in order to initialize the parameter m of A::A (btw, terrible naming scheme to reason about all of this). If you move that, i.e. A(std::move(m)), the expression becomes well-formed.
And on that subject, the outer std::move in std::move(A(...)) is redundant. A(...) is already an rvalue of type A. The extra std::move does nothing good here.
In short, you should not return A using std::move since the compiler will do the optimization work for you (through a trick called RVO: What are copy elision and return value optimization?).
Another point: the std::move in m_(std::move(m)) is required, because the named parameter m is an lvalue inside the constructor and a unique_ptr cannot be copied. Remember that std::move is only a cast; it does not by itself guarantee a move, and for a const object or a type without a usable move constructor a copy is still made.
In conclusion, prefer return A( std::move(m) ); over return std::move(A(m));. In your return statement, A(m) is already a prvalue and you don't need to cast it to an xvalue using std::move in order to return it efficiently. Just return it by value and the compiler will do the rest for you.
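Putting the advice from the answers together, a corrected version of the question's code might look like this. It is a sketch only; the accessor value() is not in the original class and is added here purely so the result can be inspected.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class A {
public:
    // m is taken by value; inside the constructor it is an lvalue,
    // so it must be moved into the member.
    explicit A(std::unique_ptr<int> m) : m_(std::move(m)) {}
    int value() const { return *m_; }  // hypothetical accessor for illustration
private:
    std::unique_ptr<int> m_;
};

A makit(int num) {
    auto m = std::make_unique<int>(num);
    assert(m != nullptr);
    // Move the local pointer into the constructor argument; no outer
    // std::move is needed because A(...) is already a prvalue.
    return A(std::move(m));
}
```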

Why does passing a brace-initialized temporary by address require explicit casting to the same type in MSVS

I was trying to make my code less bloated when dealing with windows API by replacing two-liners not unlike
TEMP t{0,1,2}; // let's say it's struct TEMP {int a; int b; int c}
SomeVeryVerboseFunctionName(&t);
with one-liners
SomeVeryVerboseFunctionName(&TEMP{0,1,2});
but stumbled upon error:
expression must be an lvalue or function designator.
After many attempts I finally came up with code that does compile (MSVS 2013u4):
SomeVeryVerboseFunctionName(&(TEMP) TEMP{0,1,2});//explicit cast to the same type!
To better understand why the cast is needed I set up a simple test project:
#include <stdio.h>
struct A
{
int a;
int b;
A(int _a, int _b) : a(_a), b(_b) {};
};
struct B
{
int a;
int b;
};
template <typename T> void fn(T* in)
{
printf("a = %i, b = %i\n", in->a, in->b);
}
int main()
{
fn(&A{ 1, 2 }); //OK, no extra magick
/* fn(&B {3, 4}); //error: expression must be an lvalue or function designator */
fn(&(B)B{ 3, 4 }); //OK with explicit cast to B (but why?)
}
and found out that if some struct T has a user-declared constructor (like A has in the above code), then it's possible to take the address of a brace-initialized temporary of type T and pass it to a function that takes a pointer T*, but if it doesn't have one (like B), then the said error arises and can only be overcome by explicitly casting to type T.
So the question is: why does B require such strange casting and A doesn't ?
Update
Now that it's clear that treating rvalue as lvalue is an extension/feature/bug in MSVS, does anyone care to pretend it's actually a feature (enough used for MS to maintain it since 2010) and elaborate on why temporaries of A and B need to be passed in different ways in order to satisfy the compiler? It must have something to do with A's constructor and B's lack thereof...
What you are doing is actually illegal in C++.
Clang 3.5 complains:
23 : error: taking the address of a temporary object of type 'A' [-Waddress-of-temporary]
fn(&A {1, 2}); //OK, no extra magick
^~~~~~~~~
25 : error: taking the address of a temporary object of type 'B' [-Waddress-of-temporary]
fn(&(B) B {3, 4}); //OK with explicit cast to B (but why?)
^~~~~~~~~~~~~
All operands of & must be lvalues, not temporaries. The fact that MSVC accepts these constructs is a bug. According to the link pointed out by Shafik above, it seems that MSVC erroneously creates lvalues for these.
template<class T>
T& as_lvalue(T&& t){return t;}
// optional, blocks you being able to call as_lvalue on an lvalue:
template<class T>
void as_lvalue(T&)=delete;
will solve your problem using legal C++.
SomeVeryVerboseFunctionName(&as_lvalue(TEMP{0,1,2}));
In a sense, as_lvalue is an inverse-move. You could call it unmove, but that would get confusing.
Taking the address of an rvalue is illegal in C++. The above turns the rvalue into an lvalue, at which point taking the address becomes legal.
The reason why taking an address of an rvalue is illegal is that such data is meant to be discarded. The pointer will only remain valid until the end of the current line (barring the rvalue being created via a cast of an lvalue). Such pointers only have corner-case usefulness. However, in the case of windows APIs, many such APIs take pointers to data structures for C-style versioning purposes.
To that end, these might be safer:
template<class T>
T const& as_lvalue(T&& t){return t;}
template<class T>
T& as_mutable_lvalue(T&& t){return t;}
// optional, blocks you being able to call as_lvalue on an lvalue:
template<class T>
void as_lvalue(T&)=delete;
template<class T>
void as_mutable_lvalue(T&)=delete;
because the more likely to be correct one returns a const reference to data (why are you modifying a temporary?), and the longer one (hence less likely to be used) returns the non-const version.
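A complete usage of the const-returning as_lvalue might look like the following sketch. Temp and sum_fields are hypothetical stand-ins for the Windows structure and API function; the temporary only lives until the end of the full expression containing the call, so the callee must not keep the pointer.

```cpp
#include <cassert>

struct Temp { int a; int b; int c; };  // stand-in for the TEMP struct

// Stand-in for an API that takes a pointer to a caller-filled structure.
int sum_fields(const Temp* t) { return t->a + t->b + t->c; }

template <class T>
T const& as_lvalue(T&& t) { return t; }  // a named parameter is an lvalue

// Optional: reject calls where the argument already is an lvalue.
template <class T>
void as_lvalue(T&) = delete;

int call_with_temporary() {
    // The temporary Temp outlives the call to sum_fields, then is destroyed.
    int result = sum_fields(&as_lvalue(Temp{0, 1, 2}));
    assert(result == 3);
    return result;
}
```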
MSVC has an old "bug"/"feature" where it treats many things as lvalues when it should not, including the result of casts. Use /Za to disable that extension. This can cause otherwise working code to fail to compile. It may even cause working code to fail to work and still compile: I have not proved the contrary.

std::initializer_list and reference types

Can a std::initializer_list contain reference types (both rvalue and lvalue)? Or does one have to use pointers or a reference wrapper (such as std::ref)?
EDIT:
Perhaps more clarification is due:
I have a member variable, ::std::vector<std::function<void()> >, into which I would like to forward a lambda object. This would usually be accomplished with emplace_back, but I wanted to do it in the constructor's initialization list. Alas, as I read, this would make forwarding impossible.
Can a std::initializer_list contain reference types (both rvalue and lvalue)?
std::initializer_list<T> doesn't hold references to its elements. It uses copy-semantics by holding its values as const objects:
18.9 Initializer List [support.initlist]
An object of type initializer_list<E> provides access to an array of objects of type const E.
An initializer_list of references will cause a compilation error because pointers are used internally for iterators:
#include <initializer_list>
int main()
{
int x;
std::initializer_list<int&> l = {x};
// In instantiation of 'class std::initializer_list<int&>':
// error: forming pointer to reference type 'int&'
// typedef const _E* iterator;
}
An initializer_list also doesn't support move-semantics as const objects cannot be moved from. Holding your objects in a std::reference_wrapper<T> is the most viable solution if you wish to maintain reference-semantics.
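The copy-only behaviour can be observed by counting constructor calls. This is an illustrative sketch; Tracker, the global counters and copies_from_braced_list are names invented here, not part of any library.

```cpp
#include <cassert>
#include <vector>

int g_copies = 0;  // hypothetical counters, only for illustration
int g_moves = 0;

struct Tracker {
    Tracker() = default;
    Tracker(const Tracker&) { ++g_copies; }
    Tracker(Tracker&&) noexcept { ++g_moves; }
};

// The vector is built from a std::initializer_list<Tracker>, whose elements
// are const, so each element has to be copied into the vector.
int copies_from_braced_list() {
    g_copies = 0;
    g_moves = 0;
    std::vector<Tracker> v{Tracker{}, Tracker{}};
    assert(v.size() == 2);
    return g_copies;
}
```

Even though both arguments are temporaries, the function reports two copies, one per element.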
From http://www.cplusplus.com/reference/initializer_list/initializer_list/
initializer_list objects are automatically constructed as if an array
of elements of type T was allocated
thus they can't be used with something like std::initializer_list<int&>. The reason is the same for which the following gives a compiler error
int& arr[20];
error: declaration of ‘arr’ as array of references
and that is dictated by the C++ standard: https://stackoverflow.com/a/1164306/1938163
You do not need list initialization here
As others mentioned, you cannot use std::initializer_list with references. You can use std::initializer_list<std::reference_wrapper<...>>, but it will prevent you from passing rvalues as arguments to the constructor, because std::reference_wrapper can only bind to lvalues. In other words, the following will not compile:
YourContainerOfFunctions C{ [](){} };
This makes usage of std::initializer_list in your case neither efficient nor convenient.
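The lvalue-only restriction of std::reference_wrapper can be shown with a small sketch, assuming C++17. The functions bump_all and bump_two_locals are hypothetical and use int instead of function objects to keep the example short.

```cpp
#include <cassert>
#include <functional>
#include <initializer_list>
#include <type_traits>

// A reference_wrapper<int> can be built from an lvalue, but not from an rvalue.
static_assert(std::is_constructible_v<std::reference_wrapper<int>, int&>);
static_assert(!std::is_constructible_v<std::reference_wrapper<int>, int>);

// Increments every referenced int and returns how many were visited.
int bump_all(std::initializer_list<std::reference_wrapper<int>> refs) {
    int count = 0;
    for (int& r : refs) {  // reference_wrapper converts implicitly to int&
        ++r;
        ++count;
    }
    return count;
}

bool bump_two_locals() {
    int x = 1, y = 2;
    int n = bump_all({std::ref(x), std::ref(y)});
    assert(n == 2);
    return x == 2 && y == 3;  // the originals were modified, not copies
}
```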
Use variadic templates instead!
I believe that is what you wanted to achieve:
class Foo {
std::vector<std::function<void()>> Functions;
public:
template <class... FuncTs>
Foo(FuncTs &&...Funcs) : Functions({std::forward<FuncTs>(Funcs)...}) {}
};
void foo(){};
int main() {
auto boo = []() {};
std::function<void()> moo = []() {};
Foo F{
foo, boo, // passed by reference, then copied
[]() {}, // moved, then copied
std::move(moo) // moved, then also moved
};
}
This requires at most one copy per argument, which is necessary because std::function always makes a copy of the functor object from which it is constructed. An exception is the construction of a std::function from a std::function of the same type.

Should the assignment operator observe the assigned object's rvalueness?

For class types it is possible to assign to temporary objects which is actually not allowed for built-in types. Further, the assignment operator generated by default even yields an lvalue:
int() = int(); // illegal: "expression is not assignable"
struct B {};
B& b = B() = B(); // compiles OK: yields an lvalue! ... but is wrong! (see below)
For the last statement the result of the assignment operator is actually used to initialize a non-const reference which will become stale immediately after the statement: the reference isn't bound to the temporary object directly (it can't as temporary objects can only be bound to a const or rvalue references) but to the result of the assignment whose life-time isn't extended.
Another problem is that the lvalue returned from the assignment operator doesn't look as if it can be moved although it actually refers to a temporary. If anything is using the result of the assignment to get hold of the value it will be copied rather than moved although it would be entirely viable to move. At this point it is worth noting that the problem is described in terms of the assignment operator because this operator is typically available for value types and returns an lvalue reference. The same problem exists for any function returning a reference to the objects, i.e., *this.
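The asymmetry between built-in and class types described above can be confirmed with type traits. A minimal sketch, assuming C++17; the function name temporary_assignment_yields_lvalue is made up for this example.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct B {};

// Built-in assignment needs a modifiable lvalue on the left-hand side.
static_assert(!std::is_assignable_v<int, int>);
static_assert(std::is_assignable_v<int&, int>);

// For a class type, assigning to a temporary compiles...
static_assert(std::is_assignable_v<B, B>);

// ...and the implicitly generated operator returns an lvalue reference.
bool temporary_assignment_yields_lvalue() {
    constexpr bool yields_lvalue =
        std::is_same_v<decltype(std::declval<B>() = std::declval<B>()), B&>;
    assert(yields_lvalue);
    return yields_lvalue;
}
```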
A potential fix is to overload the assignment operator (or other functions returning a reference to the object) to consider the kind of object, e.g.:
class G {
public:
// other members
G& operator=(G) & { /*...*/ return *this; }
G operator=(G) && { /*...*/ return std::move(*this); }
};
The possibility to overload the assignment operators as above came with C++11; it would prevent the subtle object invalidation noted above and simultaneously allow moving the result of an assignment to a temporary. The implementation of these two operators is probably identical. Although the implementation is likely to be rather simple (essentially just a swap() of the two objects), it still means extra work, raising the question:
Should functions returning a reference to the object (e.g., the assignment operator) observe the rvalueness of the object being assigned to?
An alternative (mentioned by Simple in a comment) is to not overload the assignment operator but to qualify it explicitly with a & to restrict its use to lvalues:
class GG {
public:
// other members
GG& operator=(GG) & { /*...*/ return *this; }
};
GG g;
g = GG(); // OK
GG() = GG(); // ERROR
IMHO, the original suggestion by Dietmar Kühl (providing overloads for & and && ref-qualifiers) is superior to Simple's (providing it only for &).
The original idea is:
class G {
public:
// other members
G& operator=(G) & { /*...*/ return *this; }
G operator=(G) && { /*...*/ return std::move(*this); }
};
and Simple has suggested to remove the second overload. Both solutions invalidate this line
G& g = G() = G();
(as wanted) but if the second overload is removed, then these lines also fail to compile:
const G& g1 = G() = G();
G&& g2 = G() = G();
and I see no reason why they shouldn't (there's no lifetime issue as explained in Yakk's post).
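A compact sketch of the two-overload variant shows why binding to const G& is safe there: the && overload returns a fresh object by value, and the reference extends that object's lifetime. The int payload, the value() accessor and the function read_through_const_ref are assumptions added for illustration; C++17 is assumed for the `_v` traits.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class G {
public:
    explicit G(int v = 0) : value_(v) {}
    // Lvalue target: the usual form, returns a reference to *this.
    G& operator=(G o) & { std::swap(value_, o.value_); return *this; }
    // Rvalue target: reuse the lvalue overload, then return by value.
    G operator=(G o) && { *this = std::move(o); return std::move(*this); }
    int value() const { return value_; }  // hypothetical accessor
private:
    int value_;
};

static_assert(std::is_same_v<decltype(std::declval<G&>() = std::declval<G>()), G&>);
static_assert(std::is_same_v<decltype(std::declval<G>() = std::declval<G>()), G>);

int read_through_const_ref() {
    // g1 binds to the returned prvalue, whose lifetime is extended.
    const G& g1 = G(1) = G(2);
    assert(g1.value() == 2);
    return g1.value();
}
```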
I can see only one situation where Simple's suggestion is preferable: when G doesn't have an accessible copy/move constructor. Since most types for which the copy/move assignment operator is accessible also have an accessible copy/move constructor, this situation is quite rare.
Both overloads take the argument by value and there are good reasons for that if G has an accessible copy/move constructor. Suppose for now that G does not have one. In this case the operators should take the argument by const G&.
Unfortunately the second overload (which, as it is, returns by value) should not return a reference (of any type) to *this, because the expression that *this is bound to is an rvalue and is thus likely to be a temporary whose lifetime is about to expire. (Recall that forbidding this was one of the OP's motivations.)
In this case, you should remove the second overload (as per Simple's suggestion) otherwise the class doesn't compile (unless the second overload is a template that's never instantiated). Alternatively, we can keep the second overload and define it as deleted. (But why bother since the existence of the overload for & alone is already enough?)
A peripheral point.
What should be the definition of operator = for &&? (We assume again that G has an accessible copy/move constructor.)
As Dietmar Kühl has pointed out and Yakk has explored, the code of both overloads should be very similar and, in this case, it's better to implement the one for && in terms of the one for &. Since the performance of a move is expected to be no worse than a copy (and since RVO doesn't apply when returning *this) we should return std::move(*this). In summary, a possible one-line definition is:
G operator =(G o) && { return std::move(*this = std::move(o)); }
This is good enough if only a G can be assigned to another G or if G has (non-explicit) converting constructors. Otherwise, you should instead consider giving G a (template) forwarding copy/move assignment operator taking a universal (forwarding) reference:
template <typename T>
G operator =(T&& o) && { return std::move(*this = std::forward<T>(o)); }
Although this is not a lot of boiler plate code it's still an annoyance if we have to do that for many classes. To decrease the amount of boiler plate code we can define a macro:
#define ASSIGNMENT_FOR_RVALUE(type) \
template <typename T> \
type operator =(T&& b) && { return std::move(*this = std::forward<T>(b)); }
Then inside G's definition one adds ASSIGNMENT_FOR_RVALUE(G).
(Notice that the relevant type appears only as the return type. In C++14 it can be automatically deduced by the compiler and thus, G and type in the last two code snippets can be replaced by auto. It follows that the macro can become an object-like macro instead of a function-like macro.)
Another way of reducing the amount of boiler plate code is defining a CRTP base class that implements operator = for &&:
template <typename Derived>
struct assignment_for_rvalue {
template <typename T>
Derived operator =(T&& o) && {
return std::move(static_cast<Derived&>(*this) = std::forward<T>(o));
}
};
The boiler plate becomes the inheritance and the using declaration as shown below:
class G : public assignment_for_rvalue<G> {
public:
// other members, possibly including assignment operator overloads for `&`
// but taking arguments of different types and/or value category.
G& operator=(G) & { /*...*/ return *this; }
using assignment_for_rvalue::operator =;
};
Recall that, for some types and unlike using ASSIGNMENT_FOR_RVALUE, inheriting from assignment_for_rvalue might have some unwanted consequences on the class layout.
The first problem is that this is not actually ok in C++03:
B& b = B() = B();
in that b is bound to an expired temporary once the line is finished.
The only "safe" way to use this is in a function call:
void foo(B&);
foo( B()=B() );
or something similar, where the line-lifetime of the temporaries is sufficient for the lifetime of what we bind it to.
We can replace the probably inefficient B()=B() syntax with:
template<typename T>
typename std::decay<T>::type& to_lvalue( T&& t ) { return t; }
and now the call looks clearer:
foo( to_lvalue(B()) );
which does it via pure casting. Lifetime is still not extended (I cannot think of a way to manage that), but we don't construct two objects and then pointlessly assign one to the other.
So now we sit down and examine these three options:
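For completeness, here is a runnable sketch of the to_lvalue call pattern. The member n and the function increment (standing in for foo) are hypothetical additions so that the effect is observable; the temporary is destroyed at the end of the full expression, so nothing may keep a reference to it.

```cpp
#include <cassert>
#include <type_traits>

struct B { int n = 0; };  // payload added for illustration

// Hypothetical stand-in for foo(B&): modifies its argument and reports the result.
int increment(B& b) { return ++b.n; }

template <typename T>
typename std::decay<T>::type& to_lvalue(T&& t) { return t; }

int increment_temporary() {
    // B{41} lives until the end of this full expression, long enough for the call.
    int result = increment(to_lvalue(B{41}));
    assert(result == 42);
    return result;
}
```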
G operator=(G o) && { return std::move(o); }
G&& operator=(G o) && { *this = std::move(o); return std::move(*this); }
G operator=(G o) && { *this = std::move(o); return std::move(*this); }
which are, as an aside, complete implementations, assuming G& operator=(G o)& exists and is written properly. (Why duplicate code when you don't need to?)
The first and third allow for lifetime extension of the return value, while the second relies on the lifetime of *this. The second and third modify *this, while the first one does not.
I would claim that the first one is the right answer. Because *this is bound to an rvalue, the caller has stated that it will not be reused, and its state does not matter: changing it is pointless.
The lifetime behavior of the first and third means that whoever uses them can extend the lifetime of the returned value and is not tied to the lifetime of *this.
About the only use the B operator=(B)&& has is that it allows you to treat rvalue and lvalue code relatively uniformly. As a downside, it lets you treat it relatively uniformly in situations where the result may be surprising.
std::forward<T>(t) = std::forward<U>(u);
should probably fail to compile instead of doing something surprising like "not modifying t" when T&& is an rvalue reference. And modifying t when T&& is an rvalue reference is equally wrong.