Reducing assignment of temporary object to in-place construction - c++

Using std::list supporting move semantics as an example.
std::list<std::string> X;
... //X is used in various ways
X=std::list<std::string>({"foo","bar","dead","beef"});
The most straightforward way for compiler to do the assignment since C++11 is:
destroy X
construct std::list
move std::list to X
Now, compiler isn't allowed to do following instead:
destroy X
contruct std::list in-place
because while this obviously saves another memcpy it eliminates assignment. What is the convenient way of making second behaviour possible and usable? Is it planned in future versions of C++?
My guess is that C++ still does not offer that except with writing:
X.~X();
new(&X) std::list<std::string>({"foo","bar","dead","beef"});
Am I right?

You can actually do it by defining operator= to take an initializer list.
For std::list, just call
X = {"foo","bar","dead","beef"}.
In your case, what was happening is actually:
Construct a temporary
Call move assignment operator on X with the temporary
On most objects, such as std::list, this won't actually be expensive compared to simply constructing an object.
However, it still incurs additional allocations for the internal storage of the second std::list, which could be avoided: we could reuse the internal storage already allocated for X if possible. What is happenning is:
Construct: the temporary allocates some space for the elements
Move: the pointer is moved to X; the space used by X before is freed
Some objects overload the assignment operator to take an initializer list, and it is the case for std::vector and std::list. Such an operator may use the storage already allocated internally, which is the most effective solution here.
// Please insert the usual rambling about premature optimization here

Is it planned in future versions of C++?
No. And thank goodness for that.
Assignment is not the same as destroy-then-create. X is not destroyed in your assignment example. X is a live object; the contents of X may be destroyed, but X itself never is. And nor should it be.
If you want to destroy X, then you have that ability, using the explicit-destructor-and-placement-new. Though thanks to the possibility of const members, you'll also need to launder the pointer to the object if you want to be safe. But assignment should never be considered equivalent to that.
If efficiency is your concern, it's much better to use the assign member function. By using assign, the X has the opportunity to reuse the existing allocations. And this would almost certainly make it faster than your "destroy-plus-construct" version. The cost of moving a linked list into another object is trivial; the cost of having to destroy all of those allocations only to allocate them again is not.
This is especially important for std::list, as it has a lot of allocations.
Worst-case scenario, assign will be no less efficient than whatever else you could come up with from outside the class. And best-case, it will be far better.

When you have a statement involving a move assignment:
x = std::move(y);
The destructor is not called for x before doing the move. However, after the move, at some point the destructor will be called for y. The idea behind a move assignment operator is that it might be able to move the contents of y to x in a simple way (for example, copying a pointer to y's storage into x). It also has to ensure its previous contents are destroyed properly (it may opt to swap this with y, because you know y may not be used anymore, and that y's destructor will be called).
If the move assignment is inlined, the compiler might be able to deduce that all the operations necessary for moving storage from y to x are just equivalent to in-place construction.

Re your final question
” Am I right?
No.
Your ideas about what's permitted or not, are wrong. The compiler is permitted to substitute any optimization as long as it preserves the observable effects. This is called the "as if" rule. Possible optimizations include removing all of that code, if it does not affect anything observable. In particular, your "isn't allowed" for the second example is completely false, and the reasoning "it eliminates assignment" applies also to your first example, where you draw the opposite conclusion, i.e. there's a self-contradiction right there.

Related

Safe to std:move a member?

Have found comparable questions but not exactly with such a case.
Take the following code for example:
#include <iostream>
#include <string>
#include <vector>
struct Inner
{
int a, b;
};
struct Outer
{
Inner inner;
};
std::vector<Inner> vec;
int main()
{
Outer * an_outer = new Outer;
vec.push_back(std::move(an_outer->inner));
delete an_outer;
}
Is this safe? Even if those were polymorphic classes or ones with custom destructors?
My concern regards the instance of "Outer" which has a member variable "inner" moved away. From what I learned, moved things should not be touched anymore. However does that include the delete call that is applied to outer and would technically call delete on inner as well (and thus "touch" it)?
Neither std::move, nor move semantics more generally, have any effect on the object model. They don't stop objects from existing, nor prevent you from using those objects in the future.
What they do is ask to borrow encapsulated resources from the thing you're "moving from". For example, a vector, which directly only stores a pointer some dynamically-allocated data: the concept of ownership of that data can be "stolen" by simply copying that pointer then telling the vector to null the pointer and never have anything to do with that data again. It's yielded. The data belongs to you now. You have the last pointer to it that exists in the universe.
All of this is achieved simply by a bunch of hacks. The first is std::move, which just casts your vector expression to vector&&, so when you pass the result of it to a construction or assignment operation, the version that takes vector&& (the move constructor, or move-assignment operator) is triggered instead of the one that takes const vector&, and that version performs the steps necessary to do what I described in the previous paragraph.
(For other types that we make, we traditionally keep following that pattern, because that's how we can have nice things and persuade people to use our libraries.)
But then you can still use the vector! You can "touch" it. What exactly you can do with it is discoverable from the documentation for vector, and this extends to any other moveable type: the constraints emplaced on your usage of a moved-from object depend entirely on its type, and on the decisions made by the person who designed that type.
None of this has any impact on the lifetime of the vector. It still exists, it still takes memory, and it will still be destructed when the time comes. (In this particular example you can actually .clear() it and start again adding data to a new buffer.)
So, even if ints had any sort of concept of this (they don't; they encapsulate no indirectly-stored data, and own no resources; they have no constructors, so they also have no constructors taking int&&), the delete "touch"ing them would be entirely safe. And, more generally, none of this depends on whether the thing you've moved from is a member or not.
More generally, if you had a type T, and an object of that type, and you moved from it, and one of the constraints for T was that you couldn't delete it after moving from it, that would be a bug in T. That would be a serious mistake by the author of T. Your objects all need to be destructible. The mistake could manifest as a compilation failure or, more likely, undefined behaviour, depending on what exactly the bug was.
tl;dr: Yes, this is safe, for several reasons.
std::move is a cast to an rvalue-reference, which primarily changes which constructor/assignment operator overload is chosen. In your example the move-constructor is the default generated move-constructor, which just copies the ints over so nothing happens.
Whether or not this generally safe depends on the way your classes implement move construction/assignment. Assume for example that your class instead held a pointer. You would have to set that to nullptr in the moved-from class to avoid destroying the pointed-to data, if the moved-from class is destroyed.
Because just defining move-semantics is a custom way almost always leads to problems, the rule of five says that if you customize any of:
the copy constructor
the copy assignment operator
the move constructor
the move assignment operator
the destructor
you should usually customize all to ensure that they behave consistently with the expectations a caller would usually have for your class.

What are the operations supported after an object is moved? [duplicate]

This question already has answers here:
What can I do with a moved-from object?
(2 answers)
Closed 9 years ago.
If an object is actually moved to another location, what are the operations supported on the original object?
To elaborate it, I have a type T with available move constructor. With the following statements
T x{constructor-args};
T y{std::move(x)};
what all can be done with the object x (provided that the object actually moves from x to y using available move constructor of T)?
Specifically,
x can be destroyed for sure. Can I assume x is trivially destructible after move?
can x be assigned or move assigned to a new object ( I guess yes, as I have seen many swap uses this)?
can I construct a new object object in place of x? If by any chance this is allowed, then is it possible to have a uninitialized_move which works with (partially) overlapped source & destination range, like std::copy and unlike std::uninitialized_copy ?
T z{some-args};
x = z; //or x = std::move(z);
T u{some-args};
new(&x) T{std::move(u)}; // or new(&x) T{u}; etc
Whatever you define it to be. At the least, the resulting
object should be destructable—its destructor will be
called, and the state of the object should be such that that
causes no problems.
The standard library guarantees that its objects are some
coherent state. You can call any of the member functions: it is
unspecified what you get in return, but the results will be
coherent: if you call size on a vector which has been moved,
you can also call operator[] with an index less than the value
returned by size (but in the only reasonable implementation,
size will return 0,). This is probably more than what is
needed for effective use, however; all that is really necessary
is that calling the destructor will work.
To make it clearer, using std::vector as an example: if we
suppose the usual implementation, with three pointers, begin,
end and limit (or however they are called: limit is meant to be
one past the end of the allocated memory, so that capacity()
returns limit - begin). After moving, the standard would
require that all three pointers be null. In most
implementations, the limit pointer will not be accessed in the
destructor, so the looser requirements of being deletable would
be met if only the begin and end pointers were set to null.
With regards to library types, the standard says (17.6.5.15)
Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.
This means they should, at a minimum, be destructible and assignable. As James mentions below, the object should still comply with its own interface. Your own types should follow the same idea.
You should not be using placement-new for this sort of thing. Pretend the move constructor doesn't exist and write code the same way as before, using operator=. The last line in your example should be x=std::move(u); or x=u;.

Can we say bye to copy constructors?

Copy constructors were traditionally ubiquitous in C++ programs. However, I'm doubting whether there's a good reason to that since C++11.
Even when the program logic didn't need copying objects, copy constructors (usu. default) were often included for the sole purpose of object reallocation. Without a copy constructor, you couldn't store objects in a std::vector or even return an object from a function.
However, since C++11, move constructors have been responsible for object reallocation.
Another use case for copy constructors was, simply, making clones of objects. However, I'm quite convinced that a .copy() or .clone() method is better suited for that role than a copy constructor because...
Copying objects isn't really commonplace. Certainly it's sometimes necessary for an object's interface to contain a "make a duplicate of yourself" method, but only sometimes. And when it is the case, explicit is better than implicit.
Sometimes an object could expose several different .copy()-like methods, because in different contexts the copy might need to be created differently (e.g. shallower or deeper).
In some contexts, we'd want the .copy() methods to do non-trivial things related to program logic (increment some counter, or perhaps generate a new unique name for the copy). I wouldn't accept any code that has non-obvious logic in a copy constructor.
Last but not least, a .copy() method can be virtual if needed, allowing to solve the problem of slicing.
The only cases where I'd actually want to use a copy constructor are:
RAII handles of copiable resources (quite obviously)
Structures that are intended to be used like built-in types, like math vectors or matrices -
simply because they are copied often and vec3 b = a.copy() is too verbose.
Side note: I've considered the fact that copy constructor is needed for CAS, but CAS is needed for operator=(const T&) which I consider redundant basing on the exact same reasoning;
.copy() + operator=(T&&) = default would be preferred if you really need this.)
For me, that's quite enough incentive to use T(const T&) = delete everywhere by default and provide a .copy() method when needed. (Perhaps also a private T(const T&) = default just to be able to write copy() or virtual copy() without boilerplate.)
Q: Is the above reasoning correct or am I missing any good reasons why logic objects actually need or somehow benefit from copy constructors?
Specifically, am I correct in that move constructors took over the responsibility of object reallocation in C++11 completely? I'm using "reallocation" informally for all the situations when an object needs to be moved someplace else in the memory without altering its state.
The problem is what is the word "object" referring to.
If objects are the resources that variables refers to (like in java or in C++ through pointers, using classical OOP paradigms) every "copy between variables" is a "sharing", and if single ownership is imposed, "sharing" becomes "moving".
If objects are the variables themselves, since each variables has to have its own history, you cannot "move" if you cannot / don't want to impose the destruction of a value in favor of another.
Cosider for example std::strings:
std::string a="Aa";
std::string b=a;
...
b = "Bb";
Do you expect the value of a to change, or that code to don't compile? If not, then copy is needed.
Now consider this:
std::string a="Aa";
std::string b=std::move(a);
...
b = "Bb";
Now a is left empty, since its value (better, the dynamic memory that contains it) had been "moved" to b. The value of b is then chaged, and the old "Aa" discarded.
In essence, move works only if explicitly called or if the right argument is "temporary", like in
a = b+c;
where the resource hold by the return of operator+ is clearly not needed after the assignment, hence moving it to a, rather than copy it in another a's held place and delete it is more effective.
Move and copy are two different things. Move is not "THE replacement for copy". It an more efficient way to avoid copy only in all the cases when an object is not required to generate a clone of itself.
Short anwer
Is the above reasoning correct or am I missing any good reasons why logic objects actually need or somehow benefit from copy constructors?
Automatically generated copy constructors are a great benefit in separating resource management from program logic; classes implementing logic do not need to worry about allocating, freeing or copying resources at all.
In my opinion, any replacement would need to do the same, and doing that for named functions feels a bit weird.
Long answer
When considering copy semantics, it's useful to divide types into four categories:
Primitive types, with semantics defined by the language;
Resource management (or RAII) types, with special requirements;
Aggregate types, which simply copy each member;
Polymorphic types.
Primitive types are what they are, so they are beyond the scope of the question; I'm assuming that a radical change to the language, breaking decades of legacy code, won't happen. Polymorphic types can't be copied (while maintaining the dynamic type) without user-defined virtual functions or RTTI shenanigans, so they are also beyond the scope of the question.
So the proposal is: mandate that RAII and aggregate types implement a named function, rather than a copy constructor, if they should be copied.
This makes little difference to RAII types; they just need to declare a differently-named copy function, and users just need to be slightly more verbose.
However, in the current world, aggregate types do not need to declare an explicit copy constructor at all; one will be generated automatically to copy all the members, or deleted if any are uncopyable. This ensures that, as long as all the member types are correctly copyable, so is the aggregate.
In your world, there are two possibilities:
Either the language knows about your copy-function, and can automatically generate one (perhaps only if explicitly requested, i.e. T copy() = default;, since you want explicitness). In my opinion, automatically generating named functions based on the same named function in other types feels more like magic than the current scheme of generating "language elements" (constructors and operator overloads), but perhaps that's just my prejudice speaking.
Or it's left to the user to correctly implement copying semantics for aggregates. This is error-prone (since you could add a member and forget to update the function), and breaks the current clean separation between resource management and program logic.
And to address the points you make in favour:
Copying (non-polymorphic) objects is commonplace, although as you say it's less common now that they can be moved when possible. It's just your opinion that "explicit is better" or that T a(b); is less explicit than T a(b.copy());
Agreed, if an object doesn't have clearly defined copy semantics, then it should have named functions to cover whatever options it offers. I don't see how that affects how normal objects should be copied.
I've no idea why you think that a copy constructor shouldn't be allowed to do things that a named function could, as long as they are part of the defined copy semantics. You argue that copy constructors shouldn't be used because of artificial restrictions that you place on them yourself.
Copying polymorphic objects is an entirely different kettle of fish. Forcing all types to use named functions just because polymorphic ones must won't give the consistency you seem to be arguing for, since the return types would have to be different. Polymorphic copies will need to be dynamically allocated and returned by pointer; non-polymorphic copies should be returned by value. In my opinion, there is little value in making these different operations look similar without being interchangable.
One case where copy constructors come in useful is when implementing the strong exception guarantees.
To illustrate the point, let's consider the resize function of std::vector. The function might be implemented roughly as follows:
void std::vector::resize(std::size_t n)
{
if (n > capacity())
{
T *newData = new T [n];
for (std::size_t i = 0; i < capacity(); i++)
newData[i] = std::move(m_data[i]);
delete[] m_data;
m_data = newData;
}
else
{ /* ... */ }
}
If the resize function were to have a strong exception guarantee we need to ensure that, if an exception is thrown, the state of the std::vector before the resize() call is preserved.
If T has no move constructor, then we will default to the copy constructor. In this case, if the copy constructor throws an exception, we can still provide strong exception guarantee: we simply delete the newData array and no harm to the std::vector has been done.
However, if we were using the move constructor of T and it threw an exception, then we have a bunch of Ts that were moved into the newData array. Rolling this operation back isn't straight-forward: if we try to move them back into the m_data array the move constructor of T may throw an exception again!
To resolve this issue we have the std::move_if_noexcept function. This function will use the move constructor of T if it is marked as noexcept, otherwise the copy constructor will be used. This allows us to implement std::vector::resize in such a way as to provide a strong exception guarantee.
For completeness, I should mention that C++11 std::vector::resize does not provide a strong exception guarantee in all cases. According to www.cplusplus.com we have the the follow guarantees:
If n is less than or equal to the size of the container, the function never throws exceptions (no-throw guarantee).
If n is greater and a reallocation happens, there are no changes in the container in case of exception (strong guarantee) if the type of the elements is either copyable or no-throw moveable.
Otherwise, if an exception is thrown, the container is left with a valid state (basic guarantee).
Here's the thing. Moving is the new default- the new minimum requirement. But copying is still often a useful and convenient operation.
Nobody should bend over backwards to offer a copy constructor anymore. But it is still useful for your users to have copyability if you can offer it simply.
I would not ditch copy constructors any time soon, but I admit that for my own types, I only add them when it becomes clear I need them- not immediately. So far this is very, very few types.

When are R-value references necessary?

I understand that, without R-value references, perfect forwarding in C++ would be impossible.
However, I would like to know: is there anything else that necessitates them?
For example, this page points out this example in which R-value references are apparently necessary:
X foo();
X x;
// perhaps use x in various ways
x = foo();
The last line above:
Destructs the resource held by x,
Clones the resource from the temporary returned by foo,
Destructs the temporary and thereby releases its resource.
However, it seems to me that a simple change would have fixed the problem, if swap were implemented properly:
X foo();
X x;
// perhaps use x in various ways
{
X y = foo();
swap(x, y);
}
So it seems to me that r-value references not necessary for this optimization. (Is that correct?)
So, what are some problems that could not be solved with r-value references (except for perfect forwarding, about which I already know)?
So, what are some problems that could not be solved with r-value references (except for perfect forwarding, about which I already know)?
Yes. In order for the swap trick to work (or at least, work optimally), the class must be designed to be in an empty state when constructed. Imagine a vector implementation that always reserved a few elements, rather than starting off totally empty. Swapping from such a default-constructed vector with an already existing vector would mean doing an extra allocation (in the default constructor of this vector implementation).
With an actual move, the default-constructed object can have allocated data, but the moved-from object can be left in an unallocated state.
Also, consider this code:
std::unique_ptr<T> make_unique(...) {return std::unique_ptr<T>(new T(...));}
std::unique_ptr<T> unique = make_unique();
This is not possible without language-level move semantics. Why? Because of the second statement. By the standard, this is copy initialization. That, as the name suggests, requires the type to be copyable. And the whole point of unique_ptr is that it is, well, unique. IE: not copyable.
If move semantics didn't exist, you couldn't return it. Oh yes, you can talk about copy elision and such, but remember: elision is an optimization. A conforming compiler must still detect that the copy constructor exists and is callable; the compiler simply has the option to not call it.
So no, you can't just emulate move semantics with swap. And quite frankly, even if you could, why would you ever want to?
You have to write swap implementations yourself. So if you've got a class with 6 members, your swap function has to swap each member in turn. Recursively through all base classes and such too.
C++11 (though not VC10 or VC2012) allows the compiler to write the move constructors for you.
Yes, there are circumstances where you can get away without it. But they look incredibly hacky, are hard to see why you're doing things that way, and most important of all, make things difficult for both the reader and the writer.
When people want to do something a lot for performance sake that makes their code tedious to read, difficult to write, and error-prone to maintain (add another member to the class without adding it to the swap function, for example), then you're looking at something that probably should become part of the language. Especially when it's something that the compiler can handle quite easily.

Questions about postblit and move semantics

I have already asked a similar question a while ago, but I'm still unclear on some details.
Under what circumstances is the postblit constructor called?
What are the semantics of moving an object? Will it be postblitted and/or destructed?
What happens if I return a local variable by value? Will it implicitly be moved?
How do I cast an expression to an rvalue? For example, how would a generic swap look like?
A postblit constructor is called whenever the struct is copied - e.g. when passing a struct to a function.
A move is a bitwise copy. The postblit constructor is never called. The destructor is never called. The bits are simply copied. The original was "moved" and so nothing needs to be created or destroyed.
It will be moved. This is the prime example of a move.
There are a number of different situations that a swap function would have to worry about if you want to make it as efficient as possible. I would advise simply using the swap function in std.algorithm. The classic swap would result in copying and would thus call the postblit constructor and the destructor. Moves are generally done by the compiler, not the programmer. However, looking at the official implementation of swap, it looks like it plays some tricks to get move semantics out of the deal where it can. Regardless, moves are generally done by the compiler. They're an optimization that it will do where it knows that it can (RVO being the classic case where it can).
According to TDPL (p. 251), there are only 2 cases where D guarantees that a move will take place:
All anonymous rvalues are moved, not copied. A call to this(this)
is never inserted when the source is an anonymous rvalue (i.e., a
temporary as featured in the function hun above).
All named temporaries that are stack-allocated inside a function and
then returned elide a call to this(this).
There is no guarantee that other potential elisions are observed.
So, the compiler may use moves elsewhere, but there's no guarantee that it will.
As far as I understand:
1) When a struct is copied, as opposed to moved or constructed.
2) The point of move semantics is that neither of the two needs to happen. The new location of the struct is initialized with a bit-wise copy of the struct, and the old location goes out of scope and becomes inaccessible. Thus, the struct has "moved" from A to B.
3) That is the typical move situation:
S init(bool someFlag)
{
S s;
s.foo = someFlag? bar : baz;
return s; // `s` can now be safely moved from here...
}
// call-site:
S s = init(flag);
//^ ... to here.