Why "move semantics" rather than simply memcpy? - c++

Given the following code:
#include <cstring>
#include <list>
#include <type_traits>

// T is assumed to be some object type known from the surrounding context
typedef typename std::aligned_storage<sizeof(T), alignof(T)>::type storage_t;

// this moves the back of src to the back of dst:
void push_popped(std::list<storage_t>& dstLst, std::list<storage_t>& srcLst)
{
    auto& src = srcLst.back();
    dstLst.push_back(storage_t());
    auto& dst = dstLst.back();
    std::memcpy(&dst, &src, sizeof(T));
    srcLst.pop_back();
}
I'm aware of 3 reasons why this approach is not, in general, correct (even though it avoids calling src->~T() and so avoids double-reclamation of T's resources).
object members of type U* that point to other U members of the same object
hidden class members may need to be updated (vtable, for instance)
the system needs to record that no T exists anymore at src and that a T does now exist at dst
(These are mentioned here: http://www.gamedev.net/topic/655730-c-stdmove-vs-stdmemcpy/#entry5148523.)
Assuming that T is not a type whose memory address is a property of its state (std::mutex or std::condition_variable, for instance), are these the only issues with this approach? Or are there other things that could go wrong? I'd like a description of the unknown issues.
I'd like to think I have an "object relocation semantics" developed, but I'd rather not ask people to consider it if there's an obvious hole in it.

The concept of "trivially copyable" implies that a memcpy is safe. You can test if a type is trivially copyable via a trait in std.
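For example, a minimal check using the standard trait std::is_trivially_copyable (the Pod and Owner types below are made up for illustration):

#include <type_traits>

struct Pod {            // no user-provided special members: trivially copyable
    int x;
    double y;
};

struct Owner {          // user-provided copy ctor and dtor: not trivially copyable
    Owner(const Owner&);
    ~Owner();
    int* p;
};

static_assert(std::is_trivially_copyable<Pod>::value, "memcpy is sanctioned here");
static_assert(!std::is_trivially_copyable<Owner>::value, "memcpy is not sanctioned here");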
It includes the idea that destroying the object is a no-op; in your case, you want destruction not to be a no-op, but rather skipped on the source while still being done on the destination.
The concept of "move-and-destroy-source" has been proposed in the C++1z standardization process independent of the "trivially copyable" concept. It was proposed for exception safety; there are types for which a move-construct is not exception-safe, but a move-construct-and-destroy-source would be. And there are thorny problems involving exceptions and container allocations that make a noexcept move-ctor operation very valuable.
If that gets into the standard, then a trivially-copyable-if-you-don't-destroy-source concept could also be added to the standard, if it proves valuable.
It wouldn't apply to everything move semantics can enhance, and it may require effort on the part of programmers (how the compiler can work out that "it is ok to elide a destructor" is not going to be easy; by Rice's theorem, all non-trivial semantic properties of programs are undecidable in general.)

Why use a copy constructor instead of std::memcpy?
The move constructor/move assignment operator gives you an encapsulated opportunity to do other useful things when you move an object - logging, cleaning up, etc.
Performance-wise, in many cases the compiler can optimize many moves down to just one (imagine a chain of functions that simply return some object to one another); with memcpy its ability to do so is much more restricted.
And finally - because C++ is not about moving bytes around - it's about using objects as the basis of your program.

Related

Can we detect "trivial relocatability" in C++17?

In future standards of C++, we will have the concept of "trivial relocatability", which means we can simply copy the bytes from one object into an uninitialized chunk of memory and ignore/zero out the bytes of the original object.
This way, we imitate the C-style way of copying/moving objects around.
In future standards, we will probably have something like std::is_trivially_relocatable<type> as a type trait. Currently, the closest thing we have is std::is_pod<type>, which will be deprecated in C++20.
My question is, do we have a way in the current standard (C++17) to figure out if the object is trivially relocatable?
For example, std::unique_ptr<type> can be moved around by copying its bytes to a new memory address and zeroing out the original bytes, but std::is_pod_v<std::unique_ptr<int>> is false.
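Both of the standard traits indeed reject std::unique_ptr, even though a byte-wise relocation of it would be harmless on typical implementations (a small C++17 check):

#include <memory>
#include <type_traits>

// unique_ptr has a non-trivial move constructor and destructor, so the standard
// traits give no permission for byte-wise copying, even though "copy the bytes
// and forget the source" would work in practice for common implementations.
static_assert(!std::is_pod_v<std::unique_ptr<int>>);
static_assert(!std::is_trivially_copyable_v<std::unique_ptr<int>>);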
Also, the standard currently mandates that every uninitialized chunk of memory must pass through a constructor in order to be considered a valid C++ object. Even if we can somehow figure out that an object is trivially relocatable, just moving the bytes is still UB according to the standard.
So another question is - even if we can detect trivial relocatability, how can we implement trivial relocation without causing UB? Simply calling memcpy + memset(src,0,...) and casting the memory address to the right type is UB.
Thanks!
The whole point of trivial-relocatability would seem to be to enable byte-wise moving of objects even in the presence of a non-trivial move constructor or move assignment operator. Even in the current proposal P1144R3, this ultimately requires that a user manually mark types for which this is possible. For a compiler to figure out whether a given type is trivially-relocatable in general is most likely equivalent to solving the halting problem (it would have to understand and reason about what an arbitrary, potentially user-defined move constructor or move assignment operator does)…
It is, of course, possible that you define your own is_trivially_relocatable trait that defaults to std::is_trivially_copyable_v and have the user specialize for types that should specifically be considered trivially-relocatable. Even this is problematic, however, because there's gonna be no way to automatically propagate this property to types that are composed of trivially-relocatable types…
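A hedged sketch of such a hand-rolled trait (the name only mirrors the proposed one; the unique_ptr warrant below is a manual promise and, as noted above, it does not propagate to types that merely contain one):

#include <memory>
#include <type_traits>

// User-side trait: default to the standard "trivially copyable" notion.
template <class T>
struct is_trivially_relocatable : std::is_trivially_copyable<T> {};

// Manual warrant: we promise that our library's unique_ptr can be relocated
// byte-wise. Nothing verifies this promise.
template <class T>
struct is_trivially_relocatable<std::unique_ptr<T>> : std::true_type {};

static_assert(is_trivially_relocatable<int>::value, "");
static_assert(is_trivially_relocatable<std::unique_ptr<int>>::value, "");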
Even for trivially-copyable types, you can't just copy the bytes of the object representation to some random memory location and cast the address to a pointer to the type of the original object. Since an object was never created there, that pointer will not point to an object. And attempting to access the object that pointer doesn't point to will result in undefined behavior. Trivial-copyability means you can copy the bytes of the object representation from one existing object to another existing object and rely on that making the value of the one object equal to the value of the other [basic.types]/3.
Doing this to trivially relocate some object would mean that you have to first construct an object of the given type at your target location, then copy the bytes of the original object into that, and then modify the original object in a way equivalent to what would have happened if you had moved from that object. Which is essentially a complicated way of just moving the object…
There's a reason a proposal to add the concept of trivial-relocatability to the language exists: because you currently just can't do it from within the language itself…
Note that, despite all this, just because the compiler frontend cannot avoid generating constructor calls doesn't mean the optimizer cannot eliminate unnecessary loads and stores. Let's have a look at what code the compiler generates for your example of moving a std::vector or std::unique_ptr:
#include <memory>
#include <new>
#include <utility>
#include <vector>

auto test1(void* dest, std::vector<int>& src)
{
    return new (dest) std::vector<int>(std::move(src));
}

auto test2(void* dest, std::unique_ptr<int>& src)
{
    return new (dest) std::unique_ptr<int>(std::move(src));
}
As you can see, just doing an actual move often already boils down to just copying and overwriting some bytes, even for non-trivial types…
Author of P1144 here; somehow I'm just seeing this SO question now!
std::is_trivially_relocatable<T> is proposed for some-future-version-of-C++, but I don't predict it'll get in anytime soon (definitely not C++23, I bet not C++26, quite possibly not ever). The paper (P1144R6, June 2022) ought to answer a lot of your questions, especially the ones where people are correctly answering that if you could already implement this in present-day C++, we wouldn't need a proposal. See also my 2019 C++Now talk.
Michael Kenzel's answer says that P1144 "ultimately requires that a user manually mark types for which [trivial relocation] is possible"; I want to point out that that's kind of the opposite of the point. The state of the art for trivial relocatability is manual marking ("warranting") of each and every such type; for example, in Folly, you'd say
struct Widget {
    std::string s;
    std::vector<int> v;
};
FOLLY_ASSUME_FBVECTOR_COMPATIBLE(Widget);
And this is a problem, because the average industry programmer shouldn't be bothered with trying to figure out if std::string is trivially relocatable on their library of choice. (The annotation above is wrong on 1.5 of the big 3 vendors!) Even Folly's own maintainers can't get these manual annotations right 100% of the time.
So the idea of P1144 is that the compiler can just take care of it for you. Your job changes from dangerously warranting things-you-don't-necessarily-know, to merely (and optionally) verifying things-you-want-to-be-true via static_assert (Godbolt):
struct Widget {
    std::string s;
    std::vector<int> v;
};
static_assert(std::is_trivially_relocatable_v<Widget>);

struct Gadget {
    std::string s;
    std::list<int> v;
};
static_assert(!std::is_trivially_relocatable_v<Gadget>);
In your (OP's) specific use-case, it sounds like you need to find out whether a given lambda type is trivially relocatable (Godbolt):
void f(std::list<int> v) {
    auto widget = [&]() { return v; };
    auto gadget = [=]() { return v; };
    static_assert(std::is_trivially_relocatable_v<decltype(widget)>);
    static_assert(!std::is_trivially_relocatable_v<decltype(gadget)>);
}
This is something you can't really do at all with Folly/BSL/EASTL, because their warranting mechanisms work only on named types at the global scope. You can't exactly FOLLY_ASSUME_FBVECTOR_COMPATIBLE(decltype(widget)).
Inside a std::function-like type, you're correct that it would be useful to know whether the captured type is trivially relocatable or not. But since you can't know that, the next best thing (and what you should do in practice) is to check std::is_trivially_copyable. That's the currently blessed type trait that literally means "This type is safe to memcpy, safe to skip the destructor of" — basically all the things you're going to be doing with it. Even if you knew that the type was exactly std::unique_ptr<int>, or whatever, it would still be undefined behavior to memcpy it in present-day C++, because the current standard says that you're not allowed to memcpy types that aren't trivially copyable.
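A sketch of what that dispatch could look like inside such a wrapper (the function name and the buffer handling are made up for illustration; only the trait check and the two code paths matter):

#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Relocate the small-buffer contents of a function-like wrapper into dst.
// Only the trivially copyable path may use memcpy and skip the source's
// destructor; everything else falls back to move-construct + destroy.
template <class T>
void relocate_into_buffer(void* dst, T& src) {
    if constexpr (std::is_trivially_copyable_v<T>) {
        std::memcpy(dst, &src, sizeof(T));   // destructor of src may be skipped
    } else {
        ::new (dst) T(std::move(src));       // ordinary move-construct...
        src.~T();                            // ...then destroy the source
    }
}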
(Btw, technically, P1144 doesn't change that fact. P1144 merely says that the implementation is allowed to elide the effects of relocation, which is a huge wink-and-nod to implementors that they should just use memcpy. But even P1144R6 doesn't make it legal for ordinary non-implementor programmers to memcpy non-trivially-copyable types: it leaves the door open for some compiler to implement, and some library implementation to use, a __builtin_trivial_relocate function that is in some magical sense distinguishable from a plain old memcpy.)
Finally, your last paragraph refers to memcpy + memset(src,0,...). That's wrong. Trivial relocation is tantamount to just memcpy. If you care about the state of the source object afterward — if you care that it's all-zero-bytes, for example — then that must mean you're going to look at it again, which means you aren't actually treating it as destroyed, which means you aren't actually doing the semantics of a relocate here. "Copy and null out the source" is more often the semantics of a move. The point of relocation is to avoid that extra work.

What's the connection between value semantics and move semantics in C++?

There are plenty of articles discussing value semantics vs reference semantics, and maybe more trying to explain move semantics. However, no one has ever talked about the connection between value semantics and move semantics. Are they orthogonal concepts?
Note: This question is NOT about comparing value semantics vs move semantics, because it is perfectly clear these two concepts are not "comparable". This question is about how they are connected; specifically (like @StoryTeller said), about discussing how:
Move semantics help facilitate more use of value types.
From the original move proposal:
Copy vs Move
C and C++ are built on copy semantics. This is a Good Thing. Move
semantics is not an attempt to supplant copy semantics, nor undermine
it in any way. Rather this proposal seeks to augment copy semantics. A
general user defined class might be both copyable and movable, one or
the other, or neither.
The difference between a copy and a move is that a copy leaves the
source unchanged. A move on the other hand leaves the source in a
state defined differently for each type. The state of the source may
be unchanged, or it may be radically different. The only requirement
is that the object remain in a self consistent state (all internal
invariants are still intact). From a client code point of view,
choosing move instead of copy means that you don't care what happens
to the state of the source.
For PODs, move and copy are identical operations (right down to the
machine instruction level).
I guess one could add to this and say:
Move semantics allows us to keep value semantics, but at the same time gain the performance of reference semantics in those cases where the value of the original (copied-from) object is unimportant to program logic.
Inspired by Howard's answer, I wrote an article about this topic; I hope it can help someone who's also wondering about it. I've copied the article here.
While I was learning move semantics, I always had a feeling that, even though I knew the concept quite well, I couldn't fit it into the big picture of C++. Move semantics is not some syntactic sugar that exists solely for convenience; it deeply affected the way people think and write C++ and has become one of the most important C++ idioms. But the pond of C++ was already full of other idioms, and when you throw move semantics in, it inevitably pushes up against them. Did move semantics break, enhance or replace other idioms? I don't know, but I want to find out.
Value Semantics
Value semantics is what makes me start to think about this problem. Since there aren't many things in C++ with the name "semantics", I naturally thought, "maybe value and move semantics have some connections?". Turns out, it's not just connections, it's the origin:
Move semantics is not an attempt to supplant copy semantics, nor undermine it in any way. Rather this proposal seeks to augment copy semantics.
- Move Semantics Proposal, September 10, 2002
Perhaps you've noticed it uses the wording "copy semantics", in fact, "value semantics" and "copy semantics" are the same thing, and I'll use them interchangeably.
OK, so what is value semantics? isocpp has a whole page talking about it, but basically, value semantics means assignment copies the value, as in T b = a;. That's the definition, but often "value semantics" just means to create, use and store the object itself, and to pass and return it by value, rather than using pointers or references.
The opposite concept is reference semantics, where assignment copies the pointer. In reference semantics, what's important is identity; for example, with T& b = a;, we have to remember that b is an alias of a, not anything else. But in value semantics, we don't care about identity at all, we only care about the value an object1 holds. This comes from the nature of copying, because a copy is guaranteed to give us two independent objects that hold the same value: you can't tell which one is the source, nor does it affect usage.
Unlike other languages (Java, C#, JavaScript), C++ is built on value semantics. By default, assignment does a bit-wise copy (if no user-defined copy ctor is involved), and arguments and return values are copy-constructed (yes, I know there's RVO). Keeping value semantics is considered a good thing in C++. On the one hand, it's safer, because you don't need to worry about dangling pointers and all the creepy stuff; on the other hand, it's faster, because you have less indirection; see here for the official explanation.
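A tiny illustration of the two flavours (plain standard-library code, nothing specific to the article):

#include <vector>

void semantics_demo() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b = a;   // value semantics: b is an independent copy of a's value
    b.push_back(4);           // a is untouched; neither object is "the original" to us

    std::vector<int>& r = a;  // reference semantics: r is just another name for a
    r.push_back(5);           // now a changed too - identity matters here
}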
Move Semantics: V8 Engine on the Value Semantics Car
Move semantics is not an attempt to supplant copy semantics. They are totally compatible with each other. I came up with this metaphor which I feel describes their relation really well.
Imagine you have a car; it runs smoothly with the built-in engine. One day, you install an extra V8 engine onto this car. Whenever you have enough fuel, the V8 engine is able to speed up your car, and that makes you happy.
So, the car is value semantics, and the V8 engine is move semantics. Installing an engine on your car does not require a new car; it's still the same car, just like using move semantics won't make you drop value semantics, because you're still operating on the object itself, not its references or pointers. Furthermore, the move if you can, else copy strategy, implemented by the binding preferences, is exactly like the way the engine is chosen: use the V8 if you can (enough fuel), otherwise fall back to the original engine.
Now we have a pretty good understanding of the answer on SO by Howard Hinnant (main author of the move proposal):
Move semantics allows us to keep value semantics, but at the same time gain the performance of reference semantics in those cases where the value of the original (copied-from) object is unimportant to program logic.
EDIT: Howard added some comments that are really worth mentioning. By definition, move semantics acts more like reference semantics, because the moved-to and moved-from objects are not independent: when modifying (either by move-construction or move-assignment) the moved-to object, the moved-from object is also modified. However, it doesn't really matter: when move semantics takes place, you don't care about the moved-from object, because it's either a pure rvalue (so nobody else has a reference to the original), or the programmer has specifically said "I don't care about the value of the original after the copy" (by using std::move instead of a copy). Since modification of the original object has no impact on the program, you can use the moved-to object as if it were an independent copy, retaining the appearance of value semantics.
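For example:

#include <string>
#include <utility>

void move_demo() {
    std::string src = "a fairly long string that lives on the heap";
    std::string dst = std::move(src);  // dst steals src's buffer

    // src is now in a valid but unspecified state; the program must not rely
    // on its value - which is exactly what "I don't care about the original"
    // means. Assigning a new value to it later is always fine.
    src = "usable again";
}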
Move Semantics and Performance Optimization
Move semantics is mostly about performance optimization: the ability to move an expensive object from one address in memory to another, while pilfering resources of the source in order to construct the target with minimum expense.
- Move Semantics Proposal
As stated in the proposal, the main benefit people get from move semantics is a performance boost. I'll give two examples here.
The optimization you can see
Suppose we have a handler (whatever that is) which is expensive to construct, and we want to store it in a map for future use.
std::unordered_map<string, Handler> handlers;

void RegisterHandler(const string& name, Handler handler) {
    handlers[name] = std::move(handler);
}

RegisterHandler("handler-A", build_handler());
This is a typical use of move, and of course it assumes Handler has a move ctor. By moving (not copying) the handler into the map, a lot of time may be saved.
The optimization you can't see
Howard Hinnant once mentioned in his talk that the idea of move semantics came from optimizing std::vector. How?
A std::vector<T> object is basically a set of pointers into an internal data buffer on the heap, like begin() and end(). Copying a vector is expensive due to allocating new memory for the data buffer. When a move is used instead of a copy, only the pointers get copied, and they still point to the old buffer.
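For instance:

#include <utility>
#include <vector>

void vector_move_demo() {
    std::vector<int> big(1000000, 42);

    std::vector<int> copied = big;             // allocates and copies a million ints
    std::vector<int> moved  = std::move(big);  // copies a few pointers; big gives up its buffer
}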
What's more, move also boosts the vector insert operation. This is explained in the vector Example section of the proposal. Say we have a std::vector<string> with two elements "AAAAA" and "BBBBB", and now we want to insert "CCCCC" at index 1. Assuming the vector has enough capacity, the following graph demonstrates the process of inserting with copy vs move.
[Figure: inserting "CCCCC" at index 1, copy vs move] (source: qnssl.com)
Everything shown on the graph is on the heap, including the vector's data buffer and each element string's data buffer. With copy, str_b's data buffer has to be copied, which involves a buffer allocation and then a deallocation. With move, old str_b's data buffer is reused by the new str_b at the new address; no buffer allocation or deallocation is needed (as Howard pointed out, the "data" that old str_b now points to is unspecified). This brings a huge performance boost, yet it means more than that: now you can store expensive objects in a vector without sacrificing performance, whereas previously you had to store pointers. This also helps extend the use of value semantics.
Move Semantics and Resource Management
In the famous article Rule of Zero, the author wrote:
Using value semantics is essential for RAII, because references don't affect the lifetime of their referents.
I found it to be a good starting point to discuss the correlation between move semantics and resource management.
As you may or may not know, RAII has another name, Scope-Bound Resource Management (SBRM), after the basic use case where the lifetime of an RAII object ends due to scope exit. Remember one advantage of using value semantics? Safety. We know exactly when an object's lifetime starts and ends just by looking at its storage duration, and 99% of the time we'll find it at block scope, which makes it very simple. Things get a lot more complicated for pointers and references: now we have to worry about whether the object that is referenced or pointed to has been released. This is hard, and what makes it worse is that these objects usually live in a different scope from their pointers and references.
It's obvious why value semantics gets along well with RAII: RAII binds the life cycle of a resource to the lifetime of an object, and with value semantics, you have a clear idea of an object's lifetime.
But, resource is about identity…
Though value semantics and RAII seem to be a perfect match, in reality they are not. Why? Fundamentally speaking, because a resource is about identity, while value semantics only cares about value. You have an open socket, you use that very socket; you have an open file, you use that very file. In the context of resource management, there are no two things with the same value. A resource represents itself, with a unique identity.
See the contradiction here? Prior to C++11, if we stuck with value semantics, it was hard to work with resources because they cannot be copied, so programmers came up with some workarounds:
Use raw pointers;
Write their own movable-but-not-copyable class (often involving a private copy ctor and operations like swap and splice - sketched below);
Use auto_ptr.
These solutions were intended to solve the problem of unique ownership and ownership transfer, but they all have drawbacks. I won't talk about them here because it's all over the Internet. What I would like to stress is that, even without move semantics, resource ownership management could be done; it just took more code and was often error-prone.
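As a reminder of what the second workaround above looked like, here is a rough C++03-style sketch (FileHandle and its details are made up; the point is the private, never-defined copy operations plus an explicit swap for transferring ownership):

// C++03-era movable-but-not-copyable workaround
class FileHandle {
    int fd;
    FileHandle(const FileHandle&);            // private and never defined: no copies
    FileHandle& operator=(const FileHandle&); // ditto
public:
    explicit FileHandle(int fd_) : fd(fd_) {}
    ~FileHandle() { /* close(fd) */ }
    void swap(FileHandle& other) { int t = fd; fd = other.fd; other.fd = t; }
};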
What is lacking is uniform syntax and semantics to enable generic code to move arbitrary objects (just as generic code today can copy arbitrary objects).
- Move Semantics Proposal
Compared to the above statement from the proposal, I like this answer more:
In addition to the obvious efficiency benefit, this also affords a programmer a standards-compliant way to have objects that are movable but not copyable. Objects that are movable and not copyable convey a very clear boundary of resource ownership via standard language semantics …my point is that move semantics is now a standard way to concisely express (among other things) movable-but-not-copyable objects.
The above quote does a pretty good job of explaining what move semantics means for resource ownership management in C++. A resource should naturally be movable (by "movable" I mean transferable) but not copyable; now, with the help of move semantics (well, actually a whole lot of changes at the language level to support it), there's a standard way to do this right and efficiently.
The Rebirth of Value Semantics
Finally, we are able to talk about the other aspect(besides performance) of augmentation that move semantics brought to value semantics.
Stepping through the above discussion, we've seen why value semantics fits the RAII model but is at the same time not compatible with resource management. With the arrival of move semantics, the necessary materials to fill this gap are finally in place. So here we have: smart pointers!
Needless to say how important std::unique_ptr and std::shared_ptr are; here I'd like to emphasize three things:
They follow RAII;
They take huge advantage of move semantics (especially unique_ptr);
They help keep value semantics.
For the third point, if you've read Rule of Zero, you know what I'm talking about. No need to use raw pointers to manage resources, EVER: just use unique_ptr directly or store it as a member variable, and you're done. When transferring resource ownership, the implicitly generated move ctor does the job well. Better yet, the current specification ensures that a named value in a return statement is, in the worst case (i.e. without elision), treated as an rvalue. It means returning by value should be the default choice for unique_ptr.
std::unique_ptr<ExpensiveResource> foo() {
    auto data = std::make_unique<ExpensiveResource>();
    return data;
}

std::unique_ptr<ExpensiveResource> p = foo(); // a move at worst
See here for a more detailed explanation. In fact, when using unique_ptr as a function parameter, passing by value is still the best choice. I'll probably write an article about it if time allows.
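A minimal sketch of that by-value "sink" style (ExpensiveResource and store are placeholders, not names from the article):

#include <memory>
#include <utility>

struct ExpensiveResource { /* ... */ };

// Taking unique_ptr by value states in the signature that the function assumes
// ownership; callers must hand the pointer over explicitly with std::move.
void store(std::unique_ptr<ExpensiveResource> p) {
    // keep p somewhere, or let it die here - either way, ownership is ours now
}

void caller() {
    auto r = std::make_unique<ExpensiveResource>();
    store(std::move(r));   // ownership visibly transferred; r is now empty
}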
Besides smart pointers, std::string and std::vector are also RAII wrappers; the resource they manage is heap memory. For these classes, returning by value is still preferred. I'm not too sure about other things like std::thread or std::lock_guard because I haven't had a chance to use them.
To summarize, by utilizing smart pointers, value semantics now truly gains compatibility with RAII. At its core, this is powered by move semantics.
Summary
So far we've gone through a lot of concepts and you probably feel overwhelmed, but the points I want to convey are simple:
Move semantics boosts performance while keeping value semantics;
Move semantics helps bring every piece of resource management together to become what it is today. In particular, it is the key that makes value semantics and RAII truly work together, as it should have been long ago.
I'm a learner on this topic myself, so feel free to point out anything that you feel is wrong, I really appreciate it.
[1]: Here object means "a piece of memory that has an address, a type, and is capable of storing values", from Andrzej's C++ blog.

Performance of resizing std::vector<std::unique_ptr<T>>

The general conception seems to be that std::unique_ptr has no time overhead compared to properly used owning raw pointers, given sufficient optimization.
But what about using std::unique_ptr in compound data structures, in particular std::vector<std::unique_ptr<T>>? For instance, resizing the underlying data of a vector, which can happen during push_back. To isolate the performance, I loop around pop_back, shrink_to_fit, emplace_back:
#include <chrono>
#include <vector>
#include <memory>
#include <iostream>

constexpr size_t size = 1000000;
constexpr size_t repeat = 1000;
using my_clock = std::chrono::high_resolution_clock;

template<class T>
auto test(std::vector<T>& v) {
    v.reserve(size);
    for (size_t i = 0; i < size; i++) {
        v.emplace_back(new int());
    }
    auto t0 = my_clock::now();

    for (int i = 0; i < repeat; i++) {
        auto back = std::move(v.back());
        v.pop_back();
        v.shrink_to_fit();
        if (back == nullptr) throw "don't optimize me away";
        v.emplace_back(std::move(back));
    }
    return my_clock::now() - t0;
}

int main() {
    std::vector<std::unique_ptr<int>> v_u;
    std::vector<int*> v_p;

    auto millis_p = std::chrono::duration_cast<std::chrono::milliseconds>(test(v_p));
    auto millis_u = std::chrono::duration_cast<std::chrono::milliseconds>(test(v_u));
    std::cout << "raw pointer: " << millis_p.count() << " ms, unique_ptr: " << millis_u.count() << " ms\n";

    for (auto p : v_p) delete p; // I don't like memory leaks ;-)
}
Compiling the code with -O3 -o -march=native -std=c++14 -g with gcc 7.1.0, clang 3.8.0, and icc 17.0.4 on Linux on an Intel Xeon E5-2690 v3 @ 2.6 GHz (no turbo):
raw pointer: 2746 ms, unique_ptr: 5140 ms (gcc)
raw pointer: 2667 ms, unique_ptr: 5529 ms (clang)
raw pointer: 1448 ms, unique_ptr: 5374 ms (intel)
The raw pointer version spends all its time in an optimized memmove (intel seems to have a much better one than clang and gcc). The unique_ptr code seems to first copy the vector data from one memory block to the other and assign zero to the original - all in a horribly un-optimized loop. And then it loops over the original block of data again to see if any of the values that were just zeroed are nonzero and need to be deleted. The full gory detail can be seen on godbolt. The question is not how the compiled code differs, that is pretty clear. The question is why the compiler fails to optimize what is generally regarded as a no-extra-overhead abstraction.
Trying to understand how the compilers reason about handling std::unique_ptr, I was looking a bit more at isolated code. For instance:
void foo(std::unique_ptr<int>& a, std::unique_ptr<int>& b) {
    a.release();
    a = std::move(b);
}
or the similar
a.release();
a.reset(b.release());
none of the x86 compilers seem to be able to optimize away the senseless if (ptr) delete ptr;. The Intel compiler even gives the delete a 28 % chance. Surprisingly, the delete check is consistently omitted for:
auto tmp = b.release();
a.release();
a.reset(tmp);
These bits are not the main aspect of this question, but all of this makes me feel that I am missing something.
Why do various compilers fail to optimize reallocation within std::vector<std::unique_ptr<int>>? Is there anything in the standard that prevents generating code as efficient as with raw pointers? Is this an issue with the standard library implementation? Or are the compilers just not clever enough (yet)?
What can one do to avoid performance impact compared to using raw pointers?
Note: Assume that T is polymorphic and expensive to move, so std::vector<T> is not an option.
The claim that unique_ptr performs as well as a raw pointer after optimization mostly applies only to the basic operations on a single pointer, such as creation, dereferencing, assignment of a single pointer and deletion. Those operations are defined simply enough that an optimizing compiler can usually make the required transformations such that the resulting code is equivalent (or nearly so) in performance to the raw version0.
One place this falls apart is in higher-level optimizations on array-based containers such as std::vector, as you have noted with your test. These containers usually use source-level optimizations which depend on type traits to determine at compile time if a type can safely be copied using a byte-wise copy such as memcpy, and delegate to such a method if so, or otherwise fall back to an element-wise copy loop.
To be safely copyable with memcpy an object must be trivially copyable. Now std::unique_ptr is not trivially copyable since indeed it fails several of the requirements such as having only trivial or deleted copy and move constructors. The exact mechanism depends on the standard library involved, but in general a quality std::vector implementation will end up calling a specialized form of something like std::uninitialized_copy for trivially-copyable types that just delegates to memmove.
The typical implementation details are quite tortured, but for libstdc++ (used by gcc) you can see the high-level divergence in std::uninitialized_copy:
template<typename _InputIterator, typename _ForwardIterator>
inline _ForwardIterator
uninitialized_copy(_InputIterator __first, _InputIterator __last,
                   _ForwardIterator __result)
{
    ...
    return std::__uninitialized_copy<__is_trivial(_ValueType1)
                                     && __is_trivial(_ValueType2)
                                     && __assignable>::
        __uninit_copy(__first, __last, __result);
}
From there you can take my word that many of the std::vector "movement" methods end up here, and that __uninitialized_copy<true>::__uninit_copy(...) ultimately calls memmove while the <false> version doesn't - or you can trace through the code yourself (but you already saw the result in your benchmark).
Ultimately then, you end up with several loops that perform the required copy steps for non-trivial objects, such as calling the move constructor of the destination object, and subsequently calling the destructor of all the source objects. These are separate loops, and even modern compilers will pretty much not be able to reason about something like "OK, in the first loop I moved all the destination objects so their ptr member will be null, so the second loop is a no-op". Finally, to equal the speed of raw pointers, not only would compilers need to optimize across these two loops, they would need to have a transformation which recognizes that the whole thing can be replaced by memcpy or memmove2.
So one answer to your question is that compilers just aren't smart enough to do this optimization, but it's largely because the "raw" version has a lot of compile-time help to skip the need for this optimization entirely.
Loop Fusion
As mentioned the existing vector implementations implement a resize-type operation in two separate loops (in addition to non-loop work such as allocating the new storage and freeing the old storage):
Copying the source objects into the newly allocated destination array (conceptually using something like placement new calling the move constructor).
Destroying the source objects in the old region.
Conceptually you could imagine an alternative way: doing this all in one loop, copying each element and then immediately destroying it. It is possible that a compiler could even notice that the two loops iterate over the same set of values and fuse them into one. Apparently, however, gcc doesn't do any loop fusion today (https://gcc.gnu.org/ml/gcc/2015-04/msg00291.html), and neither do clang or icc, if you believe this test.
So then we are left trying to put the loops together explicitly at the source level.
Now the two-loop implementation helps preserve the exception safety contract of the operation by not destroying any source objects until we know the construction part of the copy has completed, but it also helps to optimize the copy and destruction when we have trivially-copyable and trivially-destructible objects, respectively. In particular, with simple trait-based selection we can replace the copy with a memmove and the destruction loop can be elided entirely3.
So the two-loop approach helps when those optimizations apply, but it actually hurts in the general case of objects which are neither trivially copyable nor trivially destructible. It means you need two passes over the objects and you lose the opportunity to optimize and eliminate code between the copy of an object and its subsequent destruction. In the unique_ptr case you lose the ability for the compiler to propagate the knowledge that the source unique_ptr will have a NULL internal ptr member and hence skip the if (ptr) delete ptr check entirely4.
Trivially Movable
Now one might ask whether we could apply the same type-trait compile-time optimization to the unique_ptr case. For example, one might look at the trivially copyable requirements and see that they are perhaps too strict for the common move operations in std::vector. Sure, a unique_ptr is evidently not trivially copyable since a bit-wise copy would leave both the source and destination object owning the same pointer (and result in double-deletion), but it seems that it should be bit-wise movable: if you move a unique_ptr from one area of memory to another, such that you no longer consider the source a live object (and hence won't call its destructor), it should "just work", for the typical unique_ptr implementation.
Unfortunately, no such "trivial move" concept exists, although you could try to roll your own. There seems to be an open debate about whether this is UB or not for objects that can be byte-wise copied and do not depend on their constructor or destructor behavior in the move scenario.
You could always implement your own trivially movable concept, which would be something like (a) the object has a trivial move constructor and (b) when used as the source argument of the move constructor the object is left in a state where its destructor has no effect. Note that such a definition is currently mostly useless, since "trivial move constructor" (basically element-wise copy and nothing else) is not consistent with any modification of the source object. So, for example, a trivial move constructor cannot set the ptr member of the source unique_ptr to zero. So you'd need to jump through some more hoops, such as introducing the concept of a destructive move operation which leaves the source object destroyed, rather than in a valid-but-unspecified state.
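For comparison, the closest standard-conforming stand-in today is a helper that move-constructs the target and then immediately destroys the source (the name relocate_at echoes the later relocation proposals, but the code below is just ordinary C++; the compiler still sees two separate operations and will not collapse them into a memcpy by itself):

#include <new>
#include <utility>

// "Destructive move": construct *dst from *src, then end src's lifetime.
template <class T>
T* relocate_at(T* dst, T* src) {
    T* result = ::new (static_cast<void*>(dst)) T(std::move(*src));
    src->~T();
    return result;
}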
You can find some more detailed discussion of this "trivially movable" idea in this thread on the ISO C++ usenet discussion group. In particular, in the linked reply, the exact issue of vectors of unique_ptr is addressed:
It turns out many smart pointers (unique_ptr and shared_ptr included)
fall into all three of those categories and by applying them you can
have vectors of smart pointers with essentially zero overhead over raw
pointers even in non-optimized debug builds.
See also the relocator proposal.
0 Although the non-vector examples at the end of your question show that this isn't always the case. Here it is due to possible aliasing, as zneak explains in his answer. Raw pointers will avoid many of these aliasing issues since they lack the indirection that unique_ptr has (e.g., you pass a raw pointer by value, rather than a structure with a pointer by reference) and can often omit the if (ptr) delete ptr check entirely.
2 This is actually harder than you might think, because memmove, for example, has subtly different semantics than an object copy loop when the source and destination overlap. Of course the high level type traits code that works for raw pointers knows (by contract) that there is no overlap, or the behavior of memmove is consistent even if there is overlap, but proving the same thing at some later arbitrary optimization pass may be much harder.
3 It is important to note that these optimizations are more or less independent. For example, many objects are trivially destructible that are not trivially copyable.
4 Although in my test neither gcc nor clang were able to suppress the check, even with __restrict__ applied, apparently due to insufficiently powerful aliasing analysis, or perhaps because std::move strips the "restrict" qualifier somehow.
I don't have a precise answer for what is biting you in the back with vectors; looks like BeeOnRope might already have one for you.
Luckily, I can tell you what's biting you in the back for your micro-example involving different ways to reset pointers: alias analysis. Specifically, the compilers are unable to prove (or unwilling to infer) that the two unique_ptr references don't overlap. They force themselves to reload the unique_ptr value in case the write to the first one has modified the second one. baz doesn't suffer from it because the compiler can prove that neither parameter, in a well-formed program, could possibly alias with tmp, which has function-local automatic storage.
You can verify this by adding the __restrict__ keyword (which, as the double underscore somewhat implies, is not standard C++) to either unique_ptr reference parameter. That keyword informs the compiler that the reference is the only reference through which that memory can possibly be accessed, and therefore there is no risk that anything else can alias with it. When you do it, all three versions of your function compile to the same machine code and don't bother checking if the unique_ptr needs to be deleted.
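Concretely, assuming the GCC/Clang-specific __restrict__ extension is available, the question's first snippet would become something like:

#include <memory>
#include <utility>

// __restrict__ (non-standard) promises that `a` is the only reference through
// which its unique_ptr can be reached, so the compiler may drop the defensive
// reload and the redundant "if (ptr) delete ptr" check.
void foo(std::unique_ptr<int>& __restrict__ a, std::unique_ptr<int>& b) {
    a.release();
    a = std::move(b);
}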

Are C++11 move semantics doing something new, or just making semantics clearer?

I am basically trying to figure out, is the whole "move semantics" concept something brand new, or it is just making existing code simpler to implement? I am always interested in reducing the number of times I call copy/constructors but I usually pass objects through using reference (and possibly const) and ensure I always use initialiser lists. With this in mind (and having looked at the whole ugly && syntax) I wonder if it is worth adopting these principles or simply coding as I already do? Is anything new being done here, or is it just "easier" syntactic sugar for what I already do?
TL;DR
This is definitely something new and it goes well beyond just being a way to avoid copying memory.
Long Answer: Why it's new and some perhaps non-obvious implications
Move semantics are just what the name implies--that is, a way to explicitly declare instructions for moving objects rather than copying. In addition to the obvious efficiency benefit, this also affords a programmer a standards-compliant way to have objects that are movable but not copyable. Objects that are movable and not copyable convey a very clear boundary of resource ownership via standard language semantics. This was possible in the past, but there was no standard/unified (or STL-compatible) way to do this.
This is a big deal because having a standard and unified semantic benefits both programmers and compilers. Programmers don't have to spend time potentially introducing bugs into a move routine that can reliably be generated by compilers (most cases); compilers can now make appropriate optimizations because the standard provides a way to inform the compiler when and where you're doing standard moves.
Move semantics is particularly interesting because it very well suits the RAII idiom, which is a long-standing cornerstone of C++ best practice. RAII encompasses much more than just this example, but my point is that move semantics is now a standard way to concisely express (among other things) movable-but-not-copyable objects.
You don't always have to explicitly define this functionality in order to prevent copying. A compiler feature known as "copy elision" will eliminate quite a lot of unnecessary copies from functions that pass by value.
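For example (named return value optimization; not guaranteed for named variables even in C++17, but ubiquitous in practice, and at worst it degrades to a move):

#include <vector>

std::vector<int> make_values() {
    std::vector<int> v;   // typically constructed directly in the caller's storage
    v.push_back(1);
    v.push_back(2);
    return v;             // NRVO: usually no copy and no move at all
}

// auto values = make_values();  // still cheap even if elision does not happen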
Criminally-Incomplete Crash Course on RAII (for the uninitiated)
I realize you didn't ask for a code example, but here's a really simple one that might benefit a future reader who might be less familiar with the topic or the relevance of Move Semantics to RAII practices. (If you already understand this, then skip the rest of this answer)
// non-copyable class that manages lifecycle of a resource
// note: non-virtual destructor--probably not an appropriate candidate
// for serving as a base class for objects handled polymorphically.
class res_t {
    using handle_t = /* whatever */;
    handle_t* handle; // Pointer to owned resource
public:
    res_t( const res_t& src ) = delete;            // no copy constructor
    res_t& operator=( const res_t& src ) = delete; // no copy-assignment
    res_t( res_t&& src ) = default;                // Move constructor
    res_t& operator=( res_t&& src ) = default;     // Move-assignment
    res_t();  // Default constructor
    ~res_t(); // Destructor
};
Objects of this class will allocate/provision whatever resource is needed upon construction and then free/release it upon destruction. Since the resource pointed to by the data member can never accidentally be transferred to another object, the rightful owner of a resource is never in doubt. In addition to making your code less prone to abuse or errors (and easily compatible with STL containers), your intentions will be immediately recognized by any programmer familiar with this standard practice.
In the Turing Tar Pit, there is nothing new under the sun. Everything that move semantics does, can be done without move semantics -- it just takes a lot more code, and is a lot more fragile.
What move semantics does is take a particular common pattern that massively increases efficiency and safety in a number of situations, and embed it in the language.
It increases efficiency in obvious ways. Moving, be it via swap or move construction, is much faster for many data types than copying. You could create special interfaces to indicate when things can be moved from, but honestly people didn't do that. With move semantics, it becomes relatively easy to do. Compare the cost of moving a std::vector to copying it: a move costs roughly copying 3 pointers, while a copy requires a heap allocation, copying every element in the container, and creating 3 pointers.
Even more so, compare reserve on a move-aware std::vector to a copy-only aware one: suppose you have a std::vector of std::vector. In C++03, that was performance suicide if you didn't know the dimensions of every component ahead of time -- in C++11, move semantics makes it as smooth as silk, because it is no longer repeatedly copying the sub-vectors whenever the outer vector resizes.
Move semantics lets every "pImpl pattern" type have blazing fast performance, which means you can start having complex objects that behave like values instead of having to deal with and manage pointers to them.
On top of these performance gains, and opening up complex-class-as-value, move semantics also opens up a whole host of safety measures, and allows doing some things that were not very practical before.
std::unique_ptr is a replacement for std::auto_ptr. They both do roughly the same thing, but std::auto_ptr treated copies as moves. This made std::auto_ptr ridiculously dangerous to use in practice. Meanwhile, std::unique_ptr just works. It represents unique ownership of some resource extremely well, and transfer of ownership can happen easily and smoothly.
You know the problem whereby you take a foo* in an interface, and sometimes it means "this interface is taking ownership of the object" and sometimes it means "this interface just wants to be able to modify this object remotely", and you have to delve into API documentation and sometimes source code to figure out which?
std::unique_ptr actually solves this problem -- interfaces that want to take ownership can now take a std::unique_ptr<foo>, and the transfer of ownership is obvious at both the API level and in the code that calls the interface. std::unique_ptr is an auto_ptr that just works, and has the unsafe portions removed and replaced with move semantics. And it does all of this with nearly perfect efficiency.
std::unique_ptr is a transferable RAII representation of resource whose value is represented by a pointer.
After you write make_unique<T>(Args&&...), unless you are writing really low level code, it is probably a good idea to never call new directly again. Move semantics basically have made new obsolete.
Other RAII representations are often non-copyable. A port, a print session, an interaction with a physical device -- all of these are resources for whom "copy" doesn't make much sense. Most every one of them can be easily modified to support move semantics, which opens up a whole host of freedom in dealing with these variables.
Move semantics also allows you to put your return values in the return part of a function. The pattern of taking return values by reference (and documenting "this one is out-only, this one is in/out", or failing to do so) can be somewhat replaced by returning your data.
So instead of void fill_vec( std::vector<foo>& ), you have std::vector<foo> get_vec(). This even works with multiple return values -- std::tuple< std::vector<A>, std::set<B>, bool > get_stuff() can be called, and you can load your data into local variables efficiently via std::tie( my_vec, my_set, my_bool ) = get_stuff().
Output parameters can be semantically output-only, with very little overhead (the above, in a worst case, costs 8 pointer and 2 bool copies, regardless of how much data we have in those containers -- and that overhead can be as little as 0 pointer and 0 bool copies with a bit more work), because of move semantics.
There is absolutely something new going on here. Consider unique_ptr which can be moved, but not copied because it uniquely holds ownership of a resource. That ownership can then be transferred by moving it to a new unique_ptr if needed, but copying it would be impossible (as you would then have two references to the owned object).
While many uses of moving may have positive performance implications, the movable-but-not-copyable types are a much bigger functional improvement to the language.
In short, use the new techniques where it indicates the meaning of how your class should be used, or where (significant) performance concerns can be alleviated by movement rather than copy-and-destroy.
No answer is complete without a reference to Thomas Becker's painstakingly exhaustive write up on rvalue references, perfect forwarding, reference collapsing and everything related to that.
see here: http://thbecker.net/articles/rvalue_references/section_01.html
I would say yes because a Move Constructor and Move Assignment operator are now compiler defined for objects that do not define/protect a destructor, copy constructor, or copy assignment.
This means that if you have the following code...
struct intContainer
{
    std::vector<int> v;
};

intContainer CreateContainer()
{
    intContainer c;
    c.v.push_back(3);
    return c;
}
The code above would be optimized simply by recompiling with a compiler that supports move semantics. Your container c will have compiler-defined move semantics and thus will call std::vector's move operations without any changes to your code.
Since move semantics only apply in the presence of rvalue references, which are declared by a new token, &&, it seems very clear that they are something new.
In principle, they are purely an optimizing technique, which means that:
1. you don't use them until the profiler says it is necessary, and
2. in theory, optimizing is the compiler's job, and move semantics aren't any more necessary than register.
Concerning 1, we may, in time, end up with a ubiquitous heuristic as to how to use them: after all, passing an argument by const reference, rather than by value, is also an optimization, but the ubiquitous convention is to pass class types by const reference, and all other types by value.
Concerning 2, compilers just aren't there yet. At least, the usual ones. The basic principles which could be used to make move semantics irrelevant are (well?) known, but to date, they tend to result in unacceptable compile times for real programs.
As a result: if you're writing a low-level library, you'll probably want to consider move semantics from the start. Otherwise, they're just extra complication, and should be ignored, until the profiler says otherwise.

Does D have something akin to C++0x's move semantics?

A problem of "value types" with external resources (like std::vector<T> or std::string) is that copying them tends to be quite expensive, and copies are created implicitly in various contexts, so this tends to be a performance concern. C++0x's answer to this problem is move semantics, which is conceptionally based on the idea of resource pilfering and technically powered by rvalue references.
Does D have anything similar to move semantics or rvalue references?
I believe that there are several places in D (such as returning structs) where D manages to do a move whereas C++ would do a copy. IIRC, the compiler will do a move rather than a copy in any case where it can determine that a copy isn't needed, so struct copying is going to happen less in D than in C++. And of course, since classes are references, they don't have the problem at all.
But regardless, copy construction already works differently in D than in C++. Generally, instead of declaring a copy constructor, you declare a postblit constructor: this(this). It does a full memcpy before this(this) is called, and you only make whatever changes are necessary to ensure that the new struct is separate from the original (such as doing a deep copy of member variables where needed), as opposed to creating an entirely new constructor that must copy everything. So, the general approach is already a bit different from C++. It's also generally agreed upon that structs should not have expensive postblit constructors - copying structs should be cheap - so it's less of an issue than it would be in C++. Objects which would be expensive to copy are generally either classes or structs with reference or COW semantics.
Containers are generally reference types (in Phobos, they're structs rather than classes, since they don't need polymorphism, but copying them does not copy their contents, so they're still reference types), so copying them around is not expensive like it would be in C++.
There may very well be cases in D where it could use something similar to a move constructor, but in general, D has been designed in such a way as to reduce the problems that C++ has with copying objects around, so it's nowhere near the problem that it is in C++.
I think all answers completely failed to answer the original question.
First, as stated above, the question is only relevant for structs; classes have no meaningful move. Also, as stated above, for structs a certain amount of moving will happen automatically under certain conditions.
If you wish to get control over the move operations, here's what you have to do. You can disable copying by annotating this(this) with @disable. Next, you can get the equivalent of C++'s Struct(Struct&& that) constructor by defining this(Struct that). Likewise, you can get the equivalent of move assignment with opAssign(Struct that). In both cases, you need to make sure that you destroy the values of that.
For assignment, since you also need to destroy the old value of this, the simplest way is to swap them. An implementation of C++'s unique_ptr would, therefore, look something like this:
struct UniquePtr(T) {
    private T* ptr = null;

    @disable this(this); // This disables both copy construction and opAssign

    // The obvious constructor, destructor and accessor
    this(T* ptr) {
        if (ptr !is null)
            this.ptr = ptr;
    }
    ~this() {
        freeMemory(ptr);
    }
    inout(T)* get() inout {
        return ptr;
    }

    // Move operations
    this(UniquePtr!T that) {
        this.ptr = that.ptr;
        that.ptr = null;
    }
    ref UniquePtr!T opAssign(UniquePtr!T that) { // Notice no "ref" on "that"
        swap(this.ptr, that.ptr); // We change it anyways, because it's a temporary
        return this;
    }
}
Edit:
Notice I did not define opAssign(ref UniquePtr!T that). That is the copy assignment operator, and if you try to define it, the compiler will error out because you declared, in the @disable line, that you have no such thing.
D has separate value and object semantics:
if you declare your type as struct, it will have value semantics by default
if you declare your type as class, it will have object semantics.
Now, assuming you don't manage the memory yourself, as is the default case in D - using a garbage collector - you have to understand that objects of types declared as class are automatically pointers (or "references" if you prefer) to the real object, not the real object itself.
So, when passing vectors around in D, what you pass is the reference/pointer. Automatically. No copy involved (other than the copy of the reference).
That's why D, C#, Java and other languages don't "need" move semantics (as most types have object semantics and are manipulated by reference, not by copy).
Maybe they could implement it, I'm not sure. But would they really get a performance boost as in C++? By its nature, that doesn't seem likely.
I somehow have the feeling that the rvalue references and the whole concept of "move semantics" are actually a consequence of the fact that in C++ it's normal to create local, "temporary" stack objects. In D and most GC languages, it's most common to have objects on the heap, and then there's no overhead from having a temporary object copied (or moved) several times when returning it through a call stack - so there's no need for a mechanism to avoid that overhead either.
In D (and most GC languages) a class object is never copied implicitly and you're only passing the reference around most of the time, so this may mean that you don't need any rvalue references for them.
OTOH, struct objects are NOT supposed to be "handles to resources", but simple value types behaving similarly to builtin types - so again, no reason for any move semantics here, IMHO.
This would yield a conclusion - D doesn't have rvalue refs because it doesn't need them.
However, I haven't used rvalue references in practice, I've only had a read on them, so I might have skipped some actual use cases of this feature. Please treat this post as a bunch of thoughts on the matter which hopefully would be helpful for you, not as a reliable judgement.
I think if you need the source to lose the resource you might be in trouble. However, being GC'ed, you can often avoid needing to worry about multiple owners, so it might not be an issue in most cases.