Copy initialization is effective with move in C++11? - c++

Copy initialization is when a temporary holding "Hello" is created and the copy constructor is then used to initialize s, right?
std::string s = std::string("Hello");
Now that C++11 has introduced move semantics, can I say that the code above is as efficient (i.e. it eliminates the copy) as this case:
std::string s("Hello");
EDIT: please don't answer for string specifically. string was just an example of a class. I'm not asking about SSO; I'm asking in general.

When you use strings smaller than 20 characters (depending on the implementation), short string optimization kicks in, and everything is copied anyway.
But to answer your question, move semantics is not used in either of your examples anyway. In the first case, even though both the copy constructor and the string(const char*) constructor must be available, copy elision will eliminate the copy.
EDIT:
To address your edit, if you have one initialization and one initialization + move constructor, the former will obviously always be faster. The reason I brought up SSO is that people assume move operations are always cheap (or even free), but that is not necessarily so, and sometimes they don't even happen at all.
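To make the comparison concrete for a class other than string, here is a minimal sketch that counts which constructors run in each form. Probe, Counts and the two helper functions are hypothetical names invented for this illustration. With copy elision (optional before C++17, effectively guaranteed for this pattern since C++17) both forms call only the converting constructor; without it, the first form adds one move.

```cpp
#include <string>
#include <utility>

// Hypothetical probe type: counts which constructors run.
struct Probe {
    static int converting, copies, moves;
    std::string text;
    Probe(const char* s) : text(s) { ++converting; }
    Probe(const Probe& o) : text(o.text) { ++copies; }
    Probe(Probe&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Probe::converting = 0;
int Probe::copies = 0;
int Probe::moves = 0;

struct Counts { int converting, copies, moves; };

// Copy-initialization from a temporary.
Counts copy_init_counts() {
    Probe::converting = Probe::copies = Probe::moves = 0;
    Probe a = Probe("Hello");
    (void)a;
    return Counts{Probe::converting, Probe::copies, Probe::moves};
}

// Direct-initialization.
Counts direct_init_counts() {
    Probe::converting = Probe::copies = Probe::moves = 0;
    Probe b("Hello");
    (void)b;
    return Counts{Probe::converting, Probe::copies, Probe::moves};
}
```

Calling copy_init_counts() and direct_init_counts() and comparing the moves field shows whether your compiler and flags elided the move; the copy constructor should not be selected in either case.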

Short answer: performance should be the same if copy elision is performed. Otherwise the latter should probably be faster.
Long answer:
This code
std::string s = std::string("Hello");
should call a move constructor in C++11+ code (it requires an accessible one). Copy elision is allowed in this case, although not mandated (cf. [class.copy]/p31):
When certain criteria are met, an implementation is allowed to omit the copy/move construction
These are concepts that were already present in pre-C++11 though (they applied to copy constructors as well).
As to the performances question:
The standard also describes a few situations where copying can be eliminated even if this would alter the program's behavior, the most common being the return value optimization. Another widely implemented optimization, described in the C++ standard, is when a temporary object of class type is copied to an object of the same type.[1] As a result, copy-initialization is usually equivalent to direct-initialization in terms of performance, but not in semantics; copy-initialization still requires an accessible copy constructor.[2]
Source - Copy elision
If copy elision doesn't take place (e.g. it has been disabled in gcc via -fno-elide-constructors, or the compiler won't perform it for whatever reason), then performance will probably not be the same and direct initialization should be faster (in this case, for std::string, SSO might also take a toll on the move).

Actually, this was the case before C++11 because of copy elision.
Note that std::string is not necessarily cheap to move, as small strings may be held in the object itself rather than being dynamically allocated. This is known as small string optimisation.
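A rough way to observe this is to check whether the character buffer changed hands during the move. The helper below is a hypothetical sketch; its result depends on the standard library implementation, so treat it as a probe rather than a guarantee.

```cpp
#include <string>
#include <utility>

// Returns true if move-constructing from `s` handed over its character buffer,
// i.e. the new string points at the same storage the old one used.
bool move_steals_buffer(std::string s) {
    const char* before = s.data();
    std::string moved = std::move(s);
    return moved.data() == before;
}
```

On implementations with small string optimisation, move_steals_buffer("hi") is expected to return false because the characters live inside the string object and have to be copied, while move_steals_buffer(std::string(1000, 'x')) is expected to return true because only the heap pointer is transferred.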

Related

Where do standard library or compilers leverage noexcept move semantics (other than vector growth)?

Move operations should be noexcept; in the first place for intuitive and reasonable semantics. The second argument is runtime performance. From the Core Guidelines, C.66, "Make move operations noexcept":
A throwing move violates most people’s reasonably assumptions. A non-throwing move will be used more efficiently by standard-library and language facilities.
The canonical example for the performance part of this guideline is the case when std::vector::push_back or friends need to grow the buffer. The standard requires a strong exception guarantee here, so the vector can only move-construct the elements into the new buffer if the element's move constructor is noexcept; otherwise, the elements must be copied. I get that, and the difference is visible in benchmarks.
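For reference, the vector case can be reproduced with a small counting type. This is only a sketch: Elem, Counts and grow_once are hypothetical names, and the counts assume the element type is copyable (a move-only type is moved even if its move constructor may throw).

```cpp
#include <vector>

// Elem<true> has a noexcept move constructor; Elem<false> has one that may throw.
template <bool NoexceptMove>
struct Elem {
    static int copies, moves;
    Elem() = default;
    Elem(const Elem&) { ++copies; }
    Elem(Elem&&) noexcept(NoexceptMove) { ++moves; }
};
template <bool N> int Elem<N>::copies = 0;
template <bool N> int Elem<N>::moves = 0;

struct Counts { int copies, moves; };

// Puts one element into a vector, then forces exactly one reallocation
// and reports how that element was transferred to the new buffer.
template <bool NoexceptMove>
Counts grow_once() {
    std::vector<Elem<NoexceptMove>> v;
    v.emplace_back();
    Elem<NoexceptMove>::copies = 0;
    Elem<NoexceptMove>::moves = 0;
    v.reserve(v.capacity() + 1);  // capacity must grow, so the buffer is reallocated
    return Counts{Elem<NoexceptMove>::copies, Elem<NoexceptMove>::moves};
}
```

grow_once<true>() should report one move and no copies, while grow_once<false>() should report one copy and no moves.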
However, apart from this, I have a hard time finding real-world evidence of the positive performance impact of noexcept move semantics. Skimming through the standard library (libcxx + grep), we see that std::move_if_noexcept exists, but it's almost not used within the library itself. Similarly, std::is_noexcept_swappable is merely used for fleshing out conditional noexcept qualifiers. This doesn't match existing claims, for example this one from "C++ High Performance" by Andrist and Sehr (2nd ed., p. 153):
All algorithms use std::swap() and std::move() when moving elements around, but only if the move constructor and move assignment are marked noexcept. Therefore, it is important to have these implemented for heavy objects when using algorithms. If they are not available and exception free, the elements will be copied instead.
To break my question into pieces:
Are there code paths in the standard library similar to the std::vector::push_back, that run faster when fed with std::is_nothrow_move_constructible types?
Am I correct to conclude that the cited paragraph from the book is not correct?
Is there an obvious example for when the compiler will reliably generate more runtime-efficient code when a type adheres to the noexcept guideline?
I know the third one might be a bit blurry. But if someone could come up with a simple example, this would be great.
Background: I refer to std::vector's use of noexcept as "the vector pessimization." I claim that the vector pessimization is the only reason anyone ever cared about putting a noexcept keyword into the language. Furthermore, the vector pessimization applies only to the element type's move constructor. I claim that marking your move-assignment or swap operations as noexcept has no "in-game effect"; leaving aside whether it might be philosophically satisfying or stylistically correct, you shouldn't expect it to have any effect on your code's performance.
Let's check a real library implementation and see how close I am to wrong. ;)
Vector reallocation. libc++'s headers use move_if_noexcept only inside __construct_{forward,backward}_with_exception_guarantees, which is used only inside vector reallocation.
Assignment operator for variant. Inside __assign_alt, the code tag-dispatches on is_nothrow_constructible_v<_Tp, _Arg> || !is_nothrow_move_constructible_v<_Tp>. When you do myvariant = arg;, the default "safe" approach is to construct a temporary _Tp from the given arg, and then destroy the currently emplaced alternative, and then move-construct that temporary _Tp into the new alternative (which hopefully won't throw). However, if we know that the _Tp is nothrow-constructible directly from arg, we'll just do that; or, if _Tp's move-constructor is throwing, such that the "safe" approach isn't actually safe, then it's not buying us anything and we'll just do the fast direct-construction approach anyway.
Btw, the assignment operator for optional does not do any of this logic.
Notice that for variant assignment, having a noexcept move constructor actually hurts (unoptimized) performance, unless you have also marked the selected converting constructor as noexcept! Godbolt.
(This experiment also turned up an apparent bug in libstdc++: #99417.)
string appending/inserting/assigning. This is a surprising one. string::append makes a call to __append_forward_unsafe under a SFINAE check for __libcpp_string_gets_noexcept_iterator. When you do s1.append(first, last), we'd like to do s1.resize(s1.size() + std::distance(first, last)) and then copy into those new bytes. However, this doesn't work in three situations: (1) If first, last point into s1 itself. (2) If first, last are exactly input_iterators (e.g. reading from an istream_iterator), such that it's known impossible to iterate the range twice. (3) If it's possible that iterating the range once could put it into a bad state where iterating the second time would throw. That is, if any of the operations in the second loop (++, ==, *) are non-noexcept. So in any of those three situations, we take the "safe" approach of constructing a temporary string s2(first, last) and then s1.append(s2). Godbolt.
I would bet money that the logic controlling this string::append optimization is incorrect. (EDIT: yes, it is.) See "Attribute noexcept_verify" (2018-06-12). Also observe in that godbolt that the operation whose noexceptness matters to libc++ is rv == rv, but the one it actually calls inside std::distance is lv != lv.
The same logic applies even harder in string::assign and string::insert. We need to iterate the range while modifying the string. So we need either a guarantee that the iterator operations are noexcept, or a way to "back out" our changes when an exception is thrown. And of course for assign in particular, there's not going to be any way to "back out" our changes. The only solution in that case is to copy the input range into a temporary string and then assign from that string (because we know string::iterator's operations are noexcept, so they can use the optimized path).
libc++'s string::replace does not do this optimization; it always copies the input range into a temporary string first.
function SBO. libc++'s function uses its small buffer only when the stored callable object is_nothrow_copy_constructible (and of course is small enough to fit). In that case, the callable is treated as a sort of "copy-only type": even when you move-construct or move-assign the function, the stored callable will be copy-constructed, not move-constructed. function doesn't even require that the stored callable be move-constructible at all!
any SBO. libc++'s any uses its small buffer only when the stored callable object is_nothrow_move_constructible (and of course is small enough to fit). Unlike function, any treats "move" and "copy" as distinct type-erased operations.
Btw, libc++'s packaged_task SBO doesn't care about throwing move-constructors. Its noexcept move-constructor will happily call the move-constructor of a user-defined callable: Godbolt. This results in a call to std::terminate if the callable's move-constructor ever actually does throw. (Confusingly, the error message printed to the screen makes it look as if an exception is escaping out the top of main; but that's not actually what's happening internally. It's just escaping out the top of packaged_task(packaged_task&&) noexcept and being halted there by the noexcept.)
Some conclusions:
To avoid the vector pessimization, you must declare your move-constructor noexcept. I still think this is a good idea.
If you declare your move-constructor noexcept, then to avoid the "variant pessimization," you must also declare all your single-argument converting constructors noexcept. However, the "variant pessimization" merely costs a single move-construct; it does not degrade all the way into a copy-construct. So you can probably eat this cost safely.
Declaring your copy constructor noexcept can enable small-buffer optimization in libc++'s function. However, this matters only for things that are (A) callable and (B) very small and (C) not in possession of a defaulted copy constructor. I think this describes the empty set. Don't worry about it.
Declaring your iterator's operations noexcept can enable a (dubious) optimization in libc++'s string::append. But literally nobody cares about this; and besides, the optimization's logic is buggy anyway. I'm very much considering submitting a patch to rip out that logic, which will make this bullet point obsolete. (EDIT: Patch submitted, and also blogged.)
I'm not aware of anywhere else in libc++ that cares about noexceptness. If I missed something, please tell me! I'd also be very interested to see similar rundowns for libstdc++ and Microsoft.
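As a sanity check on the book's claim about algorithms, here is a small sketch with a type whose move operations are deliberately not noexcept. Heavy, Counts and reverse_and_sort are hypothetical names. std::reverse and std::sort only require the element type to be movable, so they are expected to move the elements regardless of noexcept.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical "heavy" type whose move operations may throw (no noexcept).
struct Heavy {
    static int copies, moves;
    int id;
    explicit Heavy(int i) : id(i) {}
    Heavy(const Heavy& o) : id(o.id) { ++copies; }
    Heavy(Heavy&& o) : id(o.id) { ++moves; }
    Heavy& operator=(const Heavy& o) { id = o.id; ++copies; return *this; }
    Heavy& operator=(Heavy&& o) { id = o.id; ++moves; return *this; }
};
int Heavy::copies = 0;
int Heavy::moves = 0;

struct Counts { int copies, moves; };

// Runs two standard algorithms over the elements and reports
// how many copies and moves they performed.
Counts reverse_and_sort() {
    std::vector<Heavy> v;
    v.reserve(4);
    for (int i = 0; i < 4; ++i) v.emplace_back(i);
    Heavy::copies = Heavy::moves = 0;
    std::reverse(v.begin(), v.end());
    std::sort(v.begin(), v.end(),
              [](const Heavy& l, const Heavy& r) { return l.id < r.id; });
    return Counts{Heavy::copies, Heavy::moves};
}
```

If the quoted paragraph were right, copies would be non-zero here; with the implementations I know of, it stays at zero and only moves is incremented.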
vector push_back, resize, reserve, etc. are a very important case, as vector is expected to be the most used container.
Anyway, take a look at std::function as well; I'd expect it to take advantage of noexcept move for its small object optimization.
That is, when the functor object is small and has a noexcept move constructor, it can be stored in a small buffer inside std::function itself, not on the heap. But if the functor doesn't have a noexcept move constructor, it has to live on the heap (and it isn't moved when the std::function is moved).
Overall, there aren't too many cases indeed.
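The kind of eligibility check this answer has in mind could look roughly like the trait below. Everything here is hypothetical: kSmallBufferSize and fits_small_buffer are invented for illustration and are not any library's actual internals (the libc++ rundown above reports that its function keys on nothrow copy construction instead).

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical buffer size; real implementations choose their own.
constexpr std::size_t kSmallBufferSize = 3 * sizeof(void*);

// Hypothetical trait: a callable may live in the inline buffer only if it is
// small enough, not over-aligned, and its move constructor cannot throw.
template <class F>
struct fits_small_buffer
    : std::integral_constant<bool,
          sizeof(F) <= kSmallBufferSize &&
          alignof(F) <= alignof(std::max_align_t) &&
          std::is_nothrow_move_constructible<F>::value> {};

struct SmallNothrow { int x; };             // implicit move constructor is noexcept
struct SmallThrowing {
    int x;
    SmallThrowing(SmallThrowing&& o) : x(o.x) {}  // user-provided, may throw
};
struct Big { char bytes[256]; };            // too large for the buffer
```

With these definitions, fits_small_buffer<SmallNothrow>::value is true, while SmallThrowing and Big are both rejected and would be heap-allocated by a wrapper using this rule.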

Why was the code required to have an accessible copy/move constructor even when copy-elision was permitted to happen?

Nicol Bolas wrote the following in his answer in SO:
Copy elision was permitted to happen under a number of circumstances. However, even if it was permitted, the code still had to be able to work as if the copy were not elided. Namely, there had to be an accessible copy and/or move constructor.
Why was it necessary (before the advent of "guaranteed copy elision") for the code to maintain a copy/move constructor even when copy-elision was permitted to happen?
Why does "guaranteed copy elision" free the programmer from these requirements?
When copy elision is/was not guaranteed (or required) by the standard, then there is no requirement for a compiler to implement it.
That meant the standard allowed compilers to support copy elision, but did not require them to. And, in practice, a number of compiler vendors chose to not implement copy elision. For those vendors it is a matter of cost - not implementing a feature consumes less developer effort. For programmers (the people who use compilers) it was a quality of implementation concern - a higher quality compiler was more likely to implement desirable optimisations, including copy elision, than a lower quality compiler - but also be more expensive to acquire.
Over time, as higher quality compilers became more freely available (by various definitions of "free" - not all are equivalent to zero cost), the standard was gradually able to mandate more features that were previously optional. But it didn't start that way.
With copy elision being optional, some compilers would rely on accessibility of relevant copy constructors, etc, and some would not. However, the notion of code which complies with requirements of the standard, which builds with one compliant compiler but not another, is naturally undesirable in a standard. Therefore the standard mandated a need for constructors to be accessible, even while permitting an implementation to elide them.
For the code to be guaranteed to work, it has to have some way to work without copy elision for every case where copy elision is not guaranteed.
Because it was just permitted, but not guaranteed.
If an accessible copy constructor is not required, some code would compile when the optimization kicks in, but could fail on some other compiler.
Why was it necessary (before the advent of "guaranteed copy elision") for the code to maintain a copy/move constructor even when copy-elision was permitted to happen?
Because, as others have said, omitting the copy or move was merely permitted. Not every compiler had to omit it, so for consistency the programmer still had to arrange for the copy/move to be possible. And conceptually, there was still a copy/move; whether the compiler actually carried it out is a different story.
Why does "guaranteed copy elision" free the programmer from these requirements?
Because there is no copy or move to begin with. The guaranteed copy "elision" works by completely changing the meaning of T a = T(), saying that T() initializes a instead of a temporary object, therefore at no point the copy or move constructors are even part of the game.
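A type with deleted copy and move constructors makes the difference visible. In this sketch Pinned, make_pinned and read_pinned are hypothetical names, and the preprocessor guard is there only so the snippet still compiles in pre-C++17 modes, where returning Pinned by value would be ill-formed.

```cpp
// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

#if __cplusplus >= 201703L
// C++17 and later: the prvalue initializes the caller's object directly,
// so no copy or move constructor is needed at any point.
Pinned make_pinned(int v) { return Pinned(v); }

int read_pinned() {
    Pinned p = make_pinned(42);
    return p.value;
}
#else
// Before C++17 the two lines above do not compile, because an accessible
// copy or move constructor is required even if the copy would be elided.
int read_pinned() {
    Pinned p(42);
    return p.value;
}
#endif
```

Removing the guard and compiling with -std=c++14 should produce an error about the deleted constructor, which is exactly the old accessibility requirement.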

Is there language level optimization like RVO and NRVO?

RVO is a compiler optimization, but it can provide a really useful performance boost. However, it is not guaranteed and cannot be relied on.
Is there anything in the language standard itself that can optimize return values? Move semantics still copies the member values, correct?
I don't know if I misunderstood your question, but (N)RVO is in fact "in the language standard itself". This is called copy elision and is described in §12.8/31.
Indeed move construction is more computationally expensive than RVO, because as you said it still has to "copy" the member variables from one memory location to another (shallow copy of the object). RVO eliminates these read/write actions entirely.
(N)RVO works in the majority of cases, and where it can't (when a function can return one of multiple variables) move construction kicks in.
AFAIK there is a consensus that all cases of "optimizing return value", as you call it, are sufficiently covered since C++11.
C++11 has the guarantee of implicit move when returning from a function in certain ways.
In addition, you can directly construct the return value using any implicit constructor via return {a,b,c}; syntax.
Finally, if RVO fails it becomes an implicit move, and the first implicit move above is often turned into NRVO, so the techniques align nicely.
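The three return styles mentioned in these answers can be sketched as follows. Tracer, Counts and the function names are hypothetical. The number of moves depends on whether the compiler performs NRVO, so the sketch only relies on the copy constructor never being chosen.

```cpp
// Hypothetical tracer type with a non-explicit three-argument constructor.
struct Tracer {
    static int copies, moves;
    int a, b, c;
    Tracer(int a_, int b_, int c_) : a(a_), b(b_), c(c_) {}
    Tracer(const Tracer& o) : a(o.a), b(o.b), c(o.c) { ++copies; }
    Tracer(Tracer&& o) noexcept : a(o.a), b(o.b), c(o.c) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// A single named local on every return path: NRVO is allowed (not required).
Tracer make_single() {
    Tracer t(1, 2, 3);
    return t;
}

// Two candidate locals: NRVO is usually not possible here, so the return
// statement falls back to an implicit move of the chosen local.
Tracer make_either(bool first) {
    Tracer x(1, 2, 3);
    Tracer y(4, 5, 6);
    if (first) return x;
    return y;
}

// Braced return: the result is constructed directly from the arguments.
Tracer make_braced() {
    return {7, 8, 9};
}

struct Counts { int copies, moves; };

Counts count_returns() {
    Tracer::copies = Tracer::moves = 0;
    Tracer s = make_single();
    Tracer e = make_either(false);
    Tracer b = make_braced();
    (void)s; (void)e; (void)b;
    return Counts{Tracer::copies, Tracer::moves};
}
```

On a typical optimizing compiler I would expect count_returns() to report zero copies and a single move (from make_either), but the exact move count is not guaranteed.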

Pushing temporary into vector in c++

How many copies happen/object exist in the following, assuming that normal compiler optimizations are enabled:
std::vector<MyClass> v;
v.push_back(MyClass());
If it is not exactly 1 object creation and 0 copying, What can I do (including changes in MyClass) to achieve that, since it seems to me that that is all that should really be necessary?
If the copy constructor of MyClass has side-effects, then in C++03 the copy is not permitted to be elided. That's because the temporary object that's the source of the copy has been bound to a reference (the parameter of push_back).
If the copy constructor of MyClass has no side-effects then the compiler is permitted to optimize it away under the "as-if" rule. I think the only sensible way to determine whether it actually has done so with "normal optimizations" is to inspect the emitted code. Different people have different ideas what's normal, and a given compiler might be sensitive to the details of MyClass. My guess is that what this amounts to is whether or not the compiler (or linker) inlines everything in sight. If it does then it will probably optimize, if it doesn't then it won't. So even the size of the constructor code might be relevant, never mind what it does.
So I think the main thing you can do is to ensure that both the default and the copy constructor of MyClass have no side-effects and are available to be inlined. If they're not available then of course the compiler will assume that they could have side-effects and will do the copy. If link-time optimization is a normal compiler option for you, then you don't have to do much to make them available. Otherwise, if they're user-defined then do it in the header file that defines MyClass. You might be able to get away with the default constructor having certain kinds of side-effects: if the effects don't depend on the address of the temporary being different from the address of the vector element then "as-if" still applies.
In C++11 you have a move (that likewise must not be elided if it has side-effects), but you can use v.emplace_back() to avoid that. The move would call the move constructor of MyClass if it has one, otherwise the copy constructor, and everything I say above about "as-if" applies to moves. emplace_back() calls the no-args constructor to construct the vector element (or if you pass arguments to emplace_back then whatever constructor matches those args), which I think is exactly what you want.
You mean:
std::vector<MyClass> v;
v.push_back(MyClass());
None. The temporary will cause the move version of push_back to be called. Even the move construction will most likely be elided.
If you have a C++11 compiler, you can use emplace_back to construct the element at the end of the vector, zero copies necessary.
In C++03, you would have a construction and a copy, plus destruction of the temporary.
If your compiler supports C++11 and MyClass defines a move constructor, then you have one construction and a move.
As mentioned by Timbo, you can also use emplace_back to avoid the move, the object being constructed in-place.
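To check the counts from these answers yourself, a counting version of MyClass is enough. This is a sketch for a C++11 or later compiler; the counters, Counts and the two helper functions are hypothetical additions, and capacity is reserved up front so reallocation does not distort the numbers.

```cpp
#include <vector>

// MyClass instrumented with hypothetical counters.
struct MyClass {
    static int constructed, copies, moves;
    MyClass() { ++constructed; }
    MyClass(const MyClass&) { ++copies; }
    MyClass(MyClass&&) noexcept { ++moves; }
};
int MyClass::constructed = 0;
int MyClass::copies = 0;
int MyClass::moves = 0;

struct Counts { int constructed, copies, moves; };

static void reset_counters() {
    MyClass::constructed = MyClass::copies = MyClass::moves = 0;
}

// push_back with a temporary: the temporary binds to push_back's rvalue
// reference parameter and the element is move-constructed from it.
Counts push_back_temporary() {
    std::vector<MyClass> v;
    v.reserve(1);
    reset_counters();
    v.push_back(MyClass());
    return Counts{MyClass::constructed, MyClass::copies, MyClass::moves};
}

// emplace_back with no arguments: the element is default-constructed in place.
Counts emplace_back_in_place() {
    std::vector<MyClass> v;
    v.reserve(1);
    reset_counters();
    v.emplace_back();
    return Counts{MyClass::constructed, MyClass::copies, MyClass::moves};
}
```

Because the counters are observable side effects, the move in push_back_temporary() cannot be optimized away here; for a class with trivial, inlinable constructors the compiler may well remove it under the as-if rule.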

Under what conditions should I be thinking about implementing a move constructor and a move operator?

For standard copy constructors and assignment operators, I always think about implementing them or deleting the defaults out of existence, if my class implements a destructor.
For the new move constructor and move operator, what is the right way to think about whether or not an implementation is necessary?
As a first pass of transitioning a system from pre-C++0x, could I just delete the default move constructor and move operator or should I leave them alone?
You don't have to worry about it, in the sense that when you user-declare a destructor (or anything else listed in 12.8/9), that blocks the default move constructor from being generated. So there's not the same risk as there is with copies, that the default is wrong.
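The effect of a user-declared destructor can be seen with a member that records how it was transferred. Member, Plain, WithDtor, Counts and move_construct_once are hypothetical names used only in this sketch.

```cpp
#include <utility>

// Hypothetical member that records whether it was copied or moved.
struct Member {
    static int copies, moves;
    Member() = default;
    Member(const Member&) { ++copies; }
    Member(Member&&) noexcept { ++moves; }
};
int Member::copies = 0;
int Member::moves = 0;

// No user-declared special members: an implicit move constructor is generated.
struct Plain {
    Member m;
};

// User-declared destructor: no implicit move constructor is generated,
// so std::move(source) ends up selecting the implicit copy constructor.
struct WithDtor {
    Member m;
    ~WithDtor() {}
};

struct Counts { int copies, moves; };

template <class T>
Counts move_construct_once() {
    T source;
    Member::copies = Member::moves = 0;
    T target(std::move(source));
    (void)target;
    return Counts{Member::copies, Member::moves};
}
```

move_construct_once<Plain>() reports one move, while move_construct_once<WithDtor>() reports one copy: the class keeps working, it just keeps copying as it did in C++03.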
So, as the first pass leave them alone. There may be places in your existing code where C++11 move semantics allow a move, whereas C++03 dictates a copy. Your class will continue to be copied, and if that caused no performance problems in C++03 then I can't immediately think of any reason why it would in C++11. If it did cause performance problems in C++03, then you have an opportunity to fix a bug in your code that you never got around to before, but that's an opportunity, not an obligation ;-)
If you later implement move construction and assignment, they will be moved, and in particular you'll want to do this if you think that C++11 clients of your class are less likely to use "swaptimization" to avoid copies, more likely to pass your type by value, etc, than C++03 clients were.
When writing new classes in C++11, you need to consider and implement move under the same criteria that you considered and implemented swap in C++03. A class that can be copied implements the C++11 concept of "movable", (much as a class that can be copied in C++03 can be swapped via the default implementation in std), because "movable" doesn't say what state the source is left in - in particular it's permitted to be unchanged. So a copy is valid as a move, it's just not necessarily an efficient one, and for many classes you'll find that unlike a "good" move or swap, a copy can throw.
You might find that you have to implement move for your classes in cases where you have a destructor (hence no default move constructor), and you also have a data member which is movable but not copyable (hence no default copy constructor either). That's when move becomes important semantically as well as for performance.
With C++11, you very rarely need to provide a destructor or copy semantics, due to the way the library is written. Compiler provided members pretty much always do fine (provided they are implemented correctly: MSVC forces you to implement a lot of move semantics by hand, which is very bothersome).
In case you have to implement a custom destructor, use the following approach:
Implement a move constructor, and an assignment operator taking by value (using copy&swap: note that you cannot use std::swap since it uses the assignment. You have to provide a private swap yourself). Pay attention to exception guarantees (look up std::move_if_noexcept).
If necessary, implement a copy constructor. Otherwise, delete it. Beware that non default copy semantics rarely make sense.
Also, a virtual destructor counts as a custom destructor: provide or delete copy + move semantics when declaring a virtual destructor.
Since they are used as an optimization, you should implement them if the optimization is applicable to your class, that is, if you can "steal" the internal resource your class is holding from a temporary object that is about to be destroyed. std::vector is a perfect example: its move constructor only assigns the pointers to the internal buffer, leaving the temporary object empty (effectively stealing the elements).