Consider the following code.
using T = std::string;
void rotate_left(std::vector<T>& v) {
T temp = std::move(v[0]);
for (size_t i=0; i+1 < v.size(); ++i) {
v[i] = std::move(v[i+1]);
}
v.back() = std::move(temp);
}
int main()
{
std::vector<T> v(3); // a vector of three Ts
T x = std::move(v[1]); // move-from the second element
rotate_left(v);
// Can we now say that v[0] is in a moved-from state, or did we
// get undefined behavior when we moved from v[1] a second time?
}
The rotate_left function simply shifts everything in the vector down by one position (and then puts the first element onto the end). My question is: does this function have defined behavior when one of the elements in the vector is in a "moved-from state"?
This is related but not quite the same as "self-move". In this case, we're moving from one moved-from object into another moved-from object, and my question is whether we can rely on this leaving both objects still in some moved-from state, or whether "assignment-from" is one of those operations that "has a precondition" and therefore can't be used on arbitrary moved-from objects.
I'm well aware that
this is perfectly fine for sane library types such as unique_ptr
this will be perfectly fine for my own user-defined types if I define them sanely
this will be problematic for my own user-defined types if I define them insanely
this is perfectly fine for all the library types I've tried on libstdc++ and libc++
So what I'm really looking for is either:
concrete wording from the Standard proving that this must be perfectly fine for all STL types going forward, or
a concrete example of an STL or Boost type "in the wild" where this code is definitely problematic
I infer from this bug that _GLIBCXX_DEBUG checks for self-move, but I can confirm that it does not check for move-from-moved; is this because they consider move-from-moved to be safe and legal, or just because nobody wrote the code to check for it yet?
Your program's behavior is well-defined, but unspecified. The difference is that v has a valid state, you just don't know what that state is without inspecting it.
This corner of the standard has been controversial, and the wording has been difficult to get right. But I believe the latest wording (C++17) is our best attempt so far, so I will quote from that (N4660).
Disclaimer, I actually linked to N4659 as N4660 is not publicly available. The difference is inconsequential.
From 20.5.5.15 Moved-from state of library types [lib.types.movedfrom]
1 Objects of types defined in the C++ standard library may be moved from (15.8). Move operations may be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.
This paragraph is a blanket statement for all types defined by the std::lib that moved-from objects are not "poison"; you just don't know what value they have.
Furthermore, each algorithm (including member functions) defined in the standard may have a list of preconditions that must be true prior to calling that algorithm. If there are no preconditions listed, that means that you can always call that function.
Move assignment for types defined in the std::lib never has any preconditions listed for the left or right hand side arguments.
It is hard to quote something that doesn't exist, but that is the way this specification works.
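As a minimal sketch of what this means for the question's code, assuming T is std::string: the second move from v[1] and the assignments into moved-from elements are all fine, and only the value that ends up in v[0] is unspecified. The driver name rotate_with_moved_from_element is hypothetical, and the empty-vector guard is an addition that the question's version does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Same algorithm as in the question: shift everything left by one slot
// and put the old first element at the back.
void rotate_left(std::vector<std::string>& v) {
    if (v.empty()) return;  // guard added for this sketch
    std::string temp = std::move(v[0]);
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        v[i] = std::move(v[i + 1]);  // assigns into a moved-from element
    }
    v.back() = std::move(temp);
}

// Hypothetical driver: moves from the middle element, then rotates.
std::vector<std::string> rotate_with_moved_from_element() {
    std::vector<std::string> v{"first", "second", "third"};
    std::string stolen = std::move(v[1]);  // v[1] is now valid but unspecified
    assert(stolen == "second");
    rotate_left(v);                        // moves from v[1] a second time
    return v;
}
```

After the call, v[1] and v[2] hold known values ("third" and "first"), while v[0] holds whatever the moved-from string happened to contain.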
Generalizing further, the following section refers to all types (i.e. user-supplied) that are used with the std::lib:
20.5.3.1 Template argument requirements [utility.arg.requirements]
Table 23 — MoveConstructible requirements [moveconstructible]
T u = rv;
T(rv);
rv's state is unspecified in the post-condition.
Table 25 — MoveAssignable requirements [moveassignable]
t = rv
Only if t and rv do not refer to the same object, t is equivalent to the value of rv before the assignment.
Afterwards, rv's state is unspecified (whether or not t and rv refer to the same object).
Furthermore, this section notes (notes are non-normative, and often the normative text appears elsewhere) that rv must still meet the requirements of the library component (algorithm) it is being used with, even though it has been moved from.
For example, std::sort is allowed to move from a value x, and then use that x in a comparison expression. x must be LessThanComparable, whether or not x is moved-from. Only x's value is unspecified.
bool b = x < x; // b must be false, no matter what!
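The same point can be checked for a concrete library type. This is only a sketch for std::string; the function name moved_from_string_is_usable is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Returns true if a moved-from std::string still behaves like a normal value.
bool moved_from_string_is_usable() {
    std::string x = "some text that is long enough to need heap storage";
    std::string sink = std::move(x);  // x is now valid but unspecified

    bool irreflexive = !(x < x);                       // must hold for any valid string
    bool consistent = (x.empty() == (x.size() == 0));  // observers have no preconditions

    x = "fresh";  // assignment has no preconditions either
    assert(!sink.empty());
    return irreflexive && consistent && x == "fresh";
}
```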
Related
I've made a simple Huffman encoding program to output individual encodings for characters and save the encoded file. This was for an assignment, and I was told that using const_cast on heap.top() is considered undefined behavior if we heap.pop() afterwards, but I'm not sure I understand why.
I've read cppreference regarding the std::pop_heap which is the underlying function called when we call heap.pop() and I believe that a nullptr in the comparison is still defined and understood. It doesn't seem to function abnormally to me when I debugged it.
Here's an example
#include <functional>
#include <queue>
#include <vector>
#include <iostream>
#include <memory>
template<typename T> void print_queue_constcast(T& q) {
while(!q.empty()) {
auto temp = std::move(const_cast<int&>(q.top()));
std::cout << temp << " ";
q.pop();
}
std::cout << '\n';
}
template<typename T> void print_queue(T& q) {
while(!q.empty()) {
std::cout << q.top() << " ";
q.pop();
}
std::cout << '\n';
}
int main() {
std::priority_queue<int> q1;
std::priority_queue<int> q2;
for(int n : {1,8,5,6,3,4,0,9,7,2}){
q1.push(n);
q2.push(n);
}
print_queue(q1);
print_queue_constcast(q2);
}
Could anyone explain what is actually going on in the background that would be undefined behavior, or that would cause this to fail under certain circumstances?
tl;dr: Maybe; maybe not.
Language-level safety
Like a set, a priority_queue is in charge of ordering its elements. Any modification to an element would potentially "break" the ordering, so the only safe way to do that is via the container's own mutating methods. (In fact, neither one actually provides such a thing.) Directly modifying elements is dangerous. To enforce this, these containers only expose const access to your elements.
Now, at the language level, the objects won't actually have a static type of const T; most likely they're just Ts. So modifying them (after a const_cast to cheat the type system) doesn't have undefined behaviour in that sense.
Library-level safety
However, you are potentially breaking a condition of using the container. The rules for priority_queue don't ever actually say this, but since its mutating operations are defined in terms of functions like push_heap and pop_heap, your use of such operations will break preconditions of those functions if the container's ordering is no longer satisfied after your direct mutation.
Thus your program will have undefined behaviour if you break the ordering and later mutate the priority_queue in such a way that depends on the ordering being intact. If you don't, technically your program's behaviour is well-defined; however, in general, you'd still be playing with fire. A const_cast should be a measure of last resort.
So, where do we stand?
The question is: did you break the ordering? What's the state of the element after moving from it, and is the ordering satisfied by having an object in that state at the top of the queue?
Your original example uses shared_ptrs, and we know from the documentation that a moved-from shared_ptr turns safely into a null pointer.
The default priority_queue ordering is defined by std::less, which yields a strict total order over raw pointers; std::less on a shared_ptr will actually invoke its base case of operator<, but that in turn is defined to invoke std::less on its raw pointer equivalent.
Unfortunately, that doesn't mean that a null shared_ptr is ordered "first": though std::less's pointer ordering is strict and total, where null pointers land in this ordering is unspecified.
So, it is unspecified as to whether your mutation will break the ordering, and therefore it is unspecified as to whether your pop() will have undefined behaviour.
(The MCVE example with int is safe because std::move on an int has no work to do: it'll just copy the int. So, the ordering remains unaffected.)
Conclusion
I would agree with what was presumably your driving rationale, that it is unfortunate pop() doesn't return you the popped thing, which you could then move from. Similar restrictions with sets and maps are why we now have node splicing features for those containers. There is not such a thing for a priority_queue, which is just a wrapper around another container like a vector. If you need more fine-grained control, you can substitute that for your own which has the features you need.
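One way to get a "pop that returns the element" without a const_cast is to manage the heap yourself on top of a vector. The sketch below is not part of priority_queue; take_top is a hypothetical helper that assumes the vector is already arranged as a max-heap under operator<.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: removes and returns the largest element of a
// vector that is already a max-heap (std::make_heap / std::push_heap).
template <typename T>
T take_top(std::vector<T>& heap) {
    assert(!heap.empty());
    std::pop_heap(heap.begin(), heap.end());  // moves the top element to the back
    T top = std::move(heap.back());           // back() is no longer part of the heap range
    heap.pop_back();
    return top;
}
```

Because the element is moved from only after pop_heap has taken it out of the heap range, the heap ordering never has to deal with a moved-from value.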
Anyway, for the sake of a shared_ptr increment (as in your original code), I'd probably just take the hit of the copy, unless you have some really extreme performance requirements. That way, you know everything will be well-defined.
Certainly, for the sake of an int copy (as in your MCVE), a std::move is entirely pointless (there are no indirect resources to steal!) and you're doing a copy anyway, so the point is rather moot and all you've done is to create more complex code for no reason.
I would also recommend not writing code where you have to ask whether it's well-defined, even if it turns out it is. That's not ideal for readability or maintainability.
I recently followed a Reddit discussion which led to a nice comparison of std::visit optimization across compilers. I noticed the following: https://godbolt.org/z/D2Q5ED
Both GCC9 and Clang9 (I guess they share the same stdlib) do not generate code for checking and throwing a valueless exception when all types meet some conditions. This leads to way better codegen, hence I raised an issue with the MSVC STL and was presented with this code:
template <class T>
struct valueless_hack {
struct tag {};
operator T() const { throw tag{}; }
};
template<class First, class... Rest>
void make_valueless(std::variant<First, Rest...>& v) {
try { v.template emplace<0>(valueless_hack<First>()); } // v is type-dependent, so "template" is required
catch(typename valueless_hack<First>::tag const&) {}
}
The claim was that this makes any variant valueless, and according to the documentation it should:
First, destroys the currently contained value (if any). Then
direct-initializes the contained value as if constructing a value of
type T_I with the arguments std::forward<Args>(args).... If an
exception is thrown, *this may become valueless_by_exception.
What I don't understand: Why is it stated as "may"? Is it legal to stay in the old state if the whole operation throws? Because this is what GCC does:
// For suitably-small, trivially copyable types we can create temporaries
// on the stack and then memcpy them into place.
template<typename _Tp>
struct _Never_valueless_alt
: __and_<bool_constant<sizeof(_Tp) <= 256>, is_trivially_copyable<_Tp>>
{ };
And later it (conditionally) does something like:
T tmp = forward(args...);
reset();
construct(tmp);
// Or
variant tmp(inplace_index<I>, forward(args...));
*this = move(tmp);
Hence basically it creates a temporary, and if that succeeds copies/moves it into the real place.
IMO this is a violation of "First, destroys the currently contained value" as stated by the docu. As I read the standard, then after a v.emplace(...) the current value in the variant is always destroyed and the new type is either the set type or valueless.
I do get that the condition is_trivially_copyable excludes all types that have an observable destructor. So this can also be thought of as: "as-if variant is reinitialized with the old value" or so. But the state of the variant is an observable effect. So does the standard indeed allow that emplace does not change the current value?
Edit in response to a standard quote:
Then initializes the contained value as if direct-non-list-initializing a value of type TI with the arguments std::forward<Args>(args)....
Does T tmp {std::forward<Args>(args)...}; this->value = std::move(tmp); really count as a valid implementation of the above? Is this what is meant by "as if"?
I think the important part of the standard is this:
From https://timsong-cpp.github.io/cppwp/n4659/variant.mod#12
23.7.3.4 Modifiers
(...)
template <size_t I, class... Args>
variant_alternative_t<I, variant<Types...>>& emplace(Args&&... args);
(...) If an exception is thrown during the initialization of the contained value, the variant might not hold a value
It says "might" not "must". I would expect this to be intentional in order to allow implementations like the one used by gcc.
As you mentioned yourself, this is only possible if the destructors of all alternatives are trivial and thus unobservable because destroying the previous value is required.
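To see the two outcomes side by side, here is a sketch built on the valueless_hack idea from the question. NonTrivial and failing_emplace_leaves_valueless are hypothetical names. The result for the int alternative depends on the standard library: an implementation that builds a temporary first keeps the old value, while one that destroys first ends up valueless.

```cpp
#include <variant>

// Throws from a conversion operator, so the new alternative is never constructed.
template <class T>
struct valueless_hack {
    struct tag {};
    operator T() const { throw tag{}; }
};

// Hypothetical alternative with user-provided copy operations, so it is
// not trivially copyable.
struct NonTrivial {
    NonTrivial() {}
    NonTrivial(const NonTrivial&) {}
    NonTrivial& operator=(const NonTrivial&) { return *this; }
};

// Attempts an emplace<0> that always throws and reports whether the
// variant is valueless afterwards.
template <class First, class... Rest>
bool failing_emplace_leaves_valueless(std::variant<First, Rest...>& v) {
    try {
        v.template emplace<0>(valueless_hack<First>());
    } catch (const typename valueless_hack<First>::tag&) {
        // swallow the exception; only the resulting state matters here
    }
    return v.valueless_by_exception();
}
```

For a std::variant<NonTrivial, int> the implementations discussed here have to destroy the old value first, so the call is expected to return true. For a std::variant<int, NonTrivial> holding 42 it either returns true or leaves the 42 in place.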
Followup question:
Then initializes the contained value as if direct-non-list-initializing a value of type TI with the arguments std::forward<Args>(args)....
Does T tmp {std::forward<Args>(args)...}; this->value = std::move(tmp); really count as a valid implementation of the above? Is this what is meant by "as if"?
Yes, because for types that are trivially copyable there is no way to detect the difference, so the implementation behaves as if the value was initialized as described. This would not work if the type was not trivially copyable.
So does the standard indeed allow, that emplace does not change the
current value?
Yes. emplace shall provide the basic guarantee of no leaking (i.e., respecting object lifetime when construction and destruction produce observable side effects), but when possible, it is allowed to provide the strong guarantee (i.e., the original state is kept when an operation fails).
variant is required to behave similarly to a union — the alternatives are allocated in one region of suitably allocated storage. It is not allowed to allocate dynamic memory. Therefore, a type-changing emplace has no way to keep the original object without calling an additional move constructor — it has to destroy it and construct the new object in place of it. If this construction fails, then the variant has to go to the exceptional valueless state. This prevents weird things like destroying a nonexistent object.
However, for small trivially copyable types, it is possible to provide the strong guarantee without too much overhead (even a performance boost for avoiding a check, in this case). Therefore, the implementation does it. This is standard-conforming: the implementation still provides the basic guarantee as required by the standard, just in a more user-friendly way.
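The "temporary first" idea can be sketched outside of variant. This is not the libstdc++ code; emplace_via_temporary and ThrowsOnConversion are hypothetical names, and the slot is just a plain object of a trivially copyable type.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: builds the new value in a temporary first and
// only then overwrites the slot.
template <class T, class... Args>
void emplace_via_temporary(T& slot, Args&&... args) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "the copy below must have no observable side effects");
    T tmp(std::forward<Args>(args)...);  // may throw; slot is untouched so far
    slot = tmp;                          // trivial copy, cannot throw
}

// Hypothetical argument type whose conversion to int always throws.
struct ThrowsOnConversion {
    operator int() const { throw 0; }
};
```

If constructing the temporary throws, the slot still holds its old value, which is the strong guarantee described above. For a trivially copyable T nobody can observe that the value was built elsewhere and then copied.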
Edit in response to a standard quote:
Then initializes the contained value as if
direct-non-list-initializing a value of type TI with the arguments
std::forward<Args>(args)....
Does T tmp {std::forward<Args>(args)...}; this->value =
std::move(tmp); really count as a valid implementation of the above?
Is this what is meant by "as if"?
Yes, if the move assignment produces no observable effect, which is the case for trivially copyable types.
I don't see anything on subj in current draft. Do I get it right, that the following code
struct Omg { Omg &operator=(Omg const &o) { throw 0; } };
std::tuple t0{42, Omg{}};
std::tuple t1{10, Omg{}};
t1 = t0;
is fully allowed to leave t1 in semi-assigned state? I.e., its first element could have already changed yet the second one can remain as it was, or even become inconsistent?
is fully allowed to leave t1 in semi-assigned state?
Yes. Copy-assignment is specified as just:
Effects: Assigns each element of u to the corresponding element of *this.
There are other types in the standard library that do specify an exception guarantee (e.g. optional), but tuple does not provide one.
Note that it doesn't specify an ordering to the assignment. An implementation could assign the Omg first (so no change to t1) or the int first (so you end up with a semi-assigned state).
I think an implementation could also choose to do copy-and-swap and thus provide a strong exception guarantee. That would match the specified effects. But this is not guaranteed by the standard.
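A small sketch to observe this on a given implementation. The function name first_element_after_failed_assignment is hypothetical, and the defaulted constructors on Omg are an addition to keep the type copy-constructible without warnings. Which value comes back depends on the order in which the library assigns the elements.

```cpp
#include <tuple>

// Copy assignment always throws, as in the question.
struct Omg {
    Omg() = default;
    Omg(const Omg&) = default;
    Omg& operator=(const Omg&) { throw 0; }
};

// Returns the first element of t1 after the failed assignment.
int first_element_after_failed_assignment() {
    std::tuple<int, Omg> t0{42, Omg{}};
    std::tuple<int, Omg> t1{10, Omg{}};
    try {
        t1 = t0;
    } catch (int) {
        // t1 may now be partially assigned
    }
    return std::get<0>(t1);  // 42 if the int was assigned first, 10 otherwise
}
```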
Does the standard define precisely what I can do with an object once it has been moved from? I used to think that all you can do with a moved-from object is destruct it, but that would not be sufficient.
For example, take the function template swap as defined in the standard library:
template <typename T>
void swap(T& a, T& b)
{
T c = std::move(a); // line 1
a = std::move(b); // line 2: assignment to moved-from object!
b = std::move(c); // line 3: assignment to moved-from object!
}
Obviously, it must be possible to assign to moved-from objects, otherwise lines 2 and 3 would fail. So what else can I do with moved-from objects? Where exactly can I find these details in the standard?
(By the way, why is it T c = std::move(a); instead of T c(std::move(a)); in line 1?)
17.6.5.15 [lib.types.movedfrom]
Objects of types defined in the C++ standard library may be moved from
(12.8). Move operations may be explicitly specified or implicitly
generated. Unless otherwise specified, such moved-from objects shall
be placed in a valid but unspecified state.
When an object is in an unspecified state, you can perform any operation on the object which has no preconditions. If there is an operation with preconditions you wish to perform, you can not directly perform that operation because you do not know if the unspecified-state of the object satisfies the preconditions.
Examples of operations that generally do not have preconditions:
destruction
assignment
const observers such as get, empty, size
Examples of operations that generally do have preconditions:
dereference
pop_back
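A minimal sketch of these rules applied to a moved-from std::vector; the function name reuse_after_move is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: uses only operations that are safe on a moved-from vector.
std::vector<int> reuse_after_move() {
    std::vector<int> v{1, 2, 3};
    std::vector<int> sink = std::move(v);  // v is valid but unspecified

    // Observers without preconditions are always fine.
    bool was_empty = v.empty();

    // pop_back requires !empty(), so check first instead of assuming.
    if (!was_empty) {
        v.pop_back();
    }

    // Assignment has no preconditions and gives v a known value again.
    v = {4, 5, 6};
    assert(sink.size() == 3);
    return v;
}
```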
This answer now appears in video format here: http://www.youtube.com/watch?v=vLinb2fgkHk&t=47m10s
Moved-from objects exist in an unspecified, but valid, state. That suggests that whilst the object might not be capable of doing much anymore, all of its member functions (including operator=) should still exhibit defined behaviour, all its members should be in a defined state, and it still requires destruction. The Standard gives no specific definitions because they would be unique to each UDT, but you might be able to find specifications for Standard types. Some, like containers, are relatively obvious: they just move their contents around, and an empty container is a well-defined valid state. Primitives don't modify the moved-from object.
Side note: I believe it's T c = std::move(a) so that if the move constructor (or copy constructor if no move is provided) is explicit the function will fail.
I know that generally the standard places few requirements on the values which have been moved from:
N3485 17.6.5.15 [lib.types.movedfrom]/1:
Objects of types defined in the C++ standard library may be moved from (12.8). Move operations may
be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.
I can't find anything about vector that explicitly excludes it from this paragraph. However, I can't come up with a sane implementation that would result in the vector being not empty.
Is there some standardese that entails this that I'm missing or is this similar to treating basic_string as a contiguous buffer in C++03?
I'm coming to this party late, and offering an additional answer because I do not believe any other answer at this time is completely correct.
Question:
Is a moved-from vector always empty?
Answer:
Usually, but no, not always.
The gory details:
vector has no standard-defined moved-from state like some types do (e.g. unique_ptr is specified to be equal to nullptr after being moved from). However the requirements for vector are such that there are not too many options.
The answer depends on whether we're talking about vector's move constructor or move assignment operator. In the latter case, the answer also depends on the vector's allocator.
vector<T, A>::vector(vector&& v)
This operation must have constant complexity. That means that there are no options but to steal resources from v to construct *this, leaving v in an empty state. This is true no matter what the allocator A is, nor what the type T is.
So for the move constructor, yes, the moved-from vector will always be empty. This is not directly specified, but falls out of the complexity requirement, and the fact that there is no other way to implement it.
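A quick way to observe this, as a sketch; the function name source_size_after_move_construction is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: reports the size of the source after move construction.
std::size_t source_size_after_move_construction() {
    std::vector<std::string> v{"a", "b", "c"};
    const std::string* old_buffer = v.data();
    std::vector<std::string> stolen(std::move(v));  // constant time: takes over v's buffer

    // The new vector owns the very same storage, so no element was moved.
    if (stolen.data() != old_buffer) {
        return static_cast<std::size_t>(-1);  // would indicate an element-wise move
    }
    return v.size();
}
```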
vector<T, A>&
vector<T, A>::operator=(vector&& v)
This is considerably more complicated. There are 3 major cases:
One:
allocator_traits<A>::propagate_on_container_move_assignment::value == true
(propagate_on_container_move_assignment evaluates to true_type)
In this case the move assignment operator will destruct all elements in *this, deallocate capacity using the allocator from *this, move assign the allocators, and then transfer ownership of the memory buffer from v to *this. Except for the destruction of elements in *this, this is an O(1) complexity operation. And typically (e.g. in most but not all std::algorithms), the lhs of a move assignment has empty() == true prior to the move assignment.
Note: In C++11 the propagate_on_container_move_assignment for std::allocator is false_type, but this has been changed to true_type for C++1y (y == 4 we hope).
In case One, the moved-from vector will always be empty.
Two:
allocator_traits<A>::propagate_on_container_move_assignment::value == false
&& get_allocator() == v.get_allocator()
(propagate_on_container_move_assignment evaluates to false_type, and the two allocators compare equal)
In this case, the move assignment operator behaves just like case One, with the following exceptions:
The allocators are not move assigned.
The decision between this case and case Three happens at run time, and case Three requires more of T, and thus so does case Two, even though case Two doesn't actually execute those extra requirements on T.
In case Two, the moved-from vector will always be empty.
Three:
allocator_traits<A>::propagate_on_container_move_assignment::value == false
&& get_allocator() != v.get_allocator()
(propagate_on_container_move_assignment evaluates to false_type, and the two allocators do not compare equal)
In this case the implementation can not move assign the allocators, nor can it transfer any resources from v to *this (resources being the memory buffer). In this case, the only way to implement the move assignment operator is to effectively:
typedef move_iterator<iterator> Ip;
assign(Ip(v.begin()), Ip(v.end()));
That is, move each individual T from v to *this. The assign can reuse both capacity and size in *this if available. For example if *this has the same size as v the implementation can move assign each T from v to *this. This requires T to be MoveAssignable. Note that MoveAssignable does not require T to have a move assignment operator. A copy assignment operator will also suffice. MoveAssignable just means T has to be assignable from an rvalue T.
If the size of *this is not sufficient, then new T will have to be constructed in *this. This requires T to be MoveInsertable. For any sane allocator I can think of, MoveInsertable boils down to the same thing as MoveConstructible, which means constructible from an rvalue T (does not imply the existence of a move constructor for T).
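The point that MoveAssignable and MoveConstructible do not imply move operations can be checked directly. CopyOnly is a hypothetical type for this sketch, and the copies counter exists only to show which operator ran.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type with copy operations only; no move operations are declared.
struct CopyOnly {
    int copies = 0;
    CopyOnly() = default;
    CopyOnly(const CopyOnly&) = default;
    CopyOnly& operator=(const CopyOnly& other) {
        copies = other.copies + 1;  // record that the copy assignment ran
        return *this;
    }
};

// An rvalue binds to const CopyOnly&, so both requirements are met.
static_assert(std::is_move_assignable_v<CopyOnly>);
static_assert(std::is_move_constructible_v<CopyOnly>);

// "Move assignment" here runs the copy assignment operator.
int copies_after_move_assignment() {
    CopyOnly a, b;
    a = std::move(b);
    return a.copies;
}
```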
In case Three, the moved-from vector will in general not be empty. It could be full of moved-from elements. If the elements don't have a move constructor, this could be equivalent to a copy assignment. However, there is nothing that mandates this. The implementor is free to do some extra work and execute v.clear() if he so desires, leaving v empty. I am not aware of any implementation doing so, nor am I aware of any motivation for an implementation to do so. But I don't see anything forbidding it.
David Rodríguez reports that GCC 4.8.1 calls v.clear() in this case, leaving v empty. libc++ does not, leaving v not empty. Both implementations are conforming.
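Case Three can be reproduced with a small stateful allocator. The sketch below assumes a hypothetical TaggedAlloc whose instances compare equal only when their tags match and which does not propagate on move assignment; it simply forwards to std::allocator for the actual memory.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stateful allocator: different tags compare unequal, and
// the allocator is not propagated on container move assignment.
template <class T>
struct TaggedAlloc {
    using value_type = T;
    using propagate_on_container_move_assignment = std::false_type;
    using is_always_equal = std::false_type;

    int tag;

    explicit TaggedAlloc(int t) noexcept : tag(t) {}
    template <class U>
    TaggedAlloc(const TaggedAlloc<U>& other) noexcept : tag(other.tag) {}

    T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
    void deallocate(T* p, std::size_t n) noexcept { std::allocator<T>{}.deallocate(p, n); }
};

template <class T, class U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) noexcept { return a.tag == b.tag; }
template <class T, class U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) noexcept { return !(a == b); }

using TaggedVec = std::vector<std::string, TaggedAlloc<std::string>>;

// Returns {destination size, source size} after a case-Three move assignment.
std::pair<std::size_t, std::size_t> sizes_after_unequal_move_assignment() {
    TaggedAlloc<std::string> a1(1), a2(2);
    TaggedVec src(a1);
    src.emplace_back("alpha");
    src.emplace_back("beta");
    TaggedVec dst(a2);
    dst = std::move(src);  // allocators differ, so elements are moved one by one
    return {dst.size(), src.size()};
}
```

The destination always ends up with two elements. The source size is 2 on an implementation that leaves the moved-from elements in place and 0 on one that clears the source afterwards.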
While it might not be a sane implementation in the general case, a valid implementation of the move constructor/assignment is just copying the data from the source, leaving the source untouched. Additionally, for the case of assignment, move can be implemented as swap, and the moved-from container might contain the old value of the moved-to container.
Implementing move as copy can actually happen if you use polymorphic allocators, as we do, and the allocator is not deemed to be part of the value of the object (and thus, assignment never changes the actual allocator being used). In this context, a move operation can detect whether both the source and the destination use the same allocator. If they use the same allocator the move operation can just move the data from the source. If they use different allocators then the destination must copy the source container.
In a lot of situations, move-construction and move-assignment can be implemented by delegating to swap - especially if no allocators are involved. There are several reasons for doing that:
swap has to be implemented anyway
developer efficiency because less code has to be written
runtime efficiency because fewer operations are executed in total
Here is an example for move-assignment. In this case, the move-from vector will not be empty, if the moved-to vector was not empty.
auto operator=(vector&& rhs) -> vector&
{
if (/* allocator is neither move- nor swap-aware */) {
swap(rhs);
} else {
...
}
return *this;
}
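A compilable version of the same idea, as a sketch: SwapBuffer is a hypothetical class that wraps a std::vector<int> and implements move assignment purely as a swap, with no allocator handling at all.

```cpp
#include <utility>
#include <vector>

// Hypothetical container-like class whose move assignment is a swap.
class SwapBuffer {
public:
    explicit SwapBuffer(std::vector<int> data) : data_(std::move(data)) {}
    SwapBuffer(SwapBuffer&&) noexcept = default;

    SwapBuffer& operator=(SwapBuffer&& rhs) noexcept {
        data_.swap(rhs.data_);  // rhs receives the old contents of *this
        return *this;
    }

    const std::vector<int>& data() const { return data_; }

private:
    std::vector<int> data_;
};
```

After a = std::move(b), the moved-from object b holds whatever a held before, so it is only empty if a was empty.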
I left comments to this effect on other answers, but had to rush off before fully explaining. The result of a moved-from vector must always be empty, or in the case of move assignment, must be either empty or the previous object's state (i.e. a swap), because otherwise the iterator invalidation rules cannot be met, namely that a move does not invalidate them. Consider:
std::vector<int> move;
std::vector<int>::iterator it;
{
std::vector<int> x(some_size);
it = x.begin();
move = std::move(x);
}
std::cout << *it;
Here you can see that iterator invalidation does expose the implementation of the move. The requirement for this code to be legal, specifically that the iterator remains valid, prevents the implementation from performing a copy, or small-object-storage or any similar thing. If a copy was made, then it would be invalidated when x is destroyed at the end of the inner scope, and the same is true if the vector uses some kind of SSO-based storage. Essentially, the only reasonable possible implementation is to swap pointers, or simply move them.
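A runnable variant of that snippet, as a sketch. It assumes the default std::allocator, where the buffer is handed over on move assignment; iterator_follows_buffer is a hypothetical name, and the pointer comparison is only meaningful under that assumption.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical check: does an iterator taken before move assignment refer
// to the destination's first element afterwards?
bool iterator_follows_buffer(std::size_t n) {
    std::vector<int> dst;
    std::vector<int>::iterator it;
    {
        std::vector<int> src(n, 7);
        it = src.begin();
        dst = std::move(src);
    }  // src is destroyed here; dst owns the buffer if it was transferred
    return !dst.empty() && &*it == dst.data() && *it == 7;
}
```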
Kindly view the Standard quotes on requirements for all containers:
X u(rv)
X u = rv
post: u shall be equal to the value that rv had before this construction
a = rv
a shall be equal to the value that rv had before this assignment
Iterator validity is part of the value of a container. Although the Standard does not unambiguously state this directly, we can see in, for example,
begin() returns an iterator referring to the first element in the
container. end() returns an iterator which is the past-the-end value
for the container. If the container is empty, then begin() == end();
Any implementation which actually did move from the elements of the source instead of swapping the memory would be defective, so I suggest that any Standard wording saying otherwise is a defect, not least because the Standard is not in fact very clear on this point. These quotes are from N3691.