Move constructor and pre-increment vs post-increment

In C++, if you have a for loop that "copies" objects of a user defined type using a move constructor, does it make any difference if you use ++i or i++ as the loop counter?
I know this question seems rather vague, but I was (I believe) asked this in a phone interview. I wasn't sure if I understood the question correctly, and the interviewer took this as my not knowing the answer, and cut the interview short.
What could he have been getting at?

In C++, if you have a for loop that "copies" objects of a user defined type using a move constructor [...]
First of all, a move constructor is used for move-constructing, which usually means you are not "copying". You can implement moving as copying (in fact, a class that is copy-constructible is also move-constructible), but then why define a move constructor explicitly?
[...] does it make any difference if you use ++i or i++ as the loop counter?
It depends on what i is. If it is a scalar object, like an int, then there is no difference at all.
If i is a class-type iterator, on the other hand, ++i should be more efficient (on a purely theoretical ground), because the implementation of operator ++ will not have to create a copy of the iterator to be returned before the iterator itself is incremented.
Here, for instance, is how libstdc++ defines the increment operators for the iterator type of a std::list:
_Self&
operator++()
{
    _M_node = _M_node->_M_next;
    return *this;
}
_Self
operator++(int)
{
    _Self __tmp = *this;
    _M_node = _M_node->_M_next;
    return __tmp;
}
As you can see, the postfix version (the one accepting a dummy int) has more work to do: it needs to create a copy of the original iterator to be returned, then alter the iterator's internal pointer, then return the copy.
On the other hand, the prefix version just has to alter the internal pointer and return (a reference to) itself.
However, please keep in mind that when performance is concerned, all assumptions have to be backed up by measurement. In this case, I do not expect any noticeable difference between these two functions.
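To make the extra copy visible, here is a minimal sketch with a hypothetical iterator-like type, CountingIter, whose copy constructor counts its calls. The type and the two helper functions are illustrative assumptions and do not come from any library.

```cpp
#include <cassert>

// Hypothetical iterator-like type that records how often it is copied.
struct CountingIter {
    int pos = 0;
    static int copies;

    CountingIter() = default;
    CountingIter(const CountingIter& other) : pos(other.pos) { ++copies; }
    CountingIter& operator=(const CountingIter&) = default;

    // Prefix: advance in place and return a reference to self.
    CountingIter& operator++() { ++pos; return *this; }

    // Postfix: save a copy, advance, then return the saved copy.
    CountingIter operator++(int) {
        CountingIter tmp = *this;
        ++pos;
        return tmp;
    }
};
int CountingIter::copies = 0;

// Performs n prefix increments and returns how many copies were made.
int copies_with_prefix(int n) {
    CountingIter::copies = 0;
    CountingIter it;
    for (int k = 0; k < n; ++k) ++it;
    return CountingIter::copies;
}

// Performs n postfix increments and returns how many copies were made.
int copies_with_postfix(int n) {
    CountingIter::copies = 0;
    CountingIter it;
    for (int k = 0; k < n; ++k) it++;
    return CountingIter::copies;
}
```

The prefix loop makes no copies, while the postfix loop makes at least one per iteration (possibly two if the compiler does not elide the copy on return). For a trivially copyable iterator an optimizer can usually drop the unused copy, which is why measuring matters.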

Related

Why do std::future<T> and std::shared_future<T> not provide member swap()?

All kinds of classes in the C++ standard library have a member swap function, including some polymorphic classes like std::basic_ios<CharT>. The template class std::shared_future<T> clearly is a value type and std::future<T> is a move-only value type. Is there any particular reason, they don't provide a swap() member function?

Member swap was a massive performance increase prior to std::move support in C++11. It was the way you could move one vector to another spot, for example. It was used in vector resizes as well, and it meant that inserting into a vector of vectors was not complete performance suicide.
After std::move arrived in C++11, for many types that have a cheap empty state, the default implementation of std::swap:
template<class T>
void swap( T& lhs, T& rhs ) {
    auto tmp = std::move(rhs);
    rhs = std::move(lhs);
    lhs = std::move(tmp);
}
is going to be basically as fast as a custom-written one.
Existing types with swap members are unlikely to lose them (at least immediately). Extending the API of a new type should, however, be justified.
If std::future is basically a wrapper around a std::unique_ptr< future_impl >, then the above is going to require 4 pointer reads, 3 pointer writes, and one branch. An optimizing compiler that inlined it [1] could reduce it down to 2 pointer reads and 2 pointer writes (using SSA [2], for example), which is what an optimized .swap member function could do.
[1] So it knows that no intermediate access to lhs and rhs occurs; thus tmp can be eliminated under the as-if rule once the compiler proves that tmp is empty and hence has a no-op destructor.
[2] Static single assignment, where you break a program down such that every assignment to a primitive creates a brand new variable (with metadata). You then prove properties about that variable, and eliminate redundant ones.
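As a rough illustration of that point, the sketch below swaps two move-only handles with the generic std::swap. FutureLike is a hypothetical stand-in that assumes, as the answer does, that a future is little more than a unique_ptr to some shared state; it is not how any real std::future is implemented.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a future: a move-only handle to some state.
struct FutureLike {
    std::unique_ptr<int> state;
    explicit FutureLike(int v) : state(new int(v)) {}
    bool valid() const { return state != nullptr; }
};

// Swaps two handles with the generic std::swap (three moves, no member
// swap() required) and returns the value that `a` refers to afterwards.
int swap_and_read_first(FutureLike& a, FutureLike& b) {
    std::swap(a, b);
    return *a.state;
}
```

Because FutureLike declares no copy operations or destructor, it gets implicit move operations, which is all std::swap needs.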

shared_ptr assignment: order of reference counting

When you use the copy assignment operator of a shared_ptr, conceptually, the shared_ptr on the left hand side of the assignment would need to decrement the reference count of the object it currently owns, and then increment the reference count of the object on the right-hand side of the assignment. (Assuming, of course, that both pointers are non-null.)
So an implementation might look something like the following pseudo code:
shared_ptr& operator = (const shared_ptr& rhs)
{
    decrement_reference_count(this->m_ptr);
    this->m_ptr = rhs.m_ptr;
    increment_reference_count(this->m_ptr);
    return *this;
}
But note that here we decrement the reference count of this before we increment the reference count of rhs. We could also do it the other way around. My question is, does the standard actually specify the order here?
Why it makes a difference: it could make a big difference in the event that there is some kind of dependency between the reference count of this and the reference count of rhs. For example, suppose both are part of a linked list structure, where the next pointer in each linked node is a shared_ptr. So, decrementing the reference count of any node in the structure could trigger a destructor, which would then set off a chain reaction and decrement the reference count (and possibly also destruct) every other node in the chain.
So, supposing a situation where the reference count of rhs is affected by the reference count of this, it makes a big difference whether we first decrement this or first increment rhs. If we increment rhs before decrementing this, then we can be sure that rhs will not end up being destructed when we decrement this.
But does the standard actually specify an order here? As far as I can see, the only thing the standard says is that the copy assignment operator is equivalent to the expression:
shared_ptr(rhs).swap(*this)
But I can't really wrap my head around the implications (if any) that this equivalency might have in regard to the order of decrementing/incrementing the reference counts.
So does the standard specify an order here? Or is this implementation defined behavior?

The Standard says [20.7.2.2.3] that
shared_ptr& operator=(const shared_ptr& r) noexcept;
has effects equivalent to
shared_ptr(r).swap(*this)
This means constructing a temporary, which increments r's reference count, then swapping its data with *this, then destroying the temporary, which means decrementing the reference count that used to belong to *this.

It must increment the reference counter first, in case rhs is this. Otherwise it could inadvertently destroy the pointee when the reference counter is 1. It could check whether this == &rhs but this check is unnecessary if the reference counter increment is performed before the decrement.
shared_ptr(rhs).swap(*this) does not suffer from this issue because it creates a copy first, thus incrementing the reference counter first.
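The linked-list scenario from the question can be sketched as follows. Node and both helper functions are hypothetical names used only for illustration; the behaviour relies on the copy-then-swap effects quoted above.

```cpp
#include <cassert>
#include <memory>

// Hypothetical singly linked node whose next pointer is a shared_ptr.
struct Node {
    int value;
    std::shared_ptr<Node> next;
    explicit Node(int v) : value(v) {}
};

// Builds the list 1 -> 2 and then advances head to the second node.
// The right-hand side lives inside the node that head is about to release,
// so the second node must be retained before the first one is destroyed.
int advance_head() {
    std::shared_ptr<Node> head = std::make_shared<Node>(1);
    head->next = std::make_shared<Node>(2);
    head = head->next;
    return head->value;
}

// Self-assignment through an alias: the count is incremented before it is
// decremented, so the pointee survives with a single owner.
long self_assign_use_count() {
    std::shared_ptr<int> p = std::make_shared<int>(42);
    std::shared_ptr<int>& alias = p;
    p = alias;
    return p.use_count();
}
```

If the decrement happened first, advance_head would destroy the first node, and with it the only reference to the second node, before the copy was taken.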

What is the difference between these two parameters in C++?

I am new to C++ and currently am learning about templates and iterators.
I saw some code implementing custom iterators and I'm curious to know what the difference between these two iterator parameters is:
iterator & operator=(iterator i) { ... i.someVar }
bool operator==(const iterator & i) { ... i.someVar }
They implement the = and == operators for the particular iterator. Assuming the iterator class has a member variable 'someVar', why is one operator implemented using "iterator i" and another with "iterator & i"? Is there any difference between the two "i.someVar" expressions?
I googled a little and found this question
Address of array - difference between having an ampersand and no ampersand
to which the answer was "the array is converted to a pointer and its value is the address of the first thing in the array." I'm not sure this is related, but it seems like the only valid explanation I could find.
Thank you!

operator= takes its argument by value (a.k.a. by copy). operator == takes its argument by const reference (a.k.a. by address, albeit with a guarantee that the object will not be modified).
An iterator may be/contain a pointer into an array but it is not itself an array.
The ampersand (&) has different contextual meanings. Used in an expression, it behaves as an operator. Used in a declaration such as iterator & i, it forms part of the type iterator & and indicates that i is a reference, as opposed to an object.
For more discussion (with pictures!), see Pass by Reference / Value in C++ and What's the difference between passing by reference vs. passing by value? (this one is language agnostic).

The assignment operator = takes the iterator i by value, which means a copy of the original iterator is made and passed to the function, so any changes applied to the iterator i inside the operator method won't affect the original.
The comparison operator == takes a constant reference, which denotes that the original object can't/shouldn't be changed in the method. This makes sense since a comparison operator usually only compares objects without changing them. The reference makes it possible to pass the original iterator, which lives outside the method, without copying it. This means that the actual object won't be copied, which is usually faster.

First, you don't have an address of an array here.
There's no semantic difference, unless you try to make a local change to the local variable i: iterator i will allow a local change, while const iterator & i will not.
Many people are used to writing const type & var for function parameters because passing by reference can be faster than by value, especially if type is big and expensive to copy, but in your case, an iterator should be small and cheap to copy, so there's no gain from avoiding copying. (Actually, having a local copy can enhance locality of reference and help optimization, so I would just pass small values by value (by copying).)
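A minimal sketch of the two parameter styles, using a hypothetical Iter type with the someVar member from the question (the function names are made up for illustration):

```cpp
#include <cassert>

// Hypothetical minimal iterator with the member named in the question.
struct Iter {
    int someVar;
};

// By value: `i` is a private copy, so changing it does not affect the caller.
int bump_copy(Iter i) {
    i.someVar += 1;
    return i.someVar;
}

// By const reference: `i` and `j` refer to the caller's objects, read-only.
bool same_position(const Iter& i, const Iter& j) {
    // i.someVar += 1;  // would not compile: i is const here
    return i.someVar == j.someVar;
}
```

In both functions the expression i.someVar reads the same kind of member; the only difference is whether i is a copy or an alias of the caller's object.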

Is a moved-from vector always empty?

I know that generally the standard places few requirements on the values which have been moved from:
N3485 17.6.5.15 [lib.types.movedfrom]/1:
Objects of types defined in the C++ standard library may be moved from (12.8). Move operations may be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.
I can't find anything about vector that explicitly excludes it from this paragraph. However, I can't come up with a sane implementation that would result in the vector being not empty.
Is there some standardese that entails this that I'm missing or is this similar to treating basic_string as a contiguous buffer in C++03?

I'm coming to this party late, and offering an additional answer because I do not believe any other answer at this time is completely correct.
Question:
Is a moved-from vector always empty?
Answer:
Usually, but no, not always.
The gory details:
vector has no standard-defined moved-from state like some types do (e.g. unique_ptr is specified to be equal to nullptr after being moved from). However the requirements for vector are such that there are not too many options.
The answer depends on whether we're talking about vector's move constructor or move assignment operator. In the latter case, the answer also depends on the vector's allocator.
vector<T, A>::vector(vector&& v)
This operation must have constant complexity. That means that there are no options but to steal resources from v to construct *this, leaving v in an empty state. This is true no matter what the allocator A is, nor what the type T is.
So for the move constructor, yes, the moved-from vector will always be empty. This is not directly specified, but falls out of the complexity requirement, and the fact that there is no other way to implement it.
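A small sketch of the move-constructor case, assuming the default std::allocator. MoveResult and move_construct_demo are hypothetical names; the emptiness of the source is the consequence of the complexity requirement argued above, not a directly quoted guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record of what the demo observed.
struct MoveResult {
    bool source_empty;
    std::size_t dest_size;
};

// Move-constructs u from v and reports the state of both afterwards.
MoveResult move_construct_demo() {
    std::vector<int> v{1, 2, 3, 4};
    std::vector<int> u(std::move(v));
    // v is still a valid object here; querying empty() is allowed.
    return MoveResult{v.empty(), u.size()};
}
```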
vector<T, A>&
vector<T, A>::operator=(vector&& v)
This is considerably more complicated. There are 3 major cases:
One:
allocator_traits<A>::propagate_on_container_move_assignment::value == true
(propagate_on_container_move_assignment evaluates to true_type)
In this case the move assignment operator will destruct all elements in *this, deallocate capacity using the allocator from *this, move assign the allocators, and then transfer ownership of the memory buffer from v to *this. Except for the destruction of elements in *this, this is an O(1) complexity operation. And typically (e.g. in most but not all std::algorithms), the lhs of a move assignment has empty() == true prior to the move assignment.
Note: In C++11 the propagate_on_container_move_assignment for std::allocator is false_type, but this has been changed to true_type for C++1y (y == 4 we hope).
In case One, the moved-from vector will always be empty.
Two:
allocator_traits<A>::propagate_on_container_move_assignment::value == false
&& get_allocator() == v.get_allocator()
(propagate_on_container_move_assignment evaluates to false_type, and the two allocators compare equal)
In this case, the move assignment operator behaves just like case One, with the following exceptions:
The allocators are not move assigned.
The decision between this case and case Three happens at run time, and case Three requires more of T, and thus so does case Two, even though case Two doesn't actually execute those extra requirements on T.
In case Two, the moved-from vector will always be empty.
Three:
allocator_traits<A>::propagate_on_container_move_assignment::value == false
&& get_allocator() != v.get_allocator()
(propagate_on_container_move_assignment evaluates to false_type, and the two allocators do not compare equal)
In this case the implementation can not move assign the allocators, nor can it transfer any resources from v to *this (resources being the memory buffer). In this case, the only way to implement the move assignment operator is to effectively:
typedef move_iterator<iterator> Ip;
assign(Ip(v.begin()), Ip(v.end()));
That is, move each individual T from v to *this. The assign can reuse both capacity and size in *this if available. For example if *this has the same size as v the implementation can move assign each T from v to *this. This requires T to be MoveAssignable. Note that MoveAssignable does not require T to have a move assignment operator. A copy assignment operator will also suffice. MoveAssignable just means T has to be assignable from an rvalue T.
If the size of *this is not sufficient, then new T will have to be constructed in *this. This requires T to be MoveInsertable. For any sane allocator I can think of, MoveInsertable boils down to the same thing as MoveConstructible, which means constructible from an rvalue T (does not imply the existence of a move constructor for T).
In case Three, the moved-from vector will in general not be empty. It could be full of moved-from elements. If the elements don't have a move constructor, this could be equivalent to a copy assignment. However, there is nothing that mandates this. The implementor is free to do some extra work and execute v.clear() if he so desires, leaving v empty. I am not aware of any implementation doing so, nor am I aware of any motivation for an implementation to do so. But I don't see anything forbidding it.
David Rodríguez reports that GCC 4.8.1 calls v.clear() in this case, leaving v empty. libc++ does not, leaving v not empty. Both implementations are conforming.
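Case Three can be reproduced with a stateful allocator that does not propagate on move assignment. The sketch below uses a hypothetical TaggedAlloc whose instances compare equal only when their ids match; every name in it is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stateful allocator. It does not define
// propagate_on_container_move_assignment, so allocator_traits
// falls back to false_type.
template <class T>
struct TaggedAlloc {
    using value_type = T;
    int id;
    explicit TaggedAlloc(int i) : id(i) {}
    template <class U>
    TaggedAlloc(const TaggedAlloc<U>& other) : id(other.id) {}
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <class T, class U>
bool operator==(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.id == b.id; }
template <class T, class U>
bool operator!=(const TaggedAlloc<T>& a, const TaggedAlloc<U>& b) { return a.id != b.id; }

using Vec = std::vector<std::string, TaggedAlloc<std::string>>;

// Sizes of destination and source after the move assignment.
struct Sizes {
    std::size_t dst;
    std::size_t src;
};

// Move-assigns between vectors whose allocators compare unequal, so the
// buffer cannot be transferred and the elements are moved one by one.
Sizes unequal_allocator_move_assign() {
    Vec src(TaggedAlloc<std::string>(1));
    src.push_back("a");
    src.push_back("b");
    src.push_back("c");
    Vec dst(TaggedAlloc<std::string>(2));
    dst = std::move(src);
    return Sizes{dst.size(), src.size()};
}
```

The destination ends up with three elements. Whether the source then reports size 0 or still holds three moved-from strings depends on the standard library, as described above.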

While it might not be a sane implementation in the general case, a valid implementation of the move constructor/assignment is just copying the data from the source, leaving the source untouched. Additionally, for the case of assignment, move can be implemented as swap, and the moved-from container might contain the old value of the moved-to container.
Implementing move as copy can actually happen if you use polymorphic allocators, as we do, and the allocator is not deemed to be part of the value of the object (and thus, assignment never changes the actual allocator being used). In this context, a move operation can detect whether both the source and the destination use the same allocator. If they use the same allocator the move operation can just move the data from the source. If they use different allocators then the destination must copy the source container.

In a lot of situations, move-construction and move-assignment can be implemented by delegating to swap - especially if no allocators are involved. There are several reasons for doing that:
swap has to be implemented anyway
developer efficiency because less code has to be written
runtime efficiency because fewer operations are executed in total
Here is an example for move-assignment. In this case, the moved-from vector will not be empty if the moved-to vector was not empty.
auto operator=(vector&& rhs) -> vector&
{
    if (/* allocator is neither move- nor swap-aware */) {
        swap(rhs);
    } else {
        ...
    }
    return *this;
}
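The same idea in a complete, compilable form, using a hypothetical wrapper class Bag rather than vector itself (this is a sketch of the swap technique, not of any real standard library implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical container wrapper whose move assignment is just a swap.
class Bag {
public:
    explicit Bag(std::vector<int> items) : items_(std::move(items)) {}
    Bag(Bag&& other) noexcept : items_(std::move(other.items_)) {}

    // After the swap, rhs holds whatever *this held before the assignment.
    Bag& operator=(Bag&& rhs) noexcept {
        items_.swap(rhs.items_);
        return *this;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;
};
```

Assigning a one-element Bag to a three-element Bag leaves the moved-from object with the three old elements, which are released when it is destroyed.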

I left comments to this effect on other answers, but had to rush off before fully explaining. The result of a moved-from vector must always be empty, or in the case of move assignment, must be either empty or the previous object's state (i.e. a swap), because otherwise the iterator invalidation rules cannot be met, namely that a move does not invalidate them. Consider:
std::vector<int> move;
std::vector<int>::iterator it;
{
std::vector<int> x(some_size);
it = x.begin();
move = std::move(x);
}
std::cout << *it;
Here you can see that iterator invalidation does expose the implementation of the move. The requirement for this code to be legal, specifically that the iterator remains valid, prevents the implementation from performing a copy, or small-object-storage or any similar thing. If a copy was made, then it would be invalidated when the original is emptied, and the same is true if the vector uses some kind of SSO-based storage. Essentially, the only reasonable possible implementation is to swap pointers, or simply move them.
Kindly view the Standard quotes on requirements for all containers:
X u(rv)
X u = rv
post: u shall be equal to the value that rv had before this construction
a = rv
a shall be equal to the value that rv had before this assignment
Iterator validity is part of the value of a container. Although the Standard does not unambiguously state this directly, we can see in, for example,
begin() returns an iterator referring to the first element in the
container. end() returns an iterator which is the past-the-end value
for the container. If the container is empty, then begin() == end();
Any implementation which actually did move from the elements of the source instead of swapping the memory would be defective, so I suggest that any Standard wording saying otherwise is a defect, not least because the Standard is not in fact very clear on this point. These quotes are from N3691.
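The buffer hand-over this answer relies on can be observed with the default allocator. The sketch below is an assumption-laden variant of the example above: it uses a raw element pointer instead of an iterator, and pointer_survives_move is a hypothetical helper name.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if a pointer to the first element, taken before the move
// assignment, still addresses the first element of the destination after
// the source vector has gone out of scope.
bool pointer_survives_move() {
    std::vector<int> dest;
    const int* first = nullptr;
    {
        std::vector<int> x(8, 42);
        first = &x[0];
        dest = std::move(x);
    }
    // The dereference is only evaluated if the addresses match.
    return first == &dest[0] && *first == 42;
}
```

With std::allocator the mainstream implementations transfer the buffer, so the addresses match. With unequal, non-propagating allocators the elements are moved individually and this would not hold.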

Do I have to return a reference to the object when overloading a pre-increment operator?

Can I use:
MyClass& MyClass::operator++ () {
    a++; // private var of MyClass
    return (*this);
}
Or can it be:
MyClass MyClass::operator++ ();
What's the difference?
Thanks for answers. I have another issue.
Many people do something like that:
MyClass& MyClass::operator++();
MyClass MyClass::operator++(int);
Isn't it illogical? Please give some examples if you can.
I know that the first version is pre-increment and the second is post-increment, but I ask why the first one returns a reference but the second one does not. It is in the same code (class), and the same use of the code.

No, you don't have to return the reference to your object when you overload the pre-increment operator. In fact you may return anything you'd like, MyClass, int, void, whatever.
This is a design issue -- you must ask yourself what is the most useful thing to the users of your class that you are able to return.
As a general rule, class operators are the most useful when they cause the least confusion, that is, when they operate the most like operators on basic types. In this case, the pre-increment operator on a basic type:
int i = 7;
int j = ++i;
increments the variable and then returns the new value. If this is the only use you want MyClass to have, then returning a copy of your class is sufficient.
But, the pre-increment operator on a basic type actually returns an lvalue. So, this is legal:
int i = 7;
int *p = &++i;
If you want to support an operation like this, you must return a reference.
Is there a specific reason that you don't want to return a reference? Is that not a well-formed concept for your particular class? If so, consider returning void. In that case, this expression: ++myObject is legal, while this myOtherObject = ++myObject is not.

You can return whatever you want. void, reference to self, copy of self, something else. Whichever you prefer (or need).
If you plan to use the ++ operator in chained expressions (like (++obj).something()) then return a reference. If you don't, then void is just fine.
Remember that in the end, operators are just like normal methods: you can do whatever you want with them, provided you respect their prototype.

For question two:
Prefix returns a reference, as expected.
Postfix returns a copy to be consistent with the behavior of the postfix operator(s).
Break it down simply to int:
int c = 0;
if(++c)
{
    // true, prefix increments prior to the test
}
c = 0;
if(c++)
{
    // false, c now == 1, but was incremented after the test
}
Implementing this behavior in a class requires a copy be returned because the postfix operator will have modified the state of the object.
If the program does not need true postfix operation, you are free of course to implement how you wish. While there are standard ways of writing these operators (that are understood by most C++ programmers), there's nothing actually stopping you from implementing this in different ways.
The argument provided about incorrect functionality surrounding (obj++)++ is not really important, as that code won't even compile for POD types (in Visual Studio 2010, at least), because for POD types, a copy is returned and that temporary copy cannot be used alone as an l-value.
However, for the prefix operator a reference is the preferred return as that allows the proper behavior for chaining the operation (++(++obj)).
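Put together, the conventional pair of operators might look like the sketch below. Counter is a hypothetical class used only to illustrate the usual signatures.

```cpp
#include <cassert>

// Hypothetical class following the conventional operator signatures.
class Counter {
public:
    explicit Counter(int start = 0) : a_(start) {}

    // Prefix: modify in place and return a reference, so ++(++c) chains.
    Counter& operator++() {
        ++a_;
        return *this;
    }

    // Postfix: return a copy of the old state, because *this has changed.
    Counter operator++(int) {
        Counter old = *this;
        ++*this;
        return old;
    }

    int value() const { return a_; }

private:
    int a_;
};
```

Writing the postfix version in terms of the prefix version keeps the two consistent, and the by-value return is what lets the caller still see the old state.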

It's not compulsory, but we should try to make operator overloading intuitive, and it should behave like the operator that is being overloaded.
If we do
int i = 10;
i++ = 0;
then the second statement is not allowed: the compiler says it requires an lvalue, because i++ denotes the older state of i, not a storage location.
++i = 0, on the other hand, works perfectly fine.
So, just to keep it in sync with the built-in operators, the prefix version has to return a reference so that its return value may be treated as an lvalue in expressions.

Yes, you should return by reference. No need for the parentheses around *this.
EDIT: Replying to your comment... You don't have to return by reference. But in general we follow some guidelines which make our classes behave "as expected" when compared to the builtin semantics of such operators. You might wanna take a look at http://www.parashift.com/c++-faq-lite/operator-overloading.html.
