C++ new operator with std::move to "copy" pointer

I recently found the following code snippet:
// To be specific: the "Item" can be viewed as "std::pair<xxx, xxx>*" here
void moveItemDuringRehash(Item* itemAddr, Item& src) {
  // This is basically *itemAddr = src; src = nullptr, but allowing
  // for fancy pointers.
  // TODO(T31574848): clean up assume-s used to optimize placement new
  assume(itemAddr != nullptr);
  new (itemAddr) Item{std::move(src)};
  src = nullptr;
  src.~Item();
}
The code originates from Facebook's Folly library. Its purpose is simple: copy the std::pair* referenced by src to the memory pointed to by itemAddr.
The implementation should be very simple, as the comment says, but the actual code is not. The new operator combined with std::move is confusing, and I am not sure what happens under the hood. My guess is that Item{std::move(src)} constructs a temporary object with the move constructor of std::pair*, and that the temporary is then copied into the object pointed to by itemAddr by the copy constructor of std::pair*. I am not sure whether this guess is correct. Thank you for sharing your opinion. I also wonder whether this use of the new operator with std::move brings any performance benefit.
Another question: why is src.~Item() needed? For safety, I need to set src (a std::pair*) to nullptr. But why do I need to call src.~Item() to destroy a nullptr?

My guess without much context:
The function implements a destructive move. It not only moves the value of the Item, but also destroys the Item object.
Similarly, it doesn't expect that there is already an Item object at the location itemAddr. Instead it constructs one.
For a scalar type like std::pair<xxx, xxx>*, there isn't really any difference between a destructive and a non-destructive move, but the function is written so that Item may also be a more complex type than a raw pointer. Types that are not actually raw pointers, but behave like them and can be used to replace raw pointers in allocators are called fancy pointers. The function is written in such a way that general allocators can be supported, including those with fancy pointers. (This isn't obvious from your shown code where no allocator type is mentioned, but that's how it looks in context of the Folly sources.)
A fancy pointer might have non-trivial construction and destruction, so that the destructive move operation and construction of a new object may behave differently than simple assignment.
new (itemAddr) Item{std::move(src)}; is a placement-new that constructs a new Item object at the location itemAddr, move-constructed from src. Move-construction is a non-destructive move, so to destruct the source object that was moved from, src.~Item(); is still required.
I am not really sure why src = nullptr; is there. The user of the function may not access src after the destructor call either way. Since C++20 this is also true for the pseudo-destructor call of scalar types. For a fancy pointer the assignment may have side effects, but I don't see why that would matter here. The only reason I could think of is that this is used defensively.
There are no temporary objects involved here.
Please note that I am not familiar with the context in the library. This is just a guess based on what you showed and a quick look at the source file.
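To make the "move-construct, then destroy the source" sequence concrete, here is a minimal sketch. It is not the Folly code: Tracked, g_live, destructiveMove and demoDestructiveMove are hypothetical names, and Tracked merely stands in for a fancy pointer with non-trivial construction and destruction. Unlike the original function, this one returns the pointer produced by placement new so the caller can use the new object directly.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

int g_live = 0;  // number of Tracked objects currently alive (illustration only)

// Hypothetical stand-in for a type with non-trivial construction/destruction.
struct Tracked {
    std::string value;

    explicit Tracked(std::string v) : value(std::move(v)) { ++g_live; }
    Tracked(Tracked&& other) noexcept : value(std::move(other.value)) { ++g_live; }
    ~Tracked() { --g_live; }
};

// Destructive move: build a new object in the raw storage at dst,
// then end the lifetime of src. No heap allocation and no temporary is involved.
Tracked* destructiveMove(void* dst, Tracked& src) {
    Tracked* moved = new (dst) Tracked{std::move(src)};  // placement new + move constructor
    src.~Tracked();  // the moved-from object still has to be destroyed
    return moved;
}

// Returns the number of live objects right after one destructive move,
// or -1 if the value did not arrive at the destination.
int demoDestructiveMove() {
    alignas(Tracked) unsigned char from[sizeof(Tracked)];
    alignas(Tracked) unsigned char to[sizeof(Tracked)];

    Tracked* src = new (from) Tracked{"payload"};  // one live object
    Tracked* dst = destructiveMove(to, *src);      // briefly two, then one again

    int liveAfterMove = g_live;
    bool valueArrived = (dst->value == "payload");

    dst->~Tracked();  // clean up the object that now lives in `to`
    return valueArrived ? liveAfterMove : -1;
}
```

After the call there is exactly one live object again, which is the point of a destructive move: the source no longer exists as an object, so the caller must not touch it.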

Related

What is the practice for handling a complex resource pointer in C++ when allocating a new version of the resource?

In most of my programming nowadays I put everything in a smart pointer and forget about it. The resource is properly managed 99.9% of the time. It's really great and way better than a garbage collection mechanism.
However, once in a while, the resource being held by a smart pointer needs to be explicitly freed before we can reallocate a new instance of it. Something like this:
r = std::make_shared<my_resource>(with_this_id);
r->do_work();
...
r->do_more_work();
...
r->do_even_more_work();
r.reset();
r = std::make_shared<my_resource>(with_this_id);
...
If I miss the r.reset() call, the resource may be using a large amount of memory or disk space, and re-allocating without first resetting is likely to cause problems on smaller computers. Either that, or the resource is locked so it can't be reallocated until explicitly freed.
Is there a pattern/algorithm/something which handles such a case in a cleaner manner?

I see basically two ways to approach this. The first is to wrap the reset-assign sequence into one function and never assign directly. I would probably do it like this:
#include <memory>
#include <utility>
template<typename T, typename... Args>
void reset_and_assign(std::shared_ptr<T>& ptr, Args&&... args) { // in-out parameter to avoid a copy, since RVO does not apply to a parameter
  ptr.reset();                                    // release the old resource first
  ptr.reset(new T(std::forward<Args>(args)...));  // then create the new one
}
This is pretty easy and fast to do, but it won't save you from accidentally calling the assignment directly, and I don't see a way to prevent that if you keep using shared_ptr.
The other alternative is writing a wrapper around shared_ptr that just forwards most function calls and changes reset and assignment such that it will first deallocate and then create the new resource. This is a bit of work, and it is easy to introduce a bug (especially if you mess up some universal references while trying to save yourself some constructors). It will also be annoying to interact with other code that uses std smart pointers, and it will be a major refactoring process. But you cannot mess it up by accidentally calling an assignment (at least probably not).
Also note that the standard library intentionally performs reset in this order (create the new resource, then release the old one) so that the old resource is not deleted if the new allocation throws.
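The ordering difference can be observed with a small counter. The following is only a sketch: Resource, g_alive, g_peak and the two functions are hypothetical names introduced for illustration, and the counter stands in for whatever memory, disk space or lock the real resource holds.

```cpp
#include <cassert>
#include <memory>

int g_alive = 0;  // resources currently alive
int g_peak = 0;   // highest number of resources alive at the same time

// Hypothetical resource that records how many instances coexist.
struct Resource {
    Resource() {
        ++g_alive;
        if (g_alive > g_peak) g_peak = g_alive;
    }
    ~Resource() { --g_alive; }
};

// Plain assignment: the new resource is created before the old one is released,
// so two instances exist for a moment.
int peakWithPlainAssignment() {
    g_alive = g_peak = 0;
    auto r = std::make_shared<Resource>();
    r = std::make_shared<Resource>();
    return g_peak;
}

// Reset first: the old resource is released before the new one is created,
// so at most one instance exists at any time.
int peakWithResetFirst() {
    g_alive = g_peak = 0;
    auto r = std::make_shared<Resource>();
    r.reset();
    r = std::make_shared<Resource>();
    return g_peak;
}
```

With plain assignment the peak is 2, with an explicit reset it is 1. The price of resetting first is the one mentioned above: if creating the new resource throws, the old one is already gone.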

Although it may sound bizarre, I believe this can be better expressed as move semantics. Here is why:
The new resource object you create is a replacement for the old one. Assume that the replacement is transparent to your users (i.e. they perceive the new resource object and the old one as the same). It is then as if you made an imagined copy of the old object to obtain the new one, and then destroyed the old one. This matches the intention of move semantics.
So my_resource should have a move constructor:
my_resource(my_resource&& old) {
  old.reset();
  /* code to init the new resource object */
}
// replace
ptr = std::make_unique<my_resource>(std::move(*ptr));

Normally you just put the reset() inside the destructor of the my_resource class.
If this isn't suitable for whatever reason, upon instantiation, you can put custom destructor function into the std::shared_ptr that calls the reset() and then deletes the resource.
If the shared_ptr doesn't reach the refcount 0 - are you sure you use it correctly? There is std::weak_ptr - a utility class for shared_ptr - made for the whole purpose of when you need both safety and timely deallocation.
Also, why use make_shared and instantiate a new my_resource instead of simply using an init function of my_resource that automatically calls reset() if needed?
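The custom-deleter idea mentioned above could look roughly like this. It is a sketch under assumptions: this my_resource, its release() member (named that way only to avoid confusion with shared_ptr::reset) and the make_resource helper are hypothetical, and the bool flag exists purely so the effect can be observed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource whose cleanup step is separate from its destructor.
struct my_resource {
    bool* released;  // set to true once release() has run (for observation only)
    explicit my_resource(bool* flag) : released(flag) {}
    void release() { *released = true; }
};

// Attach a deleter that runs the explicit cleanup and then deletes the object.
std::shared_ptr<my_resource> make_resource(bool* flag) {
    return std::shared_ptr<my_resource>(
        new my_resource(flag),
        [](my_resource* p) {
            p->release();  // explicit cleanup first
            delete p;      // then free the object itself
        });
}

// Returns true if the deleter ran only when the last owner went away.
bool deleterRunsOnLastRelease() {
    bool released = false;
    {
        auto a = make_resource(&released);
        auto b = a;                  // second owner
        a.reset();                   // one owner left, the deleter must not run yet
        if (released) return false;
    }                                // b goes out of scope, the deleter runs
    return released;
}
```

The deleter only fires when the reference count reaches zero, so this helps with timely cleanup only if no stray copies of the shared_ptr keep the resource alive.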

Cannot add a unique_ptr to a std::array

After searching through books and online until the onset of a big pain between my ears, I cannot figure out how to add a std::unique_ptr to a std::array.
The following is a class member:
std::array<std::unique_ptr<Media>, MAX_ELMS_NUM> m_collection;
In the .cpp file I am trying to add a new media pointer stuffed in a std::unique_ptr to the array:
Media* newMedia = CreateNewMedia(Media info stuff);
unique_ptr<Media> np(newMedia);
m_collection[0] = np;
Everything compiles except for the last line.

The last line is an attempt to do a copy assignment operation, which is deleted for std::unique_ptr. You need to use the move-assignment operator:
m_collection[0] = std::move(np);

You can't copy a unique_ptr, full stop. That's what "unique" means - there is only one unique_ptr pointing to the same object.
Now, you can assign a unique_ptr to a different one, but that clears the original one. Since it would be really confusing if a = b modified b, you have to specifically indicate that you want this, with std::move:
m_collection[0] = std::move(np);
After this line, m_collection[0] will point to the Media object, and np will be cleared.

std::unique_ptr can't be copied, only moved, because it is unique. You can use std::move.
The class (std::unique_ptr) satisfies the requirements of MoveConstructible and MoveAssignable, but not the requirements of either CopyConstructible or CopyAssignable.
std::move is used to indicate that an object t may be "moved from", i.e. allowing the efficient transfer of resources from t to another object.
Media* newMedia = CreateNewMedia(Media info stuff);
unique_ptr<Media> np(newMedia);
m_collection[0] = std::move(np);
                  ~~~~~~~~~
BTW: As @M.M mentioned, using std::unique_ptr::reset would solve your problem too, and more clearly, since it avoids all the temporary variables.
m_collection[0].reset(CreateNewMedia(Media info stuff));
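Both suggestions, move-assignment and reset, can be tried in a self-contained sketch. The Media struct with a single id field and the array size of 4 are assumptions made for illustration; they replace the Media class, CreateNewMedia and MAX_ELMS_NUM from the question.

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the Media class from the question.
struct Media {
    int id;
};

// Returns true if both ways of filling a slot behave as described above.
bool demoMoveIntoArray() {
    std::array<std::unique_ptr<Media>, 4> collection;  // all slots start out empty

    // Variant 1: build a unique_ptr, then move it into the slot.
    std::unique_ptr<Media> np(new Media{7});
    collection[0] = std::move(np);  // np gives up ownership and becomes null

    // Variant 2: let the slot take ownership directly, no named temporary.
    collection[1].reset(new Media{8});

    return np == nullptr
        && collection[0]->id == 7
        && collection[1]->id == 8
        && collection[2] == nullptr;
}
```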

What are the benefits and risks, if any, of using std::move with std::shared_ptr

I am in the process of learning C++11 features and as part of that I am diving head first into the world of unique_ptr and shared_ptr.
When I started, I wrote some code that used unique_ptr exclusively, and as such when I was passing my variables around I needed to accomplish that with std::move (or so I was made to understand).
I realized after some effort that I really needed shared_ptr instead for what I was doing. A quick find/replace later and my pointers were switched over to shared but I lazily just left the move() calls in.
To my surprise, not only did this compile, but it behaved perfectly well in my program and I got every ounce of functionality I was expecting... particularly, I was able to "move" a shared_ptr from ObjectA to ObjectB, and both objects had access to it and could manipulate it. Fantastic.
This raised the question for me though... is the move() call actually doing anything at all now that I am on shared_ptr? And if so, what, and what are the ramifications of it?
Code Example
shared_ptr<Label> lblLevel(new Label());
//levelTest is shared_ptr<Label> declared in the interface of my class, undefined to this point
levelTest = lblLevel;
//Configure my label with some redacted code
//Pass the label off to a container which stores the shared_ptr in an std::list
//That std::list is iterated through in the render phase, rendering text to screen
this->guiView.AddSubview(move(lblLevel));
At this point, I can make important changes to levelTest like changing the text, and those changes are reflected on screen.
This to me makes it appear as though both levelTest and the shared_ptr in the list are the same pointer, and move() really hasn't done much. This is my amateur interpretation. Looking for insight. Using MinGW on Windows.

ecatmur's answer explains the why of things behaving as you're seeing in a general sense.
Specifically to your case, levelTest is a copy of lblLevel, which creates an additional owning reference to the shared resource. You moved from lblLevel, so levelTest is completely unaffected and its ownership of the resource stays intact.
If you looked at lblLevel I'm sure you'd see that it's been set to an empty value. Because you made a copy of the shared_ptr before you moved from it, both of the existing live instances of the pointer (levelTest and the value in guiView) should reference the same underlying pointer (their get method returns the same value) and there should be at least two references (their use_count method should return 2, or more if you made additional copies).
The whole point of shared_ptr is to enable things like you're seeing while still allowing automatic cleanup of resources when all the shared_ptr instances are destructed.

When you move-construct or move-assign from a shared pointer of convertible type, the source pointer becomes empty, per 20.7.2.2.1:
22 - Postconditions: *this shall contain the old value of r. r shall be empty. r.get() == 0.
So if you are observing that the source pointer is still valid after a move-construct or move-assignment, then either your compiler is incorrect or you are using std::move incorrectly.
For example:
std::shared_ptr<int> p = std::make_shared<int>(5);
std::shared_ptr<int> q = std::move(p);
assert(p.get() == nullptr);

If you copy a shared_ptr, the reference count to the pointer target is incremented (in a thread-safe way).
Instead, when you move a shared_ptr from A to B, B contains the copy of the state of A before the move, and A is empty. There was no thread-safe reference count increment/decrement, but some very simple and inexpensive pointer exchange between the internal bits of A and B.
You can think of a move as an efficient way of "stealing resources" from the source of the move to the destination of the move.
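The copy-then-move situation from the question can be reproduced in a few lines. This is a sketch with hypothetical names: Label is reduced to a single int, stored plays the role of the shared_ptr kept inside guiView, and Snapshot only exists to report what was observed.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical, minimal stand-in for the Label class from the question.
struct Label {
    int text = 0;
};

// What the demo observed.
struct Snapshot {
    long countAfterCopy;  // use_count after levelTest = lblLevel
    long countAfterMove;  // use_count after moving lblLevel away
    bool sourceEmpty;     // is lblLevel empty after the move?
    bool sameObject;      // do levelTest and the stored pointer agree?
};

Snapshot demoCopyThenMove() {
    auto lblLevel = std::make_shared<Label>();

    std::shared_ptr<Label> levelTest = lblLevel;           // copy: a second owner
    long afterCopy = lblLevel.use_count();

    std::shared_ptr<Label> stored = std::move(lblLevel);   // move: ownership transferred
    long afterMove = stored.use_count();

    levelTest->text = 42;  // a change made through one owner is visible through the other

    return Snapshot{afterCopy, afterMove, lblLevel == nullptr,
                    levelTest.get() == stored.get() && stored->text == 42};
}
```

The count is 2 both before and after the move: the move transfers one of the two ownerships without touching the reference count, and only the moved-from lblLevel ends up empty.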

Why have move semantics?

Let me preface by saying that I have read some of the many questions already asked regarding move semantics. This question is not about how to use move semantics, it is asking what the purpose of it is - if I am not mistaken, I do not see why move semantics is needed.
Background
I was implementing a heavy class, which, for the purposes of this question, looked something like this:
class B;
class A
{
private:
    std::array<B, 1000> b;
public:
    // ...
};
When it came time to make a move assignment operator, I realized that I could significantly optimize the process by changing the b member to std::array<B, 1000> *b; - then movement could just be a deletion and pointer swap.
This led me to the following thought: now, shouldn't all non-primitive type members be pointers to speed up movement (corrected below [1] [2]) (there is a case to be made for cases where memory should not be dynamically allocated, but in these cases optimizing movement is not an issue since there is no way to do so)?
Here is where I had the following realization - why create a class A which really just houses a pointer b so swapping later is easier when I can simply make a pointer to the entire A class itself. Clearly, if a client expects movement to be significantly faster than copying, the client should be OK with dynamic memory allocation. But in this case, why does the client not just dynamically allocate the whole A class?
The Question
Can't the client already take advantage of pointers to do everything move semantics gives us? If so, then what is the purpose of move semantics?
Move semantics:
std::string f()
{
    std::string s("some long string");
    return s;
}
int main()
{
    // super-fast pointer swap!
    std::string a = f();
    return 0;
}
Pointers:
std::string *f()
{
    std::string *s = new std::string("some long string");
    return s;
}
int main()
{
    // still super-fast pointer swap!
    std::string *a = f();
    delete a;
    return 0;
}
And here's the strong assignment that everyone says is so great:
template<typename T>
T& strong_assign(T *&t1, T *&t2)
{
    delete t1;
    // super-fast pointer swap!
    t1 = t2;
    t2 = nullptr;
    return *t1;
}
#define rvalue_strong_assign(a, b) (auto ___##b = b, strong_assign(a, &___##b))
Fine - the latter in both examples may be considered "bad style" - whatever that means - but is it really worth all the trouble with the double ampersands? If an exception might be thrown before delete a is called, that's still not a real problem - just make a guard or use unique_ptr.
Edit [1] I just realized this wouldn't be necessary with classes such as std::vector which use dynamic memory allocation themselves and have efficient move methods. This just invalidates a thought I had - the question below still stands.
Edit [2] As mentioned in the discussion in the comments and answers below this whole point is pretty much moot. One should use value semantics as much as possible to avoid allocation overhead since the client can always move the whole thing to the heap if needed.

I thoroughly enjoyed all the answers and comments! And I agree with all of them. I just wanted to stick in one more motivation that no one has yet mentioned. This comes from N1377:
Move semantics is mostly about performance optimization: the ability
to move an expensive object from one address in memory to another,
while pilfering resources of the source in order to construct the
target with minimum expense.
Move semantics already exists in the current language and library to a
certain extent:
copy constructor elision in some contexts
auto_ptr "copy"
list::splice
swap on containers
All of these operations involve transferring resources from one object
(location) to another (at least conceptually). What is lacking is
uniform syntax and semantics to enable generic code to move arbitrary
objects (just as generic code today can copy arbitrary objects). There
are several places in the standard library that would greatly benefit
from the ability to move objects instead of copy them (to be discussed
in depth below).
I.e. in generic code such as vector::erase, one needs a single unified syntax to move values to plug the hole left by the erased value. One can't use swap because that would be too expensive when the value_type is int. And one can't use copy assignment as that would be too expensive when value_type is A (the OP's A). Well, one could use copy assignment, after all we did in C++98/03, but it is ridiculously expensive.
shouldn't all non-primitive type members be pointers to speed up movement
This would be horribly expensive when the member type is complex<double>. Might as well color it Java.

Your example gives it away: your code is not exception-safe, and it makes use of the free-store (twice), which can be nontrivial. To use pointers, in many/most situations you have to allocate stuff on the free store, which is much slower than automatic storage, and does not allow for RAII.
They also let you more efficiently represent non-copyable resources, like sockets.
Move semantics aren't strictly necessary, as you can see from the fact that C++ existed for a while without them. They are simply a better way to represent certain concepts, and an optimization.

Can't the client already take advantage of pointers to do everything move semantics gives us? If so, then what is the purpose of move semantics?
Your second example gives one very good reason why move semantics is a good thing:
std::string *f()
{
    std::string *s = new std::string("some long string");
    return s;
}
int main()
{
    // still super-fast pointer swap!
    std::string *a = f();
    delete a;
    return 0;
}
Here, the client has to examine the implementation to figure out who is responsible for deleting the pointer. With move semantics, this ownership issue won't even come up.
If an exception might be thrown before delete a is called, that's still not a real problem just make a guard or use unique_ptr.
Again, the ugly ownership issue shows up if you don't use move semantics. By the way, how would you implement unique_ptr without move semantics?
I know about auto_ptr and there are good reasons why it is now deprecated.
is it really worth all the trouble with the double ampersands?
True, it takes some time to get used to it. After you are familiar and comfortable with it, you will be wondering how you could live without move semantics.

Your string example is great. The short string optimization means that short std::strings do not exist in the free store: instead they exist in automatic storage.
The new/delete version means that you force every std::string into the free store. The move version only puts large strings into the free store, and small strings stay (and are possibly copied) in automatic storage.
On top of that, your pointer version lacks exception safety, as it has non-RAII resource handles. Even if you do not use exceptions, naked pointer resource owners basically force single-exit-point control flow to manage cleanup. Beyond that, naked pointer ownership leads to resource leaks and dangling pointers.
So the naked pointer version is worse in piles of ways.
move semantics means you can treat complex objects as normal values. You move when you do not want duplicate state, and copy otherwise. Nearly normal types that cannot be copied can expose move only (unique_ptr), others can optimize for it (shared_ptr). Data stored in containers, like std::vector, can now include abnormal types because it is move aware. The std::vector of std::vector goes from ridiculously inefficient and hard to use to easy and fast at the stroke of a standard version.
Pointers place the resource management overhead into the clients, while good C++11 classes handle that problem for you. move semantics makes this both easier to maintain, and far less error prone.
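As a small illustration of move-only values living in a standard container, consider the following sketch. The names makeName and demoMoveOnlyInVector are hypothetical; the string literal is borrowed from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Factory that hands ownership to the caller. The return type alone
// documents who is responsible for the object.
std::unique_ptr<std::string> makeName() {
    return std::unique_ptr<std::string>(new std::string("some long string"));
}

// Returns the number of stored elements, or 0 if the moved-from pointer
// unexpectedly still owned something.
std::size_t demoMoveOnlyInVector() {
    std::vector<std::unique_ptr<std::string>> names;

    names.push_back(makeName());        // a temporary is moved in implicitly

    auto extra = makeName();
    names.push_back(std::move(extra));  // a named object needs an explicit std::move

    // No delete anywhere: the vector's destructor releases every string.
    return extra ? 0 : names.size();
}
```

A std::vector of std::unique_ptr only works because the container moves its elements instead of copying them, for example when it grows and relocates its storage.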

Sequencing of the copying when passing by value in C++

In C++, when passing an object by value, are there restrictions on when the copy takes place?
I have the following code (simplified):
class A;
class Parent
{
public:
void doSomething(std::auto_ptr<A> a); // meant to transfer ownership.
};
std::auto_ptr<A> a = ...;
a->getParent()->doSomething(a);
It acts like:
std::auto_ptr<A> a = ...;
std::auto_ptr<A> copy(a);
a->getParent()->doSomething(copy);
Which will obviously segfault since a is now referencing NULL.
And not like:
std::auto_ptr<A> a = ...;
Parent* p = a->getParent();
p->doSomething(a);
Is this expected?

A: auto_ptr is deprecated in newer versions of C++, I recommend checking out unique_ptr.
B: This behavior is expected. An auto_ptr owns the object it manages. So when ownership is properly transferred from one auto_ptr to another, the original auto_ptr ends up holding a null pointer. I believe this logic is handled by std::auto_ptr itself, so you shouldn't have to do anything special to get this behavior. If two auto_ptrs were allowed to manage the same object, both would try to free the memory for this object when they went out of scope. This is bad in itself, but even worse, if one of these auto_ptrs had a broader scope, it could attempt to reference memory that no longer held the object in question because it had since been freed by the other auto_ptr, and then we have true chaos. Hence, when ownership is transferred, the original pointer's managed object is set to null, and we have the illusion of safety. :)

From my point of view, the example is not good because of at least three reasons.
1) Looking at the code without seeing the prototype of doSomething, it is not clear that the ownership can change.
2) If there is even the slightest chance that the result depends on the order of evaluation, the code is non-portable or implementation-dependent, and so not acceptable.
3) Even if the order of evaluation is right, the code can raise this exact question from other developers and will waste their time. Readability must be of the highest priority.
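A way to sidestep all three objections is to make both the ordering and the ownership transfer explicit. The sketch below uses std::unique_ptr instead of the deprecated std::auto_ptr, which is a substitution and not the code from the question; the minimal A and Parent definitions and demoExplicitOrdering are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Parent;  // forward declaration, A only stores a pointer to it

// Hypothetical minimal versions of the classes from the question.
struct A {
    explicit A(Parent* p) : parent(p) {}
    Parent* getParent() const { return parent; }
    Parent* parent;
};

struct Parent {
    // Taking the unique_ptr by value makes the ownership transfer visible in the prototype.
    void doSomething(std::unique_ptr<A> a) { child = std::move(a); }
    std::unique_ptr<A> child;
};

// Returns true if ownership ended up in the parent and the source is empty.
bool demoExplicitOrdering() {
    Parent parent;
    std::unique_ptr<A> a(new A(&parent));

    Parent* p = a->getParent();    // read through `a` while it still owns the object
    p->doSomething(std::move(a));  // then hand ownership over, in a separate statement

    return a == nullptr
        && parent.child != nullptr
        && parent.child->getParent() == &parent;
}
```

Because the parent pointer is fetched in its own statement, the result no longer depends on how the compiler orders the evaluation within a single call expression, and the std::move at the call site tells the reader that a is given away.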
