Does std::move invalidate pointers?

Assume the following:
template<typename Item>
class Pipeline
{
    [...]
    void connect(OutputSide<Item> first, InputSide<Item> second)
    {
        Queue<Item> queue;
        first.setOutputQueue(&queue);
        second.setInputQueue(&queue);
        queues.push_back(std::move(queue));
    }
    [...]
    std::vector<Queue<Item> > queues;
};
Will the pointers to queue still work in "first" and "second" after the move?

Does std::move invalidate pointers?
No. An object still exists after being moved from, so any pointers to that object are still valid. If the Queue is sensibly implemented, then moving from it should leave it in a valid state (i.e. it's safe to destroy or reassign it); but may change its state (perhaps leaving it empty).
Will the pointers to queue still work in "first" and "second" after the move?
No. They will point to the local object that's been moved from; as described above, you can't make any assumptions about that object's state after the move.
Much worse than that is that when the function returns, it's destroyed, leaving the pointers dangling. They are now invalid, not pointing to any object, and using them will give undefined behaviour.
Perhaps you want them to point to the object that's been moved into queues:
queues.push_back(std::move(queue));
first.setOutputQueue(&queues.back());
second.setInputQueue(&queues.back());
but, since queues is a vector, those pointers will be invalidated when the vector next reallocates its memory.
To fix that problem, use a container like deque or list which doesn't move its elements after insertion. Alternatively, at the cost of an extra level of indirection, you could store (smart) pointers rather than objects, as described in Danvil's answer.
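As a minimal sketch of the deque suggestion: the Queue struct below is a hypothetical stand-in for the question's Queue<Item>, and first_element_stays_put is an illustrative helper, not part of the original code. It relies on the fact that std::deque::push_back invalidates iterators but not pointers or references to existing elements.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for the question's Queue<Item>.
struct Queue {
    int id;
};

// Takes a pointer to the first element, appends `extra` more elements,
// and reports whether the pointer still refers to the first element.
bool first_element_stays_put(std::size_t extra) {
    std::deque<Queue> queues;
    queues.push_back(Queue{0});
    Queue* first = &queues.back();  // pointer handed out to "first"/"second"
    for (std::size_t i = 1; i <= extra; ++i) {
        queues.push_back(Queue{static_cast<int>(i)});  // never relocates existing elements
    }
    return first == &queues.front() && first->id == 0;
}
```

The same loop with a std::vector would leave the pointer dangling as soon as the vector outgrew its capacity.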

The pointers will not work, because queue is a local object which will be destroyed at the end of connect. Even by using std::move you still create a new object at a new memory location. It will just try to reuse as much as possible from the "old" object.
Additionally, the whole thing will not work at all, independent of using std::move, as push_back possibly has to reallocate. Thus a call to connect may invalidate all your old pointers.
A possible solution is creating Queue objects on the heap. The following suggestion uses C++11:
#include <memory>
template<typename Item>
class Pipeline
{
    [...]
    void connect(OutputSide<Item> first, InputSide<Item> second)
    {
        auto queue = std::make_shared<Queue<Item>>();
        first.setOutputQueue(queue);
        second.setInputQueue(queue);
        queues.push_back(queue);
    }
    [...]
    std::vector<std::shared_ptr<Queue<Item>>> queues;
};

Others provided nice and detailed explanations, but your question indicates that you do not understand fully what move does, or what it is designed to do. I'll try to describe it in simple words.
move, as the name implies, is meant to move things. But what can be moved? You cannot move an object once it is allocated somewhere. Recall the move constructor that was added recently, which resembles the copy constructor.
So, it's all about the "contents" of an object. Both copy- and move-constructors are meant to operate on the "contents". So is the std::move. It is meant to move the contents of one object into the target, and in contrast to the copy, it is meant to leave no trace of contents in the original location.
It is meant to be used wherever something forces us to make a copy that we don't really care about and would rather omit, when all we want is to have the contents in the target place.
That is, the usage of "move" indicates that a copy will be made, contents will be moved there, and original will be cleared (sometimes some steps might be skipped, but still, that's the basic idea of a move).
This clearly indicates that, even if the original survives, any pointers to the original object will not point to the destination, which received the content. At best, they will point to the original thing that was just 'cleared'. At worst, it will point to a completely unusable thing.
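A small sketch of that point, using std::unique_ptr because its moved-from state is specified (it becomes null). MoveResult and move_and_inspect are hypothetical names for illustration only; std::make_unique requires C++14.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct MoveResult {
    bool pointer_still_targets_source;  // the pointer did not follow the contents
    bool source_is_empty;               // the source was emptied by the move
    int value_in_destination;           // the contents now live in the destination
};

MoveResult move_and_inspect() {
    std::unique_ptr<int> source = std::make_unique<int>(42);
    std::unique_ptr<int>* pointer = &source;               // taken before the move
    std::unique_ptr<int> destination = std::move(source);  // contents change owner
    return MoveResult{pointer == &source, *pointer == nullptr, *destination};
}
```

The pointer taken before the move keeps pointing at the emptied source object; nothing redirects it to the destination.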
Now look at your code. It fills the queue and then takes the pointer to the original, then moves the queue.
I hope that's clear now what's happening and what you can do with it.

Related

Sharing a `std::list` without adding a (redundant) reference to it

I have a container, let's say a std::list<int>, which I would like to share between objects. One of the objects is known to live longer than the others, so it will hold the container. In order to be able to access the list, the other objects may have a pointer to the list.
Since the holder object might get moved, I'll need to wrap the list with a unique_ptr:
class LongLiveHolder { std::unique_ptr<std::list<int>> list; };
class ShortLiveObject { std::list<int>& list; };
However, I don't really need the unique_ptr wrapper. Since the list probably just contains a [unique_ptr] pointer to the first node (and a pointer to the last node), I could, theoretically, have those pointers at the other objects:
class LongLiveHolder { std::unique_ptr<NonExistentListNode<int>> back; };
class ShortLiveObject { NonExistentListNode<int>& back; };
, which would save me a redundant dereference when accessing the list, except that I would no longer have the full std::list interface to use with the shorter-lived object- just the node pointers.
Can I somehow get rid of this extra layer of indirection, while still having the std::list interface in the shorter-lived object?

Preface
You may be overthinking the cost of the extra indirection from the std::unique_ptr (unless you have a lot of these lists and you know that usages of them will be frequent and intermixed with other procedures). In general, I'd first trust my compiler to do smart things. If you want to know the cost, do performance profiling.
The main purpose of the std::unique_ptr in your use-case is just to have shared data with a stable address when other data that reference it gets moved. If you use the list member of the long-lived object multiple times in a single procedure, you can possibly help your compiler to help you (and also get some nicer-to-read code) when you use the list through the long-lived object by making a variable in the scope of the procedure that stores a reference to the std::list pointed to by the std::unique_ptr like:
void fn(LongLiveHolder& holder) {
    auto& list {*holder.list}; // dereference the unique_ptr once; get() would yield a pointer, not a reference
    list.<some_operation_1>(...);
    list.<some_operation_2>(...);
    list.<some_operation_3>(...);
}
But again, you should inspect the generated machine code and do performance profiling if you really want to know what kind of difference it makes.
If Context Permits, Write your own List
You said:
However, I don't really need the unique_ptr wrapper. Since the list probably just contains a [unique_ptr] pointer to the first node (and a pointer to the last node), I could, theoretically, have those pointers at the other objects: [...]
Considering Changes in what is the First Node
What if the first node of the list is allowed to be deleted? What if a new node is allowed to be inserted at the beginning of the list? You'd need a very specific context for those to not be requirements. What you want in your short-lived object is a view abstraction that supports the same interface as the actual list but doesn't manage the lifetime of the list contents. If you implement the view abstraction as a pointer to the list's first node, then how will the view object know about changes to what the "real"/lifetime-managing list considers to be the first node? It can't, unless the lifetime-managing list keeps an internal list of all live views of itself and also updates those (which is itself a performance and space overhead). Even then, what about the reverse? If the view abstraction was used to change what's considered the first node, how would the lifetime-managing list know about that change? The simplest sane solution is to have an extra level of indirection: make the view point to the list instead of to what was the list's first node when the view was created.
Considering Requirements on Time Complexity of getting the list size
I'm pretty sure a std::list can't just hold pointers to front and back nodes. For one thing, since c++11 requires that std::list::size() is O(1), std::list probably has to keep track of its size at all times in a counter member- either storing it in itself, or doing some kind of size-tracking in each node struct, or some other implementation-defined behaviour. I'm pretty sure the simplest and most performant way to have multiple moveable references (non-const pointers) to something that needs to do this kind of bookkeeping is to just add another level of indirection.
You could try to "skip" the indirection layer required by the bookkeeping for specific cases that don't require that information, which is the iterators/node-pointers approach, which I'll comment on later. I can't think of a better place or way to store that bookkeeping other than with the collection itself. Ie. If the list interface has requirements that require such bookkeeping, an extra layer of indirection for each user of the list implementation has a very strong design rationale.
If Context Permits
If you don't care about having O(1) to get the size of your list, and you know that what is considered the first node will not change for the lifetime of the short-lived object, then you can write your own list class and list-view class and make your own context-specific optimizations. That's one of the big selling points of languages like C++: you get a nice standard library that does commonly useful things, and when you have a specific scenario where some features of those tools aren't required and are resulting in unnecessary overhead, you can build your own tool/abstraction (or possibly use someone else's library).
Commentary on std::unique_ptr + reference
Your first snippet works, but you can probably get better implicitly-declared special member functions for ShortLiveObject by using std::reference_wrapper, since the implicitly-declared copy-assignment operator and default constructor are defined as deleted when there's a reference member.
class LongLiveHolder { std::unique_ptr<std::list<int>> list; };
class ShortLiveObject { std::reference_wrapper<std::list<int>> list; };
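A sketch of what the std::reference_wrapper variant buys, under some assumptions: the members are made public (structs) so the example can initialize them directly, RefMember is a hypothetical type added only for comparison, and view_survives_holder_move is an illustrative helper. It needs C++14 for std::make_unique.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <memory>
#include <type_traits>
#include <utility>

struct LongLiveHolder {
    std::unique_ptr<std::list<int>> list = std::make_unique<std::list<int>>();
};

// Comparison type (assumption): a plain reference member.
struct RefMember {
    std::list<int>& list;
};

struct ShortLiveObject {
    std::reference_wrapper<std::list<int>> list;
};

// A reference member deletes copy assignment; reference_wrapper keeps it.
static_assert(!std::is_copy_assignable<RefMember>::value, "reference member blocks assignment");
static_assert(std::is_copy_assignable<ShortLiveObject>::value, "reference_wrapper is rebindable");

// Moving the holder moves only the unique_ptr, so the list keeps its address
// and the short-lived object's reference stays valid.
bool view_survives_holder_move() {
    LongLiveHolder holder;
    ShortLiveObject view{*holder.list};
    LongLiveHolder moved = std::move(holder);
    view.list.get().push_back(7);
    return &view.list.get() == moved.list.get() && moved.list->front() == 7;
}
```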
Commentary on std::shared_ptr + std::weak_ptr
Like @Adrian Maire suggested, a std::shared_ptr in the longer-lived object (which might move while the shorter-lived object exists) and a std::weak_ptr in the shorter-lived object is a working approach, but it probably has more overhead (at least coming from the ref-count) than using std::unique_ptr + a reference, and I can't think of any generalized pros, so I wouldn't suggest it unless you already had some other reason to use a std::shared_ptr. In the scenario you gave, I'm pretty sure you do not.
Commentary on Storing iterators/node-pointers in the short-lived object
@Daniel Langr already commented about this, but I'll try to expand.
Specifically for std::list, there is a possible standard-compliant solution (with several caveats) that doesn't have the extra indirection of the smart pointer. Caveats:
You must be okay with only having an iterator interface for the shorter-lived object (which you indicated that you are not).
The front and back iterators must be stable for the lifetime of the shorter-lived object. (the iterators should not be deleted from the list, and the shorter-lived object won't see new list entries that are pushed to the front or back by someone using the longer-lived object).
From cppreference.com's page for std::list's constructors:
After container move construction (overload (8)), references, pointers, and iterators (other than the end iterator) to other remain valid, but refer to elements that are now in *this. The current standard makes this guarantee via the blanket statement in [container.requirements.general]/12, and a more direct guarantee is under consideration via LWG 2321.
From cppreference.com's page for std::list:
Adding, removing and moving the elements within the list or across several lists does not invalidate the iterators or references. An iterator is invalidated only when the corresponding element is deleted.
But I am not a language lawyer. I could be missing something important.
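The move-construction guarantee quoted above can be sketched as follows; iterator_follows_elements is a hypothetical helper name used only for this illustration.

```cpp
#include <cassert>
#include <list>
#include <utility>

bool iterator_follows_elements() {
    std::list<int> original{1, 2, 3};
    std::list<int>::iterator it = original.begin();  // refers to the element 1
    std::list<int> moved = std::move(original);      // move construction
    // The iterator was not invalidated: it now refers to an element of `moved`.
    return *it == 1 && &*it == &moved.front();
}
```

Note that this covers iterators to elements only; the end iterator is explicitly excluded from the guarantee.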
Also, you replied to Daniel saying:
Some iterators get invalid when moving the container (e.g. insert_iterator) @DanielLangr
Yes, so if you want to be able to make std::insert_iterators, use the std::unique_ptr + reference approach and construct short-lived std::insert_iterators when needed instead of trying to store long-lived ones.

If the list owner will be moved, then you need some memory address to share somehow.
You already indicated the unique_ptr. It's a decent solution if the non-owners don't need to save it internally.
The std::shared_ptr is an obvious alternative.
Finally, you can have a std::shared_ptr in the owner object, and pass std::weak_ptr to non-owners.
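A minimal sketch of that last option. Owner and NonOwner are hypothetical names, and the members are public only to keep the example short; aggregate initialization with a default member initializer needs C++14.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <utility>

struct Owner {
    std::shared_ptr<std::list<int>> list = std::make_shared<std::list<int>>();
};

struct NonOwner {
    std::weak_ptr<std::list<int>> list;

    // Appends to the shared list; returns false if the list no longer exists.
    bool append(int value) const {
        if (std::shared_ptr<std::list<int>> locked = list.lock()) {
            locked->push_back(value);
            return true;
        }
        return false;
    }
};
```

Moving the Owner only transfers the std::shared_ptr, so the std::weak_ptr keeps observing the same list; once the owner releases it, lock() returns an empty pointer instead of dangling.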

Ideas on how to track boost::intrusive_ptr's

I use boost::intrusive_ptr a lot to keep instances of certain classes alive.
At some point my program expects all boost::intrusive_ptrs to have been deleted,
so that the underlying object is released.
A problem that I seem to frequently run into lately is that not all boost::intrusive_ptrs are deleted and said object is still "alive". This is a bug.
I need to be able to detect where those pointers are. Getting a list of the source-file/line numbers of where they were created is good enough (for example, if the debug output tells me that a pointer is still alive that was created here:
foo.m_ptr = foo::create<Foo>();
where m_ptr is a boost::intrusive_ptr<Foo> and/or boost::intrusive_ptr<Foo const> then
I can figure out that the pointer that still wasn't destructed is Foo::m_ptr ;).
Of course, in this case, m_ptr was already created before - maybe even assigned. So what is important here is not to track the constructor of boost::intrusive_ptr but rather the place where the reference count is (last) incremented.
The above assignment function looks as follows:
intrusive_ptr<T>& operator=(intrusive_ptr<T> const& rhs)
{
    intrusive_ptr<T>(rhs).swap(*this);
    return *this;
}
Hence, this copy-constructs a temporary from rhs, incrementing the reference count of the object that rhs points to, then swaps that with *this so that the temporary now points to what *this pointed to before and the current object points to what rhs points to. Finally the temporary is destructed causing a decrement on the ref count of what *this pointed to.
Getting the call address from the intrusive_ptr_add_ref would hence require unwinding the stack twice (if not three times), possibly depending on optimization level.
As such I think it cannot be avoided to replace boost::intrusive_ptr with my own class, let's say utils::intrusive_ptr, which would implement the above function as follows:
[[gnu::noinline]] intrusive_ptr<T>& operator=(intrusive_ptr<T> const& rhs)
{
    intrusive_ptr<T>(rhs, __builtin_return_address(0)).swap(*this);
    return *this;
}
which uses [[gnu::noinline]] to be sure that __builtin_return_address(0) returns the return address of the current function. This return address is passed to the constructor to register it (the return address can easily be converted to a call location later on, i.e. when printing the list of still existing pointers).
At the same time it is required to remove the location that was stored for rhs.
It seems to me that the best way to make sure of this is to literally store the location in the intrusive_ptr itself. This location then would also be swapped by swap and thus destructed when leaving this scope.
Registration of existing intrusive_ptrs can be done by their address: every intrusive_ptr object has an address (its this pointer) that is unique while the object exists and can be used as a key in a map. Upon destruction this key is simply removed again.
This way, using the map, there is a list of all existing intrusive_ptr objects, which store their own location.
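A minimal sketch of that registration scheme, under stated assumptions: TrackedHandle is a hypothetical stand-in for the planned utils::intrusive_ptr, it holds no pointee at all, it uses a std::set of addresses rather than a map, and it takes the location as a plain string instead of __builtin_return_address(0) so that it stays portable.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the tracked pointer; only the bookkeeping is shown.
class TrackedHandle {
public:
    explicit TrackedHandle(std::string location) : location_(std::move(location)) {
        live().insert(this);  // register by address
    }
    // "Copy" that records where the copy was made (assumed signature).
    TrackedHandle(const TrackedHandle&, std::string location) : location_(std::move(location)) {
        live().insert(this);
    }
    TrackedHandle(const TrackedHandle&) = delete;             // force callers to supply a location
    TrackedHandle& operator=(const TrackedHandle&) = delete;
    ~TrackedHandle() { live().erase(this); }                  // unregister on destruction

    // Mirrors the copy-and-swap assignment: the temporary carries the new
    // location in, takes the old one out, and unregisters itself when destroyed.
    TrackedHandle& assign(const TrackedHandle& rhs, std::string location) {
        TrackedHandle(rhs, std::move(location)).swap(*this);
        return *this;
    }
    void swap(TrackedHandle& other) noexcept { location_.swap(other.location_); }

    const std::string& location() const { return location_; }

    // Addresses of all handles that currently exist.
    static std::set<const TrackedHandle*>& live() {
        static std::set<const TrackedHandle*> registry;
        return registry;
    }

private:
    std::string location_;
};
```

Iterating over TrackedHandle::live() and printing each location() would give the list of still existing pointers described above.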
Is my reasoning correct here? Or is there another way to do this?
EDIT:
I added a new class that is basically a copy of boost::intrusive_ptr, but stripped of anything boost and that only supports c++17 (and up). I didn't compile it yet, but this should serve as the basis(?). See https://github.com/CarloWood/ai-utils/blob/master/intrusive_ptr.h

Is it okay to "Move" an object from a queue, if you're about to pop from it?

I've been working on a parser for commands (which are fancy wrappers around large arrays of data), and have a queue that unhandled commands reside on. If I need a command, I query it with code like this:
boost::optional<command> get_command() {
    if (!has_command()) return boost::none; // empty optional: no command available
    else {
        boost::optional<command> comm(command_feed.front()); //command_feed is declared as a std::queue<command>
        command_feed.pop();
        return comm;
    }
}
The problem is, these commands could be megabytes in size, under the right circumstances, and need to parse pretty quickly. My thought was that I could optimize the transferal to a move like so:
boost::optional<command> get_command() {
    if (!has_command()) return boost::none; // empty optional: no command available
    else {
        boost::optional<command> comm(std::move(command_feed.front())); //command_feed is declared as a std::queue<command>
        command_feed.pop();
        return comm;
    }
}
And it seems to work for this specific case, but can this be used as a general purpose solution to any properly maintained RAII object, or should I be doing something else?

Yes, this is perfectly safe:
std::queue<T> q;
// add stuff...
T top = std::move(q.front());
q.pop();
pop() doesn't have any preconditions on the first element in the q having a specified state, and since you're not subsequently using q.front() you don't have to deal with that object being invalidated any more.
Sounds like a good idea to do!
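The pattern can be wrapped in a small generic helper. This is a sketch, not the asker's code: take_front is a hypothetical name, and it uses std::optional (C++17) instead of boost::optional so that it is self-contained.

```cpp
#include <cassert>
#include <optional>
#include <queue>
#include <string>
#include <utility>

// Removes and returns the front element, or an empty optional if the queue is empty.
template <typename T>
std::optional<T> take_front(std::queue<T>& feed) {
    if (feed.empty()) {
        return std::nullopt;
    }
    std::optional<T> result(std::move(feed.front()));  // move the element out instead of copying it
    feed.pop();                                        // destroys the moved-from element
    return result;
}
```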

It depends on what the move constructor for your type does. If it leaves the original object in a state that can safely be destroyed, then all is well. If not, then you may be in trouble. Note that the comments about preconditions and valid states are about constraints on types defined in the standard library. Types that you define do not have those constraints, except to the extent that they use types from the standard library. So look at your move constructor to sort out what you can and can't do with a moved-from object.

Yes, as long as your std::queue's container template argument ensures that there are no preconditions on the state of its contained values for pop_front(); the default for std::queue is std::deque, and that offers the guarantee.
As long as you ensure what I wrote in the previous paragraph, you are completely safe. You're about to remove that item from your queue, thus there is no reason not to move it out since you are taking ownership of that object.

Moving an object may leave it in an invalid state; its invariants are no longer guaranteed. You would be safe popping it from a non-intrusive queue.
std::move itself does nothing other than tell the compiler that it can select a constructor for comm that takes an rvalue.
A well-written move constructor would then steal the representation from the old object for the new object. For instance, it would just copy the pointers to the new object and zero the pointers in the old object (that way the old object's destructor won't destroy the arrays).
If the type has no such constructor, there will not be any benefit to std::move.
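A minimal sketch of such a "stealing" move constructor. Buffer is a hypothetical class standing in for the command type's large array; it is not taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(new int[size]()), size_(size) {}

    // Move constructor: copy the pointer, then zero it in the source so the
    // source's destructor does not free the array that now belongs to *this.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    ~Buffer() { delete[] data_; }  // deleting a null pointer is a no-op

    const int* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    int* data_;
    std::size_t size_;
};
```

No element is copied during the move, which is why moving a megabyte-sized object costs about the same as moving an empty one.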

What are the benefits and risks, if any, of using std::move with std::shared_ptr

I am in the process of learning C++11 features and as part of that I am diving head first into the world of unique_ptr and shared_ptr.
When I started, I wrote some code that used unique_ptr exclusively, and as such when I was passing my variables around I needed to accomplish that with std::move (or so I was made to understand).
I realized after some effort that I really needed shared_ptr instead for what I was doing. A quick find/replace later and my pointers were switched over to shared but I lazily just left the move() calls in.
To my surprise, not only did this compile, but it behaved perfectly well in my program and I got every ounce of functionality I was expecting... particularly, I was able to "move" a shared_ptr from ObjectA to ObjectB, and both objects had access to it and could manipulate it. Fantastic.
This raised the question for me though... is the move() call actually doing anything at all now that I am on shared_ptr? And if so, what, and what are the ramifications of it?
Code Example
shared_ptr<Label> lblLevel(new Label());
//levelTest is shared_ptr<Label> declared in the interface of my class, undefined to this point
levelTest = lblLevel;
//Configure my label with some redacted code
//Pass the label off to a container which stores the shared_ptr in an std::list
//That std::list is iterated through in the render phase, rendering text to screen
this->guiView.AddSubview(move(lblLevel));
At this point, I can make important changes to levelTest like changing the text, and those changes are reflected on screen.
This to me makes it appear as though both levelTest and the shared_ptr in the list are the same pointer, and move() really hasn't done much. This is my amateur interpretation. Looking for insight. Using MinGW on Windows.

ecatmur's answer explains the why of things behaving as you're seeing in a general sense.
Specifically to your case, levelTest is a copy of lblLevel, which creates an additional owning reference to the shared resource. You moved from lblLevel, so levelTest is completely unaffected and its ownership of the resource stays intact.
If you looked at lblLevel I'm sure you'd see that it's been set to an empty value. Because you made a copy of the shared_ptr before you moved from it, both of the existing live instances of the pointer (levelTest and the value in guiView) should reference the same underlying pointer (their get method returns the same value) and there should be at least two references (their use_count method should return 2, or more if you made additional copies).
The whole point of shared_ptr is to enable things like you're seeing while still allowing automatic cleanup of resources when all the shared_ptr instances are destructed.
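The situation in the question can be reproduced in a few lines. This is a sketch under assumptions: Label is an empty placeholder type, the local variable stored plays the role of the shared_ptr kept inside guiView, and Outcome and copy_then_move are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Label {};  // placeholder for the question's Label class

struct Outcome {
    bool source_is_empty;  // lblLevel after the move
    bool same_object;      // levelTest and the stored pointer share one Label
    long owners;           // use_count of the stored pointer
};

Outcome copy_then_move() {
    std::shared_ptr<Label> lblLevel = std::make_shared<Label>();
    std::shared_ptr<Label> levelTest = lblLevel;          // copy: a second owner
    std::shared_ptr<Label> stored = std::move(lblLevel);  // move: ownership transferred, count unchanged
    return Outcome{lblLevel == nullptr, levelTest.get() == stored.get(), stored.use_count()};
}
```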

When you move-construct or move-assign from a shared pointer of convertible type, the source pointer becomes empty, per 20.7.2.2.1:
22 - Postconditions: *this shall contain the old value of r. r shall be empty. r.get() == 0.
So if you are observing that the source pointer is still valid after a move-construct or move-assignment, then either your compiler is incorrect or you are using std::move incorrectly.
For example:
std::shared_ptr<int> p = std::make_shared<int>(5);
std::shared_ptr<int> q = std::move(p);
assert(p.get() == nullptr);

If you copy a shared_ptr, the reference count to the pointer target is incremented (in a thread-safe way).
Instead, when you move a shared_ptr from A to B, B contains the copy of the state of A before the move, and A is empty. There was no thread-safe reference count increment/decrement, but some very simple and inexpensive pointer exchange between the internal bits of A and B.
You can think of a move as an efficient way of "stealing resources" from the source of the move to the destination of the move.

Container of Pointers vs Container of Objects - Performance

I was wondering if there is any difference in performance when you compare/contrast
A) Allocating objects on the heap, putting pointers to those objects in a container, operating on the container elsewhere in the code
Ex:
std::list<SomeObject*> someList;
// Somewhere else in the code
SomeObject* foo = new SomeObject(param1, param2);
someList.push_back(foo);
// Somewhere else in the code
while (itr != someList.end())
{
(*itr)->DoStuff();
//...
}
B) Creating an object, putting it in a container, operating on that container elsewhere in the code
Ex:
std::list<SomeObject> someList;
// Somewhere else in the code
SomeObject newObject(param1, param2);
someList.push_back(newObject);
// Somewhere else in the code
while (itr != someList.end())
{
itr->DoStuff();
...
}
Assuming the pointers are all deallocated correctly and everything works fine, my question is...
If there is a difference, what would yield better performance, and how great would the difference be?

There is a performance hit when inserting objects instead of pointers to objects.
std::list, like the other standard containers, makes a copy of the argument that you store (for std::map both key and value are copied).
As your someList is a std::list the following line copies your object:
Foo foo;
someList.push_back(foo); // copy foo object
It will get copied again when you retrieve it from the list. So you are making copies of the whole object, compared to making copies of a pointer when using:
Foo * foo = new Foo();
someList.push_back(foo); // copy of foo*
You can double check by inserting print statements into Foo's constructor, destructor, copy constructor.
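A sketch of that check, using counters in place of print statements so the result can be inspected programmatically. The copies and moves counters, CopyStats, and measure_copies are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <list>
#include <utility>

struct Foo {
    static int copies;
    static int moves;
    Foo() = default;
    Foo(const Foo&) { ++copies; }      // count every copy construction
    Foo(Foo&&) noexcept { ++moves; }   // count every move construction
};
int Foo::copies = 0;
int Foo::moves = 0;

struct CopyStats {
    int copies;
    int moves;
};

CopyStats measure_copies() {
    Foo::copies = 0;
    Foo::moves = 0;
    std::list<Foo> objects;
    std::list<Foo*> pointers;
    Foo foo;
    objects.push_back(foo);          // copies foo into the list
    pointers.push_back(&foo);        // copies only the pointer
    Foo fromList = objects.front();  // copying the element back out
    (void)fromList;
    objects.emplace_back();          // constructs in place: no copy, no move
    return CopyStats{Foo::copies, Foo::moves};
}
```

With C++11 and later, push_back(std::move(foo)) or emplace_back avoids the copy on insertion, which narrows the gap between the two approaches.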
EDIT: As mentioned in comments, pop_front does not return anything. You usually get reference to front element with front then you pop_front to remove the element from list:
Foo * fooB = someList.front(); // copy of foo*
someList.pop_front();
OR
Foo fooB = someList.front(); // front() returns reference to element but if you
someList.pop_front(); // are going to pop it from list you need to keep a
// copy so Foo fooB = someList.front() makes a copy

Like most performance questions, this doesn't have one clear cut answer.
For one thing, it depends on what exactly you're doing with the list. Pointers might make it easier to do various operations (like sorting). That's because comparing pointers and swapping pointers is probably going to be faster than comparing/swapping SomeObject (of course, it depends on the implementation of SomeObject).
On the other hand, dynamic memory allocation tends to be worse than allocating on the stack. So, assuming you have enough memory on the stack for all the objects, that's another thing to consider.
In the end, I would personally recommend the best piece of advice I've ever gotten: It's pointless trying to guess what will perform better. Code it the way that makes the most sense (easiest to implement/maintain). If, and only if, you later discover there is a performance problem, run a profiler and figure out why. Chances are, most programs won't need all these optimizations, and this will turn out to be a moot point.

It depends on how you use the list. Do you just fill it with stuff and do lookups, or do you insert and remove data regularly? Lookups may be marginally faster without pointers, while adding and removing elements will be faster with pointers.

With objects it is going to be a memberwise copy (thus new object creation and copying of members), assuming there are no user-defined copy constructors or operator= overloads. Therefore, using pointers is more efficient, and smart pointers such as Boost's are better still (std::auto_ptr, however, cannot be stored in standard containers), but that is beyond the scope of this question.
If you still have to use object syntax, use a reference.

Some additional things to consider (You have already been made aware of the copy semantics of STL containers):
Are your objects really smaller than pointers to them? This becomes more relevant if you use any kind of smart pointer as those have a tendency to be larger.
Copy operations are (often?) optimized by the compiler to use memcpy(). This is probably not true for smart pointers in particular.
Additional dereferencing caused by pointers.
All the things I have mentioned are micro-optimization considerations, and I'd discourage even thinking about them, let alone acting on them. On the other hand: a lot of my claims would need verification and would make for interesting test cases. Feel free to benchmark them.
