I have a class Foo that holds a vector of large objects. The idea is that an octree is built recursively from the elements of the vector, and each OctreeNode holds pointers to a few of the elements stored in Foo. (In the example, for simplicity, a node points to only one element of the vector.)
class Foo
{
    std::vector<LargeClass> mLargeClasses;
    void removeItem(const int index); // remove the element at the given index from the vector
};
class OctreeNode
{
    LargeClass* mLargeClass; // non-owning pointer into Foo::mLargeClasses
};
One can say, "why bother keeping the vector after the tree is built, instead of storing the objects in the tree itself?" True, but let's just say I need to keep the vector alongside the built tree as well.
While the above concept works, I have issues when elements get removed from the underlying vector. In that case, some octree nodes end up with dangling pointers.
My solution #1:
When removeItem is called, then before it removes the vector element, it first traverses the octree recursively and sets to nullptr every mLargeClass pointer that points to that particular vector element. It's OK to have nullptr in the nodes, as I check against nullptr each time I access them anyway.
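A minimal sketch of solution #1, under some assumptions that are not in the original post: LargeClass is a stand-in type, each node owns its eight children through unique_ptr, and the helper name detach is made up for illustration. One detail the description above does not mention is that vector::erase shifts every later element down by one slot, so this sketch also moves pointers that referred to later elements; it relies on erase not reallocating.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct LargeClass { int payload; };  // stand-in for the real large class

struct OctreeNode {
    LargeClass* mLargeClass = nullptr;                    // non-owning, may be null
    std::array<std::unique_ptr<OctreeNode>, 8> children;  // unused slots stay null
};

// Walks the tree before the erase: nulls pointers to the doomed element and
// shifts pointers to later elements, which will slide down by one slot.
void detach(OctreeNode& node, const LargeClass* doomed) {
    if (node.mLargeClass == doomed) {
        node.mLargeClass = nullptr;
    } else if (node.mLargeClass != nullptr && node.mLargeClass > doomed) {
        --node.mLargeClass;
    }
    for (auto& child : node.children) {
        if (child) detach(*child, doomed);
    }
}

struct Foo {
    std::vector<LargeClass> mLargeClasses;
    OctreeNode root;

    void removeItem(std::size_t index) {
        assert(index < mLargeClasses.size());
        detach(root, &mLargeClasses[index]);
        // erase never reallocates, so the adjusted pointers stay in range
        mLargeClasses.erase(mLargeClasses.begin() +
                            static_cast<std::ptrdiff_t>(index));
    }
};
```

The traversal costs time proportional to the number of nodes on every removal, which is the price of keeping plain pointers in the tree.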
My solution #2:
Have the vector store shared_ptrs and have each OctreeNode store a weak_ptr. I am not a fan of this, because every access through a weak_ptr first has to convert it to a shared_ptr (via lock()), with all the atomic counter increments that implies. I am no expert on performance testing, but I have a feeling that this is slower than a plain pointer access guarded by an if condition.
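For comparison, solution #2 might look roughly like the following. LargeClass and the helper payloadOr are hypothetical names used only for this sketch; the point is that removing the element from the vector is enough, and the node notices on its next lock().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct LargeClass { int payload; };  // stand-in for the real large class

struct OctreeNode {
    std::weak_ptr<LargeClass> mLargeClass;  // observes, does not own

    // Returns the payload, or `fallback` if the element has been removed.
    int payloadOr(int fallback) const {
        if (auto sp = mLargeClass.lock()) return sp->payload;
        return fallback;
    }
};

struct Foo {
    std::vector<std::shared_ptr<LargeClass>> mLargeClasses;

    void removeItem(std::size_t index) {
        assert(index < mLargeClasses.size());
        // Destroys the object if this was the last shared_ptr to it;
        // every weak_ptr to it expires without any tree traversal.
        mLargeClasses.erase(mLargeClasses.begin() +
                            static_cast<std::ptrdiff_t>(index));
    }
};
```

Because the objects live in their own allocations, later elements do not change address when one is erased, at the cost of an extra indirection and the reference-count traffic mentioned above.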
Does anybody know any better solutions?
I think the most elegant solution would be:
A smart pointer that behaves like a shared_ptr: it counts how many other pointers refer to it, keeps a record of them, and, when it gets destroyed, automatically nulls out all the "observer" pointers that refer to it.
While the field and the purpose are somewhat different, I think I will give the handle system described in this post a try:
Simple, efficient weak pointer that is set to NULL when target memory is deallocated
If I fail, I will fall back to the shared_ptr/weak_ptr duo described in the same post.
I understand that a vector moves its elements if you push more elements than it has capacity for, but what happens to a std::list if one of its elements gets moved for reasons unrelated to the list itself? For instance, to make space for a vector?
Will the list get invalidated because the elements around the moved element no longer point to it? Or is the list prepared for such an eventuality?
If it is the latter, what happens to pointers that point to the moved element?
For my application, I want to make a node map, which of course means that every node has to point to other nodes. But I also need to have a list of the nodes so I can search them easily.
So I wanted to have a list where every object in the list has pointers to some other elements of the same list (this is in addition to the normal std::list back and forth pointers). But I got worried about how std::list would handle one of its elements getting moved, and how I could handle my own pointers in such an eventuality.
I have already discarded vectors because the documentation states that if a vector gets moved, all pointers and references to its elements are invalidated. If my approach of using std::list cannot work, my second-best option would be to keep the nodes in a vector and make the nodes reference each other by index (which I can do because, once built, the vector won't change its size).
but what happens to std::list if one of its elements gets moved for
reasons unrelated to the list itself?
In C++, the runtime environment is not allowed to unilaterally move objects in that way, for exactly the reason you imagine -- there is no reliable or efficient way to find all pointers to the object's old location and modify them to point to a new location, so any attempt to surreptitiously move an object would run the risk of creating dangling-pointers which would lead to undefined behavior. (btw this is one reason why garbage collectors don't work very well in C++)
So having your list's node objects moved behind your back is not something you need to worry about. Since your list container has sole ownership of the nodes, the only way for this to happen would be if the list itself decided to explicitly move one of its nodes for some reason (and there is generally no reason why it would need or want to do that).
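To illustrate, here is a small sketch of a node list whose elements point at each other. The Node type and the helpers addNode and connect are hypothetical names for this example. std::list allocates every element separately, so the addresses handed out stay the same no matter how many other elements are inserted or erased afterwards.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

struct Node {
    int id;
    std::vector<Node*> neighbours;  // non-owning links to other list elements
};

// Appends a node and returns its address, which stays valid until that
// particular node is erased from the list.
Node* addNode(std::list<Node>& nodes, int id) {
    nodes.push_back(Node{id, {}});
    return &nodes.back();
}

// Records a one-way link between two nodes of the same list.
void connect(Node& from, Node& to) {
    assert(&from != &to);  // this sketch assumes no self-links
    from.neighbours.push_back(&to);
}
```

Only a pointer to the node that is itself erased becomes dangling; pointers to every other node are unaffected.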
My implementation uses std::map to store data. When I started my code it seemed like the best option. Now I came to a point where I have to change the key values of all objects inside the map.
The problem is that each object points to another object inside the map:
class AND : public node{
    vector <node*> inputs;
    vector <node*> outputs;
};
And the map is declared like this:
map<unsigned int, AND> all_ANDs;
My question is: if I use map::extract from C++17 to modify the key values in the all_ANDs map, will my pointers (e.g. the ones inside the attribute inputs) keep pointing to the right places?
In other words: if I change the value of the "first" element with extract, will the address of "second" stay intact?
I noticed from this link that the string "papaya" stays the same (and works gracefully). But I wanted to be sure about pointers.
YES
The reference you have already quoted in your posts clearly states that no elements are copied or moved. (This assumes that node in your code snippet does not refer to map::node_type).
The same holds for the insert operation of the map-node (after modifying its key):
If the insertion is successful, pointers and references to the element obtained while it is held in the node handle are invalidated, and pointers and references obtained to that element before it was extracted become valid. (since C++17)
However, accessing the object between extract()ion and re-insert()ion has undefined behaviour and its address taken whilst in the extracted state is of limited use. Quoting from the standard:
The extract members invalidate only iterators to the removed element;
pointers and references to the removed element remain valid. However,
accessing the element through such pointers and references while the
element is owned by a node_type is undefined behavior. References and
pointers to an element obtained while it is owned by a node_type are
invalidated if the element is successfully inserted.
Explanation
Essentially, a map<> is implemented as a tree of nodes, each holding a key and T (which are exposed as pair<const Key, T> to the user). Nodes are allocated (typically) on the heap and the address of your object is related to that of a node. map::extract() un-links a node from its tree and returns a node handle (an object holding a pointer to a map node), but AFAIK the node itself is not re-allocated, moved, or copied. Upon map::insert(handle), the node is re-linked into the tree according to its (new) key. Again, this involves no re-allocation, move, or copy of the node.
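The extract, modify, re-insert sequence described above can be sketched as a small helper. The name rekey is made up for this example, and the check for an already-used new key is an extra assumption about the desired behavior. It requires C++17.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Changes the key of one entry without copying or moving the mapped value.
// Returns false (and leaves the map untouched) if oldKey is missing or
// newKey is already in use.
template <class Map>
bool rekey(Map& m, const typename Map::key_type& oldKey,
           const typename Map::key_type& newKey) {
    if (m.count(newKey) != 0) return false;
    auto handle = m.extract(oldKey);   // unlinks the node, no reallocation
    if (handle.empty()) return false;  // oldKey was not present
    handle.key() = newKey;             // the key is mutable while extracted
    m.insert(std::move(handle));       // re-links the very same node
    return true;
}
```

A pointer to the mapped value taken before the call still refers to the same object afterwards; it just must not be dereferenced while the node is held by the handle.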
Remark
The above is a rough sketch. How things are actually done is likely more complex and also implementation-defined. As explained here, a node_handle does allow the key to be altered through the member function
Key &node_handle::key() const;
How this is done under the hood is not specified and I speculate that the implementation uses a union or some cast to allow this. Of course, the map has to present to users a pair<const Key,T> in order to prevent them from changing the key and hence breaking the map, but this is not of any concern for an element extracted from the map.
My answer above addresses your immediate question. However, as I have suggested in a comment, this appears to be an XY problem. What I suspect:
You have some structure of AND objects which are interlinked via their inputs and outputs fields. This linkage must not be broken by any re-allocation, so you cannot store them in a growing vector<AND> with re-allocation.
You also want to order these objects according to some key and have therefore stored them in a map<Key,AND>, which indeed does not re-allocate when grown.
You now want to order them according to another key (and/or change all the keys).
(If you are actually not interested in ordering but merely in finding your objects by their key, you should have used unordered_map instead of map, which supports find() in O(1) operations on average rather than O(log(n)).)
I suggest a different layout of your data:
You store your AND objects in a way that allows growing their number without re-allocation. An obvious choice here is deque<AND>, since
insertion and deletion at either end of a deque never invalidates
pointers or references to the rest of the elements
You may also make AND non-copyable and non-movable, ensuring that once allocated their address never changes (and pointers to them remain valid until destruction).
You can support any find-by-key or order-by-key operations by actually working on pointers to the stored objects, either by sorting a vector of pair<key,AND*> or by using a map<key,AND*> or unordered_map<key,AND*>. You can even simultaneously have various keys per object (and a map for each).
When you must re-key all objects, simply forget the old map and make a new one: since the map only stores pointers and not the objects, this does not affect your linkages.
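The layout suggested above might be sketched like this. The id field, the Index alias and the buildIndex helper are assumptions made for illustration; the original AND class derives from node and has no such field.

```cpp
#include <cassert>
#include <deque>
#include <unordered_map>
#include <vector>

struct AND {
    unsigned id;               // hypothetical field used to derive keys
    std::vector<AND*> inputs;  // links into the same deque
    std::vector<AND*> outputs;
};

using Index = std::unordered_map<unsigned, AND*>;

// Builds a fresh key -> object lookup table. keyOf is any callable that
// maps an AND to its key; the objects themselves are never touched.
template <class KeyFn>
Index buildIndex(std::deque<AND>& storage, KeyFn keyOf) {
    Index index;
    for (AND& gate : storage) {
        const bool inserted = index.emplace(keyOf(gate), &gate).second;
        assert(inserted && "keys must be unique");
        (void)inserted;
    }
    return index;
}
```

Re-keying then amounts to calling buildIndex again with a different keyOf and discarding the old table, while the inputs and outputs links stay valid because push_back on a deque does not move existing elements.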
Your map holds actual AND objects, not pointers to objects. So, if the AND* pointers stored inside your vectors are pointing at the map's AND objects, then those pointers WILL become invalid once those objects are erased from the map.
However, extraction merely unlinks a specified node from the map, the node and thus its key and value are still valid in memory. The node can be re-inserted into a map without affecting the addresses of the node's key and value. In this regard, the pointers in the vectors WILL NOT become invalid (although it is undefined to dereference them while the node is detached from the container).
Another option is to change your map to hold AND* pointers instead. Or better, consider using std::shared_ptr<AND> in the map and std::shared_ptr<node> in the vectors, instead of raw pointers. Then it won't matter whether the map entries are erased or extracted, the AND objects will remain valid as long as there are active shared_ptr references to them.
I have a deletion method that works and is as follows:
void deleteUserByID(int id, std::vector<Person*>& userList)
{
for(int i = 0; i < userList.size(); i++) {
if (userList.at(i)->getID() == id) {
userList.erase(userList.begin() + i);
}
}
}
However, I was trying the following before the above and couldn't understand why it wasn't working.
Instead of using userList.erase(userList.begin() + i);, I was using delete userList.at(i)
I'm somewhat new to C++, and have been instructed to delete heap allocated memory with the "delete" keyword. I felt that should have removed it from the Vector, but was wrong.
Why doesn't the delete userList.at(i) work? I'm curious. Any info would be helpful.
There are two separate concepts at play here. First, there's the maintenance of the std::vector that you're using. The vector's job is to hold a sequence of elements, and in many ways it doesn't really care what those elements actually are. From the vector's perspective, its elements will stick around until something explicitly comes along and says to get rid of them. The call to erase tells the vector "Hey, you know that element you've got at that one position? Please get rid of it." So when you make the call to erase, you're telling the vector to get rid of one of its elements.
Independently, there's the objects that are being stored in the vector. You're storing Person *s, which are pointers to Person objects. Those objects (I'm assuming) were allocated with new, so each Person essentially thinks "I'm going to live forever, or at least until someone comes around and calls delete on me." If you delete one of the Person objects, that object ceases to exist. However, the Person objects have absolutely no idea that there's a vector somewhere with pointers to people.
In order to get everything to work the way you want it to, you actually need to use a combination of both erase and delete (with a caveat that I'll mention later). If you just erase the pointers from the vector, then from the vector's perspective everything is cleaned up (it no longer holds pointers to the Person object in question), but from the Person's perspective the Person object is still very much alive and well because you never said to delete it. If you just delete the pointers, then from the Person's perspective everything is cleaned up (you've told the Person that it's time to go to the giant playground in the sky), but from the vector's perspective nothing was added or removed, so you now have a dangling pointer in your vector. In other words, the first option results in a memory leak - there's a Person object that was never told to clean itself up - and the second option results in a dangling pointer - there's a pointer to what used to be a person, but which is now a bunch of bits that can be recycled however the program wishes.
Using the setup you have right now, the "best" way to handle this would be to use a combined approach. When you find an item to remove, first delete the pointer, then call erase. That ensures that the Person gets cleaned up and that the vector no longer has a dangling pointer in it.
But as some of the commenters have noted, there's a much better way to do this. Rather than storing Person *s and using raw pointers to reference the Person objects, use the std::shared_ptr type and manage your Person objects through std::shared_ptr<Person>. Unlike a regular pointer, which just says "yeah, there's a thing over there" and won't do any memory management on its own, the std::shared_ptr type actually owns the resource that it points at. If you erase a std::shared_ptr from a vector, the std::shared_ptr then says "okay, I just got kicked out of the vector, and if I'm the last pointer to the Person, I'll go and delete it for you." That means that you don't need to do any of your own memory management to clean things up.
In summary:
Just calling erase gets rid of an element from the vector, but leaves a Person adrift in the heap, wondering why no one loves it anymore.
Just calling delete sets the Person object free, but leaves a ghostly pointer to it in the vector that's a major hazard.
Calling both delete and erase in the proper order will solve this problem, but isn't the ideal solution.
Using std::shared_ptr instead of raw pointers is probably the best option, since it ensures that all the right deletes happen automatically.
Hope this helps!
And a quick addendum - are you sure that your code correctly visits all the elements of the vector? For example, if you erase the item at index 0, all the other elements of the vector will shift back one position. But then your implementation increments i to 1, at which point you've skipped over the item that just got shifted back to the first position.
I'll let you think about how to resolve this. Another answer has offered a good suggestion of using remove_if, which is one good solution, though if for your own edification you want to roll your own version, you might want to think over how you'd address the above issue.
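For readers who want to compare notes, one possible hand-rolled version is sketched below. The Person class is a minimal stand-in for the one in the question. The loop deletes the object first, then uses the iterator returned by erase, which already refers to the element that slid into the erased slot, so nothing is skipped.

```cpp
#include <cassert>
#include <vector>

class Person {  // minimal stand-in for the question's class
public:
    explicit Person(int id) : id_(id) {}
    int getID() const { return id_; }
private:
    int id_;
};

void deleteUserByID(int id, std::vector<Person*>& userList) {
    for (auto it = userList.begin(); it != userList.end(); ) {
        assert(*it != nullptr);  // this sketch assumes no null entries
        if ((*it)->getID() == id) {
            delete *it;                // free the Person first...
            it = userList.erase(it);   // ...then drop the pointer; do not ++it
        } else {
            ++it;                      // advance only when nothing was erased
        }
    }
}
```

Each erase shifts the tail of the vector, so this is quadratic in the worst case; that is fine for small lists but is one reason the algorithm-based answers exist.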
This is one of those places a picture is almost certainly worth at least a thousand words. The vector is storing pointers, which point to (presumably) dynamically allocated objects, something like this:
So, the green boxes represent the elements in the vector itself. The blue boxes represent your data objects. I've separated the third one to signify the fact that it's the one we're going to (eventually) remove.
As it stands right now, your code is deleting some of the green boxes. It leaves the blue box (your data) in memory, but you no longer have a pointer to it:
At this point, you're right that the data no longer appears in the vector, so your routine has "worked" to that extent. The problem is that you no longer have access to that data, so you've leaked its memory.
What's (apparently) being suggested is that when you find the object you want to remove from the list, you should first use delete to destroy the data (the blue box):
...then use erase to remove that element from the vector:
Alternatives
I would not use a std::shared_ptr for a case like this. A shared_ptr is intended to manage objects that have shared ownership, and nothing you've said indicates that you're dealing with shared ownership. If you must use dynamically allocated objects, and don't want to manage things manually (which I agree is a good thing to avoid), you might consider using std::unique_ptr, or you might want to consider using a Boost ptr_vector instead.
Alternatively, consider changing it to a std::vector<Person> (i.e., store the objects directly in the vector instead of storing pointers to dynamically allocated objects). At least in my experience, this is really the right answer the vast majority of the time. If you really need to ensure against moving the Person objects around when the vector resizes, consider using an std::deque<Person> instead. A std::deque<Person> is fairly close to what you've created, but with at least some potential for the compiler to optimize allocation by putting a number of data objects (Persons, in your case) into a single block of memory, instead of allocating each one individually.
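With objects stored by value, the removal function shrinks to the erase-remove idiom and there is nothing to delete by hand. The following is a sketch only: Person is a minimal stand-in, and returning the number of removed elements is an addition made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

class Person {  // minimal stand-in for the question's class
public:
    explicit Person(int id) : id_(id) {}
    int getID() const { return id_; }
private:
    int id_;
};

// Removes every Person with the given id and returns how many were removed.
std::size_t deleteUserByID(int id, std::vector<Person>& userList) {
    const std::size_t before = userList.size();
    // remove_if moves the kept elements to the front, preserving their order
    auto newEnd = std::remove_if(userList.begin(), userList.end(),
                                 [id](const Person& p) { return p.getID() == id; });
    const auto removed =
        static_cast<std::size_t>(std::distance(newEnd, userList.end()));
    userList.erase(newEnd, userList.end());  // destroys the leftover tail
    assert(userList.size() + removed == before);
    return removed;
}
```

The same body works unchanged for std::deque<Person> if the parameter type is swapped.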
Conclusion
Until or unless evidence to the contrary is found, the right answer is most likely std::vector<Person> with std::deque<Person> in second place. Direct dynamic allocation of the Person objects, with something to automate their deletion runs a distant third place (at best).
The other answers given summarize what you really should do in terms of design, and that is to use smart pointers.
However, if you really did use raw pointers, and allocated those entries with new, the way you can delete and erase without writing any loops is to
Partition the elements to delete
delete the elements
Erase the partitioned elements from the vector using vector<T>::erase.
Here is an example:
void deleteUserByID(int id, std::vector<Person*>& userList)
{
// partition the about-to-be deleted elements to the right of the partition
// and all good items to the left of the partition
auto iter = std::partition(userList.begin(), userList.end(), [&](Person *p)
{ return p->getID() != id; });
// issue a delete on those elements on right of partition
std::for_each(iter, userList.end(), [](Person *p) { delete p; });
// now erase those elements from the vector.
userList.erase(iter, userList.end());
}
The std::partition simply places all elements you wish to delete on the right of the partition (which is returned by iter). Then it's just a matter of calling delete on those elements on the right of the partition, and finally erase those elements.
The reason why this 3-step process was done instead of directly using std::remove_if is that std::remove_if leaves elements with unspecified values in the range denoting the items that were "removed" (they may be copies of pointers that were kept, while the pointers you meant to delete may have been overwritten), thus issuing subsequent delete calls on those elements would have resulted in undefined behavior.
For example, this code, even though it looks like it would work, actually results in undefined behavior:
void deleteUserByID(int id, std::vector<Person*>& userList)
{
// move items to be removed to the end of the vector
auto iter = std::remove_if(userList.begin(), userList.end(), [&](Person *p)
{ return p->getID() == id; });
// issue a delete on those elements (this actually invokes undefined behavior)
std::for_each(iter, userList.end(), [](Person *p) { delete p; });
// now erase those elements from the vector (if your program even gets this far)
userList.erase(iter, userList.end());
}
Basically, you can't do anything "special" to the items in the removed range (for example, call delete), as those items are indeterminate garbage. The only thing you can safely do is to erase them.
So the trick is to partition the elements (which doesn't invalidate those items), delete the partitioned elements, and then remove them using erase.
*Note that if you want to keep the order of the elements that will not be deleted, then use std::stable_partition instead of std::partition.
The proper way to do it is to use smart pointers and an algorithm from the standard library.
void deleteUserByID(int id, std::vector<std::unique_ptr<Person>>& userList)
{
auto endIt = std::remove_if(userList.begin(), userList.end(),
[id](const auto &person) {
return person->getID() == id;
});
userList.erase(endIt, userList.end());
}
These are two different and complementary things. For your vector
userList.erase(userList.begin() + i);
will remove the ith pointer from your vector, but will not affect the pointed at Person object in any way
delete userList.at(i);
will delete (free) the Person object pointed at by the ith pointer in your vector, but will not affect the vector in any way.
Depending on where these Person objects are coming from and what you are trying to do, you might need to do both.
So far I have only worked with lists in C++ (Queues, stacks, tree etc.. in Java). I have done some reading and have endeavoured to learn about Vectors as they are good for traversal compared to lists and don't have the complexity of Arrays in regards to house keeping.
So far I am aware that there can be an issue in regards to pointer validation in the event the Vector needs to be reallocated. The pickle being (as far as I know) no real way to determine if the adding of an element to the Vector will trigger reallocation.
One answer I can think of is to re-assign the pointers to each element every time an element is added or removed.
This seems like a decent amount of overhead on the chance reallocation is done. Is there a better way perhaps?
One way to approach this is just to have a vector of pointers (preferably smart pointers, so you don't need to worry about manual deallocation).
E.g. instead of
std::vector<MyObject> vec;
vec.push_back(MyObject());
MyObject* ptr = &vec[0];
Do something like this:
std::vector<std::unique_ptr<MyObject>> vec;
vec.push_back(std::unique_ptr<MyObject>(new MyObject())); // *
MyObject* ptr = vec[0].get();
(* or use vec.push_back(std::make_unique<MyObject>()) in C++14)
If done in the second way, ptr will always be valid across internal reallocations of the vector, because it's not pointing to memory that is managed by the vector, it is pointing to memory that is managed by the unique_ptr, which will not change until the object is explicitly released or destroyed.
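A small sketch that makes this observable. MyObject is given a value field purely for the demonstration, and the helper name adopt is made up. The raw pointer returned for the first object is still correct after the vector has grown through many reallocations.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct MyObject { int value; };  // stand-in type for the demonstration

// Stores the object in the vector and returns a non-owning pointer to it.
// The pointer stays valid until that element is removed from the vector.
MyObject* adopt(std::vector<std::unique_ptr<MyObject>>& vec,
                std::unique_ptr<MyObject> obj) {
    assert(obj != nullptr);
    MyObject* raw = obj.get();
    vec.push_back(std::move(obj));  // may reallocate; only the unique_ptrs move
    return raw;
}
```

What does get invalidated on reallocation are pointers or references to the unique_ptr elements themselves, so code should hold on to the MyObject*, not to a std::unique_ptr<MyObject>*.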
Well, I don't know if it is possible, but the thing would be:
struct stPiece
{
/* some stuff */
stPiece *mother; // pointer to the piece that created this one
};
vector<stPiece> pieces;
Is it possible to erase the piece referenced by 'mother' from pieces, having just that pointer as a reference? How?
Would it mess with the other references? (i.e. if it is not the last element in the vector, by shifting the next elements to other memory positions, while the other '*mothers' remain constant). Of course, I am assuming that all the child pieces will be deleted (so I won't need to update any pointer that goes to the same mother).
Thanks!
If your mother pointers point directly to elements of the pieces vector you will get in all kinds of trouble.
Deleting an element from pieces will shift the positions of all the elements at higher indexes. Even inserting elements can make all the pointers invalid, since the vector might need to reallocate its internal array, which might transfer all the elements to new positions in memory.
To answer your main question: you can't delete the element you have the pointer to directly; you would first need to search through the vector to find it, or calculate its index in the vector.
Storing the indexes of the elements as mother, instead of pointers into pieces, would make it a bit more robust, so that at least appending new elements could not break the existing mothers. But deleting from pieces would still shift elements to new indexes.
Using a std::list for pieces and storing iterators into it as mother might be a solution. Iterators of std::list are not invalidated when other elements of that list are removed or added. If different elements can have the same mother, you still have the problem of finding out when to remove the mother elements; then using boost::shared_ptr might be simpler.
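A rough sketch of the index-based variant, including the fix-up that deletion then requires. The sentinel kNoMother, the helper erasePiece and the reduced stPiece are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sentinel meaning "this piece has no mother" (hypothetical convention).
constexpr std::size_t kNoMother = static_cast<std::size_t>(-1);

struct stPiece {
    /* some stuff */
    std::size_t mother = kNoMother;  // index into pieces instead of a pointer
};

// Erases pieces[index] and repairs every stored mother index.
void erasePiece(std::vector<stPiece>& pieces, std::size_t index) {
    assert(index < pieces.size());
    pieces.erase(pieces.begin() + static_cast<std::ptrdiff_t>(index));
    for (stPiece& p : pieces) {
        if (p.mother == kNoMother) continue;
        if (p.mother == index) {
            p.mother = kNoMother;  // its mother was the erased piece
        } else if (p.mother > index) {
            --p.mother;            // mother slid down by one position
        }
    }
}
```

Indices survive reallocation, so appending needs no fix-up at all; only erasing pays for a pass over the vector.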
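The std::list idea could look roughly as follows. The id field, the PieceList alias, the helper eraseMother and the use of end() as a "no mother" marker are assumptions of this sketch, and it relies on std::list accepting an incomplete element type, which is guaranteed from C++17.

```cpp
#include <cassert>
#include <iterator>
#include <list>

struct stPiece;
using PieceList = std::list<stPiece>;

struct stPiece {
    int id;                      // hypothetical payload
    PieceList::iterator mother;  // pieces.end() means "no mother"
};

// Erases the mother of `child` from the list, using only the stored iterator.
void eraseMother(PieceList& pieces, PieceList::iterator child) {
    assert(child->mother != pieces.end());
    pieces.erase(child->mother);   // constant time, no search needed
    child->mother = pieces.end();  // mark the child as motherless
}
```

Other children that shared the same mother would still hold the erased iterator, which is the ownership problem the answer points out.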
It is not exactly clear how the entire data structure is organized and what the consequences are going to be, but it is perfectly possible to erase an element from the vector by having a pointer to that element and the vector itself. You just need to convert the pointer to an iterator first. For example, having a vector
vector<stPiece> pieces;
and a pointer into that vector
stPiece *mother;
you can convert the pointer to an index
vector<stPiece>::size_type i = mother - &pieces[0];
assert(i < pieces.size());
then convert the index to an iterator
vector<stPiece>::iterator it = pieces.begin() + i;
then erase the element
pieces.erase(it);
and that's it.
However, it appears that in your data structure you might have multiple long-lived pointers pointing into the same vector. Any attempt to erase elements from such a vector will immediately invalidate all the pointers to the erased element and to the elements after it. It is theoretically possible to "restore" their validity, if you do everything carefully, but this is going to be a major PITA.
I'm not sure I understand what you mean by "assuming that all the child pieces will be deleted".
Yes, you can erase the piece referenced by mother.
If you delete the piece referenced by 'mother', the mother pointer in all its children will become dangling; you'll have to take care of this.
About the shifting of elements in the vector: you need not do it yourself; it is taken care of by the vector class.
Short answer: no.
The pieces are stored in the vector by value, so a pointer to a piece behaves much like a vector iterator and is invalidated under the same conditions. Insertion invalidates all iterators, pointers, and references if it triggers a reallocation, and otherwise those at or after the insertion point; erasure invalidates those at or after the erased element. Memory locations therefore change, and it will be nearly impossible to keep all the pointers updated.
You could store dynamically allocated pieces in the vector, i.e.:
vector<stPiece*> pieces;
The mother pointers won't change as pieces are added/removed to/from the vector. The downsides are:
you now have to manage memory (new/delete each piece)
it uses more memory per piece (the pointers in pieces)
it may be slower because you lose spatial locality (cache efficiency) because it is no longer a contiguous array of stPiece objects
The latter two points may or may not be important in your application.
What you have coded is a singly-linked tree. You probably don't want an object to contain all your stPieces, because that would get in the way of implementing creation and deletion semantics.
I'm guessing that you want to delete mother after all the children are gone.
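struct stPiece; // forward declaration, so the set below can name the type
set< stPiece * > all_pieces; // registry of every live piece, used for iteration only
struct stPiece {
    boost::shared_ptr< stPiece > const mother; // keeps the mother alive while any child exists
    stPiece( boost::shared_ptr< stPiece > const &in_mother )
        : mother( in_mother ) {
        all_pieces.insert( this ); // register on creation
    }
    ~stPiece() {
        all_pieces.erase( this ); // unregister on destruction
    }
};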
The key point is that there's a difference between containing some objects and merely being able to iterate over them. If using the most obvious way to create and delete the objects isn't using the container, they probably shouldn't be in it.