Get a std::list::iterator from std::reference_wrapper - c++

I'm coding a Fibonacci heap data structure (https://en.wikipedia.org/wiki/Fibonacci_heap) in C++.
This data structure consists of several heaps, with roots connected in a doubly-linked list. Each node has a doubly-linked list of its children. A whole heap has a doubly-linked list of leaf nodes, to support fast pruning. (CLRS 19-3.b)
My implementation of Node is:
struct Node {
    using Iterator = std::list<std::unique_ptr<Node>>::iterator;
    using LeafIterator = std::list<std::reference_wrapper<std::unique_ptr<Node>>>::iterator;
    Iterator parent;
    std::list<std::unique_ptr<Node>> child_list;
    T key;
    bool mark = false;
    bool is_leaf = false;
    LeafIterator leaf_iterator;
    Node(const T& key) : key {key} {}
};
My implementation of FibonacciHeap is:
using Iterator = std::list<std::unique_ptr<Node>>::iterator;
using LeafIterator = std::list<std::reference_wrapper<std::unique_ptr<Node>>>::iterator;
std::list<std::unique_ptr<Node>> NIL;
std::list<std::unique_ptr<Node>> root_list;
std::list<std::reference_wrapper<std::unique_ptr<Node>>> leaf_list;
Iterator min_element;
I used std::list<std::reference_wrapper<std::unique_ptr<Node>>> for leaf_list, instead of std::list<Node*>, because the memory of each leaf node is solely owned by its parent, and I don't want a double-delete crash.
The problem arises when I attempt to delete a leaf node. I can access a leaf node to delete by leaf_list.begin(), but I cannot erase it from its parent's child_list.
I thought of two possible workarounds:
Perform a linear scan of the parent's child_list to get the std::list<std::unique_ptr<Node>>::iterator that matches the given leaf. This is a linear scan, so it is slow.
Ditch leaf_list and give each Node two pointer members, prev_leaf and next_leaf, to emulate a doubly-linked list. I don't like this because it would bloat every Node.
...I can't think of anything else for now.
What would be the best way to get std::list<std::unique_ptr<Node>>::iterator from std::reference_wrapper in this case?

Using a std::list<Node*> would not cause any double deletes, as long as you don't manually delete nodes, and let the actual unique_ptr pointers in child_list members handle that. You would just need to be careful to avoid using a dangling pointer after a Node has been destroyed. But this way still doesn't give a good way to quickly remove a Node* from the appropriate child_list.
Instead, you could maybe use std::list<Iterator> leaf_list;. This is relatively safe since inserts and erases on a std::list do not invalidate any iterators (except of course iterators to erased elements).
Though since you still have an invariant to follow, that the iterators in leaf_list belong to the appropriate child_list, it would be good to help code follow it. Depending on the intended usage and generality of the class, that might mean just putting notes in comments within or just before the struct Node definition. Or it might mean making Node a proper class with private members and a safer public interface - I might consider creating custom iterators using boost::iterator_adaptor to allow iteration over the leaf nodes without as much danger of breaking the invariant. If you don't expect much reuse, but then find it would be useful again in more contexts or projects, you could of course change these sorts of decisions later (unless too much code gets written using the raw way).
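As a rough illustration of the std::list<Iterator> idea, here is a minimal sketch. It is not the asker's actual heap: int stands in for T, the parent is held as a plain non-owning Node* (an assumption made for brevity), and add_leaf and prune_first_leaf are hypothetical helper names. Bookkeeping for a parent that becomes a leaf again after pruning is left out.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <memory>

// Simplified stand-in for the question's Node; int replaces the template parameter T.
struct Node {
    using ChildList = std::list<std::unique_ptr<Node>>;
    using Iterator = ChildList::iterator;
    using LeafIterator = std::list<Iterator>::iterator;

    Node* parent = nullptr;      // non-owning back pointer (assumption for this sketch)
    ChildList child_list;        // owns the children
    int key;
    LeafIterator leaf_iterator;  // position inside leaf_list, meaningful only for leaves

    explicit Node(int k) : key{k} {}
};

// Hypothetical helper: append a new leaf under `parent` and record its
// child_list position in leaf_list.
Node::Iterator add_leaf(Node& parent, std::list<Node::Iterator>& leaf_list, int key) {
    parent.child_list.push_back(std::make_unique<Node>(key));
    Node::Iterator it = std::prev(parent.child_list.end());
    (*it)->parent = &parent;
    leaf_list.push_back(it);
    (*it)->leaf_iterator = std::prev(leaf_list.end());
    return it;
}

// Hypothetical helper: remove the first recorded leaf without scanning
// the parent's child_list.
void prune_first_leaf(std::list<Node::Iterator>& leaf_list) {
    assert(!leaf_list.empty());
    Node::Iterator it = leaf_list.front();
    Node* parent = (*it)->parent;
    leaf_list.pop_front();
    parent->child_list.erase(it);  // the unique_ptr destroys the node here
}
```

Because leaf_list stores the child_list iterator itself, erasing from the parent needs no search. The invariant mentioned above still applies: every iterator in leaf_list must point into the child_list of the node recorded as its parent.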

Related

Can a C++ intrusive linked list work with std::queue?

Recently I was trying to make an intrusive list (like boost's or elt's) work with std::queue.
std::queue<T> has two options for push, which end up calling through to push_back on the underlying list impl:
void push( const value_type& value );
void push( value_type&& value );
Neither of these makes much sense for intrusive lists of T. We have to modify T to insert it, so the const would be a lie, and we definitely don't want to move it into the list, since the whole point of an intrusive list is that it doesn't own the elements. Reasonably, then, the intrusive lists I've seen only implement void push_back(value_type& value);.
I realise std::queue doesn't buy me anything I couldn't do with manual application of push_back/pop_front ...but at this point I'm morbidly curious if it's possible. Is there some C++ magic I'm missing?
It would not be possible to do this if you want the original list object to remain valid after being added to the queue, but if you dispense with that requirement, then the move version of push() becomes usable with an intrusive list.
It would require creating a new intrusive list object (this would be managed by std::queue), assigning the current list's head and tail to the new list's head and tail and also fixing up the head's tail and tail's head to point to the new list and then clearing the old list's head and tail (these operations would all be handled by the list's move constructor, maybe move-assign).
The thing to keep in mind is that C++ standard containers always take copies of whatever is added. Copying an intrusive container likely never makes sense but moving one can.
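To make the "moving can make sense" point concrete, here is a minimal sketch of a move-only intrusive list being pushed into a std::queue. Item and IntrusiveList are hypothetical types written for this example, not boost's or any other library's. This simple design only keeps head and tail pointers, so the move just steals them; a list whose nodes point back at a sentinel embedded in the list object would also need the pointer fix-up described above.

```cpp
#include <cassert>
#include <queue>
#include <utility>

// Hypothetical element type: the link pointer lives inside the element itself.
struct Item {
    int value = 0;
    Item* next = nullptr;
};

// Minimal singly linked intrusive list: it links Items but never owns them.
class IntrusiveList {
public:
    IntrusiveList() = default;
    IntrusiveList(const IntrusiveList&) = delete;             // copying makes no sense here
    IntrusiveList& operator=(const IntrusiveList&) = delete;

    // Moving steals head and tail and leaves the source list empty.
    IntrusiveList(IntrusiveList&& other) noexcept
        : head_(std::exchange(other.head_, nullptr)),
          tail_(std::exchange(other.tail_, nullptr)) {}

    IntrusiveList& operator=(IntrusiveList&& other) noexcept {
        if (this != &other) {
            head_ = std::exchange(other.head_, nullptr);
            tail_ = std::exchange(other.tail_, nullptr);
        }
        return *this;
    }

    // Takes a non-const reference because the element's link is modified.
    void push_back(Item& item) {
        item.next = nullptr;
        if (tail_) tail_->next = &item; else head_ = &item;
        tail_ = &item;
    }

    bool empty() const { return head_ == nullptr; }
    Item& front() { assert(head_); return *head_; }

private:
    Item* head_ = nullptr;
    Item* tail_ = nullptr;
};
```

With this, std::queue<IntrusiveList> q; q.push(std::move(list)); compiles and leaves list empty, while the Items themselves stay wherever their owner put them.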

Reduce number of shared_ptrs in persistent data structure

I'm faced with a design choice for a singly linked list class. The rough idea is this:
template<typename T>
class List {
public:
    ...
private:
    struct Node {
        std::shared_ptr<const T> value;
        std::shared_ptr<const Node> next;
    };
    std::shared_ptr<const Node> node_;
};
Yes I know there are a lot of shared_ptrs wandering around, but that's because List is a functional persistent data structure that needs as much structural sharing as possible. In this implementation, for example, reversing a list does not require copying any elements, and multiple lists can share a common sub-list (by pointing to a same shared_ptr tail).
That being said, I still feel there are perhaps too many shared_ptrs. Is there any way to reduce the number of shared_ptrs used while still enabling structural sharing? Something like combining the two shared_ptrs inside a Node to reduce the overhead of control blocks... I don't know, maybe there isn't a way, or maybe there is. Any idea is welcome, even about redesigning the List class altogether.
You want to share data without structure (the reverse case).
You want to share structure.
Both require shared pointers. However, if you want to reduce control block overhead, this can be done, so long as you entangle lifetimes.
You can make the T's lifetime tied to its node. The reversed node then needs to also make the original node persist. This can cause structure to outlive its needs, but makes the pure-forward case less expensive.
Make the pointer-to-T a raw pointer.
Create a combined struct with a T and a Node in it.
Use make_shared to create it.
Now make the pointer-to-T point at the T in the combined struct.
Next, use the aliasing ctor to create a shared ptr to the Node sharing the control block of the combined struct.
To reverse, create a helper struct with a Node and a shared ptr to Node. Make shared the helper. Point the shared node ptr to the forward node, the T ptr to the T ptr in the forward node, and then use the aliasing ctor of shared ptr to get a shared ptr to Node.
I do not think this is worth it.
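For readers who want to see the forward-only part of this, here is a minimal sketch of the combined struct plus aliasing constructor. NodeWithValue and cons are hypothetical names, Node is shown as a free template rather than nested in List, and the reverse helper is not included.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

template <typename T>
struct Node {
    const T* value = nullptr;          // raw pointer; the T lives next to this node
    std::shared_ptr<const Node> next;  // structural sharing of the tail
};

// Hypothetical combined struct: the value and its node share one allocation
// and therefore one control block.
template <typename T>
struct NodeWithValue {
    T storage;
    Node<T> node;
    explicit NodeWithValue(T v) : storage(std::move(v)) {}
};

// Hypothetical helper: prepend `value` to `tail`.
template <typename T>
std::shared_ptr<const Node<T>> cons(T value, std::shared_ptr<const Node<T>> tail) {
    auto block = std::make_shared<NodeWithValue<T>>(std::move(value));
    block->node.value = &block->storage;
    block->node.next = std::move(tail);
    // Aliasing constructor: shares block's control block but points at block->node,
    // so the whole NodeWithValue stays alive as long as the node is referenced.
    return std::shared_ptr<const Node<T>>(block, &block->node);
}
```

Each list cell now costs one allocation and one control block instead of two, at the price of tying the lifetime of the T to the lifetime of its node.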

Remove item from a list using its pointer

I have a pointer p (not an iterator) to an item in a list. Can I then use p to delete (erase) the item from the list? Something like:
mylist.erase(p);
So far I have only been able to do this by iterating through the list until I reach an item at the location p, and then using the erase method, which seems very inefficient.
Nope, you'll have to use an iterator. I don't get why getting the pointer is easier than getting an iterator though...
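A small sketch of both routes, using a std::list<std::string> as a stand-in for the asker's list: keeping the iterator that insert returns gives a constant-time erase later, while a bare pointer only allows a linear search by address. add_name and erase_by_pointer are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

using Names = std::list<std::string>;

// Keep the iterator returned by insert as the handle; std::list iterators
// stay valid until the element they refer to is erased.
Names::iterator add_name(Names& names, const std::string& name) {
    return names.insert(names.end(), name);
}

// Fallback when only a pointer is available: linear search comparing addresses.
bool erase_by_pointer(Names& names, const std::string* p) {
    auto it = std::find_if(names.begin(), names.end(),
                           [p](const std::string& s) { return &s == p; });
    if (it == names.end()) return false;
    names.erase(it);
    return true;
}
```

With the handle from add_name, names.erase(handle) removes the element directly, with no search.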
A std::list is not associative so there's no way you can use a pointer as a key to simply delete a specific element directly.
The fact that you find yourself in this situation points to a questionable design, since you're correct that the only way to remove the item from the collection as it stands is to search through it (i.e. linear complexity).
The following may be worth considering:
1. If possible, you could change the list to a std::multiset (assuming there are duplicate items), which will make direct access more efficient.
2. If the design allows, change the item that you're pointing to so that it incorporates a 'deleted' flag (or use a template to provide this), allowing you to avoid deleting the object from the collection but quickly mark it as deleted. The drawback is that all your software will have to change to accommodate this convention.
3. If this is the only bit of linear searching and the collection is not big (fewer than 20 items, say), then for the sake of expediency just do the linear search as you've suggested, but leave a big comment in the code indicating that you "completely get" how inefficient this is. You may find that this does not become a tangible issue for a while, if ever.
I'm guessing that 3 is probably your best option. :)
This is not what I advise you to do, but just to answer the question:
Read only if you are ready to go into forbidden world of undefined behavior and non-portability:
There is non-portable way to make an iterator from T* pointer to an element in a list<T>. You need to look into your std library list header file. For Gnu g++ it includes stl_list.h where std::list definition is. Most typically std::list<T> consists of nodes similar to this:
template <class T>
struct Node {
    T item;
    Node* prev;
    Node* next;
};
Given a pointer to Node<T>::item, you can use offsetof to calculate the address of the node itself. Be aware that this Node template could be a private part of std::list, so you must hack around that, let's say by defining an identical struct template with a different name. std::list<>::iterator is just a wrapper over this node.
It cannot be done.
I have a similar problem in that I'm using epoll_wait and processing a list of events. The events structure only contains a union, of which the most obvious type to use is void * to indicate which data is relevant (including the file descriptor) that was found.
It seems really silly that std::list will not allow you to remove an element via a pointer since there is obviously a next and previous pointer.
I'm considering going back to using the Linux kernel LIST macros instead to get around this. The problem with too much abstraction is that you have to give up on interoperability and communication with lower level apis.

C++ Linked List remove all

So this is a bit of a conceptual question. I'm writing a LinkedList in C++, and as Java is my first language, I started to write my removeAll function so that it just joins the head and the tail nodes (I'm using sentinel Nodes btw). But I instantly realized that this won't work in C++ because I have to free the memory for the Nodes!
Is there some way around iterating through the entire list, deleting every element manually?
You can make each node own the next one, i.e. be responsible for destroying it when it is destroyed itself. You can do this by using a smart pointer like std::unique_ptr:
struct node {
    // blah blah
    std::unique_ptr<node> next;
};
Then you can just destroy the first node and all the others will be accounted for: they will all be destroyed in a chain reaction of unique_ptr destructors.
If this is a doubly-linked list, you should not use unique_ptrs in both directions, however. That would make each node own the next one, and be owned by the next one! You should make this ownership relation exist only in one direction. In the other use regular non-owning pointers: node* previous;
However, this will not work as is for the sentinel node: it should not be destroyed. How to handle that depends on how the sentinel node is identified and other properties of the list.
If you can tell the sentinel node apart easily, like, for example, checking a boolean member, you can use a custom deleter that avoids deleting the sentinel:
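struct node;  // forward declaration so the deleter and typedef can name the type
struct delete_if_not_sentinel {
    void operator()(node* ptr) const;  // defined below, once node is complete
};
typedef std::unique_ptr<node, delete_if_not_sentinel> node_handle;
struct node {
    // blah blah
    bool is_sentinel = false;  // the boolean member the deleter checks
    node_handle next;
};
inline void delete_if_not_sentinel::operator()(node* ptr) const {
    if(!ptr->is_sentinel) delete ptr;
}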
This stops the chain reaction at the sentinel.
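The ownership scheme from this answer (an owning next, a non-owning previous) can be sketched without the sentinel deleter as follows. append and remove_all_after are hypothetical helpers. One addition that the answer does not mention: the chain reaction of destructors is recursive, so this sketch empties the list in a loop instead, which should keep the call depth constant for very long lists.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct node {
    int value = 0;
    std::unique_ptr<node> next;  // owning link: destroying a node destroys its successor
    node* previous = nullptr;    // non-owning back link
};

// Hypothetical helper: append a new node after `tail` and return the new tail.
node* append(node& tail, int value) {
    tail.next = std::make_unique<node>();
    tail.next->value = value;
    tail.next->previous = &tail;
    return tail.next.get();
}

// Hypothetical helper: destroy every node after `head`, one at a time.
void remove_all_after(node& head) {
    std::unique_ptr<node> current = std::move(head.next);
    while (current) {
        // Detach the rest first so the node being destroyed has an empty next.
        std::unique_ptr<node> rest = std::move(current->next);
        current = std::move(rest);  // the old node is destroyed by this assignment
    }
}
```

Here head plays the role of a sentinel that lives outside the chain (for example as a member of the list class), so it is never deleted through a unique_ptr.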
You could do it like Java if you used a c++ garbage collector. Not many do. In any case, it saves you at most a constant factor in running time, as you spend the cost to allocate each element in the list anyway.
Yes. Well, sort of... If you implement your list to use a memory pool then it is responsible for all data in that pool and the entire list can be deleted by deleting the memory pool (which may contain one or more large chunks of memory).
When you use memory pools, you generally have at least one of the following considerations:
limitations on how your objects are created and destroyed;
limitations on what kind of data you can store;
extra memory requirements on each node (to reference the pool);
a simple, intuitive pool versus a complex, confusing pool.
I am no expert on this. Generally when I've needed fast memory management it's been for memory that is populated once, with no need to maintain free-lists etc. Memory pools are much easier to design and implement when you have specific goals and design constraints. If you want some magic bullet that works for all situations, you're probably out of luck.
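One ready-made form of this idea, assuming a C++17 standard library, is std::pmr::monotonic_buffer_resource combined with std::pmr::list. The sketch below is only an illustration of the pool approach, not a recommendation for the asker's hand-written list; sum_with_pool is a hypothetical function name and the buffer size is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory_resource>

// Build a list whose nodes all come from one monotonic pool and return the sum
// of its elements.
int sum_with_pool(int count) {
    // Initial chunk on the stack; if it runs out, the resource requests more
    // memory from its upstream resource (the default resource here).
    std::byte buffer[4096];
    std::pmr::monotonic_buffer_resource pool(buffer, sizeof buffer);

    std::pmr::list<int> values(&pool);
    for (int i = 1; i <= count; ++i) values.push_back(i);

    int sum = 0;
    for (int v : values) sum += v;
    return sum;
    // On return, values is destroyed first; its per-node deallocations are
    // no-ops for a monotonic resource. The pool then releases its memory at once.
}
```

Note that the list destructor still visits every node; what the pool removes is the per-node cost of returning memory to the general-purpose allocator.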

C++ vector of pointers problem

I'm currently trying to implement the A* pathfinding algorithm using C++.
I'm having some problems with pointers... I usually find a way to avoid using them but now I guess I have to use them.
So let's say I have a "node" class(not related to A*) implemented like this:
class Node
{
public:
    int x;
    Node *parent;
    Node(int _x, Node *_parent)
        : x(_x), parent(_parent)
    { }
    bool operator==(const Node &rhs)
    {
        return x == rhs.x && parent == rhs.parent;
    }
};
It has a value (in this case, int x) and a parent (a pointer to another node) used to navigate through nodes with the parent pointers.
Now, I want to have a list of nodes which contains all the nodes that have been or are being considered. It would look like this:
std::vector<Node> nodes;
I want a list that contains pointers pointing to nodes inside the nodes list.
Declared like this:
std::vector<Node*> list;
However, I'm definitely not understanding pointers properly because my code won't work.
Here's the code I'm talking about:
std::vector<Node> nodes;//nodes that have been considered
std::vector<Node*> list;//pointers to nodes inside the nodes list.
Node node1(1, NULL);//create a node with a x value of 1 and no parent
Node node2(2, &node1);//create a node with a x value of 2 and node1 being its parent
nodes.push_back(node1);
list.push_back(&nodes[0]);
//so far it works
//as soon as I add node2 to nodes, the pointer in "list" points to an object with
//strange data, with a x value of -17891602 and a parent 0xfeeefeee
nodes.push_back(node2);
list.push_back(&nodes[1]);
There is clearly undefined behaviour going on, but I can't manage to see where.
Could somebody please show me where my lack of understanding of pointers breaks this code and why?
So, the first issue that you have here is that you are using the address of individual Nodes of one of your vectors. But, over time, as you add more Node objects to your vector, those pointers may become invalid, because the vector may move the Nodes.
(The vector starts out at a certain pre-allocated size, and when you fill it up, it allocates a new, larger storage area and moves all of the elements to the new location. I'm betting that in your case, as soon as you add the second Node to nodes, it is doing this move.)
Is there a reason why you can't store the indices instead of the raw pointers?
One problem is that push_back can force a reallocation of the vector, i.e. it creates a larger block of memory, copies all existing elements to that larger block, and then deletes the old block. That invalidates any pointers you have to elements in the vector.
The problem is that, every time you add to a vector, it might need to expand its internal memory. If it does so, it allocates a new piece of storage, copies everything over, and deletes the old one, invalidating iterators and pointers to all of its objects.
As a solution to your problem, you could either
avoid reallocation by reserving enough space upfront (nodes.reserve(42))
turn nodes into a std::list (which doesn't invalidate iterators or pointers to elements not directly affected by changes)
store indexes instead of pointers.
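As a minimal sketch of the index option: the Node below is a simplified version of the asker's class in which the parent is also stored as an index, and kNoParent and add_node are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marker for "no parent"; an assumption of this sketch.
constexpr std::size_t kNoParent = static_cast<std::size_t>(-1);

struct Node {
    int x;
    std::size_t parent;  // index into the owning vector, or kNoParent for a root
};

// Append a node and return its index. The index stays meaningful even when
// push_back reallocates the vector's storage.
std::size_t add_node(std::vector<Node>& nodes, int x, std::size_t parent) {
    nodes.push_back(Node{x, parent});
    return nodes.size() - 1;
}
```

A second container can then hold std::size_t values instead of Node*, and nodes[i] is looked up at the moment it is needed. The reserve option also works, but only while the number of elements never exceeds the reserved capacity.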
Besides your problem, but still worth mentioning:
The legal use of identifiers starting with underscores is rather limited. Yours is legal, but if you don't know the exact rules, you might want to avoid using them.
Your comparison operator isn't declared const, so it doesn't promise to leave its left operand unchanged and cannot be called on a const Node. Also, operators that treat their operands equally (i.e. do not modify them, as opposed to, say, +=) are usually best implemented as free functions rather than as member functions.
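A sketch of the comparison written as free functions, with both operands taken by const reference. The constructor parameter names are changed here only to sidestep the leading-underscore question; everything else mirrors the asker's Node.

```cpp
#include <cassert>

struct Node {
    int x;
    Node* parent;
    Node(int x_, Node* parent_) : x(x_), parent(parent_) {}
};

// Free function: neither operand can be modified, and const Nodes can appear
// on either side of the comparison.
bool operator==(const Node& lhs, const Node& rhs) {
    return lhs.x == rhs.x && lhs.parent == rhs.parent;
}

bool operator!=(const Node& lhs, const Node& rhs) {
    return !(lhs == rhs);
}
```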
Just adding to the existing answers: instead of raw pointers, consider using some form of smart pointer. For example, if boost is available, consider shared_ptr.
std::vector<boost::shared_ptr<Node> > nodes;
and
std::list<boost::shared_ptr<Node> > list;
Hence, you only need to create a single instance of each Node, and it is "managed" for you. Inside the Node class, you have the option of a shared_ptr for parent (if you want to ensure that the parent Node does not get cleaned up until all child nodes are removed), or you can make that a weak_ptr.
Using shared pointers may also help alleviate problems where you want to store "handles" in multiple containers (i.e. you don't necessarily need to worry about ownership - as long as all references are removed, then the object will get cleaned up).
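The same arrangement can be sketched with the standard library equivalents, std::shared_ptr and std::weak_ptr, which have been available since C++11. This is only an illustration of the idea above; make_node is a hypothetical helper, and the weak_ptr parent is one of the two options mentioned.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    int x;
    std::weak_ptr<Node> parent;  // non-owning: does not keep the parent alive
    Node(int x_, const std::shared_ptr<Node>& parent_) : x(x_), parent(parent_) {}
};

// Hypothetical helper: create a node on the heap and register it in `nodes`.
// The Node object never moves, so handles to it stay valid when the vector grows.
std::shared_ptr<Node> make_node(std::vector<std::shared_ptr<Node>>& nodes,
                                int x, const std::shared_ptr<Node>& parent) {
    auto node = std::make_shared<Node>(x, parent);
    nodes.push_back(node);
    return node;
}
```

A second container such as std::vector<std::shared_ptr<Node>> can hold copies of the same handles; the vector may reallocate its array of handles, but the Node objects they point to stay put.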
Your code looks fine to me, but remember that when nodes goes out of scope, list becomes invalid.