I have gone through the C++ reference manual and am still unclear on how to use the priorityqueue data structure in STL.
So, basically I have been trying to implement my own using heaps.
I am doing this for implementing Prim's algorithm.
Vector <int, int> pq;
This is my priority queue. The first field is the node and the second field the weight to the existing tree.
I plan to modify the values of weight in pq every time a new node is added to the tree by updating the weights of its neighbour nodes.
How do I access the individual elements of this vector? I also need to be able to delete elements at will.
Is this a good way to implement a priority queue? what if I want to add another field to the container, namely
Vector<int, int, int> MST
How would I access the third element? I want to store the resulting MST this way such that the first two fields represent the vertices forming the edge, and the third the weight.
It would also help if someone could tell me how to assign elements to this vector using push_back.
Also, would the conventional C++ STL priority queue help in this as I need to update the priority values each time a new element is added to the MST? Would it self-correct itself according to the priority when values are modified?
One other question, these Vectors, when I pass them to a function, and try to make changes, is it a pass by value or pass by reference - Or, are the changes reflected outside the function?
In Prim's algorithm the random access to elements not needed. You just need to skip elements from the queue which connect already connected and pass forward.
So the algorithm looks as follows:
choose a node N
add all edges from N to the PQ
pop a minimal cost edge from PQ
if it connects nodes which are already in the tree, skip it
otherwise add this edge to the tree, call the new node N and go back to point 2.
After adding the node just check if size of the tree is already size of graph - 1. If so then finish.
Note that the only operations on PQ are add_element and pop_minimum - thus std::priority_queue will work.
Firstly, std::vector<int,int> isn't valid - the second type argument is an (optional) allocator, and int is not an allocator. If you're using a different underlying container, please say what it is. I'll assume you want to work with std::vector for now.
Secondly, std::priority_queue doesn't support the operations you want (access and delete arbitrary elements), so you can't use that.
You can use the underlying vector directly, and the heap algorithms (std::make_heap etc.) to sort it:
random access will work (although it's not clear what you expect the index to be once your vector is in heap order)
deleting an arbitrary element will require erasing it from the vector and re-running make_heap, or you can implement your own siftDown
Oh, and you can make some value type to store in your vector, such as
std::vector<std::pair<int,int>>
for your first example, or perhaps more clearly:
struct {
int node;
int weight;
} Node;
// ...
std::vector<Node>
Related
So I am looking for a data structure which needs a FIFO behaviour but should also have a quick look up time by value.
In my current code I have some data duplication. I use a std::unordered_set and std::queue for achieving the behaviour I want but there's probably a better way of achieving this that I'm not thinking of at the moment. I have a function that adds my new entry to both the set and the queue when a new entry comes up. To search if an entry exists in the queue I use find() in the set. Laslty, I have a timer that is set off after an insertion to the queue. After a minute I get the entry in the front of the queue with queue.front(), then I use this value to erase from the set, and finally I do a pop on the queue.
This all works as expected and gives me both the FIFO behaviour and the constant time complexity for the look up but I have data duplication and I was wondering if there is a data structure (maybe something form boost?) which does what I want without the duplication.
Data structure for FIFO behaviour and fast lookup by value
A solution is to use two containers: Store the elements in an unordered set for fast lookup, and upon insertion, store iterator to the element in a queue. When you pop the queue, erase the corresponding element from the set.
A more structured approach is to use a multi-index container. The standard library doesn't provide such, but boost does. More specifically, you could use a combination of hashed and sequence indices.
This answer is mostly concerning corner cases of the problem as presented
If you problem is a practical one, and you are able store the elements with a std::vector - and if you have less than in the ballpark of some ~10-100 elements in the queue, then you could just use:
std::queue<T, std::vector<T> > q;
That is a queue using vector as the underlying container. When you have that small number of elements (only 10-100) then using advanced lookup methods is not worth it.
You then only needs to check for duplicates when you pop the queue not on every insertion. Again, that might or might not be usefull depending on your specific case. I can imagine cases where this method is superior. Eg. a webserver serving pages that gets a lot of hits to just one or a few pages. Then it might be faster to just add say 100,000 elements to the vector and then go and remove the duplicates all in one go when popping.
How about defining your own data structure which can act as a BST (for lookups) and as a min heap which you can use to impose fifo?
class node {
public:
static int autoIncrement = 0;
int order; // this will be auto-incremented to impose FIFO
int data;
node* left_Bst;
node* right_Bst;
node* left_Heap;
node* right_Heap;
node() {
order = autoIncrement;
autoIncrement++;
}
}
By doing this you are basically creating two data structures sharing the same nodes. BST's partial order is imposed via data, and heap's can be maintained via order variable.
During an insertion you can traverse via BST pointers and insert your element if it doesn't exist already and also modify the heap pointers accordingly after insertion.
I was going through Djikstra's algorithm when I noticed, I could update keys in heap(with n keys) in O(logn) time (last line in the pseudocode). How do I update keys in heaps in C++, is there any method in priority_queues to do this? Or do I have to write my own heap class to do achieve updates in O(logn) like this?
Edit 1:
Clarifying my need - for a binary heap with n elements -
1) Should insert new values and find & pop minimum values in O(logn)
2) Should update already present keys in O(logn)
I tried to come up with a way to implement this using make_heap, push_heap, pop_heap, and a custom function for update as John Ding suggested.
However I am facing a problem in making the function, I first need to find the location of the key in the heap. Doing this under O(logn) in a heap requires a lookup array for position of keys in heap, see here (I don't know of any other way). However these lookup tables won't be updated when I call push_heap or pop_heap.
You can optimize dijktra algorithm with priority_queue. It is implemented by a binary heap, where you can pop the top or push in a element in O(logN) time. However, due to the encapsulation of priority_queue, you cannot modify the key(more pricisely, decrease the key) of any element.
So our method is to push multiple elements into the heap regardless of whether we have multiple elements refering to the same node.
for example, when
Node N : distance = 30, GraphNode = A(where A refers to one node in the graph, while N is one node in the heap)
is already in the heap, then using the priority_queue cannot help you do such a operation when we try to relax Node N:
decrease_key_to(N, 20)
by decreasing key can make the heap always include less than N elements, but it's cannot be implemented by priority_queue
What we can do with it is to add another node in the heap:
Node N2 : distance = 20, GraphNode = A
push N2 into the heap
That's corresponding to priority_queue::push
So you may need to implement a binary heap supporting decrease_key yourself or find an implementation online, and store a table of pointers pointing to every element in a heap to know access elements through nodes in the graph.
As an extension, using Fibonacci heap can even make decrease_key faster, that's the ultimate level of Dijkstra, Haha :)
Problem of last version of my answer:
We cannot locate the element pushed in to the heap using push_heap.
In order to do this, you need more than the priority_queue provides: you need to know where in the underlying container the element to be updated is stored. In a binary heap, for example, you need to know the position of the element for which you want to change the priority. With this you can change the priority of the element and then restore the heap property in O(log n) by bubbling the modified element up or down.
For Dijkstra, if I remember correctly, something called Fibonacci heap is more efficient than a binary heap.
Unfortunately, std::priority_queue doesn't support updates of entries in the heap, but you may be able to invalidate entries in the heap and wait for them to percolate up to the root from where they can eventually be deleted. So instead of changing an existing entry, you invalidate it and insert another one with the new priority. Whether you can live with the consequences of having invalid entries filling up the heap, is for you to judge.
For an idea how this might work in practice, see here.
I have 2 vectors, y and T initially of the same size, and they need to stay separated like this. I do a loop until T is empty and every time it loops, the first element of T is used for an algorithm and then erased from the vector T and pushed into vector S (which is empty at first). Every loop, some values in vector y will change and I need to sort them. My problem is: when I sort y, if y[2] and y[3] swap, I need to swap the elements in T that were at [2] and [3] BEFORE the first loop!
I know this seems weird but this is for a Dijkstra algorithm for my Graph project. I understand if it's not clear and I'll try to clarify if you need it. Any advice will be very helpful for me! Thank you!
If I follow you correctly, T is really a FIFO queue. The first element in it each iteration is being 'popped off the front' and placed elsewhere.
Should you be doing this and want at any time to know an ordering of T which includes the elements that were popped off, perhaps you could not remove them from T at all. Just have an iterator or pointer to the next node to be processed.
That node could either be copied to S, or perhaps S would just contain pointers to the node if it were expensive to copy.
This way you're never actually removing the element from T, simply moving on to look at the next item. Presumably it could be cleaned up at the end.
For the Dijkstra algorithm, you usually use a binary heap to implement a priority queue and visit nodes in ascending order of distance to source nodes. At each visit, neighboring distances may be updated (relax operation) so corresponding queue elements change priority (decrease-key operation).
Instead of using the distance directly as the key of heap elements, use the node identifier (or pointer). But also use a comparison function (as e.g. in std::make_heap and most STL algorithms) that, given two node identifiers, compares the corresponding distances.
This way, node identifiers are re-ordered according to heap operations and, whenever you pop an element from the heap (having minimal distance), you can access any information you like given its node identifier.
I have a red black tree algorithm which is working fine. When a node is inserted into the tree, the insert() method returns to the caller a pointer to the node that was inserted. I store all such pointers in a STL vector.
The problem is, within the operation of the RB tree, sometimes these pointers are invalidated. For instance, there is a method that is called during a rotateleft/right that copies the values of node A into the current node and then deletes node A. Well I had a pointer to node A in that vector which is now invalid.
I thought about making a way to update the pointers in the vector as follows,
1) keep a multimap which maps node pointers to the vector indices that holds those pointers.
2) Before deleting a node, check this multimap to find all the spots in the vector that will be affected
3) iterate over the vector and change the old pointer to the new pointer
4) Update the key value in the multimap to reflect the new pointer as well.
Problem is, you can't update a key value of a map collection for obvious reasons. Also this seems like a horrible solution both for complexity and implementation reasons. Any ideas on how I can accomplish this dynamic updating of pointers?
It seems more reasonable to keep the data in some opaque data structure pointed by node, and to keep external pointers to this structures instead of nodes.
Basicly it means adding a level of indirection between the tree and actual data.
I'm not sure if this is exactly what you're trying to do, but to keep track of items added to tree/heap data structures, the following has worked for me in the past:
Store two "index" vectors in addition to the underlying tree data:
std::vector<int> item_to_tree;
std::vector<int> tree_to_item;
So, to find the index in the tree of the ith item, use item_to_tree[i]. To find the item at a particular jth tree index, use tree_to_item[j]. This is similar to storing explicit pointers, as you've done, but by making use of indices you can essentially get a bi-directional map with O(1) lookup.
Obviously, within the tree operations, you have to make sure that the mappings stay consistent. I've not thought about this for an RB tree, but definitely for other tree-like structures this just adds some O(1) complexity to each operation.
In the case of the ith item "removed" from the tree, tree_to_item no longer contains the ith item index, and I usually set item_to_tree[i] = -1, or some "empty" flag.
Hope this helps.
Is i have a priority queue with declaration
priority_queue<<Node>,vector<Node>,myComp> openQ
i am inserting node objects into it. but at some time i have to delete the element from it. (not to remove the top element)
Currently to delete it i am popping the element and putting it in array. if the top most element is desired then expect it i push other elements in array.
This is like linear search and delete. I know its not efficient and i am looking for some better ways
priority_queue class is designed for using as queue with priorities. And it designed to remove elements with pop function. If you want to get different behavior you should use different class. For instance, std::map.
If you're ready to manually control a consistence of the queue you may take a look on std::make_heap. It's easy to create a max-heap with it. But in this case you need to manually rebuild a queue each time you want to remove an element.
std::set are orderd and can erase elements by value. the underlying datastructure is a binary search tree, thats why its cheaper to erase elements by value.
a priority queue would have to linearly search through the queue, so its possible but just not very efficient, thats why they didnt include it.