I'm new to STL containers (and C++ in general) so thought I would reach out to the community for help. I basically want to have a priority_queue that supports constant iteration. Now, it seems that std::priority_queue doesn't support iteration, so I'm going to have to use something else, but I'm not sure exactly what.
Requirements:
Maintains order on insertion (like a priority queue)
Pop from top of list
Get const access to each element of the list (don't care about the order in the queue for this stage)
One option would be to keep a priority_queue and separately have an unordered_set of references, but I'd rather not have two containers floating around. I could also use a deque and search through for the right insertion position, but I'd rather have the container manage the sorting for me if possible (and constant-time insertion would be nicer than linear-time). Any suggestions?
There are two options that come to mind:
1) Implement your own iterable priority queue, using std::vector and the heap operation algorithms (see Heap Operations here).
2) derive (privately) from priority_queue. This gives you access to the underlying container via data member c. You can then expose iteration, random access, and other methods of interest in your public interface.
Using a std::vector might be enough as others already pointed, but if you want already-ready implementation, maybe use Boost.Heap (which is a library with several priority queue containers): http://www.boost.org/doc/libs/1_53_0/doc/html/heap.html
Boost is a collection of libraries that basically complete the standard library (which is not really big). A lot of C++ developers have boost ready on their dev computer to use it when needed. Just be careful in your choices of libraries.
You can use (ordered) set as a queue. set.begin() will be your top element, and you can pop it via erase(set.begin()).
Have you observed heap (std::make_heap) ? It hasn't order inside of queue, but has priority "pop from top of list" which you need.
Related
There's a well known image (cheat sheet) called "C++ Container choice". It's a flow chart to choose the best container for the wanted usage.
Does anybody know if there's already a C++11 version of it?
This is the previous one:
Not that I know of, however it can be done textually I guess. Also, the chart is slightly off, because list is not such a good container in general, and neither is forward_list. Both lists are very specialized containers for niche applications.
To build such a chart, you just need two simple guidelines:
Choose for semantics first
When several choices are available, go for the simplest
Worrying about performance is usually useless at first. The big O considerations only really kick in when you start handling a few thousands (or more) of items.
There are two big categories of containers:
Associative containers: they have a find operation
Simple Sequence containers
and then you can build several adapters on top of them: stack, queue, priority_queue. I will leave the adapters out here, they are sufficiently specialized to be recognizable.
Question 1: Associative ?
If you need to easily search by one key, then you need an associative container
If you need to have the elements sorted, then you need an ordered associative container
Otherwise, jump to the question 2.
Question 1.1: Ordered ?
If you do not need a specific order, use an unordered_ container, otherwise use its traditional ordered counterpart.
Question 1.2: Separate Key ?
If the key is separate from the value, use a map, otherwise use a set
Question 1.3: Duplicates ?
If you want to keep duplicates, use a multi, otherwise do not.
Example:
Suppose that I have several persons with a unique ID associated to them, and I would like to retrieve a person data from its ID as simply as possible.
I want a find function, thus an associative container
1.1. I couldn't care less about order, thus an unordered_ container
1.2. My key (ID) is separate from the value it is associated with, thus a map
1.3. The ID is unique, thus no duplicate should creep in.
The final answer is: std::unordered_map<ID, PersonData>.
Question 2: Memory stable ?
If the elements should be stable in memory (ie, they should not move around when the container itself is modified), then use some list
Otherwise, jump to question 3.
Question 2.1: Which ?
Settle for a list; a forward_list is only useful for lesser memory footprint.
Question 3: Dynamically sized ?
If the container has a known size (at compilation time), and this size will not be altered during the course of the program, and the elements are default constructible or you can provide a full initialization list (using the { ... } syntax), then use an array. It replaces the traditional C-array, but with convenient functions.
Otherwise, jump to question 4.
Question 4: Double-ended ?
If you wish to be able to remove items from both the front and back, then use a deque, otherwise use a vector.
You will note that, by default, unless you need an associative container, your choice will be a vector. It turns out it is also Sutter and Stroustrup's recommendation.
I like Matthieu's answer, but I'm going to restate the flowchart as this:
When to NOT use std::vector
By default, if you need a container of stuff, use std::vector. Thus, every other container is only justified by providing some functionality alternative to std::vector.
Constructors
std::vector requires that its contents are move-constructible, since it needs to be able to shuffle the items around. This is not a terrible burden to place on the contents (note that default constructors are not required, thanks to emplace and so forth). However, most of the other containers don't require any particular constructor (again, thanks to emplace). So if you have an object where you absolutely cannot implement a move constructor, then you will have to pick something else.
A std::deque would be the general replacement, having many of the properties of std::vector, but you can only insert at either ends of the deque. Inserts in the middle require moving. A std::list places no requirement on its contents.
Needs Bools
std::vector<bool> is... not. Well, it is standard. But it's not a vector in the usual sense, as operations that std::vector normally allows are forbidden. And it most certainly does not contain bools.
Therefore, if you need real vector behavior from a container of bools, you're not going to get it from std::vector<bool>. So you'll have to make due with a std::deque<bool>.
Searching
If you need to find elements in a container, and the search tag can't just be an index, then you may need to abandon std::vector in favor of set and map. Note the key word "may"; a sorted std::vector is sometimes a reasonable alternative. Or Boost.Container's flat_set/map, which implements a sorted std::vector.
There are now four variations of these, each with their own needs.
Use a map when the search tag is not the same thing as the item you're looking for itself. Otherwise use a set.
Use unordered when you have a lot of items in the container and search performance absolutely needs to be O(1), rather than O(logn).
Use multi if you need multiple items to have the same search tag.
Ordering
If you need a container of items to always be sorted based on a particular comparison operation, you can use a set. Or a multi_set if you need multiple items to have the same value.
Or you can use a sorted std::vector, but you'll have to keep it sorted.
Stability
When iterators and references are invalidated is sometimes a concern. If you need a list of items, such that you have iterators/pointers to those items in various other places, then std::vector's approach to invalidation may not be appropriate. Any insertion operation may cause invalidation, depending on the current size and capacity.
std::list offers a firm guarantee: an iterator and its associated references/pointers are only invalidated when the item itself is removed from the container. std::forward_list is there if memory is a serious concern.
If that's too strong a guarantee, std::deque offers a weaker but useful guarantee. Invalidation results from insertions in the middle, but insertions at the head or tail causes only invalidation of iterators, not pointers/references to items in the container.
Insertion Performance
std::vector only provides cheap insertion at the end (and even then, it becomes expensive if you blow capacity).
std::list is expensive in terms of performance (each newly inserted item costs a memory allocation), but it is consistent. It also offers the occasionally indispensable ability to shuffle items around for virtually no performance cost, as well as to trade items with other std::list containers of the same type at no loss of performance. If you need to shuffle things around a lot, use std::list.
std::deque provides constant-time insertion/removal at the head and tail, but insertion in the middle can be fairly expensive. So if you need to add/remove things from the front as well as the back, std::deque might be what you need.
It should be noted that, thanks to move semantics, std::vector insertion performance may not be as bad as it used to be. Some implementations implemented a form of move semantic-based item copying (the so-called "swaptimization"), but now that moving is part of the language, it's mandated by the standard.
No Dynamic Allocations
std::array is a fine container if you want the fewest possible dynamic allocations. It's just a wrapper around a C-array; this means that its size must be known at compile-time. If you can live with that, then use std::array.
That being said, using std::vector and reserveing a size would work just as well for a bounded std::vector. This way, the actual size can vary, and you only get one memory allocation (unless you blow the capacity).
Here is the C++11 version of the above flowchart. [originally posted without attribution to its original author, Mikael Persson]
Here's a quick spin, although it probably needs work
Should the container let you manage the order of the elements?
Yes:
Will the container contain always exactly the same number of elements?
Yes:
Does the container need a fast move operator?
Yes: std::vector
No: std::array
No:
Do you absolutely need stable iterators? (be certain!)
Yes: boost::stable_vector (as a last case fallback, std::list)
No:
Do inserts happen only at the ends?
Yes: std::deque
No: std::vector
No:
Are keys associated with Values?
Yes:
Do the keys need to be sorted?
Yes:
Are there more than one value per key?
Yes: boost::flat_map (as a last case fallback, std::map)
No: boost::flat_multimap (as a last case fallback, std::map)
No:
Are there more than one value per key?
Yes: std::unordered_multimap
No: std::unordered_map
No:
Are elements read then removed in a certain order?
Yes:
Order is:
Ordered by element: std::priority_queue
First in First out: std::queue
First in Last out: std::stack
Other: Custom based on std::vector?????
No:
Should the elements be sorted by value?
Yes: boost::flat_set
No: std::vector
You may notice that this differs wildly from the C++03 version, primarily due to the fact that I really do not like linked nodes. The linked node containers can usually be beat in performance by a non-linked container, except in a few rare situations. If you don't know what those situations are, and have access to boost, don't use linked node containers. (std::list, std::slist, std::map, std::multimap, std::set, std::multiset). This list focuses mostly on small and middle sided containers, because (A) that's 99.99% of what we deal with in code, and (B) Large numbers of elements need custom algorithms, not different containers.
As read on cplusplus.com, std::queue is implemented as follows:
queues are implemented as containers adaptors, which are classes that
use an encapsulated object of a specific container class as its
underlying container, providing a specific set of member functions to
access its elements. Elements are pushed into the "back" of the
specific container and popped from its "front".
The underlying container may be one of the standard container class
template or some other specifically designed container class. This
underlying container shall support at least the following operations:
......
The standard container classes deque and list fulfill these
requirements. By default, if no container class is specified for a
particular queue class instantiation, the standard container deque is
used.
I am confused as to why deque (a double-ended-queue on steroids) is used as a default here, instead of list (which is a doubly-linked list).
It seems to me that std::deque is very much overkill: It is a double-ended queue, but also has constant-time element access and many other features; being basically a full-featured std::vector bar the 'elements are stored contiguously in memory' guarantee.
As a normal std::queue only has very few possible operations, it seems to me that a doubly-linked list should be much more efficient, as there is a lot less plumbing that needs to happen internally.
Why then is std::queue implemented using std::deque as default, instead of std::list?
Stop thinking of list as "This is awkward to use, and lacks a bunch of useful features, so it must be the best choice when I don't need those features".
list is implemented as a doubly-linked list with a cached count. There are a narrow set of situations where it is optimal; when you need really, really strong reference/pointer/iterator stability. When you erase and insert in the middle of a container orders of magnitude more often than you iterate to the middle of a container.
And that is about it.
The std datatypes were generally implemented, then their performance and other characteristics analyzed, then the standard was written saying "you gotta guarantee these requirements". A little bit of wiggle room was left.
So when they wrote queue, someone probably profiled how list and deque performed and discovered how much faster deque was, so used deque by default.
In practice, someone could ship a deque with horrible performance (for example, MSVC has a tiny block size), but making it worse than what is required for a std::list would be tricky. list basically mandates one-node-per-element, and that makes memory caches cry.
The reason is that deque is orders of magnitude faster than list. List allocates each element separately, while deque allocates large chunks of elements.
The advantage of list is that it is possible to delete elements in the middle, but a queue does not require this feature.
Vectors and Linked Lists
Vectors are stored in memory serially, and therefore any element may be accessed using the operator[], just as in an array.
A linked list contains elements which may not be stored continuously in memory, and therefore a random element must be accessed by following pointers, using an iterator.
(You probably already knew this.)
Advantage of Priority Que
Recently I discovered the 'priority queue', which works kind of like a stack, but elements are push()-ed into the container, and this function places them in an order according to comparisons made with the operator<, I believe.
This suits me perfectly, as I am testing for events and placing them in the queue according to the time remaining until they occur. The queue automatically sorts them for me as I push() and pop() elements. (Popping does not affect the order.) I can write an operator< so this isn't a problem.
Issues I have not been able to resolve
There are three things which I need to be able to do with this event queue:
1:) Search the event queue for an item. I assume this can be done with an algorithm in the standard library? For example, 'find'? If not I can implement one myself, provided I can do point 2. (See below)
2:) Iterate over the queue. I think the default underlying container is std::vector... Is there a way to access random elements within the underlying vector? What if I choose to use std::deque instead? I need to do this to modify certain event parameters. (Events are placed in the queue.) As an example, I may need to process an event and then subtract a constant amount of time from each remaining event's 'time_to_event' parameter. I suspect this cannot be done due to this question.
3:) Remove an element from the queue. Sometimes when processing events, other events become invalidated, and therefore need to be removed.
Can points 1-3 be done? All the information I have on std::priority_queue has come from cplusplus.com, and so my default answer would be "no", there is no way to do any of these things. If I can't do all three things, then I guess I will have to write my own Priority Queue wrapper. (Oh boring)
No, you can't iterate over items in an std::priority_queue. All it supports is inserting items, and removing the highest priority item.
When you want more flexibility, you probably want to use std::make_heap to build the heap structure into your container, std::push_heap to add an item, and std::pop_heap to remove an item.
Since these are algorithms you apply to a container, you can still use the container's iterators as you see fit. Depending on how you modify the data in the heap, you may need to re-build the heap afterwards -- if you modify it in a way that the heap property no longer applies. You can test that with std::is_heap if you have any question.
Aside: many of us find http://www.cppreference.com more useful and accurate than the site you've linked.
Take a look at Boost.Heap. It looks like it addresses at least two of your issues (iteration and mutability).
I have a priority_queue, and I want to modify some of it's contents (the priority value), will the queue be resorted then?
It depends if it resorts on push/pop (more probable, becouse you just need to "insert", not resort whole), or when accessing top or pop.
I really want to change some elements in the queue. Something like that:
priority_queue<int> q;
int a=2,b=3,c=5;
int *ca=&a, *cb=&b, cc=&c;
q.push(a);
q.push(b);
q.push(c); //q is now {2,3,5}
*ca=4;
//what happens to q?
// 1) {3,4,5}
// 2) {4,2,5}
// 3) crash
priority_queue copies the values you push into it. Your assignment at the end there will have zero effect on the order of the priority queue, nor the values stored inside of it.
Unfortunately, the std::priority_queue class doesn't support the increase/decrease_key operations that you're looking for. Of course it's possible to find the element within the heap you want to update, and then call make_heap to restore the binary heap invariants, but this can't be done as efficiently as it should be with the std:: container/algorithms. Scanning the heap to find the item is O(N) and then make_heap is O(N) on top of that - it should be possible to do increase/decrease_key in O(log(N)) for binary heaps that properly support updates.
Boost provides a set of priority queue implementations, which are potentially more efficient than the std::priority_queue (pairing heaps, Fibonacci heaps, etc) and also offer mutability, so you can efficiently perform dynamic updates. So all round, using the boost containers is potentially a much better option.
Okay, after searching a bit I found out how to "resort" queue, so after each priority value change you need to call:
std::make_heap(const_cast<Type**>(&queue.top()),
const_cast<Type**>(&queue.top()) + queue.size(),
ComparerClass());
And queue must be then
std::priority_queue<Type*,vector<Type*>,ComparerClass> queue;
Hope this helps.
I stumbled on this issue while considering the use of priority queues for an A* algorithm.
Basically, C++ priority queues are a very limited toy.
Dynamically changing the priority of a queued element requires to perform a complete reconstruction of the underlying heap manually, which is not even guaranteed to work on a given STL implementation and is grossly inefficient.
Besides, reconstructing the heap requires butt-ugly code, which would have to be hidden in yet another obfuscated class/template.
As for so many other things in C++, you'll have to reinvent the wheel, or find whatever fashionable library that reinvented it for you.
I'm trying to design a task scheduler to a game engine. A task could be an animation, a trigger controller, etc.
My problem is what container to choose. The idea is: when you insert a new task, the container must reorder and put the task in the proper place. Once executed, task could change and be scheduled again or deleted. This is mainly push and pop.
But, if possible, it would be nice if I could have random access to an element, but not vital. No matter if the container supports one or more elements with the same key.
I think that priority queue fits my needs but I saw that is based on vector implementation, and I think that this container must be somehow optimized to push and pop.
Opinions?
(source: adrinael.net)
(original source: Liam Devine)
A priority queue seems to be the best option for you.
As you can see, the pop functions has a constant complexity and the push function is logarithmic in time.
std::vector is pretty good for this task, especially if the "steady-state" size of the container remains reasonably constant (you have a number of tasks on the queue doesn't differ widely).
If you need an updatable queue (and std::priority_queue is not), I would suggest you use the d_ary_heap_indirect (which can be found in the Boost.Graph "detail" folder). This is a priority queue used a lot for Dijkstra and A* algorithms that require an updatable priority queue. Random-access is necessary, anyways. Also, using an indirect makes the insertion and deletion from the queue quite efficient. Finally, you can choose your container (as a template argument), but it has to be random-access (so, you can try either vector or deque). Pop is constant-time, push and/or update is log-time, and the proper choice of container will make the container insertion constant-amortized (and the d_ary_heap_indirect amortizes a second time as well, so I wouldn't worry about that).
The vector is optimized for push and pop at one end. :-)
To prioritize you will have to sort the tasks. A vector isn't that bad, if the number of objects is reasonably small, even if it means copying objects during the sort.
Other containers, like linked lists, instead suffer from the need to allocate a new node for each object.
You can specify the container type you want with std::priority_queue.
However: you're storing pointers (I presume, since it sounds like what you're is
polymorphic and has identity), so copying is cheap. You're managing it
as a heap (that's what std::priority_queue does), so insertions are done
using push_back and a number of swaps (lg(n) max). I can't see any
reason to even consider another structure than std::vector.
std::priority_queue does hide all of the direct access operators (e.g.
operator[]). It does this because if you modify an entry, you're
likely to invalidate the heap (which is a class invariant of the class).
If you do want to provide direct read access, however, the underlying
container is only protected, not private, so you can derive from it
and add the operators you want. I'd very much limit it to const
operators, however.
Depends on how often you're going to be adding tasks and pulling tasks off (and presumably executing them) and how many there are.
If you're going to have tons of little tasks, then prefer priority queue because the cost of node allocation will probably not hurt you as much as the asymptotic growth of n log n for the sort.
If you're going to have a small number of tasks that constantly keep changing priority, then sorting a vector might be reasonable, but you want to use an sorting algorithm that works well when the list is almost sorted.
Scheduling is an art though and you're going to have to profile it once you build it. There's probably too little information at this point so say. I'd lean towards a priority queue, but keep other options in mind if performance isn't adequate.