I'm trying to design a task scheduler for a game engine. A task could be an animation, a trigger controller, etc.
My problem is which container to choose. The idea is: when you insert a new task, the container must reorder itself and put the task in the proper place. Once executed, a task could change and be scheduled again, or be deleted. This is mainly push and pop.
If possible, it would be nice to have random access to an element, but that is not vital. It doesn't matter whether the container supports one or several elements with the same key.
I think a priority queue fits my needs, but I saw that it is based on a vector, and I think the container should be optimized for push and pop.
Opinions?
A priority queue seems to be the best option for you.
Reading the top element with top() has constant complexity, while the push and pop functions are logarithmic in time.
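A minimal sketch of a scheduler built on std::priority_queue, in case it helps to see the shape of it. The names Task, due, run, LaterFirst and Scheduler are made up for illustration, and ordering tasks by a due time is an assumption about what "priority" means in the question.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical task record: the field names are illustrative only.
struct Task {
    double due;                  // time at which the task should run
    std::function<void()> run;   // work to perform
};

// Comparator that puts the task with the smallest due time on top (min-heap).
struct LaterFirst {
    bool operator()(const Task& a, const Task& b) const { return a.due > b.due; }
};

class Scheduler {
public:
    void schedule(double due, std::function<void()> run) {
        queue_.push(Task{due, std::move(run)});   // O(log n)
    }

    // Runs every task whose due time is <= now and returns how many ran.
    int runDue(double now) {
        int executed = 0;
        while (!queue_.empty() && queue_.top().due <= now) {
            Task task = queue_.top();   // top() is O(1)
            queue_.pop();               // O(log n)
            task.run();                 // a task may call schedule() again here
            ++executed;
        }
        return executed;
    }

    std::size_t pending() const { return queue_.size(); }

private:
    std::priority_queue<Task, std::vector<Task>, LaterFirst> queue_;
};
```

Because each task is popped before it runs, a task can safely reschedule itself from inside run.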
std::vector is pretty good for this task, especially if the "steady-state" size of the container remains reasonably constant (that is, the number of tasks on the queue doesn't vary widely).
If you need an updatable queue (and std::priority_queue is not one), I would suggest you use the d_ary_heap_indirect (which can be found in the Boost.Graph "detail" folder). This is a priority queue used a lot for Dijkstra and A* algorithms, which require an updatable priority queue. Random access is necessary anyway. Also, the indirection makes insertion into and deletion from the queue quite efficient. Finally, you can choose your container (as a template argument), but it has to be random-access (so you can try either vector or deque). Reading the top is constant-time, pop, push and update are log-time, and the proper choice of container will make growing the container amortized constant-time (and the d_ary_heap_indirect amortizes a second time as well, so I wouldn't worry about that).
The vector is optimized for push and pop at one end. :-)
To prioritize you will have to sort the tasks. A vector isn't that bad, if the number of objects is reasonably small, even if it means copying objects during the sort.
Other containers, like linked lists, instead suffer from the need to allocate a new node for each object.
You can specify the container type you want with std::priority_queue.
However: you're storing pointers (I presume, since it sounds like what you're storing is
polymorphic and has identity), so copying is cheap. You're managing it
as a heap (that's what std::priority_queue does), so insertions are done
using push_back and a number of swaps (lg(n) max). I can't see any
reason to even consider another structure than std::vector.
std::priority_queue does hide all of the direct access operators (e.g.
operator[]). It does this because if you modify an entry, you're
likely to invalidate the heap (which is a class invariant of the class).
If you do want to provide direct read access, however, the underlying
container is only protected, not private, so you can derive from it
and add the operators you want. I'd very much limit it to const
operators, however.
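A rough sketch of that derivation, limited to const access. InspectableQueue is a made-up name, and public inheritance is used here only to keep the example short; elements are exposed in heap order, not sorted order.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical wrapper that exposes read-only access to the protected
// underlying container `c` of std::priority_queue.
template <typename T, typename Compare = std::less<T>>
class InspectableQueue
    : public std::priority_queue<T, std::vector<T>, Compare> {
public:
    typedef typename std::vector<T>::const_iterator const_iterator;

    // Const-only access, so callers cannot break the heap invariant.
    const T& operator[](std::size_t i) const { return this->c[i]; }
    const_iterator begin() const { return this->c.begin(); }
    const_iterator end() const { return this->c.end(); }
};
```

Index 0 is always the same element as top(); the order of the remaining elements is an implementation detail of the heap.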
Depends on how often you're going to be adding tasks and pulling tasks off (and presumably executing them) and how many there are.
If you're going to have tons of little tasks, then prefer priority queue because the cost of node allocation will probably not hurt you as much as the asymptotic growth of n log n for the sort.
If you're going to have a small number of tasks that constantly keep changing priority, then sorting a vector might be reasonable, but you want to use a sorting algorithm that works well when the list is almost sorted.
Scheduling is an art though and you're going to have to profile it once you build it. There's probably too little information at this point to say. I'd lean towards a priority queue, but keep other options in mind if performance isn't adequate.
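For the "almost sorted" case mentioned above, insertion sort is one commonly suggested choice, since its cost grows with the number of out-of-order pairs rather than always being n log n. This is only a sketch with an illustrative function name, and it should be profiled against std::sort before being trusted.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts ascending. Runs in roughly O(n + d), where d is the number of
// inversions, so it is cheap when only a few elements are out of place.
template <typename T>
void insertionSort(std::vector<T>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        T value = std::move(v[i]);
        std::size_t j = i;
        // Shift larger elements one slot to the right.
        while (j > 0 && value < v[j - 1]) {
            v[j] = std::move(v[j - 1]);
            --j;
        }
        v[j] = std::move(value);
    }
}
```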
I'm new to STL containers (and C++ in general) so thought I would reach out to the community for help. I basically want to have a priority_queue that supports constant iteration. Now, it seems that std::priority_queue doesn't support iteration, so I'm going to have to use something else, but I'm not sure exactly what.
Requirements:
Maintains order on insertion (like a priority queue)
Pop from top of list
Get const access to each element of the list (don't care about the order in the queue for this stage)
One option would be to keep a priority_queue and separately have an unordered_set of references, but I'd rather not have two containers floating around. I could also use a deque and search through for the right insertion position, but I'd rather have the container manage the sorting for me if possible (and constant-time insertion would be nicer than linear-time). Any suggestions?
There are two options that come to mind:
1) Implement your own iterable priority queue, using std::vector and the heap algorithms from <algorithm> (std::make_heap, std::push_heap, std::pop_heap).
2) Derive (privately) from priority_queue. This gives you access to the underlying container via the protected data member c. You can then expose iteration, random access, and other methods of interest in your public interface.
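Option 1 could look roughly like the sketch below. IterableHeap is a hypothetical name; the class keeps a std::vector in heap order using the standard heap algorithms and hands out const iterators only, so iteration visits every element but not in priority order.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

template <typename T, typename Compare = std::less<T>>
class IterableHeap {
public:
    typedef typename std::vector<T>::const_iterator const_iterator;

    void push(T value) {
        items_.push_back(std::move(value));
        std::push_heap(items_.begin(), items_.end(), cmp_);  // sift the new item up
    }

    const T& top() const { return items_.front(); }

    void pop() {
        std::pop_heap(items_.begin(), items_.end(), cmp_);   // moves the top to the back
        items_.pop_back();
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

    // Const iteration over all elements, in heap order.
    const_iterator begin() const { return items_.begin(); }
    const_iterator end() const { return items_.end(); }

private:
    std::vector<T> items_;
    Compare cmp_;
};
```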
Using a std::vector might be enough, as others have already pointed out, but if you want a ready-made implementation, consider Boost.Heap (a library with several priority queue containers): http://www.boost.org/doc/libs/1_53_0/doc/html/heap.html
Boost is a collection of libraries that basically complete the standard library (which is not really big). A lot of C++ developers have boost ready on their dev computer to use it when needed. Just be careful in your choices of libraries.
You can use (ordered) set as a queue. set.begin() will be your top element, and you can pop it via erase(set.begin()).
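A small sketch of that idea. It uses std::multiset rather than std::set, which is an assumption on my part so that equal priorities are not silently dropped; popFront is an illustrative helper name.

```cpp
#include <set>

// Removes and returns the smallest element.
// The caller must make sure the container is not empty.
template <typename T>
T popFront(std::multiset<T>& s) {
    T value = *s.begin();
    s.erase(s.begin());   // erases this one element only, not every equal key
    return value;
}
```

Iterating from begin() to end() visits the elements in sorted order, which covers the const-access requirement as well.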
Have you looked at the heap algorithms (std::make_heap)? A heap doesn't keep the queue fully ordered internally, but it gives you the "pop from top of list" priority behaviour you need.
I have a priority_queue, and I want to modify some of its contents (the priority value). Will the queue be re-sorted then?
It depends on whether it re-sorts on push/pop (more probable, because you just need to "insert", not re-sort the whole thing), or when accessing top or pop.
I really want to change some elements in the queue. Something like that:
priority_queue<int> q;
int a=2,b=3,c=5;
int *ca=&a, *cb=&b, *cc=&c;
q.push(a);
q.push(b);
q.push(c); //q is now {2,3,5}
*ca=4;
//what happens to q?
// 1) {3,4,5}
// 2) {4,2,5}
// 3) crash
priority_queue copies the values you push into it. Your assignment at the end there will have zero effect on the order of the priority queue, nor the values stored inside of it.
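A quick way to convince yourself of this is the sketch below; the function name is made up for the example.

```cpp
#include <queue>
#include <vector>

// Pushes copies of three ints, changes one of the originals afterwards,
// and returns the queue contents in the order they are popped.
std::vector<int> drainAfterModifyingOriginal() {
    std::priority_queue<int> q;
    int a = 2, b = 3, c = 5;
    q.push(a);
    q.push(b);
    q.push(c);

    a = 4;   // changes only the local variable, not the copy stored in q

    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.top());   // default priority_queue pops largest first
        q.pop();
    }
    return out;   // 5, 3, 2
}
```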
Unfortunately, the std::priority_queue class doesn't support the increase/decrease_key operations that you're looking for. Of course it's possible to find the element within the heap you want to update, and then call make_heap to restore the binary heap invariants, but this can't be done as efficiently as it should be with the std:: container/algorithms. Scanning the heap to find the item is O(N) and then make_heap is O(N) on top of that - it should be possible to do increase/decrease_key in O(log(N)) for binary heaps that properly support updates.
Boost provides a set of priority queue implementations, which are potentially more efficient than the std::priority_queue (pairing heaps, Fibonacci heaps, etc) and also offer mutability, so you can efficiently perform dynamic updates. So all round, using the boost containers is potentially a much better option.
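For reference, the find-then-make_heap approach described above can be sketched on a plain std::vector that you manage as a heap yourself. Entry, ByPriority and updatePriority are hypothetical names, and this is the O(N) workaround, not the efficient update that the Boost heaps offer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical heap entry with an id used to look it up.
struct Entry {
    int id;
    int priority;
};

struct ByPriority {
    bool operator()(const Entry& a, const Entry& b) const {
        return a.priority < b.priority;   // max-heap on priority
    }
};

// Changes the priority of the entry with the given id and restores the
// heap invariant. Returns false if no such entry exists.
bool updatePriority(std::vector<Entry>& heap, int id, int newPriority) {
    for (std::size_t i = 0; i < heap.size(); ++i) {   // O(N) scan
        if (heap[i].id == id) {
            heap[i].priority = newPriority;
            std::make_heap(heap.begin(), heap.end(), ByPriority());  // O(N) rebuild
            return true;
        }
    }
    return false;
}
```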
Okay, after searching a bit I found out how to "resort" queue, so after each priority value change you need to call:
std::make_heap(const_cast<Type**>(&queue.top()),
const_cast<Type**>(&queue.top()) + queue.size(),
ComparerClass());
And queue must be then
std::priority_queue<Type*,vector<Type*>,ComparerClass> queue;
Hope this helps.
I stumbled on this issue while considering the use of priority queues for an A* algorithm.
Basically, C++ priority queues are a very limited toy.
Dynamically changing the priority of a queued element requires manually performing a complete reconstruction of the underlying heap, which is not even guaranteed to work on a given STL implementation and is grossly inefficient.
Besides, reconstructing the heap requires butt-ugly code, which would have to be hidden in yet another obfuscated class/template.
As for so many other things in C++, you'll have to reinvent the wheel, or find whatever fashionable library that reinvented it for you.
I am doing a game where I create objects and kill them frequently. I must be able to loop the list of objects linearly, in a way that the next object is always newer than previous, so the rendering of the objects will be correct (they will overlap). I also need to be able to store the pointers of each object into a quadtree, to quickly find nearby objects.
My first thought was to use std::list, but I have never done anything like this before, so I am looking for experts' thoughts about this.
What container should I use?
Edit: I am not just deleting from the front: the objects can be killed at any order, but they are always added in the end of the list, so last item is newest.
std::vector is the recommended container to start with when you're not sure what you're doing. Only when you know that's not going to work for you should you choose something else.
That said, if you're regularly adding to the back of the container and deleting from the front, you probably want std::deque. [Edit] But it appears that's not what you're doing.
I'm thinking you might want two containers, one to maintain the insert order and one for your quadtree. There are lots of quadtree implementations on the Internet, so I'll focus on the other one. Using std::list as you suggest will make the delete operation itself faster than vector or deque. It also has the advantage of letting you store iterators, because list won't invalidate the other iterators when an element is removed. Your objects in the quadtree could maintain an iterator into the insert order list. When you remove an element from the quadtree, you can remove it from the list too in O(1) time.
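The insert-order half of that idea might look like the sketch below. GameObject, ObjectStore and Handle are illustrative names; in a real engine the handle would be stored alongside the object in the quadtree.

```cpp
#include <list>

// Hypothetical game object; only an id is needed for the example.
struct GameObject {
    int id;
};

class ObjectStore {
public:
    // A list iterator stays valid until its own element is erased.
    typedef std::list<GameObject>::iterator Handle;

    // Appends a new object (newest is always last) and returns its handle.
    Handle add(int id) {
        GameObject obj = {id};
        objects_.push_back(obj);
        Handle h = objects_.end();
        return --h;
    }

    // O(1) removal from anywhere in the list; other handles are unaffected.
    void remove(Handle h) { objects_.erase(h); }

    // Oldest-to-newest view for rendering.
    const std::list<GameObject>& objects() const { return objects_; }

private:
    std::list<GameObject> objects_;
};
```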
As always, the decision about which container to use is all about tradeoffs. A list comes with increased memory footprint over vector and the loss of contiguous memory layout. You might be surprised how much cache locality matters when your data set is large. The only way to know for sure is to try various containers and see which one runs the best for your application.
I think boost::stable_vector fits your needs for deletion/iteration.
So, you want to be able to iterate through your container in the order in which the items have been added, but you want to be able to remove items from any point in the container. A simple queue obviously isn't going to hack it.
Happily, there are 4 containers that will do this job easily enough: std::vector, std::list, std::deque and std::set. If you use standard container idioms (e.g. begin, end, erase, insert, and to a lesser extent push_back, pop_back, front, back) you can use whichever container you feel like. With those 8 operations, you can switch between std::vector, std::list and std::deque, and with just the first 4 you can use std::set, too. Write your code, and then you can easily chop and change between the different container types and do a little profiling to compare performance, memory overheads or whatever.
Intuitively, std::list is probably a good bet, and perhaps std::set would work too. But rather than making assumptions, just use the general tools the template library gives you, and profile and optimise things later when you have some meaningful performance data to work with.
I'm using queues and priority queues, through which I plan on pumping a lot of data quite quickly.
Therefore, I want my q's and pq's to be responsive to additions and subtractions.
What are the relative merits of using a vector, list, or deque as the underlying container?
Update:
At the time of writing, both Mike Seymour and Steve Townsend's answers below are worth reading. Thanks both!
The only way to be sure how the choice affects performance is to measure it, in a situation similar to your expected use cases. That said, here are some observations:
For std::queue:
std::deque is usually the best choice; it supports all the necessary operations in constant time, and allocates memory in chunks as it grows.
std::list also supports the necessary operations, but may be slower due to more memory allocations; in special circumstances, you might be able to get good results by allocating from a dedicated object pool, but that's not entirely straightforward.
std::vector can't be used as it doesn't have a pop_front() operation; such an operation would be slow, as it would have to move all the remaining elements.
A potentially faster, but less flexible, alternative is to implement a circular buffer over a fixed-size array (e.g. std::array, or a std::vector that you don't resize). You'll need to deal with the case of it filling up, either by reporting an error, or allocating a larger buffer and copying all the data.
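A bare-bones sketch of such a circular buffer over std::array. RingQueue is a made-up name, and this version deals with a full buffer by reporting failure rather than growing.

```cpp
#include <array>
#include <cstddef>

// Fixed-capacity FIFO queue stored in a circular buffer.
template <typename T, std::size_t N>
class RingQueue {
public:
    RingQueue() : head_(0), size_(0) {}

    // Returns false (and stores nothing) when the buffer is full.
    bool push(const T& value) {
        if (size_ == N) return false;
        data_[(head_ + size_) % N] = value;   // write at the logical end
        ++size_;
        return true;
    }

    // Copies the oldest element into `out`; returns false when empty.
    bool pop(T& out) {
        if (size_ == 0) return false;
        out = data_[head_];
        head_ = (head_ + 1) % N;              // advance, wrapping around
        --size_;
        return true;
    }

    std::size_t size() const { return size_; }

private:
    std::array<T, N> data_;
    std::size_t head_;   // index of the oldest element
    std::size_t size_;   // number of stored elements
};
```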
For std::priority_queue:
std::vector is usually the best choice; it grows exponentially (reducing the number of memory allocations), and is a simple data structure that's very fast to access - an iterator can be implemented simply as a wrapper around a pointer.
std::deque may be slower as it typically grows linearly (requiring more memory allocation), and access is more complicated than with a vector.
std::list can't be used as it doesn't provide random access.
To summarise - the defaults are usually the best choice, but if speed really is important, then measure the alternatives.
I would use std::queue for your basic queue which is (by default at least) a wrapper on deque. Do something more special-purpose if that does not work for you.
std::priority_queue also exists (over vector by default) but the added semantics make it more likely that you could have to roll your own here, depending on perf observed for your particular access pattern.
vector has storage characteristics which make it very ill-suited to removal from front of the dataset. A lot of shuffling down to be done every time you pop_front. For a simple queue, avoid this.
list is likely to be too expensive for any high-hit queue, because by contract it has to offer functionality you don't need. It could be a candidate for use as a priority queue but my instinct is always to trust the STL.
vector would implement a stack as your fast insertion is at the end and fast removal is also at the end. If you want a FIFO queue, vector would be the wrong implementation to use.
deque or list both provide constant time insertion at either end. list is good for LRU caches where you want to move elements out of the middle fast and where you want your iterators to remain valid no matter how much you move them about. deque is generally used when insertions and deletions are at the end.
The main thing I need to ask about your collections is whether they are accessed by multiple threads. I sort of assume they are, in which case one of your primary aims is to reduce locking. This is best done if you at least have multi_push and multi_get features, so that you can put on or take off more than one element at a time without locking for each one.
There are also lock-free containers or semi-lock-free containers.
You will probably find that your locking strategy is more important than any performance in the collection itself as long as your operations are all constant-time.
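One possible reading of the multi_push / multi_get idea is to take the lock once per batch instead of once per element. The sketch below assumes that reading; BatchQueue, pushMany and popMany are hypothetical names, and it is a plain mutex-based queue, not a lock-free one.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

template <typename T>
class BatchQueue {
public:
    // Appends all items under a single lock acquisition.
    void pushMany(const std::vector<T>& items) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.insert(queue_.end(), items.begin(), items.end());
    }

    // Removes and returns up to maxCount items under a single lock acquisition.
    std::vector<T> popMany(std::size_t maxCount) {
        typedef typename std::deque<T>::difference_type Diff;
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t n = maxCount < queue_.size() ? maxCount : queue_.size();
        std::vector<T> out(queue_.begin(), queue_.begin() + static_cast<Diff>(n));
        queue_.erase(queue_.begin(), queue_.begin() + static_cast<Diff>(n));
        return out;
    }

private:
    std::mutex mutex_;
    std::deque<T> queue_;
};
```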
I am trying to implement an LRU cache in C++. I would like to know the best design for implementing one. I know an LRU cache should provide find(), adding an element, and removing an element, where the removal should evict the least recently used element. What are the best ADTs to implement this?
For example: if I use a map with the element as value and a time counter as key, I can search in O(logn) time, inserting is O(n), and deleting is O(logn).
One major issue with LRU caches is that there are few "const" operations; most will change the underlying representation (if only because they bump the element accessed).
This is of course very inconvenient, because it means it's not a traditional STL container, and therefore any idea of exhibiting iterators is quite complicated: when the iterator is dereferenced this is an access, which should modify the list we are iterating on... oh my.
And there are the performance considerations, both in terms of speed and memory consumption.
It is unfortunate, but you'll need some way to organize your data in a queue (LRU) (with the possibility to remove elements from the middle) and this means your elements will have to be independent from one another. A std::list fits, of course, but it's more than you need. A singly-linked list is sufficient here, since you don't need to iterate the list backward (you just want a queue, after all).
However one major drawback of those is their poor locality of reference, if you need more speed you'll need to provide your own custom (pool ?) allocator for the nodes, so that they are kept as close together as possible. This will also alleviate heap fragmentation somewhat.
Next, you obviously need an index structure (for the cache bit). The most natural is to turn toward a hash map. std::tr1::unordered_map, std::unordered_map or boost::unordered_map are normally good quality implementations, and at least one should be available to you. They also allocate extra nodes for hash collision handling, so you might prefer other kinds of hash maps; check out Wikipedia's article on the subject and read about the characteristics of the various implementation techniques.
Continuing, there is the (obvious) threading support. If you don't need thread support, then it's fine, if you do however, it's a bit more complicated:
As I said, there are few const operations on such a structure, thus you don't really need to differentiate Read/Write accesses.
Internal locking is fine, but you might find that it doesn't play nicely with your uses. The issue with internal locking is that it doesn't support the concept of a "transaction", since it relinquishes the lock between each call. If this is your case, turn your object into a mutex and provide a std::unique_ptr<Lock> lock() method (in debug, you can assert that the lock is taken at the entry point of each method).
There is (in locking strategies) the issue of reentrancy, i.e. the ability to "relock" the mutex from within the same thread; check Boost.Thread for more information about the various locks and mutexes available.
Finally, there is the issue of error reporting. Since it is expected that a cache may not be able to retrieve the data you put in, I would consider using an exception "poor taste". Consider either pointers (Value*) or Boost.Optional (boost::optional<Value&>). I would prefer Boost.Optional because its semantics are clear.
The best way to implement an LRU cache is to combine a std::list with a stdext::hash_map (or a std::map if you want to use only std).
Store the data in the list so that the least recently used item is at the end, and use the map to point to the list items.
For "get", use the map to find the list node, retrieve the data, move that node to the front (since it has just been used), and update the map.
For "insert", when the cache is full, remove the last element from the list, then add the new data to the front and update the map.
This is the fastest you can get. If you are using a hash_map you should have almost all the operations done in O(1). If using std::map it should take O(logn) in all cases.
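The list-plus-map scheme above can be sketched as follows. This version swaps in std::unordered_map for stdext::hash_map, which is my substitution, and LruCache, get and put are illustrative names. A miss is reported with a null pointer rather than an exception.

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

template <typename Key, typename Value>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns a pointer to the cached value, or nullptr on a miss.
    // A hit moves the entry to the front (most recently used).
    Value* get(const Key& key) {
        typename Index::iterator it = index_.find(key);
        if (it == index_.end()) return nullptr;
        items_.splice(items_.begin(), items_, it->second);
        return &it->second->second;
    }

    // Inserts or overwrites; evicts the least recently used entry when full.
    void put(const Key& key, const Value& value) {
        typename Index::iterator it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = value;
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (items_.size() == capacity_) {
            index_.erase(items_.back().first);   // drop the index entry first
            items_.pop_back();                   // then the list node itself
        }
        items_.push_front(std::make_pair(key, value));
        index_[key] = items_.begin();
    }

    std::size_t size() const { return items_.size(); }

private:
    typedef std::list<std::pair<Key, Value> > Items;
    typedef std::unordered_map<Key, typename Items::iterator> Index;

    std::size_t capacity_;
    Items items_;    // front = most recently used, back = least recently used
    Index index_;    // key -> position in items_
};
```

Since splice relinks the node without invalidating iterators, the map entry does not actually need updating when an element is moved to the front; only insertion and eviction touch the map.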
A very good implementation is available here
This article describes a couple of C++ LRU cache implementations (one using STL, one using boost::bimap).
When you say priority, I think "heap" which naturally leads to increase-key and delete-min.
I would not make the cache visible to the outside world at all if I could avoid it. I'd just have a collection (of whatever) and handle the caching invisibly, adding and removing items as needed, but the external interface would be exactly that of the underlying collection.
As far as the implementation goes, a heap is probably the most obvious. It has complexities roughly similar to a map, but instead of building a tree from linked nodes, it arranges items in an array and the "links" are implicit based on array indices. This increases the storage density of your cache and improves locality in the "real" (physical) processor cache.
I suggest a heap and maybe a Fibonacci Heap
I'd go with a normal heap in C++.
With std::make_heap (guaranteed by the standard to be O(n)), std::pop_heap, and std::push_heap in <algorithm>, implementing it would be a piece of cake. You only have to worry about increase-key.