Quickest Queue Container (C++)

I'm using queues and priority queues, through which I plan on pumping a lot of data quite quickly.
Therefore, I want my queues and priority queues to be responsive to additions and removals.
What are the relative merits of using a vector, list, or deque as the underlying container?
Update:
At the time of writing, both Mike Seymour and Steve Townsend's answers below are worth reading. Thanks both!

The only way to be sure how the choice affects performance is to measure it, in a situation similar to your expected use cases. That said, here are some observations:
For std::queue:
std::deque is usually the best choice; it supports all the necessary operations in constant time, and allocates memory in chunks as it grows.
std::list also supports the necessary operations, but may be slower due to more memory allocations; in special circumstances, you might be able to get good results by allocating from a dedicated object pool, but that's not entirely straightforward.
std::vector can't be used as it doesn't have a pop_front() operation; such an operation would be slow, as it would have to move all the remaining elements.
A potentially faster, but less flexible, alternative is to implement a circular buffer over a fixed-size array (e.g. std::array, or a std::vector that you don't resize). You'll need to deal with the case of it filling up, either by reporting an error, or allocating a larger buffer and copying all the data.
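For the circular-buffer option just mentioned, here is a minimal sketch (the class name, the interface, and the throw-on-overflow policy are all assumptions made for illustration; a real implementation would choose its own overflow strategy):

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Minimal fixed-capacity FIFO over std::array (illustrative sketch only).
template <typename T, std::size_t Capacity>
class RingQueue {
public:
    void push(const T& value) {
        if (size_ == Capacity)
            throw std::runtime_error("RingQueue full");  // or grow and copy instead
        buf_[(head_ + size_) % Capacity] = value;
        ++size_;
    }
    void pop() {
        if (size_ == 0) return;
        head_ = (head_ + 1) % Capacity;
        --size_;
    }
    T&          front()       { return buf_[head_]; }
    const T&    front() const { return buf_[head_]; }
    bool        empty() const { return size_ == 0; }
    std::size_t size()  const { return size_; }

private:
    std::array<T, Capacity> buf_{};
    std::size_t head_ = 0;
    std::size_t size_ = 0;
};
```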
For std::priority_queue:
std::vector is usually the best choice; it grows exponentially (reducing the number of memory allocations), and is a simple data structure that's very fast to access - an iterator can be implemented simply as a wrapper around a pointer.
std::deque may be slower as it typically grows linearly (requiring more memory allocation), and access is more complicated than with a vector.
std::list can't be used as it doesn't provide random access.
To summarise - the defaults are usually the best choice, but if speed really is important, then measure the alternatives.
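For reference, both adapters let you choose the underlying container as a template argument, so comparing the defaults against the alternatives is a one-line change (the int element type here is just a placeholder):

```cpp
#include <deque>
#include <functional>
#include <list>
#include <queue>
#include <vector>

std::queue<int> q1;                          // std::deque<int> underneath (the default)
std::queue<int, std::list<int>> q2;          // same interface, std::list underneath

std::priority_queue<int> pq1;                // std::vector<int>, largest element on top (the default)
std::priority_queue<int, std::deque<int>> pq2;
std::priority_queue<int, std::vector<int>,
                    std::greater<int>> pq3;  // smallest element on top
```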

I would use std::queue for your basic queue which is (by default at least) a wrapper on deque. Do something more special-purpose if that does not work for you.
std::priority_queue also exists (over vector by default) but the added semantics make it more likely that you could have to roll your own here, depending on perf observed for your particular access pattern.
vector has storage characteristics which make it very ill-suited to removal from the front of the dataset: there's a lot of shuffling down to be done every time you pop_front. For a simple queue, avoid this.
list is likely to be too expensive for any heavily used queue, because by contract it has to offer functionality you don't need. It could be a candidate for use as a priority queue, but my instinct is always to trust the STL.

vector would implement a stack as your fast insertion is at the end and fast removal is also at the end. If you want a FIFO queue, vector would be the wrong implementation to use.
deque or list both provide constant time insertion at either end. list is good for LRU caches where you want to move elements out of the middle fast and where you want your iterators to remain valid no matter how much you move them about. deque is generally used when insertions and deletions are at the end.
The main thing I need to ask about your collection is whether they are accessed by multiple threads. I sort-of assume they are, in which case one of your primary aims is to reduce locking. This is best done if you at least have a multi_push and multi_get feature so that you can put more than one element on at a time without any locking.
There are also lock-free containers or semi-lock-free containers.
You will probably find that your locking strategy is more important than any performance in the collection itself as long as your operations are all constant-time.
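As a hedged illustration of the multi_push / multi_get idea above (all the names here are hypothetical; a production version would likely add a condition variable so consumers can block instead of polling):

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical sketch: one lock acquisition covers a whole batch of elements.
template <typename T>
class BatchQueue {
public:
    void multi_push(std::vector<T> items) {
        std::lock_guard<std::mutex> lock(m_);
        for (auto& item : items)
            q_.push_back(std::move(item));
    }
    // Drains up to max_items elements under a single lock and returns them.
    std::vector<T> multi_get(std::size_t max_items) {
        std::lock_guard<std::mutex> lock(m_);
        std::vector<T> out;
        while (!q_.empty() && out.size() < max_items) {
            out.push_back(std::move(q_.front()));
            q_.pop_front();
        }
        return out;
    }

private:
    std::mutex    m_;
    std::deque<T> q_;
};
```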

Related

Fast data structure that supports finding the minimum element and accessing, inserting, removing and updating data at any index

I'm looking for ideas to implement a templatized sequence container data structure which can beat the performance of std::vector in as many features as possible and potentially perform much faster. It should support the following:
Finding the minimum element (and returning its index)
Insertion at any index
Removal at any index
Accessing and updating any element by index (via operator[])
What would be some good ways to implement such a structure in C++?
You can generally be pretty sure that the STL implementations of all containers tend to be very good at the range of tasks they were designed for. That is to say, you're unlikely to be able to build a container that is as robust as std::vector and quicker for all applications. However, generally speaking, it is almost always possible to beat a generic tool when optimizing for a specific application.
First, let's think about what a vector actually is. You can think of it as a pointer to a C-style array, except that its elements are stored on the heap. Unlike a C array, it also provides a bunch of methods that make it a little bit more convenient to manipulate. But like a C array, all of its data is stored contiguously in memory, so lookups are extremely cheap, but changing its size may require the entire array to be moved elsewhere in memory to make room for the new elements.
Here are some ideas for how you could do each of the things you're asking for better than a vanilla std::vector:
Finding the minimum element: Search is typically O(N) for many containers, and certainly for a vector (because you need to iterate through all the elements to find the lowest). You can make it O(1), or very close to free, by keeping track of the smallest element at all times and only updating it when the container is changed (see the sketch after this list).
Insertion at any index: If your elements are small and there are not many, I wouldn't bother tinkering here; just do what the vector does and keep elements contiguously next to each other to keep lookups quick. If you have large elements, store pointers to the elements instead of the elements themselves (Boost's stable vector will do this for you). Keep in mind that this makes lookup more expensive, because you now need to dereference the pointer, so whether you want to do this will depend on your application. If you know the number of elements you are going to insert, std::vector provides the reserve method, which preallocates some memory for you, but what it doesn't do is allow you to decide how the size of the allocated memory grows. So if your application warrants lots of push_back operations without enough information to intelligently call reserve, you might be able to beat the standard std::vector implementation by tailoring the growth function of your container to your particular needs. Another option is using a linked list (e.g. std::list), which will beat an std::vector in insertions for larger containers. However, the cost here is that lookup (see 4.) will now become vastly slower (O(N) instead of O(1) for vectors), so you're unlikely to want to go down this path unless you plan to do more insertions/erasures than lookups.
Removal at any index: Similar considerations as for 2.
Accessing and updating any element by index (via operator[]): The only way you can beat std::vector in this regard is by making sure your data is in the cache when you try to access it. This is because lookup for a vector is essentially an array lookup, which is really just some pointer arithmetic and a pointer dereference. If you don't access your vector often you might be able to squeeze out a few clock cycles by using a custom allocator (see boost pools) and placing your pool close to the stack pointer.
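Here is the sketch promised in point 1: a hypothetical vector wrapper that tracks its minimum element (the class name and interface are invented; note that removing or overwriting the current minimum still costs an O(N) rescan, so the O(1) claim only holds for lookups of the minimum):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical wrapper that keeps the index of the smallest element up to date.
template <typename T>
class MinTrackingVector {
public:
    void push_back(const T& value) {
        data_.push_back(value);
        if (data_.size() == 1 || value < data_[min_index_])
            min_index_ = data_.size() - 1;
    }
    // Updating an element may change the minimum, so updates go through set().
    void set(std::size_t index, const T& value) {
        const bool was_min = (index == min_index_);
        data_[index] = value;
        if (value < data_[min_index_])
            min_index_ = index;
        else if (was_min)
            recompute_min();                     // O(N), only when the old minimum was overwritten
    }
    void erase(std::size_t index) {
        data_.erase(data_.begin() + static_cast<std::ptrdiff_t>(index));
        recompute_min();                         // O(N), only on removal
    }
    std::size_t min_index() const { return min_index_; }            // O(1)
    const T&    operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    void recompute_min() {
        min_index_ = data_.empty() ? 0
            : static_cast<std::size_t>(std::min_element(data_.begin(), data_.end())
                                       - data_.begin());
    }
    std::vector<T> data_;
    std::size_t    min_index_ = 0;
};
```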
I stopped writing mainly because there are dozens of ways in which you could approach this problem.
At the end of the day, this is probably more of an exercise in teaching you that the implementation of std::vector is likely to be extremely efficient for most compilers. All of these suggestions are essentially micro-optimizations (which are the root of all evil), so please don't blindly apply these in important code, as they're highly likely to end up costing you a lot of time and headache.
However, that's not to say you shouldn't tinker and learn for yourself, so by all means go ahead and try to beat it for your application and let us know how you go! Good luck :)

Need some advice to choose the proper container

I'm trying to design a task scheduler for a game engine. A task could be an animation, a trigger controller, etc.
My problem is what container to choose. The idea is: when you insert a new task, the container must reorder and put the task in the proper place. Once executed, a task could change and be scheduled again or deleted. This is mainly push and pop.
But, if possible, it would be nice if I could have random access to an element, but not vital. No matter if the container supports one or more elements with the same key.
I think that a priority queue fits my needs, but I saw that it is based on a vector implementation, and I think that this container must somehow be optimized for push and pop.
Opinions?
(An image appeared here; source: adrinael.net, original source: Liam Devine.)
A priority queue seems to be the best option for you.
Its top() operation is constant-time, while push() and pop() are logarithmic in time.
std::vector is pretty good for this task, especially if the "steady-state" size of the container remains reasonably constant (i.e. the number of tasks on the queue doesn't vary widely).
If you need an updatable queue (and std::priority_queue is not one), I would suggest you use the d_ary_heap_indirect (which can be found in the Boost.Graph "detail" folder). This is a priority queue used a lot for Dijkstra and A* algorithms that require an updatable priority queue. Random access is necessary anyway. Also, using an indirect heap makes insertion and deletion from the queue quite efficient. Finally, you can choose your container (as a template argument), but it has to be random-access (so you can try either vector or deque). Pop is constant-time, push and/or update is log-time, and the proper choice of container will make the container insertion constant-amortized (and the d_ary_heap_indirect amortizes a second time as well, so I wouldn't worry about that).
The vector is optimized for push and pop at one end. :-)
To prioritize you will have to sort the tasks. A vector isn't that bad, if the number of objects is reasonably small, even if it means copying objects during the sort.
Other containers, like linked lists, instead suffer from the need to allocate a new node for each object.
You can specify the container type you want with std::priority_queue.
However: you're storing pointers (I presume, since it sounds like what you're storing is polymorphic and has identity), so copying is cheap. You're managing it as a heap (that's what std::priority_queue does), so insertions are done using push_back and a number of swaps (lg(n) max). I can't see any reason to even consider another structure than std::vector.
std::priority_queue does hide all of the direct access operators (e.g. operator[]). It does this because if you modify an entry, you're likely to invalidate the heap (maintaining it is a class invariant). If you do want to provide direct read access, however, the underlying container is only protected, not private, so you can derive from the adapter and add the operators you want. I'd very much limit it to const operators, however.
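A sketch of that derivation trick (the class name is invented for illustration; the protected member of the standard container adaptors is named c, and for the default std::priority_queue it is a std::vector):

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Exposes read-only access to the underlying heap storage of a priority_queue.
template <typename T>
class InspectablePriorityQueue : public std::priority_queue<T> {
public:
    // const-only access, so the heap invariant cannot be broken from outside.
    const T& operator[](std::size_t i) const { return this->c[i]; }
    typename std::vector<T>::const_iterator begin() const { return this->c.begin(); }
    typename std::vector<T>::const_iterator end()   const { return this->c.end(); }
};
```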
Depends on how often you're going to be adding tasks and pulling tasks off (and presumably executing them) and how many there are.
If you're going to have tons of little tasks, then prefer priority queue because the cost of node allocation will probably not hurt you as much as the asymptotic growth of n log n for the sort.
If you're going to have a small number of tasks that constantly keep changing priority, then sorting a vector might be reasonable, but you want to use a sorting algorithm that works well when the list is almost sorted.
Scheduling is an art, though, and you're going to have to profile it once you build it. There's probably too little information at this point to say. I'd lean towards a priority queue, but keep other options in mind if performance isn't adequate.
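As a hedged sketch of the priority-queue route for the scheduler (the Task fields and the ordering by scheduled time are assumptions about the use case, not part of any answer above):

```cpp
#include <cstdint>
#include <queue>
#include <vector>

struct Task {
    std::uint64_t run_at;  // e.g. a tick count; the earliest task should run first
    int           id;
};

// Order the heap so the task with the smallest run_at ends up on top.
struct LaterFirst {
    bool operator()(const Task& a, const Task& b) const { return a.run_at > b.run_at; }
};

using TaskQueue = std::priority_queue<Task, std::vector<Task>, LaterFirst>;

// Usage sketch:
//   TaskQueue tasks;
//   tasks.push(Task{120, 1});
//   tasks.push(Task{40, 2});
//   Task next = tasks.top();   // next.id == 2, the task scheduled soonest
//   tasks.pop();
```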

Least Recently Used cache using C++

I am trying to implement an LRU cache using C++. I would like to know what the best design for implementing one is. I know LRU should provide find(), add an element, and remove an element, where remove should evict the least recently used element. What are the best ADTs to implement this?
For example: if I use a map with the element as value and a time counter as key, I can search in O(log n) time; inserting is O(n) and deleting is O(log n).
One major issue with LRU caches is that there are few "const" operations; most will change the underlying representation (if only because they bump the element accessed).
This is of course very inconvenient, because it means it's not a traditional STL container, and therefore any idea of exhibiting iterators is quite complicated: when the iterator is dereferenced this is an access, which should modify the list we are iterating on... oh my.
And there are the performance considerations, both in terms of speed and memory consumption.
It is unfortunate, but you'll need some way to organize your data in a queue (LRU) (with the possibility to remove elements from the middle), and this means your elements will have to be independent from one another. A std::list fits, of course, but it's more than you need. A singly-linked list is sufficient here, since you don't need to iterate the list backward (you just want a queue, after all).
However one major drawback of those is their poor locality of reference, if you need more speed you'll need to provide your own custom (pool ?) allocator for the nodes, so that they are kept as close together as possible. This will also alleviate heap fragmentation somewhat.
Next, you obviously need an index structure (for the cache bit). The most natural is to turn toward a hash map. std::tr1::unordered_map, std::unordered_map or boost::unordered_map are normally good quality implementations, and some should be available to you. They also allocate extra nodes for hash collision handling; you might prefer other kinds of hash maps, so check out Wikipedia's article on the subject and read about the characteristics of the various implementation techniques.
Continuing, there is the (obvious) threading support. If you don't need thread support, then it's fine, if you do however, it's a bit more complicated:
As I said, there are few const operations on such a structure, thus you don't really need to differentiate read/write accesses
Internal locking is fine, but you might find that it doesn't play nice with your uses. The issue with internal locking is that it doesn't support the concept of a "transaction", since it relinquishes the lock between each call. If this is your case, transform your object into a mutex and provide a std::unique_ptr<Lock> lock() method (in debug, you can assert that the lock is taken at the entry point of each method)
There is (in locking strategies) the issue of reentrancy, i.e. the ability to "relock" the mutex from within the same thread; check Boost.Thread for more information about the various locks and mutexes available
Finally, there is the issue of error reporting. Since it is expected that a cache may not be able to retrieve the data you put in, I would consider using an exception "poor taste". Consider either pointers (Value*) or Boost.Optional (boost::optional<Value&>). I would prefer Boost.Optional because its semantics are clear.
The best way to implement an LRU is to use the combination of a std::list and stdext::hash_map (if you want to use only std, then std::map).
Store the data in the list so that the least recently used item is at the back, and use the map to point to the list items.
For "get": use the map to find the list node and retrieve the data, then move that node to the front (since it was just used) and update the map.
For "insert": remove the last element from the list, add the new data to the front, and update the map.
This is the fastest you can get. If you are using a hash_map, you should have almost all operations done in O(1). If using std::map, it should take O(log n) in all cases.
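A minimal sketch of that design, with std::unordered_map standing in for stdext::hash_map (the key/value types, class name, and fixed capacity are illustrative only):

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Front of the list = most recently used; back = least recently used.
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns nullptr on a miss; on a hit, bumps the entry to the front.
    const std::string* get(int key) {
        auto it = index_.find(key);
        if (it == index_.end()) return nullptr;
        items_.splice(items_.begin(), items_, it->second);  // O(1) move to front
        return &it->second->second;
    }

    void put(int key, std::string value) {
        if (capacity_ == 0) return;
        auto it = index_.find(key);
        if (it != index_.end()) {                 // update an existing entry
            it->second->second = std::move(value);
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (items_.size() == capacity_) {         // evict the least recently used entry
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.emplace_front(key, std::move(value));
        index_[key] = items_.begin();
    }

private:
    std::size_t capacity_;
    std::list<std::pair<int, std::string>> items_;
    std::unordered_map<int, std::list<std::pair<int, std::string>>::iterator> index_;
};
```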
A very good implementation is available here
This article describes a couple of C++ LRU cache implementations (one using STL, one using boost::bimap).
When you say priority, I think "heap" which naturally leads to increase-key and delete-min.
I would not make the cache visible to the outside world at all if I could avoid it. I'd just have a collection (of whatever) and handle the caching invisibly, adding and removing items as needed, but the external interface would be exactly that of the underlying collection.
As far as the implementation goes, a heap is probably the most obvious. It has complexities roughly similar to a map, but instead of building a tree from linked nodes, it arranges items in an array and the "links" are implicit based on array indices. This increases the storage density of your cache and improves locality in the "real" (physical) processor cache.
I suggest a heap and maybe a Fibonacci Heap
I'd go with a normal heap in C++.
With std::make_heap (guaranteed by the standard to be O(n)), std::pop_heap, and std::push_heap from <algorithm>, implementing it would be absolute cake. You only have to worry about increase-key.
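A quick illustration of those heap algorithms over a plain std::vector (the values are arbitrary; increase-key is not shown):

```cpp
#include <algorithm>
#include <vector>

int main() {
    std::vector<int> v{5, 1, 9, 3};

    std::make_heap(v.begin(), v.end());   // O(n); the largest element ends up at v.front()

    v.push_back(12);
    std::push_heap(v.begin(), v.end());   // O(log n); restores the heap property

    std::pop_heap(v.begin(), v.end());    // moves the maximum to v.back()
    int largest = v.back();               // largest == 12
    v.pop_back();

    return largest == 12 ? 0 : 1;
}
```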

selection of data structure

I use C++. Say I want to store 40 usernames; I will simply use an array. However, if I want to store 40,000 usernames, is this still a good idea in terms of search speed? Which data structure should I use to improve this speed?
You need to specify what the insertion and removal requirements are. Do things need to be removed and inserted at random points in the sequence?
Also, why the requirement to search sequentially? Are you doing searches that aren't suitable for a hash table lookup?
At the moment I'd suggest a deque or a list. Often it's best to choose a container with the interface that makes for the simplest implementation for your algorithm and then only change the choice if the performance is inadequate and an alternative provides the necessary speedup.
A vector has two principal advantages: there is no per-object memory overhead (although vectors will over-allocate to prevent frequent copying), and objects are stored contiguously, so sequential access tends to be fast. These are also its disadvantages. Growing vectors require reallocation and copying, and insertion and removal from anywhere other than the end of the vector also require copying. Contiguous storage can produce problems for vectors with large numbers of objects or large objects, as the contiguous storage requirements can be hard to satisfy even with only mild memory fragmentation.
A list doesn't require contiguous storage, but list nodes usually have a per-object overhead of two pointers (in most implementations). This can be significant in a list of very small objects (e.g. in a list of pointers, each node is 3x the size of the data item). Insertion and removal from the middle of a list is very cheap, though, and list nodes never need to be moved in memory once created.
A deque uses chunked storage, so it has a low per-object overhead similar to a vector, but doesn't require contiguous storage over the whole container so doesn't have the same problem with fragmented memory spaces. It is often a very good choice for collections and is often overlooked.
As a rule of thumb, prefer vector to list or, deity forbid, a C-style array.
After the vector is filled, make sure it is properly ordered using the sort algorithm. You can then search for a particular record using either find, binary_search or lower_bound. (You don't need to sort to use find.)
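For example (the usernames are placeholders):

```cpp
#include <algorithm>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> users{"carol", "alice", "bob"};

    std::sort(users.begin(), users.end());

    // O(log n) membership test on the sorted vector
    bool has_bob = std::binary_search(users.begin(), users.end(), "bob");

    // lower_bound also gives the position, e.g. for inserting while keeping the order
    auto pos = std::lower_bound(users.begin(), users.end(), std::string("dave"));
    users.insert(pos, "dave");

    return has_bob ? 0 : 1;
}
```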
Seriously, unless you are in a resource-constrained environment (embedded platform, phone, or other), use a std::map: save the effort of doing sorting or searching and let the container take care of everything. This will probably be a sorted tree structure, likely balanced (e.g. red-black), which means you will get good searching performance. Unless the size of your data is close to the size of one or two pointers, the memory overhead of whatever data structure you pick is negligible. Your graphics card probably has more memory than you are going to use for the data you are thinking about.
As others said, there is very little good reason to use a vanilla array; if you don't want to use a map, use std::vector or std::list depending on whether you need to insert/delete data (=> list) or not (=> vector).
Also consider whether you really need all that data in memory; how about putting it on disk via SQLite? Or even use SQLite for in-memory access. It all depends on what you need to do with your data.
std::vector and std::list seem good for this task. You can use an array if you know the maximum number of records beforehand.
If you only need sequential search and storage, then a list is the proper container.
Also, vector wouldn't be a bad choice.

Benefit of slist over vector?

What I need is just a dynamically growing array. I don't need random access, and I always insert to the end and read it from the beginning to the end.
slist seems to be the first choice, because it provides just enough of what I need. However, I can't tell what benefit I get by using slist instead of vector. Besides, several materials I read about STL say, "vectors are generally the most efficient in time for accessing elements and to add or remove elements from the end of the sequence". Therefore, my question is: for my needs, is slist really a better choice than vector? Thanks in advance.
For starters, slist is non-standard.
For your choice, a linked list will be slower than a vector, count on it. There are two reasons contributing to this:
First and foremost, cache locality; vectors store their elements linearly in the RAM which facilitates caching and pre-fetching.
Secondly, appending to a linked list involves dynamic allocations which add a large overhead. By contrast, vectors most of the time do not need to allocate memory.
However, a std::deque will probably be even faster. In-depth performance analysis has shown that, despite bias to the contrary, std::deque is almost always superior to std::vector in performance (if no random access is needed), due to its improved (chunked) memory allocation strategy.
Yes, if you are always reading beginning to end, slist (a linked list) sounds like the way to go. The possible exception is if you will be inserting a large quantity of elements at the end at the same time. Then, the vector could be better if you use reserve appropriately.
Of course, profile to be sure it's better for your application.
Matt Austern (author of "Generic Programming and the STL" and general C++ guru) is a strong advocate of singly-linked lists for inclusion in the forthcoming C++ standard; see his presentation at http://www.accu-usa.org/Slides/SinglyLinkedLists.ppt and his long article at http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2543.htm for more details, including a discussion of the trade-offs involved that may guide you in possibly choosing this data structure. (Note that the currently proposed name is forward_list, though slist is how it was traditionally named in SGI's STL & other popular libraries).
I'll second (or maybe third...) the opinion that std::vector or std::deque will do the job. The only thing that I will add is a few additional factors that should guide the decision between std::vector<T> and std::list<T>. These have a lot to do with the characteristics of T and what algorithms you plan on using.
The first is memory overhead. std::list is a node-based container, so if T is a primitive type or relatively small user-defined type, then the memory overhead of the node-based linkage might be non-negligible; consider that std::list<int> is likely to use at least 3 * sizeof(int) storage for each element, whereas std::vector will only use sizeof(int) storage with a small header overhead. std::deque is similar to std::vector but has a small overhead that is linear in N.
The next issue is the cost of copy construction. If T(T const&) is at all expensive, then steer clear of std::vector<T>, since it causes a bunch of copies to occur as the size of the vector grows. This is where std::deque<T> is a clear winner and std::list<T> is also a contender.
The final issue that usually guides the decision on container type is whether your algorithms can work with the iterator invalidation constraints of std::vector and std::deque. If you will be manipulating the container elements a lot (e.g., sorting, inserting in the middle, or shuffling), then you might want to lean towards std::list since manipulating the order requires little more than resetting a few linkage pointers.
I'm guessing you mean std::list by "slist". Vectors are good when you need fast, random-access to a sequence of elements, with guaranteed contiguous memory, and fast sequential reading (IOW, from the beginning to the end). Lists are good when you need fast (constant-time) insertion or deletion of items at the beginning or end of the sequence, but don't care about the performance of random-access or sequential reading.
The reason for the difference is the way the 2 are implemented. Vectors are implemented internally as an array of items, which needs to be reallocated when its size/capacity is reached on adding an item. Lists are implemented as a doubly-linked list, which can cause cache-misses for sequential reading. Random-access for lists also requires scanning from the first (or last) item in the list, until it locates the item you're requesting.
Sounds like a good job for std::deque to me. It has the memory benefits of a vector, like contiguous memory allocation for each "slab" (good for CPU caches), no per-element overhead as with std::list, and it does not need to reallocate the whole set as std::vector does. Read more about std::deque here