As read on cplusplus.com, std::queue is implemented as follows:
queues are implemented as containers adaptors, which are classes that
use an encapsulated object of a specific container class as its
underlying container, providing a specific set of member functions to
access its elements. Elements are pushed into the "back" of the
specific container and popped from its "front".
The underlying container may be one of the standard container class
template or some other specifically designed container class. This
underlying container shall support at least the following operations:
......
The standard container classes deque and list fulfill these
requirements. By default, if no container class is specified for a
particular queue class instantiation, the standard container deque is
used.
I am confused as to why deque (a double-ended-queue on steroids) is used as a default here, instead of list (which is a doubly-linked list).
It seems to me that std::deque is very much overkill: It is a double-ended queue, but also has constant-time element access and many other features; being basically a full-featured std::vector bar the 'elements are stored contiguously in memory' guarantee.
As a normal std::queue only has very few possible operations, it seems to me that a doubly-linked list should be much more efficient, as there is a lot less plumbing that needs to happen internally.
Why then is std::queue implemented using std::deque as default, instead of std::list?
Stop thinking of list as "This is awkward to use, and lacks a bunch of useful features, so it must be the best choice when I don't need those features".
list is implemented as a doubly-linked list with a cached count. There is a narrow set of situations where it is optimal: when you need really, really strong reference/pointer/iterator stability, or when you erase and insert in the middle of a container orders of magnitude more often than you iterate to the middle of a container.
And that is about it.
The std datatypes were generally implemented, then their performance and other characteristics analyzed, then the standard was written saying "you gotta guarantee these requirements". A little bit of wiggle room was left.
So when they wrote queue, someone probably profiled how list and deque performed and discovered how much faster deque was, so used deque by default.
In practice, someone could ship a deque with horrible performance (for example, MSVC has a tiny block size), but making it worse than what is required for a std::list would be tricky. list basically mandates one-node-per-element, and that makes memory caches cry.
The reason is that deque is typically much faster than list for this kind of usage. list allocates each element separately, while deque allocates large chunks of elements.
The advantage of list is that it is possible to delete elements in the middle, but a queue does not require this feature.
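To make the comparison concrete, here is a minimal sketch showing that std::queue behaves the same over either container; only the storage strategy underneath differs. The helper name drain_fifo and the two aliases are made up for illustration.

```cpp
#include <cassert>
#include <deque>
#include <list>
#include <queue>
#include <type_traits>
#include <vector>

// With no second template argument, std::queue stores its elements in a std::deque.
static_assert(std::is_same<std::queue<int>::container_type, std::deque<int>>::value,
              "std::queue defaults to std::deque");

using DequeQueue = std::queue<int>;                  // default: chunked storage
using ListQueue  = std::queue<int, std::list<int>>;  // one heap node per element

// Hypothetical helper: push 0..n-1 into any queue type, then pop everything
// and return the values in the order they came out.
template <class Queue>
std::vector<int> drain_fifo(int n) {
    assert(n >= 0);
    Queue q;
    for (int i = 0; i < n; ++i) {
        q.push(i);                 // forwards to push_back on the underlying container
    }
    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.front());
        q.pop();                   // forwards to pop_front on the underlying container
    }
    return out;
}
```

Both drain_fifo<DequeQueue>(n) and drain_fifo<ListQueue>(n) return the same sequence; timing the two on your own implementation is the only reliable way to see how large the difference is.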
There's a well known image (cheat sheet) called "C++ Container choice". It's a flow chart to choose the best container for the wanted usage.
Does anybody know if there's already a C++11 version of it?
This is the previous one:
Not that I know of, however it can be done textually I guess. Also, the chart is slightly off, because list is not such a good container in general, and neither is forward_list. Both lists are very specialized containers for niche applications.
To build such a chart, you just need two simple guidelines:
Choose for semantics first
When several choices are available, go for the simplest
Worrying about performance is usually useless at first. The big O considerations only really kick in when you start handling a few thousand (or more) items.
There are two big categories of containers:
Associative containers: they have a find operation
Simple Sequence containers
and then you can build several adapters on top of them: stack, queue, priority_queue. I will leave the adapters out here, they are sufficiently specialized to be recognizable.
Question 1: Associative ?
If you need to easily search by one key, then you need an associative container
If you need to have the elements sorted, then you need an ordered associative container
Otherwise, jump to the question 2.
Question 1.1: Ordered ?
If you do not need a specific order, use an unordered_ container, otherwise use its traditional ordered counterpart.
Question 1.2: Separate Key ?
If the key is separate from the value, use a map, otherwise use a set
Question 1.3: Duplicates ?
If you want to keep duplicates, use a multi, otherwise do not.
Example:
Suppose that I have several persons with a unique ID associated to them, and I would like to retrieve a person data from its ID as simply as possible.
I want a find function, thus an associative container
1.1. I couldn't care less about order, thus an unordered_ container
1.2. My key (ID) is separate from the value it is associated with, thus a map
1.3. The ID is unique, thus no duplicate should creep in.
The final answer is: std::unordered_map<ID, PersonData>.
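A minimal sketch of that example follows. The concrete definitions of ID and PersonData, and the helper names, are assumptions made up for illustration; the answer only names the two types.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the ID and PersonData types of the example.
using ID = int;
struct PersonData {
    std::string name;
    int age;
};

using Registry = std::unordered_map<ID, PersonData>;

// Inserts only if the ID is not present yet; returns false on a duplicate ID.
bool add_person(Registry& people, ID id, const PersonData& data) {
    return people.emplace(id, data).second;
}

// Returns a pointer to the person's data, or nullptr when the ID is unknown.
const PersonData* find_person(const Registry& people, ID id) {
    auto it = people.find(id);   // lookup by key, O(1) on average
    return it == people.end() ? nullptr : &it->second;
}
```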
Question 2: Memory stable ?
If the elements should be stable in memory (i.e., they should not move around when the container itself is modified), then use some list
Otherwise, jump to question 3.
Question 2.1: Which ?
Settle for a list; a forward_list is only useful for lesser memory footprint.
Question 3: Dynamically sized ?
If the container has a known size (at compilation time), and this size will not be altered during the course of the program, and the elements are default constructible or you can provide a full initialization list (using the { ... } syntax), then use an array. It replaces the traditional C-array, but with convenient functions.
Otherwise, jump to question 4.
Question 4: Double-ended ?
If you wish to be able to remove items from both the front and back, then use a deque, otherwise use a vector.
You will note that, by default, unless you need an associative container, your choice will be a vector. It turns out it is also Sutter and Stroustrup's recommendation.
I like Matthieu's answer, but I'm going to restate the flowchart as this:
When to NOT use std::vector
By default, if you need a container of stuff, use std::vector. Thus, every other container is only justified by providing some functionality alternative to std::vector.
Constructors
std::vector requires that its contents are move-constructible, since it needs to be able to shuffle the items around. This is not a terrible burden to place on the contents (note that default constructors are not required, thanks to emplace and so forth). However, most of the other containers don't require any particular constructor (again, thanks to emplace). So if you have an object where you absolutely cannot implement a move constructor, then you will have to pick something else.
A std::deque would be the general replacement, having many of the properties of std::vector, but you can only insert cheaply at either end of the deque; inserts in the middle require moving elements. A std::list places no requirement on its contents.
Needs Bools
std::vector<bool> is... not. Well, it is standard. But it's not a vector in the usual sense, as operations that std::vector normally allows are forbidden. And it most certainly does not contain bools.
Therefore, if you need real vector behavior from a container of bools, you're not going to get it from std::vector<bool>. So you'll have to make do with a std::deque<bool>.
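The difference shows up in the element reference type. The following sketch checks it at compile time; the function names toggle and toggle_first are made up for illustration.

```cpp
#include <cassert>
#include <deque>
#include <type_traits>
#include <vector>

// For an ordinary vector, operator[] yields a real reference to the element.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int> hands out int&");

// vector<bool> is a packed specialization: operator[] yields a proxy object.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> does not hand out bool&");

// deque<bool> has no such specialization, so it stores and exposes real bools.
static_assert(std::is_same<std::deque<bool>::reference, bool&>::value,
              "deque<bool> hands out bool&");

// Flips a flag through a plain bool*.
void toggle(bool* flag) { *flag = !*flag; }

// Flips the first flag and returns its new value.
bool toggle_first(std::deque<bool>& flags) {
    assert(!flags.empty());
    toggle(&flags[0]);   // fine here; with vector<bool>, &flags[0] is not a bool*
    return flags[0];
}
```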
Searching
If you need to find elements in a container, and the search tag can't just be an index, then you may need to abandon std::vector in favor of set and map. Note the key word "may"; a sorted std::vector is sometimes a reasonable alternative. Or Boost.Container's flat_set/map, which implements a sorted std::vector.
There are now four variations of these, each with their own needs.
Use a map when the search tag is not the same thing as the item you're looking for itself. Otherwise use a set.
Use unordered when you have a lot of items in the container and search performance absolutely needs to be O(1), rather than O(log n).
Use multi if you need multiple items to have the same search tag.
Ordering
If you need a container of items to always be sorted based on a particular comparison operation, you can use a set. Or a multiset if you need multiple items to have the same value.
Or you can use a sorted std::vector, but you'll have to keep it sorted.
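Keeping a std::vector sorted can be sketched with std::lower_bound for insertion and std::binary_search for lookup. The helper names are made up; this is one way to do it, not the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inserts value at the position that keeps v sorted:
// O(log n) to find the spot, O(n) to shift the tail.
void sorted_insert(std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    v.insert(pos, value);
}

// Binary search; only meaningful because v is kept sorted.
bool sorted_contains(const std::vector<int>& v, int value) {
    return std::binary_search(v.begin(), v.end(), value);
}
```

Lookups cost O(log n) like a set, but over contiguous memory; the price is the O(n) shift on every insertion.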
Stability
When iterators and references are invalidated is sometimes a concern. If you need a list of items, such that you have iterators/pointers to those items in various other places, then std::vector's approach to invalidation may not be appropriate. Any insertion operation may cause invalidation, depending on the current size and capacity.
std::list offers a firm guarantee: an iterator and its associated references/pointers are only invalidated when the item itself is removed from the container. std::forward_list is there if memory is a serious concern.
If that's too strong a guarantee, std::deque offers a weaker but useful guarantee. Invalidation results from insertions in the middle, but insertions at the head or tail cause only invalidation of iterators, not pointers/references to items in the container.
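The two guarantees can be sketched as follows. The function names are made up for illustration, and the element counts are arbitrary.

```cpp
#include <deque>
#include <iterator>
#include <list>

// True if a pointer to a deque element survives insertions at both ends.
bool deque_pointer_survives_end_inserts() {
    std::deque<int> d{42};
    int* p = &d.front();               // pointer to the only element
    for (int i = 0; i < 1000; ++i) {
        d.push_back(i);                // iterators may be invalidated here...
        d.push_front(i);               // ...but pointers and references are not
    }
    // After 1000 push_front calls the original element sits at index 1000.
    return *p == 42 && p == &d[1000];
}

// True if a list iterator survives insertions and erasures of other elements.
bool list_iterator_survives_other_changes() {
    std::list<int> l{1, 2, 3};
    auto it = std::next(l.begin());    // refers to the element 2
    l.erase(l.begin());                // erase a different element
    l.push_front(0);
    l.push_back(4);
    return *it == 2;
}
```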
Insertion Performance
std::vector only provides cheap insertion at the end (and even then, it becomes expensive if you blow capacity).
std::list is expensive in terms of performance (each newly inserted item costs a memory allocation), but it is consistent. It also offers the occasionally indispensable ability to shuffle items around for virtually no performance cost, as well as to trade items with other std::list containers of the same type at no loss of performance. If you need to shuffle things around a lot, use std::list.
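The "trade items with other lists" operation is std::list::splice. A minimal sketch, with a made-up function name, that checks the elements are relinked rather than copied:

```cpp
#include <cassert>
#include <list>
#include <string>

// Moves every node of `from` to the end of `to` without copying any string.
// Returns true if the first moved element kept its address.
bool splice_keeps_addresses(std::list<std::string>& to, std::list<std::string>& from) {
    assert(!from.empty());
    const std::string* before = &from.front();
    auto first_moved = from.begin();   // stays valid and now refers into `to`
    to.splice(to.end(), from);         // relinks nodes; no allocation, no copy
    return from.empty() && &*first_moved == before;
}
```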
std::deque provides constant-time insertion/removal at the head and tail, but insertion in the middle can be fairly expensive. So if you need to add/remove things from the front as well as the back, std::deque might be what you need.
It should be noted that, thanks to move semantics, std::vector insertion performance may not be as bad as it used to be. Some implementations implemented a form of move semantic-based item copying (the so-called "swaptimization"), but now that moving is part of the language, it's mandated by the standard.
No Dynamic Allocations
std::array is a fine container if you want the fewest possible dynamic allocations. It's just a wrapper around a C-array; this means that its size must be known at compile-time. If you can live with that, then use std::array.
That being said, using std::vector and reserving a size would work just as well for a bounded std::vector. This way, the actual size can vary, and you only get one memory allocation (unless you blow the capacity).
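A small sketch of the reserve approach, with a made-up function name: as long as the size stays within the reserved capacity, the storage never moves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a vector up to `bound` elements after a single reserve() call and
// reports whether the storage stayed in place the whole time.
bool fill_without_reallocating(std::size_t bound) {
    assert(bound > 0);
    std::vector<int> v;
    v.reserve(bound);                       // the one allocation
    v.push_back(0);
    const int* storage = v.data();
    for (std::size_t i = 1; i < bound; ++i) {
        v.push_back(static_cast<int>(i));   // size never exceeds capacity
    }
    return v.data() == storage && v.size() == bound;
}
```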
Here is the C++11 version of the above flowchart. [originally posted without attribution to its original author, Mikael Persson]
Here's a quick spin, although it probably needs work
Should the container let you manage the order of the elements?
  Yes:
    Will the container contain always exactly the same number of elements?
      Yes:
        Does the container need a fast move operator?
          Yes: std::vector
          No: std::array
      No:
        Do you absolutely need stable iterators? (be certain!)
          Yes: boost::stable_vector (as a last case fallback, std::list)
          No:
            Do inserts happen only at the ends?
              Yes: std::deque
              No: std::vector
  No:
    Are keys associated with values?
      Yes:
        Do the keys need to be sorted?
          Yes:
            Is there more than one value per key?
              Yes: boost::flat_multimap (as a last case fallback, std::multimap)
              No: boost::flat_map (as a last case fallback, std::map)
          No:
            Is there more than one value per key?
              Yes: std::unordered_multimap
              No: std::unordered_map
      No:
        Are elements read then removed in a certain order?
          Yes:
            Order is:
              Ordered by element: std::priority_queue
              First in First out: std::queue
              First in Last out: std::stack
              Other: Custom based on std::vector?
          No:
            Should the elements be sorted by value?
              Yes: boost::flat_set
              No: std::vector
You may notice that this differs wildly from the C++03 version, primarily because I really do not like linked nodes. The linked-node containers can usually be beaten in performance by a non-linked container, except in a few rare situations. If you don't know what those situations are, and have access to Boost, don't use linked-node containers (std::list, std::forward_list, std::map, std::multimap, std::set, std::multiset). This list focuses mostly on small and medium-sized containers, because (A) that's 99.99% of what we deal with in code, and (B) large numbers of elements need custom algorithms, not different containers.
With the stl priority_queue you can set the underlying container, such as a vector. What are some of the advantages of specifying a container for the stl priority_queue?
Setting the underlying container makes it possible to separate out two logically separate concerns:
How do you store the actual elements that make up the priority queue (the container), and
How do you organize those elements to efficiently implement a priority queue (the priority_queue adapter class).
As an example, the standard implementation of vector is not required to shrink itself down when its capacity is vastly greater than its actual size. This means that if you have a priority queue backed by a vector, you might end up wasting memory if you enqueue a lot of elements and then dequeue all of them, since the vector will keep its old capacity. If, on the other hand, you implement your own shrinking_vector class that does actually decrease its capacity when needed, you can get all the benefits of the priority_queue interface while having the storage be used more efficiently.
Another possible example - you might want to change the allocator being used so that the elements of the priority queue are allocated from a special pool of resources. You can do this by just setting the container type of the priority_queue to be a vector with a custom allocator.
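As a rough sketch of the allocator idea: the CountingAllocator below is a hypothetical, minimal C++11 allocator that only counts allocations. A real pool allocator would hand out memory from its own arena instead of calling operator new.

```cpp
#include <cstddef>
#include <new>
#include <queue>
#include <vector>

// Hypothetical minimal allocator; it forwards to operator new/delete and
// counts how many times allocate() was called.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }

    static int& allocation_count() {
        static int count = 0;
        return count;
    }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// The priority_queue is unchanged; only its backing vector uses the allocator.
using PooledVector = std::vector<int, CountingAllocator<int>>;
using PooledPriorityQueue = std::priority_queue<int, PooledVector>;
```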
One more thought: suppose that you are storing a priority_queue of very large objects whose copy time is very great. In that case, the fact that the vector dynamically resizes itself and copies its old elements (at least with a C++03 compiler, which has no move semantics) might be something you're not willing to pay for. You could thus switch to some other type, perhaps a deque, that makes an effort not to copy elements when resizing, and could realize some big performance wins.
Hope this helps!
The priority_queue class is an example of the adapter pattern. It provides the services of a priority queue over an existing container. As an adapter, it actually requires an underlying container; by default, it uses a vector.
In terms of the advantages, it's simply more flexible. The priority_queue uses the following methods of the backing store and requires it to support random access iterators:
front
push_back
pop_back
By providing it as an adapter, you can control the performance characteristics by supplying a different implementation.
Two containers in the STL that meet these requirements are vector and deque. They have different performance characteristics. For example, a vector is contiguous in memory, whereas a deque typically isn't. The push_back operation on a vector is only amortized constant time (it might have to reallocate the vector), whereas for the deque it is specified as constant time.
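Swapping the backing store is a one-line change, sketched below. The aliases and helper names are made up for illustration; the observable behavior of the queue is the same either way.

```cpp
#include <deque>
#include <queue>
#include <vector>

using VectorHeap = std::priority_queue<int>;                   // default: std::vector
using DequeHeap  = std::priority_queue<int, std::deque<int>>;  // explicit: std::deque

// Builds a priority queue of the requested type from a list of values.
template <class PQ>
PQ make_queue(const std::vector<int>& values) {
    PQ pq;
    for (int v : values) {
        pq.push(v);        // push_back on the container, then push_heap
    }
    return pq;
}

// Pops everything and returns the values in the order they came out.
template <class PQ>
std::vector<int> pop_all(PQ pq) {
    std::vector<int> out;
    while (!pq.empty()) {
        out.push_back(pq.top());
        pq.pop();          // pop_heap, then pop_back on the container
    }
    return out;
}
```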
How are STL list and vector implemented?
I was just asked this in an interview.
I said that vector might use a binary tree or a hash table; I was not sure about list.
I guess I am wrong.
Any ideas? Thanks.
Hash table or binary tree? Why?
std::vector, as the name itself suggests, is implemented with a normal dynamically-allocated array, that is reallocated when its capacity is exhausted (usually doubling its size or something like that).
std::list instead is (usually*) implemented with a doubly-linked list.
The binary tree you mentioned is the usual implementation of std::map; the hash table instead is generally used for the unordered_map container (available in the upcoming C++0x standard).
*"Usually" because the standard does not mandate a particular implementation, but specifies the asymptotic complexity of its methods, and such constraints are met easily with a doubly-linked list.
On the other hand, for std::vector the "contiguous space" requirement is enforced by the standard (from C++03 onwards), so it must be some form of dynamically allocated array.
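Both properties can be observed from the outside. The sketch below uses made-up function names; the exact number of reallocations depends on the implementation's growth factor, so no specific count is assumed.

```cpp
#include <cstddef>
#include <vector>

// True if every element sits exactly one slot after the previous one.
bool is_contiguous(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != v.data() + i) return false;
    }
    return true;
}

// Appends n elements and counts how many times the capacity changed,
// i.e. how many times the array was reallocated.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != capacity) {
            ++reallocations;
            capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Because the capacity grows geometrically, count_reallocations(n) stays far below n, which is what makes push_back amortized constant time.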
std::vector uses a contiguously allocated array and placement new
std::list uses dynamically allocated chunks with pointer to the next and previous element.
nothing as fancy as binary trees or hash tables (which can be used for std::map)
You can spend half a semester talking about either of the containers, but here are a few points:
std::vector is a contiguous container, which means every element follows right after the previous element in memory. It can grow at runtime, which means it allocates its storage in dynamic memory.
std::list is a bidirectional linked list. This means that the elements are scattered in memory in arbitrary layout, and that each element knows where the next and previous elements in sequence are.
std::vector, std::list and the other containers own the elements they hold and clean up after themselves, but they do not take ownership of anything those elements point to. So, if the elements are raw pointers to dynamic memory, the user must free the pointed-to objects before the container is destroyed. But if the container holds objects by value, their destructors are called automatically when the container is cleaned up.
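The cleanup rules can be sketched with a hypothetical Tracked type that counts live instances. The std::unique_ptr variant is an addition for comparison; it is the usual C++11 way to get automatic cleanup for dynamically allocated elements.

```cpp
#include <memory>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int& alive() { static int n = 0; return n; }
    Tracked() { ++alive(); }
    Tracked(const Tracked&) { ++alive(); }
    ~Tracked() { --alive(); }
};

// Elements stored by value are destroyed together with the container.
int alive_after_value_vector() {
    { std::vector<Tracked> v(3); }
    return Tracked::alive();
}

// Raw pointers: the vector destroys only the pointers, not the pointees.
int alive_after_raw_pointer_vector() {
    Tracked* element = new Tracked;
    { std::vector<Tracked*> v{element}; }
    int alive = Tracked::alive();   // the pointee is still alive here
    delete element;                 // manual cleanup is the user's job
    return alive;
}

// Smart pointers restore automatic cleanup for dynamically allocated elements.
int alive_after_unique_ptr_vector() {
    {
        std::vector<std::unique_ptr<Tracked>> v;
        v.push_back(std::unique_ptr<Tracked>(new Tracked));
    }
    return Tracked::alive();
}
```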
So far, very simple and roughly equivalent to any other language or toolset. What's unique about the STL is that the containers are generic and decoupled from the means of iterating over them and (for the most part) from the operations you can perform over them. Some operations can be done particularly efficiently with some containers, so the containers will provide member functions in these cases. For example, std::list has a sort() member function.
The STL doesn't provide container classes (for the most part), but rather container templates. In other words, when the library talks about a container it only refers to the data type anonymously, say, as T, never by its true name. Never int or double or Car; always T, for any type. There are exceptions, like std::vector<bool>, but this is the general case. Then, when the user instantiates a container template, they specify a type, and the compiler creates a container class from the template for that type.
The STL also offers algorithms as free template functions. These algorithms work on iterators, themselves templates. Often iterators come in pairs that denote the beginning and end of a sequence, on which the algorithm operates. std::vector, std::list and other containers then expose their own iterators that can traverse and manipulate their data. So the same free algorithm can work on a std::vector and a std::list and other containers, provided the iterators conform with specific assumptions about the iterators' abilities.
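A short sketch of that decoupling: the same free algorithms run over a std::vector and a std::list through their iterators. The helper names are made up for illustration.

```cpp
#include <algorithm>
#include <list>
#include <numeric>
#include <vector>

// Works with any container of int that exposes begin()/end().
template <class Container>
int sum_of(const Container& c) {
    return std::accumulate(c.begin(), c.end(), 0);
}

// std::find only needs to step forward, so any container's iterators will do.
template <class Container>
bool has_value(const Container& c, int value) {
    return std::find(c.begin(), c.end(), value) != c.end();
}

// std::sort needs random-access iterators, which std::list cannot offer;
// that is why list provides sort() as a member function instead.
std::vector<int> sorted_copy(std::list<int> l) {
    l.sort();
    return std::vector<int>(l.begin(), l.end());
}
```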
All this abstraction is done at compile-time, and that is the biggest difference when compared to other languages. This translates to outstanding performance with relatively short and concise code. The same performance that in C you'd only get with lots of copy-pasting or hardcoding.
I was wondering why the heap concept is implemented as algorithms (make_heap, pop_heap, push_heap, sort_heap) instead of a container. I am especially interested in whether someone's answer can also explain why set and map are containers instead of similar collections of algorithms (make_set, add_set, rm_set, etc.).
STL does provide a heap in the form of a std::priority_queue. The make_heap, etc., functions are there because they have uses outside the realm of the data structure itself (e.g. sorting), and to allow heaps to be built on top of custom structures (like stack arrays for a "keep the top 10" container).
By analogy, you can use a std::set to store a sorted list, or you can use std::sort on a vector with std::adjacent_find; std::sort is the more general-purpose and makes few assumptions about the underlying data structure.
(As a note, the std::priority_queue implementation does not actually provide for its own storage; by default it creates a std::vector as its backing store.)
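The "keep the top 10" idea could look roughly like this. TopN is a hypothetical class made up for illustration; it keeps a min-heap in a fixed-size std::array, so it never allocates dynamically.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <functional>

// Hypothetical fixed-capacity container that keeps the N largest values seen.
// std::greater turns the range into a min-heap: the smallest kept value is at index 0.
template <std::size_t N>
class TopN {
    static_assert(N > 0, "TopN needs room for at least one element");

public:
    void offer(int value) {
        if (size_ < N) {
            data_[size_++] = value;
            std::push_heap(data_.begin(), data_.begin() + size_, std::greater<int>());
        } else if (value > data_[0]) {
            // Move the smallest kept value to the back, overwrite it, re-heapify.
            std::pop_heap(data_.begin(), data_.end(), std::greater<int>());
            data_[N - 1] = value;
            std::push_heap(data_.begin(), data_.end(), std::greater<int>());
        }
    }

    std::size_t size() const { return size_; }
    int smallest_kept() const { return data_[0]; }   // requires size() > 0

private:
    std::array<int, N> data_{};   // storage lives inside the object
    std::size_t size_ = 0;
};
```

std::priority_queue could not be used here directly, because std::array has no push_back or pop_back; the free heap algorithms only need a pair of random-access iterators.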
One obvious reason is that you can arrange elements as a heap inside another container.
So you can call make_heap() on a vector or a deque or even a C array.
A heap is a specific data structure. The standard containers have complexity requirements but don't specify how they are to be implemented. It's a fine but important distinction. You can make_heap on several different containers, including one you wrote yourself. But a set or map mean more than just a way of arranging the data.
Said another way, a standard container is more than just its underlying data structure.
Heaps* are almost always implemented using an array as the underlying data structure. As such it can be considered a set of algorithms that operate on the array data structure. This is the path that the STL took when implementing the heap - it will work on any data structure that has random access iterators (a standard array, vector, deque, etc).
You'll also notice that the STL priority_queue requires a container (which by default is a vector). This is essentially your heap container - it implements a heap on your underlying data structure and provides a wrapper container for all of the typical heap operations.
*Binary heaps in particular. Other forms of heaps (Binomial, Fibonacci, etc) are not.
Well, heaps aren't really a generic container in the same sense as a set or a map. Usually, you use a heap to implement some other abstract data type. (The most obvious being a priority queue.) I suspect this is the reason for the different treatment.