Since the only operations required for a container to be used in a stack are:
back()
push_back()
pop_back()
Why is the default container for it a deque instead of a vector?
Doesn't a deque reallocation leave a buffer of elements before front() so that push_front() is an efficient operation? Aren't these elements wasted, since they will never be used in the context of a stack?
If there is no overhead for using a deque this way instead of a vector, why is the default for priority_queue a vector not a deque also? (priority_queue requires front(), push_back(), and pop_back() - essentially the same as for stack)
Updated based on the Answers below:
It appears that deque is usually implemented as a variable-size array of fixed-size arrays. This makes growing cheaper than for a vector (which has to reallocate and copy its elements), so for something like a stack, which is all about adding and removing elements, deque is likely a better choice.
priority_queue relies heavily on indexing, since every insertion and removal runs push_heap() or pop_heap(). That probably makes vector the better choice there, since adding an element is still amortized constant time anyway.
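To make the indexing point concrete, here is a minimal sketch of the heap maintenance that priority_queue performs on its backing container. The helper names heap_push and heap_pop are made up for illustration and are not part of the standard library; only std::push_heap and std::pop_heap are.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: append, then restore the heap property.
// std::push_heap needs random access iterators, which is where
// the indexing cost comes from.
void heap_push(std::vector<int>& heap, int value) {
    heap.push_back(value);
    std::push_heap(heap.begin(), heap.end());
}

// Hypothetical helper: move the largest element to the back,
// then remove it from the container.
int heap_pop(std::vector<int>& heap) {
    assert(!heap.empty());
    std::pop_heap(heap.begin(), heap.end());
    int top = heap.back();
    heap.pop_back();
    return top;
}
```

Both helpers only ever touch front(), push_back() and pop_back() plus random access, which matches the requirements listed in the question.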
As the container grows, a reallocation for a vector requires copying all the elements into the new block of memory. Growing a deque allocates a new block and links it to the list of blocks - no copies are required.
Of course you can specify that a different backing container be used if you like. So if you have a stack that you know is not going to grow much, tell it to use a vector instead of a deque if that's your preference.
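As a small sketch of how that looks, the second template argument of std::stack selects the backing container. The alias VectorStack and the function sum_and_drain are illustrative names only.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// A stack backed by std::vector instead of the default std::deque.
using VectorStack = std::stack<int, std::vector<int>>;

// Pops every element and returns the sum, using only the stack interface.
int sum_and_drain(VectorStack& s) {
    int total = 0;
    while (!s.empty()) {
        total += s.top();
        s.pop();
    }
    return total;
}
```

The calling code is identical for either backing container, so switching later is a one-line change to the alias.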
See Herb Sutter's Guru of the Week 54 for the relative merits of vector and deque where either would do.
I imagine the inconsistency between priority_queue and queue is simply that different people implemented them.
Related
Assume we have a vector std::vector<int> v and that some resources are allocated to it. To my knowledge, v.clear() followed by v.shrink_to_fit() releases all resources allocated to v. I am wondering whether similar operations exist for std::map and std::unordered_map to release all resources manually. I can only find a member function clear() for these two templates. Can someone explain why there is no shrink_to_fit() for the latter two templates?
There is no shrink_to_fit() in std::map, because it would be useless.
In std::vector, to guarantee the amortized constant insertion time required by the standard, the implementation may allocate more memory than is currently needed (so that it doesn't have to reallocate on every push_back()). Most implementations grow the capacity by a constant factor, commonly 1.5 or 2, whenever the current capacity() would be exceeded.
shrink_to_fit() asks to release that extra memory to make size() == capacity() (but it's not guaranteed that this will actually happen).
Now, std::map is usually implemented as a red-black tree. Adding an element to such a structure just means creating a new node and a bit of pointer magic. It does not involve reallocating other nodes, and you cannot speed it up by pre-allocating memory. shrink_to_fit() doesn't make sense, because there is nothing to shrink.
Update after dewaffled's comment: For std::unordered_map there's a rehash() method, which may decrease the size of the hash table by rebuilding it, but as with shrink_to_fit() the result is not guaranteed.
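A rough sketch of both release requests follows. The function names are invented for this example, and the values returned are implementation-defined because both requests are non-binding.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: clear the vector, ask it to drop spare capacity,
// and report what is left. The result may legitimately be non-zero.
std::size_t capacity_after_release(std::vector<int>& v) {
    v.clear();
    v.shrink_to_fit();
    return v.capacity();
}

// Hypothetical helper: clear the map, ask for the smallest bucket array
// the implementation is willing to use, and report the bucket count.
std::size_t buckets_after_release(std::unordered_map<int, int>& m) {
    m.clear();
    m.rehash(0);
    return m.bucket_count();
}
```

The only thing that can be relied on for the vector is that shrink_to_fit() never increases capacity(); whether memory is actually returned is up to the implementation.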
I'm going to assume you're talking about dynamic maps; shrink_to_fit() wouldn't make sense because the map is only as big as its linked elements. My understanding is that a map has no 'empty' nodes, the way a vector can have unused slots.
In C++ Primer 5th, it says that the default implementation of stack and queue is deque.
I'm wondering why they don't use list. Stack and queue don't support random access and always operate at the ends, so list seems the most intuitive way to implement them, and deque, which supports constant-time random access, seems unnecessary.
Could anyone explain the reason behind this implementation?
With std::list as the underlying container, each std::stack::push does a memory allocation, whereas std::deque allocates memory in chunks and can reuse its spare capacity to avoid the allocation.
With small elements the storage overhead of list nodes can also become significant. E.g. std::list<int> node size is 24 bytes (on a 64-bit system), with only 4 bytes occupied by the element - at least 83% storage overhead.
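The arithmetic behind that figure can be sketched with a hand-written node type. IntListNode is a hypothetical layout; real std::list nodes are implementation-defined, so this only mirrors the calculation, assuming 8-byte pointers and a 4-byte int.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical doubly linked list node holding an int.
// With 8-byte pointers this is 8 + 8 + 4 bytes plus 4 bytes of padding.
struct IntListNode {
    IntListNode* prev;
    IntListNode* next;
    int value;
};

// Fraction of each node that is not payload.
double node_overhead_fraction() {
    return 1.0 - static_cast<double>(sizeof(int)) / sizeof(IntListNode);
}
```

Under those assumptions sizeof(IntListNode) is 24, so 20 of 24 bytes are overhead, which is the 83% quoted above. Allocator bookkeeping per node comes on top of that.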
I think the question should be asked the other way around: Why use a list if you can use an array?
The list is more complicated: More allocations, more resources (for storing pointers) and more work to do (even if it is all in constant time). On the other hand, the main property for preferring lists is also not relevant to stacks and queues: constant-time random insertion and deletion.
The main reason is that deque is faster than list on average for insertions and deletions at the front and back.
deque is faster because its memory is allocated in chunks, whereas list needs an allocation for each element, and allocations are costly operations.
Let's compare the sequence containers:
std::array is right out, it doesn't change size.
std::list optimises for iterator non-invalidation, allows insertion at known positions, and lacks random access. It has O(N) space overhead, with a large constant, and it has bad cache locality.
std::forward_list is an even more crippled list with smaller space overhead
std::deque optimises for appending or prepending, at the expense of not being contiguous. It has O(N) space overhead, with a smaller constant, and it has mediocre cache locality.
std::vector optimises for access speed, at the expense of insertion / removal anywhere but the end. It has O(1) space overhead, and great cache locality.
So what does this mean for stack and queue?
std::stack only requires operations at one end. std::vector, std::deque and std::list all provide the necessary operations
std::queue requires operations at both ends. std::deque and std::list are the only candidates.
The choice of std::deque as the default is then one of consistency, as std::vector is generally better for std::stack, but inapplicable for std::queue.
Note that std::priority_queue, whilst named similar to std::queue, is actually more akin to std::stack in requiring only modification at one end. It also benefits more from the raw access speed of std::vector, maintaining the heap invariant.
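To illustrate which containers fit which adaptor, here is a short sketch. The aliases ListQueue and DequeHeap and the two helper functions are names chosen for this example only.

```cpp
#include <cassert>
#include <deque>
#include <list>
#include <queue>

// std::queue needs front(), back(), push_back() and pop_front(),
// so std::list qualifies while std::vector does not.
using ListQueue = std::queue<int, std::list<int>>;

// std::priority_queue needs random access iterators plus front(),
// push_back() and pop_back(), so std::deque works as well as std::vector.
using DequeHeap = std::priority_queue<int, std::deque<int>>;

// Removes and returns the oldest element.
int first_out(ListQueue& q) {
    int v = q.front();
    q.pop();
    return v;
}

// Removes and returns the largest element.
int largest_out(DequeHeap& h) {
    int v = h.top();
    h.pop();
    return v;
}
```

Trying std::queue<int, std::vector<int>> compiles until pop() is called, at which point the missing pop_front() becomes a compile error.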
Within a function, I have created a vector with a generous amount of space, to which I push a runtime-determined number of objects (Edge). Other objects, however, maintain pointers to the Edges within the vector. Occasionally the entire program segfaults because a pointer becomes invalid, and I suspect that this happens when the vector reaches capacity and reallocates, thereby invalidating the memory addresses.
Is there any way around this? Or perhaps is there another solution to grouping together heap allocations?
Note that the primary motivation for this is to minimize heap allocations, since that is what is slowing down my algorithm. Initially I had vector<Edge *> and every element added was individually allocated. Batch allocation increased the speed dramatically, but the vector method described here invalidates pointers.
Your code example, as requested:
This is the vector I declare as a stack var:
vector<Edge> edgeListTemp(1000);
I then add to it as such, using an rvalue overload:
edgeListTemp.push_back(Edge{edge->movie, first, second});
Node objects keep pointers to these:
first->edges.push_back(&edgeListTemp.back());
second->edges.push_back(&edgeListTemp.back());
Where edges is declared as follows:
std::vector<Edge *> edges; /**< Adjacency list */
There are several possible solutions:
if you already know the maximum number of elements in advance, call reserve on the vector at the start; elements won't be reallocated until you exceed that capacity;
if you don't know the maximum number of elements/don't want to preallocate the maximum size for performance reasons but you only add/remove elements from the end (or from the start) of the vector, use an std::deque instead. std::deque guarantees that pointers to elements aren't invalidated as long as you only push/pop from front/back;
std::list guarantees to never invalidate references to elements, but it introduces several serious performance penalties (no O(1) addressing, one allocation for each node);
if you want to ignore the problem completely, add a layer of indirection and store in the vector pointers to elements allocated on the heap; even better, make a vector of std::shared_ptr and always use it to keep references to the elements; this obviously has the disadvantage of needing one allocation for each element, which may or may not be acceptable, depending on your use case.
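A minimal sketch of the reserve option, under the assumption that the element count is known up front. Note that vector<Edge> edgeListTemp(1000) in the question constructs 1000 default Edge objects rather than reserving capacity, so the first push_back already exceeds the initial allocation. The Edge struct and the function name below are stand-ins, not the asker's real types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Edge type.
struct Edge {
    int first;
    int second;
};

// Fills `edges` with `count` elements and returns a pointer to each one.
// The pointers stay valid only because reserve() runs before any
// push_back, so no reallocation happens while size() <= count.
std::vector<Edge*> fill_with_stable_pointers(std::vector<Edge>& edges,
                                             std::size_t count) {
    edges.clear();
    edges.reserve(count);
    std::vector<Edge*> pointers;
    pointers.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        edges.push_back(Edge{static_cast<int>(i), static_cast<int>(i) + 1});
        pointers.push_back(&edges.back());
    }
    return pointers;
}
```

Pushing one more element than was reserved may reallocate and invalidate every stored pointer, so this only works when the bound is a hard one.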
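A std::deque does not move existing elements when you add or remove at either end, so pointers and references to elements stay valid as long as you don't erase the referenced element or insert in the middle. Iterators are a different matter: any insertion, including push_back and push_front, invalidates them.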
Like std::vector, std::deque offers random access iterators. Random access into a deque is a little slower than std::vector, but still O(1). If you need stable references, the slight slow-down is probably acceptable.
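Here is a small sketch that checks the address stability of deque elements under push_back. Both function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Pushes `count` ints onto the deque and records each element's address
// right after it is inserted.
std::vector<const int*> push_and_record(std::deque<int>& d, std::size_t count) {
    std::vector<const int*> addresses;
    for (std::size_t i = 0; i < count; ++i) {
        d.push_back(static_cast<int>(i));
        addresses.push_back(&d.back());
    }
    return addresses;
}

// True if every recorded address still refers to the matching element.
bool addresses_still_valid(const std::deque<int>& d,
                           const std::vector<const int*>& addresses) {
    for (std::size_t i = 0; i < addresses.size(); ++i) {
        if (addresses[i] != &d[i]) return false;
    }
    return true;
}
```

The same check against a std::vector without reserve would typically fail once the vector has reallocated.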
Alternatively, instead of the pointer to the element, you could keep a reference to the vector and an index into the vector.
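One way to sketch the index approach is a small handle type. EdgeRef is a hypothetical name, and int stands in for the real element type to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical handle: identifies an element by container and position,
// so it survives reallocation of the vector's storage.
struct EdgeRef {
    std::vector<int>* owner;
    std::size_t index;

    // Looks the element up on every access instead of caching its address.
    int& get() const {
        assert(index < owner->size());
        return (*owner)[index];
    }
};
```

The handle stays valid across push_back, but not across erasing or inserting elements before its index, and the vector object itself must outlive it.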
I am trying to figure out the fastest way to keep a constant number of elements in a vector (or maybe there is some ready-made structure that does it automatically).
In my app I am adding many elements to the vector and I need to do it fast. Because the vector resizes itself, at some point this significantly decreases overall application speed. What I was thinking about is something like this:
if(my_vector.size() < 300)
my_vector.push_back(new_element);
else
{
my_vector.pop_front();
my_vector.push_back(new_element);
}
but after the first few tests I realized that it might not be the best solution, because I am not sure whether pop_front() followed by push_back() still needs to resize at some point.
Is there any other solution for this?
Use a std::queue. Its default underlying container is a std::deque, but like a stack, a queue's interface is restricted to FIFO operations (push() and pop(), which call push_back and pop_front on the underlying container), which is exactly what you're doing. Here's why a deque is better for this situation:
The storage of a deque is automatically expanded and contracted as needed. Expansion of a deque is cheaper than the expansion of a std::vector because it does not involve copying of the existing elements to a new memory location.
The complexity (efficiency) of common operations on deques is as follows:
Random access - constant O(1)
Insertion or removal of elements at the end or beginning - constant O(1)
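Applied to the code in the question, a deque-based version might look like the sketch below. push_bounded is a made-up name, and max_size plays the role of the hard-coded 300.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Appends `value` and drops the oldest element once the window is full.
void push_bounded(std::deque<int>& window, int value, std::size_t max_size) {
    assert(max_size > 0);
    if (window.size() >= max_size) {
        window.pop_front();
    }
    window.push_back(value);
}
```

Unlike std::vector, std::deque really does have pop_front(), so the logic from the question carries over unchanged. The same works with std::queue using push() and pop() if you don't need indexed access.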
To implement a fixed-size container with push_back and pop_front and minimal memory shuffling, use a std::array of the appropriate size. To keep track of things you'll need a front index for pushing elements and a back index for popping things. To push, store the element at the location given by front_index, then increment front_index and take the remainder modulo the container size. To pop, read the element at the location given by back_index, and adjust that index the same way you did front_index. With that in place, the code in the question will do what you need.
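A minimal sketch of that ring buffer follows. The class name RingBuffer, the element counter, and the choice to overwrite the oldest element when full are assumptions made for this example; the index naming follows the description above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity FIFO on top of std::array.
// front_index_ is where the next push lands, back_index_ is the next pop.
template <typename T, std::size_t N>
class RingBuffer {
public:
    bool empty() const { return count_ == 0; }
    bool full() const { return count_ == N; }
    std::size_t size() const { return count_; }

    // Stores value; if the buffer is full, the oldest element is dropped first.
    void push_back(const T& value) {
        if (full()) pop_front();
        data_[front_index_] = value;
        front_index_ = (front_index_ + 1) % N;
        ++count_;
    }

    // Removes and returns the oldest element; the buffer must not be empty.
    T pop_front() {
        assert(!empty());
        T value = data_[back_index_];
        back_index_ = (back_index_ + 1) % N;
        --count_;
        return value;
    }

private:
    std::array<T, N> data_{};
    std::size_t front_index_ = 0;
    std::size_t back_index_ = 0;
    std::size_t count_ = 0;
};
```

The separate counter is there to tell a full buffer from an empty one, since both indices are equal in either state. No memory is allocated after construction.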
You just need to reserve the capacity to a reasonable number. The vector will not automatically shrink. So it only will grow and, possibly, stop at some point.
You might also be interested in resize policies. For example, Facebook did substantial research and created its own vector implementation, folly::fbvector, which is reported to perform better than std::vector.
Should I use deque instead of vector if I'd like to push elements at the beginning of the container as well? When should I use list, and what's the point of it?
Use deque if you need efficient insertion/removal at the beginning and end of the sequence and random access; use list if you need efficient insertion anywhere, at the sacrifice of random access. Iterators and references to list elements are very stable under almost any mutation of the container, while deque has very peculiar iterator and reference invalidation rules (so check them out carefully).
Also, list is a node-based container, while a deque uses chunks of contiguous memory, so memory locality may have performance effects that cannot be captured by asymptotic complexity estimates.
deque can serve as a replacement for vector almost everywhere and should probably have been considered the "default" container in C++ (on account of its more flexible memory requirements); the only reason to prefer vector is when you must have a guaranteed contiguous memory layout of your sequence.
deque and vector provide random access, list provides only linear accesses. So if you need to be able to do container[i], that rules out list. On the other hand, you can insert and remove items anywhere in a list efficiently, and operations in the middle of vector and deque are slow.
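A short sketch of the two strengths mentioned above: pushing at both ends of a deque, and inserting in the middle of a list while keeping an existing iterator usable. Both function names are illustrative.

```cpp
#include <cassert>
#include <deque>
#include <iterator>
#include <list>

// Builds {0, 1, 2} by pushing at both ends of a deque.
std::deque<int> build_from_both_ends() {
    std::deque<int> d;
    d.push_back(1);
    d.push_front(0);
    d.push_back(2);
    return d;
}

// Inserts `value` before `pos`. Iterators to existing list elements,
// including `pos` itself, remain valid afterwards.
void insert_before(std::list<int>& l, std::list<int>::iterator pos, int value) {
    l.insert(pos, value);
}
```

The list insertion is constant time once you hold the iterator; finding the position in the first place is still a linear walk.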
deque and vector are very similar, and are basically interchangeable for most purposes. There are only two differences worth mentioning. First, vector can only add new items efficiently at the end, while deque can add items efficiently at either end. So why would you ever use a vector? That is the second difference: unlike deque, vector guarantees that all items are stored in contiguous memory locations, which makes iterating through them faster in some situations.
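The contiguity guarantee also matters when handing data to code that expects a plain array. In this sketch, sum_raw is a hypothetical stand-in for such a C-style API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style function that expects a contiguous array.
long sum_raw(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// std::vector exposes its contiguous storage through data().
long sum_vector(const std::vector<int>& v) {
    return sum_raw(v.data(), v.size());
}
```

std::deque has no data() member, so its elements would have to be copied into a contiguous buffer before calling a function like this.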