STL Containers - difference between vector, list and deque - c++

Should I use deque instead of vector if I'd like to push elements at the beginning of the container as well? When should I use list, and what's the point of it?

Use deque if you need efficient insertion/removal at the beginning and end of the sequence and random access; use list if you need efficient insertion anywhere, at the sacrifice of random access. Iterators and references to list elements are very stable under almost any mutation of the container, while deque has very peculiar iterator and reference invalidation rules (so check them out carefully).
Also, list is a node-based container, while a deque uses chunks of contiguous memory, so memory locality may have performance effects that cannot be captured by asymptotic complexity estimates.
deque can serve as a replacement for vector almost everywhere and should probably have been considered the "default" container in C++ (on account of its more flexible memory requirements); the only reason to prefer vector is when you must have a guaranteed contiguous memory layout of your sequence.

deque and vector provide random access; list provides only linear access. So if you need to be able to do container[i], that rules out list. On the other hand, you can insert and remove items anywhere in a list efficiently, whereas operations in the middle of a vector or deque are slow.
deque and vector are very similar and are basically interchangeable for most purposes. There are only two differences worth mentioning. First, vector can only efficiently add new items at the end, while deque can add items at either end efficiently. So why would you ever use a vector? Because of the second difference: unlike deque, vector guarantees that all items are stored in contiguous memory locations, which makes iterating through them faster in some situations.
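A minimal sketch of the front-insertion difference described above. The helper names fill_front_deque and fill_front_vector are made up for illustration; both build the same sequence, but the deque does O(1) work per insertion while the vector has to shift every existing element each time.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Builds n-1, ..., 1, 0 by repeatedly adding at the front of a deque.
std::deque<int> fill_front_deque(int n) {
    std::deque<int> d;
    for (int i = 0; i < n; ++i) {
        d.push_front(i);  // constant time per call
    }
    return d;
}

// Same result with a vector; vector has no push_front, so insert at begin().
std::vector<int> fill_front_vector(int n) {
    std::vector<int> v;
    for (int i = 0; i < n; ++i) {
        v.insert(v.begin(), i);  // linear time per call: shifts all elements
    }
    return v;
}
```

Both containers still support container[i] afterwards, so the choice here is mostly about how often you insert at the front.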

Related

Get pointer to raw data in set like &(vector[0])

To get the pointer to the data in a vector we can use
vector<double> Vec;
double* Array_Pointer = &(Vec[0]);
Function(Array_Pointer);
Is it possible to get a pointer to the data in a set? Can I use it as an array pointer like above?
If that is not possible, what is the best way to make a vector out of a set, without looping over all the elements myself?
No, this is not necessarily possible. The C++ ISO standard explicitly guarantees contiguous storage of elements in a std::vector, so you can safely take the address of the first element and then use that pointer as if you were pointing at a raw array. Other containers in the standard library do not have this guarantee.
The reason for this is that, to efficiently support most operations on a std::set, the implementation needs to use complex data structures like balanced binary search trees to store and organize the data. These structures are inherently nonlinear and require nodes to be allocated and linked together. Efficiently getting this to work with the elements in a flat array would be difficult, if not impossible, within the time constraints laid out by the standard (O(log n) for most operations).
EDIT: In response to your question - there is no way to build a std::vector from a std::set without some code somewhere iterating over the set and copying the elements over. You can do this without explicitly using any loops yourself by using the std::vector range constructor:
std::vector<T> vec(mySet.begin(), mySet.end());
Hope this helps!
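As a rough sketch of how the range constructor fits the original use case: copy the set into a vector, then hand the vector's buffer to the C-style function. Here sum_array is a hypothetical stand-in for the asker's Function, and sum_of_set is an illustrative wrapper, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical C-style function that expects a raw array and a length.
double sum_array(const double* data, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

double sum_of_set(const std::set<double>& s) {
    // The range constructor does the iteration and copying for us.
    std::vector<double> v(s.begin(), s.end());
    // data() is valid even for an empty vector, unlike &v[0].
    return sum_array(v.data(), v.size());
}
```

The vector is a copy, so later changes to the set are not reflected in it.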
No. It's not possible to implement set in such a way that you can do this.
If you implement set in such a way that elements are stored in a single array, then when you add more elements, that array will inevitably need to be reallocated at some point. At that time, any references to existing elements will be invalidated.
One of features of set is that it guarantees that references to elements will never be invalidated if you add (or remove) other elements. As stated in [associative.reqmts]:
The insert and emplace members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.
So it's impossible to implement set in such a way that all of the elements of the set are stored in a single array.
Note that this has nothing to do with the efficiency requirements such as O(log n) insert/delete/lookup (if you squint really hard and allow for amortized O(log n) insertion time, at least), or maintaining sorted order, or anything like that. If it was just these, they could easily be handled with a data structure on top of the underlying elements, and the elements themselves could be stored in an array. It also doesn't even have anything to do with guarantees about iterator invalidation, since iterators are abstract.
No, the only thing holding you back is the reference invalidation requirement.
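A small sketch of the reference-stability guarantee quoted above: a pointer taken to a set element before many insertions still points at the same element afterwards. The function name is made up for illustration.

```cpp
#include <cassert>
#include <set>

bool set_reference_stays_valid() {
    std::set<int> s;
    s.insert(10);
    const int* p = &*s.begin();  // address of the element 10

    // Many insertions; a single-array implementation would have to reallocate.
    for (int i = 0; i < 1000; ++i) {
        s.insert(100 + i);
    }

    // The element has not moved, so the old pointer is still good.
    return p == &*s.find(10) && *p == 10;
}
```

Doing the same with a std::vector and push_back would leave the pointer dangling as soon as the vector reallocates.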

Vector vs Deque insertion in middle

I know that deque is more efficient than vector when insertions are at the front or the end, and that vector is better if we have to do pointer arithmetic. But which one should we use when we have to perform insertions in the middle, and why?
You might think that a deque would have the advantage, because it stores the data broken up into blocks. However to implement operator[] in constant time requires all those blocks to be the same size. Inserting or deleting an element in the middle still requires shifting all the values on one side or the other, same as a vector. Since the vector is simpler and has better locality for caching, it should come out ahead.
With standard library containers, you select a container depending upon:
The type of data you want to store, and
The type of operations you want to perform on the data.
If you want to perform a large number of insertions in the middle, you are much better off using a std::list.
If the choice is just between a std::deque and std::vector then there are a number of factors to consider:
Typically, there is one more indirection in case of deque to access the elements, so element access and iterator movement of deques are usually a bit slower.
In systems that have size limitations for blocks of memory, a deque might contain more elements because it uses more than one block of memory. Thus, max_size() might be larger for deques.
Deques provide no support to control the capacity and the moment of reallocation. In particular, any insertion or deletion of elements other than at the beginning or end invalidates all pointers, references, and iterators that refer to elements of the deque. However, reallocation may perform better than for vectors, because according to their typical internal structure, deques don't have to copy all elements on reallocation.
Blocks of memory might get freed when they are no longer used, so the memory size of a deque might shrink (this is not a condition imposed by the standard, but most implementations do it).
std::deque could perform better for large containers because it is typically implemented as a linked sequence of contiguous data blocks, as opposed to the single block used in a std::vector. So an insertion in the middle would result in less data being copied from one place to another, and potentially fewer reallocations.
Of course, whether that matters or not depends on the size of the containers and the cost of copying the elements stored. With C++11 move semantics, the cost of the latter is less important. But in the end, the only way of knowing is profiling with a realistic application.
Deque would still be more efficient, as it doesn't have to move half of the array every time you insert an element.
Of course, this will only really matter if you consider large numbers of elements, and even then it is advisable to run a benchmark and see which one works better in your particular case. Remember that premature optimization is the root of all evil.
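Since the answers above disagree and all end with "benchmark it", here is a rough sketch of what such a measurement could look like. The names insert_in_middle and time_middle_inserts_ms are made up for illustration, and the timings depend heavily on element type, size, compiler and standard library, so treat any result as specific to your setup.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <iterator>
#include <vector>

// Inserts 0..n-1, each time at the current midpoint of the sequence.
template <class Seq>
Seq insert_in_middle(int n) {
    Seq s;
    for (int i = 0; i < n; ++i) {
        auto mid = std::next(
            s.begin(),
            static_cast<typename Seq::difference_type>(s.size() / 2));
        s.insert(mid, i);
    }
    return s;
}

// Wall-clock time in milliseconds for n midpoint insertions.
template <class Seq>
double time_middle_inserts_ms(int n) {
    const auto start = std::chrono::steady_clock::now();
    const Seq s = insert_in_middle<Seq>(n);
    const auto stop = std::chrono::steady_clock::now();
    (void)s;  // keep the result alive until after the second timestamp
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_middle_inserts_ms<std::vector<int>>(n) and time_middle_inserts_ms<std::deque<int>>(n) for a realistic n, with optimizations enabled, gives a first impression; a real benchmark would repeat the runs and use your actual element type.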

vector implemented with many blocks and no resize copy

I'm wondering if it would be possible to implement an stl-like vector where the storage is done in blocks, and rather than allocate a larger block and copy from the original block, you could keep different blocks in different places, and overload the operator[] and the iterator's operator++ so that the user of the vector wasn't aware that the blocks weren't contiguous.
This could save a copy when moving beyond the existing capacity.
You would be looking for std::deque
See GotW #54 Using Vector and Deque
In Most Cases, Prefer Using deque (Controversial)
Contains benchmarks to demonstrate the behaviours
The latest C++11 standard says:
§ 23.2.3 Sequence containers
[2] The sequence containers offer the programmer different complexity trade-offs and should be used accordingly. vector or array is the type of sequence container that should be used by default. list or forward_list should be used when there are frequent insertions and deletions from the middle of the sequence. deque is the data structure of choice when most insertions and deletions take place at the beginning or at the end of the sequence.
FAQ > Prelude's Corner > Vector or Deque? (intermediate) says:
A vector can only add items to the end efficiently, any attempt to insert an item in the middle of the vector or at the beginning can be and often is very inefficient. A deque can insert items at both the beginning and the end in constant time, O(1), which is very good. Insertions in the middle are still inefficient, but if such functionality is needed a list should be used. A deque's method for inserting at the front is push_front(), the insert() method can also be used, but push_front is more clear.
Just like insertions, erasures at the front of a vector are inefficient, but a deque offers constant time erasure from the front as well.
A deque uses memory more effectively. Consider memory fragmentation, a vector requires N consecutive blocks of memory to hold its items where N is the number of items and a block is the size of a single item. This can be a problem if the vector needs 5 or 10 megabytes of memory, but the available memory is fragmented to the point where there are not 5 or 10 megabytes of consecutive memory. A deque does not have this problem, if there isn't enough consecutive memory, the deque will use a series of smaller blocks.
[...]
Yes it's possible.
Do you know rope? It's what you describe, for strings (big string == rope, got the joke?). Rope is not part of the standard, but for practical purposes it's available on modern compilers. You could use it to represent the complete content of a text editor.
Take a look here: STL Rope - when and where to use
And always remember:
the first rule of (performance) optimizations is: don't do it
the second rule (for experts only): don't do it now.
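To make the idea in the question concrete, here is a toy sketch of block-based storage. ChunkedVector is a made-up name, the block size of 4 is arbitrary, and it assumes T is default-constructible and copy-assignable. It is only meant to show the index arithmetic; it is not how any particular std::deque is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

template <class T, std::size_t BlockSize = 4>
class ChunkedVector {
public:
    void push_back(const T& value) {
        // When every existing block is full, add a new block.
        // Existing blocks stay where they are; only the small table of
        // block pointers may be reallocated.
        if (size_ == blocks_.size() * BlockSize) {
            blocks_.push_back(std::unique_ptr<T[]>(new T[BlockSize]()));
        }
        blocks_[size_ / BlockSize][size_ % BlockSize] = value;
        ++size_;
    }

    // Constant-time access works because all blocks have the same size.
    T& operator[](std::size_t i) {
        return blocks_[i / BlockSize][i % BlockSize];
    }

    std::size_t size() const { return size_; }

private:
    std::vector<std::unique_ptr<T[]>> blocks_;
    std::size_t size_ = 0;
};
```

Growing past the current capacity never copies the stored elements, which is the saving the question asks about; the price is the extra indirection on every access and the loss of contiguity.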

Are vectors a special case of linked lists?

When talking about the STL, I have several schoolmates telling me that "vectors are linked lists".
I have another one arguing that if you call the erase() method with an iterator, it breaks the vector, since it's a linked list.
They also tend not to understand why I keep arguing that vectors are contiguous, just like any other array, and they don't seem to understand what random access means. Are vectors strictly contiguous just like regular arrays, or just at most contiguous (for example, allocating several contiguous segments if the whole array doesn't fit)?
I'm sorry to say that your schoolmates are completely wrong. If your schoolmates can honestly say that "vectors are linked lists" then you need to respectfully tell them that they need to pick up a good C++ book (or any decent computer science book) and read it. Or perhaps even the Wikipedia articles for vectors and lists. (Also see the articles for dynamic arrays and linked lists.)
Vectors (as in std::vector) are not linked lists. (Note that std::vector does not derive from std::list.) While they both can store a collection of data, how a vector does it is completely different from how a linked list does it. Therefore, they have different performance characteristics in different situations.
For example, insertion is a constant-time operation on a linked list (given an iterator to the position), while it is a linear-time operation on a vector if the element is inserted anywhere other than the end. (However, it is amortized constant time if you insert at the end of a vector.)
The std::vector class in C++ is required to be contiguous by the C++ standard:
23.2.4/1 Class template vector
A vector is a kind of sequence that supports random access iterators. In addition, it supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time. Storage management is handled automatically, though hints can be given to improve efficiency. The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().
Compare that to std::list:
23.2.2/1 Class template list
A list is a kind of sequence that supports bidirectional iterators and allows constant time insert and erase operations anywhere within the sequence, with storage management handled automatically. Unlike vectors (23.2.4) and deques (23.2.1), fast random access to list elements is not supported, but many algorithms only need sequential access anyway.
Clearly, the C++ standard stipulates that a vector and a list are two different containers that do things differently.
You can't "break" a vector (at least not intentionally) by simply calling erase() with a valid iterator. That would make std::vectors rather useless since the point of its existence is to manage memory for you!
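If it helps to convince the schoolmates, both points can be checked directly. The sketch below tests the &v[n] == &v[0] + n identity from the quoted standard text and erases through a valid iterator; is_contiguous and erase_second are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Checks the identity &v[n] == &v[0] + n for every valid index.
bool is_contiguous(const std::vector<int>& v) {
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (&v[n] != &v[0] + n) {
            return false;
        }
    }
    return true;
}

// Erases the second element through a valid iterator and returns the result.
std::vector<int> erase_second(std::vector<int> v) {
    if (v.size() >= 2) {
        v.erase(v.begin() + 1);  // later elements shift down by one
    }
    return v;
}
```

After the erase the vector is one element shorter and still contiguous; nothing is "broken".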
vector will hold all of its storage in a single place. A vector is not even remotely like a linked list. In fact, if I had to pick two data structures that were most unlike each other, it would be vector and list. "At most contiguous" is how a deque operates.
Vector:
Guaranteed contiguous storage for all elements - will copy or move elements.
O(1) access time.
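O(n) for insert or remove (amortized O(1) at the end).
Iterators at or after the point of insertion or removal are invalidated; all iterators are invalidated if an insertion causes reallocation.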
List:
No contiguous storage at all - never copies or moves elements.
O(n) access time, plus all the nasty cache misses you're gonna get.
O(1) insert or remove.
Iterators valid as long as that specific element is not removed.
As you can see, they behave differently in every data structure use case.
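A short sketch of the iterator-validity difference in the two lists above. The function names are made up; the vector half deliberately keeps an index instead of an iterator, since that is one common way to survive reallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// list: an iterator stays valid while other elements come and go.
bool list_iterator_survives_changes() {
    std::list<int> l;
    for (int i = 1; i <= 4; ++i) l.push_back(i);  // 1 2 3 4
    std::list<int>::iterator it = std::next(l.begin(), 2);  // points at 3
    l.erase(l.begin());  // remove 1
    l.push_back(5);      // append 5
    l.insert(it, 99);    // insert 99 just before 3
    return *it == 3 && l.size() == 5;
}

// vector: growing past capacity moves the buffer, so hold an index instead.
bool vector_index_survives_growth() {
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    const std::size_t index_of_three = 2;
    const std::size_t old_capacity = v.capacity();
    while (v.size() <= old_capacity) {
        v.push_back(0);  // the last push forces a reallocation
    }
    return v.capacity() > old_capacity && v[index_of_three] == 3;
}
```

An iterator or pointer into the vector taken before the loop would be invalid after it, which is exactly the behavior a list avoids.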
By definition, vectors are contiguous blocks of memory like C arrays. See: http://en.wikipedia.org/wiki/Vector_(C%2B%2B)
Vectors allow random access; that is, an element of a vector may be referenced in the same manner as elements of arrays (by array indices). Linked-lists and sets, on the other hand, do not support random access or pointer arithmetic.
Vectors are not linked lists; they provide random access and are contiguous just like arrays. In order to achieve this they re-allocate memory under the hood.
List is designed to allow quick insertions and deletions, while not invalidating any references or iterators except the ones to the deleted element.

C++ deque vs vector and C++ map vs Set

Can someone please tell me what the difference is between vector and deque? I know the implementation of vector in C++ but not deque. Also, the interfaces of map and set seem similar to me. What is the difference between the two, and when should I use each one?
std::vector: A dynamic-array class. The internal memory allocation makes sure that it always creates an array. Useful when the size of the data is known and is known to not change too often. It is also good when you want to have random access to elements.
std::deque: A double-ended queue that can act as a stack or queue. Good for when you are not sure about the number of elements and when the elements are mostly accessed serially. It is fast when elements are added or removed at the front or the end, but not when they are added or removed in the middle.
std::list: A doubly linked list that can be used to create a 'list' of data. The advantage of a list is that elements can be inserted or deleted from any part of the list without affecting an iterator that is pointing to a list member (and is still a member of the list after deletion). Useful when you know that elements will be deleted very often from any part of the list.
std::map: A dictionary that maps a 'key' to a 'value'. Useful for applications like 'arrays' whose indices are not integers. For example, it can map a name to an element, like a map that stores a name-to-widget relationship.
std::set: A collection of 'unique' data values. For example, if you insert 1, 2, 2, 1, 3, the set will only have the elements 1, 2, 3. Note that the elements in a set are always ordered. Internally, sets are usually implemented as binary search trees (like map).
See here for full details:
What are the complexity guarantees of the standard containers?
vector Vs deque
A deque is the same as a vector but with the following addition:
It is a "front insertion sequence"
This means that deque is the same as a vector but provides the following additional guarantees:
push_front() O(1)
pop_front() O(1)
set Vs map
A map is a "Pair Associative Container" while set is a "Simple Associative Container"
This means their interfaces are almost the same. The difference is that map holds pairs of items (key/value) rather than just a value.
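A quick sketch of the set/map difference: a set just keeps unique, ordered values, while a map attaches a value to each unique key. count_unique and count_words are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// set: duplicates collapse, so the size is the number of distinct values.
std::size_t count_unique(const std::vector<int>& values) {
    std::set<int> s(values.begin(), values.end());
    return s.size();
}

// map: each distinct word (key) maps to how often it occurred (value).
std::map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i < words.size(); ++i) {
        ++counts[words[i]];  // operator[] creates the entry with 0 if missing
    }
    return counts;
}
```

Inserting 1, 2, 2, 1, 3 into the set leaves three elements, matching the description of std::set earlier in this thread.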
std::vector
Your default sequential container should be a std::vector. Generally, std::vector will provide you with the right balance of performance and speed. The std::vector container is similar to a C-style array that can grow or shrink during runtime. The underlying buffer is stored contiguously and is guaranteed to be compatible with C-style arrays.
Consider using a std::vector if:
You need your data to be stored contiguously in memory
Especially useful for C-style API compatibility
You do not know the size at compile time
You need efficient random access to your elements (O(1))
You will be adding and removing elements from the end
You want to iterate over the elements in any order
Avoid using a std::vector if:
You will frequently add or remove elements to the front or middle of the sequence
The size of your buffer is constant and known in advance (prefer std::array)
Be aware of the std::vector<bool> specialization: since C++98, std::vector<bool> has been specialized so that each element may occupy only one bit. When you access an individual element, operator[] does not return a bool&; it returns a proxy object that reads and writes that bit.
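A small sketch of what the std::vector<bool> caveat means in practice. The static_asserts compare the type returned by operator[] for std::vector<bool> and std::vector<int>; flip_through_proxy is a made-up name showing that the returned object still writes through to the container.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// operator[] on vector<int> yields int&, but on vector<bool> it does not yield bool&.
static_assert(
    std::is_same<decltype(std::declval<std::vector<int>&>()[0]), int&>::value,
    "vector<int>::operator[] returns int&");
static_assert(
    !std::is_same<decltype(std::declval<std::vector<bool>&>()[0]), bool&>::value,
    "vector<bool>::operator[] returns a proxy, not bool&");

bool flip_through_proxy() {
    std::vector<bool> flags(3, false);
    std::vector<bool>::reference bit = flags[1];  // proxy object for one bit
    bit = true;                                   // writes into the vector
    return flags[1];
}
```

This is why code such as bool* p = &flags[0]; does not compile, and why std::vector<bool> cannot be passed to C-style APIs that expect an array of bool.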
std::array
The std::array container is the most like a built-in array, but offering extra features such as bounds checking and automatic memory management. Unlike std::vector, the size of std::array is fixed and cannot change during runtime.
Consider using a std::array if:
You need your data to be stored contiguously in memory
Especially useful for C-style API compatibility
The size of your array is known in advance
You need efficient random access to your elements (O(1))
You want to iterate over the elements in any order
Avoid using a std::array if:
You need to insert or remove elements
You don’t know the size of your array at compile time
You need to be able to resize your array dynamically
std::deque
The std::deque container gets its name from a shortening of “double ended queue”. The std::deque container is most efficient when appending items to the front or back of a queue. Unlike std::vector, std::deque does not provide a mechanism to reserve a buffer. The underlying buffer is also not guaranteed to be compatible with C-style array APIs.
Consider using std::deque if:
You need to insert new elements at both the front and back of a sequence (e.g. in a scheduler)
You need efficient random access to your elements (O(1))
You want the internal buffer to automatically shrink when elements are removed
You want to iterate over the elements in any order
Avoid using std::deque if:
You need to maintain compatibility with C-style APIs
You need to reserve memory ahead of time
You need to frequently insert or remove elements from the middle of the sequence
Calling insert in the middle of a std::deque invalidates all iterators and references to its elements
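To illustrate the invalidation rules hinted at above: insertions at either end of a std::deque invalidate iterators but leave references to existing elements valid, whereas an insertion in the middle invalidates both. The sketch below only exercises the safe half of that; the function name is made up.

```cpp
#include <cassert>
#include <deque>

bool deque_reference_survives_end_insertions() {
    std::deque<int> d;
    d.push_back(1);
    d.push_back(2);
    d.push_back(3);
    int& middle = d[1];  // reference to the element 2

    for (int i = 0; i < 1000; ++i) {
        d.push_front(i);  // references stay valid, iterators do not
        d.push_back(i);
    }

    // 1000 elements were added in front, so the old d[1] is now d[1001].
    return &middle == &d[1001] && middle == 2;
}
```

Had the loop called d.insert() somewhere in the middle instead, using middle afterwards would be undefined behavior.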
std::list
The std::list and std::forward_list containers implement linked list data structures. Where std::list provides a doubly-linked list, the std::forward_list only contains a pointer to the next object. Unlike the other sequential containers, the list types do not provide efficient random access to elements. Each element must be traversed in order.
Consider using std::list if:
You need to store many items but the number is unknown
You need to insert or remove new elements from any position in the sequence
You do not need efficient access to random elements
You want the ability to move elements or sets of elements within the container or between different containers
You want to implement a node-wise memory allocation scheme
Avoid using std::list if:
You need to maintain compatibility with C-style APIs
You need efficient access to random elements
Your system utilizes a cache (prefer std::vector for reduced cache misses)
The size of your data is known in advance and can be managed by a std::vector
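The point about moving elements between containers refers to splice, which relinks list nodes instead of copying elements. A minimal sketch, with a made-up function name:

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

bool splice_moves_nodes_without_copying() {
    std::list<std::string> a;
    a.push_back("x");
    a.push_back("y");
    std::list<std::string> b;
    b.push_back("p");
    b.push_back("q");

    const std::string* address_of_p = &b.front();

    // Transfers all nodes of b to the end of a; no element is copied or moved.
    a.splice(a.end(), b);

    // b is now empty and "p" lives in a at the very same address.
    return b.empty() && a.size() == 4 &&
           address_of_p == &*std::next(a.begin(), 2);
}
```

Neither std::vector nor std::deque offers an equivalent operation, since their elements live in shared blocks rather than individual nodes.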
A map is what is often referred to as an associative array, usually implemented using a binary tree (for example). A deque is a double-ended queue, a certain incarnation of a linked list.
That is not to say that the actual implementations of the container library use these concepts - the container library will just give you some guarantees about how you can access the container and at what (amortized) cost.
I suggest you take a look at a reference that will go into detail about what those guarantees are. Scott Meyers' book "Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library" should talk a bit about those, if I remember correctly. Apart from that, the C++ standard is obviously a good choice.
What I really want to say is: containers really are described by their properties, not by the underlying implementation.
set: holds unique values. Put 'a' in twice, the set has one 'a'.
map: maps keys to values, e.g. 'name' => 'fred', 'age' => 40. You can look up 'name' and you'll get 'fred' out.
deque: like a vector, but you can also add/remove efficiently at the front, not just at the back. Inserts into the middle are allowed but take linear time. http://en.wikipedia.org/wiki/Deque