Good sources to understand Containers - list

I'm studying containers and their different performances.
When , for a vector, I read something like
"inserting elements other than the back is slow"
but for a list:
"fast insert/delete at any point"
My questions are:
How a vector is different from a list in such a way that the two sentences above are true?
How its a vector built differently from a list?
Is because they store their elements in different memory parts?
I was searching some sources to better understand these concepts.

All examples will be linked to C++ language as I believe it is perfectly described there
Vectors
Vectors are the same as dynamic arrays with the ability to resize themselves
automatically when an element is inserted or deleted, with their
storage being handled automatically by the container. Vector elements
are placed in contiguous storage so that they can be accessed and
traversed using iterators. In vectors, data is inserted at the end.
Inserting at the end takes differential time, as sometimes there may
be a need of extending the array. Removing the last element takes only
constant time because no resizing happens. Inserting and erasing at
the beginning or in the middle is linear in time.
Vector is the dynamic array, but which can manage the memory allocated to itself. This means that you can create arrays whose length is set at runtime without using the new and delete operators (explicitly specifying the allocation and deallocation of memory).
Lists
Lists are sequence containers that allow non-contiguous memory
allocation. As compared to vector, the list has slow traversal, but once a
position has been found, insertion and deletion are quick. Normally,
when we say a List, we talk about a doubly linked list. For implementing
a singly linked list, we use a forward list.
There are 2 types of lists:
Double linked list where each element has an address of [next]
and [previous], to access first or last element you could you a specific function (e.g. in C++ front() or back() on the list - listName.front() will return you the 1st(0) element in your list), head.previous point to NULL and the tail.next point to NULL;
Single linked list where each element only has the link(knows)
about the [next] element, and the last element in this list points
to NULL.
Now let's get back to your question:
how a vector is different from a list in such a way
that the two sentences above are true? How is a vector built
differently from a list? Is it because they store their elements in
different memory parts? I was searching some sources to better
understand these concepts.
As I have mentioned there are 2 types of lists (single linked and double liked), they are good when you are going to have:
many insertion/deletion everywhere except the end of a list;
You could use vector in case you are planning to:
frequently access insert/delete elements at the end of a list;
access elements at the random position (as you could use [N] to access the element at any point, where the N is the index/position of an element.
Whereas in the List you would need to use iterators to access the element at the position/index N.
So vector is a dynamic array and they tend to perform faster when you are accessing it (since there is no additional wrapper over it, and you directly access a point in memory by the pointer).
The list is a sequence container (so this one has a wrapper over some base language functionality) that sacrifices some point in favor of additional simplicity of insertion and deletion by providing a user with some useful methods to work with its elements.
And to resolve you question, we could conclude the following:
Vector
inserting elements other than the back is slow
List
fast insert/delete at any point
This can be judged by the structure they have what they are.
Insertion is swift in List because it is a linked list and this
means what? Exactly, This means that the only change is to be taken
to achieve it is to change the pointer of the [previous ] and the
[next] item, and we are done!
Whereas in Vector it would take waaaay more time to insert an element anywhere other than at the end. This could be proven by the array concept. If you have an array of one million elements and you want to replace/delete/insert the element at the very beginning of the array it would need to change the position of each element that is coming after the altered element.
List vs Vector Image:
images were taken from this sources:
singly linked list
double linked list
double linked list 2nd image
vector
Also, try having a look here vector-vs-list-in-stl. Their comparison is well described there. + look through the-c-standard-template-library-stl, by going to the "Containers" section and checking the description and its methods.

Related

Dynamic array VS linked list in C++ [duplicate]

This question already has answers here:
Linked List vs Vector
(5 answers)
Closed 7 years ago.
Why we need a linked list when we have dynamic array list?
I have studied static list and linked list. I have knowledge of dynamic array list. but I couldn't find out the exact difference between that
Anyone please help me to answer this
Dynamic array is an array that resizes itself up or down depending on the number of content.
Advantage:
accessing and assignment by index is very fast O(1) process, since internally index access is just [address of first member] + [offset].
appending object (inserting at the end of array) is relatively fast amortized O(1). Same performance characteristic as removing objects at the end of the array. Note: appending and removing objects near the end of array is also known as push and pop.
Disadvantage:
inserting or removing objects in a random position in a dynamic array is very slow O(n/2), as it must shift (on average) half of the array every time. Especially poor is insertion and removal near the start of the array, as it must copy the whole array.
Unpredictable performance when insertion or removal requires resizing
There is a bit of unused space, since dynamic array implementation usually allocates more memory than necessary (since resize is a very slow operation)
Linked List is an object that have a general structure of [head, [tail]], head is the data, and tail is another Linked List. There are many versions of linked list: singular LL, double LL, circular LL, etc.
Advantage:
fast O(1) insertion and removal at any position in the list, as insertion in linked list is only breaking the list, inserting, and repairing it back together (no need to copy the tails)
Linked list is a persistent data structure, rather hard to explain in short sentence, see: wiki-link . This advantage allow tail sharing between two linked list. Tail sharing makes it easy to use linked list as copy-on-write data structure.
Disadvantage:
Slow O(n) index access (random access), since accessing linked list by index means you have to recursively loop over the list.
poor locality, the memory used for linked list is scattered around in a mess. In contrast with, arrays which uses a contiguous addresses in memory. Arrays (slightly) benefits from processor caching since they are all near each other
Others:
Due to the nature of linked list, you have to think recursively. Programmers that are not used to recursive functions may have some difficulties in writing algorithms for linked list (or worse they may try to use indexing).
Simply put, when you want to use algorithms that requires random access, forget linked list. When you want to use algorithms that requires heavy insertion and removal, forget arrays.
This is taken from the best answer of this question
I am convinced by this answer.
Vector aka Dynamic Array: Like regular array. The continuous memory location is used for storing vector. Whenever you need to allocate more memory, and memory is not available in the current location, the entire array is copied to another location and extra memory is allocated.
List: Maintain a pointer in each element and that pointer points to the next element.
What are the complexity guarantees of the standard containers?
Look at this link for more information.

Vector of Doubly Linked Lists of Doubly Linked List Pointers

I have a vector that contains Doubly Linked Lists, (i.e std::vector< DoublyLinkedList >) and then each Doubly Linked List will contain a pointer to another Doubly Linked List in the vector.
Here is an example of what I'm talking about:
So let's say we have the following vector of Doubly Linked Lists, { {1,2,0} ,{0,2,1,5}, {2,1,0,4,5} ,{4,5,1,0}, {5,4} }.
Let's look at the first Doubly Linked List in the vector, {1,2,0}. What I want is for 1 to point to the list {1,2,0} and 2 to point to the list {2,1,0,4,5} and 0 to point to {0,2,1,5} and similarly for the other lists in the vector.
In addition to having this kind of structure I also need the pointers to point to the correct list if we permute the elements of the vector.
So, say, if in the above example I swap the first two lists in the vector, which gives:
{ {0,2,1,5},{1,2,0},{2,1,0,4,5},{4,5,1,0},{5,4} }
I would still like in the list {1,2,0} that 1 points to {1,2,0} and 2 points to {2,1,0,4,5} and 0 to {0,2,1,5}.
So I am able to implement every part up until this last part.
What I've been doing so far for this part is, before the permutation I can have all the 0's in each list point to &vector[1] and then after a permutation I would have to go through every element in each list to find 0 and point them to 0's new position, so they would point to say &vector[k].
The problem with this is I have to search each list for 0 but I do not want to do the search. So is there any way to implement this without having to do a search? (The code is in C++)
In addition to the issue you described, the other problem with storing your structures directly in a vector is that certain operations on a vector invalidate some or all existing pointers to the vector. Namely removing from, or inserting elements into the vector.
Generally, in situations like these, it's better for the vector to store pointers to objects, rather than the objects themselves. In your example, a std::vector< DoublyLinkedList *> is going to work better. Your various instances of DoubleLinkedList can store pointers to each other, directly, and moving the pointers around in the vector won't have any effect on their validity.
Of course, this solution also has some other issues to work through, such as heap management, that will have to be sorted out. But that would be a different question.

C++ vector and list insertion

Could anybody know why inserting an element into the middle of a list is faster
than inserting an element into the middle of a vector?
I prefer to use vector but am told to use list if I can.
Anybody can explains why?
And is it always recommended to use list over vector?
If I take the question verbatim, finding the middle of an array (std::vector) is a simple operation, you divide the length by two and then round up or down to get the index. Finding the middle of a doubly linked list (std::list) requires walking through all elements. Even if you know its size, you still need to walk over half of the elements. Therefore std::vector is faster than std::list, in other words one is O(1) while the other is O(n).
Inserting at a known position requires shuffing the adjacent elements for an array and just linking in another node for a doubly linked list, as others explained here. Therefore, std::list with O(1) is faster than std::vector with O(n).
Together, to insert in the exact middle, we have O(1) + O(n) for the array and O(n) + O(1) for the doubly linked list, making inserting in the middle O(n) for both container types. All this leaves out things like CPU caches and allocator speed though, it just compares the number of "simple" operations. In summary, you need to find out how you use the container. If you really insert/delete at random positions a lot, std::list might be better. If you only do so rarely and then only read the container, a std::vector might be better. If you only have ten elements, all the O(x) is probably worthless anyway and you should go with the one you like best.
Inserting into the middle of the vector requires all the elements after the insertion point to be shuffled along to make space, potentially involving lots of copying.
The list is implemented as a linked list with each node occupying its own space in memory with references to neighboring nodes, so adding a new node just requires changing 2 references to point to the new node.
Depending on the data type you use, a vector may well perform much faster than a list. But the more complex the object is to copy, the worse a vector gets.
In simple terms, a vector is an array. So, its elements are stored in consecutive memory locations (i.e., one next to the other). The only exception is that a vector allows resizing during run-time, without causing data loss.
Now, to insert to a list, you identify the node, then create the new element (any where in memory), store the value and connect the pointers.
But in the case of the vector (array), you must physically move the elements from one cell to the other in order to create that space for a new elements. That physical movement is what causes the delay, particularly if many elements (i.e., data) needs to be moved. You are not physcially moving array elements. Rather, its their contents.
Ulrich Eckhardt's answer is pretty good. I don't have enough reputation to add a comment so I will write an answer myself. Like Ulrich said the speed of insertion in the middle for both the list and the vector is O(n) in theory. In practice, modern CPUs have a thing called "prefetcher". it's pretty good at getting contiguous data. Since the vector is contiguous in memory, moving lots of elements is pretty fast because of the prefetcher. You need to be manipulating really, really big vectors in order for them to be slower in inserting than the list. For more details check this awesome blog post:
http://gameprogrammingpatterns.com/data-locality.html

what is the difference between linked list and array when search through them?

I understand that array's value can be accessed directly by their position and linked list have to go through them one by one but have no idea of how to explain the difference in terms of their overhead and storage when the search is happening.
( I am think more in terms of does the previous node need to temporary store somewhere while try to access the next node any additional storage or overhead on the system part when go through them? and same when search through an array)
Can anyone give me a detail of what happen when search in each structure? or simply point to a right direction
An array is a vectorial variable of a fixed size.
A Linked List has no specified size: each element of the list contains a pointer to the next element. That's why you need to iterate through it sequentially. The advantage here, is that the structure shall not be allocated in a sequential block of memory, and doesn't need to resize if you add more elements in it.
Also in an array, if you remove an element, you need to shift all previous elements. If you insert an element in the middle of an array, you need to shift elements to make space for the new one. In lists you just update the pointers:
On the other side, array can be accessed randomly and don't need sequential access: so they are faster to search for objects, to sort, etc.
Having random access to a list element allows you to implement search algorithms such as a binary search, which would be impractical using a linked list.

Are vector a special case of linked lists?

When talking about the STL, I have several schoolmates telling me that "vectors are linked lists".
I have another one arguing that if you call the erase() method with an iterator, it breaks the vector, since it's a linked list.
They also tend to don't understand why I'm always arguing that vector are contiguous, just like any other array, and don't seem to understand what random access means. Are vector stricly contiguous just like regular arrays, or just at most contiguous ? (for example it will allocate several contiguous segments if the whole array doesn't fit).
I'm sorry to say that your schoolmates are completely wrong. If your schoolmates can honestly say that "vectors are linked lists" then you need to respectfully tell them that they need to pick up a good C++ book (or any decent computer science book) and read it. Or perhaps even the Wikipedia articles for vectors and lists. (Also see the articles for dynamic arrays and linked lists.)
Vectors (as in std::vector) are not linked lists. (Note that std::vector do not derive from std::list). While they both can store a collection of data, how a vector does it is completely different from how a linked list does it. Therefore, they have different performance characteristics in different situations.
For example, insertions are a constant-time operation on linked lists, while it is a linear-time operation on vectors if it is inserted in somewhere other than the end. (However, it is amortized constant-time if you insert at the end of a vector.)
The std::vector class in C++ are required to be contiguous by the C++ standard:
23.2.4/1 Class template vector
A vector is a kind of sequence that supports random access iterators. In addition, it supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time. Storage management is handled automatically, though hints can be given to improve efficienty. The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().
Compare that to std::list:
23.2.2/1 Class template list
A list is a kind of sequence that supports bidirectional iterators and allows constant time insert and erase operations anywhere within the sequence, with storage management handled automatically. Unlike vectors (23.2.4) and deques (23.2.1), fast random access to list elements is not supported, but many algorithms only need sequential access anyway.
Clearly, the C++ standard stipulates that a vector and a list are two different containers that do things differently.
You can't "break" a vector (at least not intentionally) by simply calling erase() with a valid iterator. That would make std::vectors rather useless since the point of its existence is to manage memory for you!
vector will hold all of it's storage in a single place. A vector is not even remotely like a linked list. Infact, if I had to pick two data structures that were most unlike each other, it would be vector and list. "At most contiguous" is how a deque operates.
Vector:
Guaranteed contiguous storage for all elements - will copy or move elements.
O(1) access time.
O(n) for insert or remove.
Iterators invalidated upon insertion or removal of any element.
List:
No contiguous storage at all - never copies or moves elements.
O(n) access time- plus all the nasty cache misses you're gonna get.
O(1) insert or remove.
Iterators valid as long as that specific element is not removed.
As you can see, they behave differently in every data structure use case.
By definition, vectors are contiguous blocks of memory like C arrays. See: http://en.wikipedia.org/wiki/Vector_(C%2B%2B)
Vectors allow random access; that is,
an element of a vector may be
referenced in the same manner as
elements of arrays (by array indices).
Linked-lists and sets, on the other
hand, do not support random access or
pointer arithmetic.
Vectors are not linked linked list, they provide random access and are contiguous just like arrays. In order to achieve this they re-allocate memory under the hood.
List is designed to allow quick insertions and deletions, while not invalidating any references or iterators except
the ones to the deleted element.