Why do we need a linked list when we have a dynamic array list?
I have studied static lists and linked lists, and I know about dynamic array lists, but I couldn't work out the exact difference between them.
Could anyone please help me answer this?
A dynamic array is an array that resizes itself up or down depending on the number of elements it holds.
Advantage:
accessing and assignment by index is very fast O(1) process, since internally index access is just [address of first member] + [offset].
appending object (inserting at the end of array) is relatively fast amortized O(1). Same performance characteristic as removing objects at the end of the array. Note: appending and removing objects near the end of array is also known as push and pop.
Disadvantage:
inserting or removing objects at a random position in a dynamic array is slow, O(n), as it must shift (on average) half of the array every time. Insertion and removal near the start of the array are especially poor, as nearly every element must be shifted.
Unpredictable performance when an insertion or removal triggers a resize.
There is some unused space, since dynamic array implementations usually allocate more memory than necessary (resizing is a very slow operation, so they try to do it rarely).
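To make the trade-offs above concrete, here is a minimal sketch of a dynamic array of ints. The class name IntDynArray and the choice to double the capacity are assumptions made for this example; real implementations such as std::vector differ in detail.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical minimal dynamic array of ints (illustration only).
class IntDynArray {
public:
    // O(1): index access is base address plus offset.
    int& operator[](std::size_t i) { return data_[i]; }

    // Amortized O(1): most calls just write into spare capacity.
    void push_back(int value) {
        if (size_ == capacity_) grow();
        data_[size_++] = value;
    }

    // O(1): drop the last element (precondition: size() > 0).
    void pop_back() { --size_; }

    // O(n): every element from index i onward shifts one slot right.
    void insert(std::size_t i, int value) {
        if (size_ == capacity_) grow();
        for (std::size_t j = size_; j > i; --j) data_[j] = data_[j - 1];
        data_[i] = value;
        ++size_;
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    // The slow path: allocate a block twice as large and copy everything.
    void grow() {
        std::size_t new_cap = capacity_ == 0 ? 1 : capacity_ * 2;
        std::unique_ptr<int[]> bigger(new int[new_cap]);
        for (std::size_t j = 0; j < size_; ++j) bigger[j] = data_[j];
        data_ = std::move(bigger);
        capacity_ = new_cap;
    }

    std::unique_ptr<int[]> data_;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
};
```

After five push_back calls this sketch holds a capacity of 8, which shows the unused space mentioned above: three slots are allocated but empty.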
A linked list is an object with the general structure [head, [tail]], where head is the data and tail is another linked list. There are many versions of linked list: singly linked, doubly linked, circular, etc.
Advantage:
fast O(1) insertion and removal at any position in the list once you hold a reference to that position, as insertion in a linked list only means breaking the list, inserting, and joining it back together (no need to copy the tail)
A linked list is a persistent data structure, which is rather hard to explain in a short sentence, see: wiki-link . This property allows tail sharing between two linked lists. Tail sharing makes it easy to use a linked list as a copy-on-write data structure.
Disadvantage:
Slow O(n) index access (random access), since accessing a linked list by index means you have to walk the list node by node from the front.
Poor locality: the memory used by a linked list is scattered around. In contrast, arrays use contiguous addresses in memory. Arrays (slightly) benefit from processor caching since their elements are all near each other.
Others:
Due to the nature of linked lists, you have to think recursively. Programmers who are not used to recursive functions may have some difficulty writing algorithms for linked lists (or worse, they may try to use indexing).
Simply put, when you want to use algorithms that require random access, forget linked lists. When you want to use algorithms that require heavy insertion and removal, forget arrays.
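The [head, [tail]] structure and the tail sharing mentioned above can be sketched roughly as follows. The names Node, List, cons and nth are made up for this example, and std::shared_ptr is just one possible way to let two lists own the same tail.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical immutable singly linked list node: [head, tail].
struct Node {
    int head;                          // the data
    std::shared_ptr<const Node> tail;  // the rest of the list (may be null)
};
using List = std::shared_ptr<const Node>;

// O(1): build a new list by putting value in front of an existing one.
// The existing list is not copied; it becomes the shared tail.
List cons(int value, List tail) {
    return std::make_shared<Node>(Node{value, std::move(tail)});
}

// O(n): reaching index i means following i links from the front
// (precondition: the list has more than i elements).
int nth(const List& list, std::size_t i) {
    const Node* node = list.get();
    while (i-- > 0) node = node->tail.get();
    return node->head;
}
```

For example, after `List shared = cons(2, cons(3, nullptr));`, the lists `cons(1, shared)` and `cons(10, shared)` are two distinct lists whose tails are the very same nodes, so nothing was copied to create the second one.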
This is taken from the best answer of this question
I am convinced by this answer.
Vector, aka dynamic array: like a regular array, a contiguous block of memory is used to store the elements. Whenever more memory is needed and it is not available at the current location, a larger block is allocated elsewhere and the entire array is copied into it.
List: Maintain a pointer in each element and that pointer points to the next element.
What are the complexity guarantees of the standard containers?
Look at this link for more information.
For a simple linked list in which random access to list elements is not a requirement, are there any significant advantages (performance or otherwise) to using std::list instead of std::vector? If backwards traversal is required, would it be more efficient to use std::slist and reverse() the list prior to iterating over its elements?
As usual the best answer to performance questions is to profile both implementations for your use case and see which is faster.
In general, if you have insertions into the data structure (other than at the end) then vector may be slower. Otherwise, in most cases vector is expected to perform better than list, if only because of data locality: when two elements that are adjacent in the data set are also adjacent in memory, the next element will usually already be in the processor's cache and will not cause a cache miss that has to fetch it from main memory.
Also keep in mind that the space overhead for a vector is constant (typically 3 pointers), while the space overhead for a list is paid for each element. This also reduces the number of full elements (data plus overhead) that can reside in the cache at any one time.
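To get a feel for the per-element overhead, one can measure a hand-written node type. DoublyNode and per_element_overhead below are hypothetical names for this sketch; the node layout that std::list really uses is implementation-defined, and allocator bookkeeping is not counted here.

```cpp
#include <cstddef>

// Hypothetical doubly linked node, similar in spirit to what a list stores.
template <typename T>
struct DoublyNode {
    T value;
    DoublyNode* prev;
    DoublyNode* next;
};

// Bytes of bookkeeping paid for every element stored in such a list
// (padding included, allocator bookkeeping not included).
template <typename T>
constexpr std::size_t per_element_overhead() {
    return sizeof(DoublyNode<T>) - sizeof(T);
}
```

On a typical 64-bit platform this comes to at least 16 bytes per element (two 8-byte pointers), which for small elements such as int is several times the size of the data itself.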
Default data structure to think of in C++ is the Vector.
Consider the following points...
1] Traversal:
List nodes are scattered everywhere in memory and hence list traversal leads to cache misses. But traversal of vectors is smooth.
2] Insertion and Deletion:
On average, 50% of the elements must be shifted when you do that to a Vector, but caches are very good at that! With lists, however, you need to traverse to the point of insertion/deletion... so again... cache misses!
And surprisingly vectors win this case as well!
3] Storage:
When you use lists, there are 2 pointers per element (forward & backward), so a List is much bigger than a Vector!
Vectors need just a little more memory than the actual elements need.
You should have a reason not to use a vector.
Reference:
I learned this in a talk of The Lord Bjarne Stroustrup: https://youtu.be/0iWb_qi2-uI?t=2680
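To try the insertion comparison on your own machine, a starting point could look like the sketch below. It is not the benchmark from the talk: insert_sorted is a name invented here, the vector version uses binary search (a choice made for this example), and timing code is left out.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Vector: binary search for the position, then shift the tail to make room.
void insert_sorted(std::vector<int>& v, int value) {
    auto pos = std::lower_bound(v.begin(), v.end(), value);
    v.insert(pos, value);
}

// List: walk node by node to the position, then relink in O(1).
void insert_sorted(std::list<int>& l, int value) {
    auto pos = std::find_if(l.begin(), l.end(),
                            [value](int x) { return x >= value; });
    l.insert(pos, value);
}
```

Both functions keep the container sorted; the difference lies only in where the time goes (shifting contiguous memory versus chasing pointers), which is what a profiler or a timer around a loop of such calls would reveal.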
Simply no. List has advantages over Vector, but sequential access is not one of them - if that's all you're doing, then a vector is better.
However, adding elements to a vector is more expensive than adding them to a list, especially if you're inserting in the middle.
Understand how these collections are implemented: a vector is a sequential array of data, a list is an element that contains the data and pointers to the next elements. Once you understand that, you'll understand why lists are good for inserts, and bad for random access.
(so, reverse iteration of a vector is exactly the same as for forward iteration - the iterator just subtracts the size of the data items each time, the list still has to jump to the next item via the pointer)
If you need backwards traversal an slist is unlikely to be the datastructure for you.
A conventional (doubly) linked list gives you constant insertion and deletion time anywhere in the list; a vector only gives amortised constant time insertion and deletion at the end. For a vector, insertion and deletion time is linear anywhere other than the end. This isn't the whole story; there are also constant factors. A vector is a simpler data structure, which has advantages and disadvantages depending on the context.
The best way to understand this is to understand how they are implemented. A linked list has a next and a previous pointer for each element. A vector has an array of elements addressed by index. From this you can see that both can do efficient forwards and backwards traversal, while only a vector can provide efficient random access. You can also see that the memory overhead of a linked list is per element while for the vector it is constant. And you can also see why insertion time is different between the two structures.
Some rigorous benchmarks on the topic:
http://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html
As has been noted by others, contiguous memory storage means std::vector is better for most things. There is virtually no good reason to use std::list except for small amounts of data (where it can all fit in the cache) and/or where erasure and reinsertion are frequent.
Complexity guarantees do not translate directly into real-world performance, because of the difference between cache and main memory speeds (200x) and how contiguous memory access affects cache usage. See Chandler Carruth's (Google) talk about the issue here:
https://www.youtube.com/watch?v=fHNmRkzxHWs
And Mike Acton's Data Oriented Design talk here:
https://www.youtube.com/watch?v=rX0ItVEVjHc
See this question for details about the costs:
What are the complexity Guarantees of the standard containers
If you have an slist and you now want to traverse it in reverse order why not change the type to list everywhere?
std::vector is insanely faster than std::list to find an element
std::vector always performs faster than std::list with very small data
std::vector is always faster to push elements at the back than std::list
std::list handles large elements very well, especially for sorting or inserting in the front
Note: If you want to learn more about performance, I would recommend to see this
I'm studying containers and their different performances.
When I read about vectors, I see something like
"inserting elements other than the back is slow"
but for a list:
"fast insert/delete at any point"
My questions are:
How is a vector different from a list in such a way that the two sentences above are true?
How is a vector built differently from a list?
Is it because they store their elements in different parts of memory?
I was searching some sources to better understand these concepts.
All examples will be linked to C++ language as I believe it is perfectly described there
Vectors
Vectors are the same as dynamic arrays with the ability to resize themselves
automatically when an element is inserted or deleted, with their
storage being handled automatically by the container. Vector elements
are placed in contiguous storage so that they can be accessed and
traversed using iterators. In vectors, data is inserted at the end.
Inserting at the end takes amortized constant time, as sometimes there may
be a need of extending the array. Removing the last element takes only
constant time because no resizing happens. Inserting and erasing at
the beginning or in the middle is linear in time.
A vector is a dynamic array that manages its own memory. This means that you can create arrays whose length is set at runtime without using the new and delete operators (that is, without explicitly allocating and deallocating memory).
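As a small illustration of a runtime-sized array without new and delete, consider the sketch below. The function name make_squares is invented for this example.

```cpp
#include <cstddef>
#include <vector>

// Build an array whose length is only known at run time.
// No new/delete: the vector allocates in its constructor and
// releases the memory in its destructor.
std::vector<int> make_squares(std::size_t n) {
    std::vector<int> squares(n);          // n value-initialized ints (all 0)
    for (std::size_t i = 0; i < n; ++i) {
        squares[i] = static_cast<int>(i * i);
    }
    squares.push_back(-1);                // may reallocate; handled internally
    return squares;
}
```

Calling `make_squares(4)` yields a vector of five elements: the squares 0, 1, 4, 9 followed by the appended -1.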
Lists
Lists are sequence containers that allow non-contiguous memory
allocation. As compared to vector, the list has slow traversal, but once a
position has been found, insertion and deletion are quick. Normally,
when we say a List, we talk about a doubly linked list. For implementing
a singly linked list, we use a forward list.
There are 2 types of lists:
Doubly linked list, where each element has the address of the [next] and [previous] elements. To access the first or last element you can use a specific function (e.g. in C++, front() or back() on the list; listName.front() returns the first element, at position 0). head.previous points to NULL and tail.next points to NULL;
Singly linked list, where each element only knows about the [next] element, and the last element in the list points to NULL.
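In standard C++ the two kinds correspond to std::list (doubly linked) and std::forward_list (singly linked). The helper name last_of below is made up for this sketch; it shows that only the doubly linked list can reach its last element directly.

```cpp
#include <forward_list>
#include <iterator>
#include <list>

// Doubly linked: both ends are reachable, and iterators can step backwards.
int last_of(const std::list<int>& l) { return l.back(); }

// Singly linked: only the front is directly reachable, so reaching the
// last element means walking the whole list (precondition: not empty).
int last_of(const std::forward_list<int>& f) {
    auto it = f.begin();
    while (std::next(it) != f.end()) ++it;
    return *it;
}
```

std::forward_list deliberately has no back() member, which is why the second overload has to walk every [next] link.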
Now let's get back to your question:
how a vector is different from a list in such a way
that the two sentences above are true? How is a vector built
differently from a list? Is it because they store their elements in
different memory parts? I was searching some sources to better
understand these concepts.
As I have mentioned, there are 2 types of lists (singly linked and doubly linked). They are good when you are going to have:
many insertions/deletions everywhere except the end of a list;
You could use a vector in case you are planning to:
frequently insert/delete elements at the end;
access elements at random positions (you can use [N] to access the element at any point, where N is the index/position of the element).
Whereas in a list you would need to use iterators to reach the element at position/index N.
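The difference between [N] on a vector and iterators on a list can be sketched like this. The function name element_at is an assumption for the example; std::next is the standard helper that advances an iterator.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// O(1): direct index arithmetic.
int element_at(const std::vector<int>& v, std::size_t n) { return v[n]; }

// O(n): std::next follows n links starting from the first node.
int element_at(const std::list<int>& l, std::size_t n) {
    return *std::next(l.begin(), static_cast<std::ptrdiff_t>(n));
}
```

Both overloads return the same value for the same contents, but the list version does work proportional to n on every call.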
So a vector is a dynamic array, and it tends to be faster to access (since its elements sit in one contiguous block, you reach any of them directly through a pointer plus an offset).
The list is a sequence container that sacrifices fast random access in favor of simpler insertion and deletion, and it provides some useful methods for working with its elements.
And to resolve your question, we could conclude the following:
Vector
inserting elements other than the back is slow
List
fast insert/delete at any point
This follows from how each one is structured.
Insertion is swift in a List because it is a linked list. What does this mean? It means that the only change needed is to update the pointers of the [previous] and the [next] item, and we are done!
Whereas in a Vector it would take waaaay more time to insert an element anywhere other than at the end. This follows from how arrays work. If you have an array of one million elements and you want to delete or insert an element at the very beginning of the array, it needs to change the position of every element that comes after the altered one.
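The pointer update for a list insertion can be sketched with a hand-written node. DNode and insert_after are hypothetical names for this example and are not how std::list exposes its internals.

```cpp
// Hypothetical doubly linked node (illustration only).
struct DNode {
    int value;
    DNode* prev;
    DNode* next;
};

// O(1): link fresh between pos and pos->next.
// Only pointers change; no existing element moves in memory.
void insert_after(DNode* pos, DNode* fresh) {
    fresh->prev = pos;
    fresh->next = pos->next;
    if (pos->next != nullptr) pos->next->prev = fresh;
    pos->next = fresh;
}
```

However long the list is, the function touches at most four pointers, whereas inserting at the front of a million-element array moves a million elements.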
[Images: singly linked list, doubly linked list (two images), vector]
Also, try having a look here vector-vs-list-in-stl. Their comparison is well described there. + look through the-c-standard-template-library-stl, by going to the "Containers" section and checking the description and its methods.
I had the following exchange with my professor which wasn't very satisfying. I included my parts of the exchange which should be enough to get my point across.
"For vectors, does the C++ implementation traverse through each element of the old dynamically allocated array and free it?
(Edit: I mean, when resizing and adding elements, either by pushback or resize)
I am especially curious because the book tries to make the case that linked lists are troublesome because of having to traverse each time. Does not seem to me that vectors have a huge advantage in that regard.
The main benefit of the vectors I can see is the convenience and fast accessing but not much more. As in, everytime you try to do something other than accessing, you will be traversing through everything to move and free memory. Is that correct?"
After his reply, I added.
"Professor xxxx,
I went out to test, and in fact, the addresses change if you either resize or push_back, so my assumption that the old addresses are freed is correct. I can only assume that program will have to go to each element to free it, and if that's correct, won't the insertion of new things be costly in terms of time, even more so than traversing linked lists?
Can you kindly correct the following statement If it states any incorrect facts or assumptions. Using vectors in any other way than using arrays (for any other purpose than accessing already stored data), means that linked lists will almost always be faster because unlike linked lists, in vectors not only would you traverse through the elements, you will traverse through them, free them, and then create a whole new array to accommodate new space. That is because the next address after the last element of the current vector could have a pointer variable pointing to it, and using that address will cause an extremely strange behavior that I cannot imagine the poor soul's misery that tries to figure out what had gone wrong."
TL;DR:
Disadvantage of linked lists is traversing, but vector uses (push_back, resize(), etc) most often require the traversing anyway, so how are vectors exactly faster?
There are several things that are faster than you expect:
When a vector reallocates, the original elements are destroyed, not freed, one by one. Their storage is then freed all at once. This is as opposed to a linked list, where each node is allocated and freed individually. But this is somewhat moot because:
A vector batches reallocations. std::vector is specified to have an amortized constant insertion cost, which implies that it avoids reallocating every time you push_back, to the extent that this cost becomes negligible when considering complexity. Typical implementations multiply the vector's capacity by a fixed factor every time it is exceeded, so when performing the costly reallocation it provides room for the next several push_backs. These then do not need to traverse the vector or allocate anything.
A vector is extremely cache-friendly. This makes all sequential operations on a vector blazing fast, and can counter-intuitively outperform a linked list in many cases, especially in long-running applications where memory might become fragmented.
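The batching of reallocations is easy to observe by watching capacity() while appending. The function name count_reallocations is invented for this sketch, and the exact count it returns depends on the growth factor of your standard library.

```cpp
#include <cstddef>
#include <vector>

// Count how many times the vector moves to new storage while n
// elements are appended one at a time.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // storage was replaced
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With geometric growth, appending 100000 elements should trigger only a few dozen reallocations at most, so the vast majority of push_back calls neither traverse nor allocate anything.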
As a complement of the already given answer and comments...
A std::vector has its elements stored contiguously in memory, while this is not the case for a linked list.
The direct consequence is that element access is trivial for a std::vector while it's not for a linked list.
For example, if I want to access the nth element of a linked list, I have to iterate through the list until reaching the desired element.
But on the other hand, the linked list will perform better if we want to insert a new element inside.
Indeed, for a linked list, we have to iterate until we reach the desired position, then we just have to change the connections between the previous and the next node so that the new element is inserted in-between.
For a std::vector, you have to relocate every element after the desired position (and reallocate if needed, i.e. if adding a new element exceeds the reserved space).
So the std::vector is better for element access but is less efficient when inserting an element inside (same thing for removal).
So one of the topics in my comp sci class concerns time complexity, using arrays and linked lists as a way to compare certain operations and see which container handles each better, so you can choose the appropriate data structure.
I understand the reasoning behind most of the operations, but I'm unsure about one, and that is inserting and appending in an array.
The worst-case scenario for both of these is O(n). I believe I understand why inserting is O(n): in the worst case you insert at the front, which forces you to shift all elements over to the right, so it's linear in the number of elements in the array.
For appending, I was curious why it was not O(1) since it takes one operation no matter the size to add an element at the end, given that there is space.
Is that the issue, if there isn't enough space you have to copy the array to a larger one for its worst case scenario?
[...] if there isn't enough space you have to copy the array to a larger one
for its worst case scenario?
Bingo.
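To see how rare that worst case is, one can simulate the copying. The sketch below assumes an array that starts with capacity 1 and doubles when full; the function name copies_for_appends and the doubling policy are assumptions for this example.

```cpp
#include <cstddef>

// Simulate n appends to an array that doubles when full and return the
// total number of element copies caused by resizing.
std::size_t copies_for_appends(std::size_t n) {
    std::size_t capacity = 1;
    std::size_t size = 0;
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            copies += size;   // worst case for this one append: copy everything
            capacity *= 2;
        }
        ++size;
    }
    return copies;
}
```

For 1000 appends the simulation reports 1023 copies in total (1 + 2 + 4 + ... + 512), so although a single append can cost O(n), the total stays below 2n and the average cost per append is constant.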
A typical array is a chunk of contiguous memory with a definite size, which is determined either at compile or run time. There is no such thing as removing or inserting elements into an array, but simply writing into the already-allocated memory.
A linked list is a non-contiguous collection of memory chunks, which are connected by means of their addresses. There is such a thing as removing and inserting elements into a linked list.
The benefits of an array over a linked list are easier traversal and compactness (extra memory to store the address of the next [or previous] element is unnecessary). However, unlike a linked list, this cannot be extended as easily.
Nevertheless, in order for us to more precisely talk about the time complexities of the algorithms inherent to a data structure, we need to first define the data structure.
Doubly linked lists? Do we store the addresses of the first and last elements (like a queue)? Binary trees (which are a type of linked list)?
Does anybody know why inserting an element into the middle of a list is faster
than inserting an element into the middle of a vector?
I prefer to use vector but am told to use list if I can.
Can anybody explain why?
And is it always recommended to use list over vector?
If I take the question verbatim, finding the middle of an array (std::vector) is a simple operation, you divide the length by two and then round up or down to get the index. Finding the middle of a doubly linked list (std::list) requires walking through all elements. Even if you know its size, you still need to walk over half of the elements. Therefore std::vector is faster than std::list, in other words one is O(1) while the other is O(n).
Inserting at a known position requires shuffling the adjacent elements for an array and just linking in another node for a doubly linked list, as others explained here. Therefore, std::list with O(1) is faster than std::vector with O(n).
Together, to insert in the exact middle, we have O(1) + O(n) for the array and O(n) + O(1) for the doubly linked list, making inserting in the middle O(n) for both container types. All this leaves out things like CPU caches and allocator speed though, it just compares the number of "simple" operations. In summary, you need to find out how you use the container. If you really insert/delete at random positions a lot, std::list might be better. If you only do so rarely and then only read the container, a std::vector might be better. If you only have ten elements, all the O(x) is probably worthless anyway and you should go with the one you like best.
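The two halves of that cost (finding the middle, then inserting) can be written out for both containers. The function name insert_middle is made up for this sketch.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Vector: O(1) to find the middle, O(n) to shift the second half.
void insert_middle(std::vector<int>& v, int value) {
    auto mid = v.begin() + static_cast<std::ptrdiff_t>(v.size() / 2);
    v.insert(mid, value);
}

// List: O(n) to walk to the middle, O(1) to link in the node.
void insert_middle(std::list<int>& l, int value) {
    auto mid = std::next(l.begin(), static_cast<std::ptrdiff_t>(l.size() / 2));
    l.insert(mid, value);
}
```

Both overloads are O(n) overall, just for different reasons, which is why only measurement on real data tells you which one is faster in practice.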
Inserting into the middle of the vector requires all the elements after the insertion point to be shuffled along to make space, potentially involving lots of copying.
The list is implemented as a linked list with each node occupying its own space in memory with references to neighboring nodes, so adding a new node just requires changing 2 references to point to the new node.
Depending on the data type you use, a vector may well perform much faster than a list. But the more complex the object is to copy, the worse a vector gets.
In simple terms, a vector is an array. So, its elements are stored in consecutive memory locations (i.e., one next to the other). The only exception is that a vector allows resizing during run-time, without causing data loss.
Now, to insert to a list, you identify the node, then create the new element (any where in memory), store the value and connect the pointers.
But in the case of the vector (array), you must physically move the elements from one cell to the next in order to create space for a new element. That physical movement is what causes the delay, particularly if many elements (i.e., data) need to be moved. Strictly speaking, you are not moving the array cells themselves but their contents.
Ulrich Eckhardt's answer is pretty good. I don't have enough reputation to add a comment, so I will write an answer myself. Like Ulrich said, the speed of insertion in the middle for both the list and the vector is O(n) in theory. In practice, modern CPUs have a thing called a "prefetcher". It's pretty good at getting contiguous data. Since the vector is contiguous in memory, moving lots of elements is pretty fast because of the prefetcher. You need to be manipulating really, really big vectors in order for them to be slower at inserting than the list. For more details check this awesome blog post:
http://gameprogrammingpatterns.com/data-locality.html