Which one to use? Vector or List

Which one to use? Vector or List - list

I have to store 10 elements of type card (user defined class). I cannot decide whether to go with vector or list. Following are the operations I would be performing on the structure:
Appending or Inserting at the end of the structure
(better to go with vector).
Ramdom access(element to be accessed can be at end, begining or any position in the structure) (again vector is a better choice).
To delete the random accessed element i.e. Erasing an element from begining or end or any position
(Vector only good for end positions, elsewhere list preferred).
Move element from one position to other such that the element is not swapped with the element at desired position but it gets
inserted within (List are much better here).
To move more than one elements in the same manner as point 4.
(Again I would prefer list)
So can you please guide me which one to pick.
Thanks a ton!

Sounds like you have done your research on vector and list, as you can see there are some conflicting requirements. One other thing to consider is perhaps the frequent of those operations. I.e how often are you expecting to be inserting or deleting from the middle of the collection. Another consideration is the size of the collection, 10 elements is a very small collection so copying 10 elements around isn't a big deal unless you are doing it very frequently. My default choice would be vector, but you can profile both to see which one performs better.

Related

In c++, how to change the position of elements in a vector and do the same changes to an other different vector?

I have 2 vectors, y and T initially of the same size, and they need to stay separated like this. I do a loop until T is empty and every time it loops, the first element of T is used for an algorithm and then erased from the vector T and pushed into vector S (which is empty at first). Every loop, some values in vector y will change and I need to sort them. My problem is: when I sort y, if y[2] and y[3] swap, I need to swap the elements in T that were at [2] and [3] BEFORE the first loop!
I know this seems weird but this is for a Dijkstra algorithm for my Graph project. I understand if it's not clear and I'll try to clarify if you need it. Any advice will be very helpful for me! Thank you!

If I follow you correctly, T is really a FIFO queue. The first element in it each iteration is being 'popped off the front' and placed elsewhere.
Should you be doing this and want at any time to know an ordering of T which includes the elements that were popped off, perhaps you could not remove them from T at all. Just have an iterator or pointer to the next node to be processed.
That node could either be copied to S, or perhaps S would just contain pointers to the node if it were expensive to copy.
This way you're never actually removing the element from T, simply moving on to look at the next item. Presumably it could be cleaned up at the end.

For the Dijkstra algorithm, you usually use a binary heap to implement a priority queue and visit nodes in ascending order of distance to source nodes. At each visit, neighboring distances may be updated (relax operation) so corresponding queue elements change priority (decrease-key operation).
Instead of using the distance directly as the key of heap elements, use the node identifier (or pointer). But also use a comparison function (as e.g. in std::make_heap and most STL algorithms) that, given two node identifiers, compares the corresponding distances.
This way, node identifiers are re-ordered according to heap operations and, whenever you pop an element from the heap (having minimal distance), you can access any information you like given its node identifier.

C++ vector and list insertion

Could anybody know why inserting an element into the middle of a list is faster
than inserting an element into the middle of a vector?
I prefer to use vector but am told to use list if I can.
Anybody can explains why?
And is it always recommended to use list over vector?

If I take the question verbatim, finding the middle of an array (std::vector) is a simple operation, you divide the length by two and then round up or down to get the index. Finding the middle of a doubly linked list (std::list) requires walking through all elements. Even if you know its size, you still need to walk over half of the elements. Therefore std::vector is faster than std::list, in other words one is O(1) while the other is O(n).
Inserting at a known position requires shuffing the adjacent elements for an array and just linking in another node for a doubly linked list, as others explained here. Therefore, std::list with O(1) is faster than std::vector with O(n).
Together, to insert in the exact middle, we have O(1) + O(n) for the array and O(n) + O(1) for the doubly linked list, making inserting in the middle O(n) for both container types. All this leaves out things like CPU caches and allocator speed though, it just compares the number of "simple" operations. In summary, you need to find out how you use the container. If you really insert/delete at random positions a lot, std::list might be better. If you only do so rarely and then only read the container, a std::vector might be better. If you only have ten elements, all the O(x) is probably worthless anyway and you should go with the one you like best.

Inserting into the middle of the vector requires all the elements after the insertion point to be shuffled along to make space, potentially involving lots of copying.
The list is implemented as a linked list with each node occupying its own space in memory with references to neighboring nodes, so adding a new node just requires changing 2 references to point to the new node.
Depending on the data type you use, a vector may well perform much faster than a list. But the more complex the object is to copy, the worse a vector gets.

In simple terms, a vector is an array. So, its elements are stored in consecutive memory locations (i.e., one next to the other). The only exception is that a vector allows resizing during run-time, without causing data loss.
Now, to insert to a list, you identify the node, then create the new element (any where in memory), store the value and connect the pointers.
But in the case of the vector (array), you must physically move the elements from one cell to the other in order to create that space for a new elements. That physical movement is what causes the delay, particularly if many elements (i.e., data) needs to be moved. You are not physcially moving array elements. Rather, its their contents.

Ulrich Eckhardt's answer is pretty good. I don't have enough reputation to add a comment so I will write an answer myself. Like Ulrich said the speed of insertion in the middle for both the list and the vector is O(n) in theory. In practice, modern CPUs have a thing called "prefetcher". it's pretty good at getting contiguous data. Since the vector is contiguous in memory, moving lots of elements is pretty fast because of the prefetcher. You need to be manipulating really, really big vectors in order for them to be slower in inserting than the list. For more details check this awesome blog post:
http://gameprogrammingpatterns.com/data-locality.html

Container for a stack of unique elements

What I'm trying to do is to construct a stack which contains unique elements.
And if an element is pushed to which is already in stack the element is not pushed but the existing elemetn should be moved to the top of the stack, i.e.
ABCD + B > ACDB
I would like to here from you which container will be the best choice to have this functionality.
I decided to user stack adapter over list, because
list does provide constant time for element move
list is one of the natively supported containers for the stack.
The drawback of my choice is that I have to manually check for the duplicate elements.
P.S. My compiler is not so recent so please don't suggest unordered_set.

Basically you have to chose between constant time moving + long search, or constant time search + long moving.
It's hard to say which would be more time-consuming, but consider this:
You will have to search for if the element exists every time you try to add an element
You will not have to move elements every time, since obviously at some times you will be adding elements that are "new" for the container.
I'd suggest you to store elements and their stack positions in different containers. Store elements in a way that provides fast search, store stack positions in a way that provides fast movement. Connect both with pointers (so you can know which element is on which position, and which position holds which element <-- messy phrase, I know it!), and you will perform stuff rather fast.

From your requirements, it seems to me that the structure you want could be derived from a Max Heap.
If instead of storing just the item, you store a pair (generation, item), with generation coming from a monotonically increasing counter, then the "root" of the heap is always the last seen element (and the other elements do not really matter, do they ?).
Pop: typical pop operation on the heap (delete-max operation)
Push: modified operation, to account for uniqueness of "item" within the structure
look for "item", if found update its generation (increase-key operation)
if not, insert it (insert operation)
Given the number of elements (20), building the heap on a vector seems a natural choice.

Difference between lists and arrays

It seems a list in lisp can use push to add another element to it, while an array can use vector-push-extend to do the same thing (if you use :adjustable t, except add an element at the end. Similarly, pop removes the first item in a list, while vector-pop removes the last item from a vector.
So what is the difference between a list and a vector in lisp?

a list is made of cons cells and nil. One has to move sequential through the list to access an element. Adding and removing an element at the front is very cheap. Operations at other list positions are more costly. Lists may have a space overhead, since each element is stored in a cons cell.
a vector is a one-dimensional array of a certain length. If an array is adjustable, it might change its size, but possibly the elements have to be copied. Adding an element is costly, when the array has to be adjusted. Adjustable arrays have space overhead, since arrays are usually not adjusted by increments of one. One can access all elements directly via an index.

A vector is tangible thing, but the list you're thinking of is a name for a way to view several loosely-connected but separate things.
The vector is like an egg carton—it's a box with some fixed number of slots, each of which may or may not have a thing inside of it. By contrast, what you're thinking of a list is more like taking a few individual shoes and tying their laces together. The "list" is really a view of one cons cell that holds a value—like a slot in the egg carton—and points to another cons cell, which also holds a value and points to another cons cell, which holds a value and possibly points to nothing else, making it the end of the list. What you think of as a list is not really one thing there, but rather it's a relationship between several distinct cons cells.
You are correct that there are some operations you can perform on both structures, but their time complexity is different. For a list, pushing an item onto the front does not actually modify anything about the existing list; instead, it means making a new cons cell to represent the new head of the list, and pointing that cons cell at the existing list as the new tail. That's an O(1) operation; it doesn't matter how long the list was, as putting a new cell in front of it always takes the same amount of time. Similarly, popping an item off the front of the list doesn't change the existing list; it just means shifting your view one cell over from the current head to see what was the second item as the new head.
By contrast, with a vector, pushing a new item onto the front requires first moving all the existing items over by one space to leave an empty space at the beginning—assuming there's enough space in the vector to hold all the existing items and one more. This shifting over has time complexity O(n), meaning that the more items there are in the vector, the longer it takes to shift them over and make room for the new item. Similarly, popping from the front of a vector requires shifting all the existing items but the first one down toward the front, so that what was the second becomes the first, and what was the third becomes the second, and so on. Again, that's of time complexity O(n).
It's sometimes possible to fiddle with the bookkeeping in a vector so that popping an item off the front doesn't require shifting all the existing items over; instead, just change the record of which item counts as the first. You can imagine that doing so has many challenges: indices used to access items need to be adjusted, the vector's capacity is no longer simple to interpret, and reallocating space for the vector to accommodate new items will now need to consider where to leave the newly-available empty space after copying the existing data over to new storage. The "deque" structure addresses these concerns.
Choosing between a list and a vector requires thinking about what you intend to do with it, and how much you know in advance about the nature and size of the content. They each sing in different scenarios, despite the apparent overlap in their purpose.
At the time I wrote the answer above, the question was asking about pushing and popping elements from the front of both a vector and a list. Operating on the end of a vector as vector-push and vector-pop do is much more similar to manipulating the head of a list than the distinction made in my comparison above. Depending on whether the vector's capacity can accommodate another element without reallocating, pushing and popping elements at the end of a vector take constant (O(1)) time.

array vs vector vs list

I am maintaining a fixed-length table of 10 entries. Each item is a structure of like 4 fields. There will be insert, update and delete operations, specified by numeric position. I am wondering which is the best data structure to use to maintain this table of information:
array - insert/delete takes linear time due to shifting; update takes constant time; no space is used for pointers; accessing an item using [] is faster.
stl vector - insert/delete takes linear time due to shifting; update takes constant time; no space is used for pointers; accessing an item is slower than an array since it is a call to operator[] and a linked list .
stl list - insert and delete takes linear time since you need to iterate to a specific position before applying the insert/delete; additional space is needed for pointers; accessing an item is slower than an array since it is a linked list linear traversal.
Right now, my choice is to use an array. Is it justifiable? Or did I miss something?
Which is faster: traversing a list, then inserting a node or shifting items in an array to produce an empty position then inserting the item in that position?
What is the best way to measure this performance? Can I just display the timestamp before and after the operations?

Use STL vector. It provides an equally rich interface as list and removes the pain of managing memory that arrays require.
You will have to try very hard to expose the performance cost of operator[] - it usually gets inlined.
I do not have any number to give you, but I remember reading performance analysis that described how vector<int> was faster than list<int> even for inserts and deletes (under a certain size of course). The truth of the matter is that these processors we use are very fast - and if your vector fits in L2 cache, then it's going to go really really fast. Lists on the other hand have to manage heap objects that will kill your L2.

Premature optimization is the root of all evil.
Based on your post, I'd say there's no reason to make your choice of data structure here a performance based one. Pick whatever is most convenient and return to change it if and only if performance testing demonstrates it's a problem.

It is really worth investing some time in understanding the fundamental differences between lists and vectors.
The most significant difference between the two is the way they store elements and keep track of them.
- Lists -
List contains elements which have the address of a previous and next element stored in them. This means that you can INSERT or DELETE an element anywhere in the list with constant speed O(1) regardless of the list size. You also splice (insert another list) into the existing list anywhere with constant speed as well. The reason is that list only needs to change two pointers (the previous and next) for the element we are inserting into the list.
Lists are not good if you need random access. So if one plans to access nth element in the list - one has to traverse the list one by one - O(n) speed
- Vectors -
Vector contains elements in sequence, just like an array. This is very convenient for random access. Accessing the "nth" element in a vector is a simple pointer calculation (O(1) speed). Adding elements to a vector is, however, different. If one wants to add an element in the middle of a vector - all the elements that come after that element will have to be re allocated down to make room for the new entry. The speed will depend on the vector size and on the position of the new element. The worst case scenario is inserting an element at position 2 in a vector, the best one is appending a new element. Therefore - insert works with speed O(n), where "n" is the number of elements that need to be moved - not necessarily the size of a vector.
There are other differences that involve memory requirements etc., but understanding these basic principles of how lists and vectors actually work is really worth spending some time on.
As always ... "Premature optimization is the root of all evil" so first consider what is more convenient and make things work exactly the way you want them, then optimize. For 10 entries that you mention - it really does not matter what you use - you will never be able to see any kind of performance difference whatever method you use.

Prefer an std::vector over and array. Some advantages of vector are:
They allocate memory from the free space when increasing in size.
They are NOT a pointer in disguise.
They can increase/decrease in size run-time.
They can do range checking using at().
A vector knows its size, so you don't have to count elements.
The most compelling reason to use a vector is that it frees you from explicit memory management, and it does not leak memory. A vector keeps track of the memory it uses to store its elements. When a vector needs more memory for elements, it allocates more; when a vector goes out of scope, it frees that memory. Therefore, the user need not be concerned with the allocation and deallocation of memory for vector elements.

You're making assumptions you shouldn't be making, such as "accessing an item is slower than an array since it is a call to operator[]." I can understand the logic behind it, but you nor I can know until we profile it.
If you do, you'll find there is no overhead at all, when optimizations are turned on. The compiler inlines the function calls. There is a difference in memory performance. An array is statically allocated, while a vector dynamically allocates. A list allocates per node, which can throttle cache if you're not careful.
Some solutions are to have the vector allocate from the stack, and have a pool allocator for a list, so that the nodes can fit into cache.
So rather than worry about unsupported claims, you should worry about making your design as clean as possible. So, which makes more sense? An array, vector, or list? I don't know what you're trying to do so I can't answer you.
The "default" container tends to be a vector. Sometimes an array is perfectly acceptable too.

First a couple of notes:
A good rule of thumb about selecting data structures: Generally, if you examined all the possibilities and determined that an array is your best choice, start over. You did something very wrong.
STL lists don't support operator[], and if they did the reason that it would be slower than indexing an array has nothing to do with the overhead of a function call.
Those things being said, vector is the clear winner here. The call to operator[] is essentially negligible since the contents of a vector are guaranteed to be contiguous in memory. It supports insert() and erase() operations which you would essntially have to write yourself if you used an array. Basically it boils down to the fact that a vector is essentially an upgraded array which already supports all the operations you need.

I am maintaining a fixed-length table of 10 entries. Each item is a
structure of like 4 fields. There will be insert, update and delete
operations, specified by numeric position. I am wondering which is the
best data structure to use to maintain this table of information:
Based on this description it seems like list might be the better choice since its O(1) when inserting and deleting in the middle of the data structure. Unfortunately you cannot use numeric positions when using lists to do inserts and deletes like you can for arrays/vectors. This dilemma leads to a slew of questions which can be used to make an initial decision of which structure may be best to use. This structure can later be changed if testing clearly shows its the wrong choice.
The questions you need to ask are three fold. The first is how often are you planning on doing deletes/inserts in the middle relative to random reads. The second is how important is using a numeric position compared to an iterator. Finally, is order in your structure important.
If the answer to the first question is random reads will be more prevalent than a vector/array will probably work well. Note iterating through a data structure is not considered a random read even if the operator[] notation is used. For the second question, if you absolutely require numeric position than a vector/array will be required even though this may lead to a performance hit. Later testing may show this performance hit is negligible relative to the easier coding with numerical positions. Finally if order is unimportant you can insert and delete in a vector/array with an O(1) algorithm. A sample algorithm is shown below.
template <class T>
void erase(vector<T> & vect, int index) //note: vector cannot be const since you are changing vector
{
vect[index]= vect.back();//move the item in the back to the index
vect.pop_back(); //delete the item in the back
}
template <class T>
void insert(vector<T> & vect, int index, T value) //note: vector cannot be const since you are changing vector
{
vect.push_back(vect[index]);//insert the item at index to the back of the vector
vect[index] = value; //replace the item at index with value
}

I Believe it's as per your need if one needs more insert/to delete in starting or middle use list(doubly-linked internally) if one needs to access data randomly and addition to last element use array ( vector have dynamic allocation but if you require more operation as a sort, resize, etc use vector)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js