Highly parallel deque - c++

Is there an implementation (boost or otherwise) of a highly parallel deque? In particular, I want to be able to say things like this (pseudocode):
parallel.for(deque.erase, list<locations>);
In other words, remove in parallel all the items whose positions are passed in, e.g., a list of locations in the deque. At the same time, I don't want this operation to block other deletes or inserts that have nothing to do with it.
So for example, thread 1 could be (parallel) deleting items at locations 1, 3, 7, 9; thread 2 could be (parallel) inserting into the deque (parallel insertions can be push_back and seem easy unless trying to insert into old erased locations); and thread 3 could be (parallel) deleting locations 2, 4, 8 [note that erases from different threads never target the same locations]. Trying to erase an already erased location (holding a sentinel value) is an error. That probably means that locations are stateful until some compaction occurs (which requires a lock). Erased locations could hold a sentinel value that tells other threads "you can insert into me". So the interface may not be push_back, but push_available.
Thinking out loud, I realize that the deque may become fragmented and grow (a memory leak) because push_backs don't fill erased slots (?), but some sort of compaction seems possible eventually.
The parallel_deque should also allow locking, where no insertions or deletions are allowed until the lock is released. For example, when printing the state (contents) of the deque...
A deque can have multiple implementation strategies, and some may be more amenable than others to this sort of parallelism.
I also realize that this question might be too vague and get shot down before interesting answers are given.

It sounds like you really want a node-based container (which doesn't invalidate iterators) but with contiguous storage.
I'd suggest constructing a linked-list on top of contiguously stored elements using Boost Intrusive.
However, before that, I'd live with the simplest data structure you can think of until you do know what exactly you want and why.
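As one possible reading of "the simplest data structure you can think of", here is a minimal sketch, assuming a single mutex is acceptable to start with and that C++17 is available. The class name `SlotPool` and the member `live_count` are hypothetical; `push_available` and the "erased sentinel" idea come from the question, with an empty `std::optional` standing in for the sentinel. It is not lock-free and does no compaction.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical sketch: a location keeps its index until it is reused, and an
// empty std::optional plays the role of the "erased" sentinel.
template <typename T>
class SlotPool {
public:
    // Stores value in a previously erased slot if one exists, otherwise
    // appends a new slot. Returns the location of the stored value.
    std::size_t push_available(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!free_.empty()) {
            const std::size_t index = free_.back();
            free_.pop_back();
            assert(!slots_[index].has_value());  // free slots must be empty
            slots_[index] = std::move(value);
            return index;
        }
        slots_.emplace_back(std::move(value));
        return slots_.size() - 1;
    }

    // Erasing an out-of-range or already erased location is an error.
    void erase(std::size_t index) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (index >= slots_.size() || !slots_[index].has_value()) {
            throw std::logic_error("erase of an empty location");
        }
        slots_[index].reset();
        free_.push_back(index);
    }

    // Number of live (non-erased) elements.
    std::size_t live_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return slots_.size() - free_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::optional<T>> slots_;  // empty optional == erased
    std::vector<std::size_t> free_;        // indices available for reuse
};
```

Several threads can call `push_available` and `erase` on one `SlotPool` safely, but they serialize on the mutex. Measuring that contention first would show whether a finer-grained design is actually needed.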

Related

What advantages do vectors have over linked lists

I had the following exchange with my professor which wasn't very satisfying. I included my parts of the exchange which should be enough to get my point across.
"For vectors, does the C++ implementation traverse through each element of the old dynamically allocated array and free it?
(Edit: I mean, when resizing and adding elements, either by push_back or resize)
I am especially curious because the book tries to make the case that linked lists are troublesome because of having to traverse each time. It does not seem to me that vectors have a huge advantage in that regard.
The main benefit of vectors I can see is the convenience and fast access, but not much more. As in, every time you try to do something other than accessing, you will be traversing through everything to move and free memory. Is that correct?"
After his reply, I added.
"Professor xxxx,
I went out to test, and in fact, the addresses change if you either resize or push_back, so my assumption that the old addresses are freed is correct. I can only assume that the program will have to go to each element to free it, and if that's correct, won't the insertion of new things be costly in terms of time, even more so than traversing linked lists?
Can you kindly correct the following statement if it states any incorrect facts or assumptions? Using vectors in any way other than as arrays (for any purpose other than accessing already stored data) means that linked lists will almost always be faster, because with vectors you would not only traverse the elements, you would traverse them, free them, and then create a whole new array to accommodate the new space. That is because the next address after the last element of the current vector could have a pointer variable pointing to it, and using that address would cause such strange behavior that I cannot imagine the misery of the poor soul who tries to figure out what went wrong."
TL;DR:
The disadvantage of linked lists is traversal, but most vector operations (push_back, resize(), etc.) require traversal anyway, so how exactly are vectors faster?
There are several things that are faster than you expect:
When a vector reallocates, the original elements are destroyed, not freed, one by one. Their storage is then freed all at once. This is as opposed to a linked list, where each node is allocated and freed individually. But this is somewhat moot because:
A vector batches reallocations. std::vector is specified to have an amortized constant insertion cost, which implies that it avoids reallocating every time you push_back, to the extent that this cost becomes negligible when considering complexity. Typical implementations multiply the vector's capacity by a fixed factor every time it is exceeded, so when performing the costly reallocation it provides room for the next several push_backs. These then do not need to traverse the vector or allocate anything.
A vector is extremely cache-friendly. This makes all sequential operations on a vector blazing fast, and can counter-intuitively outperform a linked list in many cases, especially in long-running applications where memory might become fragmented.
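The batching of reallocations described above can be observed directly. The following is a small sketch, not a benchmark: the function name `count_reallocations` is made up for illustration, and it treats any change in `capacity()` as evidence of a reallocation. The exact count depends on the standard library's growth factor, which the standard does not fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times push_back had to reallocate while appending n ints.
// A change in capacity() is used as the signal that a reallocation happened.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == n);  // sanity check: every element was appended
    return reallocations;
}
```

With a geometric growth policy, appending 100000 elements should trigger only a few dozen reallocations at most, so the vast majority of `push_back` calls touch nothing but the new element.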
As a complement to the answer and comments already given...
The std::vector has its elements stored contiguously in memory, which is not the case for a linked list.
The direct consequence is that element access is trivial for a std::vector while it's not for a linked list.
For example, if I want to access the nth element of a linked list, I have to iterate through the list until reaching the desired element.
But on the other hand, the linked list will perform better if we want to insert a new element inside.
Indeed, for a linked list, we have to iterate until we reach the desired position, then we just have to change the connections between the previous and the next node so that the new element is inserted in-between.
For a std::vector, you have to relocate every element after the desired position (and reallocate if needed, i.e. if adding a new element exceeds the reserved capacity).
So the std::vector is better for element access but is less efficient when inserting an element inside (same thing for removal).

Which STL container best meets these needs?

I would like some advice on which STL container best meets the following needs:
The collection is relatively short-lived.
The collection contains pointers.
Elements are added only at the end. The order of elements must be maintained.
The number of elements is unknown and may vary from hundreds to millions. The number is known only after the final element is added.
I may iterate over the elements several times.
After all elements are added, I need to sort the collection, based on the objects the pointers refer to.
After sorting, I may iterate over the elements several more times.
After that, the collection will be destroyed.
Thread safety is not required.
Here are my thoughts:
list: Requires a separate allocation for each element. More expensive traversal.
vector: Needs to be reallocated as the collection grows. Best sort and traversal performance.
deque: Fewer allocations than list and fewer reallocations than vector. I don't know about behavior with respect to sort.
I am currently using list. The flowchart at "In which scenario do I use a particular STL container?" leads me to deque.
My knowledge of STL is old; I don't know about container types that have been added since 2003, so maybe there's something well-suited that I've never heard of.
std::vector<T*> will be the winner based on the points discussed.
Don't be afraid of the resizing that will need to occur--just reserve() a reasonable amount (say 500 if many of your collections will be around there).
Sorting performance with vector<T*> will also be very good.
Allocation and deallocation of each T will be important. Pay attention to this. For example you may want to allocate thousands of Ts at a time, to reduce the memory allocation overhead (and make it faster to deallocate everything at the end). This is known as an "arena" or "pool". You can probably store 32-bit relative pointers into the arena, saving half the pointer storage space.
And of course, if T is small you might consider storing it by value instead of by pointer.
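A minimal sketch of the suggested approach follows. `Record` and its `key` field are hypothetical stand-ins for the real `T`, the functions `collect` and `sort_by_pointee` are made-up names, and a plain `std::vector<Record>` stands in for the arena. The value 500 passed to `reserve()` is the guess from the answer above, not a tuned number.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical element type, standing in for whatever the pointers refer to.
struct Record {
    int key;
};

// Collects pointers to the records in their original order. reserve() avoids
// the first few reallocations; larger collections still grow as needed.
std::vector<const Record*> collect(const std::vector<Record>& storage) {
    std::vector<const Record*> items;
    items.reserve(500);
    for (const Record& r : storage) {
        items.push_back(&r);
    }
    return items;
}

// Sorts the pointers by the objects they refer to, not by address.
void sort_by_pointee(std::vector<const Record*>& items) {
    std::sort(items.begin(), items.end(),
              [](const Record* a, const Record* b) {
                  assert(a != nullptr && b != nullptr);  // precondition
                  return a->key < b->key;
              });
}
```

The pointers stay valid only as long as `storage` is neither resized nor destroyed, which fits the stated lifetime: build, iterate, sort, iterate, discard.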

Is there a container like the one I am asking for?

I am looking to implement a C++ container with the following properties:
Keeps all the elements contiguous in memory, so that it can be iterated over without causing cache misses.
Expandable: not like arrays, which have a fixed size, but like STL vectors, which adjust the allocated memory to accommodate as many elements as I add.
Does not move elements to a new place in memory when resizing, as std::vector does. I need to keep pointers to the elements in the container, so those pointers must not be invalidated when new elements are added.
Must be compatible with range-based for loops, so that the contents can be iterated over efficiently.
Is there any container out there which meets these requirements, in any external library, or do I have to implement my own in this case?
As some comments have pointed out, not all the desired properties can be implemented together. I have thought this over, and I have an implementation in mind. Since making things entirely contiguous is not possible, some discontinuity can be accommodated. For example, the container allocates space for 10 elements initially, and when that capacity is reached, allocates another block of memory double the size of the previous one, but doesn't copy the existing elements to the new block. Instead, it fills the new block with the new elements I put into it. This minimizes the amount of discontinuity.
So, is there a data structure which already implements that?
IMHO, the data structure that is closest to your needs is the deque in the STL. Basically, it stores chunks of contiguous memory and provides both random access iterators and stability with regard to push_back (your elements stay in the same place, although iterators are invalidated).
The only problem with your constraints is that the memory is not contiguous everywhere but, as others commented, your set of needs is incompatible if you want to fully satisfy them all.
By the way, one sweet thing about this container is that you can also push on the front.
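The stability property mentioned above (elements stay in place across `push_back`, even though iterators are invalidated) can be checked with a short sketch. The function name `first_element_stays_put` is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Returns true if the address of the first element is unchanged after
// appending `extra` more elements with push_back.
bool first_element_stays_put(std::size_t extra) {
    std::deque<int> d;
    d.push_back(42);
    const int* first = &d.front();  // a pointer, not an iterator
    for (std::size_t i = 0; i < extra; ++i) {
        d.push_back(static_cast<int>(i));
    }
    assert(d.size() == extra + 1);  // sanity check on the setup
    return first == &d.front() && *first == 42;
}
```

Only pointers and references survive this way. Iterators obtained before the `push_back` calls must be re-acquired, and inserting or erasing in the middle of a deque invalidates pointers as well.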

Is there ever a reason to use std::list? [duplicate]

After having read this question and looking at some results here, it seems like one should avoid lists in C++ altogether. I always expected that linked lists would be the containers of choice for cases where I only need to iterate over all the contents, because insertion is a matter of pointer manipulation and there is never a need to reallocate.
Apparently, because of "cache locality," lists are iterated over very slowly, so any benefit from having to use less reserve memory or faster addition (which is not that much faster, it seems, from the second link) doesn't seem worth it.
Having said that, when should I, from a performance standpoint, use std::list over std::deque or, if possible, std::vector?
On a side note, will std::forward_list also have lots of cache misses?
std::list is useful in a few corner cases.
However, the general rule of C++ sequential containers is "if your algorithms are compatible, use std::vector. If your algorithms are not compatible, modify your algorithms so you can use std::vector."
Exceptions exist, and here is an attempt to exhaustively list reasons that std::list is a better choice:
When you need to (A) insert into the middle of the container and (B) you need each object's location in memory to be stable. Requirement (B) can usually be removed by having a non-stable container consisting of pointers to elements, so this isn't a strong reason to use std::list.
When you need to (A) insert or delete in the middle of the container (B) orders of magnitude more often than you need to iterate over the container. This is also an extreme corner case: in order to find the element to delete from a list, you usually need to iterate!
Which leads to
When you need to (A) insert or delete in the middle of the container and (B) have iterators to all other elements remain valid. This ends up being a hidden requirement of cases 1 and 2: it is hard to delete or insert more often than you iterate when you don't have persistent iterators, and the stability of iterators and of objects is highly related.
The final case is splicing, which was once a reason to use std::list.
Back in C++03, all versions of std::list::splice could (in theory) be done in O(1) time. However, the extremely efficient forms of splice required that size be an O(n) operation. C++11 has required that size on a list be O(1), so splice's extreme efficiency is limited to the "splicing an entire other list" and "self-splicing a sublist" case. In the case of single element splice, this is just an insert and delete. In the case of sub range splice, the code now has to visit every node in the splice just to count them in order to maintain size as O(1) (except in self-splicing).
So, if you are doing only whole-list splices, or self-list subrange splices, list can do these operations much faster than other non-list containers.
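The whole-list splice described above can be sketched in a few lines. The helper name `append_all` is hypothetical; `std::list::splice` is the standard member function. No string is copied or moved, only the list's internal links are updated.

```cpp
#include <cassert>
#include <list>
#include <string>

// Moves every node of `source` to the end of `target` without copying or
// moving the strings themselves; `source` is left empty.
void append_all(std::list<std::string>& target, std::list<std::string>& source) {
    assert(&target != &source);  // splicing a whole list into itself is not allowed
    target.splice(target.end(), source);
}
```

Pointers, references and iterators to the spliced elements remain valid afterwards and now refer into `target`.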
When should I, from a performance standpoint, use std::list
From a performance standpoint, rarely. The only situation that comes to mind is if you have many lists that you need to split and join to form other lists; a linked list can do this without allocating memory or moving objects.
The real benefit of list is stability: elements don't need to be movable, and iterators and references are never invalidated unless they refer to an element that's been erased.
On a side note, will std::forward_list also have lots of cache misses?
Yes; it's still a linked list.
The biggest improvement that std::list can provide is when you're moving one or more elements from the middle of one list into another list. This splice operation is extremely efficient on list, while it may involve allocation and movement of items in random access containers such as vector. That said, this comes up very rarely, and much of the time vector is your best container in terms of performance and simplicity.
Always profile your code if you suspect performance problems with a container choice.
You choose std::list over std::deque and std::vector (and other contiguous memory containers) when you frequently add/remove items in the middle of your container.

Vector vs Deque insertion in middle

I know that deque is more efficient than vector when insertions are at the front or end, and vector is better if we have to do pointer arithmetic. But which one should we use when we have to perform insertions in the middle, and why?
You might think that a deque would have the advantage, because it stores the data broken up into blocks. However to implement operator[] in constant time requires all those blocks to be the same size. Inserting or deleting an element in the middle still requires shifting all the values on one side or the other, same as a vector. Since the vector is simpler and has better locality for caching, it should come out ahead.
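To compare the two containers on your own data, a small helper that performs the same middle insertion on either one is a reasonable starting point. This is a sketch: the name `insert_in_middle` is made up, and it deliberately contains no timing code, since the result depends on element type, container size and standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <vector>

// Inserts value at the midpoint of a sequence container. For both
// std::vector and std::deque this shifts the elements on one side of the
// insertion point; only the amount of shifting and the memory layout differ.
template <typename Container>
void insert_in_middle(Container& c, const typename Container::value_type& value) {
    const std::size_t old_size = c.size();
    auto pos = std::next(c.begin(), static_cast<std::ptrdiff_t>(old_size / 2));
    c.insert(pos, value);
    assert(c.size() == old_size + 1);  // sanity check: exactly one element added
}
```

Calling this in a loop for a `std::vector<T>` and a `std::deque<T>` of realistic size, wrapped in `std::chrono::steady_clock` measurements, would show which one wins for a given workload.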
The selection criteria for standard library containers are simple. You select a container depending upon:
The type of data you want to store, and
The type of operations you want to perform on the data.
If you want to perform large number of insertions in the middle you are much better off using a std::list.
If the choice is just between a std::deque and std::vector then there are a number of factors to consider:
Typically, there is one more indirection in case of deque to access the elements, so element access and iterator movement of deques are usually a bit slower.
In systems that have size limitations for blocks of memory, a deque might contain more elements because it uses more than one block of memory. Thus, max_size() might be larger for deques.
Deques provide no support to control the capacity and the moment of reallocation. In particular, any insertion or deletion of elements other than at the beginning or end invalidates all pointers, references, and iterators that refer to elements of the deque. However, reallocation may perform better than for vectors, because according to their typical internal structure, deques don't have to copy all elements on reallocation.
Blocks of memory might get freed when they are no longer used, so the memory size of a deque might shrink (this is not a condition imposed by standard but most implementations do)
std::deque could perform better for large containers because it is typically implemented as a sequence of separately allocated contiguous data blocks, as opposed to the single block used in a std::vector. So an insertion in the middle would result in less data being copied from one place to another, and potentially fewer reallocations.
Of course, whether that matters or not depends on the size of the containers and the cost of copying the elements stored. With C++11 move semantics, the cost of the latter is less important. But in the end, the only way of knowing is profiling with a realistic application.
Deque would still be more efficient, as it doesn't have to move half of the array every time you insert an element.
Of course, this will only really matter if you consider large numbers of elements, and even then it is advisable to run a benchmark and see which one works better in your particular case. Remember that premature optimization is the root of all evil.