Good container for insert-in-order? C++


Hi, I was wondering what the best container is for inserting elements in order. I think a map is unnecessary, since I will only be accessing the element at the front, popping it, and then inserting more elements (I'm implementing a pathfinding algorithm, Dijkstra, with weights).
I could probably have used a list and inserted in order myself, but a list cannot be bisected (you can only start traversing from the front or the back), which would hurt performance.

If you only need to access the front and back, std::deque (double-ended queue) fits the bill perfectly.
However, for Dijkstra's algorithm, don't you need a priority queue instead?

If you are using C++ there is a std::priority_queue container adapter in the <queue> header file.
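To make the suggestion concrete, here is a minimal sketch of Dijkstra on top of std::priority_queue. The Graph alias, the function name dijkstra and the (distance, node) entry layout are my own assumptions for illustration, not something from the question. It needs C++17 for the structured bindings.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical graph type: adj[u] holds (neighbour, weight) pairs.
using Graph = std::vector<std::vector<std::pair<int, int>>>;

// Returns the shortest distance from `source` to every node
// (std::numeric_limits<int>::max() for unreachable nodes).
// Assumes non-negative weights.
std::vector<int> dijkstra(const Graph& adj, int source) {
    const int kInf = std::numeric_limits<int>::max();
    std::vector<int> dist(adj.size(), kInf);

    // (distance, node) pairs. std::greater turns the default max-heap
    // into a min-heap, so top() is the closest unsettled node.
    using Entry = std::pair<int, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    dist[source] = 0;
    open.push({0, source});

    while (!open.empty()) {
        auto [d, u] = open.top();
        open.pop();
        if (d > dist[u]) continue;  // stale entry: a shorter path was already found

        for (auto [v, w] : adj[u]) {
            if (d + w < dist[v]) {
                dist[v] = d + w;
                open.push({dist[v], v});  // re-insert instead of decreasing the key
            }
        }
    }
    return dist;
}
```

std::priority_queue has no decrease-key operation, so this sketch pushes a new entry whenever a distance improves and skips outdated entries when they are popped.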

Related

Why is it not possible to have a queue implemented as a vector?

What are the drawbacks of using a std::vector to simulate a queue? I am naively thinking that push_back is used for push, and for pop one just stores the position of the first element and increments it. Why does std::queue not allow a std::vector implementation like this in principle? (I know the reason is that it has no pop_front method, but maybe there is something deeper that makes it slow this way.) Thank you for helping.

"Why does not std::queue allow a std::vector implementation like this"
std::queue is a simple container adapter. Its pop function delegates to the pop_front function of the underlying container. std::vector has no pop_front operation, so std::queue cannot adapt it.
"but maybe there is something deeper that makes it slow this way"
Inserting or erasing at the front of a vector is slow because all remaining elements have to be shifted, which has linear cost. This is why vector doesn't provide pop_front.
"stores the position of the first element and increments it."
It's possible to implement a container that stores the position of the first element within a buffer, but vector is not an implementation of such a container. Storing that position has an overhead that vector doesn't need to pay, and so it doesn't.
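For illustration, a container along the lines the question describes could look roughly like this. VectorQueue and its compaction rule (erase the dead prefix once it makes up half the buffer) are my own assumptions, not a standard facility. It is only a sketch of the "store the position of the first element" idea.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical FIFO queue built from a std::vector plus a head index.
template <typename T>
class VectorQueue {
public:
    void push(T value) { data_.push_back(std::move(value)); }

    // Precondition: !empty().
    T& front() { return data_[head_]; }

    // Precondition: !empty().
    void pop() {
        ++head_;
        // Reclaim the dead prefix once it makes up at least half of the
        // buffer, so the linear-time erase is amortised over many pops.
        if (head_ * 2 >= data_.size()) {
            data_.erase(data_.begin(),
                        data_.begin() + static_cast<std::ptrdiff_t>(head_));
            head_ = 0;
        }
    }

    bool empty() const { return head_ == data_.size(); }
    std::size_t size() const { return data_.size() - head_; }

private:
    std::vector<T> data_;
    std::size_t head_ = 0;  // index of the current front element
};
```

The extra index and the occasional compaction are exactly the overhead a plain vector does not pay. In practice std::deque, the default container for std::queue, already covers this use case.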

Is the overhead for utilizing a vector with insert() as a priority queue relative to utilizing a heap? (c++)

I am currently working on a project where I implemented a vector of structure pointers to use as a priority queue. I use a for loop to determine position in the vector (if not less than the back) and then use insert() to place the struct pointer in position in queue. I am using back() as the front of queue so I can maintain the pop functionality of the vector.
I was just trying to determine whether using the heap library instead would give a speed increase, as this project is time-critical. I can provide code if you'd like. The size of the heap/vector may grow tremendously, as this is a Tower of Hanoi A* search algorithm.
Figured I would ask for future knowledge as well to save me some debug breakpoint shuffles if anyone knew offhand.

I've made a simple benchmark of first inserting N random ints into a priority queue and then popping N top elements.
Expectedly, sorted std::vector with linear search wins when the queue size is small, and std::priority_queue, which is implemented as a max-heap with O(log N) worst-case insertion time, wins when the queue size is large.
Benchmark code can be found here.
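For reference, the two approaches being compared can be sketched side by side. This is not the benchmark code. The helper names are made up for illustration, and the sorted-vector version uses std::upper_bound (binary search) to find the position rather than the linear search mentioned above.

```cpp
#include <algorithm>
#include <queue>
#include <vector>

// Sorted-vector "priority queue": kept in ascending order, so the
// largest element always sits at back().
void sorted_push(std::vector<int>& v, int value) {
    // Finding the position is O(log N), but insert() still has to shift
    // the tail of the vector, which is O(N) in the worst case.
    v.insert(std::upper_bound(v.begin(), v.end(), value), value);
}

// Precondition: !v.empty(). Removes and returns the largest element in O(1).
int sorted_pop_max(std::vector<int>& v) {
    int top = v.back();
    v.pop_back();
    return top;
}

// The same operation on std::priority_queue (a max-heap by default):
// push and pop are both O(log N).
int heap_pop_max(std::priority_queue<int>& pq) {
    int top = pq.top();
    pq.pop();
    return top;
}
```

Both produce the elements in the same order. The difference is only where the cost lands: the sorted vector pays on insertion, the heap pays a little on both push and pop.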

std::list sort algorithm runtime

I have a list of elements that is almost in the correct order and the elements are just off by a relatively small amount of places compared to their correct position (e.g. no element that is supposed to be in the front of the list is in the end).
< TL;DR >
Practical origin: I have an incoming stream of UDP packets containing signals, each marked with a timestamp. Evaluating the data has shown that the packets were not sent (or received) in the right order, so the timestamp is not constantly increasing but jitters a bit. To export the data I need to sort it first.
< /TL;DR >
I want to use std::list::sort() to sort this list.
What sorting algorithm does std::list::sort() use, and how is it affected by the fact that the list is almost sorted? I have a "feeling" that a divide-and-conquer based algorithm might profit from it.
Is there a more efficient algorithm for my quite specific problem?

If every element is within k positions of its proper place, then insertion sort will take less than kN comparisons and swaps/moves. It's also very easy to implement.
Compare this to the N*log(N) operations required by merge sort or quick sort to see if that will work better for you.
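A minimal insertion sort, shown here on a std::vector for brevity (the function name is just for illustration). When every element is at most k positions from its final place, the inner loop runs at most k times per element.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sorts v in ascending order; stable. Cheap when the input is nearly sorted.
template <typename T>
void insertion_sort(std::vector<T>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        T value = std::move(v[i]);
        std::size_t j = i;
        // Shift larger elements one slot to the right until the correct
        // position for `value` is found.
        while (j > 0 && value < v[j - 1]) {
            v[j] = std::move(v[j - 1]);
            --j;
        }
        v[j] = std::move(value);
    }
}
```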

The standard does not specify which algorithm std::list::sort uses, only that it is stable and performs approximately N log N comparisons. In practice implementations use a merge sort.
If you are appending packets to the end of the "queue" as you receive them, and you want the queue to be sorted at all times, then you can expect the correct position of any new packet to be near the "end" almost every time.
Therefore, rather than sorting the entire queue, just insert each packet at its correct position. Start at the back of the queue and compare the new packet's timestamp with those already there; insert it after the first packet with a smaller timestamp (most likely the last one), or at the front if there is no such packet because things are badly out of order.
Alternatively, if you want to add all packets first and then sort once, bubble sort should do reasonably well because the list is already nearly sorted.
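The back-to-front insertion described above might look like this. The Packet struct and the function name are assumptions for the sake of the example.

```cpp
#include <cstdint>
#include <iterator>
#include <list>

// Hypothetical packet type; only the timestamp matters here.
struct Packet {
    std::uint64_t timestamp;
    int payload;
};

// Inserts p so that queue stays sorted by ascending timestamp. The scan
// starts at the back because new packets usually belong near the end.
void insert_sorted_from_back(std::list<Packet>& queue, const Packet& p) {
    auto pos = queue.end();
    while (pos != queue.begin() && std::prev(pos)->timestamp > p.timestamp) {
        --pos;
    }
    queue.insert(pos, p);  // inserts in front of pos
}
```

With only a little jitter the loop stops after a step or two, so each insertion is effectively constant time.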

On mostly-sorted data, insertion sort and bubble sort are among the common algorithms that perform best.
Note also that having a list structure puts an extra constraint on indexed access, so algorithms that require indexed access will perform extra poorly. Therefore insertion sort is an even better fit since it needs only sequential access.

In the case of Visual Studio prior to 2015, a bottom-up merge sort using 26 internal lists is used, following the algorithm shown in this wiki article:
https://en.wikipedia.org/wiki/Merge_sort#Bottom-up_implementation_using_lists
Visual Studio 2015 added support for not having a default allocator. The 26 internal lists initializers could have been expanded to 26 instances of initializers with user specified allocators, specifically: _Myt _Binlist[_MAXBINS+1] = { _Myt(get_allocator()), ... , _Myt(get_allocator())};, but instead someone at Microsoft switched to a top down merge sort based on iterators. This is slower, but has the advantage that it doesn't need special recovery if the compare throws an exception. The author of that change pointed out the fact that if performance is a goal, then it's faster to copy the list to an array or vector, sort the array or vector, then create a new list from the array or vector.
There's a prior thread about this.
`std::list<>::sort()` - why the sudden switch to top-down strategy?
In your case something like an insertion sort on a doubly linked list should be faster: if an out-of-order node is found, remove it from the list, scan backwards or forwards to the proper spot, and insert the node back into the list. If you want to use std::list, you can use iterators, erase, and emplace to "move" nodes, but that involves freeing and reallocating a node for each move. It would be faster to implement this with your own doubly linked list, in which case you can just manipulate the links, avoiding the freeing and reallocation of memory that comes with using std::list this way.
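As an alternative to erase and emplace, std::list::splice relinks an existing node without allocating or freeing anything, so the insertion sort can stay on std::list. A rough sketch, with a function name of my own choosing:

```cpp
#include <iterator>
#include <list>

// Insertion sort on std::list that moves nodes with splice() instead of
// erasing and re-inserting them. Sorts ascending; stable.
template <typename T>
void insertion_sort_list(std::list<T>& lst) {
    if (lst.size() < 2) return;
    auto cur = std::next(lst.begin());
    while (cur != lst.end()) {
        auto next = std::next(cur);  // remember the successor before moving cur
        // Walk backwards past every element greater than *cur.
        auto pos = cur;
        while (pos != lst.begin() && *cur < *std::prev(pos)) {
            --pos;
        }
        if (pos != cur) {
            // Relink the node in front of pos: no allocation, and all
            // iterators remain valid.
            lst.splice(pos, lst, cur);
        }
        cur = next;
    }
}
```

For a list where every element is only a few places off, the backward scan is short and the whole pass is close to linear.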

STL iterable container like priority_queue

I'm new to STL containers (and C++ in general) so thought I would reach out to the community for help. I basically want to have a priority_queue that supports constant iteration. Now, it seems that std::priority_queue doesn't support iteration, so I'm going to have to use something else, but I'm not sure exactly what.
Requirements:
Maintains order on insertion (like a priority queue)
Pop from top of list
Get const access to each element of the list (don't care about the order in the queue for this stage)
One option would be to keep a priority_queue and separately have an unordered_set of references, but I'd rather not have two containers floating around. I could also use a deque and search through for the right insertion position, but I'd rather have the container manage the sorting for me if possible (and constant-time insertion would be nicer than linear-time). Any suggestions?

There are two options that come to mind:
1) Implement your own iterable priority queue, using std::vector and the heap operation algorithms (see Heap Operations here).
2) Derive (privately) from priority_queue. This gives you access to the underlying container via the protected data member c. You can then expose iteration, random access, and other methods of interest in your public interface.
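A minimal sketch of option 2, assuming the default std::vector as the underlying container. The class name IterablePriorityQueue is made up for this example.

```cpp
#include <queue>
#include <vector>

// Hypothetical wrapper: a priority queue that also allows read-only iteration.
template <typename T>
class IterablePriorityQueue : private std::priority_queue<T> {
    using Base = std::priority_queue<T>;

public:
    // Re-export the usual priority_queue interface.
    using Base::push;
    using Base::pop;
    using Base::top;
    using Base::empty;
    using Base::size;

    // Const iteration over the underlying vector (the protected member c).
    // Elements come out in heap order, not in sorted order.
    typename std::vector<T>::const_iterator begin() const { return this->c.begin(); }
    typename std::vector<T>::const_iterator end() const { return this->c.end(); }
};
```

Only const iterators are exposed on purpose: modifying elements through an iterator could break the heap invariant.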

Using a std::vector might be enough, as others have already pointed out, but if you want a ready-made implementation, consider Boost.Heap (a library with several priority queue containers): http://www.boost.org/doc/libs/1_53_0/doc/html/heap.html
Boost is a collection of libraries that complements the standard library (which is not really big). A lot of C++ developers have Boost installed on their development machine so they can use it when needed. Just be careful in your choice of libraries.

You can use (ordered) set as a queue. set.begin() will be your top element, and you can pop it via erase(set.begin()).
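A short sketch of that idea. I use std::multiset here so that duplicate priorities are kept, which is an assumption on my part; with std::set duplicates would be silently dropped. The helper name pop_front is made up.

```cpp
#include <set>

// Precondition: !queue.empty(). Removes and returns the smallest element.
int pop_front(std::multiset<int>& queue) {
    int top = *queue.begin();
    queue.erase(queue.begin());  // erasing by iterator removes exactly one element
    return top;
}
```

Insertion is O(log N), and iterating from begin() to end() visits the elements in sorted order.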

Have you looked at the heap algorithms (std::make_heap)? They do not keep the elements fully ordered inside the container, but they do give you the prioritised "pop from top of list" that you need.

C++ boost - Is there a container working like a queue with direct key access?

I was wondering about a queue-like container that also has key access, like a map.
My goal is simple: I want a FIFO queue, but if I insert an element and an element with the same key is already in the queue, I want the new element to replace the one already in the queue. For example, a map ordered by insertion time would work.
If there is no container like that, do you think it can be implemented by using both a queue and a map?

Boost multi-index provides this kind of container.
To implement it myself, I'd probably go for a map whose values consist of a linked list node plus a payload. The list node could be hand-rolled, or could be Boost intrusive.
Note that the main point of the queue adaptor is to hide most of the interface of Sequence, but you want to mess with the details it hides. So I think you should aim to reproduce the interface of queue (slightly modified with your altered semantics for push) rather than actually use it.

Obviously what you want can be done simply with the queue-like container, but you would have to spend O(n) time on every insertion to determine if the element is already present. If you implement your queue based on something like std::vector you could use the binary search and basically speed up your insertion to O(log n) (that would still require O(n) operations when the memory reallocation is done).
If this is fine, just stick to it. The variant with additional container might give you a performance boost, but it's also likely to be error-prone to write and if the first solution is sufficient, just use it.
In the second scenario you might want to store your elements twice in different containers - the original queue and something like a map (or sometimes a hashmap may perform better). The map is used only to determine if the element is already present in the container or not - and if YES, you will have to update it in your queue.
Basically that gives us O(1) complexity for hashmap lookups (in real world this might get uglier because of the collisions - hashmaps aren't really good for determining element existence) and O(1) insertion time for the case when no update is required and O(n) insertion time for the case update is needed.
Based on the percentage of the actual update operations, the actual insertion performance may vary from O(1) to O(n), but this scheme will definitely outperform the first one if the number of updates is small enough.
Still, you have to insert your elements into two containers simultaneously, and do the same whenever an element is deleted, so I would think twice: do I really need that performance boost?

I see an easy way of doing this with a queue and, optionally, a map.
Define some sort of == operator for your elements.
Then simply have a queue and search it for your element every time you want to insert.
You could optimize this by keeping a map from elements to their locations in the queue instead of searching the queue every time.
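One way the queue-plus-map combination could be put together is sketched below. KeyedQueue, the string key and the int payload are placeholders, and I am assuming that a replaced element keeps its original position in the queue. std::list is used for the queue because its iterators stay valid while other elements are added or removed.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical FIFO queue with replace-on-duplicate-key semantics.
class KeyedQueue {
public:
    using Item = std::pair<std::string, int>;  // (key, payload)

    // Appends the item, or overwrites the payload in place if the key is
    // already queued (the element keeps its position in the queue).
    void push(const std::string& key, int value) {
        auto found = index_.find(key);
        if (found != index_.end()) {
            found->second->second = value;
            return;
        }
        items_.emplace_back(key, value);
        index_[key] = std::prev(items_.end());
    }

    // Precondition: !empty(). Removes and returns the oldest item.
    Item pop() {
        Item front = std::move(items_.front());
        index_.erase(front.first);
        items_.pop_front();
        return front;
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    std::list<Item> items_;  // FIFO order
    std::unordered_map<std::string, std::list<Item>::iterator> index_;  // key -> node
};
```

With the iterator stored in the map, both a fresh insertion and a replacement take O(1) on average, at the price of keeping the two containers in sync.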
