What is an efficient way of implementing a queue? I am asking in order to learn how it is implemented.
EDIT:
I looked into the std::queue code in order to learn how it is implemented, but the template code makes it difficult to understand. Surely a more efficient approach than a plain linked list is being used?
The most efficient way is to have someone else do it.
Both C++ and C# (and .NET et al) have one in their native libraries.
Of course for any production code you should rely on a robust library implementation that's already withstood the test of time.
That said, for self-teaching it can be fun to write one yourself. I've done it before.
The most efficient way I know of is to use a similar approach to what most resizable collections do internally: store an array, which is increased in size (typically doubled) as needed when the collection's size reaches the length of the array.
A queue is a bit different from an ordinary collection, though, because you want to be able to pop off from the opposite end from where elements are pushed.
Obviously, to remove the first element and shift all other elements down one index would be costly and pointless. So instead you keep track of the starting and ending indices yourself. When the collection reaches the end of the array you use % to start pushing elements back at the beginning.
The key is simply working out your math so that you handle all cases, e.g., correctly growing the array when the queue is full, getting your bounds checks right, incrementing the start index (or looping back to 0) on every pop, etc.
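For illustration, here is a minimal single-threaded sketch of the growable ring-buffer approach described above (the class name and growth policy are just my own choices; error handling is omitted):

#include <cstddef>
#include <vector>

template <typename T>
class RingQueue {
public:
    void push(const T& value) {
        if (count_ == data_.size())
            grow();                                   // full (or no storage yet): double the capacity
        data_[(start_ + count_) % data_.size()] = value;
        ++count_;
    }

    T pop() {                                         // precondition: !empty()
        T value = data_[start_];
        start_ = (start_ + 1) % data_.size();         // advance the start index, wrapping at the end
        --count_;
        return value;
    }

    bool empty() const { return count_ == 0; }
    std::size_t size() const { return count_; }

private:
    void grow() {
        std::vector<T> bigger(data_.empty() ? 8 : data_.size() * 2);
        for (std::size_t i = 0; i < count_; ++i)      // copy the old contents, unwrapped, to the front
            bigger[i] = data_[(start_ + i) % data_.size()];
        data_.swap(bigger);
        start_ = 0;
    }

    std::vector<T> data_;
    std::size_t start_ = 0;                           // index of the front element
    std::size_t count_ = 0;                           // number of stored elements
};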
Clearly, the design I've described above has many limitations: thread safety, for example. But for a simple, single-threaded, efficient implementation, it's a pretty good starting point.
Good luck -- and I also recommend posting your code if/when you've got one that you think is working!
If you can accept a maximum size for the queue, a circular buffer is extremely efficient. Because the library implementations can't assume a maximum size, they don't use this technique.
In the most generic sense, a linked-list would be your best bet if you maintain a front and rear pointer. In this case, queue insertion and deletion is an O(1) operation. You can also implement one using an array and maintaining indices for the front and rear. The math is marginally more involved (when you insert or delete you have to take into account "wrapping" to avoid going out of bounds).
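For example, a bare-bones sketch of the linked-list variant with front and rear pointers (names are illustrative, and error handling is limited to an assert):

#include <cassert>

template <typename T>
class LinkedQueue {
    struct Node {
        T value;
        Node* next;
    };
    Node* front_ = nullptr;   // dequeue from here
    Node* rear_  = nullptr;   // enqueue here

public:
    ~LinkedQueue() { while (front_) dequeue(); }

    void enqueue(const T& value) {
        Node* node = new Node{value, nullptr};
        if (rear_) rear_->next = node;   // append after the current rear
        else       front_ = node;        // queue was empty
        rear_ = node;
    }

    T dequeue() {
        assert(front_ != nullptr);
        Node* node = front_;
        front_ = node->next;
        if (!front_) rear_ = nullptr;    // queue became empty
        T value = node->value;
        delete node;
        return value;
    }

    bool empty() const { return front_ == nullptr; }
};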
For C++, std::queue<>.
Do you understand how a queue works?
Do you understand how stl queue works?
Do you understand that "most efficient" is an abstract concept that can't hold true for every case?
If you get all of that, the "most efficient C++ queue algorithm" will come to you.
If you need a thread-aware queue implementation you can try my own library. It's not complete yet, but it's well documented.
Edit: it's implemented with linked lists.
How many threads may be reading your queue at once? How many may be writing it at once? Can one thread be reading while another is writing? Do you want to pre-allocate space for the maximum size of queue? Do you need a way for a reader to block while waiting for data, or for a writer to block when the queue is full? How many objects per second do you expect to be feeding through the queue?
The optimal approach will depend upon the answers to those questions.
If your main goal is to learn how it is implemented, then you should work with linked lists. They are fun to work with, really show the difference from sequential arrays, and teach a lot about memory.
This is a question about writing your own data structure programs (which I think is called implementing them?). I am writing in C++.
When I did a stack assignment, popping something off the stack simply changed the index of the top variable as it was more about where the user can access items vs actually physically removing the item (it's not accessible once the top variable is changed, so my professor said these things don't actually need to be deleted/removed). Once top is moved down, there's more room for items on top.
In a queue, my understanding is that one dequeues from the front (first in, first out). When this happens, would all the remaining items need to be moved up one index?
For example, if I have a queue of 3, 5, 7 and I dequeue one, would I simply have an int variable called "front" that I increment from 0 to 1 so that front is now at the index for number 5? My concern is that by doing this, the queue will no longer be able to hold the max number of items, so I would think that I would move everyone down one index so that there is still room to add things at the back.
TL;DR: yes, I would say you should move the elements.
Let's start from the beginning:
When I did a stack assignment, [...] my professor said these things don't actually need to be deleted/removed
It's important (imho) to understand why the professor said this.
First you need to be aware that this strategy works only for element types that have a trivial destructor (e.g. basic data types like int). This is a restriction of your particular implementation of the stack. Another limitation of your stack is that it has a limited maximum capacity. A proper stack container like std::stack has neither of these shortcomings. This is perfectly fine for your assignment, because implementing a stack without those restrictions would involve dynamic memory allocation and/or advanced techniques like placement new (or using an underlying container that already does all of this, like std::deque). But the point of the assignment is to teach you the concepts of a stack. Learning one new thing can be difficult. Learning three new and unrelated complex things at the same time would be irresponsible. So that's why your professor said "these things don't actually need to be deleted/removed".
Now let's move on to the queue. Regardless of your choice, your queue still has the above two limitations: it is applicable only to types with a trivial destructor, and it cannot handle an unlimited number of elements.
If you move the elements, then your queue will have the limitation of a (relatively small) maximum capacity (just like the stack you implemented).
If you don't move the elements, then your queue will be limited to a maximum number of push operations, regardless of the pop operations, just as you identified. So once the maximum number of push operations has been reached, you can't push into the queue anymore, even if you have popped elements out of it.
The latter is obviously more limited than the former. But since you are already supposed to implement a queue with limited capabilities, the question is whether the more limited (and easiest to implement) version is acceptable in your exercise or not. Only your professor can clarify this. He/she may say something along the lines of: "I am only interested in you learning the abstract concepts of the queue, so you can assume/imagine an infinite capacity; you just need to deal with the indices and the correctness of push/pop operations". Or he/she may expect you to move the elements, since it might be something you are already supposed to know how to do. However, I would recommend implementing the moves regardless, because it is not that difficult and it is a useful skill.
As noted in the comments there are other ways of implementing a queue, like a circular queue.
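For what it's worth, a minimal sketch of the "move the elements" approach recommended above might look like this (a fixed-capacity queue of ints; the capacity and names are only illustrative, and there is no error handling):

#include <cstddef>

class IntQueue {
    static const std::size_t kCapacity = 100;
    int data_[kCapacity];
    std::size_t count_ = 0;

public:
    void enqueue(int value) {            // assumes count_ < kCapacity
        data_[count_++] = value;
    }

    int dequeue() {                      // assumes the queue is not empty
        int front = data_[0];
        for (std::size_t i = 1; i < count_; ++i)
            data_[i - 1] = data_[i];     // shift everything down one slot
        --count_;
        return front;                    // the full capacity is now available again
    }

    std::size_t size() const { return count_; }
};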
I'll give some context as to why I'm trying to do this, but ultimately the context can be ignored as it is largely a classic Computer Science and C++ problem (which must surely have been asked before, but a couple of cursory searches didn't turn up anything...)
I'm working with (large) real time streaming point clouds, and have a case where I need to take 2/3/4 point clouds from multiple sensors and stick them together to create one big point cloud. I am in a situation where I do actually need all the data in one structure, whereas normally when people are just visualising point clouds they can get away with feeding them into the viewer separately.
I'm using Point Cloud Library 1.6, and on closer inspection its PointCloud class (under <pcl/point_cloud.h> if you're interested) stores all data points in an STL vector.
Now we're back in vanilla CS land...
PointCloud has a += operator for adding the contents of one point cloud to another. So far so good. But this method is pretty inefficient - if I understand it correctly, it 1) resizes the target vector, then 2) runs through all Points in the other vector, and copies them over.
This looks to me like a case of O(n) time complexity, which normally might not be too bad, but is bad news when dealing with at least 300K points per cloud in real time.
The vectors don't need to be sorted or analysed, they just need to be 'stuck together' at the memory level, so the program knows that once it hits the end of the first vector it just has to jump to the start location of the second one. In other words, I'm looking for an O(1) vector merging method. Is there any way to do this in the STL? Or is it more the domain of something like std::list::splice?
Note: This class is a pretty fundamental part of PCL, so 'non-invasive surgery' is preferable. If changes need to be made to the class itself (e.g. changing from vector to list, or reserving memory), they have to be considered in terms of the knock on effects on the rest of PCL, which could be far reaching.
Update: I have filed an issue over at PCL's GitHub repo to get a discussion going with the library authors about the suggestions below. Once there's some kind of resolution on which approach to go with, I'll accept the relevant suggestion(s) as answers.
A vector is not a list: it represents a sequence, but with the additional requirement that elements be stored in contiguous memory. You cannot just bundle two vectors (whose buffers won't be contiguous) into a single vector without moving objects around.
This problem has been solved many times before, for example with rope classes for strings.
The basic approach is to make a new container type that stores pointers to point clouds. This is like a std::deque except that yours will have chunks of variable size. Unless your clouds chunk into standard sizes?
With this new container your iterators start in the first chunk, proceed to the end then move into the next chunk. Doing random access in such a container with variable sized chunks requires a binary search. In fact, such a data structure could be written as a distorted form of B+ tree.
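A rough sketch of such a container, under the assumption that random access via a binary search over cumulative chunk sizes is acceptable (names are illustrative; a real version would also expose iterators):

#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

template <typename T>
class ChunkedSequence {
public:
    // Appending a whole chunk is O(1): only a shared_ptr is stored, nothing is copied.
    void append(std::shared_ptr<std::vector<T>> chunk) {
        total_ += chunk->size();
        ends_.push_back(total_);               // cumulative end index of this chunk
        chunks_.push_back(std::move(chunk));
    }

    std::size_t size() const { return total_; }

    // Random access: binary search for the owning chunk, then index into it.
    const T& operator[](std::size_t i) const {
        std::size_t c = std::upper_bound(ends_.begin(), ends_.end(), i) - ends_.begin();
        std::size_t begin = (c == 0) ? 0 : ends_[c - 1];
        return (*chunks_[c])[i - begin];
    }

private:
    std::vector<std::shared_ptr<std::vector<T>>> chunks_;
    std::vector<std::size_t> ends_;            // cumulative sizes, one entry per chunk
    std::size_t total_ = 0;
};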
There is no vector equivalent of splice - there can't be, specifically because of the memory layout requirements, which are probably the reason it was selected in the first place.
There's also no constant-time way to concatenate vectors.
I can think of one (fragile) way to concatenate raw arrays in constant time, but it depends on them being aligned on page boundaries at both the beginning and the end, and then re-mapping them to be adjacent. This is going to be pretty hard to generalise.
There's another way to make something that looks like a concatenated vector, and that's with a wrapper container which works like a deque, and provides a unified iterator and operator[] over them. I don't know if the point cloud library is flexible enough to work with this, though. (Jamin's suggestion is essentially to use something like this instead of the vector, and Zan's is roughly what I had in mind).
No, you can't concatenate two vectors by a simple link, you actually have to copy them.
However! If you implement move-semantics in your element type, you'd probably get significant speed gains, depending on what your element contains. This won't help if your elements don't contain any non-trivial types.
Further, if you have your vector reserve the needed memory well in advance, that would also help speed things up by not requiring a resize (which would cause an undesirably huge new allocation, possibly having to defragment at that memory size, and then a huge memcpy).
Barring that, you might want to create some kind of mix between linked lists and vectors, with each 'element' of the list being a vector of, say, 10k elements, so you only need to jump list links once every 10k elements. This lets you grow dynamically much more easily and makes your concatenation a breeze.
std::list<std::vector<element>> forIllustrationOnly; // pseudocode: std::list has no operator[], so just roll your own custom type
index = 52403;
listIndex = index / 10000;    // which inner vector
vectorIndex = index % 10000;  // offset inside that vector
forIllustrationOnly[listIndex][vectorIndex] = still fairly fast lookups
forIllustrationOnly.push_back(vector-of-points) = much faster appending and removing of blocks of points
You will not get this scaling behaviour with a vector, because with a vector, you do not get around the copying. And you can not copy an arbitrary amount of data in fixed time.
I do not know PointCloud, but if you can use other list types, e.g. a linked list, this behaviour is entirely possible. You might find a linked list implementation which works in your environment and which can simply stick the second list onto the end of the first list, as you imagined.
Take a look at Boost.Range's join at http://www.boost.org/doc/libs/1_54_0/libs/range/doc/html/range/reference/utilities/join.html
This will take 2 ranges and join them. Say you have vector1 and vector2.
You should be able to write
auto combined = boost::join(vector1, vector2);   // from <boost/range/join.hpp>
Then you can use combined with algorithms, etc as needed.
There is no O(1) copy for a vector, ever. But you should check:
Is the element type trivially copyable (i.e. memcpy-able)?
If so, is my vector implementation leveraging this fact, or is it naively looping over all 300k elements, executing a trivial assignment (or worse, a copy-constructor call) for each element?
What I have seen is that, while both memcpy and an assignment-for-loop have O(n) complexity, a solution leveraging memcpy can be much, much faster.
So, the problem might be that the vector implementation is suboptimal for trivial types.
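As a small illustration of both checks (the Point struct below is only a stand-in for whatever element type the cloud actually uses):

#include <type_traits>
#include <vector>

struct Point { float x, y, z; };   // stand-in for the real point type

static_assert(std::is_trivially_copyable<Point>::value,
              "a decent std::vector implementation can bulk-copy this with memmove");

// Appends source to target with at most one allocation and a contiguous bulk copy.
void append(std::vector<Point>& target, const std::vector<Point>& source) {
    target.reserve(target.size() + source.size());
    target.insert(target.end(), source.begin(), source.end());
}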
I'm developing an A* implementation for the first time, and I was using a priority_queue for the open set, until I realized you need to check whether nodes are in the open set too, not just the closed one.
Thing is, you can't iterate over a priority queue. So why does everyone recommend a priority queue for the open set? Is it still the best option? I think the only way to iterate over it is by making a copy so I can pop everything from it (an enormous cost).
What is the best data structure to use for A*?
A priority queue (PQ) is an abstract data structure (ADS). There are many possibilities for implementing one. Unfortunately, the priority_queue supplied with the C++ standard library is rather limited, and other implementations are much better suited for implementing A*. Spoilers: you can use std::set/multiset instead of std::priority_queue. But read on:
What you need from the priority queue to implement A* is:
1. Get the node with the lowest cost
2. Decrease the costs of arbitrary elements
Any priority queue can do 1., but for 2. you need a "mutable" priority queue. The standard-library one cannot do this. Also, you need an easy way to find entries in the priority queue, to find out where to decrease the keys (for when A* finds a better path to an already opened node). There are two basic ways to do this: you store a handle to the priority queue element within your graph node (or use a map to store those handles for each graph node), or you insert the graph nodes themselves.
For the first case, where you store handles for each node, you can use std::multiset for your priority queue. The first element (obtained via begin()) of the multiset will always be your lowest-cost entry, and you can decrease a key by removing the entry from the set, changing the value, re-inserting it, and updating the handle. Alternatively, you can use the mutable priority queues from Boost.Heap, which directly support "decrease-key".
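A rough sketch of that handle-based variant, assuming nodes are identified by plain ints and costs are ints too (names are purely illustrative):

#include <map>
#include <set>
#include <utility>

using Entry = std::pair<int, int>;                        // (cost, node id), ordered by cost first
std::multiset<Entry> open_set;                            // the priority queue
std::map<int, std::multiset<Entry>::iterator> handles;    // node id -> handle into open_set

// Insert a node, or decrease its key: erase the old entry (if any) and re-insert
// with the new, lower cost, remembering the fresh handle.
void push_or_decrease(int node, int new_cost) {
    auto h = handles.find(node);
    if (h != handles.end())
        open_set.erase(h->second);
    handles[node] = open_set.insert({new_cost, node});
}

// Pop the node with the lowest cost.
int pop_lowest() {
    auto it = open_set.begin();    // the smallest cost is always at the front
    int node = it->second;
    handles.erase(node);
    open_set.erase(it);
    return node;
}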
For the second case, you would need some kind of "intrusive" binary tree - since your pathfinding nodes themselves need to be in the priority queue. If you don't want to roll your own, see the ordered associative containers in Boost.Intrusive.
The subject is very large; I suggest reading this page if you want to know the different possibilities and get a good understanding of which data structure is suited to your situation:
http://theory.stanford.edu/~amitp/GameProgramming/ImplementationNotes.html#set-representation
In my case, the binary heap was a good balance between implementation difficulty and performance, which was exactly what I was looking for. But maybe you are looking for something different?
The rest of the document is a very good reference for A* for game development
http://theory.stanford.edu/~amitp/GameProgramming/index.html
They mean a priority queue, not necessarily the std::priority_queue class that comes with the language. If the built-in one doesn't do what you need, write your own, or find another.
I have a queue with n elements in it and the front is at 0. I need to create a stack of these numbers with 0 at the top.
It can only be done with EnQueue, DeQueue, Push, and Pop, and constant storage. I don't need an answer so much as an idea of how I could approach this problem.
Please don't answer this for me, but just try to understand I'm new at programming and could just use an idea of what is a way this can be done.
Is it a Towers-of-Hanoi-like approach?
Does that only use a constant storage?
This isn't for homework, I just need some advice on how to proceed. My first idea, reversing the queue and then pushing it, did not work. I even tried sketching out other situations, to no avail. Then I wondered about dequeuing and pushing them all, then popping and enqueuing them all, then dequeuing and pushing again.
Is this efficient?
Does this use constant storage?
I am still learning fundamental programming concepts. Please be nice! :)
What am I facing?
The biggest problem you are facing is that your two containers aren't directly compatible with each other.
A queue is normally a FIFO [1] container, while a stack is LIFO [2]. This means that you cannot just copy the data in sequential order from your queue to your stack, since that would make the elements appear in the "wrong" order (following your description).
Another problem is that there is no good way (performance-wise) to reverse a queue. A queue is a one-way container: internally, an element only has to know about the next element in line, not the previous one. This means that you cannot iterate through the queue starting at the back, and that iteration is always O(n).
The same problem exists with your stack.
The things described above, put together, make this quite a tedious problem; there are solutions, but they aren't always the most straightforward.
Hints on how to solve this issue...
You'll need some sort of intermediate state to store your elements, or could we use the LIFO/FIFO properties of our containers to our advantage?
Below is an implementation which does what you want; if you don't want to know the answer to your question yet, stop reading here.
It will require some additional storage, since space for an extra element will be allocated while copying from one container to another; this is inevitable, though the storage is constant.
Remember that the copy-initialization can be optimized by using rvalue-references and move in C++11.
Sample implementation can be found here.
It takes advantage of the fact that a queue is FIFO and a stack LIFO: by copying the data from queue to stack, then stack to queue, and finally queue to stack again, we have effectively reversed the order of the elements in a way that matches your description.
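For reference, a minimal sketch of that three-pass transfer (not the linked sample; it just uses std::queue and std::stack for illustration):

#include <queue>
#include <stack>

template <typename T>
std::stack<T> queue_to_stack(std::queue<T> q) {   // q is taken by value; the caller's queue is untouched
    std::stack<T> s;

    // Pass 1: queue -> stack (the original front ends up at the bottom).
    while (!q.empty()) { s.push(q.front()); q.pop(); }

    // Pass 2: stack -> queue (the order is now reversed).
    while (!s.empty()) { q.push(s.top()); s.pop(); }

    // Pass 3: queue -> stack (the original front is now on top).
    while (!q.empty()) { s.push(q.front()); q.pop(); }

    return s;
}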
footnotes
1. FIFO = First In First Out
2. LIFO = Last In First Out
DeQueue everything from Queue, immediately Pushing each element to Stack. Now Pop everything from Stack, immediately EnQueueing to Queue. What's in Queue now?
Presuming your Queue and Stack hold fixed-size items, the above ahem subroutine certainly only uses constant additional storage: only storage for one item is needed as each item transits from Queue to Stack or vice versa.
Edit: As you point out, my subroutine reverses the content of the Queue. Having done so, it is fairly simple to drain the Queue into the Stack again to get the desired outcome.
And, as you point out, this requires transferring 3n = O(n) items, where n is the initial size of the Queue. Could you do better? I don't believe so, or at least not significantly. In some sense, without even a counter (which would take O(log n) > O(1) extra storage), the only reasonable thing to do is drain the queue into the stack or vice versa.
I'm looking for a free software implementation of the bounded priority queue abstraction in C++. Basically, I need a data structure that will behave just like std::priority_queue but will at all times hold the "best" n elements at most.
Example:
std::vector<int> items; // many many input items
bounded_priority_queue<int> smallest_items(5);
for (std::vector<int>::const_iterator it = items.begin(); it != items.end(); ++it) {
    smallest_items.push(*it);
}
// now smallest_items holds the 5 smallest integers from the input vector
Does anyone know of a good implementation of such thing? Any experience with it?
I think that the algorithm discussed in this thread is probably what you are looking for. If you want to get a head start, you might want to consider building upon Boost's implementation d_ary_heap_indirect which is part of Boost.Graph (in d_ary_heap.hpp). If you do a good job with it, you might submit it to Boost. It could make a nice little addition, because such an implementation certainly has many uses.
Why not use a std::vector with a functor/comparison function and std::sort() it into the correct order?
The code is probably pretty trivial
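For example, a minimal sketch along those lines (the class name mirrors the one in the question; sorting and truncating on every push is simple rather than optimal, and a max-heap would scale better):

#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

template <typename T, typename Compare = std::less<T>>
class bounded_priority_queue {
public:
    explicit bounded_priority_queue(std::size_t max_size) : max_size_(max_size) {}

    void push(const T& value) {
        data_.push_back(value);
        std::sort(data_.begin(), data_.end(), Compare());
        if (data_.size() > max_size_)
            data_.pop_back();              // drop the worst element
    }

    const std::vector<T>& items() const { return data_; }   // the best max_size_ elements seen so far

private:
    std::size_t max_size_;
    std::vector<T> data_;
};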