C++ std::stack Traversal

C++ std::stack Traversal - c++

I am using std::stack for an project and I need to travel on it for checking same values. I checked member functions but, I could not find an appropriate one for this task.
The first idea that came up is using a copy stack but, the program might waste lots of extra space in this case, and not using an user-defined stack class is important at this level of the project (Yeah, I made a design mistake... ).
So, any idea?
Thank you!

Avoid std::stack, it's just a useless wrapper that dumbs down the interface of the underlying container. Use a std::vector with push_back/pop_back for insert insertion/removal (insertion/removal at the end is amortized O(1)) or std::deque, whence you can push/pop on either side without significant changes in performances (still amortized O(1)). In both cases you can walk all the elements using their random-access iterators.
(the same holds for std::queue: it's useless, use directly std::deque (not vector) with push_back/pop_front)

Related

std::vector::insert vs std::list::operator[]

I know that std::list::operator[] is not implemented as it has bad performance. But what about std::vector::insert it is as much inefficient as much std::list::operator[] is. What is the explanation behind?

std::vector::insert is implemented because std::vector has to meet requirements of SequenceContainer concept, while operator[] is not required by any concepts (that I know of), possible that will be added in ContiguousContainer concept in c++17. So operator[] added to containers that can be used like arrays, while insert is required by interface specification, so containers that meet certain concept can be used in generic algorithms.

I think Slava's answer does the best job, but I'd like to offer a supplementary explanation. Generally with data structures, it's far more common to have more accesses than insertions, than vice versa. There are many data structures that may try to optimize access at the cost of insertion, but the inverse is much more rare, because it tends to be less useful in real life.
operator[], if implemented for a linked list, would access an existing value presumably (similar to how it does for vector). insert adds new values. It's much more likely you will be willing to take a big performance hit to insert some new elements into a vector, provided that the subsequent accesses are extremely fast. In many cases, element insertion may be outside of the critical path entirely whereas the critical path consists of a single traversal, or random access of already-present data. So it's simply convenient to have insert take care of the details for you in that case (it's actually a bit annoying to write efficiently and correctly). This is actually a not-uncommon use of a vector.
On the other hand, using operator[] on a linked list would almost always be a sign that you are using the wrong data structure.

std::list::operator[] would require an O(N) traversal and is not really in accordance with what a list is designed to do. If you need operator[] then use a different container type. When C++ folk see a [] they assume an O(1) (or, at worse, an O(Log N)) operation. Supplying [] for a list would break that.
But although std::vector::insert is also O(N), it can be optimised: an at-end insertion can be readily optimised by having the vector's capacity grow in large chunks. An insertion in the middle requires an element-by-element move, but that again can be performed very quickly on modern chipsets.

The [] operator is inherited from plain arrays. It has always been understood as a fast (sub linear time) accessor of the underlying container. Since list is does not support sub linear time access, it does not make sense for it to implement the operator.
auto a = Container [10]; // Ideally I can assume this is quick
The equivalent of your std::list <>::operator [] is std::next <std::list<>::iterator>. Documented at cpp-reference.
auto a = *std::next (Container.begin (), 10); // This may take a little while
This is the truly generic way index a container. If the container supports random access, it will be constant time, other wise it will be linear.

Quickest Queue Container (C++)

I'm using queue's and priority queues, through which I plan on pumping a lot of data quite quickly.
Therefore, I want my q's and pq's to be responsive to additions and subtractions.
What are the relative merits of using a vector, list, or deque as the underlying container?
Update:
At the time of writing, both Mike Seymour and Steve Townsend's answers below are worth reading. Thanks both!

The only way to be sure how the choice effects performance is to measure it, in a situation similar to your expected use cases. That said, here are some observations:
For std::queue:
std::deque is usually the best choice; it supports all the necessary operations in constant time, and allocates memory in chunks as it grows.
std::list also supports the necessary operations, but may be slower due to more memory allocations; in special circumstances, you might be able to get good results by allocating from a dedicated object pool, but that's not entirely straightforward.
std::vector can't be used as it doesn't have a pop_front() operation; such an operation would be slow, as it would have to move all the remaining elements.
A potentially faster, but less flexible, alternative is to implement a circular buffer over a fixed-size array (e.g. std::array, or a std::vector that you don't resize). You'll need to deal with the case of it filling up, either by reporting an error, or allocating a larger buffer and copying all the data.
For std::priority_queue:
std::vector is usually the best choice; it grows exponentially (reducing the number of memory allocations), and is a simple data structure that's very fast to access - an iterator can be implemented simply as a wrapper around a pointer.
std::deque may be slower as it typically grows linearly (requiring more memory allocation), and access is more complicated than with a vector.
std::list can't be used as it doesn't provide random access.
To summarise - the defaults are usually the best choice, but if speed really is important, then measure the alternatives.

I would use std::queue for your basic queue which is (by default at least) a wrapper on deque. Do something more special-purpose if that does not work for you.
std::priority_queue also exists (over vector by default) but the added semantics make it more likely that you could have to roll your own here, depending on perf observed for your particular access pattern.
vector has storage characteristics which make it very ill-suited to removal from front of the dataset. A lot of shuffling down to be done every time you pop_front. For a simple queue, avoid this.
list is likely to be too expensive for any high-hit queue, because by contract it has to offer function you don't need. It could be a candidate for use as a priority queue but my instinct is always to trust the STL.

vector would implement a stack as your fast insertion is at the end and fast removal is also at the end. If you want a FIFO queue, vector would be the wrong implementation to use.
deque or list both provide constant time insertion at either end. list is good for LRU caches where you want to move elements out of the middle fast and where you want your iterators to remain valid no matter how much you move them about. deque is generally used when insertions and deletions are at the end.
The main thing I need to ask about your collection is whether they are accessed by multiple threads. I sort-of assume they are, in which case one of your primary aims is to reduce locking. This is best done if you at least have a multi_push and multi_get feature so that you can put more than one element on at a time without any locking.
There are also lock-free containers or semi-lock-free containers.
You will probably find that your locking strategy is more important than any performance in the collection itself as long as your operations are all constant-time.

C++ smart linear container

Let me explain my problem together with the background, so it would be easier to understand why I'm asking for this specific type of thing. I'm developing an instant messenger. Most of the architecture is outlined by my teacher, however implementation detail may vary. There is an "Engine" class, EventManager, which registers clients. To identify them and to easily remove them, I use a map (with client-id's) or a set with pointers. So far, so good. But then this EventManager uses poll() (or select(), but that's nowhere as comfortable to use as poll(), as you have to rebuild the array each time, which is slow and not-so-nice, I guess, and I can restrict myself to UNIX environment, if you ask) in its main loop. Which needs an array of struct pollfd. Now every time a new client comes or goes, this array needs to be rebuilt. Either I use a dynamic array by hand and allocate memory every time (baaaaaad), or I use a vector, which would handle new client's struct pollfd insertion pretty well at the end of the container, or a deque, which would insert and remove anywhere pretty well. Now my two questions are:
If I choose vector, will it automatically shrink and move elements in the middle of itself instead of full reallocation? and
That would anyway copy a lot, if it's in the beginning, so I'd like to use deque. Does that have an array interface (like you would do with vector - &myVector[0]) or is it strictly non-contiguous?

If you remove something from the middle of a vector it will move all the following elements one position towards the beginning. It will not reallocate. You don't have to consider reallocations at all because they are amortized to give O(1) time per insertion.
deque is not much better than vector. Removing from the beginning or end is efficient. Not from the middle. If you remove from anywhere, then it will hopefully be twice as fast a vector, but not faster. Since it's a more complicated structure it'll probably be even slower. deque doesn't guarantee continuous storage, so although indexing is allowed and done in O(1) time, you still can't reliably convert it to a pointer.
Anyway it smells like premature optimization. Use vector. Since the order of clients is not significant, you can speed up the erasure of clients by swapping the element that you want to remove with the last element in the vector and calling pop_back() after that.

sort() function in C++

I found two forms of sort() in C++:
1) sort(begin, end);
2) XXX.sort();
One can be used directly without an object, and one is working with an object.
Is that all? What's the differences between these two sort()? They are from the same library or not? Is the second a method of XXX?
Can I use it like this
vector<int> myvector
myvector.sort();
or
list<int> mylist;
mylist.sort();

std::sort is a function template that works with any pair of random-access iterators. Consequently, the algorithm implemented by std::sort is tailored (optimized) for random access. Most of the time it is going to be some flavor of quick-sort.
std::list::sort is a dedicated list-oriented version of sort. std::list is a container that does not support [efficient] random access, which means that you can't use std::sort with it. This creates the need for a dedicated sorting algorithm. Most of the time it will be implemented as some flavor of merge-sort.

std::list includes a sort() member, because there are ways to do sorting efficiently that work for lists, but almost nothing else. If you tried to use the same sorting implementation for a list as everything else, one of them wouldn't work very well, so they provide a special one for list.
Most of the time, you should use std::sort, and forget that std::list even exists. It's useful in a few very specific circumstances, but they're sufficiently rare that using it is a mistake at least 90% of the time that people think they should.
Edit: The situation is not that using std::list::sort is a mistake, but that using std::list is usually a mistake. There are a couple of reasons for this. First of all, std::list requires a couple of pointers in addition to space for each item you store, so unless your data is fairly large, it can impose a substantial space overhead. Second, since each node is allocated individually, traversing nodes tends to be fairly slow (locality of reference is low, so cache utilization tends to be poor). Finally, although you can insert or delete an element from the middle of a list with constant complexity, you (normally) need to use a linear-complexity traversal of the list to find the right point at which to insert or delete that item. Couple that with the poor cache utilization, and you get insertions and deletions in the middle of a linked list usually being slower than with something like a vector or deque.
The other 10% of the time (or maybe somewhat less than that) is when you can reasonably store a position in the list in the long term, and (at least typically) plan on making a number of insertions and deletions at (close to) that same point, without traversing the linked list each time to find the right place. If you make enough modifications to the list at (or close to) the same place, you can amortize the list traversal over a number of insertions and deletions. If you make enough modifications at (close to) the same place, you can come out ahead.
That can work out well in a few cases, but forces you to store more "state" data for a long period of time, mostly to make up for the deficiencies in the data structure itself. To the extent possible, I prefer to make individual operations close to pure and atomic, so they minimize dependence (and effect) on long-term data. In the single-threaded case, this mostly helped modularity, but with multi-core, multithreaded (etc.) programming, it can also have a substantial effect on efficiency and difficulty of development. Access to that long-term data generally requires thread synchronization.
That's not to say that avoiding list will automatically make your programs lots faster, or anything like that -- just that I find situations where it's really useful to be quite rare, and a lot of situations that seem to fit what people have been told about when a linked list will be useful, really aren't.

std::sort only works with random-access iterators. That limits it to arrays, vector and deque (and user-defined types which provide random access).
std::list is a linked list which doesn't support random access. The generic sort function doesn't work for it. And even if it did, it would be suboptimal. A linked list is not sorted by copying data from node to node - it is sorted by relinking the nodes.
std::sort also wouldn't work for other containers (sets and maps), but those are always sorted anyway.

Benefit of slist over vector?

What I need is just a dynamically growing array. I don't need random access, and I always insert to the end and read it from the beginning to the end.
slist seems to be the first choice, because it provides just enough of what I need. However, I can't tell what benefit I get by using slist instead of vector. Besides, several materials I read about STL say, "vectors are generally the most efficient in time for accessing elements and to add or remove elements from the end of the sequence". Therefore, my question is: for my needs, is slist really a better choice than vector? Thanks in advance.

For starters, slist is non-standard.
For your choice, a linked list will be slower than a vector, count on it. There are two reasons contributing to this:
First and foremost, cache locality; vectors store their elements linearly in the RAM which facilitates caching and pre-fetching.
Secondly, appending to a linked list involves dynamic allocations which add a large overhead. By contrast, vectors most of the time do not need to allocate memory.
However, a std::deque will probably be even faster. In-depth performance analysis has shown that, despite bias to the contrary, std::deque is almost always superior to std::vector in performance (if no random access is needed), due to its improved (chunked) memory allocation strategy.

Yes, if you are always reading beginning to end, slist (a linked list) sounds like the way to go. The possible exception is if you will be inserting a large quantity of elements at the end at the same time. Then, the vector could be better if you use reserve appropriately.
Of course, profile to be sure it's better for your application.

Matt Austern (author of "Generic Programming and the STL" and general C++ guru) is a strong advocate of singly-linked lists for inclusion in the forthcoming C++ standard; see his presentation at http://www.accu-usa.org/Slides/SinglyLinkedLists.ppt and his long article at http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2543.htm for more details, including a discussion of the trade-offs involved that may guide you in possibly choosing this data structure. (Note that the currently proposed name is forward_list, though slist is how it was traditionally named in SGI's STL & other popular libraries).

I'll second (or maybe third...) the opinion that std::vector or std::deque will do the job. The only thing that I will add is a few additional factors that should guide the decision between std::vector<T> and std::list<T>. These have a lot to do with the characteristics of T and what algorithms you plan on using.
The first is memory overhead. Std::list is a node-based container so if T is a primitive type or relatively small user-defined type, then the memory overhead of the node-based linkage might be non-negligible - consider that std::list<int> is likely to use at least 3 * sizeof(int) storage for each element whereas std::vector will only use sizeof(int) storage with a small header overhead. Std::deque is similar to std::vector but has a small overhead that is linear to N.
The next issue is the cost of copy construction. If T(T const&) is at all expensive, then steer clear of std::vector<T> since it cause a bunch of copies to occur as the size of the vector grows. This is where std::deque<T> is a clear winner and std::list<T> is also a contender.
The final issue that usually guides the decision on container type is whether your algorithms can work with the iterator invalidation constraints of std::vector and std::deque. If you will be manipulating the container elements a lot (e.g., sorting, inserting in the middle, or shuffling), then you might want to lean towards std::list since manipulating the order requires little more than resetting a few linkage pointers.

I'm guessing you mean std::list by "slist". Vectors are good when you need fast, random-access to a sequence of elements, with guaranteed contiguous memory, and fast sequential reading (IOW, from the beginning to the end). Lists are good when you need fast (constant-time) insertion or deletion of items at the beginning or end of the sequence, but don't care about the performance of random-access or sequential reading.
The reason for the difference is the way the 2 are implemented. Vectors are implemented internally as an array of items, which needs to be reallocated when its size/capacity is reached on adding an item. Lists are implemented as a doubly-linked list, which can cause cache-misses for sequential reading. Random-access for lists also requires scanning from the first (or last) item in the list, until it locates the item you're requesting.

Sounds like a good job for std::deque to me. It has the memory benefits of a vector like contiguous memory allocation for each "slab" (good for CPU caches), no overhead for each element like with std::list and it does not need to reallocate the whole set as std::vector does. Read more about std::deque here

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js