Is there any advantage to using C++11's std::find over a container's find method?
In the case of std::vector (which does not have a find method) does std::find use some smart algorithm or the naive way of simply iterating over every element?
In the case of std::map it seems you need to pass along an std::pair, which is the value_type of an std::map. This does not seem very useful as usually you'd want to find for either a key or a mapped element.
What about other containers like std::list or std::set or std::unordered_set ?
In the case of std::vector (which does not have a find method) does std::find use some smart algorithm or the naive way of simply iterating over every element?
It cannot, because vectors are not sorted. There is no other way to find an element in an unsorted vector than a linear search with O(n) complexity.
On the other hand, sequence containers do not offer a find() member function, so you could not use one anyway.
In the case of std::map it seems you need to pass along an std::pair, which is the value_type of an std::map. This does not seem very useful as usually you'd want to find for either a key or a mapped element.
Indeed, here you should use the find() member function, which guarantees a better complexity (O(log N)).
In general, when a container exposes a member function with the same name as a generic algorithm, this is because the member function does the same thing, but offers a better complexity guarantee.
What about other containers like std::list or std::set or std::unordered_set ?
Just like std::vector, std::list is not a sorted container - so the same conclusion applies.
For std::set and std::unordered_set, instead, you should use the find() member function, which guarantees a better complexity (O(log n) and average O(1), respectively).
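To make the distinction concrete, here is a minimal sketch of the lookups discussed above. The helper names (in_vector, in_set, and so on) are made up for illustration and are not part of the standard library; the element types are an assumption.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <unordered_set>
#include <vector>

// Sequence container: std::find is the only option, and it is a linear scan.
bool in_vector(const std::vector<int>& v, int x) {
    return std::find(v.begin(), v.end(), x) != v.end();
}

// Ordered set: the member find() descends the tree, O(log n).
bool in_set(const std::set<int>& s, int x) {
    return s.find(x) != s.end();
}

// Same answer as in_set, but std::find visits the elements one by one, O(n).
bool in_set_linear(const std::set<int>& s, int x) {
    return std::find(s.begin(), s.end(), x) != s.end();
}

// Hash set: the member find() is O(1) on average.
bool in_hash_set(const std::unordered_set<int>& s, int x) {
    return s.find(x) != s.end();
}

// std::map: the member find() takes just the key, whereas std::find would
// need a whole std::pair<const Key, T> to compare against.
bool has_key(const std::map<std::string, int>& m, const std::string& key) {
    return m.find(key) != m.end();
}
```

Both in_set and in_set_linear return the same result; only the cost differs, which is why the member function is preferred whenever the container has one.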
In practice, is there any circumstance which std::unordered_map must be used instead of std::map?
I know the differences between them, say internal implementation, time complexity for searching element and so on.
But I really can't find a circumstance where std::unordered_map could not be replaced by std::map indeed.
Yes, for example if the key type does not have a sensible strict weak ordering but does have sensible equality and is hashable.
A strict weak order is required for the key type on the ordered associative containers std::set and std::map.
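As a rough illustration of such a key, the sketch below uses a hypothetical Color type with equality and a hand-written hash functor but no operator<. Color, ColorHash, ColorNames and name_of are all invented for this example, and the way the hash packs the channels is just one possible choice.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical key type: equality is meaningful, but we assume there is no
// natural "less than" for it, so it offers no operator<.
struct Color {
    unsigned char r, g, b;
    bool operator==(const Color& other) const {
        return r == other.r && g == other.g && b == other.b;
    }
};

// Hypothetical hash functor: packs the three channels into one integer
// and hashes that.
struct ColorHash {
    std::size_t operator()(const Color& c) const {
        const unsigned packed =
            (unsigned(c.r) << 16) | (unsigned(c.g) << 8) | unsigned(c.b);
        return std::hash<unsigned>{}(packed);
    }
};

using ColorNames = std::unordered_map<Color, std::string, ColorHash>;

// Returns the stored name, or "unknown" if the color is not in the table.
std::string name_of(const ColorNames& names, const Color& c) {
    auto it = names.find(c);
    return it != names.end() ? it->second : "unknown";
}
```

Using Color as a std::map key would not compile as written, because std::map needs a comparison (operator< by default) that the type does not provide.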
I know the difference between them, say internal implementation, time complexity for searching element
In that case, you should know that the average asymptotic element lookup time complexity of unordered map is constant, while the complexity of ordered map is logarithmic. This means that there is some size of container beyond which the lookup will be faster when using unordered map.
But I really can't find a circumstance where std::unordered_map could not be replaced by std::map indeed.
If the container is large enough, and if you cannot afford the cost of ordered map lookup, then you cannot choose to replace the faster unordered map lookup.
Another case where ordered map cannot be used is where there doesn't exist a cheap function to compare relative order of the key.
My opinion is that you should change question in:
when std::map must be used instead of std::unordered_map?
Indeed, insertion, deletion, and search in std::unordered_map have lower average complexity than in std::map (constant on average versus logarithmic). The table in this question summarizes the complexity of each operation.
So, using std::map is recommended in two cases at least:
When you need ordering
std::unordered_map is hash-based. When you have too many collisions and you can't find a suitable hash function, you may go for a std::map.
However, in normal conditions, for a single element operation, I recommend std::unordered_map.
I have a big std::vector<int> where I have to get an iterator so that I can call other functions of it, like erase. Looping through the vector to find the element I'm searching for takes a lot of time.
std::map::find() is much faster, but I don't want to allocate memory for the second value which I'm never going to use.
Is there any single-value container with find() or anything that gives me an iterator with similar speed as std::map::find ? I couldn't find any.
You're looking for std::set or std::multiset.
You could use std::unordered_map, or keep using the same std::vector but keep its elements sorted, so that you can apply the standard algorithms for sorted ranges, for example std::equal_range.
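The sorted-vector option can be sketched as follows. The helpers erase_sorted and insert_sorted are hypothetical names, and the sketch assumes the caller always keeps the vector in ascending order.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: erases one occurrence of `value` from a vector that the
// caller keeps sorted in ascending order. Returns true if something was erased.
bool erase_sorted(std::vector<int>& sorted, int value) {
    // Binary search: O(log n) with random access iterators.
    auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
    if (it == sorted.end() || *it != value) {
        return false;  // not present
    }
    sorted.erase(it);  // still O(n): the tail is shifted down by one
    return true;
}

// Hypothetical helper: inserts while preserving the sorted order.
void insert_sorted(std::vector<int>& sorted, int value) {
    sorted.insert(std::upper_bound(sorted.begin(), sorted.end(), value), value);
}
```

The lookup becomes logarithmic and no extra memory is needed, but insertion and erasure in the middle of a vector remain linear, so this fits best when lookups dominate.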
Why do we have 2 ways like above to search for an element in the set?
Also find algorithm can be used to find an element in a list or a vector but what would be the harm in these providing a member function as well as member functions are expected to be faster than a generic algorithm?
Why do we need remove algorithm and create all the drama about erase remove where remove will just shift the elements and then use erase to delete the actual element..Just like STL list provides a member function remove why cant the other containers just offer a remove function and be done with it?
Binary_search in STL set over set's member function find?
Why do we have 2 ways like above to search for an element in the set?
std::binary_search() returns a bool, whereas set::find() returns an iterator. In order to compare apples to apples, the algorithm to compare set::find() with is std::lower_bound(), which also returns an iterator.
You can apply std::lower_bound() on an arbitrary sorted range specified by a pair of (forward / bidirectional / random access) iterators and not only on a std::set. So having std::lower_bound() is justified. As std::set happens to be a sorted range, you can call
std::lower_bound(mySet.begin(), mySet.end(), value);
but the
mySet.find(value);
call is not only more concise, it is also more efficient. If you look into the implementation of std::lower_bound() you will find something like std::advance(__middle, __half); whose complexity depends on the iterator category (forward, bidirectional, or random access). std::set iterators are bidirectional, so advancing them takes linear time, ouch! The search still makes only O(log n) comparisons, but it needs O(n) iterator increments. In contrast, std::set::find() is guaranteed to perform the search in logarithmic time. The underlying implementation (a red-black tree in the case of libstdc++) makes this possible. So offering set::find() is also justified: it is more efficient than calling std::lower_bound() on a std::set.
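For comparison, here are the two lookups side by side as a small sketch. The wrapper names find_generic and find_member are invented for this example.

```cpp
#include <algorithm>
#include <set>

// Generic algorithm: correct, but on std::set's bidirectional iterators it
// needs O(n) iterator increments even though it makes O(log n) comparisons.
std::set<int>::const_iterator find_generic(const std::set<int>& s, int value) {
    auto it = std::lower_bound(s.begin(), s.end(), value);
    // lower_bound returns the first element not less than value,
    // so an equality check is still required.
    return (it != s.end() && *it == value) ? it : s.end();
}

// Member function: descends the tree directly, O(log n) overall.
std::set<int>::const_iterator find_member(const std::set<int>& s, int value) {
    return s.find(value);
}
```

Both functions return the same iterator for any value; the member version simply gets there without walking the sequence. std::set also has a member lower_bound() with the same logarithmic guarantee.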
Also find algorithm can be used to find an element in a list or a
vector but what would be the harm in these providing a member function
as well as member functions are expected to be faster than a generic
algorithm?
I don't see how you could provide a faster member function for list or vector, unless the container is sorted (or possesses some special property).
Why do we need remove algorithm and create all the drama about erase
remove where remove will just shift the elements and then use erase to
delete the actual element..Just like STL list provides a member
function remove why cant the other containers just offer a remove
function and be done with it?
I can think of two reasons.
Yes, the STL is seriously lacking many convenience functions. I often feel like I live in a begin-end hell when using algorithms on an entire container; I often provide my own wrappers that accept a container, something like:
template <typename T>
bool contains(const std::vector<T>& v, const T& elem) {
    return std::find(v.begin(), v.end(), elem) != v.end();
}
so that I can write
if (contains(myVector, 42)) {
instead of
if (std::find(myVector.begin(), myVector.end(), 42) != myVector.end()) {
Unfortunately, you quite often have to roll your own or use boost. Why? Because standardization is painful and slow so the standardization committee focuses on more important things. The people on the committee often donate their free time and are not paid for their work.
Now deleting elements from a vector can be tricky: Do you care about the order of your elements? Are your elements PODs? What are your exception safety requirements?
Let's assume you don't care about the order of your elements and you want to delete the i-th element:
std::swap(myVector[i], myVector.back());
myVector.pop_back();
or even simpler:
myVector[i] = myVector.back(); // but if operator= throws during copying you might be in trouble
myVector.pop_back();
In C++11 with move semantics:
myVector[i] = std::move(myVector.back());
myVector.pop_back();
Note that these are O(1) operations instead of O(N). These are examples of the efficiency and exception safety considerations that the standard committee leaves up to you. Providing a member function and "one size fits all" is not the C++ way.
Having said all this, I repeat I wish we had more convenience functions; I understand your problem.
I'll answer part of your question. The Erase-Remove idiom is from the book “Effective STL” written by Scott Meyers. As to why remove() doesn't actually delete elements from the container, there is a good answer here, I just copy part of the answer:
The key is to realize that remove() is designed to work on not just a
container but on any arbitrary forward iterator pair: that means it
can't actually delete the elements, because an arbitrary iterator pair
doesn't necessarily have the ability to delete elements.
Why STL list provides a member function remove and why can't the other containers just offer a remove function and be done with it? I think it's because the idiom is more efficient than other methods to remove specific values from the contiguous-memory containers.
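A minimal sketch of the idiom next to the list member function may help; the remove_all overloads are hypothetical wrappers written for this example.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Erase-remove on a vector: std::remove shifts the kept elements to the front
// and returns the new logical end; erase() then actually shrinks the container.
void remove_all(std::vector<int>& v, int value) {
    v.erase(std::remove(v.begin(), v.end(), value), v.end());
}

// std::list has a member remove() that unlinks the matching nodes directly.
void remove_all(std::list<int>& l, int value) {
    l.remove(value);
}
```

Calling std::remove alone would leave the vector's size unchanged, which is why the erase() call is needed to finish the job.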
There is a sort() method for lists in STL. Which is absurd, because I would be more inclined to sort an array/vector.
Why isn't sort() provided for vector? Is there some underlying philosophy behind the creation of the vector container or its usage, that sort is not provided for it?
As has already been said, the standard library provides a nonmember function template that can sort any range given a pair of random access iterators.
It would be entirely redundant to have a member function to sort a vector. If one existed, the following two calls would have the same meaning:
std::sort(v.begin(), v.end());
v.sort();
One of the first principles of the STL is that algorithms are not coupled to containers. How data is stored and how data is manipulated should be as loosely coupled as possible.
Iterators are used as the interface between containers (which store data) and algorithms (which operate on the data). In this way, you can write an algorithm once and it can operate on containers of various types, and if you write a new container, the existing generic algorithms can be used to manipulate its contents.
The reason that std::list provides its own sort function as a member function is that it is not a random-access container; it only provides bidirectional iterators (since it is intended to represent a doubly linked list, this makes sense). The generic std::sort function requires random access iterators, so you cannot use it with a std::list. std::list provides its own sort function so that it can be sorted.
In general, there are two cases in which a container should implement an algorithm:
If the generic algorithm cannot operate on the container, but there is a different, container-specific algorithm that can provide the same functionality, as is the case with std::list::sort.
If the container can provide a specific implementation of the algorithm that is more efficient than the generic algorithm, as is the case with std::map::find, which allows an element to be found in the map in logarithmic time (the generic std::find algorithm performs a linear search because it cannot assume the range is sorted).
There are already interesting elements of an answer, but there is more to be said about the question. The answer to "why doesn't std::vector have a sort member function?" is indeed "because the standard library provides member functions only when they offer more than generic algorithms". The really interesting question is "why does std::list have a sort member function?", and a few things haven't been explained yet: yes, std::sort only works with random-access iterators and std::list only provides bidirectional iterators, but even if std::sort worked with bidirectional iterators, std::list::sort would still offer more. And here is why:
First of all, std::list::sort is stable, while std::sort isn't. Of course there is still std::stable_sort, but it doesn't work with bidirectional iterators either.
std::list::sort generally implements a mergesort, but it knows that it is sorting a list and can relink nodes instead of copying things. A list-aware mergesort can sort the list in O(n log n) time with only O(log n) additional memory, while your typical mergesort (such as std::stable_sort) uses O(n) additional memory or has an O(n log² n) complexity.
std::list::sort doesn't invalidate the iterators. If an iterator was pointing to a specific object in the list, it will still be pointing to the same object after the sort, even if its position in the list isn't the same as before the sort.
Last but not least, std::list::sort doesn't move or swap the objects around since it only relinks nodes. That means that it might be more performant when you need to sort objects that are expensive to move/swap around, but also that it can sort a list of objects that aren't even moveable, which is totally impossible for std::sort!
Basically, even if std::sort and std::stable_sort worked with bidirectional or forward iterators (and it would totally be possible, we know sorting algorithms that work with them), they still couldn't offer everything std::list::sort has to offer, and they couldn't relink nodes either since the standard library algorithms aren't allowed to modify the container, only the pointed values (relinking nodes counts as modifying the container). On the other hand, a dedicated std::vector::sort method wouldn't offer anything interesting, so the standard library doesn't provide one.
Note that everything that has been said about std::list::sort is also true for std::forward_list::sort.
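The last two points can be checked with a small sketch. Pinned is a hypothetical type that can be neither copied nor moved, and iterator_survives_sort is an invented helper that reports whether an iterator taken before the sort still refers to the same element afterwards.

```cpp
#include <iterator>
#include <list>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    explicit Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

bool operator<(const Pinned& a, const Pinned& b) { return a.value < b.value; }

// Builds a small list in place, sorts it, and checks that an iterator taken
// before the sort still refers to the same element afterwards.
bool iterator_survives_sort() {
    std::list<Pinned> items;
    items.emplace_back(3);
    items.emplace_back(1);
    items.emplace_back(2);

    auto it = items.begin();       // refers to the element holding 3
    const Pinned* address = &*it;  // remember where that element lives

    items.sort();                  // relinks nodes; no element is copied or moved

    return it->value == 3                   // same element through the old iterator
        && &*it == address                  // still at the same address
        && std::next(it) == items.end()     // but now last in the list
        && items.front().value == 1;
}
```

The same element type could not be stored in a std::vector and passed to std::sort, since that algorithm needs to move or swap elements.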
A vector-specific sort would provide no advantage over std::sort from <algorithm>. However, std::list provides its own sort because it can use the special knowledge of how list is implemented to sort items by manipulating the links instead of copying objects.
You can easily sort a vector with:
sort(v.begin(), v.end());
UPDATE: (answer to the comment): Well, they have certainly provided it by default. The difference is that it's not a member function for vector. std::sort is a generic algorithm that's supposed to work for anything that provides iterators. However, it really expects a random access iterator to sort efficiently. std::list, being a linked list, cannot provide random access to its elements efficiently. That's why it provides its own specialized sort algorithm.
std::sort() in <algorithm> does sorting on containers with random access iterators like std::vector.
There is also std::stable_sort().
edit - why does std::list have its own sort() function versus std::vector?
std::list is different from both std::vector and std::deque (both random access iterable) in how it's implemented, so it contains its own sort algorithm that is specialized for its implementation.
Answering the question of "Why?"
A sorting algorithm for std::vector is the same as sorting a native array and is the same (probably) as sorting a custom vector class.
The STL was designed to separate containers and algorithms, and to have an efficient mechanism for applying an algorithm to data that has the right characteristics.
This lets you write a container that might have specific characteristics, and to get the algorithms free. Only where there is some special characteristic of the data that means the standard algorithm is unsuitable is a custom implementation supplied, as in the case of std::list.
what's the alternative?
Should I write by myself?
There is the std::find() algorithm, which performs a linear search over an iterator range, e.g.,
std::vector<int> v;
// Finds the first element in the vector that has the value 42:
// If there is no such value, it == v.end()
std::vector<int>::const_iterator it = std::find(v.begin(), v.end(), 42);
If your vector is sorted, you can use std::binary_search() to test whether a value is present in the vector, and std::equal_range() to get begin and end iterators to the range of elements in the vector that have that value.
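A short sketch of those two algorithms, assuming the vector is already sorted in ascending order; contains_sorted and count_sorted are illustrative names, not standard functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Presence test only: std::binary_search returns a bool, not an iterator.
bool contains_sorted(const std::vector<int>& sorted, int value) {
    return std::binary_search(sorted.begin(), sorted.end(), value);
}

// Number of elements equal to `value`, computed from the iterator pair
// that std::equal_range returns.
std::size_t count_sorted(const std::vector<int>& sorted, int value) {
    auto range = std::equal_range(sorted.begin(), sorted.end(), value);
    return static_cast<std::size_t>(range.second - range.first);
}
```

If the vector is not sorted, neither algorithm gives a meaningful result, and std::find is the tool to use instead.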
The reason there is no vector::find is because there is no algorithmic advantage over std::find (std::find is O(N) and in general, you can't do better for vectors).
But the reason you have map::find is because it can be more efficient (map::find is O(log N) so you would always want to use that over std::find for maps).
Who told you that? There is a "find" algorithm for vector in C++. Coincidentally, it is called std::find. Or maybe std::binary_search. Or something else, depending on the properties of the data stored in your vector.
Containers get their own specific versions of generic algorithms (implemented as container methods) only when the effective implementation of the algorithm is somehow tied to the internal details of the container. std::list<>::sort would be one example.
In all other cases, the algorithms are implemented by standalone functions.
Having a 'find' functionality in the container class violates 'SRP' (the Single Responsibility Principle). A container's core functionality is to provide interfaces for storing and retrieving elements. 'Finding', 'Sorting', 'Iterating' etc. are not core functionality of any container and hence are not part of its direct interface.
However as 'Herb' states in Namespace Principle, 'find' is a part of the interface by being defined in the same namespace as 'vector' namely 'std'.
Use std::find(vec.begin(), vec.end(), value).
And don't forget to include <algorithm>
what's the alternative?
The standard offers std::find, for sequential search over arbitrary sequences of like-elements (or something like that).
This can be applied to all containers supporting iterators, but for internally sorted containers (like std::map) the search can be optimized. In that case, the container offers its own find member function.
why there is no find for vector in C++?
There was no point in creating a std::vector<???>::find as the implementation would be identical to std::find(vector.begin(), vector.end(), value_to_find);.
Should I write by myself?
No. Unless you have specific limitations or requirements, you should use the STL implementation whenever possible.