Retrieve container's comparison function given an iterator - c++

Given an iterator, is it possible to retrieve/use the correct comparison function for the collection that this iterator refers to?
For example, let's assume I'm writing a generic algorithm:
template <class InIt, class T>
void do_something(InIt b, InIt e, T v) {
// ...
}
Now, let's say I want to do something simple, like find v in [b..e). If b and e are iterators over a std::vector, I can simply use if (*b == v) .... Let's assume, however, that b and e are iterators over a std::map. In this case, I should only compare the keys, not the whole value type of what's contained in the map.
So the question is, given those iterators into the map, how do I retrieve that map's comparison function that will only compare the keys? At the same time, I don't want to blindly assume that I'm working with a map either. For example, if the iterators pointed to a set, I'd want to use the comparison function defined for that set. If they pointed to a vector or deque, I'd probably have to use ==, because those containers won't have a comparison function defined.
Oh, almost forgot: I realize that in many cases, a container will only have an equivalent of operator< rather than operator== for the elements it contains -- I'm perfectly fine with being able to use that.

Iterators don't have to be connected to containers, so they don't give you any details about the containers that they aren't necessarily connected to. That's the essential iterator abstraction: iterators delimit sequences, without regard to where the sequence comes from. If you need to know about containers you have to write algorithms that take containers.

There is no standard way to map from an iterator to the underlying container type (if there is such a container at all). You might be able to use some heuristics to try to determine which container, although that will not be simple and probably not guaranteed either.
For example, you can use a metafunction to determine whether the *value_type* is std::pair<const K, T>, which is a hint that this could be a std::map and after extracting the types K and T try to use a metafunction to determine whether the type of the iterator and the type of std::map<K,T,X,Y>::iterator or std::map<K,T,X,Y>::const_iterator match for a particular combination of X, Y.
In the case of the map that could be sufficient to determine (i.e. guess with a high chance of success) that the iterator refers to a std::map, but you should note that even if you can use that and even extract the type X of the comparator, that is not sufficient to replicate the comparator in the general case. While uncommon (and not recommended) comparators can have state, and you would not know which is the particular state of the comparator without having access to the container directly. Also note that there are cases where this type of heuristic will not even help, in some implementations of std::vector<> the iterator type is directly a pointer, and in that case you cannot differentiate between an 'iterator' into an array and an iterator into a std::vector<> of the same underlying types.

Unfortunately iterators don't always know about the container that contains them (and sometimes they aren't in a standard container at all). Even the iterator_traits only have information about the value_type which doesn't specifically tell you how to compare.
Instead, let's draw inspiration from the standard library. All the associative containers (map, etc) have their own find methods rather than using std::find. And if you do need to use std::find on a such a container, you don't: you use find_if.
It sounds like your solution is that for associative containers you need a do_something_if that accepts a predicate telling it how to compare the entries.

Related

What container to store unique values with no operator< defined

I need to store unique objects in a container. The object provides a operator== and operator!= (operator< nor operator>).
I can't use std::set, as it requires a operator<.
I can't use std::unordered_set as it requires a hash function and I have none. Let's say I can't write one considering my object type (or I'm lazy).
Am I really forced to use a std::vector and make sure myself that items does not get duplicated in the container (using std::find which uses operator==)?
Is there really no container that could be used to store unique items only using the operator==?
There's indeed no standard container, and that's because it would be inefficient. O(N), to be precise - exactly the brute force search you imagine.
Both std::set<T> and std::unordered_set<T> avoid a brute-force search by taking advantage of a non-trivial property of T. Lacking either property, any of the existing N members of a container could be equal to a potential new value V, and you must therefore compare all N members using operator== repeatedly.
"Let's say I can't write a hash function considering my object type (or I'm lazy)."
Well, you're lazy, but I'll write one for you anyway : template<typename T> size_t degenerate_hash(T) { return 0; }.
Of course, this means you get O(N) performance because every value collides with every other value, but that was the best possible outcome anyway.
Use a std::vector and before you std::vector::push_back or std::vector::insert use first std::find to check whether the element already exists in the vector.
Or at the end of all insertions use std::unique in combination with std::vector::erase to remove duplicates.

C++ map allocator stores items in a vector?

Here is the problem I would like to solve: in C++, iterators for map, multimap, etc are missing two desirable features: (1) they can't be checked at run-time for validity, and (2) there is no operator< defined on them, which means that they can't be used as keys in another associative container. (I don't care whether the operator< has any relationship to key ordering; I just want there to be some < available at least for iterators to the same map.)
Here is a possible solution to this problem: convince map, multimap, etc to store their key/data pairs in a vector, and then have the iterators be a small struct that contain a pointer to the vector itself and a subscript index. Then two iterators, at least for the same container, could be compared (by comparing their subscript indices), and it would be possible to test at run time whether an iterator is valid.
Is this solution achievable in standard C++? In particular, could I define the 'Allocator' for the map class to actually put the items in a vector, and then define the Allocator::pointer type to be the small struct described in the last paragraph? How is the iterator for a map related to the Allocator::pointer type? Does the Allocator::pointer have to be an actual pointer, or can it be anything that supports a dereference operation?
UPDATE 2013-06-11: I'm not understanding the responses. If the (key,data) pairs are stored in a vector, then it is O(1) to obtain the items given the subscript, only slightly worse than if you had a direct pointer, so there is no change in the asymptotics. Why does a responder say map iterators are "not kept around"? The standard says that iterators remain valid as long as the item to which they refer is not deleted. As for the 'real problem': say I use a multimap for a symbol table (variable name->storage location; it is a multimap rather than map because the variables names in an inner scope may shadow variables with the same name), and say now I need a second data structure keyed by variables. The apparently easiest solution is to use as key for the second map an iterator to the specific instance of the variable's name in the first map, which would work if only iterators had an operator<.
I think not.
If you were somehow able to "convince" map to store its pairs in a vector, you would fundamentally change certain (at least two) guarantees on the map:
insert, erase and find would no longer be logarithmic in complexity.
insert would no longer be able to guarantee the validity of unaffected iterators, as the underlying vector would sometimes need to be reallocated.
Taking a step back though, two things suggest to me that you are trying to "solve" the wrong problem.
First, it is unusual to need to have a vector of iterators.
Second, it is unusual to need to check an iterator for validity, as iterators are not generally kept around.
I wonder what the real problem is that you are trying to solve?

How to handle multiple iterator types

I have a custom datastructure, which can be accessed in multiple ways. I want to try to keep this datastructure to keep to STL-standards as well as possible. So I already have lot's of typedefs, which give template parameters the STL-names. This is business as usual for me by now.
However I am not so sure how to correctly add iterators to my datastructure. The main problem I am facing is, that there would be multiple iteration policies over the datastructure. The easiest use case is iterating over all elements, which would be handled well by STL-Conforming iterators over the datastructure. However one might also want to access elements, which are somehow similar to a given key. I would also like to iterate over all these similar elements in a way which I can interface with the STL.
These are the ideas I have thought about so far:
Provide only one type of iterator:
This is basicly what std::map does. The start and end iterators for a subrange are provided by std::map::lower_bound() and std::map::upper_bound().
However this works well, because the iterators returned by begin(), end(), lower_bound() and upper_bound() are compatible, i.e. the operator==() can be given a very well defined meaning on these. In my case this would be hard to get right, or might even be impossible to give some clear semantics. For example I probably would get some cases where it1==it2 but ++it1!=++it2. I am not sure if this is allowed by the STL.
Provide multiple types of iterators:
Much easier to provide clean operator==() semantics. Nasty on the other hand because it enlarges the number of types.
Provide one type of iterator and use stl::algorithms for specialized access
I am not sure if this is possible at all. The iteration state should be kept by the iterator somehow (either directly or in a Memento). Using this approach would mean to specialize all stl::algorithms and access the predicate directly in the specialization. Most likely impossible, and if possible a very bad idea.
Right now I am mostly opting for Version 1, i.e. to provide only one type of iterator at all. However since I am not clear on how to clean up the semantics, I have not yet decided.
How would you handle this?
Standard containers support two iteration policies with two iterator types: ::iterator and ::reverse_iterator. You can convert between the two using the constructor of std::reverse_iterator, and its member function base().
Depending how similar your iteration policies are, it may or may not be easy to provide conversions to different iterator types. The idea is that the result should point at the "equivalent position" in the iteration policy of the destination type. For reverse iterators, this equivalence is defined by saying that if you insert at that point, the result is the same. So if rit is a reverse iterator, vec.insert(rit.base(), ...) inserts an element "before" rit in the reverse iteration, that is to say after the element pointed to by rit in the container. This is quite fiddly, and will only get worse when the iteration policies are completely unrelated. But if all of your iterator types are (or can be made to look like) wrappers around the "normal" iterator that goes over all elements, then you can define conversions in terms of that underlying iterator position.
You only actually need conversions if there are member functions that add or remove elements of the container, because you probably don't want to have to provide a separate overload for each iterator type (just like standard containers don't define insert and erase for reverse iterators). If iterators are used solely to point at elements, then most likely you can do without them.
If the different iteration policies are all iterating in the normal order over a subset of the elements, then look at boost::filter_iterator.
I probably would get some cases where it1==it2 but ++it1!=++it2. I am
not sure if this is allowed by the STL.
If I understand correctly, you got it1 by starting at thing.normal_begin(), and you got it2 by starting at thing.other_policy_begin(). The standard containers don't define the result of comparing iterators of the same type that belong to different ranges, so if you did use a common type, then I think this would be fine provided the documentation makes it clear that although operator== does happen to work, the ranges are separate according to where the iterator came from.
For example, you could have a skip_iterator which takes as a constructor parameter the number of steps it should move forward each time ++ is called. Then you could either include that integer in the comparison, so that thing.skip_begin(1) != thing.skip_begin(2), or you could exclude it so that thing.skip_begin(1) == thing.skip_begin(2) but ++(++(thing.skip_begin(1))) == ++(thing.skip_begin(2)). I think either is fine provided it's documented, or you could document that comparing them is UB unless they came from the same starting point.
Why are more types a problem? It does not necessarily mean much more code. For instance, you could make you iterator-type a template that takes an iteration-policy as template-parameter. The iteration-policy could then provide the implementation of the iteration:
struct iterate_all_policy {
iterate_all_policy(iterator<iterate_all_policy> & it) : it(it) {}
void advance() { /* implement specific advance here */ }
private:
iterator<iterate_all_policy> & it;
}
You will probably have to make the iteration-policy-classes friends of the iterator-types.

Generic operations on C++ containers

How to write generic operations on C++ STL containers? For example, Java has Collection interface, which every Java containers (except for maps) implements. I can do operations like add, remove, contains, and iterations regardless of whether the actual container is LinkedList, HashSet, ArrayBlockingQueue, etc. I find it very powerful.
C++ has iterators, but what about operations like add and remove?
vector has push_back, set has insert, queue has push. How to add something to C++ container in a generic way?
Iteration:
All standard containers have iterators which give ordered access to the elements of the containers. These can also be used in generic algorithms which work on any conforming iterator type.
Insertion:
All sequences and associative containers can have elements inserted into them by the expression c.insert(i, x) -- where c is a sequence or associative container, i is an iterator into c and x is a value that you want to add to c.
std::inserter and friends can be used to add elements to a sequence or associative container in a generic way.
Removal:
For any sequence or associative container the following code works:
while (true) {
X::iterator it(std::find(c.begin(), c.end(), elem));
if (it == c.end()) break;
c.erase(it);
}
Where X is the type of the container, c is a container object and elem is an object with the value that you want to remove from the container.
For sequences there is the erase-remove idiom, which looks like:
c.erase(std::remove(c.begin(), c.end(), elem), c.end());
For associative containers you can also do:
c.erase(k);
Where k is a key corresponding to the element that you want to erase.
A nice interface to all of this:
See Boost.Range.
Note -- these are compile time substitutable, whereas the java ones are run time substitutable. To allow run-time substitution it is necessary to use type erasure (that is -- make a templated sub-class that forwards the required interface to the container that it is instantiated with).
Look at the header <algorithm>. There are plenty of generic algorithms in there for finding, sorting, counting, copying, etc, that work on anything providing iterators with various specified characteristics.
C++ has std::inserter and friends to add elements to a container in a generic way. They are in the header file iterator.

why there is no find for vector in C++

what's the alternative?
Should I write by myself?
There is the std::find() algorithm, which performs a linear search over an iterator range, e.g.,
std::vector<int> v;
// Finds the first element in the vector that has the value 42:
// If there is no such value, it == v.end()
std::vector<int>::const_iterator it = std::find(v.begin(), v.end(), 42);
If your vector is sorted, you can use std::binary_search() to test whether a value is present in the vector, and std::equal_range() to get begin and end iterators to the range of elements in the vector that have that value.
The reason there is no vector::find is because there is no algorithmic advantage over std::find (std::find is O(N) and in general, you can't do better for vectors).
But the reason you have map::find is because it can be more efficient (map::find is O(log N) so you would always want to use that over std::find for maps).
Who told you that? There's is "find" algorithm for vector in C++. Ironically Coincidentally, it is called std::find. Or maybe std::binary_search. Or something else, depending on the properties of the data stored in your vector.
Containers get their own specific versions of generic algorithms (implemented as container methods) only when the effective implementation of the algorithm is somehow tied to the internal details of the container. std::list<>::sort would be one example.
In all other cases, the algorithms are implemented by standalone functions.
Having a 'find' functionality in the container class violates 'SRP' (Single Responsibility Principle). A container's core functionality is to provide interfaces for storage, retrieval of elements in the container. 'Finding', 'Sorting', 'Iterating' etc are not core functionality of any container and hence not part of it's direct interface.
However as 'Herb' states in Namespace Principle, 'find' is a part of the interface by being defined in the same namespace as 'vector' namely 'std'.
Use std::find(vec.begin(), vec.end(), value).
And don't forget to include <algorithm>
what's the alternative?
The standard offers std::find, for sequential search over arbitrary sequences of like-elements (or something like that).
This can be applied to all containers supporting iterators, but for internally sorted containers (like std::map) the search can be optimized. In that case, the container offers it's own find member function.
why there is no find for vector in C++?
There was no point in creating a std::vector<???>::find as the implementation would be identical to std::find(vector.begin(), vector.end(), value_to_find);.
Should I write by myself?
No. Unless you have specific limitations or requirements, you should use the STL implementation whenever possible.