So far, I have been storing the array in a vector and then looping through the vector to find the matching element and then returning the index.
Is there a faster way to do this in C++? The STL structure I use to store the array doesn't really matter to me (it doesn't have to be a vector). My array is also unique (no repeating elements) and ordered (e.g. a list of dates going forward in time).
Since the elements are sorted, you can use a binary search to find the matching element. The C++ Standard Library has a std::lower_bound algorithm that can be used for this purpose. I would recommend wrapping it in your own binary search algorithm, for clarity and simplicity:
#include <algorithm>  // std::lower_bound

/// Performs a binary search for an element
///
/// The range `[first, last)` must be ordered via `comparer`. If `value` is
/// found in the range, an iterator to the first element comparing equal to
/// `value` will be returned; if `value` is not found in the range, `last` is
/// returned.
template <typename RandomAccessIterator, typename Value, typename Comparer>
auto binary_search(RandomAccessIterator const first,
                   RandomAccessIterator const last,
                   Value const& value,
                   Comparer comparer) -> RandomAccessIterator
{
    RandomAccessIterator it(std::lower_bound(first, last, value, comparer));
    if (it == last || comparer(*it, value) || comparer(value, *it))
        return last;
    return it;
}
(The C++ Standard Library has a std::binary_search, but it returns a bool: true if the range contains the element, false otherwise. It's not useful if you want an iterator to the element.)
Once you have an iterator to the element, you can use the std::distance algorithm to compute the index of the element in the range.
Both of these algorithms work equally well with any random access sequence, including both std::vector and ordinary arrays.
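As a minimal sketch of the two steps above (search, then convert the iterator to an index), assuming a sorted std::vector<int> and the default `<` ordering; the helper name `index_of` and the -1 "not found" convention are illustrative choices, not part of the answer above:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: returns the index of `value` in the sorted vector
// `sorted`, or -1 if `value` is not present.
std::ptrdiff_t index_of(const std::vector<int>& sorted, int value)
{
    // lower_bound finds the first element that is not less than `value`.
    auto it = std::lower_bound(sorted.begin(), sorted.end(), value);
    if (it == sorted.end() || *it != value)
        return -1;
    // For random access iterators, std::distance is a constant-time subtraction.
    return std::distance(sorted.begin(), it);
}
```

The whole lookup is O(log N) because the iterators are random access; with a std::list the same code would compile but the distance step would be linear.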
If you want to associate a value with an index and find the index quickly you can use std::map or std::unordered_map. You can also combine these with other data structures (e.g. a std::list or std::vector) depending on the other operations you want to perform on the data.
For example, when creating the vector we also create a lookup table:
vector<int> test(test_size);
unordered_map<int, size_t> lookup;
int value = 0;
for(size_t index = 0; index < test_size; ++index)
{
    test[index] = value;
    lookup[value] = index;
    value += rand()%100+1;
}
Now, to look up the index, you simply write:
size_t index = lookup[find_value];
Using a hash table based data structure (e.g. the unordered_map) is a fairly classical space/time tradeoff and can outperform doing a binary search for this sort of "reverse" lookup operation when you need to do a lot of lookups. The other advantage is that it also works when the vector is unsorted.
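One caveat with operator[] on an unordered_map is that it inserts a default entry when the key is absent, so a lookup of a missing value quietly returns index 0. A hedged sketch of a lookup that reports absence instead; the helper name `find_index` and its out-parameter style are assumptions made for illustration:

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: stores the index recorded for `value` in `index` and
// returns true; returns false and leaves `index` untouched if `value` is absent.
bool find_index(const std::unordered_map<int, std::size_t>& lookup,
                int value, std::size_t& index)
{
    auto it = lookup.find(value);  // find() never modifies the table
    if (it == lookup.end())
        return false;
    index = it->second;
    return true;
}
```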
For fun :-) I've done a quick benchmark in VS2012RC comparing James' binary search code with a linear search and with using unordered_map for lookup, all on a vector:
Up to ~50000 elements, unordered_map significantly (x3-4) outperforms the binary search, which exhibits the expected O(log N) behavior. The somewhat surprising result is that unordered_map loses its O(1) behavior past 10000 elements, presumably due to hash collisions, or perhaps an implementation issue.
EDIT: max_load_factor() for the unordered map is 1 so there should be no collisions. The difference in performance between the binary search and the hash table for very large vectors appears to be caching related and varies depending on the lookup pattern in the benchmark.
Choosing between std::map and std::unordered_map talks about the difference between ordered and unordered maps.
If I have a structure like
std::map<string, int> myMap;
myMap["banana"] = 1;
myMap["apple"] = 1;
myMap["orange"] = 1;
How can I access myMap[0]?
I know that the map sorts internally and I'm fine with this, I want to get a value in the map by index. I've tried myMap[0] but I get the error:
Error 1 error C2679: binary '[' : no operator found which takes a right-hand operand of type 'int' (or there is no acceptable conversion)
I realise I could do something like this:
string getKeyAtIndex (int index){
    map<string, int>::const_iterator end = myMap.end();
    int counter = 0;
    for (map<string, int>::const_iterator it = myMap.begin(); it != end; ++it) {
        if (counter == index)
            return it->first;
        counter++;
    }
    return string(); // index out of range
}
But surely this is hugely inefficient? Is there a better way?
Your map is not supposed to be accessed that way; it's indexed by keys, not by positions. A map iterator is bidirectional, just like a list iterator, so the function you are using is no more inefficient than accessing a list by position. If you want random access by position, then use a vector or a deque.
Your function could be written with help from std::advance(iter, index) starting from begin():
auto it = myMap.begin();
std::advance(it, index);
return it->first;
There may be an implementation specific (non-portable) method to achieve your goal, but not one that is portable.
In general, the std::map is implemented as a type of binary tree, usually sorted by key. The definition of the first element differs depending on the ordering. Also, in your definition, is element[0] the node at the top of the tree or the left-most leaf node?
Many binary trees are implemented as linked structures, much like linked lists. Most linked lists cannot be accessed directly like an array, because to find element 5 you have to follow the links. This is by definition.
You can resolve your issue by using both a std::vector and a std::map:
Allocate the object from dynamic memory.
Store the pointer, along with the key, into the std::map.
Store the pointer in the std::vector at the position you want it at.
The std::map will allow an efficient method to access the object by key.
The std::vector will allow an efficient method to access the object by index.
Storing pointers allows for only one instance of the object instead of having to maintain multiple copies.
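The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive design: the `Fruit` and `FruitStore` names are hypothetical, std::shared_ptr stands in for "the pointer", and keys are assumed to be unique.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical payload type.
struct Fruit {
    std::string name;
    int count;
};

// Hypothetical wrapper: one heap-allocated object per entry, shared by a
// map (access by key) and a vector (access by position).
class FruitStore {
public:
    // Assumes `name` has not been added before.
    void add(const std::string& name, int count) {
        auto item = std::make_shared<Fruit>(Fruit{name, count});
        by_key_[name] = item;        // O(log n) lookup by key later
        by_index_.push_back(item);   // O(1) lookup by insertion position later
    }

    // Throws std::out_of_range if i is not a valid position.
    Fruit& at_index(std::size_t i) { return *by_index_.at(i); }

    // Returns nullptr if the key is absent.
    Fruit* find(const std::string& name) {
        auto it = by_key_.find(name);
        return it == by_key_.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::string, std::shared_ptr<Fruit>> by_key_;
    std::vector<std::shared_ptr<Fruit>> by_index_;
};
```

Because both containers point at the same object, a change made through one view is visible through the other. Note that positions here follow insertion order, not the map's sorted order.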
Well, actually you can't. The way you found is very inefficient: it has a computational complexity of O(n) (n operations in the worst case, where n is the number of elements in the map).
Accessing an item in a vector or in an array has complexity O(1) by comparison (constant computational complexity, a single operation).
Consider that map is internally implemented as a red-black tree (or AVL tree, depending on the implementation), and every insert, delete and lookup operation is O(log n) in the worst case (it requires about log base 2 of n operations to find an element in the tree), which is quite good.
One way to deal with this is to use a custom class that holds both a vector and a map.
Appending to the vector is amortized O(1) (updating the map alongside it is still O(log n)), lookup by name is O(log n), and lookup by index is O(1), but in this case removal is O(n).
Previous answer (see comment): How about just myMap.begin();
You could implement a random-access map by using a vector backing-store, which is essentially a vector of pairs. You of course lose all the benefits of the standard library map at that point.
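A minimal sketch of such a vector-backed map, assuming string keys and int values as in the question. The class name `FlatMap` and its members are hypothetical; the entries are kept sorted by key so that lookup by key stays O(log n) via binary search, while access by position is O(1).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sorted-vector map.
class FlatMap {
public:
    using Entry = std::pair<std::string, int>;

    // Inserts or overwrites; O(n) because later entries may have to shift.
    void set(const std::string& key, int value) {
        auto it = position(key);
        if (it != entries_.end() && it->first == key)
            it->second = value;
        else
            entries_.insert(it, Entry(key, value));
    }

    // Lookup by key in O(log n); returns nullptr if the key is absent.
    const int* find(const std::string& key) const {
        auto it = std::lower_bound(entries_.begin(), entries_.end(), key, less_key);
        return (it != entries_.end() && it->first == key) ? &it->second : nullptr;
    }

    // Lookup by position in sorted order in O(1); throws if out of range.
    const Entry& nth(std::size_t i) const { return entries_.at(i); }

    std::size_t size() const { return entries_.size(); }

private:
    static bool less_key(const Entry& e, const std::string& key) {
        return e.first < key;
    }
    std::vector<Entry>::iterator position(const std::string& key) {
        return std::lower_bound(entries_.begin(), entries_.end(), key, less_key);
    }
    std::vector<Entry> entries_;
};
```

The trade-off is the one named above: insertion and removal become O(n), and iterators and references are invalidated by insertion, unlike with std::map.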
You can use other map-like containers.
Keeping a size field in each node makes random access into a binary search tree easy.
Here is my implementation, in std style, with random access iterators.
A size balanced tree:
https://github.com/mm304321141/zzz_lib/blob/master/sbtree.h
And a B+ tree:
https://github.com/mm304321141/zzz_lib/blob/master/bpptree.h
std::map is an ordered container, but its iterators support only bidirectional access, not random access. Therefore, you can only reach the nth element by stepping through all the elements before it. A shorter alternative to your example is to use the standard iterator library:
std::pair<const std::string, int> &nth_element = *std::next(myMap.begin(), N);
This has linear complexity, which is not ideal if you plan to frequently access this way in large maps.
An alternative is to use an ordered container that supports random access. For example, boost::container::flat_map provides a member function nth which gives you exactly what you are looking for.
std::map<string,int>::iterator it = std::next(mymap.begin(), index); // map iterators have no operator+
I want a data structure where I insert elements into it, and right after the insertion, it stays sorted and I find out the index of the element I just inserted in log N time.
I've tried using a vector and a multiset but neither satisfied both requirements.
Vector:
If I want to find the index of an element, I can do:
using namespace std;
vector<int>::iterator it = lower_bound(myvec.begin(), myvec.end(), someElement);
int index = (it - myvec.begin());
However, the vector doesn't allow for O(log N) insertion time while remaining sorted. Sorting the vector after each insertion would be O(N log N) each time. I tried:
vector<int>::iterator it = lower_bound(myvec.begin(), myvec.end(), someElement);
myvec.insert(it, someElement);
This finds the right location to insert the element, but myvec.insert runs in O(N) time rather than O(log N) time.
Multiset:
The multiset allows me to insert and remain sorted, but it lacks a way to get the index of the element after insertion.
multiset<int>::iterator it = lower_bound(myset.begin(), myset.end(), someElement);
After using lower_bound, I cannot merely do
int index = (it - myset.begin());
like I would with a vector. Instead, a method I considered was:
int index = distance(myset.begin(), it);
However, distance runs in O(N) time instead of O(log N) time.
Is there a data structure or method that allows me to satisfy both requirements in log N time?
Neither vector nor multiset can achieve the requirements.
A data structure that does achieve the requirements is a balanced binary search tree that is augmented by storing the size of the subtree in each node. Such an augmented search tree is called an "order statistic tree".
Although the ordered associative containers of the standard library are internally implemented using search trees, the standard library does not provide a generic tree data structure that could be used to implement this.
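To make the idea concrete, here is a rough sketch of a size-augmented tree. It uses a treap (a randomized search tree, swapped in here because it is short to write) rather than a red-black tree, so the O(log N) bounds are expected rather than worst-case. The class name `OrderStatisticTreap` and all member names are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <random>
#include <utility>

// Hypothetical order statistic tree built on a treap; duplicates are allowed.
class OrderStatisticTreap {
    struct Node {
        int key;
        unsigned priority;
        std::size_t size = 1;  // number of nodes in this subtree
        std::unique_ptr<Node> left, right;
        Node(int k, unsigned p) : key(k), priority(p) {}
    };
    using Ptr = std::unique_ptr<Node>;

    Ptr root_;
    std::mt19937 rng_{12345};  // fixed seed, chosen only for reproducibility

    static std::size_t count(const Ptr& n) { return n ? n->size : 0; }
    static void update(Node& n) { n.size = 1 + count(n.left) + count(n.right); }

    // Splits t into `lo` (keys < key) and `hi` (keys >= key).
    static void split(Ptr t, int key, Ptr& lo, Ptr& hi) {
        if (!t) { lo.reset(); hi.reset(); return; }
        if (t->key < key) {
            Ptr right = std::move(t->right);
            split(std::move(right), key, t->right, hi);
            update(*t);
            lo = std::move(t);
        } else {
            Ptr left = std::move(t->left);
            split(std::move(left), key, lo, t->left);
            update(*t);
            hi = std::move(t);
        }
    }

    // Merges a and b, assuming every key in a is <= every key in b.
    static Ptr merge(Ptr a, Ptr b) {
        if (!a) return b;
        if (!b) return a;
        if (a->priority > b->priority) {
            a->right = merge(std::move(a->right), std::move(b));
            update(*a);
            return a;
        }
        b->left = merge(std::move(a), std::move(b->left));
        update(*b);
        return b;
    }

public:
    // Inserts key and returns its 0-based index in sorted order, i.e. the
    // number of stored keys strictly less than it.
    std::size_t insert(int key) {
        Ptr lo, hi;
        split(std::move(root_), key, lo, hi);
        const std::size_t index = count(lo);  // subtree sizes give the rank
        Ptr node = std::make_unique<Node>(key, rng_());
        root_ = merge(merge(std::move(lo), std::move(node)), std::move(hi));
        return index;
    }

    std::size_t size() const { return count(root_); }
};
```

The index comes for free from the subtree sizes: after splitting, everything smaller than the new key sits in one tree whose size is stored at its root.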
What is the most effective way to get the index of an iterator of an std::vector? explains how to do it for std::vector or std::list but what about std::map?
The cleanest way to do this would be to use the std::distance function:
auto index = std::distance(myMap.begin(), myMapItr);
However, this runs in O(n) time, which is inefficient for large maps.
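For completeness, a small sketch that wraps this in a function going from a key to its position; the helper name `index_of_key` and the -1 "not found" convention are assumptions made for illustration:

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Hypothetical helper: position of `key` in the map's iteration order,
// or -1 if the key is absent.
std::ptrdiff_t index_of_key(const std::map<std::string, int>& m,
                            const std::string& key)
{
    auto it = m.find(key);                 // O(log n)
    if (it == m.end())
        return -1;
    return std::distance(m.begin(), it);   // O(n): walks the bidirectional iterator
}
```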
If you need to determine the index of an iterator into a map or other ordered collection, you may want to search for a library containing an order statistic tree, which is a modified binary search tree that supports efficient (O(1) or O(log n)) time lookup of the index of a particular value in the tree.
Alternatively, if you are manually iterating over the tree, you can just keep a counter lying around alongside the iterator that you increment every time you traverse from one element to the next. This gives O(1)-time lookup of the index of the iterator, but is not fully general.
Hope this helps!
Try this:
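// Returns the index of pointer t in items, or -1 if it is not present.
int IndexOf(const std::vector<Type*>& items, const Type* t)
{
    for (std::size_t index = 0; index < items.size(); ++index)
    {
        if (items[index] == t)
            return static_cast<int>(index);
    }
    return -1;
}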
map <int, string> rollCallRegister;
map <int, string> :: iterator rollCallRegisterIter;
map <int, string> :: iterator tempRollCallRegisterIter;
rollCallRegisterIter = rollCallRegister.begin ();
tempRollCallRegisterIter = rollCallRegister.insert (rollCallRegisterIter, pair <int, string> (55, "swati"));
rollCallRegisterIter++;
tempRollCallRegisterIter = rollCallRegister.insert (rollCallRegisterIter, pair <int, string> (44, "shweta"));
rollCallRegisterIter++;
tempRollCallRegisterIter = rollCallRegister.insert (rollCallRegisterIter, pair <int, string> (33, "sindhu"));
// Displaying contents of this map.
cout << "\n\nrollCallRegister contains:\n";
for (rollCallRegisterIter = rollCallRegister.begin(); rollCallRegisterIter != rollCallRegister.end(); ++rollCallRegisterIter)
{
cout << (*rollCallRegisterIter).first << " => " << (*rollCallRegisterIter).second << endl;
}
Output:
rollCallRegister contains:
33 => sindhu
44 => shweta
55 => swati
I have incremented the iterator. Why is it still getting sorted? And if the position is supposed to be changed by the map on its own, then what's the purpose of providing an iterator?
Because std::map is a sorted associative container.
In a map, the key value is generally used to uniquely identify the element, while the mapped value is some sort of value associated to this key.
According to the documentation, the position parameter is:
the position of the first element to be compared for the insertion
operation. Notice that this does not force the new element to be in
that position within the map container (elements in a set always
follow a specific ordering), but this is actually an indication of a
possible insertion position in the container that, if set to the
element that precedes the actual location where the element is
inserted, makes for a very efficient insertion operation. iterator is
a member type, defined as a bidirectional iterator type.
So the purpose of this parameter is mainly to speed up insertion slightly by narrowing the range of elements that has to be searched.
You can use std::vector<std::pair<int,std::string>> if the order of insertion is important.
The interface is indeed slightly confusing, because it looks very much like std::vector<int>::insert (for example) and yet does not produce the same effect...
For associative containers, such as set, map and the new unordered_set and co, you completely relinquish the control over the order of the elements (as seen by iterating over the container). In exchange for this loss of control, you gain efficient look-up.
It would not make sense to suddenly give you control over the insertion, as it would let you break invariants of the container, and you would lose the efficient look-up that is the reason to use such containers in the first place.
And thus insert(It position, value_type&& value) does not insert at said position...
However, this gives us some room for optimization: when inserting an element in an associative container, a look-up needs to be performed to locate where to insert this element. By letting you specify a hint, you are given an opportunity to help the container speed up the process.
This can be illustrated for a simple example: suppose that you receive elements already sorted by way of some interface, it would be wasteful not to use this information!
template <typename Key, typename Value, typename InputStream>
void insert(std::map<Key, Value>& m, InputStream& s) {
    typename std::map<Key, Value>::iterator it = m.begin();
    for (; s; ++s) {
        // The hinted overload returns an iterator, not a pair.
        it = m.insert(it, *s);
    }
}
Some of the items might not be well sorted, but it does not matter, if two consecutive items are in the right order, then we will gain, otherwise... we'll just perform as usual.
The map is always sorted, but you give a "hint" as to where the element may go as an optimisation.
The insertion is O(log N), but if you successfully tell the container where the element goes, it is amortized constant time.
Thus if you are creating a large container of already-sorted values, then each value will get inserted at the end, although the tree will need rebalancing quite a few times.
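A small sketch of that already-sorted case, assuming int keys and string values as in the question and using end() as the hint (since C++11 the hint names the position just before which the element is expected to go). The function name `build_from_sorted` is hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds a map from input assumed to be sorted by key
// in ascending order.
std::map<int, std::string> build_from_sorted(
    const std::vector<std::pair<int, std::string>>& sorted)
{
    std::map<int, std::string> m;
    for (const auto& entry : sorted) {
        // Hint: each new element is expected to go just before end().
        // A wrong hint only costs time; the map stays correctly sorted.
        m.insert(m.end(), entry);
    }
    return m;
}
```

If the input turns out not to be sorted, the result is still a valid, sorted map; the hint is an optimisation only and never changes where an element ends up.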
As sad_man says, it's associative. If you set a value with an existing key, then you overwrite the previous value.
Now the iterators are necessary because you don't know what the keys are, usually.