find() vs lower_bound+key_comp - c++

I came across the following question on Stack Overflow:
std::map insert or std::map find?
Why is using find() considered inferior to lower_bound() + key_comp()?
Assume I have the following map:
map<int, int> myMap;
myMap[1]=1;
myMap[2]=3;
myMap[3]=5;
int key = xxx; //some value of interest.
int value = yyy;
The suggested answer is to use:
map<int, int>::iterator itr = myMap.lower_bound(key);
if (itr != myMap.end() && !(myMap.key_comp()(key, itr->first)))
{
//key found.
// do processing for itr->second
//
}else {
//insert into the end position
myMap.insert (itr, map<int, int>::value_type(key, value));
}
Why is it better than the following?
map<int, int>::iterator itr = myMap.find(key);
if (itr != myMap.end())
{
//key found.
// do processing for itr->second
//
}else {
//insert into the end position
myMap.insert (itr, map<int, int>::value_type(key, value));
}

In the second case, notice that if you need to insert the value, the iterator is always myMap.end(). This cannot help the insert operation (except when the new element belongs at the end, of course): the container still needs to find the correct position for the new node, which is usually O(log N).
With lower_bound(), you have already found the best hint for where to insert the new element, and this is the optimization opportunity that the first technique offers. This might lead to performance close to O(1). You have an additional key comparison, but that is O(1) as well (from the container's perspective).
Since both the initial find() and lower_bound() are O(log N), you end up with one O(log N) plus two O(1) operations in the first technique, and with two O(log N) operations in the second.
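The lower_bound() technique can be wrapped in a small helper. This is only a sketch: find_or_insert is a hypothetical name, not something from the standard library or the original answer. It relies on the guarantee (C++11 and later) that a hinted insert is amortized O(1) when the new element goes right before the hint, which is exactly the position lower_bound() returns.

```cpp
#include <map>
#include <utility>

// Hypothetical helper (illustrative name): returns an iterator to the element
// with the given key, inserting (key, value) first if the key is absent.
// The bool is true when an insertion took place.
template <typename Map>
std::pair<typename Map::iterator, bool>
find_or_insert(Map& m, const typename Map::key_type& key,
               const typename Map::mapped_type& value)
{
    // First element whose key is not less than `key`: O(log N).
    typename Map::iterator it = m.lower_bound(key);

    // lower_bound guarantees !(it->first < key). If in addition
    // !(key < it->first), the keys are equivalent, so the key exists.
    if (it != m.end() && !m.key_comp()(key, it->first)) {
        return std::make_pair(it, false);
    }

    // `it` is the element just after the insertion point (or end()),
    // so passing it as the hint avoids a second O(log N) search.
    it = m.insert(it, typename Map::value_type(key, value));
    return std::make_pair(it, true);
}
```

With the map from the question, find_or_insert(myMap, 2, 0) would return the existing element (2, 3) without inserting, while find_or_insert(myMap, 10, 8) would insert at the end.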

Related

Select random element in an unordered_map

I define an unordered_map like this:
std::unordered_map<std::string, Edge> edges;
Is there an efficient way to choose a random Edge from the unordered_map edges?
Pre-C++11 solution:
std::tr1::unordered_map<std::string, Edge> edges;
std::tr1::unordered_map<std::string, Edge>::iterator random_it = edges.begin();
std::advance(random_it, rand_between(0, edges.size()));
C++11 onward solution:
std::unordered_map<std::string, Edge> edges;
auto random_it = std::next(std::begin(edges), rand_between(0, edges.size()));
The function that selects a valid random number is up to your choice, but be sure it returns a number in range [0 ; edges.size() - 1] when edges is not empty.
The std::next function simply wraps the std::advance function in a way that permits direct assignment.
Is there an efficient way to choose a random Edge from the unordered_map edges?
If by efficient you mean O(1), then no, it is not possible.
Since the iterators returned by unordered_map::begin / end are ForwardIterators, the approaches that simply use std::advance are O(n) in the number of elements.
If your specific use allows it, you can trade some randomness for efficiency:
You can select a random bucket (that can be accessed in O(1)), and then a random element inside that bucket.
int bucket, bucket_size;
do
{
bucket = rnd(edges.bucket_count());
}
while ( (bucket_size = edges.bucket_size(bucket)) == 0 );
auto element = std::next(edges.begin(bucket), rnd(bucket_size));
Where rnd(n) returns a random number in the [0,n) range.
In practice if you have a decent hash most of the buckets will contain exactly one element, otherwise this function will slightly privilege the elements that are alone in their buckets.
Strict O(1) solution without buckets:
Keep a vector of keys, when you need to get a random element from your map, select a random key from the vector and return corresponding value from the map - takes constant time
If you insert a key-value pair into your map, check if such key is already present, and if it's not the case, add that key to your key vector - takes constant time
If you want to remove an element from the map after it was selected, swap the key you selected with the back() element of your key vector and call pop_back(), after that erase the element from the map and return the value - takes constant time
However, there is a limitation: if you want to delete elements from the map aside from random picking, you need to fix your key vector, this takes O(n) with naive approach. But still there is a way to get O(1) performance: keep a map that tells you where the key is in the key vector and update it with swap :)
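The bookkeeping described above (a key vector plus a map from each key to its position in that vector) might look like the following sketch. RandomPickMap and its member names are hypothetical, chosen only for illustration; all three operations are O(1) on average, assuming a reasonable hash.

```cpp
#include <cstddef>
#include <random>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical wrapper (illustrative names): an unordered_map plus a key
// vector and a key -> position index.
template <typename Key, typename Value>
class RandomPickMap {
public:
    // Returns false (and changes nothing) if the key is already present.
    bool insert(const Key& key, const Value& value) {
        if (!values_.emplace(key, value).second) return false;
        position_[key] = keys_.size();
        keys_.push_back(key);
        return true;
    }

    // Erases any key, not only one that was picked at random.
    // The key is taken by value so it cannot alias an element being erased.
    bool erase(Key key) {
        auto pos = position_.find(key);
        if (pos == position_.end()) return false;
        const std::size_t index = pos->second;
        // Move the last key into the vacated slot and record its new position.
        keys_[index] = keys_.back();
        position_[keys_[index]] = index;
        keys_.pop_back();
        position_.erase(key);
        values_.erase(key);
        return true;
    }

    // Precondition: the container is not empty.
    template <typename Rng>
    const std::pair<const Key, Value>& pick(Rng& rng) const {
        std::uniform_int_distribution<std::size_t> dist(0, keys_.size() - 1);
        return *values_.find(keys_[dist(rng)]);
    }

    std::size_t size() const { return keys_.size(); }

private:
    std::unordered_map<Key, Value> values_;
    std::unordered_map<Key, std::size_t> position_;
    std::vector<Key> keys_;
};
```

For the question's case this would be instantiated as RandomPickMap<std::string, Edge> and used with a generator such as std::mt19937. The cost is that every key is stored three times.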
This is how you can get random element from a map:
std::unordered_map<std::string, Edge> edges;
auto item = edges.begin();
int random_index = rand() % edges.size();
std::advance(item, random_index);
Or take a look at this answer, which provides the following solution:
std::unordered_map<std::string, Edge> edges;
auto item = edges.begin();
std::advance( item, random_0_to_n(edges.size()) );
The solution of
std::unordered_map<std::string, Edge> edges;
auto random_it = std::next(std::begin(edges), rand_between(0, edges.size()));
is extremely slow.
A much faster solution:
when inserting into edges, simultaneously emplace each key into a std::vector<std::string> vec
pick a random int index in the range 0 to vec.size() - 1
then get edges[vec[index]]
You can see this problem:
Problem 380. Insert Delete GetRandom O(1)
You can keep a vector alongside the map and use its random-access indexing to pick random values more efficiently. Like this:
class RandomizedSet {
public:
unordered_map<int, int> m;
vector<int> data;
RandomizedSet() {
}
bool insert(int val) {
if(m.count(val)){
return false;
} else{
int index = data.size();
data.push_back(val);
m[val] = index;
return true;
}
}
bool remove(int val) {
if(m.count(val)){
int curr_index = m[val];
int max_index = data.size()-1;
m[data[max_index]] = curr_index;
swap(data[curr_index], data[max_index]);
data.pop_back();
m.erase(val);
return true;
} else{
return false;
}
}
int getRandom() {
return data[rand() % data.size()];
}
};
/**
* Your RandomizedSet object will be instantiated and called as such:
* RandomizedSet* obj = new RandomizedSet();
* bool param_1 = obj->insert(val);
* bool param_2 = obj->remove(val);
* int param_3 = obj->getRandom();
*/

Is there a .at() equivalent for a multimap?

Is there any way to get an iterator to a multimap, for specific keys? For example:
multimap<string,int> tmp;
tmp.insert(pair<string,int>("Yes", 1));
tmp.insert(pair<string,int>("Yes", 3));
tmp.insert(pair<string,int>("No", 5));
tmp.insert(pair<string,int>("Maybe", 1));
tmp.insert(pair<string,int>("Yes", 2));
multimap<string,int>::iterator it = tmp.at("Yes");
Then I could use it for the work I want to do. Is this possible in C++? Or do we have to just cycle through the multimap, element by element, and check for the key before doing the work?
You have find for a single key value pair (any matching the key), or equal_range to get all of the pairs that match a given key (this seems to be your best bet.)
multimap<Key, T> only sorts elements by Key, so we can only find all the elements whose key equals "Yes" and then check each element one by one.
typedef multimap<string,int>::iterator Iterator;
pair<Iterator, Iterator> iter_range = tmp.equal_range("Yes");
Iterator it;
for (it = iter_range.first; it != iter_range.second; ++it) {
if (it->second == 3) {
break;
}
}
if (it != iter_range.second) { // not found means it == iter_range.second, which need not be tmp.end()
tmp.erase(it);
}
In fact it's better to use multiset<T> in this case:
multiset< pair<string, int> > temp;
temp.insert(make_pair("Yes", 1));
temp.insert(make_pair("Yes", 3));
multiset< pair<string, int> >::iterator iter = temp.find(make_pair("Yes", 1));
if (iter != temp.end()) {
temp.erase(iter); // erases exactly one element
}
temp.erase(make_pair("Yes", 3)); // erases all elements equal to make_pair("Yes", 3)

Replace an element with old_value to new_value in vector - C++

I was going through some legacy code, and found out something that could be improved.
The vector has pointers to a class and all elements are unique in the vector, as per the design.
A function ReplaceVal replaces an element having old_value to a new_value in the vector, in the following fashion:
iterator i, i_e;
i = vector->begin();
i_e = vector->end ();
for (; i != i_e; ++i)
{
if ((*i) == old_child)
break;
}
// Insertion
vector->insert_call(new_child, i);
// Since, the pointers are invalidated, do another find for erase
i = vector->begin();
i_e = vector->end ();
for (; i != i_e; ++i)
{
if ((*i) == old_child)
break;
}
// Finally, erase the old_value
vector->erase_call(i);
So, essentially, this involves shifting of elements twice, each for insertion and erase, if you are inserting and erasing elements in the middle of the vector.
For n insertions and remove calls, the complexity is O(n*m), if m elements are shifted every time, on an average.
I think this can be improved if I use std::replace, as described in the MSDN documentation and the std::replace example.
std::replace would take O(n) comparisons against old_value plus one assignment of new_value. It'd be as simple as:
replace (vector.begin( ), vector.end( ), old_value , new_value);
Please correct me, if I am wrong and share feedback on anything that I missed.
P.S. The insert and erase are custom calls, which also update pointers to left_sibling and right_sibling for a given element.
You don't even need to do that:
iterator position = std::find( vector->begin(), vector->end(), old_child );
if ( position == vector->end() ) {
throw NoSuchElement();
}
*position = new_child;
should do the trick: no erase and no insert.
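As a self-contained sketch of the same idea, here is a small helper that reports failure through its return value instead of throwing. replace_first is a hypothetical name. It only overwrites the slot in the vector; the custom sibling-pointer updates mentioned in the question's P.S. are not modelled and would still have to be done by the caller.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (illustrative name): overwrite the first element equal
// to old_value with new_value. No elements are shifted, so the cost is one
// linear search plus one assignment.
template <typename T>
bool replace_first(std::vector<T>& v, const T& old_value, const T& new_value)
{
    typename std::vector<T>::iterator pos =
        std::find(v.begin(), v.end(), old_value);
    if (pos == v.end()) {
        return false;  // nothing to replace
    }
    *pos = new_value;
    return true;
}
```

Because the elements are unique by design, stopping at the first match does the same work as std::replace would, but can return early.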
vector erase returns an iterator pointing to the location of the element that followed the erased one
iter = myvector->erase(i);
then you can use that iterator to insert.
myvector->insert(iter, new_value);
or the other way around. vector insert returns an iterator pointing to the inserted element, so the old element is the one right after it
iter = myvector->insert(i, new_value);
myvector->erase(iter + 1);
Why not use one of these?
std::set
std::multiset
std::unordered_set
std::unordered_multiset
Why do you have such complexity? Is there any purpose? Maybe the vector is used somewhere else as well, but you could also use sets alongside the vector for searching.

C++ multimap iterator invalidation

I'm trying to figure out how std::multimap iterators work, therefore I've created a simple example that shows the substance of my problem. If I uncomment case 1, I expect the iterator to point to the first element with the key 1, but in reality it prints all the values associated with key 0 (as if nothing was erased) and sometimes it crashes, probably because the iterator is invalid. However, if I uncomment case 2, all the values with key 1 are properly deleted.
Is there any way to know what is the next valid iterator for the multimap after erasure?
(for example std::vector.erase(...) returns one)
std::multimap<int, int> m;
for(int j=0; j<3; ++j) {
for(int i=0; i<5; ++i) {
m.insert(std::make_pair(j, i));
}
}
for(std::multimap<int, int>::iterator it=m.begin(); it!=m.end();) {
printf("%d %d\n", (*it).first, (*it).second);
++it;
if( (*it).second == 3 ) {
//m.erase(0); //case 1
m.erase(1); //case 2
}
}
The cause of the problem
When you call m.erase(0) in your example, it points at an element with the key 0, so it is invalidated. m.erase(1) works because, when it is called the first time, it is not pointing to an element with the key 1, so it is not affected. In later iterations, no elements with the key 1 remain, so nothing is deleted, and no iterator is affected.
The Solution
Before C++11, multimap does not have an erase method that returns the next valid iterator (since C++11 the erase overloads taking iterators do; erase(key) still returns only the number of removed elements). One alternative is to call it = m.upper_bound(deleted_key); after the deletion. This is logarithmic, though, which might be too slow for your scenario (erase(x) and upper_bound would be two logarithmic operations).
Assuming you want to erase the key your iterator is currently pointing to, you could do something like this (otherwise, erase is fine, of course; not tested):
std::multimap<int, int>::iterator interval_start = m.begin();
for(std::multimap<int, int>::iterator it=m.begin(); it!=m.end();) {
    if(interval_start->first < it->first) // new interval starts here
        interval_start = it;
    if( (*it).second == 3 ) {
        std::multimap<int, int>::iterator interval_end = it;
        while((interval_end != m.end()) && (interval_end->first == it->first)) {
            ++interval_end; // search for end of interval - O(n)
        }
        m.erase(interval_start, interval_end); // erase interval - amortized O(1) per element
        it = interval_end; // set it to first iterator that was not erased
        interval_start = interval_end; // remember start of new interval
    } else {
        ++it; // advance only when nothing was erased, so no element is skipped
    }
}
This uses one linear operation, all the rest are constant time. If your map is very large, and you only have few items with equal keys, this will likely be faster. However, if you have many items with equal keys, the search for the end of the interval, is probably better done using upper_bound (O(log n) instead of O(n) when searching the end of the interval).
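For the many-equal-keys case, here is a sketch of the logarithmic variant. It uses equal_range, which returns the same two iterators as the lower_bound/upper_bound pair for that key. erase_groups_containing is a hypothetical name, and the behaviour (drop every key group that contains a given value) is an assumption about what the loop in the question is meant to do.

```cpp
#include <map>
#include <utility>

// Hypothetical helper (illustrative name): erase every key group that
// contains at least one element whose value equals `trigger`.
inline void erase_groups_containing(std::multimap<int, int>& m, int trigger)
{
    typedef std::multimap<int, int>::iterator Iter;
    for (Iter it = m.begin(); it != m.end(); ) {
        if (it->second == trigger) {
            // [group.first, group.second) spans all elements that share
            // it->first; finding both ends is O(log n).
            std::pair<Iter, Iter> group = m.equal_range(it->first);
            m.erase(group.first, group.second);
            // group.second was not erased, so it is still valid
            // (it may be m.end()).
            it = group.second;
        } else {
            ++it;
        }
    }
}
```

With the multimap from the question (keys 0 to 2, values 0 to 4 under each key) and trigger 3, every group contains a 3, so the container ends up empty.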
When you erase, the iterator becomes invalid. Instead, remember the next element, then erase:
std::multimap<int,int>::iterator next = it;
++next;
m.erase(it);
it = next;
First answer
std::multimap<int, int> m;
// ^^^^^^^^
std::map<int, int>::iterator it=m.begin();
// ^^^
Hum....
Second answer, re: edited question
for(std::multimap<int, int>::iterator it=m.begin(); it!=m.end();) {
.... stuff ....
m.erase(1); // container mutation
.... stuff ....
}
Be extremely careful when you mutate a container (any container) while iterating over it, as you might invalidate an iterator you depend on.
The so-called "node-based containers" (list, set, map...) are the most robust containers with respect to iterator invalidation: they only invalidate iterators to deleted elements (there is no way for these iterators not to be invalidated).
In this case you should check that the element you are about to delete isn't actually *it.
I am not quite sure what you are trying really to do with your loop.
From looking at your code, I think that your ++it is causing the problem. You are moving it to a place that might have been deleted. Move it to the end, after the if statement, and test. Like so:
for(std::multimap<int, int>::iterator it=m.begin(); it!=m.end();) {
printf("%d %d\n", (*it).first, (*it).second);
if( (*it).second == 3 ) {
//m.erase(0); //case 1
m.erase(1); //case 2
}
++it;
}
(Edited)
for(std::multimap<int, int>::iterator it=m.begin(); it!=m.end();) {
printf("%d %d\n", (*it).first, (*it).second);
++it;
if( (*it).second == 3 ) {
//m.erase(0); //case 1
m.erase(1); //case 2
}
}
In addition to the invalidation of the it iterator by m.erase, which may occur depending on the contents of the multimap (already covered in another answer), there is always the problem that you dereference the m.end() iterator on the last iteration of your for loop when you do if( (*it).second == 3 ), each time you run your program.
I suggest running and debugging with debug builds. I'm almost sure that every sane standard library implementation contains an assert to detect end() dereferencing.
Others above have already answered that you are playing with fire.
Also, I think you are forgetting that multimap is an ordered map, so you are iterating from the smallest keys to the largest ones. Therefore, in the first case you remove keys after printing some of them, but in the second case you remove them just before reaching them.

STL Multimap Remove/Erase Values

I have an STL multimap. I want to remove entries from the map which have a specific value. I do not want to remove the entire key, as that key may map to other values which are required.
Any help, please.
If I understand correctly these values can appear under any key. If that is the case you'll have to iterate over your multimap and erase specific values.
typedef std::multimap<std::string, int> Multimap;
Multimap data;
for (Multimap::iterator iter = data.begin(); iter != data.end();)
{
// you have to do this because iterators are invalidated
Multimap::iterator erase_iter = iter++;
// removes all even values
if (erase_iter->second % 2 == 0)
data.erase(erase_iter);
}
Since C++11, std::multimap::erase returns an iterator following the last removed element.
So you can rewrite Nikola's answer slightly more cleanly without needing to introduce the local erase_iter variable:
typedef std::multimap<std::string, int> Multimap;
Multimap data;
for (Multimap::iterator iter = data.begin(); iter != data.end();)
{
// removes all even values
if (iter->second % 2 == 0)
iter = data.erase(iter);
else
++iter;
}
(See also answer to this question)