std::map insert() hint location: difference between c++98 and c++11 - c++

On cplusplus' entry on map::insert() I read about the location one could add as a hint for the function that the "function optimizes its insertion time if position points to the element that will precede the inserted element" for c++98, while for c++11 the optimization occurs "if position points to the element that will follow the inserted element (or to the end, if it would be the last)".
Does this mean that the performance of code snippets of the following form (which are abundant in the legacy code I'm working on and modeled after Scott Meyers' "Effective STL", Item 24) was affected by switching to a C++11-compliant compiler?
auto pLoc = someMap.lower_bound(someKey);
if (pLoc != someMap.end() && !(someMap.key_comp()(someKey, pLoc->first))) {
    return pLoc->second;
} else {
    auto newValue = expensiveCalculation();
    someMap.insert(pLoc, make_pair(someKey, newValue)); // using the lower bound as hint
    return newValue;
}
What would be the best way to improve this pattern for use with C++11?

The C++98 specification is a defect in the standard. See the discussion in LWG issue 233 and N1780.
Recall that lower_bound returns an iterator to the first element with key not less than the specified key, while upper_bound returns an iterator to the first element with key greater than the specified key. If there is no key equivalent to the specified key in the container, then lower_bound and upper_bound return the same thing - an iterator to the element that would be after the key if it were in the map.
So, in other words, your current code already works correctly under the C++11 spec, and in fact would be wrong under C++98's defective specification.
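As a minimal sketch of the pattern this answer describes, the lookup-then-insert idiom can be written as below. The names `getOrCompute` and `expensiveCalculation` (here taking the key and returning its length) are hypothetical stand-ins for the code in the question, not part of any library.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the question's expensiveCalculation().
inline int expensiveCalculation(const std::string& key) {
    return static_cast<int>(key.size());
}

// Returns the cached value for `key`, computing and inserting it on a miss.
inline int getOrCompute(std::map<std::string, int>& cache, const std::string& key) {
    auto pos = cache.lower_bound(key);  // first element whose key is not less than `key`
    if (pos != cache.end() && !cache.key_comp()(key, pos->first)) {
        return pos->second;  // key already present
    }
    // On a miss, `pos` is the element that would follow the new one (or end()),
    // which is the hint position the C++11 wording asks for.
    auto inserted = cache.insert(pos, std::make_pair(key, expensiveCalculation(key)));
    assert(inserted->first == key);
    return inserted->second;
}
```

On a miss, lower_bound and upper_bound return the same iterator, so either one yields a valid C++11 hint here.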

Yes, it will affect the complexity. Giving the correct hint will make insert() have amortized constant complexity, while giving an incorrect hint will force the map to fall back to a regular search for the position, giving logarithmic complexity. Basically, a good hint makes the insertion happen in constant time, no matter how big your map is; with a bad hint the insertion will be slower on larger maps.
The solution is, apparently, to search for the hint with upper_bound instead of lower_bound.

I am thinking the correct C++11-style hint insertion might be as follows:
auto it = table.upper_bound(key); // upper_bound returns an iterator pointing to the first element that is greater than key
if (it == table.begin() || std::prev(it)->first < key) {
    // key not found path: it still points to the element that will follow the new one
    table.insert(it, make_pair(key, value));
}
else {
    // key found path: the matching element sits just before it
    std::prev(it)->second = value;
}

A snapshot of a working lambda function for your reference.
Note: m_map should not be empty. It is trivially known where to add the element if the map is empty.
auto create_or_get_iter = [this] (const K& key) {
auto it_upper = m_map.upper_bound(key);
auto it_effective = it_upper == m_map.begin() ? it_upper : std::prev(it_upper);
auto init_val = it_effective->second;
if (it_effective == m_map.begin() || it_effective->first < key) {
return m_map.insert(it_effective, std::make_pair(key, init_val));
} else {
it_effective->second = init_val;
return it_effective;
}
};

Related

Efficient substitute for std::map::insert_or_assign with hint

I'm trying to write a substitute for std::map::insert_or_assign that takes the hint parameter, for build environments that don't support C++17.
I'd like for this substitute to be just as efficient, and not require that the mapped type be DefaultConstructible. The latter requirement rules out map[key] = value.
I've come up with this:
template <class M, class K, class T>
typename M::iterator insert_or_assign(M& map, typename M::const_iterator hint,
K&& key, T&& value)
{
using std::forward;
auto old_size = map.size();
auto iter = map.emplace_hint(hint, forward<K>(key), forward<T>(value));
// If the map didn't grow, the key already existed and we can directly
// assign its associated value.
if (map.size() == old_size)
iter->second = std::forward<T>(value);
return iter;
}
However, I don't know if I can trust std::map not to move-assign the value twice in the case where the key already existed. Is this safe? If not, is there a safe way to efficiently implement a substitute for std::map::insert_or_assign taking a hint parameter?
As per NathanOliver's comment, where he cited the cppreference documentation for std::map::emplace:
The element may be constructed even if there already is an element
with the key in the container, in which case the newly constructed
element will be destroyed immediately.
If we assume the same applies for std::map::emplace_hint, then the value could be moved away prematurely in the solution I proposed in my question.
I've come up with this other solution (NOT TESTED), which only forwards the value once. I admit it's not pretty. :-)
// Take 'hint' as a mutating iterator to avoid an O(N) conversion.
template <class M, class K, class T>
typename M::iterator insert_or_assign(M& map, typename M::iterator hint,
K&& key, T&& value)
{
using std::forward;
#ifdef __cpp_lib_map_try_emplace
return map.insert_or_assign(hint, forward<K>(key), forward<T>(value));
#else
// Check if the given key goes between `hint` and the entry just before
// hint. If not, check if the given key matches the entry just before hint.
if (hint != map.begin())
{
auto previous = hint;
--previous; // O(1)
auto comp = map.key_comp();
if (comp(previous->first, key)) // key follows previous
{
if (hint == map.end() || comp(key, hint->first)) // key precedes hint (end() must not be dereferenced)
{
// Should be O(1)
return map.emplace_hint(hint, forward<K>(key),
forward<T>(value));
}
}
else if (!comp(key, previous->first)) // key equals previous
{
previous->second = forward<T>(value); // O(1)
return previous;
}
}
// If this is reached, then the hint has failed.
// Check if key already exists. If so, assign its associated value.
// If not, emplace the new key-value pair.
auto iter = map.find(key); // O(log(N))
if (iter != map.end())
iter->second = forward<T>(value);
else
iter = map.emplace(forward<K>(key), forward<T>(value)).first; // O(log(N)); emplace returns pair<iterator, bool>
return iter;
#endif
}
I hope somebody else will come up with a nicer solution!
Note that I check for the __cpp_lib_map_try_emplace feature test macro to test if std::map::insert_or_assign is supported before resorting to this ugly mess.
EDIT: Removed the slow iterator arithmetic silliness in attempting to check if the key already exists at hint.
EDIT 2: hint is now taken as a mutating iterator to avoid an expensive O(N) conversion if it was otherwise passed as a const_iterator. This allows me to manually check the hint and perform an O(1) insertion or assignment if the hint succeeds.
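For comparison, here is a hedged sketch of a different and simpler technique: instead of validating a caller-supplied hint, it computes its own position with lower_bound, so the value is forwarded exactly once. This drops the hint parameter and always pays for one O(log N) lookup, so it is not a drop-in replacement for the function above. The names `insertOrAssignByLowerBound` and `Label` are hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>

// Sketch: insert-or-assign without a caller-supplied hint. lower_bound()
// supplies the position, so `value` is forwarded exactly once.
template <class M, class K, class T>
typename M::iterator insertOrAssignByLowerBound(M& map, K&& key, T&& value) {
    auto pos = map.lower_bound(key);  // O(log N)
    if (pos != map.end() && !map.key_comp()(key, pos->first)) {
        pos->second = std::forward<T>(value);  // key exists: assign in place
        return pos;
    }
    // Key is absent and `pos` is the element that would follow it (or end()),
    // so it serves as the hint for the insertion.
    return map.emplace_hint(pos, std::forward<K>(key), std::forward<T>(value));
}

// Hypothetical mapped type without a default constructor, to show that
// the sketch does not need operator[].
struct Label {
    explicit Label(std::string t) : text(std::move(t)) {}
    std::string text;
};
```

Because the function only uses lower_bound, key_comp and emplace_hint, it should work with C++11 compilers that lack insert_or_assign.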

What does "result.second == false" mean in this code?

I came across this c++ code for counting frequency in a vector.
std::map<std::string, int> countMap;
// Iterate over the vector and store the frequency of each element in map
for (auto & elem : vecOfStrings)
{
auto result = countMap.insert(std::pair<std::string, int>(elem, 1));
if (result.second == false)
result.first->second++;
}
from https://thispointer.com/c-how-to-find-duplicates-in-a-vector/. I want to ask what does
result.second == false mean?
Since std::map and the other non-multi associative containers only store unique keys, there is a chance that when you insert something it won't actually be inserted, because an element with that key may already be present. insert therefore returns a std::pair<iterator, bool> where the bool will be true if the insert succeeded and false otherwise.
I would like to point out you can get rid of the if statement in the loop. Because of how operator[] of a map works the loop can be replaced with
for (const auto & elem : vecOfStrings) // also added const here since we don't need to modify elem
{
++countMap[elem];
}
Now, if elem exists, you increment its value; if it doesn't, operator[] adds elem to the map with a value-initialized count of 0, which is then incremented to 1.
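To illustrate that the two loops are equivalent, here is a small sketch that wraps each in a function. The function names `countWithInsert` and `countWithSubscript` are made up for this example.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Counts occurrences using insert() and its pair<iterator, bool> result.
inline std::map<std::string, int> countWithInsert(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const auto& word : words) {
        auto result = counts.insert(std::make_pair(word, 1));
        if (!result.second) {        // key was already present, nothing was inserted
            ++result.first->second;  // bump the existing entry instead
        }
    }
    return counts;
}

// Counts occurrences using operator[], which value-initializes a missing int to 0.
inline std::map<std::string, int> countWithSubscript(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const auto& word : words) {
        ++counts[word];
    }
    return counts;
}
```

Both functions should produce the same map for any input; the operator[] version only works because int can be value-initialized.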
std::map::insert returns a std::pair<iterator, bool>.
pair.first is an iterator to the newly inserted element OR the element that was already in the map and prevented the insertion.
pair.second tells whether or not the insertion happened.
result.second == false is detecting the case where nothing was inserted into the map due to a key collision.
Note that with C++17, this can be written to be a bit more clear:
auto [itr, inserted] = countMap.insert({elem, 1});
if (!inserted) {
itr->second++;
}
From cppreference:
Returns a pair consisting of an iterator to the inserted element (or
to the element that prevented the insertion) and a bool denoting
whether the insertion took place.
result.first gives you the iterator to the element, while result.second tells you whether the element was actually inserted or did already exist.
std::map::insert returns a pair where the second value indicates whether any insertion actually happened. If the value is false, this means no value was inserted into the map because a value with the same key already exists.
However, the code shouldn’t be written like this: comparing against boolean literals is a nonsensical operation. Instead you’d write
if (not result.second)
// or
if (! result.second)
std::map::insert returns a pair of iterator and a bool. The bool indicates whether the insertion actually took place. The code you listed seems to increment the mapped int if key collision happens on insert.

What is the purpose of using an iterator in this function's return value?

I am looking at a function for parsing through local addresses and am confused by the rationale behind choice for return value. The function is
bool p2p::isLocalHostAddress(bi::address const& _addressToCheck)
{
// #todo: ivp6 link-local adresses (macos), ex: fe80::1%lo0
static const set<bi::address> c_rejectAddresses = {
{bi::address_v4::from_string("127.0.0.1")},
{bi::address_v4::from_string("0.0.0.0")},
{bi::address_v6::from_string("::1")},
{bi::address_v6::from_string("::")}
};
return find(c_rejectAddresses.begin(), c_rejectAddresses.end(), _addressToCheck) != c_rejectAddresses.end();
}
I understand the actual code of the return value, whereby std::find goes through the set looking for _addressToCheck but what is the reasoning behind comparing it with the set's end iterator? Wouldn't the same logic in this case be implemented by listing the return value as
return find(c_rejectAddresses.begin(), c_rejectAddresses.end(), _addressToCheck) != NULL;
what is the reasoning behind comparing it with the set's end iterator?
std::find() takes two iterators as input, searching from the first iterator up to but not including the second iterator. If the item is found, an iterator to the item is returned. If the item is not found, the second iterator is returned. Since end() is being passed in as the second iterator, the return value has to be compared to end() to know if the address was found or not.
Wouldn't the same logic in this case be implemented by listing the return value as
return find(c_rejectAddresses.begin(), c_rejectAddresses.end(), _addressToCheck) != NULL;
No, it would not. That implies that std::find() returns a pointer, or at least an integer where 0 represents "not found". That is not the case with many containers. STL algorithms use iterators so they can be container-agnostic.
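A minimal sketch of the comparison against end(), using std::string in place of the question's bi::address type (an assumption made only to keep the example self-contained). The helper names are hypothetical.

```cpp
#include <algorithm>
#include <set>
#include <string>

// std::find returns the `last` iterator it was given when nothing matches,
// so "not found" is detected by comparing the result against that iterator.
inline bool isRejected(const std::set<std::string>& rejected, const std::string& addr) {
    auto it = std::find(rejected.begin(), rejected.end(), addr);
    return it != rejected.end();
}

// Same result via the container's own lookup, which uses the set's ordering
// (logarithmic) instead of a linear scan.
inline bool isRejectedMember(const std::set<std::string>& rejected, const std::string& addr) {
    return rejected.find(addr) != rejected.end();
}
```

For a std::set, the member find is usually the better choice, but both follow the same "compare with end()" convention.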
It makes the return value compatible with other algorithm functions, meaning they can be embedded into each other.
auto found = find(list.begin(), list.end(), itemToFind);
for_each(list.begin(), found, doSomething);
Potentially not the most useful example. But, it's easier to get the value from an iterator than to get an iterator from a value if need be.

How to get the last element of an std::unordered_map?

How to get the last element of an std::unordered_map?
myMap.rbegin() and --myMap.end() are not possible.
There is no "last element" in a container that is unordered.
You might want an ordered container, e.g. std::map and access the last element with mymap.rbegin()->first (Also see this post)
EDIT:
To check if your iterator is going to hit the end, simply increment it (and possibly save it in a temporary) and check it against mymap.end(), or, even cleaner : if (std::next(it) == last)
In your comments, it appears your goal is to determine if you are on the last element when iterating forward. This is a far easier problem to solve than finding the last element:
template<class Range, class Iterator>
bool is_last_element_of( Range const& r, Iterator&& it ) {
using std::end;
if (it == end(r)) return false;
if (std::next(std::forward<Iterator>(it)) == end(r)) return true;
return false;
}
the above should work on any iterable Range (including arrays, std containers, or custom containers).
We check if we are end (in which case, we aren't the last element, and advancing would be illegal).
If we aren't end, we see if std::next of us is end. If so, we are the last element.
Otherwise, we are not.
This will not work on iterators that do not support multiple passes.
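A hedged usage sketch of the check described above, applied while iterating an unordered_map. `isLastElementOf` is a simplified variant of the answer's function that takes the iterator by value, and `countLastHits` is a hypothetical helper added only to make the behaviour observable.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_map>

// Simplified variant of the check above: true only if `it` refers to the
// final element in the range's iteration order.
template <class Range, class Iterator>
bool isLastElementOf(const Range& r, Iterator it) {
    using std::end;
    if (it == end(r)) return false;  // end() is not an element
    return std::next(it) == end(r);
}

// Counts how many elements report themselves as "last" during a full pass.
inline std::size_t countLastHits(const std::unordered_map<int, std::string>& m) {
    std::size_t hits = 0;
    for (auto it = m.begin(); it != m.end(); ++it) {
        if (isLastElementOf(m, it)) ++hits;
    }
    return hits;
}
```

Which element ends up being "last" depends on the hash table's internal order, so only the count is predictable, not the key.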
You can't. By definition, the elements are not stored in any particular order. The key is hashed first, which is why O(1) average lookup is possible. If you want to check whether a key exists in the unordered_map, you can use this code:
std::unordered_map dico;
if(dico.count(key)!=0){
//code here
}
std::unordered_map::iterator last_elem;
for (std::unordered_map::iterator iter = myMap.begin(); iter != myMap.end(); iter++)
last_elem = iter;
// use last_elem, which now points to the last element in the map
This will give you the last element in whatever order the map gives them to you.
Edit: You need to use std::unordered_map<YourKeyType, YourValueType> instead of just std::unordered_map. I just wrote it like this because you did not provide the type in your question.
Alternatively, as suggested by vsoftco (thanks), you could declare both last_elem and iter as decltype(myMap)::iterator.
(If you're compiling with the MSVC++ compiler, then you will need to add typedef decltype(myMap) map_type; and then instead of decltype(myMap)::iterator use map_type::iterator.)
.end() is an iterator to the "element past the last element". That's why you compare it like this when you loop through a map:
for (auto it = myMap.begin(); it != myMap.end(); ++it) // '!=' operator here makes it possible to only work with valid elements
{
}
So you want the "last" element (whatever that may be, because it's not really guaranteed to be the last in an unordered map, since it ultimately depends on how the key was hashed and in which "bucket" it ends up in). Then you need: --myMap.end()
More specifically, .end() is a function, that returns an iterator, same as .begin() returns an iterator. Since there is no .rbegin() in an std::unordered_map, you have to use -- (the decrement operator):
auto it = --myMap.end();
To access the key you use it->first, to access the value you use it->second.
The accepted answer seems wrong. An unordered_map does have a last element, even though the key-value pairs are not stored in sorted order. Since the iterator of unordered_map is a forward iterator (LegacyForwardIterator), the cost of finding the last element is O(n). Yakk - Adam gave the correct answer. Essentially, you have to iterate over the container from begin to end. At each iteration, you check whether the next element is end(); if so, you are at the last element.
You cannot call prev(it) or --it. There will be no syntax error, but you will have a runtime error (most likely a segmentation fault) when using prev(it) or --it. Maybe a future version of the compiler will be able to tell you that you have a logic error.
It may not be the best solution, performance-wise, but in C++11 and later, I use a combination of std::next() and size() to jump all elements from the beginning of the map, as shown below:
std::unordered_map<int,std::string> mapX;
...
if (mapX.size() > 0) {
std::unordered_map<int,std::string>::iterator itLast =
std::next(mapX.begin(), mapX.size() - 1);
...

In std::multiset is there a function or algorithm to erase just one sample (unicate or duplicate) if an element is found

Perhaps this is a duplicate but I did not find anything searching:
When erase(value) is called on std::multiset all elements with the value found are deleted. The only solution I could think of is:
std::multiset<int>::iterator hit(mySet.find(5));
if (hit!= mySet.end()) mySet.erase(hit);
This is ok but I thought there might be better. Any Ideas ?
auto itr = my_multiset.find(value);
if(itr!=my_multiset.end()){
my_multiset.erase(itr);
}
I would imagine there is a cleaner way of accomplishing the same. But this gets the job done.
Try this one:
multiset<int> s;
s.erase(s.lower_bound(value));
As long as you can ensure that the value exists in the set. That works.
if(my_multiset.find(key)!=my_multiset.end())
my_multiset.erase(my_multiset.equal_range(key).first);
This is the best way I can think of to remove a single instance from a multiset in C++.
This worked for me:
multi_set.erase(multi_set.find(val));
if val exists in the multi-set.
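The find-then-erase answers above can be folded into one helper that also covers the case where the value is absent. This is only a sketch; the name `eraseOne` is hypothetical.

```cpp
#include <set>

// Erases at most one element equal to `value`; returns true if one was removed.
inline bool eraseOne(std::multiset<int>& values, int value) {
    auto it = values.find(value);
    if (it == values.end()) {
        return false;  // nothing to erase; erase(end()) would be undefined behaviour
    }
    values.erase(it);  // the iterator overload removes only this one element
    return true;
}
```

The key point is that erase(iterator) removes a single element, while erase(value) removes every element equal to value.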
I would try the following.
First call equal_range() to find the range of elements that equal to the key.
If the returned range is non-empty, then erase() a range of elements (i.e. the erase() which takes two iterators) where:
the first argument is the iterator to the 2nd element in the returned
range (i.e. one past .first returned) and
the second argument as the returned range pair iterator's .second one.
Edit after reading templatetypedef's (Thanks!) comment:
If one (as opposed to all) duplicate is supposed to be removed: If the range returned by equal_range() has at least two elements, then erase() the first element by passing the .first of the returned pair to the single-iterator version of erase():
Pseudo-code:
pair<iterator, iterator> pit = mymultiset.equal_range( key );
if( distance( pit.first, pit.second ) >= 2 ) {
mymultiset.erase( pit.first );
}
We can do something like this:
multiset<int>::iterator it, it1;
it = myset.find(value);
it1 = it;
it1++;
myset.erase (it, it1);
Here is a more elegant solution using "if statement with initializer" introduced in C++17:
if(auto it = mySet.find(value); it != mySet.end())
mySet.erase(it); // erase by iterator: erase(value) would remove every matching element
The advantage of this syntax is that the scope of the iterator it is reduced to this if statement.
Since C++17 (see here):
mySet.extract(val);
auto itr = ms.find(value);
while (itr != ms.end()) { // dereferencing end() would be undefined behaviour
    ms.erase(value);
    itr = ms.find(value);
}
Try this one. It will remove all the duplicates available in the multiset.
In fact, the correct answer is:
my_multiset.erase(my_multiset.find(value));