In STL maps, is it better to use map::insert than []? - c++

A while ago, I had a discussion with a colleague about how to insert values in STL maps. I preferred map[key] = value; because it feels natural and is clear to read whereas he preferred map.insert(std::make_pair(key, value)).
I just asked him and neither of us can remember the reason why insert is better, but I am sure it was not just a style preference; rather, there was a technical reason such as efficiency. The SGI STL reference simply says: "Strictly speaking, this member function is unnecessary: it exists only for convenience."
Can anybody tell me that reason, or am I just dreaming that there is one?

When you write
map[key] = value;
there's no way to tell if you replaced the value for key, or if you created a new key with value.
map::insert() will only create:
using std::cout; using std::endl;
typedef std::map<int, std::string> MyMap;
MyMap map;
// ...
std::pair<MyMap::iterator, bool> res = map.insert(MyMap::value_type(key,value));
if ( ! res.second ) {
cout << "key " << key << " already exists "
<< " with value " << (res.first)->second << endl;
} else {
cout << "created key " << key << " with value " << value << endl;
}
For most of my apps, I usually don't care if I'm creating or replacing, so I use the easier to read map[key] = value.

The two have different semantics when it comes to the key already existing in the map. So they aren't really directly comparable.
But the operator[] version requires default constructing the value, and then assigning, so if this is more expensive than copy construction, then it will be more expensive. Sometimes default construction doesn't make sense, and then it would be impossible to use the operator[] version.
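To make the last point concrete, here is a minimal sketch with a made-up value type (Celsius and add_reading are names I invented for illustration) that has no default constructor. The operator[] line is commented out because it would not compile, while insert works fine:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical value type with no default constructor.
struct Celsius {
    explicit Celsius(double d) : degrees(d) {}
    double degrees;
};

// Adds a reading only if the city is not present yet; returns true on insertion.
bool add_reading(std::map<std::string, Celsius>& readings,
                 const std::string& city, double degrees) {
    // readings[city] = Celsius(degrees);  // would not compile: Celsius() does not exist
    return readings.insert(std::make_pair(city, Celsius(degrees))).second;
}
```

Reading the value back with readings.at(city) or find() also works, since neither needs a default constructor.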

Another thing to note with std::map:
myMap[nonExistingKey]; will create a new entry in the map, keyed to nonExistingKey initialized to a default value.
This scared the hell out of me the first time I saw it (while banging my head against a nasty legacy bug). Wouldn't have expected it. To me, that looks like a get operation, and I didn't expect the "side-effect." Prefer map.find() when getting from your map.
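A small sketch of the find()-based read, assuming a helper I made up called get_or. Because it takes the map by const reference, it cannot insert anything by accident:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns the mapped value, or `fallback` when the key
// is absent. The map is const here, so no entry can be created.
std::string get_or(const std::map<int, std::string>& m, int key,
                   const std::string& fallback) {
    std::map<int, std::string>::const_iterator it = m.find(key);
    return it == m.end() ? fallback : it->second;
}
```

Calling get_or(m, 2, "?") on a map without key 2 leaves m.size() unchanged, whereas a bare m[2]; would grow the map by one.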

If the performance hit of the default constructor isn't an issue, then please, for the love of god, go with the more readable version.
:)

insert is better from the point of exception safety.
The expression map[key] = value is actually two operations:
map[key] - creating a map element with default value.
= value - copying the value into that element.
An exception may happen at the second step. As a result, the operation will be only partially done (a new element was added to the map, but that element was not initialized with value). An operation that does not complete but still modifies the system state is said to have a "side effect".
The insert operation gives a strong guarantee, which means it doesn't have side effects (https://en.wikipedia.org/wiki/Exception_safety): insert is either completely done or it leaves the map in an unmodified state.
http://www.cplusplus.com/reference/map/map/insert/:
If a single element is to be inserted, there are no changes in the container in case of exception (strong guarantee).
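Here is a rough sketch of the difference. Fragile is a hypothetical type I made up whose copy operations throw when the source is flagged as bad; the two functions just report the map size after the failed operation:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <utility>

// Hypothetical value type whose copy operations throw when the source
// object is flagged as "bad".
struct Fragile {
    int value;
    bool bad;
    Fragile() : value(0), bad(false) {}
    Fragile(int v, bool b) : value(v), bad(b) {}
    Fragile(const Fragile& o) : value(o.value), bad(o.bad) {
        if (o.bad) throw std::runtime_error("copy failed");
    }
    Fragile& operator=(const Fragile& o) {
        if (o.bad) throw std::runtime_error("assignment failed");
        value = o.value;
        bad = o.bad;
        return *this;
    }
};

// Returns the map size after a failed `m[key] = bad`.
std::size_t size_after_failed_subscript() {
    std::map<int, Fragile> m;
    Fragile bad(42, true);
    try {
        m[1] = bad;  // m[1] creates a default element, then the assignment throws
    } catch (const std::runtime_error&) {
    }
    return m.size();  // the default-constructed element is left behind
}

// Returns the map size after a failed insert of the same value.
std::size_t size_after_failed_insert() {
    std::map<int, Fragile> m;
    Fragile bad(42, true);
    try {
        m.insert(std::make_pair(1, bad));  // copying `bad` throws
    } catch (const std::runtime_error&) {
    }
    return m.size();  // the map is unchanged
}
```

In this sketch the copy already throws while the pair is being built, so insert never gets to touch the map; the observable result is the same as the strong guarantee describes.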

If your application is speed-critical, I would advise using the [] operator, because in my test it created a total of 3 copies of the original object, 2 of which were temporary objects that were destroyed sooner or later.
With insert(), however, 4 copies of the original object were created, 3 of which were temporary objects (not necessarily "temporaries") that were destroyed.
Which means extra time for:
1. One objects memory allocation
2. One extra constructor call
3. One extra destructor call
4. One objects memory deallocation
If your objects are large, with typical constructors and destructors that do a lot of resource freeing, the points above count even more. Regarding readability, I think both are fair enough.
The same question came to my mind, though over speed rather than readability.
Here is the sample code through which I came to know about the point I mentioned.
#include <iostream>
#include <map>
#include <utility>
class Sample
{
static int _noOfObjects;
int _objectNo;
public:
Sample() :
_objectNo( _noOfObjects++ )
{
std::cout<<"Inside default constructor of object "<<_objectNo<<std::endl;
}
Sample( const Sample& sample) :
_objectNo( _noOfObjects++ )
{
std::cout<<"Inside copy constructor of object "<<_objectNo<<std::endl;
}
~Sample()
{
std::cout<<"Destroying object "<<_objectNo<<std::endl;
}
};
int Sample::_noOfObjects = 0;
int main(int argc, char* argv[])
{
Sample sample;
std::map<int,Sample> map;
map.insert( std::make_pair( 1, sample ) ); // let make_pair deduce its types; explicit <int,Sample> does not compile with an lvalue since C++11
//map[1] = sample;
return 0;
}

Now in C++11 I think that the best way to insert a pair in an STL map is:
typedef std::map<int, std::string> MyMap;
MyMap map;
auto result = map.emplace(3,"Hello");
The result will be a pair with:
First element (result.first): an iterator that points to the pair inserted, or to
the pair with this key if the key already exists.
Second element (result.second): true if the insertion took place, or
false if the key already existed.
PS: If you don't care about the order you can use std::unordered_map ;)
Thanks!
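For what it's worth, a short sketch of how the pair returned by emplace() can be used. register_name is a helper name I invented for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>

// Tries to register `name` under `id`; returns the name stored under `id`
// afterwards, which is the old one if the key was already taken.
std::string register_name(std::map<int, std::string>& names, int id,
                          const std::string& name) {
    // emplace returns std::pair<iterator, bool>, like single-element insert.
    auto result = names.emplace(id, name);
    // result.second is false when the key already existed; nothing is replaced.
    return result.first->second;
}
```

Registering "Hello" and then "World" under the same id keeps "Hello", because emplace() never overwrites an existing element.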

A gotcha with map::insert() is that it won't replace a value if the key already exists in the map. I've seen C++ code written by Java programmers where they have expected insert() to behave the same way as Map.put() in Java where values are replaced.

One note is that you can also use Boost.Assign:
using namespace std;
using namespace boost::assign; // bring 'map_list_of()' into scope
void something()
{
map<int,int> my_map = map_list_of(1,2)(2,3)(3,4)(4,5)(5,6);
}

Here's another example, showing that operator[] overwrites the value for the key if it exists, but .insert does not overwrite the value if it exists.
void mapTest()
{
map<int,float> m;
for( int i = 0 ; i <= 2 ; i++ )
{
pair<map<int,float>::iterator,bool> result = m.insert( make_pair( 5, (float)i ) ) ;
if( result.second )
printf( "%d=>value %f successfully inserted as brand new value\n", result.first->first, result.first->second ) ;
else
printf( "! The map already contained %d=>value %f, nothing changed\n", result.first->first, result.first->second ) ;
}
puts( "All map values:" ) ;
for( map<int,float>::iterator iter = m.begin() ; iter !=m.end() ; ++iter )
printf( "%d=>%f\n", iter->first, iter->second ) ;
/// now watch this..
m[5]=900.f ; //using operator[] OVERWRITES map values
puts( "All map values:" ) ;
for( map<int,float>::iterator iter = m.begin() ; iter !=m.end() ; ++iter )
printf( "%d=>%f\n", iter->first, iter->second ) ;
}

This is a rather restricted case, but judging from the comments I've received I think it's worth noting.
I've seen people in the past use maps in the form of
map< const key, const val> Map;
to evade cases of accidental value overwriting, but then go ahead writing in some other bits of code:
const_cast< T >Map[]=val;
Their reason for doing this as I recall was because they were sure that in these certain bits of code they were not going to be overwriting map values; hence, going ahead with the more 'readable' method [].
I've never actually had any direct trouble from the code that was written by these people, but I strongly feel up until today that risks - however small - should not be taken when they can be easily avoided.
In cases where you're dealing with map values that absolutely must not be overwritten, use insert. Don't make exceptions merely for readability.

The fact that std::map insert() function doesn't overwrite value associated with the key allows us to write object enumeration code like this:
string word;
map<string, size_t> dict;
while(getline(cin, word)) {
dict.insert(make_pair(word, dict.size()));
}
It's a pretty common problem when we need to map different non-unique objects to some id's in range 0..N. Those id's can be later used, for example, in graph algorithms. Alternative with operator[] would look less readable in my opinion:
string word;
map<string, size_t> dict;
while(getline(cin, word)) {
size_t sz = dict.size();
if (!dict.count(word))
dict[word] = sz;
}

The difference between insert() and operator[] has already been well explained in the other answers. However, new insertion methods for std::map were introduced with C++11 and C++17 respectively:
C++11 offers emplace() as also mentioned in einpoklum's comment and GutiMac's answer.
C++17 offers insert_or_assign() and try_emplace().
Let me give a brief summary of the "new" insertion methods:
emplace(): When used correctly, this method can avoid unnecessary copy or move operations by constructing the element to be inserted in place. Similar to insert(), an element is only inserted if there is no element with the same key in the container.
insert_or_assign(): This method is an "improved" version of operator[]. Unlike operator[], insert_or_assign() doesn't require the map's value type to be default constructible. This overcomes the disadvantage mentioned e.g. in Greg Rogers' answer.
try_emplace(): This method is an "improved" version of emplace(). Unlike emplace(), try_emplace() doesn't modify its arguments (due to move operations) if insertion fails due to a key already existing in the map.
For more details on insert_or_assign() and try_emplace() please see my answer here.
Simple example code on Coliru
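As a minimal sketch of the two C++17 methods (the helper names upsert and add_if_absent are made up for this example, and a C++17 compiler is assumed):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// insert_or_assign: inserts or overwrites, and reports which one happened.
bool upsert(std::map<int, std::string>& m, int key, const std::string& value) {
    return m.insert_or_assign(key, value).second;  // false: existing value overwritten
}

// try_emplace: if the key already exists, the argument is not moved from,
// so the caller still owns the pointer after a failed insertion.
bool add_if_absent(std::map<int, std::unique_ptr<int>>& m, int key,
                   std::unique_ptr<int>& ptr) {
    return m.try_emplace(key, std::move(ptr)).second;
}
```

The second helper is where try_emplace() differs from emplace(): after a failed add_if_absent, ptr is still non-null and can be reused.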

Related

Why isn't vector::operator[] implemented similar to map::operator[]?

Is there any reason for std::vector's operator[] to just return a reference instead of inserting a new element? The cppreference.com page for vector::operator[] says here
Unlike std::map::operator[], this operator never inserts a new element into the container.
While the page for map::operator[] says
"Returns a reference to the value that is mapped to a key equivalent to key, performing an insertion if such key does not already exist."
Why couldn't vector::operator[] be implemented by calling vector::push_back or vector::insert like how map::operator[] calls insert(std::make_pair(key, T())).first->second;?
Quite simply: Because it doesn't make sense. What do you expect
std::vector<int> a = {1, 2, 3};
a[10] = 4;
to do? Create a fourth element even though you specified index 10? Create elements 3 through to 10 and return a reference to the last one? Neither would be particularly intuitive.
If you really want to fill a vector with values using operator[] instead of push_back, you can call resize on the vector to create the elements before setting them.
Edit: Or, if you actually want to have an associative container, where the index is important apart from ordering, std::map<int, YourData> might actually make more sense.
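If you do go the resize route, it could look roughly like this. set_at is a hypothetical helper, not something the standard library provides:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: grows the vector if needed, then assigns by index.
// Elements created by resize() are value-initialized (0 for int).
void set_at(std::vector<int>& v, std::size_t index, int value) {
    if (index >= v.size())
        v.resize(index + 1);
    v[index] = value;
}
```

Note that this makes the "create all the elements in between" choice explicit, which is exactly the decision operator[] refuses to make for you.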
A map and a vector are completely different concepts. A map is an "associative container" whereas a vector is a "sequence container". Delineating the differences is out of the scope of this answer, though at the most superficial of levels, a map is generally implemented as a red-black tree, while a vector is a convoluted wrapper over a C-style array (elements stored contiguously in memory).
If you want to check if an element already exists, you would need to resize the entire container. But what happens if you decide to remove the element? What do you do with the entries you just created? With a map:
std::map<int, int> m;
m[1] = 1;
m.erase(m.begin());
This is a constant operation.
With a vector:
std::vector<int> v;
// ... initialize some values between 25 and 100
v[100] = 1;
v.erase(v.begin() + 25, v.end());
This is a linear operation. That's horribly inefficient compared to a map. While this is a contrived example, it's not hard to imagine how this could blow up in other scenarios. At a minimum, most people would go out of their way to avoid operator[], which has a cost in and of itself (maintenance and code complexity).
Is there any reason for std::vector's operator[] to just return a reference instead of inserting a new element?
std::vector::operator[] is implemented in an array-like fashion because std::vector is a sequence container (i.e., array-like). Standard arrays for integral types cannot be accessed out of bounds. Similarly, accessing std::vector::operator[] with an index outside of the vector's length is not allowed either. So, yes, the reason it is not implemented as you ask is that in no other context do arrays in C++ act like that.
std::map is not a sequence container. Its operator[] syntax makes it similar to associative arrays in other languages. In terms of C++ (and its predecessor, C), map::operator[] is just syntactic sugar. It is the "black sheep" of the operator[] family, not std::vector::operator[].
The interesting part of the C++ specification regarding std::map is that accessing a map with a key that doesn't exist, using std::map::operator[], adds an element to the map. Thus,
#include <iostream>
#include <map>
int main(void) {
std::map<char, int> m;
m['a'] = 1;
std::cout << "m['a'] == " << m['a'] << ", m.size() == " << m.size() << std::endl;
std::cout << "m['b'] == " << m['b'] << ", m.size() == " << m.size() << std::endl;
}
results in:
m['a'] == 1, m.size() == 1
m['b'] == 0, m.size() == 2
See also: Difference between map[] and map.at in C++? :
[map::at] throws an exception if the key doesn't exist, find returns aMap.end() if the element doesn't exist, and operator[] value-initializes a new value for the corresponding key if no value exists there.
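To put the three access styles side by side, here is a tiny sketch. at_throws is a helper name I made up; it only reports whether at() threw:

```cpp
#include <cassert>
#include <map>
#include <stdexcept>

// Returns true if m.at(key) throws std::out_of_range, i.e. the key is absent.
bool at_throws(const std::map<char, int>& m, char key) {
    try {
        (void)m.at(key);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With a map that only holds 'a', at_throws(m, 'b') is true and m.find('b') returns m.end(), both without changing the map; only m['b'] adds a new element.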

Why C++ map.insert() doesn't overwrite

In the code below:
#include <map>
#include <utility>
#include <iostream>
using namespace std;
int main(){
pair<int,int> p1(1,1);
pair<int,int> p2(1,2);
map<int,int> m;
m.insert(p1);
m.insert(p2);
cout << "Map value: "<< m.at(1) << endl;
}
It printed out: Map value: 1. Why doesn't m.insert(p2) overwrite the previous entry in the map?
map.insert() only inserts if the container doesn't already contain an element with an equivalent key.
You should use operator[] instead:
m[p2.first] = p2.second;
In the std::map::insert reference it is said that:
Inserts element(s) into the container, if the container doesn't already contain an element with an equivalent key.
Update as of C++17: There is now the std::map::insert_or_assign() member function, which takes the key and the value as separate arguments:
m.insert_or_assign(p1.first, p1.second);
As the name suggests, if the key is already present then the value is assigned (and the key object kept) rather than erasing and freshly copy constructing the key and value. (So it's equivalent to the first of the two pre-C++17 snippets below.)
If you want an iterator pointing at the (new or updated) element, you again need to pick the value out of the returned pair. Since you're using C++17, you can now use a structured binding:
auto [it, wasInserted] = m.insert_or_assign(p1.first, p1.second);
Before C++17: Putting together the other answers, if you want to avoid the assumption of being default constructible you get insert-with-overwrite code that looks like this:
auto itAndWasInserted = m.insert(p1);
if (!itAndWasInserted.second) {
itAndWasInserted.first->second = p1.second; // the key is const, so assign only the mapped value
}
In the above snippet, if the element is already present then the new value is assigned to it. That's usually what you want. If you instead want to construct rather than assign the new value, but still want to avoid a second seek (after you've erased the original value), you end up with this monster:
auto itAndWasInserted = m.insert(p1);
auto it = itAndWasInserted.first;
if (!itAndWasInserted.second) {
auto afterIt = m.erase(it);
it = m.insert(afterIt, p1); // Hint form of insert; it returns an iterator, not a pair
}
At the end of the code block, it is an iterator pointing at the just-inserted element.
Realistically, in most cases you probably just want to use yizzlez's suggestion of operator[], but I thought it would be good to note the theoretically best answer.
It doesn't overwrite. However if you check the return value, there is a std::pair<iterator, bool>. If bool is true, then it was inserted. If the bool is false, then it was not inserted because of a collision. At that point, you can then overwrite the data yourself by writing to the iterator.
This is supposed to happen. map.insert() will only insert an element into the container if it doesn't already contain an element with an equivalent key, so the later value is ignored.

Simultaneously iterating over and modifying an unordered_set?

Consider the following code:
unordered_set<T> S = ...;
for (const auto& x : S)
if (...)
S.insert(...);
This is broken, correct? If we insert something into S then the iterators may be invalidated (due to a rehash), which will break the range-for because under the hood it is using S.begin ... S.end.
Is there some pattern to deal with this?
One way is:
unordered_set<T> S = ...;
vector<T> S2;
for (const auto& x : S)
if (...)
S2.emplace_back(...);
for (auto& x : S2)
S.insert(move(x));
This seems clunky. Is there a better way I'm missing?
(Specifically if I was using a hand-rolled hash table and I could block it from rehashing until the end of the loop, it would be safe to use the first version.)
Update:
From http://en.cppreference.com/w/cpp/container/unordered_map/insert
If rehashing occurs due to the insertion, all iterators are invalidated. Otherwise iterators are not affected. References are not invalidated. Rehashing occurs only if the new number of elements is higher than max_load_factor() * bucket_count().
Could you mess with max_load_factor somehow to prevent rehashing?
Yes, you can set the max_load_factor() to infinity to ensure no rehashing occurs:
#include <iostream>
#include <limits>
#include <unordered_set>
int main()
{
// initialize
std::unordered_set<int> S;
for (int i = 0; i < 8; ++i)
S.insert(i);
std::cout << "buckets: " << S.bucket_count() << std::endl;
// infinite max load factor => never need to rehash
const auto oldLoadFactor = S.max_load_factor();
S.max_load_factor(std::numeric_limits<float>::infinity());
for (const auto& x : S)
{
if (x > 2)
S.insert(x * 2);
}
// restore load factor, verify same bucket count
S.max_load_factor(oldLoadFactor);
std::cout << "buckets: " << S.bucket_count() << std::endl;
// now force rehash
S.rehash(0);
std::cout << "buckets: " << S.bucket_count() << std::endl;
}
Note that simply setting a new load factor does no rehashing, so those are cheap operations.
The rehash(0) bit works because it's a request that: 1) I get at least n buckets, and 2) I have enough buckets to satisfy my max_load_factor(). We just use zero to indicate we don't care for a minimum amount, we just want to rehash to satisfy our "new" factor, as if it was never changed to infinity.
Of course, this isn't exception-safe; if anything throws between the calls to max_load_factor(), our old factor is lost forever. Easily fixed with your favorite scope-guard utility or a utility class.
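A scope guard for this could be sketched roughly as follows. NoRehashGuard and add_shifted_copies are names I made up; the guard just applies the infinite-load-factor trick from above in its constructor and undoes it in its destructor:

```cpp
#include <cassert>
#include <limits>
#include <unordered_set>

// Hypothetical RAII helper: sets an infinite max load factor on construction
// and restores the previous one on destruction, even if an exception escapes.
template <class Container>
class NoRehashGuard {
public:
    explicit NoRehashGuard(Container& c)
        : container_(c), saved_(c.max_load_factor()) {
        container_.max_load_factor(std::numeric_limits<float>::infinity());
    }
    ~NoRehashGuard() { container_.max_load_factor(saved_); }
    NoRehashGuard(const NoRehashGuard&) = delete;
    NoRehashGuard& operator=(const NoRehashGuard&) = delete;

private:
    Container& container_;
    float saved_;
};

// Inserts x + 1000 for every element x < 1000 while iterating. Newly inserted
// elements may or may not be visited, but they are >= 1000 and so are skipped.
void add_shifted_copies(std::unordered_set<int>& s) {
    {
        NoRehashGuard<std::unordered_set<int>> guard(s);
        for (const auto& x : s)
            if (x < 1000)
                s.insert(x + 1000);
    }             // the old max load factor is restored here
    s.rehash(0);  // let the table grow to satisfy the restored factor
}
```

The loop condition is chosen so that it does not matter whether the new elements get visited, which sidesteps the termination problem.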
Note that you get no guarantees if you'll iterate over the new elements. You will iterate over the existing elements, but you may or may not iterate over the new elements. If that is okay (which per our chat it should be), then this will work.
For example, consider iterating over an unordered set of integers and, for each even integer x, inserting x * 2. If those always get inserted just after your current position (by chance of implementation detail and container state), you will never terminate the loop except through exceptions.
If you do need some guarantees, you need to go with an alternate storage solution.
Modifying any container while you're iterating over it tends to get hairy - even if it's a simpler structure than a hash, or even if you can prevent it from re-hashing, re-balancing or whatever.
Even if it did work, by the way, there's an ambiguity: should your newly-inserted members be iterated over or not? Is it ok to include them in this iteration only sometimes (ie, only if they happen to end up after the current iterator)?
If you need to do this a lot, you could usefully wrap the container in a generic adapter that defers all the inserts until the end, but you're really finding a way to hide the code you already have.
I realized that it is conceptually the same as what you proposed but I think it looks actually reasonably slick:
std::vector<T> tmp;
std::copy_if(S.begin(), S.end(), std::back_inserter(tmp),
[](T const& value) { return ...; });
S.insert(std::make_move_iterator(tmp.begin()),
std::make_move_iterator(tmp.end()));

stl map operator[] bad?

My code reviewers have pointed out that the use of map's operator[] is very bad and leads to errors:
map[i] = new someClass; // potential dangling pointer when executed twice
Or
if (map[i]==NULL) ... // implicitly create the entry i in the map
Although I understand the risk after reading the API, and that insert() is better off since it checks for duplicates and thus can prevent the dangling pointer, I don't understand why, if handled properly, [] cannot be used at all.
I pick map as my internal container exactly because I want to use its quick and self-explaining indexing capability.
I hope someone can either argue more with me or stand on my side:)
The only time (that I can think of) where operator[] can be useful is when you want to set the value of a key (overwrite it if it already has a value), and you know that it is safe to overwrite (which it should be since you should be using smart pointers, not raw pointers) and is cheap to default construct, and in some contexts the value should have no-throw construction and assignment.
e.g. (similar to your first example)
std::map<int, std::unique_ptr<int>> m;
m[3] = std::unique_ptr<int>(new int(5));
m[3] = std::unique_ptr<int>(new int(3)); // No, it should be 3.
Otherwise there are a few ways to do it depending on context, however I would recommend to always use the general solution (that way you can't get it wrong).
Find a value and create it if it doesn't exist:
1. General Solution (recommended as it always works)
std::map<int, std::unique_ptr<int>> m;
auto it = m.lower_bound(3);
if(it == std::end(m) || m.key_comp()(3, it->first))
it = m.insert(it, std::make_pair(3, std::unique_ptr<int>(new int(3))));
2. With cheap default construction of value
std::map<int, std::unique_ptr<int>> m;
auto& obj = m[3]; // value is default constructed if it doesn't exist.
if(!obj)
{
try
{
obj = std::unique_ptr<int>(new int(3)); // default constructed value is overwritten.
}
catch(...)
{
m.erase(3);
throw;
}
}
3. With cheap default construction and no-throw insertion of value
std::map<int, my_object> m;
auto& obj = m[3]; // value is default constructed if it doesn't exist.
if(!obj)
obj = my_object(3);
Note: You could easily wrap the general solution into a helper method:
template<typename T, typename F>
typename T::iterator find_or_create(T& m, const typename T::key_type& key, const F& factory)
{
auto it = m.lower_bound(key);
if(it == std::end(m) || m.key_comp()(key, it->first))
it = m.insert(it, std::make_pair(key, factory()));
return it;
}
int main()
{
std::map<int, std::unique_ptr<int>> m;
auto it = find_or_create(m, 3, []
{
return std::unique_ptr<int>(new int(3));
});
return 0;
}
Note that I pass a templated factory method instead of a value for the create case, this way there is no overhead when the value was found and does not need to be created. Since the lambda is passed as a template argument the compiler can choose to inline it.
You are right that map::operator[] has to be used with care, but it can be quite useful: if you want to find an element in the map, and if not there create it:
someClass *&obj = map[x];
if (!obj)
obj = new someClass;
obj->doThings();
And there is just one lookup in the map.
If the new fails, you may want to remove the NULL pointer from the map, of course:
someClass *&obj = map[x];
if (!obj)
try
{
obj = new someClass;
}
catch (...)
{
map.erase(x); // erase from the map; obj is only a reference to the stored pointer
throw;
}
obj->doThings();
Naturally, if you want to find something, but not to insert it:
std::map<int, someClass*>::iterator it = map.find(x); //or ::const_iterator
if (it != map.end())
{
someClass *obj = it->second;
obj->doThings();
}
Claims like "use of operator[] of the map is very bad" should always be a warning sign of almost religious belief. But as with most such claims, there is a bit of truth lurking somewhere. The truth here however is as with almost any other construct in the C++ standard library: be careful and know what you are doing. You can (accidentally) misuse almost everything.
One common problem is potential memory leaks (assuming your map owns the objects):
std::map<int,T*> m;
m[3] = new T;
...
m[3] = new T;
This will obviously leak memory, as it overwrites the pointer. Using insert here correctly isn't easy either, and many people make a mistake that will leak anyways, like:
std::map<int,T*> m;
m.insert(std::make_pair(3,new T));
...
m.insert(std::make_pair(3,new T));
While this will not overwrite the old pointer, it will not insert the new and also leak it. The correct way with insert would be (possibly better enhanced with smart pointers):
std::map<int,T*> m;
m.insert(std::make_pair(3,new T));
....
T* tmp = new T;
if( !m.insert(std::make_pair(3,tmp)).second ) // insert returns a pair; .second is false if the key existed
{
delete tmp;
}
But this is somewhat ugly too. I personally prefer for such simple cases:
std::map<int,T*> m;
T*& tp = m[3];
if( !tp )
{
tp = new T;
}
But this is maybe the same amount of personal preference as your code reviewers have for not allowing op[] usage...
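For comparison, here is a sketch of the smart pointer variant, assuming C++11 and a made-up Widget type. With std::unique_ptr as the mapped type, neither the op[] version nor an overwrite can leak:

```cpp
#include <cassert>
#include <map>
#include <memory>

struct Widget {  // hypothetical owned type
    explicit Widget(int i) : id(i) {}
    int id;
};

typedef std::map<int, std::unique_ptr<Widget>> WidgetMap;

// Returns the widget stored under `key`, creating it only if it is missing.
Widget& get_or_create(WidgetMap& m, int key) {
    std::unique_ptr<Widget>& slot = m[key];  // null unique_ptr if key was absent
    if (!slot)
        slot.reset(new Widget(key));         // if new throws, a null entry remains
    return *slot;
}

// Replaces whatever is stored under `key`; the old Widget is deleted, not leaked.
void replace(WidgetMap& m, int key, int new_id) {
    m[key] = std::unique_ptr<Widget>(new Widget(new_id));
}
```

Keep in mind that replace() destroys the old object, so references obtained earlier from get_or_create() for that key become invalid.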
operator [] is avoided for insertion for the same reason
you mentioned in your question: it doesn't check for a duplicate key
and overwrites the existing value.
operator [] is mostly avoided for searching in std::map,
because if a key doesn't exist in your map, operator []
silently creates a new key and initializes its value (typically to
0), which may not be desirable in all cases. One should use
[] only if there is a need to create the key when it doesn't exist.
This is not a problem with [] at all. It's a problem with storing raw pointers in containers.
If your map is like for example this :
std::map< int, int* >
then you lose, because next code snippet would leak memory :
std::map< int, int* > m;
m[3] = new int( 5 );
m[3] = new int( 2 );
if handled properly, why [] can not be used at all?
If you properly tested your code, then your code should still fail the code review, because you used raw pointers.
Other than that, if used properly, there is nothing wrong with using map::operator[]. However, you would probably be better off using the insert/find methods, because of possible silent map modification.
map[i] = new someClass; // potential dangling pointer when executed twice
Here, the problem isn't map's operator[], but the lack of smart pointers.
Your pointer should be stored in some RAII object (such as a smart pointer), which immediately takes ownership of the allocated object and ensures it will get freed.
If your code reviewers ignore this, and instead say that you should avoid operator[], buy them a good C++ textbook.
if (map[i]==NULL) ... // implicitly create the entry i in the map
That's true. But that's because operator[] is designed to behave differently. Obviously, you shouldn't use it in situations where it does the wrong thing.
Generally the problem is that operator[] implicitly creates a value associated with the passed-in key and inserts a new pair in the map if the key does not occur already. This can break your logic from then on, e.g. when you search whether a certain key exists.
map<int, int> m;
if (m[4] != 0) {
cout << "The object exists" << endl; //furthermore this is not even correct 0 is totally valid value
} else {
cout << "The object does not exist" << endl;
}
if (m.find(4) != m.end()) {
cout << "The object exists" << endl; // We always happen to be in this case because m[4] creates the element
}
I recommend using the operator[] only when you know you will be referencing a key already existing in the map(this by the way proves to be not so infrequent case).
There's nothing wrong with operator[] of map, per se, as long as its
semantics correspond to what you want. The problem is defining what you
want (and knowing the exact semantics of operator[]). There are times
when implicitly creating a new entry with a default value when the entry
isn't present is exactly what you want (e.g. counting words in a text
document, where ++ countMap[word] is all that you need); there are
many other times that it's not.
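The word counting case can be sketched like this (count_words is just an illustrative name, and splitting on whitespace is an assumption of the sketch):

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Counts whitespace-separated words. The first ++counts[word] for a new word
// inserts it with a value-initialized count of 0 and then increments it to 1.
std::map<std::string, int> count_words(const std::string& text) {
    std::map<std::string, int> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word)
        ++counts[word];
    return counts;
}
```

Here the implicit insertion is the whole point: no separate "is it there yet?" branch is needed.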
A more serious problem in your code may be that you are storing pointers
in the map. A more natural solution might be to use a map <keyType,
someClass>, rather than a map <keyType, SomeClass*>. But again, this
depends on the desired semantics; for example, I use a lot of map
which are initialized once, at program start up, with pointers to static
instances. If your map[i] = ... is in an initialization loop,
executed once at start up, there's probably no issue. If it's something
executed in many different places in the code, there probably is an
issue.
The solution to the problem isn't to ban operator[] (or maps to
pointers). The solution is to start by specifying the exact semantics
you need. And if std::map doesn't provide them directly (it rarely
does), write a small wrapper class which defines the exact semantics you
want, using std::map to implement them. Thus, your wrapper for
operator[] might be:
MappedType MyMap::operator[]( KeyType const& key ) const
{
MyMap::Impl::const_iterator elem = myImpl.find( key );
if ( elem == myImpl.end() )
throw EntryNotFoundError();
return elem->second;
}
or:
MappedType const* MyMap::operator[]( KeyType const& key ) const
{
MyMap::Impl::const_iterator elem = myImpl.find( key );
return elem == myImpl.end()
? NULL // or the address of some default value
: &elem->second;
}
Similarly, you might want to use insert rather than operator[] if
you really want to insert a value that isn't already present.
And I've almost never seen a case where you'd insert an immediately
newed object into a map. The usual reason for using new and
delete is that the objects in question have some specific lifetime of
their own (and are not copiable—although not an absolute rule, if
you're newing an object which supports copy and assignment, you're
probably doing something wrong). When the mapped type is a pointer,
then either the pointed to objects are static (and the map is more or
less constant after initialization), or the insertion and removal is
done in the constructor and destructor of the class. (But this is just
a general rule; there are certainly exceptions.)

c++ map find() to possibly insert(): how to optimize operations?

I'm using the STL map data structure, and at the moment my code first invokes find(): if the key was not previously in the map, it calls insert(); otherwise it does nothing.
map<Foo*, string>::iterator it;
it = my_map.find(foo_obj); // 1st lookup
if(it == my_map.end()){
my_map[foo_obj] = "some value"; // 2nd lookup
}else{
// ok do nothing.
}
I was wondering if there is a better way than this, because as far as I can tell, in this case when I want to insert a key that is not present yet, I perform 2 lookups in the map data structures: one for find(), one in the insert() (which corresponds to the operator[] ).
Thanks in advance for any suggestion.
Normally if you do a find and maybe an insert, then you want to keep (and retrieve) the old value if it already existed. If you just want to overwrite any old value, map[foo_obj]="some value" will do that.
Here's how you get the old value, or insert a new one if it didn't exist, with one map lookup:
typedef std::map<Foo*,std::string> M;
typedef M::iterator I;
std::pair<I,bool> const& r=my_map.insert(M::value_type(foo_obj,"some value"));
if (r.second) {
// value was inserted; now my_map[foo_obj]="some value"
} else {
// value wasn't inserted because my_map[foo_obj] already existed.
// note: the old value is available through r.first->second
// and may not be "some value"
}
// in any case, r.first->second holds the current value of my_map[foo_obj]
This is a common enough idiom that you may want to use a helper function:
template <class M,class Key>
typename M::mapped_type &
get_else_update(M &m,Key const& k,typename M::mapped_type const& v) {
return m.insert(typename M::value_type(k,v)).first->second;
}
get_else_update(my_map,foo_obj,"some value");
If you have an expensive computation for v you want to skip if it already exists (e.g. memoization), you can generalize that too:
template <class M,class Key,class F>
typename M::mapped_type &
get_else_compute(M &m,Key const& k,F f) {
typedef typename M::mapped_type V;
std::pair<typename M::iterator,bool> r=m.insert(typename M::value_type(k,V()));
V &v=r.first->second;
if (r.second)
f(v);
return v;
}
where e.g.
struct F {
void operator()(std::string &val) const
{ val=std::string("some value")+" that is expensive to compute"; }
};
get_else_compute(my_map,foo_obj,F());
If the mapped type isn't default constructible, then make F provide a default value, or add another argument to get_else_compute.
There are two main approaches. The first is to use the insert function that takes a value type and returns a pair of an iterator and a bool: the bool indicates whether an insertion took place, and the iterator points to either the existing element with the same key or the newly inserted element.
map<Foo*, string>::iterator it;
it = my_map.insert( map<Foo*, string>::value_type(foo_obj, "some_value") ).first; // single lookup
The advantage of this is that it is simple. The major disadvantage is that you always construct a new value for the second parameter whether or not an insertion is required. In the case of a string this probably doesn't matter. If your value is expensive to construct this may be more wasteful than necessary.
A way round this is to use the 'hint' version of insert.
std::pair< map<Foo*, string>::iterator, map<Foo*, string>::iterator >
range = my_map.equal_range(foo_obj);
if (range.first == range.second)
{
if (range.first != my_map.begin())
--range.first;
my_map.insert(range.first, map<Foo*, string>::value_type(foo_obj, "some_value") );
}
The insertion is guaranteed to be in amortized constant time only if the element is inserted immediately after the supplied iterator, hence the --, if possible.
Edit
If this need to -- seems odd, then it is. There is an open defect (233) in the standard that highlights this issue, although the description of the issue as it applies to map is clearer in the duplicate issue 246.
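For readers on C++11 or later: as far as I know the hint rule was changed there, so that insertion is amortized constant when the new element goes just before the hint. Under that assumption the -- is no longer needed and the result of lower_bound can be passed as the hint directly. A minimal sketch, with Table and get_or_insert being made-up names:

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<int, std::string> Table;

// Returns the value stored under `key`, inserting `value` first if the key
// is absent. Only one tree search (lower_bound) is performed.
std::string& get_or_insert(Table& t, int key, const std::string& value) {
    Table::iterator it = t.lower_bound(key);  // first element with key >= `key`
    if (it == t.end() || t.key_comp()(key, it->first)) {
        // Key is absent; `it` is the element the new one must go in front of.
        it = t.insert(it, Table::value_type(key, value));
    }
    return it->second;
}
```

Like the first approach, this still constructs value at the call site; it only saves the second search.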
In your example, you want to insert when it's not found. If default construction and setting the value after that is not expensive, I'd suggest simpler version with 1 lookup:
string& r = my_map[foo_obj]; // only lookup & insert if not existed
if (r == "") r = "some value"; // if default (obj wasn't in map), set value
// else existed already, do nothing
If your example reflects what you actually want, consider storing that value as a member of Foo (e.g. string Foo::s) instead: you already have the object, so no lookups would be needed; just check whether that member has its default value. You can then keep the objects in a std::set. Even extending the class (e.g. FooWithValue2) may be cheaper than using a map.
But if joining data through the map like this is really needed, or if you want to update only if it existed, then Jonathan has the answer.