is insert() necessary in a map or unordered_map?

I see a lot of examples that add items to a map or unordered_map via operator[], like so:
int main() {
unordered_map <string, int> m;
m["foo"] = 42;
cout << m["foo"] << endl;
}
Is there any reason to use the insert member function instead? It would appear they both do the same thing.

No, they do not do the same thing.
operator[] followed by an assignment overwrites the value for the key if it already exists, while insert does not.
When operator[] is used to insert a new element, it is expected to be a little slower, because the value is default-constructed first and assigned afterwards (see @MatthieuM's comment for details), but this is rarely significant.
In addition, std::map::insert returns a std::pair<iterator, bool>, whose .second tells you whether the value was inserted or the key already existed.
Regarding your comment: you cannot have 2 elements with the same key and different value. This is not a multimap.
If there's an element in the map, with the same key you're trying to insert, then:
operator[] will overwrite the existing value
std::map::insert will leave the existing value unchanged* and return a std::pair<iterator, bool>, where the .second will be false (saying "the new element is not inserted, as such key already exists") and the .first will point to the existing element.
* I reworded this thanks to the remark from @luk32; by the earlier wording "will not do anything" I didn't mean it literally, I meant that it will not change the value of the existing element.
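The difference is easy to check on a tiny map. The following is only a minimal sketch; the function name demo_overwrite_vs_insert and the sample key are illustrative and not part of the answer above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Returns the value stored under "foo" after both operations have run.
int demo_overwrite_vs_insert() {
    std::map<std::string, int> m;
    m["foo"] = 1;   // key absent: inserts {"foo", 0}, then assigns 1
    m["foo"] = 2;   // key present: the assignment overwrites, value is now 2

    // insert leaves the existing value alone and reports that via .second
    auto result = m.insert(std::make_pair(std::string("foo"), 3));
    assert(result.second == false);      // nothing was inserted
    assert(result.first->second == 2);   // .first points at the existing element

    return m["foo"];
}
```

Calling demo_overwrite_vs_insert() yields the value written by the second operator[] assignment; the insert of 3 has no effect on it.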

Using insert() can help improve performance in certain situations (more specifically for std::map since search time is O(log(n)) instead of constant amortized). Take the following common example:
std::map<int, int> stuff;
// stuff is populated, possibly large:
auto iterator = stuff.find(27);
if(stuff.end() != iterator)
{
// found: update the existing value to 15
iterator->second = 15;
}
else
{
// not found: operator[] searches the map a second time, then inserts 10
stuff[27] = 10;
}
When the key is absent, the code above searches the map twice: once in find and once more in operator[]. We can make that (slightly) more efficient by writing it like this:
// try to insert 27 -> 10
auto result = stuff.insert(std::make_pair(27, 10));
// already existed
if(false == result.second)
{
// update to 15, already exists
result.first->second = 15;
}
The code above searches for the key only once. This does not change the asymptotic complexity (each lookup in a std::map is still O(log(n))), but it halves the number of lookups, which can add up for frequent operations.
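A related single-search idiom for std::map, when the same value should be stored whether or not the key exists, is to call lower_bound and reuse the resulting position as the insertion hint. This is a sketch under that assumption; set_value_once is a made-up helper name, not a standard function.

```cpp
#include <cassert>
#include <map>

// Stores value under key with one tree search: update in place or hinted insert.
template <typename K, typename V>
typename std::map<K, V>::iterator set_value_once(std::map<K, V>& m, const K& key, const V& value) {
    typename std::map<K, V>::iterator pos = m.lower_bound(key);   // first element with key >= `key`
    if (pos != m.end() && !m.key_comp()(key, pos->first)) {
        pos->second = value;    // key already present: overwrite
        return pos;
    }
    // key absent: the new element belongs just before pos, so pos is a good hint
    return m.insert(pos, typename std::map<K, V>::value_type(key, value));
}

int demo_set_value_once() {
    std::map<int, int> stuff;
    set_value_once(stuff, 27, 10);   // inserts 27 -> 10
    set_value_once(stuff, 27, 15);   // updates 27 -> 15
    assert(stuff.size() == 1);
    return stuff[27];
}
```

This only applies to ordered maps; std::unordered_map has no lower_bound.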

The two are not equivalent. insert will not overwrite an existing value, and it returns a pair<iterator, bool>, where iterator is the location of the key, regardless of whether or not it already existed. The bool indicates whether or not the insert occurred.
operator[] effectively does a lower_bound on key. If the result of that operation is an iterator with the same key, it returns a reference to the value. If not, it inserts a new node with a default-constructed value, and then returns a reference to the value. This is why operator[] is a non-const member - it auto-vivifies the key-value if it doesn't exist. This may have performance implications if the value type is costly to construct.
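Also note that in C++11 we have an emplace method that works nearly identically to insert, except that it constructs the key-value pair in place from forwarded arguments. (An implementation may construct the element even when the key already exists, and then destroy it again immediately.)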
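As a small illustration of emplace having the same "do not overwrite" behaviour as insert, here is a hedged sketch; demo_emplace and the sample values are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Returns true if emplace inserted once and left the map alone the second time.
bool demo_emplace() {
    std::map<int, std::string> m;

    // emplace forwards its arguments to the constructor of the stored pair
    auto first = m.emplace(1, "one");
    // key 1 is already present, so this call does not change the map
    auto second = m.emplace(1, "uno");

    assert(first.second);
    assert(!second.second);
    return m.size() == 1 && m.at(1) == "one";
}
```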

Well, I disagree with Kiril's answer to a certain degree, and I think it is incomplete, so here is mine.
According to cppreference, std::map::operator[] is equivalent to a certain insert() call. For that reason I also think he is wrong in saying that the value will be overwritten: operator[] itself only returns a reference, and it is the assignment that follows which overwrites.
It says: "Return value
Reference to the mapped value of the new element if no element with key key existed. Otherwise a reference to the mapped value of the existing element is returned."
So it seems to be a convenience wrapper. insert(), however, has the advantage of being overloaded, so it provides more functionality under one name.
I give a point to Kiril that they do seem to have somewhat different functionality at first glance; however, IMHO the examples he provides are not equivalent to each other.
Therefore, as an example of a reason to use insert, I would point out inserting many elements at once, or using a hint (overloads 3-6 on cppreference).
So is insert() necessary in a map or unordered_map?
I would say yes. Moreover, operator[] is not strictly necessary, as it can be emulated using insert, while the other way round is impossible! insert simply provides more functionality. However, writing stuff like (insert(std::make_pair(key, T())).first)->second (after cppreference) is more cumbersome than [].
Thus, is there any reason to use the insert member function instead?
I'd say for overlapping functionality, hell no.
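To make the two points concrete (operator[] can be written in terms of insert, and insert has overloads with no operator[] counterpart), here is a rough sketch. The helper subscript_via_insert is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: behaves like m[key], but is written with insert only.
template <typename K, typename T>
T& subscript_via_insert(std::map<K, T>& m, const K& key) {
    return m.insert(std::make_pair(key, T())).first->second;
}

int demo_insert_extras() {
    std::map<std::string, int> m;
    subscript_via_insert(m, std::string("a")) = 1;   // same effect as m["a"] = 1
    subscript_via_insert(m, std::string("a")) = 5;   // assigning through the reference overwrites

    // Range overload: insert many elements at once; keys already present are skipped
    std::vector<std::pair<std::string, int>> more = {{"a", 100}, {"b", 2}, {"c", 3}};
    m.insert(more.begin(), more.end());
    assert(m.size() == 3);

    return m["a"] + m["b"] + m["c"];   // 5 + 2 + 3
}
```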

Related

std::map - adding element using subscript operator Vs insert method

I am trying to understand and make sure if three different ways to insert elements into a std::map are effectively the same.
std::map<int, char> mymap;
Just after declaring mymap - will inserting an element with value a for key 10 be same by these three methods?
mymap[10]='a';
mymap.insert(mymap.end(), std::make_pair(10, 'a'));
mymap.insert(std::make_pair(10, 'a'));
Especially, does it make any sense using mymap.end() when there is no existing element in std::map?

The main difference is that (1) first default-constructs a mapped value in the map in order to be able to return a reference to it. This is what enables you to assign something to it.
Keep that in mind if you are working with types that are stored in a map, but have no default constructor. Example:
struct A {
explicit A(int) {};
};
std::map<int, A> m;
m[10] = A(42); // Error! A has no default ctor
m.insert(std::make_pair(10, A(42))); // Ok
m.insert(m.end(), std::make_pair(10, A(42))); // Ok
The other notable difference is that (as @PeteBecker pointed out in the comments) (1) overwrites existing entries in the map, while (2) and (3) don't.

Yes, they are effectively the same. Just after declaring mymap, all three methods leave mymap holding the single element {10, 'a'}.
It is OK to use mymap.end() when there is no existing element in std::map. In this case, begin() == end(), which is the universal way of denoting an empty container.

(1) is different from (2) and (3) if there exists an element with the same key. (1) will replace the mapped value, whereas (2) and (3) will leave it unchanged: (3) returns a pair whose bool is false to indicate that no insertion happened, and (2) returns just an iterator to the existing element.
(1) also requires that the mapped type is default constructible. In fact, if the key is not present already, (1) first default-constructs the object and then assigns the specified value to it.
(2) and (3) are also different. To understand the difference we need to understand what the iterator in (2) does. From cppreference, the iterator refers to a hint where insertion happens as close to that hint as possible. There is a performance difference depending on the validity of the hint. Quoting from the same page:
Amortized constant if the insertion happens in the position just after the hint, logarithmic in the size of the container otherwise.(until C++11)
Amortized constant if the insertion happens in the position just before the hint, logarithmic in the size of the container otherwise. (since C++11)
So for large maps we can get a performance boost if we already know the position somehow.
Having said all of this, if the map has just been created and you are doing the operation with no prior elements in the map, as you said in the question, then I would say that all three will be practically the same (though their internal operations will differ as described above).
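The hint matters most when keys arrive already sorted. The sketch below assumes ascending input and C++11 hint semantics (insertion just before the hint); build_sorted and largest_key_after_bad_hint are illustrative names.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Builds a map from keys that arrive in ascending order, passing end() as the hint.
std::map<int, char> build_sorted(int count) {
    std::map<int, char> m;
    for (int i = 0; i < count; ++i) {
        // every new key is the largest so far, so it belongs just before end()
        m.insert(m.end(), std::make_pair(i, static_cast<char>('a' + i % 26)));
    }
    return m;
}

// A wrong hint costs time, not correctness: the element still lands in sorted position.
int largest_key_after_bad_hint() {
    std::map<int, char> m = build_sorted(5);
    m.insert(m.begin(), std::make_pair(1000, 'z'));   // begin() is a poor hint for the largest key
    assert(m.size() == 6);
    return m.rbegin()->first;
}
```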

which element will be returned from std::multimap::find, and similarly std::multiset::find?

Most likely this question is a duplicate but I could not find a reference to it.
I'm looking at std::multiset::find & std::multimap::find functions and I was wondering which element will be returned if a specific key was inserted multiple times?
From the description:
Notice that this function returns an iterator to a single element (of
the possibly multiple equivalent elements)
Question
Is it guaranteed that the single element is the first one inserted or is it random?
Background
The reason I'm asking is that I'm implementing a multimap-like class:
struct Item
{
    std::string m_name;   // public so that MyItemMultiMap can read the key
};
typedef std::vector<Item> Item_vector;   // declared after Item, which it depends on
class MyItemMultiMap
{
public:
    // forgive me for not checking whether the key exists in the map; it is just an example.
    void add_item( const Item& v ) { m_map[v.m_name].push_back(v); }
    // does returning the first item in the vector mimic std::multimap::find behavior?
    Item& get_item( const std::string& v ) { return m_map[v][0]; }
private:
    std::map<std::string,Item_vector> m_map;
};
I'd like get_item() to work exactly as std::multimap::find. is it possible? if so, how would it be implemented?

The find method may return an arbitrary one if more than one is present, though your STL implementation might indeed just give the first one.
It's safer to use the lower_bound method and iterate with ++ from there (see std::multimap::lower_bound). Do note, though, that lower_bound returns an iterator to a different element (or end()) if the key you're looking for isn't present!

The C++ standard says that for any associative container a, a.find(k) "returns an iterator pointing to an element with the key equivalent to k, or a.end() if such an element is not found", and it doesn't impose any additional requirements on multimap. Since it doesn't specify which element is returned, the implementation is permitted to return any matching element.
If you're trying to imitate the exact behavior of multimap on the platform where you're running, that's bad news, but if your goal is just to satisfy the same requirements as multimap, it's good news: you can return any matching element that you want to, and in particular it's fine to just always return the first one.

http://en.cppreference.com/w/cpp/container/multimap/find
Finds an element with key key. If there are several elements with key
in the container, the one inserted earlier is selected.
So, an iterator to the first element will be returned.
In general, I find equal_range to be the more useful method, returning a pair of iterators pointing respectively at the first, and after the last, elements matching the key.
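A short sketch of the equal_range approach follows; values_for and the sample data are assumptions made for the example, not code from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Collects every value stored under `key` by walking the range equal_range returns.
std::vector<int> values_for(const std::multimap<std::string, int>& mm, const std::string& key) {
    std::vector<int> out;
    auto range = mm.equal_range(key);   // [first match, one past the last match)
    for (auto it = range.first; it != range.second; ++it) {
        out.push_back(it->second);
    }
    return out;
}

std::size_t demo_equal_range() {
    std::multimap<std::string, int> mm;
    mm.emplace("k", 1);
    mm.emplace("k", 2);
    mm.emplace("other", 3);
    assert(values_for(mm, "absent").empty());   // empty range when the key is missing
    return values_for(mm, "k").size();
}
```

Because the range is empty for a missing key, the caller needs no separate comparison against end().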

Is it wise to use a pointer to access values in an std::map

Is it dangerous to return a pointer to the data found by std::map::find and use that, as opposed to getting a copy of the data?
Currently, I get a pointer to an entry in my map and pass it to another function to display the data. I'm concerned that items moving around could cause the pointer to become invalid. Is this a legitimate concern?
Here is my sample function:
MyStruct* StructManagementClass::GetStructPtr(int structId)
{
std::map<int, MyStruct>::iterator foundStruct;
foundStruct= myStructList.find(structId);
if (foundStruct== myStructList.end())
{
MyStruct newStruct;
memset(&newStruct, 0, sizeof(MyStruct));
newStruct.structId= structId;
myStructList.insert(pair<int, MyStruct>(structId, newStruct));
foundStruct= myStructList.find(structId);
}
return (MyStruct*) &foundStruct->second;
}

It would undoubtedly be more typical to return an iterator than a pointer, though it probably makes little difference.
As far as remaining valid goes: a map iterator remains valid until/unless the item it refers to is removed/erased from the map.
When you insert or delete some other node in the map, that can result in the nodes in the map being rearranged. That's done by manipulating the pointers between the nodes though, so it changes what other nodes contain pointers to the node you care about, but does not change the address or content of that particular node, so pointers/iterators to that node remain valid.
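The stability described above can be checked directly. This is a minimal sketch with arbitrary key counts; pointer_survives_rebalancing is an illustrative name.

```cpp
#include <cassert>
#include <map>

// Returns true if a pointer to one mapped value survives unrelated inserts and erases.
bool pointer_survives_rebalancing() {
    std::map<int, int> m;
    m[500] = 42;
    int* p = &m.find(500)->second;   // pointer to the mapped value of key 500

    for (int i = 0; i < 1000; ++i) {
        if (i != 500) m[i] = i;      // many inserts, so the tree rebalances repeatedly
    }
    for (int i = 0; i < 400; ++i) {
        m.erase(i);                  // erasing other keys does not move key 500's node
    }
    assert(*p == 42);
    return p == &m.find(500)->second;
}
```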

As long as you, your code, and your development team understand the lifetime of std::map values ( valid after insert, and invalid after erase, clear, assign, or operator= ), then using an iterator, const_iterator, ::mapped_type*, or ::mapped_type const* are all valid. Also, if the return is always guaranteed to exist, then ::mapped_type&, or ::mapped_type const& are also valid.
As for wise, I'd prefer the const versions over the mutable versions, and I'd prefer references over pointers over iterators.
Returning an iterator instead of a pointer has drawbacks:
it exposes an implementation detail.
it is awkward to use, as the caller has to know to dereference the iterator, that the result is an std::pair, and that one must then call .second to get the actual value.
.first is the key that the user may not care about.
determining if an iterator is invalid requires knowledge of ::end(), which is not obviously available to the caller.

It's not dangerous - the pointer remains valid just as long as an iterator or a reference does.
However, in your particular case, I would argue that it is not the right thing anyway. Your function unconditionally returns a result. It never returns null. So why not return a reference?
Also, some comments on your code.
std::map<int, MyStruct>::iterator foundStruct;
foundStruct = myStructList.find(structId);
Why not combine declaration and assignment into initialization? Then, if you have C++11 support, you can just write
auto foundStruct = myStructList.find(structId);
Then:
myStructList.insert(pair<int, MyStruct>(structId, newStruct));
foundStruct = myStructList.find(structId);
You can simplify the insertion using make_pair. You can also avoid the redundant lookup, because insert returns an iterator to the newly inserted element (as the first element of a pair).
foundStruct = myStructList.insert(make_pair(structId, newStruct)).first;
Finally:
return (MyStruct*) &foundStruct->second;
Don't ever use C-style casts. It might not do what you expect. Also, don't use casts at all when they're not necessary. &foundStruct->second already has type MyStruct*, so why insert a cast? The only thing it does is hide a place that you need to change if you ever, say, change the value type of your map.
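Putting these suggestions together, the function could return a reference and rely on the iterator that insert hands back. This is only a sketch: MyStruct here is a hypothetical stand-in with made-up members, and GetStruct is a renamed variant of the question's function.

```cpp
#include <cassert>
#include <map>
#include <utility>

struct MyStruct {     // hypothetical stand-in for the question's struct
    int structId;
    int payload;
};

class StructManagementClass {
public:
    // Returns the entry for structId, creating a zeroed one if it is missing.
    MyStruct& GetStruct(int structId) {
        MyStruct fresh = MyStruct();   // value-initialization zeroes the members
        fresh.structId = structId;
        // insert does nothing if the key exists; .first points at the entry either way
        return myStructList.insert(std::make_pair(structId, fresh)).first->second;
    }
private:
    std::map<int, MyStruct> myStructList;
};

bool demo_get_struct() {
    StructManagementClass mgr;
    mgr.GetStruct(7).payload = 99;              // creates the entry, then modifies it in place
    assert(mgr.GetStruct(7).structId == 7);
    return mgr.GetStruct(7).payload == 99;      // second call finds the same entry
}
```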

Yes,
If you build a generic function without knowing how it will be used, it can be dangerous to return the pointer (or the iterator), since it can become invalid.
I would advise doing one of two things:
1. work with std::shared_ptr and return that. (see below)
2. return the struct by value (can be slower)
// change the definition of the list to
std::map<int, std::shared_ptr<MyStruct>> myStructList;
std::shared_ptr<MyStruct> StructManagementClass::GetStructPtr(int structId)
{
    auto foundStruct = myStructList.find(structId);
    if (foundStruct == myStructList.end())
    {
        // make_shared<MyStruct>() value-initializes the struct, which replaces the memset
        std::shared_ptr<MyStruct> newStruct = std::make_shared<MyStruct>();
        newStruct->structId = structId;
        // insert returns an iterator to the new element, so no second find is needed
        foundStruct = myStructList.insert(std::make_pair(structId, newStruct)).first;
    }
    return foundStruct->second;
}

C++: How can I stop map's operator[] from inserting bogus values?

My code did the following:
Retrieve a value from a map with operator[].
Check the return value and if NULL use insert to insert a new element in the map.
Magically, an element with value 0 appeared in the map.
After several hours of debugging I discovered the following: map's operator[] inserts a new element if the key is not found while insert does not change the value if the key exists.
Even if a default constructor for the map value type does not exist the code compiles and operator[] inserts 0.
Is there a way (e.g. some coding convention I could follow from now on) I could have prevented this from hurting me?

Is there a way (e.g. some coding convention I could follow from now on) I could have prevented this from hurting me?
This may sound snarky, but: by reading the documentation.
Since what you did is somewhat expected behaviour of the map, there’s not much you can do to guard against it.
One thing you can heed in the future is the following. In your second step, you did something wrong:
Check the return value and if NULL use insert to insert a new element in the map.
This never works with C++ standard library functions (other than C compatibility functions and new): the standard library doesn’t deal in pointers, least of all null pointers, so checking against NULL (or 0 or nullptr) rarely makes sense. (Apart from that, it wouldn’t make sense for a map’s operator[] to return a pointer in the first place. It obviously returns the element type, or rather, a reference to it.)
In fact, the standard library predominantly uses iterators so if at all, check for iterator validity by comparing against the end() of a container.
Unfortunately, your code (checking against NULL) compiled since NULL is actually a macro that’s equal to 0 in C++ so you can compare it against an integer.
C++11 gets safer by introducing the nullptr keyword which has a distinct type, so comparing it with an integer wouldn’t compile. So this is a useful coding convention: never use NULL, and instead compile with C++11 support enabled and use nullptr.

I guess the obvious answer is to learn that those are indeed the semantics of the indexing operator, so you should not use it to test for element existence in a container.
Instead, use find().

Indeed, when you call operator [], if a value at that key is not found, a value-initialized default value is inserted.
If you don't want this to happen, you have to check with find:
if ( mymap.find(myKey) == mymap.end() )
{
//the key doesn't exist in a map
}
The value returned by operator[] will be NULL only if it's a map to pointers (or to types that yield 0 when value-initialized, but you were pretty specific about NULL).
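The side effect is easy to reproduce. In this sketch, lookup_or and size_after_reads are illustrative names; the first reads through find and the second shows what a read through operator[] does to the map.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Looks up a key without modifying the map; returns fallback if the key is absent.
int lookup_or(const std::map<std::string, int>& m, const std::string& key, int fallback) {
    auto it = m.find(key);
    return it == m.end() ? fallback : it->second;
}

// Returns the map size after one find-based read and one operator[] read.
std::size_t size_after_reads() {
    std::map<std::string, int> m;
    int a = lookup_or(m, "missing", -1);   // map stays empty
    assert(a == -1 && m.empty());
    int b = m["missing"];                  // inserts {"missing", 0}
    assert(b == 0);
    return m.size();
}
```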

Even if a default constructor for the map value type does not exist
the code compiles
This is definitely wrong. operator[] should fail to compile if a default constructor does not exist. Anything else is an error on the part of your implementation.
Your code should have simply used insert once.

Since C++11, std::map also has an at(const Key& key) member function, which checks whether the key exists and throws a std::out_of_range exception if it has not been found.
http://en.cppreference.com/w/cpp/container/map/at
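A minimal sketch of at() on a missing key, assuming a C++11 compiler; the function name is made up for the example.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Returns true if at() signalled the missing key by throwing and left the map untouched.
bool at_throws_for_missing_key() {
    std::map<int, std::string> m;
    m[1] = "one";
    assert(m.at(1) == "one");          // present key: plain read access
    try {
        m.at(2);                       // key 2 is absent
    } catch (const std::out_of_range&) {
        return m.size() == 1;          // nothing was inserted
    }
    return false;                      // reached only if at() did not throw
}
```

Unlike operator[], at() also has a const overload, so it works on a const map.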

Instead of using the [] operator, call find on your map. This will not insert an entry if no match is found. It returns an iterator to the item found.

Here are several examples of map lookup and insert use cases, where you sometimes want to handle the case of items already existing, seeing what old value was, etc.
class Foo
{
// Use a typedef so we can conveniently declare iterators
// and conveniently construct insert pairs
typedef std::map<int, std::string> IdMap;
IdMap id_map;
// A function that looks up a value without adding anything
void one(int id)
{
IdMap::iterator i = id_map.find(id);
// See if an entry already exists
if (i == id_map.end())
return; // value does not exist
// Pass the string value that was stored in the map to baz
baz(i->second);
}
// A function that updates an existing value, but only if it already exists
bool two(int id, const std::string &data)
{
IdMap::iterator i = id_map.find(id);
if (i == id_map.end())
return false;
i->second = data;
return true;
}
// A function that inserts a value only if it does NOT already exist
// Returns true if the insertion happened, returns false if no effect
bool three(int id, const std::string &data)
{
return id_map.insert(IdMap::value_type(id, data)).second;
}
// A function that tries to insert if key doesn't already exist,
// but if it does already exist, needs to get the current value
// Returns true if the insertion happened, false if a conflict was reported
bool four(int id, const std::string &data)
{
std::pair<IdMap::iterator,bool> i =
id_map.insert(IdMap::value_type(id, data));
// Insertion worked, don't need to process old value
if (i.second)
return true;
// Pass the id to some imaginary function that needs
// to know id and wants to see the old string and new string
report_conflict(id, i.first->second, data);
return false;
}
};
Programmers often do multiple redundant calls to operator[], or a call to find then a redundant call to operator[], or a call to find then a redundant call to insert, out of laziness or ignorance. It is quite easy to efficiently use a map if you understand its semantics.

operator[] actually returns a reference to the mapped value, so if you are sure you want to insert the element you can do something like:
map<Key, Value*> my_map;
Value*& entry = my_map[key]; // insert occurs here if the key is absent; the new pointer is value-initialized to nullptr
if (entry == nullptr) entry = my_new_entry; // just changing the value
An added benefit is that you only look up the key in the map once.

C++ template class map

I added the constructor and two functions to the class from my previous linked question, "C++ iterate through a template Map", and I need help with these points:
What do you reckon this constructor does?
Adding one value at the beginning of map?
I see though in the respective key only an address as value after initializing in main. What is wrong?
The operator [] is supposed to get the values for a specific key. However I cannot use it so as to get the elements of the map in the output. Any hint?
template<class K, class V>
class template_map{
public:
template_map( V const& val) {
my_map.insert(my_map.begin(),std::make_pair(std::numeric_limits<K>::min(),val));
}
typedef typename std::map<K,V> TMap;
TMap my_map;
typedef typename TMap::const_iterator const_iterator;
const_iterator begin() const { return my_map.begin(); }
const_iterator end() const { return my_map.end(); }
V const& operator[]( K const& key ) const {
return ( --my_map.upper_bound(key) )->second;
}
...
};
int main()
{
template_map<int,int> Map1 (10);
//Show the elements of the map?
}
Consider also that it should be a function that inserts values to the map.

What do you reckon this constructor does? Adding one value at the beginning of map?
It initialises the map so that map[x] == v for any x. The map associates intervals with values, internally storing a normal map keyed by the start of each interval; it's initialised so that the entire range of the key type maps to the initial value.
I see though in the respective key only an address as value after initializing in main. What is wrong? The operator [] is supposed to get the values for a specific key. However I cannot use it so as to get the elements of the map in the output. Any hint?
I've no idea what you're asking there. If you try, for example, cout << Map1[42] << '\n';, then your program should output 10, since that is the initial value assigned to the entire range of integers.
Consider also that it should be a function that inserts values to the map.
Since the internal map is publicly exposed, you can add a new interval to the map with
Map1.my_map.insert(std::make_pair(interval_start, value));
It might be more polite to make my_map private, and provide an insert() function to do that. You could also add a non-const overload of operator[] that inserts a new range and returns a reference to its value, something like
V & operator[](K const & key) {
V const & old_value = (--my_map.upper_bound(key))->second;
return my_map.insert(std::make_pair(key, old_value)).first->second;
}
although this might not be a great idea, as you'd have to be careful that you don't accidentally insert many ranges when you only want to read the values.
My problem is how to iterate through the map to get all its elements and print them in main. It shows me an address with a value of the object initialization.
Remembering that an iterator over a map refers to a key/value pair (of type std::pair<const K, V>), you should be able to iterate over the map like this:
for (auto it = Map1.begin(); it != Map1.end(); ++it) {
std::cout << it->first << " maps to " << it->second << '\n';
}
(in C++03, you'll need to write template_map<int,int>::const_iterator rather than auto).
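Pulling the pieces of this answer together, a compilable version of the class might look like the sketch below. It assumes K is an integer type (so that numeric_limits<K>::min() is the smallest key) and that V is default constructible; set_from is a hypothetical member added for the illustration and is not part of the question's code.

```cpp
#include <cassert>
#include <limits>
#include <map>
#include <utility>

template <class K, class V>
class template_map {
public:
    typedef std::map<K, V> TMap;
    typedef typename TMap::const_iterator const_iterator;

    explicit template_map(V const& val) {
        // the smallest key maps to val, so every key has an interval to fall into
        my_map.insert(std::make_pair(std::numeric_limits<K>::min(), val));
    }

    // Hypothetical addition: keys >= start map to val, up to the next stored start.
    void set_from(K const& start, V const& val) { my_map[start] = val; }

    V const& operator[](K const& key) const {
        // upper_bound is never begin() here, because min() is always stored
        return (--my_map.upper_bound(key))->second;
    }

    const_iterator begin() const { return my_map.begin(); }
    const_iterator end() const { return my_map.end(); }

private:
    TMap my_map;   // private, as suggested above
};

bool demo_template_map() {
    template_map<int, int> Map1(10);
    Map1.set_from(100, 20);
    int intervals = 0;
    for (auto it = Map1.begin(); it != Map1.end(); ++it) ++intervals;
    assert(intervals == 2);   // one interval starting at min(), one starting at 100
    return Map1[-7] == 10 && Map1[42] == 10 && Map1[99] == 10
        && Map1[100] == 20 && Map1[5000] == 20;
}
```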

What do you reckon this constructor does? Adding one value at the
beginning of map? I see though in the respective key only an address
as value after initializing in main. What is wrong?
It adds this one value in to the map. The iterator argument is only a hint: if the new item is to be inserted right after this position, the operation can be completed faster. Otherwise the map will need to find the right place to insert the new value as usual.
The operator [] is supposed to get the values for a specific key.
However I cannot use it so as to get the elements of the map in the
output. Any hint?
upper_bound returns iterator to the first key-value pair, where key is greater than the argument. --upper_bound therefore returns an iterator to the item, whose key is either equal or less than the queried key. If upper_bound returned map.begin(), because all keys are greater than the query, decrementing it is undefined behavior.
What you need here is the find member function. You also need to deal with the case where the key is not found (map.end() is returned), e.g. by throwing an exception.
Alternatively you may implement your operator[] in terms of map::operator[]. This means that the function can't be const, because map inserts a new default value if the key is not found.
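If the "greatest key not above the query" lookup is what is wanted, the decrement can be guarded so that begin() is never stepped back from. A sketch, with floor_value as a made-up helper name:

```cpp
#include <cassert>
#include <map>

// Returns a pointer to the value whose key is the greatest key <= `key`, or nullptr if none.
template <class K, class V>
const V* floor_value(const std::map<K, V>& m, const K& key) {
    auto it = m.upper_bound(key);            // first element with key > `key`
    if (it == m.begin()) return nullptr;     // every key is greater: nothing to step back to
    --it;                                    // safe: there is an element before `it`
    return &it->second;
}

bool demo_floor_value() {
    std::map<int, int> m;
    m[10] = 1;
    m[20] = 2;
    assert(floor_value(m, 5) == nullptr);    // below the smallest key
    const int* p = floor_value(m, 15);       // falls back to key 10
    return p != nullptr && *p == 1 && *floor_value(m, 20) == 2;
}
```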

The iterator in map::insert() is just a hint; essentially it doesn't mean anything in terms of the semantics of the program.
Your code inserts the value passed through the constructor argument together with the key numeric_limits<K>::min(), i.e. the smallest possible value for the given key type. This will only compile if numeric_limits is specialized for the type K.
Also note that if the key already exists, the corresponding mapped value will not be overwritten, so a corresponding insert function would be of very limited use.
