I am trying to confirm whether three different ways to insert elements into a std::map are effectively the same.
std::map<int, char> mymap;
Just after declaring mymap, will these three methods of inserting the value 'a' for key 10 have the same effect?
mymap[10]='a';
mymap.insert(mymap.end(), std::make_pair(10, 'a'));
mymap.insert(std::make_pair(10, 'a'));
Especially, does it make any sense using mymap.end() when there is no existing element in std::map?
The main difference is that (1) first default-constructs a value object (the mapped value, not the key) in the map in order to be able to return a reference to this object. This enables you to assign something to it.
Keep that in mind if you are working with types that are stored in a map, but have no default constructor. Example:
struct A {
explicit A(int) {};
};
std::map<int, A> m;
m[10] = A(42); // Error! A has no default ctor
m.insert(std::make_pair(10, A(42))); // Ok
m.insert(m.end(), std::make_pair(10, A(42))); // Ok
The other notable difference is that (as @PeteBecker pointed out in the comments) (1) overwrites existing entries in the map, while (2) and (3) don't.
Yes, they are effectively the same. Just after declaring mymap, all three methods turn mymap into {10, 'a'}.
It is OK to use mymap.end() when there is no existing element in std::map. In this case, begin() == end(), which is the universal way of denoting an empty container.
(1) is different from (2) and (3) if there exists an element with the same key. (1) will replace the element's value, whereas (2) and (3) will leave the map unchanged: (3) returns a pair whose bool is false, and (2) returns an iterator to the existing element.
(1) also requires that mapped type is default constructible. In fact (1) first default constructs the object if not present already and replaces that with the value specified.
(2) and (3) are also different. To understand the difference we need to understand what the iterator in (2) does. From cppreference, the iterator refers to a hint where insertion happens as close to that hint as possible. There is a performance difference depending on the validity of the hint. Quoting from the same page:
Amortized constant if the insertion happens in the position just after the hint, logarithmic in the size of the container otherwise.(until C++11)
Amortized constant if the insertion happens in the position just before the hint, logarithmic in the size of the container otherwise. (since C++11)
So for large maps we can get a performance boost if we already know the position somehow.
Having said all of this, if the map has just been created and you are doing the operation with no prior elements in the map, as you said in the question, then I would say that all three will be practically the same (though their internal operation will differ as described above).
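The overwrite-versus-keep difference between the three forms can be sketched as follows. This is only an illustration: the function name compare_insert_styles is made up for this example, and only standard std::map members are used.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical helper: applies the three forms to a map that already
// holds key 10 and returns what ends up stored under that key.
char compare_insert_styles() {
    std::map<int, char> m;
    m[10] = 'a';                                  // (1) inserts 10 -> 'a'

    // (3) plain insert: the key exists, so nothing changes and .second is false
    auto res = m.insert(std::make_pair(10, 'b'));
    assert(!res.second && res.first->second == 'a');

    // (2) hinted insert: returns only an iterator to the existing element
    auto it = m.insert(m.end(), std::make_pair(10, 'c'));
    assert(it->second == 'a');

    m[10] = 'd';                                  // (1) again: overwrites
    return m[10];
}
```

Calling compare_insert_styles() yields the value written by the last operator[] assignment, since neither insert call touched the existing entry.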
Related
std::map<int, Obj> mp;
// insert elements into mp
// case 1
std::map<int, Obj> mp2;
mp2 = std::move(mp);
// case 2
std::map<int, Obj> mp3;
std::move(std::begin(mp), std::end(mp), std::inserter(mp3, std::end(mp3)));
I am confused by the two cases. Are they exactly the same?
Are they exactly the same?
No, They are not!
The first one invokes the move assignment operator of std::map, and the move is done at the class/data-structure level, as with the move constructor (overload 4) described below.
[...]
Move constructor.
After container move construction (overload (4)), references, pointers, and iterators (other than the end iterator) to other remain valid, but refer to elements that are now in *this. The current standard makes this guarantee via the blanket statement in container.requirements.general, and a more direct guarantee is under consideration via LWG 2321
Complexity
4) Constant. If alloc is given and alloc != other.get_allocator(), then linear.
The second std::move is the algorithm from the <algorithm> header, which moves element-wise (i.e. one key-value pair at a time) into the other map.
Moves the elements in the range [first, last), to another range beginning at d_first, starting from first and proceeding to last - 1. After this operation the elements in the moved-from range will still contain valid values of the appropriate type, but not necessarily the same values as before the move.
Complexity
Exactly last - first move assignments.
No, they are not the same.
Case 1 moves the content of the whole map at once. The map's internal pointer(s) are "moved" to mp2 - none of the pairs in the map are affected.
Case 2 moves the individual pairs in the map, one by one. Note that map keys are const, so they can't be moved and will be copied instead. mp will still contain as many elements as before, but with mapped values in a valid but unspecified state.
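A minimal sketch of the two cases, assuming std::string as the mapped type in place of Obj. The helper names move_elementwise and move_whole are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper for case 2: moves `src` into a new map element by
// element. `src` keeps all of its nodes afterwards.
std::map<int, std::string> move_elementwise(std::map<int, std::string>& src) {
    std::map<int, std::string> dst;
    std::move(src.begin(), src.end(), std::inserter(dst, dst.end()));
    return dst;
}

// Hypothetical helper for case 1: transfers the whole tree in one step.
std::map<int, std::string> move_whole(std::map<int, std::string>& src) {
    std::map<int, std::string> dst;
    dst = std::move(src);  // move assignment; src is left valid but unspecified
    return dst;
}
```

After move_elementwise the source map still reports its original size, while after move_whole nothing should be assumed about the source beyond it being safe to assign to or destroy.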
How can I emplace an empty vector into a std::map? For example, if I have a std::map<int, std::vector<int>>, and I want map[4] to contain an empty std::vector<int>, what can I call?
If you use operator[](const Key&), the map will automatically emplace a value-initialized (i.e. in the case of std::vector, default-constructed) value if you access an element that does not exist. See here:
http://en.cppreference.com/w/cpp/container/map/operator_at
(Since C++ 11 the details are a tad more complicated, but in your case this is what matters).
That means if your map is empty and you do map[4], it will readily give you a reference to an empty (default-constructed) vector. Assigning an empty vector is unnecessary, although it may make your intent more clear.
Demo: https://godbolt.org/g/rnfW7g
Unfortunately the strictly-correct answer is indeed to use std::piecewise_construct as the first argument, followed by two tuples. The first represents the arguments to create the key (4), and the second represents the arguments to create the vector (empty argument set).
It would look like this:
map.emplace(std::piecewise_construct, // signal piecewise construction
std::make_tuple(4), // key constructed from int(4)
std::make_tuple()); // value is default constructed
Of course this looks unsightly, and other alternatives will work. They may even generate no more code in an optimised build:
This one notionally invokes default-construction of a temporary and then move-construction into the map, but it is likely that the optimiser will see through it.
map.emplace(4, std::vector<int>());
This one invokes default-construction followed by an assignment from an empty braced list. But again, the optimiser may well see through it.
map[4] = {};
To ensure an empty vector is placed at position 4, you may simply attempt to clear the vector at position 4.
std::map<int, std::vector<int>> my_map;
my_map[4].clear();
As others have mentioned, the indexing operator for std::map will construct an empty value at the specified index if none already exists. If that is the case, calling clear is redundant. However, if a std::vector<int> does already exist, the call to clear serves to, well, clear the vector there, resulting in an empty vector.
This may be more efficient than my previous approach of assigning {} (see below), because we probably plan on adding elements to the vector at position 4, and this way we don't pay the cost of a new allocation. Additionally, if previous usage of my_map[4] indicates future usage, then our new vector will likely end up resized to nearly the same size as before, meaning we save on reallocation costs.
Previous approach:
just assign to {} and the container should properly construct an empty vector there:
std::map<int, std::vector<int>> my_map;
my_map[4] = {};
std::cout << my_map.size() << std::endl; // prints 1
Demo
Edit: As Jodocus mentions, if you know that the std::map doesn't already contain a vector at position 4, then simply attempting to access the vector at that position will default-construct one, e.g.:
std::map<int, std::vector<int>> my_map;
my_map[4]; // default-constructs a vector there
What's wrong with the simplest possible solution? map[4] = {};.
In modern C++, this should do what you want with no or at least, very little, overhead.
If you must use emplace, the best solution I can come up with is this:
std::map<int, std::vector<int>> map;
map.emplace(4, std::vector<int>());
Use piecewise_construct with std::make_tuple:
map.emplace(std::piecewise_construct, std::make_tuple(4), std::make_tuple());
We are inserting an empty vector at position 4.
And for a more general case, such as emplacing a vector of 100 elements that are all equal to 10:
map.emplace(std::piecewise_construct, std::make_tuple(4), std::make_tuple(100, 10));
piecewise_construct: This constant value is passed as the first argument to construct a pair object to select the constructor form that constructs its members in place by forwarding the elements of two tuple objects to their respective constructor.
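The approaches discussed above can be put side by side in a short sketch. The function name build_vectors and the keys 1, 2 and 3 are made up for this illustration; the calls themselves are the ones shown in the answers.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical helper showing three ways to get a vector into the map.
std::map<int, std::vector<int>> build_vectors() {
    std::map<int, std::vector<int>> m;

    m[1];                                   // operator[] value-initializes: empty vector

    m.emplace(std::piecewise_construct,
              std::make_tuple(2),           // key constructed from 2
              std::make_tuple());           // vector constructed from no arguments

    m.emplace(std::piecewise_construct,
              std::make_tuple(3),           // key constructed from 3
              std::make_tuple(100, 10));    // vector<int>(100, 10): 100 copies of 10
    return m;
}
```

The map returned by build_vectors() holds empty vectors under keys 1 and 2 and a 100-element vector under key 3.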
Let's say I have a map<int, int>:
std::map<int, int> map;
map.emplace(1, 2);
map.insert({3, 4});
Will there be any difference between the two calls?
In the first call, the two integers will be copied by value to the emplace function and then again to the std::pair<int, int> constructor. In the second call, the two integers will be copied by value to the std::pair<int, int> constructor and then be copied by value to the internal std::pair<int, int> again as members of the first pair.
I understand the benefits of emplace for types like std::string where they would be copied by value in the second call and moved all the way in the first one, but is there any benefit in using emplace in the situation described?
Emplace is slower, if there is a chance that the emplace will fail (the key is already present).
This is because, in typical implementations, emplace has to allocate a node and construct the pair<Key const, Value> into it, then extract the key from that node and check whether the key is already present, then deallocate the node if the key is already present. On the other hand, insert can extract the key from the passed value to be inserted, so it does not need to allocate a node if the insert would fail. See: performance of emplace is worse than check followed by emplace.
To fix this, C++17 adds a member function try_emplace(const key_type& k, Args&&... args) (etc.)
In case of success, there is no real difference between the two cases; the order of operations is different, but that will not affect performance in any predictable fashion. Code size will still be slightly larger for the emplace variant, as it has to be ready to perform more work in the failure case.
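As a rough sketch of the try_emplace behaviour mentioned above (C++17 or later is assumed): when the key already exists, no element is constructed and rvalue arguments are left untouched. The helper name try_emplace_twice is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: attempts to add key 1 twice and returns the
// argument string as it looks after the second, failed attempt.
std::string try_emplace_twice(std::map<int, std::string>& m) {
    m.try_emplace(1, "first");                    // key absent: element is inserted

    std::string arg = "second";
    auto res = m.try_emplace(1, std::move(arg));  // key present: nothing is constructed
    assert(!res.second);

    return arg;                                   // not moved from in the failure case
}
```

With plain emplace or insert, the state of arg after a failed insertion would not be guaranteed in the same way.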
I see a lot of examples that add items to a map or unordered_map via operator[], like so:
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;
int main() {
    unordered_map<string, int> m;
    m["foo"] = 42;
    cout << m["foo"] << endl;
}
Is there any reason to use the insert member function instead? It would appear they both do the same thing.
They are not.
operator[] will overwrite the value for this key, if it exists, while insert will not.
When operator[] is used to insert an element, it is expected to be a little slower (see @MatthieuM's comment below for details), but this is not that significant here.
std::map::insert, on the other hand, returns a std::pair< iterator, bool >, where .second tells you whether the value was inserted or already existed.
Regarding your comment: you cannot have 2 elements with the same key and different value. This is not a multimap.
If there's an element in the map, with the same key you're trying to insert, then:
operator[] will overwrite the existing value
std::map::insert will not do anything* other than return a std::pair< iterator, bool >, where the .second will be false (saying "the new element is not inserted, as such key already exists") and the .first will point to the found element.
* I changed this thanks to the remark from @luk32; by writing "will not do anything", I didn't mean it literally, I meant that it will not change the value of the existing element.
Using insert() can help improve performance in certain situations (more specifically for std::map since search time is O(log(n)) instead of constant amortized). Take the following common example:
std::map<int, int> stuff;
// stuff is populated, possibly large:
auto iterator = stuff.find(27);
if(stuff.end() != iterator)
{
// found: update to 15
iterator->second = 15;
}
else
{
// insert with value of 10
stuff[27] = 10;
}
The code above effectively searches for the element twice when the key is absent: once in find and again in operator[]. We can make it (slightly) more efficient by writing it like this:
// try to insert 27 -> 10
auto result = stuff.insert(std::make_pair(27, 10));
// already existed
if(false == result.second)
{
// update to 15, already exists
result.first->second = 15;
}
The code above walks the tree only once when the key is missing, instead of twice. The complexity is still O(log(n)), but for frequent operations halving the number of lookups can noticeably improve performance.
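One way to see the difference is to count key comparisons with an instrumented comparator. The sketch below is an illustration only: CountingLess, make_map and the two comparisons_* functions are hypothetical names, and the exact counts depend on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Comparator that counts how often the map compares two keys.
struct CountingLess {
    std::size_t* count;
    bool operator()(int a, int b) const { ++*count; return a < b; }
};
using CountingMap = std::map<int, int, CountingLess>;

// Fills a map with the even keys 0, 2, ..., 2*(n-1).
CountingMap make_map(std::size_t* counter, int n) {
    CountingLess cmp{counter};
    CountingMap m(cmp);
    for (int i = 0; i < n; ++i) m.emplace(2 * i, 0);
    return m;
}

// find() followed by operator[]: two searches when the key is absent.
std::size_t comparisons_two_pass(int n, int key) {
    std::size_t counter = 0;
    CountingMap m = make_map(&counter, n);
    counter = 0;                       // ignore comparisons made while filling
    auto it = m.find(key);
    if (it != m.end()) it->second = 15;
    else m[key] = 10;
    return counter;
}

// insert() and a check of .second: a single search either way.
std::size_t comparisons_one_pass(int n, int key) {
    std::size_t counter = 0;
    CountingMap m = make_map(&counter, n);
    counter = 0;                       // ignore comparisons made while filling
    auto result = m.insert(std::make_pair(key, 10));
    if (!result.second) result.first->second = 15;
    return counter;
}
```

For a key that is not in the map (an odd key here, since only even keys are stored), the two-pass variant is expected to perform more comparisons than the one-pass variant.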
The two are not equivalent. insert will not overwrite an existing value, and it returns a pair<iterator, bool>, where iterator is the location of the key, regardless of whether or not it already existed. The bool indicates whether or not the insert occurred.
operator[] effectively does a lower_bound on key. If the result of that operation is an iterator with the same key, it returns a reference to the value. If not, it inserts a new node with a default-constructed value, and then returns a reference to the value. This is why operator[] is a non-const member - it auto-vivifies the key-value if it doesn't exist. This may have performance implications if the value type is costly to construct.
Also note that in C++11 we have an emplace method that works nearly identically to insert, except that it constructs the key-value pair in place from forwarded arguments.
Well, I disagree with Kiril's answer to a certain degree and I think it is incomplete, so here is mine.
According to cppreference std::map::operator[] is equivalent to a certain insert() call. By this I also think he is wrong saying the value will be overwritten.
It says: "Return value
Reference to the mapped value of the new element if no element with key key existed. Otherwise a reference to the mapped value of the existing element is returned."
So it seems it is a convenient wrapper. The insert(), however has this advantage of being overloaded, so it provides more functionality under one name.
I give a point to Kiril: they do seem to have slightly different functionality at first glance. However, IMHO the examples he provides are not equivalent to each other.
Therefore, as an example/reason to use insert I would point out, inserting many elements at once, or using hint ( Calls 3-6 in here).
So is insert() necessary in a map or unordered_map?
I would say yes. Moreover, operator[] is not necessary, as it can be emulated/implemented using insert, while the other way round is impossible! insert simply provides more functionality. However, writing stuff like (insert(std::make_pair(key, T())).first)->second (after cppreference) seems more cumbersome than [].
Thus, is there any reason to use the insert member function instead?
I'd say for overlapping functionality, hell no.
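To make the emulation argument concrete, here is a small sketch of operator[] written in terms of insert, following the cppreference-style expression quoted above. The free function index_via_insert is a hypothetical name, not part of any library.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical free function that mimics map::operator[] using only insert().
// If the key is absent, a value-initialized mapped value is inserted;
// either way a reference to the stored mapped value is returned.
template <class Map>
typename Map::mapped_type& index_via_insert(Map& m,
                                            const typename Map::key_type& key) {
    using T = typename Map::mapped_type;
    return m.insert(std::make_pair(key, T())).first->second;
}
```

Usage mirrors the bracket syntax: index_via_insert(m, key) = value; assigns through the returned reference, which is also where any "overwriting" actually happens.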
I have an std::unordered_map, and I want both to increment the first value in a std::pair, hashed by key, and to create a reference to key. For example:
std::unordered_map<int, std::pair<int, int> > hash;
hash[key].first++;
auto it(hash.find(key));
int& my_ref(it->first);
I could, instead of using the [] operator, insert the data with insert(), but I'd allocate a pair, even if it were to be deallocated later, as hash may already have key -- not sure of it, though. Making it clearer:
// If "key" is already inserted, the pair(s) will be allocated
// and then deallocated, right?
auto it(hash.insert(std::make_pair(key, std::make_pair(0, 0))).first);
it->second.first++;
// Here I can have my reference, with extra memory operations,
// but without an extra search in `hash`
int& my_ref(it->first);
I'm pretty much inclined to use the first option, but I can't seem to decide which one is the best. Any better solution to this?
P.S.: an ideal solution for me would be something like an insertion that does not require an initial, possibly useless, allocation of the value.
As others have pointed out, "allocating" a std::pair<int,int> is really nothing more than copying two integers (on the stack). For the unordered_map<int,pair<int,int>>::value_type, which is pair<int const, pair<int, int>>, you are at three ints, so there is no significant overhead in using your second approach. You can slightly optimize by using emplace instead of insert, i.e.:
// Here an `int` and a struct containing two `int`s are passed as arguments (by value)
auto it(hash.emplace(key, std::make_pair(0, 0)).first);
it->second.first++;
// You get your reference, without an extra search in `hash`
// Not sure what "extra memory operations" you worry about
int const& my_ref(it->first);
Your first approach, using both hash[key] and hash.find(key) is bound to be more expensive, because an element search will certainly be more expensive than an iterator dereference.
Premature copying of arguments on their way to construction of the unordered_map<...>::value_type is a negligible problem, when all arguments are just ints. But if instead you have a heavyweight key_type or a pair of heavyweight types as mapped_type, you can use the following variant of the above to forward everything by reference as far as possible (and use move semantics for rvalues):
// Here key and arguments to construct mapped_type
// are forwarded as tuples of universal references
// There is no copying of key or value nor construction of a pair
// unless a new map element is needed.
auto it(hash.emplace(std::piecewise_construct,
std::forward_as_tuple(key), // one-element tuple
std::forward_as_tuple(0, 0) // args to construct mapped_type
).first);
it->second.first++;
// As in all solutions, get your reference from the iterator we already have
int const& my_ref(it->first);
How about this:
auto it = hash.find(key);
if (it == hash.end()) { it = hash.emplace(key, std::make_pair(0, 0)).first; }
++it->second.first;
int const & my_ref = it->first; // must be const
(If it were an ordered map, you'd use lower_bound and hinted insertion to recycle the tree walk.)
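For the ordered-map case mentioned in the parenthetical, a minimal sketch of the lower_bound plus hinted-insertion idiom could look like this. The alias Counts and the function bump are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <map>
#include <utility>

using Counts = std::map<int, std::pair<int, int>>;

// Hypothetical helper: increments the first counter for `key`, inserting
// {0, 0} first if needed, and returns a reference to the stored key.
const int& bump(Counts& m, int key) {
    auto it = m.lower_bound(key);  // first element whose key is not less than `key`
    if (it == m.end() || m.key_comp()(key, it->first)) {
        // Key absent: `it` is the element the new one must precede,
        // so it serves as the insertion hint and the tree walk is reused.
        it = m.emplace_hint(it, key, std::make_pair(0, 0));
    }
    ++it->second.first;
    return it->first;
}
```

Because std::map is node-based, the returned key reference stays valid while other elements are inserted later.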
If I understand correctly, what you want is an operator[] that returns an iterator, not a mapped_type. The current interface of unordered_map does not provide such a feature, and the operator[] implementation relies on private members (at least in the Boost implementation; I don't have access to the C++11 std files in my environment).
I suppose that JoergB's answer will be faster and Kerrek SB's one will have a smaller memory footprint. It's up to you to decide what is more critical for your project.