How to optimize std::map insert() function? - c++

The best way to explain what I'm trying to accomplish is with this example (compiled with Visual Studio 2008 SP1):
struct ELEMENT1{
//Its members
ELEMENT1()
{
//Constructor code
}
~ELEMENT1()
{
//Destructor code
}
};
std::map<std::wstring, ELEMENT1> map;
std::pair<std::map<std::wstring, ELEMENT1>::iterator, bool> resIns;
ELEMENT1 element;
std::wstring strKey;
for(size_t i = 0; i < numberRepetitions; i++)
{
//Do processing
//...
//set 'strKey'
//Insert new element into the map first
resIns = map.insert(std::pair<std::wstring, ELEMENT1>(strKey, element)); //This line calls ELEMENT1 constructor & destructor twice
//Then fill out the data
fill_in_data(resIns.first->second);
}
BOOL fill_in_data(ELEMENT1& outInfo)
{
//Fill in 'outInfo' -- MUST be in its own function
//...
}
My goal is to optimize this code, and thus I did the following:
Moved ELEMENT1 element construction/destruction outside of the loop.
I insert the element into the map first and then fill it out through the reference to the inserted element, instead of constructing a new element, filling it out, copying it into the map, and then destroying it. (At least that was the plan.)
But when I compile this as a Release build and check the assembly, I can see that the map.insert() line calls the ELEMENT1 constructor twice, and then its destructor twice. So the following machine code is just for that map.insert() line:
So I'm obviously not seeing something here.
Can someone suggest what's going on in that compiled code & if it's possible to optimize it?

The reason you have two constructor calls is that what you are passing to insert does not match what it needs. std::map::insert takes a const value_type&, and value_type for a map is
std::pair<const key_type, element_type>
^^^^^ this is important
So, since they do not match you construct one element when you use
std::pair<std::wstring, ELEMENT1>(strKey, element)
and then the compiler calls the copy constructor to convert that into a
std::pair<const std::wstring, ELEMENT1>
A quick fix is to change the code to
std::pair<const std::wstring, ELEMENT1>(strKey, element)
This leaves you with one temporary that is constructed and destructed. You can also do as zett42 suggests in their answer to avoid the creation of the temporary entirely.
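As a rough sketch of the point about the const key, the map's own value_type typedef can be used so the pair type cannot be misspelled. Element1, ElementMap and insert_with_value_type are hypothetical names chosen for illustration, and the static_assert lines assume a C++11 or later compiler (so not Visual Studio 2008):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for ELEMENT1.
struct Element1 {
    int data = 0;
};

typedef std::map<std::wstring, Element1> ElementMap;

// The map stores pairs whose key is const...
static_assert(std::is_same<ElementMap::value_type,
                           std::pair<const std::wstring, Element1> >::value,
              "value_type has a const key");
// ...so a pair with a non-const key is a different type that must be converted.
static_assert(!std::is_same<ElementMap::value_type,
                            std::pair<std::wstring, Element1> >::value,
              "non-const key pair is not value_type");

// Inserts using the exact pair type that insert() expects.
// Returns true if a new element was inserted.
bool insert_with_value_type(ElementMap& m, const std::wstring& key,
                            const Element1& element) {
    return m.insert(ElementMap::value_type(key, element)).second;
}
```

How many copies this actually saves depends on the compiler and the standard library version, so it is worth measuring rather than assuming.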

resIns = map.insert(std::pair<std::wstring, ELEMENT1>(strKey, element));
You are constructing a temporary std::pair whose member second is an ELEMENT1. This causes the copy constructor of ELEMENT1 to be called.
The second call to the copy constructor of ELEMENT1 happens when std::map::insert() creates a new element in the map, which is initialized from the temporary std::pair.
You can avoid the duplicate constructor call caused by the temporary by using std::map::operator[] instead:
ELEMENT1& resIns = map[ strKey ];
fill_in_data( resIns );
If strKey doesn't already exist in the map, an ELEMENT1 will be default-constructed directly within the map and a reference to the new object will be returned. The constructor will be called exactly one time.
If strKey already exists in the map, a reference to the existing object will be returned.
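A minimal sketch of the operator[] approach, with a counter to check how often the default constructor runs. Counted, fill and demo_subscript are hypothetical names standing in for ELEMENT1, fill_in_data and the loop body:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for ELEMENT1 that counts default constructions.
struct Counted {
    static int defaults;
    int payload;
    Counted() : payload(0) { ++defaults; }
};
int Counted::defaults = 0;

// Hypothetical stand-in for fill_in_data().
void fill(Counted& out, int value) { out.payload = value; }

// Looks up the same key twice and returns the stored payload.
int demo_subscript() {
    std::map<std::wstring, Counted> m;
    Counted& first = m[L"key"];   // key absent: element is default-constructed in the map
    fill(first, 7);
    Counted& second = m[L"key"];  // key present: the existing element is returned
    return second.payload;
}
```

The second lookup does not construct anything, so data filled in earlier is kept.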

You should use emplace to avoid the creation of temporary objects:
resIns = map.emplace
(
::std::piecewise_construct
, ::std::forward_as_tuple(strKey)
, ::std::forward_as_tuple()
);
This is a good reason to switch to a newer Visual Studio version: emplace is a C++11 feature, and Visual Studio 2008 does not provide it.
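To check the claim about temporaries, here is a small sketch that counts constructor calls around a piecewise emplace. It assumes a C++11 compiler; Counted and demo_emplace are hypothetical names, with Counted standing in for ELEMENT1:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical stand-in for ELEMENT1 that counts constructor calls.
struct Counted {
    static int defaults;
    static int copies;
    int payload;
    Counted() : payload(0) { ++defaults; }
    Counted(const Counted& other) : payload(other.payload) { ++copies; }
};
int Counted::defaults = 0;
int Counted::copies = 0;

// Emplaces one element, fills it in through the returned iterator,
// and returns the stored payload.
int demo_emplace() {
    typedef std::map<std::wstring, Counted> Map;
    Map m;
    std::wstring key = L"key";
    std::pair<Map::iterator, bool> res = m.emplace(
        std::piecewise_construct,
        std::forward_as_tuple(key),  // arguments for the key constructor
        std::forward_as_tuple());    // no arguments: default-construct the value
    res.first->second.payload = 42;  // fill in the data in place
    return m.at(key).payload;
}
```

With this form the value is built directly inside the map node, so the only constructor that runs for it is the default one.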

Related

Does node_handle insertion invalidate the handle?

Do I understand correctly, that since std::map's node-handle insertion takes r-value:
insert_return_type std::map::insert(node_type&& nh),
the node-handle cannot be used (to change value) after insertion?
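std::map<...> m;
// Wrong: after a successful insert, nh is empty
m.insert(std::move(nh));
nh.mapped() = new_value;
// Right: use the iterator in the returned insert_return_type
auto res = m.insert(std::move(nh));
res.position->second = new_value;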
Ownership of the node is transferred out of nh into the map; the position member of the returned insert_return_type is an iterator to the element that now lives inside the map.
The first example is indeed wrong, since nh is left empty once the insertion succeeds, so there is no value left to change through it.
That's generally true, even without the r-value form: the map creates a value that lives inside its own data structure, and is a copy of the one being passed in to insert. For the regular (const &) form, you understand that changing the original will not affect the map!
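A small sketch of the node-handle round trip, assuming a C++17 compiler. demo_node_handle is a hypothetical name, and the keys and strings are made up for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Extracts a node, edits it, re-inserts it, then edits the element through
// the iterator returned by insert(). Returns the final mapped value.
std::string demo_node_handle() {
    std::map<int, std::string> m;
    m.emplace(1, "old");

    auto nh = m.extract(1);            // node handle now owns the element
    nh.key() = 2;                      // the key may be changed while outside the map
    nh.mapped() = "edited before insert";

    auto res = m.insert(std::move(nh)); // insert_return_type: position, inserted, node
    res.position->second = "edited after insert";

    // After a successful insertion the handle is empty.
    bool ok = res.inserted && nh.empty() && m.count(1) == 0;
    return ok ? m.at(2) : std::string("unexpected");
}
```

If the insertion fails because the key already exists, the node is not lost: it is handed back in the node member of the returned insert_return_type.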

Put a stack params to a map?

I came across this code at line 253 of CCArmatureDataManager.cpp. RelativeData is a struct. Here, a stack-allocated temporary is put into a map. Why is that not a problem? Can someone explain this to me? Thanks!
struct RelativeData
{
std::vector<std::string> plistFiles;
std::vector<std::string> armatures;
std::vector<std::string> animations;
std::vector<std::string> textures;
};
void CCArmatureDataManager::addRelativeData(const std::string& configFilePath)
{
if (_relativeDatas.find(configFilePath) == _relativeDatas.end())
{
_relativeDatas[configFilePath] = RelativeData();
}
}
In the expression
_relativeDatas[configFilePath] = RelativeData()
The RelativeData() part creates a temporary default-constructed object.
The _relativeDatas[configFilePath] part calls std::map::operator[] which returns a reference to an object.
The assignment copies from the temporary object to the object whose reference the [] operator returned. In other words, the RelativeData copy assignment operator is called (the compiler will in most cases create one for you if you don't have one).
If there is no element with the key configFilePath, then the map will default construct one, and return a reference to it.
So what your code does is create two default-constructed objects of type RelativeData, and copies the contents from one to the other. It is, in maybe not so kind words, pretty much useless.
It looks like the function simply adds an empty struct to the _relativeDatas map if one is not already there (the map is most likely a std::map<std::string, RelativeData> keyed by config file path), which can then be filled with data.
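Since operator[] already default-constructs a missing element, the find plus assignment can be reduced to a single lookup. The following is only a sketch of that idea, written as a free function over a hypothetical RelativeDataMap typedef rather than as the actual cocos2d-x member function:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct RelativeData {
    std::vector<std::string> plistFiles;
    std::vector<std::string> armatures;
    std::vector<std::string> animations;
    std::vector<std::string> textures;
};

// Assumed map type; the real member may be declared differently.
typedef std::map<std::string, RelativeData> RelativeDataMap;

// Makes sure an entry exists for configFilePath without touching an existing one.
void addRelativeData(RelativeDataMap& relativeDatas,
                     const std::string& configFilePath) {
    // operator[] default-constructs the element if the key is absent
    // and does nothing to the element if it is already present.
    (void)relativeDatas[configFilePath];
}
```

Calling it twice with the same path leaves any data already stored under that path intact.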

STL map insertion copy constructor

I have objects of type MyClass stored as pairs <std::string, MyClass> in an STL map. The std::string is a unique name for each MyClass object. I want every MyClass object to be instantiated only ONCE per name and thus destroyed only once, at the end of my application. So I try to avoid invoking copy constructors or default constructors, as they might lead to destruction. A MyClass object refers to some kind of resource that shall be allocated/freed only once. I tried to use this code to create instances of MyClass, put them in my map, and return a pointer to the newly created instance.
MyClass* FooClass::GetItem(std::string name)
{
MyClass* item = GetItemExists(name);
if (item == NULL)
{
item = &(*((this->myMap.insert(std::pair<std::string, MyClass>
(name, MyClass(name)))).first)).second;
}
return item;
}
Creation and insertion work this way. But the destructor of MyClass is called 3 times! Even the return item; statement seems to invoke the destructor, although item is a pointer?! I thought this was impossible and had to be forced by delete item?!
I thought an alternative would be to store pointers (MyClass*) instead of objects in the map. Or is there a better alternative? I did not use myMap[name] = MyClass(name); in order to avoid copy/destruction, but I think insert doesn't make it any better.
You need to emplace and piecewise construct the inserted element:
item = &(this->myMap.emplace(std::piecewise_construct,
std::forward_as_tuple(name),
std::forward_as_tuple(name)).first->second);
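One way to convince yourself that the piecewise emplace really creates the object only once is to delete the copy operations and count live instances. This is a sketch assuming C++11; the live counter and the reduced FooClass are made up for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <utility>

class MyClass {
public:
    static int live;  // number of currently existing instances
    explicit MyClass(const std::string& n) : name(n) { ++live; }
    ~MyClass() { --live; }
    // Copying is disabled, so any hidden copy would be a compile error.
    MyClass(const MyClass&) = delete;
    MyClass& operator=(const MyClass&) = delete;
    std::string name;
};
int MyClass::live = 0;

class FooClass {
public:
    // Returns the item for name, creating it in place on first use.
    MyClass* GetItem(const std::string& name) {
        std::map<std::string, MyClass>::iterator it = myMap.find(name);
        if (it == myMap.end()) {
            it = myMap.emplace(std::piecewise_construct,
                               std::forward_as_tuple(name),   // key
                               std::forward_as_tuple(name))   // MyClass(name)
                     .first;
        }
        return &it->second;
    }
private:
    std::map<std::string, MyClass> myMap;
};
```

Pointers to elements of a std::map stay valid until that element is erased, so handing out MyClass* like this is safe for as long as the FooClass object lives.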

Insert elements into std::map without extra copying

Consider this program:
#include <map>
#include <string>
#define log magic_log_function // Please don't mind this.
//
// ADVENTURES OF PROGO THE C++ PROGRAM
//
class element;
typedef std::map<int, element> map_t;
class element {
public:
element(const std::string&);
element(const element&);
~element();
std::string name;
};
element::element(const std::string& arg)
: name(arg)
{
log("element ", arg, " constucted, ", this);
}
element::element(const element& other)
: name(other.name)
{
name += "-copy";
log("element ", name, " copied, ", this);
}
element::~element()
{
log("element ", name, " destructed, ", this);
}
int main(int argc, char **argv)
{
map_t map1; element b1("b1");
log(" > Done construction.");
log(" > Making map 1.");
map1.insert(std::pair<int, element>(1, b1));
log(" > Done making map 1.");
log(" > Before returning from main()");
}
It creates some objects on the stack and inserts them into a std::map container, creating two extra temporary copies in the process:
element b1 constucted, 0x7fff228c6c60
> Done construction.
> Making map 1.
element b1-copy copied, 0x7fff228c6ca8
element b1-copy-copy copied, 0x7fff228c6c98
element b1-copy-copy-copy copied, 0x232d0c8
element b1-copy-copy destructed, 0x7fff228c6c98
element b1-copy destructed, 0x7fff228c6ca8
> Done making map 1.
> Before returning from main()
element b1 destructed, 0x7fff228c6c60
element b1-copy-copy-copy destructed, 0x232d0c8
We can get rid of one extra copy constructor call by changing the std::pair signature to std::pair<int, element&>, however, the second temporary is still created and immediately destroyed:
element b1 constucted, 0x7fff0fe75390
> Done construction.
> Making map 1.
element b1-copy copied, 0x7fff0fe753c8
element b1-copy-copy copied, 0x1bc4098
element b1-copy destructed, 0x7fff0fe753c8
> Done making map 1.
> Before returning from main()
element b1 destructed, 0x7fff0fe75390
element b1-copy-copy destructed, 0x1bc4098
Is there a way to make std::map just take an object on the stack by reference and make a single internal copy of it?
This is one of the many use cases which motivated C++11's move functionality, supported by a host of new features, particularly rvalue references, and a variety of new standard library interfaces, including std::map::emplace, std::vector::emplace_back, etc.
If, for whatever reason, you cannot yet use C++11, you can at least console yourself with the thought that the problem has been recognized, and that a solution has been standardized and implemented, and that furthermore many of us are using it, some of us [1] in production-code. So, as the old joke has it, a solution exists and it's your call as to when you take it up.
Note that you don't have to use the emplace member function if your objects implement move constructors, which they may even do by default. This won't happen if they have user-declared copy constructors, so your test above may produce observer effects (and indeed, it might also suppress compiler optimizations in the case of PODs, so even with C++03 you might not have the problem you think you do).
There are a variety of hacks available which kinda-sorta avoid copies with only "minor" source code alterations, but IMHO the best approach is to start moving towards C++11. Whatever you do, try to do it in a way that will make the inevitable migration less painful.
[Note 1]: Disclaimer: I no longer write production code, having more or less retired, so I'm not part of the "some of us" in that sentence.
Standard practice (with older C++ versions) where I've been is to use a Map of shared pointers.
This still copies the shared pointers, but that's usually much less onerous than copying large objects.
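A minimal sketch of the shared-pointer approach. It uses std::shared_ptr from C++11 as a stand-in for whatever shared pointer (for example boost::shared_ptr) an older code base would use; the copies counter and demo_shared_ptr_map are hypothetical additions for illustration:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified version of the element class that counts copy constructions.
struct element {
    static int copies;
    std::string name;
    element(const std::string& n) : name(n) {}
    element(const element& other) : name(other.name + "-copy") { ++copies; }
};
int element::copies = 0;

// Stores a shared pointer in the map; returns true if the element itself
// was never copied and is shared between the caller and the map.
bool demo_shared_ptr_map() {
    std::map<int, std::shared_ptr<element> > m;
    std::shared_ptr<element> b1 = std::make_shared<element>(std::string("b1"));
    m.insert(std::make_pair(1, b1));  // copies the pointer, not the element
    return element::copies == 0
        && b1.use_count() == 2
        && m.at(1).get() == b1.get();
}
```

The trade-off is one heap allocation per element plus reference counting, in exchange for never copying the element itself.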
You can use emplace():
The element is constructed in-place, i.e. no copy or move operations are performed. The constructor of the element type (value_type, that is, std::pair) is called with exactly the same arguments as supplied to the function
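For the element class from the question, a sketch of what emplace buys you could look like this. It assumes C++11, and the static copies counter replaces the logging from the original program:

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified version of the element class that counts copy constructions.
struct element {
    static int copies;
    std::string name;
    element(const std::string& n) : name(n) {}
    element(const element& other) : name(other.name + "-copy") { ++copies; }
};
int element::copies = 0;

// Emplacing an existing object: it is copied once, straight into the map node.
int copies_made_by_emplace() {
    std::map<int, element> m;
    element b1("b1");
    element::copies = 0;
    m.emplace(1, b1);
    return element::copies;
}

// Emplacing constructor arguments: the element is built inside the node,
// so no copy of an element is made at all.
int copies_when_constructed_in_place() {
    std::map<int, element> m;
    element::copies = 0;
    m.emplace(2, std::string("b2"));
    return element::copies;
}
```

So the single internal copy asked for in the question corresponds to the first function, and the second shows that even that copy can be avoided when the object does not need to exist outside the map.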
Well, if you don't have emplace, you can construct the element on the heap and store pointers in the map:
typedef std::map<int, element*> map_t;
...
printf(" > Making pair 1.\n");
std::pair<int, element*> pair(1, new element ("b1")) ;
printf(" > Making map 1.\n");
map1.insert(pair);
but then you are subject to memory leaks if you don't take care to delete the elements before your map leaves scope...

Does insert method of STL's set copy the value of passed objects?

I've got the following C++ code (please don't ask why it looks so ugly ;) - you'll have to trust me that it makes sense given the rest of the code):
IntSet temp;
SuperSet superSet;
for (uint i = 0; i < noItems; i++) {
temp.insert(i);
superSet.insert(temp);
temp.clear();
}
It is intended to prepare noItems sets of integers (IntSet, each containing one integer value) and insert them into another set (SuperSet). Both sets are defined as follows:
typedef unsigned int DataType;
typedef std::set<DataType> IntSet;
typedef std::set<IntSet> SuperSet;
For me, this code shouldn't work as intended, because just after inserting temp to superSet I'm clearing the temp, and I found that insert is getting a reference as its argument: pair<iterator,bool> insert ( const value_type& x ); (http://www.cplusplus.com/reference/stl/set/insert/)
So, as the result of the code presented above, I should get a SuperSet containing only cleared IntSet's. But "unfortunately" this code works - all IntSet's are filled with proper values... So my question is - what does insert method from STL's set really do in its body? Does it simply copy objects that are passed to it by reference? And what is the difference in this method's behaviour between passing object or primitive types?
Thank you for your answers!
insert() takes a reference argument to avoid a copy when passing the argument, but it creates a copy when it stores the item in the collection. This is why the clear() can work in this case. Also, this is true in both cases, so even though you are "reusing" temp, there will be separate copies in superSet.
Your declaration for SuperSet stores IntSet by value so the only way to insert a new element is to make a copy. Since a copy is made changes to the original IntSet will not be reflected in the copy.
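The copy behaviour is easy to verify with the code from the question wrapped in a function. build_super_set is a hypothetical name; the typedefs are the ones from the question:

```cpp
#include <cassert>
#include <set>

typedef unsigned int DataType;
typedef std::set<DataType> IntSet;
typedef std::set<IntSet> SuperSet;

// Builds noItems one-element sets, reusing and clearing a single temp set.
SuperSet build_super_set(DataType noItems) {
    IntSet temp;
    SuperSet superSet;
    for (DataType i = 0; i < noItems; ++i) {
        temp.insert(i);
        superSet.insert(temp);  // superSet stores its own copy of temp
        temp.clear();           // only the local temp is emptied
    }
    return superSet;
}
```

Every IntSet stored in the result still holds its single value, because clear() only ever touches the local temp.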
This applies specifically to how you are passing temp to superSet, though; in C++11 your usage becomes inefficient. By declaring a named local variable to use as a temporary, you force a copy to be made and prevent the use of move semantics (unless you pass it with std::move).
SuperSet superSet;
for (DataType i = 0; i < noItems; i++)
{
superSet.insert(IntSet(&i, &i + 1));
}
Discounting optimizations, the compiler will create a temporary IntSet and initialize it with a single element. Because the compiler knows this is a temporary, it can insert the value using the move constructor. This does a shallow copy of the IntSet passed in and resets its values to a default state (i.e. pointers to nullptr), resulting in a move.
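If a named variable is more convenient than a temporary, the same effect can be requested explicitly with std::move. This is a sketch assuming C++11; build_super_set_moving is a hypothetical name:

```cpp
#include <cassert>
#include <set>
#include <utility>

typedef unsigned int DataType;
typedef std::set<DataType> IntSet;
typedef std::set<IntSet> SuperSet;

// Builds noItems one-element sets and moves each one into superSet.
SuperSet build_super_set_moving(DataType noItems) {
    SuperSet superSet;
    for (DataType i = 0; i < noItems; ++i) {
        IntSet temp;
        temp.insert(i);
        // std::move turns temp into an rvalue, so insert() may move from it
        // instead of copying; temp is not used again afterwards.
        superSet.insert(std::move(temp));
    }
    return superSet;
}
```

For sets this small the difference is unlikely to matter; the move pays off when the inner sets hold many elements.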